If you know exactly which entries you need, it's pretty easy to fetch the corresponding Branch objects, but what if you need to search for entries matching one or more criteria?
Searching is implemented in Treequel via Treequel::Branchset. Much like Datasets from the Sequel library which inspired Treequel, a Branchset is an object which represents an abstract set of records returned by a search. The results of the search are returned on demand, so a Branchset can be kept around and reused indefinitely.
You can construct a new Branchset via the usual constructor; it takes the Branch for the base DN of the search:
irb> Treequel::Branchset.new( dir.ou(:people) ) # => #<Treequel::Branchset:0x1a418ec base_dn='ou=people,dc=acme,dc=com', filter=(objectClass=*), scope=subtree, select=*, limit=0, timeout=0.000>
Branch and Directory also provide convenience methods that create a new Branchset relative to themselves:
irb> dir.branchset # => #<Treequel::Branchset:0x1a3fc54 base_dn='dc=acme,dc=com', filter=(objectClass=*), scope=subtree, select=*, limit=0, timeout=0.000>
irb> dir.ou(:people).branchset # => #<Treequel::Branchset:0x1998314 base_dn='ou=people,dc=acme,dc=com', filter=(objectClass=*), scope=subtree, select=*, limit=0, timeout=0.000>
Like Sequel Datasets, Branchsets are meant to be chainable: you refine which entries a Branchset will find by calling one of its mutators. Each mutator method returns a new Branchset with the new criteria set. This allows you to build up the query you need gradually, in a concise and flexible manner.
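The "each mutator returns a new Branchset" pattern can be sketched in plain Ruby. The class below is a hypothetical stand-in, not Treequel's implementation: the name MiniBranchset, its options Hash, and the assumption that the receiver is left unchanged are all made up for illustration.

```ruby
# Hypothetical stand-in for a chainable query object; NOT Treequel's code.
class MiniBranchset
  DEFAULTS = { filter: '(objectClass=*)', scope: :subtree, limit: 0 }.freeze

  attr_reader :options

  def initialize( options={} )
    @options = DEFAULTS.merge( options ).freeze
  end

  # Each mutator builds a new object from a modified copy of the options,
  # so the receiver keeps its own criteria.
  def scope( new_scope )
    self.class.new( options.merge(scope: new_scope) )
  end

  def limit( new_limit )
    self.class.new( options.merge(limit: new_limit) )
  end
end

everyone  = MiniBranchset.new
first_ten = everyone.scope( :onelevel ).limit( 10 )

p everyone.options
p first_ten.options
```

Because nothing is modified in place, a base query like everyone can be kept around and refined in several different directions.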
The first of these mutators is Treequel::Branchset#filter. You can narrow the results of a search by adding one or more filter statements. Each call to #filter adds a clause to the LDAP filter string that is eventually sent to the server.
With no modifications, a Branchset will find every entry below its base using a filter of (objectClass=*) (which will match every entry).
The #filter method expects one or more expressions, which are transformed into an LDAP filter. An expression can be a literal filter String, a Hash or an Array of criteria, or a Ruby expression.
The simplest of these, of course, is a literal LDAP filter in a String:
irb> dir.ou( :people ).filter( '(objectClass=room)' ) # => #<Treequel::Branchset:0x12b7c48 base_dn='ou=people,dc=acme,dc=com', filter=(objectClass=room), scope=subtree, select=*, limit=0, timeout=0.000>
You can see the equivalent filter string of a Branchset at any time using its #filter_string method:
irb> dir.ou( :people ).filter( '(objectClass=room)' ).filter_string # => "(objectClass=room)"
You can also use a Hash to do simple attribute=value matching:
irb> dir.ou( :people ).filter( :givenName => 'Michael' ).filter_string # => "(givenName=Michael)"
Multiple criteria in a Hash will be ANDed together:
irb> dir.ou( :people ).filter( :givenName => 'Michael', :sn => 'Granger' ).filter_string # => "(&(givenName=Michael)(sn=Granger))"
You can include an OR in a filter by passing :or as the first element:
irb> dir.ou( :people ).filter( :or, [:sn, 'Granger'], [:sn, 'Smith'] ).filter_string # => "(|(sn=Granger)(sn=Smith))"
or by specifying more than one value for a single attribute:
irb> dir.ou( :people ).filter( :uid => [:mahlon, :mgranger, :jtran] ).filter_string # => "(|(uid=mahlon)(uid=mgranger)(uid=jtran))"
You can do the same with :and and :not, and combine them, too:
irb> dir.ou( :people ).filter( :and, [:sn, 'Granger'], [:sn, 'Smith'] ).filter_string # => "(&(sn=Granger)(sn=Smith))"
irb> dir.ou( :people ).filter( :not, [:and, [:sn, 'Granger'], [:sn, 'Smith']] ).filter_string # => "(!(&(sn=Granger)(sn=Smith)))"
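To make the mapping from Hash and Array criteria to filter strings concrete, here is a rough sketch in plain Ruby. The helpers sketch_filter and sketch_pair are hypothetical names invented for this example; they are not part of Treequel, they take a single expression rather than #filter's argument list, and they skip the value escaping a real LDAP filter needs (see RFC 4515).

```ruby
# Hypothetical helpers, NOT Treequel's implementation. Values are not
# escaped, so this is only suitable for illustrating the structure.
def sketch_filter( expr )
  case expr
  when String
    expr
  when Hash
    # Several attribute/value pairs in one Hash are ANDed together.
    clauses = expr.map {|attr, value| sketch_pair(attr, value) }
    clauses.length == 1 ? clauses.first : "(&#{clauses.join})"
  when Array
    op, *rest = expr
    case op
    when :and then "(&#{rest.map {|e| sketch_filter(e) }.join})"
    when :or  then "(|#{rest.map {|e| sketch_filter(e) }.join})"
    when :not then "(!#{sketch_filter(rest.first)})"
    else sketch_pair( op, rest.first )   # a plain [attribute, value] pair
    end
  else
    raise ArgumentError, "can't build a filter from %p" % [ expr ]
  end
end

# One attribute with several values becomes an OR of equality matches.
def sketch_pair( attr, value )
  return "(#{attr}=#{value})" unless value.is_a?( Array )
  "(|#{value.map {|v| "(#{attr}=#{v})" }.join})"
end

puts sketch_filter( givenName: 'Michael', sn: 'Granger' )
puts sketch_filter( [:or, [:sn, 'Granger'], [:sn, 'Smith']] )
puts sketch_filter( [:not, [:and, [:sn, 'Granger'], [:sn, 'Smith']]] )
```

The three calls print the same strings as the corresponding #filter_string examples above.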
Because #filter returns a new Branchset, you can always chain calls together instead of using an explicit :and:
irb> dir.ou( :people ).filter( :objectClass => 'inetOrgPerson' ).filter( :sn => 'Smith' ).filter_string # => "(&(objectClass=inetOrgPerson)(sn=Smith))"
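One way to picture why chained calls end up ANDed is to imagine each call appending a clause to a list that is only combined when the filter string is requested. The ChainedFilter class below is a hypothetical model of that idea, not a description of how Treequel stores its filter internally.

```ruby
# Hypothetical model of filter chaining; NOT Treequel's implementation.
class ChainedFilter
  attr_reader :clauses

  def initialize( clauses=[] )
    @clauses = clauses.dup.freeze
  end

  # Return a copy with one more clause; the receiver keeps its own list.
  def filter( clause )
    self.class.new( clauses + [clause] )
  end

  # No clauses: match everything. One: use it as-is. Several: AND them.
  def filter_string
    case clauses.length
    when 0 then '(objectClass=*)'
    when 1 then clauses.first
    else "(&#{clauses.join})"
    end
  end
end

people = ChainedFilter.new
smiths = people.filter( '(objectClass=inetOrgPerson)' ).filter( '(sn=Smith)' )

puts people.filter_string
puts smiths.filter_string
```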
We're also experimenting with support for Sequel expressions, for more complex filters:
# Negative
irb> dir.ou( :people ).filter( ~:photo ).filter_string # => "(!(photo=*))"
irb> dir.ou( :people ).filter( :employeeNumber <= 1000 ).filter_string # => "(employeeNumber<=1000)"
irb> dir.ou( :people ).filter( :sn.like('smith') ).filter_string # => "(sn~=smith)"
irb> dir.ou( :people ).filter( :sn.like('sm*') ).filter_string # => "(sn=sm*)"
irb> dir.ou( :people ).filter( :sn => ['smith', 'tran'] ).filter_string # => "(|(sn=smith)(sn=tran))"
You can also create a Branchset that will search using a different scope by passing :onelevel, :base, or :subtree (the default) to the #scope method of the original Branchset.
Setting the scope to :onelevel (as you might expect) means that it will only descend one level when searching:
irb> dir.filter( :objectClass => :organizationalUnit ).scope( :onelevel ).collect {|branch| branch[:ou].first } # => ["Hosts", "Groups", "Lists", "Resources", "People", "Departments", "Netgroups"]
Setting it to :subtree (which is the default) means that it will descend through every level below the base, and setting it to :base means that it will only consider the base entry, either returning it if it matches, or returning nil if it does not.
There are also scope aliases; you can use :one instead of :onelevel, and :sub instead of :subtree.
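The three scopes are easier to compare on a toy directory. The sketch below models them over a plain list of DN strings. The helpers in_scope and relative_depth and the ENTRIES list are hypothetical, and the DN handling is deliberately naive (case-sensitive, no escaped commas), so treat it as an illustration of the semantics rather than of anything Treequel or an LDAP server actually does.

```ruby
# Toy model of LDAP search scopes; hypothetical helpers, NOT Treequel's API.
SCOPE_ALIASES = { one: :onelevel, sub: :subtree }.freeze

ENTRIES = [
  'dc=acme,dc=com',
  'ou=People,dc=acme,dc=com',
  'uid=jtran,ou=People,dc=acme,dc=com',
  'ou=Hosts,dc=acme,dc=com',
].freeze

# Number of RDNs between dn and base, or nil if dn is not at or below base.
def relative_depth( dn, base )
  return 0 if dn == base
  return nil unless dn.end_with?( ",#{base}" )
  dn.split( ',' ).length - base.split( ',' ).length
end

# Select the entries a search from +base+ with +scope+ would consider.
def in_scope( entries, base, scope )
  scope = SCOPE_ALIASES.fetch( scope, scope )
  entries.select do |dn|
    depth = relative_depth( dn, base )
    case scope
    when :base     then depth == 0        # only the base entry itself
    when :onelevel then depth == 1        # immediate children only
    when :subtree  then !depth.nil?       # the base and everything below it
    else raise ArgumentError, "unknown scope %p" % [ scope ]
    end
  end
end

p in_scope( ENTRIES, 'dc=acme,dc=com', :base )
p in_scope( ENTRIES, 'dc=acme,dc=com', :one )
p in_scope( ENTRIES, 'dc=acme,dc=com', :sub )
```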
Treequel::Branchset#limit caps the number of results the search will return:
irb> dir.ou( :groups ).limit( 5 ).collect {|b| b.dn } # => ["ou=Groups,dc=acme,dc=com", "cn=anim,ou=Groups,dc=acme,dc=com", "cn=acct,ou=Groups,dc=acme,dc=com", "cn=mailuser,ou=Groups,dc=acme,dc=com", "cn=producer,ou=Groups,dc=acme,dc=com"]
Note: The results will be returned in directory order (at least in OpenLDAP). Until Treequel supports server-side ordering, this means that #limit is of limited usefulness; to do real paged results you need both server-side ordering and the paged results control. We're planning on adding a convenient way to use controls in a future release.
If you already have a Branchset with a limit, and want a new one that won't have any limits imposed on it, you can get one via the Branchset#without_limit method.
irb> fivegroups = dir.ou( :groups ).limit( 5 ) # => #<Treequel::Branchset:0x1264908 base_dn='ou=groups,dc=acme,dc=com', filter=(objectClass=*), scope=subtree, select=*, limit=5, timeout=0.000>
irb> fivegroups.all.length # => 5
irb> fivegroups.without_limit.all.length # => 99
To limit the attributes returned in the entries fetched by the query, specify which ones you want with the Treequel::Branchset#select method:
irb> dir.ou( :people ).select( :sn, :givenName ).limit( 5 ).collect {|b| b.entry } # => [{"dn"=>["ou=People,dc=acme,dc=com"]}, {"givenName"=>["Reed"], "sn"=>["Slimlocke"], "dn"=>["uid=rslim,ou=People,dc=acme,dc=com"]}, {"givenName"=>["Jim"], "sn"=>["Tran"], "dn"=>["uid=jtran,ou=People,dc=acme,dc=com"]}, {"givenName"=>["Michael"], "sn"=>["Granger"], "dn"=>["uid=mgranger,ou=People,dc=acme,dc=com"]}, {"givenName"=>["Harken"], "sn"=>["Farkselstein"], "dn"=>["uid=hfarkselstein,ou=People,dc=acme,dc=com"]}]
You can get a copy of a Branchset with additional attributes by passing the additional attributes to Treequel::Branchset#select_more:
irb> people_uids = dir.ou( :people ).select( :uid ) # => #<Treequel::Branchset:0x1181644 base_dn='ou=people,dc=acme,dc=com', filter=(objectClass=*), scope=subtree, select=uid, limit=0, timeout=0.000>
irb> people_uids_and_names = people_uids.select_more( :gecos ) # => #<Treequel::Branchset:0x1178b20 base_dn='ou=people,dc=acme,dc=com', filter=(objectClass=*), scope=subtree, select=uid,gecos, limit=0, timeout=0.000>
irb> people_uids_names_and_addresses = people_uids.select_more( :gecos, :homePostalAddress ) # => #<Treequel::Branchset:0x10dcb08 base_dn='ou=people,dc=acme,dc=com', filter=(objectClass=*), scope=subtree, select=uid,gecos,homePostalAddress, limit=0, timeout=0.000>
You can also get a copy with the select-list removed with Treequel::Branchset#select_all:
irb> people_uids.select_all # => #<Treequel::Branchset:0x10da308 base_dn='ou=people,dc=acme,dc=com', filter=(objectClass=*), scope=subtree, select=*, limit=0, timeout=0.000>
To avoid unintentional resource consumption on the server, you can specify an explicit timeout for queries. This is useful when searching with user-submitted input or input from other untrusted sources. Note that this can only be reliably used to decrease the timeout, as the server might have a maximum timeout configured that can't be exceeded.
irb> dir.filter('objectClass=*').timeout( 1 ).all
LDAP::ResultError: Timed out
    from ./treequel/directory.rb:328:in `search_ext2'
    from ./treequel/directory.rb:328:in `search'
    from ./treequel/branchset.rb:195:in `each'
    from (irb):8:in `all'
    from (irb):8
    from :0
If you have a canned query that includes a timeout, you can copy it without the restriction using #without_timeout:
irb> slow_query = dir.filter('objectClass=*').timeout( 1 ) # => #<Treequel::Branchset:0x1d5c554 base_dn='dc=acme,dc=com', filter=(objectClass=*), scope=subtree, select=*, limit=0, timeout=1.000>
irb> slow_query.all
LDAP::ResultError: Timed out
    from ./treequel/directory.rb:328:in `search_ext2'
    from ./treequel/directory.rb:328:in `search'
    from ./treequel/branchset.rb:195:in `each'
    from (irb):13:in `all'
    from (irb):13
    from :0
irb> slow_query.without_timeout.all.length # => 4982
irb> slow_query.without_timeout.all.first # => #<Treequel::Branch:0x1d4f2f0 dc=acme,dc=com @ localhost:389 (dc=acme,dc=com, tls, anonymous) entry={"o"=>["ACME"], "description"=>["http://www.example.com/"], "objectClass"=>["dcObject", "organization"], "dc"=>["acme"], "dn"=>["dc=acme,dc=com"]}>
Branchsets are also Enumerable, so you can slice and dice results with that module's interface. Note that a plain Branch is not Enumerable; call #filter (or another Branchset method) on it first:
irb> people = dir.ou( :people ) # => #<Treequel::Branch:0x11857d0 ou=people,dc=acme,dc=com @ localhost:389 (dc=acme,dc=com, tls, anonymous) entry=nil>
irb> people.all? {|person| File.directory?(person[:homeDirectory]) }
NoMethodError: undefined method `all?' for #<Treequel::Branch:0x11857d0>
    from /Users/mgranger/source/ruby/Treequel/lib/treequel/branch.rb:538:in `method_missing'
    from (irb):3
irb> people.filter( :homeDirectory ).all? {|person| File.directory?(person[:homeDirectory]) } # => false
irb> people.filter( :homeDirectory ).find_all {|person| File.exist?(person[:homeDirectory]) && File.stat(person[:homeDirectory]).uid != person[:uidNumber] } # => [#<Treequel::Branch:0x18287b8 uid=wwwspider,ou=People,dc=acme,dc=com @ localhost:389 (dc=acme,dc=com, tls, anonymous) entry={"cn"=>["Auth account for web spider"], "gidNumber"=>["200"], "givenName"=>["WebSpider"], "gecos"=>["WebSpider Account"], "homeDirectory"=>["/dev/null"], "sn"=>["WebSpider Account"], "uid"=>["wwwspider"], "uidNumber"=>["1500"], "objectClass"=>["top", "person", "inetOrgPerson", "posixAccount", "shadowAccount"], "dn"=>["uid=wwwspider,ou=People,dc=acme,dc=com"]}>]
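If you haven't used Enumerable this way before, the following sketch shows the general Ruby mechanism: a class only has to define #each, and methods like #all? and #find_all come for free. LazyResults and its search block are hypothetical stand-ins that return canned Hashes; it is not Treequel code, and it only assumes (as described earlier) that the search is run on demand each time results are needed.

```ruby
# Hypothetical stand-in for an on-demand result set; NOT Treequel's code.
class LazyResults
  include Enumerable

  attr_reader :search_count

  def initialize( &search )
    @search = search
    @search_count = 0
  end

  # Enumerable only needs #each. The search block runs every time the
  # collection is iterated, so nothing is cached between calls.
  def each( &block )
    return enum_for( :each ) unless block
    @search_count += 1
    @search.call.each( &block )
    self
  end
end

people = LazyResults.new do
  [ { uid: 'jtran', home: '/home/jtran' }, { uid: 'wwwspider', home: '/dev/null' } ]
end

p people.all? {|person| person[:home].start_with?('/home') }
p people.find_all {|person| person[:home] == '/dev/null' }.map {|person| person[:uid] }
p people.search_count
```

The last line prints 2: each Enumerable call triggered its own "search".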
For convenience, the Treequel::Branchset#map method is overridden to facilitate fetching single attributes from the resulting branches:
irb> dir.ou( :hosts ).filter( :ipHostNumber ).map( :ipHostNumber ).flatten # => ["192.168.1.253", "192.168.1.14", "192.168.1.21", "192.168.1.22", "192.168.1.23"]
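An override like this can be written in a few lines of Ruby. The sketch below is a guess at the general shape, using a hypothetical AttrResults class over plain Hashes rather than real Branch objects; Treequel's actual method may differ.

```ruby
# Hypothetical illustration of an attribute-aware #map; NOT Treequel's code.
class AttrResults
  include Enumerable

  def initialize( entries )
    @entries = entries
  end

  def each( &block )
    return enum_for( :each ) unless block
    @entries.each( &block )
  end

  # With an attribute name, collect that attribute from every entry;
  # with only a block, behave like the ordinary Enumerable#map.
  def map( attribute=nil, &block )
    return super( &block ) unless attribute
    super() {|entry| entry[attribute] }
  end
end

hosts = AttrResults.new([
  { cn: ['gateway'], ipHostNumber: ['192.168.1.253'] },
  { cn: ['build'],   ipHostNumber: ['192.168.1.14', '192.168.1.21'] },
])

p hosts.map( :ipHostNumber ).flatten
p hosts.map {|entry| entry[:cn].first }
```

Since attribute values come back as Arrays, the #flatten call is what turns the per-entry lists into a single flat list.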