Search indexes allow queries to be made for any type of data that is indexed by the Chef server, including data bags (and data bag items), environments, nodes, and roles. A defined query syntax supports search patterns such as exact, wildcard, range, and fuzzy. A search is a full-text query that can be run from several locations: the search subcommand in knife, the search method in the Recipe DSL (from within a recipe), the search box in the Chef management console, and the /search or /search/INDEX endpoints in the Chef server API. The search engine is based on Apache Solr and is run from the Chef server.

Many of the examples in this section use knife, but the search indexes and search query syntax can be used in many locations, including from within recipes and when using the Chef server API.

A search index is a full-text list of objects that are stored on the Chef server, against which search queries can be made. The following search indexes are built:

| Search Index Name | Description |
| --- | --- |
| client | API client |
| DATA_BAG_NAME | A data bag is a global variable that is stored as JSON data and is accessible from a Chef server. The name of the search index is the name of the data bag. For example, if the name of the data bag was “admins” then a corresponding search query might look something like search(:admins, "*:*"). |
| environment | An environment is a way to map an organization’s real-life workflow to what can be configured and managed when using Chef server. |
| node | A node is any server or virtual server that is configured to be maintained by a chef-client. |
| role | A role is a way to define certain patterns and processes that exist across nodes in an organization as belonging to a single job function. |

A search query is comprised of two parts: the key and the search pattern. A search query has the following syntax:

key:search_pattern

where key is a field name that is found in the JSON description of an indexable object on the Chef server (a role, node, client, environment, or data bag) and search_pattern defines what will be searched for, using one of the following search patterns: exact, wildcard, range, or fuzzy matching. Both key and search_pattern are case-sensitive. key has limited support for multiple-character wildcard matching using an asterisk (“*”), as long as the asterisk is not the first character.

A partial search query allows a search query to be made against specific attribute keys that are stored on the Chef server. A partial search query can search the same set of objects on the Chef server as a full search query, including specifying an object index and providing a query that can be matched to the relevant index. While a full search query will return an array of objects that match (each object containing a full set of attributes for the node), a partial search query will return only the values for the attributes that match. One primary benefit of using a partial search query is that it requires less memory and network bandwidth while the chef-client processes the search results.

Note

To use the partial_search method in a recipe, that recipe must contain a dependency on the partial_search cookbook.

To create a partial search query, use the partial_search method, and then specify the key paths for the attributes to be returned. Each key path is specified as an array of strings and is mapped to an arbitrary short name.
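The mapping from short names to key paths can be pictured with plain Ruby. The sketch below is illustrative only: extract_keys is a hypothetical helper (it is not part of Chef or the partial_search cookbook), and the node data is a made-up hash standing in for what the Chef server stores.

```ruby
# Hypothetical helper: walk each key path into a nested hash and
# return only the requested values under their short names.
def extract_keys(data, keys)
  keys.each_with_object({}) do |(short_name, path), result|
    # Hash#dig returns nil when a level of the path is missing.
    result[short_name] = data.dig(*path)
  end
end

# Made-up node data, for illustration only.
node_data = {
  'name' => 'web1',
  'ipaddress' => '10.0.0.5',
  'kernel' => { 'version' => '3.10.0', 'machine' => 'x86_64' }
}

# Short name => key path (an array of strings).
keys = {
  'name' => ['name'],
  'ip' => ['ipaddress'],
  'kernel_version' => ['kernel', 'version']
}

result = extract_keys(node_data, keys)
puts result['kernel_version']
```

Only the three requested values are carried in the result, which is the property that lets a partial search use less memory and bandwidth than a full search.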

The following examples show how partial search can be used in a recipe. First, a recipe without partial search:

nodes = search(:node, "keys_ssh:* NOT name:#{node.name}")
nodes << node
begin
  other_hosts = data_bag('ssh_known_hosts')
  other_hosts.each do |h|
    host = data_bag_item('ssh_known_hosts', h).to_hash
    host['ipaddress'] ||= r.getaddress(host['fqdn'])
    host['keys'] = { 'ssh' => {} }
    host['keys']['ssh']['host_rsa_public'] = host['rsa'] if host.has_key?('rsa')
    host['keys']['ssh']['host_dsa_public'] = host['dsa'] if host.has_key?('dsa')
    nodes << host
  end
end

and then the same recipe that uses the partial_search method to provide better and more targeted search results:

nodes = partial_search(:node, "keys_ssh:* NOT name:#{node.name}",
  :keys => {
    'hostname' => ['hostname'],
    'fqdn' => ['fqdn'],
    'ipaddress' => ['ipaddress'],
    'host_rsa_public' => ['keys', 'ssh', 'host_rsa_public'],
    'host_dsa_public' => ['keys', 'ssh', 'host_dsa_public']
  }
)
nodes << {
  'hostname' => node['hostname'],
  'fqdn' => node['fqdn'],
  'ipaddress' => node['ipaddress'],
  'host_rsa_public' => node['ssh'] && node['ssh']['keys'] && node['ssh']['keys']['host_rsa_public'] ? node['ssh']['keys']['host_rsa_public'] : nil,
  'host_dsa_public' => node['ssh'] && node['ssh']['keys'] && node['ssh']['keys']['host_dsa_public'] ? node['ssh']['keys']['host_dsa_public'] : nil,
}
begin
  other_hosts = data_bag('ssh_known_hosts')
  other_hosts.each do |h|
    host = data_bag_item('ssh_known_hosts', h).to_hash
    host['ipaddress'] ||= r.getaddress(host['fqdn'])
    host['host_rsa_public'] = host.has_key?('rsa') ? host['rsa'] : nil
    host['host_dsa_public'] = host.has_key?('dsa') ? host['dsa'] : nil
    nodes << host
  end
end

And a different example from a different recipe. First, without partial search:

A field name/description pair is available in the JSON object. Use the field name when searching for this information in the JSON object. Any field that exists in any JSON description for any role, node, chef-client, environment, or data bag can be searched.

A nested field appears deeper in the JSON data structure. For example, information about a network interface might be several layers deep: node[:network][:interfaces][:en1]. When nested fields are present in a JSON structure, the chef-client will extract those nested fields to the top-level, flattening them into compound fields that support wildcard search patterns.
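The flattening step can be sketched in plain Ruby. This is a simplified illustration, not the actual indexing code: flatten_fields is a hypothetical helper, the node data is made up, and recording each hash key as a value of its parent field is an assumption made so that a query on an address (which is a hash key in this sample data) has something to match.

```ruby
# Illustrative only: flatten a nested hash into underscore-joined
# compound field names, each mapped to a list of string values.
def flatten_fields(value, path = [], out = Hash.new { |h, k| h[k] = [] })
  case value
  when Hash
    value.each do |key, child|
      # Assumption: a hash key is also recorded as a value of its parent field.
      out[path.join('_')] << key.to_s unless path.empty?
      flatten_fields(child, path + [key.to_s], out)
    end
  when Array
    # Array elements share the field name of the array itself.
    value.each { |child| flatten_fields(child, path, out) }
  else
    out[path.join('_')] << value.to_s
  end
  out
end

# Made-up node data, for illustration only.
node_data = {
  'network' => {
    'interfaces' => {
      'en1' => {
        'flags' => %w[UP BROADCAST RUNNING],
        'addresses' => { '192.168.0.195' => { 'family' => 'inet' } }
      }
    }
  }
}

fields = flatten_fields(node_data)
puts fields['network_interfaces_en1_flags'].join(', ')
```

In this sketch the nested path network, interfaces, en1, flags becomes the single compound field network_interfaces_en1_flags, which is the kind of top-level name a search query can then refer to.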

By combining compound fields with wildcard and range-matching patterns, it is possible to perform very powerful searches, such as using the vendor part of the MAC address to find every node that has a network card made by the specified vendor.

The flattened compound fields allow searches like the following to find data that is present on a node:

$ knife search node "network_interfaces_en1_addresses:192.168.0.195"

This flattened data structure also supports using wildcard compound fields, which allow searches to omit levels within the JSON data structure that are not important to the search query. In the following example, an asterisk (*) is used to show where the wildcard can exist when searching for a nested field:

For each of the wildcard examples above, the possible values are shown contained within the brackets. When running a search query, the query syntax for wildcards is to simply omit the name of the node (while preserving the underscores), similar to:

network_interfaces__flags

This query will search within the flags node, within the JSON structure, for each of UP, BROADCAST, SMART, RUNNING, SIMPLEX, and MULTICAST.

A search pattern is a way to fine-tune search results by returning anything that matches some type of incomplete search query. There are four types of search patterns that can be used when searching the search indexes on the Chef server: exact, wildcard, range, and fuzzy.

An exact matching search pattern is used to search for a key with a name that exactly matches a search query. If the name of the key contains spaces, quotes must be used in the search pattern to ensure the search query finds the key. The entire query must also be contained within quotes, so as to prevent it from being interpreted by Ruby or a command shell. The best way to ensure that quotes are used consistently is to enclose the entire query in single quotes (' ') and the search pattern in double quotes (" ").

To search in a specific data bag for a specific data bag item, enter the following:

$ knife search admins 'id:charlie'

where admins is the name of the data bag and charlie is the name of the data bag item. Something similar to the following will be returned:

A wildcard matching search pattern is used to query for substring matches that replace zero (or more) characters in the search pattern with anything that could match the replaced character. There are two types of wildcard searches:

A question mark (?) can be used to replace exactly one character (as long as that character is not the first character in the search pattern)

An asterisk (*) can be used to replace any number of characters (including zero)

To search for any node that contains the specified key, enter the following:

$ knife search node 'foo:*'

where foo is the name of the key.

To search for a node using a partial name, enter one of the following:

$ knife search node 'name:app*'

or:

$ knife search node 'name:app1*.example.com'

or:

$ knife search node 'name:app?.example.com'

or:

$ knife search node 'name:app1.example.???'

to return app1.example.com (and any other node that matches any of the string searches above).
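The behavior of the two wildcard characters can be sketched by translating a search pattern into a Ruby regular expression. This is only an illustration of the matching rules described above, not how the search engine evaluates a query; wildcard_to_regexp is a hypothetical helper.

```ruby
# Illustrative only: translate a search pattern containing ? and *
# into an anchored Ruby regular expression.
def wildcard_to_regexp(pattern)
  body = pattern.each_char.map do |ch|
    case ch
    when '?' then '.'    # exactly one character
    when '*' then '.*'   # zero or more characters
    else Regexp.escape(ch)
    end
  end.join
  Regexp.new("\\A#{body}\\z")
end

patterns = ['app*', 'app1*.example.com', 'app?.example.com', 'app1.example.???']

# Check every pattern from the examples above against the same node name.
matches = patterns.map { |p| wildcard_to_regexp(p).match?('app1.example.com') }
puts matches.all?
```

Note that app?.example.com would not match a name such as app12.example.com, because the question mark stands for exactly one character.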

A range matching search pattern is used to query for values that are within a range defined by upper and lower boundaries. A range matching search pattern can be inclusive or exclusive of the boundaries. Use square brackets (“[ ]”) to denote inclusive boundaries and curly braces (“{ }”) to denote exclusive boundaries, with the following syntax:

boundary TO boundary

where TO is required (and must be capitalized).

A data bag named sample contains four data bag items: abc, bar, baz, and quz. All of the items between bar and foo, inclusive, can be searched for using an inclusive search pattern.

To search using an inclusive range, enter the following:

$ knife search sample "id:[bar TO foo]"

where square brackets ([]) are used to define the range.

A data bag named sample contains four data bag items: abc, bar, baz, and quz. All of the items between bar and foo, excluding the boundaries themselves, can be searched for using an exclusive search pattern.
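Following the syntax described above, an exclusive query for this data bag would be written with curly braces, for example "id:{bar TO foo}". The difference between the two boundary types can be sketched in plain Ruby. The sketch assumes simple lexicographic string comparison, and in_range is a hypothetical helper rather than anything provided by Chef.

```ruby
# Illustrative only: select ids that fall within a range, comparing
# strings lexicographically. An inclusive range keeps the boundaries;
# an exclusive range drops them.
def in_range(ids, lower, upper, inclusive:)
  ids.select do |id|
    if inclusive
      id >= lower && id <= upper
    else
      id > lower && id < upper
    end
  end
end

items = %w[abc bar baz quz]

inclusive_ids = in_range(items, 'bar', 'foo', inclusive: true)   # like id:[bar TO foo]
exclusive_ids = in_range(items, 'bar', 'foo', inclusive: false)  # like id:{bar TO foo}

puts inclusive_ids.join(', ')
puts exclusive_ids.join(', ')
```

Under these assumptions the inclusive range selects bar and baz, while the exclusive range selects only baz, because bar is itself one of the boundaries.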

A fuzzy matching search pattern is used to search based on the proximity of two strings of characters. An (optional) integer may be used as part of the search query to more closely define the proximity. A fuzzy matching search pattern has the following syntax:

"search_query"~edit_distance

where search_query is the string that will be used during the search and edit_distance is the proximity. A tilde (“~”) is used to separate the edit distance from the search query.
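The idea of an edit distance can be made concrete with a short sketch. The code below computes the classic Levenshtein distance (the number of single-character insertions, deletions, and substitutions needed to turn one string into another). It is an illustration of the general concept only; the search engine's own measure of proximity may differ in detail, and edit_distance is a hypothetical helper.

```ruby
# Illustrative only: Levenshtein edit distance, computed one row at a time.
def edit_distance(a, b)
  prev = (0..b.length).to_a            # distances from "" to each prefix of b
  a.each_char.with_index(1) do |ca, i|
    curr = [i]                         # distance from a[0, i] to ""
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [
        prev[j] + 1,                   # deletion
        curr[j - 1] + 1,               # insertion
        prev[j - 1] + cost             # substitution (or match)
      ].min
    end
    prev = curr
  end
  prev.last
end

puts edit_distance('boo', 'foo')
```

With this measure, boo and foo are one edit apart, so a fuzzy search for boo could reasonably be expected to treat foo as a close match.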

To use a fuzzy search pattern enter something similar to:

$ knife search client "name:boo~"

where boo~ defines the fuzzy search pattern. This will return something similar to:

An operator can be used to ensure that certain terms are included in the results, are excluded from the results, or are not included even when other aspects of the query match. Searches can use the following operators:

| Operator | Description |
| --- | --- |
| AND | Use to find a match when both terms exist. |
| OR | Use to find a match if either term exists. |
| NOT | Use to exclude the term after NOT from the search results. |

Operators must be in ALL CAPS. Parentheses can be used to group clauses and to form sub-queries.

A special character can be used to fine-tune a search query and to increase the accuracy of the search results. The following characters can be included within the search query syntax, but each occurrence of a special character must be escaped with a backslash (\):
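Escaping can be automated with a small helper. The sketch below is a minimal illustration: escape_query is a hypothetical helper, and the character list it uses (+ - & | ! ( ) { } [ ] ^ " ~ * ? : \) is an assumption taken from the standard Lucene query syntax on which Apache Solr is built, so it should be checked against the Chef server documentation before being relied on.

```ruby
# Illustrative only: characters assumed to be special in the query syntax.
SPECIAL_CHARS = /[+\-!(){}\[\]^"~*?:\\&|]/

# Prefix every special character with a backslash.
def escape_query(term)
  term.gsub(SPECIAL_CHARS) { |ch| "\\#{ch}" }
end

escaped = escape_query('recipe[foo::bar]')
puts "run_list:#{escaped}"
```

Applied to recipe[foo::bar], the helper escapes the square brackets and both colons, which gives the same form as a hand-escaped run_list query.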

The expanded lists of roles (all of the roles that apply to a node, including nested roles) and recipes are saved on the Chef server to the role and recipe attributes of a node. These expanded lists allow searching for nodes that run a given recipe, even if that recipe is included by a role.

Note

The recipes field is updated each time the chef-client is run; changes to a run-list will not affect recipes until the next time the chef-client is run on the node.

Node Location

Description

In a specified recipe

To find a node with a specified recipe in the run-list, search within the run_list field (escaping any special characters with a backslash) using the following syntax:

search(:node,'run_list:recipe\[foo\:\:bar\]')

where recipe (singular!) indicates the top-level run-list. Variables can be interpolated into search strings using the Ruby alternate quoting syntax:

search(:node,%Q{run_list:"recipe[#{the_recipe}]"})

In an expanded run-list

To find a node with a recipe in an expanded run-list, search within the recipes field (escaping any special characters with a backslash) using the following syntax:

recipes:RECIPE_NAME

where recipes (plural!) indicates to search within an expanded run-list.

If you just want to use each result of the search and don’t care about the aggregate result you can provide a code block to the search method. Each result is then passed to the block:
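A minimal sketch of the block form follows. So that it can run outside of a chef-client run, it defines a stand-in search method that returns made-up results; inside a recipe the search method from the Recipe DSL would be used instead. The query role:web and the node names are hypothetical.

```ruby
# Stand-in for the Recipe DSL search method, for illustration only.
# It ignores the index and query and works on made-up results.
def search(_index, _query)
  results = [{ 'name' => 'web1' }, { 'name' => 'web2' }]
  return results unless block_given?

  results.each { |result| yield result }
end

names = []
search(:node, 'role:web') do |web_node|
  # Each matching object is passed to the block in turn.
  names << web_node['name']
end

puts names.join(', ')
```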

An API client is any machine that has permission to use the Chef server API to communicate with the Chef server. An API client is typically a node (on which the chef-client runs) or a workstation (on which knife runs), but can also be any other machine configured to use the Chef server API.

Sometimes when a role isn’t fully defined (or implemented), it may be necessary for a machine to connect to a database, search engine, or some other service within an environment by using the settings located on another machine, such as a host name, IP address, or private IP address. The following example shows a simplified settings file:

username: "mysql"
password: "MoveAlong"
host: "10.40.64.202"
port: "3306"

where host is the private IP address of the database server. Use the following knife query to view information about the node:

knife search node "name:name_of_database_server" --long

To access these settings as part of a recipe that is run on the web server, use code similar to:
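The following is a sketch under stated assumptions rather than a definitive recipe. The search method defined here is a stand-in that returns made-up data so the snippet can run outside of a chef-client run, and the private_ip attribute key is hypothetical; the real attribute path depends on the data stored for the node.

```ruby
# Stand-in for the Recipe DSL search method, for illustration only.
# It returns one made-up node object for the database server.
def search(_index, _query)
  [{ 'name' => 'name_of_database_server', 'private_ip' => '10.40.64.202' }]
end

db_server = search(:node, 'name:name_of_database_server')

# Hypothetical attribute key; adjust to match the node's actual data.
private_ip = db_server[0]['private_ip']
puts private_ip
```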

where the “[0]” is the 0 (zero) index for the db_server identifier. A single document is returned because the node is being searched on its unique name. The identifier private_ip will now have the value of the private IP address of the database server (10.40.64.202) and can then be used in templates as a variable, among other possible uses.

An environment is a way to map an organization’s real-life workflow to what can be configured and managed when using Chef server. Every organization begins with a single environment called the _default environment, which cannot be modified (or deleted). Additional environments can be created to reflect each organization’s patterns and workflow. For example, an organization might create production, staging, testing, and development environments. Generally, an environment is also associated with one (or more) cookbook versions.

When searching, an environment is an attribute. This allows search results to be limited to a specified environment by using Boolean operators and extra search terms. For example, to use knife to search for all of the servers running CentOS in an environment named “QA”, enter the following:

knife search node "chef_environment:QA AND platform:centos"

Or, to run a similar search in a recipe (limited here to the environment name), use a code block similar to:

qa_nodes = search(:node, "chef_environment:QA")
qa_nodes.each do |qa_node|
  # Do useful work specific to qa nodes only
end

A data bag is a global variable that is stored as JSON data and is accessible from a Chef server. A data bag is indexed for searching and can be loaded by a recipe or accessed during a search.

Any search for a data bag (or a data bag item) must specify the name of the data bag and then provide the search query string that will be used during the search. For example, to use knife to search within a data bag named “admin_data” across all items, except for the “admin_users” item, enter the following:

$ knife search admin_data "(NOT id:admin_users)"

Or, to include the same search query in a recipe, use a code block similar to:

search(:admin_data,"NOT id:admin_users")

It may not be possible to know which data bag items will be needed. It may be necessary to load everything in a data bag (but not know what “everything” is). Using a search query is the ideal way to deal with that ambiguity, yet still ensure that all of the required data is returned. The following examples show how a recipe can use a series of search queries to search within a data bag named “admins”. For example, to find every administrator:

search(:admins,"*:*")

Or to search for an administrator named “charlie”:

search(:admins,"id:charlie")

Or to search for an administrator with a group identifier of “ops”:

search(:admins,"gid:ops")

Or to search for an administrator whose name begins with the letter “c”:

search(:admins,"id:c*")

Data bag items that are returned by a search query can be used as if they were a hash. For example:
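A minimal sketch of hash-style access follows. The item is a made-up Ruby hash standing in for what search(:admins, "id:charlie").first might return inside a recipe; the shell field and its value are hypothetical.

```ruby
# Made-up data bag item, for illustration only.
charlie = { 'id' => 'charlie', 'gid' => 'ops', 'shell' => '/bin/zsh' }

# Fields are read with ordinary hash lookups.
gid = charlie['gid']

# Hash#fetch can supply a fallback when a field is missing.
shell = charlie.fetch('shell', '/bin/sh')
comment = charlie.fetch('comment', 'no comment set')

puts "#{charlie['id']}: #{gid}, #{shell}, #{comment}"
```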

The following recipe can be used to create a user for each administrator by loading all of the items from the “admins” data bag, looping through each admin in the data bag, and then creating a user resource so that each of those admins exist: