This module is a thin API which makes it easy to communicate with an ElasticSearch cluster.

It maintains a list of all servers/nodes in the ElasticSearch cluster,
and spreads the load across these nodes in round-robin fashion.
If the current active node disappears,
then it attempts to connect to another node in the list.

Forking a process triggers a server list refresh,
and a new connection to a randomly chosen node in the list.

Methods that query the ElasticSearch cluster return the raw data structure that the cluster returns. This may change in the future, but as these data structures are still in flux, I thought it safer not to try to interpret them.

Anything that is known to be an error throws an exception, eg trying to delete a non-existent index.

servers can be either a single server or an ARRAY ref with a list of servers. If not specified, then it defaults to localhost and the port for the specified transport (eg 9200 for http* or 9500 for thrift).

These servers are used in a round-robin fashion. If any server fails to connect, then the other servers in the list are tried, and if any succeeds, then a list of all servers/nodes currently known to the ElasticSearch cluster is retrieved and stored.

Every max_requests (default 10,000) this list of known nodes is refreshed automatically. To disable this automatic refresh, you can set max_requests to 0.

Regardless of the max_requests setting, a list of live nodes will still be retrieved on the first request. This may not be desirable behaviour if, for instance, you are connecting to remote servers which use internal IP addresses, or which don't allow remote nodes() requests.

If you want to disable this behaviour completely, set no_refresh to 1, in which case the transport module will round-robin through the servers list only. Failed nodes will be removed from the list (but added back every max_requests requests, or when all nodes have failed).
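As a sketch of how these options fit together (the server addresses below are placeholders, and the values shown for transport and max_requests are just the documented defaults):

```perl
use ElasticSearch;

my $es = ElasticSearch->new(
    servers => [ '192.168.0.1:9200',      # an array ref of servers,
                 '192.168.0.2:9200' ],    # or a single scalar
    transport    => 'http',               # eg 9200 for http*, 9500 for thrift
    max_requests => 10_000,               # refresh the node list this often
    no_refresh   => 0,                    # 1 = round-robin the listed servers only
);
```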

timeout for all CRUD methods and "search()" is a query timeout, specifying the amount of time ElasticSearch will spend (roughly) processing a query. Units can be concatenated with the integer value, e.g., 500ms or 1s.
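For example (index, type and query values here are illustrative only), a query timeout can be passed alongside the usual search parameters:

```perl
# Abort query processing on the ElasticSearch side after roughly 500ms
my $results = $es->search(
    index   => 'tweets',
    type    => 'tweet',
    queryb  => { user => 'kimchy' },
    timeout => '500ms',                  # or eg '1s'
);
```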

If $docs or $ids is an empty array ref, then mget() will just return an empty array ref.

Returns an array ref containing all of the documents requested. If a document is not found, then its entry will include {exists => 0}. If you would rather filter out these missing docs, pass filter_missing => 1.
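A minimal sketch (index, type and IDs are made up for illustration):

```perl
my $docs = $es->mget(
    index          => 'tweets',
    type           => 'tweet',
    ids            => [ 1, 2, 3 ],
    filter_missing => 1,          # drop entries whose document doesn't exist
);
# Without filter_missing, a missing doc's entry includes { exists => 0 }
```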

dest_index is the name of the destination index, ie where the docs are indexed to. If you are indexing your data from one cluster to another, and you want to use the same index name in your destination cluster, then you can leave this blank.

bulk_size - the number of docs that will be indexed at a time. Defaults to 1,000.

Set quiet to 1 if you don't want any progress information to be printed to STDOUT.

transform should be a sub-ref which will be called for each doc, allowing you to transform some element of the doc, or to skip the doc by returning undef.
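Putting these parameters together, a reindex might look like the following sketch (the index names and the deleted/migrated fields are invented for illustration; the source is typically a scrolled search on the old index):

```perl
my $source = $es->scrolled_search(
    index       => 'old_index',
    search_type => 'scan',
    scroll      => '5m',
);

$es->reindex(
    source     => $source,
    dest_index => 'new_index',      # omit to keep the same index name
    bulk_size  => 1000,
    quiet      => 0,
    transform  => sub {
        my $doc = shift;
        return undef if $doc->{_source}{deleted};   # skip this doc
        $doc->{_source}{migrated} = 1;              # or modify it in place
        return $doc;
    },
);
```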

DEPRECATION: count() previously took query types at the top level, eg $es->count( term => { ... } ). This form still works, but is deprecated. Instead use the queryb or query parameter as you would in "search()".
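For instance (field and index names are examples only):

```perl
# Deprecated form:
# $es->count( term => { user => 'kimchy' } );

# Preferred form - pass queryb (SearchBuilder-style) or query (native):
my $result = $es->count(
    index  => 'tweets',
    type   => 'tweet',
    queryb => { user => 'kimchy' },
);
```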

With "msearch()" you can run multiple searches in parallel. queries can contain either an array of queries, or a hash of named queries. $results will be either an array or a hash of results, matching whichever form you pass in.

The top-level index, type and search_type parameters define default values which will be used for each query, although these can be overridden in the query parameters:
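A sketch using the hash-of-named-queries form (index names, query names and fields are invented), with one query overriding the top-level default index:

```perl
my $results = $es->msearch(
    index   => 'tweets',                 # default index for every query
    queries => {
        by_user => { queryb => { user => 'kimchy' } },
        errors  => {
            index  => 'logs',            # overrides the default index
            queryb => { level => 'ERROR' },
        },
    },
);
# $results is a hash ref keyed by 'by_user' and 'errors'
```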

DEPRECATION: delete_by_query() previously took query types at the top level, eg $es->delete_by_query( term => { ... } ). This form still works, but is deprecated. Instead use the queryb or query parameter as you would in "search()".

More-like-this (mlt) finds related/similar documents. It is possible to run a search query with a more_like_this clause (where you pass in the text you're trying to match), or to use this method, which uses the text of the document referred to by index/type/id.

This gets transformed into a search query, so all of the search parameters are also available.
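A minimal sketch (index, type and id are placeholders; size is one of the usual search parameters mentioned above):

```perl
# Find documents similar to the document tweets/tweet/1
my $results = $es->mlt(
    index => 'tweets',
    type  => 'tweet',
    id    => 1,
    size  => 10,        # any search parameter can be passed as well
);
```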

Returns a hashref with { valid => 1 } if the passed-in query (native ES query), queryb (SearchBuilder-style query) or q (Lucene query string) is valid. Otherwise valid is false. Set explain to 1 to include an explanation of why the query is invalid.
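For example (the index name is a placeholder, and the query string is deliberately malformed):

```perl
my $result = $es->validate_query(
    index   => 'tweets',
    q       => 'user:kimchy AND (',   # unbalanced paren - invalid Lucene syntax
    explain => 1,                     # ask for the reason it is invalid
);
# $result->{valid} is false for this query
```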

The open and close index APIs allow you to close an index and open it again later.

A closed index has almost no overhead on the cluster (except for maintaining its metadata), and is blocked for read/write operations. A closed index can be opened which will then go through the normal recovery process.

Index templates allow you to define templates that will automatically be applied to newly created indices. You can specify both settings and mappings, and a simple pattern template that controls whether the template will be applied to a new index.

Flushes one or more indices, which frees memory from the index by flushing data to the index storage and clearing the internal transaction log. By default, ElasticSearch uses memory heuristics in order to automatically trigger flush operations as required in order to clear memory.

Explicitly refreshes one or more indices, making all operations performed since the last refresh available for search. The (near) real-time capabilities depend on the index engine used. For example, the robin engine requires refresh to be called, but by default a refresh is scheduled periodically.

Explicitly performs a snapshot through the gateway of one or more indices (backs them up). By default, each index gateway periodically snapshots changes, though this can be disabled and controlled completely through this API.

DEPRECATION: put_mapping() previously took the mapping parameters at the top level, eg $es->put_mapping( properties => { ... } ). This form still works, but is deprecated. Instead use the mapping parameter.
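For instance (index, type and field definitions are examples only):

```perl
# Deprecated: $es->put_mapping( properties => { ... } );

# Preferred: wrap the mapping definition in the 'mapping' parameter
$es->put_mapping(
    index   => 'tweets',
    type    => 'tweet',
    mapping => {
        properties => {
            user    => { type => 'string', index => 'not_analyzed' },
            message => { type => 'string' },
        },
    },
);
```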

Index warming allows you to run typical search requests to "warm up" new segments before they become available for search. Warmup searches typically include requests that require heavy loading of data, such as faceting or sorting on specific fields.

Returns any matching registered warmers. The $warmer can be blank, the name of a particular warmer, or use wildcards, eg "warmer_*". Throws an error if no matching warmer is found and ignore_missing is false.

Deletes any matching registered warmers. The index parameter is required and can be set to _all to match all indices. The $warmer can be the name of a particular warmer, or use wildcards, eg "warmer_*" or "*" for any warmer. Throws an error if no matching warmer is found and ignore_missing is false.

It can block to wait for a particular status (or better), or can block to wait until the specified number of shards have been relocated (where 0 means all) or the specified number of nodes have been allocated.

The "cluster_reroute" command allows you to explicitly affect shard allocation within a cluster. For example, a shard can be moved from one node to another, an allocation can be cancelled, or an unassigned shard can be explicitly allocated on a specific node.

NOTE: after executing the commands, the cluster will automatically rebalance itself if it is out of balance. Use the dry_run parameter to see what the final outcome will be after automatic rebalancing, before executing the real "cluster_reroute" call.
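A sketch of a move command with dry_run (the index name, shard number and node IDs are placeholders):

```perl
# Preview the outcome of moving shard 0 before running the real reroute
my $result = $es->cluster_reroute(
    dry_run  => 1,
    commands => [
        { move => {
            index     => 'tweets',
            shard     => 0,
            from_node => 'node_1',
            to_node   => 'node_2',
        } },
    ],
);
```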

trace_calls() is used for debugging. All requests to the cluster are logged either to STDERR, or the specified filehandle, or the specified filename, with the current $PID appended, in a form that can be rerun with curl.
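For example (the log path is a placeholder):

```perl
$es->trace_calls(1);                 # log curl-style requests to STDERR
$es->trace_calls('/tmp/es_trace');   # log to /tmp/es_trace.$PID instead
$es->trace_calls($fh);               # or pass an already-open filehandle
$es->trace_calls(0);                 # disable tracing
```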

This tries to retrieve a list of all known live servers in the ElasticSearch cluster by connecting to each of the last known live servers (and the initial list of servers passed to new()) until it succeeds.

This list of live servers is then used in a round-robin fashion.

refresh_servers() is called on the first request and every max_requests. This automatic refresh can be disabled by setting max_requests to 0:
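```perl
# Never sniff the cluster - use only the servers passed to new()
my $es = ElasticSearch->new(
    servers      => 'localhost:9200',
    max_requests => 0,
);
```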

Gets/sets the camel_case flag. If true, then all JSON keys returned by ElasticSearch are in camelCase, instead of with_underscores. This flag does not apply to the source document being indexed or fetched.
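For example:

```perl
$es->camel_case(1);
# JSON keys in responses now come back in camelCase,
# eg maxScore instead of max_score; your own document
# _source fields are returned unchanged
```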

The _source key that is returned from a "get()" contains the original JSON string that was used to index the document initially. ElasticSearch parses JSON more leniently than JSON::XS, so if invalid JSON is used to index the document (eg unquoted keys) then $es->get(....) will fail with a JSON exception.

Any documents indexed via this module will not be susceptible to this problem.