ElasticSearch-0.68

NAME

ElasticSearch::QueryParser - Check or filter query strings

DESCRIPTION

If you pass an illegal query string to ElasticSearch, the request will fail. When using a query string from an external source, eg the keywords field from a web search form, it is important to filter it to avoid these failures.

You may also want to allow or disallow certain query string features, eg the ability to search on a particular field.

allow_bool

By default, boolean operators are allowed. Set allow_bool to false to disable them.

Note: This doesn't affect the + or - operators, which are always allowed, eg:

+apple -crab

allow_boost

Boost allows you to give more importance to a particular word, group of words or phrase, eg:

foo^2 (bar baz)^3 "this exact phrase"^5

By default, boost is enabled. Setting allow_boost to false would convert the above example to:

foo (bar baz) "this exact phrase"
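
The conversion above can be illustrated with a small, self-contained sketch. This is not the module's implementation: strip_boost is a hypothetical helper, and it assumes a boost is always a caret followed by an integer or decimal number. It is deliberately naive, so it would also strip a ^N sequence that appears inside a quoted phrase.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of ElasticSearch::QueryParser.
# Removes every "^number" boost suffix from a query string.
sub strip_boost {
    my ($query) = @_;
    $query =~ s/\^\d+(?:\.\d+)?//g;    # ^2, ^3, ^0.5 ...
    return $query;
}

print strip_boost('foo^2 (bar baz)^3 "this exact phrase"^5'), "\n";
# prints: foo (bar baz) "this exact phrase"
```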

allow_fuzzy

Lucene supports fuzzy searches based on the Levenshtein Distance, eg:

supercalifragilisticexpialidocious~0.5

To disable these, set allow_fuzzy to false.
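
For readers unfamiliar with the Levenshtein distance, the sketch below computes it for two strings: the number of single-character insertions, deletions and substitutions needed to turn one into the other. It only illustrates the metric. The function name levenshtein is an assumption made here, and the code says nothing about how Lucene maps the number after ~ onto a distance.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Illustrative only: classic dynamic-programming edit distance,
# keeping just the previous row of the table.
sub levenshtein {
    my ( $from, $to ) = @_;
    my @prev = ( 0 .. length($to) );
    for my $i ( 1 .. length($from) ) {
        my @cur = ($i);
        for my $j ( 1 .. length($to) ) {
            my $cost = substr( $from, $i - 1, 1 ) eq substr( $to, $j - 1, 1 ) ? 0 : 1;
            push @cur, min(
                $prev[$j] + 1,            # deletion
                $cur[ $j - 1 ] + 1,       # insertion
                $prev[ $j - 1 ] + $cost,  # substitution (or match)
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

print levenshtein( 'john', 'jon' ), "\n";    # prints: 1
```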

allow_slop

While a phrase search (eg "this exact phrase") looks for the exact phrase, in the same order, you can use phrase slop to find all the words in the phrase, in any order, within a certain number of words, eg:

For the phrase: "The quick brown fox jumped over the lazy dog."

Query string:      Matches:
"quick brown"      Yes
"brown quick"      No
"quick fox"        No
"brown quick"~2    Yes   # within 2 words of each other
"fox dog"~6        Yes   # within 6 words of each other

To disable this "phrase slop", set allow_slop to false.
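
The table above can be reproduced with a simplified model of slop for two-word phrases: the second word is expected one position after the first, and the slop needed is how far it actually is from that expected position. The sketch below is an assumption-laden illustration, not Lucene's or this module's code. min_slop is a hypothetical helper, words are split on non-word characters, and the two words are assumed to be different.

```perl
use strict;
use warnings;

# Hypothetical helper: smallest slop at which the two-word phrase
# "$first $second" would match $text, or undef if a word is missing.
sub min_slop {
    my ( $text, $first, $second ) = @_;
    my @words = map { lc($_) } $text =~ /(\w+)/g;
    ( $first, $second ) = ( lc($first), lc($second) );

    my $best;
    for my $p1 ( grep { $words[$_] eq $first } 0 .. $#words ) {
        for my $p2 ( grep { $words[$_] eq $second } 0 .. $#words ) {
            # Distance of the second word from its expected position ($p1 + 1).
            my $moves = abs( ( $p2 - 1 ) - $p1 );
            $best = $moves if !defined $best || $moves < $best;
        }
    }
    return $best;
}

my $text = 'The quick brown fox jumped over the lazy dog.';
print min_slop( $text, 'brown', 'quick' ), "\n";    # prints: 2
print min_slop( $text, 'fox',   'dog' ),   "\n";    # prints: 4
```

Under this model "brown quick" needs a slop of 2 and "fox dog" a slop of 4, which is consistent with both matching at ~2 and ~6 respectively, while "quick fox" needs 1 and so fails as an exact phrase.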

allow_ranges

Lucene can accept ranges, eg:

date:[2001 TO 2010] name:[alan TO john]

Ranges are not allowed unless you enable them. To do so, set allow_ranges to true.

wildcard_prefix

Lucene can accept wildcard searches such as:

jo*n smith?

Lucene takes these wildcards and expands the search to include all matching terms, eg jo*n could be expanded to jon, john, jonathan, etc.

This can result in a huge number of terms, so it is advisable to require that the first $min characters of the word are not wildcards.

By default, wildcard_prefix requires that at least the first character is not a wildcard, ie * is not acceptable, but s* is.

You can change the minimum length of the non-wildcard prefix by setting wildcard_prefix, eg:

$qp->filter("foo* foobar*", wildcard_prefix=>4)
# "foo foobar*"
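
The effect of wildcard_prefix in the example above can be mimicked with a standalone sketch. It is not the module's code: limit_wildcards is a hypothetical helper, terms are assumed to be separated by whitespace, and a term whose non-wildcard prefix is too short is assumed to simply lose its wildcards (the real filter() may treat such terms differently).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of ElasticSearch::QueryParser.
# Strips * and ? from any term whose leading run of non-wildcard
# characters is shorter than $min (default 1).
sub limit_wildcards {
    my ( $query, $min ) = @_;
    $min = 1 unless defined $min;

    my @out;
    for my $term ( split ' ', $query ) {
        my ($prefix) = $term =~ /^([^*?]*)/;
        if ( $term =~ /[*?]/ && length($prefix) < $min ) {
            $term =~ tr/*?//d;    # prefix too short: drop the wildcards
        }
        push @out, $term if length $term;    # skip terms that became empty
    }
    return join ' ', @out;
}

print limit_wildcards( 'foo* foobar*', 4 ), "\n";
# prints: foo foobar*
```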

BUGS

This is a new module, so it is likely that there will be bugs, and the list of options and how "filter()" cleans up the query string may well change.