1 Introduction

This is a list of tests, examples, and scripts that I have created in order to
either reproduce an issue, test a bugfix, or validate a behavior.

Most of these examples will either be in a shell format, relying on the use of
curl, or they will be in es-mode format, which will also work in Sense. If you
are reading this as an org-mode file, you can tangle blocks to generate scripts
if so desired.

If you are an Emacs user and want the original, plain-text .org file, replace
the .html for any page with .org to download the file.

This file was last exported: 2016-08-04 Thu 09:37

2 Design

I also do a lot of design in org-mode. My definition of "design" is really more
of a note-taking or measurement-gathering exercise, so some of these may be more
like scratch pads and some will be more like concrete design docs.

As with any of this information, it could be out of date, or it could be
entirely wrong as I test against an older version of Elasticsearch.

4.9 Using the field_value_factor function in a function score query

By far the most common use case I see for function_score is multiplying the
score of a document by some field inside the document, whether it be star rating
for hotels, or popularity for foods. So instead of requiring the user to write
a Groovy script, it would be nice if we could provide an easy way to do this.
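
A minimal sketch of such a request body, built in Python so it can be printed
and then sent with curl or pasted into Sense. The field names (name, stars),
the match query, and the factor and modifier values are assumptions for
illustration only. The expected_score helper is a local model of the documented
arithmetic (query score times modifier(factor * field value)), not a call into
Elasticsearch.

```python
import json
import math


def field_value_factor_query(inner_query, field, factor=1.0, modifier="none"):
    """Build a function_score body that scales _score by a numeric field."""
    return {
        "query": {
            "function_score": {
                "query": inner_query,
                "field_value_factor": {
                    "field": field,
                    "factor": factor,
                    "modifier": modifier,
                },
                # "multiply" is the default boost_mode; spelled out for clarity.
                "boost_mode": "multiply",
            }
        }
    }


def expected_score(query_score, field_value, factor=1.0, modifier="none"):
    """Local model of the score for a few modifiers (not an ES call)."""
    modifiers = {
        "none": lambda v: v,
        "sqrt": math.sqrt,
        "square": lambda v: v * v,
    }
    return query_score * modifiers[modifier](factor * field_value)


# Hypothetical hotel search: boost by the square root of 1.2 * stars.
body = field_value_factor_query(
    {"match": {"name": "hotel"}}, "stars", factor=1.2, modifier="sqrt"
)
print(json.dumps(body, indent=2))
```

Dampening modifiers such as sqrt are useful when the raw field value would
otherwise swamp the text relevance score.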

4.10 Naming a query to return which part of the query matched

Sometimes people ask how they can tell which part of a query matched a
particular document. ES queries all support the _name parameter, and the names
of the queries that matched are then returned with each hit (in
matched_queries).
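
A minimal sketch of a named query and of reading the names back out of a
response. The field names, query names, and the sample response are made up
for illustration; only the _name parameter and the matched_queries key in each
hit come from Elasticsearch itself.

```python
import json


def named_match(field, text, name):
    """A match query tagged with a _name so hits report whether it matched."""
    return {"match": {field: {"query": text, "_name": name}}}


# Two optional clauses; each hit lists which of them actually matched.
body = {
    "query": {
        "bool": {
            "should": [
                named_match("title", "elasticsearch", "title_match"),
                named_match("body", "elasticsearch", "body_match"),
            ]
        }
    }
}


def matched_names(response):
    """Map each hit's _id to the list of named queries that matched it."""
    return {
        hit["_id"]: hit.get("matched_queries", [])
        for hit in response["hits"]["hits"]
    }


# Hand-written response fragment (illustrative, not real output).
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "matched_queries": ["title_match", "body_match"]},
            {"_id": "2", "matched_queries": ["body_match"]},
        ]
    }
}

print(json.dumps(body, indent=2))
print(matched_names(sample_response))
```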

4.13 Sorting with a script

Sometimes you may want to transform a field for sorting. Note: a better way to
do this would be to use function_score to score based on the values of the
strings, but this demonstrates doing it with sorting.
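
A minimal sketch of a script-based sort body. The script (sorting by the length
of a hypothetical, not-analyzed name field) is an assumption and not the
transformation used originally. The key that holds the script text also depends
on the Elasticsearch version ("inline" on older releases, "source" on newer
ones), so it is left as a parameter here.

```python
import json


def script_sort_body(script_text, script_key="inline", sort_type="number",
                     order="asc"):
    """Build a search body that sorts hits by the value a script returns."""
    return {
        "query": {"match_all": {}},
        "sort": {
            "_script": {
                "type": sort_type,  # "number" or "string"
                "script": {script_key: script_text},
                "order": order,
            }
        },
    }


# Hypothetical transformation: sort documents by the length of their name.
body = script_sort_body("doc['name'].value.length()")
print(json.dumps(body, indent=2))
```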

4.17 Using BM25 or DFR instead of TF-IDF

While TF-IDF does a great job, sometimes people may want to use BM25, which is
another nice similarity algorithm. This is an example of setting it up per-field
so you can compare the two algorithms.

I did this with a multi-field that indexed the body field with all the
different similarities, just so I could compare them all at once. Interestingly,
when I spoke to Robert about this, it turned out that nothing is actually
changed during indexing; setting the similarity in the mapping is just a safety
measure in case that ever needs to be the case.
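
A minimal sketch of index settings and mappings for such a multi-field. The
names (doc, body, my_dfr), the DFR parameters, and the "string" field type are
assumptions for illustration; newer Elasticsearch versions use "text" instead
of "string", drop mapping types, and default to BM25 rather than TF-IDF, so
adjust to the version under test.

```python
import json


def similarity_comparison_index(field="body", field_type="string",
                                doc_type="doc"):
    """Index body whose sub-fields index the same text with other similarities."""
    settings = {
        "index": {
            "similarity": {
                # DFR has no built-in default name, so define a custom one.
                "my_dfr": {
                    "type": "DFR",
                    "basic_model": "g",
                    "after_effect": "l",
                    "normalization": "h2",
                    "normalization.h2.c": "3.0",
                }
            }
        }
    }
    mappings = {
        doc_type: {
            "properties": {
                field: {
                    "type": field_type,  # top level keeps the default similarity
                    "fields": {
                        "bm25": {"type": field_type, "similarity": "BM25"},
                        "dfr": {"type": field_type, "similarity": "my_dfr"},
                    },
                }
            }
        }
    }
    return {"settings": settings, "mappings": mappings}


index_body = similarity_comparison_index()
print(json.dumps(index_body, indent=2))
```

With this in place, the same match query can be run against body, body.bm25,
and body.dfr to compare how each similarity scores the same documents.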

I'd like to make it configurable at query time; I think I have a branch for it
somewhere…

4.21.4 Check the field data usage
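
One way to check this is the nodes stats API, for example
GET /_nodes/stats/indices/fielddata?fields=* (or GET /_cat/fielddata?v for a
tabular view). A minimal sketch of summarizing that response follows; the node
IDs, node names, field names, and byte counts in the sample are made up for
illustration.

```python
def fielddata_usage(nodes_stats):
    """Return {node name: (total bytes, {field: bytes})} from nodes stats."""
    usage = {}
    for node in nodes_stats["nodes"].values():
        fielddata = node["indices"]["fielddata"]
        per_field = {
            name: stats["memory_size_in_bytes"]
            for name, stats in fielddata.get("fields", {}).items()
        }
        usage[node["name"]] = (fielddata["memory_size_in_bytes"], per_field)
    return usage


# Hand-written response fragment (illustrative, not real output).
sample_stats = {
    "nodes": {
        "abc123": {
            "name": "node1",
            "indices": {
                "fielddata": {
                    "memory_size_in_bytes": 3072,
                    "evictions": 0,
                    "fields": {
                        "body": {"memory_size_in_bytes": 2048},
                        "title": {"memory_size_in_bytes": 1024},
                    },
                }
            },
        }
    }
}

for name, (total, per_field) in fielddata_usage(sample_stats).items():
    print(name, total, per_field)
```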

4.22 Determining why a shard will not be allocated

So, suppose you create an index but you can't figure out why its shards won't
allocate. There are a couple of ways to diagnose this, such as turning up the
logging level. However, you can also use the reroute API to get a nice
explanation:
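
A minimal sketch of such a request and of picking the blocking deciders out of
the response. The index name, shard number, node name, and the sample response
are assumptions for illustration. The command name also depends on the
Elasticsearch version (older releases use "allocate", newer ones split it into
commands such as "allocate_replica"), so it is left as a parameter.

```python
import json


def reroute_explain_request(index, shard, node, command="allocate"):
    """Return (path, body) for a dry-run reroute that explains its decision."""
    # dry_run keeps the cluster state unchanged; explain adds the reasons.
    path = "/_cluster/reroute?explain&dry_run"
    body = {
        "commands": [
            {command: {"index": index, "shard": shard, "node": node}}
        ]
    }
    return path, body


def blocking_decisions(response):
    """List (decider, explanation) for every decision that is not YES."""
    blockers = []
    for explanation in response.get("explanations", []):
        for decision in explanation.get("decisions", []):
            if decision["decision"] != "YES":
                blockers.append((decision["decider"], decision["explanation"]))
    return blockers


path, body = reroute_explain_request("test", 0, "node1")
print("POST", path)
print(json.dumps(body, indent=2))

# Hand-written response fragment (illustrative, not real output).
sample_response = {
    "explanations": [
        {
            "command": "allocate",
            "decisions": [
                {"decider": "same_shard", "decision": "YES",
                 "explanation": "shard is not allocated to same node or host"},
                {"decider": "filter", "decision": "NO",
                 "explanation": "node does not match index include filters"},
            ],
        }
    ]
}
print(blocking_decisions(sample_response))
```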