Returns information and statistics on terms in the fields of a particular
document. The document could be stored in the index or artificially provided
by the user. Term vectors are realtime by default, not near
realtime. This can be changed by setting the realtime parameter to false.

GET /twitter/tweet/1/_termvectors

Optionally, you can specify the fields for which the information is
retrieved either with a parameter in the URL

GET /twitter/tweet/1/_termvectors?fields=message

or by adding the requested fields in the request body (see
example below). Fields can also be specified with wildcards
in a similar way to the multi match query.
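
As a minimal sketch of both options, the Python snippet below builds the request path and an equivalent request body without sending anything. The helper name termvectors_path and the field pattern user* are hypothetical, and joining several fields with commas in the URL is an assumption about how the fields parameter is parsed.

```python
from urllib.parse import urlencode


def termvectors_path(index, doc_type, doc_id, fields=None):
    """Build the _termvectors path for a stored document (hypothetical helper)."""
    path = f"/{index}/{doc_type}/{doc_id}/_termvectors"
    if fields:
        # Assumption: several fields are sent as one comma-separated value.
        # Commas and wildcards are left unescaped for readability.
        path += "?" + urlencode({"fields": ",".join(fields)}, safe=",*")
    return path


# Option 1: fields as a URL parameter.
print(termvectors_path("twitter", "tweet", 1, ["message"]))

# Option 2: fields in the request body, here with a hypothetical wildcard pattern.
body = {"fields": ["message", "user*"]}
```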

Note that the usage of /_termvector is deprecated in 2.0, and replaced by /_termvectors.

Three types of values can be requested: term information, term statistics
and field statistics. By default, all term information and field
statistics are returned for all fields but no term statistics.
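
The request body below is a sketch of how these three kinds of values might be switched on and off. The flag names positions, offsets, payloads, term_statistics and field_statistics are assumed from the Elasticsearch term vectors API rather than taken from the text above, so check them against your version. The helper name is hypothetical, and its defaults mirror the behaviour described here: term information and field statistics on, term statistics off.

```python
import json


def termvectors_body(fields, term_statistics=False, field_statistics=True,
                     positions=True, offsets=True, payloads=True):
    """Assemble a _termvectors request body (hypothetical helper)."""
    return {
        "fields": list(fields),
        # Term information: where each term occurs in the field.
        "positions": positions,
        "offsets": offsets,
        "payloads": payloads,
        # Term statistics are off unless explicitly requested.
        "term_statistics": term_statistics,
        # Field statistics are on by default.
        "field_statistics": field_statistics,
    }


# Request everything, including term statistics, for one field.
full_body = termvectors_body(["message"], term_statistics=True)
print(json.dumps(full_body, indent=2))
```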

If the requested information wasn’t stored in the index, it will be
computed on the fly if possible. Additionally, term vectors can be computed
for documents that do not exist in the index but are instead provided by the user.

Start and end offsets assume UTF-16 encoding is being used. If you want to use
these offsets in order to get the original text that produced this token, you
should make sure that the string you are taking a sub-string of is also encoded
using UTF-16.
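
The following Python sketch illustrates why this matters. Python strings are indexed by code point, so a character outside the Basic Multilingual Plane shifts every later UTF-16 offset by one. The helper name, the sample text and the offsets 3 and 10 are chosen by hand for illustration; they are not output from a real request.

```python
def slice_by_utf16_offsets(text, start_offset, end_offset):
    """Return the substring covered by UTF-16 code unit offsets."""
    # utf-16-le uses exactly 2 bytes per code unit and writes no byte order mark.
    encoded = text.encode("utf-16-le")
    return encoded[2 * start_offset:2 * end_offset].decode("utf-16-le")


# The emoji takes two UTF-16 code units, so "twitter" spans offsets 3 to 10.
text = "\U0001F600 twitter test"

token = slice_by_utf16_offsets(text, 3, 10)
naive = text[3:10]  # code point indexing is off by one after the emoji

print(repr(token))
print(repr(naive))
```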

With the filter parameter, the terms returned can also be filtered based
on their tf-idf scores. This can be useful for finding a good
characteristic vector of a document. This feature works in a similar manner to
the second phase of the
More Like This Query. See example 5
for usage.

The following sub-parameters are supported:

max_num_terms

Maximum number of terms that must be returned per field. Defaults to 25.

min_term_freq

Ignore words with less than this frequency in the source doc. Defaults to 1.

max_term_freq

Ignore words with more than this frequency in the source doc. Defaults to unbounded.

min_doc_freq

Ignore terms which do not occur in at least this many docs. Defaults to 1.

max_doc_freq

Ignore words which occur in more than this many docs. Defaults to unbounded.

min_word_length

The minimum word length below which words will be ignored. Defaults to 0.

max_word_length

The maximum word length above which words will be ignored. Defaults to unbounded (0).
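
As a rough sketch, the sub-parameters above could be assembled into a request body like this. The helper name filtered_body is hypothetical, and enabling term_statistics alongside the filter is an assumption made so that document frequencies are available for scoring.

```python
import json

# The sub-parameters listed above.
FILTER_PARAMS = {
    "max_num_terms", "min_term_freq", "max_term_freq",
    "min_doc_freq", "max_doc_freq", "min_word_length", "max_word_length",
}


def filtered_body(fields, **filter_params):
    """Build a request body with a filter section (hypothetical helper)."""
    unknown = set(filter_params) - FILTER_PARAMS
    if unknown:
        raise ValueError(f"unsupported filter sub-parameters: {sorted(unknown)}")
    return {
        "fields": list(fields),
        # Assumption: term statistics are requested together with the filter.
        "term_statistics": True,
        "filter": filter_params,
    }


# Keep at most three terms per field, using the documented minimums.
body = filtered_body(["message"], max_num_terms=3, min_term_freq=1, min_doc_freq=1)
print(json.dumps(body, indent=2))
```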

The term and field statistics are not accurate. Deleted documents
are not taken into account. The information is only retrieved for the
shard the requested document resides in.
The term and field statistics are therefore only useful as relative measures
whereas the absolute numbers have no meaning in this context. By default,
when requesting term vectors of artificial documents, the shard from which the
statistics are taken is selected at random. Use the routing parameter only when
you need to hit a particular shard.

Term vectors which are not explicitly stored in the index are automatically
computed on the fly. The following request returns all information and statistics for the
fields in document 1, even though the terms haven’t been explicitly stored in the index.
Note that for the field text, the terms are not re-generated.

Term vectors can also be generated for artificial documents,
that is, for documents not present in the index. For example, the following request would
return the same results as in example 1. The mapping used is determined by the
index and type.

If dynamic mapping is turned on (default), the document fields not in the original
mapping will be dynamically created.
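
A request for an artificial document might be put together as in the sketch below. The doc key and the path without a document id are assumed from the Elasticsearch term vectors API; the helper name and the field values are made up for illustration.

```python
import json


def artificial_doc_request(index, doc_type, doc):
    """Return (path, body) for an artificial document (hypothetical helper)."""
    # Assumption: no document id appears in the path, because the
    # document is supplied in the body rather than read from the index.
    path = f"/{index}/{doc_type}/_termvectors"
    body = {"doc": doc}
    return path, body


path, body = artificial_doc_request(
    "twitter", "tweet",
    {"fullname": "John Doe", "text": "twitter test test test"},
)
print("GET", path)
print(json.dumps(body, indent=2))
```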

Additionally, an analyzer different from the one configured for the field may be provided
by using the per_field_analyzer parameter. This is useful in order to
generate term vectors in any fashion, especially when using artificial
documents. When providing an analyzer for a field that already stores term
vectors, the term vectors will be re-generated.
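
The sketch below shows one way a per-field analyzer might be supplied for an artificial document. The field name fullname and its value are hypothetical; keyword is a built-in analyzer that emits the whole field value as a single term.

```python
import json


def body_with_analyzers(doc, analyzers):
    """Build a body that overrides the analyzer for some fields (hypothetical helper)."""
    missing = set(analyzers) - set(doc)
    if missing:
        raise ValueError(f"no such fields in the document: {sorted(missing)}")
    return {
        "doc": doc,
        # Only return term vectors for the fields being re-analyzed.
        "fields": sorted(analyzers),
        # Maps each field name to the analyzer to use instead of the mapped one.
        "per_field_analyzer": dict(analyzers),
    }


body = body_with_analyzers(
    {"fullname": "John Doe", "text": "twitter test test test"},
    {"fullname": "keyword"},
)
print(json.dumps(body, indent=2))
```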

Finally, the terms returned can be filtered based on their tf-idf scores. In
the example below we obtain the three most "interesting" keywords from the
artificial document having the given "plot" field value. Notice
that neither the keyword "Tony" nor any stop words are part of the response, as
their tf-idf must be too low.
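
To see why a frequent term can drop out, the following sketch ranks three terms with the textbook formula tf * log(N / df). This is not necessarily the exact scoring used internally, and all counts, as well as the term rare_term, are invented for illustration.

```python
import math


def tf_idf(term_freq, doc_freq, doc_count):
    """Textbook tf-idf; an illustrative stand-in, not the engine's own scoring."""
    return term_freq * math.log(doc_count / doc_freq)


# Hypothetical statistics: term -> (frequency in the document, documents containing it).
doc_count = 1000
stats = {
    "the": (5, 990),        # stop word: frequent here, but present almost everywhere
    "tony": (1, 400),       # common name: appears in many documents
    "rare_term": (2, 10),   # distinctive term: appears in few documents
}

scores = {term: tf_idf(tf, df, doc_count) for term, (tf, df) in stats.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

# Keeping only the best term mimics a small max_num_terms.
print(ranked[:1])
```

With a small limit on the number of returned terms, only the highest-ranked entries survive, which is why widespread terms tend to disappear from the response.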