Hi everyone,
I have a 10 million document index, and multiple high-memory (e.g. 250GB
RAM, 32 cores) machines available. I'd like to do everything possible to
keep search latency as low as possible (< 50ms ideally), especially in
a high-throughput environment. I know it depends a lot on the query, but
to start with I'm asking about general index/cluster settings.

Here's a list of things I'm doing so far:

ES_HEAP_SIZE=100g

machine has no swap

20 shards
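
In config terms, that's roughly the following (a sketch; `bootstrap.mlockall` is something I'm adding alongside disabling swap, and names assume a 0.90/1.x-era config):

```yaml
# environment (read by bin/elasticsearch)
#   ES_HEAP_SIZE=100g

# elasticsearch.yml
index.number_of_shards: 20
bootstrap.mlockall: true   # lock the heap in RAM; complements running with no swap
```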

Are there any other settings or search parameters I should be aware of?

Also, I'm wondering how many shards is recommended in my case. Having more
shards helps reduce latency by parallelizing the work, but at some point
the overhead of fanning out the requests and collecting the partial results
will take over and latency would get worse. Is there a rule of thumb for a
sweet spot that others have found?

The volume of updates to the index is relatively small (500K/day), but
bursty. From initial testing, it seems like issuing updates can increase
the latency of searches running on the same machine. Is there a good
way to "isolate" search and updates, either by some setting, or splitting
up the cluster somehow to have dedicated update nodes and dedicated search
nodes? (Not sure how you'd deploy a setup like this, or control where the
search/update calls went.)

For the geo filter, I've tried the optimize_bbox option, and the default of
"memory" seemed to work the best, surprisingly. I haven't tried using
geohash yet, and I can't tell from the docs how one might use it, but maybe
that is inherently faster since it uses indexes?

Unfortunately, there are a lot of unique locations in my query stream, so I
don't know if caching this filter will work. (Each cached filter consumes
about 1 bit per document in memory, is that right? So about 1.25MB in my
case. Storing the most frequent 10,000 of these would take up about 12.5GB
of ram. So maybe that's doable...)
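To sanity-check my numbers (assuming the usual 1-bit-per-document estimate for a cached bitset filter):

```python
# Estimate filter-cache memory: a cached bitset filter costs
# roughly 1 bit per document in the index.
num_docs = 10_000_000
bytes_per_filter = num_docs / 8           # 1 bit per doc -> bytes
mb_per_filter = bytes_per_filter / 1e6    # per-filter cost in MB

num_cached_filters = 10_000
total_gb = num_cached_filters * bytes_per_filter / 1e9  # total cost in GB

print(mb_per_filter, total_gb)  # 1.25 12.5
```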

Sorry if that's a lot of questions, but I figured other people may benefit
from this thread too.
Thanks for any help.

Hi everyone,
I have a 10 million document index, and multiple high-memory (e.g. 250GB
RAM, 32 cores) machines available. I'd like to do everything possible to
keep search latency as low as possible (< 50ms ideally), especially in
a high-throughput environment. I know it depends a lot on the query, but
to start with I'm asking about general index/cluster settings.

Here's a list of things I'm doing so far:

ES_HEAP_SIZE=100g

machine has no swap

20 shards

Are there any other settings or search parameters I should be aware of?

If you care about latency, it might make sense to configure a smaller heap
size. The issue with large heaps is that they take longer to collect,
especially for collections of the old generation (I'm talking about minutes
here). I would recommend a heap size of at most 30GB (which will also
allow you to benefit from compressed pointers); one way to go this route
is to start several nodes per physical machine.
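
For example (a sketch assuming a 0.90/1.x-style layout; node names and data paths are made up):

```shell
# Run several nodes on one 250GB machine, each with a ~30GB heap,
# instead of a single node with a 100GB heap.
export ES_HEAP_SIZE=30g

# Give each node its own name and data path (hypothetical paths):
bin/elasticsearch -Des.node.name=node1 -Des.path.data=/data/es1
bin/elasticsearch -Des.node.name=node2 -Des.path.data=/data/es2
```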

Also, I'm wondering how many shards is recommended in my case. Having more
shards helps reduce latency by parallelizing the work, but at some point
the overhead of fanning out the requests and collecting the partial results
will take over and latency would get worse. Is there a rule of thumb for a
sweet spot that others have found?

If you don't have a lot of traffic, you could think about configuring
num_shards = total_num_cpus / num_concurrent_queries. But as you said,
there is also some overhead to large numbers of shards so this deserves
testing.
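
As a worked example of that rule of thumb (hypothetical numbers: one 32-core machine handling 2 concurrent queries):

```python
# num_shards = total_num_cpus / num_concurrent_queries
total_num_cpus = 32          # one 32-core machine
num_concurrent_queries = 2   # assumed concurrency at low traffic

num_shards = total_num_cpus // num_concurrent_queries
print(num_shards)  # 16
```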

The volume of updates to the index is relatively small (500K/day), but
bursty. From initial testing, it seems like issuing updates can increase
the latency of searches running on the same machine. Is there a good
way to "isolate" search and updates, either by some setting, or splitting
up the cluster somehow to have dedicated update nodes and dedicated search
nodes? (Not sure how you'd deploy a setup like this, or control where the
search/update calls went.)

For the geo filter, I've tried the optimize_bbox option, and the default
of "memory" seemed to work the best, surprisingly. I haven't tried using
geohash yet, and I can't tell from the docs how one might use it, but maybe
that is inherently faster since it uses indexes?

The bbox optimization is useful if your geo query matches a small portion
of your index. Maybe the issue here is that with such a large radius, you
match most of your documents?
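
For reference, this is how the option is set on the filter (a sketch; the `location` field name, distance, and coordinates are placeholders):

```json
{
  "filtered": {
    "query": { "match_all": {} },
    "filter": {
      "geo_distance": {
        "distance": "50km",
        "optimize_bbox": "memory",
        "location": { "lat": 40.7, "lon": -74.0 }
      }
    }
  }
}
```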

Unfortunately, there are a lot of unique locations in my query stream, so
I don't know if caching this filter will work. (Each cached filter consumes
about 1 bit per document in memory, is that right? So about 1.25MB in my
case. Storing the most frequent 10,000 of these would take up about 12.5GB
of ram. So maybe that's doable...)

One thing to beware of is that if you cache a geo filter, Elasticsearch
will need to evaluate it against all documents from your index before
caching it. On the other hand, by default (if the filter is not cached),
the filter is only evaluated on documents that match the query, though
this happens for every query, not just the first one. So unless you have
good reason to think that your geo filter will be reused, I would
recommend against caching it.
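
If you do decide to experiment with caching a few hot locations, the `_cache` flag on the filter controls it explicitly (a sketch; field name and values are placeholders, and geo_distance filters are not cached by default):

```json
{
  "geo_distance": {
    "distance": "50km",
    "location": { "lat": 40.7, "lon": -74.0 },
    "_cache": true
  }
}
```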