Elasticsearch memory requirements for big boards

With vBulletin 3 + Sphinx, we have Sphinx running on the DB server, and it runs beautifully with 16 GB of RAM and SSDs. Server loads barely register. We have 13M posts, and from what I've read from @Slavik we'll need something like 13 GB of RAM just for Elasticsearch? So we'd pretty much need to buy a new dedicated search server with this many posts? If so, how robust should this server be?

With the latest updates, I've been finding that 512 MB per million posts has been working well.
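To put that rule of thumb against the 13M posts from the original question, here is a rough back-of-the-envelope sketch. The function name and the default of 512 MB per million are just illustrative assumptions taken from the figure above, not anything Elasticsearch itself defines, and real usage will vary with content and mappings.

```python
def estimate_es_memory_mb(post_count, mb_per_million=512):
    """Rough memory estimate in MB, assuming a flat MB-per-million-posts rule of thumb."""
    return post_count / 1_000_000 * mb_per_million


# 13M posts, as in the original question
estimate_mb = estimate_es_memory_mb(13_000_000)
print(f"{estimate_mb:.0f} MB (~{estimate_mb / 1024:.1f} GB)")  # 6656 MB (~6.5 GB)
```

So by this estimate a 13M-post board would land around 6.5 GB rather than 13 GB, which is a guideline to start from rather than a guarantee.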

Wow. I also love how on your http://www.tinhte.vn site, you offer Google Custom Search results and XF search results on the same results page, with tabs. There are many odd searches for bass/amp models that Google just does better, and I'd love to offer a similar solution with the user not having to do two separate searches.

For purposes of the original question, you can ignore everything except "Index Size" and "Documents", but this is how ours looks as far as how much memory the indexes take:

The searchable document number is actually 22,145,684 (since our setup has all shards replicated so any ES node can fail without service disruption). We have about 18M posts, but we also made other content types searchable (users, reports, conversations, etc).

So if we weren't replicating ES data to multiple servers, our index size of 22,145,684 searchable documents would be 6.6GB.

For us, we have 13.2GB of index data (again, we have 2 copies of everything) spread across 8 servers (so about 1.6GB per server is used for indexes), and I allocate 4GB of RAM for ES on each server just for good measure (the servers each have 256GB of RAM, so we're not short on RAM, and I'd rather over-allocate than under-allocate).
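The arithmetic behind those numbers can be sketched like this. The function and parameter names are hypothetical, and it assumes the index data is spread evenly across the servers, which a real cluster won't do exactly.

```python
def index_footprint(total_index_gb, copies, servers):
    """Split a replicated index into single-copy size and average size per server.

    Assumes data is spread evenly across servers (a simplification).
    """
    single_copy_gb = total_index_gb / copies
    per_server_gb = total_index_gb / servers
    return single_copy_gb, per_server_gb


# Figures from the post: 13.2GB total, 2 copies of everything, 8 servers
single_copy_gb, per_server_gb = index_footprint(13.2, copies=2, servers=8)
print(f"single copy: {single_copy_gb:.1f}GB, per server: {per_server_gb:.2f}GB")

# RAM allocated to ES per server (4GB in the post) relative to the index data held there
allocated_gb = 4
print(f"allocation is about {allocated_gb / per_server_gb:.1f}x the per-server index size")
```

Plugging in your own total index size and server count gives a quick feel for how much headroom a given allocation leaves.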