1.0.0.RC1 released

Today we are happy to announce the release of elasticsearch 1.0.0.RC1, which is based on Lucene 4.6. This is our first (and hopefully last) release candidate before version 1.0.0 stable. You can download elasticsearch 1.0.0.RC1 here.

In the four years that Elasticsearch has been in development, it has accumulated some cruft: there are some inconsistent APIs and parameters. We are using this release to try to fix that. Our goal is that the user interface should be intuitive — you shouldn’t even need to consult the docs for common requests because the API should be obvious. While we try very hard to maintain backwards compatibility, some of these changes require us to break with the past. To help you migrate your application to 1.0, we have put together a list of all the breaking changes that you should be aware of.

New features and enhancements

An Elasticsearch release wouldn’t be complete without new toys, and this release is no different:

Federated search

The first new toy is the (experimental!) tribe node, which joins multiple clusters and acts as a federated client. Almost all operations are supported: distributed search, suggestions, percolation. You can even index into multiple clusters with the tribe node. Alternatively, you can set a tribe node to not allow any write operations, making it read-only. See the tribe node docs for more information.

Scale and stability

Several of our clients and users are using Elasticsearch at humongous scale, pushing the boundaries of what is possible. Their experiences have helped us to find the breaking points in Elasticsearch, and to improve them. The result is improved stability and scale for all of us. Cluster state processing (e.g. creating indices, mappings, nodes joining and leaving, shard allocation) has been streamlined and takes 5% of the time that it used to. Shard allocation and recovery have been improved and several bugs have been fixed. See #4373, #4342, #4410, #4630, #4502, #4454, #4457, #4588, #4459, #4413 and #4674.

Memory usage and limits

One of the biggest causes of instability in Elasticsearch is fielddata: field values have to be loaded into memory to make aggregations, sorting and scripting perform as fast as they do. Up until now, it was difficult to prevent fielddata from using all available memory and causing an out-of-memory (OOM) error. The new fielddata circuit breaker will throw an exception if you try to exceed the indices.fielddata.breaker.limit, which defaults to 80% of the heap size. You can read more about it in the fielddata circuit breaker docs.
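
To make the idea concrete, here is a minimal Python sketch of a breaker that checks an estimated size against a limit before any memory is used. It is an illustration only, not how Elasticsearch implements it: the names FielddataBreaker, CircuitBreakingError, reserve and release are hypothetical, and the only detail taken from the text above is the 80% default.

```python
class CircuitBreakingError(Exception):
    """Raised when a reservation would exceed the breaker limit."""


class FielddataBreaker:
    """Hypothetical breaker: tracks estimated bytes against a fixed limit."""

    def __init__(self, heap_bytes, limit_percent=80):
        # 80% mirrors the default quoted for indices.fielddata.breaker.limit.
        self.limit_bytes = heap_bytes * limit_percent // 100
        self.used_bytes = 0

    def reserve(self, estimated_bytes):
        """Account for a load before it happens; refuse it if over the limit."""
        new_total = self.used_bytes + estimated_bytes
        if new_total > self.limit_bytes:
            raise CircuitBreakingError(
                f"would use {new_total} bytes, limit is {self.limit_bytes}"
            )
        self.used_bytes = new_total

    def release(self, freed_bytes):
        """Give bytes back when loaded values are evicted or cleared."""
        self.used_bytes = max(0, self.used_bytes - freed_bytes)


breaker = FielddataBreaker(heap_bytes=1000)
breaker.reserve(700)          # fits under the 800-byte limit
try:
    breaker.reserve(200)      # 700 + 200 exceeds 800, so it is rejected
except CircuitBreakingError as err:
    print("rejected:", err)
```

The point of the design is that the failing request gets an exception while the bytes already accounted for stay untouched, so one oversized request cannot take the whole node down.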

We try to play nicely with the JVM to reduce garbage collection and to keep request latency low. This can be difficult to do when large temporary data structures are required to service requests. The new PageCacheRecycler provides us with pages of memory that can be reused for subsequent requests without interfering with the young generation heap. See #4557 and #4647.
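
The recycling idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the actual PageCacheRecycler: the class name PageRecycler, the 16 KB page size and the pool cap are all made up for illustration, and Python has no young generation, so the sketch only shows the reuse pattern.

```python
class PageRecycler:
    """Hypothetical pool that hands out fixed-size pages and takes them back."""

    def __init__(self, page_size=16 * 1024, max_pages=4):
        self.page_size = page_size    # assumed size, for illustration only
        self.max_pages = max_pages    # cap on how many free pages are kept
        self._free = []

    def obtain(self):
        """Return a zeroed page, reusing a released one when available."""
        if self._free:
            page = self._free.pop()
            page[:] = bytes(self.page_size)  # clear old contents in place
            return page
        return bytearray(self.page_size)

    def release(self, page):
        """Return a page to the pool; drop it if the pool is already full."""
        if len(self._free) < self.max_pages:
            self._free.append(page)


recycler = PageRecycler()
first = recycler.obtain()
first[0] = 42                  # a request writes temporary data
recycler.release(first)        # the request is done with the page
second = recycler.obtain()     # the next request gets the same buffer back
print(second is first, second[0])
```

Because the second request receives the buffer released by the first, no new allocation is needed for it.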

Other new features

The simple_query_string query understands a limited search syntax like "fried eggs" +(eggplant | potato) -frittata but, unlike the query_string query, it won’t throw an error if the user gets the syntax wrong.

Geo-distance calculations are now 3.5 - 4.5 times faster (while still being 99.9% accurate), thanks to the sloppy_arc distance calculation.

The token_count field type allows you to index the number of terms in a field automatically.
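
As a rough illustration of two of these features, the Python snippet below builds a simple_query_string search body and a mapping with a token_count sub-field, then prints them as JSON. The field names (title, body, name, word_count), the person type and the choice of the standard analyzer are assumptions made for the example; check the reference docs for the full set of options.

```python
import json

# Search body: the user's raw input goes straight into the query.
# The field names "title" and "body" are hypothetical.
search_body = {
    "query": {
        "simple_query_string": {
            "query": '"fried eggs" +(eggplant | potato) -frittata',
            "fields": ["title", "body"],
        }
    }
}

# Mapping: "name.word_count" would hold the number of terms in "name".
# The type name "person" and the field names are hypothetical.
mapping_body = {
    "person": {
        "properties": {
            "name": {
                "type": "string",
                "fields": {
                    "word_count": {
                        "type": "token_count",
                        "analyzer": "standard",
                    }
                },
            }
        }
    }
}

print(json.dumps(search_body, indent=2))
print(json.dumps(mapping_body, indent=2))
```

The printed JSON can be sent with any HTTP client as the body of a search request and a put-mapping request respectively.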

What’s next

This is a big release, which reflects just how many improvements we wanted to get into version 1.0 of Elasticsearch. The changes include a greatly expanded test suite. But our test suite can never cover everything, so we need your help with verifying this release.

Please download and test out elasticsearch 1.0.0.RC1 and report any problems that you find. This will help us get to version 1.0 stable more quickly!