Elasticsearch 6.0 Now Available on ObjectRocket

By: Steve Croce

Posted on: November 30, 2017

Today we're announcing that Elasticsearch 6.0.0 is now available on the ObjectRocket service. Though this is not as big a launch as last year's release of Elasticsearch 5.0, there are still a number of major additions and improvements you can take advantage of in the new version. You can try it out now in the ObjectRocket app.

As with most ".0" versions, we caution that there are likely some unresolved bugs and it might make sense to wait before moving production to the newest version. However, for those looking to test it out, below are some of the features we're most excited about in the latest version.

Rolling Major Version Upgrades

The velocity of Elasticsearch version releases has truly become a double-edged sword. On one hand, the number of new features and capabilities delivered in each new version is staggering. On the other, major version upgrades are always painful. When you're running Elasticsearch in production, downtime coupled with a potential reindexing is a non-starter for most customers. This is why we see so many users still running Elasticsearch version 1.x, despite the fact that it's been out of support for over a year.

The Elasticsearch 5.6 to 6.0 transition introduces the first major version upgrade that can be accomplished via a rolling restart. Though it is limited to clusters running Elasticsearch 5.6, and all indexes must have been created in Elasticsearch 5.x, it's a great start towards easier upgrades in the future.
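
To make those two conditions concrete, here is a minimal Python sketch of a pre-flight check. The function names and the input format (a cluster version string plus a mapping of index name to the version each index was created in) are our own assumptions for illustration; this is not an Elasticsearch or ObjectRocket API, and you would need to gather those values from your own cluster first.

```python
from typing import Dict, List, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Parse a version string such as '5.6.4' into (5, 6)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def rolling_upgrade_blockers(cluster_version: str,
                             index_created: Dict[str, str]) -> List[str]:
    """Return the reasons a rolling upgrade to 6.0 would not apply.

    Hypothetical helper: cluster_version is the version the cluster runs
    today, index_created maps index name -> version it was created in.
    An empty list means both conditions described above are met.
    """
    blockers = []
    # Condition 1: the cluster itself must already be on 5.6.
    if major_minor(cluster_version) != (5, 6):
        blockers.append(
            f"cluster is on {cluster_version}; move to 5.6 first")
    # Condition 2: every index must have been created in 5.x.
    for name, created in sorted(index_created.items()):
        if major_minor(created)[0] != 5:
            blockers.append(
                f"index '{name}' was created in {created}; reindex it in 5.x")
    return blockers


if __name__ == "__main__":
    indexes = {"logs-2017.11": "5.4.1", "legacy-users": "2.4.6"}
    for problem in rolling_upgrade_blockers("5.6.4", indexes):
        print(problem)
```

In this example the cluster version passes, but the index created in 2.x is reported as a blocker, so it would need to be reindexed before a rolling upgrade is an option.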

Multiple Logstash Pipelines

A really exciting change in Logstash--where we can all collectively say "Finally!"--is that it can now handle multiple pipelines in a single Logstash instance. In previous versions, each Logstash instance would have a single pipeline, so you'd need to run all of your data through a single pipeline and handle different scenarios in the pipeline with conditionals and other constructs. This would cause your pipelines to be overly complicated and would lead to some tough decisions on how to handle the same data going through the same pipeline. The only other option would be to run multiple Logstash instances for different use cases, but then that would cause a whole host of additional complications. Now in Logstash 6.0, you can configure multiple pipelines and have different flows go through the same Logstash process without having to deal with conditional hell.
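
Multiple pipelines are declared in Logstash's pipelines.yml settings file, one entry per pipeline. As a rough sketch of what that looks like, the Python snippet below renders such a file from a list of pipeline definitions. The pipeline IDs and config paths are made-up examples, and render_pipelines_yml is a hypothetical helper of ours, not part of Logstash; the setting names pipeline.id, path.config and pipeline.workers are the ones Logstash uses.

```python
from typing import Dict, List


def render_pipelines_yml(pipelines: List[Dict[str, object]]) -> str:
    """Render a list of pipeline settings as pipelines.yml text.

    Hypothetical helper: each dict becomes one YAML list item, with
    string values quoted and other values (e.g. integers) left bare.
    """
    lines = []
    for pipeline in pipelines:
        first = True
        for key, value in pipeline.items():
            # The first setting of each pipeline starts a new list item.
            prefix = "- " if first else "  "
            rendered = f'"{value}"' if isinstance(value, str) else str(value)
            lines.append(f"{prefix}{key}: {rendered}")
            first = False
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only: two independent flows in one Logstash process.
    pipelines = [
        {"pipeline.id": "web-logs",
         "path.config": "/etc/logstash/conf.d/web-logs.conf"},
        {"pipeline.id": "app-metrics",
         "path.config": "/etc/logstash/conf.d/app-metrics.conf",
         "pipeline.workers": 2},
    ]
    print(render_pipelines_yml(pipelines), end="")
```

Each entry points at its own config file, so the inputs, filters and outputs for one flow no longer have to share a file with, or be fenced off by conditionals from, the others.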

Improved Storage Utilization

An area that the Elastic Stack 6.0 is improving in a few different ways is space utilization. "Out of the box", Elasticsearch can chew up storage space pretty quickly, so these changes should improve the story in a number of common scenarios.

The first, and most significant, change is that sparsely populated fields will not be filled in for every document. What this means is that if you have fields in your indexes that don't exist in every document, Elasticsearch will be more efficient in storing those documents. No setup or special configuration is required to take advantage of this.

The second area that Elastic has improved is the amount of data stored by Metricbeat. The fields stored, data intervals, and more have been tuned to make Metricbeat create a lot less data. It's a small step forward in addressing one of the most common knocks against Elasticsearch as a metrics store. By default, Elasticsearch significantly inflates the amount of data you put in, so a lot of tuning was generally required to keep capacity utilization in check. Now, the defaults are a little more conservative about how much space on disk your data takes up.

The third change, which also addresses the amount of disk space Elasticsearch chews up by default, is that the _all field has been deprecated. This one is fairly controversial, because the _all field is convenient for general queries of your data and can be pretty useful when you're modeling new data. However, the _all field is really expensive from a storage space perspective, so this change will reduce space usage in default scenarios. There's always been a way to disable the _all field or modify what is included in it, and now will be no different. You can create a new field and use the copy_to mapping parameter to determine what goes into it, which gives you functionality similar to the _all field. In the end, it's pretty much six of one, half a dozen of the other with this one, but for users that don't tune their mappings up front, this change should help save space.
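
As an illustration of the copy_to approach, the Python sketch below builds an index-creation body in which a handful of text fields are copied into a single catch-all field. The field names, the catch-all name all_text, the mapping type name doc and the helper itself are assumptions for the example; pick whatever suits your data, and treat this as a starting point rather than a recommended mapping.

```python
import json
from typing import Dict, List


def mapping_with_catch_all(text_fields: List[str],
                           catch_all: str = "all_text",
                           doc_type: str = "doc") -> Dict:
    """Build a mappings body where each text field is copied to one field.

    Hypothetical helper: every field in text_fields gets a copy_to that
    points at catch_all, which can then be queried much like _all was.
    """
    if catch_all in text_fields:
        raise ValueError("catch-all field name collides with a source field")
    properties = {
        name: {"type": "text", "copy_to": catch_all} for name in text_fields
    }
    # The catch-all itself is a plain text field with no copy_to of its own.
    properties[catch_all] = {"type": "text"}
    return {"mappings": {doc_type: {"properties": properties}}}


if __name__ == "__main__":
    body = mapping_with_catch_all(["title", "description"])
    # This JSON is what you would send when creating the index.
    print(json.dumps(body, indent=2))
```

Only the fields you list are copied, which is where the space savings over the old _all behavior come from: you pay for the catch-all only on the fields you actually want to search that way.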