Elasticsearch for Apache Hadoop 6.0.0-beta1 released

I am excited to announce the release of Elasticsearch for Apache Hadoop (aka ES-Hadoop) 6.0.0-beta1, built against Elasticsearch 6.0.0-beta1.

IMPORTANT: This is a beta release and is intended for testing purposes only. Crazy things might happen when running this code, and indices created with this version will certainly not be compatible with Elasticsearch 6.0.0 GA. For the sake of your own sanity, we do not advise using this version in production.

What’s new?

Spark 2.2.0 and Stable Support for Spark Structured Streaming

Spark 2.2.0 landed on July 11th and we wasted no time in making sure we work on top of it. What’s with all the excitement? Why, Structured Streaming has graduated to GA status! This means that, as of this beta release, we no longer treat our Structured Streaming integration in ES-Hadoop as experimental. Please note that because Structured Streaming was experimental in earlier Spark versions, we will only support our Structured Streaming integration on Spark 2.2.0 and above. Don’t fret though: this doesn’t impact our existing Spark integrations at all.

Support for new Join Fields

The days are numbered for multi-typed indices in Elasticsearch. Users who work with parent-child data need not worry about the future, thanks to the new “join” field type in Elasticsearch. With this beta release, we are rolling out support for reading and writing data with this new field type. We’re excited to hear your feedback on this new feature!
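
For readers who have not yet met the join field, here is a minimal Python sketch of what the mapping and documents look like on the Elasticsearch side. The type name `doc`, the field name `my_join_field`, and the `question`/`answer` relation are illustrative assumptions; nothing here is ES-Hadoop-specific, and connector configuration is not shown.

```python
import json

# Hypothetical mapping: a single type ("doc") with a join field that declares
# "question" as the parent relation and "answer" as its child relation.
mapping = {
    "mappings": {
        "doc": {
            "properties": {
                "my_join_field": {
                    "type": "join",
                    "relations": {"question": "answer"},
                }
            }
        }
    }
}


def parent_doc(text):
    """Build a parent document; it only names its side of the relation."""
    return {"text": text, "my_join_field": {"name": "question"}}


def child_doc(text, parent_id):
    """Build a child document; it names its relation and points at its parent."""
    return {"text": text, "my_join_field": {"name": "answer", "parent": parent_id}}


question = parent_doc("What is a join field?")
answer = child_doc("It models parent-child relations inside one index.", parent_id="1")

print(json.dumps(question, sort_keys=True))
print(json.dumps(answer, sort_keys=True))
```

Keep in mind that Elasticsearch requires a child document to be indexed with a routing value that places it on the same shard as its parent.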

Multiple Mappings and Multiple Index Reads

We took a long hard look at how we handle Elasticsearch mappings in the connector. After that long hard look, we rewrote a healthy chunk of code to fix an unhealthy bunch of problems. In this release you should no longer be bitten by common errors when reading from multiple indices (each with varying field types). The connector will also alert you when the indices you’re reading from have conflicting mappings in them.
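
To make the idea of conflicting mappings concrete, the Python sketch below compares field types across several indices and reports any field that is declared with more than one type. It is an illustration only and not the connector's actual logic; the function name `find_mapping_conflicts` and the sample index names are hypothetical.

```python
from collections import defaultdict


def find_mapping_conflicts(index_mappings):
    """Return {field: {type: [indices]}} for fields declared with more than one type.

    index_mappings maps an index name to a flat {field_name: field_type} dict.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for index, fields in index_mappings.items():
        for field, field_type in fields.items():
            seen[field][field_type].append(index)
    # Keep only the fields that appear with two or more distinct types.
    return {
        field: {t: sorted(indices) for t, indices in types.items()}
        for field, types in seen.items()
        if len(types) > 1
    }


# Hypothetical sample: "status" is a keyword in one index and an integer in the other.
sample = {
    "logs-2017.07": {"status": "keyword", "bytes": "long"},
    "logs-2017.08": {"status": "integer", "bytes": "long"},
}
conflicts = find_mapping_conflicts(sample)
print(conflicts)
```

In this sample only `status` is reported, because `bytes` has the same type in both indices.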

Check Out Our Bug Collection

Nested Java Bean serialization problems, field exclusion problems on Pig and SparkSQL, partial document reads and serialization exceptions, all fixed in this release. Take a look at all of the items that have been spruced up in this release!

Feedback

Now you might be wondering, “Why would I want to try a Beta Release? Aren’t these things normally riddled with bugs?” Well, yeah, sometimes. That’s why we need the help from all of you awesome early adopters!

So, please, DO try this at home! You can download ES-Hadoop 6.0.0-beta1, try it out, find out how it breaks, and let us know what you did on Twitter, GitHub, or in the forum. A crisp high five is waiting for all who participate! Not a huge fan of high fives? There’s always the Elasticsearch Pioneer Program instead!