I work in the Hibernate and Infinispan teams at JBoss, looking after the Lucene integration in the products we support: striving to make it easier to use, to integrate it with well-known APIs and patterns, and finally to make it scale better. I love clean and well-performing code.

I've been an early adopter of cloud deployments, scaling Lucene to a huge number of requests on EC2 using Hibernate Search, and after that I worked with Sourcesense to make JIRA clusterable via Infinispan. I have also been a trainer on Seam and Hibernate courses.

The source code of WildFly was updated to include our latest Hibernate Search 5. Looking forward to the final release of this super popular application server, as you won't have to download the Hibernate Search 5 dependencies separately!

If you prefer to use stackoverflow.com, please use the tag hibernate-search.
And if you have a moment to help other users, please consider subscribing to the hibernate-search tag to help us answer all the questions.

If you are new to Hibernate Search, the best place to start is our getting started guide. And remember: feedback, comments and/or pull-requests are welcome on the website too.

Hibernate Search 4 has been stuck with the quite outdated 3.6.x version of Apache Lucene, while the Lucene 4 series is introducing lots of improvements. Lucene has now reached version 4.10.3 and is considered stable, reliable and significantly more efficient than previous versions; you can now benefit from all these improvements.
Some APIs changed, so you might need to make some adjustments to your code, such as Analyzer class names. Generally though, if you were using the Hibernate Search API, the trickiest Lucene changes are encapsulated and won't affect your code directly.

The major number was increased because the Lucene upgrade is a significant change, and because it forced us to break our API compatibility promise which we apply on minor versions.
Don't assume that this will require Hibernate ORM at version 5 too: it still depends on Hibernate ORM versions 4.3.x (as did Hibernate Search 4.5) and is still compatible with WildFly 8, and we expect it will be compatible with WildFly 9 as well.
It is possible that Hibernate Search 5 will be compatible with ORM version 5; we'll certainly aim for that, but cannot guarantee it.

So if you have an application using Hibernate ORM 4.3.x and Hibernate Search 4.5.x, it should be simple to upgrade as you won't have to upgrade ORM and can focus on changes needed for Search and Lucene only.

The indexing engine has been revisited, providing great performance enhancements and also simplifying configuration: you no longer need to configure a number of backend workers.

Both asynchronous indexing and synchronous indexing have been redesigned.

For the asynchronous indexing backend you now have a per-index index_flush_interval property which you can use to limit the time between your updates committed on the database and the related index commit.
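
As a rough sketch of how such a per-index setting could be expressed when building the configuration programmatically: the snippet below enables asynchronous indexing for the default index and overrides the flush interval for one hypothetical index named books. Only the index_flush_interval name comes from this post; the full property keys (the hibernate.search.<index name> prefix and worker.execution) and the assumption that the value is in milliseconds should be verified against the reference documentation of your version.

```java
import java.util.Properties;

final class AsyncIndexingSettings {

    // Builds configuration properties for asynchronous indexing.
    // ASSUMPTIONS: the key names and the millisecond unit are not taken from
    // this post; "books" is a hypothetical index name used for illustration.
    static Properties build(long defaultFlushMillis, long booksFlushMillis) {
        Properties settings = new Properties();
        // Apply index changes asynchronously for all indexes by default
        settings.setProperty("hibernate.search.default.worker.execution", "async");
        // Upper bound on the delay before pending changes are flushed to the index
        settings.setProperty("hibernate.search.default.index_flush_interval",
                Long.toString(defaultFlushMillis));
        // Per-index override: keep the "books" index closer to the database state
        settings.setProperty("hibernate.search.books.index_flush_interval",
                Long.toString(booksFlushMillis));
        return settings;
    }
}
```

The resulting Properties could then be handed to whatever bootstraps Hibernate in your application; the same keys could equally be written in a properties or persistence.xml file.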

The synchronous backend is now able to merge write requests from multiple parallel transactions, providing the benefits of batched index writes while still applying updates synchronously.
This new model delivers performance similar to what was previously only possible when selecting the NRT backend, without drawbacks such as being incompatible with the Infinispan Directory.
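
To give an intuition of how merging write requests can keep updates synchronous, here is a minimal sketch of the general "group commit" idea. It is not the Hibernate Search implementation: the class, its methods and the in-memory list standing in for the index are all hypothetical. Each transaction submits its change and waits on a future, while a single writer drains everything that is pending and performs one commit for the whole batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration of group commit; not Hibernate Search code.
final class GroupCommitSketch {

    private static final class Request {
        final String change;
        final CompletableFuture<Void> committed = new CompletableFuture<>();
        Request(String change) { this.change = change; }
    }

    private final BlockingQueue<Request> pending = new LinkedBlockingQueue<>();

    // Each inner list represents one index commit and the changes it contained.
    // Only the single writer thread is expected to touch this list.
    final List<List<String>> commits = new ArrayList<>();

    // Called by transaction threads: the returned future completes only
    // after the change was included in a commit, so callers can block on it.
    CompletableFuture<Void> submit(String change) {
        Request request = new Request(change);
        pending.add(request);
        return request.committed;
    }

    // Called by the single writer thread: applies everything that queued up
    // since the last call with ONE commit, then releases all waiting callers.
    int flushPending() {
        List<Request> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (batch.isEmpty()) {
            return 0;
        }
        List<String> applied = new ArrayList<>();
        for (Request request : batch) {
            applied.add(request.change);
        }
        commits.add(applied); // stands in for a single, expensive index commit
        for (Request request : batch) {
            request.committed.complete(null);
        }
        return batch.size();
    }
}
```

With three changes submitted before the writer runs, a single flushPending() call completes all three futures with one commit: the cost of the commit is shared, yet no caller is released before its change is durable.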

The project code and build have been refactored to produce nice OSGi compatible libraries. We run integration tests with Apache Karaf, so our artefacts should be safe to consume via JBoss FUSE. The Lucene jars are still a bit troublesome, but if you have any problem with them please let us know: we might be able to find a solution.

For those developers defining custom domain types, it's now possible to automatically bind a given Java type to a FieldBridge. You won't have to copy/paste those @FieldBridge annotations all over your model.
This feature is explained in the BridgeProvider section of the documentation. You could use it for example to contribute the missing converters for Java 8 Date/Time types.
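
The idea of binding a type to its conversion logic once, instead of annotating every property, can be sketched in plain Java. Note that this is only a conceptual illustration with hypothetical names: it does not use the actual BridgeProvider contract, for which the documentation remains the reference. It registers a converter for the Java 8 LocalDate type and looks it up by the runtime type of the value.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry; NOT the Hibernate Search BridgeProvider SPI.
final class TypeBridgeRegistrySketch {

    private final Map<Class<?>, Function<Object, String>> converters = new HashMap<>();

    // Declares once how every value of the given type becomes an indexable string.
    <T> void register(Class<T> type, Function<T, String> converter) {
        converters.put(type, value -> converter.apply(type.cast(value)));
    }

    // Picks the converter from the value's exact runtime type.
    String convert(Object value) {
        Function<Object, String> converter = converters.get(value.getClass());
        if (converter == null) {
            throw new IllegalArgumentException(
                    "No converter registered for " + value.getClass().getName());
        }
        return converter.apply(value);
    }

    // Example setup contributing a converter for a Java 8 Date/Time type.
    static TypeBridgeRegistrySketch withJavaTimeSupport() {
        TypeBridgeRegistrySketch registry = new TypeBridgeRegistrySketch();
        // LocalDate.toString() produces the ISO-8601 form, e.g. 2015-01-14
        registry.register(LocalDate.class, LocalDate::toString);
        return registry;
    }
}
```

Calling withJavaTimeSupport().convert(LocalDate.of(2015, 1, 14)) returns "2015-01-14", while a value of an unregistered type is rejected with an IllegalArgumentException.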

Using the new MoreLikeThis query capabilities you don't have to target specific fields but can provide an instance of an indexed object. This model is also known as query by example and will trigger a similarity query matching all fields (or a subset of your choice).
A full example can be seen in this previous blog post.

Until this version Hibernate Search depended on Apache Lucene for most of the work, and also on Lucene's sister project Apache Solr to provide a richer set of analyzers. Since the Lucene project incorporated this functionality from Solr, there is no longer any need to depend on Solr artifacts.

With requirements such as OSGi support, other projects like CapeDwarf and Infinispan integrating Hibernate Search (but excluding dependencies on Hibernate ORM), and the advanced needs of the Hibernate OGM project, our integration API and modularity were extensively stretched and tested. This resulted in lots of improvements which you might not directly notice, but which will make it much easier to avoid dependency conflicts with any other library you might use, or to integrate nicely in your favorite container or framework.

One example is the new structure of the modules we provide for easy WildFly integration: highly encapsulated, and with significantly fewer dependencies than previous versions.

For example, the JGroups backend can use a JGroups version of your choice. It doesn't need to match the JGroups version of Infinispan, even if Hibernate Search is using Infinispan as well (which depends on its own JGroups version). JGroups wouldn't even be exposed to your application, so in theory you could be using a third, different version of the clustering library in your app directly.
In practice you would probably want to keep the versions aligned, but if you prefer otherwise it won't be a problem.

Any numeric property, including Calendar and Date types, is now by default indexed as a NumericField.
A NumericField is more efficient for range queries, so we think this is what you should be using in most cases. Of course it's still possible to explicitly annotate the property to revert to the old behaviour: this is just a change in the defaults.

Please keep this change in mind when running queries, as you'll now need to query these as a NumericField. If you use our Query builder DSL this is going to be correct transparently, but if you use the Lucene native APIs to create queries the results won't match and you won't get any kind of warning.

Technically it is possible that this latest version of Lucene could read your existing indexes, but with such a large version increase of Lucene's code, and considering the numeric mapping changes and the many changes in the Analyzers over time, we highly recommend you replace your old indexes and use the MassIndexer to trigger a fresh rebuild.

We have several interesting plans ahead, but our priority is defined by feedback. Please let us know what you'd need; and even if it works great for you, it's nice for us to hear about it and about what you do with it.
You can get in touch with us through any of these media; the forums especially should be a good starting point.

This is what we hope to work on in the near future:

Dynamically defined models (not strictly bound to annotated classes)

Alternatives to embedded Lucene backends: Apache Solr or ElasticSearch seem to be good candidates for this

Take better advantage of the new Lucene 4 capabilities (faceting, query-time join, etc.). Do you have any suggestions?

This list is long, and I could easily expand it. We could really use your help, especially as our small core team is not familiar with many of the other technologies mentioned: even if you don't feel like coding but are in the mood for bleeding edge testing, that would be great.

If you don't specify any FieldBridge for your numeric attributes, or Date or Calendar fields, Hibernate Search will now encode them by default using Lucene's specialized NumericField format. This format has long been available in both Lucene and Hibernate Search, but so far you had to explicitly enable it, as Hibernate Search would by default stick to the backwards compatible format of transforming these types into keywords (strings).
The NumericField format is much more efficient for range queries, which we expect to be common for these types.
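
As a loose intuition for why string keywords and numeric values are not interchangeable in range queries, the following stand-alone sketch compares a range evaluated over plain decimal strings (ordered lexicographically, as string terms are) with the same range evaluated numerically. It is a generic illustration with hypothetical names and does not reproduce how Hibernate Search or Lucene encode any specific type; in particular it says nothing about the padded or date-specific string formats a bridge may use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not Hibernate Search or Lucene code.
final class RangeOrderingSketch {

    // Range check over unpadded decimal strings: comparison is lexicographic.
    static List<String> keywordRange(List<Integer> values, int from, int to) {
        String lower = Integer.toString(from);
        String upper = Integer.toString(to);
        List<String> hits = new ArrayList<>();
        for (Integer value : values) {
            String term = value.toString();
            if (term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0) {
                hits.add(term);
            }
        }
        return hits;
    }

    // Range check over the numeric values themselves.
    static List<Integer> numericRange(List<Integer> values, int from, int to) {
        List<Integer> hits = new ArrayList<>();
        for (Integer value : values) {
            if (value >= from && value <= to) {
                hits.add(value);
            }
        }
        return hits;
    }
}
```

For the values 1, 2, 9, 10 and 25 and the range 1 to 10, keywordRange returns only "1" and "10" (because "2" and "9" sort after "10" as strings), while numericRange returns 1, 2, 9 and 10. A query has to be built for the encoding actually used by the field, which is why queries created directly with the Lucene API deserve a second look after this change.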

Remember that - unless you had explicit field configuration - this implies that you might need to fix how your queries are created. By using the Hibernate Search Query DSL you will get an exception to warn you if you try to force it using the wrong type. If you're using the Lucene API directly, make sure to check you're getting the results you expect.

I have no other major changes to report regarding our public API; however, power users and other frameworks integrating with Hibernate Search might notice a significant reorganization of our SPI.
We've documented all relevant changes in the Migration Guide.

The final release of version 5 is coming very soon, so please make sure you test this quickly.
Any comment is welcome on the mailing list or via IRC.

If you are in London on the evening of the 14th of January, I would love to see you at our monthly JBUG event.

We'll start the evening with an introduction to Hibernate Search, including basics of concepts from Apache Lucene, and then discuss the novelties you'll find in Hibernate Search 5.0, before discussing the more advanced features.
That should be interesting both for those of you already familiar with the technology, and for those who have never heard of it and are now wondering if and how it could help you.

The event will be at Skills Matter, organized by our partner C2B2, and after the demo we'll have plenty of time for pizza, beers and face to face discussions about all things Hibernate.

Normally we would not backport new features to maintenance releases, but some of the great performance improvements of the new indexing engine of the upcoming Hibernate Search 5, such as HSEARCH-1693, HSEARCH-1699 and HSEARCH-1725, seem to be very desirable. These do not introduce any API or functionality change, so we could backport them at virtually no risk.

This means you can now easily upgrade your Hibernate Search 4.4.x and 4.5.x applications without necessarily needing to migrate to Hibernate Search 5. Remember though: there are a lot more improvements coming in 5! If you want all the nice improvements you'll have to eventually migrate.

These new backends were created because performance testing of the Infinispan indexing engine highlighted some problems in our backend when using an Infinispan Directory. So while these patches provide an impressive boost on their own, they will be far more effective when paired up with the latest Infinispan 7, as some changes were applied to Infinispan too. But we're not upgrading these maintenance branches of Hibernate Search 4 to Infinispan 7, as that would break all of your configurations. To benefit from the updated Infinispan integration you'll need Hibernate Search 5.
Another great reason to move to Hibernate Search 5 is of course the update to the latest Apache Lucene. So the updates announced today should be a nice and easy performance boost, but if you are serious about needing the highest speed please keep testing version 5.

While these impressive improvements were created after specific diagnostics work on Infinispan, the benefits are not Infinispan specific: you should be able to experience a significant throughput boost with any storage. The exception is if you were using the NRT backend: I don't expect you to see any benefit in that case. Although if you were forced to use NRT because of throughput needs but didn't like the tradeoffs, you might no longer need to use NRT as the new non-NRT backend could be nearly as efficient.