Hibernate OGM: Lowering the Barrier of Entry to NoSQL

NoSQL tools and big data in general will revolutionize what we do in our applications.

Earlier this month, the public got their first look at a brand
new, NoSQL-oriented Hibernate project: Hibernate
Object/Grid Mapper (OGM). This project aims to provide a JPA
engine that stores data in NoSQL stores, and the first Alpha release
was used to power the 2011 JBoss World Keynote. JAXenter spoke to
JBoss platform architect and Hibernate Search and Hibernate
Validator founder, Emmanuel Bernard, to find out more about this
new project.

JAXenter: The first public alpha release of the new
Hibernate project, Hibernate OGM, has just been announced. What is
the OGM project?

Emmanuel Bernard: Hibernate OGM stands for
Object / Grid Mapper. The idea is to offer the full JPA API and
semantics (or their Hibernate native counterparts). But instead of
storing data in a relational database like Hibernate Core does, we
store it in a NoSQL datastore. We also hope to offer a subset of
JP-QL. We are talking about a full-fledged JPA engine: same API,
same semantics (cascading, associations, etc.), same query language.

The project started as a solution to offer JPA APIs on top of
Infinispan (JBoss’s data grid). While working on the project, we
realized that the choices we made:
* could be generalized to other key/value stores
* fit nicely with other NoSQL families, in particular
document-oriented ones

So we have decided to make Hibernate OGM datastore-agnostic
(which in retrospect fits nicely into Hibernate Core’s
philosophy).

In Hibernate OGM, we try very hard to make the underlying data
storage independent of your application. The side effect is that
the same data will be readable by other platforms like Ruby, .NET
or whatever the next big thing ends up being. It’s quite critical
for your data to be portable. Data easily outlives applications
tenfold.
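
To make the portability point concrete, here is a minimal sketch of what it can mean to store an entity as a plain tuple of column names and basic values instead of an opaque serialized Java object. The names used here (`User`, `TupleStoreSketch`, the `table:id` key format) are assumptions made for illustration; they do not describe Hibernate OGM's actual storage layout, and a `HashMap` stands in for the datastore.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical domain class; in a real application this would be a JPA entity.
class User {
    final String id;
    final String name;
    final int age;

    User(String id, String name, int age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }
}

// Illustrative only: a HashMap stands in for a key/value datastore.
class TupleStoreSketch {
    private final Map<String, Map<String, Object>> store = new HashMap<>();

    // Build the key from the table name and the identifier (assumed format).
    static String key(String table, String id) {
        return table + ":" + id;
    }

    // Dehydrate the object into a tuple of basic types (strings, numbers)
    // that any platform can read, rather than a serialized Java object.
    void save(User user) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("id", user.id);
        tuple.put("name", user.name);
        tuple.put("age", user.age);
        store.put(key("User", user.id), tuple);
    }

    // Rehydrate a User from the stored tuple, or return null if absent.
    User load(String id) {
        Map<String, Object> tuple = store.get(key("User", id));
        if (tuple == null) {
            return null;
        }
        return new User((String) tuple.get("id"),
                        (String) tuple.get("name"),
                        (Integer) tuple.get("age"));
    }

    // Expose the raw tuple, as a non-Java client of the datastore would see it.
    Map<String, Object> rawTuple(String id) {
        return store.get(key("User", id));
    }
}
```

Because the stored value contains only strings and numbers, a client written for another platform could read the same entry without knowing anything about the `User` class.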

JAXenter: How does Hibernate OGM aim to lower
the barrier of entry for NoSQL?

Emmanuel: When you choose one NoSQL product over
another (or over a relational database), many things are at
stake:
* the programmatic API
* the (non-)query engine
* the transaction / throughput / availability / partitioning
semantics
* the tools to facilitate ops and support

The idea behind Hibernate OGM is to let you reuse the same
programmatic API, the same object lifecycle semantics and (to a
certain extent) the same query engine. There is a difference between
a developer who can use APIs that are already familiar and well
integrated with their programming model, and one who has to use an
entirely different set of APIs and programming model. You will still
need to focus on which engine best fits your data storage needs, but
assuming your application is domain model driven, Hibernate OGM
will be a useful tool to limit leakage between your app and your
datastore. You will also be able to choose your NoSQL datastore
later in the application lifecycle just like Hibernate Core
dialects let you decide which relational engine to use later in
your development cycle. To be honest, the abstraction won’t be as
complete for Hibernate OGM: NoSQL engines are vastly different.
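
The comparison with dialects can be illustrated with a small sketch. The `DatastoreDialect` interface and the two in-memory implementations below are hypothetical names invented for this example, not Hibernate OGM's actual contract. The only point is that the application-side class, `UserRepository`, never names a specific datastore, so the choice can be deferred.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical contract between the mapping engine and a datastore.
interface DatastoreDialect {
    void putTuple(String key, Map<String, Object> tuple);
    Map<String, Object> getTuple(String key);
    void removeTuple(String key);
}

// Stand-in for a key/value store: one flat entry per key.
class KeyValueDialect implements DatastoreDialect {
    private final Map<String, Map<String, Object>> entries = new HashMap<>();

    public void putTuple(String key, Map<String, Object> tuple) {
        entries.put(key, new HashMap<>(tuple));
    }
    public Map<String, Object> getTuple(String key) {
        return entries.get(key);
    }
    public void removeTuple(String key) {
        entries.remove(key);
    }
}

// Stand-in for a document store: tuples are grouped per collection.
class DocumentDialect implements DatastoreDialect {
    private final Map<String, Map<String, Map<String, Object>>> collections = new TreeMap<>();

    // Assumes keys of the form "collection:id"; falls back to "default".
    private static String collectionOf(String key) {
        int separator = key.indexOf(':');
        return separator < 0 ? "default" : key.substring(0, separator);
    }
    public void putTuple(String key, Map<String, Object> tuple) {
        collections.computeIfAbsent(collectionOf(key), c -> new HashMap<>())
                   .put(key, new HashMap<>(tuple));
    }
    public Map<String, Object> getTuple(String key) {
        Map<String, Map<String, Object>> collection = collections.get(collectionOf(key));
        return collection == null ? null : collection.get(key);
    }
    public void removeTuple(String key) {
        Map<String, Map<String, Object>> collection = collections.get(collectionOf(key));
        if (collection != null) {
            collection.remove(key);
        }
    }
}

// Application-side code: depends only on the contract, not on a datastore.
class UserRepository {
    private final DatastoreDialect dialect;

    UserRepository(DatastoreDialect dialect) {
        this.dialect = dialect;
    }
    void save(String id, String name) {
        Map<String, Object> tuple = new HashMap<>();
        tuple.put("name", name);
        dialect.putTuple("User:" + id, tuple);
    }
    String findName(String id) {
        Map<String, Object> tuple = dialect.getTuple("User:" + id);
        return tuple == null ? null : (String) tuple.get("name");
    }
    void delete(String id) {
        dialect.removeTuple("User:" + id);
    }
}
```

Swapping `new UserRepository(new KeyValueDialect())` for `new UserRepository(new DocumentDialect())` changes where the data lives without touching the repository code. Real engines differ far more than these two stand-ins, which is the caveat raised above.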

Fundamentally, Hibernate OGM is a tool that reduces the barrier
of entry to NoSQL solutions. We think, like many, that NoSQL tools
and big data in general will revolutionize what we do in our
applications. We want people to try and explore new data patterns
without having to invest a lot of time and money.

JAXenter: What technologies are at work in
Hibernate OGM?

Emmanuel: That’s one of the beauties of the
project. Instead of writing a JPA engine from scratch, we reuse
most of the mature Hibernate Core engine. We simply replace the two
components that interact with the datastore (Persisters and
Loaders). To be honest, I did not believe Hibernate
Core was flexible enough for this but I was wrong and the engine
fits quite well.

On the query side, Hibernate OGM uses Lucene and Hibernate
Search to build indexes and keep them up to date. The query engine
converts JP-QL queries into one or several full-text queries.
That’s our first step and will let us do JP-QL queries with
restrictions (where clauses) and simple *-to-one joins. Once this
is stabilized, we will reuse Teiid, a database federation engine
that queries several datasources as if they were one, doing the
aggregation and join work itself if needed. The Teiid
team is working on an embedded version of their query engine as we
speak.
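
As a rough illustration of that first step, the sketch below answers a single equality restriction (the equivalent of a `where name = :name` clause) by consulting an index and then loading the matching entries from a key/value store. It uses a plain in-memory inverted index written for this example in place of Lucene and Hibernate Search, and every class and method name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class IndexedStoreSketch {
    // Primary storage: key -> tuple.
    private final Map<String, Map<String, Object>> store = new HashMap<>();
    // Inverted index: "field=value" term -> keys of the tuples containing it.
    private final Map<String, Set<String>> index = new HashMap<>();

    private static String term(String field, Object value) {
        return field + "=" + value;
    }

    // Store a tuple and keep the index up to date with it.
    void put(String key, Map<String, Object> tuple) {
        remove(key); // drop stale index entries from a previous version
        Map<String, Object> copy = new HashMap<>(tuple);
        store.put(key, copy);
        for (Map.Entry<String, Object> entry : copy.entrySet()) {
            index.computeIfAbsent(term(entry.getKey(), entry.getValue()),
                                  t -> new TreeSet<>())
                 .add(key);
        }
    }

    // Remove a tuple and its index entries; does nothing if the key is unknown.
    void remove(String key) {
        Map<String, Object> old = store.remove(key);
        if (old == null) {
            return;
        }
        for (Map.Entry<String, Object> entry : old.entrySet()) {
            Set<String> keys = index.get(term(entry.getKey(), entry.getValue()));
            if (keys != null) {
                keys.remove(key);
            }
        }
    }

    // Resolve an equality restriction: index lookup first, then key lookups.
    List<Map<String, Object>> findByEquals(String field, Object value) {
        Set<String> keys = index.get(term(field, value));
        if (keys == null) {
            return Collections.emptyList();
        }
        List<Map<String, Object>> result = new ArrayList<>();
        for (String key : keys) {
            result.add(store.get(key));
        }
        return result;
    }
}
```

The datastore itself is only ever asked for entries by key; the restriction is resolved entirely by the index, which is why the index has to be updated on every write.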

Finally, our initial NoSQL engine is Infinispan, which is the
evolution of JBoss Cache. As you can see, we reuse many mature
projects and add the additional layers on top when needed. The idea
is to get a very rich feature set very quickly.

JAXenter: What do you have planned for future
releases of Hibernate OGM?

Emmanuel: We have just released Alpha 2, so
don’t move your production data to Hibernate OGM just yet!

We do have fairly mature CRUD support (Create, Read, Update,
Delete) even though we are still exploring the best way to store
data (especially associations). JP-QL support for simple queries is
what we are working on at the moment and we hope to get it out as
soon as possible. Once that is out, we will test Hibernate OGM for
performance and stability improvements. From there we will be ready
for a GA release. Support for more complex JP-QL queries will come
next with Teiid’s integration.

In parallel to this feature-oriented roadmap, we are exploring
various NoSQL engines. We do support Infinispan but we have also
worked on the abstraction layer. It is already quite advanced but
needs more improvements. The EhCache team is working on a prototype
which will help us refine the contract some more. We also have
community proposals to work on MongoDB, CouchDB and Redis
dialects. Once we are happy with the abstraction contract, we will
reach out to them for contributions.

Hibernate OGM is still very young and in a state where every
contribution is shaping the project. This is extremely exciting! If
you are interested, please contact us; we have many things to
do!

After graduating from Supelec (a French “Grande Ecole”), Emmanuel spent a few years in the retail industry as a developer and architect, where he became involved in the ORM space. He joined the Hibernate team in 2003 and is now platform architect at JBoss, by Red Hat. Emmanuel is the lead developer of Hibernate Annotations and Hibernate EntityManager, two key projects on top of Hibernate Core implementing the Java Persistence™ specification. He also founded and leads Hibernate Search and Hibernate Validator. Emmanuel is a member of the JPA 2.0 expert group and the spec lead of JSR 303: Bean Validation. He is a regular speaker at various conferences and JUGs, including JavaOne, JBoss World and Devoxx, and the co-author of Hibernate Search in Action, published by Manning.