An in-memory data grid is a distributed key-value store that keeps its data in memory and lets you cache data across a cluster of nodes. Do not confuse this solution with an in-memory or NoSQL database. In most cases it is used for performance reasons: all data is stored in RAM, not on disk as in traditional databases. I first came across in-memory data grids when we were considering moving to Oracle Coherence, and the solution really made me curious. Oracle Coherence is a paid solution, but there are also open-source alternatives, among which the most interesting seem to be Apache Ignite and Hazelcast. Today, I’m going to show you how to use Hazelcast for caching data stored in a MySQL database accessed by Spring Data DAO objects. Here’s the figure illustrating the architecture of the presented solution.

Implementation

Here's how to implement our solution.

Starting Docker Containers

We use three Docker containers: the first with a MySQL database, the second with a Hazelcast instance, and the third with Hazelcast Management Center, a UI dashboard for monitoring Hazelcast cluster instances.

If we want the Hazelcast instance to connect to Hazelcast Management Center, we need to place a custom hazelcast.xml in the /opt/hazelcast directory inside our Docker container. This can be done in two ways: by extending the Hazelcast base image, or by copying the file into an existing Hazelcast container and restarting it.

Inside the person-service module, we declared some additional dependencies on the Hazelcast artifacts and Spring Data JPA. I had to override the hibernate-core version managed by Spring Boot 1.5.3.RELEASE because Hazelcast didn’t work properly with 5.0.12.Final. Hazelcast needs hibernate-core in the 5.0.9.Final version. Otherwise, an exception occurs when starting the application.

Hibernate Cache Configuration

You can configure the Hibernate cache in several different ways, but for me, the most suitable place was application.yml. Here’s the YAML configuration file fragment. I enabled the Hibernate L2 cache and set the Hazelcast native client address, the credentials, and the cache factory class HazelcastCacheRegionFactory. We can also set HazelcastLocalCacheRegionFactory. The difference between them is in performance: the local factory is faster since its operations are handled locally rather than as distributed calls. If you use HazelcastCacheRegionFactory, on the other hand, you can see your maps in Management Center.
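The settings described above can be sketched as plain Hibernate properties. This is only an illustration, not the exact configuration from the project: the class and method names are hypothetical, the address and the dev / dev-pass credentials are assumed values (the defaults I would expect from a Hazelcast 3.x container), and enabling the query cache is my assumption based on the cached query used later. In a Spring Boot application.yml the same keys would typically sit under spring.jpa.properties.

```java
import java.util.Properties;

class HibernateCacheSettings {

    // Collects the Hibernate properties for an L2 cache backed by a Hazelcast native client.
    static Properties l2CacheProperties(String address, String group, String password) {
        Properties p = new Properties();
        // Turn on the second-level cache.
        p.setProperty("hibernate.cache.use_second_level_cache", "true");
        // Assumption: the query cache is enabled too, since a cached query is used later on.
        p.setProperty("hibernate.cache.use_query_cache", "true");
        // Distributed region factory; HazelcastLocalCacheRegionFactory is the local alternative.
        p.setProperty("hibernate.cache.region.factory_class",
                "com.hazelcast.hibernate.HazelcastCacheRegionFactory");
        // Connect to an external Hazelcast instance as a native client.
        p.setProperty("hibernate.cache.hazelcast.use_native_client", "true");
        p.setProperty("hibernate.cache.hazelcast.native_client_address", address);
        p.setProperty("hibernate.cache.hazelcast.native_client_group", group);
        p.setProperty("hibernate.cache.hazelcast.native_client_password", password);
        return p;
    }

    public static void main(String[] args) {
        // Assumed values: Docker host address and Hazelcast 3.x default credentials.
        Properties p = l2CacheProperties("192.168.99.100:5701", "dev", "dev-pass");
        p.stringPropertyNames().stream().sorted()
                .forEach(k -> System.out.println(k + "=" + p.getProperty(k)));
    }
}
```

Switching the factory class to com.hazelcast.hibernate.HazelcastLocalCacheRegionFactory is the only change needed to try the local variant.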

Testing

Let’s insert some more data into the table. You can use my AddPersonRepositoryTest for that. It will insert 1M rows into the person table. Finally, we can call the endpoint http://localhost:2222/persons/{id} twice with the same ID. For me, it looks like below: 22 ms for the first call and 3 ms for the next call, which is read from the L2 cache. The entity can be cached only by the primary key. If you call http://localhost:2222/persons/pesel/{pesel}, the entity will always be looked up in the database, bypassing the L2 cache.
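To make the primary-key limitation concrete, here is a toy model in plain Java. It is not how Hibernate or Hazelcast are implemented. It only mimics the behavior described above with two hash maps and a counter of simulated database hits. All names in it (EntityCacheSketch, Person, databaseHits) and the sample PESEL value are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class EntityCacheSketch {

    // Minimal stand-in for the Person entity; id plays the role of the primary key.
    static final class Person {
        final long id;
        final String pesel;
        Person(long id, String pesel) { this.id = id; this.pesel = pesel; }
    }

    private final Map<Long, Person> database = new HashMap<>();    // stands in for the MySQL table
    private final Map<Long, Person> entityCache = new HashMap<>(); // stands in for the L2 entity region
    int databaseHits = 0;                                          // counts simulated SQL queries

    void insert(Person p) { database.put(p.id, p); }

    // Lookup by primary key: answered from the cache after the first call.
    Optional<Person> findById(long id) {
        Person cached = entityCache.get(id);
        if (cached != null) return Optional.of(cached);
        databaseHits++;
        Person loaded = database.get(id);
        if (loaded != null) entityCache.put(id, loaded);
        return Optional.ofNullable(loaded);
    }

    // Lookup by another column: the cache is keyed by id, so it cannot answer
    // this question and every call goes to the database.
    Optional<Person> findByPesel(String pesel) {
        databaseHits++;
        return database.values().stream().filter(p -> p.pesel.equals(pesel)).findFirst();
    }

    public static void main(String[] args) {
        EntityCacheSketch repo = new EntityCacheSketch();
        repo.insert(new Person(1L, "80010112345")); // hypothetical PESEL
        repo.findById(1L);
        repo.findById(1L);
        System.out.println("DB hits after two findById calls: " + repo.databaseHits);    // 1
        repo.findByPesel("80010112345");
        repo.findByPesel("80010112345");
        System.out.println("DB hits after two findByPesel calls: " + repo.databaseHits); // 3
    }
}
```

The second findById call costs nothing, while each findByPesel call adds a database hit, which matches the difference between the two endpoints.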

To change that, the findByPesel query itself has to be cached. Now, we should try to find the person by PESEL number by calling the endpoint http://localhost:2222/persons/pesel/{pesel}. The cached query is stored as a separate map, as you see in the picture below.
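The interplay between the entity map and the query map can be sketched with another toy model. It is a simplification under my own assumptions, not Hibernate's actual code: as far as I understand the query cache, it keeps the identifiers of the matching entities per query and parameter set, and that is what the queryCache map imitates here. All names and the sample PESEL value are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class QueryCacheSketch {

    private final Map<Long, String> database = new HashMap<>();         // id -> pesel, stands in for the table
    private final Map<Long, String> entityCache = new HashMap<>();      // entity region, keyed by primary key
    private final Map<String, List<Long>> queryCache = new HashMap<>(); // query region: query key -> matching ids
    int databaseHits = 0;                                               // counts simulated SQL queries

    void insert(long id, String pesel) { database.put(id, pesel); }

    // Lookup by primary key, served from the entity region when possible.
    String findById(long id) {
        String cached = entityCache.get(id);
        if (cached != null) return cached;
        databaseHits++;
        String pesel = database.get(id);
        if (pesel != null) entityCache.put(id, pesel);
        return pesel;
    }

    // Cached query: the first call always runs against the database, even if the
    // entity is already in the entity region. Only later calls hit the query region.
    List<Long> findByPesel(String pesel) {
        String key = "findByPesel:" + pesel;
        List<Long> ids = queryCache.get(key);
        if (ids == null) {
            databaseHits++;
            ids = new ArrayList<>();
            for (Map.Entry<Long, String> row : database.entrySet()) {
                if (row.getValue().equals(pesel)) {
                    ids.add(row.getKey());
                    entityCache.put(row.getKey(), row.getValue());
                }
            }
            queryCache.put(key, ids);
        }
        return ids;
    }

    public static void main(String[] args) {
        QueryCacheSketch repo = new QueryCacheSketch();
        repo.insert(1L, "80010112345"); // hypothetical PESEL
        repo.findById(1L);
        System.out.println("DB hits after findById: " + repo.databaseHits);           // 1
        repo.findByPesel("80010112345");
        System.out.println("DB hits after first findByPesel: " + repo.databaseHits);  // 2
        repo.findByPesel("80010112345");
        System.out.println("DB hits after second findByPesel: " + repo.databaseHits); // 2
    }
}
```

The first findByPesel call still reaches the database although the entity was already cached by ID, and only the repeated call is served from the query map.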

Clustering

Clustering is the key functionality of the Hazelcast in-memory data grid. In the previous sections, we relied on a single Hazelcast instance. Let’s begin by running a second Hazelcast container exposed on a different port.

docker run -d --name hazelcast2 -p 5702:5701 hazelcast/hazelcast

Now, we should make one change in the hazelcast.xml configuration file. Because the data grid runs inside a Docker container, the public address has to be set. For the first container, it is 192.168.99.100:5701, and for the second, 192.168.99.100:5702, because it is exposed on port 5702.

Conclusion

Caching and clustering with Hazelcast is simple and fast. We can cache JPA entities and queries. Monitoring is provided by the Hazelcast Management Center dashboard. One problem for me is that I’m able to cache entities only by primary key. If I would like to find an entity by another index, like the PESEL number, I have to cache the findByPesel query. Even if the entity was cached before by ID, the query will not find it in the cache but will run SQL against the database. Only the next call of that query is served from the cache. I’ll show you a smart solution for that problem in my next article on in-memory data grids with Hazelcast.