Enhance Collection Performance with this Treasure Trove

Ever use the Collections API? How can you say no?! The Collections framework is probably one of the best API sets available to a Java programmer. Since many parts of our applications use a HashMap, ArrayList, or LinkedList at some point, enhancing the performance of these guys can do a lot for us.

Eric D. Friedman wrote a high performance set of collections called Trove. Trove allows you to plug in their versions of certain containers (HashMap, HashSet, LinkedList), and use them just like you did with the standard versions. There are also ways to utilize primitive collections to gain even more performance. Don't you love open source?

In this article, we will discuss:

Using Trove classes

Using a Factory to allow you to switch between Trove and JDK collections

Trove's Primitive collections

Using Trove Classes

Trove is ridiculously simple to work with. You can simply download the code and set up your CLASSPATH to see the lib/trove.jar file.

How is Trove Faster?

The THashMap class is a fair amount faster than java.util.HashMap. How is this so? The standard HashMap holds a Map.Entry object for each of its elements. Trove uses open addressing, and hence gets around this problem with chaining. You can still use Map.entrySet on a Trove Map, but it is a Bad Thing. This will cause Trove to create the Map.Entry objects that it tries to get away from, so if possible, don't use these operations. If you buy into Trove, you can go down a path that couples you tightly to the Trove collections. You can implement Trove specific APIs (forEachEntry, forEachKey, forEachValue, and transformValues) to potentially gain higher performance.

Hash tables have a size, and have to grow when the data set exceeds its current capacity. It has been shown that if the size is a prime number, the performance of the hash table increases. Trove makes sure that its capacity is always a prime number, which is something that the JDK doesn't do.

When I run a benchmark on my system, I get the following results for putting values into a hash table:

How is Trove Smaller?

Another nice side effect of not containing the Map.Entry objects is that the resulting data structures take up a lot less space. This is particularly noticeable with the Set implementations, as they are not backed by Maps like the JDK.