I've got a garbage collection problem relating to java.util.HashMap. After profiling, I've found that the garbage collector (GC) is using around 5% of the processor time. The problem is that I create lots of HashMap.Entry's since I'm always adding to HashMaps then calling clear() to empty the HashMap again, and repeating.

I'm using the HashMaps in the A* path finding code I've been working on. I use HashMaps instead of ArrayLists to achieve constant-time get() and add()/put() operations which has sped up the algorithms, but contributed to GC time. That's because with every put(), a new HashMap.Entry is created.

What should I do to avoid the huge amount of HashMap.Entry object creation? I'm 99% certain that the HashMap.Entry creation is the problem since NetBeans profiler says that I have 248 'live' instances of HashMap.Entry lying around, but I've allocated 58,000 instances over the program's life. Those 248 live instances comprise 6,000 bytes which is 6% of the 'live' bytes.

FastMap are reusable; they maintain an internal pool of Map.Entry objects. When an entry is removed from a map, it is automatically restored to its pool (unless the map is shared in which case the removed entry is candidate for garbage collection as it cannot be safely recycled).

As an alternative I'd like to put forward my own CachingHashMap available as a part of my game library with the source code here (the library is still yet to be completed).

Mine is just a HashMap that caches it's entry nodes when you clear it and remove items. As such if you fill it with a million entries and then remove them all, it still has a million entry nodes stored internally. The advantage is that if you add another million then rather then making new entry nodes it'll re-use all the entry nodes from the last million. This avoids the cost of creating objects (which is where the performance is from) and means that internally it's dereferencing no objects for the GC to clean up (to clarify; it does not hang on to a reference to the items you add, just to it's internal entry nodes).

In terms of CPU performance, using my own very cheap and brief benchmarks I find the FastMap to be marginally faster then then the HashMap but marginally slower then my CachingHashMap.

I'm using the HashMaps in the A* path finding code I've been working on. I use HashMaps instead of ArrayLists to achieve constant-time get() and add()/put() operations which has sped up the algorithms, but contributed to GC time. That's because with every put(), a new HashMap.Entry is created.

My only thought is that there shouldn't be ArrayLists in an A* implementation in the first place. The algorithm requires a priority queue (there's a heap-based PriorityQueue class in java.util that would work), and a hash map can be used in conjunction to speed up contains() checks on the queue.

If you don't expect duplicates in the open list to occur very often then you don't need a contains() check (the duplicates will just order themselves in the queue) and the hash map can be left out.

My only thought is that there shouldn't be ArrayLists in an A* implementation in the first place. The algorithm requires a priority queue (there's a heap-based PriorityQueue class in java.util that would work), and a hash map can be used in conjunction to speed up contains() checks on the queue.

If you don't expect duplicates in the open list to occur very often then you don't need a contains() check (the duplicates will just order themselves in the queue) and the hash map can be left out.

Are you talking about using HashMap's etc. for the open and closed lists in A*? If not, which lists are you talking about?

As an alternative I'd like to put forward my own CachingHashMap available as a part of my game library with the source code here (the library is still yet to be completed).

Mine is just a HashMap that caches it's entry nodes when you clear it and remove items. As such if you fill it with a million entries and then remove them all, it still has a million entry nodes stored internally. The advantage is that if you add another million then rather then making new entry nodes it'll re-use all the entry nodes from the last million. This avoids the cost of creating objects (which is where the performance is from) and means that internally it's dereferencing no objects for the GC to clean up (to clarify; it does not hang on to a reference to the items you add, just to it's internal entry nodes).

In terms of CPU performance, using my own very cheap and brief benchmarks I find the FastMap to be marginally faster then then the HashMap but marginally slower then my CachingHashMap.

Thanks JL235, that will be fantastic if i can use your CachingHashMap implementation. The Javolution FastMap requires me to use the whole 0.4MB package so your single file is much better. The only benefit that Javolution provides is that add() is constant time since the underlying array is never copied when the hashmap outgrows its capacity, but I have no use for that since my map is always re-used so it will always stay big. Thanks Out of interest, what was your reason for making the CachingHashMap? Where did you need it?

My only thought is that there shouldn't be ArrayLists in an A* implementation in the first place. The algorithm requires a priority queue (there's a heap-based PriorityQueue class in java.util that would work), and a hash map can be used in conjunction to speed up contains() checks on the queue.

If you don't expect duplicates in the open list to occur very often then you don't need a contains() check (the duplicates will just order themselves in the queue) and the hash map can be left out.

Hi Jono, thanks for your comments. Yes I use a BinaryHeap priority queue for the openList, but to eliminate use of List.contains() I also use a HashMap for the open and closed lists.

Having never done anything with A*, this may very well be a stupid suggestion, but here goes anyway. Can you get away with using IdentityHashMap? I'm seeing it about 30% faster than HashMap on gets. Obviously this doesn't solve your GC problems, but maybe you could modify JL235's class to use == or base a new one off IdentityHashMap.

Thanks JL235, that will be fantastic if i can use your CachingHashMap implementation. The Javolution FastMap requires me to use the whole 0.4MB package so your single file is much better. The only benefit that Javolution provides is that add() is constant time since the underlying array is never copied when the hashmap outgrows its capacity, but I have no use for that since my map is always re-used so it will always stay big.

If you can predict how big it will be, then you could set an initial capacity to avoid the resizing of the array.

Out of interest, what was your reason for making the CachingHashMap? Where did you need it?

For exactly this problem that your having. I use a hashmap in my collisions grid and found if I had only 1,000 enemies in there, all moving and testing for collisions, then the JVM would be spending about 5% of it's time running the GC in order to clean up the millions of objects passively created. Now it's only spending about 0.5%.

Bear in mind that this is released under a BSD license, and so it does need to be properly credited if you are going to be using it. Otherwise go right ahead and take it.

I'd like to use the Javolution collections and text stuff in some of my projects. I wish the JAR wasn't 400+ kb! The classes aren't easily extracted either.

You can use ProGuard for eliminating unneeded classes in your whole application/libraries, also you can obfuscate them (which will shrink them by some amount too), but it can be set to not obfuscate if not wanted.

You can use ProGuard for eliminating unneeded classes in your whole application/libraries, also you can obfuscate them (which will shrink them by some amount too), but it can be set to not obfuscate if not wanted.

Looking at their source, the classes can't be easily extracted without bringing ~80% of all the classes along for the ride.

Having never done anything with A*, this may very well be a stupid suggestion, but here goes anyway. Can you get away with using IdentityHashMap? I'm seeing it about 30% faster than HashMap on gets. Obviously this doesn't solve your GC problems, but maybe you could modify JL235's class to use == or base a new one off IdentityHashMap.

java-gaming.org is not responsible for the content posted by its members, including references to external websites,
and other references that may or may not have a relation with our primarily
gaming and game production oriented community.
inquiries and complaints can be sent via email to the info‑account of the
company managing the website of java‑gaming.org