Pages

ConcurrentHashMap in Java is introduced as an alternative of Hashtable in Java 1.5 as part of Java concurrency package. Prior to Java 1.5 if you need a Map implementation, which can be safely used in a concurrent and multi-threaded Java program, then, you only have Hashtable or synchronized Map because HashMap is not thread-safe. With ConcurrentHashMap, now you have a better choice; because not only it can be safely used in the concurrent multi-threaded environment but also provides better performance over Hashtable and synchronizedMap. ConcurrentHashMap performs better than earlier two because it only locks a portion of Map, instead of whole Map, which is the case with Hashtable and synchronized Map. CHM allows concurred read operations and the same time maintains integrity by synchronizing write operations. We have seen basics of ConcurrentHashMap on Top 5 Java Concurrent Collections from JDK 5 and 6 and in this Java tutorial, we will learn:

ØHow ConcurrentHashMap works in Java or how it is implemented in Java.

ØWhen to use ConcurrentHashMap in Java

ØConcurrentHashMap examples in Java

ØAnd some important properties of CHM.

How ConcurrentHashMap is implemented in Java

ConcurrentHashMap is introduced as an alternative of Hashtable and provided all functions supported by Hashtable with an additional feature called "concurrency level", which allows ConcurrentHashMap to partition Map. ConcurrentHashMap allows multiple readers to read concurrently without any blocking. This is achieved by partitioning Map into different parts based on concurrency level and locking only a portion of Map during updates. Default concurrency level is 16, and accordingly Map is divided into 16 part and each part is governed with a different lock. This means, 16 thread can operate on Map simultaneously until they are operating on different part of Map. This makes ConcurrentHashMap high performance despite keeping thread-safety intact. Though, it comes with a caveat. Since update operations like put(), remove(),putAll() or clear() is not synchronized, concurrent retrieval may not reflect most recent change on Map.

In case of putAll() or clear(), which operates on whole Map, concurrent read may reflect insertion and removal of only some entries. Another important point to remember is iteration over CHM, Iterator returned by keySet of ConcurrentHashMap are weekly consistent and they only reflect state of ConcurrentHashMap and certain point and may not reflect any recent change. Iterator of ConcurrentHashMap's keySet area also fail-safe and doesn’t throw ConcurrentModificationExceptoin..

Default concurrency level is 16 and can be changed, by providing a number which make sense and work for you while creating ConcurrentHashMap. Since concurrency level is used for internal sizing and indicate number of concurrent update without contention, so, if you just have few writers or thread to update Map keeping it low is much better. ConcurrentHashMap also uses ReentrantLock to internally lock its segments.

ConcurrentHashMap putifAbsent example in Java

ConcurrentHashMap examples are similar to Hashtable examples, we have seen earlier, but worth knowing is the use of putIfAbsent() method. Many times we need to insert entry into Map if it's not present already, and we wrote following kind of code:

synchronized(map){

if (map.get(key) ==null){

return map.put(key, value);

} else{

return map.get(key);

}

}

Though this code will work fine inHashMap and Hashtable, This won't work in ConcurrentHashMap; because, during put operation whole map is not locked, and while one thread is putting value, other thread's get() call can still return null which result in one thread overriding value inserted by other thread. Ofcourse, you can wrap whole code in synchronized block and make it thread-safe but that will only make your code single threaded. ConcurrentHashMap provides putIfAbsent(key, value) which does same thing but atomically and thus eliminates above race condition. See Core Java for Inpatients for more details about how to use this method effectively:

When to use ConcurrentHashMap in Java

ConcurrentHashMap is best suited when you have multiple readers and few writers. If writers outnumber reader, or writer is equal to reader, than performance of ConcurrentHashMap effectively reduces to synchronized map or Hashtable. Performance of CHM drops, because you got to lock all portion of Map, and effectively each reader will wait for another writer, operating on that portion of Map. ConcurrentHashMap is a good choice for caches, which can be initialized during application start up and later accessed my many request processing threads. As javadoc states, CHM is also a good replacement of Hashtable and should be used whenever possible, keeping in mind, that CHM provides slightly weeker form of synchronization than Hashtable.

Summary

Now we know What is ConcurrentHashMap in Java and when to use ConcurrentHashMap, it’s time to know and revise some important points about CHM in Java.

2. During the update operation, ConcurrentHashMap only locks a portion of Map instead of whole Map.

3. The concurrent update is achieved by internally dividing Map into the small portion which is defined by concurrency level.

4. Choose concurrency level carefully as a significantly higher number can be a waste of time and space and the lower number may introduce thread contention in case writers over number concurrency level.

6. Since ConcurrentHashMap implementation doesn't lock whole Map, there is chance of read overlapping with update operations like put() and remove(). In that case result returned by get() method will reflect most recently completed operation from there start.

7. Iterator returned by ConcurrentHashMap is weekly consistent, fail-safe and never throw ConcurrentModificationException. In Java.

8. ConcurrentHashMap doesn't allow null as key or value.

9. You can use ConcurrentHashMap in place of Hashtable but with caution as CHM doesn't lock whole Map.

10. During putAll() and clear() operations, the concurrent read may only reflect insertion or deletion of some entries.

That’s all on What is ConcurrentHashMap in Java and when to use it. We have also seen little bit about internal working of ConcurrentHashMap and how it achieves it’s thread-safety and better performance over Hashtable and synchronized Map. Use ConcurrentHashMap in Java program, when there will be more reader than writers and it’s a good choice for creating cache in Java as well.

17 comments
:

interviewer was possibly expecting a simple answer, such as: •if the whole map is synchronized for get/put operations, adding threads won't improve throughput because the bottleneck will be the synchronized blocks. You can then write a piece of code with a synchronizedMap that shows that adding threads does not help •because the map uses several locks, and assuming you have more than one core on your machine, adding threads will improve throughput

So you can see that the synchronized version is more than 3 times slower under high contention (16 threads) whereas the concurrent version is almost twice as fast with multiple threads as with a single thread.

It is also interesting to note that the ConcurrentMap has a non-negligible overhead in a single threaded situation.

This is a very contrived example, with all the possible problems due to micro-benchmarking (first results should be discarded anyway). But it gives a hint at what happens.

@Saral Saxena, great comment. Yes, you are right on single threaded environment, ConcurrentHashMap is tough slow than Hashtable and HashMap. On scalability front, it achieves using concurrency level by dividing Map into segment, with each segment having its own lock.

Thanks Javin,..As I have shown in the above example the two methods simply add 1 million entries to the map. the first one is run in the main thread, the second one is run with an executor service that has 16 threads. To try make to help the CHM, I count with a step of 16 (so one thread will put 0, 15, 31 etc, the second thread 1, 16, 32 etc) which should reduce lock contention and associate each thread with one lock (there are 16 locks, based on the hashcode of the key modulo 16). Note that the effect is more pronounced with more threads (with 100 threads, the sychronized version really suffers).

Hi @mattnguyen, if you don't synchronized then that code snippet will create data race, where one thread calling get() getting null, just before another thread putting value, which result in overriding that value. synchronized is required to get the desired functionality, i.e. only insert,if not present.

@Anonymous, ConcurrentHashMap doesn't use ReadWriteLock, instead it uses multiple tables (arrays) known as Segment, each with there own lock. This concept is known as lock-stripping, this means, you don't need to lock whole map during write operation, only relevant segment gets locked. This is optimized for more reading than updating activity. Even a call to size() may perform poor because, in worst case a call to size() may lock all segment for calculating entries in each segment and then adding them up. As I said, ConcurrentHashMap is optimized for multiple readers with few writers.

Just want to high light one important limitation of ConcurrentHashMap, client side locking is not possible as there is no lock, which can guard whole map. Though this limitation is somewhat addressed by adding ConcurrentMap interface and providing atomic function for compound operations e.g. putIfAbsent(), remove() or replace(key, oldValue, newValue)

on one hand, you have said that CHM has concurrent reads and THREAD SAFE writes, on the other hand you said "Since update operations like put(), remove(), putAll() or clear() is not synchronized".if updates are not synchronized then how come it has thread safe writes?

Hi, I have littlebit confusion about this.You mentioned that only portion of map is locked while put operation is going on. I assume the segment in which map is putting entry is getting locked. Does it allow get() (read) operation on that segment?Because you mentioned:

"because, during put operation whole map is not locked, and while one thread is putting value, other thread's get() call can still return null"

this means get operation is allowed on locked region?

you also mentioned:

"because you got to lock all portion of Map, and effectively each reader will wait for another writer, operating on that portion of Map"

which means read operation is blocked by write operation.

I found these two statements ambiguous and I started getting confused.

Hello Navrattan, rest of things about ConcurrentHashMap is same as HashMap except that it divides the map into segments and lock those segments instead of locking the whole Map. You specify size at the start to prevent resizing because that is expensive as it involves creating a new array and copying content from old array to new one.