Futures

A Future represents an asynchronous computation. You can wrap your computation in a Future and when you need the result, call the blocking Await.result() method on it. An ExecutorService returns a Future when you submit work to it. If you use the Finagle RPC system, you use Future instances to hold results that might not have arrived yet.
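Finagle's Future lives in com.twitter.util; as an illustration, here is the same shape sketched with the standard library's scala.concurrent API (not Finagle code):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Wrap a computation in a Future; it runs on a thread pool behind the scenes.
val f: Future[Int] = Future {
  1 + 1
}

// Block until the result arrives (bounded by a timeout here).
val result = Await.result(f, 5.seconds)
// result == 2
```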

Thread Safety Problem

This program is not safe in a multi-threaded environment. If two threads have references to the same instance of Person and call set, you can’t predict what name will be at the end of both calls.
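A minimal sketch of such an unsafe mutable Person (the field and method names are illustrative):

```scala
// A mutable Person with no synchronization: not thread-safe.
class Person(var name: String) {
  def set(changedName: String): Unit = {
    name = changedName
  }
}

val person = new Person("alice")
// If two threads call person.set at the same time, the final value of
// name depends on scheduling, and other threads may observe stale values.
person.set("bob")
```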

In the Java memory model, each processor is allowed to cache values in its L1 or L2 cache so two threads running on different processors can each have their own view of data.

Let’s talk about some tools that force threads to keep a consistent view of data.

Three tools

synchronization

Mutexes provide ownership semantics. When you enter a mutex, you own it. The most common way of using a mutex in the JVM is by synchronizing on something. In this case, we’ll synchronize on our Person.
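A sketch of what synchronizing on the Person looks like:

```scala
class Person(var name: String) {
  def set(changedName: String): Unit = {
    // Entering this block acquires the monitor on `this`; no other thread
    // can enter a synchronized block on the same Person until we exit.
    this.synchronized {
      name = changedName
    }
  }
}
```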

Does this cost anything?

AtomicReference is the most costly of these choices since you have to go through method dispatch to access values.

volatile and synchronized are built on top of Java’s built-in monitors. Monitors cost very little if there’s no contention. Since synchronized allows you more fine-grained control over when you synchronize, there will be less contention so synchronized tends to be the cheapest option.

When you enter synchronized points, access volatile references, or dereference AtomicReferences, Java forces the processor to flush its cache lines and provide a consistent view of data.

PLEASE CORRECT ME IF I’M WRONG HERE. This is a complicated subject; I’m sure there will be a lengthy classroom discussion at this point.

Other neat tools from Java 5

As I mentioned with AtomicReference, Java 5 brought many great tools along with it.

CountDownLatch

A CountDownLatch is a simple mechanism for multiple threads to communicate with each other.

Among other things, it’s great for unit tests. Let’s say you’re doing some async work and want to ensure that functions are completing. Simply have your functions countDown the latch and await in the test.
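A sketch of that unit-test pattern, using plain threads:

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

val latch = new CountDownLatch(2)

// Two async tasks, each counting the latch down when its work is done.
val workers = (1 to 2).map { _ =>
  new Thread(new Runnable {
    def run(): Unit = {
      // ... do some async work here ...
      latch.countDown()
    }
  })
}
workers.foreach(_.start())

// The test blocks here until both tasks have finished (or the timeout fires).
val completed = latch.await(5, TimeUnit.SECONDS)
```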

AtomicInteger/Long

Since incrementing Ints and Longs is such a common task, AtomicInteger and AtomicLong were added.
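For example, incrementing a shared counter from several threads stays correct without any explicit locking (a sketch):

```scala
import java.util.concurrent.atomic.AtomicInteger

val counter = new AtomicInteger(0)

// incrementAndGet is an atomic read-modify-write, so concurrent
// increments are never lost the way unsynchronized var updates can be.
val threads = (1 to 4).map { _ =>
  new Thread(new Runnable {
    def run(): Unit = for (_ <- 1 to 1000) counter.incrementAndGet()
  })
}
threads.foreach(_.start())
threads.foreach(_.join())
// counter.get == 4000
```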

AtomicBoolean

I probably don’t have to explain what this would be for.

ReadWriteLocks

ReadWriteLock lets you take reader and writer locks. Reader locks only block while a writer lock is held.
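A sketch using the standard ReentrantReadWriteLock implementation (the shared value and helpers are illustrative):

```scala
import java.util.concurrent.locks.ReentrantReadWriteLock

val lock = new ReentrantReadWriteLock()
var sharedValue = 0

def read(): Int = {
  lock.readLock().lock()   // many readers can hold this simultaneously
  try sharedValue
  finally lock.readLock().unlock()
}

def write(newValue: Int): Unit = {
  lock.writeLock().lock()  // exclusive: waits for all readers and writers to leave
  try sharedValue = newValue
  finally lock.writeLock().unlock()
}

write(42)
```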

Let’s build an unsafe search engine

Here’s a simple inverted index that isn’t thread-safe. Our inverted index maps parts of a name to a given User.

This is written in a naive way assuming only single-threaded access.

Note the alternative default constructor this() that uses a mutable.HashMap.
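A sketch of what such an index might look like (the User fields and tokenization are illustrative):

```scala
import scala.collection.mutable

case class User(name: String, id: Int)

class InvertedIndex(val userMap: mutable.Map[String, User]) {

  // Alternative default constructor backed by a mutable.HashMap.
  def this() = this(new mutable.HashMap[String, User])

  def tokenizeName(name: String): Seq[String] =
    name.split(" ").map(_.toLowerCase).toSeq

  def add(term: String, user: User): Unit = {
    userMap += term -> user
  }

  // Index a user under every token of their name. Nothing here is
  // synchronized, so concurrent adds can lose or corrupt updates.
  def add(user: User): Unit = {
    tokenizeName(user.name).foreach { term =>
      add(term, user)
    }
  }
}
```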

I’ve left out how to get users out of our index for now. We’ll get to that later.

Let’s make it safe

In our inverted index example above, userMap is not guaranteed to be safe. Multiple clients could try to add items at the same time and have the same kinds of visibility errors we saw in our first Person example.

Since userMap isn’t thread-safe, how do we keep only a single thread at a time mutating it?
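One answer is to synchronize on userMap around the entire add. A sketch (the User and tokenizeName shapes are illustrative):

```scala
import scala.collection.mutable

case class User(name: String, id: Int)

class SynchronizedInvertedIndex(val userMap: mutable.Map[String, User]) {
  def this() = this(new mutable.HashMap[String, User])

  def tokenizeName(name: String): Seq[String] =
    name.split(" ").map(_.toLowerCase).toSeq

  // Holding the monitor on userMap keeps mutation single-threaded,
  // but it also serializes the tokenization work.
  def add(user: User): Unit = userMap.synchronized {
    tokenizeName(user.name).foreach { term =>
      userMap += term -> user
    }
  }
}
```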

Unfortunately, this is too coarse. Always try to do as much expensive work outside of the mutex as possible. Remember what I said about locking being cheap if there is no contention. If you do less work inside of a block, there will be less contention.
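A sketch of the same add with the tokenization hoisted out of the mutex:

```scala
import scala.collection.mutable

case class User(name: String, id: Int)

class ConcurrentInvertedIndex(val userMap: mutable.Map[String, User]) {
  def this() = this(new mutable.HashMap[String, User])

  def tokenizeName(name: String): Seq[String] =
    name.split(" ").map(_.toLowerCase).toSeq

  def add(user: User): Unit = {
    // Do the (comparatively expensive) tokenization outside the lock...
    val tokens = tokenizeName(user.name)
    // ...and hold the monitor only for the actual mutation.
    userMap.synchronized {
      tokens.foreach { term => userMap += term -> user }
    }
  }
}
```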

For every line in our file, we call makeUser and then add it to our InvertedIndex. If we use a concurrent InvertedIndex, we can call add in parallel; and since makeUser has no side-effects, it is already thread-safe.

We can’t read a file in parallel but we can build the User and add it to the index in parallel.

A solution: Producer/Consumer

A common pattern for async computation is to separate producers from consumers and have them only communicate via a Queue. Let’s walk through how that would work for our search engine indexer.
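A sketch of that shape with a blocking queue (the line format and User fields are made up for illustration):

```scala
import java.util.concurrent.LinkedBlockingQueue

case class User(name: String, id: Int)

val queue = new LinkedBlockingQueue[User]()

// Stand-in for lines read from a file.
val lines = Seq("Ada Lovelace,1", "Alan Turing,2")

// Producer: parse each line into a User and put it on the queue.
val producer = new Thread(new Runnable {
  def run(): Unit = lines.foreach { line =>
    val parts = line.split(",")
    queue.put(User(parts(0), parts(1).toInt)) // blocks if the queue is full
  }
})
producer.start()

// Consumer: take() blocks until a User is available, then "indexes" it.
val indexed = (1 to lines.size).map(_ => queue.take())
producer.join()
```

Because producer and consumer only touch the shared queue, neither needs to know about the other's locking; LinkedBlockingQueue handles the synchronization.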