Search This Blog

How LongAccumulator and DoubleAccumulator classes work?

Two classes new in Java 8 deserve some attention: LongAccumulator and DoubleAccumulator. They are designed to accumulate (more on what does that mean later) values across threads safely while being extremely fast. A test is worth a thousand words, so here is how it works:

So the accumulator takes a binary operator and combines initial value with every accumulated value. That means ((((0 + 1) + 2) + 3) + -4) equals to 2. Don't go away yet, there's much more than that. Accumulator can take other operators as well, as illustrated by this use case:

Obviously accumulator would work just as well under heavy multi-threaded environment - which it was designed for. Now the question is, what other operations are permitted in LongAccumulator (this applies to DoubleAccumulator as well) and why? JavaDoc is not very formal this time (bold mine):

The order of accumulation within or across threads is not
guaranteed and cannot be depended upon, so this class is only
applicable to functions for which the order of accumulation does
not matter. The supplied accumulator function should be
side-effect-free, since it may be re-applied when attempted updates
fail due to contention among threads. The function is applied with
the current value as its first argument, and the given update as
the second argument.

In order to understand how LongAccumulator works, what type of operations are permitted and why it's so fast (because it is, compared to e.g AtomicLong), let's start from the back, the get() method:

We clearly see that there is some internal cells array and that in the end we must go through that array and apply our operator function sequentially on each element. Turns out LongAccumulator has two mechanisms for accumulating values: a single base counter and an array of values in case of high lock thread contention. If LongAccumulator is used under no lock contention, only a single volatile base variable and CAS operations are used, just like in AtomicLong. However if CAS fails, this class falls back to an array of values. You don't want to see the implementation, it's 90 lines long, occasionally with 8 levels of nesting. What you need to know is that it uses simple algorithm to always assign given thread to the same cell (improves cache locality). From now on this thread has its own, almost private copy of counter. It shares this copy with couple of other threads, but not with all of them - they have their own cells. So what you end up in the end is an array of semi-calculated counters which must be aggregated. This is what you saw in get() method.

This brings us again to the question, what kind of operators (op) are permitted in LongAccumulator. We know that the same sequence of accumulations under low load will result e.g. in:

((I op A) op B) //get()

Which means all values are aggregated in base variable and no counter array is used. However under high load, LongAccumulator will split work e.g. into two buckets (cells) and later accumulate buckets as well:

(I op A) //cell 1
(I op B) //cell 2
(I op A) op (I op B) //get()

or vice-versa:

(I op B) //cell 1
(I op A) //cell 2
(I op B) op (I op A) //get()

Clearly all invocations of get() should yield the same result, but it all depends on the properties of op operator being provided (+, *, max, etc.)

Commutative

We have no control over the order of cells and how they are assigned. That's why ((I op A) op (I op B)) and ((I op B) op (I op A)) must return the same result. More compactly we are looking for such operators op where X op Y = Y op X for every X and Y. This means op must be commutative.

Neutral element (identity)

Cells are logically initialized with identity (initial) value I. We have no control over the number and order of cells, thus the identity value can be applied numerous times in any order. However this is an implementation detail, so it shouldn't affect the result. More precisely, for every X and any op:

X op I = I op X = X

Which means the identity (initial) value I must be a neutral value for every argument X to operator op.

Which proves that op must be associative in order for LongAccumulator to actually work.

Wrap up

LongAccumulator and DoubleAccumulator are highly specialized classes new in JDK 8. JavaDoc is quite vaque but we tried to prove properties that an operator and initial value must fullfil in order for them to do their job. We know that the operator must be associative, commutative and have a neutral element. It would have been so much better if JavaDoc clearly stated that it must be an abelian monoid ;-). Nevertheless for practical purposes these accumulators work only for adding, multiplying, min and max, as these are the only useful operators (with appropriate neutral element) that play well. Subtracting and dividing for example is not associative and commutative, thus can't possibly work. To make matters worse, accumulators would simply behave undeterministically.