onSpinWait() method from Thread class

Introduction

The purpose of this article is to describe the new onSpinWait() method added to the Thread class in JDK 9, including its usage and its pros and cons, and to cover a few alternatives.

onSpinWait() was introduced by JEP 285 to let Java code hint to the CPU that the caller is in a busy-waiting loop that may burn a few CPU cycles while waiting for something to happen. The CPU can then assign more resources to other threads without actually invoking the OS scheduler to switch in another thread (which may be expensive).

A thread calling onSpinWait() does not give up its time slice; it just delays the execution of the next instruction for a finite period of time. While the next instruction is delayed, the processor is under less demand: it issues fewer instructions into the pipeline, so parts of it are no longer being used, which in turn reduces the power consumed by the processor! The number of cycles delayed may vary from one processor family to another.

Usage

onSpinWait() fits best when:

a thread is waiting for an external condition or event that occurs very frequently (i.e. at a high rate)

and the events complete very quickly, so the thread is not expected to wait for long

Pseudocode Pattern:
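A minimal Java sketch of the pattern, assuming a hypothetical flag `eventReady` that some other thread sets when the awaited event occurs (the class and method names are illustrative only):

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class SpinWaitPattern {
    // Hypothetical flag, flipped by another thread when the awaited event occurs.
    private final AtomicBoolean eventReady = new AtomicBoolean(false);

    // Busy-waits until the flag is set, issuing the spin-wait hint on every iteration.
    void awaitEvent() {
        while (!eventReady.get()) {
            Thread.onSpinWait();
        }
    }

    // Called by the thread that produces the event.
    void signalEvent() {
        eventReady.set(true);
    }

    // Starts one waiter, signals it, and reports whether the waiter finished.
    static boolean demo() {
        SpinWaitPattern pattern = new SpinWaitPattern();
        Thread waiter = new Thread(pattern::awaitEvent, "waiter");
        waiter.start();
        pattern.signalEvent();
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("waiter finished: " + demo());
    }
}
```

The loop stays correct even if the hint does nothing on a given platform; onSpinWait() only affects how the waiting is performed, not what is being waited for.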

Because the events happen very frequently, it is worth keeping the CPU time slice, since the cost of being rescheduled outweighs the benefit. Usually, when a thread is rescheduled, there is an increased number of context switches, which comes at a high latency cost. onSpinWait() tries to mitigate that cost while also reducing power consumption.

One classic example is the Producer-Consumer pattern, where the Producer produces items at a high rate (very frequently) and signals the Consumer to consume them.

A full code listing based on the Producer-Consumer pattern can be found in Gil Tene’s repository.
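As a rough illustration of the idea (this is not the listing from that repository), here is a minimal single-slot handoff sketch. It assumes exactly one producer and one consumer, non-negative items, and a hypothetical class name `SpinHandoff`:

```java
import java.util.concurrent.atomic.AtomicLong;

final class SpinHandoff {
    private static final long EMPTY = -1L;          // sentinel: no item in the slot
    private final AtomicLong slot = new AtomicLong(EMPTY);

    // Producer side: spin until the slot is free, then publish the item (must be >= 0).
    void put(long item) {
        while (slot.get() != EMPTY) {
            Thread.onSpinWait();
        }
        slot.set(item);
    }

    // Consumer side: spin until an item is present, then take it and free the slot.
    long take() {
        long item;
        while ((item = slot.get()) == EMPTY) {
            Thread.onSpinWait();
        }
        slot.set(EMPTY);
        return item;
    }

    // Hands the items 1..count from a producer thread to the calling thread; returns their sum.
    static long run(int count) {
        SpinHandoff handoff = new SpinHandoff();
        Thread producer = new Thread(() -> {
            for (long i = 1; i <= count; i++) {
                handoff.put(i);
            }
        }, "producer");
        producer.start();
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += handoff.take();
        }
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("sum of consumed items: " + run(1000));
    }
}
```

Both sides spin only for the short window in which the other side is busy, which is the situation the hint is meant for. With more than one producer or consumer, this simple get-then-set scheme would no longer be safe.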

Other Alternatives

Sometimes, depending on the problem at hand, the same behavior can be simulated using the alternative APIs below. However, all of them have disadvantages and might prove less efficient:

yield()

it allows the OS scheduler either to choose another thread that is ready to run (based on thread priorities) or to keep running the current thread without switching it out and back in

sleep()

the current thread is forcibly switched out (i.e. a context switch occurs) and put into the timed waiting state, regardless of thread priority or processor residency. Once the sleep interval is over, the thread is scheduled for execution again.

wait() – notify()

the OS scheduler moves the current thread to the wait queue. When the notify happens, the OS scheduler moves the thread to the run queue, to be scheduled when possible

For reference, a context switch might cost somewhere around 5,000 cycles, so getting switched out and switched back in means that the CPU has spent around 10,000 cycles on pure overhead!
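To make the comparison concrete, the three alternatives can be sketched as interchangeable ways of waiting for the same condition. The class `WaitAlternatives`, its helpers, and the small `Latch` are hypothetical names used only for this illustration:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class WaitAlternatives {
    // yield(): poll the flag, offering the CPU to other runnable threads between checks.
    static void awaitWithYield(AtomicBoolean flag) {
        while (!flag.get()) {
            Thread.yield();
        }
    }

    // sleep(): poll the flag, leaving the CPU for about a millisecond between checks.
    static void awaitWithSleep(AtomicBoolean flag) {
        while (!flag.get()) {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // wait()/notify(): block on a monitor until another thread opens the latch.
    static final class Latch {
        private boolean open;

        synchronized void await() {
            while (!open) {                 // loop guards against spurious wakeups
                try {
                    wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }

        synchronized void open() {
            open = true;
            notifyAll();
        }
    }

    // Starts one waiter per technique, releases them, and returns how many completed.
    static int demo() {
        AtomicBoolean flag = new AtomicBoolean(false);
        Latch latch = new Latch();
        AtomicInteger completed = new AtomicInteger();
        Thread[] waiters = {
            new Thread(() -> { awaitWithYield(flag); completed.incrementAndGet(); }),
            new Thread(() -> { awaitWithSleep(flag); completed.incrementAndGet(); }),
            new Thread(() -> { latch.await(); completed.incrementAndGet(); }),
        };
        for (Thread t : waiters) {
            t.start();
        }
        flag.set(true);
        latch.open();
        try {
            for (Thread t : waiters) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("waiters completed: " + demo());
    }
}
```

All three variants reach the same result; they differ in how much the OS scheduler gets involved while the thread waits.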

Benchmark

I wrote a small benchmark to compare the performance of onSpinWait(), yield(), and sleep().

onSpinWait performs best in terms of both average time and number of context switches

yield is almost 10x slower than onSpinWait in terms of average time. There are also more context switches, because the running thread is at the mercy of the OS scheduler, which decides either to keep it running or to de-schedule it in favor of other threads

sleep is the worst. An important metric is the number of context switches, which is significantly higher than for the other two, because the OS scheduler always de-schedules the running thread.
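For readers who want to try a comparison themselves, here is a deliberately naive timing sketch. It is not the benchmark behind the results above: the class `WaitStrategyTiming`, the ping-pong scheme, and the round counts are all assumptions made for illustration, and it does not count context switches. A harness such as JMH would give more trustworthy numbers.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class WaitStrategyTiming {
    // Runs `rounds` ping-pong exchanges between the caller and a partner thread.
    // `idle` is executed on every failed check; returns the elapsed time in nanoseconds.
    static long pingPong(int rounds, Runnable idle) {
        AtomicInteger turn = new AtomicInteger(0);   // odd: partner's move, even: caller's move
        Thread partner = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                while (turn.get() != 2 * i + 1) {
                    idle.run();
                }
                turn.set(2 * i + 2);
            }
        }, "partner");
        partner.start();

        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            turn.set(2 * i + 1);
            while (turn.get() != 2 * i + 2) {
                idle.run();
            }
        }
        long elapsed = System.nanoTime() - start;

        try {
            partner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return elapsed;
    }

    // sleep() as an idle action; the interrupt flag is restored if the sleep is cut short.
    static void sleepBriefly() {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        pingPong(10_000, Thread::onSpinWait);        // crude warm-up before timing
        System.out.println("onSpinWait ns: " + pingPong(100_000, Thread::onSpinWait));
        System.out.println("yield      ns: " + pingPong(100_000, Thread::yield));
        System.out.println("sleep      ns: " + pingPong(100, WaitStrategyTiming::sleepBriefly));
    }
}
```

The sleep variant uses far fewer rounds because each failed check costs at least a millisecond, so compare per-round times rather than totals. Results will vary with the hardware, the number of cores, and the system load.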

Might not be available for all architectures!

On x86, onSpinWait() relies on the PAUSE assembly instruction. On other architectures, however, it might not work as expected, and since the method is only a hint it may have no effect at all! For example, at the time of writing this article there is a task, JDK-8159532, raised to find an appropriate intrinsic for SPARC architectures, so use it with care in day-to-day code!