
Gil Tene: Understanding Hardware Transactional Memory

In his presentation "Understanding Hardware Transactional Memory" at QCon New York 2016, Gil Tene introduces hardware transactional memory (HTM). While the concept of HTM is not new, it has only recently become available in commodity hardware: starting in Q2 2016, every Intel-based server supports it. The basic purpose of HTM is to make writes to multiple memory addresses appear atomic, so that cooperating threads can never observe an inconsistent intermediate state.

Tene starts by explaining the four states a cache line can be in:

invalid

shared

exclusive

modified

and points out that with regard to hardware transactional memory, there are now two additional states:

line was read during speculation

line was modified during speculation

A transaction has to be aborted if another CPU wants to write data the transaction has read, if another CPU wants to read data the transaction has modified, or if the CPU self-evicts a speculatively accessed cache line.

According to Tene, the big advantage of hardware transactional memory is eliminating serialized blocks. The goal is to run fully in parallel and roll back only in case of an actual collision on a data item. This is directly related to Amdahl's law, which states that the serial fraction of a program limits the achievable speed-up no matter how many cores are added. With ten percent serialization in an application, ten CPUs provide little more than a factor-five speed-up, and to approach a speed-up of factor ten one would need on the order of 100 CPUs.
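The numbers above follow directly from Amdahl's formula. As a minimal sketch (the class and method names are illustrative, not from the talk):

```java
// Sketch: Amdahl's law speed-up for a fixed serial fraction.
public class Amdahl {
    // speedup = 1 / (s + (1 - s) / n), where s is the serial fraction
    // of the program and n is the number of CPUs.
    static double speedup(double s, int n) {
        return 1.0 / (s + (1.0 - s) / n);
    }

    public static void main(String[] args) {
        // 10% serialization: 10 CPUs give only ~5.26x, 100 CPUs only ~9.17x.
        System.out.printf("10 CPUs:  %.2fx%n", speedup(0.10, 10));
        System.out.printf("100 CPUs: %.2fx%n", speedup(0.10, 100));
    }
}
```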

Tene goes on to distinguish lock contention from data contention. When working with a large hashset, for example, accesses might occur in completely different areas, yet the whole hashset still needs to be locked. Lock contention is usually much higher than data contention, but only data contention is an actual problem for CPUs working in parallel. Aborting transactions only in cases of data contention would therefore drastically reduce the impact of Amdahl's law and speed up parallel computing.
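The distinction can be illustrated with a hypothetical workload (this example is mine, not from the talk): two threads update disjoint keys of one map, so the single coarse lock serializes them (lock contention) even though they never touch the same entry (no data contention). This is exactly the pattern HTM can speculate through.

```java
import java.util.HashMap;
import java.util.Map;

public class DisjointUpdates {
    static final Map<Integer, Integer> map = new HashMap<>();

    static void update(int key) {
        synchronized (map) {  // one coarse lock guards the whole map
            map.merge(key, 1, Integer::sum);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Each thread touches a different key, yet they contend on the lock.
        Thread a = new Thread(() -> { for (int i = 0; i < 100_000; i++) update(1); });
        Thread b = new Thread(() -> { for (int i = 0; i < 100_000; i++) update(2); });
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(map.get(1) + " " + map.get(2));
    }
}
```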

With regard to Java synchronized blocks, Tene explains that uncontended blocks execute as fast as before. Only when actual data contention occurs does the transaction need to be rolled back and the code re-executed. For Java applications this is completely transparent: no code changes are needed once the Java virtual machine uses HTM, which is the case for HotSpot 8 JVMs from update 40 on. Tene also shows simple benchmarks that visualize the positive effect of hardware transactional memory: even with five percent writes to a hashset, throughput scales linearly as more CPUs are added.

Gil Tene concludes by pointing out that although using HTM is transparent to developers, they now need to start thinking about data contention in their applications. Multiple threads should not modify a single variable, because that leads to data contention and thus to a loss of speed-up, since the advantages of HTM cannot be leveraged.
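One existing JDK class that already follows this advice is `java.util.concurrent.atomic.LongAdder` (my example, not one from the talk): it stripes its internal counter across cells, so increments from many threads rarely collide on a single memory location, avoiding exactly the single-variable data contention Tene warns about.

```java
import java.util.concurrent.atomic.LongAdder;

public class StripedCounter {
    // Run `threads` threads, each incrementing the adder `perThread` times.
    static long countWith(int threads, int perThread) throws InterruptedException {
        LongAdder counter = new LongAdder();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) counter.increment();
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        return counter.sum();  // sum over all internal cells
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWith(4, 50_000));
    }
}
```

A plain shared `long` guarded by one lock, or even a single `AtomicLong`, would put every update on the same cache line; the striped design keeps updates spread out so parallel threads do not invalidate each other's caches.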

Please note that most QCon presentations will be made available for free on InfoQ in the weeks after the conference and slides are available for download on the conference web site. You can also watch an interview with Gil Tene on this topic on InfoQ.