Topics

Featured in Development

Peter Alvaro talks about the reasons one should engage in language design and why many of us would (or should) do something so perverse as to design a language that no one will ever use. He shares some of the extreme and sometimes obnoxious opinions that guided his design process.

Featured in AI, ML & Data Engineering

Today on The InfoQ Podcast, Wes talks with Katharine Jarmul about privacy and fairness in machine learning algorithms. Jarmul discusses what’s meant by Ethical Machine Learning and some things to consider when working towards achieving fairness. Jarmul is the co-founder of KIProtect, a machine learning security and privacy firm based in Germany, and is one of the three keynote speakers at QCon.ai.

Featured in Culture & Methods

Organizations struggle to scale their agility. While every organization is different, common patterns explain the major challenges that most organizations face: organizational design, trying to copy others, “one-size-fits-all” scaling, scaling in siloes, and neglecting engineering practices. This article explains why, what to do about it, and how the three leading scaling frameworks compare.

Researchers at MIT have been working on a fork of LLVM to explore a new approach to optimizing parallel code by embedding fork-join parallelism directly into the compiler’s intermediate representation (IR). This, the researchers maintain, makes it possible to leverage most of the IR-level serial optimizations for parallel programs.

Fork-join parallelism is a way to organize parallel programs that is especially suited to divide-and-conquer algorithms, such as merge sort. Fork-join parallelism is supported in mainstream compilers such as GCC and LLVM through a set of linguistic extensions, for example those provided by OpenMP (e.g., #pragma omp parallel, #pragma omp parallel for, and more) and Cilk Plus (cilk_spawn and cilk_sync). The compiler front-end handles those linguistic extensions to “lower” parallel constructs to a more primitive representation, which is then translated into IR. For example, the code snippet below uses Cilk’s cilk_for extension to make it possible to run each iteration of the loop in parallel:
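The snippet referenced above is missing here; a minimal example of this kind (similar to the normalization loop the researchers use in their paper) might be the following. It requires a Cilk-enabled compiler such as OpenCilk; the function name is illustrative:

```c
#include <cilk/cilk.h>

/* Each iteration is independent, so cilk_for lets the runtime
 * execute iterations of the loop in parallel. */
void normalize(const double *restrict in, double *restrict out, int n) {
    cilk_for (int i = 0; i < n; ++i)
        out[i] = in[i] / n;
}
```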

One of the drawbacks of this approach, though, is that the compiler middle-end no longer sees a for loop, but only opaque runtime calls: the loop body is abstracted into its own function and passed to a library routine that handles the spawning of the loop iterations and their later synchronization. This effectively prevents the middle-end from applying the optimizations it would normally apply to for loops at the IR level, such as loop-invariant code motion, scheduling, and so on.
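To make the lowering concrete, here is a sketch (all names hypothetical, for illustration only) of how a front-end might transform a cilk_for loop computing out[i] = in[i] / n into an outlined body plus an opaque runtime call:

```c
struct ctx { const double *in; double *out; int n; };

/* Outlined loop body: what used to be inside the cilk_for. */
static void loop_body(void *data, int i) {
    struct ctx *c = data;
    c->out[i] = c->in[i] / c->n;
}

/* Stand-in for the runtime's parallel-loop entry point (hypothetical
 * name); a real runtime would spawn iterations across worker threads.
 * Here it simply runs them serially. */
static void parallel_loop_runtime(void (*body)(void *, int), void *data,
                                  int lo, int hi) {
    for (int i = lo; i < hi; ++i)
        body(data, i);
}

void normalize(const double *in, double *out, int n) {
    struct ctx c = { in, out, n };
    /* The middle-end sees only this opaque call, not a for loop. */
    parallel_loop_runtime(loop_body, &c, 0, n);
}
```

After this transformation, loop optimizations such as loop-invariant code motion have nothing to work on: the loop structure is hidden behind the function pointer and the runtime call.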

Schardl, Moses, and Leiserson’s work brings the fork-join model directly to the middle-end through an extended IR, thus enabling the compiler to apply all kinds of optimizations to the code before adding any extra code required by parallel processing. This is not in itself new, and several IRs have been devised specifically to represent parallelism in a program; however:

...adopting a separate IR into a mainstream compiler has historically been criticized as requiring considerable effort to engineer, develop, and maintain the additional IR to the same standards as the compiler’s existing serial IR.

The key here is that the three MIT researchers have found a way to extend LLVM’s IR to represent logical task parallelism while preserving the program’s serial semantics. Tapir, as their new IR is called, represents parallel tasks asymmetrically: a spawned task must complete before the execution flow can be synced with its continuation. According to the researchers, this is what makes it possible for a largely serial middle-end such as LLVM’s to efficiently optimize parallel code.

Tapir extends LLVM’s IR by adding three new instructions: detach, reattach, and sync. While detach roughly corresponds to the same abstraction as fork, reattach and sync represent a slight departure from join. This is motivated by the requirement that, to be serializable, a parallel computation must ensure that the detached block is executed to completion before branching to the continuation. Thus, while detach and reattach express the start and end of a parallel task, sync synchronizes tasks spawned within its parallel context.
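A simplified sketch of what a Tapir-extended control-flow fragment might look like (syntax approximated; this is not verbatim output from the researchers’ compiler):

```
entry:
  detach label %body, label %cont   ; spawn %body as a parallel task,
                                    ; continue at %cont
body:
  ; ... the detached task, e.g. one loop iteration ...
  reattach label %cont              ; task runs to completion, then
                                    ; rejoins the continuation
cont:
  ; ... continuation, possibly detaching further tasks ...
  sync                              ; wait for all tasks detached in
                                    ; this parallel context
```

Because replacing detach/reattach with plain branches yields a valid serial execution, existing serial analyses and optimizations can largely treat this IR as ordinary control flow.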

To evaluate the benefits of their new approach, the MIT researchers compared a Cilk-enabled LLVM compiler with a Tapir-enabled one when compiling a suite of 20 Cilk programs.

For 17 of the 20 programs, the compiler using the new intermediate representation yielded more efficient software, with gains of 10 to 25 percent for a third of them. For the programs where the new compiler yielded less efficient software, the slowdown was less than 2 percent.