How Parallelware works

Using LLVM to understand your code

To analyze your code, Parallelware invokes the Clang compiler and generates an LLVM Intermediate Representation (IR) of it. To improve the quality of the analysis, the IR must stay as close to the original source as possible, so the Appentra team has developed an LLVM mapper that maps source-code information onto the IR, recovering information that is lost during normal IR generation.

Once the enhanced LLVM IR is generated, Parallelware performs a semantic analysis of the code, covering the data flow and data scoping of each variable, which makes it possible to identify where the code can be parallelized. This is achieved using our unique hierarchical static analysis methods, which identify where and how parallelism can be exploited in your code, including whether variables should be shared or private and which parallelization strategies will work.

Finally, Parallelware assists you in introducing OpenMP and OpenACC directives to enable CPU and GPU parallelism. The technology can identify multiple solutions for a single parallelization opportunity and can advise on how data should be managed to preserve the thread safety of your code. Parallelware can do everything from showing you what is possible to inserting correct, complete directives into your code, thereby automating the development of parallelism.

Every product in the Parallelware Tools Suite uses the Parallelware Technology at its core, enabling a unique, productive approach to developing parallelized software. This allows non-experts to start writing correct and performant parallel software quickly, without first mastering the underlying computer-science theory.

What are parallel patterns?

The same concepts and problem types appear in a wide variety of places in many different codes. By recognizing them, we can avoid ‘reinventing the wheel’.

Parallel patterns are particularly useful to non-experts in parallel programming, helping them make use of common parallel programming paradigms such as OpenMP, OpenACC and MPI.

The fundamental concepts of how we split up problems and parallelize them recur even as machines, languages and applications evolve. By applying these concepts, anyone can develop optimized, parallel code quickly and efficiently, irrespective of their expertise in the code being parallelized or in the HPC methods being adopted.

How does Parallelware use parallel patterns?

Parallelware does the hard work of finding the parallel patterns for you. The core Parallelware Technology can recognize any of a large set of patterns, and once a pattern is identified, Parallelware can provide quick and efficient solutions for parallelizing it.

How is Parallelware’s approach to parallel patterns unique?

The problem with classical data dependence analysis

Traditional approaches to dependence analysis are limited to identifying one pair of accesses to the same memory location at a time. Dependence analysis identifies constraints between statements and instructions, such as a mandatory ordering of statement execution, and these constraints are then used to determine when it is safe to reorder and parallelize statements. For example:

The first loop’s iterations do not depend on each other and can therefore be parallelized simply by executing the iterations concurrently. The second loop’s iterations, however, have data dependencies that cross loop iterations. To distinguish these cases, a compiler tests whether two array references can refer to the same memory location in different iterations.

In reality, real-world applications are far more complex than the examples above, and this is where classical dependence analysis hits its limitations. Ideally, the compiler would know definitively whether the iterations are dependent, since an incorrect assumption could lead to incorrect results. However, as the code becomes more complex, the computational cost of exact dependence analysis becomes prohibitive and eventually unachievable. Many algorithms have been proposed to improve this analysis, but each trades off accuracy against efficiency.

The ‘traditional’ approach to developing parallelized software beyond what a compiler can achieve, which is inevitably limited by the cost of analyzing data dependencies, relies on human brains to solve NP-complete tasks such as the complex dependence analysis that compilers are unable to perform for us. This implicit dependence analysis carried out by experts when they parallelize software is one of the reasons that developing HPC-enabled software is complex and time-consuming. Parallelware helps by identifying opportunities and solutions for parallelism that are not available under the classical dependence scheme.

Parallelware’s hierarchical classification system

Parallelware’s hierarchical approach to data dependence analysis offers a new alternative that eases the burden on developers of performing this analysis, identifying the correct parallelization method and implementing it correctly.

The Parallelware Technology is built on ground-breaking research carried out in the late 2000s by Manuel Arenaz, founder and CEO of Appentra. His research developed a new framework that, unlike classical dependence analysis methods focused on specific, isolated computational kernels, uses a hierarchical approach to recognize the patterns that appear frequently in real, full-scale applications.

Parallelware operates on an LLVM intermediate representation (IR) of the code, decomposing the IR and building a dependency graph that captures the relationships between all connected components in the LLVM IR.

The Parallelware Technology uses this hierarchical methodology to identify opportunities for parallelization that were not previously possible with classical dependence analysis. Using this approach, Parallelware helps users by:

Understanding how data is accessed and processed in code.

Identifying opportunities for parallelization.

Identifying how each opportunity can be parallelized, including a projection of the likely performance of each possible approach.