
Enabling Polyhedral Optimizations in Julia - Matthias Reisinger
Julia is a relatively young programming language focused on technical computing. Although dynamic, it is designed to achieve high performance comparable to that of statically compiled languages. The execution of Julia programs is driven by a just-in-time compiler that relies on LLVM to produce efficient code at run time. This poster highlights the recent integration of Polly into this environment, which enables the use of polyhedral optimizations in Julia programs.

Towards a generic accelerator offloading approach: implementing OpenMP 4.5 offloading constructs in Clang and LLVM - Gheorghe-Teodor Bercea
The OpenMP 4.5 programming model enables users to run on multiple types of accelerators from a single application source code. Our goal is to integrate a high-performance implementation of OpenMP's programming model for accelerators into the Clang/LLVM project. This poster is a snapshot of our ongoing efforts towards fully supporting code generation for OpenMP device offloading constructs. We have submitted several Clang patches that address some of the major issues that, in our view, prevent the adoption of a generic accelerator offloading strategy. At the compiler level, we introduce a new OpenMP-enabled driver implementation that generalizes the current Clang-CUDA approach. The new driver can handle the compilation of several host and device architecture types and can be extended to other offloading programming models such as OpenACC. We developed libomptarget, a runtime library that supports the execution of OpenMP 4.5 constructs on NVIDIA architectures and is extensible to other ELF-enabled devices. In this poster we describe two features of libomptarget: the mapping of data to devices and the compilation of code sections for different architectures into a single binary. The aforementioned changes have been integrated locally with the Clang/LLVM repositories, resulting in a fully functional OpenMP 4.5 compliant prototype. We demonstrate the robustness of our extensions and show preliminary performance results on the LULESH proxy application.

Polly as an analysis pass in LLVM - Utpal Bora
In this talk, we will introduce a new interface that exposes Polly's polyhedral dependence analysis to LLVM transformation passes such as the Loop Vectorizer. As part of GSoC 2016, we implemented an interface to Polly and provided new APIs that can be used as an analysis pass within LLVM's transformation passes. We will describe our implementation and demonstrate some loop transformations using the new interface (PolyhedralInfo). Details on the GSoC project: http://utpalbora.com/gsoc/2016.html

Reducing the Computational Complexity of RegionInfo - Nandini Singhal
The LLVM RegionInfo pass provides a convenient abstraction to reason about independent single-entry-single-exit regions of the control flow graph. RegionInfo has proven useful in the context of Polly and the AMDGPU backend, but the quadratic complexity of RegionInfo construction, due to its use of DominanceFrontier, makes control flow regions costly and consequently discourages the use of this convenient abstraction. In this work, we present a new approach to RegionInfo construction that replaces the use of DominanceFrontier with a clever combination of LoopInfo, DominanceInfo, and PostDominanceInfo. As these three analyses are (or will soon be) preserved by LLVM and consequently come at essentially zero cost, while the quadratic cost of DominanceFrontier construction is avoided, the overall cost of using RegionInfo is greatly reduced, which makes it practical in many more cases. Several other problems in the context of RegionInfo still need to be addressed: how to remove the RegionPass framework, which makes little sense in the new pass manager; how to better connect Regions and Loops; and how to move from a build-everything-upfront analysis to an on-demand analysis that builds only the needed regions, step by step. We hope to discuss some of these topics with the relevant people of the LLVM community as part of the poster session.

Binary Decompilation to LLVM IR - Sandeep Dasgupta
This work develops a binary-to-LLVM-IR translator that generates higher-quality IR than existing tools. Such IR includes variable information, type information, and individual stack frames per procedure, which in turn facilitate many sophisticated analyses and optimizations. We build on the open-source tool McSema, and our goal is to extend it to 1) extract variable and type information, 2) improve the quality of the recovered IR by mitigating some of its limitations, and 3) reconstruct the stack for each procedure. Currently, we have extended the McSema-recovered IR to reconstruct the stack for each procedure, which in turn will help with variable recovery and promotion.

Dynamic Autovectorization - Joshua Cranmer
We present our ongoing work on augmenting LLVM with a dynamic autovectorizer. This tool uses dynamic information to circumvent the shortfalls of imprecise static analysis when performing loop vectorization, as well as leveraging dynamic transformations of code and memory to make autovectorization and other optimization passes more effective. The key transformations we illustrate in this poster are the extraction of hot paths in innermost loops (with a current speedup of 5% on SPEC over vanilla LLVM) and the conversion of memory from an array-of-structs to a struct-of-arrays representation.

RV: A Unified Region Vectorizer for LLVM - Simon Moll
The Region Vectorizer (RV) is a general-purpose vectorization framework for LLVM. RV provides a unified interface to vectorize code regions, such as inner and outer loops, up to whole functions. Being a vectorization framework, RV is not another vectorization pass; rather, it enables users to vectorize IR directly from their own code. Currently, vectorization in LLVM is performed by stand-alone optimization passes. Users who want to vectorize IR have to roll their own vectorization code or hope that the existing vectorization passes behave as the user intends. Polly, for example, features a simple built-in vectorizer but also communicates with LLVM's Loop Vectorizer through metadata. All these vectorizers pursue the same goal: vectorizing some code region. However, their quality varies wildly and their code bases are redundant. In contrast, with RV users vectorize IR directly from their own code through a simple unified API. The current prototype is a complete re-design of the earlier Whole-Function Vectorizer by Ralf Karrenberg. Unlike the Whole-Function Vectorizer or any vectorizer in LLVM, RV operates on regions, which are a more general concept. In terms of RV, a valid region is any connected subgraph of the CFG, including loop nests. Regions make RV applicable to both inner and outer loop vectorization. At the same time, RV retains the capability of its predecessor to vectorize functions into SIMD signatures; Whole-Function Vectorization is now only one of many possible use cases for RV. The current prototype of RV implements all stages of a full vectorization pipeline. However, users can compose these stages as they see fit, inserting and extracting IR and analysis information at any point.

Robustness Enhancement of Baggy Bounds Accurate Checking in SAFECode - Zhengyang Liu
Baggy Bounds Accurate Checking (BBAC) is a compile-time transformation and run-time hardening solution that detects out-of-bounds pointer arithmetic errors. The original BBAC implementation in SAFECode is not robust or efficient enough for real-world use. Our work improves the robustness and performance of SAFECode's BBAC implementation by fixing bugs in the compile-time transformation passes and run-time checking functions, as well as by inlining several run-time checking functions. The latest implementation of BBAC achieves reasonable robustness and performance on various real-world applications.

Extending Clang C++ Modules System Stability - Bianca-Cristina Cristescu
The current state of the modules system, although fairly stable, still has bugs in its C++ support. Currently, the method for preventing regressions is a buildbot for libc++, which builds LLVM in modules self-hosted mode. Its main purpose is to find bugs in Clang's implementation and guard the ongoing development against regressions. We propose a workflow for finding bugs, submitting them together with a minimal reproducer to the Clang community, and subsequently proposing a fix that the community can review. Besides the usual complexity of minimising an issue, the poster will compare the labour required with and without the proposed methodology.