Welcome to Pochoir

Pochoir - Parallel Stencil Computation Compiler

Pochoir (pronounced "PO-shwar") is a compiler and run-time system for implementing stencil computations on multicore processors. A stencil defines the value of a grid point in a d-dimensional spatial grid at time t as a function of neighboring grid points at recent times before t. A stencil computation computes the stencil for each grid point over many time steps.
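To make the definition concrete, here is a minimal sketch of a stencil computation written as plain serial loops in C++. It does not use the Pochoir language or library. The function name `heat_1d_loops`, the 1D heat-equation kernel, the coefficient 0.125, and the fixed (non-periodic) end points are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper for illustration only (not part of Pochoir).
// Advances a 1D heat stencil by `steps` time steps. The value of point x at
// time t + 1 depends on points x - 1, x, and x + 1 at time t. The two end
// points are never updated, which models a fixed, non-periodic boundary.
std::vector<double> heat_1d_loops(std::vector<double> u, int steps) {
    assert(steps >= 0);
    std::vector<double> next = u;  // copy so both buffers hold the boundary values
    for (int t = 0; t < steps; ++t) {
        for (std::size_t x = 1; x + 1 < u.size(); ++x) {
            next[x] = 0.125 * (u[x - 1] - 2.0 * u[x] + u[x + 1]) + u[x];
        }
        std::swap(u, next);  // the newly computed time level becomes the current one
    }
    return u;
}
```

Calling `heat_1d_loops({0, 0, 8, 0, 0}, 1)` applies the stencil once to the three interior points. Each time step sweeps the whole grid, so once the grid no longer fits in cache every step reloads it from memory.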

In Pochoir, users typically need only specify their stencil computing kernel and boundary conditions in a domain-specific language embedded in C++. Depending on whether the goal is to check functional correctness or to measure performance, users can compile and run their code with either a native C++ compiler or the Pochoir compiler. When the Pochoir compiler is used, its basic parallelization and optimization strategy is divide-and-conquer (a cache-oblivious algorithm). For higher-dimensional space-time grids, Pochoir employs a novel cutting strategy of simultaneous space cuts.
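The divide-and-conquer idea can be sketched with the classic serial trapezoidal recursion for a 1D stencil, in the style of Frigo and Strumpen's cache-oblivious stencil algorithm. This is a simplified stand-in, not Pochoir's generated code: it is serial, it handles one spatial dimension only, and it does not show the simultaneous space cut. The type `Heat1D`, its members, the heat kernel, and the fixed end points are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type for illustration only (not part of Pochoir).
struct Heat1D {
    int n;                       // number of grid points
    std::vector<double> buf[2];  // two time levels, indexed by t % 2

    explicit Heat1D(const std::vector<double>& init)
        : n(static_cast<int>(init.size())) {
        buf[0] = init;
        buf[1] = init;  // fixed boundary values must be present in both buffers
    }

    // Compute point x at time t + 1 from its neighbors at time t.
    void kernel(int t, int x) {
        const std::vector<double>& in = buf[t % 2];
        buf[(t + 1) % 2][x] = 0.125 * (in[x - 1] - 2.0 * in[x] + in[x + 1]) + in[x];
    }

    // Process the space-time trapezoid with time range [t0, t1), left edge
    // starting at x0 with slope dx0, and right edge starting at x1 with slope dx1.
    void walk(int t0, int t1, int x0, int dx0, int x1, int dx1) {
        const int dt = t1 - t0;
        if (dt == 1) {
            for (int x = x0; x < x1; ++x) kernel(t0, x);
        } else if (dt > 1) {
            if (2 * (x1 - x0) + (dx1 - dx0) * dt >= 4 * dt) {
                // Space cut: the trapezoid is wide, so split it along a line of
                // slope -1. The left piece depends on nothing in the right piece.
                const int xm = (2 * (x0 + x1) + (2 + dx0 + dx1) * dt) / 4;
                walk(t0, t1, x0, dx0, xm, -1);
                walk(t0, t1, xm, -1, x1, dx1);
            } else {
                // Time cut: the trapezoid is tall, so do the lower half first.
                const int s = dt / 2;
                walk(t0, t0 + s, x0, dx0, x1, dx1);
                walk(t0 + s, t1, x0 + dx0 * s, dx0, x1 + dx1 * s, dx1);
            }
        }
    }

    // Advance the interior points by `steps` time steps; returns the final level.
    const std::vector<double>& run(int steps) {
        assert(steps >= 0 && n >= 2);
        walk(0, steps, 1, 0, n - 1, 0);
        return buf[steps % 2];
    }
};
```

The recursion performs the same point updates as a loop nest over time and space, but in an order that keeps working on a small region for several time steps before moving on. No cache size appears anywhere in the code, which is what "cache-oblivious" refers to.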

Pochoir is an open-source software project hosted by the SuperTech group at CSAIL, MIT. You are invited to contribute in many forms (documentation, translation, writing code, fixing bugs, porting to other platforms, and so on).

The Pochoir package contains three main components: an embedded domain-specific language (EDSL) in native C++ for specifying stencils, a C++ template library for baseline runs, and a domain-specific compiler written in Haskell for optimized runs.

Currently, the Pochoir package has been tested only on Linux systems.

Performance Peek

A comparison of the number of grid points processed per second (semilogarithmic scale) for Pochoir-generated code on 12 cores versus serial-loop and parallel-loop implementations. In all figures, the top curve is for the Pochoir-generated code, the middle curve is for parallel loops, and the bottom curve is for serial loops. (a) A 3D wave equation with a non-periodic boundary condition executing for 1000 time steps. (b) A 2D heat equation on a torus executing for 3200 time steps. (c) 1D pairwise sequence alignment (no time step). (d) A Lattice Boltzmann method with a non-periodic boundary condition executing for 3000 time steps.