Service

PP-Code

Scientific area

Short description

The PP-Code is a modern, high-order, charge-conserving particle-in-cell (PIC) code for simulating relativistic and non-relativistic plasmas using either a full particle model or a hybrid ion-particle / electron-fluid model. The code uses a high-order implicit field solver and a novel high-order charge-conserving scheme for particle-to-cell interpolation and charge deposition. It includes powerful diagnostic tools with on-the-fly particle tracking, synthetic spectra integration, 2D volume slicing, and a new method to correctly account for radiative cooling in the simulations. A robust technique for imposing (time-dependent) particle and field fluxes on the boundaries is also part of the code and has been used to couple MHD models of solar plasmas to the PIC models. The code keeps the particles sorted sequentially in memory within each cell for maximal cache reuse and best performance. Similarly, the memory locality of the particles is exploited to cache the field values around each cell in memory, which speeds up the interpolation of electromagnetic fields to particle positions and the deposition of averaged charges and currents to the grid. Using a hybrid OpenMP and MPI approach, the code scales from 2 nodes to the full JUQUEEN system with a weak scaling efficiency of 96%, and from 2048 nodes to the full system (a factor of 14 in size) with a strong scaling efficiency of 97%.
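The idea of keeping particles sorted by cell can be illustrated with a small sketch. This is not the PP-Code implementation, which is not shown here; it is a generic counting sort under the assumption that every particle carries an integer cell index. The function name `sort_particles_by_cell` and the sample data are hypothetical.

```python
def sort_particles_by_cell(cell_ids, n_cells):
    """Return (order, offsets) so that particles are grouped by cell.

    order[k] is the index of the particle that goes to slot k.
    offsets[c]:offsets[c + 1] is the slot range owned by cell c.
    Counting sort: O(N + n_cells) work, and stable within each cell.
    """
    # Count how many particles live in each cell.
    counts = [0] * n_cells
    for c in cell_ids:
        counts[c] += 1

    # Exclusive prefix sum gives the first slot of each cell.
    offsets = [0] * (n_cells + 1)
    for c in range(n_cells):
        offsets[c + 1] = offsets[c] + counts[c]

    # Scatter particle indices into the contiguous range of their cell.
    cursor = offsets[:-1].copy()
    order = [0] * len(cell_ids)
    for p, c in enumerate(cell_ids):
        order[cursor[c]] = p
        cursor[c] += 1
    return order, offsets


# Hypothetical example: 7 particles spread over 3 cells.
cell_ids = [2, 0, 1, 2, 0, 1, 1]
order, offsets = sort_particles_by_cell(cell_ids, n_cells=3)
sorted_cells = [cell_ids[p] for p in order]
```

After such a sort, all particles of one cell occupy a contiguous slot range, so the field values around that cell only need to be loaded once while looping over its particles.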

Volume rendering of the late-time stages of a 3-dimensional relativistic collisionless ion-electron shock. Shown from top to bottom are the ion density, the magnetic field density, the electric field density, and the electric field projected along the magnetic field (parallel electric field). The fields are normalised to the kinetic energy density of the upstream bulk flow. The filamentation upstream (left) of the shock and the magnetic turbulence downstream (right) can clearly be recognised.

Scalability

458,752 cores (1,835,008 parallel threads) on BlueGene/Q (JUQUEEN)

262,144 cores on BlueGene/P (JUGENE)

Tens of thousands of cores on x86 (Pleiades, Bluewaters)

Hundreds of GPUs (local GPU cluster, Bluewaters)

Weak and strong scaling timings per thread and particle. The benchmark consisted of a relativistic two-stream setup with up to 300 billion particles.
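For readers unfamiliar with the two efficiency figures, the following sketch shows the usual definitions. It assumes the standard conventions: weak scaling keeps the work per node fixed, strong scaling keeps the total problem size fixed. The timings used are made-up placeholders, not measurements from the PP-Code benchmark; only the node count of 2048 and the factor of 14 come from the description above.

```python
def weak_scaling_efficiency(t_base, t_scaled):
    """Work per node is fixed, so the ideal run time stays constant."""
    return t_base / t_scaled


def strong_scaling_efficiency(t_base, n_base, t_scaled, n_scaled):
    """Total work is fixed, so the ideal run time drops as n_base / n_scaled."""
    return (t_base * n_base) / (t_scaled * n_scaled)


# Placeholder timings in seconds per step (hypothetical, for illustration only).
n_base = 2048
n_full = n_base * 14          # "a factor of 14 in size"

e_strong = strong_scaling_efficiency(t_base=140.0, n_base=n_base,
                                     t_scaled=10.3, n_scaled=n_full)
e_weak = weak_scaling_efficiency(t_base=9.6, t_scaled=10.0)

print(f"strong scaling efficiency: {e_strong:.1%}")
print(f"weak scaling efficiency:   {e_weak:.1%}")
```

An efficiency of 100% would mean perfect scaling; values slightly below that indicate the overhead of communication and load imbalance at the larger node count.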