Coupling Memory and Computation for Locality Management

We articulate the need for managing (data) locality automatically rather than leaving it to the
programmer, especially in parallel programming systems. To this end, we propose techniques
for tightly coupling the computation (including the thread scheduler) and the memory manager
so that data and computation can be positioned closely in hardware. Such tight coupling of
computation and memory management is in sharp contrast with the prevailing practice of considering
each in isolation. For example, memory-management techniques usually abstract the
computation as an unknown "mutator", which is treated as a "black box".
As an example of the approach, in this paper we consider a specific class of parallel computations,
nested-parallel computations. Such computations dynamically create a nesting of parallel
tasks. We propose a method for organizing memory as a tree of heaps reflecting the structure of
the nesting. More specifically, our approach creates a heap for a task if the task is separately scheduled
on a processor. This allows us to couple garbage collection with the structure of the computation
and the way in which it is dynamically scheduled on the processors. This coupling enables taking
advantage of locality in the program by mapping it to the locality of the hardware. For example,
to improve locality, a heap can be garbage collected immediately after its task finishes, when
the heap contents are likely still in cache.
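
The tree-of-heaps organization described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the implementation proposed in this paper: the names Heap, heap_for_task, and finish_task are hypothetical, liveness is passed in explicitly rather than traced, and promoting survivors to the parent heap when a task finishes is an assumption made only for illustration.

```python
class Heap:
    """One node in a tree of heaps (hypothetical structure for illustration)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.objects = []  # objects allocated by the task that owns this heap
        if parent is not None:
            parent.children.append(self)

    def depth(self):
        """Nesting depth of this heap; the root heap has depth 0."""
        return 0 if self.parent is None else self.parent.depth() + 1

    def allocate(self, obj):
        """Record an allocation in this heap and return the object."""
        self.objects.append(obj)
        return obj

    def collect(self, live):
        """Toy collection: keep only the objects listed in `live`.

        A real collector would trace from roots; here the caller supplies
        the survivors directly. Returns the number of reclaimed objects.
        """
        live_ids = {id(o) for o in live}
        before = len(self.objects)
        self.objects = [o for o in self.objects if id(o) in live_ids]
        return before - len(self.objects)


def heap_for_task(parent_heap, scheduled_separately):
    """Create a child heap only if the task is scheduled separately.

    A task that runs in place on its parent's processor keeps allocating
    in the parent's heap.
    """
    if scheduled_separately:
        return Heap(parent=parent_heap)
    return parent_heap


def finish_task(task_heap, live):
    """Collect a task's heap as soon as the task finishes.

    Assumption for illustration: survivors are promoted to the parent
    heap and the finished heap is detached from the tree.
    """
    reclaimed = task_heap.collect(live)
    parent = task_heap.parent
    if parent is not None:
        parent.objects.extend(task_heap.objects)
        parent.children.remove(task_heap)
        task_heap.objects = []
        task_heap.parent = None
    return reclaimed
```

In this sketch the scheduler's decision (whether a task is scheduled separately) directly determines the shape of the heap tree, and collection is triggered by a scheduling event (task completion) rather than by an independent memory-pressure policy.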