The performance benefits of GPU parallelism can be enormous, but unlocking this
performance potential is challenging. The applicability and performance of GPU
parallelizations are limited by the complexities of CPU-GPU communication. To
address these communication problems, this paper presents the first fully
automatic system for managing and optimizing CPU-GPU communication. This system,
called the CPU-GPU Communication Manager (CGCM), consists of a run-time library
and a set of compiler transformations that work together to manage and optimize
CPU-GPU communication without depending on the strength of static compile-time
analyses or on programmer-supplied annotations. CGCM eases manual GPU
parallelizations and improves the applicability and performance of automatic GPU
parallelizations. For 24 programs, CGCM-enabled automatic GPU parallelization
yields a whole-program geomean speedup of 5.36x over the best sequential
CPU-only execution.