The authors have been witnessing a continuous growth of both heterogeneous computational platforms (e.g., Cell blades, or the joint use of traditional CPUs and GPUs) and multicore processor architecture; and it is still an open question how applications can fully exploit such computational potential efficiently. In this paper, they introduce a run-time environment and programming framework which supports the implementation of scalable and efficient parallel applications in such heterogeneous, distributed environments. They assess these issues through well-known kernels and actual applications that behave regularly and irregularly, which are not only relevant but also demanding in terms of computation and I/O.