ABSTRACT: One source of productivity problems in parallel programming is arguably the need to aggregate remote data operations in order to achieve (or preserve) performance. In this work, we compare simple variations of the RandomAccess benchmark from the HPC Challenge Benchmarks across a variety of parallel programming models, both to evaluate the state of the implementations of those models and to discuss how much of the difference between the models is due to differences in the expressive abilities of the programming models themselves and how much reflects the state of their implementations. We develop a performance model to evaluate the costs of local and remote updates across different programming models and different implementations of those models. Based on our model and its validation, we observe that synchronization overhead is the main factor limiting the scalability of RandomAccess in all of our implemented versions.
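For context, the sequential core of RandomAccess can be sketched roughly as follows. This is an illustrative sketch, not the official benchmark code: the table size, update count, and starting seed here are placeholders (the official specification defines per-stream starting points and required problem sizes), but the update stream polynomial and the XOR read-modify-write are the standard ones.

```c
#include <stdint.h>
#include <stdlib.h>

/* Polynomial defining the benchmark's 64-bit pseudo-random stream. */
#define POLY 0x0000000000000007ULL

/* Next value in the LFSR-style random stream: shift left, XOR in POLY
   when the high bit was set. */
static uint64_t next_random(uint64_t ran)
{
    return (ran << 1) ^ ((int64_t)ran < 0 ? POLY : 0ULL);
}

/* Run the sequential update loop over a table of 2^log2_size words and
   return an XOR checksum of the table. Seed of 1 is illustrative only. */
static uint64_t run_random_access(uint64_t log2_size, uint64_t num_updates)
{
    const uint64_t table_size = 1ULL << log2_size;
    uint64_t *table = malloc(table_size * sizeof *table);
    if (!table)
        return 0;
    for (uint64_t i = 0; i < table_size; i++)
        table[i] = i;

    uint64_t ran = 1;
    for (uint64_t i = 0; i < num_updates; i++) {
        ran = next_random(ran);
        /* Table size is a power of two, so masking yields the index.
           This read-modify-write is the "update" whose remote cost and
           synchronization overhead the paper's performance model captures. */
        table[ran & (table_size - 1)] ^= ran;
    }

    uint64_t sum = 0;
    for (uint64_t i = 0; i < table_size; i++)
        sum ^= table[i];
    free(table);
    return sum;
}
```

In a distributed setting, each update targets an arbitrary process's portion of the table, which is why aggregating these tiny remote operations (and synchronizing their completion) dominates performance.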