Stephen Booth's blog

Talking about your code tests is always a bit embarrassing. I think most developers know that they don't have enough tests, or that the tests they do have are not good enough.

There is never enough time to write or run tests that cover every possibility, so testing, like all types of programming, becomes a compromise in which you try to make the best use of the limited resources available.

Recently I seem to have had many conversations about programming languages for HPC. In some ways this is not a new subject: I have been having similar conversations for the last 20 years. However, as HPC hardware evolves, machines become more complex, and so do the issues that programmers need to address. So it is not surprising that we are wondering if there is more the compiler could be doing to help us.

Introduction

While a surprisingly high proportion of HPC users are happy to keep their data on a single HPC service, or at most to move it within the hosting institution, sometimes it becomes necessary to move large volumes of data between different sites and institutions. As anyone who has ever tried to support users in this endeavour knows, it can be much harder to get good performance than it should be. This post is an attempt to document the available tools and technologies, as well as common problems and bottlenecks.

In my previous blog post I said that I was working on a library to move data between different data decompositions.

In many cases it is easier for a programmer to work with a global coordinate system that reflects the overall data in the program. This is the approach taken by many PGAS languages and some parallel libraries such as BLACS.

The programmer still wants to be in control of the data decomposition, but ideally this should be a separate concern that can be changed without forcing a complete rewrite of the rest of the program.
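As a rough illustration of that separation, here is a minimal Python sketch. It is not taken from any particular PGAS language or library: the class names `BlockDecomposition` and `CyclicDecomposition`, the `owner` method, and the helper `local_indices` are all hypothetical. The idea is that code written against global indices only asks the decomposition "who owns index i, and where?", so the layout can be swapped without touching that code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockDecomposition:
    """Hypothetical 1-D block layout: contiguous chunks, one per rank."""
    n_global: int
    n_ranks: int

    def owner(self, i):
        """Return (rank, local_index) for global index i."""
        if not 0 <= i < self.n_global:
            raise IndexError(i)
        base, extra = divmod(self.n_global, self.n_ranks)
        # The first `extra` ranks hold base + 1 elements, the rest hold base.
        boundary = extra * (base + 1)
        if i < boundary:
            return i // (base + 1), i % (base + 1)
        j = i - boundary
        return extra + j // base, j % base


@dataclass(frozen=True)
class CyclicDecomposition:
    """Hypothetical 1-D cyclic layout: elements dealt out round-robin."""
    n_global: int
    n_ranks: int

    def owner(self, i):
        """Return (rank, local_index) for global index i."""
        if not 0 <= i < self.n_global:
            raise IndexError(i)
        return i % self.n_ranks, i // self.n_ranks


def local_indices(decomp, rank):
    """Global indices owned by `rank`; works with any object providing owner()."""
    return [i for i in range(decomp.n_global) if decomp.owner(i)[0] == rank]
```

With ten elements on three ranks, `local_indices(BlockDecomposition(10, 3), 1)` gives `[4, 5, 6]`, while `local_indices(CyclicDecomposition(10, 3), 1)` gives `[1, 4, 7]`. The calling code is identical in both cases; only the decomposition object changes.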

I'm currently working on a small library to support decomposition changes in parallel programs.

It turns out that a fairly simple interface can be used to describe a very large space of possible data decompositions. I'm therefore writing a library that can redistribute data between any two such decompositions.
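The library's actual interface is not shown here, so the following is only a sketch under an assumed one: a decomposition is modelled as a function from a global index to a `(rank, local_index)` pair. The names `block`, `cyclic`, `plan_redistribution` and `redistribute` are hypothetical, and the "ranks" are simulated as plain lists in a single process rather than real parallel processes. Given any two such functions, a redistribution plan can be built by asking both where each global element lives.

```python
from collections import defaultdict


def block(n, p):
    """Assumed block layout: global index -> (rank, local index)."""
    size = -(-n // p)  # ceiling division: elements per rank
    return lambda i: (i // size, i % size)


def cyclic(p):
    """Assumed cyclic layout: global index -> (rank, local index)."""
    return lambda i: (i % p, i // p)


def plan_redistribution(n, src, dst):
    """Map (src_rank, dst_rank) -> list of (src_local, dst_local) copies."""
    plan = defaultdict(list)
    for i in range(n):
        (src_rank, src_local), (dst_rank, dst_local) = src(i), dst(i)
        plan[(src_rank, dst_rank)].append((src_local, dst_local))
    return dict(plan)


def redistribute(n, src, dst, src_data, n_dst_ranks):
    """Apply the plan in one process; src_data[rank][local] holds the values."""
    dst_data = [dict() for _ in range(n_dst_ranks)]
    for (src_rank, dst_rank), pairs in plan_redistribution(n, src, dst).items():
        # In a real parallel program each (src_rank, dst_rank) entry
        # would correspond to one message between two processes.
        for src_local, dst_local in pairs:
            dst_data[dst_rank][dst_local] = src_data[src_rank][src_local]
    # Turn each rank's {local_index: value} mapping into an ordered list.
    return [[d[k] for k in sorted(d)] for d in dst_data]


# Eight elements, two ranks: block layout to cyclic layout.
blocks = [[0, 1, 2, 3], [4, 5, 6, 7]]
cyclic_data = redistribute(8, block(8, 2), cyclic(2), blocks, 2)
```

Here `cyclic_data` ends up as `[[0, 2, 4, 6], [1, 3, 5, 7]]`. Enumerating every global index is fine for a sketch but would not scale; a practical implementation would presumably work with index ranges or strides instead.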