Computing at arbitrary scale

I’ve given a couple of presentations lately in which I needed to communicate the challenges of multicore development and the notion of parallelism everywhere, along with the software, reliability, and OS challenges of very large scale parallel computing. Since many of you are probably also giving presentations like this, I thought I’d share some recent experience with a new phrase I’m trying.

After trying a few variants, my audiences seem to be responding to the idea of computing at arbitrary scale. (As Shakespeare said — or rather, as the Book of Ecclesiastes says — there is very little new under the sun; if you know where this term originated, please leave a comment.)

Note that this is arbitrary with respect to what the computational support infrastructure and software development framework are assumed to know a priori about the runtime environment. Certain programming paradigms and most operating system implementations, for example, implicitly assume that they will be used at relatively low processor counts. As another example, MPI programs frequently conflate algorithm and implementation, and the implicit assumptions about scale embedded in them (choice of algorithm, implementation of collective operations, etc.) limit the performance of applications run well outside the design space.
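To make the point concrete, here is a toy cost model of my own devising (not real MPI code): two ways to implement a reduction embed very different assumptions about scale. A linear reduce, in which every rank sends to rank 0, costs P-1 sequential message steps; a binomial tree costs about log2(P). At small process counts the difference is invisible, which is exactly how such assumptions get baked in.

```python
import math

# Hypothetical cost model for two reduction strategies, counting
# sequential message steps. The functions are illustrative, not an
# actual MPI implementation.

def linear_reduce_steps(p: int) -> int:
    # Every rank sends its value to rank 0, one after another.
    return p - 1

def tree_reduce_steps(p: int) -> int:
    # Binomial-tree combining: halve the active ranks each round.
    return math.ceil(math.log2(p)) if p > 1 else 0

for p in (4, 64, 65536):
    print(p, linear_reduce_steps(p), tree_reduce_steps(p))
```

At P=4 the two differ by one step; at P=65536 it is 65535 steps versus 16, which is the kind of gap that only shows up once an application is run well outside its design space.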

So, we need operating systems and software development frameworks that support computing at arbitrary scale. For arbitrarily large process-count executions, the operating system and the application need to be resilient in the face of the hardware failures that are nearly guaranteed to happen, and the algorithms chosen for the work to be done need to make appropriate use of the resources (FLOPS, memory, bandwidth, etc.) available. For arbitrarily small process-count executions, the same software needs to adapt to the resources available and recover the overhead (potentially) incurred in maintaining resilience.

Craig, I’m disappointed in myself. I, of course, was not thinking of Shakespeare’s own words, but a PARAPHRASE of part of Sonnet 59. So not only did I get the author wrong, I didn’t even have the reference right. Oy.

I agree with this in principle, but let me call this the ‘grand unified theory of HPC’. And just like its analogue in physics, I’d bet we’re still a ways off from actually having it. Certainly, it should be pursued, but don’t expect this level of sophistication soon.

I’m a big proponent of ‘smart’ applications… that is, choose the right algorithm, parallel decomposition, etc., based on what it is you’re trying to do. Got an MD-type simulation? OK, if you’re running with a small number of particles, you may be fine with a P-P method; going higher, you may want to use a P-M method; higher still, some FMM method. A well-written program can choose this at runtime, and if it isn’t clear, why not simply run a few iterations with each algorithm and then select the fastest? (Like ATLAS, but done at runtime.)
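The run-a-few-iterations-and-pick idea above can be sketched in a few lines. This is a hypothetical harness, not anyone's production code; the candidate kernels (`fast_step`, `slow_step`) are stand-ins for real force-evaluation methods like P-P or P-M.

```python
import time

def pick_fastest(candidates, state, trial_iters=3):
    """Time a few iterations of each candidate kernel on the real
    problem state and return the name of the fastest one.
    `candidates` maps names to step functions. Illustrative only."""
    best_name, best_time = None, float("inf")
    for name, step in candidates.items():
        t0 = time.perf_counter()
        for _ in range(trial_iters):
            step(state)
        elapsed = time.perf_counter() - t0
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name

# Dummy stand-ins: one kernel is artificially slow.
def fast_step(state):
    state["steps"] = state.get("steps", 0) + 1

def slow_step(state):
    time.sleep(0.005)  # simulate a more expensive method
    state["steps"] = state.get("steps", 0) + 1

winner = pick_fastest({"fast": fast_step, "slow": slow_step}, {})
print(winner)
```

In a real code you would of course make sure the trial iterations do useful work (or roll them back), and re-run the selection when the problem size or process count changes.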

And it isn’t just the algorithm… with communication being so important, and the characteristics of communication performance being dependent upon system architecture and software, it’s (in my opinion) worth it to have applications determine the best parallel decomposition at runtime, if possible. E.g., if I run a 3D mesh on 64 processors, do I want that decomposed as 4x4x4? What about 8x8x1? The former has a lower surface-area-to-volume ratio of cells, whereas the latter has fewer communication boundaries. Which is better? Beats the heck out of me, but a good application can test this with a few simple iterations and then select the best one.
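Even before timing real iterations, the candidate decompositions can be ranked by a simple proxy such as the number of halo cells each rank must exchange. The sketch below is an assumption-laden toy (mesh size 256^3, 64 ranks, mesh dimensions divisible by the factorization, face exchanges only); a real code would time actual halo exchanges as the comment suggests.

```python
from itertools import product

def decompositions(p):
    """Yield all (px, py, pz) with px*py*pz == p."""
    for px, py, pz in product(range(1, p + 1), repeat=3):
        if px * py * pz == p:
            yield px, py, pz

def halo_cells(n, grid):
    """Per-rank face cells exchanged for an n^3 mesh on a px*py*pz
    process grid. Assumes n divisible by each factor; a face only
    counts if there is a neighbor in that direction."""
    px, py, pz = grid
    lx, ly, lz = n // px, n // py, n // pz
    return (2 * ly * lz * (px > 1) +
            2 * lx * lz * (py > 1) +
            2 * lx * ly * (pz > 1))

best = min(decompositions(64), key=lambda g: halo_cells(256, g))
print(best, halo_cells(256, best))
```

For these numbers the model picks 4x4x4 (24576 halo cells per rank versus 32768 for 8x8x1), but the whole point of the comment above is that the model can be wrong on a given machine, which is why a few trial iterations per candidate are the safer arbiter.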

All in all, software needs to be more flexible, but that takes time and effort, and to my dismay, I rarely find myself with free time with which to go nuts on good design. Until that changes, I don’t see the software getting too much smarter.

By the way, I sent this to a friend yesterday and figure people here may like it as well. The following is a paper titled “Predictions for Scientific Computing 50 Years from Now” by an Oxford numerical analyst (Trefethen).

I think there are some fantastic points, and one that I wholly subscribe to is raised in prediction #4: basically, determinism in numerical computing will go away. This is hard for many people to deal with and gets in the way of smart applications. Let’s say I solve some system of equations to a tolerance of 10^-8, right? Now let’s say I run my smart application again, and this time it selects a different parallel decomposition, some sums are done in a different order, etc. It finishes, but the number I get for this exact same input configuration differs from the previous result by 10^-13. Should I care? No, I don’t think so; it’s well below what I’m solving for. But lots of people care deeply. We need to change that.
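The order-of-summation effect is easy to demonstrate on one machine, without any parallelism at all. This sketch (synthetic data of my own choosing) sums the same values forward and in reverse, mimicking what a different parallel decomposition does to a global reduction: any difference is pure rounding noise, far below a 10^-8 solver tolerance.

```python
import random

# Floating-point addition is not associative, so summation order
# matters at the level of rounding error. Synthetic data for
# illustration only.
random.seed(42)
values = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

forward = sum(values)
backward = sum(reversed(values))

# Typically a tiny nonzero difference; always far below 1e-8.
print(abs(forward - backward))
```

A compensated summation such as `math.fsum` shrinks the discrepancy further, but the honest answer is the one in the comment above: accept that results agreeing to well within the requested tolerance are the same answer.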
