Latency and HPC Workloads

Robert Myers (rbmyersusa.delete@this.gmail.com) on October 16, 2012 9:56 am wrote:
> anon (anon.delete@this.anon.com) on October 16, 2012 8:17 am wrote:
>
> >
>
> > Interesting mindset: everybody else disagrees with me,
> > therefore
> everybody else is wrong.
> >
> > Although there are well known exceptions
>
> > where such mindset has turned an industry or scientific field upside
> down,
> > 99.99x% of the time, it comes from a range of people from the one
> who is very
> > good but does not grasp a particular aspect, down to the
> complete
> > crackpot.
> >
> > Perhaps you are an exception. I would
> like to hear more about the
> > problems and your ideas how to fix them, if
> you would spare the time. (I assume
> > that "everyone stop what you're
> doing" is not actually your proposal!)
> >
>
> Even though you post
> anonymously, and even though you are personally insulting, I'm going to answer
> your post, as you ask most broadly. In the future, if you don't want to be
> thought of as a crackpot yourself, you might leave off speculation about who is
> and who is not a crackpot.

What I said is just facts; I didn't call you a crackpot as such. But you would agree that many people with this mindset are crackpots.

And you really should think of anonymous posters as crackpots. There is really no other sane way to proceed on the internet.

> Just as soon as the players I have mentioned, plus
> anyone else who plays the same game, stops advertising "the n-th fastest
> computer in the world" based on a single benchmark (Linpack), I'll back off on
> the snarling insults about the practice.

I have some knowledge of supercomputer procurement. Not in the top 10, but in the top 100. The clients were very specific about their workloads, and gave a dozen or so which were run by their users for the acceptance suite. Their top500 submission was fun because it gave an "Nth fastest supercomputer" tag, but it was at the bottom of the priority list (and I don't think it was even required for acceptance).

Do other HPC sites really start out wanting to reach #1 (or some top500 goal), without any real idea of how the machine will be used? I highly doubt it.
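For context on what the ranking actually measures: the top500 number boils down to timing one dense linear solve. Here is a toy sketch of that measurement (my own illustration using numpy, not the official HPL benchmark), using the standard HPL operation count:

```python
# Toy Linpack-style measurement: time one dense solve and report GFLOP/s.
# Illustrative sketch only -- the real HPL benchmark is a distributed LU
# factorization, which this does not attempt to reproduce.
import time
import numpy as np

def toy_linpack(n, seed=0):
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal(n)
    t0 = time.perf_counter()
    x = np.linalg.solve(a, b)              # LU factorization + triangular solves
    elapsed = time.perf_counter() - t0
    flops = 2.0 / 3.0 * n**3 + 2.0 * n**2  # standard HPL operation count
    residual = np.linalg.norm(a @ x - b) / np.linalg.norm(b)
    return flops / elapsed / 1e9, residual

gflops, residual = toy_linpack(1000)
print(f"{gflops:.2f} GFLOP/s, relative residual {residual:.2e}")
```

The point is that this one number is dominated by dense floating-point throughput; it says almost nothing about interconnect latency or memory bandwidth, which is exactly why a site's real acceptance suite uses its users' own workloads instead.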

>
> I have posted about this general
> issue and about the poverty of non-local bandwidth on a particular forum on
> Usenet at length. I have already argued at length about why gigantic computer
> centers with lousy interconnect are in the interest neither of science nor of
> the national purse. They do serve the interest of some of the players I have
> already mentioned. Someone has responded, and probably the person who
> identified himself here as forestlaughing, that these gigantic machines are
> actually throughput machines that are rarely employed in their actual
> giganticness, except to deal with many users under a single bureaucracy. To
> that argument, I have no answer except that my experience with those gigantic
> bureaucracies has never been positive.
>
> As an introduction to my lengthy
> involvement in this controversy, you may want to endure the following thread on
> comp.arch:
>
> https://groups.google.com/group/comp.arch/browse_thread/thread/225ae7ff9050a027/71fbb4d1cdd9651c?hl=en&q=Gordon+Bell+group:comp.arch+author:Robert+author:Myers#71fbb4d1cdd9651c

GPGPU? Assembly programming? Pure streaming? Sounds fishy.

Custom interconnects can be (and are) used.

Custom chips can be and have been used, software could use vectors, chip bandwidth could be increased. But the fact is that more bang for the buck can be had almost everywhere by using commodity CPUs, or in the case of BG, custom HPC chips which look more like commodity chips (i.e., no huge bandwidth, no massive vectors, and with caches), because that actually works better. Caches do work for many compute codes.
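The cache point can be made concrete with arithmetic intensity: a streaming kernel like DAXPY does a fixed, tiny number of flops per byte moved, while a well-blocked matrix multiply reuses each loaded element many times, which is exactly what caches exploit. A back-of-the-envelope sketch (my own idealized numbers, double precision):

```python
# Rough arithmetic-intensity comparison (double precision, 8 bytes/element).
# Idealized back-of-the-envelope numbers, not a measurement.

def daxpy_intensity(n):
    # y = a*x + y: 2n flops; reads x and y, writes y -> 3n doubles moved.
    flops = 2 * n
    bytes_moved = 3 * n * 8
    return flops / bytes_moved

def dgemm_intensity(n):
    # C = A*B: 2n^3 flops; with good cache blocking each matrix is streamed
    # roughly once from memory -> about 3n^2 doubles moved (idealized bound).
    flops = 2 * n**3
    bytes_moved = 3 * n**2 * 8
    return flops / bytes_moved

print(daxpy_intensity(10**6))  # ~0.083 flops/byte: bandwidth-bound, caches can't help
print(dgemm_intensity(1000))   # ~83 flops/byte: reuse-heavy, caches pay off
```

The upshot matches the argument above: commodity chips with caches do well on codes with data reuse, and a pure-streaming, huge-bandwidth design only pays off for the workloads stuck at the low end of this ratio.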

Custom vector machines with gigantic memory bandwidth were well on the way out before the top500 list even started. Although, again, there are custom machines out there which some people use, presumably because their workloads really don't fit traditional CPUs:

http://en.wikipedia.org/wiki/SX-9

And actually you also see installations going the other way too. MD-GRAPE, for example, was a custom chip with massive compute, but it did not have large memory or interconnect bandwidth, and it sent the same data to multiple pipes (so it is not like a traditional vector machine either). So not everyone wants vectors and bandwidth.

So if the top supercomputers are just about getting #1 spot, wouldn't we see a vibrant community of vectors and weird and wonderful streaming processors programmed in assembly as we go down the list? Or does everyone not care about real computing, but just want a spot somewhere (anywhere) on the list? "Who cares about biochemical simulations, let's spend all our money to get #158 on top500". No, that does not happen. And there are a significant number of private organizations down this list too, you know.

Very few of them even use GPUs let alone full custom CPUs.

>
> I went so far as to enlist the aid of some
> comp.arch participants, who have been generous with their time and their
> patience with the fact that I am not an actual computer architect. When IBM
> bailed on Blue Waters and I seemed to be the only one saying that the Emperor
> was plainly walking naked, I gave up.

Blue Waters was not revolutionary. It was a "commodity" non-vector POWER7 CPU with a custom interconnect. What was good about Blue Waters that is not good about BG/Q or the K supercomputer?