I’m just back from the ISC’10 Tutorial Sessions. Getting from Düsseldorf to Hamburg and back in one day is a pretty harsh thing to do. For a start, the A1 was basically one long string of construction sites, which made it quite a hassle to get there. In other words, I arrived only just in time.

We arrived at the registration desk at 13:20 sharp – the tutorials would start at 13:30. Registration was smooth. Give yer name and company, grab your badge, the WiFi details, a map showing how to get from the CCH to the University and a schedule for all the tutorials.

First thing we didn’t like: the schedule was divided into tracks – there was a CUDA track, an Infiniband track, and so on. There were a lot of lectures in every individual track, but no timeline showing when the individual talks would start! It was hard to decide what to do first.

We grabbed our stuff and had a really short walk of probably 5 minutes to the venue. The building was typical: a huge, massive 1900s building with a lot of stairs – but eventually we got into lecture room B, where we wanted to hear the CUDA tutorial.

Unfortunately, the CUDA talk wasn’t anything new. I really think that the slides of this tutorial were actually used in an NVIDIA webinar about CUDA I attended last year! It was a real déjà vu, and I think they just changed the date on those slides. Gernot Ziegler started off the tutorial – I thought this would be “big time”, ’cause we’d get the opportunity to hear someone who’s really into CUDA – he’s with NVIDIA after all. Anyway, he handed the tutorial over to John Stratton, who did quite a decent job, but he was pretty unlucky in that he had to repeat the presentation I had already enjoyed in last year’s webinar.

Ah, what a wasted opportunity. See, when it comes to CUDA I’d expect more from the ISC than a recital of what was already told earlier on less specialised channels!

My colleague and I stuck with the talk until the long break – some cool things were said about how to abuse the texture buffer in a clever way to do really local multidimensional computations, but unfortunately it wasn’t elaborated on enough for my taste.

After the long break we decided to hit the Infiniband talk. Since we deploy large installations, we thought this would give us some insight into how to deploy Infiniband and how to come up with migration concepts for getting from Ethernet to Infiniband. Sadly, the talk was just a roundup of the available vendors, their products and performance comparisons. That’s not really what we expected; we could’ve just looked that data up elsewhere. While the talk was still going on (I think we were on slide 78 of 145) I decided to check the proceedings and see what this talk would be offering in the next few hours. Unfortunately it was about to continue like that.
That was when we left the hall and went to the “Hybrid MPI & OpenMP Parallel Programming” lecture.

And wow, that was awesome. We got there pretty late, it must’ve been 16:30, but we were basically stunned by the ideas. MPI and OpenMP both have their advantages and flaws. I had certainly thought about combining both technologies, but never did, out of personal laziness. And then those guys just did it: awesome. In general, if we’ve got a cluster of SMP systems with multiple sockets and multiple cores, we should be running MPI on the outer computing domain (between the nodes), and OpenMP on the socket or core layer (within each node). This ain’t new, but they gave me some insight into why you should do it and what pitfalls may arise.
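To make the two-level idea a bit more tangible, here is a minimal sketch in plain Python. It uses neither MPI nor OpenMP; it only models how an index range could be split first across ranks (the part MPI would handle between nodes) and then across threads inside each rank (the part OpenMP would handle on the sockets and cores). The function names `block_range` and `hybrid_layout` and the numbers (100 items, 2 ranks, 4 threads) are made up for illustration and are not taken from the tutorial.

```python
def block_range(n_items, n_parts, part):
    """Return the half-open range [start, stop) owned by `part`.

    Items are split as evenly as possible; the first `n_items % n_parts`
    parts get one extra item each.
    """
    base, extra = divmod(n_items, n_parts)
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


def hybrid_layout(n_items, n_ranks, threads_per_rank):
    """Map every (rank, thread) pair to its slice of the global index space."""
    layout = {}
    for rank in range(n_ranks):
        # Outer level: the chunk one MPI rank (e.g. one node) would own.
        r_start, r_stop = block_range(n_items, n_ranks, rank)
        for thread in range(threads_per_rank):
            # Inner level: the rank's chunk is shared among its threads,
            # roughly what a static OpenMP loop schedule would do.
            t_start, t_stop = block_range(r_stop - r_start, threads_per_rank, thread)
            layout[(rank, thread)] = (r_start + t_start, r_start + t_stop)
    return layout


if __name__ == "__main__":
    for (rank, thread), (start, stop) in sorted(hybrid_layout(100, 2, 4).items()):
        print(f"rank {rank} thread {thread}: items {start}..{stop - 1}")
```

In a real hybrid code only the outer split involves message passing; the inner split works on shared memory, which is exactly why it maps well onto multi-socket, multi-core nodes.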

In the end, I thought I should have stuck with the last tutorial in the first place. Coming back to my first complaint, there wasn’t a real schedule, which is sad. We couldn’t decide which tutorial to go to first since we had no idea when all those lectures were taking place. If the ISC continues to offer those tutorials, they should improve their schedule.

Then again, since my colleague and I were both attending the CUDA talk, we already worked out on the spot how we could use CUDA in our use cases. We only got rough ideas, but sitting together and listening to that talk wasn’t just a waste of time.
It brought us together.

Now I’m back at home and don’t feel too bad about the ISC lectures – now I feel sad that I’m not able to attend the rest of the ISC. But I’ve got to work on Monday. Which is in… omg, in about 8 hours.

The NCSA is offering a free web-based seminar on getting started with performance tools. The National Center for Supercomputing Applications, based at the University of Illinois, is renowned for its expertise in the HPC environment. Their courses and training sessions are among the best in the world.

Registration is required for this event. Please complete the Registration Form.

Description:
This webinar will provide an introduction to performance tools and techniques. A common application, High Performance Linpack (HPL), will be analyzed with profiling tools from a high level progressing down to how the code is mapped onto hardware. To do this, HPL will be analyzed with profiling tools for both user and system time and then a representative component of HPL (matrix multiply) from a near-the-hardware vantage point will be used to show how it can be tuned. Finally, emerging trends in performance tool development will be described.

Prerequisites:
This tutorial is intended for users with basic parallel programming experience who are new to the performance engineering process.

That made me sad. I just read that Chris, the guy behind HPC Answers, is going to stop blogging.

Chris, we’ll miss you. You’re one of the exceptional guys who really know what they’re talking about. There ain’t that many people blogging about HPC, and you’ve got a hell of a lot of expertise.
Make sure your legacy is well archived in some safe location; your “Answers” were true and indeed very useful answers.

I wish you all the very best for your new job and I hope you drop by every now and then – virtually here or physically in Germany, if you ever happen to be here.

It was an honor to read your blog. Whoring about HPC-topics won’t be the same without you.

Thomas Sterling, who’s nowadays teaching at LSU, is preparing a course on supercomputing for spring 2008. They’re going to broadcast the lectures in high-definition TV to other universities over the internet. He’s also working on a textbook on this topic, and they’ll also offer the course on DVD later.

SUN just announced its new programming language, “Fortress”, which is supposed to be the successor to FORTRAN. The emphasis is on parallel computing.

Synopsis:

Fortress is a new programming language designed for high-performance computing (HPC) with high programmability. In order to explore breakaway approaches to improving programmability, the Fortress design has not been tied to legacy language syntax or semantics; all aspects of HPC language design have been rethought from the ground up. As a result, we are able to support features in Fortress such as transactions, specification of locality, and implicit parallel computation, as integral features built into the core of the language. Features such as the Fortress component system and test framework facilitate program assembly and testing, and enable powerful compiler optimizations across library boundaries. Even the syntax and type system of Fortress are custom-tailored to modern HPC programming, supporting mathematical notation and static checking of properties such as physical units and dimensions, static type checking of multidimensional arrays and matrices, and definitions of domain-specific language syntax in libraries. Moreover, Fortress has been designed with the intent that it be a “growable” language, gracefully supporting the addition of future language features. In fact, much of the Fortress language itself (even the definition of arrays and other basic types) is encoded in libraries atop a relatively small core language.

A reference implementation (an interpreter, written in Java) is available at the project’s homepage under the BSD license.

Haven’t looked into it yet, but I definitely will. Stay tuned for updates.

It’s been quite some time since I blogged much. I’m currently committed to a project where we’re deploying a couple of new components at a German telco: a new cluster, a new Network Management System and some kind of Layer-7 proxy. This keeps me busy, and I apologize to my regular readers for the lack of updates. So, here’s a little roundup of things which happened in the last 6 weeks or so.

OK, let’s get started; first there was SC06, which was quite a happening, and one I missed this year. Interesting hardware included the Intel SR1530 systems, for example: eight bloody cores in one 1U case. Nifty! SiCortex announced a 5832-core MIPS system for the masses – the SC5832 offers 5.8 TFLOPS (peak) at just 20 kW of power consumption through the use of power-optimized MIPS64 cores. Nvidia showed off CUDA, a library for offloading computing to the GPU, and Dell announced systems with quad-core Opterons. And there was news about IBM’s 1350 hybrid CBE-blade-system.
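Just to put the SC5832 figures into perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the numbers quoted above (5.8 TFLOPS peak, 5832 cores, 20 kW); the variable names are mine, and peak values say nothing about sustained performance.

```python
# Figures as quoted above for the SiCortex SC5832 (peak, not sustained).
peak_flops = 5.8e12   # 5.8 TFLOPS
cores = 5832          # MIPS64 cores
power_watts = 20e3    # 20 kW

# Peak performance per core and per watt.
gflops_per_core = peak_flops / cores / 1e9
mflops_per_watt = peak_flops / power_watts / 1e6

print(f"{gflops_per_core:.2f} GFLOPS per core (peak)")
print(f"{mflops_per_watt:.0f} MFLOPS per watt (peak)")
```

That works out to roughly 0.99 GFLOPS per core and 290 MFLOPS per watt, so the selling point here is clearly the power budget and the sheer number of cores, not the speed of the individual core.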