As some of you might know I ended up spending a great deal of SC08 either in
the hospital or in my hotel room recovering from emergency surgery. Not the best
way to spend SC since it only comes once a year, but my body just didn't allow it.
However, I did get out a little on the last day to run around the show floor like
a mad man.

SC is always a very interesting conference for me for many reasons. I get to see
some cool new toys, see old friends, make new ones, and totally geek-out for
a week without my family rolling their eyes at me in total embarrassment. So,
without further ado, here are my (limited) impressions of SC08.

Austin's Really Great

I get to spend a great deal of time in Austin because of my day job but it's not
downtown. So it was very interesting for me to be downtown in Austin, especially
near the night-life around 6th street.
It's a much, much better destination than Reno was for SC07 or Tampa was for SC06.
There are tons of places to eat including some places with good steaks and
BBQ (I'm a huge BBQ nut). The prices can be a little steep but at least we can
find places to eat (unlike Tampa) and we didn't have to cut through the smoke to
get anywhere (like Reno). So, hats off to the SC committee - Austin was a pretty
good pick. BTW, the hospital near downtown,
Brackenridge, is a top-notch
facility and a major trauma center. The people there were spectacular to say
the least. But then again, I'm not going to judge the location for SC based on
the quality of the hospitals. But given my rapidly advancing age, it may become
one of my key criteria for future SC conferences.

General Impressions

I think several other people (e.g. Doug and Joe Landman) have mentioned that
the show floor felt less full than usual. I don't know what the final attendance was,
but I do know that a number of people who were supposed to come,
canceled at the last minute. I guess the reality of operating expenses has hit
just about everyone. But walking around the floor, I got the distinct impression
that the attendance was down.

Another impression I got was that the number of "customer" booths was way
up. Remember that SC is a unique conference in that the vendors and their
customers all share the same exhibit floor. To me, it seemed as though the
number of customer booths, primarily universities and national labs, was
up considerably. I didn't stop to talk to many of them, but my usual favorites,
TACC and aggregate.org, were there and in
rare form. I did see more universities from Asia, which I think is a good sign.
At the same time, the national lab booths just seem to be getting bigger and
more elaborate every year. I'm waiting for the day when the largest and loudest
booth with the best swag is not a vendor but rather a national lab. When that
happens I think it will speak volumes about the HPC industry and funding. But,
I digress.

One other impression I have is that the general buzz was different as well. It didn't
seem as "fun" as past SC shows. There seemed to be a "bite" in every conversation.
My favorite conversations to overhear were academics talking to vendors, or sometimes
to people at other non-vendor booths. Some of these conversations got very heated, with
academics raising their voices to tell the vendors that they were dead wrong, that they
were hurting the industry, and that if the vendors would only listen, the academics had
a solution to whatever the problem was. In 5+ years of SCs I have never heard conversations
get this heated. Things are usually very pleasant and, at the very least, technically fun.
But when you get people who are absolutely convinced they are absolutely right, and whose
sensitivity, for whatever reason, is heightened, it makes for a really argumentative
environment. And I didn't get this impression from one or two discussions but from many
of them. Sigh... I hope it was just that some people were grumpy, but if this attitude
really is the general one, then I don't think it's a good sign for the community (I won't
even talk about the beowulf list, which has become next to useless, but that's another
story... :) ).

Cool Stuff for HPC

Since I didn't walk around the show floor too much, I will have to rely on press
releases and website information to help. I always look for a "theme" or two at the
show, and this year I think I can definitely find one theme and perhaps a second
and a third. The main theme of this year's show, at least to me, was GPUs.

GPUs

Everyone was talking about or demoing GPUs for HPC. I've been following
GPUs for a number of years and I was glad to see them come to the forefront
this year. A number of vendors were demo-ing systems with GPUs such as
Cray, Bull, Dell, NEC, HP, BOXX, Mathematica, Lenovo and others. Plus
there seemed to be lots of discussion about tools for GPUs, with many people
expressing hope that OpenCL would be the savior of GPU coding.

Nvidia had a
press release
about what other companies are doing to incorporate Nvidia GPUs into Personal
Supercomputers. In general, the plan is to use Nvidia's
C1060
card in a workstation or rack mount system. They even have a
website that
discusses personal supercomputers using Tesla cards.

You can go to the
Home of CUDA, Nvidia's
freely available tool for building GPU codes. There are a number of examples of
speedups obtained from running on GPUs. However, getting your application to run
on GPUs is not as simple as typing "make" or adding a new option to the command line
(e.g. "-gpu"). You still have to rethink your algorithms to take advantage of the
GPU. While this sounds easy, it's not - you have to retrain the way you think about
your code. But if you can coerce your code into running on GPUs, the potential for
order-of-magnitude increases in performance is there. Keep in mind that not all codes
or algorithms will be able to take advantage of GPUs.
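
As a flavor of what that rethinking looks like, here is a minimal CUDA sketch (my own toy example, not from any vendor or real application) of a simple array update moved from a serial CPU loop to a data-parallel kernel:

// Toy CUDA sketch: a serial loop recast as a data-parallel kernel.
// All names and sizes here are illustrative, not from a real code.
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// One GPU thread per array element, instead of one CPU core walking all of them.
__global__ void scale_add(const float *x, float *y, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host data
    float *hx = (float *)malloc(bytes);
    float *hy = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { hx[i] = 1.0f; hy[i] = 2.0f; }

    // Device data: the explicit copies are part of the "rethinking" -
    // data has to be moved to the GPU and back.
    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    // Launch enough thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale_add<<<blocks, threads>>>(dx, dy, 2.0f, n);

    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", hy[0]);   // expect 4.0

    cudaFree(dx); cudaFree(dy);
    free(hx); free(hy);
    return 0;
}

Even for something this trivial, you end up thinking about data movement, thread counts, and memory layout rather than just the loop itself - which is why porting a real application is rarely a one-afternoon job.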

While Nvidia was the main talk in regard to GPUs, Aprius
was also there showing off an interesting box called the
CA8000 Computational Acceleration System.
It's a 4U box that contains up to eight (8) PCIe boards - most likely computational
acceleration cards (e.g. GPUs). Each slot can hold a double-wide PCIe x16 Gen 2 card
that draws up to 300W. Ideally, you populate the CA8000 with a few cards,
such as GPUs, and then use the Aprius PCIe optical adapters in the box to
connect it to a single node or to multiple nodes. You can use up to four (4)
of these adapters and four (4) cards. This is perfect for situations where the
compute nodes cannot handle a GPU directly (either they don't have the right kind
of slot or they don't have enough power). Using the connectors, you can get a 2:1
or a 4:1 accelerator-to-node ratio with this box.

Since AMD doesn't have an external GPU box as Nvidia does, the CA8000 is perfect
for AMD GPU solutions. It also matches Nvidia's recommended ratio of no more than
two GPUs per CPU. But the CA8000 does not offer the density that the Nvidia
S1070 1U box
offers. Nonetheless, I think this box is very interesting for a variety of
reasons: it allows nodes that can't host a GPU to connect to GPUs, and it gives
AMD an external solution that comes close to Nvidia's.

Nvidia was also on the floor in full force. Their booth is always good and they
have some real technical experts floating around (unlike some companies who stuff
their booths with eye-candy and don't send anyone with technical skills to back
it up - but that's another story). Due to my horribly limited time I didn't get
to chat with Nvidia. I'm sorry I missed that since that's always a highlight of the
show.

One of the coolest announcements and one I was really looking forward to digging
into was that the Portland Group announced the new version
(8.0) of their compiler suite. While the compilers are always good and PGI continues
to make them better, I think this suite could represent the beginning of a huge
trend for GPUs - integrating GPU code generation into standard compilers.

The idea is that standard compilers have the ability to generate code for GPUs.
Of course, you have to write code that the compiler recognizes or, even better,
the compiler could have a compile option such as "-gpu" that would look at the
code and generate GPU code where appropriate. I know this is wishful thinking,
but the compiler writers at PGI are exceptional. The advantage of this approach is that it allows
people to use the standard compilers they may already be using to build applications
that will run on GPUs. This approach is even more important for Fortran, since there are no
really good ways to easily port Fortran code to run on GPUs.
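
Just to illustrate the idea, here is a rough sketch of what directive-style GPU code generation could look like. The pragma name below is something I made up for illustration; it is not PGI's actual syntax or any shipping directive.

/* Hypothetical sketch only: the directive name is invented to illustrate
   the concept of compiler-generated GPU code; it is NOT real PGI syntax. */
void scale_add(const float *x, float *y, float a, int n)
{
    /* Ask the compiler to offload this loop to the GPU, letting it
       generate the data movement and kernel launch on its own. */
    #pragma gpu_offload   /* illustrative directive, not a real one */
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

The appeal is that a compiler that doesn't understand the directive can simply ignore it and build a normal CPU version from the same source, which is exactly why a directive approach is attractive for existing Fortran and C codes.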

Keep watching CM for a follow-up
article I hope to do on the new PGI compilers.

Solid State Storage

Another theme that I think is close to the GPU theme in magnitude is Solid-State
storage. For some time I think we've all seen articles coming out about SSD's.
While they are still expensive and have limitations that many people are aware
of (e.g. they can actually lose data), there were two companies that I would
like to highlight - Texas Memory and
Solid Access.

One of the reasons that SSD's and the like have become so popular is that people
are looking for increased performance and possibly lower power consumption for
applications (even if it is only a "perceived" need for increased performance). But
in general, people are starting to examine creating "tiers" of storage
behind a file system, managed with HSM (hierarchical storage management). Figure One below illustrates the concept.

Figure One: Storage Tiering Pyramid

The width of the triangle indicates capacity and the height indicates performance
(however you want to measure performance - throughput, IOPS, etc.). The general
premise of this illustration is that as you move up the triangle, costs increase
as well. So faster storage costs more (makes sense). Therefore, to save money, don't
put all of your data on the fastest, most expensive storage. It's better to put
only the data that needs extremely fast storage on something like SSDs
or Ramdisks, and move the rest of the data to something a lot slower, such as SATA drives
with limited bandwidth to the file system. This is the HSM concept (move the data
up and down the tiers as needed). So people are looking at SSDs and Ramdisks to get the
best performance possible, but they want to combine them with existing storage to
be more cost effective.
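
As a toy illustration of that "move the data up and down" idea (my own sketch, not any vendor's actual HSM implementation), the core of a tiering policy is just a placement decision based on how hot the data is; the tiers and thresholds below are arbitrary examples:

/* Toy sketch of an HSM-style placement decision. The tiers and the
   thresholds are invented examples, not any product's actual policy. */
#include <time.h>

enum tier { TIER_RAMDISK, TIER_SSD, TIER_SATA };

/* Hot data goes to the fastest (and most expensive) tier, cold data
   gets demoted to cheap, slow SATA. */
enum tier place_file(time_t last_access, time_t now)
{
    double idle_days = difftime(now, last_access) / 86400.0;

    if (idle_days < 1.0)
        return TIER_RAMDISK;   /* touched in the last day: keep it fast */
    if (idle_days < 30.0)
        return TIER_SSD;       /* warm data */
    return TIER_SATA;          /* cold data: capacity over speed */
}

A real HSM does this continuously and transparently behind the file system, but the economics are exactly the pyramid above: only the data that needs the fast tier pays for it.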

Texas Memory has been around for a number of years, but I think their importance
in the HPCC storage market is about to take a quantum leap because of the
tiering approach. They have a variety of products that
use either flash (SSD) or RAM (Ramdisk) as the storage medium. For example,
they have a unit called the RamSan-500
that consists of 1TB to 2TB of Flash Raid along with 16GB to 64GB of cache. It
can be connected via 4X FC links (2-8 of them). This box alone can do 2GB per
second of throughput and 100,000 IOPS from the flash storage (as a comparison, a
single hard drive could do maybe 50MB/s and around 100 IOPS).
Their RamSan-440 is a
RAM based storage unit with 256 to 512GB of storage. It can do up to 4.5GB/s
throughput and 600,000 IOPS.

Texas Memory has a range of storage options including a 42U rack with flash
based storage and memory cache.
The RamSan-5000
has up to 10-20TB of flash based storage and 160GB to 640GB of cache. In
aggregate, it can do 20GB/s to the flash storage and achieve over 1,000,000 IOPS.
Keep an eye on Texas Memory - they are going to start shaking up the HPCC Storage
market.

The other company that has an SSD solution as a stand-alone unit is
Solid Access. They have several
products that offer
various approaches to adding solid-state storage. The base product, the
USSD 200, is a 2U box that has a maximum capacity of 128GB but with a throughput
of 3.6GB/s when you use multiple FC links. It can be connected in a variety of
ways including 320 MB/s SCSI-3 Ultra-wide LVD, 3 Gb/s SAS, and 4 Gb/s FC.

During SC08, they also announced a new 1U box (USSD 300 series) that has up to
256GB of flash storage. It can do 100,000 IOPS over a single FC port, and 4GB/s with
aggregated links. They also announced the USSD 320, which is a 2U unit with up to
256GB of storage.

TACC and Visualization

While I don't think it was a "theme" of the show, TACC announced their
new visualization center,
which includes a new viz wall called Stallion. This project is very noteworthy because
it's built totally from commodity parts and uses Kubuntu Linux. It has
24 Dell XPS 690 workstations (one of them is a head node). Each of the 23 compute
nodes has two Nvidia graphics cards (each with 1GB of video memory), 4.5GB of memory,
and a single Intel quad-core CPU. These drive a total of 75 Dell 30"
monitors (a bit over three monitors per compute node). The monitors are
capable of 2560 x 1600 resolution and are arranged in 15 columns of 5 monitors
each. At 2560 x 1600 apiece, that works out to roughly 307 million pixels.

Figure Two: TACC Stallion Viz-Wall

Stallion now is the largest tiled display in the world, passing the San Diego
Supercomputer Center, which is amazing, but I think the coolest aspect of the
whole project is that it's using standard workstations, standard displays,
standard video cards, and standard networking, along with Linux and some open-source
viz software. It's not a specialized system, custom built and custom integrated
as in the good old SGI days. It follows the same tenets as Beowulf clusters, but
for viz clusters. Not a bad concept IMHO.

Summary

I hate to say it, but given my extremely limited time on the show floor and time to
talk to vendors and others, these are the highlights for me. I think Doug has
additional comments that he will be posting. Next SC I will do my level best
not to end up in the hospital so I can at least give a reasonable overview of the
show.

Dr. Jeff Layton hopes to someday have a 20 TB file system in his home
computer. He lives in the Atlanta area
and can sometimes be found lounging at the nearby Fry's, dreaming of
hardware and drinking coffee (but never during working hours).
