This is a personal blog updated regularly by Dr. Daniel Reed, Vice President for Research and Economic Development at the University of Iowa.
These musings on the current and future state of technology, scientific research and innovation are my own and don’t necessarily represent the University of Iowa's positions, strategies or opinions.

July 08, 2007

Petascale and Multicore Redux

N.B. Sorry for the absence, I’ve been on vacation, thinking about life and doing my best Jimmy Buffett imitation.

In mid-June, I participated in the DOE’s SOS11 workshop, held in Key West. This year’s workshop theme was “Challenges of Sustained Petascale Computation.” The workshop included updates on new systems and experiences from the national laboratories, along with vendor and academic perspectives on the challenges ahead.

By the way, if you don't follow this workshop, you should, as it is an annual snapshot of the state of the art at U.S. HPC centers. Last year's theme was distributed supercomputing, which I kicked off with a keynote on the data deluge.

Rick Stevens kicked off this year's workshop with a provocative and very enjoyable talk about the challenges of the world’s largest computational systems. Sprinkled with futurist insights from the Arlington Institute, Rick challenged us to think expansively about complex applications, notably in computational biology. Jack Dongarra gave a second keynote on adaptive precision arithmetic, learning systems, and linear algebra software for new processors.

In the vendor session, Intel discussed its 80-core teraflop test chip and some of the electrical signaling issues it was intended to test. Everyone at the workshop (and at the Microsoft Manycore Computing Workshop) agreed that we would see hundred-core commodity chips by the end of the decade or soon thereafter. Looking further ahead, one can see thousand-core chips coming.

In the SOS11 software session, I talked about the importance of rethinking abstraction at petascale. Abstraction is really the only effective mechanism we humans have identified to manage complexity. In other domains, we isolate and encapsulate complexity, allowing us to compose complex systems and reason about them at higher levels. I believe it is critical that we adopt abstraction and automation to manage petascale and (soon) exascale systems if they are to have a large and robust application suite.
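To make the encapsulation point concrete, here is a minimal sketch of what I mean: application code talks to one interface, and platform-specific tuning hides behind it. All the class and function names below are illustrative inventions, not any real HPC library.

```python
# Sketch: abstraction hiding hardware idiosyncrasies behind one interface.
# Names are hypothetical; the point is the structure, not the kernel.
from abc import ABC, abstractmethod


class VectorBackend(ABC):
    @abstractmethod
    def dot(self, a, b):
        """Inner product of two equal-length sequences."""


class ScalarBackend(VectorBackend):
    """Portable reference implementation."""
    def dot(self, a, b):
        return sum(x * y for x, y in zip(a, b))


class BlockedBackend(VectorBackend):
    """Stand-in for a tuned, platform-specific implementation."""
    def dot(self, a, b, block=4):
        total = 0.0
        for i in range(0, len(a), block):
            total += sum(x * y for x, y in zip(a[i:i + block], b[i:i + block]))
        return total


def application_kernel(backend, a, b):
    # Application code never sees which backend runs underneath.
    return backend.dot(a, b)


a, b = [1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]
print(application_kernel(ScalarBackend(), a, b))
print(application_kernel(BlockedBackend(), a, b))
```

Swapping the blocked backend for the scalar one changes nothing in the application code; that is the property that makes migrating a long-lived code to new hardware cheap.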

We can continue down our hand-crafted application development path for petascale systems, and it will work; the evidence to date is clear. Given enough investment of time and money, we will be successful, but I am not convinced we are on the most effective path. (Aerodynamics is not mandatory on cars either, but it surely reduces fuel consumption.)

Let me put things in economic perspective. The total cost of operation of a teraflop (peak) cluster for a year is declining rapidly, due to increased chip performance. Multicore chips will continue that trajectory, albeit with certain software challenges. Concurrently and conversely, the cost of a professional software developer for a year is rising, perhaps not as fast as we in software might like, but it is rising. Moreover, complex software applications have lifetimes far in excess of the systems on which they are initially developed, and they increasingly require teams of developers drawn from many disciplines. The more effectively we can hide hardware idiosyncrasies behind abstractions, the easier and cheaper it will be to migrate and extend complex codes on new systems.
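The tradeoff above is easy to quantify on the back of an envelope. The sketch below uses entirely hypothetical dollar figures (they are not data from this post); the structure of the calculation is what matters.

```python
# Back-of-envelope: when does hand-tuning cost more than the hardware it saves?
# All dollar figures below are hypothetical placeholders.

def breakeven_developer_months(cluster_cost_per_year,
                               developer_cost_per_year,
                               speedup_fraction):
    """Months of developer time whose salary cost equals the hardware
    cost avoided by a given fractional performance improvement."""
    hardware_saved = cluster_cost_per_year * speedup_fraction
    return 12.0 * hardware_saved / developer_cost_per_year


# Example: a $200K/year cluster, a $150K/year developer, a 10% speedup.
months = breakeven_developer_months(200_000, 150_000, 0.10)
print(f"Hand-tuning pays off only if it takes under {months:.1f} developer-months")
```

As cluster costs fall and salaries rise, that breakeven window shrinks every year, which is exactly why hiding hardware idiosyncrasies behind abstractions keeps getting more attractive.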

What I am really arguing is that we need to rethink aggressive machine optimization, virtualization and abstraction. What’s wrong with devoting a teraflop-year to large-scale code optimization? I don’t just mean peephole optimization or interprocedural analysis. Think about genetic programming, evolutionary algorithms, feedback-directed optimization, multiple objective code optimization, redundancy for fault tolerance and other techniques that assemble functionality from building blocks. Why have we come to believe that compilation times should be measurable with a stopwatch rather than a sundial?
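As a toy illustration of the evolutionary flavor of search I have in mind, the sketch below evolves a single code-tuning parameter (a hypothetical loop tile size) against a stand-in fitness function. A real system would time generated code on the target machine; everything here, including the optimum at 64, is an invented placeholder.

```python
import random

# Toy evolutionary search over a tuning parameter (a loop tile size).
# The fitness function is a stand-in for actually timing generated code.

def fitness(tile_size):
    # Pretend runtime model: best performance near tile_size == 64.
    return -abs(tile_size - 64)


def evolve(generations=50, pop_size=20, seed=1):
    rng = random.Random(seed)
    population = [rng.randrange(1, 513) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half unchanged, refill with mutated copies.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = [max(1, s + rng.randrange(-8, 9)) for s in survivors]
        population = survivors + children
    return max(population, key=fitness)


best = evolve()
print("best tile size found:", best)
```

Nothing here is clever; the point is that a teraflop-year buys an enormous number of such evaluate-and-mutate cycles, far more than any human tuner could attempt by hand.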

I’m not suggesting these approaches be de rigueur during day-to-day development; after all, compilation and debugging also involve human aspects. However, I believe we need to move beyond software development in the small to software management in the large, if we are to solve the complex, multidisciplinary problems that petascale and exascale systems put within our intellectual reach.

Of course, as Dennis Miller used to say, this is just my opinion. I could be wrong.

Off Topic

While at SOS11, I went for a stroll along Duval Street in Key West. Although I’m sure parts of Key West are nice (President Truman had a retreat there), Duval Street seemed to embody the phrase “tacky, yet unrefined” – too many tourists and tee-shirt shops. I did catch an interesting photo of the founding site of Pan American Airways, from back when they flew between Key West and Havana.