Researchers give update on road to parallelism

The chip project closest to being tested is Rigel, a 1,024-core processor architecture aimed at high-density, high-throughput computing. It would be programmed via a task-level application programming interface targeting jobs in imaging, computer vision, physics and simulations.

Researchers described Rigel as a kind of far-future variation of the Intel Larrabee graphics processor. Like Larrabee, Rigel will use a single address space, four or fewer threads and will be fully cached. However, unlike Larrabee, Rigel will be based on a MIMD engine and use software-based memory coherency.

Rigel will use 16 eight-core nodes per chip with a global shared L3 cache. Each core will be a simple "Tensilica-like" block with one single-precision floating point unit and minimal decode logic.

Early simulations show Rigel could deliver 8 GFlops per square millimeter of die space at 100W, matching or exceeding today's multicore graphics processors. If made in a 45nm process, it could run at 1.2 GHz and take up 320 square millimeters of silicon.
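
As a rough sanity check on those figures, the short Python sketch below works out peak throughput and compute density from the numbers quoted above. It assumes each core's floating point unit completes one fused multiply-add per cycle, counted as two operations; that rate is an assumption for illustration, not a figure from the researchers.

```python
# Figures quoted in the article.
CORES = 1024          # cores per chip
CLOCK_GHZ = 1.2       # projected clock rate in a 45nm process
DIE_AREA_MM2 = 320    # projected die area in square millimeters

# Assumption (not from the article): one fused multiply-add per core
# per cycle, counted as 2 floating point operations.
FLOPS_PER_CYCLE = 2


def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Peak throughput in GFlops: cores x GHz x operations per cycle."""
    return cores * clock_ghz * flops_per_cycle


def compute_density(gflops: float, area_mm2: float) -> float:
    """Throughput per unit of die area, in GFlops per square millimeter."""
    return gflops / area_mm2


peak = peak_gflops(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE)
density = compute_density(peak, DIE_AREA_MM2)

print(f"peak throughput: {peak:.1f} GFlops")
print(f"compute density: {density:.2f} GFlops/mm^2")
```

Under that assumption the chip would peak at about 2,458 GFlops, or roughly 7.7 GFlops per square millimeter, which is in line with the 8 GFlops per square millimeter cited from the simulations.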

Researchers published at least four papers on aspects of Rigel in 2009 and plan another this year. "But there are still many open research problems in the Rigel design," said Daniel R. Johnson, one of the researchers working on the project.

A third chip design, called Bulk Architecture, is testing the concept of so-called atomic transactions. It defines a compilation stage that gathers groups of instructions into clusters and coordinates when those clusters are executed in order to create parallelism.
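
The idea of grouping instructions so that each group takes effect all at once can be illustrated with a toy Python sketch. This is not the Bulk Architecture design: the function names, the fixed cluster size and the dictionary standing in for memory are all hypothetical, and the sketch runs serially rather than in parallel.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Instruction = Callable[[State], None]  # hypothetical: an instruction mutates state


def make_clusters(instructions: List[Instruction], size: int) -> List[List[Instruction]]:
    """Split an instruction stream into consecutive clusters of at most `size`."""
    return [instructions[i:i + size] for i in range(0, len(instructions), size)]


def run_atomically(cluster: List[Instruction], state: State) -> bool:
    """Run a cluster on a private copy and publish it only if every step succeeds."""
    scratch = dict(state)
    try:
        for instruction in cluster:
            instruction(scratch)
    except Exception:
        return False  # cluster discarded; shared state is untouched
    state.clear()
    state.update(scratch)
    return True


def inc(key: str) -> Instruction:
    """Build a toy instruction that increments one named location."""
    def step(state: State) -> None:
        state[key] = state.get(key, 0) + 1
    return step


def boom(state: State) -> None:
    """Toy instruction that always fails, forcing its cluster to be discarded."""
    raise RuntimeError("simulated conflict")


state: State = {}
clusters = make_clusters([inc("x"), inc("x"), inc("y"), boom], size=2)
results = [run_atomically(cluster, state) for cluster in clusters]
print(results, state)
```

The first cluster commits both increments to `x`. The second is discarded as a whole, so its increment to `y` never becomes visible, which is the all-or-nothing behavior the term "atomic" refers to.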

A simulation based on the HotSpot Java compiler provided an average 37 percent performance improvement over an unmodified compiler on a four-core Intel system, said Josep Torrellas, a University of Illinois computer science professor.

Noted Intel processor architects Glenn Hinton and Shekhar Borkar are working with the lab on a silicon prototype. It will be based on Intel's Single-chip Cloud Computer (SCC), a 48-core x86-based research chip described at ISSCC in February.

"The best thing would be for Intel to prototype the Bulk Architecture [directly], but failing that I have to use SCC," said Torrellas.

The Intel Single-chip Cloud Computer, a 48-core x86-based research vehicle, is being used as the basis for multiple prototypes in R&D projects on parallel architectures at the University of Illinois.