Researchers at MIT have refined a software-based chip simulator that tests many-core chip designs for flaws, adding the ability to measure a design's potential power consumption as well as processing times for tasks, memory accesses, and core-to-core communication patterns. The team from MIT's Department of Electrical Engineering and Computer Science is using the simulator to test possible designs for a new processor targeted for fabrication later this year—one that they hope will have over 100 cores.

The simulator is called Hornet, Srini Devadas, professor of electrical engineering and computer science at MIT and principal investigator on Hornet, told Ars Technica in an interview. "You can use it to come up with an interesting computer architecture and test it." When flaws are found, Hornet allows designers to quickly try alternative designs to work around them.

Other simulators do more rapid functionality testing, but are less accurate in their simulation of what happens in each processing cycle of a program running on a chip design. "There's always a tradeoff between speed and accuracy," Devadas said. As a result, they can miss flaws such as "deadlocks" (when cores end up idling endlessly, each waiting for another to release memory or other resources while holding onto the ones it has locked itself).

In contrast, Hornet runs much slower. But it is "more accurate than a functional simulation in measuring how much time it takes to run a program and how much energy is used," Devadas explained. Hornet performs "cycle-accurate" simulation of chip designs with up to 1,000 cores, modeling exactly what happens in every clock cycle of a program's execution. That accuracy helped the Hornet team take the best-paper prize at the Fifth International Symposium on Networks-on-Chip in 2011 with the first version of the simulator, for work showing fatal flaws in a heavily studied multicore-computing technique that other simulations had missed.
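The core idea of cycle-accurate simulation is to advance every core in lockstep, one clock tick at a time, charging time and energy to each event. The toy loop below is purely illustrative, not Hornet's actual model; the operation costs and latencies are invented numbers.

```python
# Toy cycle-driven simulator: each iteration of the while loop is one clock
# tick, and every core advances exactly one cycle per tick.
# Hypothetical per-event energy costs (arbitrary units):
COST = {"alu": 1.0, "mem": 5.0, "idle": 0.2}

def simulate(programs):
    """programs: one list of ("alu" | "mem", latency_in_cycles) per core."""
    cores = [list(p) for p in programs]   # remaining ops per core
    busy = [0] * len(cores)               # cycles left on current op
    cycles, energy = 0, 0.0
    while any(cores) or any(busy):
        for i, prog in enumerate(cores):
            if busy[i] == 0 and prog:
                op, latency = prog.pop(0)  # issue the next operation
                busy[i] = latency
                energy += COST[op]
            elif busy[i] == 0:
                energy += COST["idle"]     # core finished early, idling
            if busy[i]:
                busy[i] -= 1
        cycles += 1
    return cycles, energy

# A 3-cycle memory access makes core 0 finish later than core 1,
# so core 1 burns idle energy while it waits.
print(simulate([[("alu", 1), ("mem", 3)], [("alu", 1), ("alu", 1)]]))
```

Because every tick is modeled explicitly, totals like runtime and energy fall out of the simulation directly; the same lockstep structure is what makes this style of simulator slow at large core counts.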

By giving designers a tool to analyze much larger multicore designs, Hornet makes it possible to push forward designs that would otherwise be too risky to take to further levels of testing and to fabrication. To date, most of the testing has been done using designs with 64 cores, Devadas said, but shorter simulations have been done on much larger designs.

The problem is one of scale and time—simulating larger numbers of cores takes longer and requires more computing power. In a design with 256 cores, Devadas said, a simulation would have to account for all of the processes running on each thread—about a million instructions per thread, with one thread per core. That means simulating 256 million instructions in total to test the design, and the time spent running the test shifts from hours to days. "If we were designing systems doing 1,000 cores," Devadas said, "we would need more computers, and need to run them in parallel."
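The arithmetic behind that shift is simple to sketch. In the snippet below, the per-core instruction count comes from Devadas' example; the simulator throughput is a hypothetical figure chosen only to show how runtime grows with core count, not a measured number for Hornet.

```python
# Back-of-envelope simulation cost. Throughput is an assumed figure
# (simulated instructions per second on one machine), for illustration only.
INSTRUCTIONS_PER_CORE = 1_000_000   # ~1M instructions per thread (per core)
SIM_THROUGHPUT = 10_000             # assumed simulated instructions/second

def sim_hours(cores):
    total_instructions = cores * INSTRUCTIONS_PER_CORE
    return total_instructions / SIM_THROUGHPUT / 3600

for n in (64, 256, 1000):
    print(f"{n:>4} cores: {n * INSTRUCTIONS_PER_CORE:>13,} instructions, "
          f"~{sim_hours(n):.1f} h on one machine")
```

Whatever the real throughput, the total work scales linearly with core count, which is why a 1,000-core design pushes the team toward running simulations on many machines in parallel.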

Testing larger numbers of cores is key to another project of the MIT research team—design and fabrication of a new multicore architecture chip called an execution migration machine. In the planned architecture, Devadas said, the data being processed stays in one place, but the context of processing moves from one core to another. "We've gotten to the point where we've gotten confident in the capabilities of the architecture by using Hornet to test on a 64-core design and beyond," he said. The goal is to build a chip with over 100 cores—possibly as many as 128, though the final number hasn't been determined yet.

Sean Gallagher / Sean is Ars Technica's IT Editor. A former Navy officer, systems administrator, and network systems integrator with 20 years of IT journalism experience, he lives and works in Baltimore, Maryland.