Researchers at the Massachusetts Institute of Technology have developed a 36-core processor in an effort to find new ways to eke more performance out of chips.

The chip is designed to reduce the number of cycles required to execute tasks by enabling data transfers between cores and cache in a more coherent manner, said Bhavya Daya, a Ph.D. candidate in MIT’s Department of Electrical Engineering and Computer Science. With the help of mini-routers, MIT researchers have devised a novel way to reroute data packets to free up bandwidth within multicore chips, Daya said. The research could benefit highly parallel applications such as financial analytics and particle simulation studies.

The chip research revolves around implementing a “shadow network” so cache in specific cores can anticipate data packets. Large data sets received by chips are typically broken down and distributed across multiple cores, each of which has its own cache to store data temporarily. If a core needs specific data, a request is broadcast to the other cores on the chip.

But the broadcasts consume bandwidth unnecessarily, so the MIT researchers are enabling more direct communication between cores and cache. The goal is to “force” ordering within a multicore chip so the cache can anticipate and prioritize data packets, Daya said.
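To put rough numbers on the bandwidth cost of broadcasting, the toy Python model below counts message deliveries only. It assumes, purely for illustration, that a broadcast request is delivered to every other core while a directed request goes to a single destination. The function names are hypothetical, and nothing here reflects the actual protocol used on the MIT chip.

```python
def broadcast_deliveries(num_cores: int, num_requests: int) -> int:
    """Deliveries when every request is sent to all other cores (assumed model)."""
    return num_requests * (num_cores - 1)


def directed_deliveries(num_requests: int) -> int:
    """Deliveries when every request goes to exactly one destination (assumed model)."""
    return num_requests


if __name__ == "__main__":
    # Compare the two assumed models for the core counts mentioned in the article.
    for cores in (36, 64):
        b = broadcast_deliveries(cores, 1000)
        d = directed_deliveries(1000)
        print(f"{cores} cores: {b} broadcast deliveries vs. {d} directed")
```

In this simplified count, the cost of a broadcast grows with the number of cores while a directed request does not, which is the intuition behind reducing broadcast traffic.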

The shadow network lines up data transfers in a more orderly fashion, which ensures better cache coherency. Messages and data packet requests sent between cores are more pointed and specific, which also frees up bandwidth and reduces the overhead to execute tasks.
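The article does not describe how the ordering is enforced, so the Python sketch below shows only the general idea of an agreed order. It assumes, hypothetically, that each request carries a time window and a source core ID, and that every core sorts by those two fields regardless of the order in which requests happen to arrive. The names and the sorting rule are illustrative assumptions, not the researchers' design.

```python
import random
from typing import List, Tuple

# A request is (time_window, source_core, address). All fields are hypothetical.
Request = Tuple[int, int, str]


def agreed_order(received: List[Request]) -> List[Request]:
    """Sort by time window, then by source core ID, ignoring arrival order.

    Assumes at most one request per core per window, so the key is unique.
    """
    return sorted(received, key=lambda r: (r[0], r[1]))


def simulate(num_cores: int, requests: List[Request], seed: int = 0) -> List[List[Request]]:
    """Deliver the same requests to every core in a shuffled order, then let
    each core apply the agreed ordering rule."""
    rng = random.Random(seed)
    views = []
    for _ in range(num_cores):
        arrival = requests[:]
        rng.shuffle(arrival)  # each core sees its own arrival order
        views.append(agreed_order(arrival))
    return views


if __name__ == "__main__":
    reqs = [(0, 5, "A"), (0, 2, "B"), (1, 7, "A"), (1, 0, "C")]
    views = simulate(num_cores=4, requests=reqs)
    # Every core ends up with the same processing order.
    print(all(v == views[0] for v in views))
```

Because every core applies the same rule to the same set of requests, all cores process them in the same sequence even though the arrival order differs from core to core.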

With the shadow network, MIT measured performance improvements of 24.1 percent and 12.9 percent in 36-core and 64-core simulations, respectively, compared to similar chips without shadow network implementations, Daya said.

The 36-core chip had a mesh design, with Power architecture cores interconnected in a square grid. The chip was made using a 45-nanometer process, and the cores were supplied by Freescale Semiconductor.
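A square layout of 36 cores implies a 6-by-6 grid. As a rough illustration of what that means for on-chip traffic, the Python sketch below computes the distance between cores in hops, assuming each message travels along grid rows and columns (Manhattan distance). The routing assumption and the function names are hypothetical and are not taken from the chip's design.

```python
from itertools import product
from typing import Tuple

SIDE = 6  # 36 cores arranged in a square implies a 6 x 6 grid

Coord = Tuple[int, int]


def hops(a: Coord, b: Coord) -> int:
    """Hops between two cores, assuming travel along rows and columns only."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mesh_stats(side: int = SIDE) -> Tuple[int, float]:
    """Return (maximum hops, average hops) over all pairs of distinct cores."""
    cores = list(product(range(side), repeat=2))
    distances = [hops(a, b) for a in cores for b in cores if a != b]
    return max(distances), sum(distances) / len(distances)


if __name__ == "__main__":
    worst, average = mesh_stats()
    print(f"worst case: {worst} hops, average: {average:.2f} hops")
```

Under these assumptions, a message between opposite corners crosses 10 links, and the average over all pairs of distinct cores is 4 hops.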

The chip is for research purposes and is unlikely to become available. The researchers’ next step is to look at different chip architectures and see whether the shadow network implementation can scale to hundreds or thousands of cores, Daya said.

Details about the chip were shared during a presentation at the International Symposium on Computer Architecture in Minneapolis this week.

MIT is looking at a number of ways in which memory and traffic can be restructured to improve throughput. MIT last year developed a 110-core processor that used shared memory across cores with no cache.