Commentary: How manycore will reshape EDA

It is difficult to imagine a world without the smartphones, tablets, e-readers and game consoles that are pervasive in our lives today. Behind such innovations are embedded systems. To provide the automation and speed needed to build them, EDA companies are taking advantage of the parallelism offered by multiple cores. All claim some multithreading and multicore capability, and all are porting more of their code to multithreaded implementations.

This is certainly no easy task. EDA code is complicated, and EDA developers must work within Amdahl's Law, which states that the speedup gained from parallelizing a program is capped by whatever portion of it cannot be parallelized. Thus, if 90 percent of the runtime is parallelized and 10 percent is not, the maximum possible speedup is 10x, no matter how many cores are available.
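A few lines of Python make the arithmetic concrete (an illustration of the standard formula, not anything tool-specific):

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Maximum speedup predicted by Amdahl's Law."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)

print(round(amdahl_speedup(0.9, 4), 2))          # 3.08 on four cores
print(round(amdahl_speedup(0.9, 1_000_000), 2))  # 10.0: the asymptotic ceiling
```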

Since all EDA flows include some nonparallelizable code, it is unreasonable to expect a 4x gain on four CPUs. Hence the need for holistic parallel processing in EDA design flows. Thus far, the work has focused on tackling “embarrassingly” parallel applications and using coarse-grained partitioning. But EDA applications and the EDA workflow as a whole will not achieve the necessary scaling by focusing only on problems that are easy to parallelize.

Digital IC implementation alone involves many design flow steps from netlist to tapeout. For users of EDA tools in general, and of digital IC implementation tools in particular, to gain the benefit of multicore chips and multiprocessor machines, every step in the flow will need to support parallel computation.

The need for holistic parallel processing extends to every major segment of the EDA industry, but digital IC implementation software is a representative example. Coarse-grained partitioning alone will not suffice, because coarse-grained approaches require a serial step in which the work is partitioned and another in which the partitioned pieces are reassembled. For large digital designs, that partitioning becomes the bottleneck. Nearly every step of the workflow needs to run in parallel, and to achieve that, the fine-grained problems that are hardest to parallelize are precisely the ones that most urgently require innovation.
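To see where the serial tax comes from, consider this schematic sketch of a coarse-grained step; the function names are hypothetical and stand in for real implementation work:

```python
from multiprocessing import Pool

def optimize_region(region):
    # Hypothetical per-partition work, e.g., optimizing one block.
    return sorted(region)

def coarse_grained_step(netlist, n_workers=4):
    # Serial step 1: partition the design (part of Amdahl's serial fraction).
    chunk = max(1, len(netlist) // n_workers)
    regions = [netlist[i:i + chunk] for i in range(0, len(netlist), chunk)]

    # Parallel step: each worker optimizes one coarse partition.
    with Pool(n_workers) as pool:
        results = pool.map(optimize_region, regions)

    # Serial step 2: reassemble the partitions (a second serial bottleneck).
    return [cell for region in results for cell in region]

if __name__ == "__main__":
    print(coarse_grained_step(list(range(16))))
```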

For multicore architectures with four to eight cores, creative techniques can be applied to get legacy code to run in parallel without excessive recoding. Cadence Design Systems has applied techniques that use fine-grained distributed processing to speed up compute-intensive pieces of serial programs, and most of the code in the Cadence netlist-to-GDSII flow has been parallelized within the limits of Amdahl's Law.
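As a toy illustration of the fine-grained idea (a sketch, not the Cadence implementation), only the compute-intensive inner loop of an otherwise serial program is farmed out to worker processes:

```python
from concurrent.futures import ProcessPoolExecutor

def delay_calc(n):
    # Hypothetical compute-intensive unit of work, e.g., one timing path.
    return sum(i * i for i in range(n))

def analyze(paths):
    # The surrounding program stays serial; only the hot loop runs in parallel.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(delay_calc, paths))

if __name__ == "__main__":
    print(analyze([10_000, 20_000, 30_000]))
```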

For example, floor planning for complex giga-gate/GHz designs can require many months of designer time. By using a combination of coarse-grained and fine-grained processing capabilities in the Cadence digital flow, designers not only can abstract giga-gate/GHz designs overnight but also can create multiple floor plans in parallel.

Similarly, in the routing, extraction and physical verification engine, where much of the code leverages fine-grained distributed processing and was written with parallel processing in mind, speedups of 3x on four CPUs and 6x on eight CPUs have been observed.
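Plugging those figures back into Amdahl's Law gives a rough sense of the parallel fraction each result implies; this is a back-of-envelope check, not a published figure:

```python
def implied_parallel_fraction(speedup, n_cores):
    # Invert Amdahl's Law, S = 1 / ((1 - P) + P / n), for P.
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_cores)

print(round(implied_parallel_fraction(3.0, 4), 2))  # ~0.89 parallel
print(round(implied_parallel_fraction(6.0, 8), 2))  # ~0.95 parallel
```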

This is the kind of scalability that multithreaded architectures can provide, and it is imperative that EDA companies take full advantage of it.

What remains to be seen is how EDA will reshape itself to exploit manycore processors (32 cores or more). Many of today's fine-grained solutions will not scale on manycore; many legacy applications will have to be retrofitted, and others will have to be rewritten.

Such rearchitecting will result in a new paradigm for EDA companies. We will see a focus not only on modifying tools and algorithms for manycore processors and GPUs but also on distributing them across server farms and computing clouds. Clusters and clouds built from manycore processors will provide opportunities for unparalleled compute resources.

EDA on the cloud is an exciting prospect. Cadence is using the software-as-a-service (SaaS) model to provide hosted design solutions, and it offers EDA software and intellectual property evaluation through the Xuropa cloud. At last year’s Design Automation Conference, Bernie Meyerson, vice president of innovation at IBM, suggested cloud computing would become the dominant standard for EDA companies.

Certainly, cloud computing for EDA comes with its own challenges, including security, resource management and licensing issues.

The real challenge, of course, is the parallelization itself. Parallelizing complex algorithms requires detailed insight into all of their dependencies, a demanding task even for experts. Add to this significant amounts of legacy code written by engineers who have since left the company, along with growing amounts of open-source software; as with legacy code, few engineers are familiar with the "inner life" of these parts.
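A toy example shows the kind of hidden dependency that defeats naive parallelization; it is illustrative only, not drawn from any EDA code base. Each iteration reads the previous iteration's result, so the loop cannot simply be split across workers:

```python
def smooth(signal, alpha=0.5):
    # Loop-carried dependency: out[i] depends on out[i - 1], so the
    # iterations must run in order; splitting them changes the answer.
    out = [signal[0]]
    for x in signal[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out

print(smooth([1.0, 2.0, 3.0, 4.0]))
```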
Users will welcome the increased performance, but they will not accept regressions. That makes the shift to manycore a complex and challenging endeavor.

If EDA manages the coming changes well, however, users will benefit from the inexpensive parallel computing possible with manycore and the scalable compute resources available in the age of cloud computing.

About the author

Abha Maheshwari is a product manager in the Silicon Realization group at Cadence Design Systems, where she is responsible for design exploration and planning technologies. She holds an MS from the University of California, Santa Barbara and a bachelor’s degree in electrical engineering from the Indian Institute of Technology Bombay (Mumbai).
