Architecting For Efficiency

By definition, to be efficient is to perform or function in the best possible manner with the least waste of time and effort; having and using requisite knowledge, skill, and industry. As this relates to SoC design today, achieving the highest level of efficiency is a challenge with many dimensions.

Efficiency has multiple dimensions. “One dimension would be power consumption,” said Oz Levia, vice president of marketing and business development at Jasper Design Automation. “Lower power consumption is always better, but you have to consider there are orders of magnitude that are important in that area, whether it is mobile or rack or wall power and so forth. Together with power, you also have to consider heat dissipation. That is important regardless of the application, for cooling or consideration of that sort.”

Another dimension is silicon. “Silicon area is relatively cheap, but it’s not free,” Levia said. “You want to make sure you have efficiency in terms of silicon area, and it doesn’t always mean the smallest area because sometimes selecting to go with a smaller node can yield smaller area but it could be inefficient economically, maybe because it’s a much more complicated process with more mask costs and more yield issues and higher NRE, or maybe higher production costs. You have to weigh that in terms of the cost benefit, depending on the volume you’re looking at.”
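The volume dependence Levia describes can be made concrete with a rough sketch. Everything below is an assumption for illustration only: the two process options, their NRE figures, wafer costs, die counts and yields are hypothetical and do not come from the interview. The model simply spreads NRE over the production volume and adds the manufacturing cost of one good die.

```python
def unit_cost(wafer_cost, dies_per_wafer, yield_fraction):
    """Manufacturing cost of one good die (wafer cost spread over good dies)."""
    return wafer_cost / (dies_per_wafer * yield_fraction)


def cost_per_chip(nre, unit, volume):
    """NRE amortized over the production volume, plus the per-die cost."""
    return nre / volume + unit


def break_even_volume(nre_a, unit_a, nre_b, unit_b):
    """Volume at which option B (higher NRE, cheaper die) matches option A."""
    return (nre_b - nre_a) / (unit_a - unit_b)


# Hypothetical numbers: a mature node versus a smaller, costlier node.
mature_nre, mature_unit = 2_000_000, unit_cost(3600, 500, 0.90)
advanced_nre, advanced_unit = 10_000_000, unit_cost(5400, 1200, 0.75)

for volume in (1_000_000, 10_000_000):
    mature = cost_per_chip(mature_nre, mature_unit, volume)
    advanced = cost_per_chip(advanced_nre, advanced_unit, volume)
    print(f"{volume:>10} units: mature {mature:.2f}, advanced {advanced:.2f}")

print("break-even volume:",
      break_even_volume(mature_nre, mature_unit, advanced_nre, advanced_unit))
```

With these made-up inputs the smaller node only pays off above roughly four million units. Below that volume, the higher NRE outweighs the cheaper die, which is the kind of cost-benefit weighing the quote refers to.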

The cost of materials in terms of IP is a third dimension of designing an SoC efficiently today. “It’s build vs. buy, but it’s also what do you put on there?” Levia said. “And what do you put on there in terms of software or hardware IP? What would come into play here is understanding of the application that you are considering.”

Steve Roddy, Xtensa product line group director at Cadence, agreed. “When you are putting all of this IP down, there are a ton of choices to make about memory sizes, distribution of memory, locality of data at any one given time, bus and fabric connectivity. Let’s say it’s a cell phone apps processor thing or for a tablet. It’s multitasking. There are scenarios for which you want to optimize your multitasking — can your phone effectively have a phone call, allow you to quickly hit the button and go to the home screen, do a Web search, get the result back while you are talking to your buddy and figuring out which movie theater you’re going to meet at? Do you upsize the buses and do you upsize the sizes of local caches and secondary caches and on-chip memory to be able to make that more streamlined? Or do you have a cost-down version where there are fewer resources, and hence it doesn’t multitask as well and doesn’t switch as elegantly between tasks? That works just fine if you are doing one thing, but when you shift around three or four times, it doesn’t. Those are system-level decisions that people are going to make. Maybe the guy doing the cost-down apps processor cuts corners because he’s just trying to sell into the cheaper market at the time. And the person who buys the $89 smartphone doesn’t expect it to elegantly multitask like the iPhone 5.”

Simply throwing all of the IP together is not that easy, and Roddy said it comes down to a cost versus functionality tradeoff. “For example, you could oversize buses or have more point-to-point network-on-chip connectivity and bigger memories. But if the bottleneck is simply solved, you’re not solving the right problem and you could add cost to your product and not make the user experience better. Or you could simply say, ‘Look, I don’t care. I’m trying to make a cheap one. My users are willing to wait three or four seconds while they switch from one app to the next app because they bought an $89 thing, not an $800 thing.’ And clearly there’s a market for that. If you’re going to give a tablet to your six year old who is liable to spill juice on it and drop it, you don’t expect it to be what you or I would want. The whole question of how you do the system analysis or who does the system analysis is really up in the air. The third-tier apps guy probably isn’t doing any. He’s probably doing a minimal use case, slimming down the resources, cutting the memory in half, lowering the chip price by $5, and off he goes. Whereas the guys at Apple, they are probably doing all kinds of stuff, starting with system simulation, SystemC simulation, working their way down to emulation and prototyping, whether it be FPGA or maybe they are buying the big Palladium boxes where you can load your entire SoC into a hardware emulator and run software, run the emulator platform and in the course of maybe a day [simulate] an hour’s worth of use of the phone. And now you can get all the statistics about where the bottlenecks are and what’s wrong and what’s right.”

Underdesign vs. overdesign
For the SoC architect, the challenge of designing the most efficient system possible boils down to avoiding both underdesign and overdesign.

“The architect’s role in this area is pretty important,” said Pat Sheridan, senior staff product marketing manager in the system level team at Synopsys. “They are coming up with a technical spec for the SoC that has to meet certain marketing requirements that include performance goals and power constraints and cost constraints, and they have to be able to look at all of these things. Once they have their recommended specification and configuration, then this information is used by the hardware and software teams to implement the architecture. So it’s really important for that architect to be thinking not only about performance issues, which is perhaps more commonly what people think about for the early architecture prototyping simulation, but they also need to think about the power considerations in the early phases.”

Put simply, the challenge is underdesign versus overdesign. “If you think about the performance side of the question, what they are trying to do is make sure the product performs well enough and they avoid the risk of underdesign,” Sheridan said. “But power is just the opposite. They want to avoid overdesign that can cause the power problems. They want to avoid the kind of dumb mistakes that put the system into a power mode accidentally.”

The mobility crunch
Power limits performance. At the same time, SoC architects work from the premise that each market segment carries its own specific power requirements.

“For example, mobility demands power at 1 watt,” said Anand Iyer, director of product marketing for the low power platform at Calypto. “That’s the constraint by which the SoC architects are looking at power and are trying to conform to that metric, because if you go over that metric you can’t play in that market anymore. With that bar in mind, they look at how they can optimize all the other metrics. That’s the key challenge now. It’s no longer a game of ‘my performance, my functionality trumps yours.’ It’s more, ‘I have this bar and how can I maximize my functionality and performance and show differentiation against the competition’. This metric drives design starts in discrete market segments. Once they fix this bar, they determine how they can achieve it with what they have.”
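One way to picture the fixed bar Iyer describes is as a budget check that every proposed feature has to pass. The sketch below is illustrative only: the 1 watt cap is the figure from the quote, but the block names and milliwatt values are hypothetical, and a real budget would come from power estimation tools rather than a hand-written table.

```python
POWER_CAP_MW = 1000  # the 1 watt mobility bar from the quote, in milliwatts


def headroom_mw(block_power_mw, cap_mw=POWER_CAP_MW):
    """Remaining budget in mW; a negative value means the cap is exceeded."""
    return cap_mw - sum(block_power_mw.values())


def fits_budget(block_power_mw, cap_mw=POWER_CAP_MW):
    """True if the summed block power stays at or under the cap."""
    return headroom_mw(block_power_mw, cap_mw) >= 0


# Hypothetical per-block estimates for a baseline configuration.
baseline = {"cpu": 450, "gpu": 300, "modem": 150, "display_ctrl": 60}

# Hypothetical differentiating feature: an extra DSP block.
with_dsp = dict(baseline, dsp=80)

for name, config in (("baseline", baseline), ("with_dsp", with_dsp)):
    print(name, "headroom (mW):", headroom_mw(config),
          "fits:", fits_budget(config))
```

In this made-up case the extra block pushes the design over the bar, so the architect would have to trim power elsewhere before adding it.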

Consider, for example, a typical wearable device in an IoT application that may contain an MCU (or a processor core) integrated with analog peripherals, such as temperature and light sensors, an accelerometer, analog-to-digital converters, a communication interface, a Bluetooth wireless transceiver, and an LCD display and LEDs.

“Every component has to be very low-cost and highly energy-efficient,” said Vic Kulkarni, senior vice president and general manager of Ansys-Apache Design. “There is not much power available on-board for IoT devices. A low-cost IoT MCU must have a low gate-count and code footprint to achieve a 32-bit like performance at an 8-bit price point. These low-cost MCUs can be targeted for sensors and smart controls for home appliances, medical monitoring, lighting, motor control, power, and metering.”

Analog content in IoT devices poses additional challenges because it is usually implemented bottom-up without explicit low-power specifications, leaving transistor-level simulation as the only verification option, Kulkarni explained.

As such, designers of ultra-low power IoT systems and sub-systems demand EDA tool flows that enable them to manage multiple supply voltages, power shut-off with or without state retention, adaptive and dynamic frequency scaling, and body biasing, Kulkarni said. “In a pure digital design, implementation and verification of these low-power techniques should be highly automated in a top-down methodology using the IEEE 1801 UPF specifications and so on.”
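To see why supply voltage and frequency are such effective knobs, consider the standard first-order model of CMOS dynamic power, P = α · C · V² · f. The sketch below applies it to two operating points. The activity factor, switched capacitance, voltages and frequencies are assumed values chosen for illustration, and leakage power is ignored entirely, so this is a rough estimate rather than anything a sign-off flow would produce.

```python
def dynamic_power_w(activity, capacitance_f, voltage_v, frequency_hz):
    """First-order CMOS dynamic power: alpha * C * V^2 * f (leakage ignored)."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical operating points for the same block.
nominal = dynamic_power_w(activity=0.2, capacitance_f=1e-9,
                          voltage_v=1.0, frequency_hz=1e9)
scaled = dynamic_power_w(activity=0.2, capacitance_f=1e-9,
                         voltage_v=0.8, frequency_hz=0.5e9)

print(f"nominal: {nominal:.3f} W")
print(f"scaled:  {scaled:.3f} W")
print(f"ratio:   {scaled / nominal:.2f}")
```

Because voltage enters as a square, dropping the supply by 20% while halving the clock cuts dynamic power to about a third of nominal in this model, a larger saving than frequency scaling alone would give.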

For IoT system designers, key challenges are:

• Best practices and methodology for low power mixed-signal functions for sensors;
• Best practices and methodology for digital ultra-low power IP and IP-based SoC design;
• Standardization of protocols between sensor controls and data processing;
• Rapid co-verification of the hardware-software system (for example, the new generation of FPGAs such as the Zynq family from Xilinx combines ARM Cortex-A9 MPCore processors with programmable logic, which requires hardware-software co-design), and
• Embedded software management.

Kulkarni asserted that EDA tools must solve the complete problem of energy-efficient IoT system design. A flow that includes the chip, package and system, which Apache refers to as CPS, allows designers to look at the design from system-level specification through power budgeting at RTL to final power sign-off at the IP, SoC, package and board levels.

System architects should think about ways to “predict” effects early in the design flow, including system-interconnect effects such as board traces, transmission line effects, reflection and termination, in addition to the impact of SoC interconnects (e.g., dynamic voltage drop and signal electromigration). For IoT designs in particular, the importance of CPS simulation is heightened by both performance and energy-efficiency requirements, Kulkarni said.