High-performance embedded computing -- Power and energy consumption

Editor's Note: Interest in embedded systems for the Internet of Things often focuses on physical size and power consumption. Yet, the need for tiny systems by no means precludes expectations for greater functionality and higher performance. At the same time, developers need to respond to growing interest in more powerful edge systems able to mind large stables of connected systems while running resource-intensive algorithms for sensor fusion, feedback control, and even machine learning. In this environment and embedded design in general, it's important for developers to understand the nature of embedded systems architectures and methods for extracting their full performance potential. In their book, Embedded Computing for High Performance, the authors offer a detailed look at the hardware and software used to meet growing performance requirements.

Power dissipation and energy consumption are critical concerns in most embedded systems, so it is important to be aware of the techniques that affect them and, above all, those that reduce them. Dynamic voltage and frequency scaling (DVFS) [30], dynamic frequency scaling (DFS), dynamic voltage scaling (DVS), and dynamic power management (DPM) are architecture- and hardware-level techniques for reducing power and energy consumption.

Power consumption is measured in watts (W) and directly affects system heat (temperature) and the possible need for cooling schemes. The total power consumption [Note: Although the term “power dissipation” is more appropriate, the term “power consumption” is widely used.] of a CMOS integrated circuit (IC) is the sum of the static power and the dynamic power, as represented by Eq. (2.6).

Ptotal = Pstatic + Pdynamic    (2.6)

Short circuits and leakage currents are responsible for power consumption even when transistor devices are not switching. The static power consumption (Pstatic) can be calculated by Eq. (2.7), where Vcc represents the supply voltage (sometimes also represented as Vdd) and Icc (sometimes represented as Isc) represents the overall current flowing through the device, which is given by the sum of the leakage currents. The static power depends mainly on the area of the IC and can be decreased by disconnecting some parts of the circuit from the supply voltage and/or by reducing the supply voltage.

Pstatic = Vcc × Icc    (2.7)

The dynamic power consumption (Pdynamic) can be calculated by Eq. (2.8), where Vcc represents the supply voltage, β represents the activity factor, CL represents the load capacitance, and f denotes the clock frequency at which the device is operating. Pdynamic is proportional to the switching activity of the transistors in the IC. Thus, the dynamic power can be reduced by making regions of the IC inactive and/or by reducing Vcc and/or f.

Pdynamic = β × CL × Vcc² × f    (2.8)
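The relationships described above (total power as the sum of static and dynamic power, static power as supply voltage times leakage current, and dynamic power as the product of activity factor, load capacitance, squared supply voltage, and clock frequency) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and all numeric values are hypothetical and do not come from the text or from any datasheet.

```python
def static_power(vcc: float, icc: float) -> float:
    """Static power in watts: supply voltage (V) times total leakage current (A)."""
    return vcc * icc


def dynamic_power(beta: float, c_load: float, vcc: float, f: float) -> float:
    """Dynamic power in watts: activity factor x load capacitance (F) x Vcc^2 x clock (Hz)."""
    return beta * c_load * vcc ** 2 * f


def total_power(vcc: float, icc: float, beta: float, c_load: float, f: float) -> float:
    """Total power in watts: the sum of the static and dynamic contributions."""
    return static_power(vcc, icc) + dynamic_power(beta, c_load, vcc, f)


# Hypothetical operating conditions, chosen only for illustration.
VCC = 1.2       # supply voltage in volts
ICC = 0.010     # total leakage current in amperes
BETA = 0.2      # activity factor (fraction of the capacitance switched per cycle)
C_LOAD = 1e-9   # load capacitance in farads
F_CLK = 300e6   # clock frequency in hertz

if __name__ == "__main__":
    print("static  [W]:", static_power(VCC, ICC))
    print("dynamic [W]:", dynamic_power(BETA, C_LOAD, VCC, F_CLK))
    print("total   [W]:", total_power(VCC, ICC, BETA, C_LOAD, F_CLK))
```

With these assumed values the static part is about 12 mW and the dynamic part about 86.4 mW. Halving Vcc in the call to `dynamic_power` while leaving everything else unchanged cuts the dynamic part to a quarter, which is the quadratic dependence on the supply voltage.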

When reducing the dynamic power by lowering the frequency and/or the supply voltage, it is common to target the supply voltage first, because the dynamic power depends on it quadratically (Pdynamic ∝ Vcc²). Reducing the clock frequency has a clear negative impact on performance: components operating at a lower clock frequency yield longer execution times, increased latencies, or lower throughput. Moreover, reducing the supply voltage may also require a reduction of the clock frequency. Typically, a system provides a table of the discrete supply voltages it can operate at, along with the corresponding maximum clock frequencies (known as a frequency-voltage table). Thus, the supply voltage (Vcc) can be seen as a linear function of the clock frequency, as depicted in Eq. (2.9), and the dynamic power consumption is then directly proportional to the cube of the clock frequency (Pdynamic ∝ f³), as shown in Eq. (2.10).
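To make the cubic relationship concrete, the following sketch assumes the simplest linear model, Vcc = alpha × f, where the proportionality constant `alpha` is a name introduced here for illustration and may differ from the notation of the original equations. All numeric values are hypothetical. The sketch compares two clock frequencies and also reports the effect on execution time for a fixed number of clock cycles, together with the resulting dynamic energy (power multiplied by time).

```python
def supply_voltage(alpha: float, f: float) -> float:
    """Assumed linear model: supply voltage (V) proportional to clock frequency (Hz)."""
    return alpha * f


def dynamic_power_scaled(beta: float, c_load: float, alpha: float, f: float) -> float:
    """Dynamic power (W) when Vcc follows the clock: beta * CL * (alpha * f)^2 * f."""
    vcc = supply_voltage(alpha, f)
    return beta * c_load * vcc ** 2 * f


def execution_time(cycles: float, f: float) -> float:
    """Execution time in seconds for a fixed number of clock cycles."""
    return cycles / f


# Hypothetical parameters, chosen only for illustration.
BETA = 0.2        # activity factor
C_LOAD = 1e-9     # load capacitance in farads
ALPHA = 4e-9      # volts per hertz, i.e. 1.2 V at 300 MHz
CYCLES = 3e8      # clock cycles needed by the workload
F_HIGH = 300e6    # full-speed clock in hertz
F_LOW = 150e6     # half-speed clock in hertz

p_high = dynamic_power_scaled(BETA, C_LOAD, ALPHA, F_HIGH)
p_low = dynamic_power_scaled(BETA, C_LOAD, ALPHA, F_LOW)
t_high = execution_time(CYCLES, F_HIGH)
t_low = execution_time(CYCLES, F_LOW)

power_ratio = p_high / p_low                          # cubic dependence on f
time_ratio = t_low / t_high                           # the slower clock takes longer
energy_ratio = (p_high * t_high) / (p_low * t_low)    # dynamic energy only

if __name__ == "__main__":
    print("power ratio :", power_ratio)
    print("time ratio  :", time_ratio)
    print("energy ratio:", energy_ratio)
```

Under this model, halving the clock reduces the dynamic power by a factor of about eight while doubling the execution time, so the dynamic energy for the same work falls by a factor of about four. Static power is deliberately left out of the sketch; because it is consumed for the whole, longer execution time, the net saving in a real system would be smaller.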

The tuple consisting of the voltage and the corresponding maximum clock frequency is known as the operating performance point (OPP), OPP = (f,V).
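A frequency-voltage table can be represented directly as a list of OPP tuples. The sketch below shows one possible way to pick the lowest operating point that still satisfies a required clock frequency. The table contents, the `OPP` type, and the `select_opp` helper are all hypothetical and do not correspond to any particular device or vendor API.

```python
from typing import NamedTuple, Optional, Sequence


class OPP(NamedTuple):
    """Operating performance point: maximum clock frequency and its supply voltage."""
    freq_hz: float
    volt: float


# Hypothetical frequency-voltage table, not taken from any datasheet.
HYPOTHETICAL_OPPS = [
    OPP(100e6, 0.9),
    OPP(200e6, 1.0),
    OPP(400e6, 1.1),
    OPP(600e6, 1.2),
]


def select_opp(opps: Sequence[OPP], required_hz: float) -> Optional[OPP]:
    """Return the slowest OPP whose frequency meets the requirement, or None."""
    candidates = [opp for opp in opps if opp.freq_hz >= required_hz]
    if not candidates:
        return None  # no operating point is fast enough
    # The slowest adequate point also has the lowest voltage in this table.
    return min(candidates, key=lambda opp: opp.freq_hz)


if __name__ == "__main__":
    print(select_opp(HYPOTHETICAL_OPPS, 250e6))
    print(select_opp(HYPOTHETICAL_OPPS, 700e6))
```

Choosing the slowest adequate point is only one possible policy. It relies on the assumption, true for this made-up table, that a lower maximum frequency always comes with a lower supply voltage.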

FREQUENCY-VOLTAGE PAIRS

The operating performance points (OPPs) depend on the components (e.g., CPUs) and the support included in the system implemented in the IC.

For example, the Texas Instruments OMAP-L138 [a] IC includes an ARM9 RISC processor and a VLIW DSP. The following table presents the OPPs for three subsystems of the IC: the ARM processor, the DSP, and the RAM.

Intel also provides control over the processor’s operating frequency and supply voltage through Enhanced Intel SpeedStep Technology [b].