The core is designed for low operating power, consuming just 37 µW per MHz when implemented in 0.18-µm technology. Its pipelined Harvard RISC architecture makes it four times as fast as the original implementation. Because instruction fetch and data-memory transfers overlap in a multi-stage pipeline, the next instruction is fetched from program memory while the current instruction executes with data from data memory. Most instructions execute in one system clock cycle. The exceptions are instructions that operate directly on the program counter, which need one additional clock cycle to clear and refill the pipeline.
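
The timing and power figures above can be turned into a rough estimate. The following is a minimal sketch, not a vendor tool: the function names and the `modifies_pc` flag are hypothetical, and it assumes that power scales linearly with clock frequency (as the µW-per-MHz figure suggests) and that every instruction operating on the program counter pays the one-cycle pipeline refill penalty.

```python
# Rough cycle and power estimate for the core described above.
# All names here are illustrative assumptions, not part of any vendor API.

POWER_UW_PER_MHZ = 37.0  # figure quoted in the text for 0.18-µm technology


def core_power_uw(clock_mhz):
    """Estimated core power in µW, assuming linear scaling with clock frequency."""
    return POWER_UW_PER_MHZ * clock_mhz


def cycle_count(instructions):
    """Total clock cycles for a sequence of instructions.

    `instructions` is an iterable of booleans: True if the instruction
    operates directly on the program counter (assumed to cost 2 cycles:
    one to execute, one to clear and refill the pipeline), False otherwise
    (1 cycle).
    """
    return sum(2 if modifies_pc else 1 for modifies_pc in instructions)


if __name__ == "__main__":
    # Three ordinary instructions followed by one jump.
    program = [False, False, False, True]
    print(cycle_count(program), "cycles")
    print(core_power_uw(10.0), "µW at 10 MHz")
```

Under these assumptions, code with many jumps or calls runs closer to two cycles per instruction, so the one-cycle figure is best read as the typical case for straight-line code rather than a guarantee.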