2
The Raw Microprocessor
Wire delay is emerging as the natural limiter to microprocessor scalability. A new architectural approach could solve this problem, as well as deliver unprecedented performance, energy efficiency, and cost effectiveness.

3
Problem: How to leverage growing quantities of chip resources even as wire delays become substantial?
Scalable ISA:
Provide a parallel software interface to the gate, wire, and pin resources of the chip.
Allow programmers more control of physical resources to achieve maximum performance and energy efficiency.

4
Technology Trends
Until recently, the abstraction of a wire as an instantaneous connection between transistors has shaped assumptions and architectural designs.
Today, however, it takes on the order of two clock cycles for a signal to travel from edge to edge of a 2-GHz processor die.
Processor manufacturers have striven to maintain high clock rates in spite of the increased impact of wire delay, but materials and process changes have not been sufficient to solve the problem.
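The two-cycle figure can be turned into absolute time with simple arithmetic. The sketch below is illustrative only: the function names are ours, and the 4 GHz case is a hypothetical that assumes the physical wire delay stays the same while the clock speeds up.

```python
# Back-of-the-envelope wire-delay arithmetic. Function names are illustrative.

def period_ns(freq_ghz):
    """Clock period in nanoseconds for a clock frequency given in GHz."""
    return 1.0 / freq_ghz

def cycles_to_cross(delay_ns, freq_ghz):
    """Clock cycles consumed by a signal that needs delay_ns to cross the die."""
    return delay_ns / period_ns(freq_ghz)

if __name__ == "__main__":
    # From the slide: roughly two cycles edge to edge at 2 GHz.
    crossing_ns = 2 * period_ns(2.0)
    print(f"edge-to-edge delay: about {crossing_ns} ns")
    # Hypothetical: same physical delay, clock doubled to 4 GHz.
    print(f"cycles at 4 GHz: {cycles_to_cross(crossing_ns, 4.0)}")
```

The point of the second calculation is that a fixed wire delay costs more cycles as the clock rate rises, which is why faster clocks alone do not hide the problem.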

8
The tiles are interconnected by four 32-bit full-duplex on-chip networks, consisting of over 12,500 wires. Each tile connects only to its four neighbors, so the longest wire in the system is no longer than the length or width of a tile. This property ensures high clock speeds and the continued scalability of the architecture.
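The nearest-neighbor wiring can be modeled in a few lines of Python. This is an illustrative sketch, not the chip's actual routing logic: the 4x4 grid size, the names `neighbors` and `hops`, and the assumption that a message crosses one tile per hop are ours.

```python
# Illustrative model of a tiled mesh; grid size and names are assumptions.
ROWS, COLS = 4, 4  # assumed 4x4 array (16 tiles)

def neighbors(row, col, rows=ROWS, cols=COLS):
    """Return the tiles directly wired to (row, col): at most four."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]

def hops(src, dst):
    """Tile-to-tile hops between two tiles when every wire spans one tile."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])

if __name__ == "__main__":
    print(neighbors(0, 0))       # corner tile: two neighbors
    print(neighbors(1, 1))       # interior tile: four neighbors
    print(hops((0, 0), (3, 3)))  # corner to opposite corner
```

Because no wire spans more than one tile, adding tiles lengthens the hop count between distant tiles but not the longest wire, which is the scalability argument the slide makes.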

12
Architectural Entities
Raw processors will have:
More functional units, as well as more flexible and efficient pin utilization.
A higher pin count due to this efficiency.
More predictability and higher clock frequencies due to explicit exposure of wire delay.

13
Application Mapping
Applications can leverage the Raw static network’s ASIC-like place-and-route facility; applications that do so are called software circuits.
The Raw operating system allows both space and time multiplexing of processes; it allocates a rectangular region of tiles to each process.
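Space multiplexing with rectangular regions can be illustrated with a toy allocator. This is a hypothetical first-fit sketch, not the Raw operating system's policy: the grid size, the `allocate` and `release` names, and the first-fit strategy are all assumptions made for illustration.

```python
# Hypothetical first-fit allocator for rectangular tile regions.

def allocate(grid, height, width, pid):
    """Claim the first free height x width rectangle for pid.

    Returns the (row, col) of the rectangle's top-left tile, or None
    if no free rectangle of that shape exists.
    """
    rows, cols = len(grid), len(grid[0])
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            cells = [(r + dr, c + dc) for dr in range(height) for dc in range(width)]
            if all(grid[y][x] is None for y, x in cells):
                for y, x in cells:
                    grid[y][x] = pid
                return (r, c)
    return None

def release(grid, pid):
    """Free every tile owned by pid."""
    for row in grid:
        for i, owner in enumerate(row):
            if owner == pid:
                row[i] = None

if __name__ == "__main__":
    grid = [[None] * 4 for _ in range(4)]  # assumed 4x4 tile array
    print(allocate(grid, 2, 2, "A"))  # 2x2 region for process A
    print(allocate(grid, 2, 4, "B"))  # 2x4 region for process B
    print(allocate(grid, 4, 1, "C"))  # no free 4x1 column remains
    release(grid, "B")
    print(allocate(grid, 4, 1, "C"))  # fits once B's tiles are freed
```

Releasing one process's tiles and handing them to another, as in the last two calls, is a crude stand-in for time multiplexing.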

19
Design Decisions
Dynamic Networks:
Support the need for dynamic events and message passing.
Better suited to long data streams because of their large overhead.
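Why a fixed overhead favors long streams can be seen with a small calculation. The sketch below is a generic amortization model, not a description of Raw's message format: the function name and the overhead value of one word used in the examples are placeholder assumptions, since the slide does not quantify the overhead.

```python
# Generic amortization model; the overhead size is a placeholder assumption.

def link_efficiency(payload_words, overhead_words):
    """Fraction of transmitted words that carry payload."""
    return payload_words / (payload_words + overhead_words)

if __name__ == "__main__":
    for payload in (1, 3, 31):
        eff = link_efficiency(payload, overhead_words=1)
        print(f"payload {payload:2d} words: {eff:.1%} of traffic is useful data")
```

Whatever the real overhead is, the same shape holds: efficiency approaches 100% only as the payload grows large relative to the overhead.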

20
Implementation
IBM’s SA-27E, 0.15-micron, six-level copper ASIC process.
25 W power consumption.
Wire delay within tiles was large enough that placement could not be ignored.

21

22
Implementation
Applications with very little ILP generally do not benefit from running on Raw.
For applications with moderate to significant ILP, performance increases are observed.
The authors attain speedups over a single tile on SPECfp applications ranging from 6x to 11x for a 16-tile Raw processor and from 9x to 19x for 32 tiles.
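One way to read these speedups is as parallel efficiency, that is, speedup divided by tile count. The speedup figures below are the ones quoted on the slide; the efficiency metric and the function name are our addition and are not reported in the source.

```python
# Parallel efficiency derived from the quoted speedups (metric is our addition).

def parallel_efficiency(speedup, tiles):
    """Speedup per tile: 1.0 would mean perfect linear scaling."""
    return speedup / tiles

if __name__ == "__main__":
    quoted = {16: (6, 11), 32: (9, 19)}  # tiles -> (low, high) speedup from the slide
    for tiles, (low, high) in quoted.items():
        lo_eff = parallel_efficiency(low, tiles)
        hi_eff = parallel_efficiency(high, tiles)
        print(f"{tiles} tiles: efficiency {lo_eff:.1%} to {hi_eff:.1%}")
```

Read this way, going from 16 to 32 tiles raises absolute speedup while lowering per-tile efficiency on these figures.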

23
Conclusion
The replicated tile design saved time in design, RTL Verilog coding, resynthesis, verification, placement, and back-end flow.
Virtual Raw systems can be created from the glueless connection of up to 64 chips.
The authors believe that reaching the point at which a Raw tile is a relatively small portion of total computation could change the way we compute.

24
Discussion

25
Discussion Questions
Does this paper discuss enough real program and benchmark results?
Is 25 W power consumption “energy efficient” for the performance they have indicated?
Are there negative consequences of exposing so much complexity to the software/programmer?
How can the functionality of this processor be likened to a 2-D pipeline?
Does cost need to be addressed?
How advantageous is the design time reduction achieved through redundancy?