6
True Data Dependencies
► Also called read-after-write (RAW) hazards
► An instruction may use a result produced by the previous instruction
  - The two instructions cannot execute simultaneously in multiple pipelines.
  - The second instruction must typically be stalled.
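
The check an issue stage has to make can be sketched as follows. This is a minimal illustration, not a description of any particular processor: the `Instr` type, the register names, and the helper functions are all hypothetical. It treats only the RAW case from the slide as forcing a stall, which is a simplification.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    dest: Optional[str]      # register written, or None if nothing is written
    srcs: Tuple[str, ...]    # registers read

def dependencies(first: Instr, second: Instr) -> set:
    """Classify how `second` depends on the earlier instruction `first`."""
    found = set()
    if first.dest is not None and first.dest in second.srcs:
        found.add("RAW")     # true dependency: second reads what first writes
    if second.dest is not None and second.dest in first.srcs:
        found.add("WAR")     # anti-dependency: second overwrites what first reads
    if first.dest is not None and first.dest == second.dest:
        found.add("WAW")     # output dependency: both write the same register
    return found

def can_issue_together(first: Instr, second: Instr) -> bool:
    # Simplifying assumption: only a RAW hazard prevents issuing in the same cycle.
    return "RAW" not in dependencies(first, second)

add = Instr("r1", ("r2", "r3"))   # r1 = r2 + r3
sub = Instr("r4", ("r1", "r5"))   # r4 = r1 - r5, reads the result of add
mov = Instr("r6", ("r7",))        # independent of add
```

Here `can_issue_together(add, sub)` is false because `sub` reads `r1`, while `add` and `mov` could go down separate pipelines in the same cycle.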

7
Structural Dependencies
► Stalls result in less than optimal performance
  - We may have single-issue cycles, which process only a single instruction.
  - Worse, we may have zero-issue cycles, which initiate no new instructions.
► Data dependencies can also limit performance on a scalar machine
  - Two-cycle memory load/write
  - Intra-instruction dependencies
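
To make single-issue and zero-issue cycles concrete, here is a small sketch of an in-order issue stage. Everything in it is an assumption made for illustration: the `(dest, srcs, latency)` tuple format, the function name `issue_counts`, and the rule that a result becomes readable `latency` cycles after issue (with `latency` at least 1).

```python
def issue_counts(program, width=2):
    """Return the number of instructions issued in each cycle.

    program: list of (dest, srcs, latency) tuples in program order.
    width:   maximum number of instructions issued per cycle.
    """
    ready_at = {}          # register -> first cycle in which it can be read
    counts = []
    pc, cycle = 0, 0
    while pc < len(program):
        issued = 0
        while pc < len(program) and issued < width:
            dest, srcs, latency = program[pc]
            # In-order issue: if any source is not ready, stall here.
            if any(ready_at.get(s, 0) > cycle for s in srcs):
                break
            if dest is not None:
                ready_at[dest] = cycle + latency
            issued += 1
            pc += 1
        counts.append(issued)
        cycle += 1
    return counts

program = [
    ("r1", ("r0",), 2),    # two-cycle load into r1
    ("r2", ("r1",), 1),    # uses the loaded value
    ("r3", ("r0",), 1),    # independent
    ("r4", ("r3",), 1),    # depends on the previous instruction
]
print(issue_counts(program))            # [1, 0, 2, 1]
print(issue_counts(program, width=1))   # [1, 0, 1, 1, 1]
```

On the two-wide machine the load causes one single-issue cycle and one zero-issue cycle. The scalar run (`width=1`) still loses a cycle to the two-cycle load, which is the second point on the slide.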

14
Automatic Register Renaming
► Every register write (R-write) allocates a new R
► The register name A is an alias for the last R allocated by a write to A
► An instruction that both reads and writes a register allocates a new R too
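
The three rules can be sketched with a simple alias table. This is only an illustration under stated assumptions: the `Renamer` class is hypothetical, the pool of registers is unbounded (nothing is ever freed), and a name that is read before it has been written is given a register holding its initial value.

```python
from itertools import count

class Renamer:
    def __init__(self):
        self._next = count()   # assumption: unlimited supply of registers R0, R1, ...
        self.alias = {}        # register name -> last R allocated by a write to it

    def _current(self, name):
        # Assumption: a name read before any write gets a register for its initial value.
        if name not in self.alias:
            self.alias[name] = f"R{next(self._next)}"
        return self.alias[name]

    def rename(self, dest, srcs):
        # Sources are looked up first, so an instruction that reads and writes
        # the same name reads the old R and then allocates a new one.
        new_srcs = tuple(self._current(s) for s in srcs)
        self.alias[dest] = f"R{next(self._next)}"
        return self.alias[dest], new_srcs

r = Renamer()
print(r.rename("A", ("B", "C")))   # ('R2', ('R0', 'R1'))
print(r.rename("A", ("A", "B")))   # ('R3', ('R2', 'R0'))
```

The second instruction reads and writes `A`: it reads `R2`, the old alias, and its result goes to the freshly allocated `R3`.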

15
Advantages over More ISA Registers
► Smaller instructions
► Allows the same software to run on a range of implementations
  - Compare the same program running on Pentium or AMD Ath
► Less state to save
  - Faster function calls
  - Faster context switches
  - Lifetimes can be optimized

22
Virtual-Physical Register Renaming
► General Map Table (GMT)
  - Indexed by logical register L
  - VP register: last virtual-physical register that L has been mapped to
  - P register: last physical register that L and VP have been mapped to
  - V-bit: indicates whether P is valid
► Physical Map Table (PMT)
  - Has an entry for each VP
  - Contains the last physical register that VP has been mapped to

23
Functional Description
► For each logical source register S, do a GMT lookup
  - If the V-bit is set, rename S to P
  - Otherwise, rename S to VP
► Rename the logical destination register to a new VP
► Update the GMT: set VP to the new mapping and reset V
► Save the previous VP in the reorder buffer to be able to roll back
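
The rename steps above can be sketched as follows. This is a minimal model, not the actual hardware design: the `GMTEntry` and `RenameStage` names, the `("P", n)` / `("VP", n)` tags, the dictionary-based reorder buffer entries, and the assumption that every logical register starts with a valid mapping are all choices made for the example.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

@dataclass
class GMTEntry:
    vp: int                   # last virtual-physical register mapped to L
    p: Optional[int] = None   # last physical register, meaningful only if v is set
    v: bool = False           # V-bit: is p valid?

class RenameStage:
    def __init__(self, initial):
        # Assumption: `initial` maps every logical register to a valid (vp, p) pair.
        self.gmt = {l: GMTEntry(vp, p, True) for l, (vp, p) in initial.items()}
        first_free = max((vp for vp, _ in initial.values()), default=-1) + 1
        self._new_vp = count(first_free)
        self.rob = []         # reorder buffer, oldest entry first

    def rename(self, dest, srcs):
        # 1. GMT lookup per source: use P if the V-bit is set, otherwise VP.
        renamed = []
        for s in srcs:
            e = self.gmt[s]
            renamed.append(("P", e.p) if e.v else ("VP", e.vp))
        # 2. Save the previous VP so the mapping can be rolled back.
        self.rob.append({"logical": dest, "prev_vp": self.gmt[dest].vp})
        # 3. Map the destination to a new VP and reset V.
        new_vp = next(self._new_vp)
        self.gmt[dest] = GMTEntry(vp=new_vp)
        return ("VP", new_vp), renamed

stage = RenameStage({"r1": (0, 10), "r2": (1, 11), "r3": (2, 12)})
print(stage.rename("r1", ("r2", "r3")))  # (('VP', 3), [('P', 11), ('P', 12)])
print(stage.rename("r2", ("r1",)))       # (('VP', 4), [('VP', 3)])
```

The second instruction reads `r1` while its producer is still in flight, so its source is renamed to the virtual-physical tag `VP 3` rather than to a physical register.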

25
Functional Description
► When its source operands are ready, the instruction is issued
► When the instruction completes:
  - A new physical register R is allocated for the result
  - The PMT is updated to reflect the new mapping
  - The VP number of the destination is broadcast, together with the physical register identifier, to all entries in the instruction queue
  - The GMT is updated: the entry for the logical destination is checked for a match with the VP; if it matches, the physical register number is copied to the P register field and the V flag is set
  - As a result, a new instruction using the same logical register will find the corresponding physical register in the GMT
  - Lastly, the C flag of the entry in the reorder buffer is set
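
The completion steps can be sketched in the same spirit. The data layout is an assumption made for illustration: the tables are plain dictionaries, sources are tagged `("VP", n)` or `("P", n)`, and `free_physical` is a hypothetical free list of physical register numbers.

```python
def is_ready(waiting):
    # Assumption: a source is ready once it names a physical register.
    return all(kind == "P" for kind, _ in waiting["srcs"])

def complete(rob_entry, gmt, pmt, queue, free_physical):
    """Apply the completion steps for one instruction and return its physical register."""
    vp = rob_entry["vp"]
    p = free_physical.pop()            # physical register is allocated only now
    pmt[vp] = p                        # PMT: VP -> physical register
    for waiting in queue:              # broadcast (VP, P) to the instruction queue
        waiting["srcs"] = [("P", p) if src == ("VP", vp) else src
                           for src in waiting["srcs"]]
    entry = gmt[rob_entry["logical"]]
    if entry["vp"] == vp:              # still the latest mapping of this logical register?
        entry["p"], entry["v"] = p, True
    rob_entry["C"] = True              # mark the reorder buffer entry as completed
    return p

gmt = {"r1": {"vp": 3, "p": None, "v": False}}
pmt = {}
queue = [{"op": "sub", "srcs": [("VP", 3), ("P", 11)]}]
rob_entry = {"logical": "r1", "vp": 3, "C": False}

complete(rob_entry, gmt, pmt, queue, free_physical=[21, 20])
print(pmt, gmt["r1"], is_ready(queue[0]))
```

If the logical register has been renamed again in the meantime, the VP comparison fails and the GMT entry is left alone; the waiting instructions still receive the physical register through the broadcast.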

31
Renaming-aware scheduling?
► Use register renaming in the allocator
  - Minimal number of named registers
  - Maximal number of register instances
► Do not do scheduling that the CPU can do
  - Over-scheduling can be worse than no scheduling at all