SIMD architectures

What do Sony's Playstation2 and Motorola's MPC7400 (a.k.a. the G4) have in
…

SIMD operations

The basic unit of SIMD love is the vector, which is why SIMD computing is also known as vector processing. A vector is nothing more than a row of individual numbers, or scalars.

A regular CPU operates on scalars, one at a time. (A superscalar CPU operates on multiple scalars at once, but each instruction still performs its own, separate operation.) A vector processor, on the other hand, lines up a whole row of these scalars, all of the same type, and operates on them as a unit.

These vectors are represented in what is called packed data format. Data are grouped into bytes (8 bits) or words (16 bits), and packed into a vector to be operated on. One of the biggest issues in designing a SIMD implementation is how many data elements it will be able to operate on in parallel. If you want to do single-precision (32-bit) floating-point calculations in parallel, then you can use a 4-element, 128-bit vector to do four-way single-precision floating-point, or you can use a 2-element, 64-bit vector to do two-way SP FP. So the length of the individual vectors dictates how many elements of what type of data you can work with.
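
To make the packing arithmetic concrete, here is a minimal sketch in C. The `vec128` union and the `lanes` helper are names I made up for illustration; they don't come from any vendor's headers. The union simply shows the same 16 bytes of storage viewed as different packed element types, and it assumes a platform where `float` is 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical 128-bit vector: one 16-byte block of storage viewed as
   different packed element types. Assumes float is 32 bits wide. */
typedef union {
    uint8_t  u8[16];  /* sixteen 8-bit elements (bytes)   */
    uint16_t u16[8];  /* eight 16-bit elements (words)    */
    float    f32[4];  /* four single-precision FP elements */
} vec128;

/* How many elements of a given width fit in a vector of a given width.
   Both widths are in bits. */
static unsigned lanes(unsigned vector_bits, unsigned element_bits)
{
    return vector_bits / element_bits;
}
```

With these definitions, `lanes(128, 32)` gives the four-way single-precision case and `lanes(64, 32)` gives the two-way case described above.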

Motorola's AltiVec literature divides the types of SIMD operations that AltiVec can do into four useful and easily comprehensible categories. These categories are a good way of dividing up the basic things you can do with vectors. Unfortunately for people who write SIMD comparison articles, both AMD's and Intel's tech docs categorize their hardware's SIMD operations in a completely different and less accessible way. (Actually, Intel's tech docs categorize things one way, and AMD's tech docs copy Intel's categorization. It's good to see that at least Motorola can think differently.) I'm going to use Motorola's categories, at least initially, for tutorial purposes. I'm also going to lift some of Motorola's pictures from their AltiVec literature, and modify them a bit.

I. Intra-element arithmetic and non-arithmetic functions

[Figure: intra-element addition of vectors VA and VB, producing the sum vector VT]

Intra-element arithmetic is one of the most basic and obvious types of SIMD operation. Consider an intra-element addition. This involves lining up two vectors (VA and VB), and adding their individual elements together to produce a sum vector (VT). The picture above shows an example of intra-element arithmetic at work. Intra-element operations also include multiplication, multiply-add, average, and min.
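
If it helps to see the addition spelled out, here is a scalar model of it in C. This is only a sketch of the semantics: `intra_add` and `LANES` are illustrative names of my own, and the loop stands in for what real SIMD hardware would do to all four elements with a single instruction.

```c
#include <stddef.h>

#define LANES 4  /* four 32-bit elements, i.e. one 128-bit vector */

/* Scalar model of an intra-element add: VT[i] = VA[i] + VB[i].
   Each element of VA is paired only with the element of VB in the
   same position; no element ever interacts with its neighbors. */
void intra_add(const float va[LANES], const float vb[LANES], float vt[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        vt[i] = va[i] + vb[i];
    }
}
```

Swapping the `+` for a multiply, an average, or a min gives the other intra-element arithmetic operations; the lane-by-lane pairing stays the same.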

Intra-element non-arithmetic functions basically work the same way, except that the operations performed are different. Intra-element non-arithmetic operations include AND, OR, and XOR.
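
The logical operations can be modeled the same way. The sketch below is again a scalar stand-in with made-up names (`intra_and`, `intra_or`, `intra_xor`), assuming four 32-bit elements per vector; it is not any vendor's actual interface.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4  /* four 32-bit elements, i.e. one 128-bit vector */

/* Scalar models of intra-element logical operations. As with the
   arithmetic case, element i of VA meets only element i of VB. */
void intra_and(const uint32_t va[LANES], const uint32_t vb[LANES], uint32_t vt[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        vt[i] = va[i] & vb[i];
    }
}

void intra_or(const uint32_t va[LANES], const uint32_t vb[LANES], uint32_t vt[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        vt[i] = va[i] | vb[i];
    }
}

void intra_xor(const uint32_t va[LANES], const uint32_t vb[LANES], uint32_t vt[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        vt[i] = va[i] ^ vb[i];
    }
}
```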

Vector intra-element instructions

  integer instructions
    integer arithmetic instructions
    integer compare instructions
    integer rotate and shift instructions

  floating-point instructions
    floating-point arithmetic instructions
    floating-point rounding and conversion instructions
    floating-point compare instruction
    floating-point estimate instructions

  memory access instructions


II. Inter-element arithmetic

[Figure: inter-element sum across the elements of a single vector]

Inter-element operations are operations that happen between the elements in a single vector. An example of an inter-element arithmetic operation is shown above. This operation sums across the elements in a vector, and stores the result in an accumulation vector.
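
Here is a rough scalar model of a sum-across operation in C. Several details are my own assumptions for the sake of a runnable sketch, not something taken from the picture: the function name `sum_across`, the choice to keep the running total in the last element of the accumulation vector and zero the others, and the decision to saturate rather than wrap on overflow.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4  /* four 32-bit elements, i.e. one 128-bit vector */

/* Scalar model of an inter-element "sum across": all elements of VA
   are added together, along with the running total already held in
   the accumulation vector.
   Assumptions (illustrative only): the total lives in the last element
   of acc, the other elements are zeroed, and the sum saturates. */
void sum_across(const int32_t va[LANES], int32_t acc[LANES])
{
    int64_t sum = acc[LANES - 1];  /* widen so the additions cannot overflow */

    for (size_t i = 0; i < LANES; i++) {
        sum += va[i];
    }

    /* Clamp to the 32-bit signed range (saturating arithmetic). */
    if (sum > INT32_MAX) sum = INT32_MAX;
    if (sum < INT32_MIN) sum = INT32_MIN;

    for (size_t i = 0; i < LANES - 1; i++) {
        acc[i] = 0;
    }
    acc[LANES - 1] = (int32_t)sum;
}
```

The point to notice is the contrast with the intra-element case: here the elements of one vector interact with each other, instead of with their counterparts in a second vector.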

III. Inter-element non-arithmetic

Inter-element non-arithmetic operations are operations like vector permute, which rearrange the order of the elements in an individual vector. We'll look at the permute operation a little more closely in a later section.
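
In the meantime, the basic idea of rearranging elements can be sketched in a few lines of C. This is a deliberately simplified model and not the actual AltiVec permute instruction: the name `permute_model`, the single source vector, and the per-element `order` vector of indices are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4  /* four 32-bit elements, i.e. one 128-bit vector */

/* Simplified scalar model of a permute: element i of the result is
   copied from the element of VA named by order[i]. Indices are taken
   modulo LANES so that any control value is valid.
   vt must not alias va, since va is read after vt is written. */
void permute_model(const uint32_t va[LANES], const uint8_t order[LANES], uint32_t vt[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        vt[i] = va[order[i] % LANES];
    }
}
```

An `order` of {3, 2, 1, 0} reverses the vector, while {0, 0, 2, 2} duplicates the first and third elements. No arithmetic happens to the values themselves; they are only moved around.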