AMD kept up with the SIMD processing standards Intel set by licensing its popular CPU instruction sets such as MMX, SSE, SSE2, and SSE3. AMD used these as is, but chose not to conform completely with Supplemental SSE3 or with SSE4 and its revisions (SSE4.1, SSE4.2). Instead, the company devised the SSE4A instruction set for its K10 micro-architecture. SSE4A is a lighter set that features LZCNT (leading zero count), POPCNT (bit population count), EXTRQ/INSERTQ, and MOVNTSD/MOVNTSS (scalar streaming store instructions). What's more, back in 2007 the company announced that it would come up with SSE5, and Intel at that point sought to leave its development to AMD.

In due course, Intel started developing AVX (Advanced Vector eXtensions), which enhances the processing of FPU-intensive workloads. AMD took an interest in this technology and is looking to make it compatible with the originally conceived SSE5. The instructions that remain in the superset once AVX is excluded are now referred to by AMD as XOP (eXtended OPerations). In addition, AMD will include FMA4 (floating-point vector multiply-accumulate). The new instruction sets will debut in AMD's next-generation Bulldozer micro-architecture, slated for 2011. Meanwhile, Intel's AVX will debut in the Sandy Bridge micro-architecture, slated for 2010~11. AMD has published the Programmer’s Manual for the 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions, which can be read here (PDF).

- SSE5 was conceptualized as a standard for both Intel and AMD (circa 2007).
- Intel came up with AVX in 2008 and broke away from the SIMD design plan. AVX and the original SSE5 are mutually incompatible.
- AMD included AVX in its set and made it compatible with SSE5 (May 2009).
- The remaining AMD-exclusive instructions are referred to as XOP.

They should just make an SSEx that includes all previous SSE instructions. The list of what new processors support is getting silly long.

AMD has done that, in essence, by creating something called SSEPlus. The open source project lets developers code once against SSEPlus, which determines whether a CPU supports a given SSE instruction. If it does, the instruction is called normally; if it doesn't, the program emulates the SSE instruction.

- Developers no longer have to redevelop their algorithms to write for multiple SSE revisions
- Simplified CPUID checking
- Simplified maintenance of code that targets different SSE instruction mixes
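To make the "call it natively if supported, otherwise emulate" idea concrete, here is a minimal sketch in C. It is not SSEPlus code and does not use the SSEPlus API; the names `popcount_native`, `popcount_emulated` and `select_popcount` are made up for illustration. It assumes a GCC or Clang compiler for the x86 feature check (`__builtin_cpu_supports`), and falls back to the emulated path everywhere else.

```c
#include <stdint.h>

/* Hypothetical names throughout; this is an illustration, not SSEPlus itself. */
typedef unsigned (*popcount_fn)(uint32_t);

/* Emulated path: counts set bits with plain integer operations,
   so it runs on any CPU regardless of instruction-set support. */
static unsigned popcount_emulated(uint32_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
/* Native path: the target attribute lets the compiler emit POPCNT here
   even when the rest of the file is built for a baseline CPU. */
__attribute__((target("popcnt")))
static unsigned popcount_native(uint32_t x) {
    return (unsigned)__builtin_popcount(x);
}
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Checks the CPU once at run time and returns the implementation to use. */
static popcount_fn select_popcount(void) {
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt")) {
        return popcount_native;
    }
#endif
    return popcount_emulated;
}
```

A caller would fetch the function pointer once, for example `popcount_fn pc = select_popcount();`, and then call `pc(value)` without caring which path was chosen. Both paths return the same result, which is the property a library like this has to guarantee.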