"For years, PC programmers used x86 assembly to write performance-critical code. However, 32-bit PCs are being replaced with 64-bit ones, and the underlying assembly code has changed. This white paper is an introduction to x64 assembly. No prior knowledge of x86 code is needed, although it makes the transition easier."

Well, I'd say that the fix for x86 extension proliferation is easy: find that forgotten corner of the AMD64 spec where it states that the architecture mandates the presence of SSE2, NX and CPUID extensions among a few other things, then leave the rest to compilers, unless you really need a feature (like VT) or have to get the most out of a specific range of hardware for some reason.

I don't think it's a very good idea to rely on hardware-specific features which are not guaranteed to be present on future processor generations, even if they happen to be in practice for now. Just look at how few of these x86 extensions Atom processors support: tomorrow, these little chips will likely be good enough for the average Joe's computing needs...

The nice thing about the early 32-bit x86 RISC-style books like "Inner Loops" was that they made it quite clear which instructions should be used in assembler and which to ignore completely. So several hundred opcodes were reduced to a very small set of basic ops, almost all register-to-register plus loads and stores. Basically, the Pentium was an improved 486.

Well, Atom has this problem as well. All Atoms support up to SSE3, some add SSSE3 (an extension of SSE3), and the newest support Intel VT-x.

Obviously, they don't support the AMD extensions (3DNow!, XOP, FMA4, and CVT16).

All this stuff adds complexity to the front end (one of the main targets for reducing power consumption in the Atom), but at least the back-end stages don't get significantly more complex.

My point was not that Atom processors do not suffer from extension proliferation, but that x86 extensions are not guaranteed to last forever. Especially considering that, throughout computing history, any time computer hardware has gotten dangerously close to "good enough", hardware makers have come up with a more constrained form factor that called for less capable CPUs, and thus yet another performance race.

Today, cellphone SoCs are getting so fast that Apple, Google and Microsoft have a hard time keeping OS bloat high enough to drive hardware sales. So I'm pretty sure that somewhere in R&D labs, the Next Big Thing is closing in pretty fast. And that its earliest iteration will have an incredibly primitive CPU by modern standards.

Unless, of course, everything goes cloud at that point, bringing back the mainframe age. In which case CPU extensions could still become irrelevant, but this time it would be because no one cares about the performance of individual CPUs when the main issue is spreading work across thousands of them.