Applied Micro Circuits, a company known for networking chips and for dabbling a bit in embedded PowerPC processors, has aimed a haymaker of an ARM server chip right at the cloudy jaws of Intel and AMD.
What's more, the specs divulged by Applied Micro – if all works according to plan – suggest that the x86 chip makers might end …

Ok, I confess

256 will be needed

Mini-ITX-sized servers?

Being old-school and all, I was lately surprised at the (absence of) size of the server mainboards inside those big 19'' caissons. The biggest things now are the power supplies (really, a server has no business having power supplies at all, not to mention dual ones, what's with this "industry"?) and the stupid fans. The spinning rust is being sized down to 4 units of 2.5'' + the unfortunately still off-mainboard RAID controller, which is acceptable.

So will I be able to have a few of the standard servers inside a 1-HU box before I ride into the sunset? I hope so.

The biggest problem for this kind of non-x86 thing is going to market and breaking down the deadly embrace of integrator reluctance and customer apathy.

But will it run crysis?

OK, that's not so much of an issue, but what about on the desktop? I'd like something cooler than an i7's 90-130W(?) envelope.

And for a home server? Mostly they sit idle, but I want to be able to fill dual-gig network pipes when required and run Asterisk & MythTV without a hiccup, even in the scorching Oz summer with no air-con.

Intel

Intel has an ARM licence. They even kept one XScale chip, a communications processor.

All Intel has to do is make like ADI, Texas, Qualcomm and Samsung, who design their own variations of ARM. Samsung, for instance, leverage their own processes / fabrication too for the 3-layer Flash + RAM + CPU chip trio.

Only Texas does the OMAP version of ARM

Only Qualcomm does the Snapdragon version of ARM

Intel can use their design expertise and fabrication to make an ARM that Texas, Samsung or Qualcomm can't compete with. Maybe they already have a skunkworks team and should add all the Itanium resources to it. After all, the only Itanium customer, HP, is flirting with ARM based servers.

Unfortunately No

You've missed some bits or are too caught up in marketing hyperbole. Texas Instruments and Samsung both take ARM-provided cores at RTL, like every other core licensee, and then do their own timing synthesis on them. The fabled "Hummingbird" core was timed by Intrinsity for Sammy, but as a hard macro it is reused in both the S5PC110 and the Apple A4. Both are 'standard' ARM cores, no fuss, no bother.

Qualcomm's Scorpion and Krait cores are indeed different because Qualcomm (as with Marvell, Microsoft and, in a past life, Intel) are Instruction Set licensees. The microarchitecture of Scorpion and Krait is much different from Cortex A8 / A9 / A15 despite them running the same code - this is Intel vs AMD vs VIA in the x86 space. Marvell's PJ4 core (and its variants) similarly uses a different uArch and in fact brought more advanced features (OOO execution primarily) to the fold before ARM themselves.

While I cannot disprove the statements that Intel maintains its IS license or uses an XScale core in some legacy products, I know for a fact they sold that entire business unit to Marvell, as suggested in the article, and there have been no improvements to XScale since; Marvell have even dropped it from their product line, moving to the PJ4 instead in their ARMADA and Pantheon families. And why not, given XScale is only ARMv5 IS equivalent and we're all now up in arms over v8.

@Mage

"Intel can use their design expertise and fabrication to make an ARM that Texas, Samsung or Qualcomm can't compete with."

Yes they could easily do that. But that would mean that Intel are just another supplier in amongst a whole host of others. Intel's wafer baking expertise would mark them out though.

However, I'm pretty sure that Intel are desperate for the world to stay with x86 for servers, because then they'd keep the whole thing to themselves. At the moment they have only AMD to contend with. In the ARM market there are hundreds of other companies to compete against, and it would be difficult to dominate them all.

The moment Intel start fabbing ARMs will signify that they've finally admitted that x86 is a crappy, out of date architecture. That would really make them wince, especially as it would be an admission to their shareholders that the game is up, the big days are over.

Intel could ...

... but that would risk cannibalizing CPU revenues.

Still, the smart thing would be to do it early. Otherwise, if they join late, it would mean they are desperate - that could be interpreted as an admission of failure of their own lines of CPU. That would not come as a shock to many, since the x86 architecture has long outlived its "best before" date.

Re: Intel

But why would they? The server is a huge market for Intel (x86). If they licensed ARM and produced their own version, they would also be shooting their own legs from under themselves in the desktop market.

The lock-in of x86 and legacy apps is the only thing keeping Intel ahead of the competition. They are not going to create a bridgehead with a self-produced ARM server CPU.

"Intel can use their design expertise and fabrication to make an ARM that Texas, Samsung or Qualcomm can't compete with. Maybe they already have a skunkworks team and should add all the Itanium resources to it."

Perfectly true. But what would be the point? Intel haven't had any real success outside x86 in the last couple of decades, why would that be about to change?

"the only Itanium customer, HP, is flirting with ARM based servers"

People that buy Itanium buy it despite IA64 not because of IA64. They buy it either because they need ultra-massive single-image SMP systems of a kind that AMD64 still can't quite deliver, or because their software stack needs an OS that HP choose not to port off IA64 (HP-UX, VMS, NSK). That situation won't last forever.

OS porting

You raise a fascinating point about the whole HP ecosystem. I suppose we have to consider the possibility that in the long run HP will port VMS to ARM. Now that really would be a strange event in history indeed!

@Ramazan

RE: OS porting

".....HP will port VMS to ARM...." Yeah, did you actually even try thinking about what that would involve? First off, Itanium was designed to be a porting-friendly platform: it is bi-endian and has more than enough registers to cope with even highly demanding OSs like NonStop or OpenVMS. The job of squeezing that type of CPU load into the ARM cores would be beyond a quart-into-a-pint-pot job, we're talking more of a gallon into a thimble! We already have plenty of ARM cores on the market, but the biggest impact is with smartphones from Apple and HTC, running cut-down BSD/Linux OSs, not even proper desktop OSs. ARM cores may appear soon in low-end PCs, but they will be budget home jobs for consumers, and probably won't be in corporate desktops for a few years yet.

If HP should ever see the need to port off Itanium for any of their enterprise OSs, it is much more likely to be to another enterprise design such as x64. The 64-bit ARM cores are interesting as low-power (as in "low CPU performance") replacements for bottom-end x64 servers and desktops/notebooks, but they have a long way to go until they can challenge Xeon/Opteron, let alone Itanium.

Is there any sign that the real world is anywhere near filling a 64-bit address space, let alone needing a 128-bit one? By the time the real world needs 65 bits of addressable memory (which *does not* need a 65-bit address space, in the same way that PDP-11s could address 4MB even though they were 16-bit machines, and Xeons could do >4GB of physical memory long before AMD64 arrived), we'll be long gone. And Intel will have faded away long before that.
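
To make the PDP-11 point concrete, here is a minimal sketch of how a 16-bit virtual address can reach a wider physical memory through a small table of base registers. The 8 KB page size, 64-byte base-register granularity and 22-bit physical range follow the later PDP-11 models as far as I recall them, but the function name, the register lists and the example values are illustrative assumptions, and the real MMU (access checks, modes, I/O page) is left out entirely.

```python
PAGE_BITS = 13    # 8 KB pages, so 8 pages cover a 16-bit virtual space
BASE_UNIT = 64    # assumed: each base register counts in 64-byte blocks
PHYS_BITS = 22    # 4 MB of physical memory


def translate(vaddr, base_regs):
    """Map a 16-bit virtual address to a physical address (simplified)."""
    if not 0 <= vaddr < (1 << 16):
        raise ValueError("virtual address must fit in 16 bits")
    page = vaddr >> PAGE_BITS                  # top 3 bits pick one of 8 registers
    offset = vaddr & ((1 << PAGE_BITS) - 1)    # low 13 bits are the offset in the page
    paddr = base_regs[page] * BASE_UNIT + offset
    if paddr >= (1 << PHYS_BITS):
        raise ValueError("physical address out of range")
    return paddr


# Two hypothetical processes: same virtual address, different base registers.
regs_a = [0x0000] * 8
regs_b = [0xF000] * 8

print(hex(translate(0x2004, regs_a)))
print(hex(translate(0x2004, regs_b)))
```

The program never sees more than 16 bits of address; the extra reach comes from whoever loads the base registers. That is the same trick, in spirit, that let 32-bit Xeons use more than 4GB of physical memory.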

Re:

Exactly. What we need to do is get main memory humping the processors in 3D stacks and integrate everything possible onto the silicon. You want a cube with a port for network and a port for power and that is about it.

RE: X86 vs ARM RISC

Whilst 128-bit cores would be fun, they would need a big push to get OSs and applications re-written to take advantage of the extra addressing, otherwise there would be zero advantage (and probably a cost disadvantage) to running 64-bit code on them compared to a 64-bit core. The UNIX vendors needed to go to 64-bit to maintain a scaling advantage over M$ Windows, and to address larger memory spaces. When M$ (and Linux) moved to 64-bit it was to chase after the UNIX OSs and to improve their own scaling. It would take a massive investment from software vendors to move up to 128-bit, and 64-bit is fulfilling their current needs, so unless there is a real need (such as to stay ahead of Android maybe?), I can't see it happening soon.

More than 2 cores.

Umm... Hey Tim? You might want to take a closer look at that nifty little block diagram.

If you look a little closer, you can see under the slightly transparent CPU complex block. It's worth noting that what's on that block contradicts your supposition of 2-4 cores.

If the diagram is accurate, we are looking at the following:

8 ARMv8 cores at 2.5GHz, arranged in 4 pairs, each core having its own independent L1I$ and L1D$, but with the L2 (and presumably the system interface) shared between the two cores of a pair. The L3$ is stated to be 8MB, and connects to the cores through the coherent network. Moreover, there are two "memory bridges" connected to the coherent network, each with a pair of DDR3 controllers (this makes 4 channels total, if the controllers are 64b wide).

This is very different from the supposed 2 cores, and much more exciting as well. If Applied Micro can get this in at 2W per core (perhaps 30W or 40W per node overall, memory and storage included), the datacenter is going to be an interesting place in late 2012.
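
For anyone who wants the sums spelled out, here is a rough tally of what the diagram implies, plus the power arithmetic. The counts come from the reading of the block diagram above; the 2W per core and the 30W/40W node budgets are the guesses from this comment, not published specs, and all the names are made up for illustration.

```python
# Counts read off the block diagram, as described in this comment.
CORE_PAIRS = 4
CORES_PER_PAIR = 2
MEMORY_BRIDGES = 2
DDR3_CONTROLLERS_PER_BRIDGE = 2

# Hypothetical power figures from this comment, not vendor numbers.
WATTS_PER_CORE = 2.0
NODE_BUDGETS_WATTS = (30.0, 40.0)

cores = CORE_PAIRS * CORES_PER_PAIR
shared_l2_caches = CORE_PAIRS          # one L2 per pair of cores
memory_channels = MEMORY_BRIDGES * DDR3_CONTROLLERS_PER_BRIDGE
core_power = cores * WATTS_PER_CORE


def headroom(node_budget_watts):
    """Watts left for uncore, memory and storage once the cores are paid for."""
    return node_budget_watts - core_power


print(f"{cores} cores, {shared_l2_caches} shared L2s, {memory_channels} DDR3 channels")
for budget in NODE_BUDGETS_WATTS:
    print(f"{budget:.0f} W node: {core_power:.0f} W for cores, "
          f"{headroom(budget):.0f} W for everything else")
```

Under those guesses the cores would take roughly half of a 30W node, leaving the rest for L3, fabric, I/O, memory and storage.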

Re: More than 2 cores.

I guess... At the same time, we know there is already a solid high-speed SERDES bank on the first generation for the pair of 10Gb Ethernet ports (either that or it's a hard block, which I kind of doubt). That seems like a lot of bandwidth for just 2, or even 4 cores. If memory serves, managing even a single 10GbE port takes a decent amount of CPU time, or at least it did when AnandTech ran tests on dual Shanghai Opterons a couple of years ago.

Moreover, it looks like, regardless of the initial core count, the SMP interconnect is a separate interface from the SERDES block, as are the SATA/SAS PHYs. (Sidenote: those configurable blocks for handling network protocols, RAID, etc. are *%^#ing sexy)

That is a lot of overhead; it seems like they'd want to cram as many cores as possible onto the die, simply to distribute that overhead over more CPUs.

If that shot of the simulation board is correct, then we know it only takes seven 40nm FPGAs to simulate 128 cores; this would suggest that more than 16 cores can be simulated on a given FPGA, and while I'm just a layman, it's kind of obvious that, in terms of logic density, FPGA<<ASIC.
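
The "more than 16 per FPGA" figure is just division, sketched below. It assumes the article's 128 simulated cores and the seven FPGAs visible in the photo, and that every FPGA on the board carries cores, which may well not be true.

```python
import math

SIMULATED_CORES = 128   # figure quoted in the article
FPGAS_ON_BOARD = 7      # count taken from the photo of the board

# Average load if the cores were spread evenly over all seven devices.
average_per_fpga = SIMULATED_CORES / FPGAS_ON_BOARD

# Cores are whole units, so at least one FPGA must hold this many.
busiest_fpga_at_least = math.ceil(average_per_fpga)

print(f"average {average_per_fpga:.1f} cores per FPGA, "
      f"at least {busiest_fpga_at_least} on the busiest one")
```

If fewer of the FPGAs are actually given over to cores, the per-device count only goes up.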

This is just a little devil's advocacy. Whatever the core count turns out to be, if Applied Micro can deliver, we may witness round two of the ISA wars sooner than we thought.

I watched the webcast, and I didn't see any indication of how many cores were running in the FPGA simulation (though the article indicates that it is 128 cores). Given that only a pair of the FPGAs appear dedicated to the CPU complex, I suspect that much more than a single board is necessary to simulate the full 128 cores; this board seems intended to mirror the architecture of an SoC. Honestly, I'm not terribly surprised - four-issue OoO cores are not traditionally small devices.

Because the cores share an L2 cache (the presentation refers to them as a module, although that brings back unpleasant memories of certain AMD slide decks that haven't delivered), we will probably see the cores coming in pairs. At least for validation purposes, they probably want two modules, so that they can test the on-chip coherency fabric.

Very nice.

I always thought that the 80* and Z80 CPU families and architectures were fugly, so I preferred the far more elegant 68* and 65* CPU families and architectures, especially the 680x0 CPUs. I only moved to 80* CPU PCs because the nasty IBM PC 80x86 'architecture' became the de facto system architecture, and so got the investment to become much faster; hopefully this will reverse now that the 6502-derived ARM architecture dominates low-power devices and will soon have the processing power to seriously grow into desktop and server areas.

I hope the kludged Atom gets snuffed out ASAP, given it is a poor CPU, especially for NAS.