ARM goes 64-bit with new Cortex-A53 and Cortex-A57 designs

AMD, Broadcom, and Samsung are among the first licensees.

AMD revealed yesterday it would be building new 64-bit ARM-based Opteron chips intended for use in servers, and now it's clear what technology those chips will be using. ARM just announced two new Cortex-A50-series chips that will bring 64-bit capabilities to ARM SoCs. In addition to AMD, Broadcom, Calxeda, HiSilicon, Samsung, and STMicroelectronics are also listed as licensees.

ARM's press release mentions two specific processors: the first is the Cortex-A57, a high-performance design that will likely be more suited for server use. The second is the Cortex-A53, which has the same capabilities as the A57, but in a more power-efficient (and thus, slower) package. These two chips can be combined into one package if desired—the Cortex-A53 cores can handle low-impact workloads and stay on while the system is idle, and the more power-hungry Cortex-A57 cores will spin up only when the workload requires it. ARM calls this type of processor layout "big.LITTLE" and already offers licensees the ability to pair a Cortex-A15 CPU with a Cortex-A7 chip to achieve similar results.
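As a rough illustration of how a big.LITTLE system decides where work runs (a hypothetical sketch with made-up thresholds, not ARM's or any kernel's actual policy), the core idea is hysteresis around a load threshold:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical thresholds, purely illustrative. In real systems this
 * decision lives in the kernel's scheduler/cpufreq machinery. */
#define WAKE_BIG_LOAD  0.75  /* sustained load that wakes the A57 cluster */
#define SLEEP_BIG_LOAD 0.30  /* load below which the A57 cluster powers down */

typedef struct {
    bool big_cluster_on;  /* are the Cortex-A57 cores powered? */
} soc_state;

/* Pick the cluster for the next scheduling interval. The gap between the
 * two thresholds prevents ping-ponging when load hovers near a boundary. */
static const char *place_work(soc_state *s, double load) {
    if (!s->big_cluster_on && load > WAKE_BIG_LOAD)
        s->big_cluster_on = true;   /* spin up the big cores */
    else if (s->big_cluster_on && load < SLEEP_BIG_LOAD)
        s->big_cluster_on = false;  /* drop back to the little cores */
    return s->big_cluster_on ? "Cortex-A57" : "Cortex-A53";
}

int main(void) {
    soc_state s = { false };
    double loads[] = { 0.05, 0.10, 0.90, 0.80, 0.20, 0.05 };
    for (int i = 0; i < 6; i++)
        printf("load %.2f -> %s\n", loads[i], place_work(&s, loads[i]));
    return 0;
}
```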

According to ARM, the new processors should scale well enough that they can be used in smartphones, tablets, and laptops as well as servers. So while they are fully 64-bit capable, they also fully support 32-bit programs and operating systems. Whether the new chips are intended to fully replace current Cortex-A9 and Cortex-A15-based chips isn't clear, but there's a lot of overlap in today's devices between newer and older ARM architectures. The Cortex-A15 design, for example, has been around for a couple of years now, but shipping products based on that architecture (like the new Nexus 10 or ARM-based Chromebook) have only just begun to make it to market.

The list of licensees is impressive and includes most of the major players currently using ARM's architectures in their chips, though NVIDIA and Texas Instruments are notably absent. However, we don't expect to see chips based on these designs until 2014 or so, conforming to the timeline AMD has set.

35 Reader Comments

Since current Android phones are basically out of virtual address space on the Cortex-A15, I assume the A50 series will have to completely replace the A9/A15 in the next couple of years, unless they introduce a 64-bit variant of the A15.

After ARM announced its v8 core with support for the 64-bit instruction set, many were tempted to believe that Nvidia's upcoming Project Denver is based on this new architecture, but the company's CEO has recently stated that this isn't true, as the chip will use a 64-bit technology developed by Nvidia. "We are busily working on Denver; it is on track. Our expectation is that we will talk about it more, hopefully, towards the end of next year," said Huang.

Core performance on phones/tablets (especially upcoming A15 designs) has reached the point of good enough. I guess it's probably too much to hope that SoC manufacturers will stop the arms (get it) race for higher clock speeds in favor of keeping performance near level and dramatically increasing battery life instead. With the A15's increase in performance/watt, all of the manufacturers seem to have just increased the performance and kept battery life equal. Now that we have plenty-fast CPUs, I hope the next generation of SoCs will go for 24-hour battery life! 8-12 is good enough, but being able to go a full day would be wonderful for those long days!

Although for that to matter, we'd also have to stop the screen-resolution wars. 1080p on my phone? Sure, I'll take it. Double that... OK, I can't see the difference, and now my power consumption just doubled. Spec wars ftl.

ARM's press release talks about "delivering up to three times the performance of today’s superphones". I wonder if they mean "3 times the Cortex-A15", or 3 times the weaker CPUs of the Apple A6 / Qualcomm S4 Pro etc., given that no Cortex-A15 phones are shipping yet.

I've always been a strong proponent of the "big.LITTLE" architecture. There are tons of rules of thumb about CPU architecture that dictate super-linear power increases with additional frequency / processing power. In other words, by halving the computing speed (frequency + CPI), you can bring the power budget down by way more than half. Couple this with the observation that many things in the OS don't require great CPU power, but DO require a CPU to be on.
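For reference, the textbook first-order model behind that rule of thumb (standard CMOS dynamic-power scaling, supplied here for context rather than taken from the comment) is

$$P_{\text{dyn}} \approx C V^2 f$$

and since the sustainable clock frequency scales roughly with supply voltage over the usable DVFS range, $V \propto f$ gives $P_{\text{dyn}} \propto f^3$: halving the clock can cut dynamic power to roughly an eighth, far more than half.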

In the retail space, this is obvious ... iPads go into some wild power-saving mode where I can come back to my iPad after (literally) months and it still has half battery power and wakes up instantly. That's amazing. Even when my iPad is on, much of the time, I'm staring at the screen reading something, using very little CPU. Assuming you can wake the fast CPU with sub-millisecond latency, there is very very little penalty to having most UX and interrupts processing done on a super-low-power CPU, with the big CPU waking whenever a reflow or repaint needs to occur.

In the server space (where these 64-bit chips are targeted), while the benefit might be less obvious (until you think about it), the savings are probably even more dramatic. It's pretty well known in the industry that average CPU utilization on server hardware is well under 10%. In fact, if you average across all servers, many datacenters will be closer to a single-digit percentage. If that sounds crazy to you, think about it for a second.

I spent 2 years working in EC2 and have worked at multiple companies that are major EC2 customers. In EC2, you never even have 100% of the servers allocated to customers (else, you'd be out of capacity). Then, most of those boxes are going to be doing something like what my t1.micro does: sit there all day running an idle Nginx process in the hopes that someone visits my vanity page. Maybe once a week it handles an SSH session for me. That's way less than 1% average load. Even for more serious services that are handling request traffic (e.g. web front-ends), you want to run them around 30% utilization AT PEAK, or else you will have queueing-based latency (see formal queueing theory). That's at peak. At night, there are long stretches of near idleness. And if your service faces spiky or unpredictable loads, you'll over-provision so you can handle the maximum potential traffic, not just the actual traffic you are seeing. Yeah, yeah, autoscaling and all that, but the basic point is that even fully loaded service tiers are nowhere near 100%. Then you have hot-spare machines and machines reserved for DR (disaster recovery). And dev/staging/qa boxes. Etc.

Unlike storage, compute is nowhere near efficiently allocated because it's not fungible. Pretty much the only things that achieve better than 50% load are heavily subscribed worker queues and HPC apps (and even those boxes sit idle some percentage of the time waiting for a customer). So in each 1-millisecond increment, about 90% of the machines in your datacenter are going to be totally idle (for that millisecond). Now imagine that you can turn off the main CPU for all those idle milliseconds and then spin it back up the instant a request comes in. Think about how much power that is.
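The queueing result behind that 30% figure, for anyone curious (the textbook M/M/1 model, supplied here for context rather than taken from the comment): with utilization $\rho = \lambda/\mu$, the mean response time is

$$T = \frac{1}{\mu(1-\rho)}$$

so a request takes about $1.4\times$ its bare service time at $\rho = 0.3$, about $3.3\times$ at $\rho = 0.7$, and $10\times$ at $\rho = 0.9$. Latency diverges as $\rho \to 1$, which is why request-serving tiers are provisioned well below full utilization even at peak.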

So server-space, long-story-short, you can save a ton of power by idling down under-used resources. Most importantly, it doesn't just lower the minimum power, it lowers the P99 max power load. For each individual machine, peak power doesn't change, but at an AGGREGATE level, you will never have 100% of your machines at 100% load. You can do population statistics, and by lowering the per-machine minimum power, your aggregate population's maximum power will improve too. Why is this important? See, in a real datacenter, the capital cost of the transformers and generators and UPS gear and whatnot (stuff that is provisioned for your peak power load) dwarfs the variable cost of the actual kWh of energy from the utility. So lowering the datacenter's peak power consumption saves you a ton of money by letting you buy fewer generators/transformers/etc. Burning through fewer kWh saves money too, just not as much.

And if you do the math, even the "peak" power consumption when all boxes are in use will be FAR less than the theoretical peak of every machine running flat out.
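A quick way to see the aggregate-peak effect is a Monte Carlo sketch (all wattages and utilization numbers here are invented for illustration; only the population statistics matter):

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative fleet model: each machine is independently busy with some
 * small probability at any instant. Lowering per-machine IDLE power
 * lowers the fleet's observed peak, because the fleet is never all busy. */
#define N_MACHINES  10000
#define TRIALS      1000
#define P_BUSY      200.0   /* watts while actually serving work (made up) */
#define UTILIZATION 0.10    /* chance a machine is busy at a given instant */

/* Worst aggregate draw, in kW, observed across TRIALS random instants. */
static double fleet_peak_kw(double p_idle) {
    double worst = 0.0;
    for (int t = 0; t < TRIALS; t++) {
        int busy = 0;
        for (int m = 0; m < N_MACHINES; m++)
            if (rand() / (double)RAND_MAX < UTILIZATION)
                busy++;
        double watts = busy * P_BUSY + (N_MACHINES - busy) * p_idle;
        if (watts > worst)
            worst = watts;
    }
    return worst / 1000.0;
}

int main(void) {
    printf("fleet peak, 80 W idle (big core stays on): %.0f kW\n",
           fleet_peak_kw(80.0));
    printf("fleet peak, 15 W idle (little core only):  %.0f kW\n",
           fleet_peak_kw(15.0));
    return 0;
}
```

The busy count concentrates tightly around 10% of the fleet, so nearly all of the reduction in idle draw flows straight through to the aggregate peak that the transformers and generators must be sized for.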

This could make a lot of sense in a future MacBook Air lineup. Many users don't need the sheer processing power of an Intel CPU. Less power consumption means a smaller battery and heatsink, possibly even eliminating cooling fans. What Intel-based Ultrabook could match that?

I would expect Apple to keep Intel chips in iMacs, Minis, MacBook Pros, and Mac Pros until ARM chips are at least equal performers to Intel's, with solid software coverage.

Jim Z wrote:

yes, 'cos I'm sure Mac users are just itching to go back to Rosetta-levels of performance.

1) Ideally, this would be easy for developers to build for, with just a setting in Xcode to build a universal binary.

2) Apple has massive leverage via the App Store. They would simply require that all new app submissions be universal binaries well in advance of the hardware release. Such a strategy would put them in a far better position than Windows RT.

3) Yes, important players such as Adobe will likely take ages to adapt. However, if you need performance in Photoshop, you'd probably have a MacBook Pro, not an Air, anyway.

4) The picture becomes less clear when it comes to running Boot Camp or VMs. Apple could be willing to say this is simply not supported on the Air lineup: if you want to do that, buy a Pro.

Quote:

ARM's press release talks about "delivering up to three times the performance of today’s superphones". I wonder if they mean "3 times the Cortex-A15", or 3 times the weaker CPUs of the Apple A6 / Qualcomm S4 Pro etc., given that no Cortex-A15 phones are shipping yet.

There's no evidence that a Cortex-A15-based SoC for smartphones would be any better than an Apple A6. They should be about the same.

And considering that these 64-bit designs are a couple of years off, how they compare to current models is irrelevant. In fact, do you doubt that Apple, Samsung, Qualcomm, and the rest will use these designs when they're available, whether by licensing the cores outright or, for those with architecture licenses, by designing their own CPUs around the new architecture? Of course they will!

I hope they are at least *testing* them in Macs, but I doubt they'll ever be used.

They are more likely to create their own in-house ARM CPUs, since they now own one of the best CPU design companies in the world, and Apple's research/development budget is a *lot* bigger than AMD's.

My god, we only just finished going through the PPC -> x86 transition. I really hope they don't move to ARM any time soon... but I'm sure they at least have it going as a skunkworks project.

If iOS is any indicator (iPhone apps run perfectly when compiled for native x64 in the Simulator), then I think OS X would run extremely well on an ARM chipset. It would probably be slower, but a 64-bit version of the A6X with 8GB of RAM... that would be pretty nice in a MacBook Air.

And from what I've seen of third party battery tests, the iPad's battery life estimate is a lot more realistic than the MacBook Air one.

NeXT used to support multiple platforms, and OS X did a pretty good job running on PPC/x86/x64, and all of that cross-platform stuff is still in place to at least allow x86 and x64 binaries to co-exist inside a single executable, even though they've killed PPC support. I think Apple could make a very smooth transition to ARM, but people like me who like to continue using a Mac 5+ years after buying it will suffer. I would still be using my 2004 PowerMac today if Apple hadn't dropped support for it. It was crazy fast when new, and still fast enough for home use today.

Bring it on. The x86 architecture lost the battle for high-throughput computing to stream processing architectures like CUDA (think Larrabee), and for interactive workloads, the key architectural features are control flow (branches and jumps in x86 vs. predicated instructions in ARM) and addressing (x86 addressing works well for arrays but not for maps and map-based objects).

Haswell throws more transistors at Intel's addressing bottleneck. I suspect they are having a real hard time feeding their hungry ALUs under languages like JavaScript, Objective-C, and Python. The x86 architecture reflects a period in computing history when data collections were modeled using arrays and structs in languages like C and control flow was modeled using the stack in languages like Lisp. If Emacs is your idea of what an operating system should be, then x86 is perfect.

All you have to do to transform ARM from its roots in embedded C to the ideal architecture for JavaScript and friends is add a load/store instruction set which takes hash keys rather than linear offsets. It would be much more difficult to do that for x86 because the addressing modes are built into the compute instructions.
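To make that concrete (a hypothetical illustration of the commenter's idea; no such hash-keyed load instruction actually exists), compare what the hardware has to do for an array read versus a map read:

```c
#include <stdio.h>
#include <string.h>

#define NBUCKETS 1024

struct entry { const char *key; int value; struct entry *next; };

/* Toy string hash, purely illustrative. */
static unsigned hash(const char *s) {
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Array case: x86's base + index*scale addressing mode folds the whole
 * address computation into the load itself -- one instruction. */
int array_get(const int *a, int i) {
    return a[i];  /* roughly: mov eax, [rdi + rsi*4] */
}

/* Map case: a chain of DEPENDENT loads (bucket pointer, entry pointer,
 * key pointer, key bytes) that no existing addressing mode can collapse.
 * This is the access pattern JavaScript objects hit constantly. */
int map_get(struct entry **buckets, const char *key) {
    for (struct entry *e = buckets[hash(key)]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e->value;
    return -1;  /* not found */
}

int main(void) {
    static struct entry hello = { "hello", 42, NULL };
    struct entry *buckets[NBUCKETS] = { 0 };
    buckets[hash("hello")] = &hello;
    int arr[4] = { 1, 2, 3, 4 };
    printf("array: %d, map: %d\n", array_get(arr, 2), map_get(buckets, "hello"));
    return 0;
}
```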

Most of the JavaScript engine braintrust is working for companies which are investing heavily in ARM hardware. Apple isn't completely happy with Intel, and both Microsoft and Google got burned by uncompetitive Atom chips. Intel isn't the best for hardcore number crunching or interactive message handling. They're stuck in no-man's land.

Quote:

I suspect they are having a real hard time feeding their hungry ALUs under languages like JavaScript, Objective-C, and Python. The x86 architecture reflects a period in computing history when data collections were modeled using arrays and structs in languages like C and control flow was modeled using the stack in languages like Lisp.

Don't know about Python, but JavaScript and Objective-C both run *way* faster on x86 hardware than on the fastest ARM chips.

It's not even close. I have some code for dealing with a massive dataset that executes in about 5 minutes on an old x86 processor inside the iPhone Simulator, but takes several *hours* on an actual iPhone (obviously this is just experimental code, not a shipping app, but I did put days of effort into getting it to run as fast as possible).

Even though JavaScript is supposedly hardware-accelerated on both ARM (at least in Safari) and x86, a test I found comparing JavaScript performance on an 800MHz ARM to an 800MHz x86 chip (underclocked) showed between 200 and 280% better performance on the x86 chip.

Still, I think the processor in the latest iPad would be "fast enough" for most people, if it was combined with enough RAM and a good SSD.

Yeah, but you can get pretty much the same results by powering down (N-1) of the homogeneous cores in your N-core server, and save a ton of complexity by not having to design a management process to ensure you maintain the necessary performance. big.LITTLE is really more for situations where the workload approaches, but never quite reaches, zero. In server situations the power use of other system components (disks, NICs, DRAM, etc) that stay on with the little core are going to dwarf the power saved by using a little core instead of a big one. Putting the whole system into a sleep state temporarily would be vastly preferable.

I'll note that Cortex-A15 added virtualization extensions, VFPv4, and 40-bit LPAE. And, as the Cortex-A7 has an identical instruction set to the A15 (for the whole big.LITTLE thing), it supports all of that, too.

LPAE can hold off the 32-bit barrier for a while without requiring a switch to AArch64. As long as per-process memory stays below 4 GiB (or whatever the limit actually is; 2 GiB at worst), LPAE can keep things going up to 1 TiB of RAM.

Now, LPAE is NOT a good long-term strategy. But, until the OSes are 64-bitted, it's a great stopgap.
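The arithmetic behind that stopgap is just address-width math:

$$2^{32}\,\text{B} = 4\ \text{GiB (per-process virtual limit)}, \qquad 2^{40}\,\text{B} = 1\ \text{TiB (LPAE physical limit)}$$

A 32-bit OS with LPAE can therefore manage up to 1 TiB of total RAM, so long as no single process needs more than its 4 GiB of virtual space (in practice 2 to 3 GiB after the kernel's share).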

Quote:

Core performance on phones/tablets (especially upcoming A15 designs) has reached the point of good enough.

Nonsense. To take one example, augmented reality currently behaves like crap. It could probably be a lot more pleasant to use if the phone could throw a LOT more processing power at the problem to calm jitter, figure out edges, etc.

Quote:

ARM's press release talks about "delivering up to three times the performance of today’s superphones". I wonder if they mean "3 times the Cortex-A15", or 3 times the weaker CPUs of the Apple A6 / Qualcomm S4 Pro etc., given that no Cortex-A15 phones are shipping yet.

Quote:

There's no evidence that a Cortex-A15-based SoC for smartphones would be any better than an Apple A6. They should be about the same.

And considering that these 64-bit designs are a couple of years off, how they compare to current models is irrelevant. In fact, do you doubt that Apple, Samsung, Qualcomm, and the rest will use these designs when they're available, whether by licensing the cores outright or, for those with architecture licenses, by designing their own CPUs around the new architecture? Of course they will!

We are looking at more than double the performance in V8 and Octane. Even adjusting for the differences (Chrome OS versus Android means a different OS and a different browser, and Cortex-A15s in smartphones may run at slower frequencies to save battery), it seems likely that a Cortex-A15 SoC for smartphones will be faster than the Apple A6 overall, CPU-wise.

Quote:

yes, 'cos I'm sure Mac users are just itching to go back to Rosetta-levels of performance.

Actually, that's an interesting part of the ARM-64 effort. How will these chips stack up against Intel's best?

If they're talking about "three times faster" than the top current smartphone CPUs, then they're talking about "not worthy to wipe a Core i7's ass."

Quote:

RISC was originally about better performance. We'll see how that holds up in the modern era.

I thought "RISC" was about moving complexity off of (then) expensive silicon into the compilers. "Better performance" was just assumed to be a result of that. Unfortunately, as history has proven, RISC vs. CISC doesn't matter when you can throw transistors and bleeding-edge fab processes at the problem.

I thought "RISC" was about moving complexity off of (then) expensive silicon into the compilers. "Better performance" was just assumed to be a result of that. Unfortunately, as history has proven, RISC vs. CISC doesn't matter when you can throw transistors and bleeding-edge fab processes at the problem.

Actually, how it turns out is that RISC vs. CISC is a fuzzy question nowadays.

So, there are three ways to define RISC vs. CISC:

1) Load-store architecture (interestingly, IIRC, newer versions of POWER have one or two instructions that disqualify it from this, meaning they're CISC by that definition)

2) Whether the processor dispatches micro-ops (ARMv7 designs actually do, for two instructions - LDM and STM. But, given that those are the exception to the rule on ARMv7... it can be let slide. POWER, on the other hand...)

3) Whether the processor has the following attributes: equal-length instructions, minimum necessary instruction complexity, and to hell with everything else (ARM has equal-length instructions, but some RISC purists hate that it has a multiply instruction in anything after the ARM1 (which was essentially the prototype), and that it has conditional execution)

In any case, the problem is, the RISC purists are wrong. RISC was designed as a way to support pipelining with equal execution length instructions, so that clock speeds could be ramped up (because back when the first RISC CPUs were designed, RAM was actually FASTER than CPUs - look at the personal computer architectures of the time, many interleaved RAM access between CPU and video). Turns out, though, the method worked so well that CPUs got many times faster than RAM. And, pure RISC executable code is actually at a disadvantage here, because it's not very dense. This means more memory bandwidth is needed, and more total memory.

Enter the RISC/CISC hybrid - the NexGen Nx586. While nobody has ever accused x86 of being an elegant architecture (usually it's PDP-11 and 68000 that are seen as the elegant CISC architectures), it is more code efficient than the purist RISC architectures. If you can efficiently translate x86 to an internal purist RISC architecture that's optimized towards x86 workloads (which *IS* screaming fast if you can keep it fed with data), in hardware, you've got the memory requirements of an x86 processor, with the performance (or very nearly so) of a pure RISC architecture. Basically, it turned out, the answer was, externally CISC, internally RISC.

ARM has some very interesting corner cases going for it. ARM has been driven since day 0 to use little power. But three times the performance of the A15 still makes for a pretty low-end server chip, even in the many-wimpy-core solution space ARM, and now AMD, are aiming for, particularly when one compares it to proper server chips, even limiting the comparison to Opteron variants. It looks worse if one considers that same Opteron on 20nm. Three times the speed of a smartphone for use as a smartphone is cool (in two years we will be nearly there anyway), but for use as a server... we will have to wait and see.

When one scales ARM up to modern processing speeds/data widths/cache sizes, they have the same problems that their competitors have: speed costs power. ARM has been a master at keeping their (and your) foot off the throttle. The entire reason for big.LITTLE is that ARM cannot make things work better than anyone else when the speed and additional processing resources are cranked up, which makes that solution not genius but necessary. The little processor HAS to exist, or the big one isn't better than the competition. ARM physics is still the physics that everyone else has to use, since they are all using TSMC ;-P

I hope the AMD board can find some success with this, but I still worry about chasing the bottom rung of the ladder. AMD has made money in the server space when it doesn't do that, instead marking out a profitable mid-to-high-performance niche. They are setting themselves up to fight for server chips that many will be reluctant to pay very much money for. That will make them revenue and maybe even some profits, but not nearly enough to make a significant difference on a balance sheet. Those profits will require many wafer starts and high yields on new processes to succeed, and will not replace the higher-priced Opterons.

Their advantages that might work out are AMD floating point and ATI graphics being fused into that same space. Trying to merge those technologies could be a process nightmare, but that may well be the plan.

Joke: Maybe this is tied to the reduction in force for engineering? We don't need designers; we have ARM. Or adding an ARM to spite your engineers? /Joke off, I hope. I haven't been able to figure out their board for the last several years, except that they certainly don't understand silicon.

Historical note: While ARM's expertise has been in showing how not to use power, oddly enough, AMD's history (from way, way back) was to out-power Intel when second-sourcing their IP, thus producing faster chips. That was why their traditional x86 packaging was ceramic; they needed it because of the heat. Of course, the modern x86 processors would terrify the normal users of pre-80386 systems. AMD desktops using 130 watts? Argh! Just think what an iTanic would do to their blood pressure. I'd laugh if I knew CPR.

Quote:

Actually, how it turns out is that RISC vs. CISC is a fuzzy question nowadays.

So, there are three ways to define RISC vs. CISC:

1) Load-store architecture (interestingly, IIRC, newer versions of POWER have one or two instructions that disqualify it from this, meaning they're CISC by that definition)

2) Whether the processor dispatches micro-ops (ARMv7 designs actually do, for two instructions - LDM and STM. But, given that those are the exception to the rule on ARMv7... it can be let slide. POWER, on the other hand...)

3) Whether the processor has the following attributes: equal-length instructions, minimum necessary instruction complexity, and to hell with everything else (ARM has equal-length instructions, but some RISC purists hate that it has a multiply instruction in anything after the ARM1 (which was essentially the prototype), and that it has conditional execution)

In any case, the problem is, the RISC purists are wrong. RISC was designed as a way to support pipelining with equal execution length instructions, so that clock speeds could be ramped up (because back when the first RISC CPUs were designed, RAM was actually FASTER than CPUs - look at the personal computer architectures of the time, many interleaved RAM access between CPU and video). Turns out, though, the method worked so well that CPUs got many times faster than RAM. And, pure RISC executable code is actually at a disadvantage here, because it's not very dense. This means more memory bandwidth is needed, and more total memory.

Enter the RISC/CISC hybrid - the NexGen Nx586. While nobody has ever accused x86 of being an elegant architecture (usually it's PDP-11 and 68000 that are seen as the elegant CISC architectures), it is more code efficient than the purist RISC architectures. If you can efficiently translate x86 to an internal purist RISC architecture that's optimized towards x86 workloads (which *IS* screaming fast if you can keep it fed with data), in hardware, you've got the memory requirements of an x86 processor, with the performance (or very nearly so) of a pure RISC architecture. Basically, it turned out, the answer was, externally CISC, internally RISC.

Well said.

We ran tests (many, many times, and many years ago) comparing them, and the results were irritating and consistent. RISC produced larger programs, which required longer to load, more space on the (very expensive) hard drives, and more memory. CISC was uglier but ran in less memory, with a smaller file-system footprint, and produced very similar execution results. Also, it produced internal flame wars, much like it still does today.

I've always wanted any of the modern Intel CPUs to let me access their internal RISC. That would be a blast to play with. Just skip the x86 decode block and let me have at it!

All the processor capability aside, I bet the big driver for 64-bit ARM chips will be just to increase memory address space. There are phones and tablets that already have 2 gigs of physical memory; the 4-gig address-space barrier is probably already constraining virtual memory on current-gen mobiles.

Quote:

yes, 'cos I'm sure Mac users are just itching to go back to Rosetta-levels of performance.

Quote:

Actually, that's an interesting part of the ARM-64 effort. How will these chips stack up against Intel's best?

If they're talking about "three times faster" than the top current smartphone CPUs, then they're talking about "not worthy to wipe a Core i7's ass."

My post was in regards to NVIDIA's Project Denver, and I don't believe they've released performance data. Thus my remark. You might also reflect on the fact that NVIDIA is all about moving many compute-intensive workloads on to the GPU. Combined ARM+GPU units are sure to be in the works for high-end applications. I'm sure NVIDIA will support the same advanced NEON functionality as 64-bit Cortex.

Quote:

Quote:

RISC was originally about better performance. We'll see how that holds up in the modern era.

I thought "RISC" was about moving complexity off of (then) expensive silicon into the compilers.

RISC was about a number of things, for instance larger register files (31 general-purpose registers in AArch64 on the Cortex-A57 and 32 in POWER vs. 8 for IA32 and 16 for AMD64). Silicon is still expensive, and simpler, smaller cores mean you can put more of them on a single die. Software is (slowly) becoming more multithreading-friendly.

Quote:

"Better performance" was just assumed to be a result of that. Unfortunately, as history has proven, RISC vs. CISC doesn't matter when you can throw transistors and bleeding-edge fab processes at the problem.

Those are both straw man arguments. RISC has plenty of uses for "extra" transistors (cache, register file, SIMD units, cores) just as does CISC. As to process, that has been Intel's great strength, but it is getting close to true physical limits these days. In fact, the end of the silicon process roadmap may be an inflection point for Intel. There's a lot of activity devoted to researching alternatives these days, see IBM's recent work on self-organizing carbon nanotube circuits.

Quote:

Core performance on phones/tablets (especially upcoming A15 designs) has reached the point of good enough. I guess it's probably too much to hope that SoC manufacturers will stop the arms (get it) race for higher clock speeds in favor of keeping performance near level and dramatically increasing battery life instead. With the A15's increase in performance/watt, all of the manufacturers seem to have just increased the performance and kept battery life equal. Now that we have plenty-fast CPUs, I hope the next generation of SoCs will go for 24-hour battery life! 8-12 is good enough, but being able to go a full day would be wonderful for those long days!

Although for that to matter, we'd also have to stop the screen-resolution wars. 1080p on my phone? Sure, I'll take it. Double that... OK, I can't see the difference, and now my power consumption just doubled. Spec wars ftl.

Well said. It became obvious to me, once tech media sites started posting CPU performance benchmarks in phone reviews a couple of years ago, that this sort of rot in common sense had started.

The tech media have a big responsibility to stop insisting on unnecessary/unoptimised performance at the expense of efficiency. Just recently we had ARM themselves praising certain phone manufacturers for sticking to dual-core Cortex-A15s instead of quads, yet the tech media completely ignore the advice given by those who designed the original tech! Yes, and that includes Ars, who dismissed certain phones in a recent round-up for "only" having a dual-core A15 when some of the competition had quads.

Quote:

Core performance on phones/tablets (especially upcoming A15 designs) has reached the point of good enough.

Quote:

Nonsense. To take one example, augmented reality currently behaves like crap. It could probably be a lot more pleasant to use if the phone could throw a LOT more processing power at the problem to calm jitter, figure out edges, etc.

... or by employing competent software coders capable of using the excessive amounts of hardware already at their disposal ...

Quote:

ARM's press release talks about "delivering up to three times the performance of today’s superphones". I wonder if they mean "3 times the Cortex-A15", or 3 times the weaker CPUs of the Apple A6 / Qualcomm S4 Pro etc., given that no Cortex-A15 phones are shipping yet.

Quote:

There's no evidence that a Cortex-A15-based SoC for smartphones would be any better than an Apple A6. They should be about the same.

And considering that these 64-bit designs are a couple of years off, how they compare to current models is irrelevant. In fact, do you doubt that Apple, Samsung, Qualcomm, and the rest will use these designs when they're available, whether by licensing the cores outright or, for those with architecture licenses, by designing their own CPUs around the new architecture? Of course they will!

Quote:

We are looking at more than double the performance in V8 and Octane. Even adjusting for the differences (Chrome OS versus Android means a different OS and a different browser, and Cortex-A15s in smartphones may run at slower frequencies to save battery), it seems likely that a Cortex-A15 SoC for smartphones will be faster than the Apple A6 overall, CPU-wise.

Having a rough idea of how fast performance will improve in the next 2-3 years is useful to plan one's upgrade frequency/strategy.

Well, it's running much faster than Apple is clocking the A6. It's also difficult to get the clock of Apple's chips for differing tests, as the chip clocks differently according to the load. I don't know how these chips compare in that. But the A6 uses much less power than the one in these tests as well. It's not a fantastic comparison. The new iPad is supposedly clocked a bit higher than the iPhone 5. The A15 for phones is different in more ways than just a lower clock.

Oh, and something to note about ARMv8, if anyone's still reading this comment thread...

It's not really ARM at all, when in 64-bit mode.

Conditional execution has been removed except on a few instructions
PC and SP are no longer general-purpose registers
There's a dedicated zero register
LDM/STM (which date back to the original ARM ISA) have been replaced with LDP/STP (load/store pair)

Quote:

Oh, and something to note about ARMv8, if anyone's still reading this comment thread...

It's not really ARM at all, when in 64-bit mode.

Conditional execution has been removed except on a few instructions
PC and SP are no longer general-purpose registers
There's a dedicated zero register
LDM/STM (which date back to the original ARM ISA) have been replaced with LDP/STP (load/store pair)

This thing is friggin' MIPS, just without the branch delay slot!

That is depressing!

At least AMD understood how to expand the x86 ISA, and since Intel had access to all of their IP via the court case, they used it as well.

You did make me look up The Soul of a New Machine, so some good came of your depressing information. It is about Data General building a new 32-bit minicomputer that was backwards-compatible with their 16-bit version.

Nobody said anything about TI's absence, but I suspect it's related to their decision to exit the market for high-end ARM processors (the ones found in phones and tablets) and concentrate on embedded products.
