ISC Fujitsu is working on a supercomputer built around 64-bit ARMv8 cores, a machine which will clean the clock of the newly crowned Chinese king of the supercomputers, the Sunway TaihuLight.

The computer is going to be called the Post-K super because it will succeed Japan's K Supercomputer. The K machine is the fifth fastest in the world and can manage 10.5 PFLOPS, needs 12MW of power, and is built out of 705,000 Sparc64 VIIIfx cores.

But the Post-K will have 100 times more application performance than the K Supercomputer and manage 1,000 PFLOPS when it is switched on in 2020. The fastest known super in the world now is China's 125.4 PFLOPS Sunway TaihuLight machine which uses Chinese chips.

The Post-K will be the fastest known ARM-powered system. That is not really difficult, because there are no other ARM supercomputers out there.

The custom-designed supercomputer CPU powering the Post-K will run ARMv8 code which has been optimized to speed up the adding up. Fujitsu is an ARM licensee and is a big fan of classic RISC architectures. It used to use the Sparc64 and has now gone to 64-bit ARM.

Phytium has announced a processor codenamed Mars that comes with 64 ARMv8-based cores optimised for High Performance Computing (HPC). Eight of the Xiaomi cores make a Panel, and a Panel connects directly to the DDR3 memory or PCIe interface.

All eight Panels share 128MB of L3 cache and 16 DDR3-1600 channels with 204GB per second of bandwidth available. They share two PCIe 3.0 interfaces with 16 lanes and 32GB per second of total bandwidth.
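The memory bandwidth figure can be checked with a little arithmetic. The sketch below assumes each DDR3-1600 channel has the standard 64-bit (8-byte) data bus, which the article does not state. The function and constant names are made up for illustration.

```python
# Rough check of the quoted memory bandwidth.
# Assumption (not stated in the article): each DDR3-1600 channel
# uses the standard 64-bit (8-byte) data bus.

TRANSFERS_PER_SECOND = 1600e6  # DDR3-1600 means 1600 mega-transfers per second
BYTES_PER_TRANSFER = 8         # assumed 64-bit channel width
CHANNELS = 16                  # channel count quoted in the article


def ddr_bandwidth_gb_per_s(channels, transfers_per_second, bytes_per_transfer):
    """Return the theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * transfers_per_second * bytes_per_transfer / 1e9


total_bandwidth = ddr_bandwidth_gb_per_s(
    CHANNELS, TRANSFERS_PER_SECOND, BYTES_PER_TRANSFER
)
print(f"Peak memory bandwidth: {total_bandwidth:.1f} GB/s")
```

Under that assumption the theoretical peak works out to roughly 204.8GB per second, which lines up with the 204GB figure quoted for the chip.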

Given that this is a 28nm processor manufactured at TSMC, it is huge. It measures 640 square mm and runs at 2GHz. It can produce 512 GFLOPS in double precision.
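Working backwards from the quoted figures gives a feel for what each core has to do to reach that peak. This is only a sketch using the numbers in the article (64 cores, 2GHz, 512 GFLOPS). The function name is hypothetical and nothing here comes from Phytium's own documentation.

```python
# Back-of-the-envelope: how many double precision operations per clock
# cycle does each core need to reach the quoted peak?

CORES = 64               # core count quoted in the article
CLOCK_HZ = 2.0e9         # 2GHz clock quoted in the article
PEAK_DP_FLOPS = 512e9    # 512 GFLOPS double precision quoted in the article


def flops_per_core_per_cycle(peak_flops, cores, clock_hz):
    """Return the implied floating point operations per core per cycle."""
    return peak_flops / (cores * clock_hz)


implied = flops_per_core_per_cycle(PEAK_DP_FLOPS, CORES, CLOCK_HZ)
print(f"Implied DP FLOPs per core per cycle: {implied}")
```

The figures imply four double precision operations per core per cycle. That would be consistent with, for instance, one 128-bit fused multiply-add unit per core, although the article does not say how the core actually gets there.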

In single-copy SPEC CPU2006, a single core scores 19.2 in the integer test and 17.8 in the floating point test. With all 64 cores, the Mars processor scores 672 in the integer test and 585 in the floating point test.
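One way to read those scores is to ask how well the chip scales from one core to 64. The sketch below simply divides the all-core score by 64 times the single-core score. It assumes the two sets of numbers are directly comparable, which may not hold if they come from different SPEC metrics (speed versus rate), so treat the result as a rough indication only. The function name is made up for illustration.

```python
# Rough multi-core scaling estimate from the quoted SPEC CPU2006 scores.
# Assumption: the single-core and 64-core scores are directly comparable.

CORES = 64


def scaling_efficiency(single_core_score, all_core_score, cores):
    """Return the all-core score as a fraction of perfect linear scaling."""
    return all_core_score / (single_core_score * cores)


int_efficiency = scaling_efficiency(19.2, 672, CORES)
fp_efficiency = scaling_efficiency(17.8, 585, CORES)

print(f"Integer scaling efficiency: {int_efficiency:.1%}")
print(f"Floating point scaling efficiency: {fp_efficiency:.1%}")
```

On these numbers the 64-core score is a bit more than half of what perfect linear scaling would give, about 55 percent for integer and about 51 percent for floating point.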

Andreas said that two Xeon E5 2697v3 processors with 56 logical cores score only 120 in the integer test and 857 in the floating point test.

Mars has a 120W TDP, which is low for the supercomputing market. We will see if Qualcomm can do better with its Hydra processor, which will end up on a smaller manufacturing node.

The 64-bit Cortex-A57 core is ARM’s latest and greatest CPU design, but very few chipmakers are actually building products based on this flagship core. In fact, many are skipping it altogether, so what’s going on here?

There is one thing to keep in mind. The Cortex-A57 is by no means a new design. In fact, it was announced in October 2012, with availability slated for 2014. As we all know, the rollout wasn’t very smooth and the only Cortex-A57 consumer part ready to ship in 2014 was the Exynos 7410 of Galaxy Note 4 fame. It was followed by the Snapdragon 810 and Exynos 7420, which hardly need an introduction.

Cortex-A57 is almost on schedule, so what’s the big deal?

While it is true that the Cortex-A57 was almost on time, our concern isn’t the rollout schedule – it’s the lack of designs. For a product announced 30 months ago, it has relatively few design wins and this is not going to change. In fact, at this point it is more or less obvious that a number of major SoC makers will skip it altogether.

Another relatively big player, Huawei HiSilicon, also appears to be skipping the A57. The company’s upcoming Kirin 940 and Kirin 950 parts should end up with Cortex-A72 cores instead. That’s not all, because some outfits like Nvidia have their own custom cores. Qualcomm is also expected to employ a custom core in the Snapdragon 820, while rumours of a Samsung custom ARMv8 core have been floating around for ages.

Thermal barrier and economics stall ARM SoC evolution

There are a few possible explanations for the lack of Cortex-A57 design wins, and they involve physics and economics.

From a technical perspective, the A57 requires too much effort and does not provide huge performance gains. Used in a big.LITTLE octa-core, the Cortex-A57 necessitates the use of four additional Cortex-A53 cores, a big GPU to match its potential, and the customary 4G modem found on high-end devices. All this results in a relatively big die with a lot of transistors, especially on planar nodes.

Snapdragon 810 layout - note the size of the GPU, modem, and other modules. The CPU doesn't look too big, but unlike other modules it is always utilised to some extent. Under load, the CPU and GPU are bound by the thermal envelope, which is not the case with the rest of the chip.

Thermal and power efficiency issues are another concern, as such a chip simply can’t reach its full potential on planar nodes, unless consumers suddenly become interested in buying big and thick phones, with oversized heatsinks and batteries.

The Cortex-A57 really isn’t an option at 28nm. It can, however, be successfully deployed on 20nm and 14/16nm FinFET nodes. This makes it an unattractive proposition for all but the most expensive devices, since it’s an elaborate design that requires an expensive, cutting-edge node to be implemented. By the time FinFET matures and foundry costs go down, ARM will already have another design to take its place – the Cortex-A72.

Cortex-A57 vs. Cortex-A72

The Cortex-A72 was announced in February 2015 and ARM expects to see it in commercially available devices by early 2016. Some chipmakers would like to get their hands on it even sooner, even using it on 28nm nodes rather than the FinFET nodes it was originally designed for.

In some respects, the Cortex-A57 shared a similar fate to that of its predecessor, the Cortex-A15. The latter debuted on Samsung’s 32nm parts, but due to thermal issues the core wasn’t widely used until 28nm nodes became available (and cheap). However, it was all a matter of good timing – the A15 arrived just in time for 28nm, while the A57 sort of missed its window of opportunity.

Worse, Android 5.0 brought 64-bit support last year, prompting Google to tap Nvidia for its Nexus 9 tablet, as its Denver core was practically the only 64-bit ARM "big core" Google could use. Consumers could get affordable Cortex-A53 devices with 64-bit support, but they couldn’t get flagship 64-bit devices. This may not be an important distinction for the average Fudzilla reader, since tech enthusiasts know 64-bit support simply wasn’t too relevant in 2014 (and still isn’t). However, it was a lot easier to market 64-bit parts based on small cores than big 32-bit cores.

So, will the Cortex-A72 end up with more design wins than the A57? Is it really much better than the A57?

Personally, I am inclined to say that the Cortex-A72 will be a lot more successful, not by virtue of its design, but thanks to better timing and the limited appeal of the Cortex-A57. ARM did not reveal a lot of information on the A72, other than to state that the new core will be vastly more efficient than the A15 and A57, but its numbers were based on different nodes (28nm for A15, 20nm for A57, 16nm for A72).

We simply don’t know much about the Cortex-A72 yet and it's too early to jump to conclusions.

What does this mean for 2015?

Moving forward, the lack of a viable 64-bit ARM core for mid-range, and even some high-end devices on 28nm, is bound to have a number of implications for the smartphone SoC market and smartphone design in general.

The Cortex-A57 simply won’t end up in a lot of devices, as it only makes sense on 20nm and 14/16nm FinFET nodes, so chipmakers will have only one choice – churn out more Cortex-A53 parts at higher clocks, with faster GPUs and better LTE support. Unlike last year, they don’t have the option of using four cores (A15, A17, A9 and A7), as they can only use A57 and A53 cores, but the A57 simply doesn’t work for most market segments. The Cortex-A17 looks like a very tempting alternative and MediaTek already tapped it for some parts, but this is a 32-bit core, positioned below the Cortex-A15 and Cortex-A57. While the A17 is a good performer with a good price/performance ratio, consumers demand 64-bit chips, plain and simple.

This will obviously have the effect of blurring the line between low- and mid-end devices, as many of them will have to share similar silicon – consumers will get A53 cores whether they’re buying a $100 phone or a $300 phone.

Companies like Huawei and MediaTek have already hinted at, or revealed, chips designed to address the problem by including four A53 cores at higher clocks (Huawei calls them A53e or enhanced cores). These cores will be backed by four slower A53 cores, and Qualcomm already uses such a layout in the Snapdragon 615.

When ARM announced the Cortex-A17 last February, the company made it clear that 28nm would provide “the most transistors per dollar” – this is still the case.

It is highly unlikely that any of these chips will be manufactured using expensive 20nm or FinFET nodes, at least not in the foreseeable future (at least four quarters, possibly five due to high demand for flagship chips in Q1 2016). Capacity is limited, cost will remain prohibitively high for months, and 28nm works just fine for Cortex-A53 parts. As a result, SoC designers are already doubling down on 28nm capacity, as it is obvious the node will have to soldier on well into 2016.

Even more 28nm Cortex-A53 designs with tweaked cores, updated graphics and modems.

Smartphone makers will have to devise new ways of differentiating non-flagship products.

Prices of mid-range devices are likely to drop.

No Cortex-A53 parts on 20nm or 14/16nm nodes.

28nm node will continue to dominate the mobile landscape for at least 4 quarters and start tapering off in the second half of 2016.

Soft demand for limited capacity FinFET nodes over the next 2-3 quarters due to lack of Cortex-A57 designs.

Intel could benefit from stalled ARM development.

There are a few caveats. Some small-core chips could make it to a new node later this year, but we are talking about niche products (perhaps some wearable SoCs, or in-house designs for certain low-volume smartphones). If demand for FinFET parts proves to be much lower than anticipated, it is possible that foundries will have to reduce pricing as more capacity comes online – but this depends on a wide range of factors and we doubt anyone can make a good forecast for at least the next quarter or so.

2015 will not be a very eventful year for the ARM SoC market, but it might turn out to be a race to the bottom.

British chip designer ARM has just signed off its 50th licensing agreement for its ARMv8-A technology, which includes support for 64-bit computing.

More than 27 companies have signed agreements for the company’s ARMv8-A technology and it seems that the idea is starting to gather some momentum. ARM said that the companies that signed up include all of the top 10 companies that sell application processors for smartphones; nine of the top 10 application processor companies for tablets; four of the top five companies that provide chips for consumer electronics; four of the top five companies that provide chips for enterprise networking and servers; and eight silicon vendors from Greater China.

ARM said that the licensing agreement demonstrates the continuing strength in demand for the company’s 64-bit-capable ARM Cortex-A50 processor family and ARMv8 architecture licenses which will serve future digital devices and infrastructure deployments coping with more complex applications within strict power budgets.

Noel Hurley, general manager of ARM's processor division, claimed that ARMv8-A technology brings 64-bit capability and improves the efficiency of existing 32-bit applications. All this should help tablets and smartphones replace PCs for many tasks.

The ARMv8-A platforms are fully backward compatible and will efficiently execute over a million 32-bit apps and extensive software assets already in use, Hurley said.

ARM says that no one really needs 128-bit chips and that 64-bit will be enough for everyone for a long time. ARM dismissed a rumour that it was working on a 128-bit processing architecture, saying that in the coming years everyone in the ARM camp will try to improve their 64-bit offerings, not introduce all-new 128-bit CPUs that will not be needed for years to come.

Ian Drew, chief marketing officer and executive vice president of ARM, said that news reports suggesting that ARM is developing 128-bit processor technology are untrue.

“64-bit processors are capable of supporting the needs of the computing industry now and for many years to come. There are absolutely no plans underway for 128 bit ARM-based chips because they simply aren’t needed. Rumors to the contrary are simply incorrect,” he said.

He said that it was an exciting time for the ARM ecosystem, with leading solutions from ARM partners taking computing to the next level. In the coming year, he expects to see a growing number of announcements of 64-bit solutions across the mobile, networking and server markets.

Over the last couple of months Qualcomm CMO Anand Chandrasekher got plenty of publicity thanks to his undiplomatic comments about Samsung’s octa-core Exynos chips and Apple’s 64-bit A7, which he described as a marketing gimmick. Then again, he was even harsher with Samsung, as he said that doing octa-core chips is plain stupid and that Qualcomm wouldn’t make any.

However, questioning religious beliefs of Apple fanboys in this day and age is simply not an option. It’s a bit like spitting in the Pope’s face or walking around Mecca with a cartoon of Mohammad on your T-shirt. It’s only a matter of time before radical Apple fanboys start suicide bombing those who question their Lord and Saviour, or church doctrine.

In the end, Qualcomm decided to quietly reassign Chandrasekher to a new post “exploring certain enterprise related initiatives,” whatever that’s supposed to mean, reports CNET. A few weeks ago Qualcomm had to issue a statement following Chandrasekher’s 64-bit comments, describing them as “inaccurate”.

Unlike Samsung and Nvidia, Qualcomm has not said anything about its 64-bit plans. Nvidia has Project Denver and Samsung has already said its Exynos chips will go 64-bit, although it is unclear when. It is obvious that Qualcomm is working on 64-bit chips of its own, but it doesn’t tend to announce products two years in advance like Nvidia, so Chandrasekher’s comments were the last thing it needed.

A week ago Qualcomm CMO Anand Chandrasekher called Apple’s 64-bit support in the A7 SoC a “marketing gimmick” and last night Qualcomm issued a statement retracting Anand’s bash.

“I think they are doing a marketing gimmick. There’s zero benefit a consumer gets from that,” said Chandrasekher. Of course, he was talking about a specific product at a specific time and since we all know 64-bit support is inevitable, Qualcomm clarified Chandrasekher’s statements.

"The comments made by Anand Chandrasekher, Qualcomm CMO, about 64-bit computing were inaccurate. The mobile hardware and software ecosystem is already moving in the direction of 64-bit; and, the evolution to 64-bit brings desktop class capabilities and user experiences to mobile, as well as enabling mobile processors and software to run new classes of computing devices," a Qualcomm spokesperson told us.

However, Qualcomm stopped short of shedding any more light on its own 64-bit plans. Samsung already made it clear that it’s working on 64-bit parts, ARM’s A53 and A57 cores were announced ages ago and Nvidia has Project Denver. Qualcomm is still refusing to say anything about 64-bit Krait parts, but it’s more than obvious that it will have to roll out its own ARMv8 parts soon.

Given Qualcomm’s launch schedule, this probably won’t happen next year, as the Snapdragon 600 and 800 should be replaced in the first half of the year and it is highly unlikely that the new parts will feature 64-bit support.

Qualcomm is advertising at least three software engineering positions that seem to indicate it is gearing up to enter the ARM server market.

According to EE Times, the chipmaker is looking for designers who know their way around ARMv8-based server SoC ASICs and can deal with development, porting and integration of server platform management software and firmware.

In plain English, Qualcomm is apparently working on 64-bit server SoCs. Although the company is the clear market leader in the ARM space, it never really focused on server parts. Other players, including AMD, Marvell, Samsung and Nvidia have also announced plans to enter the ARM server market.

The first shipments of 64-bit server SoCs are expected next year, but Calxeda and Marvell are already shipping 32-bit SoCs for servers, albeit in limited numbers.