I got this on another message board I go to. It was posted by a guy who tries to get us to switch to Linux and is wary of Google because they spy on us, etc. I am not sure if that means this article is bullshit, but since I am not an expert I was hoping to get information from those who are more versed in computing.

link

I recommend clicking on the link because it buttresses the article with numerous graphics, which I am not going to load on this site.

Quote:

ICube UPU, the next step in processor evolution?

Reported by Nebojsa Novakovic on Friday, January 13 2012 4:34 pm

Any processor guru will tell you - It's very difficult to prop up a new instruction set architecture, even after going through all the technical difficulties, as the whole software stack has to be created from scratch: BIOS, compilers, libraries, operating system porting, drivers, basic applications, not to mention convincing the critical third-party software vendors to actually support the new architecture. All this requires substantial engineering, marketing and financial resources, not to mention a lot of guts.

Ever since Alpha in 1991, no new major instruction set architecture has appeared in the general market. In fact, since then, most of the non-X86 architectures disappeared from the scene, leaving the X86 - even though widely agreed to be technically the worst - as the predominant one. Power and Sparc still keep a part of the server field, while ARM is, of course, the king of the hill right now in the mobile arena, with its old RISC competitor, MIPS, making some inroads as well.

Now, for the first time in two decades, there's a company openly promoting its own new instruction set, and launching a processor based on it right into the hot waters of the mobile device market. Furthermore, UPU is a brand new philosophy too - for the first time, CPU and GPU are truly fused into one processor core, MVP (Multi-thread Virtual Pipeline), where even the register file is shared! And, the extra surprise element is that the whole thing is fully designed and made in China, a true 'China Core' right from the instruction set definition, without any licensing or other dependencies on US technology.

ICube was set up by Fred Chow and Simon Moy, two industry veterans: Simon was behind the world's first 64-bit MIPS processors at SGI, and after that the principal engineer at Nvidia for 7 years until 2004, in charge of all the initial GPU, shader and GPU computing efforts. Fred was chief scientist at SGI, in its golden days of funky coloured superworkstations, and principal engineer at MIPS, later developing the Pathscale compiler suite that gave AMD Opteron its first 64-bit X86 support. He is the chief architect of the open-source Open64 compiler suite.

So, interestingly, here you have two CPU designers actually driving the initial CPU design and instruction set - often it was memory companies (Intel was a DRAM maker when designing its first CPUs) or system companies like IBM, DEC or HP.

UPU (Unified Processor Unit) approach in their 'Harmony' architecture is the first situation where CPU and GPU threads are sharing the same execution units, register file and many instructions. In a sense, it is a 'total fusion' of the two, unlike the AMD Fusion APU approach where CPU and GPU are still distinct, with separate instruction sets, registers, execution units and such. You could consider the UPU as an example of 'homogeneous computing' just like any standard processors we are used to, while the APU belongs to 'heterogeneous computing' where different threads of different nature would run separately on CPU and GPU portions.

A very simple, elegant 32-bit RISC core, not unlike the original MIPS, does both functions, and the single 32-unit 32-bit register file is there for all operations. To support further parallelism, 4-way multithreading per core is supported, with optimised logic to remove the need for 4 separate register files. The compromises in the initial version? No SIMD vector stuff like Intel AVX, and no double-precision FP either. If you want more performance, you use more cores, which can be piled up together easily due to comparatively very small core footprint - only 2.7 square mm in the old 65 nm process. If in 32 nm process, it'd likely be only 1 square mm. This means that, on an average current 200 square mm CPU chip in 32 nm process, you could mount over 100 of these cores, plus interconnect logic and huge multimegabyte shared cache, all together.
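The area figures in that paragraph can be sanity-checked with a rough sketch. It assumes ideal scaling, where area shrinks with the square of the feature size, which real processes rarely achieve. The function names and the `CORE_FRACTION` split between cores and cache/interconnect are made up for illustration; only the 2.7 square mm, 65 nm, 32 nm and 200 square mm figures come from the article.

```python
# Idealized die-area scaling: area shrinks with the square of the feature size.
# Real processes scale less cleanly, so treat these numbers as rough bounds.

def scaled_area_mm2(area_mm2: float, from_nm: float, to_nm: float) -> float:
    """Area of the same design after an ideal shrink from one node to another."""
    return area_mm2 * (to_nm / from_nm) ** 2

def cores_that_fit(die_mm2: float, core_mm2: float, core_fraction: float) -> int:
    """How many cores fit if only `core_fraction` of the die is given to cores."""
    return int(die_mm2 * core_fraction // core_mm2)

CORE_65NM_MM2 = 2.7   # core size at 65 nm, from the article
DIE_MM2 = 200.0       # "average current CPU chip", from the article
CORE_FRACTION = 0.5   # assumption: half the die for cores, the rest for cache/interconnect

ideal_32nm = scaled_area_mm2(CORE_65NM_MM2, 65, 32)
print(f"ideal 32 nm core area: {ideal_32nm:.2f} mm^2")
print(f"cores at 1 mm^2 each:  {cores_that_fit(DIE_MM2, 1.0, CORE_FRACTION)}")
print(f"cores at ideal area:   {cores_that_fit(DIE_MM2, ideal_32nm, CORE_FRACTION)}")
```

Under these assumptions the ideal shrink comes out smaller than the article's 1 square mm guess, so the "over 100 cores" claim looks plausible as arithmetic, provided roughly half the die or more really goes to cores.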

The result is a very small, compact dual-core chip even in the initial IC1 iteration, which they plan to scale to a quad-core IC2 next year, using a 40 nm or better process. Simon Moy told us that, since China needs to catch up fast, we are likely to see its CPU vendors jumping two process generations in one go to reach higher performance. That also applies to the high-end MIPS-based Loongson processors there.

The first chip specs are not bad knowing it's the very first iteration of a brand new instruction set:

While this first iteration is aimed squarely at inexpensive smartphones and tablets, without need for Full HD display or encoding, the potential of the architecture is there. Small, very low power core, yet with clean, elegant instruction set and ability to put thousands of cores in a single rack, can cover both client and server sides of a cloud, since many-user web page serving doesn't require 64-bitness or fast cores with large memory, just many cores for each of many threads to have its own resource without context switching overhead.

What they could add further is, obviously, 64-bitness with SIMD support in some future iteration, as it would ultimately help both integer and GPU performance and let the architecture address the high-end market. Also, a single channel of DDR2-533 is enough for a low-end phone, but multichannel DDR3 support will be important for market expansion to higher-end devices like servers, where the new instruction set is less of a problem because the open-source software stack can usually be recompiled quickly.

Once the Android port is fully tuned over the next few months, we'll have a look at the device's performance in a real OS, and its real potential. ICube has a tough job on their hands convincing partners that a brand new instruction set should be supported, but what they created here can ultimately apply across many market segments, from smartphone to superserver. Let's see how they do in their initial chosen market segments first.

On the technical side it seems legit at first glance. But the author is overstating the ramifications. It's only one more SoC for mobile devices. So they designed a 100% new instruction set... *yawn* I wish them all the best but I don't think this will have as much of an impact as the article implies.

This seems like one of a number of projects to erase the division between CPU and GPU. I know the possibility has been talked about before, at least, with more applications turning to the GPU for loads of additional processing power - if people are doing that why not just make the whole system a set of 'GPUs'?

On the technical side it seems legit at first glance. But the author is overstating the ramifications. It's only one more SoC for mobile devices. So they designed a 100% new instruction set... *yawn* I wish them all the best but I don't think this will have as much of an impact as the article implies.

Er, SoC? What's that abbreviation mean?

Can you also comment on the technical side? It seems that they claim the chip will produce more computing power per unit energy used according to one of their tables.

System on a chip; it's a full computing solution optimized for mass production, minimal circuit board count (almost always one) and typically small size and limited power draw. The "history of instruction sets" part of the article is mostly lies and obfuscation and they're unlikely to beat the various ARM licensees on energy efficiency.

EDIT: simpler explanation: a SoC is a circuit board with everything that the computer is going to need soldered on (storage, RAM, CPU, GPU, network interface, etc.).

Joined: 2002-12-24 08:29am | Posts: 10681 | Location: The Covenants last and final line of defense

mr friendly guy wrote:

Er, SoC? What's that abbreviation mean?

In a computer you have a motherboard containing your various components. In a SoC they basically try to squeeze as many of these things as possible into one tiny chip (usually no bigger than a fingernail). That makes it cheap to produce and small enough to put inside mobile phones, handhelds etc.

A TI OMAP chip that may be found inside a typical N-Series Nokia phone contains an application processor that handles the UI, a baseband processor that talks to the cellular network, as well as additional cores handling multimedia like HD video playback or Bluetooth/Wi-Fi connectivity.

Can you also comment on the technical side? It seems that they claim the chip will produce more computing power per unit energy used according to one of their tables.

That they "unified" processing units even further is no surprise. As the article touched on, that's what all big players are working on right now. The reasoning behind this is as follows: if I have several distinct processing units that can only do a very specific thing each, they will very often have to waste time/energy waiting for another unit to complete its task. If every unit can do any task, each unit has to be much more complex but you always get optimal throughput. Unfortunately, complexity costs money, energy and computing speed. And as you can see on their comparison chart, their chip is nothing to write home about. Oh, they seem to get better performance per watt, but not by a substantial margin. So this chip - although probably full of clever design details - is just another product not that different from everything else on the market. Their instruction set is noteworthy because it is 100% Chinese IP, a fact that will probably pay off in the future. But it might also be its biggest downside, because switching everything to a new instruction set is a pretty complex task in and of itself. ARM and x86 have too much momentum, imho. Well, we will see.
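To make the waiting-versus-complexity trade-off concrete, here is a toy model, not anything taken from ICube's design. It compares two dedicated units (one "cpu", one "gpu") against two unified units that can take any task but pay a `slowdown` factor standing in for the extra complexity. The task lists, the 1.2 slowdown and the greedy scheduler are all invented for illustration, and the greedy rule is only a heuristic, not a guaranteed optimum.

```python
# Toy model: two execution units and a queue of "cpu" and "gpu" tasks.
# Specialized: one unit per task kind, each runs only its own kind.
# Unified: either unit runs any task, but every task pays a slowdown factor.
# All durations and the slowdown value are made up.

def specialized_makespan(tasks):
    """Finish time when each task kind has its own dedicated unit."""
    cpu = sum(d for kind, d in tasks if kind == "cpu")
    gpu = sum(d for kind, d in tasks if kind == "gpu")
    return max(cpu, gpu)

def unified_makespan(tasks, units=2, slowdown=1.0):
    """Finish time when any unit takes any task.

    Greedy heuristic: longest task first, each one to the least-loaded unit.
    """
    loads = [0.0] * units
    for _, duration in sorted(tasks, key=lambda t: -t[1]):
        i = loads.index(min(loads))
        loads[i] += duration * slowdown
    return max(loads)

lopsided = [("cpu", 4), ("cpu", 3), ("cpu", 3), ("gpu", 2)]  # GPU unit mostly idle
balanced = [("cpu", 4), ("cpu", 2), ("gpu", 3), ("gpu", 3)]  # both units kept busy

for name, tasks in (("lopsided", lopsided), ("balanced", balanced)):
    spec = specialized_makespan(tasks)
    uni = unified_makespan(tasks, slowdown=1.2)
    print(f"{name}: specialized={spec}, unified={uni:.1f}")
```

In this sketch the unified units win on the lopsided workload, where the dedicated GPU unit would sit idle most of the time, and lose on the balanced one, where the complexity penalty buys nothing. That is roughly the trade-off described above.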

Nvidia Tegra 3 has an interesting approach to power saving. Normally it's a quad-core chip, but it also has a fifth, low-power core. Usually only this low-performance, energy-saving core is active, as it is sufficient for typical phone UI or telephony. As application demand rises, such as for video calls or intense 3D games, additional cores are brought online to handle the load.
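A rough sketch of that kind of policy might look like the following. The function name, the 0.2 threshold and the rule for how many main cores to wake are all hypothetical; the real chip's policy is not described here. It also assumes the low-power core is switched off once the main cores take over, which the post does not state.

```python
import math

# Toy governor: decide which cores are online for a given load.
# Thresholds are invented for illustration, not taken from any real chip.

def active_cores(load: float, main_cores: int = 4, companion_limit: float = 0.2):
    """Return (companion_on, main_cores_on) for a load between 0 and 1."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0 and 1")
    if load <= companion_limit:
        return True, 0  # light work: only the low-power core is awake
    # Heavier work: wake as many main cores as the load needs, at least one.
    # Assumption: the low-power core is turned off at this point.
    needed = math.ceil(load * main_cores)
    return False, min(main_cores, max(1, needed))

for load in (0.1, 0.3, 1.0):
    print(load, active_cores(load))
```

The point of the design is that idle or lightly loaded phones spend most of their time in the first branch, so the big cores only burn power when something actually needs them.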
