NVIDIA looks to strengthen its position in the mobile sector with Tegra 4

With smartphones and tablets taking over the computing world and putting traditional PCs and notebooks on the back burner, a number of companies are jockeying for position to deliver the highest-performing SoCs on the market. NVIDIA has been in this game for a while with its Tegra line of processors, and its most recent Tegra 3 has scored a slew of design wins over the past year.

NVIDIA is looking to build upon that success with its next-generation Tegra 4. While we're sure that NVIDIA was looking to surprise everyone at CES, most of the details on the new chip leaked in mid-December. The Cortex-A15-based Tegra 4 is built on a 28nm process, continues the 4+1 design (quad-core + companion core), and features 72 NVIDIA GeForce GPU cores.

NVIDIA says that the move to 28nm helps the Tegra 4 consume 45 percent less power than its predecessor and will allow mainstream phones to deliver 14 hours of continuous HD video playback.

However, Tegra 4 doesn't have integrated LTE onboard. Instead, NVIDIA is hyping up its optional Icera i500 processor for LTE functionality. Although the move to 28nm is likely to improve battery life across the board, not having LTE integrated on-chip isn't going to do battery life any favors.

We’re likely to see a number of Tegra 4-based products at CES, and we’ll be sure to keep you informed of the ones that really catch our eye.

quote: Up until the A6, they literally did very little work, and yet smashed Tegra3 in pretty much every way.

This is not a helpful way to characterize Apple's behavior. Apple started off with extremely off-the-shelf parts; the innovation was in the software (and perhaps in the choice of additional parts, the breadth of sensors, and the use of a high-quality, for the time, H.264 decoder). They moved on to asking for parts with higher performance (most obviously in the GPU arena). They moved on to a custom SoC (made mostly with third-party cells, but with some unorthodox choices, like higher memory bandwidth). And then we got a custom CPU.

You see this as four years of doing nothing, then a custom CPU. I see it as four years of learning on the job, one step after another.

As for "my 4 A15s can beat up your 2 Swifts": grow up. There is no perfect CPU design. You can optimize for power. You can optimize for computational performance. You can optimize for memory performance. There's a reason ARM and Intel and IBM are all three successful in the CPU business. The trick, however, is to know what your market requires; otherwise you produce a white elephant like the Pentium 4. If you're going to rant about how wonderful four A15 cores are, do so by (a) providing numbers (power and memory numbers) and (b) providing realistic current-day scenarios in which four cores are useful.

I'm all for competition in this space. A7, Tegra 4, next-gen Atom: bring them all on. But let's try to maintain a level of discourse higher than "15 > 7, therefore A15 is better than A7" or "two cores bad, four cores good".

Of particular interest here is the memory behavior of Tegra 4. As I have said many times before, an obvious lesson to me when I worked at Apple and we were using PPC was that memory performance really matters, and PPC's superior core was irrelevant in the face of Intel's substantially superior memory performance. Apple's constant push to ramp up the memory performance of iOS devices shows that they still remember this lesson.

It would not surprise me if a future path for Apple consists of a lot more apparently unsexy stuff going on in the memory uncore, and a lot less concern with pumping up the core. So while Tegra may look cool in terms of better branch predictors, a more aggressive superscalar implementation, more rename registers, etc., Apple will be putting its design energy into a smarter memory controller and better PoP packaging, to bring the DRAM closer to the core and allow it to run faster, while still not burning too much power in the memory bus.

The details of the core are sexy, I know --- I love that stuff. The details of the memory system seem so damn boring, from coherency protocols to memory controller state machines. But we are at the stage (and about fifteen years of experience has shown) where a sexy core hooked up to a sub-par memory system is usually very disappointing in real-world code, as opposed to microbenchmarks.
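To make the microbenchmark point concrete, here is a toy sketch (not from the original post; all names are illustrative) of the classic access-pattern contrast: a pointer chase, where every load depends on the previous one and the memory system's latency is fully exposed, versus a streaming pass, where prefetchers can hide DRAM latency. In Python the interpreter overhead mutes the timing difference, so this is only a sketch of the structure; to actually measure the effect you would port both loops to native code and time them over arrays much larger than the last-level cache.

```python
import random

def build_chase(n, seed=0):
    """Build a random single-cycle permutation over n slots, so every
    load depends on the previous result and nothing can be prefetched."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    next_idx = [0] * n
    # Link the shuffled slots into one n-element cycle.
    for a, b in zip(order, order[1:] + order[:1]):
        next_idx[a] = b
    return next_idx

def pointer_chase(next_idx, start=0):
    """Follow the cycle once: a chain of dependent, cache-hostile loads.
    Traversing a full n-cycle returns to the starting index."""
    i = start
    for _ in range(len(next_idx)):
        i = next_idx[i]
    return i

def streaming_sum(n):
    """Sequential access: the pattern hardware prefetchers handle well."""
    return sum(range(n))
```

The same amount of "work" is done in both loops, but on real hardware the dependent-load version is bounded by memory latency while the streaming version is bounded by bandwidth, which is exactly why a stronger core buys little when the uncore is weak.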