The last time we visited TI's OMAP 4 SoC was at Mobile World Congress, where we benchmarked the LG Optimus 3D and came away decently impressed with performance even on a pre-launch device. Back then, Anand wrote that the remainder of this year and the next would be a heated battle among dual-core and quad-core SoCs in the tablet and smartphone space. After today, you can add Windows 8 to that list as well. Today, TI is announcing its latest SoC, the OMAP4470, which offers a 20% increase in CPU clocks over OMAP4460 and an entirely new PowerVR SGX544 GPU.

OMAP4470 is architecturally very similar to OMAP4460, with a number of notable changes. First off is that 20% increase in CPU clocks, from 1.5 GHz in OMAP4460 to 1.8 GHz in OMAP4470. TI's comparison point for most of the OMAP4470 specs is the OMAP4430, which has its two Cortex-A9s clocked at 1.0 GHz. The two Cortex-M3 cores remain clocked at 266 MHz for handling multimedia processing and background realtime events. The end result is an effort both to let the two Cortex-A9s remain idle for more of the time and to unburden them during heavy processing. TI feels this dichotomy of two big, fast Cortex-A9 cores for web browsing and very computationally intensive tasks, augmented with two lighter weight, low power Cortex-M3 cores, offers it unique power savings potential. The two Cortex-M3 cores can offload Thumb and Thumb-2 instructions, as well as some hardware multiply and divide operations, from the A9s.

The really interesting change with OMAP4470, however, is a similar two-pronged approach on the GPU side of things. First, OMAP4470 moves from the PowerVR SGX540 present in OMAP4430 and OMAP4460 to a more powerful single core (MP1, if you will) PowerVR SGX544 GPU, which TI says offers 2.5x the performance of OMAP4430's SGX540.

If you recall from Anand's excellent iPad 2 GPU exploration, SGX543/544 features four USSE2 pipes, each with a 4-wide vector ALU churning through 4 MADs per clock. I'm reproducing his table below, but if you mentally replace SGX543 with SGX544 you get the same picture. As an aside, the difference between SGX543 and SGX544 is purely that full DirectX 9 compliance is offered in the latter, making it a possible shoo-in for future Windows 8 platforms.

Mobile SoC GPU Comparison

| | PowerVR SGX 530 | PowerVR SGX 535 | PowerVR SGX 540 | PowerVR SGX 543/544 | PowerVR SGX 543/544MP2 | GeForce ULP | Kal-El GeForce |
|---|---|---|---|---|---|---|---|
| SIMD Name | USSE | USSE | USSE | USSE2 | USSE2 | Core | Core |
| # of SIMDs | 2 | 2 | 4 | 4 | 8 | 8 | 12 |
| MADs per SIMD | 2 | 2 | 2 | 4 | 4 | 1 | ? |
| Total MADs | 4 | 4 | 8 | 16 | 32 | 8 | ? |
| GFLOPS @ 200MHz | 1.6 GFLOPS | 1.6 GFLOPS | 3.2 GFLOPS | 6.4 GFLOPS | 12.8 GFLOPS | 3.2 GFLOPS | ? |
| GFLOPS @ 300MHz | 2.4 GFLOPS | 2.4 GFLOPS | 4.8 GFLOPS | 9.6 GFLOPS | 19.2 GFLOPS | 4.8 GFLOPS | ? |
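The GFLOPS rows in the table follow directly from the SIMD counts. As a rough sketch, the snippet below recomputes them, assuming each MAD counts as two floating point operations (a multiply and an add) per clock, which is the factor the table's own numbers imply. The function and dictionary names are illustrative only, and Kal-El is left out because its MADs per SIMD are not known.

```python
# Peak GFLOPS = (# of SIMDs) x (MADs per SIMD) x 2 FLOPs per MAD x clock in GHz.
# The factor of 2 is an assumption inferred from the table (4 MADs -> 1.6 GFLOPS @ 200 MHz).

# (number of SIMDs, MADs per SIMD), copied from the comparison table.
GPUS = {
    "PowerVR SGX 530": (2, 2),
    "PowerVR SGX 535": (2, 2),
    "PowerVR SGX 540": (4, 2),
    "PowerVR SGX 543/544": (4, 4),
    "PowerVR SGX 543/544MP2": (8, 4),
    "GeForce ULP": (8, 1),
}


def total_mads(simds, mads_per_simd):
    """MADs issued per clock across the whole GPU."""
    return simds * mads_per_simd


def peak_gflops(simds, mads_per_simd, clock_mhz):
    """Theoretical peak throughput in GFLOPS at the given clock."""
    return total_mads(simds, mads_per_simd) * 2 * clock_mhz / 1000.0


if __name__ == "__main__":
    for name, (simds, mads) in GPUS.items():
        at_200 = peak_gflops(simds, mads, 200)
        at_300 = peak_gflops(simds, mads, 300)
        print(f"{name}: {total_mads(simds, mads)} MADs, "
              f"{at_200:.1f} GFLOPS @ 200MHz, {at_300:.1f} GFLOPS @ 300MHz")
```

These are theoretical peaks only; real-world performance also depends on memory bandwidth, drivers, and the workload.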

If you recall the clocks for OMAP4430 and OMAP4460, you can start to see where TI's 2.5x claim over its own OMAP4430 comes from. Going from 304 MHz to 384 MHz is a roughly 26% increase in clock speed, which multiplies with the 2x increase in MADs per clock (a 100% increase, from 8 to 16) that comes with the move from SGX540's USSE to SGX544's USSE2. Do the math (1.26 x 2) and it works out to almost exactly 2.5x.

TI OMAP 4xxx SoC GPU Comparison

| | OMAP4430 | OMAP4460 | OMAP4470 |
|---|---|---|---|
| GPU Used | PowerVR SGX540 | PowerVR SGX540 | PowerVR SGX544 |
| Clock | 304 MHz | 384 MHz | 384 MHz |
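To sanity check TI's 2.5x claim against the clocks listed here, a quick back-of-the-envelope calculation helps. This sketch assumes peak shader throughput scales linearly with both clock speed and MADs per clock, and takes the MAD totals (8 for SGX540, 16 for SGX544) from the GPU comparison table earlier; the variable names are just for illustration.

```python
# Clocks from the OMAP 4xxx table, MADs per clock from the GPU comparison table.
OMAP4430_CLOCK_MHZ = 304   # SGX540
OMAP4470_CLOCK_MHZ = 384   # SGX544
SGX540_MADS = 8            # 4 USSE pipes x 2 MADs
SGX544_MADS = 16           # 4 USSE2 pipes x 4 MADs

# The two improvements multiply rather than add.
clock_ratio = OMAP4470_CLOCK_MHZ / OMAP4430_CLOCK_MHZ
mad_ratio = SGX544_MADS / SGX540_MADS
speedup = clock_ratio * mad_ratio

print(f"clock ratio: {clock_ratio:.3f}x")
print(f"MAD ratio:   {mad_ratio:.1f}x")
print(f"combined:    {speedup:.2f}x")
```

The combined figure lands just above 2.5x, which lines up with TI's number as a theoretical peak rather than a measured result.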

The next part of what's new in OMAP4470 is the inclusion of a new hardware composition system for doing display composition without taxing the SGX544. TI wouldn't disclose whose IP this is, but did acknowledge that it's from a third party and includes a dedicated 2D graphics core for compositing the entire display. Ordinarily this is done on the GPU, but TI hopes to accomplish the same composition on this hardware accelerator in a more power and bandwidth efficient manner, driving large displays while maintaining a low power profile.

When big 3D applications kick in, then SGX544 powers up and takes over, but for the majority of UI paradigms, TI believes its hardware composition engine can enable power savings - analogous to the way the two Cortex-M3 cores augment the two Cortex-A9s. It's an interesting approach, and TI claims the hardware composition abstraction layer (HAL) is already completed to enable Android and other mobile OSes to leverage that acceleration immediately.

Supports as many as three HD displays and up to QXGA (2048x1536) resolution; HDMI supporting stereoscopic 3D

Dual-channel, 466 MHz LPDDR2 memory: higher memory bandwidth enables rendering and compositing of multilayer content at high resolutions

Complete pin-to-pin hardware and software compatibility: rapid transition and maximum re-use of investment from OMAP4430 and OMAP4460 processors
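To put the memory and display figures above in perspective, here is a rough estimate of peak LPDDR2 bandwidth against the traffic needed to scan out QXGA layers. Several inputs are assumptions not stated by TI here: that 466 MHz is the memory I/O clock with two transfers per clock, that each channel is 32 bits wide, and that each display layer is 32 bits per pixel refreshed at 60 Hz. The function names are hypothetical.

```python
def lpddr2_peak_gbps(clock_mhz, channels=2, bus_bits=32):
    """Theoretical peak bandwidth in GB/s.

    Assumes double data rate (2 transfers per clock) and a 32-bit
    bus per channel; both are assumptions, not figures from TI.
    """
    transfers_per_second = clock_mhz * 1e6 * 2
    bytes_per_transfer = bus_bits / 8
    return transfers_per_second * bytes_per_transfer * channels / 1e9


def layer_read_gbps(width, height, bytes_per_pixel=4, refresh_hz=60):
    """Bandwidth in GB/s to read one full-screen layer every refresh.

    32 bits per pixel and 60 Hz are assumed values for illustration.
    """
    return width * height * bytes_per_pixel * refresh_hz / 1e9


if __name__ == "__main__":
    peak = lpddr2_peak_gbps(466)
    qxga_layer = layer_read_gbps(2048, 1536)
    print(f"peak LPDDR2 bandwidth: {peak:.2f} GB/s")
    print(f"one QXGA layer @ 60 Hz: {qxga_layer:.2f} GB/s")
    print(f"share of peak per layer: {qxga_layer / peak:.1%}")
```

Under these assumptions each full-screen QXGA layer consumes roughly a tenth of peak bandwidth just to be read once per frame, which hints at why multilayer composition at that resolution leans on the faster memory.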

The real hope with OMAP4470 is the ability to drive very high resolution displays, up to QXGA (2048x1536), while maintaining HDMI 1.4a stereoscopic 3D support. TI expects OMAP4470 to begin sampling in the second half of 2011, with devices arriving in the first half of 2012.

I know the 4xxx series is on 40nm and 5xxx on 28nm, but I don't want to just assume. Qualcomm said today they are sampling MSM8960 this month, and MSM8960 should be on 28nm, so I figured others should be having 28nm soon too.

I see that the smartphone world is innovating much faster than the x86 world. I've been saying for a while that Intel/AMD need to have a couple of cores like this so they can keep their monster x86 cores powered down when you are reading a web page or typing into a box like this. I am thinking a 386 with 8k of cache, clocked at 66 MHz, would do the trick. A tiny little core.