A recent trip got us access to an early sample of Intel’s upcoming Core i7-4770K. We compare its performance to Ivy Bridge- and Sandy Bridge-based processors, so you have some idea what to expect when Intel officially introduces its Haswell architecture.

We recently got our hands on a Core i7-4770K, based on Intel’s Haswell micro-architecture. It’s not final silicon, but compared to earlier steppings (and earlier drivers), we’re comfortable enough about the way this chip performs to preview it against the Ivy and Sandy Bridge designs.

Presentations at last year’s Developer Forum in San Francisco taught us as much as there is to know about the Haswell architecture itself. But as we get closer to the official launch, more details become known about how Haswell will materialize into actual products. Fortunately for us, some of the first CPUs based on Intel’s newest design will be aimed at enthusiasts.

Fourth-Generation Intel Core Desktop Line-Up

| Model | Cores / Threads | TDP (W) | Base Clock Rate | Max. Turbo, 1 Core | Max. Turbo, 2 Cores | Max. Turbo, 3 Cores | Max. Turbo, 4 Cores | L3 | GPU | Max. GPU Clock | TSX |
|---|---|---|---|---|---|---|---|---|---|---|---|
| i7-4770K | 4 / 8 | 84 | 3.5 GHz | 3.9 GHz | 3.9 GHz | 3.8 GHz | 3.7 GHz | 8 MB | GT2 | 1.25 GHz | No |
| i7-4770 | 4 / 8 | 84 | 3.4 GHz | 3.9 GHz | 3.9 GHz | 3.8 GHz | 3.7 GHz | 8 MB | GT2 | 1.2 GHz | Yes |
| i5-4670K | 4 / 4 | 84 | 3.4 GHz | 3.8 GHz | 3.8 GHz | 3.7 GHz | 3.6 GHz | 6 MB | GT2 | 1.2 GHz | No |
| i5-4670 | 4 / 4 | 84 | 3.4 GHz | 3.8 GHz | 3.8 GHz | 3.7 GHz | 3.6 GHz | 6 MB | GT2 | 1.2 GHz | Yes |
| i5-4570 | 4 / 4 | 84 | 3.2 GHz | 3.6 GHz | 3.6 GHz | 3.5 GHz | 3.4 GHz | 6 MB | GT2 | 1.15 GHz | Yes |
| i5-4430 | 4 / 4 | 84 | 3 GHz | 3.2 GHz | 3.2 GHz | 3.1 GHz | 3 GHz | 6 MB | GT2 | 1.1 GHz | No |
| i7-4770S | 4 / 4 | 65 | 3.1 GHz | 3.9 GHz | 3.8 GHz | 3.6 GHz | 3.5 GHz | 8 MB | GT2 | 1.2 GHz | Yes |
| i5-4570S | 4 / 4 | 65 | 2.9 GHz | 3.6 GHz | 3.5 GHz | 3.3 GHz | 3.2 GHz | 6 MB | GT2 | 1.15 GHz | Yes |
| i5-4670S | 4 / 4 | 65 | 3.1 GHz | 3.8 GHz | 3.7 GHz | 3.5 GHz | 3.4 GHz | 6 MB | GT2 | 1.2 GHz | Yes |
| i5-4430S | 4 / 4 | 65 | 2.7 GHz | 3.2 GHz | 3.1 GHz | 2.9 GHz | 2.8 GHz | 6 MB | GT2 | 1.1 GHz | No |
| i7-4770T | 4 / 4 | 45 | 2.5 GHz | 3.7 GHz | 3.6 GHz | 3.4 GHz | 3.1 GHz | 8 MB | GT2 | 1.2 GHz | Yes |
| i5-4670T | 4 / 4 | 45 | 2.3 GHz | 3.3 GHz | 3.2 GHz | 3 GHz | 2.9 GHz | 6 MB | GT2 | 1.2 GHz | Yes |
| i7-4765T | 4 / 4 | 35 | 2 GHz | 3 GHz | 2.9 GHz | 2.7 GHz | 2.6 GHz | 8 MB | GT2 | 1.2 GHz | Yes |
| i5-4570T | 2 / 4 | 35 | 2.9 GHz | 3.6 GHz | 3.3 GHz | – | – | 4 MB | GT2 | 1.15 GHz | Yes |

According to Intel’s current plans, you’ll find dual- and quad-core LGA 1150 models with the GT2 graphics configuration sporting 20 execution units. There will also be dual- and quad-core socketed rPGA-based models for the mobile space, featuring the same graphics setup. Everything in the table above is LGA 1150, though. All of those models share support for two channels of DDR3-1600 at 1.5 V and 800 MHz minimum core frequencies. They also share a 16-lane PCI Express 3.0 controller, AVX2 support, and AES-NI support. Interestingly, four of the listed models do not support Intel’s new Transactional Synchronization Extensions (TSX). We’re not sure why Intel would want to differentiate its products with a feature intended to handle locking more efficiently, but that appears to be what it’s doing.

The much-anticipated GT3 graphics engine, with 40 EUs, is limited to BGA-based applications, meaning it won’t be upgradeable. Intel will have quad-core with GT3, quad-core with GT2, and dual-core with GT2 versions in ball grid array packaging. GT3 will also make an appearance in a BGA-based multi-chip package that includes a Lynx Point chipset. That’ll be a dual-core part, though.

In addition to the processors Intel plans to launch here in a few months, we’ll also be introduced to the 8-series Platform Controller Hubs, currently code-named Lynx Point. The most feature-complete version of Lynx Point will incorporate six SATA 6Gb/s ports, 14 total USB ports (six of which are USB 3.0), eight lanes of second-gen PCIe, and VGA output.

Eight-series chipsets are going to be physically smaller than their predecessors (23×22 millimeters on the desktop, rather than 27×27) with lower pin-counts. This is largely attributable to more capabilities integrated on the CPU itself. Previously, eight Flexible Display Interface lanes connected the processor and PCH. Although the processor die hosted an embedded DisplayPort controller, the VGA, LVDS, digital display interfaces, and audio were all down on the chipset. Now, the three digital ports are up in the processor, along with the audio and embedded DisplayPort. LVDS is gone altogether, as are six of the FDI lanes.
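To put the package shrink in perspective, here is a quick back-of-the-envelope calculation using only the two desktop package dimensions quoted above. The function name is ours, chosen for illustration.

```python
def package_area_mm2(width_mm: float, height_mm: float) -> float:
    """Area of a rectangular chip package in square millimeters."""
    return width_mm * height_mm


new_area = package_area_mm2(23, 22)  # 8-series desktop PCH, per the text
old_area = package_area_mm2(27, 27)  # previous-generation desktop PCH
reduction = 1 - new_area / old_area  # fraction of footprint saved

print(f"{new_area:.0f} mm^2 vs. {old_area:.0f} mm^2: {reduction:.0%} smaller")
```

That works out to 506 mm² against 729 mm², or a footprint roughly 31% smaller.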

Although Dhrystone isn’t necessarily applicable to real-world performance, a lack of software already optimized for AVX2 means we need to go to SiSoftware’s diagnostic for an idea of how Haswell’s support for the instruction set might affect general integer performance in properly optimized software.

The Whetstone module employs SSE3, so Haswell’s improvements over Ivy Bridge are far more incremental.

Sandra’s Multimedia benchmark generates a 640×480 image of the Mandelbrot Set fractal using 255 iterations for each pixel, representing vectorised code that runs as close to perfectly parallel as possible.
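For readers unfamiliar with the workload, the following is a minimal sketch of an escape-time Mandelbrot render in plain Python. It is not Sandra’s code, which is vectorised and multi-threaded; the function names and the region of the complex plane being sampled are our own assumptions. What it does show is why the task parallelises so well: every pixel’s iteration count is computed independently of every other pixel.

```python
def mandelbrot_iterations(c: complex, max_iter: int = 255) -> int:
    """Count iterations of z = z*z + c until |z| exceeds 2, capped at max_iter."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def render(width: int = 640, height: int = 480, max_iter: int = 255):
    """Return a height x width grid of iteration counts.

    The sampled region (real -2.0..1.2, imaginary -1.2..1.2) is an
    assumption made for illustration, not a Sandra parameter.
    """
    rows = []
    for y in range(height):
        im = -1.2 + 2.4 * y / (height - 1)
        row = []
        for x in range(width):
            re = -2.0 + 3.2 * x / (width - 1)
            # No pixel depends on any other, so this loop maps cleanly
            # onto SIMD lanes and multiple cores.
            row.append(mandelbrot_iterations(complex(re, im), max_iter))
        rows.append(row)
    return rows
```

Calling `render()` with the defaults produces the 640×480, 255-iteration case described above, though an interpreted loop like this one is useful only for illustration, not for benchmarking.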

The integer test employs the AVX2 instruction set on Intel’s Haswell-based Core i7-4770K, while the Ivy and Sandy Bridge-based processors are limited to AVX support. As you see in the red bar, the task is finished much faster on Haswell. The speed-up is close to 2x, but not quite there.

Floating-point performance also enjoys a significant speed-up from Intel’s first implementation of FMA3 (AMD’s Bulldozer design supports FMA4, while Piledriver supports both the three- and four-operand versions). The Ivy and Sandy Bridge-based processors utilize AVX-optimized code paths, falling quite a bit behind at the same clock rate.
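A fused multiply-add computes a × b + c with a single rounding step at the end, rather than rounding once after the multiply and again after the add. The sketch below emulates that behaviour with exact rational arithmetic; it illustrates the arithmetic only and says nothing about how the hardware instruction is implemented. The function names and input values are ours.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate FMA: evaluate a*b + c exactly, then round once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Ordinary double arithmetic: the product is rounded before the add."""
    return a * b + c


# Inputs picked so that the exact product, 1 - 2**-60, is not representable
# as a double and rounds to 1.0 when the multiply is done on its own.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print(separate_multiply_add(a, b, c))  # the small term is lost to rounding
print(fused_multiply_add(a, b, c))     # the small term survives
```

With these inputs, the separate multiply and add return 0.0, while the fused version returns -2^-60. Beyond accuracy, the throughput benefit comes from retiring a multiply and an add as one operation.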

Why do doubles seem to speed up so much more than floats on Haswell? The code path for FMA3 is actually latency-bound. If we turn off FMA3 support altogether in Sandra’s options and use AVX instead, the scaling proves similar.

All three of these chips feature AES-NI support, and we know from past reviews that because Sandra runs entirely in hardware, our platforms are processing instructions as fast as they’re sent from memory. The Core i7-4770K’s slight disadvantage in our AES256 test is indicative of slightly less throughput—something I’m comfortable chalking up to the early status of our test system.

Meanwhile, SHA2-256 performance is all about each core’s compute performance. So, the IPC improvements that go into Haswell help propel it ahead of Ivy Bridge, which is in turn faster than Sandy Bridge.

The memory bandwidth module confirms our findings in the Cryptography benchmark. All three platforms are running 1,600 MT/s data rates; the Haswell-based machine just looks like it needs a little tuning.
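As a reference point for those results, the theoretical peak of a dual-channel 1,600 MT/s configuration is easy to work out. The sketch below assumes the standard 64-bit (8-byte) DDR3 channel width; the function name is ours.

```python
def peak_dram_bandwidth_gbs(transfers_per_sec: float,
                            channels: int = 2,
                            bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return transfers_per_sec * channels * bus_bytes / 1e9


# Two channels at 1,600 MT/s, 8 bytes per transfer per channel.
peak = peak_dram_bandwidth_gbs(1600e6)
print(f"{peak:.1f} GB/s")
```

That gives a ceiling of 25.6 GB/s for all three platforms, so any differences in measured bandwidth come from how close each memory controller gets to that figure.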

We already know that Intel optimized Haswell’s memory hierarchy for performance, based on information discussed at last year’s IDF. As expected, Sandra’s cache bandwidth test shows an almost-doubling of performance from the 32 KB L1 data cache.

Gains from the L2 cache are actually a lot lower than we’d expect though; we thought that number would be close to 2x as well, given 64 bytes/cycle throughput (theoretically, the L2 should be capable of more than 900 GB/s). The L3 cache actually drops back a bit, which could be related to its separate clock domain.
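The theoretical L2 figure can be sanity-checked with simple arithmetic. This sketch assumes the number is an aggregate across all four cores, each moving 64 bytes per cycle, and uses the Core i7-4770K’s 3.5 GHz base and 3.7 GHz four-core turbo clocks from the line-up table; the function name is ours.

```python
def cache_bandwidth_gbs(bytes_per_cycle: int, clock_ghz: float, cores: int) -> float:
    """Aggregate peak cache bandwidth in GB/s: bytes/cycle x GHz x cores."""
    return bytes_per_cycle * clock_ghz * cores


at_base = cache_bandwidth_gbs(64, 3.5, 4)   # all four cores at base clock
at_turbo = cache_bandwidth_gbs(64, 3.7, 4)  # all four cores at 4-core turbo

print(f"base: {at_base:.1f} GB/s, turbo: {at_turbo:.1f} GB/s")
```

At base clock the total is 896 GB/s, just shy of 900; at the four-core turbo clock it rises to about 947 GB/s, which is consistent with the "more than 900 GB/s" estimate.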

It still isn’t clear whether something’s up with our engineering sample CPU, or if there’s still work to be done on the testing side. Either way, this is a pre-production chip, so we aren’t jumping to any conclusions.

Late last week we pulled back the covers on Intel’s next-generation Core architecture update: Sandy Bridge. It is due out in Q1 2011, and we learned a lot about its performance in our preview. Sandy Bridge will be the first high-performance monolithic CPU/GPU from Intel. Its performance was, in general, noticeably better than that of the present generation of processors, on both the CPU and GPU side. If you haven’t read the preview yet, I’d encourage you to do so.

One of the questions we got in response to the article was: what about Sandy Bridge for notebooks? While Sandy Bridge is pretty significant for mainstream quad-core desktops, it’s even more tailored to the notebook space. I’ve put together some spec and roadmap information for those of you who might be looking for a new notebook early next year.

Mobile Sandy Bridge

Like the desktop offering, mobile Sandy Bridge will arrive sometime in Q1 of next year. If 2010 was any indication of what’s to come, we’ll see both mobile and desktop parts launch at the same time around CES.

The mobile Sandy Bridge parts are a little more straightforward in some areas but more confusing in others. The biggest problem is that both dual and quad-core parts share the same brand; in fact, the letter Q is the only indication that the Core i7 2720QM is a quad-core and the Core i7 2620M isn’t. Given AMD’s Bulldozer strategy, I’m sure Intel doesn’t want folks worrying about how many cores they have – just that higher numbers mean better things.

Mobile Sandy Bridge CPU Comparison

| Model | Base Frequency | L3 Cache | Cores / Threads | Max Single Core Turbo | Memory Support | Intel Graphics EUs | Intel HD Graphics Frequency / Max Turbo | TDP |
|---|---|---|---|---|---|---|---|---|
| Core i7 2920XM | 2.5GHz | 8MB | 4 / 8 | 3.5GHz | DDR3-1600 | 12 | 650 / 1300MHz | 55W |
| Core i7 2820QM | 2.3GHz | 8MB | 4 / 8 | 3.4GHz | DDR3-1600 | 12 | 650 / 1300MHz | 45W |
| Core i7 2720QM | 2.2GHz | 6MB | 4 / 8 | 3.3GHz | DDR3-1600 | 12 | 650 / 1300MHz | 45W |
| Core i7 2620M | 2.7GHz | 4MB | 2 / 4 | 3.4GHz | DDR3-1600 | 12 | 650 / 1300MHz | 35W |
| Core i5 2540M | 2.6GHz | 3MB | 2 / 4 | 3.3GHz | DDR3-1333 | 12 | 650 / 1150MHz | 35W |
| Core i5 2520M | 2.5GHz | 3MB | 2 / 4 | 3.2GHz | DDR3-1333 | 12 | 650 / 1150MHz | 35W |

You’ll notice a few changes compared to the desktop lineup. Clock speeds are understandably lower, and all launch parts have Hyper-Threading enabled. Mobile Sandy Bridge also officially supports up to DDR3-1600, while the desktop CPUs top out at DDR3-1333 (though running them at 1600 shouldn’t be a problem, assuming you have a P67 board).

The major difference between mobile Sandy Bridge and its desktop counterpart is that all mobile SB launch SKUs have two graphics cores (12 EUs), while only some desktop parts have 12 EUs (it looks like the high-end K SKUs will have them). The base GPU clock is lower, but it can turbo up to 1.3GHz, higher than most desktop Sandy Bridge CPUs. Note that the GPU we tested in Friday’s preview had 6 EUs, so mobile Sandy Bridge should be noticeably quicker as long as we don’t run into memory bandwidth issues. Update: Our preview article may have actually used a 12 EU part; we’re still trying to confirm!

Even if we only get 50% more performance out of the 12 EU GPU, that’d be enough for me to say that there’s no need for discrete graphics in a notebook – as long as you don’t use it for high-end gaming.

While Arrandale boosted multithreaded performance significantly, Sandy Bridge is going to offer an across-the-board increase in CPU performance and a dramatic increase in GPU performance. And from what I’ve heard, NVIDIA’s Optimus technology will work with the platform in case you want to do some serious gaming on your notebook.