Nvidia has finally confirmed the specs for its highly anticipated $379 GeForce GTX 1070 graphics card, the more wallet-friendly younger sibling of the GTX 1080. The card will feature the same GP104 GPU that powers the flagship GTX 1080, albeit with a few bits and bobs disabled.

The graphics card will also feature 1920 CUDA cores, a slightly lower boost clock of 1683MHz and a leaner TDP of 150W. Most importantly, Nvidia maintains that the GTX 1070 will still be faster than the company’s previous heavy hitters, the GTX Titan X and GTX 980 Ti. The GTX 1070 Founders Edition, Nvidia’s reference design, will launch on June 10 for $449. Custom-cooled and factory-overclocked GTX 1070 cards from Nvidia’s partners will launch soon afterwards with an MSRP of $379.

[UPDATE / 2016-05-18 11:01:34 AM] Nvidia has published the specifications for the GTX 1070 on the GeForce website alongside preliminary benchmarks.

Nvidia GeForce GTX 1070 – Most Of The 1080’s Performance At Nearly Half The Price

If you’re a PC gamer or a hardware enthusiast who’s been paying close attention to the GPU market for the past couple of years, then the GTX 1070 shouldn’t surprise you at all. The cut-back versions of any graphics chip are almost always the better buy. They deliver a huge chunk of the performance of their higher-end counterparts at a fraction of the cost.

We’ve seen this happen with every graphics generation that both Nvidia and AMD have introduced for years. On Nvidia’s side we’ve seen it with the GTX 980 Ti and Titan X, the 970 and 980, the 780 Ti and Titan, and the GTX 670 and 680. The exact same trend holds on AMD’s side: the R9 390 offered 90% of the performance delivered by the 390X for a fraction of the cost, and we’ve seen it with the 290 and 290X and the 7950 and 7970. The trend goes back more than a decade. In every one of those examples the cut-back version of the very same chip delivers most of its counterpart’s performance at a much lower price point.

This is why, in any given generation of graphics cards, gamers have always overwhelmingly chosen to go with the second-fastest graphics card in the lineup. It’s almost always the smarter decision. You just get a lot more for your money. The GTX 1070 is definitely not an exception. If you’re a gamer who’s looking to upgrade to a new higher-resolution monitor, or you just want more performance for any number of good reasons, then the GTX 1070 will be the Pascal card you’ll want to get.

Nvidia GTX 1080 Minus 5 SMs Is Still A LOT of Horsepower

Inside Nvidia’s full GP104 GPU, the 7.2 billion transistor, 314mm² chip that powers the GTX 1080, there’s a total of 2560 CUDA cores, grouped into 20 Streaming Multiprocessors (SMs) arranged into four Graphics Processing Clusters (GPCs). Each SM houses 128 CUDA cores, so the 1070’s 1920 CUDA cores mean we’re looking at five fewer SMs. Inside every SM there are eight Texture Mapping Units (TMUs), and inside every GPC there are five SMs and a total of 16 Render Output Units, or ROPs for short. Cutting five SMs means the GTX 1070 ends up with 120 TMUs as opposed to the GTX 1080’s 160. If an entire GPC is disabled to make a GTX 1070 then it will also end up with 48 ROPs as opposed to the GTX 1080’s 64. The GTX 1070 will still feature 8GB of memory, although it’s 8Gbps GDDR5 as opposed to 10Gbps GDDR5X.
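To make the unit counts above easier to check, here is a minimal Python sketch of the arithmetic. The function name gp104_config and its parameters are made up for illustration, and the ROP figure follows the article's "if an entire GPC is disabled" scenario rather than any confirmed configuration.

```python
# GP104 layout as described in the text (per-SM and per-GPC unit counts).
CORES_PER_SM = 128
TMUS_PER_SM = 8
SMS_PER_GPC = 5
ROPS_PER_GPC = 16
FULL_GPCS = 4


def gp104_config(disabled_sms, disabled_gpcs=0):
    """Unit counts for a GP104 with some SMs (and optionally whole GPCs) disabled.

    Hypothetical helper: it only multiplies out the figures quoted in the article.
    """
    sms = FULL_GPCS * SMS_PER_GPC - disabled_sms
    return {
        "sms": sms,
        "cuda_cores": sms * CORES_PER_SM,
        "tmus": sms * TMUS_PER_SM,
        "rops": (FULL_GPCS - disabled_gpcs) * ROPS_PER_GPC,
    }


gtx_1080 = gp104_config(disabled_sms=0)
# Assumption from the text: five SMs cut, modelled here as one whole GPC.
gtx_1070 = gp104_config(disabled_sms=5, disabled_gpcs=1)

print("GTX 1080:", gtx_1080)
print("GTX 1070:", gtx_1070)
```

If the five SMs were instead spread across several GPCs, calling gp104_config(disabled_sms=5) would leave the ROP count at the full 64.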

Nvidia GTX 1080 – GP104 Block Diagram

Despite the fewer CUDA cores, TMUs and ROPs, Nvidia is promising performance that tops the GTX Titan X and 980 Ti, which isn’t too surprising. For example, the GTX 980 had 23% more CUDA cores and a higher clock speed than the GTX 970, but it ended up only roughly 15% faster. We see this with all full versus cut-back GPUs.

This is because not all resources inside the chip are disabled in proportion to the number of CUDA cores that are lasered off. Other on-chip engines are left untouched; in this case the raster engines are one such example. What this means is that while we end up with fewer CUDA cores, those cores have access to proportionally more resources, which in turn means more performance per CUDA core.
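The "more performance per CUDA core" point can be put into rough numbers with the GTX 980 versus GTX 970 example. The sketch below is only back-of-the-envelope arithmetic: the core counts (2048 and 1664) are not stated in the article and are filled in from the cards' published specs, the 15% figure is the article's own estimate, and clock speed differences are ignored. The function name is hypothetical.

```python
def per_core_advantage(core_ratio, perf_ratio):
    """Extra performance per core that the cut-back card gets.

    core_ratio: full-chip cores divided by cut-back cores
    perf_ratio: full-chip performance divided by cut-back performance
    """
    return core_ratio / perf_ratio - 1.0


# Assumed core counts: GTX 980 = 2048, GTX 970 = 1664 (about 23% more cores).
core_ratio = 2048 / 1664
# The article's estimate: the GTX 980 is roughly 15% faster.
perf_ratio = 1.15

advantage = per_core_advantage(core_ratio, perf_ratio)
print(f"Core ratio: {core_ratio:.3f}")
print(f"Per-core advantage of the cut-back card: {advantage:.1%}")
```

Under these assumptions each CUDA core on the cut-back card does roughly 7% more work than a core on the full chip, before accounting for the 980's higher clocks.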

Official GeForce GTX 1080 and GeForce GTX 1070 Specifications

| Specification | Nvidia GeForce GTX 1080 | Nvidia GeForce GTX 1070 |
| --- | --- | --- |
| Architecture | Pascal | Pascal |
| Transistors | 7.2 Billion | 7.2 Billion |
| CUDA Cores | 2560 | 1920 |
| Core Clock | 1607 MHz | TBA |
| Boost Clock | 1733 MHz | 1683 MHz |
| Memory Type | G5X (GDDR5X) | GDDR5 |
| Memory Speed | 10 Gbps | 8 Gbps |
| Memory Configuration | 8GB | 8GB |
| Bus Width | 256-bit | 256-bit |
| Memory Bandwidth | 320 GB/s | 256 GB/s |
| Multi Projection | Yes | Yes |
| HB SLI Bridge Support | Yes | Yes |
| Nvidia GPU Boost | 3.0 | 3.0 |
| DirectX 12 Feature Level | 12_1 | 12_1 |
| OpenGL | 4.5 | 4.5 |
| Vulkan API | Yes | Yes |
| Maximum Digital Resolution | 7680x4320@60Hz | 7680x4320@60Hz |
| Display Connectors | DP 1.4, HDMI 2.0b, DL-DVI | DP 1.4, HDMI 2.0b, DL-DVI |
| HDCP | 2.2 | 2.2 |
| Power Draw | 180W | 150W |
| Power Connector | Single 8-Pin | Single 8-Pin |
| Maximum Operating Temp | 94°C | 94°C |
| Partner Price (MSRP) | $599 | $379 |
| FE Price (MSRP) | $699 | $449 |
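The memory bandwidth figures in the specifications follow directly from the bus width and the per-pin memory speed. As a quick sanity check, here is a small Python sketch; the function name is hypothetical, and the formula is the standard peak-bandwidth calculation (bus width in bytes multiplied by the per-pin data rate).

```python
def memory_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s.

    bus_width_bits: width of the memory interface in bits
    data_rate_gbps: effective data rate per pin in Gbps
    """
    return bus_width_bits / 8 * data_rate_gbps


gtx_1080_bw = memory_bandwidth_gb_s(256, 10)  # 10 Gbps GDDR5X
gtx_1070_bw = memory_bandwidth_gb_s(256, 8)   # 8 Gbps GDDR5

print(f"GTX 1080: {gtx_1080_bw:.0f} GB/s")
print(f"GTX 1070: {gtx_1070_bw:.0f} GB/s")
print(f"GTX 1070 keeps {gtx_1070_bw / gtx_1080_bw:.0%} of the 1080's bandwidth")
```

With the same 256-bit bus on both cards, the bandwidth gap comes entirely from the slower GDDR5 memory.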

The Nvidia GeForce GTX 1070 Comes With All Of The GTX 1080’s Goodies

One of the exciting new features that Nvidia is debuting with the GTX 1080 and GTX 1070 is GPU Boost 3.0. The feature builds on GPU Boost 2.0 but is cleverer and more granular. The frequency curve is now dynamic and is defined per voltage point, as opposed to applying a single fixed margin. The result is higher clock speeds and better voltage utilization, which translates to superior performance per watt. It means that while the GTX 1080’s boost clock is rated at around 1.7GHz, the card will on average boost to nearly 1.9GHz. The GTX 1070 should be no different: despite its slightly lower rated boost clock of 1683MHz, we should see a similar 200MHz uplift in average clocks while gaming.

Continuing on the feature front, there’s one more exciting update to the capability of GeForce graphics cards that many gamers, and particularly game streamers, will absolutely love. The GTX 1080 and 1070 can drive resolutions of up to 7680×4320 at 60Hz and handle 8K decode at 30Hz. That’s 8K, folks! The cards are also capable of up to 4K 60Hz encode and 4K 120Hz decode.

The cards also finally support 10 bits per color channel, a first for any GeForce product and something their Radeon counterparts have had for years. So if you have a color-rich 10-bit-per-channel monitor you’ll finally be able to enjoy it on GeForce. Blacks will be blacker, and reds, greens and blues far more accurate. This includes support for upcoming High Dynamic Range monitors as well.

Pascal features two key improvements over Maxwell: dynamic load balancing and improved pre-emption. Both help boost Pascal’s async compute performance compared to Maxwell. They allow time-critical workloads to be inserted more quickly into the pipeline and improve utilization by layering PhysX and post-processing workloads over gaps in the pipeline.

However, Pascal fundamentally still can’t execute async code concurrently without pre-emption. This is quite different from AMD’s GCN architecture, which has Asynchronous Compute Engines (ACEs) and hardware schedulers that enable the execution of multiple kernels concurrently without pre-emption or context switching.

What really matters at the end of the day, though, is performance. Testing is well underway at some of the industry’s most respected publications to determine whether Pascal’s dynamic load balancing and speedier pre-emption can do enough to close the gap with AMD’s ACEs in DirectX 12 async compute performance. We’ll offer an in-depth write-up detailing Pascal’s async compute advances very soon.

Simultaneous Multi-Projection

Simultaneous Multi-Projection (SMP) is a VR-specific feature that improves performance by eliminating the need to render what the user can’t see, spending energy only on what the user can actually see. The Oculus Rift outputs a total of 4.2 megapixels, not all of which are useful to the user or form part of the visible scene. SMP can eliminate up to 1.4 megapixels, rendering only 2.8 megapixels in total and significantly improving performance.
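To put the pixel savings in perspective, the short Python sketch below works out the share of the Rift's output that SMP can skip, using only the megapixel figures quoted above. The variable names are illustrative, and the result is a best case, since the text describes 1.4 megapixels as an upper bound.

```python
# Figures quoted in the text, in megapixels.
total_mp = 4.2        # what the Oculus Rift outputs in total
eliminated_mp = 1.4   # the most SMP can skip, per the text

rendered_mp = total_mp - eliminated_mp
saved_fraction = eliminated_mp / total_mp

print(f"Rendered: {rendered_mp:.1f} MP")
print(f"Pixel work saved (best case): {saved_fraction:.0%}")
```

In other words, in the best case about a third of the pixel shading work disappears.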

SMP also cuts geometry render times by half. SMP is not a Pascal-specific feature, as it’s done more in software, but according to Nvidia a new sub-segment inside the PolyMorph 4.0 engine has been built specifically to improve Pascal’s SMP capability.