Microsoft’s Project Scorpio: More Details Revealed

This news piece contains speculation, and suggests silicon implementation based on released products and roadmaps. The only elements confirmed for Project Scorpio are the eight x86 cores, >6 TFLOPs, 320 GB/s, that it is built by AMD, and that it is coming in 2017. If anyone wants to officially correct any speculation, please get in touch.

One of the major points of contention with consoles, especially when viewed through the lens of the PC enthusiast, is the specifications. Consoles have long development processes, and are thus already behind the curve at launch, leading to a rapidly widening gap from high-end components as the life-cycle of the console is anywhere from five to seven years. The trade-off is usually that the console is an optimized platform, particularly for software: performance is consistent and it is much easier to optimize for.

For six months or so now, Microsoft has been teasing its next generation console. Aside from launching the Xbox One S as a minor mid-season revision to the Xbox One, the next-generation ‘Project Scorpio’ aims to be the most powerful console available. While this is a commendable aspiration (one that would look odd if it wasn’t achieved), the meat and potatoes of the discussion has still been relatively unknown. Now some of the details have come to the surface through a PR reveal with Eurogamer’s Digital Foundry.

We know the goal with Project Scorpio is to support 4K playback (4K UHD Blu-Ray), as well as a substantial part of 4K gaming. With recent introductions in the PC space of ‘VR’ capable hardware coming down in price, Microsoft has to carefully navigate what it can offer. It’s expected that this generation will still rely on AMD’s semi-custom foundry business, given that high-end consoles are now on x86 technologies and Intel’s foundry business is still in the process of being enabled (Intel’s foundry is also expected to be expensive). Of course, pairing an AMD CPU and AMD GPU would be the smart choice here, with AMD launching a new GPU architecture last year in Polaris.

Here’s a table of what the reveal covers:

Microsoft Console Specification Comparison

| | Xbox 360 | Xbox One | Project Scorpio |
|---|---|---|---|
| CPU Cores/Threads | 3/6 | 8/8 | 8 / ? |
| CPU Frequency | 3.2 GHz | 1.6 GHz (est) | 2.3 GHz |
| CPU µArch | IBM PowerPC | AMD Jaguar | AMD x86 ? |
| Shared L2 Cache | 1MB | 2 x 2MB | ? (GPU L2 is 4x) |
| GPU Cores | - | 16 CUs, 768 SPs, 853 MHz | 40 CUs, 1920 SPs ?, 1172 MHz |
| Peak Shader Throughput | 0.24 TFLOPS | 1.23 TFLOPS | >6 TFLOPs |
| Embedded Memory | 10MB eDRAM | 32MB eSRAM | None |
| Embedded Memory Bandwidth | 32GB/s | 102-204 GB/s | None |
| System Memory | 512MB GDDR3-1400 | 8GB DDR3-2133 | 12GB GDDR5-1700 |
| System Memory Bus | 128-bits | 256-bits | 384-bit |
| System Memory Bandwidth | 22.4 GB/s | 68.3 GB/s | 326GB/s |
| Manufacturing Process | - | 28nm | 16nm TSMC |

Specs in italics were added after the table was created.

At a high level, we have eight ‘custom’ x86 cores running at 2.3 GHz for the CPU and 40 compute units at 1172 MHz for the GPU. The GPU will be paired with 12GB of GDDR5, to give 326GB/s of bandwidth. Storage is via a 1TB HDD, and the optical drive supports 4K UHD Blu-Ray.

Let’s break this down with some explanation and predictions.

Eight Customized CPU Cores

The Xbox One uses AMD’s Jaguar cores: these are low powered and simpler cores, aimed at a low-performance profile and optimized for cost and power. In non-custom designs, we saw these CPUs hit above 2 GHz, but they were limited to 1.75 GHz in the Xbox One. While not entirely impossible, it would be unlikely that Jaguar cores (that were made on a 28nm process) would also be in the Scorpio.

The other cores AMD has available are Excavator based (28nm) or Zen based (14nm). The latter is a design that has returned AMD to the high-end of x86 performance computing, offering high performance for reasonable power, but a 14nm design would be relatively expensive. Eight cores would fit in with a standard Zeppelin silicon design, which AMD has been manufacturing hand-over-fist since the launch of desktop-based Zen CPUs for PCs in March. One of the arguments against Zen inside Scorpio is the fact that it was only launched recently, and arguably the desktop PC market is more financially lucrative for AMD.

Technically Microsoft could go for Zen in the Scorpio, but I suspect this would increase the base cost of the console. However, if Microsoft were going for a premium console ($700+), this would make sense.

A note on Zen power and frequency: 2.3 GHz is a low frequency for a Zen CPU based on what we’ve seen in desktop PCs. Some work done internally on the power consumption of Zen CPUs has shown that the design requires a lot of power to move between 3.5 GHz and 4.0 GHz, perhaps suggesting that 2.3 GHz is so far down the DVFS curve that the power consumption is relatively low. Also, we’re under the impression that reaching a very high frequency on Zen is a tough restriction when it comes to binning chips. Offering a low-frequency bin would mean that all the silicon that doesn’t make it to desktop retail, due to an inability to go up the DVFS curve, could end up in devices like the Scorpio. The spec list doesn’t have a turbo frequency, which remains an unknown (if present).

That being said, this is a ‘custom’ x86 core. Microsoft could have requested specific IP blocks and features not present in desktop CPUs, or different methods of branch prediction enabled, and so on. This would either require a new silicon design of the Zeppelin silicon, or it’s already in there, ready for Microsoft. Typically a console shares DRAM between the CPU and GPU, so it might be something as simple as the CPU memory controller supporting GDDR5. So either we’re seeing Zen coming to consoles, or we’re seeing another crack at using Jaguar on 28nm (it’s unlikely to get a 14nm spin) to keep overall costs down. Given that the main focus of a console is the GPU, that’s entirely possible.

40 Customized Compute Units

AMD launched Polaris 10 last year: their latest compute architecture on a 14nm process, giving substantial power efficiency gains over previous 28nm designs. The first consumer GPUs were aimed at the $200-$230 market and below, which is something that would be of interest to console manufacturers. However, AMD is set to launch Vega this year, on a new architecture (also on 14nm) with additional performance per watt gains, but for high-end GPUs.

Bypassing AMD’s Fiji GPUs using silicon interposers and high-bandwidth memory, AMD’s latest design is the RX 480. The RX 480 is a 36 compute unit design, using 4GB or 8GB of 256-bit GDDR5 memory, giving 256GB/s of total memory bandwidth. According to the information given to Digital Foundry, Scorpio will have 40 compute units, 12 GB of GDDR5, and will be good for 326 GB/s of memory bandwidth. Technically the RX 480 is a fully enabled design, and only offers 36 compute units in total, suggesting that Scorpio is either using a new silicon spin of this design (with a lop-sided memory configuration), or is moving on to a Vega based design. The fact that the spec list has 1172 MHz on it, and Vega is supposed to offer higher clocks, means that we’re at a cost issue again: Vega is expected to cost a pretty penny, while consoles are often low-cost designs. This is most likely a Polaris implementation, especially as we already know that Scorpio will be > 6 TFLOPs, and the RX 480 is ~5 TFLOPs.
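The TFLOPs comparison above can be sanity-checked with the usual peak-throughput formula for AMD's GCN-style GPUs. This is a rough sketch, not anything from Microsoft or AMD: it assumes 64 shader lanes per compute unit and two floating-point operations per lane per clock (one fused multiply-add), and it assumes the RX 480 is running at its 1120 MHz base clock. The function name is purely illustrative.

```python
# Theoretical peak FP32 throughput: CUs x lanes per CU x 2 FLOPs (one FMA) x clock.
LANES_PER_CU = 64      # assumption: GCN-style compute unit with 64 shader lanes
FLOPS_PER_LANE = 2     # a fused multiply-add counts as two floating-point ops


def peak_tflops(compute_units: int, clock_mhz: float) -> float:
    """Return theoretical peak single-precision throughput in TFLOPS."""
    flops = compute_units * LANES_PER_CU * FLOPS_PER_LANE * clock_mhz * 1e6
    return flops / 1e12


scorpio = peak_tflops(40, 1172)  # CU count and clock from the Digital Foundry reveal
rx480 = peak_tflops(36, 1120)    # assumption: RX 480 at its 1120 MHz base clock

print(f"Scorpio: {scorpio:.2f} TFLOPS")
print(f"RX 480:  {rx480:.2f} TFLOPS")
```

Under these assumptions the 40 CU, 1172 MHz configuration lands right at the 6 TFLOPs mark, while the 36 CU part sits a little above 5 TFLOPs, which matches the figures quoted in the text.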

Ideally I would like to get Ryan’s thoughts on this, and will do so when he signs in for the day, but his analysis of some of the specs back in June 2016 still stands:

The memory bandwidth of Project Scorpio, 320 GB/s, is also relatively interesting given the current rates of the RX 480 topping out at 256 GB/s. The 320 GB/s number seems round enough to be a GPU only figure, but given previous embedded memory designs is likely to include some form of embedded memory. How much is impossible to say at this point.

Additional: On 4K support, the latest AMD media block supports 4K60 with HEVC, as well as HDMI 2.0. When rendering 4K content to a 1080p screen, Microsoft has mandated to all developers that Ultra-HD rendering should super-sample down to 1080p.

What We Don’t Know

The Xbox One used a combined CPU/GPU in a single piece of silicon. Adding up the Zen silicon area + a Polaris 10 die comes to almost 450mm2, which would be a large piece of silicon from Global Foundries (as well as being expensive with low yields), so we’re probably looking at a split silicon design. This would mean that the memory is split between the CPU/GPU (perhaps 4GB for CPU, 8GB for GPU?), or some low-level software is managing DRAM distribution between the two to take advantage of HSA features such as zero-copy.

The original Xbox One used 8GB of DDR3 memory shared between the CPU and GPU, as well as a 32MB ESRAM mini-cache to help boost memory bandwidth. There’s no indication that Project Scorpio uses a caching method, though it may yet do so. The memory bandwidth value might be a combination of what’s available to the main memory and cache, or might just be related to the GPU; we don’t know at this point.

If all of the core silicon is using AMD's latest, then we’d expect it to be made at Global Foundries on a 14nm process. This leads to questions about yields and cost: we’re assuming that Microsoft is going for a high-end design, which is likely to attract a high-end price. Going back over the console generations and adjusting for inflation to today’s prices, some consoles in the last couple of decades have drifted into $600+ equivalent territory. It might be likely that Microsoft is looking at that, if they’re going with the latest technology. The alternative is using older technologies (such as 28nm Jaguar cores for the CPU and a 14nm GPU) to keep costs down.

Hardware aside, the launch titles will be an interesting story in themselves, especially with recent closures of dedicated MS studios such as Lionhead.

Project Scorpio is due out in Fall / Q3 2017.

Additional 4/6: 16nm TSMC

I missed this when I initially read the piece: Project Scorpio's central piece of silicon will be built on 16nm TSMC. Time to process this one.

Source: Digital Foundry

Jaguar was made at 28nm TSMC, and would require a redesign for 16nm. It would result in much lower power, and also much lower die area. Compared to the GPU, an eight-core Jaguar design might be 10-15% of the entire silicon.

However, AMD recently incurred additional quarterly costs for using foundries other than Global Foundries (as per their renegotiated wafer agreement), which a number of analysts chalked up to future server designs being made elsewhere. A few of us postulated it's more to do with AMD's semi-custom business, and either way it points to Zen being redesigned for 16nm TSMC. This makes it an interesting question all around. [update, see below]

The same applies to the GPU: Polaris and Vega are promoted as being on 14nm processes, but could be redesigned for 16nm. The Eurogamer article quotes Andrew Goossen, Technical Fellow for Graphics at Microsoft:

"These are the big ticket items, but there's a lot of other configuration that we had to do as well," says Goossen, pointing to a layout of the Scorpio Engine processor. "As you can see, we doubled the number of shader engines. That has the effect of boosting our triangle and vertex rate by 2.7x when you include the clock boost as well. We doubled the number of render back-ends, which has the effect of increasing our fill-rate by 2.7x. We quadrupled the GPU L2 cache size, again for targeting the 4K performance."
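The 2.7x figure in the quote is consistent with a simple product of the doubled unit count and the GPU clock increase. The sketch below is only a back-of-envelope check, assuming the rate scales linearly with both the number of units and the clock, and using the 853 MHz and 1172 MHz GPU clocks from the table; the function name is made up for illustration.

```python
XBOX_ONE_GPU_MHZ = 853    # Xbox One GPU clock from the spec table
SCORPIO_GPU_MHZ = 1172    # Scorpio GPU clock from the reveal


def rate_scaling(unit_multiplier: float, old_mhz: float, new_mhz: float) -> float:
    """Scaling factor, assuming the rate is linear in unit count and clock."""
    return unit_multiplier * (new_mhz / old_mhz)


# Doubling the shader engines (or render back-ends) on top of the clock increase.
scaling = rate_scaling(2, XBOX_ONE_GPU_MHZ, SCORPIO_GPU_MHZ)
print(f"Estimated scaling: {scaling:.2f}x")
```

The result comes out just under 2.75x, which lines up with the quoted 2.7x for both the triangle/vertex rate and the fill-rate.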

Additional #2 4/6: 384-bit interface, 12GB is split

The memory bus is listed as a 384-bit interface. This probably means we're dealing with a Vega-based design. It implies twelve 32-bit channels, with modules running at 6.8 Gbps per pin (or GDDR5-1700, which is similar to desktop processors).
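The 326 GB/s headline figure follows from the bus width and the per-pin data rate. As a minimal sketch, treating the 6.8 figure as a per-pin rate in Gbps and assuming the RX 480's 256 GB/s corresponds to 8 Gbps memory on its 256-bit bus (the function name is illustrative):

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8


CHANNEL_WIDTH_BITS = 32                  # one GDDR5 channel
channels = 384 // CHANNEL_WIDTH_BITS     # 12 channels on a 384-bit bus

scorpio_bw = bandwidth_gb_s(384, 6.8)    # figures from the reveal
rx480_bw = bandwidth_gb_s(256, 8.0)      # assumption: 8 Gbps GDDR5 on the RX 480

print(f"Channels:   {channels}")
print(f"Scorpio:    {scorpio_bw:.1f} GB/s")
print(f"RX 480:     {rx480_bw:.1f} GB/s")
```

The wider bus more than compensates for the slower memory: Scorpio ends up with roughly 27% more bandwidth than the RX 480 despite running lower-clocked GDDR5.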

The 12GB of GDDR5 is split, with 4GB available for the system and 8GB available for developers. There is no ESRAM, the rationale being that the bandwidth of the GDDR5 is sufficient. The counter to this is a slightly higher latency, which Microsoft expects developers to hide when pushing higher resolutions.

One element of the description passed me by initially: Digital Foundry saw the silicon floor plan, and reports two clusters of four CPU cores. These might be CCX units from Zen, each being four cores. AMD stated a Zen CCX was 44mm2 each on GloFo 14nm, so it would be about the same on TSMC. But this would put a sizeable chunk of the die area toward the CPU, close to one-third of the chip. We don't know the size of Vega, but 36 CUs of Polaris 10 on GloFo is 232mm2 at 5.7 billion transistors. So ~230 for GPU + ~100 for CPU comes out at around 330mm2. The total die size for the combined chip is listed as 360mm2, including CPU and GPU, with four shader engines each containing 11 compute units (one is disabled per block). This is all within 7 billion transistors.
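The area and compute-unit arithmetic in this paragraph can be laid out explicitly. This is a rough sketch using only the raw figures quoted above, and it assumes two four-core Zen CCX blocks plus a Polaris 10 sized GPU; the article rounds the CPU portion up to ~100mm2 to allow for surrounding logic, so the raw sum here comes out slightly lower than the ~330mm2 in the text.

```python
ZEN_CCX_MM2 = 44          # AMD's stated area for one four-core CCX (GloFo 14nm)
POLARIS10_MM2 = 232       # 36 CU Polaris 10 die
LISTED_DIE_MM2 = 360      # total die size listed for the combined chip

# Assumption: two CCX blocks for eight cores, plus a Polaris 10 sized GPU.
cpu_mm2 = 2 * ZEN_CCX_MM2                    # 88 mm2
estimate_mm2 = cpu_mm2 + POLARIS10_MM2       # 320 mm2
remaining_mm2 = LISTED_DIE_MM2 - estimate_mm2

# Four shader engines of 11 CUs each, with one CU disabled per engine.
SHADER_ENGINES = 4
CUS_PER_ENGINE = 11
physical_cus = SHADER_ENGINES * CUS_PER_ENGINE
active_cus = SHADER_ENGINES * (CUS_PER_ENGINE - 1)

print(f"CPU + GPU estimate: {estimate_mm2} mm2, leaving {remaining_mm2} mm2 of the listed die")
print(f"Compute units: {physical_cus} physical, {active_cus} active")
```

The 40 mm2 left over has to cover the extra four physical compute units, the wider memory interface and the rest of the uncore, so the estimate is only a plausibility check rather than a floor plan.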

Microsoft also states that the power supply with the unit will be rated up to 245W. If we assume a low frequency Zen CPU inside, that could be around 45W max, leaving 200W for the GPU. A full sized RX 480 comes in at 150W, and given this GPU is a little more than that, perhaps nearer 170W. The power supply, in a Zen + Polaris configuration, seems to have a good 20-25% power budget in hand.

Source: Digital Foundry

Based on some of the discussion from the source, it would seem that AMD is implementing a good number of its power saving features, particularly related to unique DVFS profiles per silicon die as it comes off the production line, rather than a one-size-fits-all approach. The silicon will also be paired with a vapor chamber cooler, using a centrifugal fan.