Coming soon to a Radeon near you: AMD unveils its plans for High Bandwidth Memory

During its analyst day two weeks ago, AMD confirmed that its next iteration of high-end Radeon cards would adopt High Bandwidth Memory, or HBM. We’ve previously covered HBM’s technical implementation in some depth, but we haven’t had formal acknowledgment from AMD that it would release the technology, or official data on how it compares to GDDR5. Now we do, and the final figures point to potent performance for the upcoming Radeon.

AMD decided to invest in HBM research seven years ago, when it became apparent that a new memory standard would be needed to replace GDDR5. Conventional GDDR designs have scaled extremely well over the past decade, but as AMD’s slides show, the limits of DRAM scaling and the difficulty of routing so many traces around the GPU itself had become significant problems.

Simply scaling GDDR5 to higher clock speeds in order to meet the demands of faster GPUs was no longer sufficient. Earlier this year, Samsung announced that it had begun producing GDDR5 rated for up to 8Gbps, but that’s just a 14% bandwidth increase over existing 7Gbps GDDR5. Like CPUs, DRAM has a non-linear power consumption curve: higher clocks require higher voltages, and power consumption rises with the square of the voltage. A new approach was needed, and HBM provides it.
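To put rough numbers on that trade-off, here is a minimal sketch of the standard CMOS dynamic-power relationship (P ≈ C·V²·f). The voltage figures are hypothetical, chosen only to illustrate the trend; they are not published GDDR5 specifications.

```python
# Sketch of CMOS dynamic power scaling: P ~ C * V^2 * f.
# Voltages below are hypothetical illustration values, not GDDR5 spec numbers.

def dynamic_power(capacitance, voltage, frequency):
    """Approximate switching power of a CMOS circuit."""
    return capacitance * voltage ** 2 * frequency

baseline = dynamic_power(capacitance=1.0, voltage=1.50, frequency=7e9)  # 7Gbps-class
faster   = dynamic_power(capacitance=1.0, voltage=1.60, frequency=8e9)  # 8Gbps-class

print(f"Bandwidth gain: {8 / 7 - 1:.0%}")                # ~14%
print(f"Power increase: {faster / baseline - 1:.0%}")    # ~30% to buy that 14%
```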

Introducing High Bandwidth Memory

AMD’s next-generation Radeon will be packaged together with its memory through the use of a 2.5D interposer. The diagram below illustrates how this is accomplished. Instead of connecting to off-package DRAM through a variety of circuit traces, the GPU and its memory connect through the interposer itself.

AMD’s HBM implementation is the first iteration of High Bandwidth Memory to come to market and stacks four DRAM die, one on top of the other. Each individual DRAM die contains two gigabits of memory, which means each four-die stack adds up to 1GB. The first-generation HBM technology that AMD has deployed here allows for up to four DRAM stacks of 1GB each. Each stack is accessed through a 1024-bit memory channel clocked at up to 500MHz (a 1Gbps effective data rate per pin). This works out to a maximum throughput of 512GB/s for a four-stack configuration. AMD expects total scaling performance to be excellent, with 4GB of HBM providing resolution scaling equivalent to 2-3x as much GDDR5.
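The capacity and bandwidth figures above are easy to sanity-check. Here is a back-of-the-envelope sketch; the 512-bit, 5Gbps GDDR5 comparison point (roughly R9 290X-class) is included only for reference:

```python
# Capacity: 2Gb per die, four die per stack, four stacks per package.
gb_per_die   = 2 / 8              # 2 gigabits = 0.25 gigabytes
gb_per_stack = gb_per_die * 4     # 1GB per stack
total_gb     = gb_per_stack * 4   # 4GB on the card
print(f"Capacity:  {gb_per_stack:.0f}GB per stack, {total_gb:.0f}GB total")

# Bandwidth: 1024-bit interface per stack at an effective 1Gbps per pin.
per_stack_gbs = 1024 * 1e9 / 8 / 1e9   # 128 GB/s per stack
total_gbs     = per_stack_gbs * 4      # 512 GB/s across four stacks
print(f"Bandwidth: {per_stack_gbs:.0f}GB/s per stack, {total_gbs:.0f}GB/s total")

# Reference point: a 512-bit GDDR5 bus at 5Gbps effective.
print(f"512-bit GDDR5 @ 5Gbps: {512 * 5e9 / 8 / 1e9:.0f}GB/s")
```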

Power consumption, die size

One of the major improvements HBM brings over existing GDDR5 is power consumption. According to AMD, a high-end GPU like the R9 290X spends 15-20% of its power budget on its RAM. On a card with a 250-300W TDP, that works out to 37W-60W of the GPU's total power draw spent on memory under load. AMD says adopting HBM slashes this figure by more than 50%, with a huge increase in total bandwidth delivered per watt.

This is in line with other estimates we’ve seen for the relative power consumption of HBM and GDDR5. Nor is power the only saving: the combined GPU-plus-memory interposer is far smaller than the GPU and DRAM footprint on a conventional card. AMD estimates it has cut its PCB footprint nearly in half.

These advances should make dual-GPU designs far simpler to build in the future than they’ve been to date. The boards themselves need not be as complex, and the total area devoted to memory will be an order of magnitude smaller than in current designs like the R9 295X2.

HBM: Nice for GPU, but an APU game-changer

AMD will be the first company to bring HBM to market in a mainstream part, and we don’t yet know how many SKUs the company is launching. All of the sources we’ve spoken to (sources in a position to know) say that Fiji will have 4GB of RAM when it launches in the not-too-distant future, not 8GB. Given that Nvidia’s highest-end consumer GPUs offer 4GB and 4GB-ish amounts of memory (not counting the $1,000 Titan), Fiji should compete well on that front.

Even more exciting is what HBM could mean for APUs 18-24 months from now. While AMD obviously isn’t giving timelines, the company confirmed that it intends to extend HBM across its product stack, including future APU designs. Even a single HBM stack would provide a 1024-bit memory bus, dwarfing the bandwidth of even a high-speed quad-channel DDR4 design.
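To put that claim in perspective, here is a rough comparison of a single first-generation HBM stack against quad- and dual-channel DDR4. The DDR4-2666 speed grade is an assumption chosen for illustration, not something AMD has announced:

```python
# One HBM1 stack: 1024-bit interface at an effective 1Gbps per pin.
hbm_stack_gbs = 1024 * 1e9 / 8 / 1e9            # 128 GB/s

# DDR4-2666 (assumed speed grade): 64-bit channels at 2666 MT/s.
ddr4_quad_gbs = 4 * 64 * 2.666e9 / 8 / 1e9      # ~85 GB/s, quad channel
ddr4_dual_gbs = 2 * 64 * 2.666e9 / 8 / 1e9      # ~43 GB/s, dual channel (typical APU today)

print(f"Single HBM stack:  {hbm_stack_gbs:6.1f} GB/s")
print(f"Quad-channel DDR4: {ddr4_quad_gbs:6.1f} GB/s")
print(f"Dual-channel DDR4: {ddr4_dual_gbs:6.1f} GB/s")
```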

That much memory bandwidth could crack the bandwidth problems that have choked integrated graphics to one degree or another from the very beginning. Sharing main memory and competing with the CPU for bandwidth has always hurt integrated graphics performance, going all the way back to the Cyrix MediaGX. By 2017-2018, AMD may have solved that issue for good.


I’ve got a retired HP case with only low-profile slots. It would be nice to reuse it along with a low-profile high-end graphics card.

Joel Hruska

Fiji is unlikely to ship in low-profile. You can pick up a GTX 750 in a low-profile form factor, however.

Henry Massey

“Each individual DRAM die contains two gigabits of memory, which means four stacks adds up to 1GB.”

Shouldn’t that be 4GB? If a stack is 4 dies, that’s 8Gb or 1GB…

Andreas Sjånes Berg

4 dies stacked on top of each other × 2Gb = 8Gb = 1GB per stack.
And then you’ve got four of these stacks, giving you the 4GB total.

They mean that the stack has 1GB.

Probably should have written “…which means four stacked dies add up…”

Joel Hruska

I will clarify the wording on this.

Steven Massey

That’s what I got out of it. One byte = a grouping of 8 bits; 2×4=8.

Joel Hruska

Each DRAM die contains two gigabits of memory. There are four die per stack, or 1GB of RAM. Then there are four stacks total on the chip. 4GB of RAM.

Marcus2012

Nope. GigaBIT, not byte. Divide by 8 to get gigabytes. Each chip is 256MB, and there are 4 of them, aka 1GB.

Henry Massey

If you read the comments again, you’ll perhaps understand that that is what I said. One die is 2 gigabits (256MB), a stack is 4 dies, 8Gb (1GB), and 4 stacks would be 4 gigabytes. I was commenting on Joel’s assertion that “…four stacks adds up to 1GB.”

Joel Hruska

I edited the language in that section to hopefully clarify what I was referring to.

Each “stack” of DRAM contains four die. Total size of each stack is 1GB in this first generation product. There are four stacks of memory on Fiji, for a total capacity of 4GB.

They were, as AMD is now, PRICE competitive – if you didn’t care about performance vs. basic functionality. The problem I remember most (about the GX specifically) is that the drivers were horribly unstable. When later editions of Windows arrived, they simply didn’t update them.

I think GX was on the right track, but didn’t have the resources to keep going. AMD A-series is doing it all over again, but this time their drivers actually work well.

Busybee

Their CPU design had one main fatal flaw, which is that FPU performance was very bad (for example, try using any software-based MPEG-1 decoder to play video at full quality and it will struggle). The other issue is software compatibility, and the MediaGX was certainly problematic (worse than their 6×86 chips; it was more like an 80486-class CPU than a Pentium-class CPU, which caused certain software to crash or refuse to work properly).

Joel Hruska

Busy,

That’s certainly what I recall. Between Intel, AMD, and Cyrix, I always went with AMD. Their FPU performance lagged Intel, but it was “good enough” for gaming once 3D cards started taking the market. FPU performance still mattered, but the K6-2 could maintain a playable framerate in tests like Q2Crusher.

Busybee

Well, I was talking about those Cyrix chips which had really bad FPU performance (even reviews had shown that issue, example: http://www.anandtech.com/show/34 ). I’ve handled many of them in the past, including that MediaGX (mostly used in mini PCs).

FrankenPC .

I remember the Cyrix 486 processors. 486 sx I think? They didn’t fully support the instruction set or the timing. Something made them really unstable and slow.

Joel Hruska

The 486SX was Intel’s budget 486 version, but the history and product-naming here is very confusing. To keep a longish story short: Cyrix had a number of products that used similar nomenclature to Intel and AMD (486, 586) but implemented limited features from those chip families. During this time period, neither AMD nor Cyrix could match Intel, but AMD was generally closer.

pepe2907

All these nice things should have happened 2-3 years ago; now we are hoping to see them 2-3 years from now. Not surprisingly, PC sales have hit bottom.

Joel Hruska

Yes, it would be nice if physics gave us everything we wanted. Instead, it takes time and effort. However, look for HBM shipping in less than two months.

( )

Actually, AMD announced HBM for a 2012 Radeon “Tiran” part, but it was evidently cancelled due to manufacturing problems.

Also, writers have been humping the glories of multicore CPUs for so long that they have failed to remind the consumer that they’ve been wasting their money.

DX11 does not use more than ONE CPU core to feed the GPU. The rest of them just sit there doing nothing. And since single-core speed hasn’t increased much, actual graphics performance has not increased that much either. In fact, some enthusiasts pay $$$$thousands for the latest tech to gain single-digit performance increases.

THAT ENDS WITH DX12. ALL CPU cores scale DX12 performance!!

DX12 by itself will show around a 500% increase in performance. A dGPU running DX11 shows 2.2 MILLION draw calls; with DX12 that bumps to 18 MILLION!!!

Finally after AMD developed and released Mantle and gave it to MS to release as DX12 we will be able to use ALL CPU cores and the result is a staggering increase in performance. ALL CPU multithreading and multithreaded cores are enabled with DX12 regardless of how the game is coded.

Windows 10 and DX12 will likely show a bump in PC sales. But that is only when the media starts to write the truth about DX12. And they would have to admit to lying to the consumer about how they needed to buy the latest multicore CPU to drive a beaucoup$$$$ dGPU.

Here’s a good one for you. Intel i7-4960 + nVidia GTX 980 using DX11 produces 2.2MILLION drawcalls at 30FPS. It costs about $1500.

Switch to DX12 and an AMD A6-7400 APU produces 4.4MILLION drawcalls at 30FPS. That only cost $100 or so.

This is why the media continues to hide the truth from the consumer.

Let’s see if Hruska has the stones to bench Fiji using DX12!!!!

raulortiz318

100% marketing speak. I’m sorry, but anytime a DX (or any software) release claims massive improvement like that, it never emerges. Gains are never what the best case, lab tested scenario claims it to be.

PeasantCrusher9000

Can these cards do okay with a maximum of 4GB? I mean, NVidia has 3.5 on one of them and people only got a little mad.

Joel Hruska

The GTX 970 has 4GB of memory but cannot address it all in a conventional manner for gaming. Fiji should be fine at 4GB.

Ninja Squirrel

Not sure about this, but WCCFTech claims that the R9 390X will use a dual-link interposer. It allows building an 8-Hi stack, just like HBM2. The R9 390X could have 8GB of VRAM that way.

I wonder how low a TDP we could get with HBM if it was implemented on the current Nvidia 750 Ti lol

Joel Hruska

The 750 Ti probably uses less power for GDDR5 because it doesn’t use a particularly aggressive clock speed. I would guess NV could trim 10% or so off the power consumption of the card with HBM. More importantly, they could make a smaller version.

AS118

It’s going to be interesting to see what this’ll do for GPUs and especially APUs. If AMD can get it working well with their next generation of 14nm Zen APUs, they may have a killer product that can make onboard graphics capable of competing with the midrange GPUs of recent times.

Edit: Intel too, but AMD probably has the better tech for that for now, and more motivation to do something about it. With AMD not really challenging Intel in the x86 space, I feel that Intel feels comfortable not competing very hard, except in mobile.

( )

AMD has thrown down the gauntlet in the x86 APU and GPU space by releasing Mantle and giving it to Microsoft to release as DX12.

Not only does AMD challenge Intel in the gaming graphic space, because of Mantle and DX12 they are crushing Intel. Because of Mantle and DX12, all AMD A6-A10 APU’s outperform ALL Intel i3, i5 and i7 IGP by 100% in draw call performance at 30fps; 4.4MILLION vs 2.2MILLION!!!

Writers like Joel Hruska do not want that information widely disseminated, it shatters their world view that AMD can not build good silicon so they will continue to use DX11 to benchmark bleeding edge technology as it cripples the performance of Radeon.

We have to realise that HBM is a solution that was designed for Mantle and DX12. 3DMark’s API Overhead Test indicates that there is a GPU bottleneck when the CPU scales up to 6 cores and above. HBM will solve the GPU bottleneck that occurs when the CPU scales above 6 cores. The API Overhead test that Anand posted a few months back made that point pretty clear. High Bandwidth Memory will allow much more throughput than conventional RAM.

With DX11 HBM will probably not show a huge gain in performance as the API just will not tax the GPU all that much. The MINspec of DX12 is somewhere around 9x-12x the MAXspec of DX11.

That is why using DX11 to benchmark Fiji is essentially lying to the consumer. If you are going to spend $900 for new technology do you want to know EXACTLY how it performs with equally bleeding edge DX12 and Mantle?

Don’t you also want to know EXACTLY what CPU needs to drive it too? Intel or AMD?

Benchmarks drive consumer choices. ExtremeTech’s readers depend upon the integrity of its writers to give them the BEST information to make informed choices.

We’ll see.

Joseph Snodgrass

My gtx 960 owns your apu

( )

@joelhruska:disqus

I am waiting to see just who has the guts to actually benchmark Fiji using DX12 and Mantle API vs the BEST that nVidia can come up with.

Likely I will have to wait a very long time, as I am sure this outlet will never show nVidia performing poorly in comparison to new Radeon technology.

Using DX11 to test this fine piece of cutting edge technology would be a waste of time and basically a huge lie.

So what’s it going to be Joel?

Are you going to tell the truth about Fiji and use DX12 benchmarks: Starswarm and 3dMark API Overhead Test?

Or are you going to continue to mislead the consumer and test Radeon using DX11? Why not use DX10 and DX9 for that matter. The results would be about as useful and relevant!

As you well know, by Christmas most new games will be ported or written for DX12. As the MAXspec of DX11 does not come close to the MINspec of DX12, Fiji will likely DECISIVELY crush ANY nVidia contender using DX12.

So are you going to use DX12? Or are you going to recommend that the consumer spend their hard earned money based on the benchmark scores of the obsolete API DX11?

And that brings us to my last point. Fiji has so much bleeding-edge memory technology burned into it: HSA, GCN 1.2 Asynchronous Shader Pipelines, Asynchronous Compute Engines, etc. Pairing this with an Intel CPU again would be a lie to the consumer and a deliberate attempt to cripple the performance of cutting-edge technology.

Radeon needs to be paired with AMD FX AND AMD A10 APUs as well as Intel CPUs. You owe it to your readers to be accurate, and your readers do need accurate data to make educated buying choices.

So drive a stake through DX11, it is obsolete, and soon to be DEAD.

DX12 will change the gaming world.

Stardock CEO Brad Wardell, has been quite vocal about DX12 benefits for some time now. Being one of the first studios to have invested in the tech, during GDC 2015 Wardell demonstrated Ashes of the Singularity, a DX12-enabled RTS game. He also gave an interview with Nichegamer and laid the entire DX12 controversy at the feet of the media.

Wardell lays this right at the feet of the media. How writers such as you Hruska have been promoting multicore CPU’s for gamers while knowing full well that more than one core is never used for games with DX11.

DX12 changes all that but the media doesn’t want to come out and actually admit that they have been giving really bad advice to the consumer and wasting their money.

Let’s see if you have the guts to provide relevant DX12 benchmarks and show Intel and nVidia in a bad light.

Joel Hruska

(),

I am working with the Star Swarm developers to secure testable software in the coming months. Right now, their code is not in a state to be adopted as a standard GPU benchmark. Early coverage of Star Swarm and Ashes of the Singularity has explicitly stated that the game engine, underlying drivers, and Windows 10 itself are not yet in a state to be benchmarked for authoritative results.

As I have previously stated: We will begin profiling DX12 and Windows 10 when games, applications, and the operating system reach RTM / 1.0 status.

“How writers such as you Hruska have been promoting multicore CPU’s for gamers while knowing full well that more than one core is never used for games with DX11.”

You do not appear to understand the difference between DirectX multi-threading, specifically, and multi-threading in general.

I picked these two examples as representative. There are many others. Adding a quad-core chip does not offer *great* game scaling, it’s true — a GPU is typically a better upgrade in terms of raw performance — but in many modern titles an additional pair of cores improves performance by 15-30%.

( )

“You do not appear to understand the difference between DirectX multi-threading, specifically, and multi-threading in general.”

Don’t deflect.

CEO Wardell is correct. Gaming does not use multicore CPUs. So I see your attempt at deflection and raise you with a direct quote:

“But everyone’s really iffy about that, because that means acknowledging that for the past several years, only one of your cores was talking to the GPU, and no one wants to go ‘You know by the way, you know that multi-core CPU? It was useless for your games.’ Alright? No one wants to be that guy. People wonder, saying ‘Gosh, doesn’t it seem like PC games have stalled? I wonder why that is?’ Well, the speed of a single core on a computer has not changed in years. It’s been at 3GHz, or 2-something GHz for years, I mean that’s not the only thing that affects the speed, but you get the idea. Now, with DirectX 12, Vulkan, and Mantle, it’s how many cores you’ve got. We’ve got lots of those. Suddenly, you go by 4x, 5x, the performance.”

I think that Wardell is in a much better position to observe the merits of multicore CPUs and gaming. HE is the CEO of a gaming studio; YOU are a writer.

Regarding the 2011 Techspot review: Who cares about DX10 or DX11 scores. Use DX12 and see what you get.

Frankly I am starting to believe that Directx has been a deliberate POX on the PC industry perpetuated by Intel and MS until AMD changed the world with Mantle. All of a sudden Mantle could not be ignored and MS had no choice but to adopt it. Mantle is a great example of the tail wagging the dog.

In fact why don’t you tell us of ANY game that uses ALL CPU cores?

Besides chess and maybe checkers or backgammon, NONE.

And yes Starswarm may have some birthing pains BUT ALL things being equal it is an OUTSTANDING choice to illustrate the comparative differences between DX11, DX12 and Mantle as it stands.

Besides ANAND used it quite well:

“It should be noted that while Star Swarm itself is a synthetic benchmark, the underlying Nitrous engine is relevant and is being used in multiple upcoming games. Stardock is using the Nitrous engine for their forthcoming Star Control game, and Oxide is using the engine for their own game, set to be announced at GDC 2015. So although Star Swarm is still a best case scenario, many of its lessons will be applicable to these future games.”

In fact running the simulation produces some very interesting results.

But you failed to remark upon the 3DMark API Overhead Test. That too provides a level playing field for a comparative DX11, Mantle, and DX12 bench test.

Bottom line it doesn’t matter WHAT my experience or understanding is. I turn to writers such as YOU to give me those insights.

So again like Wardell says: MULTICORE CPU’S ARE USELESS WITH DX11!!!!.

And you still haven’t responded as to whether you will bench Fiji with the DX12 3DMark or NOT. That IS a mature and very telling benchmark.

So what’s it going to be?

Joel Hruska

I’m not deflecting. You’re just wrong.

If you truly think multi-cores are useless for gaming, I invite you to use the BIOS of your system and shut all but one off. Report back with your results.

How we test Fiji will depend partly on what AMD recommends and the state of its DX12 drivers and of Windows 10. That’s not a conversation we’ve had yet.

( )

You need to tell Wardell that HE is wrong.

But again all you debate is multicores. So what.

Are you going to use DX12 benchmarks or are you going to waste our time and money with USELESS DX11 BS?

Neither 3DMark API Overhead nor Starswarm needs DX12 drivers. And I am sure that AMD will provide Fiji with the capability to use Mantle er (cough, cough) DX12. That’s a cop out too. Can’t you just be honest and forthright?

YES OR NO! Cut the crap and the equivocation. And in case you don’t know the definition of equivocation: “the use of ambiguous language to conceal the truth or to avoid committing oneself”.

And since when do you limit your testing to what the oem wants you to do?

So are you going to waste our time and money with DX11?

Joel Hruska

You don’t actually read responses. It’s becoming rather tiresome.

Yes, DX11 is mostly single-threaded. Yes, draw calls and other *visual* calls are largely single-threaded. But that does not mean that the *games do not benefit from multi-core CPUs.*

In any given title, you have background texture loads, streaming level information, AI, audio processing, physics processing, server communication, voice or in-game chat, and the background processing required by the operating system itself.

None of these tasks relate to the DX11 engine. *All* of them can benefit from multi-threading.

Saying that games have not gotten faster due to multi-core CPUs is simply incorrect. Sometimes a game stops scaling at two cores, more often at four, occasionally even higher than that. The gains are modest in most cases, but real.

Busybee

Look at dear “()”, he’s been continuously “spamming” his DirectX 12 “propaganda” non-stop.

Joel Hruska

So he has. And he’s wrong. You will not see $100 APUs matching the performance of $1500 systems from any vendor.

A little bit of understanding is a dangerous thing.

But I’ll tell you what I find most amusing: AMD’s own reviewer guides recommend testing AMD GPUs with Intel CPUs. Yet according to our friend of the closed parenthetical loops, this is exactly the wrong way to test them.

( )

I DIDN’T SAY IT WAS WRONG, JUST NOT COMPLETE, ACCURATE OR PRECISE.

And AMD is absolutely right. In order to gain market share they need to show OEM’s just how poorly nVidia dGPU’s will perform with Intel CPU’s running DX12 compared to Radeon.

nVidia enjoys its dGPU market share not on the merits of its performance but rather because Intel systems are all spec’d with nVidia products. Probably as a result of the nVidia-Intel cross-license agreement: Intel gets nVidia patents and nVidia gets a free ride on Intel systems.

In order to break that relationship AMD is comparing their product against nVidia via a LEVEL Playing field: Intel CPUs.

And yes, FX IS an OLD CPU, but anything more than 6 cores does not scale well; in fact the performance seems to stay flat. HBM should solve that problem.

However it is still relevant to test AMD GPUs with AMD CPUs.
Especially since FX only costs $150 or so.

And why not?

Joel Hruska

“However it is still relevant to test AMD GPUs with AMD CPUs.”

How about “Because AMD prefers we test its GPUs with Intel CPUs.”

Is that enough for you?

( )

“You will not see $100 APUs matching the performance of $1500 systems from any vendor.”

Your reading comprehension is as lacking as your writing. No surprise there, I guess.

That is what I wrote. I wrote it again out of context just in case there are any big words that you might have difficulty with.

I did not write they were equal.

DX12 allows a very inexpensive system to have the performance that a high cost system barely achieves with DX11.
Smarten up.

Joel Hruska

Game performance is not determined solely by draw calls. If it were, then modern systems still couldn’t match an Xbox 360 or PS3, both of which could field 40 million+ draw calls.

Textures, memory bandwidth, and pixel shader requirements all play substantial parts. You are treating this like an absurdist reduction in which all games, everywhere, are dependent on one thing.

It’s not true. It’s never been true. It’ll never be true.

( )

SO WHAT.

Is that the best you got?

You have no argument so you whine about my challenging Hruska?

Why do you care anyway? Go wipe your nose.

And while you’re at it, GROW UP. And fortunately I am not your dear…what a pompous arse.

( )

DX11 can ONLY serialize draw calls from the CPU to the GPU.

“

Kurtis Simpson: Is this due to the benefits of DX12?

Brad Wardell: DX12, Mantle and Vulkan make it really practical, so even under DirectX11 we’re doing a lot of crazy stuff with all the cores but the problem with DirectX 11 is that even with our scheduler, DirectX11 still serializes up a lot of our commands so we lose a lot of benefits. Not all of it but you know, a substantial amount, so we have to turn down a lot of our cool effects. But we’re still able to do thousands of units on-screen at once, we just can’t show them at quite the same glory. On DirectX 12 though they get out of our way entirely and we can have complete control of the GPU.”

“Kurtis Simpson: As of right now it would seem like brute-force with graphic cards, I mean that seems to be the only plausible way right now with 4K.
Brad Wardell: Well I mean it is brute-force but you can’t even feed the graphics cards with DirectX11 or 9 fast enough. I could take a ten year old game or a five year old game even and run them at 4K, but if you want to do a modern production game with that sophistication running at 4K, you can’t do that running on DirectX11….
There just isn’t the bandwidth between the CPU and the GPU because you’re just having one core talk to the graphics card.

Kurtis Simpson: DirectX11 is like a bottleneck that’s overstayed its welcome.
Brad Wardell: Yes and Microsoft’s pushing hard, Windows 10 is free to everyone who has Windows 7…”

What if I told you that everything in this quoted excerpt is true, that I agree with all of it, and that you’ve still misrepresented the use of multi-core CPUs — in fact, that it *illustrates* the problem with your understanding of game threading?

Brad Wardell isn’t wrong, but you’re not understanding the nuance in the issues at hand. The *reason* that games gain advantages from multi-core CPUs isn’t that the DX11 code can be threaded but because multi-cores are used for more than rendering 3D graphics.

No one is arguing against the idea that DX12 will allow game engines to use more CPU cores or add draw calls. No one. And if you’d actually read my own work on DX12, Mantle, and Vulkan, you’d already know that because it’s been a core underpinning of my coverage for more than a year.

You’re so twisted around a narrative in which the media plays willing lapdog to Nvidia or DX11 that you completely ignore the reality of the stories I’ve actually written. It’s baffling.

Joel Hruska

Let me answer you this way: When DX12 and Windows 10 debut, there will be two benchmarks available to the best of my knowledge — 3DMark and possibly Star Swarm.

I will run these tests as a preview of DX12 performance.

I will not base my review recommendations on their performance, whether it is pro or anti-AMD. I *will* treat them as early data that should point in the right direction.

I do not recommend people buy hardware for the performance they may get “someday.” Not with DX12. Not with Mantle. Not with PhysX, or GameWorks, or CUDA, or multi-core CPUs, or for DX11 when that API was new.

I weight my recommendations based on performance *today* because that’s what you’re paying for. Future performance is, at most, a feather in the cap.

( )

HBM was developed to solve a problem: DX12, when scaled to six cores or more, seems to create a GPU bottleneck at current RAM speeds. A wider bus means greater bandwidth and more throughput: no GPU bottleneck. In fact, the API Overhead test run on Anand’s site seems to show this dGPU bottleneck at 18 million draw calls. That would be up from the 2 million draw calls that DX11 allows. In short:

DX11 is obsolete.

Microsoft has announced that DX12 adoption will be the fastest seen in a decade. AMD’s “mistaken” announcement of Windows 10 release was to give consumers a heads up regarding hardware choices.

Furthermore, MS will be adopting DX12 for Xbox. In short, this Christmas will see the release of many DX12 gaming titles.

I am not an industry insider so I really do not know exactly WHAT the DX12 MINSPEC might be compared to DX11, or more importantly just how that MINSPEC would impact current and future consumer hardware choices. This is a very good subject for an in-depth piece of good journalism.

After considerable reading and reflection I have come to this conclusion:

Porting a CURRENT game designed and CODED to DX11 MAX SPEC to DX12 does not mean that it will automatically look better or play better if you do not consider faster fps as the main criteria for quality game play. In fact DX11 Game benchmarks will not show ANY increase in performance using Mantle or DX12. They just default to a substandard API. If UGLY is programmed in with DX11 then DX12 will still give you ugly, just at higher frame rates.

And logically, continuing to write to this DX11 MAXSPEC will NOT improve gaming community-wide in general. Let’s be clear, a higher spec game will cost more money. So the studio must balance cost and projected sales. So I would expect that incremental increases in game quality may occur over the next few years as studios become more confident with spending more of the gaming budget on a higher MINSPEC DX12 game. Hey, it is ALL ABOUT THE MONEY.

If a game was written with the limitations or, better, say the maximums or MAXSPEC of DX11 then that game will in all likelihood not look any better with DX12. You will run it at faster frame rates but if the polygons, texture details and AI objects aren’t there then the game will only be as detailed as the original programming intent will allow.

However, what DX12 will give you is a game that is highly playable with much less expensive hardware.

Switch to DX12 and it is revealed that a single $100 AMD A6-7400 APU can produce 4,400,000 draw calls and get 30 fps. Of course these aren’t rendered, but you can’t render an object if it hasn’t been drawn.

If you are happy with the level of performance that $1500 will get you with DX11 then you should be ecstatic to get very close to the same level of play that DX12 and a $100 A6 AMD APU will get you!!!!

That was the whole point behind Mantle, er (cough, cough) DX12. Gaming is opened up to more folks without massive amounts of surplus CASH. Incidentally, this is why studios who adopt nVidia proprietary libraries are only slitting their own throats. They lose more users.

Right NOW, games are written to the maximums of DX11: about 5,000 AI objects and about 10,000+ rendered draw calls per second. Mantle and DX12 can allow your GPU to process 100,000 AI objects and 600,000 draw calls. (An AI object would be something like a missile or programmed character that tracks or engages you or other AI objects during play.)

If the studios do not write well beyond the greatly expanded limitation imposed by DX11 then a game will not look any better with DX12 than it does now. What will happen is more FPS but no more detail.

BUT again you will be able to enjoy that game with much cheaper silicon.

And that is what the industry is really worried about. How much collateral damage will a higher performing API do to existing discrete GPU sales? There will always be the enthusiast who needs to have the absolute bleeding edge, but then again there are those who will be just as happy to make do with quite a bit less. And those who have no choice but will be happy to discover that their little $100 AMD APU gives them a great game!

With DX12 more folks will be able to enjoy good gaming with moderately priced silicon and just maybe this will stimulate PC sales as finally there will be a performance reason to upgrade.
