AMD: Navi Speculation, Rumours and Discussion [2019]

RegularNewcomer

No mention of Navi at all at CES. With both Epyc 2 and Ryzen 3 being mentioned as mid 2019, not mentioning Navi at all suggests Navi is (much?) later on the timeline. I still subscribe to my personal theory that Navi was originally a GlobalFoundries-destined part, and the cancellation of 7nm there meant delaying Navi while the design moved to TSMC.

Legend

No mention of Navi at all at CES. With both Epyc 2 and Ryzen 3 being mentioned as mid 2019, not mentioning Navi at all suggests Navi is (much?) later on the timeline. I still subscribe to my personal theory that Navi was originally a GlobalFoundries-destined part, and the cancellation of 7nm there meant delaying Navi while the design moved to TSMC.

Click to expand...

Not true at all. Su said there will be next-gen graphics this year, and at the very end she mentioned Navi by name.

RegularNewcomer

Not true at all. Su said there will be next-gen graphics this year, and at the very end she mentioned Navi by name.

Click to expand...

I don't remember hearing her say anything about Navi, but "coming this year" is the same thing they've been saying this whole time, which probably means a Q4 release based on all the indirect hints. If it were coming earlier, they would probably have shown more of it. If it were coming mid-year, they probably would have said mid-year.

If it does come out in Q4, it is super late for a Polaris replacement. I don't see any real technical reason why they would need to wait until then; they came out with Polaris much faster after the introduction of 14nm. Vega being HBM-focused and not using GDDR makes it a terrible product for markets that don't need 1 TB/s of memory bandwidth.

Legend

I don't remember hearing her say anything about Navi, but "coming this year" is the same thing they've been saying this whole time, which probably means a Q4 release based on all the indirect hints. If it were coming earlier, they would probably have shown more of it. If it were coming mid-year, they probably would have said mid-year.

If it does come out in Q4, it is super late for a Polaris replacement. I don't see any real technical reason why they would need to wait until then; they came out with Polaris much faster after the introduction of 14nm. Vega being HBM-focused and not using GDDR makes it a terrible product for markets that don't need 1 TB/s of memory bandwidth.

Click to expand...

Or they're keeping it under tight wraps and will release it sometime in Q1, Q2 or Q3. What indirect hints are you referring to, anyway? Why would they have shown more of it when they had a new product to launch regardless of when Navi is coming? That would be just stupid: "Hey look, we got this fancy-pants new gfx card here, but never mind that, we got next-gen super fancy pants coming next month"?

Newcomer

There are some things you should know. I haven’t been a product manager for a little while and never for GPUs but certain principles apply.

A) margin is everything. If you don't have a lot of margin you have limited room to move, and any fluctuation in the market causes severe damage to low-margin products. So if you're going to spool up all your manufacturing for a low-cost, high-volume product, you have to sell all of it, because you've committed your manufacturing to it. If you hit a recession and no one buys your products, you're dead. Which is why high volume, low cost tends to coincide with iGPUs now: they work in multiple formats and devices.

B) the other way to do margin is to take as much advantage of binning as possible: same product, different performance levels. This is done today, and there are many ways GPU pricing strategies can play out, but leading with a low-cost, high-volume product is pretty backwards. It's much easier to optimize your inventory with higher-end products and bin downwards.

C) know your volumes and market size. You can't sell more than what the market is willing to accept, and you have to be realistic about your expected penetration into the market. Once again, this is why low cost, high volume doesn't work so hot: saturate the market with cheap stuff and you didn't profit much, and that's the best-case scenario.

Profit = revenue - expenses. How you leverage technology is not in that equation, so you’ll need to explain that concept further.
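As a toy illustration of point A (all numbers invented for the example), committing manufacturing to a low-margin, high-volume part makes even a modest demand dip catastrophic, while a high-margin part with the same full-sell-through profit absorbs it:

```python
# Toy model (all numbers hypothetical): profit = revenue - expenses,
# where manufacturing cost is committed up front regardless of units sold.
def profit(units_sold, price, units_committed, unit_cost):
    return units_sold * price - units_committed * unit_cost

# Low-margin, high-volume part: 10% margin at full sell-through.
full_lo = profit(1_000_000, 110, 1_000_000, 100)   # 10,000,000
dip_lo  = profit(800_000, 110, 1_000_000, 100)     # -12,000,000 after a 20% dip

# High-margin, lower-volume part: 50% margin, same profit at full sell-through.
full_hi = profit(200_000, 150, 200_000, 100)       # 10,000,000
dip_hi  = profit(160_000, 150, 200_000, 100)       # 4,000,000 after the same dip
```

The same 20% shortfall swings the low-margin product deep into the red while the high-margin one stays comfortably profitable.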

Click to expand...

I do believe that how AMD has leveraged the MI50 into the Vega Seven does matter.

And the overall cost of 1.8 GHz HBM2 (versus older HBM2, or GDDR6) doesn't matter as much as you are suggesting based solely on the Radeon VII's margins. Using what AMD already knows and leveraging cut-down MI50s into Radeon VIIs adds to the value of the card.

RegularNewcomer

Or they're keeping it under tight wraps and will release it sometime in Q1, Q2 or Q3. What indirect hints are you referring to, anyway? Why would they have shown more of it when they had a new product to launch regardless of when Navi is coming? That would be just stupid: "Hey look, we got this fancy-pants new gfx card here, but never mind that, we got next-gen super fancy pants coming next month"?

Click to expand...

AMD would have promoted the VII purely as a pro GPU if they were going to release anything Navi any time soon. I doubt they gain much from selling a GPU so mismatched in features for gaming, at a price that is barely considered reasonable. If they had Navi even close to being ready, they could have easily gone with: "Here is Radeon VII, a prosumer 7nm frontier card; it can game, but its true value is as a card for a variety of prosumer applications, like the original Vega Frontier Edition. For gamers, we will launch a card specifically for you very soon in the form of Navi; it will be our next generation of gaming GPUs."

AMD has not talked about Navi milestones at all the way they have with previous launches, such as tape-out. There has been no hype for it coming from AMD like there was with Polaris or Vega. The only real solid indication Navi is even on the way is from Linux drivers. I don't think there has been any reliable data pointing to where Navi will even fit in terms of performance, with most people only able to speculate that it is a midrange GPU. Every "leak" has been pretty much BS. We still know pretty much the same as we did two years ago when it first appeared on the roadmap. This does not look like an AMD GPU waiting to imminently launch.

LegendVeteranSubscriber

If they had Navi even close to being ready, they could have easily gone with: "Here is Radeon VII, a prosumer 7nm frontier card; it can game, but its true value is as a card for a variety of prosumer applications, like the original Vega Frontier Edition. For gamers, we will launch a card specifically for you very soon in the form of Navi; it will be our next generation of gaming GPUs."

Click to expand...

I agree they should have presented Radeon VII as a prosumer card from the get go, at the very least to increase its value perception.

Them not saying anything about Navi, if it's not coming in a couple of months, is just good strategy: don't Osborne your current lineup and don't create overhype.
Raja's method of yelling about the cards some 8 months before release was terrible, IMO. Let's hope he doesn't bring that vice to Intel.

Newcomer

There haven't been any Linux patches for Navi. AMD usually does that at least three months before release, sometimes closer to six months before. They had some Navi/GFX10/GFX1000 stuff in their Windows drivers by the end of last year, and now they have removed all of it. Yes, Lisa mentioned the name very late in the CES keynote, but that is all.

Yes, maybe it is just a delay caused by moving from GloFo to TSMC, but maybe it's a little bit more. I guess they had to re-implement everything because of different design rules, and maybe they took the time and opportunity to adjust some things, so they may have started almost from zero again last year.

LegendVeteranSubscriber

There haven't been any Linux patches for Navi. AMD usually does that at least three months before release, sometimes closer to six months before. They had some Navi/GFX10/GFX1000 stuff in their Windows drivers by the end of last year, and now they have removed all of it. Yes, Lisa mentioned the name very late in the CES keynote, but that is all.

Yes, maybe it is just a delay caused by moving from GloFo to TSMC, but maybe it's a little bit more. I guess they had to re-implement everything because of different design rules, and maybe they took the time and opportunity to adjust some things, so they may have started almost from zero again last year.

Click to expand...

Perhaps removing them was just a way to increase secrecy?

One thing is certain: the days of Raja's RTG bragging about a GPU 6+ months before release are over.
The RX 590 had little to no pre-launch hype. Radeon VII was first mentioned one month before going on sale.

That's really positive IMO.
I reckon Raja was just trying to maintain brand awareness when confronted with a drought of graphics card releases, but the end result was just annoying.

ModeratorLegendVeteran

Do we believe the "Super SIMD" / "VLIW2" patent is applicable to Navi? It doesn't feel like a huge departure from GCN to me, so it feels very plausible to me that it'd be considered for Navi (whether it makes it into the final design is another question; patents don't always result in end-products).

One thing I'm still confused about with GCN and my Google-fu is failing me (asking console devs on Twitter might be the easiest route but hopefully someone here knows as well): transcendental/special-function is 1/4 rate on GCN, but do they stall the entire pipeline for 4 cycles, or can FMAs be issued in parallel for some of these cycles?

Everything I've found implies that they stall the pipeline for 4 cycles, which is pretty bad (speaking from experience for mobile workloads *sigh* maybe not as bad on PC-level workloads) and compares pretty badly with NVIDIA which on Volta/Turing is able to co-issue SFU instructions 100% for free and they don't stall the pipeline unless they're the overall bottleneck (as they've got spare decoder and spare register bandwidth, and they deschedule the warp until the result is ready; obviously they can't co-issue FP+INT+SFU, but FP+SFU and INT+SFU are fine).
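To put a rough number on what stalling would cost, here's a toy throughput model (my own back-of-envelope, not measured hardware behaviour): assume a fraction f of issued instructions are special-function ops, which either stall the FMA pipeline for 4 cycles (the possible GCN behaviour discussed above) or co-issue for free (the Volta/Turing case):

```python
# Toy model of useful FMA throughput per cycle; f = fraction of
# instructions that are special-function (SFU) ops. All numbers illustrative.

def fma_throughput_stalling(f, sfu_stall_cycles=4):
    # Each FMA takes 1 cycle; each SFU op blocks the pipeline for 4 cycles.
    cycles = (1.0 - f) * 1.0 + f * sfu_stall_cycles
    return (1.0 - f) / cycles   # useful FMAs issued per cycle

def fma_throughput_coissue(f):
    # SFU ops dual-issue alongside FMAs, so they never steal FMA slots.
    return 1.0 - f

# With 10% SFU instructions: stalling gives 0.9 / 1.3 ~= 0.69 FMAs/cycle,
# co-issue gives 0.9 -- roughly a 23% relative loss from stalling alone.
```

Even a modest SFU mix hurts noticeably if the pipeline really does stall, which is why the answer to the 4-cycle question matters.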

It feels to me like at this point, 1 NVIDIA "CUDA core" is actually quite a bit more "effective flops" than an AMD ALU. It's not just the SFU but also interpolation, cubemap instructions, etc... We can examine other parts of the architecture in a lot of detail as much as we want, but I suspect the lower effective ALU throughput is probably a significant part of the performance difference at this point... unlike the Kepler days when NVIDIA was a lot less efficient per claimed flop than they are today.

The "Super SIMD" patent is an interesting opportunity to reverse that for AMD, especially if the "extra FMA" can run in parallel to SFU and interpolation instructions and so on... I really hope it gets implemented in Navi and the desktop GPU market gets a little bit more exciting again!

EDIT: Also this would allow a "64 CU" chip to have 2x as many flops/clock as today's Vega 10 without having to scale the rest of the architecture (for better or worse). It feels like 8192 ALUs with 256-bit GDDR6 and better memory compression could be a very impressive mainstream GPU.
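For that hypothetical configuration, the back-of-envelope arithmetic looks like this (the clock and memory speeds are my guesses, not anything announced):

```python
# Hypothetical config from the post: 8192 FP32 ALUs, 2 flops per FMA,
# at a guessed 1.8 GHz clock.
alus, flops_per_fma, clock_ghz = 8192, 2, 1.8
tflops = alus * flops_per_fma * clock_ghz / 1000   # ~29.5 TFLOPS FP32

# 256-bit GDDR6 bus at a guessed 14 Gbps per pin:
bandwidth_gb_s = 256 * 14 / 8                      # 448 GB/s
```

That's roughly double Vega 10's paper flops on less than half the memory bandwidth, hence the reliance on better compression.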

ModeratorLegendVeteran

Don't they just evaluate transcendentals using NR (Newton-Raphson) or something similar? Then it's one cycle to look up the initial estimate and three iterations to refine the result.

Click to expand...

Hmm, I didn't think that was the case, especially refining the result using the main-pipeline FMAs, but maybe you're right? It's not possible to tell from the ISA, obviously, as GCN is very much a CISC architecture...

Veteran

The silicon you don't spend on special purpose transcendental hardware can be spent on more FMAs and probably improve general performance.

I would expect the 1/4 throughput is down to APIs exposing transcendental functions with precision guaranteed to some degree, forcing extra iterations to reach that precision. For SIMD on CPUs, you have different instructions to initialize the starting coefficients, and you decide how many iterations to do to get the needed precision.
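For what it's worth, the lookup-plus-refinement scheme described above is easy to sketch (a hedged illustration of the general technique, not AMD's actual hardware algorithm). For 1/a, each Newton-Raphson step x ← x·(2 − a·x) squares the relative error, so it roughly doubles the number of correct bits; a crude table-lookup seed plus three iterations reaches double precision:

```python
def recip_nr(a, x0, iterations):
    """Newton-Raphson for f(x) = 1/x - a: each step squares the relative
    error, i.e. doubles the number of correct bits in the estimate."""
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - a * x)
    return x

# A crude ~10-bit seed (standing in for a hardware lookup table)
# converges to 1/3 at full double precision after three refinements.
approx = recip_nr(3.0, 0.333, 3)
```

With a ~0.1% initial error, the error after three steps is around 1e-24, well below machine epsilon, which matches the "one lookup plus three iterations" count mentioned above.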

ModeratorLegendVeteran

The silicon you don't spend on special purpose transcendental hardware can be spent on more FMAs and probably improve general performance.

I would expect the 1/4 throughput is down to APIs exposing transcendental functions with precision guaranteed to some degree, forcing extra iterations to reach that precision. For SIMD on CPUs, you have different instructions to initialize the starting coefficients, and you decide how many iterations to do to get the needed precision.

Cheers

Click to expand...

Sure, but the cost of more CUs isn't just the FMA unit itself; for a given level of performance, it may or may not be cheaper to make transcendentals cheaper rather than adding more CUs. It may also be more power-efficient; i.e. AMD's approach may (or may not) be more area-efficient but less power-efficient (see: dark silicon). Many GPU architectures have fully co-issued special-function units, including Pascal/Volta/Turing (even past architectures from AMD; e.g. Xenos isn't really Vec5, it's Vec4 FMA + scalar special function).

Everything's a trade-off and clearly AMD went strongly in the direction of doing more on the general-function FMA units compared to their VLIW4/VLIW5 architectures and compared to NVIDIA. It's not obvious to me whether that has actually paid off for them...

Also I'm still not 100% sure whether GCN special function ops really stall the pipeline for 4 cycles or only 1 cycle.
