Newcomer

If you expected ray tracing to be free of charge, you need to have your expectations checked.

Apologies if I sounded confrontational; I didn't mean to. There is nothing prejudiced in what I said, though, it's just basic counter-arguments.


I think it was more because you sound exactly like Nvidia and, in your rebuttals, are toeing their marketing line almost verbatim.

In reality though (no matter what you or Nvidia thinks), gamers do NOT care about reflections in windows or water... they want better gameplay, more screen size and more fps. Nobody I know (or have even heard of) is demanding realism or perfect shadows in their games. So John Q. Public is not going to pay for it, no matter how much marketing Nvidia throws at it.

Turing itself is NOT a GPU meant for gaming. We all know this, and Nvidia has taken their marketing too far. The ONLY reason to upgrade to RTX is USB-C.

Anyone using G-Sync and needing ultimate frame rates will have no choice but to upgrade to an RTX card. Those who have not yet purchased their (FreeSync 2/G-Sync) monitors are most likely waiting two more months until AMD releases its 7nm Radeons, figuring that, as gamers, the public will probably get a more dedicated die for their money.

VeteranSubscriber

...most likely waiting two more months until AMD releases its 7nm Radeons, figuring that, as gamers, the public will probably get a more dedicated die for their money.


For the longest time, AMD has been promoting Vega 20 as a GPU that's for deep learning and pretty much deep learning only.

Since you are waiting for a very dedicated 7nm die that will show up in 2 months, the only reasonable conclusion is that you think that the new RTX GPUs don’t have enough dedicated deep learning resources.

And that makes total sense in some way, because for a gamer who is clearly not interested in image quality, there’s no reason at all to buy a traditional GPU to begin with: if you put all the settings to “low”, there’s no game that’s not playable at very high frame rates on your 4K FreeSync monitor, and no reason to upgrade at all.

LegendVeteranSubscriber

In reality though (no matter what you or Nvidia thinks), gamers do NOT care about reflections in windows or water... they want better gameplay, more screen size and more fps. Nobody I know (or have even heard of) is demanding realism or perfect shadows in their games. So John Q. Public is not going to pay for it, no matter how much marketing Nvidia throws at it.


Please don't speak for all gamers. I'm personally very excited about what this will bring eventually to gaming, including reflections in windows mirroring truly what is behind me instead of a fake cubemap that's reused everywhere. I care more about those things than 100fps. That doesn't mean I'm buying into RTX but I'm very much looking forward to having true lighting, shadows and reflections in games in the coming years.

Legend

Seems obvious to me why a company would capitalize on a technology advantage that the competition can't match. This debate would make sense if we were talking about some promised future but RT and DLSS are working right now. It's a real, tangible thing that people can actually use. That's a lot more than can be said for other features that never materialized even after several years.

The reason to suggest it's too early - these dies are massive and expensive! Games have to target a viable mainstream tech level, and realtime raytracing is far beyond that mainstream (<$300 GPUs?).


They would likely be massive and expensive even without RT. The larger, faster caches, improved compression and independent INT pipeline aren't free. Turing is clearly more efficient than Pascal so those changes were worthwhile. Of course we don't know how much RT and tensors cost in terms of die size but they certainly don't deserve all the blame for the size and cost of Turing chips.

LegendAlpha

If you restricted denoising to neural networks that would be true, as the tensor cores are faster at computing NNs than the shaders are.
Shaders, on the other hand, are not limited to NN denoising, and there may be equally good algorithms not based on NNs that run as fast or faster on shaders.

Also, using the tensor cores is not free: they require massive amounts of NN weight data, plus NN input and output data, competing with shaders for register and cache bandwidth.

From the Turing white paper it looks even worse, as no shading or RT can happen while the tensor cores are active: View attachment 2663


DLSS is described as a post-processing method, which only happens after all shading is done to give that phase the necessary inputs.
Since it's only one frame, I suspect the usual concurrent start of frame N+1's rendering would be happening in reality.

I haven't been able to absorb the full range of details on the technique, but perhaps that is something that can be constrained by where it sits in the frame at this early-adopter stage.
Some of the artifacting may be worsened by a complex scene's broadly constructed ground truth leading to unwanted interference in the behaviors of certain materials or geometry that sporadically create similar inputs. Perhaps something like a primitive buffer or primitive ID (Nvidia's task and mesh shaders do allow for the possibility) could help focus the learning process or more cleanly represent idealized weights for specific subsets of the world, like grass or finely detailed surfaces. It's not as automatic, but it may give the developer a chance to feed the network more indications of their intent. Perhaps, if this can somehow be applied in different phases, shaders could be constructed to key into certain execution or data patterns that could flag an area as being more amenable to a specific set of weights, despite the pixel output being ambiguous as to what fed into it.

Newcomer

Please don't speak for all gamers. I'm personally very excited about what this will bring eventually to gaming, including reflections in windows mirroring truly what is behind me instead of a fake cubemap that's reused everywhere. I care more about those things than 100fps. That doesn't mean I'm buying into RTX but I'm very much looking forward to having true lighting, shadows and reflections in games in the coming years.


But you yourself claim that real-life shadows & reflections are not a top priority.

Having reflections in a window (or a water puddle) does not have to be perfectly real, or perfectly rendered, for you to see their movement out of the corner of your eye. The point is, Nvidia is selling unnecessary fluff... because shadows & reflections are already in games.

In addition, having "perfect" and ultra-realistic shadows and reflections is just marketing hype. It is something that makes sense only when the games themselves are rendered in that much detail. Subsequently, DLSS brings nothing new to the table: just another proprietary form of AA cheats to make something look good, instead of rendering for utmost quality...

RTX won't look the same at 7nm; Nvidia will listen and make a GPU for gaming, separate from their ML bins. The reviews don't lie: Turing is a powerful chip, yes. But for the money, it is not meant specifically for gaming, and the price premium isn't justified. Naming the top half-dozen games, is there any reason for someone who owns a 1080 Ti to spend $900 for nearly the same performance?

Those who were waiting to see what Nvidia's 20-series would bring might still opt for the RTX 2080 instead of the cheaper 1080 Ti, just to get USB-C and a better board. But if that is not a concern, then expect the 1080 Ti to power all the sub-4K G-Sync displays.

VeteranSubscriber

Having reflections in a window (or a water puddle) does not have to be perfectly real, or perfectly rendered, for you to see their movement out of the corner of your eye. The point is, Nvidia is selling unnecessary fluff... because shadows & reflections are already in games.


The reflection of the flamethrower in the BF5 demo was awesome, and impossible to do with fake techniques.

You call that unnecessary, and that’s fine, because you already admitted to not caring about image quality.

But when people are spending hours on custom graphics improvements for Skyrim or The Witcher 3, claiming that nobody else cares either is just projection.

... just another proprietary form of AA cheats to make something look good, instead of rendering for utmost quality...


And that utmost quality is achieved in movies by rendering at 64x supersampling, which you know full well is impossible to do in real time. So you look for alternatives that are good enough with a lower performance impact, and DLSS is just another way of doing that.
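To put rough numbers on that trade-off, here's a back-of-envelope sketch in Python. The resolutions and the 1440p-to-4K upscale factor are illustrative assumptions, not measured DLSS behavior:

```python
# Back-of-envelope shading cost: 64x supersampling vs. rendering at a
# lower resolution and upscaling (the DLSS-style approach). All figures
# below are illustrative assumptions, not measured numbers.

def shaded_samples(width, height, samples_per_pixel):
    """Total shaded samples for one frame."""
    return width * height * samples_per_pixel

native_4k = shaded_samples(3840, 2160, 1)     # plain 4K, 1 sample per pixel
movie_style = shaded_samples(3840, 2160, 64)  # 64x supersampled 4K
dlss_style = shaded_samples(2560, 1440, 1)    # render 1440p, upscale to 4K

print(movie_style // native_4k)  # 64x the shading work of native 4K
print(native_4k / dlss_style)    # 2.25x fewer samples than native 4K
```

The point of the sketch: movie-style quality is a 64x cost multiplier over what GPUs already struggle to do, which is why everyone looks for cheaper approximations.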

RTX won't look the same at 7nm; Nvidia will listen and make a GPU for gaming, separate from their ML bins.

I expect 7nm gaming GPUs will have more SMs (obviously), more RT cores per SM, and the same or more Tensor cores.

And the reason is very simple: these GPUs are already being designed today, and it’s way too early to declare victory or defeat today about ray tracing becoming entrenched.

Similarly, there are no machine learning accelerators that you can buy today to shove under your desk other than those from Nvidia. And a gaming GPU is perfect to spread the R&D cost for that.

The reviews don't lie: Turing is a powerful chip, yes. But for the money, it is not meant specifically for gaming, and the price premium isn't justified.


In the end, your biggest beef is with the one thing that can be changed at a moment's notice. If Nvidia priced the RTX GPUs 20% cheaper (which they could easily do if forced to), none of your other arguments would be relevant anymore.

Regular

DLSS is described as a post-processing method, which only happens after all shading is done to give that phase the necessary inputs.
Since it's only one frame, I suspect the usual concurrent start of frame N+1's rendering would be happening in reality.

I haven't been able to absorb the full range of details on the technique, but perhaps that is something that can be constrained by where it sits in the frame at this early-adopter stage.
Some of the artifacting may be worsened by a complex scene's broadly constructed ground truth leading to unwanted interference in the behaviors of certain materials or geometry that sporadically create similar inputs. Perhaps something like a primitive buffer or primitive ID (Nvidia's task and mesh shaders do allow for the possibility) could help focus the learning process or more cleanly represent idealized weights for specific subsets of the world, like grass or finely detailed surfaces. It's not as automatic, but it may give the developer a chance to feed the network more indications of their intent. Perhaps, if this can somehow be applied in different phases, shaders could be constructed to key into certain execution or data patterns that could flag an area as being more amenable to a specific set of weights, despite the pixel output being ambiguous as to what fed into it.


From what I can tell, once the tensor cores are fully utilized there is no possibility for the shader cores to be active:
One tensor core needs the equivalent register read/write bandwidth of 16 FP ALUs (or 16 INT ALUs).
A tensor core does a 4x4 matrix multiply/add, AxB+C=D: A and B need 2x16x16-bit reads, C a 16x32-bit read, and the result D a 16x32-bit write.
An SM has 64 FP and 64 INT ALUs, and 8 tensor cores. Doing the math, the 8 tensor cores use an equivalent number of register ports as the 64 FP + 64 INT ALUs. So when the 8 tensor cores are active there is no room to overlap that with shader computations in the SM.
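For what it's worth, the arithmetic behind that conclusion can be checked in a few lines. The per-SM unit counts match the Turing whitepaper; the "16 ALU equivalents per tensor core" figure is the estimate above, taken as given:

```python
# Checking the register-port arithmetic for one Turing SM. Unit counts
# are from the whitepaper; the 16-ALU-equivalents figure per tensor
# core is the poster's bandwidth estimate, assumed here.

FP_ALUS_PER_SM = 64
INT_ALUS_PER_SM = 64
TENSOR_CORES_PER_SM = 8
ALU_EQUIVALENTS_PER_TENSOR_CORE = 16  # register read/write bandwidth

tensor_port_demand = TENSOR_CORES_PER_SM * ALU_EQUIVALENTS_PER_TENSOR_CORE
shader_port_supply = FP_ALUS_PER_SM + INT_ALUS_PER_SM

print(tensor_port_demand, shader_port_supply)  # 128 128: fully saturated
```

Demand exactly equals supply, which is why (under that bandwidth estimate) active tensor cores would leave no register ports for concurrent shading.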

VeteranRegular

But why do the Tensor cores even matter for ray tracing? The answer is that AI and machine learning are becoming increasingly powerful, and quite a few algorithms have been developed and trained on deep learning networks to improve graphics. Nvidia's DLSS (Deep Learning Super Sampling) allows games to render at lower resolutions without AA, and then the Tensor cores can run the trained network to change each frame into a higher resolution anti-aliased image. Denoising can be a similarly potent tool for ray tracing work.
...
Here's where Nvidia's Turing architecture really gets clever. As if the RT cores and enhanced CUDA cores aren't enough, Turing has Tensor cores that can dramatically accelerate machine learning calculations. In FP16 workloads, the RTX 2080 Ti FE's Tensor cores work at 114 TFLOPS, compared to just 14.2 TFLOPS of FP32 on the CUDA cores. That's basically like ten GTX 1080 Ti cards waiting to crunch numbers.

Pixar has been at the forefront of using computer generated graphics to create movies, and its earlier efforts largely relied on hybrid rendering models—more complex models perhaps than what RTX / DXR games are planning to run, but they weren't fully ray traced or path traced. The reason: it simply took too long. This is where denoising comes into play.
...
Denoising allowed Pixar to reportedly achieve an order of magnitude speedup in rendering time. This allowed Pixar to do fully path traced rendering for its latest movies, without requiring potentially years of render farm time, and both Cars 3 and Coco made extensive use of denoising.

If the algorithms are good enough for Pixar's latest movies, what about using them in games? And more importantly, what about using denoising algorithms on just the lighting, shadows, and reflections in a hybrid rendering model?

The RTX 20-series GPUs are the first implementation of ray tracing acceleration in consumer hardware, and future Nvidia GPUs could easily double or quadruple the number of RT cores per SM. With increasing core counts, today's 10 GR/s performance might end up looking incredibly pathetic. But look at where GPUs have come from in the past decade.
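As a toy illustration of why denoising buys so much: a noisy 1-sample-per-pixel style estimate of a smooth lighting signal can be cleaned up with even a plain box filter. Real denoisers are far more sophisticated, and all the numbers here are synthetic:

```python
import numpy as np

# Toy denoising sketch: a smooth "lighting" signal, a noisy low-sample
# estimate of it, and a simple box filter that recovers most of the
# signal. This is an illustration, not Pixar's or Nvidia's pipeline.

rng = np.random.default_rng(0)
truth = np.linspace(0.2, 0.8, 64)[None, :] * np.ones((64, 64))  # smooth lighting
noisy = truth + rng.normal(0.0, 0.3, truth.shape)               # ~1 spp estimate

def box_denoise(img, radius=3):
    """Average each pixel with its (2*radius+1)^2 neighborhood."""
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img)
    k = 2 * radius + 1
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

denoised = box_denoise(noisy)
print(np.abs(noisy - truth).mean())     # large error before filtering
print(np.abs(denoised - truth).mean())  # much smaller error after
```

Even this crude filter cuts the error several-fold on smooth signals; the hard part, which the learned denoisers address, is doing that without blurring away edges and detail.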

Veteran

In reality though, (no matter what you or nvidia thinks), Gamers do NOT care about reflection in windows, or water... they want better game play and more screen size and more fps.. Nobody I know (or even have heard of), is demanding realism or perfect shadows in their games. So John Q Public is not going to pay for it, no matter how much marketing nvidia throws at it.

...
AND Vega is the latest technology on the streets. That is why it requires a premium.

Vega vs Pascal isn't about raw game numbers; it is about building blocks for future games. If RX Vega can run last year's games equal to its competitors, the GTX 1070 & 1080, then it is a far better value to the end user and gamer.

Again, not being attached to any brand and looking logically at the present and the future, no unbiased gamer building or buying a new rig over the upcoming holiday is going to pass up the technology Radeon Vega brings, all because two-year-old Pascal can crunch certain outdated games extremely well.

Latest tech = Premium

Honestly, what would buying a Pascal card today get you over buying a Vega card? And in the future?

Even at equal cost, Vega is a much better choice for the premium gamer, because RX supports the most standards, better compute, and the latest VR tech, and is prepped & ready for future game titles and game engines. AMD has announced many gaming partners and pranced many of them on stage at their events. They always mention their "partners" in tweets, etc. AMD has a lot of collaboration in the gaming industry, and what seems like a lot of development studios are on board and behind them.

I suspect AMD is holding back, and Dr. Su is a gamer herself. I think there are several huuuge announcements yet to be made about RXv64. Anyone remember that cube? And whatever happened to "Infinity Fabric", and why are we locked out of it, or kept in the dark? What is AMD brewing?

I believe there are mystical RX Vega drivers... I have seen them in my dreams. (And on presentation slides)

I am not worried about $50 here or there. I see RXv as the latest tech, and it demands a premium. We have all seen Pascal, Polaris, Fiji and other GPU uarchs get mystical drivers during their lifetimes too. So perhaps it isn't mystical, but logical, to think that refinement will come for RX Vega (fine wine?)

Legend

DLSS is described as a post-processing method, which only happens after all shading is done to give that phase the necessary inputs.
Since it's only one frame, I suspect the usual concurrent start of frame N+1's rendering would be happening in reality.


It's multiple frames, like TAA; it just does some cleanup to remove the afterimages. Here's a quote from the whitepaper with my bolding:

DLSS leverages a deep neural network to extract multidimensional features of the rendered scene and intelligently combine details from multiple frames to construct a high-quality final image.
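For anyone unfamiliar with the multi-frame part, here's a minimal sketch of TAA-style history accumulation. Real TAA and (presumably) DLSS also reproject with motion vectors and reject stale history; this toy version just blends each noisy frame into a running average:

```python
import numpy as np

# TAA-style exponential history accumulation: each new (noisy) frame is
# blended into a running history buffer. Over many frames the noise
# averages out. Purely illustrative; no reprojection or history
# rejection is modeled here.

rng = np.random.default_rng(1)
truth = np.ones((32, 32)) * 0.5  # the "correct" image, held static
history = None
ALPHA = 0.1                      # weight given to the newest frame

for frame in range(60):
    current = truth + rng.normal(0.0, 0.1, truth.shape)  # noisy new frame
    if history is None:
        history = current
    else:
        history = (1 - ALPHA) * history + ALPHA * current

print(np.abs(history - truth).mean())  # far below the per-frame noise level
```

The afterimage problem the post mentions is visible in this model too: if `truth` suddenly changed, the history buffer would keep ghosting the old image for several frames until the blend catches up.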

Legend

I'm not sure of that limitation; we know that FFXV does it at half the resolution, but I'm not sure if anyone has confirmed the Infiltrator or Star Wars resolution, for example. The whitepaper at least makes no mention of having to use exactly half the res.

Regular

I'm not sure of that limitation; we know that FFXV does it at half the resolution, but I'm not sure if anyone has confirmed the Infiltrator or Star Wars resolution, for example. The whitepaper at least makes no mention of having to use exactly half the res.

LegendAlpha

From what I can tell, once the tensor cores are fully utilized there is no possibility for the shader cores to be active:
One tensor core needs the equivalent register read/write bandwidth of 16 FP ALUs (or 16 INT ALUs).
A tensor core does a 4x4 matrix multiply/add, AxB+C=D: A and B need 2x16x16-bit reads, C a 16x32-bit read, and the result D a 16x32-bit write.
An SM has 64 FP and 64 INT ALUs, and 8 tensor cores. Doing the math, the 8 tensor cores use an equivalent number of register ports as the 64 FP + 64 INT ALUs. So when the 8 tensor cores are active there is no room to overlap that with shader computations in the SM.


This seems like a relatively neutral outcome compared to a post-process compute shader heavily utilizing the SM, or to what would happen on prior architectures. It reduces the throughput of other rendering work by either occupying the standard SIMD units or occupying the register ports they would need. The GPU's opportunistic switching to other threads if the tensor/compute shader stalls, and the overall GPU/driver load balancing by controlling how many SMs may run a given invocation, still remain.

It's multiple frames like TAA is, it just does some cleanup to remove the afterimages. Here's quote from the whitepaper with my bolding


That was in reference to the image attached to the post I replied to, which was a simplified marketing diagram about a frame being rendered by Turing. Even if from multiple frames, the post-process phase would be executed at the tail end of the frame that is the last input.

About Us

Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!