You can thank nVidia for that. Had they actually adopted DX9 properly, and DX10, all the needed software would be part of the OS now. But because they did their own thing, we the consumers got screwed.

I don't know why you even care if it uses software. All computing does... PCs are useless without software.

No, it's on the same silicon, man; there's no latency (or very little) in CPU-GPU communication.

It does have benefits.

But you add it back with longer traces to memory. The benefits are mostly matters of convenience, marketing and packaging, not performance benefits noticeable to the end user. It makes sense from a business standpoint and may eventually lead to performance gains. I'm not arguing that. What I am arguing is that what currently uses these APUs is not hardware based, as in transparent to the OS. It is software based, just like CUDA and Stream. To use the APUs, the program must be specifically written to take advantage of them. Nothing changes that fact.

CUDA says you have no choice. The whole point of DX10 was to provide OPEN access to features such as what CUDA offers, and nV said, quite literally, that Microsoft developed APIs, so knew nothing about hardware design, and that their API (DX) wasn't the right approach. DX10.1 is the perfect example of this behavior continuing.

DirectX is largely broken because of CUDA. Should I mention the whole Batman antialiasing mumbo-jumbo?

I mean, I understand the business side, and CUDA has potentially saved nV's butt.

But its existence as a closed platform does more harm than good.

Thankfully, AMD will have their GPUs in their CPUs, which, in hardware, will provide a lot more functionality than nV can ever bring to the table.

It is not broken because of CUDA. 10.1 didn't add what CUDA added. And CUDA certainly didn't affect DX9. Granted, 10.1 is what 10 should have been, mostly due to nV, but it had nothing to do with CUDA.

More anti-CUDA bs with nothing to back it.

10.1 didn't add that stuff because Nvidia wasn't ready for the features that later became DX11.

Tessellation and compute features. (Ati had a tessellation unit ready a long time ago.)

Computing is evolving from "central processing" on the CPU to "co-processing" on the CPU and GPU. To enable this new computing paradigm, NVIDIA invented the CUDA parallel computing architecture that is now shipping in GeForce, ION, Quadro, and Tesla GPUs, representing a significant installed base for application developers.

The bolded part is the BS, simply because it's DirectX and Windows that enable such functionality, not CUDA. In fact, it's like they are saying they invented GPGPU.

In that regard, it's impossible for me to be "anti-CUDA". It's wrapping GPGPU functions into that specific term that's the issue.

DX10, DX10.1, or whatever the DX after DX9 was going to be, never had compute. Compute came to DX thanks to other APIs that came first, like Stream and CUDA, because those created demand. And it certainly was not Nvidia that prevented compute features from being added to DirectX. It would have been a COMPLETE win for Nvidia if DX10 had included them, for instance. Nvidia was ready for compute back then with G80, with a 6-month lead over Ati's chip, which was clearly inferior. Cayman can barely outclass Nvidia's 5-year-old G80 chip on compute-oriented features, let alone previous cards. HD2000/3000 and even 4000 were simply no match for G80 for compute tasks.

As for tessellation, it was not included because it didn't make sense to include it at all, not because Nvidia was not ready. ANYTHING besides a current high-end card is brought to its knees when tessellation is enabled, so tessellation in HD4000, and worse yet HD2000/3000, was a waste of time that no developer really wanted, because it was futile. If they had wanted it, no one would have stopped them from implementing it in games. They don't even use it on the Xbox, which is a closed platform where it is much easier to implement without worrying about breaking things for non-supporting cards.

Besides, a tessellator (especially the one that Ati used before the DX11 implementation) is the simplest thing you can put on a circuit: it's just an interpolator, and Nvidia already toyed with the idea of interpolated meshes with the FX series. It even had some dedicated hardware for it, like a very archaic tessellator. Remember how that went? Ati also created something similar, much more advanced (yet nowhere near DX11 tessellation), and it was also scrapped by game developers because it was not viable.

What are you talking about, man? CUDA has nothing to do with DirectX. They are two very different APIs that have hardware (ISA) correlation on the GPU and are exposed via the GPU drivers. DirectX and Windows have nothing to do with that. BTW, considering what you think about it, how do you explain CUDA (GPGPU) on Linux and Apple OSes?

Um, yeah.

G80 launched November 2006.

R520, which featured CTM and compute support (and as such even supported F@H on GPU long before nVidia did), launched a year earlier, when nVidia had no such option due to a lack of "double precision", which was the integral feature that G80 brought to the market for nV. This "delay" is EXACTLY what delayed DirectCompute.

That GPGPU implementation was not really Ati's work, but Stanford University's. That was nothing but BrookGPU, and it used DirectX instead of accessing the ISA directly like now. Of course Ati collaborated in the development of drivers so they deserve the credit of .

That has nothing to do with our discussion though. Ati being first means nothing as to the situation now and over the past 5 years. Ati was bought and disappeared a long time ago, and in the process the project was abandoned. AMD* was simply not ready to let GPGPU interfere with their need to sell high-end CPUs (neither is Intel), and that's why they have never really pushed for GPGPU programs until now. Until Fusion, so that they can continue selling high-end CPUs AND high-end GPUs. There's nothing honorable about this Fusion thing.

* I want to be clear about a fact that apparently not many see. Ati != AMD and never has been. I never said anything about what Ati pursued, achieved or made before it was bought. It's after the acquisition that the GPGPU push was completely abandoned.

BTW your last sentence holds no water. So DirectCompute was not included in DX10 because Nvidia released a DX10 card 7 months earlier than AMD, one which also happens to be compute ready (and can be used even with today's GPGPU programs)? Makes no sense, dude. Realistically only AMD could have halted DirectCompute, but the reality is that they didn't, because DirectCompute never existed, nor was it planned, until other APIs appeared and showed that the supremacy of DirectX and of Windows as a gaming platform was in danger.

OK, if you wanna take that tack, I'll agree.

I said, very simply, that nVidia's delayed implementations ("CUDA" hardware support) and the supporting software have greatly affected the transparency of "stream"-based computing in the end-user space.

CUDA has EVERYTHING to do with DirectX, as it replaces it, rather than works with it. Because the actual uses are very limited, there's no reason for a closed API such as CUDA, except to make money. And that's fine, that's business, but it does hurt the consumer in the end.

Wrong. All of nVidia's DX10 cards are capable of computing. nVidia did not hold back DX11 development. They did hold back some features in 10, but those were added back for 10.1. None of those features were GPGPU. The compute features of DX11 were developed BECAUSE of the demand for compute functions like CUDA.

The early implementation of ATI's tessellation engine is completely different from the current implementation. Their earlier version was proprietary. Exactly the same concept as CUDA vs DX compute. And guess what, that proprietary innovation led to an open standard. Also just like CUDA.

As per usual in this forum, there is a lot of CUDA/nV hate, with no real substance to back it up.

Says it all.

The "software" needed is already there (there are actually very limited purposes for "GPU"-based computing), and has been for a long time. Hardware functionality is here, with APUs.

Wrong. See above. It creates a market that open standards eventually capitalize on. Again, your disdain for CUDA is still completely unfounded.

Without CUDA, GPGPU would have died. Plain and simple. After the only other company interested in GPGPU was bought by a CPU manufacturer, only CUDA remained and only Nvidia pushed for GPGPU. And please don't say AMD has also pushed for it, because that's simply not true. Ati pushed it in 2006, and it's true that AMD has been pushing a little bit, but only since 2009 or so, when it became obvious they would be left behind if they didn't. They always talked about supporting it but never actually released any software or put money into it. That is, until now, when they have released Fusion and thanks to that can continue milking us customers by making us buy high-end CPUs and high-end GPUs, when a mainstream CPU and a high-end GPU would do just as well.

The idea of APUs for laptops and HTPCs is great, but for HPC or enthusiast use it makes no sense, and I don't know why so many people are content with it. Why do I need 400 SPs on a CPU, which are not enough for modern games, just to run GPGPU code on them, when I can have 3000 on a GPU and use as many as I want? Also, when a new game is released and needs 800 SPs, oh well, I need a new CPU, not because I need a better CPU but because I need the integrated GPU to have 800 SPs. RIDICULOUS. And of course I would still need the 6000 SP GPU for the game to run.

It's also false that GPGPU runs better on an APU because it's close to the CPU. It varies with the task. Many tasks run much, much better on a dedicated GPU, thanks to its high bandwidth and its numerous, fast local caches and registers.
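To put rough numbers on "it varies with the task", here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function names are made up, and the bandwidth and throughput figures are invented round numbers, not measurements of any real APU or card. It only models total time as "copy the data once, then compute".

```python
def offload_time(data_gb, work_gflop, link_gb_s, compute_gflop_s):
    """Seconds to copy data_gb over the link, then do work_gflop of compute."""
    transfer = data_gb / link_gb_s      # 0.0 when link_gb_s is infinite
    compute = work_gflop / compute_gflop_s
    return transfer + compute

# Hypothetical devices (assumed numbers, for illustration only).
# APU: shares memory with the CPU, so the copy is modelled as free.
APU = {"link_gb_s": float("inf"), "compute_gflop_s": 400.0}
# Dedicated GPU: data crosses a bus first, but raw throughput is far higher.
DGPU = {"link_gb_s": 8.0, "compute_gflop_s": 2000.0}

def faster_device(data_gb, work_gflop):
    """Return 'apu' or 'dgpu', whichever finishes first in this toy model."""
    t_apu = offload_time(data_gb, work_gflop, **APU)
    t_dgpu = offload_time(data_gb, work_gflop, **DGPU)
    return "apu" if t_apu < t_dgpu else "dgpu"

print(faster_device(1.0, 10.0))    # light work per byte: the copy dominates
print(faster_device(1.0, 2000.0))  # heavy work per byte: throughput dominates
```

With these made-up figures, a job that does little work per byte favours the APU, and a job that does a lot of work on the same data favours the dedicated card. Where the crossover sits on real hardware depends on numbers this sketch does not know.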

You're missing the point. I'll tend to agree that nVidia, with CUDA, has kept GPGPU going, but like I said earlier... its actual uses are so few and far between, it's almost stupid. It doesn't offer anything to the end user, really.

You don't really follow the news much, do you? There are hundreds of uses for GPGPU.

Like why haven't they just sold the software to Microsoft already?

Because Microsoft never buys something they can copy. Hello DirectCompute.

And I'm not saying they copied CUDA, btw (although it's very similar), but the concept. And CUDA is in fact the evolution of Brook/Brook++/BrookGPU, made by the same people who made Brook at Stanford and who actually invented the stream processor concept. Nvidia didn't invent GPGPU, but many of the people who did work for Nvidia now, e.g. Bill Dally.

Why don't they make it work on ATI GPUs too?

Because AMD doesn't want it and they can't do it without permission. And AMD never wanted it, tbh, because it would have exposed their inferiority on that front. Nvidia already offered CUDA and PhysX to AMD for free in 2007, but AMD refused.

Also there's OpenCL which is the same thing and something both AMD and Nvidia are supporting so...

I mean really...uses are so few, what's the point?

Uses are few, there's no point, yet AMD is promoting the same concept as the future. A hint: uses are not few. Until now you haven't seen many because:

1- Intel and AMD have been trying hard to delay GPGPU.
2- It takes time to implement things, e.g. how long did it take developers to implement SSE? And the complexity of SSE in comparison to GPGPU is like...
3- You don't read a lot. There are hundreds of implementations in the scientific arena.

http://blogs.msdn.com/b/ptaylor/archive/2007/03/03/optimized-for-vista-does-not-mean-dx10.aspx
Given the state of the NV drivers for the G80 and that ATI hasn’t released their hw yet; it’s hard to see how this is really a bad plan. We really want to see final ATI hw and production quality NV and ATI drivers before we ship our DX10 support. Early tests on ATI hw show their geometry shader unit is much more performant than the GS unit on the NV hw. That could influence our feature plan.

That's one use, to me, and not one that I personally get any use out of. You're falsely inflating the possibilities.

That is not one use. Scientists use GPGPU for physics simulations, processing and comparison of image data (medical, satellite, military), artificial/distributed intelligence, data reorganization, stock market flow control and many, many others. That is not one use.

As a home user, there's 3D browser acceleration, encoding acceleration, and game physics. Is there more than that for a HOME user? Because that's what I am, right, so that's all I care about.

May sound a bit fanboyish here, but I'm just sharing my experience; take it as you will.

Geeks3d and the other tech blogs semi-frequently post comparisons of cards on new benchmarks or compute programs; you can find results there.

It's been a while since I've read up, though, so I can't point you in a specific direction, only that it's not so much a case of hardware vs hardware.

Feature set != performance.

There are many apps where AMD cards are faster. This is obvious: highly parallel applications that require very little CPU-like behavior will always run better on a highly parallelized architecture. That's not to say that Cayman has many GPGPU-oriented hardware features that G80 didn't have 5 years ago.

And regarding that advantage, AMD is stepping away from that architecture in the future, right? They are embracing a scalar design. So which architecture was essentially right in 2006? VLIW or scalar? It really is that simple: if moving into the future for AMD means going scalar, there really are very few questions left unanswered. When AMD's design is almost a copy* of Kepler and Maxwell, which were announced a year ago, there are very few questions about what the correct direction is. And then it just becomes obvious who followed that path before...

Well, you said "it does not offer anything to the end user". That's not the same as saying that it does not offer anything to you. It offers a lot to me. Of course that's subjective, but even for the arguably few apps where it works, I feel it helps a lot. Kinda unrelated or not, but I usually hear how useless it is because "it only boosts video encoding by 50-100%". Lol, you need a completely new $1000 CPU + supporting MB to achieve the same improvement, but never mind.
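For a sense of scale on that 50-100% figure, here is a quick bit of arithmetic, assuming "boost" means encoding throughput goes up by that percentage. The 60-minute baseline and the function name are made up for the example.

```python
def encode_minutes(baseline_minutes, boost_percent):
    """Time for the same job when throughput rises by boost_percent."""
    return baseline_minutes / (1.0 + boost_percent / 100.0)

# Hypothetical 60-minute encode, at the low and high end of the quoted range.
for boost in (50, 100):
    print(boost, encode_minutes(60, boost))
```

Under that assumption, an hour-long encode drops to somewhere between 40 and 30 minutes.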

Still not sure how scalar has a performance advantage, tbh. At a glance it should be weaker.
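One way to look at that question is slot utilization rather than peak numbers. The sketch below is a toy model under stated assumptions, not a description of any real chip: a VLIW unit is assumed to issue up to `width` independent operations per cycle, and the input is simply how many independent operations the compiler can find at each step. The function name and the example workloads are made up.

```python
from math import ceil

def vliw_utilization(independent_ops_per_step, width=5):
    """Fraction of VLIW issue slots that do useful work in this toy model."""
    cycles = sum(ceil(n / width) for n in independent_ops_per_step)
    useful_ops = sum(independent_ops_per_step)
    return useful_ops / (cycles * width)

# Graphics-style vec4 math: four independent ops available at each step.
print(vliw_utilization([4, 4, 4]))
# Compute-style dependent chain: only one op can issue at each step.
print(vliw_utilization([1, 1, 1, 1]))
```

In this model the vec4 case keeps 4 of 5 slots busy, while the dependent chain keeps only 1 of 5 busy. A scalar design issues one operation per lane per cycle, so as long as there are enough threads to fill the lanes it does not depend on finding independent operations inside a single thread. That is the usual argument for why a lower paper peak can still hold up on compute code; how it plays out on actual hardware is a separate question.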