Linux OpenCL Performance With The Newest AMD & NVIDIA Drivers

The latest Linux GPU benchmarks at Phoronix for your viewing pleasure are looking at the OpenCL compute performance with the latest AMD and NVIDIA binary blobs while also marking down the performance efficiency and overall system power consumption.

Comment

I can propose a couple of other benchmarks:
1) bfgminer --scrypt --benchmark (https://github.com/luke-jr/bfgminer.git): massively parallel computation with incredibly heavy GPU memory demands. The GPU has to be both good at parallel computation and have fast memory. It can be a bit tricky, in the sense that the best results are obtained only after tuning parameters for the particular GPU.
2) clpeak utility (https://github.com/krrishnarraj/clpeak.git): a GPU VRAM speed benchmark. While it sounds simple, the result depends on both the GPU and the drivers, so it can be quite an interesting thing to compare. This one is also a known good way to crash the Mesa + LLVM OpenCL stack, at least on AMD cards.
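To put a rough number on the "heavy GPU memory demands" of scrypt, here is a minimal sketch. It assumes the Litecoin-style parameters N=1024, r=1, p=1 (an assumption, not something stated in the post) and uses the standard scrypt scratchpad size of 128 * N * r bytes per hash. The helper names and the figure of 8192 hashes in flight are made up for illustration, and a real miner may trade memory for extra computation.

```python
import hashlib

# Assumed Litecoin-style scrypt parameters (not taken from the post above).
N, R, P = 1024, 1, 1


def scrypt_scratchpad_bytes(n: int, r: int) -> int:
    """Size of the scrypt scratchpad (the V array in ROMix): 128 * N * r bytes."""
    return 128 * n * r


per_hash = scrypt_scratchpad_bytes(N, R)


def total_scratchpad_mib(concurrent_hashes: int) -> float:
    """Scratchpad memory needed if every in-flight hash keeps a full V array."""
    return concurrent_hashes * per_hash / 2**20


# One CPU-side scrypt call with the same parameters, for reference.
# The input bytes are a placeholder, not a real block header.
digest = hashlib.scrypt(b"placeholder input", salt=b"placeholder input",
                        n=N, r=R, p=P, dklen=32)

print(per_hash // 1024, "KiB per hash")
print(total_scratchpad_mib(8192), "MiB for 8192 hashes in flight")
print(digest.hex())
```

Every hash reads and writes its own scratchpad many times over, so throughput ends up depending on memory bandwidth as much as on raw ALU speed, which fits the point about per-GPU tuning.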

Comment

Is there a danger that GPU-based computing (OpenCL, CUDA, HSA, etc.) is going to be replaced by FPGAs? Almost certainly not in the consumer space (where these technologies are rare anyway), but perhaps in HPC, which could lead to these technologies, in time, withering on the vine.

I'm just thinking about how quickly GPU mining collapsed based on a market need to go further than what GPUs can do. Would the same pressures apply to typical HPC markets today?

Comment

Yep, given the way that NVIDIA INTENTIONALLY gimps the GPGPU capabilities of its consumer cards, I was incredibly surprised to see 780 Ti performance so close to the R9 290X and exceeding even the more modest ATI cards.

Radeons have pretty much trashed NVIDIA consumer cards since, what, the 600 series?

Bought myself a 780 Ti as a Christmas present to self last year, since when I did my prior desktop build I wussed out and bought a 670 FTW instead of the 680 that I had originally planned. Thanks to a nearby Micro Center and massive GPU/CPU discounts (the 670 FTW and i7-3930K) I saved c. $700 just from those, plus maybe a few $100 on other components. I ended up just buying everything from them, as (a) it was cheaper than Newegg et al., even counting sales tax v. shipping costs (I bought the monitor and case as well, and those are pricey to ship, and this was pre-Amazon Prime days for me), and (b) I could (and did) just drive out one morning to get everything and build that day...

This was the most that I'd ever spent building a desktop system. Usually I'd go even more mid-range, e.g. a 660 and probably an Ivy Bridge or whatever was available back then, v. LGA2011. I don't know why they made the 4-core for 2011: it wasn't enough cheaper to truly be an option for anyone other than someone who might not be able to afford the 3930K (or 60K) immediately but had plans to upgrade later. Still a waste IMNHO, as quad-channel memory doesn't add enough, and now that I think of it I'm not even sure that the 3820K(?) was even able to support quad channel, as IIRC it was pretty heavily gimped. Now I'm just waiting for Haswell-E, but will steal as many components from the 2011 build as I can and replace those with lower-end stuff as it gets demoted. The 4930K (Ivy Bridge) just didn't offer enough to bother looking at for c. $400. (I love Micro Centers...)

Comment

NVIDIA GPUs aren't "gimped". They just aren't very good at some of those operations that they weren't designed for, like scrypt. It's not like you can grab a zillion-dollar Tesla card and all of a sudden get great scrypt performance.

Thanks for sharing.

Comment

But they are. You can find quite a lot of "patches" for GeForce GPUs to gain Tesla-quality.

The part about neither of them targeting scrypt is correct, though. Same as with DX FL11_2 (or whatever MS calls the feature level for DX 11.2), NVIDIA skimped on a few rarely* used functions.

* But quite useful in OpenCL and in coin mining.

Anyway, if we were being fair, those fingers should be pointed at...
AMD....

For their quite horrible OpenCL compiler, which has had trouble compiling complex kernels (OpenCL programs).
IIRC AMD has already issued somewhat better drivers, but they still do not satisfy the needs of OpenCL renderers (right now the most demanding apps in terms of code complexity).

Comment

I highly doubt that FPGAs are going to be 'mainstream' anytime soon, but there is some research work going on where people are trying to translate OpenCL code into VHDL automatically.
It's going to take years before this is really useful, but it's going to happen (at least in some form).