Posted by Zonk on Thursday November 09, 2006 @05:51PM
from the you-can-use-them-to-play-quake-too dept.

evanwired writes "Revolution is a word that's often thrown around with little thought in high-tech circles, but this one looks real. Wired News has a comprehensive report on computer scientists' efforts to adapt graphics processors for high-performance computing. The goal for these NVIDIA and ATI chips is to tackle non-graphics-related number crunching for complex scientific calculations. Alongside this week's release of its wicked-fast new GeForce 8800, NVIDIA announced the first C-compiler environment for the GPU; Wired reports that ATI is planning to release at least some of its proprietary code to the public domain to spur non-graphics-related development of its technology. Meanwhile, lab results are showing some amazing comparisons between CPU and GPU performance. Stanford's distributed computing project Folding@Home launched a GPU beta last month that is now publishing data putting donated GPU performance at 20-40 times the efficiency of donated CPU performance."

Speaking of running Windows: is it possible (or legal) for Nvidia to sell proprietary software for system-demanding activities like video and audio encoding that runs on its hardware only, taking advantage of the GPU's supercomputer-like properties?

One more step toward GPU raytracing. We're already pushing ridiculous numbers of polygons, with less and less return for our efforts. The future lies in projects like OpenRT [openrt.de]. With any luck, we'll start being able to blow holes through levels rather than having to run the rat-maze. ;)

It may be the same reason that you can't be bothered to capitalize or use apostrophes.

cant [kant]
-noun
1. insincere, esp. conventional expressions of enthusiasm for high ideals, goodness, or piety.
2. the private language of the underworld.
3. the phraseology peculiar to a particular class, party, profession, etc.: the cant of the fashion industry.
4. whining or singsong speech, esp. of beggars.
-verb (used without object)
5. to talk hypocritically.
6. to speak in the whining or singsong tone o

Let me see if I have this down right: With the progress of multi-core CPU's, especially looking at the AMD / ATI deal, PC's are moving towards a single 'super chip' that will do everything while phasing out the use of a truly separate graphics system. Meanwhile, supercomputers are moving towards using GPU's as the main workhorse. Doesn't that strike anybody else as a little odd?

With the progress of multi-core CPU's, especially looking at the AMD / ATI deal, PC's are moving towards a single 'super chip' that will do everything while phasing out the use of a truly separate graphics system.

Not really...

PCs run multiple processes with unpredictable branching -- network protocol stacks, device drivers, word processors, plug'n'play devices. More CPU cores help spread that load. For the desktop Windows system, 3D functionality was simply a bolt-on to the windowing system.

Simple video games that run ENTIRELY on the GPU -- mainly for developers. Got 3 hours (or I guess it's now going on 7 hours) to wait for an ALTER statement on a table to complete, and you're bored stiff? Fire up this video game, and while your CPU cranks away, you can be playing the video game instead with virtually NO performance hit to the background CPU task.

Do people who work with databases large enough to make an alter table run for 3 hours commonly put them on their workstations? Why not run the alter table command on the database server, and play your game on your workstation?

Do people who work with databases large enough to make an alter table run for 3 hours commonly put them on their workstations? Why not run the alter table command on the database server, and play your game on your workstation?

I would hope the latter -- but the 2^24th bug rebuild I was referring to sure took a long time.

I would suggest reading the net while you're waiting for the computation to finish, but I'm sitting here with Mozilla using 150MB of RAM and burning 98% of CPU because it's gotten itself into some kind of loop.... But Nethack is a nice low-CPU low-RAM game that shouldn't bother your CPU much.

GPUs have dedicated circuitry to do math, math, and more math - and to do it *fast*. In a single cycle, they can perform mathematical computations that take general-purpose CPUs an eternity, in comparison.

The 8800 looks like the first GPU that really enters the realm of the old fashioned supercomputing architectures pioneered by Seymour Cray that I cut my teeth on in the mid 1970s.
I can't wait to get my hands on their "C" compiler.

SAN JOSE, Calif., Nov. 8 -- A $90 million supercomputer made for nuclear weapons simulation cannot yet be rivaled by a single PC chip for a serious video gamer. But the gap is closing quickly.

Indeed, a new breed of consumer-oriented graphics chips has roughly the brute computing power of the world's fastest computing system of just seven years ago. And the latest advance came Wednesday when the Nvidia Corporation introduced its next-generation processor, capable of more than three trillion mathematical operations per second.

Nvidia and its rival, ATI Technologies, which was recently acquired by the microprocessor maker Advanced Micro Devices, are engaged in a technology race that is rapidly changing the face of computing as the chips -- known as graphical processing units, or G.P.U.'s -- take on more general capabilities.

In recent years, the lead has switched quickly with each new family of chips, and for the moment the new chip, the GeForce 8800, appears to give the performance advantage to Nvidia.

On Wednesday, the company said its processors would be priced at $599 and $449, sold as add-ins for use by video game enthusiasts and for computer users with advanced graphics applications.

Yet both companies have said that the line between such chips and conventional microprocessors is beginning to blur. For example, the new Nvidia chip will handle physics computations that are performed by Sony's Cell microprocessor in the company's forthcoming PlayStation 3 console.

The new Nvidia chip will have 128 processors intended for specific functions, including displaying high-resolution video.

And the next generation of the 8800, scheduled to arrive in about a year, will have "double precision" mathematical capabilities that will make it a more direct competitor to today's supercomputers for many applications.

"I am eagerly looking forward to our next generation," said Andy Keane, general manager of Nvidia's professional products division, a business the company set up recently to aim at commercial high-performance computing applications like geosciences and gene splicing.

The chips made by Nvidia and ATI are shaking up the computing industry and causing a level of excitement among computer designers, who in recent years have complained that the industry seemed to have run out of new ideas for gaining computing speed. ATI and Advanced Micro Devices have said they are working on a chip, likely to emerge in 2008, that would combine the functions of conventional microprocessors and graphics processors.

That convergence was emphasized earlier this year when an annual competition sponsored by Microsoft's research labs to determine the fastest sorting algorithm was won this year by a team that used a G.P.U. instead of a traditional microprocessor. The result is significant, according to Microsoft researchers, because sorting is a basic element of many modern computing operations.

Moreover, while innovation in the world of conventional microprocessors has become more muted and largely confined to adding multiple processors, or "cores," to single chips, G.P.U. technology is continuing to advance rapidly.

"The G.P.U. has this incredible memory bandwidth, and it will continue to double for the foreseeable future," said Jim Gray, manager of Microsoft's eScience group.

Although the comparison has many caveats, both computer scientists and game designers said that the Nvidia GeForce 8800 had in some ways moved into the realm of the supercomputing power of the last decade.

Let me see if I have this down right: With the progress of multi-core CPU's, especially looking at the AMD / ATI deal, PC's are moving towards a single 'super chip' that will do everything while phasing out the use of a truly separate graphics system. Meanwhile, supercomputers are moving towards using GPU's as the main workhorse. Doesn't that strike anybody else as a little odd?

Odd? Not really. The "PC super chip" design is practically the same thing as the "GPU Supercomputer" design. The big difference is

I think computers will eventually contain an FPGA, which can be re-programmed to perform any task. For example, a physics processor can be programmed into the FPGA when a game launches, folding@home can program the FPGA to do specific vector calculations very quickly, encryption algorithms can be programmed in to perform encryption/decryption very quickly, etc.

FPGAs are getting quite powerful and are getting a lot cheaper. It definitely won't be as fast as a dedicated ASIC, but if programmed properly, it

The specs on this board are pretty crazy: 128 single-precision FP units, each capable of an FP multiply-add or a multiply per cycle, operating at 1.35 GHz and no longer closely coupled to the traditional graphics pipeline. The memory hierarchy also looks interesting... this design is going to be seeing a lot of comparisons to the Cell processor. Memory is attached via a 384-bit bus (320 on the GTS) and operates at 900MHz.

The addition of a C compiler, with drivers specific to GPGPU applications available for Linux (!) as well as XP/Vista, means this is going to see widespread adoption amongst the HPC crowd. There probably won't be any papers on it published at SC06 in Florida next week, but over the next year there will likely be a veritable torrent of publications (there is already a LOT being done with GPUs). The new architecture really promotes GPGPU apps, and the potential performance/$ is striking, especially factoring in development time, which should be significantly less with this toolchain. A couple of 8800GTXes in SLI and I could be giving traditional clusters a run for their money on apps like FFTs. I can't wait till someone benchmarks FFT performance using CUDA. If anyone finds such numbers, post and let me know!

Original [slashdot.org]
It's not unusual at all. CPUs are very general and do certain things very quickly & efficiently. GPUs on the other hand do other things very quickly and efficiently. The type of number crunching that GPUs do is actually well suited to the massively repetitive number crunching done by most of the big super computers [think climatology studies]. Shifting from CPU to GPU architectures just makes sense there.

It's nice to see the name Acceleware mentioned in the NVIDIA press release, although they are missing from the 'comprehensive' report on Wired. It should be noted that they have been delivering high-performance computing solutions for a couple of years or so already. I guess now it's out of the bag that NVIDIA's little graphics cards had something to do with that.

Anyone know of any other companies that have already been commercializing GPGPU technology?

"Let me see if I have this down right: With the progress of multi-core CPU's, especially looking at the AMD / ATI deal, PC's are moving towards a single 'super chip' that will do everything while phasing out the use of a truly separate graphics system. Meanwhile, supercomputers are moving towards using GPU's as the main workhorse. Doesn't that strike anybody else as a little odd?"

16789087 [slashdot.org]

My honors thesis at college back in 2004 was a framework that would let you load pixel shaders (written in Cg) as 'threads' and run them in parallel on one GPU. As far as I can tell nVidia has done the same thing, but taken it a step further by translating from C (and more efficiently, I'm sure).

Why is a GPU so great for math? Parallel processing (it's on page 2 of the Wired article linked at the top of the Slashdot summary). If you need lots of branching and decision making, it's not as good. The better bandwidth, etc., sure helps, but parallel processing is the heart of it. That is why GPUs are so great at the number crunching involved in graphics (3D is done not by "moving the points" but by changing the base axis around the points -- this is a way of visualizing the math done to

Folding@Home launched a GPU beta last month that is now publishing data putting donated GPU performance at 20-40 times the efficiency of donated CPU performance.

Obviously some of that is due to GPUs being better than general-purpose CPUs at this sort of math, but how much is also due to the fact that the people who are willing to run a Beta version of Folding@Home on their GPU tend to be the sort of people who would have much better computers overall than those who are merely running the project on their

The following idea from TFA is what caught my eye:

"In a sign of the growing importance of graphics processors, chipmaker Advanced Micro Devices inked a deal in July to acquire ATI for $5.4 billion, and then unveiled plans to develop a new 'fusion' chip that combines CPU and GPU functions."

I can see the coming age of multi-core CPUs not necessarily lasting very long now. We don't tend to need a large number of general-purpose CPUs. But a CPU+GPU chip, where the GPU has for example 128 1.35GHz cores (from t

nVidia has PureVideo, ATi has whatever. Why are there still no GPU-assisted MPEG2 (or any other format) video encoders? Modern GPUs will do hardware-assisted MPEG decoding, but software-only encoding is still too slow. TMPGEnc could be much faster. Same for the others. It seems as though the headlong rush to HD formats has left SD in the dust.

<shameless plug>While it's probably too late to sign up for the general-purpose GPU tutorial at Supercomputing '06, there may still be time to get to the "General-Purpose GPU Computing: Practice and Experience" workshop (assuming you're going to Supercomputing to begin with.) Workshop's web page is http://www.gpgpu.org/sc2006/workshop/ [gpgpu.org]

The workshop itself has turned into a kind of "GPU and multi-core" forum, with lots of great speakers. NVIDIA's Ian Buck and ATI's Mark Segal will both be speaking to th

Intel's 80 core chip wasn't symmetric; most of those cores were stripped-down processors, not x86 standard. Like the Cell, only more so.

nVidia's G80, while not on the same chip, takes this to 128 cores. G90 will support full double-precision math. And although it's separate from the CPU, graphics cards are such a standard part of most systems that by the time five years have elapsed, you'll likely be able to get a quad-core x86 + 256-core DP gfx/HPC system for somewhat less than Intel's fancy new 80-core r

Unfortunately, the new NV80 is still not IEEE754 compliant for single precision (32 bit) floating point math. It is mostly compliant however, so may be usable by some people. Forget it if you want to do 64 bit double precision floats though.

CPUs are inherently good at serial jobs, and GPUs are good at parallel jobs. GPUs can be thought of as the extreme, graphics-enhanced equivalent of DSP chips. So basically, any combination of a controlling processor and a parallel execution processor can give you the supercomputing environment you need. Which again brings us back to our traditional supercomputing model, except for one change: the mathematical units have grown faster and massively parallel in nature! We haven't done much past anything turing computable anyway.

the mathematical units have grown faster and massively parallel in nature!

This is not English. Even if I try, I can't guess what you are trying to say. It makes no sense whatsoever...

We haven't done much past anything turing computable anyway

Please understand what "turing computable" means. All computers are devices capable of emulating a Turing machine (at least for computations that fit within RAM, etc.). And a computer is something you can emulate on a Turing machine. Your criticism is like someone complaining tha

I've read this isn't quite as much a waste of electricity as it seems, at least during the winter if you have electric heating. The majority of the energy consumed by your CPU goes into thermal energy, which your heatsink dissipates into the air. Thus every watt your CPU burns is one watt your furnace doesn't have to burn to keep your house warm enough. I'm sure it doesn't work out perfectly, but one way you're running a whole bunch of ele

Great if you want fast answers, but the RAM used in GPUs isn't as robust accuracy-wise as normal RAM.

You're confusing your technologies. The RAM used on video cards these days is effectively the same RAM you use with your CPU. The memory cannot lose data or very bad things will happen to the rendering pipeline.

What you're thinking of is the intentional inaccuracy of the floating point calculations done by the GPU. In order to obtain the highest absolute graphical performance, most 3D drivers optimized for gaming attempt to drop the precision of the calculations to a degree that's unacceptable for engineering uses, but perfectly acceptable for gaming. NVidia and ATI make a lot of money by selling "professional" cards like the Quadro and the FireGL to engineering companies that need the greater precision. A lot of the difference is in the drivers (especially for the low-end models), but the cards do often have hardware technologies better suited to CAD-type work.

First of all, the GF8800 has the same deficiency the Cell has, in that both are only really good at single-precision floating-point math. This is great for video processing and the like, but real science has been using 64-bit floats since the mid-70's. It might be hard to convince users that they can get the wrong answer, but it'll be really cheap and really fast. Secondly, the bandwidth to memory is very high, but the amount of addressable memory is very, very low: 768MB of memory, divided by 128 p

"GPUs have dedicated circuitry to do math, math, and more math - and to do it *fast*. In a single cycle, they can perform mathematical computations that take general-purpose CPUs an eternity, in comparison."

Sounds like there is a lot of untapped potential. I propose we move GPUs off the external cards and give them their own dedicated spot on the motherboard. Though, since we will be allowing it to be used for more general applications, we could just call it a Math Processor. Then again, it's not really a full processor like a dual core, so we'll just call it a Co-Processor. This new "Math Co-Processor" will revolutionize PCs like nothing we have ever seen before. Think of it: who would have thought 20 years ago we could have a whole chip just for floating point math!

It's quite obvious that computing is going in a direction where we won't say GPUs or CPUs, but rather serial processors and parallel processors, with the assumption of having both. The Cell processor is a good example of this thinking, although it's too heavy on the parallel side. Many tasks do not parallelize well and will still need a solid serial processor.

Remember this [wikipedia.org]? although it was a failure commercially, it was the right idea after all: lots of small processing units that are able to process in parallel big chunks of data; that's what modern GPUs do.

So what we need now is for this kind of architecture to make its way into CPUs (maybe already scheduled, from what I've read lately), and then a programming language where operations are parallel except when data dependencies exist (functional languages may be good for this task).

Until these things are able to do double precision, their applicability to general HPC problems remains very limited.
Make them do DP arithmetic, benchmark with SPEC and McCalpin's STREAM benchmark, and then we'll see. Oh, and BTW, make a Fortran-90/95 compiler available.

Perhaps finally, we will see the popular commercial/shareware/freeware programs taking advantage of GPU acceleration.

There are two main areas that I would love to see accelerated by GPU:
DivX or other MPEG4 Codec
MP3 Codec

Due to the asymmetry in CPU usage, it is the ENCODING that would be revolutionized by GPU acceleration. I am sure I am not alone when I think of these two areas as the most time-consuming tasks my home PC is set upon. Yes, ATI may have a solution, but I want to see support for both N

My rationale is something along the lines of how Apple may have implemented hardware-assisted vector operations: falling back to scalar equivalents when AltiVec wasn't available.

On kernel startup (or dynamically, assuming hot swapping GPUs!) the system could load a configuration for a shared library to take advantage of GPU acceleration. Whether this happened when coding to a specific API or could somehow be trapped in the platform c lib or at

Everyone knows that the language of supercomputing is Fortran, for historical (legacy code) as well as truly practical reasons: it's a braindead language (very good for compiler optimizations and automatic rewriting) with efficient and predictable linear algebra handling (loop unrolling, peephole optimization, optimal memory access without pointer indirections or heavy objects to pass between functions), which is the core of heavy numerical computing. What are they waiting for to release a Fortran compiler for the