
Gianna Borgnine writes "NVIDIA is predicting that GPU performance is going to increase a whopping 570-fold in the next six years. According to TG Daily, NVIDIA CEO Jen-Hsun Huang made the prediction at this year's Hot Chips symposium. Huang claimed that while the performance of GPU silicon is heading for a monumental increase in the next six years — making it 570 times faster than the products available today — CPU technology will find itself lagging behind, increasing to a mere 3 times current performance levels. 'Huang also discussed a number of "real-world" GPU applications, including energy exploration, interactive ray tracing and CGI simulations.'"

No, no. GPUs only become CPUs when they are 570.34567 times faster. You will note that he said precisely 570 times faster. That is, he did not say an even 600 or 1000 or 500, but precisely 570, so we can assume he knew it was not 570.34567.

I see a few tags that cast doubt on the prediction. Why? I'll bet there were skeptics of Moore's Law when that became widely disseminated.

What troubles me is that this sort of cell GPU is not more widely used in everyday applications. We who program for a living are feeling like we have been engaging in 'self stimulation' for years and wish there were some new target platform/market that we could do some interesting work in.

From reading Slashdot, I know there are several technologies which have been "a couple of years away" for a while now which could (if people bothered with the expense) turn the most common problem in computing, heat dissipation, into "how do I prevent it from getting too cold and forming condensation on everything?"

Currently, CPUs and GPUs are stamped out together. Basically, they take a bunch of pre-made blocks of transistors (millions of blocks, billions of transistors in a GPU), etch those into the silicon, and out comes a working GPU.

It's easy - relatively speaking - and doesn't require a huge amount of redesign between generations. When you get a certain combination working, you improve (shrink) your nanometre process and add more blocks.

However, compiler technology has advanced a lot recently, and with the vast amounts of processing power now available, it should be simpler to keep more complex blocks fully utilized. A vastly more complex block, with interconnects to many other blocks, could perform better at a swath of different tasks. This is evident when comparing the performance hit from anti-aliasing. Previously even 2xAA had a huge performance hit, but nVidia altered their designs, and now multisampling AA is basically free.

I recall seeing an article about a new kind of shadowing that was going to be used in DX11 games. The card used for the review got almost 200fps at high settings - with AA enabled that dropped to about 60fps, and with the new shadowing enabled, it dropped to about 20fps. It appears the hardware needs a redesign to be more optimized for whatever algorithm it uses!

Two other factors you're forgetting...

1) 3D CPU/GPU designs are coming slowly, where the transistors aren't just on a 2D plane... that would allow vastly denser CPUs and GPUs. If a processor had minimal leakage and low power consumption, 500x more transistors wouldn't be a stretch.

2) Performance claims are merely claims. Intel claims a quad-core gives 4x more performance, but in many cases it's slower than a faster dual-core.

570x faster for every game? Doubtful. 570x faster at the most advanced rendering techniques being designed today, with AA and other memory-bandwidth hammering features ramped to the max? Might be accurate. A high end GPU from 6 years ago probably won't get 1fps on a modern game, so this estimate might even be low.

A claim of 250x the framerate in Crysis, with everything ramped to the absolute maximum, might even be accurate.

The GeForce 9 series was a rebrand/die shrink of GeForce 8, but the GTX 200 series has some major improvements under the hood:

* Vastly smarter memory controller, including better batching of reads, and the ability to map host memory into the GPU memory space
* Double the number of registers
* Hardware double-precision support (not as fast as single, but way faster than emulating it)

These sorts of things probably don't matter to people playing games, but they are huge wins for people doing GPU computing. The GTX 200 series has also seen a minor die shrink during the generation, so I don't know if the next generation will be more of a die shrink or actually include improved performance. (Hopefully the latter to keep up with Larrabee.)

I don't doubt the prediction at all, I just have concerns about the vat of liquid nitrogen I'm going to have to immerse my computer in to keep that thing from overheating, and the power substation I'm going to need to build in my backyard to power it.

> I don't doubt the prediction at all, I just have concerns about the vat of liquid nitrogen I'm going to have to immerse my computer in to keep that thing from overheating, and the power substation I'm going to need to build in my backyard to power it.

But GPUs today are somewhat more than 570x more powerful than they were several years ago and we haven't had to submerge them in a vat of liquid nitrogen yet, so what makes you think that's going to be the case in the next 570x power increase? (whenever that happens...)

It's easy to get a 570x increase with parallel cores. You will just have a GPU that is 570 times bigger, costs 570 times more, and consumes 570 times more energy. As far as any kind of real breakthrough goes, though, I'm not seeing it from the information at hand.

There is something worthy of note in all this, though, which is that the new way of doing business is through massive parallelism. We've all known this was coming for a long time, but it's officially here.

The prediction is complete nonsense. It assumes that CPUs only get 20% faster per year (compounded; 1.2^6 is roughly 3, hence the "3 times" figure). That would only be true if they did not add more cores to the CPU. And finally, GPUs are hitting the same thermal/power leakage wall that CPUs hit several years ago - they will at best get faster in lockstep with CPUs.

A GPU is not a general purpose processor, as a CPU is. It is only good at performing a large number of repetitive single precision (32 bit) floating point calculations without branching. Double precision (64 bit) calculations - double in C speak - are 4 times slower than single precision on a GPU. And the second you have an "if" in GPU code, everything grinds to a halt. Conditions effectively break the GPU SIMD (single instruction multiple data) model and stall the pipeline.
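A toy model of why that "if" hurts (a sketch in Python, purely illustrative, not how any real GPU is programmed): lanes in a SIMD group execute in lockstep, so when they disagree on a branch the hardware effectively runs both paths and masks off the inactive lanes, making a divergent if/else cost roughly the sum of both paths.

```python
# Toy model of SIMD branch divergence. Lanes in a warp run in lockstep;
# when lanes disagree on a condition, BOTH paths execute serially with
# masking, so the cost is the sum of the two paths.

def simd_if(lanes, cond, then_cost, else_cost):
    """Return the cycles a warp spends on an if/else with given per-path costs."""
    takes = [cond(x) for x in lanes]
    if all(takes):                     # uniform branch: one path only
        return then_cost
    if not any(takes):
        return else_cost
    return then_cost + else_cost       # divergent: both paths, serialized

warp = list(range(32))
uniform   = simd_if(warp, lambda x: True,       10, 10)   # all lanes agree
divergent = simd_if(warp, lambda x: x % 2 == 0, 10, 10)   # lanes disagree
print(uniform, divergent)              # divergent costs double here
```

Real hardware handles this per warp with predication, so a branch that is uniform across a warp costs nothing extra; it is only divergence within a warp that serializes.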

"It assumes that CPU processors only get 20% faster per year (compounded). That would only be true if they did not add more cores to the CPU."

"It is only good at performing a large number of repetitive single precision (32 bit) floating point calculations without branching."

If we wanted a 64-bit GPU it would be easy enough to make. GPUs used to do weird mixes of integer and floating point math until the manufacturers made an effort to guarantee 32-bit precision throughout. That leaves the branching part of the problem.

In other news, ATI is selling their 4870 series cards for $130 on newegg, which are twice as fast as an Nvidia 9800GTS which is the same price (at least on Left 4 Dead, Call of Duty, and any other game that matters). ATI is blowing Nvidia out of the water in terms of performance per dollar and will continue to do so through at least the middle of next year. See here:

Well, that's a new one. There's not even slight technical merit to that statement, but it certainly demonstrates the amusing creativity of ATi fanbois.

>> The 9800GT and 8800GT are the same price and the ATI card blows it out of the water

I have no argument that you should go with Ati if you're windows only and looking at cheaper-end cards.

It's totally irrelevant to me, though, as I go for best overall performance, decent drivers, and only consider cards whose drivers work well with Linux. ATI sucks on all counts in my areas of interest.

I enjoy the following features of my GTX280 (used for calcs, not games):
* CUDA (I compile C code, throw in a couple of lines of stuff for the GPU, and it runs on the GPU; easy)
* Hardware that optimizes my memory accesses and at times branchy code, so the GPU is doing as much work as possible (makes it easy to get good results on the GPU)

WTF Mods. He's just saying that at this price point you can get nearly double the performance from ATI than from nVidia. I love nVidia too, I run a 9800GT, but I'm not going to mod someone troll for pointing out that something else is now faster and cheaper.

Depending on the vendor, it is now possible to get a GTX 275 for less than a 4890, and a GTX 260 for only slightly more than a 4870; at lower price points it's very competitive too. My point is that both NV and ATI are on pretty level ground again, and the ONLY reason I now choose NV over ATI is the superior NV drivers (on both the Linux and Windows side)... oh, and the fact that ATI pulled a fast one on me with their AVIVO performance claims. Shame on you, ATI!

> In other news, ATI is selling their 4870 series cards for $130 on newegg, which are twice as fast as an Nvidia 9800GTS which is the same price (at least on Left 4 Dead, Call of Duty, and any other game that matters). ATI is blowing Nvidia out of the water in terms of performance per dollar and will continue to do so through at least the middle of next year.

Even when it comes to GPGPU (General Purpose computing on the GPU), ATI's hardware is much better than NVIDIA's. However, the programming interfaces for ATI suck big time, whereas NVIDIA's CUDA is much more comfortable to code for, and it has an extensive range of documentation and examples that provide developers with all they need to improve their NVIDIA GPGPU programming. It also has much more aggressive marketing.

As a sad result, NVIDIA is often the platform of choice for GPU usage in HPC, despite its inferior hardware. And I doubt OpenCL is going to fix this, since it basically standardizes the low-level API, leaving NVIDIA's superior high-level API as a differentiator.

In addition to a VDPAU-enabled mplayer, I can actually FIND CUDA-enabled apps. There are CUDA-enabled MD5 crackers and CUDA-enabled BOINC, and Matlab has a CUDA plugin. I'm considering buying a CUDA-compatible card so I can install it at work just to play with it in Matlab.

I agree. I recently bought a laptop with an ATI card, and the biggest reason is that I heard they went open source. I was disappointed to find that their latest Catalyst driver doesn't work well on Ubuntu 9.04. The one recommended by Ubuntu works, but it's VERY slow when restoring a window in Compiz. All in all it feels like a downgrade compared to my Intel integrated graphics. Sigh. :(

Agreed. My primary use for the nvidia gpu is watching HD. Let's do some math.

1080 * sqrt(570) ≈ 25784

I like. Considering even the most basic of today's gpus, the ion and tegra, for example, are capable of 1080p, Mr Nvidia is predicting that my handheld 6 years hence will be able to smoothly decode mkvs and output them real-time to my new UltraMegaFullHD(TM) 25784p tv? Bring on the future!
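Spelling out the arithmetic behind that resolution figure (a toy calculation that assumes the entire 570x goes into pixel throughput, split evenly across both screen axes):

```python
import math

# 570x more pixels per frame means each linear dimension scales by
# sqrt(570), since resolution grows in two dimensions at once.
scale = math.sqrt(570)           # ~23.87
lines = int(1080 * scale)
print(lines)                     # 25784
```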

I read the article, but I don't see any explanation of how exactly that performance increase will come about. Nor is there any explanation of why GPUs will see the increase but CPUs will not. Anyone have a better article on the matter?

> 'Huang also discussed a number of "real-world" GPU applications, including energy exploration, interactive ray tracing and CGI simulations.'

Add to that 'MD5 collisions, etc.'

GPU coding really is going to separate the men from the boys. I sense a return to the old days, where people had to think about coding, and where brilliant discoveries were made (like this: http://en.wikipedia.org/wiki/HAKMEM [wikipedia.org]).

Moore's Law states a doubling in transistors (but we'll call it performance) every 18-month interval, so:

72/18 = 4 Moore cycles

2^4 = 16

So in six years, Gordon Moore says we should have 16x the performance we have now.
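The same back-of-envelope as a quick script:

```python
# Moore's law: one doubling per 18 months, over a 6-year horizon.
months = 6 * 12
cycles = months / 18             # 4.0 doublings
speedup = 2 ** cycles
print(cycles, speedup)           # 4.0 16.0
```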

But it's indeed interesting... Silicon was a much easier-to-predict medium in the 20th Century. And yet here we have these two mature, opposing approaches to silicon-based computing, represented by the CPU and the GPU, with some predicting unprecedented growth for one and near-stagnation for the other.

Or what he is actually saying is that they (nVidia) will have more than 9 doubling generations (~9.15, since 2^9.15 ≈ 570) within 6 years... 1.5 generations/year... which I believe is fairly doable, and actually slightly slower than the 6-month release cycle we have been accustomed to since 1998.

I do high-performance lattice QCD calculations as a grad student. At the moment I'm running code on 2048 Opteron cores, which is about typical for us -- I think the big jobs use 4096 sometimes. We soak up a *lot* of CPU time on some large machines -- hundreds of millions of core-hours -- so making this stuff run faster is something People Care About.

This sort of problem is very well suited to being put on GPUs, since the simulations are done on a four-dimensional lattice (say 40x40x40x96 -- for technical reasons the time direction is elongated) and since "do this to the whole lattice" is something that can be parallelized easily. The trouble is that the GPUs don't have enough RAM to fit everything into memory (which is understandable; the lattices are huge) and communications between multiple GPUs are slow (since we have to go GPU -> PCI Express -> Infiniband).

If Nvidia were to make GPUs with extra RAM (could you stuff 16GB on a card?) or a way to connect them to each other by some faster method, they'd make a lot of scientists happy.
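For a sense of scale, here's a rough back-of-envelope for the gauge-field storage on a 40x40x40x96 lattice. The per-site layout below (4 SU(3) links per site, each a 3x3 single-precision complex matrix) is an assumption for illustration; real codes also store fermion fields and solver workspace, which multiply this figure.

```python
# Hypothetical storage estimate for one SU(3) gauge field on a
# 40x40x40x96 lattice: 4 links per site, each a 3x3 complex matrix
# in single precision (two 4-byte floats per complex entry).
sites = 40 * 40 * 40 * 96                  # 6,144,000 lattice sites
bytes_per_site = 4 * 3 * 3 * 2 * 4         # links * 3x3 * (re, im) * float32
total_gib = sites * bytes_per_site / 2**30
print(f"{total_gib:.2f} GiB")              # ~1.65 GiB for the links alone
```

Even this single field overflows a typical 1 GB card of the era, before counting the fields the solver actually iterates on, which is why multi-GPU communication speed matters so much here.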

You can -- that's what people are trying now. The issue is that in order for the GPU's to communicate, they've got to go over the PCI Express bus to the motherboard, and then via whatever interconnect you use from one motherboard to another.

I don't know all the details, but the people who have studied this say that PCI Express (or, more specifically, the PCI Express to Infiniband connection) is a serious bottleneck.

> If Nvidia were to make GPU's with extra RAM (could you stuff 16GB on a card?) or a way to connect them to each other by some faster method, they'd make a lot of scientists happy.

Do you really need to ask them to do this for you? I'd think if you are a grad student you might be able to get together with some Electrical Engineering students, rig something up, and turn a profit! The only thing you really need to know is how much memory the GPU can address, if you can get hold of the source for the drivers, etc.

A video card isn't much more than a GPU with memory soldered on to it...

That product was actually specifically mentioned in the plenary talk at the 2009 Lattice Gauge Theory conference as the most likely contender for doing QCD on GPU's. It's still got the problem I mentioned, though -- not enough RAM to store everything, and not enough bandwidth to talk to the other units that are storing it.

No, that won't do. The NVIDIA architecture (which is shared between Tesla and graphics cards) is 32-bit, meaning that it can only flat-address 4GB of RAM, tops. The more sophisticated Tesla solutions are essentially built from clusters of Tesla cards, each with its own 4GB of RAM, tops. Separate memory spaces mean expensive memory transfers to share data between the cards, which is not an issue if you can get good domain decomposition, but is a BIG issue if you cannot.

The revolution for HPC on GPUs would be a 64-bit GPU architecture.

Proper support for doubles and possibly even long doubles would be a plus, for applications that need it.
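The 4GB ceiling mentioned above isn't arbitrary; it falls straight out of the pointer width:

```python
# A 32-bit architecture can distinguish at most 2**32 byte addresses,
# which caps a flat address space at 4 GiB.
addressable_bytes = 2 ** 32
print(addressable_bytes, "bytes =", addressable_bytes // 2**30, "GiB")
```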

It turns out single precision is enough for lattice QCD. The step that requires the most CPU time doesn't need to generate an exact result; it only needs to get close. If the result is too far off then you wind up wasting time, but the result will still be valid.

(This is the Metropolis procedure, if you're familiar with it: the accept/reject step takes care of any computational errors that occur)

"Did I mention that our next model is going to be SO amazing that you'll think that our current product is crap? The new model will make EVERYTHING obsolete and the entire world will need to upgrade to it when it comes out. People won't even be able to give away any older products. Sooooo... how many of this year's model will you be buying today?"

This isn't about math (well, maybe a little) as much as it is about wording. Basically, the difference between "as fast/increasing/the speed" and "faster/increase". The first is a multiplicative action, while the second is additive. So if you say 100% as fast, you are basically saying a * 100% = a, while if you say 100% faster you are saying a + a*100% = 2*a. Now, looking at the posts: you started with this assertion.
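The two readings, in numbers (illustrative only):

```python
# "100% as fast" is multiplicative: a * 100% = a  (no change).
# "100% faster" is additive:        a + a*100% = 2a (double).
a = 100.0
as_fast_100 = a * 1.00
faster_100 = a + a * 1.00
print(as_fast_100, faster_100)   # 100.0 200.0
```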

Intel said 4 nm for 2022; that's in 13 years. What precisely allows you to doubt that claim, except maybe the fact that deadlines are often missed? Let me rephrase: what allows you to think that it'll be reached much later than claimed?

Also, cue a dozen-plus posts explaining to the armchair pundits how 570x is possible.

The IEEE figures that semiconductor tech will be at the 11nm level around 2022. Intel and Nvidia both claim that they'll be significantly further along the path than the IEEE's roadmap. Maybe they're right, and I hope they are, but there are some very significant problems that appear as the process shrinks to that level.

Keep in mind that's only ~3x per year, since 3^6 = 729 (the sixth root of 570 is about 2.9). If Moore's law holds with 2x every 18 months, that's 16x in 6 years, and 570/16 = 35.6. The sixth root of 35.6 is about 1.8. So they only have to improve the architecture by ~1.8x every year and ride Moore's law.
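Checking that decomposition with actual numbers:

```python
# Split the claimed 570x into a Moore's-law part (2x per 18 months over
# 6 years) and an architectural part that has to cover the rest.
moore = 2 ** (6 * 12 / 18)             # 16x from process scaling alone
arch_total = 570 / moore               # ~35.6x left for architecture
arch_per_year = arch_total ** (1 / 6)  # ~1.81x per year
print(round(moore), round(arch_total, 1), round(arch_per_year, 2))
```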

I'm not shorting Intel's capabilities, but the IEEE has some solid people in it, too -- many of whom work at Intel -- and they're very capable of recognizing the potential problems with process shrinks. The issues that come about at the sizes they're discussing involve quantum tunneling effects that would (as I understand it) interfere with accurate computing. There is also doubt that transistors can be made to work at all at sizes below 16nm, because the mechanisms that might deal with quantum tunneling may not scale down that far.

> Intel said 4 nm for 2022, that's in 13 years. What precisely allows you to doubt that claims, except maybe the fact that deadlines are often missed? Let me rephrase that, what allows you to think that it'll be reached much later than anything else?

I dunno. Most CEOs don't make claims unless their business plan includes said claims, else they look like fools at the next shareholder meeting. That doesn't stop them from making claims that don't come true, though.

Agreed with the other poster: you're wrong, they've been actively pursuing the goal of fulfilling the prophecy. They make it a primary goal to increase the number of transistors by all means. If it weren't for the law, they wouldn't have done things the way they have. The law won't fulfill itself that simply; you need to throw billions at it to keep up.

The marketing guys originally wanted to say 1000x, but when they ran it past the engineers, the engineers couldn't stop laughing at such a ridiculous assertion. The marketing guys kept lowering the number, but the engineers just couldn't stop laughing. 570x is how low they got before the engineers passed out from laughing so much, which the marketing guys interpreted as agreement.