Posted by samzenpus on Wednesday October 17, 2007 @01:47PM
from the read-all-about-it dept.

Martin Ecker writes "NVIDIA has recently released the third installment of its GPU Gems series, aptly titled "GPU Gems 3" and published by Addison-Wesley Publishing, weighing in at fifty pages short of a thousand. Just like the two books before it, GPU Gems 3 is a collection of articles by numerous authors from the game development industry, the offline rendering industry, academia, and of course NVIDIA. The 41 chapters of the book, grouped into six parts, discuss a wide range of topics, all dealing with recent advancements in using graphics processing units (GPUs, for short) to either render highly realistic images in real-time or do high-performance, parallel computation, an area that is called GPGPU (short for General Purpose computation on GPUs). In this latest installment of the series, many of the chapters focus on using new features of Direct3D 10-level hardware, such as NVIDIA's GeForce 8 series, to get either more realistic looking results or higher performance."
Read on for the rest of Martin's review.

GPU Gems 3

author: Hubert Nguyen (Editor)
pages: 942
publisher: Addison-Wesley Publishing
rating: 9/10
reviewer: Martin Ecker
ISBN: 0-321-51526-9
summary: In-depth discussions of bleeding-edge techniques, tips, and tricks in real-time graphics and GPGPU.

The book is aimed at the intermediate and advanced graphics programmer who has a solid background in computer graphics algorithms. The reader is also expected to be familiar with commonly used real-time shading languages, in particular HLSL, which is used in most of the chapters. Familiarity with graphics APIs, such as Direct3D and OpenGL, is also required to get the most out of this book.

The first part of the book is about geometry, with the first chapter diving right into generating complex procedural terrains on the GPU. This interesting chapter explains the techniques behind a recent NVIDIA demo that shows very nice, 3-dimensional, procedurally generated terrain using layering of multiple octaves of 3-dimensional noise. An interesting contribution of this chapter is how the authors texture the terrain, avoiding the typical, ugly texture stretching that previous techniques exhibit. This is followed by a chapter on rendering a large number of animated characters using new Direct3D 10 features, in particular the powerful geometry instancing that is now available. The author suggests doing palette skinning by storing bone matrices in animation textures instead of the traditional way, where they are stored in shader constant registers. The next chapter is in a similar vein, but uses blend shapes (also known as morph targets) instead of skinning to animate characters. In particular, the main focus is again on how to use Direct3D 10 features to accelerate blend shapes on the GPU. Other chapters in this part of the book are on rendering and animating trees, visualizing metaballs (also useful for rendering fluids), and adaptive mesh refinement in a vertex shader.
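
To make "layering of multiple octaves of 3-dimensional noise" concrete, here is a minimal CPU-side sketch in Python. It is not taken from the book: the hash function, the use of value noise, the ground-plane starting density, and the names `hash3`, `value_noise`, and `density` are all assumptions made for illustration. The idea shown is only that each octave doubles the frequency and halves the amplitude, and that terrain is treated as solid wherever the summed density is positive.

```python
import math

def hash3(ix, iy, iz):
    """Hypothetical integer hash: maps a lattice point to a value in [0, 1)."""
    h = (ix * 73856093) ^ (iy * 19349663) ^ (iz * 83492791)
    h = (h ^ (h >> 13)) * 1274126177
    return ((h ^ (h >> 16)) & 0xFFFFFF) / float(0x1000000)

def smooth(t):
    """Smoothstep fade so the noise has no visible creases at lattice cells."""
    return t * t * (3.0 - 2.0 * t)

def lerp(a, b, t):
    return a + (b - a) * t

def value_noise(x, y, z):
    """Trilinearly interpolated value noise in [0, 1)."""
    ix, iy, iz = math.floor(x), math.floor(y), math.floor(z)
    fx, fy, fz = smooth(x - ix), smooth(y - iy), smooth(z - iz)
    # Random values at the eight corners of the enclosing lattice cell.
    c = [[[hash3(ix + dx, iy + dy, iz + dz) for dz in (0, 1)]
          for dy in (0, 1)] for dx in (0, 1)]
    near = lerp(lerp(c[0][0][0], c[1][0][0], fx),
                lerp(c[0][1][0], c[1][1][0], fx), fy)
    far = lerp(lerp(c[0][0][1], c[1][0][1], fx),
               lerp(c[0][1][1], c[1][1][1], fx), fy)
    return lerp(near, far, fz)

def density(x, y, z, octaves=4):
    """Assumed density function: a flat ground plane (-y) perturbed by noise.

    Positive values are treated as solid, negative values as air.
    """
    d = -y
    frequency, amplitude = 1.0, 1.0
    for _ in range(octaves):
        # Centre the noise on zero so it both adds and removes material.
        d += amplitude * (value_noise(x * frequency, y * frequency, z * frequency) - 0.5)
        frequency *= 2.0   # each octave adds finer detail...
        amplitude *= 0.5   # ...with a smaller contribution
    return d
```

Sampling `density` on a 3D grid and extracting the surface where it crosses zero would give the terrain mesh; on the GPU that sampling would presumably run in shaders rather than in a Python loop.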

Part two of the book deals with light and shadows. For me personally, this is one of the most exciting parts of the book with very practical techniques that we are going to see applied fairly soon in video games. The first chapter is on summed-area variance shadow maps, an extension to the popular variance shadow maps algorithm that provides nice soft shadows without aliasing artifacts. The next chapter is on GPU-based relighting, which is mostly useful for fast previewing in offline rendering. Then we move on to a nice chapter on parallel-split shadow maps, which are a way of doing dynamic, large-scale environment shadows by splitting the view frustum into different parts and having a separate shadow map for each of them. Other chapters in this part of the book are on improved shadow volumes, high-quality ambient occlusion, which is an improvement of a technique previously presented in GPU Gems 2, and volumetric light scattering.
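
As a rough illustration of how a summed-area table and variance shadow maps fit together, the following Python sketch stores depth and squared depth in two tables, reads the mean of each over a filter rectangle with four lookups, and applies the one-sided Chebyshev bound that variance shadow maps are generally described with. It is a sketch under those assumptions and not the chapter's implementation; the function names and the `min_variance` clamp are hypothetical.

```python
def summed_area_table(grid):
    """Return sat where sat[y][x] is the sum of grid[0:y][0:x] (zero-padded)."""
    height, width = len(grid), len(grid[0])
    sat = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            sat[y + 1][x + 1] = (grid[y][x] + sat[y][x + 1]
                                 + sat[y + 1][x] - sat[y][x])
    return sat

def box_mean(sat, x0, y0, x1, y1):
    """Mean over the half-open rectangle [x0, x1) x [y0, y1) using four lookups."""
    total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
    return total / ((x1 - x0) * (y1 - y0))

def vsm_visibility(mean, mean_sq, receiver_depth, min_variance=1e-6):
    """Chebyshev upper bound on the lit fraction of the filter region."""
    if receiver_depth <= mean:
        return 1.0  # receiver is in front of the average occluder: fully lit
    variance = max(mean_sq - mean * mean, min_variance)
    d = receiver_depth - mean
    return variance / (variance + d * d)
```

Because the table answers any rectangle in constant time, the filter size can vary per pixel, which is what makes soft shadows of varying width affordable. For example, with a depth map that is half 0.2 and half 0.8, a receiver at depth 0.8 gets a visibility of about 0.5 over the full region, matching the half that is actually occluded.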

The third part of the book is on rendering techniques and it starts with a very interesting chapter on rendering realistic skin in real-time. At more than fifty pages, this chapter is one of the longest in the book, but it definitely deserves the space. I have never seen such realistic looking skin rendered in real-time before. The result is really astonishing, and the authors go into detail on all the various techniques and tricks employed to achieve it. Simply put, they take a diffuse map and apply multiple Gaussian blurs of varying kernel sizes to it. These blurred images are then linearly combined using certain weights to get an approximation to a so-called diffusion profile, which is used to visualize subsurface scattering. Of course, the devil is in the details and the technique is a bit more complicated than what I've described here. Some other chapters in this part of the book are on capturing animated facial textures and storing them efficiently using principal component analysis (PCA) as used in recent EA Sports games, animating and shading vegetation in the upcoming game Crysis, and a way of doing relief mapping without the artifacts of previous methods.
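
The "blur several times, then take a weighted sum" idea can be sketched in a few lines of Python. This is a simplified 1D illustration only: the signal stands in for one row of the diffuse map, and the sigmas and weights in the usage note are placeholder values, not the ones from the chapter.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about three sigma."""
    radius = max(1, int(math.ceil(3.0 * sigma)))
    taps = [math.exp(-(i * i) / (2.0 * sigma * sigma))
            for i in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def blur(signal, sigma):
    """Convolve a 1D signal with a Gaussian, clamping at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def sum_of_gaussians(signal, sigmas, weights):
    """Linearly combine several blurred copies of the signal.

    The weights are assumed to sum to 1 so overall brightness is preserved.
    """
    assert len(sigmas) == len(weights)
    blurred = [blur(signal, s) for s in sigmas]
    return [sum(w * b[i] for w, b in zip(weights, blurred))
            for i in range(len(signal))]
```

Calling `sum_of_gaussians([0.0] * 10 + [1.0] * 10, [1.0, 2.0, 4.0], [0.5, 0.3, 0.2])` turns a hard light/shadow edge into a soft falloff with a sharp core and a long tail, which is the qualitative look a single Gaussian cannot give.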

Part four starts out with a chapter on true imposters, i.e. billboards generated by raytracing through a volumetric object on the GPU. It's fairly interesting but I doubt that we'll see it in video games anytime soon because the costs of this technique seem fairly high. Another chapter is on rendering large particle systems to lower resolution, off-screen buffers and then recombining them with the framebuffer as a post process. This technique allows for rendering very fill-rate intensive particle systems with good performance. Other chapters include an appeal to make sure you do your lighting calculations in linear space and be careful when and where gamma correction needs to be applied, followed by some chapters on post processing effects, in particular motion blur and depth of field, and a chapter co-authored by Jim Blinn himself on rendering vector fonts in high quality via pixel shaders.
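
The point about linear-space lighting is easy to see with a small worked example. The Python sketch below uses the standard sRGB transfer functions; whether the chapter uses exactly these or a plain 2.2 power curve is not stated here, so treat the choice as an assumption. The helper `shade` is hypothetical.

```python
def srgb_to_linear(c):
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(c):
    """Encode a linear-light value in [0, 1] for display as sRGB."""
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1.0 / 2.4) - 0.055

def shade(albedo_srgb, light_intensity):
    """Decode the texture, light it in linear space, then encode for display."""
    linear = srgb_to_linear(albedo_srgb) * light_intensity
    return linear_to_srgb(min(linear, 1.0))
```

Halving the light on a mid-grey texel with `shade(0.5, 0.5)` gives a value clearly above 0.25, whereas multiplying the encoded value directly (0.5 * 0.5) gives exactly 0.25, which is too dark on screen. That gap is what the appeal is about.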

With part five dealing with physics simulation on the GPU we enter GPGPU territory. While a lot of the techniques in this and the following part of the book are highly interesting and innovative, I doubt we'll be seeing them applied a lot in video games in the next year or two, simply because they use up a lot of GPU processing power and GPU memory that we game developers would rather spend on doing fancy graphics. The first chapter is on doing rigid body simulation on the GPU. The author uses spherical particles to represent rigid bodies, which greatly simplifies the collision detection even between the most complex shapes. The subsequent chapter is on simulating and rendering volumetric fluids entirely on the GPU. The authors apply fluid simulation to create realistic smoke, fire, and water effects. The presented technique is based on running a fluid simulator on a voxelized 3D volume stored in 3D textures. Solid objects that interact with the fluid are also voxelized on the fly on the GPU. To render the fluid a ray-marching algorithm is used. The remaining chapters of this part of the book discuss N-body simulation, broad-phase collision detection, and convex collision detection with Lemke's algorithm for the linear complementarity problem. Many chapters of this part of the book use NVIDIA's new language for doing GPGPU called CUDA and the reader is expected to be familiar with it. CUDA is both a runtime system and a language based on C that eliminates the need to have in-depth knowledge of a graphics API in order to implement GPGPU algorithms.
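
To show why representing rigid bodies by spherical particles simplifies collision detection, here is a small Python sketch in which every shape-versus-shape test reduces to sphere-versus-sphere distance checks. It is an illustration under simplifying assumptions, not the chapter's method: bodies are only translated (no rotation), all particles share one radius, and pairs are tested by brute force where a GPU version would presumably use some spatial partitioning. All names are hypothetical.

```python
def particles_collide(p, q, radius):
    """Two equal-radius spheres overlap if their centres are closer than 2r."""
    dx, dy, dz = p[0] - q[0], p[1] - q[1], p[2] - q[2]
    return dx * dx + dy * dy + dz * dz < (2.0 * radius) ** 2

def to_world(body_position, local_particles):
    """Translate a body's particles into world space (rotation omitted)."""
    bx, by, bz = body_position
    return [(bx + x, by + y, bz + z) for x, y, z in local_particles]

def bodies_collide(pos_a, particles_a, pos_b, particles_b, radius):
    """Bodies collide if any particle of one overlaps any particle of the other."""
    world_a = to_world(pos_a, particles_a)
    world_b = to_world(pos_b, particles_b)
    return any(particles_collide(p, q, radius)
               for p in world_a for q in world_b)

def make_box(n, spacing):
    """Fill an n x n x n box with particle centres (a stand-in for any shape)."""
    return [(i * spacing, j * spacing, k * spacing)
            for i in range(n) for j in range(n) for k in range(n)]
```

However complicated the original meshes are, the per-pair test stays the same three subtractions and a comparison, which is the kind of uniform work that maps well onto a GPU.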

The final part of the book is on GPU computing with chapters that show how to apply the incredible parallel computing power of modern GPUs to classic computation problems that are not directly related to either computer graphics or physics. One chapter demonstrates how to search for virus signatures on the GPU, effectively turning your graphics card into an antivirus scanner. Another chapter shows how to do AES encryption and decryption on the GPU, which is now possible thanks to the new generation of GPUs supporting integer operations in addition to floating-point operations. Other chapters deal with generating random numbers, computing the Gaussian, and using the geometry shader introduced with Direct3D 10 to implement computer vision algorithms on the GPU that previously were not possible with vertex and pixel shaders only, such as histogram building and corner detection.
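
The dependence of AES on integer operations is easy to illustrate. The Python sketch below implements multiplication in the finite field GF(2^8) and the MixColumns step for a single column, as defined in the AES standard (FIPS 197), using only shifts, masks, and XOR. It says nothing about how the chapter maps this onto the GPU; it only shows why bitwise integer instructions are the natural fit, and the function names are my own.

```python
def xtime(a):
    """Multiply a byte by x (i.e. by 2) in GF(2^8) with the AES polynomial."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B  # reduce modulo x^8 + x^4 + x^3 + x + 1
    return a & 0xFF

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add (add is XOR)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

def mix_column(col):
    """Apply the AES MixColumns matrix to one 4-byte column."""
    a0, a1, a2, a3 = col
    return [
        gf_mul(a0, 2) ^ gf_mul(a1, 3) ^ a2 ^ a3,
        a0 ^ gf_mul(a1, 2) ^ gf_mul(a2, 3) ^ a3,
        a0 ^ a1 ^ gf_mul(a2, 2) ^ gf_mul(a3, 3),
        gf_mul(a0, 3) ^ a1 ^ a2 ^ gf_mul(a3, 2),
    ]
```

As a check against the standard's worked example, `gf_mul(0x57, 0x83)` evaluates to `0xC1`. Every step here is a shift, a mask, or an XOR on small integers, none of which could be expressed exactly on hardware that only had floating-point arithmetic.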

One of the features that distinguishes the GPU Gems series from other graphics books was kept for GPU Gems 3: the high quality and large number of images and diagrams. All figures in the book are in color, and there are plenty of them. The book also comes with a DVD that has the sample source code for most of the techniques discussed in the book. A lot of these programs require Direct3D 10 hardware (and as a consequence Windows Vista) to run. However, for most of these, demo videos are also made available so you can see what a technique looks like without having the latest hardware or operating system. Furthermore, the book's website offers a visual table of contents and three sample chapters to download in PDF format.

As with the previous two GPU Gems books, most of the chapters in this book are fairly advanced and ahead of their time. A lot of the presented techniques are not yet practical for video games on current generation GPUs, simply because they use up all the computation power and/or memory that they have to offer. However, a lot of techniques from the previous two books are now commonly used and we can expect the same to be the case for many of the techniques discussed in this book. As such, it is required reading for any serious professional working in the real-time computer graphics industry.

Martin has been involved in real-time graphics programming for more than 10 years and works as a professional game developer for High Moon Studios in sunny California.

The word is "unstable", and there are lots of people who aren't programmers, but can serve to support programmers by, I don't know, making documentation and collating information into something like, say, a book?

Dammit I hate to see all this DirectX10 emphasis. It's games only. I am a scientist and CAD user. Right now there is no laptop let alone "consumer" card in the world that can handle even the kind of CAD work a lot of people have to do. OpenGL was created for science. DirectX was a copy of a subset of it applicable only to games. And now all the graphics cards are focusing on the DirectX and neglecting OpenGL. Arg! This copy of OpenGL is short-circuiting advancement in the very thing 3D graphics were originally, you could say, invented for -- the thing they actually are useful for beyond plain entertainment. These cards cost hundreds of dollars but they can't handle an assembly with 100 parts in a CAD model simply because they barely have any OpenGL hardware in them. A car, airplane, etc has millions of parts.

These cards cost hundreds of dollars but they can't handle an assembly with 100 parts in a CAD model simply because they barely have any OpenGL hardware in them.

Because there's very little money there.

There is?

Last I was aware pretty much all non-Microsoft specific functionality for graphics was using OpenGL now, and the Linux Gaming market - which uses OpenGL - is a growing market too. Additionally, the CAD (AutoCAD, etc.) market is also a very ripe market for graphics, and a lucrative market too. (N

Sure. We're talking about scale here. Performance increases on graphics cards are being pushed by the high-end gaming market, not by guys like you. The high end gaming market has folks that buy 2 SLI enabled 8800 GTXs one year (at $1100), and go out and turn around and do the same thing on next year's cards. And there are a whole lot more of them than there are of CAD guys buying Quadro FX5500's at $1400 a pop every 3 years. The bleeding edge technology is naturally going to seek the money. Eventually

But who's going to support it, kid? You? Writing graphics drivers ain't like dusting crops, boy. Without precise calculations you could overflow right through a buffer, or divide too close to zero, and that'd end your app real quick, wouldn't it?

Microsoft's argument for having DirectX is that having to go through a committee slows down their development process, and so they prefer to have their own API for the games market (desktops, consoles).

This was always a specious argument. OpenGL includes a vendor extension mechanism so vendors DON'T have to wait for a new 'official' version to roll out new functionality.

Microsoft simply wanted (as it always does) to control the APIs and promote Microsoft platform lockin. It has been trying to push Dire

Well, like it or not, real-time high framerate graphics are the primary use case for these cards. And immediate-mode APIs are turning out to be too bus-heavy for that use case. Whether you like Microsoft or not, the programming model used in DX is an attempt to mitigate this. The proper response is to lobby the OpenGL ARB to add API features more amenable to modern graphics processing. They are making large steps with OpenGL 2.0 in this regard.
Given their history I'm willing to bet that nvidia would

And immediate-mode APIs are turning out to be too bus-heavy for that use case. Whether you like Microsoft or not, the programming model used in DX is an attempt to mitigate this. The proper response is to lobby the OpenGL ARB to add API features more amenable to modern graphics processing.

What exactly are you referring to by "features more amenable to modern graphics processing"?

OpenGL is hardly restricted to immediate mode - no one uses that if they care about performance - and things like vertex buffer o

How many are "a lot"? A lot compared to the millions and millions playing MMOs or FPSs or RTSs or anything else that benefits from a flashy graphics card (yeah, yeah gameplay is important too) and are often paying several hundred bucks each? Let me think back at the university - how many there who would use heavy CAD? Maybe 200 people out of 30000, and I'm probably being kind. How many of those would pay the tens of thousands of dollars needed from each to compete with the gamers? Sorry, but there's just no b

Let me think back at the university - how many there who would use heavy CAD? Maybe 200 people out of 30000, and I'm probably being kind. How many of those would pay the tens of thousands of dollars needed from each to compete with the gamers?

Given that each copy of high-end CAD software costs tens of thousands of dollars (with mid-range costing thousands), I'd say a lot of them.

Judging from your comment about your university, you appear to be thinking that the CAD market is just students tinkering on a bud

The book is written for game developers, and none of the topics are exclusive to DX10 - NVIDIA has already released OpenGL extensions that offer the same functionality under OpenGL. The fact that the samples use DX10 is irrelevant because the API isn't the point. Anyone with a working knowledge of both DX and GL can translate code from one to the other fairly easily.

Right now there is no laptop let alone "consumer" card in the world that can handle even the kind of CAD work a lot of people have to do.

These cards cost hundreds of dollars but they can't handle an assembly with 100 parts in a CAD model simply because they barely have any OpenGL hardware in them. A car, airplane, etc has millions of parts.

That's like comparing a pickup truck to a freight train. Consumer cards aren't designed to do CAD, they're designed to do games because (surprise!) they're sold to gamers. Workstation cards are made to do CAD. If you want to play the latest games, you get an 8800GTX. If you want to do CAD, or ultra high-poly modeling, or movie-quality animation, you get a Quadro FX. Or a FireGL if you prefer AMD/ATI.

And now all the graphics cards are focusing on the DirectX and neglecting OpenGL.

Graphics cards don't focus on either. Graphics cards focus on accelerating the sort of math that's common to all 3D rendering - transforming vertices, rasterizing triangles, and shading fragments (which are roughly analogous to pixels, for those of you that don't speak GL). Graphics drivers focus on DX or GL, and even in the consumer space you'd be stretching if you said that OpenGL is being neglected (see all the OpenGL extensions [opengl.org] that start with NV_ or ATI_ for proof).

Well I'm pleased to hear that NV and ATI are still working on OpenGL as much as they are DirectX if that's really the case.

Are you really telling me that the only difference between a $1500 Quadro and gamer card is the drivers though? The bad-ass gamer card in my friend's computer chokes and can barely run even the most basic animation of an assembly of maybe 30 parts in CAD.

Well I'm pleased to hear that NV and ATI are still working on OpenGL as much as they are DirectX if that's really the case.

It certainly is. NVIDIA had OpenGL equivalents to the new DX10 features out in the very first release of its DX10 driver. So did ATI (though their first DX10 card came much later than NV's so they had more time to begin with). I don't think either will ever be ignored - and that's a good thing. Competition between the two APIs has yielded a lot of good innovation that's now been adopted into both.

Are you really telling me that the only difference between a $1500 Quadro and gamer card is the drivers though? The bad-ass gamer card in my friend's computer chokes and can barely run even the most basic animation of an assembly of maybe 30 parts in CAD.

No. There's much more to it than that, of course. It all comes down to usage. If you profile a video game and a CAD program you'll see that they stress completely different parts of the card. Workstation cards will have more silicon dedicated to things like the memory controller (CAD sends a lot more data across the bus each frame than a game does), whereas consumer cards put most of their power behind the shader processor (games use long and complex shaders to implement animation, lighting, shadowing, etc. - CAD typically just shades everything with simple Phong lighting). There's a lot of other differences as well, though I'd rather not write a 10 page essay on the topic right now :)

CAD passes more data because it still processes tessellation-type features (converting a geometry to a polygon mesh) on the CPU, then passes them to the GPU (say CSG [wikipedia.org]). Geometry shaders don't handle tessellation well, but rumor (or speculation due to research and rumors) has it the next type of shader will be specifically for tessellation (so you'll have vertex, fragment/pixel, geometry, and tessellation shaders) and you will be able to pass a set of primitives into the shader and perform CSG on them.

Actually, as a graphics chip developer, I can tell you that Graphics chip development focuses almost exclusively on Direct3D. What Microsoft wants, Microsoft gets. The needs of OpenGL are entirely secondary when it comes to the hardware design.

Actually, as a graphics chip developer, I can tell you that Graphics chip development focuses almost exclusively on Direct3D. What Microsoft wants, Microsoft gets. The needs of OpenGL are entirely secondary when it comes to the hardware design.

Whose chips do you develop? And if the answer's NV or ATI then maybe you should talk to whoever gets sent out to GDC, because that sure as hell isn't what they're telling us game developers.

In fact, I can actually think of a few cases where GL had something before DX did: NV_primitive_restart [opengl.org]'s been spec'd since 2002 and MS just brought it into DX with DX10 (could have been a caps bit long before then). Same thing with EXT_depth_bounds_test [opengl.org] (is this even in DX10? - I haven't seen it in the docs yet). I'm p

Each chapter is contributed by a different author, and each author decides which API to use. I wrote one of the chapters of GPU Gems 2 (see http://sponeil.net/ [sponeil.net]), and my chapter/demo used OpenGL. When I asked the guys at nVidia if they had a preference, they didn't care. They didn't even care whether I used nVidia's Cg or the standard GLSL. (I started with GLSL but switched to Cg because the GLSL compiler didn't optimize it well enough.)

Me too. I do CFD and 6DOF modeling and simulation. Guess what I wrote my last piece of CFD visualization software in? C# and DirectX 9.0c.
I use OpenGL as well, for some things, but unless you can enlighten me: what technical reason is there that you cannot use DirectX for scientific visualization? I can't think of one off the top of my head. (For one, it's object-oriented...) In fact one of the reasons I like DirectX for CFD is the mesh class. If you are visualizing a flow, you are often looking at a mesh of an object, or a cut through the flow. The mesh class in DirectX fully encapsulates the creation of a mesh, vertices, etc. With OpenGL you'd have to manage your own struct of data. Which is fine, but it's one more thing you have to debug.

And now all the graphics cards are focusing on the DirectX and neglecting OpenGL. Arg!

No, ARB... the OpenGL Architecture Review Board. Design by committee is slow. When was the last time you heard someone say design by committee was a good thing? :) The spec hasn't been updated meaningfully in ages (though the 3.0 spec is due soon... I think... dates keep getting pushed back). So there is nothing for the manufacturers to update!

These cards cost hundreds of dollars but they can't handle an assembly with 100 parts in a CAD model simply because they barely have any OpenGL hardware in them.

Sorry, they utilize the same hardware. OpenGL and DirectX are both APIs to the same hardware on any given video card. Not being able to load 100 parts sounds like a problem... you sure you aren't using a software renderer?

Last date I heard was "end of September, 2007" for OpenGL 3.0. I've heard no new dates, and it still isn't ratified. As was said in a previous post, CAD is more bandwidth dependent than games, which is why the OpenGL cards on the market are optimized for bandwidth. There is no good way to do CSG in hardware at the moment (thus the rumored tessellation shaders in next gen cards), so it's done in software.

I don't see why loading 100 parts on a consumer level card would be a problem, either, bu

Google OpenGL 3.0. The specification should be coming out shortly. No guarantee when support will come out in driver form, but one of the major thrusts was to modernize OpenGL for the newer generation of graphics cards. A lot of OpenGL was legacy support that isn't such an issue these days. Further, the big thing that DX10 is offering is the geometry shaders, so that would be

While what you say is likely true, may I ask if you have looked into the FireGL line of cards, or perhaps a fast CPU/large mem card, and emulation? I do not know about FireGL emulation, but some NV cards have historically been able to load Quadro drivers using Rivatuner to emulate the Quadro's bits in the identifier, or some such. I would expect it to be slower, but CPUs are at quad core and counting, so emulation may be viable, depending upon if it still works.

Yeah, I dunno about that... I just this year finished a project designing the mechanical systems (hvac, plumbing, medical gas, fire, etc...) for a rather large hospital.
It was done almost entirely in 3d, on very basic workstations with off the shelf consumer hardware (not to mention a few mid-range laptops). Not the most spectacular performance, but the consumer stuff got the job done. To give an idea of assembly/sub-assembly resolution, a single file might have 1,000 assemblies, approx. 100 of which are

There was no emphasis on DirectX in GPU Gems 2 (which I only recently bought, dammit!), and I suspect there won't be in this book either - the books are basically a collection of essays, and it's the choice of the individual author. Also this is a GPU programming book, so the shading language is more relevant than the API (if you don't even know the required OpenGL or DirectX commands to set up shaders, these books aren't for you). And again, GPU Gems 2 at least varies between Cg, HLSL and GLSL. It's not li

Not that I made it extremely clear, but I think most of you are missing my point. The point was that if the same resources went into development of OpenGL and hardware optimized for it as goes into DirectX, then maybe both what 3D graphics was originally invented for and then gaming also could benefit at the same time. Instead, any CAD use requires $1500 graphics cards because of poor economies of scale. This is a stupid waste.

A lot of you seem to think that I don't understand why DirectX is receiving pre

Instead, any CAD use requires $1500 graphics cards because of poor economies of scale. This is a stupid waste.

You are talking bullshit. I work in Aerospace. My buddy a few floors down does CAD work on an HP desktop workstation (same build as mine) with a beefy but standard nVidia video card (Quadro FX). It costs half the price you cite. I know people who have that card in their **home gaming** computer. He visualizes entire rocket stages, thousands of unique components, smoothly.

You need to talk to a CAD draftsman some time. You'll quickly learn that there is not a gaming card in existence that can hold a candle to even the cheapest CAD-oriented card on the market. The gaming cards are useless as soon as you try to do anything with any movement.

These cards cost hundreds of dollars but they can't handle an assembly with 100 parts in a CAD model simply because they barely have any OpenGL hardware in them. A car, airplane, etc has millions of parts.

If you're using an Autodesk 07 product it will likely have looked to see what graphics hardware and driver version you have, not found it in its hardware database and disabled all the 3D acceleration.

You could try having a look on the Autodesk website and downloading the latest XML compatibility file to see if your card is now supported, but if it's a consumer card this is unlikely and you're probably just going to have to go into the options and enable it manually. Before you do this it's also worth doin

So, I'm cited in this book for my work on the parallel prefix sum implementation they used. I later went on to rework an MPEG4 encoder for CUDA acceleration. So, to answer your question about using CUDA in these projects: it does offer a speed up, specifically of motion estimation, where most of encoding spends its time. Also, a lot of that speed up comes from exploiting the G80's memory architecture, which I do not believe you can do using GLSL. The problem ends up being that you need a G80, you need NVIDI

This is a bit off-topic, but does anyone know what happened exactly with OpenGL 3.0? I thought it was supposed to have been released in September, but I've been unable to dig up any information about the delay.

There hasn't been any info on OpenGL 3.0 in a while now. Even the "sneak peek" type articles seem to have dried up. My guess is that NVIDIA and AMD/ATI are either doing some last-minute bickering over features, or that someone realized that the spec contains something that would be impossible to implement (my guess would be "mixed OpenGL 2.x/3.0" rendering).

I saw a bunch of nvidia's nice desk-side GPUs in a glassed-in projection room run by SGI at an industrial VR show. Being able to throw two or more at the data flow lets them drive a 4K image (of a sportscar), though even then it looked underpowered. But those little boxes (I guess this is the NVIDIA Quadro Plex series) go for over $20K each. At the current rate of progress, (when) will that drop in price and size to something that can fit inside a desktop or laptop, presumably with a giant ASIC one day? It w