This demo renders a particle system using a range of different methods. The most basic (and slowest) method simply draws each particle with its own draw call. The next method assembles everything in a large vertex array and draws it with a single draw call. The third method assembles it into a vertex buffer instead, resizing the buffer if needed. The two remaining methods implement instancing in two different ways. With instancing, the only information that needs to be passed to the card is the data specific to each instance, which cuts it down to a fraction of what would otherwise be needed: less than 1/4 in this case.

One instancing method uses SetStreamFrequency() and passes the instance data through a second vertex stream that's read at a much lower frequency. This is what people generally mean when they talk about "instancing", and it requires special hardware support.

The other method implements instancing without any special hardware support beyond VS2.0. The instance data is passed through vertex shader constants instead. This has a couple of drawbacks. First, the number of spare vertex shader constants is limited, which limits the number of instances that can be drawn in a single draw call. This also means you'll overwrite previously set constants in subsequent draw calls, so you can't reuse the data in another pass without passing it to the card again. Another drawback is that the source model will be larger, because you need to create multiple copies of the model with indices that select the right vertex shader constant. We are usually talking about relatively small objects when we're doing instancing anyway, so this may not be much of a problem.
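
To make the constant-based path a bit more concrete, here is a minimal CPU-side sketch in Python of the bookkeeping it implies. It is not the demo's code. The function names, the budget of 240 spare constants and the one-constant-per-instance layout are all assumptions chosen for illustration.

```python
def plan_batches(num_instances, spare_constants, constants_per_instance=1):
    """Split instances into draw calls, limited by the spare shader constants.

    Returns a list of (first_instance, instance_count) tuples, one per draw call.
    """
    if constants_per_instance <= 0:
        raise ValueError("constants_per_instance must be positive")
    per_batch = spare_constants // constants_per_instance
    if per_batch <= 0:
        raise ValueError("not enough spare constants for a single instance")
    batches = []
    first = 0
    while first < num_instances:
        count = min(per_batch, num_instances - first)
        batches.append((first, count))
        first += count
    return batches


def replicated_indices(base_indices, verts_per_model, copies):
    """Index buffer for `copies` back-to-back copies of the source model."""
    return [i + c * verts_per_model for c in range(copies) for i in base_indices]


def instance_ids(verts_per_model, copies):
    """Per-vertex index telling the vertex shader which constant to read."""
    return [c for c in range(copies) for _ in range(verts_per_model)]


# Hypothetical numbers: 1000 particles, 240 spare constants, 1 constant each.
batches = plan_batches(1000, 240)
```

With these made-up numbers, 1000 particles need five draw calls, and the constants have to be uploaded again before each one. The replicated index and instance-id lists show why the source model grows with the batch size.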

This should run on Radeon 8500 and up, and on GeForce3 and up. The instancing path will only be available on Radeon 9500 and up with Catalyst 4.8 or newer drivers, and on the GeForce 6800 series. The path that instances through vertex shader constants will only be available on Radeon 9500 and up, and on the GeForce FX 5x00 series and up.

Soft shadows
Tuesday, July 13, 2004

The fastest way to compute is to precompute, which is also one of the main reasons why lightmaps are still hanging around. Lightmaps have a number of great advantages. They are very cheap, and you get soft shadows for free. The disadvantage, though, is that they are static. You can't move the light, and the geometry must be static too. This demo, however, shows a way to get a bit of dynamic lighting into lightmaps. By storing a bunch of lightmaps for a range of different light positions, and interpolating between the closest lightmaps, you can animate the light and get soft dynamic shadows very cheaply. The idea could also be used in a similar way for dynamic geometry with a static light. This demo doesn't implement that though.
It won't work for arbitrary light positions, but most lights in real-world applications don't move around arbitrarily. If they are dynamic, they are often swinging on a cable, or otherwise limited in their movement to a simple animation. In these cases, this technique can be used for rendering soft shadows extremely cheaply.
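
The interpolation step can be sketched as follows. This is a minimal Python illustration, not the demo's code. It assumes the light follows a one-dimensional animation parameter t in [0, 1] and that the lightmaps were baked at evenly spaced values of t. Both function names are made up for the example.

```python
def blend_params(t, num_lightmaps):
    """Map t in [0, 1] to the two closest lightmaps and a blend weight."""
    if num_lightmaps < 2:
        raise ValueError("need at least two lightmaps")
    t = min(max(t, 0.0), 1.0)
    pos = t * (num_lightmaps - 1)
    # Clamp so that t == 1 still yields a valid (lo, hi) pair.
    lo = min(int(pos), num_lightmaps - 2)
    return lo, lo + 1, pos - lo


def interpolate_lightmaps(lightmaps, t):
    """Blend the two closest lightmaps texel by texel.

    Each lightmap is a flat list of intensities (an assumption to keep
    the sketch short).
    """
    lo, hi, w = blend_params(t, len(lightmaps))
    return [(1.0 - w) * a + w * b for a, b in zip(lightmaps[lo], lightmaps[hi])]
```

On the GPU the same blend would typically be a single lerp between two texture lookups, which is why the cost stays so low.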

This demo should work on Radeon 9500 and up, and on GeForce FX 5200 and up.

Dynamic branching
Thursday, July 1, 2004

One of the main features of pixel shader 3.0 is that it supports dynamic branching (also called data dependent branching), something that nVidia of course makes a big deal about. This demo, however, shows a way to implement dynamic branching without pixel shader 3.0, using basic graphics functionality such as alpha testing and stencil testing. Not only is it doable, it also gives every bit of the performance of real dynamic branching.
One scenario where dynamic branching comes in handy is when you have range-limited lights. Whenever a fragment is outside the range of the light, you can skip all the lighting math and just return zero immediately. With pixel shader 3.0 you'd simply implement this with an if-statement. Without pixel shader 3.0, you just move this check to another shader. This shader returns a truth value to alpha. The alpha test then kills all unlit fragments. Surviving fragments write 1 to stencil. In the next pass you simply draw lighting as usual. The stencil test removes any fragments that aren't tagged as lit. If the hardware performs the stencil test prior to shading, you save a lot of shading power. In fact, the shading workload is reduced to the same level as with pixel shader 3.0, or in some cases even slightly below, since the alpha test can do a compare that would otherwise have to be performed in the shader.
The result is that early-out speeds things up considerably. In this demo the result is generally two to four times as fast as without early-out. With more complex lighting, the gain would be even larger.
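
The two passes can be mimicked on the CPU to see where the saving comes from. The Python sketch below is only a model of the idea, not the demo's shaders. The fragment list, the distance-based range check and the `simple_falloff` lighting function are assumptions made for the example.

```python
def simple_falloff(distance, light_range):
    """Stand-in for the expensive lighting shader (hypothetical)."""
    return max(0.0, 1.0 - (distance / light_range) ** 2)


def tag_pass(distances, light_range):
    """Pass 1: cheap shader plus alpha test; survivors write 1 to stencil."""
    stencil = []
    for d in distances:
        alpha = 1.0 if d < light_range else 0.0   # truth value returned to alpha
        stencil.append(1 if alpha > 0.0 else 0)   # alpha test kills unlit fragments
    return stencil


def lighting_pass(distances, stencil, light_range, shade):
    """Pass 2: the stencil test rejects untagged fragments before shading."""
    out = []
    for d, s in zip(distances, stencil):
        # Rejected fragments never reach `shade`, so they cost nothing here.
        out.append(shade(d, light_range) if s == 1 else 0.0)
    return out


distances = [0.0, 0.5, 2.0, 3.0]            # per-fragment distance to the light
stencil = tag_pass(distances, 1.0)          # [1, 1, 0, 0]
lit = lighting_pass(distances, stencil, 1.0, simple_falloff)
```

In this toy case only two of the four fragments run the lighting function. On real hardware the saving depends on the stencil test actually being performed before shading.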

This demo should run on Radeon 9500 and up, and on GeForce FX 5200 and up.

3Dc
Sunday, June 27, 2004

This demo illustrates the use of the new 3Dc texture compression format, which is particularly suitable for normal map compression. It lets you compare the quality of 3Dc and DXT5 normal maps, and it lets you compare the performance of 3Dc and DXT compression against uncompressed textures.

The performance increase of 3Dc and DXT is well worth the effort. Some benchmark numbers:

Quality-wise, DXT5 is often usable, but in some situations it just won't cut it. 3Dc, on the other hand, gives very good quality for all normal maps I've tried.

The demo also illustrates how to implement detail normal mapping. One normal map adds the large features, and on top of that another normal map, sampled at a much higher frequency, adds the small details. As with standard detail mapping for base textures, this means you don't need very large textures to get a detailed image. A generic small or medium size detail normal map can be used together with a medium size normal map to get a final image that's easily comparable to what a normal map 4x4 times larger would give.
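
One common way to combine the two normal maps is sketched below in Python. It is an assumption about how such a combination can be done, not necessarily what the demo's shader does: the detail normal's x/y perturbation is added to the base normal and the result is renormalized. The function names, the `detail_strength` parameter and the tiling factor of 8 are made up for illustration.

```python
import math


def normalize(v):
    """Scale a 3-component vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def detail_uv(u, v, tiling=8.0):
    """Sample the detail map at a higher frequency by scaling the coordinates."""
    return (u * tiling) % 1.0, (v * tiling) % 1.0


def combine_normals(base, detail, detail_strength=1.0):
    """Add the detail normal's x/y perturbation to the base normal.

    Both inputs are tangent-space unit normals with z > 0 (an assumption).
    """
    x = base[0] + detail_strength * detail[0]
    y = base[1] + detail_strength * detail[1]
    return normalize((x, y, base[2]))
```

A flat detail normal (0, 0, 1) leaves the base normal unchanged, so the detail map only contributes where it actually has features.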

It will run with 3Dc on by default on Radeon X800 cards. On Radeon 9500 and up, and on GeForce FX 5200 and up, it will run, but with DXT5 as the only normal map compression option.

Fire2
Wednesday, April 7, 2004

This demo shows off a set of interesting (though hardly revolutionary) techniques. The fire is created with a standard particle system. However, instead of having just one texture for all particles, I'm using an animated texture. The particles change over time on their way from the bottom of the fire to the top. The problem is that if this is done with normal 2D textures, you need many draw calls to draw all the particles that need different textures. To solve that I'm using a 3D texture containing all slices. This way the whole particle system can be drawn with just one draw call, and I also get interpolation between the different slices through the texture filtering, which gives smoother animation. In addition, I have added rotation to the particles, which makes the fire look more natural. To further improve the fire, I have made sure that it shapes itself around the wood. This is done with some collision detection and some code that lets the particles assemble again after being spread by the wood. The result looks pretty good.

Also included is a simple wood shader, which doesn't need a lot of comments, and a simple terrain shader. The terrain shader mixes a grass and a dirt texture according to a 3D noise. This way you can make seemingly infinite non-repeating patterns, though in this demo the terrain is quite small, so this aspect isn't very visible. In the shader it's easy to modify how smooth the transition from grass to dirt is, whether you want more dirt than grass, and so on.
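
The grass/dirt mix can be illustrated with a small Python sketch. It is not the demo's shader. The noise value is taken as an input in [0, 1], and the `threshold` and `softness` parameters are hypothetical names for the two knobs described above (the grass-to-dirt balance and the smoothness of the transition).

```python
def smoothstep(edge0, edge1, x):
    """Hermite interpolation between 0 and 1, as in GLSL's smoothstep."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def terrain_color(grass, dirt, noise, threshold=0.5, softness=0.1):
    """Blend grass and dirt colors by a noise value in [0, 1].

    A higher `threshold` gives more grass; a larger `softness` (must be > 0)
    gives a smoother transition.
    """
    w = smoothstep(threshold - softness, threshold + softness, noise)
    return tuple((1.0 - w) * g + w * d for g, d in zip(grass, dirt))
```

Because the weight depends only on the noise value, the pattern never repeats as long as the noise itself doesn't.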

You will need a Radeon 9500+ or GFFX to run it.

Volume Lightmapping
Tuesday, March 2, 2004

This demo uses a technique that's similar to standard lightmapping, except that it is done in three dimensions. Every point in the room has its light visibility precomputed and stored in a volumetric lightmap. This way you only need a simple texture lookup to determine the amount of light that hits an object. The advantage of this technique is that it's cheap and you get soft shadows for free. The disadvantage is of course that lights must be static. Unlike 2D lightmaps, however, the 3D version is also useful for characters and dynamic objects.
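
The "simple texture lookup" is what a filtered 3D texture fetch does in hardware. As a rough CPU illustration, here is a trilinear lookup in Python. The nested-list layout `volume[z][y][x]` and the use of grid coordinates rather than normalized texture coordinates are assumptions made to keep the sketch short.

```python
def sample_volume(volume, x, y, z):
    """Trilinearly sample volume[z][y][x] at continuous grid coordinates."""

    def axis(coord, size):
        # Clamp to the grid and return (lower index, upper index, fraction).
        coord = min(max(coord, 0.0), size - 1.0)
        lo = min(int(coord), size - 2) if size > 1 else 0
        hi = min(lo + 1, size - 1)
        return lo, hi, coord - lo

    def lerp(a, b, t):
        return a + (b - a) * t

    z0, z1, fz = axis(z, len(volume))
    y0, y1, fy = axis(y, len(volume[0]))
    x0, x1, fx = axis(x, len(volume[0][0]))

    # Interpolate along x, then y, then z.
    c00 = lerp(volume[z0][y0][x0], volume[z0][y0][x1], fx)
    c01 = lerp(volume[z0][y1][x0], volume[z0][y1][x1], fx)
    c10 = lerp(volume[z1][y0][x0], volume[z1][y0][x1], fx)
    c11 = lerp(volume[z1][y1][x0], volume[z1][y1][x1], fx)
    return lerp(lerp(c00, c01, fy), lerp(c10, c11, fy), fz)


# Tiny 2x2x2 example: fully shadowed at z = 0, fully lit at z = 1.
volume = [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]
visibility = sample_volume(volume, 0.5, 0.5, 0.5)   # 0.5
```

A moving character would sample the volume at its own position every frame, which is what makes the 3D version usable for dynamic objects.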

The demo should run on Radeon 9500 and up and GFFX. You may have to do a registry tweak to enable GLSL on the GFFX at the moment.

Flag
Thursday, February 5, 2004

This is one of my coolest demos so far, in my own humble opinion. The demo simulates the physics of cloth. The cloth properly interacts with the spheres it falls over and slides off them due to its weight. The code to do this is surprisingly simple. The cloth is a rectangular field of points. Each point connects to its neighbors in all directions through imagined springs. For each frame the force, speed and direction are computed through Newton's simple laws of physics, and the position of each point is updated accordingly. In a second pass the normals are evaluated from the position of each point. The result is a very realistic cloth simulation.
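
A single simulation step along these lines could look like the Python sketch below. It is a simplified illustration, not the demo's code. It assumes unit-mass points, springs to the four direct neighbors only, semi-implicit Euler integration, and made-up values for the spring constant, damping and time step. Sphere collision and the normal pass are left out.

```python
import math


def make_grid(rows, cols, spacing=1.0):
    """Flat cloth hanging in the x/y plane, all points at rest."""
    pos = [[(c * spacing, -r * spacing, 0.0) for c in range(cols)] for r in range(rows)]
    vel = [[(0.0, 0.0, 0.0) for _ in range(cols)] for _ in range(rows)]
    return pos, vel


def step_cloth(pos, vel, pinned, rest, k=50.0, damping=0.98,
               gravity=(0.0, -9.8, 0.0), dt=0.01):
    """Advance every point by one time step and return (new_pos, new_vel)."""
    rows, cols = len(pos), len(pos[0])
    new_pos = [row[:] for row in pos]
    new_vel = [row[:] for row in vel]
    for r in range(rows):
        for c in range(cols):
            if (r, c) in pinned:
                new_vel[r][c] = (0.0, 0.0, 0.0)   # pinned points never move
                continue
            force = list(gravity)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    d = [pos[nr][nc][i] - pos[r][c][i] for i in range(3)]
                    length = math.sqrt(sum(x * x for x in d))
                    if length > 0.0:
                        # Hooke's law: pull along the spring in proportion to its stretch.
                        s = k * (length - rest) / length
                        for i in range(3):
                            force[i] += s * d[i]
            # Unit mass, so acceleration equals force (Newton's second law).
            v = tuple((vel[r][c][i] + force[i] * dt) * damping for i in range(3))
            new_vel[r][c] = v
            new_pos[r][c] = tuple(pos[r][c][i] + v[i] * dt for i in range(3))
    return new_pos, new_vel


pos, vel = make_grid(2, 2)
pos, vel = step_cloth(pos, vel, pinned={(0, 0), (0, 1)}, rest=1.0)
```

With the top row pinned, the bottom row starts to fall under gravity and the springs begin to stretch on the following steps. Adding springs to more distant neighbors would make the cloth less stretchable.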

The physics is done on the CPU, but is simple enough to be evaluated on the GPU with the upcoming superbuffers extension.

The lighting is done with GLSL, but there's also a vertex lighting fallback for cards that don't support GLSL. The fallback gives lower quality, but it means all cards should be able to run this demo.

2004-01-11:
Updated with more springs, which makes the cloth less stretchable and behave better. A standard GL vertex lighting path was also added.