Since then, Stephen contacted me – it turns out I got some details wrong, and he also provided me with some additional details about the techniques in his talk. I will give the corrections and additional details here.

What I described in the post as a “software hierarchical Z-Buffer occlusion system” actually runs completely on the GPU. It was directly inspired by the GPU occlusion system used in ATI’s “March of the Froblins” demo (described here), and indirectly by the original (1993) hierarchical z-buffer paper. Stephen describes his original contribution as “mostly scaling it up to lots of objects on DX9 hardware, piggy-backing other work and the 2-pass shadow culling”. Stephen promises more details on this “in a book chapter and possibly… a blog post or two” – I look forward to it.

The character occlusion was not performed using capsules, but via nonuniformly-scaled spheres. I’ll let Stephen speak to the details: “we transform the receiver point into ‘ellipsoid’-local space, scale the axes and lookup into a 1D texture (using distance to centre) to get the zonal harmonics for a unit sphere, which are then used to scale the direction vector. This works very well in practice due to the softness of the occlusion. It’s also pretty similar to Hardware Accelerated Ambient Occlusion Techniques on GPUs although they work purely with spheres, which may simplify some things. I checked the P4 history, and our implementation was before their publication, so I’m not sure if there was any direct inspiration. I’m pretty sure our initial version also predated Real-time Soft Shadows in Dynamic Scenes using Spherical Harmonic Exponentiation since I remember attending SIGGRAPH that year and teasing a friend about the fact that we had something really simple.”

My statement that the downsampled AO buffer is applied to the frame using cross-bilateral upsampling was incorrect. Stephen just takes the most representative sample by comparing the full-resolution depth and object IDs against the surrounding down-sampled values. This is a kind of “bilateral point-sampling” which apparently works surprisingly well in practice, and is significantly cheaper than a full bilateral upsample. Interestingly, Stephen did try a more complex filter at one point: “Near the end I did try performing a bilinearly-interpolated lookup for pixels with a matching ID and nearby depth but there were failure cases, so I dropped it due to lack of time. I will certainly be looking at performing more sophisticated upsampling or simply increasing the resolution (as some optimisations near the end paid off) next time around.”
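To make the "bilateral point-sampling" idea concrete, here is a minimal CPU-side sketch. Stephen did not give the exact comparison he uses, so the cost function below (a depth difference plus a large penalty for a mismatched object ID) and all the names and parameters are my own assumptions, not his implementation.

```python
def upsample_ao_point(full_depth, full_id, low_samples,
                      depth_weight=1.0, id_penalty=1.0e6):
    """Return the AO value of the most representative low-res sample.

    full_depth, full_id: depth and object ID of the full-resolution pixel.
    low_samples: (depth, object_id, ao) tuples for the surrounding
                 low-resolution texels (for example the four nearest).
    depth_weight, id_penalty: hypothetical tuning knobs for this sketch.
    """
    def cost(sample):
        depth, obj_id, _ao = sample
        c = depth_weight * abs(depth - full_depth)
        if obj_id != full_id:
            # A sample from a different object is almost never representative.
            c += id_penalty
        return c

    # Point-sample: take one low-res value as-is, with no interpolation.
    return min(low_samples, key=cost)[2]


neighbours = [(10.0, 1, 0.2), (2.1, 7, 0.9), (2.0, 3, 0.5), (9.5, 7, 0.4)]
ao = upsample_ao_point(full_depth=2.0, full_id=7, low_samples=neighbours)
```

Since only one sample is chosen per pixel, there is no weighted sum to normalize, which is where the savings over a full bilateral upsample come from.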

Although the technique builds upon previous ones, it does add several new elements, and works well in the game. The technique does suffer from multiple-occlusion; I wonder if a technique similar to the 1D “compensation map” used by Morgan McGuire might help.

I attended this year’s Gamefest back in February. Gamefest is a conference run by Microsoft, focusing on games development for Microsoft platforms (Xbox 360 and Windows). This year (unusually, due to the presence of prerelease information on Kinect, at the time still known as “Project Natal”) the conference was only open to registered platform developers. For this reason, I didn’t blog about it at the time (no sense in telling people about stuff they can’t see).

Recently (thanks to the Legalize Adulthood! blog) I became aware that the Gamefest 2010 presentations are online on the conference website, and available for anyone (not just registered Xbox 360 and Windows Live developers). I’ll briefly discuss which presentations I think are of most interest. First, the ones I attended and found interesting:

This was a very nice talk about baking lighting into volumes by John O’Rorke, Director of Technology at Monolith Productions. Monolith were trying to light a large city at night, where the character could traverse the city pretty freely both horizontally and vertically. Lots of instances and geometry Levels-of-Detail (LODs), lots of dynamic lights. A standard lightmap + light probe solution took up too much memory given the large surface area, and Monolith didn’t like the slow baking workflow involved, as well as the inconsistencies between static and dynamic objects.

Instead, Monolith stored light probes in volume textures. They tried spherical harmonics (SH) and didn’t like it (too much memory, too blurry to use for specular). F.E.A.R. 2 shipped with an approach similar to Valve’s “Ambient Cube” (6 RGB coefficients), which has the advantage of cheap shader evaluation. For their new game they went with a stripped-down version of this, which had a single RGB color and 6 luminance coefficients; this reduces the storage from 18 to 9 scalars per probe and it was hard to tell the difference. Besides memory, this also sped up the shaders (fewer cache misses) and gave them better precision (since the luminance and color can be combined in a way that increases precision). For HDR they used a scale value for each volume (the game had multiple volumes in it) – this also gave them good precision in dark areas. Evaluating the “luminance cube” is extremely cheap (details in the slides). John also described some implementation details to do with stenciling out areas of the screen, using MIP maps, and getting around 360 alignment issues with DXT1 textures (all volumes were stored as DXT1).
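For readers who haven't seen an ambient cube before, here is a rough sketch of how a "luminance cube" lookup could work. I am assuming the same squared-normal weighting that Valve's Ambient Cube uses; the slides have Monolith's actual shader, and the function name, face ordering and `hdr_scale` parameter below are just my placeholders.

```python
def eval_luminance_cube(normal, color, lum, hdr_scale=1.0):
    """Evaluate a 9-scalar probe: one RGB color plus 6 luminance values.

    normal:    unit-length surface normal (x, y, z).
    color:     the probe's single RGB color.
    lum:       luminance for the faces +X, -X, +Y, -Y, +Z, -Z
               (this ordering is an assumption of the sketch).
    hdr_scale: per-volume scale value.
    """
    luminance = 0.0
    for axis, n in enumerate(normal):
        # Pick the face this normal component points towards.
        face = 2 * axis + (1 if n < 0.0 else 0)
        # Squared components of a unit normal sum to 1, so they act as
        # blend weights between the three selected faces.
        luminance += (n * n) * lum[face]
    return tuple(c * luminance * hdr_scale for c in color)


probe_lum = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
rgb = eval_luminance_cube((0.0, 0.0, 1.0), (1.0, 0.5, 0.25), probe_lum, 2.0)
```

Only three of the six luminance values are touched for any given normal, and the color is applied once at the end, which is consistent with the evaluation being very cheap.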

Generation: the artists place lights (including area lights) and all the lights are baked (direct only, no global illumination (GI) bounces) during level packing. The math is simple – the tools just evaluated diffuse lighting for 6 normal directions at the center of each volume texel. Once the number of lights added by the artists started getting large this slowed down a bit so they added a caching system for the baked volumes. They eventually added GI support by rendering cube map probes in the game.
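The per-texel bake is simple enough to sketch in a few lines. This is only an illustration of "evaluate diffuse lighting for 6 normal directions": I am assuming unshadowed point lights with inverse-square falloff and scalar intensity, which is much simpler than what the actual tools handle (area lights, for one), and all names here are hypothetical.

```python
import math

# The six axis directions, in the order +X, -X, +Y, -Y, +Z, -Z.
AXES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def bake_texel(center, lights):
    """Return 6 diffuse lighting values for one volume texel.

    center: (x, y, z) position of the texel center.
    lights: list of (position, intensity) point lights (assumed model).
    """
    result = [0.0] * 6
    for position, intensity in lights:
        to_light = tuple(p - c for p, c in zip(position, center))
        dist2 = sum(d * d for d in to_light)
        if dist2 == 0.0:
            continue  # light sits exactly on the texel center; skip it
        dist = math.sqrt(dist2)
        direction = tuple(d / dist for d in to_light)
        for i, axis in enumerate(AXES):
            n_dot_l = sum(a * d for a, d in zip(axis, direction))
            if n_dot_l > 0.0:
                # Lambert term with inverse-square distance falloff.
                result[i] += intensity * n_dot_l / dist2
    return result


faces = bake_texel((0.0, 0.0, 0.0), [((0.0, 2.0, 0.0), 4.0)])
```

The cost grows with the number of lights times the number of texels, which fits with the bake slowing down as artists added more lights.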

Downsides: low resolution, bad for high contrast shadows, can get light or shadow bleeding through thin geometry. They use dynamic lights for high contrast / shadow casting lighting.

For the future they plan to cascade the volumes and stream them. They also tried raymarching against the volume to get atmospheric effects; this was fast enough on high-end PCs but not on consoles.

This great talk (by Stephen Hill from Ubisoft) went into detail on two rendering systems used in the game Splinter Cell: Conviction. The first was a software hierarchical Z-Buffer occlusion system. They used this in various ways to cull draw calls from shadows as well as primary rendering. The system could handle over 20,000 occlusion queries in around 1/2 millisecond. Results looked pretty good.
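In case the hierarchical Z-buffer idea is unfamiliar, here is a small sketch of the general technique. It is plain CPU code for illustration only (per the correction at the top of this page, the real system runs on the GPU), and it assumes a square power-of-two depth buffer where larger depth means farther away. The function names are mine.

```python
def build_depth_pyramid(depth):
    """Build a mip chain where each texel stores the farthest (max) depth
    of the 2x2 texels beneath it. depth is a square 2D list whose size
    is a power of two."""
    levels = [depth]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([[max(prev[2 * y][2 * x], prev[2 * y][2 * x + 1],
                            prev[2 * y + 1][2 * x], prev[2 * y + 1][2 * x + 1])
                        for x in range(half)] for y in range(half)])
    return levels


def is_occluded(levels, x0, y0, x1, y1, nearest_depth):
    """Conservatively test a screen rectangle (inclusive pixel bounds)
    whose closest point is at nearest_depth."""
    level = 0
    # Climb until the rectangle covers at most 2x2 texels.
    while level < len(levels) - 1 and (
            (x1 >> level) - (x0 >> level) > 1 or
            (y1 >> level) - (y0 >> level) > 1):
        level += 1
    mip = levels[level]
    farthest = max(mip[y][x]
                   for y in range(y0 >> level, (y1 >> level) + 1)
                   for x in range(x0 >> level, (x1 >> level) + 1))
    # Hidden only if the object lies behind everything in that region.
    return nearest_depth > farthest


# A near wall (depth 1) on the left half, empty space (depth 100) on the right.
depth_buffer = [[1.0, 1.0, 100.0, 100.0] for _ in range(4)]
pyramid = build_depth_pyramid(depth_buffer)
hidden = is_occluded(pyramid, 0, 0, 1, 1, nearest_depth=5.0)
```

Because coarse texels store the farthest depth, the test can report a hidden object as visible but never the reverse, so each query costs only a handful of lookups regardless of the object's screen size.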

Next, Stephen discussed the game’s ambient occlusion (AO) system. The game developers didn’t use screen-space ambient occlusion (SSAO), since they didn’t like the inaccuracy, cost, and lack of artist control. Instead they went for a hybrid baked system. Over background surfaces (buildings, etc.) they bake precomputed AO maps. The precomputation is GPU-accelerated, based on the GPU Gems 2 article “High-Quality Global Illumination Rendering Using Rasterization” (available here: http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter38.html). For dynamic rigid objects like tables, chairs, vehicles, etc. they precompute AO volumes (16x16x16 or so). Finally for characters, they analytically compute AO from an articulating model of “capsules” (two half-spheres connected by a cylinder). Ubisoft combine all of these (not trying to address double-occlusion, so results are slightly too dark) into a downsampled offscreen buffer. Rather than simple scalar AO, all this stuff uses a directional 4-number AO representation (essentially linear SH) so that they can later apply high-res normal maps to it when the offscreen buffer is applied. They figured out a clever way to map the math so that they can use blending hardware to combine these directional AOs into the offscreen buffer in a way that makes sense. The AO buffer is later applied using cross-bilateral upscaling. For the future Ubisoft would like to add streaming support for the AO maps and volumes to allow for higher resolution.

Stephen showed the end result, and it looked pretty good with a character running through a crowded scene, vaulting over tables, knocking down chairs, with nice ambient occlusion effects whenever any two objects were close. A system like this is definitely worth considering as an alternative to SSAO.

This excellent talk (by Wade Brainerd, who like me works in Activision’s Studio Central group) dives deep into a low-level description of Xbox 360 internals and the modified version of DirectX that it uses. A rare opportunity for people without registered console developer accounts to look at this stuff, which is relevant to PC developers as well since it shows you what happens under the driver’s hood.

This talk by NVIDIA contained basically the same stuff as the I3D paper Interactive Fluid-Particle Simulation using Translating Eulerian Grids, which can be found here: http://www.jcohen.name/. It was interesting to hear about such a high-end CUDA fluid sim system being integrated into a shipping game (even if only on the PC version) – they got some cool particle effects out of it with turbulence etc. These kinds of effects will probably become more common once a new generation of console hardware arrives.

This talk was about various ways to use DX11 Compute Shaders in graphics. It included stuff like fast computation of summed area tables for fast anisotropic blurring of environment maps and depth of field. The speakers also showed an A-buffer-like technique for order-independent transparency, and a tile-based deferred rendering system that was more efficient than using pixel shaders. Like the previous talk, this seemed like the kind of stuff that could become mainstream in the next console generation.
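As a quick refresher on why summed area tables are useful for blurring, here is a minimal serial sketch. The talk was about building them with compute shaders (presumably with parallel scans); this CPU version only shows the data structure and the four-lookup box sum, and the function names are mine.

```python
def build_sat(image):
    """Return a summed area table with one extra row and column of zeros,
    so sat[y][x] holds the sum of image[0..y-1][0..x-1]."""
    height, width = len(image), len(image[0])
    sat = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0.0
        for x in range(width):
            row_sum += image[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, x0, y0, x1, y1):
    """Sum of the image over the inclusive rectangle (x0, y0)-(x1, y1),
    using four table lookups regardless of rectangle size."""
    return (sat[y1 + 1][x1 + 1] - sat[y0][x1 + 1]
            - sat[y1 + 1][x0] + sat[y0][x0])


image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
sat = build_sat(image)
total = box_sum(sat, 1, 1, 2, 2)
```

Dividing the box sum by the rectangle's area gives a box blur whose cost does not depend on the blur width, and the rectangle need not be square, which is what makes per-pixel anisotropic filter sizes practical.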

This presentation discussed research published in the SIGGRAPH Asia 2009 paper “All-Frequency Rendering of Dynamic, Spatially-Varying Reflectance” (available here: http://research.microsoft.com/en-us/um/people/johnsny/). The presentation was by John Snyder, one of the paper authors. It’s similar to some other recent papers which represent normal distribution functions as a sum of Gaussians and filter them, but this paper does some interesting things with regards to supporting environment maps and transforming from half-angle to view space. Worth a read for people looking at specular shader stuff.

This talk was probably old hat to anyone with significant 360 experience but should be interesting to anyone who does not fit that description – it was a rare public discussion of low-level console details.

This talk was about combining physics with canned animation (similar to some of NaturalMotion’s tools). It looked pretty good. The basic idea is straightforward – an artist paints the tightness of the springs connecting the character’s joints to the skeleton playing the animation, and a state machine varies these tightness values based on animation and gameplay events.
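To illustrate the spring idea, here is a toy one-dimensional sketch of a simulated joint being pulled towards its animated target. The talk did not go into the integrator or the damping model, so the damped spring with semi-implicit Euler below, and every name in it, is my assumption rather than the presenters' method.

```python
def step_joint(position, velocity, target, tightness, damping, dt):
    """Advance one simulated joint coordinate by one time step.

    target:    where the canned animation wants the joint to be.
    tightness: spring stiffness; 0 leaves the joint fully to physics,
               large values make it follow the animation closely.
    damping:   velocity damping coefficient (assumed, to avoid ringing).
    """
    acceleration = tightness * (target - position) - damping * velocity
    velocity += acceleration * dt   # semi-implicit Euler: velocity first,
    position += velocity * dt       # then position using the new velocity
    return position, velocity


# A tight spring settles on the animated pose.
pos, vel = 0.0, 0.0
for _ in range(500):
    pos, vel = step_joint(pos, vel, target=1.0,
                          tightness=100.0, damping=20.0, dt=0.01)
```

A state machine like the one described would simply change the `tightness` argument over time, for example dropping it when the character is hit and raising it again as the character recovers.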

Illuminate Labs (the makers of Beast and Turtle) gave this talk about baked lighting. It was pretty basic for anyone who’s done work in this area but might be good to brush up with for people who aren’t familiar with the latest practice.

I attended Gamefest 2008 last week. Gamefest (formerly called Meltdown) is a Microsoft-run Windows and Xbox 360 game development conference. This year there were two notable announcements: XNA Community games (discussed in a previous blog post) and the first public disclosure of Direct3D 11.

Direct3D is, of course, the API used by most Windows games, but its importance extends beyond Windows. Direct3D features guide the development of graphics hardware in general, so these features are bound to show up in future consoles, as well as in OpenGL.

The announcement that Direct3D 11 would not be tied to the next version of Windows (as many had feared), and would be available on Windows Vista was very significant to Windows developers, many of whom complained about the tying of Direct3D 10 to Windows Vista. Direct3D 11 will also be available on Direct3D 9, 10, and 10.1 level graphics hardware (although the new features will not be available there, with the exception of some multithreading enhancements).

The fact that the Direct3D 11 API is a strict superset of the 10/10.1 API is also cause for relief among game developers. From Direct3D 9 to 10, the API went through extensive changes. These changes were mostly long-overdue cleanups and improvements, but they left developers supporting two very different APIs if they wanted to support the many customers using Windows XP and also expose the new Direct3D 10 hardware features.

This is the first part of a multi-part post which will summarize the essential facts about Direct3D 11, as known from the Gamefest slides. Eventually, the slides should show up on the XNA Presentations page.

Full disclosure of Direct3D 11 should occur later this year – the November 2008 DirectX SDK release will feature a preview version of the API, including full documentation and code samples.

Many of our readers are not professional game developers, but do graphics or game programming as a hobby or as part of their academic research. Microsoft’s XNA Game Studio is interesting since it allows free development of Xbox 360 games. To be precise, although the software is free, Xbox 360 development does require a $99/year premium membership – still a bargain compared to the many thousands of dollars required for a professional console development kit. However, the resulting games could only be played by other people with premium memberships – not exactly a mass market.

This week, at Gamefest, Microsoft announced that these “homebrew” games could now be sold to Xbox 360 owners in general. Interestingly, the games will not be selected by Microsoft themselves (although I am sure they will do some gatekeeping) but by the community (similarly to the selection of posts at Digg or Slashdot).