Description

For //build/ 2012, we wanted to showcase what Windows 8 can offer developers. There are a lot of projects showing off great things like contracts and Live Tiles, but we wanted to show off some of the lesser known features. This project focuses on one of those: stereoscopic 3D with DirectX 11.1.

Prior to DirectX 11.1, stereoscopic 3D required specific hardware and a custom API written for that hardware. With DX11.1, stereoscopic 3D has been "democratized." Any GPU that supports DirectX 11.1 can be connected to any device which supports stereoscopic 3D, be it a projector, an LCD TV, or anything else. Plug it in and DirectX does the rest.

From the software side of things, any DX11.1 application can determine if the connected display supports stereoscopic 3D, and choose to render itself separately for the player's left and right eye.

To showcase this feature, we decided to build a very simple game that would give the illusion of depth, but be easy to explain and play. What's easier than Pong? So, we built the world's most over-engineered game of 3D Pong named Maelstrom.

Software

Each player setup consists of two applications: the DirectX 11.1 game written in C++, and the touch-screen controller written in C#/XAML. Both are Windows Store applications. Since this is a two-player game, there are two instances of each application running, one per player. All four applications are networked together using StreamSockets from the Windows Runtime. The two controllers and player two's DirectX game connect to player one's DirectX game, which acts as the "master". Controller input is sent there, and, once the ball and paddle positions are calculated, the data is drawn for player one and sent to player two, which draws the world from the other player's perspective.

Direct3D Application

Getting Started with stereoscopic DirectX11, C++, and XAML

If you have never worked with DirectX before, it can be a little overwhelming at first. And even if you have worked with it in the past, targeting the new Windows 8 ecosystem, along with C++ and XAML, introduces some changes to how you may have designed your solution previously.

Fortunately, the Windows Dev Center for Windows Store Apps has some great samples to get you started, and we took full advantage of them to get up to speed. For a great, simple example of how to leverage the new stereoscopic feature in Direct3D 11.1, we started with the Direct3D Stereoscopic Sample, which shows the basic adjustments to the render loop for toggling your virtual cameras. For a great example of a simple game structure that also leverages stereoscopic rendering where available, the tutorial found at Walkthrough: a simple Windows Store game with DirectX is invaluable. Further in this article, we will dive deeper into the specifics of stereoscopic rendering in our game.

One thing to note: if you follow the link in the above walkthrough to the original project, it will take you to a C++-only implementation of the game. Now, of course, all the DirectX game objects such as the paddle, puck, and walls are rendered using D3D. However, for HUD (heads-up display) elements, this C++-only sample also uses DirectX exclusively. If you are coming from a managed-code background, this will definitely seem like unnecessary overhead. That is because this sample was created shortly after the BUILD conference in 2011, when C++ and DirectX still did not play well with XAML.

However, a few months later, the ability to nest DirectX content in a XAML project became available for true hybrid-style solutions (see the article DirectX and XAML interop - Windows Store apps using C++ and DirectX for more information). After this feature was added, the simple shooter game referenced above had its HUD logic rewritten in XAML and was posted to Dev Center as the XAML DirectX 3D shooting game sample, which shows stereoscopic support, a simple game engine structure in C++, and XAML integration. At this point, we had all the starter code we needed to start writing our own game.

Game Engine

We modified the base sample to accommodate our needs. We created specific GameObjects, such as Paddle, Puck, etc. to add the behaviors we needed. We also added an Update and Render method to the base GameObject so that, for every frame, we could do any calculations required, and then draw the object to the screen. This is very similar to how XNA sets up its game engine.
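In rough shape, the base class and per-frame loop look something like this (a hypothetical sketch; the real game's objects also carry positions, velocities, and D3D resources):

```cpp
#include <memory>
#include <vector>

// Hypothetical sketch of the base class described above.
struct GameObject {
    virtual ~GameObject() = default;
    // Called once per frame to advance the simulation (dt in seconds).
    virtual void Update(float dt) { }
    // Called once per frame (once per eye, in stereo) to draw the object.
    virtual void Render() { }
};

// A derived object only overrides what it needs.
struct Paddle : GameObject {
    float y = 0.0f, velocityY = 0.0f;
    void Update(float dt) override { y += velocityY * dt; }
};

// The engine's frame then reduces to two simple passes over the objects.
inline void Tick(std::vector<std::unique_ptr<GameObject>>& objects, float dt) {
    for (auto& o : objects) o->Update(dt);
    for (auto& o : objects) o->Render();
}
```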

Game Constants

Because we were tweaking a variety of values like colors, sizes, camera locations, etc., we created a GameConstants.h header file which contains nothing but these types of values in a single location. This made it very easy for us to quickly try out various tweaks and see the results on the next run. Using namespaces helped keep the code a bit more manageable here as well. Here’s a quick snippet of that file:
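A sketch of what such a file looks like (the names and values below are illustrative, not the shipped ones):

```cpp
// GameConstants.h -- hypothetical reconstruction of the tweakables header.
#pragma once

namespace GameConstants
{
    namespace Paddle
    {
        static const float Width  = 2.0f;
        static const float Height = 2.0f;
        static const float Speed  = 8.0f;    // world units per second
    }

    namespace Puck
    {
        static const float Radius     = 0.5f;
        static const float BaseSpeed  = 12.0f;
        static const float BoostSpeed = 24.0f; // after a speed boost
    }

    namespace Camera
    {
        static const float FieldOfView = 70.0f; // degrees
        static const float NearPlane   = 0.01f;
        static const float FarPlane    = 1000.0f;
    }
}
```

Grouping related values into nested namespaces means call sites read naturally, e.g. `GameConstants::Puck::BaseSpeed`.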

Stereoscopic 3D

Direct3D must be initialized properly to support stereoscopic displays. When the swap chains are created, an additional render target is required, such that one render target is for the left eye, and one render target is for the right eye. Direct3D will let you know if a stereoscopic display is available, so you can create the swap chain and render targets appropriately.

With those in place, it’s simply a matter of rendering your scene twice, once per eye…that is, once per render target.

For our game this was very simple. Our in-game camera contains two projection matrices, one representing the view from the left eye, and one from the right eye. These are calculated when the projection parameters are set.

Depending on which eye we are rendering, we grab the appropriate projection matrix and pass it down to the vertex shader, so the final scene is rendered offset for the proper eye.
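The per-eye projection can be sketched as an off-center frustum calculation, a common way to build the two matrices (the function and parameter names here are ours, not the sample's):

```cpp
#include <cmath>

// Each eye gets an off-center frustum: the near-plane bounds are shifted
// horizontally by an amount derived from the eye separation, so the two
// views converge on a plane at 'convergence' distance.
struct Frustum { float left, right, bottom, top; };

Frustum EyeFrustum(float fovYRadians, float aspect, float nearZ,
                   float eyeSeparation, float convergence, bool leftEye)
{
    float halfH = nearZ * std::tan(fovYRadians * 0.5f); // near-plane half height
    float halfW = halfH * aspect;                       // near-plane half width
    // Horizontal shift that makes the two frustums converge.
    float shift = (eyeSeparation * 0.5f) * nearZ / convergence;
    float sign  = leftEye ? +1.0f : -1.0f;
    return { -halfW + sign * shift, halfW + sign * shift, -halfH, halfH };
}
```

These bounds would then feed an off-center perspective matrix (e.g. DirectXMath's `XMMatrixPerspectiveOffCenterLH`), one matrix per eye.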

Collision Detection

If you are just starting to move into 3D modeling and programming, one of the trickier aspects of your game can be collision detection and response. Maelstrom uses primitives for all of the game elements, so our collision code could be more straightforward than complex mesh collisions would require, but understanding a few core math concepts is still critical to grasp what the code is doing.

Fortunately, DirectX provides us with the DirectX Math library, which is able to do the serious heavy lifting, so the main complexity comes from framing the problem and learning how to apply the library.

For example, in our situation we had up to three very fast-moving spheres and needed to check for wall collisions and then handle the appropriate bounce, since some of the walls would also be angled. In a 2D game, collision detection between a circle and an axis-aligned line is very easy: if the distance between the circle's center and the line is less than or equal to the circle's radius, they are touching. On every frame, you move your circle based on its velocity and do your collision test again. But even here, your solution may not be that easy, for two reasons.

First, what if the line is angled and not lying flat on the X or Y axis? You have to find the point on the line that is closest to the circle to do your distance calculations. And if you then want it to bounce, you have to rotate the velocity of the circle by the line's angle, calculate your bounce, and then rotate back. And that's just rotated walls in 2D. When you move up to 3D, you have to take the surface normal (which way the 3D plane is facing) into account in your calculations.

The second complexity we needed to account for, which pops up in either 2D or 3D collision detection, is travel between frames. In other words, if your ball is travelling very fast, it may have completely passed through your collision boundary between frames, and you wouldn't notice it if you were only doing the distance / overlap check outlined above. In our case, the pucks could travel very fast with a speed boost, so we needed a more robust solution. Therefore, instead of implementing a simple sphere-plane intersection test, we created a line of motion representing where the ball ended on the previous frame and where it currently is after its new velocity is added to its position. That line is first tested to see if it crosses a WallTile. If it does cross, we know a collision has occurred between frames. We then solve for the time (t) between frames at which the sphere would have first made contact, giving us the exact point of impact, and calculate the appropriate "bounce off" direction.

The final code for a puck (or moving sphere) and wallTile collision test looks like this:
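The shipped code isn't reproduced here, but the core of the swept test can be sketched like this (a minimal version against an infinite plane; the real test also confines the hit point to the WallTile's bounds):

```cpp
#include <cmath>

// Minimal 3D vector helpers for the sketch.
struct Vec3 { float x, y, z; };
inline Vec3  operator-(Vec3 a, Vec3 b){ return {a.x-b.x, a.y-b.y, a.z-b.z}; }
inline Vec3  operator*(Vec3 a, float s){ return {a.x*s, a.y*s, a.z*s}; }
inline float Dot(Vec3 a, Vec3 b){ return a.x*b.x + a.y*b.y + a.z*b.z; }

// Returns true if a sphere of radius r moving from p0 (last frame) to p1
// (this frame) crosses the plane dot(n, p) = d during the frame; outputs
// the time of first contact t in [0,1] and the reflected velocity.
bool SweptSpherePlane(Vec3 p0, Vec3 p1, Vec3 n, float d, float r,
                      Vec3 velocity, float& t, Vec3& bounced)
{
    float dist0 = Dot(n, p0) - d;   // signed distance at frame start
    float dist1 = Dot(n, p1) - d;   // signed distance at frame end
    if (dist0 < r || dist1 >= r)    // didn't cross the radius shell
        return false;
    t = (dist0 - r) / (dist0 - dist1);                  // time of first contact
    bounced = velocity - n * (2.0f * Dot(velocity, n)); // reflect off the plane
    return true;
}
```

The reflection `v - 2(v·n)n` is the standard "bounce off" formula; in the real game the same math runs through the DirectX Math library's vector types.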

Drawing Maelstrom

To draw the game, we wanted to use some advanced techniques. We decided to go with a light pre-pass deferred rendering pipeline with normal mapping. That’s a lot of jargon but it isn’t all that complicated once you know what the jargon means, so let’s break it down.

When you draw something in 3D, there are three things that come together to determine the final color of each pixel on the screen: meshes, materials, and lights. A mesh is a collection of triangles that make up a game object (such as a wall tile in Maelstrom). On its own, a mesh is just a bunch of dots and lines. A material makes a mesh look like something. It could be as simple as a solid color but usually it’s a texture and sometimes it’s more (the wall tiles in Maelstrom use both a texture and a normal map to define their material properties). Lastly, lights transform materials by determining how bright they should appear and what sort of tint, if any, they should have. Without lights you would either have complete darkness or you would have flat lighting (where everything has a uniform brightness and adding a tint color would uniformly tint everything on the screen).

Forward Rendering vs. Deferred Rendering vs. Light Pre-Pass Rendering

The simplest approach to drawing 3D graphics is something called forward rendering. With forward rendering, drawing consists of rendering the mesh and calculating its material and all the lights that affect the material all at the same time. The more lights you add, the more complicated your shaders become since you have to determine whether each light affects the material and if so how much. (Ok, so there’s also multi-pass forward rendering, but that has its own problems – more passes mean longer render times and thus a lower frame rate – and we wanted to keep the descriptions simple).

In the last 5 years, many games started using a technique called deferred rendering. In classic deferred rendering, there are two rendering passes. The first pass renders the positions, normals, and material values of all the meshes in the scene to something called a G-Buffer (two or more render targets); nothing is actually drawn to the screen in this first pass. The second pass uses the data from the G-Buffer (which tells us everything we need to know about the geometry that appears at each screen pixel) and combines it with the lights to create the final image that you see. By doing this, we decouple geometry and lighting. This makes it possible to add more lights to the scene with a much smaller performance impact than in forward rendering since we don’t need to create a really complex pixel shader to handle all the lights (single-pass forward rendering) or draw the geometry over and over again for each light (multi-pass forward rendering).

There are drawbacks to classic deferred rendering though. Even a minimal G-Buffer takes up quite a bit of memory and the more different types of materials you want to support, the larger the G-Buffer will need to be. Wolfgang Engel, an XNA/DirectX MVP, came up with a variation on deferred rendering which he called Light Pre-Pass Rendering. This is a three pass technique. We once again use a G-Buffer, but in this case it is smaller than the classic deferred rendering G-Buffer and can even be squeezed down to a single render target which makes it viable for graphics hardware which does not support drawing to multiple render targets at the same time.

The G-Buffer is created in the first pass by rendering all the scene geometry. It only needs to store normals and the geometry's world position. For simplicity, we stored the world position of the geometry at each screen position in one render target and its normal in a second render target.

The next pass draws the lights to a light accumulation buffer. The buffer starts out entirely dark and each light that is rendered adds brightness (and tint, if any) to the light buffer. These lighting calculations take into account the normal and world position of the geometry that is at each screen position, drawing the values from the G-Buffer, such that each light only affects the pixels it is supposed to have an impact on. In Maelstrom we ended up only using point lights (spheres of light that fade out as you get further from the light’s position), but you can use any kind of light you can imagine (spot lights and directional lights are the two other common light types). Adding more lights has a very low impact on rendering time and this kind of lighting tends to be much easier for the designer to work with since there’s no need for him or her to understand HLSL or even any complicated C++ in order to add, remove, reposition, or otherwise change any lights.
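The per-light math can be sketched like this (our own minimal diffuse point-light model, not the game's shaders):

```cpp
#include <algorithm>
#include <cmath>

struct V3 { float x, y, z; };
inline V3 Sub(V3 a, V3 b){ return {a.x-b.x, a.y-b.y, a.z-b.z}; }
inline float Dot3(V3 a, V3 b){ return a.x*b.x + a.y*b.y + a.z*b.z; }

// A point light: a sphere of light that fades out toward its radius.
struct PointLight { V3 position; float radius; float intensity; };

// Diffuse contribution of one light at one G-Buffer sample; in the light
// pass this value is *added* to the light accumulation buffer.
float LightContribution(V3 worldPos, V3 normal, const PointLight& l)
{
    V3 toLight = Sub(l.position, worldPos);
    float dist = std::sqrt(Dot3(toLight, toLight));
    if (dist >= l.radius) return 0.0f;           // outside the light's sphere
    V3 L = { toLight.x/dist, toLight.y/dist, toLight.z/dist };
    float nDotL   = std::max(0.0f, Dot3(normal, L)); // facing the light?
    float falloff = 1.0f - dist / l.radius;          // fade toward the edge
    return l.intensity * nDotL * falloff;
}
```

Because each light only touches the pixels inside its sphere of influence, adding more lights stays cheap, which is the whole point of deferring the lighting.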

The final pass draws the geometry a second time. This time, though, all the lighting calculations are done so all we do here is just render the meshes with their appropriate materials, adjust the color values and intensities from the material based on the light buffer value, and we’re done. Each rendering style (forward, deferred, and light pre-pass) has its own benefits and drawbacks, but in this case light pre-pass was a good solution and choosing it let us show how a state-of-the-art graphics technique works.

Normal Mapping

We also incorporated normal mapping. Normal mapping makes use of a special texture (a normal map) in addition to the regular texture that a material has. Normals are values used in lighting calculations to determine how much a particular light should affect a particular pixel. If you wanted to draw a brick wall, you would typically create two triangles that lined up to form a rectangle and you would apply a texture of a brick wall to them as their material. The end result of that doesn't look very convincing, though, since unlike a real brick wall there are no grooves in the mortared area between each brick; our brick and mortar is just a flat texture applied to flat triangles. We could fix this by changing from two triangles to a fully modeled mesh with actual grooves, but that would add thousands of extra vertices, which would lower the frame rate.

So instead we use a normal map, which fakes it. One of the reasons that the two triangles + a brick wall texture approach doesn’t look right is because the lighting doesn’t behave correctly when compared to a real brick wall (or to a fully modeled mesh of a real brick wall). The normals point straight out perpendicular from the face of the rectangle whereas if we had the fully modeled mesh with actual grooves, the surface normals would only point straight out on the bricks themselves and they would curve along the mortared areas such that the lighting calculations would end up giving us the right levels of light and dark depending on the location and direction of the light. That’s where a normal map comes in. The normal map (which you can generate using a plugin for Adobe Photoshop or GIMP or by modeling a real brick wall in 3DSMax, Maya, or Blender which you then “bake” a normal map from) allows us to get the same lighting effect as we would with a fully modeled mesh while still keeping the simple two triangle + a brick wall texture approach that gives us really good performance for our game. There are limits to the effectiveness of normal mapping (you can’t use it to fake anything too deep and it doesn’t hold up as well if the camera can get really close to the object) but in Maelstrom it allowed us to keep the walls as simple triangles (like the two triangles + a brick wall texture example above) while making it seem like there were geometric grooves in the wall. Here’s a before and after screenshot using normal mapping:
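The underlying lighting math can be sketched in a few lines (illustrative; real shaders also transform the sampled normal from tangent space into world space before lighting):

```cpp
#include <algorithm>
#include <cmath>

struct N3 { float x, y, z; };

// Normal-map texels store XYZ components remapped into [0,1]; decode a
// sampled texel back into a unit normal in [-1,1].
N3 DecodeNormal(float r, float g, float b)
{
    N3 n{ r*2.0f - 1.0f, g*2.0f - 1.0f, b*2.0f - 1.0f };
    float len = std::sqrt(n.x*n.x + n.y*n.y + n.z*n.z);
    return { n.x/len, n.y/len, n.z/len };
}

// Lambertian diffuse term: how strongly the light hits this texel. A
// perturbed normal changes this value, which is what makes flat
// triangles appear grooved.
float Diffuse(N3 n, N3 lightDir)
{
    return std::max(0.0f, n.x*lightDir.x + n.y*lightDir.y + n.z*lightDir.z);
}
```

A "flat" texel (the characteristic light-blue of normal maps, RGB ≈ 0.5, 0.5, 1.0) decodes to a straight-out normal and lights exactly like the plain rectangle; any other texel tilts the normal and darkens or brightens that pixel as if the groove were really there.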

Post-Processing Effects

We also used several post-processing effects. The first was the bloom effect. Bloom is an effect that analyzes a rendered image, identifies parts that are above a certain brightness threshold, and makes those areas brighter and adds a peripheral glow to them as well, giving it a look and feel that is similar to a neon sign or to the light cycles in the movie Tron. Here’s the same shot as above with the addition of bloom:

We also made use of two different damage effects. Whenever the player took damage, we had a reddish tinge around the edge of the screen. This was simply a full screen overlay texture that is actually white but is tinted red by the shader. It is alpha-blended over the final rendered scene and fades out over the course of a couple of seconds. Rather than fading out linearly, we use a power curve which helps to sell the effect as being more complicated than it really is.
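The fade itself boils down to a one-liner (the exponent here is a guess; the actual value was tuned by hand):

```cpp
#include <cmath>

// Alpha for the damage overlay: 1 at the moment of the hit, 0 when the
// fade is done. A power curve drops quickly at first and then tapers,
// which reads as more organic than a straight linear fade.
float DamageAlpha(float elapsed, float duration, float exponent = 3.0f)
{
    if (elapsed >= duration) return 0.0f;
    float remaining = 1.0f - elapsed / duration; // runs 1 -> 0 over the fade
    return std::pow(remaining, exponent);
}
```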

Lastly we added in some damage particles. The particles themselves were created using a geometry shader. The vertex shader took in a series of points in world space and passed these points along to the geometry shader. The geometry shader expanded these points into two triangles by generating the missing vertices and applying the world-view-projection transformation matrix to transform the positions from world coordinates to homogeneous coordinates so that they can then be rasterized correctly by D3D and the resulting pixels passed along to the pixel shader. Once again we used a simple texture with alpha blending to simulate much more complicated geometry than we were actually drawing. In this case we also made use of a texture atlas (an image made up of smaller images) which, in conjunction with the randomizer we used to generate the initial vertices for the particles, allowed us to have several different particle textures. Like with the power curve for the damage texture, the texture atlas allowed us to make the particles seem more complex than they really were. It also let us show off the use of a geometry shader, a feature that was added in DirectX 10 and requires DirectX 10 or higher hardware.
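The expansion step the geometry shader performs can be modeled on the CPU like this (illustrative; the real shader also applies the world-view-projection matrix and selects a texture-atlas tile per particle):

```cpp
#include <array>

struct P3 { float x, y, z; };

// Expand one particle point into a camera-facing quad (two triangles as
// a strip) by generating the four missing corner vertices. 'right' and
// 'up' are the camera's basis vectors so the quad always faces the view.
std::array<P3, 4> ExpandToQuad(P3 center, P3 right, P3 up, float halfSize)
{
    auto corner = [&](float sx, float sy) {
        return P3{ center.x + (right.x*sx + up.x*sy) * halfSize,
                   center.y + (right.y*sx + up.y*sy) * halfSize,
                   center.z + (right.z*sx + up.z*sy) * halfSize };
    };
    // Triangle-strip order: bottom-left, bottom-right, top-left, top-right.
    return { corner(-1,-1), corner(1,-1), corner(-1,1), corner(1,1) };
}
```

Doing this on the GPU means the vertex buffer only ever holds one point per particle; the other three vertices never exist on the CPU at all.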

Audio

All audio was done using the XAudio2 API. Thankfully, we were able to get a huge head start by using some of the code from the sample project we started from. The audio engine sets up the very basics of XAudio2, and then wraps that with a simpler API for the rest of the application to call.

We don’t have many sound effects, so on startup we load all sound effects and music cues into a std::map, keyed on a SoundCue enum. Sounds are loaded using the Media Foundation classes, and the resulting byte data of the sound (plus some format information) is stored in our SoundEffect class.

When the game needs to play a sound, it simply calls the PlaySound method, passing in the cue to play, and the volume to play it at. PlaySound keys into the sound map, getting the associated SoundEffect, and plays it.
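A minimal sketch of that lookup (the SoundCue values and SoundEffect fields here are assumptions; the real PlaySound submits the bytes to an XAudio2 source voice at the requested volume):

```cpp
#include <cstdint>
#include <map>
#include <vector>

enum class SoundCue { PaddleHit, WallHit, Score, Music };

struct SoundEffect {
    std::vector<uint8_t> data;    // decoded byte data (via Media Foundation)
    uint32_t sampleRate = 44100;  // basic format information
};

class AudioEngine {
public:
    void Load(SoundCue cue, SoundEffect effect) {
        m_sounds[cue] = std::move(effect);
    }
    // Keys into the sound map and returns the associated effect (the real
    // engine plays it through XAudio2 instead of returning it).
    const SoundEffect* PlaySound(SoundCue cue, float /*volume*/) {
        auto it = m_sounds.find(cue);
        return it == m_sounds.end() ? nullptr : &it->second;
    }
private:
    std::map<SoundCue, SoundEffect> m_sounds;
};
```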

MJPEG Cameras

To achieve the effect of seeing the opponent in stereoscopic 3D, we strapped two Axis M1014 network cameras side-by-side. Using Brian’s MJPEG Decoder library, with a special port to Windows Runtime, individual JPEG frames were pulled off each camera, and then applied to a texture at the back of the arena. The image from the left camera is drawn when DirectX renders the player’s left eye, and the frame from the right camera is drawn when DirectX renders the right eye. This is a cheap and simple way to pull off live stereoscopic 3D.

With the distance between the cameras being about the distance between human eyes (called the interaxial distance), the effect works pretty well!

Tablet/Controller

The Tablet controller is the touch screen that lets the player control their 3D paddle in the Game Console app. For this part of the game system, there wasn't a reason to dive deep into DirectX and C++, since the controller is neither stereoscopic nor visually intense, so we kept things simple with C#.

Since the controller would also serve as our attract screen in the podium to entice potential players, we wanted the wait screen to do something eye-catching. However, if you are moving from C# in WPF to C# and XAML in WinRT and are used to taking advantage of some of the more common "memory hoggish UX hacks" from WPF, you'll quickly find them absent in WinRT! For example, we no longer have OpacityMask, non-rectangular clipping paths, or the ability to render a UIElement to a bitmap. Our bag of UX tricks may be in need of an overhaul. However, what we do get in C# / XAML for WinRT is Z rotation, something we've had in Silverlight but that I personally have been begging for in WPF for a long time.

Therefore, the opening animation in the controller is a procedurally generated effect that rotates PNG "blades" in 3D space, creating a very compelling effect. Here is how it works. The Blade user control is a simple canvas that displays one of a few possible blade images. The Canvas has a RenderTransform to control the scale and rotation and a PlaneProjection which allows us to rotate the blade graphic in X, Y and Z space.

Since each Blade is assigned an individual speed and angle of rotation along all three axes, they have a very straightforward Update function. The reason we keep the rotation values between -180 and 180 during the spinning loop is to make it easier to spin them out to zero when we need them to eventually leave the screen.
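The wrapping step amounts to a small helper like this (our own sketch of the idea):

```cpp
// Keep an accumulating rotation in (-180, 180] during the spin loop, so
// the "spin out to zero" exit animation always takes the short way round
// instead of unwinding many full revolutions.
float WrapDegrees(float angle)
{
    while (angle >  180.0f) angle -= 360.0f;
    while (angle <= -180.0f) angle += 360.0f;
    return angle;
}
```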

Network

To continue the experiment of blending C# and C++ code, the network communication layer was written in C# as a Windows Runtime component. Two classes are key to the system: SocketClient and SocketListener. Player one’s game console starts a SocketListener to listen for incoming connections from each game controller, as well as player two’s game console. Each of those use a SocketClient object to make the connection.

In either case, once the connection is made, the client and the listener sit and wait for data to be transmitted. Data must be sent as an object which implements our IGamePacket interface. This contains two important methods: FromDataReaderAsync and WritePacket. These methods serialize and deserialize the byte data to/from a packet of whatever type is specified in the PacketType property.
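The real component is written in C#; a C++ sketch of the same shape (with a hypothetical ControllerInputPacket and byte-level serialization standing in for the DataReader/DataWriter calls) looks like:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// The PacketType tag travels as the first byte so the receiver knows
// which concrete packet to deserialize.
enum class PacketType : uint8_t { GameState, ControllerInput };

struct IGamePacket {
    virtual ~IGamePacket() = default;
    virtual PacketType Type() const = 0;
    virtual void WritePacket(std::vector<uint8_t>& out) const = 0;
    virtual void FromBytes(const std::vector<uint8_t>& in) = 0;
};

// Hypothetical controller packet: just the paddle position.
struct ControllerInputPacket : IGamePacket {
    float paddleX = 0, paddleY = 0;
    PacketType Type() const override { return PacketType::ControllerInput; }
    void WritePacket(std::vector<uint8_t>& out) const override {
        out.push_back(static_cast<uint8_t>(Type()));  // 1-byte type tag
        auto put = [&](float f) {
            const uint8_t* p = reinterpret_cast<const uint8_t*>(&f);
            out.insert(out.end(), p, p + sizeof(float));
        };
        put(paddleX); put(paddleY);
    }
    void FromBytes(const std::vector<uint8_t>& in) override {
        auto get = [&](size_t off) {
            float f; std::memcpy(&f, in.data() + off, sizeof(float)); return f;
        };
        paddleX = get(1); paddleY = get(5);           // skip the type tag
    }
};
```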

Player one’s game console writes a GameStatePacket to player two’s game console, which consists of the positions of each paddle, each ball, the score, and which tiles are lit for the ball-splitter power-up. Player two’s Update and Render methods use this data to draw the screen appropriately.

Hardware

The hardware layer of this project is responsible for two big parts. One is a rumble effect that fires every time the player is hit, and the other is a lighting effect that changes depending on the game state.

As all good programmers do, we reused code from another project. We leveraged the proven web server from Project Detroit for our Netduino, but with a few changes. Here, we had static class “modules” which knew how to talk to the physical hardware, and “controllers” which handled items like a player scoring, game state animations, and taking damage. Because the modules are static classes, we can have them referenced in multiple classes without issue.

NETMF Web Server

When a request comes in, we perform the requested operation, and then return a new line character to verify we got the request. If you don’t return any data, some clients will actually fire a second request which then can cause some odd behaviors. The flow is as follows:

Making It Shake

We used a Sparkfun MP3 trigger board, a subwoofer amplifier, and bass rumble plates to create this effect. The MP3 board requires power, and two jumpers to cause the MP3 to play. It has an audio jack that then gets plugged into the amplifier which powers the rumble plates.

From here, we just needed to wire a ground to the MP3 player’s ground pin, and the target pin on the MP3 player to a digital IO pin on the Netduino. In the code, we declare it as an OutputPort and give it an initial state of true. When we get a request, we toggle the pin on a separate thread.

Lighting Up the Room

For lighting, we used some RGB lighting strips. The strips display a single color at a time, controlled by a PWM signal. This is different from the lighting we used in Project Detroit, which allowed us to individually control each LED and used SPI to communicate. We purchased an RGB amplifier to allow a PWM signal to power a 12-volt strip. We purchased ours from US LED Supply; the exact product was RGB Amplifier 4A/Ch for interfacing with a Micro-Controller (PWM/TTL Input).

We alter the Duty Cycle to shift the brightness of the LEDs and do this on a separate thread. Below is a stripped down version of the lighting hardware class.
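The stripped-down class isn't reproduced here; the core idea, translated from the NETMF C# into a minimal C++ sketch, is just a per-channel brightness-to-duty-cycle mapping (channel layout and value ranges are our assumptions):

```cpp
#include <cstdint>

// Each color channel's brightness (0..255) maps to a PWM duty cycle
// (0..100%); the real class pushes these values to the PWM pins on a
// background thread so animations don't block request handling.
class RgbStrip {
public:
    // channel: 0 = red, 1 = green, 2 = blue; level: 0..255.
    void SetChannel(int channel, uint8_t level) {
        m_duty[channel] = (level / 255.0f) * 100.0f; // percent duty cycle
    }
    float DutyCycle(int channel) const { return m_duty[channel]; }
private:
    float m_duty[3] = { 0.0f, 0.0f, 0.0f };
};
```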

Conclusion

We built this experience over the course of around four to five weeks. It was our first DirectX application, and our first C++ application, in a very long time. However, we were able to pick up the new platform and language changes fairly easily and create a simple yet fun game in that time period.