Yearly Archives: 2008

It’s a common practice not to check what malloc() or new returns, since in an ideal world you will never run out of memory and so allocating memory will never fail. With a reasonably sized page file that is mostly true. When using a page file, this wouldn’t be a problem since if there is free space on the drive the page file is located, virtual memory space can be temporarily enlarged (and a message pops up mentioning that) and the application that tried to allocate memory got what it asked.

However, since I upgraded to the maximum amount of memory Windows XP supports, and switched memory paging completely off, I’ve noticed some programs will silently fail when the system runs out of memory. And this isn’t that uncommon since I develop software and the crap programmer I am, silly mistakes are and always will be made that cause huge memory usage (and there’s also the rest of the software running on the computer). I have noticed at least Firefox simply disappears, while some software die with an error message.

For a game this wouldn’t be an issue (or a web browser even) but there are tons of programs downloading stuff or just hanging there that you’ll never notice missing. But it’s annoying to notice everything isn’t as you left it.

So, in case you haven’t thought of allocation failing, you probably should do something about it. When there’s no free memory left, even a malloc(sizeof(char)) will invariably fail. A lazy way out would be wrapping the allocation with something that checks if the allocation succeeded and if not, displaying a message box will halt the program until the user does something (kills the offending app reserving all that memory or frees some disk space for the page file) and then clicks “Retry” and then the wrapper simply tries to allocate again. In C++, the new operator can be overloaded.

C++

Here’s a desktop calculator I wrote because I thought the standard Windows Calculator sucks (which it does). I got annoyed of how you can’t paste more than a single number and how the scientific mode misses a simple shortcut to square root.

My approach is closer to the Google Search calculator feature in that the program only has two text fields, one for the expression and one for the result. It also does conversions and has user-definable functions, constants and whatnot.

Currently, the software is in alpha state (which means it works but is not finished) which I hope will not be perpetual. It is already very much usable, though.

What is left to do is that currently the software does only 80-bit floating point arithmetics (the best precision offered by x87) and displaying the result is double precision (thanks to the fact MinGW uses Windows libraries for printing floats, which have support only up to double precision). I also plan adding simple graphical features for graphing functions.

Links

A lot of sound science goes here, including quantum mechanics, relativity and such but they satisfy the peer-reviewed research requirement ↩

Creationism has tons of this, including using them as an authority rather than a guy who knows about things. And tons of Steves disagree. ↩

The whole climate change hassle fits here, albeit only because while the supporters probably are right, they use their energy to moan about the subject and boosting their egos instead of doing anything. ↩

Introduction

Here we describe a method for checking collisions between arbitrary objects drawn by the GPU. The objects are not limited to any shape, complexity or orientation. The objects that are checked for collisions can be the same objects that are drawn on the screen. The algorithm is pixel perfect but without further refining is limited to 2D collisions.

Description

The algorithm can be divided in three separate steps:

Determine which objects need testing

Set up the GPU buffers and draw the first object so the stencil buffer is set

Make an occlusion query using the and using the stencil from the previous buffer

If the occlusion query returns any pixels were drawn, the two objects overlap.

Determining objects that need to be tested

Figure 1.

To speed up the algorithm, we need to eliminate object pairs that can’t overlap. This can be done by projecting the object bounding volumes in screen space. Then, we can quickly check which 2D polygons overlap and do further checking. Choosing the shape of the bounding volumes/polygons is not relevant. In this document we will use rectangles for the simplicity.

The example case in figure 1 shows two colliding objects (overlapping bounding rectangles shown red), one near hit (green overlap) and one object with no collisions or bounding rectangle overlap. To determine the collisions (and non-collisions) we need checks for every bounding rectangle (15 checks) and two checks using the GPU1. Without the use of the bounding rectangles, we would need 13 more checks on the GPU.

The bounding rectangles in the example can be generated by finding the maximum and minimum coordinates and using them as the corner points of the rectangle. Since the algorithm finds collisions as seen from the camera, we need first project the object coordinates (or the object bounding volume coordinates) on the screen and find the minima and maxima using those coordinates. Most 3D frameworks provide functions for projecting a 3D point to screen coordinates using the same transformations as seen on screen2.

Drawing objects

Before we start drawing the objects on the graphics buffer, we need to clear the stencil buffer. After that, the GPU should be set to draw only to the stencil buffer, that is we don’t need the color buffer or depth buffer to be updated.

Now, the first object can be drawn. After that, we will draw the second object. Before we start we need to set up the occlusion query to count the drawn pixels. We also need to use the stencil buffer created in the previous step and set the GPU to draw only if the stencil buffer is non-zero. We don’t need to update any buffers at this point.

After the second object is drawn, the pixel count returned by the occlusion query tells how many pixels overlap. For most purposes, we can assume a non-zero value means the two objects collide3.

Figure 2.

Figure 3.

Figure 4.

Consider the above example. Figure 2 shows two objects as seen on screen. Figure 3 shows the stencil derived by drawing only the green (right) object. Figure 4 shows what will be drawn of the red (left) object if we limit the rendering to drawing on the stencil (the occlusion query will tell us six pixels were drawn).

Sprites

For sprite objects that consist of only one quad and an alpha blended texture with transparent area representing the area that cannot collide with anything, the same result can be achieved by setting the alpha testing to discard pixels below a certain tolerance. This will then avoid updating the stencil buffer on the whole area of the quad.

Optimizations

The stencil buffer can be created to contain the collision area of e.g. every enemy currently on the screen and then testing with the player object. This avoids the necessity to test between every object individually, which can be costly since the occlusion query has to finish before the next query can start. If the objects on the screen can be grouped to a handful of categories, such as the player, the enemies, player bullets and so on, multiple objects can be tested simultaneously where it is not important to know exactly which two objects collided (e.g. when deciding whether to kill the player which in a shooter game generally is because the player collides with the background, an enemy bullet or an enemy — all of which would be drawn in the same stencil buffer).

Ideally, the above can be done so that every object group has its own stencil buffer. This then makes it possible to create the stencil and start the queries in their respective buffers, do something useful while they execute and then query for the occlusion query results.

Caveats

Most of all, doing collision detection with occlusion queries is admittedly a curiosity. There are many, many ways to do collisions faster and most likely with less technicalities. Using a simple rectangle (a cube or a sphere in 3D applications) to test collisions between complex objects has been used since the first video games and it still provides acceptable accuracy. Indeed, some games even use an extremely small collision area (e.g. a single pixel) to test collisions with larger objects with great success. Additionally, the algorithm is potentially a bottleneck, especially when triple buffering etc. are used since occlusion queries generally are started after the previous frame is drawn and then read when the next frame is drawn (i.e. it’s asyncronous). This limits either the algorithm accuracy for real time (frame accuracy) or the framerate since we may have to wait for the query to finish for a relatively long time.

A serious consideration is to use the occlusion query algorithm to test between a complex object, e.g. with a large end of level boss with a complex, changing shape and use a simpler geometry based algoritm to test between hundreds of bullets. A hybrid algorithm could even first make an occlusion query to know which objects collide and then use a different algorithm to find out more about the collision.

This is inherently a 2D algorithm. The GPU sees any situations in which two objects overlap as a collision, as seen from the location of the camera. This if fine for side-scrollers et cetera, which also benefit from pixel perfect (fair, predictable) collisions, and other situations in which all the objects to be tested are on the same plane. Multiple depth levels (e.g. foreground objects never collide with background items) improve the situation and so does drawing only the parts of the objects that are between specified depth levels4.

Another possible problem is that the screen resolution may be different between systems, which makes the algorithm have different levels of accuracy. This can be avoided by doing the checks in a fixed resolution. Limiting the collision buffer resolution also improves the algorithm speed in cases the GPU fillrate is a bottleneck.

Finally, some older systems may not support occlusion queries or stencil buffers. As of 2008, this is increasingly rare but should be taken into consideration.

Appendix A. Source code for OpenGL

The following is source code extracted from a proof of a concept/prototype game written in C++ using OpenGL. The source code should be self explanatory.

The following can be used to set the rendering mode before drawing the object on the screen, on the stencil buffer or when making the occlusion query (note how we disable most of the rendering when drawing something that will not be visible to the player, also notice how the stencil buffer is cleared if we enable stencil drawing mode — using scissor makes this efficient):

This is from the main loop, it simply goes through all objects in the game. Notice how we first prepare the collision checking for the first object to be tested (i.e. create the stencil as in step 2 of the algorithm).

Note that DrawObject is the same method as we use to draw the visible object on the screen. The only thing that differs is that the drawing mode is set to stencil or occlusion query. Also worth noting is how we use the rectangle to set the scissor to further reduce the workload.

Links

To speed up the checks done on the GPU, we can set the scissor rectangle to the overlapping area. This helps in cases in which the overlapping area is small compared to the complete area of the object. ↩

Stop luring people on your trite news blogs with misleading titles. Sure, a mysterious or provocative title is a good way to get people interested, but it in worst case makes your blog (and you) look uninformed. Especially, when the first sentence of the news item doesn’t explain the inaccuracies of the main headline.

News should never be speculative and if that is the case, it has to be explicitly stated where the line between fact and fiction goes. Some news are meant to be taken as entertainment but that style of writing should not bleed to potentially important news items. At least I expect to get information from news, even if I just casually pick the interesting sounding ones.

I hereby give away another priceless idea: one-sentence news. That is, you do not need to read the full article after you have read the descriptive, 5-15 word headline. Maybe that’ll cut into the advertising profit for there are less visitors to your site from feed readers etc. but not providing full articles in a feed is yesterday. And if you really only write 20 words per article, well, I guess that could make it profitable if you divide your income by the amount of text you write.

Here’s an example: Popular Science reports “Genetic Material Found on Meteorite“. What went wrong? Take a guess. At the very least I expect to read about scientists finding very discriminating evidence on a meteorite with a blacklight, kinda like how they do it with dead hookers in CSI.

Even the subtitle right below the main title already downplays the sensational title and states “A meteorite in Australia has been found to contain component molecules of DNA”. And if you take the time to read the article, you’ll learn similar molecules that make up DNA are found on a meteorite (I bet a rather large percent of the molecules in your DNA most likely are entirely made on Earth). That’s like reporting there are organic molecules in space (look it up), and not explaining what “organic” actually means. Except in the end, right below the ad banner.

How could that be reported in one sentence and what are the benefits? First, you’ll deliver the exact same amount of information but quicker. “Similar Molecules as in Your DNA Found on a Meteorite”. “The Birth of DNA Molecules Could Have Been Helped by Completely Uninteresting Stuff on Meteorites Billions of Years Ago”. Hell, I don’t care if it’s a haiku. Just make it terse, informative and less stupid.