Picking Tutorial

The Name Stack

The OpenGL API provides a mechanism for picking objects in a 3D scene. This tutorial will show you how to detect which objects are below the mouse or in a square region of the OpenGL window. The steps involved in detecting which objects are at the location where the mouse was clicked are:

1. Get the window coordinates of the mouse

2. Enter selection mode

3. Redefine the viewing volume so that only a small area of the window around the cursor is rendered

4. Render the scene, either using all primitives or only those relevant to the picking operation

5. Exit selection mode and identify the objects which were rendered on that small part of the screen.
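Steps 1 and 3 involve a small coordinate conversion, which can be sketched in plain C as follows. This is only an illustration under stated assumptions: PickRegion and pick_region_from_mouse are hypothetical names, the mouse position is assumed to be reported with y measured from the top of the window (as most window systems do), and the viewport is assumed to cover the whole window.

```c
/* Hypothetical helper type: the small square region around the cursor. */
typedef struct {
    double x, y;          /* centre, in OpenGL window coordinates */
    double width, height; /* size of the region, in pixels */
} PickRegion;

/*
 * viewport holds {x, y, width, height}, the layout returned by
 * glGetIntegerv(GL_VIEWPORT, viewport).
 * Assumption: mouse_y is measured from the top edge of the window, whereas
 * OpenGL window coordinates start at the bottom-left corner, so y is flipped.
 */
PickRegion pick_region_from_mouse(int mouse_x, int mouse_y,
                                  const int viewport[4], double size)
{
    PickRegion r;
    r.x = (double)mouse_x;
    r.y = (double)(viewport[3] - mouse_y); /* flip y using the viewport height */
    r.width = size;
    r.height = size;
    return r;
}
```

In a real program the resulting values would typically be handed to gluPickMatrix(r.x, r.y, r.width, r.height, viewport) while building the projection matrix, between the calls to glRenderMode(GL_SELECT) and glRenderMode(GL_RENDER).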

In order to identify the rendered objects using the OpenGL API you must name all relevant objects in your scene. The OpenGL API allows you to give names to primitives, or sets of primitives (objects). When in selection mode, a special rendering mode provided by the OpenGL API, no objects are actually rendered in the framebuffer. Instead the names of the objects (plus depth information) are collected in an array. For unnamed objects, only depth information is collected.

Using the OpenGL terminology, a hit occurs whenever a primitive is rendered in selection mode. Hit records are stored in the selection buffer. Upon exiting the selection mode OpenGL returns the selection buffer with the set of hit records. Since OpenGL provides depth information for each hit the application can then easily detect which object is closer to the user.
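As a rough sketch of how the depth information can be used, the function below scans a selection buffer and reports the hit closest to the user. It relies on the record layout OpenGL uses in the selection buffer: the number of names, the minimum depth, the maximum depth, and then the names themselves, bottom of the stack first. NearestHit and find_nearest_hit are hypothetical names, and plain unsigned int stands in for GLuint so that the example compiles without the OpenGL headers.

```c
/* Summary of the nearest hit; names points into the selection buffer. */
typedef struct {
    unsigned int name_count;   /* how many names the record holds */
    unsigned int min_depth;    /* smaller means closer to the viewer */
    const unsigned int *names; /* first name of the record */
} NearestHit;

/*
 * Scans `hits` records, each laid out as
 *   [name count][min depth][max depth][name 0] ... [name n-1]
 * Returns 1 and fills *out if there is at least one record, 0 otherwise.
 */
int find_nearest_hit(const unsigned int *buffer, int hits, NearestHit *out)
{
    const unsigned int *p = buffer;
    int found = 0;

    for (int i = 0; i < hits; ++i) {
        unsigned int count = p[0];
        unsigned int zmin = p[1];

        if (!found || zmin < out->min_depth) {
            out->name_count = count;
            out->min_depth = zmin;
            out->names = p + 3;   /* names start after the three header words */
            found = 1;
        }
        p += 3 + count;           /* records have variable length */
    }
    return found;
}
```

Here `hits` would be the value returned by glRenderMode(GL_RENDER) when leaving selection mode, and `buffer` the array previously registered with glSelectBuffer.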

Introducing the Name Stack

As the title suggests, the names you assign to objects are stored in a stack. Actually you don't give names to objects, as in a text string. Instead you number objects. Nevertheless, since in OpenGL the term name is used, the tutorial will also use the term name instead of number.

When an object is rendered, if it intersects the new viewing volume, a hit record is created. The hit record contains the names currently on the name stack plus the minimum and maximum depth for the object. Note that a hit record is created even if the name stack is empty, in which case it only contains depth information. If more objects are rendered before the name stack is altered or the application leaves the selection mode, then the depth values stored on the hit record are altered accordingly.

A hit record is stored on the selection buffer only when the current contents of the name stack are altered or when the application leaves the selection mode.

The rendering function for the selection mode therefore is responsible for the contents of the name stack as well as the rendering of primitives.

OpenGL provides the following functions to manipulate the Name Stack:

void glInitNames(void);
This function creates an empty name stack. You are required to call this function to initialize the stack prior to pushing names.

void glPushName(GLuint name);
Adds name to the top of the stack. The stack's maximum depth is implementation dependent; according to the specification it must hold at least 64 names, which should be more than enough for the vast majority of applications. If you want to be sure, query the state variable GL_MAX_NAME_STACK_DEPTH with glGetIntegerv(GL_MAX_NAME_STACK_DEPTH, &depth), where depth is a GLint; GL_NAME_STACK_DEPTH returns the current depth of the stack instead. Pushing a name onto a full stack causes the error GL_STACK_OVERFLOW.

void glPopName(void);
Removes the name from the top of the stack. Popping a name from an empty stack causes the error GL_STACK_UNDERFLOW.

void glLoadName(GLuint name);
This function replaces the top of the stack with name. It is the same as calling

glPopName();
glPushName(name);

This function is basically a shortcut for the above snippet of code. Loading a name on an empty stack causes the error GL_INVALID_OPERATION.
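The behaviour of the four functions, including their error cases, can be mimicked with a few lines of plain C. This is a model for illustration only, not OpenGL code: NameStack, the ns_* functions and the NsError values are hypothetical, the capacity is fixed at the minimum of 64 mentioned above, and errors are returned directly, whereas OpenGL records them for later retrieval with glGetError.

```c
#define MAX_DEPTH 64  /* the minimum capacity cited above; real limits may be larger */

typedef enum {
    NS_OK,
    NS_OVERFLOW,          /* models GL_STACK_OVERFLOW */
    NS_UNDERFLOW,         /* models GL_STACK_UNDERFLOW */
    NS_INVALID_OPERATION  /* models GL_INVALID_OPERATION */
} NsError;

typedef struct {
    unsigned int names[MAX_DEPTH];
    int depth;            /* number of names currently on the stack */
} NameStack;

/* Models glInitNames: the stack becomes empty. */
void ns_init(NameStack *s) { s->depth = 0; }

/* Models glPushName: fails when the stack is already full. */
NsError ns_push(NameStack *s, unsigned int name)
{
    if (s->depth == MAX_DEPTH) return NS_OVERFLOW;
    s->names[s->depth++] = name;
    return NS_OK;
}

/* Models glPopName: fails when the stack is empty. */
NsError ns_pop(NameStack *s)
{
    if (s->depth == 0) return NS_UNDERFLOW;
    s->depth--;
    return NS_OK;
}

/* Models glLoadName: replaces the top name, fails when the stack is empty. */
NsError ns_load(NameStack *s, unsigned int name)
{
    if (s->depth == 0) return NS_INVALID_OPERATION;
    s->names[s->depth - 1] = name;
    return NS_OK;
}
```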

Note: Calls to the above functions are ignored when not in selection mode. This means that you may have a single rendering function with all the name stack functions inside it. When in the normal rendering mode the functions are ignored and when in selection mode the hit records will be collected.

Note: You can't call these functions between glBegin and glEnd, which is kind of annoying since in some cases it forces you to write a separate rendering function for the selection mode. For instance, if you have a set of points inside a single glBegin/glEnd block and you want to name them so that you can tell them apart, you must create one glBegin/glEnd block for each name.
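The numbered walkthrough that follows refers to a sequence of nine calls. The sketch below replays that sequence against a small stand-in for selection mode, so you can see when each hit record would be written. The sim_* functions, flush_hit, Record and run_walkthrough are hypothetical and exist only for this illustration; the comment on each line shows the call named in the walkthrough. The values chosen for BODY and HEAD are assumptions, depth values are left out, and each draw call is reduced to a flag saying whether it intersects the viewing volume.

```c
#define MAX_NAMES 64
#define MAX_RECORDS 8

enum { BODY = 1, HEAD = 2 };   /* hypothetical name values */

typedef struct {
    int count;                      /* number of names in the record */
    unsigned int names[MAX_NAMES];  /* copy of the name stack, bottom first */
} Record;

static unsigned int stack[MAX_NAMES];
static int depth;                   /* current name stack depth */
static int hit_pending;             /* set when a primitive hit since the last write */
static Record records[MAX_RECORDS]; /* simulated selection buffer */
static int record_count;

/* Writes the pending hit record, if any, to the simulated selection buffer. */
static void flush_hit(void)
{
    if (!hit_pending || record_count == MAX_RECORDS) return;
    Record *r = &records[record_count++];
    r->count = depth;
    for (int i = 0; i < depth; ++i) r->names[i] = stack[i];
    hit_pending = 0;
}

/* Every change to the name stack first writes out the pending record. */
static void sim_init_names(void) { flush_hit(); depth = 0; }
static void sim_push_name(unsigned int n) { flush_hit(); if (depth < MAX_NAMES) stack[depth++] = n; }
static void sim_pop_name(void) { flush_hit(); if (depth > 0) depth--; }

/* Stands in for a draw function; `intersects` says whether it causes a hit. */
static void sim_draw(int intersects) { if (intersects) hit_pending = 1; }

/* Replays the nine calls and returns the number of hit records written. */
int run_walkthrough(int body, int head, int eyes, int ground)
{
    record_count = 0; depth = 0; hit_pending = 0;

    sim_init_names();      /* 1  glInitNames();    */
    sim_push_name(BODY);   /* 2  glPushName(BODY); */
    sim_draw(body);        /* 3  drawBody();       */
    sim_pop_name();        /* 4  glPopName();      */
    sim_push_name(HEAD);   /* 5  glPushName(HEAD); */
    sim_draw(head);        /* 6  drawHead();       */
    sim_draw(eyes);        /* 7  drawEyes();       */
    sim_pop_name();        /* 8  glPopName();      */
    sim_draw(ground);      /* 9  drawGround();     */

    flush_hit();           /* leaving selection mode writes the last record */
    return record_count;
}
```

For example, run_walkthrough(1, 0, 1, 1) models a click that covers the body, the eyes and the ground: it produces one record named BODY, one named HEAD, and one with no names for the ground.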

1 glInitNames(); - This function creates an empty stack. This is required before any other operation on the stack such as Load, Push or Pop.

2 glPushName(BODY); - A name is pushed onto the stack. The stack now contains a single name.

3 drawBody(); - A function which calls OpenGL primitives to draw something. If any of the primitives called in here intersects the viewing volume a hit record is created. The contents of the hit record will be the name currently on the name stack, BODY, plus the minimum and maximum depth values for those primitives that intersect the viewing volume

4 glPopName(); - Removes the name from the top of the stack. Since the stack had a single item, it will now be empty. The name stack has been altered, so if a hit record was created in 3 it will be saved in the selection buffer

5 glPushName(HEAD); - A name is pushed onto the stack. The stack now contains a single name again. The stack has been altered, but there is no hit record so nothing goes into the selection buffer

6 drawHead(); - Another function that renders OpenGL primitives. Again if any of the primitives intersects the viewing volume a hit record is created

7 drawEyes(); - Yet another function which renders OpenGL primitives. If any of the primitives in here intersects the viewing volume and a hit record already exists from 6, the hit record is updated accordingly. The names currently in the hit record are kept, but if any of the primitives in drawEyes() has a smaller minimum, or a larger maximum depth, then these new values are stored in the hit record. If the primitives in drawEyes() do intersect the viewing volume but there was no hit record from drawHead then a new one is created.

8 glPopName(); - The name on the top of the stack is removed. Since the stack had a single name the stack is now empty. Again the stack has been altered so if a hit record was created it will be stored in the selection buffer

9 drawGround(); - If any of the primitives called in here intersects the viewing volume a hit record is created. The stack is empty, so no names will be stored in the hit record, only depth information. If no alteration to the name stack occurs after this point, then the hit record created will only be stored in the selection buffer when the application leaves the selection mode. This will be covered later in the tutorial.

Note: lines 4 and 5 could have been replaced by glLoadName(HEAD);

Note that you can push a dummy name (some unused value) right at the start, and afterwards just use glLoadName, instead of Pushing and Popping names. However it is faster to disregard objects that don't have a name than to check if the name is the dummy name. On the other hand it is probably faster to call glLoadName than it is to Pop followed by Push.

Using multiple names for an object

There is no rule that says that an object must have a single name. You can give multiple names to an object. Suppose you have a number of snowmen arranged on a grid. Instead of naming them 1, 2, 3, ... you could name each of them with the row and column where it is placed. In this case each snowman will have two names describing its position: the row and the column of the grid.

The following two functions show the two different approaches. First using a single name for each snowman.
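As a minimal sketch of the two naming schemes, the functions below compute the names a hit record would contain for the snowman at a given grid position. They are not the rendering functions themselves: GRID_COLS, names_single, names_row_col and decode_single are hypothetical, and the comments indicate where the real name stack calls would go.

```c
#define GRID_COLS 4  /* hypothetical grid width */

/*
 * Approach 1: one name per snowman.
 * In a renderer: glPushName(row * GRID_COLS + col); draw; glPopName();
 * Returns the number of names in the hit record.
 */
int names_single(unsigned int row, unsigned int col, unsigned int out[1])
{
    out[0] = row * GRID_COLS + col;
    return 1;
}

/* With a single name, the grid position has to be decoded afterwards. */
void decode_single(unsigned int name, unsigned int *row, unsigned int *col)
{
    *row = name / GRID_COLS;
    *col = name % GRID_COLS;
}

/*
 * Approach 2: two names per snowman.
 * In a renderer: glPushName(row); glPushName(col); draw;
 *                glPopName(); glPopName();
 * Assumption: the row is pushed first, so it comes first in the record.
 */
int names_row_col(unsigned int row, unsigned int col, unsigned int out[2])
{
    out[0] = row;  /* bottom of the name stack */
    out[1] = col;  /* top of the name stack */
    return 2;
}
```

With two names the row and column can be read straight from the hit record, whereas the single name has to be decoded and depends on the grid width.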

A natural extension of multiple naming is to find out where in a particular object you have clicked. For instance you may want to know not only which snowman you clicked on but also whether you clicked on the head or the body. The following function provides this information.

In this case you'll have three names when you click on either the body or head of a snowman: the row, the column and the value BODY or HEAD respectively.
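Reading such a three-name hit record back could look like the sketch below. SnowmanPick and decode_snowman_hit are hypothetical, the values of BODY and HEAD are assumptions, and the names are assumed to have been pushed in the order row, column, part, so that they appear in that order in the record.

```c
enum { BODY = 1, HEAD = 2 };  /* hypothetical name values */

typedef struct {
    unsigned int row;
    unsigned int col;
    unsigned int part;        /* BODY or HEAD */
} SnowmanPick;

/*
 * names: the name list of one hit record, bottom of the stack first.
 * Returns 1 and fills *out when the record has the expected shape,
 * 0 otherwise (for instance a hit on an unnamed object).
 */
int decode_snowman_hit(const unsigned int *names, unsigned int count,
                       SnowmanPick *out)
{
    if (count != 3) return 0;
    if (names[2] != BODY && names[2] != HEAD) return 0;

    out->row = names[0];
    out->col = names[1];
    out->part = names[2];
    return 1;
}
```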

A Note about Rendering on Selection Mode

As mentioned before, all calls to stack functions are ignored when not in selection mode, i.e. when rendering to the frame buffer. This means that you can have a single rendering function for both modes. However this can result in a serious waste of time. An application may have only a small number of pickable objects. Using the same function for both modes requires rendering a lot of non-pickable objects when in selection mode. Both the rendering time and the time needed to process the hit records will then be longer than necessary.

Therefore it makes sense to consider writing a special rendering function for the selection mode in these cases. However you should take extra care when doing so. Consider for instance an application where you have several rooms, each room with a potential set of pickable objects. You must prevent the user from selecting objects through walls, i.e. objects that are in other rooms and are not visible. If you use the same rendering function then the walls will cause a hit, and therefore you can use depth information to disregard objects that are behind the wall. However if you decide to build a function just with the pickable objects then there is no way to tell if an object is in the current room, or behind a wall.

Therefore, when building a special rendering function for the selection mode you should include not only the pickable objects, but also all the objects that can cause occlusions, such as walls. For highly interactive real time applications it is probably a good option to build simplified representations of the objects that may cause occlusions. For instance a single polygon may replace a complex wall, or a box may replace a table.
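One way to use the occluders when processing the hits is sketched below: take the nearest hit and accept it only if it is not an occluder. This is an illustration under assumptions, not a prescribed method. Hit, OCCLUDER_NAME and pick_visible are hypothetical, each hit is reduced to a single name plus its minimum depth, and all walls and other occluders are assumed to share one reserved name.

```c
#define OCCLUDER_NAME 0u  /* hypothetical name shared by walls and other occluders */

typedef struct {
    unsigned int name;       /* name taken from the hit record */
    unsigned int min_depth;  /* smaller means closer to the viewer */
} Hit;

/*
 * Returns 1 and stores the picked name in *picked when the nearest hit is a
 * pickable object. Returns 0 when there are no hits or when the nearest hit
 * is an occluder, meaning every pickable object under the cursor is hidden.
 */
int pick_visible(const Hit *hits, int count, unsigned int *picked)
{
    int nearest = -1;

    for (int i = 0; i < count; ++i) {
        if (nearest < 0 || hits[i].min_depth < hits[nearest].min_depth) {
            nearest = i;
        }
    }
    if (nearest < 0 || hits[nearest].name == OCCLUDER_NAME) return 0;

    *picked = hits[nearest].name;
    return 1;
}
```

If the walls were left out of the selection rendering, the object behind the wall would be the nearest hit and would be picked by mistake.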