Thursday, June 4, 2009

Okay, this is the posting that I've been dreading. Conceptually speaking, today's topic is the most difficult part of 3D programming, and it's one I've struggled with.

At this point, you should understand 3D geometry and the cartesian coordinate system. You should understand that objects in OpenGL's virtual world are built out of triangles made up of vertices, and that each vertex defines a specific point in three-dimensional space, and you know how to use that information to do basic drawing using OpenGL ES on the iPhone. If not, you should probably go back and reread the first six installments in this series before tackling this monstrosity.

In order for the objects in your virtual world to be at all useful for interactive programs like games, there has to be a way to change the position of objects in relation to each other and in relation to the viewer. There has to be a way to not only move, but rotate and scale objects. There also has to be way to translate that virtual three-dimensional world onto a two-dimensional computer screen. All of these are accomplished using something called transformations. The underlying mechanism that enables transformations are matrices (or matrixes if you prefer).

Although you can do a fair amount in OpenGL without ever really understanding matrices and the mathematics of the matrix, it is a really good idea to have at least a basic understanding of the mechanism.

Built-In Transformations and the Identity Matrix

You've already seen some of the OpenGL's stock transformations. One call that you've seen in every application we've written is glLoadIdentity(), which we've called at the beginning of the drawView: method to reset the state of the world.

You've also seen glRotatef(), which we used to make our icosahedron spin, and glTranslatef() which was used to move objects around in the virtual world.

Let's look at the call to glLoadIdentity() first. This call loads the identity matrix. We'll talk about this special matrix later, but loading the identity matrix basically resets the virtual world. It gets rid of any transformations that have been previously performed. It is standard practice to call glLoadIdentity() at the beginning of your drawing method so that your transformations have predictable results because you always know your starting point - the origin.

To give you an idea of what would happen if you didn't call glLoadIdentity(), grab the Xcode project from Part 4, comment out the call to glLoadIdentity() in drawView: and run the application. Go ahead and do it, I'll wait. What happens?

The icosahedron, which used to just slowly spin in place, scoots away from us doesn't it? Like Mighty Mouse, it flies away and then up into the sky, exit stage right.1

The reason why that happens is because we were using two transformations in that project. The vertices of our icosahedron were defined around the origin, so we used a translate transformation to move it three units away from the viewer so the whole thing could be seen. The second transformation we used was a rotation transformation to spin the cube in place. When the call to glLoadIdentity() was still in place, we started fresh each frame back at the origin looking straight down the Z-axis. Back then, when we translated 3 units away from the viewer, it always ended up at the same location of z == -3.0. Similarly, the rotation value, which we constantly increase based on the amount of time elapsed, caused the icosahedron to spin at an even pace. Because earlier rotations were removed by the call to glLoadIdentity() before it was rotated, the rotation was at a consistent speed.

Without the call to glLoadIdentity(), the first time through, the icosahedron is translated three units away from us and the icosahedron rotates a small amount. The next frame (a fraction of a second later), the icosahedron moves back another three frames, and the adds the value of rot to the amount the icosahedron was already rotated. This happens each frame, meaning the icosahedron moves away from us three units every frame, and the speed of rotation increases every frame.

It would be possible to not call glLoadIdentity() and compensate for this behavior, but we can't predict or compensate for transformations done in other code, so the best bet is to start from a known position, which is the origin, with no scaling or rotation, which is why we always call glLoadIdentity()

The Stock Transformations

In addition to glTranslatef() and glRotatef(), there is also glScalef(), which will cause the size of objects drawn to be increased or decreased. There are some other transformation functions available in OpenGL ES, but these three (combined with glLoadIdentity() are the ones that you'll use the most. The other ones are used primarily in the process of converting the three-dimensional virtual world into a two-dimensional representation, a process known as projection. We'll touch on projection a little in this article, but in most scenarios, you don't have to be directly involved with that process other then setting up your viewport.

The stock transformations can get you a long way. You could conceivably create an entire game using just these four calls to manipulate your geometry. There are times when you might want to take transformations into your own hands, however. One reason you might want to handle the transformations yourself is that these stock transformations have to be called sequentially as a separate function call, each resulting in a somewhat computationally costly matrix multiplication (something we'll talk more about later). If you do the transformations yourself rather than using the transformations provided by OpenGL, you can often combine multiple transformations into a single matrix, reducing the number of matrix multiplication operations that have to be performed every frame.

It's also possible to eke out better performance by doing your own matrices because you can vectorize your matrix multiplication calls. As far as I can tell, the iPhone's behavior in this regard is not documented, but as a general rule, OpenGL ES will hardware accelerate multiplication between a vector or vertex and a matrix, but not between two transformation matrices. By vectorizing matrix multiplication, you can actually get better performance than you can by letting OpenGL do the matrix multiplication. This won't give you a huge performance boost, as there are generally far less matrix by matrix multiplication calls than vector/vertex by matrix multiplication calls, but in complex 3D programs, every little bit of extra performance can help.

Enter the Matrix

Obviously, I have to make a reference to the movie "The Matrix", since we're going to spend the next few thousand words talking about matrices. It's sort of a geek requirement, so let me get it out of the way:

Unfortunately, nobody can be told what The Matrix is.

Only, it's not true in this case; matrices are really not that big of a deal. A matrix is just a two-dimensional array of values. That's it. Nothing mystical here. Here's a simple example of a matrix:

That's a 3x3 matrix, because it has three columns and three rows. Vectors and Vertices can actually be represented in a 1x3 matrix (remember this for later, it's kind of important):

A vertex could also be represented by a 3x1 array instead of a 1x3 array, but for our purposes, we're going to represent them using the 1x3 format (you'll see why later). Even a single data element is technically a 1x1 matrix, although that's not a very useful matrix.

You know what else can be represented in an array? Coordinate systems. Watch this, it's kind of cool. You remember vectors, right? Vectors are imaginary lines running from the origin to a point in space. Now, remember that the Cartesian coordinate system has three axes:

So, what would a normalized vector that ran down the X axis look like? Remember: A normalized vector has a length of one, so, a normalized vector that runs up the X axis would look like this:

Notice that we're representing the vector as a 3x1 matrix rather than a 1x3 matrix as we did with the vertex. Again, it doesn't actually matter, as long as we use the opposite for vertices and these vectors. All three of the values in this vector apply to the same axis. I know, it probably doesn't make sense yet, but bear with me, this will clear in up in a second. A vector that runs up the Y axis would look like this:

And one that runs up the Z axis looks like this:

Now, if we put these three vector matrices together in the same order as they are represented in a vertex (x then y then z), it would look like this:

That's a special matrix called the identity matrix. Sound familiar? When you call glLoadIdentity(), you are loading that matrix right there2. Here's why this is a special matrix. Matrices can be multiplied together, and multiplying matrices is how you combine them. If you multiply any matrix by the identity matrix, the result is the original matrix. Just like multiplying a number by one. You can always calculate the identity matrix for any given size matrix by setting all the values to 0.0 except where the row and column number are the same, in which case you set the value to 1.0.

Matrix Multiplication

Matrix multiplication is the key to combining matrices. If you have a one matrix that defines a translate, and another that defines a rotate, if you multiply them together, you get a single matrix that defines both a rotate and a translate. Let's look at a simple example of matrix multiplication. Imagine these two matrices:

The result of a matrix multiplication is another matrix that is exactly the same size as the matrix on the left side of the equation. Matrix multiplication is not commutative. The order matters. The result of multiplying matrix a by matrix b is not necessarily the same as the result from multiplying matrix b by matrix a (although it could be in some situations).

Here's another thing about matrix multiplication: Not every pair of matrices can be multiplied together. They don't have to be the same size, but the matrix on the right side of the equation has to have the same number of rows as the number of columns that the matrix on the left side of the equation has. So, you can multiply a 3x3 matrix with another 3x3 matrix, or you can multiply a 1x3 matrix with a 3x6 matrix, but you can't multiply a 2x4 matrix with, say, another 2x4 matrix because the number of columns in a 2x4 matrix is not the same as the number of rows in a 2x4 matrix.

To figure the result of a matrix multiplication, we make an empty matrix of the same size as the matrix on the left side of the equation:

Now, for each spot in this matrix, we take the corresponding row from the left-hand matrix and the corresponding column from the right hand matrix. So, for the top left position in the result matrix, we take the top row of the left side of the equation and the first column of the right side of the equation, like so:

Then we multiply the first value in the row from the left-hand matrix by the first value in the right-hand column, multiply the second value in the left-hand row by the second value in the right-hand column, multiply the third value in the left-hand row by the third value in the right-hand column, then add them all together. So, it would be:

If you repeat this process for every spot in the result matrix, then you get the result of the matrix multiplication:

And look at that. We multiplied a matrix (the blue one) by the identity matrix (the red one) and the result is exactly the same as the original matrix. If you think about it, it makes sense since the identity matrix represents our coordinate system with no transformations. This also works with vertices. We can multiply a vertex by a matrix, and the same thing happens:

Now, let's say that we wanted to rotate an object. What we do is define a matrix that describes a coordinate system that is rotated. In a sense, we actually rotate the world, and then draw the object into it. Let's say we want to spin an object along the Z axis. To do that, the Z axis is going to remain unchanged, but the X axis and the Y axis need to change. Now, this is a little hard to imagine, and it's not really vital that you understand the underlying math, but to define a coordinate system rotated on the z axis, we would adjust the x and y vectors in our 3x3 matrix, in other words, we have to make changes to the first and second column.

So, the X value of the X-axis vector and the Y value of the Y-axis vector need to be adjusted by the cosine of the rotation angle. The cosine, remember, is the length of the side adjacent to the angle in a triangle. We also need to adjust the Y value of the X-axis vector by an amount equal to minus the sine of the angle and the X value of the Y-axis vertex by the sine of the angle. The sine of an angle is the length of the opposite side of the angle in a triangle. That's hard to follow, it might be easier to understand expressed as a matrix:

Now, if you take ever vertex in every object in your world and multiply them by this matrix, you get the new location of the vertex in the rotated world. Once you've applied this matrix to every vertex in your object, you have an object that has been rotated along the Z-axis by n degrees.

If that doesn't make sense, it's okay. You really don't need to understand the math to use matrices. These are all solved problems, and you can find the matrices for any transformation using google. In fact, you can find most of them in the OpenGL man pages. So don't beat yourself up if you're not fully understanding why that matrix results in a Z-axis rotation.

A 3x3 matrix can describe the world rotated at any angle on any axis. However, we actually need a fourth row and column in order to be able to represent all the transformations we might need. We need a fourth column to hold translation information, and a fourth row which is needed to do the perspective transformation. I don't want to get into the math underlying the perspective transformation because they require understanding homogenous coordinates and projective space, and it's not really important to becoming a good OpenGL programmer. In order to multiply a vertex by a 4x4 matrix, we just pad it with an extra value, usually referred to as W. W should be set to 1. After the multiplication is complete, ignore the value W. We're not actually going to look at vector by matrix multiplication in this installment because OpenGL already hardware accelerates that, so there's usually not a need to handle that manually, but it's a good idea to understand the basic process.

OpenGL ES's Matrices

OpenGL ES maintains two separate matrices, both are 4x4 matrices of GLfloats. One of these matrices, called the modelview matrix is the one you'll be interacting with most of the time. This is the one that you use to apply transformations to the virtual world. To rotate, translate, or scale objects in your virtual world, you do it by making changes to the model view matrix.

The other matrix is used in creating the two-dimensional representation of the world based on the viewport you set up. This second matrix is called the projection matrix. The vast majority of the time, you won't touch the projection matrix.

Only one of these two matrices is active at a time, and all matrix-related calls, including those to glLoadIdentity(), glRotatef(), glTranslatef(), and glScalef() affect the active matrix. When you call glLoadIdentity, you set the active matrix to the identity matrix. When you call the other three, OpenGL ES creates an appropriate translate, scale, or rotate matrix and multiplies the active matrix by that matrix, replacing the results of the active matrix with the result of the matrix multiplication operation.

For most practical purposes, you'll just set the modelview matrix as the active matrix early on and then leave it that way. In fact, if you look at my OpenGL ES template, you'll se that I do that in the setupView: method, with this line of code:

glMatrixMode(GL_MODELVIEW);

OpenGL ES's matrix are defined as an array of 16 GLfloats, like this:

GLfloat matrix[16];

They could also be represented as two-dimensional C arrays like this:

GLfloat matrix[4][4];

Both of those declarations result in the same amount of memory being allocated, so it's a matter of personal preference which you use, though the former seems to be more common.

Let's Play

Okay, at this point, I'm sure you've had enough theory and want to see some of this in action, so create a new project using my OpenGL ES template, and replace the drawView: and setupView: with the versions below:

// Define a cutoff angle. This defines a 90° field of vision, since the cutoff// is number of degrees to each side of an imaginary line drawn from the light's// position along the vector supplied in GL_SPOT_DIRECTION aboveglLightf(GL_LIGHT0, GL_SPOT_CUTOFF, 25.0);

glLoadIdentity(); glClearColor(0.0f, 0.0f, 0.0f, 1.0f); }

This creates a simple program with our friend, the icosahedron. It rotates, as it did before, but it also moves up and down along the Y axis using a translate transform, and it increases and decreases in size using a scale transform. This uses all the common modelview transforms; we load the identity matrix, do a scale, a rotate, and a translate using the stock OpenGL ES transform functions.

Let's replace each of stock functions with our own matrices. Before proceeding, build and run the application so you know what the correct behavior for our application looks like.

Defining a Matrix

Let's define our own object to hold a matrix. This is just to make our code easier to read:

typedef GLfloat Matrix3D[16];

Our Own Identity Matrix

For our first trick, let's create our own identity matrix. The identity matrix for a 4x4 matrix needs to look like this:

Here's a simple function to populate an existing Matrix3D with the identity matrix.

Now, this probably looks wrong at first glance. It looks like we're passing Matrix3D by value, which wouldn't work. However we're using a typedef3 array, and due to C99's array-pointer equivalency, arrays are passed by reference and not by value, so we can just assign the individual values of the arrays and don't have to pass pointers.

I'm using inline functions to eliminate the overhead of a function call. This involves a trade-off (primarily increased code size) and all the code in this article will work just as well as regular C functions. To use them that way, just remove the static inline keywords and place them in a .c or .m file instead of a .h file. Note that the static keyword is correct (and a good idea) in C and Objective-C programs, but if you're using C++ or Objective-C++, then you should probably exclude it. The GCC Manual recommends making C inline functions like this one static because doing so allows the compiler to remove the generated assembly for unused inline functions. However, if you're using C++ or Objective-C++ the static keyword can potentially impact linkage behavior and offers no real benefit.

Okay, let's use this new function to replace the call to glLoadIdentity(). Delete the call to glLoadIdentity() and replace it with the following code:

So, we declare a Matrix3D, populate it with the identity matrix, and then we load that matrix using glLoadMatrixf(), which replaces the active matrix (in our case, the modelview matrix) with the identity matrix. That's exactly the same thing as calling glLoadIdentity(). Exactly. Build and run the program now, and it should look exactly the same as before. No difference.

Now, you know really know what glLoadIdentity() does since you've done it manually. Let's continue.

Matrix Multiplication

Before we can implement any more transformations, we need to write a function to multiply two matrices. Remember, multiplying matrices is how we combine two matrices into a single matrix. We could write a generic matrix multiplication method that would work with any size array and that used for loops to do the calculation, but let's just do it without the loop. Loops have a tiny bit of overhead associated with them, and unrolling loops in functionality that gets called a lot can make a difference. Since OpenGL ES's matrices are always 4x4, the fastest multiplication is to just do each calculation. Here's our matrix multiplication:

Again, this function doesn't allocate any memory, it just populates an existing result array (result) by multiplying the other two arrays. The result array should not be one of the two values being multiplied, however, because that would yield incorrect results since values are changed that will be used again.

But, wait… this can actually be faster. at least when you run the program on the iPhone instead of the simulator. The iPhone has four vector processors that are able to do floating point math much faster than the iPhone's CPU can. Taking advantage of these vector processors, however, requires writing ARM6 assembly because there are no libraries for accessing the vectors from C.

Fortunately, somebody's already figured out how to a matrix multiply using the vector processors. The VFP Math Library contains a lot of vectorized functionality, and it's released under a fairly permissive license. So, I took the VFP Math Library vectorized matrix multiply and incorporated it into my method, so that when it's run on the device, the vectorized version is used, but the regular version is used when run on the simulator (note that I've included the original comments with ownership and licensing information and identified that the code has been modified in order to comply with VFP Math Library license):

/* These define the vectorized version of the matrix multiply function and are based on the Matrix4Mul method from the vfp-math-library. This code has been modified, but is still subject to the original license terms and ownership as follow:

This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software. Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.

2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.

Now that we have the ability to multiply matrices together we can combine multiple matrices. Since our matrix multiply is hardware accelerated and OpenGL ES does not hardware accelerate matrix by matrix multiplication, our version should actually be a tiny bit faster than using the stock transformations4. Let's add the translate transformation now.

Our Own Translate

If you recall earlier, one of the reasons why we need to use a 4x4 matrix instead of a 3x3 matrix was because we needed an extra column for translation information. Indeed, this is what a translation matrix looks like:

Now, how do we incorporate that into our drawView: method? Well, we can delete the call to glTranslatef(), and replace it with code that declares another matrix, populates it with the appropriate translation values, multiplies that matrix by the existing matrix and then load the result into the OpenGL, right?

Well, yeah, that'll work, but it's doing unnecessary work. Remember, if you multiply any matrix by the identity matrix, the result is itself. So when working with our own matrices, we no longer need to load the identity matrix first if we're using any other transformation. Instead, we can just create the translation matrix and load that:

Since we don't have to load the identity matrix, we saved ourself a little tiny bit of work each time through this method. Also notice that I'm declaring the Matrix3Ds as static. We don't want to constantly allocate and deallocate memory. We know we're going to need this matrix several times every second all while the program is running, so by declaring it static, we cause it to stick around and be reused rather than having the overhead of constant memory allocation and deallocation.

Our Own Scaling Transformation

A matrix to change the size of objects looks like this:

A value of 1.0 for x, y, or z indicates that there is no change in scale in that direction. A 1.0 for all three would result in (you guessed it) the identity matrix. If you pass a 2.0, it will double the size of the object along that axis. We can turn the scaling matrix into an OpenGL ES matrix like this:

Now, we are going to have to multiply matrices because we want to apply more than one trasnformation. To apply both a scaling and a rotation ourself, we need to multiply those two matrices together. Delete the call to glScalef() and the previous code we wrote and replace with this:

We create a matrix and populate it with the appropriate translate values. Then we create a scaling matrix and populate it with the appropriate values. Then we multiply those two together and load them into the model view matrix. Now for the tough one. Rotation.

Our Own Rotation

Rotation is a little tougher. We can create matrices for rotation around each of the axes. We already know what Z-axis rotation looks like:

X-axis rotation looks similar:

And so does Y-axis rotation:

These three rotations can be turned into OpenGL matrices with these functions:

There's two methods for each axis' rotation, one to set by radians and another to set by degrees. These three matrices represent what are called Eular angles The problem with Eular angles is that we have to apply rotations along multiple axes sequentially, and when we set rotation on all three angles, we'll end up experiencing a phenomenon called gimbal lock, which results in the loss of rotation on one axis. In order to avoid this problem, we need to create a single matrix that can handle rotation on multiple axes. In addition to eliminating problem gimbal lock, this will also save processing overhead when rotations are needed on more than one axis.

Now, honestly, I don't pretend to understand the math behind this one. I've read doctoral theses on this (a matrix representation of quaternions) but the math just doesn't fully sink in, so you and I are just going to take it on faith that this multi-rotation matrix works (it does). This matrix assumes a single angle designated N, and a vector expressed as three floating point values. Each component of the vector will be multiplied by N to result in the angle of rotation about that axis:

This matrix requires that the vector passed in be a unit vector (aka a normalized vector), so we have to ensure that before populating the matrix. Expressed as an OpenGL matrix, it would be:

Wacky and Wonderful Custom Matrices

Are you still with me? This has been a doozy of an installment, hasn't it? We can't stop quite yet, though. And here's the reason why - so far I've only shown you how to recreate existing functionality in OpenGL. Yeah, what we've done can give a tiny performance boost thanks to the fact that our matrix multiplication is hardware accelerated, but that's not really enough of a justification in 99% of the cases to recreate the wheel like this.

But, there are other benefits to handling the matrix transformations yourself. You can, for example, create transformations that OpenGL ES doesn't have built-in to it. For example, we could do a shear transformation. Shearing is basically skewing an object along two axes. If you applied a shear axis to a tower, you'd get the leaning tower of Pisa. Here is what the shear matrix looks like:

Try doing that with stock calls. You can also combine matrix calls. So we could, for example, create a single function to create a translate and scale matrix without having to do even a single matrix multiplication.

Exit the Matrix

Matrices are a huge and often misunderstood topic, one that many people (including me) struggle with understanding. Hopefully this gives you enough of an understand of what's going on under the hood, and it also gives you a library of matrix-related functions you can use in your own applications. If you want to download the project and try it out yourself, please feel free. I've defined two constants that you can change to switch between using stock transforms and custom transforms, and also to turn on and off the shear transformation. You can find these in GLViewController.h:

#defineUSE_CUSTOM_MATRICES1#defineUSE_SHEAR_TRANSFORM1

Setting them to 1 turns them on, setting them to 0 turns them off. I've also updated the OpenGL ES Xcode Template with the new Matrix functions, including the vectorized matrix multiply function. Best of luck with it, and don't worry if you don't fully grok, this is hard stuff, and for 99% of what most people do in OpenGL ES, you don't need to fully understand projective space, homogeneous coordinates, or linear transformations, so as long as you get the big picture, you should be fine.

With great thanks to Noel Llopis of Snappy Touch for his help and patience. If you haven't checked out his awesome Flower Garden, you really should - it's an absolute treat.Footnotes

No, that's not an error. Stage right is what the audience would perceive as going to the left. Our icosahedron goes off to the left, so it is existing "stage right".

Actually, not quite. When you call glLoadIdentity(), you're loading the 4x4 identity matrix, that illustration shows the 3x3 identity matrix.

You might want to use a struct instead of a typedef to gain type safety. If you do that, then you'll have to make sure that you specifically pass parameters in by reference because structs are NOT automatically passed by reference, unlike arrays.

Using Shark, the drawView: method went from being .7% of processing time to .1% of processing time. A substantial improvement in the speed of that method, but not in the overall application performance.

28 comments:

I have found the Richard Paul text on robot manipulators to provide the most lucid explaination of transformations, coordinate frames, and vectors I've ever seen. I highly recommend folks to check it out.

One newbie question: so by applying transformations through the modelview matrix and leaving the projection view matrix constant, what you're effectively doing is moving the virtual world around you rather than you moving through the virtual world? EG: to simulate movement along the x axis, you actually translate the entire co-ordinate system (well, the x component anyway) of the virtual world?

My first instinct was that this would be more computationally intensive than simply moving the projection matrix (the world is constant and your view into the world changes), but on second thoughts I can see an equivalence between the two methods.

There is a camera sitting somewhere in space. There are objects the camera is looking at. The modelview matrix collapses together camera aiming and positioning/aiming and model positioning/scaling.

Unfortunately, OpenGL has no concept of a camera, creating needless pain an suffering for folks learning 3D graphics through the rather bizarro lens of OpenGL.

If you think OpenGL makes this all harder then it needs to be, you are correct. I often think of OpenGL as the assembly language of 3D graphics.

So, when you say "...simulate movement ..." you must decide weather you want a camera dollying along the x-axis (in which case you move the world instead of the camera) or rather a model moving along the x-axis in which case you would use glTranslate - which moves the model *not* the camera followed by concatenation with whatever transformations you want applied to the camera.

"you must decide weather you want a camera dollying along the x-axis (in which case you move the world instead of the camera) or rather a model moving along the x-axis in which case you would use glTranslate - which moves the model *not* the camera followed by concatenation with whatever transformations you want applied to the camera."

Aren't both of these options the same thing?

In any case, your statement that the modelview matrix as a combination of camera & model makes things a bit clearer.

I was thinking that the projection matrix had something to do with the viewport discussed earlier (it may not, I'm not sure). Couldn't you simulate dollying along the x axis by keeping the model/view matrix constant, and simply change the perspective viewport? This intuitively is a closer match to the apparent behaviour of a camera dollying along the x-axis, but most probably has some limitations/drawbacks (eg: can you rotate the viewport...)

To prevent the onset of early insanity, you want to keep camera and model manipulation at arms length. I know, OpenGL makes this painful. No pain, no gain.

Seriously though, it is crucial that you allow yourself the flexibly to move the camera independent of the models in the scene. Very, very important.

Remember silent films? Recall how static they all looked. That's because the camera was locked down and couldn't move. Boring. Well, Charlie Chaplin was never boring but you get the point.

Regarding the projection matrix. Projection takes everything in the scene and "projects" it onto the screen. It is a 2D thing and not really relevant to this discussion. The only manipulation you will typically do at the projection stage of the rendering pipeline is futz with the field of view to zoom in/out.

"Seriously though, it is crucial that you allow yourself the flexibly to move the camera independent of the models in the scene. Very, very important."

I must be missing something here... moving a camera along the x axis of a scene should be exactly the same as moving the *entire* model backwards along the x axis and keeping the camera still (after all, all motion is relative). Of course this movement of the *entire* model is different to a single object in the model moving separately.

You are correct that the scene will look *identical* if you apply a transform to the model or apply the inverse of that transform to the camera. No difference.

However, what you need to think about is do you want that to be the *only* way you create movement in your app. It's sorta like Steven Spielberg trying to shoot Jaws with his camera bolted to the ground at a single location for the entirety of the movie production. Oy vey. ;-)

Downloaded the Part6 Project and it just shows white on the iPhone 3.1 simulator. When I exit the app (via the simulator) I see the object working as expected briefly prior to the app closing. Any change to get working under the 3.1 simulator? Thanks.

From a post on another one of the examples (re iPhone 3.*)the project uses a mainwindow.xib file which declares the delegate and its own instance of UIWindow. so there's actually two windows being created and on iphone os 3.0 apparently the one in the xib file appears on top of the one defined in code.

by removing this line

window = [[UIWindow alloc] initWithFrame:rect];

only one window instance is left, so we definitely add the view to the window that will be shown.

I think Taras's point is that it would seem that, when you want the effect of moving the camera, you would only need to alter the viewport once, and then draw everything else without having to apply the same translation matrix (for example) to all the objects. Certainly at first glance that would seem less computationally expensive. Even if you do want to move some of the objects around relative to each other, you only need to apply translation matrices to those particular objects, not to all of them.

So is this the usual practice, or not, and if not, then why not?

Being a relative newbie to iPhone GL ES myself, I don't know the answer for sure, but I suspect some reasons might be that

- updating the viewport may be a very expensive GL ES operation,

- it may not be easy to achieve consistently correct and "artifactless" results doing that, particularly when rotating the viewport.

- there are other ways of achieving the same or similar savings without mucking about with the viewport.

My understanding of matrix multiplication is that, when you multiply matrix A by matrix B, you essentially apply whatever transformations of identity that have been applied in B to A. Is that not correct?

And, my understanding is that you have to perform matrix operations in the following specific order, in order to properly setup a matrix for drawing an object:

1 - load identity2 - rotate the object about its own origin 3 - translate the object it to its position in space4 - rotate it about the origin by the inverse of the current viewer (camera) angle5 - translate it by the inverse of the viewer's (camera's) position in the world.

If all that is true, then I would think you could:

..when moving or rotating an object:

1 - load identity2 - rotate the object about its origin3 - translate the object to its position in space4 - save this as the object's "world" matrix

..when moving or rotating the camera:

1 - load identity2 - rotate the camera about its origin3 - translate the camera to its position in space4 - save this as the camera matrix

..when drawing an object:

1 - Load the object's current matrix2 - Multiply by the current camera matrix.3 - Draw the object.

This way, you only update an object's matrix when you actually move or rotate that object in game space, and you only update the camera matrix when you actually move or rotate the camera.

(The notion of a camera is still useful and relevant, I would think, even if the pipeline doesn't physically implement one. Regardless of how it is done, you still need some way to control the effective vantage point from which the user is viewing the scene.)

There's a memory cost, obviously for saving all those matrices, but it may be quite reasonable, depending on the number of objects you have in memory at any given time.

I'm not sure, however, that this method would turn out to be any more efficient, because of the cost of loading and saving matrices and such.

I know this article is a little old to revive with a comment but i just noticed from the OpenGL ES reference pages that glLoadMatrix() expects a 4x4 matrix (http://www.khronos.org/opengles/sdk/1.1/docs/man/)

In your example code, you declare a 3x3 matrix which is then passed to glLoadMatrix(); Is this correct? Surely either your example code is wrong, or the reference page for glLoadMatrix() is incorrect? Or perhaps said method accepts both but does the appropriate work to cast to a 4x4 matrix in the event that a 3x3 matrix is passed?

Sorry for the comment on an old topic, this has just thrown my understand off and i'm not sure which is the correct method of usage given that the two (your tutorial and the khronos reference) contradict each other :/

Hi Jeff!I'd like to express how much I admire your blog. Is one of the best and most positioned in Google.... Keep going with great job.I am following your all tutorials.... hoverer with this tutorial I get:{standard input}:1861:unshifted register required -- `bic r0,r0,#0x00370000'{standard input}:1862:unshifted register required -- `orr r0,r0,#0x00030000'{standard input}:1863:selected processor does not support `fmxr fpscr,r0'….bunch of others selected procesro does not support …..{standard input}:1891:selected processor does not support `fmxr fpscr,r0'… and gcc-4.2 failed with exit code ONLY when trying on the device. Simulator wokrs OK.I am using iPhone 3GS (3.1.2) as testing device.What could I do wrong? I am working with your OpenGLCommon.h file for a long time and it is first time this problem occurs. Where to look for any clue?ANy help appreciated.

francesca I had the same problem the other day. It's because you are missing his ConstantsAndMacros.h and his OpenGLCommon.h files or you forgot to #include them in the class that is getting that error. The newer sdk templates use open gl rendering which is completely different too.

You can't just use his draw code in the stock templates. You either have to use his template which he provides or build a frankenstein one that mixes the two together.

What youre saying is completely true. I know that everybody must say the same thing, but I just think that you put it in a way that everyone can understand. I also love the images you put in here. They fit so well with what youre trying to say. Im sure youll reach so many people with what youve got to say.

I'm new to OpenGL but I've worked with many a rendering API. Something that confuses me about this is that glFrustum takes positive z-coordinates (in your case 0.01 to 1000), but then your transform takes a negative z-coordinate, which I would think would display nothing (transforming the verts outside of the frustum) but clearly it works. (It works in my program too.) Anyhow, either I don't understand what glFrustum is doing or I don't understand what glTranslate is doing. (glTranslate in x and y does what I expect...)So...do you know?

MartianCraft Apparel

About Me

I'm a programmer, author, and one of the founders of MartianCraft, a software development shop doing both contract and in-house software development.
My iPhone books (both written with Dave Mark) are published by Apress. My first book, Beginning iPhone Development (currently in its 6th edition) is the all-time best-selling book on learning to program the iPhone and iPad.
I also sculpt and take pictures when I need to engage the other side of my brain a little more.