Scroll below to see the newest Q&As. Some of the earlier questions regard later OpenGL versions and vice versa, so if you're looking at this topic to see an answer, make sure you look around. It's a little muddled up.

Old question: I wasn't really sure where to start with OpenGL so I had a look at SHC's tutorials and started there. Hopefully I'm wording this correctly (a lot of my confusion is with the wording of things). Let me start off with what I think I know:

Everything is done in a line in the order you call functions

The modelview matrix is used to manipulate objects in the world

The projection matrix adjusts the view of the camera

You use the glMatrixMode() function to select the matrix you want to manipulate

The glLoadIdentity() function resets the matrix to its original state

You need to use glOrtho() to set up an orthographic projection to view the modelview matrix

You manipulate a matrix between the glPushMatrix() and glPopMatrix() functions

The glTranslatef() function moves the modelview matrix position

You draw vertices between the glBegin() and glEnd() function. The parameter of glBegin() is the shape you want to draw.

The glViewport() function resizes the orthographic camera

I've probably got half that wrong and I'd really appreciate if someone could tell me exactly what I've got wrong and in the simplest possible way you can think of, the correct definitions.

Also I have a question:

Why are modelview and projection both matrices? I see a lot of references to the modelview matrix but I don't understand what makes it a matrix. I also have seen references to the matrix stacks. I understand what a stack is, but how are these processed?

The modelview matrix is actually two matrices in one. The model matrix (AKA object matrix) is used to position a model. This is very useful when you have a 3D model with vertices in local space, since it allows you to position, scale and rotate the model to its appropriate position in world space. The view matrix does something completely different: it holds the inverse of the position and orientation of the camera. The projection matrix can be seen as the lens of the camera; it defines things like the field-of-view angle, aspect ratio and near/far planes when using 3D perspective projection, or the bounds of the orthographic projection when using orthographic projection. Finally, the viewport defines what part of the window the projected coordinates should map to.

Crash course on matrices: Multiplying a matrix with a vertex takes it from one space to another. If we want to reverse this process, we can take the inverse of a matrix. This new matrix does the same transformation backwards so we get back the original vertex.
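As a rough illustration of that crash course (plain Java, no OpenGL; the class and method names are made up for the example), here's a column-major 3x3 matrix, like a 2D slice of what OpenGL stores, taking a vertex to another space and the inverse matrix taking it back:

```java
public class MatrixCrashCourse {
    // Multiply a 3x3 column-major matrix with a column vector (x, y, 1).
    public static float[] transform(float[] m, float[] v) {
        float[] r = new float[3];
        for (int row = 0; row < 3; row++) {
            r[row] = m[row] * v[0] + m[row + 3] * v[1] + m[row + 6] * v[2];
        }
        return r;
    }

    // A 2D translation matrix; its inverse is simply translate(-tx, -ty).
    public static float[] translate(float tx, float ty) {
        return new float[] {
            1, 0, 0,   // first column
            0, 1, 0,   // second column
            tx, ty, 1  // third column holds the translation
        };
    }

    public static void main(String[] args) {
        float[] p = {5, 5, 1};                            // a vertex in one space
        float[] q = transform(translate(50, 50), p);      // take it to another space
        float[] back = transform(translate(-50, -50), q); // the inverse goes back
        System.out.println(q[0] + ", " + q[1]);           // 55.0, 55.0
        System.out.println(back[0] + ", " + back[1]);     // 5.0, 5.0
    }
}
```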

1. Local space --model matrix--> World space. This is pretty easy to understand. If we call glTranslatef() on this matrix we'll move the object around in world space.

2. World space --view matrix--> Eye space. This is a bit more complicated. Let's say we have a camera position (x, y). If we do the same thing we did to the model matrix by calling glTranslatef(x, y, 0), we're actually getting a matrix that takes things in eye space and transforms them into world space. We want to do the opposite, AKA the inverse of it. The simplest fix is therefore to simply do it backwards manually by calling glTranslatef(-x, -y, 0). That's a proper view matrix.

3. Eye space --projection matrix--> Normalized device coordinates. The projection matrix takes in coordinates relative to the eye (camera) and maps all three dimensions to [-1, 1], so if the coordinates are (0, 0, 0) after transformation, they're at the center.

4. Normalized device coordinates --viewport--> screen coordinates. Finally these [-1, 1] coordinates are mapped to actual pixels using the viewport settings. If you call glViewport(100, 100, 200, 200) and end up with the normalized device coordinates (0, 0, 0), they'll be mapped to (200, 200) in the window, the center of that viewport.
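Step 4 can be written out as a small function (a plain-Java sketch of the fixed-function viewport transform, with invented names):

```java
public class ViewportDemo {
    // window_x = x + (ndc_x + 1) / 2 * width, and the same for y.
    public static float[] ndcToWindow(float ndcX, float ndcY,
                                      int x, int y, int width, int height) {
        return new float[] {
            x + (ndcX + 1f) / 2f * width,
            y + (ndcY + 1f) / 2f * height // note: window y starts at the bottom in OpenGL
        };
    }

    public static void main(String[] args) {
        // glViewport(100, 100, 200, 200): NDC (0, 0) lands in the middle of the viewport.
        float[] p = ndcToWindow(0, 0, 100, 100, 200, 200);
        System.out.println(p[0] + ", " + p[1]); // 200.0, 200.0
    }
}
```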

Why 2 matrices when we actually need 3? This has to do with 3D. World space coordinates aren't really needed for anything special, but lighting is usually done in eye space. By premultiplying the model and view matrices together, we get a single matrix that does the same as both the original matrices, so it's basically a shortcut to eye space. We can't skip eye space though since we need to do lighting there, so in 3D we can't multiply in the projection matrix too. For 2D however, this point is moot. There's no actual need for 2 separate matrices, but since you're stuck with two matrices you might as well use them as they're supposed to be used, if only because it's a good habit once you start with basic 3D rendering.
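The premultiplication shortcut can be checked numerically with toy 3x3 matrices (plain Java, illustrative names only): applying the model matrix and then the view matrix to a vertex gives the same result as applying the single premultiplied matrix once.

```java
public class ModelViewDemo {
    // a * b for 3x3 column-major matrices.
    public static float[] mul(float[] a, float[] b) {
        float[] r = new float[9];
        for (int c = 0; c < 3; c++)
            for (int row = 0; row < 3; row++)
                for (int k = 0; k < 3; k++)
                    r[c * 3 + row] += a[k * 3 + row] * b[c * 3 + k];
        return r;
    }

    // Multiply a 3x3 column-major matrix with a column vector (x, y, 1).
    public static float[] apply(float[] m, float[] v) {
        float[] r = new float[3];
        for (int row = 0; row < 3; row++)
            r[row] = m[row] * v[0] + m[row + 3] * v[1] + m[row + 6] * v[2];
        return r;
    }

    public static void main(String[] args) {
        float[] model = {1, 0, 0,  0, 1, 0,  50, 50, 1};   // translate(50, 50)
        float[] view  = {1, 0, 0,  0, 1, 0,  -45, -45, 1}; // camera at (45, 45)
        float[] v = {0, 0, 1};
        float[] twoSteps = apply(view, apply(model, v)); // two multiplies per vertex
        float[] oneStep  = apply(mul(view, model), v);   // one premultiplied modelview
        System.out.println(twoSteps[0] + ", " + twoSteps[1]); // 5.0, 5.0
        System.out.println(oneStep[0] + ", " + oneStep[1]);   // 5.0, 5.0
    }
}
```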

glPushMatrix() stores the current matrix on the matrix stack. Think of the stack as a pile of matrices. Push stores the matrix on top of the pile and pop takes the matrix off the top again; in other words, it's a Last-In-First-Out (LIFO) stack. This is very useful when working with the modelview matrix.
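A toy version of that stack behaviour (plain Java; a single float stands in for a whole matrix, purely for illustration):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MatrixStackDemo {
    private final Deque<Float> stack = new ArrayDeque<>();
    private float current = 0f; // our stand-in for the identity matrix

    public void push() { stack.push(current); }      // like glPushMatrix()
    public void pop()  { current = stack.pop(); }    // like glPopMatrix()
    public void translate(float t) { current += t; } // stands in for glTranslatef()
    public float current() { return current; }

    public static void main(String[] args) {
        MatrixStackDemo gl = new MatrixStackDemo();
        gl.translate(10); // set up the "view matrix"
        gl.push();        // save it
        gl.translate(5);  // per-object "model" transform
        System.out.println(gl.current()); // 15.0
        gl.pop();         // back to the saved view matrix: last in, first out
        System.out.println(gl.current()); // 10.0
    }
}
```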


I think the root of the problem is that I don't understand what local space, eye space or NDCs are.

Also, why in your code example do you push, pop and then re-push and re-pop the matrix? What does undoing the glTranslate(...) call do?

Local space is the simplest, but also not very relevant for 2D. If you open up a 3D model file, you'll find vertex positions. These are relative to some origin (0, 0, 0) point chosen by the modeler. For a cube it'd most likely be the center of the cube, and for a human it's usually the point the human is standing on right between his feet. The same concept applies for a 2D sprite. Let's say you set up a sprite centered over (0, 0).

This sprite is made out of 4 vertices forming a quad, and its vertices are currently in local space. However, it should be obvious that we don't always want to draw this sprite centered at (0, 0). Let's say that this is a player sprite, so we want to move it to where the player currently is, which we'll say is at (50, 50). We'll therefore need to translate the model matrix to (50, 50) to move the object's coordinates so they're centered at (50, 50) instead of the sprite's original origin of (0, 0). For the sake of it, we also want to make the sprite twice as big, so we also scale it using glScalef(2, 2, 1). When we multiply it by this matrix, we get the following sprite:
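Assuming, say, a 16x16 sprite centered over (0, 0) (the post doesn't give a size, so that's made up for illustration), glTranslatef(50, 50, 0) followed by glScalef(2, 2, 1) means each local-space vertex is scaled first, then translated:

```java
public class SpriteTransformDemo {
    // world = translate(50, 50) * scale(2, 2) * local, worked out per component.
    public static float[] toWorld(float x, float y) {
        return new float[] { 2 * x + 50, 2 * y + 50 };
    }

    public static void main(String[] args) {
        // A corner of the assumed 16x16 sprite sits at (-8, -8) in local space.
        float[] corner = toWorld(-8, -8);
        float[] center = toWorld(0, 0);
        System.out.println(corner[0] + ", " + corner[1]); // 34.0, 34.0
        System.out.println(center[0] + ", " + center[1]); // 50.0, 50.0 -- the player position
    }
}
```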

The sprite is now at its world position. Again, this can be in any unit you want as long as it makes sense to you. It could be pixels on the screen, millimeters in an ant strategy game, blocks in Tetris or light-years in a space game. This is a space defined by you, and it's the same space all your game objects' coordinates are in.

Next we can also move around the camera in the world. Since the above sprite is the player sprite, the camera happens to be tracking a point 5 units to the right of the player sprite. As I wrote in my previous post, the view matrix is responsible for camera movement. Basically it's supposed to move vertices so they're relative to the camera instead of to the world's arbitrarily chosen (by you) origin. So we translate our view matrix using glTranslatef(-50, -50, 0) and transform our sprite using it:

As you can imagine eye space is very similar to world space, only that the objects are relative to the camera instead. In other words, in this space the camera is always at (0, 0). For 2D, this doesn't actually mean much and is usually not actually the case though. The reason lies in how most people use glOrtho(). glOrtho(0, 100, 100, 0, -1, 1) will map (0, 0) in eye space to (-1, 1) in normalized device coordinates, which is the top left corner. If you pass in (100, 100), you'll end up at (1, -1), which is the bottom right corner. These coordinates are then mapped to the screen using the viewport. What this glOrtho() call in practice does is map the area (0, 0) to (100, 100) of your eye space coordinates to the viewport. With that glOrtho() call and our viewport set to (0, 0, 100, 100), our sprite would end up at the top left corner of the screen, with the top half of it being outside of the screen.
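The glOrtho() mapping described above can be written out as a small function (a plain-Java sketch of the 2D part of the ortho projection, not the actual OpenGL matrix code):

```java
public class OrthoDemo {
    // What glOrtho(left, right, bottom, top, ...) does to x and y in eye space.
    public static float[] toNdc(float x, float y,
                                float left, float right, float bottom, float top) {
        return new float[] {
            2f * (x - left) / (right - left) - 1f,
            2f * (y - bottom) / (top - bottom) - 1f
        };
    }

    public static void main(String[] args) {
        // glOrtho(0, 100, 100, 0, -1, 1): note bottom=100, top=0 flips the y axis,
        // so (0, 0) lands at the top-left corner, exactly as described above.
        float[] a = toNdc(0, 0, 0, 100, 100, 0);
        float[] b = toNdc(100, 100, 0, 100, 100, 0);
        System.out.println(a[0] + ", " + a[1]); // -1.0, 1.0  (top left)
        System.out.println(b[0] + ", " + b[1]); // 1.0, -1.0  (bottom right)
    }
}
```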

If we were to not reset the matrix in the object rendering loop, only our first object would render correctly while the rest would most likely end up far outside the screen somewhere.

However, the above code is, as I said, not valid since we actually only have two matrices. Since our view matrix doesn't change in the middle of a frame being rendered, we still want to set it up just once. We could solve this quite easily by setting up our view matrix in our modelview matrix, saving it, modifying its "model matrix part" for each object, rendering that object, and then reloading the saved matrix into OpenGL so we're back to our original view matrix again:

//By now, the three object specific transformations above aren't needed anymore,
//and we need to get rid of them so we can render the next object. glLoadIdentity()
//would also reset our view matrix, so let's just overwrite the matrix with our stored
//unmodified view matrix instead!
glLoadMatrixf(viewMatrix);
//And voila! We've effectively reset our model matrix but left our view matrix untouched!
}

This works perfectly fine, and you could even say that this looks cleaner than using glPush/PopMatrix(). The exact same thing can be accomplished with glPush/PopMatrix() though and you wouldn't need a FloatBuffer variable to hold the view matrix since it does that for you.

//Resetting time! Since we pushed the unmodified view matrix to the matrix stack, we can pop it
//off the stack again to get back the matrix we pushed onto it. This also removes it from the stack,
//which is why we need to push before rendering each object. You can think of push as a kind of
//matrixStack.add(getCurrentMatrix()) and pop as setCurrentMatrix(matrixStack.removeLast()).
glPopMatrix();
//And voila! We've restored the unmodified view matrix!
}

I hope that explains it.

Pushing and popping matrices is especially useful when you have hierarchical objects. Let's say you have a city object with a number of buildings in it. Each building has a position relative to the city it is in.

A very important thing to note though is that this can be extremely slow if you're approaching 1000 objects. It's worth noting that, for this reason, all built-in matrix functionality has been deprecated starting with OpenGL 3 and above, but I still think that this is a good place to start if you're new to OpenGL. Learn how it works and then quickly try to move on to shaders. The missing matrix functionality can be replaced by a math library, like the one included in LWJGL. It's not the best one out there, but for 2D it should be more than enough.

So you're saying that since the model and view matrices are combined, you have to save the modelview matrix after setting up the view matrix? And then, because translations stack up, you have to keep saving and restoring the matrix for every object, otherwise the first object's translation would effectively be applied again on top of the second one's, and so on?

Also I've read SHC's tutorial on textures:

Am I right in saying that glGenTextures() returns a unique id number for the texture?

Does binding a texture use that id number to select which texture to bind, and does binding a texture ensure that the currently bound texture is the only texture affected by OpenGL calls?

Why do you bind a texture every time you render? Is this a way of indicating which texture to render?

Concerning glGenTextures() you're right. It basically gives you a currently free texture handle and marks it as "in use" for future calls to glGenTextures().

Binding in OpenGL means that all subsequent commands will affect or use the bound object. Binding a texture both allows you to modify it with subsequent calls like glTexImage() and glTexParameter*(), and to apply it to your rendered geometry. This is a recurring concept in OpenGL and is also used for Vertex Buffer Objects (VBOs), Vertex Array Objects (VAOs), Framebuffer Objects (FBOs), etc.

It's important to understand how the target parameter works. The OpenGL specification has this to say:

Quote

When a texture is first bound, it assumes the specified target: A texture first bound to GL_TEXTURE_1D becomes one-dimensional texture, a texture first bound to GL_TEXTURE_2D becomes two-dimensional texture, a texture first bound to GL_TEXTURE_3D becomes three-dimensional texture [...]

What this essentially means is that the first call to glBindTexture() also associates a texture handle you've gotten from glGenTextures() with the specified target. The spec continues:

Quote

While a texture is bound, GL operations on the target to which it is bound affect the bound texture, and queries of the target to which it is bound return state from the bound texture.

Note how the targets must match between your texture related commands! In essence, you can have both a 1D texture and a 2D texture bound at the same time since they're bound to different targets, and direct OpenGL commands to either of them using GL_TEXTURE_1D and GL_TEXTURE_2D as targets to your commands.
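A toy model of that binding behaviour (plain Java, not real OpenGL state; the class and its string targets are invented for illustration): handles from glGenTextures() start out target-less, the first bind fixes the target, and each target tracks its own currently bound texture.

```java
import java.util.HashMap;
import java.util.Map;

public class BindingModel {
    private int nextHandle = 1;
    private final Map<Integer, String> targetOf = new HashMap<>(); // handle -> target
    private final Map<String, Integer> bound = new HashMap<>();    // target -> bound handle

    // Like glGenTextures(): hands out a currently free handle.
    public int genTexture() { return nextHandle++; }

    // Like glBindTexture(): the first bind associates the handle with the target.
    public void bindTexture(String target, int handle) {
        targetOf.putIfAbsent(handle, target);
        bound.put(target, handle);
    }

    public int boundTo(String target) { return bound.getOrDefault(target, 0); }

    public static void main(String[] args) {
        BindingModel gl = new BindingModel();
        int tex1d = gl.genTexture();
        int tex2d = gl.genTexture();
        gl.bindTexture("GL_TEXTURE_1D", tex1d);
        gl.bindTexture("GL_TEXTURE_2D", tex2d);
        // Both stay bound at the same time because the targets differ:
        System.out.println(gl.boundTo("GL_TEXTURE_1D")); // 1
        System.out.println(gl.boundTo("GL_TEXTURE_2D")); // 2
    }
}
```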

A shader 'object' defines a vertex shader, a fragment shader, or even just a single function. A 'program' links together all of its attached 'objects' so you can use it. The idea was to decouple everything so that you can re-use functions / shaders across multiple programs. All good in theory but not all drivers implement that correctly and usually it isn't worth the trouble trying to share shader objects. In ES it's not supported afaik.

LWJGL uses strings for convenience. Pretty sure you can use the direct buffer method too.

Thanks but I'm still unclear on what the glLinkProgram function does. The glAttachShader function attaches the shaders to the program so what does linking the program do? Also this may be a stupid question but when you say shader object do you mean shader? Are they the same thing?


Although your vertex and fragment shaders may both have successfully compiled, they still have to be linked together to form a sort of pipeline:

Linking does further optimizations based on how the vertex shader and pixel shader interact with each other. Let's say you have color data in your VBO and your vertex shader reads this and passes it on to the pixel shader. However, the pixel shader ignores the color value and makes everything white regardless of the color value. In this case, the linking compiler will realize that generating a color value for each pixel is just wasted work since it won't be used at all, so it removes that output from the vertex shader. This in turn makes the color vertex attribute (vertex shader input) unnecessary since it's not being used either, and poof; there goes that as well, and you'll get -1 when you try to query the location of that attribute from Java (= attribute doesn't exist). GLSL compilers routinely optimize away unused uniforms and attributes like this.

What's the point of this? For example, you can write a massive vertex shader that does everything you'll ever need: Colors, texture coordinates, shadow map coordinates, normals, tangents, you name it. Then you can reuse this vertex shader for any number of fragment shaders that only use a small number of those output variables without having to worry about performance, since the compiler will automatically optimize away unused variables and computations that aren't needed by that specific fragment shader. The linking step allows you to mix and match vertex and fragment shaders and get optimal performance anyway. It also has uses in more advanced OpenGL.

Thanks but I'm still unclear on what the glLinkProgram function does. The glAttachShader function attaches the shaders to the program so what does linking the program do? Also this may be a stupid question but when you say shader object do you mean shader? Are they the same thing?

Before writing about those functions, I want to explain how C programs get created (just to introduce the term LINKING). There, linking means that the generated object code is (in some compilers) linked into a format the OS can understand, which is the executable file we see after compilation. The same concept applies here:

glLinkProgram

links the program into an executable that the GPU can run. Before linking the program, we attach the shaders to the program using the

glAttachShader

function. This function attaches the shaders to the program; the order in which you attach them doesn't matter.

Then, at execution time, the program passes the vertex data to the vertex shader.

I thought it'd be unrelated so I decided not to write anything, but here goes:

You have to set up certain things before linking. In OpenGL 3+, there's no built-in gl_FragColor output for fragment shaders, so you have to define your output(s) yourself. When combined with MRT (rendering to multiple textures at the same time), you have to specify which output goes to which color attachment using glBindFragDataLocation(). This has to be done before linking. (Note: The output index can also be defined in your shader.)

The same is true when capturing vertex data using transform feedback. You have to tell OpenGL which outputs of your vertex or geometry shader you're interested in using glTransformFeedbackVaryings() to prevent the GLSL compiler from potentially optimizing those attributes away. Again, this has to be done before linking.

(pixels) --> (fragment_shader) // Adds the colour data from the textures to the pixels and lighting

Those pixels will then be transformed to screen coordinates and displayed on the screen. ...

Just a small detail, but I'd like to point out that pixels are generated by transforming the geometry to screen coordinates and filling in the pixels that have their centers covered by the geometry. Screen coordinate transformation happens before the fragment shader and the result is available to the fragment shader in the built-in varying gl_FragCoord.

Question: in what order is everything done? My understanding (I'm assuming things here) is that when glDrawArrays is called, the glVertexAttribPointer call formats the data in the VBO and sends that data to the attribute index specified by the first argument. Since attribute index 0 is enabled, the position field is initialized? Am I correct?


glVertexAttribPointer() does not "format" the data in the VBO. It simply explains how to interpret it.

glVertexAttribPointer(0, 4, GL_FLOAT, false, 0, 0);

Arguments:

1: Which attribute location this should be put in. Basically, which shader input variable should we store this in?
2: The number of components of this attribute. If 4, then the shader input variable has to be a vec4.
3: Data type.
4: Should the data be normalized? Used when uploading bytes, shorts and ints. If true with GL_UNSIGNED_BYTE, then the byte range 0-255 is mapped to 0.0 to 1.0. If false, then it's treated as 0.0 to 255.0. If normalized and GL_BYTE, the signed byte range is mapped to -1 to 1.
5: How many bytes each vertex is. 0 = tightly packed, in which case it'll calculate the size of this variable and use that. In this case, that's 4 components times 4 bytes per float, so 16.
6: Offset in bytes. As with stride, useful when having more than one attribute interleaved in the VBO.

As you can see, nothing here actually modifies the VBO. It tells OpenGL what to do with the data in it. When you then call glDrawArrays(), OpenGL will read vertex data from the VBO based on the vertex attribute setup.

glDrawArrays(GL_TRIANGLES, 0, 3);

renders 3 vertices. For each vertex, it goes over all enabled attributes and reads data from a VBO as specified by each corresponding call to glVertexAttribPointer(). For the first vertex (vertex 0), it'd look at attribute 0, see that it should read 4 floats starting at position (offset + vertexID * stride) = (0 + 0*16), so it reads bytes 0 through 15. For the second, it starts at (0 + 1*16) = 16, etc...
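The fetch arithmetic above, written as a tiny helper (an illustrative sketch with invented names, not OpenGL code):

```java
public class VertexFetchDemo {
    // An attribute's data for a given vertex starts at offset + vertexID * stride.
    // A stride of 0 means "tightly packed": the attribute's own size is used instead.
    public static int byteOffset(int offset, int stride, int attribSizeBytes, int vertexId) {
        int effectiveStride = (stride == 0) ? attribSizeBytes : stride;
        return offset + vertexId * effectiveStride;
    }

    public static void main(String[] args) {
        // glVertexAttribPointer(0, 4, GL_FLOAT, false, 0, 0): 4 floats = 16 bytes.
        System.out.println(byteOffset(0, 0, 16, 0)); // 0
        System.out.println(byteOffset(0, 0, 16, 1)); // 16
        System.out.println(byteOffset(0, 0, 16, 2)); // 32
    }
}
```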

If w == 1, then the vector (x, y, z, 1) is a position in space.
If w == 0, then the vector (x, y, z, 0) is a direction.

So first off, what are the scenarios where we'd need a direction vector? Is this related to the direction of scaling an object or something? Also I thought that W could be between 0 and 1. What happens if the W component is defined as, let's say, 0.25?
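The quoted rule can be checked numerically (a plain-Java sketch, names invented for the example): a translation moves a w == 1 point but leaves a w == 0 direction alone, because the translation column gets multiplied by w. (As for fractional w: perspective projection produces such intermediate values, and x, y, z are later divided by w.)

```java
public class HomogeneousDemo {
    // Equivalent to multiplying (x, y, z, w) by a translation matrix with tx=ty=tz=10:
    // the translation contributes 10 * w to each of x, y and z.
    public static float[] translateBy10(float x, float y, float z, float w) {
        return new float[] { x + 10 * w, y + 10 * w, z + 10 * w, w };
    }

    public static void main(String[] args) {
        float[] point     = translateBy10(1, 2, 3, 1);
        float[] direction = translateBy10(1, 2, 3, 0);
        System.out.println(point[0]);     // 11.0 -- the point moved
        System.out.println(direction[0]); // 1.0  -- the direction didn't
    }
}
```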

Translation: Let's now refer to the image that it provides.

Let's say we changed that matrix to:

1, 1, 0, 10
0, 1, 0, 0
0, 0, 1, 0
0, 0, 0, 1

We'd get (30, 10, 10, 1). We get a change to the X coordinate but we're multiplying it by our Y coordinate. Can someone explain what use this is? Do any of the OpenGL functions modify that part of the matrix? Also, what is the translation column actually for? Couldn't you do translations in the X/Y/Z columns?

When you rotate over the Z axis: the incoming X affects the outgoing Y, and the incoming Y affects the outgoing X.

That is when 'OpenGL uses that part of the matrix'.
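Multiplying the modified matrix out confirms the result (assuming the vertex (10, 10, 10, 1) that the quoted example implies): the extra 1 in row 0, column 1 feeds the incoming Y into the outgoing X, which is exactly the kind of off-diagonal term a Z rotation fills in.

```java
public class ShearDemo {
    // Row-major 4x4 matrix times a column vector.
    public static float[] apply(float[][] m, float[] v) {
        float[] r = new float[4];
        for (int row = 0; row < 4; row++)
            for (int col = 0; col < 4; col++)
                r[row] += m[row][col] * v[col];
        return r;
    }

    public static void main(String[] args) {
        float[][] m = {
            {1, 1, 0, 10}, // outgoing X = X + Y + 10
            {0, 1, 0, 0},
            {0, 0, 1, 0},
            {0, 0, 0, 1}
        };
        float[] v = apply(m, new float[]{10, 10, 10, 1});
        System.out.println(v[0] + ", " + v[1] + ", " + v[2] + ", " + v[3]); // 30.0, 10.0, 10.0, 1.0
    }
}
```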

Ahh thanks, I found a page which explains everything.

Next question: how do the stride and offset parameters of glVertexPointer() and the like work? I've tried searching around and I can only find "defines the byte offset between data" style explanations, which I don't understand, and C++-related explanations that use sizeof() as a parameter, which I don't understand either. Examples would be appreciated.

EDIT: Never mind, I'm pretty sure that I'm correct in thinking that that function is used for VAOs. The difference between VBOs and VAOs is that the data for a VBO is placed on the GPU and you access it via a handle, whereas with a VAO you have to keep creating the buffer. Am I correct?

About the stride and offset thing. The stride defines how big all of a vertex's attributes together are, in bytes.

In your example you have two vertex attributes. Each of them consists of 3 floats and a float has a size of 4 bytes, so you need 24 bytes (2 * 3 * 4 = 24) for one vertex. So that OpenGL knows where it can find each single attribute, you define an offset into this vertex data block. In your example the position attribute is at the beginning of each block, so its offset is 0. The color attribute is placed right after the position attribute, so its offset is equal to the size of the position attribute (3 * 4 = 12 bytes).

When you don't want an interleaved data structure (i.e. you store the data batched as VVVCCC instead of VCVCVC), you would bind the same buffer but with different pointer offsets.
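The offsets for both layouts, written out as a sketch (illustrative names; the 100-vertex count is an arbitrary assumption): with interleaving, an attribute's data for vertex i starts at its offset plus i times the 24-byte stride, while in a batched VVVCCC buffer the colors only begin after all the positions.

```java
public class LayoutDemo {
    static final int FLOAT = 4; // bytes per float

    // Interleaved: [position(3 floats), color(3 floats)] per vertex, stride 24 bytes.
    public static int interleavedPos(int i)   { return i * 24; }
    public static int interleavedColor(int i) { return i * 24 + 12; }

    // Batched (VVVCCC) with vertexCount vertices: colors start after all positions.
    public static int batchedColor(int i, int vertexCount) {
        return vertexCount * 3 * FLOAT + i * 3 * FLOAT;
    }

    public static void main(String[] args) {
        System.out.println(interleavedPos(1));    // 24
        System.out.println(interleavedColor(1));  // 36
        System.out.println(batchedColor(0, 100)); // 1200
    }
}
```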

Well, there are some weird parameters in OpenGL that you're likely to never use, although having options never hurts.

In modern OpenGL you fill up buffers just as you do now (glBufferData(...)/glBufferSubData(...)), but to assign data that goes to your shaders you have to use glVertexAttribPointer(...), and using that function you can set offsets and strides, letting you use a single buffer for multiple attributes. This means that you can store your vertex, normal and texture coordinate (and even more) data in a single buffer/VBO and then render from it using your shaders.

To answer your question: you can use offsets in the pointers to tell OpenGL where your attribute begins in the buffer, but you shouldn't use this to "skip over" vertices as you're thinking right now. If you want to skip, for example, the first 5 vertices, you should render with glDrawArrays()'s first parameter (the argument actually named first) set to 5. I know it possibly sounds a bit overwhelming right now, but if you have any questions just ask; after all, that's what this topic is for.
