Search

programming

I already wrote about using OpenGL 3.3 with Qt applications, using new style shaders and helper classes for handling shader programs and buffers. But there is one more important thing to do before we can start writing OpenGL 3.3 applications with Qt. The problem is that usually functions and constants from OpenGL 3.3 won't be available even if we have the appropriate libraries and drivers. That's simply how OpenGL works and we have to work around this limitation.

The qgl.h header, which is used by all other headers from the QtOpenGL module, includes <GL/gl.h> (or its equivalent, depending on the platform). However, on Windows this standard header is always compatible with version 1.1 of OpenGL (even if you have the latest Platform SDK), and on systems using recent versions of MESA (including most Linuxes) it's compatible with version 1.3. To have all the new symbols from version 3.3, you need to include <GL/glext.h>, but it also doesn't help much. First, this header is not available on Windows. Second, it only defines typedefs for function pointers that you have to retrieve by yourself using a platform-specific function, because they are not directly exported by the OpenGL library like in case of most other APIs. And even if they were, they may not be available on some platforms, depending on the actual version and available extensions, and you may still want your code to work without some of them.

There are some existing libraries that attempt to solve this problem by automatically loading those functions behind the scenes. The most popular ones are GLEW and GL Load (which is a part of the Unofficial GL SDK). They are cool but both are relatively huge (well over 2 MB of header files and source code) for a simple task of loading a few dozens of functions. They include a bunch of extensions which are not part of the OpenGL 3.3 core profile. They are also meant to completely replace <GL/gl.h>, and although they work with Qt, it's not an elegant solution.

Qt itself also has a rather funny approach to this problem. All classes that require 2.0+ functionality (shaders, buffers, etc.) use an internal header, qglextensions_p.h. It works in a somewhat similar way to those libraries. It defines the function pointer types and constants and then defines macros which replace canonical function names with appropriate entries in an internal structure which is stored in the QGLContext. Obviously we cannot rely on it because it's internal, and besides it only defines a small set of functions and constants which are directly used by Qt.

There is also a public class QGLFunctions which is part of the API, though it's not internally used by Qt. It takes a completely different approach and instead of using macros, it's a class with methods of the same name as canonical OpenGL functions. The recommended way to use it is to inherit this class in each class that needs to use those functions. It seems like a bit of WTF to me. Even worse, it only covers OpenGL/ES 2.0 which is fine for embedded applications, but not enough for a desktop application targeting OpenGL 3.3.

As you can probably guess I came up with a custom solution. The idea is that it only needs to add symbols not already defined in <GL/gl.h>, assuming that it's compatible with at least OpenGL 1.1. It also only covers the OpenGL 3.3 core profile without any additional extensions or features removed from the core profile (though those defined by <GL/gl.h> will still be available). It consists of a header file which is basically a slightly stripped version of gl3.h from the official OpenGL Registry. I basically removed everything pre-1.2 and post-3.3 and some other unnecessary stuff. Another header defines a structure holding all function pointers and all the necessary macro definitions, and a single source file contains code that initializes this structure using a QGLContext, which takes care of retrieving function pointers in a cross-platform way.

The size of all three files is a mere 120 kilobytes. Some day I may publish them as a separate mini-libary, but for now you can find them in the SVN repository of Descend.

In the previous article I wrote that using modern OpenGL (i.e. version 3.0 and above) is possible, although the core profile cannot be used yet. I also mentioned this article which briefly describes how to use the core profile, although in fact this example will also work in the default compatibility mode. In this mode we can use both the fixed pipeline and shaders, but I will focus on the "modern" approach.

Qt has a handy class called QGLShaderProgram which wraps the OpenGL API related to shaders. A big advantage of this class is that it supports all classes related to 3D graphics provided by Qt, such as QVector3D and QMatrix4x4, as well as basic types like QColor. This way we don't have to worry about converting those types to OpenGL types. Internally this class is little more than a GLuint storing the handle of the shader program and most its methods are simple wrappers around functions like glUniform3fv so it's very lightweight.

Note, however, that shaders work in quite a different way depending on the version of the GLSL specification. By default version 1.20 is assumed, so your shaders can access all information known from the fixed pipeline - vertex position, normal, texture coordinates, transformation matrices, lighting parameters, etc. Things change dramatically when you put the following declaration at the beginning of the shader:

#version 330

Any attempt to access these built-in uniforms and attributes will result in an error. It means that you have to pass all information using explicitly declared uniforms and attributes. For example, to define the world-to-camera transformation matrix, you could use the following code:

This is not only much more elegant than a series of calls to glMatrixMode, glIdentity, glRotate etc., but also faster and more flexible. The vector and matrix classes provided by Qt are really handy; the authors of this class even thought about the normalMatrix method that calculates the transposed inverse (or was it inversed transpose?) for transforming normal vectors.

Similarly, uniforms can be used to pass lighting parameters, materials, blending information and many more things which are not possible to achieve using the fixed pipeline. When it comes to attributes, the QGLShaderProgram offers a bunch of functions for passing single values to attributes (which are not very useful in most cases) and for passing arrays of various types. However this is not recommended, because OpenGL knows nothing about the contents of these arrays and it cannot assume that they don't change between executions of the shader or between successive frames.

A much better approach is to use the setAttributeBuffer method in connection with the QGLBuffer class. Internally this method is a wrapper for glVertexAttribPointer just like the attribute array methods, but it makes the code much more readable as it explicitly states that vertex buffers are used. In addition there's no need to cast the offset to a pointer because Qt will do that for us.

The QGLBuffer class is also a very thin wrapper around a GLuint representing the vertex buffer object (or index buffer or pixel buffer object). Unlike QGLShaderProgram it's a value type (it doesn't make sense to copy a program anyway), so we can share buffers without having to worry about tracking and releasing them when they are no longer needed.

In order to use the QGLBuffer, we need to create it and fill it with data; then we can bind it with the attributes of the shader program. By using appropriate offset and stride, we can easily bind multiple attributes to a single buffer; usually all attributes of a single vertex would be stored together, followed by the remaining vertices. Don't forget about calling enableAttributeArray for each attribute. We can also use another instance of QGLBuffer to store the indexes.

When everything is set up like this, the rendering is a matter of binding the program and both buffers to the context and calling glDrawElements. In more complex scenarios we can use multiple vertex array objects to store the bindings between vertex buffers and attributes. But since we're not using the core profile, OpenGL will create an implicit vertex array object for us.

We can also use uniform buffer objects to simplify passing lots of uniforms to multiple programs. Although Qt doesn't support them at the moment, there is a simple hack which allows us to abuse QGLBuffer. If you look at the declaration of this class you will notice that the values of the enumeration defining the type of a buffer are the same as the corresponding target constants in OpenGL. So we could simply pass GL_UNIFORM_BUFFER as the type of the buffer - I haven't tested it yet, but it should work.

Some time ago I stumbled upon a great e-book on OpenGL programming: Learning Modern 3D Graphics Programming. The best thing about it is that it teaches the modern approach to graphics programming, based on OpenGL 3.3 with programmable shaders, and not the "fixed pipeline" known from OpenGL 1.x which is now considered obsolete. I already know a lot about vectors, matrices and all the basics, and I have some general idea about how shaders work, but this book describes everything in a very organized fashion and it allows me to broaden my knowledge.

When I first learned OpenGL over 10 years ago, it was all about a bunch of glBegin/glVertex/glEnd calls and that's how Grape3D, my first 3D graphics program, actually worked. Fraqtive, which also has a 3D mode, used the incredibly advanced technique of glVertexPointer and glDrawElements, which dates back to OpenGL 1.1.

A lot has changed since then. OpenGL 2.0 introduced shaders, but they were still closely tied to the fixed pipeline state objects, such as materials and lights. The idea was that shaders could be used when supported to improve graphical effects, for example by using per-pixel Phong lighting instead of Gouraud lighting provided by the fixed pipeline. Since many graphics cards didn't support shaders at that time, OpenGL would gracefully fall back to the fixed pipeline functionality, and everything would still be rendered correctly.

Nowadays all decent graphics cards support shaders, so in OpenGL 3.x the entire fixed pipeline became obsolete and using shaders is the only "right" way to go. There is even a special mode called the "Core profile" which enforces this by disabling all the old style API. This means that without a proper graphics chipset the program will simply no longer work. I don't consider this a big issue. All modern games require a chipset compatible with DirectX 10, so why should a program dedicated to rendering 3D graphics be any different? Functionally OpenGL 3.3 is more or less the equivalent of DirectX 10, so it seems like a reasonable choice.

I was happy to learn that Qt supports the Core profile, only to discover that it's not actually working because of an unresolved bug. Besides, the article mentions that "some drivers may incur a small performance penalty when using the Core profile". This sounds like a major WTF to me, because the whole idea of the Core profile was to simplify and optimize things, right? Anyway I decided to use OpenGL 3.3 without enforcing the Core profile for now, but to try to implement everything as if I was using that profile.

Another problem that I faced is that my laptop is three years old, and even though its graphics chipset is pretty good for that time (NVIDIA Quadro NVS 140M), I discovered that the OpenGL version was only 2.1. I couldn't find any newer drivers from Lenovo, so I installed the latest generic drivers from NVIDIA and now I have OpenGL 3.3. Yay! So I modified my Descend prototype to use shaders 3.30 and QGLBuffer objects (which are wrappers for Vertex Buffer Objects and Index Buffer Objects), but I will write more about it in the next post.

The Descend project is now officially reactivated and yesterday I committed the current version of the prototype into SVN repository. The UI is very basic, but it does its job of drawing a 3D surface based on mathematical equations. So far it consist of three important parts:

A compiler and interpreter of Misc, a programming language designed for calculating geometry with high performance. You can read more about it in the previous post and I will keep returning to it as it's quite an interesting subject.

An adaptive tessellator which is described below in more details.

A vertex and pixel shader that perform per-pixel Phong shading, which looks nice on the Barbie-pink surface :).

A parametric surface is described by a function V(p, q), where V is the vector describing the location of a point in 3D space and p and q are parameters (usually in [0, 1] or some other range). Obviously the surface consists of an infinite number of points, so we must calculate a number of samples and join the resulting points into triangles. This process is called tessellation. If the triangles are small enough, the Phong shading will create an illusion that the surface is smooth and curved.

The only difficulty is to determine what does "small enough" mean. Some surfaces are flat or almost flat and need just a few triangles to look good. Other surfaces are very curved and require thousands of triangles. In practice most surfaces are flatter is some areas and more curved in other areas. Take a sphere for example. It's curvature is the same everywhere, but we must remember that our samples are not distributed uniformly on its surface. Imagine a globe: meridians are located much closer to each other near the poles than near the equator. So in practice the distance between two samples located near the equator is greater and the surface needs to be divided into more triangles. This way, the size of all triangles will be more or less the same. Without adaptive tessellation, triangles would be closely packed near the pole and very large near the equator.

The tessellation algorithm works by first calculating four points at the corners of the surface. Wait, where does a sphere have corners? Just unwrap it mentally into a rectangular map, transforming it from (x, y, z) space into the (p, q) space. This gives us a square divided diagonally into two triangles. Then we calculate a point in the middle of the diagonal and divide each triangle into two smaller triangles. This process can be repeated recursively until the desired quality is reached.

How to measure the quality? The simplest method is to calculate the distance between the "new" point and the line that we are attempting to divide. The greater the distance, relatively to the length of the line, the more curved the surface. If this distance is smaller than some threshold value, we simply assume that the point lays on the line and discard it. The smaller the threshold, the more accurate the tessellation and the more triangles we get.

Unfortunately there are situations when this gives wrong results. If the curvature of the surface between two points resembles a sinusoid, then the third point in between appears to be located very near the line drawn between those two points. The tessellation algorithm will assume that the surface is not curved in this area. This produces very ugly artifacts.

So I came up with a method which produces much more accurate results. In order to render the surface with lighting, we need to calculate normal vectors at every point. For the Phong shading to look nice, those normals must be calculated very accurately. So two more points are calculated at a very small distance from the original one and the resulting triangle is used to calculate the normal. Note that the angle between normals is a very accurate measure of the curvature. An algorithm which compares the angle between the normals of two endpoints and the normal of the "new" point with a threshold angle can handle situations like the above much better. It's also more computationally expensive, because we must calculate three samples before we can decide if the point is rejected or not.

Of course this method can also be fooled in some specific cases, but in combination with the first one it works accurately in most cases. Experimentation shows that the threshold angle of 5° gives excellent results for every reasonable surface I was able to come up with.

In practice we also have to introduce the minimum and maximum number of divisions. Up to a certain point we simply keep dividing the grid into smaller triangles without even measuring the curvature, because otherwise the results would be very inaccurate. And since the curvature may be infinite in some points, we also must have some upper limit.

Final notes: Adaptive tessellation of parametric surfaces is the subject of many PhD dissertations and my algorithm is very simplistic, but it's just fast and accurate enough for the purposes of Descend. Also it should not be confused with adaptive tessellation of displacement mapping, which is a different concept.

A long period since my last post indicates that a lot has happened. In case you didn't notice in the photo gallery, my son Adam was born on January 11th at about 3 a.m. I just uploaded some more recent photos to the gallery. He's changing a lot, already smiles, speaks various vowels and tries to catch objects and hold them in his hand. It's a great experience to be a father, although it requires a lot of patience. Well, it's never a bad time to learn something new, and I personally find it very inspiring :).

In the meantime I released an update for WebIssues, changed the layout of my web sites a bit and also made some improvements of Saladin (you can expect a new version soon). I also made a pretty good progress with my novel. I will write more about it shortly, but it's generally a story about hackers, magic and virtual worlds. The nature of my day job is rather destructive for my imagination (and for the amount of time I can spend on other things), but it turns out that I'm still able to use it and I have some interesting ideas from time to time.

I'm also very proud that after so many years I successfully revived the idea of Descend. In a nutshell, it's the successor of a program for drawing parametric surfaces which I wrote over 10 years ago. Last year I wrote an adaptive tessellator and I made a working prototype of the program using Qt's built-in JavaScript engine. It worked nicely, but it wasn't very fast. Qt uses the JavaScriptCore engine from WebKit, which is currently developed by Apple and powers the Safari browser. It is one of the fastest JS engines, with a register based vritual machine and built-in JIT compiler (you can find more interesting information about it in this post). However, JavaScript is an object oriented, dynamically typed language, best suited for creating complex AJAX-based applications. It's not optimized for number crunching.

So I designed my own language, which is statically typed and has only four types: boolean, float, 4D vector and 4x4 matrix. I called it Misc, although it doesn't have much to do with the old Misc engine that I wrote 7 years ago (which was dynamically typed and supported complex types like objects and arrays). I wrote a relatively simple compiler which produces bytecode and doesn't perform any fancy optimizations, and a stack based virtual machine with dedicated instructions for vector and matrix operations, trigonometric functions, etc. As soon as I got it working, I performed some benchmarks.

I created three versions of Descend: one based on JavaScriptCore, another one using the new Misc engine, and the third one with equations hard-coded in C++. All three versions were compiled using Visual C++ 2008 with full optimizations and link-time code generation. I measured the total time of generating a deformed sphere which consisted of 43,000 vertices and required calculating 135,000 samples (the tesselator needs 3 samples per vertex in order to accurately calculate normal vectors).

The results are very encouraging: the version using Misc is not only 5 times faster than the one based on JavaScriptCore, but also just 3.5 times slower than the one using native code! Obviously this doesn't mean that Misc code is almost as fast as native code, but the overhead added by the Misc interpreter is not very big, compared to the cost of the tessellator itself and the cost of calculating the trigonometric functions (which is the same for all three versions). In contrast, the overhead of JavaScriptCore is much more noticeable.

Surely there's still some room for improvement, but I doubt I would be able to squeeze more than a few CPU cycles without making the code significantly more complex. So I'd rather start working on the UI to make it possible to draw something more than a single surface. I will publish the first version once it's usable, which may take a few months depending on how much time I will be able to dedicate to it.