Blinn-Phong Shading

One of the goals of rendering is to put a credible appearance of light into the scene - we want to show how illumination changes as objects move through the scene. The most widely-used scheme to do that is the Blinn-Phong model.

In this model, light is represented by three channels - ambient (omnidirectional light that also falls onto shaded surfaces), diffuse (directional light that is scattered off a surface in all directions) and specular (light that is reflected mirror-like into a certain direction).

These light channels interact with surfaces which have similar properties - so-called materials. In a 3D modeling application, you can for instance define the ambient, diffuse and specular colors of a material and assign that material to a surface. In addition, a surface can have an emissive color, i.e. a color that is used even without any light source. Finally, it can also have a shininess parameter which determines how 'glossy' the surface will look.

Lighting data structures

In the GLSL version we're discussing here, lights and materials are described by pre-defined structs which are accessible as uniforms in the shader. Lights are actually stored in an array - gl_LightSource[0] is the first defined light source, gl_LightSource[1] the second, and so on. As usual, they have to be set in the C++ part of the rendering application; here we just assume a single light source is provided.

Materials are likewise made available as pre-defined uniform structs - there's for instance gl_FrontMaterial available to hold the properties of the front face of a triangle surface.

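Their properties are simply pulled out as for normal C++ structs, combined with GLSL features like swizzling; for instance, gl_LightSource[0].diffuse.rgb is the RGB part of the first light's diffuse color, and gl_FrontMaterial.shininess is the shininess of the front-face material.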
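As a minimal sketch of this kind of access, assuming GLSL 1.20 with the legacy built-in uniforms (the local variable names are made up for illustration), a fragment shader could read light and material fields like this:

```glsl
#version 120

void main()
{
    // Plain member access, as with a C++ struct.
    vec4 lightDiffuse = gl_LightSource[0].diffuse;
    float shininess   = gl_FrontMaterial.shininess;

    // Swizzling picks out components: .rgb of a color, .xyz of a position.
    vec3 materialRgb = gl_FrontMaterial.diffuse.rgb;
    vec3 lightPos    = gl_LightSource[0].position.xyz;

    // Use two of the values so the shader produces a visible result;
    // shininess and lightPos are only read here to show the syntax.
    gl_FragColor = vec4(materialRgb * lightDiffuse.rgb, 1.0);
}
```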

The algorithm

The idea of the Blinn-Phong model is that, within each channel, the color of a pixel is always the product (component by component) of light color and material color. Thus, illuminating a bright grey (0.8, 0.8, 0.8) surface with weak red (0.5, 0.0, 0.0) light results in a somewhat darker red pixel (0.4, 0.0, 0.0).

All channels are added up to give the final pixel color - thus some care has to be taken to not exceed the limits of color values, i.e. return a channel larger than unity.
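The component-wise product and the need to limit the sum can be sketched in a few lines of GLSL. The values are hard-coded for illustration: a grey (0.8, 0.8, 0.8) surface, a weak red (0.5, 0.0, 0.0) light, and a second contribution that is purely hypothetical and chosen so the red channel overshoots.

```glsl
#version 120

void main()
{
    vec3 surface = vec3(0.8, 0.8, 0.8);   // bright grey material color
    vec3 light   = vec3(0.5, 0.0, 0.0);   // weak red light color

    // The * operator on vectors multiplies component by component.
    vec3 lit = surface * light;           // (0.4, 0.0, 0.0)

    // Hypothetical contribution from another channel, added on top.
    vec3 sum = lit + vec3(0.7, 0.1, 0.0); // (1.1, 0.1, 0.0): red exceeds unity

    // clamp() brings every component back into the range 0..1.
    gl_FragColor = vec4(clamp(sum, 0.0, 1.0), 1.0);
}
```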

Emissive light is easiest - the emissive color is returned directly. Ambient light is then the product of the ambient channels of light source and material without any directional dependence. All surfaces in shadow will have ambient and emissive lighting, but no other channels.

The diffuse illumination of a surface depends on its orientation. If the normal of a surface points directly towards the light source, the surface is fully illuminated; if the light direction makes a 90 degree angle with the normal, so that the light only grazes the surface, it is just barely not illuminated. The functional dependence is the cosine of the angle between the two, and in vector algebra this is the dot product of the normalized normal and the normalized direction towards the light. If the dot product is negative, the surface in fact points away from the light and is shaded, so the value is clamped to zero. So the diffuse channel is the product of diffuse light, diffuse material color and this dot product.

The specular, mirror-like reflection is the most complicated. By the reflection law of geometrical optics (incident angle equals outgoing angle), the reflection is most pronounced when the half vector, i.e. the normalized vector halfway between the direction from the surface point to the light and the direction from the surface point to the eye, is aligned with the surface normal. How much deviation from this ideal condition is allowed is governed by the shininess parameter, which controls how far the highlight is blurred. Thus, the specular channel is given by the product of specular light, specular material color and the dot product between half vector and surface normal raised to the power of the shininess.
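Putting the channels together, the model can be summarized in one expression. The symbols are introduced here for illustration and do not appear in the original text: M and L are the material and light colors (subscripts e, a, d, s for emissive, ambient, diffuse, specular), the circled dot is the component-wise product, n is the unit normal, l the unit direction to the light, v the unit direction to the eye, h the half vector and s the shininess.

```latex
\begin{aligned}
C &= M_e + M_a \odot L_a
   + (M_d \odot L_d)\,\max(\mathbf{n}\cdot\mathbf{l},\,0)
   + (M_s \odot L_s)\,\max(\mathbf{n}\cdot\mathbf{h},\,0)^{s}, \\
\mathbf{h} &= \frac{\mathbf{l}+\mathbf{v}}{\lVert \mathbf{l}+\mathbf{v} \rVert}
\end{aligned}
```

In practice the specular term is usually also dropped when n·l is not positive, since a surface facing away from the light should not show a highlight.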

The code

To see these ideas applied, let's take a look at a pair of complete Blinn-Phong shaders. Here is the vertex shader:
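A minimal sketch of such a vertex shader, reconstructed from the description that follows rather than taken from an original listing, and assuming GLSL 1.20 with the legacy built-ins. Apart from ecViewdir, the varying and local names (diffuseTerm, normal, constantTerm) are assumptions.

```glsl
#version 120

// Handed on to the fragment shader, interpolated across the triangle.
varying vec4 diffuseTerm;  // diffuse color before the directional factor
varying vec3 ecViewdir;    // vector from eye to vertex, eye coordinates
varying vec3 normal;       // surface normal, eye coordinates

void main()
{
    // In eye space the eye sits at the origin, so the transformed
    // vertex position is also the vector from eye to vertex.
    ecViewdir = (gl_ModelViewMatrix * gl_Vertex).xyz;

    // Normals need gl_NormalMatrix, not gl_ModelViewMatrix.
    normal = gl_NormalMatrix * gl_Normal;

    // Diffuse channel without the dot product (applied per fragment).
    diffuseTerm = gl_FrontMaterial.diffuse * gl_LightSource[0].diffuse;

    // Emissive plus ambient: no directional dependence at all.
    vec4 constantTerm = gl_FrontMaterial.emission
                      + gl_FrontMaterial.ambient * gl_LightSource[0].ambient;
    gl_FrontColor = constantTerm;
    gl_BackColor  = constantTerm;

    gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}
```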

First, take note of the keyword varying. The variables we declare with this keyword are attached to the vertex at the end of the vertex processing stage and can be retrieved in the fragment shader, linearly interpolated across the triangle. For this role, we mark a color vector, the view direction in eye coordinates and the surface normal. Note that the fragment shader itself does not have access to coordinates or a normal automatically, so we have to declare here what we will need later.

The view direction towards the vertex in eye coordinates, ecViewdir, is simply the vertex position in eye coordinates, since the origin is at the eye position (this is one of the neat things about eye coordinate space).

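Take note that the normal is not transformed using gl_ModelViewMatrix but gl_NormalMatrix. The reason is a bit tricky to explain: it has to do with the fact that a normal actually is not a vector but, being the cross product of the vectors which span a surface, a pseudo-vector. As such, it has different transformation properties and needs a different matrix. gl_NormalMatrix is the inverse transpose of the upper-left 3x3 part of gl_ModelViewMatrix, which keeps normals perpendicular to the surface even when the transformation contains non-uniform scaling.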

The diffuse term is the Blinn-Phong diffuse channel as outlined above, but without the dot product (we're going to apply that in the fragment shader), and the constant term takes care of the ambient and emissive channels. The constant term is written to gl_FrontColor, which is something like a predefined varying vec4 that can be picked up in the fragment shader as gl_Color - depending on which side of a surface we see, gl_Color gets the value of either gl_FrontColor or gl_BackColor.
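The matching fragment shader can be sketched along the same lines. This is again a reconstruction from the surrounding description under the assumption of GLSL 1.20 with legacy built-ins, a single directional light and a shininess greater than zero; names such as diffuseTerm, n, lightDir and halfVector are assumptions, while NdotHV and the clamp line follow the text.

```glsl
#version 120

// Must match the declarations in the vertex shader.
varying vec4 diffuseTerm;
varying vec3 ecViewdir;
varying vec3 normal;

void main()
{
    // Interpolation does not preserve unit length, so normalize again.
    vec3 n = normalize(normal);

    // Directional light: the position (4th component 0) is a direction.
    vec3 lightDir = normalize(gl_LightSource[0].position.xyz);

    // ecViewdir points from eye to surface, so the direction to the eye
    // is its negative; the half vector lies between light and eye.
    vec3 halfVector = normalize(lightDir - normalize(ecViewdir));

    float NdotL = max(dot(n, lightDir), 0.0);

    // gl_Color carries the ambient + emissive sum from the vertex shader.
    vec4 color = gl_Color + diffuseTerm * NdotL;

    // Only surfaces facing the light get a specular highlight.
    if (NdotL > 0.0) {
        float NdotHV = max(dot(n, halfVector), 0.0);
        color += gl_FrontMaterial.specular * gl_LightSource[0].specular
               * pow(NdotHV, gl_FrontMaterial.shininess);
    }

    color = clamp(color, 0.0, 1.0);
    gl_FragColor = color;
}
```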

First, we pick up all the varying variables we have declared in the vertex shader - so now we have the vector from eye to vertex, the normal and the diffuse color before the directional dependence available. In addition, we pick up the sum of ambient and emissive color from gl_Color (that's what the rasterizer made from gl_FrontColor and gl_BackColor, remember?).

Then we start on light directionality. Here we just use the light source position rather than the difference between light position and vertex - why? Because this shader assumes the sun as the light source, which is for practical purposes infinitely far away. Floating point accuracy can't handle the real sun distance, but we can make the sun position a direction rather than a position (by setting the 4th component of the vector to 0, remember?) and assume the direction the light comes from is the same everywhere in the scene. Note that the position of the light source usually comes in eye coordinates - so we can directly combine it with the second vector ecViewdir to generate a half vector for mirror reflections. Since what we really want are direction vectors, note the liberal use of the normalize() function, which sets the length of a vector to unity without altering its direction.

In fact, we also normalize the normal. It should have unit length in the vertex shader, but since its components are interpolated across the triangle, there's no guarantee that it's still unit length when we pick it up in the fragment shader - we do it again just to be sure.

From there, the calculation is pretty straightforward. We compute the directional dot products, add the diffuse and specular terms and in the end make sure the color value is between 0 and 1 by the line color = clamp(color, 0.0, 1.0);. Note how shininess is used as the power of NdotHV - if the exponent is large, the region where there is strong mirror reflection is small and the object appears very glossy; if the exponent is small, the specular term is not so different from the diffuse one.

Finally, the computed color value is written out to gl_FragColor as required. It's good practice to write to this only once and assemble the final color in an internal variable like fragColor first!

Vertex vs. fragment shader

Just why are things done that way? Couldn't we have simply computed all directionality and the dot products in the vertex shader, where we have the coordinates available anyway, and then just passed the colors to the fragment shader? The alternative is passing all coordinates as varying to the fragment shader and computing there, which is expensive - every varying needs to be known for all vertices and fragments, and there are typically millions of fragments and often hundreds of thousands of vertices.

Yes, we could have.

The question is one of accuracy. Interpolation across a triangle is always linear. So are coordinates: if you interpolate the coordinates at the triangle's vertices linearly to the center of the triangle, you get the actual coordinates of the triangle center without any errors. And thus our pointing vectors above are the actual pointing vectors to the pixel we're processing, not approximations.

A dot product of normalized vectors (which involves a cosine function) or a power like pow(NdotHV, gl_FrontMaterial.shininess), however, is not linear. We can compute their values at the vertices of the triangle and interpolate them to the center, but that is not what we would get if we evaluated them for the true coordinates at the center of the triangle.

What mathematically happens if we compute non-linear functions of the coordinates in the vertex shader is that we replace a smooth, non-linear function by piecewise linear approximations. For relatively smoothly varying functions and a dense mesh, that may be completely acceptable. For a sharp specular reflection on a relatively coarse mesh, it certainly is not.
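A small worked example with made-up numbers shows the size of the effect. Assume a shininess of 10 and a highlight centered on a triangle edge, so that the dot product of normal and half vector is 0.8 at both end vertices and 1.0 at the midpoint. Evaluating the power at the vertices and interpolating gives the first value at the midpoint; evaluating it at the midpoint itself gives the second.

```latex
\text{per vertex: } \tfrac{1}{2}\left(0.8^{10} + 0.8^{10}\right) \approx 0.107
\qquad
\text{per fragment: } 1.0^{10} = 1
```

Under these assumptions the per-vertex version shows only about a tenth of the highlight intensity at the point where it should be brightest.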

Thus, to be safe, we make sure we pass only linearly interpolating quantities from vertex to fragment shader and evaluate all non-linearity there.

(There are also other concerns involved here, having to do with performance - in some circumstances, the vertex shader may just be much faster, and this may well be worth a hardly visible inaccuracy in lighting.)