WebGL guide (part 1/2)

May 2020

News

Visits to my website are skyrocketing thanks to you! <3
Part 2 is coming soon, with shadowing, 3D file loading, and many other advanced WebGL techniques!
Feedback is welcome! You can contact me on Twitter or Github, or follow me to be notified when it's released :)
Would you be interested in a WebGL book based on this guide? Please tell me!

What is WebGL?

Presentation

WebGL is a JavaScript API designed to compute and draw 2D and 3D graphics very fast in a Web browser, by using the processing power of the GPU.
It's based on OpenGL ES and is programmed using shaders coded in GLSL (OpenGL Shading Language), a language similar to C and C++.
A scene rendered by WebGL is mainly made of vertices (points in 3D space, with coordinates X, Y, Z), which can be drawn as points, lines or triangles (colored, shaded or textured),
but many other effects can be displayed as well (shadows, particles, fog, post-processing...).

Two versions of the API exist: WebGL 1.0, supported by 97% of browsers, and its evolution WebGL 2.0, supported by 74% of browsers as of May 2020, according to caniuse.
This guide will focus on WebGL 1.0, but all the features added in WebGL 2.0 will be explained at the end if you want to take the plunge.

The workflow of a WebGL program can be summarized like this:

The JavaScript code initializes the WebGL program and pilots it to draw a 2D or 3D scene on the webgl context of an HTML5 canvas.

A first GLSL script called vertex shader is executed for every vertex of the scene. It computes the final position of each vertex and hands the result to a second GLSL script.

The second GLSL script, called fragment shader, is executed for every visible fragment (pixel) of the canvas, and computes its color. (The conversion of the vertices into fragments is called rasterization.)

The fragments constitute a bitmap image stored in a color buffer, which is finally displayed on the canvas.

The GLSL language

The shaders' source code can be placed in a JavaScript string or loaded from a separate file.
Here are the key features of their syntax:

An int is a whole number: 0, 1, 2, -10,...

A float is a number written with at least one decimal: 0.0, 0.1, 1.0, -10.5,...

Tests and loops are also available (if, else, switch, for, while), but loops must have a constant limit: you can't write

for(int i = 0; i < j; i++){...}

if j is a variable.

The entry point of each shader (where their execution starts) is a

void main(){...}

function.

Custom functions can also be created and called by main() or by each other, but recursion isn't allowed.

The precision of ints, floats and sampler2Ds (lowp / mediump / highp) can be set in each shader with a directive, like

precision highp int;

or

precision mediump float;

These directives must appear at the beginning of the shader's code. Only the float precision is mandatory (in the fragment shader); all the others have default values.

The vertex shader must set a global variable gl_Position containing the coordinates of the current vertex (It must also set gl_PointSize when rendering individual points).

The fragment shader must set a global variable gl_FragColor containing the color of the current fragment.
It has access to 3 global variables: gl_FragCoord (window coordinates), gl_PointCoord (coordinates inside a point) and gl_FrontFacing (current triangle orientation).

Communication between JavaScript and WebGL

Four main mechanisms exist to send data between the different scripts:

Attributes are global variables passed by JavaScript to the vertex shader. Their value can change for each vertex (ex: vertex coordinates).

Uniforms are global variables passed by JavaScript to both vertex and fragment shaders (ex: a color). Their value stays constant for an entire frame.

Varyings are not accessible by JavaScript. They can only be set by the vertex shader and read by the fragment shader.

Data buffers are big arrays of numbers passed by JS to the vertex shader in chunks of 1 to 4 values.
For example, if a long list of vertex coordinates (X,Y,Z, X,Y,Z, ...) is sent to the vertex shader 3 by 3, the shader will receive each chunk in the form of an attribute vec3.

Each attribute, uniform and varying must be declared before main() in the shaders that use them.

Don't worry, these features will be explained and illustrated in the next chapters.

Maths required to follow this guide

If you're not friends with maths, don't worry! 3D programming actually requires a very limited subset of maths, which is summarized below:

The basics of geometry:
- A point in 2D has two spatial coordinates (X horizontally, Y vertically).
- A point in 3D has a third Z coordinate for depth.
- The origin is the point where all the coordinates are equal to 0.

The basics of trigonometry:

An angle can measure between 0 and 360 degrees, which is equivalent to: 0 to 2 * Pi radians (Pi radians is half a turn).

An angle in degrees can be converted to radians by multiplying it by π/180.

An angle in radians can be converted to degrees by multiplying it by 180/π.

The trigonometric circle is a circle of radius 1, centered on the origin of a 2D plane. Every point of this circle corresponds to an angle, measured anti-clockwise:
The rightmost point represents the angle 0 (or 2*Pi rad), the topmost point is Pi/2 rad, the leftmost is Pi rad, and the bottom is 3*Pi/2 rad.

An angle bigger than 2*Pi or smaller than 0 is similar to the same angle modulo 2*Pi (ex: 5*Pi rad = Pi rad; -Pi/2 rad = 3*Pi/2 rad).

The cosine of an angle "α" is the X coordinate of the corresponding point on the trigonometric circle, and oscillates between -1 and 1.

The sine is the Y coordinate of the same point, and also oscillates between -1 and 1.

The tangent is the length of the tangent line segment between this point and the X axis. Its value goes from -∞ to +∞, and is equal to sin(α) / cos(α).
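These conversions and the 2*Pi wrap-around can be sketched in JavaScript:

```javascript
// Degree/radian conversions, as described above.
const deg2rad = d => d * Math.PI / 180;
const rad2deg = r => r * 180 / Math.PI;

// Bring any angle back into [0, 2*Pi[ (works for negative angles too).
const wrap = a => ((a % (2 * Math.PI)) + 2 * Math.PI) % (2 * Math.PI);

console.log(deg2rad(180));       // ≈ 3.14159 (Pi radians = half a turn)
console.log(wrap(5 * Math.PI));  // ≈ 3.14159 (5*Pi rad = Pi rad)
console.log(wrap(-Math.PI / 2)); // ≈ 4.71239 (-Pi/2 rad = 3*Pi/2 rad)
```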

Vectors:

A vector is an array of numbers. It can represent either a point in space (a vertex), or a direction (an offset).

When it represents a point, it's a list of coordinates.
For example, [2,4] can represent the X and Y coordinates of a 2D point, and [3,5,2] the X, Y and Z coordinates of a 3D point.

When it's a direction (from a position in space to another position), it represents how the offset is applied in each coordinate. You can imagine it like an arrow.
for example [1,2,3] means X offset of 1 unit, a Y offset of 2 units and a Z offset of 3 units.

Contrary to vertices, direction vectors don't have a position. They only represent an offset, and this offset can start from anywhere.

You can build a vector AB (going from a point A to a point B) like this: AB = [xB - xA, yB - yA, zB - zA].

Measuring the length (or magnitude) of a vector is similar to measuring the distance between two points with the Pythagorean theorem:
||V|| = sqrt(xV² + yV² + zV²).

Normalizing a vector consists in adjusting its length to 1 unit, without changing its direction. It's equivalent to scaling it by 1 / ||V||.

The relative angle between two normalized vectors V and W can be computed using the dot product: V.W = xV * xW + yV * yW + zV * zW.
The dot product is equal to the cosine of the angle between the vectors. For example, it's 1 if they're equal, 0 if they're perpendicular, -1 if they're opposite.

The cross-product of two vectors V and W is a vector perpendicular to both vectors. It can be computed like this: V×W = [yV*zW - zV*yW, zV*xW - xV*zW, xV*yW - yV*xW].

The normal of a triangle ABC is a vector perpendicular to its surface, more precisely, perpendicular to any vector inside the triangle.
It can be computed as the cross-product of the vectors AB and BC, if the points A, B and C are arranged counter-clockwise. (if clockwise, the normal will point to the opposite direction).
The normal of a triangle, as its name suggests, must be normalized, because it will often be involved in dot products.
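The vector operations above can be sketched in JavaScript (vectors as plain [x, y, z] arrays):

```javascript
// Basic 3D vector operations from this section.
const sub       = (b, a) => [b[0]-a[0], b[1]-a[1], b[2]-a[2]]; // vector AB = B - A
const length    = v => Math.sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);
const normalize = v => { const l = length(v); return [v[0]/l, v[1]/l, v[2]/l]; };
const dot       = (v, w) => v[0]*w[0] + v[1]*w[1] + v[2]*w[2];
const cross     = (v, w) => [
  v[1]*w[2] - v[2]*w[1],
  v[2]*w[0] - v[0]*w[2],
  v[0]*w[1] - v[1]*w[0]
];

// Normal of a triangle ABC (A, B, C counter-clockwise):
// cross-product of AB and BC, normalized.
const normal = (A, B, C) => normalize(cross(sub(B, A), sub(C, B)));

// A triangle in the XY plane, declared counter-clockwise,
// has a normal pointing along +Z:
console.log(normal([0,0,0], [1,0,0], [0,1,0])); // [0, 0, 1]
```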

Matrices:

A matrix is a grid of numbers. It represents a system of linear equations that can be applied to any vector with a multiplication.
Multiplying a matrix and a vector consists in computing the dot product of the vector with each line of the matrix.
For example, in 3D, the "identity" matrix below transforms a vector into itself (it's a neutral operation):

    [1 0 0]
I = [0 1 0]
    [0 0 1]

A matrix can be transposed by inverting its horizontal and vertical axis (the diagonal stays unchanged):

    [1 2 3]        [1 4 7]
A = [4 5 6]   Aᵀ = [2 5 8]
    [7 8 9]        [3 6 9]

Two or more matrices of equal size can be multiplied together to combine (accumulate) their transformations.
The result is a matrix containing the dot products of each line of the first matrix with each column of the second matrix.
The combination order is important, and is from right to left. For example, a matrix that performs a translation T, then a rotation R, then a scale S is equal to S * R * T.

Finally, a matrix can be inverted, with a complicated equation that we'll see later in this guide. The inverse cancels the transformation made by the original matrix. Hence:
A × A⁻¹ = identity
A × A⁻¹ × V = V
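Transposition and multiplication can be sketched in JavaScript (matrices stored as 16 numbers, row by row, a convention chosen here for readability):

```javascript
// 4x4 matrices stored as flat arrays of 16 numbers, row by row (row-major).
const identity = () => [1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1];

// Transpose: swap rows and columns (the diagonal stays unchanged).
const transpose = m => m.map((_, i) => m[(i % 4) * 4 + Math.floor(i / 4)]);

// Multiply two 4x4 matrices: dot product of each row of a with each column of b.
const multiply = (a, b) => {
  const r = [];
  for (let i = 0; i < 4; i++)
    for (let j = 0; j < 4; j++) {
      let s = 0;
      for (let k = 0; k < 4; k++) s += a[i*4+k] * b[k*4+j];
      r.push(s);
    }
  return r;
};
```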

Homogeneous coordinates:

Finally, a little extension of the 3D vectors and matrices above, consists in giving them a fourth dimension, called W.
The goal is not to draw 4-dimensional objects, but to allow more operations on vertices, like translations and camera projections.
Those 4D vectors are called homogeneous coordinates and are noted [X, Y, Z, 1] for vertices, and [X, Y, Z, 0] for normals.
For example, here's the translation matrix that moves a point 2 units along X, 3 units along Y, and -1 unit along Z:

[1 0 0  2]
[0 1 0  3]
[0 0 1 -1]
[0 0 0  1]

Remarks:
- The (4D) transformation matrix performing a translation is an identity matrix, with the X, Y and Z offsets encoded on the last column.
- These offsets are multiplied by the fourth element of the 4D vector, that's why a point can be translated (W = 1) and a normal cannot (W = 0).
- The 3x3 matrix on the top left corner can still be used to perform rotations and scalings.
- When a vertex is rendered on the screen, only its X, Y and Z coordinates matter. Its W coordinate, only useful for computing translations and projections, is discarded.
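Applying the translation above to homogeneous coordinates shows why a point moves (W = 1) while a normal doesn't (W = 0); here's a sketch with a row-major flat array:

```javascript
// Translation matrix from the example above (2 along X, 3 along Y, -1 along Z),
// stored row by row.
const T = [
  1, 0, 0,  2,
  0, 1, 0,  3,
  0, 0, 1, -1,
  0, 0, 0,  1
];

// Multiply a 4x4 matrix (row-major) by a 4D vector:
// dot product of the vector with each row of the matrix.
const transform = (m, v) => [0, 1, 2, 3].map(i =>
  m[i*4]*v[0] + m[i*4+1]*v[1] + m[i*4+2]*v[2] + m[i*4+3]*v[3]
);

console.log(transform(T, [1, 1, 1, 1])); // point:  [3, 4, 0, 1] (translated)
console.log(transform(T, [1, 1, 1, 0])); // normal: [1, 1, 1, 0] (unchanged)
```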

2D graphics

Hello, point

Here's the simplest possible WebGL program, drawing a red, square point in the middle of the canvas.

It's a live demo, feel free to play with the code and change some values!

Demo

What happens here?

Two JavaScript objects are essential in a WebGL app: the canvas context gl, returned by

canvas.getContext('webgl')

, and program returned by

gl.createProgram()

(lines 4 & 37).

JavaScript also uses the functions createShader, shaderSource, compileShader, attachShader, linkProgram and useProgram to set up and run the app (lines 27-41),
and the functions clearColor, clear and drawArrays to set the default background color, clear the canvas and draw a point on it (lines 48-59).

The vertex shader (lines 7-15) sets the vec4 gl_Position (x, y, z, 1.0) and gl_PointSize (in pixels). It is executed once, as there's only one vertex.
Since we're drawing in 2D, the point's Z coordinate is 0, while X and Y are in the range [-1 : 1]: within the bounds of the canvas.
The 4th vertex coordinate (W) is fixed to 1.0, and allows many transformations detailed in the next chapters.

The fragment shader (lines 18-24) sets the vec4 gl_FragColor (r, g, b, alpha), where each component is in the range [0 : 1]. It is executed 100 times (once for each pixel inside the point).
It starts with a mandatory directive, used to define the precision of its floating numbers:

precision mediump float;

(lowp and highp are also available, but less useful).

If an error occurs during the compilation, it's caught by getShaderInfoLog or getProgramInfoLog (lines 44-46) and logged in the browser's JS console.

Tips & tricks

On some devices, the biggest supported point size is 62px (more info on webglstats).

On some devices, the points may disappear entirely if their center is outside of the canvas (more info on webglfundamentals).

If you don't like placing your shaders' source code in a JS string, you can also use:

script blocks (

<script type="x-shader/x-vertex" id="vshader">...</script>

/

<script type="x-shader/x-fragment" id="fshader">...</script>

(and retrieve it with

document.getElementById('vshader').innerText

/

document.getElementById('fshader').innerText

).

external files, like fshader.glsl and vshader.glsl (and retrieve them as text with XHR, fetch, or any method of your choice).

The program creation (lines 26 to 46) is always the same and pretty verbose, so we can put it in a compile() function and reuse it in the next chapters (see webgl.js):
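As a sketch, such a compile() helper could look like this, based on the calls listed above (error handling kept minimal; the real webgl.js may differ in its details):

```javascript
// Compile a WebGL program from two GLSL sources and make it current.
// `gl` is a WebGL rendering context.
function compile(gl, vshader, fshader) {

  // Compile the vertex shader
  const vs = gl.createShader(gl.VERTEX_SHADER);
  gl.shaderSource(vs, vshader);
  gl.compileShader(vs);

  // Compile the fragment shader
  const fs = gl.createShader(gl.FRAGMENT_SHADER);
  gl.shaderSource(fs, fshader);
  gl.compileShader(fs);

  // Create the program, attach the shaders, link and use it
  const program = gl.createProgram();
  gl.attachShader(program, vs);
  gl.attachShader(program, fs);
  gl.linkProgram(program);
  gl.useProgram(program);

  // Log compilation / linking errors, if any
  console.log('vertex shader:', gl.getShaderInfoLog(vs) || 'OK');
  console.log('fragment shader:', gl.getShaderInfoLog(fs) || 'OK');
  console.log('program:', gl.getProgramInfoLog(program) || 'OK');

  return program;
}
```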

Custom values: attributes and uniforms

Of course, WebGL wouldn't be interesting if it could just draw one hard-coded point.
To make it less rigid, we can give it custom values. This can be done with attributes (readable by the vertex shader) and uniforms (readable by both shaders).

An attribute is a variable that can contain a float or a vector (vec2, vec3, vec4). Your program should not exceed 16 attributes to work on all devices.

(vec2 and vec3 are declared similarly to vec4, mat2 and mat3 similarly to mat4).

Tips & tricks

The fourth value of a vec4 attribute is 1.0 by default, so it's frequent to encounter some code that only sets x, y and z with

gl.vertexAttrib3f(position, 0, 0, 0)

.

Boolean uniforms also exist in the language's specs, but don't work on all devices. If you need one, consider replacing it with an int or a float.

Matrix attributes also exist in the language's specs, but JavaScript doesn't have a convenient method to set their value, and they're not very useful anyways.

You can draw as many points as you want by setting new attributes / uniforms values and calling drawArrays again.
The same shaders will be executed each time but with different inputs.
For example, you can add these lines to add 2 other points:

Inside the fragment shader, you have access to a gl_PointCoord vec2 telling where the fragment is placed in the point (x and y coordinates are between 0 and 1).
Moreover, in GLSL, you can prevent a fragment from being rendered using the discard; statement, and measure a distance between two points with distance().
As a result, you can make a rounded point by discarding every fragment further than a radius of 0.5 from the center:
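Putting gl_PointCoord, distance() and discard together, such a fragment shader could look like this sketch (the red color is arbitrary):

```glsl
precision mediump float;

void main() {
  // gl_PointCoord goes from (0,0) to (1,1) across the point;
  // discard every fragment further than 0.5 from its center.
  if (distance(gl_PointCoord, vec2(0.5, 0.5)) > 0.5) discard;
  gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0); // red
}
```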

Drawing many points, a matter of continuity

By default, WebGL has no problem with gl.drawArrays being called many times in a row, but only if these calls happen during the same frame.
Here's a program trying to draw a new random point every 500ms:

Demo

What happens here?

As you can see, the canvas is not cleared (in black), but completely reset each time drawArrays() is called. This is the standard behavior when the draws happen at different moments.
To solve this, there are two solutions:

Save the positions and colors of every new point in a JS array or object. Then every 500ms, clear the canvas (to make the background black) and redraw all the saved points;

Force

{ preserveDrawingBuffer: true }

when creating the WebGL context, as you can see by removing the commented code on line 4.
In this case, you won't have to clear the canvas if you want the old points to stay visible and immobile.

In both cases, the result will look like this:

Both solutions are okay in this example, but in real conditions (animated WebGL scenes with moving objects), you don't want the previous frames to stay visible.
So the only solution will be to clear the canvas and redraw everything at each new frame.
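Such an animation loop can be sketched like this (drawScene is a hypothetical placeholder for your own drawing code, and gl is your WebGL context):

```javascript
// Redraw the whole scene on every frame.
// `gl` and `drawScene` are assumed to exist in your program.
function frame() {
  gl.clear(gl.COLOR_BUFFER_BIT); // erase the previous frame
  drawScene();                   // redraw all the objects
  requestAnimationFrame(frame);  // schedule the next frame
}

// Start the loop with: requestAnimationFrame(frame);
```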

Drawing lines and triangles

The next step consists in declaring many points at once, and telling WebGL how to display them: as points, as lines or as triangles.
To do this, we'll use a data buffer (an array of binary numbers) to send vertex properties from JS to the vertex shader, via an attribute. The following types are supported:

Name                   | Bounds                           | Bytes | JS container            | WebGL type
Unsigned byte          | 0 ... 255                        | 1     | new Uint8Array([...])   | gl.UNSIGNED_BYTE
Signed short integer   | -32,768 ... 32,767               | 2     | new Int16Array([...])   | gl.SHORT
Unsigned short integer | 0 ... 65,535                     | 2     | new Uint16Array([...])  | gl.UNSIGNED_SHORT
Signed integer         | -2,147,483,648 ... 2,147,483,647 | 4     | new Int32Array([...])   | gl.INT
Unsigned integer       | 0 ... 4,294,967,295              | 4     | new Uint32Array([...])  | gl.UNSIGNED_INT
Floating point number  | -2¹²⁸ ... 2¹²⁷                   | 4     | new Float32Array([...]) | gl.FLOAT

Then, gl.drawArrays can render these vertices as points, lines and triangles in 7 different ways, by changing its first parameter:

Demo

Here's the simplest way to draw a colored triangle:

What happens here?

A data buffer is filled with 3 points coordinates and bound to a position attribute with createBuffer, bindBuffer, bufferData, vertexAttribPointer & enableVertexAttribArray (lines 31-49).

At the end, we tell gl.drawArrays to render these points as a triangle. As a result, every fragment inside the triangle will automatically reuse the "color" uniform variable (red).

If you replace gl.TRIANGLES with gl.LINE_LOOP, only the lines between points 0-1, 1-2, and 2-0 will be rendered, with a line width of 1px (1 fragment).
You can also try gl.LINE_STRIP to trace lines between points 0-1 and 1-2, and gl.LINES to draw a line between points 0 and 1, as it only works on consecutive pairs of points.
Unfortunately, the line width can't be changed on most devices, so we have to stick with 1px lines or "cheat" with triangles (more info on MDN and mattdesl's website).

Lines and triangles do not need gl_PointSize to be set in the vertex shader. If you replace gl.TRIANGLES with gl.POINTS, you'll have to set gl_PointSize again or they won't appear.

Tips & tricks

You can draw as many triangles as you want by adding vertex coordinates on line 33 and updating the vertex count on line 61.

In all the following chapters' demos, you can replace gl.TRIANGLES with gl.LINE_LOOP to see the scene in wireframe.

WebGL does antialiasing (pixel smoothing) by default. This can be disabled with

canvas.getContext('webgl', {antialias: false});

, to save resources, especially on retina screens.

The buffer creation and binding is also quite verbose, so let's put it in the function buffer(), in webgl.js:

Multi-attribute buffer and varying color

Now, we want to give a different color to our three vertices, and draw a triangle with them.
The vertex colors can be transmitted to the fragment shader via a varying variable, to produce a gradient (this process is called color interpolation).
The X/Y/Z and R/G/B values for each vertex can be stored in two data buffers, or in an interleaved data buffer, like here:

Demo

What happens here?

A buffer of 3 x 6 floats is initialized and bound to the program (lines 27-43).

Then, for every chunk of 6 floats in the data buffer,
-

gl.vertexAttribPointer(position, 3, gl.FLOAT, false, FSIZE*6, 0);

reserves the first 3 values for the attribute position (line 47),
-

gl.vertexAttribPointer(color, 3, gl.FLOAT, false, FSIZE*6, FSIZE*3);

reserves the last 3 values for the attribute color (line 59).

The last two params of vertexAttribPointer (stride and offset) are counted in bytes, and the size of a data buffer item can be retrieved using BYTES_PER_ELEMENT (line 39).

gl.enableVertexAttribArray (lines 55 and 67) finishes binding the attributes to the verticesColors data buffer.
The data buffer is not named explicitly though (the last buffer bound to the WebGL program is used automatically).

The varying v_color is declared in both shaders.
- In the vertex shader, it receives the color of the current vertex.
- In the fragment shader, its value is automatically interpolated from the three vertices around it:

Tips and tricks

Color interpolation also works in LINES, LINE_STRIP and LINE_LOOP modes.

Most WebGL tutorials online stop when they reach this famous "tricolor triangle" step. But there's a lot more to cover! ;)

Contrary to POINTS mode (which has gl_PointCoord), in TRIANGLES mode there is no global variable indicating where the current fragment is situated inside the triangle.
But you have access to gl_FragCoord, telling where the fragment is positioned on the canvas.

Translate, rotate, scale

If we want to move, rotate or scale a triangle, we need to know how to transform each of its vertices.

Translation consists in moving all the vertices in a given direction (by increasing or decreasing their X/Y/Z coordinates).

Rotation consists in moving the vertices around a pivot point, with a given angle (a full turn is 360 degrees or 2 * Pi radians).

Scaling consists in making the triangle smaller or bigger by bringing the vertices closer or further from a pivot point.

These operations can be done component per component (compute the new value of X, then Y, then Z), but we generally use a much more powerful tool: matrix transformations.
Each transformation can be written as a mat4 (a matrix of 4x4 floats), and applied to a vertex's homogeneous coordinates (vec4(X, Y, Z, 1.0)) with a multiplication.

Here are the main transformations, the equations they apply to each vertex coordinate, and the corresponding matrix form:

Identity (no change)

x' = x
y' = y
z' = z

[x']   [1 0 0 0]   [x]
[y'] = [0 1 0 0] × [y]
[z']   [0 0 1 0]   [z]
[1 ]   [0 0 0 1]   [1]

Translation along the X, Y and Z axis

x' = x + Tx
y' = y + Ty
z' = z + Tz

[x']   [1 0 0 Tx]   [x]
[y'] = [0 1 0 Ty] × [y]
[z']   [0 0 1 Tz]   [z]
[1 ]   [0 0 0 1 ]   [1]

Rotation around the X axis with an angle φ (φ is in radians)

x' = x
y' = y cos φ - z sin φ
z' = y sin φ + z cos φ

[x']   [1  0      0       0]   [x]
[y'] = [0  cos φ  -sin φ  0] × [y]
[z']   [0  sin φ   cos φ  0]   [z]
[1 ]   [0  0      0       1]   [1]

Rotation around the Y axis with an angle θ

x' = x cos θ + z sin θ
y' = y
z' = -x sin θ + z cos θ

[x']   [ cos θ  0  sin θ  0]   [x]
[y'] = [ 0      1  0      0] × [y]
[z']   [-sin θ  0  cos θ  0]   [z]
[1 ]   [ 0      0  0      1]   [1]

Rotation around the Z axis with an angle ψ

x' = x cos ψ - y sin ψ
y' = x sin ψ + y cos ψ
z' = z

[x']   [cos ψ  -sin ψ  0  0]   [x]
[y'] = [sin ψ   cos ψ  0  0] × [y]
[z']   [0      0       1  0]   [z]
[1 ]   [0      0       0  1]   [1]

Scaling along the X, Y and Z axis

x' = Sx * x
y' = Sy * y
z' = Sz * z

[x']   [Sx  0   0   0]   [x]
[y'] = [0   Sy  0   0] × [y]
[z']   [0   0   Sz  0]   [z]
[1 ]   [0   0   0   1]   [1]

Demo

What happens here?

This demo performs 3 transformations on the same triangle: translate, then rotate, then scale.
These transformations can be done in this order by multiplying their matrices from right to left, and multiplying the resulting matrix product with the vertex coordinates (see line 14).
These 3 matrices are declared in JS and sent to the fragment shader using uniforms (see lines 41 to 72).
WebGL expects uniform matrices in column-major order (the transpose flag of uniformMatrix4fv must be false in WebGL 1.0), so we transpose our row-major matrices manually in the JS code.
In the following chapters, the matrix product will be computed only once (in JS) and passed to the vertex shader, to avoid recomputing it for each vertex.

Tips and tricks

Matrices and vectors with the same size can be multiplied together natively in GLSL (ex: mat4 * vec4).

How to change the pivot point

The rotation and scaling matrices, as described above, only use the world's origin [0, 0, 0] as pivot point.
Imagine a triangle that is not centered on the origin, that you need to rotate 90 degrees (Pi/2 radians) around its center, for example the point [0.5, 0.5, 0].
The solution is to apply 3 transformation matrices to this triangle's vertices:

Translate them so that the pivot point arrives at the origin [0, 0, 0] (here, translate by [-0.5, -0.5, 0])

Apply a 90 degrees rotation

Translate them back to their initial position (translate by [0.5, 0.5, 0]).
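This three-step composition can be verified in JavaScript (row-major matrices as in the maths section; the vertex [1, 0.5, 0] is an arbitrary example):

```javascript
// Row-major 4x4 helpers.
const multiply = (a, b) => {
  const r = [];
  for (let i = 0; i < 4; i++)
    for (let j = 0; j < 4; j++) {
      let s = 0;
      for (let k = 0; k < 4; k++) s += a[i*4+k] * b[k*4+j];
      r.push(s);
    }
  return r;
};
const transform = (m, v) => [0, 1, 2, 3].map(i =>
  m[i*4]*v[0] + m[i*4+1]*v[1] + m[i*4+2]*v[2] + m[i*4+3]*v[3]
);
const translate = (tx, ty, tz) => [1,0,0,tx, 0,1,0,ty, 0,0,1,tz, 0,0,0,1];
const rotateZ = a => [
  Math.cos(a), -Math.sin(a), 0, 0,
  Math.sin(a),  Math.cos(a), 0, 0,
  0,            0,           1, 0,
  0,            0,           0, 1
];

// Combined from right to left: move the pivot [0.5, 0.5, 0] to the origin,
// rotate 90 degrees around Z, then move back.
const M = multiply(translate(0.5, 0.5, 0),
          multiply(rotateZ(Math.PI / 2),
                   translate(-0.5, -0.5, 0)));

console.log(transform(M, [1, 0.5, 0, 1])); // ≈ [0.5, 1, 0, 1]
```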

Texturing

As we saw earlier, a fragment's color inside a triangle can be interpolated from the colors of each vertex around it.
The same principle can be used with a texture image (it's called sampling in this case).
A WebGL texture (whatever its size in pixels) has a local coordinates system (U,V) between 0 and 1, and any vertex can have texture coordinates in this system.

Demo

Here's an example of texture applied to a quad (a square made of two triangles):

What happens here?

An image is loaded and a WebGL texture sampler is created from it, using the functions createTexture, pixelStorei, activeTexture, bindTexture, texParameteri (lines 59 to 85).
Most of these steps plus clear and drawArrays are executed after the image has finished loading (line 63).

Special texture behaviors (wrap / mirror / clamp on edges, minimize / magnify filters, etc) can be configured with texParameteri (more info on MDN).
In particular, gl.TEXTURE_WRAP_S and gl.TEXTURE_WRAP_T can be set to gl.REPEAT (default), gl.CLAMP_TO_EDGE or gl.MIRRORED_REPEAT.
This tells WebGL what to do if a texture coordinate is not between 0 and 1:

The Y axis flip (on line 66) puts the image's UV origin at the top left corner, and avoids having to work with an upside-down image.

The vertices positions are interleaved with the texture coordinates in the data buffer (lines 30 to 35).
The vertex shader receives the vertex positions and texture coordinates as attributes (lines 8, 9), and sends the latter to the fragment shader using a varying (lines 10, 13, 20).

The fragment shader receives the coordinates as a varying and the texture image as a uniform sampler2D (line 19), and calls

texture2D(sampler, v_TexCoord)

to sample it (line 22).

Tips and tricks

The U and V axis are sometimes called S and T, but it's exactly the same thing.

The "pixels" inside a texture are called texels.

Since UV coordinates are between 0 and 1, the vast majority of texture images are square to avoid being distorted when they're mapped on a 3D shape.

In WebGL 1.0, the textures' width and height in pixels need (in most cases) to be a power of two (2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096).
It's recommended to always use power-of-two sizes, but if you really need textures of other sizes, you can make them work by setting the wrap mode to gl.CLAMP_TO_EDGE on the S and T axes (and not using mipmaps):

In this demo, we're using a single texture (TEXTURE0). You can use more, but you can't exceed 8 on some devices (more info on webglstats).
You can call

gl.getParameter(gl.MAX_TEXTURE_IMAGE_UNITS)

to know the limit on your device.

The maximum texture size also varies with the device used. To be safe, width and height shouldn't exceed 4096px (more info on webglstats).
You can call

gl.getParameter(gl.MAX_TEXTURE_SIZE)

to know the limit on your device.

You can overwrite textures after a draw call (after calling drawArrays or drawElements) if they're not used anymore.

If you need more than 8 textures without constantly switching between them, you can make a texture atlas (a mosaic of textures), and pick coordinates in the regions you want.
Warning: texture bleeding can occur if you use texture coordinates at the border between two textures, due to antialiasing.

Changing the texture's appearance

Remember that the values you're manipulating in the fragment shader are rgba colors, so you can do anything you want with them, like:
- inverting the colors (r = 1-r; g = 1-g; b = 1-b).
- greyscaling (compute the average of r, g and b, and apply it to r, g and b).
- exchanging color components (ex: gl_FragColor = color.brga).
- playing with gl_FragCoord (the current canvas coordinates, in pixels, in the form of a vec2).
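In the shader these are one-liners on gl_FragColor; the underlying arithmetic of the first two can be sketched in plain JavaScript:

```javascript
// Color manipulations on a plain [r, g, b] color (each component in [0, 1]).
const invert = c => c.map(x => 1 - x);      // inverting the colors
const greyscale = c => {                    // average of r, g and b
  const avg = (c[0] + c[1] + c[2]) / 3;
  return [avg, avg, avg];
};

console.log(invert([1, 0, 0]));      // [0, 1, 1] (red becomes cyan)
console.log(greyscale([1, 0.5, 0])); // [0.5, 0.5, 0.5]
```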

Combining multiple textures

Two or more textures can be used on a triangle at the same time. For example, you can initialize two samplers, and add or multiply them in the fragment shader:

3D graphics

The 3D camera

You should know by now that computers don't do "3D" natively.
You (or your 3D framework) will have to do all the computing to simulate the camera, the perspective, and how they affect each polygon, so the scene can seem to be in 3D.
Fortunately, the API we use (WebGL) provides very helpful tools to help rendering complex scenes without too much effort.

In 3D, the "camera", with its position, angle and perspective, is defined by nothing more than a 4x4 matrix.
During render, every vertex in the scene is multiplied by this matrix to simulate these camera properties and appear at the right position.
The camera's frustum, also called clipping volume, defines an area in which the triangles will be rendered.
For a camera with perspective, it is defined by a field of view angle, an aspect ratio, a near clip plane and a far clip plane, all encoded in a perspective projection matrix.
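As a sketch, here's one common way to build such a perspective matrix in JavaScript (row-major, to be transposed before upload like the other matrices in this guide; the exact form varies between conventions):

```javascript
// Perspective projection matrix (row-major).
// fov is the vertical field of view in radians.
const perspective = (fov, aspect, near, far) => {
  const f = 1 / Math.tan(fov / 2);
  const nf = 1 / (near - far);
  return [
    f / aspect, 0, 0,                 0,
    0,          f, 0,                 0,
    0,          0, (near + far) * nf, 2 * near * far * nf,
    0,          0, -1,                0
  ];
};

// Example: 90 degrees field of view, square canvas, clip planes at 1 and 100.
const P = perspective(Math.PI / 2, 1, 1, 100);
```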

Then, the camera can be translated, rotated and scaled (zoomed) similarly to the vertices, by using the matrices multiplications we saw earlier.
A slightly more advanced LookAt() function is often used by developers to set the camera's position, angle and target all at once.

Reduce repetitions with indexed vertices

Before starting to draw meshes (3D objects) that contain a lot of triangles, we need to learn an optimized way to write our data buffers.
As we have seen before, data buffers (with the type gl.ARRAY_BUFFER) can hold vertex properties (position, color, texture coordinates...).
These properties can be placed into multiple buffers or interleaved into a single one.

In 3D, vertices are often shared between multiple triangles.
Instead of repeating the same vertices many times in the same buffer, it's possible to write each vertex only once in a data buffer,
and use a second buffer with the type gl.ELEMENT_ARRAY_BUFFER that declares all our triangles by using indices of the first buffer.

Even if many data buffers exist in your program, only one index buffer can be used, and it will list indices from all the data buffers at the same time,
so they all need to be stored in the same order (the N'th item of every buffer must belong to the same N'th vertex).

The indices stored in the index buffer have integer values (N = 0, 1, 2...), and you can choose their size in bytes depending on the number of vertices you want to index:

Number of vertices to index | Index buffer type  | drawElements type
0 ... 256                   | Uint8Array([...])  | gl.UNSIGNED_BYTE
0 ... 65,536                | Uint16Array([...]) | gl.UNSIGNED_SHORT
0 ... 4,294,967,296         | Uint32Array([...]) | gl.UNSIGNED_INT (*)

(*) In WebGL 1.0, an extension must be enabled before using the type UNSIGNED_INT: gl.getExtension('OES_element_index_uint');. In WebGL 2.0, it's enabled by default.

Hello cube

The easiest shape to render in 3D is a cube composed of 8 points and 12 triangles.

Demo

What happens here?

Here we are, finally drawing in 3D! But to render it correctly, we had to enable WebGL's depth test.
This mechanism ensures that only the fragments closest to the camera are drawn, in order to avoid, for example, seeing the back face of the cube on top of the front face.
To do that, we enable it with gl.enable(gl.DEPTH_TEST), and clear the depth buffer along with the color buffer before drawing: gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT).

Remember that in every demo, you can see the triangles in wireframe by changing the first parameter of gl.drawElements to gl.LINE_STRIP, which makes the face diagonals visible:

How to color each face of the cube

To color each face individually, each vertex can't have a unique color like we did before: its color needs to vary depending on which face is being rendered.
The solution is to declare all the possible combinations of vertex positions and colors in two data buffers, and use an index buffer to create the corresponding triangles.
It's indeed a bit more verbose, but still the simplest way to achieve it.
The same principle applies if you want to make a cube with different textures on each face (each combination of vertex position and texture coordinates must be declared separately).

Demo

What happens here?

The vertex positions and colors are split in two data buffers to improve readability. Each line declares the 4 vertices composing one (square) face of the cube (lines 39-55).
Then, the index buffer makes two triangles from the vertices of each face, and recycles the vertices placed on the diagonal (lines 57-64).
Thanks to this indexing, we only need to declare 24 vertices (4 per face) instead of 36 (3 per triangle x 12 triangles).
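The indexing pattern can be sketched in JavaScript (assuming, as an illustration, that each face's 4 vertices are declared consecutively and that each face's diagonal links its first and third vertex):

```javascript
// With 4 vertices per face (0-3 for the first face, 4-7 for the second, ...),
// each face becomes two triangles sharing the diagonal (i, i+2).
const indices = [];
for (let f = 0; f < 6; f++) {
  const i = f * 4;
  indices.push(i, i + 1, i + 2,   // first triangle of the face
               i, i + 2, i + 3);  // second triangle (reuses the diagonal)
}

console.log(indices.length);       // 36 (6 faces x 2 triangles x 3 indices)
console.log(Math.max(...indices)); // 23 (only 24 distinct vertices needed)
```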

Tips and tricks

Let's add the cube declaration in the cube() function of shapes.js to avoid repeating it in the next demos:

In bonus, shapes.js also contains the models for a sphere and a pyramid.
You can try them in all the following demos by replacing cube() with sphere() or pyramid().

NB: all the shapes in shapes.js use Uint16Arrays for indices, so if you use them, remember to use the type gl.UNSIGNED_SHORT in drawElements().

Lighting and shading

The terms lighting and shading are often used without distinction but they actually represent two different things:

Lighting is a physics notion, representing how the light affects an object in the real world or in a 3D scene.

Shading is specific to computer graphics, and indicates how the pixels are rendered on a screen according to lighting.

Sometimes, lighting is also referred to as coloring, and it makes sense when you think about it, as the apparent color of an object is produced by the light waves it absorbs and/or reflects.

Most 3D scenes need a minimum of shading to avoid looking flat and confusing, even a simple colored cube:

In the first case, all the pixels have the same color, which doesn't look natural. In the second case, every face has a different color, but our brains interpret it as a shaded, red cube.

There are many different ways to light a 3D scene, here are the five main ones we can start with:

1) Diffuse light

Diffuse light (also called directional light) is the equivalent of the sun's light on Earth: all the rays are parallel and have the same intensity everywhere in the scene.
When it hits a surface, it is reflected in all directions, but the intensity of the reflection decreases proportionally to the angle at which the light hits the surface:

To simulate it, we need to define a light source with a color (for example, white), and a direction (for example, vec3(0.5, 3.0, 4.0)).
The color set by the fragment shader is: the light's color (rgb) × the face color (rgb) × the dot product of the normal and the light. The color's alpha is fixed to 1.0.
If the dot product is negative, we clamp it to 0.0 (there can't be a negative amount of light). This is done with max(dot(lightDirection, normal), 0.0); (see line 27).
Note: it's a good practice to re-normalize the normals in the shaders to ensure they have the right length. This is done with normalize().
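The formula above can be sketched as a fragment shader, written as a GLSL string like in the demos (the uniform and varying names are illustrative):

```javascript
// Hypothetical fragment shader computing a diffuse light
const diffuseFragmentShader = `
precision mediump float;
uniform vec3 lightColor;     // the light's color, ex: white
uniform vec3 lightDirection; // ex: vec3(0.5, 3.0, 4.0)
varying vec4 v_color;        // the face's color
varying vec3 v_normal;
void main() {

  // Re-normalize the interpolated normal to ensure it has the right length
  vec3 normal = normalize(v_normal);

  // Intensity: dot product of the light direction and the normal,
  // clamped to 0.0 (there can't be a negative amount of light)
  float nDotL = max(dot(normalize(lightDirection), normal), 0.0);

  // Final color: light color x face color x intensity, alpha fixed to 1.0
  gl_FragColor = vec4(lightColor * v_color.rgb * nDotL, 1.0);
}`;
```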

Demo

2) Ambient light

With diffuse lighting alone, some faces are too dark, like the rightmost one in the previous demo. To fix that, we can add an ambient light reflection.
It's a light that is applied equally to all the triangles in the scene, regardless of their normal vector.
To simulate it, we need to set a light color (not too bright, for example: vec3(0.2, 0.2, 0.2)), multiply it with the surface color, and add it to the diffuse light (see line 37):
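As a sketch, the ambient term could look like this in the fragment shader (a GLSL string as usual; "diffuse" is assumed to be the value computed in the previous demo):

```javascript
// Hypothetical GLSL snippet: adding an ambient term to the diffuse light
const ambientSnippet = `
// A dim light applied equally everywhere, regardless of the normals
vec3 ambientColor = vec3(0.2, 0.2, 0.2);
vec3 ambient = ambientColor * v_color.rgb;

// The ambient reflection is simply added to the diffuse reflection
gl_FragColor = vec4(diffuse + ambient, 1.0);
`;
```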

Demo

3) Point light

We can also have a point light representing a light bulb, with a specific position and color.
It's similar to diffuse light, except that the light rays are not parallel, because the light source is not "infinitely" far away: it's in the scene and emits lights in all directions.
With a point light, the shading intensity will vary according to the angle of the light rays, but also according to the distance from the light source to the object: it's called light attenuation.
In the real world, the light attenuation is proportional to the distance squared (d²), but in computer graphics, it's usually proportional to the distance (not squared).
When a point light reflection is computed per vertex, it looks a bit nicer, but the triangles are still visible:

The best solution consists in computing the right color for every fragment of the cube (i.e. every pixel) according to its distance from the light source, and that's exactly what the fragment shader is here for:

The following demo shows how a point light can be computed per fragment.
The vertex shader sends the vertex positions, colors and normals to the fragment shader using three varyings, and the fragment shader computes everything.
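A hedged sketch of such a per-fragment point light shader (the variable names are illustrative):

```javascript
// Hypothetical fragment shader computing a point light per fragment
const pointLightFragmentShader = `
precision mediump float;
uniform vec3 lightColor;
uniform vec3 lightPosition; // the light bulb's position in the scene
varying vec3 v_position;    // interpolated fragment position
varying vec3 v_normal;
varying vec4 v_color;
void main() {

  // The rays are not parallel: the light direction varies per fragment
  vec3 lightDirection = normalize(lightPosition - v_position);

  vec3 normal = normalize(v_normal);
  float nDotL = max(dot(lightDirection, normal), 0.0);
  gl_FragColor = vec4(lightColor * v_color.rgb * nDotL, 1.0);
}`;
```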

Demo

4) Spot light

A spot light is very similar to a point light, except that it does not emit light rays in all directions, but rather inside a given "cone".
When this cone hits a surface, the resulting lighting is elliptic, like a spot. In the real world, this is similar to a torchlight.

To implement it, we can start from a point light, and add a direction (like we had in the diffuse light) and an angle (like the camera's field of view angle).
Then, the diffuse light is only computed for the fragments where the dot product of the spot direction and the spot-to-surface vector is bigger than the cosine of the spot's angle.

Here's the code added to the point light demo to transform it into a 20° spot light:
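The demo's exact code isn't reproduced here; as a hedged sketch, the cone test could look like this (uniform names are illustrative, and the cosine of the spot's angle would be precomputed in JS):

```javascript
// Hypothetical GLSL snippet: restricting a point light to a cone
const spotSnippet = `
uniform vec3 spotDirection;  // the axis of the light cone
uniform float spotCosAngle;  // cosine of the spot angle (cos 20deg ~ 0.94)

// ... inside main(), after computing the point light as before:

// Vector going from the light source towards the current fragment
vec3 toSurface = normalize(v_position - lightPosition);

// The fragment is lit only if it's inside the cone
float inCone = dot(toSurface, normalize(spotDirection)) > spotCosAngle ? 1.0 : 0.0;
gl_FragColor = vec4(lightColor * v_color.rgb * nDotL * inCone, 1.0);
`;
```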

5) Specular light

When an object is shiny, it can reflect a point light like a mirror at a certain angle. The result looks like a spot light, but it's very different.
Specular light is actually the only light in this list that relies on the camera's position to be computed: when the camera moves, the reflection moves too.
To implement it, we need to:

Create a new uniform vector containing the camera's coordinates.

Compute the vector between the current fragment and the camera.

Compute the vector between the current fragment and the light source (like in the previous demos).

Choose a level of "shininess" (the intensity of the reflection), for example around 100 for a metallic object.

Reflect the fragment-to-light vector on the triangle's surface, i.e. compute its symmetric vector according to the surface's normal.
reflect(incident, normal) is a native GLSL function that mirrors a vector around another vector, and it's exactly what we need here, in order to check if the reflection hits the camera.

Compute the dot product between the reflected vector and the fragment-to-camera vector, raised to the power of the surface's "shininess": pow(max(dot(reflected, toCamera), 0.0), shininess)

Multiply the result with the light's color and add it to the other (ambient / diffuse) lights.
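The steps above can be sketched like this in the fragment shader (variable names are illustrative; ambient and diffuse are the terms from the previous demos):

```javascript
// Hypothetical GLSL snippet combining the specular steps listed above
const specularSnippet = `
uniform vec3 cameraPosition; // new uniform: the camera's coordinates

// ... inside main():

// Vectors from the current fragment to the camera and to the light
vec3 toCamera = normalize(cameraPosition - v_position);
vec3 toLight = normalize(lightPosition - v_position);

// Mirror the light vector around the surface's normal.
// reflect() expects an incident vector, hence the minus sign.
vec3 reflected = reflect(-toLight, normal);

// Raise the clamped dot product to the power of the shininess (ex: 100.0)
float specular = pow(max(dot(reflected, toCamera), 0.0), 100.0);

// Add the specular term to the ambient and diffuse lights
gl_FragColor = vec4(ambient + diffuse + lightColor * specular, 1.0);
`;
```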

Here's a demo that combines diffuse, ambient and specular lighting.
This combination is also called the Phong reflection model.

Demo

Smooth shading

By default, the normal vector of a triangle is reused by all its vertices.
This produces a faceted (or polygonal) rendering, where neighbouring triangles are separated by a visible "hard edge".
Smooth shading (or "Phong shading") consists in computing a different normal for each vertex, equal to the mean of the normals of all the triangles around it.
But no need to use divisions to obtain this mean value! Just normalize the sum of all the neighbour normals and you'll get the right result! (more info on Inigo Quilez's website)
Also, since the normal is now a varying vector interpolated for each fragment, the fragment shader needs to re-normalize it to stay accurate.
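The averaging trick can be sketched in plain JavaScript (a hypothetical helper, not part of the demos' code):

```javascript
// Smooth a vertex normal: sum the normals of all the triangles sharing
// the vertex, then normalize the sum. No division by the count is needed,
// since normalize() already rescales the vector to length 1.
function smoothNormal(faceNormals) {
  let x = 0, y = 0, z = 0;
  for (const [nx, ny, nz] of faceNormals) {
    x += nx; y += ny; z += nz;
  }
  const length = Math.hypot(x, y, z);
  return [x / length, y / length, z / length];
}
```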
Example: the same 3D model with smooth shading disabled / enabled (more info about loading 3D models in a future chapter!):

How about raytracing?

You may have heard about raytracing as a way to produce photorealistic lights and reflections in 3D scenes, but this is actually a whole other domain of computer graphics.

Raytracing can be done with WebGL, but it's generally computed in a fullscreen shader rather than in a scene made of triangles, as you can see on Shadertoy.
It also enables more advanced lighting modes, such as Emissive lighting, where an object is self-illuminated, glows and affects the surrounding objects like a light source.

But this will not be covered in this guide... maybe the next one?

How to transform a 3D model

So far, we've only transformed the camera matrix to make it revolve around the cube, which created an illusion of cube rotation.
Now that we learned how to place a fixed light in the scene, if we want to rotate, translate or scale a cube without touching the rest of the scene, we need to do three things:

Introduce a model matrix (the transformation matrix of the cube).

Update the position of each vertex by multiplying it with the camera matrix (as usual) and also multiplying it with the model matrix.

Recompute all the vertex normals and the lighting of every fragment after each update of the model matrix.

How to update the normals efficiently

It's possible to recompute the normals from scratch in JS (with a cross product) after each transform, but that can represent a lot of computation in scenes containing many triangles.
So, in practice, the best approach is to keep the original normals unchanged, and have a matrix that can be used to update each of them.
Then, when the model matrix changes, two things can occur:

If the transformation is a simple translation, rotation, or uniform scale, the original normal's xyz coordinates can be multiplied with the top-left 3x3 matrix of the model matrix:

vec3 v_normal = normalize(mat3(model) * normal.xyz);

But if the transformation is a non-uniform scale or a combination of multiple transformations, then applying a subset of the model matrix won't be accurate.
In this case, the solution is to multiply the normal with the inverse transpose of the model matrix, i.e. the transpose of the inverse of the model matrix:

vec3 v_normal = normalize(vec3(inverseTranspose * normal));

The second approach is often the only one used in 3D programs, because it works for both simple and complex transformations.
This generic approach means that the inverse transpose matrix must be computed once per frame, in JavaScript, and passed to the vertex shader,
along with the original normals, the model matrix, and all the usual attributes and uniforms (position, color, light...).

Demo

Here's a cube rotating on itself with a fixed camera and a fixed point light:

What happens here?

When the page loads, the cube, the light and the camera matrix are set as usual. (lines 68-105).

Three new matrices are introduced: model (the model matrix), inverseTranspose and mvp (the model view projection matrix, equal to cameraMatrix * modelMatrix).

In the loop executed 60 times per second, we update the cube's Y angle (line 117), recompute the model matrix (lines 119-122), the mvp matrix (lines 124-126), the inverse transpose matrix (lines 128-130), and send them to WebGL as uniforms, before rendering the scene.
Note: the identity matrix is used in each frame to recompute the model matrix from scratch (line 120). It avoids updating the same matrix with small increments every time.

The vertex shader applies the mvp matrix to the current vertex (line 26), and sends three varyings (position, normal and color) to the fragment shader.

The fragment shader does nothing new. As far as it is concerned, its task is still to compute a shading from the variables at its disposal, regardless of how they were computed.

Tips and tricks

The mvp matrix introduced above is precomputed in JavaScript to avoid recomputing it for every vertex, just like we did for combined transformation matrices.

Here are the inverse() and inverseTranspose() functions added in matrix.js:
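As a sketch of how these helpers fit together, transpose() is shown in full below, while inverse() (the hairy part) is assumed to be the full 4x4 inversion function from matrix.js:

```javascript
// Transpose a 4x4 matrix stored as a flat Float32Array (like in matrix.js)
function transpose(m) {
  const t = new Float32Array(16);
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      t[col * 4 + row] = m[row * 4 + col];
    }
  }
  return t;
}

// The inverse transpose: transpose of the inverse of the model matrix.
// inverse() is the (long) 4x4 inversion function, assumed from matrix.js.
function inverseTranspose(model) {
  return transpose(inverse(model));
}
```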

As you can see, the inversion algorithm is a bit hairy, so it's better to run it as few times per second as possible.
Nevertheless, the inverse transpose is very useful for other life-saving tricks, as we will see in part 2 of this guide.

Drawing many cubes

To draw many cubes at once, it's of course possible to declare the vertices coordinates, colors and normals of each cube separately, but that would be very verbose.
Instead, we can consider the cube we already declared as a reusable model.
For each cube we want to draw, we simply need to transform it (by giving it a new model matrix, mvp matrix and inverse transpose matrix), and render it.

Demo

Here's a demo with 3 red cuboids (that's the name for stretched cubes):

What happens here?

A new model matrix is created and drawn three times using the same data buffers (lines 110, 132, 154).

Tips and tricks

To avoid repetitions, I added a drawShape() function in shapes.js, with optional scaling along X/Y/Z. Its role is to (re)render the current model as many times as necessary.
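The real shapes.js version may differ; as a hypothetical sketch, such a helper rebuilds the matrices for one transformed copy of the current model and issues a draw call (multiply(), inverse() and transpose() are assumed from matrix.js):

```javascript
// Hypothetical drawShape-like helper: re-render the current model
// (whose buffers are already bound) with a new model matrix.
function drawModel(gl, program, cameraMatrix, modelMatrix, vertexCount) {

  // Recompute the mvp and inverse transpose matrices for this copy
  const mvp = multiply(cameraMatrix, modelMatrix);
  const invT = transpose(inverse(modelMatrix));

  // Send them to the shaders as uniforms
  gl.uniformMatrix4fv(gl.getUniformLocation(program, 'mvp'), false, mvp);
  gl.uniformMatrix4fv(gl.getUniformLocation(program, 'model'), false, modelMatrix);
  gl.uniformMatrix4fv(gl.getUniformLocation(program, 'inverseTranspose'), false, invT);

  // Draw the same indexed buffers again
  gl.drawElements(gl.TRIANGLES, vertexCount, gl.UNSIGNED_SHORT, 0);
}
```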

Hierarchical objects

A hierarchical object is a 3D model made of several basic objects (called segments), for example a robotic arm.
Joints are where the segments are linked and rotate relatively to each other, like an elbow or a wrist.
To keep the segments linked to each other, the transformation matrix is inherited from segment to segment (ex: a hand applies its own transformation matrix on top of the arm's matrix).
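This inheritance boils down to matrix multiplications; here's a sketch with a minimal 4x4 multiply (the segment names are illustrative):

```javascript
// Multiply two 4x4 matrices stored as flat column-major Float32Arrays
function multiply(a, b) {
  const out = new Float32Array(16);
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      let sum = 0;
      for (let k = 0; k < 4; k++) sum += a[k * 4 + row] * b[col * 4 + k];
      out[col * 4 + row] = sum;
    }
  }
  return out;
}

// Each segment inherits its parent's matrix and applies its own transform:
// armMatrix  = multiply(baseMatrix, armLocalTransform);
// handMatrix = multiply(armMatrix,  handLocalTransform);
```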

The following demo shows a robotic arm made of three cuboids (click the buttons below to make it move).

Demo

What happens here?

You can see on lines 146-150 that the vertical "arm" cuboid can rotate around its X axis, and once it's rotated, we perform a "-2" translation along Y.
As described in the "transformations" chapter, this makes the cuboid rotate around a pivot point placed at its extremity (the "elbow", 2 units higher) instead of its center.

Then, on lines 152-155, the hand cuboid, which is attached at the end of the arm cuboid, inherits its model matrix, and updates it to add its own rotation and translation.
(the rotation happens along the Y axis, and the translation places it at the end of the arm).

Tips and tricks

The matrix inheritance hierarchy is also called scene graph.

This process can be repeated many times to make a multi-joint object, like an entire robot or a rope made of many segments!

If many segments are attached to the same parent (for example, the fingers of a hand), they must all reuse their parent's matrix (for example, see multi-joint demo).

Debugging

Many kinds of errors can be present in your WebGL shaders or occur at runtime. The most frequent ones I've encountered are:

Missing semicolon at the end of a line.

Missing decimal part in a float number (1 is an int, 1.0 is a float).

Trying to change the value of a const variable.

Trying to assign a value to a uniform or a varying (they are read-only).

Trying to set a non-constant limit in a for-loop.

Trying to use a function recursively.

Trying to use === or !== operators.

Mismatching int or float precision for a variable read by both shaders.

Other errors can be made in the JS program, and can sometimes fail silently, especially:

Using the wrong count parameter for gl.drawArrays or gl.drawElements (it must be the number of vertices).

Using the wrong combination of types for an index buffer and gl.drawElements (a Uint16Array only works with gl.UNSIGNED_SHORT).

Not passing the right amount of data in attributes or uniforms (ex: 4 floats in a vec3).

Finally, if no syntax errors were made but nothing appears, check if:

The camera looks in the right direction (and with a decent fov angle, usually around 0.9 radians or 50 degrees).

The light source is not too dark or trapped inside a 3D object.

Your normals are not inverted (they must point "outside" to let the object reflect the light correctly).

You didn't accidentally set any color's alpha to 0.0.

You're not drawing points that are too big or have their center placed outside of the canvas.

WebGL 2.0

As I said in the introduction, WebGL 2.0 brings new features and changes a few things compared to WebGL 1.0.
I decided to not write a WebGL 2.0 guide due to its limited browser support and its new syntax rules that I personally find more confusing than useful.

Anyway, if you want to enable it, you need to change the canvas context creation:

canvas.getContext("webgl2");

and add

#version 300 es

on the very first line of your shaders.

Here are the most important changes:

attribute must be renamed as in inside the shaders (ex: in vec4 a_position;).

varyings must be renamed as out in the vertex shader and in in the fragment shader.

gl_FragColor doesn't exist anymore. Instead, the fragment shader needs to declare its own out vec4 fragColor; before main() and set its value inside main().

The fragment shader can edit the depth buffer directly using the global gl_FragDepth.

The functions texture2D and textureCube are now simply called texture.

Mipmapping now works on textures even if their width and height are not a power of 2.
In summary, mipmaps are smaller versions of a 2D texture (size/2, size/4, size/8, ...), used by WebGL when a textured object is moving away from the camera.
They can be generated with gl.generateMipmap(gl.TEXTURE_2D) or provided by the developer/artist.
The GLSL function textureSize(sampler, lod) gives you the mipmap texture size for a given level of detail.
And the function texelFetch(sampler, ivec2(x,y), lod) gives you the value of a given texel in this texture.

Most extensions don't need to be loaded anymore, as they are enabled by default.

In particular, Vertex array objects are now available natively and allow caching the attribute bindings, leading to a performance boost in programs doing many draw calls per frame:

Similarly, Uniform Buffer Objects can be used to cache uniforms, but they're harder to set up and generally less useful, as uniforms tend to be updated rarely.
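The vertex array objects mentioned above can be sketched like this (a minimal example with a single position attribute; names are illustrative):

```javascript
// Record the attribute setup once in a vertex array object (WebGL 2.0)
function createCubeVAO(gl, buffer, positionLocation) {
  const vao = gl.createVertexArray();
  gl.bindVertexArray(vao);

  // These bindings are captured by the VAO...
  gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
  gl.vertexAttribPointer(positionLocation, 3, gl.FLOAT, false, 0, 0);
  gl.enableVertexAttribArray(positionLocation);

  gl.bindVertexArray(null);
  return vao;
}

// ...then, at render time, a single call replaces all the setup above:
// gl.bindVertexArray(vao);
// gl.drawElements(...);
```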

To be continued...

At this point, we've covered all the basics of 2D and 3D rendering in WebGL 1.0 and 2.0!
For the record, this single page contained more information than 430 pages of the book and 3/4 of the site that inspired it at the beginning. Did I invent tutorial golfing? :D
I could have made it longer but unfortunately, most browsers can't display more than 16 WebGL canvas contexts in the same page...
so all the advanced techniques will be in the upcoming part 2... and maybe a book one day?

Bonus: WebGL and code-golfing

If you're interested in extreme minification and compression, click here to open!

Introduction

Despite my efforts to make the demos of this guide as short as possible, you may have noticed that the WebGL API and the GLSL language are extremely verbose.
However, if you're into code-golfing like me, you may be interested in making tiny WebGL programs, or even games that fit in 1kb or 13kb...
As I said in the intro, the Codegolf Team and I already golfed a Shadertoy-like boilerplate in 242 bytes and a raymarching algorithm for Signed Distance Functions (SDF) in 135 bytes.
Keep in mind that SDFs are by far the shortest way to display geometric and fractal shapes (cubes, spheres, tori, mandelboxes, ...) without any triangles.
Nevertheless, here's a list of tricks you can use to golf a "real" WebGL app, with vertices, triangles, normals, matrices, textures, etc...!

1. Writing less lines of code

If you're on a tight byte budget, there are a number of optimizations that you can remove from your code:

Hashing WebGL functions (make a short alias for many functions and constants of "gl").
A popular hash consists in using characters 0 and 6 of each WebGL context property name as aliases. You can try other hashes on this interactive page!