WebGLFundamentals.org

WebGL 2D Matrices

This post is a continuation of a series of posts about WebGL. The first started with fundamentals and the previous was about scaling 2D geometry.

Math vs Programming vs WebGL

Before we get started: if you have previously studied linear algebra, or have experience working with matrices in general, then please read this article before continuing below.

If you have little to no experience with matrices then feel free to skip the link above for now and continue reading.

In the last 3 posts we went over how to translate geometry, rotate geometry, and scale geometry. Translation, rotation and scale are each considered a type of 'transformation'. Each of these transformations required changes to the shader and each of the 3 transformations was order dependent. In our previous example we scaled, then rotated, then translated. If we applied those in a different order we'd get a different result.

For example, here is a scale of 2, 1, a rotation of 30 degrees, and a translation of 100, 0. And here is a translation of 100, 0, a rotation of 30 degrees, and a scale of 2, 1. The results are completely different. Even worse, if we needed the second example we'd have to write a different shader that applied the translation, rotation, and scale in our new desired order.

Well, some people way smarter than me figured out that you can do all the same stuff with matrix math. For 2D we use a 3x3 matrix. A 3x3 matrix is like a grid with 9 boxes:

1 2 3
4 5 6
7 8 9

To do the math we multiply the position down the columns of the matrix and add up the results. Our positions only have 2 values, x and y, but to do this math we need 3 values so we'll use 1 for the third value.

In this case our result would be

newX  = x * 1 + y * 4 + 1 * 7
newY  = x * 2 + y * 5 + 1 * 8
extra = x * 3 + y * 6 + 1 * 9
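
As a rough sketch of the arithmetic above, here is the same "multiply down the columns" step in JavaScript. The helper name transformPoint is made up for this illustration and is not part of the code used in these posts; it assumes the matrix is stored as a flat array of 9 numbers, 3 per row.

```javascript
// Hypothetical helper (not from the article's code): applies a 3x3 matrix,
// stored as a flat array of 9 numbers, to the 2D position (x, y).
// The third value of the position is always 1.
function transformPoint(m, x, y) {
  return {
    newX:  x * m[0] + y * m[3] + 1 * m[6],
    newY:  x * m[1] + y * m[4] + 1 * m[7],
    extra: x * m[2] + y * m[5] + 1 * m[8],
  };
}

// The example grid from above.
var grid = [
  1, 2, 3,
  4, 5, 6,
  7, 8, 9,
];

var result = transformPoint(grid, 10, 20);
console.log(result.newX, result.newY, result.extra);  // 97 128 159
```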

You're probably looking at that and thinking "WHAT'S THE POINT?" Well, let's assume we have a translation. We'll call the amount we want to translate by tx and ty. Let's make a matrix like this

1.0  0.0  0
0.0  1.0  0
tx   ty   1

And now check it out

newX  = x * 1.0 + y * 0.0 + 1 * tx
newY  = x * 0.0 + y * 1.0 + 1 * ty
extra = x * 0   + y * 0   + 1 * 1

If you remember your algebra, we can delete any place that multiplies by zero. Multiplying by 1 effectively does nothing, so let's simplify to see what's happening. With the zero terms removed we have

newX  = x * 1.0 + 1 * tx
newY  = y * 1.0 + 1 * ty
extra = 1 * 1

or more succinctly

newX = x + tx;
newY = y + ty;

And extra we don't really care about. That looks surprisingly like the translation code from our translation example.

Similarly let's do rotation. As we pointed out in the rotation post, we just need the sine and cosine of the angle at which we want to rotate. We'll call the sine s and the cosine c.

And we build a matrix like this

c    -s    0
s     c    0
0.0   0.0  1

Applying the matrix we get this

newX  = x *  c + y * s + 1 * 0.0
newY  = x * -s + y * c + 1 * 0.0
extra = x *  0 + y * 0 + 1 * 1

Blacking out all the multiplications by 0 and 1 and simplifying, we get

newX = x *  c + y * s;
newY = x * -s + y * c;

Which is exactly what we had in our rotation sample.

And lastly scale. We'll call our 2 scale factors sx and sy.

And we build a matrix like this

sx   0.0  0
0.0  sy   0
0.0  0.0  1

Applying the matrix we get this

newX  = x * sx  + y * 0.0 + 1 * 0.0
newY  = x * 0.0 + y * sy  + 1 * 0.0
extra = x * 0   + y * 0   + 1 * 1

which, after dropping every multiplication by 0 and 1, simplifies to

newX = x * sx;
newY = y * sy;

Which is the same as our scaling sample.
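
If you'd like to check those three results with actual numbers, here is a small sketch. The helper applyMatrix is hypothetical (it is not part of the code in these posts) and the sample values are arbitrary; it simply runs a position through each matrix and compares against the simplified formulas.

```javascript
// Hypothetical helper: applies a flat 3x3 matrix to (x, y) and
// drops the "extra" value we don't care about.
function applyMatrix(m, x, y) {
  return [
    x * m[0] + y * m[3] + m[6],
    x * m[1] + y * m[4] + m[7],
  ];
}

var x = 3, y = 4;

// Translation by (tx, ty) = (10, 20).
var t = applyMatrix([1, 0, 0,  0, 1, 0,  10, 20, 1], x, y);
console.log(t);  // [13, 24], the same as [x + tx, y + ty]

// Scale by (sx, sy) = (2, 0.5).
var s = applyMatrix([2, 0, 0,  0, 0.5, 0,  0, 0, 1], x, y);
console.log(s);  // [6, 2], the same as [x * sx, y * sy]

// Rotation using the sine and cosine of 30 degrees.
var angle = 30 * Math.PI / 180;
var c = Math.cos(angle);
var sn = Math.sin(angle);
var r = applyMatrix([c, -sn, 0,  sn, c, 0,  0, 0, 1], x, y);
var direct = [x * c + y * sn, x * -sn + y * c];
console.log(r, direct);  // both pairs hold the same values
```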

Now I'm sure you might still be thinking "So what? What's the point?". That seems like a lot of work just to do the same thing we were already doing.

This is where the magic comes in. It turns out we can multiply matrices together and apply all the transformations at once. Let's assume we have a function, m3.multiply, that takes two matrices, multiplies them and returns the result.

var m3 = {
  multiply: function(a, b) {
    var a00 = a[0 * 3 + 0];
    var a01 = a[0 * 3 + 1];
    var a02 = a[0 * 3 + 2];
    var a10 = a[1 * 3 + 0];
    var a11 = a[1 * 3 + 1];
    var a12 = a[1 * 3 + 2];
    var a20 = a[2 * 3 + 0];
    var a21 = a[2 * 3 + 1];
    var a22 = a[2 * 3 + 2];
    var b00 = b[0 * 3 + 0];
    var b01 = b[0 * 3 + 1];
    var b02 = b[0 * 3 + 2];
    var b10 = b[1 * 3 + 0];
    var b11 = b[1 * 3 + 1];
    var b12 = b[1 * 3 + 2];
    var b20 = b[2 * 3 + 0];
    var b21 = b[2 * 3 + 1];
    var b22 = b[2 * 3 + 2];

    return [
      b00 * a00 + b01 * a10 + b02 * a20,
      b00 * a01 + b01 * a11 + b02 * a21,
      b00 * a02 + b01 * a12 + b02 * a22,
      b10 * a00 + b11 * a10 + b12 * a20,
      b10 * a01 + b11 * a11 + b12 * a21,
      b10 * a02 + b11 * a12 + b12 * a22,
      b20 * a00 + b21 * a10 + b22 * a20,
      b20 * a01 + b21 * a11 + b22 * a21,
      b20 * a02 + b21 * a12 + b22 * a22,
    ];
  }
}
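
One way to convince yourself that a multiplied matrix really applies both transformations at once is to compare it with applying the two matrices one after the other. The sketch below is only an illustration: multiply is a compact loop version of the function above (written here on the assumption that it computes the same sums), and applyMatrix is a hypothetical helper that is not part of the code in these posts.

```javascript
// Loop form of the multiply function above (assumed equivalent to the
// unrolled version): out[i][j] = sum over k of b[i][k] * a[k][j].
function multiply(a, b) {
  var out = [];
  for (var i = 0; i < 3; ++i) {
    for (var j = 0; j < 3; ++j) {
      var sum = 0;
      for (var k = 0; k < 3; ++k) {
        sum += b[i * 3 + k] * a[k * 3 + j];
      }
      out.push(sum);
    }
  }
  return out;
}

// Hypothetical helper: applies a flat 3x3 matrix to the position (x, y).
function applyMatrix(m, x, y) {
  return [
    x * m[0] + y * m[3] + m[6],
    x * m[1] + y * m[4] + m[7],
  ];
}

var translate = [1, 0, 0,  0, 1, 0,  100, 0, 1];  // translate by (100, 0)
var scale     = [2, 0, 0,  0, 1, 0,  0, 0, 1];    // scale by (2, 1)

// One combined matrix: the position is scaled first, then translated.
var combined = multiply(translate, scale);

// The same two steps applied one after the other.
var scaled = applyMatrix(scale, 10, 5);
var stepByStep = applyMatrix(translate, scaled[0], scaled[1]);

console.log(applyMatrix(combined, 10, 5));  // [120, 5]
console.log(stepByStep);                    // [120, 5]
```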

To make things clearer let's make functions to build matrices for translation, rotation and scale.

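var m3 = {
  translation: function(tx, ty) {
    return [
      1, 0, 0,
      0, 1, 0,
      tx, ty, 1,
    ];
  },

  rotation: function(angleInRadians) {
    var c = Math.cos(angleInRadians);
    var s = Math.sin(angleInRadians);
    return [
      c,-s, 0,
      s, c, 0,
      0, 0, 1,
    ];
  },

  scaling: function(sx, sy) {
    return [
      sx, 0, 0,
      0, sy, 0,
      0, 0, 1,
    ];
  },
};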

Now let's change our shader. The old shader looked like this

attribute vec2 a_position;

uniform vec2 u_resolution;
uniform vec2 u_translation;
uniform vec2 u_rotation;
uniform vec2 u_scale;

void main() {
// Scale the position
vec2 scaledPosition = a_position * u_scale;

// Rotate the position
vec2 rotatedPosition = vec2(
scaledPosition.x * u_rotation.y + scaledPosition.y * u_rotation.x,
scaledPosition.y * u_rotation.y - scaledPosition.x * u_rotation.x);

vec2 position = rotatedPosition + u_translation;
...

Our new shader will be much simpler.

attribute vec2 a_position;

uniform vec2 u_resolution;
uniform mat3 u_matrix;

void main() {
// Multiply the position by the matrix.
vec2 position = (u_matrix * vec3(a_position, 1)).xy;
...

And here's how we use it

// Draw the scene.
function drawScene() {

  ...

  // Compute the matrices.
  // translation and scale are assumed to be [x, y] arrays and
  // angleInRadians the current rotation angle, as set by the sliders.
  var translationMatrix = m3.translation(translation[0], translation[1]);
  var rotationMatrix = m3.rotation(angleInRadians);
  var scaleMatrix = m3.scaling(scale[0], scale[1]);

  // Multiply the matrices.
  var matrix = m3.multiply(translationMatrix, rotationMatrix);
  matrix = m3.multiply(matrix, scaleMatrix);

  // Set the matrix.
  gl.uniformMatrix3fv(matrixLocation, false, matrix);

  // Draw the geometry.
  gl.drawArrays(gl.TRIANGLES, 0, 18);
}

Here's a sample using our new code. The sliders are the same, translation, rotation and scale. But the way they get used in the shader is much simpler.

Still, you might be asking, so what? That doesn't seem like much of a benefit. But, now if we want to change the order we don't have to write a new shader. We can just change the math.

...
// Multiply the matrices.
var matrix = m3.multiply(scaleMatrix, rotationMatrix);
matrix = m3.multiply(matrix, translationMatrix);
...

Here's that version.
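
To see the difference in numbers rather than pictures, here is a rough sketch that runs one point through both orders, using the scale of 2, 1, rotation of 30 degrees and translation of 100, 0 from the start of this post. The m3 functions are repeated in compact form so the snippet runs on its own, and applyMatrix is a hypothetical helper that is not part of the sample code.

```javascript
var m3 = {
  // Same sums as the multiply function above, written as loops.
  multiply: function(a, b) {
    var out = [];
    for (var i = 0; i < 3; ++i) {
      for (var j = 0; j < 3; ++j) {
        out.push(b[i * 3] * a[j] + b[i * 3 + 1] * a[3 + j] + b[i * 3 + 2] * a[6 + j]);
      }
    }
    return out;
  },
  translation: function(tx, ty) { return [1, 0, 0,  0, 1, 0,  tx, ty, 1]; },
  rotation: function(angleInRadians) {
    var c = Math.cos(angleInRadians);
    var s = Math.sin(angleInRadians);
    return [c, -s, 0,  s, c, 0,  0, 0, 1];
  },
  scaling: function(sx, sy) { return [sx, 0, 0,  0, sy, 0,  0, 0, 1]; },
};

// Hypothetical helper: applies a flat 3x3 matrix to the position (x, y).
function applyMatrix(m, x, y) {
  return [x * m[0] + y * m[3] + m[6], x * m[1] + y * m[4] + m[7]];
}

var translationMatrix = m3.translation(100, 0);
var rotationMatrix = m3.rotation(30 * Math.PI / 180);
var scaleMatrix = m3.scaling(2, 1);

// First order: translation * rotation * scale.
var trs = m3.multiply(translationMatrix, rotationMatrix);
trs = m3.multiply(trs, scaleMatrix);

// Second order: scale * rotation * translation.
var srt = m3.multiply(scaleMatrix, rotationMatrix);
srt = m3.multiply(srt, translationMatrix);

console.log(applyMatrix(trs, 10, 0));  // roughly [117.32, -10]
console.log(applyMatrix(srt, 10, 0));  // roughly [190.53, -55]
```

Same three matrices, same point, very different results. Only the order of the multiplications changed.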

Being able to apply matrices like this is especially important for hierarchical animation like arms on a body, moons on a planet around a sun, or branches on a tree. For a simple example of hierarchical animation let's draw our 'F' 5 times, but each time let's start with the matrix from the previous 'F'.

// Draw the scene.
function drawScene() {
  // Clear the canvas.
  gl.clear(gl.COLOR_BUFFER_BIT);

  // Compute the matrices.
  // translation and scale are assumed to be [x, y] arrays and
  // angleInRadians the current rotation angle, as set by the sliders.
  var translationMatrix = m3.translation(translation[0], translation[1]);
  var rotationMatrix = m3.rotation(angleInRadians);
  var scaleMatrix = m3.scaling(scale[0], scale[1]);

  // Starting Matrix.
  var matrix = m3.identity();

  for (var i = 0; i < 5; ++i) {
    // Multiply the matrices.
    matrix = m3.multiply(matrix, translationMatrix);
    matrix = m3.multiply(matrix, rotationMatrix);
    matrix = m3.multiply(matrix, scaleMatrix);

    // Set the matrix.
    gl.uniformMatrix3fv(matrixLocation, false, matrix);

    // Draw the geometry.
    gl.drawArrays(gl.TRIANGLES, 0, 18);
  }
}

To do this we introduced the function, m3.identity, that makes an identity matrix. An identity matrix is a matrix that effectively represents 1.0 so that if you multiply by the identity nothing happens. Just like

X * 1 = X

so too

matrixX * identity = matrixX

Here's the code to make an identity matrix.

var m3 = {
identity: function() {
return [
1, 0, 0,
0, 1, 0,
0, 0, 1,
];
},

...
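
A quick sanity check of that claim, as a sketch: multiplying an arbitrary matrix by the identity, on either side, should give back the same 9 numbers. The multiply function is repeated here in compact form (assumed to compute the same sums as the one earlier in this post), and someMatrix is just a made-up example.

```javascript
var m3 = {
  // Same sums as the multiply function from earlier, written as loops.
  multiply: function(a, b) {
    var out = [];
    for (var i = 0; i < 3; ++i) {
      for (var j = 0; j < 3; ++j) {
        out.push(b[i * 3] * a[j] + b[i * 3 + 1] * a[3 + j] + b[i * 3 + 2] * a[6 + j]);
      }
    }
    return out;
  },
  identity: function() {
    return [
      1, 0, 0,
      0, 1, 0,
      0, 0, 1,
    ];
  },
};

// An arbitrary example matrix: scale by (2, 3), then translate by (5, 7).
var someMatrix = [
  2, 0, 0,
  0, 3, 0,
  5, 7, 1,
];

var right = m3.multiply(someMatrix, m3.identity());
var left = m3.multiply(m3.identity(), someMatrix);

console.log(right.join(", "));  // 2, 0, 0, 0, 3, 0, 5, 7, 1
console.log(left.join(", "));   // 2, 0, 0, 0, 3, 0, 5, 7, 1
```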

Here are the 5 Fs.

Let's see one more example. In every sample so far our 'F' rotates around its top left corner (well, except for the example where we reversed the order above). This is because the math we are using always rotates around the origin, and the top left corner of our 'F' is at the origin, (0, 0).

But now, because we can do matrix math and we can choose the order that transforms are applied we can move the origin.

// make a matrix that will move the origin of the 'F' to its center.
var moveOriginMatrix = m3.translation(-50, -75);
...

// Multiply the matrices.
var matrix = m3.multiply(translationMatrix, rotationMatrix);
matrix = m3.multiply(matrix, scaleMatrix);
matrix = m3.multiply(matrix, moveOriginMatrix);

Here's that sample. Notice the F rotates and scales around the center.

Using that technique you can rotate or scale from any point. Now you know how Photoshop or Flash let you move the rotation point.
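
As a sketch of what "rotate from any point" can look like in code, here is a hypothetical function, rotationAround, that is not part of the samples. It assumes the m3 functions from above (repeated in compact form so the snippet runs by itself) and builds a single matrix that moves the chosen point to the origin, rotates, and then moves it back.

```javascript
var m3 = {
  // Same sums as the multiply function from earlier, written as loops.
  multiply: function(a, b) {
    var out = [];
    for (var i = 0; i < 3; ++i) {
      for (var j = 0; j < 3; ++j) {
        out.push(b[i * 3] * a[j] + b[i * 3 + 1] * a[3 + j] + b[i * 3 + 2] * a[6 + j]);
      }
    }
    return out;
  },
  translation: function(tx, ty) { return [1, 0, 0,  0, 1, 0,  tx, ty, 1]; },
  rotation: function(angleInRadians) {
    var c = Math.cos(angleInRadians);
    var s = Math.sin(angleInRadians);
    return [c, -s, 0,  s, c, 0,  0, 0, 1];
  },
};

// Hypothetical helper: a rotation about the point (px, py) instead of (0, 0).
// Reading right to left, a position is first moved so (px, py) sits at the
// origin, then rotated, then moved back.
function rotationAround(px, py, angleInRadians) {
  var m = m3.translation(px, py);
  m = m3.multiply(m, m3.rotation(angleInRadians));
  m = m3.multiply(m, m3.translation(-px, -py));
  return m;
}

// Hypothetical helper: applies a flat 3x3 matrix to the position (x, y).
function applyMatrix(m, x, y) {
  return [x * m[0] + y * m[3] + m[6], x * m[1] + y * m[4] + m[7]];
}

var aroundCenter = rotationAround(50, 75, 90 * Math.PI / 180);
var pivot = applyMatrix(aroundCenter, 50, 75);
var neighbor = applyMatrix(aroundCenter, 60, 75);

console.log(pivot);     // roughly [50, 75], the pivot itself does not move
console.log(neighbor);  // roughly [50, 65], it swung a quarter turn around the pivot
```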

Let's go even more crazy. If you go back to the first article on WebGL fundamentals you might remember we have code in the shader to convert from pixels to clip space that looks like this.

...
// convert the rectangle from pixels to 0.0 to 1.0
vec2 zeroToOne = position / u_resolution;

// convert from 0->1 to 0->2
vec2 zeroToTwo = zeroToOne * 2.0;

// convert from 0->2 to -1->+1 (clip space)
vec2 clipSpace = zeroToTwo - 1.0;

gl_Position = vec4(clipSpace * vec2(1, -1), 0, 1);

If you look at each of those steps in turn, the first step, "convert from pixels to 0.0 to 1.0", is really a scale operation. The second is also a scale operation. The next is a translation and the very last scales Y by -1. We can actually do that all in the matrix we pass into the shader. We could make 2 scale matrices, one to scale by 1.0/resolution, another to scale by 2.0, a 3rd to translate by -1.0,-1.0 and a 4th to scale Y by -1 then multiply them all together but instead, because the math is simple, we'll just make a function that makes a 'projection' matrix for a given resolution directly.

var m3 = {
projection: function(width, height) {
// Note: This matrix flips the Y axis so that 0 is at the top.
return [
2 / width, 0, 0,
0, -2 / height, 0,
-1, 1, 1
];
},

...
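
If you're curious whether the long way really gives the same matrix, here is a sketch that multiplies the 4 steps together and compares the result with m3.projection. The function projectionFromSteps is made up for this comparison, and the m3 functions are repeated in compact form so the snippet runs by itself.

```javascript
var m3 = {
  // Same sums as the multiply function from earlier, written as loops.
  multiply: function(a, b) {
    var out = [];
    for (var i = 0; i < 3; ++i) {
      for (var j = 0; j < 3; ++j) {
        out.push(b[i * 3] * a[j] + b[i * 3 + 1] * a[3 + j] + b[i * 3 + 2] * a[6 + j]);
      }
    }
    return out;
  },
  translation: function(tx, ty) { return [1, 0, 0,  0, 1, 0,  tx, ty, 1]; },
  scaling: function(sx, sy) { return [sx, 0, 0,  0, sy, 0,  0, 0, 1]; },
  projection: function(width, height) {
    return [2 / width, 0, 0,  0, -2 / height, 0,  -1, 1, 1];
  },
};

// Hypothetical helper: builds the projection from the 4 separate steps.
// The rightmost matrix is applied to a position first, so the steps are
// listed in reverse order.
function projectionFromSteps(width, height) {
  var m = m3.scaling(1, -1);                              // 4th: flip Y
  m = m3.multiply(m, m3.translation(-1, -1));             // 3rd: 0->2 to -1->+1
  m = m3.multiply(m, m3.scaling(2, 2));                   // 2nd: 0->1 to 0->2
  m = m3.multiply(m, m3.scaling(1 / width, 1 / height));  // 1st: pixels to 0->1
  return m;
}

var direct = m3.projection(400, 300);
var fromSteps = projectionFromSteps(400, 300);

console.log(direct.join(", "));
console.log(fromSteps.join(", "));  // the same values, up to floating point rounding
```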

Now we can simplify the shader even more. Here's the entire new vertex shader.

attribute vec2 a_position;

uniform mat3 u_matrix;

void main() {
// Multiply the position by the matrix.
gl_Position = vec4((u_matrix * vec3(a_position, 1)).xy, 0, 1);
}

And in JavaScript we need to multiply by the projection matrix

// Draw the scene.
function drawScene() {
...

// Compute the matrices
var projectionMatrix = m3.projection(
gl.canvas.clientWidth, gl.canvas.clientHeight);

...

// Multiply the matrices.
var matrix = m3.multiply(projectionMatrix, translationMatrix);
matrix = m3.multiply(matrix, rotationMatrix);
matrix = m3.multiply(matrix, scaleMatrix);

...
}

We also removed the code that set the resolution. With this last step we've gone from a rather complicated shader with 6-7 steps to a very simple shader with only 1 step all due to the magic of matrix math.

Before we move on let's simplify a little bit. While it's common to generate various matrices and separately multiply them together, it's also common to just multiply them as we go. Effectively we could add functions like this

var m3 = {

  ...

  translate: function(m, tx, ty) {
    return m3.multiply(m, m3.translation(tx, ty));
  },

  rotate: function(m, angleInRadians) {
    return m3.multiply(m, m3.rotation(angleInRadians));
  },

  scale: function(m, sx, sy) {
    return m3.multiply(m, m3.scaling(sx, sy));
  },

  ...

};

This would let us change 7 lines of matrix code above to just 4 lines like this

// Compute the matrix
var matrix = m3.projection(gl.canvas.clientWidth, gl.canvas.clientHeight);
matrix = m3.translate(matrix, translation[0], translation[1]);
matrix = m3.rotate(matrix, angleInRadians);
matrix = m3.scale(matrix, scale[0], scale[1]);

And here's that version.

One last thing, we saw above order matters. In the first example we had

translation * rotation * scale

and in the second we had

scale * rotation * translation

And we saw how they are different.

There are 2 ways to look at matrices. Given the expression

projectionMat * translationMat * rotationMat * scaleMat * position

The first way which many people find natural is to start on the right and work to the left

First we multiply the position by the scale matrix to get a scaled position

scaledPosition = scaleMat * position

Then we multiply the scaledPosition by the rotation matrix to get a rotatedScaledPosition

rotatedScaledPosition = rotationMat * scaledPosition

Then we multiply the rotatedScaledPosition by the translation matrix to get a translatedRotatedScaledPosition

translatedRotatedScaledPosition = translationMat * rotatedScaledPosition

And finally we multiply that by the projection matrix to get clip space positions

clipspacePosition = projectionMat * translatedRotatedScaledPosition
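
The right to left reading above can be sketched in JavaScript like this. It assumes a 400x300 canvas, a translation of 150, 100, a rotation of 33 degrees and a scale of 2, 1.5; the position (10, 20) is an arbitrary example and applyMatrix is a hypothetical helper. The m3 functions are repeated in compact form so the snippet runs by itself.

```javascript
var m3 = {
  // Same sums as the multiply function from earlier, written as loops.
  multiply: function(a, b) {
    var out = [];
    for (var i = 0; i < 3; ++i) {
      for (var j = 0; j < 3; ++j) {
        out.push(b[i * 3] * a[j] + b[i * 3 + 1] * a[3 + j] + b[i * 3 + 2] * a[6 + j]);
      }
    }
    return out;
  },
  translation: function(tx, ty) { return [1, 0, 0,  0, 1, 0,  tx, ty, 1]; },
  rotation: function(angleInRadians) {
    var c = Math.cos(angleInRadians);
    var s = Math.sin(angleInRadians);
    return [c, -s, 0,  s, c, 0,  0, 0, 1];
  },
  scaling: function(sx, sy) { return [sx, 0, 0,  0, sy, 0,  0, 0, 1]; },
  projection: function(width, height) {
    return [2 / width, 0, 0,  0, -2 / height, 0,  -1, 1, 1];
  },
};

// Hypothetical helper: applies a flat 3x3 matrix to a position [x, y].
function applyMatrix(m, p) {
  return [p[0] * m[0] + p[1] * m[3] + m[6], p[0] * m[1] + p[1] * m[4] + m[7]];
}

var projectionMat = m3.projection(400, 300);
var translationMat = m3.translation(150, 100);
var rotationMat = m3.rotation(33 * Math.PI / 180);
var scaleMat = m3.scaling(2, 1.5);
var position = [10, 20];

// One matrix at a time, starting from the right.
var scaledPosition = applyMatrix(scaleMat, position);  // [20, 30]
var rotatedScaledPosition = applyMatrix(rotationMat, scaledPosition);
var translatedRotatedScaledPosition = applyMatrix(translationMat, rotatedScaledPosition);
var clipspacePosition = applyMatrix(projectionMat, translatedRotatedScaledPosition);

// All matrices multiplied together first, then applied once.
var matrix = m3.multiply(projectionMat, translationMat);
matrix = m3.multiply(matrix, rotationMat);
matrix = m3.multiply(matrix, scaleMat);

console.log(clipspacePosition);
console.log(applyMatrix(matrix, position));  // the same position, up to rounding
```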

The 2nd way to look at matrices is reading from left to right. In that case each matrix changes the space represented by the canvas. The canvas starts with representing clip space (-1 to +1) in each direction. Each matrix applied from left to right changes the space represented by the canvas.

Step 1: no matrix (or the identity matrix)

clip space

The white area is the canvas. Blue is outside the canvas. We're in clip space. Positions passed in need to be in clip space.

Step 2: matrix = m3.projection(gl.canvas.clientWidth, gl.canvas.clientHeight);

from clip space to pixel space

We're now in pixel space. X = 0 to 400, Y = 0 to 300 with 0,0 at the top left. Positions passed in using this matrix need to be in pixel space. The flash you see is when the space flips from positive Y = up to positive Y = down.

Step 3: matrix = m3.translate(matrix, tx, ty);

move origin to tx, ty

The origin has now been moved to tx, ty (150, 100). The space has moved.

Step 4: matrix = m3.rotate(matrix, rotationInRadians);

rotate 33 degrees

The space has been rotated around tx, ty.

Step 5: matrix = m3.scale(matrix, sx, sy);

scale the space

The previously rotated space with its center at tx, ty has been scaled 2 in x, 1.5 in y.

In the shader we then do gl_Position = matrix * position. The position values are effectively in this final space.
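
To follow the left to right reading in code, the sketch below builds the matrix one step at a time and prints where the origin of the current space ends up in clip space after each step. It assumes the 400x300 canvas and the values used in the steps above; originInClipSpace is a hypothetical helper, and the m3 functions are repeated in compact form so the snippet runs by itself.

```javascript
var m3 = {
  // Same sums as the multiply function from earlier, written as loops.
  multiply: function(a, b) {
    var out = [];
    for (var i = 0; i < 3; ++i) {
      for (var j = 0; j < 3; ++j) {
        out.push(b[i * 3] * a[j] + b[i * 3 + 1] * a[3 + j] + b[i * 3 + 2] * a[6 + j]);
      }
    }
    return out;
  },
  projection: function(width, height) {
    return [2 / width, 0, 0,  0, -2 / height, 0,  -1, 1, 1];
  },
  translate: function(m, tx, ty) {
    return m3.multiply(m, [1, 0, 0,  0, 1, 0,  tx, ty, 1]);
  },
  rotate: function(m, angleInRadians) {
    var c = Math.cos(angleInRadians);
    var s = Math.sin(angleInRadians);
    return m3.multiply(m, [c, -s, 0,  s, c, 0,  0, 0, 1]);
  },
  scale: function(m, sx, sy) {
    return m3.multiply(m, [sx, 0, 0,  0, sy, 0,  0, 0, 1]);
  },
};

// Hypothetical helper: where the position (0, 0) of the current space lands
// in clip space. For (0, 0) only the last row of the matrix contributes.
function originInClipSpace(m) {
  return [m[6], m[7]];
}

// Step 2: pixel space. The origin is the top left corner of the canvas.
var matrix = m3.projection(400, 300);
var originAfterProjection = originInClipSpace(matrix);
console.log(originAfterProjection);  // [-1, 1]

// Step 3: the origin moves to tx, ty (150, 100).
matrix = m3.translate(matrix, 150, 100);
var originAfterTranslate = originInClipSpace(matrix);
console.log(originAfterTranslate);  // roughly [-0.25, 0.333]

// Steps 4 and 5: rotating and scaling change the axes of the space
// but leave its origin where it is.
matrix = m3.rotate(matrix, 33 * Math.PI / 180);
matrix = m3.scale(matrix, 2, 1.5);
var originAfterAll = originInClipSpace(matrix);
console.log(originAfterAll);  // same as the previous line
```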

Use whichever way you feel is easier to understand.

I hope these posts have helped demystify matrix math. If you want to stick with 2D I'd suggest checking out recreating canvas 2d's drawImage function and following that into recreating canvas 2d's matrix stack.

Otherwise next we'll move on to 3D. In 3D the matrix math follows the same principles and usage. I started with 2D to hopefully keep it simple to understand.

Also, if you really want to become an expert in matrix math check out these amazing videos.

What are clientWidth and clientHeight?

Up until this point whenever I referred to the canvas's dimensions I used canvas.width and canvas.height but above when I called m3.projection I instead used canvas.clientWidth and canvas.clientHeight. Why?

Projection matrices are concerned with how to take clip space (-1 to +1 in each dimension) and convert it back to pixels. But, in the browser, there are 2 types of pixels we are dealing with. One is the number of pixels in the canvas itself. So for example a canvas defined like this.

<canvas width="400" height="300"></canvas>

or one defined like this

var canvas = document.createElement("canvas");
canvas.width = 400;
canvas.height = 300;

both contain an image 400 pixels wide by 300 pixels tall. But that size is separate from the size at which the browser actually displays that 400x300 pixel canvas. CSS defines the size at which the canvas is displayed. For example if we made a canvas like this.

<style>
canvas {
width: 100vw;
height: 100vh;
}
</style>
...
<canvas width="400" height="300"></canvas>

The canvas will be displayed whatever size its container is. That's likely not 400x300.

Here are two examples that set the canvas's CSS display size to 100% so the canvas is stretched out to fill the page. The first one uses canvas.width and canvas.height. Open it in a new window and resize the window. Notice how the 'F' doesn't have the correct aspect. It gets distorted.

In this second example we use canvas.clientWidth and canvas.clientHeight. These report the size at which the canvas is actually being displayed by the browser, so in this case, even though the canvas still only has 400x300 pixels, the 'F' always looks correct because we're defining our aspect ratio based on the displayed size.

Most apps that allow their canvases to be resized try to make the canvas.width and canvas.height match the canvas.clientWidth and canvas.clientHeight because they want there to be one pixel in the canvas for each pixel displayed by the browser. But, as we've seen above, that's not the only option. That means, in almost all cases, it's more technically correct to compute a projection matrix's aspect ratio using canvas.clientHeight and canvas.clientWidth.
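
For apps that do want one canvas pixel per displayed pixel, a minimal sketch of the idea might look like the function below. The name resizeCanvasToDisplaySize is an assumption made for this illustration, and the plain object stands in for a real canvas element so the snippet can run outside a browser.

```javascript
// Hypothetical helper: makes the number of pixels in the canvas match the
// size at which the browser is displaying it. Returns true if it resized.
function resizeCanvasToDisplaySize(canvas) {
  var displayWidth = canvas.clientWidth;
  var displayHeight = canvas.clientHeight;

  var needResize = canvas.width !== displayWidth ||
                   canvas.height !== displayHeight;
  if (needResize) {
    canvas.width = displayWidth;
    canvas.height = displayHeight;
  }
  return needResize;
}

// Stand-in for a <canvas> element with 400x300 pixels that CSS displays at
// 800x450. In a page you would pass the real canvas instead.
var fakeCanvas = { width: 400, height: 300, clientWidth: 800, clientHeight: 450 };

console.log(resizeCanvasToDisplaySize(fakeCanvas));  // true
console.log(fakeCanvas.width, fakeCanvas.height);    // 800 450
console.log(resizeCanvasToDisplaySize(fakeCanvas));  // false, sizes already match
```

In a real page you would typically call something like this at the start of drawScene and then update the viewport with gl.viewport(0, 0, gl.canvas.width, gl.canvas.height), since changing the canvas size does not change the viewport for you.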