Matrix Operations

When I was a very young coder, one of the main problems I had with 3D algorithms was calculating transformations. I had some texts about rotations and transformations in 3D with and without matrices, but I had no clue how to find the correct matrices, for example for rotations about an arbitrary axis or for a camera (with a given position, look-at point and up vector).

This article is not meant to show you these special matrices, but rather is a tutorial about how to create the correct matrix for the desired transformation.

Introduction to matrices

A matrix is a very powerful and simple-to-use tool. Suppose you want to apply some (linear) transformation, given as a matrix, to some of your vectors: you just multiply the matrix with each vector. From now on I will use right-multiplication with a column vector, so:

/ a11 a12 a13 a14\   /v1\   /a11*v1 + a12*v2 + a13*v3 + a14*v4\
| a21 a22 a23 a24|   |v2|   |a21*v1 + a22*v2 + a23*v3 + a24*v4|
| a31 a32 a33 a34| * |v3| = |a31*v1 + a32*v2 + a33*v3 + a34*v4|
\ a41 a42 a43 a44/   \v4/   \a41*v1 + a42*v2 + a43*v3 + a44*v4/

This is important if you use DirectX (I don't know what the case is with OpenGL...), as Direct3D does it the other way round (if I'm not wrong): it left-multiplies with a row vector. But you only have to mirror the matrix entries across the main diagonal (along a11 to a44), that is, transpose the matrix, to get the correct matrix for Direct3D.
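As a minimal sketch of the two conventions in Python (the helper names mat_vec, vec_mat and transpose are made up for this illustration and belong to no particular library), the following multiplies a matrix with a column vector and then checks that a row vector multiplied with the mirrored matrix gives the same numbers:

```python
def mat_vec(m, v):
    """Column-vector convention: return m * v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def vec_mat(v, m):
    """Row-vector convention: return v * m."""
    return [sum(v[k] * m[k][j] for k in range(len(v))) for j in range(len(m[0]))]


def transpose(m):
    """Mirror the matrix across its main diagonal."""
    return [list(row) for row in zip(*m)]


# Arbitrary example values, chosen only for illustration.
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
v = [1, 0, 2, 1]

column_result = mat_vec(A, v)           # A * v, as used in this article
row_result = vec_mat(v, transpose(A))   # v * (A mirrored), the row-vector way
```

Both results hold the same four numbers, which is why mirroring the matrix is enough when switching between the two conventions for a single matrix.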

Composition of Matrices

The most useful thing about matrices is the composition of transformations. If you have a car driving along a street with a person sitting in it, nodding his head, you will probably have several transformations. If the matrix for the position of the car is A, the matrix of the person in the car is B, the matrix for the nodding transformation is C, and one vector in the head of the person is v, you first have to calculate C*v, resulting in the new head position. Then you multiply the matrix B with the vector C*v, and finally the matrix A with that result. You end up with the vector A*(B*(C*v)).

The key point is that matrix multiplication is associative, which means you may move the brackets as you need. So A*(B*(C*v)) = ((A*B)*C)*v. This is useful, as you can calculate D = ((A*B)*C) once and then calculate D*v for every vector v in the head of the person. You only have to take care that the matrix for the first transformation is the rightmost one in your multiplication (this is exactly the other way round in Direct3D!). If you don't know how to do matrix multiplication, here is a short help:

                    / b11 b12 b13 b14\
                    | b21 b22 b23 b24|
                    | b31 b32 b33 b34|
                    \ b41 b42 b43 b44/

/ a11 a12 a13 a14\  / c11 c12 c13 c14\
| a21 a22 a23 a24|  | c21 c22 c23 c24|
| a31 a32 a33 a34|  | c31 c32 c33 c34|
\ a41 a42 a43 a44/  \ c41 c42 c43 c44/

If you want to multiply A and B and store the result in C, write A and B as in the picture above. To calculate cij, take row i from A and column j from B, multiply them entry by entry and add up the products, so:

c11 = a11*b11 + a12*b21 + a13*b31 + a14*b41
c32 = a31*b12 + a32*b22 + a33*b32 + a34*b42

and so on. And you will get C = A*B.
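The row-times-column rule and the trick of combining several matrices into one can be sketched in Python as follows. The matrices A, B and C here are arbitrary stand-ins for the car, person and nodding transformations (a scaling and two 90 degree rotations, picked so that all numbers stay integers), and the function names are invented for this example:

```python
def mat_mul(a, b):
    """Return a * b: entry (i, j) is row i of a times column j of b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_vec(m, v):
    """Return m * v for a column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


# Stand-in matrices (assumptions for illustration only).
A = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 2, 0], [0, 0, 0, 1]]   # scaling by 2
B = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # 90 degrees about z
C = [[1, 0, 0, 0], [0, 0, -1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]  # 90 degrees about x
v = [1, 2, 3, 1]

# Apply C first, then B, then A: A*(B*(C*v)).
step_by_step = mat_vec(A, mat_vec(B, mat_vec(C, v)))

# Combine once, then reuse D for every vector: ((A*B)*C)*v.
D = mat_mul(mat_mul(A, B), C)
combined = mat_vec(D, v)
```

Both ways give the same vector, but D only has to be computed once per frame instead of three matrix-vector products per vector. Swapping the order of the factors, for example B*C versus C*B, generally gives a different matrix.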

Some sort of Linear Algebra

Before I start with the real article, I want to make a short excursion into Linear Algebra, the father and mother of all 3D graphics. Let's say you have the common 3D vector space, with vectors as you know them (I mean columns of real numbers). Linear Algebra teaches us that every vector space has "bases", each with as many elements as the dimension of the vector space. The special thing about a basis is that every vector in the space can be described as a linear combination of the elements of the basis. Let's take an example to make clear what I mean. E = (e1, e2, e3) is the canonical basis with:

     /1\      /0\      /0\
e1 = |0| e2 = |1| e3 = |0|
     \0/      \0/      \1/

Now every vector v may be described as v = v1*e1 + v2*e2 + v3*e3, where in the special case of this basis, v1, v2 and v3 are simply the coordinates of v. There are also other bases, like:

     /1\      /0     \      /0      \
b1 = |0| b2 = |cos(a)| b3 = |-sin(a)|
     \0/      \sin(a)/      \cos(a) /

With 'a' being any angle.

Again every vector v may be described as v = v1*b1 + v2*b2 + v3*b3, where v1, v2 and v3 are now of course different from above. You may think of bases as coordinate systems. But the axes do not have to be orthogonal to each other, and they don't need to have equal lengths.
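To make the idea of a linear combination concrete, here is a small Python sketch. The basis called SKEWED is a made-up example (not taken from anywhere in particular) whose axes are neither orthogonal nor of equal length, and combine is just an illustrative helper name:

```python
def combine(coords, basis):
    """Return coords[0]*basis[0] + coords[1]*basis[1] + coords[2]*basis[2]."""
    return [sum(c * b[i] for c, b in zip(coords, basis)) for i in range(3)]


E = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]        # canonical basis
SKEWED = [(1, 0, 0), (1, 1, 0), (0, 0, 2)]   # hypothetical non-orthogonal basis

coords = (2, 3, 1)
in_canonical = combine(coords, E)     # [2, 3, 1]: the coordinates are the vector itself
in_skewed = combine(coords, SKEWED)   # [5, 3, 2]: same coordinates, different vector
```

The same three numbers describe two different vectors, depending on which basis they refer to. Put the other way round, one and the same vector has different coordinates in different bases.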

What's the use of all this for transformations? That's not too hard. If you want to move something, you can change all of the coordinates by hand in the way you want. OK. But it's also possible to just change the coordinate system of your object and look at the new coordinates. Let me give you an example for clarity. Suppose you have the mesh of an apple, with the origin of the mesh at the apple's center. You also have the mesh of a house with a table that is far from the house mesh's center. Now you want to place the apple on that table and get the coordinates of the apple there.

What you could do now is try to figure out a transformation that moves the apple to the other place. But there is a simpler way. Just imagine the apple's coordinate system lying on top of that table. Now you may say that the vectors of the apple are already at the correct place, only that they are still expressed in the wrong coordinate system. All you have to do is express the vectors of the apple in the coordinate system (basis) of the house. The next chapter will show you how to do this.

Change of basis as a transformation

At first let's not speak about translation, only rotation, scaling and the like. I hope it's clear now that all you want is to create two bases, representing the old and the new coordinate system, and then a matrix that describes the transformation from one basis to the other. The first part is simple to do. Always. If you want a scaling, just use another basis with different lengths of the axes. If you want a rotation about any axis, just rotate the canonical basis E around this axis and use the result as the second basis. If you have a camera, just use the camera information as the other basis, and so on...

The only thing that is still missing is the transformation matrix. If you want to transform vectors from a basis a = (a1, a2, a3) to a basis b = (b1, b2, b3) (notice that a1, a2, a3, b1, b2 and b3 are vectors), you have to express the vectors of a in terms of the basis b. That means you have to solve the system of equations:

a1 = m11*b1 + m21*b2 + m31*b3
a2 = m12*b1 + m22*b2 + m32*b3
a3 = m13*b1 + m23*b2 + m33*b3

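You will end up with 9 linear equations (three per vector equation) and get the coefficients of the transformation matrix M. Notice that in the equations above the coefficients do not appear at the positions they have in M: the coefficients belonging to a1 form the first column of M, not the first row, as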

    /m11 m12 m13\
M = |m21 m22 m23|
    \m31 m32 m33/
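One possible way to solve this system in code is sketched below in Python. Cramer's rule is used here only because it is short for 3x3 systems; the article does not prescribe a method, and all function names as well as the two example bases SRC and DST are assumptions made for this illustration. Fractions keep the arithmetic exact.

```python
from fractions import Fraction


def columns_to_matrix(vectors):
    """Build a matrix whose columns are the given vectors."""
    return [list(row) for row in zip(*vectors)]


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve m * x = rhs with Cramer's rule."""
    d = Fraction(det3(m))
    if d == 0:
        raise ValueError("the vectors do not form a basis")
    solution = []
    for col in range(3):
        # Replace one column of m by the right-hand side.
        replaced = [[rhs[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution


def change_of_basis(src, dst):
    """Matrix M taking coordinates w.r.t. src to coordinates w.r.t. dst."""
    b = columns_to_matrix(dst)
    # Column j of M holds the coordinates of src[j] in the basis dst.
    return columns_to_matrix([solve3(b, a_j) for a_j in src])


def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


SRC = [(2, 0, 0), (1, 1, 0), (0, 0, 1)]   # hypothetical basis a
DST = [(1, 1, 0), (0, 1, 0), (0, 0, 1)]   # hypothetical basis b

M = change_of_basis(SRC, DST)
dst_coords = mat_vec(M, [1, 1, 1])   # coordinates of a1 + a2 + a3 in basis b
```

As a check, a1 + a2 + a3 is the vector (3, 1, 1), and 3*b1 - 2*b2 + 1*b3 gives the same vector, so the coordinates (3, -2, 1) returned by M are the right ones.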

If you have the special case that the destination basis is the canonical basis E = (e1, e2, e3), which is often the case (except for the camera thing), you can simply use the vectors of the source basis as the columns of the transformation matrix. Thus if you want to transform vectors from the above basis B = (b1, b2, b3) to E, you get the transformation matrix

    /1  0       0      \
M = |0  cos(a)  -sin(a)|
    \0  sin(a)  cos(a) /

It should be clear that the general form of this matrix should be worked out only once, by hand, for the whole demo, with some parameters left open (like angles, axes and so on), and not derived again at runtime.
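With the form of the matrix worked out by hand, all that remains at runtime is filling in the angle. A minimal Python sketch, with function names invented for this example, builds the matrix from the rotated basis vectors as columns:

```python
import math


def basis_to_matrix(b1, b2, b3):
    """Use the basis vectors as the columns of a 3x3 matrix."""
    return [list(row) for row in zip(b1, b2, b3)]


def rotation_x(a):
    """Matrix taking coordinates in the rotated basis (b1, b2, b3) to E."""
    c, s = math.cos(a), math.sin(a)
    b1 = (1.0, 0.0, 0.0)
    b2 = (0.0, c, s)
    b3 = (0.0, -s, c)
    return basis_to_matrix(b1, b2, b3)


def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


M = rotation_x(math.radians(90))
# The second basis vector of the rotated system, expressed in E.
# For 90 degrees this is (0, 0, 1) up to floating point rounding.
rotated = mat_vec(M, [0.0, 1.0, 0.0])
```

Multiplying M with a canonical unit vector just picks out the matching column, so the result is the corresponding rotated basis vector.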

Translations

Translations are not linear transformations in 3D. This is the reason why programmers flee into 4D. I never knew the exact idea of how and why this works, so don't ask me about it. I just know that it's easy to calculate: you add the position of the coordinate system's origin as the last column and a (0 0 0 1) row at the bottom, and give every vector a fourth coordinate v4 = 1. Everything else is equivalent to the above. So if you have an object that lies 3 units along the y-axis, the transformation matrix built from its source basis will be

/1 0 0 0\
|0 1 0 3|
|0 0 1 0|
\0 0 0 1/
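The recipe can be sketched in Python like this. The helper name frame_to_matrix is made up for this example, and the use of a fourth coordinate 0 for pure directions is a common convention that goes slightly beyond what the article covers:

```python
def frame_to_matrix(x_axis, y_axis, z_axis, origin):
    """4x4 matrix with the axes and the origin as columns plus a (0 0 0 1) row."""
    rows = [[x_axis[i], y_axis[i], z_axis[i], origin[i]] for i in range(3)]
    rows.append([0, 0, 0, 1])
    return rows


def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


# Canonical axes, origin moved 3 units along the y-axis (the matrix shown above).
M = frame_to_matrix((1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 3, 0))

point = mat_vec(M, [1, 2, 3, 1])       # fourth coordinate 1: the point is moved
direction = mat_vec(M, [1, 2, 3, 0])   # fourth coordinate 0: translation has no effect
```

The fourth coordinate acts as a switch for the last column: with 1 the origin is added to the result, with 0 it is left out.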

That's all there is to it. As an exercise, you should now be able to write a transformation matrix for a rotation about any axis, for a camera, or the like.

Pitfalls

If you get used to the formalism above, you will find out that it's very simple indeed. Even so, there are some pitfalls I often fall into.

The thing I most often do wrong is mixing up the source and destination bases. It's normally exactly the opposite of what you think at first. I always draw some pictures to get it clear.

Another thing I have problems with is mixing up rows and columns of matrices, especially as I have to swap them as soon as I switch to D3D. If you are 100% sure that your matrix is correct and it still doesn't work, try mirroring it across the main diagonal (exchanging rows and columns). You wouldn't guess how often this has helped me.

And also don't get mixed up about what the transformation matrix contains. Some sources tell you that the transformation matrix consists of the coordinates of the destination basis expressed in terms of the source basis. But that way you get the matrix that transforms the basis, not the vectors that lie in this basis.

A general hint: if your matrix doesn't work, just exchange 1) the order of the coordinates in the matrix or 2) the order in which you multiply the matrices, if you are using several of them.
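Both hints are connected: mirroring a product of matrices across the main diagonal also reverses the order of its factors. The following Python sketch checks this on two small matrices with arbitrary example values (the helper names are made up for this illustration):

```python
def mat_mul(a, b):
    """Return a * b: entry (i, j) is row i of a times column j of b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Mirror the matrix across its main diagonal."""
    return [list(row) for row in zip(*m)]


# Arbitrary example values.
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 6]]

mirrored_product = transpose(mat_mul(A, B))            # (A*B) mirrored
reversed_order = mat_mul(transpose(B), transpose(A))   # B mirrored * A mirrored
same_order = mat_mul(transpose(A), transpose(B))       # A mirrored * B mirrored
```

Here mirrored_product equals reversed_order but not same_order. So when moving a chain of matrices to the row-vector convention, each matrix has to be mirrored and the multiplication order has to be reversed as well.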

Some thoughts about formal Math

Some people think that the more formal things are, the more difficult they are. I don't completely share this opinion. Before I went to university, I only used the rotation, scaling and translation matrices I had read about somewhere, and all possible combinations of them. But some things are hard to model with this. After attending Linear Algebra lectures for one year now, I must say that many things are clear now that were not some time ago. Not everything that is done in Math is useful for demo writing, but much is. If anything is unclear to you, I would suggest reading a good Linear Algebra book. I personally prefer "Linear Algebra" by Seymour Lipschutz. It was written in January 1968, and I don't know whether you can still get it, but it's by far the best. Just don't expect to work through it in a day or two.

Even if you understood everything, there might be a reason to read something like that. For example, you might want to calculate the inverse of a matrix, e.g. for a calculation of inhomogeneous light sources. Or you may want to learn something about different ways to measure lengths, as different length measures lead to different spheres (and different sphere environment maps, like e.g. cube maps, as cubes are also spheres, at least for mathematicians :)).

And so on.