Texture mapping, part 2

This is intended to be a sequel to the article on texture mapping that Boreal wrote for Hugi 23. Boreal's article is on what is usually called "affine" texture mapping: that is, one picks the texture map, passes it through an "affine" transformation, and draws the result.

What is an "affine" transformation? Simply put, it means that you
stretch the texture map after perspective projection has taken place:
if *(u,v)* are the coordinates in texture space, and *(x,y)*
are the coordinates in screen space, the formula is simply *(u,v) = A
(x,y)* where A is an appropriate matrix. Computing the matrix is
what the precomputation phase in Boreal's code does.

This method is extremely fast, because you only have to do two adds per
pixel. Unfortunately, it causes severe distortion in the texture if the
*z* varies a lot between different edges of the texture; this is
because the algorithm basically throws away the *z* coordinate and
thus can be exact only if the *z* coordinate is constant throughout
the polygon.
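To make the "two adds per pixel" concrete, here is a minimal C sketch of an affine span loop; the function name and the precomputed per-pixel steps *dudx* and *dvdx* are illustrative, not taken from Boreal's code:

```c
#include <assert.h>
#include <math.h>

/* Affine inner loop sketch: u and v advance by constant per-pixel
   steps, so each pixel costs only two additions.  The texel fetch
   and plot are elided; the final (u,v) is returned for checking. */
static void affine_span(double u0, double v0, double dudx, double dvdx,
                        int X1, int X2, double *u_out, double *v_out)
{
    double u = u0, v = v0;
    for (int X = X1; X <= X2; X++) {
        /* ... fetch texel at (u,v), plot it at (X,Y) ... */
        u += dudx;   /* one add */
        v += dvdx;   /* one add */
    }
    *u_out = u;
    *v_out = v;
}
```

The steps *dudx* and *dvdx* are exactly the first column of the matrix A, computed once per polygon.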

The method I present here is quite the opposite: setting up the algorithm for each polygon is quite expensive, and you have to do two divides (!) per pixel, but the result is perspective-correct. The two cubes below (in particular, their rightmost faces) show the difference:

It is rare to see tutorials with a full derivation of perfect texture mapping; in fact, I could not say I fully understood it myself before finishing this article. I will first present an even more expensive algorithm that works like this:

1) map the coordinates in screenspace back to 3D space

2) compute an affine transform on the 3D *x* and *y*
coordinates (rather than on the screen *x* and *y*)

3) use the resulting coordinates as offsets in the texture map

I will attack the first two points (the third has nothing special in it); then I'll show a way to reduce the algorithm to the promised two divides per pixel.

Mapping the coordinates in screenspace back to 3D space

First of all, let's clarify a fundamental point. Why is this necessary?
Couldn't we do all the texture-related work in 3D space first, and only
then move to screenspace? The answer of course is no, and this is quite
a general fact: for example, when doing *z*-buffering you can fill the
buffer only once you know which parts of it are to be filled -- that
is, you can compute the value of *z* for each pixel only after you
have done perspective projection and scanline conversion.

Our purpose can be fulfilled with a little algebra, based on this very
important fact: *1/z* values are linear in screenspace -- that is,
*1/z* is a linear combination of the screen *X* and *Y* coordinates
(plus a constant). This is easily proven starting from the
equations of a plane and those of perspective projection (here and
throughout the article, lowercase letters indicate 3D space, and
uppercase letters indicate screen space):

    a x + b y + c z + d = 0
    X = x / z
    Y = y / z

and dividing everything by *z*:

    a x/z + b y/z + c + d/z = 0
    d * 1/z = -a X - b Y - c
    1/z = -a/d X - b/d Y - c/d

This little theorem is in common use. For example, most 3D engines
(including the original Quake engine by Abrash and Carmack) actually
store *1/z* values in their *z*-buffer.

Note that the coefficients in the plane equation are all known: a, b, and c are simply the components of the face normal, and d is computed when you do backface culling (it is minus the scalar product of the normal and an arbitrary point of the polygon).
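The *1/z* theorem is easy to check numerically. This C snippet (plane coefficients below are arbitrary test values, not from any particular engine) evaluates the right-hand side of the last equation above:

```c
#include <assert.h>
#include <math.h>

/* For any 3D point on the plane a x + b y + c z + d = 0, this
   linear expression in the screen coordinates X and Y must equal
   the point's 1/z (up to floating-point rounding). */
static double inv_z_from_screen(double a, double b, double c, double d,
                                double X, double Y)
{
    return -a/d * X - b/d * Y - c/d;
}
```

Picking any point on the plane, projecting it, and comparing the result of this function against the true *1/z* confirms the derivation.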

Anyway, now we have three values that are linear in screenspace: screen
*X*, screen *Y* and 3D *1/z*; actually we don't care
about screen *Y* because we will render our polygon in horizontal
scanlines. We can reverse the perspective projection and obtain the
three-dimensional *x* and *y* for each point on the screen:

    x = X * z = X / (1/z)
    y = Y * z = Y / (1/z)

So the basic algorithm is as follows (the *zInv* variable holds
*1/z*):

    for each polygon
        scan convert it
        for each scanline Y
            X1 = leftmost point to be plotted (inclusive)
            X2 = rightmost point to be plotted (inclusive)
            zInv = -a/d * X1 - b/d * Y - c/d
            for X = X1 to X2
                x = X / zInv
                y = Y / zInv
                zInv = zInv - a/d
                ... map (x,y) to (u,v) ...
                ... plot the (u,v) texel at (X,Y) ...

Things to be noted:

1) variable names are case sensitive and follow the same convention as the rest of the document

2) the *X* and *Y* screen coordinates must be
computed as *X=x/z* and *Y=y/z*. If you scale and translate
them (for example a common choice in a 640x480 resolution is
*X=480 x/z + 320*, *Y=480 y/z + 240*) you must compensate for
this in your code.

3) most of the coefficients can be computed once per polygon
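Putting the notes together, the back-projection step can be sketched in C as follows (plane coefficients are assumed known per polygon; the function name is illustrative):

```c
#include <assert.h>
#include <math.h>

/* Recover the 3D x and y of a pixel from its screen coordinates
   alone: zInv comes from the plane equation, and dividing by it
   reverses the perspective projection X = x/z, Y = y/z. */
static void unproject(double a, double b, double c, double d,
                      double X, double Y, double *x, double *y)
{
    double zInv = -a/d * X - b/d * Y - c/d;
    *x = X / zInv;
    *y = Y / zInv;
}
```

Projecting a point on the plane and feeding its screen coordinates back through this function returns the original 3D coordinates.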

Now, let's proceed.

Compute an affine transform on the 3D *x* and *y* coordinates

Some more math... I will show that the *u* and *v* coordinates
can be computed from *x* and *y* only -- *z* is not
necessary. This is very easy; certainly *u* and *v* are a
linear combination of *x*, *y* and *z* (almost everything
is a linear combination of *x*, *y* and *z*...), so we
can write these equations

    u = I x + J y + K z + L
    a x + b y + c z + d = 0

and substitute from the plane equation:

    z = -a/c x - b/c y - d/c
    u = I x + J y + K (-a/c x - b/c y - d/c) + L
      = (I - K a/c) x + (J - K b/c) y + (L - K d/c)

The same holds for *v*. This is not needed to implement the
algorithm, but only to justify it. So we have to find six values
*A*, *B*, *C*, *D*, *E*, *F*
such that

    u = A x + B y + C
    v = D x + E y + F

These values can be found if we have the *(x,y)* and *(u,v)* values
for any three points on the polygon: we can set up a system of three
linear equations and solve it. The formula is scary, about as scary
as the following pseudocode:

    den = -x2*y1 + x3*y1 + x1*y2 - x3*y2 - x1*y3 + x2*y3
    if abs(den) is small then
        don't plot the face, it is perpendicular to the viewer
    den = 1.0 / den
    A = (-u2*y1 + u3*y1 + u1*y2 - u3*y2 - u1*y3 + u2*y3) * den
    B = ( u2*x1 - u3*x1 - u1*x2 + u3*x2 + u1*x3 - u2*x3) * den
    C = (-u3*x2*y1 + u2*x3*y1 + u3*x1*y2 - u1*x3*y2 - u2*x1*y3 + u1*x2*y3) * den
    D = (-v2*y1 + v3*y1 + v1*y2 - v3*y2 - v1*y3 + v2*y3) * den
    E = ( v2*x1 - v3*x1 - v1*x2 + v3*x2 + v1*x3 - v2*x3) * den
    F = (-v3*x2*y1 + v2*x3*y1 + v3*x1*y2 - v1*x3*y2 - v2*x1*y3 + v1*x2*y3) * den

(There are a few places where one could apply the distributive property of multiplication, but that would clutter the pseudocode with parentheses.)
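A direct C transcription of the pseudocode (shown for the *u* coefficients only; *v* works the same way with the same *den*) doubles as a sanity check, since feeding it three points of a known mapping must give back that mapping's coefficients:

```c
#include <assert.h>
#include <math.h>

/* Solve for u = A x + B y + C from three non-collinear points with
   known (x,y) and u values, by Cramer's rule.  Returns 0 if the
   determinant is too small (degenerate face). */
static int solve_uv_coeffs(double x1, double y1, double x2, double y2,
                           double x3, double y3,
                           double u1, double u2, double u3,
                           double *A, double *B, double *C)
{
    double den = -x2*y1 + x3*y1 + x1*y2 - x3*y2 - x1*y3 + x2*y3;
    if (fabs(den) < 1e-12)
        return 0;
    den = 1.0 / den;
    *A = (-u2*y1 + u3*y1 + u1*y2 - u3*y2 - u1*y3 + u2*y3) * den;
    *B = ( u2*x1 - u3*x1 - u1*x2 + u3*x2 + u1*x3 - u2*x3) * den;
    *C = (-u3*x2*y1 + u2*x3*y1 + u3*x1*y2 - u1*x3*y2
          - u2*x1*y3 + u1*x2*y3) * den;
    return 1;
}
```

In a real setup routine the reciprocal of *den* is computed once per polygon and shared between the *u* and *v* solves.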

Now we can be more specific in our rendering loop:

    for each polygon
        compute A, B, C, D, E, F and other common values
        scan convert it
        for each scanline Y
            X1 = leftmost point to be plotted (inclusive)
            X2 = rightmost point to be plotted (inclusive)
            zInv = -a/d * X1 - b/d * Y - c/d
            for X = X1 to X2
                x = X / zInv
                y = Y / zInv
                zInv = zInv - a/d
                u = A * x + B * y + C
                v = D * x + E * y + F
                ... plot the (u,v) texel at (X,Y) ...

That's it. Except that I promised two divides per pixel, and we still have four multiplications on top of them.

Removing the multiplications

Let's work on *u* alone, since the simplifications for *v* are
exactly the same. We have

    u = A * X / zInv + B * Y / zInv + C = (A * X + B * Y + C * zInv) / zInv

Since between two adjacent pixels *X* varies by 1 and *zInv* varies
by *-a/d* (while *Y* stays constant), the numerator varies by the
constant *A - C * a/d*. We can then use the numerators as "magic
coordinates" (let's call them *m* and *n*) that are related to
*(u,v)* but increase linearly in screenspace: they are simply

    m = u * zInv
    n = v * zInv

Using *(m,n)* instead of *(x,y)* we have the final algorithm:

    for each polygon
        compute A, B, C, D, E, F and other common values
        scan convert it
        for each scanline Y
            X1 = leftmost point to be plotted (inclusive)
            X2 = rightmost point to be plotted (inclusive)
            zInv = -a/d * X1 - b/d * Y - c/d
            m = A * X1 + B * Y + C * zInv
            n = D * X1 + E * Y + F * zInv
            dm = A - C * a/d
            dn = D - F * a/d
            for X = X1 to X2
                u = m / zInv
                v = n / zInv
                zInv = zInv - a/d
                m = m + dm
                n = n + dn
                ... plot the (u,v) texel at (X,Y) ...
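The equivalence between the incremental loop and the direct computation can be verified in C; the helper below (plane and mapping coefficients are arbitrary test values) walks one scanline both ways and returns the largest difference in *u*:

```c
#include <assert.h>
#include <math.h>

/* Walk a scanline updating m and zInv incrementally, and compare
   u = m / zInv against the direct computation through x and y at
   every pixel.  The two should agree to rounding error. */
static double max_u_error(double a, double b, double c, double d,
                          double A, double B, double C,
                          double Y, int X1, int X2)
{
    double zInv = -a/d * X1 - b/d * Y - c/d;
    double m = A * X1 + B * Y + C * zInv;
    double dm = A - C * a/d;
    double err = 0.0;
    for (int X = X1; X <= X2; X++) {
        double u_inc = m / zInv;                          /* incremental */
        double u_dir = A * (X/zInv) + B * (Y/zInv) + C;   /* direct */
        double e = fabs(u_inc - u_dir);
        if (e > err) err = e;
        zInv -= a/d;
        m += dm;
    }
    return err;
}
```

The plane below (x + y + z = 4) keeps *zInv* positive over the whole test span, so no pixel divides by zero.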

For now, that's it... Next time, I will remove the two divides as well. This is quite tricky because it involves approximation (affine texture mapping being the most brutal of the approximations).

Have fun!

-- |_ _ _ __ |_)(_)| ) ,' -------- '-._