Faster and Better RGB Palette Fades

TAD

Introduction

Welcome, coders, to another article for Hugi diskmag. I will describe RGB palette fading, and a simple, fast way to get a better fade.

Fades are important as they can help to smooth the transition from one part into another. Let's face it, they look much better than a sudden clear-screen.

They give a more cinematic feel to your productions, be they demos, intros or full blown games. They should be used very carefully, otherwise the audience will get very bored, very quickly watching the same fade 50 times over. Perhaps the best place for a fade-out (fade to black) is at the end of your production just before it exits back to DOS/Windoze 89 etc.

The Problem

The easiest fade to do is the fade-to-black. This is where all the individual components of an R,G,B palette (the Red, the Green and the Blue of each of your 256 or so colours) are slowly and smoothly decreased towards black (Red=0, Green=0, Blue=0).

Once you have the fade-out routine you will want to write a fade-in routine. This is the opposite of a fade-out (surprise, surprise). You begin with an all black palette, where all 256 or so colours have the value Red=0, Green=0, Blue=0, and then you slowly increase each component until it matches the final palette's R,G,B values.

So it seems that we need to write two routines for fading, a fade-out and a fade-in routine. But, a more flexible way would be to write a single routine which fades from a starting-palette to a destination-palette. Then we could specify an all black palette as our destination to create a fade-out, and likewise we could start with an all black palette and give our final palette as the destination to produce a fade-in.

One-By-One

Unless you choose a strange way to store your RGB palettes, there will be a block of 768 bytes, where each byte has a value from 0 to 63 (00..3F hex). We arrive at the number 768 because there are 3 bytes per colour (a Red byte, a Green byte, and a Blue byte) and because there are 256 colours.

Sorry if this seems obvious, but there may be some "newbie" coders reading and we wouldn't want to scare them off this early in the article.

Let's begin with a fade-out routine; I will describe the more flexible method later on.

Because a Red, Green and Blue byte in our palette all share the same property (of having a value from 0 to 63) we can use a single operation for all three, and so a really easy loop of 768 iterations can be used.

In the loop we take each palette byte and, if it is non-zero, we decrement it.

         [ES:DI] --> palette

 FadeOut:
         CLD                      ; STOSB must step forwards
         MOV    CX, 768           ; = 256 * 3 bytes to fade
 fadeout2:
         MOV    AL, ES:[DI]       ; get a palette byte
         CMP    AL, 0
         JZ     SHORT fadeout3    ; is it already 0? then leave it alone
         DEC    AL                ; else decrease it by 1
 fadeout3:
         STOSB                    ; store the palette byte
         LOOP   fadeout2          ; and repeat for the entire palette

I didn't say palette fading was difficult, did I?

The above code only decreases each component of the palette once. You still need to set the VGA colours using this palette, wait for the normal V-Sync signal and repeat the process.

But how do we know when all 256 colours have been faded out?

It is safe to assume that after 64 calls the entire palette will have been faded out correctly, because we already know that each component can only have a value from 0 to 63 (so 63 calls are in fact enough for even the brightest component).
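For anyone who wants to check the logic away from the assembler, here is a rough C model of one fade-out step. It is only a sketch: the names fade_out_step and PAL_BYTES are made up for this example, and the returned count of non-zero components is an extra not present in the assembly routine. It simply gives you a way to stop early instead of always making 64 calls.

```c
#include <stddef.h>
#include <stdint.h>

#define PAL_BYTES 768   /* 256 colours * 3 components, each 0..63 */

/* One fade-out step: decrement every non-zero component by 1.
   Returns how many components are still non-zero afterwards
   (hypothetical extra, not part of the assembly version). */
size_t fade_out_step(uint8_t pal[PAL_BYTES])
{
    size_t remaining = 0;
    for (size_t i = 0; i < PAL_BYTES; i++) {
        if (pal[i] != 0)
            pal[i]--;            /* same as the DEC AL in the listing */
        if (pal[i] != 0)
            remaining++;
    }
    return remaining;
}
```

Call it once per frame, send the palette to the VGA, wait for V-Sync, and stop when it returns 0 or after 64 calls, whichever you prefer.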

The Fade-In

The next easiest fade after the fade-out is the fade-in. This needs two palettes: a temporary 768 byte workspace and the final palette (so we know what to fade in to).

This time in the loop we increment each R,G,B component until it matches our final palette.

         [ES:DI] --> 768 byte temporary palette
         [DS:SI] --> the final palette
 FadeIn:
         MOV    CX, 768           ; = 256 * 3 bytes to fade
 fadein2:
         CMPSB                    ; does the palette byte
         JZ     SHORT fadein3     ; already match our final palette byte?
         INC  BYTE PTR ES:[DI-1]  ; else increment our temporary value.
 fadein3:
         LOOP   fadein2           ; and repeat for the entire palette

Again we need to send the palette to the VGA RGB colour registers, wait for V-Sync and repeat the process 64 or so times.

We also need to initialise our temporary palette to all 0's before we start this fade-in.

The Flexible Fade

This is another type of fade, but it is more flexible than both the fade-in and fade-out routines. You can think of it as a palette-morph.

We compare each component in both our temporary and our final palette and increment or decrement as needed, and repeat this for all 768 components.

Like the fade-in routine we need two palettes, a temporary one and our final palette. But this time our temporary palette isn't initialised to 0's. Instead we copy our current palette into this buffer and use it as our starting point for the fading process.

         [ES:DI] --> 768 byte temporary palette
         [DS:SI] --> the final palette
 FadeTo:
         MOV    CX, 768           ; = 256 * 3 bytes to fade
 fadeto2:
         CMPSB                    ; does the palette byte
         JZ     SHORT fadeto3     ; already match our final palette byte?
         JG     SHORT fadeto4     ; do we need to fade-up?
         DEC BYTE PTR ES:[DI-1]   ; else fade-down
         JMP    SHORT fadeto3
 fadeto4:
         INC BYTE PTR ES:[DI-1]   ; fade-up
 fadeto3:
         LOOP   fadeto2           ; and repeat for the entire palette
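
The same palette-morph can be modelled in C to show that fade-out and fade-in really are just special cases of it. This is a sketch only: fade_to_step and PAL_BYTES are names invented for the example, and the return value (the number of components still different from the target) is an addition that the assembly routine does not have.

```c
#include <stddef.h>
#include <stdint.h>

#define PAL_BYTES 768   /* 256 colours * 3 components */

/* Move every component of `work` one unit towards `target`.
   Returns the number of components that still differ afterwards. */
size_t fade_to_step(uint8_t work[PAL_BYTES], const uint8_t target[PAL_BYTES])
{
    size_t remaining = 0;
    for (size_t i = 0; i < PAL_BYTES; i++) {
        if (work[i] < target[i])
            work[i]++;           /* fade-up   */
        else if (work[i] > target[i])
            work[i]--;           /* fade-down */
        if (work[i] != target[i])
            remaining++;
    }
    return remaining;
}
```

Pass an all-black target for a fade-out, or start with an all-black work palette for a fade-in.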

A Better Way

That last routine "FadeTo" seemed very nice, short and reasonably quick. But in fact there is a problem with the increment/decrement method which is easy to overlook.

The R,G,B components of the palette are not faded out evenly as a correct fade would do. For example, say we had the values 1 and 63 and both needed to be faded out to 0 (black), then the decrement method would fade the first value out straight away, while the 2nd value 63, would take another 62 loops to fade out.

This may not seem a problem, but try altering the brightness on your monitor and you should see that there all the colours are faded out evenly.

What we need to do is to scale the fading process so that every component, independent of its value, takes 63 (or 64) loops to change. This should give a far better fade.
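
To put some numbers on the difference, here is a small C comparison of the two approaches for a fade to black. It is just an illustration: dec_fade and scaled_fade are hypothetical helper names, `step` is assumed to lie in the range 0..64, and the scaled version uses a plain multiply and divide.

```c
#include <stdint.h>

/* Component value after `step` calls of the decrement fade-out. */
uint8_t dec_fade(uint8_t start, unsigned step)
{
    return (start > step) ? (uint8_t)(start - step) : 0;
}

/* Component value after `step` of 64 loops when every component
   is scaled evenly towards black (step assumed to be 0..64). */
uint8_t scaled_fade(uint8_t start, unsigned step)
{
    return (uint8_t)(start * (64u - step) / 64u);
}
```

Take the colour R=8, G=32, B=63 half way through the fade (step 32). The decrement method gives 0, 0, 31, which is pure blue, while the scaled method gives 4, 16, 31, which is the same colour at half the brightness.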

But doesn't this mean we need to divide and multiply to scale the values? This would mean a much slower fade routine than the current increment/decrement method, wouldn't it?

Fixed-Point Palette Fades

Well, no.

Enter the world of fixed-point maths (yet again).

What we need to do is to step each component by an amount based upon the difference between its starting value and its final value, AND the period of time over which the fade should take place.

Let's suppose we use a 64 loop cycle to perform our fade, so after 64 loops our fade is complete, and after just 32 loops we are half way there.

Say we need to fade from 8 to 24 in 64 steps, then we begin at 8 and then step up by 0.25 per loop. After 64 loops we would have 8 + (0.25 * 64) = 24.

The formula is just:

            step value = (final - start) / 64

If you want a different period for the fade then simply change the 64. I suggest sticking to a power of two such as 32 or 128, so that the division remains a simple shift.
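
The formula can be tried out in C using 8.8 fixed-point, where the step is stored multiplied by 256. This is a sketch under a couple of assumptions: the function names are made up for the example, `period` is taken to be a power of two no greater than 256, and `loops` is never larger than `period`.

```c
#include <stdint.h>

/* 8.8 fixed-point step for one component:
   step = ((final - start) / period) * 256.
   `period` is assumed to be a power of two <= 256, so 256 / period
   is exact and no precision is lost. */
int16_t fade_step_8_8(uint8_t start, uint8_t final_value, int period)
{
    int diff = (int)final_value - (int)start;      /* -63..63 */
    return (int16_t)(diff * (256 / period));
}

/* Integer part of the component after `loops` steps,
   beginning at start * 256 (assumes 0 <= loops <= period). */
uint8_t fade_value_after(uint8_t start, int16_t step, int loops)
{
    int32_t v = (int32_t)start * 256 + (int32_t)step * loops;
    return (uint8_t)(v >> 8);
}
```

For the 8 to 24 example with a period of 64, the step is 64 (that is 0.25 * 256). After 32 loops the value is 16, and after 64 loops it is 24.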

To perform this fixed-point fade we need an extra 3072 bytes (768 * 2 * 2). This is used to store some increments and our temporary, working palette in 8.8 fixed-point format.

The set-up process for the fade is slightly longer, but as you will see the actual fade loop is much easier as it requires no conditional jumps.

         [ES:DI] --> 3072 byte temporary palette
         [DS:BX] --> the starting palette (768 bytes)
         [DS:SI] --> the final palette (768 bytes)
 InitFade:
         MOV    CX, 768           ; = 256 * 3 bytes to fade
 initfade2:
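         LODSB                    ; AL = final component
         SUB    AL, [BX]          ; the component difference (final - start)
         CBW                      ; sign-extend, the difference may be negative
         SHL    AX, 1
         SHL    AX, 1             ; increment = (diff / 64) * 256 = diff * 4
         STOSW
         MOV    AH, [BX]          ; the starting component
         MOV    AL, 0
         MOV  ES:[DI+1536-2], AX  ; value = start * 256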
         INC    BX
         LOOP   initfade2         ; and repeat for the entire palette

Now the temporary 3072 byte palette has this format for all 768 values:

 +0          768 x   WORD   increment in 8.8 format
 +1536       768 x   WORD   value in 8.8 format

To set the VGA colours using this format needs a custom routine:

         [ES:DI] --> 3072 byte temporary palette
 SetFixedPalt:
         MOV    CX, 768           ; = 256 * 3 bytes to write
         MOV    DX, 3C8h          ; PEL address (write index) port
         MOV    AL, 0
         OUT    DX, AL            ; start with colour 0
         INC    DX                ; 3C9h = PEL data port
 setfpal2:
         MOV    AL, ES:[DI+1537]  ; high byte of temp value (8.8 format)
         OUT    DX, AL            ; output each R,G,B component...
         ADD    DI, 2
         LOOP   setfpal2

Now the actual fading routine:

         [DS:SI] --> 3072 byte temporary palette
 FixedFade:
         MOV    CX, 768           ; = 256 * 3 bytes to fade
 fixfade2:
         LODSW                    ; get the increment (8.8)
         ADD    [SI+1536-2], AX   ; add to the temp value (8.8)
         LOOP   fixfade2          ; and repeat for the entire palette

Of course you can use an entirely different method and palette format. The only thing to remember is that you need space for the 8.8 fixed-point maths (the temporary value of each of the 768 components and the 768 increment values).

If you don't understand fixed-point maths then I suggest looking for one of the many tutorial documents on the net, or posting a question to one of the many newsgroups, where some kind person will post some information to you.
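
As one example of a different format, the whole set-up, fade and read-back sequence can be modelled in C with a small structure in place of the raw 3072 byte buffer. Treat this as a sketch rather than a drop-in replacement: FixedPal, init_fade, fixed_fade_step and get_palette are names invented here, the period is fixed at 64 loops, and get_palette just fills a byte array where the assembly version writes to the VGA ports.

```c
#include <stddef.h>
#include <stdint.h>

#define PAL_BYTES 768   /* 256 colours * 3 components */

typedef struct {
    int16_t  step[PAL_BYTES];    /* +0:    increment in 8.8 format     */
    uint16_t value[PAL_BYTES];   /* +1536: working value in 8.8 format */
} FixedPal;                      /* 3072 bytes in total */

/* Set up a 64-loop fade from `start` to `final_pal`. */
void init_fade(FixedPal *fp, const uint8_t start[PAL_BYTES],
               const uint8_t final_pal[PAL_BYTES])
{
    for (size_t i = 0; i < PAL_BYTES; i++) {
        int diff = (int)final_pal[i] - (int)start[i];
        fp->step[i]  = (int16_t)(diff * 4);          /* (diff / 64) * 256 */
        fp->value[i] = (uint16_t)(start[i] * 256);   /* start * 256       */
    }
}

/* One fade loop: add every increment to its working value.
   Negative increments rely on 16-bit wraparound, like ADD in assembler. */
void fixed_fade_step(FixedPal *fp)
{
    for (size_t i = 0; i < PAL_BYTES; i++)
        fp->value[i] = (uint16_t)(fp->value[i] + fp->step[i]);
}

/* Extract the integer (high) byte of every working value. */
void get_palette(const FixedPal *fp, uint8_t out[PAL_BYTES])
{
    for (size_t i = 0; i < PAL_BYTES; i++)
        out[i] = (uint8_t)(fp->value[i] >> 8);
}
```

After init_fade and exactly 64 calls to fixed_fade_step, get_palette returns the final palette, whether the individual components had to fade up or down.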

Closing Words

Well, that's another article done. I haven't seen any palette fading method which uses a similar technique to this one, so it may be a first. It's not a ground-breaking discovery, and it's probably not even new, but it should give coders something to think about. Sometimes you have so much to do that the little things like this can get overlooked.

The method of using fixed-point maths could also be applied to other morphs such as co-ordinates or any other kind of data which needs to be interpolated in some quick, simple way.

Oh, a quick message to all the millions of elite coders out there, please don't send me any flames. Instead spend the time writing an article for Hugi or some other lesser diskmag. This way other people can see how smart you are, instead of being a fire hazard.

Have fun.

Regards

TAD #:o)