ray's 16/32 bit atari page

avoiding c2p conversions

ok, now that we know how to do really quick c2p conversions, you might be grateful to hear that things can be done even quicker, at least in most cases. "most cases" means that your mapper has to work row-oriented unless you want to run into heavy difficulties (wolf3d is such a difficult case: i have to use a c2p there because the 3d environment gets mapped in vertical slices). if you're coding a demo i'd suggest using this method whenever possible, since you don't need to care about memory consumption there anyway - and yes, the way to do it quicker is quite memory-consumptive, but who cares if speed is our priority :).
the trick is quite simple, actually: consider something like a static tunnel or any other offset-based distortion effect. to distort our texture we fill the screen pixel by pixel, row by row, using a table that contains a texture offset for every pixel we need to map. i don't wanna discuss how to do bitmap distortions here; the point is that our screen gets mapped "row-wise". try to remember the first c2p tutorial, where we noticed that a single movep.l instruction is able to set 8 pixels, or 4 doublepixels, in one go.
and there you go: the longword being movep'ed into the screen contains four bytes, one per bitplane, each holding 4 doublebits, one for each doublepixel. let's visualise this to make it understandable: imagine we'd like to set 4 doublepixels with the colors $f,$2,$3,$1. arranging the data to be movep'ed into screen memory, we'd end up with a longword like this:

longword to be movep'ed

byte/bitplane 0    byte/bitplane 1    byte/bitplane 2    byte/bitplane 3
%11 %00 %11 %11    %11 %11 %11 %00    %11 %00 %00 %00    %11 %00 %00 %00
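the layout above can be double-checked on any machine with a few lines of c. this is only a sketch to verify the bit pattern, not code meant for the st itself; the function name pack_doublepixels is made up for this example, and it assumes bitplane 0 ends up in the topmost byte of the longword, as the table shows.

```c
#include <stdint.h>

/* Pack four 4-bit colors (leftmost doublepixel first) into the longword
   layout from the table: one byte per bitplane, bitplane 0 in the top byte.
   Every color bit is doubled so it covers two neighbouring screen pixels.
   (pack_doublepixels is a hypothetical helper, not part of the original.) */
uint32_t pack_doublepixels(const uint8_t color[4])
{
    uint32_t lw = 0;

    for (int plane = 0; plane < 4; plane++) {
        uint8_t byte = 0;

        for (int px = 0; px < 4; px++) {
            /* bit 'plane' of the color decides this doublepixel's doublebit */
            if ((color[px] >> plane) & 1)
                byte |= (uint8_t)(0xC0 >> (2 * px));
        }
        lw |= (uint32_t)byte << (24 - 8 * plane);
    }
    return lw;
}
```

feeding it the colors $f,$2,$3,$1 gives $cffcc0c0, which is exactly the four bytes %11001111, %11111100, %11000000, %11000000 from the table.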

you might notice that the doublebits belonging to one doublepixel sit at the same position in each of the four bytes. if you understood this, you'll also see why this method is just as memory-consumptive as it is stupid. what we'll do to gain speed is prepare 4 texture preshifts, each containing the 4-bit color of just a single pixel "packed" into a longword following the pattern from the table above (i.e. (%XX000000XX000000XX000000XX000000) >> 0, 2, 4 and 6). now that you've encoded and preshifted your textures like this, you can easily put 4 (double)pixels at any 8 (single)pixel aligned position in a row by just doing:

    move.l  offset1(a0),d0          ; get the first pixel
    or.l    offset2(a1),d0          ; mask in the following 3 ones
    or.l    offset3(a2),d0
    or.l    offset4(a3),d0
    movep.l d0,screenoffset(a4)     ; blast into the screen

in a static tunnel's case for example (a0-a3 -> preshifted patterns, a4 -> screen). if you'd like something dynamic instead, e.g. a triangle mapper, you'll surely run into some alignment overhead which isn't very easy to code. but it'll still be faster than a mapper that uses a nibble buffer and a c2p conversion (and even that needs byte-alignment overhead).
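to make the preshift idea a bit more tangible, here is a rough c model of the whole chain: building the 4 preshifted textures, or-ing four of their entries together and emulating the movep.l write. it's a sketch under assumptions, not the original implementation: TEX_SIZE and all function names are invented for this example, the offsets are plain array indices instead of byte offsets, and the texture is assumed to hold one 4-bit color per byte.

```c
#include <stddef.h>
#include <stdint.h>

#define TEX_SIZE 256   /* hypothetical texture size in texels */

/* Encode one 4-bit color as %XX000000 in each of the four bitplane bytes,
   then shift right by 0, 2, 4 or 6 to reach doublepixel slot 0..3.
   The set bits start in the top two bits of each byte, so a shift of up
   to 6 never spills into the neighbouring bitplane byte. */
uint32_t preshift_texel(uint8_t color, int slot)
{
    uint32_t lw = 0;

    for (int plane = 0; plane < 4; plane++) {
        if ((color >> plane) & 1)
            lw |= 0xC0000000u >> (8 * plane);
    }
    return lw >> (2 * slot);
}

/* Build the four preshifted copies of the texture (one per slot). */
void build_preshifts(const uint8_t tex[TEX_SIZE], uint32_t pre[4][TEX_SIZE])
{
    for (int slot = 0; slot < 4; slot++)
        for (size_t i = 0; i < TEX_SIZE; i++)
            pre[slot][i] = preshift_texel(tex[i] & 0x0F, slot);
}

/* C stand-in for the move.l / or.l / or.l / or.l sequence.
   Every ofs[] entry must be smaller than TEX_SIZE. */
uint32_t combine4(uint32_t pre[4][TEX_SIZE], const uint16_t ofs[4])
{
    uint32_t d0 = pre[0][ofs[0]];   /* get the first pixel */

    d0 |= pre[1][ofs[1]];           /* mask in the following 3 ones */
    d0 |= pre[2][ofs[2]];
    d0 |= pre[3][ofs[3]];
    return d0;
}

/* C stand-in for movep.l d0,0(a4): the four bytes land on every second
   address, i.e. in one byte of each of the 4 interleaved bitplane words. */
void movep_l(uint32_t d0, uint8_t *dst)
{
    dst[0] = (uint8_t)(d0 >> 24);
    dst[2] = (uint8_t)(d0 >> 16);
    dst[4] = (uint8_t)(d0 >> 8);
    dst[6] = (uint8_t)d0;
}
```

in this model every texel costs one longword per preshift, so 4 preshifts take 16 bytes per texel. that's where the memory consumption comes from.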

- 2002 ray//.tscc. -