if you try to code a "newschool" effect like, let's say, a mapped 3d object,
a bumpmapper, a perspective rotozoomer or anything similar where you
have to manipulate the screen content on something like a "per-pixel basis",
as most demo crews have tried in recent years and keep trying today,
you will notice that the 16/32-bit series of atari machines does not provide
the nicest kind of video hardware.
the screen is stored in a so-called "bitplane format", which makes it hard
to manipulate a single pixel on the screen in almost any screenmode, with
only one exception: the falcon's truecolor modes.
i guess that calling it "hard to manipulate a screen pixel" isn't quite right,
speaking in coders' terms. what really makes working with bitplane modes
so bad is that it's awfully slow for our purposes.
now, you might ask why atari needed that long to finally give us a nicer way to do
quick screen operations with the falcon. well, to my knowledge there
were at least two reasons: the first reason for organizing the screen
in a bitplane format was directly linked to the video ics used, and almost
every other machine used some kind of bitplane or "planar" screenmodes too in those days.
the second reason was that no one ever thought of the planar
modes as a disadvantage, because most games consisted of drawing and moving sprites
or scrolling the screen in multiple directions (this becomes obvious if you take
a look at atari's attempts to introduce hardware that speeds up copy and
scroll tasks, eg. the blitter and the ste's hardware scrolling).
so, if you want to manipulate a pixel in a "chunky" or truecolor mode, you simply
have to move a certain byte, word or longword value into the appropriate position
of your linear screen memory, depending on the color you want to set the pixel to.
the difference between chunky and truecolor modes is that the former use a pixel's
value to look up the real color (ie. you have a palette or "color lookup table"/clut,
and the lookup is done directly in hardware, of course).
in truecolor modes the value is directly interpreted as an rgb value, in a 5 bits
red, 6 bits green, 5 bits blue format, resulting in 16 bits (= one word) per pixel.
setting the upper leftmost screen pixel to plain green works like this:
|
        move.w  #63<<5,(a0)             ; a0 points to your tc-screen
|
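the 5-6-5 packing can be cross-checked with a few lines of python. this is only a sketch to illustrate the bit layout described above, not atari code, and the helper name rgb565 is made up for this example:

```python
def rgb565(r, g, b):
    """pack 5 bits red, 6 bits green and 5 bits blue into one 16-bit word."""
    assert 0 <= r < 32 and 0 <= g < 64 and 0 <= b < 32
    return (r << 11) | (g << 5) | b

plain_green = rgb565(0, 63, 0)
print(hex(plain_green))  # 0x7e0, the same value as #63<<5
```

red sits in the top 5 bits, blue in the lowest 5, and the 6 green bits in between, which is why plain green is 63 shifted left by 5.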
being able to address a pixel in just the same way in any color depth would be nice,
but this is not the case in planar modes, where the screen is organized like this (let's
assume a mode with 4 bitplanes, for example):
pixel in row | 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 | bitplane
-------------+-------------------------------------------------+---------
word 0/bit   | 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 | 0
word 1/bit   | 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 | 1
word 2/bit   | 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 | 2
word 3/bit   | 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 | 3
as this table hopefully shows, the situation is painful: the first 4 words
of our screen memory form a block of 16 pixels, the next 4 words the following 16 pixels
and so on. one bit-column within those 4 words represents one pixel on the screen.
word 0 belongs to bitplane 0, word 1 to bpl 1 and so on. bpl 0 holds the least significant
bit of the pixel's color, bpl 1 the second least significant and so on.
therefore you'd need several word or longword accesses in order to change one
pixel's color (unless you use bclr/bset, which can only access bytes when working directly on memory),
which is why we're looking for a different approach than:
|
        andi.w  #$ffff-(1<<15),(a0)+    ; a0 points to your bpl-screen: clear the bit in bpl 0
        ori.l   #(1<<31)|(1<<15),(a0)+  ; set the bits in bpl 1 and bpl 2 at once
        ori.w   #1<<15,(a0)             ; set the bit in bpl 3
|
to set the first screen pixel to color %1110 = $e = 14 (the and is needed to clear,
the ors to set the according bits of the pixel we want to change). let's face it,
this sucks and isn't of any great use, as we will probably be aiming at
operating on more than one pixel at a time *the comfortable way*, which is the way we'd
use in a chunky or truecolor mode.
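to make the layout a bit more tangible, here is a small python model of one screen row in a 4 bitplane mode. it is only a sketch of the addressing described above (the function names are made up, and a plain list of 16-bit words stands in for the screen memory):

```python
def set_pixel_planar(words, x, color, planes=4):
    """set pixel x of one screen row; `words` is a list of 16-bit words."""
    block = (x // 16) * planes          # first word of the 16-pixel block
    mask = 1 << (15 - (x % 16))         # the leftmost pixel lives in bit 15
    for plane in range(planes):
        if (color >> plane) & 1:        # bpl 0 = least significant color bit
            words[block + plane] |= mask
        else:
            words[block + plane] &= ~mask & 0xffff

def get_pixel_planar(words, x, planes=4):
    """collect one bit from every bitplane word to rebuild the color."""
    block = (x // 16) * planes
    shift = 15 - (x % 16)
    return sum(((words[block + p] >> shift) & 1) << p for p in range(planes))

row = [0xffff] + [0] * 7                # two 16-pixel blocks, bpl 0 of block 0 all set
set_pixel_planar(row, 0, 0b1110)
print([hex(w) for w in row[:4]])        # ['0x7fff', '0x8000', '0x8000', '0x8000']
```

four read-modify-write accesses for a single pixel is exactly the overhead the and/or sequence above suffers from.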
well, the atari's standard hardware doesn't provide any chunky mode, therefore we'll
try to achieve it in software, meaning we'll emulate a pseudo chunky mode. this
is done by a technique called "chunky to planar conversion" (c2p), as the title implies.
the conversion does exactly what its name says: it does nothing but
convert a chunky screenbuffer into data suited to the planar screenmodes.
the most popular way of realizing this for the 2 and 4 bitplane modes is using a
lookup table and a 68k instruction that comes in almost magically handy for the
basic problem, ie. "movep".
there's a little drawback to this method, though: we'll need to double the pixels
if we want to speed things up on the st. this is for speed and memory reasons,
as you'll notice in a minute.
the movep instruction is intended to be used with peripheral devices that
do their bus transfers on an 8-bit level, like the YM-soundchip.
what it does, basically, is transfer a word or longword "bytewise", but with a
one-byte interleave, ie. to every other byte. movep is also capable of writing to odd addresses, which makes
it even more useful for our aims. some explanation, coming up:
|
        movep.l d0,0(a0)                ; d0 = $f1e435c7
                                        ; after execution:
                                        ;  (a0) = $f1
                                        ; 2(a0) = $e4
                                        ; 4(a0) = $35
                                        ; 6(a0) = $c7
|
i guess you get the idea. please notice that the addresses a0, 2+a0, 4+a0, 6+a0
point exactly to the upper bytes of our four bitplane words, meaning that we've just set
a block of 8 pixels. with horizontally doubled pixels we're able to set 4 pixels
with one instruction that costs 6 nops (not too bad, actually).
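if you'd like to play with the byte interleave without an atari at hand, the memory-write form of movep.l can be modelled in python. this is just a sketch: movep_l is a made-up helper and a bytearray stands in for the memory behind a0:

```python
def movep_l(value, memory, address):
    """model of movep.l dn,d(an): the 4 bytes of `value` go to every
    other byte starting at `address`, highest byte first."""
    for i in range(4):
        memory[address + 2 * i] = (value >> (24 - 8 * i)) & 0xff

screen = bytearray(8)                   # one block of 4 bitplane words
movep_l(0xf1e435c7, screen, 0)
print(screen.hex())                     # f100e4003500c700
movep_l(0x11223344, screen, 1)          # odd address: fills the lower bytes
print(screen.hex())                     # f111e4223533c744
```

the second call shows why odd addresses matter: offset 0 reaches the left 8 pixels of a block, offset 1 the right 8.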
one more example: imagine we want to set 4 pixels to the following colors: $f730.
translated into a binary number this becomes: %1111011100110000.
writing this in our bitplane format (bpl 0 first) we'll have: %1110, %1110, %1100, %1000.
doubling the pixels: %11111100, %11111100, %11110000, %11000000.
now that's exactly what we want, since we've filled a longword which sets 4 pixels to
the colors we want when written out using movep. imagine d0 holds those 4 bytes above:
|
        movep.l d0,0(a0)                ; d0 = $fcfcf0c0
|
would just do the job then. the method mentioned above works by using the colors of 4 pixels
as an offset into a huge (16*16*16*16 longs = 256kb) lookup table holding the planar data for every
possible combination, which is then written out with a movep. easy, huh?
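the $f730 example can be reproduced with a short python sketch that computes what one entry of such a table has to contain. planar_long is a hypothetical helper name, and it calculates the value directly rather than looking anything up:

```python
def planar_long(pixels):
    """4 chunky pixels (0..15) -> longword for movep.l, pixels doubled horizontally."""
    result = 0
    for plane in range(4):                      # bpl 0 ends up in the highest byte
        byte = 0
        for color in pixels:                    # leftmost pixel first
            bit = (color >> plane) & 1
            byte = (byte << 2) | (bit * 0b11)   # two equal bits per pixel
        result = (result << 8) | byte
    return result

print(hex(planar_long([0xf, 0x7, 0x3, 0x0])))   # 0xfcfcf0c0
```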
imagine we'd already created our table. with a byte-per-pixel chunky buffer (values 0..15) the conversion would work
like this (a0 pointing to our buffer, a1 to our table, a2 to the screen):
|
        moveq.l #0,d0                   ; keep upper bits clean
        move.w  (a0)+,d0                ; fetch pixels *1*2
        lsl.w   #4,d0                   ; shift it to 1*2*
        or.w    (a0)+,d0                ; do this and we have 1324
                                        ; note that 3 & 2 are swapped
        lsl.l   #2,d0                   ; *4 (longword alignment)
        move.l  (a1,d0.l),d0            ; get 4 planar-bytes
        movep.l d0,offset(a2)           ; write the 4 pixels
|
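the index calculation is easy to get wrong because of the swapped nibbles, so here is the same sequence as a python sketch. table_offset is a made-up name, and the chunky bytes are assumed to hold only values from 0 to 15 (anything larger would spill into the neighbouring nibble):

```python
def table_offset(p1, p2, p3, p4):
    """byte offset into the c2p table for 4 chunky bytes (each 0..15)."""
    d0 = (p1 << 8) | p2              # move.w (a0)+,d0  -> $0102
    d0 = (d0 << 4) & 0xffff          # lsl.w  #4,d0     -> $1020
    d0 |= (p3 << 8) | p4             # or.w   (a0)+,d0  -> $1324
    return d0 << 2                   # lsl.l  #2,d0     -> *4

print(hex(table_offset(1, 2, 3, 4)))  # 0x4c90 = $1324 * 4
```

the table simply has to be built with the same 1-3-2-4 nibble order, so the swap costs nothing at conversion time.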
that's all - to convert a whole row with an unrolled loop, you could do:
|
offset  set     0                       ; plane offset
        rept    160/8                   ; 160 pixels/row (lo-rez: 320/2)
        moveq.l #0,d0                   ; convert the first 4 pixels
        move.w  (a0)+,d0
        lsl.w   #4,d0
        or.w    (a0)+,d0
        lsl.l   #2,d0
        move.l  (a1,d0.l),d0
        movep.l d0,offset(a2)
        moveq.l #0,d0                   ; the next 4 pixels
        move.w  (a0)+,d0
        lsl.w   #4,d0
        or.w    (a0)+,d0
        lsl.l   #2,d0
        move.l  (a1,d0.l),d0
        movep.l d0,offset+1(a2)
offset  set     offset+8                ; point to next bitplane block
        endr
|
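to convince yourself that the offsets work out, the row loop can be modelled in python and the result read back pixel by pixel. this is a sketch under a few assumptions: the helper names are made up, a bytearray stands in for one 160-byte lo-rez screen row, and the table lookup is replaced by a direct calculation to keep the example short:

```python
def planar_long(pixels):
    """4 chunky pixels (0..15) -> doubled planar longword, bpl 0 in the top byte."""
    result = 0
    for plane in range(4):
        byte = 0
        for color in pixels:
            byte = (byte << 2) | (((color >> plane) & 1) * 0b11)
        result = (result << 8) | byte
    return result

def convert_row(chunky, screen):
    """chunky: 160 bytes (0..15); fills one 160-byte lo-rez screen row."""
    offset = 0
    for i in range(0, 160, 8):                  # rept 160/8
        for half in (0, 1):                     # movep to offset, then offset+1
            start = i + 4 * half
            value = planar_long(chunky[start:start + 4])
            for k in range(4):                  # what movep.l does
                screen[offset + half + 2 * k] = (value >> (24 - 8 * k)) & 0xff
        offset += 8                             # next bitplane block

def read_pixel(screen, x):
    """read screen pixel x (0..319) back from the planar row."""
    block = (x // 16) * 8
    byte = (x % 16) // 8                        # upper or lower byte of each word
    bit = 7 - (x % 8)
    return sum(((screen[block + 2 * p + byte] >> bit) & 1) << p for p in range(4))

chunky = bytes((7 * i) % 16 for i in range(160))
screen = bytearray(160)
convert_row(chunky, screen)
print(all(read_pixel(screen, x) == chunky[x // 2] for x in range(320)))  # True
```

every chunky pixel shows up twice in a row on the screen, which is the horizontal doubling at work.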
now you still need to double your rows if you want to fill the whole screen (a3=a2+160):
|
        rept    160/8
        moveq.l #0,d0                   ; convert the first 4 pixels
        move.w  (a0)+,d0
        lsl.w   #4,d0
        or.w    (a0)+,d0
        lsl.l   #2,d0
        move.l  (a1,d0.l),d0
        movep.l d0,0(a2)
        moveq.l #0,d0                   ; the next 4 pixels
        move.w  (a0)+,d0
        lsl.w   #4,d0
        or.w    (a0)+,d0
        lsl.l   #2,d0
        move.l  (a1,d0.l),d0
        movep.l d0,1(a2)
        move.l  (a2)+,(a3)+             ; copy the 4 words just written
        move.l  (a2)+,(a3)+             ; and advance a2 by 8 bytes automatically
        endr
|
if you convert multiple rows or the full screen, please don't forget to
advance a2/a3 in order to make them point to the next row (which can be
done with lea 160(a2),a2 and lea 160(a3),a3 in my suggested example).
with 2 bitplanes it works pretty much the same way, just that you need to
use a movep.w and a much smaller table instead, of course.
the c2p-lut can be precalculated this way:
|
        section text

gen_c2p lea.l   c2p_table,a0            ; c2p_table: room for 65536 longs (256kb)
        moveq.l #0,d0                   ; d0 = current 4-pixel combination
.c2p_loop
        move.w  d0,d1                   ; 1st nibble of the index
        move.w  d0,d2                   ; 2nd
        move.w  d0,d3                   ; 3rd
        move.w  d0,d4                   ; 4th
        rol.w   #4+2,d1                 ; move each nibble to bits 2-5
        lsr.w   #8-2,d2
        lsr.b   #4-2,d3
        lsl.b   #2,d4
        andi.w  #%111100,d1             ; long-alignment
        andi.w  #%111100,d2
        andi.w  #%111100,d3
        andi.w  #%111100,d4
        move.l  __planes(pc,d1.w),d1    ; 1st pixel (planar)
        rol.l   #2,d1
        or.l    __planes(pc,d3.w),d1    ; 3rd nibble = 2nd screen pixel
        rol.l   #2,d1
        or.l    __planes(pc,d2.w),d1    ; 2nd nibble = 3rd screen pixel
        rol.l   #2,d1
        or.l    __planes(pc,d4.w),d1    ; 4th pixel
        move.l  d1,(a0)+
        addq.w  #1,d0
        bne.s   .c2p_loop               ; do all the quad-combinations
        rts

__planes                                ; plane-data to every color
        dc.b    0,0,0,0
        dc.b    3,0,0,0
        dc.b    0,3,0,0
        dc.b    3,3,0,0
        dc.b    0,0,3,0
        dc.b    3,0,3,0
        dc.b    0,3,3,0
        dc.b    3,3,3,0
        dc.b    0,0,0,3
        dc.b    3,0,0,3
        dc.b    0,3,0,3
        dc.b    3,3,0,3
        dc.b    0,0,3,3
        dc.b    3,0,3,3
        dc.b    0,3,3,3
        dc.b    3,3,3,3
|
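if you want to check a table against known values, the generator can be mirrored in python. this is a sketch that follows the routine above step by step; PLANES and gen_c2p are illustrative names here, and a plain shift replaces rol.l, which gives the same result because the top two bits of the longword are always clear before each rotate:

```python
# same data as the __planes table: byte p of the longword is 3 if color bit p is set
PLANES = [sum(3 << (24 - 8 * p) for p in range(4) if (c >> p) & 1)
          for c in range(16)]

def gen_c2p():
    table = []
    for index in range(0x10000):
        n1 = (index >> 12) & 15      # 1st nibble: screen pixel 1
        n2 = (index >> 8) & 15       # 2nd nibble: screen pixel 3
        n3 = (index >> 4) & 15       # 3rd nibble: screen pixel 2
        n4 = index & 15              # 4th nibble: screen pixel 4
        value = PLANES[n1]
        for nibble in (n3, n2, n4):  # same order as the or.l instructions
            value = ((value << 2) | PLANES[nibble]) & 0xffffffff
        table.append(value)
    return table

table = gen_c2p()
print(hex(table[0xf370]))            # 0xfcfcf0c0
```

the pixels $f, $7, $3, $0 from the earlier example end up at index $f370 (not $f730) because of the swapped 2nd and 3rd nibbles, and the entry holds the expected $fcfcf0c0.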
- 2002 ray//.tscc. -