CGA in 1024 Colors - a New Mode: the Illustrated Guide

By now you may have heard of the 8088 MPH
demo, the winning entry in
Revision 2015's Oldskool Demo compo
this month. It's been my pleasure to combine efforts with the likes of
Trixter,
reenigne and
Scali to make it happen - not only did
I get the opportunity to work alongside a bunch of extremely talented
wizards of code, we also achieved what we set out to do: break some
world records on the venerable (and yet much-maligned!) IBM PC, the
mommy and daddy of the x86 platform as we still know it today.

One of our "hey, this hardware shouldn't be doing that!"-moments was
extending the CGA's color palette by a cool order of magnitude or two.
How'd we pull that off? - reenigne has already posted an excellent
technical
article
answering that very question. To complement his writeup, I'll take a
bit of a different approach – here's my 'pictorial' take on how we
arrived at this:

The idea that such multi-color trickery was possible came to me some
time ago, as I was looking at reenigne's code for patching up composite
CGA emulation in DOSBox; messing with that
patch during development gave me a much better picture of composite
CGA's inner workings. When I had ironed out the basic concept for this
hack, I divulged it to reenigne for 'peer review' and for testing on
real hardware. Soon enough, we had an improved recipe:

Take two familiar (though officially undocumented) tweaks. Blend to
an even mixture producing a new effect.

Add one crucial new trick – an ingredient of reenigne's devising.

Test and calibrate until blue in the face.

Below is my rundown of how it all fits together. Fair warning: the
'target audience' for this writeup is people who may not be overly
familiar with CGA, and/or come from other demo platforms. As such,
there's a whole bunch of background that's already well-known in
CGA-land. To prevent acute boredom, I decided to stick this TOC here –
feel free to skip to the interesting part(s):

Because, much like a broken clock, even Wikipedia gets it right sometimes

Old Trick #1: 16-color graphics over RGBI

A short crash course on CGA basics: the first graphics standard
available on PCs supports a 16KB memory buffer, and is driven by an
MC6845 CRTC (some later cards used alternatives). Video output options
are composite NTSC through a standard RCA jack, and the more widely-used
DE9 connector, which outputs an RGBI signal (red, green, blue and
intensity). The latter is what most people think of when they hear
"CGA"; this is a digital (TTL) signal, where each component can be
either on or off, hence 16 different colors. Despite what arcade
hardware buffs would like you to think, CGA – in the strict sense – is
NOT analog RGB, and never was.

Standard (BIOS-supported) graphics modes are high-resolution (640x200)
in 2 colors, and medium-resolution (320x200) in 4 colors. Not a lot of
wiggle room here: in hi-res mode, only one of the colors (foreground) is
redefinable – the background is always black; in medium-res, it's the
background color that's adjustable, while the other 3 are determined by
the infamously nasty fixed
palettes.

Infuriatingly, in an almost-trollish move, IBM
mentioned
an additional low-resolution 16-color mode - "not supported in ROM" -
with zero information on how to actually achieve it. That nut was
cracked pretty early on, though.

Low-resolution mode

This is no graphics mode at all, but a modified 80-column text
mode.
Basically, you adjust CRTC registers to get 100 rows of text instead of
the usual 25; this gives you a character box of 8x2 pixels, a quarter of
the normal 8x8. Filling the screen with one "magic" ASCII character,
0xDE, effectively splits each character cell into left and right
"pixels", corresponding to the background and foreground colors. These
two colors can be individually set to any of the 16 CGA values, as in
any CGA text mode, as long as you remember to turn off blinking.
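
For the code-minded, here is a rough Python sketch of how a 160x100 picture maps onto those text cells. It only covers the packing step: it assumes the CRTC has already been set up for 100 rows and that blinking is off, and the names `pack_lowres` and `HALF_BLOCK` are made up for illustration.

```python
# Sketch: pack a 160x100, 16-color image into 80x100 text cells.
# Assumptions: the CRTC is already programmed for 100 rows of 2 scanlines,
# and blinking is disabled, so attribute bit 7 is background intensity.

HALF_BLOCK = 0xDE  # left half of the cell shows background, right half foreground


def pack_lowres(pixels):
    """pixels: 100 rows of 160 color indices (0-15). Returns the text buffer."""
    out = bytearray()
    for row in pixels:
        if len(row) != 160:
            raise ValueError("each row needs 160 pixels")
        for x in range(0, 160, 2):
            left, right = row[x], row[x + 1]
            out.append(HALF_BLOCK)                              # character byte
            out.append(((left & 0x0F) << 4) | (right & 0x0F))   # attribute byte
    return bytes(out)


# Example: alternating blue (1) and green (2) columns.
demo = pack_lowres([[1, 2] * 80 for _ in range(100)])
```

A full screen comes out at 80 x 100 x 2 = 16000 bytes, which just fits in the card's 16KB.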

CGA low resolution mode

So there you have it; 160x100 @ 16c. This mode was used in games as
early as
1983,
but never got wildly popular - probably because of the
"snow"
that plagues IBM CGA cards in 80-column mode, unless you burn some
costly CPU time to avoid it.

The Macrocom Method

You may ask: since this is text mode, what's stopping you from using the
entire ASCII character set? Other than a healthy respect for your own
sanity, nothing really! This was first attempted around the mid-'80s by
a few brave souls at
Macrocom, who
combined the 100-rows trick with ASCII art, to create what Trixter once
succinctly called "ANSI from hell".

As you can see above, I've experimented with this a little. With
judicious use of the character set, you can almost fool somebody into
thinking that this is a 640x200 mode - although there's some inevitable
"attribute clash", a little like the ZX
Spectrum: each 8x2
character cell can contain only two colors, foreground and background.
Also, you have to be a bit of a glutton for punishment to actually draw
in this mode from scratch... but that's a subject for a future post.

This trick isn't directly relevant to our demo: we were targeting
composite displays. Even if CGA's composite output didn't have its share
of bugs and quirks in 80-column mode – which it
does
– there'd be no way to see this level of detail over NTSC. There's a
reason I mention this effect, however; the idea behind it does figure
into the story. But more on that later.

Old Trick #2: 16-color graphics over composite

Digital RGB monitors were still a luxury item at the time of CGA's
introduction, and IBM itself didn't offer one until a couple of years
later, coinciding with the release of the PC/XT. But CGA also provided
composite output, giving out (mostly) NTSC-compatible video. At the
expense of resolution, there's more fun to be had here with color.

Direct colors

On the composite output, the familiar 16-color CGA palette is
represented by a series of color signals, whose hue is determined by
their phase relative to a reference signal (the NTSC color
burst). The frequency of the NTSC color clock (3.579545 MHz) works
out to exactly 160 color cycles per active CGA scanline.

These are directly generated by CGA hardware as color signals, so we'll
conveniently call them "direct colors". IBM had two main
revisions
of the CGA, which produce composite video somewhat differently:
'new-style' cards contain additional circuitry, which helps the palette
match its RGBI counterpart a little more closely. For the demo, we
standardized on 'old-style' cards, simply because we happened to have
done more testing on those (with somewhat better results), so all images
in this post will reflect 'old-style' CGA colors.

If these 16 direct colors were all we had, it wouldn't be a whole lot
of fun, would it? They're also shockingly ugly, especially on an
old-style CGA, which doesn’t help matters either. Just look at that
palette... gross, dude. Luckily, there's a way to go one better.

Artifact colors

Due to bandwidth restrictions, NTSC video doesn't fully separate
chrominance (color) from luminance. Effectively, any high-resolution
detail – that is, detail with higher frequency than the NTSC color clock
– gets 'smeared' when the signal is decoded. This is responsible for the
characteristic color bleed, seen in the form of fetching little fringes
at the edges of text characters and other fine detail.

Remember how you get 160 color cycles per active CGA scanline? Standard
CGA gives us either 320 or 640 active pixels per scanline,
depending on the video mode. Ergo, we can switch pixels on and off at
2x or 4x the frequency of the color carrier. Since this
high-frequency detail cannot be fully separated from color information,
the upshot is this:

The hue of a pixel, or a fringe (transition between pixels), depends on its position within the color-cycle period.

This NTSC color cycle is sometimes represented as a wheel: one complete
period of this cycle equals a 360° revolution around the color wheel,
and we have 160 complete revolutions per scanline.

Let's say we're in hi-res (640x200) mode, where 4 pixels fit into one
such color cycle: moving one pixel left or right translates to moving
90° along the wheel, in either direction, and accordingly shifts the hue
by 90°. Likewise, in 320x200 mode, we move in 180° increments of
hue-shift.

In short, manipulating detail at high resolutions is
effectively a method of generating color; being an artifact of NTSC's
imperfections, this is known as artifact color.
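
The arithmetic behind those hue shifts is simple enough to sketch in a few lines of Python. The function names are my own, and the phase is measured relative to pixel 0 rather than to the actual color burst, so take the absolute angles as illustrative only.

```python
# Sketch: how far one pixel step moves us around the NTSC color wheel.
# Assumption: 160 color cycles per active scanline, as stated in the text.

COLOR_CYCLES_PER_LINE = 160


def hue_step_degrees(active_pixels):
    """Degrees of hue shift per pixel for a given horizontal resolution."""
    pixels_per_cycle = active_pixels / COLOR_CYCLES_PER_LINE
    return 360.0 / pixels_per_cycle


def pixel_phase_degrees(x, active_pixels):
    """Position of pixel x within its color cycle, in degrees from pixel 0."""
    return (x * hue_step_degrees(active_pixels)) % 360.0
```

With these, `hue_step_degrees(640)` gives 90.0 and `hue_step_degrees(320)` gives 180.0, matching the increments described above.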

Various filters can be (and often are) employed on the receiving end to
recover some of the high-frequency detail, reducing color bleed and
making edge transitions somewhat sharper. We're still dealing with
technology, not magic, so full separation of detail and color can never
quite be achieved, and the trade-off is a whole new set of artifacts (in
the form of "echoing" or "ringing"). This trade-off may or may not be
acceptable, depending on what you're doing, but the above image doesn't
attempt to reproduce any such filtering.

Solid artifact colors

All this business of "fringing" and "bleeding" sure sounds like a
bummer, and that's exactly what it is: the unwanted side-effect of a
less-than-ideal encoding scheme. But like any good flaw, it can be
turned into an advantage by an enterprising soul, and this is where we
get to the fun part (your mileage may vary).

When you look at the interplay of color vs. detail over NTSC, a very
handy fact becomes apparent:

Any periodic composite signal, with the same frequency as the color carrier (160 per line), will be decoded as a solid, continuous color.

Our 16 direct colors are exactly this type of periodic composite
signal. But hold on – with some simple high-resolution pixel-pushing,
we can manually put together our own periodic waveforms! Any pattern of
dots will do, as long as it repeats at the right frequency. This lets
us achieve solid colors that lie outside the direct color palette.

The "classic" way of doing this on CGA is to set up BIOS mode 6 –
640x200 in 2 colors, white on black – and set the color-burst bit (which
is off by default, for a B&W picture). At this resolution we can
squeeze 4 pixels into a color clock period, and at 1 bit per pixel,
there are 16 possible patterns – giving us 16 solid artifact colors.

This is pretty much the same
technique
used by Steve Wozniak to generate color on the Apple ][. In fact, on
an old-style CGA card, these 16 colors are identical to the 16 low-res
Apple colors (although you couldn't get them on a poster, like Apple
owners
could).
More to the point: the pixels themselves are white, which carries no
color information; it's the detail that does the deed.
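
To make the "16 patterns, 16 colors" idea a bit more concrete, here is a toy decoder in Python. It treats each 4-pixel pattern as one period of a waveform and pulls out its average level (brightness) and its first harmonic (hue and saturation). This is an idealized model of my own for illustration, not the CGA's or any real TV's actual circuitry, so the hue angles are relative and don't correspond to the real palette.

```python
import math


def decode_pattern(bits):
    """bits: samples (0 or 1) covering exactly one color cycle.
    Returns (luma, saturation, hue_degrees) for an idealized decoder."""
    n = len(bits)
    luma = sum(bits) / n
    # Correlate with one cycle of the color carrier (first Fourier harmonic).
    i = sum(b * math.cos(2 * math.pi * k / n) for k, b in enumerate(bits)) * 2 / n
    q = sum(b * math.sin(2 * math.pi * k / n) for k, b in enumerate(bits)) * 2 / n
    return luma, math.hypot(i, q), math.degrees(math.atan2(q, i)) % 360.0


# All 16 possible 4-pixel patterns in 640x200 mode, leftmost pixel first.
patterns = [[(p >> (3 - k)) & 1 for k in range(4)] for p in range(16)]
table = [decode_pattern(bits) for bits in patterns]
```

In this model, shifting a pattern one pixel to the right (say, 1100 to 0110) keeps brightness and saturation the same and rotates the hue by 90°, while patterns like 0000, 1111, 0101 and 1010 carry no color at all.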

*But wait, there's more!* Despite popular wisdom, CGA lets us one-up
the Apple, and then some. OUR underlying pixels don't have to be white:
in 640x200 mode, we can play with the palette register and set any of
the 16 direct colors as the foreground (background is always black).
By using the same pixel patterns with a different foreground color, we
get 16 entirely new sets of artifact colors, with 16 colors each. We
can only use one such set at a time, but we get to pick and choose what
our 16 colors are.

Then there's 320x200 mode, which supports a palette of 4 direct colors.
Only one of those, color #0 (background), is freely selectable. For
the rest, intensity may be on or off, but we can only use
green/red/yellow or cyan/magenta/white; the undocumented cyan/red/white
palette involves disabling the color burst, making the composite picture
greyscale.

Since our pixels are twice as fat in this mode, only two of them can
squeeze into a color-clock cycle – but at 2 bits per pixel, the total
count of artifact colors is still 16. The possible combinations of
palette, plus the user-defined background color, provide us with a whole
slew of other 16-color sets.

This may be a good place to correct a bit of a misconception. Since we
have 160 color cycles per scanline, many people treat CGA's graphics
modes over composite as 160x200 "modes", but that's not quite accurate.
Our effective color resolution is indeed 160x200, and it's impossible
to get finer detail than that using solid artifact colors. But as we've
seen, on NTSC the pixel grid and color grid are NOT one and the same –
which makes the question of horizontal resolution a bit fuzzy, depending
on how you're sampling and/or filtering the signal. It even varies with
the specific color waveforms you're using.

IBM itself never documented any of these artifact color tricks, other
than one oblique reference to "color mixing
techniques"
in the PCjr tech ref (if I'm wrong about this, drop me a line and link
me!). The concept is fairly old hat, however – it was used in games
very early on; some of the first ones I can think of were Microsoft's
Decathlon
and Flight
Simulator,
both in 1982. And the limitation has always been the same: the maximum
simultaneous color count you can get over composite CGA is 16.

...Or is it? On the off chance that you've been following me so far, and
you're still reading, you may have an idea of what the next step is.

256 colors

We've already observed that our choice of 16 artifact colors depends on
the palette and color register settings. One fairly obvious strategy
seems to suggest itself here – change those registers at particular
scanlines on every frame, and get >16 colors on screen that way.
Right?

This has been done before on CGA, and you can actually exploit this
for 256 colors (as proven by reenigne - see the image to the left), but
that's not how we did our multi-color hacking in the demo. We were
actually toying with the idea of including a static screen that uses
this technique, but I didn't have the time to pursue this; if anyone
manages to compose some nice artwork using this method, I'd love to see
it – that's gotta be a bit of an artistic challenge. But no, the way we
wrangled more color out of CGA is a whole other shenanigan... which I came
across by equal parts chance and morbid curiosity.

Recall how any color/dot pattern of the right length (four repeating
pixels in 640x200, or two in 320x200) produces a solid color on a
composite display? Back when I was testing composite emulation for
DOSBox, that fact was fresh in my mind. At around the same time, I was
experimenting with the "ANSI from Hell" graphical hack detailed
above;
that's purely a text mode / RGBI trick, but it requires a close
familiarity with the ROM character set... closer than most sane people
would want or need.

Let's take another look at a particular section of the CGA ROM font, in
80-column mode, with the top 2 scanlines highlighted:

At this point, if you're a visually-oriented person, and if you've been
following my drift, you're probably catching on. Don't see it yet?
Here's a fatter clue:

See those top 25% of the character bitmap? Two dots of foreground and
two dots of background, doubled horizontally across. We're in
hi-res/80-column mode, so there are two color cycles per character...
corresponding exactly to those two matching halves. And those top two
scanlines are identical.

That's just the type of repeating pattern that gets us a solid artifact
color over NTSC. In fact, it's the very same waveform that 320x200 mode
lets us play with. Except that now we have it available in text mode:
you know, where we can freely assign a foreground AND a background to
each character, from the 16 direct colors.

That's 256 possibilities right there... this is the part that made me go
"I have a cunning plan", in my best imitation of Blackadder's
Baldrick
(just not out loud). Indeed, it's possible to achieve >16 colors on
CGA without any flickering, dithering, interlacing or per-scanline
effects.
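
In code form, the observation looks roughly like this. It's a Python sketch with made-up helper names; the only thing taken from the font is the top-row bit pattern of 'U' described above (two foreground dots, two background dots, repeated).

```python
# Sketch: the pixel stream one text cell emits on its top scanline,
# and a check that it repeats once per color cycle (4 pixels in 80-column mode).

U_TOP_ROW = 0b11001100  # top scanline of 'U' (0x55), as described in the text


def cell_waveform(row_bits, fg, bg):
    """Return the 8 direct-color indices emitted for one font scanline."""
    return [fg if (row_bits >> (7 - x)) & 1 else bg for x in range(8)]


def is_solid(waveform):
    """Solid color on composite: both 4-pixel color cycles must be identical."""
    return waveform[:4] == waveform[4:]


# Every foreground/background pair gives its own repeating waveform.
combos = {
    (fg, bg): cell_waveform(U_TOP_ROW, fg, bg)
    for fg in range(16)
    for bg in range(16)
}
```

`len(combos)` is 256, and every entry passes `is_solid`, which is the whole trick in a nutshell.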

Here's what the possible combinations work out to:

512 colors

Oh, we're not done yet: once that lightbulb went off over my head, I had
another look at the CGA ROM font to see if any other useful bit
sequences emerge. There are a few character bitmaps that give us the
exact same waveform as 'U' does – 'H', 'V', 'Y' and '¥' – but only one
with a different suitable bit sequence right where we need it: 0x13,
the double exclamation mark ('‼').

The top two scanlines of 'U' give us a bitmask of 11001100 for
foreground/background; '‼' is 01100110 – a single shift to the right, or
a 90° shift in phase. This perfectly complements 'U' in terms of having
a well-rounded palette, because we get all the colors that the "...1100..."
waveform has to offer: going from 'U' to '‼' shifts the phase by 90°
(0110); 180° and 270° are achieved by flipping the foreground and
background colors for 'U' and '‼' respectively – the same as going
'0011' and '1001'.
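
If you'd like to double-check that phase bookkeeping, here is a small Python sketch (helper names are mine). It takes one 4-pixel color cycle from each bitmask, inverts it when foreground and background are swapped, and compares the result with plain rotations of 1100.

```python
# Sketch: 'U', the double exclamation mark, and their color-swapped versions
# cover all four rotations of the 1100 waveform.


def rotate_right(bits4, n):
    """Rotate a 4-bit pattern right by n positions."""
    n %= 4
    return ((bits4 >> n) | (bits4 << (4 - n))) & 0xF


def cycle_pattern(row_bits, swapped):
    """One 4-pixel color cycle of the row; swapping fg/bg inverts the bits."""
    nibble = (row_bits >> 4) & 0xF
    return nibble ^ 0xF if swapped else nibble


variants = {
    0: cycle_pattern(0b11001100, False),    # 'U'
    90: cycle_pattern(0b01100110, False),   # double exclamation mark (0x13)
    180: cycle_pattern(0b11001100, True),   # 'U' with colors swapped
    270: cycle_pattern(0b01100110, True),   # 0x13 with colors swapped
}
```

Each entry equals `rotate_right(0b1100, degrees // 90)`, i.e. 1100, 0110, 0011 and 1001.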

Okay, we've pushed the envelope even further: 512 simultaneous colors!
Granted, the real number is lower, because a good few are duplicates
(and others are very close). But 512 seems to be the limit for this
technique: no other characters in our font fit the bill for solid
colors. The CGA character ROM does have an alternate 'thin' 8x8
font; but, besides the fact
that you'd have to mod your card if you wanted to use it, the 'thin'
font has none of the magic bit patterns in the right places, which makes
it useless for our purposes.

My kingdom for redefinable characters... alas, when you're dealing with
old PC hardware, IBM's penchant for cost-cutting over innovation can
always sneak up from behind and ruin your day – even in the most
unusual of places.

Still, I was pleased with my little discovery: extending the palette by
a factor of 32 has to count for something, right? At this point, I
shared my ideas with reenigne. Little did I know that he'd promptly
come up with a new devious scheme to double our color count yet
again...

1024 colors

This part is some next-level CRTC black magic which I could never have
figured out by myself – I'm just a graphics guy; you might as well ask
me to wait for a full moon and chant the MC6845 spec-sheet backwards in
hexadecimal. All credit goes to reenigne for this particular bit of mad
science, which, despite its complex execution, stems from a wonderfully
simple idea: our fixed character bitmaps don't play nice with what we're
trying to do? No problem – we'll make them play nice, or else.

See, there are two additional characters whose very first scanline
could be used; problem is, the second scanline is different, which would
ruin our solid color effect. These are ASCII codes 0xB0 and 0xB1, the
'shaded block' characters. It would be quite convenient if we could just
tell that offending second scanline to buzz off, wouldn't it? As it
turns out, we can.

The lowdown on how this is done is all in reenigne's writeup, which is
linked to at the top of this post. But this is the basic idea: by
starting a new CRTC frame every other scanline and twiddling with the
start address, it's possible to lay down our character rows so that the
first scanline of each gets displayed twice!

Now we can make use of those two extra characters, and doing so gets us
two more 256-color sets:

Naturally, there are downsides: having to mess with the CRTC every
couple of scanlines is quite taxing for the poor 4.77MHz 8088, so
there's not much you can do with this other than static pictures. The
512-color variant, using only ASCII 0x55 and 0x13, doesn't suffer from
this – it's basically "set and forget", requiring no more CPU
intervention than any 80-column text mode (the familiar overhead of
avoiding snow).

Then, there's that other problem which plagues 80-column CGA on
composite displays... the hardware bug that leads to bad hsync timing and
missing color burst. There are ways to compensate for that, but none
that reliably works with every monitor and capture device out there.
This proved to be an enduring headache in calibrating, determining the
actual colors, and obtaining a passable video capture of the entire
demo... but that's all covered elsewhere.

At any rate, we now have 1K colors on a 1981 IBM CGA, at an effective
resolution of 80x100 'chunky pixels'. 'Chunky' describes the memory
layout, but it also applies in the visual sense: we're really plumbing
the depths of resolution here. 160x100, that's as low as you could go?
Allow me to snicker, IBM - "low-res" just got lower, baby!

One might object that this isn't a lot of canvas. Yeah, yeah: 80x100 is
a bit on the cramped side, 'artistically' speaking; but the limitation
is part of the challenge, as it has always been in demos. You can keep
your fancy 4K monitors - 0.008 megapixels should be enough for
anybody.
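
For anyone who wants to check the numbers, here's the back-of-the-envelope arithmetic as a Python sketch. The constant names are mine; the values come straight from the text (80x100 cells, two bytes per text cell, four usable characters, 16 foreground and 16 background colors).

```python
# Sketch: the headline figures for the 1024-color mode.

COLUMNS, ROWS = 80, 100
BYTES_PER_CELL = 2              # character byte + attribute byte
CGA_RAM = 16 * 1024             # 16KB of video memory

cells = COLUMNS * ROWS                   # 'chunky pixels' on screen
buffer_bytes = cells * BYTES_PER_CELL    # memory needed for a full screen
megapixels = cells / 1_000_000

GLYPHS = (0x55, 0x13, 0xB0, 0xB1)        # 'U', double '!', two shaded blocks
combinations = len(GLYPHS) * 16 * 16     # glyph x foreground x background
```

That works out to 8000 cells (0.008 megapixels), 16000 bytes of the available 16384, and 1024 character/attribute combinations, duplicates included.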

When we first showed Trixter the 'proof-of-concept' 1024c drawings, his
response was, and I quote: "HOLY F!@#$%G SHIT. WOW. I must know how
this works!!". Achievement unlocked: getting THAT out of a veteran
8088/CGA hacker and demomaker is, by itself, almost as good as... well,
joining the team, 'making a demo about it' and winning the oldskool
compo. :)

That's about it for my writeup. If you made it this far,
congratulations! There's more I could write about the tools and
techniques I used to actually compose these graphics... but we'll get to
that some other time.

You remember correctly - CGA Frogger starts each frame with a blue background, and sets it to black at a particular scanline down the screen. Jungle Hunt and California Games did similar things (all rely on rather precise timing and/or polling the CGA status register, since there are no raster interrupts).

As an emulation author, I have done quite a bit of study on how to emulate the NTSC color fringe using sliding windows. Your description of the artifact behavior is spot-on and well-done! It might help the folks at home to look at Y'UV-to-RGB conversion formulas in order to better understand the math involved. But discovering how to use other colors in text-mode, that's just outstanding!

As it turns out, the color fringing behavior is present in Apple // graphics as well, just never really utilized. A blue pixel fades in/out of black and so on. You could represent the Apple screen as having 560x192 resolution, but some pixels come out as different colors even though the Apple only produces 140 color cycles -- if you move a single pixel from left to right it will still appear as if it were in 560 places, it will just change colors along the way.

The best way I could exploit this is by a brute-force method, where my goal is to find the best combination of bits (the Apple uses 7 bits out of every byte) to represent 7 input pixels. I just look for the shortest (perceptive) color difference for a group of 7 pixels, turning each bit on and off one by one. Then using dithering I spread the error to the pixels below, etc. This has a nice side-effect of giving me the fullest color representation of an image while retaining 560 pixel-width detail. No, you can't see it on a TV screen clearly, but it looks nice and smooth all the same.

Anyway, well-done. You guys are clever gents! When you get a chance, have a look at some of the recent graphics demos by "French Touch" who have been able to do some crazy graphics hacks on the Apple // series (especially their "Crazy Cycles" demo which will break most emulators but not mine). ;-)

Funny that you mention Apple ][ NTSC emulation, as I just put up a new post on that subject - to be fair, I wasn't aware of JACE until now, so I didn't consider it when writing that post. I'll have to give it a go with those demos you mention :-)

Jesus Christ!!! As a former amateur programmer, I came here from a different thread: C64 raster-handling tricks which I guessed for long but never found explained until now. But 1024 CGA is simply genius stuff. Absolute respect and admiration to people involved.

If switching between alternating fields was used to produce the same solid color on both scanlines, what would you get if it was used to set a different bit/color pattern on each scanline separately? I know you said reenigne could use that hack, but didn't. What if you added that hack too?

@Ed Coolidge: if you could use different bit patterns on every scanline, you would have the same "palette" at double the resolution (80x200 'pixels'). But at two scanlines per character row, the non-programmable CGA font doesn't give us a free choice of such patterns to use... and if you go for one scanline per row, a full-screen picture would need twice as much RAM as the CGA's 16K. The "mugshots" section of 8088MPH (before the end credits) does exactly that, which is why it only covers half the screen. :)

The mentioned hack that we didn't use has to do with modifying the card's color registers at particular scanlines, but that only works in graphics mode (in text mode it'd only affect the border/overscan color, not the active area). Anyway, that trick still gets you 16 colors per scanline at most.

Back in 198<something>, BBC Micro Elite changed screen modes halfway down the screen --- so the top half was 320x256 monochrome, and the bottom half was 160x256 4-colour. This allowed the view out of the cockpit window to be high resolution but your cockpit instruments were in colour. Naturally this would only work when there were the same number of bytes per scanline with the two modes.

I would love to see what the final 1024-artifact color technique could produce when combined with the changing of colors on every scanline, plus changing of video modes (and/or disabling of color burst) partway through the image to produce a combination of sharp detail and a wide color gamut.

Outside of 8088 Land, though...Having access to an NTSC Composite Display plug-in for non-DOS emulators (for example, emulators of old game consoles such as the Super Famicom and the Mega Drive) would be incredible stuff. Combined with CRT emulators such as the one used for MAME, it has the potential to make everything look exponentially more awesome.

Unfortunately, 90-degree shifts on the NTSC artifact color wheel are not possible on the Mega Drive, as each scanline can only have a maximum of 320 or 256 pixels, and the Super Famicom has 256 pixel and 512 pixel modes, but no 320 pixel or 640 pixel modes.

And then, of course, there is the issue with VDP circuitry screwing with the composite signal to try cleaning it up...

On the plus side, there is a theoretical maximum of 61 colors per scanline on the Mega Drive (four layers of 15 colors each, plus a background color), even more if you use the Shadow and Highlight mode on the Sprite layer, and since each indexed color represents a 9-bit RGB direct color... and when combined with interlaced modes to double the vertical resolution, you can have at least 448 scanlines to work with. It would technically be 224, with the second frame being offset slightly, and in order to produce artifact colors correctly both scanlines need to be nearly if not completely identical, but you would finally be able to achieve a vertical dithering effect and artifact colors at the same time.

Really cool!! Was using CGA/VGA mode to demonstrate elementary graphics (peek/poke) for a hack session using Forth --- came across this -- outstanding achievement and great writeup!

Mario says (Sep 09, 2018 at 04:45):

There are quite a few duplicate colors, there are actually only 944 colors in total.

Mario says (Sep 09, 2018 at 04:46):

It is still very impressive though. (adds on to my other comment.)

MrMadguy says (Mar 24, 2019 at 17:44):

Can you please answer 2 noob questions about CRT? 1) VGA 320x200x256 uses 640 timings, but produces 4 pixels per character clock, instead of 8. So, is 160x200 possible on "Dot clock div 2" mode, i.e. with 320 timings? 2) About 320 timings for 160 mode. I needed to investigate them in order to figure out timings for 180 (i.e. 360) mode. Why can't we use 640 timings, divided by 2? Cuz, you know, VGA is an analog monitor and there is no difference for it between 320x200 and 640x400. Timings are the same. The only difference is the rate at which pixels and lines are being output? Why do we need special timings? What I don't understand is why retrace end = 0. If it's 0, then character count can be only 00h, 20h or 40h. It can't be 20h, cuz retrace start is 2Dh. It can't be 40h either, as horizontal total is 31h. So, it's 0 and retrace ends right before starting to display image. I.e. without any porch. How about left overscan area? What can it be right?