Technical Information

DAN3 is an LZ77/LZSS variant developed from DAN1 and DAN2, which explains their similarities. Like DAN2, the format starts by defining the size in bits of the large offset values used to identify a match, using a unary code (a sequence of bits) according to the following table:

0 : 9 bits long <- 512 bytes (characters on a screen).

10 : 10 bits long <- 1K (character set, some bitmap screens)

110 : 11 bits long <- 2K (most bitmap screens)

1110 : 12 bits long <- 4K (dithered bitmap screens)

11110 : 13 bits long <- 8K (my decompression routine limit as-is)

111110 : 14 bits long <- 16K

1111110 : 15 bits long <- 32K

11111110 : 16 bits long <- 64K (only useful for 32-bit/64-bit PC files at this point)

... (no limit in theory)
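The unary prefix above maps directly to an offset bit length: count the 1 bits before the terminating 0, then add that count to the 9-bit base. Here is a minimal Python sketch of that rule, assuming the bit stream is delivered as an iterator of 0/1 integers (the function name and that convention are illustrative, not part of the actual tooling):

```python
def read_unary_offset_bits(bits):
    """Read the unary prefix from a bit iterator and return the bit
    length of the large-offset field: 0 -> 9 bits, 10 -> 10 bits,
    110 -> 11 bits, and so on (9 + number of leading 1 bits)."""
    ones = 0
    while next(bits) == 1:
        ones += 1
    return 9 + ones

# 1110 selects 12-bit offsets (4K), as in the table above.
print(read_unary_offset_bits(iter([1, 1, 1, 0])))
```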

Then, like in DAN2, the first byte of the uncompressed data is stored as is.

Afterward, things diverge from the other formats: DAN3 offers better compression on average, though not always.

For each match, the size and the relative offset values are encoded. While DAN1 and DAN2 use Elias Gamma to encode the size value, DAN3 uses k=1 Exp-Golomb encoding instead, which helps a little in terms of both space and decoding time. For the relative offset values, DAN3 uses a completely different set of possible offset bit sizes: {5, 8, max} instead of {1, 4, 8, max}. As for the special case of a nearby single byte matching the current byte to encode, DAN3 only considers the 3 nearest bytes instead of the 18 nearest. This limits its potential to find a match and save space, but since the big gains come from sequences of more than 2 bytes, the change has little impact in practice; it still compresses better than Pletter and the other formats that do not support single-byte sequences, acting, if you like, as a local fixed Huffman encoding.

Here's a side-by-side comparison of Elias Gamma and k=1 Exp-Golomb to show what I mean by possible size and speed gains in DAN3, since reading fewer bits takes less time to decode.

Elias Gamma (DAN1 and DAN2) versus k=1 Exp-Golomb (DAN3)

1 : 10 = size 1

010 : 11 = size 2

011 : 0100 = size 3

00100 : 0101 = size 4

00101 : 0110 = size 5

00110 : 0111 = size 6

00111 : 001000 = size 7

0001000 : 001001 = size 8

0001001 : 001010 = size 9

0001010 : 001011 = size 10

0001011 : 001100 = size 11

0001100 : 001101 = size 12

0001101 : 001110 = size 13

0001110 : 001111 = size 14

0001111 : 00010000 = size 15

000010000 : 00010001 = size 16

...

000000011111110 : 00000011111111 = size 254
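The two codes in the table above can be generated mechanically: Elias Gamma of n is the binary form of n preceded by one leading zero per extra bit, and the k=1 Exp-Golomb code for size n is the Elias Gamma code of ((n-1) div 2) + 1 followed by the low bit of n-1. A small Python sketch of both (the function names are mine, purely illustrative):

```python
def elias_gamma(n):
    """Elias Gamma code for n >= 1, as used for match sizes in DAN1/DAN2."""
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b

def exp_golomb_k1(n):
    """k=1 Exp-Golomb code for a match size n >= 1 (the value n-1 is
    coded), as used in DAN3: Elias Gamma of the quotient plus one,
    followed by one remainder bit."""
    v = n - 1
    return elias_gamma(v // 2 + 1) + str(v % 2)

# Each printed row should match the corresponding row of the table above.
for n in (1, 2, 3, 7, 15, 254):
    print(elias_gamma(n), ":", exp_golomb_k1(n), "= size", n)
```

Note how the Exp-Golomb code grows by two bits only every doubling of the quotient, so from size 3 onward it is never longer than the Gamma code and is one bit shorter for sizes 3 to 6, 15 to 30, and so on.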

In DAN3, the support of Exp-Golomb stops at 00000011111111 = size 254. There are 3 reasons for that:

It allows decoding into a single byte instead of carrying the bits into a 16-bit register pair, which in Z80 opcodes simplifies the decoding routine, making it faster and smaller.

It makes the special markers for END OF DATA and RLE about a byte smaller and faster to read.

Very large empty sequences will sadly need more than one match of size 254, but since we're talking about compressing elaborate graphics, which are mostly not that empty, this limitation should barely be an issue and should amply satisfy our needs.

Special Markers

00000001 + byte : RLE - copy the next (byte value + 1) bytes raw

00000000 : END OF DATA
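Because sizes stop at 254, a valid size code never starts with more than 6 zero bits, so a decoder can tell the markers apart from size codes just by counting leading zeros. Here is a minimal Python sketch of that dispatch, assuming an MSB-first bit stream as an iterator of 0/1 integers (the function and its return convention are illustrative, not the actual routine):

```python
def read_size_or_marker(bits):
    """Decode the next size field from a bit iterator.
    Returns ('end', None), ('rle', None) or ('size', n).
    8 zero bits = END OF DATA; 7 zeros then a 1 = RLE marker (the raw
    length byte follows separately); otherwise the bits form a k=1
    Exp-Golomb size code."""
    zeros = 0
    while zeros < 8:
        if next(bits) == 1:
            break
        zeros += 1
    if zeros == 8:
        return ("end", None)
    if zeros == 7:
        return ("rle", None)
    # Finish the Elias Gamma part: the 1 already read plus `zeros` more bits.
    g = 1
    for _ in range(zeros):
        g = (g << 1) | next(bits)
    suffix = next(bits)             # the k=1 extra bit
    return ("size", (g - 1) * 2 + suffix + 1)
```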

Relative Offset

For a size of 1 byte, the relative offset is either 0 (the byte just before), 1, or 2, encoded respectively as 0, 10, and 11. For sizes of 2 or more, the offset is encoded as follows:
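The single-byte case above is a tiny fixed code and can be sketched as follows (a hypothetical helper for illustration only; the table for sizes of 2 or more is not reproduced here):

```python
# Offset code for a single-byte match in DAN3: only the 3 nearest
# bytes qualify, with the nearest getting the shortest code.
SIZE1_OFFSET_CODE = {0: "0", 1: "10", 2: "11"}

def encode_size1_offset(offset):
    """Encode the relative offset (0, 1 or 2) of a single-byte match."""
    return SIZE1_OFFSET_CODE[offset]

print(encode_size1_offset(0))  # the byte just before
```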

Listing

Decompression to VRAM routine for Z80, using the BE and BF ports (ColecoVision). Written in SDCC style, to be added to your compression toolbox. The DAN3 decompression routine is only 14 bytes bigger than the DAN1 one.

I still have to watch the TV series; my library has it available to watch for free, but I never find the time to do it.

You can consider GZIP and DAN3 as basically the same since they are both variants of LZ77/LZSS. If the TV series mentions a score for GZIP, which is the format we use in our Internet communications, then you basically have the Weissman score of DAN3... if only I were not the only one working on this, and many tools existed supporting the DAN3 format for various needs.

Since the compression time is meaningless compared to the resulting size to be used in our homebrew projects (decompression routine + data), I will not even try to give a Weissman score.

My compression tool here is written to optimize the compression ratio regardless of the time it takes to do so. Once compressed, we put the compressed data and the decompression routine together in our homebrew projects, which is the part that affects the users' experience: it saves loading time on tapes and disks, and it offers more content (graphics, levels, text) inside the limited space of the memory chips of our cartridges.

As for the decompression time, it's about the same as the other LZ77/LZSS variants, sometimes faster, sometimes slower, which really makes the compression ratio the most important part, and that's what I've focused on.

DAN3's compression ratio tends to be closer to PuCrunch and Exomizer than to Pletter, ZX7, ApLib, MegaLZ and others, but its format is close to the latter group's, and unlike Exomizer it doesn't need extra memory space to set up a table of values in memory. I believe the difference is mostly because DAN3 tries to compress even a single byte rather than just sequences of 2 or more bytes. At the extreme, DAN3 will give worse compression ratios, as MegaLZ does, on degenerate data like a text file containing only the letter Q thousands of times. But since I'm not concerned about degenerate data and expect DAN3 to be used on highly detailed content (hi-res graphics and elaborate levels, for example), I'm quite confident that DAN3 fits our needs on average, even if it's not always the best solution.