
This is a tech demo of some tools I've been working on for developing text-based games. It'd be more impressive if it were an actual game, but this victory was hard won so I'm posting it. :)

Notes:

Text is stored in cart data, not as string literals in the code.

The original source file does have the text as string literals in code. I use a post-processor to extract the string literals, pack them into text data stored in the cart, then replace them with string IDs. I use a custom syntax to flag which strings ought to be extracted so I can still use regular string literals elsewhere. The processing tool lets me adjust the location of the text in memory, so I can set aside space for sprites, sfx, etc. by limiting the size of the text data region.
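The extraction step could look roughly like the Python sketch below. The `_t("...")` marker, the `extract_strings` name, and the choice to reuse one ID for duplicate strings are all assumptions made for illustration; the post does not say what the actual flag syntax is.

```python
import re

# Hypothetical marker: _t("some text"). The real tool's flag syntax is not
# described in the post, so this pattern is only an assumption.
MARKER = re.compile(r'_t\("((?:[^"\\]|\\.)*)"\)')


def extract_strings(lua_source):
    """Return (rewritten_source, strings).

    Each flagged literal is replaced with _t(<id>), where <id> is the index
    of the text in the returned list. Unflagged literals are left alone.
    Escape sequences inside the literal are kept verbatim, not interpreted.
    """
    strings = []
    ids = {}

    def replace(match):
        text = match.group(1)
        if text not in ids:  # reuse the ID when the same text appears twice
            ids[text] = len(strings)
            strings.append(text)
        return "_t(%d)" % ids[text]

    return MARKER.sub(replace, lua_source), strings


source = 'print(_t("hello"))\nprint("plain")\nprint(_t("hello"), _t("bye"))'
rewritten, table = extract_strings(source)
```

The returned list is what would then be compressed and written into the cart data region, while the rewritten source only carries small integer IDs.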

Text is compressed using LZW with variable-width codes. The processor has a compressor written in Python, and it appends a decoder written in Lua to the cart code. This Tale of Two Cities excerpt is 25,613 bytes, and compresses to 12,457 bytes for storage in the cart data, about 48.6% of its original size.
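For readers unfamiliar with variable-width LZW, here is a minimal Python sketch of the general technique. It is not the p8advent code: the function names, the MSB-first bit order, the `max_entries` parameter, and returning the code count alongside the packed bytes are all assumptions chosen to keep the example short.

```python
def lzw_compress(data, max_entries=4096):
    """Return (packed_bytes, code_count). Codes are packed MSB-first."""
    table = {bytes([i]): i for i in range(256)}
    out = bytearray()
    bits = 0    # pending bits not yet written out
    nbits = 0   # number of pending bits
    count = 0   # number of codes emitted

    def emit(code):
        nonlocal bits, nbits, count
        # Width is just enough to hold the largest code currently in the table.
        width = (len(table) - 1).bit_length()
        bits = (bits << width) | code
        nbits += width
        count += 1
        while nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
        bits &= (1 << nbits) - 1

    w = b""
    for b in data:
        wc = w + bytes([b])
        if wc in table:
            w = wc
        else:
            emit(table[w])
            if len(table) < max_entries:  # stop growing once the cap is hit
                table[wc] = len(table)
            w = bytes([b])
    if w:
        emit(table[w])
    if nbits:  # pad the final partial byte with zero bits
        out.append((bits << (8 - nbits)) & 0xFF)
    return bytes(out), count


def lzw_decompress(packed, count, max_entries=4096):
    """Inverse of lzw_compress for the same max_entries."""
    entries = [bytes([i]) for i in range(256)]
    stream = int.from_bytes(packed, "big")
    total = len(packed) * 8
    pos = 0
    enc_size = 256  # the encoder's table size when it emitted the next code
    prev = None
    out = bytearray()
    for _ in range(count):
        width = (enc_size - 1).bit_length()
        code = (stream >> (total - pos - width)) & ((1 << width) - 1)
        pos += width
        if prev is not None and len(entries) < max_entries:
            # A code equal to len(entries) names the entry being built now.
            first = entries[code][:1] if code < len(entries) else prev[:1]
            entries.append(prev + first)
        prev = entries[code]
        out += prev
        if enc_size < max_entries:
            enc_size += 1
    return bytes(out)


sample = b"it was the best of times, it was the worst of times, " * 20
packed, n_codes = lzw_compress(sample)
restored = lzw_decompress(packed, n_codes)
```

Codes start at 8 bits and widen as the table grows, which is what removes the unused high bits that fixed-width codes would waste.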

My LZW implementation is designed to allow random access to strings during program execution. LZW is a dictionary-based compression algorithm, and all strings share the same dictionary for efficient packing. The entire corpus is not decompressed all at once into Lua memory. Instead, the lookup dictionary is calculated from the bit stream and retained in Lua memory so that strings can be decoded on the fly as they are accessed. In the cart data, I use a simple binary layout that gives each string a header with information that helps track the code bit width during decompression, and byte-aligns each string's first and last characters.
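The shared-dictionary idea can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the actual cart layout: codes are kept as lists of integers instead of a packed bit stream, so the per-string header and byte alignment are omitted, and dictionary entries are stored fully expanded. All function names are hypothetical.

```python
def compress_strings(strings, max_entries=4096):
    """LZW-encode each string on its own while sharing one growing table."""
    table = {bytes([i]): i for i in range(256)}
    encoded = []
    for s in strings:
        codes, w = [], b""  # the match prefix restarts at every string
        for b in s:
            wc = w + bytes([b])
            if wc in table:
                w = wc
            else:
                codes.append(table[w])
                if len(table) < max_entries:
                    table[wc] = len(table)
                w = bytes([b])
        if w:
            codes.append(table[w])
        encoded.append(codes)
    return encoded


def build_dictionary(encoded, max_entries=4096):
    """One sequential pass over every string's codes rebuilds the table."""
    entries = [bytes([i]) for i in range(256)]
    for codes in encoded:
        prev = None
        for code in codes:
            if prev is not None and len(entries) < max_entries:
                # A code equal to len(entries) names the entry being built now.
                first = entries[code][:1] if code < len(entries) else prev[:1]
                entries.append(prev + first)
            prev = entries[code]
    return entries


def read_string(entries, encoded, index):
    """Random access: every code is a direct lookup in the finished table."""
    return b"".join(entries[code] for code in encoded[index])


strings = [
    b"it was the best of times",
    b"it was the worst of times",
    b"it was the age of wisdom",
]
encoded = compress_strings(strings)
entries = build_dictionary(encoded)
last = read_string(entries, encoded, 2)  # decoded without touching 0 or 1
```

Once the table has been built, no string depends on decoding the ones before it, which is the property that makes on-the-fly access possible. With bit-packed codes, a reader would additionally need to know the table size at the start of each string to recover the code width, which is presumably what the header is for.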

This Tale of Two Cities demo gets close to the Lua memory limit with its dictionary. I maximized the size of the dictionary (7,903 entries) to minimize the size of the compressed data. In practice, I'll probably cap the dictionary at 4,096 entries, which for this text makes the compressed cart data a few kilobytes larger. That trade is worth it because headroom in Lua RAM will be important for real games.

The slow scroll of the text in this reader app is artificial, originally intended for use in a text game. Decompression is quite fast after the initial dictionary is built. I have limited interest in making a usable ebook reader cart, but you're welcome to try it. This implementation uses only 292 tokens and 5017 chars, and that could probably be tightened up a bit.

This excerpt is 4,634 words. For comparison, Zork I is 14,214 words. Considering Zork had the luxury of paging from a 160k floppy disk and this is packed into a 16k region of cart data, that's not too shabby. :)

I don't know yet if this will actually be useful for a game project, but it was fun to make. The complete code, which is based on and requires picotool, is not ready for public consumption, but here's the GitHub link anyway: https://github.com/dansanderson/p8advent

@dddaaannn I know this is pretty old by now, but I'm really interested in how you were able to store so much text here. When you said this was pre-packed and stored in 16k of cart data, where did you mean, specifically? This would be really cool to use for other purposes as well.

This experiment was about storing a table of indexable strings as compressed data in the addressable region in a cart, i.e. where the graphics and sound data go. By putting compressed text in the graphics/sound region, you can save multiple chunks across multiple cart files and load them in selectively with reload(). A single cart's graphics/sound region is 16 KB.

If you open the Tale of Two Cities cart in Pico-8 and switch to the sprite sheet editor, you can see all of the text data as rainbow noise. I started with fixed-width LZW codes instead of variable-width and you could see stripes indicating the unused gaps.

If I remember correctly, my experimental tools let you set how much memory to reserve for text, so you can leave some space for graphics/sound as well. I didn't production-harden these tools so they're a little rough. Feel free to mess with them for your own purposes!

Ah, I was looking at the memory map and wondering where it was located. I wasn't sure if there was some other trick involved, like if there was some way to pre-load user data before run time. So this does compromise your ability to use graphics and sound.

Is it something like 0x0000 to 0x4300 (the start of user data)? That might be what confused me: there wasn't an easily delineated 16k chunk in memory.

Correct. This technique takes up addressable cart ROM that would normally be used by graphics/sound data. 0x0000-0x4300 = 16.75 KB.
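The 16.75 KB figure can be checked by adding up the cart-backed regions of the standard Pico-8 memory map. The Python snippet below is just that arithmetic; the region labels are informal descriptions.

```python
# Cart-backed regions of the Pico-8 memory map as (label, start, end),
# with end exclusive.
REGIONS = [
    ("sprite sheet, sprites 0-127", 0x0000, 0x1000),
    ("shared: sprites 128-255 / map rows 32-63", 0x1000, 0x2000),
    ("map rows 0-31", 0x2000, 0x3000),
    ("sprite flags", 0x3000, 0x3100),
    ("music", 0x3100, 0x3200),
    ("sfx", 0x3200, 0x4300),
]

sizes = {label: end - start for label, start, end in REGIONS}
total_bytes = sum(sizes.values())
total_kib = total_bytes / 1024

for label, size in sizes.items():
    print("%-42s %5d bytes" % (label, size))
print("total: %d bytes = %.2f KiB" % (total_bytes, total_kib))
```

The total is 0x4300 = 17,152 bytes. The sprite sheet and map alone account for 12 KiB, and the flags, music, and sfx regions make up the remaining 4.75 KiB, which is why no single region looks like a 16k block.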

In a real game using both text and graphics/sound, I would expect to use a small portion of the graphics data for text, and page in additional chunks as needed from auxiliary carts. Each reload() comes with an artificial pause, so how practical this is probably depends on the game.