After finishing ROTT and the Shadow Caster maps, I decided to tackle something much simpler, but also to document the process as I go along. That game is Xargon, which can be obtained at Classic Dos Games. According to that site, Allen Pilgrim released the full version as freeware with a very short license, so we will be able to map all three episodes. However, since I already have it on my computer, I will start with Episode 1. Note that the source code has been released, so that is an option if I get stuck. However, to hopefully make this tutorial more useful (and fun), I'm planning to do this as a black-box exercise. Obviously, if a game has a source code release, or is well documented, that can help reduce the guesswork.

As with Shadow Caster and ROTT, I will be using Python and the Python Imaging Library (PIL). I will also need a Hex editor to look through the game resources before digging in, so Bless is my editor of choice. Windows users will obviously need something else.

Generally speaking, creating a map via hacking is composed of three phases:
Phase 1: Figure out the map format
Phase 2: Figure out the image data format(s)
Phase 3: Put it all together and make the map (this is the real meat of it)

Phases 1 and 2 can be done in either order, but I will start in the order above.

So, the map format. Xargon is split into a number of different files, and I'm going to take a wild guess that the BOARD_##.XR1 files are the maps. Let's look at one in a hex editor:

The first thing to notice is there's definitely a repeating pattern. The second thing is that it looks like a 16 bit pattern. This can mean either 2 individual bytes per location, or a single 16 bit number per location. Either way, we know the size of a location, but not the dimensions of the map. We also know that there is no header, because the pattern starts immediately at the top of the file. Given the nature of the game, we will start with the assumption that the game map is simply a direct listing of tiles without any grouping, compression or other tricks. So, let's take the file size and figure out how many tiles we have total:

19255 bytes / 2 = 9627.5. This already tells us that we must have some sort of footer that isn't part of the map data. Scrolling down to the bottom of the file confirms this, as the file ends in the string "TOO LATE TO TURN BACK!". However, the footer is unlikely to be a huge portion of the file, so let's ignore it for now. Taking the square root of the tile count gives us the dimensions if the game used square levels: about 98.1. This gives us an order of magnitude to guess at. I know that the maps aren't square, but let's run with this and see where it gets us.
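The arithmetic above is easy to reproduce in Python. This is only a sketch: the variable names are mine, and the 2 bytes per tile is the working guess from the hex dump, not a confirmed fact.

```python
import math

FILE_SIZE = 19255        # size of the board file in bytes, as quoted above
BYTES_PER_TILE = 2       # assumption from the hex dump: 16 bits per location

tile_count = FILE_SIZE / BYTES_PER_TILE
side = math.sqrt(tile_count)

print(tile_count)        # a fractional count hints at non-map data (a footer)
print(round(side, 1))    # rough side length if the level were square
```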

The next step is to visualize the map to see if we guessed correctly. 8-bit numbers are easiest, as we can go direct to grayscale. 16 bit numbers present us with two options: if we think the numbers present different information, we can try two parallel grayscale images. If we think it's just one 16 bit tile number, we should just pick two out of the three RGB channels and generate a colour image. I'm going with the latter option initially. I'm also going to cheat a little bit and copy some boilerplate code from my other scripts as far as grabbing the input file and checking for # of input parameters. Here's the basic visualizer script:

'<{}B' means a little endian pattern of {} bytes, where the actual number is filled in by the format call. Refer to the Python struct module documentation for more information on these strings.
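As a small illustration of that format string, here is a sketch that unpacks raw bytes and pairs them up as two colour channels. The function names and the sample bytes are made up for this example; a list of tuples like this could then be handed to PIL (for instance via Image.putdata) to get the colour visualisation described above.

```python
import struct

def unpack_bytes(data):
    """Unpack a bytes object into a tuple of unsigned 8-bit integers."""
    fmt = '<{}B'.format(len(data))   # e.g. '<4B' for four bytes
    return struct.unpack(fmt, data)

def to_rgb_pixels(values):
    """Pair consecutive bytes up as the red and green channels of one pixel
    (blue stays 0). A trailing unpaired byte is dropped."""
    return [(values[i], values[i + 1], 0)
            for i in range(0, len(values) - 1, 2)]

sample = bytes([0x01, 0xC0, 0x02, 0xC0, 0xFF])   # made-up data, odd length
values = unpack_bytes(sample)
pixels = to_rgb_pixels(values)
print(values)
print(pixels)
```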

So we run it and get:

That's cute. Obviously wrong too, but there's a clear pattern shift to the image. We should be able to figure out exactly how far we are off on each row. Opening in GIMP and counting the pixel shift shows that we appear to repeat every 64 pixels. Taking our original dimensions, and using 64 as one dimension yields 9627.5/64 = 150.4 for the other. Some of that is going to be footer, but we should be able to clearly see the breakdown when we finish. Our adjusted script becomes:

That looks WAY better. But it looks sideways. And we can clearly see the garbage at the bottom is the footer and is not tile data (we'll have to figure it out later; I suspect it is item/monster placement). Opening in GIMP again shows that the map is only 128 pixels tall, so let's fix that dimension. Let's also rotate it -90 degrees.

Yesterday we got the map format decoded. While I can visualize more of the maps the same way, let's go ahead and try to figure out the tile format. Now, Xargon is a 256 colour VGA game, so we're going to need a colour palette. We also need to know what dimension of tile we are looking for. Both of these goals can be accomplished by taking a simple screenshot (via Dosbox for me):

Looks like 16 x 16 tiles to me. Looking through the Xargon directory, GRAPHICS.XR1 is the most likely candidate for containing tile information. For lack of a better idea, I'm going to just assume it contains a whole bunch of RAW 16 x 16 images in order. With a filesize of 697555, and each tile taking up 16*16 = 256 bytes, that is about 2724.8 tiles. Let's cook up a quick looping RAW image importer with PIL and see what we get:

Not a horrible assumption. There are some properly decoded (albeit shifted) tiles in there. But there are also some images that look misaligned (i.e. not 16 x 16), and some random pixels (which imply headers). There's also a repeating pattern at the top of the file with random pixels that may be an overall file header (i.e. Images 0000 and 0002). Zooming the image shows that each record appears to be 4 bytes in length.

Scrolling through each of these 4-byte regions and looking at the decoding panel in my hex editor, each region appears to be a simple 32-bit integer, and each number is larger than the previous. (example below)

Let's decode those numbers. I locally commented out the image code and am excluding it from below for brevity. Counting the bytes in the Hex editor, it appears to end at offset 0xF8 for a total of 62 records. Also note that pdb.set_trace() in the below example invokes the Python debugger where we can inspect data interactively. This also saves me the trouble of writing a print statement for debug data.
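For reference, decoding a run of 32-bit little-endian integers from the top of a file can be sketched like this. The helper name and the in-memory stand-in file are assumptions for illustration; the three sample values are offsets mentioned in this write-up, and 62 entries at 4 bytes each is the 0xF8 (248) bytes counted above.

```python
import io
import struct

def read_offsets(stream, count=62):
    """Read `count` little-endian unsigned 32-bit integers from the start
    of a binary stream."""
    fmt = '<{}L'.format(count)
    stream.seek(0)
    return struct.unpack(fmt, stream.read(struct.calcsize(fmt)))

# Synthetic stand-in for the real file, holding just three entries.
fake = io.BytesIO(struct.pack('<3L', 9359, 14366, 15087))
offsets = read_offsets(fake, count=3)
table_bytes = struct.calcsize('<62L')
print(offsets, hex(table_bytes))
```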

Keep in mind the file size of this graphics file is 697555. The last number is close, but does not exceed this number. This implies to me that these are all file offsets, probably delimiting the start of a region of data in the file. Since image 0002 also looks like header data, let's look at that too:

Well, it's smaller. It looks suspiciously like half the size, in fact. Let's grab that too and decode it as a series of 16 bit integers.

Perfect match. I'm convinced. Now, going by the debug images I already generated, the first region appears to be a whole lot of blue and will be difficult to determine on its own. Let's start with the second region, at offset 9359 (0x248F hex). Since each picture is 256 bytes (0x100 hex), that puts us towards the end of picture 0x24 (36 decimal). Hrm, still too much blue. Let's jump ahead until we get to the images. The next is 14366 -> image 56, which is still blue, but the following is 15087 -> 58 which starts to look like something. It also puts us one pixel before the last row of the image, which looks like another header. Since we're offset by 1, and the next image matches the last pixel in this image, I suspect the header is 16 bytes. I'll just copy the bytes from the hex editor:
24 01 00 D0 06 10 0D 90 19 08 04 00 08 10 00 10
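The offset-to-image arithmetic used here is just a divide and remainder by the 256-byte debug image size. A minimal sketch (the function name is mine):

```python
TILE_BYTES = 16 * 16   # one raw 16 x 16 debug image, one byte per pixel

def locate(offset):
    """Return (debug image number, pixel index within that image)."""
    return divmod(offset, TILE_BYTES)

for offset in (9359, 14366, 15087):
    image, pixel = locate(offset)
    print(offset, '-> image', image, 'pixel', pixel)
```

The last row of a 16 x 16 image starts at pixel 240, so a pixel index of 239 is the "one pixel before the last row" case.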

Now we need to guess the bit size of each field in this header. Remember that we are little endian. The first four bytes are:
24 01 00 D0

If we think this is 32 bits, that's either -805306076 (signed) or 3489661220 (unsigned), or -8.590234E+09 (floating). None of those are particularly likely (and floats aren't very common for old DOS games in general).
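These candidate interpretations can be checked quickly with the struct module. This sketch just reinterprets the same four bytes several ways; which reading (if any) is the right one is still an open question at this point.

```python
import struct

raw = bytes([0x24, 0x01, 0x00, 0xD0])   # first four header bytes quoted above

as_signed, = struct.unpack('<i', raw)      # one signed 32-bit integer
as_unsigned, = struct.unpack('<I', raw)    # one unsigned 32-bit integer
as_float, = struct.unpack('<f', raw)       # one 32-bit float
as_halfwords = struct.unpack('<2H', raw)   # two unsigned 16-bit integers
as_bytes = struct.unpack('<4B', raw)       # four unsigned bytes

print(as_signed, as_unsigned, as_float)
print(as_halfwords, as_bytes)
```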

I think it's likely we have either an 8 bit sequence or some sort of hybrid.

I'm scratching my head, so let's get another sample. The next few sample images look like noise, so they may be some other form of data, or just unrecognizably misaligned. Let's jump ahead to the really recognizable tiles around image 293. 293 is offset 75008, and from above, the closest record starts at 74973 (image 292, pixel 221). Looks like another 16 byte header to me. Hex values are:

For today's effort, let's try to see if we can get better decoding of each image record. It may not be perfect, but we should be able to get good-enough decoding to get the map tiles ready for use. It's very hard to guess what every number might mean from a black box perspective, so some of those fields will just have to remain unknowns for now. Usually, the best approach is to start with a number that you think should be present and try to find it.

The most likely candidates I see are the 3rd and 4th last numbers. Since we know the tile images are 16 x 16, we see a lot of records with those numbers. What I'm going to do is split each record up into a series of images, using those dimensions for each series. We'll see how well that turns out. I'm also going to need to clean up my code a bit more to support this, so it's time to group things into classes. The updated script is below:

Code:

import struct, sys, os, pdb, csv
from PIL import Image

def createpath(pathname):
    """ Simple utility method for creating a path only if it does
    not already exist.
    """
    if not os.path.exists(pathname):
        os.mkdir(pathname)

So now we have an imagefile class for the overall file, and an imagerecord for each record in the file's table of contents. I also moved my save CSV code and load/save image code into functions for each class.

The first few sets of images start to look more like something, but they're still a bit weird. Set #5, however, is where things start to get interesting. Here we have an almost-correct decoding of the player sprite!

I guess the guess wasn't too far off. Since we're shifting between each sub-image, I suspect there may be a small header between each image. Let's see what we can find. I'm going to use image set #9, because it's 16 by 16 and very easy to see image alignment. The first image seems to be shifted by 1 pixel, and the second image seems to be shifted by 2 pixels. Hrm. 3 pixel header? Let's find this in a hex editor. My debug code nicely names the directories by their offsets, so starting from offset 74973, adding 16 for the header and 256 for the first image, then subtracting 1 for the first image's error, gets me to 75244. And this decodes to:
16 16 0

So, it looks like the end of the previous header was actually the start of the first image's header. And it also appears I was accidentally grabbing the first pixel from the image. Let's go fix all this, and decode the headers for each image. Also, looking at record 9, I ended up with 47 images. The header for that record has 47 as the first number. Since I know now that we can have variable size images, I'm going to assume this is the number of images in a record.
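To make the idea concrete, here is a rough sketch of reading a run of images that each carry a 3-byte header. It assumes the header is (width, height, one unknown byte) followed by width * height pixel bytes, which is my reading of the "16 16 0" sample; the function name and the synthetic data are made up, and the layout of the record header that precedes the images is deliberately left out.

```python
import io
import struct

def read_images(stream, count):
    """Read `count` images, each assumed to be a 3-byte header
    (width, height, unknown) followed by width * height pixel bytes."""
    images = []
    for _ in range(count):
        width, height, unknown = struct.unpack('<3B', stream.read(3))
        pixels = stream.read(width * height)
        images.append((width, height, unknown, pixels))
    return images

# Synthetic record body: a 2 x 2 image followed by a 1 x 3 image.
body = bytes([2, 2, 0, 10, 11, 12, 13,
              1, 3, 0, 20, 21, 22])
images = read_images(io.BytesIO(body), count=2)
for width, height, unknown, pixels in images:
    print(width, height, unknown, list(pixels))
```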

Now, to top things off, let's mask transparent areas of sprites. It looks like colour 0 (black) is used for transparency. Let's just add a routine to loop through the image data, create a mask, then convert each to RGBA. This will make our lives much easier going forward. I'm actually going to just copy a routine I already wrote for my Shadow Caster maps.
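The masking idea can be sketched without any imaging library: walk the palette indices and give colour 0 an alpha of zero. This is not the Shadow Caster routine mentioned above, just a minimal stand-in; the function name and the tiny palette are hypothetical.

```python
def to_rgba(indices, palette, transparent_index=0):
    """Convert palette indices to RGBA tuples, making one index see-through.

    `palette` is a list of (r, g, b) tuples; `transparent_index` defaults to
    colour 0, which is the guess made in the text above.
    """
    rgba = []
    for index in indices:
        r, g, b = palette[index]
        alpha = 0 if index == transparent_index else 255
        rgba.append((r, g, b, alpha))
    return rgba

# Tiny made-up palette and a four-pixel "sprite".
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0)]
pixels = to_rgba([0, 1, 2, 0], palette)
print(pixels)
```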

And done...

Code:

import struct, sys, os, pdb, csv
from PIL import Image

def createpath(pathname):
    """ Simple utility method for creating a path only if it does
    not already exist.
    """
    if not os.path.exists(pathname):
        os.mkdir(pathname)

Before we start identifying tiles and creating the image map, there are two things I noticed yesterday about the image data that I want to investigate.

1) The GRAPHICS file table of contents had a few blank entries. My current algorithm skips those positions entirely. However, when we start identifying tile mappings, the fact that there was a field allocated for them may be important. I will therefore update my algorithm to specifically create blank places in order to keep the record alignment accurate. It also appears to have space for more records, so I'm going to at least process those regions in case there are more records in the registered version.

2) Record 4 and Record 49 have a garbage 64*12 image in each of them. Also, record 49 appears to have the wrong colour palette. The interesting thing about a 64*12 image is that it takes up 768 bytes. A full 256 colour palette also takes up 768 bytes, so I suspect these records are actually colour palettes. If so, we don't need our sample screenshot and we can pull the colour palette direct from the game data. It also means we can correct the palette for the Record 49 images, if this is accurate.
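If those 768 bytes really are a palette, splitting them into 256 RGB triples is straightforward. The sketch below is an assumption-laden illustration: the function name is mine, the input is a made-up grey ramp, and the `scale` parameter is a hypothetical knob. It exists because VGA hardware palettes use 6-bit components (0 to 63), so data stored in that range would need multiplying by 4; whether this particular file stores its colours that way is unverified here.

```python
def split_palette(raw, scale=1):
    """Turn 768 raw bytes into 256 (r, g, b) tuples.

    `scale` multiplies every component (clamped to 255). Use 1 for data
    already in the 0-255 range; 4 would suit 6-bit VGA-style values.
    """
    if len(raw) != 256 * 3:
        raise ValueError('expected 768 bytes, got {}'.format(len(raw)))
    return [tuple(min(255, c * scale) for c in raw[i:i + 3])
            for i in range(0, len(raw), 3)]

# Made-up data: a grey ramp standing in for the 64 x 12 "image" bytes.
raw = bytes(v // 4 for v in range(256) for _ in range(3))
palette = split_palette(raw)
bright = split_palette(raw, scale=4)
print(len(raw), palette[255], bright[255])
```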

Also, as the source files are going to get a bit more plentiful (and of increasing length) I'm also going to change to only posting snippets of code and include the full listing in a zip at the end of this day's post.

Finally, because I'm paranoid, I'm going to put in a warning if a record still has data after we read the last image in that record. I *think* I'm decoding that right, but if there's more data, we don't want to miss it.

For the blank entries, it couldn't be easier. Increasing the size of the record removes the need for the seek command and the IF on the list comprehension:

The IF needs to go inside the image record instead. It also needs to be added to the debug output so we don't get a ton of empty folders. I'll let the debug CSV actually attempt to write out data for these blank records so it's easier to see the breakdown. I'm also going to add my paranoid sanity check to the __init__ method now. Here is the updated imagerecord class:

Well, we are missing data. Now, is this actual data, or blank regions? Let's investigate one. Offset 54601 is record #7, and according to my debug CSV has a size of 6880. If we have 969 unaccounted bytes, that puts us at offset 54601+6880-963 = 60518. And it looks like there's a header for a 24 * 40 pixel image here. This implies that we are off by 1 on the number of images in a region. However, the 3 byte misalignments look a bit too small. Sure enough, investigating one of those shows that it actually is a 0, 0, 0 header. So, let's expand to grab that last image, but provide the ability to ignore an empty image if needed.

Weird. Everything looks like the correct colours, just VERY dark. Either it's in some strange format, or it's an alternate palette. A quick comparison using GIMP and a Hex editor doesn't show any direct correlation. Bah. It's not worth it right now. I'll keep the code in, but I will limit it for group 53 data and leave the main data using the screenshot for the palette. Let's get to the mapping!

In order to get started on the mapping, I need to refactor how I load the map. Previously, we loaded the map byte-at-a-time to make visualization easy. Now we should load as a series of 16-bit halfwords instead. I should also add a quick routine to generate a map csv so I can see the data values for each tile easily. Before I do that, let's get the previous functionality refactored and try it on a few other maps for comparison.
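Loading the map as 16-bit halfwords is a one-line change to the struct format. A minimal sketch, with a made-up four-byte sample and a helper name of my own choosing (a real board would use a count of 128 * 64):

```python
import struct

def read_tiles(data, count):
    """Decode the first `count` little-endian 16-bit values from `data`."""
    fmt = '<{}H'.format(count)
    return struct.unpack(fmt, data[:struct.calcsize(fmt)])

# Made-up sample: two tiles' worth of bytes.
sample = bytes([0x01, 0xC0, 0xAD, 0x00])
tiles = read_tiles(sample, 2)
print([format(t, '04X') for t in tiles])
```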

I tried to do something similar with Mega Man 9 before the Dolphin emulator was able to play it, but sadly, the graphics tiles weren't laid out so nicely in the data file. I stared at them in a hex editor for hours and hours before I finally gave up and eventually just fixed Dolphin so I could assemble the maps manually.

Yeah, I'm thinking the black box method is only really practical with old DOS games. The newer games are more likely to be using strange compression schemes and alternate image formats. That said, some games may just use standard image formats (like PNG), so it may be entirely possible to start looking for headers of known image files.

Then again, the entire resource file is probably compressed too.

On to Day 5: the day where I stop teasing and actually create a map.

To support this, we need to start a lookup routine in the graphics file to get the corresponding tile for a map tile index. Unfortunately, we don't know the mapping yet, so we need to pre-populate with debug tiles. Since we found out that the tile index is only an 8-bit number, we should be able to fit it (in hex notation) in the 16 x 16 pixel debug tile. This may make the CSV we made obsolete, but what can you do.

We need a font to generate the text, so I'm going to copy in the Android favourite, DroidSans. Here's the routine to create a simple debug tile (adapted from my Shadow Caster mapper) and the font declaration:

And the lookup method, which will change once we figure out real mappings:

Code:

    def gettile(self, tilenum):
        return self.tilelookup(tilenum)

Now, we start a new file for generating the map. In this file, we're going to import the other two files we wrote in order to use their features in tandem. This will start out extremely simple: we will create a large Image object to hold the map, then paste every debug tile according to its coordinates. Since the map is a linear set of tiles, we need to use integer division and modulus (i.e. the "remainder") to split that up into X, Y coordinates.
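The index-to-coordinate split can be sketched as below. It assumes the layout found earlier: 128 x 64 tiles of 16 x 16 pixels, stored one column at a time (which is why the first visualisation came out sideways). The constant and function names are mine.

```python
MAP_WIDTH = 128    # tiles across, from the earlier visualisation
MAP_HEIGHT = 64    # tiles down
TILE_SIZE = 16     # pixels per tile edge

def tile_position(index):
    """Map a linear tile index to the pixel position of its top-left corner,
    assuming the tiles are stored one column at a time (height-first)."""
    x, y = divmod(index, MAP_HEIGHT)
    return x * TILE_SIZE, y * TILE_SIZE

positions = [tile_position(i) for i in (0, 1, 64, MAP_WIDTH * MAP_HEIGHT - 1)]
print(positions)
```

Each position is where the tile would be pasted into the big map image.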

Here's the initial mapper file. I set up the mapper as a class, but you can do it as a standalone set of functions as well. Personally, I like to split it up into phases. The Shadow Caster mapper had initialize, generate and save phases, but I think we can combine the first two in this case.

Well, that's the same area. So now we need to go look through our extracted tiles and match a few to IDs. Usually we can figure out a trend to unlock the rest for us; otherwise it can become a bit tedious. I see a few tiles in graphics record #8, so I'm going to go ahead and map those. Actually, I already see a pattern: it looks like map tile values 1 to at least 0A correspond to the first few tiles in group 8. But there appears to be a discontinuity at index 6. GRR. Let's do what we can with it.

Not too bad so far, but I can see this getting awkward after a bit. Let's see what this does to the map:

Okay, that's progress. Now let me go ahead and try to identify some of the other tiles... gah, what is that??

Looks like I was wrong about the 8-bit tiles. This all falls under the red area in the original visualisations. Looking a bit closer at my grayscale meta files in GIMP shows that even the non-black regions appear to have a few very-close-but-not-identical colour values. Time to switch back to 16-bit tiles (grumble). I'm not quite sure how to make 4 digits visible on a 16 pixel tile though. I'll try, but I may need to rely more on the CSV.

No more misidentified tiles, but we aren't exactly coming up with much of a trend, even if we were to calculate out the offsets (i.e. -0xC0F5+2 = -0xC0F3, but -0xC0D3+9 = -0xC0CA). It seems that almost every group of tiles, even those in the same record, ends up discontinuous. At this point, we have two options. Either keep identifying tiles manually (and end up with some sort of mapping database), or see if there is some information we are missing. Hey now, there's a file called TILES.XR1. Let's see what's in it. Looks like a whole bunch of strings with some supplementary information. It's a very regular file, so it should be fairly simple to decode.

Still no direct correlation to the values we see on the map, as C000 (for example) is 49152 in decimal. Let me do one more thing today. I'm going to re-interpret some of our unknown GRAPHICS record header values as 16-bit numbers.

Something to note about old DOS games as well: sprites are often stored with some sort of run-length encoding. It's an outdated method of reducing the size of bitmaps. What it does is translate the drawing of the image into commands similar to:

Move X pixels
New row
Draw X pixels of color

Each game seems to be slightly different in the encoding, but with trial and error, it can be figured out.

Thanks for the feedback guys. To expand on Darkwolf's comment, Xargon is a very simple game with some basic data structures. It makes it a good example, but note that other games may be more difficult. Both games I did prior to this (ROTT and ShadowCaster) had more complicated sprite schemes. ROTT used an RLE encoding scheme as DarkWolf described, while ShadowCaster actually had a simple scheme of specifying the start and end pixel in a column, but not otherwise compressing the image. That said, a lot of games have had some prior investigation, or even a source code release, which can help reduce the guesswork immensely. As I mentioned, the source code for Xargon has been released, but I'm intentionally avoiding it to make this guide (hopefully) more useful.

To start off today's exercise, I'm going to play around in a spreadsheet. Exciting stuff. Basically, for every known sprite thus far, I want to figure out the exact relationship between the TILES file, the Record ID, and the Map ID. One thing I noticed yesterday was that the Record ID column in the TILES file appeared to shift by a 256 boundary for each record. We should be able to use the classic divide and modulus pair to split out the record number. I just need to confirm to make sure we don't start drifting for whatever offset I end up calculating.

Records appear to be easy. The record number is just the TILES entry / 256 - 64, and the entry in the record is the TILES entry % 256 - 1. The only two things that we need to pay special attention to are:
1) The "extra" tile we mentioned before (i.e. that I thought might be garbage data) appears to be referred to by position -1. That said, index -1 in Python already refers to the last element in a list, so we don't actually need to handle this explicitly.
2) The tiles file refers to record 49, which does not exist. We will just need to keep an eye out if this happens. I expect this is just placeholder data (maybe for the registered version).
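Written out as code, the divide-and-modulus split looks like this. It reflects the relationship as it stands at this point in the walkthrough; the function name is mine and the two sample entries are made-up values chosen to land on a normal index and on the -1 case.

```python
def split_tiles_entry(entry):
    """Split a TILES value into (record number, index within the record),
    using the relationship worked out in the spreadsheet above."""
    record = entry // 256 - 64
    index = entry % 256 - 1
    return record, index

# Made-up entries: one ordinary tile and one that hits index -1.
examples = [split_tiles_entry(e) for e in (18705, 18944)]
print(examples)
```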

The Tile IDs appear to just be offset by C000, which is pretty simple. I'll need to figure out the tiles that are in the 0 to FF range on the map, but that can wait. I'll just put an IF statement to avoid processing them for now.

So let's expand our xargontiles.py file to do the lookup for us. We have two options for linking together the graphics and tile files:
1) Pass in the Graphics file when creating the Tiles file and store a reference for future use
2) Pass in the Graphics file for the lookup operation to the Tiles file

Since I prefer to keep each class tailored to just the file it is responsible for, I will be doing the second option. The xargonmapper.py is the only file that will know about all the related files and will tie them together.

First, we need to set up xargontile to do easier lookups. I will expand the initialization routine to populate a dictionary of tile to record mappings:

And would you look at that? THAT is progress. But it's not perfect, and it has some glitches we need to investigate. Notably, we haven't looked into the map index 0 -> FF area yet. I also noticed a couple glitches over in this part of the map which we need to investigate further:

But first, the region that doesn't start with C000:

And let's start identifying some tiles. I already have a theory that these are simply direct indices into the TILES file, so let's test that out with the black tiles (00AD it looks like). 0xAD -> 173 decimal, which has the name of 0NO and record 9, index 16. Sure enough, that tile is totally black. Let's implement it and see how it turns out:

Almost there, but we ran into the same glitch again. Let's look into that. I'm just going to open up the debug CSV to grab the tile value that screwed up... and it looks like 174. In the TILES array, 174 is called RNSLDGM... and it's in record 10, index -1. ARGH. Looks like the out-of-place "extra" graphics really are out of place! And I can't figure out a clear reason why the "extra" tile looks wrong. However, this tile in particular looks like record 10, index 3 (although that is also assigned a tile name).

What I'm going to do is remove the debug image list from before, since we have the tiles nominally identified. Instead, I will create a much shorter list to handle the cases where we run into a -1 tile and re-direct it to the correct index (if, in fact, such a tile exists). Hopefully we can get something working.

So yes, it looks like all the glitched tiles ARE the -1 tiles. Unfortunately, the tile we picked was wrong; the correct tile should have more of a highlight. We also need to find a suitable rope tile, it looks like.

And I can't find either of them! *cries*.

Wait a minute. I am an idiot. Raise your hand if you see the bug below:

That's right, I ALWAYS skip the first image! GAH! This explains the whole subtract 1 thing (which, obviously now, is wrong). Let's fix both problems and re-run to generate:

Yay! For fun, I also ran it on the rest of the maps, and most everything looks okay, except for a few maps:

Looks like there may be some sort of transparency behaviour with the sky colour on other maps? Or a different colour palette? I'll need to investigate when I decode more of the map format. That will be tomorrow's topic.

And finally, day6.zip is available for anyone who wants it. It includes the full output images thus far if anyone wants to see what the current versions of the rest of the maps look like.

Hello and welcome back. Today we will be looking at the rest of the map format. We're looking for two major things to include in our map:
1) Some sort of header information that will hopefully give us more information on background colours
2) Monsters and Pickups!

Remember the garbage at the bottom of this image?

Well, let's try to figure out what it means. There's obviously a repeating pattern here, but it's a bit skewed because we were decoding two bytes at a time. From the image, it looks about 15-16 pixels, or probably 31 bytes in length. Let's see if we can find the bounds of this record in a hex editor. The map is 128*64*2 bytes long, so we should start at offset 0x4000.

Looks about right. But now we need to find the bounds of this region, and to figure out what comes before and after it. Let's start at the end of the file instead.

Well, that's very clearly delineated. There is a number followed by a string then the next string, etc. The number appears to be the string length. The previous section also has a clear empty buffer region between, so I'm going to assume that the last record ends with a non-zero number. If I then grab the last record, it looks like:
F0 02 00 00 00 00 10 00 0C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 05

That seems like a fairly likely record alignment. The first two bytes correspond to a 16-bit number of 752, which is a reasonable decoding. The F0 is at offset 0x4A8E, so let's keep going back in intervals of 31 until we reach a reasonable first record. (0x4A8E - 0x4000) / 31 = 0x57.29, so it looks like we have approximately 0x57 (87) records here. My first guess places the start of these records at address 0x4005.
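The stepping-back arithmetic can be double-checked in a few lines. This is only a sketch with names of my own; the 31-byte record size is still a guess at this stage.

```python
MAP_BYTES = 128 * 64 * 2      # tile data ends here
RECORD_SIZE = 31              # guessed record length in bytes
LAST_RECORD = 0x4A8E          # offset of the F0 byte in the last record

span = LAST_RECORD - MAP_BYTES
steps, leftover = divmod(span, RECORD_SIZE)
first_record = LAST_RECORD - steps * RECORD_SIZE

print(hex(MAP_BYTES), steps, leftover, hex(first_record))
```

The leftover bytes are what sits between the end of the tile data and the first whole record.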

Or, condensed, '<11H2B2HBH'. Time to update our map script to decode this information and write it out to a debug CSV so we can hopefully figure something useful out of it (like an object ID and coordinates).
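A quick sanity check on that format string: struct.calcsize should agree with the 31-byte record length. The round trip below uses made-up field values purely to show how many fields come back; it says nothing about what the fields mean.

```python
import struct

OBJ_RECORD = '<11H2B2HBH'   # the condensed format string from the text

record_size = struct.calcsize(OBJ_RECORD)

# Round-trip a made-up record to show how many fields come back.
fields = tuple(range(17))
packed = struct.pack(OBJ_RECORD, *fields)
unpacked = struct.unpack(OBJ_RECORD, packed)

print(record_size, len(unpacked))
```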

    def debugcsv(self):
        # Remember that the map is height-first. We need to convert to
        # width-first. This only outputs tile data for now.
        # (Python 3 file mode shown; on Python 2 open with 'wb' instead.)
        with open(self.name + '.csv', 'w', newline='') as csvfile:
            writer = csv.writer(csvfile)
            for y in range(64):
                writer.writerow([self.tiles[x*64+y] for x in range(128)])

Well, whatever is in that 65535's column, it is probably signed. However, it's unlikely that data will be relevant for us (since it's normally zero). We need to pick out likely coordinate and item ID columns. The easiest way to do this is to locate a known object (or pattern of objects) in the game world and try to find their representation.

This classic screenshot provides some useful objects. All of the same type in close proximity:

Opening up the pixelmap image in GIMP, drawing in the item locations, then checking their coordinates results in the following zero-based indices:
(3, 11) (3, 12) (2, 13) (4, 13)

1-based:
(4, 12) (4, 13) (3, 14) (5, 14)

And I'm coming up pretty empty. I'm not even sure what could look like coordinates or an index in the data I have. Nothing is obviously a coordinate system, and a direct index would require a number between 0 and 8192, and nothing goes anywhere near this value. But I have a strange idea that the indices could possibly be given in absolute pixel coordinates. Let's divide the first and last columns by 16 and see if we see any patterns.
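The divide-by-16 idea as a tiny sketch. The function name is mine; 752 is the first value of the sample record quoted earlier, while the second coordinate is made up just to have a pair to convert.

```python
TILE_SIZE = 16   # pixels per tile edge, from the screenshot earlier

def pixel_to_tile(px, py):
    """Convert absolute pixel coordinates to zero-based tile coordinates."""
    return px // TILE_SIZE, py // TILE_SIZE

tile = pixel_to_tile(752, 176)
print(tile)
```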

Hrm, not a perfect match, but close. Also, every entry appears to be within the bounds for a 64 x 128 map, and appears sane. I'm not going to implement today, but I have a theoretical decoding. I will number the columns below:

The nominal sprite dimensions entry I guessed because I noticed the coordinates 2, 8 would put something pretty close to the start of the level and the dimensions were 24, 40 (same as the player sprite). Also, the first-level monster sprite dimensions are 38, 25, which shows up several times in this file.

Looking at the decoding from yesterday again, I think I misaligned the record start after all. I think the 20 00 at the start of the record area is actually the X coordinate for the first record. That means that the trailing number of 05 is its own thing. Let's move that around and see if that fixes our "almost correct" alignment for the pickups at the start. Remember, we expected either
0-based: (3, 11) (3, 12) (2, 13) (4, 13)
or 1-based: (4, 12) (4, 13) (3, 14) (5, 14)

I also switched the half-words to signed due to the strange 65535 entries. Nothing appears to "normally" approach the 16 bit boundary, so this should be safe. I also changed how I skip/capture the apparently unused map region. Hopefully I can find a pattern for the maps with different background colours, when I get there. Here's the updated code:

        self.objs = [struct.unpack(objrecord,
            mapfile.read(struct.calcsize(objrecord)))
            for i in range(numobjs)]

        # There always appears to be a 0x61 byte unknown region between
        # the records and strings. Let's just collect it as bytes for now.
        unknownregion = '<97B'
        self.unknown = struct.unpack(unknownregion,
            mapfile.read(struct.calcsize(unknownregion)))

Okay, now we think we have a reasonable decoding, and a few sample mappings (corrected due to the misalignment):
0 - Player start
33 - Lolly Pop
25 - Monster

Unfortunately, pickup/monster/object sprites are typically defined in code due to the additional logic that is typically needed. It's unlikely we will find a direct mapping like we did for the Tiles. That said, there usually isn't an overwhelming number of interactable objects in a game, so we should be able to identify them as we go. We know there aren't more than 256 of them, after all. In order to track the mappings, we should start a new Python file for the sprite database. This will contain a lookup table for record entries, and a method to grab the correct sprite for a given sprite ID. We will also re-use the debug code from unknown tiles to provide placeholders for unknown sprites.

Finally, we need to make a decision on how sprites get drawn into the world. There are two options: have the mapper handle everything and add additional logic there, or defer processing to the sprite file, and pass in a reference to the map-in-progress. I'm going to choose the latter, because it helps keep all the sprite processing in one place. Here's the first cut at the sprite file:

Looking good. A couple things to note that we will need to take care of:
1) There are some objects that appear functional but not visual in nature (object IDs 17 and 63). We could provide markup to indicate what we think they mean, but I think a better approach would be to simply not draw anything.
2) The player start appears slightly in the ground. We will either need to figure out an alignment algorithm to correct this sort of thing, or make a special case for the player sprite.
3) The sprites have dimensions. We should take this into account for our debug images to make them easier to identify by shape as well as location.

To make this sort of thing easier, I'm going to expand our spritedb file into an actual class which will keep a copy of the corresponding sprite images. This way it can generate debug images on-the-fly when a sprite is not found, and we can make specific entries for empty sprites. To consolidate functionality, I made the debug image function return empty sprites if called with 0 width and height.

And now, we get to identifying. Time to fire up DosBox. Those 51 sprites look like movable clouds, so I'll go find their sprite first. Once I get in-game, I'm going to go ahead and start identifying world map sprites too.

Hrm, it actually looks like the map goes by different sprite IDs!

That's fine, we can maintain two lookups. I'm going to go ahead and split it, then identify a few sprites. Then again, that doesn't look to be enough info. For instance, the 88 sprites on the map screen appear to be mountains, but mountains have two sides. I think there is some sort of sub-ID going on here. And I think I found it:

I think it's that column that counts 2, 3, 6, 2, 3, 0, 1. Next I'm going to need to expand the sprite identification to take that into account. But that's a task for tomorrow. I'll also need to figure out some way to clearly indicate the sub ID in the debug image. I may need to move back to 8 point font.

Hello folks. Let's get things restructured to allow item variants. I'm also going to introduce two new classes to make things more flexible: an object record class so I don't need to manually specify decoding every time, and a sprite class. The sprite class will be used to allow custom handling for certain sprites, like the player sprite (which needs an offset).
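Here is a rough sketch of how those two classes and a two-level lookup might fit together. Everything in it is illustrative: the record is trimmed down to four hypothetical fields, images are stand-in strings rather than PIL images, and the player's subtype and offset values are made up.

```python
from collections import namedtuple

# Hypothetical, trimmed-down record: the real one has more columns.
ObjRecord = namedtuple('ObjRecord', ['sprtype', 'x', 'y', 'subtype'])


class Sprite:
    """An image plus an optional pixel offset applied when drawing."""

    def __init__(self, image, xoffs=0, yoffs=0):
        self.image = image
        self.xoffs = xoffs
        self.yoffs = yoffs

    def position(self, objrec):
        """Top-left corner where this sprite should be pasted."""
        return (objrec.x + self.xoffs, objrec.y + self.yoffs)


class SpriteDB:
    """Sprites keyed first by sprite ID, then by sub-ID."""

    def __init__(self):
        self.sprites = {}

    def addsprite(self, sprtype, subtype, sprite):
        self.sprites.setdefault(sprtype, {})[subtype] = sprite

    def lookup(self, objrec):
        """Return the sprite for this record, or None if not yet identified."""
        return self.sprites.get(objrec.sprtype, {}).get(objrec.subtype)


db = SpriteDB()
# Made-up subtype and offset, purely to show the player special case.
db.addsprite(0, 4, Sprite('player image', yoffs=-8))
player = ObjRecord(sprtype=0, x=64, y=128, subtype=4)
print(db.lookup(player).position(player))
```

A `None` result from `lookup` is where a debug placeholder image would be generated instead.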

Next we need to update our sprite db and graphics file (i.e. debug image) to support the two identifiers. Let's do the debug image first. Note that I'm changing this to draw a slightly larger image if needed, and make the image rectangle semi-transparent. I also stopped bothering to pick different shades of gray.

def draw(self, mappicture, objrec):
    # When pasting masked images, need to specify the mask for the paste.
    # RGBA images can be used as their own masks.
    mappicture.paste(self.image, (objrec.x + self.xoffs,
        objrec.y + self.yoffs), self.image)

Excellent. Now it's time to ACTUALLY get decoding things, since we won't end up with overlapping definitions any more. Things are pretty straightforward at this point. Here's the properly identified first section of the map:

And a bit further into level 1:

But it looks like our interior room has a special sprite for the in-game text.

That's going to require something special to implement. Also, we're going to want to indicate what is INSIDE a present, since we can decode that information. That too, will require special handling. With the architecture we have now, we can simply set up subclasses of the sprite class to draw more complicated information. Unfortunately, we're out of time for today, so that will have to be tomorrow.

Good evening. Time to tackle the two enhancements we wanted yesterday. Enhancement #1 is for present contents. For these I'm just going to expand the capabilities of the basic sprite class and not even make a separate class. I will simply add an optional parameter for the content sprite. When drawing the sprite, if contents are assigned, they will be drawn immediately above the sprite itself.

def draw(self, mappicture, objrec):
    # When pasting masked images, need to specify the mask for the paste.
    # RGBA images can be used as their own masks.
    mappicture.paste(self.image, (objrec.x + self.xoffs,
        objrec.y + self.yoffs), self.image)

That was easy. Now for text. There are two approaches we can use for text:
1) Use the raw text images we have to reproduce the in-game text
2) Find a close match and just go with it

That said, what we got directly out of the reference file appears a bit strange and would need some post-processing to get it to appear correct. The second option is certainly simpler, and I will go with that for now. As long as I can find a fixed-width font with reasonably similar weight and size, it should be pretty equivalent. For reference, here is the direct font data from the GRAPHICS file:

Stage 1 is easy, but we need to figure out the general solution for picking the correct text string. For that, we will refer to stages 3 and 33 (aka the ending).

Let me include the records for object 7 from each stage together below (stage # is the first column). I will also number the columns from the original file as a zero-based index along top.

We've already identified indices 0 to 2 and 5 - 7, although the obvious choice of the subtype does not appear to be used to select the specific string to be displayed. The only one that appears different for each string is column 13, but that doesn't make much sense on its own. That said, I think column 3 might be the colour index, of the first 16 colours in the palette. 9 is blue, 7 is white, 6 is yellow, 8 is dark gray.

Hrm, giving it a bit more of a look, it's possible that it might be part of a 16-bit number. The wrap around from 1 72 to 250 71 seems to imply that. Let me fix my decoding and try again. Here are just the numbers, (+ some in hex)
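As a quick check of the wrap-around noted above, here is a minimal sketch that treats two adjacent byte columns as one unsigned 16-bit value. The little-endian byte order (low byte first) is an assumption on my part, picked because it turns "1 72" followed by "250 71" into a small decrease rather than a jump.

```python
import struct


def as_uint16(low, high):
    """Combine two adjacent byte columns into one little-endian 16-bit value."""
    return struct.unpack('<H', bytes([low, high]))[0]


# The two column pairs quoted above: "1 72" and "250 71".
first = as_uint16(1, 72)
second = as_uint16(250, 71)

print(first, hex(first))    # 18433 0x4801
print(second, hex(second))  # 18426 0x47fa
print(first - second)       # 7
```

Read this way, the two values are only 7 apart, which is at least consistent with the column being one 16-bit field.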

I don't get it. I'm just going to have to hack something together. The ONLY pattern I think I can figure out is that these numbers are decreasing compared to the order of the strings. I also noticed that apparently object #17 also uses strings, as does object 6. Object 17 appears to be for internal use, so I won't actually draw those, but object 6 appears to be another string region (different font maybe?). Either way, here's what my hacked up approach will be:
1) Look for all numbers in this column, and collect them in a list in descending order.
2) When I have a string to grab, I will find the position of the record's number in that list and pull the string at the same position in the strings array.

That should work for maps 1 and 3, but I have no idea how well it will sort itself out for map 33. At least it will get us un-stuck for now. First, let's add the new fields to the record:

# String reference lookup table. This is a bit of a hack for now.
# Sort all known string references in reverse order:
self.stringlookup = [record.stringref for record in self.objs if record.stringref > 0]
self.stringlookup.sort(reverse=True)
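To show how that table then resolves a string, here is a small self-contained sketch of the whole hack. The `Record` type, the helper names, and the sample values are all made up for illustration. Only the sort-descending-then-index idea comes from the approach described above.

```python
from collections import namedtuple

# Hypothetical stand-in for the real object record; only stringref matters here.
Record = namedtuple('Record', ['sprtype', 'stringref'])


def build_string_lookup(objs):
    """Collect every non-zero string reference, largest first."""
    lookup = [rec.stringref for rec in objs if rec.stringref > 0]
    lookup.sort(reverse=True)
    return lookup


def string_for(record, lookup, strings):
    """Return the string whose position matches this record's rank in lookup."""
    return strings[lookup.index(record.stringref)]


# Made-up sample data: three records carry string references, one does not.
objs = [Record(7, 18426), Record(0, 0), Record(7, 18433), Record(17, 18400)]
strings = ['first string', 'second string', 'third string']

lookup = build_string_lookup(objs)
print(lookup)
print(string_for(objs[0], lookup, strings))
```

One caveat with this sketch: if two records ever shared the same reference, the list would contain a duplicate and every later index would shift by one. I'm assuming that doesn't happen.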

To be able to do the text lookup, I need to pass a reference to the map into my sprite draw routine. Then I need to actually write the text sprite class to take advantage of all this new infrastructure. For the font, I'll just go with DroidSansMono for now. I'll pick 10 point font for object 7 and 8 point font for object 6. Here's how the text sprite class ended up:

But the font is a bit too thin. Let me see if I can find a monospaced font that has a dedicated bold version. I'll also tweak the alignment a bit.

Here's FreeMonoBold, with a bit larger size. It'll do for a while. It's not that great, so I think we will want to get some handling for the ingame font. But that can wait now that we have the actual text display working nicely.

Now on to identifying items. Stage 1 is done, so stage 2 is next.

But it looks like we hit a snag on level 2. There is a centipede monster in this stage, which is stored as a series of segments in the graphics file, so we can't just draw a single sprite. We need a method for creating a compound image to draw him properly. Also, sprite 73 appears to be a hidden pickup. We will want to indicate these in some way (probably by making them semi-transparent). I will tackle all this tomorrow. Both cases should be just a matter of pre-processing the images before creating the corresponding sprite object and shouldn't require their own types of sprites.

Today I'm going to handle the slightly more complicated sprites we want to handle. First, here's a screenshot of the centipede monster we want to re-create from the individual segment sprites:

I count a head, six segments, then a tail. The map data tells me that the bounding box is 76 x 22, although the actual sprite dimensions vary from 16x20 (head) to 8x17 (segment) to 12x8 (tail). However, each segment appears to connect directly to the next with no padding on the sprite image itself, so that will make things easy. The required widths sum to exactly 76 (16 + 6 x 8 + 12), so there's some additional confirmation. Let's make up a method for making a composite sprite. I'm adding this to the graphics file, since it owns the original images. It could have also been added to the spritedb file, but it would need a copy of the graphics file, so that seems somewhat silly.
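The width arithmetic can be sketched as a small layout helper. This only works out the horizontal offset of each piece. The left-to-right ordering (head first) and the function name are assumptions, and vertical alignment inside the 22-pixel-high box is deliberately left out.

```python
def composite_layout(widths):
    """Return the x offset of each piece laid left to right, plus the total width."""
    offsets = []
    x = 0
    for width in widths:
        offsets.append(x)
        x += width
    return offsets, x


# Head, six body segments, tail (widths from the sprite dimensions above).
widths = [16] + [8] * 6 + [12]
offsets, total = composite_layout(widths)

print(offsets)  # [0, 16, 24, 32, 40, 48, 56, 64]
print(total)    # 76
```

With PIL, each piece would then be pasted at its offset into a new image that is `total` pixels wide.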

On to transparency. Again, I'll add this to the graphics file, although it does not really need to go there. It's going to be a static method that simply takes the image to make transparent and the desired alpha value. We're simply going to use the multiply channel operation to perform this:

# Pickups appear to be in the same order as their corresponding record.
# There are two types of pickups: normal and hidden.
for subtype in range(24):
    self.addsprite(33, subtype, sprite(graphics.records[37].images[subtype]))
    self.addsprite(73, subtype, sprite(graphics.semitransparent(
        graphics.records[37].images[subtype], 128)))
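For reference, the effect of multiplying the alpha channel can be sketched on plain pixel tuples, without PIL. The function name and the list-of-tuples representation are illustrative only, and the integer rounding here may differ slightly from what PIL's channel operations produce.

```python
def semitransparent_pixels(pixels, alpha):
    """Scale the alpha of each (r, g, b, a) pixel by alpha / 255.

    Pixels that are already fully transparent (a == 0) stay transparent,
    which is why a multiply is used rather than simply overwriting alpha.
    """
    return [(r, g, b, a * alpha // 255) for (r, g, b, a) in pixels]


# One opaque red pixel and one fully transparent green pixel.
pixels = [(255, 0, 0, 255), (0, 255, 0, 0)]
faded = semitransparent_pixels(pixels, 128)
print(faded)  # [(255, 0, 0, 128), (0, 255, 0, 0)]
```

Multiplying rather than overwriting means the sprite's existing mask survives, so the hidden pickup fades without gaining a visible rectangle around it.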

Looking good:

Stage 2 is done now. On to stage 3.

Well now. Stage 3 has a black background (that sometimes flashes with lightning). That is a far cry from the blue the map generates for it. Taking a look at the colour palette of the screenshot, it is indeed different. Boards 6 and 7 also look like they should have black backgrounds. However, nothing in the map data seems to distinguish those maps from the maps with blue backgrounds. I think I'm just going to have to go by the map number for this.

I'm going to guess it's that column that goes 2, 3, 4, 0, 5, 3. To avoid totally re-architecting my sprite algorithm for ONE special case, I'm going to create a new type of sprite for treasure boxes, and yank the contents handling from the normal sprite class, since it doesn't get used in any other context.

Well, that works and all, but it misidentifies several treasure boxes. Hrm, comparing map 1 to this map, it appears that map 1 always uses '3' in the 'colour' field, while this map uses 0 and 1. Let's assume it's the colour field instead and re-do this.

Fixed that. Next is the palette thing. To do this, we need to defer the masking operation in the graphics file. We should store the images in index format initially, then create the masked version on-demand. Since we already have other functions directly referencing graphics.records[n].images[m], we should store the raw data in a different location:

Then we just need to change palettes for each map. We could go by the map name, but I noticed the first byte in the "unknown" region of the map data appears to be the map number. I'm going to go by that.
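A minimal sketch of that selection, assuming the unknown region has already been unpacked with '<97B' as before. The set of dark map numbers is a placeholder to be filled in as maps are checked, and the palette labels are just strings standing in for real palette data.

```python
import struct

# Placeholder set of map numbers that use the dark palette (an assumption).
DARK_MAPS = {3, 6, 7}


def map_number(unknown):
    """The first byte of the 97-byte unknown region appears to be the map number."""
    return unknown[0]


def palette_name(unknown):
    """Pick a palette label for a map: 'dark' or 'normal'."""
    return 'dark' if map_number(unknown) in DARK_MAPS else 'normal'


# Fake 97-byte region for illustration: map number 3 followed by zeros.
raw = bytes([3]) + bytes(96)
unknown = struct.unpack('<97B', raw)
print(palette_name(unknown))  # dark
```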

And it works as expected. Now we don't have the garish blue background in the dark levels. Time to continue identifying sprites in level 3.

Well, I ran into another interesting thing. Sprite 12 can sometimes be visible! We need to determine what decides this and adjust accordingly. Checking the object CSV, it looks like it has a 1 in what I called the "colour" column. Looks like I should expand my treasure box sprite into something that can also handle other similar variable sprites. I will then have to make the contents optional again. Let's see how that turns out:

Good morning folks. When doing the switches and treasure boxes yesterday, I came to a realization. The column that I first attempted as the treasure box appearance actually seems to be the link between a switch and what it affects. Why don't we try drawing this identifier into the world to clearly indicate the effects of switches? If we don't like how it turns out, we can always disable it again.

To do this, I'm simply going to create a new method in the spritedb file to draw this onto the map. Since this is NOT debug text, we're going to want to draw this in a way that is more visible. I'm going to re-use the code from my Shadow Caster map to draw it first offset a couple times in black, then in white to get a bordered effect.
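The bordered effect can be sketched independently of the drawing library. Here `draw_text` is any callable taking (x, y, text, colour). With PIL it could wrap the text method of an `ImageDraw.Draw` object. The four diagonal one-pixel offsets are my guess at "a couple times", not the exact Shadow Caster code.

```python
def draw_outlined_text(draw_text, x, y, text):
    """Draw text with a border: offset copies in black, then white on top."""
    black = (0, 0, 0)
    white = (255, 255, 255)
    # Assumed offsets: one pixel towards each diagonal.
    for dx, dy in [(-1, -1), (1, -1), (-1, 1), (1, 1)]:
        draw_text(x + dx, y + dy, text, black)
    # The white pass goes last so it sits on top of the black copies.
    draw_text(x, y, text, white)


# Record the calls instead of drawing, to show the order of the passes.
calls = []
draw_outlined_text(lambda x, y, t, c: calls.append((x, y, t, c)), 10, 20, '5')
for call in calls:
    print(call)
```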

Well, it seems to show up in too many places. I'm going to comment it out for now, but let's keep it in the back of our minds. We may still want something similar, but only for specific scenarios. This also appears to provide links for doorways, which is always handy.

In any case, I'm just going to go ahead and identify items in stage 4 now. Though I'm getting really tired of going in and out of subfolders to try to find sprites. I'm going to flatten my export structure from the graphics file, and remove the record offset.

The first few sprites weren't anything special, although I found the sprite number of an illusionary wall, which I made semi-transparent. However, I ran into the following location on the map:

Which doesn't contain any unknown sprites in my current map. This must be another object I'm not drawing. Let me comment out my hidden objects and find which one it is. But even turning off every identified sprite STILL doesn't draw anything in this location. It doesn't make any sense. I will finish mapping this stage, then double-check my map decoding to make sure I'm not missing any sprites somehow.

I also ran into ceiling spikes, which apparently share the same ID as floor spikes. They are probably differentiated by the "colour" field. I'm going to go ahead and rename that field to appearance. Though looking at the objs file, the distinguishing value appears to be in the next field over. I'll tentatively call this one "direction" and update the variablesprite class to specify which field to look up. I'll have to access the raw underlying Python field dictionary to make it work, though.
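A rough sketch of a sprite that picks its image from a configurable record field. The class layout, the stand-in record, and the 0/1 values for floor versus ceiling are all assumptions for illustration. Images are plain strings here rather than PIL images.

```python
class ObjRecord:
    """Hypothetical record holding decoded fields as plain attributes."""

    def __init__(self, **fields):
        self.__dict__.update(fields)


class VariableSprite:
    """Chooses one of several images based on a named field of the record."""

    def __init__(self, images, field='appearance'):
        self.images = images  # maps field value -> image
        self.field = field    # which record field selects the image

    def image_for(self, objrec):
        # Look the field up by name in the record's attribute dictionary.
        return self.images[objrec.__dict__[self.field]]


# Assumed values: 0 for floor spikes, 1 for ceiling spikes.
spikes = VariableSprite({0: 'floor spikes', 1: 'ceiling spikes'},
                        field='direction')
print(spikes.image_for(ObjRecord(appearance=0, direction=1)))
```

Because the field name is passed in, the same class can serve treasure boxes keyed on appearance and spikes keyed on direction.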

Looking over the objects file, I figured out what was going on. Those bouncing ball traps reported themselves as having no size in the objects file! This malfunctioned by creating an invisible debug image. I'm going to go ahead and switch to always generating a debug image. For any sprites I did not want to draw, I'm going to redirect to record 36, image 28 instead (i.e. an empty sprite).

With that, stage 4 is complete.

day12.zip is available. I've moved the CSVs and flat maps into a sub-folder in order to better organize things.

