Friday, June 22, 2012

IFF file format experiments

I haven't written any blog posts for a while, but don't worry... I'm still alive, though very busy writing my PhD thesis. There is still a fun project that produced some interesting results a while ago, but so far I never allowed myself the time to report on it.

A while ago, I wrote a blog post about my second computer, the Commodore Amiga. I also mentioned that one of my favourite Amiga programs was Deluxe Paint, which stores images in a so-called "IFF file format", which was ubiquitous on the AmigaOS and supported by many Amiga programs.

Nowadays, this file format is rarely used and poorly supported in modern applications. Most common viewers cannot open it, although a number of more advanced programs can. However, the quality of their implementations typically differs, as do the features that they support.

What is the IFF file format?

IFF is an abbreviation for Interchange File Format. Quite often, people think that it is just a format to store images, as the most common IFF application format is the InterLeaved BitMap (ILBM) format used by Deluxe Paint and many other programs.

In fact, IFF is a general-purpose container format for structuring data. Apart from pictures, it is also used to store 8-bit audio samples (8SVX), musical scores (SMUS), animations (ANIM) and several other formats. The IFF file format as well as a number of application file formats were designed by Electronic Arts, nowadays a well-known game publishing company, and described in several public domain specifications, which everybody was allowed to implement.

The confusion around the IFF file format is similar to the confusion around the Ogg file format: Ogg files are quite often mistakenly identified as Vorbis audio files, as Vorbis is the most common Ogg application file format. In fact, Ogg is a container format for bitstreams, while Vorbis is an application format that provides lossy audio compression and decompression. There are many other Ogg application formats, such as Theora (for video) and Speex (for speech).

IFF concepts

Conceptually, IFF files have a very simple structure. Every IFF file is divided into chunks. Each chunk consists of a 4-character identifier followed by a signed 32-bit integer describing the chunk size, followed by the given number of bytes representing data:

The picture above shows a very simple example chunk with the identifier 'BODY', which contains 24000 bytes of data. The data in the chunk body represents pixel data.

Although the concept of IFF files using chunks is ridiculously simple, it immediately offers a number of useful features for handling file formats. By looking at a chunk identifier, a program can determine whether it contains useful information to present to end users or whether a chunk is irrelevant and can be skipped. Furthermore, the chunk sizes indicate how much data has to be read or how many bytes must be skipped to reach the next chunk. Using these attributes makes it possible to implement a robust parser capable of properly retrieving the data that we want to present.
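To make this a bit more concrete, here is a minimal Python sketch of the idea. It is not taken from the libraries described in this post; the function names read_chunk_header and find_chunk are made up for illustration. It reads the 8-byte chunk header and uses the size to skip chunks it is not interested in:

```python
import io
import struct

def read_chunk_header(stream):
    """Read a chunk header: a 4-character ID and a signed 32-bit big-endian size."""
    header = stream.read(8)
    if len(header) < 8:
        return None  # end of data (or a truncated header)
    chunk_id, size = struct.unpack(">4si", header)
    return chunk_id.decode("ascii"), size

def find_chunk(stream, wanted_id):
    """Skip chunks until one with the wanted ID is found and return its data."""
    while True:
        header = read_chunk_header(stream)
        if header is None:
            return None  # no such chunk
        chunk_id, size = header
        if size < 0:
            raise ValueError("negative chunk size")
        if chunk_id == wanted_id:
            return stream.read(size)
        # Irrelevant chunk: the size tells us how many bytes to skip.
        # Odd-sized chunks are followed by one pad byte (an IFF rule
        # described later in this post).
        stream.seek(size + (size & 1), io.SEEK_CUR)

# Two chunks: a made-up 'ABCD' chunk that we skip, and a tiny 'BODY' chunk.
data = (b"ABCD" + struct.pack(">i", 4) + b"\x01\x02\x03\x04" +
        b"BODY" + struct.pack(">i", 2) + b"\xaa\xbb")
body = find_chunk(io.BytesIO(data), "BODY")
```

A program that does not know the 'ABCD' chunk never has to understand its contents; the size field alone is enough to step over it.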

In principle, every chunk captures custom data. Apart from data chunks, the IFF standard defines a number of special group chunks, which can be used to structure data in a meaningful way. The IFF standard defines three types of group chunks:

The FORM chunk contains an arbitrary collection of data chunks or other group chunks, as shown in the picture above. Our example defines a FORM which has the type ILBM. In principle, every application file format is essentially a form in which the form type refers to the application file format identifier. In the body of the FORM several data chunks can be found:

The BMHD defines the bitmap header containing various general settings, such as the width, the height and the number of bitplanes (and thus colors) used.

The CMAP defines the red, green and blue color channel values of each color in the palette.

The BODY chunk contains pixel data.

The CAT chunk may only contain a collection of group chunks, that is, only FORM, CAT or LIST chunks.

The LIST chunk is an extended CAT chunk that also contains a number of PROP chunks. PROP chunks are group chunks which may only reside in a LIST and contain a collection of data chunks. These data chunks are shared properties of all group chunks inside the LIST. For example, a LIST containing ILBM FORM chunks may use a PROP chunk containing a CMAP chunk, whose purpose is to share the same palette over a number of bitmap images.
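The nesting of group chunks lends itself to a small recursive parser. The following Python sketch is only an illustration (the helper names parse_chunks and make_chunk are hypothetical and not part of any library mentioned here). It assumes that every group chunk body starts with a 4-character type, followed by the nested chunks, and it already honours the padding rule for odd-sized chunks that is described further below:

```python
import struct

# The four group chunk identifiers; note the trailing space in 'CAT '.
GROUP_IDS = {b"FORM", b"CAT ", b"LIST", b"PROP"}

def parse_chunks(data, offset, end):
    """Parse the consecutive chunks in data[offset:end] into nested tuples."""
    chunks = []
    while offset + 8 <= end:
        chunk_id, size = struct.unpack_from(">4si", data, offset)
        body_start = offset + 8
        body_end = body_start + size
        if size < 0 or body_end > end:
            raise ValueError("chunk %r has an invalid size" % chunk_id)
        if chunk_id in GROUP_IDS:
            # Group body: a 4-character type (e.g. ILBM), then nested chunks.
            group_type = data[body_start:body_start + 4]
            children = parse_chunks(data, body_start + 4, body_end)
            chunks.append((chunk_id, group_type, children))
        else:
            chunks.append((chunk_id, data[body_start:body_end]))
        offset = body_end + (size & 1)  # skip the pad byte after odd sizes
    return chunks

def make_chunk(chunk_id, body):
    """Build a chunk for the example below (pads odd-sized bodies)."""
    pad = b"\x00" if len(body) & 1 else b""
    return chunk_id + struct.pack(">i", len(body)) + body + pad

# A LIST with a shared palette (PROP) and one ILBM FORM.
prop = make_chunk(b"PROP", b"ILBM" + make_chunk(b"CMAP", b"\x00\x00\x00"))
form = make_chunk(b"FORM", b"ILBM" + make_chunk(b"BODY", b"\x01\x02"))
iff = make_chunk(b"LIST", b"ILBM" + prop + form)
tree = parse_chunks(iff, 0, len(iff))
```

The sketch does not enforce the structural rules (for example, that a CAT only contains group chunks); a real conformance checker would have to verify those as well.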

Application file formats can define their own application specific chunks and their attributes. For example, the ILBM file format defines the BMHD (BitMap Header) chunk, containing important attributes of an image, such as the width, the height and the number of bitplanes used, and the BODY chunk, which stores the actual graphics data.
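As an illustration of such an application chunk, the Python sketch below decodes the 20 bytes of a BMHD chunk. The field layout follows the ILBM specification as I understand it; the Python field names and the parse_bmhd function are my own invention for this example:

```python
import struct
from collections import namedtuple

# Field order of the 20-byte BMHD chunk (names chosen for this sketch).
BitMapHeader = namedtuple("BitMapHeader", [
    "width", "height",        # unsigned 16-bit: image size in pixels
    "x", "y",                 # signed 16-bit: position of the image
    "num_planes",             # unsigned 8-bit: number of bitplanes
    "masking", "compression", # unsigned 8-bit each
    "pad",                    # unused byte that keeps the fields word aligned
    "transparent_color",      # unsigned 16-bit
    "x_aspect", "y_aspect",   # unsigned 8-bit: pixel aspect ratio
    "page_width", "page_height",  # signed 16-bit: source page size
])

BMHD_FORMAT = ">HHhhBBBBHBBhh"  # '>' selects big-endian byte order

def parse_bmhd(data):
    """Decode the body of a BMHD chunk into a BitMapHeader tuple."""
    size = struct.calcsize(BMHD_FORMAT)  # 20 bytes
    if len(data) < size:
        raise ValueError("BMHD chunk is too short")
    return BitMapHeader(*struct.unpack(BMHD_FORMAT, data[:size]))

# A 320x200 image with 5 bitplanes and compression enabled.
raw = struct.pack(BMHD_FORMAT, 320, 200, 0, 0, 5, 0, 1, 0, 0, 10, 11, 320, 200)
header = parse_bmhd(raw)
max_colors = 1 << header.num_planes  # 5 bitplanes give 32 palette entries
```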

Apart from these basic concepts, IFF has a number of other small requirements:

If a chunk size is odd, then the chunk data must be padded with an extra 0 byte, so that the next chunk always starts at an even address in memory. In our example form shown earlier, the CMAP chunk is padded. This requirement was introduced because the 68000 processor (which the Amiga uses) can only access 16-bit and 32-bit integers at even addresses in memory.

Also, application file format attributes of word and long word size must be word aligned (stored on even addresses in memory).

All integers must be big-endian, because the Amiga was a big-endian system. This means that on little-endian systems, such as PCs, the byte order of integers has to be reversed.
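The padding and byte order requirements can be sketched in a few lines of Python. This is only an illustration, with a hypothetical write_chunk function, and it assumes that the size field records the unpadded length of the data:

```python
import struct

def write_chunk(chunk_id, body):
    """Serialise one chunk: ID, big-endian size, data and an optional pad byte."""
    if len(chunk_id) != 4:
        raise ValueError("a chunk ID must be exactly 4 bytes")
    # The size field holds the real data length; the pad byte is not counted.
    pad = b"\x00" if len(body) % 2 else b""
    return chunk_id + struct.pack(">i", len(body)) + body + pad

# A CMAP with a single color is 3 bytes long, so it gets one pad byte.
cmap = write_chunk(b"CMAP", bytes([255, 128, 0]))

# The same 32-bit value in both byte orders: IFF always uses the first one.
big = struct.pack(">i", 24000)     # b'\x00\x00]\xc0'
little = struct.pack("<i", 24000)  # b'\xc0]\x00\x00'
```

On a little-endian machine, reading the size field with the native byte order would yield a completely different number, which is why the byte order is made explicit here.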

IFF file format support

The IFF file format is simple yet powerful, and it served its purpose really well when the Amiga was still alive. For a very large and cool experiment (which I will keep secret for a while) I wanted to open ILBM images in an application using SDL (a cross-platform library frequently used to develop games and multimedia applications), as well as modify ILBM files and save them. I ran into several issues:

Support for most IFF application formats is not present in many common viewers and players. However, some more advanced programs support it. For example, Paint Shop Pro and the SDL_image library have support for viewing ILBM images.

These applications all have their own implementation of a specific IFF application format. Some implementations are good, others lack certain features. For example, I have seen several viewers not supporting the special Amiga screen modes, such as Extra HalfBrite (EHB) and Hold-and-Modify (HAM) or the color range cycle chunks.

Applications can open simple IFF files that consist of a single FORM, but do not know how to deal with IFF scrap files, i.e. CATs/LISTs containing multiple FORMs of various types, possibly with shared options.

Most applications can view IFF application formats, but cannot write them or check for their validity, which may result in crashes if invalid IFF files are opened.

A number of open file formats have generic parser libraries, e.g. PNG (libpng), JPEG (libjpeg), GIF (giflib), Ogg (libogg), Vorbis (libvorbis) etc. that applications use to open, parse and save files. There is no equivalent for ILBM and other IFF application formats.

IFF libraries experiment

After running into these issues, I decided to take a look at the IFF specification to see how hard it could be to implement the stuff I needed. After reading the standard, I started appreciating the IFF file format more and more, because of its simplicity and practical purpose.

Furthermore, instead of implementing yet another crappy parser that only supports a subset, I decided to do it right and to develop a set of general, good quality, reusable and portable libraries for this purpose, with similar goals to the other file format libraries, so that application programs can support IFF application file formats as easily as the common file formats that we use nowadays.

I also think it's good to have file formats that used to be widespread properly supported on modern platforms. Finally, it looks like fun, so why not do it? I did a few experiments that resulted in a number of interesting software packages.

Implementing an SDL ILBM viewer

First, I decided to implement support for my primary use case: proper ILBM image support in SDL applications. I have implemented an SDL-based viewer program, having the following architecture:

In the picture above, several components are shown:

libiff. This library implements the properties defined in the IFF specification, such as parsing data chunks and group chunks. Furthermore, it also supports writing IFF files as well as conformance checking.

libilbm. This library implements the application chunks as well as the byte run compression algorithm defined in the ILBM specification. Furthermore, it supports several extension chunks and the file format used by the PC version of Deluxe Paint (which has several minor differences compared to the Amiga version). Application chunks are parsed by defining a table with function pointers to the ILBM functions that handle them and passing that table to the IFF library functions.

libamivideo. This library acts as a conversion library for Amiga graphics data. As explained earlier, the Amiga uses bitplanes to organise graphics and has several special screen modes, Extra HalfBrite (EHB) and Hold-and-Modify (HAM), to display more colors than the predefined color registers provide. In the SDL viewer we use the libamivideo library to convert Amiga graphics data to chunky or RGB graphics and to emulate the special screen modes.

However, images saved by the PC version of Deluxe Paint (which have the PBM form type instead of ILBM) do not use bitplanes but chunky graphics, and thus conversion is not necessary.

SDL_ILBM. This package contains a high level SDL library as well as the ilbmviewer command-line tool, directly generating SDL surfaces from IFF files containing ILBM images as well as performing the required conversions automatically.
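Two of the steps performed by these components can be illustrated with a short Python sketch: unpacking the byte run compression and converting bitplanes to chunky pixels. This is not the code of libilbm or libamivideo; the function names are hypothetical, and the sketch assumes the byte run scheme as I understand it from the ILBM specification (a control byte n from 0 to 127 means "copy the next n+1 bytes", -1 to -127 means "repeat the next byte -n+1 times" and -128 does nothing). It also assumes that the leftmost pixel is the most significant bit of each bitplane byte and that bitplane 0 holds the least significant bit of the color index:

```python
def unpack_byterun(packed, expected_size):
    """Decompress byte run encoded data until expected_size bytes are produced."""
    out = bytearray()
    i = 0
    while len(out) < expected_size and i < len(packed):
        n = packed[i]
        i += 1
        if n < 128:
            # Literal run: copy the next n+1 bytes unchanged.
            out += packed[i:i + n + 1]
            i += n + 1
        elif n > 128:
            # Replicate run: as a signed byte n is n-256, so the
            # repeat count -(n-256)+1 equals 257-n.
            if i >= len(packed):
                raise ValueError("truncated replicate run")
            out += bytes([packed[i]]) * (257 - n)
            i += 1
        # n == 128 (-128 as a signed byte) is a no-op
    return bytes(out)

def planar_row_to_chunky(planes, width):
    """Combine one row of each bitplane into a list of color indices."""
    pixels = []
    for x in range(width):
        byte_index = x >> 3
        bit = 7 - (x & 7)  # the leftmost pixel is the most significant bit
        value = 0
        for plane_number, plane in enumerate(planes):
            value |= ((plane[byte_index] >> bit) & 1) << plane_number
        pixels.append(value)
    return pixels

# A literal run of 3 bytes, a run of 3 equal bytes and a no-op.
unpacked = unpack_byterun(bytes([0x02, 0xAA, 0xBB, 0xCC, 0xFE, 0x11, 0x80]), 6)

# Two bitplanes of 8 pixels each give color indices between 0 and 3.
row = planar_row_to_chunky([bytes([0b10101010]), bytes([0b11001100])], 8)
```

In this example, row becomes [3, 2, 1, 0, 3, 2, 1, 0]: each pixel collects one bit from every bitplane. Emulating EHB and HAM requires additional steps on top of this, which are not shown here.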

Usage of the SDL ILBM viewer is straightforward:

$ ilbmviewer picture.IFF

The viewer can also be used to view IFF scrap files. For example, it may be possible to combine several ILBM images as well as other formats (such as an 8SVX file) into a single IFF file. By passing the combined file to the viewer, you can switch between images using the 'Page Up' and 'Page Down' keys.

Below I have included some screenshots of the SDL ILBM viewer. The picture on the top left is an image included in Graphicraft, which defines a color range cycle to animate the bird and the bunny. By pressing the 'TAB' key, the viewer cycles the color range to show you the animation. The other screenshots are images included with Deluxe Paint V. As you can see, the viewer also knows how to view HAM images (the Aquarium) and AGA images (the desk).

Implementing an SDL 8SVX player

To see how well my IFF library implementation is designed, I have decided to implement a second IFF application format, namely the 8SVX format used to store 8-bit audio samples. The architecture of the SDL 8SVX player is quite similar to the SDL ILBM viewer, with the following differences:

lib8svx. This library implements the application chunks as well as the Fibonacci-delta compression method defined in the 8SVX specification. As with libilbm, it also defines a table with function pointers that handle the application specific chunks and passes it to the IFF parser.

libresample. This library is used to convert sample rates. 8SVX samples have variable sample rates, while on the PC hardware samples are typically passed to audio buffers with a fixed sample rate. Therefore, we have to convert them.

SDL_8SVX. This package contains a library as well as the 8svxplayer command-line tool. Sample rate conversion is automatically done by the SDL library.
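To give an impression of the Fibonacci-delta method, here is a minimal Python sketch of the decompression step. It is not the lib8svx implementation. It assumes, as I understand the 8SVX specification, that the first byte of the compressed data is padding, the second byte is the initial sample value, and every following byte holds two 4-bit codes (high nibble first) that index a table of deltas; the function name is hypothetical:

```python
# Delta table of the Fibonacci-delta scheme, indexed by a 4-bit code.
CODE_TO_DELTA = (-34, -21, -13, -8, -5, -3, -2, -1, 0, 1, 2, 3, 5, 8, 13, 21)

def unpack_fibonacci_delta(packed):
    """Decompress Fibonacci-delta data into a list of signed 8-bit samples."""
    if len(packed) < 2:
        raise ValueError("compressed data must hold a pad byte and a start value")
    value = packed[1]  # initial sample value, kept as an unsigned byte for now
    samples = []
    for byte in packed[2:]:
        for code in (byte >> 4, byte & 0x0F):  # high nibble first
            # Add the delta and wrap around like an 8-bit register would.
            value = (value + CODE_TO_DELTA[code]) & 0xFF
            # Reinterpret the byte as a signed sample (-128..127).
            samples.append(value - 256 if value > 127 else value)
    return samples

# Pad byte, start value 10, then the codes 9, 10, 8 and 7.
decoded = unpack_fibonacci_delta(bytes([0x00, 10, 0x9A, 0x87]))
```

Every compressed byte yields two samples, so the data is roughly halved in size at the cost of some precision when the signal changes quickly.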

As with the SDL ILBM viewer, the SDL 8SVX player can also play samples from scrap IFF files:

$ iffjoin Picture.ILBM Sample.8SVX > join.IFF
$ 8svxplayer join.IFF

Backporting the ILBM viewer to AmigaOS

The third experiment I did was a really crazy one. I have backported the libraries and tools to the AmigaOS. People probably wonder why I want to do something like this, but hey: it is fun, so why shouldn't I do it? The reasons are the same as those of people who want to backport WINE to Windows or AROS back to the Motorola 68000 platform.

Another reason is that I wanted to know how well these libraries perform on the original platform for which these file formats were designed. The Nix AmigaOS build function I have developed previously helped me a lot in achieving this goal. Apart from a few small fixes, mainly because getopt_long() is not supported, I could easily port the codebase in a straightforward manner without implementing any workarounds.

The architecture shown above is nearly identical to the SDL ILBM viewer. The only difference is the role of the libamivideo library. In the AmigaOS viewer application, it serves the opposite goal compared to the SDL version; it converts images saved by the PC version of Deluxe Paint in chunky graphics format to bitplanes.

It was also nice to write an Intuition GUI application for AmigaOS. Back in the old days, I had never programmed in C and I had never written a GUI application (apart from a few small experiments in Amiga BASIC), simply because I did not have the knowledge and tools available back then. The AmigaOS libraries were not very difficult to understand and to use.

Below I have included some screenshots of the UAE emulator running the viewer using my own libraries. As you can see, the GUI application has implemented Intuition menus allowing you to open other IFF files using a file picker and to navigate through IFF scrap files:

Conclusion

In this blog post I have described several software packages that resulted from my IFF file format experiments, because I could not find any IFF libraries that have all the features that I want. The purpose of these packages is to provide a set of high quality, complete, portable libraries to display, parse, write and check several IFF application formats.

All the software packages can be obtained from the IFF file format experiments subpage of my homepage and used under various free and non-copylefted software licenses, such as the MIT license and the zlib license.

I haven't made any official releases yet, nor have I defined a roadmap, so don't consider these libraries production ready. Also, the API may still evolve. Probably, at some time in the future I will make it more stable.

I have also found two other projects implementing the IFF standard:

The IFF project on SourceForge is a C++ library using uSTL, which deviates from the standard in some aspects. For example, it stores integers in little-endian format. Furthermore, I haven't seen any application file formats using this library.

I also found a project named libiff on Google Code. It seems to have no releases and very little documentation. I have no clue about its capabilities and features.

It is also interesting to point out that I have more stuff on my hard drive, such as libraries supporting several other file formats, which utilise several packages described here. When I can find the time, I'll make these available as well.

2 comments:

Wow, this is really nostalgic: reminds me of my early hacking days, when if we wanted to parse a file, we needed first to reverse engineer its format from available files and from tracing the code of the viewer ;)

Haha yes indeed :-), but speaking of nostalgia: many file formats in use nowadays still follow the same principles. The PNG format, which is "modern", uses a similar organisation with chunks and sizes, although it is not an IFF application format.

So in my opinion, several ideas developed in the past aren't that bad. This solution is simple, elegant and practical.

Nowadays, I have the impression that modern developers have forgotten about these lessons and solve problems in a much crappier way.

I still remember this "XML hype" period, in which everybody thought that defining your file format as an XML application made everything 10 times better or so (which is ridiculous of course). I also remember this advertisement from a big software company claiming that the "new XML transaction system" saves customers a nickel for every transaction.

And it was fun to do this experiment, don't ask me why. Sometimes, I like to do "fun oriented programming" (as you call it). The "bigger experiment" is even more interesting, but I'll reveal more about this once I have anything interesting to show.