TIFF: little- or big-endian?

The difference between the flavours of TIFF image file lies in the order in which bytes of data are stored. Numbers that can take more than 256 possible values must be stored in more than one byte, and the question is which order those bytes should be in: starting with the least significant byte (LSB) or the most significant byte (MSB). Processor instruction sets are designed to work one way round or the other. In a reference to Gulliver's Travels, the two choices are termed little-endian (LSB first) and big-endian (MSB first). It so happens that Intel processors are little-endian, while Motorola processors (as traditionally used in the Apple Mac) and Sun's SPARC processors are big-endian. The Java virtual machine (JVM) is also always big-endian, regardless of the physical processor it actually runs on.
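The distinction is easy to see from Java itself, where `java.nio.ByteBuffer` lets you choose either byte order explicitly. This small sketch writes the same 32-bit value both ways and prints the resulting byte sequences:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    public static void main(String[] args) {
        int value = 0x01020304;

        // Big-endian: most significant byte first (the JVM's native order).
        ByteBuffer big = ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN);
        big.putInt(value);

        // Little-endian: least significant byte first (Intel's order).
        ByteBuffer little = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
        little.putInt(value);

        System.out.printf("big-endian:    %02X %02X %02X %02X%n",
                big.get(0), big.get(1), big.get(2), big.get(3));
        System.out.printf("little-endian: %02X %02X %02X %02X%n",
                little.get(0), little.get(1), little.get(2), little.get(3));
    }
}
```

Run, this prints `01 02 03 04` for big-endian and `04 03 02 01` for little-endian: the same number, with its bytes in the opposite order.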

At the start of a TIFF file there are 2 bytes which tell software reading the file whether it has been stored in little- or big-endian fashion. These first 2 bytes are ASCII characters: either 'II' (hexadecimal 4949) for Intel, meaning the file is little-endian, or 'MM' (hex 4D4D) for Motorola (the traditional Mac processor), meaning big-endian.
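Detecting the byte order is therefore a matter of inspecting those first two bytes. A minimal sketch in Java (the class and method names here are illustrative, not from any particular library):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteOrder;

public class TiffByteOrder {
    /**
     * Reads the first two bytes of a TIFF stream and returns the
     * byte order they declare.
     */
    public static ByteOrder detect(InputStream in) throws IOException {
        int b0 = in.read();
        int b1 = in.read();
        if (b0 == 0x49 && b1 == 0x49) {        // 'II': Intel, little-endian
            return ByteOrder.LITTLE_ENDIAN;
        }
        if (b0 == 0x4D && b1 == 0x4D) {        // 'MM': Motorola, big-endian
            return ByteOrder.BIG_ENDIAN;
        }
        throw new IOException("Not a TIFF file: bad byte-order mark");
    }
}
```

All multi-byte values that follow in the file, including the TIFF version number and the offsets to the image file directories, must then be interpreted in the declared order.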

One reason why TIFF has become so widely used is that this mechanism enables images to be moved between the different types of system very easily. Software for reading or writing image files in the TIFF format must respect the byte ordering, or the files will not make sense.

Unfortunately the first version of JAI's Image-I/O TIFF reader only partly got this right. The header of a TIFF file was read correctly, but 16-bit pixel channel data were always assumed to be Java-style big-endian, whatever the header said. Therefore our application, GRIP, had an option for swapping pixel data bytes after loading a 16-bit image.
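The work-around itself is straightforward: swap the two bytes of every 16-bit sample in place. A minimal sketch (the class and method names are illustrative, not taken from GRIP):

```java
public class SampleSwap {
    /**
     * Reverses the byte order of each 16-bit sample in place,
     * correcting data that was read with the wrong endianness.
     */
    public static void swap16(short[] samples) {
        for (int i = 0; i < samples.length; i++) {
            // Short.reverseBytes turns e.g. 0x0102 into 0x0201.
            samples[i] = Short.reverseBytes(samples[i]);
        }
    }
}
```

Applying the swap twice restores the original data, which is why exposing it as a user-visible option is safe even if the reader is later fixed.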