
Your Own Endian Engine


Big-endian, Little-endian, or anything in between--you won't have to worry about byte order with the "endian engine" presented here. In fact, John has used this engine to simulate a 36-bit machine with 9-bit bytes on a 32-bit machine with 8-bit bytes.

Back in the days before heterogeneous networks and client/server architectures, life was relatively simple for programmers. IBM-supplied mainframes ruled the roost, a byte was eight bits, and numbers were always in Big-endian order. Now, however, there are C compilers in which char takes nine bits. There are even rumors of 10-bit byte machines. It wouldn't surprise me to see 16-bit char implementations (for Unicode) in a couple of years. Furthermore, today's computers can switch between Big- and Little-endian at boot time, and even juggle different byte orders between data and program space. Moreover, Big- and Little-endian aren't the only byte-ordering schemes--there are also Middle-endian systems, such as the DEC VAX.

Most computers address memory by byte, although word-addressable computers have been around for decades. For C programmers, the term "byte" has become shorthand for "the smallest chunk of addressable memory." The best alternative term I've seen for this is "minimum addressable unit" (MAU), from the IEEE standard for a portable object file format (MUFOM). Consequently, I now use the term "MAU order" in place of "byte order."

To alleviate the confusion caused by differing orders and sizes, I've developed an "endian engine" that handles every byte order (including every possible Middle-endian order) and every byte size, native or not. I've even used this engine to simulate a 36-bit machine with 9-bit bytes on a 32-bit machine with 8-bit bytes. In this article, I'll show how the engine can be used to implement big-integer routines. The engine and the routines are building blocks for a simulator I hope to write that would emulate Knuth's upcoming 64-bit computer (MMIX) with just about any ANSI/ISO C compiler.

There are a couple of familiar workarounds for the byte-ordering problems; neither is recommended, however. The first is the swab (swap bytes) function provided in many versions of UNIX, as far back as Version 7 (1979) and as recently as 4.4BSD-Lite (1994). The swab function only exchanges pairs of adjacent char values--useful in the days of the 16-bit PDP-11, but it's a relic now.

The other method is to use a union in C. However, the ANSI C standard says, "if a member of a union object is accessed after a value has been stored in a different member of the object, the behavior is implementation-defined." (Section 3.3.2.3, which also lists one exception that doesn't help this particular problem.) Elsewhere, ANSI C says that strictly conforming programs shall not depend on any implementation-defined behavior.

Several schemes exist for byte-ordering notation. Both Bolsky and Plauger notations, for instance, can have a leading zero (0123 is a 4-byte, Little-endian value in Plauger notation). This can cause subtle problems in C, where a leading 0 indicates an octal value.

"BSD notation" is my name for the byte orders returned by the 4.4BSD sysctl library function. In BSD notation, 4321 is a 4-byte, Big-endian value. Leading zeros are not possible, so octal confusion is avoided. The 4.4BSD-Lite source CD-ROM (available from O'Reilly & Associates) includes various endian.h header files that define equates in this notation. (The files are for kernel use only.)

Unfortunately, since the three notations I've mentioned are all in decimal, none work well with 16-byte entities. BSD notation is handy for simple things and is available from a 4.4BSD library routine (sysctl), but it isn't powerful enough to completely describe the format of something in memory. For convenience, I've provided a way to use BSD notation with the endian engine, via a routine that accepts BSD notation.

Table 1 is a list of the byte orders for a number of systems. The list uses BSD notation where necessary. Please e-mail me any additions or corrections to the list.

Listing Two, end.h, contains Memory_Format_T, a structure that describes the format of the memory for a given entity. Memory_Format_T has four members:

The size of a minimum addressable unit (MAU) in bits.

MAU_Order_T, an enum type that is shorthand for the MAU order. MAU_Order_T is declared in the rep.h header, and its values are BigEndian, LittleEndian, and MiddleEndian; see Listing One.

The number of MAUs in the entity.

A pointer to an array of references indicating exactly which MAU goes where. This is necessary only if MAU_Order_T is MiddleEndian.
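To make the four members concrete, here is a minimal sketch of what such a structure could look like. The real declaration is in end.h (Listing Two); the member names used here (mau_bits, order, mau_count, mau_refs) and the two helper functions are assumptions made only for illustration.

```c
#include <stddef.h>

/* Enumerator names come from the article; the real enum is in rep.h. */
typedef enum { BigEndian, LittleEndian, MiddleEndian } MAU_Order_T;

/* Sketch only: the member names below are hypothetical. */
typedef struct {
    unsigned        mau_bits;   /* size of one MAU in bits */
    MAU_Order_T     order;      /* shorthand for the MAU order */
    unsigned        mau_count;  /* number of MAUs in the entity */
    const unsigned *mau_refs;   /* explicit order; NULL unless MiddleEndian */
} Memory_Format_T;

/* Total size of the described entity in bits. */
static unsigned long format_total_bits(const Memory_Format_T *f)
{
    return (unsigned long)f->mau_bits * f->mau_count;
}

/* Nonzero when the reference array is present exactly when the order
   is MiddleEndian, which is the rule the article states. */
static int format_is_consistent(const Memory_Format_T *f)
{
    return (f->order == MiddleEndian) == (f->mau_refs != NULL);
}
```

With these hypothetical names, a 4-MAU Big-endian entity with 8-bit MAUs would be written as { 8, BigEndian, 4, NULL }.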

In BSD notation, the byte order for a 4-byte, Big-endian value would be 4321. The equivalent array of MAU references is {4,3,2,1}. The DEC PDP-11 (a Middle-endian machine) would have an array of MAU references of {3,4,1,2} for a 32-bit integer on that machine. (The pointer to this array must be NULL for the BigEndian and LittleEndian MAU_Order_T values.)
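The correspondence between a BSD-notation number and an array of MAU references is just a split into decimal digits. The sketch below shows one way to do that split; bsd_order_to_refs is a hypothetical helper, not the article's EndBSDByteOrderToFormat, and it does no allocation.

```c
#include <stddef.h>

/* Hypothetical helper: split a BSD-notation order such as 4321 into its
   decimal digits, leading digit first. Returns the number of MAUs, or 0
   if the order is 0, has more than max digits, or contains a digit that
   is 0 or larger than the digit count. Duplicate digits are not checked. */
static size_t bsd_order_to_refs(unsigned long order, unsigned refs[], size_t max)
{
    unsigned long t;
    size_t n = 0, i;

    for (t = order; t != 0; t /= 10)   /* count the decimal digits */
        n++;
    if (n == 0 || n > max)
        return 0;
    for (i = n; i-- > 0; order /= 10) { /* fill from the last digit back */
        refs[i] = (unsigned)(order % 10);
        if (refs[i] == 0 || refs[i] > n)
            return 0;
    }
    return n;
}
```

Because each MAU is a single decimal digit, this approach stops working once an entity has more than nine MAUs, which is the limitation noted earlier for 16-byte entities.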

Example 1 is C code that initializes Memory_Format_T structures for four different systems, including the Middle-endian PDP-11 and the 36-bit Honeywell 6000 (which has 9-bit char values).

The main routine in the endian engine (EndSwap) copies data and swaps MAUs as it goes. EndSwap requires two memory formats: one for the destination and another for the source. All the routines I declared in end.h can be used outside the endian engine, but EndSwap is the only one an application must call. Listing Two is the end.h source. EndSwap is in Listing Three. A call to it appears in Listing Four (the BigSwap routine).
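To illustrate the idea of copying while swapping, here is a deliberately simplified sketch. It is not EndSwap: it assumes that both formats use MAUs the size of a native char, that both have the same MAU count, and that each format is given as an array of MAU references in the style described above (1 marks the least significant MAU). The name swap_by_refs is hypothetical.

```c
#include <stddef.h>

/* Copy count MAUs from src to dst, placing each MAU where the destination
   format wants it. dst_refs[i] and src_refs[i] give the significance rank
   of the MAU stored at address i. The buffers must not overlap.
   Returns 0 on success, -1 if a destination rank is missing in src_refs. */
static int swap_by_refs(unsigned char *dst, const unsigned *dst_refs,
                        const unsigned char *src, const unsigned *src_refs,
                        size_t count)
{
    size_t d, s;

    for (d = 0; d < count; d++) {
        for (s = 0; s < count; s++)     /* find the source MAU of equal rank */
            if (src_refs[s] == dst_refs[d])
                break;
        if (s == count)
            return -1;
        dst[d] = src[s];
    }
    return 0;
}
```

For example, the Big-endian bytes 12 34 56 78 (references {4,3,2,1}) copied to a PDP-11 destination (references {3,4,1,2}) come out as 34 12 78 56.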

Besides building them directly in the application, I've provided two other means to create Memory_Format_T values. Given a byte order in BSD notation, EndBSDByteOrderToFormat will allocate and fill in a Memory_Format_T structure. EndLearnNativeUnsignedLongFormat will build a Memory_Format_T structure for the native unsigned long type in data space; this order may not apply in program space or to other data types. Both routines return NULL if they run out of memory or another error occurs. Callers should deallocate the memory allocated by those routines via the standard C free function. The source code for these functions is available electronically; see "Availability," page 3.
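One plausible way to learn the native order of unsigned long is to store a probe value whose chars each carry their own significance rank, then read the object back char by char. The sketch below does that; it is not the source of EndLearnNativeUnsignedLongFormat, the name learn_native_ulong_refs is hypothetical, and it assumes unsigned long has no padding bits.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Fill refs[0 .. sizeof(unsigned long) - 1] with the significance rank
   (1 = least significant) of the char stored at each address.
   Assumes unsigned long has no padding bits. Returns the char count. */
static size_t learn_native_ulong_refs(unsigned refs[])
{
    unsigned long probe = 0;
    unsigned char rep[sizeof(unsigned long)];
    size_t i, n = sizeof(unsigned long);

    for (i = 0; i < n; i++)             /* char of rank i+1 holds i+1 */
        probe |= (unsigned long)(i + 1) << (i * CHAR_BIT);
    memcpy(rep, &probe, n);             /* read the object representation */
    for (i = 0; i < n; i++)
        refs[i] = rep[i];
    return n;
}
```

On a Little-endian machine with a 4-char unsigned long this yields {1,2,3,4}; on a Big-endian one, {4,3,2,1}. As the article notes, the result describes only this type in data space.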

Eventually, I want the assembler and simulator that I write for the 64-bit Big-endian MMIX computer to run on my 32-bit Little-endian computer. In the middle, I envision a set of big-integer routines that would perform (at least) 64-bit integer math on any ANSI/ISO C system. The big-integer routines would be useful for compilers, debuggers, encryption, and the like. (Contact me at 72634.2402@compuserve.com if you'd like to examine the "draft standard" for these routines.) My implementation of the big-integer library uses the endian engine; other implementations may vary.

Sample big-integer routines are available electronically. The BigAdd routine adds two big integers using an optional layout. BigAdd calls BigSwap to convert each number to native layout. BigSwap uses the endian engine (EndSwap) to do the actual swap. Then BigAdd does the addition and calls BigSwap to convert the sum from native layout back to the caller's layout. The current version of BigAdd supports only unsigned integers, one of the four usual integer representations. In the future, I'll add support for two's complement, one's complement, and signed magnitude.
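The middle step of that sequence, the addition in native layout, can be sketched as a char-by-char add with carry. This is not the BigAdd source: the name add_native is hypothetical, the sketch handles unsigned values only, it assumes the native layout stores the least significant char first, and it assumes unsigned long is wider than unsigned char.

```c
#include <limits.h>
#include <stddef.h>

/* sum = a + b for two unsigned numbers of n chars each, least significant
   char first (an assumption of this sketch). sum may alias a or b.
   Returns the carry out of the most significant char (0 or 1). */
static unsigned add_native(unsigned char *sum, const unsigned char *a,
                           const unsigned char *b, size_t n)
{
    unsigned carry = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned long t = (unsigned long)a[i] + b[i] + carry;
        sum[i] = (unsigned char)(t & UCHAR_MAX);  /* keep the low char */
        carry = t > UCHAR_MAX;                    /* overflow becomes carry */
    }
    return carry;
}
```

A caller following the article's scheme would convert both operands to this layout first, call the adder, and then convert the sum back to the caller's layout.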

The rep.h header (Listing One) is shared between the big-integer routines and the endian engine. bigimpl.h (available electronically) and anything beginning with the prefix Big_ are not intended for use outside my big-integer code.

I faced a dilemma when I wrote the endian engine. What should it do if asked to process an entity that isn't an exact multiple of the number of bits in a char in that implementation of C? I settled on the following rules. If, when swapping data, a destination is not an integral number of char values, then:

Destination size is rounded up to an integral number of char values.

Additional bits of the destination have unspecified values after the swap.

Additional bits are located as if they were simply more high-order bits in their usual positions.

Also, the total number of bits (MAU size × MAU count) for the destination and source must be identical.
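The size rules above come down to two small computations, sketched here with hypothetical helper names (chars_for_bits and same_total_bits are not part of the engine's published interface).

```c
#include <limits.h>

/* Number of native chars needed to hold the given number of bits,
   rounded up to an integral number of chars. */
static unsigned long chars_for_bits(unsigned long bits)
{
    return (bits + CHAR_BIT - 1) / CHAR_BIT;
}

/* Nonzero when destination and source describe the same total number
   of bits (MAU size times MAU count). */
static int same_total_bits(unsigned long dst_mau_bits, unsigned long dst_mau_count,
                           unsigned long src_mau_bits, unsigned long src_mau_count)
{
    return dst_mau_bits * dst_mau_count == src_mau_bits * src_mau_count;
}
```

On a machine with 8-bit char values, a 36-bit destination rounds up to 5 chars (40 bits), so 4 additional bits have unspecified values after the swap.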

This version of the endian engine includes optimizations for the most common combinations of formats. EndSwap uses EndSameFormat (available electronically) to determine if a straight memory copy would be correct; if so, EndSwap calls EndCopy (also available electronically). The next optimization EndSwap tries is for simply reversed char order; it uses routines in EndRev.c (see "Availability") to check for and handle that case.
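The reversed-order fast path can be sketched as a check plus a backward copy. These are not the routines from EndRev.c; refs_are_reversed and copy_reversed are hypothetical names, and the sketch assumes both formats are expressed as MAU reference arrays with char-sized MAUs.

```c
#include <stddef.h>

/* Nonzero when format b lists the same MAU references as format a,
   but in exactly the opposite address order. */
static int refs_are_reversed(const unsigned *a, const unsigned *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (a[i] != b[n - 1 - i])
            return 0;
    return 1;
}

/* Copy n chars from src to dst in reverse order. The buffers must not
   overlap. */
static void copy_reversed(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = src[n - 1 - i];
}
```

A Big-endian source ({4,3,2,1}) and a Little-endian destination ({1,2,3,4}) pass the check, so a single reversed copy is enough; a PDP-11 destination ({3,4,1,2}) does not and needs the general path.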

If it can't use either of those optimizations for a given case, EndSwap calls EndSmallestMoveSize to determine the maximum number of bits that it can move together in a single chunk. EndSmallestMoveSize bases this computation on the number of bits in a destination MAU, the source MAU size, and the native char size in bits. The largest chunk size that divides all three evenly is the greatest common divisor (GCD) of those three numbers. EndSwap loops for each chunk of this number of bits. For the actual bit processing (such as moving groups of bits in the EndSwap routine), I use the bit-operation macros I described in my article "Bit Operations with C Macros" (Dr. Dobb's Sourcebook of PowerPC Programming, September/October 1995). The MVBITS (move bits) bitops.h macro is one used in the endian engine.
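Moving one chunk of bits between buffers can be sketched with plain functions. This is not the MVBITS macro from bitops.h, whose interface is not shown here; move_bits, get_bit, and put_bit are hypothetical, and the bit-numbering convention (bit k is in char k / CHAR_BIT, counted from that char's least significant bit) is an assumption of the sketch.

```c
#include <limits.h>
#include <stddef.h>

/* Read bit k of buf under the sketch's numbering convention. */
static unsigned get_bit(const unsigned char *buf, size_t k)
{
    return (buf[k / CHAR_BIT] >> (k % CHAR_BIT)) & 1u;
}

/* Set or clear bit k of buf. */
static void put_bit(unsigned char *buf, size_t k, unsigned bit)
{
    unsigned char mask = (unsigned char)(1u << (k % CHAR_BIT));

    if (bit)
        buf[k / CHAR_BIT] |= mask;
    else
        buf[k / CHAR_BIT] &= (unsigned char)~mask;
}

/* Copy nbits bits from src, starting at bit src_bit, into dst, starting
   at bit dst_bit. Other bits of dst are left unchanged. */
static void move_bits(unsigned char *dst, size_t dst_bit,
                      const unsigned char *src, size_t src_bit, size_t nbits)
{
    size_t i;

    for (i = 0; i < nbits; i++)
        put_bit(dst, dst_bit + i, get_bit(src, src_bit + i));
}
```

A loop over chunks would call move_bits once per chunk with nbits set to the chunk size. Moving one bit at a time is slow but keeps the sketch independent of the char size; a tuned version would move whole masks.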

Another possible optimization would be to build a table of "moves" for any given pair of formats. Each "move" would describe a set of contiguous bits in the source and how and where that set of bits would end up in the destination. Finally, the resulting table could be ordered so that only one pass over the destination would be necessary. (Perhaps this use of fewer writes would speed up some hardware caches.) Having built the moves table for a given pair of formats, I would cache it for later use. All this may seem like a lot of work, but it would probably only need to be done once.

The only missing piece of the endian engine is a routine to compute the GCD of two unsigned numbers. The book C: A Reference Manual, Third Edition, by Samuel P. Harbison and Guy L. Steele, Jr. (Prentice-Hall, 1991), provides listings of GCD functions in C. The endimpl.h header (available electronically) declares EndGCD as a function. I suspect you could "#define EndGCD gcd" or whatever the name of your GCD routine is.
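For readers without the book at hand, a GCD routine using Euclid's algorithm is short enough to sketch here. The names gcd_unsigned and gcd3 are hypothetical, and the exact signature endimpl.h expects for EndGCD is not shown in this article, so treat the types as an assumption.

```c
/* Euclid's algorithm for two unsigned numbers. gcd_unsigned(x, 0) is x. */
static unsigned gcd_unsigned(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* GCD of three numbers, built from the two-argument routine. */
static unsigned gcd3(unsigned a, unsigned b, unsigned c)
{
    return gcd_unsigned(gcd_unsigned(a, b), c);
}
```

Applied to chunk sizing, a 9-bit destination MAU with an 8-bit source MAU and an 8-bit native char gives gcd3(9, 8, 8) = 1, so the data moves one bit at a time, while 16, 8, and 8 give a chunk of 8 bits.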

If you design a computer architecture, file formats, I/O interface, protocol, or the like, I strongly recommend you specify Big-endian order. In his classic paper "On Holy Wars And A Plea For Peace," Danny Cohen said:

To the best of my knowledge only the Big-Endians...have built systems with a consistent order which works across chunk-boundaries, registers, instructions and memories. I failed to find a Little-Endians' system which is totally consistent.
