Data Alignment

What is Data Alignment?

In programming language, a data object (variable) has 2 properties; its value and the storage location (address). Data alignment means that the address of a data can be evenly divisible by 1, 2, 4, or 8. In other words, data object can have 1-byte, 2-byte, 4-byte, 8-byte alignment or any power of 2. For instance, if the address of a data is 12FEECh (1244908 in decimal), then it is 4-byte alignment because the address can be evenly divisible by 4. (You can divide it by 2 or 1, but 4 is the highest number that is divisible evenly.)

CPU does not read from or write to memory one byte at a time. Instead, CPU accesses memory in 2, 4, 8, 16, or 32 byte chunks at a time. The reason for doing this is the performance - accessing an address on 4-byte or 16-byte boundary is a lot faster than accessing an address on 1-byte boundary.

The following diagram illustrates how CPU accesses a 4-byte chuck of data with 4-byte memory access granularity.

Memory mapping from memory to CPU cache

If the data is misaligned of 4-byte boundary, CPU has to perform extra work to access the data: load 2 chucks of data, shift out unwanted bytes then combine them together. This process definitely slows down the performance and wastes CPU cycle just to get right data from memory.

Misaligned data slows down data access performance

Structure Member Alignment

In 32-bit x86 systems, the alignment is mostly same as its size of data type. Compiler aligns variables on their natural length boundaries. CPU will handle misaligned data properly, so you do not need to align the address explicitly.

Data alignment for each type

Data Type

Alignment (bytes)

char

1

short

2

int

4

float

4

double

4 or 8

However, the story is a little different for member data in struct, union or class objects. The struct (or union, class) member variables must be aligned to the highest bytes of the size of any member variables to prevent performance penalties. For example, if you have 1 char variable (1-byte) and 1 int variable (4-byte) in a struct, the compiler will pads 3 bytes between these two variables. Therefore, the total size of this struct variable is 8 bytes, instead of 5 bytes. By doing this, the address of this struct data is divisible evenly by 4. This is called structure member alignment. Of course, the size of struct will be grown as a consequence.

Be aware of using custom struct member alignment. It may cause serious compatibility issues, for example, linking external library using different packing alignments. It is better use default alignment all the time.

Data Alignment for SSE

SSE (Streaming SIMD Extensions) defines 128-bit (16-byte) packed data types (4 of 32-bit float data) and access to data can be improved if the address of data is aligned by 16-byte; divisible evenly by 16.

You can declare a variable with 16-byte aligned in MSVC, using __declspec(align(16)) keyword;

Because 16-byte aligned address must be divisible by 16, the least significant digit in hex number should be 0 all the time. That is why logical operators are used to make the first digit zero in hex number.

And, you may have from 0 to 15 bytes misaligned address. In worst case, you have to move the address 15 bytes forward before bitwise AND operation. Therefore, you need to append 15 bytes extra when allocating memory. For example, the 16-byte aligned addresses from 1000h are 1000h, 1010h, 1020h, 1030h, and so on. And if malloc() or C++ new operator allocates a memory space at 1011h, then we need to move 15 bytes forward, which is the next 16-byte aligned address.

Example

This example source includes MS VisualStudio project file and source code for printing out the addresses of structure member alignment and data alignment for SSE. Note that it uses MS specific keywords; __declspec() and __alignof().