“Know the rules well, so you can break them effectively.”
— Dalai Lama XIV

One of the lesser-known secrets of the C programming language is the so-called “strict aliasing rule”. This is a shame, because failing to adhere to it takes you (along with your code) straight into the realm of undefined behavior. As no one in their right mind wants to go there, let’s shed some light on it!

POINTER ALIASING DEFINED

First of all, we have to clarify what “aliasing” really means, or rather aliasing of pointers. Take a look at this example:

1

2

3

4

intvalue;

int*p1=&value;// p1 points to 'value'.

int*p2=&value;// p2 as well...

Here, ‘p1’ and ‘p2’ are aliased to the same object ‘value’; that is, they point to the same object. If you update ‘value’ through ‘p1’:

1

*p1=42;

a read through ‘p2’ will reflect this change:

1

assert((*p1==*p2)&&(value==*p2));// So true...

Because of the possibility of aliasing, a C compiler is prevented from applying certain optimizations. Consider:

1

2

3

4

5

intsilly(int*x,int*y){

*x=0;

*y=1;

return*x;

}

You might think that any decent compiler would generate simplified code equivalent to this:

1

2

3

4

5

intsilly(int*x,int*y){

*x=0;

*y=1;

return0;// *x was previously set to 0, so don't load from memory again.

}

It’s not a matter of decency — the compiler just can’t do this optimization! Here’s the assembly output that clearly shows that the return value is loaded from memory:

1

$gcc-O2-masm=intel silly.c-S&&catsilly.s

1

2

3

4

5

silly:

movDWORDPTR[rdi],0

movDWORDPTR[rsi],1

moveax,DWORDPTR[rdi]; '*x' fetched from memory.

ret

The optimization is not possible because the caller could call ‘silly’ like so:

1

2

intvalue;

silly(&value,&value);

In this case, ‘x’ and ‘y’ are aliased to the same ‘value’, which means ‘silly’ must return 1 not 0. Consequently, ‘*x’ must be read from memory, every time. Period.

ROOM FOR IMPROVEMENT

If you think about it, even though it may happen, pointer aliasing won’t happen very often in practice. Why waste so much potential for optimization for the uncommon case? Most likely, the folks from the C standards committee had the same line of thinking. They introduced rules that state when pointer aliasing must not happen. Enter the strict aliasing rule.

As you can see, this time no load from memory is performed — 0 is returned instead. The optimization is possible because the compiler assumes that aliasing is not allowed in this case.

VIOLATIONS

But what happens if pointers to incompatible types nevertheless alias? After all, this can happen quite easily. Maybe not in the ‘silly’ example, but in real-world production code:

1

2

3

4

5

6

7

8

9

10

structmeasurements_t{

uint8_t level;

uint16_t temperature;

uint32_t force;

};

voidconvert(constuint8_t*data,structmeasurements_t*measurements){

/* Fill measurements object with raw data. */

*measurements=*((structmeasurements_t*)&data[0]);

}

In an attempt to convert data stored in a buffer (maybe read over a network connection) into a high-level structure, a pointer to ‘struct measurements_t’ is aliased with a pointer to a ‘uint_8’. Since both types are incompatible (pointer to struct vs. pointer to ‘uint8_t’) this code is a violation of the strict aliasing rule. Experienced C developers most likely recognized immediately that this code yields undefined behavior, but they would have probably attributed it to struct padding and alignment issues. The real reason, as we know by now, is a violation of the strict aliasing rule.

THE FINE PRINT

So what exactly is the strict aliasing rule and what does “type compatibility” mean? Here’s an excerpt from the ISO C99, standard, chapter 6.5:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

a type compatible with the effective type of the object,

a qualified version of a type compatible with the effective type of the object,

a type that is the signed or unsigned type corresponding to the effective type of the object,

a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

a character type.

Such Standardeese is often hard to digest, so let me try to clarify it a bit. Aliased pointer access is fine if:

1. The pointed-at types are identical. Note that typedefs are just type aliases and don’t introduce new types:

1

2

3

typedefintINT;

INT*p=...

intx=*((int*)p);// Fine and cast not really necessary!

2. The pointed-at types are identical apart from the “signed-ness” (e. g. ‘int’ vs. ‘unsigned int’).
3. The pointed-at types are identical apart from qualification (e. g. ‘const int’ vs. ‘int’).
4. The rule “an aggregate or union type that includes one of the aforementioned types among its members” is highly confusing and probably doesn’t mean much. Check this out for details.
5. The pointed-at types are different, but the pointed-at type through which the access is made is a pointer to character:

1

2

3

4

5

6

floatf=3.1415;

unsignedchar*p=(unsignedchar*)&f;

unsignedchara1=p[0];// First byte of 'f'.

unsignedchara2=p[1];// :

unsignedchara3=p[2];// :

unsignedchara4=p[3];// Last byte of 'f'.

Conversely, aliased pointer access is not defined if the pointed-at types are fundamentally different. Note that this includes pointers to structs that are identically defined but have different tag names:

1

2

3

4

5

structS1{intx;};// tag 'S1'.

structS2{intx;};// tag 'S2'.

S1*s1;

S2=*((S2*)s1);// Undefined behavior!

CONCLUSION

The strict aliasing rule was introduced to give the compiler vendors some leeway regarding optimizations. By default, the compiler assumes that pointers to (loosely speaking) incompatible types never alias. As a consequence, you, the programmer, have to make sure that this rule is obeyed.

Here’s some disquieting news: a lot of existing code isn’t conforming to the strict aliasing rule, but the code works (or appears to work) fine anyway. As an example, the ‘convert’ function above, which aliases a struct to an array of bytes might work fine on an Intel x86-based platform, which supports unaligned memory access. However, if you use ‘convert’ on an ARM-based platform, you will get a “bus error” exception that will crash your system. In other cases, nonconforming code just works by coincident, with a particular compiler, or a particular compiler version at a particular optimization level.

To me, knowing about the strict aliasing rule is as important for every systems developer as knowing about the other systems programming “secrets” like alignment, struct padding, and endianness.

FURTHER READING

I’m indebted to the authors of the following two blog post. From the first post, I shamelessly stole the idea for the ‘silly’ function, which the author originally called ‘foo’, which I found silly :-) The second post gives a much more detailed coverage of the strict aliasing rule and also discusses the interesting technique of “casting through a union”.