Welcome back, and apologies for going silent without warning. Something resembling normality has resumed, and so I hope to also resume something resembling my normal schedule. In any case, for today's article, local reader Michael Manesh suggested that I talk about how you can use (or abuse) C's type system to obtain stronger typing guarantees by creating structs containing only a single field.

typedef

C's typedef facility is tremendously useful for turning primitives into more semantic types, and a typical C program will be full of them. Even if you never write them yourself, you typically inherit a ton of them from system frameworks and the like. For example, Cocoa gives you types like NSTimeInterval, NSInteger, and CGFloat.

However, the typedef facility is weak. It doesn't produce a new type, but rather it just creates a new name for the existing type. For example, NSTimeInterval is declared as:

typedef double NSTimeInterval;

This means that an NSTimeInterval is just a double. They're two names for the same thing.

Sometimes that's exactly what we want. The whole point of NSInteger is just to be either an int or a long depending on architecture. Likewise, CGFloat just exists to give you either a float or a double depending on architecture.

NSTimeInterval is a different beast. Conceptually, it's not just a double, but a double representing a number of seconds. You might write this:

NSTimeInterval interval = 5.0; // five seconds

But you probably wouldn't write this:

NSTimeInterval interval = [view frame].size.width;

It's possible that you just happen to want an interval that's equal to the width of a view, interpreted as seconds. However, it's not very likely. It would be nice if the type system could notice that you're trying to assign a float or double to an NSTimeInterval and flag it as wrong. Unfortunately, typedef can't do this, because NSTimeInterval is a double in the end.
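The aliasing problem can be reproduced in plain C. The sketch below uses hypothetical stand-in names (MySeconds, MyWidth, milliseconds_from, typedef_mixup) rather than the Cocoa types, so that it compiles without any framework:

```c
/* Hypothetical stand-ins for NSTimeInterval and a CGFloat-style width. */
typedef double MySeconds;
typedef double MyWidth;

/* Takes a duration in seconds and returns it in milliseconds. */
double milliseconds_from(MySeconds seconds)
{
    return seconds * 1000.0;
}

/* A width is passed where a duration is expected. Both typedefs name
   double, so the compiler accepts this without any diagnostic. */
double typedef_mixup(void)
{
    MyWidth width = 320.0;
    return milliseconds_from(width);
}
```

Nothing in the type system distinguishes the two names, so the mistake only shows up as a wrong value at runtime.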

struct

An interesting feature of C structs is that structurally identical structs are still different types. For example, given this:

struct Foo { int x, y; };
struct Bar { int x, y; };

This will not compile:

struct Foo foo;
struct Bar bar = foo;

Despite the fact that foo and bar have identical contents, they have different types, and the compiler won't convert between the two.
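Since the compiler refuses the direct assignment, any conversion has to be written out field by field. A minimal sketch, where bar_from_foo is a hypothetical helper name introduced only for illustration:

```c
struct Foo { int x, y; };
struct Bar { int x, y; };

/* Converts a Foo to a Bar explicitly. Writing `struct Bar bar = foo;`
   here would be rejected, because Foo and Bar are distinct types even
   though their members match. */
struct Bar bar_from_foo(struct Foo foo)
{
    struct Bar bar = { .x = foo.x, .y = foo.y };
    return bar;
}
```

The conversion is still possible, but it can no longer happen by accident.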

This fact gives us the tool we need to create new types rather than simply creating new names for existing types.

Single-Field structs

The idea is simple. Rather than define a time interval using typedef, define it with a struct that contains a single element:

typedef struct MATimeInterval { double seconds; } MATimeInterval;

This still uses typedef, of course, but just as a convenience, so that we can write the type as MATimeInterval instead of struct MATimeInterval.

The fact that it's a struct has some syntactic consequences which make the code more verbose. While this is a minor disadvantage compared to a plain typedef, it's also an advantage in that it makes code more explicit. For example, you can no longer write something like this:

MATimeInterval interval = 5;

Instead, you need some braces:

MATimeInterval interval = { 5 };

Using field initializers, you can make it more explicit:

MATimeInterval interval = { .seconds = 5 };

This way there's no doubt what unit of time is being used.

When passing a value to a function or method that takes a MATimeInterval as a parameter, you can no longer just pass a number. Instead, you can use C's compound literals syntax:
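A minimal sketch of such a call, assuming a hypothetical function MAIntervalToMilliseconds and a hypothetical caller example_call (neither name comes from a real API):

```c
typedef struct MATimeInterval { double seconds; } MATimeInterval;

/* Hypothetical function: converts an interval to whole milliseconds.
   The field name makes the unit visible at the point of use. */
long MAIntervalToMilliseconds(MATimeInterval interval)
{
    return (long)(interval.seconds * 1000.0);
}

/* Hypothetical caller: the compound literal builds the struct inline,
   so no named temporary is needed. */
long example_call(void)
{
    return MAIntervalToMilliseconds((MATimeInterval){ .seconds = 2.5 });
}
```

Passing a bare 2.5 instead of the compound literal would fail to compile, which is exactly the check a plain typedef cannot provide.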

The function implementation is explicit and clear, with all the units spelled out, and the calling code is direct and to the point.

Runtime Costs

When replacing a bunch of simple primitives with structs, it's natural to be worried about the runtime costs. An int or a double can fit into a register and be directly manipulated with machine instructions, but a struct must require more work to load and unload the values within.

The good news is that this is not the case. It would be true if we were using, say, full-fledged Objective-C objects, but structs are sufficiently low-level that they can be completely optimized away. The compiler is able to treat each element of a struct as a separate value:
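A minimal sketch of the kind of code in question, assuming a hypothetical struct Foo with int fields a, b, and c, a local variable named foo, and a hypothetical function sum_fields:

```c
struct Foo { int a, b, c; };

/* Uses a local struct whose address is never taken. An optimizing
   compiler is allowed to keep each field in its own register, or to
   fold the whole computation into a constant. */
int sum_fields(void)
{
    struct Foo foo = { 1, 2, 3 };
    foo.b = foo.a + foo.c;          /* b becomes 4 */
    return foo.a + foo.b + foo.c;   /* 1 + 4 + 3 */
}
```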

As long as you don't take the address of foo, the compiler is free to rearrange the storage at will. It can put a, b, and c into individual registers. It can even eliminate or short-circuit the assignments altogether if circumstances allow. Take this function for example:

You might expect this to allocate 16 bytes on the stack (two double components in the CGSize, when targeting x86-64), then perform two multiplies, a subtraction, and finally a call to sqrt(). Here is the code that clang produces when compiling this function with optimizations:

It's able to peel away the struct and precalculate the entire expression, so that the executed code does nothing but return that precalculated value.

There's never a case where a single-field struct can't be treated as being the same as the field it contains at runtime. Manipulating the struct to get or set the value inside becomes free. Even passing them as parameters to methods or returning them from methods imposes no additional overhead compared to using the underlying type directly, at least on any architecture we're likely to encounter. The field access ends up as nothing more than compile-time syntax.
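One way to sanity-check the layout side of this on a given compiler is a pair of compile-time assertions. This is only a sketch, assuming a C11 compiler. The helper MATimeIntervalAdd is hypothetical, and the size check reflects what mainstream ABIs do in practice rather than something the C standard promises:

```c
#include <stddef.h>

typedef struct MATimeInterval { double seconds; } MATimeInterval;

/* The first member of a struct always sits at offset zero. */
_Static_assert(offsetof(MATimeInterval, seconds) == 0,
               "seconds must be at the start of the struct");

/* Assumption: no trailing padding is added, so the wrapper is exactly
   as large as the double it holds. */
_Static_assert(sizeof(MATimeInterval) == sizeof(double),
               "wrapper should be the same size as a double");

/* Hypothetical helper: adds two intervals, unpacking and repacking
   the field explicitly. */
MATimeInterval MATimeIntervalAdd(MATimeInterval a, MATimeInterval b)
{
    return (MATimeInterval){ .seconds = a.seconds + b.seconds };
}
```

If either assertion fails, compilation stops, which makes the assumption visible instead of silent.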

Conclusion

The C type system is fairly weak, and the common technique of using typedef to produce new type names makes it easy to mix up values of different conceptual types in code. The struct keyword creates an entirely new type which can be used to avoid this, allowing the compiler to enforce the difference between your types. The resulting code becomes more verbose, which can be good or bad, depending on your perspective and situation. While constantly packing and unpacking structs can be a pain, wisely chosen field names can help make it more obvious just what kind of values the code is working with.

That wraps it up for today! Come back next time for more craziness. Friday Q&A is driven by reader suggestions, so if you have a topic you'd like to see covered next time, or some time after that, please send it in!

On iOS, if the return value is a struct, the caller prepares a buffer for the struct and passes a pointer to that buffer as the first argument. This is a bit less efficient than returning a primitive type.

On x86, there is no such issue. If the struct size is <= 8 bytes, the return value is placed in eax:edx (32-bit).

That is mostly not true, although you are correct for some cases. To quote the ABI:

A Composite Type not larger than 4 bytes is returned in r0.

However, you do get different behavior for 64-bit primitives (e.g. double or long long). When returned directly, they are returned in r0 and r1, while a struct containing one of those types is returned using the caller-provided buffer as you say.

One case where this fails, though, is when you want to return a pointer from a function: using plain values you can do
int global_which_needs_to_be_a_plain_int;
int* get_value_ref(void)
{
    return &global_which_needs_to_be_a_plain_int;
}

Well yes, you’re not taking the address of the global, you’re initialising a newly constructed struct with its value, so the address is necessarily going to be on the stack.

But on any compiler that produces identical code for accesses to plain ints as for accesses to single-int-member structs, you should be able to get away with a (foo*)&global_which_needs_to_be_a_plain_int cast.

@Aristotle: The cast is (as I presume you know) undefined behaviour but should kind-of work for getting the return value. The (practical) problem is that you now may end up with pointers of different type pointing to the same location. Type-based alias analysis could bite you at any point after that.

What you call "field initializers" are more commonly referred to as "designated initializers". They are supported in C99, but not C++, although for this usage there is no real point in using them given there is only one field anyway. <http://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html>

For a bit of initial work, C++ can work around a lot of the failings of the structs, in that it can allow things like adding or subtracting MADistance without allowing you to add MADistance and MAVelocity, while retaining the lack of runtime overhead.

The notion that not being able to return the address of an int in place of a struct pointer return is a PROBLEM for this technique seems to miss the point. The technique is intended to disallow treatment of different types as the same thing. That it is effective at accomplishing its intended purpose is a demonstration of its effectiveness, not its deficiency.