Is there any good example to give the difference between a struct and a union?
Basically I know that a struct uses the memory of all its members and a union uses only the memory of its largest member. Is there any other OS-level difference?

12 Answers

With a union, you're only supposed to use one of the elements, because they're all stored at the same spot. This makes it useful when you want to store something that could be one of several types. A struct, on the other hand, has a separate memory location for each of its elements and they all can be used at once.
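The declarations this answer relies on appear to be missing here. A minimal sketch, assuming a union `foo` with an int member `a` and a char member `b` (the names used in the snippets further down), next to a hypothetical struct `bar` with the same members and a hypothetical helper `make_bar`:

```c
/* Assumed declaration: the later snippets use x.a and x.b on a union foo. */
union foo {
    int a;   /* a and b share the same storage: use one at a time */
    char b;
};

/* Hypothetical struct with the same members, for comparison. */
struct bar {
    int a;   /* a and b have separate storage: both usable at once */
    char b;
};

/* Fills a struct; both members keep their values independently. */
struct bar make_bar(void)
{
    struct bar y;
    y.a = 3;
    y.b = 'c';
    return y;
}
```

With the struct, `y.a` is still 3 after `y.b` is assigned. With the union, assigning `x.b` would reuse the storage that held `x.a`.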

To give a concrete example of their use, I was working on a Scheme interpreter a little while ago and I was essentially overlaying the Scheme data types onto the C data types. This involved storing in a struct an enum indicating the type of value and a union to store that value.
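The enum-plus-union arrangement described above can be sketched roughly as follows. All names here (`value_type`, `struct value`, `make_integer`, `make_real`, `format_value`) are hypothetical and only three value kinds are shown; this is not the interpreter's actual code.

```c
#include <stdio.h>

/* Hypothetical value kinds; a real interpreter would have many more. */
enum value_type { VAL_INTEGER, VAL_REAL, VAL_BOOLEAN };

struct value {
    enum value_type type;   /* the tag: says which union member is valid */
    union {
        long integer;
        double real;
        int boolean;
    } as;
};

struct value make_integer(long n)
{
    struct value v;
    v.type = VAL_INTEGER;
    v.as.integer = n;
    return v;
}

struct value make_real(double d)
{
    struct value v;
    v.type = VAL_REAL;
    v.as.real = d;
    return v;
}

/* Formats a value into buf, reading only the member that the tag names. */
int format_value(const struct value *v, char *buf, size_t size)
{
    switch (v->type) {
    case VAL_INTEGER:
        return snprintf(buf, size, "%ld", v->as.integer);
    case VAL_REAL:
        return snprintf(buf, size, "%g", v->as.real);
    case VAL_BOOLEAN:
        return snprintf(buf, size, "%s", v->as.boolean ? "#t" : "#f");
    }
    return -1;  /* unknown tag */
}
```

The tag is what makes the union safe to use: code checks `type` first and then touches only the matching member.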

edit: If you're wondering what setting x.b to 'c' changes the value of x.a to, technically speaking the C standard doesn't pin it down. On most modern machines an int is 4 bytes and a char is 1 byte, so giving x.b the value 'c' also gives the first byte of x.a (its lowest-order byte on a little-endian machine) that same value:

union foo x;
x.a = 3;
x.b = 'c';
printf("%i, %i\n", x.a, x.b);

prints

99, 99

Why are the two values the same? Because the last 3 bytes of the int 3 are all zero, so it's also read as 99. If we put in a larger number for x.a, you'll see that this is not always the case:

union foo x;
x.a = 387439;
x.b = 'c';
printf("%i, %i\n", x.a, x.b);

prints

387427, 99

To get a closer look at the actual memory values, let's set and print out the values in hex:
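The snippet for this step seems to be missing. A sketch consistent with the byte values discussed in the next paragraph, assuming the stored value was 0xDEADBEEF and the byte written was 0x22. The union `ufoo` and both functions are hypothetical; unsigned members are used so that the constant fits when int is 32 bits wide.

```c
#include <stdio.h>

/* Hypothetical unsigned variant of the union used above. */
union ufoo {
    unsigned int a;
    unsigned char b;
};

/* Stores value in a, overwrites one byte through b,
   and returns what a reads as afterwards. */
unsigned int overwrite_one_byte(unsigned int value, unsigned char byte)
{
    union ufoo x;
    x.a = value;
    x.b = byte;
    return x.a;
}

/* Prints the result in hex for the values assumed above. */
void show_overwrite(void)
{
    printf("%x, %x\n", overwrite_one_byte(0xDEADBEEFu, 0x22), 0x22u);
}
```

On a little-endian machine such as x86 this typically shows the low byte replaced (deadbe22); on a big-endian machine the high byte would be the one that changes.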

In C, the order of bytes in an int is not defined. This program overwrote the 0xEF with 0x22 on my Mac, but there are other platforms where it would overwrite the 0xDE instead because the order of the bytes that make up the int is reversed. Therefore, when writing a program, you should never rely on the behavior of overwriting specific data in a union because it's not portable.

@KyleCronin I think I get it. In your case, you have a group of types, knowing you're only going to need to use one but you don't know which one until runtime, so the union allows you to do that. Thanks
– user12345613, Feb 21 '12 at 0:49

@user12345613 unions can be used as a sort of base class for structs. You can emulate an OO hierarchy using unions of structs
– Morten Jensen, Mar 12 '13 at 21:57

A struct with one member each of type int, long, double and long double allocates at least (sizeof(int)+sizeof(long)+sizeof(double)+sizeof(long double)) bytes in memory for each instance. ("At least" because architecture alignment constraints may force the compiler to pad the struct.)

A union with the same four members (union foobarbazquux_u) allocates one chunk of memory and gives it four aliases. So sizeof(union foobarbazquux_u) ≥ max(sizeof(int), sizeof(long), sizeof(double), sizeof(long double)), again with the possibility of some addition for alignment.
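The two size bounds can be checked with a short sketch. The union name comes from the text above; the struct name `foobarbazquux_s`, the member names and the two helper functions are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical struct; its name and member names are assumptions. */
struct foobarbazquux_s {
    int foo;
    long bar;
    double baz;
    long double quux;
};

/* Union with the same four members; member names are assumptions. */
union foobarbazquux_u {
    int foo;
    long bar;
    double baz;
    long double quux;
};

/* Sum of the member sizes: a lower bound for the struct's size. */
size_t sum_of_member_sizes(void)
{
    return sizeof(int) + sizeof(long) + sizeof(double) + sizeof(long double);
}

/* Size of the largest member: a lower bound for the union's size. */
size_t largest_member_size(void)
{
    size_t m = sizeof(int);
    if (sizeof(long) > m) m = sizeof(long);
    if (sizeof(double) > m) m = sizeof(double);
    if (sizeof(long double) > m) m = sizeof(long double);
    return m;
}
```

Comparing `sizeof(struct foobarbazquux_s)` with `sum_of_member_sizes()` shows how much padding the compiler added; the exact numbers vary by platform.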

As you already state in your question, the main difference between union and struct is that union members overlay each other's memory, so that the sizeof of a union is that of its largest member (plus any padding), while struct members are laid out one after another (with optional padding in between). Also, a union is large enough to contain all its members, and has an alignment that fits all its members. So let's say int can only be stored at 2-byte addresses and is 2 bytes wide, and long can only be stored at 4-byte addresses and is 4 bytes long. The following union

union test {
int a;
long b;
};

could have a sizeof of 4, and an alignment requirement of 4. Both a union and a struct can have padding at the end, but not at their beginning. Writing to a struct member changes only the value of the member written to. Writing to a member of a union will render the value of all other members invalid. You cannot access them if you haven't written to them before, otherwise the behavior is undefined. GCC provides as an extension that you can actually read from members of a union, even though you haven't written to them most recently. For an operating system, it doesn't matter whether a user program writes to a union or to a structure. This is purely a matter for the compiler.

Another important property of unions and structs is that a pointer to one of them can be converted to point to a member. So the following is valid:

struct test {
int a;
double b;
} * some_test_pointer;

some_test_pointer can be cast to int*. If you cast a pointer of type struct test* to int*, it will actually point to its first member, a. The same is true for a union, where it holds for every member because they all start at the same address. Thus, because a union will always have the right alignment, you can use a union to make pointing to some type valid:
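The original snippet is not shown here. A small sketch of the pointer conversions just described, where `union utest` and the two helper functions are hypothetical names:

```c
struct test {
    int a;
    double b;
};

/* Hypothetical union with the same members as struct test. */
union utest {
    int a;
    double b;
};

/* A pointer to a struct, converted, points to its first member. */
int *first_member(struct test *p)
{
    return (int *)p;
}

/* A pointer to a union, converted, points to any of its members,
   because every member starts at the union's own address. */
double *as_double(union utest *u)
{
    return (double *)u;
}
```

Both conversions yield the same address as taking `&p->a` or `&u->b` directly.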

In this imaginary protocol, it has been specified that, based on the "message type", the following location in the header will either be a request number or a four-character code, but not both. In short, unions allow the same storage location to represent more than one data type, where it is guaranteed that you will only want to store one of the types of data at any one time.
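The header declaration itself is not shown here. A header of the kind described might look roughly like this; every name below (`message_type`, `message_header`, `make_request`, `make_fourcc`) is hypothetical.

```c
#include <string.h>

/* Hypothetical message types for the imaginary protocol. */
enum message_type { MSG_REQUEST, MSG_FOURCC };

struct message_header {
    enum message_type type;          /* decides which union member is valid */
    union {
        unsigned long request_number;
        char fourcc[4];              /* four-character code, not NUL-terminated */
    } id;
};

/* Builds a header that carries a request number. */
struct message_header make_request(unsigned long number)
{
    struct message_header h;
    h.type = MSG_REQUEST;
    h.id.request_number = number;
    return h;
}

/* Builds a header that carries a four-character code. */
struct message_header make_fourcc(const char code[4])
{
    struct message_header h;
    h.type = MSG_FOURCC;
    memcpy(h.id.fourcc, code, 4);
    return h;
}
```

A reader of the header checks `type` first and then reads only the matching member of `id`.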

Unions are largely a low-level detail based in C's heritage as a system programming language, where "overlapping" storage locations are sometimes used in this way. You can sometimes use unions to save memory where you have a data structure where only one of several types will be saved at one time.

In general, the OS doesn't care or know about structs and unions -- they are both simply blocks of memory to it. A struct is a block of memory that stores several data objects, where those objects don't overlap. A union is a block of memory that stores several data objects, but has only storage for the largest of these, and thus can only store one of the data objects at any one time.

"union" and "struct" are constructs of the C language. Talking of an "OS level" difference between them is inappropriate, since it's the compiler that produces different code if you use one or another keyword.

You have it, that's all.
But so, basically, what's the point of unions?

You can put in the same location content of different types. You have to know the type of what you have stored in the union (so often you put it in a struct with a type tag...).

Why is this important? Not really for space gains. Yes, you can gain some bits or do some padding, but that's not the main point anymore.

It's for type safety. It enables you to do some kind of 'dynamic typing': the compiler knows that your content may have different meanings, and the precise meaning of how you interpret it is up to you at run-time. If you have a pointer that can point to different types, you MUST use a union, otherwise your code may be incorrect due to aliasing problems (the compiler says to itself "oh, only this pointer can point to this type, so I can optimize out those accesses...", and bad things can happen).

Yes, the main difference between struct and union is the one you stated:
a struct uses the memory of all its members and a union uses only the memory of its largest member.

The rest of the difference lies in how the memory needs to be used.
A good use of a union can be seen in Unix processes, where we make use of signals:
a process can act upon only one signal at a time.
So the general declaration will be,

union SIGSELECT

{
SIGNAL_1 signal1;
SIGNAL_2 signal2;
.....
};

In this case, the process uses only as much memory as the largest of the signals needs,
but if you use a struct in this case, memory usage will be the sum of all the signals.
That makes a lot of difference.

To summarize, a union should be selected if you know that you access only one of the members at a time.

I think of unions as a tool for very low-level manipulation, like writing device drivers for a kernel.

You were asking about an example and I think I have an excellent one.
The idea is to dissect a float by using a union of a float and a struct with bitfields. I save a number in the float, and later I can access particular parts of the float through the struct. It shows how a union gives different angles from which to look at the same data.
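The code for this example is not shown here. A minimal sketch of the technique, assuming a 32-bit IEEE 754 float and a compiler that allocates bit-fields starting from the least significant bit (as GCC and Clang do on little-endian targets). Bit-field layout is implementation-defined, so this is not portable, and all names are hypothetical.

```c
/* Overlays a float with a struct of bit-fields.
   Assumes 32-bit IEEE 754 floats and LSB-first bit-field allocation. */
union float_parts {
    float value;
    struct {
        unsigned int mantissa : 23;  /* fraction bits */
        unsigned int exponent : 8;   /* biased exponent */
        unsigned int sign     : 1;   /* 1 for negative numbers */
    } parts;
};

/* Each helper stores the float, then reads one field through the struct. */
unsigned int float_sign(float f)
{
    union float_parts u;
    u.value = f;
    return u.parts.sign;
}

unsigned int float_exponent(float f)
{
    union float_parts u;
    u.value = f;
    return u.parts.exponent;
}

unsigned int float_mantissa(float f)
{
    union float_parts u;
    u.value = f;
    return u.parts.mantissa;
}
```

Under those assumptions, 1.0f has a biased exponent of 127 and a zero mantissa, and a negative number has its sign field set.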

The uses of union
Unions are used frequently when specialized type conversions are needed.
To get an idea of the usefulness of a union: the C/C++ standard library defines no function specifically designed to write short integers to a file. Using fwrite() incurs excessive overhead for such a simple operation. However, using a union you can easily create a function which writes the binary representation of a short integer to a file one byte at a time. I assume that short integers are 2 bytes long.
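The function itself is not shown here. A sketch of such a function, with a matching reader, might look like the following. The names `short_bytes`, `put_short` and `get_short` are hypothetical, and the sketch uses sizeof(short) rather than hard-coding 2 bytes.

```c
#include <stdio.h>

/* Overlays a short with an array of its bytes. */
union short_bytes {
    short value;
    unsigned char bytes[sizeof(short)];
};

/* Writes the bytes of value to fp one at a time, in memory order.
   Returns 0 on success, EOF on a write error. */
int put_short(short value, FILE *fp)
{
    union short_bytes u;
    size_t i;

    u.value = value;
    for (i = 0; i < sizeof u.bytes; i++) {
        if (putc(u.bytes[i], fp) == EOF)
            return EOF;
    }
    return 0;
}

/* Reads back a short written by put_short on the same platform.
   Returns 0 on success, EOF on end of file or a read error. */
int get_short(short *value, FILE *fp)
{
    union short_bytes u;
    size_t i;

    for (i = 0; i < sizeof u.bytes; i++) {
        int c = getc(fp);
        if (c == EOF)
            return EOF;
        u.bytes[i] = (unsigned char)c;
    }
    *value = u.value;
    return 0;
}
```

Because the bytes are written in memory order, a file produced this way is only readable on a platform with the same short size and byte order.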

A structure is a collection of different data types, where different types of data can reside in it
and each one gets its own block of memory.

We usually use a union when we are sure that only one of the variables will be used at a time and we want to make full use of the available memory, because a union gets only one block of memory, equal in size to its biggest type.