The first part of this article introduced the concepts behind Structs and Unions, and showed how to define and use each one either independently or nested, e.g. a Union inside a Struct. In this part, we’ll look at useful applications of Struct and Union in Embedded C/C++, such as mapping registers so that their fields can be accessed by name, and we’ll discuss some pros and cons of bit fields.

Bit Fields in A Nutshell

Bit fields are designed to keep memory usage to a minimum: a single memory location is divided into several “bit fields” instead of each field getting a dedicated location of its own. To declare a bit field inside a Struct, write a colon (“:”) after the member name, followed by the number of bits as an integer constant.

C

typedef struct foo
{
    unsigned char a : 4;
    unsigned char b : 3;
    unsigned char c : 1;
} foo_t;

What a bit field really does is hide the masking and other bitwise operations needed to access its value. Bit fields are still ordinary memory locations, just carved into pieces of specific lengths (foo occupies 1 byte). To examine this, we will disassemble an (.elf) file produced by compiling C++ code for an AVR MCU (an Arduino sketch). The assembly code shows how a value inside “foo” is accessed:

Assembly (AVR)

lds  r24, 0x01DB ; Load the 1-byte value of foo from address 0x01DB into a general-purpose register
andi r24, 0xF0   ; Mask the value according to the field lengths (foo.a is a 4-bit field)
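To make the masking concrete, here is a minimal sketch of how the three fields of foo_t could be read by hand from the raw byte. It assumes that “a” sits in bits 0..3, “b” in bits 4..6 and “c” in bit 7; the real placement is chosen by the compiler, and the function names are made up for this illustration.

```c
#include <stdint.h>

/* Assumed layout (compiler-dependent): a = bits 0..3, b = bits 4..6, c = bit 7. */

/* Keep only the four low bits that belong to "a". */
static uint8_t foo_get_a(uint8_t raw) { return (uint8_t)(raw & 0x0Fu); }

/* Shift "b" down to bit 0, then keep its three bits. */
static uint8_t foo_get_b(uint8_t raw) { return (uint8_t)((raw >> 4) & 0x07u); }

/* Shift "c" down to bit 0, then keep its single bit. */
static uint8_t foo_get_c(uint8_t raw) { return (uint8_t)((raw >> 7) & 0x01u); }
```

Each accessor is a shift plus a mask, which is the same kind of work the compiler emits for a bit-field read.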

So using bit fields does save memory, but it adds instructions every time a bit-field variable is accessed. One extra instruction is rarely a problem, but it is not always that cheap. Things get worse with complicated Structs, or even simple ones in special cases, e.g. when a bit field is spread over 2 bytes at the same time. Let’s tweak “foo” a little to show how:

C

struct {
    unsigned int a : 4;
    unsigned int b : 6;
    unsigned int c : 1;
    unsigned int d : 8;
    unsigned int e : 3;
    unsigned int f : 2;
} foo;

The “b” field has 4 bits in the first byte (0x8001db) and 2 bits in the next byte (0x8001dc). The compiler therefore has to emit many more instructions to deal with “foo.b”, as the assembly code makes clear.

Assembly (AVR)

; The below is for foo.b = foo.b + d;
a84:  8f 73        andi  r24, 0x3F    ; 63
a86:  86 5f        subi  r24, 0xF6    ; 246
a88:  98 2f        mov   r25, r24
a8a:  92 95        swap  r25
a8c:  90 7f        andi  r25, 0xF0    ; 240
a8e:  40 91 db 01  lds   r20, 0x01DB  ; 0x8001db <foo>
a92:  4f 70        andi  r20, 0x0F    ; 15
a94:  49 2b        or    r20, r25
a96:  40 93 db 01  sts   0x01DB, r20  ; 0x8001db <foo>
a9a:  82 95        swap  r24
a9c:  83 70        andi  r24, 0x03    ; 3
a9e:  90 91 dc 01  lds   r25, 0x01DC  ; 0x8001dc <foo+0x1>
aa2:  9c 7f        andi  r25, 0xFC    ; 252
aa4:  89 2b        or    r24, r25
aa6:  80 93 dc 01  sts   0x01DC, r24  ; 0x8001dc <foo+0x1>

Let’s see what it looks like when foo.a is used with the old definition of foo:

Assembly (AVR)

; The below is for foo.a = foo.a + d;
a58:  68 2f        mov   r22, r24
a5a:  6f 70        andi  r22, 0x0F    ; 15
a5c:  90 91 db 01  lds   r25, 0x01DB  ; 0x8001db <foo>
a60:  90 7f        andi  r25, 0xF0    ; 240
a62:  96 2b        or    r25, r22
a64:  90 93 db 01  sts   0x01DB, r25  ; 0x8001db <foo>

Fewer instructions!

So it’s important to keep an eye on how fields are split across bytes. For instance, the last definition of “foo” can be modified like this:

C

struct {
    unsigned int a : 4;
    unsigned int   : 4;
    unsigned int b : 6;
    unsigned int c : 1;
    unsigned int   : 1;
    unsigned int d : 8;
    unsigned int e : 3;
    unsigned int f : 2;
} foo;

Padding was added where needed so that no field straddles a byte boundary, which avoids the unwanted behavior and the performance cost.
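One way to check a layout before trusting it is to compute, for each field, whether its bits fall into more than one byte. The helper below is a small sketch of that check; it assumes the fields are packed back to back in declaration order with no extra padding inserted by the compiler, and the name field_crosses_byte is invented for this example.

```c
#include <stdbool.h>

/* Returns true when a field that starts at bit `offset` (counted from the
 * start of the struct) and is `width` bits wide spans more than one byte.
 * Assumes width is at least 1. */
static bool field_crosses_byte(unsigned offset, unsigned width)
{
    unsigned first_byte = offset / 8u;               /* byte holding the first bit */
    unsigned last_byte  = (offset + width - 1u) / 8u; /* byte holding the last bit  */
    return first_byte != last_byte;
}
```

Under that assumption, “b” in the unpadded struct starts at bit 4 and is 6 bits wide, so it crosses a byte boundary; in the padded version it starts at bit 8 and does not.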

Don’t Trust The Code, Listen to the Compiler

Dealing with bit fields requires keeping your eyes open, as we saw in the last example, where the way the Struct is divided has a great impact on performance and code size. The next two examples show cases where the compiler does not interpret a struct definition the way we expect. They are adapted from questions that appeared on Stack Overflow.

Example #1

C

struct
{
    unsigned char a : 4;
    unsigned char b : 8;
    unsigned char c : 4;
} foo;

struct
{
    unsigned char a : 4;
    unsigned char b;
    unsigned char c : 4;
} FOO;

These two Structs should be the same, right? “unsigned char b:8;” and “unsigned char b;” look equivalent to us. But if we print the sizes of foo and FOO, we find that foo’s size is 2 while FOO’s size is 3. Why? (Bit-field layout is implementation-defined, so other compilers may give different sizes.) The compiler lays out the first one as follows:

First Byte: bits 0..3 for “a” and bits 4..7 for the first 4 bits of “b”.

Second Byte: bits 0..3 for the rest of “b” and bits 4..7 for “c”.

In the second Struct:

First Byte: bits 0..3 for “a” and bits 4..7 padded (unused).

Second Byte: for “b”.

Third Byte: bits 0..3 for “c” and bits 4..7 unused.
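The simplest way to “listen to the compiler” here is to ask it for the sizes. The sketch below wraps the two definitions in named struct types (packed_b and plain_b are names chosen for this example) and returns their sizes. The size of the first struct depends on the compiler and target, so treat the numbers quoted above as what one toolchain produced, not as a guarantee.

```c
#include <stddef.h>

/* Same members as "foo": b is an 8-bit bit field. */
struct packed_b { unsigned char a : 4; unsigned char b : 8; unsigned char c : 4; };

/* Same members as "FOO": b is an ordinary unsigned char. */
struct plain_b  { unsigned char a : 4; unsigned char b;     unsigned char c : 4; };

static size_t size_with_bitfield_b(void) { return sizeof(struct packed_b); }
static size_t size_with_plain_b(void)    { return sizeof(struct plain_b); }
```

Printing both values with printf and the %zu format on the actual target compiler shows which layout was chosen.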

Example #2

C

struct
{
    unsigned long a : 1;
    unsigned long b : 32;
    unsigned long c : 1;
} mystruct1;

struct
{
    unsigned long a : 1;
    unsigned long b : 31;
    unsigned long c : 1;
} mystruct2;

Most of us would expect these two Structs to have the same size (8 bytes each), but in fact the first one is laid out as follows (assuming a 4-byte unsigned long):

Bytes 0..3: for unsigned long a:1.

Bytes 4..7: for unsigned long b:32.

Bytes 8..11: for unsigned long c:1.

While the second one will have:

Bytes 0..3: for unsigned long a:1 and unsigned long b:31.

Bytes 4..7: for unsigned long c:1.
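The layout above follows from one common allocation rule: a field that does not fit in what is left of the current storage unit starts at the next unit. The function below is a simplified model of that rule only, written for illustration; real compilers may apply different or additional rules, and model_struct_size is a made-up name.

```c
#include <stddef.h>

/* Simplified model: fields are placed in order, and a field that does not fit
 * in what is left of the current storage unit starts at the next unit.
 * Assumes every width is between 1 and unit_bits. */
static size_t model_struct_size(const unsigned *widths, size_t count,
                                unsigned unit_bits)
{
    size_t pos = 0; /* next free bit, counted from the start of the struct */

    for (size_t i = 0; i < count; ++i) {
        size_t used_in_unit = pos % unit_bits;
        if (used_in_unit + widths[i] > unit_bits) {
            pos += unit_bits - used_in_unit; /* skip to the next unit */
        }
        pos += widths[i];
    }

    /* Round up to whole storage units and convert bits to bytes. */
    size_t units = (pos + unit_bits - 1u) / unit_bits;
    return units * (unit_bits / 8u);
}
```

With 32-bit units, the widths {1, 32, 1} need three units (12 bytes) in this model, while {1, 31, 1} need only two (8 bytes), matching the two layouts described above.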

Should I Use Bit Field Or Not?

As with any other solution in the engineering world, it’s not always a win-win situation. Bit fields save space in data memory, and they provide a simpler way to set and get values that aren’t byte-aligned than writing the bitwise operations by hand. On the other hand, every write to a bit-field variable makes the compiler generate a READ-MODIFY-WRITE sequence. Because the memory location is shared with other fields, the compiler reads the entire location into a temporary, masks out the field being changed, merges in the new value, and then writes the result back so that the other fields keep their values (this is what is called a READ-MODIFY-WRITE operation).
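Written out by hand, the sequence looks roughly like the sketch below. It is a generic illustration for a field inside a single byte, not the code any particular compiler generates, and write_field is a hypothetical helper name.

```c
#include <stdint.h>

/* Write `value` into the field of `width` bits starting at bit `shift`,
 * leaving the neighbouring fields in *location untouched.
 * Assumes width is 1..8 and shift + width <= 8. */
static void write_field(volatile uint8_t *location, unsigned shift,
                        unsigned width, uint8_t value)
{
    uint8_t mask = (uint8_t)(((1u << width) - 1u) << shift);

    uint8_t tmp = *location;                 /* READ the whole byte          */
    tmp = (uint8_t)(tmp & (uint8_t)~mask);   /* MODIFY: clear the old field  */
    tmp = (uint8_t)(tmp | ((uint8_t)(value << shift) & mask)); /* merge new bits */
    *location = tmp;                         /* WRITE the whole byte back    */
}
```

Three separate steps touch the shared byte, which is why the operation costs more than a plain store.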

So it’s true that we save some space in SRAM, but each read/write needs more instructions, which cost space in Flash memory. Access itself is another concern: reading or writing a bit-field variable is not an atomic operation, which can lead to faults in critical code sections, especially when a bit-field variable is shared between interrupts and processes.
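One possible mitigation, sketched here under the assumption that the toolchain and target support C11 <stdatomic.h>, is to keep the shared byte in an atomic variable and update the field with a compare-and-exchange loop. On small MCUs a common alternative is to briefly disable interrupts around the update instead. The function name atomic_write_field is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically replace the bits selected by `mask` with `bits`,
 * retrying if another context changed the byte in the meantime. */
static void atomic_write_field(_Atomic uint8_t *location, uint8_t mask,
                               uint8_t bits)
{
    uint8_t expected = atomic_load(location);
    uint8_t desired;

    do {
        desired = (uint8_t)((expected & (uint8_t)~mask) | (bits & mask));
        /* On failure, `expected` is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak(location, &expected, desired));
}
```

The loop only stores the new byte if nobody else modified it between the read and the write, so an update made by an interrupt in between is not lost.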

Many developers advocate avoiding bit fields because they make code less portable: changing the compiler, or even its version, can change how bit fields are laid out and handled. They also advocate using bit-banding (the hardware version of bit fields) when it’s available. In fact, the previous sentence was written in an article I wrote about bit-field variables.
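For readers who have not met bit-banding: on ARM Cortex-M3/M4 parts that implement it, every bit in the bit-band region has its own word in an alias region, so a single store to that word changes one bit. The sketch below only computes the alias address for the SRAM region using the documented mapping formula; the macro and function names are chosen for this example, and whether a given chip implements bit-banding has to be checked in its reference manual.

```c
#include <stdint.h>

#define BITBAND_SRAM_BASE  0x20000000u /* start of the SRAM bit-band region */
#define BITBAND_SRAM_ALIAS 0x22000000u /* start of its alias region         */

/* Address of the alias word that maps to bit `bit` (0..7) of the byte at
 * `byte_addr`: alias_base + byte_offset * 32 + bit * 4. */
static uint32_t bitband_alias_address(uint32_t byte_addr, unsigned bit)
{
    uint32_t byte_offset = byte_addr - BITBAND_SRAM_BASE;
    return BITBAND_SRAM_ALIAS + byte_offset * 32u + bit * 4u;
}
```

On real hardware the returned value would be cast to a volatile uint32_t pointer, and writing 0 or 1 through it would clear or set the bit.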

Application #2: Implementing Protocols

Any non-ASCII protocol has fields with a special meaning inside each byte (or other unit of data). Let’s say we have a protocol that starts with one byte called “command”, in which the first bit indicates the direction (get or response), the next 2 bits are an address, and the rest is the command id. Handling this with bitwise operations can be annoying. Since a bit-field Struct already performs the necessary bitwise operations, it’s useful to implement the protocol packets using a Union.

C

#include <stdint.h>

union
{
    struct {
        uint8_t dir : 1; // bit 0
        uint8_t add : 2; // bits 1 .. 2
        uint8_t id  : 5; // bits 3 .. 7
    } fields;
    uint8_t val;
} CMD;

Sending or receiving a cmd packet can then be done through that Union. A demo in pseudo-code:

C++

// Build and send a command
CMD.fields.dir = 0;
CMD.fields.add = 2;
CMD.fields.id = 7;
send(CMD.val);

// Receive a command and check its address
if (newcmd)
{
    CMD.val = receive();
    if (CMD.fields.add != dDeviceadd)
    {
        // error
    }
}
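Because the order of bit fields inside a byte is chosen by the compiler, the Union only produces the intended wire format if the compiler happens to place “dir” in bit 0. Where the byte must look the same regardless of toolchain, the packet can be built with explicit shifts instead. The sketch below assumes the bit positions given in the comments of the Union (bit 0, bits 1..2, bits 3..7); cmd_fields_t, cmd_pack and cmd_unpack are names invented here.

```c
#include <stdint.h>

typedef struct {
    uint8_t dir; /* 0 or 1  */
    uint8_t add; /* 0 .. 3  */
    uint8_t id;  /* 0 .. 31 */
} cmd_fields_t;

/* Assumed wire format: bit 0 = dir, bits 1..2 = add, bits 3..7 = id. */
static uint8_t cmd_pack(cmd_fields_t f)
{
    return (uint8_t)((f.dir & 0x01u) |
                     ((f.add & 0x03u) << 1) |
                     ((f.id  & 0x1Fu) << 3));
}

/* Split a received command byte back into its three fields. */
static cmd_fields_t cmd_unpack(uint8_t val)
{
    cmd_fields_t f;
    f.dir = (uint8_t)(val & 0x01u);
    f.add = (uint8_t)((val >> 1) & 0x03u);
    f.id  = (uint8_t)((val >> 3) & 0x1Fu);
    return f;
}
```

This costs a few more lines than the Union, but the byte layout is stated in the code rather than left to the compiler.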

Application #3: Access to MCU Registers

Representing hardware registers as bit fields is a handy trick that makes MCU registers easier to access. This technique is widely used in SDKs, e.g. Silicon Labs’ Gecko SDK for ARM Cortex-M3/M4. With it, each register can be accessed through a bit-field struct that carries its name. The struct is later pointed at the peripheral/register base address.
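A minimal sketch of the idea follows. The register, its field names and widths are entirely hypothetical and are not taken from any real SDK. So that the example can run on any machine, the “register” is an ordinary variable; on a real MCU the macro would instead cast the register address from the datasheet.

```c
#include <stdint.h>

/* Hypothetical control register: the field names and widths are invented. */
typedef union {
    struct {
        uint32_t enable    : 1;
        uint32_t prescaler : 3;
        uint32_t reserved  : 28;
    } bits;
    uint32_t raw; /* whole register as one 32-bit value */
} timer_ctrl_t;

/* Stand-in for the hardware register. On a real MCU this would be
 * something like ((volatile timer_ctrl_t *)REGISTER_ADDRESS). */
static timer_ctrl_t fake_timer_ctrl;
#define TIMER_CTRL ((volatile timer_ctrl_t *)&fake_timer_ctrl)

/* Configure the prescaler (0..7) and switch the timer on. */
static void timer_start(uint8_t prescaler)
{
    TIMER_CTRL->bits.prescaler = prescaler & 0x7u;
    TIMER_CTRL->bits.enable = 1u;
}
```

The volatile qualifier tells the compiler that every access must really reach the register, and the raw member allows the whole register to be read or written in one go.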

Embedded Hardware Engineer interested in open hardware and was born in the same year as Linux. Yahya is the editor-in-chief of Atadiat and believes in the importance of sharing free, practical, spam-free and high quality written content with others. His experience with Embedded Systems includes developing firmware with bare-metal C and Arduino, designing PCB&schematic and content creation.


One Comment

It might be good to mention, that in standard C and C++ it is undefined behavior to write to a union using one field and read using another (see for example http://en.cppreference.com/w/cpp/language/union). It works on many compilers (and optimizations) and platforms, but you need to be sure of it before using this feature.
