Archive for August, 2014

Boolean Fields in a Struct: Another Complex PInvoke Construct

Adjacent Boolean fields in a structure constitute a problem in marshaling a C++ struct to C# .Net. Point is that the memory layout of structs is maintained in marshalling. Using the MarshalAs function or the StructLayout Pack field doesn’t solve the problem. Sharing of the bool aggregation memory word by several Properties, in combination with bit-masks and a bitwise operator, does however do resolve the situation.

Introduction

In the previous blog post we saw how to use PInvoke to copy an array of structs from C++ to C#, using a callback. In this blog post we build on the code developed in that previous post to solve a problem in working with Boolean fields (fields of type bool).

I know, it should be easy. However, it is complicated.

Point is: if we extend our structure with a single Boolean field, marshaling works fine, but if we add two adjacent Boolean fields, marshaling breaks down. Why?

Let’s first see what happens, then analyze the problem , and with the analysis in hand evaluate solution alternatives. And yes, the problem will be solved.

Adding Boolean Fields to a Struct, Then Marshal it in PInvoke

Assume we would like to extend our C++ MeteoInfo (meteorological Information) struct with a Boolean field indicating whether it is raining (whether there is precipitation) at the meteo station at time of measurement. In C++ we get:

Then, encouraged by our success, we add two more, but adjacent Boolean fields: IsOperational and IsOnline, which indicate whether the meteo station is known to function properly, and can be reached by internet, respectively. In C++:

Analysis: Struct Memory Layout

In Marshaling a type with layout from C++, default alignment/padding of types is the same as in C++.

Alignment requirement is by default 8 but can be changed using the Pack field on the StructLayoutAttribute.

To complete the picture we should add that:

The (minimum) byte size of the bool type has not been defined in C++, so it is implementation dependent. In VC++ sizeof(bool)=1, as it is in C#.

Now let’s take a look at the memory layout of the structure in C++, hence in C# .Net, and compare this with the layout we would expect in C# based on aligned type byte sizes.

Field

Byte offset in C++, C#

Type Size CLR

Aligned type size based offset

DisplayName

0

8

0

UniqueID

8

8

8

Temp

16

4

16

IsRaining

20

1 (padding=3)

20

Humidity

24

8

24

One Past End of struct (size)

32

32

32

So indeed, no problems in sight; the actual offset are equal to the offsets based on aligned managed type sizes.

How did I get these data? The C++ structure memory layout can be read off the Memory window while debugging in VS2013. However, in order to obtain the offset numbers for the C# struct (MMeteoInfo) you have to use a trick, see here: assign each field of a struct object to a suitably typed variable (i.e. var 🙂 ), and evaluate the offset of the source value. (I’m open to suggestions on utilities that show memory layout of complex types in managed memory, how ever volatile.)

When we add two (or more) adjacent Boolean fields, we are in trouble

Field

Byte offset in C++, C#

Type Size CLR

Aligned type size based offset

DisplayName

0

8

0

UniqueID

8

8

8

IsOperational

16

1

16

IsOnline

17

1

20

Temp

20 (aligned at word)

4

24

IsRaining

24

1

28

Humidity

32 (aligned at 8 multiple)

8

32

One Past End of struct(size)

40

40

40

This shows that the two adjacent Boolean fields are packed together in a single word. This is the doing of default marshaling, but clearly the CLR cannot adapt to this.

We can explain, or predict the errors now:

An assignment of IsOperational grabs four bytes, so it also copies IsOnline, hence with our initializations the result is always true. Indeed, if we set both to False, IsOperational comes out false too.

An assignment of IsOnline also grabs four bytes, so also the first byte of Temp. Temp is positive in our initializations, so IsOnline is always true. Indeed, if we set Temp to zero, IsOnline comes out False.

Assignment of Temp really grabs the value of IsRaining at offset 24, so we get to see the value of IsRaining instead of the value of Temp.

The value for IsRaining is copied from the empty padding bytes at offset 28, rendering it the value 0.

The value or Humidity, a double, is taken from the first multiple of 8 bytes, at offset 32, hence it is correct.

So, now we know the MSIL instructions copy too many bytes, or bytes from the wrong location, how do we repair this error?

Solutions and Non-Solutions

A solution we do not want is to change the C++ code by using #pragma pack(…). #pragma is used to create non-portable code, pack is used to adapt the memory layout of structures. I don’t want to make code non-portable, just because I would like to add a GUI to that code.

Let’s review a number of approaches that might seem reasonable, but in fact will not work.

The MarshalAs Method

We could insert a clause like:

MarshalAs(UnmanagedType.Bool)

However, that one is meant for export of data to C++, targeting the Windows BOOL type, which is four bytes. For our 1 byte bool type we use:

MarshalAs(UnmanagedType.U1)

Then we change the definition of MMeteoInfo so we can marshal the bool variables, if we would like to:

The StructLayoutAttribute.Pack Field

If we set Packing to 4, all our troubles should be over; all fields should start at word boundary. However, mirroring the C++ layout takes precedence – the two Boolean fields keep being marshaled as two adjacent bytes, and we still get the same error message. The same result is obtained for each packing value we may choose.

Use the StructLayout Attribute with LayoutKind.Explicit

So, we cannot change the structure’s memory layout by merely specifying a packing. Next step is to explicitly specify the layout per structure field. If we do so, the definition of the structure becomes like this:

Note that the property IsOnline does not have a backing field of its own, It shares a backing field with IsOperational. After all, both bools are encoded in these same 4 bytes. This code works, except that IsOperational, and IsOnline are now dependent values.

So, how do we get the correct value for each bool out of the 4 bytes in the isOperational field?