Using MISRA C and C++ for security and reliability. Part II

Editor's Note: This article was originally presented at ESC Boston 2011.

The purpose of the MISRA C and MISRA C++ guidelines is not to promote the use of C or C++ in critical systems. Rather, the guidelines accept that these languages are being used for an increasing number of projects. The guidelines discuss general problems in software engineering and note that C and C++ do not have as much error checking as other languages do. Thus the guidelines hope to make C and C++ safer to use, although they do not endorse MISRA C or MISRA C++ over other languages.

MISRA C is a subset of the C language. In particular, it is based on the ISO/IEC 9899:1990 C standard, which is identical to the ANSI X3.159-1989 standard, often called C ’89. Thus every MISRA C program is a valid C program. The MISRA C subset is defined by 141 rules that constrain the C language. Correspondingly, MISRA C++ is a subset of the ISO/IEC 14882:2003 C++ standard. MISRA C++ is based on 228 rules, many of which are refinements of the MISRA C rules to deal with the additional realities of C++.

Part One introduced MISRA C and MISRA C++ and presented the taxonomy of rules. This part reviews the rules mentioned in Part One.

The first statement sets bit 8 of the variable line_a. The second statement sets bit 7 of line_b. You might think that the third statement sets bit 6 of line_c. It doesn’t. It sets bits 2, 4, and 5. The reason is that in C any numeric constant that begins with 0 is interpreted as an octal constant. Octal 64 is the same as decimal 52, or 0x34.
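As a minimal sketch, the three statements described above could look like the following. Only the names line_a, line_b, and line_c come from the text; the variable types, the initial values, and the set_lines wrapper are assumptions made so that the fragment compiles on its own.

```c
/* Illustrative sketch: types, initial values and the wrapper
 * function are assumptions, not taken from the original listing. */
unsigned int line_a = 0U;
unsigned int line_b = 0U;
unsigned int line_c = 0U;

void set_lines(void)
{
    line_a |= 256;  /* decimal 256 = 0x100: sets bit 8 */
    line_b |= 128;  /* decimal 128 = 0x080: sets bit 7 */
    line_c |= 064;  /* leading 0 makes this octal: 52 = 0x34,
                       which sets bits 2, 4 and 5, not bit 6 */
}
```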

Unlike hexadecimal constants that begin with 0x, octal constants look like decimal numbers. Also, since octal only has 8 digits, it never has extra digits that would give it away as non-decimal, the way that hexadecimal has a, b, c, d, e, and f.

Once upon a time, octal constants were useful for machines with odd word sizes. These days, they create more problems than they’re worth. MISRA C prevents programmer error by forcing people to write constants in either decimal or hexadecimal.
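A compliant way to set bit 6 might look like the sketch below, where the mask is written in hexadecimal so that it reads the same way the compiler evaluates it. The function name set_bit6 is hypothetical and used only for illustration.

```c
/* Hypothetical helper: returns its argument with bit 6 set. */
unsigned int set_bit6(unsigned int line)
{
    return line | 0x40U;  /* hexadecimal 0x40 = decimal 64: bit 6 only */
}
```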

Typedefs that indicate size and signedness should be used in place of the basic types. (C Rule 6.3/C++ Rule 3-9-2/Advisory)

This is a portability requirement. Code that works correctly with one compiler or target might do something completely different on another. For example:
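A hypothetical loop of the kind at issue is sketched here. The array name arr, its element type, the bound of 64, and the function name are all assumptions; the only detail taken from the text is that an int variable j is multiplied by 1024.

```c
#define N_ENTRIES 64

/* Illustrative sketch: zeroes every entry that exceeds j * 1024.
 * The width of int, and therefore the range of j * 1024,
 * depends on the compiler and target. */
void clear_large_entries(long arr[N_ENTRIES])
{
    int j;

    for (j = 0; j < N_ENTRIES; j++) {
        if (arr[j] > j * 1024) {  /* product is evaluated in type int */
            arr[j] = 0;
        }
    }
}
```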

On a target where an int is a 16-bit quantity, j*1024 will overflow when j >= 32. Signed overflow is undefined behavior in C; in practice the result typically wraps to a negative number. MISRA C suggests defining a type in a header file that is always 32 bits wide. For example, one could create a header file called misra.h that defines a 32-bit type as follows:
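One possible shape for such a header is sketched below. It selects a 32-bit type by inspecting the limits in <limits.h>. The type names SI_32 and UI_32 are assumptions, chosen to match the UI_16 style of name used later in this article.

```c
/* misra.h (sketch): fixed-width type names for this project.
 * SI_32 and UI_32 are illustrative names, not mandated by MISRA. */
#include <limits.h>

#if (INT_MAX == 0x7fffffff)
typedef signed int    SI_32;
typedef unsigned int  UI_32;
#elif (LONG_MAX == 0x7fffffff)
typedef signed long   SI_32;
typedef unsigned long UI_32;
#else
#error "No 32-bit integer type found for this target"
#endif
```

The #error branch makes an unsupported target fail at compile time rather than silently choosing a type of the wrong width.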

Then the original code could be written as:
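A sketch of the rewritten loop follows, under the same assumptions as before (the array name, bound, and function name are illustrative). So that the fragment compiles on its own, SI_32 is defined here from the C99 type int32_t as a stand-in; in a real project it would come from the project header.

```c
#include <stdint.h>

/* Stand-in for #include "misra.h"; SI_32 is an assumed type name. */
typedef int32_t SI_32;

#define N_ENTRIES 64

/* j is now 32 bits wide on every target, so j * 1024 cannot
 * overflow for j in the range 0..63. */
void clear_large_entries_32(SI_32 arr[N_ENTRIES])
{
    SI_32 j;

    for (j = 0; j < N_ENTRIES; j++) {
        if (arr[j] > (j * 1024)) {
            arr[j] = 0;
        }
    }
}
```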

Strict adherence to this rule will not eliminate all portability problems based on the sizes of various types [1], but it will eliminate most of them. Other MISRA rules (notably 10.1 and 10.3) are meant to fill in these gaps.

The potential drawback to such a rule is that programmers understand the concept of an “int”, but badly-named types may disguise what the type represents.

Consider a “generic_pointer” type. Is this a void * or some integral type that is large enough to hold the value of a pointer without losing data? Problems like this can be avoided by sticking to a common naming convention. Although there will be a slight learning curve for these names, it will pay off over time.

Another problem is that using a type like UI_16 may be less efficient than using an “int” on a 32-bit machine. While it would be unsafe to use an int in place of a UI_16 if the code depends on the value of the variable being truncated after each assignment, in many cases the code does not depend on this. In some cases, an optimizing compiler can remove the extra truncations; in the rest, the extra cycles can be considered the price of safety.
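To illustrate code that does depend on truncation, here is a minimal sketch with a hypothetical sequence counter that is meant to wrap from 65535 back to 0. UI_16 is defined from the C99 type uint16_t as a stand-in, and the function name is an assumption.

```c
#include <stdint.h>

/* Stand-in definition; a project header would normally supply UI_16. */
typedef uint16_t UI_16;

/* Relies on truncation to 16 bits: 65535 + 1 wraps to 0.
 * If UI_16 were replaced by a 32-bit unsigned int, the same
 * expression would yield 65536 instead. */
UI_16 next_sequence(UI_16 seq)
{
    return (UI_16)(seq + 1U);
}
```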

1. The “integral promotion” rule states that before chars and shorts are operated on, they are promoted to int if an int can represent all the values of the original type. Otherwise, they are promoted to unsigned int. The following code will behave differently on a target with a 16-bit int (where it will return 0) than it will on a target with a 32-bit int (where it will return 65536).
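A minimal sketch of code with this behavior is shown here, assuming a 16-bit unsigned short. The function name and parameters are hypothetical. Called with 0xFFFF and 1, the sum is computed in unsigned int and wraps to 0 where int is 16 bits, but is computed in int and yields 65536 where int is 32 bits.

```c
/* Illustrative sketch: the type used for a + b is decided by
 * integral promotion, not by the declared types of a and b. */
long add_ushorts(unsigned short a, unsigned short b)
{
    return a + b;
}
```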