The programmer writes one past the end of the array by failing to realize that you
can't safely say x[SIZE]='\0' if x is declared as x[SIZE].
Testing that you have written past the end of the array after having made
the error is too late; the program is doomed to fail once the memory is overwritten.

If the passed-in string contains only blanks, nothing in the above code stops
the loop from decrementing the loop pointer right past the beginning of the string.

If the passed-in string is empty, the loop pointer makes its first test on the character
before the start of the string.

Either of these problems can lead to a program abort. Do not use a pointer to
address memory that is before or after the end of the string; that memory may not be
accessible.

Note that a pointer may safely contain an address that is before or after
either end of the string. What is not permitted is to use that pointer to
address memory. The following is a safe use of a pointer past the end of the string,
since the pointer is moved inside the string before it is used to address memory:

Test before you store or index an array

The incorrect code below stores the integer into the array before it tests to
see if there is room for the item to be stored. This can fail if no space is left in
the array, causing memory to be overwritten and the program to fail:

If 'x' is not found in the string, the loop indexes str[-1], which is off the end of
the array and may cause a processor memory fault. The correct code does the test for
the end of the array first, and the index second:

while( p >= str && *p != 'x' ) /* CORRECT */
--p;

Don't cast that data item

Casting of pointers and values can cause problems in your code. Remove them and
rework the code so that it works cleanly without the use of casts. Using an ANSI
compiler, even the return from malloc() doesn't need to be cast.

Here, the programmer is unaware that C always promotes characters to integers when used
in expressions.

The programmer tells the compiler to chop the integer return value from getchar() down
to character size, then the compiler has to widen it back to integer to compare it with
the '\n' that is also widened to be an integer. Had the programmer left out the
cast, the integer returned from getchar() would be directly usable in the comparison with
the integer widening of '\n'.

EOF is an integer, not a character

The following code was intended to copy standard input to standard output, stopping
only at end-of-file. It doesn't work on files containing the character 255 (0xFF):

char ch; /* WRONG! */
while( (ch=getchar()) != EOF )
putchar(ch);

On most architectures, characters are signed characters. They
behave exactly like 8-bit integers, and they can be negative if the sign bit is turned on.
The character 0xFF (decimal 255) is one such "negative" character, and
when it is read from a file and stored into a character variable and then compared with
the integer constant EOF (which is -1, or 0xFFFF), the comparison succeeds. The loop
stops as soon as it reads the character 0xFF, not when it reaches end-of-file.

The character 0xFF and the integer 0xFFFF compare equal because the compiler widens
the character 0xFF to the integer 0xFFFF by sign-extending the top bit into all
the upper 8 bits of the 16-bit integer. Thus, 0xFF becomes 0xFFFF, and 0xFFFF ==
0xFFFF and the loop stops. (Remember that C always widens characters into
integers when performing arithmetic or comparisons.)

If the programmer had correctly stored the return value from getchar() into an integer,
the character 0xFF would be stored as 0x00FF and it would not compare equal to 0xFFFF.
If ch is declared correctly as an integer, the loop stops only when getchar()
returns the integer 0xFFFF (EOF).

The following code won't work, for the same reason as above (EOF can't fit into a
character):

Unfortunately, the programmer did not remember to stop the program after this
fatal error by using exit() or return(). For some unexplained reason, the program
continues on, incorrectly using the NULL pointer in the call to getc().

If a fatal error happens, stop. Do not blunder on as if nothing unusual
has happened.

Don't use garbage in stack variables

This programmer forgot that local variables have undefined, garbage values in them:

If the programmer is lucky, ch will only have a simple garbage value that prints an
ugly blotch on the terminal screen. One day though, ch will have the random garbage
value -1 (EOF) or the value 10 ('\n'), and the loop won't even execute at all.

Rewrite the loop so that the character is obtained before testing it, and so
that only good ASCII characters are sent to the terminal. (Don't make the mistake of
sending the EOF integer (-1) to the terminal! EOF is not a printable character.)

Another example of uninitialized storage:

Node *e;
e->next = NULL; /* WRONG! */

The node e points to garbage. Using a pointer to garbage as a
structure pointer variable will cause memory errors.

Compound logical operations

This programmer made sure to assign to the local variable before testing its value:

At some point, getchar() will return the integer EOF (-1). Sending this -1 to the
terminal screen as a character using putchar() is not a good thing to do. (Storing
it in a buffer or character string is also incorrect.)

The logic in the while loop is wrong. If ch is not EOF, the first half of
the condition is TRUE and the loop continues. If the character is EOF, the
first half is FALSE and the second half must be TRUE, and the loop continues. No
matter what character is read, the loop continues and never stops.

One way to arrive at the correct logical test is to consider when we want the loop to stop.
We want the loop to stop when we reach end-of-file or when the
character read is the newline, that is, stop when: ch == EOF || ch == '\n'.
If this is the stopping condition, we need its logical opposite for the going
condition in the while() loop. We can either invert the whole condition by using the
logical NOT operator:

while( ! (ch == EOF || ch == '\n') );

or we can use deMorgan's theorem to invert the separate halves, giving:

The problem with this code is that it will incorrectly use fclose() on a NULL pointer
if the file does not open properly. Don't close files that are not actually open.
(Move the fclose() right after the process() function call, where we know that fd
is not NULL.)

Don't point to local storage in a function

This programmer created a function to turn an integer into a hexadecimal character
string:

This programmer doesn't know how local variables work in C. The local variable hex
goes away when this function returns. The pointer returned by the function will be
useless, since it will point to stack storage that will sooner-or-later be re-used and
overwritten. (Good compilers may tell you about this error.)

The solution here might be to use malloc() to get the storage for the string, in which
case the calling program would have to remember to free() it; or, to use static
storage that isn't allocated on the run-time stack, in which case the calling program
needs to take other precautions. See the next section on using static storage.

Using static storage in functions

Given this improved re-implementation of the makehex() function used above:

The answer depends on the order in which your compiler evaluates arguments to
functions; but, it is one of the following two incorrect possibilities:

1 is 0x1 and 2 is 0x1 (wrong!)

1 is 0x2 and 2 is 0x2 (wrong!)

The expected output, 1 is 0x1 and 2 is 0x2, can't happen. Before
printf() can be called, both calls to makehex() have to be made to build the argument list
for printf(). Because both calls to makehex() print into the same static
storage buffer and return the address of the same static storage to printf(),
that storage either has 0x1 in it (if makehex(1) was computed last), or it has 0x2 in it
(if makehex(2) was computed last).

Watch out for static storage.

Writing error messages: Whose fault is it?

Here is a typical junior programmer error message:

if( x <= 0 )
printf("***\007\007 ILLEGAL INPUT -- RETYPE ***\n");

Junior programmers often write error messages that *** YELL AT THEIR USERS!!!
***, make loud beeping noises, and blame the users for mistakes. They don't
tell the users what input the program actually received, what is wrong with the input, nor
do they offer examples of what correct input might be.

The error message prints the name of the program issuing the error (argv[0]).

The error message does NOT YELL AT THE USER. The text is lower-case. No BEL
characters are included that make loud noises. No exclamation marks.

The error message echoes back the data value received by the program, so that typing
errors can be noticed by the user of the program. (The user might have accidentally
typed -5 instead of 15.)

The error message tells what the correct input might be, using variables or manifests
(MIN) to ensure that the same numbers are used in the test as in the error message.

Most importantly, the error message takes the point of view that the input was
"unrecognized" by the program; that is, that the program is at fault
for not being clever enough to recognize what the user meant. The message does not
lay the blame on the user.

Always write your error messages from the point of view that the user knows best, and
any input that doesn't meet the input criteria show faults in the program, not in the
user.

Using pointers that point nowhere

You must not use a pointer until you point it at something. The above code takes
the value of ptr (garbage) and then offsets from that garbage memory
location a few bytes to the thing structure element. It then copies value
into that memory.

Remember, C treats ptr->thing the same way as (*ptr).thing, which
may make it clearer why you must have a valid value in ptr before you use it.

With luck, your program will die immediately when you use an uninitialized
pointer. Without luck, you will overwrite some part of memory that you won't
discover until later.

A pointer variable is no different from an integer variable. You must set it
before you use it.

Note how the programmer has received a particular pointer value from malloc(); but, has
passed a different pointer value to free(). Perhaps the program will die
right away; perhaps it won't die until much later. (What it will not do is
free only part of the memory.)

The only values you may safely pass to free() are non-NULL values you got from
malloc(). If you need to adjust the value of the pointer returned by malloc(), make
sure you save a copy for later use by free().

Freeing the same memory twice

Some programmers confuse pointers and the memory the pointers point to. The
system library function malloc() allocates memory and returns the address of that memory.
Your program stores that address into one or more pointers:

In this program fragment, there is only one block of memory allocated, yet the
programmer is trying to free that block three times. All three pointers contain the
same address; all three pointers therefore point to the same memory. Any one
of the three pointers would do in a call to free(). Only the first call to free()
works; the other two attempt to free the same block over again and may cause your program
to abort.

What does free do?

Many C programmers forget what the argument to free() is. It is
a memory address, probably 2 or 4 bytes long. That address is the start of the block
of memory returned by, say, malloc(), realloc(), or strdup().

To free memory, you pass the address of the start of the memory. You do not pass
the entire block of memory:

Since arguments in C language are passed by value, it is a copy of the 4-byte
address in p that is sent into free(). When free()
has freed the memory at that address, the 4-byte pointer p remains
untouched and unchanged; it still has the old memory address in it. If you said free(p)
again, it would again take a copy of the old memory address, send it down to free(),
and free() would try to free that memory a second time. (This would
most likely cause your program to die.)

Writing cover functions (e.g. my_malloc)

As an experienced programmer, you now always check the return of your malloc(),
realloc(), and strdup() function calls to make sure they are not the NULL pointer.
For example:

The other cover functions, for realloc() and strdup(), would look similar to
my_malloc(). Their arguments are exactly the same number and type of arguments as
the library functions that they cover.

Now, your code is much simpler and easier to read, since the error checking doesn't
dominate your code:

struct Foo *foop = my_malloc(sizeof(struct Foo));

foop->buffer = my_malloc(MY_SIZE);

Using cover functions that cause your program to exit is only appropriate if the type
of error is serious enough that your program should exit on error. You shouldn't do
this for functions whose return values you want to check, e.g. fopen(). (Having your
program exit simply because it couldn't open the given file name would be rude.)

How wide are your arguments to printf?

The width of the typedef size_t argument to malloc() depends on
the compilation environment. Sometimes size_t is an unsigned
integer; sometimes it is an unsigned long integer. This makes it difficult to print
in a straightforward manner:

The first printf() statement fails if size_t is wider than an unsigned
integer; it doesn't match the '%u' that expects an unsigned integer. The second
printf() statement fails if size_t is narrower than an unsigned long
integer (as it is on many PC architectures); it doesn't match the '%lu' that expects an
unsigned long integer. The third statement always works, because it casts the size
to an unsigned long integer so that it always matches the '%lu' of the printf() call.

In general, you must only use arguments to printf/scanf and similar functions if you
know how wide they are, so that the width of the arguments match the '%' format specifiers
used in the format string. Here is another example:

For the same reasons as above, the first two printf() statements may fail depending on
the width of the constant MY_CONST. MY_CONST might be defined by the programmer to
be any of 1, 32767, 65535, or 123456L (a long int). Only if the width of the
constant matches the width of the printf() format will the print statement work. By
casting the argument to a known width (an unsigned long integer), it is guaranteed always
to match the '%lu' specification in the printf(). If you don't do the cast, some
values of MY_CONST will cause the print statement to fail.

On many architectures, subtracting pointers yields a long integer result. The
field width specifier to match the '*' in the printf call expects an integer. The
result is to mis-align the following string pointer, and usually cause a fault. This
happens in the following example, too, if integers and long integers are different sizes:

The programmer forgot to say '%ld' in the first printf, causing printf
to pick up only two of the four bytes of the "foo" argument. When
printf went to get the four-byte pointer for the string, it picked up the
remaining two bytes of the integer and then the next two bytes from the
pointer. The result is usually a fault.

Writing code that always works isn't easy.

C Language Modules

To hide data and functions in C, you have to put the data and the functions that use
them in the same source file and use the static keyword to confine the
scope of the identifiers to the source file.

Consider this main program that uses a module (a source file) named module.c containing
the public functions init() and dump():

#include "module.h" /* make sure prototypes match */
/* Here are some static data, local to this source file.
* The "static" keyword protects these names.
* These static items are only accessible within this source file.
*/
static int localint; /* local storage */
static char *localstr; /* local storage */
static int localfunc( int i ); /* forward prototype declaration */
/* This initialization function brings in data from outside the
* module and saves it in variables local to this source file.
* This function is "public"; it is not declared "static", so
* it can be called from outside the module. A prototype for
* this function must be put in "module.h".
* It uses a static utility function that is not visible and
* not callable outside of this source file.
*/
void
init( int num, char *str ){
localint = localfunc(num); /* call utility function */
localstr = str; /* save the data here */
}
/* This termination function undoes anything done by init().
* This function is "public"; it is not declared "static", so
* it can be called from outside the module. A prototype for
* this function must be put in "module.h".
*/
void
term( void ){
localint = -1;
localstr = NULL; /* return ptr to NULL state */
}
/* This function uses the static data saved by the init() function.
* This function is "public"; it is not declared "static", so
* it can be called from outside the module. A prototype for
* this function must be put in "module.h".
*/
void
doStuff( void ){
printf("The saved values are %d and %s\n",
localint, localstr);
}
/* This function is hidden in this file by the "static" keyword.
* It is "private": only functions in this file may use it.
* This function must *NOT* be prototyped in "module.h".
*/
static int
localfunc( int num ){
return num * 2;
}

Most modules require an initialization function that initializes or allocates data
structures used by the module. In the above example, the initialization function
brings external data into the module and saves it for later use by another public function
in the module.

If the initialization function does things that need later undoing, the termination
function does the undoing. Typical uses for the termination function are:

writing out statistics or completed data structures

freeing dynamic memory allocated by init()

closing any files opened by init()

restoring the state of data structures so that init() may be called again

The data structures and local utility functions used by the source file module are
protected from access by functions outside the module by declaring them as
"static". The "static" keyword confines the names to the one
source file in which they are compiled. Only the "public" functions have
prototypes in the "module.h" include file. The "module.h"
include file is included in module.c to make sure that the prototypes for the public
functions all match the definitions at compile time.

Understand your Compiler and Linker

Here are some common errors and warnings that people get when compiling their projects:

Warning BUFFER.CPP 42: Possibly incorrect assignment in function

The compiler is warning you about bad style in a logical condition, such as:

if( str = (char *)malloc(.....) ){

The compiler thinks perhaps you meant to say the following, but made a typing mistake:

if( str == (char *)malloc(.....) ){

Variable str is not a Boolean; it's a pointer. Don't write the
code as if str were a Boolean value. Make the test for non-NULL
explicit and the compiler warning goes away:

if( (str=(char *)malloc(.....)) != NULL ){

The above code makes it clear that str is a pointer. Both the
compiler and other readers of your code won't complain about it.

Linker Error: Undefined symbol _bf_free in module MAIN.C

Your main.c file has code that calls a function named bf_free
that has no visible definition in any of your source files. The code compiles, so
you probably have the correct function prototypes included; but, the linker can't find the
compiled function body to link to.

For a function definition to be visible to the main.c source file, the
function body has to be either:

local to the main.c source file, or

global to all source files

Some people forget that the static keyword hides a function definition
and makes it local to the source file in which it is compiled.

What is wrong with writing: printf(prompt)?

What is wrong with this code?

func( char *prompt ){
printf(prompt);
...
}

The above code works fine until the day someone calls it with a prompt containing a printf
format specification tag, perhaps similar to this:

func("Please enter the number of %s in your address:");

The prompt string gets passed to printf, which interprets the %s
as a string format argument, for which there is no corresponding string:

printf("Please enter the number of %s in your address:");

Your program will either fault or print garbage. The correct, and only safe way to
print a string using printf is this way:

printf("%s", prompt);

An alternate way to print the prompt string is not to use printf at
all:

puts(prompt);

A final observation is that prompts should always appear on the terminal and not be
redirected into output files when command line redirection is being used, so the prompt
really should be sent to the standard error unit, not to standard output:

fprintf(stderr,"%s",prompt);
fputs(prompt,stderr);

Mutually recursive #include files

Some students have #include files that include each other recursively. For
example, you might have a file of string function prototypes or data structures that use
some buffer data types, and a file of buffer function prototypes or data structures that
use some of the string data types. Thus, the file mystring.h might
need to #include "buffer.h", and the file buffer.h might need
to #include "mystring.h".

This doesn't work.

At worst, the preprocessor will get into a loop trying to read either of the two files.
At best, if you remember to put #ifndef _BUFFER_H_ in front of the buffer.h file
and #ifndef _MYSTRING_H_ in front of the mystring.h file, then one of the two files will
be only partially processed when the other file is #included, causing the #included file
to generate errors about unknown data types and invalid declarations.

Rework your source code to eliminate this mutual recursion.

sizeof() vs. strlen()

The sizeof() operator gives the compile-time size of a C-language data
structure. It does not compute the run-time length of a string. The above code
allocates five bytes of memory (four for the size of the pointer, plus one), then dies
trying to copy 13 bytes of a string into it.

Don't confuse sizeof() with strlen().

sizeof() and malloc()

The argument to malloc() should be the number of bytes of memory to
reserve. The sizeof() operator returns the compile-time size of its
argument. The size of numbytes is the size of a C-language integer;
that is 2 bytes on most PC architectures and 4 bytes elsewhere. Thus, the above code
allocates only 2 or 4 bytes of memory and then copies a long string into that memory.
This overwrites and corrupts the memory pool. The programmer won't likely
discover the error until the next malloc() or free() is
attempted, at which point the program will die because of the corrupt memory pool.

Be clear about where to use sizeof().

Modifying string constants

This code doesn't work:

char *msg = "The file name is: ";
strcat(msg,"BIGBEN.TXT");

The pointer msg points to a small area of memory containing the text
"The file name is: " followed by the NUL byte ('\0'). The strcat
function tries to add 10 more bytes to this string, catastrophically overwriting whatever
was in memory after the string and possibly causing a processor fault.

You can only use string functions to move bytes into memory that you have already
reserved.

Now, the pointer msg points to enough memory to hold the 10 new bytes
after the first NUL character; but, this memory is part of a "string constant"
compiled by the compiler. Many modern compilers put these constants into the
read-only text segment of your executing process, making it impossible to overwrite any of
this string. A processor fault may result.