A C struct type may be best introduced by comparison to array
types. C array variables, as is the case with arrays in other languages,
consist of a collection of variables of the same type. For example,
in the declaration

int Z[5];

we are declaring a collection of five variables of type int. By
contrast,

struct A {
int L,M;
char C;
} T,U;

would declare T and U to each consist of a collection of three
variables; two of the three are integers, and the third is of type
character.

It is important, though, to describe this more precisely. What we
are really doing above is two things:

(i) We are inventing a new type, whose name will be A.

(ii) We are declaring two variables, T and U, of this new type A.

We also can do (i) and (ii) separately, e.g. as

1 struct A {
2 int L,M;
3 char C;
4 };
5
6 struct A T,U;

(I have added line numbers for easy reference; of course, these would
not be part of the actual .c file.) Here Lines 1-4 define this new
type A,[1] and Line 6 declares two variables, T and U, of this new
type A.[2] Note carefully that Lines 1-4 do not result in the
compiler arranging for run-time space to be allocated in memory;
only Line 6 does that. Put another way, the variables L, M and
C don't exist until Line 6 (and then there will be two of each,
one for T and one for U).

We can access the individual variables (called fields) within
a struct by using the `.' operator. For example, T.M would be the
second integer variable within T, and U.C would be the character
variable within U.
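
As a small sketch of field access (the function name SumOfMFields is made up for illustration, and is not part of any program discussed here), the following fills in T and U separately and then combines one field from each:

```c
#include <assert.h>

/* Same layout as struct A in the text: two ints and a char. */
struct A {
    int L, M;
    char C;
};

/* Fills two separate struct A variables and returns T.M + U.M,
   showing that T and U each have their own copy of every field. */
int SumOfMFields(void)
{
    struct A T, U;   /* space for L, M and C is set aside here, twice */

    T.L = 1;   T.M = 2;   T.C = 'x';
    U.L = 10;  U.M = 20;  U.C = 'y';

    assert(T.C != U.C);   /* the two C fields are distinct variables */
    return T.M + U.M;
}
```

Assigning to U.M has no effect on T.M, which is why the sum reflects both assignments.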

Suppose now that X is declared to be of type int, but Y and Q are
declared to be of type pointer-to-int (as in a declaration such as
int X,*Y,*Q;), i.e. Y and Q will contain the memory addresses
of int variables, maybe X.[4] In fact, the latter would be the
case for Y if we executed the statement

Y = &X;

The `*' operator is used to de-reference a pointer, i.e. to tell
what is in the memory location pointed to by the pointer. In the above
example, for instance, the statement

printf("%d\n",*Y);

would have exactly the same result as would

printf("%d\n",X);
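
Here is a minimal sketch of these ideas, assuming a declaration of the form int X,*Y,*Q; (the two function names are hypothetical, chosen only for this illustration):

```c
#include <assert.h>

/* Returns the int that Y points to; Y must not be 0. */
int ValuePointedTo(const int *Y)
{
    assert(Y != 0);
    return *Y;          /* de-reference: fetch what is at that address */
}

/* Stores through a pointer: this changes X itself, not a copy of X. */
int StoreViaPointer(void)
{
    int X = 12, *Y, *Q;

    Y = &X;             /* Y now contains the address of X */
    Q = Y;              /* Q points to X as well */
    *Q = 99;            /* same effect as X = 99 */
    return X;
}
```

Note that *Y can appear on either side of an assignment: on the right it reads the pointed-to variable, and on the left it writes to it.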

If a pointer has the value 0, it is considered in C to point to nothing
(even though most machines do have a memory location 0). If you attempt to
de-reference that pointer, e.g. you have an expression `*Y' above at a
time when Y is 0, your program will typically be terminated with an error
message, which on Unix is usually ``segmentation fault.''

We can add and subtract values to/from pointers, but the arithmetic works
in a different way. If P is a pointer-to-type-K, then the address P+R
is not necessarily R larger than P; it is R*sizeof(K) larger. In
the example with Y above, suppose &X is 0x5000. Then, on machines where
an int occupies four bytes, Y will be 0x5000, and Y+3 will be 0x500c,
not 0x5003.
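
The scaling can be checked directly. In this sketch (BytesAdvanced is a made-up name), the two pointers are cast to pointer-to-char before subtracting, so that the difference is measured in bytes rather than in ints:

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many bytes lie between Y and Y+R for a pointer-to-int Y. */
size_t BytesAdvanced(size_t R)
{
    int Z[8];
    int *Y = &Z[0];

    assert(R <= 8);     /* stay within (or just past) the array */
    return (size_t) ((char *) (Y + R) - (char *) Y);
}
```

On a machine with four-byte ints, BytesAdvanced(3) is 12, which matches the step from 0x5000 to 0x500c.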

One of the major uses for pointers is to reference structs. For instance,
let us extend the above struct example to

struct A {
int L,M;
char C;
} T,U,*G;

and suppose we execute the statement

G = &T;

Then we could reference the fields within T indirectly via G, by
using the -> operator. For example, G->M would refer to the same
variable as T.M.
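
The equivalence between access through G and direct access to T can be sketched as follows (ArrowDemo is a made-up name for illustration):

```c
#include <assert.h>

struct A {
    int L, M;
    char C;
};

/* Changes T.M and U.M through the pointer G, then returns their sum. */
int ArrowDemo(void)
{
    struct A T, U, *G;

    T.L = 1;  T.M = 2;  T.C = 'a';
    U.L = 0;  U.M = 0;  U.C = 'b';

    G = &T;
    G->M = 50;              /* same as T.M = 50, or (*G).M = 50 */
    assert(G->C == T.C);    /* G->C and T.C are the same variable */

    G = &U;                 /* G can be re-aimed at another struct */
    G->M = 7;               /* now this is U.M */

    return T.M + U.M;
}
```

The same expression G->M refers to different variables at different times, depending on where G currently points.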

Below, in Lines 5-248, is an example program. It is quite involved,
using structs and pointers in complex--but typical--manners. A
script record of a couple of runs of the program follows the
program listing, followed in turn by an analysis.

The comments (Lines 5-19) explain the goals of this program.
Please read these carefully before continuing.

By the way, the comments mention that we hope to save disk space
by storing the data in binary format. This will not necessarily
be the case, depending on the size of the numbers involved. If
a number is at least five digits long in decimal form, then binary
form will save space here: An int variable on a SPARCstation
takes up 32 bits, thus four bytes; in character form, i.e. when
stored by a statement of the form

fprintf(file pointer,"%d",variable name);

we would be using five bytes. On the other hand, if we knew that all
our integer values would be in the range 0-255, we could use the type
unsigned char instead of int in Lines 41-43, in which case we
would still save space by using binary form.
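
To make the comparison concrete, here is a rough sketch (the function names are hypothetical) which counts the characters that a %d conversion produces and compares that count with sizeof(int):

```c
#include <assert.h>
#include <stdio.h>

/* Number of characters that a "%d" conversion produces for Value. */
int TextBytes(int Value)
{
    char Buf[32];
    int N = snprintf(Buf, sizeof Buf, "%d", Value);

    assert(N > 0 && N < (int) sizeof Buf);
    return N;
}

/* Nonzero if storing Value in binary (sizeof(int) bytes) takes
   fewer bytes than storing it as decimal characters. */
int BinaryIsSmaller(int Value)
{
    return (int) sizeof(int) < TextBytes(Value);
}
```

With four-byte ints, BinaryIsSmaller(12345) is nonzero while BinaryIsSmaller(7) is zero, which is the trade-off described above.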

Overview: We can get a good idea of what the program is doing
by looking at main(), Lines 230-248.

First, the user is asked whether to input existing information from a
file, or to start from scratch. If the user does want to input
information from a file, the function BuildListFromFile() will read
all the records stored in the file, create structs for each one (with
the struct type defined on Lines 39-45), and link the structs together,
with the head of the chain being pointed to by ListHead. On the other
hand, if we start with nothing in the list, ListHead will be initialized
to 0.

Next, we get to the major portion of main(), the while loop.
It continues to take commands from the user, either adding new records
to the list, or deleting them when a cancellation request is made.

The user can also make other requests, e.g. that the current list be
written to a file. In the latter case, the function SaveToFile()
will carry out this request, traversing the chain, writing the data
from each struct to the designated file. This data will consist of
the three fields on Lines 41-43, but not that in Line 44,
the Link value. The Link values are irrelevant, since when the file
is read in again later on and the list recreated, the Link values
will probably be different anyway.

Line 22: The file /usr/include/fcntl.h contains some #define's
that will be needed for our I/O, e.g. a definition of the value O_RDONLY
on Line 86.

Lines 39-45: Here is our definition of the struct type which we
will use. Note carefully that ListNode is the name of the type,
not the name of a variable of that type. This type consists of three
integers, as well as a pointer variable. The latter is of type
pointer-to-ListNode. In other words, a field within one ListNode
struct will point to another ListNode struct--this is how we will link
together many such structs.

Line 48: C's typedef construct does what its name implies--it
defines a new type name. In this case, the new name is NodePtrType,
and it is defined to mean pointer-to-ListNode. Note
that we had to put the word `struct' in here, to remind the compiler
that ListNode was a struct type (Line 40); we could not have written
simply

typedef ListNode *NodePtrType;

omitting the `struct'. This is also true on Lines 58, 95, and so on.
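
A sketch of the struct definition and the typedef together is shown here. The names ListNode, Link and NodePtrType come from the discussion above; the three integer field names Val1, Val2 and Val3, and the function NextVal1, are placeholders of my own, since the actual names are not given in this section.

```c
#include <assert.h>

struct ListNode {
    int Val1, Val2, Val3;      /* placeholder names for the three ints */
    struct ListNode *Link;     /* points to the next struct in the chain */
};

/* The word `struct' is required here in C. */
typedef struct ListNode *NodePtrType;

/* Returns the Val1 field of the struct that P links to. */
int NextVal1(NodePtrType P)
{
    assert(P != 0 && P->Link != 0);
    return P->Link->Val1;
}
```

Once the typedef is in place, NodePtrType can be used anywhere that struct ListNode * could be, e.g. as the parameter type above.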

Line 51: Now that we have defined this new type, we declare
ListHead and CurrPtr to be of that type.

Line 58: Here is an example of the use of the sizeof
operator. On a SPARCstation, integers and pointers are four bytes
each, so NodeInfoSize will be equal to 12 here, but by using
sizeof rather than writing 12 directly, the program does not depend
on these particular sizes, and can be compiled on other
machine/compiler combinations.
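
Line 58 itself is not reproduced in this section, so the following is only a sketch of the idea, with placeholder field names and made-up function names. It computes the size of the three data fields with sizeof instead of a hard-coded 12, and also shows that the whole struct can be larger than the sum of its fields, since a compiler may insert padding:

```c
#include <assert.h>
#include <stddef.h>

struct ListNode {
    int Val1, Val2, Val3;      /* placeholder names for the three ints */
    struct ListNode *Link;
};

/* Bytes occupied by the three data fields (12 if ints are 4 bytes). */
size_t DataBytes(void)
{
    return 3 * sizeof(int);
}

/* Bytes occupied by the whole struct, possibly including padding. */
size_t NodeBytes(void)
{
    size_t N = sizeof(struct ListNode);

    assert(N >= DataBytes() + sizeof(struct ListNode *));
    return N;
}
```

On a machine with four-byte ints and eight-byte pointers, for example, NodeBytes() is commonly 24 rather than 20.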

Display() Function, Lines 61-70: Look at TmpPtr in Lines
63, 65 and 68: TmpPtr initially points to the head of the list (Line
63); at the end of each iteration of the loop TmpPtr is advanced to
the next struct in the list (Line 68); and the loop continues until
we get to the end of the list (Line 65). In this manner, we hit all
the structs in the list, printing out each one (Lines 66-67).

As an aid to debugging, and also a teaching aid, I have also printed
out for each struct its address and the address of the struct which
it links to.
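
The traversal pattern just described can be sketched like this. It is not the program's actual Display() function; the field names Val1, Val2 and Val3 are placeholders, and I have made the function return a count of the structs visited so that it is easy to check.

```c
#include <assert.h>
#include <stdio.h>

struct ListNode {
    int Val1, Val2, Val3;      /* placeholder names for the three ints */
    struct ListNode *Link;
};

typedef struct ListNode *NodePtrType;

/* Prints every struct in the list and returns how many there were. */
int DisplayList(NodePtrType ListHead)
{
    NodePtrType TmpPtr = ListHead;     /* start at the head of the list */
    int Count = 0;

    while (TmpPtr != 0) {              /* 0 marks the end of the list */
        /* a struct linking to itself would make this loop run forever */
        assert(TmpPtr->Link != TmpPtr);
        printf("%d %d %d (at %p, links to %p)\n",
               TmpPtr->Val1, TmpPtr->Val2, TmpPtr->Val3,
               (void *) TmpPtr, (void *) TmpPtr->Link);
        Count++;
        TmpPtr = TmpPtr->Link;         /* advance to the next struct */
    }
    return Count;
}
```

An empty list (ListHead equal to 0) is handled without any special case: the loop body simply never executes.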

Input from file, Lines 86-87: We are using the Unix functions
open() and read(). These, along with write() (Line 224), are the
basic I/O functions for Unix, not for the C language. In fact, the
C functions like fopen(), fscanf(), and so on, internally
make calls to the corresponding Unix functions; this way a C program
developed under a non-Unix environment can be moved to a Unix environment
without change. These Unix I/O functions do straight binary data
transfer--they just copy bytes from/to memory to/from a file, i.e.
without any ``interpretation'' such as that done by %d.[5]

The open() function, Line 86, has as its first argument the file
name. Its second argument states whether we are opening the file to
read only, write only or whatever (you can get a complete list of these
by typing man open). Instead of returning a file pointer, as
fopen() does, open returns an integer-valued file descriptor,
which we are storing in our int variable FD.

On Line 87 we are calling the read() function. Note that we are
reading in the whole file at once! Here we have told read() to
read the file whose file descriptor is FD (i.e. the file we just opened
on Line 86), put whatever we read into the array (buffer)
InFileArray, and to try to read MaxFileSize bytes. In most cases, the
word ``try'' here is key, because there won't be MaxFileSize bytes in
the file; then read will only read until the end of the file.
In order to let us know just how many bytes it did read, the function
will return this number, which we are storing in our variable InFileSize.
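
Here is a minimal sketch of this open()/read() pattern. The names FD, InFileArray, InFileSize and MaxFileSize follow the discussion above, but the value 1000, the function name ReadWholeFile and the error handling are assumptions of mine, not taken from the program.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define MaxFileSize 1000    /* assumed buffer size, for illustration only */

/* Tries to read up to MaxFileSize bytes of the named file into
   InFileArray.  Returns the number of bytes actually read, or -1
   if the file could not be opened or read. */
int ReadWholeFile(const char *FileName, char *InFileArray)
{
    int FD, InFileSize;

    assert(FileName != 0 && InFileArray != 0);

    FD = open(FileName, O_RDONLY);     /* file descriptor, not a FILE * */
    if (FD < 0)
        return -1;

    InFileSize = (int) read(FD, InFileArray, MaxFileSize);
    close(FD);
    return InFileSize;
}
```

The caller must supply a buffer of at least MaxFileSize bytes. If the file is shorter than that, the return value tells how much of the buffer holds valid data.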

Lines 92-113: OK, now that we have read the file data into our
array InFileArray, we can proceed to recreate our linked list from that
array. That is what is happening within this loop, which has three
major sections to it: In Line 95, we create a new struct in memory;
in Lines 97-103, we fill this new struct with the data from the array;
and in Lines 105-113, we attach this new struct to our list. More
details on each of these below.

Line 95: Here we call the function calloc, which allocates
memory for us. Its first argument states the number of objects we need
room for, and the second argument states how many bytes each object
will need; here our ``object'' is a ListNode struct. Also, we wouldn't
be able to use this newly allocated memory unless we knew where it was,
so calloc tells us, by returning a pointer to it; we store that
pointer value in TmpPtr.

The `(NodePtrType)' is called a cast, which essentially does a
type change. With older C compilers the type of the return value of
calloc is pointer-to-char (in ANSI C it is void *), but we are saying
to the compiler, ``Yes, Compiler, we know that calloc returns a value
of that type, but we will use it as type NodePtrType, i.e. type
pointer-to-ListNode, so don't worry; we know what we are doing.'' No
actual change will be made--an address is an address, no matter what
type of data is stored at that address--but this way we avoid the
compiler warning us about type-mixing.
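
A sketch of the allocation step follows. The function name NewNode and the field names Val1, Val2 and Val3 are placeholders of mine, and the check for a failed allocation is an addition which the program may or may not make.

```c
#include <assert.h>
#include <stdlib.h>

struct ListNode {
    int Val1, Val2, Val3;      /* placeholder names for the three ints */
    struct ListNode *Link;
};

typedef struct ListNode *NodePtrType;

/* Allocates room for one ListNode struct and returns a pointer to it,
   or 0 if no memory is available. */
NodePtrType NewNode(void)
{
    NodePtrType TmpPtr;

    /* one object, sizeof(struct ListNode) bytes long */
    TmpPtr = (NodePtrType) calloc(1, sizeof(struct ListNode));
    if (TmpPtr == 0)
        return 0;

    assert(TmpPtr->Val1 == 0);  /* calloc zero-fills the memory */
    TmpPtr->Link = 0;           /* not linked to anything yet */
    return TmpPtr;
}
```

The caller is responsible for eventually releasing the struct with free().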

On Line 100, we see an example of pointer arithmetic. The program
would fail if we were to replace this line with

BytePtr = (char *) (TmpPtr + J);
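
The reason is the scaling rule for pointer arithmetic: TmpPtr is a pointer-to-ListNode, so TmpPtr + J is J*sizeof(struct ListNode) bytes past TmpPtr, not J bytes past it. Line 100 is not reproduced in this section; the sketch below assumes it casts first and adds second, and uses placeholder field and function names.

```c
#include <assert.h>
#include <stddef.h>

struct ListNode {
    int Val1, Val2, Val3;      /* placeholder names for the three ints */
    struct ListNode *Link;
};

/* Copies NBytes bytes from Src into the struct at TmpPtr, one at a time. */
void FillNode(struct ListNode *TmpPtr, const char *Src, size_t NBytes)
{
    char *BytePtr;
    size_t J;

    assert(NBytes <= sizeof(struct ListNode));
    for (J = 0; J < NBytes; J++) {
        BytePtr = ((char *) TmpPtr) + J;   /* cast first: J BYTES further */
        *BytePtr = Src[J];
    }
}

/* Bytes skipped by the incorrect form (char *) (TmpPtr + J) when J is 1. */
size_t WrongStep(void)
{
    struct ListNode N[2];

    return (size_t) ((char *) (N + 1) - (char *) N);
}
```

With the incorrect form, even J = 1 would already place BytePtr beyond the end of the struct being filled.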

Lines 149ff: See the next handout, ``A Closer Look at the
Warehouse Program.''

Footnotes:

1 Don't forget to terminate the definition with a
semicolon! You will get very odd compiler error messages otherwise.

3 This is
true in any language which has pointer variables, e.g. Pascal, but
in C it is more explicit than in Pascal.

4 Note very carefully that we are NOT declaring variables named
*Y and *Q. We are declaring variables named Y and Q. The `*' symbols
are only there to let us know that Y and Q are pointers; the `*' is
not part of the names of these variables.

5 As mentioned above, the C functions call the Unix functions, so some
``translation'' may have to be done along the way. For example, suppose
we have an int variable M whose current value is 18, and we have
the call fprintf(FPtr,"%d",M). Then we want the characters `1' and
`8', i.e. the bit string 0011000100111000, to be written to the
file. But M itself consists of the bit string
00000000000000000000000000010010, so the fprintf() function
will need to convert from this latter string to the former one before
calling write.