This section contains a brief introduction to the C language. It
is intended as a tutorial on the language, and aims at getting a
reader new to C started as quickly as possible. It is certainly
not intended as a substitute for any of the numerous
textbooks on C.

The best way to learn a new ``human'' language is to speak it right
from the outset, listening and repeating, leaving the intricacies of
the grammar for later. The same applies to computer languages--to
learn C, we must start writing C programs as quickly as possible.

An excellent textbook on C by two well-known and widely respected
authors is ``The C Programming Language'', by Brian W. Kernighan and
Dennis M. Ritchie.

Dennis Ritchie designed and implemented the first C compiler on a
PDP-11 (a prehistoric machine by today's standards, yet one which had
enormous influence on modern scientific computation). The C language
was based on two (now defunct) languages: BCPL, written by Martin
Richards, and B, written by Ken Thompson in 1970 for the first UNIX
system on a PDP-7. The original ``official'' C language was the ``K
& R'' C, the nickname coming from the names of the two authors of
the original ``The C Programming Language''. In 1989, the American
National Standards Institute (ANSI) adopted a ``new and improved''
version of C, known today as ``ANSI C''. This is the version described
in the current edition of ``The C Programming Language -- ANSI
C''. The ANSI version contains many revisions to the syntax and the
internal workings of the language, the major ones being improved
calling syntax for procedures and standardization of most (but,
unfortunately, not quite all!) system libraries.

Compiling hello.c creates an executable file a.out, which is
then executed simply by typing its name. The result is that the
characters ``Hello World'' are printed out, preceded by
an empty line.

A C program contains functions and variables. The
functions specify the tasks to be performed by the program. The
``main'' function establishes the overall logic of the code. It is
normally kept short and calls different functions to perform the
necessary sub-tasks. All C codes must have a ``main'' function.

Our hello.c code calls printf, an output
function from the I/O (input/output) library (defined in the file
stdio.h). The original C language did not have any built-in
I/O statements whatsoever. Nor did it have much arithmetic
functionality. The original language was really not intended for
``scientific'' or ``technical'' computation. These functions are now
performed by standard libraries, which are part of ANSI C. The K
& R textbook lists the content of these and other standard
libraries in an appendix.

The printf line prints the message ``Hello
World'' on ``stdout'' (the output stream
corresponding to the X-terminal window in which you run the code);
``\n'' prints a ``new line'' character, which brings the
cursor onto the next line. By construction, printf never
inserts this character on its own: the following program would produce
the same result:

The first statement ``#include <stdio.h>'' includes
a specification of the C I/O library. All variables in C must be
explicitly defined before use: the ``.h'' files are by
convention ``header files'' which contain definitions of variables and
functions necessary for the functioning of a program, whether it be in
a user-written section of code, or as part of the standard C libraries.
The directive ``#include'' tells the C compiler to insert
the contents of the specified file at that point in the code. The
``<...>'' notation instructs the compiler to look for
the file in certain ``standard'' system directories.

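The void preceding ``main'' indicates that
main is of ``void'' type--that is, it has no type associated
with it, meaning that it cannot return a result on execution.
(Standard C specifies that main return an int; void main is accepted
by many compilers but is not portable.)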

The ``;'' denotes the end of a statement. Blocks of statements are
put in braces {...}, as in the definition of functions. All C
statements are defined in free format, i.e., with no specified layout
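or column assignment. Whitespace (tabs, spaces or new lines) is not
significant beyond separating adjacent words, except inside quotes as
part of a character string and in preprocessor lines such as #include,
which must each occupy a line of their own. The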
following program would produce exactly the same result as our earlier
example:

#include <stdio.h>
void main(){printf("\nHello World\n");}

The reasons for arranging your programs in lines and indenting to
show structure should be obvious!

The code starts with a series of comments indicating its purpose,
as well as its author. It is considered good programming style to
identify and document your work (although, sadly, most people only do
this as an afterthought). Comments can be written anywhere in the
code: any characters between /* and */ are ignored by the compiler and
can be used to make the code easier to understand. The use of
variable names that are meaningful within the context of the problem
is also a good idea.

The #include statements now also include the header
file for the standard mathematics library math.h. This
statement is needed to define the calls to the trigonometric functions
atan and sin. Note also that the mathematics library must be linked
explicitly when compiling, typically by adding the -lm option to the
compile command.

The compiler checks for consistency in the types of all variables
used in any code. This feature is intended to prevent mistakes, in
particular in mistyping variable names. Calculations done in the math
library routines are usually done in double precision arithmetic (64
bits on most workstations). The actual number of bytes used in the
internal storage of these data types depends on the machine being
used.

The printf function can be instructed to print integers,
floats and strings properly. The general syntax is

printf( "format", variables );

where "format" specifies the conversion specification and
variables is a list of quantities to print. Some useful
formats are

%nd      integer (optional n = number of columns; %0nd pads with zeroes)
%m.nf    float or double (optional m = number of columns,
         n = number of decimal places)
%ns      string (optional n = number of columns)
%c       character
\n \t    to introduce new line or tab
\a       ring the bell (``beep'') on the terminal

Most real programs contain some construct that loops within the
program, performing repetitive actions on a stream of data or a region
of memory. There are several ways to loop in C. Two of the most
common are the while loop and the for loop.

The while loop continues to loop until the conditional
expression becomes false. The condition is tested before each pass
through the loop, so the body may not be executed at all. Any logical
construction (see below for a list) can
be used in this context.

The for loop is a special case, and is equivalent to
the following while loop:

Infinite loops are possible (e.g. for(;;)), but not too
good for your computer budget! C permits you to write an infinite
loop, and provides the break statement to ``break out'' of
the loop. For example, consider the following (admittedly
not-so-clean) re-write of the previous loop:

You can define constants of any type by using the #define
compiler directive. Its syntax is simple--for instance

#define ANGLE_MIN 0
#define ANGLE_MAX 360

would define ANGLE_MIN and ANGLE_MAX to the
values 0 and 360, respectively. C distinguishes between lowercase and
uppercase letters in variable names. It is customary to use capital
letters in defining global constants.

In a switch statement, the appropriate block of statements is executed
according to the value of the expression, compared with the constant
expressions in the case statement. The break statements ensure that
the statements in the cases following the chosen one
will not be executed. If you want to execute these
statements, leave out the break statements.
This construct is particularly useful in handling input
variables.

The C language allows the programmer to ``peek and poke'' directly
into memory locations. This gives great flexibility and power to the
language, but it is also one of the great hurdles that the beginner must
overcome in using the language.

All variables in a program reside in memory; the statements

float x;
x = 6.5;

request that the compiler reserve 4 bytes of memory (on a 32-bit
computer) for the floating-point variable x, then put the
``value'' 6.5 in it.

Sometimes we want to know where a variable resides in memory. The
address (location in memory) of any variable is obtained by placing
the operator ``&'' before its name. Therefore
&x is the address of x. C allows us to go
one stage further and define a variable, called a pointer,
that contains the address of (i.e. ``points to'') other variables.
For example:

float x;
float* px;
x = 6.5;
px = &x;

defines px to be a pointer to objects of type float, and
sets it equal to the address of x.

[Figure: Pointer use for a variable]

The content of the memory location referenced by a pointer is
obtained using the ``*'' operator (this is called
dereferencing the pointer). Thus, *px refers to
the value of x.

C allows us to perform arithmetic operations using pointers, but
beware that the ``unit'' in pointer arithmetic is the size (in bytes)
of the object to which the pointer points. For example, if
px is a pointer to a variable x of type
float, then the expression px + 1 refers not to
the next bit or byte in memory but to the location of the next
float after x (4 bytes away on most
workstations); if x were of type double, then
px + 1 would refer to a location 8 bytes (the size of a
double) away, and so on. Only if x is of type
char will px + 1 actually refer to the next byte
in memory.

Thus, in

char* pc;
float* px;
float x;
x = 6.5;
px = &x;
pc = (char*) px;

(the (char*) in the last line is a ``cast'', which converts
one data type to another), px and pc both point
to the same location in memory--the address of x--but
px + 1 and pc + 1 point to different
memory locations.

Run this code to see the results of these different operations. Note
that, while the value of a pointer (if you print it out with
printf) is typically a large integer, denoting some
particular memory location in the computer, pointers are not
integers--they are a completely different data type.

In C, arrays start at position 0. The elements of the array occupy
adjacent locations in memory. C treats the name of the array as if it
were a pointer to the first element--this is important in
understanding how to do arithmetic with arrays. Thus, if v
is an array, *v is the same thing as v[0],
*(v+1) is the same thing as v[1], and so on:

(The expression ``i++'' is C shorthand for ``i = i +
1''.) Since x[i] means the i-th element
of the array x, and fp = x points to the start
of the x array, then *(fp+i) is the content of
the memory address i locations beyond fp, that
is, x[i].

A string constant, such as "I am a string", is an array of
characters. It is represented internally in C by the
ASCII characters in the string, i.e., ``I'', blank,
``a'', ``m'',... for the above string, and
terminated by the special null character ``\0'' so programs
can find the end of the string.

String constants are often used in making the output
of code intelligible using printf:

printf("Hello, world\n");
printf("The value of a is: %f\n", a);

String constants can be associated with variables. C provides the
char type variable, which can contain one character--1
byte--at a time. A character string is stored in an array of character
type, one ASCII character per location. Never forget that, since
strings are conventionally terminated by the null character ``\0'', we
require one extra storage location in the array!

C does not provide any operator which manipulates entire strings at
once. Strings are manipulated either via pointers or via special
routines available from the standard string library
string.h. Using character pointers is relatively easy since
the name of an array is just a pointer to its first element.
Consider the following code:

The standard ``string'' library contains many useful functions to
manipulate strings; a description of this library can be found in an
appendix of the K & R textbook. Some of the most useful functions
are:

Character level I/O

C provides (through its libraries) a variety of I/O routines. At the
character level, getchar() reads one character at a time
from stdin, while putchar() writes one character
at a time to stdout. For example, consider

This program counts the number of characters in the input stream
(e.g. in a file piped into it at execution time). The code reads
characters (whatever they may be) from stdin (the
keyboard), uses stdout (the X-terminal you run from) for
output, and writes error messages to stderr (usually also
your X-terminal). These streams are always defined at run time. EOF is
a special return value, defined in stdio.h, returned by
getchar() when it encounters an end-of-file marker
when reading. Its value is computer dependent, but stdio.h
hides this fact from the user by defining the symbolic constant EOF.
Thus the
program reads characters from stdin and keeps adding to the
counter nc, until it encounters the ``end of file''.

C allows great brevity of expression, usually at the expense of
readability!

The () in the statement (c = getchar()) says
to execute the call to getchar() and assign the result to
c before comparing it to EOF; the parentheses are necessary
here, because != binds more tightly than =. Recall that nc++ (and,
in fact, also ++nc)
is another way of writing nc = nc + 1. (The difference
between the prefix and postfix notation is that in ++nc,
nc is incremented before it is used, while in
nc++, nc is used before it is incremented. In
this particular example, either would do.) This notation is more
compact (not always an advantage, mind you), and with some compilers
it is also more efficiently coded.

The UNIX command wc counts the characters, words and lines
in a file. The program above can be considered as your own
wc. Let's add a counter for the lines.

Functions are easy to use; they allow complicated programs to be
parcelled up into small blocks, each of which is easier to write,
read, and maintain. We have already encountered the function
main and made use of I/O and mathematical routines from the
standard libraries. Now let's look at some other library functions,
and how to write and use our own.

Calling a Function

The call to a function in C simply entails referencing its name with
the appropriate arguments. The C compiler checks for compatibility
between the arguments in the calling sequence and the definition of
the function.

Library functions are generally not available to us in source
form. Argument type checking is accomplished through the use of
header files (like stdio.h) which contain all the necessary
information. For example, as we saw earlier, in order to use the
standard mathematical library you must include math.h via
the statement

#include <math.h>

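at the top of the file containing your code. Commonly used
header files include stdio.h (I/O routines), math.h (mathematical
routines), string.h (string manipulation), ctype.h (character
classification) and stdlib.h (number conversion and storage
allocation).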

Arguments are always passed by value in C function
calls. This means that local ``copies'' of the values of the arguments
are passed to the routines. Any change made to the arguments
internally in the function is made only to the local copies of the
arguments. In order to change (or define) an argument in the argument
list, this argument must be passed as an address, thereby forcing C to
change the ``real'' argument in the calling routine.

As an example, consider exchanging two numbers between
variables. First let's illustrate what happens if the variables are
passed by value:

It is standard practice in UNIX for information to be passed from the
command line directly into a program through the use of one or more
command-line arguments, or switches. Switches are typically
used to modify the behavior of a program, or to set the values of some
internal parameters. You have already encountered several of
these--for example, the "ls" command lists the files in
your current directory, but when the switch -l is added,
"ls -l" produces a so-called ``long'' listing instead.
Similarly, "ls -l -a" produces a long listing, including
``hidden'' files, the command "tail -20" prints out the
last 20 lines of a file (instead of the default 10), and so on.

Conceptually, switches behave very much like arguments to functions
within C, and they are passed to a C program from the operating system
in precisely the same way as arguments are passed between functions.
Up to now, the main() statements in our programs have had
nothing between the parentheses. However, UNIX actually makes
available to the program (whether the programmer chooses to use the
information or not) two arguments to main: an array of
character strings, conventionally called argv, and an
integer, usually called argc, which specifies the
number of strings in that array. The full statement of the first line
of the program is

main(int argc, char** argv)

(The syntax char** argv declares argv to be a pointer to a
pointer to a character, that is, a pointer to a character array (a
character string)--in other words, an array of character strings. You
could also write this as char* argv[]. Don't worry too much
about the details of the syntax, however--the use of the array will be
made clearer below.)

When you run a program, the array argv contains, in order,
all the information on the command line when you entered the
command (strings are delineated by whitespace), including the
command itself. The integer argc gives the total
number of strings, and is therefore equal to the number of
arguments plus one.

UNIX programmers have certain conventions about how to interpret the
argument list. They are by no means mandatory, but it will make your
program easier for others to use and understand if you stick to them.
First, switches and key terms are always preceded by a ``-''
character. This makes them easy to recognize as you loop through the
argument list. Then, depending on the switch, the next arguments may
contain information to be interpreted as integers, floats, or just
kept as character strings. With these conventions, the most common
way to ``parse'' the argument list is with a for loop and a
switch statement, as follows:

Note that argv[i][j] means the j-th character of
the i-th character string. The if statement
checks for a leading ``-'' (character 0), then the switch
statement allows various courses of action to be taken depending on
the next character in the string (character 1 here). Note the use of
argv[++i] to increase i before use, allowing us
to access the next string in a single compact statement. The
functions atoi and atof are defined in
stdlib.h. They convert from character strings to
ints and doubles, respectively.

A typical command line might be:

a.out -a 3 -b 5.6 -c "I am a string" -d 222 111

(The use of double quotes with -c here makes sure that the
shell treats the entire string, including the spaces, as a single
object.)

Arbitrarily complex command lines can be handled in this way.
Finally, here's a simple program showing how to place parsing
statements in a separate function whose purpose is to interpret the
command line and set the values of its arguments:

Suppose you don't want to deal with command line interpretation, but
you still want your program to be able to change the values of certain
variables in an interactive way. You could simply program in a series
of printf/scanf lines to quiz the user about their
preferences:

and so on, but this won't work well if your program is to be used as
part of a pipeline (see the UNIX
primer), for example using the graphics program plot_data,
since the questions and answers will get mixed up with the data
stream.

A convenient alternative is to use a simple graphical interface
which generates a dialog box, offering you the option of
varying key parameters in your program. Our graphics package provides
a number of easy-to-use tools for constructing and using such boxes.
The simplest way to set the integer variable n and the
float variable x (i.e. to perform the same effect as the
above lines of code) using a dialog box is as follows:

Compile this program using the alias Cgfx (see the page on
compilation) to link in all necessary
libraries.

The two create lines define the entries in the box and
the variables to be associated with them; set_up_dialog
names the box and defines its location. Finally,
read_dialog_window pops up a window and allows you to
change the values of the variables. When the program runs, you will
see a box that looks something like this:

Modify the numbers shown, click "OK" (or just hit carriage return),
and the changes are made. That's all there is to it! The great
advantage of this approach is that it operates independently
of the flow of data through stdin/stdout. In principle,
you could even control the operation of every stage in a pipeline of
many chained commands, using a separate dialog box for each.