Pointers C's way

Pascal goes in for pointers but C goes in for them in a big way!

A C pointer to an integer is declared as

int *a;

When a pointer is first declared its value is undefined, so it has to be given a value before it is used. You can assign the special value NULL, which plays the same role as Pascal's nil. You can allocate memory to a pointer using the function malloc(n), which returns a pointer to n characters of storage (or NULL if the request can't be satisfied), and free(pointer) deallocates the memory that pointer points at.

In C the value that a pointer points at is written as *a. C is different to Pascal in that you can derive the address of any variable using the & operator.

For example, if i is an integer variable, &i is a pointer to its value. This allows pointers to be set to point at named variables - thus you could say that Pascal uses only anonymous variables for pointers and C can use named variables as well.

For example,

a=&i;

sets a to point at i.

So in C you can set a pointer to an existing variable or dynamically allocate storage under its control.
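As a minimal sketch of the first option, pointing at an existing named variable, the following uses a made-up function name, address_of_demo, which is purely illustrative:

```c
#include <stddef.h>

/* Hypothetical demo: points a at the named variable i and
   changes i through the pointer. Returns the final value of i. */
int address_of_demo(void)
{
    int i = 5;
    int *a = NULL;   /* a points at nothing yet */

    a = &i;          /* a now holds the address of i */
    *a = 42;         /* has the same effect as i = 42 */
    return i;
}
```

After the assignment through *a, the variable i itself has changed, even though i was never mentioned on the left of an assignment.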

The only problem is that, as malloc works in terms of characters or bytes of storage, it looks as though you can only dynamically allocate character pointers. Fortunately there is a solution to this problem. You can allocate storage for any data type using an expression like:

pointer = (type *) malloc(sizeof(type));

Although it looks like a function, sizeof is an operator. It gives the number of bytes (that is, characters of storage) needed to store a single variable of the required type.

For example an integer pointer could be allocated storage using:

a= (int *) malloc(sizeof(int));
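Putting the allocation pattern to work might look something like the sketch below. The helper name new_int is an assumption made for illustration, and the check for NULL is there because malloc can fail:

```c
#include <stdlib.h>

/* Hypothetical helper: allocates storage for one int on the heap
   and stores value in it. Returns NULL if malloc fails. */
int *new_int(int value)
{
    int *a = (int *) malloc(sizeof(int));
    if (a != NULL) {
        *a = value;   /* only write if the allocation succeeded */
    }
    return a;
}
```

A caller would typically write int *p = new_int(7); and then, once finished with the value, free(p); to hand the storage back.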

Notice that C has introduced the whole idea that the programmer should have access to and manage memory in this very low level way. This is one of the attractions and disadvantages of C. More modern languages don't expect you to manage the memory - they do it for you.

A surprising feature of C pointers is the ability to do pointer arithmetic. This is possibly the feature that made C into such a widely adopted language - when you first encounter it pointer arithmetic seems all-powerful. This is because it is a direct translation of the address calculations you do in assembler into operations on pointers.

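If a is a pointer to int then a+1 isn't the pointer value plus one but the pointer value incremented by the amount of storage that the pointer's type takes to store. In other words, measured in bytes, the address in a+1 is the address in a plus sizeof(int) and in general the address produced by:

pointer+1

is the address in pointer advanced by:

sizeof(type)

bytes. Notice that this scaling is done for you - if you actually write a+sizeof(int) in a program you move on by sizeof(int) integers, not by one.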

All other pointer arithmetic is done in the same way using the size of the base type as the fundamental unit.
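One way to see the scaling is to measure the gap between a pointer and the pointer plus one in bytes, by viewing both as char pointers. This is only a sketch, and the function names int_step and double_step are invented for the purpose:

```c
#include <stddef.h>

/* Hypothetical demo: number of bytes between a and a+1
   when a is a pointer to int. */
size_t int_step(void)
{
    int i = 0;
    int *a = &i;
    /* char pointers count in single bytes, so the difference
       is the true distance in storage */
    return (size_t) ((char *) (a + 1) - (char *) a);
}

/* The same measurement for a pointer to double. */
size_t double_step(void)
{
    double d = 0.0;
    double *p = &d;
    return (size_t) ((char *) (p + 1) - (char *) p);
}
```

The first function should give sizeof(int) and the second sizeof(double), whatever those happen to be on your machine.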

This single idea of making pointer arithmetic depend on the size of the base type of the pointer is responsible for much of C's unifying view of data structures - and as already mentioned possibly for its popularity.

For example, suppose you allocate storage for say ten integers using:

a = (int *) malloc(sizeof(int)*10);

then a+i is a pointer to the ith integer, counting from zero, and so *(a+i) is the value of the ith integer.
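As a rough illustration, a block of integers can be filled and read back using nothing but *(a+i). The function name sum_of_squares and the choice of storing squares are assumptions made just to have something to compute:

```c
#include <stdlib.h>

/* Hypothetical demo: allocates n ints, stores i*i in the ith one
   using pointer arithmetic, and returns their sum.
   Returns -1 if n is not positive or the allocation fails. */
long sum_of_squares(int n)
{
    int *a;
    long total = 0;
    int i;

    if (n <= 0) {
        return -1;
    }
    a = (int *) malloc(sizeof(int) * (size_t) n);
    if (a == NULL) {
        return -1;
    }
    for (i = 0; i < n; i++) {
        *(a + i) = i * i;      /* write the ith integer */
    }
    for (i = 0; i < n; i++) {
        total += *(a + i);     /* read the ith integer */
    }
    free(a);
    return total;
}
```

Calling sum_of_squares(10) works on exactly the ten-integer block described above.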

Sound familiar?

Well it should because it is nothing more than a simple array of ten integers. When you define a C array using:

int a[10];

then a names an area of storage large enough to hold ten integers and, in almost every expression, it is converted into an integer pointer to the first of them. When you use a[i] this is exactly the same as *(a+i). The only real differences are that when a is declared as an array it behaves as a pointer constant and cannot be modified to point at some other area of storage, and that sizeof(a) gives the size of the whole array rather than the size of a pointer. This ability to treat arrays as pointers is very powerful.

For example, if j is an integer pointer then j=a+i sets j to point at a particular element of the array and you can retrieve the value of the element using just *j any time that it is needed.
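The equivalence is easy to check for yourself. In the sketch below both function names are made up, and the values stored are arbitrary:

```c
/* Hypothetical demo: returns 1 if a[i] and *(a+i) agree
   for every element of a ten-integer array, 0 otherwise. */
int index_matches_pointer(void)
{
    int a[10];
    int i;

    for (i = 0; i < 10; i++) {
        a[i] = i * 3;
    }
    for (i = 0; i < 10; i++) {
        if (a[i] != *(a + i)) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical demo: j points at element 4 and the array
   is changed through it. Returns the new value of a[4]. */
int write_via_element_pointer(void)
{
    int a[10] = {0};
    int *j = a + 4;   /* pointer to a particular element */

    *j = 99;          /* same as a[4] = 99 */
    return a[4];
}
```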

In standard C pointers are essential because, although C always passes parameters by value, passing a pointer to a variable gives the effect of passing it by reference and so lets you create input/output parameters.

For example, if you want to write a function that swaps values then you have to write something like:

void swap(int *a, int *b)
{
    int temp;
    temp=*a; *a=*b; *b=temp;
}

and call the function as swap(&x,&y).

Notice how nicely the idea that an array name is a pointer fits into this scheme. When you pass an array name it is already a pointer so you don't have to place an & in front of it.
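To make the contrast concrete, here is a sketch in which a pointer-based swap is called with & for ordinary variables and an array is handed to a function with no & at all. The function reverse is a hypothetical example, not something from the standard library:

```c
/* Exchanges the two integers that a and b point at. */
void swap(int *a, int *b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

/* Hypothetical helper: reverses the n integers starting at v.
   v+i is already a pointer to the ith element, so no & is needed. */
void reverse(int *v, int n)
{
    int i;
    for (i = 0; i < n / 2; i++) {
        swap(v + i, v + (n - 1 - i));
    }
}
```

For two integer variables you would call swap(&x,&y), but for an array declared as int v[5] the call is simply reverse(v,5).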

C probably does the best job of converting the low level concept of address indirection into a higher level form. However it too has all of the problems that assembler and Pascal have - you can all too easily dereference a null pointer or dereference a pointer that no longer points at anything valid.
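C gives you no automatic protection here, but a common defensive habit is to reset a pointer to NULL when its storage is freed and to test for NULL before dereferencing. The sketch below shows one way of doing this. The helper names release and read_or are invented for illustration, and the technique only protects the one pointer variable you reset, not any copies of it:

```c
#include <stdlib.h>

/* Hypothetical helper: frees the storage *pp points at and sets
   *pp to NULL so that later tests can see it is no longer valid. */
void release(int **pp)
{
    if (pp != NULL) {
        free(*pp);    /* free(NULL) is a harmless no-op */
        *pp = NULL;
    }
}

/* Hypothetical helper: returns the value p points at,
   or fallback if p is NULL. */
int read_or(const int *p, int fallback)
{
    return p != NULL ? *p : fallback;
}
```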

Modern languages

More modern languages such as C++ or C# have stopped this untidy use of pointers to pass parameters by reference but pointers are still sometimes useful.

In both C and Pascal pointers within records (C calls them structures) allow you to create very advanced data structures. In C++ this has been extended to include pointers to objects and pointers within objects.

Although C++ is a very advanced object oriented language it cannot entirely break free from its lower level past. For example, the only way to call a method polymorphically is through a pointer or a reference to the object - which is a strange mix of old and very new.

Languages such as C#, Java, Python, Ruby and even JavaScript have done what they can to completely replace pointers by "references" which can be thought of as strongly typed pointers to data structures/objects that the system automatically constructs for you. In other words you generally don't have to allocate memory for a reference to point at. In fact you can start to forget that references are addresses of memory locations and start to think of them as pointers to objects.

However, references generally don't permit you to use pointer arithmetic and at least one high level language, C#, has added a pointer type simply because sometimes pointer arithmetic is just the simplest way of getting the job done.

You shouldn't be too convinced by the C# inclusion of pointers. As languages such as Java, JavaScript and so on have proved, you really don't need low level pointers and pointer arithmetic as long as the language supports sufficiently sophisticated data structures.