Pointers in C Part One: Basics

2010-01-18

A pointer variable "p" initialized
with the "address of x" (&x) points to the integer
variable "x" with value 3490. In this case "*p" is
also equal to 3490.

One of the most difficult things for people to grok is pointers,
especially in C. And since C basically requires you to know how to use
pointers if you want to program effectively, you've got to know
pointers to know C.

In C, objects (such as the integer int x) are stored in
memory, and the location at which they are stored is known as their
address.
After all, the computer has to be able to differentiate between the
memory that holds your bank account balance and the memory that holds
the number of days you've been playing WoW, and it does this
differentiation by putting the values at different addresses.

Let's get all philosophical for a minute here and ask, "What is an
address?" Consider your home address of 123 Main St.
(That's just a fake address, and any resemblance to your actual address
is a coincidence. If it is your real address, well, I know where you
live.)

So I ask you where you live, and you tell me, "123 Main St." Now I
know how to get to your house if I need to, but I'm not actually at your
house. I merely know where it is. You have not given me your house.
You have given me a pointer to your house. You've given me
the address of your house. And now I can navigate there when I
choose to.

It's a similar situation when you have, say, an int X and
you get a pointer to X. You do not have X, but you
have been given a pointer to X. You have the address
of X. And with the pointer, you can use X when
you choose to. If you have a pointer to a thing, you can read and write
to the thing.

The main benefit of this is that if you pass a pointer to a function,
the function can manipulate the object that the pointer is pointing to.

When you call a function in C, all the arguments to the function are
copied to local parameters that are named in the parameter list for the
function. Since these parameters are local to the function, modifying
them only affects the local copies within the function, and not the
original arguments in the caller.

To get around this limitation, you can pass a pointer to a variable
as an argument to a function. You're not passing the variable—you're
passing a pointer to the variable. The function can then manipulate the
variable via the pointer.

That's all there is to the basics of pointers. Seriously!

From here, we have to take on a big chunk of C syntax. Sadly, some of
the choices that were made in the design of the language make it a
little bit confusing, because "*" gets used in different ways. But
it's not so bad, and once you get it, you'll have it.

We need to be able to do two things:

Go from "an object" to "a pointer to that object". When we have
an object, we need to be able to get a pointer to that object.

And the opposite of that: go from "a pointer to an object" to
"the object" itself. When we have a pointer to an object, we need to
be able to get to the object so we can manipulate it.

Let's tackle the first of these first. We'll declare a variable, and
then print out a pointer to it. Remember, a pointer is the address of
the object, so to get the address of the object, we're going to use the
address-of operator, which is an ampersand:
&. Then we're going to print it using the
printf() format specifier "%p", which prints a pointer
in a system-defined manner.

On my machine when I ran the above code, it printed out:
"0x7fffd944f17c", which (for me) is the address of x.
For you, it will be something else, probably; in fact, it might even
change from run to run. The exact number doesn't matter, so don't worry
about it. In fact, I command you to not even think about it! The
"%p" format specifier is rarely used in real life.

Technically, the "%p" format specifier expects an argument of
type void* (pointer to void), so the line should be modified to
cast the result of the address-of:

printf("%p\n", (void*)&x); // address-of x, aka "a pointer to x"

but I left it off in the interest of clarity at this early part of
the document. We'll get to pointers-to-types quite soon.

And now the fun begins. We're going to store the address of
x in a variable (we'll really want this later when we pass the
address to a function.) But the big question is, what type should
the variable be? When you have something like "12", okay, that's an
int; and "34.9" can be stored in a variable of type
float, sure. And if you have "int x", then x
is type int. But if you have "a pointer to x", what
type is that?

It turns out, it's of type "pointer to an int", or "int-pointer".
That's right: there are special types that exist just to point to other
types.

So how does one declare such a type? It's easy: you put an asterisk
in front of the variable name, after the type:

As you see in the above example, we've used the "address-of" operator
(&) to get the address of x, and we've stored that
in the pointer-type variable y. y now points
to x.

So there we've achieved goal #1 in the list, above: we've gone from
an object to a pointer to that object by using the &
operator.

Now what about the other way? How, if we have a pointer to the
object, can we get to the object so we can manipulate it or read it? We
do a little magical operation called dereferencing the pointer.
This means, "I'm not talking about the pointer—I'm talking about the
thing it's pointing at."

In C, we do this using the indirection operator: *. Yes,
it's the asterisk again, except this time we're not using it in a
variable declaration, so the context is different. (In a variable
declaration, the asterisk means you're declaring a pointer, and in an
expression it means you're dereferencing the pointer. Unless you're
using it for a multiplication. :-) )

The C99 spec doesn't
actually talk about dereferencing except for an offhand mention in a
footnote; it prefers "indirection operator" or "* operator".
But all C programmers know what you mean if you say "dereference".

See how that works? When you pass a pointer to a variable (or an
address of a variable) to function make24more(), that function
dereferences the pointer to the thing, and adds 24 to whatever the
pointer points to.

And you'll notice we called make24more() in two different
ways. We did it by passing the pointer variable y and also by simply
passing the address of x, or "&x". Both these ways
are perfectly acceptable.

The function has to do it this way, because it wants to change the
value of x, and the only way it can do that is via a pointer to
x.

So what happens when we call make24more() is that a
local copy of the pointer is made, and in this case the copy goes by the
local name a inside the make24more() function. But the
function in this case doesn't really even care that the parameter is a
copy of the pointer, because it's not manipulating the pointer; it's
manipulating what the pointer is pointing at using the *
operator. The pointer to x stored in variable y
points to x. And the copy of the pointer to x stored
in parameter a also points to x. Dereferencing either
one will get you to x. It doesn't matter if you have my home
address, or a copy of my home address; both of them get you to my house.

This is actually quite powerful. If you have a pointer to a thing,
you can pass that pointer around to all kinds of different functions,
and they can all manipulate the thing the pointer points to. In effect,
those functions are saying, "I don't have the thing, but I know where it
is because I have a pointer to it. And I can change its value by
dereferencing the pointer."

And because it doesn't really fit anywhere else, I'm going to mention
right here that C uses the symbolic name "NULL" for pointers
that don't point to anything:

int *p; // p is uninitialized at this point
p = NULL; // now p points to nothing, explicitly

This is used to mark the end of a list of pointers, or sometimes an
error condition. For instance, if malloc() fails to
allocate memory for some reason, it returns NULL to let you
know:

Pointers in C can point to variables, elements in arrays, nodes in
lists or trees, explicit memory addresses, and even functions. But save
yourself a lot of pain: make sure you have the basics down before you
head down those paths!


Historic Comments

Long ago, when I was first learning C, I had to read and re-read the chapter on pointers. It is the least understood, most powerful, and most dangerous concept in computer programming. These are the reasons that many modern languages do not have, or discourage use of, pointers.

I truly started to understand the potential behind pointers when I read about a structure used in Amigados called a "linked list". If there's a part 2 on pointers, I bet we'll be seeing linked lists.

Even today, I still have difficulty completely understanding concepts such as const correct pointers (which, IMO, just make code more unreadable and only exist to make things more difficult).

stan423321 2010-04-13 18:42:04

Seeing I have some experts here: in C++, can I assign a pointer to void f(int a = 0) to void(*p)(void)?

Once upon a time, back in the early or middle 80's, I was learning Pascal and ran into pointers, but couldn't get it.

Around the same years, I was also learning Z80 assembly and there were instructions like "LD A, (HL)", meaning "load A with the content of the memory cell, the address of which is in the register HL". This I could get. But I didn't realise that this was in fact pointers.

It wasn't until the end of the 80's, when I was learning C and ran into pointers again, that it hit me:
Pointers are just addresses to memory cells.