Normally, you use these functions assuming they'll do exactly what you say:
"Sleep for 8 seconds!", "Sleep for 1 minute!",
and so on.
Unfortunately, you get into problems when you say
"Sleep for 1 millisecond, ten thousand times!"

Delaying for a second: inaccurate code
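
The example being discussed tries to pause for one second by sleeping for
1 millisecond a thousand times. A minimal sketch of that loop (the function
name is ours; on QNX, delay() takes its argument in milliseconds):

    #include <unistd.h>     /* delay() on QNX */

    void one_second_pause(void)
    {
        int i;

        /* Try to wait 1000 milliseconds. */
        for (i = 0; i < 1000; i++) {
            delay(1);       /* "sleep for 1 millisecond" */
        }
    }

Will this return after one second?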

Unfortunately, no, this won't return after one second on IBM PC hardware.
It'll likely wait for three seconds.
In fact, when you call any function based on nanosleep() or select()
with an argument of n milliseconds, it can take anywhere from n milliseconds
to infinity.
But more than likely, this example will take three seconds.
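
If you want to see the effect for yourself, one hedged way is to time the
loop with clock_gettime(). Since delay() is QNX-specific, this sketch calls
nanosleep() directly; the exact total you see will depend on your system's
tick rate:

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct timespec start, end;
        struct timespec req = { 0, 1000000 };   /* 1 ms */
        int i;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (i = 0; i < 1000; i++) {
            nanosleep(&req, NULL);              /* "sleep for 1 ms", 1000 times */
        }
        clock_gettime(CLOCK_MONOTONIC, &end);

        printf("elapsed: %.6f s\n",
               (end.tv_sec - start.tv_sec) +
               (end.tv_nsec - start.tv_nsec) / 1e9);
        return 0;
    }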

So why exactly does this function take three seconds?

Timer quantization error

What you're seeing is called timer quantization error.
One aspect of this error is actually something that's so well understood
and accepted that it's even documented in a standard:
the POSIX Realtime Extension (1003.1b-1993/1003.1i-1995).
This document says that it's all right to delay too much, but it
isn't all right to delay too little—the premature firing of a timer is undesirable.

Since the calling of delay() is asynchronous with the running of
the clock interrupt, the kernel has to add one clock tick to a
relative delay to ensure the correct amount of time (consider what would happen if it didn't,
and a one-tick delay was requested just before the clock interrupt went off).
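
To make the parenthetical concrete: without the extra tick, a one-tick delay
requested 0.1 tick before the next clock interrupt would expire at that very
interrupt, after only 0.1 tick of real time; with the extra tick added, it
expires one interrupt later, 1.1 ticks out, which is never shorter than what
was asked for.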

Figure 1. A single 1 ms sleep with error.

That adds anywhere from zero to one full tick of extra time, averaging out
to half a millisecond per call, but in the example given, you end up
synchronized with the clock interrupt, so the full millisecond gets tacked
on each time.

Figure 2. Twelve 1 ms sleeps with each one's error.

The small error on each sleep accumulates:

Figure 3. Twelve 1 ms sleeps with the accumulated error.

OK, that should make the loop last 2 seconds—where's the extra second coming from?

The tick and the hardware timer

The problem is that when you request a 1 ms tick rate, the kernel may not be able
to actually give it to you because of the frequency of the input clock to the timer hardware.
In such cases, it chooses the closest tick rate it can achieve that's faster
than the one you requested.
In terms of IBM PC hardware, requesting a 1 ms tick rate actually gets
you 999,847 nanoseconds between each tick.
With the requested delay, that gives us the following:

1,000,000 ns requested + 999,847 ns added by the kernel = 1,999,847 ns of adjusted delay

1,999,847 ns / 999,847 ns = 2.000153 ticks before the timer expires

Since the kernel expires timers only at a clock interrupt, the timer expires
after ceil(2.000153) = 3 ticks, so each delay(1) call actually waits:

999,847 ns * 3 = 2,999,541 ns

Multiply that by 1000 for the loop count, and you get a total loop time of 2.999541 seconds.
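
As a check on the arithmetic (and on where the 999,847 ns figure comes from),
here's a small stand-alone sketch. The 1,193,182 Hz input-clock frequency is
our assumption about the classic PC timer chip, not something stated above;
it happens to reproduce the numbers exactly:

    #include <stdio.h>

    int main(void)
    {
        /* Assumed input clock of the classic PC timer chip (~1.193182 MHz).
           The tick period must be a whole number of input-clock cycles, so
           pick the largest period that doesn't exceed the 1 ms request.     */
        unsigned long long clock_hz     = 1193182ULL;
        unsigned long long requested_ns = 1000000ULL;                        /* 1 ms      */
        unsigned long long cycles       = requested_ns * clock_hz / 1000000000ULL;
        unsigned long long tick_ns      = cycles * 1000000000ULL / clock_hz; /* 999,847   */

        /* One tick is added to the relative delay, and the timer can only
           expire at a clock interrupt, so round up to whole ticks.          */
        unsigned long long padded   = requested_ns + tick_ns;                /* 1,999,847 */
        unsigned long long ticks    = (padded + tick_ns - 1) / tick_ns;      /* ceil -> 3 */
        unsigned long long per_call = ticks * tick_ns;                       /* 2,999,541 */

        printf("tick period:  %llu ns\n", tick_ns);
        printf("per delay(1): %llu ns\n", per_call);
        printf("whole loop:   %.6f s\n", per_call * 1000 / 1e9);
        return 0;
    }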