Timing Code

At some point in the development of an application, it becomes
necessary to optimize it. And to see whether the optimization is
working, it is necessary to time the code.
Using a stopwatch is crude and inaccurate, and a little silly
since computers have accurate clocks. Let's use that built-in
clock.

The first approach is to time the execution of the entire program.
This is easily accomplished with a little shell program
that reads the current time, executes the program supplied
as an argument, reads the time again, and subtracts to get
the elapsed time.
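
A minimal sketch of such a shell program in D follows. It is an illustration, not the original listing: the helper name timeCommand and the program name timeit are hypothetical, and it assumes a current Phobos, using MonoTime from core.time as the clock and std.process to launch the program.

```d
import core.time : Duration, MonoTime;
import std.process : spawnProcess, wait;
import std.stdio : stderr;

// Hypothetical helper: runs `command` and returns the wall-clock time it took.
// The exit status of the child process is passed back through `status`.
Duration timeCommand(string[] command, out int status)
{
    immutable start = MonoTime.currTime;   // read the current time
    status = wait(spawnProcess(command));  // execute the program and wait for it
    return MonoTime.currTime - start;      // read the time again and subtract
}

int main(string[] args)
{
    if (args.length < 2)
    {
        stderr.writeln("usage: timeit program [arguments...]");
        return 1;
    }

    int status;
    immutable elapsed = timeCommand(args[1 .. $], status);
    // Report on stderr so the timing does not mix with the program's own output.
    stderr.writefln("elapsed: %s ms", elapsed.total!"msecs");
    return status;
}
```

Built as timeit, it would be invoked as `timeit myprogram arg1 arg2`.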

This, of course, includes everything in the measurement: program load
time, module construction times, cleanup time, other tasks that
happen to be running at the same moment, etc.
To get a usable result, turn off any other running programs
that consume appreciable time, and run the timing program
multiple times until the results converge.

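Timing a section of code is as simple as lifting out the three lines
of code that read the time, read it again, and report the difference,
and placing them around the section of interest.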
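
As a rough illustration, the three timing lines wrapped around a piece of work might look like the sketch below. The workload sumOfSquares is a hypothetical stand-in, and MonoTime from core.time is assumed as the clock.

```d
import core.time : MonoTime;
import std.stdio : writefln;

// Hypothetical workload standing in for the section being timed.
long sumOfSquares(int n)
{
    long total = 0;
    foreach (i; 0 .. n)
        total += cast(long) i * i;
    return total;
}

void main()
{
    immutable start = MonoTime.currTime;            // line 1: read the time
    immutable result = sumOfSquares(1_000_000);     // the code being timed
    immutable elapsed = MonoTime.currTime - start;  // line 2: read it again, subtract
    writefln("result %s, elapsed %s us",            // line 3: report
             result, elapsed.total!"usecs");
}
```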

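Pasting those lines in and out gets tedious, so the timing can
instead be wrapped in a small timer object that reads the clock
through a getCount() function when it is created. Then only one
line needs to be inserted into a scope in order to time the code
within that scope. When the object goes out of scope, the elapsed
time is computed within the destructor and the results printed.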
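
One way such a scope-based timer could be written is sketched below. The struct layout and the label parameter are assumptions made for illustration; getCount is the only place the clock is read, and here it is assumed to return MonoTime ticks.

```d
import core.time : MonoTime;
import std.stdio : writefln;

struct Timer
{
    private string label;
    private long start;

    @disable this(this);  // no copies, so the destructor reports exactly once

    this(string label)
    {
        this.label = label;
        start = getCount();
    }

    // The only place the clock is read; swap this body to change the time source.
    static long getCount()
    {
        return MonoTime.currTime.ticks;
    }

    // Runs when the Timer goes out of scope: compute and print the elapsed time.
    ~this()
    {
        immutable elapsed = getCount() - start;
        writefln("%s: %.1f us", label, elapsed * 1e6 / MonoTime.ticksPerSecond);
    }
}

void main()
{
    auto t = Timer("loop");  // the one inserted line
    long total = 0;
    foreach (i; 0 .. 1_000_000)
        total += i;
    writefln("total %s", total);
}   // t is destroyed here and the timing line is printed
```

Keeping the clock access behind getCount means the time source can be changed later without touching the code that uses the timer.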

While this works, after using it a while you'll run smack into
another problem. getUTCtime() returns the time in
milliseconds, which is far too coarse for timing smaller snippets
of code. Even worse, it calls the operating system to get the time,
so the execution of an unknown and perhaps large amount of code
gets added to the total.

The solution is the RDTSC (Read Time-Stamp Counter)
instruction, introduced with the Pentium and available on later CPUs.
It returns a cycle count in EDX:EAX, and is well suited to
profiling. All we need to do is reimplement getCount with a little
inline assembler magic.
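
A hedged sketch of an RDTSC-based getCount follows, assuming an x86 or x86-64 target and DMD-style inline assembler. Leaving the result in EDX:EAX as the return value applies to 32-bit x86 only, so this version copies the two halves into locals and combines them, which lets the same source build for both targets. It is written as a free function here; the same body could serve as the timer's getCount.

```d
import std.stdio : writefln;

// Assumption: x86 or x86-64 with DMD-style inline assembler available.
long getCount()
{
    uint lo, hi;
    asm
    {
        rdtsc;        // cycle count arrives in EDX:EAX
        mov lo, EAX;  // low 32 bits
        mov hi, EDX;  // high 32 bits
    }
    return cast(long) ((cast(ulong) hi << 32) | lo);
}

void main()
{
    immutable start = getCount();
    long total = 0;
    foreach (i; 0 .. 1_000_000)
        total += i;
    immutable cycles = getCount() - start;
    writefln("total %s, about %s cycles", total, cycles);
}
```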

On 32-bit x86, long values are returned from a function in EDX:EAX,
so this works out well. Now we have a very fine-grained timer
with a very low, and consistent, overhead.

More advanced timers along these lines can be found in std.perf.
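
For comparison, a library timer might be used as in the sketch below. It assumes a current Phobos, where the stopwatch facilities are provided by std.datetime.stopwatch; the functions sumTo and work are hypothetical workloads.

```d
import std.datetime.stopwatch : AutoStart, StopWatch, benchmark;
import std.stdio : writefln;

// Hypothetical workload.
long sumTo(int n)
{
    long total = 0;
    foreach (i; 0 .. n)
        total += i;
    return total;
}

long sink;  // keeps the benchmarked result observable

void work()
{
    sink = sumTo(1_000);
}

void main()
{
    // Time a single section by hand.
    auto sw = StopWatch(AutoStart.yes);
    immutable total = sumTo(1_000_000);
    sw.stop();
    writefln("total %s in %s us", total, sw.peek.total!"usecs");

    // benchmark calls work() 10_000 times and returns the total time taken.
    auto results = benchmark!work(10_000);
    writefln("10000 runs of work: %s us", results[0].total!"usecs");
}
```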

These timers all profile only selected sections of code.
To profile the operation of a whole program, track how
functions call each other, find out where the bottlenecks are,
and time the various interactions, you'll need a proper profiler.
Fortunately, DMD has a built-in profiler, available at the flip of
the -profile command line switch (spelled -gt in early DMD releases).