Profiling

DProf is deprecated. We recommend using NYTProf instead! The ideas in this
tip are still valid, however.

Profiling is used to determine which parts of your code are taking up the
most execution time. This allows us to best decide the which locations to
focus our optimising efforts. For example, pretend some code uses two
subroutines A and B. Through testing we determine that subroutine A takes
50 time-units to execute whereas subroutine B takes only 2 time-units.

The naive approach would be to assume that our optimising efforts are best
spent on speeding up subroutine A. However, if we profile our subroutines
when used under normal conditions, we may discover that subroutine A is
only being called once per program invocation. On the other hand,
subroutine B may be called hundreds of times. With such results, a 25%
speed increase to subroutine B is likely to significantly outweigh a 50%
speed increase to subroutine A.

Good profiling tools will provide information both on which subroutines
take the longest to execute and which subroutines are called most often.
Should you need to optimise your code, use these tools to identify
potential bottlenecks and focus your efforts where they can do most good.

Profiling Perl

There are a few tools to profile Perl code and which one is best depends
on your current focus. A commonly used one is DProf and its friend
dprofpp (these both come standard with Perl). To use these type:

The first two lines tell us how many seconds the program took to run. The
first line says that the program finished 1.2 seconds after we started it
and the second line tells us that had the the code been the only other
process on the machine, it would have taken only 0.56 seconds. For the
remaining 0.64 seconds the CPU was working on other things.

The first column is the percentage of time of total program invocation that
the subroutine took. The exclusive seconds (second column) mark the time
that the subroutine used itself without including data from subroutines it
called. The cumulative seconds (third column) represent the entire time
for the subroutine including further subroutine calls.

By looking at these columns we can determine that our sub_b subroutine
was called 100 times, took an average of 0.0048 seconds each time for a
total of 0.48 seconds. This resulted in it taking up 85.7% of the program
execution time.

On the other hand, each call to sub_a took 0.08 seconds (exclusively),
16 times longer than sub_b. However, as sub_a was only
called once this resulted in sub_a only taking up 14.2% of program
execution time.

Removing sub_a would yield a 14% improvement.
Increasing it's speed by 50% would yield a speed up of 7%. However,
improving sub_b by only 25% would result in a massive 29% speed up for
our program.

These results tell us that although sub_a takes up much more time, per
call, than sub_b, sub_b still makes the best first candidate for
optimisation.