Sometimes this is easy to diagnose: the problem code path consists of a nice simple sequence of method calls whose timings are easily measured with a bit of Stopwatch action and a few Debug.WriteLine calls. When you find that the Sort method is taking 30 seconds and everything else is taking milliseconds, it’s pretty obvious what needs fixing.
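A minimal sketch of that technique, where SortRecords and SaveRecords are made-up stand-ins for whatever methods lie on the suspect code path:

```csharp
using System.Diagnostics;

class TimingSketch
{
    // Hypothetical methods standing in for the slow code path.
    static void SortRecords() { /* ... */ }
    static void SaveRecords() { /* ... */ }

    static void Main()
    {
        Stopwatch watch = Stopwatch.StartNew();
        SortRecords();
        Debug.WriteLine("Sort took " + watch.ElapsedMilliseconds + " ms");

        watch.Reset();
        watch.Start();
        SaveRecords();
        Debug.WriteLine("Save took " + watch.ElapsedMilliseconds + " ms");
    }
}
```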

Sometimes, however, the cause is more mysterious. Okay, Sort is taking 30 seconds, but there’s nothing obvious in the Sort method that explains why it should be taking 30 seconds. Too many developers roll up their sleeves, guess where the problem must “obviously” be (“Maybe it’s because it allocates a lot of Point objects!”), and dive in. Sometimes this works.

More often, they spend hours turning nice simple code into a horribly complicated mess (“Let’s implement our own Point class, make it a mutable reference type and keep a pool of them we can reuse instead of allocating new ones!”), and end up shaving off a fraction of a percent – or even making things worse. (If you catch a colleague proposing to unroll a loop in order to make it go faster, call security and have them gently escorted away from their keyboard.)

Wiser developers know that their intuition about performance can easily lead them astray. So instead they use a profiler to measure where the application is spending its time or allocating its memory. A profiler can tell you things like how many times each method is called, how much time is spent in any given method, and how many objects are alive and how much memory they’re taking up. Guided by these measurements, developers can focus their attention where it’s going to have a real effect – on a genuine bottleneck, a method or class that doesn’t scale, or code that is leaking references.

Profiling your application used to require third-party tools, but Visual Studio 2008 includes a profiler (though not in all editions). So now there’s no excuse for guessing instead of measuring. Let’s look at an example.

Unfortunately, it’s a bit hard to come up with a performance problem that’s easy enough to explain in a blog post but hard enough to require a profiler to diagnose, so the example I’m going to use is embarrassingly artificial. Bear in mind that the point is to illustrate the use of the profiler, not to crack a real-world problem!

Our scenario is a program which, given a number n, creates strings of asterisks whose lengths are the first n Fibonacci numbers and saves them to disk, presumably as part of some sort of research into whether file system drivers get bored. (I did say it was artificial.)

It runs fine in all our test cases, but users are reporting that when n is 40 or more, the program appears to hang, chewing up 100% of the CPU. We do some testing of our own and find that we can reproduce the problem. Now to track it down!

The program constructs each string, then saves it to a suitably named file. So which bit is slow, the construction or the saving?
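The code itself appeared in the original post only as a screenshot. Here’s a plausible reconstruction of its shape, assuming Stringonacci builds each string by naive recursive concatenation; the names Stringonacci, Save and SaveFibonacciStrings come from the profiler output discussed later, while everything else is guesswork:

```csharp
using System.IO;

class FibonacciStrings
{
    // Guesswork reconstruction: a string of fib(n) asterisks, built by
    // naive recursive concatenation (each + compiles to String.Concat).
    static string Stringonacci(int n)
    {
        if (n <= 2) return "*";
        return Stringonacci(n - 1) + Stringonacci(n - 2);
    }

    static void Save(int n, string stars)
    {
        File.WriteAllText("fib" + n + ".txt", stars);
    }

    static void SaveFibonacciStrings(int n)
    {
        for (int i = 1; i <= n; i++)
            Save(i, Stringonacci(i));
    }

    static void Main()
    {
        SaveFibonacciStrings(10);
    }
}
```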

Clearly this example is trivial enough that we could answer this using the “Stopwatch and Debug.WriteLine” technique, but let’s use the profiler anyway. You can find this on the Analyze menu, and you can launch it in a number of ways. The easiest way to get started is the Launch Performance Wizard command.

The Performance Wizard first asks you which project you’d like to profile, and reminds you to switch the configuration to Release, because a debug build may have distorted performance characteristics, for example because of the debugger keeping objects alive for longer, or because certain optimisations are turned off in debug mode.

Then the wizard asks whether you’d like to profile using sampling or instrumentation. Sampling is a relatively coarse technique which involves taking snapshots of the program, like clicking the Pause button every so often and having a look at the call stack; instrumentation is finer grained but more intrusive and more heavyweight, like adding logging code into the program to record and time every method entry and exit.

For this example, we choose Sampling and finish the wizard. When we return to Visual Studio, we find a new window displayed: Performance Explorer. Right now, there’s not a lot going on here. It lists the profiling targets we’ve created – just the one so far – and provides some command buttons for using the profiler:

The most interesting command button is the third one along, Launch with Profiling. This runs our application under the profiler. Before we charge in and press it, however, note that if you drop this button down, you get another option, Launch with Profiling Paused. This is often more handy in the real world.

Suppose you have a Windows Forms or WPF application, and users are complaining that the application runs slow when they perform a particular action or when a particular situation occurs. If you ran the application under the profiler, your results would be distorted by all the steps you had to take to get to the point where things slowed down. (Also, if you are using instrumentation, profiling slows down your application a lot, so if you had to step through a lot of GUI actions to get to the trouble spot, it would be very, very tedious.)

So what you’ll typically do in such situations is launch the application with profiling paused, get it to the stage where you’re ready to reproduce the problem, and then turn profiling on (using the Start/Stop button) only for the duration of the problem code.

In our trivial example, the only thing the program does is call the SaveFibonacciStrings function, so we may as well run the whole program under the profiler. So we click Launch with Profiling, the program runs, and when it finishes, Visual Studio clunks away for a while analysing the sampling logs and then displays a performance report.

The initial Summary view shows us which functions are consuming the most run time. We can see from the top list that over 96% of our program’s run time was spent within the Stringonacci method, and we can switch to the Functions view for more detail:

So, 96.7% in Stringonacci, and 2.4% in Save. I guess there’s not much point optimising our file save code then.

The report tells us a bit more than this though. Although the program spends 96% of its time in Stringonacci, it spends only 15% of its time in Stringonacci itself. (Look at the lower chart in the Summary view, or the Exclusive Samples % column in Functions view.) The other 81% of the time is spent in the methods called by Stringonacci. In fact, the Functions table tells us we’re spending 81% of our time in String.Concat (and the functions it internally calls).

Our opportunities for optimisation, however, end with Stringonacci. We can’t improve the performance of String.Concat because it’s part of the .NET Framework. We can only change Stringonacci to call a more efficient function instead, or to make fewer calls to String.Concat.

The profiler has located the problem, but it can’t solve it for us. Fortunately, one obvious fix presents itself. Everybody knows string concatenation is inefficient! We’ll just switch over to StringBuilders! Unfortunately, in this case, it turns out that this “call a more efficient function” strategy helps, but not very much: it just means the program grinds to a halt around n=42 instead of n=40. Are we going to be forced to fall back on using our brains?

Whoa, let’s not panic here. Let’s switch over to instrumentation mode and see if the profiler can give us some more detail. You can easily do this using the Performance Explorer window:

Here’s the instrumentation summary for n=30:

Stringonacci2Helper is the method which recursively builds up the string in a StringBuilder. It is called 2.7 million times for n=30. That seems awfully high, and merits further investigation. If we rerun the program under instrumentation for different values of n, the profiler tells us that Stringonacci2Helper is called 242000 times for n=25, 392000 for n=26, 635000 for n=27… It’s an exponential increase. No wonder the program is grinding to a halt when n gets into the forties, and no wonder the improvement from using StringBuilder got eaten up so quickly!
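The growth is easy to model: if the helper calls itself once for n−1 and once for n−2, its total call count obeys calls(n) = calls(n−1) + calls(n−2) + 1, which grows like the Fibonacci numbers themselves. The absolute figures below won’t match the profiler’s exactly (they depend on program details we can’t see), but the exponential shape is the point:

```csharp
class CallCounter
{
    // Assumed recurrence: a helper that recurses once for n-1 and once
    // for n-2 makes calls(n) = calls(n-1) + calls(n-2) + 1 calls in
    // total -- exponential growth, multiplying by roughly the golden
    // ratio (about 1.62) with every increment of n.
    static long Calls(int n)
    {
        if (n <= 2) return 1;
        return Calls(n - 1) + Calls(n - 2) + 1;
    }

    static void Main()
    {
        for (int n = 25; n <= 30; n++)
            System.Console.WriteLine("n=" + n + ": " + Calls(n) + " calls");
    }
}
```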

Again, the profiler can’t identify the problem directly, but it has given us an enormous clue. An exponential increase probably means Stringonacci2Helper is calling itself, directly or indirectly, more than once. A quick look at the code confirms this:
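The code in question appeared as a screenshot in the original post; a plausible reconstruction (an assumption on my part, not the author’s actual code) looks something like this:

```csharp
using System.Text;

class CulpritSketch
{
    // Hypothetical reconstruction of the culprit: the helper recurses
    // twice -- once for n-1 and once for n-2 -- so the number of calls
    // explodes exponentially as n grows.
    static void Stringonacci2Helper(int n, StringBuilder builder)
    {
        if (n <= 2)
        {
            builder.Append('*');
            return;
        }
        Stringonacci2Helper(n - 1, builder); // first self-call
        Stringonacci2Helper(n - 2, builder); // second self-call
    }

    static string Stringonacci2(int n)
    {
        StringBuilder builder = new StringBuilder();
        Stringonacci2Helper(n, builder);
        return builder.ToString();
    }
}
```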

The culprit is found. The profiler still can’t tell us how to fix the problem, but it has led us directly to the cause. We can now investigate a variety of possible optimisations such as memoisation, direct calculation and sacking the bozo who wrote it this way in the first place.
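For example, a memoised version (a sketch of one possible fix, not the code from the post) computes each Fibonacci number at most once and then builds the string directly, turning the exponential call count into a linear one:

```csharp
using System.Collections.Generic;

class MemoisedSketch
{
    static readonly Dictionary<int, long> cache = new Dictionary<int, long>();

    // Each Fibonacci number is computed at most once; later calls hit
    // the cache, so the call count grows linearly with n.
    static long Fibonacci(int n)
    {
        if (n <= 2) return 1;
        long value;
        if (cache.TryGetValue(n, out value)) return value;
        value = Fibonacci(n - 1) + Fibonacci(n - 2);
        cache[n] = value;
        return value;
    }

    static string Stringonacci(int n)
    {
        return new string('*', (int)Fibonacci(n));
    }
}
```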

Finally, let me mention a couple of things we’ve not covered in this overview.

First, we’ve not talked much about the Performance Explorer window and the Analyze menu except as ways for launching the profiler. Performance Explorer records your profiling sessions so you can review different runs of the program. You can also compare different runs, and save the profiling information to a file – useful if you need to pass it on to another developer to review, or if your customer is savvy enough for you to ask them to run the profiler on their machine and send you the data. Similarly, we’ve only looked at a couple of the views that the Visual Studio profiler offers of the profiling data. Have a play around, or read the Visual Studio documentation, to see what more you can do with the profiler.

Second, I want to emphasise that the profiler is a diagnostic tool, not a prophylactic one. I once had a boss who, after a customer had identified a performance issue and we had solved it using a profiler, suggested we should profile everything before releasing it. This is totally impractical. Profiling is expensive. You don’t use it unless you need to.

For a start, instrumentation is insanely expensive in run time: on my machine, the Fibonacci example normally conked out around n=40, but I had to use n=30 and below for the instrumentation tests, it was that much slower. More important, profiling is expensive in programmer time. It takes time to perform a profiling run, to interpret the results, and to determine what needs fixing and how to fix it. Worst of all, speculative profiling tempts you to fix things that don’t need fixing. You’ll always have a “most expensive” function: the question is not whether one of your methods is using more time than the others, but whether it’s using more time than is acceptable to users.

With that caveat, though, the Visual Studio 2008 profiler is a jolly useful tool to have around. You won’t need it every day, but when you do need it, it can be a lifesaver.

I just want to point out a few things:
- Since you sampled the call stack, you could have displayed the call sites that are on the most samples. That would instantly identify both issues. The call to string concat would be on nearly every sample, and the self-calls would be on nearly every sample. Looking at those call sites would instantly pinpoint the problems.
- Since the problems are on 80-90% of samples, having 800+ samples is overkill. 10 would be more than enough.
- Summarizing at the function level, and distinguishing between inclusive and exclusive time, is just obfuscating, when you actually have instruction-level information.

Comment by Mike Dunlavey, on 26-Feb-2009 08:38

Just to show you what I mean, I took half a dozen samples of your program. Except for the terminal line, they all look about the same:
Stringonacci Line 18,25,26,27
Stringonacci Line 26
....
Stringonacci Line 25
Stringonacci Line 26
SaveFibonacciStrings Line 11
Main Line 31
So what? So lines 25 & 26 are on every sample, whether you take 10 or a 1000, so they each cost 100% of the time, and they tell you exactly where to look. NO DETECTIVE WORK.
This is a toy program, but the same principle applies no matter how large the software is.
When are these profiler honchos going to "get it"?

Comment by Raja, on 22-Jul-2009 22:41

Can I use the Performance Wizard for a Windows service application?


About the Visual Studio 2008 Blog

In the years since the hugely successful release of Visual Studio 2005, Microsoft has used developer feedback from all over the world to introduce over 250 new features in the 2008 release.

This sponsored blog will bring Visual Studio 2008 tips and tricks from well known developers in the Australian and New Zealand tech community directly to you.

Disclaimer

The Visual Studio 2008 blog is sponsored by Microsoft NZ. The blog posts are the authors' genuine accounts of their experiences with Visual Studio 2008 and are not influenced or filtered by Microsoft NZ in any way.
