Sunday, March 16, 2008

Performance testing mistakes (I've made)

I've been testing for Performance for eight years now, more or less. When I started testing, I didn't even know I'm supposed to test for "Performance", I was simply the- dude-who-knew-VUGen. Neither me, nor my manager cared much about performance testing methodologies: I'd simply record a script, run it with multiple users and report my findings. This is a very bad attitude to run performance tests, as I have found out in several heated QA-dev meetings. The worst part was seeing my credibility deteriorate, and my hard earned findings doubted. Over the time I managed to pinpoint my mistakes and win my credibility back, but the process was long and painful. In this post I'll discuss some of the biggest performance methodology mistakes I have committed over the years.

1. Configuration problems. This one was the hardest on me. You run a test, you find what you think is valuable performance issue, you call some developers and have them investigate your system, only to find out that you ran out of disk space. Or the debug level was to high. Or the memory wasn't properly configured.

Always verify your system before a test. Read all the Read Me's, even the ones no other QA tester reads. Same goes for Best Practices. In one of the companies I worked for, we found a DB configuration document no one even knew existed, and we configured our system properly. Prepare a checklist a verify it before each test. Misconfiguration will bite you. Hard.

2. Mixing conclusions with findings. I love monitors. I disperse them generously across the applications and servers and monitor all relevant process (and some which aren't). In the end of the day, I sometimes have 100-200 monitors per server. While this is not a problem in itself, it became one when I didn't know what to report.

When asked to present my conclusions, I used to report anything which held a trend: Working Sets increasing, Context Switch peaks, Threads amount changing. The graphs looked impressive and I felt like I have a lot of "meat" in my reports. Problem started when people started asking: "well, what does it mean?" , "isn't it normal?" or even "is it good or bad?"

In many cases, I had no idea. I figured that other R&D personnel knew what those stats meant even when I didn't. Well, they didn't either.

Many trends are cyclic by nature (resource consumption is increasing and then decreasing), some reach a flat plateau after increasing and some rise and fall like saw teeth. In most cases, Monitors of hardware usage only give you hints of the problem and rarely the problem proper. There are some exceptions to this rule: high CPU consumption over a prolonged period of time is bad in most cases, for example.

The findings are important, as they help know your application better and know it baseline. If you're already familiar with specific monitors (I/O of process x, for example) following its trend can help you detect when an application is misbehaving, but in most cases monitors results are best left out of your conclusions section and kept in the findings baseline. Report only things which are definitely bad and you're sure about.

3. Fighting your previous war. There's a saying about the army that it's always preparing for the previous war instead of the next one. While it's always fun to make jokes about the army, most people handle the current project similar to the way they handled the previous one. I once had my engineers spend a week trying to locate memory leaks which weren't there, only because the previous product I tested suffered from memory leaks. In another time, it took me some time to adjust to the fact that the load on the system I tested is not caused by user activity, but by amount of data flowing into the system. Bye-bye 100 concurrent users load test.

Performance problems on different systems are caused by different factors affect different components. Study the system you test, learn what hurts it and how does it express its pain. Design your tests after you are familiar with the system and try to verify with yourself whether you're testing what you're testing because you're used to it or because it's relevant to the current product. Fight the current war, not the previous one. Or you can join the army if you like the uniform, though.

A subsection of this issue would be to test what's easy for you test. Nice try, but no. Don't.

4. Convoluted reports (no one reads). I'm not intimidated by putting together a 200 pages summary document, I actually take a perverse pleasure in it. It had taken me some time to understand no one really reads them if you simply throw the reports in their faces (or in their inbox, for that matter). Having a 10 MB, 200 pages monster uncoil before your eyes is a scary sight when you're unprepared and the natural reaction of most people I sent the report to was to close it and "read it later".

So I changed the format of my reporting and now my reports get the attention they deserve (most of the time, anyway). R&D managers are like ADHD children: they have short attention span and they like bright and shiny things, so in order to get their attention I compose a mail they can easily digest. The secret is to put nice colorful graphs and simple explanations summarizing your most important conclusions in the mail you send to the relevant people. No more than 2-3 issues. Put the rest of the report online somewhere and attach a link to it. Most people will ignore it, so you better make sure that executive summary count. Colorful graphs, simple explanations and no more than 2-3 conclusions. Easy.

About Me

I'm in QA since 2000, started by specializing in Performance testing and later set up a QA group from a tiny start-up phase to corporate. I also collect and paint miniatures, raise two daughters and a boy and ride a bike. Usually not at the same time.