Get a real-world sense of how real users see your site: Test your load times the old-school way

Taking care of a site is kind of like farming. In this day and age, you can manage your crops with spreadsheets and high-tech equipment, but if you really want to know what’s going on, you have to get out in the field, rub some dirt between your fingers, and give it a whiff.

I was reminded of this by a really good question that was asked at the WebPerfDays event that followed Velocity EU a couple of weeks back:

Is this crazy talk? Why would anyone choose to go lo-fi when we have such incredible real-user monitoring (RUM) tools available?

Actually, there are some solid reasons for performing lo-fi performance tests:

You want to validate your RUM data, but you’re skeptical about synthetic tests.

Onload is not always a good proxy for when a page feels ready. You can capture that feeling by asking people to stop the watch when they feel a page has loaded.

You want to time when specific elements (e.g., banners, ads, social sharing buttons) appear, and get a sense of how that timing feels in the context of the entire page load.

If you’re not a hardcore performance geek, perhaps you don’t want to learn how to interpret reams of RUM data. You want a layperson-friendly approach that gives believable results.

Today I want to talk about a low-tech way to measure performance, either to validate your high-tech results or to give you an introduction to performance. I call it (drumroll) …

Stopwatch timing (pretty much what it sounds like)

This is as simple as it sounds: load a page and time how long it takes to render. It’s amazing how few people actually do this. All you need is a good stopwatch, a few testers on connections and browsers that mimic those of your users, and some patience. (Aside: In researching stopwatches and web timing I came across this archaic — in web terms — tool called StopWatch that still seems to work, despite being seven years old and developed pre-Chrome. Retro.)

I’ve always wanted a cool stopwatch like this one:

But I had to settle for the stopwatch on my phone:

I picked our own site (www.strangeloopnetworks.com) because I always pick on other people and I wanted to see if I could learn anything new about my own site with this approach.

Step 1: Gather assumptions

I made a few assumptions:

I assumed the page would be hit from a corporate network. (I’m currently in the office.)

Looking at our analytics, it looks like we have a lot of Chrome users, so I used Chrome.

Step 2: Establish a baseline with WebPagetest

I wanted to get a baseline, so I first ran a WebPagetest from our office over a FIOS network (we have fast internet here) to see what it would look like on Chrome. Here’s a closeup of the filmstrip view of the results, showing that the bulk of visible content arrives at 1.4 seconds:

Step 3: Run the clock

Then I cleared my cache and cookies, took out the stopwatch, and started to play. I didn’t get to play for long. I stopped the clock when the page seemed to fully load: at 1.2 seconds.

Step 4: Compare the results

The onload time of 1.882 seconds is fast, but it’s about 57% slower than the perceived load time of 1.2 seconds. This highlights that onload, the metric that most RUM and synthetic tools focus on, doesn’t necessarily give you the best idea of how a page feels for real visitors. (Note that there’s a good reason why tests use onload as a proxy for when a page is ready: there is no utterly consistent proxy that applies across all sites. Onload is the best we have.)
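For anyone who wants to repeat this comparison with their own numbers, the arithmetic is sketched below in Python. The function name is made up for illustration; the two timings are the ones quoted in this post.

```python
def percent_slower(measured_s: float, baseline_s: float) -> float:
    """Return how much slower measured_s is than baseline_s, as a percentage."""
    if baseline_s <= 0:
        raise ValueError("baseline must be positive")
    return (measured_s - baseline_s) / baseline_s * 100


onload_s = 1.882     # the 1.882-second figure quoted in this post
perceived_s = 1.2    # stopwatch reading from Step 3

gap = percent_slower(onload_s, perceived_s)
print(f"Measured time is {gap:.0f}% slower than perceived time")
```

Swapping in the filmstrip figure of 1.4 seconds as `measured_s` gives the gap between WebPagetest’s visual result and the stopwatch instead.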

As a side note, it was good to see that WebPagetest’s filmstrip view was very close to my stopwatch results. (I’m calling 1.4 and 1.2 seconds close.) It’s great to see that WebPagetest is capturing a pretty accurate mirror image of real-world performance.

Next step: DIY your own approach

If the idea of stopwatch timing appeals to you, I recommend mining your existing analytics and creating a methodology something like this:

Identify key times of day for your business. Traffic spikes will give you a sense of how your pages perform under load. Traffic lulls will give you a good point of comparison against spikes.

Identify connection types most used by your customers. If most of your customers are using DSL, then test from home. If a significant number of your shoppers are coming from mobile over 3G, then get out there and test for that.

Identify the most popular browsers used by your customers. You might love Chrome, but if your customers are using IE8, then that’s what you should be focusing on.

Identify where your customers live. If you’re in New York, but your customers are in the Midwest, find someone on the ground in the Midwest to test for you.

Create a schedule for running your tests. It should go something like this: “At 10am, 2pm, and 9pm ET, I will test URL X 10 times on each of the following browsers: Firefox 6, IE8, and Chrome 19, as well as on the iPad 2’s native version of Safari. The 10am and 2pm tests will take place in an office over T1 and Wi-Fi. The 9pm test will take place at home over Wi-Fi.”

Perform these tests over a meaningful amount of time — at least three days, ideally a week or longer.

Grab the median result for each set of tests and plot in a table.

When the test period is over, analyze. Compare to your fancy RUM data.
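If you log your stopwatch readings as you go, the “grab the median and plot in a table” step can be done with a few lines of Python. What follows is only a minimal sketch: the readings are invented placeholder values, not real measurements, and the slot and browser labels simply echo the sample schedule above.

```python
from collections import defaultdict
from statistics import median

# Hypothetical stopwatch readings: (time slot, browser, seconds).
# These numbers are placeholders for illustration only.
readings = [
    ("10am", "IE8", 3.1), ("10am", "IE8", 2.8), ("10am", "IE8", 3.6),
    ("10am", "Chrome 19", 1.4), ("10am", "Chrome 19", 1.2), ("10am", "Chrome 19", 1.9),
    ("9pm", "IE8", 4.0), ("9pm", "IE8", 3.3), ("9pm", "IE8", 3.5),
]


def median_by_group(rows):
    """Group readings by (slot, browser) and return the median of each group."""
    groups = defaultdict(list)
    for slot, browser, seconds in rows:
        groups[(slot, browser)].append(seconds)
    return {key: median(values) for key, values in groups.items()}


def format_table(medians):
    """Render the medians as a plain-text table, one row per group."""
    lines = [f"{'Slot':<6}{'Browser':<12}{'Median (s)':>10}"]
    for (slot, browser), value in sorted(medians.items()):
        lines.append(f"{slot:<6}{browser:<12}{value:>10.2f}")
    return "\n".join(lines)


print(format_table(median_by_group(readings)))
```

The median is used rather than the mean so that one fumbled stopwatch press does not skew a whole set of runs.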

Stopwatch timing across a flow or transaction

Measuring on a per-page basis is interesting, but it doesn’t give a complete performance picture. 96% of your page traffic is part of a flow view, as users journey through your site, and how a page loads within a flow is not the same as how it performs during a standalone page view.

There are a couple of lo-fi ways to get a hands-on look at flow performance:

Identify a sequence of pages that takes you through a transaction up to the credit card authentication process. Use the methodology above to test each of these pages. Aggregate the results and note where slowdowns occur. This process won’t take you through authentication, but it’ll still yield meaningful results.

Time the entire process, from product page to checkout. This next idea is borrowed from one of Strangeloop’s ecommerce customers. In the early days of information gathering, they gave us a company credit card, which we would use to “shop” on the site every day. The transactions were automatically nullified (which was too bad, because we picked out some really nice things), but we were able to complete the transactions and get a really neat ground-level performance picture. I’ve always wondered why more ecommerce companies don’t make it part of someone’s job description to do this a few times a day, clock the results, and chart them over time. (If your company does this, or something like this, let me know. I’d love to learn more about it.)
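For the first approach, aggregating per-page results and spotting the slowdown could look something like the Python sketch below. The page names and timings are hypothetical placeholders, and taking the median per step is just one reasonable choice, carried over from the per-page methodology.

```python
from statistics import median

# Hypothetical stopwatch readings (seconds) for each step of a flow.
# Step names and numbers are placeholders for illustration only.
flow_readings = {
    "home": [1.3, 1.2, 1.5],
    "category": [2.1, 1.9, 2.4],
    "product": [2.8, 3.5, 3.0],
    "cart": [1.7, 1.6, 2.0],
}


def summarize_flow(readings):
    """Return per-step medians, the total for the flow, and the slowest step."""
    step_medians = {step: median(times) for step, times in readings.items()}
    total = sum(step_medians.values())
    slowest = max(step_medians, key=step_medians.get)
    return step_medians, total, slowest


step_medians, total, slowest = summarize_flow(flow_readings)
for step, value in step_medians.items():
    share = value / total * 100  # this step's share of the whole flow
    print(f"{step:<10}{value:>6.2f}s {share:>5.1f}% of flow")
print(f"Total: {total:.2f}s, slowest step: {slowest}")
```

Charting the total and the slowest step over several days is what shows whether a slowdown is a one-off or a pattern.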

Conclusions

I get a thrill at the thought of billions of RUM beacons gathering tonnes of data for me to slice and dice. But I also like to get a tactile sense of how a site looks and feels. While I’m a huge cheerleader for our friends at great companies like New Relic, who are out there making this technology better by the day, I still believe there’s a lot to be learned from taking a lo-fi approach.