Page Test Tools: Which results should you trust?

When I work with a client to help them understand what kind of performance gains they can expect with Strangeloop, one of the first things we do is benchmark their site’s current performance. Right out of the gate, this can prove to be a major stumbling block, because no two tests return the same results. Part of my job is walking people through the various results and explaining which data is meaningful for their purposes, and which isn’t.

To illustrate, I put a high-traffic site, Target.com, through its paces on a handful of commonly used tests — Gomez, Keynote, HTTPWatch and Webpagetest* — to see how it performs on first views and, where possible, repeat views (click through the links below to see the test results):

The first column – page load time (first view) – is the time it took for the Target.com home page to fully load for first-time visitors. The second column – page load time (repeat view) – shows how long it took for the home page to load for a previous visitor whose cache had not been cleared.

Both these columns are meaningful. You definitely want new visitors to your site to have a good experience, given the evidence that 80% of first-time visitors who have a poor experience will not return. But you also need to give your repeat visitors an excellent experience, because studies indicate that they have higher expectations of your site than newcomers do.

Today’s focus: first-time visitors

For the purposes of this post, however, I’m going to focus on first-time visitors, as this is the metric that most people are initially interested in, and we know this metric has a dramatic effect on conversion. (See the third graph in this post for evidence.)

As I’ve written in the past, the emerging benchmark is a sub-2-second page load time. In this set of tests, the first-view numbers ranged from 1.383 seconds for Gomez to 10.180 seconds for HTTPWatch (on Firefox from my office).
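To make the size of that gap concrete, here is a minimal Python sketch that checks the two figures quoted above against the 2-second benchmark. The function and variable names are my own, chosen for illustration; only the two load times come from the tests.

```python
BENCHMARK_SECONDS = 2.0  # the sub-2-second target discussed above

# First-view load times quoted in this post (fastest and slowest result).
first_view_seconds = {
    "Gomez": 1.383,
    "HTTPWatch (Firefox)": 10.180,
}


def meets_benchmark(seconds, benchmark=BENCHMARK_SECONDS):
    """Return True if a page load time is under the benchmark."""
    return seconds < benchmark


def spread_ratio(times):
    """Return how many times slower the slowest result is than the fastest."""
    return max(times.values()) / min(times.values())


for tool, seconds in first_view_seconds.items():
    verdict = "meets" if meets_benchmark(seconds) else "misses"
    print(f"{tool}: {seconds:.3f}s {verdict} the {BENCHMARK_SECONDS:.0f}s benchmark")

print(f"Slowest result is {spread_ratio(first_view_seconds):.1f}x the fastest")
```

The same page passes the benchmark on one tool and misses it badly on another, which is why a raw number means little until you know what environment produced it.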

But how is this discrepancy possible? And which results are most applicable? To answer these questions, you have to understand one thing:

Different page test tools are designed to test in different environments

If you’re a Gomez or a Keynote client, then you are most likely using their backbone tests, named for the fact that they take place over the backbone of the internet. Backbone tests tell you how fast your site loads at major internet hubs, but because you’re skipping the “last mile” between the server and the user’s browser, you’re not seeing how your site actually performs in the real world.

If you’re testing with HTTPWatch, then you’re finding out how a website performs on your own desktop. If you’re testing while at work, using your company’s souped-up connection, then you’re going to get zippy results. Testing the same site at home on your shared DSL line, while your neighbor is downloading seasons one through four of Lost… well, that’s going to be a much less zippy experience. HTTPWatch is a good way to test how a site works for you, but not necessarily how it works for your users.

If you’re running Webpagetest, you’re testing in an environment that attempts to simulate real-world user environments and browser behavior. For instance, if your users live in mid-sized urban locations, you can test from Webpagetest’s servers in Dulles, Virginia (which is where I ran these tests). If your users are overseas, you can test from international servers spanning the UK, China and New Zealand.

In order to understand which test is best for you, you need to understand your real users’ environment.

Which brings me to my next point…

How to identify your users’ environment

You need to find out three things:

Where your users live. Are they in major urban centers? Are they overseas? Do they live in small towns?

What kind of browser are they using?

What kind of internet connection are they using? Cable? DSL? T1? You might be surprised.

All this information is readily available through your analytics tools. After you’ve studied it, come up with a technical profile (or set of profiles) for your users. Then use this profile to customize your page test parameters to get the most accurate results for your site. That’s your benchmark.
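As a rough illustration of turning analytics data into a profile, here is a small Python sketch. It assumes you can export one (location, browser, connection) record per visit from your analytics tool; the sample records and the `top_profiles` helper are hypothetical, not real traffic data or any tool's API.

```python
from collections import Counter

# Hypothetical analytics export: one (location, browser, connection) tuple per visit.
visits = [
    ("US suburb", "IE8", "DSL"),
    ("US suburb", "IE8", "DSL"),
    ("US suburb", "IE8", "DSL"),
    ("US city", "Firefox", "Cable"),
    ("US city", "Firefox", "Cable"),
    ("Overseas", "IE7", "DSL"),
]


def top_profiles(records, n=2):
    """Return the n most common profiles with their share of all visits."""
    counts = Counter(records)
    total = len(records)
    return [(profile, count / total) for profile, count in counts.most_common(n)]


for (location, browser, connection), share in top_profiles(visits):
    print(f"{share:.0%} of visits: {location}, {browser}, {connection}")
```

Each profile that comes out on top maps directly onto test settings: the location tells you which test server to pick, and the browser and connection type tell you which options to select when you run the test.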

So what is the best test for Target.com?

Without knowing the exact user breakdown for Target.com, I would assume that its users, like those of many of our key ecommerce customers, fit the following profile:

Live in the US in the suburbs

60-65% use IE7/8

Most connect from home over DSL/ADSL

Based on these assumptions, I would recommend that the Target executives look very closely at the IE7 and IE8 Webpagetest results, which showed first-time load times in the 6-9 second range. Measured against a sub-2-second benchmark, those numbers suggest that investing in performance would really help.

*Test parameters:
Gomez: Tested last mile and backbone, running one test per hour for six hours across five points in North America: LA, Miami, Atlanta, New York and Seattle.
Keynote: Same as Gomez.
HTTPWatch: Tested the site on Firefox 3.6.6 and IE8 from my office in Vancouver.
Webpagetest: Tested the site on IE7 and IE8 on both DSL and FIOS networks. Averaged three runs via server hosted in Dulles, VA.