Monday, June 29, 2009

New Google study on speed in search results

Googler Jake Brutlag recently published a short study, "Speed Matters for Google Web Search" (PDF), which looked at how important it is to deliver and render search result pages quickly.

Specifically, Jake added very small delays (100-400ms) to the time to serve and render Google search results. He observed that even these tiny delays, which are low enough to be difficult for users to perceive, resulted in measurable drops in searches per user (declines of 0.2% to 0.6%).
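The paper describes server-side latency injection with users split into experiment buckets. A minimal sketch of that kind of setup might look like the following; the bucket names, delay values, and function names here are my assumptions for illustration, not details from the paper:

```python
import hashlib
import time

# Hypothetical experiment buckets: control gets no added delay, treatment
# buckets get a small artificial server-side delay (values in milliseconds).
DELAYS_MS = {"control": 0, "treat_100": 100, "treat_400": 400}

def assign_bucket(user_id: str) -> str:
    """Deterministically map a user to an experiment bucket via a stable hash."""
    buckets = sorted(DELAYS_MS)
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return buckets[int(digest, 16) % len(buckets)]

def serve_results(user_id: str, results: list) -> list:
    """Inject the user's bucket delay before returning results unchanged."""
    delay_ms = DELAYS_MS[assign_bucket(user_id)]
    time.sleep(delay_ms / 1000.0)
    return results
```

Deterministic bucketing matters here: each user must see a consistent delay across sessions, or the per-user metric (searches per user) would be contaminated by users bouncing between treatments.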

Please see also my Nov 2006 post, "Marissa Mayer at Web 2.0", which summarizes a claim by Googler Marissa Mayer that Google saw a 20% drop in revenue from an accidentally introduced 500ms delay.

Update: To add to the Marissa Mayer report above, Drupal's Dries Buytaert summarized the results of a few A/B tests at Amazon, Google, and Yahoo on the impact of speed on user satisfaction. As Dries says, "Long story short: even the smallest delay kills user satisfaction."

Update: In the comments, people are asking why the effect in this study appears to be an order of magnitude lower than the effects seen in previous tests. Good question.

14 comments:

So if I understand correctly, the idea that the 500ms delay in 2006 was the cause of the 20% drop in revenue (proportional to further use and searches, I guess) was completely disproved by actual, rigorous testing.

As I read these numbers, speed does not matter at all (even almost half a second of delay caused an almost negligible drop in use).

According to these numbers, we are much better off looking at other factors, like usability, to improve our sites.

In part, I think this is because the experiments differed in how much and where they did the delay as well as what metric they used to determine impact.

For example, if the search results appeared significantly before the ads appeared, you might expect to see a fraction of people click on the search results before the ads even display.

In the end, how much speed impacts perceived quality in a particular application probably depends on the usage and interface design. What these studies indicate is that we probably should be concerned about speed. How much we should be concerned likely is something that would have to be determined for each application.

By the way, you mentioned that you hadn't seen Marissa give that +500ms leads to a -20% revenue drop data point? If you want, you can see it in her "Scaling Google for the Everyday User" talk at the Seattle Scalability Conference in 2007 (see around 13:00 in the video).

Greg,

In your cited Nov 2006 post, you wrote: "Traffic and revenue from Google searchers in the experimental group dropped by 20%." Not just revenue, but traffic. Which I assume is pretty much a 1:1 correspondence with number of searches issued.

So I'm kinda with Ewout, in scratching my head in a little bit of confusion over this one. Let's see...

Sure, we can write Jake. His email isn't listed in the paper -- do you have it?

Another thought, while I'm on it: Let's even say that these numbers are correct. Let's assume that the 400ms delay causes a 0.6% drop in the number of searches performed, and that drop is statistically significant. (I don't see significance values reported in the paper, but... whatever. Let's assume that it is significant.)

Translated into real numbers, that means that for every 500 queries that a user used to do, he or she now does 497? Or, given the statistic that the average searcher does 4 searches per day, that works out to one less search roughly every 42 days -- call it about nine times a year that the user does one less search than they otherwise would have. ONE.
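The commenter's back-of-the-envelope arithmetic can be redone in a few lines, taking the two figures from the comment (a 0.6% drop in searches and an average of 4 searches per user per day) as given:

```python
# Figures from the comment above: 0.6% fewer searches, ~4 searches/user/day.
drop = 0.006
searches_per_day = 4.0

lost_per_day = searches_per_day * drop        # ~0.024 searches lost per day
days_per_lost_search = 1 / lost_per_day       # one lost search every ~42 days
lost_per_year = lost_per_day * 365            # ~9 lost searches per year
```

Either way the per-user effect is tiny in absolute terms, which is the commenter's point; the effect only looks large when aggregated across Google's user base.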

I can see how, maybe, from the corporation's perspective, all those little tiny differences add up to a significant difference in revenue. Especially when you aggregate over hundreds of millions of users.

But I'm looking at this from the perspective of the user. And I am having a hard time seeing how this really affects the user, at all. If once every four months I feel so inclined to not perform a search that I otherwise would have, I don't think that I would ever even notice that, or in any way be inconvenienced by that fact.

I don't get it.

And so it seems to me that a much better use of corporate resources would be to make sure that a larger percentage of the searches that a user does succeeds. What is the statistic... something like 50% of all searches end in failure? If you could lower that failure rate to, say, 45%, then the end user would be much happier, than if you lower delay from 400ms to 100ms.
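Using the commenter's hypothetical figures (both the 50% failure rate and the 45% target are assumptions from the comment, not measured numbers), the relative sizes of the two improvements can be compared directly:

```python
# Hypothetical figures from the comment: failure rate falls from 50% to 45%,
# versus removing a 400ms delay that costs 0.6% of searches.
success_gain = (1 - 0.45) / (1 - 0.50) - 1   # +10% more successful searches
latency_gain = 0.006                          # +0.6% more searches overall

# Relative gain from fixing failures vs. fixing latency, on these assumptions.
ratio = success_gain / latency_gain           # roughly 17x
```

On these (admittedly rough) numbers, the failure-rate improvement would deliver more than an order of magnitude more successful searches per user than the latency fix.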

It looks like Jake has published his e-mail on his home page. It is jakeb[at]google.com.

I get what you and Ewout are saying, that this study not only appears to be inconsistent with previous reports on the magnitude of the effect, but wildly inconsistent. It would be interesting to get Jake's thoughts on that if he is willing to share them.

In a separate e-mail, the author of the paper, Jake Brutlag, suggests that the difference is likely due to confounding factors such as additional client side latency when adding 30 results.

Jake also pointed to Marissa Mayer's Velocity 09 talk, where (starting around 5:00) Marissa explicitly discusses these different results, though she doesn't go into why they differ by such a large amount.

Given the magnitude of the difference, it is hard not to still have questions about these results, but perhaps that helps a little.

The experiment Marissa Mayer referenced changed the default number of search results on the page. This is a different mechanism of increasing latency than injecting server-side delay (server-side delay underlies the results described in this blog post). Furthermore, as fnthawar comments, it is not clear one can untangle the user response to additional results vs. additional latency to generate those results.