How Google tested Google Instant

Google's John Boyd (standing) and LaDawn Jentzch, user experience researchers at Google, explain to CNET's Tom Krazit how Google's usability lab works, as seen from the observation room.
James Martin/CNET

MOUNTAIN VIEW, Calif.--In a world of data-obsessed number-crunching engineers, Google's John Boyd is the people person.

Boyd is responsible for testing user-experience changes to Google Search, the company's most important product. While Google is famous for obsessing over statistical differences in user clicks between one shade of blue and another, Boyd's team focuses on studying how real people interact with products under development inside Google through the company's usability lab.

This mission took on great importance as Google prepared to make perhaps the biggest change to its search experience it had ever contemplated: Google Instant. Google surveyed 160 people--divided equally between Googlers and the general public--as it developed "Google Psychic," the internal code name for what would become Google Instant.

Over the course of several weeks, Google continued to tweak Instant in front of new testers until it was finally confident in the product. It claims those users became fans: of the 160 people who tested the product, just one said they didn't plan on using Google with the Instant feature turned on, Boyd said, which he called "unheard of in lab testing."

So just how does Google make sure its new ideas are ready for the real world? CNET got a tour of the company's usability labs to find out.

Watching what they eat
Google, like just about every technology company, employs a bevy of eager and captive testers--employees--when getting ready to roll out a new product. However, there are clear limits to what "dogfooding" (as the process is known) can predict about how the general public will receive a product, especially at a company like Google where employees are selected in part because they are "outliers" compared to the general population, Boyd said.

"The user research team is instrumental in understanding how users interact with the design we're creating," said Irene Au, director of user experience across all of the Google-branded product efforts, which excludes things like Android and YouTube.

The testing process for Google Instant began toward the beginning of the year, when the first internal employee testers were brought in to look at the product. LaDawn Jentzch, a user experience researcher for Google, and other testers sat the users down in the testing room, complete with one-way mirror, and put them through a series of search-oriented tasks.

Eye-tracking equipment and software are used in the lab to record where a tester is looking when they enter their query. However, Boyd's team is more interested in recording subjective analysis of how the tester used Google, as analyzing the numerical data from the eye-tracking software can take weeks if not months. And the goal of the testing process with Google Instant was to follow Google's usual strategy of constant iteration, just in secret this time.

As summer began, Google started to bring in outsiders. Google operates testing labs all over the world, but a large percentage of the testing is done right here in Northern California, Boyd said.

Google is deliberately looking for middle-of-the-road people when it comes to its testing pool, those who are familiar with Google but would never be confused with the kinds of power users that occupy its campus. The going rate for Google product testing these days is about $75 an hour (those interested can sign up at google.com/usability) but don't expect to make a career of it: Google tends to put testers on ice for about six months after a visit.

Researchers like Jentzch would usually sit alongside testers and give them a set of tasks, carefully observing their behavior and encouraging the tester to vocalize everything they were doing, the "talk aloud protocol" commonly used in user research tests. Google Psychic engineers were also encouraged to monitor the tests from the other side of the one-way mirror to get a sense for themselves of how people were interacting with the product.

The team was struck by how few of the outsiders noticed that the search results were changing rapidly below the search bar, Boyd said. People in this testing protocol noticed other changes that Google had recently made--such as the design changes on the left-hand navigation bar that had rolled out several months before most were brought into the testing lab--but fewer than half of the outsiders noticed Google Instant during the first series of tasks.

As the summer progressed, researchers settled into a weekly pattern. They would test Google employees the first few days, and outsiders later in the week, meeting with the Google Psychic design team when testing was complete to go over the results and suggest changes. One major change that was the direct result of user feedback was the rate at which Google Instant generated new results, which was too fast for early testers of the product.

The end result was what Boyd called "the most positive professional experience of my research career," with Google Instant rolling out in early September with few glitches or complaints.

Socially acceptable?
While Boyd's team appears to have the search user experience research well in hand, he admitted that testing user behavior for social-media products is "complicated research," in that different people can have entirely different ideas of the level of customization or privacy best suited to their needs, and for completely different reasons.

The impact was clear: privacy controls were insufficiently labeled, difficult to find, and set to defaults that made regular users uncomfortable. And the result was perhaps one of the busiest holiday weekends in the lives of the engineers who worked on Google Buzz, as they scrambled to make changes to quell the backlash.

As Google prepares to add a lot more social-media features around its products through a project code-named Google Me, it will face the same sort of challenges in making sure the general public is not freaked out by its decisions. It's simply hard to test social-media products in an hour in a lab: there's just not enough time to properly form an opinion of a product that has so many variables, many of which are based on interactions with other users rather than the one-on-one user experience with a tester and a computer screen.

"That's often perceived to be the only thing we do, optimizing a mature established experience," Au said. "But for something that's conceptually new, we can't do a/b testing to inform" decisions about that product's evolution, she said.

That means Boyd's team is generally responsible for making sure Google's latest and greatest products provide the maximum benefit to its users without creating a backlash, such as Google saw with the launch of Buzz or the redesign of Google News earlier this year.

Longtime and hardcore Googlers are often bewildered by some decisions made by general users of its products, such as those who can't stand the threaded conversations view in Gmail. Au, Boyd, and their teams serve as the balance between Google engineers who want to pull users into more efficient ways of using their products, and everyday users who just like things the way they are.

It's still a science, but a people-facing one, as opposed to a machine-driven one.