Eye-Tracking Study: Everybody Looks At Organic Listings, But Most Ignore Paid Ads On Right

Interesting new data about searcher behavior from a recent User Centric eye-tracking study: Whether using Google or Bing, all 24 participants looked at the organic search results for their queries, but between 70% and 80% ignored the paid ads on the right side of the page.

User Centric studied the search behavior of 24 “experienced users” of both Google and Bing, all between 18 and 54 years old. They were asked to do eight searches — four on Google (with Google Instant turned off) and the other four on Bing.

The results? Here’s a table version of the diagram above.

|                         | Google                           | Bing                             |
|-------------------------|----------------------------------|----------------------------------|
| Organic Results         | 100% viewed; 14.7 seconds total  | 100% viewed; 10.7 seconds total  |
| Top Paid Results        | 91% viewed; 0.9 seconds/result   | 90% viewed; 0.7 seconds/result   |
| Right-side Paid Results | 28% viewed; 0.16 seconds/result  | 21% viewed; 0.11 seconds/result  |
| Left-side Column        | 17% viewed; 1.2 seconds          | 18% viewed; 2.9 seconds          |

User Centric says there’s no statistically significant difference between the 28% of searchers who looked at Google’s right-side ads and the 21% who looked in the same place on Bing (as shown in row three above). Ads that appear above the organic results were viewed substantially more often than those in the right column and almost as often as the organic search results.
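To see why a 28% vs. 21% gap isn’t significant at this scale, here’s a quick sketch of a pooled two-proportion z-test. The participant counts (roughly 7 of 24 on Google and 5 of 24 on Bing) are reconstructed from the reported percentages, not taken from the study’s raw data, and this may not be the exact test User Centric ran.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# 28% of 24 ≈ 7 viewers on Google; 21% of 24 ≈ 5 on Bing (assumed counts).
z, p = two_proportion_z_test(7, 24, 5, 24)
print(f"z = {z:.2f}, p = {p:.2f}")  # p is far above 0.1: no detectable difference
```

With only 24 participants per engine, a two-viewer gap produces a p-value around 0.5, which is consistent with the study’s conclusion of no significant difference.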

The various filters and refinements that both Google and Bing display on the left side of the search results page were looked at even less than the paid ads on the right: 18% for Bing and 17% for Google. Notably, time spent looking at Bing’s left column was more than twice that spent on Google’s.

The main difference in activity was in time spent looking at organic search results; searchers on Google spent four more seconds looking there than Bing users did. The image example above is a search for “engagement ring” — both search engines provided a map with local results in the middle of the page along with numerous traditional “blue link” results. It looks like there may also be a news result near the top of the Google results. User Centric says one possible interpretation of the time difference is that users had more trouble finding the information they were looking for on Google, but it’s not clear what the reason was.

One other interesting stat: User Centric says only 25% of the study participants activated Bing’s automatic site previews, and each activation happened accidentally. Google also offers Instant Previews, but those require a click.

About The Author

Matt McGee is the Editor-In-Chief of Search Engine Land. His news career includes time spent in TV, radio, and print journalism. After leaving traditional media in the mid-1990s, he began developing and marketing websites and continued to provide consulting services for more than 15 years. His SEO and social media clients ranged from mom-and-pop small businesses to one of the Top 5 online retailers. Matt is a longtime speaker at marketing events around the U.S., including keynote and panelist roles. He can be found on Twitter at @MattMcGee and/or on Google Plus. You can read Matt's disclosures on his personal blog. You can reach Matt via email using our Contact page.

Ian Williams

Interesting study, but 24 people is a bit low to be making sweeping statements. Just one person’s behaviour is worth over 4%!

Matt McGee

Agreed, Ian. I don’t know how many people are typically involved in eye-tracking studies, but 24 does seem low. On the other hand, they’re measuring searches more than people here — and with each person doing 8 searches across two search engines, you do end up with a fair amount of data to consider.

http://www.usercentric.com/ Aga Bojko

Ian and Matt,

Hypothesis testing (i.e., trying to determine if there is a difference) is frequently confused with precision testing (i.e., trying to generalize an exact “score” to the population). This confusion leads to a lot of criticism regarding sample sizes used in research studies.

We certainly do not claim that the exact numbers (% of participants who looked and gaze time) that we obtained in the study can be generalized to the population. To do that we would have to run hundreds of participants.

Our results indicate, however, that there are three significant differences between Google and Bing at an alpha level of 0.1. In other words, Bing and Google will differ along those three dimensions 9 out of 10 times. These differences are marked with asterisks on the heatmaps above.

Being able to detect a significant difference indicates that the sample size used in the study was sufficient. An insufficient sample size usually results in an inability to detect a difference where the difference really exists rather than in detecting a difference where it doesn’t exist. If a statistically significant difference is found with a small sample size, this indicates that the difference does exist.
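The point above — that a small sample mostly risks *missing* a real difference rather than inventing one — can be illustrated with a simulation. The numbers here (n = 24 per engine, alpha = 0.1, a hypothetical 25% base rate, and a hypothetical 50%-vs-20% real difference) are assumptions for illustration, not the study’s actual parameters.

```python
import math
import random

random.seed(42)

def z_test_p(x1, n1, x2, n2):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0, 1):  # degenerate all-or-nothing samples: no evidence
        return 1.0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

N, ALPHA, TRIALS = 24, 0.1, 5000

def simulate(p1, p2):
    """Fraction of trials where the test rejects at ALPHA."""
    hits = 0
    for _ in range(TRIALS):
        x1 = sum(random.random() < p1 for _ in range(N))
        x2 = sum(random.random() < p2 for _ in range(N))
        if z_test_p(x1, N, x2, N) < ALPHA:
            hits += 1
    return hits / TRIALS

false_pos = simulate(0.25, 0.25)  # no true difference: rejections are false positives
power = simulate(0.5, 0.2)        # a large true difference: rejections are correct

print(f"false positive rate ≈ {false_pos:.2f} (near alpha), power ≈ {power:.2f}")
```

Even with only 24 participants per group, the false-positive rate stays near the chosen alpha; what a small sample costs you is power, the chance of detecting a difference that really exists.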

I hope this helps!

http://uk.linkedin.com/in/jordanseo Jordan Russell

Not a fault of the test, but I wonder if the results would be different if the searchers had ACTUAL intent to purchase. Sure we can give them a scenario but different emotions are involved if they’re genuinely looking with intent to buy.

I’d love to see one of these tests with people in that mindset. Perhaps sit outside the Apple store and grab the next 10 people about to go in and buy an iPod :)

http://www.metricsmarketing.com MetricsMarketingRCX

In order to be statistically significant, at least 30 users are needed for an eye-tracking study (Kara Pernice and Jakob Nielsen). However, the standard recommendation for quantitative findings like these is actually higher.

However, the more interesting question to explore is about the recruiting strategy itself. According to the User Centric study description, “Twenty-four Internet users between the ages of 18 and 54 participated in the study. Participants conducted an average of 48 online searches per week using both Bing and Google, with at least five searches per engine.”

However, looking specifically for users who (1) understand what a search engine is, (2) can recognize different search engines, and (3) actively switch between search engines assumes that the target users for this study are advanced computer users. In the studies I have conducted, it is rare that even individuals who identify themselves as “Extremely Comfortable Online” are advanced enough to change their search engine, or to recognize when they are at another one. In June of 2009, Google found that users don’t understand what a browser is, or know the difference between that and a search engine (http://googlesystem.blogspot.com/2009/06/browser-is-search-engine.html).

When discussing the results of this study, it is important to keep in mind that this applies to advanced computer users, and doesn’t necessarily extend to the general public.

glew

The Pernice and Nielsen reference about 30 participants is incorrectly stated above.

The reference can be downloaded (http://www.useit.com/eyetracking/) and it has absolutely nothing to do with statistical power. It says that “If you want to draw conclusions using heatmaps or if heatmaps are the main deliverable for the study, you need 30 users per heatmap.”

This study analyzed actual data points, not a heatmap. Conclusions were drawn from a statistical analysis of the actual data, not from a visualization of the data as suggested by Pernice and Nielsen. Clearly, this rule of 30 does not apply.

Those inquiring about sample size need to understand statistical power. As one of the posters commented, this is not about generalizing the data to the population. That requires a large sample size, quite a bit larger than 30 ;-)

The statistical test run was about difference scores. It is the difference score that is generalizable: it can be predicted to be different with an error rate of alpha (probably the .05 level).

This is very common in experimental studies. Moreover, the fact that statistical differences were found makes sample size irrelevant. Really. The point of a larger sample size is to have sufficient power to detect a difference if it does indeed exist. This is Stats 101. If a difference was found, then the finding is real (with an error rate of alpha).

Let’s talk about the results, not about the stats. The stats are clear to graduate students in stats classes….

http://www.usercentric.com/ Aga Bojko

The sample size of 30 is actually a common misconception. It has been described in more detail in the article titled ‘More Than Just Eye Candy: Top Ten Misconceptions about Eye Tracking in the User Experience Field.’ The article came out in User Experience Magazine in 2010: http://bit.ly/gHAad9

http://www.diepbizniz.nl RobertJan van Diepen

The discussion about sample size is important. In cooperation with Utrecht University, we researched how many participants you need to achieve a statistically significant heatmap, using advanced statistical techniques to measure this. We found that statistically significant heatmaps can be achieved with 17 participants; when participants performed a task, in some cases 12 participants were enough. In general, studies with free examination need more participants than those with tasks.

sadgrove

70% – 80% ignored the ads?

My last 4 weeks stats show 29,000 visits from Google CPC, and 18,000 from Google organic search. And that’s for a site which ranks well for search. Nor do I pay for top ad positions.

My data suggests the ads work well, for my site, anyway.

http://www.usercentric.com/ Aga Bojko

RobertJan van Diepen: What do you mean by a statistically significant heatmap? Are you talking about gaze patterns that can be generalized to the population, or do you have hypothesis testing in mind? These are two very different concepts and require different sample sizes.

http://www.diepbizniz.nl RobertJan van Diepen

When can you draw statistically significant conclusions from the eye-movement behaviour on a webpage? That’s the main question. Yes, these are gaze patterns. On the internet many eye-tracking heatmaps are shown and discussed, but none of us can tell if these heatmaps can be generalized to the population. In our research we focused on that topic: how many participants do we need, and can we predict that? Let’s say a heatmap is created from 20 people. Is the eye-movement behavior statistically significant, or do we need more participants to draw statistically significant conclusions? In other words, will this behavior differ when we add more participants? If not, 20 participants is clearly enough, so why add more to the study? I hope this is more clear.

http://decisionstats.com Ajay Ohri

For numbers as low as 24, t-tests suffice. The data breadth is 24 times the searches given to each participant, and eye tracking can generate huge amounts of data.
An interesting result would be to revalidate with another sample after a period of time, given the dynamic nature of search.
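A paired t-test of the kind suggested above is easy to sketch. The per-participant difference scores below are entirely hypothetical, constructed only to average the reported 4-second gap in organic gaze time; they are not the study’s real data.

```python
import math
import statistics

def paired_t(diffs):
    """One-sample t-test on paired difference scores; returns (t, df)."""
    n = len(diffs)
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    t = mean / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical per-participant differences in organic gaze time
# (Google minus Bing, seconds), built to average the reported 4-second gap.
diffs = [2.0] * 12 + [6.0] * 12
t, df = paired_t(diffs)

T_CRIT = 2.069  # two-sided critical value for df = 23 at alpha = 0.05
print(f"t({df}) = {t:.2f}; significant: {abs(t) > T_CRIT}")
```

The point of pairing is that each participant serves as their own control: with 24 difference scores, a consistent per-person gap can reach significance even though the sample is small.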
