Eye Tracking as a method to improve search usability

A frequently discussed topic in the UX world is eye tracking and whether it deserves its place as a meaningful and valid usability method.

I include eye tracking in my usability tests as often as possible, because I have found it to be a helpful method for investigating what users look at on a results page. In particular, eye movement data combined with user behavior on the search results page reveals much more about the motivations behind certain behaviors than tracking user behavior alone. Eye tracking data should always be seen as complementary to the live statistics. The same applies to other forms of search, such as maps, news, or product search.

Ian Everdell, Usability Consultant for Enquiro, has been working with me on advanced search issues for the t-online.de search since 2007. During the last month of our work we have had frequent – sometimes controversial – discussions on theories and methods. To give you a chance to take part in those discussions, I have summarized some parts here in my blog as an interview.

So, if you think “search usability” is cool stuff, you should read on; otherwise I hope you will find something more interesting on my page.

SQ: The world of usability seems to be divided into two groups. On one hand we have the eye tracking fanatics, and on the other hand we have the critics. So Ian, what do you think? Where do you see the benefits outweighing the costs?

IE: The eye tracking debate is certainly a heated one, especially when you have well-known usability figures like Jakob Nielsen and Jared Spool taking opposite sides. The argument is typically that eye tracking is a highly specialized and expensive technique that doesn’t add any insight that couldn’t have been found using more traditional methods. I respectfully disagree.

I think that eye tracking can add a lot of insight. It can be useful for identifying areas of the page that seem to be receiving a lot of attention or very little, although it’s important to realize that gaze does not equal attention (you can be looking at something without paying attention to it) and vice versa (you can pay attention to something that you’re not looking at). It can also help identify typical scan patterns, which can help you position information and page components appropriately.

A good example of this would be testing the search function on a web page. Imagine that the search box is placed in the lower left-hand side of the page. Participants in the study find the search box and use it to navigate to their goal without any problem, and they all report that the experience of using the site was good. However, analysis of the eye tracking shows that almost all of them looked for the search box in the top right corner first – maybe you should move it there to help speed things up?

Eye tracking can also be used to prompt study participants to talk about their experience. By replaying the video of their session and letting them watch what their eyes were doing, it can help them remember what they were thinking or doing at a specific point. Then they can talk about their experience in more detail.

Another interesting thing that we’ve found in some of our work at Enquiro that examines gaze behaviour during search is that, while in the end it takes most people a similar amount of time to choose a link, men and women process the page in remarkably different patterns – this is insight that would be tough to get from traditional usability testing.

However, eye tracking is ultimately just one tool. It needs to be used in combination with other usability testing methods to get the best results.

SQ: What do you think is the biggest benefit of eye tracking for search?

IE: As I mentioned above, eye tracking has already shown us a number of interesting things specifically related to search, like how people scan a typical SERP and that there are differences between men and women. It also shows us how long people engage with particular parts of the page, although we have to keep in mind that looking doesn’t equal attention, and that we don’t know whether extended viewing means that people are engaged or confused. However, knowing that something is happening in that area of the page allows us to ask the participant more questions about why they might have been looking at that part of the page.
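As a rough illustration of how "how long people engage with particular parts of the page" might be quantified, the sketch below sums fixation durations per area of interest (AOI). The AOI names, pixel rectangles, and fixation tuples are all hypothetical and not taken from any real eye tracker export; and, as noted above, a long dwell time alone does not say whether the participant was engaged or confused.

```python
# Hypothetical areas of interest on a results page, given as
# (left, top, right, bottom) in screen pixels.
AOIS = {
    "search_box": (0, 0, 800, 100),
    "top_results": (0, 100, 600, 400),
    "side_ads": (600, 100, 800, 400),
}


def dwell_time_ms(fixations, aois):
    """Sum fixation durations (ms) for each area of interest.

    Each fixation is an (x, y, duration_ms) tuple. Fixations that fall
    outside every AOI are ignored.
    """
    totals = {name: 0 for name in aois}
    for x, y, duration in fixations:
        for name, (left, top, right, bottom) in aois.items():
            if left <= x < right and top <= y < bottom:
                totals[name] += duration
                break  # AOIs here do not overlap, so stop at the first hit
    return totals


# Made-up fixations for one participant: (x, y, duration in ms).
fixations = [
    (120, 40, 250),
    (200, 150, 400),
    (220, 210, 300),
    (700, 180, 150),
    (900, 500, 200),  # outside all AOIs
]

dwell = dwell_time_ms(fixations, AOIS)
print(dwell)
```

A table like this is mainly a prompt for follow-up questions: it shows where something happened, not why.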

One interesting finding that Jakob Nielsen has published using eye tracking data is that people really only read the first 11 characters of a headline – this has huge implications for how the title tags that are used for search results are written. Eye tracking has given us pretty definitive proof that using “Welcome to the web site for…” is not a good title tag.

SQ: Which usability methods do you find helpful when optimizing a search?

IE: Traditional usability testing is probably the best method – get people to interact with the page and then verbalize their thoughts, etc. when they’re done. This gives you a good understanding of what they were expecting (chances are that in this day and age the answer will be “I was expecting it to be more like Google”). But I think that eye tracking is a good follow-up to this because it can give you more understanding of where certain elements should be placed, how content should be written to promote reading and understanding, and ultimately how to coax a click-through.

SQ: Which usability method do you find undervalued for search in the usability community?

IE: I’d love to see remote usability testing start to be incorporated into search – capture real users in their real environment doing real things. I’m sure the major search engines could do this fairly easily, but of course the volume of data would be enormous.

But then you have a different problem from traditional usability testing: instead of having to create intent for the user to get them to use a specific search term, you know the search term but not the intent. You can only infer it from the search term itself… and this is why behavioural targeting is tricky.

SQ: What are the typical mistakes that happen when testing a search in the usability lab? How should the structured questionnaire be developed? What types of questions are relevant?

IE: I think the biggest problem with testing search in a lab is that you have to craft intent for the participant (at least, if you want to compare between participants). There are some studies that allow participants to do “free searching”, where they are given a vague intent and are allowed to complete the task entirely on their own – the problem with this is that each participant might use a slightly different search term, which leads to a different SERP, which leads to a different experience altogether, so it’s tough to compare between participants.

I think another inherent difficulty in testing search in a lab is that there are billions of possible experiences that people could have – one little change in the search term, the search algorithm, blended results (images, video, news, etc.), or a page’s metadata or content can change the SERP, which again makes it difficult to compare between experiences. So taking the few exposures that you get in a lab and generalizing them to each and every SERP can be quite a stretch.

Another difficulty with testing search in a lab is that search is about more than just the SERP – it’s about what’s after the click as well. Users make judgments about the search engine based on what they find afterwards, so usability studies should include exposure to landing pages and web sites as well.

Relevant questions for the usability of search include standard quantitative measures like time to click, a user satisfaction scale and eye tracking metrics (if collecting eye tracking). I would also be interested in subjective qualitative information from the participants like their perception of relevancy and their expectations for what would be on the page.

SQ: What are the right (or most valuable) metrics for measuring the usability of a search? How do you choose them? Are they different for the different searches, e.g. local searches, web searches, product searches, etc.?

IE: The standard usability metrics are effectiveness (the user is able to accomplish the task, measured as a completion rate), efficiency (how quickly the user can finish the task, the number of clicks, or some similar measure of effort needed), and satisfaction (how happy the user is after performing the task).
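To make the three metrics concrete, here is a minimal sketch of how they might be computed from lab sessions. The record fields, the 1-to-5 satisfaction scale, and the sample values are assumptions for illustration only; a real study would define its own measures.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskResult:
    completed: bool          # did the participant accomplish the task?
    seconds_to_click: float  # time from SERP load to the chosen click
    satisfaction: int        # self-reported rating, 1 (low) to 5 (high)


def summarize(results):
    """Return effectiveness, efficiency and satisfaction for one task."""
    if not results:
        raise ValueError("need at least one task result")
    completed = [r for r in results if r.completed]
    return {
        # Effectiveness: share of participants who completed the task.
        "effectiveness": len(completed) / len(results),
        # Efficiency: mean time to click, successful attempts only.
        "efficiency_seconds": (
            mean(r.seconds_to_click for r in completed) if completed else None
        ),
        # Satisfaction: mean rating across all participants.
        "satisfaction": mean(r.satisfaction for r in results),
    }


# Made-up sessions for four participants.
sample = [
    TaskResult(True, 4.0, 5),
    TaskResult(True, 6.0, 4),
    TaskResult(False, 15.0, 2),
    TaskResult(True, 5.0, 4),
]

summary = summarize(sample)
print(summary)
```

Averaging time to click over successful attempts only is one possible choice; counting clicks or including failed attempts would be equally defensible measures of effort.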

However, I think the most valuable metric for search usability would be the correlation between relevancy (measured as part of satisfaction) and time to click (efficiency) – searching needs to be effortless for the user, so the ultimate goal would be to maximize relevancy and minimize the time it takes them to make a decision.
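The correlation mentioned here could be computed in several ways; the sketch below assumes a plain Pearson coefficient over per-participant relevancy ratings and times to click. The sample numbers are invented, and a rank-based measure such as Spearman's might suit ordinal ratings better.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Invented data: perceived relevancy (1 to 5) and seconds to first click.
relevancy = [5, 4, 4, 2, 1]
seconds_to_click = [3.0, 4.5, 5.0, 9.0, 12.0]

r = pearson(relevancy, seconds_to_click)
print(round(r, 3))
```

A strongly negative value would fit the goal described above: the more relevant participants find the results, the less time they need to decide.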

In terms of different metrics for different types of searching, certainly you might want to define what “effective” or “efficient” mean – effectiveness for a local search might mean finding a business on a map, while for a product search it might mean finding the price – but ultimately you want to focus on improving those three key metrics.

Choosing metrics to measure, for any type of usability testing, should be based on ensuring that the experience is as good as possible for the user – it has to be efficient, effective, and satisfactory.

Whether you choose to work on one of those at a time (e.g. make sure they can do the task, then work on making it faster/easier to do) or all at once, it all comes down to the fact that you’re there to serve your user.