Eye Tracking: Best Way to Test Rich App Usability

A detailed examination of the benefits of eye tracking.

Article No: 505 | December 8, 2011 | by James Breeze

Eye tracking has recently been debated on many fronts, with a particular focus on the ways people misuse it, and how some use eye tracking only as a way to "wow" clients. In our experience, however, it's invaluable in bringing to light key findings that are otherwise unattainable through other user testing methods. Eye tracking offers UX people the ability to:

Leave a participant alone during a test to focus on the task at hand, and therefore

Capture real physiological data about their conscious and unconscious experiences. This data is unique to eye tracking.

Eye Tracking for Rich Applications

Recently eye tracking has been heavily used in website design and testing. When I became involved about eight years ago, the sites tested were mainly flat HTML. Researchers were able to produce beautiful heat maps that were useful for comparing and optimising simple screen layouts and online advertising placements.

The arrival of rich, interactive, and transactional interfaces, however, has made producing eye tracking results more complex. Each interface has multiple states, and people can choose their own way through a task to completion. The eye tracker cannot tell which state is on screen while a person's eyes are tracked, so gaze recorded on one state can be wrongly attributed to another. Additional analysis is now required to separate these interactions, and fortunately eye tracking technologies have advanced to make this process relatively simple.

If this new level of sophisticated analysis is not achieved, this will result in eye tracking data being misused and eye tracking will retain (inappropriately) its novelty status.
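As a rough illustration of what separating interactions by state can involve, the sketch below groups fixations by whichever interface state was on screen when each fixation began. It is only a minimal sketch: the state log, the state names, and the fixation format are hypothetical, and real eye tracking software exports its own formats.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical log of interface state changes: (timestamp_ms, state_name),
# sorted by time. In practice this might come from application logging.
state_changes = [(0, "search_form"), (4200, "results_list"), (9100, "detail_panel")]

# Hypothetical fixations: (start_ms, duration_ms, x, y).
fixations = [
    (150, 220, 312, 140),
    (400, 180, 330, 160),
    (4300, 300, 120, 400),
    (5000, 250, 640, 420),
    (9500, 400, 700, 90),
]

def split_by_state(fixations, state_changes):
    """Group fixations by the interface state active when each one started."""
    starts = [t for t, _ in state_changes]
    grouped = defaultdict(list)
    for fix in fixations:
        # Index of the last state change at or before the fixation start.
        idx = bisect_right(starts, fix[0]) - 1
        if idx < 0:
            continue  # fixation began before the first logged state
        grouped[state_changes[idx][1]].append(fix)
    return dict(grouped)

by_state = split_by_state(fixations, state_changes)
for state, fixes in by_state.items():
    print(state, len(fixes))
```

Each group can then be turned into its own heat map, so that gaze on one state is not painted onto a screenshot of another.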

Usability Testing Fraternities Lock Horns

Think Aloud

Think Aloud (TA) is an age-old usability testing method. People are asked to speak their thoughts, feelings, and opinions during a set of usability testing tasks. This is done with the help of the facilitator, who "skilfully interrupts" the process frequently to find out why people do particular things during the test.

In my opinion, when people are faced with lots of interactions on screen, considerable cognitive effort is required. Adding TA to this experience will inappropriately add more cognitive load to the task that would not normally be present. This can lead to misleading additional eye fixations and dwell times on outputs, which clouds the analysis. Often a poor facilitator will prompt users to the next stage (when was the last time someone knocked on your door and helped you find the right button when you browsed at home?), again spoiling the desired realism while testing.

We know there are three types of memory "storage" systems: sensory memory, short-term memory, and long-term memory. Our sensory memory retains an exact copy of what is seen or heard, and is generally thought to last between 300ms and a few seconds. Our short-term memory tends to hold between five and nine "items" of information (George Miller et al.). If we start to talk about our actions in a TA protocol, these precious (milli)seconds and snapshots of information are quickly forgotten or overwritten. After that, what are we basing our commentary on?

Eye Tracking

There has been considerable debate about the usefulness of usability testing with eye tracking. Many TA proponents claim their methods, when carefully performed, will find enough issues compared with eye tracking, which they consider to be too difficult, time consuming, and expensive to bother about.

Of particular concern is Jakob Nielsen's F-pattern research. This was produced in 2006 and I regularly hear it mentioned in design meetings in Australia. This study was done using the regular TA protocol, which means that participants' eye gaze data is very likely not valid because they were talking to the experimenter during the study. Try doing an everyday task like driving, cooking, or cleaning while all along the way verbalising every little step, and see how your behaviour (actions, methods, or time to complete) is affected.

Retrospective Think Aloud

Retrospective Think Aloud is another usability testing method that has been used for many years. In this case, participants give their opinions of a task after it is completed and the interview is recorded for later reference. Of course, it is hard to remember what you did during a task.

Retrospective TA with eye tracking (RTA) is a method in which participants are quickly calibrated on the eye tracker and then asked to do the testing task without interruption from the facilitator. In fact, the facilitator can even leave the room during a test. Following the test, the facilitator immediately asks the participant to score their experience and then replays the eye gaze video of the participant's experience to them. This replay of their eye gaze triggers the person's memory of what they did, thereby mitigating the memory issue. Expanding on this, the eye gaze can also be removed to ask what the participant thought they looked at before revealing their actual interactions.

Think Eyetracking, an early adopter of the RTA eye tracking protocol (which they renamed PEEP), have published a jointly researched academic paper with Lancaster University, UK. Their academic article can be downloaded on the Think Eyetracking Blog. They also had a very popular blog post about it in 2008 that generated some controversy.

Below are some eye tracking heat maps created by Think Eyetracking that show a comparison of a Google search task done with TA (on the left) and RTA (on the right). Note the dramatic differences! It is obvious that the behaviour is very different, with long dwell times and numbers of fixations apparent in the TA output, probably caused by the participants staring at and browsing the screen while verbalising their actions.

Recently, Tobii Technology from Sweden created a unique feature in their Tobii Studio software: during the eye gaze replay stage of the test, the software makes a video and audio recording of the participant and facilitator as they review the eye tracking session. This can be paused, replayed, and scrubbed to allow a full, detailed analysis of the session with both visual and audio cues. Find out more about RTA on Scribd or watch this video:

Usability labs are set up to approximate real life. We regularly see experimenters set up their testing facilities like offices or lounge rooms to make the person feel at home. TA asks people to talk to someone while they are busy doing a task—where's the real life in that?

Eye tracking is the only real way to test a rich application without distracting the participant.

See Where People Looked, Not Where They Think They Looked

We allow people to complete tasks in a focused way, and also obtain real physiological data about what they are doing. It is difficult to argue with and almost impossible to fake these measures. We are not making assumptions about what they looked at and in what order things captured their attention. Some recent client projects encouraged us to use eye tracking to identify:

Where do people look first?

What don't they look at?

What did they look at before the usability issue occurred?

How do people learn an interface?

1. Where Do People Look First?

Eye tracking measures unconscious behavior—and provides data that people simply cannot verbalize in other common user research methods, especially TA usability testing protocols. Decades of psychology research show that much human behavior occurs at an unconscious level.

The human eye, for example, can make up to 5 fixations per second and this occurs below people's level of conscious awareness. So in a 30 second scan of a typical homepage, the customer may be looking at up to 150 items on the page. Your customers (or research participants) simply cannot verbally tell you where their eyes are going and this is exactly the value that good eye tracking data provides.

Our experience is that visual attention data IS correlated with behavioral performance metrics. If people don't "see" something, then they are less likely to click it.
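The 150-item figure above is simply the fixation rate multiplied by the scan time. A minimal sketch of that arithmetic, treating the quoted rate as an upper bound (the function name is purely illustrative):

```python
MAX_FIXATIONS_PER_SECOND = 5  # upper bound quoted in the article

def max_fixations(scan_seconds, rate=MAX_FIXATIONS_PER_SECOND):
    """Upper bound on the number of fixations made in a scan of this length."""
    return scan_seconds * rate

# At the maximum rate, each fixation and the eye movement that follows it
# together take this many milliseconds on average.
ms_per_fixation = 1000 / MAX_FIXATIONS_PER_SECOND

print(max_fixations(30))  # 150
print(ms_per_fixation)    # 200.0
```

At roughly 200ms per look, it is easy to see why participants cannot narrate where their eyes are going.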

2. What They Don't Look At

Case study 1: Eye tracking shows what things on the screen people didn't look at. Importantly, the data revealed what space was being wasted in the design and what areas of the page were essentially ignored.

Recently, when we tested an internal CRM application for a finance company, eye tracking proved that customer service staff ignored the very information the company wanted them to focus on. In the task, they weren't even required to click on the screen.

The image here shows clearly that in the first few seconds of usage staff focused primarily on the bottom right rather than the bottom left where they were meant to focus. This would not have been observable if simply interviewing them. Considering this screen is used 300,000 times per day, any improvements to the design that make the correct part of the screen more obvious will drive positive outcomes for the finance company's customer service.
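One common way to put numbers on "what they don't look at" is to total the fixation time inside each area of interest (AOI) and flag the areas that received none. The sketch below assumes hypothetical AOI rectangles and fixation data; it is not the analysis used on the CRM project.

```python
# Hypothetical areas of interest: name -> (left, top, right, bottom) in pixels.
aois = {
    "bottom_left": (0, 400, 640, 800),
    "bottom_right": (640, 400, 1280, 800),
    "top_banner": (0, 0, 1280, 100),
}

# Hypothetical fixations: (x, y, duration_ms).
fixations = [(900, 600, 300), (950, 620, 250), (700, 450, 200), (100, 50, 150)]

def dwell_by_aoi(fixations, aois):
    """Sum fixation durations (ms) that fall inside each area of interest."""
    totals = {name: 0 for name in aois}
    for x, y, duration in fixations:
        for name, (left, top, right, bottom) in aois.items():
            if left <= x < right and top <= y < bottom:
                totals[name] += duration
    return totals

dwell = dwell_by_aoi(fixations, aois)
ignored = [name for name, ms in dwell.items() if ms == 0]

print(dwell)    # {'bottom_left': 0, 'bottom_right': 750, 'top_banner': 150}
print(ignored)  # ['bottom_left']
```

In this made-up data the bottom left receives no attention at all, which mirrors the pattern described in the case study.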

3. What They Looked at Before the Usability Issue Occurred

Only with eye tracking can we see all the options that people consider, even unconsciously, before starting and completing a specific task.

Eye tracking shows you where people immediately look on a screen. Yes, they can find a target and do a usability task just fine. But where did they look first, especially in ecommerce, where the time taken can decide whether customers stay with you or leave? Rob Tannen puts this very clearly:

[Eye tracking] does have value as a secondary diagnostic tool. In the context of usability testing, eye tracking does not determine the presence of a usability problem, but helps determine what led to that problem in conjunction with performance data, facilitator observations and user self-reporting.

Case study 2: As the video clearly shows, this user was looking everywhere except at the Donate area on the right. The user looked at the navigation at the side and at the top, then viewed the rest of the page, but at no point focused on the Donate area in the main image. This clearly highlights the fact that this call-to-action does not stand out in the prototype, and that users are also expecting to see something within the navigation. Equally, the heat map below gives an indication of where all six people we tested would expect to see this link.

Case study 3: When users were asked to change one of the options on this screen, the eye tracking heat map below showed very clearly where they were expecting to go. People did not see the areas they were supposed to (indicated in red). Eye tracking of the first second they looked at the screen allowed us to make the site more efficient as it clearly indicated where the functionality should have been positioned.

The heat map below shows the first second of eye tracking on a prototype application. Users were heavily fixated on one area of the screen, and it can be assumed that this is where they were expecting to find the function they were asked to look for (the buttons).

This experience can also be seen in our financial institution case study.
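A first-second heat map of the kind described above can be approximated by keeping only the gaze recorded in the first 1,000ms and accumulating it on a coarse grid. The following is a minimal sketch under assumed inputs: the fixation format, screen size, and grid resolution are all hypothetical, and commercial tools render smoothed maps rather than raw cells.

```python
def first_second_grid(fixations, screen_w, screen_h, cols=4, rows=3, window_ms=1000):
    """Accumulate fixation time (ms) per grid cell for the first window_ms."""
    grid = [[0] * cols for _ in range(rows)]
    for start_ms, duration_ms, x, y in fixations:
        if start_ms >= window_ms:
            continue  # fixation began after the window closed
        if not (0 <= x < screen_w and 0 <= y < screen_h):
            continue  # gaze fell outside the screen
        # Count only the part of the fixation that lies inside the window.
        counted = min(duration_ms, window_ms - start_ms)
        col = int(x * cols / screen_w)
        row = int(y * rows / screen_h)
        grid[row][col] += counted
    return grid

# Hypothetical fixations: (start_ms, duration_ms, x, y) on a 1280x720 screen.
fixations = [
    (0, 300, 1000, 100),
    (350, 400, 1100, 150),
    (800, 500, 200, 600),
    (1500, 300, 640, 400),
]

grid = first_second_grid(fixations, 1280, 720)
for row in grid:
    print(row)
```

The cell with the largest total is where participants looked first, which is a reasonable candidate position for the function they were hunting for.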

4. How Do People Learn an Interface?

Eye tracking is also useful for change management and training when a new system is introduced to staff within a business.

Where do people look the first time they see an application? How about the second time, and the third time? Eye tracking shows very clearly how people learn to interact with a system.

Case study 4: A new user visiting the website

The new user is seen to skip back and forth between the right hand side panel and the selections and information on the main part of the page to complete the task.

A frequent user of the website

The frequent user skips back and forth less frequently than the new user and is more focused on completing the task.

An expert user of the website

The expert user is highly focused and directed and completes the task with minimum effort.

This example was again from the banking CRM case study. The client even used an eye tracking video as part of the training package for customer service reps. It was used to show them the best way to look at the interface the instant a customer identifies themselves at a branch.
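The "skipping back and forth" seen in the new, frequent, and expert users can be quantified by counting how often consecutive fixations land in different areas of the screen. The sequences below are invented for illustration and the area names are hypothetical; they are not data from the CRM study.

```python
def count_switches(aoi_sequence):
    """Count how often consecutive fixations land in different areas."""
    return sum(1 for a, b in zip(aoi_sequence, aoi_sequence[1:]) if a != b)

# Hypothetical sequences of the area each fixation landed in:
# "side" is the right hand side panel, "main" is the main part of the page.
new_user = ["side", "main", "side", "main", "main", "side", "main"]
frequent_user = ["side", "main", "main", "side", "main", "main"]
expert_user = ["side", "main", "main", "main"]

for label, seq in [("new", new_user), ("frequent", frequent_user), ("expert", expert_user)]:
    print(label, count_switches(seq))
```

A falling switch count across repeated sessions is one simple signal that a person is learning where things are.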

Role-Played Customer Service

The eye tracking data in the CRM examples above was gathered during a simulated customer service interaction. The bank branch staff member was tracked during a 45-minute role-played customer interview. Afterward, the usability issues were discussed while the staff member's eye gaze and screen interactions were replayed to him. I can't think of any other way to do this type of test, which essentially involves three people: participant, role-played customer, and facilitator.

Commonly Reported Eye Tracking Advantages

A more relaxed testing environment where participants give feedback in their own time, and actually find more usability errors.

Executives like eye tracking because it produces compelling physiological data that can't be argued with.

Real time eye tracking data also provides for a better observation experience. I frequently find that if I am observing a participant's gaze data in real time while they complete their tasks, I am better engaged and glean more detailed insights about the user interface.

In TA, sometimes it can be very hard to see what a person is talking about during the test. I once mentioned this to a TA proponent and they suggested that if the TA is managed well it wouldn't be a problem. During the test, they would have the test facilitator ask the participant to hover their mouse over the part of the screen they are describing so that the observers can see what is being discussed. I'm sorry, but this just means the participant gets even more distracted from the task at hand.

About the Author(s)

James Breeze has a Masters of Organizational Psychology and his goal is to improve people's lives through improved design and usability of things. He runs Objective Asia and Eye Tracking consultancy in Singapore and SE Asia. Objective Asia was set up in February 2013 and is a subsidiary of Objective Digital, a UX consultancy in Sydney, Australia.

Comments

I agree that Think Aloud has the potential to disrupt natural use of an interface. The interesting thing is that I've seen it work as both a booster and a drag on usability in lab settings.

Some subjects get distracted talking (and an even smaller number get a little intoxicated on their own opinion!) and lose track of the task they were asked to perform.

Other subjects actually seem to improve their performance by thinking aloud - i.e. "So, I'm thinking that this button here will toggle the thingy to off, but now I say that, that doesn't really make sense, because the thingy is already off..."
Sometimes articulating thoughts verbally organises thought patterns that were a little disorganised previously.

Having said that - even in the latter situation where the user (probably) completes the task successfully, an observer has learned that the toggle button in question is a potential usability problem. All goes to show that careful analysis of usability results is crucial.

Yes, totally!
As I mentioned, the actual pages we tested were replaced with wires to protect client confidentiality ;)
Sorry!

@flora,
Eye tracking absolutely shows how people learn an interface. In the case shown we tested novices and experts - each group had dramatically different common eye gaze patterns. Also, when the novices looked at the interface a second and third time their average gaze patterns approximated the experts more closely with each task on the same CRM screen.

I find eye tracking a really interesting method for design investigations, although I have to disagree that it could answer the fourth question about how people learn. I argue that we can see the visual behaviour before and after, but in my opinion this method only gives clues to questions about "how" and "why".

Yes, graphics have a massive impact on the UX, as does the actual implementation of the fine-grained state changes (or transitions) in the productivity application's UI.

In my case, if we are doing usability [on wireframes] it will always be with an eye tracker, as that method provides many advantages to the testing process, as listed above.

Eye tracking wireframes will teach us some things about the IA and labelling of functions on the interface. But you'll get much more out of the eye tracking if you prepare an interactive, designed prototype.

I am really interested in your article. I personally like to use quick and dirty prototypes to test rich applications (we mostly design productivity applications as opposed to sites though). But this is an interesting approach.

Just curious, is it common practice to conduct eye-tracking studies on wireframes? Wouldn't effective (or ineffective) visual design have a major impact on where people focus their attention?

Interactive minds rightfully state:
"A method becoming more and more popular is the so called "retrospective think-aloud" in connection with eye tracking: The participants are confronted with their very own gaze behavior and can comment on it. This can lead to interesting and deep insights into their motivation while solving some task. With their gaze path at hand they remember more details than without it leading to even more precise results."

I was recently frustrated with an overseas client who wanted me to test a transactional site and then send the data to them for analysis!

I had to say, "Even though Tobii Studio records the RTA interview, it is best if we do some of the analysis for you."

You really have to be present to gain the insights required to understand the data and improve a site's usability.

You are completely right with your observation that RIA poses new challenges to eye tracking data analysis:
"If this new level of sophisticated analysis is not achieved, this will result in eye tracking data being misused and eye tracking will retain (inappropriately) its novelty status."

In fact, if one looks closely at some of the results published on the internet, one starts to wonder why people spent most of their attention on regions where obviously no interesting object can be seen. If you then visit the very same website, it becomes obvious what happened: the website consisted not only of a background, but also of a large overlaying JavaScript-enabled pop-up menu. So, most likely, people looked at the pop-up, but the eye tracker failed to distinguish looking at the background from looking at the pop-up.
A company called interactive minds have some interesting insights about that issue here:
http://www.interactive-minds.com/en/eye-tracking-usability
and here:
http://www.interactive-minds.com/en/eye-tracking-software/nyan2-gallery
(you have to scroll down and look for "Detection of Dynamic Web Page Elements" - not the best usability on that website, I have to say)