It’s the questions, stupid: less biased and more accurate tests

True or false: User testing is about getting into the head of the user?

Pictured: the head of a user.

Before I began this post I would have told you that absolutely, user testing is all about getting into the head of the user. Why else would we ask them questions, and go to such great expense to make those questions accurate, valid and reliable test instruments, if not to get into their heads?

Liar, liar pants on fire…

But we know that people often don’t mean what they say. Studies have shown that in a typical 10-minute conversation, the average person will tell 2.92 falsehoods. In essence, we filter what we say based on our context and situation. Much of what we say is coded for transmission based on the receiver. That’s a nice way of saying: I’m going to tell you what you want to hear. It’s just a little white lie. Don’t worry about it, it’s not gonna hurt anybody.

Lying is easier with speech than with actions. Our physical bodies have their own way of communicating. We call it body language.

Body Language

It’s been said that “hips don’t lie”.

She’s got a point.

Your body is constantly broadcasting what you’re feeling. Micro gestures in the face can give away a liar or reveal who’s guilty. It’s how magicians guess names. In the video below, watch how the “magician” guesses the name of the ex-boyfriend by asking the woman a rapid series of questions while he observes her facial micro gestures.

Did you know that your feet are the most honest part of your body? Almost nobody pays attention to what they are doing with their feet. You can observe openness or defensiveness by whether the feet are out in front, tucked behind and crossed, or wrapped around the chair legs. (You checked your feet, didn’t you?)

Check out the nervous guy at the end of the bar… What is he doing with his hands? Posture?

I’m pretty sure he’s just happy to see me.

What does this mean for the evaluator or user test creator? Does it mean we have to check to see if they’re packing heat or if they’re Shakira? Are these examples helping?

What it really means is that it’s very difficult to write (verbalize) a test question and not tip off your bias.

It seems to me that most of the standard test questions lead the user. For example, “How do you feel about this test?” or “Tell me about your mother.”

What if the user doesn’t feel anything about it? Once you force someone to generate a feeling, I figure they fall back on their default response. For some it’s a negative response. For others it’s a positive one.

This tester default response (call it tester bias? Tester prejudice?) must be accounted for. Personally, I believe in the “innocent until proven guilty” adage: the website I’m viewing is presumed innocent of design mistakes until it breaks a design rule. Finding out each tester’s default response is key to understanding user testers.
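Here is one rough way to account for that default response, as a sketch rather than a validated method. It assumes each tester rates several pages on the same 1-to-5 scale (the scale, the tester names and the scores are all made up for illustration) and treats a tester's own average as their baseline, so what remains is how each page did relative to that tester's usual mood.

```python
from statistics import mean


def baseline_adjust(ratings):
    """Subtract each tester's own mean score from every score they gave.

    ratings: {tester: {page: score}}
    Returns the same shape, with scores relative to the tester's baseline.
    """
    adjusted = {}
    for tester, scores in ratings.items():
        baseline = mean(scores.values())  # this tester's default response
        adjusted[tester] = {page: score - baseline for page, score in scores.items()}
    return adjusted


# Hypothetical data: one tester who likes everything, one who likes nothing.
ratings = {
    "optimist": {"home": 5, "search": 4, "checkout": 3},
    "skeptic": {"home": 3, "search": 2, "checkout": 1},
}

adjusted = baseline_adjust(ratings)
for tester, scores in adjusted.items():
    print(tester, scores)
```

The raw scores make it look like the two testers disagree, but once each baseline is removed they rank the pages identically, with checkout the weakest for both. A handful of ratings per tester is a very thin baseline, so treat the result as a hint, not a measurement.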

But is that getting into their head?

All of this is academic and probably doesn’t affect the outcome of tests in a measurable way, especially at this basic level. However, I feel it is good to visit and share these thoughts as we encounter them. The pitfall is thinking, “I will not test because I can’t be scientific in my measurement.”

Here are some ways I plan to alter the script based on my understandings stated above:

Tasks first, impressions later

For the practical user testing scenario, I suggest tasks first and impressions later. Allow the user to work with the site and form impressions in a natural way. I understand that the research shows you only have 5 seconds before someone decides whether to stay on the page or click the back button. But that doesn’t really translate to the testing environment. Perhaps a good compromise is to ask them to make a personal note, not to be shared, about their first impression. Then, after the tasks are complete, ask them to revisit that first impression and see how it changed. This might be more beneficial and give a truer sense of the user’s impression.

What this protects against is mistaking my initial, prejudiced reaction for an opinion. By asking for my impressions after only 5 seconds, you are really just measuring my prejudice. All sites are innocent until proven guilty of wasting my attention or stealing my cognitive load.

Go ahead, it’s okay to say it stinks

There has got to be a way to set the environment (and I’m talking about actively setting up the testing environment) to be open and honest. That doesn’t mean playing nice music and making sure you talk in a soothing voice. That crap would set me on edge. In such a clinical environment, I would feel less inclined to give my actual opinion. To be frank, I don’t know exactly how to do this. Anonymous testers might feel more inclined to give negative feedback. Perhaps the sandwich method of positive-negative-positive could be used. Are remote tests better than face to face? Strangers vs. friends?

I can’t do it with you watching me

Seriously. Stop that.

Being observed makes me nervous. That’s unfortunate, because user testing is all about making observations. My point is that, to keep the environment open and honest for feedback, both positive and negative, it could be important to make the recording and note taking inconspicuous. People understand they are being recorded, but they quickly return to a more natural expression when the recording isn’t conspicuous.

Conclusion

The whole point of user testing is to observe users interacting with your design with the intent to improve that interaction. There will always be some degree of bias in both the evaluator and the tester. You don’t have to eliminate it to find the majority of problems. However, if you think about it and make a few changes to the standard scripts for user testing, you can get a less biased and more accurate test of actual user experience.

Questions:

Do you have to get into users’ heads for a good user test?

How do you set the tone for a user test in order to elicit honest and open feedback?

Do you account for the tester’s natural baseline, their default response?

Do you, in a systematic way, observe and measure body language from the testers?