When user testing is performed, users are often presented with a series of variations of a product and observed to see where they excel and where they have trouble. People will often click in areas that are not intended to be clicked, or may try to achieve a goal in a way that you had not designed for. Now put yourself in their seat for a moment. When you go to click on something, why are you doing it? You are doing it either because the interface has made it clear what to do, or because you are simply relying on how you think things should work.

Once the results are gathered and analyzed and conclusions are drawn, the research is published and becomes factual information that others, like ourselves on this site, use to help create their products and interfaces.

So step back for a moment:

Isn't all user testing essentially a way of gathering how users think something should work and then drawing conclusions from that research? (Seeing as all decisions made while being tested are based on how the user thinks the interface works.)

If so, then why is what one thinks of as the best way to do something (like on this site) often taken with only a grain of salt when it isn't backed up by user testing/research? Yes, it is very true that using research to support your claim strengthens your argument and essentially says "I believe this and 100+ people in this study agree with me". But how is this any different than 100+ people agreeing with your individual opinion that is not backed by research?

I don't think that "gathering opinions or perceptions" is user testing; it is user interviewing. User testing is more along the lines of giving the user a task and observing how successful they are at completing that task. I ask them to think out loud so that I can have an idea of what they are thinking and where they are going, but I disregard anything that is an opinion, comment, or suggestion.
–
Dave Nelson Jul 12 '11 at 18:56

The user's success during that test is based upon their own intuitions and opinions of how they think the interface works. That's what I mean by opinion, not necessarily verbally stated opinion.
–
Matt Rockwell Jul 12 '11 at 19:10


I think the answer to your question of "Isn't all user testing essentially a way of gathering how people think things should work?" is YES. Building things that work exactly how a significant portion of potential users expect them to work is the goal of user testing.
–
Dave Nelson Jul 13 '11 at 1:40

10 Answers

You are conflating 'subjective' and 'unreliable'. Usability tests aim to get reliable information about people's reactions. Self-reported opinions are also subjective, and are much less reliable indicators of how other people will react to the interface.

If I test 100 people and their subjective opinion is that they hate an interface, I'm pretty sure that the next 100 people I test are also going to hate it. That's reliable data. And it's because usability tests are structured to factor out a lot of the unreliability.

If I ask 100 people, "Do you hate this interface?", the data will be all over the map:

Some folks will try out the interface and answer to the best of their abilities. But they may feel they have to soften or shade their feedback.

Some will want to please me and answer yes or no accordingly.

Some will feel like they did a good job (and say so), when actual evidence would reveal they are less productive, made more mistakes, or otherwise found the interface less than ideal.

Some will hate something peripheral to the interface. (Think of all the 1-star reviews of appliances on Amazon that are really 1-star reviews of UPS or Fedex.)

Some won't even use it, but will make up an opinion anyway. Then they'll retroactively come up with reasons--that may seem very real to them!

People who feel strongly are much more likely to respond. The "silent majority", however, make up the bulk of your users.

Quite frankly, most people are very bad at self-reporting accurate impressions of an experience.

Usability testing sets users up for success:

We exert control over what they're experiencing

We ask them to explain their expectations and actions

We have the opportunity to guide them down specific paths

We have the opportunity to ask follow-up questions

We can compare very specific portions of the interface to understand why a user has one opinion or another.

In short: usability testing is a process whereby professionals can mitigate biases in subjective experience to understand the impact of a design.

This is a really important question. Not all research is created equal. Honestly, some published studies are fundamentally silly, but you have to evaluate each on its own merits, which is why the "methodology" section of any scientific paper is so important. It matters how results are obtained and analyzed, and we would all do well to be more constructively critical of UX "studies."

Generically though, the reason I give more weight to studies (if well crafted) than self-reported or opinion data is that people lie. Take, for example, the question of how much someone drinks or if they go to church. When self reporting, respondents invariably drink less than consumption data would indicate and go to church more than attendance stats would indicate. Self reporting is far more subject to personal bias than is measured behavior. And that difference is interesting too, but if I want to design products/interfaces based on what people actually do and succeed at doing, rather than what they wish they did, I'll take the measured success data any day.

That users do better or worse at some assigned tasks in studies, even if it's because of their cultural or educational biases or training, is still an important finding--though its application may be limited in time or to a specific audience. Back to the methodology: it matters who's in the study group, and their demographics and psychographics. The generalizability of the findings is constrained by the characteristics of the participants. But that's important, and it's why I would still design a slightly different interface for senior citizens than I would for college students, even for the same content/domain.

As another example, I've also done continuous customer satisfaction research for many websites, and after a site redesign we almost always saw self-reported satisfaction levels drop. So, in a nutshell, return users thought the redesign sucked. However, concurrent with that same data, we would often also see conversion rise and/or inbound call center calls drop. Eventually (a couple of months) the customer satisfaction levels would recover and rise above their previous levels. Bottom line? Users aren't good designers. Who cares what they think they think? It only matters what they do, and that's not always the same as what they say they think.

Ideal user testing scenarios will collect both quantitative and qualitative data. For the quantitative, you typically will want to test something very specific, usually as compared to something else (A/B testing). Those situations typically produce a result like "x% more participants accomplished the task with option A than with option B".

Qualitative is where you can collect ideas/suggestions from the users that aren't as easily measured as digestible data points. It's more about exploring other ideas/options.

Now, some people only deal with qualitative (like Hollywood Test Screenings) and some ignore qualitative research altogether (Steve Jobs) but most tend to straddle the fence between the two as much as they can.
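When a quantitative A/B result like the one above is reported, it is usually checked for statistical significance rather than taken at face value. A minimal sketch of a two-proportion z-test, using entirely hypothetical completion counts:

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """z statistic for comparing two task-completion rates.

    Under the null hypothesis (no real difference between the
    variants), |z| > 1.96 is significant at the 5% level.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled completion rate, assuming the variants are equivalent
    p = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical numbers: 78 of 100 participants completed the task
# with option A, 62 of 100 with option B.
z = two_proportion_z(78, 100, 62, 100)
significant = abs(z) > 1.96
```

This is only a sketch; real studies also consider effect size, sample composition, and multiple comparisons, as the answers about methodology point out.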

Isn't quantitative data a result of the user's opinion/perception of how to achieve a task? Aren't user tests and research just compounded user opinions of what to do?
–
Matt Rockwell Jul 12 '11 at 17:11

I guess you can put it that way. What is the issue you are having with that?
–
DA01 Jul 12 '11 at 17:17

It was just something I thought of when pondering the difference in reactions of people when using certain studies and research in UX design. Say for example there was an answer on here that had a research study with 50 participants saying one thing, and someone whose answer is backed by 200 people using upvotes. It seems that in some cases people favor the research-study-based answer and think it has more validity, when in fact all it is is a grouping of opinions that should hold equal weight.
–
Matt Rockwell Jul 12 '11 at 17:23

A study is typically more scientific than just an opinion posted online, though. At the same time, a study isn't an absolute truth, either...but it gives a starting point for the discussion to form around.
–
DA01 Jul 12 '11 at 17:53

@Matt - using your voting up example, which would be more valuable - an answer with someone just stating their opinion that got upvoted 200 times or an answer with someone citing accepted user research results that got upvoted 200 times?
–
Charles Boyung Jul 12 '11 at 17:59

There are plenty of sorts of testing that aren't especially subjective. Think of studies that measure:

The average completion time for two forms, differentiated by only one UI strategy

The mean eye-tracking heatmap for a statistically significant sample size

The rate of 'correct' (sane) responses to forms with particular UI elements

Yes, you are measuring human opinions and subjective responses. But one can collect objective data about subjective things. It may be statistically invalid if it has a poor sample size, but that is not the same business as something being 'subjective'.
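For the first kind of study above (comparing average completion times between two form variants), a common way to decide whether the difference is real is Welch's t-test, which tolerates unequal variances. A minimal sketch with made-up timings:

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples,
    e.g. task completion times for two form designs."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 denominator)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / se

# Hypothetical completion times in seconds for two form variants
form_a = [31, 28, 35, 30, 27, 33, 29, 32]
form_b = [41, 38, 45, 36, 44, 39, 42, 40]

t = welch_t(form_a, form_b)  # large negative t: form A is faster
```

The participants' behavior is subjective, but the timings and the statistic computed from them are objective data about it, which is the distinction the answer is making.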

Nice response, but all of these results are gathered based on what the user expected to happen. If I am tested using a piece of software and can get to exactly what I want, when I want, quickly, and do it correctly - this is because either A) my opinion or perception of how it should work was aligned with how the interface actually did work, or B) the interface was designed in a way to guide me correctly based on the designer's perception of how it should work (and how it would be best understood).
–
Matt Rockwell Jul 12 '11 at 18:12

Basically there are no solid facts, and all existing "rules" and "standards" are developed because a majority prefers to use them, and succeeds in using them that way.
–
Matt Rockwell Jul 12 '11 at 18:12

Yes. But how is that an issue? One can collect objective data on the likelihood that something meets the perceptions of a random user in a particular set. Just because someone's opinions are involved doesn't mean data is moot (which seems the real nub of your query) - just ask the organizations who save / make thousands due to fewer support calls and smoother UIs. 'Opinion' doesn't make that cash vanish in a puff of smoke.
–
Jimmy Breck-McKye Jul 12 '11 at 18:14


A test of 100 people's responses that proves point X is not the same thing as 100 people agreeing on point X. It needn't have the same weight. Likewise, 100 people succeeding, not failing, when a field is correctly labelled, stands for more than 100 people saying they think it'd be a good idea.
–
Jimmy Breck-McKye Jul 12 '11 at 18:35


"Basically there are no solid facts, and all existing "rules" and "standards" are developed because a majority prefers to use them, and succeeds in using them that that way." This is starting to look more like a Philosophy 101 discussion than anything to do with UX. That's not a bad thing, it can be a helpful discussion, but your question gets asked of pretty much every domain of knowledge that exists - from what is "known" of cartography, to what is "known" about insect reproduction.
–
gef05 Jul 12 '11 at 18:43

User testing allows you to detect inconsistencies between the mental model your target audience creates about the system and the model of the system defined by the designers.

To detect inconsistencies, individuals who represent the target audience are evaluated, and a generalization is performed to distinguish individual from general inconsistencies.

Is this subjective?

The mental model each user creates is influenced by many different elements (conventions, previous experiences, culture, etc.), most of which are subjective. The key is that our target audience will also be influenced by these elements and will have similar problems with the system.

There are exceptions that do not apply to the target audience, but this happens with any statistical process.

Why not just ask for the users' opinions?

The user's inability to predict their future behavior is widely documented.
So the strategy followed is to detect the problems instead of asking the user for solutions.

Yes, but it's not the whole picture. When the user approaches our interface with a goal in mind, there are two influences: their own opinion, and the interface itself. The interface and experience can influence their behavior and even their opinions on how it should work.

So, user testing is gathering people's opinions and perceptions as a metric to influence, and be influenced by, a UX design.

Thanks for the thoughts. - Isn't your interface based on your team's opinions of how it should function, which in turn is probably based on research that has been cultivated from user opinions/perceptions of usage? So really, when broken down to its simplest form, all research and testing is based on user preference/opinion/perceptions.
–
Matt Rockwell Jul 12 '11 at 17:15

Basically what I am getting at is: should there be any difference between A) an opinion of how something should work that has a good explanation standing behind it, and B) an opinion of how something should work with a reference to a previously conducted study?
–
Matt Rockwell Jul 12 '11 at 17:17


There's a difference between a user's preference and what they actually do. That's a key concept to keep in mind. A user may say they prefer pink text on purple polka-dotted backgrounds (qualitative), but when asked to complete a task, you may find them failing and saying "I can't read this" (quantitative).
–
DA01 Jul 12 '11 at 17:19


My favorite analogy for this type of topic is the Homer Simpson car ;)
–
DA01 Jul 12 '11 at 17:21

A basic rule in psychology is 'What you see - is what you think you'll see'.

One's entire experience of the world is 'constructed' based on existing 'models' in the brain.

The brain is pretty good at this, but things such as visual illusions demonstrate where the brain can't get a mental model to match the incoming data.

Most of this processing is, however, unconscious. So the conscious 'user' is a bit like an iceberg: they only have access to a tenth (or whatever) of the processing that is going on in their brain. So they know what they are seeing. But they generally can't tell you why.

The difference is the background of the opinion. If you have a bunch of individuals each posting their individual opinions, there is no way to say that one is more valuable than another. All you can do is personally agree or disagree with them.

However, if you have a person stating "this is what a usability study with 100 participants shows" then that is going to hold significantly more weight. Yes, it is still just opinions, but those opinions have been collected and analyzed and consolidated to a formal conclusion.

Think of it like a political poll. You saying that your local mayor (or governor, or any other elected figure) is doing a horrible job and should be removed from office is really not that valuable. But a poll of a decent-sized sample of voters where the majority are saying the same thing definitely is valuable. Those opinions by themselves really don't mean much, but as a whole can provide a barometer of how the official is doing. Usability testing is the same way.

All I am saying is that in that poll, a person's vote is a person's vote and they are equal. An answer with 20 upvotes should be equal to an answer citing a research study involving 30 people, in which 19 agree in favor of the stated opinion.
–
Matt Rockwell Jul 12 '11 at 18:01

Not to mention that the "consolidation and analysis process" could be skewed or faulty based on the researcher's own personal motives, which has happened many times in research. A lot of times there is an agenda.
–
Matt Rockwell Jul 12 '11 at 18:03

@Matt is your question specifically pertaining to the SE upvoting system, or are you just using that as an example? As for research, ANY type of research is open to human bias and error. Still, research and its analysis is typically seen as more valid than opinions alone.
–
DA01 Jul 12 '11 at 18:27

No, not particularly to the SE system, but mainly to the opinion that published studies - which in the UX field are often few and far between due to the speed of emerging technology and new interface techniques - hold more weight than a general consensus or accepted opinion.
–
Matt Rockwell Jul 12 '11 at 18:39

Perceptions and opinions are not the same thing. Perceptions are our understanding of the world based on what we have learned through our senses. Opinions are our value-laden thoughts about something (e.g. about something we have perceived). How something is perceived by an individual is a subjective fact ("I can't see the button because it is the same colour as the background"). If the evidence/stats stack up and you can be confident that just about everyone will have the same perception then you have a normative fact (users will not see the button). If the button in our example really can't be discriminated from the background because it is the same colour or due to limitations of human physiology then we have an objective fact.

Users will also provide their opinion ("This interface is great, I like how it makes me hunt for the buttons"). Their opinions matter in a number of ways. They will tell others what they think of your interface, and the perception of your interface will be influenced by the opinion given.

Usability testing is about gathering normative facts. It can also be about gathering facts about people's opinions about the subject of the study (the software). Both can be valuable. They are however different things. Research that takes people's opinions (about the software) and reports them as facts (about the software)—e.g. instead of saying "45 out of 47 users said that the software was difficult to use" the report says "45 out of 47 users had difficulty using the software"—is bad research. It is flawed. It should not pass peer review. You should not give it any credibility.

There are varied reasons why facts-about-opinions and facts-about-behaviour-and-performance with software shouldn't be confused and misreported, but one important one is that they don't always match. People can report that interface A is horrible and slow and that they made many more mistakes with it than they did with interface B, when the evidence shows that they actually performed better and made fewer mistakes with interface A. And the reverse can also be true. This isn't to say opinions don't matter. But facts about opinions and facts about behaviour/performance/etc. shouldn't be confused.