Analysing statistics just isn’t that easy

Having a solid statistical and scientific background, I often find myself frustrated by the research and data analysis I see in User Experience Metrics, Conversion Optimization and Google Analytics. In my opinion, doing research and analysing statistics requires proper training and an understanding of what you are doing. Am I the only one?

You need a brain to do statistics!

Just last week, I resisted several urges to throw ‘Measuring the user experience’ by Tom Tullis and Bill Albert against the wall. After reading the whole book, I found it to be a brave attempt to explain statistics, but also a total over-simplification of doing research. In my view, such a simplification seriously undermines the reliability of results.

The common message in the online research community appears to be that research and statistics are easy and can be done by everyone. True, all kinds of packages like Google Analytics or Convert make doing statistics that much easier. But still… you really need a brain to do it!

3 pitfalls

Research methodology and simple descriptive statistics are not easy. In my first year of university, three quarters of the students failed their first (mainly descriptive) statistics exam. My years of teaching also made clear that mathematics and statistics are the most challenging subjects. It is hard! Doing research and analysing data without proper knowledge of both research design and statistics can lead to serious misinterpretations of the results. I will discuss 3 pitfalls:

1. Doing statistics with small amounts of data

I am not going to argue that statistical analysis with fewer than 30 observations is impossible, because there are tests (Student’s t-test, for example) specifically designed for just that. Still, one should be aware that small samples have limited power. This means that a difference between two small samples will only be significant if it is obvious and large. For instance, if the old design of your checkout page converted at 2 % and the new design converts at 20 %, that difference can show up as significant with only a few dozen observations per design. But usually differences aren’t that obvious. Small differences or nuances cannot be tested with small samples.
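To get a feeling for how limited that power is, here is a minimal sketch in Python. It is not the t-test mentioned above: because conversions are yes/no outcomes, it uses Fisher's exact test instead, and it assumes two designs with the same number of visitors each. The function names and the conversion rates passed in are illustrative assumptions.

```python
from math import comb


def fisher_exact_p(a, b, n1, n2):
    """Two-sided Fisher exact p-value for a/n1 versus b/n2 conversions."""
    total = a + b
    denom = comb(n1 + n2, total)
    lo, hi = max(0, total - n2), min(n1, total)
    # Hypergeometric probability of every table with the same margins.
    probs = {x: comb(n1, x) * comb(n2, total - x) / denom
             for x in range(lo, hi + 1)}
    observed = probs[a]
    # Add up all tables that are at most as likely as the observed one.
    return sum(p for p in probs.values() if p <= observed * (1 + 1e-9))


def binom_pmf(k, n, p):
    """Probability of exactly k conversions among n visitors."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def power(p_old, p_new, n, alpha=0.05):
    """Chance of a significant result with n visitors per design."""
    return sum(
        binom_pmf(a, n, p_old) * binom_pmf(b, n, p_new)
        for a in range(n + 1)
        for b in range(n + 1)
        if fisher_exact_p(a, b, n, n) < alpha
    )


# Compare an obvious difference (2 % vs 20 %) with a subtle one (2 % vs 4 %).
for n in (10, 20, 50):
    print(n, round(power(0.02, 0.20, n), 2), round(power(0.02, 0.04, n), 2))
```

Running this for your own expected conversion rates gives a rough idea of whether a planned test has a realistic chance of detecting anything at all.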

More importantly, I really wonder whether you should do statistical analyses with a very small sample at all. I would always advise a qualitative approach if you have a sample of 15 individuals or fewer. In a qualitative research design you gather an in-depth understanding of human behaviour. Ask open questions and try to discover why visitors of your website buy your products, (dis)like your design or read your posts. Analysing these answers (in a non-statistical manner) will be of great value in increasing the conversion of your site. ‘Measuring the user experience’ actually gives a nice introduction to a more qualitative approach to user experience research.

2. Representative sample

Just as important as the sample size is the question of whether the sample is representative. Does the sample of individuals you study resemble the total population? An example:

If we did a User Experience study of Yoast.com and asked totally random people to visit our website, the sample would not be representative. No offence, but to visit the Yoast website, you have to be some kind of nerd. You can imagine that the User Experience of random people will probably differ greatly from that of nerds. A representative sample of our population would thus be a random sample of nerds. We would need nerds from all over the world, because our readers from the US probably differ from the ones we have in Europe, India or Australia. And maybe, because of a recent growth in our reader population, our current population also includes some non-nerds. We should definitely take into account the nerdiness of the individuals in our sample. Drawing a representative sample is hard, all the more so if you do not know exactly what your population looks like. Taking a large random sample takes care of most of these issues. But: especially with small samples, it is hard to make sure your sample is representative. And: a non-representative sample leads to non-representative (and thus worthless) results.
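One simple sanity check is to compare the make-up of your sample with what you know about your population. The sketch below does this for region only. The readership shares and the 20 recruited participants are entirely hypothetical numbers chosen for illustration, and `share_gaps` is a made-up helper name.

```python
from collections import Counter


def share_gaps(sample_regions, population_shares):
    """Sample share minus population share, per region."""
    counts = Counter(sample_regions)
    n = len(sample_regions)
    return {region: counts.get(region, 0) / n - share
            for region, share in population_shares.items()}


# Hypothetical readership shares; replace with what your own data shows.
population_shares = {"US": 0.40, "Europe": 0.35, "India": 0.15, "Australia": 0.10}

# Hypothetical sample of 20 participants, recruited mostly in Europe.
sample = ["Europe"] * 14 + ["US"] * 4 + ["India"] * 2

gaps = share_gaps(sample, population_shares)
for region, gap in gaps.items():
    # Positive = over-represented in the sample, negative = under-represented.
    print(f"{region}: {gap:+.0%}")
```

Large gaps are a warning sign that conclusions from the sample may not carry over to the whole population. The same check could be repeated for any other characteristic you can measure, such as nerdiness.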

Validity & Reliability

Validity:

The validity of a measurement tool (for example a question in a survey) tells us the degree to which the tool actually measures what it claims to measure. Sometimes it is referred to as accuracy.

Reliability:

Reliability is the extent to which a measurement gives consistent results. So, if you pose the same question to the same person twice, will the answers be the same? A reliable measurement tool results in the same answers over and over again.

Difference between reliability and validity:

Imagine a person of 200 pounds stepping on a scale 5 times and getting readings of 15, 250, 95, 140 and 500 pounds. This scale is not reliable: the reading is different every time. If the scale consistently reads 150 pounds, the scale is reliable, because the readings are the same. However, the scale is not valid. The reading is wrong. It does not measure what you want to measure.
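The scale example can be put into numbers. In this minimal sketch, the spread of the readings stands in for (un)reliability and the average deviation from the true weight stands in for (in)validity. That is a simplification chosen for illustration, and `describe` is a made-up helper name.

```python
from statistics import mean, stdev

TRUE_WEIGHT = 200  # pounds, as in the example

erratic_scale = [15, 250, 95, 140, 500]  # different reading every time
consistent_scale = [150] * 5             # same (wrong) reading every time


def describe(readings, true_value):
    """Return (spread, bias) for a series of readings."""
    spread = stdev(readings)             # small spread = reliable
    bias = mean(readings) - true_value   # bias near zero = right on average
    return spread, bias


for name, readings in (("erratic", erratic_scale), ("consistent", consistent_scale)):
    spread, bias = describe(readings, TRUE_WEIGHT)
    print(f"{name}: spread={spread:.1f} lb, bias={bias:+.1f} lb")
```

The erratic readings happen to average exactly 200 pounds, yet no single reading can be trusted. The consistent scale has no spread at all, but is 50 pounds off every time.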

3. Validity: GIGO

Website analytics is awesome because a lot of measuring is very easy. You can just count the number of visitors on your page and the number of clicks on a button. Attitudes towards your brand and self-reported issues with usability are much more difficult to measure though. If you want to measure these kinds of things, you could do a qualitative study with a small sample. But a quantitative design with a larger number of individuals is also possible. Possible, but also challenging and difficult! Drafting the questions for a survey (especially one with a limited set of answer options) is difficult and requires proper testing. You should make sure that your questions really measure what you want to know. Measuring what you want to measure is what we call the validity of your measurements. An example:

You want to measure the extent to which people like the design of your website. You ask whether they like the colour. The answers to this question indeed say something about the degree to which people like your website. But design is more than colour. You would probably need more questions to really capture the degree to which people like the design of your website.

If the questions you present to people are of bad quality, the data will be of bad quality as well. So remember GIGO: Garbage In, Garbage Out!

Interpreting invalid data (however sophisticated the statistical analyses you apply) will always lead to invalid results.

Conclusion

Research is definitely a very powerful tool. But, I think you should have some statistical and methodological background in order to interpret results and execute proper analyses. Taking the time to really understand what you are doing is required.

In this post, I have only discussed very basic methodological and statistical topics. If this is out of your league, you should definitely brush up on your statistical knowledge (only if you want to do research, otherwise please do something more fun).

That being said, I do understand the appeal of simple statistical techniques that are available to a broad public. Testing is a beautiful tool to improve your website! For the future, I expect research to become more and more important for website owners.

This is why we are currently brainstorming at Yoast about designing a tool or service that will help people interpret test results and statistics. We will keep you posted about developments in this new project!

Google Analytics is basically useless nowadays, they withhold so much information. But there are plenty of other programs out there that can do the same thing. Just do a Google search, a lot of them are free.

I totally agree. Google Analytics has turned into a visitor counter in my opinion. Even with their integration of Webmaster Tools, it apparently just shows what sites are showing up for in search engines, but all I am seeing is image results.

Even with my own background in statistical analysis, the information currently available for website owners is hit or miss at best. It will be nice if Yoast could help with this issue. I’ll keep my eye on your site for further developments.

I have to recommend ‘How to Measure Anything: Finding the Value of “Intangibles” in Business’. The author covers a lot of territory and anyone interested in using statistical data may find it insightful.
On a personal note, I find it’s difficult to even know what to measure when looking for relationships. Often there is some simple, non-obvious metric that isn’t tracked but has a clear correlation with conversion or deepening user interaction. The path to discovery for these metrics usually involves physically watching someone who is unfamiliar with an interface and who isn’t conscious of being observed (like at a tradeshow or however you can “socially engineer” a situation where this is possible).
I guess the point is: if you work with smaller samples, maybe you need more data about each person in the sample, and the findings can become more usable?

I think you are right, with smaller samples, you can go more in depth… measure more about fewer people. Your methods to analyze such data will be different than when analyzing larger samples. I am going to take a look at the book you recommended! Thanks

Good article! This is really basic stuff, and you don’t need a background in statistics to get this far. Although I’m sure it’ll help. Think logically, study the topic and the system, and approach the data carefully, and you’re likely to get something useful out of it.

I don’t necessarily think that you need some statistical and methodological background to understand Google Analytics. I think over time you will be able to determine what constitutes a good statistical sample for your blog. This is especially true if you are getting consistent traffic.

Reading this article reminds me of the book ‘Software Estimation: Demystifying the Black Art’ by Steve McConnell. In it the author points out that if you want to do estimation properly you need to take a lot of measurements from previous projects, do the appropriate math and crunch a lot of numbers. He also points out that for an awful lot of development projects this is probably overkill. Steve McConnell then goes on to say that good estimation results can still be achieved by using heuristics rather than a robust scientific method. Perhaps this is the approach we, as website owners, need to take? We know that we are not approaching the analysis in a scientific and robust way, but if we use the data we have and apply the small amount of brain power we have, then decent trends and results can still be spotted. So maybe we shouldn’t call it statistical analysis, but maybe the ‘black art’ of statistical estimation? (mmm.. a better name is perhaps needed.)

I agree that Google hiding data really complicates the analyses. They give us a tool and take away the most important information from it. I personally think that ‘not provided’ is the biggest failure of 2013.

Analyzing data from website traffic, analytics software, AdSense and other advertising campaigns is not easy and is not a task for everyone. Analyzing data in the right way can lead you to success in a short time.
