Welcome

Welcome to Josh Baker's Practical Advice for Optimizing Your Internet Marketing blog. Here you will find internet marketing optimization and online strategy articles full of tips, tricks, discussions, and thoughts to help you take your marketing and business to the next level of success.

Simpson’s Paradox and Marketing Testing

In 1973, the University of California, Berkeley was sued for bias against women in its graduate admissions. According to the aggregate statistics, men had a much better chance of being admitted than women, and the reporting argued that this sex bias was unlikely to be due to chance, since the percentage gap between men and women admitted was so large.

But when the numbers were examined department by department, the picture reversed: the data actually showed a small but statistically significant bias in favor of women, who had a slightly higher chance of being admitted.

How can this be? Simple: it’s called Simpson’s Paradox. Simpson’s Paradox occurs when a trend that appears in individual subgroups reverses (or disappears) when those subgroups are combined.

Let’s take a look at the above example with numbers (taken from the Simpson’s Paradox entry on Wikipedia):

The subgroups combined, showing men more likely to be admitted:

            Applicants    Admitted
  Men          8442         44%
  Women        4321         35%

The subgroups alone (the six largest departments), showing women with the slight statistical advantage in being admitted:

  Dept    Men applicants   Men admitted   Women applicants   Women admitted
   A           825             62%              108               82%
   B           560             63%               25               68%
   C           325             37%              593               34%
   D           417             33%              375               35%
   E           191             28%              393               24%
   F           373              6%              341                7%
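You can verify the reversal yourself with a few lines of code. Here is a small sketch (using the six-department figures from the Wikipedia entry) that pools the per-department numbers and shows men coming out ahead overall, even though women lead in most individual departments:

```python
# Admission figures by department from the Wikipedia "Simpson's paradox"
# entry: each value is (number of applicants, admission rate).
men = {"A": (825, 0.62), "B": (560, 0.63), "C": (325, 0.37),
       "D": (417, 0.33), "E": (191, 0.28), "F": (373, 0.06)}
women = {"A": (108, 0.82), "B": (25, 0.68), "C": (593, 0.34),
         "D": (375, 0.35), "E": (393, 0.24), "F": (341, 0.07)}

def overall_rate(groups):
    """Pool the subgroups: total admitted divided by total applicants."""
    admitted = sum(n * rate for n, rate in groups.values())
    applied = sum(n for n, _ in groups.values())
    return admitted / applied

# Women trail in the pooled totals despite leading in four of six departments,
# because far more women applied to the departments that were hard to get into.
print(f"Men overall:   {overall_rate(men):.1%}")
print(f"Women overall: {overall_rate(women):.1%}")
```

The driver of the paradox is visible in the applicant counts: women disproportionately applied to the competitive departments (C through F), so their pooled rate is dragged down even while they do as well as or better than men within each department.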

Interesting, isn’t it?

Gordon Linoff from the Data Miners Blog states in his Simpson’s Paradox and Marketing post:

“I could imagine finding marketing results where Simpson’s Paradox has surfaced, because the original groups were not well chosen. Simpson’s Paradox arises because the sizes of the test groups are not proportional to their sizes in the overall population.”

As marketers, we have to watch out for Simpson’s Paradox both when making causal inferences from test data — that is, when deciding which variation to roll out as the winner — and when structuring and designing the tests themselves (i.e., making sure the groups being tested are as identical as possible). I see this often in email marketing A/B tests, when lists are pulled for testing and the results are then interpreted, although it shows up in many other marketing channels as well when split testing items such as landing pages or subject lines.
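To make the email example concrete, here is a sketch with entirely hypothetical numbers (the segment names and figures are invented for illustration): subject line B beats A within every segment, yet loses in the combined totals, because B happened to be sent disproportionately to the lapsed, low-response portion of the list:

```python
# Hypothetical email A/B test. Each entry is (sends, conversions).
# B was sent mostly to lapsed subscribers; A mostly to engaged ones.
results = {
    "engaged": {"A": (800, 80), "B": (200, 24)},
    "lapsed":  {"A": (200, 4),  "B": (800, 24)},
}

def rate(sends, conversions):
    return conversions / sends

# Within each segment, B has the higher conversion rate.
for segment, versions in results.items():
    a = rate(*versions["A"])
    b = rate(*versions["B"])
    print(f"{segment}: A={a:.1%}  B={b:.1%}  winner={'B' if b > a else 'A'}")

# Combined across segments, the ranking flips and A "wins".
total = {v: [sum(results[s][v][i] for s in results) for i in (0, 1)]
         for v in ("A", "B")}
a_all = rate(*total["A"])
b_all = rate(*total["B"])
print(f"combined: A={a_all:.1%}  B={b_all:.1%}  "
      f"winner={'B' if b_all > a_all else 'A'}")
```

If you only looked at the combined numbers here, you would roll out the losing subject line. The fix is exactly what the quote above suggests: randomize each version across the whole list (or weight segments proportionally), and read results segment by segment before trusting the aggregate.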