PPC A/B testing: How long do you wait?

Pay-per-click (PPC) advertising generates measurable sales data remarkably quickly, but sometimes we still have to wait a while for a given ad group and its ads to generate enough traffic for a statistically meaningful result.

How long exactly? The most effective measure we have used is the venerable confidence interval, the same statistic used to judge the significance of the opinion polls often seen on the nightly news. In those surveys a question is posed with two possible answers, such as “Do you think Three-eyed Wombats are good or bad for the Cleveland economy?” With a sufficiently large sample, the difference between the “good” and “bad” answers can be shown to be significant with a high degree of confidence, generally around 95%.
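As a rough illustration of the poll case, here is a minimal sketch of a 95% confidence interval for a proportion, using the common normal approximation. The function name and the poll figures are hypothetical, chosen only for illustration.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    # Clamp to the valid range for a proportion.
    return max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical poll: 540 of 1,000 respondents answer "good".
low, high = proportion_confidence_interval(540, 1000)
print(f"95% interval for 'good': {low:.3f} to {high:.3f}")
```

If the whole interval sits above 0.5, as it does for these made-up numbers, the “good” answer leads by more than the margin of error.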

A/B testing can be done the same way. The math for this is pretty straightforward, or at least it ought to be. We wasted a lot of time on traditional statistics books, trying to find one that could explain the concept to a non-statistician, before we finally found the excellent “Statistics For Dummies” by Deborah Rumsey, which does a very nice job of covering standard statistical methods such as confidence intervals.

If you want to ditch the math and get right to the test, a software version of this method can be found at www.splittester.com, which lets a user enter the “clicks” and the “clickthrough rate” for any two ads to determine whether one is outperforming the other. It also includes a nice overview of the reasoning behind the test. Although it is advertised as a confidence test for clickthrough rates, it can also be used to compare conversion rates.
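For readers who would rather see the arithmetic, here is a minimal sketch of one standard way to make this comparison, a pooled two-proportion z-test. It takes the same two inputs per ad (clicks and clickthrough rate), but it is our own illustration and not necessarily the calculation the site performs; the function name and the example numbers are hypothetical.

```python
import math

def compare_ads(clicks_a, ctr_a, clicks_b, ctr_b):
    """Pooled two-proportion z-test for two ads.

    CTRs are fractions (0.03 means 3%).
    Returns (z, two_sided_p_value).
    """
    # Recover impressions from clicks and clickthrough rate.
    n_a = clicks_a / ctr_a
    n_b = clicks_b / ctr_b
    # Pooled rate under the assumption that the ads perform the same.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (ctr_a - ctr_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical ads: A has 60 clicks at 3%, B has 90 clicks at 4.5%.
z, p_value = compare_ads(60, 0.03, 90, 0.045)
print(f"z = {z:.2f}, p = {p_value:.4f}")
print("significant at 95%" if p_value < 0.05 else "keep waiting")
```

To compare conversion rates instead, pass conversions in place of clicks and the conversion rate in place of the clickthrough rate.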

So how long does it take? If two ads really are performing differently (in this case, by more than about 1.5%), you should expect one to distinguish itself within about forty clicks at a typical clickthrough rate of about 3%. The same is true for conversion rates.
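One way to sanity-check a figure like this is to ask how many impressions each ad needs before a given gap would just reach 95% significance in a pooled two-proportion z-test. The sketch below makes several assumptions that are ours, not part of the original estimate: the 1.5% gap is read as percentage points (3% versus 4.5%), both ads receive equal impressions, and the function name is hypothetical.

```python
import math

def impressions_to_detect(ctr_a, ctr_b, z=1.96):
    """Impressions per ad at which an observed gap between ctr_a and
    ctr_b just reaches significance (pooled z-test, equal impressions).

    z=1.96 corresponds to roughly 95% confidence.
    """
    if ctr_a == ctr_b:
        raise ValueError("the two rates must differ")
    pooled = (ctr_a + ctr_b) / 2
    diff = ctr_a - ctr_b
    return math.ceil(z ** 2 * 2 * pooled * (1 - pooled) / diff ** 2)

# Assumed reading of the example: 3% versus 4.5% clickthrough rate.
n = impressions_to_detect(0.03, 0.045)
print(f"about {n} impressions per ad, or about {n * 0.03:.0f} clicks on the 3% ad")
```

Under these assumptions the threshold lands in the high thirties of clicks for the 3% ad, in line with the forty-click rule of thumb. This is a break-even point rather than a guarantee: random variation means a real test will sometimes need more data before the gap shows up.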

Advertisers with limited budgets may run out of money before their ads have been tested enough. In this situation, we recommend testing your ad groups sequentially, so that you can identify the best-performing ads as quickly as possible.

Other advertisers may not have enough traffic in any one ad group to measure clickthrough rates effectively. In this case, you can pool similar ads across ad groups. This is risky, but it can be a quick way to assess ads if one category is clearly outperforming the other.
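Pooling amounts to adding up impressions and clicks for each ad variant across ad groups before comparing rates. A minimal sketch follows; the row layout, the function name, and the figures are all hypothetical.

```python
from collections import defaultdict

def pool_by_variant(rows):
    """Sum impressions and clicks per ad variant across ad groups.

    rows: iterable of (ad_group, variant, impressions, clicks).
    """
    totals = defaultdict(lambda: [0, 0])
    for _ad_group, variant, impressions, clicks in rows:
        totals[variant][0] += impressions
        totals[variant][1] += clicks
    return {
        variant: {
            "impressions": imps,
            "clicks": clicks,
            "ctr": clicks / imps if imps else 0.0,
        }
        for variant, (imps, clicks) in totals.items()
    }

# Hypothetical data: the same two ad styles run in two ad groups.
rows = [
    ("widgets", "A", 400, 10),
    ("widgets", "B", 420, 17),
    ("gadgets", "A", 350, 11),
    ("gadgets", "B", 330, 16),
]
pooled = pool_by_variant(rows)
for variant, stats in sorted(pooled.items()):
    print(variant, stats["impressions"], stats["clicks"], f"{stats['ctr']:.3f}")
```

Part of the risk is that ad groups with very different baseline rates or traffic mixes can distort the pooled totals, so it is worth checking that the per-group numbers point the same way.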

Generally, though, the best thing to do is wait. The nice thing about online advertising is that the waiting is still relatively short, and even low-traffic sites should, within eight weeks, know more about their visitors’ expectations than they ever dreamed possible.