Blog

Clinical trials would never give a patient the drug AND the placebo.

So why are you serving the same person both versions of your ad?

Digital advertising makes real-time creative testing and optimization possible at a speed and precision previously unimaginable. While most digital marketers want optimized creative that respects their customers and drives response rates, confusion reigns around the proper way to test and optimize creatives. As a result, only a few marketers are truly engaged in proper creative testing and optimization. The majority rely on the outdated testing capabilities of ad servers and Dynamic Creative Optimization (DCO) platforms, whose tests have fatal flaws that result in the arbitrary selection of winning creatives.

What insights separate the savvy digital marketers who are leveraging real-time creative testing and optimization? Some critical insights come from the biomedical research space.

The Challenge of Confounding Effects

Most creative tests are impression-level, automated tests. This means that, throughout a test, the volume of impressions allocated to each creative shifts to creatives that appear to be winning based on early, inconclusive test results.

This is the case with ‘rotation optimization’ and ‘automated testing’ capabilities of most ad servers and DCO platforms. While these platforms offer the tantalizing prospect of dynamically optimizing towards winning creatives on a daily or even an hourly basis, the reality is that these tests produce largely random results that waste valuable media dollars and time.

The critical flaw in these tests is that they don’t control for the day-to-day factors that affect campaign performance. As a result, these factors are confounded with the results of creative tests, making it impossible to tell whether a conversion was driven by the ad or by an unaccounted-for confounding effect.

For example, if two ads are being A/B tested, and Ad A appears to perform better on Friday, an automated test will assign more impressions to Ad A than to Ad B the next day, on Saturday. But if the advertiser always sells more on weekends, then Ad A will unduly benefit from this confounding temporal effect, resulting in even more impressions assigned to Ad A on Sunday and then Monday.
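The weekend example can be made concrete with a little arithmetic. The sketch below uses made-up numbers (the conversion rates and impression splits are assumptions, not Adacus data): both ads convert at exactly the same rate on any given day, yet Ad A looks better overall simply because it was handed more of the high-converting weekend impressions.

```python
# Hypothetical numbers for illustration only. Both ads convert at the SAME
# rate on any given day; only the day of the week changes the rate.
days = [
    # (day, conversion rate shared by both ads, impressions to A, impressions to B)
    ("Fri", 0.01, 5000, 5000),  # even split, Ad A happens to look better
    ("Sat", 0.02, 7000, 3000),  # optimizer shifts impressions toward Ad A
    ("Sun", 0.02, 8000, 2000),  # and shifts again
]


def pooled_rate(ad_index):
    """Overall conversion rate for one ad across all days (0 = A, 1 = B)."""
    impressions = sum(day[2 + ad_index] for day in days)
    expected_conversions = sum(day[1] * day[2 + ad_index] for day in days)
    return expected_conversions / impressions


rate_a = pooled_rate(0)
rate_b = pooled_rate(1)
print(f"Ad A pooled rate: {rate_a:.4f}")
print(f"Ad B pooled rate: {rate_b:.4f}")
```

In this toy setup Ad A's pooled rate comes out higher than Ad B's even though the two ads are identical by construction. The gap comes entirely from when each ad was shown.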

Some DCO platforms claim to be able to account for time-varying confounding effects, but these claims are also a mirage as most historic time-varying effects in digital advertising are not sufficiently stable to build any temporal models around. In particular, changes in media planning that are made monthly and, with programmatic bidding, daily and hourly, have dramatic effects on the quality of media being sent to an ad server.

The best evidence that automated creative tests produce random output is found by conducting A/A tests on these platforms. When comparing the same ad to itself, there should be no winner. And yet, Adacus studies have found that rotation optimization platforms do pick winners from A/A tests and, what is worse, the winner is different from test to test.
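To get a feel for why an A/A test can still produce a "winner," here is a small simulation. It is only a sketch under stated assumptions: the optimizer rule (give 80% of the next day's impressions to the current leader), the 1% conversion rate, and the traffic volumes are all hypothetical, and this is not the methodology of the Adacus studies mentioned above.

```python
import random

TRUE_RATE = 0.01          # identical for both copies: this is an A/A test
DAILY_IMPRESSIONS = 2000  # hypothetical traffic volume
DAYS = 7
LEADER_SHARE = 0.8        # hypothetical rule: leader gets 80% of tomorrow's traffic


def run_aa_test(seed):
    """Simulate one adaptive rotation test on two identical copies of an ad."""
    rng = random.Random(seed)
    impressions = [0, 0]
    conversions = [0, 0]
    share_first = 0.5  # start with an even split
    rates = [0.0, 0.0]
    for _ in range(DAYS):
        for _ in range(DAILY_IMPRESSIONS):
            copy = 0 if rng.random() < share_first else 1
            impressions[copy] += 1
            if rng.random() < TRUE_RATE:
                conversions[copy] += 1
        rates = [c / n if n else 0.0 for c, n in zip(conversions, impressions)]
        # Shift tomorrow's traffic toward whichever copy appears to be ahead.
        if rates[0] > rates[1]:
            share_first = LEADER_SHARE
        elif rates[1] > rates[0]:
            share_first = 1 - LEADER_SHARE
        else:
            share_first = 0.5
    if rates[0] == rates[1]:
        return "tie"
    return "copy 1" if rates[0] > rates[1] else "copy 2"


winners = [run_aa_test(seed) for seed in range(20)]
print(winners)
```

Each run ends with one copy ahead of the other, and which copy that is depends only on the random seed. Because the two copies are the same ad, any declared winner here is noise.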

The Critical Insight – Randomized Controlled Assignment of Households

The gold standard for testing is an approach used most prominently in biomedical research.

Trials of new drugs, medical devices or therapies are characterized by high-stakes experiments, and it’s critical to have results that can be reliably replicated on a larger population.

Biomedical trials randomly assign treatments to patients in order to control for confounding effects such as gender, age, pre-existing conditions, and so on. This is known as a Randomized Controlled Trial.

To control for confounding effects in digital advertising, as in biomedical research, it is critical to conduct randomized controlled assignment of creatives against households, and for the proportion of households assigned to each creative to stay the same throughout the duration of the test.

When creatives are randomly assigned impression-by-impression, with no regard for users or households, then users see both creatives being tested. When these users convert, it is impossible to know which creative should be attributed with the conversion.
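How often does this happen? A quick back-of-the-envelope calculation helps. Assuming each impression is assigned independently (an idealized assumption, and the function name is ours), a user sees only one creative if every one of their impressions lands on the same side of the split.

```python
def prob_sees_both(n_impressions, share_a=0.5):
    """Chance that a user served n independent impressions sees both creatives.

    share_a is the fraction of impressions allocated to creative A.
    """
    if n_impressions < 1:
        return 0.0
    only_a = share_a ** n_impressions
    only_b = (1 - share_a) ** n_impressions
    return 1 - only_a - only_b


for n in (1, 2, 5, 10):
    print(n, prob_sees_both(n))
```

With an even split, a user who receives five impressions has a 93.75% chance of having seen both creatives, so at typical frequencies nearly every converter has been exposed to both.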

Cookies Aren't Enough

Some ad servers enable random assignment of creatives by cookie. This approach is also flawed, because most users use multiple devices, and more and more devices do not support third-party cookies. Furthermore, the ad servers that support this type of random assignment don’t report raw conversion results on their dashboards, because conversions are first deduplicated by media channel such that test results take months or years to reach any degree of statistical significance.

Lastly, creative test results are often included in reports from multi-touch attribution (MTA). These comparisons of creative performance are not based on randomized controlled tests, and instead attempt to control for confounding variables like placement, daypart and so on through the use of regression models and related approaches. This is highly unreliable, as the variation in performance by placement, daypart or any other dimension simply can’t be teased apart from variation in performance by creative. As discussed above, there is simply far too much noise from these confounding factors.

Take Note, Digital Advertisers

Randomized controlled assignment of creatives to households is the only way to test and optimize creatives in digital advertising. This approach leverages householding technology to ensure that all devices associated with a household will see the same creative throughout the duration of a test.
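One common way to implement a stable random assignment is to hash an identifier. The sketch below is an illustration, not a description of how Adacus or any particular ad server works: the household IDs, the device-to-household mapping, and the function name are all hypothetical, and it assumes a householding graph has already resolved each device to a household.

```python
import hashlib


def assign_creative(household_id, test_id, creatives=("A", "B")):
    """Deterministically assign a household to one creative for a given test.

    Hashing the test and household IDs together gives an effectively random
    but repeatable bucket, so the assignment never changes during the test.
    """
    key = f"{test_id}:{household_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(creatives)
    return creatives[bucket]


# Hypothetical output of a householding graph: device -> household.
device_to_household = {
    "phone-123": "hh-42",
    "laptop-456": "hh-42",
    "tablet-789": "hh-77",
}

for device, household in device_to_household.items():
    print(device, household, assign_creative(household, "spring-test"))
```

Because the assignment depends only on the household and the test, every device in a household gets the same creative on every impression, and the split between creatives stays fixed for the life of the test.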

Only randomized controlled assignment of households has the benefit of producing replicable, actionable insights, AND is a transparent testing methodology whose results can be displayed and analyzed with an A/B Test Dashboard.

Download our free eBook for more A/B Testing Insights

Most A/B tests are designed in a way that all but ensures the learnings will not translate into an ongoing lift in sales.

Ultimately, to know with confidence that one creative variation will outperform another, tests must be deployed at the user level, and not the impression level. When tests are run at the impression level, results are far more likely to indicate no difference between creatives. That’s because impressions don’t engage with brands, people do.

The large number of inconclusive A/B tests resulting from testing impressions rather than users unfortunately discourages many advertisers from continued testing and optimization. User-focused digital marketers are smarter, however, and demand creative A/B testing that is administered at the user level.

Users Split by Line Item

Tests are all too often conducted by comparing the performance of two campaigns or line items being run simultaneously. The problem with this approach is that it does not hold the audience constant, which is a fundamental requirement of A/B testing. Different line items, even with identical settings, will inevitably access slightly different inventory over the course of the test. A proper A/B test in digital advertising must be run on a single line item, with the ad server assigning users into groups at each impression so that the audience used for the test truly is identical across the test groups.

Testing Plan Siloed from Media Plan

It is critical that creative A/B testing is coordinated with your media agency or in-house programmatic team. Programmatic media buying that is not coordinated with creative optimization can undermine creative tests in the following ways:

Buying low-CPM inventory that is less viewable limits the impact of any creative. As discussed in the introduction, creative is growing in importance for multiple reasons, one of which is that digital advertisers are increasingly getting the viewability and attention for their ads that they have been missing. Your creative testing will only be as informative as your ads are viewable. Ad placements that compete with 5-10 other placements on the page may be low in CPM, but the limited attention they garner makes creative less effective, and thus makes creative testing less effective. If your impressions are 40% unviewable, 40% competing with 5-10 other placements, and only 20% both viewable and prominent, the impact of creative will simply be minimal.

Targeting the same users in the A/B test with other programmatic campaigns. Sometimes when an agency sets up a creative A/B test, they generate a separate placement in the ad server for the test and traffic it to a line item or package with their trading desk. Such tests are less likely to measure the actual differences in performance between creative variations, because users in the test are being served creatives from other programmatic campaigns at the same time. When providing multiple placements or ad tags to a trading desk, ensure that the trading desk isolates users in a test from other programmatic campaigns.

Creative Test Obscured by Media Attribution

Ad servers, other than Adacus, apply attribution models by default to all conversion reporting. In other words, when a user converts after having seen ads from a test as well as ads from other placements or channels, the ad server will likely not attribute that conversion to the test group to which the user was assigned. This makes A/B testing all but impossible, for two reasons:

Attribution across channels removes most conversions from the results of an A/B test, thus increasing the amount of testing time required to achieve significance from weeks to months.

Multi-channel attribution introduces noise into the A/B test results. Evaluation of an A/B test does not require multi-touch attribution as users are only presented with the A ad or the B ad throughout the duration of the test.

The solution is to report creative test performance separately from media performance across channels. In the creative test report, include all users that converted after seeing an ad in the test regardless of whether or not they were exposed to media from outside of the test as well.
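As a rough sketch of what such a report could look like, the example below counts every converting household in its assigned test group, whether or not it was also reached by other channels. The data, field names, and function name are all hypothetical.

```python
from collections import Counter

# Hypothetical test data: household -> assigned test group.
assignments = {"hh1": "A", "hh2": "A", "hh3": "B", "hh4": "B", "hh5": "B"}

# Hypothetical conversion log. Exposure to other channels is recorded but
# deliberately ignored when evaluating the creative test.
conversions = [
    {"household": "hh1", "other_channels_seen": ["search"]},
    {"household": "hh3", "other_channels_seen": []},
    {"household": "hh4", "other_channels_seen": ["email", "social"]},
]


def creative_test_report(assignments, conversions):
    """Conversion rate per test group, with no cross-channel attribution."""
    exposed = Counter(assignments.values())
    converted_households = {
        c["household"] for c in conversions if c["household"] in assignments
    }
    converted = Counter(assignments[h] for h in converted_households)
    return {
        group: {
            "households": exposed[group],
            "conversions": converted[group],
            "rate": converted[group] / exposed[group],
        }
        for group in sorted(exposed)
    }


report = creative_test_report(assignments, conversions)
for group, stats in report.items():
    print(group, stats)
```

Media attribution across channels can still be reported elsewhere; the point of this sketch is only that the creative test keeps its own, unattributed tally.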

Learn how to avoid more common pitfalls in our eBook

"The Essential Guide to A/B Testing for Digital Advertisers"

Excerpt from free eBook, "The Essential Guide to A/B Testing for Digital Advertisers"

Marketers perpetually ask if test results are “statistically significant,” but what does that term even mean? Probably not what you think.

Traditional statistics relies on p-values as a measure of the “statistical significance” of test results. Most marketers are surprised to learn that p-values do not, in fact, measure the probability that one creative or treatment will outperform another. What p-values measure is far more abstract and removed from the decisions that marketers make based on A/B tests.

In every A/B test, one variation will perform at least slightly better than the other. P-values measure the probability that a test result at least as extreme as the one observed (say, creative variation B outperforms creative variation A by 10%) would have occurred if in fact there were no difference between the two creatives at all. That “95% confidence level” threshold you’ve probably heard bandied about simply means that a result is called significant only when there is less than a 5% chance that, were the two variations identical, you would have observed as large a difference in performance between them as you did. The p-value is an important measure in other fields of study to account for what is known in traditional statistics as Type I error. In our experience, we have yet to hear a digital marketer ask us for this specific probability. And why would they?
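For readers who want to see what that probability looks like in practice, here is a minimal sketch using a standard two-proportion z-test. The choice of test, the function name, and the sample numbers are our assumptions for illustration; the excerpt does not say which test any given platform uses.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates.

    Uses the pooled two-proportion z-test with a normal approximation.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Probability of a gap at least this large if the creatives were identical.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical test: B converts 10% better than A (1.1% vs 1.0%).
p = two_proportion_p_value(conv_a=100, n_a=10_000, conv_b=110, n_b=10_000)
print(f"p-value: {p:.3f}")
```

With these made-up numbers the p-value is roughly 0.49. That figure says a gap this large would show up about half the time between two identical creatives. It does not say there is a 49% or a 51% chance that B is the better ad.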

Learn why Adacus uses Bayesian statistics in our latest eBook

"The Essential Guide to A/B Testing for Digital Advertisers"

Adacus, like any growth-stage company, has to pick and choose conferences very judiciously. For better or worse, the digital marketing industry offers more than its fair share of conferences from which to choose.

Ad:tech, on the other hand, sits comfortably somewhere along the continuum between those two. Adweek ranks it as one of the top 10 digital marketing conferences to attend. By the conference's own account, it "is the original industry authority for marketing and media technology, where marketing, technology and media communities assemble to share new ways of thinking, build strong partnerships, and define new strategies to compete in an ever-changing marketplace."

Beyond the fancy language, we're excited to exhibit among other companies hustling to leverage technology for positive change within some niche of the greater industry. It is the place to learn what's next and to talk with the companies making it happen.

Come meet Adacus at booth NX20 on the exhibition floor. There just might be some swag in it for you.

Dynamic Creative Optimization (DCO) plucked the low-hanging fruit of creative optimization in digital advertising. By turning a creative into a creative template, DCO changes the value of a text or image element using a large data feed, usually location or last product viewed. But once a marketer has dynamic locations and product images, what comes next? If DCO excels at contextualizing a single creative, a Creative-Side Platform (CSP) goes further by optimizing an entire creative plan.

CSPs represent as big an opportunity for digital marketers as DSPs. CSPs do for creative plans what DSPs did for media buying. That DSPs preceded CSPs is a matter of economics – arbitrage is a big money-maker, so that’s where investment capital went first. But this also tells us something about the potential of CSPs – they drive lift without a penny more in media spend.

What exactly is a CSP? A CSP is an ad server with tons of audience data. CSPs access audience data by integrating with DMPs but also by building in third-party audience data like demographics and interests. CSPs leverage all of this audience data with endless creative testing, decisioning and optimization tools that make it easy to develop and deploy a creative segmentation strategy.

You can run an A/B test that shows which creative variation wins with men vs women, with parents vs non-parents, or with new customers vs long-time customers. You can figure out how to drive sales from millennials by running an A/B/C test of 3 creatives only when a millennial is served an impression. And, ultimately, you can deploy a creative decision tree that captures the creative segmentation insights generated from all this testing.
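A creative decision tree can be as simple as a handful of nested rules. The sketch below is purely illustrative: the segment fields and creative names are hypothetical, and in practice each branch would be backed by the A/B test results that justified it.

```python
def choose_creative(user):
    """Pick a creative from audience attributes using a small decision tree.

    The segment fields and creative names are hypothetical examples.
    """
    if user.get("customer_status") == "new":
        # Suppose testing showed parents respond to a family-oriented message.
        if user.get("is_parent"):
            return "family_intro_offer"
        return "general_intro_offer"
    # Long-time customers: suppose a millennial-specific creative won its test.
    if user.get("generation") == "millennial":
        return "millennial_loyalty"
    return "standard_loyalty"


print(choose_creative({"customer_status": "new", "is_parent": True}))
print(choose_creative({"customer_status": "long_time", "generation": "millennial"}))
```

Keeping the rules in one place means the same segmentation logic applies to every impression, whichever line item or DSP delivered it.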

CSPs represent the elevation of programmatic from an ad delivery platform to an insight-generating platform. The messaging segmentation strategy that results is a proprietary asset of the brand, as tightly guarded as the segmentation strategy developed by traditional database marketers. In fact, that’s a telling comparison, as CSPs represent the introduction of martech best practices into digital advertising. DCO, by comparison, is a thoroughly adtech phenomenon.

Hold it, you may be thinking. I can already serve different creatives to men and women using my DSP. Not so fast.

First, media targets are not creative targets. Men respond to a message tailored for men regardless of how much the media they consume costs. Media costs are incredibly dynamic from day to day, as anyone who has worked the controls of a DSP console can tell you. A creative segmentation strategy, on the other hand, is more stable, though it does evolve over time with changes to the brand and to consumer preferences.

Second, ask an ad operations team whether they can serve 5 creatives to 5 different demographic and interest segments. Here’s the ad ops ordeal this creates: buying 5 media targets from a DMP, creating 5 line items on the DSP, cookie syncing the 5 media targets with the 5 line items, generating 5 ad tags from your ad server, and trafficking those ad tags to their respective DSP line items - and then testing all of this to make sure no mistakes were made. Even then, your creative decisioning rules are ignored by all other line items on your DSP, other DSPs you use and direct buy publishers.

Third, testing to see if this creative decision makes sense on the DSP isn’t possible because creative testing requires holding the media constant between creative groups. Creative testing has to be done on an ad server.

The slow feedback loops of direct marketing channels are about to be replaced with far faster and more data-rich feedback loops from programmatic. CSPs are the orchestrators of this feedback loop and will not only drive future lift in digital advertising, they will drive the segmentation insights that are leveraged across all direct marketing channels.