The key to effective digital transformation isn’t analytics, testing, customer journeys, or Voice of Customer. It’s how you blend these elements together in a fundamentally different kind of organization and process. In the DAA Webinar (link coming) I did this past week on Digital Transformation, I used this graphic to drive home that point:

I’ve already highlighted experience engineering and integrated analytics in this little series, and the truth is I wrote a post on constant customer research too. If you haven’t read it, don’t feel bad. Nobody has. I liked it so much I submitted it to the local PR machine to be published and it’s still grinding through that process. I was hoping to get that relatively quickly so I could push the link, but I’ve given up holding my breath. So while I wait for VoC to emerge into the light of day, let’s move on to controlled experimentation.

I’ll start with definitional stuff. By controlled experimentation I do mean testing, but I don’t just mean A/B testing or even MVT as we’ve come to think about it. I want it to be broader. Almost every analytics project is challenged by the complexity of the world. It’s hard to control for all the constantly changing external factors that drive or impact performance in our systems. What looks like a strong and interesting relationship in a statistical analysis is often no more than an artifact produced by external factors that aren’t being considered. Controlled experiments are the best tool there is for addressing those challenges.

In a controlled experiment, the goal is to create a test whereby the likelihood of external factors driving the results is minimized. In A/B testing, for example, random populations of site visitors are served alternative experiences and their subsequent performance is measured. Provided the selection of visitors into each variant of the test is random and there is sufficient volume, A/B tests make it very unlikely that external factors like campaign sourcing or day-parting will impact the test results. How unlikely? Well, taking a random sample doesn’t guarantee a representative sample. You can flip a fair coin fifty times and get fifty heads, so even a sample collected in a fully random manner may come out quite biased; it’s just not very likely. The more times you flip, the more likely your sample will be representative.
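To put rough numbers on "not very likely," here is a minimal Python sketch. The function names, the 30% paid-campaign share, and the visitor counts are all made up for illustration; nothing here comes from a real test. It computes the chance of all heads in a run of fair flips, then simulates how far apart two randomly assigned groups drift on an external factor (the share of visitors from paid campaigns) at different volumes.

```python
import random

def prob_all_heads(flips):
    """Chance that a fair coin lands heads on every one of `flips` tosses."""
    return 0.5 ** flips

def arm_imbalance(n_visitors, paid_share, rng):
    """Randomly split visitors into two arms and return the absolute gap
    between the arms in the share of paid-campaign visitors."""
    paid = [0, 0]
    total = [0, 0]
    for _ in range(n_visitors):
        arm = rng.randrange(2)          # random assignment to arm 0 or 1
        total[arm] += 1
        if rng.random() < paid_share:   # hypothetical external factor
            paid[arm] += 1
    if 0 in total:                      # degenerate split: treat as fully imbalanced
        return 1.0
    return abs(paid[0] / total[0] - paid[1] / total[1])

def average_imbalance(n_visitors, paid_share=0.3, trials=100, seed=42):
    """Average the imbalance over many repeated random splits."""
    rng = random.Random(seed)
    gaps = [arm_imbalance(n_visitors, paid_share, rng) for _ in range(trials)]
    return sum(gaps) / len(gaps)

if __name__ == "__main__":
    # Fifty heads in a row: possible, but on the order of 1 in 10^15.
    print(prob_all_heads(50))
    # The gap between arms shrinks as volume grows.
    print(average_imbalance(100), average_imbalance(5000))
```

The second print is the point: with a hundred visitors the two groups can differ noticeably in campaign mix purely by chance, while with thousands the gap becomes small.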

Controlled experiments aren’t just the domain of website testing though. They are a fundamental part of the scientific method and are used extensively in every kind of research. The goal of a controlled experiment is to remove all the variables in an analysis but one. That makes it really easy to analyze.

In the past, I’ve written extensively on the relationship between analytics and website testing (Kelly Wortham and I did a whole series on the topic). In that series, I focused on testing as we think of it in the digital world – A/B and MV tests and the tools that drive those tests. I don’t want to do that here, because the role for controlled experimentation in the digital enterprise is much broader than website testing. In an omni-channel world, many of the most important questions – and most important experiments – can’t be done using website testing. They require experiments which involve the use, absence or role of an entire channel or the media that drives it. You can’t build those kinds of experiments in your CMS or your testing tool.

I also appreciate that controlled experimentation doesn’t carry with it some of the mental baggage of testing. When we talk testing, people start to think about Optimizely vs. SiteSpect, A/B vs. MVT, landing page optimization and other similar issues. And when people think about A/B tests, they tend to think about things like button colors, image A vs. image B and changing the language in a call-to-action. When it comes to digital transformation, that’s all irrelevant.

It’s not that changing the button colors on your website isn’t a controlled experiment. It is; it’s just not a very important one. It’s also representative of the kind of random “throw stuff at a wall” approach to experimentation that makes so many testing programs nearly useless.

One of the great benefits of controlled experimentation is that, done properly, the idea of learning something useful is baked into the process. When you change the button color on your Website, you’re essentially framing a research question like this:

Hypothesis: Changing the color of Button X on Page Y from Red to Yellow will result in more clicks of the button per page view

An A/B test will indeed answer that question. However, it won’t necessarily answer ANY other question of higher generality. Will changing the color of any other button on any other page result in more clicks? That’s not part of the test.
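For what it's worth, here is roughly what answering that narrow question looks like. This is a sketch of a standard two-proportion z-test, which is one common way to compare click rates and not necessarily what any particular testing tool does; the function name and the click counts are hypothetical.

```python
import math

def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Compare clicks-per-view between variant A and variant B.

    Returns (z, two_sided_p_value) using the pooled-variance z-test.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled click rate under the null hypothesis of no difference
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

if __name__ == "__main__":
    # Made-up numbers: red button (A) vs. yellow button (B)
    z, p = two_proportion_z_test(clicks_a=200, views_a=10000,
                                 clicks_b=260, views_b=10000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

Note what the result covers: Button X on Page Y, and nothing else.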

Even with something as inane as button colors, thinking in terms of a controlled experiment can help. A designer might generalize this hypothesis to something that’s a little more interesting. For example, the hypothesis might be:

Hypothesis: Given our standard color palette, changing a call-to-action on the page to a higher contrast color will result in more clicks per view on the call-to-action

That’s a somewhat more interesting hypothesis and it can be tested with a range of colors with different contrasts. Some of those colors might produce garish or largely unreadable results. Some combinations might work well for click-rates but create negative brand impressions. That, too, can be tested and might perhaps yield a standardized design heuristic for the right level of contrast between the call-to-action and the rest of a page given a particular color palette.

The point is, by casting the test as a controlled experiment we are pushed to generalize the test in terms of some single variable (such as contrast and its impact on behavior). This makes the test a learning experience; something that can be applied to a whole set of cases.

This example could be read as an argument for generalizing isolated tests into generalized controlled experiments. That might be beneficial, but it’s not really ideal. Instead, every decision-maker in the organization should be thinking about controlled experimentation. They should be thinking about it as a way to answer questions analytics can’t AND as a way to assess whether the analytics they have are valid. Controlled experimentation, like analytics, is a tool to be used by the organization when it wants to answer questions. Both are most effective when used in a top-down not a bottom-up fashion.

As the sentence above makes clear, controlled experimentation is something you do, but it’s also a way you can think about analytics – a way to evaluate the data decision-makers already have. I’ve complained endlessly, for example, about how misleading online surveys can be when it comes to things like measuring sitewide NPS. My objection isn’t to the NPS metric, it’s to the lack of control in the sample. Every time you shift your marketing or site functionality, you shift the distribution of visitors to your website. That, in turn, will likely shift your average NPS score – irrespective of any other change or difference. You haven’t gotten better or worse. Your customers don’t like you less or more. You’ve simply sampled a somewhat different population of visitors.

That’s a perfect example of a metric/report which isn’t very controlled. Something outside what you are trying to measure (your customers’ satisfaction or willingness to recommend you) is driving the observed changes.
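The arithmetic behind this is simple enough to sketch. In the Python below, the segment names, the per-segment NPS values, and the visitor shares are invented for illustration. Each segment's NPS is held fixed and only the mix of who shows up changes.

```python
def blended_nps(segment_nps, segment_share):
    """Sitewide NPS as the share-weighted average of per-segment NPS.

    `segment_share` values are assumed to sum to 1.
    """
    return sum(segment_nps[name] * segment_share[name] for name in segment_nps)

# Hypothetical segments: nobody's opinion changes between the two periods.
segment_nps = {"existing_customers": 40, "prospects": 10}

# Before a campaign: mostly existing customers visit the site.
share_before = {"existing_customers": 0.6, "prospects": 0.4}
# After a campaign: the mix tilts toward prospects.
share_after = {"existing_customers": 0.4, "prospects": 0.6}

if __name__ == "__main__":
    print(round(blended_nps(segment_nps, share_before), 1))
    print(round(blended_nps(segment_nps, share_after), 1))
```

The sitewide number drops by several points even though no segment likes you any less. Reporting NPS within a fixed segment, or re-weighting to a fixed mix, is one way to add back some control.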

When decision-makers begin to think in terms of controlled experiments, they have a much better chance of spotting the potential flaws in the analysis and reporting they have, and making more risk-informed decisions. No experiment can ever be perfectly controlled. No analysis can guarantee that outside factors aren’t driving the results. But when decision-makers think about what it would take to create a good experiment, they are much more likely to interpret analysis and reporting correctly.

I’ve framed this in terms of decision-makers, but it’s good advice for analysts too. Many an analyst has missed the mark by failing to control for obvious external drivers in their findings. A huge part of learning to “think like an analyst” is learning to evaluate every analysis in terms of how to best approximate a controlled experiment.

So if controlled experimentation is the best way to make decisions, why not just test everything? Why not, indeed? Controlled experimentation is tremendously underutilized in the enterprise. But having said as much, not every problem is amenable to or worth experimenting on. Sometimes, building a controlled experiment is very expensive compared to an analysis; sometimes it’s not. With an A/B testing tool, it’s often easier to deploy a simple test than try to conduct an analysis of a customer preference. But if you have a hypothesis that involves re-designing the entire website, building all that creative to run a true controlled experiment isn’t going to be cheap, fast or easy.

Media mix analysis is another example of how analysis/experimentation trade-offs come into play. If you do a lot of local advertising, then controlled experimentation is far more effective than mix modeling to determine the impact of media and to tune for the optimum channel blend. But if much of your media buy is national, then it’s pretty much impossible to create a fully controlled experiment that will allow you to test mix hypotheses. So for some kinds of marketing organizations, controlled experimentation is the best approach to mix decisions; for others, mix modeling (analysis in other words – though often supplemented by targeted experimentation) is the best approach.
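For the local-advertising case, a controlled experiment might look something like the sketch below: randomly pick which markets get the media change, then compare how sales moved in those markets against how they moved in the rest. The comparison used here is a simple difference-in-differences, which is my choice for illustration and only one of several reasonable designs. The market names, sales figures, and function names are all hypothetical.

```python
import random
from statistics import mean

def assign_markets(markets, seed=0):
    """Randomly split local markets into a treated half and a control half."""
    rng = random.Random(seed)
    shuffled = list(markets)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def diff_in_diff(before, after, treated, control):
    """Average sales change in treated markets minus the average change in
    control markets. The control change stands in for whatever would have
    happened anyway (seasonality, national media, and so on)."""
    treated_change = mean(after[m] - before[m] for m in treated)
    control_change = mean(after[m] - before[m] for m in control)
    return treated_change - control_change

if __name__ == "__main__":
    markets = ["Market A", "Market B", "Market C",
               "Market D", "Market E", "Market F"]
    treated, control = assign_markets(markets, seed=1)
    # Made-up weekly sales: every market grows by 10, treated markets by 5 more.
    before = {m: 100 for m in markets}
    after = {m: 110 + (5 if m in treated else 0) for m in markets}
    print(diff_in_diff(before, after, treated, control))
```

With a national buy there is no untreated set of markets to compare against, which is exactly why this design stops working there.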

This may all seem pretty theoretical, so I’ll boil it down to some specific recommendations for the enterprise:

Repurpose your A/B testing group as a controlled experimentation capability

Blend non-digital analytics resources into that group to make sure you aren’t thinking too narrowly – don’t just have a bunch of people who think in terms of A/B testing tools

Integrate controlled experimentation with analytics – they are two sides of the same coin and you need a single group that can decide which is appropriate for a given problem

Create constant feedback loops in the organization so that decision-makers can request new survey questions, new analysis and new experiments at the same time and with the same group

I see lots of organizations that think they are doing a great job testing. Mostly they aren’t even close. You’re doing a great job testing when every decision maker at every level in the organization is thinking about whether a controlled experiment is possible when they have to make a significant decision. When those same decision-makers know how to interpret the data they have in terms of its ability to approximate a controlled experiment. And when building controlled experiments is deeply integrated into the analytics research team and deployed across digital and omni-channel problems.

People have struggled with this (big) data provider model but Factual feels like it’s found a real (and valuable) niche. Would love to see more of this grow since external data is a huge miss in most big data systems.

Targeted VoC is a powerful (and totally neglected) tool for personalization. Facebook’s experience is entirely relevant to ANY content producer. I don’t know if I can take credit for this, but I suggested this to folks at Facebook a couple of years back!

An interesting discussion of the problems in identifying “likely” voters and the benefits of behavioral data integration. Food for thought in the enterprise world as well where the equivalent is often possible but rarely done.