Acxiom's Inaccurate Data and Why It's So Useful

The oldest and largest marketing data aggregator, Acxiom, recently opened up its database to let individuals see and edit the data that the company has collected about them. Data scientists and marketers have been viewing the data and, often, chuckling at its inaccuracy. Acxiom admits that up to 30% of its data is wrong, and yet it's the industry leader. How can Acxiom make so much money with data that's only 70% accurate? Simple: Most marketers' data is worse, and in the land of the blind, the one-eyed man is king.

Over 100 years ago, William Lever in England (the soap tycoon whose name still appears on Unilever bar soap) and John Wanamaker in the U.S. (Philadelphia department store merchant and advertising pioneer) said that half their advertising wasn't working, but they didn't know which half. They were driving blind. Data aggregation and campaign testing methods from companies like Acxiom started to change all that 40 years ago. There are three technical reasons that 70% accuracy is good enough to tell what's working: data redundancy, diminishing returns, and compound results.

A great deal of consumer marketing data is redundant, or highly correlated, with other consumer data, so if one attribute is inaccurate then data scientists can use other attributes that essentially contain the same information. Redundancy is due both to the way we collect and organize consumer data and to the fact that members of a group tend to behave similarly. A trivial example is the attributes “presence of children” and “number of children”: presence of children being true is highly correlated with number of children being above zero. Another example is household income, home value, and ZIP Code. Within the same ZIP Code, household income is correlated with home value because people who live in the same area tend to behave similarly, including spending about the same portion of their income on homes. Acxiom's new website AboutTheData.com exposes 125 consumer attributes, but Acxiom, Experian, BlueKai, NeuStar, and others track more than 2,500 attributes of every consumer. There's a lot of redundancy when dealing with 2,500 attributes. Data scientists—and the massive computing power available to them today—are good at exploiting that redundancy to avoid inaccurate data.
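
To make the redundancy argument concrete, here is a minimal sketch of how several noisy attributes that carry the same underlying information can be combined into a more reliable signal. It assumes each attribute is right 70% of the time and, as a strong simplification, that the attributes' errors are independent of one another; real consumer attributes are correlated in messier ways, so treat the numbers as an illustration rather than a description of any vendor's data. The function name and the choice of a simple majority vote are assumptions made for this example.

```python
from math import comb


def majority_vote_accuracy(n_attributes: int, per_attribute_accuracy: float) -> float:
    """Probability that a majority of n noisy yes/no attributes is correct.

    Assumes every attribute is correct with the same probability and that
    errors are independent (a simplification made for illustration).
    """
    if n_attributes % 2 == 0:
        raise ValueError("use an odd number of attributes to avoid ties")
    p = per_attribute_accuracy
    needed = n_attributes // 2 + 1  # smallest count that forms a majority
    return sum(
        comb(n_attributes, k) * p**k * (1 - p) ** (n_attributes - k)
        for k in range(needed, n_attributes + 1)
    )


if __name__ == "__main__":
    # Compare one 70%-accurate attribute with a vote across several of them.
    for n in (1, 3, 5, 9):
        print(f"{n} attribute(s): {majority_vote_accuracy(n, 0.70):.3f}")
```

Under these assumptions, three redundant attributes at 70% accuracy each already yield a majority vote that is right about 78% of the time, and the figure keeps rising as more attributes are added.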

Additionally, there are rapidly diminishing returns for data accuracy, and 70% accuracy has good ROI for many marketing applications. Seventy percent might not be good enough for trans-Atlantic flight navigation or accounts receivable, but it's fine for marketing. Given any input data, even random noise, marketers can create a model to predict which consumers will respond to a campaign. More accurate models are better, of course, since they allow marketers to efficiently move resources between creative development and campaign delivery. For example, if a model is highly accurate, then we can spend more on the creative and less on the delivery because we don't need to reach as many consumers to meet our business goals. But more accurate models require better data and more data scientists, both of which are costly.

Yet even inaccurate models can be profitable. Direct mail campaigns often have a 1% response rate which means that they're wrong 99% of the time. Many retail websites have just a 2% conversion rate and are considered successful. The eminent statistician George Box wrote that “essentially, all models are wrong, but some are useful.” A 1% response rate can be very useful if you manage your creative and delivery costs to fit that, and don't overspend on data and modeling.
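
The arithmetic behind a profitable 1% response rate can be sketched as follows. Every figure here (cost per mail piece, margin per response, number of pieces) is a hypothetical chosen for illustration, not a number from Acxiom or any real campaign; the point is only that profit depends on whether the response rate clears the break-even rate set by costs.

```python
def break_even_response_rate(cost_per_piece: float, margin_per_response: float) -> float:
    """Response rate at which margin from responders exactly covers delivery cost."""
    return cost_per_piece / margin_per_response


def campaign_profit(
    pieces: int,
    response_rate: float,
    cost_per_piece: float,
    margin_per_response: float,
) -> float:
    """Margin earned from responders minus the cost of reaching everyone."""
    responses = pieces * response_rate
    return responses * margin_per_response - pieces * cost_per_piece


# Hypothetical campaign parameters (assumptions for illustration only).
COST_PER_PIECE = 0.50        # dollars to print and mail one piece
MARGIN_PER_RESPONSE = 80.00  # dollars of margin from one responding consumer
PIECES = 100_000

if __name__ == "__main__":
    rate = break_even_response_rate(COST_PER_PIECE, MARGIN_PER_RESPONSE)
    print(f"break-even response rate: {rate:.3%}")
    for response_rate in (0.005, 0.01, 0.02):
        profit = campaign_profit(
            PIECES, response_rate, COST_PER_PIECE, MARGIN_PER_RESPONSE
        )
        print(f"response rate {response_rate:.1%}: profit ${profit:,.0f}")
```

With these made-up costs, the campaign loses money at a 0.5% response rate and makes money at 1%, even though the mailing "misses" 99% of recipients. Spending more on data and modeling only pays off if the lift in response rate outweighs the added cost.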

Last, 70% data accuracy is good enough because marketing isn't a single-campaign game. Marketers use their data to develop a campaign, execute it, learn from it, and repeat. If a marketer's data and lessons learned are even slightly better than their competitors', then they acquire new customers and gain more private first-party data on those customers' behavior, which they can then leverage in their next campaigns. That is, even a slight advantage compounds over time. Consider Capital One. The financial services firm started in the late 1980s with the same consumer data that everyone else had, but outperformed its competitors using better modeling, better experiments, and more experiments, reportedly as many as 30,000 experiments per year.
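
A rough sketch of how a slight edge compounds across campaign cycles: two marketers start with the same customer base, and one grows it a little faster per cycle. The growth rates, starting size, and the constant-rate model itself are assumptions made for illustration; real growth is lumpier and nothing here is drawn from Capital One's results.

```python
def customers_after(cycles: int, start: float, growth_per_cycle: float) -> float:
    """Customer base after a number of campaign cycles at a constant growth rate."""
    return start * (1 + growth_per_cycle) ** cycles


# Hypothetical inputs (assumptions for illustration only).
START_CUSTOMERS = 100_000
OUR_GROWTH = 0.06    # per-cycle growth with slightly better data and lessons
THEIR_GROWTH = 0.05  # per-cycle growth for a competitor

if __name__ == "__main__":
    for cycles in (1, 5, 10, 20):
        ours = customers_after(cycles, START_CUSTOMERS, OUR_GROWTH)
        theirs = customers_after(cycles, START_CUSTOMERS, THEIR_GROWTH)
        print(
            f"after {cycles:2d} cycles: {ours:,.0f} vs {theirs:,.0f} "
            f"({ours / theirs:.2f}x)"
        )
```

A one-point difference in per-cycle growth is barely visible after one campaign, but under this simple model it widens to roughly a 20% larger customer base after 20 cycles.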

So go to Acxiom's AboutTheData.com website and check out the data that they have on you. If it says that you're into gourmet cooking when actually you haven't so much as boiled water in months, then have a good laugh. But in aggregate that data has been very useful for a very long time. And, oh yeah, Acxiom is coming online in a big way. Eighteen months ago Acxiom hired Phil Mui as its chief product officer. Mui is wicked smart, with a Ph.D. from MIT and a master's from Oxford, and was the product manager for Google Analytics for six years. Looks like Acxiom's data might be useful for another 40 years.