In a series of articles last year, executives from the ad-data firms BlueKai, eXelate and Rocket Fuel debated whether the future of online advertising lies with “More Data” or “Better Algorithms.” Omar Tawakol of BlueKai argues that more data wins because you can drive more effective marketing by layering additional data onto an audience. While we agree with this, we can’t help feeling like we’re being presented with a false choice.

Maybe we should think about a solution that involves smaller amounts of higher quality data instead of more data or better algorithms.

First, it’s important to understand what data is feeding the marketing ecosystem and how it’s getting there. Most third-party profiles consist of data points inferred from the content you consume, forms you fill out and stuff you engage with online. Some companies match data from offline databases with your online identity, and others link your activity across devices. Lots of energy is spent putting trackers on every single touchpoint. And yet the result isn’t very accurate — we like to make jokes around the office about whether one of our colleagues’ profiles says they’re a man or a woman that day. Truth be told, on most days BlueKai thinks they are both.

One way to increase the quality of data would be to change where we get it from.

Instead of scraping as many touchpoints as possible, we could go straight to the source: The individual. Imagine the power of data from across an individual’s entire digital experience — from search to social to purchase, across devices. This kind of data will make all aspects of online advertising more efficient: True attribution, retargeting-type performance for audience targeting, purchase data, customized experiences.

So maybe the solution to “More Data” vs. “Better Algorithms” isn’t incremental improvements to either, but rather to invite consumers to the conversation and capture a fundamentally better data set. Getting this new type of data to the market won’t be easy. Four main hurdles need to be cleared for the market to reach scale.

Control and Comfort

When consumers say they want “privacy,” they don’t normally desire the insular nature of total anonymity. Rather, they want control over what is shared and with whom. Any solution will need to give consumers complete, transparent control over their profiles. Comfort is gained when consumers become aware of the information that advertisers are interested in; in most cases, the data is extremely innocuous. A recent PwC survey found that 80 percent of people are willing to share “information if a company asks up front and clearly states use.”

Remuneration

Control and Comfort are both necessary, but people really want to share in the value created by their data. Smart businesses will offer things like access to content, free shipping, coupons, interest rate discounts or even loyalty points to incentivize consumers to transact using data. It’s not much of a stretch to think that consumers who feel fairly compensated will upload even more data into the marketing cloud.

Trust and Transparency

True transparency around what data is gathered and what happens to it engenders trust. Individuals should have the final say about which of their data is sold. Businesses will need to adopt best practices and tools that allow the individual to see and understand what is happening with their data. A simple dashboard with delete functionality should do, for a start.

Ease of Use

This will all be moot if we make it hard for consumers to participate. Whatever system we ask them to adopt needs to be dead simple to use, and offer enough benefits for them to take the time and effort to switch. Here we can apply one of my favorite principles from Ruby on Rails — convention over configuration. There is so much value in data collected directly from individuals that we can build a system whose convention is to protect even the least sensitive of data points and still respect privacy, without requiring the complexity needed for configuration.

The companies that engage individuals around how their data is collected and used will have an unfair advantage over those that don’t. Their advertising will be more relevant, and they’ll be able to customize experiences and measure impact with a level of precision impossible via third-party data. To top it off, by being open and honest with their consumers about data, they’ll have strengthened that intangible quality that every brand strives for: Authenticity.

In the bigger picture, the advertising industry faces an exciting opportunity. By treating people and their data with respect and involving them in the conversation around how their data is used, we help other industries gain access to data by helping individuals feel good about transacting with it. From healthcare to education to transportation, society stands to gain if people see data as an opportunity and not a threat.

Marc is the co-founder and CEO of Enliken, a startup focused on helping businesses and consumers transact with data. Currently, it offers tools for publishers and readers to exchange data for access to content. Prior to Enliken, Marc was the founding CEO of Spongecell, an interactive advertising platform that produced one of the first ad units to run on biddable media.

Recently, Omar Tawakol from BlueKai wrote a fascinating article positing that more data beats better algorithms. He argued that more data trumps a better algorithm, but better still is having an algorithm that augments your data with linkages and connections, in the end creating a more robust data asset.

At Rocket Fuel, we’re big believers in the power of algorithms. This is because data, no matter how rich or augmented, is still a mostly static representation of customer interest and intent. Using data in the traditional way for Web advertising, that is, choosing whom to show ads to on the basis of the specific data segments they may be in, is itself one very simple choice of algorithm. But there are many others that can be strategically applied to take advantage of specific opportunities in the market, like a sudden burst of relevant ad inventory or a sudden increase in competition for consumers in a particular data segment. The algorithms can react to the changing usefulness of data, such as data that indicates interest in a specific time-sensitive event that is now past. They can also take advantage of ephemeral data not tied to individual behavior in any long-term way, such as the time of day or the context in which the person is browsing.

So while the world of data is rich, and algorithms can extend those data assets even further, the use of that data can be even more interesting and challenging, requiring extremely clever algorithms that result in significant, measurable improvements in campaign performance. Very few of these performance improvements are attributable solely to the use of more data.

For the sake of illustration, imagine you want to marry someone who will help you produce tall, healthy children. You are sequentially presented with suitors whom you have to either marry, or reject forever. Let’s say you start with only being able to look at the suitor’s height, and your simple algorithm is to “marry the first person who is over six feet tall.” How can we improve on these results? Using the “more data” strategy, we could also look at how strong they are, and set a threshold for that. Alternatively, we could use the same data but improve the algorithm: “Measure the height of the first third of the people I see, and marry the next person who is taller than all of them.” This algorithm improvement has a good chance of delivering a better result than just using more data with a simple algorithm.
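To make the comparison concrete, here is a minimal Python sketch of the two height-only rules described above. Several details are assumptions added for illustration and are not part of the original example: heights are in inches and drawn from a normal distribution (mean 69, standard deviation 3), "over six feet" is a threshold of 72, a chooser who rejects everyone must settle for the last suitor, and "success" means ending up with the tallest suitor in the sequence. The function names are hypothetical.

```python
import random


def marry_first_over(heights, threshold=72):
    """Simple rule: accept the first suitor taller than a fixed threshold."""
    for h in heights:
        if h > threshold:
            return h
    # Assumption: if nobody qualifies, we are stuck with the last suitor.
    return heights[-1]


def look_then_leap(heights):
    """Observe the first third, then accept the next suitor taller than all of them."""
    cutoff = len(heights) // 3
    best_seen = max(heights[:cutoff], default=float("-inf"))
    for h in heights[cutoff:]:
        if h > best_seen:
            return h
    # Assumption: if nobody beats the observed group, take the last suitor.
    return heights[-1]


def success_rate(strategy, trials=2000, n=30, seed=0):
    """Fraction of simulated sequences in which the strategy picks the tallest suitor."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # Assumed height distribution, in inches.
        heights = [rng.gauss(69, 3) for _ in range(n)]
        if strategy(heights) == max(heights):
            wins += 1
    return wins / trials


if __name__ == "__main__":
    print("first over six feet:", success_rate(marry_first_over))
    print("look, then leap:   ", success_rate(look_then_leap))
```

Both rules see exactly the same data, one height per suitor, so any difference in the printed rates comes from the decision rule alone. The outcome depends on the assumed distribution and on how success is defined, so treat the numbers as an illustration of the argument rather than a general result.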

Choosing opportunities to show online advertising to consumers is very much like that example, except that we’re picking millions of “suitors” each day for each advertiser, out of tens of billions of opportunities. As with the marriage challenge, we find it is most valuable to make improvements to the algorithms to help us make real-time decisions that grow increasingly optimal with each campaign.

There’s yet another dimension not covered in Omar’s article: the speed of the algorithms and data access, and the capacity of the infrastructure on which they run. The provider you work with needs to be able to make more decisions, faster, than any other players in this space. Doing that calls for a huge investment in hardware and software improvements at all layers of the stack. These investments are in some ways orthogonal to Omar’s original question: they simultaneously help optimize the performance of the algorithms, and they ensure the ability to store and process massive amounts of data.

In short, if I were told I had to either give up all the third-party data I might use, or give up my use of algorithms, I would give up the data in a heartbeat. There is plenty of relevant data captured through the passive activity of consumers interacting with Web advertising — more than enough to drive great performance for the vast majority of clients.