How a Top Ad Firm Stopped Fearing Big Data and Learned to Love Analytics

Alex Woodie

Competition in the $500-billion global advertising industry is brutal, and ad firms that fail to deliver are quickly shown the door by their clients. But in this age of big data analytics, delivering results is not enough, as clients increasingly demand to know exactly how their media buys boosted demand.

As the sixth largest media buying and marketing firm in the world, Havas Media has created and executed print, online, display, and TV advertising strategies for some of the world’s biggest companies, such as Coca-Cola, McDonald’s, Merck, Hyundai, Panasonic, and Fidelity. Hundreds of brands depend on the firm’s recommendations about what to advertise, where, when, and to whom.

For UK-based Havas, success depends not only on having an ad man’s sharp eye for consumer trends and excellent timing, but also on having a data scientist’s ability to spot patterns buried across multiple data sets. The entire advertising industry has become increasingly data-driven, and Havas has invested millions in building a state-of-the-art data analytics division to help inform its clients and other Havas divisions.

The heart of Havas’ data science setup is a 600TB warehouse based on the Greenplum column-oriented analytical database from Pivotal. Havas uses a custom-built, Java-based ETL system to continually transform and load data from dozens of sources into this warehouse, where advanced analytics generate precise recommendations for clients based on models honed over the firm’s 15-year existence.

Havas Media’s EVP of data platforms Sylvain Le Borgne is responsible for ensuring that the Greenplum warehouse is ready to crunch data for a decentralized team of 500 analysts located around the world. A big part of that job is keeping the data flowing from dozens of data sources, ranging from Google’s DoubleClick and Adobe to Twitter and Facebook.

“You name it–anything that generates data, we collect it and centralize it, store it, match it, and make it available to our analytics teams so they can do magic and wonders with it,” Le Borgne tells Datanami. “Most of our data comes from media buying, but we complement it with other data because we need to better understand consumer behavior.”

The firm’s data analytics operation has evolved tremendously over the past several years. Previously, the company employed a sizable number of mid-level SQL programmers to prepare the data and a handful of SAS experts to perform statistical analysis on it. That started to change about four years ago when the company adopted a shrink-wrapped analytics package from Alpine Data Labs.

According to Le Borgne, Alpine enabled the company to do much more analytical work on behalf of Havas’ clients at a lower cost than before. “It’s completely changed our business and is empowering our users,” he says. “Before, we had a team in the middle that would ask the analysts for specifications, that would do all the SQL code for them and run everything. Now with Alpine we’re able to let analysts run whatever they want on the database without risking having them take the database down. All the data preparation is done centrally by our own ETL process. The data is ready to be used and after that we just let them do advanced analytics on top of it, building models and running algorithms.”

Havas is using Alpine to build attribution models that allow it to determine what boosts the effectiveness of its clients’ advertising and what does not. It also uses Alpine to calculate total lifetime value for prospects, and to run segmentation analysis that determines which groups of consumers its clients should target, which groups they should stay away from, and what kind of marketing message will be most effective.
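The article does not describe how Havas’ attribution models work, so the following is only a minimal sketch of the general idea, not the firm’s method or Alpine’s API. It compares two textbook rules, linear attribution and last-touch attribution, on a handful of made-up customer journeys. The channel names, the `paths` data, and both function names are hypothetical.

```python
from collections import defaultdict


def linear_attribution(paths):
    """Split each conversion's credit equally across the channels in its path."""
    credit = defaultdict(float)
    for channels, converted in paths:
        if not converted or not channels:
            continue  # journeys that did not convert earn no credit
        share = 1.0 / len(channels)
        for channel in channels:
            credit[channel] += share
    return dict(credit)


def last_touch_attribution(paths):
    """Give each conversion's full credit to the final channel in its path."""
    credit = defaultdict(float)
    for channels, converted in paths:
        if converted and channels:
            credit[channels[-1]] += 1.0
    return dict(credit)


# Hypothetical journeys: (ordered channel touches, did the customer convert?)
paths = [
    (["display", "search", "social"], True),
    (["search"], True),
    (["display", "social"], False),
    (["tv", "search"], True),
]

linear = linear_attribution(paths)
last_touch = last_touch_attribution(paths)

for channel in sorted(linear):
    print(f"{channel:8s} linear={linear[channel]:.2f} "
          f"last_touch={last_touch.get(channel, 0.0):.2f}")
```

In this toy data, display and TV receive partial credit under the linear rule but none under last-touch, which illustrates why the choice of attribution model can change which media buys appear to be working.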

Since adopting Alpine, the number of Havas employees building attribution models and doing advanced analytics on behalf of clients has grown by several orders of magnitude. Instead of relying on a handful of SAS gurus to build and maintain the models, Havas lets the software automate much of that work, leaving the team of 500 analysts to crank out the results in Alpine’s drag-and-drop environment.

In addition to boosting big data productivity, Alpine has helped to build the confidence that clients have in the analysts and their analytics. You could almost say that it has created a new culture of openness when it comes to asking questions of big data.

“The data analysts can talk data with our clients, and they’re not afraid to look for answers in data when our clients are asking them questions,” Le Borgne says. “Now we know we can have very strategic thinking with our clients and we back it up with data. We’re not afraid to build models and do scenarios with them.”

“We’d love our clients to blindly trust us,” he continues. “But that’s not the game anymore. When it comes to data, they want to feel reinsured about what we’re doing. To trust the result, they want to see what goes into the data. They know that we can make the data talk the way we want. So by being able to show them the detailed process we have on the data, and by giving them the freedom to ask us to make any changes live, we build trust.”

The reusable nature of analytic components in the Alpine product is also heightening collaboration among the company’s data analysts, and allowing Havas managers to sleep better at night. “Alpine is bringing them together, pushing collaboration, and allowing them to learn best practices, and sometimes–although it’s not enough for my own taste–it means reusing processes that have been built elsewhere,” Le Borgne says. “The fact we know that, if tomorrow somebody on an analytics team is leaving the company, that we still retain everything he’s been building, all the processes he’s been using for clients, for us is priceless.”

Alpine Data Labs was spun out of Greenplum at about the same time Havas was first adopting Alpine’s software. The software is a good match for the Greenplum database, but it can also run on Hadoop and on other massively parallel databases, such as Teradata’s.

The fact that Alpine runs on Hadoop and is being adapted to run on Spark for more real-time analysis is something that Le Borgne is watching. “Hadoop is too batch for us for now, but we’re looking at it and it will definitely be an option when we move to more unstructured data than we are currently using,” he says. “If we start to find a way to leverage Twitter feeds, for example, we’ll need a different solution than Greenplum, and Hadoop will be a very interesting one. For us having a layer like Alpine on top of it…it’s keeping our options open, which we like.”