Kashmir Hill is the editor of Fusion's Real Future. She has hacked a stranger's smart home, lived on Bitcoin & paid a surprise visit to the NSA's Utah datacenter, all while trying to prove privacy...

On Thursday morning, I listened to an interview with the CEO of “a big data intelligence company” called Dstillery; it “demystifies consumers’ online footprints” to target them with ads. The CEO told public radio program Marketplace something astounding: his company had sucked up the mobile device IDs from the phones of Iowa caucus-goers to match them with their online profiles.

Via Marketplace:

“We watched each of the caucus locations for each party and we collected mobile device IDs,” Dstillery CEO Tom Phillips said. “It’s a combination of data from the phone and data from other digital devices.”

Dstillery found some interesting things about voters. For one, people who loved to grill or work on their lawns overwhelmingly voted for Trump in Iowa, according to Phillips.

When I heard this, I wondered how the company was doing it. Did they have employees at all the caucus locations holding phone-sniffing devices? The idea that phone-toting people could walk up to vote and immediately have their real world identities matched with a profile based on their digital trail would indeed be, as Marketplace headlined its piece, a “new frontier in voter tracking.”

But that’s not how it works. The pairing of caucus-goers with their online footprints was more roundabout than that, explained a Dstillery spokesperson by email.

What really happened is that Dstillery gets information from people’s phones via ad networks. When you open an app or load a page in a browser, a very fast auction takes place in which advertisers bid for the chance to show you an ad. Each bid is based on how valuable the advertiser thinks you are. To decide that, bidders receive information about you from your phone, including, in many cases, an identifying code (around which they’ve built a profile) and your location, down to your latitude and longitude.
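To make the auction concrete, here is a minimal sketch in Python. It is an illustration only: the field names, the bidder functions, and the highest-bid-wins rule are assumptions made for this example, not details Dstillery or any ad exchange confirmed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BidRequest:
    # Hypothetical fields; real ad-exchange messages carry many more.
    device_id: str          # the identifying code a profile is built around
    app: str                # the app or page requesting the ad
    lat: Optional[float]    # present only if location access was granted
    lon: Optional[float]


# A bidder looks at the request and names a price (0 means "no bid").
Bidder = Callable[[BidRequest], float]


def run_auction(request: BidRequest, bidders: dict[str, Bidder]) -> Optional[str]:
    """Return the name of the highest bidder, or None if nobody bids."""
    bids = {name: bid(request) for name, bid in bidders.items()}
    name, price = max(bids.items(), key=lambda kv: kv[1], default=(None, 0.0))
    return name if price > 0 else None
```

The point of the sketch is that every bidder sees the request, including the device ID and any coordinates, whether or not it wins the right to show the ad.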

Yes, for the vast majority of people, ad networks collect far more information about them than the NSA does, but they don’t explicitly link it to their names.

So on the night of the Iowa caucus, Dstillery flagged all the auctions that took place on phones at latitudes and longitudes near caucus locations. It wound up spotting 16,000 devices on caucus night, because those people had granted location privileges to the apps or devices that served them ads. It captured those mobile IDs and then looked up the characteristics associated with them in order to make observations about the kinds of people who went to Republican caucus locations (young parents) versus Democratic caucus locations. It drilled down further (e.g., ‘people who like NASCAR voted for Trump and Clinton’) by looking at which candidate won at a particular caucus location.
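The lookup-and-compare step described here can be sketched roughly as follows. The function name, the profile dictionary, and the device IDs are hypothetical; the sketch assumes each captured ID maps to a list of audience labels, and it simply counts how common each label is within a group of devices.

```python
from collections import Counter


def audience_shares(device_ids, profiles):
    """Return the fraction of devices in a group that carry each audience label.

    device_ids: IDs captured at one set of caucus locations (hypothetical input).
    profiles:   mapping of device ID -> iterable of audience labels (hypothetical).
    """
    device_ids = list(device_ids)
    counts = Counter()
    for device in device_ids:
        # Count each label once per device, even if it is listed twice.
        counts.update(set(profiles.get(device, ())))
    total = len(device_ids)
    return {label: n / total for label, n in counts.items()} if total else {}
```

Running this once on the devices seen at Republican sites and once on those seen at Democratic sites, then comparing the two results, would surface labels that are more common in one group than the other. Nothing here identifies how any individual voted; the comparison only works at the level of locations.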

Because I think this is a fascinating look into how online and offline tracking can be combined, here’s the full Q&A:

Fusion: How did Dstillery gather mobile IDs from phones in Iowa?

Dstillery: For most ads you see on web browsers and mobile devices, there is an auction among various programmatic advertising firms for the chance to show you an ad. We are one of those buyers, and we are sent a variety of anonymous data, including what kind of phone you have, what app you are using, what operating system version you’re running, and sometimes – crucially for this study – your latitude and longitude (lat/long).

We identified the caucusing locations prior [to] the Iowa caucus and told our system to be on the lookout for devices that report a lat/long at those locations during the caucus.

So when we received an ad bid request that our system recognized as being at one of the caucus sites, our system flagged that request and captured that device ID so we could use it for this.

This is roughly equivalent to exit polling for the smart phone age.
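The flagging step in the answer above amounts to a geofence check on incoming bid requests. Here is a minimal sketch, assuming each request is a dictionary with hypothetical `device_id`, `lat`, `lon` and `time` keys, and assuming a 100-meter radius around each site. Dstillery did not say how close a device had to be or how it measured distance; the haversine formula is a standard choice used here for illustration.

```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def flag_devices(bid_requests, caucus_sites, start, end, radius_m=100.0):
    """Map each site name to the device IDs seen within radius_m during [start, end]."""
    flagged = {name: set() for name in caucus_sites}
    for req in bid_requests:
        if req.get("lat") is None or req.get("lon") is None:
            continue  # no location in this request, so nothing to match
        if not (start <= req["time"] <= end):
            continue  # outside the caucus window
        for name, (site_lat, site_lon) in caucus_sites.items():
            if haversine_m(req["lat"], req["lon"], site_lat, site_lon) <= radius_m:
                flagged[name].add(req["device_id"])
    return flagged
```

Requests without coordinates are skipped entirely, which matches the article's point that only devices whose owners had granted location access could be spotted.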

Fusion: How many caucus locations did it gather from?

Dstillery: We gathered data from across ~90% of the caucus sites for both parties.

Fusion: How many mobile IDs was it able to match to its database?

Dstillery: We identified about 16,000 devices at the various caucus sites.

Dstillery: We use the anonymized advertising IDs provided by the devices themselves as identifiers in our system. Generally speaking this means Android’s Advertising ID or iOS’s IDFA.

Fusion: What’s the range of information associated with a mobile ID?

Dstillery: The data we receive from those auction messages is fairly limited. To build out that rich information set that you are referring to (we call them ‘Crafted Audiences’), we need to see a device several times across many different sites. We then use some pretty sophisticated machine learning techniques to extrapolate behaviors. We can only do this because we see such a broad view of digital behavior. In other words we know that seeing you on sites A, B and C means that you are likely a New Mom, but seeing you on A, D and E means that you are Health Conscious.

We have hundreds of crafted audiences – including credit checkers, wrestling fans, new movers, CEOs and even things like DIYers and cigar aficionados. And we generate more crafted audiences all the time.

One thing that isn’t in the data is personally identifiable information. The data and system are completely anonymous. We have no idea, for example, what your name is. All we see are behaviors and everything we do is based on analyzing those behaviors writ large.
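The “sites A, B and C” example in the answer above can be illustrated with a toy version. To be clear, this is plain set matching, not the machine learning Dstillery describes; the site letters are placeholders from the quote, and the function name and coverage threshold are assumptions made for this sketch.

```python
# Hypothetical site "signatures"; the labels come from the quote above,
# the site letters are placeholders.
AUDIENCE_SITES = {
    "New Mom": {"A", "B", "C"},
    "Health Conscious": {"A", "D", "E"},
}


def likely_audiences(visited_sites, min_coverage=1.0):
    """Return audiences whose signature sites are covered by the visits.

    min_coverage is the fraction of an audience's signature sites that the
    device must have been seen on (1.0 means all of them).
    """
    visited = set(visited_sites)
    matches = []
    for label, signature in AUDIENCE_SITES.items():
        coverage = len(signature & visited) / len(signature)
        if coverage >= min_coverage:
            matches.append(label)
    return sorted(matches)
```

Note that site A alone says nothing here, since it appears in both signatures. It is the combination of sites that separates one audience from another, which is why a device has to be seen several times before it can be labeled.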

Fusion: Does Dstillery do its real world association of mobile IDs with consumer attributes in other settings or was this a one-off?

Dstillery: This application is an extension of what we do every day in our core business. Our entire mission as a company is to find the right consumer at the right time with the right message. We had to do some special setup and analysis due to the caucus dynamics, but this sort of experiment – seeing things in the data that no one else has before – is our bread and butter.