Are those creepy web ads that learn your preferences and follow you around online also discriminatory?

Floodwatch, a new tool from the Office for Creative Research, is hoping it can collect enough data from users to help researchers answer questions around just how users are being targeted by ads online.

One of my new year’s resolutions has been to be better about personal finance, so I’ve been browsing a lot of Roth IRA explainers and looking into automated investing services (yay!). Now ad-supported websites I visit are consistently framed by a whole lot of never-seen-before ads concerning retirement, mutual funds, and brokerage services (eep!).

To all those banner ads that follow me around online, clearly sensing my needs and wants, my response has largely been, “Wow, kind of creepy. ¯\_(ツ)_/¯,” a complacency I suspect many others share. But shouldn’t the amount of demographic information about me carried in these web ads be cause for alarm, or at least a little more watchfulness?

Floodwatch is a new project that helps track and categorize all the ads any Floodwatch user sees while browsing the Internet, with the goal of increasing users’ vigilance for advertising practices and also collecting a ton of anonymized data for approved researchers to scrutinize. It also lets users visualize the breakdown of ad categories they see across their Internet browsing history and allows users to compare their ad profiles with other groups of Floodwatch users. (You can create your own filters for comparison — e.g., you, compared to all Christian female users.) Users can choose to provide a little more demographic information about themselves, too — additional data that helps potential researchers identify patterns of discriminatory practices in how ads are served, should such patterns exist.
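The comparison described above boils down to simple arithmetic: work out what share of a user's ads falls into each category, do the same for a comparison group, and look at the difference. The sketch below is only an illustration of that idea, not Floodwatch's actual code. The function names, category labels, and sample data are all hypothetical.

```python
from collections import Counter


def category_shares(ad_categories):
    """Return each category's share of the ads seen (fractions summing to 1)."""
    counts = Counter(ad_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: n / total for cat, n in counts.items()}


def compare_profiles(mine, group):
    """Per-category difference in share; positive means I see more than the group."""
    my_shares = category_shares(mine)
    group_shares = category_shares(group)
    categories = sorted(set(my_shares) | set(group_shares))
    return {c: my_shares.get(c, 0.0) - group_shares.get(c, 0.0) for c in categories}


# Hypothetical data: one category label per ad seen.
my_ads = ["finance", "finance", "finance", "travel"]
group_ads = ["finance", "travel", "travel", "retail"]

diff = compare_profiles(my_ads, group_ads)
for category, delta in diff.items():
    print(f"{category}: {delta:+.2f}")
```

With so few users in a comparison group, differences like these are easily noise, which is why the small sample sizes matter.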

Here were the categories of ads I saw shortly after embarking on my New Year’s resolution, compared to all Floodwatch users [1]. Caveat: Fewer than 150 users had signed up for the project at that point:

“The end goal of this project, if it goes really well, is to get ammunition for policy change around what web advertisers can and can’t do,” said Jer Thorp, OCR cofounder and a former data artist-in-residence at the now-shuttered New York Times R&D Lab. “In a fantasy world where 10,000 or even 50,000 people are using Floodwatch, we’d have a base of evidence that can be used to demonstrate the practices that are happening, and that can reinforce a real drive towards change. There’s been some great work that’s come out of the FCC about this, but there’s not a lot of data, and where there is data, it tends to be a small subset.”

(Last fall, the Federal Communications Commission ruled that Internet providers like Comcast or Verizon would have to get consumers’ permission to use their browsing and app use history and location data, regulations that throttle these companies’ ability to more precisely target ads to their customers. Facebook and Google aren’t covered by the regulation. Around the same time, ProPublica published a story about how Facebook’s ad portal allowed advertisers to actively exclude what it defined as “ethnic affinities” from seeing a particular ad. The following month, Facebook said it was updating its protocol to prevent such racial exclusion on housing, employment, and credit ads.)

Floodwatch, released quietly in beta this month, is actually version 2.0: OCR built the first iteration of the Chrome extension in 2014 with funding from the Ford Foundation, also intending to “empower” users to track the ads that are tracking them (the latest Floodwatch is funded entirely by OCR itself). The response to Floodwatch 1.0 was encouraging, Thorp said, because it seemed to serve as a model for many other browser extensions. But beyond collecting ads and showing users what they were seeing as they browsed the Internet, it didn’t offer much else.

“The first version of Floodwatch was a success in one fashion, and not a success in another. People are still referencing that extension as a cool way for individuals to do something given all these surveillance systems on the Internet they may feel powerless in the face of,” Thorp said. “As an idea, it was great. But we knew what we really needed was to come up with a way to classify those ads so that individuals could know not only how many ads they were seeing, but also what types of ads they were seeing.”

Version 1.0 of the tool collected a bunch of ads from users, which the team had manually categorized via Amazon’s Mechanical Turk service. The OCR team then worked with machine learning experts to use that data to build an automated ad classifier, according to Chris Anderson, an OCR staffer who worked on the backend of the project.

Manual classification “gave us a pretty granular view into how ads were classified, but a lot of them are difficult to figure out. It’s hard to discern whether a picture of palm trees is a travel ad, or maybe it’s actually a banking ad, or maybe they’re somehow selling a car in it,” Anderson said. “A lot of it was ambiguous and took a lot of human effort to really get to the ground truth.”
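To make the ambiguity problem concrete, here is one common way to turn several workers' labels for the same ad into a single "ground truth" label: take a majority vote and set aside ads where workers disagree too much. This is a sketch under stated assumptions, not OCR's documented method. The article does not say how labels were reconciled, and the function name, the 60 percent agreement threshold, and the sample ads are hypothetical.

```python
from collections import Counter


def aggregate_labels(worker_labels, min_agreement=0.6):
    """Resolve each ad to its most common label, or flag it as ambiguous.

    worker_labels maps an ad ID to the list of labels workers gave it.
    min_agreement is an assumed threshold, not a value from the article.
    """
    resolved, ambiguous = {}, []
    for ad_id, labels in worker_labels.items():
        if not labels:
            ambiguous.append(ad_id)
            continue
        label, votes = Counter(labels).most_common(1)[0]
        if votes / len(labels) >= min_agreement:
            resolved[ad_id] = label
        else:
            ambiguous.append(ad_id)
    return resolved, ambiguous


# Hypothetical labels from three workers per ad.
labels = {
    "palm_trees": ["travel", "banking", "auto"],
    "ira_banner": ["finance", "finance", "finance"],
}

resolved, ambiguous = aggregate_labels(labels)
print("resolved:", resolved)
print("needs another look:", ambiguous)
```

In this toy data, the palm-tree ad lands in the ambiguous pile because no label wins a clear majority, which is the kind of case Anderson describes as taking extra human effort.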

The team is working on adding more ways for interested users to play around with the data and create and share their own comparisons, but “the goal is never to prove causation; there are too many factors at play,” Jane Friedhoff, another OCR staffer who led the work on Floodwatch, said. “But we want people to start to get curious — maybe for individual users to be able to flag certain comparisons as interesting, and that can act as a breadcrumb trail for researchers — so it can also be sort of a crowdsourced process.” Other potential features, which depend on more users downloading and using the extension, could include ways to depict how a user’s ad profile has changed over time, and also ways to toggle between showing “all ads shown to me” and “ads being targeted directly at me.”

Floodwatch will give data collected by the extension only to approved researchers affiliated with research institutions, colleges, and universities, not corporate entities, Thorp said, and will publicly list all the approved researchers who have access to the data. The extension is open source, for those interested in tinkering with it for other browsers.

“We’re also excited for this to be used in classrooms, for students of various ages to be able to ask questions around, what should we allow advertisers to do based on demographics?” Thorp said. “It could help teach young people in particular that there are these new avenues of activism, but through tech. I not only want people to think about using tools like Floodwatch, but also building tools like it themselves.”

Photo of a 1964 two-page car ad for Buick by Classic Film used under a Creative Commons license.

Notes

[1] To use the tool, you need to uninstall adblockers such as AdBlock Plus or Ghostery, which prevent ads from loading.