ProPublica built software and a machine-learning algorithm to allow Facebook users to send us the political ads that appear on their Facebook news feeds.

Political ads on Facebook have come under scrutiny since it was revealed that Russia used such messages to try to influence the 2016 U.S. election. But online political ads are often seen only by a small target audience, which makes it difficult for the public to check them for accuracy. To shine a light on political advertising on Facebook, we built a tool that allows Facebook users to automatically send us the political ads that are displayed on their news feeds. (You can install the tool, known as a web browser extension, for Chrome or Firefox.)

The extension, which we call the Political Ad Collector, is a small piece of software that users can add to their web browsers. When a user logs into Facebook, the extension will collect the ads displayed on the user’s news feed and guess which ones are political based on an algorithm built by ProPublica. Ads that are found likely to be political are made public in a searchable database.

Our tool collects basic information about each ad, such as the Facebook ad identification number and the dates we saw the ad in our system. However, to protect the privacy of users, we automatically remove any personally identifiable information from the ads we collect, including Facebook ID numbers and tracking identifiers, which are tiny images that can be used to identify users. We also remove any comments on the ad, along with the names and profile links of the user’s friends who have liked it.
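
As a rough illustration of that scrubbing step, here is a minimal Python sketch. The regular expressions, the scrub_ad name and the sample markup are assumptions made for this example; the extension's actual rules may differ. The sketch covers only two of the steps: dropping one-pixel tracking images and stripping query strings, which often carry identifiers, from links.

```python
import re

# Illustrative patterns only; these are assumptions, not the extension's real rules.
# Matches <img> tags declared one pixel wide, a common form of tracking image.
TRACKING_PIXEL = re.compile(r'<img\b[^>]*\bwidth="1"[^>]*>', re.IGNORECASE)
# Matches the query string of a link so it can be dropped along with any IDs in it.
URL_QUERY = re.compile(r'(href="[^"?]*)\?[^"]*"')


def scrub_ad(html: str) -> str:
    """Return the ad markup with tracking pixels and link query strings removed."""
    html = TRACKING_PIXEL.sub("", html)
    html = URL_QUERY.sub(r'\1"', html)
    return html


raw_ad = (
    '<div><a href="https://example.com/donate?fbclid=abc123&uid=42">Donate</a>'
    '<img src="https://example.com/p.gif" width="1" height="1"></div>'
)
clean_ad = scrub_ad(raw_ad)
print(clean_ad)
```

Regular expressions are a blunt instrument for HTML; a production version would more likely walk the parsed page structure.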

We collect targeting information that Facebook provides with the ad, but we do not connect that information to any data that could be used to identify a user. The targeting data tells users some of the criteria used to decide which ads to display to which user, such as age and location. Facebook users can see targeting information if they click the dots at the top right corner of any Facebook ad and select “Why am I seeing this?”

To determine which ads are political, ProPublica built a machine-learning algorithm to calculate the statistical likelihood that an ad contains political content. This algorithm, called a Naive Bayes classifier, has long been used for identifying spam emails. It works particularly well on classifying text into one of two groups, such as spam or not spam, or, in this case, political and not political.

Before we launched the tool, we trained this algorithm on two lists of Facebook posts: one of posts that we knew were political (posts from parties and candidates, and posts about political issues) and one of posts that weren’t political (posts published by big stores and other companies).

If we had relied solely on these initial hand-selected posts, our classifier would have been able to reliably find ads published by the Democratic or Republican parties, but it would have missed ads from groups that we didn’t include in our training data. It also would have missed how politically charged language and subjects change over time.
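
To make the classifier and its training step more concrete, here is a minimal Naive Bayes sketch in Python. It is not ProPublica's code: the class name, the tokenizer, the add-one smoothing and the four sample posts are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

LABELS = ("political", "not_political")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSketch:
    """A toy two-class Naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        """Record one labeled post."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def political_probability(self, text):
        """Estimate the probability that the text is political."""
        vocab = set(self.word_counts["political"]) | set(self.word_counts["not_political"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in LABELS:
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Smoothed log prior, so an unseen class never yields log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + len(LABELS)))
            for word in tokenize(text):
                if word not in vocab:
                    continue  # words never seen in training carry no evidence
                # Add-one (Laplace) smoothing for words unseen in this class.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            log_scores[label] = score
        # Convert log scores to a normalized probability.
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        return weights["political"] / sum(weights.values())


clf = NaiveBayesSketch()
# Hypothetical training posts standing in for the hand-selected lists.
clf.train("Vote for our candidate on election day", "political")
clf.train("Tell Congress to pass the climate bill", "political")
clf.train("Huge sale on shoes this weekend", "not_political")
clf.train("Free shipping on all orders today", "not_political")

print(clf.political_probability("Vote in the election"))
print(clf.political_probability("Shoes on sale"))
```

With so few training posts, the sketch also shows the limitation described above: an ad from a group whose vocabulary never appeared in training gives the classifier little to go on.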

So, for our algorithm to distinguish more accurately between political and non-political ads, our tool regularly shows users a selection of ads and asks them to identify which ones they think are political and which ones they think aren’t. These include the ads that appeared on the user’s own feed as well as ads that were shown to other people. Just because a single user tells us an ad is non-political doesn’t mean it will get dropped from our database of political ads, but the algorithm will take that vote into account.
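
The article does not say how votes are weighed, so the following Python sketch shows only one simple possibility: treat the classifier's estimate as worth a fixed number of pseudo-votes and blend real votes in. The blended_score function, the prior_weight of 5 and the 0.5 threshold are all hypothetical.

```python
def blended_score(model_probability, political_votes, total_votes, prior_weight=5.0):
    """Blend a classifier estimate with user votes (hypothetical scheme).

    prior_weight is an assumed value: the classifier's estimate counts as
    that many votes, so a single user vote shifts the score without
    overriding it.
    """
    if total_votes < 0 or not 0 <= political_votes <= total_votes:
        raise ValueError("vote counts are inconsistent")
    return (prior_weight * model_probability + political_votes) / (prior_weight + total_votes)


THRESHOLD = 0.5  # assumed cutoff for keeping an ad in the political database

# An ad the classifier rates as 90% likely to be political.
no_votes = blended_score(0.9, political_votes=0, total_votes=0)
one_no_vote = blended_score(0.9, political_votes=0, total_votes=1)
five_no_votes = blended_score(0.9, political_votes=0, total_votes=5)

print(no_votes >= THRESHOLD, one_no_vote >= THRESHOLD, five_no_votes >= THRESHOLD)
```

Under these assumed numbers, one "non-political" vote lowers the score but leaves the ad above the cutoff, while five such votes push it below.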

The tool is already being used in several countries, including Germany, Italy, Australia, Austria and the U.S. For each country, we customize the algorithm to learn to identify political content. The open source code behind our project is available to the public.

Julia Angwin is a senior reporter at ProPublica. From 2000 to 2013, she was a reporter at The Wall Street Journal, where she led a privacy investigative team that was a finalist for a Pulitzer Prize in Explanatory Reporting in 2011 and won a Gerald Loeb Award in 2010.
