These are big numbers. But it can be easy to misinterpret them. The $100,000 in advertisements was a drop in the bucket compared to the $70 million directly spent by the Trump campaign. Barely anyone actually showed up to the anti-immigration protests. It turns out that the widely cited metrics for organic reach have a botnet problem.

Studying Facebook interactions is hard

The study arguing that Russian posts got billions of views comes from Jonathan Albright, research director at Columbia University’s Tow Center for Digital Journalism. Albright relied on CrowdTangle, a popular social media analytics tool for monitoring Facebook interactions and surfacing viral content. This is a novel and creative application of the CrowdTangle toolset, and not the purpose that CrowdTangle was designed for. Unfortunately, that may have led to problems in the analysis.

“The logic of CrowdTangle’s model is relatively simple (even if the underlying math and software code gets complicated). CrowdTangle tracks clusters of Facebook pages and specific keywords. It gathers historical data on how stories, posts and images tend to perform on these sites, and then highlights the stories, posts and images that are doing best against their own expected baseline performance rate. [The] company then packages this information into a daily email, alerting [its] clients to the content which is likely to perform best on a day-to-day basis.”
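The baseline idea in that description can be illustrated with a small sketch. This is not CrowdTangle's code or formula, which the passage does not spell out. It assumes the "expected baseline" is simply a page's average past interaction count, and the page names and numbers are invented for illustration.

```python
from statistics import mean

def overperformance(history, current):
    """Ratio of a post's interactions to the page's own historical average.

    Assumption: the baseline is a plain mean of past interaction counts.
    """
    baseline = mean(history)
    return current / baseline

# Hypothetical pages: (past interaction counts, interactions on today's post)
pages = {
    "small_page": ([40, 60, 50], 400),
    "large_page": ([9000, 11000, 10000], 12000),
}

scores = {name: overperformance(hist, cur) for name, (hist, cur) in pages.items()}

# Rank pages by how far today's post beats that page's own baseline.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Under these assumptions the small page's post ranks first even though the large page's post drew far more interactions in absolute terms. A measure like this surfaces content that is unusual for its page. It says nothing about whether the accounts doing the interacting are real.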

CrowdTangle plays a crucial behind-the-scenes role in the social sharing optimization strategies of digital media producers like Upworthy, Vox and BuzzFeed. CrowdTangle was not designed to combat, weed out or even study botnets. It was designed to identify stories and content that perform better than average within a company’s peer network.

Albright’s study focuses on a pair of metrics that CrowdTangle generates on the basis of the data it gathers: “interactions” and “organic reach.” The trouble with digital indicators like these is that they are easy to inflate. We have seen this on Twitter, where nearly half of President Trump’s Twitter followers are fake accounts and bots. This may be a particular problem when studying Russian influence activities. Adrian Chen’s reporting has documented that Russia’s Internet Research Agency (IRA) specializes in creating fake social media accounts to magnify the impact of its activities. These fake accounts can warp these sorts of simple metrics of online opinion.

The ‘Blacktivist’ page shows how this can work

The “Blacktivist” Facebook page is an instructive example. This page was created by the IRA. Donie O’Sullivan and Dylan Byers have previously reported that the Blacktivist page “had 360,000 likes, more than the verified Black Lives Matter account on Facebook, which currently has just over 301,000.” We cannot tell based on public data how many of these likes came from IRA-created Facebook accounts. But it is highly likely that the IRA’s page had more likes than the actual Black Lives Matter account because the IRA also fabricated several thousand Facebook profiles, then used those accounts to give its content a veneer of legitimacy.

The Albright study highlights the 6.18 million “interactions” (reactions, comments, and shares) the Blacktivist page received across 500 posts. The single most-shared post from the Blacktivist page received 344,209 interactions, which is less than the page’s total number of likes.

This could indicate massive viral spread among socially conscious Facebook users. Or it could just be repeat sharing echoing across a botnet. The study also estimates the “organic reach” by counting the sum total of followers of all Facebook pages that shared a Blacktivist post. That sum is rife with overcounts, though. If two fake Facebook profiles each have the same 5,000 fake friends, and both share the fake Blacktivist post, CrowdTangle’s “organic reach” will record it as visible to 10,000 people.
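The overcounting in that example is easy to reproduce. The sketch below is illustrative only: the profile names, the audience data and both functions are hypothetical, and it models "organic reach" as a plain sum of audience sizes, as described above, without using any CrowdTangle API.

```python
# Hypothetical data: two fake profiles that share the same 5,000 fake friends.
shared_friends = {f"fake_friend_{i}" for i in range(5000)}

sharers = {
    "fake_profile_a": shared_friends,
    "fake_profile_b": shared_friends,
}

def summed_reach(audiences):
    """Add up each sharer's audience size, counting overlapping accounts repeatedly."""
    return sum(len(audience) for audience in audiences.values())

def unique_reach(audiences):
    """Count each distinct account once, no matter how many sharers it is tied to."""
    return len(set().union(*audiences.values()))

naive = summed_reach(sharers)
deduplicated = unique_reach(sharers)
print(naive, deduplicated)  # 10000 5000
```

Summing reports 10,000 where only 5,000 distinct accounts exist. Even the deduplicated figure counts accounts rather than people, and in this example every one of those accounts is fake.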

The headline from Albright’s study is that Blacktivist posts had a total “organic reach” of 103.8 million. Across the Blacktivist page and five other IRA-created pages that have been made public, Albright counts 340 million. Since Facebook has deleted 470 pages, he reasonably concludes that the total “organic reach” of all these pages is likely in the billions. That math is correct, but misleading. We have no way of knowing what portion of these views are attributable to actual human beings living in the United States of America.

The larger difficulty here is that Facebook has quasi-monopolistic power in the social sharing economy. We encounter news content through the black box of Facebook’s newsfeed algorithms, and no one besides Facebook’s own engineers can say precisely how these algorithms operate. Facebook’s internal data might be able to sort through these questions, but even that is uncertain. Facebook’s publicly available data is extremely limited. This makes it hard to sort out the scale of Russian digital propaganda activity in the 2016 election.

We know that foreign actors expended substantial time and money in an attempt to inject digital propaganda into the 2016 election. We know that they did this in an attempt to undermine trust in democracy and buttress the efforts of the Trump campaign.

Gathering clear data on the scope of these activities is both phenomenally important and phenomenally difficult. Albright’s effort to shed light on this activity is to be applauded. Still, readers need to be cautioned not to overhype the topline findings, which may inadvertently be highly misleading.

This article is one in a series supported by the MacArthur Foundation Research Network on Opening Governance that seeks to work collaboratively to increase our understanding of how to design more effective and legitimate democratic institutions using new technologies and new methods. Neither the MacArthur Foundation nor the Network is responsible for the article’s specific content. Other posts in the series can be found here.

Dave Karpf is an associate professor in the School of Media and Public Affairs at George Washington University.