Privacy Implications of Social Media Manipulation

The ethical debate about Facebook’s mood manipulation experiment has rightly focused on Facebook’s manipulation of what users saw, rather than the “pure privacy” issue of which information was collected and how it was used.

It’s tempting to conclude that because Facebook didn’t change their data collection procedures, the experiment couldn’t possibly have affected users’ privacy interests. But that reasoning is incorrect.

To simplify the discussion, let’s consider a hypothetical social network that I’ll call Wo. Wo lets people set up accounts and establish mutual friend relationships, as on Facebook. Rather than letting users post detailed status updates, Wo just lets a user set their status to either Smiley or Frowny. Users viewing the service then see photos of their friends, which are either smiling or frowning depending on the friend’s status. Wo keeps records of users’ status changes and when each user views their friends’ statuses.

Wo learns certain things about their users by collecting and analyzing these records. They can tell how often a user is Smiley, how long a user’s Smiley and Frowny states persist, and how a user’s status correlates with the status of each friend and with the statuses of their friends in aggregate.

What’s interesting is that if Wo manipulates what its users see, it can learn things about individual users that it couldn’t learn by observation alone.

Suppose Wo wants to study “emotional contagion” among its users. In particular, it wants to know whether seeing more Frowny faces makes a user more likely to set their own status to Frowny. Wo could measure whether a user’s status tends to be correlated with her friends’ statuses. But correlation is not causation—Alice and Bob might be Frowny at the same time because something bad happened to their mutual friend Charlie, or because they live in the same town and the weather there is bad.

If Wo really wants to know whether seeing Bob’s Frowny status tends to cause Alice to set her status to Frowny, the most effective method for Wo to learn this is a randomized trial, where they artificially manipulate what Alice sees. Some random fraction of the time, they show Alice a random status for Bob (rather than Bob’s actual status at the time), and they measure whether Alice is more Frowny when the false Bob-status is Frowny. (There are some methodological issues that Wo has to get right in such an experiment, but you get the idea.)
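A minimal sketch of such a randomized trial, assuming a toy model in which the names, rates, and the "susceptibility" parameter are all invented for illustration:

```python
import random

def bob_actual_status():
    # Hypothetical baseline: Bob is Frowny 40% of the time.
    return "Frowny" if random.random() < 0.4 else "Smiley"

def run_trial(susceptibility, n_views, inject_rate=0.5, base_frowny=0.3):
    """Simulate Wo's randomized trial on one user ("Alice").

    With probability inject_rate, Wo shows Alice a random status for Bob
    instead of his real one. In this toy model, seeing a Frowny face raises
    Alice's chance of setting her own status to Frowny by `susceptibility`.
    """
    after_frowny, after_smiley = [], []
    for _ in range(n_views):
        injected = random.random() < inject_rate
        shown = random.choice(["Frowny", "Smiley"]) if injected else bob_actual_status()
        p = base_frowny + (susceptibility if shown == "Frowny" else 0.0)
        alice_frowny = random.random() < p
        # Only the injected views identify causation: the shown status is
        # random, hence independent of weather, mutual friends, and so on.
        if injected:
            (after_frowny if shown == "Frowny" else after_smiley).append(alice_frowny)
    rate = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return rate(after_frowny) - rate(after_smiley)  # estimated causal effect

random.seed(1)
effect = run_trial(susceptibility=0.3, n_views=20000)
print(round(effect, 2))  # close to the true effect of 0.3
```

Because the injected statuses are random, any difference between the two arms can only come from what Alice was shown, which is exactly the causal question that mere observation cannot answer.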

This kind of experiment allows Wo to learn something about Alice that they would not have been able to learn by observation alone. In this case, they learn how manipulable Alice’s emotional status is. The knowledge they gain is statistical in nature—they might have, say, 83% statistical confidence that Alice is more manipulable than the average person—but statistical knowledge is real knowledge.
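To make a figure like "83% statistical confidence" concrete, here is one hypothetical way Wo could score it: compare Alice's estimated effect against the population-average effect under a normal approximation. The helper name and all the numbers below are invented for illustration, not anything Wo or Facebook is known to compute.

```python
import math

def confidence_above_average(user_effect, user_se, population_mean):
    """P(user's true effect > population average), normal approximation.

    user_effect: the effect estimated from the user's randomized views
    user_se:     the standard error of that estimate
    """
    z = (user_effect - population_mean) / user_se
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

# Hypothetical numbers: Alice's estimated effect is one standard error
# above the population average, giving roughly 84% confidence.
print(round(confidence_above_average(0.12, 0.06, 0.06), 2))  # 0.84
```

The output is a probability, not a certainty, which is exactly the sense in which the knowledge gained is statistical; but as the post says, statistical knowledge is real knowledge.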

A notable feature of this hypothetical experiment is that Alice probably couldn’t tell that it was going on. She would know that she was revealing her emotional state to Wo, but she wouldn’t know that Wo was learning how manipulable her emotions were.

The key point is that the privacy impact of an interaction like this depends not only on which types of information are gathered, but also on which prompts were given to the user and how those prompts were chosen. Experimenting on users affects their privacy.

Now: What does this hypothetical teach us about the privacy impact of Facebook’s experiment?

What it tells us is that Facebook did learn some non-zero amount of information about the manipulability of individual users’ emotions. Given the published results of the study, the information learned about any individual user was probably very weak in the statistical sense: correlated with the truth, but only very weakly so, for the vast majority of users and perhaps for all of them.

To be clear, I am not concluding that Facebook necessarily learned much of anything about the manipulability of any particular user. Based on what we know I would bet against the experiment having revealed that kind of information about any individual. My point is simpler: experiments that manipulate user experience impact users’ privacy, and that privacy impact needs to be taken into account in evaluating the ethics of such experiments and in determining when users should be informed.

Comments

Of note given your weather example, Facebook actually published an instrumental variable study of emotional contagion using rainfall earlier this year. Adam Kramer, lead author of the PNAS paper, was a co-author on that study.

Yes, the weather study you cite is a good example of a useful study that observes users’ behavior on the “vanilla” Facebook service but doesn’t manipulate what they see. Such observe-only studies raise less complicated ethical issues than manipulate-and-observe studies.

Ed, you raise a very interesting point, which prompts me to take your analysis further into the realm of philosophy: This is not a violation of privacy in the traditional understanding of the term where someone knows something that they do not want to reveal. In your scenario, users of Wo not only do not know themselves but they do not even have the capacity to know what Wo knows through its experimentation capabilities. In a sense, allowing the very possibility of someone having the ability to do this type of analysis without the obligation to disclose the results (and to whom?) amounts to a deep philosophical question that lies at the core of, let me informally say, many things about big data.

I agree with what I think you’re saying in your final point about privacy appearing as a subset of the ethical questions apparent in Facebook’s manipulation (and here, as in my comment on the previous post, I mean their common business practices of choosing what to display, not whatever changes they did for this study). Companies are using privacy protections as a way to demonstrate that they deserve trust to make larger, much less quantifiable ethical decisions.

danah boyd points this out in her piece today: “If Alice is happier when she is oblivious to Bob’s pain because Facebook chooses to keep that from her, are we willing to sacrifice Bob’s need for support and validation? This is a hard ethical choice at the crux of any decision of what content to show when you’re making choices. And the reality is that Facebook is making these choices every day without oversight, transparency, or informed consent.”

This is not a question of users’ privacy from Facebook (assumed non-existent in this case) but at some level a similar question about whether Facebook is making shared information visible. Can our current models for privacy also cover the dissemination of information such as Bob’s statuses? Or are there other models in the research community that can examine this? “Companies have always decided what’s fit to print” seems like it does not capture the individualized decisions that Facebook (or Wo) might derive from their data.

PS: Every time I post a comment from Firefox I spend the next few days getting every freedom-to-tinker post page redirected to a 404 for /garbage.php – any ideas whether that’s an Apache error, a confusion because I have a quote in the “Name” field, a Firefox incompatibility…?

I use Firefox without any issues… of course I don’t use double quotes in my name…. However, on a recent website that I finished developing, I did note that double quotes had to be handled very carefully in EVERY web form field [even fields where I didn’t expect a double quote to be used], or inconsistent results happened in the testing phase. But that is all back-end scripting stuff that is probably not within the purview of this blog’s contributors’ expertise.

I may be missing something, but this seems to be a rather strained analogy. In your scenario, Wo is lying to Alice, and that (IMO) is the primary ethical failure. My understanding is that the FB experiments only changed how certain data was prioritized – akin to Google changing the way in which search rankings are calculated – and did not actually falsify anything. Have I misunderstood?

As Joseph Bonneau wrote below, the Wo hypothetical isn’t meant as a perfect analogy to what Facebook did, but rather as a device to illustrate the principle that you can learn more about someone by experimenting on them than by merely observing them under “natural” conditions—which in turn implies that A/B type experiments do have privacy implications for end users.

Oh, I see what you mean. The goal of the exercise may have been to learn something about human psychology, but in the specific case you’ve learnt something about Alice’s psychology, at least potentially.

(I’d have thought in most cases you wouldn’t get any statistically meaningful information for any particular individual, i.e., the results would only be significant in the aggregate, but at a minimum you’d still need to consider the risk.)

Harry, that was my reaction too. I think Wo was lying to Alice in Ed’s example, which I believe is far more important, ethically speaking, than what they can derive from their lying; and even so, I believe the same of Facebook.

Decisions on how to “prioritize” people’s posts are still considerably different from Google prioritizing search results. The latter is expected as part of the service a search engine provides; the former is NOT expected of a website through which people interact with each other, and which promotes itself as a way to do exactly that.

If my status may be ranked low, pushed down to the bottom, or buried several pages of clicks deep (if it is shown to my friends at all), that makes me less likely to want to post a status whose whole purpose is for my friends to see it. For Facebook to do that, against their advertised business, is just as much lying (or dishonesty, or, to use the blunt word, fraud) as the Wo example. If I knew Facebook was going to do that, how likely would I be to post anything at all?

Harry Johnston, your point is valid: Ed’s analogy involved actively deceiving users, which is more than Facebook did for this experiment. This is a big difference.

However, Ed’s point probably still stands given a weaker analogy, where the Wo network simply shows either Bob’s frown or nothing to test Alice’s response. They would still be learning new information about Alice that she might not realize they could learn.

I know I am backtracking in dates again… But this again fails to address the most basic problem with the whole Facebook affair, which is the implication of fraud. Take your Wo example: suppose Alice signed up with Wo specifically so that she could see whether Bob was smiling or frowning, and suppose Wo then decided to run the “emotional contagion” experiment for whatever reason, whether or not the data it collects has privacy implications. The moment they chose to do more than simple observation (because, as you say, correlation and causation are not the same), the moment they showed a “false Bob-status,” they conducted themselves fraudulently, because they told Alice that Bob’s status was one thing when it may not have been.

In my book the fraud is just as important as the privacy implications. If a company didn’t engage in fraud, people would have less to worry about when their privacy is invaded. But if a company is happy to commit fraud, you can be fairly sure that its intentions for any private data it collects will be just as malicious. A fraudulent company is not trustworthy at all.

Facebook is one of the biggest frauds of the 21st century, which is why I don’t like it. This study is only one of many things I have found fraudulent about Facebook.

Freedom to Tinker is hosted by Princeton's Center for Information Technology Policy, a research center that studies digital technologies in public life. Here you'll find comment and analysis from the digital frontier, written by the Center's faculty, students, and friends.