Buying and Selling Privacy

Big Data's Different Burdens and Benefits

Big data is transforming individual privacy—and not in equal ways for all. We are increasingly dependent upon technologies, which in turn need our personal information in order to function. This reciprocal relationship has made it incredibly difficult for individuals to make informed decisions about what to keep private. Perhaps more important, the privacy considerations at stake will not be the same for everyone: they will vary depending upon one’s socioeconomic status. It is essential for society and particularly policymakers to recognize the different burdens placed on individuals to protect their data.

I. The Value of Privacy

Privacy norms can play an important role in defining social and individual life for rich and poor alike. In his essay on the social foundations of privacy law, the dean of Yale Law School, Robert Post, argued that privacy upholds social “rules of civility” that create “a certain kind of human dignity and autonomy which can exist only within the embrace of community norms.”[1] He cautioned that these benefits would be threatened when social and communal relationships were replaced by individual interactions with “large scale surveillance organizations.”[2]

Today, privacy has become a commodity that can be bought and sold. While many would view privacy as a constitutional right or even a fundamental human right,[3] our age of big data has reduced privacy to a dollar figure. There have been efforts, both serious and silly, to quantify the value of privacy. Browser add-ons such as Privacyfix try to show users their value to companies,[4] and a recent study suggested that free Internet services offer $2,600 in value to users in exchange for their data.[5] Curiously, this number tracks closely with a claim by Chief Judge Alex Kozinski that he would be willing to pay up to $2,400 per year to protect his family’s online privacy.[6] In an interesting Kickstarter campaign, Federico Zannier decided to mine his own data to see how much he was worth. He recorded all of his online activity, including the position of his mouse pointer and a webcam image of where he was looking, along with his GPS location data, offered the results for sale at $2 a day, and raised over $2,700.[7]

“Monetizing privacy” has become something of a holy grail in today’s data economy. We have seen efforts to establish social networks where users join for a fee and the rise of reputation vendors that protect users’ privacy online, but these services are luxuries. And when it comes to our privacy, price sensitivity often dictates individual privacy choices. Because the “price” an individual assigns to protect a piece of information is very different from the price she assigns to sell that same piece of information, individuals may have a difficult time protecting their privacy.[8] Privacy clearly has financial value, but in the end there are fewer people in a position to pay to secure their privacy than there are individuals willing to sell it for anything it’s worth.

A recent study by the European Network and Information Security Agency discovered that most consumers will buy from a more privacy-invasive provider if that provider charges a lower price.[9] The study also noted that when two companies offered a product for the same price, the more privacy-friendly provider won out. This was hailed as evidence that a pro-privacy business model could succeed, but this also anticipates that, all things being equal, one company would choose not to collect as much information as a competitor just to be seen as “privacy friendly.” This defeats much of the benefit that a big data economy promises.

II. The Big Data Challenge

The foundations of big data rest on collecting as much raw information as possible before we even begin to understand what insight can be deduced from the data. As a result, long-standing Fair Information Practices like collection limits and purpose limitations are increasingly viewed as anachronistic,[10] and a number of organizations and business associations have called for privacy protections to focus more on how data might be used rather than limit which data can be collected.[11] The conversation has moved away from structural limitations toward how organizations and businesses can build “trust” with users by offering transparency.[12] Another suggestion is to develop business models that will share the benefits of data more directly with individuals. Online data vaults are one potential example, while the Harvard Berkman Center’s “Project VRM” proposes to rethink how to empower users to harness their data and control access to it.[13] In the meantime, this change in how we understand individual privacy may be inevitable—it may be beneficial—but we need to be clear about how it will impact average individuals.

A recent piece in the Harvard Business Review posits that individuals should only “sell [their] privacy when the value is clear,” explaining that “[t]his is where the homework needs to be done. You need to understand the motives of the party you’re trading with and what [he] ha[s] to gain. These need to align with your expectations and the degree to which you feel comfortable giving up your privacy.”[14] It could be possible to better align the interests of data holders and their customers, processing and monetizing data both for business and individual ends. However, the big challenge presented by big data is that the value may not be clear, the motives, let alone the identity, of the data collector may be hidden, and individual expectations may be confused. Moreover, even basic reputation-management and data-privacy tools require either users’ time or money, which may price out average consumers and the poor.

III. Big Data and Class

Ever-increasing data collection and analysis have the potential to exacerbate class disparities. They will improve market efficiency, and market efficiency favors the wealthy, established classes. While the benefits of the data economy will accrue across society, the wealthy and better educated are in a better position to become the type of sophisticated consumer that can take advantage of big data.[15] They possess the excellent credit and ideal consumer profile to ensure that any invasion of their privacy will be to their benefit; thus, they have much less to hide and no reason to fear the intentions of data collectors. And should the well-to-do desire to maintain a sphere of privacy, they will also be in the best position to harness privacy-protection tools and reputation-management services that will cater to their needs. As a practical matter, a monthly privacy-protection fee will be easier for the wealthy to pay as a matter of course. Judge Kozinski may be willing and able to pay $200 a month to protect his privacy, but the average consumer might have little understanding of what this surcharge is getting him.

The lower classes are likely to feel the biggest negative impact from big data. Historically, the poor have had little expectation of privacy; castles and high walls were for the elite, after all. Even today, however, the poor are the first to be stripped of fundamental privacy protections. Professor Christopher Slobogin has noted what he calls a “poverty exception” to the Fourth Amendment, suggesting that our expectations of privacy have been defined in ways that make the less well-off more susceptible to warrantless government intrusions into their privacy and autonomy.[16] Big data worsens this problem. Most of the biggest concerns we have about big data (discrimination, profiling, tracking, exclusion) threaten the self-determination and personal autonomy of the poor more than those of any other class. Even assuming they can be informed about the value of their privacy, the poor are not in a position to pay for their privacy or to value it over a pricing discount, even if this places them into an ill-favored category.

And big data is all about categorization. Any given individual’s data only becomes useful when it is aggregated together to be exploited for good or ill. Data analytics harness vast pools of data in order to develop elaborate mechanisms to categorize and organize. In the end, the worry may not be so much about having information gathered about us, but rather being sorted into the wrong or disfavored bucket.[17] Take the example of an Atlanta man who returned from his honeymoon to find his credit limit slashed from $10,800 to $3,800 simply because he had used his credit card at places where other people were likely to have a poor repayment history.[18]

Once everyone is categorized into granular socioeconomic buckets, we are on our way to a transparent society. Social rules of civility are replaced by information efficiencies. While this dynamic may produce a number of very significant societal and communal benefits, these benefits will not fall evenly on all people. As Helen Nissenbaum has explained, “the needs of wealthy government actors and business enterprises are far more salient drivers of their information offerings, resulting in a playing field that is far from even.”[19] Big data could effectuate a democratization of information but, generally, information is a more potent tool in the hands of the powerful.

Thus, categorization and classification threaten to place a privacy squeeze on the middle class as well as the poor. Increasingly large swaths of people have little recourse or ability to manage how their data is used. Encouraging people to contemplate how their information can be used, and how best to protect their privacy, is a positive step, but a public education campaign, while laudable, may be unrealistic. Social networks, cellular phones, and credit cards, the lifeblood of the big data economy, are necessities of modern life, and even assuming it were either realistic or beneficial to get average people to unplug, an overworked, economically insecure middle class does not have the time or energy to prioritize what is left of their privacy.

At present, the alternative to monetizing privacy is to offer individuals the right to make money off their information. Michael Fertik, who runs the online privacy management site, Reputation.com, sees a bright future in allowing companies to “unlock huge value in collaboration with their end users” by monetizing “the latent value of their data.”[20] Startups like Personal have tried to set themselves up as individually tailored information warehouses where people can mete out their information to businesses in exchange for discounts.[21] These are projects worth pursuing, but the degree of trust and alignment between corporate and individual interests they will require is significant. Still, it is unlikely we can ever develop a one-to-one data exchange. Federico Zannier sold his personal data at a rate of $2 per day to anyone who would take it as an experiment, but average individuals will likely never be in a position to truly get their money’s worth from their personal data. Bits of personal information sell for a fraction of a penny,[22] and no one’s individual profile is worth anything until it is collected and aggregated with the profiles of similar socioeconomic categories.

Conclusion

While data protection and privacy entrepreneurship should be encouraged, individuals should not have to pay up to protect their privacy or receive coupons as compensation. If we intend for our economic and legal frameworks to shift from data collection to use, it is essential to begin the conversation about what sort of uses we want to take off the table. Certain instances of price discrimination or adverse employment decisions are an easy place to start, but we ought to also focus on how data uses will impact different social classes. Our big data economy needs to be developed such that it promotes not only a sphere of privacy, but also the rules of civility that are essential for social cohesion and broad-based equality.

If the practical challenges facing average people are not considered, big data will push against efforts to promote social equality. Instead, we will be categorized and classified every which way, and only those in the highest-value categories will experience the best benefits that data can provide.

[1] Robert C. Post, The Social Foundations of Privacy: Community and Self in the Common Law Tort, 77 Calif. L. Rev. 957, 959 (1989).

[2] See id. at 1009 (suggesting that the relationships between individuals and large organizations are “not sufficiently textured or dense to sustain vital rules of civility” and instead emphasize raw efficiency in data collection).

[10] Since their inception three decades ago, the Fair Information Practices, which include principles such as user notice and consent, data integrity, and use limitations, have become the foundation of data protection law. For a thorough discussion and a critique, see Fred H. Cate, The Failure of the Fair Information Practice Principles, in Consumer Protection in the Age of the “Information Economy” 343 (2006).

[13] VRM stands for “Vendor Relationship Management.” According to the Harvard Berkman Center, the goal of the project is to “provide customers with both independence from vendors and better ways of engaging with vendors.” ProjectVRM, Harv. Univ. Berkman Ctr. for Internet & Soc’y, http://cyber.law.harvard.edu/projectvrm/Main_Page (last updated Mar. 27, 2013, 07:07 PM). It hopes Project VRM can improve individuals’ relationships with not just businesses, but schools, churches, and government agencies. Id.
