Self-destructing data: The return of Internet privacy

There is no such thing as privacy on the Internet anymore—anything you say or do lives on ad infinitum in Internet memory. In the intro of his Harvard paper, Viktor Mayer-Schönberger notes that “In March 2007, Google confirmed that since its inception it had stored every search query every user ever made and every search result ever clicked on. Google remembers forever.” As one of the most pervasive tools of our generation, Google and its associated applications have changed the way we think about data, privacy, digital identity, and memory.

A recent article by Nate Anderson in Ars Technica highlights Professor Mayer-Schönberger's book, Delete: The Virtue of Forgetting in the Digital Age. The message: “Technology has now made ‘remembering’ the default approach to information, and in doing so, threatens to make ‘forgetfulness’ obsolete.” This is not only a profound change from 20 years ago; it can also be detrimental to our ability to think and analyze information. The article goes on to say: “Selective forgetfulness is a boon to humanity; it keeps us from drowning in our own recorded data. It allows us to sift and sort, then to think at a higher level of abstraction instead of wallowing in detail.”

But this may all soon change. Perhaps computers can learn to forget too.

Researchers led by doctoral candidate Roxana Geambasu at the University of Washington in Seattle are working on a project called Vanish. The idea is to encapsulate data such as e-mails, selected text in messages, or documents that are sent over the Internet. The system would create corresponding keys for decapsulation that are widely available online but that would deteriorate over time, so that the data would be available in readable form only for a limited period. The overview page of the Vanish project states, “We strongly believe that realizing Vanish’s vision would represent a significant step toward achieving privacy in today’s unforgetful age.” Mayer-Schönberger suggests a similar solution that uses metadata to tag data objects with expiration dates, and cites the work of Lawrence Lessig, who has proposed a broader approach that combines policy and software to force privacy compliance.
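To make the expiration-date idea a little more concrete, here is a minimal Python sketch of a store that tags each data object with an expiry timestamp and forgets it once that time has passed. This is only an illustration under my own assumptions: the names ExpiringRecord and ForgetfulStore are hypothetical, and this is not how Vanish itself works (Vanish relies on decapsulation keys that deteriorate over time, which this sketch does not attempt to model).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


@dataclass
class ExpiringRecord:
    """A data object tagged with expiration metadata (hypothetical type)."""
    payload: str
    expires_at: datetime


class ForgetfulStore:
    """In-memory store that refuses to return, and deletes, expired data."""

    def __init__(self) -> None:
        self._records: Dict[str, ExpiringRecord] = {}

    def put(self, key: str, payload: str, lifetime: timedelta,
            now: Optional[datetime] = None) -> None:
        # The caller decides how long the data should live.
        now = now or datetime.now(timezone.utc)
        self._records[key] = ExpiringRecord(payload, now + lifetime)

    def get(self, key: str, now: Optional[datetime] = None) -> Optional[str]:
        now = now or datetime.now(timezone.utc)
        record = self._records.get(key)
        if record is None:
            return None
        if now >= record.expires_at:
            # Past its expiration date: forget the record for good.
            del self._records[key]
            return None
        return record.payload


if __name__ == "__main__":
    store = ForgetfulStore()
    store.put("note", "meet at noon", lifetime=timedelta(days=7))
    print(store.get("note"))  # still readable within its lifetime
```

The optional now argument is there only so the behavior can be checked without waiting for real time to pass. A real system would also have to deal with copies of the data held elsewhere, which is the hard part that projects like Vanish are trying to address.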

nGenera’s research project Leading in an Age of Unbounded Data is looking at new sources of data available to the enterprise and how these will lead to new insights, opportunities, and challenges, as well as change enterprise processes and decision-making. One of the assumptions we make is that data will continue to grow and companies, through analytics, will develop a type of ‘sixth sense’ or situational awareness about the organization thanks to information captured from across the business ecosystem. We have already found that the growth of personal information and digital identity data will lead to rich digital profiles containing social graph information. These rich profiles present opportunities to better engage with customers and employees, improve customization, and facilitate knowledge management by anticipating user needs and connecting them to relevant people and information.

Projects like Vanish force us to think about data not as an asset with an indefinite lifespan, but rather as something that depreciates over time, just as physical assets do. This would effectively reduce the amount of data that we need to manage and improve the signal-to-noise ratio, as more important facts and information would be retained while less significant information would be deleted. By eliminating the perfect memory of computers, we might also feel less pressure to maintain digital facades and manicure our online profiles. Additionally, the idea of adding expiration dates and metadata to data could accelerate the shift in power away from marketers and toward consumers, as it would allow individuals to dictate what personal data is used, who has access, for how long, and for what purpose.

But self-destructing data would also diminish the value of many of the ‘big data’ opportunities that we talk about, such as using large data sets to infer the truth about various situations and using sentiment analysis to mine online customer comments and status updates for market research and product insights. It would confound companies and marketers that store petabytes of information to generate longitudinal trends and that rely on usage data to drive Web analytics, build reputation and ratings, and improve information management through technologies such as collaborative filtering (e.g., the technology used by Amazon to recommend books to you based on the activity of people with similar behaviors). By collectively deleting our less-than-favorable digital trails, would we also be doing a disservice to future generations of anthropologists, who could benefit from a complete digital history and behavior map of their ancestors, covering good, bad, and questionable actions alike?

The idea that all data should live on forever is a relatively new concept that many people have already taken for granted. In general, I think enterprises, governments, and individuals would benefit from more discussion on the topic instead of seeing it as a foregone conclusion. The idea of having an information lifecycle for all data is a powerful one. Personally, I would welcome more initiatives such as those by the Vanish team and professor Mayer-Schönberger that broach the topic and reintroduce a little forgetfulness into our digital lives.

8 Comments


[...] Self-destructing data: The return of Internet privacy Published: February 15, 2010 Source: Wikinomics There is no such thing as privacy on the Internet anymore—anything you say or do lives on ad infinitum in Internet memory. In the intro of his Harvard paper, Viktor Mayer-Schönberger notes that “In March 2… [...]

Privacy is gone. A shift in our society has occurred over the past few years. We have gone from fearing the security of the Internet to anything/everything goes; you're nobody unless everything about you is transparent. There is little to no digital hygiene that is of any concern to many of the net's younger users. This is all they have known since high school, so it must be safe, secure, and no problem. I don’t know where this all nets out for privacy and society. Caution is still necessary: storage is unlimited and cheap, and everything is connected.

phillies blunt Feb 22, 2010 20:30

They have since changed their methods and now store your search requests for something like 8 months. I have read that somewhere.

[...] now change dynamically, according to time, place, and medium. An Internet with garden walls, that forgets, is a good start; after all, good fences and a charitable forgiveness make good neighbors. But a [...]
