Wednesday, October 21, 2009

Privacy in Social Computing

The computer has evolved tremendously over the last half century, to the point that today’s handheld devices are many times more powerful than the original mainframes. Today’s devices are also far more interconnected, both with the internet and with the other devices around us. This means that information is flying, so to speak, everywhere at an amazing rate. Combined with humans’ social nature, it is no surprise that all of this led to the sprouting and rapid growth of social networking sites. Because the underlying idea behind social networking is the constant updating of personal information, privacy in the field is an ever-present concern. With today’s networked applications, there is always a risk of some personal information being shared. Users should accept this notion to a degree, but the real value-sensitive challenge is to determine how acceptable this tradeoff is to users and their sense of privacy. The entire realm of privacy is a touchy subject, and will continue to be so as our online information base grows.

The social networking swell started several years ago and has grown remarkably to its current state; the big three networking sites, Facebook, MySpace and Twitter, recorded 124.5 million, 50.2 million, and 23.5 million unique visitors, respectively, in September 2009 (1). With this many unique users, many of whom come back frequently (Facebook had 2.3 billion visits in September (1)), it is easy to see the immense popularity of the networking trend. This networking movement plays on the natural human tendency to crave social interaction: while we are individualistic, the greater draw is to interact with others. Networking sites cater to our individualistic desires by allowing you to make your profile your own to varying degrees. At one extreme is MySpace, with virtually no limit on what can be done to your page; at the other are the more professionally oriented networks, which have stricter limitations. At the same time, the sites allow interaction between you and your ‘friends’ through the sharing of all sorts of personal information: text, pictures, audio, and video. With the number of users frequenting social networking sites, it is easy to imagine the amount of data being created.

Each user profile on a social network (I will specifically be looking at Facebook, as I am most familiar with it) contains all sorts of information about the user: demographic data, interests, hobbies, organizations, jobs and so on. The powerful thing about this data is that it is largely accurate, according to Sree Nagarajan, founder of Colligent, a company that provides our data to marketers (2). This accuracy, combined with the data’s abundance, is a dream come true for advertisers. It allows targeted advertising to happen on a page-by-page basis: the ads that each user sees can be tailored specifically to his or her interests and demographics, as well as to the actual content of the page he or she is on. This is all made possible by Facebook’s (and other networks’) very uniform and consistent presentation of data, along with the fact that it is largely public (though the definition of what is truly public is constantly being refined). Facebook makes matters even easier by offering an API that allows for scraping of data from users’ news feeds on the fly. All these factors add up to a platform that is a data-mining wonder.

Of course, privacy is a huge concern when so much personal information is so widely available. Wikipedia defines privacy as “the ability of an individual…to seclude themselves or information about themselves and thereby reveal themselves selectively” (3). This definition fits social network users well: they need their information to be visible and easily accessible to their social contacts, yet out of the hands of strangers. Like any other web application, networks assure users that “of course your privacy is very important to us” (4). But how true is this statement?

While investigating privacy in location-enhanced computing, Freier et al. developed a set of features that have a direct impact on user privacy, some of which are applicable to social networks as well: interpretability, awareness, control, scope of disclosure, and risk and recourse (5). As stated earlier, social network data is highly standardized and very easily interpreted. This makes the data extremely easy to search, whether legally or illegally, and either kind of searching reduces privacy. The question to ask here is whether this extent of standardization and interpretability is really necessary for the operation of the social network. The networks could make the data harder to mine and less accessible, but they would then be biting the hand that feeds them; advertisers would surely be displeased. A case could be made for both sides, though unfortunately the side with the most money, the advertisers, would most likely win.

Awareness of what information is being shared with whom is an important part of protecting your privacy, and it goes hand in hand with the ability to control the flow of information. Freier et al. classify systems into two categories: invisible and transparent. Invisible systems do not bother users with notifications for their awareness, whereas transparent systems disclose all information regarding privacy. On the surface, it would seem as though transparent systems are the correct design choice in terms of value-sensitive design and that users would embrace them, while systems designed to be invisible to the user would surely fail. However, studies show otherwise. For example, the User Account Control feature introduced in Microsoft’s Windows Vista was supposed to make users aware of when system settings were being modified and to give them control to allow or deny the change. The aim was to preserve the security and privacy of users’ computers, both very important values in most users’ minds. However, after launch, many users wound up turning the feature off, despite the fact that it tried to inform them of an issue and let them decide how to proceed. Perhaps this was due to a poor implementation, but there may be other reasons. Bonneau and Preibusch conducted an extensive study of privacy features in the social networking landscape and found trends that one would not otherwise expect (6). First, they split the population into three groups, which they called the marginally concerned, the pragmatic majority and the privacy fundamentalists. They discovered that most users, the pragmatic majority, claimed to be concerned with privacy but quickly forgot about it when given an attractive service or monetary rewards. In addition, the study showed that the more assurance of privacy a social site provided, the less comfortable non-fundamentalists became.
In other words, downplaying privacy in a site’s presentation while actually providing it was the best approach to appease all three user groups. In addition, the study found that social network sites (with Facebook the worst offender) tend to bury privacy settings deep in the site settings. This makes it difficult for users to opt in or out, depending on the situation, and only the dedicated fundamentalist group described above bothers to look at and modify them. All this contradicts the seemingly common-sense idea that users would welcome transparent systems more, and it suggests that people are content with invisible systems.

The scope of disclosure is very important in analyzing privacy in social networks. Because a user interacts with different classes of people (direct friends, friends of friends, strangers, etc.), there need to be definitions of what each group can see. The classes defined by Freier et al. applied well to the location-enhanced devices they discussed, but the classes Preibusch et al. defined are much more appropriate here. They suggest four classes of data: private, used only internally; group, seen by friends; community, seen by users of the social network regardless of friend status; and public, seen by anyone, regardless of social network status (4). In addition to these definitions, I think we can expand the group definition to reflect the fact that users have both actual friends and people in their network (e.g. RPI), two groups with whom users can have different types of interactions. Within the context of social networks, these classes are the bread and butter of privacy settings: the nature of the sites requires information to be shared, and users need power over who sees which parts of their profile. Tied to scope of disclosure is the risk and recourse metric for measuring privacy. This feature weighs the sensitivity of information against the ability of users to hold accountable those who use their information inappropriately. Unfortunately for social network users, their data is often very sensitive and their options for recourse very limited. For example, a study found that many users on social network sites accept ‘friend’ requests without any checks (13% on Facebook, 92% on Twitter) and post their address information, as well as their vacation plans (7). The combination of these three factors makes social networks a great new place for burglars to look for homes to hit.
Granted, these events are much more severe and much rarer than most other inappropriate uses of users’ information, but regardless, users have very little recourse against abusers of the system, as the abusers are often unknown. However, this brings up an interesting point: how many of users’ privacy problems are brought on by uninformed or foolish behavior, and how much should and can social networks do to prevent them? Common sense (and our mothers) tells us not to accept candy from strangers. The same principles apply to social networks, and if users ignore them, they are asking for trouble. As for the social networks, it would perhaps be possible to create algorithms that analyze suspicious patterns of friend requests, though the effectiveness and ethics (privacy included) of doing so would be questionable.
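To make the scope-of-disclosure classes discussed above a little more concrete, here is a minimal sketch in Python of how a profile might tag each piece of data with the widest audience allowed to see it. This is only an illustration under my own assumptions, not how Facebook or any other site actually implements privacy settings: the names Scope, Profile, viewer_scope and visible_fields are hypothetical, the NETWORK tier is the extra class I suggested above, and treating the classes as strictly nested (friends can see everything a network member can) is a simplification.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Scope(IntEnum):
    """Disclosure classes, ordered from narrowest to widest audience."""
    PRIVATE = 0    # used only internally by the site
    FRIENDS = 1    # the "group" class: direct friends
    NETWORK = 2    # assumed extra tier: people in a shared network (e.g. RPI)
    COMMUNITY = 3  # any logged-in user of the site
    PUBLIC = 4     # anyone, with or without an account


@dataclass
class Profile:
    owner: str
    friends: set = field(default_factory=set)
    networks: set = field(default_factory=set)
    # Maps a field name to (value, widest scope allowed to see it).
    fields: dict = field(default_factory=dict)


def viewer_scope(profile, viewer=None, viewer_networks=()):
    """Return the narrowest class the viewer belongs to for this profile."""
    if viewer is None:                      # not logged in
        return Scope.PUBLIC
    if viewer in profile.friends:
        return Scope.FRIENDS
    if profile.networks & set(viewer_networks):
        return Scope.NETWORK
    return Scope.COMMUNITY


def visible_fields(profile, viewer=None, viewer_networks=()):
    """Return only the fields whose allowed scope covers this viewer."""
    scope = viewer_scope(profile, viewer, viewer_networks)
    return {
        name: value
        for name, (value, allowed) in profile.fields.items()
        if allowed >= scope
    }


# Example profile; every value here is made up for illustration.
alice = Profile(
    owner="alice",
    friends={"bob"},
    networks={"RPI"},
    fields={
        "name": ("Alice", Scope.PUBLIC),
        "interests": ("hiking", Scope.COMMUNITY),
        "school": ("RPI", Scope.NETWORK),
        "address": ("12 Example St", Scope.FRIENDS),
        "ad_segment": ("segment-7", Scope.PRIVATE),
    },
)

print(sorted(visible_fields(alice)))                   # anonymous visitor
print(sorted(visible_fields(alice, "carol", {"RPI"}))) # same network
print(sorted(visible_fields(alice, "bob")))            # direct friend
```

In this sketch a PRIVATE field is never returned to any viewer, which matches the idea that such data is used only internally. Note that accepting a ‘friend’ request without any checks moves a stranger straight into the FRIENDS class, where the address becomes visible; no privacy setting can compensate for that.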

There are several major stakeholders in the system, both direct and indirect. First, the most obvious direct stakeholder is the user base of the social networks. Users are the group around whom the entire system is designed and built, and to whom the advertisers push products. The advertisers and marketers are another large stakeholder, though an indirect one. They communicate with the companies that mine the users’ data, which sell that data to them so that they can provide targeted advertising. The data mining companies are also direct stakeholders. These three stakeholders fall on opposite sides of the privacy issue: the users desire more privacy, whereas the miners and advertisers want more lax privacy policies. Which side is right is debatable. While user privacy is an important value that designers should embrace in all applications, the study above showed that most users forgot about their privacy concerns once given a reason, usually an attractive service. The advertisers, on the other hand, stand to benefit greatly from looser restrictions, which would let them receive more information and better serve targeted advertising. The ethical question is whether they should receive these looser restrictions, given that users would likely still use the services. Doing so would certainly tread on users’ value of privacy, but would superior ad targeting serve the users’ needs better? Would these ads slowly move from being seen as annoyances to being useful, and actually see higher click-through rates? These are definite questions to consider and incorporate into future privacy decisions.