This is the 13th episode of the Social Media Security Podcast recorded April 30, 2010. This episode was hosted by Tom Eston and Scott Wright. Below are the show notes, links to articles and news mentioned in the podcast:

New Facebook Changes – Social Graph, Social Plugins and Instant Personalization. Here are two articles to read on the new changes. Want to know more about the new Graph API? Read Facebook’s documentation.

Please send any show feedback to feedback [aT] socialmediasecurity.com or comment below. You can also call our voice mail box at 1-613-693-0997 if you have a question for our Q&A section on the next episode. You can also subscribe to the podcast in iTunes. Thanks for listening!

I find that the only people saying privacy is dead seem to be those named in its will. Social media researcher danah boyd highlighted some of these conflicts of interest when she admonished, “No matter how many times a privileged straight white male technology executive pronounces the death of privacy, Privacy Is Not Dead.”

Privacy is not simply about confidentiality. Privacy is about control – you having control over the nature, disclosure, dissemination, and usage of your information. Privacy is about ensuring data exchanges happen under certain norms and in appropriate contexts.

Many Silicon Valley executives, however, seem to think users should embrace sharing most of their data with the entire web. This attitude is typified in a comment by blogger Robert Scoble: “We are all going to have to learn new ways to deal with privacy. Personally I think privacy is dead. Get over it. If you want it to be private don’t put it on a computer and don’t put it on the Internet. My entire life is public. If you want, you can search for naked photos of me (there are three out there).”

But can we really extrapolate the experiences of certain social media personalities and apply them to web users in general? Would we be as comfortable with a thirteen-year-old girl commenting that you could find three naked photos of her online?

In fact, the incongruence between Scoble’s public living and the worlds that even other US bloggers navigate became apparent in a post by Michelle Greer on geolocation. Greer does not oppose geolocation services, but she does note how they can increase risks for a person dealing with stalkers. And such risks are not eliminated by the person simply avoiding these tools – if trusted friends start using them without careful thought, an attacker can exploit data beyond their target’s control.

Robert Scoble may be able to have his entire life public, and in an ideal world, perhaps everyone else could too. The difficult reality, however, is that people in a broad range of circumstances require a greater degree of privacy to thrive socially – and at times, even to survive.

Of course, Scoble is far from alone in his outlook. I often see reactions to various stories that include sentiments I can describe at best as oversimplifications or misunderstandings. In some cases, these ideas seem to carry an appalling amount of arrogance as well. I’ll give four examples with short rebuttals:

“No one cares about what you ate for breakfast.” What if you died of poisoning one morning? Suddenly your family, the police, and many other people would care very much about your breakfast. But while I could offer dozens of other similar scenarios, they can distract from a more important point: Who are you to decide whether anyone cares about my breakfast? Why should I or others rely on your judgment in determining the value of the information that I choose to share? We all know people who care about details as mundane as our meal choices simply because of their relationship with us, even if that knowledge seemingly provides them no tangible benefit (unlike the poison investigation).

“What use would basic profile data be to a malicious third party? Disclosing it would not really matter.” This perspective includes an informal logical fallacy familiar to many in the scientific community: an argument from incredulity. In other words, since the questioner cannot imagine a certain scenario happening, it must be impossible. As before, I could easily frame a few situations where simple information disclosure could cause serious consequences for a given user (and the Google Buzz roll-out provided real-life examples) but doing so would fail to address the real issue: Only a profile’s owner has the knowledge and background required to outline all possible implications of disclosing their particular bits of information to various other parties.

“If you don’t want everyone to see certain content, you shouldn’t post it online to begin with.” Nearly everyone who routinely interacts with websites sends them content that carries expectations of confidentiality. Would you be comfortable with sites publicly sharing your credit card information? After all, you’re not liable for unauthorized charges, a point Blippy noted after a few of its customers’ credit card numbers leaked out on Google. The flexible nature of the Internet has always allowed people to share content in a way that limits the audience. Nothing technological has to prevent users from enjoying degrees of disclosure between encrypted e-mail transfer and publicly indexed web pages.

“Participating in social media is a choice. If you don’t like Facebook/Twitter/etc., don’t use it.” This advice assumes that personal choice is the only determining factor for using a social media service. Under the same assumption, I could argue that driving a car, using a mobile phone, having indoor plumbing, and buying groceries instead of farming are also choices no one is forced to make. Many Facebook users could leave the service in the sense that doing so would not affect their physical survival, but many of them cannot leave Facebook without significant negative effects on social, relational, and perhaps even economic aspects of their lives. Once again, few of us are in any position to evaluate such situations for other individuals.

In essence, no social media executive can assume that he or she understands the ramifications of reducing user control over information. No algorithm can make the same social judgments a human being can. And yet, what sort of trends do we see in the market? As an example, Facebook has gradually widened the definition of “publicly available information” while also adding features that aggregate and publicize data unexpectedly.

As Bruce Schneier notes in an excellent video presentation, however, you and I are not Facebook and Google’s customers. We are their products. They sell information about us, and hence they have a business interest in us sharing more information with more people. Yet for us, this approach tends to increase the amount of noise we deal with. I would submit that the market for online social networking needs to shift towards a model where business interests somehow align with users’ best interests. Obviously such a proposal is easy to state but difficult to implement and monetize, but it’s time we started rethinking how we approach these services.

For instance, many social networking sites have been structured more around technological paradigms than social ones. Most sites include a private messaging feature generally intended for confidential, one-on-one communication, then a method for sharing information that’s generally public, but perhaps includes features for limiting the audience. Perhaps we should design a more fluid communications system that reflects the sort of individual and group interactions we make offline or shoehorn into existing online services.

Another practical step towards ensuring user privacy would be to implement restrictive default settings. Which would be worse for the user: posting content privately that was intended to be public, or posting content publicly that was intended to be private? Rather than require a user to complete long lists of privacy settings prior to engaging with a service, keep content locked down by default and make it simple for a user to then open up their content more broadly.
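As a rough illustration of "locked down by default," here is a minimal sketch in Python. The `Audience` levels, the `Post` class, and the `visible_to` helper are all hypothetical names invented for this example; they do not describe how any real service stores its settings. The only point is that a new post starts at the narrowest audience and widening it takes one explicit step.

```python
from dataclasses import dataclass
from enum import Enum


class Audience(Enum):
    # Hypothetical audience levels, from narrowest to widest.
    ONLY_ME = 1
    FRIENDS = 2
    EVERYONE = 3


@dataclass
class Post:
    text: str
    # Restrictive default: nothing is shared unless the author says so.
    audience: Audience = Audience.ONLY_ME

    def widen(self, audience: Audience) -> None:
        """One explicit action to open the post up more broadly."""
        self.audience = audience


def visible_to(post: Post, viewer_is_author: bool, viewer_is_friend: bool) -> bool:
    """Decide whether a viewer may see the post."""
    if viewer_is_author:
        return True
    if post.audience is Audience.EVERYONE:
        return True
    return post.audience is Audience.FRIENDS and viewer_is_friend


post = Post("Dinner at my place on Friday")
print(visible_to(post, viewer_is_author=False, viewer_is_friend=True))
post.widen(Audience.FRIENDS)
print(visible_to(post, viewer_is_author=False, viewer_is_friend=True))
```

With this arrangement, a mistake in the default direction means a friend misses a post, not that a stranger sees one.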

Privacy is not dead, but many of today’s web applications seem intent on killing it. We desperately need alternatives that empower users with intuitive, defensive privacy controls. Note that by calling for better privacy models, I’m not saying we should avoid public sharing. If users want to live as Robert Scoble, a social media service need not stand in their way. (While Facebook once had more restrictive privacy defaults, it also used to prevent most content from ever leaving the site.) But rather than assume most people are Scobles, we need to find value in also enabling less-public sharing and protect the information that users themselves value.

I do agree with Scoble on one point: “We are all going to have to learn new ways to deal with privacy.” I also see a grand opportunity for entrepreneurs to help shape those “new ways” while keeping privacy very much alive.

In the wake of last week’s Facebook announcements, people have begun dissecting more of the technical details involved and adding various critiques. One point of discussion has been Facebook’s use of the buzzword “open,” with some observers feeling the description masks certain negative aspects of the new Open Graph.

But amid all the debate about openness, critics and supporters alike seem at times to inadvertently conflate three different (albeit related) technologies. First, the Open Graph Protocol defines a structure for website authors to provide certain bits of metadata (such as title, type, description, location, etc.) about their pages. Second, Facebook is expanding their “social graph” concept by building a database of connections among people, brands, groups, etc. The label “Open Graph” has been variously applied to this new map. Finally, the social networking site has introduced new methods for accessing these stored connections as part of their Graph API.

From a technical perspective, each of these offers great potential. But as they are currently being implemented, they still face difficulties that may hinder Facebook’s vision of the Semantic Web. In fact, while Facebook may have brought certain Semantic Web ideas to a more mainstream audience, they have not addressed some of the issues that have stymied advocates of similar technologies – including criticisms found in Cory Doctorow’s famous “Metacrap” essay from 2001. But first, I think it worthwhile to explore some of the details of Facebook’s three new components.

According to the spec’s website, the Open Graph Protocol is an RDFa vocabulary created by Facebook, though “inspired by” a few other related specs. Four properties are required for every OGP-enabled page, providing a title, type, image, and canonical URI. Optional fields include a description, a site name, location data, certain product codes, and contact information. Since OGP uses RDFa, each of these properties is specified via “meta” tags in the page’s “head” element.
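To make the four required properties concrete, here is a small sketch in Python that reads OGP "meta" tags from a page and reports which required ones are missing. It uses only the standard library's `html.parser`. The property names `og:title`, `og:type`, `og:image`, and `og:url` are the ones the protocol uses for the title, type, image, and canonical URI; the sample page and its `example.com` URL are made up for illustration, and the `read_ogp` helper is my own name, not part of any Facebook tool.

```python
from html.parser import HTMLParser

# The four properties the protocol requires on every OGP-enabled page.
REQUIRED = ("og:title", "og:type", "og:image", "og:url")


class OGPParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, content)


def read_ogp(html):
    """Return (properties found, list of required properties missing)."""
    parser = OGPParser()
    parser.feed(html)
    missing = [name for name in REQUIRED if name not in parser.properties]
    return parser.properties, missing


# Made-up sample page that forgets the image property.
SAMPLE = """<html><head>
<meta property="og:title" content="The Rock"/>
<meta property="og:type" content="movie"/>
<meta property="og:url" content="http://movies.example.com/the-rock"/>
</head><body></body></html>"""

props, missing = read_ogp(SAMPLE)
print(props)
print("missing:", missing)
```

Note that nothing in this parsing step checks whether the values are true; it only checks that they are present.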

Anyone is free to implement OGP in their pages or consume it with their services, as the technology is published under the Open Web Foundation Agreement 0.9. In that sense, the spec is certainly “open,” though some seem disappointed that the label is applied to a vocabulary apparently developed privately by one company without feedback from others. While Facebook does note already published standards they drew on for inspiration, OGP at times seems to be reinventing the wheel a bit. (Update: One reader pointed out to me that Facebook’s approach uses RDFa to specify data in a separate namespace, so my criticism may have been unjustified.) For instance, the HTML spec has always included a way to specify a page’s description via a “meta” tag – a feature many abused in the past to improve search rankings.

Facebook will not be immune to such abuse in their new namespace for metadata. Doctorow’s first problem with “meta-utopia” was that people lie. In my testing thus far, the OGP properties of title, canonical URI, and site name are essentially arbitrary. This means that not only can page authors add “like” buttons for other pages, they can add false metadata that produces deceptive feed stories. For instance, a feed story may say that a user “liked The Rock on IMDb” when the story links actually point to a malware host. If Facebook wants to build a semantic search engine, they will still have to deal with old black hat SEO tricks.
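One crude defense a consumer of this metadata could try is comparing the host a page claims as its canonical URI against the host the page was actually fetched from. The sketch below is my own heuristic, not something Facebook is known to do, and the function name and `.example` hosts are invented. It would also flag harmless cases, such as a site that serves pages from one subdomain and declares another, so treat it as a starting point only.

```python
from urllib.parse import urlsplit


def claims_other_site(page_url, og_url):
    """Return True when the declared canonical URL is on a different
    host than the page that declares it."""
    page_host = (urlsplit(page_url).hostname or "").lower()
    og_host = (urlsplit(og_url).hostname or "").lower()
    return page_host != og_host


# A page on one host claiming to be a page on another host.
print(claims_other_site("http://malware.example/landing",
                        "http://movies.example/the-rock"))
# A page declaring a canonical URL on its own host.
print(claims_other_site("http://movies.example/the-rock?ref=feed",
                        "http://movies.example/the-rock"))
```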

In addition to OGP properties, Facebook checks pages for an “fb:admins” parameter that sets which Facebook users can administer analytics and information for a given website. Since the site requires no further authentication, I find it a bit disconcerting that a simple XSS hole could provide an attacker with access to so much power for a site that heavily integrates with Facebook. I was glad to see that redirection techniques or spoofed metadata did not enable cross-domain application of “fb:admins”, but I’m still unsure of how some cross-domain (or cross-subdomain) issues will factor in to Facebook’s graph technologies.

Ironically enough, Facebook has yet to add OGP metadata to their own pages, and the new “like” button will not work for pages on facebook.com domains.

While the OGP can help authors describe individual pages, it does not include any way of establishing links between pages. That’s where Facebook’s ambitions become perhaps a little less “open.” The Open Graph of connections between Facebook profiles and OGP-enabled pages is housed on Facebook’s servers. The company does offer many simple ways for other applications to add or access edges of the graph, including the new Graph API. But Facebook is the gatekeeper, and some fear what that control could produce. Also, while Facebook has updated their privacy policy to reflect recent feature changes, their terms of service still include a clause about accessing data using “automated means.” Consequently, I’m still not entirely certain how much of the Open Graph can be automatically replicated.

Apart from concerns about control, however, the new Open Graph opens many possibilities by providing a set of links between pages and people with far more structure than the hyperlinks crawled by search engines today. But several factors may limit the possibilities. If sites do not implement OGP metadata in their pages (and that will include a significant percentage for the foreseeable future), Facebook has to infer data from the page. As already noted, data poisoning could become a significant factor. Maintaining a complex database will also require other types of maintenance, and currently the Open Graph can lead to issues of redundancy or caching of expired data.

If all website authors sought to protect their visitors and provide accurate, structured information on their pages, Facebook’s Open Graph would be a fairly certain success – but then again, it may not even be needed in that case. Meanwhile, since we have to take into account a range of problems and attacks when indexing online content, Facebook will still have to address basic problems encountered by past implementations of Semantic Web ideas. The company’s vision for mapping connections is ambitious, but plenty of work still remains.

As most major news organizations and blogs have covered the changes that Facebook has made from a high level, I wanted to focus this post specifically on Facebook’s “Open Graph”, “Social Plugins” and “Instant Personalization”. In my opinion, these are three changes that will significantly impact the way you and your friends use Facebook. As I usually do, I will provide a point of view from the eyes of an attacker. As we all know, it’s only a matter of time before these new features begin to be abused by attackers.

Open Graph
The first significant change is Facebook’s “Open Graph”. Open Graph is a significant departure from Facebook’s previous data connection strategy which used to be centered around Facebook Connect. All of that is gone and replaced with Open Graph. Open Graph basically allows partner websites and Facebook applications to share your public information and the public information of your friends with each other. The other big change which is a departure from Facebook Connect is that developers can hold your data indefinitely. The requirement was previously only for 24 hours (and we all know developers weren’t really holding to that anyway).

What’s also interesting is that Facebook has implemented an API called the Graph API. The Graph API is how developers can easily integrate their applications with this new stream of user data. In fact, now you don’t even need a Facebook account to search the Open Graph. For example, https://graph.facebook.com/search?q=facebook&type=post will show you 25 recent status updates. Note that these status updates are set to Everyone, and it seems that Facebook has put a limit on the data you can retrieve with one query (this will most likely change, or you can figure out ways around it). Before, you had to log in to Facebook to do a search or use some creative Google queries for this information. This is good news for attackers, spammers and data miners. Facebook has made publicly available information even easier to search for and, in my opinion, is going to start competing with Google for personalized search results. Stay tuned, Open Graph is going to be a huge area that I will be focusing my research on. As a penetration tester, my job just got easier. Thanks Facebook!
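For anyone scripting this kind of lookup, the search URL above is just an endpoint plus two query parameters, `q` and `type`. Here is a minimal Python sketch that builds such URLs with the standard library. It only constructs the string and makes no network request; the `search_url` helper is my own name, and the endpoint is simply the one quoted in this post, which may not behave the same way in the future.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the post above.
GRAPH_SEARCH = "https://graph.facebook.com/search"


def search_url(query, object_type="post"):
    """Build a search URL from a query string and an object type."""
    return GRAPH_SEARCH + "?" + urlencode({"q": query, "type": object_type})


print(search_url("facebook"))
# prints https://graph.facebook.com/search?q=facebook&type=post
```

Using `urlencode` instead of string concatenation means search terms with spaces or punctuation are escaped correctly.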

Social Plugins
Social plugins are small bits of code (the “Like” button for example) that you probably have been seeing all over the web. What Facebook has done is added simple plugins that web site developers can easily integrate. Also note that there are many more plugins available besides the “Like” button. Simply run the wizard, fill in a few lines and you’re done. Let’s take the “Like” button as an example. If you are signed into Facebook (or not) you will see the button just like you do on Mashable:

Clicking on the button while you are signed in to Facebook posts a notice to your news feed that you like Mashable. The button also works when you are not logged into Facebook by prompting you to sign in. This is similar to how Facebook Connect worked. If you want to “unlike” the page, simply click the “Like” button again. Already, someone has found a potential security problem with the “Like” button that could possibly be abused by spammers. Keep in mind that these social plugins are part of Facebook’s strategy to take over the world (or rather, to integrate their Open Graph protocol). Once Open Graph starts to be more popular, you will see lots more attacks leveraging these new plugins.

Instant Personalization
Lastly, we have “Instant Personalization”. Instant Personalization is the feature in which Facebook has “pre-approved” third-party web sites to gain access to your public information just by visiting them. There is very little information available currently on how Facebook approves third-party sites. Once you allow these sites full authorization, they have the same access that any developer would have to your Facebook information. For example, here is what it looks like when you surf to Yelp. You will get a pretty blue bar that shows up at the top of your browser window:

You should notice that you have the option to “Learn More” or say “No, thanks”. You will also notice how instantly, if any of your friends on Facebook are using Yelp you can see any of their activity just below the blue bar.

Now something interesting happens once you visit one of these pre-approved sites. I noticed that a Facebook application (in this case Yelp) gets installed and is granted permission to post. You don’t even have to click “No thanks”; the application is already installed. Pandora and Microsoft Docs work the same way. In fact, when testing the Microsoft Docs personalization I noticed the Facebook application that gets installed sets its privacy permissions to EVERYONE and allows one-line posts on your behalf. This means that anyone can see any activity that is posted by that application. Keep in mind that these controls are all being closely looked at by attackers and I suspect that we will see some hacks and/or abuse of this new personalization system soon.

Instant Personalization Privacy Settings
Facebook has put in a global “opt-out” check box in your privacy settings. Of course in typical Facebook fashion they have buried this setting so it’s hard to find. Ironically, just as I was writing this post Facebook changed the location of this setting. So now you have to go down one more level by clicking an additional button to get to the setting (see the screen shot below).

There are some very important caveats about this setting. First, this setting is enabled by default. Yes, that’s right. If you have a Facebook account this setting is checked right now and you are opted in. I had thought that Facebook would have learned from the Beacon fiasco but it appears they haven’t. Secondly, just because you “opt-out” doesn’t mean your information is safe. Just as with other Facebook applications, if your FRIENDS use Yelp, Pandora or Microsoft Docs, these sites can still get your public information or anything else you have made available to be shared with friends. To completely opt out you need to MANUALLY block each and every application (in this case Yelp, Pandora and MS Docs). It goes without saying, this is a huge pain and I look forward to the long list of complaints and privacy concerns regarding this pseudo opt-out. The other problem is that I have already seen posts by Facebook saying that they already have partner sites that they are going to announce soon. What this means is that if you want to truly “opt-out” you need to keep up to date on all the new third-party partners with Facebook and manually block their applications. This is a terrible control in my opinion.

So where are these settings? Click on Account –> Privacy Settings –> Applications and Websites –> Instant Personalization (Click the Edit Settings button). In the screen shot below you can see the box that you need to uncheck.

UPDATE: Yvan Boily on Twitter mentioned that you should also uncheck every box under “What your Friends can share about you” in your privacy settings (in my guide on SocialMediaSecurity.com this is what I recommend as well).


Earlier today, Facebook held a developer conference called f8 and took the opportunity to announce a number of new features that impact both developers and average users. I’ve assembled a non-exhaustive list of several important changes the company described, along with a summary of each change and a quick pro/con evaluation from my perspective. I’ll be looking at these and other new features in-depth over the next several days.

The Open Graph

While Facebook has often talked about how its users’ friend relationships form a “social graph,” the company is now focused on creating a broader “open graph.” This is essentially a map of connections between people, companies, products, websites, and so on. When you list your interests and tastes on your profile, you’re helping build this structured database of links.

Pros

In many ways, this idea echoes the vision of a “Semantic Web” that others have outlined in the past. In fact, World Wide Web creator Tim Berners-Lee has long called for building a similar structure.

Facebook’s implementation includes simple ways for sites to add usable information about them, and they’ve built a simple interface for accessing data on pieces in the graph.

Cons

While this graph may be “open” for contribution and access, it’s definitely controlled by Facebook alone. That setup has obvious business, political, and philosophical implications, but centralized administration of such a graph has technical trade-offs as well, such as dependence on a single point of failure.

Facebook’s new version of the Semantic Web still carries many of the same issues as older versions, such as major privacy concerns, data poisoning, and data inconsistencies.

Universal Social Experience

In today’s keynote, Facebook CEO Mark Zuckerberg often talked about the high-level goal of enabling social experiences for users across the entire web. By combining the latest features Facebook offers, any site can bring identity and relationships into its own ecosystem.

Pros

Much of the information that you encounter on sites today is generic and requires that you spend time sorting or searching to make the site more relevant. With data from your part of the open graph, sites could customize and optimize in a way that’s tailor made for you, providing more relevant content right away.

This approach greatly reduces friction on other sites as well, since you won’t have to go through the tiresome process of setting up a new account, remembering another password, and trying to find people to connect with or useful content.

Cons

One person’s feature is another person’s privacy violation. However well-intentioned other sites may be, their “social experiences” can fail to recognize the value of anonymity or take into account a rightful degree of user control.

As others have pointed out previously, since this type of optimization often centers around your established relationships, it can create an echo chamber effect and further isolate socioeconomic or ideological groups from each other.

Instant Personalization

This is the marketing term for a feature Facebook first introduced earlier this year. The company has partnered with certain “pre-approved” websites that can now automatically identify a Facebook user at their first visit. The sites can also access what Facebook classifies as publicly available information.

Pros

This is a more specific example of Facebook’s vision for social experiences reducing friction. The feature is aptly named “instant,” as it basically sets up a user’s account on another site without any interaction, a behavior some may find very convenient.

From a privacy standpoint, Facebook has included a global opt-out under users’ application privacy settings, and clearly indicates when this sort of automatic authentication takes place with a banner at the top of the site.

Cons

The feature still raises a number of privacy concerns, and essentially repeats several of Google’s well-documented mistakes with the launch of Buzz. And while a full opt-out does exist, users are opted in by default. This personalization will likely be the source of many surprises and violated expectations.

Facebook controls who has access to the setup, and currently it’s not entirely clear how sites can become pre-approved or how much the program will expand in the future. The privacy controls also lack some clarity, as the opt-out does not cover information shared by friends who use instantly personalized sites.

Social Plugins

Any web site now has access to a range of simple tools that add Facebook features, such as “liking” a page and publishing approved stories to a user’s news feed. These widgets also replace some of the options previously offered to developers under Facebook Connect.

Pros

Facebook has built these plugins with ease of deployment in mind, and they drastically reduce the complexity of integrating with the service. Many developers will be pleased with the simplicity of these functions.

From a security perspective, Facebook’s approach also sets up a barrier between the external site and the Facebook content the user sees. While the like buttons and friend pictures may seem to be simply part of the page, they actually reside in a separate data space from the rest of the page’s content until you choose to authorize access for the other site. This helps protect both the developer and you as a Facebook user.

Cons

In practice, the deceptive appearance just described may mislead many users into thinking that Facebook is exchanging far more data with other websites than they actually are. This will likely lead to some unwarranted panic.

These plugins do rely in many ways on developers providing accurate data, and it’s likely we’ll see these features abused by scam artists and distributors of malware. Currently, the plugins seem to lack certain authentications that may lead to unintended consequences.

OAuth 2.0

As part of a more streamlined development experience, Facebook has launched a technology called OAuth 2.0 for authenticating applications and websites. This replaces the proprietary model the site had been using and should once again simplify building Facebook-enhanced services.

Pros

This is a major validation for an open standard many companies have helped put together. Many developers will be encouraged to see Facebook choosing OAuth over a proprietary system.

As already mentioned, this is another way that Facebook has simplified application development. OAuth should reduce confusion over how other sites can access Facebook information.

Cons

While perhaps not a completely fair point, I’ll note that the use of OAuth does not diminish the threat of application-based attacks through vulnerabilities known as XSS and CSRF.

A number of other sites, such as Twitter, have used OAuth for some time, but this is a major roll-out of a very new version. We may see new security issues related to Facebook’s implementation.

Facebook Credits

At f8, Facebook expanded on their plans to offer a virtual currency system for application payments. Several applications are already using Facebook Credits, but we’ll likely see far more implementations in the near future.

Pros

Yet again, this system helps reduce friction. For developers, Facebook offers a simple way to include payments without having to worry about a number of implementation details.

Also, for users, virtual currency can reduce the hassle of worrying about issues such as international currency conversion.

Cons

Since Facebook is already facing widespread criticism over privacy issues, some users may hesitate to add credit card information to their Facebook profiles, even if it can only be accessed by Facebook.

This service makes Facebook a middleman in potentially millions of dollars of transactions, and could raise liability issues.

Granular Data Access

Though perhaps overlooked, Facebook made good on their promise to include more granular permissions when applications request user information. This feature comes in response to concerns raised by Canada’s Privacy Commissioner last fall. With the new setup, applications will have to individually request private profile fields when a user chooses to authorize.

Pros

This change will immediately provide more transparency and accountability, since users will see listed out exactly what fields an application will want access to when they authorize.

Many users may simply click through anyway, but the new system may raise awareness for many users who did not previously understand the range of information applications could access. Seeing a greedy list of data fields may give users pause.

Cons

Since announcing granular access last fall, Facebook has radically changed the definition of what constitutes “private” information. Consequently, many of the fields that might have been included in this setup are now considered “public” and thus generally outside access controls.

While commendable, this change may not lead to any substantial changes in practice. The model relies on developers limiting their requests, and many users will probably still want access to applications that ask for all information.

Persistent Data Storage

Until this week, applications and Facebook-enabled websites could not store most information accessed via the Facebook API beyond 24 hours. Now, Facebook has removed this time limit, meaning developers can save user data for as long as they want.
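The difference between the old and new rules can be pictured as a cache with and without an expiry time. The Python sketch below is purely illustrative: `ProfileCache` and its parameters are names I made up, and real applications would keep this data in a database, not an in-memory dictionary. Passing `max_age=DAY` models the old 24-hour limit, while the default of `None` models the new keep-as-long-as-you-want policy.

```python
import time

DAY = 24 * 60 * 60  # seconds


class ProfileCache:
    """Stores fetched profile data, optionally expiring old entries."""

    def __init__(self, max_age=None, clock=time.time):
        self.max_age = max_age  # None means entries never expire
        self.clock = clock      # injectable so expiry can be tested
        self._entries = {}

    def put(self, user_id, profile):
        self._entries[user_id] = (self.clock(), profile)

    def get(self, user_id):
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        stored_at, profile = entry
        if self.max_age is not None and self.clock() - stored_at > self.max_age:
            # Too old under the time-limited policy: drop it and force a refetch.
            del self._entries[user_id]
            return None
        return profile


old_policy = ProfileCache(max_age=DAY)
new_policy = ProfileCache()
old_policy.put("user-1", {"name": "Alice"})
new_policy.put("user-1", {"name": "Alice"})
print(old_policy.get("user-1"), new_policy.get("user-1"))
```

Under the time-limited policy a stale copy eventually disappears on its own; without the limit, whatever was fetched stays in the application's storage until the developer chooses to delete it.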

Pros

This change will significantly reduce overhead for both developers and Facebook, since applications will no longer have to exchange data with the service each day a user connects.

Users will likely see some performance gains from applications, since they can cache data locally rather than constantly checking with Facebook before rendering content.
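To make the caching trade-off concrete, here is a minimal Python sketch of an application-side cache. Everything in it is hypothetical (the `ProfileCache` class, the `fetch` callback standing in for an API call, and the injectable clock); it is not Facebook's API, only an illustration of how removing a 24-hour expiry cuts the number of round trips.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ProfileCache:
    """Toy cache for profile data fetched from a remote service (hypothetical)."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        max_age_seconds: Optional[float],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch              # stand-in for a remote API request
        self._max_age = max_age_seconds  # None means "keep indefinitely"
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}
        self.remote_calls = 0            # counts how often we had to go remote

    def get(self, user_id: str) -> Any:
        now = self._clock()
        entry = self._store.get(user_id)
        if entry is not None:
            stored_at, value = entry
            # Serve from cache if there is no expiry or the entry is still fresh.
            if self._max_age is None or now - stored_at < self._max_age:
                return value
        # Missing or expired: fetch again and remember when we stored it.
        self.remote_calls += 1
        value = self._fetch(user_id)
        self._store[user_id] = (now, value)
        return value
```

With `max_age_seconds=24 * 60 * 60`, a user who returns once a day triggers a fresh fetch on every visit; with `max_age_seconds=None`, the first fetch is the only one. The same property that saves the round trips is what leaves a long-lived copy of the data sitting in the application's own storage.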

Cons

Facebook applications will now be far more valuable targets for attackers. If a popular application suffers a database compromise, millions of users’ private information could be put at risk. Hacking Facebook directly tends to be difficult, but many applications lack the same level of security.

Persistent storage also increases opportunities for behavioral targeting and visitor tracking, since third-party developers will now be able to maintain complete archives of profile information.

Yesterday, Facebook announced two new features: Community Pages and “connections” for certain profile information. The first combines some of the generic fan pages that have become popular over the last few months with Wikipedia articles to create a sort of social encyclopedia. I’m not entirely clear on what Facebook envisions with this feature, but it will be interesting to watch it develop.

The second feature, however, has attracted much more attention, and rightfully so. I’m still sorting through the details and have not yet seen the new connections in action, but certain parts are pretty clear. Facebook is replacing the manual lists in parts of the “info” tab on your profile with lists of fan pages you connect with. Along with the new setup, Facebook is changing the “Become a Fan” buttons to “Like” buttons. If you want to connect with a page for something you’re interested in, you will now simply “like” the page.

In a blog post, Facebook spun the connections as an exciting improvement: “Instead of just boring text, these connections are actually Pages, so your profile will become immediately more connected to the places, things and experiences that matter to you.” I can see three main reasons why Facebook would make this change, and none of them involve text being boring.

First, this helps software more easily process your interests. With textual lists, you may find titles such as these under a user’s favorite movies: “LOTR,” “Lord of the Rings,” “Lord.Of.The.Rings,” “***Lord of the Rings!***”, “i just LOVE lord of the rings so much,” etc. It’s obvious to a human that these all refer to the same trilogy of movies, but not to a computer. By essentially turning sections of your profile into database relationships, Facebook can take all of these disparate descriptions and replace them all with a link to an official Lord of the Rings page.
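As a rough illustration of why free text is awkward for software, here is a small Python sketch that maps the variants above onto a single identifier. The alias table, the `page:lord-of-the-rings` ID, and the matching rule are all made up for this example; I have no knowledge of how Facebook actually performs the mapping.

```python
import re
from typing import Dict, Optional

# Hypothetical alias table: normalized phrase -> canonical page identifier.
ALIASES: Dict[str, str] = {
    "lord of the rings": "page:lord-of-the-rings",
    "lotr": "page:lord-of-the-rings",
}


def normalize(text: str) -> str:
    """Lowercase the text and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def resolve(text: str) -> Optional[str]:
    """Return the page ID whose alias appears as a whole phrase, or None."""
    padded = f" {normalize(text)} "
    for alias, page_id in ALIASES.items():
        # Padding with spaces keeps matches on word boundaries.
        if f" {alias} " in padded:
            return page_id
    return None


entries = [
    "LOTR",
    "Lord of the Rings",
    "Lord.Of.The.Rings",
    "***Lord of the Rings!***",
    "i just LOVE lord of the rings so much",
]
pages = {resolve(entry) for entry in entries}
```

All five entries resolve to the same ID, so `pages` ends up holding a single value. Once the profile stores that ID directly instead of the text, none of this guesswork is needed, which is exactly the advantage a connection to a Page has over a typed list.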

Second, the shift to “liking” reduces friction. The semantics may be subtle, but I’m sure Facebook has done research on this. “Liking” implies a simple, casual gesture (represented by the thumbs up icon), while “becoming a fan” or “subscribing” carries more of a commitment and desire for further interaction. I’m guessing users are far more likely to say they “like” something than “become a fan” of it, and Facebook wants users to connect and share as much as possible.

Third, this increases the useful data Facebook can offer to others. It’s likely that a large majority of Facebook’s users currently have privacy settings that only allow friends to see the “boring text” in their profiles. But since last fall’s privacy changes, connections to fan pages are now considered publicly available information. By taking the simple step of “liking” a page, users will add an easily processed connection that certain sites and applications will be able to access when visited.

Since the new setup has obvious privacy implications, Facebook added privacy controls, but unfortunately, they seem to add further confusion. As Facebook notes, the new settings relate only to profile visibility: “You can control which friends are able to see connections listed on your profile, but you may still show up on Pages you’re connected to.” This is yet another example of Facebook making information appear to be private without actually making it private. As TechCrunch writer Jason Kincaid aptly put it, “In short, this section is about the data on Facebook that you can’t actually control. You can make it harder to find, and even hide it from your profile, but you can’t remove it entirely.”

Facebook stands to gain enormously from users embracing these new profile connections, and fan pages within Facebook are only the beginning. Tomorrow is f8, a developer conference hosted by Facebook, and the company will likely be introducing several new features and plans, such as adding location information to wall posts. Inside Facebook has an excellent round-up of what to expect. Several of these changes will likely have a significant impact on user privacy; I expect we’ll hear more detail about pre-approved Facebook Connect sites gaining automatic access to user data. Another item of interest will be the Open Graph API, which takes the “liking” behavior described above and extends it to any website.

That means that rather than simply say you’re a fan of Social Hacking, for instance, you could potentially “like” theharmonyguy.com. In other words, you could create a connection between your profile and a given URI (website address). That opens up many new possibilities, but once again adds significant information to your public profile.

As I said, certain details are still not clear to me; for instance, Facebook seems to have backtracked on whether your list of friends is publicly available information, and says that fan page connections will not be public for minors. I’ll certainly be watching to see what Facebook announces tomorrow, and will likely have much more to say about it in the next week or so. (In fact, I’ve been holding off on a few posts until I see how the f8 announcements will impact the issues they deal with.) I should also have shorter, quicker updates throughout the day tomorrow on my Twitter feed.

A few weeks ago, I sent Facebook a demonstration of what appeared to be a previously unknown attack combining two behaviors of the Facebook Platform. The technique allowed one to create a seemingly innocent web page that would invisibly and silently steal a visitor’s private Facebook content. Facebook has now disabled the attack by modifying one of the exploited behaviors.

It’s unlikely that any real-world attacks used this particular vulnerability, and I certainly have no record of such a case. But it’s also unclear how long the problem has existed. I discovered one part of the technique, a “return_session” parameter for application authorization, while examining the behavior of the Yahoo! contact importer, which only launched a month ago. However, discussions on Facebook’s developer forum mention the parameter in the context of Facebook Connect implementations as far back as February 2009. The other main component, now modified by Facebook, may have existed since the beginning of the Platform in 2007.

In my proof-of-concept demonstration, I loaded a harmless-looking web page on a server external to Facebook. The page included code for an inline frame sized to be invisible to the user. This frame then loaded the login page for a Facebook application. If the user has already authorized an application, its login page will automatically forward to the application, and that’s exactly what I wanted to happen. I chose FarmVille for my demo, since it has a wide install base. Keep in mind that while FarmVille currently lists about 83 million monthly active users, the attack would have worked for anyone who has authorized the application, regardless of how long ago. The attack could also target multiple applications at once using multiple iframes, meaning nearly any of Facebook’s 400 million active users could have fallen prey.

But the first main component of the attack involved a slight modification to the login page URI. By adding a “next” parameter, one can specify an alternate landing page for authorized users. Not all applications take advantage of this parameter, but many do. The parameter would not work for an arbitrary site, but Facebook previously did allow any URI that began with apps.facebook.com. Thus one could craft a login page URI that checked whether the user had authorized one application and then forward the user to a second application.

The next part of the attack came from adding “return_session=1” to the login page URI. This parameter causes Facebook to append particular session variables for the authorized application onto the URI of the landing page, which in our case is the second application given by the “next” parameter. That application merely has to check its address for the session data, which provides enough information to execute API requests using the credentials of the already authorized application. Since an authorized application essentially operates on behalf of a user, it has access to nearly all private profile information (essentially, everything but your e-mail address and phone number) and content (photos, links, notes, etc.) that can be loaded via the API, and hence the second application had such access as well. This entire process could be fully automated without any user interaction and did not require any authorization for the second application. Also, the attack could generally be executed quickly enough to avoid Facebook’s measures for detecting when their pages are loaded in frames.

To patch the attack, Facebook has restricted the “next” parameter; it now only forwards to addresses for the application specified on the login page, preventing any appended session data from reaching the wrong destination. Since an authorized application already has API access, using return_session with that application will not add any new privileges.
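The difference between the old and new handling of “next” can be sketched in a few lines of Python. This is only my reading of the behavior described above, not Facebook's actual code: the function names, the hostname comparison, and the `example_game` and `second_app` path names are assumptions made for illustration.

```python
from urllib.parse import urlsplit

APPS_HOST = "apps.facebook.com"


def old_next_allowed(next_url: str) -> bool:
    """Pre-patch rule as described: any apps.facebook.com address is accepted."""
    parts = urlsplit(next_url)
    return parts.scheme in ("http", "https") and parts.hostname == APPS_HOST


def patched_next_allowed(next_url: str, app_path: str) -> bool:
    """Post-patch rule as described: the address must belong to the same
    application named on the login page (identified here by a path segment)."""
    parts = urlsplit(next_url)
    if parts.scheme not in ("http", "https") or parts.hostname != APPS_HOST:
        return False
    prefix = "/" + app_path.strip("/")
    # Require an exact match or a sub-path, so "/example_game_x" does not pass.
    return parts.path == prefix or parts.path.startswith(prefix + "/")


# Hypothetical addresses: the authorized app and an unrelated second app.
same_app = "http://apps.facebook.com/example_game/landing"
other_app = "http://apps.facebook.com/second_app/collect"

old_results = (old_next_allowed(same_app), old_next_allowed(other_app))
new_results = (
    patched_next_allowed(same_app, "example_game"),
    patched_next_allowed(other_app, "example_game"),
)
```

Under the old rule both addresses pass, which is what let session data for one application land on another. Under the patched rule only the first passes, so anything appended by return_session stays with the application that already had access.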

I commend Facebook for responding quickly to this issue and for being open to white-hat security reports. But in my opinion, this vulnerability is simply the latest reminder that the Facebook Platform can open users to many problems quite separate from the security of Facebook itself. I personally think that aspects of the Platform’s implementation fail to match user expectations of privacy, as I’ve discussed previously. And while this particular problem may be solved, vulnerabilities in specific applications and the nature of application access continue to put private data at risk of unwanted disclosure.

I don’t take my responsibility as a blogger lightly, and I realize that many readers look to this site for reliable information on privacy and security issues with social networking applications. Consequently, I strive to maintain high standards of accuracy and clarity in my posts. Over the last few years, I’ve set some personal rules for myself, such as reproducing a vulnerability before relaying it here. I would never want to mislead my readers or betray their trust.

However, I must issue an apology regarding what I view as a significant error that I discovered today while researching a new idea. In at least two recent posts, I misrepresented how much information Facebook applications are able to access without explicit authorization. My apologies to Facebook for overstating such access.

Previously, I’d stated that Facebook applications have access to your “publicly available information” and content marked accessible to “Everyone” prior to authorizing the application. In one case, I stated this could be used by a fan page tab to identify users without explicit authorization.

As it turns out, applications only have this automatic access in certain circumstances. According to Facebook’s documentation, such access only occurs when users arrive at an application page from certain Facebook channels and can be affected by strong privacy settings. I misunderstood this process and consequently applied it to situations where it would not actually come into play.

As for fan pages, a tab apparently does not have automatic means of identifying a user and would need to request authentication to access such information.

It bothers no one more than me that I misled my readers on this point, and I will certainly strive all the more to avoid such an error in the future.