Friends or Frenemies? The Increasingly Important Legal Battle over Social Data Extraction Tools

Vol. 4, No. 5

By
Jonathan H. Blavin

Jonathan H. Blavin is a partner at the law firm of Munger, Tolles & Olson LLP in San Francisco, California. His practice focuses on intellectual property and antitrust issues. He can be reached at jonathan.blavin@mto.com.

Less than a year after its release, the burgeoning social networking site Google+ already has over 90 million users.1 As with all social networking platforms, Google+’s success will depend upon how many more users join the network and convince their “friends” to do so as well. With roughly 845 million users on Facebook, 500 million on Twitter, and 150 million on LinkedIn,2 that growth inevitably will mean users of these social networking sites joining Google+.

In the wake of Google+’s release, a number of third parties developed tools that allow users to extract and export social data from Facebook and reconstruct their Facebook friends lists on Google+. For example, the tool “Open-Xchange” takes the first and last names of friends from Facebook and matches those names to other e-mail records in users’ accounts. It relies on a Facebook application programming interface (API) to extract the data. Similarly, an extension for the Google Chrome Internet browser (created by an independent programmer) exports not only names but also e-mail addresses, phone numbers, birthdays, and more from Facebook. This tool does not rely upon Facebook’s API for access, but instead extracts data from Facebook pages themselves.3

These data extraction tools appear to violate Facebook’s terms of use, which state in relevant part that users will “not collect users’ content or information, or otherwise access Facebook, using automated means (such as harvesting bots, robots, spiders, or scrapers) without our permission.”4 Facebook has disabled both data extraction tools, and has threatened enforcement actions against Facebook users employing such tools.5 During an earnings call with investors, Google’s CEO Larry Page implicitly commented on Facebook’s disabling of these tools, stating that “Google as a company believes in users owning their own data and being able to easily move it out of Google. Some of our competitors don’t believe in that. We think users will eventually move to services that are in their best interests and that work really well for them.”6

This article provides an overview of the various legal issues relating to data extraction tools, which are at the center of the increasingly competitive social networking market. In particular, it examines the legality of data extraction tools under the following common law and statutory theories: trespass to chattels; the federal Computer Fraud and Abuse Act (CFAA), 18 U.S.C. §§ 1030 et seq., and its California state law counterpart, the California Comprehensive Computer Data Access Act, Cal. Penal Code § 502; tortious interference with contractual relations; the Controlling the Assault of Non-Solicited Pornography and Marketing Act (CAN-SPAM Act), 15 U.S.C. §§ 7701 et seq.; copyright infringement; and the Digital Millennium Copyright Act (DMCA), 17 U.S.C. §§ 1201 et seq. The article also explores the legality of social networking websites’ technological disabling of data extraction tools that allow users to export data to competing websites.

Common Law Trespass to Chattels

Since the earliest days of the Internet there has been litigation relating to tools used to extract and aggregate data from websites. One of the first and most prominent cases was eBay, Inc. v. Bidder’s Edge, Inc., 100 F. Supp. 2d 1058 (N.D. Cal. 2000). Bidder’s Edge was an auction aggregation site designed to offer online auction buyers the ability to search for items across numerous online auctions without having to search each host site individually.7 It used an automatic crawling tool that searched various auction websites, such as eBay, and “scraped” information from the site. Bidder’s Edge accessed the eBay site approximately 100,000 times a day,8 and the effect of the robots over time was to “consume the processing and storage resources” of eBay’s system.9

The court granted eBay a preliminary injunction preventing Bidder’s Edge from accessing the site. In granting the injunction, the court determined that eBay had established a strong likelihood of prevailing on its trespass to chattels claim.10 The court stated that to establish a trespass claim, eBay had to prove that: (1) Bidder’s Edge intentionally and without authorization interfered with eBay’s possessory interest in its computer system, and (2) Bidder’s Edge’s unauthorized use proximately resulted in damage to eBay.11 The court held that Bidder’s Edge’s use was unauthorized and intentional, as Bidder’s Edge had violated eBay’s terms of use and ignored its requests to stop using its crawlers.12 With respect to the damage requirement, the court found most persuasive that denying an injunction would likely encourage other auction aggregators to crawl the eBay site, potentially to the point of denying effective access to eBay’s customers. If preliminary injunctive relief were denied, and other aggregators began to crawl the eBay site, there appears to be little doubt that the load on eBay’s computer system would qualify as a substantial impairment of condition or value.13

The California Supreme Court in Intel Corp. v. Hamidi, 71 P.3d 296 (Cal. 2003), narrowed the potential scope of the eBay decision. There, the court held that a former Intel employee’s e-mails to current Intel employees, sent despite Intel’s requests to stop, did not constitute a trespass to Intel’s e-mail system. The court rejected the suggestion of the eBay decision that unauthorized use of another’s chattel is actionable even without any present showing of injury, noting that this would not be a correct statement of California or general American law. While one may have no right temporarily to use another’s personal property, such use is actionable as a trespass only if it “has proximately caused injury.” “[I]n the absence of any actual damage the action will not lie.”14

The court made clear that the “injury” required for a trespass to chattels claim must be an “injury to its personal property, or to its legal interest in that property,” i.e., impairing the “quality or value” of a “computer system.”15 The court also rejected injuries premised on “indirect . . . business interests,” such as “reputation, customer goodwill, and employee time,” and held that the time and expense incurred “attempting to block [the employee’s] messages” cannot “be bootstrapped into an injury to Intel’s possessory interest in its computers.”16

Under the reasoning of Hamidi, a social networking site plaintiff would need to establish that a data extraction tool caused an actual, non-de minimis impairment to its physical property or a legal interest in that property, diminishing its quality or value. A plaintiff thus may be unable to establish cognizable injury under a trespass claim premised on indirect business harms, e.g., the resources expended attempting to block a data extraction tool, or the loss of users to competing social networking sites as a result of the tool.

Unauthorized Computer Access Statutory Violations

Social networking plaintiffs whose websites have been subject to data extraction tools increasingly have alleged the violation of federal and state laws that prohibit unauthorized access to computer systems. The federal CFAA prohibits “access[ing] a computer without authorization or exceeding authorized access” and thereby obtaining “information.”17 Section 502 of the California Comprehensive Computer Data Access Act is closely analogous, prohibiting access to computer systems “without permission.”18 Both the CFAA and section 502 make violations criminal offenses. A central issue in cases involving the CFAA and section 502 is whether a data extraction tool’s violation of a website’s terms of use is alone sufficient to render access to the site “unauthorized” or “without permission” and thus give rise to a cause of action.

In Facebook, Inc. v. ConnectU LLC, 489 F. Supp. 2d 1087 (N.D. Cal. 2007), Facebook sued a competing social networking site, ConnectU, that accessed Facebook to collect “millions” of e-mail addresses of Facebook users, and then used those e-mail addresses to solicit business for itself.19 In denying ConnectU’s motion to dismiss the section 502 claim, the court rejected ConnectU’s argument that it only “accessed information on the Facebook website that ordinarily would be accessible only to registered users by using log-in information voluntarily supplied by registered users.”20 The court held that ConnectU was subject to and allegedly violated Facebook’s terms of use, and that such conduct would constitute access to Facebook “without permission” within the meaning of section 502.

In Facebook, Inc. v. Power Ventures, Inc., No. C 08-05780 JW, 2010 WL 3291750 (N.D. Cal. July 20, 2010), the court explicitly disagreed with the ConnectU court on this point. There, Facebook sued Power.com, a website designed to integrate various social networking or e-mail accounts into a single portal. Power.com users determined which social networking sites they wanted to integrate into a single portal, and then provided their user names and passwords to Power.com for those sites. Power.com then “scraped” user information from the accounts into a single portal. The site also asked Facebook users to select which of their friends should receive a Power.com invitation, and then sent those friends unsolicited e-mails to join Power.com that purportedly came from “Facebook” and used an “@facebookmail.com” address.

In denying Facebook’s motion for judgment on the pleadings on the section 502 claim, the court rejected on constitutional vagueness grounds the holding of ConnectU that the mere violation of the Facebook terms of use constituted access “without permission” within the meaning of the statute. The court reasoned that:

[A]llowing violations of terms of use to fall within the ambit of the statutory term “without permission” . . . essentially place[s] in private hands unbridled discretion to determine the scope of criminal liability recognized under the statute. If the issue of permission to access or use a website depends on adhering to unilaterally imposed contractual terms, the website or computer system administrator has the power to determine which actions may expose a user to criminal liability. This raises constitutional concerns. . . .21

For example, would a 12½-year-old user’s creation of a Facebook account, in violation of its 13 years of age requirement, be sufficient to subject him or her to criminal liability under section 502?

In determining where to draw the liability line, the Power Ventures court held that a “distinction can be made between access that violates a term of use and access that circumvents technical or code-based barriers that a computer network or website administrator erects to restrict the user’s privileges within the system, or to bar the user from the system altogether,” and that unlike the former, the latter may “subject a user to liability under Section 502.”22 This focus on technological circumvention essentially turns section 502 into a DMCA-like statute, which is discussed further below. The Power Ventures court recently granted summary judgment for Facebook on its section 502 and CFAA claims, finding that the “undisputed facts establish that Defendants circumvented technical barriers to access Facebook site, and thus accessed the site ‘without permission.’”23

In the CFAA context, the courts are likewise split on the issue. A number of cases have interpreted the CFAA to cover violations of corporate computer use restrictions, website terms of use, and violations of a duty of loyalty.24 The Ninth Circuit, however, recently held in an en banc opinion “that the phrase ‘exceeds authorized access’ in the CFAA does not extend to violations of use restrictions,” but rather applies to “hacking—the circumvention of technological access barriers.”25 Writing for the majority, Judge Kozinski humorously noted that:

Minds have wandered since the beginning of time and the computer gives employees new ways to procrastinate, by g-chatting with friends, playing games, shopping or watching sports highlights. Such activities are routinely prohibited by many computer-use policies, although employees are seldom disciplined for occasional use of work computers for personal purposes. Nevertheless, under the broad interpretation of the CFAA, such minor dalliances would become federal crimes.26

Unlike the damages limitations on a trespass to chattels claim under the Hamidi decision, the CFAA and section 502 allow plaintiffs to recover for resources expended in attempting to block unauthorized access by social data extraction tools. The CFAA defines “loss” to include “any reasonable cost to any victim, including the cost of responding to an offense.”27 Similarly, section 502 provides that any person who “suffers damage or loss by reason of a violation” of the Act may recover “any expenditure reasonably and necessarily incurred by the owner or lessee to verify that a computer system, computer network, computer program, or data was or was not altered, damaged, or deleted by the access.”28 In Power Ventures, the court held that Facebook had suffered cognizable damage under section 502 sufficient to establish standing where it “attempted to block Power’s access” and “expended resources to stop Power from committing acts that Facebook [contended] constituted Section 502 violations.”29

Tortious Interference with Contractual Relations

Even if violation of a website’s terms of use does not provide the basis for a CFAA or section 502 claim, it may nonetheless subject the provider of data extraction tools to liability for tortious interference with contractual relations. The elements of a cause of action for interference with contractual relations under California law are: (1) the existence of a valid contract between the plaintiff and a third party, (2) the defendant’s knowledge of this contract, (3) the defendant’s intentional acts designed to induce a breach or disruption of the contractual relationship, (4) actual breach or disruption of the contractual relationship, and (5) resulting damage.30 An argument can be made that data extraction tools induce end users to breach their agreements with social networking sites, such as the Facebook terms of use discussed above, by giving them the ability to harvest end user information through automated means without authorization. Open issues are whether social networking site plaintiffs can establish that providers of data extraction tools are aware of these sites’ terms of use and that such terms of use are enforceable and binding.

CAN-SPAM Act

Some data extraction tools not only extract and export friend data, but also separately e-mail friends to invite them to join the new social networking site. As noted, ConnectU collected e-mail addresses of Facebook users, and then used those e-mail addresses to solicit business for itself. Similarly, Power.com asked Facebook users to select which of their friends should receive a Power.com invitation, and would then send those friends unsolicited e-mails to join Power.com.

To state a claim under CAN-SPAM, a party must allege that the defendant sent e-mails containing “materially false or materially misleading” header information.31 The statute defines false or materially misleading headers to include “information that is technically accurate but includes an originating electronic mail address, domain name, or Internet Protocol address the access to which for purposes of initiating the message was obtained by means of false or fraudulent pretenses or representations.”32

In the ConnectU case, the court dismissed Facebook’s CAN-SPAM claim, with leave to amend, noting that even if there was “deception in connection with the manner in which ConnectU gathered the destination email addresses,” there was “nothing in the complaint suggest[ing] that emails subsequently sent to those addresses included headers that were misleading or false as to the source from which they originated, or in any other manner.”33 By contrast, in Power Ventures, the court recently granted summary judgment for Facebook on its CAN-SPAM claim, holding that although the defendants were the “initiators” of the e-mail messages, their software program “caused Facebook servers to automatically send the e-mails,” which contained an “@facebookmail.com” address.34 Thus, as the header information did “not accurately identify the party that actually initiated the e-mail within the meaning of the Act,” the header information was “materially misleading as to who initiated the e-mail.”35

Copyright Infringement

Data extraction tools also potentially raise copyright infringement issues. As the court made clear in Power Ventures, “Facebook does not have a copyright on user content, which ultimately is the information that Defendants’ software seeks to extract.”36 Nonetheless, the court held that Facebook had adequately pled a claim for copyright infringement because Power.com had made an unauthorized “cache” copy of the Facebook website into a computer’s RAM, which as a collection of noncopyrighted material arranged in an original way was subject to copyright protection.37

Although users who access Facebook’s website necessarily make a temporary cache copy of the site into their computers’ RAM, they do so with authorization. By contrast, when the Power.com extraction tool accessed the Facebook website, it violated the Facebook terms of use, and thus the complaint had “sufficiently allege[d] unauthorized access” resulting in unauthorized cache copies. As the court noted, “if Defendants first have to make a copy of a user’s entire Facebook profile page in order to collect that user content, such action may violate Facebook’s proprietary rights.”38 The court further held that Facebook adequately pled a claim for secondary copyright infringement, as the utilization of Power.com by Facebook users “exceeds their access rights pursuant to the Terms of Use,” and thus when a Facebook user directs Power.com to access the Facebook website, “an unauthorized copy of the user’s profile page is created” which “may constitute copyright infringement.”39

The Ninth Circuit in MDY Industries, LLC v. Blizzard Entertainment, Inc., 629 F.3d 928 (9th Cir. 2010), held that the creation of cache copies of a video game into RAM while the user played the game in violation of a video game’s license agreement did not alone constitute copyright infringement; rather, there must be a “nexus” between the condition of the license violated and the licensor’s exclusive rights of copyright.40 That case involved the use of unauthorized automated “bots” in the game World of Warcraft. The Ninth Circuit held that the “antibot” provision in the World of Warcraft terms of use was not a “condition” linked to a copyright owner’s exclusive rights, and thus a player using a bot “violates the covenants with Blizzard, but does not thereby commit copyright infringement because [the bot] does not infringe any of Blizzard’s exclusive rights,” such as itself copying the software.41 No court has yet ruled on whether terms of use prohibiting the use of data extraction tools fall on the copyright condition side or the mere covenant side of the Blizzard line, but this promises to be an area of future litigation.

DMCA

Section 1201(a) of the DMCA prohibits “circumvention” or trafficking in tools that “circumvent” a “technological measure that effectively controls access to a work protected under [the Copyright Act].”42 Section 1201(b) of the DMCA prohibits trafficking in technology that circumvents a technological measure that “effectively protects” a copyright owner’s right.43

To the extent social data extraction tools circumvent technological measures protecting against unauthorized access to copyright content, both the traffickers and users of such tools may be liable under the DMCA as well. In Power Ventures, the court held that Facebook had adequately alleged a DMCA claim given that the Facebook terms of use bar the use of “automated programs to access the Facebook website,” which was subject to copyright protection; Facebook had alleged that it “implemented specific technical measures to block access by Power.com”; and the defendants had “attempted to circumvent those technological measures.”44

Disablement of Social Data Extraction Tools

There has yet to be significant litigation regarding the legality of technological measures that disable social data extraction tools, but this too may be an area of legal development. Thus far, courts have been skeptical of claims challenging such measures.

In the Power Ventures case, the defendants asserted an antitrust counterclaim alleging that while “Facebook solicited (and continues to solicit) internet users to provide their account names and passwords for users’ email and social networking accounts,” and runs “automated scripts to import their lists of friends and other contacts” into Facebook, “Facebook simultaneously prohibited (and prohibits) users from using the same type of utility to access their own user data when it is stored on the Facebook site.”45

The district court dismissed this counterclaim on the pleadings, holding that:

Defendants cite no authority for the proposition that Facebook is somehow obligated to allow third-party websites unfettered access to its own website simply because some other third-party websites grant that privilege to Facebook. In fact, the Ninth Circuit has held that merely introducing a product that is not technologically interoperable with competing products is not violative of Section 2 [of the Sherman Act].46

Conclusion

As the social networking world continues to grow and new competitors like Google+ enter the market, the proliferation of data extraction tools undoubtedly will continue. Although courts widely have found that such tools are subject to and potentially violate several intersecting areas of the law, the fast-moving state of the Internet frequently outpaces legal developments. These doctrines raise not only complex issues regarding the legality of social data extraction tools, but also important questions concerning a user’s “right” to export his or her online life from one social networking site to another. It will be interesting to see how both the technology and the law of social data extraction tools continue to evolve in the months and years to come.