Search Engine Privacy

Introduction

Internet search engines are the primary means by which individuals access Internet content. Internet users submit more than 15 billion searches per month. Typically, search engines collect detailed information that is personally identifiable or can be made personally identifiable. This information includes the search terms submitted to the search engine, as well as the time, date, and location of the computer submitting the search. This data is collected for marketing and consumer profiling purposes. Companies also use search engine data to carry out research and compile usage statistics. Search engines also link individuals' names and other personal information with websites and news stories that may be inaccurate, misleading, or harassing.

Search data is one of the most sensitive types of personal information, and its collection and use by Internet firms poses significant consumer privacy risks. As a result of behavioral marketing methods and the potential exposure of sensitive personal information, privacy groups have called for greater protections for search data. Specifically, privacy advocates have called for strict limitations on the collection, retention, and disclosure of information relating to Internet Protocol (IP) addresses. IP addresses are one of the main methods of identifying Internet users. Other methods include browser fingerprinting, tracking cookies, and search query analysis (particularly with regard to vanity searches). Most users are unaware that search engines collect their personally identifiable data. The majority of users polled in 2015 think that online advertisers should not have any information about their online activities.

Federal Appeals Court Revives Google Cookie Tracking Suit: A federal appeals court has reinstated a class action alleging that Google and internet advertising companies unlawfully placed tracking cookies on users' web browsers. A reasonable jury could conclude that Google's "deceitful override of the plaintiffs' cookie blockers" constitutes a "serious invasion of privacy" under California law. The appeals court also held that tracked URLs could constitute "content" under the federal Wiretap Act, though it ultimately upheld the dismissal of all federal law claims for other reasons. EPIC filed an amicus brief in a similar case, arguing that Viacom's disclosure of IP addresses and unique device identifiers to Google violated the Video Privacy Protection Act. (Nov. 12, 2015)

According to a new survey, nine out of ten voters in the United States want the right to delete links to personal information. Those voters say they would support a U.S. law that permits Internet users to ask search companies, such as Google, to remove links to certain personal information. Last May the top court in the European Union established the "right to be forgotten" as a fundamental right, protected by the EU Constitution. EU citizens may require search companies to remove personal information that is inadequate, irrelevant, and inaccurate. The recent US survey bolsters the findings of a previous US survey which found that 61% of Americans supported the right to be forgotten. EPIC has argued that the right should be established in the United States.

The Federal Trade Commission has responded to EPIC's letter urging the agency to oppose a collusive Google class action settlement. The agency stated that it "systematically monitors compliance" with its consumer protection orders and that it "takes alleged violation[s] of an order seriously," but that it cannot publicly disclose details of its investigations until a formal complaint is issued. In 2010, Google was sued for sharing user web browsing information with advertisers. Under the proposed settlement agreement, Google will distribute several million dollars to a handful of organizations, many of which already have ties to the company. EPIC and other privacy organizations urged the Commission to formally object because the proposed agreement "confers no monetary relief to class members, compels no change in Google's behavior, and misallocates the cy pres distribution." The agency has a history of filing objections - it filed a similar objection in Fraley v. Facebook, an unfair class action settlement in the Ninth Circuit. For more information see EPIC: FTC and EPIC: Search Engine Privacy.

EPIC, along with a group of consumer privacy organizations, has asked the Federal Trade Commission to object to an unfair class action settlement in California federal court. In 2010, Google was sued for sharing user web browsing information with advertisers. Under the proposed settlement agreement, Google will distribute several million dollars to a handful of organizations, many of which already have ties to the company. EPIC and other privacy organizations have argued that the proposed agreement "confers no monetary relief to class members, compels no change in Google's behavior, and misallocates the cy pres distribution" to organizations that are "not aligned with the interests of class members and do not further the purpose of the litigation." The consumer groups, who have already written to the court opposing the settlement, urged the Federal Trade Commission to object as well. The agency filed a similar objection in Fraley v. Facebook, an unfair class action settlement in the Ninth Circuit. For more information, see EPIC: FTC and EPIC: Search Engine Privacy.

The Supreme Court of Canada has ruled that police conducted an unconstitutional search when they used an IP address to obtain subscriber information from an Internet Service Provider without legal authorization. The Court also found Canada’s personal information protection law does not require ISPs to disclose subscriber information to law enforcement. In its analysis, the Court described information privacy as "control over, access to and use of information." The Court stressed that "anonymity may be the foundation of a privacy interest that engages constitutional protection against unreasonable searches and seizures." Two recent opinions from the European Court of Justice have firmly established the right of information privacy in EU law. EPIC has urged the US Supreme Court to recognize the right of information privacy and also to safeguard the right of anonymity. For more information, see EPIC: NASA v. Nelson, EPIC: Watchtower Bible v. Stratton, EPIC: Internet Anonymity, and EPIC: Search Engine Privacy.

A federal judge in California has approved a settlement agreement in a lawsuit against Google that will allow the company to continue to sell data about users' browsing history to advertisers. EPIC and several other consumer privacy organizations objected to the settlement, stating that it requires no change in Google's business practices and provides no benefit to those on whose behalf the case was brought. EPIC and the groups also recommended that the court adopt an objective basis for distributing cy pres funds, noting that the awards are often made for the benefit of the lawyers settling the case and not the class members. Class action settlements have come under increasing scrutiny in recent years, with courts increasingly concerned about collusion between attorneys and faux settlements that do not reflect the purpose of the initial lawsuit. In a case that reached the Supreme Court, Chief Justice Roberts said that courts will need to look more closely at these settlements to determine whether they are fair and whether organizations designated to receive funds reflect the interests of class members, and he noted the obligation of judges to carefully review these proposals. For more information, see EPIC: Search Engine Privacy and EPIC: Google Buzz.

EPIC, joined by several leading privacy and consumer protection organizations, submitted a letter to the Northern District of California regarding a proposed settlement in a class-action lawsuit against Google. The settlement was proposed by class action lawyers on behalf of Google users in a case concerning the unlawful disclosure of search terms by Google to third parties. Under the terms of the proposed settlement, Google would be allowed to continue to disclose user search terms to third parties. The letter explains that the proposed settlement "provides no benefit to Class members" because it does not require Google to change its business practices. "Furthermore," the letter states, "the proposed cy pres allocation is not aligned with the interests of the purported Class members." "Cy pres" ("as near as possible") is a legal doctrine that allows courts to allocate funds to protect the interests of individuals when there is a class action settlement. Under Ninth Circuit precedent, cy pres funds must be used to advance the interests of the class members. EPIC previously highlighted the dangers of improper cy pres distributions in settlements. For more information, see EPIC: Fraley v. Facebook, EPIC: Lane v. Facebook, EPIC: Search Engine Privacy, and EPIC: Google Buzz.

The Internet Society has announced the world launch of IPv6, which will dramatically expand the number of Internet addresses. IPv6 creates fixed IP addresses, allowing routine tracking of Internet-connected devices, such as laptops, cellphones, and soon many consumer appliances. This will make it easier for law enforcement agencies and advertisers to track users of Internet-based services. A Privacy Extension allows the use of IPv6 without persistent identifiers, though it is not clear how widely it will be adopted. In 2008, EPIC testified before the European Parliament on IP addresses and privacy, and said that companies that use IPv6 linked to identifiable users should be subject to data privacy requirements. The EU classifies IP addresses as personal information. For more information, see EPIC: Search Engine Privacy.

A Pew study found that users of search engines were pleased with the quality of search results but opposed targeted advertising and search results, and were generally anxious about the collection of personal information by search engines. Specifically, 73 percent of those surveyed were opposed to search engines tracking their searches, and 68 percent opposed behavioral advertising. 83 percent of respondents reported using Google to conduct searches. Recently, Google began combining user data gathered from more than sixty Google products and services, including Google search, to create a single, comprehensive profile for each user. For more information, see EPIC: Search Engine Privacy and EPIC: EPIC v. FTC.

Eight members of Congress wrote to Google asking the company to explain the "steps [that] are being taken to ensure the protection of consumers' privacy rights." The letter follows Google's announcement that it would begin combining data gathered on consumers of over 60 Google products and services, including Gmail, Google+, YouTube, and the Android mobile operating system. The members' letter includes 11 specific questions ranging from the ways in which Google collects information to the specific consequences for Android phone users. In 2010, EPIC, along with other privacy groups, wrote a letter to Google about the company's decision to combine user data among 12 Google services. The groups warned that the practical effect would be to reduce privacy protection for users of Google services. For more information, see EPIC: In re: Google Buzz and EPIC: Google search.

International watchdog Privacy International has announced the launch of a new website for bringing transparency to "technical mysteries" behind controversial systems. Cracking the Black Box identifies key questions regarding mysterious technologies and asks experts, whistleblowers, and other concerned parties to "help crack the box" by anonymously contributing ideas and input. The organization responsible for the technology in question is then invited to provide an official response. The first two issues addressed on the PI site are the Google Wi-Fi controversy and the EU proposal to retain search data.

Following similar letters from other Congressional leaders, the head of the House Judiciary Committee has asked Google Inc. and Facebook to cooperate with government inquiries into privacy practices at both companies. Rep. Conyers (D-MI) noted that Google's collection of user data "may be the subject of federal and state investigations" and asked Google to retain the data until "such time as review of this matter is complete." Rep. Conyers also asked Facebook to provide a detailed explanation regarding its collection and sharing of user information. The House Judiciary Committee is expected to hold hearings on electronic privacy later this year. For more information, see EPIC: Facebook Privacy, EPIC: In re Facebook II, and EPIC: Search Engine Privacy.

In order to comply with European privacy law, Microsoft announced that it will delete user search data, including IP addresses, after six months. In 2008 the Article 29 Working Group, which includes data protection officials across the European Union, met with Microsoft, Google, and Yahoo to discuss their data retention practices. Following a determination that records are subject to European privacy law, the Article 29 Working Group asked the search engine companies to eliminate online user data, including IP addresses and search queries, after six months. Microsoft will redesign its new Bing search engine to comply with the request. It is unclear at this point what Google and Yahoo will do. In early 2008, EPIC urged the European Parliament to protect the privacy of search histories. For more information, see EPIC: Search Engine Privacy.

Background

IP and MAC Addresses

An Internet Protocol ("IP") address is a numerical identifier that is used by a computer to send and receive data on a network. An IP address for a computer is similar to a telephone number for a telephone: it serves as a kind of “housing address” for a networked device. Most modern networks use the TCP/IP protocol to communicate, but there are now two different standards used for IP addresses. All computers that connect to IP networks have an assigned IPv4 address, which is a 32-bit address expressed by four numbers separated by dots (e.g. 192.168.1.1). Many modern devices now also use IPv6 addresses, which are 128-bit identifiers expressed by eight groups of hexadecimal numbers separated by colons (though groups of numbers consisting of all zeroes are often omitted to save space).
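The two address formats described above can be inspected with Python's standard `ipaddress` module. This is only an illustrative sketch: the IPv6 address shown is a made-up example taken from the range reserved for documentation, and the variable names are assumptions.

```python
import ipaddress

# Example addresses (the IPv6 one is from the documentation-only range).
v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("2001:0db8:0000:0000:0000:0000:0000:0001")

# An IPv4 address is 32 bits long; an IPv6 address is 128 bits long.
print(v4.version, v4.max_prefixlen)  # 4 32
print(v6.version, v6.max_prefixlen)  # 6 128

# Under the dotted notation, an IPv4 address is a single 32-bit number.
print(int(v4))  # 3232235777

# The full eight-group form, and the short form with all-zero groups omitted.
print(v6.exploded)    # 2001:0db8:0000:0000:0000:0000:0000:0001
print(v6.compressed)  # 2001:db8::1
```

The short form is the one most commonly seen in logs and configuration files.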

Due to the limited size of the IPv4 address space (4,294,967,296 total numbers) and to avoid confusion, the Internet Assigned Numbers Authority (IANA) has reserved three "blocks" for use by private networks (the 10/8, 172.16/12, and 192.168/16 prefixes). These private addresses are commonly assigned to computers on local networks for homes, businesses, or educational institutions. As a result, "public" IP addresses can be shared by multiple computers. A single computer can also be assigned multiple IP addresses if it has multiple network interfaces (e.g. wireless, wired, etc). The IPv6 address space, by contrast, is much larger (3.4 × 10^38 addresses) and each device can be uniquely identified. In addition to the IP address, each device with a network connection has a unique media access control (MAC) address for each “distinct point of attachment” (network card or interface). Marketing agencies rely on usernames, IP addresses, and other digital identifiers to track users across the web, and to deliver targeted ads.
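As a rough illustration of the figures above, the following Python sketch checks whether an address falls in one of the three reserved private blocks and computes the size of each address space. The function name `is_private_block` and the sample addresses are assumptions made for this example.

```python
import ipaddress

# The three blocks reserved for private networks: 10/8, 172.16/12, 192.168/16.
PRIVATE_BLOCKS = [
    ipaddress.ip_network(prefix)
    for prefix in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_private_block(addr: str) -> bool:
    """Return True if addr is an IPv4 address inside a reserved private block."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and any(ip in block for block in PRIVATE_BLOCKS)

# Address-space sizes follow directly from the address lengths in bits.
ipv4_total = 2 ** 32    # 4,294,967,296
ipv6_total = 2 ** 128   # roughly 3.4 x 10^38

print(is_private_block("192.168.1.1"))  # True
print(is_private_block("172.32.0.1"))   # False (just outside 172.16/12)
print(f"{ipv4_total:,}")                # 4,294,967,296
print(f"{ipv6_total:.1e}")              # 3.4e+38
```

Because many devices behind one router may use private addresses, the single public address seen by a search engine does not always correspond to a single computer.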

Behavioral Marketing

The emergence of targeted Internet advertising has led to "behavioral marketing." In the course of recording users' viewing habits and monitoring their search terms, companies collect information about user interests and tastes, including the things they buy, the stories they read, and the websites they visit, in addition to very sensitive personal information. Search terms entered into search engines may reveal a plethora of personal information such as an individual's medical issues, religious beliefs, political preferences, sexual orientation, and investments. The expansion of the behavioral marketing industry, as well as its ability and incentive to monitor online search behavior, has produced significant privacy problems and substantial risks to Internet users. Opaque industry practices result in consumers remaining largely unaware of the monitoring of their online behavior, the security of this information, and the extent to which this information is kept confidential. Industry practices, in the absence of strong privacy principles, also prevent users from exercising any meaningful control over the personal data that is collected about them.

Right to Be Forgotten

In 2014, the European Court of Justice ruled that European citizens have a limited right to have websites deindexed from the results of searches for the person’s name. A website is subject to removal if it contains information that is “inadequate, irrelevant or excessive in relation to” the information’s original purpose. In so ruling, the Court concluded that the fundamental right to privacy is greater than the economic interest of the commercial firm and, in some circumstances, the public interest in access to information.

Public Disclosure of Search Engine Data by US Service Providers

In 2006, America Online (AOL) published three months of search records for 658,000 Americans. AOL attempted to "anonymize" the records, and intended for academics and technologists to use the data for research purposes. The records did not link searches to IP addresses or user names, but did group searches by individual users via randomly-assigned numerical IDs. Subsequent events demonstrated that AOL's storage of numerical IDs as opposed to usernames or IP addresses does not necessarily prevent search data from being linked back to individuals. Though the search logs released by AOL had been "anonymized," identifying the user by only a number, quick research by New York Times reporters matched some user numbers with the correct individuals. Other sources identified sensitive and occasionally disturbing personal information in the AOL search data, including user searches for "how to kill your wife," "anti psychotic drugs," and "aftermath of incest." In response, several privacy groups filed complaints with the Federal Trade Commission.

EU Regulation of Search Engines

The European Union Data Protection Directive requires search engines to "delete or irreversibly anonymise personal data once they no longer serve the specified and legitimate purpose" for which they were collected. Retention of personal data by search engines for more than six months is presumed to be unnecessary. Search engines that retain personal data for longer periods must "demonstrate comprehensively that it is strictly necessary for the service." This requirement applies to IP address data, which virtually all search engines collect each time a user runs a search. The EU also imposes limits on the lifetime of search engines' cookies - small computer files that can track users between multiple sessions and web sites. As a technical matter, every cookie expires eventually, and web sites can easily select the expiration dates for their cookies. EU guidelines prohibit search engines from setting expiration dates farther in the future than necessary to provide search services.
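To show how a web site selects a cookie's expiration, here is a minimal Python sketch using the standard `http.cookies` module. The cookie name `prefs`, its value, and the 30-day lifetime are purely illustrative assumptions; the guidelines described above do not prescribe a specific number of days.

```python
from http.cookies import SimpleCookie

# Hypothetical lifetime chosen for illustration only.
ILLUSTRATIVE_LIFETIME_DAYS = 30
lifetime_seconds = ILLUSTRATIVE_LIFETIME_DAYS * 24 * 60 * 60

cookie = SimpleCookie()
cookie["prefs"] = "en"                         # hypothetical cookie name and value
cookie["prefs"]["max-age"] = lifetime_seconds  # browser discards it after this many seconds
cookie["prefs"]["path"] = "/"

# The header a server would send to the browser.
header = cookie.output()
print(header)  # Set-Cookie: prefs=en; Max-Age=2592000; Path=/
```

A shorter `Max-Age` value limits how long the cookie can be used to recognize a returning browser.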

The Article 29 Working Group's April 4, 2008 report issued a set of obligations to search engine firms, including:

Search engines should get informed consent from users if they correlate personal data across different services, such as desktop search;

Search engine providers must delete or anonymise (in an irreversible and efficient way) personal data once they are no longer necessary for the purpose for which they were collected;

Personal data should not be held by search engines for longer than six months;

In case search engine providers retain personal data longer than six months, they must demonstrate comprehensively that it is strictly necessary for the service;

It is not necessary to collect additional personal data from individual users in order to be able to perform the service of delivering search results and advertisements;

If search engine providers use cookies, their lifetime should be no longer than demonstrably necessary;

Search engine providers must give users clear and intelligible information about their identity and location and about the data they intend to collect, store, or transmit, as well as the purpose for which they are collected.

EPIC's Work

IP Address Privacy in the United States

In the United States, federal law does not provide uniform privacy protections for personal data submitted to search engines or for IP addresses. Some federal regulations (e.g. 45 C.F.R. § 164.514(b)(O)) treat IP addresses as "individually identifiable" information for specific purposes, but such treatment is not comprehensive.

IP Address Privacy in the European Union

The European Commission classifies IP addresses as personal data. Search engine data falls under the relevant EU data protection directives, and EU regulations generally apply to search engine companies even when they are headquartered outside Europe. Search engines must comply with European privacy provisions if they maintain an establishment in one of the EU Member States, or if they use automated equipment based in one of the Member States for the purposes of processing personal data. European privacy rules limit the collection, use, and disclosure of personal information. The privacy officials who make up the EU Article 29 Working Group have stated that "the protection of the users' privacy and the guaranteeing of their rights, such as the right to access to their data and the right to information as provided for by the applicable data protection regulations, remain the core issues of the ongoing debate."

Corporate Policies Regarding IP Address Privacy

Google, the leading Internet search engine, automatically collects its users' search terms in connection with their IP addresses. Google states that, after collection, it retains the personally identifiable information for 18 months, and then "anonymizes" the data linking search terms to specific IP addresses by erasing the last octet of the IP address.
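Erasing the last octet of an IPv4 address can be sketched in Python as follows. This is not Google's actual implementation, which is not described here; the function name `erase_last_octet` and the sample address (from a range reserved for documentation) are assumptions for illustration.

```python
import ipaddress

def erase_last_octet(addr: str) -> str:
    """Zero the final 8 bits of an IPv4 address, keeping the first three octets."""
    ip = ipaddress.IPv4Address(addr)
    masked = int(ip) & 0xFFFFFF00  # keep the upper 24 bits, clear the lower 8
    return str(ipaddress.IPv4Address(masked))

truncated = erase_last_octet("203.0.113.77")
print(truncated)  # 203.0.113.0

# The truncated value still narrows the source to one block of 256 addresses.
remaining = ipaddress.ip_network(truncated + "/24").num_addresses
print(remaining)  # 256
```

Because only 256 candidate addresses remain, the truncated address can still be combined with other identifiers, such as cookies or the search terms themselves, to single out a user.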

Ixquick states that it deletes users' search data (including IP addresses) within 48 hours. Ixquick further states that it does not set any uniquely identifying cookies, and that it shares data with third parties only in limited circumstances.

Change in Yahoo Search Retention Leaves Privacy Questions Unresolved. Yahoo announced that, after 90 days, it will obscure some elements in the records that it keeps about all Internet users who use the company's services. The search company will continue to keep modified record locators, time/date stamps, web pages viewed, and a persistent user identifier, known as a "cookie," for an indefinite period. Yahoo is also retaining much of the IP address, which typically identifies a user's device, such as a laptop or a mobile phone. Privacy rules classify IP addresses as "personal data." Experts have criticized the partial deletion of IP address data as insufficient to protect consumers, and called for complete deletion. For more information, see EPIC's Search Engine Privacy page. (Dec. 18, 2008)

Google "Flu Trends" Raises Privacy Concerns. Google announced this week a new web tool that may make it possible to detect flu outbreaks before they might otherwise be reported. Google Flu Trends relies on individual search terms, such as "flu symptoms," provided by Internet users. Google has said that it will only reveal aggregate data, but there are no clear legal or technological privacy safeguards to prevent the disclosure of individual search histories concerning the flu, or related medical concerns, such as "AIDS symptoms," "ritalin," or "Paxil." Privacy and medical groups have urged Google to be more transparent and publish the algorithm on which Flu Trends data is based so that the public can determine whether the privacy safeguards are adequate. (Nov. 12, 2008)

European Privacy Officials: Privacy Rules Apply to Search Engines. European privacy officials have established "a clear set of responsibilities" on search engine companies regarding their handling of user data. The opinion, issued by the Article 29 Working Group, states that the European Union Data Protection Directive requires search engines to "delete or irreversibly anonymise personal data once they no longer serve the specified and legitimate purpose" for which they were collected. This requirement has particular significance for search engines, because European privacy rules classify Internet Protocol (IP) addresses as "personal data." The opinion further holds that European privacy laws generally apply to search engines "even when their headquarters are outside [Europe]," and requires search engines to delete personal data within six months of collection. (Apr. 7, 2008)

Search Histories Subject to European Privacy Rules. European privacy officials determined this week that companies operating search engines will be subject to European privacy rules that limit the collection, use, and disclosure of personal information. The privacy officials who make up the Article 29 Working Group stated that "The protection of the users' privacy and the guaranteeing of their rights, such as the right to access to their data and the right to information as provided for by the applicable data protection regulations, remain the core issues of the ongoing debate." Earlier this year, EPIC urged the European Parliament to protect the privacy of search histories. A report from the Article 29 Working Group on Search Engines and Privacy is expected in April. (Feb. 22, 2008)