Is Google’s Massive Database a Security Risk?

To many people, massive databases are massive targets, and Google has some of the largest databases in the world. Should we be concerned about what the search engine giant will do with all that information, especially since a surprising amount of it is personal and sensitive? Even if Google follows the “do no evil” policy laid out by founder Sergey Brin, others with less honorable intentions may gain access to some of that information. This article outlines the risks.

Mae West once said that “Too much of a good thing is wonderful.” Not to disagree with this dear departed screen star, but there is at least one situation in which too much of a good thing is not so wonderful: too much information. More precisely, a concentration of too much potentially sensitive information about too many people makes a tempting target, and not just for hackers. In this case, too much information in the hands of search engine giant Google can cause the rest of us to feel somewhat less than wonderful, as privacy advocates have been chiming in with their concerns.

The scrutiny of Google, and of how much information its servers hold about its users, stems from the company’s popularity. Google has leveraged its popularity as a search engine into other endeavors. It now offers a variety of services, including email, blogging, and personalized search. Projects in development include a digital library, a payments service, and software aimed at speeding up web traffic.

Whether or not Google has managed to beat its rivals in these areas, it certainly has them running scared. When its Gmail service offered users 1 GB of storage space for their email, for example, Yahoo and Microsoft scurried to increase the storage space on their own email users’ accounts. They would hardly have done so if they weren’t afraid of losing customers to Google.

The key point is that many of the services Google currently offers, and is planning to offer in the not too distant future, collect and require personal information – sometimes in the form of cookies that track users’ habits, sometimes in the form of actual personally-created content (e.g. emails) preserved on Google’s servers. As Chris Hoofnagle, senior counsel with the Electronic Privacy Information Center, observed, “This is a lot of personal information in a single basket. Google is becoming one of the largest privacy risks on the Internet. I don’t think any of the others have the scope of personal information that Google does.”

Some civil libertarians are most worried about the concentration of personal data under one digital roof. This could make Google an impossible-to-resist target for law enforcement officials, prepared with subpoenas in hand. Personal and sensitive information collected for particular investigations could later become public – for example, through court filings – even when the information concerns people who aren’t even targeted by the investigation.

To Google’s credit, it actively looks for feedback from civil liberties groups such as the Center for Democracy and Technology and the Electronic Frontier Foundation. While both groups have said that Google doesn’t always agree with them, they do admit that the search engine at least listens to their point of view. Nicole Wong, an associate general counsel at Google, stated that privacy concerns are a priority at the company. “In general, as a company, we look at privacy from design all the way (through) launch,” she told the Associated Press.

But you cannot expect a company to block the lawman from its records. According to Wong, Google will surrender data if it receives a subpoena, court order, or warrant. How many of these “requests” does the company receive? Wong is silent on that matter. Indeed, Google cannot legally disclose any requests for information that relate to national security, thanks to a federal law.

Some of us may feel we have nothing to hide from the law, but it isn’t just “the right side of the law” we need to worry about. A casual glimpse at the news over the past few months reveals several high-profile security breaches at banks and credit card companies. Malicious hackers broke into company databases and managed to steal information about customers – information that they could use for anything from blackmail to identity theft. Just how secure are Google’s servers?

This question is far from idle. Like Microsoft, Google is such a large company that even its rivals could end up adopting its practices, as a sort of “industry standard.” According to computer scientist and privacy advocate Lauren Weinstein, “Google is perhaps the most noteworthy right now by the simple fact that they are the 800-pound gorilla. What they do tends to set a pattern and precedent.”

Google’s huge database system isn’t merely a target for hackers. Hackers also use the search engine itself as a tool for finding ways to break into other systems. Security researcher Johnny Long, author of the book Google Hacking for Penetration Testers, spoke on this subject at the recent Black Hat USA conference in Las Vegas. In one of his demonstrations, he used Google to discover an unprotected web interface to someone’s household electrical network. There were a number of appliances listed, complete with two control buttons labeled “on” and “off.”

This isn’t exactly news, as Frank Hayes noted in a blog on Computerworld’s website. As hacker Adrian Lamo pointed out in early 2003, “Google, properly leveraged, has more intrusion potential than any hacking tool.” It is a shame, however, that this has apparently not changed more than two years later.

Nor is it just unprotected home networks that are vulnerable to hackers. Long and others have been able to access and control printer networks, PBX phone systems, routers, webcams, and websites, all with a little help from Google. The search engine may only be partly to blame for this; many people don’t realize just how powerful Google is at digging up information. How can you secure your sites against a threat you don’t even know exists?

In addition to uncovering vulnerable networks, Long demonstrated how to use Google for discovering network topology information – without taking the risk of actually breaking into the network first. This involves making some educated guesses about how the search engine will respond to certain queries, and being able to interpret what at first glance may appear to be nonsense. For instance, searching on “site:nasa” in Google should turn up nothing, because no website has the bare word “nasa” as its address. Instead, it turns up a long list of apparent URLs that, when clicked on, deliver an error box saying that the site could not be found. These “URLs” look very much like they could be the names of servers on the internal network for the National Aeronautics and Space Administration.

Long refers to such bits of unintended strangeness that can be dangerous in knowledgeable, malicious hands as “Google Turds.” He pointed out that a hacker could combine them with text processing tools to retrieve SQL passwords and other information, which could then be used to launch an SQL injection attack. Such an attack could give a malicious hacker the ability to run unauthorized commands on an SQL database.
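For readers unfamiliar with the term, here is a minimal sketch of what an SQL injection looks like. It is a generic illustration, not a reconstruction of anything Long demonstrated. The table, the column names, and the functions make_db, lookup_unsafe, and lookup_safe are all hypothetical, and Python's built-in sqlite3 module stands in for whatever database a real site might run.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory database with a tiny "users" table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Vulnerable: user input is pasted directly into the SQL text,
    # so quotes in the input can change the meaning of the query.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn, name):
    # Safer: the input is passed as a bound parameter and is only
    # ever treated as a value, never as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


if __name__ == "__main__":
    conn = make_db()
    payload = "nobody' OR '1'='1"
    print("unsafe:", lookup_unsafe(conn, payload))
    print("safe:  ", lookup_safe(conn, payload))
```

With the crafted input, the unsafe version matches every row in the table, while the parameterized version matches none, because no user is literally named "nobody' OR '1'='1".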

A few years ago, Microsoft tried to get its Passport service off the ground. Passport was supposed to allow users to sign in once, and then engage in a variety of activities across the Internet at other sites that accepted it. This required Microsoft to hold a lot of personal information on its own servers, which raised privacy and security concerns similar to many of the ones that privacy advocates and security experts are now raising about Google. Passport has never quite caught on; indeed, late last year Microsoft stopped trying to convince other companies to use it. At the same time, eBay dealt Passport a major blow by announcing it would stop supporting the service on its own website.

Google does not have a single worrisome service that concerned users can point to, as Microsoft did with Passport. Instead, it is Google’s collection of services, taken together, that allows for the accumulation of large amounts of data on individuals. Many Google fans use more than one of these services. The company insists that any services that require personally identifiable information are completely optional, and that it gets permission before collecting this information.

But what information should be considered “personally identifiable”? Google automatically tracks the search terms that people use, presumably to improve its service. The information is attached to a user’s IP address, and is even tied to a “cookie” file that Google places on users’ computers unless they configure their browsers to reject cookies. Most Internet companies, including Google, do not consider that information personally identifiable. But an IP address can often be traced to a specific user. Any hacker could do this.
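To see why a log keyed by IP address and cookie can become personal, consider a minimal sketch of how such records might be grouped into per-visitor search histories. Everything here is an assumption made for illustration: the log format, the sample entries, and the build_profiles function are hypothetical and do not describe how Google actually stores or processes its data. The IP addresses come from ranges reserved for documentation.

```python
from collections import defaultdict


def build_profiles(log_entries):
    # Group search queries by (IP address, cookie ID), preserving order.
    # Each entry is a hypothetical (ip, cookie_id, query) tuple.
    profiles = defaultdict(list)
    for ip, cookie_id, query in log_entries:
        profiles[(ip, cookie_id)].append(query)
    return dict(profiles)


# Invented sample data for illustration only.
sample_log = [
    ("203.0.113.7", "cookie-1", "cheap flights denver"),
    ("198.51.100.2", "cookie-2", "pasta recipes"),
    ("203.0.113.7", "cookie-1", "symptoms of diabetes"),
]

if __name__ == "__main__":
    for visitor, queries in build_profiles(sample_log).items():
        print(visitor, queries)
```

No name appears anywhere in the log, yet each group is a running history for one browser on one connection. Anyone who can link that IP address to a subscriber has linked the whole history to a person.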

Again, to Google’s credit, its privacy and security standards are known to be as good as or even better than Microsoft’s. This is part and parcel of the company’s “Do no evil” philosophy. But people have been known to do evil unintentionally, or against their will. Let’s hope that Google never finds itself in this unpleasant position – especially with our own digital privacy at stake.