Award-winning news, views, and insight from the ESET security community

Google's data mining bonanza and your privacy: an infographic

Do you use Google? These days the question sounds almost absurd. If you use the Internet, or an iPhone, or an Android phone, or a Kindle or an iPad, then of course you use Google in some shape or form. And if you take a keen interest in how your personal information is used, you probably know that on March 1, 2012, the world's largest collector of personal data, Google, changed the way it uses information about you. But how big of a deal is this? And what, if anything, should you be doing differently today to protect data that Google may be collecting about you?

Let's start answering those questions by picturing just how much data about its users Google has the potential to tap. The infographic on the right is titled: "Google Data Mining Bonanza." It shows some, but not all, of the different "pools" of data that Google could potentially access in order to build a picture of you and your interests as you use different Google services.

Just to be clear, I'm not saying that Google is actively mining all this data to create detailed profiles of people that are shared inappropriately with third parties. But I am saying that the changes Google made on March 1 have raised numerous questions to which I have not yet found answers, and I'm not exactly new to Internet privacy (I wrote a book about it 10 years ago).

The most visible sign of those March 1 changes is a "unified privacy policy" that combines over 60 separate privacy policies for different Google services into one. There is much to be said for the benefits of a unified privacy policy, but applying one retroactively is problematic. That's why the folks who first thought about privacy and computer-based information systems chose, as the first privacy principle: Notice/Consent.

To its credit, Google gave plenty of Notice of the March 1 changes, but when you first signed up for something like Gmail I'm guessing you did not give informed consent to what Google is doing with your data today. And millions of users of those scores of Google services have time and data invested in them which make withholding consent, where that is an option, problematic to say the least.

Take Google's Gmail for example, which I started using in 2005. (Google claims there are now 350 million active Gmail users.) Even though I don't use Gmail for all my email, there are currently more than 47,000 messages in my Gmail Inbox. You could draw a fairly detailed picture of the last 7 years of my life from that lot.

How about Google Search? A quick back-of-the-envelope calculation tells me it is quite possible that I've performed more than 47,000 searches via Google in the same time period. What a picture those search terms could paint! And if it's moving pictures you want, consider the YouTube videos that I have uploaded, commented on, searched for and watched.
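One way to sanity-check an estimate like that is to turn it into a daily rate. The sketch below is illustrative only and is not necessarily the working behind the figure above: the 47,000 total and the 7-year span come from the text, while the 365-day year and the function name are assumptions made for the example.

```python
# Back-of-the-envelope: what daily search rate adds up to 47,000 in 7 years?
TOTAL_SEARCHES = 47_000  # figure quoted in the text
YEARS = 7                # 2005 to 2012, per the text
DAYS_PER_YEAR = 365      # assumption: leap days ignored


def searches_per_day(total: int, years: int, days_per_year: int = DAYS_PER_YEAR) -> float:
    """Average number of searches per day needed to reach `total` in `years`."""
    return total / (years * days_per_year)


rate = searches_per_day(TOTAL_SEARCHES, YEARS)
print(f"{rate:.1f} searches per day")
```

That works out to fewer than 20 searches a day, which is easy to reach for anyone who uses a search engine for work.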

Not that I think I am personally of great interest to Google or the world in general. The point is that I am valuable to Google as a potential clicker of online advertisements, and Google has found that my value increases each time the company can pipe another source of data about me into the ad targeting mix. Like a lot of people, including many fans of Google, I am now wondering what could happen to my "pooled" Google data.

So what are my options if I want to cut back on Google's use of data about me? The place to start, a place you should visit even if you're not that bothered about what Google does with data about you, is the Dashboard.

The Google Dashboard

You need to be signed into Google to see your information on the Dashboard and you might be surprised at just how much information that is. I counted 32 different entries plus a note that says "15 additional products are not yet available in this dashboard." (It would be nice to know what those are so I will keep checking back.) Below you can see the top of my Google Dashboard, with some of my personal information blacked out.

The first thing on the Dashboard that caught my interest was the entry titled: "Websites authorized to access the account." When I clicked on this link, lo and behold there were some surprises, including connect.thedailyshow.com and socialize.cnet.com.

No offense to Jon Stewart or The Daily Show or CNET but I don't recall giving them special access to my Google account, so I used the Revoke Access link to remove them, leaving just the Google Mail Notifier and Google Calendar.

While revoking access was easy, this page could do a much better job of explaining exactly what access means, and what the implications of adding or revoking access might be. The same page does present a lot of information about "application-specific passwords" and "2-step verification" but again there is not enough context.

Me on the Web

The second item on my Google Dashboard is "Me on the Web" and it has three sections, although that is really over-selling the content Google has put together for this section:

How to manage your online identity: Tips on searching for yourself to see what is out there; creating a Google profile as a way to control what people learn about you; removing unwanted content and search results; and getting notified when information about you appears on the web.

How to remove unwanted content: More about the same topic covered in section 1.

About Me on the Web: More about the section you are looking at.

Despite the redundancy, this is good information, stuff that people who are heavily involved in social media probably know and do already (for example, I regularly Google myself to make sure nothing bad pops up and I have a Google Alert on myself for the same reason). What may come as a surprise to the more casual Google user is the amount of work it takes to manage your online identity.

Web History

What may also come as a surprise when you start to explore the Dashboard is the fact that the privacy item most people seem concerned about–Web History, the information that Google stores about what you search for–is way down at the bottom of the page. (I know that's because the page is alphabetically arranged, but to me that is weak user interface design.) When you do work your way down to Web History it can make interesting reading. Here's what I saw when I clicked the "Remove items or clear Web History" link:

When you check out this page for yourself, don't be surprised to find that it includes searches conducted on multiple devices. From my entries it was clear that Google was tracking my searches on my laptop, my iPhone, and my Kindle Fire. It is this kind of all-embracing, cross-platform tracking of what you do with Google that seems to bother some privacy-conscious people. Fortunately, Google makes it easy to put a stop to this: just click the Pause button. According to Google, the Pause button will "prevent your future web activity from being saved in Web History and from being used to personalize your search results." If you then click Remove all Web History all your past activity will be erased.

Another way to avoid Google tracking your search activity is to use search without signing in. If you go to www.google.com in a web browser on a laptop or desktop and you see your name at the top of the page, that means you are signed in. You can click on your name to access the Sign out option.

If you are using Google as your search engine on your Apple iPhone and you are using iOS 5, then you can go into the Safari settings and turn on Private Browsing to turn off tracking. (I'm pretty sure Private Browsing is off by default and I don't recall signing into Google on Safari on my iPhone, but I can assure you my searches from that phone were tracked by Google until I turned on Private Browsing.)

You may have noticed that Google is pretty persistent about signing you back in and keeping you signed in once you have logged in from a particular browser. One strategy to consider on your laptop or desktop is to use multiple browsers, because Google login is browser-specific. That means you can use the Chrome browser for your "logged in" Google activity but Firefox for activity where you don't log into Google. For good measure you can turn on the "Do Not Track" option in Firefox.
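For readers curious about what the Firefox option actually does: when it is switched on, the browser adds a `DNT: 1` header to the requests it sends. The sketch below only illustrates that idea. The function name and the user agent string are hypothetical, and the header is a request that websites are not obliged to honor.

```python
def build_request_headers(do_not_track: bool) -> dict:
    """Return the headers a browser might attach to a page request."""
    # Hypothetical user agent string, for illustration only.
    headers = {"User-Agent": "ExampleBrowser/1.0"}
    if do_not_track:
        # The Do Not Track preference travels as a single header.
        headers["DNT"] = "1"
    return headers


print(build_request_headers(do_not_track=True))
```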

How problematic is it that Google records your search history? The answer is largely subjective, based on how you feel about other people knowing what subjects interest you. Not that people at Google sit around reading your search history, but there are clearly issues of trust around what could happen to your history.

Consider the section of the Google Privacy Policy titled "For legal reasons." Basically, it says Google will indeed share your personal information with companies, organizations or individuals outside of Google if the company has "a good-faith belief that access, use, preservation or disclosure of the information is reasonably necessary to meet any applicable law, regulation, legal process or enforceable governmental request." I'm no lawyer but I would say that's a pretty broad definition and there seems to be a lot of room for interpretation in phrases like "good-faith belief" and "reasonably necessary." The extent to which you feel you can rely on Google to screen and vet such requests is a matter of trust. And Google would clearly have no control over the way in which a third party would interpret my Google searches for subjects like "missile silos near me" and "where to buy arsenic."

Ads Preferences

One reason Google would like to track your searches is to improve the targeting of adverts. The company argues that such targeting is better for you. The stock market suggests it is also better for Google. But although Google allows you to exercise some control over the ads you see, those controls are strangely absent from the Dashboard. You have to go to a place called Ads Preferences to make changes. The preferences are broken out into "Ads on Search and Gmail" and "Ads on the Web."

You will find the latter very interesting if you have been allowing Google to use its cookie to track your activities. The page presents "a summary of the interests and inferred demographics that Google has associated with your cookie." Frankly, I was surprised at what I found because it was not a very well-rounded picture of my interests. This suggests that Google is not doing all the correlation of data that it could, at least not yet. (For example, the fact that my demographic age is listed as 45-54 has to be intentional flattery, since my date of birth is in my Google Profile and it proves I'm older than that.)

The Ads Preferences page allows you to opt out of seeing targeted ads and gives you access to the Remove and Edit features for ad preferences. These enable you to tailor ads by removing erroneous categories or adding fresh categories. As with many things Google, the details are quite complex. For example, a cookie is required to prevent tracking, because that is where your opt-out choice is recorded. So if you routinely erase your cookies you potentially remove your opt-out preference (we will have more to say about this in a future post).
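The cookie catch is easier to see in a toy model. The sketch below is not Google's implementation: the `Browser` class and the `ad_opt_out` cookie name are invented for illustration. It assumes only what the paragraph above describes, namely that the opt-out is remembered in a cookie and disappears when cookies are erased.

```python
class Browser:
    """Toy model of a browser's cookie jar (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.cookies: dict[str, str] = {}

    def opt_out_of_targeted_ads(self) -> None:
        # The opt-out choice is itself remembered in a cookie.
        self.cookies["ad_opt_out"] = "1"

    def clear_cookies(self) -> None:
        # Erasing cookies makes no exception for the opt-out cookie.
        self.cookies.clear()

    def is_opted_out(self) -> bool:
        return self.cookies.get("ad_opt_out") == "1"


browser = Browser()
browser.opt_out_of_targeted_ads()
print(browser.is_opted_out())   # opted out while the cookie exists

browser.clear_cookies()
print(browser.is_opted_out())   # the preference went with the cookie
```

In this model, anyone who clears cookies regularly would need to repeat the opt-out each time.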

More to be Said

Indeed, there is a lot more to be said about Google's privacy policy changes and the way they are being handled, starting with the fact that Google went ahead with them despite a chorus of objections from legislators and regulators in the U.S. and the E.U. There is also the question of corporate and government agency use of Google products and what the changes mean for them. Expect to see more blog posts on this topic in the coming weeks. (For further reading right now, the San Jose Mercury News offers a fairly balanced review of reaction to the recent Google privacy changes and there is an extended discussion here on NPR.)

During the financial crisis of '08 we all became familiar with the term "Too big to fail." I find it hard to escape the feeling that, given the vast size of Google's installed base and the broad range of its services, its privacy policy changes are: "Too big to understand." Certainly, getting a clear picture of where things now stand will take a lot of work on the part of Google users, even as Google continues to build out tools like the Dashboard, which is still a work in progress (for example, I got a "page not found" error when I checked out the link titled "About privacy and security in Google Voice").

We'd love to hear your thoughts about Google privacy and your experiences with the Dashboard.