To protect privacy, you first have to identify what data is personal. That’s why there has been a lot of discussion in recent months around the world to try to define “personal data” (as it is referred to in Europe), or “personally identifiable information” (as it’s called in the U.S.).

The discussion is a broad one: as the world’s information moves online, how should we protect our privacy? What pieces of data can identify us as individuals, directly or indirectly? For instance, your name, address, phone number, social security number and your fingerprints are all personal data, since all of them can be used to identify you as an individual. But many people have raised the question of whether an IP address is personal data. To decide whether this is the case, it's helpful to first understand the technical workings.

An Internet Protocol (IP) address is an address for a computer on the Internet, which exists to allow data to be delivered to that computer. When you enter a website's name - like http://www.google.com - that is actually a handy shortcut for the website's IP address - right now, one of Google's is http://72.14.207.99/. So when a website needs to send your computer something (for instance, your Google search results), it needs your IP address to send it to the right computer.
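As a rough illustration of the name-to-address relationship described above, here is a small Python sketch using only the standard library. The helper names `resolve` and `describe` are invented for this example. `resolve` simply asks whatever resolver the local machine is configured with, so the addresses it returns for a given name will differ by time and place.

```python
import ipaddress
import socket
from typing import Dict, List


def resolve(hostname: str) -> List[str]:
    """Return the IPv4 addresses the local resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the dotted-quad address.
    return sorted({info[4][0] for info in infos})


def describe(address: str) -> Dict[str, object]:
    """Show that a dotted-quad address is just a number with some properties."""
    ip = ipaddress.ip_address(address)
    return {
        "version": ip.version,
        "as_integer": int(ip),
        "is_private": ip.is_private,
    }


if __name__ == "__main__":
    # The address quoted in the post; it may no longer belong to Google.
    print(describe("72.14.207.99"))
    # Requires network access; output depends on your resolver.
    print(resolve("www.google.com"))
```

Nothing in the address itself names a person; it is a 32-bit number that routers use to deliver packets.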

The situation gets a bit more complex, though, because the IP addresses that people use can change frequently. For instance, your Internet service provider (ISP) may have a block of 20,000 IP addresses and 40,000 customers. Since not everyone is connected at the same time, the ISP assigns a different IP address to each computer that connects, and reassigns it when they disconnect (the actual system is a bit more complex, but this is representative of how it works). Most ISPs and businesses use a variation of this "dynamic" type of assigning IP addresses, for the simple reason that it allows them to optimize their resources.
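The reassignment described above can be sketched as a toy address pool. This is a simplified model, not how any particular ISP or DHCP server works; the class name `AddressPool`, the customer labels, and the two documentation-range addresses are all assumptions made for the example.

```python
from collections import deque


class AddressPool:
    """Toy model of an ISP handing out addresses from a shared pool."""

    def __init__(self, addresses):
        self.free = deque(addresses)  # addresses nobody holds right now
        self.leases = {}              # customer -> address currently held

    def connect(self, customer):
        """Give the customer an address, reusing their current one if any."""
        if customer in self.leases:
            return self.leases[customer]
        if not self.free:
            raise RuntimeError("pool exhausted: more customers than addresses")
        address = self.free.popleft()
        self.leases[customer] = address
        return address

    def disconnect(self, customer):
        """Return the customer's address to the pool for reuse."""
        address = self.leases.pop(customer)
        self.free.append(address)


pool = AddressPool(["192.0.2.1", "192.0.2.2"])
first = pool.connect("alice")        # alice receives 192.0.2.1
pool.connect("bob")                  # bob receives 192.0.2.2
pool.disconnect("alice")             # 192.0.2.1 goes back into the pool
reassigned = pool.connect("carol")   # carol now receives alice's old address
```

In this model the same address belongs to two different customers within a few steps, which is the behavior the paragraph describes.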

Because of this, the IP address assigned to your computer one day may get assigned to several other computers before a week has passed. If you, like me, have a laptop that you use at work, at home, and at your corner café, you are changing IP addresses constantly. And if you share your computer or even just your connection to your ISP with your family, then multiple people are sharing one IP address.

So, back to our initial question: is an IP address personal data, or, in other words, can you figure out who someone is from an IP address? A black-and-white declaration that all IP addresses are always personal data incorrectly suggests that every IP address can be associated with a specific individual. In some contexts this is more true: if you're an ISP and you assign an IP address to a computer that connects under a particular subscriber's account, and you know the name and address of the person who holds that account, then that IP address is more like personal data, even though multiple people could still be using it. On the other hand, the IP addresses recorded by every website on the planet without additional information should not be considered personal data, because these websites usually cannot identify the human beings behind these number strings.

At Google, we know that user trust is fundamental to our success; users will stop choosing to use Google products and services if they can't trust us with their data. For this reason, we have made moves to safeguard that privacy, like anonymizing our logs, and have worked with privacy groups on initiatives like shortening cookie lifetimes. We have proposed broad global privacy standards, and are strong supporters of the idea that data protection laws should apply to any data that could identify you. The reality, though, is that in most cases an IP address without additional information cannot. The policy debate about data protection and IP addresses will continue, but it's important to have a firm grasp of the technical realities of the debate in order to reach conclusions that make sense.

57 comments:

This article shifts definitions midstream. First, it asks whether IP addresses are personally identifiable, then it switches to whether they can be used to determine identity. Unless someone is behind a router where multiple users simultaneously have the same external IP, then given a specific time, an IP address is absolutely identifiable. That doesn't mean it can be used to identify someone in reverse, although it could be used with other information to determine a high probability, but it is still definitively personally identifiable information for that person at that time, unless multiple persons are simultaneously behind a router.

I work for a competitor of Google's but a couple of months ago we ran into the exact same issue - in the EU, they consider IP addresses as PII.

While you do have valid points that a laptop user can move their IP address around, there is another case. What about the home user that has a static IP address on their home server? Perhaps its reverse DNS is mail.homeuser.com or something like that. In that case, their IP address would not be subject to change and therefore you could, quite conceivably, learn their identity.

That seems to be the situation that the EU is targeting. I don't really agree with it since it only identifies a machine and not a user. However, the question is a legal one and not a technical one.

I would think that an IP address is PII. I have a static IP for my home computers and my servers. That IP can be tracked back to me quite easily and I applaud the EU for their stance. I wish the US (where I live) would also enforce that information being personal data.

After the issue with AOL's lame attempt to protect privacy before releasing logs, I'm very leery of my IP being released to anyone.

This is absurd. Why should the fact that IP addresses change frequently mean that, for the time that one IS assigned to a user, it isn't personal? I can change my phone number every day if I wish but that has zero bearing on whether or not phone numbers are a category of data that should be considered personal.

And why does the fact that many people can share an IP address mean it isn't personal? A family lives at one home address. That means their home address isn't personal anymore? Google's position conflates the difference between UNIQUE and PERSONAL. They're not the same thing yet Google's position boils down to, if info isn't unique, it isn't personal. Wrong. Many kinds of information can be shared but still be personal, info can be personal without having to be unique. To say that the only information that should be protected is unique info is basically to say only a person's DNA sequence should be covered. Is that what Google wants?

To protect privacy, you first have to identify what data is personal. That’s why there has been a lot of discussion in recent months around the world to try to define “personal data” (as it is referred to in Europe), or “personally identifiable information” (as it’s called in the U.S.).

The discussion is a broad one: as the world’s information moves online, how should we protect our privacy? What pieces of data can identify us as individuals, directly or indirectly? For instance, your name, address, phone number, social security number and your fingerprints are all personal data, since all of them can be used to identify you as an individual. But many people have raised the question of whether a Credit Card Number is personal data. To decide whether this is the case, it's helpful to first understand the technical workings.

A Credit Card Number is an identifier for your account on the Visa/MC Interchange Network, which exists to allow data to be delivered to your bank. When you enter a Credit Card Number, that is actually a handy shortcut for your bank account - right now, one of Google's is chock full of cash. So when a merchant needs to bill your bank for something (for instance, your Google AdWords buy), it needs your Credit Card Number to send it to the right bank/account.

The situation gets a bit more complex, though, because the Credit Card Numbers that people use can change frequently. For instance, your bank may have a block of 20,000,000 Credit Card Numbers and 40,000 customers. The bank assigns a different Credit Card Number to each new account, and assigns a new one every two years or so on card expiration (the actual system is a bit more complex, but this is representative of how it works).

Of course, the Credit Card Number you actually use may change several times before a week has passed. If you, like me, have several different accounts that you use at work, at home, and at your corner café, you are changing Credit Card Numbers constantly. And if you share your account or even just your card with your family, then multiple people are sharing one Credit Card Number.

So, back to our initial question: is a Credit Card Number personal data, or, in other words, can you figure out who someone is from a Credit Card Number? A black-and-white declaration that all Credit Card Numbers are always personal data incorrectly suggests that every Credit Card Number can be associated with a specific individual.

Regardless of how long you have the address, it's still personal since it is traceable. ISPs keep a log of who is assigned what IP and when. That's how they can reveal who someone is when ordered by a court to do so.

Therefore it is an identifiable piece of information, just not directly.
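The indirect lookup this comment describes can be sketched as a search over a lease log. The log format, the `Lease` record, the `account_for` helper, and the sample entries are all hypothetical; real ISP logs will look different, and the result names an account, not necessarily the person at the keyboard.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class Lease(NamedTuple):
    address: str      # IP address that was handed out
    account: str      # subscriber account it was handed to
    start: datetime   # lease start (inclusive)
    end: datetime     # lease end (exclusive)


def account_for(log: List[Lease], address: str, when: datetime) -> Optional[str]:
    """Return the account that held `address` at time `when`, if any."""
    for lease in log:
        if lease.address == address and lease.start <= when < lease.end:
            return lease.account
    return None


# Invented sample data: one address, two different holders on the same day.
LOG = [
    Lease("192.0.2.7", "account-A",
          datetime(2008, 2, 1, 8, 0), datetime(2008, 2, 1, 12, 0)),
    Lease("192.0.2.7", "account-B",
          datetime(2008, 2, 1, 12, 0), datetime(2008, 2, 1, 18, 0)),
]

morning = account_for(LOG, "192.0.2.7", datetime(2008, 2, 1, 9, 0))
afternoon = account_for(LOG, "192.0.2.7", datetime(2008, 2, 1, 13, 0))
night = account_for(LOG, "192.0.2.7", datetime(2008, 2, 1, 20, 0))
```

The address alone is ambiguous here; the address plus a timestamp plus the ISP's log is not, which is the commenter's point.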

A social security # isn't identifiable either. It's just a string of numbers. But with the help of the government, you can find a name to go with that SSN.

This post is extremely misleading. The vast majority of users are or will be on broadband connections. This is already the case in many countries. These are always-on connections, so even if the ISP dynamically assigns IP's, these rarely change (my own ISP uses fixed IP's by the way, even providing custom reverse DNS names). When IPv6 starts to be implemented on a wider scale, fixed IP's will become more common again, but even now, most IP's in developed nations with high broadband penetration are de facto fixed for extremely long periods of time.

In other words: IP addresses are as personal as a home address, even more so as much more information about behaviour can be traced back via a person's IP.

All of this talk about IP not being personal is based on current improvisations to get people online that are rapidly disappearing. It's a smokescreen (whatever happened to "don't be evil"?), nothing more.

I believe that Google is coming at the problem from the wrong perspective.

The question must be, how can this information be misused and how can Google lead the industry to prevent that misuse. Anything less will be seen as a significant shift away from Google's mantra.

By claiming that IP addresses should not be, by default, protected as personally identifying information, you instantly bring to mind the very public legal settlements and convictions of the last few years in which IP served as key evidence against the defendant. You bring to mind the power of matching wikipedia edits to corporate IP addresses. You bring to mind struggles over open Wifi, and file sharing which, at their heart revolve around the way IP addresses are handled.

It is not feasible to distinguish between IP addresses which are stable and uniquely identifying and those which are not; therefore the responsibility of Google is to treat them all as if they were unique and powerful identifiers. To do otherwise is to sacrifice the privacy of the few on the altar of corporate interests, and I, for one, find that very distasteful.

IP addresses are definitely personal information. As has been pointed out, when coupled with a date+time you can work out which computer had that address, and many computers are either used by one individual, or members of one family. In addition, with always-on connections and modems which retain that connection even while the computer is switched off, IP addresses don't change all that much. The fact that Google, or similar service companies, might not be able to link that to a person without going through the ISP doesn't change the fact that it can be.

Obviously, any server needs to use one's IP address to transfer information to/from the computer in question. However, why would such a company need to store that IP address for years at a time?

There are valid causes for concern, some of which have already been mentioned. By linking together databases of IP addresses with, say, one of many cookie-based automatic logins, a company could easily link whoever is visiting a website with a facebook (or similar) account. It shouldn't be too difficult for a webserver to determine that the IP address currently visiting its site is the same IP address that only recently downloaded some fancy image while logged in as user X in facebook, and then determine that user X is subscribed to certain interest groups or has a certain list of friends. Computers make the collation and storage of such data possible, but it's rarely something that the users are comfortable with.
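The kind of cross-referencing described in this comment can be sketched as a join between two logs on IP address and time. Everything here is an assumption made for illustration: the tuple layouts, the 30-minute window, the `link_visits` name, and the sample records. It does not describe any real company's system.

```python
from datetime import datetime, timedelta


def link_visits(login_events, anonymous_visits, window=timedelta(minutes=30)):
    """Guess which logged-in user is behind each anonymous visit.

    A visit is linked to a user when the same address appears in a login
    event within `window` of the visit. The guess can be wrong whenever
    the address is shared or has been reassigned in between.
    """
    links = []
    for visit_ip, visit_time, page in anonymous_visits:
        for login_ip, login_time, user in login_events:
            if visit_ip == login_ip and abs(visit_time - login_time) <= window:
                links.append((page, user))
    return links


# Invented sample data using documentation-range addresses.
logins = [("198.51.100.4", datetime(2008, 2, 1, 10, 0), "user-x")]
visits = [
    ("198.51.100.4", datetime(2008, 2, 1, 10, 10), "/hobby-page"),
    ("203.0.113.9", datetime(2008, 2, 1, 10, 12), "/news"),
]

linked = link_visits(logins, visits)
```

A dozen lines are enough to tie an otherwise anonymous page view to an account, which is why the commenter treats stored IP logs as sensitive.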

What I am more concerned by is: why would Google store this information?

If it can be used to understand a user (same searches, account, regular switching between three ranges of IPs at work, family and café times, etc.) then it is useful information *because* it is PII, and it makes sense to treat it as such, even if you need to be either Google (thanks to a user model) or the ISP (thanks to logs) to use it as PII.

If IP is not PII, then why keep it? If it is unreliable enough, why would Google need to argue to store it? I remember the main argument behind this blog was that Google needed a reasonable, internationally homogeneous policy on digital matters. There is a decision based on common sense: the EU (like Google, like ISPs, like many due-process proceedings against computer crimes) treats IP as PII. It is not proof (happily for the many unaware owners of zombie machines) but something that can be followed.

Why lobby to prove the opposite? Why not try to make the US match the consensus, and err on the side of a Right to Privacy?

Information deduced from an IP (ISP, approximate location, type of connection) can be non-identifiable. But, unless I missed something, if it is worth arguing for, it sounds like it is too valuable not to be considered.

Arguing that Google should have the right to systematically register our car number plates or IP addresses, because it is not proven that we are behind the wheel or keyboard, is a stupid argument. I have agreed with my political leaders to make it the law that companies do not do that - end of story. My location is not something that you should trace.

Privacy is between me and my legislature - companies like Google meddling with my privacy is alarming. A public company cannot consider the rights of the individual without considering its obligations to its shareholders. They are in conflict - so back off.

Yes, not all IP addresses on their own are directly traceable to specific individuals. However, there are many well-known cases where an IP address alone has been enough to expose somebody who thought they were anonymous. Take a look at Brian Chase, for example.

I hope that the error here is just one of poor expression, and that you promptly take the opportunity to correct this.

I consider IP Address as PII. Google may say that IP Address is just numbers, but if they are logging the IP Address and the time that IP Address made a request, then it can be tracked in the ISP DHCP logs.

Google saves every search term we enter. And also the IP Address that requested that search. That helped police crack a case. I am all for Govt getting access to the info, but cannot trust the folks inside a corporation.

I dislike the way that Google attempts to portray itself as a company that is not evil, but then goes on to try to spin an issue in its favor when it is clearly in the wrong. I respect Microsoft since they don't try to portray themselves as a company doing good.

It's an interesting question and one which I imagine will generate a lot of debate. I think there is a portion of IP addresses that are probably personally identifiable but a large portion that are not.

For example, there is more than one person in our household, so identifying a particular individual from a static IP address (if we had one) on our broadband connection would be difficult. I also think that people tend to use the Internet from different places (home, work, cafe, mobile phone, library) and very few of these are likely to be identifiable to a particular individual. Also, visitors often come to the house and use the computer or ask me to look up information on their behalf.

In comparison, I have one of those supermarket points cards which has an identifier that is identifiable personally to me and this large supermarket chain probably tracks everything I buy at the shops. Does it worry me? A little. Will I stop shopping there because of it? No.

The Data Protection Act isn't to be messed with, nor will any sophistry on Google's part make any difference. You're not dealing with the corrupt US system where waving some money around will get the laws rewritten. Nor will Google get away with gaming the system in any way. Unless people opt in to have personally identifiable data stored by Google, you'd better anonymise it or very bad things will happen. Once you start holding data on people, even with their permission, they have rights to see and manipulate that data and can have Google erase said data at any time. If the Google system is incapable of doing this, it will have to be closed down until such time as it can.

In the UK DP Act, the question is whether the individual is identifiable not from the IP address but rather "from other information in the possession of Google" (e.g. a history of transactions).

Identifiability does not need a name - it can involve something where there is a focus on a particular characteristic (e.g. the user from the IP address 330.09.08.07 is likely to be interested in XYZ because he/she has visited websites P, Q and R).

The analysis here is incomplete in that you can't look at the data alone. You have to consider the context in which the information is used (e.g. are you finding out something about the particular user).

Also the APEC Privacy Framework mentioned in the text is unlikely to protect privacy (http://www.out-law.com/default.aspx?page=8550)

I agree with what has been said before, i.e. personal data in Europe is not just data that identifies a person, but any information that, in association with other information, can lead to the identification of a person. Therefore the whole article is based on a definition of personal data that is not what the law calls for, at least in Europe. AND it is my opinion that definitions of this kind have to be made by the legislator, not by the industry, which can contribute and give its view, but is obviously aiming at reaching the best definition for itself, not necessarily for the persons whose data are being handled.

I'd be stunned at this, but then again, this person gets paid for pretending they aren't destructive of privacy. The argument is wrong throughout, and jumps from "this is how DHCP works, and many people are on DHCP" to "the only people for whom DHCP considerations might not apply are ISPs" and on into:

"the IP addresses recorded by every website on the planet without additional information should not be considered personal data, because these websites usually cannot identify the human beings behind these number strings."

This is not true. If you go to ARIN and look up my home /29, you get my billing name and location.

I'm a networking person and I need to be able to know a couple of things: exactly what IP address I will be coming from, and exactly what equipment is between me and a remote premise. So I use statics.

In five years, many people will be using IPv6 addresses, and perhaps even most people will. IPv6 will not need to be dynamically assigned. IPv6 addresses are intended now, by the engineers, to preserve the possibility of anonymity, but there is no guarantee that as deployed in the US or EU they will.

An argument that draws what little strength it has from a workaround for a weakness in an end-of-life protocol (IPv4), lobbying to establish policy that favors Google now, only to go back later and "compromise" once the landscape of IP addressing utterly shifts: well, disingenuous is the nice long word for that kind of argument.

If you look at all the comments posted so far, they all show that Google is moving in the wrong direction in terms of privacy and that it has no credibility on this topic. I was stunned at reading so many critical comments, and at the same time relieved: people still think with their heads! The lesson to be learned (assuming Google cares) is that they should stop the crap and start acting more sensibly on something we all take very seriously.

Google speaks with forked tongue on this issue. On the one hand, it's okay for Google to track the IP address of anyone who uses its services. But on the other hand it hides the originating IP addresses of people communicating privately using Gmail.

Google's practice of hiding the originating IP address of senders using Gmail is dangerous. It makes it impossible for people in private communications to verify that the person they're communicating with is who they say they are.

And it makes it harder for parents to protect their kids from predators without calling in the FBI or getting a court order. Gmail is great for kids because the spam filter is very effective.

But this practice of hiding originating IP addresses makes no sense, especially when Yahoo! and Microsoft show the originating IPs in their free email products.

I've raised this repeatedly with Google in various ways and have never received the courtesy of a response.

Google values personally identifiable information such as IP addresses. I, like millions of others, would like "full control" of my personal information. Gmail is a problem: it says in its privacy policy that deleted emails may remain in backup. I would like to know for sure, when I delete an email, that Google cannot retrieve it either, because I can no longer retrieve it. With all this information Google could go up against the NSA's amount of information and it would be a good fight. Google needs to clean up/clear up their privacy policy and let users do what they want with the logged information.

Gmail's ULA may say 14, but kids use the service, and the company can hide behind legal documents if it wants, but I didn't think it was Google's way to turn a blind eye to evil.

Second, the ULA does not stop the service being used by people older than 14 who want to do bad things.

Indeed, what possible purpose is there behind Google providing anonymous email? Who benefits? It can only be helpful to those few who want to do bad things.

Do a search on this topic and see who is talking about it most. You'll see that there is a happy underworld that is loving the anonymity that gmail provides. Think hackers, phishers and predators, they're all attracted to gmail for its anonymity.

The rest of us honest gmail users don't even know that the originating IP is hidden -- until something happens and we need to verify a sender's identity. Then we find that we need a court order to find out what the originating IP address of the sender is.

Yahoo! Mail and MSN Hotmail do not have this flaw. Right now, it's a flaw in gmail, but the longer Google ignores it, the more irresponsible the company's behavior becomes.

I cannot understand why the company would put its reputation at risk like this. I would like to think it is just ignorance, but the silence from Google on this issue makes me wonder.

"Considering an IP address personal makes as much sense as making AT&T's main switchboard phone number personal for any given employee."

I don't know how many people live in your house then, because giving away the "switchboard" number for a home owner is the equivalent of giving them your home number. Sure you can leave your wi-fi open. That means they'll just sue you for the information of your "users" or make you pay.

As a little test, post your home IP address on this blog if you don't consider it private.

My IP address is personal. No US judge has the right to my personal data. I am a European citizen who has never been to the US, so hands off my privacy, Judge Stanton. I shall be complaining to the European data protection authorities about this decision, which is in breach of EU privacy law.

Google now has to hand over 20TB of user logs to Viacom. Their only escape is if they print them out and hand them over as hard copy rather than supply digital logs. The format is never detailed in court proceedings as yet. Google however I think has not got the balls for doing that.

So now, while Google (more specifically, the users of YT) is f*cked because of, among other things, THIS specific public expression of opinion about the "personal" factor of IP addresses and an IMO technologically ignorant/incompetent judge, may I ask if Google will do something against the Viacom/YT discovery rulings?

At least with respect to NON-US-based IP addresses (like the 62.226.nnn.nnn ones, for example) in Google's/YT's database?

You know my dear googleanians, in the rest of the world with civilised countries there are data-/privacy protection laws valid that would bring guys like this judge and those viacom vultures behind bars for what they are trying to do!

First, I use a freeware Firefox add-on called TrackMeNot. TrackMeNot runs in Firefox as a low-priority background process that periodically issues randomized search queries to popular search engines, e.g., AOL, Yahoo!, Google, and MSN. It hides users' actual search trails in a cloud of 'ghost' queries, significantly increasing the difficulty of aggregating such data into accurate or identifying user profiles. As of version 0.4, TMN's static word list has been replaced with a dynamic query mechanism which 'evolves' each client (uniquely) over time, parsing the results of its searches for 'logical' future query terms with which to replace those already used.

http://www.mrl.nyu.edu/~dhowe/trackmenot/
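The general idea of hiding real queries among decoys can be sketched in a few lines. This is not TrackMeNot's actual algorithm or code; the function name `decoy_stream`, the fixed word list, and the ratio of three decoys per real query are assumptions chosen only to show the principle.

```python
import random


def decoy_stream(real_queries, word_list, decoys_per_real=3, seed=None):
    """Mix each real query with randomly generated decoy queries.

    Decoys are built from two distinct words drawn from `word_list`.
    The combined list is shuffled so position gives no hint about
    which entries are real.
    """
    rng = random.Random(seed)
    stream = []
    for query in real_queries:
        stream.append(query)
        for _ in range(decoys_per_real):
            stream.append(" ".join(rng.sample(word_list, 2)))
    rng.shuffle(stream)
    return stream


# Invented sample inputs.
words = ["weather", "recipes", "football", "train", "museum", "guitar"]
real = ["ip address privacy law", "eu data protection directive"]
mixed = decoy_stream(real, words, decoys_per_real=3, seed=1)
```

An observer holding only the logged queries sees eight entries and cannot tell from order alone which two the user actually typed. A real tool also has to make decoys look plausible, which this sketch does not attempt.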

Second, a USB flash product, IronKey, provides a rather healthy hardware encryption system for on-board data. But the real trick is IronKey's Secure Sessions Service: your Web communications travel through an encrypted tunnel on the Internet to private network routing servers and eventually out to your destination website with a random IP address. Any return packets are sent first to the secure server, where IP addresses are translated to your IP address (somewhat like NAT), and are returned to your computer via the same encrypted tunnel.

First, it asks whether IP addresses are personally identifiable, then it switches to whether they can be used to determine identity. Unless someone is behind a router where multiple users simultaneously have the same external IP, then given a specific time, an IP address is absolutely identifiable. That doesn't mean it can be used to identify someone in reverse, although it could be used with other information to determine a high probability, but it is still definitively personally identifiable information for that person at that time, unless multiple persons are simultaneously.

What do u mean by this D.?

"Protecting our users' privacy is something we take very seriously. Personal information, including someone's exact location, can be gathered from someone's IP address, so Gmail doesn't reveal this information in outgoing mail headers. This prevents recipients from being able to track our users, or uncover what may be potentially sensitive personal information."

I think CNN uses IP addresses to moderate out comments. I am not comfortable with the IP address being used for any purpose that has to do with screening, identifying, or categorizing anyone. We are losing our privacy at a very fast clip.

I'm a member of a Google site known as YouTube and I think that an IP should be considered personal information. I say this because somehow another user of the YouTube site managed to find my city, state and zip code!! I know for a fact that I wasn't carelessly giving out that information. I never gave out my location to anyone and yet I saw my info on a stalker's trolling account just a few days ago while they were threatening me on the site.

The only real difference between static and dynamic addresses is whether you will have the same address in the future. It doesn't mean it can't be RECORDED somewhere. For example, everybody's cable modem or DSL modem has a MAC address which is in your provider's database (recorded info). Then, when you connect to the network, they probably have to keep a log (government BS, Patriot Act or something like this) on their servers of what IP address was assigned to you, at what time, and for how long. Now, if I post this comment "anonymously" here, and "blogger" records the IP address with my comment, then there is no way that it could be private. The question is, "Who is recording IP addresses for each transaction on web pages and at Internet service providers?" AND, "If the government comes to them to get information, can they?" I'm sure that answer is YES. To be truly "anonymous" or private, the websites and Internet service providers need to NOT log the IP addresses. If they do, we can be sure that the government can access them.

It seems to me that it's no longer important to discuss whether an IP address is personally identifiable or can determine identity. Everybody in this blog disagreed with Google and highlighted that their personal identity is sacred.

If a client contacts a server, that client needs to expose some facts about itself to establish communication, e.g. browser, routers, IP address, etc. In that case it's perfectly reasonable for the server to customize its response based on having a cookie or knowledge of the identity behind an IP address; since the client initiated communication, the server can assume the client gave consent.

If, however, a server were to scan IP addresses, or a Wi-Fi scanner were to store data packets by associating IP address with location and identity, there is no client that gave consent. Doing this is known as identity theft. Taking it one step further and actually training staff at Google to drive around to collect that data can only have two reasons: national security or cyber terrorism. If it is for national security, the prime minister or president of that country should have given consent; if not, it's a terrorist act.

If the police interrogated those Google employees, would they admit that they did this on purpose? This topic certainly indicates Google had intent. Perhaps some Google manager created Chinese walls so one staff member did not realize what the other was doing. I fully trust that Google employees will tell the truth when a judge asks them if they installed those scanners and disks in the car with the intent of national security. I also trust that they will not take the 5th to protect themselves when asked if they knew it's a criminal act to store identifiable data without consent. At the end of the day the motto is: Do no evil