Blogs and Privacy

Over at Civilities, Jon Garfunkel points out that many blogs that use Site Meter and other third-party visitor tracking services are publicly displaying a lot of information about their readers: IP addresses, domains, location information, referring URLs, and if they came to the site via a search engine, the search terms that took them to the blog. In his post, Jon notes that he tried an experiment at a blog that uses Site Meter and linked up IP addresses to specific anonymous or pseudonymous comments. That blog was ours.

Our Site Meter stats are public. I really like allowing our readers to see our traffic stats, referring URLs, the search term queries that bring readers to our blog, and the location and domain information.

But Site Meter also lists the IP address of each visitor, something that the public really doesn’t need to see. An IP address is a unique numerical identifier that is assigned to every computer connected to the Web. It doesn’t reveal your name, but it can be used to trace back to the specific computer you used or be linked to your account with an ISP. In other words, your IP address can be used to find out who you are.

A sample stat entry looks like this (I’ve blocked out part of the IP address):

Our Site Meter tracks the last 4000 visits to our blog, which are publicly displayed at the Site Meter page for our blog. Entries beyond the last 4000 are no longer publicly displayed. For our blog, which is getting about 3000 to 3500 visits per day, information about your visit to our blog is displayed for a little over a day.
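The "little over a day" figure follows from simple division. As a rough sketch (the function name is made up for illustration, and it assumes traffic arrives at a steady daily rate), the arithmetic looks like this in Python:

```python
def retention_days(buffer_size: int, visits_per_day: float) -> float:
    """Approximate days a visit stays visible in a fixed-size public log."""
    return buffer_size / visits_per_day


# Figures from the post: a 4000-visit log and 3000 to 3500 visits per day.
for rate in (3000, 3500):
    print(f"{rate} visits/day: about {retention_days(4000, rate):.2f} days")
```

A busier blog cycles through the same log faster, so each visit is exposed for less time.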

Jon’s post has made me think much harder about blogs and privacy. Beyond Site Meter, our blogging software logs the IP addresses of commenters. We could conceivably be subpoenaed in a civil or criminal litigation to turn over IP address information about an anonymous or pseudonymous commenter. When posting a comment, our blogging software also asks for your email address, which is not published with your comment but which is recorded in our system and made available to us. You can provide a fake email address, but if you provide your true email address, this could also be of use in identifying you if subpoenaed.

So all this made me realize that we do have some data about you and we need to construct a privacy policy. Regarding Site Meter, bloggers who use the premium Site Meter service (which we use) display full IP addresses in their public stats. Those who use the free version of Site Meter have the IP addresses partially blocked out in their public stats. Site Meter has an option to conceal all the stats, but it doesn’t allow for only concealing or partially blocking IP addresses. The choices are to publicly display everything or conceal nearly everything.
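To make the idea of partial blocking concrete, here is a minimal Python sketch that hides the last number of an IPv4 address. It is only an illustration and not Site Meter's actual code: the function name `mask_last_octet` and the `#` placeholder are assumptions, and the sample address comes from a range reserved for documentation.

```python
import ipaddress


def mask_last_octet(ip: str, placeholder: str = "#") -> str:
    """Replace the final octet of an IPv4 address with a placeholder."""
    # IPv4Address validates the input and raises ValueError if it is malformed.
    addr = ipaddress.IPv4Address(ip)
    octets = str(addr).split(".")
    octets[-1] = placeholder
    return ".".join(octets)


print(mask_last_octet("192.0.2.57"))
```

Masking one octet still leaves the network portion visible, so it narrows a visitor down to a block of addresses rather than hiding them completely.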

I contacted Site Meter to see if IP addresses could be partially blocked in public stats on a premium account. A person at Site Meter informed me that they currently cannot do this, but that they hope to add this capability in the near future. Right now, then, the best solution to this dilemma is to downgrade from the premium to the free Site Meter, as this will allow us to have public stats with partially-blocked IP addresses. We are currently considering doing this at Concurring Opinions, as well as removing Extreme Stats, another service similar to Site Meter that we use, which also displays IP addresses.

Many of the blogs I visit have public stats, so if you’re a regular blog reader, your IP address and other information are being logged and publicly displayed across the blogosphere.

Some questions:

I. YOUR ATTITUDES TOWARD PUBLIC STATS

1. Do you find our public visitor stats via Site Meter to be useful? If so, why?

2. Do you find it problematic for your IP addresses to be publicly displayed in Site Meter and other visitor tracking services?

II. YOUR KNOWLEDGE ABOUT WHAT INFORMATION WE HAVE ABOUT YOU

3. Did you realize that when you visit our blog and others, your IP address and other information are publicly available in our Site Meter logs?

4. Did you realize that when you make an anonymous comment on our blog, it is possible to link up your IP address with your comment via Site Meter stats?

5. Did you realize that when you make an anonymous comment on our blog, our blogging software records your IP address, which could be subpoenaed?

III. YOUR THOUGHTS ABOUT POLICY

6. Should we continue on as usual (public Site Meter stats with full IP addresses)? Or should we block full IP addresses from public view?

7. If there’s a tradeoff between having public stats with full IP addresses and no public stats at all, which of these options would you prefer?

8. What should our policy be if we are requested by others or subpoenaed to provide identifying information (an IP address or email address) for an anonymous or pseudonymous commenter?

22 Responses

I’m open to suggestion, and my own views aren’t really calcified. My intuitions are along these lines:

First, I think that keeping relatively detailed information about commenters is important, on an internal level. We don’t get a _lot_ of trolls, but they do exist. As a blog administrator, it’s useful to be able to see if a hostile, personal-attack comment comes from the same address as other comments like that. Every now and then, we have to ban someone for consistently being hostile and disruptive on the blog. Our ability to effectively moderate the comments would be severely undercut if we didn’t keep good records.

On a related note, I worry about sock puppetry. We haven’t been hit by that here (have we?). On my other blog, we’ve had a few run-ins with sock-puppet trolls. These are really not fun. These people post comments under new names, agreeing with themselves, making their position look better. (We actually had one person post a comment under one name, and then, under another name, write “I really liked so-and-so’s comment.”) These trolls, if they’re smart at all, will vary the e-mail address they put. The only real way to catch them is by IP address. (If they’re really smart, they’ll post from one identity at work and another at home. This makes them effectively undetectable. It requires a lot of work to keep up that level of commenting division over time, though.)
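The kind of check described here can be sketched in a few lines of Python. This is a hypothetical illustration, not a feature of any particular blogging software: the `(name, ip)` record layout, the function name, and the sample data are all assumptions.

```python
from collections import defaultdict


def names_sharing_an_ip(comments):
    """Return {ip: sorted names} for every IP used by more than one name."""
    seen = defaultdict(set)
    for name, ip in comments:
        seen[ip].add(name)
    return {ip: sorted(names) for ip, names in seen.items() if len(names) > 1}


# Made-up comment records using documentation-only addresses.
comments = [
    ("Joe", "192.0.2.10"),
    ("Joe's Fan", "192.0.2.10"),
    ("Anon", "198.51.100.4"),
]

print(names_sharing_an_ip(comments))
```

A shared address is only a hint. People behind the same office or campus network can appear under one IP, and as noted above, a careful troll who posts from two locations would not show up at all.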

So, I’m absolutely in favor of retaining such records as are necessary or useful for us as blog admins.

At the same time, I don’t think we need to show that detailed information to visitors. There’s no need for our visitors to see anything but the general contours of traffic. It may be interesting to see a few more details here and there on an aggregate level — that some percent of our traffic comes from .edu, for instance. But visitors don’t need to see referral details or IP details.

And there is a real potential concern about those. Given enough time, one could potentially start trying to identify people. Most IP addresses are limited. But when you add in content — if you see that one person looked at Deven’s last post between 2 and 3 yesterday, and a comment shows up at that time, you know the commenter’s IP. If that comment is anonymous but says, “I teach contracts,” and comes from the William and Mary server — suddenly you’re narrowing it down to one or two possible people. (This is less possible for popular posts that get a lot of views, but still a potential concern.)
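To show how little effort this kind of matching takes, here is a rough Python sketch. Everything in it is hypothetical: the `Visit` record, the 30-minute window, and the sample entries are assumptions for illustration, not Site Meter's actual data format.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Visit(NamedTuple):
    ip: str          # visitor address as shown in the public stats
    page: str        # page that was viewed
    time: datetime   # when the visit was logged


def candidate_ips(visits, page, comment_time, window=timedelta(minutes=30)):
    """Return the IPs that viewed `page` within `window` before the comment."""
    return sorted({
        v.ip
        for v in visits
        if v.page == page and timedelta(0) <= comment_time - v.time <= window
    })


# Made-up log entries using documentation-only addresses.
visits = [
    Visit("192.0.2.10", "/post-a", datetime(2000, 1, 1, 14, 5)),
    Visit("198.51.100.4", "/post-b", datetime(2000, 1, 1, 14, 10)),
    Visit("203.0.113.9", "/post-a", datetime(2000, 1, 1, 9, 0)),
]

print(candidate_ips(visits, "/post-a", datetime(2000, 1, 1, 14, 20)))
```

On a quiet post the candidate list can shrink to a single address, which is exactly the concern. On a popular post the list is longer and the match is weaker.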

So, I’m in favor of limiting what information visitors can see.

What if Site Meter can’t make that happen? That seems unlikely; this looks like a simple oversight to me, not a deliberate problem. But assuming that they don’t change anything, I guess we go to another tracking service. I like Site Meter, but the issue is out in the open now, and so I’d say we need to address it.

As for the anonymous/pseudonymous commenter, that’s an interesting question. I don’t know if I favor a blanket policy in either direction.

I don’t think we owe a duty to commenters to always safeguard anonymity. If a commenter is using an anonymous handle to harass or abuse others, I have no problem exposing that. I wouldn’t necessarily wait for a subpoena, either. Anonymity has a dark side, and I really don’t want to become another XOXO by enabling anonymous abuse.

On the other side of the spectrum, there are instances where I would be willing to fight, hard, to protect anonymity. In particular, discussions where the commenter is particularly vulnerable, such as a discussion of sexual harassment where a commenter anonymously discusses the fact that she was sexually harassed in the past.

As for cases in the middle, attempts to find out who “Joe” or “Anon” are, where there’s neither abuse by the commenter, nor special vulnerability — I’m not sure where I stand. I’ve got no problem fighting a subpoena — we’re all attorneys, and filing a motion to quash is something any of us could do easily. We’re not going to have to spend big bucks on legal representation. But, I also don’t feel an urge to fight in every case.

As for requests from others — I’m happy to share information with other blog admins if it’s related to troll abuse. For instance, if Eugene Volokh or Dan Markel or whoever e-mails me and says, “I’ve been getting a lot of hostile / abusive / trolling comments from a commenter named Joe at IP 1.2.3.4. Have you gotten any from him?” — I’d be glad to look it up, and discuss the issue with the other blog admin.

I do think that any lookup and information sharing should have a legitimate purpose related to blog management. It’s wrong to say, just for the fun of gossiping, “hey, Dan Markel, do you know who anon123 is? It’s actually so-and-so.”

Since sitemeter does most of its tracking work the instant one enters the site, I suppose the privacy policy cannot entirely protect someone. (Granted, it can inform them about what information is recorded and encourage/discourage them from returning or from commenting.) Given this problem, however, it seems like a prudent surfer should make the default assumption that the sort of information sitemeter detects is always recorded and available to others, until he or she learns otherwise. If that’s the prudent default position, it makes the use of sitemeter at full strength a bit less problematic.

I’ve been a passive reader of Concurring Opinions since you all started. I’ve always found it mildly–amusing? ironic?–that a privacy advocates’ blog would reference so much JavaScript hosted by third parties (presently, on this page, there is JavaScript from blogads.com, amazon.com, technorati.com, sitemeter.com and embedded local JavaScript for extreme-dm.com).

Personally I use Firefox’s NoScript extension to control what JavaScript my browser downloads and runs. I’ve marked sitemeter.com as forbidden. Since NoScript’s default policy is to block unless explicitly allowed, I don’t run any of the others (though with a single click I can temporarily–for the duration of my browser’s current session–run them). I probably do get bit by the web bug image from extreme-dm.com, and naturally I know that your server’s access logs will be recording my IP address.

My general response to this post is that for casual surfing, this kind of information is routinely collected by sites. You can block certain information by disabling JavaScript etc., or go through anonymous proxies (e.g., Tor) for more privacy. But it’s not privacy-friendly to republish this kind of information with full IP addresses. Even though this person may not actually (still) be at this IP address, we can find out more things, such as that this IP address currently has an open Telnet server port (and possibly others). In theory, with a full IP address available, one could potentially do some hacking. Not that one couldn’t do this to any random IP address…

I fully expect sites to log information about visitors, for a variety of reasons. Sites should have a clear privacy policy. But I don’t see why detailed stats are needed by the blog readers. It’s interesting at most to see general trends.

There are certainly easy ways to increase one’s privacy and anonymity on the web, to protect from hacking, but perhaps more importantly to protect general privacy. Routinely changing IP address (I do that daily), blocking 3rd-party cookies, blocking known 3rd-party tracking servers via a hosts file, using anonymous proxies and/or onion routers, etc., etc. Unfortunately, most surfers are not that savvy.

A very interesting conundrum, and a good example of what Glenn Reynolds referred to yesterday as “spying back”. I use the Premium Site Meter and do find it to be very helpful. The IP addresses are very effective in blocking stalkers and unwanted spammers, perhaps the most effective means, for it means that for the spammers and stalkers to continue their mischief, they would presumably need to change computers every time they intend to continue their spamming. One would think that eventually the computers to which they have access would diminish…

On the other hand I can also see where it could become a privacy concern, particularly in the hands of a Socialist Administration bent on, for example, auditing or otherwise using heavy handed tactics against its political opponents. My current plan is to continue using it, but should events in our country turn in a more unfortunate direction, I can certainly see a day when another tool might be preferable.

We went back and forth on this issue when we started Stubborn Facts last year. In the end, we settled on allowing all visitors to see our SiteMeter stats. If we had an option to hide the IP addresses of our visitors from the rest of the world, we would probably exercise that, but we don’t have any serious concern about the display of the data, in the end.

It’s crucial for site owners to have this data, because it is sometimes necessary to ban access by stalkers, trolls, and other undesirables who would otherwise destroy the community you’re trying to create.

I support the right to post anonymously, but I think that maintaining anonymity is really up to the end user. If you really need privacy, then you need to use a proxy server and take other steps to assure your privacy.

I’ve found the public sitemeter stats of our own and other blogs very helpful. Blogs are often linked together by overlapping groups of regular readers and commenters. Occasionally, a psycho/stalker/troll type will follow from one blog to another in the group. Comparing sitemeter stats makes it much easier to identify the troll. Our own site has good logs and blocking tools, but not all blogging systems make it that easy to identify the IP address of a specific commenter.

I would add, in a related vein, that the privacy concerns you raise are greater for the smaller sites. Even with the premium sitemeter, if you’re looking at the Instapundit site, for example, you’re looking at the past few minutes of traffic only. With the free sitemeter account, the 100 visit limit keeps track of only the past few hours, minutes, or even seconds of very popular blogs.

As for subpoena policies, my own reaction would likely depend on the nature of the comment the government was seeking to unmask. If I could see that there was a decent reason for the subpoena, I’d probably comply, providing notice only if required by law. But if I thought the subpoena was merely part of an illegitimate attempt at harassment or intimidation, then I would do my best to notify before compliance to allow the anonymous commenter time to challenge the subpoena.

By the way, the coolest thing about Sitemeter stats is when you see that, on the day you’ve posted a “separated at birth?” post comparing Ruth Bader Ginsburg and Willie Wonka, you receive a visit from a computer belonging to the Supreme Court of the United States. I love having that ability, and it’s always exciting when you see that people you are writing about are reading what you said about them.

I don’t find any use in seeing other people’s visitor stats, or even those of my own site. I don’t wish my IP address publicly displayed.

I was not aware that the tracking information you gather was publicly available. I did know that the connection could be made through your web server logs, but I didn’t know your blogging software collected the data as well.

I would prefer that no site retain records of its visitors. I would suggest you implement a policy of periodically purging those records, and to whatever extent possible avoid providing records on anonymous or pseudonymous commenters. If I want to be identified, I’ll identify myself.

On my own blog, I find it useful to be able to track users by IP, and I use a commenting system (Haloscan) that lets me block IPs and IP ranges, which is once in a while necessary when I get a troll. Which I did very recently in a rather upsetting way.
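For readers curious what blocking "IPs and IP ranges" amounts to, here is a minimal Python sketch using the standard `ipaddress` module. It is not Haloscan's implementation: the block list, the function name, and the addresses (all from documentation-only ranges) are assumptions for illustration.

```python
import ipaddress

# Hypothetical block list: one single address and one /24 range.
BLOCKED = [
    ipaddress.ip_network("203.0.113.7/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(ip: str, blocked=BLOCKED) -> bool:
    """Return True if the address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocked)


print(is_blocked("198.51.100.77"))  # inside the blocked /24 range
print(is_blocked("192.0.2.1"))      # not on the list
```

Blocking a whole range catches a troll whose address changes within one provider's block, at the cost of also locking out any innocent readers who share it.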

I’ve considered using anonymizing software to disguise my location, but never got around to it. I suppose I should though, because if anyone ever cared enough to unmask me (not that I’m trollish!), they easily could whenever I comment on another site with public sitemeter, such as yours. Or link up comments I make under the pseudonym with comments I make under my real name (which I do on law blogs in my area of research).

I suppose someone could check now, except that I’m visiting family, and for some reason our ISP broadcasts from a different location anyway. I think the servers run through Los Angeles.

In any case, I make my Sitemeter public (but for no particular edifying reason; I could just as easily make it private). And I am aware that I am not entirely anonymous or protective of my privacy when I participate on other blog forums, but that I could take steps to protect my privacy if I really tried. Of course, this is an “opt out” strategy, which is, I admit, a little interesting coming from privacy advocates.

Perhaps your stats could be made public (it is interesting to note how many people read your blog on a daily basis, if not from where or who or how they arrive there), but not broadcast IPs? Or perhaps the interesting details could be kept to you and your co-bloggers for site-management purposes, but not be made public.

In any case, I should probably think a little more about how much information I leave behind on other blogs. I’m already the worst-kept secret in the blogosphere!

Privacy is the site visitor’s responsibility. If I don’t want you to know who I am, I use anonymouse, and turn off JavaScript entirely (Sitemeter has to have JavaScript enabled to pick up referral info).

I believe it’s a poster’s responsibility for his or her own privacy. A website (this one or any other) can do whatever they like; if one really wants privacy, one can have it.

As an aside, anyone who pays for sitemeter must have a tremendous ego. There’s no need for that, really, unless you have vanity issues. You can make your stats invisible to the public if you’re concerned (just show a number, folks); that should be sufficient.

I think the call for increasing privacy will come as a result of actual crimes committed against commenters.

In this respect this issue resembles the debate over the installation of a rural traffic-light: The question usually boils down to, “How many people have gotten hurt (or worse) at the subject intersection?”

If no one’s daughter has died there yet, the installation usually must wait ’til she has. Same goes, I think, for the installation of rules, or “safety barriers” for users of the world wide web.

Has anyone tallied the “victims” from this virtual intersection? Are there any data to work with?

I don’t think people realize that with many blog commenting systems, the blog owners get a lot of information about the posters as well. Posters should be reassured that this information will not be abused in any way. It’s nice of you to point that out.

Most of this stuff is common sense, and short of threatening the president or some other person, it’s pretty hard to commit a crime requiring subpoenas of IP addresses from a blog (I can’t think of an instance of this happening, can you?). But it would be reassuring, especially on blogs with lots of anonymous commenters for people to develop a privacy policy. That way people can feel safe to write whatever without the worry of being “outed” and chastised.

For the poster that thinks he can remain anonymous by using anonymizer, he is wrong. A subpoena will remove any anonymity he has. Also, web logs will leave clues that can be triangulated with logs from other blogs. And if he mentioned locations and activities in comments, he has provided a whole history with detailed points about himself that can be matched across blogs.

Drat, Rose! You’ve gone and uncovered our nefarious money-making scheme. Yeah, that was us. We sell all of our commenters’ e-mail addresses to the Huckabee campaign. In fact, most political insiders credit this blog with being the major reason for Huckabee’s recent surge in popularity.

MarkH raises an interesting point — under what circumstances are we actually likely to be subpoenaed?

Off the top of my head, I can think of a few different categories, and I’d be initially inclined to treat them somewhat differently.

First, we could be subpoenaed in a relatively morally unambiguous criminal case. For instance, someone could anonymously comment at Co-Op, “I just robbed the bank at State and Main,” and then the police could seek that commenter’s information. That seems relatively unlikely to happen. It also seems pretty easy. I wouldn’t have any problem handing over information on a bank robber.

A second scenario could be a criminal case where the morality is less clear-cut. For instance, one of us might write a post about Raich, the medicinal marijuana case. An anonymous commenter might then say, “I live in a state which has no medicinal marijuana provision (and of course, Raich held that the U.S. can still prosecute). However, I regularly purchase marijuana for my dying cancer-stricken grandmother, to ease her pain.” That’s an admission of a criminal act, but it’s much less morally clear-cut than the bank robber. My initial inclination would be to fight a subpoena in that kind of case — or at least, not to just roll over. I don’t know how much of a fight my co-bloggers would want to put up.

A third case is the morally clear-cut civil case. A clear-cut defamation, for instance, where an anonymous commenter writes “Person X is a child molester,” and then Person X sues for defamation. Again, if it looks like just a gratuitous, defamatory attack, I wouldn’t be averse to providing information.

But then there’s the fourth case — the less clear-cut civil case. A commenter says, “Professor Z’s views on the Constitution show that he is a fool,” and Professor Z sues and wants identifying information.

The Luskin/Atrios lawsuit is an example of category four, I think. Atrios made a harsh statement about Luskin’s political views, and Luskin sought to unmask Atrios in a subpoena.

It’s entirely possible that a commenter here will write, “[political commenter] is a fool,” or the like, and that the political commenter then sues for defamation and seeks to identify the commenter.

My offhand reaction would be that, unless I was convinced of the rightness of the suit, I would be inclined to resist providing that information.

In general, though, it’s something that I’m sure we would discuss among the bloggers, and I’m not sure that all of my colleagues would always agree with my own views.

While it is true that Sitemeter reveals a number of statistics about web visits, it does NOT reveal a user’s specific IP address. Take a look at the details of visits and you’ll see that the last decimal is replaced with a “#.” But then, I suppose you could say they would have your “approximate” location, give or take 254 other users with similar IP addresses.

It is my understanding that standard sitemeter only provides 3/4 of the IP address and you have to get the upgraded version to get the full IP. Don’t know if that has changed.

Also the upgrade version has a larger buffer of the past visitor history.

Even at 5000 visits remembered with the upgrade version, on a very busy blog it doesn’t take long for a full buffer to start being overwritten.

I have also seen some commentary on how they determine site visits and page views. There is no real industry standard defining these, which matters for advertising cost structures from blog to blog.

If you are running a blog at a blog aggregation place rather than on a dedicated server you are stuck with one of the middleware counters since you don’t have access to the server logs which can have much more detail than sitemeter could even dream of.

Besides spammers, the worst offense I have seen is on some very busy blogs that don’t require registration, where the software has a hole that lets you post under the same name as one of the regulars and start bogus thread wars, saying stuff under two or three names that pits regular users against each other until they catch on to what has been done.

I fail to see why that’s an issue, frankly. What’s the big deal about this stuff? This is supposed to be a method of suppressing free speech. I say ‘nonsense’. Look, friends, one of the qualities of free speech, as the founders envisioned it, is having the courage to sign your name to something, and to have your name thus associated with the ideas you express. We do the ideal of free speech no favors when we facilitate hiding behind some electronic wall or other.

Stop whining, America, and have the courage to stand up and be identified with your words and ideas.

Dennis wrote: “While it is true that Sitemeter reveals a number of statistics about web visits, it does NOT reveal a user’s specific IP address.”

This is something that originally perplexed both Dan and me. I explained it in my article, but it appears Dan didn’t quite spell it out explicitly. JustADude explains above: when you pay, you not only get the full IP address yourself, but everybody else does as well. This seems like a bug on Sitemeter’s part, and it violates the spirit of their Privacy Guidelines. They didn’t concede as such to Dan or me.

C. Lee Davis wrote: “I don’t find any use in seeing other people’s visitor stats, or even those of my own site. I don’t wish my IP address publicly displayed.”

Well, yes, this data has been out there all these years and no one thought to look at it. I only got the idea after reading Dan’s book (and meeting him) and re-visiting some thoughts I’d long had on anonymous comments. And I provided examples (as Kaimi does here) where a third party would be tempted to peek at the IP logs.

Bithead wrote: “Stop whining, America, and have the courage to stand up and be identified with your words and ideas.”

I agree with you, Bithead (anyone see the irony here? click the link, scroll down the blog, you find out Bithead’s real name, apparently). I stated in my analysis that I happen to think that named communities on the Internet are generally more civil and constructive. But, just as well, people often feel more free to talk when anonymous.

That said, at the Yale ISP conference, Alessandro Acquisti of Carnegie Mellon presented some fascinating research which showed that the more detailed a privacy statement, the more people were primed to think about privacy, and thus they felt less free to speak openly.

anon wrote: “I believe it’s a poster’s responsibility for his or her own privacy.”

Well, I can’t contest what you believe, but it’s a question what works for a given forum/blog. I don’t know how many Internet users use, or are familiar with, IP-masking tools. I think that Dan has covered this in his book– and forgive me for not citing it chapter & verse– that people have reasonable expectation of privacy from a service provider, and can’t always be expected to be technically proficient to “opt out” themselves. Christopher Caldwell made a similar point in today’s NYT Magazine.

I agree with you, Bithead (anyone see the irony here? click the link, scroll down the blog, you find out Bithead’s real name, apparently)

Correct, you do. You also see it in the blog’s address. And on the “About” page, as well. (Looks like this thing won’t accept a link in the comments, sorry. “About” is at the top of any page on my blog.) No irony involved whatsoever. “Bithead” is a nick I’ve used for many years, but I’ve never made a secret about who I am. Nor have I hidden behind a name that wasn’t mine, ever.

But, just as well, people often feel more free to talk when anonymous.

No doubt. But of course that’s because they need not consider closely what they say. Considerations of repercussions of violence and whatnot aside, it’s harder to be labeled an idiot when they don’t know who you are. Thereby you need not think things through before speaking in such a situation. Somehow that doesn’t strike me as furthering the cause of freedom, or free speech.