Thursday, February 18, 2010

Alternative routes to identifying "anonymous" online users

David Robinson and Harlan Yu have published a superb series of posts on Freedom to Tinker (1,2,3) about tactics that might be used to identify anonymous internet posters, even where the site hosting the comment has not logged IP addresses. The key insight is that sites typically embed multiple external services (such as advertising, stats counters and video hosting) which may, individually or in combination, enable particular users to be identified:

[P]laintiffs' lawyers in online defamation suits will typically issue a sequence of two "John Doe" subpoenas to try to unmask the identity of anonymous online speakers. The first subpoena goes to the website or content provider where the allegedly defamatory remarks were posted, and the second subpoena is sent to the speaker's ISP. Both entities—the content provider and the ISP—are natural targets for civil discovery. Their logs together will often contain enough information to trace the remarks back to the speaker's real identity. But when this isn't enough to identify the speaker, the discovery process traditionally fails.

Are plaintiffs in these cases out of luck? Not if their lawyers know where else to look.

There are numerous third party web services that may hold just enough clues to reidentify the speaker, even without the help of the content provider or the ISP. The vast majority of websites today depend on third parties to deliver valuable services that would otherwise be too expensive or time-consuming to develop in-house. Services such as online advertising, content distribution and web analytics are almost always handled by specialized servers from third party businesses. As such, a third party can embed its service into a wide variety of sites across the web, allowing it to track users across all the sites where it maintains a presence.

The traceability of any given site visitor will still depend on context: the number of third party services used by the site, the popularity of each third party service across the web, the types of identifying data that these parties collect and store, whether the speaker used any online anonymity tools, and many other site-specific factors.

Despite the variability in third party tracing capabilities, the nearly simultaneous connections to a few third party services means that the results of tracing can be combined. By sleuthing through information held in third party dossiers, logs and databases, plaintiffs in John Doe lawsuits will have many more discovery options than they had ever previously imagined.
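
To make the idea of combining results concrete, here is a rough sketch in Python of how logs from several third party services might be cross-referenced against the time a comment was posted. Everything in it is an assumption made for illustration: the service names, the log format (timestamp and IP address pairs), the two-second window and the addresses (taken from the reserved documentation ranges) are all hypothetical, and real third party logs would more likely involve cookies, referrers and other identifiers.

```python
from datetime import datetime, timedelta


def candidates(log, posted_at, window_seconds=2):
    """Return the IPs one service saw within the window around posted_at.

    `log` is a hypothetical list of (timestamp, ip) pairs.
    """
    window = timedelta(seconds=window_seconds)
    return {ip for ts, ip in log if abs(ts - posted_at) <= window}


def combine(logs, posted_at, window_seconds=2):
    """Intersect the candidate sets from every service's log.

    An address survives only if every service saw it at roughly the
    moment the comment was posted.
    """
    sets = [candidates(log, posted_at, window_seconds) for log in logs.values()]
    return set.intersection(*sets) if sets else set()


# Hypothetical time at which the anonymous comment was posted.
posted_at = datetime(2010, 2, 18, 12, 0, 0)

# Hypothetical logs obtained from three embedded third party services.
logs = {
    "ads": [
        (datetime(2010, 2, 18, 12, 0, 1), "203.0.113.7"),
        (datetime(2010, 2, 18, 12, 0, 0), "198.51.100.4"),
        (datetime(2010, 2, 18, 11, 30, 0), "192.0.2.9"),
    ],
    "analytics": [
        (datetime(2010, 2, 18, 12, 0, 1), "203.0.113.7"),
        (datetime(2010, 2, 18, 12, 0, 2), "192.0.2.55"),
    ],
    "video": [
        (datetime(2010, 2, 18, 11, 59, 59), "203.0.113.7"),
        (datetime(2010, 2, 18, 12, 0, 0), "198.51.100.4"),
    ],
}

print(combine(logs, posted_at))  # {'203.0.113.7'}
```

Each log on its own leaves two candidate addresses; only the intersection narrows the field to one. In practice the outcome would depend on the contextual factors listed above, such as how many services the site embeds and what each one actually retains.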

Of course, these tactics are likely to be expensive. Also, in an Irish context, uncertainty as to whether they will produce a result may make a court less willing to grant a Norwich Pharmacal order, which is a discretionary remedy (PDF) and not something available as of right. Nevertheless, the research is important, particularly as it illustrates that traditional methods of ensuring online anonymity (such as Tor routing) may be vulnerable to indirect attack.