The EFF digs deep into the FBI’s ‘everything bucket’

Published May 2, 2009

A new EFF report pulls together everything that’s now known about the FBI’s monster internal records system. By Jon Stokes, April 30, 2009, arstechnica.com

Earlier this week, the EFF published a new report detailing the FBI’s Investigative Data Warehouse, which appears to be something like a combination of Google and a university’s slightly out-of-date custom card catalog with a front-end written for Windows 2000 that uses cartoon icons that some work-study student made in Microsoft Paint. I guess I’m supposed to fear the IDW as an invasion of privacy, and indeed I do, but given the report’s description of it and my experiences with the internal-facing software products of large, sprawling, unaccountable bureaucracies, I mostly just fear for our collective safety.

The idea behind the system, which the FBI has been working on since at least 2002, is that the Bureau can dump all of its information in there so that it can be easily searched and shared. IDW contains more documents than the Library of Congress: a stew of TIFFs with OCRed text, multiple Oracle databases, news streamed in from the Internet, reports and records in various in-house data formats, watch lists, telephone data, and an alphabet soup of smaller databases and records repositories, all accessible as one sprawling system that processes batch jobs, runs queries, and issues alerts. In short, the IDW is an “everything bucket” for the FBI.

Complicating the picture is the fact that some parts of the system are classified as “secret,” while others aren’t. I’m sure the entire thing is a joy to use.

The EFF’s report is based on information obtained over the past three years through litigating a FOIA request; the organization didn’t get everything it wanted from the FOIA, but it got quite a bit. Some of the e-mails obtained are bureaucratic classics, in which correspondents are fussing over phrasing to be used when testifying before Congress so as to give the proper impression (e.g., that they care about privacy) and generally stay under the radar.

Ultimately, the EFF still doesn’t have a complete picture of all of the data sources that have been added to the IDW, but the group is pretty clear on the direction in which the expanding database is headed: data mining for the purpose of catching bad guys before they commit crimes or acts of terror.

Last year I wrote a pretty detailed explanation of why these attempts to use data mining to catch bad guys before the fact are all doomed to fail, based on a National Research Council report that made the same point, so I won’t recap that here. It suffices to say that the precrime stuff does not work, and will never work, and the government should take the money it spends on these projects and hire linguists and other human agents instead.