US suspends vast ADVISE data-sifting system

From late 2004 until mid-2006, a little-known data-mining computer system developed by the US Department of Homeland Security to hunt terrorists, weapons of mass destruction, and biological weapons sifted through Americans' personal data with little regard for federal privacy laws.

Now the $42 million cutting-edge system, designed to process trillions of pieces of data, has been halted and could be canceled pending data-privacy reviews, according to a newly released report to Congress by the DHS's own internal watchdog.

Data mining to help fight the war on terror has become an accepted, even mandated, method to provide timely security information. The DHS operates at least a dozen such programs; intelligence agencies and the Department of Defense employ many others.

But ADVISE (Analysis, Dissemination, Visualization, Insight and Semantic Enhancement) was special. An electronic omnivore conceived in 2003, it was designed to ingest information from scores of databases, blogs, e-mail traffic, intelligence reports, and other sources, government documents and researchers say.

Sifting that enormous mass at lightning speed, ADVISE was to display data patterns visually as "semantic graphs" – a sort of illuminated information constellation – in which an analyst's eye could spot links between people, places, events, travel, calls, and organizations worldwide.

Report: DHS didn't follow guidelines

Yet ADVISE, whose existence and scope were first detailed by the Monitor in February 2006, seems to have run afoul of its own ambitious scope. It failed to incorporate federal privacy laws into its system design. From its earliest days, the system's pilot programs used "live data, including personally identifiable information, from multiple sources in attempts to identify potential terrorist activity," but without taking steps required by federal law and DHS's own internal guidelines to keep that data from being misused, the DHS Office of Inspector General (OIG) said in a June report to Congress, which was made public Aug. 13.

In a rebuttal attached to the report, the DHS Directorate for Science and Technology disagreed with most of the OIG's findings. "The ADVISE tool set is little more than an empty framework to which data must be applied," wrote Jay Cohen, DHS undersecretary for science and technology, in a letter accompanying the rebuttal. He said no privacy laws were violated.

Even in searching for terrorists, data-mining programs are supposed to ensure that Americans' personal information is used only when necessary and lawful – and only for specific and proper uses. One problem is that even data that look anonymous aren't necessarily so. For instance, even when names and Social Security numbers are stripped from data files, programmers can still identify 87 percent of Americans through their date of birth, gender, and five-digit Zip Code, researchers say. So a system has to be carefully designed and use encryption and other computer techniques to comply with the law.
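To see why stripping names is not enough, consider a minimal sketch of the kind of check a privacy reviewer might run. It counts how many records in a file are one of a kind on date of birth, gender, and Zip Code alone. The records are synthetic, and the field names and the `unique_fraction` helper are hypothetical, invented here for illustration; nothing in it reflects how ADVISE or the researchers' study actually worked.

```python
from collections import Counter

# Hypothetical, synthetic records: names and Social Security numbers
# have already been stripped, leaving only three "quasi-identifiers".
records = [
    {"dob": "1961-04-12", "gender": "F", "zip": "02138"},
    {"dob": "1961-04-12", "gender": "F", "zip": "02139"},
    {"dob": "1975-09-30", "gender": "M", "zip": "60614"},
    {"dob": "1975-09-30", "gender": "M", "zip": "60614"},
    {"dob": "1988-01-05", "gender": "F", "zip": "94110"},
]

QUASI_IDENTIFIERS = ("dob", "gender", "zip")


def unique_fraction(rows, keys=QUASI_IDENTIFIERS):
    """Return the share of rows whose combination of `keys` appears exactly once."""
    if not rows:
        return 0.0
    # Count how often each combination of quasi-identifier values occurs.
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    # A row is re-identifiable in this sense if no other row shares its combination.
    unique = sum(1 for row in rows if combos[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


if __name__ == "__main__":
    share = unique_fraction(records)
    print(f"{share:.0%} of records are unique on {', '.join(QUASI_IDENTIFIERS)}")
```

In this toy file, three of the five records are unique on the three fields. Dropping the Zip Code from the comparison leaves only one unique record, which is the intuition behind coarsening or encrypting such fields before analysis.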

Last week the Pentagon shut down its TALON terrorism database program, which had been found to hold files on peace activists. In 2003, another military data-mining project – the Total Information Awareness project – was also ended following a congressional uproar over privacy fears.

Congress last fall ordered its Government Accountability Office to audit the program for privacy and effectiveness. It asked the OIG to do the same. In February, the GAO recommended a full-blown data-privacy review of the ADVISE system. Without that, its report said, ADVISE holds "potential for erroneous association of individuals with crime or terrorism and the misidentification of individuals with similar names."

In his report to Congress, publicly released earlier this month, DHS Inspector General Richard Skinner revealed that:

• From late 2004 to mid-2006, three ADVISE pilot programs – one focused on biological threats, another on weapons of mass destruction, and a third classified program to identify emerging threats – were not mere test beds working out technical bugs. Instead, they were "operational" and used "personally identifiable" data, without any privacy-risk assessments having been conducted.

• All three pilot programs were quietly halted in March pending formal privacy impact assessments on the vulnerability of personal data. A privacy impact assessment is a type of information audit meant to ensure that government uses personal information only when doing so is necessary and lawful.

• While submissions were made to begin the process, full-blown "privacy impact assessments" of the three programs did not begin until early 2007 – about two years after they became operational and began hunting terrorists, the OIG reported. It also said the March shutdown to assess privacy implications has damaged ADVISE's prospects, giving rise to skepticism within DHS about the utility and cost of the program and leaving it "at risk" of cancellation by 2008.

Failure to ensure data privacy is a problem that has torpedoed other counterterror programs, says Lee Tien, an attorney with the Electronic Frontier Foundation, a privacy advocacy group.

"The OIG's report clearly shows major breakdowns in the system we're depending on to protect people's private data," says Mr. Tien. "Whatever the data ADVISE used, the outputs are clearly important for people's privacy. The biodefense pilot program, for instance, presumably involves information about people's medical condition and emergency-room reporting."

Confusion within department's ranks

DHS's delay in addressing data privacy appears to be due to confusion and miscommunication about privacy requirements by ADVISE program managers and DHS's privacy office, amid the rush to get a system running, the OIG says.

For example, ADVISE program managers told OIG investigators they didn't realize privacy assessments were required for a system still in development. At that stage, the system was just a processing tool without data, they argued – a view agreed to by the DHS privacy office.

Indeed, the privacy office mentions the ADVISE system only once, in a footnote, in its mandatory report last summer to Congress on data-mining activities. Until the "ADVISE tool" had data attached to it, it was not a data-mining program needing privacy review, the office reported.

Unknown to the privacy office, the ADVISE pilot programs had been operational and using personal data for about 18 months before the privacy office made that report to Congress, the OIG found.

DHS has not reported how much and what type of personal information was used. One senior DHS official, who agreed to speak only on condition of anonymity, says of the personally identifiable data used by ADVISE: "We have no idea what information or how much was used."

Larry Orluskie, a spokesman for the DHS science and technology directorate, says a DHS privacy office review of ADVISE last month corroborates the OIG finding that ADVISE "was maybe too zealous in its testing."

Even so, he says, the ADVISE system is back on track, though he is unsure whether the privacy assessment is complete or whether operations have resumed. A request for interviews with Undersecretary Cohen or other ADVISE officials went unanswered.

One conclusion, however, is that the privacy failure has cost ADVISE dearly. Despite some early successes – ADVISE's weapons of mass destruction pilot program identified a link between organized crime and terrorism – the failure to abide by privacy laws and the costs of compliance have now reduced interest in ADVISE within DHS, the OIG reports.