How FBI technology woes let Fort Hood shooter slip by

Or, "How not to build a data warehouse."

A "crushing volume" of data

Both DWS and EDMS had been running beyond their intended capacity for years. The systems lacked enough disk storage, had no disaster recovery capabilities, offered inadequate data security to share data outside the FBI, and had a severe shortage of server horsepower. And the FBI knew it. In an August 2006 report (PDF) justifying fiscal year 2008 budget requests to the Office of Management and Budget, FBI officials wrote:

While providing significant tactical value, EDMS cannot continue to support the FBI's counterintelligence and counterterrorism mission objectives as it currently exists due to the increase in data collection volume and user base. Since Oct 2004, EDMS experienced a 300 percent increase in average users per month. Over the past three years, the volume of ELSUR collections has grown over 62 percent for audio wiretaps and over 3,034 percent for digital collections (e.g., e-mail, seized media). The current system is unable to scale and meet these growing demands. Due to the increased burden, the ability to share ELSUR data and collaborate efficiently with other authorized federal, state, local law enforcement, and federal intelligence agencies will no longer be feasible unless the proposed enhancements are implemented.

Back in June 2007, DWS and EDMS had 1,600 users within the FBI. The two systems handled over 70 million “products” (e-mails, chat sessions, audio, and attached files) and tracked 16,500 e-mail, instant messaging, and Web accounts between them. After the systems were combined in 2008, things just got worse: by mid-2009, the merged system exceeded 3,000 users, 350 million tracked “products,” and 50,000 tracked accounts.
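To put those figures in proportion, the short Python sketch below computes rough growth multiples from the numbers quoted above. The dictionary names are illustrative only, and because the reported values are themselves rounded or stated as "over" and "exceeded," the results are approximations rather than exact measurements.

```python
# Figures as quoted in the article; all are rounded or stated as floors,
# so the ratios below are rough approximations.
june_2007 = {"users": 1_600, "products": 70_000_000, "accounts": 16_500}
mid_2009 = {"users": 3_000, "products": 350_000_000, "accounts": 50_000}

# Growth multiple for each metric between June 2007 and mid-2009.
growth = {metric: mid_2009[metric] / june_2007[metric] for metric in june_2007}

for metric, multiple in growth.items():
    print(f"{metric}: roughly {multiple:.2f}x")
```

In other words, the user base nearly doubled while the volume of stored products grew about fivefold and tracked accounts about threefold, all on the same aging infrastructure.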

And the integration had been a bit "iffy." A new graphical user interface was developed to make the joined systems simpler to access, but the older system (renamed "DWS-EDMS Classic") remained in use in many field offices. Because of the underlying structure of the DWS database and the limited functionality of the user interface, finding e-mails of interest was a little like checking the world's biggest e-mail inbox with Microsoft Outlook. Actually, it was a lot like that. The GUI, according to the Webster report, was based on Outlook.

When a user logged into DWS-EDMS, the "home page" of the application displayed system-wide announcements and a list of the user's active cases. From there, the application's main screen displayed headers for the "products" associated with a case in a column similar to an Outlook inbox display; selected documents opened in a panel to the right. A filtering tool allowed the user to filter content displayed in the "inbox."

That inbox contained what the FBI San Diego field office analyst working on the Aulaqi case called a "crushing volume" of information. Between the first message sent by Hasan to Aulaqi (December 2008) and the last (June 2009), the agent and analyst assigned to the case reviewed 7,143 documents—between 65 and 70 on an average day, with as many as 132 documents on peak days. Getting through that volume of data consumed astounding amounts of time. The analyst spent about 40 percent of his total time reviewing documents for the Aulaqi investigation, while the agent assigned to the case spent about three hours each day reviewing documents.

Much of that time was spent simply trying to get the data out of the system. According to the Webster report, the search tools in DWS-EDMS "were not designed for and do not provide effective assistance for the review and management of massive collections of information, like the collection in the Aulaqi investigation." Because of the way the underlying database was designed, the DWS-EDMS search capabilities were crippled at best.

DWS-EDMS offered Boolean searches of document text, searches by "participant" (the specific e-mail addresses being sought), keyword searches, and a limited full-text search. Results could vary widely depending on the search strategy used. For example, the Webster commission found that a full-text search using Hasan's AOL e-mail account retrieved only half of the messages in the system, while the "participant" search brought up all the e-mails from his account. Keyword searches didn't include synonyms or variations, returning only documents with an exact match. And DWS-EDMS lacked any cross-investigation full-text search. Had there been one, a search on Nidal Hasan's name would have brought up an e-mail captured in an unrelated investigation that gave Hasan's military e-mail address and tied him to the Walter Reed Army Medical Center.
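A toy example shows why the choice of search strategy matters so much. The Python sketch below is not based on the internals of DWS-EDMS, which the Webster report does not describe at this level of detail. The sample messages, the `VARIANTS` table, and all three function names are hypothetical, chosen only to contrast exact-match keyword search, a keyword search that expands to known variations, and a "participant" search on the sender address.

```python
import re

# Hypothetical sample "products"; addresses and text are invented.
MESSAGES = [
    {"id": 1, "sender": "a@example.com", "text": "Question about soldiers and duty"},
    {"id": 2, "sender": "a@example.com", "text": "A soldier asked me about this"},
    {"id": 3, "sender": "b@example.com", "text": "Notes on troops overseas"},
]

# Hypothetical variant table: a search term mapped to forms that should also match.
VARIANTS = {"soldier": {"soldier", "soldiers", "troops"}}


def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def exact_keyword_search(messages, term):
    """Return ids of messages containing the term as an exact token."""
    term = term.lower()
    return [m["id"] for m in messages if term in tokenize(m["text"])]


def expanded_keyword_search(messages, term):
    """Return ids of messages containing the term or any listed variant."""
    wanted = VARIANTS.get(term.lower(), {term.lower()})
    return [m["id"] for m in messages if wanted & tokenize(m["text"])]


def participant_search(messages, address):
    """Return ids of messages sent from the given address."""
    address = address.lower()
    return [m["id"] for m in messages if m["sender"].lower() == address]
```

With this sample data, an exact search for "soldier" finds only message 2, the expanded search finds all three messages, and a participant search on `a@example.com` finds messages 1 and 2 regardless of their wording. An analyst relying on only one of these strategies would see a different slice of the same collection.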

It got worse. The DWS-EDMS "Classic" interface had no way to track specific e-mail account activity within cases. "A new message could be linked with an earlier message only through memory, notes, or by actively searching the system," the Webster commission found.

And prior to the Fort Hood shootings, while the "new" DWS-EDMS system allowed users to track "favorite" cases and specific surveillance products (and to copy content they found to the equivalent of an Outlook "shared folder"), the system had no way to automatically link e-mail addresses and other types of data together. So if a person of interest had two different e-mail addresses, users had to conduct separate searches.
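The missing capability, linking several identifiers to one person so that a single query covers all of them, can be sketched in a few lines. The Python example below is only an illustration of the general idea using a union-find (disjoint-set) structure. It does not describe how the FBI later addressed the problem, and the `IdentityLinker` class, the `search_person` function, and the sample addresses are all hypothetical.

```python
class IdentityLinker:
    """Groups identifiers (e.g. e-mail addresses) believed to belong to one person."""

    def __init__(self):
        self._parent = {}

    def _find(self, item):
        # Register unseen identifiers as their own group, then walk to the root.
        self._parent.setdefault(item, item)
        while self._parent[item] != item:
            self._parent[item] = self._parent[self._parent[item]]  # path halving
            item = self._parent[item]
        return item

    def link(self, a, b):
        """Record that identifiers a and b refer to the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def aliases(self, item):
        """Return every identifier linked, directly or indirectly, to item."""
        root = self._find(item)
        return sorted(x for x in list(self._parent) if self._find(x) == root)


def search_person(messages, linker, address):
    """Return ids of messages sent from any address linked to the given one."""
    known = set(linker.aliases(address))
    return [m["id"] for m in messages if m["sender"] in known]


# Hypothetical data: one person using two addresses, plus an unrelated sender.
linker = IdentityLinker()
linker.link("personal@example.com", "work@example.org")
sample = [
    {"id": 1, "sender": "personal@example.com"},
    {"id": 2, "sender": "work@example.org"},
    {"id": 3, "sender": "other@example.net"},
]
hits = search_person(sample, linker, "personal@example.com")
```

Here a single search on either address returns messages 1 and 2 together. Without such a link, the reviewer has to remember that the two addresses belong together and run two searches, which is the gap the spreadsheet and handwritten notes were filling.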

This drove them to dubious tracking systems of their own. The analyst on the Aulaqi case tracked e-mail addresses of interest in a separate Excel spreadsheet; the agent in charge of the case relied on written notes—and his own memory.

Who needs disaster recovery?

While the upgrades to DWS-EDMS did give it a friendlier face, the back-end databases for the system were extended far beyond what the infrastructure could support, by any normal definition of an "enterprise system." Most obviously, no disaster recovery system existed. That's a problem shared by other FBI databases, such as the Office of General Counsel's National Security Letter database—which, due to database crashes, became corrupted and in 2007 could not even give FBI Inspector General auditors an accurate count of exactly how many NSLs the bureau had sent. Even today, DWS-EDMS lacks any backup or high-availability capabilities, and it still runs on antiquated hardware. The Webster commission reported:

The lack of a modern hardware infrastructure has two major implications. First, the relatively aged server configuration for DWS-EDMS and its ever-increasing data storage demands, coupled with ever-increasing use, creates slowdowns that we witnessed repeatedly in our hands-on use of the system. An agent in the field with considerable DWS-EDMS experience reported that the slowdowns deterred searching the system. Second, DWS-EDMS lacks a "live" or "failover" emergency backup.

The DWS-EDMS system had other significant problems that went beyond its engineering: nobody was ever actually trained on how to use it, and many people who could have benefitted from it didn't have access. Despite the massive growth in its user base, the combined system remained available almost entirely to FBI agents and analysts. Only a few members of joint terrorism task forces from other investigative services—including the NCIS and DCIS—even knew it existed. And DWS-EDMS existed as an island separate from the other investigative tools used by the FBI and its Joint Terrorism Task Force teams.

So when Hasan’s message to Aulaqi appeared in the system in December 2008, there was only one way for the agent in charge of the case to share it with his non-FBI JTTF colleagues: he e-mailed it to them.

Communication breakdown

Because Hasan mentioned the military, the FBI agent on the Aulaqi case e-mailed the text of Hasan's message to members of the San Diego JTTF from NCIS and DCIS.

"Here's another e-mail sent to Aulaqi by a guy who appears to be interested in the military," wrote the agent. "The header information suggests that his name is 'Nidal Hasan,' but that might not be true... Can we check to see if this guy is a military member? Also, I would like your input, from the military standpoint, on whether or not this should be disseminated further."

An initial check didn't find Hasan in the Defense Department's personnel database. However, after another message from Hasan got picked up on New Year's Day 2009, a DCIS analyst found Hasan in the Defense Employee Interactive Data System (DEIDS) and passed a printout of his information to the investigation team. The database identified Hasan as a "commissioned officer"—but because "commissioned" was abbreviated as "comm.", agents were concerned that he was a communications officer and would have access to Information Intelligence Reports (IIRs).

As a result, the data on Hasan wasn't shared with the Army. Instead, it was forwarded to the Washington, DC field office of the FBI and the lead was flagged to FBI headquarters:

This one is for WFO (Washington Field Office). The individual is likely an Army communications officer stationed at Walter Reed. I would recommend that this not be disseminated as an IIR, since he may have access to message traffic. If this needs to get to the military, WFO might have to do it internally.

The lead was sent through the FBI’s Automated Case Support Electronic Case File system as an “electronic communication,” the FBI’s digital equivalent of an official memo. It contained the text of the two messages from Hasan intercepted thus far, basic information about San Diego’s ongoing investigation of Aulaqi, Hasan’s home address and phone number, and the misconstrued information about his military assignment.

“While e-mail contact with Aulaqi does not necessarily indicate participation in terrorist-related matters, Aulaqi's reputation, background, and anti-US sentiments are well known,” the message concluded. “Although the content of these messages was not overtly nefarious, this type of contact with Aulaqi would be of concern if the writer is actually the individual identified above.”

But in being handed over to the Washington field office, the Hasan investigation lost its connection to the e-mail intelligence being gathered by the San Diego JTTF. The fragmented nature of the FBI's information systems would keep Washington investigators from having a complete picture of Hasan's continued communications with Aulaqi when they finally picked up the lead.

Since Hasan wasn't seen as key to the ongoing Aulaqi investigation, the San Diego field office agent in charge of the case didn't plan on following up with his colleagues in DC. The lead went untouched in DC for two months—possibly because the office was handling issues concerning the inauguration of President Obama. It would be May before the lead was assigned to a Washington-based DCIS agent to assess. By then, Hasan had been promoted by the Army from captain to major.

Sean Gallagher / Sean is Ars Technica's IT Editor. A former Navy officer, systems administrator, and network systems integrator with 20 years of IT journalism experience, he lives and works in Baltimore, Maryland.