A federal court has ruled that sophisticated hash value analysis of a hard …

A good coder has as many uses for hash functions as George Washington Carver did for peanuts—but law enforcement is fond of these digital fingerprinting techniques as well, because they allow reams of data to be rapidly sifted and identified. Legal scholars, however, have spent a decade puzzling over whether the use of hash value analysis in a criminal investigation counts as a Fourth Amendment "search." A federal court in Pennsylvania last week became the first to rule that it does—but one legal expert says an appeal is very likely.

Chief Judge Yvette Kane of the U.S. District Court for the Middle District of Pennsylvania penned the opinion in United States v. Crist, granting Robert Crist's request for the suppression of child pornography police found on his computer. Crist had fallen behind on his rent, and his landlord hired a father-and-son pair to move the delinquent tenant's belongings out to the curb, where a friend of one of the movers, Seth Hipple, picked up Crist's computer. When Crist returned home, he began freaking out over his vanished machine—while Hipple was freaking out over what he'd found in a folder on the hard drive: Videos appearing to depict underage sex, which he promptly deleted.

Hipple called the East Pennsboro Township Police Department, and though the computer had been reported stolen, it soon found its way to the Pennsylvania Attorney General's Office, where special agent David Buckwash made an image of the hard drive and began sifting through its contents using a specialized forensics program called EnCase. Rather than directly examining the contents of the hard drive, Buckwash initially ran the imaged files through an MD5 hash algorithm, producing a unique (for practical purposes) digital fingerprint, or hash value, for each one. He then compared these smaller hash values with a database of the hash values of known and suspected child porn, maintained by the National Center for Missing and Exploited Children. He came up with five definite hits and 171 videos containing "suspected" child porn. He then moved to gallery view, inspecting all the photos on the drive, and ultimately finding nearly 1,600 images that appeared to be child pornography.
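For readers curious about the mechanics, the matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how EnCase or the NCMEC database actually work: the function names `md5_of_file` and `find_known_files` are hypothetical, and the "database" is reduced to a plain set of hex digests.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_known_files(root, known_hashes):
    """Return sorted paths under root whose MD5 digest is in known_hashes.

    known_hashes is a hypothetical stand-in for a reference database:
    a set of lowercase hex MD5 strings.
    """
    hits = []
    for path in Path(root).rglob("*"):
        if path.is_file() and md5_of_file(path) in known_hashes:
            hits.append(path)
    return sorted(hits)
```

Note what the comparison does and does not reveal: a hit means a file is byte-for-byte identical to a listed one, while a miss says nothing about the file's contents. Every file still has to be read in full to compute its digest, which is part of why the "is it a search?" question is not trivial.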

None of this, however, had been done with a warrant. That raised two intriguing legal questions. First, longstanding precedent holds that if a private party, unprompted by police, conducts a search—by opening a package or briefcase, for instance—then the owner has lost their "reasonable expectation of privacy" in the searched object. That means police are in the clear if they proceed to examine whatever the private party has discovered. But it's not always clear how this rule applies in particular cases. If a private person opens a briefcase, police might scrutinize it more closely when they take a look—but the exception clearly doesn't mean that police can scour an entire house, ripping open mattresses and digging through closets, just because someone else has already wandered through the place. So had Crist lost his expectation of privacy in the entire hard drive, or only in the few files and folders Hipple had seen?

Even if the entire hard drive wasn't to be considered fair game, however, a more interesting question remained: Was the analysis of hash values of the files on the hard drive a search at all? The question was first broached in a 1996 Yale Law Journal article titled "Cyberspace, general searches, and digital contraband." The author noted an interesting quirk of Fourth Amendment jurisprudence: Courts have held that a "search" occurs when someone's "expectation of privacy" is violated, provided that expectation is one that society is prepared to regard as "reasonable." But they've also held that there is no such "reasonable expectation" as regards the possession of illegal materials, like narcotics or child porn. In 2004, the Supreme Court would rely on this logic in the case of Illinois v. Caballes to hold that a trained drug dog's sniff, which only reveals the presence or absence of illegal drugs, does not count as a search. In the digital realm, this raised the possibility of what we might call, with a nod to novelist Erica Jong, a "zipless search"—a more or less perfect means of detecting only contraband, circumventing the Fourth Amendment's warrant requirement.

If hash value analysis isn't a search, then even if the state went too far in directly inspecting the hard drive, the evidence of a hash match against the NCMEC database might still be admissible. But Judge Kane rejected that logic, writing:

By subjecting the entire computer to a hash value analysis—every file, internet history, picture, and "buddy list" became available for Government review. Such examination constitutes a search.

But as George Washington University law professor Orin Kerr, author of the Justice Department's computer search manual, wrote on the widely read Volokh Conspiracy blog, this is almost maddeningly brief and vague. "Which stage was the search—the creating the duplicate?" asked Kerr. "The running of the hash? It's not really clear." And as Kerr notes, though the court alludes to the Caballes dog-sniff ruling earlier in its opinion, it does not directly take up the question of the "zipless search," or explain how the hash analysis differs from a dog sniff. The answer could be massively significant, since it would determine, for instance, whether law enforcement agents serving a valid warrant against one user on a huge server are entitled to scan the entire machine, rather than only their target's files, for illicit material.

The second question is whether Buckwash "expanded the scope of the private search" conducted by Hipple when he imaged and scrutinized Crist's entire hard drive. In United States v. Runyan, the Fifth Circuit Court of Appeals seemed to accept the application of a "closed container" metaphor to digital storage devices. Just as the privacy interest in the contents of a package is lost once someone has opened it, the contents of a digital storage medium are fair game once it has been accessed. But as Kerr has pointed out in his paper, "Searches and Seizures in a Digital World," physical metaphors are tricky in a world of bits. Is the computer really like a "container"? Or given the vast amounts of information a hard drive can contain, does it make more sense to think of the drive as analogous to a warehouse, where the "container" is an individual file or folder? Kerr ultimately opts for an "exposure theory" of digital searches, according to which only the information that has been displayed to a human user should be considered "searched," leaving the privacy interest in all the other data intact. In this case, Judge Kane seemed to agree that Hipple's "search" of a few files did not void Crist's privacy interest in the rest of the drive, and that in any event Buckwash's forensic analysis was qualitatively different and more extensive than Hipple's casual examination.

Kerr, however, told Ars that he expects the government to appeal the ruling, both because the argument for counting hash analysis as a "search" is so brief, and because the court's application of the Runyan precedent is subject to dispute.

That makes United States v. Crist a case to watch. Until now, the constitutional status of hash value analysis has been unclear. But if the Third Circuit Court of Appeals should disagree with Judge Kane's reasoning, it could send a signal that a new era of zipless searching is at hand.