In an attempt to forestall potentially intrusive new federal laws, a coalition of Internet companies has launched a campaign against child pornography that they say will tip off police to illegal images.

The Internet companies--AOL, EarthLink, Microsoft, United Online and Yahoo--are pledging $1 million in cash and technical assistance to develop technology that can "detect and disrupt the distribution of known images of child exploitation" on the Internet. The coalition's effort will take place under the auspices of the National Center for Missing & Exploited Children.

Tuesday's announcement comes just hours before the beginning of a two-day U.S. House of Representatives hearing that will explore enacting new laws to require Internet providers to store records on what Americans are doing online, a concept called data retention.

Because Internet providers are loath to see new laws that could raise privacy and security concerns--and cost them millions of dollars in the process--they hope that their own, self-regulatory proposal will reduce Congress' willingness to impose a mandatory one. That may be a tough task: Attorney General Alberto Gonzales has been pressing for data retention laws as a way to aid in child porn investigations, and some politicians have already drafted legislation.

"There's always a concern that regulations are adopted that are overly expansive or difficult to implement," said Fred Randall, general counsel to United Online, which provides Internet access through its NetZero brand and operates social networking Web sites such as Classmates.com. Randall said that United Online has a history of working with law enforcement and already reports child pornography images and videos that its employees encounter.

One proposal that politicians are expected to present during Tuesday's hearing, according to one industry representative who spoke on condition of anonymity, is the creation of a national list of Web sites featuring illegal sex-themed images. Internet providers could be either encouraged or required to block access to them. (That's being done in the U.K. and was the law in Pennsylvania until a federal judge struck it down as unconstitutional.)

Borrowing from computer science
While the Internet companies say they have reached no firm decision about what standardized detection mechanism to use and are planning a meeting in July to work out details, one leading candidate can be found in any basic computer science textbook: a hash function.

Hash functions are methods used by programmers to generate a relatively small digital fingerprint from any type of data--including music, videos and photographs. Checksums, for instance, typically rely on hash functions. Two properties make them useful: first, a fingerprint is usually unique to its input (though there's no guarantee), and second, changing even one byte is supposed to result in a completely different fingerprint.

For instance, one popular hash function yields the value fa145076b2c4d025fc7b7b4cf6bd256c for "CNET News.com" and the noticeably different result 643bc47634c1b834f36623fdb120d565 for the text "CNET News..com".
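The two properties can be seen in a few lines of Python. This is only a sketch: it assumes SHA-256 from the standard `hashlib` module, which is not necessarily the function that produced the values quoted above, so its digests will not match them. The helper name `fingerprint` is made up for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a hex digest of the data (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


# The same two strings used in the example above, differing by one character.
a = fingerprint(b"CNET News.com")
b = fingerprint(b"CNET News..com")

print(a)
print(b)
print(a == b)  # the two fingerprints do not match
```

Hashing the same input again always returns the same digest, which is what makes the fingerprint usable as a lookup key.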

AOL has used hash functions in its internal efforts against child pornography since early 2004, said spokesman Andrew Weinstein.

The system works like this: When AOL employees become aware of a child pornography image included as an e-mail attachment, they forward the attachment and information about the sender's geographic location to the National Center for Missing & Exploited Children, which in turn sends it to the appropriate law enforcement agency. AOL also generates a digital fingerprint of the image so it can be automatically flagged if it flows through the company's network in the future.
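The flagging step might look something like the sketch below. It is not AOL's implementation, which the company has not described in detail; the class `FingerprintList`, its methods and the use of SHA-256 are all assumptions made for illustration, and the byte strings are harmless placeholders.

```python
import hashlib


class FingerprintList:
    """Hypothetical list of fingerprints for files that were already reported."""

    def __init__(self):
        self._known = set()

    @staticmethod
    def fingerprint(data: bytes) -> str:
        # SHA-256 is an assumption; any agreed-upon hash function would do.
        return hashlib.sha256(data).hexdigest()

    def add(self, data: bytes) -> str:
        """Record the fingerprint of a reported file and return it."""
        digest = self.fingerprint(data)
        self._known.add(digest)
        return digest

    def is_flagged(self, data: bytes) -> bool:
        """True if an identical file was recorded earlier."""
        return self.fingerprint(data) in self._known


known = FingerprintList()
known.add(b"placeholder bytes standing in for a reported attachment")

print(known.is_flagged(b"placeholder bytes standing in for a reported attachment"))
print(known.is_flagged(b"some unrelated attachment"))
```

Only the fingerprints need to be stored or shared, not the files themselves, which is presumably why the approach lends itself to a common list.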

Other Internet and e-mail providers could adopt the same approach. That would create a "master list of bad sites or files, or in this case signatures, that all partners can use it to escalate the fight" against child pornography, Weinstein said on Monday.

In the future, "we can look at things like instant messaging and video files and how you track those," he said.

Using hash functions to detect unwanted files is hardly new. Researchers at Brooklyn's Polytechnic University have described, for instance, how fingerprints could detect "pollution" on peer-to-peer networks, in which copyright holders intentionally distribute corrupted files. Italian computer scientists have proposed using hash functions to identify and discard junk e-mail. One open-source project that does that is called Nilsimsa.

Seth Schoen, staff technologist at the Electronic Frontier Foundation, the digital rights group in San Francisco, expressed concern that even legal content could end up on the national blacklist of hashed fingerprints.

"There's a question about whether people would want to add things other than child pornography to the list," Schoen said. "Is there any way to prevent people from simply suppressing nonchild-pornography-related (images) by claiming they're child pornography?"

Another possibility, Schoen said, is that child pornographers who know how the system works would simply make a tiny tweak to photographs to avoid detection--rendering the hash detection system useless. Internet providers could counter-attack using a "locality sensitive hash" function that's designed to detect similar files, but even that in turn could be foiled if image files are encrypted.
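Schoen's point about tiny tweaks, and the similarity-based countermeasure, can be illustrated with a toy example. The sketch below is an assumption-laden simplification: an "image" is just a short list of grayscale values, the exact fingerprint uses SHA-256, and `average_hash` is a very basic similarity-preserving scheme chosen for brevity, not necessarily the kind of locality-sensitive hash a provider would deploy.

```python
import hashlib


def exact_hash(pixels):
    """Ordinary fingerprint: any change to the bytes changes the digest."""
    return hashlib.sha256(bytes(pixels)).hexdigest()


def average_hash(pixels):
    """Toy similarity hash: one bit per pixel, set if brighter than the mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(a, b):
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


original = [10, 200, 30, 220, 15, 210, 25, 230]
tweaked = list(original)
tweaked[0] = 11  # a change too small to see
different = [200, 10, 220, 30, 210, 15, 230, 25]

# The exact fingerprints no longer match after the tweak.
print(exact_hash(original) == exact_hash(tweaked))

# The similarity hashes stay close for the tweak and far apart for a different image.
print(hamming(average_hash(original), average_hash(tweaked)))
print(hamming(average_hash(original), average_hash(different)))
```

A matcher built this way would flag files whose distance falls under some threshold rather than requiring an exact match. As the article notes, neither approach helps if the file is encrypted, because the scanner never sees the underlying pixels.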

Industry representatives readily acknowledge that technical discussions are only beginning. "Hopefully this will help highlight the issue and help highlight some potential solutions that haven't been considered yet," said Randall, United Online's general counsel.

Officials from EarthLink, Microsoft, Yahoo, Google, AOL, Verizon and Comcast are scheduled to testify on Tuesday. The second day of the hearing, on Wednesday, is scheduled to include representatives of Facebook.com, MySpace.com parent company Fox Interactive Media, and Xanga.com.