Searching Extreme Amounts of e-Data a Challenge

December 13, 2010 – How many electronic records does it take before a collection becomes unsearchable? During a hearing last December, defense lawyers argued that the extreme amount of electronic data provided (the equivalent of 8,000 to 10,000 boxes of paper) was so mammoth and disorganized that it couldn’t be effectively searched for information that might help the defendants. In a ruling afterward, the judge agreed, ordering the prosecution to “re-disclose” any relevant material in a more organized fashion.

We found this news on Computerworld in their article, “Keyword searches not good enough for e-discovery, experts say.” Many of today’s search tools handle several problems poorly, including the prevalence of false positives, inadvertent misspellings, and ambiguity of language, all of which can significantly hinder the e-discovery process.
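To make those three problems concrete, here is a minimal Python sketch. It is an illustration only, not how any particular e-discovery product works: the sample documents, the function names, and the 0.8 similarity cutoff are all assumptions invented for this example. It compares an exact keyword match against a tolerant match built on the standard library's `difflib`.

```python
import difflib
import re

def tokenize(text):
    # Lowercase the text and split it into alphabetic words.
    return re.findall(r"[a-z]+", text.lower())

def exact_match(keyword, text):
    # Plain keyword search: the word must appear exactly as typed.
    return keyword.lower() in tokenize(text)

def fuzzy_match(keyword, text, cutoff=0.8):
    # Tolerant search: accept any word that is close enough to the keyword.
    # The cutoff value is an assumption chosen for this example.
    words = tokenize(text)
    return bool(difflib.get_close_matches(keyword.lower(), words, n=1, cutoff=cutoff))

# Hypothetical documents, made up for illustration.
documents = {
    "memo": "Please review the merger agreement before Friday.",
    "typo": "Please review the merger agrement before Friday.",
    "noise": "We are in agreement that lunch should be pizza.",
}

exact_hits = [name for name, text in documents.items() if exact_match("agreement", text)]
fuzzy_hits = [name for name, text in documents.items() if fuzzy_match("agreement", text)]

print("exact:", exact_hits)
print("fuzzy:", fuzzy_hits)
```

In this sketch the exact search misses the misspelled document and still returns the lunch note, a false positive caused by the ambiguity of the word "agreement." The tolerant search recovers the misspelling but does nothing about the false positive, which is one reason keyword matching alone tends to fall short.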

Would adding a human to the process make it more efficient?

“You can do a lot better job of searching for relevant documents if you use a combination of an expert who knows the data set working with individuals who are actually running automated search queries,” says Jason Baron, director of litigation at the U.S. National Archives and Records Administration and a founding coordinator of the TREC legal track.

Software vendors are scrambling to provide alternative search technologies to overcome the limitations of today’s tools. The approaches on offer include relying on taxonomies of industry terms and applying mathematical techniques.

Melody K. Smith

Sponsored by Access Innovations, the world leader in thesaurus, ontology, and taxonomy creation and metadata application.