E-discovery evolved: Smart searching

In Da Silva Moore v. Publicis Groupe, U.S. Magistrate Judge Andrew Peck’s opinion condoning the use of predictive coding was anything but a blanket endorsement of technology-assisted review (TAR). Instead, Peck emphasized that counsel “design an appropriate process, including the use of available technology, with appropriate quality control testing, to review and produce ESI.” While an “appropriate process” might leverage TAR, other tried and true technologies, such as concept searching, topic grouping and email threading, remain critical complementary tools for designing an effective protocol to filter, process and review relevant data.

Case law: Process is key to defensibility

Prior to Da Silva, several noteworthy opinions offered guidance for crafting a “smart” search process. In Victor Stanley, Inc. v. Creative Pipe, Inc., the court famously stated that “all keyword searches are not created equal” and recommended compliance with the Sedona Conference Best Practices Commentary. Specifically, the court suggested that counsel:

Choose methods best suited to the specific needs of their case

Perform due diligence by vetting providers and technology

Make good-faith efforts to collaborate with opposing counsel on the keywords, methods and technology used

Subsequent opinions, such as Eurand, Inc. v. Mylan Pharm., Inc., found the judiciary ill-equipped to evaluate the technical adequacy of search terms and instead evaluated the reasonableness of the search and retrieval process. While courts may consider a variety of circumstantial factors in such an assessment, authority overwhelmingly asserts that parties must implement a comprehensive and cooperative process to comply with the Federal Rules of Civil Procedure and secure the “just, speedy and inexpensive” determination of a matter.

Designing effective keyword searches

A defensible search process requires significant oversight. Parties should conduct some form of early case and/or data assessment (ECA or EDA) to better understand the particulars of their case and their data at the outset. Following such an assessment, counsel can identify risks and better estimate the document review budget. Additionally, the insight resulting from ECA/EDA can help limit unnecessary disclosures when negotiating search terms with opposing parties.

In addition to understanding their case and data, counsel should also understand the technology used to search and filter the data set. Unfamiliarity with the search engine's operators can significantly increase the amount of time and money counsel spend finding relevant documents, so reviewers should get comfortable with the engine before diving in. Additionally, counsel should work with the provider to identify useful strategies in the event of an error, such as a “time out” during a larger search.

Strategically building a keyword list is also extremely beneficial. Counsel should run broad searches to cull the document universe and identify similar documents that might provide additional keywords. Additionally, when crafting keyword lists, account for commonly misspelled terms, word/phrase permutations and over-inclusive “noise” words with the help of a data dictionary. Finally, save all searches in a simple text editor to document the search process.
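The keyword-list steps above can be sketched in a few lines of Python. This is only an illustration: the noise words, the misspelling table and the function names `expand_keywords` and `log_search` are hypothetical stand-ins for a real data dictionary and a review platform's own search history, and the sketch does not attempt word or phrase permutations.

```python
from datetime import datetime, timezone

# Hypothetical data dictionary entries, for illustration only.
NOISE_WORDS = {"the", "and", "re", "fw"}
MISSPELLINGS = {"receive": ["recieve"], "agreement": ["agreemnt"]}


def expand_keywords(terms):
    """Drop noise words and add known misspellings for each remaining term."""
    expanded = set()
    for term in terms:
        key = term.strip().lower()
        if not key or key in NOISE_WORDS:
            continue  # over-inclusive terms would flood the results
        expanded.add(key)
        expanded.update(MISSPELLINGS.get(key, []))
    return sorted(expanded)


def log_search(path, query):
    """Append a timestamped query to a plain-text log to document the process."""
    stamp = datetime.now(timezone.utc).isoformat()
    with open(path, "a", encoding="utf-8") as log:
        log.write(f"{stamp}\t{query}\n")
```

A team might call `expand_keywords` on each draft term list before negotiating terms with the opposing party, and call `log_search` each time a query is run so the text file becomes a running record of the search process.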

Advanced searching technologies

A variety of advanced technologies are available to conduct smarter searches. Depending on the needs of your case, you may use these technologies singly or in tandem:

Concept searching. Helps identify hidden or disguised conversations. After a user seeds a term, the technology finds documents with very similar concepts by identifying word patterns and occurrences in documents and translating them into concepts.

Topic grouping. Useful for bulk removing non-relevant documents. Topic grouping analyzes the content of a particular set of documents and determines what themes appear within the data before a legal team conducts review. Then, the technology automatically groups the documents together and labels them for quick identification, such as “finance,” “accounting” or “HR.”

Language identification. Identifies all languages in the data set to determine additional costs associated with searching and reviewing documents in foreign languages.

Email threading. Groups emails based on their related content, which allows reviewers to follow conversations, determine who saw what (and when) and ensure consistent coding throughout the entire string of messages.

Near de-duplication. Performs a text-based comparison of documents to create families of documents that are very similar but not exact duplicates, such as various iterations of a contract or memorandum. This technology allows for more logical culling of a data set at the ECA stage or batching for review consistency.

Regardless of the technologies chosen, all methods should implement sampling, which identifies a percentage of documents in a data set to test the reliability of keywords or coding decisions to ensure a defensible process.
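To make the text-based comparison and the sampling step more concrete, here is a minimal Python sketch. It assumes one common approach, Jaccard similarity over three-word shingles, which is not necessarily what any particular e-discovery product uses. The function names, the shingle size and the fixed random seed are all illustrative choices.

```python
import random


def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in a document."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(text_a, text_b):
    """Share of shingles two documents have in common, from 0.0 to 1.0."""
    a, b = shingles(text_a), shingles(text_b)
    return len(a & b) / len(a | b)


def draw_sample(doc_ids, fraction, seed=0):
    """Pick a reproducible random percentage of documents for quality control."""
    ids = list(doc_ids)
    if not ids:
        return []
    size = min(len(ids), max(1, round(len(ids) * fraction)))
    # A fixed seed (an assumption here) lets the same sample be re-drawn later.
    return sorted(random.Random(seed).sample(ids, size))
```

Two drafts of the same contract would score close to 1.0 and could be grouped into one family once they pass a threshold the team chooses. The sampled documents can then be reviewed by hand to check whether the keywords or coding decisions hold up.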

Conclusion

Successful e-discovery exercises often leverage technology to complement human analysis and advocacy skills. By implementing a smart, multi-faceted approach, counsel can use advances in technology to narrow the scope of review effectively and find an appropriate number of relevant documents while limiting inadvertent disclosures. To establish the most effective process possible, counsel should work with their technology providers to identify best practices and strategies to bolster a smart searching protocol.