Predictive Coding 101

Predictive coding refers to a review tool that uses algorithms to determine the relevance of documents based on the “training” the computer receives from human reviewers.

“Based upon the coding of a sample or seed set of documents by a team of lawyers with subject-matter expertise, the computer learns what is considered responsive versus nonresponsive and then applies that logic to the remaining documents to suggest additional documents that are like the first responsive set,” says Jacquelyn Caridad, of counsel to Morgan, Lewis & Bockius.
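As a rough illustration of what "learning" from a seed set can mean, the sketch below trains a tiny word-frequency (naive Bayes) scorer on a handful of coded documents and then scores uncoded ones. It is an assumption-laden toy, not a description of any vendor's product: the function names, the sample documents and the choice of naive Bayes are all hypothetical, and the seed set is assumed to contain at least one responsive and one nonresponsive document.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(seed_set):
    """Build word counts from (text, is_responsive) pairs coded by reviewers."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_responsive in seed_set:
        doc_counts[is_responsive] += 1
        word_counts[is_responsive].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def score(model, text):
    """Return log-odds of responsiveness; above zero leans responsive."""
    word_counts, doc_counts, vocab = model
    total_resp = sum(word_counts[True].values())
    total_non = sum(word_counts[False].values())
    log_odds = math.log(doc_counts[True] / doc_counts[False])
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in the seed set carry no evidence
        # Add-one smoothing keeps unseen word/class pairs from zeroing out.
        p_resp = (word_counts[True][word] + 1) / (total_resp + len(vocab))
        p_non = (word_counts[False][word] + 1) / (total_non + len(vocab))
        log_odds += math.log(p_resp / p_non)
    return log_odds


# Hypothetical seed set coded by the review team.
seed_set = [
    ("merger pricing agreement draft", True),
    ("pricing agreement with supplier", True),
    ("lunch menu for friday", False),
    ("office party friday", False),
]
model = train(seed_set)

# Rank the uncoded documents so the most likely responsive ones come first.
uncoded = ["revised pricing agreement", "friday lunch plans"]
ranked = sorted(uncoded, key=lambda doc: score(model, doc), reverse=True)
```

In this toy, the reviewers would code the top-ranked suggestions, add them to the seed set and retrain, which is the repeated training the article describes.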

The lawyers repeatedly review the suggested documents to train the system. To check the quality of the predictive coding, a party might draw a statistical sample of the documents that have been coded and review them again for accuracy.
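The quality-control check can be sketched in a few lines as well. The example below draws a random sample of coded documents, compares each one with a second review and reports the agreement rate with an approximate 95 percent margin of error (normal approximation). The names qc_sample and second_review, the sample size and the fixed random seed are illustrative assumptions, not part of any described workflow.

```python
import math
import random


def qc_sample(coded_docs, second_review, sample_size, seed=0):
    """Re-review a random sample and estimate how often the coding holds up.

    coded_docs: list of (doc_id, is_responsive) pairs from the first pass.
    second_review: hypothetical callable returning a reviewer's fresh call
                   (True or False) for a given doc_id.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    sample = rng.sample(coded_docs, sample_size)
    agreements = sum(
        1 for doc_id, label in sample if second_review(doc_id) == label
    )
    rate = agreements / sample_size
    # Normal-approximation margin of error at roughly 95 percent confidence.
    margin = 1.96 * math.sqrt(rate * (1 - rate) / sample_size)
    return rate, margin
```

A low agreement rate would suggest that the system needs more training before its coding is relied upon.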

If this process classifies only the top 40 percent of the documents as relevant, for example, the remaining 60 percent may not require review at all, which results in significant cost savings, according to Matt Nelson, e-discovery counsel at security software company Symantec.
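To put the 40/60 split in concrete terms, the arithmetic can be worked through with made-up figures. The collection size of 100,000 documents and the review cost of $1.50 per document below are assumptions chosen purely for illustration; only the 40 percent figure comes from the example above.

```python
def review_savings(total_docs, fraction_reviewed, cost_per_doc):
    """Return (docs reviewed, docs skipped, review cost avoided)."""
    reviewed = round(total_docs * fraction_reviewed)
    skipped = total_docs - reviewed
    return reviewed, skipped, skipped * cost_per_doc


# Hypothetical collection: 100,000 documents at $1.50 each to review.
reviewed, skipped, saved = review_savings(100_000, 0.40, 1.50)
print(reviewed, skipped, saved)  # prints: 40000 60000 90000.0
```

Under those assumed figures, reviewers would read 40,000 documents rather than 100,000, avoiding $90,000 in review costs.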