LITIGATION SUPPORT TIP OF THE NIGHT

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer. All content provided on this blog is for informational purposes only. The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of this information. This policy is subject to change at any time. The owner is not an attorney, and nothing posted on this site should be construed as legal advice. Litigation Support Tip of the Night does not provide confirmation that any e-discovery technique or conduct is compliant with legal, regulatory, contractual or ethical requirements.

Featured on the ACEDS blog.

New tips for paralegals and litigation support professionals are posted to this site each night. Click on the blog headings for better detail.

TAR for Smart People Outline - Chapter 10

January 27, 2017

Here's another installment in my outline of John Tredennick's 'TAR for Smart People'. I last posted an installment on January 15, 2017. Tonight's installment is on Chapter 10, Using TAR in International Litigation - Does Predictive Coding Work for Non-English Languages?

A. Department of Justice Memorandum

1. A March 2014 DOJ memorandum acknowledges that TAR offers parties a chance to reduce costs when responding to a second request in a proposed merger or acquisition. The memo states that it is not certain that TAR is effective with foreign-language or mixed-language documents.

1. TAR doesn't understand English or any other language. It simply analyzes words algorithmically according to their frequency in relevant documents compared to their frequency in irrelevant documents. When a reviewer marks a document as relevant or irrelevant, the software ranks the words in the document based on frequency or proximity. TAR in effect creates huge searches using the words ranked during training.

2. If documents are properly tokenized, the TAR process will work. Words are normally recognized only because they have a space or comma before and after them. Chinese and Japanese don't use spaces or punctuation to separate words. The search system used by Catalyst indexes Asian characters.
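To make the two points above concrete, here is a minimal Python sketch of tokenizing mixed English and Japanese text and then weighting tokens by how often they show up in relevant versus irrelevant documents. It is only an illustration under my own assumptions: the function names (tokenize, term_weights, score) are hypothetical, overlapping character bigrams are a common fallback for text without spaces and not necessarily what Catalyst's system does, and the smoothed log-odds weight is a simple stand-in for whatever ranking a real TAR engine uses.

```python
import math
import re
from collections import Counter

# Runs of ASCII letters/digits, or runs of hiragana, katakana and common CJK ideographs.
TOKEN_RUNS = re.compile(r"[A-Za-z0-9]+|[\u3040-\u30ff\u4e00-\u9fff]+")


def tokenize(text):
    """Split text into tokens (hypothetical helper, not a vendor API).

    ASCII runs become lowercase word tokens. CJK runs have no spaces
    between words, so each run is cut into overlapping character bigrams.
    """
    tokens = []
    for run in TOKEN_RUNS.findall(text):
        if run[0].isascii():
            tokens.append(run.lower())
        elif len(run) == 1:
            tokens.append(run)
        else:
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens


def term_weights(relevant_docs, irrelevant_docs):
    """Smoothed log-odds per token; positive means more common in relevant docs."""
    rel = Counter(t for d in relevant_docs for t in set(tokenize(d)))
    irr = Counter(t for d in irrelevant_docs for t in set(tokenize(d)))
    n_rel, n_irr = len(relevant_docs), len(irrelevant_docs)
    weights = {}
    for tok in set(rel) | set(irr):
        p_rel = (rel[tok] + 1) / (n_rel + 2)  # add-one smoothing avoids log(0)
        p_irr = (irr[tok] + 1) / (n_irr + 2)
        weights[tok] = math.log(p_rel / p_irr)
    return weights


def score(doc, weights):
    """Sum the weights of the distinct tokens in a document; unseen tokens count as 0."""
    return sum(weights.get(t, 0.0) for t in set(tokenize(doc)))


if __name__ == "__main__":
    # Made-up training documents, for illustration only.
    relevant = ["merger 契約書 draft", "契約書 review"]
    irrelevant = ["lunch menu", "lunch 予定"]
    w = term_weights(relevant, irrelevant)
    for doc in ["新しい契約書", "lunch"]:
        print(doc, tokenize(doc), round(score(doc, w), 3))
```

Nothing in the scoring step knows which language a token came from. Once the Japanese text has been cut into tokens, it is counted exactly the same way as the English words.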

C. Case Study

1. The review covered a mixed set of Japanese and English documents. The first step was to tokenize the Japanese documents, breaking the Japanese text into words and phrases. Lawyers reviewed a set of 500 documents to be used as a reference set by the system for its analysis. They then reviewed a sample set of 600 documents, marking them relevant or irrelevant. The system ranked the remaining documents for relevance.

2. The system was able to identify 98% of likely relevant documents and put them at the front of the review queue. Only 48% of the total document set needed to be reviewed to cover the bulk of likely relevant documents.

3. A random sample was reviewed from the remaining 52% of the document set, and the review team only found 3% of these documents to be relevant.
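The relationship between a ranked review queue and the share of it that has to be reviewed can be sketched in a few lines of Python. This is a toy illustration, not the case study's method: the function names and the scores and labels in the example are hypothetical, and the sketch assumes every document's true relevance is known. In a real review that is not known, which is why the team checks a random sample of the unreviewed remainder.

```python
import math


def review_depth_for_recall(scored_docs, target_recall):
    """Fraction of a ranked queue that must be reviewed to reach target_recall.

    scored_docs is a list of (score, is_relevant) pairs. Documents are
    reviewed from the highest score down.
    """
    ranked = sorted(scored_docs, key=lambda d: d[0], reverse=True)
    total_relevant = sum(1 for _, rel in ranked if rel)
    if total_relevant == 0:
        raise ValueError("no relevant documents in the set")
    needed = math.ceil(target_recall * total_relevant)
    found = 0
    for depth, (_, rel) in enumerate(ranked, start=1):
        found += int(rel)
        if found >= needed:
            return depth / len(ranked)
    return 1.0


def sample_relevance_rate(sample_labels):
    """Share of a reviewed random sample that was marked relevant."""
    if not sample_labels:
        raise ValueError("empty sample")
    return sum(1 for rel in sample_labels if rel) / len(sample_labels)


if __name__ == "__main__":
    # Hypothetical scores and labels for eight documents.
    docs = [(0.9, True), (0.8, True), (0.7, False), (0.6, True),
            (0.2, False), (0.1, False), (0.05, True), (0.01, False)]
    print(review_depth_for_recall(docs, 0.75))
    print(sample_relevance_rate([False, False, True, False]))
```

The better the ranking pushes relevant documents toward the top, the smaller the first number gets. The second function is the same kind of check described in item 3, applied to a sample drawn from the documents that were never reviewed.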