How natural language processing can address critical government issues

By William D. Eggers

Feb 20, 2019

Federal agencies routinely receive hundreds of thousands or even millions of comments on regulatory proposals. Many of these come from automated bots posting slight variations on a script.

How can agencies weed out this spam to identify legitimate comments? With the very same technology that generates robotic comments: natural language processing.
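
One simple way to spot "slight variations on a script" is to compare comments by the short word sequences they share. The sketch below is an illustration only, not any agency's actual method: the function names, the three-word window and the 0.5 threshold are all assumptions chosen for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_near_duplicates(comments, threshold=0.5):
    """Greedily group each comment with the first group whose lead comment it resembles.

    The threshold is a hypothetical cutoff; a real system would tune it on labeled data.
    """
    signatures = [shingles(c) for c in comments]
    groups = []  # each group is a list of indices into comments
    for i, sig in enumerate(signatures):
        for group in groups:
            if jaccard(sig, signatures[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


# Made-up comments for illustration.
comments = [
    "I strongly oppose this rule because it will hurt small businesses in my state.",
    "I strongly oppose this rule because it will hurt small businesses in my town.",
    "I firmly oppose this rule because it will hurt small businesses in my state.",
    "As a nurse, I support the proposal; clearer labeling would help my patients.",
]
groups = group_near_duplicates(comments)
print(groups)  # [[0, 1, 2], [3]]
```

Large groups of near-identical comments are candidates for scripted submissions, while comments that stand alone are more likely to be individually written. Comparing every comment against every group is slow at the scale of millions, so a production system would likely need a faster approximate technique.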

NLP is a form of artificial intelligence that recognizes patterns to understand human language. It helps computers deduce meanings and even infer unspoken information.

The accuracy of Google Translate is due to NLP. Automated chatbots also function with help from NLP. When legal software combs through tens of thousands of documents to discover relevant pages in a fraction of the time it would take human paralegals -- and with a higher accuracy rate -- that’s NLP.

The government keeps records that don’t divide nicely into spreadsheets. Police departments collect recordings and interviews. The Administrative Procedure Act of 1946 requires government agencies to collect public comments. Laws and regulations themselves can be complicated, sometimes redundant or even conflicting, layered with changes over years. Applications for permits, website feedback, stakeholder interviews and social media responses all result in textual data that often never gets analyzed.

This information offers valuable insights to anyone who can analyze it in depth. Quick, mathematical analysis of unstructured information has the potential to revolutionize government decision-making, policymaking and service delivery.

Because NLP analyzes a wide variety of data (documents, text, voice) and has a range of practical uses, the discipline encompasses multiple capabilities. These range from sentiment analysis to text categorization. NLP can scour documents and classify them by topic, even without a programmer defining in advance which topics to look for. It can also organize topics into taxonomies, like learning to flag diplomatic cables that might contain classified material. It can distinguish between entities with similar names -- sensing whether “Chris” in a document refers to Chris the prosecutor or Chris the suspect. And it can deduce relationships, like recognizing that the FBI is part of the Department of Justice.

Many customer service departments employ NLP to mine social media for references to their organization and analyze whether the sentiment is positive or negative. The city of Washington, D.C., uses sentiment analysis to make sure it’s serving citizens, while the Food and Drug Administration uses it to track the spread of opioids.
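
At its simplest, sentiment analysis can be approximated by counting positive and negative words. The sketch below is a toy illustration, not the approach used by any organization named here: the word lists are tiny and hypothetical, and the only refinement is that a word directly preceded by "not," "never" or "no" has its polarity flipped.

```python
import re

# Tiny hypothetical word lists, for illustration only.
POSITIVE = {"great", "helpful", "quick", "friendly", "thanks", "easy"}
NEGATIVE = {"slow", "rude", "broken", "confusing", "terrible", "late"}
NEGATORS = {"not", "never", "no"}


def sentiment_score(text):
    """Count positive minus negative words, flipping a word that follows a negator."""
    words = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, word in enumerate(words):
        polarity = (word in POSITIVE) - (word in NEGATIVE)
        if polarity and i > 0 and words[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def label(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up posts for illustration.
posts = [
    "The permit office was quick and friendly, thanks!",
    "The website is confusing and the form is broken.",
    "The staff were not rude.",
    "The bus arrived at noon.",
]
labels = [label(p) for p in posts]
print(labels)  # ['positive', 'negative', 'positive', 'neutral']
```

Word counting of this kind misses sarcasm, context and domain-specific vocabulary, which is why practical systems generally rely on models trained on labeled examples rather than fixed lists.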

NLP can extract meaningful information from unstructured text. The Defense Advanced Research Projects Agency created a program that uses NLP technology to infer implicit information from source material. Given the statement, “I’m allergic to apples,” and the response, “Don’t eat the cake,” the technology infers that the cake contains apples.

Governments stand to gain. IDC anticipates that the kind of insights NLP can provide will result in $430 billion in productivity gains by 2020 for organizations that can harness them. With this technology, governments can better analyze public feedback, improve predictions, increase regulatory compliance and enhance policy analysis.

Many agencies are already on board. The FDA's National Center for Toxicological Research has analyzed 10 years of reports to identify groups of drugs that tend to cause adverse reactions together and to predict future adverse pairings. The Securities and Exchange Commission uses the technology to scan disclosures of companies charged with financial misconduct. To analyze option-pricing data, the SEC processes roughly two terabytes every day, the equivalent of 500 million double-sided, single-spaced, printed pages.
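
The page comparison implies a figure for how much data one printed sheet holds. The short calculation below works it out, assuming a decimal terabyte (10^12 bytes) and roughly one byte per character; both assumptions are mine rather than stated in the comparison.

```python
TERABYTE = 10**12  # assumption: decimal terabyte

daily_bytes = 2 * TERABYTE   # "roughly two terabytes every day"
pages = 500_000_000          # "500 million double-sided ... pages"

bytes_per_page = daily_bytes / pages
bytes_per_side = bytes_per_page / 2  # each page is printed on both sides

print(bytes_per_page)  # 4000.0
print(bytes_per_side)  # 2000.0
```

Under those assumptions, the comparison works out to about 4,000 characters per sheet, or 2,000 per printed side.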

Policy analysts have used NLP to study speeches and assess the relationship between Latin American economic policy swings and economic performance. An AI program under development can analyze oil and gas permit applications to assess threats to ecosystems more thoroughly and quickly.

The potential for governments to employ these advancements in language processing defies limitation. Governments collect massive numbers of unstructured records. Fragmented information presented in deeply non-mathematical formats, at volumes far too large for human assessment, is now available for the kind of deep-pattern analysis originally reserved for numerical databases. NLP can open the public sector to new insights, better-tailored services and faster responses to information.

And you don’t even have to teach it. It teaches itself.

About the Author

William D. Eggers is the executive director of the Deloitte Center for Government Insights and the author of several books. His latest study is "Using AI to unleash the power of unstructured government data."