The main benefit of machine learning and natural language processing is that they can keep scaling as the volume and variety of data grow.

During a typical day, we use a variety of applications that, by virtue of their artificial intelligence, automatically understand our speech and provide near-real-time feedback to support decision-making. Think Siri. But is machine learning, one of a number of AI techniques, ready for clinical applications, specifically to accelerate drug development and/or reduce development costs?

First, some context. Machine learning encompasses a variety of algorithmic techniques that clinical drug developers can use to identify and infer patterns in support of enhanced or automated decision-making. One such technique is natural language processing (NLP), which can be used to “read” scientific text and infer its semantic context, making it easier to search for and find information.

Examples of the tasks these techniques can take on:

Given a patient’s historical health data and genomic profile, predict his or her propensity to contract a certain disease.

Given historical data about a clinical trial site’s ability to recruit suitable patients, predict the probability that the site will successfully recruit patients for a planned new trial.

Identify meaningful clusters of scientific documents relating to the same topic.

Find all the groups of patients that are similar to one another using qualitative, descriptive data.
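To make the document-clustering task above concrete, here is a minimal sketch in Python. It assumes a deliberately simple approach (bag-of-words vectors, cosine similarity, and a greedy single-pass grouping); the stopword list, the 0.3 threshold, and the sample titles are all illustrative assumptions, not something the techniques described here prescribe. Production systems would typically use richer text representations.

```python
import math
import re
from collections import Counter

# Illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "for", "with", "on", "is"}

def vectorize(text):
    """Turn a document into a bag-of-words count vector."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering: a document joins the first cluster whose
    first member is similar enough; otherwise it starts a new cluster."""
    vectors = [vectorize(d) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Hypothetical document titles, made up for this example.
docs = [
    "Statin therapy and cholesterol reduction in adults",
    "Cholesterol levels after statin therapy",
    "Insulin dosing in type 2 diabetes",
    "Diabetes management with insulin pumps",
]
clusters = cluster(docs)
print(clusters)
```

With these four titles, the two statin documents land in one group and the two diabetes documents in another. The same idea extends to grouping patients by descriptive data, provided each patient record can be turned into a comparable vector.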

The main benefit of machine learning and natural language processing is that they can augment or replace the error-prone manual analysis work performed by people, and they can keep scaling as the volume and variety of data grow.

What’s Behind the Magic?

Machine learning (ML) uses an algorithmic approach that takes both structured and unstructured historical data through a mathematically driven process to generate a model that recognizes patterns and contextual meaning. This process takes as its inputs:

Data sets that train the ML algorithm and much larger input data sets used for subsequent analysis.

Standard medical dictionaries that provide reference definitions of words and terms.

Ontologies or textual annotations that describe relationships between terms.

A probability-driven mathematical model that can be trained to handle the types of input data sets to be processed (for example, scientific literature versus Twitter feeds, which have very different contextual, semantic, and linguistic characteristics).

Using these inputs, the trained ML algorithm can read the data sets and extract relationships among the terms. Additionally, a person can verify the accuracy of those relationships and fine-tune the algorithm until its accuracy is comparable to, or better than, that of manual review.
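The extract-then-verify loop described above can be sketched in a few lines of Python. This is a simplified stand-in, not a trained ML model: it counts dictionary terms that co-occur in the same sentence and then removes the pairs a human reviewer rejects. The mini-dictionary, the sample text, and the function names `extract_pairs` and `apply_review` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary mapping terms to a category.
# A real system would draw on a standard medical dictionary and an ontology.
DICTIONARY = {
    "aspirin": "drug",
    "warfarin": "drug",
    "bleeding": "event",
    "headache": "event",
}

def extract_pairs(text):
    """Count (drug, event) pairs that co-occur within the same sentence."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        found = {w for w in re.findall(r"[a-z]+", sentence) if w in DICTIONARY}
        drugs = sorted(t for t in found if DICTIONARY[t] == "drug")
        events = sorted(t for t in found if DICTIONARY[t] == "event")
        for drug in drugs:
            for event in events:
                counts[(drug, event)] += 1
    return counts

def apply_review(candidates, rejected):
    """Keep only the candidate pairs a human reviewer did not reject."""
    return {pair: n for pair, n in candidates.items() if pair not in rejected}

# Made-up sample text for illustration only.
text = (
    "Patients on warfarin reported bleeding. "
    "Aspirin relieved headache in most patients. "
    "Bleeding was also seen with warfarin and aspirin."
)
candidates = extract_pairs(text)

# The reviewer rejects aspirin/headache: the sentence describes relief,
# which plain co-occurrence counting cannot tell apart from a side effect.
reviewed = apply_review(candidates, rejected={("aspirin", "headache")})
print(reviewed)
```

The rejected pair shows why the human check matters: simple co-occurrence cannot distinguish a drug that treats a symptom from one that causes it. In a trained system, reviewer corrections like this one would feed back into the model rather than simply filter its output.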

At the end of the training process, the ML algorithm processes larger input data sets to extract new relationships and build a bigger picture and better understanding of the area of interest. A significant advantage of ML is its scalability for use across “big data” data sets.

The world-class ML experts within Oracle Health Sciences, for example, are working with the Information Retrieval and Machine Learning Group within Oracle Labs to use machine learning to identify new adverse drug events from a variety of data streams.

What’s Ahead?

For the first time, the widespread availability of data, the availability of AI tools, and the very low cost of computation have come together, making it practical to develop AI techniques that augment what are now human-centric activities.

Looking forward, the explosive growth in data science, which capitalizes on AI technology, will deliver new capabilities. For example, it’s possible that AI techniques will surface new drug candidates by identifying previously unseen relationships in the data. AI can process scientific literature to identify new drug targets by correlating scientific concepts across multiple articles simultaneously, taking into account qualitative and quantitative semantics.

The execution of clinical studies, in which subjects are identified, enrolled, and followed through to completion of a trial, is a major drug development bottleneck. AI can optimize this process by detecting trends and negative signals far in advance. Using historical performance data in concert with AI offers the potential to compress the critical path for clinical drug development, while minimizing risk and, ultimately, cost.
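As a rough illustration of spotting a negative enrollment signal early, the sketch below fits a straight line to a site's cumulative enrollment and projects it forward. Plain least-squares extrapolation is used here only as a simple stand-in for the richer AI techniques discussed; the site figures, the target, and the function names are hypothetical.

```python
def fit_line(ys):
    """Least-squares fit of y = intercept + slope * week for weeks 0..n-1.
    Requires at least two observations."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def projected_enrollment(cumulative, target_week):
    """Extrapolate cumulative enrollment to a future week."""
    intercept, slope = fit_line(cumulative)
    return intercept + slope * target_week

def at_risk(cumulative, target, target_week):
    """Flag a site whose projected enrollment falls short of the target."""
    return projected_enrollment(cumulative, target_week) < target

# Hypothetical cumulative enrollment counts for weeks 0..4.
site_a = [2, 4, 6, 8, 10]
site_b = [1, 2, 2, 3, 3]

# Assumed goal: 30 subjects enrolled by week 19.
for name, history in [("site_a", site_a), ("site_b", site_b)]:
    print(name, round(projected_enrollment(history, 19), 1), at_risk(history, 30, 19))
```

After only five weeks of data, the second site's projection already falls well short of the assumed target, which is the kind of early warning that would let a study manager intervene before the shortfall becomes critical.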

If we stretch the potential of AI to support decision-making, we can imagine a world in which virtual assistants will be able to provide guidance and support at all stages of clinical development. It’s not inconceivable that a Siri will one day provide a clinical study manager with a diagnostic capability to improve enrollment in a poorly performing trial—in addition to predicting the weather.