Accelerated materials discovery using deep learning and HTVS

Discovering new materials with targeted properties traditionally involves a large number of trial tests following a series of procedures, as shown in Fig. 1 (left). However, these efforts are inevitably far from adequate given the well-recognized near-infinite chemical space (~10^60). Efficient investigation of the unexplored chemical space calls for automated techniques with smart navigation. Recent advances in machine learning, coupled with the proliferation of computing power, are enabling new computational approaches to materials challenges. High-throughput (HT) computational screening, which is now widely used, enables the thorough examination of a well-defined region of chemical space or compound library (~10^4; Fig. 1, right bottom), so that experimental efforts can focus on the most promising candidate systems in the library (<10^2). Integrating machine learning into computational screening, i.e., high-throughput virtual screening (HTVS), makes the inspection of a much larger library (>10^6; Fig. 1, right middle) feasible. ML discriminative models trained on computationally determined properties of a selected portion of the library (~10^4) can predict the same properties for all compounds in the whole library (~10^6), which helps refine the selection and ultimately locate promising candidates for experimental validation. In HTVS, the library can be manually adjusted based on feedback from experimental tests, yet those adjustments are usually not optimized. Deep generative models such as variational autoencoders (VAEs) and generative adversarial networks (GANs) offer an important solution for optimized searching of the chemical space (Fig. 1, right top).
Trained with known compounds in the chemical space, generative models can automatically generate the library for computational and experimental screening, and the generation can be optimized using the screened compounds’ properties. Fewer configurations are therefore searched in generative-model-automated screening than in HTVS, and the exploration of the whole chemical space is efficiently navigated. As a result, materials discovery can be significantly accelerated by applying these three strategies as appropriate for various design purposes and scales.
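
The HTVS step described above (compute properties for a small portion of the library, train a discriminative model on it, predict the whole library, then shortlist candidates) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `screen_library` and `toy_property` are hypothetical, the random descriptors stand in for real compound features, `toy_property` stands in for an expensive first-principles calculation, and a simple ridge regression replaces whatever ML model would be used in practice. The library size is also scaled down from the ~10^6 compounds discussed in the text.

```python
import numpy as np

# Hypothetical stand-in for an expensive first-principles calculation.
# A real workflow would call a simulation code here instead.
def toy_property(descriptor):
    weights = np.array([1.0, -0.5, 0.25, 2.0])
    return float(descriptor @ weights + 0.1)

def screen_library(features, compute_property, n_train=100, n_top=10,
                   ridge=1e-3, seed=0):
    """Surrogate-based screening: label a subset, fit, predict all, rank."""
    rng = np.random.default_rng(seed)
    n = features.shape[0]

    # 1. Pick a small portion of the library and compute its properties.
    train_idx = rng.choice(n, size=n_train, replace=False)
    y_train = np.array([compute_property(x) for x in features[train_idx]])

    # 2. Fit a ridge regression (with a bias column) as the surrogate model.
    X = np.hstack([features[train_idx], np.ones((n_train, 1))])
    A = X.T @ X + ridge * np.eye(X.shape[1])
    w = np.linalg.solve(A, X.T @ y_train)

    # 3. Predict the property for every compound in the library.
    X_all = np.hstack([features, np.ones((n, 1))])
    predictions = X_all @ w

    # 4. Shortlist the compounds with the highest predicted property.
    top_idx = np.argsort(predictions)[::-1][:n_top]
    return top_idx, predictions

# Usage on a synthetic library of 2000 "compounds" with 4 descriptors each.
library = np.random.default_rng(1).normal(size=(2000, 4))
top_idx, predictions = screen_library(library, toy_property)
print("Shortlisted compound indices:", top_idx)
```

Only `n_train` property evaluations are needed to rank the full library, which is the source of the speed-up; the shortlisted indices would then be passed on to more accurate calculations or experimental validation.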

Figure 1. Illustration of the current materials discovery paradigm and of accelerated materials discovery using first-principles calculations and materials informatics.