Using Data Science Tools to Discover New Nanostructured Materials

NEW YORK, October 28, 2013: Researchers at Columbia Engineering, led by Chemical Engineering Professors Venkat Venkatasubramanian and Sanat Kumar, have developed a new approach to designing novel nanostructured materials through an inverse design framework using genetic algorithms. The study, published in the October 28 Early Online edition of Proceedings of the National Academy of Sciences (PNAS), is the first to demonstrate the application of this methodology to the design of self-assembled nanostructures, and shows the potential of machine learning and “big data” approaches embodied in the new Institute for Data Sciences and Engineering at Columbia.

“Our framework can help speed up the materials discovery process,” says Venkatasubramanian, Samuel Ruben-Peter G. Viele Professor of Engineering, and co-author of the paper. “In a sense, we are leveraging how nature discovers new materials—the Darwinian model of evolution—by suitably marrying it with computational methods. It’s Darwin on steroids!”

Using a genetic algorithm they developed, the researchers designed DNA-grafted particles that self-assembled into the crystalline structures they wanted. Theirs was an “inverse” way of doing research. In conventional research, colloidal particles grafted with single-stranded DNA are allowed to self-assemble, and then the resulting crystal structures are examined. “Although this Edisonian approach is useful for a posteriori understanding of the factors that govern assembly,” notes Kumar, Chemical Engineering Department Chair and the study’s co-author, “it doesn’t allow us to a priori design these materials into desired structures. Our study addresses this design issue and presents an evolutionary optimization approach that was not only able to reproduce the original phase diagram detailing regions of known crystals, but also to elucidate previously unobserved structures.”
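To make the “inverse” idea concrete, here is a minimal Python sketch of a genetic algorithm that searches for design parameters whose predicted properties match a chosen target. It is an illustration only, not the researchers' algorithm: the two-parameter design, the `forward_model` function, the `TARGET` values, and all settings (population size, mutation width, number of generations) are hypothetical stand-ins for a real physical model of DNA-grafted particle assembly.

```python
import random

# Hypothetical target properties the designer wants to reach.
TARGET = (1.0, 0.21)


def forward_model(design):
    """Toy stand-in for a physics model: maps design parameters to properties."""
    x, y = design
    return (x + y, x * y)


def fitness(design):
    """Negative squared error between predicted and target properties."""
    predicted = forward_model(design)
    return -sum((p - t) ** 2 for p, t in zip(predicted, TARGET))


def evolve(pop_size=40, generations=60, sigma=0.05, seed=0):
    """Evolve designs in [0, 1]^2 toward the target; return (best, history)."""
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Elitism: carry the two best designs over unchanged.
        next_gen = population[:2]
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random candidates.
            mother = max(rng.sample(population, 3), key=fitness)
            father = max(rng.sample(population, 3), key=fitness)
            # Uniform crossover plus Gaussian mutation, clamped to [0, 1].
            child = tuple(
                min(1.0, max(0.0, rng.choice(pair) + rng.gauss(0.0, sigma)))
                for pair in zip(mother, father)
            )
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("best design:", best)
    print("predicted properties:", forward_model(best))
```

The conventional direction corresponds to calling `forward_model` on a chosen design and inspecting the result. The inverse direction fixes the desired outcome first and lets selection, crossover, and mutation search for a design that produces it.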

The researchers are using “big data” concepts and techniques to discover and design new nanomaterials, a priority area under the White House’s Materials Genome Initiative. Their methodology will revolutionize materials design, impacting a broad range of products that affect our daily lives, from drugs and agricultural chemicals such as pesticides or herbicides to fuel additives, paints and varnishes, and even personal care products such as shampoo.

“This inverse design approach demonstrates the potential of machine learning and algorithm engineering approaches to challenging problems in materials science,” says Kathleen McKeown, director of the Institute for Data Sciences and Engineering and Henry and Gertrude Rothschild Professor of Computer Science. “At the Institute, we are focused on pioneering such advances in a number of problems of great practical importance in engineering.”

Venkatasubramanian adds, “Discovering and designing new advanced materials and formulations with desired properties is an important and challenging problem, encompassing a wide variety of products in industries addressing clean energy, national security, and human welfare.” He points out that the traditional Edisonian trial-and-error discovery approach is time-consuming and costly—it can cause major delays in time-to-market as well as miss potential solutions. And the ever-increasing amount of high-throughput experimentation data, while a major modeling and informatics challenge, has also created opportunities for material design and discovery.

The researchers built upon their earlier work to develop what they call an evolutionary framework for the automated discovery of new materials. Venkatasubramanian proposed the design framework and analyzed the results, and Kumar developed the framework in the context of self-assembled nanomaterials. Babji Srinivasan, a postdoc with Venkatasubramanian and Kumar and now an assistant professor at IIT Gandhinagar, and Thi Vo, a PhD candidate at Columbia Engineering, carried out the computational research. The team collaborated with Oleg Gang and Yugang Zhang of Brookhaven National Laboratory, who carried out the supporting experiments.

The team plans to continue exploring the design space of potential ssDNA-grafted colloidal nanostructures, improve its forward models, and bring in more advanced machine learning techniques. “We need a new paradigm that increases the idea flow, broadens the search horizon, and archives the knowledge from today’s successes to accelerate those of tomorrow,” says Venkatasubramanian.

This research has been funded by a $1.4 million three-year grant from the U.S. Department of Energy.