Abstract

ALAMO: Machine learning from data and first principles

Nick Sahinidis, Carnegie Mellon University

We developed the ALAMO methodology to learn algebraic models from data that are accurate and as simple as possible. ALAMO relies on (a) integer nonlinear optimization to build low-complexity models from input-output data, (b) derivative-free optimization to collect additional data points that can be used to improve tentative models, and (c) global optimization to enforce physical constraints on the mathematical structure of the model. We present computational results comparing ALAMO with a variety of sampling and learning techniques, including Latin hypercube sampling, simple least-squares regression, and the lasso. We also describe results from applications in CO2 capture that motivated the development of ALAMO.
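As a rough illustration of idea (a) only, the sketch below selects a small subset of candidate basis functions by trading fit quality against model size. It is not ALAMO's implementation: exhaustive enumeration with a BIC score stands in for the integer optimization, and the basis set, the function names (`fit_subset`, `best_subset`), and the synthetic data are all assumptions made for this example.

```python
import itertools
import numpy as np

# Hypothetical candidate basis functions (illustrative, not ALAMO's actual set).
BASIS = {
    "1": lambda x: np.ones_like(x),
    "x": lambda x: x,
    "x^2": lambda x: x**2,
    "x^3": lambda x: x**3,
    "exp(x)": lambda x: np.exp(x),
}

def fit_subset(x, y, names):
    """Least-squares fit of y on the chosen basis functions."""
    A = np.column_stack([BASIS[n](x) for n in names])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((A @ coef - y) ** 2))
    return coef, rss

def best_subset(x, y, max_terms=3):
    """Enumerate all subsets up to max_terms and keep the lowest-BIC model."""
    n = len(y)
    best = None
    for k in range(1, max_terms + 1):
        for names in itertools.combinations(BASIS, k):
            coef, rss = fit_subset(x, y, names)
            # BIC for Gaussian errors; the floor avoids log(0) on exact fits.
            bic = n * np.log(max(rss, 1e-12) / n) + k * np.log(n)
            if best is None or bic < best[0]:
                best = (bic, names, coef)
    return best[1], best[2]

# Noise-free synthetic data (an assumption for reproducibility): y = 1 + 3 x^2.
x = np.linspace(0.1, 2.0, 40)
y = 1.0 + 3.0 * x**2
names, coef = best_subset(x, y)
print(names, coef)
```

Enumeration is only practical for a handful of basis functions; the number of subsets grows combinatorially, which is why a formulation based on integer optimization, as the abstract describes, becomes attractive for larger candidate sets.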