One of the challenges in drug discovery is to optimise the chemical composition of an initial set of promising molecules so as to improve multiple properties that influence how the drug behaves in the human body, such as potency, bioavailability and possible adverse effects. Computer-aided drug design facilitates this process by applying machine learning models that predict these properties and can virtually screen a huge number of possible molecules, focusing the search on those that will later be tested in the lab. However, some of these models may underperform when only a limited amount of data is available, which can lead them to suggest molecules with lower activity but high similarity to the initial set.

Active learning is a relatively new approach in drug discovery that assists the selection of potential molecules by suggesting, iteratively over several feedback cycles, those that could also improve these models, as well as structurally novel molecules. GSK has been implementing this approach, which could potentially reduce costs and development time by increasing the ability to build better-performing models and to suggest molecules with improved multiple properties. We are currently evaluating several optimisation algorithms that adapt to each iteration and suggest the amount of structural modification needed with respect to multiple properties.

This well-defined project seeks an enthusiastic individual who will investigate which optimisation algorithm performs best by evaluating and comparing their potential to supervise an active learning process. To do that, we will also need to develop the necessary metrics to control these decisions, preferably based on multi-parameter optimisation. The outcome of this short study will help our scientists evaluate this approach in future drug discovery projects.