Thornton group

Computational biology of proteins: structure, function and evolution

Our research builds on our accumulating knowledge of the three-dimensional structures of proteins and their complexes to understand the evolution of life in 3D and how variants and small molecules can cause or modulate diseases and ageing. This understanding should ultimately improve our ability to treat diseases and facilitate healthy ageing.

Our research is focused on three distinct but related areas:

We explore the structure, function and evolution of enzymes and their mechanisms. These basic studies help us to interpret coding variations in humans and their disease impact. We also have strong, ongoing collaborations with experimental groups working on model organisms to better understand the molecular basis of ageing.

Using structural data, we seek to understand how enzymes work and how they evolve to perform new functions. We have shown that most enzyme functions have evolved from other functions; this opens the path to rational design of novel enzymes with new functions and mechanisms. We also develop computational tools based on our analyses to improve enzyme design.

Our study of human coding variations associated with developmental disorders and other diseases aims to use protein structural knowledge to interpret their effects. Using the CATH domain database, we study variations occurring in related domains and explore how their genomic context influences the resultant disease. Our goal is to begin to trace the steps from the molecular protein variant, to its effect on the protein’s function and, from there, to the organismal ‘disease and ageing’ sub-phenotype.

Disease-associated mutations affecting the DHSW tetrad in the WD40 motif of different proteins implicated in rare diseases. (a) The β-propeller structure of the WD40 domain. (b) The hydrogen-bonding network (dashed yellow lines) between the DHSW residues of a typical propeller blade which, when disrupted by mutation, can lead to different diseases involving different proteins that contain this motif.

Ageing can be described as the most ‘integrative or generic phenotype’, in that all organisms age and this ageing is affected by both the genome and the environment. We aim to combine data across multiple organisms and multiple data types to define sub-phenotypes of ageing, and thereby improve our understanding of its molecular basis. As part of our studies on the effects of variants, we will explore why ageing makes us more susceptible to some diseases. We also aim to use computational methods to identify small molecules that may have an impact on ageing.

Future projects

For our enzyme work, our central question will be whether we can predict enzyme-function evolution – both of new substrates and new mechanisms. Can we relate changes in function to changes in the structure of the enzyme and changes in the environment? Can we automatically predict or validate enzyme catalytic mechanisms in silico from structural data? We will further develop our data resources, CSA and MACiE; develop metrics to measure promiscuity and tools to help predict it; and develop methods to predict transformations and mechanisms using deep-learning approaches.

For coding variants, we will provide web tools that relate variant, 3D structure and function, to help non-experts understand the impact of coding variants and how they generate disease phenotypes. To this end, we plan to:

visualise variants in structures and their links to diseases

develop new methods to analyse the effects of mutations in ligand binding sites

use our tools to analyse variants discovered in rare diseases and apply these methods to other specific example genes in collaboration with ‘domain’ experts

explore how the same mutations can cause many diseases, and how one disease can have many causes

For ageing, we will explore:

whether it is possible to identify sub-phenotypes to better understand the ageing process, providing a better link between the molecular and whole-organism data

whether model organisms can be used to explore the impact of small molecules (potential drugs) on ageing

whether we can incorporate clinical data into our studies to bridge the molecular and clinical levels