non-coding RNA and evolution

We know that simple Mendelian disorders can be caused by a wide variety of genetic variants,
from single nucleotide point mutations to short insertions and deletions (indels) to larger
structural variants. However, most research into complex disease has focused on applying genetic
association methods to single nucleotide polymorphisms (SNPs) in conserved, single-copy parts
of the genome. This overlooks fast-evolving, structurally variable parts of the human genome,
representing perhaps 5-10% of the total sequence, which we know are enriched for functional
non-coding sequence. In particular, miRNAs evolve rapidly and are often present in multiple,
variable copy numbers. We are exploring the contribution of such regions to quantitative traits in
a vertebrate model system, focusing on miRNA gene evolution, and developing
methods that may be applied to the structurally diverse portion of the human
genome. Cichlid fish from the Great Lakes of Africa present an unparalleled opportunity to study
the genetic basis of vertebrate phenotypes in a rapidly evolving system. In the last 5 million
years, the period since humans began to diverge from chimpanzees, African cichlids have radiated into more
than 1,500 species that differ in craniofacial morphology, pigmentation, behavior and many
other traits. These species can be considered a collection of “natural mutants”, screened by
evolutionary selection for phenotypes expressed at all life stages. Because of their close genetic
relationship, they share a fundamentally similar genetic system, which allows us to explore the
specific genetic differences responsible for their diverse phenotypes. In recognition of this
potential, five African cichlid genomes were sequenced in early 2011 (Broad Institute). As with
other fish genomes, these show extensive structural and copy number variation, even within a
single diploid individual. In collaboration with Richard Durbin, we are using genome sequencing
to investigate genetic diversity, and the contribution of copy-number-variable regions and
non-coding RNAs to phenotypic diversity, within a set of 100 phenotypically diverse but closely
related Malawi haplochromine cichlids. We expect this work to deliver unprecedented
insights into functional non-coding RNAs in general and miRNA-target interactions in
particular.