Regulation of genetic response to combined abiotic stress

I am fascinated by the complexity of how crop plants respond to environmental stresses, particularly those that disrupt agricultural systems as a result of climate change. Plants are diverse in their responses to stress, yet across species the underlying pathways are tightly regulated and largely interwoven: many of the components that regulate the response to one kind of stress also function in the response to others. This "crosstalk" between pathways makes it difficult to predict how plants will respond to multiple stresses from the single-stress studies commonly done in labs.

In nature, plants are rarely exposed to just one stress at a time, so understanding how these response pathways talk to each other is critical to understanding how plants will respond to climate change. The first project I'm tackling during my PhD is to uncover the cis-regulatory code of plant response to combined stress using multi-dimensional data integration and machine learning. In a nutshell, I identify short sequences in gene promoter regions that are shared among genes co-expressed under single and combined stress scenarios. Then I use information about those sequences, such as their overlap with known transcription factor (TF) binding sites, their chromatin accessibility, their proximity to histone marks, their sequence conservation, and more, to assess how likely they are to be true regulatory elements. Finally, I build machine learning models to determine how well these putative regulatory elements predict a gene's response to the stress conditions. As I progress through my PhD, I hope to continue taking an interdisciplinary approach to better understand how this complex system is regulated.
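To make the first step concrete, here is a minimal sketch of one way to look for short sequences shared among co-expressed genes: count which promoters contain each k-mer and test for over-representation in the co-expressed cluster with a hypergeometric test. The function names, the toy promoter sequences, the choice of k = 6, and the uncorrected 0.05 threshold are all assumptions made for illustration, not the pipeline used in the project.

```python
from math import comb


def kmers_in(seq, k):
    """Return the set of distinct k-mers present in one promoter sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hypergeom_upper_tail(x, N, K, n):
    """P(X >= x) when n promoters are drawn from N, K of which carry the k-mer."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(x, min(K, n) + 1))
    return hits / total


def enriched_kmers(cluster, others, k=6, alpha=0.05):
    """Return (kmer, cluster_hits, total_hits, p_value) for enriched k-mers.

    cluster: gene ID -> promoter sequence for the co-expressed genes.
    others:  gene ID -> promoter sequence for all remaining genes.
    No multiple-testing correction is applied in this sketch.
    """
    cluster_sets = [kmers_in(s, k) for s in cluster.values()]
    other_sets = [kmers_in(s, k) for s in others.values()]
    N = len(cluster_sets) + len(other_sets)  # all promoters
    n = len(cluster_sets)                    # promoters in the cluster
    results = []
    for kmer in sorted(set().union(*cluster_sets)):
        x = sum(kmer in s for s in cluster_sets)      # cluster promoters with k-mer
        K = x + sum(kmer in s for s in other_sets)    # all promoters with k-mer
        p = hypergeom_upper_tail(x, N, K, n)
        if p <= alpha:
            results.append((kmer, x, K, p))
    return sorted(results, key=lambda r: r[3])


# Toy sequences, invented for illustration only.
cluster = {"g1": "TTACGTGGCA", "g2": "GGACGTGGTT", "g3": "CAACGTGGAT"}
others = {
    "g4": "TTTTTTTTTT",
    "g5": "GGGGCCCCAA",
    "g6": "ATATATATAT",
    "g7": "CCCCGGGGTT",
    "g8": "AAAACCCGTA",
}

hits = enriched_kmers(cluster, others)
for kmer, in_cluster, in_all, p in hits:
    print(kmer, in_cluster, in_all, round(p, 4))
```

In a real analysis the candidate k-mers would then be scored against the additional evidence described above (TF binding, chromatin accessibility, conservation) and used as features in a predictive model; this sketch covers only the enrichment step.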

http://www.nextgencassava.org/genomic_select.html

Helping plant breeders by developing tools for Genomic Selection

Since the early 1990s, breeders have used marker-assisted selection (MAS) to select lines that contain markers linked to traits of value. While MAS has been successful for certain traits, it has proved less useful for quantitative traits, like drought resistance, that are controlled by many small-effect alleles. To breed for such traits, a genome-wide method called genomic selection (GS) was established (Meuwissen et al. 2001). GS is often favored over MAS because breeders do not need to know a priori which markers are important for a trait, and small-effect QTL can be accounted for. Since 2001, there has been an explosion in the number of GS methods available. The methods vary in whether and how they account for non-linear interactions, population structure, differences in effect size among markers, and epistatic interactions between markers. Recently, deep learning-based approaches have been used to predict traits from genomic data. Briefly, deep learning refers to machine learning approaches that apply successive layers of transformations to the input features to create more abstract features (the hidden layers), which are used for the final predictions. The goal of this project is to develop guidelines to help breeders determine which statistical methods to use in their GS studies.
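As a rough illustration of what "genome-wide" means here, the sketch below fits all marker effects at once with ridge regression, one of the simpler approaches in the GS family, and uses them to predict the trait value of an unphenotyped line. It is not the method this project recommends. The genotype matrix (coded 0/1/2), the "true" effects used to simulate noise-free phenotypes, the penalty values, and all function names are assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def ridge_marker_effects(genotypes, phenotypes, lam=1.0):
    """Estimate every marker effect jointly: (Z'Z + lam*I) b = Z'y."""
    n, p = len(genotypes), len(genotypes[0])
    # Center markers and phenotypes so no explicit intercept is needed.
    col_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    Z = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]
    y_mean = sum(phenotypes) / n
    y = [v - y_mean for v in phenotypes]
    # Normal equations with the ridge penalty added to the diagonal.
    A = [[sum(Z[i][a] * Z[i][b] for i in range(n)) + (lam if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(Z[i][a] * y[i] for i in range(n)) for a in range(p)]
    return solve(A, rhs), col_means, y_mean


def predict(genotype, effects, col_means, y_mean):
    """Predicted trait value for one line from its marker genotype."""
    return y_mean + sum((g - m) * e for g, m, e in zip(genotype, col_means, effects))


# Toy training population: 6 lines x 3 markers (invented values).
genotypes = [[0, 1, 2], [1, 0, 2], [2, 1, 0], [0, 2, 1], [1, 2, 2], [2, 0, 0]]
true_effects = [1.0, 0.5, -0.5]  # used only to simulate phenotypes
phenotypes = [10.0 + sum(e * g for e, g in zip(true_effects, row)) for row in genotypes]

effects, col_means, y_mean = ridge_marker_effects(genotypes, phenotypes, lam=1.0)
new_line = [2, 2, 0]  # an unphenotyped candidate
print([round(e, 3) for e in effects])
print(round(predict(new_line, effects, col_means, y_mean), 3))
```

The penalty `lam` shrinks all marker effects toward zero by the same amount. Many of the differences among GS methods mentioned above come down to replacing that uniform shrinkage with something else, such as marker-specific effect sizes or non-linear terms.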