Predicting and understanding inter-locus DNA interactions

Author(s)

Advisor

Manolis Kellis.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Most computational methods treat DNA as a linear sequence of information, when in fact its 3D architecture and organization contain important structural and functional elements that provide valuable information about DNA's role in mediating cellular processes. However, this 3D conformation has been extremely difficult to profile: it requires vast experimental resources, which limits the number of cell types for which it is available. In this thesis, I address this limitation with a computational approach that predicts 3D conformation from diverse genomic annotations that are much more easily and broadly available. Hi-C maps for lineage-committed IMR90 cells and pluripotent H1 cells provide information on features that are inherent to long-range interactions in all cell types, as well as features specific to individual cell types. Previous work in the lab used support vector machines (SVMs) on 5C data to find important features. While SVM performance is competitive, it is difficult to determine from a trained SVM which features are useful. I use alternating decision trees, a supervised learning technique that potentially provides a more transparent relationship between features, to analyze proximal and distal genome interactions and to determine the sequence and regulatory elements that are important for these interactions. In particular, I separate the data set by interaction distance to investigate how the mechanisms of chromatin organization vary spatially. Additionally, I extend the alternating decision tree learning algorithm to model the distance-dependent nature of these interactions.
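
As a rough illustration of why an alternating decision tree can be easier to inspect than an SVM, the sketch below scores one pair of loci with a tiny hand-built tree. It is not the thesis implementation, and it does not include the distance-dependent extension. The feature names (`ctcf_both`, `enhancer_mark_both`, `distance_kb`), the tree shape, and every numeric vote are hypothetical values chosen for readability, not results. The only property taken as given is the standard one for alternating decision trees: the score of an example is the sum of the prediction values on every path whose tests it satisfies, and the sign of that sum is the predicted class.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class Splitter:
    """Decision node: a readable test plus one prediction node per outcome."""
    name: str
    test: Callable[[Features], bool]
    if_true: "Prediction"
    if_false: "Prediction"


@dataclass
class Prediction:
    """Prediction node: a real-valued vote and any splitters nested under it."""
    value: float
    splitters: List[Splitter] = field(default_factory=list)


def score(node: Prediction, x: Features) -> float:
    """Sum the votes of all prediction nodes reachable for example x."""
    total = node.value
    for splitter in node.splitters:
        child = splitter.if_true if splitter.test(x) else splitter.if_false
        total += score(child, x)
    return total


# Hypothetical tree: all feature names and vote values are illustrative only.
root = Prediction(-0.5, [
    Splitter(
        "CTCF bound at both loci",
        lambda x: x["ctcf_both"] > 0.5,
        if_true=Prediction(0.75, [
            Splitter(
                "loci closer than 100 kb",
                lambda x: x["distance_kb"] < 100,
                if_true=Prediction(0.5),
                if_false=Prediction(-0.25),
            ),
        ]),
        if_false=Prediction(-0.25),
    ),
    Splitter(
        "enhancer mark at both loci",
        lambda x: x["enhancer_mark_both"] > 0.5,
        if_true=Prediction(0.5),
        if_false=Prediction(-0.5),
    ),
])

# A made-up locus pair: CTCF at both ends, 50 kb apart, no shared enhancer mark.
example_pair = {"ctcf_both": 1.0, "distance_kb": 50.0, "enhancer_mark_both": 0.0}

total = score(root, example_pair)
print(total, "interacting" if total > 0 else "non-interacting")  # 0.25 interacting
```

Each vote is attached to a named test, so the contribution of a feature to a given prediction can be read directly off the tree. Under these assumptions, that is the sense in which the model is more transparent than a set of SVM weights in a kernel space.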