Taught By

Jeff Leek, PhD

Associate Professor, Biostatistics

Transcript

Welcome to Week 2 of Statistics for Genomic Data Science. Last week, we got started and learned about exploratory data analysis. This week, we're going to learn about pre-processing and then begin statistical modeling.

Pre-processing is the idea that when you collect genomic measurements, especially across multiple samples, the measurements are often not directly comparable. The machine you use to collect the measurements might vary from day to day, or different reagents might be used, and these differences translate into differences in the data from sample to sample. We're going to talk about some of the most common and basic techniques used to make samples, and the data from those samples, more comparable before doing statistical analysis. One example is quantile normalization, which I've shown a picture of here.

The other thing we're going to talk about is statistical modeling. We're going to use the linear model as the basic model for most of the statistics in this class, so we're going to get into how a linear model works: what are its components, and how do you fit it? Then we'll hopefully move on a little into interpretation and statistical significance, which will be coming up in Week 3. All right.
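To make the two ideas from the lecture a bit more concrete, here is a minimal sketch in Python with NumPy. It is not the course's own code. The function names `quantile_normalize` and `fit_linear_model`, the toy matrix, and the layout (rows as features, columns as samples) are all assumptions made for illustration. The quantile normalization here breaks ties arbitrarily rather than averaging them, and the linear model is a plain least-squares fit with one covariate and an intercept.

```python
import numpy as np


def quantile_normalize(data):
    """Quantile-normalize a matrix (rows = features, columns = samples).

    Hypothetical helper: every column is forced onto the same distribution,
    namely the mean of the sorted columns. Ties are broken arbitrarily.
    """
    data = np.asarray(data, dtype=float)
    order = np.argsort(data, axis=0)       # sort order within each sample
    ranks = np.argsort(order, axis=0)      # rank of each value in its sample
    sorted_vals = np.sort(data, axis=0)    # each sample sorted independently
    reference = sorted_vals.mean(axis=1)   # mean value at each rank
    return reference[ranks]                # map every rank to the reference


def fit_linear_model(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares; returns [b0, b1]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(x), x])  # intercept + covariate
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Toy data (made up): 4 features measured in 3 samples, no ties.
raw = np.array([[5, 4, 3],
                [2, 1, 4],
                [3, 7, 6],
                [4, 2, 8]])
normalized = quantile_normalize(raw)
print(normalized)

# Toy regression (made up): y is exactly 1 + 2 * x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x
print(fit_linear_model(x, y))
```

After normalization, every column contains the same set of values, and within each column the ordering of the features is unchanged. That is the sense in which the samples become more comparable.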
