Course Prerequisites: Basic Algorithms and High-Level Languages.
We are expecting students with diverse backgrounds (CS, Math,
Biology, Biomedicine, Engineering, etc.), and hence will try
to make the course as self-contained as possible...

The course focuses on statistically determining the relations between
genotypes and phenotypes. We now know that the human genome contains
millions of SNPs (single-nucleotide polymorphisms), and thousands more
variations in the number of copies of large and small segments of the
genome (CNVs: copy number variations), which may either directly cause
changes in phenotype (e.g., TAS) or tag nearby mutations
containing the key differences that influence individual variation
(e.g., TASPs) and susceptibility to disease.

GWA (Genome-Wide Association) studies allow one to sample a large number
of SNPs from many patients, thus capturing variation uniformly across
the genome. Recently, there has been enormous interest in such
studies as they have succeeded in identifying risk and protective
factors for asthma, cancer, diabetes, heart disease, mental illness
and other human differences. For instance, in 2005, it was learned
through a small-scale GWAS that age-related macular degeneration is
associated with variation in the gene for complement factor H, which
produces a protein that regulates inflammation. One expects GWAS
to play a significant role in drug discovery and personalized
medicine, and to be important in modern models of health care
(e.g., evidence-based medicine). For example, it was found that
patients' genetic variants are associated with different responses to
various anti-hepatitis C virus treatments: for genotype 1 hepatitis C,
treated with Pegasys combined with ribavirin, genetic polymorphisms
near the human IL28B gene are strongly associated with responses to
the treatment. One
expects to find and catalogue many such facts.

This course will focus on the algorithmic, statistical and genetic
aspects of this problem. Thus, we will develop specialized methods for
Machine Learning (supervised and unsupervised), Classification, Model
Selection, Multiple Hypotheses Testing and Experiment Design (pooling
and group-testing).
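
To make the statistical side of the problem concrete, here is a minimal
sketch of a per-SNP case/control association test followed by a
Bonferroni correction for multiple hypotheses. It is an illustration
only, not the method the course prescribes: the function names, the
SNP names and all allele counts are hypothetical, and the choice of a
1-degree-of-freedom Pearson chi-square test on allele counts is an
assumption made for simplicity.

```python
import math


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 d.f.) for a 2x2 allele-count table."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            if expected == 0:
                # Monomorphic SNP or empty group: no evidence either way.
                return 0.0
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def chi2_pvalue_1df(chi2):
    """Upper-tail p-value of a chi-square variable with 1 d.f.

    With 1 d.f. the statistic is a squared standard normal, so
    P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni_hits(snp_counts, alpha=0.05):
    """Return (name, p-value) for SNPs significant after Bonferroni correction."""
    threshold = alpha / len(snp_counts)  # one test per SNP
    hits = []
    for name, counts in snp_counts.items():
        p = chi2_pvalue_1df(allelic_chi2(*counts))
        if p < threshold:
            hits.append((name, p))
    return hits


# Hypothetical allele counts: (case_alt, case_ref, ctrl_alt, ctrl_ref).
toy_counts = {
    "snp_A": (60, 40, 30, 70),  # alternate allele enriched in cases
    "snp_B": (50, 50, 50, 50),  # no difference between groups
    "snp_C": (55, 45, 50, 50),  # small difference
}

hits = bonferroni_hits(toy_counts)
for name, p in hits:
    print(name, p)
```

With millions of SNPs the Bonferroni threshold becomes extremely
small, which is one reason multiple hypotheses testing and careful
experiment design receive separate attention.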