An introduction to data integration and statistical methods used in contemporary Systems Biology, Bioinformatics and Systems Pharmacology research. The course covers methods for processing raw data from genome-wide mRNA expression studies (microarrays and RNA-seq), including data normalization, differential expression, clustering, enrichment analysis and network construction. It contains practical tutorials on using tools and setting up pipelines, and it also covers the mathematics behind the methods those tools apply. The course is best suited to beginning graduate students and advanced undergraduates majoring in fields such as biology, math, physics, chemistry, computer science, and biomedical and electrical engineering. It should also be useful for researchers who encounter large datasets in their own research. The course presents software tools developed by the Ma’ayan Laboratory (http://icahn.mssm.edu/research/labs/maayan-laboratory) at the Icahn School of Medicine at Mount Sinai, as well as other freely available data analysis and visualization tools. The ultimate aim is to enable participants to apply the methods presented to their own data and projects. For participants who do not work in the field, the course introduces the current research challenges in computational systems biology.

From the lesson

Deep Sequencing Data Processing and Analysis

The lectures in the 'Deep Sequencing Data Processing and Analysis' module cover the basic steps and popular pipelines for analyzing RNA-seq and ChIP-seq data, going from raw data to gene lists to figures. These lectures also cover UNIX/Linux commands and some programming elements of R, a popular, freely available statistical software environment. Note that because these lectures were developed and recorded in the fall of 2013 and the field is advancing rapidly, better tools may now be available.