All Genetics majors must complete a lab course to satisfy the degree requirements. This requirement can be met by taking this course or any one of 01:447:315, 01:694:214, 01:694:215, 01:447:302, or 01:447:203.

Offered

Spring Term Only, Mondays and Thursdays, 1:20-4:20 PM

Credits

3

Pre-requisites

Students must have previously completed Genetic Analysis I (01:447:384) or Genetics (01:447:380).

Course Restrictions

This course is limited to Genetics majors. Other students may be added by special permission number if space in the computer lab is available.

Course Description

The main focus of this course is the application of R programming to the analysis of genetic data, particularly “big data” sets with multiple measurements. The primary data sets considered will contain RNA-seq and/or other expression data for many or all genes in a given set of individuals. This course is intended for juniors and seniors who are considering careers at the intersection of the life sciences, statistics, and/or computer science, particularly students majoring in Genetics. The course fulfills the laboratory requirement for the Genetics major.

Students will learn how to acquire such data, format it for R, plot the data, and perform statistical analyses. In addition, students will learn how to simulate data under different hypotheses, and how to perform power and sample size calculations for different statistical methods applied to real or simulated data.
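As a rough illustration of the simulation-based power calculations described above, the sketch below estimates an empirical type I error rate and empirical power for a two-group comparison of a single gene's expression level. It is written in Python using only the standard library, purely for illustration; coursework itself is done in R. The function name simulate_rejection_rate, the normal model for expression values, the effect size, the group size, and the large-sample z-test are all assumptions made for this example and are not taken from the course materials.

```python
import random
import statistics
from statistics import NormalDist


def simulate_rejection_rate(effect, n_per_group, n_sims=2000, alpha=0.05, seed=1):
    """Return the fraction of simulated experiments that reject the null.

    Hypothetical model: expression values are normal with unit variance,
    and the treated group's mean is shifted by `effect`.
    """
    rng = random.Random(seed)
    # Two-sided critical value from the standard normal distribution
    # (a large-sample approximation to the t-test, assumed for simplicity).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect, 1.0) for _ in range(n_per_group)]
        # Standard error of the difference in means for equal group sizes.
        se = ((statistics.variance(control) + statistics.variance(treated))
              / n_per_group) ** 0.5
        z = (statistics.mean(treated) - statistics.mean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims


# Simulating under the null hypothesis (no shift) estimates the type I error rate.
type1 = simulate_rejection_rate(effect=0.0, n_per_group=30)
# Simulating under an alternative (shift of one standard deviation) estimates power.
power = simulate_rejection_rate(effect=1.0, n_per_group=30)
print(f"empirical type I error: {type1:.3f}")
print(f"empirical power:        {power:.3f}")
```

Rerunning the function over a range of n_per_group values gives a simple, simulation-based way to choose a sample size that reaches a target power.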

Each class consists of a mixture of lecture and computer-based demos and/or exercises, as well as time for students to work on assignments. Guest investigators will frequently give short presentations (in person or by Skype) to illustrate how programming and informatics are critical to their research. The course provides the introductory skills needed to conduct basic computational research in the life sciences, including many aspects of computer programming and data analysis.

Course Goals

The goals of Honors Computational Genetics reflect the learning goals of the Department of Genetics, which include (1) knowledge-specific goals: knowing the terms, concepts, and theories in genetics; and (2) integrating material from multiple courses and research. The specific goals of this course are (1) to learn R programming, specifically methods for acquiring and analyzing big data from genomics repositories; (2) to discover online repositories for genomic data sets; (3) to learn the fundamentals of statistical analysis for such data sets; (4) to perform empirical type I error and power evaluations for different statistics applied to expression data by writing R programs that simulate data from mathematical models; and (5) to learn the fundamentals of experimental design for expression-data statistics.

The computer lab has Windows 8 computers. To continue working at home, copy class materials and files to a portable USB flash drive (Windows-formatted) after each class. No textbook is required, as most of the needed material is made available during class. For those who prefer to have a printed book on hand, a useful resource is:

The Art of R Programming: A Tour of Statistical Software Design, 1st Edition, by Norman Matloff

Attendance is expected at all classes; in-class demos and exercises are an integral part of this course, and it is difficult to make up work when a class is missed. If you must miss a class, please use the University absence reporting website https://sims.rutgers.edu/ssra/ to indicate the date and reason for your absence. An email is automatically sent to the instructors. Completion of all assignments is required, including any that were missed due to absence from class.

Students will be assigned weekly projects based on current material. The final grade is based on the grades received on these projects, quizzes, and a final exam.