Students will locate and download high-throughput sequence data and genome annotation files from publicly available data repositories.

Students will use Galaxy to create an automated computational workflow that performs sequence quality assessment, trimming, and mapping of RNA-seq data.

Students will analyze and interpret the outputs of RNA-seq analysis programs.

Students will identify a group of genes that is differentially expressed between treatment and control samples, and interpret the biological significance of this list of differentially expressed genes.

Define similarity in both a non-biological and a biological sense when provided with two strings of letters.

Quantify the similarity between two gene/protein sequences.

Explain how a substitution matrix is used to quantify similarity.

Calculate amino acid similarity scores using a scoring matrix.
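As an illustration of the similarity objectives above, the sketch below compares two aligned strings in two ways: a letter-by-letter percent identity (the non-biological sense) and a summed score from a substitution matrix (the biological sense). The matrix `TOY_MATRIX` covers only four amino acids, and its values are illustrative assumptions rather than a complete published matrix. The function names are hypothetical as well.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions where the two letters are identical."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Toy substitution matrix (assumed values, for illustration only).
# Identical residues score highest, chemically similar pairs (L/I, K/R)
# score positive, and dissimilar pairs score negative.
TOY_MATRIX = {
    ("L", "L"): 4, ("I", "I"): 4, ("K", "K"): 5, ("R", "R"): 5,
    ("L", "I"): 2, ("K", "R"): 2,
    ("L", "K"): -2, ("L", "R"): -2, ("I", "K"): -3, ("I", "R"): -3,
}


def pair_score(a, b):
    """Look up one residue pair; the matrix is symmetric, so try both orders."""
    if (a, b) in TOY_MATRIX:
        return TOY_MATRIX[(a, b)]
    return TOY_MATRIX[(b, a)]


def alignment_score(seq_a, seq_b):
    """Sum the substitution scores over every aligned position (no gaps)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(pair_score(a, b) for a, b in zip(seq_a, seq_b))


print(percent_identity("LIKR", "LLKK"))
print(alignment_score("LIKR", "LLKK"))
```

Both mismatches in this pair (I/L and R/K) are conservative substitutions, so the matrix score stays well above zero even though only half of the positions are identical. That gap between the two measures is the point of using a substitution matrix.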

Demonstrate how to access genomic data (e.g., from NCBI nucleotide and protein databases).

Demonstrate how to use bioinformatics tools (e.g., BLASTP) to analyze genomic data, explain a simplified BLAST search algorithm, including how similarity is used to perform the search, and evaluate the results of a BLAST search.

Create a nearest-neighbor distance matrix.

Create a multiple sequence alignment using a nearest-neighbor distance matrix, and create a phylogram based on the similarity of amino acid sequences.
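The distance-matrix objectives above can be sketched in a few lines. This minimal example assumes that distance means the number of mismatched positions between two equal-length sequences, and that the nearest neighbors are the pair with the smallest distance, which would be the first pair joined when building an alignment or tree. The sequences and the names `A` to `D` are made up for illustration.

```python
from itertools import combinations


def mismatch_distance(seq_a, seq_b):
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def distance_matrix(seqs):
    """Build a symmetric matrix as nested dicts: matrix[name1][name2]."""
    matrix = {name: {name: 0} for name in seqs}
    for n1, n2 in combinations(seqs, 2):
        d = mismatch_distance(seqs[n1], seqs[n2])
        matrix[n1][n2] = d
        matrix[n2][n1] = d
    return matrix


def closest_pair(matrix):
    """Return (name1, name2, distance) for the smallest off-diagonal entry."""
    pairs = [(n1, n2, matrix[n1][n2]) for n1, n2 in combinations(matrix, 2)]
    return min(pairs, key=lambda p: p[2])


# Hypothetical sequences, invented for this example.
seqs = {"A": "LIKRLI", "B": "LIKRLL", "C": "LLKKLL", "D": "KRKRLI"}
matrix = distance_matrix(seqs)

for name in seqs:
    print(name, [matrix[name][other] for other in seqs])
print(closest_pair(matrix))
```

A substitution-matrix score could replace the mismatch count as the measure of distance; the simple count is used here only to keep the arithmetic easy to check by hand.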

Demonstrate proper use of standard laboratory items, including a two-stop pipette, stereomicroscope, and laboratory notebook.

Calculate means and standard deviations.

Given some scaffolding (instructions), select the correct statistical test for a data set; run a t-test, ANOVA, chi-squared test, and linear regression in Microsoft Excel; and correctly interpret the results.
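The objectives above name Microsoft Excel as the tool. As a cross-check on what a spreadsheet computes, the sketch below works through the mean, the sample standard deviation, and a two-sample t statistic using only the Python standard library. It assumes the unequal-variance (Welch) form of the t-test and uses made-up measurements. It stops at the t statistic because converting it to a p-value needs a t distribution, which the standard library does not provide.

```python
import math
import statistics


def welch_t(sample_1, sample_2):
    """Two-sample t statistic without assuming equal variances (Welch)."""
    mean_1 = statistics.mean(sample_1)
    mean_2 = statistics.mean(sample_2)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_1 = statistics.variance(sample_1)
    var_2 = statistics.variance(sample_2)
    standard_error = math.sqrt(var_1 / len(sample_1) + var_2 / len(sample_2))
    return (mean_1 - mean_2) / standard_error


# Hypothetical measurements, invented for this example.
control = [4.0, 5.0, 6.0]
treated = [7.0, 8.0, 9.0]

print("control mean:", statistics.mean(control))
print("control SD:  ", statistics.stdev(control))
print("treated mean:", statistics.mean(treated))
print("treated SD:  ", statistics.stdev(treated))
print("t statistic: ", welch_t(treated, control))
```

A large absolute t value means the difference between the group means is large relative to the variation within the groups. Whether it is statistically significant still depends on the degrees of freedom and the chosen significance level.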