Pre-College Program in Computational Biology

Starting in summer 2019, the Computational Biology Department will offer a three-week Pre-College Program in Computational Biology. This program is the first summer computational biology program in the United States designed for high school students.

In the program, our students (most of whom will be rising high school seniors) will learn the computational and laboratory skills needed in modern biology. Traditionally, these skills have been taught as part of disjoint courses, but our pre-college program will highlight the interplay between generating biological datasets in the lab and analyzing these datasets computationally.

Details on the curriculum are provided below and will be updated as the program evolves. If you’re interested in joining us, we would love to have you! We’re looking for students who enjoy math and science and have done a little bit of coding (but by no means do we expect expert programmers). For more information, including how to apply (deadline March 1), see the program homepage at the admission website: https://admission.enrollment.cmu.edu/pages/pre-college-computational-biology. For specific questions about the program, shoot us an email at compbio-precollege@cmu.edu and we would love to start a conversation.

2019 Dates

Pre-College Program in Computational Biology Curriculum

Program Overview

The Pre-College Program in Computational Biology will immerse students in the cutting-edge laboratory and computational methods needed to generate and analyze real biological datasets. The first day is devoted to computational and laboratory bootcamps that get students up to speed in programming and basic “wet lab” techniques. On the second day, students will take a day-long trip on Pittsburgh’s Three Rivers, not only to sample water but also to learn about ecology (and, of course, to take in the city’s beautiful bridges and architecture); see photo above.

Why are Pittsburgh’s Three Rivers an interesting biological environment? The Allegheny and Monongahela Rivers flow from somewhat rural landscapes into an urban environment with a history of industrial run-off, then merge to form the Ohio River, which continues westward to its eventual confluence with the Mississippi. Even a small sample of river water holds an invisible ecosystem of microorganisms (bacteria and viruses). Only recently have researchers developed methods that let us start to understand, for each river, what these microbes are, what they do, and how they have evolved.

What is so interesting about bacteria? A landmark paper by Hug et al., published in 2016 in Nature Microbiology, provided the evolutionary tree below. In it, we see that of the three domains of life, the eukaryotes (i.e., everything you have ever seen that is alive, and some things that you haven’t) make up the smallest component of the tree, meaning that they have the least genetic diversity. By far the most genetic diversity, and the largest part of the tree, is found in bacteria. This makes sense! Bacteria have been around a lot longer than we have, and they replicate and mutate quickly, so they have been able to move into environments that we could never dream of living in (such as oil wells, deep-sea vents, and polluted rivers 🙂) as well as produce a host of interesting compounds. For example, many of the antibiotics used to stop infections were borrowed from bacteria that had evolved to use these compounds to kill their enemies.

The picture of an evolutionary tree for all living things is truly worth a thousand words. Source: Discovery Magazine.

But how is an evolutionary tree like this produced? We must sequence DNA from the same gene in many species. What is the lab method we can use to sequence this DNA from a biological sample (like river water)? And once we obtain the DNA, how do we train a computer to build this evolutionary tree?

These questions are just the beginning of the inquiries we will make by integrating laboratory work with computational approaches as part of the Pre-College Program in Computational Biology. The current week-by-week syllabus, which is subject to change as we continue to develop educational modules, is detailed below.

Week-by-Week Curriculum

Week 1

Coding bootcamp: How will programming help us solve biological problems that cannot be solved in the lab alone?

River sampling: How can we collect biological samples from the rivers while minimizing contamination and maximizing the yield of biological material? What other features of the rivers (e.g., ambient temperature or recent precipitation) are important?

DNA Extraction: In a sample of various biological specimens (river water), how can we extract all of the DNA present (and eliminate everything else)?

16S sequencing: How can we use a conserved gene to help determine the relative abundances of different species of bacteria in our extracted river water DNA sample?

16S sequencing analysis: Given the sequence of a strand of DNA, how can we determine the species from which it came?

Bacterial Isolation for whole genome sequencing: If we want to sequence the genome of a single bacterial cell, how can we isolate one cell from a river water sample?

Predicting Replication Origins: Using sequencing data, how can we predict where replication begins in a bacterial genome?

Testing Replication Origin Predictions: How can we modify the bacterial genome to help us test our prediction?

Bioimaging: How can we identify bacterial colonies using microscopes to capture images?
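To give a flavor of the "Predicting Replication Origins" question above, here is a minimal Python sketch of one classic heuristic, GC skew. In many bacterial genomes, the running count of G minus C tends to reach its minimum near the replication origin. The syllabus does not say which method the program uses, so treat this as an illustration only; the function name and the toy sequence are hypothetical.

```python
def gc_skew_minimum(genome: str) -> int:
    """Return the prefix length at which the running (G - C) count is lowest.

    Illustrative heuristic only: in many bacterial genomes this minimum
    falls near the replication origin. Ties go to the earliest position.
    """
    skew = 0        # running count of G minus C
    best_skew = 0   # lowest skew seen so far (the empty prefix has skew 0)
    best_pos = 0    # prefix length at which that lowest skew occurred
    for i, base in enumerate(genome.upper(), start=1):
        if base == "G":
            skew += 1
        elif base == "C":
            skew -= 1
        if skew < best_skew:
            best_skew = skew
            best_pos = i
    return best_pos


# Toy sequence (hypothetical, not real data): skew rises, falls, then rises.
toy_genome = "GGGCCCCCCGGG"
print(gc_skew_minimum(toy_genome))
```

On a real genome the skew curve is noisy, so the minimum marks a region in which to look for the origin rather than an exact coordinate, which is one reason a prediction like this still needs to be tested in the lab.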

Week 2

Whole Genome Sequencing: How can we read a short fragment of DNA excised from a bacterial genome? Why can sequencing machines only read short fragments of DNA and not entire genomes?

Whole Genome Reconstruction: After producing many DNA fragments that we can read, can we reconstruct the full genome from thousands of relatively short sequencing reads?

Mass Spectrometry: How can we determine what else is in the water samples that may be affecting microbial diversity?

Bacteria Identification: How can we use computational techniques to understand and characterize images of bacterial colonies?
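The "Whole Genome Reconstruction" question above can be made concrete with a toy example. The Python sketch below repeatedly merges the two reads that share the longest suffix-to-prefix overlap. This greedy strategy is a simplification chosen for illustration; it is not a description of the program's actual method or of how production assemblers work, and the function names, the `min_len` parameter, and the reads are all hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Greedily merge the best-overlapping pair of reads until none overlap.

    Returns the list of remaining sequences (one sequence if everything merged).
    """
    reads = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining pair overlaps enough to merge
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Toy reads (hypothetical) drawn from the sequence ATGGCGTACGTTAGC.
toy_reads = ["TACGTTAGC", "ATGGCGTA", "GCGTACGT"]
print(greedy_assemble(toy_reads))
```

Real sequencing data adds complications that this sketch ignores, including read errors, reads from both DNA strands, and repeated regions that make greedy merging go wrong.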

Week 3

Building Phylogenies: How can we determine evolutionary relationships between organisms? Specifically, given genes from a host of different species, how can we construct an evolutionary tree for these species to determine how they have evolved?

Presenting Scientific Results: What are good strategies for conveying the results of scientific experiments? What are the fundamentals of giving a good scientific talk?

Fluorescence Microscopy: How can we use fluorescence to image eukaryotes and prokaryotes?

Fluorescence Microscopy Image Analysis: How can we analyze fluorescence images to help to classify eukaryotes and prokaryotes?
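For the "Building Phylogenies" question above, one simple way to turn gene sequences into a tree is to compute pairwise distances and then cluster the closest groups step by step. The Python sketch below uses Hamming distance on equal-length sequences and UPGMA clustering, and returns only the tree shape as nested tuples. The choice of UPGMA is an assumption made for illustration, since the syllabus does not name an algorithm, and the function names and sequences are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def upgma(seqs: dict):
    """Cluster sequences with UPGMA and return the tree as nested tuples.

    seqs maps a species name to its (already aligned) gene sequence.
    Branch lengths are not computed; only the tree shape is returned.
    """
    sizes = {name: 1 for name in seqs}  # cluster -> number of leaves in it
    dist = {(a, b): float(hamming(seqs[a], seqs[b]))
            for a, b in combinations(seqs, 2)}
    while len(sizes) > 1:
        a, b = min(dist, key=dist.get)  # closest pair of clusters
        merged = (a, b)
        new_dist = {}
        for c in sizes:
            if c == a or c == b:
                continue
            d_ac = dist.get((a, c), dist.get((c, a)))
            d_bc = dist.get((b, c), dist.get((c, b)))
            # UPGMA rule: size-weighted average of the two old distances
            new_dist[(merged, c)] = (
                (sizes[a] * d_ac + sizes[b] * d_bc) / (sizes[a] + sizes[b])
            )
        # Drop every distance that involves a or b, then add the new ones.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Toy sequences (hypothetical, not real genes).
toy = {"A": "AAAAAAAA", "B": "AAAAAAAT", "C": "TTTTAAAA", "D": "TTTTAACC"}
print(upgma(toy))
```

UPGMA assumes that all lineages evolve at the same rate, which real organisms often violate, so it is best read here as a first intuition for how distances become a tree.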

Final Weekend

Students will present their work to their parents/guardians, who are enthusiastically invited to attend our end-of-program celebration of student work.