A 21st-century view of Marine Biology

So you wanna be a marine biologist in the 21st century? Better crack open that MacBook and start writing Perl scripts.

As part of our NSF RAPID grant studying the impact of the Deepwater Horizon spill, our group is busy organizing an outreach workshop for undergrads titled the “Bioinformatics of Biodiversity”. We’ll be giving a small group of students the lowdown on traditional taxonomy as well as high-throughput sequencing of sediment communities, with special emphasis on how the two depend on each other for inferring complex interactions within marine ecosystems. While preparing material for these workshop sessions, I’ve been thinking a lot about what it means to be a marine biologist in the 21st century. I’m really not joking about the scripting.

On any given day, I may extract nematodes from some deep-sea mud, PCR up some 18S genes, build a quick phylogeny, help someone prep environmental RNA for transcriptome analysis, write and run Perl scripts, fiddle around with Linux dependencies, and outline some complex data processing needs to my colleagues in the computer science department. Once in a while I go down to the beach to collect some fresh, writhing worms.

“You hear that Mr. Anderson? That is the sound of inevitability.”

I still consider myself a marine biologist at heart, but my goal as a postdoc is to be marketable. Academia is a crowded island, and if I want to survive I need to adapt and find my niche.

Now, we are truly living a data-driven life. One proposal I read succinctly noted the “twin revolutions in information/computing and in the biological sciences”. Computers are getting faster and DNA sequencing is becoming ever more high-throughput (with the pace of the latter far outstripping the former). With the plummeting cost of high-throughput sequencing technology and the impacts of climate change already manifesting, reverse taxonomy is becoming the only cost-effective option for describing the biodiversity on Earth: sequence environmental DNA first, then search out biological patterns, and eventually stick a formal species name on the taxa with the most interesting ecology. Taxonomy is always going to be important, but the problem is that we can never do it fast enough. Scientists who study microbes are already taking an alternative approach, recognizing that the functional role of organisms in an environment (e.g., expressed genes) can be more important than who is actually there.

In fact, given the vast number of uncultivated microbes, it may be that a DNA-centric approach, in which genes are linked to habitats (locations), is more useful than the species-centric view [Field et al. (2010) Nat Biotech 26(5):541].
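
For the programming-curious, here is roughly what the “sequence first, name later” idea looks like in miniature. This is a toy Python sketch, not our actual pipeline: the reads, the site names, and the `cluster_reads` helper are all made up for illustration, and it lumps only identical reads together, whereas real workflows use dedicated clustering tools and a similarity threshold. The placeholder labels follow the usual jargon (OTU, for operational taxonomic unit).

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (site, read) pairs. Real 18S reads are far longer.
reads = [
    ("beach", "ACGTACGT"),
    ("beach", "ACGTACGT"),
    ("beach", "TTGACCAA"),
    ("deep_sea", "TTGACCAA"),
    ("deep_sea", "GGCATGCA"),
    ("deep_sea", "GGCATGCA"),
    ("deep_sea", "GGCATGCA"),
]


def cluster_reads(reads):
    """Group identical reads under placeholder labels and count them per site."""
    labels = {}                     # sequence -> placeholder label
    counts = defaultdict(Counter)   # label -> Counter mapping site -> read count
    for site, seq in reads:
        if seq not in labels:
            # No species name yet, just a numbered placeholder.
            labels[seq] = f"OTU_{len(labels) + 1}"
        counts[labels[seq]][site] += 1
    return labels, counts


labels, counts = cluster_reads(reads)
for otu, per_site in counts.items():
    print(otu, dict(per_site))
```

Only once a pattern shows up in a table like this (say, a placeholder taxon that appears at just one site) would someone go hunting for the actual worm and give it a formal name.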

I’m currently in beautiful San Diego to participate in the Biodiversity Working Group of the Genomic Standards Consortium. We’ll be discussing the current challenges facing high-throughput biodiversity research: how to anticipate and plan for future research needs, particularly the need for diverse fields to unite and share computational resources and workflows. The data problem is too big (and cyberinfrastructure is too costly) for disparate groups to tackle alone, so today’s scientific community is poised to become more integrative than ever; common computational hurdles mean that disciplines must unite to overcome them. Scientists need robust data storage facilities, but they also need to access and analyze large DNA datasets (and their associated metadata) in order to tease out patterns across biological communities.

Young biologists who can grasp this overarching zeitgeist and gain a broad scientific background, both computational and ecological, will be well poised for future success. You don’t necessarily need to walk the walk, but you definitely need to talk the talk. And maybe stumble around in the dark (for your own skill development?) if anyone asks you to actually try to walk the walk. I can talk about relational databases to computer scientists, but I could never actually sit down and construct one. Well, I could, but it would involve much swearing and lots of caffeine.

I am a computational biologist at the University of California, Davis. My research uses DNA sequencing and genomics to study microbial eukaryotes (yeah, nematodes!) in marine ecosystems, with an emphasis on evolution and biodiversity in the deep sea. I can neither confirm nor deny that I like Unix more than I like going to sea.


I can’t second this enough…I literally can’t imagine how I’d get my work done if I hadn’t learned a bit of programming. The only thing I’d question for a beginner (not to start a my-language-is-better flame war or anything…) is the Perl part. If you’re just starting out and looking for a general-purpose scripting language, I’d say go for Python. One of my actual hacker friends told me this when I was getting started, and I haven’t looked back. The problem with Perl is that it can be a “write only” language…as in, you can’t understand your code once you’ve written it. For instance, this is a legal program in Perl.