Faculty / Leadership

David Haussler develops new statistical and algorithmic methods to explore the molecular function, evolution, and disease process in the human genome, integrating comparative and high-throughput genomics data to study gene structure, function, and regulation. As a collaborator on the international Human Genome Project, his team posted the first publicly available computational assembly of the human genome sequence. His team subsequently developed the UCSC Genome Browser, a web-based tool that is used extensively in biomedical research. He co-founded the Genome 10K project so science can learn from other vertebrate genomes, co-founded the Treehouse Childhood Cancer Project to enable international comparison of childhood cancer genomes, and is a co-founder of the Global Alliance for Genomics and Health (GA4GH), a coalition of the top research, health care, and disease advocacy organizations. He is a member of the National Academy of Sciences and the American Academy of Arts & Sciences.

I am a principal investigator, director of the Computational Genomics Lab and an associate research scientist within the UC Santa Cruz Genomics Institute. I have a PhD from Cambridge University and the European Molecular Biology Laboratory. I am trying to contribute to answering some big questions, e.g., how did we become human, how do we relate genetically to each other and to other species, and how do genomes relate to states of health and disease? To attack these questions I love algorithms, computers and coffee.

Scientists

Jingchun Zhu, Associate Research Scientist

I earned my Ph.D. in biological and medical informatics from UC San Francisco, and subsequently held a postdoctoral position in the Haussler Lab at UCSC, where I led the development of the UCSC Cancer Genomics Browser. I’m a project lead for UCSC’s Xena visual analysis tool. My interests include improving presentation of genomics data to increase clinician and researcher understanding of disease, and creating modular and sharable web-based data visualizations and data hosting. I’m currently involved in UCSC Xena (xena.ucsc.edu), the UCSC Cancer Genomics Browser, the Cancer Genome Atlas (TCGA), the International Cancer Genome Consortium (ICGC), Big Data to Knowledge (BD2K), and the Treehouse Childhood Cancer Initiative.

I earned my Ph.D. in computer science with a concentration in bioinformatics from UC Santa Cruz, and previously held research positions in interaction networks at the Pasteur Institute, and alternative splicing at UCSC. I’m interested in transcriptional and post-transcriptional regulation of gene and isoform expression, pediatric cancer genomics, and genetic variation. I’m currently working on the BRCA Challenge, Athena, Treehouse / CKCC, and Toil/BD2K Genomics Core. I enjoy free time outdoors climbing mountains, backcountry skiing and mountain biking. I’m a member of the Tahoe Backcountry Ski Patrol, and Treasurer of Trips for Kids Santa Cruz.

Maximilian Haeussler, Assistant Research Scientist

I work on text mining and the UCSC Genome Browser. I’ve held research positions in text mining, sequence mining and ChIP-seq pipelines at Manchester University, UK, and in zinc-fingers and repeats, genome annotation of sequence variants, and text mining at UC Santa Cruz. I hold an MS in computer science and business from the Universitaet Potsdam in Berlin, Germany, and a Ph.D. in developmental biology from CNRS Gif-sur-Yvette, one of the Paris locations of the French National Center for Scientific Research, where my dissertation research concerned Ciona intestinalis and cis-regulatory genome annotation and enhancer prediction. I’m interested in genome annotation and currently working on single-cell sequencing analysis pipelines for CIRM and on various tracks of the UCSC Genome Browser including text, patents, and cancer. I enjoy all kinds of machines, and especially the programmable ones.

Software Engineers

Chloe Dlott, Junior Software Engineer, BRCA

I graduated from MIT with a B.S. in Biology in June 2016. I’m principally interested in how bioinformatics and big data can directly improve outcomes for patients and how these fields will influence clinical medicine in the years to come. Currently, I’m working with the BRCA Challenge team to develop methods that integrate natural language processing, optical character recognition, and handwriting recognition to parse data from identifiable patient records and extract de-identified summary information on individual and family cancer history. Outside of research, I enjoy playing tennis, baking, reading, and exploring the natural beauty of Santa Cruz.

Edwin Jacox, Senior Software Engineer

After having worked as a software developer for over a decade and then earning a Ph.D. (databases), I started applying my skills to biological datasets and soon found myself doing research into transcriptional regulation and then species/gene tree reconciliations. Currently, I am applying my diverse skill set to developing standards and best practices for working with genomic data as part of the GA4GH consortium, and to developing a workflow engine (Toil) that runs in the cloud and on a variety of batch systems, with a focus on bioinformatics pipelines.

I spent 20 years working as an engineering manager in the file server storage industry, starting with CD-ROM, moving to network attached storage and finally to clustered file servers and distributed computing before coming to UCSC to work in genomics. I have a patent on network security abstraction and another on distributed server synchronization (in process). My current work is with the GA4GH team to produce industry standard genomics APIs. I have bachelor's degrees in Computer Science from UCSC and in Natural Resources from Humboldt State, as well as an MFA in Photography from Brooks Institute. In my spare time I teach photography workshops and work on my own fine art business. I am also a longtime fencing club operator in Santa Cruz who is certified as a Prevot level instructor.

Hannes holds a Dipl.Ing. (M.Eng) in computer science from Technische Universität Berlin. During his previous stint at UCSC he contributed to CGHub, the TCGA data repository, and CGL projects like Toil, CGCloud, Apache Spark, S3AM and Conductor. His interests are in cloud computing, big data, systems programming, and programming language design. He re-joined the Genomics Institute in November, 2017 as a Senior Software Engineer working under Kevin Osborn as part of the Computational Genomics Platform (CGP), directed by Brian O’Connor.

I currently work on the development of Toil, UCSC’s workflow execution engine for large biological compute jobs. I have a B.S. in Bioengineering from UC Santa Cruz, and have a strong interest in pursuing bioinformatic approaches to research questions in biology using big data.

I am a software engineer at the Genomics Institute, working on infrastructure for graph-based genomics. I hold a B.S. in Computer Science and Biology from Harvey Mudd College, and a Ph.D. in Biomolecular Engineering and Bioinformatics from UCSC. The software I work on enables bioinformaticists to analyze people’s genomes in the context of everything we know about genomic variation in the human population, with the goal of reducing reference bias and improving the handling of structural variation. When not revolutionizing genomics, I can be found indulging my passion for distributed systems.

Walt Shands, Software Engineer, Computational Genomics Platform (CGP)

Walt develops software for the Computational Genomics Platform (CGP), where his focus is on integrating bioinformatics pipelines such as RNA-Seq, ProTECT, and BCBio into the software infrastructure, running those pipelines on genomic data, and developing new features. In addition, he contributes to HMMER, a homology search application, most recently implementing translated search. Walt has a wealth of industry experience which includes significant contributions at Intel, Fujitsu and AT&T Bell Laboratories. Walt has both an M.S. and a B.S. from the School of Engineering at Columbia University in New York, and a black belt in Aikido.

My background is in software development and managing development teams in the computer storage industry. I have helped develop NAS (network attached storage) products such as SnapServer, which delivers easy-to-use network storage for SMBs (small and medium businesses) as well as clustered Enterprise Storage systems (BlueArc Titan and Overland Storage SnapScale) offering such features as High Availability, Clustered File System, Snapshots and Data Replication. I joined the Genomics Institute to help make a positive contribution to the development of treatments for genetically related diseases.

I hold an M.S. in neuroscience from Brandeis University, and a B.S. in engineering and applied science from The California Institute of Technology. I’m interested in databases, data visualization, and functional programming. I’m currently working on UCSC Xena (xena.ucsc.edu) and the UCSC Cancer Genomics Browser.

I hold a B.S. in Biology from UCSC and have been working in genomics for six years, both for the UCSC Genome Browser and the UCSC cancer research group. I design and prototype software tools for genomics research in addition to managing communication with the scientific community, including conducting instructional workshops. I’m currently working on the BRCA Challenge, whose goal is to responsibly share breast cancer data on BRCA1/2 variants, and the UCSC Xena platform, a functional genomics browser that allows researchers to view their private data with data from large consortiums, like TCGA, in a secure, federated manner. In my free time I enjoy contra and zydeco dancing as well as hiking to alpine lakes and peaks.

Brian O’Connor, Director of the Computational Genomics Platform

Dr. Brian O’Connor is the Director of the UCSC Genomics Institute Computational Genomics Platform, a sister project to the Computational Genomics Lab. There he focuses on the development and deployment of large-scale, cloud-based systems for analyzing genomics data. This includes the Toil workflow execution platform, which is designed to run genomic pipelines on a wide range of cloud environments including AWS, Azure, Google and OpenStack, and ADAM, a distributed genomics platform developed in collaboration with UC Berkeley. He is also the co-chair of the Containers and Workflows task team of the Global Alliance for Genomics and Health (GA4GH) where he works on tool and workflow container standards. Brian joined UCSC from the Ontario Institute for Cancer Research (OICR) where his previous projects included leading the technical implementation of cloud-based analysis systems for the PanCancer Analysis of Whole Genomes (PCAWG, https://dcc.icgc.org/pcawg) effort, acting as the project manager for the International Cancer Genome Consortium’s Data Portal (https://dcc.icgc.org), and creating the Dockstore project (https://dockstore.org), a platform for tool and workflow sharing.

As a software developer with the Genomics Institute I have worked on enabling software systems for genomic analysis. My background is in software design for the web, visual analytics, and systems integration. I hold a BA in Philosophy from UCSC and am interested in how technology can be used to reduce human suffering.

Postdoctoral Scholars

I hold an M.S. in bioinformatics from UCSC as well as an M.S. in biotechnology from SUNY Buffalo, and a B.Eng. in biotechnology from the University of Rajasthan, in India. I became interested in health applications of DNA sequencing and computational biology while working as a researcher at a nanotechnology startup. I work on developing nanopore technology for DNA sequencing (long reads, base modifications) and RNA and protein sequencing. Computationally, I work on developing algorithms and software for mapping and alignment, variant calling, and cloud computing. I absolutely love technology and am the lead singer in a contemporary Eastern music band.

Karen Miga, Postdoctoral Scholar

My research focuses on one of the most challenging and unexplored regions of the human genome, the millions of bases that span repeat-rich centromeres and heterochromatic DNA. During both my doctoral work with Hunt Willard at Duke University and my postdoctoral position at UC Santa Cruz, I have designed and implemented computational approaches to model and annotate repeat-rich satellite DNAs that define each centromere gap region. In addition to understanding the sequence content and organization within centromeric/pericentromeric regions, I am leading efforts to better understand how these sequences vary in the human population and how they contribute to cellular function.

Graduate Student Researchers

Joel Armstrong, Graduate Student Researcher

I’m a Ph.D. student in bioinformatics at UCSC, where I also earned my B.S. in biochemistry and molecular biology. I’m interested in whole-genome alignment, genome rearrangements, comparative genomics, phylogenetics, and machine learning. I’m currently involved in the Cactus and Mouse Genomes projects.

Alden Deran, Graduate Student Researcher

I’ll be starting the PhD program in bioinformatics here in Fall 2016. I’m particularly interested in genome alignment and assembly, graph algorithms, and machine learning, and I’ve been contributing to Cactus, a whole-genome alignment tool. Before getting into bioinformatics I received my undergraduate degree in Physics at UCSC and did a senior thesis on Monte-Carlo simulations of W-boson scattering.

Marina Haukness, Graduate Student

I’m Marina, a second year graduate student studying bioinformatics, and I’m interested in developing algorithms to analyze long-read sequencing data. I came to UCSC after getting my B.S., a joint degree in computer science and mathematics, from Harvey Mudd College. When I’m not at the computer, I like to run, paint, and take care of my reptiles.

Kishwar Shafin, Graduate Student

Hello, I am Kishwar. I am a graduate student focusing on machine learning to tackle problems in the bioinformatics arena. I think deep learning is going to be a key player in this field, and my focus is on evaluating and executing efficient learning models for different biological questions, mostly related to sequencing. I love superhero comic books and movies, and my hobby is collecting superhero t-shirts. I also love robotics. I love to compete too: I was a competitive programmer and love learning new algorithms.

John Vivian, Graduate Student Researcher

I’m a Ph.D. student in bioinformatics at UCSC, where I also earned an M.S. My research areas encompass statistical inference and machine learning. More generally, I’m interested in bioinformatics, statistical modeling, data visualization, cloud computing, big data, and machine learning. I’m currently working on Toil, pipeline development, and RNA-seq analysis. Away from the lab, I enjoy playing piano and drums, composing music, rock climbing, and reading.

Andrew Bailey, Graduate Student

I am a second-year Ph.D. student interested in nanopore sequencing. By utilizing deep learning algorithms to analyze nanopore sequencing data, we have the opportunity to study the fundamental building blocks of life at a level which has been elusive with industry standard sequencing techniques. I went to UCLA for my undergraduate degree and enjoy playing rugby. Go Bruins. Go Slugs.

Jordan Eizenga, Graduate Student Researcher

I am a PhD student in the Computational Biology and Bioinformatics program. I completed my undergraduate degree in mathematics and political science at the University of Michigan in 2011. I later became interested in biology and computer science and spent two years studying them in open online courses before starting at UCSC in 2015. My current research interests include nanopore sequencer informatics and using graph structures to represent population genomic variation.

Charles Markello, Graduate Student Researcher

I’m a Ph.D. student in biomolecular engineering and bioinformatics at UCSC. I hold a B.S. in computer science with a minor in biology from the University of Oregon, and I’ve worked as a special researcher in the Undiagnosed Diseases Program at the NHGRI, and a bioinformatics software developer at Maverix Biomics. I’m interested in bioinformatics, machine learning, data visualization, computer graphics, cloud computing, genetics and comparative genomics.

Currently, I’m working on the BRCA Exchange website for storing and visualizing BRCA genetic data, and on diploid graph alignment for improving next-gen sequencing through the use of pedigree information. Away from UCSC, I enjoy rock climbing, skim-boarding, skiing, swing-dancing, and photography.

Arjun Arkal Rao, Graduate Student Researcher

I’m a Ph.D. student in biomolecular engineering at UCSC, where I study the immune response to cancers and am working on a pipeline to predict potential tumor-specific T-cell receptor (TCR) epitopes as targets for immunotherapy. I hold a B.E. in biotechnology from the PES Institute of Technology in Bangalore, India, and prior to attending UCSC, I was a junior research scholar at the Indian Institute of Science studying fusion genes in glioblastomas. I’m interested in cancer immunotherapy, machine learning, big data analysis, genomics, and cloud computing. Currently, I’m lead developer on ProTECT, a pipeline to predict potentially therapeutic neo-antigens in cancers. I’m also a developer on the CGL Toil, cgcloud, and s3am projects. My immunology work is used as a part of the Treehouse Childhood Cancer Project and the California Kids Cancer Comparison (CKCC), and I’m an active member of the Cancer Genome Atlas (TCGA) immune responses working group. In my free time, I enjoy travelling, camping, and experimenting with new recipes.

Colleen Bosworth, Graduate Student

I am a second year PhD student in the Bioinformatics Department at UCSC. I am currently working on segmental duplications and how they contribute to neurological phenotypes and brain growth since the evolutionary divergence from the great apes. I am especially interested in the applications of genomics to study disease. I like to read, bake, do yoga, and go on hikes in my free time. I have a dual BS in Statistics and Systems Biology from Case Western Reserve University in Cleveland, Ohio.

Ian Fiddes, Graduate Student Researcher

I’m a Ph.D. student in bioinformatics at UCSC, where I also earned my B.S. in molecular, cell and developmental biology. I’m interested in molecular evolution, comparative genomics, and disease-related segmental duplication, and I’m currently working on annotating and analyzing assemblies of 17 mouse laboratory strains and the genomic structure of the 1q21.1 locus in humans. Outside of research I enjoy cooking, skiing, and road biking.

Rojin Safavi, Graduate Student

Hello, my name is Rojin. I am a second-year Ph.D. student. My focus is on nanopore sequencing, and how to utilize machine learning methods such as deep learning to analyze nanopore data. I received my bachelor's degree in Molecular and Cell Biology from Berkeley in 2016. In my free time, I enjoy hiking, going to the beach, hanging out with friends, and reading books!

Yohei Rosen, Visiting Graduate Student

I’m a rising fourth year MD candidate at NYU who is at UCSC on a Howard Hughes Medical Institute fellowship. I completed my BA in theoretical mathematics at UC Berkeley in 2012 focusing on algebra and topology and my MASt in mathematical statistics at Cambridge in 2013. At Berkeley, I also studied chemical biology which helped develop my interests in medicine and research. I work within the Human Genome Variation Map project; my current research interests involve developing mathematical structures to allow reasoning with respect to biological and medical problems. Outside of research, I enjoy drawing, cooking and canoeing.

I am a bioinformatics student expected to graduate in June 2019. I am part of the Computational Genomics Platform currently working on a Blue Box viewer for HCA (Human Cell Atlas). Previously, I worked on the CGP’s website (ucsc-cgp.org) and the File Browser. In my spare time, I enjoy reading fantasy books and doing jigsaw puzzles.

Tyler Myers, Undergraduate Student Researcher

Ethan Seither, Undergraduate Student Researcher

Alden Douglas, Undergraduate Student Researcher

Mason Hargrave, Undergraduate Student Researcher

Natan Lao, Undergraduate Student Researcher

I’m an undergraduate studying computer engineering with a concentration in networking. I work on Toil and sometimes on other things that aren’t as cool as Toil.

Alumni

Christopher Ketchum, Previously Software Engineer

I hold a B.A. in Computer Science from UCSC. I primarily work on Toil and related projects for the Analysis Core, continuing the work I did as an undergraduate research assistant for the Genomics Institute. My interests include functional programming, cloud computing, and big data.

Jake Narkizian, Undergraduate Student Researcher

I’m an undergraduate student in the UCSC Jack Baskin School of Engineering, where I expect to earn my B.S. in computer science in 2017. I’m interested in big data analysis, genomics, and cloud computing. I’m currently working on Toil.

Audrey Musselman-Brown, Graduate Student Researcher

I am a graduate student researcher in the Biomolecular Engineering department at UCSC with a background in genomics. I hold a BS in Computational and Mathematical Biology from Harvey Mudd College. My research interests include bioinformatics pipelines, cloud computing, machine learning, data sharing, and somatic cancer variants.

I earned a B.A. in economics at UC Santa Cruz before working as an organic chemist engineering cyclic peptides to pass through membranes. I discovered my love of computational biology in graduate school, and I’m currently a Ph.D. student in biomedical sciences and engineering at UC Santa Cruz. My interests include DNA sequencing technology, bioinformatics and machine learning, and single-molecule biophysics. I’m currently working on translating ionic current signal into nucleotide bases in systems that reread DNA, algorithms to detect DNA modifications in nanopore sequencing data, and variant calling. Outside of research I’m very passionate about bike racing.