Species diversity is an important measurement of ecological communities. Scientists believe that there is a strong relationship between species diversity and ecosystem processes. However, efforts to investigate microbial diversity using whole-genome shotgun read data are still scarce. Using novel applications of data structures and newly developed algorithms, we first developed an efficient k-mer counting approach and methods that enable scalable streaming analysis of large and error-prone short-read shotgun data sets. Building on these efforts, we then developed a statistical framework allowing for scalable diversity analysis of large, complex metagenomes without the need for assembly or reference sequences. This method is evaluated on multiple large metagenomes from different environments, such as seawater, the human microbiome, and soil. Given the rapid growth of sequencing data, this method is promising for analyzing highly diverse samples with relatively low computational requirements. Further, as the method does not depend on reference genomes, it also provides opportunities to tackle the large amounts of unknowns we find in metagenomic datasets.
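To make the idea of k-mer counting concrete, the following is a minimal sketch in Python. It is not the approach developed in the dissertation, whose details the abstract does not give: it uses a plain exact counter held in memory, and the function names, the toy reads, and the choice to skip windows containing ambiguous bases are all assumptions made for illustration.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def count_kmers(reads, k):
    """Count every length-k substring across the reads.

    Windows containing a character other than A, C, G or T are skipped
    (an illustrative choice, not taken from the abstract).
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= VALID_BASES:
                counts[kmer] += 1
    return counts


def abundance_histogram(counts):
    """Map each abundance value to the number of distinct k-mers with that count."""
    return Counter(counts.values())


# Toy reads, invented for this example only.
reads = ["ACGTAC", "CGTACG", "ACGNAC"]
counts = count_kmers(reads, k=3)
hist = abundance_histogram(counts)
```

An exact counter like this grows with the number of distinct k-mers, which is the memory cost that motivates more compact counting structures for large data sets.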

Serpentinization is the hydrous alteration of mafic rocks to form serpentine minerals and magnetite. The reactions of this alteration result in elevated pH of the surrounding fluids, abiotic generation of H2, CH4 (and other organic molecules), and depletion of dissolved inorganic carbon. Thus, serpentinization has implications for the origin of life on Earth and possibly Mars and other planetary bodies with water. The microbial diversity of continental serpentinite systems consistently shows communities that are dominated by two major taxa: microaerophilic Betaproteobacteria and anaerobic Clostridia. Previous studies relied on a few samples collected from natural springs or seeps, meaning that the flow path of fluids from the subsurface process of serpentinization was unknown. The Coast Range Ophiolite Microbial Observatory (CROMO), a set of wells drilled into the actively serpentinizing subsurface environment in northern California, was established to gain a better understanding of the habitability and microbial functions within the serpentinite subsurface environment. This dissertation represents a culmination of microbiological investigations into the serpentinite subsurface environment at CROMO to identify the microbial inhabitants of subsurface fluids, rocks, and in situ colonization experiments using molecular methods and high-throughput sequencing.
The CROMO wells represent a broad range of geochemical gradients, and pH and the concentrations of carbon monoxide and methane have the strongest correlation with microbial community composition. The highest pH wells were inhabited exclusively by a single operational taxonomic unit (OTU) of Betaproteobacteria and a few OTUs of Clostridia, while more moderate pH wells exhibited greater diversity. Genes involved in the metabolism of hydrogen, carbon monoxide, and carbon fixation were abundant in the extreme pH fluids, while genes for metabolizing methane were found exclusively in the moderate pH wells. The subsurface environment is an amalgamation of fluids and rocks, and as such, studying fluids alone gives only half the story. CROMO represents the first drill campaign into the continental serpentinite environment, and the microbial diversity of serpentinite cores to a depth of 45 meters below surface suggests that specific geological features harbor different microbial communities. Archaea, previously undetected at CROMO, dominated cores containing magnetite-bearing serpentine, while bacteria were more abundant in layers containing clay particles. Additionally, organisms involved in the cycling of nitrogen and methane were found associated with core materials, indicating that core-associated communities may have strong biogeochemical roles within the serpentinite subsurface environment. Given that microbial communities appear to vary with geological composition and that serpentinite fluids are a challenging habitat for life, depleted in inorganic carbon and electron acceptors, microbe-mineral interactions within the serpentinite subsurface environment were investigated using in situ colonization devices to see whether communities were able to utilize inorganic carbon from calcite or ferric iron from magnetite as a terminal electron acceptor.
In the highest pH well, calcite led to an increased abundance of Clostridia and Deinococcus, while magnetite led to an increase in diversity, including Alphaproteobacteria, Gammaproteobacteria, and Actinobacteria, further suggesting that the mineralogical composition of solids within the subsurface affects community composition. The data discussed here further our understanding of life associated with serpentinite fluids and minerals within the subsurface environment.

Evidentiary soil in an investigation can link an individual with the scene of a crime since the diversity and geospatial distribution of soils can make it highly probative. Recently, advanced techniques have been developed that allow a deeper investigation into bacterial communities and produce considerably more data than previous methods. This research used next-generation sequencing and statistical analyses to identify factors influencing soil bacterial communities and assess the feasibility of their forensic use. Soil samples were collected from a variety of habitats over different distances, depths, and times; DNA was extracted, the 16S rRNA gene was amplified, and the DNA was sequenced on a Roche 454 platform. Five statistical procedures (nonmetric multidimensional scaling, hierarchical cluster analysis, integral library shuffle, unique fraction method, and k-Nearest Neighbor) were used to compare differences or changes in bacterial communities. Multiple similar and diverse habitats were differentiated with both multivariate statistics and pairwise comparisons. Additionally, changes in communities were indicated over time, horizontal space, and depth. The multivariate statistics generally suggested similar relationships, though these were not always consistent with the pairwise comparisons; the pairwise methods gave analogous results, though the unique fraction method always found fewer differences.
k-Nearest Neighbor could be forensically useful based on its accuracy in correctly classifying 'unknown' samples from a non-ideal training set. This research elucidates the potential of next-generation sequencing for soil investigation, how samples should be collected, and what statistics would be useful to analyze the data.
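As a rough illustration of how k-Nearest Neighbor might assign an 'unknown' soil sample to a habitat, here is a minimal Python sketch. The abstract does not say which distance measure or software was used, so the Bray-Curtis dissimilarity, the function names, the habitat labels, and the taxon counts below are all assumptions invented for this example; they are not data or methods from the study.

```python
from collections import Counter


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two taxon-count dicts.

    Returns 0.0 for identical profiles and 1.0 for profiles sharing no taxa.
    """
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 1.0 - 2.0 * shared / total if total else 0.0


def knn_classify(unknown, training, k=3):
    """Label a profile by majority vote among its k nearest training profiles."""
    nearest = sorted(training, key=lambda item: bray_curtis(unknown, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (taxon counts, habitat label). Values are made up.
training = [
    ({"Acidobacteria": 50, "Proteobacteria": 30, "Actinobacteria": 20}, "woodlot"),
    ({"Acidobacteria": 45, "Proteobacteria": 35, "Actinobacteria": 20}, "woodlot"),
    ({"Acidobacteria": 10, "Proteobacteria": 60, "Firmicutes": 30}, "field"),
    ({"Acidobacteria": 15, "Proteobacteria": 55, "Firmicutes": 30}, "field"),
]
unknown = {"Acidobacteria": 48, "Proteobacteria": 32, "Actinobacteria": 20}

predicted = knn_classify(unknown, training, k=3)
```

With these toy profiles the two nearest neighbors are both "woodlot" samples, so the vote with k=3 favors that label. Real use would need many more training samples per habitat and a check of how accuracy changes with k.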