“Life is much more like a symphony orchestra than a piccolo player,” J. Craig Venter says about his institute's new synthetic bacterium. It carries a mere 473 genes, a smaller genome than any autonomously replicating cell ever found in nature. That Venter and his collaborators synthesized a new bacterium and brought it to life in a bacterial corpse is not the biggest part of this story. Perhaps more importantly, their synthetic mini microbe, designated JCVI-syn3.0, contains many quasi-essential genes—genes not absolutely necessary for viability but critical for robust growth. This is not the one-gene, one-trait phenomenon that cell reductionists were wishing for, he asserts. Details appeared 25 March 2016 in Science (doi:10.1126/science.aad6253).

For microbiologists, the sequencing of the genome of the bacterium Haemophilus influenzae in 1995 was a pivotal event, one that transformed the microbiological research enterprise. More than 20 years later, with the genomes of some 85,000 organisms sequenced, including about 70,000 bacterial species, whole genome sequence (WGS) information is being used to design, conduct, and analyze vast numbers of experiments. There is no going back.

In this chapter, the author shows how bioinformatic tools have contributed, and continue to contribute, to studies of c-di-GMP-mediated signaling pathways. A brief discussion of the roles of the GGDEF, EAL, and HD-GYP domains in c-di-GMP turnover and the role of the PilZ domain as a c-di-GMP adaptor protein is followed by an analysis of the phylogenetic distribution of these domains and a listing of the most common domain architectures that involve these four domains. The identification by Benziman and his colleagues of the GGDEF domain as a component of diguanylate cyclases (DGCs) and c-di-GMP-specific phosphodiesterases, both of which participate in c-di-GMP turnover, was a watershed event in at least three important respects. First, this work provided the first evidence of an enzymatic function for this widespread protein domain and paved the way to the experimental demonstration that the GGDEF domain alone was responsible for the DGC activity. Second, linking this widespread domain with c-di-GMP turnover provided evidence for the participation of c-di-GMP in a variety of signaling processes. Third, the presence of the GGDEF domain in DGCs and in c-di-GMP-specific phosphodiesterases, two kinds of enzymes with opposing activities, suggested that this domain had allosteric functions that regulate c-di-GMP turnover. Experimental characterization of the most widespread combinations of c-di-GMP-related domains, including those described above, remains a promising avenue of research that can be expected to provide much-needed insights into the functioning of this fascinating signaling system and its role in bacterial adaptation mechanisms.

Bacterial RNA polymerase provides the central model for the transcription elongation complex and its various interesting fates: backtracking and correction by Gre protein-mediated transcript cleavage, transcription termination, and the antitermination controls that were discovered in bacteria. RNA polymerase and its transcription factors have functions beyond their obvious activity to provide RNA molecules to the cell, reflecting the fact that RNA polymerase and the process of transcription must have evolved as DNA arose from the primal RNA world; neither is worth much without the other. There is evidence or informed speculation implicating RNA polymerase and transcription proteins in processes of replication, DNA repair, and cell division. Thus, transcription by RNA polymerase activates the origins of replication of Escherichia coli and phage λ in some structural way independent of the RNA product. Just as transcription and replication coevolved, so did the coordination of chromosome segregation and cell division arise in the context of both. DNA is transcribed as it moves about the cell in an organized fashion during replication. RNA is translated at the same time, causing an added complication when emerging membrane proteins are inserted into the membrane and provide points of fixation for the complex.

For a bacterium, short-term survival depends on the high fidelity of the replication machinery to prevent genomic alteration, but in the long term, evolution needs to occur. There are two basic themes, one providing for genome modification and the other providing for an increase in genomic content. The processes considered in this chapter are those of replication, amplification, and deletion. In fact, if gene acquisition through transfer processes is as high as is considered likely, then processes driving deletion reactions will be strategically important to conserving genome size. DNA sequences are subject to change for better or for worse, so selection would appear to favor individuals with a mutation rate approaching zero and an accurate replication machinery. A mutation is more likely to adversely affect the activity of a gene product than to improve it. As mutations occur, there is no mechanism for identifying them or replacing them by recombination. Clonal species are strongly driven by competition because one clone is "much like" another. This allows them to acquire by recombination large amounts of genetic sequence. Homology-independent (site-specific) pathways of DNA incorporation are likely to lead to increases in genome complexity. But nonhomologous (site-specific) recombination is also important in shaping bacterial genomes. Evidence suggests that gene elongation and duplication were the route taken by ancestral life forms to enlarge their genomes and increase their biochemical capacity. Genomic analysis of the three cell domains, Archaea, Bacteria, and Eukarya, is allowing an analysis of their relationships.

The availability of the complete genome sequence of the free-living bacterium Haemophilus influenzae in 1995 opened the field of microbial genomics. Although a variety of sequencing technologies have been used for genome sequencing projects, the random shotgun sequencing strategy has proved to be the most successful and efficient and has become the preferred method for whole-genome sequencing. Following the closure and assembly of the DNA replicons, a complete analysis of the genome requires the identification of all DNA-encoded open reading frames (ORFs), or candidate genes, in the DNA sequence and the assignment of gene names and associated functions to these ORFs. Once the annotation phase of the sequencing project is complete, the sequence allows for detailed comparative and functional genomics. Available tools of functional genomics include expression profiling, identification and analysis of protein-protein interactions, deletion phenotype analysis, and proteomics. In addition to the listed features that can be identified through bioinformatics analysis, insights can also be gained into previously unidentified biochemical pathways and transporter systems. The magnitude of the possible diversity that exists is evident when we consider that more than 99% of microbial species remain to be identified. Further details on the methodology of comparative genome hybridization (CGH) are discussed. In the postgenomic era, the discipline of functional genomics is facing the challenge of associating function with the thousands of genes of unknown function that remain at the end of each genome project.
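
The ORF identification step mentioned above can be illustrated with a minimal sketch. It assumes the simplest possible definition of an ORF (an ATG followed by the first in-frame stop codon, forward strand only, standard stop codons); the function name `find_orfs` and the `min_codons` parameter are hypothetical and chosen for illustration. Annotation pipelines in real genome projects use far more elaborate gene-finding models.

```python
# Illustrative ORF scan: forward strand only, ATG start, standard stop codons.
# This is a teaching sketch, not the gene finder used in any genome project.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) index pairs of ORFs in the three forward frames.

    Each ORF runs from an ATG to the first in-frame stop codon, inclusive.
    min_codons counts coding codons (including the start, excluding the stop).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and look for the next start
    return sorted(orfs)


if __name__ == "__main__":
    demo = "CCATGAAATTTGGGTAACC"  # made-up sequence for demonstration
    for begin, end in find_orfs(demo):
        print(begin, end, demo[begin:end])
```

Each candidate returned by such a scan would still need functional assignment, which is the harder part of annotation described in the abstract.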

The expressed sequence tags (ESTs) method identified more than thirty new G protein-coupled receptors; however, these sequences were also of great interest to Human Genome Sciences, Inc. (HGS) and its partner, SmithKline Beecham (SB). Haemophilus influenzae is of historical significance in science as well, being the source from which Ham Smith first isolated restriction endonucleases, which led to his Nobel Prize in Medicine in 1978. No complete genome sequence for a free-living organism had ever been deciphered, so we all realized that this would be a landmark achievement if it could be accomplished. Since TIGR reported the first complete microbial genome sequence in 1995, the sequences of more than twenty bacterial and archaeal species have been published, and at least sixty other genome projects are in progress in laboratories around the world. Most species cannot be cultured in the laboratory, yet these species likely play some of the most important roles in the ecology of our planet. Breakthroughs in genomics technology and bioinformatics will continue to allow us to push back the frontiers of whole genome analysis. But at the same time, new technologies for functional genomics present exciting opportunities to begin to study the dynamic nature of the microbial cell.

rRNAs contain a number of different modified nucleosides, mainly, but not exclusively, pseudouridine (ψ, 5-ribosyl-uracil) and nucleosides methylated on either the base or the 2'-hydroxyl of the ribose. This chapter focuses on the rRNAs of bacteria, archaea, and organelles, including the last because of their presumed evolutionary relationship to bacterial organisms. With a single exception in Bacillus subtilis, all of the available information on the rRNA ψ synthases comes from Escherichia coli. Deletion of the synthase gene and subsequent RNA analysis for ψ can also be misleading if another synthase shares the specificity for forming a particular ψ. The effects of the absence of RluC or RsuA were much less marked but still significant. The strong effect of RluA deletion might be due to the loss of ψ from both large-subunit (LSU) RNA and tRNA. It is also important to note that the family designations based on amino acid sequence homology do not necessarily define the specificities of the member synthases. In E. coli, the locations of the 10 methylated nucleosides in the small-subunit (SSU) RNA and the 14 methylated nucleosides (one is m3ψ) plus one dihydrouridine in the LSU RNA are known.

This chapter gives an overview of the strategies developed to sequence entire microbial genomes, and discusses the advantages and disadvantages of various approaches. For total-genome shotgun sequencing, the genomic DNA is fragmented into random pieces and subcloned directly into pUC, M13, or other vectors that accept insert sizes of 1 to 5 kbp. Typically, 6 to 10 genome equivalents are sequenced to cover the DNA molecule completely by using standard primers that prime at the end of the cloning vector. The primer-walking strategy has been tried primarily in the context of the yeast sequencing project. The method requires an ordered library of clones, either an overlapping set of large clones (e.g., a cosmid library) or an ordered set of discrete subclones (e.g., two 6-base cutter restriction digest libraries from a cosmid). Regardless of the sequencing strategy chosen in a particular project, the sequencing process has four general phases: the primary sequencing phase, the linking phase, the polishing phase, and the finished sequence. Only one genome project, the Escherichia coli effort at the University of Wisconsin, made substantial progress with radioactive sequencing before changing to automated-sequencing strategies. There are two different kinds of sequencing laboratories that produce genomic sequence: sequencing factories and smaller laboratories with an output of 2 to 5 Mbp of genomic sequence per year. With increasing levels of automation, the sequence production costs will be reduced, and in the future it may be possible to reach 10 cents per finished base pair.
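
Why 6 to 10 genome equivalents are typically sequenced in a shotgun project can be illustrated with a rough calculation. The sketch below assumes an idealized Poisson model in which reads land uniformly at random on the genome (often associated with Lander and Waterman); the chapter itself does not name this model. The function names, the 2 Mbp genome size, and the 500 bp read length are hypothetical values chosen for illustration only.

```python
import math


def expected_coverage(fold_coverage):
    """Expected fraction of the genome covered by at least one read.

    Assumes reads are placed uniformly at random (idealized Poisson model),
    so the uncovered fraction is exp(-fold_coverage).
    """
    return 1.0 - math.exp(-fold_coverage)


def reads_needed(genome_bp, read_bp, fold_coverage):
    """Number of reads of length read_bp needed to reach the given fold coverage."""
    return math.ceil(fold_coverage * genome_bp / read_bp)


if __name__ == "__main__":
    genome_bp = 2_000_000  # hypothetical genome size, for illustration
    read_bp = 500          # hypothetical average read length
    for fold in (1, 3, 6, 10):
        covered = expected_coverage(fold)
        n_reads = reads_needed(genome_bp, read_bp, fold)
        print(f"{fold:>2}x: {covered:.5f} covered, {n_reads} reads")
```

Under these assumptions, low coverage leaves a sizable fraction of the genome unsequenced, while 6-fold coverage or more leaves well under 1% uncovered. The remaining gaps are what the linking and polishing phases are meant to close.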