Outbreaks of infectious disease often result from exposure to a common source of the etiologic agent. Generally, the etiologic agent causing an outbreak of infection is derived from a single cell whose progeny are genetically identical or closely related to the source organism. In epidemiological terms, the organisms involved in the outbreak are clonally related; that is, they have a common origin. Clonally related organisms are members of the same species that share virulence factors, biochemical traits, and genomic characteristics. However, there is sufficient diversity at the species level that organisms isolated from different sources at different times and in different geographical regions may be differentiated or classified into subtypes or strains.

The process of subtyping is important epidemiologically for recognizing outbreaks of infection, detecting the cross-transmission of nosocomial pathogens, determining the source of the infection, recognizing particularly virulent strains of organisms, and monitoring vaccination programs.

Subtyping or strain classification has been accomplished by a number of different approaches. All of these methods must meet several criteria in order to be broadly useful. First, all organisms within a species must be typeable by the method used. For some phenotypic methods, such as those based on a reaction with an antibody or the presence of a bacteriophage receptor, the particular characteristic may not be present in all members of the species. Serotypes can be assigned only if a particular serological marker is present. For example, 10 to 15% of Mycobacterium avium strains are untypeable because they fail to react with any available antiserum (78). Similarly, bacteriophage typing has often been applied to the characterization of Staphylococcus aureus, yet S. aureus isolates lacking the bacteriophage receptor cannot become infected with the typing phage (10, 64). In the absence of the marker, the isolates cannot be assigned a phage type.

Any subtyping method must have high differentiation power (3). It must be able to clearly differentiate unrelated strains, such as those that are geographically distinct from the source organism, and at the same time to demonstrate the relationship of all organisms isolated from individuals infected through the same source. As will be discussed in this review, molecular methods can differ widely in their ability to differentiate strains.
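Differentiation power can be put on a numerical footing. The sketch below uses Simpson's index of diversity in the form adapted for typing systems by Hunter and Gaston, which is an assumption on our part since the text does not name a specific measure. The index estimates the probability that two unrelated isolates drawn at random are assigned to different types. The isolate labels are hypothetical and serve only to illustrate the calculation.

```python
from collections import Counter


def discriminatory_index(type_assignments):
    """Simpson's index of diversity (Hunter-Gaston form).

    type_assignments: one type label per unrelated isolate.
    Returns the probability that two isolates drawn at random,
    without replacement, are assigned to different types.
    """
    n = len(type_assignments)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_assignments)
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical data: the same ten unrelated isolates typed by two methods.
method_a = ["A", "A", "A", "A", "A", "B", "B", "B", "C", "C"]
method_b = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "9"]

print(round(discriminatory_index(method_a), 3))
print(round(discriminatory_index(method_b), 3))
```

A method that places most unrelated isolates in a few large types, like the hypothetical method A, scores lower than one that separates nearly all of them.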

A third concern for subtyping methodologies is reproducibility (3). Reproducibility refers to the ability of a technique to yield the same result when a particular strain is repeatedly tested. Reproducibility is especially important for the construction of reliable databases containing all known strains within a species to which unknown organisms can be compared for classification. Variable expression of phenotypic characteristics, such as sporadic expression of virulence genes or antigens, can contribute to problems with reproducibility (3).

Many of the currently used molecular techniques for typing rely on electrophoretic separation of DNA fragments of different molecular lengths. The electrophoretic result is represented by a pattern of bands on a gel. Since these patterns may be extremely complex, the ease with which the patterns are interpreted and related is a factor in evaluating the utility of a particular typing method (3). Along with considerations related to a particular method’s ease of interpretation, its ease of use is also important (3). The technical difficulty, cost, and time to obtain a result must also be evaluated in assessing the utility of a particular typing method.

The shortcomings of phenotypically based typing methods have led to the development of typing methods based on the microbial genotype or DNA sequence, which minimize problems with typeability and reproducibility and, in some cases, enable the establishment of large databases of characterized organisms. In this review, we have compared some of the most current technologies for genotypic typing with regard to the principles of the methods and their reproducibility and discrimination power, ease of use, and cost in order to give the reader some measure of the clinical practicality and utility of the molecular tools available today.

MOLECULAR TYPING METHODS

PFGE of whole chromosomal DNA. Pulsed-field gel electrophoresis (PFGE) is often considered the “gold standard” of molecular typing methods. For PFGE, bacterial isolates grown either in broth or on solid media are combined with molten agarose and poured into small molds. The results are agarose plugs containing the whole bacteria. The embedded bacteria are then subjected to in situ detergent-enzyme lysis and digestion with an infrequently cutting restriction enzyme. The digested bacterial plugs are then inserted into an agarose gel and subjected to electrophoresis in an apparatus in which the polarity of the current is changed at predetermined intervals. The pulsed field allows clear separation of very large DNA fragments ranging from 10 to 800 kb (65). The electrophoretic patterns are visualized following staining of the gels with a fluorescent dye such as ethidium bromide. Gel results can be photographed, and the data can be stored by using one of the commercially available digital systems, such as those manufactured by Alpha-Innotech, Bio-Rad, Hitachi, or Molecular Dynamics. Data analysis can be accomplished by using any of a number of commercially available software packages from Applied Math, Bio-Rad, BioSystematics, Media Cybernetics, or Scanalytics.

Tenover et al. (75) have proposed a system for standardizing the interpretation of PFGE patterns in relation to determining strain relatedness. In their scheme, bacterial isolates yielding the same PFGE pattern are considered the same strain. Bacterial isolates differing by a single genetic event, reflected as a difference of one to three bands, are closely related. Isolates differing by four to six bands, representing two independent genetic changes, are possibly related. Bacterial isolates containing seven or more band differences, representative of three or more genetic changes, are considered unrelated. These criteria are applicable to small, local studies in which genetic variability is presumed to be limited.
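The interpretive scheme above amounts to a simple lookup on the number of band differences. The following is a minimal sketch, assuming that bands have already been assigned to shared size classes so that two patterns can be compared as sets, and assuming category boundaries of 0, 1 to 3, 4 to 6, and 7 or more band differences. The function names and the example fragment sizes are hypothetical.

```python
def count_band_differences(pattern_a, pattern_b):
    """Number of bands present in one pattern but not in the other."""
    return len(set(pattern_a) ^ set(pattern_b))


def classify_relatedness(band_differences):
    """Map a band-difference count to a relatedness category."""
    if band_differences < 0:
        raise ValueError("band difference count cannot be negative")
    if band_differences == 0:
        return "indistinguishable (same strain)"
    if band_differences <= 3:
        return "closely related"
    if band_differences <= 6:
        return "possibly related"
    return "unrelated"


# Hypothetical fragment sizes in kb. In the test isolate, a new restriction
# site splits the 550-kb fragment into 210-kb and 340-kb fragments:
# one band is lost and two are gained, i.e. three band differences.
outbreak_pattern = [20, 45, 80, 130, 550]
test_isolate = [20, 45, 80, 130, 210, 340]

differences = count_band_differences(outbreak_pattern, test_isolate)
print(differences, classify_relatedness(differences))
```

The example shows why a single genetic event can change up to three bands: one fragment disappears and two smaller ones take its place.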

Numerous studies which define the banding patterns of a variety of bacteria digested with a number of restriction enzymes have been conducted. PFGE has proven to be superior to most other methods for biochemical and molecular typing. It is highly discriminatory and superior to most methods for analysis of Escherichia coli, vancomycin-resistant enterococci, S. aureus, Acinetobacter species, Pseudomonas aeruginosa, and M. avium (4, 5, 7, 30, 46, 52, 57, 63, 64, 73). PFGE was more efficient at differentiating strains of methicillin-resistant S. aureus than other methods tested (58, 63). PFGE was also more discriminatory than repetitive element sequence-based PCR (Rep-PCR) for differentiating strains of the Acinetobacter calcoaceticus-Acinetobacter baumannii complex, vancomycin-resistant enterococci, Neisseria gonorrhoeae, and P. aeruginosa (7, 30, 46, 56).

With the aid of the computerized gel scanning and analysis software, it is possible to create data banks of PFGE patterns for all organisms, enabling the creation of reference databases to which any new strain could be compared for identifying its phylogenetic relationship to other similar strains. This capability is reflected in a recent report to the President on food safety initiatives (22). Two of the recommendations were (i) to enhance surveillance of food-borne outbreaks of disease by equipping FoodNet sites with the equipment to carry out DNA fingerprinting by PFGE in order to identify the source of infectious agents and (ii) to establish an electronic network for rapid fingerprint comparison which would allow sentinel sites to rapidly share data (22).
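Comparing a new pattern against a reference database requires some measure of pattern similarity. The sketch below uses the Dice coefficient, a band-matching measure commonly used by gel analysis software; the text does not specify which measure is used, so this choice is an assumption. The 5% size tolerance, the function names, and the reference patterns are likewise hypothetical.

```python
def dice_similarity(bands_a, bands_b, tolerance=0.05):
    """Dice coefficient between two band patterns (fragment sizes in kb).

    Two bands are treated as the same band when their sizes differ by no
    more than `tolerance` (a fraction) of the larger size. Matching is
    greedy, which is adequate for a sketch but not for dense patterns.
    """
    unmatched_b = sorted(bands_b)
    shared = 0
    for size in sorted(bands_a):
        for candidate in unmatched_b:
            if abs(size - candidate) <= tolerance * max(size, candidate):
                shared += 1
                unmatched_b.remove(candidate)
                break
    total = len(bands_a) + len(bands_b)
    return 2.0 * shared / total if total else 0.0


def best_match(query, database):
    """Name of the stored pattern most similar to the query pattern."""
    return max(database, key=lambda name: dice_similarity(query, database[name]))


# Hypothetical reference database of PFGE patterns.
reference_patterns = {
    "strain_1": [48, 97, 145, 210, 320, 450],
    "strain_2": [60, 120, 180, 260, 400],
}
query_pattern = [49, 96, 146, 212, 318, 455]

print(best_match(query_pattern, reference_patterns))
```

The tolerance parameter reflects the fact that band sizes estimated from different gels never agree exactly; how wide it should be depends on the gel system and is not addressed here.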

One of the factors that has limited the use of PFGE is the time involved in completing the analysis. While the procedural steps are straightforward, the time needed to complete the procedure can be 2 to 3 days. This can reduce the laboratory’s ability to analyze large numbers of samples.

Southern blotting and RFLP methods. Southern blotting has been used for years to detect and locate genomic sequences from a variety of prokaryotic and eukaryotic organisms. For gene detection, whole chromosomal DNA is digested with a restriction enzyme, and the fragments are separated by electrophoresis through an agarose gel. The separated DNA fragments are transferred from the agarose gel to either a nitrocellulose or nylon membrane by Southern blotting (69). The membrane-bound nucleic acid is then hybridized to one or more labeled probes homologous to the gene to be examined. Probes can be labeled with a number of detectable moieties, including enzyme-colorimetric substrates or enzyme-chemiluminescent substrates. This classic method has been adapted to the differentiation of bacterial strains on the basis of the observation that the locations of various restriction enzyme recognition sites within a particular genetic locus of interest can be polymorphic from strain to strain, resulting in gel bands that differ in size between unlike strains. Thus, the name restriction fragment length polymorphism (RFLP) refers to the polymorphic nature of the locations of restriction enzyme sites within defined genetic regions. Only the genomic DNA fragments that hybridize to the probes are visible in RFLP analysis, which simplifies the analysis greatly.

Gene-specific probes have been used to subtype Brucella species (29), Legionella pneumophila (76), and P. aeruginosa (47). Furthermore, ribotyping, a variation of RFLP-Southern blotting in which the probes are derived from the 16S and 23S rRNA genes, has been applied successfully in many studies for differentiating bacterial strains (12, 20, 25, 34, 40, 72). Ribotyping results in only a small number of bands, which simplifies the interpretation. However, this also limits the ability of the technique to distinguish between closely related strains.

Multiple genetic sites can also be targeted in Southern blotting studies to subtype bacteria. Grundmann et al. (30) used a combination of a toxA gene probe with probes for the 16S and 23S rRNA genes to subtype P. aeruginosa isolates. Both RFLP-Southern blotting probe sets yielded information about the evolutionary relationships of the strains examined, yet neither technique was as efficient at discriminating strain differences as PFGE. While RFLP-Southern blotting has been a useful technique for limited studies of strain subtypes, the cumbersome blotting techniques have been largely replaced by PCR-based locus-specific RFLP.

PCR-based locus-specific RFLP. The PCR has enabled specific genetic loci to be routinely amplified and examined for differences indicative of strain variation and antimicrobial resistance. The specific locus to be examined is amplified with gene-specific primers and subjected to RFLP analysis. The DNA fragments are separated on an agarose or small polyacrylamide gel, and the digestion patterns are visualized following ethidium bromide staining.
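The principle of locus-specific RFLP can be illustrated with an in silico digest. The following sketch, under the assumption of a linear amplicon and an enzyme that cuts at a fixed offset within its recognition site (EcoRI, which cuts after the first base of GAATTC, is used as the example), shows how the loss of a single site changes the fragment pattern. The amplicon sequences are artificial and the function name is hypothetical.

```python
def digest(sequence, site, cut_offset):
    """Fragment lengths from cutting a linear DNA at every occurrence of `site`.

    cut_offset is the number of bases of the recognition site that remain
    on the left-hand fragment (1 for EcoRI, G^AATTC).
    """
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Artificial amplicons from two strains. Strain Y carries a point mutation
# (GAATTC -> GAGTTC) that destroys the second EcoRI site.
strain_x = "A" * 30 + "GAATTC" + "C" * 40 + "GAATTC" + "T" * 20
strain_y = "A" * 30 + "GAATTC" + "C" * 40 + "GAGTTC" + "T" * 20

print(digest(strain_x, "GAATTC", 1))  # three fragments
print(digest(strain_y, "GAATTC", 1))  # two fragments
```

Because both amplicons have the same total length, the strains are indistinguishable before digestion and differ only in their restriction patterns.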

Locus-specific RFLP has been applied in a number of situations. Shortridge et al. (67) have used RFLP of the ureC gene to demonstrate the genetic diversity of Helicobacter pylori strains in the United States. The 16S, 23S, and 16S-23S spacer regions have also been used as targets for locus-specific RFLP (30, 83, 93). In this variation of ribotyping, the ribosomal DNA is amplified and subjected to digestion with a restriction enzyme, and the DNA fragments are visualized following separation by gel electrophoresis, alleviating the need for Southern blotting. Ribotyping can be successfully applied for differentiation of bacterial strains that display a high degree of heterogeneity within the rRNA operons.

In a novel application of locus-specific RFLP, Cockerill et al. (16) have identified point mutations in the katG gene of Mycobacterium tuberculosis that correspond to various levels of resistance to isoniazid by examination of RFLP patterns of the amplified gene. However, the discriminatory power of locus-specific PCR is generally not as good as that of other methods, due primarily to the limited region of the genome which can be examined. For example, in a study by Kühn et al. (42), PCR-ribotyping showed poor discriminatory power in comparison to PFGE and biochemical typing methods.

Locus-specific RFLP has found a significant application in epidemiological studies of hepatitis C virus (HCV). HCV can be divided into six major genotypes, numbered 1 through 6. Davidson et al. (17) have developed a simple method for determining the genotype of an HCV isolate based on PCR amplification and RFLP analysis of the 5′ untranslated region of the HCV genome. Digestion with the enzyme combinations HaeIII-RsaI and MvaI-HinfI or with the single enzymes BstI and ScrFI allows differentiation of viral isolates into groups 1a, 1b, 2a, 2b, 3a, 3b, 4, 5, and 6. RFLP of the 5′ untranslated region has facilitated studies of the geographical distribution of viral genotypes, the natural history of the disease, and correlation of genotypes with response to interferon.

RAPD assays. The random amplified polymorphic DNA (RAPD) assay, also referred to as arbitrary primed PCR, was first described by Williams et al. (87) and Welsh and McClelland (86). RAPD assays are based on the use of short random sequence primers, 9 to 10 bases in length, which hybridize with sufficient affinity to chromosomal DNA sequences at low annealing temperatures such that they can be used to initiate amplification of regions of the bacterial genome. If two RAPD primers anneal within a few kilobases of each other in the proper orientation, a PCR product with a molecular length corresponding to the distance between the two primers results. The number and location of these random primer sites vary for different strains of a bacterial species. Thus, following separation of the amplification products by agarose gel electrophoresis, a pattern of bands which, in theory, is characteristic of the particular bacterial strain results (14, 51, 86, 87).
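The dependence of the RAPD pattern on the number and location of primer sites can be sketched in a few lines. The example below is a deliberate simplification: it assumes exact primer matches only, whereas real RAPD priming tolerates mismatches at low annealing temperatures, and it assumes a 3-kb upper limit on product length. The primer sequence, the artificial genomes, and the function names are all hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(sequence, pattern):
    """Start positions of every (possibly overlapping) occurrence of pattern."""
    positions = []
    start = sequence.find(pattern)
    while start != -1:
        positions.append(start)
        start = sequence.find(pattern, start + 1)
    return positions


def rapd_products(genome, primer, max_length=3000):
    """Predicted product lengths for a single arbitrary primer.

    A product forms when the primer sequence occurs on the top strand and
    its reverse complement occurs downstream (i.e. the primer also anneals
    in the opposite orientation) within max_length bases.
    """
    forward_sites = find_all(genome, primer)
    reverse_sites = find_all(genome, reverse_complement(primer))
    products = []
    for f in forward_sites:
        for r in reverse_sites:
            length = r + len(primer) - f
            if r >= f + len(primer) and length <= max_length:
                products.append(length)
    return sorted(products)


primer = "GTGCAATGAG"              # hypothetical 10-base primer
site = reverse_complement(primer)  # opposing priming site
strain_1 = "T" * 50 + primer + "A" * 200 + site + "T" * 100 + site + "T" * 50
# Strain 2 has lost the second opposing site through a point mutation.
strain_2 = "T" * 50 + primer + "A" * 200 + site + "T" * 100 + "CTCATTGCAT" + "T" * 50

print(rapd_products(strain_1, primer))
print(rapd_products(strain_2, primer))
```

In this toy case the loss of one priming site removes one band from the pattern, which is the kind of strain-to-strain difference the assay relies on.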

In most cases the sequences of the RAPD primers which generate the best DNA pattern for differentiation must be determined empirically. More recently, a consensus M13 DNA sequencing primer has been used in RAPD fingerprinting assays, which allows for some standardization of the procedure (83, 85).

In comparison to other typing techniques, Vila et al. (83) found that the RAPD assay was more discriminating than RFLP analysis of either the 16S rRNA genes or the 16S-23S rRNA spacer region but less discriminating than Rep-PCR. Furthermore, a number of problems have been reported for RAPD assays that contribute to a lack of reproducibility and standardization. Since the primers are not directed against any particular genetic locus, many of the priming events are the result of imperfect hybridization between the primer and the target site. Thus, the amplification process is extremely sensitive to slight changes in the annealing temperature which can lead to variability in the banding patterns. The use of empirically designed primers, each with its own optimal reaction conditions and reagents, also makes standardization of the technique difficult (6, 51, 86).

Rep-PCR. Versalovic et al. (80) described a method for fingerprinting bacterial genomes by examining strain-specific patterns obtained from PCR amplification of repetitive DNA elements present within bacterial genomes. Two main sets of repetitive elements are used for typing purposes. The repetitive extragenic palindromic (REP) elements are 38-bp sequences consisting of six degenerate positions and a 5-bp variable loop between each side of a conserved palindromic stem (71). REP sequences have been described for numerous enteric bacteria (28, 33, 66, 71, 92). The palindromic nature of the REP elements and their ability to form stem-loop structures have led to multiple proposed functions for these highly conserved, dispersed elements (27, 54, 92).

The enterobacterial repetitive intergenic consensus (ERIC) sequences are a second set of DNA sequences which have been successfully used for DNA typing. ERIC sequences are 126-bp elements which contain a highly conserved central inverted repeat and are located in extragenic regions of the bacterial genome (33, 66). They have been defined primarily based on sequence data obtained from E. coli and Salmonella typhimurium (33, 66).

While the REP and ERIC sequences are the most commonly used targets for DNA typing, another repetitive element, the BOX sequence, has been used to differentiate Streptococcus pneumoniae (39, 50). BOX elements are located within intergenic regions and can also form stem-loop structures due to their dyad symmetry. They are mosaic repetitive elements composed of various combinations of three subunit sequences referred to as boxA, boxB, and boxC (50). The three subunit sequences have molecular lengths of 59, 45, and 50 nucleotides, respectively (50). The BOX elements have no sequence relationship to either REP or ERIC sequences (50). While initially thought to be unique to S. pneumoniae, BOX elements have now been found in a number of bacterial species (43).

Rep-PCR can be performed with DNA extracted from bacterial colonies or by a modified method using unprocessed whole cells (90). REP or ERIC amplification can be performed with a single primer, a single set of primers, or multiple sets of primers. ERIC patterns are generally less complex than REP patterns, but both give good discrimination at the strain level. Application of both REP and ERIC PCR to samples to be typed increases the discriminatory power over that of either technique used alone.

Rep-PCR is fast becoming the most widely used method of DNA typing. Rep-PCR with primers based on REP and ERIC sequences has been successfully used to differentiate strains of Bartonella (62), Bacillus subtilis (80), Citrobacter diversus (32, 89), Enterobacter aerogenes (24), Rhizobium meliloti (18), methicillin-resistant S. aureus (19), S. pneumoniae (81, 93), A. baumannii (21, 60), Burkholderia cepacia (31), L. pneumophila (23), H. pylori (43), N. gonorrhoeae (56), and Neisseria meningitidis (91). The technique is easy to perform and can be applied to large or small numbers of isolates. Rep-PCR shows broader species applicability and better discriminatory power than either plasmid profiling or genomic fingerprinting (24). Rep-PCR has considerably better discriminatory power than restriction analysis of the 16S rRNA gene or the 16S-23S spacer region (2, 83). Furthermore, studies which have compared Rep-PCR to other typing methods such as multilocus enzyme electrophoresis (89), biochemical characterizations (15), or ribotyping (2, 68, 83) have shown Rep-PCR to be superior to these methods. Finally, several studies have shown Rep-PCR to have good correlation with PFGE results but, in general, with slightly less discriminatory power (7, 46).

Rep-PCR has been adapted to an automated format in which fluorescently labeled primers are used to create either the REP or ERIC profile and the amplified sequences are separated via a fluorescence-based DNA sequencer (19, 82). This method allows consistent pattern formation and storage of the data in a database as a digitized image. Unknown strains can be compared against the stored database for identification purposes. Thus, like PFGE, Rep-PCR lends itself to standardization, which will enable comparisons of strains isolated in different laboratories.

CFLP. Cleavase I is a thermostable, engineered endonuclease which consists of the 5′-nuclease domain (amino acids 1 to 281) of Thermus aquaticus DNA polymerase (13). Whereas restriction endonucleases recognize specific DNA sequences as their recognition sites, cleavase I recognizes structures formed by base-paired DNA at elevated temperatures (13). When denatured by heating at 95°C followed by cooling at temperatures ranging between 50 and 60°C, the single strands of DNA go through a self-base-pairing phase during which hairpin structures unique to each strand are formed prior to reannealing with the complementary DNA strand. This process creates a series of stem-loop structures connected by non-base-paired single-stranded regions. Cleavase I recognizes the stem-loop structures and cuts the DNA 5′ to the first paired base at the junction of the single- and double-stranded regions. Theoretically, in any given DNA preparation the denatured single strands represent a mixture of all possible configurations which the strands can assume. For the cleavase reaction, a PCR product labeled at one of its 5′ termini is subjected to partial digestion with the enzyme such that all possible digestion products are represented in the reaction mixture. The resulting mixture is analogous to a dideoxy sequencing reaction in that it contains a collection of randomly cleaved, labeled single-stranded DNA molecules extending from the 5′ labeled end to a terminal cleavage site.

Differences in the structural fingerprints between dissimilar PCR products may appear as a gain or loss of a band, a mobility shift in a band, or a change in the intensity of a particular band. While gains, losses, and shifts of bands are easily recognized, differences in band intensity, which occur frequently, are extremely difficult to detect and can often lead to false results.

The discrimination power of cleavase fragment length polymorphism (CFLP) is dependent on the DNA sequence of the locus examined. For example, while a pattern abundant in cleavage products is obtained from digestion of 16S rDNA, very few cleavage products are generated by digestion of the rpoB gene of M. tuberculosis, precluding identification of many sequence differences in the region critical for drug resistance (13). Furthermore, the two strands of a DNA molecule may differ in their ability to reveal sequence changes. For example, whereas differences in the nucleotide sequences of the intergenic regions of Shigella dysenteriae serotypes 1, 2, and 8 are represented in the structural fingerprints of the antisense strand, the fingerprints of the sense strand are indistinguishable. Maddox et al. (48) compared the abilities of CFLP and single-strand conformation polymorphism to detect single point mutations. In that study, single-strand conformation polymorphism detected six of six mutations analyzed, while CFLP could detect only three of the six mutations. Thus, the applicability of CFLP analysis to any given DNA cannot be predicted but must rather be determined empirically. Some DNA molecules give excellent results in terms of differentiation, while others give no results at all for unknown reasons. CFLP is also size limited; that is, differences in the structural fingerprints of DNA fragments larger than 1 kbp cannot be reliably resolved by electrophoresis.

CFLP has been applied to genotypic differentiation of HCV isolates (49, 70). In both of these studies, genotype analysis by CFLP was successful in differentiating the strains examined. However, while the intralaboratory reproducibility of the banding patterns for each viral type was good, the interlaboratory reproducibility of the same patterns was poor: the structural fingerprints of HCV type 1a and 1b isolates in the study by Marshall et al. (49) differed significantly from those reported by Sreevatsan et al. (70). Both studies presented CFLP fingerprints of a 244-bp fragment derived from the 5′ noncoding region by PCR amplification with the Amplicor HCV kit (Roche Molecular Diagnostics, Branchburg, N.J.). The patterns produced for a given genotype in the study by Marshall et al. (49) were generally consistent and easily interpreted. In contrast, while a consistent, unique fingerprint was generated for each genotype tested in the study by Sreevatsan et al. (70), the fingerprints were considerably more complex and difficult to interpret. The implication is that while CFLP can achieve acceptable intralaboratory reproducibility, its poor interlaboratory reproducibility limits its utility. Patterns generated by one laboratory cannot be shared or used as a reference by other laboratories.

The need for empirical determination of the applicability of CFLP to any given genetic system coupled with the difficulty in pattern interpretation and the interlaboratory variation in the CFLP patterns of the same organisms compromise the use of this technology for clinical purposes. CFLP is also considerably more complex to perform than either locus-specific PCR-RFLP, Rep-PCR, or RAPD analysis. With the exception of DNA sequencing, the discriminatory ability of CFLP has not been compared to that of any of the other molecular typing techniques, so its capabilities remain unknown.

AFLP assays. Amplified fragment length polymorphism (AFLP) is a genome fingerprinting technique based on the selective amplification of a subset of DNA fragments generated by restriction enzyme digestion (84). Originally applied to the characterization of plant genomes, AFLP has been more recently applied to bacterial typing (21, 26, 35, 36, 38). Two variations of AFLP have been described, one with two different restriction enzymes and two primers for amplification and the second with a single primer and restriction enzyme (21, 26, 35, 36, 38). In the most common configuration, bacterial DNA is extracted, purified, and subjected to digestion with two different enzymes, such as EcoRI and MseI. The restriction fragments are then ligated to linkers containing each restriction site and a sequence homologous to a PCR primer binding site. The PCR primers used for amplification contain DNA sequences homologous to the linker and contain one to two selective bases at their 3′ ends. For example, a selective primer directed against an EcoRI site might have the sequence 5′-GAATTCAA-3′, where the first six bases are complementary to the EcoRI site while the two A residues at the 3′ end are selective and allow amplification of only those EcoRI sites with the sequence 3′-CTTAAGTT-5′. They would not amplify an EcoRI site with the sequence 3′-CTTAAGTC-5′. Thus, the selective nucleotides allow amplification of only a subset of the genomic restriction fragments. In order to visualize the patterns, one of the PCR primers contains either a radioactive (21, 26, 35, 36) or a fluorescent (38) label, or, alternatively, the gels may be stained with ethidium bromide and the patterns may be examined under UV light (26).
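The effect of the selective bases can be sketched as a simple filter on fragment ends. The example below takes the primer 5′-GAATTCAA-3′ from the text and, as a simplifying assumption, represents each restriction fragment only by the top-strand sequence at its EcoRI end (the primer anneals to the complementary strand, so its sequence must equal this top-strand end). Linkers, the second enzyme, and the second primer are ignored. The fragment names and sequences are hypothetical.

```python
def primer_extends(primer, fragment_end):
    """True if the primer, including its selective 3' bases, matches the fragment end.

    fragment_end is the top-strand sequence (5'->3') at the EcoRI end of a fragment.
    """
    return fragment_end.upper().startswith(primer.upper())


def amplified_subset(fragment_ends, primer):
    """Names of the fragments that the selective primer can amplify."""
    return sorted(
        name for name, end in fragment_ends.items() if primer_extends(primer, end)
    )


# Hypothetical fragment ends: all begin with the EcoRI site (GAATTC) but
# differ in the two bases that follow it.
fragment_ends = {
    "frag_1": "GAATTCAAGCT",
    "frag_2": "GAATTCAGGCT",
    "frag_3": "GAATTCAATTG",
    "frag_4": "GAATTCCATTG",
}

print(amplified_subset(fragment_ends, "GAATTCAA"))  # selective primer
print(amplified_subset(fragment_ends, "GAATTC"))    # no selective bases
```

Without selective bases every fragment is a template; each added selective base narrows the amplified subset, which is what keeps the resulting banding pattern simple enough to read.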

Studies to date have demonstrated that AFLP is reproducible (21, 35, 38) and has good ability to differentiate clonally derived strains (21, 26, 35, 36, 38). The differentiation power of AFLP appears to be greater than that of PCR-based ribotyping; however, it has not yet been compared to Rep-PCR or PFGE (38). The AFLP procedure is more labor-intensive than Rep-PCR, but results are obtained more rapidly than with PFGE. Radioactive formats are impractical for clinical use, and while significant advantages exist with the fluorescent format, the cost of a DNA sequencer may be prohibitive for most laboratories. A commercially available kit can be used in combination with enzymatic detection of the bands; however, this requires that the DNA fragments be transferred to a filter by blotting and adds considerably to the labor costs. Yet, the reproducibility of the banding patterns for a given strain facilitates storing patterns in databases for use in identifying new bacterial strains. This combined with strong discriminatory power may make AFLP attractive to those laboratories performing frequent epidemiological studies such that a DNA sequencer becomes cost-effective.

DNA sequencing. Ultimately, all molecular genetic methods for distinguishing organism subtypes are based on differences in the DNA sequence. Logically, then, DNA sequencing would appear to be the best approach to differentiating subtypes. However, as discussed below, this is not the case. DNA sequencing was originally performed by using radioactive labels for detection of the reaction products, an approach that is unsuitable for clinical use. Current DNA sequencing protocols employ fluorescent nucleotides to label the DNA. The sequence is then read with an automated instrument.

DNA sequencing generally begins with PCR amplification of a sample DNA directed at genetic regions of interest, followed by sequencing reactions with the PCR products. Modifications of the procedure can be made such that RNA may also be used as the starting material. Instruments that perform automated analysis of DNA sequencing gels are based on real-time detection of fluorescently labeled sequencing reaction products. DNA sequencing instruments most commonly utilize a modification of dideoxynucleotide chain termination chemistry in which the sequencing primer is labeled at the 5′ end with one of four fluorescent dyes. Each of the four fluorescent dyes represents one of the four nucleotides which make up the DNA molecule, and hence four separate annealing and extension reactions must be performed for each DNA sample to be analyzed. Once the DNA sequencing reactions have been completed, the four sets of reaction products are combined, concentrated, and loaded in a single well on a polyacrylamide gel. During electrophoresis, these fluorescently labeled products are excited by an argon laser and automatically detected. The resulting data is stored in digital form for subsequent processing into the final sequence with the aid of specialized software.

There are several considerations that must be evaluated before undertaking the use of DNA sequencing for subtyping. First, for practical purposes in the clinical laboratory, DNA sequencing must be directed at only a very small region of the chromosome of an organism. It is impractical to sequence multiple or large regions of the chromosome of an organism. Thus, in contrast to techniques like PFGE, Rep-PCR, or RAPD analysis, which examine the entire chromosome, DNA sequencing examines only a very small portion of the sites which can potentially vary between bacterial or fungal strains.

Furthermore, the short region of DNA used must meet several criteria before it can be used for strain differentiation. The structure of the region of DNA selected must consist of a variable sequence flanked by highly conserved regions. This enables PCR amplification and typing of all members of a species. The variability within the selected sequence must be sufficient to differentiate different strains of a particular species. Finally, the selected sequence should not be horizontally transmissible to other strains of the species.
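Two of these criteria, conserved flanks and a sufficiently variable core, can be checked on a set of aligned sequences. The following is a minimal sketch, assuming gap-free aligned sequences of equal length and a fixed flank width reserved for primer binding. It does not address horizontal transfer. The function names, the flank width, and the example sequences are hypothetical.

```python
def variable_positions(aligned):
    """Indices of alignment columns in which the sequences differ."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [i for i, column in enumerate(zip(*aligned)) if len(set(column)) > 1]


def is_candidate_locus(aligned, flank):
    """Check conserved flanks and a core that separates every strain.

    flank: number of bases at each end that must be identical in all
    strains so that a single primer pair amplifies every member.
    """
    length = len(aligned[0])
    variable = variable_positions(aligned)
    flanks_conserved = all(flank <= i < length - flank for i in variable)
    all_strains_distinct = len(set(aligned)) == len(aligned)
    return flanks_conserved and all_strains_distinct


# Hypothetical 24-base locus from three strains: 8-base conserved flanks
# around an 8-base variable core.
good_locus = [
    "ACGTACGTAAGGCCTTGACTGACT",
    "ACGTACGTAAGACCTTGACTGACT",
    "ACGTACGTATGGCCTTGACTGACT",
]
# Here one strain carries a change inside the left-hand primer region.
bad_locus = [
    "ACGTACGTAAGGCCTTGACTGACT",
    "ACGTACGTAAGACCTTGACTGACT",
    "TCGTACGTAAGGCCTTGACTGACT",
]

print(variable_positions(good_locus), is_candidate_locus(good_locus, flank=8))
print(variable_positions(bad_locus), is_candidate_locus(bad_locus, flank=8))
```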

Unfortunately, for bacteria and fungi, few sequences meet these criteria. While the 16S rRNA genes have been used to identify new species of organisms, they show limited variability between strains of a bacterial species (11, 44, 55, 88). The intergenic region between the 16S and 23S rRNA genes has also been used with variable results (8). In essence, species-specific genetic loci must be identified, and each locus must be tested empirically for its ability to provide strain differentiation. The major limitation of DNA sequencing approaches to bacterial and fungal subtyping is the size constraints of the DNA that can be practically sequenced.

In contrast, DNA sequencing is considered the gold standard for viral typing. Nucleic acid sequencing has much greater discriminatory power for determination of HCV genotypes and is the only molecular method by which mutations associated with human immunodeficiency virus (HIV) drug resistance can be reliably determined. The regions used for viral genotyping and detection of drug resistance mutations are in short, well-defined sequences that fulfill the criteria needed for the application of nucleic acid sequencing methodologies.

In applying DNA sequencing to genotypic studies, the laboratory should also be aware of the constraints of the sequencing methodology. One of the most problematic areas involved in DNA sequencing is the preparation of the sequencing gel. Some sequencing instruments either use precast gels or have apparatuses which greatly simplify gel preparation. Sequencing instruments that employ capillary gels, which automate gel preparation and loading, are even more convenient. However, traditional sequencing instruments in which the gel consists of a 40- by 40-cm slab gel are cumbersome to handle and pour.

Care must also be used when interpreting DNA sequencing data. The laboratory should be aware of artifacts that can complicate the interpretation. For example, repeats of a single nucleotide in a sequence can result in band compression artifacts that lead to misreading of the sequence. Both DNA strands should be sequenced and compared in order to minimize incorrect base identification.
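The comparison of the two strands can be sketched as follows. This is a simplified illustration, assuming that the forward and reverse reads cover exactly the same region and are already of equal length; real data would first require alignment and would normally take base quality into account. The function names and example reads are hypothetical.

```python
from typing import List, Tuple

# Translation table for complementing DNA bases (N = undetermined base).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def reconcile_strands(forward: str, reverse: str) -> Tuple[str, List[int]]:
    """Compare a forward read with the reverse complement of the reverse read.

    Returns the consensus sequence, with 'N' at positions where the two
    strands disagree, together with the list of discordant positions.
    """
    forward = forward.upper()
    rc = reverse_complement(reverse)
    if len(forward) != len(rc):
        raise ValueError("reads must cover the same region")
    consensus = []
    discordant = []
    for i, (a, b) in enumerate(zip(forward, rc)):
        if a == b:
            consensus.append(a)
        else:
            consensus.append("N")
            discordant.append(i)
    return "".join(consensus), discordant


# Hypothetical reads: the reverse read was misread within a run of one base.
consensus, discordant = reconcile_strands("ACGGGGTA", "TACCCGGT")
print(consensus, discordant)
```

Positions flagged as discordant would be inspected on the original sequencing trace or resequenced rather than resolved automatically.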

In general, DNA sequencing is expensive and requires a high degree of technical competency to perform. Furthermore, automated DNA sequencers can be prohibitively expensive, with some costing in excess of $100,000.

In the future, non-gel-based sequencing methods based on hybridization of an unknown sequence to a DNA array may simplify the task of DNA sequencing for typing purposes as gene-specific chips are made. Kozal et al. (41) have reported the use of high-density DNA arrays to determine the DNA sequences of HIV clade B proteases of 167 HIV isolates taken from 102 patients in order to identify mutations associated with resistance to protease inhibitors. More recently, Troesch et al. (77) have described the use of a high-density DNA array to simultaneously identify mycobacteria at the species level and detect the presence of mutations in the rpoB gene associated with resistance to rifampin. Seventy isolates comprising 27 different species of mycobacteria and 15 rifampin-resistant strains of M. tuberculosis were analyzed. For species determination, the 16S rRNA hypervariable region was amplified by PCR, and the amplicons were hybridized to the array. Twenty-six of 27 species tested were correctly identified. Similarly, mutations associated with rifampin resistance were detected in all 15 rifampin-resistant isolates by hybridization of amplicons obtained by amplification of the rpoB gene to the array. This demonstrates the ability of DNA arrays to simultaneously interrogate more than one genetic locus. The multiplex capabilities of DNA arrays may significantly increase the utility of DNA sequencing-based methods for strain identification, since multiple genes could be rapidly screened for sequence-specific strain differences. The multiplexing capabilities of a DNA array would also enable simultaneous sequencing of both DNA strands, decreasing the amount of material and labor required to complete an analysis and increasing the cost-effectiveness of a chip for typing. While this technology is still some years away from commercialization, DNA arrays show intriguing prospects for the future.
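How a sequence might be read from hybridization data can be sketched under a simplifying assumption that is not taken from the studies cited above: each position of the target is interrogated by four probes that differ only in the base at that position, and the base is called from the probe giving the strongest signal. The function names, the `min_ratio` threshold, and the intensity values are hypothetical.

```python
from typing import Dict, List, Tuple


def call_bases(intensities: List[Dict[str, float]], min_ratio: float = 1.5) -> str:
    """Call one base per position from the four probe intensities.

    A base is called only when the strongest signal exceeds the second
    strongest by at least `min_ratio`; otherwise 'N' is reported.
    """
    calls = []
    for probes in intensities:
        ranked = sorted(probes.items(), key=lambda kv: kv[1], reverse=True)
        (best_base, best), (_, second) = ranked[0], ranked[1]
        if best > 0 and (second <= 0 or best / second >= min_ratio):
            calls.append(best_base)
        else:
            calls.append("N")
    return "".join(calls)


def differences(reference: str, called: str) -> List[Tuple[int, str, str]]:
    """List (position, reference base, called base) for unambiguous changes."""
    return [
        (i, r, c)
        for i, (r, c) in enumerate(zip(reference, called))
        if c != "N" and r != c
    ]


# Hypothetical intensities for three positions (arbitrary units).
signals = [
    {"A": 900.0, "C": 100.0, "G": 80.0, "T": 60.0},
    {"A": 100.0, "C": 120.0, "G": 110.0, "T": 90.0},
    {"A": 50.0, "C": 40.0, "G": 30.0, "T": 800.0},
]
called = call_bases(signals)
print(called, differences("AGC", called))
```

Because every position is called independently, several loci can be tiled on one array and read in a single hybridization, which is the multiplexing advantage described above.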

PRACTICAL ASPECTS OF MOLECULAR TYPING METHODS

Tenover et al. have identified criteria to aid in the selection and interpretation of molecular typing methods for epidemiological studies (74). Although a particular typing method may have high discriminatory power and good reproducibility, the complexity of the method and interpretation of results as well as the costs involved in setting up and using the method may be beyond the capabilities of the laboratory. The choice of a molecular typing method, therefore, will depend upon the needs, skill level, and resources of the laboratory. As summarized in Fig. 1, methods such as locus-specific PCR, RAPD analysis, and Rep-PCR are similar in their procedures and are generally the easiest to implement. While PFGE appears to be time-consuming, it too is not difficult to implement. In contrast, procedures such as DNA sequencing, which require considerable knowledge about the instrumentation, methods, and potential pitfalls that can lead to uninterpretable results, can be difficult to implement reliably and require a higher skill level. Likewise, CFLP, with its need for DNA purification, careful optimization, and complex interpretation and its questionable reproducibility, can also be difficult to implement in a clinical laboratory.

These methods also vary with regard to their equipment needs, setup costs, and cost per test (Fig. 2). For example, while DNA sequencing may ensure that the subtlest of changes can be observed, the cost of an automated sequencer may be prohibitively high. Furthermore, for subtyping bacteria and fungi, the laboratory must weigh the fact that the discriminatory power of techniques such as PFGE and Rep-PCR is sufficiently high to yield excellent subtyping results for a fraction of the setup costs as well as lower costs per test.

Fig. 2. Equipment, reagents, and costs associated with molecular typing methods. Footnotes: 1, the large range in setup cost corresponds to the various types of automated DNA sequencing available; 2, material costs include enzymes, nucleotides, primers, disposable tubes and pipette tips, chemicals, purification kits, and electrophoresis reagents; 3, the material cost was calculated based on analysis of both DNA strands; 4, labor costs were calculated based on the total hands-on time of a technologist paid at a rate of $20.00/h and analyzing batches of 12 samples per run; 5, labor costs were calculated based on analysis of both DNA strands. rx, reaction.

While the costs per test may be decreased for both DNA sequencing and CFLP if only one DNA strand is examined, this must be weighed against the reliability of the result for strain differentiation. As mentioned above, CFLP can show marked DNA strand-specific differences in discriminatory power. Likewise, DNA sequencing is most accurate when both DNA strands are sequenced.

While the time to obtain a result can be as much as 4 days for PFGE, the labor costs are lower than those for a 2-day result from a technique such as CFLP (Fig. 2). Most of the time required for performance of PFGE is the reaction incubation time, whereas CFLP requires considerably more hands-on manipulation by the technician.
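The distinction between time to result and labor cost follows from the way labor was costed in Fig. 2: hands-on time of a technologist at $20.00/h, with batches of 12 samples per run. A minimal sketch of that arithmetic is given below; the hands-on times used in the example are hypothetical placeholders and are not values taken from Fig. 2.

```python
def labor_cost_per_sample(
    hands_on_hours: float,
    hourly_rate: float = 20.0,  # technologist rate stated in the Fig. 2 footnote
    batch_size: int = 12,       # samples per run stated in the Fig. 2 footnote
) -> float:
    """Labor cost per sample: hands-on time x hourly rate / samples per batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return hands_on_hours * hourly_rate / batch_size


# Hypothetical methods: turnaround time does not enter the calculation,
# only the hands-on time per batch does.
long_turnaround = labor_cost_per_sample(hands_on_hours=3.0)   # e.g., 4-day result
short_turnaround = labor_cost_per_sample(hands_on_hours=6.0)  # e.g., 2-day result
print(long_turnaround, short_turnaround)
```

Under these assumed figures, the method with the longer turnaround is the cheaper one in labor, because incubation time does not occupy the technologist.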

The characteristics pertinent to the introduction of a molecular typing method into a clinical laboratory are summarized in Table 1. One other criterion, not considered here, is the ability of a method to allow analysis of large numbers of samples. This criterion may not be of importance to some clinical laboratories, while for those routinely involved in large epidemiological studies, high-throughput capability may be crucial. For high throughput, simple techniques with high discriminatory power and low cost, such as Rep-PCR, may be most suitable. However, if storage of genetic profiles for the creation of reference databases of organisms is desirable, then methods such as PFGE or AFLP may be equally attractive for bacterial and fungal subtyping. For viral subtyping, DNA sequencing may be the only method suitable for monitoring genetic change and detecting mutations associated with drug resistance.

Table 1. Summary of the characteristics of the various molecular typing methods

Modern epidemiologists are fortunate to have a variety of tools which provide good molecular differentiation and which can be tailored to fit the needs of both the laboratory and the clinical study. Several of these molecular methods could enable the creation of large reference libraries of typed organisms to which new outbreak strains can be compared across laboratories in order to monitor changes in microbial populations.

Use of repetitive (repetitive extragenic palindromic and enterobacterial repetitive intergenic consensus) sequences and the polymerase chain reaction to fingerprint the genomes of Rhizobium meliloti isolates and other soil bacteria. Appl. Environ. Microbiol. 58:2180-2187, 1992.

Food and Drug Administration, U.S. Department of Agriculture, U.S. Environmental Protection Agency, and Centers for Disease Control and Prevention. Food safety from farm to table: a national food safety initiative report to the President, p. 3. Food and Drug Administration, Washington, D.C., 1997.

Validation of use of whole-cell repetitive extragenic palindromic sequence-based PCR (rep-PCR) for typing strains belonging to the Acinetobacter calcoaceticus-Acinetobacter baumannii complex and application of the method to the investigation of a hospital outbreak. J. Clin. Microbiol. 34:1193-1202, 1996.

How to select and interpret molecular strain typing methods for epidemiological studies of bacterial infections: a review for healthcare epidemiologists. Molecular Typing Working Group of the Society for Healthcare Epidemiology of America. Infect. Control Hosp. Epidemiol. 18:426-439, 1997.