Which of the following is not a genotypic method for determining relatedness between microorganisms?

The advent of new molecular technologies in genomics and proteomics is shifting traditional techniques for bacterial classification, identification, and characterization in the 21st century toward methods based on the elucidation of specific gene sequences or molecular components of a cell. We discuss current genotypic and proteomics technologies for bacterial identification and characterization, and present an overview of how these new technologies complement conventional approaches. The new methods can be rapid, offer high throughput, and produce unprecedented levels of discrimination among strains of bacteria and archaea. Remaining challenges include developing appropriate standards and methods for these techniques' routine application and establishing integrated databases that can handle the large amounts of data that they generate. We conclude by discussing the impacts of rapid bacterial identification on the environment and public health, as well as directions for future development in this field.

Since the first recognition of microorganisms, scientists have devised classification schemes with the goal of systematically identifying species in an evolutionary or phylogenetic context (Clarke 1985). This has consistently proved more challenging for bacteria than for macroorganisms. Bacteria are asexual, so the classic definition of a species as a group of organisms that can interbreed and produce fertile offspring is difficult to apply. Furthermore, because of their small size, bacteria have a limited range of morphological attributes. They do exhibit enormous biochemical diversity in both their metabolism and cell structure; this has proved to be a useful cue for the taxonomy of some groups, but by no means all of them. It is important, then, that the molecular revolution that has transformed all of biology has had as great an impact on the taxonomy and systematics of bacteria as in any other area of biology. In the 1970s, on the basis of molecular comparisons of evolutionarily conserved ribosomal genes, Carl Woese proposed the then-heretical notion that the bacteria actually made up two separate domains, the Bacteria and the Archaea, each as distinct from one another as they are from the Eukaryotes, the third domain that comprises all “higher forms” of life (Woese 1987). This classification, supported by reams of additional molecular data, is now the standard view among microbiologists. This is not to say that bacterial systematics is now fully standardized; indeed, there is a vigorous ongoing debate about what constitutes a bacterial species (Gevers et al. 2005, Achtman and Wagner 2008). Nonetheless, molecular systematics has provided the crucial framework for building bacterial classification schemes.

Despite the lack of a coherent species definition, the timely classification, characterization, and identification of bacteria continue to be critical in many areas, including public health, clinical diagnosis, environmental monitoring, food safety monitoring, and identification of biological threat agents. In particular, the recent advances of modern molecular techniques in genomics and proteomics have offered attractive alternatives to conventional microbiological procedures for characterizing and identifying microorganisms. These new methods can provide a rapid, multidimensional data output with taxonomically relevant molecular information on both individual strains and whole populations.

This article is primarily concerned with the identification of individual strains of bacteria that can be grown as axenic cultures in the laboratory. The discrimination and identification of bacteria within mixed natural populations is also a rapidly developing field that utilizes some of the same techniques, but it is an entirely separate subject (Liu and Stahl 2007, Logue et al. 2008). Methods of bacterial identification can be broadly delimited into genotypic techniques based on profiling an organism's genetic material (primarily its DNA) and phenotypic techniques based on profiling either an organism's metabolic attributes or some aspect of its chemical composition. Genotypic techniques have the advantage over phenotypic methods that they are independent of the physiological state of an organism; they are not influenced by the composition of the growth medium or by the organism's phase of growth. Phenotypic techniques, however, can yield more direct functional information that reveals what metabolic activities are taking place to aid the survival, growth, and development of the organism. These may be embodied, for example, in a microbe's adaptive ability to grow on a certain substrate, or in the degree to which it is resistant to a cohort of antibiotics. Because genotypic and phenotypic approaches are complementary and use different techniques, this review is divided into two parts. However, this division is historical; we predict that as molecular-based identification matures, there will be more and more overlap in the information obtained using different methodologies.

Genotypic methods

Genotypic microbial identification methods can be broken into two broad categories: (1) pattern- or fingerprint-based techniques and (2) sequence-based techniques. Pattern-based techniques typically use a systematic method to produce a series of fragments from an organism's chromosomal DNA. These fragments are then separated by size to generate a profile, or fingerprint, that is unique to that organism and its very close relatives. With enough of this information, researchers can create a library, or database, of fingerprints from known organisms, to which test organisms can be compared. When the profiles of two organisms match, they can be considered very closely related, usually at the strain or species level.
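
The library comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a published protocol: the band sizes are invented, the 2% size tolerance is arbitrary, and the Dice coefficient is used here as one common way of scoring shared bands. The names dice_similarity and best_match are hypothetical.

```python
def dice_similarity(bands_a, bands_b, tolerance=0.02):
    """Dice coefficient between two band profiles (fragment sizes in bp).

    Two bands count as shared when their sizes differ by no more than
    `tolerance`, taken as a fraction of the larger band. Each band in
    `bands_b` can be matched at most once.
    """
    unmatched = sorted(bands_b)
    shared = 0
    for size in sorted(bands_a):
        for i, other in enumerate(unmatched):
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del unmatched[i]
                break
    total = len(bands_a) + len(bands_b)
    return 2 * shared / total if total else 0.0


def best_match(query, library):
    """Return (name, score) for the library profile most similar to the query."""
    return max(
        ((name, dice_similarity(query, bands)) for name, bands in library.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical band sizes in base pairs; not real data.
library = {
    "strain_A": [310, 520, 800, 1200, 2100],
    "strain_B": [290, 450, 800, 1500],
}
query = [312, 518, 805, 1195, 2100]
name, score = best_match(query, library)
print(name, round(score, 3))
```

In practice the tolerance and the similarity cutoff for calling two profiles a match would have to be calibrated against replicate runs of the same strain.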

Sequence-based techniques rely on determining the sequence of a specific stretch of DNA, usually, but not always, associated with a specific gene. In general, the approach is the same as for fingerprinting: a database of specific DNA sequences is generated, and then a test sequence is compared with it. The degree of similarity, or match, between the two sequences is a measurement of how closely related the two organisms are to one another. A number of computer algorithms have been created that can compare multiple sequences to one another and build a phylogenetic tree based on the results (Ludwig and Klenk 2001). The example cited above of using sequence comparisons of the ribosomal RNA (rRNA) gene to distinguish bacteria and archaea demonstrates how this information can be applied to identify relationships among microorganisms.
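
As a rough illustration of the degree of similarity between two sequences, the sketch below computes percent identity for sequences that have already been aligned to the same length. It assumes gaps are written as "-" and that gapped columns are ignored; real analyses rely on dedicated alignment and tree-building software, and the function name and the short sequences are invented for the example.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments; not real gene sequences.
test_sequence = "ACGTACGTAC"
references = {
    "reference_1": "ACGTTCGTAA",
    "reference_2": "ACG-ACGTAC",
}
for ref_name, ref_seq in references.items():
    print(ref_name, percent_identity(test_sequence, ref_seq))
```

A matrix of such pairwise values is the kind of input from which distance-based phylogenetic trees are built.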

Both fingerprinting techniques and sequence-based methods have strengths and weaknesses. Traditionally, sequence-based methods, such as analysis of the 16S rRNA gene, have proved effective in establishing broader phylogenetic relationships among bacteria at the genus, family, order, and phylum levels, whereas fingerprinting-based methods are good at distinguishing strain- or species-level relationships but are less reliable for establishing relatedness above the species or genus level (Vandamme et al. 1996). When these methods are coupled with other phenotypic tests, this creates a polyphasic approach that is the standard for describing new bacterial species (Gillis et al. 2001).

Specific genotyping methodologies

Current protocols for the identification of bacteria may utilize a variety of different fingerprinting- or sequence-based methods, either alone or, more often, in combination. These techniques are constantly evolving to embrace new methodologies that provide both greater accuracy for identification and higher sample throughput. Examples of some of the most widely used techniques are provided below.

Fingerprinting-based methodologies.

At present, fingerprinting techniques are the most commonly used genotypic methods for bacterial identification. The most widely used of these methods are shown in table 1. Repetitive element PCR (rep-PCR), amplified fragment length polymorphism (AFLP), and random amplification of polymorphic DNA all utilize PCR to amplify multiple copies of short DNA fragments using defined sets of primers (Versalovic et al. 1994, Cocconcelli et al. 1995, Vos et al. 1995, Lin et al. 1996). These methods are designed to take advantage of DNA polymorphisms in related organisms that may accrue as a result of a variety of evolutionary mechanisms. Figure 1 provides an illustration of the type of data obtained using rep-PCR. Multiplex PCR uses unique PCR primer sets for more than one organism; these sets can be separated on the basis of amplicon size as a way of rapidly identifying more than one microbe at a time in a mixed sample (Settanni and Corsetti 2007).

Table 1. Genotypic methods used in identifying bacteria.

An example of genotyping archaea from extreme environments using repetitive element polymerase chain reaction. The dendrogram compares the barcodes derived from four genera of extremely thermophilic archaea (Methanocaldococcus, Methanotorris, Sulfolobus, and Thermococcus) associated with high-temperature environments like the hydrothermal vent shown in the photograph on the left, and four genera of halophilic archaea (Halobacterium, Haloferax, Haloarcula, and Halorubrum) associated with high-salt environments like the solar saltern shown on the right. This analysis provides a highly specific method for strain identification of these unique organisms, but it does not provide information regarding their phylogenetic relatedness. Photographs: David Clelann.

Riboprinting does not use PCR, but instead utilizes a sensitive probing method to detect differences in gene patterns between strains and species (Bruce 1996). DuPont's Ribo-Printer system (www2.dupont.com/Qualicon/en_US/) and the DiversiLab system for rep-PCR (http://biomerieux-usa.com/diversilab) have both been developed as commercial products for bacterial identification. All of the methods described here have been used to identify bacteria in a multitude of different ways, many of which can be found in the scientific literature. These applications include source tracking (Meays et al. 2004), authentication of isolates for archival purposes (Cleland et al. 2008), taxonomy and systematics (Vandamme et al. 1996, Gevers et al. 2005), and determination of microbial population structures and community studies (Savelkoul et al. 1999), to name but a few.

Sequence-based methodologies.

The most widely used sequence-based methods are also shown in table 1. Multilocus sequencing is one of the newest and, to date, one of the most powerful methods developed to identify microbial species. In principle, this technique is akin to 16S rRNA gene sequence comparisons, except that, instead of one gene, the fragments of multiple “housekeeping” genes are each sequenced, and the combined sequences are put together, or concatenated, into one long sequence that can be compared with other sequences. Housekeeping genes are generally defined as encoding for proteins that carry out essential cellular processes. A few examples include the gyrase B subunit (gyrB); the alpha and beta subunits of RNA polymerase (rpoA and rpoB); and recA, a gene encoding for an enzyme important in DNA repair; there are a host of others (Zeigler 2003). Housekeeping-gene loci are present in most cells and tend to be conserved among different organisms. As a result, general-purpose primers can be designed that will work using PCR to amplify the same genes across multiple genera.

In practice, the story is a bit more complicated; in most cases, truly universal primer sets are not possible, so primers need to be designed for specific families or orders of bacteria. Two multilocus sequencing strategies are currently used: multilocus sequence typing (MLST) and multilocus sequence analysis (MLSA). MLST is a well-defined approach that uses a suite of 6 to 10 genetic loci, with appropriate primers for each locus to allow PCR amplification and sequencing of the products (usually 400 to 600 base pairs) (Maiden et al. 1998). The resulting concatenated sequences can then be compared with a curated database of sequences for the same organism. The result provides a high-resolution identification of an individual strain that may reveal close evolutionary relationships among individual strains. This technique has proved useful in epidemiological studies, making it possible to track the outbreak of virulent bacterial pathogens (Cooper and Feil 2004). Thus far, MLST, and the robust databases that have been created for it, has been applied only to a relatively small number of common pathogens, using highly prescribed conditions for each organism, both for PCR primers and for database analysis.
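
The concatenate-and-compare step can be sketched as follows. This is a simplified illustration and not the procedure of any particular MLST scheme: the three-locus set, the very short sequences, and the isolate names are all hypothetical, and exact string matching stands in for the curated database lookup.

```python
# Hypothetical three-locus scheme; the text notes that MLST uses 6 to 10 loci.
LOCI = ("gyrB", "recA", "rpoB")


def concatenate(fragments, loci=LOCI):
    """Join per-locus sequences in a fixed locus order."""
    missing = [locus for locus in loci if locus not in fragments]
    if missing:
        raise KeyError(f"missing loci: {missing}")
    return "".join(fragments[locus].upper() for locus in loci)


def identify(fragments, database, loci=LOCI):
    """Return names of database entries whose concatenated sequence matches exactly."""
    query = concatenate(fragments, loci)
    return sorted(
        name for name, ref in database.items() if concatenate(ref, loci) == query
    )


# Hypothetical entries; real fragments are hundreds of base pairs long.
database = {
    "isolate_1": {"gyrB": "ATGGCA", "recA": "GGTCTA", "rpoB": "TTGACC"},
    "isolate_2": {"gyrB": "ATGGCA", "recA": "GGTCTT", "rpoB": "TTGACC"},
}
query = {"rpoB": "ttgacc", "gyrB": "ATGGCA", "recA": "GGTCTT"}
print(identify(query, database))
```

Fixing the locus order matters because the same fragments joined in a different order would give a different concatenated sequence and therefore a spurious mismatch.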

MLSA also involves sequencing of multiple fragments of conserved protein encoding genes, but it uses a more ad hoc approach to choosing the genes for comparative analysis. A smaller subset (≤6) of genes or loci is typically used in MLSA than is used in MLST (Gevers et al. 2005). MLSA is typically used to identify organisms in the broader context of probing species relationships within genera or families, rather than tracking the history of individual strains. As typically applied, it does not have the analytical capacity to detect the very minor changes in sequence patterns that are useful in epidemiologic studies. At present, MLSA is limited by a lack of standardization, and no central databases are available. For example, an analysis of eight recent papers that used MLSA to identify a wide range of bacterial phyla found that anywhere from two to six genes were used in the different individual studies (Devulder et al. 2005, Lodders et al. 2005, Naser et al. 2005, Paradis et al. 2005, Thompson et al. 2005, Richter et al. 2006, Chelo et al. 2007, Richert et al. 2007). Furthermore, no single gene was common to all studies, and most studies used completely different sets of genes. Although the technique proved useful for each individual study, the lack of cohesiveness makes comprehensive comparative analyses impossible.

The genomic future.

The genomes of approximately 2000 strains of bacteria and archaea have now been sequenced or are in the process of being sequenced. This has led to the advent of using whole-genome comparisons between related species to determine the average nucleotide identity between two genomes (Goris et al. 2007). This technique currently defines a species at the genomic level as having 95% average nucleotide identity between two strains. This corresponds to an estimate of at least 70% reassociation for DNA-DNA hybridization, which has been the traditional standard for defining bacterial species (Vandamme et al. 1996). Complete genome comparisons have proved to be more accurate than DNA-DNA hybridization, which requires very stringent protocols and is often difficult to reproduce precisely between laboratories. The rapid advent of the next generation of sequencing technologies is likely to make sequence-based methods more cost-effective and more readily available for use at all levels of bacterial classification and identification.
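
The species criterion can be expressed as a simple calculation. The sketch below assumes that per-fragment identities between two genomes have already been obtained from an alignment tool, which is the difficult part and is not shown; it takes their mean as the average nucleotide identity and applies the 95% cutoff quoted above. The function names and the identity values are illustrative only.

```python
from statistics import mean

SPECIES_ANI_THRESHOLD = 95.0  # percent, the cutoff quoted in the text


def average_nucleotide_identity(fragment_identities):
    """Mean percent identity over aligned genome fragments."""
    if not fragment_identities:
        raise ValueError("at least one fragment identity is required")
    return mean(fragment_identities)


def same_species(fragment_identities, threshold=SPECIES_ANI_THRESHOLD):
    """True when the average identity meets the species-level cutoff."""
    return average_nucleotide_identity(fragment_identities) >= threshold


# Hypothetical per-fragment identities (percent) for two genome pairs.
pair_close = [96.0, 97.0, 98.0]
pair_distant = [90.0, 92.0, 94.0]
print(same_species(pair_close), same_species(pair_distant))
```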

This raises the question of whether it will soon be possible to simply sequence the genome of an isolate to determine what it is and what it does. Can this be done for roughly the same cost as a standard battery of biochemical identification tests and a genotype analysis? Can it be done with the same or better speed and efficiency as current methods? These are the challenges faced by researchers who are developing and using genomics-based identification methods. It is difficult to predict how soon, if ever, whole-genome sequencing will be used as a routine means of bacterial identification; however, it is certain that the multilocus sequencing approaches described above will expand and mature rapidly. While we appreciate the technological challenges of DNA sequencing per se, perhaps an even greater challenge will be the establishment of large, integrated databases that allow for the rapid assembly of sequence data to help researchers make robust comparisons among sequences and predict identifications between bacteria with a high degree of confidence. The lack of standardization for MLSA analysis needs to be addressed so that standards can be developed for comparisons of multiple taxa. Once these are in place, it will become progressively easier to develop MLSA- or MLST-type sequence-based strategies that accurately target multiple genes and can be used to provide a full range of genotypic information for all bacteria and archaea.

Microarrays are another technology that shows promise as a means of simultaneously identifying specific microbes and providing ecological context for the population structure and functional structure of a given microbial community. Microarrays work on the general principle of spotting probes for hundreds or thousands of genes onto a substrate (e.g., a glass slide) and then hybridizing sample DNA or RNA to it. The sample DNA or RNA is labeled with a fluorescent reporter molecule so that samples that hybridize with probes on the microarray can be detected rapidly. In terms of bacterial identification, several iterations of a “phylochip” that utilizes the small-subunit ribosomal gene as a target have been developed, both for specific and for very broad groups of environmental bacteria (Liu et al. 2001, Wilson et al. 2002). Another example is the geochip, which has been developed to identify microbes involved in essential biogeochemical processes such as metal transformations, contaminant degradation, and primary carbon cycling (He et al. 2007). In the clinical realm, the use of microarrays is moving forward rapidly, both for diagnostic purposes and for understanding the fundamentals of disease pathology (Frye et al. 2006, Richter et al. 2006). However, because of their inherent complexity and relative expense, microarrays have yet to be used as standard methods in microbial identification.

Proteomics technologies in bacterial identification and characterization

Although genotypic information is valuable in identifying an organism and determining how it is related to others, methods that probe an organism's phenotypic properties remain critical for understanding the physiological and functional activities of an organism at the protein level. Phenotypic methods that determine the activity of specific enzymes, such as catalase or oxidase, or metabolic functions, such as the ability to degrade lactose, have long been a mainstay of bacterial identification. The advent of new proteomics tools that are based primarily on mass spectrometry and allow rapid interrogation of biomolecules produced by an organism offers an excellent complement to classical microbiological and genomics-based techniques for bacterial classification, identification, and phenotypic characterization. What is also interesting is that some of these techniques are integrating genotypic and proteomic data to provide more complete information. The predominant proteomic technologies that have been explored for bacterial identification and characterization include matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS); electrospray ionization mass spectrometry (ESI-MS); surface-enhanced laser desorption/ionization (SELDI) mass spectrometry; one- or two-dimensional sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE); or the combination of mass spectrometry, gel electrophoresis, and bioinformatics. (See figure 2 for a general integrated proteomics flowchart.) In addition to the above-mentioned classical proteomics approaches, Fourier-transform infrared spectroscopy (FT-IR) has been used to classify and identify bacterial samples (see, e.g., Al-Qadiri et al. 2006).

Overview of proteomics approaches in bacterial identification and characterization. The bacterial sample can be analyzed using either a gel-based or a mass spectrometry (MS)–based approach. In the gel-based approach, bacterial lysate is prepared and run on one- or two-dimensional sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE). The SDS-PAGE can be analyzed by comparing it directly with available gel images in the database, or by excising the protein spots and using trypsin digestion and mass analysis for identification. On the basis of the protein pattern analysis or the identified proteins, or both, the bacterium from which the lysate was prepared will be identified using bioinformatics analysis (database search and computer algorithm analysis). In the MS-based approach, bacterial lysates or the whole cell are analyzed using different mass spectrometry techniques, such as matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF), electrospray ionization mass spectrometry (ESI-MS), liquid chromatography tandem mass spectrometry (LC/MS/MS), and surface-enhanced laser desorption ionization mass spectrometry (SELDI). The unknown bacterium will be identified either by comparing the resulting mass spectra with a collective proteomics database containing mass spectra of known bacteria, or by searching and matching the sequence of a panel of proteins with proteins of known bacteria in the protein database.

Mass spectrometry–based bacterial characterization and identification.

Mass spectrometry is a powerful analytical technique that has been used to identify unknown compounds, quantify known compounds, and elucidate the structure and chemical properties of molecules. The development of mass spectrometry can be traced back to the late 19th century, when it was first used by J. J. Thomson (1899) to measure the mass-to-charge ratio of electrons. With the refinement of this technology throughout the 20th century, mass-spectrometry applications have been expanded to include physical measurement, chemical characterization, and biological identification.

One of the major breakthroughs in mass spectrometry for the analysis of biological molecules was the soft ionization method (i.e., MALDI-TOF-MS and ESI-MS; see figure 3 for a simplified schematic representation). Until the development of the soft ionization method, the application of mass spectrometry to biological materials was limited by the requirement that the sample be in vapor phase before ionization. Soft ionization has made it possible to study larger biological molecules and perform analyte sampling and ionization directly from native samples, including whole cells, using mass spectrometry (Fenn et al. 1989). Since its initial implementation for bacterial identification in 1975, mass spectrometry has helped to resolve time-constraint dilemmas imposed by traditional bacterial identification and characterization methods, and has permitted the generation of protein profiles specific enough for the identification of antibiotic-resistant bacteria and their molecular components. We describe the applications of MALDI-TOF-MS and ESI-MS in bacterial identification and characterization in more detail below.

Schematic representation of soft ionization techniques used in mass spectrometry. (a) Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. The sample to be analyzed (the analyte) is mixed with organic matrices and deposited on the sample plate in the form of a small spot. The mixture is ionized by the laser beam. The resulting ions move toward the mass analyzer, and the mass is detected to obtain the mass spectrum. (b) Electrospray ionization mass spectrometry (ESI-MS). The analyte is mixed with a solvent and sprayed from a narrow tube. Positively charged droplets in the spray move toward the mass-spectrometer sampling orifice under the influence of electrostatic forces and pressure differentials. As the droplets move to the orifice, the solvent evaporates, causing the analyte ions to move toward the analyzer for mass analysis.

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry.

MALDI-TOF-MS is the most commonly used mass spectral method for bacterial analysis because (a) it can be used to analyze whole bacterial cells directly; (b) it can produce relatively simple, reproducible spectral patterns over a broad mass range under well-controlled experimental conditions; (c) the spectral patterns contain characteristic information that can be used to identify and characterize bacterial species by comparing the spectral fingerprints of the unknown species with known library fingerprints; and (d) a number of known, taxonomically important protein markers can be used directly for identifying bacterial species. In 1996, Holland and colleagues published an article on the first use of MALDI-TOF-MS for the rapid identification of whole bacteria, either by comparison with archived reference spectra or by coanalysis with cultures of known bacteria. Following this study, a variety of bacteria have been analyzed using MALDI-TOF-MS, including Staphylococcus species (Edwards-Jones et al. 2000), Mycobacteria species (Pignone et al. 2006), and extremophilic bacteria and archaea (Krader and Emerson 2004). More detailed descriptions of the MALDI-TOF-MS technique and its applications for bacterial characterization can be found in review articles by Lay (2001) and Dare (2006).
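
Comparison of an unknown spectrum with library fingerprints can be illustrated with a small scoring routine. The sketch below is one plausible approach and not the algorithm of any commercial system: peaks are summed into fixed-width m/z bins and spectra are compared by cosine similarity. The bin width, the peak lists, and the function names are assumptions made for the example.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=10.0):
    """Sum peak intensities into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=10.0):
    """Cosine similarity between two binned spectra (0 = no overlap, 1 = identical shape)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(index, 0.0) for index, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_library(query, library, bin_width=10.0):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = [
        (name, cosine_similarity(query, peaks, bin_width))
        for name, peaks in library.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical (m/z, intensity) peak lists; not real spectra.
library = {
    "reference_A": [(4365.0, 100.0), (5096.0, 60.0), (6255.0, 80.0)],
    "reference_B": [(3200.0, 90.0), (7100.0, 70.0)],
}
unknown = [(4367.0, 95.0), (5094.0, 55.0), (6258.0, 85.0)]
ranking = rank_library(unknown, library)
print(ranking[0][0])
```

Fixed bins are a simplification: two peaks that lie close together but on opposite sides of a bin boundary are scored as different, so a tolerance-based peak matcher may be preferable for real data.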

MALDI-TOF-MS is rapidly becoming an accepted technology for bacterial identification, so we will cite only one example of its use. Staphylococcus aureus, a bacterium commonly found on human skin, causes infection during times of uncontrolled growth. Improper use of antibiotics has rendered S. aureus resistant to the methicillin class of antibiotics. The first outbreak of methicillin-resistant S. aureus (MRSA) was recorded in a European hospital in the early 1960s. Since then, the threat of MRSA has spread from hospitals and clinical settings to schools and public communities, thus necessitating the use of techniques that can rapidly identify and discriminate MRSA from methicillin-sensitive S. aureus (MSSA). Edwards-Jones and colleagues (2000) developed a MALDI-TOF-MS method for the identification, typing, and discrimination of MRSA and MSSA. In this method, a sample is taken from a single bacterial colony and smeared onto a sample slide. The appropriate matrix is applied to the sample, which is then analyzed using MALDI-TOF-MS. MALDI-TOF-MS analysis shows that MRSA and MSSA yield distinct spectral peaks that allow for rapid distinction between the two and therefore, hypothetically, for appropriate treatment of S. aureus infections with respect to their resistance to antibiotics.

To serve market needs, several instrument systems based on MALDI-TOF-MS have been developed for bacterial identification and characterization. For instance, Bruker Daltonics' MALDI BioTyper is a system based on the measurement of high-abundance proteins, including many ribosomal proteins in a microorganism (Mellmann et al. 2008). The system is equipped with bioinformatics tools (clustering and phylogenetic dendrogram construction) that allow for the rapid identification and characterization of a known or unknown bacterial culture on the basis of proteomics signatures.

Electrospray ionization mass spectrometry.

ESI-MS also has the potential to play an important role in bacterial characterization, especially for the analysis of cellular components. Proteins expressed by the bacteria can be extracted from the lysed cells and analyzed using ESI-MS. This technique allows for the analysis of both intracellular and extracellular proteins, carbohydrates, and lipids. A major advantage of ESI-MS is its ability to perform tandem mass spectrometry, in which the protein of interest can be fragmented for a second mass analysis that provides protein fragment sequence information, or a peptide fragmentation fingerprint, that can then be applied to a database search to identify that specific protein. This has significantly increased the accuracy of protein identification compared with identification using only molecular weight information from a single MALDI-TOF-MS analysis. A study by Krishnamurthy and Ross (1996) reported that the total analysis time leading to unambiguous bacterial identification in samples is less than 10 minutes, with reproducible results. A system recently developed by Ibis Biosciences (the T5000 Biosensor System) can identify and characterize bacterial strains rapidly and effectively using a combination of PCR and ESI-MS technology (Sampath et al. 2007). Another example of the use of mass spectrometry to analyze nucleic acids for bacterial identification is the combination of MALDI-TOF-MS with MLST analysis using Neisseria meningitidis (Honisch et al. 2007). These efforts exemplify how the integrated genotypic and proteomics technologies provide an even more powerful tool for bacterial identification. Additional descriptions of the ESI-MS technique and its applications for bacterial characterization may be found in a review article by Bons and colleagues (2005).

Another advantage of ESI-MS is its ability to identify target bacteria in mixed samples. The resolution of ESI-MS is such that specific intracellular biomarkers for individual microorganisms can be distinguished with enough confidence that they can be identified in unknown samples. For the protein analysis, comparing the experimentally obtained protein profile of an unknown bacterial species with the profile information found in a proteomics database will allow for the identification and characterization of the unknown species. Identifying the virulence factors of pathogenic bacteria is one of the major applications of liquid chromatography with tandem mass spectrometry (LC/MS/MS) (Chao et al. 2007). Once the putative virulence factors are identified, their functions and mechanism can be further characterized by phenotypic analyses such as mutagenesis, conventional biochemical methods, and structural biology.

Surface-enhanced laser desorption/ionization.

SELDI is a relatively new technology, designed to perform mass spectrometric analysis of protein mixtures retained on chemically (e.g., cationic, anionic, hydrophobic) or biologically (e.g., antibody, ligand) modified chromatographic chip surfaces. These varied chemical and biochemical surfaces allow differential capture of proteins based on the intrinsic properties of the proteins themselves. The SELDI mass spectrometer produces spectra of complex protein mixtures based on the mass-to-charge ratio of the proteins in the mixture and their binding affinity to the chip surface. Differentially expressed proteins may then be determined from these protein profiles by comparing peak intensity. Figure 4 illustrates the general procedure.

Schematic representation of the surface-enhanced laser desorption/ionization technique. This technique utilizes aluminum-based chips, engineered with chemically or biologically modified surfaces. These varied surfaces allow the differential capture of proteins based on the intrinsic properties of the proteins themselves. Bacterial lysates are applied directly to the surfaces, where proteins with affinities to the surface will bind. Following a series of washes to remove nonspecifically bound proteins, the bound proteins are profiled using the integrated mass analyzer to generate a mass spectrum for further analysis. Abbreviations: MS, mass spectrometry; m/z, mass-to-charge ratio.

SELDI technology has been applied extensively to biomarker and protein profiling studies in the field of oncology (Yip and Lomas 2002). By contrast, only a limited number of reports have investigated the applicability of SELDI for detecting and identifying bacterial pathogens (Seo et al. 2004) and virulence factors. Even so, these limited results demonstrate that SELDI technology offers an alternative to the other techniques for exploring bacterial proteomes, ultimately permitting bacterial identification based on a comparison of protein profiles and patterns. One example is its use in distinguishing between the four subspecies of Francisella tularensis, the causative agent of tularemia in humans. Of the four subspecies of F. tularensis, tularensis is the most infectious and is found almost exclusively in North America. Lundquist and colleagues (2005) showed that SELDI time-of-flight mass spectrometry is capable of generating unique and reproducible protein profiles for each subspecies, allowing the subspecies to be distinguished from one another.

Although mass spectrometry has great potential for identifying bacteria by their spectral profile, many factors affect the reproducibility of bacterial spectra. Sample preparation, matrix selection, and differences in instrument quality and performance can all have an impact on the reproducibility of protein profiles (Wunschel et al. 2005). Just as important, the physiological state of the cell may also influence the results of mass spectral analysis, and thus both the growth medium and the growth stage of the cells must be taken into account. Using MALDI-TOF-MS technology to analyze and discriminate foodborne microorganisms, Mazzeo and colleagues (2006) concluded that the growth time did not affect the bacterial protein profile, but that different growth media did affect the mass spectra of Escherichia coli. Similarly, Walker and associates (2002) showed that the culture medium, especially with the addition of blood, as in Columbia blood agar, will cause variation in mass spectra profiles. With so many variables in play, several groups have proposed standardized techniques for MALDI-TOF-MS of whole cells. Liu and colleagues (2007) developed a universal sample preparation method for characterizing gram-positive and gram-negative bacteria using MALDI-TOF-MS.

These results emphasize the need to develop standardized techniques for preparing samples to use in the creation of mass spectra databases for bacterial identification.
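One way to put a number on the reproducibility discussed above is to compare replicate spectra with a similarity score. The sketch below bins each spectrum by m/z and computes the cosine similarity of the binned intensities. The choice of cosine similarity, the bin width, and the function names are assumptions made for illustration; they are not taken from the studies cited here.

```python
import math
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity); hypothetical representation


def bin_spectrum(peaks: List[Peak], bin_width: float = 10.0) -> Dict[int, float]:
    """Sum peak intensities into fixed-width m/z bins."""
    binned: Dict[int, float] = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a: List[Peak], peaks_b: List[Peak],
                      bin_width: float = 10.0) -> float:
    """Cosine similarity of two binned spectra (1.0 = identical shape)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Made-up replicate spectra of one strain grown on two hypothetical media.
medium_1 = [(4365.0, 100.0), (5096.0, 50.0), (7274.0, 80.0)]
medium_2 = [(4366.0, 95.0), (5097.0, 20.0), (9060.0, 60.0)]

print(round(cosine_similarity(medium_1, medium_2), 3))
```

Fixed bins are a simplification: two peaks that straddle a bin boundary land in different bins even when they are close in m/z, so a real pipeline would likely align peaks before scoring.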

Gel-based bacterial characterization and identification.

Bacteria may also be differentiated on the basis of their cellular protein contents. The most established technique for examining cellular protein content is to lyse cells and separate their entire protein complement using SDS-PAGE. This results in a migration pattern of the protein bands that is characteristic for a given bacterial strain (Vandamme et al. 1996). Researchers can identify bacteria by comparing their migration patterns with reference gel patterns in an established database. However, because SDS-PAGE analysis is slow and labor-intensive, and because the application necessitates precise culture conditions that yield fairly large amounts of sample material, it is not particularly useful for rapid identification of bacteria, particularly for field and point-of-care applications.
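The comparison of a migration pattern with reference gel patterns can be sketched in Python as follows. The sketch scores each pair of patterns with the Dice coefficient, a band-sharing measure introduced here for illustration rather than one prescribed by the text. The band positions, the tolerance, and the strain names are hypothetical.

```python
from typing import Dict, List, Tuple


def dice_similarity(bands_a: List[float], bands_b: List[float],
                    tolerance: float = 1.0) -> float:
    """Dice coefficient, 2 * shared / (n_a + n_b), where two bands count as
    shared if their positions differ by at most `tolerance`."""
    unmatched_b = sorted(bands_b)
    shared = 0
    for band in sorted(bands_a):
        for i, other in enumerate(unmatched_b):
            if abs(band - other) <= tolerance:
                shared += 1
                del unmatched_b[i]  # each reference band is matched only once
                break
    total = len(bands_a) + len(bands_b)
    return 2.0 * shared / total if total else 0.0


def best_match(unknown: List[float], reference_db: Dict[str, List[float]],
               tolerance: float = 1.0) -> Tuple[str, float]:
    """Return the reference strain with the highest Dice score."""
    scored = ((name, dice_similarity(unknown, bands, tolerance))
              for name, bands in reference_db.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical band positions, expressed as apparent molecular weights in kDa.
reference_db = {
    "strain_X": [14.0, 22.5, 36.0, 55.0, 97.0],
    "strain_Y": [12.0, 30.0, 45.0, 66.0],
}
unknown = [14.3, 22.0, 36.5, 54.6, 80.0]

name, score = best_match(unknown, reference_db)
print(name, score)
```

The greedy band matching is adequate for well-separated bands but can mispair bands in crowded regions of a gel. The sketch also ignores band intensity, which a real gel-analysis package would usually take into account.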

Two-dimensional gel electrophoresis (2DE), the combination of isoelectric focusing (IEF) and SDS-PAGE, affords a high-resolution separation of up to several thousand spots in a single gel analysis. O'Farrell (1975) introduced 2DE as a method for separating complex mixtures of cellular proteins. In 2DE, proteins are separated in the first dimension by IEF in a pH gradient according to each protein's isoelectric point, and in the second dimension by SDS-PAGE according to each protein's relative molecular weight. After the second-dimension separation, the gel can be stained with standard or sensitive staining solutions so that protein spots can be visualized and analyzed. Protein gel patterns, or 2DE maps, from known bacteria can then be scanned, analyzed, and stored in a reference database. To identify an unknown species, a 2DE map is generated from the unknown sample by running a 2DE gel and is then compared with the 2DE maps in the reference database.

When used as a stand-alone technique, 2DE is most often used for analyzing protein mixtures, isolating proteins of interest for identification, and comparing differential expression patterns of different types of samples. For more complex proteomics analysis, 2DE is greatly enhanced when combined with mass spectrometry. For example, Redmond and associates (2004) analyzed the exosporium of Bacillus anthracis spores by isolating the proteins from the outer casing of the spore using SDS-PAGE and analyzing the isolated protein using delayed-extraction MALDI-TOF-MS. The team identified several proteins associated with the exosporium of B. anthracis. Using these same methods, the whole proteome or subproteome (or both) has also been made available for many other bacteria, including E. coli, Bacillus subtilis, S. aureus, Pseudomonas aeruginosa, and Helicobacter pylori (Nouwens et al. 2000, Hecker et al. 2003, Peng et al. 2005, Pieper et al. 2006). A collective proteomics database with complete 2DE maps and mass spectra of known bacteria will allow investigators to compare and identify unknown bacteria with great efficiency. However, building such a database will not be an easy undertaking.

The database challenge.

One thing modern genomic and proteomic approaches share is the generation of large data sets for any individual sample that is analyzed. These present a real challenge in terms of archiving the data, processing and integrating data from many samples so that it can be used for comparative purposes in the broadest way possible, and developing robust quality control and quality assurance practices. All this should be done in an environment that is accessible and easy to use for a broad group of scientists, many of whom may have little experience with a particular method of analysis. At present there are no comprehensive genomic or proteomic databases for bacterial identification.

There are, however, a great number of databases and tools available for both genomic and proteomic analysis that are essential for providing integrated data for specific types of analysis. Two examples of databases that provide excellent analysis for identification based on 16S rRNA gene sequence analysis are the Ribosomal Database Project (Cole et al. 2007) and Greengenes (DeSantis et al. 2006). On the proteomics side, LC/MS/MS data analysis has improved in several ways and continues to complement 2DE gel analysis, which is still limited by the inability of pattern recognition software to resolve overlapping protein spots (Palagi et al. 2006). Protein identification and characterization have been carried out in greater depth since the development of predictive tools such as GlycoMod and databases such as PhosphoSite, which can reveal possible posttranslational modifications not otherwise accounted for in databases consisting simply of theoretical spectra (Barrett et al. 2005, Witze et al. 2007).

More important, algorithms are evolving to adapt to actual experimental occurrences and parameters. One example of such adaptation is the development of a recent algorithm that predicts missed cleavage sites as they typically occur during protein digests, in order to ease the return of, and render greater confidence to, mass spectra probability matches (Siepen et al. 2007). The goal of the ProDB platform proposed by Wilke and colleagues (2003) is not only to integrate data derived from various databases to enrich a protein profile but also to archive experimental conditions and parameters, such as growth and culture conditions as they might apply to bacteria, to account for the subsequent effects on mass-spectra profile generation. Archiving and integrating specific experimental conditions as part of the proteomic bioinformatics may alleviate the need for stringent culturing standards when attempting to identify proteins that aid the characterization and identification of clinically important bacteria. For a more comprehensive review on proteomic data analysis, see Lisacek and colleagues (2006).
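To make the notion of a missed cleavage concrete, the following Python sketch performs a simple in silico tryptic digest and lists the peptides that result when up to a chosen number of cleavage sites are skipped. It is not the predictive algorithm of Siepen and colleagues. It only enumerates candidates, using the common rule of thumb that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function names and the example sequence are hypothetical.

```python
from typing import List, Tuple


def cleavage_fragments(sequence: str) -> List[str]:
    """Split a protein sequence at every tryptic site (after K or R,
    except when the next residue is P)."""
    if not sequence:
        return []
    fragments, start = [], 0
    for i, residue in enumerate(sequence[:-1]):  # last residue is never a cut
        if residue in "KR" and sequence[i + 1] != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


def tryptic_peptides(sequence: str, max_missed: int = 1) -> List[Tuple[str, int]]:
    """Return (peptide, number_of_missed_cleavages) for all peptides with
    at most `max_missed` skipped cleavage sites."""
    fragments = cleavage_fragments(sequence)
    peptides = []
    for i in range(len(fragments)):
        for missed in range(max_missed + 1):
            if i + missed < len(fragments):
                # Joining adjacent fragments models skipped cleavage sites.
                peptides.append(("".join(fragments[i:i + missed + 1]), missed))
    return peptides


# Made-up sequence, used only to show the rule at work.
for peptide, missed in tryptic_peptides("MKRPAGKLLR", max_missed=1):
    print(peptide, missed)
```

A search engine that allows for missed cleavages has to consider all of these candidates, which enlarges the search space. Predicting which sites are actually likely to be missed, as the cited algorithm aims to do, narrows that set again.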

Conclusions

Advanced genomics and proteomics technologies will continue to play a critical role in bacterial identification and characterization in the 21st century. Bacterial characterization has a number of practical applications, aside from being fundamental to questions of bacterial systematics, taxonomy, and evolution. Rapid identification and discrimination of pathogenic microbes has a major impact on public health in terms of correct diagnosis and timely disease treatment. The ability to identify specific indicator organisms is also important for determining water quality, and an enhanced understanding of the population structure of these organisms can allow researchers to identify the source of a particular contaminant. For example, methods are being developed to determine whether fecal bacteria found in public water supplies are from humans, other mammals, or birds. This kind of information has a significant impact on treatment options.

As researchers learn more about the community fabric of microbial ecosystems, it is likely that we will come to recognize sentinel microbes that will tell us, by their presence (or absence) and abundance, important information about the state of that ecosystem. For example, the identification of microbes that carry out specific transformations of nitrogen or phosphorus might indicate the status of these important nutrients in aquatic or soil ecosystems. Likewise, the presence of microorganisms with certain biodegradative capacities could be an indicator of specific pollutants in an environment. The ability to rapidly identify these individual organisms within populations of thousands of different species is essential for understanding how they will affect our ecosystems. Bacterial characterization will also assist in elucidating the mechanisms that govern microbial pathogenesis, and allow for the discovery of important protein targets essential to the development of vaccines, diagnostic kits, and therapeutics for infectious diseases. It is these kinds of applications that make the continued development of techniques for bacterial identification important both for basic science and for the maintenance of human and environmental health.

Acknowledgments

The authors would like to thank Raymond Cypess (chief executive officer, American Type Culture Collection [ATCC]) and Cohava Gelber (chief scientific and technology officer, ATCC) for their unconditional support in preparing this manuscript. We thank Scott Jenkins for his help in editing the manuscript and David Cleland for his help in preparing figure 1. We also acknowledge that many excellent papers, particularly those providing examples of the use of the technologies described herein, could not be cited because of space limitations.

References cited

Fourier transform infrared spectroscopy, detection and identification of Escherichia coli O157:H7 and Alicyclobacillus strains in apple juice. International Journal of Food Microbiology.

Protein profiling as a diagnostic tool in clinical chemistry: A review. Clinical Chemistry and Laboratory Medicine.

Insight into the virulence of Rickettsia prowazekii by proteomic analysis and comparison with an avirulent strain. Biochimica et Biophysica Acta.

Congruence of evolutionary relationships inside the Leuconostoc–Oenococcus–Weissella clade assessed by phylogenetic analysis of the 16S rRNA gene, dnaA, gyrB, rpoC and dnaK. International Journal of Systematic and Evolutionary Microbiology.

The scientific study of bacteria. In Bacteria in Nature, vol. 1.

Use of the DiversiLab repetitive sequence-based PCR system for genotyping and identification of archaea. Journal of Microbiological Methods.

Development of RAPD protocol for typing of strains of lactic-acid bacteria and enterococci. Letters in Applied Microbiology.

The ribosomal database project (RDP-II): Introducing myRDP space and quality controlled public data.

Rapid bacterial characterization and identification by MALDI-TOF mass spectrometry. In Advanced Techniques in Diagnostic Microbiology.

Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Applied and Environmental Microbiology.

A multigene approach to phylogenetic analysis using the genus Mycobacterium as a model. International Journal of Systematic and Evolutionary Microbiology.

Rapid discrimination between methicillin-sensitive and methicillin-resistant Staphylococcus aureus by intact cell mass spectrometry. Journal of Medical Microbiology.

DNA microarray detection of antimicrobial resistance genes in diverse bacteria. International Journal of Antimicrobial Agents.

In Bergey's Manual of Systematic Bacteriology, vol. 1.

DNA-DNA hybridization values and their relationship to whole-genome sequence similarities. International Journal of Systematic and Evolutionary Microbiology.

GeoChip: A comprehensive microarray for investigating biogeochemical, ecological and environmental processes. ISME Journal: Multidisciplinary Journal of Microbial Ecology.

Proteomics of Staphylococcus aureus: Current state and future challenges. Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences.

Rapid identification of intact whole bacteria based on spectral patterns using matrix-assisted laser desorption/ionization with time-of-flight mass spectrometry. Rapid Communications in Mass Spectrometry.

Automated comparative sequence analysis by base-specific cleavage and mass spectrometry for nucleic acid-based microbial typing. Proceedings of the National Academy of Sciences.

Characterization of Archaea and some extremophilic bacteria using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry.

Rapid identification of bacteria by direct matrix-assisted laser desorption/ionization mass spectrometric analysis of whole cells. Rapid Communications in Mass Spectrometry.

Universal sample preparation method for characterization of bacteria by matrix-assisted laser desorption ionization–time of flight mass spectrometry. Applied and Environmental Microbiology.

Molecular approaches for the measurement of density, diversity, and phylogeny. In Manual of Environmental Microbiology, 3rd ed.

Optimization of an oligonucleotide microchip for microbial identification studies: A non-equilibrium dissociation approach. Environmental Microbiology.

Frequent genetic recombination in natural populations of the marine cyanobacterium Microcoleus chthonoplastes. Environmental Microbiology.

Overview: A phylogenetic backbone and taxonomic framework for prokaryotic systematics. In Bergey's Manual of Systematic Bacteriology.

Discrimination of Francisella tularensis subspecies using surface enhanced laser desorption ionization mass spectrometry and multivariate data analysis. FEMS Microbiology Letters.

Multilocus sequence typing: A portable approach to the identification of clones within populations of pathogenic microorganisms. Proceedings of the National Academy of Sciences.

Matrix-assisted laser desorption ionization–time of flight mass spectrometry for the discrimination of food-borne microorganisms. Applied and Environmental Microbiology.

Evaluation of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) in comparison to 16S rRNA gene sequencing for species identification of nonfermenting bacteria. Journal of Clinical Microbiology.

Application of multilocus sequence analysis (MLSA) for rapid identification of Enterococcus species based on rpoA and pheS genes.

Complementing genomics with proteomics: The membrane subproteome of Pseudomonas aeruginosa PAO1.

Phylogeny of the Enterobacteriaceae based on genes encoding elongation factor Tu and F-ATPase β-subunit. International Journal of Systematic and Evolutionary Microbiology.

Proteomic analysis of the sarcosine-insoluble outer membrane fraction of Pseudomonas aeruginosa responding to ampicilin, kanamycin, and tetracycline resistance. Journal of Proteome Research.

Comparative proteomic analysis of Staphylococcus aureus strains with differences in resistance to the cell wall-targeting antibiotic vancomycin.

Identification of mycobacteria by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Journal of Clinical Microbiology.

The phylogenetic significance of peptidoglycan types: Molecular analysis of the genera Microbacterium and Aureobacterium based upon sequence comparison of gyrB, rpoB, recA and ppk and 16S rRNA genes. Systematic and Applied Microbiology.

Delineation of Borrelia burgdorferi sensu lato species by multilocus sequence analysis and confirmation of the delineation of Borrelia spielmanii sp. nov. International Journal of Systematic and Evolutionary Microbiology.

Rapid identification of emerging infectious agents using PCR and electrospray ionization mass spectrometry. Annals of the New York Academy of Sciences.

Amplified-fragment length polymorphism analysis: The state of an art. Journal of Clinical Microbiology.

Rapid profiling of the infection of Bacillus anthracis on human macrophages using SELDI-TOF mass spectroscopy. Biochemical & Biophysical Research Communications.

The use of multiplex PCR to detect and differentiate food- and beverage-associated microorganisms: A review. Journal of Microbiological Methods.

Prediction of missed cleavage sites in tryptic peptides aids protein identification in proteomics. Journal of Proteome Research.

Phylogeny and molecular identification of vibrios on the basis of multilocus sequence analysis. Applied and Environmental Microbiology.

Genomic fingerprinting of bacteria using repetitive sequence based PCR (rep-PCR). Methods in Cellular and Molecular Biology.

Intact cell mass spectrometry (ICMS) used to type methicillin-resistant Staphylococcus aureus: Media effects and inter-laboratory reproducibility. Journal of Microbiological Methods.

High-density microarray of small-subunit ribosomal DNA probes. Applied and Environmental Microbiology.

Bacterial analysis by MALDI-TOF mass spectrometry: An inter-laboratory comparison. Journal of the American Society for Mass Spectrometry.