Abstract:

Carbohydrate utilization-related and multidrug transporter nucleic acids
and polypeptides, and fragments and variants thereof, are disclosed in
the current invention. In addition, carbohydrate utilization-related and
multidrug transporter fusion proteins, antigenic peptides, and
anti-carbohydrate utilization-related and anti-multidrug transporter
antibodies are encompassed. The invention also provides vectors
containing a nucleic acid of the invention and cells into which the
vector has been introduced. Methods for producing the polypeptides and
methods of use for the polypeptides of the invention are further
disclosed.

Claims:

1. An isolated nucleic acid molecule selected from the group consisting
of: a) a nucleic acid molecule comprising a nucleotide sequence as set
forth in SEQ ID NO: 210, 289, 103, or 197 or a complement thereof; b) a
nucleic acid molecule comprising a nucleotide sequence having at least
90% sequence identity to a nucleotide sequence as set forth in SEQ ID NO:
210, 289, 103, or 197, wherein said nucleic acid molecule encodes a
polypeptide having biological activity; c) a nucleic acid molecule
comprising a fragment of a nucleotide sequence as set forth in SEQ ID NO:
210, 289, 103, or 197, wherein said nucleic acid molecule encodes a
polypeptide having biological activity; d) a nucleic acid molecule
comprising a nucleotide sequence encoding a polypeptide having at least
90% amino acid sequence identity to the amino acid sequence as set forth
in SEQ ID NO: 210, 290, 104, or 198, wherein said polypeptide has
biological activity; and e) a nucleic acid that hybridizes under
stringent conditions to any of a)-d).

2. A plasmid comprising the nucleic acid of claim 1.

3. The plasmid of claim 2, further comprising a nucleic acid encoding a
heterologous polypeptide.

4. A microbial cell comprising the plasmid of claim 2.

5. The microbial cell of claim 4, wherein said microbial cell is a
bacterial cell.

7. An isolated polypeptide selected from the group consisting of: a) a
polypeptide comprising an amino acid sequence as set forth in SEQ ID NO:
210, 290, 104, or 198; b) a polypeptide comprising a fragment of an amino
acid sequence as set forth in SEQ ID NO: 210, 290, 104, or 198, wherein
said polypeptide has biological activity; and c) a polypeptide comprising
an amino acid sequence having at least 90% sequence identity with an
amino acid sequence as set forth in SEQ ID NO: 210, 290, 104, or 198,
wherein said polypeptide has biological activity.

8. The polypeptide of claim 7 further comprising a heterologous amino acid
sequence.

9. An antibody that selectively binds to the polypeptide of claim 7.

10. A method for producing a polypeptide, comprising culturing the cell of
claim 4 under conditions in which a nucleic acid encoding the polypeptide
is expressed, said polypeptide being selected from the group consisting
of: a) a polypeptide comprising an amino acid sequence of SEQ ID NO: 210,
290, 104, or 198; b) a polypeptide comprising a fragment of an amino acid
sequence as set forth in SEQ ID NO: 210, 290, 104, or 198, wherein said
polypeptide has biological activity; c) a polypeptide comprising an amino
acid sequence having at least 90% sequence identity with an amino acid
sequence as set forth in SEQ ID NO: 210, 290, 104, or 198, wherein said
polypeptide has biological activity;

11. A method for modifying the ability of an organism to transport a
carbohydrate into or out of a cell, comprising introducing into said
organism the nucleic acid of claim 1.

12. A method for modifying the ability of an organism to accumulate a
carbohydrate, comprising introducing into said organism the nucleic acid
of claim 1.

13. A method for modifying the ability of an organism to utilize a
carbohydrate as an energy source, comprising introducing into said
organism the nucleic acid of claim 1.

14. A method for modifying the flavor of a food product fermented by a
microorganism, comprising introducing into said microorganism the nucleic
acid of claim 1.

15. A method for modifying the texture of a food product fermented by a
microorganism, comprising introducing into said microorganism the nucleic
acid of claim 1.

16. A method for modifying the ability of an organism to produce a
modified carbohydrate, comprising introducing into said organism the
nucleic acid of claim 1.

17. A method for modifying the ability of an organism to survive food
processing and storage conditions, comprising introducing into said
organism the nucleic acid of claim 1.

18. A method for modifying the ability of an organism to survive in a GI
tract, comprising introducing into said organism the nucleic acid of
claim 1.

19. A method for modifying the ability of an organism to produce a
carbohydrate, comprising introducing into said organism the nucleic acid
of claim 1.

20. A method for modifying the ability of an organism to transport a drug
into or out of a cell, comprising introducing into said organism the
nucleic acid of claim 1.

Description:

CROSS REFERENCE TO RELATED APPLICATIONS

[0001]This application is a divisional of U.S. application Ser. No.
11/074,226, filed Mar. 7, 2005, which claims the benefit of U.S.
Provisional Application Ser. No. 60/551,121, filed Mar. 8, 2004, the
contents of which are herein incorporated by reference in their entirety.

INCORPORATION OF SEQUENCE LISTING

[0002]The entire contents of the compact disk filed in U.S. application
Ser. No. 11/074,226 containing one file entitled "5051.693 Sequence
Listing" (1,313 kb; created Mar. 4, 2005, a paper copy of which is filed
herewith) are incorporated by reference herein.

FIELD OF THE INVENTION

[0003]This invention relates to polynucleotides isolated from lactic acid
bacteria, namely Lactobacillus acidophilus, and polypeptides encoded by
them, as well as methods for using the polypeptides and organisms
expressing them.

BACKGROUND OF THE INVENTION

[0004]Lactobacillus acidophilus is a Gram-positive, rod-shaped,
non-spore-forming, homofermentative bacterium that is a normal inhabitant of the
gastrointestinal and genitourinary tracts. Since its original isolation
by Moro (1900) from infant feces, the "acid loving" organism has been
found in the intestinal tract of humans, breast-fed infants, and persons
consuming high milk, lactose, or dextrin diets. Historically,
Lactobacillus acidophilus is the Lactobacillus species most often
implicated as an intestinal probiotic capable of eliciting beneficial
effects on the microflora of the gastrointestinal tract (Klaenhammer and
Russell (2000) "Species of the Lactobacillus acidophilus complex,"
Encyclopedia of Food Microbiology, 2:1151-1157. Robinson et al., eds.
(Academic Press, San Diego, Calif.)). Lactobacillus acidophilus can
ferment hexoses, including lactose and more complex oligosaccharides, to
produce lactic acid and lower the pH of the environment where the
organism is cultured. Acidified environments (e.g., food, vagina, and
regions within the gastrointestinal tract) can interfere with the growth
of undesirable bacteria, pathogens, and yeasts. The organism is well
known for its acid tolerance, survival in cultured dairy products, and
viability during passage through the stomach and gastrointestinal tract.
Lactobacilli and other commensal bacteria, some of which are considered
probiotic bacteria that "favor life," have been studied extensively for
their effects on human health, particularly in the prevention or
treatment of enteric infections, diarrheal disease, prevention of cancer,
and stimulation of the immune system. Lactobacilli have also been studied
for their influence on dairy product flavor, and functional and textural
characteristics. Genetic characterization of other Lactobacillus species
(e.g., L. johnsonii and L. rhamnosus) has been described (see e.g., U.S.
Pat. No. 6,476,209; U.S. Pat. No. 6,544,772; U.S. Patent Publication Nos.
20020159976, 2003013882 & 20040009490; PCT Publication No. WO
2004/031389; PCT Publication No. 2003/084989; PCT Publication No. WO
2004/020467).

[0005]Bacterial growth requires specific transport systems to import
nutrients from the external environment. Lactic acid bacteria transport
molecules into and out of the cell via three systems: primary transport,
secondary transport, and group translocation. In primary transport,
chemical (primarily ATP), electrical, or solar energy is used to drive
transport. ATP-binding cassette (ABC) transporters are the most abundant
class of primary transport systems in lactic acid bacteria. In this
system, ATP hydrolysis is linked with substrate translocation across the
membrane for both the import of sugars and compatible solutes and the
export of products such as drugs or toxins that are undesirable to the
cell, or cellular components that function outside of the cell, such as
cell wall polysaccharides. In general, ABC transporters are relatively
specific for their substrates, but some are multispecific, such as the
multidrug transporters.

[0006]Secondary transport systems use electrochemical gradients to provide
the energy for sugar translocation. They comprise symporters, which
cotransport two or more solutes, uniporters, which transport one
molecule, and antiporters, which countertransport two or more solutes.
Symporters generally couple the uphill movement of the substrate to the
downhill movement of a proton (or ion), antiporters use the ion gradient
for excretion of a product, and uniporters do not use a coupling ion
(Poolman (2002) Antonie van Leeuwenhoek 82:147-164).
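
The three classes of secondary transporters described above can be told apart by how many solutes move and in which directions. The following Python sketch illustrates that distinction only. The function name and the "in"/"out" encoding are hypothetical and are not part of the disclosure.

```python
def classify_secondary_transporter(directions):
    """Classify a secondary transport event.

    `directions` holds one entry per transported solute, each being
    "in" or "out" (a hypothetical encoding chosen for this sketch).
    """
    if not directions:
        raise ValueError("at least one solute is required")
    if any(d not in ("in", "out") for d in directions):
        raise ValueError('each direction must be "in" or "out"')
    if len(directions) == 1:
        return "uniporter"   # one molecule, no coupling ion
    if len(set(directions)) == 1:
        return "symporter"   # two or more solutes moving the same way
    return "antiporter"      # two or more solutes moving in opposite ways


# A substrate taken up together with a proton
print(classify_secondary_transporter(["in", "in"]))
# A substrate taken up while a product is excreted
print(classify_secondary_transporter(["in", "out"]))
```

The sketch ignores energetics: it does not model the electrochemical gradient that drives the transport.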

[0007]Group translocation involves the phosphoenolpyruvate (PEP)-dependent
phosphotransferase system (PTS), which couples the uptake of a
carbohydrate or alditol with its phosphorylation (Poolman (2002), supra).
The phosphate group originates from the conversion of PEP into pyruvate,
and the subsequent phosphorylation involves the energy coupling proteins,
Enzyme I and HPr, as well as substrate-specific phosphoryl transfer
proteins IIA, IIB and IIC.

[0008]Multidrug transporters may be separated into two major classes,
secondary multidrug transporters and ABC transporters. Secondary
multidrug transporters may be further divided into distinct families,
including the major facilitator superfamily (MFS), the small multidrug
resistance family (SMR), the resistance-nodulation-cell division family
(RND), and the multidrug and toxic compound extrusion family (MATE)
(Putman et al. (2000) Microbiol. Mol. Biol. Reviews 64:672-693).
Secondary multidrug transporters use the electrochemical gradients, as
described herein, to extrude drugs from the cell. ABC-type multidrug
transporters use energy from ATP hydrolysis to pump drugs out of the cell
(Putman et al. (2000), supra).

[0009]Bacteria are able to metabolize various carbohydrates by utilizing
transport proteins and enzymes with different carbohydrate specificities,
in addition to employing diverse regulatory mechanisms, such as
catabolite repression. The isolation and characterization of these
proteins allow for the development of essential probiotic products with
numerous applications, including those that benefit human and/or animal
health, and those concerned with food production and safety. The proteins
can also be used in developing transgenic plants with altered growth or
survival capabilities.

BRIEF SUMMARY OF THE INVENTION

[0010]Compositions and methods for modifying microorganisms and plants are
provided. Compositions of the invention include isolated nucleic acids
from Lactobacillus acidophilus encoding carbohydrate utilization-related
proteins, including proteins of the phosphotransferase system (PTS), ABC
transporters, and other proteins involved in transport, degradation,
and/or synthesis of sugars in Lactobacillus acidophilus. Compositions
also include isolated nucleic acids from Lactobacillus acidophilus that
encode multidrug transporters. Specifically, the present invention
provides isolated nucleic acid molecules comprising, consisting
essentially of and/or consisting of the nucleotide sequence as set forth
in odd numbered SEQ ID NOS:1-321, singly and/or in any combination, and
isolated nucleic acid molecules encoding the amino acid sequence as set
forth in even numbered SEQ ID NOS:2-322, singly and/or in any
combination. Also provided are isolated and/or recombinant polypeptides
comprising, consisting essentially of and/or consisting of an amino acid
sequence encoded by a nucleic acid molecule described herein and/or as
set forth in even numbered SEQ ID NOS:2-322, singly and/or in any
combination. Variant nucleic acids and polypeptides sufficiently
identical to the nucleotide sequences and amino acid sequences set forth
in the Sequence Listing are encompassed by the present invention.
Additionally, fragments and sufficiently identical fragments of the
nucleotide sequences and amino acid sequences are encompassed. Nucleotide
sequences that are complementary to a nucleic acid sequence of the
invention, or that hybridize to a nucleotide sequence of the invention,
are also encompassed.
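
The odd/even numbering convention used above (odd numbered SEQ ID NOS for nucleotide sequences, even numbered SEQ ID NOS for amino acid sequences) can be sketched in Python as follows. The function names are hypothetical. The pairing of each odd numbered sequence with the next even numbered sequence is an assumption inferred from the numbering scheme, not a statement taken from the Sequence Listing.

```python
FIRST_SEQ_ID = 1
LAST_SEQ_ID = 322


def sequence_type(seq_id):
    """Return the molecule type of a SEQ ID NO under the odd/even convention."""
    if not FIRST_SEQ_ID <= seq_id <= LAST_SEQ_ID:
        raise ValueError(f"SEQ ID NO:{seq_id} is outside 1-322")
    return "nucleotide" if seq_id % 2 == 1 else "amino acid"


def assumed_partner(seq_id):
    """Return the SEQ ID NO assumed to pair with `seq_id`.

    Assumption: odd n (nucleotide) pairs with n + 1 (amino acid).
    """
    if sequence_type(seq_id) == "nucleotide":
        return seq_id + 1
    return seq_id - 1


print(sequence_type(289), assumed_partner(289))
```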

[0011]Compositions further include vectors and prokaryotic, eukaryotic and
plant cells for recombinant expression of the nucleic acids described
herein, as well as transgenic microbial and plant populations comprising
the vectors. Also included in the invention are methods for the
recombinant production of the polypeptides of the invention, and methods
for their use. Further included are methods and kits for detecting the
presence of a nucleic acid and/or polypeptide sequence of the invention
in a sample, and antibodies that bind to a polypeptide of the invention.

[0012]The carbohydrate utilization-related and multidrug transporter
molecules of the present invention are useful for the selection and
production of recombinant bacteria, particularly the production of
bacteria with improved fermentative abilities. Such bacteria include, but
are not limited to, bacteria that have a modified ability to synthesize,
transport, accumulate, and/or utilize various carbohydrates, bacteria
with altered flavors or textures, bacteria that produce altered
carbohydrates, and bacteria better able to survive stressful conditions,
such as those encountered in food processing and/or in the
gastrointestinal tract of an animal. The multidrug transporter molecules
of the present invention include those that allow bacteria to better
survive contact with antimicrobial polypeptides, such as bacteriocins or
other toxins. These carbohydrate utilization-related and multidrug
transporter molecules are also useful for modifying plant species.
Transgenic plants comprising one or more sequences of the present
invention may be beneficial economically in that they are more resistant
to environmental stresses, including, but not limited to, plant
pathogens, high salt concentration, or dehydration. They may also be
better able to withstand food processing and storage conditions.

[0014]Compositions further include vectors comprising the nucleic acids
described herein, vectors further comprising a nucleic acid encoding a
heterologous polypeptide, and cells, including bacterial, plant and
eukaryotic cells, containing said vectors. Also included in the invention
are methods for the recombinant production of the polypeptides of the
invention, and methods for their use. Further included are methods and
kits for detecting the presence of a nucleic acid or polypeptide sequence
of the invention in a sample, and antibodies that bind to a polypeptide
of the invention.

[0019]Additionally provided are methods for detecting the presence of a
polypeptide of the invention in a sample, said method comprising
contacting the sample with a compound that selectively binds to the
polypeptide and determining whether the compound binds to the polypeptide
in the sample, wherein the compound that binds to the polypeptide is an
antibody. Also provided is a kit comprising a compound for use in the
methods of the invention and instructions for use.

[0020]The present invention also provides methods for detecting the
presence of a nucleic acid molecule and/or fragments thereof of this
invention in a sample, comprising: a) contacting the sample with a
nucleic acid probe or primer that selectively hybridizes to the nucleic
acid molecule and/or fragment thereof, and b) determining whether the
nucleic acid probe or primer hybridizes to a nucleic acid molecule in the
sample, thereby detecting the presence of a nucleic acid molecule and/or
fragment thereof of this invention in the sample. Also provided are
methods for detecting the presence of a nucleic acid molecule and/or
fragment of the invention in a sample wherein the sample comprises mRNA
molecules and is contacted with a nucleic acid probe. Additionally
provided herein is a kit comprising a compound that selectively
hybridizes to a nucleic acid of the invention, and instructions for use.
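
As a simplified illustration of probe-based detection, the Python sketch below looks for sites in a target strand that are perfectly complementary to a probe. It is a sketch under stated assumptions: real hybridization depends on stringency conditions and tolerates some mismatches, which exact string matching does not model. The function names and the example sequences are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def probe_sites(probe, target_strand):
    """Return 0-based start positions where `probe` could pair perfectly.

    A probe anneals to the strand complementary to itself, so the search
    looks for the probe's reverse complement within the target strand.
    """
    site = reverse_complement(probe).upper()
    target = target_strand.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Invented example: the probe is complementary to "GCCAAG" in the target
print(probe_sites("CTTGGC", "ATGGCCAAGCTTGGA"))
```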

[0023]Additionally provided is a Lactobacillus acidophilus bacterial
strain with a modified ability to survive contact with an antimicrobial
polypeptide or toxin, as compared to a wild-type Lactobacillus
acidophilus, wherein said modified ability is due to expression of at
least one multidrug transport polypeptide as set forth in even SEQ ID
NOs:78-88, 92-94, 124-126, 132, 282-288, 308 and/or 312-322.
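
The range notation used above for the multidrug transport polypeptides can be expanded into individual sequence identifiers. The Python sketch below does so under the assumption that each range is inclusive and that only the even numbered members of each range are meant. The function name is hypothetical.

```python
def expand_even_seq_ids(spec):
    """Expand a list such as "78-88, 92-94, 132" into even SEQ ID NOs."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(x) for x in part.split("-"))
        else:
            low = high = int(part)
        # Keep only even numbers, which denote amino acid sequences
        ids.extend(n for n in range(low, high + 1) if n % 2 == 0)
    return ids


MULTIDRUG_SPEC = "78-88, 92-94, 124-126, 132, 282-288, 308, 312-322"
multidrug_ids = expand_even_seq_ids(MULTIDRUG_SPEC)
print(len(multidrug_ids), multidrug_ids)
```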

[0024]Also provided is a plant, a plant cell and/or a seed of a plant,
having stably incorporated into its genome a DNA construct comprising at
least one nucleotide sequence of this invention selected from the group
consisting of: a) a nucleotide sequence as set forth in any of SEQ ID
NOs:1-321, singly and/or in any combination, or a complement thereof; b) a
nucleotide sequence having at least 90% sequence identity to a nucleotide
sequence as set forth in any of SEQ ID NOs:1-321, singly and/or in any
combination, or a complement thereof; c) a nucleotide sequence comprising
a fragment of a nucleotide sequence as set forth in any of SEQ ID
NOs:1-321, singly and/or in any combination, or a complement thereof; d)
a nucleotide sequence that encodes a polypeptide comprising an amino acid
sequence as set forth in any of SEQ ID NOs:2-322; e) a nucleotide
sequence that encodes a polypeptide comprising an amino acid sequence
having at least 90% sequence identity to the amino acid sequence as set
forth in any of SEQ ID NOs:2-322; and f) a nucleotide sequence that
hybridizes under stringent conditions to any of a)-e).
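
To make the "at least 90% sequence identity" criterion concrete, the following Python sketch computes percent identity for two sequences that have already been aligned. It is a minimal illustration under stated assumptions: the sequences are pre-aligned to equal length with "-" as the gap character, and identity is counted over all alignment columns that are not gap-versus-gap. Other conventions exist, and this sketch does not claim to be the comparison method used for the sequences of the invention.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # ignore columns that are gaps in both sequences
        columns += 1
        if a == b and a != gap:
            matches += 1
    if columns == 0:
        raise ValueError("no alignable columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a, aligned_b, threshold=90.0):
    """True if percent identity is at least `threshold`."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Invented 10-residue example with one mismatch
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```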

[0026]FIG. 2. Carbohydrate utilization in Lactobacillus acidophilus. This
diagram shows carbohydrate transporters and hydrolases as predicted by
transcriptional profiles. Protein names and EC numbers are specified for
each element. PTS transporters are shown in black. GPH transporters are
shown in light gray. ABC transporters are shown in dark gray.

DETAILED DESCRIPTION OF THE INVENTION

[0027]The present invention relates to carbohydrate utilization-related
and multidrug transport molecules from Lactobacillus acidophilus.
Nucleotide and amino acid sequences of the carbohydrate
utilization-related and multidrug transport molecules are provided. The
sequences are useful for modifying microorganisms, cells and plants for
enhanced properties.

[0028]As used herein, "a," "an" and "the" can be plural or singular as
used throughout the specification and claims. For example "a" cell can
mean a single cell or a multiplicity of cells.

[0029]Also as used herein, "and/or" refers to and encompasses any and all
possible combinations of one or more of the associated listed items, as
well as the lack of combinations when interpreted in the alternative
("or").

[0030]By "carbohydrate utilization-related" molecules or genes is meant
novel sequences from Lactobacillus acidophilus that encode proteins
involved in the utilization of carbohydrate molecules, including, but not
limited to, the synthesis, transport, or degradation of carbohydrates. By
"multidrug transporter" molecules is meant those that are involved in the
transport of antimicrobial polypeptides such as bacteriocins, or other
drugs or toxins. See Table 1 for specific carbohydrate
utilization-related and multidrug transporter molecules of the present
invention. The full-length gene sequences are referred to as
"carbohydrate utilization-related sequences" or "multidrug transporter
sequences," showing that they have similarity to carbohydrate
utilization-related genes or multidrug transporter genes, respectively.
The invention further provides fragments and variants of these
carbohydrate utilization-related sequences or multidrug transporter
sequences, which can also be used to practice methods of the present
invention.

[0031]By "carbohydrate" is meant an organic compound containing carbon,
hydrogen, and oxygen, usually in the ratio 1:2:1. Carbohydrates include,
but are not limited to, sugars, starches, celluloses, and gums. As used
herein, the terms "gene" and "recombinant gene" refer to nucleic acids
comprising an open reading frame, particularly those encoding a
carbohydrate utilization-related protein or a multidrug transporter
protein. Isolated nucleic acids of the present invention comprise nucleic
acid sequences encoding carbohydrate utilization-related proteins or
multidrug transporter proteins, nucleic acid sequences encoding the amino
acid sequences set forth in even numbered SEQ ID NOS:2-322, the nucleic
acid sequences set forth in odd numbered SEQ ID NOS:1-321, and variants
and fragments thereof. The present invention also encompasses antisense
nucleic acids, as described below.
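
The 1:2:1 ratio of carbon, hydrogen, and oxygen given in the definition of "carbohydrate" above can be checked for a simple molecular formula. The Python sketch below is illustrative only. The function names are hypothetical, and the parser handles only plain formulas without brackets. Glucose (C6H12O6) fits the ratio exactly, while sucrose (C12H22O11) does not, which is consistent with the word "usually" in the definition.

```python
import re


def atom_counts(formula):
    """Parse a simple molecular formula such as "C6H12O6" into element counts."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def has_1_2_1_ratio(formula):
    """True if the formula contains only C, H, and O in the ratio 1:2:1."""
    counts = atom_counts(formula)
    if set(counts) != {"C", "H", "O"}:
        return False
    return counts["H"] == 2 * counts["C"] and counts["O"] == counts["C"]


print(has_1_2_1_ratio("C6H12O6"))    # glucose
print(has_1_2_1_ratio("C12H22O11"))  # sucrose
```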

[0032]In addition, isolated polypeptides and proteins having carbohydrate
utilization-related activity or multidrug transporter activity, and
variants and fragments thereof, are encompassed, as well as methods for
producing those polypeptides. For purposes of the present invention, the
terms "protein" and "polypeptide" are used interchangeably. The
polypeptides of the present invention have carbohydrate
utilization-related protein activity or multidrug transporter activity.
Carbohydrate utilization-related protein activity or multidrug
transporter activity refers to a biological or functional activity as
determined in vivo or in vitro according to standard assay techniques.
These activities include, but are not limited to, the ability to
synthesize a carbohydrate, the ability to transport a carbohydrate into
or out of a cell, the ability to degrade a carbohydrate, the ability to
regulate the concentration of a carbohydrate in a cell, the ability to
bind a carbohydrate, and the ability to transport a drug or toxin into or
out of a cell.

[0033]The structures of the various types of bacterial transporters are
well known in the art. The ATP-binding cassette (ABC) superfamily of
transporters consists of proteins with four core domains (Higgins et al.
(1986) Nature 323:448-450; Hyde et al. (1990) Nature 346:362-365; Higgins
(2001) Res. Microbiol. 152:205-210). Typically there are two
transmembrane domains with six membrane-spanning alpha helices per
domain, and two ATP-binding domains that contain the core amino acids by
which the transporters are defined (Higgins (2001) supra), as well as
the other conserved motifs including the Walker A and Walker B motifs
(Walker et al. (1982) EMBO J. 1:945-951; Prosite Ref No. PDOC00185).
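
A conserved motif such as Walker A can be located in a protein sequence by pattern matching. The Python sketch below assumes the commonly cited Walker A (P-loop) consensus [AG]-x(4)-G-K-[ST]. That pattern comes from the general literature and is not quoted from this specification. The test sequence is invented, and a match indicates only a candidate site, not a confirmed ATP-binding domain.

```python
import re

# Commonly cited Walker A (P-loop) consensus: [AG]-x(4)-G-K-[ST].
# Assumption: this pattern is not taken from the specification itself.
# The lookahead lets overlapping candidate sites be reported.
WALKER_A = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_walker_a(protein):
    """Return (0-based start, matched text) for each Walker A-like site."""
    return [(m.start(), m.group(1)) for m in WALKER_A.finditer(protein.upper())]


# Invented sequence containing one candidate site
print(find_walker_a("MKTLLGPSGSGKSTLLR"))
```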

[0035]The phosphotransferase system (PTS) proteins include enzyme-I
(Prosite Ref. No. PDOC00527), the phosphoryl carrier proteins (HPr)
(Prosite Ref. No. PDOC00318), and the sugar-specific permease, which
consists of at least three structurally distinct domains (Prosite Ref.
Nos. PDOC00528; PDOC00795). The HPr protein contains two conserved
phosphorylation sites, a histidine residue at the amino-terminal side
that is phosphorylated by Enzyme I, and a serine residue at the
carboxy-terminal side of the protein that may be phosphorylated by an
ATP-dependent protein kinase (de Vos (1996) Antonie van Leeuwenhoek
70:223-242).

[0036]Members of the major facilitator superfamily (MFS) of multidrug
transporters have either 12 or 14 transmembrane segments. Members of the
small multidrug resistance family (SMR) of multidrug transporters are
thought to form a tightly packed four-helix antiparallel bundle. Members
of the resistance-nodulation-cell division family (RND) contain a single
N-terminal transmembrane segment and a large C-terminal periplasmic
domain (Putman et al. (2000) Microbiol. Mol. Biol. Reviews 64:672-693).
Conserved motifs within each of these types of multidrug transporters and
also throughout the multidrug transporters of the MFS, SMR, and RND
families, as well as specific proteins from various bacteria (with
Accession Nos.) have been described (Putman et al. (2000) supra).

[0037]The nucleic acid and protein compositions encompassed by the present
invention are isolated or substantially purified. By "isolated" or
"substantially purified" is meant that the nucleic acid or protein
molecules, or biologically active fragments or variants thereof, are
substantially or essentially free from components normally found in
association with the nucleic acid or protein in its natural state. Such
components include other cellular material, culture medium from
recombinant production, and/or various chemicals used in chemically
synthesizing the proteins or nucleic acids. Preferably, an "isolated"
nucleic acid of the present invention is free of nucleic acid sequences
that flank the nucleic acid of interest in the genomic DNA of the
organism from which the nucleic acid was obtained (such as coding
sequences present at the 5' or 3' ends). However, the molecule may
include some additional bases or moieties that do not deleteriously
affect the basic characteristics of the composition. For example, in
various embodiments, the isolated nucleic acid contains less than 5 kb, 4
kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleic acid sequence normally
associated with the genomic DNA in the cells from which it was obtained.
Similarly, a substantially purified protein has less than about 30%, 20%,
10%, 5%, or 1% (by dry weight) of contaminating protein, or
non-carbohydrate utilization-related protein. When the protein is
recombinantly produced, preferably culture medium represents less than
30%, 20%, 10%, or 5% of the volume of the protein preparation, and when
the protein is produced chemically, preferably the preparations have less
than about 30%, 20%, 10%, or 5% (by dry weight) of chemical precursors,
or non-carbohydrate utilization-related chemicals.
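
The dry-weight purity levels listed above reduce to simple arithmetic. The Python sketch below computes the percentage of contaminating protein in a preparation and reports the strictest listed level that the preparation satisfies. The function names and the example masses are hypothetical and serve only to illustrate the calculation.

```python
# Levels listed for contaminating protein (percent by dry weight)
THRESHOLDS_PERCENT = (30, 20, 10, 5, 1)


def contaminant_percent(target_mg, contaminant_mg):
    """Contaminating protein as a percentage of total dry weight."""
    total = target_mg + contaminant_mg
    if total <= 0:
        raise ValueError("total dry weight must be positive")
    return 100.0 * contaminant_mg / total


def strictest_threshold_met(percent):
    """Return the smallest listed level that `percent` is below, or None."""
    met = [t for t in THRESHOLDS_PERCENT if percent < t]
    return min(met) if met else None


# Invented example: 92 mg target protein and 8 mg contaminating protein
p = contaminant_percent(92, 8)
print(p, strictest_threshold_met(p))
```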

[0038]The compositions and methods of the present invention can be used to
modulate the function of the carbohydrate utilization-related or
multidrug transporter molecules of Lactobacillus acidophilus. By
"modulate," "alter," or "modify" is meant the up- or downregulation of a
target biological activity. Proteins of the invention are useful in
modifying the biological activities of lactic acid bacteria, and also in
modifying the nutritional or health-promoting characteristics of foods
fermented by such bacteria. Nucleotide molecules of the invention are
useful in modulating carbohydrate utilization-related or multidrug
transporter protein expression by lactic acid bacteria. Up- or
downregulation of expression from a nucleic acid of the present invention
is encompassed. Upregulation may be accomplished, for example, by
providing multiple gene copies, modulating expression by modifying
regulatory elements, promoting transcriptional or translational
mechanisms, or other means. Downregulation may be accomplished, for
example, by using known antisense and gene silencing techniques.

[0040]The polypeptides of the present invention or microbes expressing
them are useful as nutritional additives or supplements, and as additives
in dairy and fermentation processing. The nucleic acid sequences, encoded
polypeptides, and microorganisms expressing them are useful in the
manufacture of milk-derived products, such as cheeses, yogurt, fermented
milk products, sour milks, and buttermilk. Microorganisms that express
polypeptides of the invention may be probiotic organisms. By "probiotic"
is meant a live microorganism that survives passage through the
gastrointestinal tract and has a beneficial effect on the subject. By
"subject" is meant an organism that comes into contact with a
microorganism expressing a protein of the present invention. Subject may
refer to humans and other animals.

[0041]In addition to the carbohydrate utilization-related and multidrug
transporter nucleotide sequences and fragments and variants thereof as
disclosed herein, the nucleic acids of the current invention also
encompass homologous nucleic acid sequences identified and isolated from
other organisms or cells by hybridization with entire or partial
sequences obtained from the carbohydrate utilization-related and
multidrug transporter nucleotide sequences or variants and fragments
thereof as disclosed herein.

Fragments and Variants

[0042]The invention provides isolated nucleic acids comprising nucleotide
sequences encoding carbohydrate utilization-related and multidrug
transporter proteins, as well as the carbohydrate utilization-related and
multidrug transporter proteins encoded thereby. By "carbohydrate
utilization-related protein" is meant a protein having an amino acid
sequence as set forth in even numbered SEQ ID NOS:2-322. Fragments and
variants of these nucleotide sequences and encoded proteins are also
provided. By "fragment" of a nucleotide sequence or protein is meant a
portion of the nucleotide or amino acid sequence.

[0043]Fragments of the nucleic acids disclosed herein can be used as
hybridization probes to identify carbohydrate
utilization-related-encoding nucleic acids or multidrug
transporter-encoding nucleic acids, or can be used as primers in
amplification protocols [e.g., polymerase chain reaction (PCR)] or
mutation of carbohydrate utilization-related or multidrug transporter
nucleic acids. Fragments of nucleic acids of this invention can also be
bound to a physical substrate to comprise what may be considered a macro-
or microarray (see, for example, U.S. Pat. No. 5,837,832; U.S. Pat. No.
5,861,242; WO 89/10977; WO 89/11548; WO 93/17126; U.S. Pat. No.
6,309,823). Such arrays or "chips" of nucleic acids may be used to study
gene expression or to identify nucleic acids with sufficient identity to
the target sequences.

[0044]The present invention further provides a nucleic acid array or chip,
i.e., a multitude of nucleic acids (e.g., DNA) as molecular probes
precisely organized or arrayed on a solid support, which allow for the
sequencing of genes, the study of mutations contained therein and/or the
analysis of the expression of genes, as such arrays and chips are
currently of interest given their very small size and their high capacity
in terms of number of analyses.

[0045]The function of these nucleic acid arrays/chips is based on
molecular probes, mainly oligonucleotides, which are attached to a
carrier having a size of generally a few square centimeters or more, as
desired. For an analysis, the carrier, such as in a DNA array/chip, is
coated with DNA probes (e.g., oligonucleotides) that are arranged at a
predetermined location or position on the carrier. A sample containing a
target nucleic acid and/or fragments thereof to be analyzed, for example
DNA or RNA or cDNA, that has been labeled beforehand, is contacted with
the DNA array/chip leading to the formation, through hybridization, of a
duplex. After a washing step, analysis of the surface of the chip allows
any hybridizations to be located by means of the signals emitted by the
labeled target. A hybridization fingerprint results, which, by computer
processing, allows retrieval of information such as the expression of
genes, the presence of specific fragments in the sample, the
determination of sequences and/or the identification of mutations.
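
The computer processing of a hybridization fingerprint described above can be sketched as a lookup from carrier positions to probes. In the Python sketch below, the probe names, the background value, and the signal-to-background cutoff are all hypothetical. The sketch shows only how predetermined probe positions allow signals to be traced back to specific probes.

```python
def hybridization_fingerprint(layout, signals, background, min_ratio=2.0):
    """Return the probes whose signal is at least `min_ratio` times background.

    layout  : dict mapping (row, column) -> probe name
    signals : dict mapping (row, column) -> signal measured after washing
    """
    if background <= 0:
        raise ValueError("background must be positive")
    hits = {}
    for position, probe in layout.items():
        signal = signals.get(position, 0.0)  # missing spots count as no signal
        if signal >= min_ratio * background:
            hits[probe] = position
    return hits


# Invented 1 x 2 array with hypothetical probe names
layout = {(0, 0): "probe_A", (0, 1): "probe_B"}
signals = {(0, 0): 500.0, (0, 1): 40.0}
print(hybridization_fingerprint(layout, signals, background=50.0))
```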

[0046]In one embodiment of this invention, hybridization between target
nucleic acids and nucleic acids of the invention, used in the form of
probes and deposited or synthesized in situ on a DNA chip/array, can be
determined by means of fluorescence, radioactivity, electronic detection
or the like, as are well known in the art.

[0047]In another embodiment, the nucleotide sequences of the invention can
be used in the form of a DNA array/chip to carry out analyses of the
expression of Lactobacillus acidophilus genes. This analysis is based on
DNA array/chips on which probes, chosen for their specificity to
characterize a given gene or nucleotide sequence, are present. The target
sequences to be analyzed are labeled before being hybridized onto the
chip. After washing, the labeled complexes are detected and quantified,
with the hybridizations being carried out at least in duplicate.
Comparative analyses of the signal intensities obtained with respect to
the same probe for different samples and/or for different probes with the
same sample allow, for example, the detection of differential
transcription of RNA derived from the sample.
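
The comparison of signal intensities for one probe across two samples can be sketched as follows. This is a minimal illustration only: the function name, the use of a log2 ratio of replicate means, and the intensity values are assumptions made for the example and are not taken from the disclosure.

```python
import math
from statistics import mean


def log2_fold_change(sample_a, sample_b):
    """Return log2(mean(sample_a) / mean(sample_b)) for a single probe.

    Each argument is a list of replicate signal intensities. At least two
    values are required, mirroring hybridizations carried out at least in
    duplicate.
    """
    if len(sample_a) < 2 or len(sample_b) < 2:
        raise ValueError("at least duplicate measurements are required")
    if min(sample_a) <= 0 or min(sample_b) <= 0:
        raise ValueError("signal intensities must be positive")
    return math.log2(mean(sample_a) / mean(sample_b))


# Hypothetical duplicate intensities for one probe in two samples.
induced = [800.0, 840.0]
uninduced = [200.0, 210.0]
ratio = log2_fold_change(induced, uninduced)
```

A positive value suggests higher transcript abundance in the first sample; any cutoff used to call a gene differentially transcribed would be an additional assumption.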

[0048]In yet another embodiment, arrays/chips containing nucleotide
sequences of the invention can comprise nucleotide sequences specific for
other microorganisms, which allows for serial testing and rapid
identification of the presence of a microorganism in a sample.

[0049]In a further embodiment, the principle of the DNA array/chip can
also be used to produce protein arrays/chips on which the support has
been coated with a polypeptide and/or an antibody of this invention, or
arrays thereof, in place of the nucleic acid. These protein arrays/chips
make it possible, for example, to analyze the biomolecular interactions
induced by the affinity capture of targets onto a support coated, e.g.,
with proteins, by surface plasmon resonance (SPR). The polypeptides or
antibodies of this invention, capable of specifically binding antibodies
or polypeptides derived from the sample to be analyzed, can be used in
protein arrays/chips for the detection and/or identification of proteins
and/or peptides in a sample.

[0050]Thus, the present invention provides a microarray or microchip
comprising various nucleic acids of this invention in any combination,
including repeats, as well as a microarray comprising various
polypeptides of this invention in any combination, including repeats.
Also provided is a microarray comprising antibodies that specifically
react with various polypeptides of this invention, in any combination,
including repeats.

[0051]By "nucleic acid" is meant DNA molecules (e.g., cDNA or genomic DNA)
and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated
using nucleotide analogs. The nucleic acid can be single-stranded or
double-stranded, but is typically double-stranded DNA. A fragment of a
nucleic acid encoding a carbohydrate utilization-related protein or a
multidrug transporter protein may encode a protein fragment that is
biologically active, or it may be used as a hybridization probe or PCR
primer as described herein. A biologically active fragment of a
polypeptide disclosed herein can be prepared by isolating a portion of
one of the nucleotide sequences of the invention, expressing the encoded
portion of the protein (e.g., by recombinant expression in vitro), and
assessing the activity of the encoded portion of the protein. Fragments
of nucleic acids encoding carbohydrate utilization-related or multidrug
transporter proteins comprise at least about 5, 10, 15, 20, 25, 30, 35,
40, 45, 50, 75, 100, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650,
700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300,
1350, 1400, 1450, 1500, 1600, 1700, 1800, 1900, 2000, 2200, or 2500
contiguous nucleotides, including any number between 5 and 2500 not
specifically recited herein, or up to the total number of nucleotides
present in a full-length carbohydrate utilization-related or multidrug
transporter nucleotide sequence as disclosed herein (for example, 432 for
SEQ ID NO:1, 369 for SEQ ID NO:3, etc.).

[0052]Fragments of amino acid sequences include polypeptide fragments
suitable for use as immunogens to raise anti-carbohydrate
utilization-related or anti-multidrug transporter antibodies. Fragments
include peptides comprising amino acid sequences sufficiently identical
to or derived from the amino acid sequence of a carbohydrate
utilization-related or multidrug transporter protein, or partial-length
protein, of the invention and exhibiting at least one activity of a
carbohydrate utilization-related or multidrug transporter protein, but
which include fewer amino acids than the full-length proteins disclosed
herein. Typically, biologically active portions comprise a domain or
motif with at least one activity of the carbohydrate utilization-related
or multidrug transporter protein. A biologically active portion of a
carbohydrate utilization-related or multidrug transporter protein can be
a polypeptide that is, for example, 10, 25, 50, 100, 150, 200, 250, 300,
350, 400, 450, 500, 550, 600, or 650 contiguous amino acids in length, or
any number between 10 and 650 not specifically recited herein, up to the
total number of amino acids present in a full-length protein of the
current invention (for example, 144 for SEQ ID NO:2, 123 for SEQ ID NO:4,
etc.). Such biologically active portions can be prepared by recombinant
techniques and evaluated for one or more of the functional activities of
a native carbohydrate utilization-related or multidrug transporter
protein. As used here, a fragment comprises at least 5 contiguous amino
acids of any of even numbered SEQ ID NOS:2-322. The invention encompasses
other fragments, however, such as any fragment in the protein greater
than 6, 7, 8, or 9 amino acids.
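
As an illustration of what "contiguous" means for such fragments, the following sketch enumerates every contiguous fragment of a chosen length from a sequence. The helper name and the example peptide are hypothetical; the peptide is not one of the disclosed sequences.

```python
def contiguous_fragments(sequence, length):
    """Return every contiguous fragment of the given length, in order."""
    if length < 1 or length > len(sequence):
        raise ValueError("length must be between 1 and len(sequence)")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 10-residue peptide used only for illustration.
peptide = "MKTAYIAKQR"
five_mers = contiguous_fragments(peptide, 5)
```

A sequence of N residues yields N - 4 fragments of 5 contiguous residues, so the ten-residue example gives six.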

[0053]Variants of the nucleotide and amino acid sequences are encompassed
in the present invention. By "variant" is meant a sufficiently identical
sequence. Accordingly, the invention encompasses isolated nucleic acids
that are sufficiently identical to the nucleotide sequences encoding
carbohydrate utilization-related proteins and multidrug transporter
proteins in even numbered SEQ ID NOS:2-322, or nucleic acids that
hybridize to a nucleic acid of odd numbered SEQ ID NOS:1-321, or a
complement thereof, under stringent conditions. Variants also include
polypeptides encoded by the variant nucleotide sequences of the present
invention. In addition, polypeptides of the current invention have an
amino acid sequence that is sufficiently identical to an amino acid
sequence set forth in even numbered SEQ ID NOS:2-322. By "sufficiently
identical" is meant that a first amino acid or nucleotide sequence
contains a sufficient or minimal number of equivalent or identical amino
acid residues as compared to a second amino acid or nucleotide sequence,
thus providing a common structural domain and/or indicating a common
functional activity. Conservative variants include those sequences that
differ due to the degeneracy of the genetic code.

[0054]In general, amino acid or nucleotide sequences that have at least
about 45%, 55%, or 65% identity, preferably at least about 70% or 75%
identity, more preferably at least about 80%, 85% or 90%, most preferably
at least about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%
sequence identity to any of the amino acid sequences of even numbered SEQ
ID NOS:2-322 or any of the nucleotide sequences of odd numbered SEQ ID
NOS:1-321, respectively, are defined herein as sufficiently identical.
Variant proteins encompassed by the present invention are biologically
active, that is they retain the desired biological activity of the native
protein, that is, carbohydrate utilization-related activity or multidrug
transporter activity as described herein. A biologically active variant
of a protein of the invention may differ from that protein by as few as
1-15 amino acid residues, as few as 1-10, such as 6-10, as few as 5, as
few as 4, 3, 2, or even 1 amino acid residue.

[0055]Naturally occurring variants may exist within a population (e.g.,
the Lactobacillus acidophilus population). Such variants can be
identified by using well-known molecular biology techniques, such as the
polymerase chain reaction (PCR), and hybridization as described below.
Synthetically derived nucleotide sequences, for example, sequences
generated by site-directed mutagenesis or PCR-mediated mutagenesis, that
still encode a carbohydrate utilization-related protein or multidrug
transporter protein, are also included as variants. One or more
nucleotide or amino acid substitutions, additions, or deletions can be
introduced into a nucleotide or amino acid sequence disclosed herein,
such that the substitutions, additions, or deletions are introduced into
the encoded protein. The additions (insertions) or deletions
(truncations) may be made at the N-terminal or C-terminal end of the
native protein, or at one or more sites in the native protein. Similarly,
a substitution of one or more nucleotides or amino acids may be made at
one or more sites in the native protein.

[0056]For example, conservative amino acid substitutions may be made at
one or more predicted, preferably nonessential amino acid residues. A
"nonessential" amino acid residue is a residue that can be altered from
the wild-type sequence of a protein without altering the biological
activity, whereas an "essential" amino acid is required for biological
activity. A "conservative amino acid substitution" is one in which the
amino acid residue is replaced with an amino acid residue with a similar
side chain. Families of amino acid residues having similar side chains
are known in the art. These families include amino acids with basic side
chains (e.g., lysine, arginine, histidine), acidic side chains (e.g.,
aspartic acid, glutamic acid), uncharged polar side chains (e.g.,
glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine),
nonpolar side chains (e.g., alanine, valine, leucine, isoleucine,
proline, phenylalanine, methionine, tryptophan), beta-branched side
chains (e.g., threonine, valine, isoleucine) and aromatic side chains
(e.g., tyrosine, phenylalanine, tryptophan, histidine). Such
substitutions would not be made for conserved amino acid residues, or for
amino acid residues residing within a conserved motif, where such
residues are essential for protein activity.
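
The side-chain families listed above can be expressed as a simple lookup, sketched below using one-letter amino acid codes. The function name is hypothetical, and treating two residues as a conservative pair whenever they share at least one listed family is an assumption of this sketch, since the families overlap (for example, threonine is both uncharged polar and beta-branched).

```python
# Side-chain families as listed in the text, in one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original, replacement):
    """Return True if the two different residues share a side-chain family."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in family and replacement in family
        for family in SIDE_CHAIN_FAMILIES.values()
    )
```

Such a check says nothing about whether the position is essential; that determination still requires the activity assays described herein.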

[0057]Alternatively, mutations can be made randomly along all or part of
the length of the carbohydrate utilization-related or multidrug
transporter coding sequence, such as by saturation mutagenesis. The
mutants can be expressed recombinantly, and screened for those that
retain biological activity by assaying for carbohydrate
utilization-related or multidrug transporter activity using standard
assay techniques. Methods for mutagenesis and nucleotide sequence
alterations are known in the art. See, for example, Kunkel (1985) Proc.
Natl. Acad. Sci. USA 82:488-492; Kunkel et al. (1987) Methods in Enzymol.
Molecular Biology (MacMillan Publishing Company, New York) and the
references cited therein. Obviously, the mutations made in the DNA
encoding the variant must not disrupt the reading frame and preferably
will not create complementary regions that could produce secondary mRNA
structure. See, EP Patent Application Publication No. 75,444. Guidance as
to appropriate amino acid substitutions that do not affect biological
activity of the protein of interest may be found in the model of Dayhoff
et al. (1978) Atlas of Protein Sequence and Structure (Natl. Biomed. Res.
Found., Washington, D.C.), herein incorporated by reference.

[0058]The deletions, insertions, and substitutions of the protein
sequences encompassed herein are not expected to produce radical changes
in the characteristics of the protein. However, when it is difficult to
predict the exact effect of the substitution, deletion, or insertion in
advance of doing so, one skilled in the art will appreciate that the
effect will be evaluated by routine screening assays. That is, the
activity can be evaluated by comparing the activity of the modified
sequence with the activity of the original sequence. See the "Methods of
Use" section below for examples of assays that may be used to measure
carbohydrate utilization-related activity or multidrug transporter
activity.

[0059]Variant nucleotide and amino acid sequences of the present invention
also encompass sequences derived from mutagenic and recombinogenic
procedures such as DNA shuffling. With such a procedure, one or more
different carbohydrate utilization-related or multidrug transporter
protein coding regions can be used to create a new carbohydrate
utilization-related protein or a new multidrug transporter protein
possessing the desired properties. In this manner, libraries of
recombinant polynucleotides are generated from a population of related
sequence polynucleotides comprising sequence regions that have
substantial sequence identity and can be homologously recombined in vitro
or in vivo. For example, using this approach, sequence motifs encoding a
domain of interest may be shuffled between the carbohydrate
utilization-related or multidrug transporter gene of the invention and
other known carbohydrate utilization-related or multidrug transporter
genes to obtain a new gene coding for a protein with an improved property
of interest, such as an increased Km in the case of an enzyme.
Strategies for such DNA shuffling are known in the art. See, for example,
Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751; Stemmer (1994)
Nature 370:389-391; Crameri et al. (1997) Nature Biotech. 15:436-438;
Moore et al. (1997) J. Mol. Biol. 272:336-347; Zhang et al. (1997) Proc.
Natl. Acad. Sci. USA 94:4504-4509; Crameri et al. (1998) Nature
391:288-291; and U.S. Pat. Nos. 5,605,793 and 5,837,458.

[0060]Variants of the carbohydrate utilization-related and multidrug
transporter proteins can function as either agonists (mimetics) or as
antagonists. An agonist of the protein can retain substantially the same,
or a subset, of the biological activities of the naturally occurring form
of the protein. An antagonist of the protein can inhibit one or more of
the activities of the naturally occurring form of the protein by, for
example, competitively binding to a downstream or upstream member of a
cellular signaling cascade that includes the carbohydrate
utilization-related or multidrug transporter protein.

[0061]Variants of a carbohydrate utilization-related or multidrug
transporter protein that function as either agonists or antagonists can
be identified by screening combinatorial libraries of mutants, e.g.,
truncation mutants, of a carbohydrate utilization-related or multidrug
transporter protein for agonist or antagonist activity. In one
embodiment, a variegated library of carbohydrate utilization-related
variants is generated by combinatorial mutagenesis at the nucleic acid
level and is encoded by a variegated gene library. A variegated library
of carbohydrate utilization-related or multidrug transporter variants can
be produced by, for example, enzymatically ligating a mixture of
synthetic oligonucleotides into gene sequences such that a degenerate set
of potential carbohydrate utilization-related or multidrug transporter
sequences is expressible as individual polypeptides, or alternatively, as
a set of larger fusion proteins (e.g., for phage display) containing the
set of carbohydrate utilization-related or multidrug transporter
sequences therein. There are a variety of methods that can be used to
produce libraries of potential carbohydrate utilization-related or
multidrug transporter variants from a degenerate oligonucleotide
sequence. Chemical synthesis of a degenerate gene sequence can be
performed in an automatic DNA synthesizer, and the synthetic gene then
ligated into an appropriate expression vector. Use of a degenerate set of
genes allows for the provision, in one mixture, of all of the sequences
encoding the desired set of potential carbohydrate utilization-related or
multidrug transporter sequences. Methods for synthesizing degenerate
oligonucleotides are known in the art (see, e.g., Narang (1983)
Tetrahedron 39:3; Itakura et al. (1984) Annu. Rev. Biochem. 53:323;
Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucleic Acids
Res. 11:477).
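
To illustrate how a single degenerate oligonucleotide stands for a set of sequences, the following sketch expands standard IUPAC nucleotide ambiguity codes into every concrete sequence they represent. The function name and the example codon are chosen for illustration and do not come from the disclosure.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """Return all concrete sequences encoded by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


# "GCN" covers the four codons for alanine.
variants = expand_degenerate("GCN")
```

The number of sequences grows as the product of the choices at each position, which is why a degenerate set can supply a whole library in one synthesis mixture.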

[0062]In addition, libraries of fragments of a carbohydrate
utilization-related or multidrug transporter protein coding sequence can
be used to generate a variegated population of carbohydrate
utilization-related or multidrug transporter fragments for screening and
subsequent selection of variants of a carbohydrate utilization-related or
multidrug transporter protein. In one embodiment, a library of coding
sequence fragments can be generated by treating a double-stranded PCR
fragment of a carbohydrate utilization-related or multidrug transporter
coding sequence with a nuclease under conditions wherein nicking occurs
only about once per molecule, denaturing the double-stranded DNA,
renaturing the DNA to form double-stranded DNA which can include
sense/antisense pairs from different nicked products, removing
single-stranded portions from reformed duplexes by treatment with S1
nuclease, and ligating the resulting fragment library into an expression
vector. By this method, one can derive an expression library that encodes
N-terminal and internal fragments of various sizes of the carbohydrate
utilization-related or multidrug transporter protein.

[0063]Several techniques are known in the art for screening gene products
of combinatorial libraries made by point mutations or truncation and for
screening cDNA libraries for gene products having a selected property.
Such techniques are adaptable for rapid screening of the gene libraries
generated by the combinatorial mutagenesis of carbohydrate
utilization-related or multidrug transporter proteins. The most widely
used techniques, which are amenable to high-throughput analysis, for
screening large gene libraries typically include cloning the gene library
into replicable expression vectors, transforming appropriate cells with
the resulting library of vectors, and expressing the combinatorial genes
under conditions in which detection of a desired activity facilitates
isolation of the vector encoding the gene whose product was detected.
Recursive ensemble mutagenesis (REM), a technique that enhances the
frequency of functional mutants in the libraries, can be used in
combination with the screening assays to identify carbohydrate
utilization-related or multidrug transporter variants (Arkin and Youvan
(1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993)
Protein Engineering 6(3):327-331).

Sequence Identity

[0064]The carbohydrate utilization-related and multidrug transporter
sequences are members of families of molecules with conserved functional
features. By "family" is meant two or more proteins or nucleic acids
having sufficient nucleotide or amino acid sequence identity. By
"sequence identity" is meant the nucleotide or amino acid residues that
are the same when aligning two sequences for maximum correspondence over
a specified comparison window. By "comparison window" is meant a
contiguous segment of the two nucleotide or amino acid sequences for
optimal alignment, wherein the second sequence may contain additions or
deletions (i.e., gaps) as compared to the first sequence. Generally, for
nucleic acid alignments, the comparison window is at least 20 contiguous
nucleotides in length, and optionally can be 30, 40, 50, 100, or longer.
For amino acid sequence alignments, the comparison window is at least 6
contiguous amino acids in length, and optionally can be 10, 15, 20, 30,
or longer. Those of skill in the art understand that to avoid a high
similarity due to inclusion of gaps, a gap penalty is typically
introduced and is subtracted from the number of matches.

[0065]Family members may be from the same or different species, and can
include homologues as well as distinct proteins. Often, members of a
family display common functional characteristics. Homologues can be
isolated based on their identity to the Lactobacillus acidophilus
carbohydrate utilization-related or multidrug transporter nucleic acid
sequences disclosed herein using the cDNA, or a portion thereof, as a
hybridization probe according to standard hybridization techniques under
stringent hybridization conditions as disclosed below.

[0066]To determine the percent identity of two amino acid or nucleotide
sequences, an alignment is performed. Percent identity of the two
sequences is a function of the number of identical residues shared by the
two sequences in the comparison window (i.e., percent identity = (number of
identical residues / total number of residues) × 100). In one
embodiment, the sequences are the same length. Methods similar to those
mentioned below can be used to determine the percent identity between two
sequences. The methods can be used with or without allowing gaps.
Alignment may also be performed manually by inspection.
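
The percent identity calculation can be sketched for two sequences that have already been aligned. In this illustration "-" marks a gap; the function name and the `count_gaps` switch, which corresponds to using the method with or without allowing gaps in the residue total, are assumptions of the sketch.

```python
def percent_identity(aligned_a, aligned_b, count_gaps=True):
    """Percent identity over the columns of a pairwise alignment.

    Both strings must have the same length, with '-' marking a gap.
    If count_gaps is False, gapped columns are left out of the total.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    total = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        if a == "-" or b == "-":
            if count_gaps:
                total += 1
            continue
        total += 1
        if a == b:
            identical += 1
    if total == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / total


with_gaps = percent_identity("ACGT-ACGT", "ACGTTACCT")
without_gaps = percent_identity("ACGT-ACGT", "ACGTTACCT", count_gaps=False)
```

In the example, seven of nine columns are identical when the gapped column is counted (about 77.8%), and seven of eight when it is excluded (87.5%).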

[0067]When amino acid sequences differ in conservative substitutions, the
percent identity may be adjusted upward to correct for the conservative
nature of the substitution. Means for making this adjustment are known in
the art. Typically the conservative substitution is scored as a partial,
rather than a full mismatch, thereby increasing the percentage sequence
identity.

[0069]Various computer implementations based on these mathematical
algorithms have been designed to enable the determination of sequence
identity. The BLAST programs of Altschul et al. (1990) J. Mol. Biol.
215:403 are based on the algorithm of Karlin and Altschul (1990) supra.
Searches to obtain nucleotide sequences that are homologous to nucleotide
sequences of the present invention can be performed with the BLASTN
program, score=100, wordlength=12. To obtain amino acid sequences
homologous to sequences encoding a protein or polypeptide of the current
invention, the BLASTX program may be used, score=50, wordlength=3. Gapped
alignments may be obtained by using Gapped BLAST (in BLAST 2.0) as
described in Altschul et al. (1997) Nucleic Acids Res. 25:3389. To detect
distant relationships between molecules, PSI-BLAST can be used. See,
Altschul et al. (1997) supra. For all of the BLAST programs, the default
parameters of the respective programs can be used. Alignment may also be
performed manually by inspection.

[0070]Another program that can be used to determine percent sequence
identity is the ALIGN program (version 2.0), which uses the mathematical
algorithm of Myers and Miller (1988) supra. A PAM 120 weight residue
table, a gap length penalty of 12, and a gap penalty of 4 can be used
with this program when comparing amino acid sequences.

[0071]In addition to the ALIGN and BLAST programs, the BESTFIT, GAP, FASTA
and TFASTA programs are part of the GCG Wisconsin Genetics Software
Package, Version 10 (available from Accelrys Inc., 9685 Scranton Rd., San
Diego, Calif., USA), and can be used for performing sequence alignments.
The preferred program is GAP version 10, which uses the algorithm of
Needleman and Wunsch (1970) supra. Unless otherwise stated the sequence
identity values provided herein refer to those values obtained by using
GAP Version 10 with the following parameters: % identity and % similarity
for a nucleotide sequence using GAP Weight of 50 and Length Weight of 3,
and the nwsgapdna.cmp scoring matrix; % identity and % similarity for an
amino acid sequence using GAP Weight of 8 and Length Weight of 2, and the
BLOSUM62 scoring matrix; or any equivalent program thereof. By
"equivalent program" is meant any sequence comparison program that, for
any two sequences in question, generates an alignment having identical
nucleotide or amino acid residue matches and an identical percent
sequence identity when compared to the corresponding alignment generated
by GAP Version 10.
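
For readers unfamiliar with global alignment, the dynamic programming idea behind the Needleman and Wunsch algorithm can be sketched as below. This is not the GAP program: it uses a single linear gap penalty rather than separate GAP Weight and Length Weight values, and toy match and mismatch scores rather than the nwsgapdna.cmp or BLOSUM62 matrices. All names and default scores are assumptions of the sketch.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    Uses a linear gap penalty: every gapped position costs `gap`.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diagonal = score[i - 1][j - 1] + pair  # align a[i-1] with b[j-1]
            up = score[i - 1][j] + gap             # gap in b
            left = score[i][j - 1] + gap           # gap in a
            score[i][j] = max(diagonal, up, left)
    return score[-1][-1]
```

With these toy scores, aligning "ACGT" against "AGT" gives three matches and one gap, for a score of 2.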

[0072]Alignment of a sequence in a database to a queried sequence produced
by BLASTN, FASTA, BLASTP or like algorithm is commonly described as a
"hit." Hits to one or more database sequences by a queried sequence
produced by BLASTN, FASTA, BLASTP or a similar algorithm, align and
identify similar portions of a sequence. A hit to a database sequence
generally represents an overlap over a fraction of the sequence length of
the queried sequence, i.e., a portion or fragment of the queried
sequence. However, the overlap can represent the entire length of the
queried sequence. The hits in an alignment to a queried sequence produced
by BLASTN, FASTA, or BLASTP algorithms to sequences in a database are
commonly arranged in order of the degree of similarity and the length of
sequence overlap.

[0073]Polynucleotide and polypeptide hits aligned by BLASTN, FASTA, or
BLASTP algorithms to a queried sequence produce "Expect" values. The
Expect value (E value) indicates the number of hits one can "expect" to
see over a certain number of contiguous sequences at random when
searching a database of a certain size. The Expect value is used as a
significance threshold for determining whether the hit to a database,
such as the GenBank or the EMBL database, indicates actual similarity.
For example, an E value of 0.1 assigned to a polynucleotide hit is
interpreted as meaning that in a database of the size of the GenBank
database, one might expect to see 0.1 matches over the aligned portion of
the sequence with a similar score randomly. By this criterion, the
aligned and matched portions of the polynucleotide sequences then have a
probability of 90% of being the same. For sequences having an E value of
0.01 or less over aligned and matched portions, the probability of
finding a match randomly in the GenBank database is 1% or less, using the
BLASTN or FASTA algorithm.

[0074]According to an embodiment of this invention, "variant"
polynucleotides and polypeptides of this invention comprise sequences
producing an E value of about 0.01 or less when compared to the
polynucleotide or polypeptide sequences of the present invention. That
is, a variant polynucleotide or polypeptide is any sequence that has at
least a 99% probability of being the same as the polynucleotide or
polypeptide of the present invention, measured as having an E value of
0.01 or less using the BLASTN, FASTA, or BLASTP algorithms set at
parameters described herein. In other embodiments, a variant
polynucleotide is a sequence having the same number of, or fewer nucleic
acids than a polynucleotide of the present invention that has at least a
99% probability of being the same as the polynucleotide of the present
invention, measured as having an E value of 0.01 or less using the BLASTN
or FASTA algorithms set at parameters described herein. Similarly, a
variant polypeptide is a sequence having the same number of, or fewer
amino acids than a polypeptide of the present invention that has at least
a 99% probability of being the same as a polypeptide of the present
invention, measured as having an E value of 0.01 or less using the BLASTP
algorithm set at the parameters described herein.
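
The E value reading used above, and the 0.01 threshold applied to variants, can be sketched as follows. The first function follows the interpretation given in the text (an E value of 0.1 read as a 90% probability, and 0.01 as 99%). The second shows the Poisson relation commonly used to convert an E value into the probability of at least one chance hit; it is included for comparison and is not stated in the disclosure. All function names are hypothetical.

```python
import math


def document_probability_same(e_value):
    """Reading used in the text: probability of true similarity is 1 - E."""
    if e_value < 0:
        raise ValueError("E value cannot be negative")
    return max(0.0, 1.0 - e_value)


def poisson_chance_probability(e_value):
    """Probability of at least one chance hit, 1 - exp(-E) (comparison only)."""
    if e_value < 0:
        raise ValueError("E value cannot be negative")
    return 1.0 - math.exp(-e_value)


def is_variant_by_e_value(e_value, threshold=0.01):
    """Apply the 'E value of about 0.01 or less' criterion as a hard cutoff."""
    return e_value <= threshold
```

For small E values the two readings nearly coincide: at E = 0.01 the Poisson chance probability is about 0.00995, which corresponds to roughly 99% on the text's scale.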

[0075]As noted above, the percentage identity is determined by aligning
sequences using one of the BLASTN, FASTA, or BLASTP algorithms, set at
the running parameters described herein, and identifying the number of
identical nucleic acids or amino acids over the aligned portions;
dividing the number of identical nucleic acids or amino acids by the
total number of nucleic acids or amino acids of the polynucleotide or
polypeptide sequence of the present invention; and then multiplying by
100 to determine the percent identity. For example, a polynucleotide of
the present invention having 220 nucleic acids has a hit to a
polynucleotide sequence in the GenBank database having 520 nucleic acids
over a stretch of 23 nucleotides in the alignment produced by the BLASTN
algorithm using the parameters described herein. The 23 nucleotide hit
includes 21 identical nucleotides, one gap and one different nucleotide.
The percent identity of the polynucleotide of the present invention to
the hit in the GenBank library is thus 21/220 times 100, or 9.5%. The
polynucleotide sequence in the GenBank database is thus not a variant of
a polynucleotide of the present invention.
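
The arithmetic of this example can be reproduced with a short sketch. The function name is hypothetical. The second calculation, identity over the aligned stretch alone, is added only to show why dividing by the full length of the sequence of the invention gives a much lower figure than the local match would suggest.

```python
def percent_identity_vs_query(identical, query_length):
    """Identical residues divided by the full query length, times 100."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    return 100.0 * identical / query_length


# Values from the example: 21 identical nucleotides, 220-nucleotide query.
full_length_identity = percent_identity_vs_query(21, 220)

# For contrast only: identity over the 23-nucleotide aligned stretch.
local_identity = percent_identity_vs_query(21, 23)
```

The first value rounds to 9.5%, matching the text, while the local figure is about 91.3%.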

Identification and Isolation of Homologous Sequences

[0076]Carbohydrate utilization-related and multidrug transporter
nucleotide sequences identified based on their sequence identity to the
carbohydrate utilization-related or multidrug transporter nucleotide
sequences set forth herein, or to fragments and variants thereof, are
encompassed by the present invention.
Methods such as PCR or hybridization can be used to identify sequences
from a cDNA or genomic library, for example, that are substantially
identical to a sequence of the invention. See, for example, Sambrook et
al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring
Harbor Laboratory Press, Plainview, N.Y.) and Innis et al. (1990) PCR
Protocols: A Guide to Methods and Applications (Academic Press, New
York). Methods for construction of such cDNA and genomic libraries are
generally known in the art and are also disclosed in the above references.

[0077]In hybridization techniques, the hybridization probes may be genomic
DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides,
and may consist of all or part of a known nucleotide sequence disclosed
herein. In addition, they may be labeled with a detectable group such as
32P, or any other detectable marker, such as other radioisotopes, a
fluorescent compound, an enzyme, or an enzyme co-factor. Probes for
hybridization may be made by labeling synthetic oligonucleotides based on
the known carbohydrate utilization-related or multidrug transporter
nucleotide sequences disclosed herein. Degenerate primers designed on the
basis of conserved nucleotides or amino acid residues in a known
carbohydrate utilization-related or multidrug transporter nucleotide
sequence or encoded amino acid sequence can additionally be used. The
hybridization probe typically comprises a region of nucleotide sequence
that hybridizes under stringent conditions to at least about 10,
preferably about 20, more preferably about 50, 75, 100, 125, 150, 175,
200, 250, 300, 350, or 400 consecutive nucleotides of a nucleotide
sequence of the invention or a fragment or variant thereof. To achieve
specific hybridization under a variety of conditions, such probes include
sequences that are unique among carbohydrate utilization-related or
multidrug transporter protein sequences. Preparation of probes for
hybridization is generally known in the art and is disclosed in Sambrook
et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring
Harbor Laboratory Press, Plainview, N.Y.), herein incorporated by
reference.

[0078]In one embodiment, the entire nucleotide sequence encoding a
carbohydrate utilization-related or multidrug transporter protein is used
as a probe to identify novel carbohydrate utilization-related or
multidrug transporter sequences and messenger RNAs. In another
embodiment, the probe is a fragment of a nucleotide sequence disclosed
herein. In some embodiments, the nucleotide sequence that hybridizes
under stringent conditions to the probe can be at least about 300, 325,
350, 375, 400, 425, 450, 500, 550, 600, 650, 700, 800, 900, 1000, 1200,
1400, 1600, 1800, or 2000 nucleotides in length.

[0079]Substantially identical sequences will hybridize to each other under
stringent conditions. By "stringent conditions" is meant conditions under
which a probe will hybridize to its target sequence to a detectably
greater degree than to other sequences (e.g., at least 2-fold over
background). Generally, stringent conditions encompass those conditions
for hybridization and washing under which nucleotides having at least
about 60%, 65%, 70%, preferably 75% sequence identity typically remain
hybridized to each other. Stringent conditions are known in the art and
can be found in Current Protocols in Molecular Biology (John Wiley &
Sons, New York (1989)), 6.3.1-6.3.6. Hybridization typically occurs for
less than about 24 hours, usually about 4 to about 12 hours.

[0080]Stringent conditions are sequence dependent and will differ in
different circumstances. Full-length or partial nucleic acid sequences
may be used to obtain homologues and orthologs encompassed by the present
invention. By "orthologs" is meant genes derived from a common ancestral
gene and which are found in different species as a result of speciation.
Genes found in different species are considered orthologs when their
nucleotide sequences and/or their encoded protein sequences share
substantial identity as defined elsewhere herein. Functions of orthologs
are often highly conserved among species.

[0081]When using probes, stringent conditions will be those in which the
salt concentration is less than about 1.5 M Na ion, typically about 0.01
to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the
temperature is at least about 30° C. for short probes (e.g., 10 to
50 nucleotides) and at least about 60° C. for long probes (e.g.,
greater than 50 nucleotides).

[0082]The post-hybridization washes are instrumental in controlling
specificity. The two critical factors are ionic strength and temperature
of the final wash solution. For the detection of sequences that hybridize
to a full-length or approximately full-length target sequence, the
temperature under stringent conditions is selected to be about 5°
C. lower than the thermal melting point (Tm) for the specific
sequence at a defined ionic strength and pH. However, stringent
conditions would encompass temperatures in the range of 1° C. to
20° C. lower than the Tm, depending on the desired degree of
stringency as otherwise qualified herein. For DNA-DNA hybrids, the
Tm can be determined using the equation of Meinkoth and Wahl (1984)
Anal. Biochem. 138:267-284: Tm = 81.5° C. + 16.6 (log M) + 0.41 (% GC)
- 0.61 (% form) - 500/L; where M is the molarity of monovalent cations,
% GC is the percentage of guanosine and cytosine nucleotides in the DNA,
% form is the percentage of formamide in the hybridization solution, and
L is the length of the hybrid in base pairs. The Tm is the temperature
(under defined ionic strength and pH) at which 50% of a complementary
target sequence hybridizes to a perfectly matched probe.
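
As a minimal sketch, the Meinkoth and Wahl equation quoted above can be evaluated as follows. The function name and the example values are hypothetical and chosen only for illustration; the logarithm is taken as base 10.

```python
import math


def melting_temperature(na_molar, gc_percent, formamide_percent, length_bp):
    """Estimate the Tm (deg C) of a DNA-DNA hybrid.

    Implements Tm = 81.5 + 16.6 (log M) + 0.41 (% GC) - 0.61 (% form) - 500/L,
    with log taken as log10 (an assumption stated in the lead-in).
    """
    if na_molar <= 0:
        raise ValueError("monovalent cation molarity must be positive")
    if length_bp <= 0:
        raise ValueError("hybrid length must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 500.0 / length_bp
    )


if __name__ == "__main__":
    # Hypothetical example: 1 M Na+, 50% GC, 50% formamide, 500 bp hybrid.
    print(round(melting_temperature(1.0, 50.0, 50.0, 500), 1))
```

With the hypothetical inputs shown, the logarithmic term is zero and the estimate is 81.5 + 20.5 - 30.5 - 1.0 = 70.5° C.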

[0083]The ability to detect sequences with varying degrees of homology can
be obtained by varying the stringency of the hybridization and/or washing
conditions. To target sequences that are 100% identical (homologous
probing), stringency conditions must be obtained that do not allow
mismatching. By allowing mismatching of nucleotide residues to occur,
sequences with a lower degree of similarity can be detected (heterologous
probing). For every 1% of mismatching, the Tm is reduced about
1° C.; therefore, hybridization and/or wash conditions can be
manipulated to allow hybridization of sequences of a target percentage
identity. For example, if sequences with ≥90% sequence identity
are preferred, the Tm can be decreased by 10° C. Two
nucleotide sequences could fail to hybridize to each other under
stringent conditions and still be substantially identical if the
polypeptides they encode are substantially identical. This situation could arise, for
example, if the maximum codon degeneracy of the genetic code is used to
create a copy of a nucleic acid.
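
The rule of thumb above (about 1° C. of Tm lost per 1% of mismatching) can be sketched as a simple calculation. The function name and the example Tm are hypothetical, and the linear relationship is only the approximation stated in the text.

```python
def adjusted_temperature(tm_perfect_match, min_identity_percent):
    """Return the temperature (deg C) at which hybrids sharing at least
    min_identity_percent identity are still expected to form, assuming
    the Tm drops about 1 deg C for every 1% of mismatching.
    """
    if not 0.0 <= min_identity_percent <= 100.0:
        raise ValueError("identity must be between 0 and 100 percent")
    mismatch_percent = 100.0 - min_identity_percent
    return tm_perfect_match - 1.0 * mismatch_percent


if __name__ == "__main__":
    # Hypothetical perfect-match Tm of 70.5 deg C, target identity >= 90%.
    print(adjusted_temperature(70.5, 90.0))
```

For a perfect-match Tm of 70.5° C. and a target of at least 90% identity, the sketch lowers the temperature by 10° C. to 60.5° C.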

[0084]Exemplary low stringency conditions include hybridization with a
buffer solution of 30-35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl
sulfate) at 37° C., and a wash in 1× to 2×SSC
(20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55° C.
Exemplary moderate stringency conditions include hybridization in 40 to
45% formamide, 1.0 M NaCl, 1% SDS at 37° C., and a wash in
0.5× to 1×SSC at 55 to 60° C. Exemplary high
stringency conditions include hybridization in 50% formamide, 1 M NaCl,
1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65°
C. Optionally, wash buffers may comprise about 0.1% to about 1% SDS.
Duration of hybridization is generally less than about 24 hours, usually
about 4 to about 12 hours. An extensive guide to the hybridization of
nucleic acids is found in Tijssen (1993) Laboratory Techniques in
Biochemistry and Molecular Biology--Hybridization with Nucleic Acid
Probes, Part I, Chapter 2 (Elsevier, New York); and Ausubel et al., eds.
(1995) Current Protocols in Molecular Biology, Chapter 2 (Greene
Publishing and Wiley-Interscience, New York). See Sambrook et al. (1989)
Molecular Cloning: A Laboratory Manual (2d ed.; Cold Spring Harbor
Laboratory Press, Plainview, N.Y.).
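
To relate the exemplary wash buffers above to ionic strength, the following sketch converts an SSC dilution into molar concentrations using the stated definition of 20×SSC. The names are hypothetical, and the sodium ion figure assumes complete dissociation, with trisodium citrate contributing three sodium ions per formula unit.

```python
# Composition of 20x SSC as defined in the text (mol/L).
SSC_20X = {"NaCl": 3.0, "trisodium_citrate": 0.3}

# Exemplary final-wash conditions from the text:
# (SSC fold range, temperature range in deg C).
WASH_CONDITIONS = {
    "low": ((1.0, 2.0), (50, 55)),
    "moderate": ((0.5, 1.0), (55, 60)),
    "high": ((0.1, 0.1), (60, 65)),
}


def ssc_composition(fold):
    """Return molar NaCl, trisodium citrate, and total Na+ for an n-fold SSC buffer."""
    if fold <= 0:
        raise ValueError("SSC fold concentration must be positive")
    nacl = SSC_20X["NaCl"] * fold / 20.0
    citrate = SSC_20X["trisodium_citrate"] * fold / 20.0
    # Assumption: full dissociation; each trisodium citrate yields 3 Na+.
    return {
        "NaCl_M": nacl,
        "trisodium_citrate_M": citrate,
        "Na_ion_M": nacl + 3.0 * citrate,
    }


if __name__ == "__main__":
    for name, ((fold_lo, fold_hi), (t_lo, t_hi)) in WASH_CONDITIONS.items():
        na_lo = ssc_composition(fold_lo)["Na_ion_M"]
        na_hi = ssc_composition(fold_hi)["Na_ion_M"]
        print(f"{name}: {na_lo:.4f}-{na_hi:.4f} M Na+ at {t_lo}-{t_hi} deg C")
```

Under these assumptions, 1×SSC corresponds to about 0.195 M sodium ion, so the high stringency wash in 0.1×SSC is roughly 0.02 M.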

[0085]In a PCR approach, oligonucleotide primers can be designed for use
in PCR reactions to amplify corresponding DNA sequences from cDNA or
genomic DNA extracted from any organism of interest. PCR primers are
preferably at least about 10 nucleotides in length, and most preferably
at least about 20 nucleotides in length. Methods for designing PCR
primers and PCR cloning are generally known in the art and are disclosed
in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed.,
Cold Spring Harbor Laboratory Press, Plainview, N.Y.). See also Innis et
al., eds. (1990) PCR Protocols: A Guide to Methods and Applications
(Academic Press, New York); Innis and Gelfand, eds. (1995) PCR Strategies
(Academic Press, New York); and Innis and Gelfand, eds. (1999) PCR
Methods Manual (Academic Press, New York). Known methods of PCR include,
but are not limited to, methods using paired primers, nested primers,
single specific primers, degenerate primers, gene-specific primers,
vector-specific primers, partially-mismatched primers, and the like.

Assays

[0086]Diagnostic assays to detect expression of the disclosed polypeptides
and/or nucleic acids as well as their disclosed activity in a sample are
disclosed. An exemplary method for detecting the presence or absence of a
disclosed nucleic acid or protein comprising the disclosed polypeptide in
a sample involves obtaining a sample from a food/dairy/feed product,
starter culture (mother, seed, bulk/set, concentrated, dried,
lyophilized, frozen), cultured food/dairy/feed product, dietary
supplement, bioprocessing fermentate, or a subject that has ingested a
probiotic material, and contacting the sample with a compound or an agent
capable of detecting the disclosed polypeptides or nucleic acids (e.g.,
an mRNA or genomic DNA comprising the disclosed nucleic acid or fragment
thereof) such that the presence of the disclosed sequence is detected in
the sample. Results obtained with a sample from the food, supplement,
culture, product, or subject may be compared to results obtained with a
sample from a control culture, product, or subject.

[0087]One agent for detecting the mRNA or genomic DNA comprising a
disclosed nucleotide sequence is a labeled nucleic acid probe capable of
hybridizing to the disclosed nucleotide sequence of the mRNA or genomic
DNA. The nucleic acid probe can be, for example, a disclosed nucleic
acid, such as a nucleic acid of odd numbered SEQ ID NOS:1-321, or a
portion thereof, such as a nucleic acid of at least 15, 30, 50, 100, 250,
or 500 nucleotides in length and sufficient to specifically hybridize
under stringent conditions to the mRNA or genomic DNA comprising the
disclosed nucleic acid sequence. Other suitable probes for use in the
diagnostic assays of the invention are described herein.

[0088]One agent for detecting a protein comprising a disclosed polypeptide
sequence is an antibody capable of binding to the disclosed polypeptide,
preferably an antibody with a detectable label. Antibodies can be
polyclonal, or more preferably, monoclonal. An intact antibody, or a
fragment thereof (e.g., Fab or F(ab')2) can be used. The term
"labeled," with regard to the probe or antibody, is meant to encompass
direct labeling of the probe or antibody by coupling (i.e., physically
linking) a detectable substance to the probe or antibody, as well as
indirect labeling of the probe or antibody by reactivity with another
reagent that is directly labeled. Examples of indirect labeling include
detection of a primary antibody using a fluorescently labeled secondary
antibody and end-labeling of a DNA probe with biotin such that it can be
detected with fluorescently labeled streptavidin.

[0089]The term "sample" is meant to include tissues, cells, and biological
fluids present in or isolated from a subject, as well as cells from
starter cultures or food products carrying such cultures, or derived from
the use of such cultures. That is, the detection method of the invention
can be used to detect mRNA, protein, or genomic DNA comprising a
disclosed sequence in a sample both in vitro and in vivo. In vitro
techniques for detection of mRNA comprising a disclosed sequence include
Northern hybridizations and in situ hybridizations. In vitro techniques
for detection of a protein comprising a disclosed polypeptide include
enzyme linked immunosorbent assays (ELISAs), Western blots,
immunoprecipitations, and immunofluorescence. In vitro techniques for
detection of genomic DNA comprising the disclosed nucleotide sequences
include Southern hybridizations. Furthermore, in vivo techniques for
detection of a protein comprising a disclosed polypeptide include
introducing into a subject a labeled antibody against the disclosed
polypeptide. For example, the antibody can be labeled with a radioactive
marker whose presence and location in a subject can be detected by
standard imaging techniques.

[0090]In one embodiment, the sample contains protein molecules from a test
subject that has consumed a probiotic material. Alternatively, the sample
can contain mRNA or genomic DNA from a starter culture.

[0091]The invention also encompasses kits for detecting the presence of
disclosed nucleic acids or proteins comprising disclosed polypeptides in
a sample. Such kits can be used to determine if a microbe expressing a
specific polypeptide of the invention is present in a food product or
starter culture, or in a subject that has consumed a probiotic material.
For example, the kit can comprise a labeled compound or agent capable of
detecting a disclosed polypeptide or mRNA in a sample and means for
determining the amount of the disclosed polypeptide in the sample
(e.g., an antibody that recognizes the disclosed polypeptide or an
oligonucleotide probe that binds to DNA encoding a disclosed polypeptide,
e.g., even numbered SEQ ID NOS:2-322). Kits can also include instructions
detailing the use of such compounds.

[0092]For antibody-based kits, the kit can comprise, for example: (1) a
first antibody (e.g., attached to a solid support) that binds to a
disclosed polypeptide; and, optionally, (2) a second, different antibody
that binds to the disclosed polypeptide or the first antibody and is
conjugated to a detectable agent. For oligonucleotide-based kits, the kit
can comprise, for example: (1) an oligonucleotide, e.g., a detectably
labeled oligonucleotide, that hybridizes to a disclosed nucleic acid
sequence or (2) a pair of primers useful for amplifying a disclosed
nucleic acid.

[0093]The kit can also comprise, e.g., a buffering agent, a preservative,
or a protein stabilizing agent. The kit can also comprise components
necessary for detecting the detectable agent (e.g., an enzyme or a
substrate). The kit can also contain a control sample or a series of
control samples that can be assayed and compared to the test sample.
Each component of the kit is usually enclosed within an
individual container, and all of the various containers are within a
single package along with instructions for use.

[0094]In one embodiment, the kit comprises multiple probes in an array
format, such as those described, for example, in U.S. Pat. Nos. 5,412,087
and 5,545,531, and International Publication No. WO 95/00530, herein
incorporated by reference. Probes for use in the array may be synthesized
either directly onto the surface of the array, as disclosed in
International Publication No. WO 95/00530, or prior to immobilization
onto the array surface (Gait, ed. (1984) Oligonucleotide Synthesis: A
Practical Approach (IRL Press, Oxford, England)). The probes may be
immobilized onto the surface using techniques well known to one of skill
in the art, such as those described in U.S. Pat. No. 5,412,087. Probes
may be a nucleic acid or peptide sequence, preferably purified, or an
antibody.

[0095]The arrays may be used to screen organisms, samples, or products for
differences in their genomic, cDNA, polypeptide, or antibody content,
including the presence or absence of specific sequences or proteins, as
well as the concentration of those materials. Binding to a capture probe
is detected, for example, by signal generated from a label attached to
the nucleic acid comprising the disclosed nucleic acid sequence, a
polypeptide comprising the disclosed amino acid sequence, or an antibody.
The method can include contacting the molecule comprising the disclosed
nucleic acid, polypeptide, or antibody with a first array having a
plurality of capture probes and a second array having a different
plurality of capture probes. The results of each hybridization can be
compared to analyze differences in expression between a first and second
sample. The first plurality of capture probes can be from a control
sample, e.g., a wild-type lactic acid bacterium, or control subject, e.g.,
a food, dietary supplement, starter culture sample, or a biological
fluid. The second plurality of capture probes can be from an experimental
sample, e.g., a mutant lactic acid bacterium, or subject that has
consumed a probiotic material, e.g., a starter culture sample or a
biological fluid.

[0096]These assays may be especially useful in microbial selection and
quality control procedures where the detection of unwanted materials is
essential. The detection of particular nucleotide sequences or
polypeptides may also be useful in determining the genetic composition of
food, fermentation products, or industrial microbes, or microbes present
in the digestive system of animals or humans that have consumed
probiotics.

Antisense Nucleotide Sequences

[0097]The present invention also encompasses antisense nucleic acids,
i.e., molecules that are complementary to a sense nucleic acid encoding a
protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule, or complementary to an mRNA sequence. Accordingly, an
antisense nucleic acid can hydrogen bond to a sense nucleic acid. The
antisense nucleic acid can be complementary to an entire carbohydrate
utilization-related or multidrug transporter coding strand, or to only a
portion thereof, e.g., all or part of the protein coding region (or open
reading frame). An antisense nucleic acid can be antisense to a noncoding
region of the coding strand of a nucleotide sequence encoding a
carbohydrate utilization-related or multidrug transporter protein. The
noncoding regions are the 5' and 3' sequences that flank the coding
region and are not translated into amino acids. Antisense nucleotide
sequences are useful in disrupting the expression of the target gene.
Antisense constructions having 70%, preferably 80%, more preferably 85%,
90% or 95% sequence identity to the corresponding sequence may be used.

[0098]Given the coding-strand sequence encoding a carbohydrate
utilization-related or multidrug transporter protein disclosed herein
(e.g., even numbered SEQ ID NOS:2-322), antisense nucleic acids of the
invention can be designed according to the rules of Watson and Crick base
pairing. The antisense nucleic acid can be complementary to the entire
coding region of carbohydrate utilization-related or multidrug
transporter mRNA, but more preferably is an oligonucleotide that is
antisense to only a portion of the coding or noncoding region of
carbohydrate utilization-related or multidrug transporter mRNA. For
example, the antisense oligonucleotide can be complementary to the region
surrounding the translation start site of carbohydrate
utilization-related or multidrug transporter mRNA. An antisense
oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40,
45, or 50 nucleotides in length, or it can be 100, 200 nucleotides, or
greater in length. An antisense nucleic acid of the invention can be
constructed using chemical synthesis and enzymatic ligation procedures
known in the art.
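
As an illustration of designing an antisense oligonucleotide by Watson and Crick base pairing, the sketch below returns the reverse complement of a chosen window of a sense (coding-strand or mRNA) sequence. The function name and the example sequence are hypothetical and do not correspond to any SEQ ID NO of the disclosure.

```python
# Watson-Crick complements; U in an mRNA input pairs with A, and the
# antisense oligonucleotide is returned as DNA.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_oligo(sense, start, length):
    """Return the antisense oligonucleotide (written 5' to 3') that is
    complementary to sense[start:start + length].
    """
    if start < 0 or length <= 0 or start + length > len(sense):
        raise ValueError("requested window lies outside the sense sequence")
    region = sense[start:start + length]
    unexpected = set(region.upper()) - set("ACGTU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return region.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical sense strand with an ATG start codon at index 9.
    sense = "GGAGGTAACATGAAACGT"
    # 9-mer antisense oligonucleotide spanning the start codon.
    print(antisense_oligo(sense, 6, 9))
```

Here the window AACATGAAA surrounds the hypothetical start codon, and the resulting antisense oligonucleotide is TTTCATGTT.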

[0099]For example, an antisense nucleic acid (e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring
nucleotides or variously modified nucleotides designed to increase the
biological stability of the molecules or to increase the physical
stability of the duplex formed between the antisense and sense nucleic
acids, including, but not limited to, phosphorothioate
derivatives and acridine substituted nucleotides. Alternatively, the
antisense nucleic acid can be produced biologically using an expression
vector into which a nucleic acid has been subcloned in an antisense
orientation (i.e., RNA transcribed from the inserted nucleic acid will be
of an antisense orientation to a target nucleic acid of interest).

[0101]The invention also encompasses ribozymes, which are catalytic RNA
molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a
complementary region. Ribozymes (e.g., hammerhead ribozymes (described in
Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to
catalytically cleave carbohydrate utilization-related or multidrug
transporter mRNA transcripts to thereby inhibit translation of
carbohydrate utilization-related or multidrug transporter mRNA. A
ribozyme having specificity for a carbohydrate
utilization-related-encoding or multidrug
transporter-encoding nucleic acid can be designed based upon the
nucleotide sequence of a carbohydrate utilization-related or multidrug
transporter cDNA disclosed herein (e.g., odd numbered SEQ ID NOS:1-320).
See, e.g., Cech et al., U.S. Pat. No. 4,987,071; and Cech et al., U.S.
Pat. No. 5,116,742. Alternatively, carbohydrate utilization-related or
multidrug transporter mRNA can be used to select a catalytic RNA having a
specific ribonuclease activity from a pool of RNA molecules. See, e.g.,
Bartel and Szostak (1993) Science 261:1411-1418.

[0103]In some embodiments, the nucleic acids of the invention can be
modified at the base moiety, sugar moiety, or phosphate backbone to
improve, e.g., the stability, hybridization, or solubility of the
molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al.
(1996) Bioorganic & Medicinal Chemistry 4:5). As used herein, the terms
"peptide nucleic acids" or "PNAs" refer to nucleic acid mimics, e.g., DNA
mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are
retained. The neutral backbone of PNAs has been shown to allow for
specific hybridization to DNA and RNA under conditions of low ionic
strength. The synthesis of PNA oligomers can be performed using standard
solid-phase peptide synthesis protocols as described, for example, in
Hyrup et al. (1996) supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad.
Sci. USA 93:14670.

[0104]PNAs can be used as antisense or antigene agents for
sequence-specific modulation of gene expression by, e.g., inducing
transcription or translation arrest or inhibiting replication. PNAs of
the invention can also be used, e.g., in the analysis of single base pair
mutations in a gene by, e.g., PNA-directed PCR clamping; as artificial
restriction enzymes when used in combination with other enzymes, e.g., S1
nucleases (Hyrup (1996) supra); or as probes or primers for DNA sequencing
and hybridization (Hyrup (1996) supra; Perry-O'Keefe et al. (1996)
supra).

[0105]In another embodiment, PNAs of a carbohydrate utilization-related
or multidrug transporter molecule can be modified, e.g., to enhance their
stability, specificity, or cellular uptake, by attaching lipophilic or
other helper groups to PNA, by the formation of PNA-DNA chimeras, or by
the use of liposomes or other techniques of drug delivery known in the
art. The synthesis of PNA-DNA chimeras can be performed as described in
Hyrup (1996) supra; Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63;
Mag et al. (1989) Nucleic Acids Res. 17:5973; and Petersen et al. (1995)
Bioorganic Med. Chem. Lett. 5:1119.

Fusion Proteins

[0106]The invention also includes carbohydrate utilization-related or
multidrug transporter chimeric or fusion proteins. A carbohydrate
utilization-related or multidrug transporter "chimeric protein" or
"fusion protein" comprises a carbohydrate utilization-related or
multidrug transporter polypeptide operably linked to a non-carbohydrate
utilization-related or non-multidrug transporter polypeptide,
respectively. A "carbohydrate utilization-related polypeptide" or a
"multidrug transporter polypeptide" refers to a polypeptide having an
amino acid sequence corresponding to a carbohydrate utilization-related
protein or a multidrug transporter protein, respectively, whereas a
"non-carbohydrate utilization-related polypeptide" or a "non-multidrug
transporter polypeptide" refers to a polypeptide having an amino acid
sequence corresponding to a protein that is not substantially identical
to the carbohydrate utilization-related protein or multidrug transporter
protein, respectively, and which is derived from the same or a different
organism. Within a carbohydrate utilization-related or multidrug
transporter fusion protein, the carbohydrate utilization-related or
multidrug transporter polypeptide can correspond to all or a portion of a
carbohydrate utilization-related or multidrug transporter protein,
preferably including at least one biologically active portion of a
carbohydrate utilization-related or multidrug transporter protein. Within
the fusion protein, the term "operably linked" is meant to indicate that
the carbohydrate utilization-related or multidrug transporter polypeptide
and the non-carbohydrate utilization-related or non-multidrug transporter
polypeptide are fused in-frame to each other. The non-carbohydrate
utilization-related or non-multidrug transporter polypeptide can be fused to
the N-terminus or C-terminus of the carbohydrate utilization-related or
multidrug transporter polypeptide.

[0107]Expression of the linked coding sequences results in two linked
heterologous amino acid sequences that form the fusion protein. The
carrier sequence (the non-carbohydrate utilization-related or
non-multidrug transporter polypeptide) can encode a carrier polypeptide
that potentiates or increases expression of the fusion protein in the
bacterial host. The portion of the fusion protein encoded by the carrier
sequence, i.e., the carrier polypeptide, may be a protein fragment, an
entire functional moiety, or an entire protein sequence. The carrier
region or polypeptide may additionally be designed to be used in
purifying the fusion protein, either with antibodies or with affinity
purification specific for that carrier polypeptide. Likewise, physical
properties of the carrier polypeptide can be exploited to allow selective
purification of the fusion protein.

[0108]Particular carrier polypeptides of interest include superoxide
dismutase (SOD), maltose-binding protein (MBP), glutathione-S-transferase
(GST), an N-terminal histidine (His) tag, and the like. This list is not
meant to be limiting, as any carrier polypeptide that potentiates
expression of the carbohydrate utilization-related protein or multidrug
transporter protein as a fusion protein can be used in the methods of the
invention.

[0109]In one embodiment, the fusion protein is a GST-carbohydrate
utilization-related fusion protein in which the carbohydrate
utilization-related sequences are fused to the C-terminus of the GST
sequences. In another embodiment, the fusion protein is a carbohydrate
utilization-related-immunoglobulin fusion protein in which all or part of
a carbohydrate utilization-related protein is fused to sequences derived
from a member of the immunoglobulin protein family. In other embodiments,
the fusion protein comprises a multidrug transporter protein of the
present invention. The carbohydrate utilization-related- or multidrug
transporter-immunoglobulin fusion proteins of the invention can be used
as immunogens to produce anti-carbohydrate utilization-related or
anti-multidrug transporter-related antibodies in a subject, to purify
carbohydrate utilization-related or multidrug transporter-related
ligands, and in screening assays to identify molecules that inhibit the
interaction of a carbohydrate utilization-related or multidrug
transporter protein with a carbohydrate utilization-related or multidrug
transporter ligand.

[0110]One of skill in the art will recognize that the particular carrier
polypeptide is chosen with the purification scheme in mind. For example,
His tags, GST, and maltose-binding protein represent carrier polypeptides
that have readily available affinity columns to which they can be bound
and eluted. Thus, where the carrier polypeptide is an N-terminal His tag
such as hexahistidine (His6 tag), the carbohydrate
utilization-related or multidrug transporter fusion protein can be
purified using a matrix comprising a metal-chelating resin, for example,
nickel nitrilotriacetic acid (Ni-NTA), nickel iminodiacetic acid
(Ni-IDA), and cobalt-containing resin (Co-resin). See, for example,
Steinert et al. (1997) QIAGEN News 4:11-15, herein incorporated by
reference in its entirety. Where the carrier polypeptide is GST, the
carbohydrate utilization-related or multidrug transporter fusion protein
can be purified using a matrix comprising glutathione-agarose beads
(Sigma or Pharmacia Biotech); where the carrier polypeptide is a
maltose-binding protein (MBP), the carbohydrate utilization-related or
multidrug transporter fusion protein can be purified using a matrix
comprising an agarose resin derivatized with amylose.

[0111]Preferably, a chimeric or fusion protein of the invention is
produced by standard recombinant DNA techniques. For example, DNA
fragments coding for the different polypeptide sequences may be ligated
together in-frame, or the fusion gene can be synthesized, such as with
automated DNA synthesizers. Alternatively, PCR amplification of gene
fragments can be carried out using anchor primers that give rise to
complementary overhangs between two consecutive gene fragments, which can
subsequently be annealed and re-amplified to generate a chimeric gene
sequence (see, e.g., Ausubel et al., eds. (1995) Current Protocols in
Molecular Biology (Greene Publishing and Wiley-Interscience, New York)).
Moreover, a carbohydrate utilization-related or multidrug
transporter-encoding nucleic acid can be cloned into a commercially
available expression vector such that it is linked in-frame to an
existing fusion moiety.

[0112]The fusion protein expression vector is typically designed for ease
of removing the carrier polypeptide to allow the carbohydrate
utilization-related or multidrug transporter protein to retain the native
biological activity associated with it. Methods for cleavage of fusion
proteins are known in the art. See, for example, Ausubel et al., eds.
(1998) Current Protocols in Molecular Biology (John Wiley & Sons, Inc.).
Chemical cleavage of the fusion protein can be accomplished with reagents
such as cyanogen bromide,
2-(2-nitrophenylsulphenyl)-3-methyl-3'-bromoindolenine, hydroxylamine, or
low pH. Chemical cleavage is often accomplished under denaturing
conditions to cleave otherwise insoluble fusion proteins.

[0113]Where separation of the carbohydrate utilization-related or
multidrug transporter polypeptide from the carrier polypeptide is desired
and a cleavage site at the junction between these fused polypeptides is
not naturally occurring, the fusion construct can be designed to contain
a specific protease cleavage site to facilitate enzymatic cleavage and
removal of the carrier polypeptide. In this manner, a linker sequence
comprising a coding sequence for a peptide that has a cleavage site
specific for an enzyme of interest can be fused in-frame between the
coding sequence for the carrier polypeptide (for example, MBP, GST, SOD,
or an N-terminal His tag) and the coding sequence for the carbohydrate
utilization-related or multidrug transporter polypeptide. Suitable
enzymes having specificity for cleavage sites include, but are not
limited to, factor Xa, thrombin, enterokinase, renin, collagenase, and
tobacco etch virus (TEV) protease. Cleavage sites for these enzymes are
well known in the art. Thus, for example, where factor Xa is to be used
to cleave the carrier polypeptide from the carbohydrate
utilization-related or multidrug transporter polypeptide, the fusion
construct can be designed to comprise a linker sequence encoding a factor
Xa-sensitive cleavage site, for example, the sequence IEGR (see, for
example, Nagai and Thogersen (1984) Nature 309:810-812, Nagai and
Thogersen (1987) Meth. Enzymol. 153:461-481, and Pryor and Leiting (1997)
Protein Expr. Purif. 10(3):309-319, herein incorporated by reference).
Where thrombin is to be used to cleave the carrier polypeptide from the
carbohydrate utilization-related or multidrug transporter polypeptide,
the fusion construct can be designed to comprise a linker sequence
encoding a thrombin-sensitive cleavage site, for example the sequence
LVPRGS or VIAGR (see, for example, Pryor and Leiting (1997) Protein Expr.
Purif. 10(3):309-319, and Hong et al. (1997) Chin. Med. Sci. J.
12(3):143-147, respectively, herein incorporated by reference). Cleavage
sites for TEV protease are known in the art. See, for example, the
cleavage sites described in U.S. Pat. No. 5,532,142, herein incorporated
by reference in its entirety. See also the discussion in Ausubel et al.,
eds. (1998) Current Protocols in Molecular Biology (John Wiley & Sons,
Inc.), Chapter 16.
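
The linker design described above can be sketched as a small check on amino acid sequences. The example assembles a carrier-linker-target fusion and scans the target polypeptide for internal copies of the recognition sequences named in the text (IEGR for factor Xa; LVPRGS and VIAGR for thrombin). Checking for internal sites is an illustrative precaution added here, not a step stated in the disclosure, and all names and sequences are hypothetical.

```python
# Recognition sequences named in the text, keyed by protease.
CLEAVAGE_SITES = {
    "factor Xa": ["IEGR"],
    "thrombin": ["LVPRGS", "VIAGR"],
}


def find_sites(protein, protease):
    """Return sorted (position, motif) pairs for every occurrence of the
    protease's recognition sequences in the protein (0-based positions).
    """
    seq = protein.upper()
    hits = []
    for motif in CLEAVAGE_SITES[protease]:
        pos = seq.find(motif)
        while pos != -1:
            hits.append((pos, motif))
            pos = seq.find(motif, pos + 1)
    return sorted(hits)


def build_fusion(carrier, protease, target, motif_index=0):
    """Join carrier, protease-sensitive linker, and target in that order.

    Raises ValueError if the target already contains a site for the chosen
    protease, since cleavage would then also cut within the target.
    """
    internal = find_sites(target, protease)
    if internal:
        raise ValueError(f"target contains internal {protease} site(s): {internal}")
    linker = CLEAVAGE_SITES[protease][motif_index]
    return carrier + linker + target


if __name__ == "__main__":
    carrier = "MHHHHHH"          # hypothetical N-terminal His6 carrier
    target = "MKTLLAAGSVEK"     # hypothetical target polypeptide
    print(build_fusion(carrier, "factor Xa", target))
```

In this sketch the carrier is placed at the N-terminus, so the linker sits immediately upstream of the target polypeptide.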

Antibodies

[0114]An isolated polypeptide of the present invention can be used as an
immunogen to generate antibodies that specifically bind carbohydrate
utilization-related or multidrug transporter proteins, or stimulate
production of antibodies in vivo. The full-length carbohydrate
utilization-related or multidrug transporter protein can be used as an
immunogen or, alternatively, antigenic peptide fragments of carbohydrate
utilization-related or multidrug transporter proteins as described herein
can be used. The antigenic peptide of a carbohydrate utilization-related
or multidrug transporter protein comprises at least 8, preferably 10, 15,
20, or 30 amino acid residues of the amino acid sequences shown in even
numbered SEQ ID NOS:1-320 and encompasses an epitope of a carbohydrate
utilization-related or multidrug transporter protein such that an
antibody raised against the peptide forms a specific immune complex with
the carbohydrate utilization-related or multidrug transporter protein.
Preferred epitopes encompassed by the antigenic peptide are regions of a
carbohydrate utilization-related or multidrug transporter protein that
are located on the surface of the protein, e.g., hydrophilic regions.
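
One way to locate candidate hydrophilic regions is a sliding-window hydropathy scan. The sketch below uses the Kyte-Doolittle hydropathy scale, which is an assumption introduced here for illustration; the disclosure does not name a particular scale or method. The function name and example sequence are hypothetical, and the default window of 8 residues mirrors the minimum peptide length mentioned above.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def most_hydrophilic_window(protein, window=8):
    """Return (start, peptide, mean_hydropathy) for the window with the
    lowest mean Kyte-Doolittle score; the first such window wins ties.
    """
    seq = protein.upper()
    if window <= 0 or len(seq) < window:
        raise ValueError("window must be positive and no longer than the protein")
    best_start, best_score = 0, None
    for start in range(len(seq) - window + 1):
        score = sum(KYTE_DOOLITTLE[aa] for aa in seq[start:start + window]) / window
        if best_score is None or score < best_score:
            best_start, best_score = start, score
    return best_start, seq[best_start:best_start + window], best_score


if __name__ == "__main__":
    # Hypothetical sequence: a charged stretch flanked by leucine runs.
    example = "LLLLLLLLKDEKRDEKLLLLLLLL"
    print(most_hydrophilic_window(example))
```

A low hydropathy score only suggests surface exposure; it does not by itself establish that the peptide encompasses an epitope.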

Recombinant Expression Vectors and Cells

[0115]The nucleic acids of the present invention may be included in
vectors, preferably expression vectors. "Vector" refers to a nucleic acid
capable of transporting another nucleic acid to which it has been linked.
Expression vectors include one or more regulatory sequences and direct
the expression of genes to which they are operably linked. By "operably
linked" is meant that the nucleotide sequence of interest is linked to
the regulatory sequence(s) such that expression of the nucleotide
sequence is allowed (e.g., in an in vitro transcription/translation
system or in a cell when the vector is introduced into the cell). The
term "regulatory sequence" is meant to include controllable
transcriptional promoters, operators, enhancers, transcriptional
terminators, and other expression control elements such as translational
control sequences (e.g., Shine-Dalgarno consensus sequence, initiation
and termination codons). These regulatory sequences will differ, for
example, depending on the cell being used.

[0116]The vectors can be autonomously replicated in a cell (episomal
vectors), or may be integrated into the genome of a cell, and replicated
along with the host genome (non-episomal mammalian vectors). Integrating
vectors typically contain at least one sequence homologous to the
bacterial chromosome that allows for recombination to occur between
homologous DNA in the vector and the bacterial chromosome. Integrating
vectors may also comprise bacteriophage or transposon sequences. Episomal
vectors, or plasmids, are circular double-stranded DNA loops into which
additional DNA segments can be ligated. Plasmids capable of stable
maintenance in a host are generally the preferred form of expression
vectors when using recombinant DNA techniques.

[0117]The expression constructs or vectors encompassed in the present
invention comprise a nucleic acid construct of the invention in a form
suitable for expression of the nucleic acid in a cell. Expression in
prokaryotic cells and plant cells is encompassed in the present
invention. It will be appreciated by those skilled in the art that the
design of the expression vector can depend on such factors as the choice
of the cell to be transformed, the level of expression of protein
desired, etc. The expression vectors of the invention can be introduced
into cells to thereby produce proteins or peptides, including fusion
proteins or peptides, encoded by nucleic acids as described herein (e.g.,
carbohydrate utilization-related or multidrug transporter proteins,
mutant forms of carbohydrate utilization-related or multidrug transporter
proteins, fusion proteins, etc.).

Bacterial Expression Vectors

[0118]Regulatory sequences include those that direct constitutive
expression of a nucleotide sequence as well as those that direct
inducible expression of the nucleotide sequence only under certain
environmental conditions. A bacterial promoter is any DNA sequence
capable of binding bacterial RNA polymerase and initiating the downstream
(3') transcription of a coding sequence (e.g., structural gene) into
mRNA. A promoter will have a transcription initiation region, which is
usually placed proximal to the 5' end of the coding sequence. This
transcription initiation region typically includes an RNA polymerase
binding site and a transcription initiation site. A bacterial promoter
may also have a second domain called an operator, which may overlap an
adjacent RNA polymerase binding site at which RNA synthesis begins. The
operator permits negatively regulated (inducible) transcription, as a gene
repressor protein may bind the operator and thereby inhibit transcription
of a specific gene. Constitutive expression may occur in the absence of
negative regulatory elements, such as the operator. In addition, positive
regulation may be achieved by a gene activator protein binding sequence,
which, if present, is usually proximal (5') to the RNA polymerase binding
sequence.

[0119]An example of a gene activator protein is the catabolite activator
protein (CAP), which helps initiate transcription of the lac operon in
Escherichia coli (Raibaud et al. (1984) Annu. Rev. Genet. 18:173).
Regulated expression may therefore be either positive or negative,
thereby either enhancing or reducing transcription. Other examples of
positive and negative regulatory elements are well known in the art.
Various promoters that can be included in the protein expression system
include, but are not limited to, a T7/LacO hybrid promoter, a trp
promoter, a T7 promoter, a lac promoter, and a bacteriophage lambda
promoter. Any suitable promoter can be used to carry out the present
invention, including the native promoter or a heterologous promoter.
Heterologous promoters may be constitutively active or inducible. A
non-limiting example of a heterologous promoter is given in U.S. Pat. No.
6,242,194.

[0121]In addition, synthetic promoters that do not occur in nature also
function as bacterial promoters. For example, transcription activation
sequences of one bacterial or bacteriophage promoter may be joined with
the operon sequences of another bacterial or bacteriophage promoter,
creating a synthetic hybrid promoter (U.S. Pat. No. 4,551,433). For
example, the tac (Amann et al. (1983) Gene 25:167; de Boer et al. (1983)
Proc. Natl. Acad. Sci. 80:21) and trc (Brosius et al. (1985) J. Biol.
Chem. 260:3539-3541) promoters are hybrid trp-lac promoters comprised of
both trp promoter and lac operon sequences that are regulated by the lac
repressor. The tac promoter has the additional feature of being an
inducible regulatory sequence. Thus, for example, expression of a coding
sequence operably linked to the tac promoter can be induced in a cell
culture by adding isopropyl-1-thio-β-D-galactoside (IPTG).
Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial
RNA polymerase and initiate transcription. A naturally occurring promoter
of non-bacterial origin can also be coupled with a compatible RNA
polymerase to produce high levels of expression of some genes in
prokaryotes. The bacteriophage T7 RNA polymerase/promoter system is an
example of a coupled promoter system (Studier et al. (1986) J. Mol. Biol.
189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074). In
addition, a hybrid promoter can also be comprised of a bacteriophage
promoter and an E. coli operator region (EPO Publication No. 267,851).

[0122]The vector may additionally contain a gene encoding the repressor
(or inducer) for that promoter. For example, an inducible vector of the
present invention may regulate transcription from the Lac operator (LacO)
by expressing the gene encoding the LacI repressor protein. Other
examples include the use of the lexA gene to regulate expression of
pRecA, and the use of trpO to regulate ptrp. Alleles of such genes that
increase the extent of repression (e.g., lacIq) or that modify the manner
of induction (e.g., lambda CI857, rendering lambda pL thermo-inducible,
or lambda CI+, rendering lambda pL chemo-inducible) may be employed.
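
The negative regulation described above, in which a repressor such as LacI binds the operator and an inducer such as IPTG relieves that repression, can be summarized as simple Boolean logic. The following is a deliberately simplified sketch, assuming an all-or-none response with no basal (leaky) expression; the function name is hypothetical.

```python
def transcription_on(repressor_present, inducer_added):
    """Return True when transcription from a repressor-controlled promoter proceeds."""
    # The repressor occupies the operator only when it is expressed
    # and no inducer (e.g., IPTG) is present to release it.
    repressor_on_operator = repressor_present and not inducer_added
    return not repressor_on_operator


# Enumerate the four possible states of the simplified system.
for repressor in (False, True):
    for inducer in (False, True):
        state = "on" if transcription_on(repressor, inducer) else "off"
        print(f"repressor={repressor!s:5} inducer={inducer!s:5} -> transcription {state}")
```

In this simplified picture, a cell lacking the repressor gene expresses the coding sequence constitutively, which is why a vector may carry the repressor gene itself.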

[0123]In addition to a functioning promoter sequence, an efficient
ribosome-binding site is also useful for the expression of the fusion
construct. In prokaryotes, the ribosome binding site is called the
Shine-Dalgarno (SD) sequence and includes an initiation codon (ATG) and a
sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of
the initiation codon (Shine et al. (1975) Nature 254:34). The SD sequence
is thought to promote binding of mRNA to the ribosome by the pairing of
bases between the SD sequence and the 3' end of bacterial 16S rRNA
(Steitz et al. (1979) "Genetic Signals and Nucleotide Sequences in
Messenger RNA," in Biological Regulation and Development: Gene Expression
(ed. R. F. Goldberger, Plenum Press, NY)).
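
The spacing constraint described above can be checked computationally. The following is a minimal sketch, assuming that the spacing of 3-11 nucleotides is counted from the end of the SD motif to the first base of the initiation codon, and assuming the motif AGGAGG as a representative SD sequence; the function name, the motif choice, and the toy sequence are illustrative assumptions only.

```python
import re


def find_sd_sites(dna, motif="AGGAGG", min_gap=3, max_gap=11):
    """Return (sd_start, atg_start, gap) for each motif followed by ATG within the gap range."""
    seq = dna.upper()
    hits = []
    # A lookahead pattern lets overlapping motif occurrences be found.
    for match in re.finditer(f"(?={motif})", seq):
        sd_end = match.start() + len(motif)
        for gap in range(min_gap, max_gap + 1):
            atg = sd_end + gap
            if seq[atg:atg + 3] == "ATG":
                hits.append((match.start(), atg, gap))
    return hits


# Hypothetical toy sequence: motif, six-nucleotide spacer, then ATG.
print(find_sd_sites("TTTAGGAGGTTTTTTATGAAA"))
```

Naturally occurring SD sequences vary in length and in their match to the consensus, so an exact-match scan of this kind will miss weaker sites.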

[0124]Carbohydrate utilization-related proteins can also be secreted from
the cell by creating chimeric DNA molecules that encode a protein
comprising a signal peptide sequence fragment that provides for secretion
of the carbohydrate utilization-related and multidrug transporter
polypeptides in bacteria (U.S. Pat. No. 4,336,336). The signal sequence
fragment typically encodes a signal peptide comprised of hydrophobic
amino acids that direct the secretion of the protein from the cell. The
protein is either secreted into the growth media (Gram-positive bacteria)
or into the periplasmic space, located between the inner and outer
membrane of the cell (Gram-negative bacteria). Preferably there are
processing sites, which can be cleaved either in vivo or in vitro,
encoded between the signal peptide fragment and the carbohydrate
utilization-related or multidrug transporter protein.

[0126]Typically, transcription termination sequences recognized by
bacteria are regulatory regions located 3' to the translation stop codon
and thus, together with the promoter, flank the coding sequence. These
sequences direct the transcription of an mRNA that can be translated into
the polypeptide encoded by the DNA. Transcription termination sequences
frequently include DNA sequences (of about 50 nucleotides) that are
capable of forming stem loop structures that aid in terminating
transcription. Examples include transcription termination sequences
derived from genes with strong promoters, such as the trp gene in E. coli
as well as other biosynthetic genes.
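
Sequences capable of forming stem loop structures contain an inverted repeat, i.e., a stretch followed, after a short loop, by its own reverse complement. The following is a minimal sketch that scans for perfect inverted repeats, assuming a fixed stem length of 6 base pairs and a loop of 3-8 nucleotides; these parameters, the function names, and the toy sequence are illustrative assumptions, and mismatches or G-U pairing in the RNA are ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_stem_loops(seq, stem=6, min_loop=3, max_loop=8):
    """Return (start, left_arm, loop, right_arm) for each perfect inverted repeat."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = seq[j:j + stem]
            # The right arm must be full length and pair with the left arm.
            if len(right) == stem and right == reverse_complement(left):
                hits.append((i, left, seq[i + stem:j], right))
    return hits


# Hypothetical toy sequence: GC-rich stem, four-nucleotide loop, T-rich tail.
print(find_stem_loops("TTGCCCGCTTTTGCGGGCTTTTTT"))
```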

[0127]The expression vectors will have a plurality of restriction sites
for insertion of the carbohydrate utilization-related or multidrug
transporter sequence so that it is under transcriptional regulation of
the regulatory regions. Selectable marker genes that ensure maintenance
of the vector in the cell can also be included in the expression vector.
Preferred selectable markers include those that confer resistance to
drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin
(neomycin), and tetracycline (Davies et al. (1978) Annu. Rev. Microbiol.
32:469). Selectable markers may also allow a cell to grow on minimal
medium, or in the presence of toxic metabolite and may include
biosynthetic genes, such as those in the histidine, tryptophan, and
leucine biosynthetic pathways.

[0128]The regulatory regions may be native (homologous), or may be foreign
(heterologous) to the cell and/or the nucleotide sequence of the
invention. The regulatory regions may also be natural or synthetic. Where
the region is "foreign" or "heterologous" to the cell, it is meant that
the region is not found in the native cell into which the region is
introduced. Where the region is "foreign" or "heterologous" to the
carbohydrate utilization-related or multidrug transporter nucleotide
sequence of the invention, it is meant that the region is not the native
or naturally occurring region for the operably linked carbohydrate
utilization-related or multidrug transporter nucleotide sequence of the
invention. For example, the region may be derived from phage. While it
may be preferable to express the sequences using heterologous regulatory
regions, native regions may be used. Such constructs would be expected in
some cases to alter expression levels of carbohydrate utilization-related
or multidrug transporter proteins in the cell. Thus, the phenotype of the
cell could be altered.

[0129]In preparing the expression cassette, the various DNA fragments may
be manipulated, so as to provide for the DNA sequences in the proper
orientation and, as appropriate, in the proper reading frame. Toward this
end, adapters or linkers may be employed to join the DNA fragments or
other manipulations may be involved to provide for convenient restriction
sites, removal of superfluous DNA, removal of restriction sites, or the
like. For this purpose, in vitro mutagenesis, primer repair, restriction,
annealing, resubstitutions, e.g., transitions and transversions, may be
involved.

[0130]The invention further provides a recombinant expression vector
comprising a DNA molecule of the invention cloned into the expression
vector in an antisense orientation. That is, the DNA molecule is operably
linked to a regulatory sequence in a manner that allows for expression
(by transcription of the DNA molecule) of an RNA molecule that is
antisense to carbohydrate utilization-related or multidrug transporter
mRNA. Regulatory sequences operably linked to a nucleic acid cloned in
the antisense orientation can be chosen to direct the continuous or
inducible expression of the antisense RNA molecule. The antisense
expression vector can be in the form of a recombinant plasmid or phagemid
in which antisense nucleic acids are produced under the control of a high
efficiency regulatory region, the activity of which can be determined by
the cell type into which the vector is introduced. For a discussion of
the regulation of gene expression using antisense genes see Weintraub et
al. (1986) Reviews--Trends in Genetics, Vol. 1(1).
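
The relationship between a coding sequence and the antisense RNA transcribed from it when cloned in the antisense orientation can be illustrated as follows. This is a minimal sketch, assuming the input is the sense (coding) strand written 5' to 3'; the function names and the six-nucleotide example are hypothetical.

```python
def sense_mrna(coding_dna):
    """Return the mRNA corresponding to the sense (coding) strand, 5' to 3'."""
    return coding_dna.upper().replace("T", "U")


def antisense_rna(coding_dna):
    """Return the RNA, 5' to 3', transcribed from the insert in antisense orientation."""
    # Complement each base as RNA, then reverse to read 5' to 3'.
    complement = str.maketrans("ACGT", "UGCA")
    return coding_dna.upper().translate(complement)[::-1]


coding = "ATGGCC"  # hypothetical fragment, not a sequence of the invention
print(sense_mrna(coding))     # sense transcript
print(antisense_rna(coding))  # antisense transcript, base-pairs with the sense mRNA
```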

[0131]Alternatively, some of the above-described components can be put
together in transformation vectors. Transformation vectors are typically
comprised of a selectable marker that is either maintained in a replicon
or developed into an integrating vector, as described above.

Plant Expression Vectors

[0132]For expression in plant cells, the expression cassettes will
comprise a transcriptional initiation region operably linked to a
nucleotide sequence of the present invention. Various restriction sites
may be included in these expression vectors to enable insertion of the
nucleotide sequence under the transcriptional regulation of the
regulatory regions. Additionally, the expression cassette may contain
selectable marker genes, including those genes that provide herbicide or
antibiotic resistance, such as tetracycline resistance, hygromycin
resistance, ampicillin resistance, or glyphosate resistance.

[0133]The expression cassette will include in the 5'-to-3' direction of
transcription, a transcriptional and translational initiation region, a
nucleotide sequence of the invention, and a transcriptional and
translational termination region (i.e., termination region) functional in
plants. The termination region may be native with the transcriptional
initiation region comprising the promoter nucleotide sequence, may be
native with the nucleotide sequence of the invention, or may be derived
from another source. Convenient termination regions are known in the art
and include, but are not limited to, a termination region from the
Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline
synthase termination regions. See also, Guerineau et al. (1991) Mol. Gen.
Genet. 262:141-144; Proudfoot (1991) Cell 64:671-674; Sanfacon et al.
(1991) Genes Dev. 5:141-149; Mogen et al. (1990) Plant Cell 2:1261-1272;
Munroe et al. (1990) Gene 91:151-158; Ballas et al. (1989) Nucleic Acids
Res. 17:7891-7903; and Joshi et al. (1987) Nucleic Acid Res.
15:9627-9639.

[0134]The expression cassette comprising a nucleotide sequence of the
present invention may also contain at least one additional nucleotide
sequence for a gene to be cotransformed into the organism. Alternatively,
the additional sequence(s) may be provided on another expression
cassette.

[0136]Other methods known to enhance translation and/or mRNA stability can
also be utilized, for example, introns, such as the maize ubiquitin
intron (Christensen and Quail (1996) Transgenic Res. 5:213-218 and
Christensen et al. (1992) Plant Molecular Biology 18:675-689) or the
maize AdhI intron (Kyozuka et al. (1991) Mol. Gen. Genet. 228:40-48 and
Kyozuka et al. (1990) Maydica 35:353-357), and the like. Various intron
sequences have been shown to enhance expression, particularly in
monocotyledonous cells. The introns of the maize AdhI gene have been
found to significantly enhance the expression of the wild-type gene under
its cognate promoter when introduced into maize cells. Intron 1 was found
to be particularly effective and enhanced expression in fusion constructs
with the chloramphenicol acetyltransferase gene (Callis et al. (1987)
Genes Develop. 1:1183-1200). In the same experimental system, the intron
from the maize bronze1 gene had a similar effect in enhancing expression.
The AdhI intron has also been shown to enhance CAT expression 12-fold
(Mascarenhas et al. (1990) Plant Mol. Biol. 6:913-920). Intron sequences
have routinely been incorporated into plant transformation vectors,
typically within the non-translated leader.

[0137]The expression cassette comprising a promoter sequence of the
present invention may additionally contain a 3' non-coding sequence. A
"3' non-coding sequence" or "3' non-translated region" refers to a
nucleotide sequence located 3' (downstream) to a coding sequence and
includes polyadenylation signal sequences and other sequences encoding
regulatory signals capable of affecting the addition of polyadenylic acid
tracts to the 3' end of the mRNA precursor. A 3' non-translated region
comprises a region of the mRNA generally beginning with the translation
termination codon and extending at least beyond the polyadenylation site.
Non-translated sequences located in the 3' end of a gene have been found
to influence gene expression levels. Ingelbrecht et al. (see, Plant Cell,
1:671-680, 1989) evaluated the importance of these elements and found
large differences in expression in stable plants depending on the source
of the 3' non-translated region. Using 3' non-translated regions
associated with octopine synthase, 2S seed protein from Arabidopsis,
small subunit of rbcS from Arabidopsis, extensin from carrot, and
chalcone synthase from Antirrhinum, a 60-fold difference was observed
between the best-expressing construct (which contained the rbcS 3'
non-translated region) and the lowest-expressing construct (which
contained the chalcone synthase 3' region).

[0138]Transcription levels may also be increased by the utilization of
enhancers in combination with the promoter regions of the invention.
Enhancers are nucleotide sequences that act to increase the expression of
a promoter region. Enhancers are known in the art and include the SV40
enhancer region, the 35S enhancer element, and the like.

[0139]In preparing the expression cassette, the various DNA fragments may
be manipulated, so as to provide for the DNA sequences in the proper
orientation and, as appropriate, in the proper reading frame. Adapters or
linkers may be employed to join the DNA fragments or other manipulations
may be involved to provide for convenient restriction sites. Restriction
sites may be added or removed, superfluous DNA may be removed, or other
modifications may be made to the sequences of the invention. For this
purpose, in vitro mutagenesis, primer repair, restriction, annealing,
resubstitutions, for example, transitions and transversions, may be
involved.

[0141]The nucleic acids of the present invention are useful in methods
directed to expressing a nucleotide sequence in a plant. This may be
accomplished by transforming a plant cell of interest with an expression
cassette comprising a promoter operably linked to a nucleotide sequence
identified herein, and regenerating a stably transformed plant from said
plant cell. The expression cassette comprising the promoter sequence
operably linked to a nucleotide sequence of the present invention can be
used to transform any plant. In this manner, genetically modified, i.e.,
transgenic or transformed, plants, plant cells, plant tissue, seed, root,
and the like can be obtained.

Microbial or Bacterial Cells

[0142]The production of bacteria containing heterologous phage resistance
genes, the preparation of starter cultures of such bacteria, and methods
of fermenting substrates, particularly food substrates such as milk, may
be carried out in accordance with known techniques.

[0143]By "introducing" as it pertains to nucleic acids is meant
introduction into prokaryotic or eukaryotic cells via conventional
transformation or transfection techniques, or by phage-mediated
infection. As used herein, the terms "transformation," "transduction,"
"conjugation," and "protoplast fusion" are meant to refer to a variety of
art-recognized techniques for introducing foreign nucleic acid (e.g.,
DNA) into a cell, including calcium phosphate or calcium chloride
co-precipitation, DEAE-dextran-mediated transfection, lipofection, or
electroporation. Suitable methods for transforming or transfecting cells
can be found in Sambrook et al. (1989) Molecular Cloning: A Laboratory
Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.) and
other laboratory manuals. By "introducing" as it pertains to polypeptides
or microorganisms of the invention, is meant introduction into a host by
ingestion, topical application, nasal, suppository, urogenital, or oral
application of the polypeptide or microorganism.

[0145]As used herein, the terms "transformed plant" and "transgenic plant"
refer to a plant that comprises within its genome a heterologous
polynucleotide. Generally, the heterologous polynucleotide is stably
integrated within the genome of a transgenic or transformed plant such
that the polynucleotide is passed on to successive generations. The
heterologous polynucleotide may be integrated into the genome alone or as
part of a recombinant expression cassette. It is to be understood that as
used herein the term "transgenic" includes any cell, cell line, callus,
tissue, plant part, or plant the genotype of which has been altered by
the presence of heterologous nucleic acid including those transgenics
initially so altered as well as those created by sexual crosses or
asexual propagation from the initial transgenic. The term "transgenic" as
used herein does not encompass the alteration of the genome (chromosomal
or extra-chromosomal) by conventional plant breeding methods or by
naturally occurring events such as random cross-fertilization,
non-recombinant viral infection, non-recombinant bacterial
transformation, non-recombinant transposition, or spontaneous mutation.

[0146]A transgenic "event" is produced by transformation of plant cells
with a heterologous DNA construct, including a nucleic acid expression
cassette that comprises a transgene of interest, the regeneration of a
population of plants resulting from the insertion of the transgene into
the genome of the plant, and selection of a particular plant
characterized by insertion into a particular genome location. An event is
characterized phenotypically by the expression of the transgene. At the
genetic level, an event is part of the genetic makeup of a plant. The
term "event" also refers to progeny produced by a sexual outcross between
the transformant and another variety that includes the heterologous DNA.

[0147]As used herein, the term "plant" includes reference to whole plants,
plant organs (e.g., leaves, stems, roots, etc.), seeds, plant cells, and
progeny of same. Parts of transgenic plants within the scope of the
invention are to be understood to comprise, for example, plant cells,
protoplasts, tissues, callus, embryos as well as flowers, stems, fruits,
ovules, leaves, or roots originating in transgenic plants or their
progeny previously transformed with a DNA molecule of the invention, and
therefore consisting at least in part of transgenic cells.

[0148]As used herein, the term "plant cell" includes, without limitation,
seeds, suspension cultures, embryos, meristematic regions, callus tissue,
leaves, roots, shoots, gametophytes, sporophytes, pollen, and
microspores. The class of plants that can be used in the methods of the
invention is generally as broad as the class of higher plants amenable to
transformation techniques, including both monocotyledonous and
dicotyledonous plants.

[0151]The methods of the invention do not depend on a particular method
for introducing a nucleotide construct to a plant, only that the
nucleotide construct gains access to the interior of at least one cell of
the plant. Methods for introducing nucleotide constructs into plants are
known in the art including, but not limited to, stable transformation
methods, transient transformation methods, and virus-mediated methods.

[0152]By "transient transformation" it is meant that a nucleotide
construct introduced into a plant does not integrate into the genome of
the plant. By "stable transformation" it is meant that the nucleotide
construct introduced into a plant integrates into the genome of the plant
and is capable of being inherited by progeny thereof. "Primary
transformant" and "T0 generation" transgenic plants are of the same
genetic generation as the tissue that was initially transformed (i.e.,
not having gone through meiosis and fertilization since transformation).
"Secondary transformants" and "T1, T2, T3, and subsequent generations"
refer to transgenic plants derived from primary transformants through one
or more meiotic and fertilization cycles. They may be derived by
self-fertilization of primary or secondary transformants or crosses of
primary or secondary transformants with other transformed or
untransformed plants.

[0154]The transformed cells may be grown into plants with methods known in
the art. See, for example, McCormick et al. (1986) Plant Cell Reports
5:81-84. These plants may then be grown, and either pollinated with the
same transformed strain or different strains, and the resulting hybrid
having expression of the desired phenotypic characteristic identified.
Two or more generations may be grown to ensure that expression of the
desired phenotypic characteristic is stably maintained and inherited and
then seeds may be harvested to ensure expression of the desired
phenotypic characteristic has been achieved. Thus as used herein,
"transformed seeds" refers to seeds that contain the nucleotide construct
stably integrated into the plant genome.

Methods of Use

[0155]Methods are provided for modifying expression of carbohydrate
utilization-related or multidrug transporter genes or proteins of an
organism. In one embodiment, properties of microorganisms used in
fermentation are modified to provide strains able to utilize alternative
carbohydrates for energy or carbon sources. These modifications may
result in a new ability to synthesize, transport, accumulate, or degrade
a carbohydrate. Alternatively, these modifications may result in the
ability to survive contact with antimicrobial polypeptides, including
antibiotics and toxins. These new abilities may also allow the
microorganisms to better survive stressful conditions, such as the
digestive tract or those found during food processing and storage, which
will increase the utility of these microorganisms in fermenting various
foods, as well as allowing them to provide longer-lasting probiotic
activity after ingestion. These new abilities may also allow the
microorganisms to generate different flavors or textures in a product
upon fermentation. In addition, the new abilities may enable a bacterium
to produce a modified carbohydrate, exopolysaccharide, or cell surface
polysaccharide. In another embodiment, the properties of plants are
modified to provide similar abilities. These abilities are provided by
the nucleotide and amino acid sequences disclosed in the present
invention.

[0156]In general, the methods comprise introducing or overexpressing one
or more proteins involved in carbohydrate utilization or multidrug
resistance. By "introducing" is meant that the protein of interest is
expressed in a modified cell when it was not expressed in an unmodified
cell. By "overexpressing" is meant that the protein of interest is
expressed in an increased amount in the modified organism compared to its
production in the unmodified wild-type organism. Homofermentative lactic
acid bacteria, in particular, have a relatively simple metabolism, with
almost no overlap between energy metabolism and biosynthesis metabolism,
making them ideal targets for metabolic engineering (Hugenholz and
Kleerebezem (1999) Current Opin. Biotech. 10:492-497). The expression of
bacterial genes in plants is well known in the art. See, for example,
Shewmaker et al. (1994) Plant Physiol. 104:1159-1166; Shen et al. (1997)
Plant Physiol. 113:1177-1183; Blaszczyk et al. (1999) Plant J.
20:237-243.

[0157]Expression of one or more carbohydrate utilization-related or
multidrug transporter proteins may allow for an organism to have a
modified ability to transport a carbohydrate or an antimicrobial
polypeptide such as a bacteriocin into or out of a cell.
Transport-related carbohydrate utilization proteins or multidrug
transporter proteins comprise ABC transporter system components including
substrate-binding proteins (for example HisJ and MalE),
membrane-associated components such as permeases (for example LacF and
LacG), and cytoplasmic proteins such as ATP-binding proteins (for example
msmK). Transport-related carbohydrate utilization proteins or multidrug
transporter proteins also comprise secondary transport system proteins
such as those in the major facilitator superfamily (MFS) and the
glycoside/pentoside/hexuronide family. Group translocation system
proteins are also included, including enzyme I, enzyme II, and HPr
proteins.

[0158]Methods are known in the art for cloning and expressing carbohydrate
utilization-related proteins in microorganisms and plants, and for
assessing function of those proteins (see, for example, de Vos (1996)
Antonie van Leeuwenhoek 70:223-242; Yeo et al. (2000) Mol. Cells.
10:263-268; Goddijn et al. (1997) Plant Physiol. 113:181-190). Function
for primary and secondary transport system-related proteins may be
assessed, for example, by enzyme assays, fermentation assays, and
transport assays. Function for group translocation system-related
proteins may be assessed, for example, by sugar phosphorylation assays.
See, for example, Russell et al. (Russell et al. (1992) J. Biol. Chem.
267:4631-4637), where genes from a primary transport system (msm) in
Streptococcus mutans are identified and expressed in E. coli;
Leong-Morgenthaler et al. (Leong-Morgenthaler et al. (1991) J. Bacteriol.
173:1951-1957), where two genes from a secondary transport system
(lactose) from Lactobacillus bulgaricus were cloned and expressed in E.
coli; Vaughan et al. (Vaughan et al. (1996) Appl. Env. Microbiol.
62:1574-1582), where a secondary transport system (lacS) gene from
Leuconostoc lactis was cloned and expressed in E. coli; de Vos et al. (de
Vos et al. (1990) J. Biol. Chem. 265:22554-22560), where two PTS system
genes from Lactococcus lactis were identified, cloned and expressed in E.
coli and Lactobacillus lactis; Sato et al. (Sato et al. (1989) J.
Bacteriol. 171:263-271), where the scrA gene from Streptococcus mutans
was cloned into E. coli and found to exhibit sucrose PTS activity; Alpert
and Chassy (Alpert and Chassy (1990) J. Biol. Chem. 265:22561-22568),
where the gene coding for the lactose-specific Enzyme II of Lactobacillus
casei was cloned and expressed in E. coli; Boyd et al. (Boyd et al.
(1994) Infect. Immun. 62:1156-1165), where the genes that encode HPr and
Enzyme I of the PTS transport system of Streptococcus mutans were cloned
and expressed in E. coli; Garg et al. (Garg et al. (2002) Proc. Natl.
Acad. Sci. USA 99:15898-15903), where the overexpression of E. coli
trehalose biosynthetic genes otsA and otsB led to increased tolerance of
the transgenic plants to abiotic stress, and enhanced productivity; and
Grinius and Goldberg (Grinius and Goldberg (1994) J. Biol. Chem.
269:29998-30004), where a multidrug resistance protein was expressed and
demonstrated to function as a drug pump.

[0159]Expression of one or more carbohydrate utilization-related proteins
may allow for an organism to have a modified ability to accumulate a
carbohydrate in the cytoplasm of a cell. For example, introducing or
overexpressing an enzyme involved in sugar catabolism without expressing
a relevant transport protein may lead to an accumulation of that
carbohydrate in the cytoplasm. Alternatively, introduction or
overexpression of a carbohydrate transport-related protein may lead to
enhanced transport of the carbohydrate into the external environment.
Methods are known in the art for introducing or expressing
carbohydrate-related genes in organisms. Accumulation of a carbohydrate
in a cell may be assessed, for example, by chromatographic methods or
enzyme assays. See, for example, Chaillou et al. (1998) J. Bacteriol.
180:4011-4014 and Goddijn et al. (1997) supra.

[0160]Expression of one or more carbohydrate utilization-related proteins
may allow for an organism to have a modified ability to utilize or
produce a carbohydrate as an energy source. Methods are known in the art
for cloning and expressing carbohydrate utilization-related proteins in
organisms, and for assessing function of those proteins (see, for
example, de Vos (1996) Antonie van Leeuwenhoek 70:223-242; Hugenholz et
al. (2002) Antonie van Leeuwenhoek 82:217-235). For example, the genes
for lactose metabolism may be introduced into a bacterium to improve the
utilization of lactose, and to produce a product more acceptable to
lactose-intolerant people (Hugenholz et al. (2002) supra). Further
modifications may be made in these modified bacteria, such as blocking
glucose metabolism so that glucose is not degraded, but is released from
the cell into the medium, thereby providing natural sweetness. See, for
example, Hugenholz et al. (2002) supra. Alternatively, the genes for
galactose metabolism as well as the gene for α-phosphoglucomutase
may be introduced, to improve the galactose-fermenting capability of the
microorganism, thereby aiding in preventing the consumption of high
levels of galactose, which could cause health problems (Hugenholz et al.
(2002) supra; Hirasuka and Li (1992) J. Stud. Alcohol 62:397-402). One
gene associated with galactose metabolism is α-galactosidase, the
expression of which may be useful for removing raffinose-type sugars from
fermented products, since monogastric animals cannot degrade them
(Hugenholz et al. (2002) supra). Expression of the bacterial gene for
mannitol-1-phosphate dehydrogenase (mtlD) in tobacco plants successfully
resulted in the synthesis and accumulation of mannitol (Tarczynski et al.
(1992) Proc. Natl. Acad. Sci. USA 89:2600-2604).

[0162]Generally, permeases, membrane-associated enzymes, and regulators
such as transcriptional repressors or antiterminators may need to be
expressed in the cell for optimal utilization of a carbohydrate. The
function of transcriptional antiterminators may be assayed by
antitermination activity in a reporter system (see, for example, Alpert
and Siebers (1997) J. Bacteriol. 179:1555-1562). The function of
repressors such as lacR may be assessed by enzyme activity or growth
assays (see, for example, van Rooijen et al. (1993) Protein Eng.
6:201-206; van Rooijen and de Vos (1990) J. Biol. Chem. 265:18499-18503).

[0163]The sequences of the present invention may also modify the ability
of an organism to alter the flavor or texture of a food product.
Modification of glucose metabolism to produce alternative sugars is one
approach that may lead to altered flavor or textural characteristics.
Disruption of the lactate dehydrogenase gene with the concomitant
expression of genes from the mannitol or sorbitol operons results in the
production of mannitol and sorbitol (Hugenholz et al. (2002) supra).
Diacetyl production during fermentation results in a butter aroma, which
can be enhanced by either disruption of lactate dehydrogenase or
overexpression of NADH oxidase in combination with disruption of
α-acetolactate decarboxylase (Hugenholz and Kleerebezem (1999)
supra; Hugenholtz et al. (2000) Appl. Environ. Microbiol. 66:4112-4114).
Alternatively, overproduction of α-acetolactate synthase or
acetohydroxy acid synthase with disruption of α-acetolactate
decarboxylase has resulted in increased diacetyl production (Swindell et
al. (1996) Appl. Environ. Microbiol. 62:2641-2643; Platteeuw et al.
(1995) Appl. Environ. Microbiol. 61:3967-3971). Overexpression of alanine
dehydrogenase results in the production of alanine instead of lactic
acid, providing a taste-enhancer and sweetener in fermented foods (Hols
et al. (1999) Nat. Biotechnol. 17:588-592).

[0164]Methods for modifying the ability of an organism to produce a
modified carbohydrate are also encompassed, comprising introducing at
least one nucleotide sequence of the present invention into an organism.
Methods for producing modified carbohydrates are also encompassed, and
comprise contacting a carbohydrate to be modified with a polypeptide of
the present invention. Methods are known in the art for producing
modified carbohydrates. See, for example, Kim et al. (2001) Biotechnol.
Prog. 17:208-210.

[0165]The sequences of the current invention may also modify the ability
of an organism to survive in a food system or the gastrointestinal tract
of a mammal, or modify an organism's stability and survival during food
processing and storage. For example, increased production of trehalose
may result in prolonged freshness and taste of a fermented product (see,
for example, www.nutracells.com). Trehalose also may aid in the
prevention of diseases that result from protein aggregation or
pathological conformations of proteins, such as Creutzfeldt-Jakob disease.
In plants, accumulation of trehalose leads to protection against
environmental stresses such as drought, salt, and cold (see, for example,
Jang et al. (2003) Plant Physiol. 131:516-524; Penna (2003) Trends Plant
Sci. 8:355-357; Garg et al. (2002) Proc. Natl. Acad. Sci. USA
99:15898-15903; Yeo et al. (2000) supra). In addition, plants have been
transformed with fructosyltransferase genes, which enabled the plant to
accumulate fructans to a high level (van der Meer et al. (1994) Plant
Cell 6:561-570). In addition to having a role as a carbohydrate reserve,
fructans may also provide tolerance to dry and cold conditions (Pontis
and del Campillo (1985) "Fructans" in Biochemistry of Storage
Carbohydrates in Green Plants, Day and Dixon, eds. (London: Academic
Press), pp. 810-816; Pilon-Smits et al. (1995) Plant Physiol.
107:125-130). The bacterial gene mannitol-1-phosphate dehydrogenase has
also been expressed in plants, resulting in the production of mannitol,
which is thought to confer beneficial traits including osmoregulation and
neutralization of hydroxyl radicals (Tarczynski et al. (1992) supra).

[0166]The multidrug transporter sequences of the invention may allow an
organism to survive contact with an antimicrobial polypeptide or other
toxin. This may be due to an increased ability to transport a drug or
toxin out of the cell.

[0167]Variants of these nucleotide sequences are also encompassed, such as
those that retain or modify the ability to transport a carbohydrate or
toxin into or out of a cell, and those that retain or modify the ability
to accumulate or utilize a carbohydrate. Methods for making and testing
variants of carbohydrate utilization-related or multidrug transporter
proteins are well known in the art. See, for example, Poolman et al.
(Poolman et al. (1996) Mol. Microbiol. 19:911-911), where variants of
secondary transport system proteins (melibiose and lactose) with altered
substrate specificities were isolated or constructed and tested. In these
mutants, sugar transport is uncoupled from cation symport. See also, for
example, Djorovevic et al. (2001) supra, where mutant HPr proteins were
constructed with altered regulatory activity; and Adams et al. (Adams et
al. (1994) J. Biol. Chem. 269:5666-5672), where cold-sensitive variants
of the β-galactosidase gene from Lactobacillus delbrueckii subsp.
bulgaricus were generated and characterized. These mutated genes had a
reduced Vmax at low temperatures and therefore may be useful in
preventing the acidification of fermented products during cold storage
(Mainzer et al. (1990) "Pathway engineering of Lactobacillus bulgaricus
for improved yoghurt," in Yoghurt: Nutritional and Health Properties,
Chandan, ed., (National Yoghurt Association, Virginia, US), pp. 41-55).
See also Bettenbrock et al. (Bettenbrock et al. (1999) J. Bacteriol.
181:225-230), where mutants with modified galactose-specific PTS genes
were isolated. See also, van Rooijen et al. (1990) supra, where variants
of the lacR repressor were isolated that had no effect on activity. See
also Kroetz et al., where polymorphism of the human MDR1 gene was
analyzed (Kroetz et al. (2003) Pharmacogenetics 13:481-94), and Mitomo et
al., where variants of the ABC transporter ABCG2 were analyzed (Mitomo et
al. (2003) Biochem. J. 373:767-74).

[0168]Any of the above modifications may be combined with other metabolic
alterations that have been engineered or suggested in lactic acid
bacteria. These include B-vitamin production, such as folate (B11),
riboflavin (B2), or cobalamin (B12); the production of polyols, or
low-calorie sugars, that could replace sucrose, lactose, glucose, or
fructose as sweeteners; the production of tagatose, another sucrose
replacement; the production of various exopolysaccharides; blocking
glucose metabolism to provide a natural sweetening effect; reduced
production of galactose; production of foods with lower levels of
α-galactosides such as stachyose and raffinose; and increased
production of trehalose, which has preserving properties for foodstuffs
and is potentially involved in disease prevention (Hugenholz et al.
(2002) supra; van Rooijen et al. (1991) J. Biol. Chem. 266:7176-7178).

[0169]Methods are also provided for eliminating or modifying undesirable
carbohydrates from a food or chemical product. The methods comprise
contacting the product with a purified polypeptide of the present
invention. Methods to assay for the elimination or modification of
carbohydrates are well known in the art.

[0170]The following examples are offered by way of illustration and not by
way of limitation.

EXAMPLE 1

Gapped BlastP Results for Amino Acid Sequences

[0171]A Gapped BlastP sequence alignment showed that SEQ ID NO:2 (144
amino acids) has about 61% identity from amino acids 1-140 with a protein
from Listeria innocua that is homologous to a PTS system mannose-specific
factor IIAB (Accession Nos. NP_469488.1; NC_003212), about 60%
identity from amino acids 1-140 with a protein from Listeria
monocytogenes that is homologous to a PTS system mannose-specific factor
IIAB (Accession Nos. NP_463629.1; NC_003210), about 63%
identity from amino acids 1-139 with a protein from Clostridium
acetobutylicum that is a mannose-specific phosphotransferase system
component IIAB (Accession Nos. NP_149230.1; NC_001988), about
62% identity from amino acids 1-139 with a protein from Clostridium
perfringens that is a PTS system protein (Accession Nos.
NP_561737.1; NC_003366), and about 50% identity from amino
acids 2-141 with a protein from Streptococcus pyogenes that is a
mannose-specific phosphotransferase system component IIAB (Accession Nos.
NP_269761.1; NC_002737).
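
By way of illustration only, the percent identity values reported in this Example may be understood as the number of identical aligned residues divided by the length of the aligned region, expressed as a percentage. The following is a minimal sketch under that assumption; the function name percent_identity and the toy sequences are hypothetical, are not taken from any SEQ ID NO, and do not reproduce the Gapped BlastP program or its scoring parameters.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent identity over a pairwise alignment.

    Assumption: both strings are already aligned (same length, '-' marks a
    gap) and gap columns count toward the alignment length.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    # A column is an identity only when both residues match and neither is a gap.
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


# Hypothetical toy alignment: 7 identical columns out of 9.
print(round(percent_identity("MKT-AYIAK", "MKTQAYLAK"), 1))  # prints 77.8
```

Other conventions, such as excluding gap columns from the denominator, give slightly different values, which is one reason the identities in this Example are stated as approximate.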

[0172]A Gapped BlastP sequence alignment showed that SEQ ID NO:4 (123
amino acids) has about 60% identity from amino acids 20-109 with a
protein from Listeria innocua that is homologous to a phosphotransferase
system (PTS) lichenan-specific enzyme IIA component (Accession Nos.
NP_471165.1; NC_003212), about 57% identity from amino acids
20-110 with a protein from Listeria innocua that is homologous to a
cellobiose phosphotransferase enzyme IIA component (Accession Nos.
NP_472161.1; NC_003212), about 46% identity from amino acids
1-112 with a protein from Lactococcus lactis subsp. lactis that is a
cellobiose-specific PTS system IIA component (EC 2.7.1.69) (Accession
Nos. NP_266570.1; NC_002662), about 44% identity from amino
acids 9-112 with a protein from Bacillus halodurans that is a PTS system,
cellobiose-specific enzyme IIA component (Accession Nos.
NP_241776.1; NC_002570), and about 51% identity from amino
acids 16-112 with a protein from Streptococcus pyogenes that is
homologous to a PTS enzyme III (Accession Nos. NP_607437.1;
NC_003485).

[0173]A Gapped BlastP sequence alignment showed that SEQ ID NO:6 (161
amino acids) has about 53% identity from amino acids 6-143 with a protein
from Enterococcus faecium that is a beta-glucoside specific transport
protein (BglS) (Accession Nos. gb|AAD28228.1; AF121254), about 48%
identity from amino acids 13-159 with a protein from Streptococcus
pneumoniae that is a PTS system, IIABC component (Accession Nos.
NP_345256.1; NC_003028), about 48% identity from amino acids
13-159 with a protein from Streptococcus pneumoniae that is a PTS
glucose-specific enzyme IIABC component (Accession Nos.
NP_358262.1; NC_003098), about 46% identity from amino acids
13-159 with a protein from Streptococcus pyogenes that is homologous to a
PTS system, enzyme IIA component (Accession Nos. NP_608025.1;
NC_003485), and about 46% identity from amino acids 13-159 with a
protein from Streptococcus pyogenes that is homologous to a PTS system,
enzyme IIA component (Accession Nos. NP_269950.1; NC_002737).

[0174]A Gapped BlastP sequence alignment showed that SEQ ID NO:8 (291
amino acids) has about 36% identity from amino acids 11-282 with a
protein from Bacillus subtilis that is a transcription antiterminator
(licT) (Accession No. sp|P39805|LICT_BACSU), about 36% identity from
amino acids 11-282 with a protein from Bacillus subtilis that is a
transcriptional antiterminator (BglG family) (Accession Nos.
NP_391787.1; NC_000964), about 37% identity from amino acids
11-282 with a protein from Escherichia coli that is involved in positive
regulation of the bgl operon (Accession Nos. NP_418179.1;
NC_000913), about 33% identity from amino acids 11-282 with a
protein from Erwinia chrysanthemi that is a beta-glucoside operon
antiterminator (Accession No. sp|P2621|ARBG_ERWCH), and about 34%
identity from amino acids 9-288 with a protein from Clostridium
acetobutylicum that is a transcriptional antiterminator (licT) (Accession
Nos. NP_347062.1; NC_003030).

[0175]A Gapped BlastP sequence alignment showed that SEQ ID NO:10 (480
amino acids) has about 59% identity from amino acids 8-473 with a protein
from Listeria monocytogenes that is homologous to a
phospho-beta-glucosidase (Accession Nos. NP_463849.1;
NC_003210), about 58% identity from amino acids 8-473 with a
protein from Listeria innocua that is homologous to a
phospho-beta-glucosidase (Accession Nos. NP_469689.1;
NC_003212), about 57% identity from amino acids 7-473 with a
protein from Clostridium acetobutylicum that is a
6-phospho-beta-glucosidase (Accession Nos. NP_347379.1; NC_003030), about
57% identity from amino acids 8-473 with a protein from Clostridium
longisporum that is a 6-phospho-beta-glucosidase (Accession No.
sp|Q46130|ABGA_CLOLO), and about 55% identity from amino acids 1-473 with
a protein from Bacillus subtilis that is a beta-glucosidase (Accession
Nos. NP_391805.1; NC_000964).

[0176]A Gapped BlastP sequence alignment showed that SEQ ID NO:12 (625
amino acids) has about 38% identity from amino acids 1-624 with a protein
from Streptococcus pyogenes that is a beta-glucoside permease IIABC
component (Accession Nos. NP_268836.1; NC_002737), about 38%
identity from amino acids 1-624 with a protein from Streptococcus
pyogenes that is a beta-glucoside permease IIABC component (Accession
Nos. NP_606826.1; NC_003485), about 38% identity from amino
acids 1-605 with a protein from Streptococcus pneumoniae that is a
phosphotransferase system sugar-specific EII component (Accession Nos.
NP_358099.1; NC_003098), about 38% identity from amino acids
1-605 with a protein from Streptococcus pneumoniae that is a PTS system,
beta-glucosides-specific IIABC component (Accession Nos.
NP_345091.1; NC_003028), and about 38% identity from amino
acids 1-622 with a protein from Bacillus halodurans that is a PTS system,
beta-glucoside-specific enzyme IIABC component (Accession Nos.
NP_241162.1; NC_002570).

[0177]A Gapped BlastP sequence alignment showed that SEQ ID NO:14 (675
amino acids) has about 50% identity from amino acids 17-648 with a
protein from Clostridium acetobutylicum that is a PTS system,
beta-glucosides-specific IIABC component (Accession Nos.
NP_348035.1; NC_003030), about 50% identity from amino acids
17-656 with a protein from Bacillus halodurans that is a PTS system,
beta-glucoside-specific enzyme IIABC (Accession Nos. NP_241461.1;
NC_002570), about 50% identity from amino acids 17-656 with a
protein from Listeria monocytogenes that is homologous to a PTS system,
beta-glucosides specific enzyme IIABC (Accession Nos. NP_463560.1;
NC_003210), about 48% identity from amino acids 17-654 with a
protein from Clostridium longisporum that is a PTS-dependent enzyme II
(Accession Nos. gb|AAC05713.1; L49336), and 48% identity from amino acids
13-654 with a protein from Streptococcus mutans that is a
beta-glucoside-specific EII permease (Accession Nos. gb|AAF89975.1;
AF206272).

[0178]A Gapped BlastP sequence alignment showed that SEQ ID NO:16 (445
amino acids) has about 41% identity from amino acids 10-443 with a
protein from Bacillus subtilis that is a phosphotransferase system (PTS)
protein, lichenan-specific enzyme IIC component (Accession Nos.
NP_391737.1; NC_000964), about 42% identity from amino acids
14-442 with a protein from Bacillus subtilis that is homologous to a PTS
system IIBC component (ywbA) (Accession No. sp|P39584|YWBA_BACSU), about
41% identity from amino acids 14-441 with a protein from Bacillus
stearothermophilus that is a cellobiose phosphotransferase enzyme IIC
component (Accession No. sp|Q45400|PTCC_BACST), about 41% identity from
amino acids 12-441 with a protein from Streptococcus pneumoniae that is a
phosphotransferase system sugar-specific EII component (Accession Nos.
NP_358015.1; NC_003098), and 40% identity from amino acids
12-441 with a protein from Streptococcus pneumoniae that is a PTS system,
cellobiose-specific IIC component (Accession Nos. NP_344993.1;
NC_003028).

[0179]A Gapped BlastP sequence alignment showed that SEQ ID NO:18 (422
amino acids) has about 34% identity from amino acids 9-417 with a protein
from Bacillus subtilis that is homologous to a phosphotransferase system
enzyme II (Accession Nos. NP_391718.1; NC_000964), about 33%
identity from amino acids 17-414 with a protein from Bacillus subtilis
that is a phosphotransferase system (PTS) lichenan-specific enzyme IIC
component (Accession Nos. NP_391737.1; NC_000964), about 34%
identity from amino acids 10-417 with a protein from Bacillus
stearothermophilus that is a cellobiose phosphotransferase enzyme IIC
component (Accession No. sp|Q45400|PTCC_BACST), about 33% identity from
amino acids 9-414 with a protein from Listeria innocua that is homologous
to a PTS system, cellobiose-specific IIC component (Accession Nos.
NP_470241.1; NC_003212), and 31% identity from amino acids
11-415 with a protein from Borrelia burgdorferi that is a PTS system,
cellobiose-specific IIC component (celB) (Accession Nos.
NP_046990.1; NC_001903).

[0180]A Gapped BlastP sequence alignment showed that SEQ ID NO:20 (130
amino acids) has about 33% identity from amino acids 3-124 with a protein
from Brucella melitensis that is a phosphotransferase system IIA
component (Accession Nos. NP_540949.1; NC_003317), about 32%
identity from amino acids 2-102 with a protein from Lactobacillus
curvatus that is an EIIA-mannose protein (Accession Nos. gb|AAB04153.1;
U28163), about 32% identity from amino acids 3-96 with a protein from
Clostridium perfringens that is homologous to a PTS system protein
(Accession Nos. NP_563545.1; NC_003366), about 25% identity
from amino acids 3-123 with a protein from Clostridium perfringens that
is homologous to a PTS system protein (Accession Nos. NP_561737.1;
NC_003366), and 25% identity from amino acids 3-123 with a protein
from Clostridium acetobutylicum that is a mannose-specific
phosphotransferase system component IIAB (Accession Nos.
NP_149230.1; NC_001988).

[0181]A Gapped BlastP sequence alignment showed that SEQ ID NO:22 (162
amino acids) has about 38% identity from amino acids 8-159 with a protein
from Clostridium acetobutylicum that is a PTS system enzyme IIBC
component (galactitol/fructose-specific) (Accession Nos.
NP_349560.1; NC_003030), about 36% identity from amino acids
7-158 with a protein from Streptococcus pneumoniae that is a
phosphotransferase system sugar-specific EII component (Accession Nos.
NP_358156.1; NC_003098), about 36% identity from amino acids
7-158 with a protein from Streptococcus pneumoniae that is homologous to
a PTS system IIA component (Accession Nos. NP_345152.1;
NC_003028), about 38% identity from amino acids 20-134 with a
protein from Streptococcus agalactiae that is a GatA protein (Accession
Nos. gb|AAG09977.1; AF248038), and 33% identity from amino acids 16-159
with a protein from Bacillus halodurans that is a PTS system,
galactitol-specific enzyme IIA component (Accession Nos.
NP_241058.1; NC_002570).

[0182]A Gapped BlastP sequence alignment showed that SEQ ID NO:24 (466
amino acids) has about 47% identity from amino acids 30-461 with a
protein from Clostridium acetobutylicum that is a PTS cellobiose-specific
component IIC (Accession Nos. NP_347026.1; NC_003030), about 45%
identity from amino acids 26-465 with a protein from Lactococcus lactis
subsp. lactis that is a cellobiose-specific PTS system IIC component (EC
2.7.1.69) (Accession Nos. NP_266974.1; NC_002662), about 46%
identity from amino acids 82-465 with a protein from Lactococcus lactis
subsp. lactis that is a cellobiose-specific PTS system IIC component (EC
2.7.1.69) (Accession Nos. NP_266572.1; NC_002662), about 41%
identity from amino acids 34-466 with a protein from Streptococcus
pyogenes that is homologous to a PTS system, enzyme IIC component
(Accession Nos. NP_269994.1; NC_002737), and 40% identity
from amino acids 34-466 with a protein from Streptococcus pyogenes that
is homologous to a PTS system, enzyme IIC component (Accession Nos.
NP_608069.1; NC_003485).

[0183]A Gapped BlastP sequence alignment showed that SEQ ID NO:26 (428
amino acids) has about 28% identity from amino acids 25-420 with a
protein from Listeria innocua that is homologous to a PTS
cellobiose-specific enzyme IIC (Accession Nos. NP_472233.1;
NC_003212), about 27% identity from amino acids 115-415 with a
protein from Lactobacillus casei that is a LacE protein (Accession Nos.
emb|CAB02556.1; Z80834), about 26% identity from amino acids 137-425 with
a protein from Listeria innocua that is homologous to a PTS system,
cellobiose-specific enzyme IIC (Accession Nos. NP_472184.1;
NC_003212), about 26% identity from amino acids 137-425 with a
protein from Listeria monocytogenes that is homologous to a PTS system,
cellobiose-specific enzyme IIC (Accession Nos. NP_466230.1;
NC_003210), and 26% identity from amino acids 115-415 with a
protein from Lactobacillus casei that is a phosphotransferase system
enzyme II (EC 2.7.1.69) (Accession No. pir||B23697).

[0184]A Gapped BlastP sequence alignment showed that SEQ ID NO:28 (475
amino acids) has about 57% identity from amino acids 10-471 with a
protein from Lactococcus lactis subsp. lactis that is a
cellobiose-specific PTS system IIC component (EC 2.7.1.69) (Accession
Nos. NP_266974.1; NC_002662), about 45% identity from amino
acids 71-475 with a protein from Lactococcus lactis subsp. lactis that is
a cellobiose-specific PTS system IIC component (EC 2.7.1.69) (Accession
Nos. NP_266572.1; NC_002662), about 42% identity from amino
acids 13-470 with a protein from Clostridium acetobutylicum that is a PTS
cellobiose-specific component IIC (Accession Nos. NP_347026.1;
NC_003030), about 41% identity from amino acids 17-468 with a
protein from Streptococcus pyogenes that is homologous to a PTS system,
enzyme IIC component (Accession Nos. NP_269994.1; NC_002737),
and 41% identity from amino acids 17-468 with a protein from
Streptococcus pyogenes that is homologous to a PTS system, enzyme IIC
component (Accession Nos. NP_608069.1; NC_003485).

[0185]A Gapped BlastP sequence alignment showed that SEQ ID NO:30 (441
amino acids) has about 46% identity from amino acids 1-428 with a protein
from Listeria innocua that is homologous to a PTS system,
cellobiose-specific enzyme IIC (Accession Nos. NP_472184.1;
NC_003212), about 46% identity from amino acids 1-428 with a
protein from Listeria monocytogenes that is homologous to a PTS system,
cellobiose-specific enzyme IIC (Accession Nos. NP_466230.1;
NC_003210), about 39% identity from amino acids 10-427 with a
protein from Streptococcus pyogenes that is homologous to a PTS system
IIC component (Accession Nos. NP_607435.1; NC_003485), about
36% identity from amino acids 1-428 with a protein from Lactococcus
lactis subsp. lactis that is a cellobiose-specific PTS system IIC
component (EC 2.7.1.69) (Accession Nos. NP_266330.1;
NC_002662), and 31% identity from amino acids 1-421 with a protein
from Listeria monocytogenes that is homologous to a cellobiose
phosphotransferase enzyme IIC component (Accession Nos.
NP_466206.1; NC_003210).

[0186]A Gapped BlastP sequence alignment showed that SEQ ID NO:32 (626
amino acids) has about 54% identity from amino acids 1-532 with a protein
from Bacillus subtilis that is a phosphotransferase system (PTS)
arbutin-like enzyme IIBC component (Accession Nos. NP_388701.1;
NC_000964), about 51% identity from amino acids 2-530 with a
protein from Clostridium perfringens that is a PTS arbutin-like enzyme
IIBC component (Accession Nos. NP_561112.1; NC_003366), about
52% identity from amino acids 1-533 with a protein from Fusobacterium
mortiferum that is a PTS protein EII (Accession Nos. gb|AAB63014.2;
U81185), about 51% identity from amino acids 1-533 with a protein from
Clostridium acetobutylicum that is a MalP protein (Accession Nos.
gb|AAK69555.1; AF290982), and 51% identity from amino acids 1-533 with a
protein from Clostridium acetobutylicum that is a PTS system,
arbutin-like IIBC component (Accession Nos. NP_347171.1;
NC_003030).

[0187]A Gapped BlastP sequence alignment showed that SEQ ID NO:34 (663
amino acids) has about 58% identity from amino acids 1-456 with a protein
from Lactococcus lactis subsp. lactis that is a sucrose-specific PTS
system IIBC component (EC 2.7.1.69) (Accession Nos. NP_267287.1;
NC_002662), about 54% identity from amino acids 5-471 with a
protein from Staphylococcus aureus subsp. aureus that is homologous to a
sucrose phosphotransferase enzyme II (Accession Nos. NP_373429.1;
NC_002745), about 46% identity from amino acids 5-472 with a
protein from Bacillus halodurans that is a PTS system, sucrose
phosphotransferase enzyme IIBC component (Accession Nos.
NP_244441.1; NC_002570), about 39% identity from amino acids
4-468 with a protein from Salmonella enterica subsp. enterica serovar
Typhi that is homologous to a PTS system IIBC component (Accession Nos.
NP_457099.1; NC_003198), and 39% identity from amino acids
4-468 with a protein from Salmonella typhimurium that is homologous to a
phosphotransferase system IIB component (Accession Nos.
NP_461505.1; NC_003197).

[0188]A Gapped BlastP sequence alignment showed that SEQ ID NO:36 (665
amino acids) has about 44% identity from amino acids 1-661 with a protein
from Clostridium perfringens that is a PTS system protein (Accession Nos.
NP_561500.1; NC_003366), about 46% identity from amino acids
1-657 with a protein from Streptococcus pyogenes that is homologous to a
fructose-specific enzyme II, PTS system BC component (Accession Nos.
NP_269062.1; NC_002737), about 46% identity from amino acids
1-657 with a protein from Streptococcus pyogenes that is homologous to a
fructose-specific enzyme II, PTS system BC component (Accession Nos.
NP_607065.1; NC_003485), about 45% identity from amino acids
1-657 with a protein from Lactococcus lactis subsp. lactis that is a
fructose-specific PTS system enzyme IIBC component (EC 2.7.1.69)
(Accession Nos. NP_267115.1; NC_002662), and 43% identity
from amino acids 1-660 with a protein from Bacillus halodurans that is a
PTS system, fructose-specific enzyme IIBC component (Accession Nos.
NP_241694.1; NC_002570).

[0189]A Gapped BlastP sequence alignment showed that SEQ ID NO:38 (334
amino acids) has about 48% identity from amino acids 4-334 with a protein
from Streptococcus pneumoniae that is a sucrose operon repressor (Scr
operon regulatory protein) (Accession Nos. NP_359213.1;
NC_003098), about 46% identity from amino acids 4-334 with a
protein from Streptococcus pneumoniae that is a sugar-binding
transcriptional regulator in the LacI family (Accession Nos.
NP_346232.1; NC_003028), about 35% identity from amino acids
13-332 with a protein from Pediococcus pentosaceus that is a sucrose
operon repressor (Scr operon regulatory protein) (Accession No.
sp|P43472|SCRR_PEDPE), about 35% identity from amino acids 10-334 with a
protein from Bacillus halodurans that is a transcriptional repressor of
the ribose operon (Accession Nos. NP_244594.1; NC_002570),
and 35% identity from amino acids 10-332 with a protein from
Streptococcus pneumoniae that is a sucrose operon repressor (Accession
Nos. NP_346162.1; NC_003028).

[0190]A Gapped BlastP sequence alignment showed that SEQ ID NO:40 (415
amino acids) has about 50% identity from amino acids 3-415 with a protein
from Streptococcus pneumoniae that is an ABC transporter
substrate-binding protein (Accession Nos. NP_359212.1;
NC_003098), about 27% identity from amino acids 19-389 with a
protein from Agrobacterium tumefaciens that is a sugar binding protein
(Accession Nos. NP_535638.1; NC_003306), about 25% identity
from amino acids 11-396 with a protein from Nostoc sp. PCC 7120 that is
an ABC transporter sugar binding protein (Accession Nos.
NP_488317.1; NC_003272), about 26% identity from amino acids
76-353 with a protein from Streptomyces coelicolor that is homologous to
a sugar transport sugar binding protein (Accession Nos. emb|CAB95275.1;
AL359779), and 26% identity from amino acids 1-324 with a protein from
Listeria innocua that is homologous to a sugar ABC transporter,
periplasmic sugar-binding protein (Accession Nos. NP_470104.1;
NC_003212).

[0191]A Gapped BlastP sequence alignment showed that SEQ ID NO:42 (294
amino acids) has about 56% identity from amino acids 10-285 with a
protein from Streptococcus pneumoniae that is an ABC transporter
membrane-spanning permease-sugar transporter (Accession Nos.
NP_359211.1; NC_003098), about 38% identity from amino acids
7-285 with a protein from Listeria monocytogenes that is homologous to a
sugar ABC transporter permease protein (Accession Nos. NP_464293.1;
NC_003210), about 38% identity from amino acids 7-285 with a
protein from Listeria innocua that is homologous to a sugar ABC
transporter permease protein (Accession Nos. NP_470102.1;
NC_003212), about 36% identity from amino acids 12-286 with a
protein from Synechocystis sp. PCC 6803 that is a lactose transport
system permease protein (LacF) (Accession Nos. NP_440703.1;
NC_000911), and 36% identity from amino acids 11-281 with a protein
from Xylella fastidiosa that is an ABC transporter sugar permease
(Accession Nos. NP_299726.1; NC_002488).

[0193]A Gapped BlastP sequence alignment showed that SEQ ID NO:46 (430
amino acids) has about 36% identity from amino acids 2-429 with a protein
from Streptococcus pneumoniae that is a sucrose-6-phosphate hydrolase
(Accession Nos. NP_359209.1; NC_003098), about 36% identity
from amino acids 2-429 with a protein from Streptococcus pneumoniae that
is homologous to a sucrose-6-phosphate hydrolase (Accession Nos.
NP_346228.1; NC_003028), about 36% identity from amino acids
18-373 with a protein from Thermotoga maritima that is a
beta-fructosidase (Accession Nos. NP_229215.1; NC_000853),
about 31% identity from amino acids 21-405 with a protein from Zymomonas
mobilis that is a beta-fructofuranosidase (EC 3.2.1.26) (Accession No.
pir||JU0460), and 35% identity from amino acids 21-362 with a
protein from Escherichia coli that is a sucrose-6-phosphate hydrolase
(Accession Nos. NP_311270.1; NC_002695).

[0194]A Gapped BlastP sequence alignment showed that SEQ ID NO:48 (368
amino acids) has about 65% identity from amino acids 1-366 with a protein
from Streptococcus mutans that is a multiple sugar-binding transport
ATP-binding protein (msmK) (Accession No. sp|Q00752|MSMK_STRMU), about
65% identity from amino acids 1-366 with a protein from Streptococcus
pyogenes that is a multiple sugar-binding ABC transport system
(ATP-binding protein) (Accession Nos. NP_269942.1; NC_002737),
about 66% identity from amino acids 1-367 with a protein from
Streptococcus pneumoniae that is an ABC transporter ATP-binding
protein-multiple sugar transport (Accession Nos. NP--359030.1;
NC--003098), about 65% identity from amino acids 1-366 with a
protein from Streptococcus pyogenes that is a multiple sugar-binding ABC
transport system (ATP-binding protein) (Accession Nos. NP--608016.1;
NC--003485), and 66% identity from amino acids 1-367 with a protein
from Streptococcus pneumoniae that is a sugar ABC transporter,
ATP-binding protein (Accession Nos. NP--346026.1; NC--003028).

[0195]A Gapped BlastP sequence alignment showed that SEQ ID NO:50 (490
amino acids) has about 63% identity from amino acids 11-489 with a
protein from Streptococcus mutans that is a gtfA protein (Accession No.
pir||BWSOGM), about 63% identity from amino acids 11-490 with a
protein from Streptococcus mutans that is a sucrose phosphorylase (EC
2.4.1.7) (Accession No. pir||A27626), about 63% identity from
amino acids 11-489 with a protein from Streptococcus mutans that is a
sucrose phosphorylase (sucrose glucosyltransferase) (Accession No.
sp|P102491|SUCP_STRMU), about 63% identity from amino acids 11-484 with a
protein from Streptococcus pneumoniae that is a dextransucrase (sucrose
6-glucosyltransferase) (Accession Nos. NP--359301.1;
NC--003098), and 63% identity from amino acids 11-484 with a protein
from Streptococcus pneumoniae that is a sucrose phosphorylase (Accession
Nos. NP--346325.1; NC--003028).

[0196]A Gapped BlastP sequence alignment showed that SEQ ID NO:52 (328
amino acids) has about 55% identity from amino acids 47-316 with a
protein from Bacillus subtilis that is a ribose ABC transporter
(ribose-binding protein) (Accession Nos. NP--391477.1;
NC--000964), about 45% identity from amino acids 5-323 with a
protein from Lactococcus lactis subsp. lactis that is a ribose ABC
transporter substrate binding protein (Accession Nos. NP--267791.1;
NC--002662), about 42% identity from amino acids 4-278 with a
protein from Tetragenococcus halophilus that is a ribose binding protein
(Accession Nos. dbj|BAA31869.1; AB009593), about 39% identity from amino
acids 15-316 with a protein from Bacillus halodurans that is a ribose ABC
transporter (ribose-binding protein) (Accession Nos. NP--244599.1;
NC--002570), and 42% identity from amino acids 4-315 with a protein
from Pasteurella multocida that is an RbsB protein (Accession Nos.
NP--245090.1; NC--002663).

[0197]A Gapped BlastP sequence alignment showed that SEQ ID NO:54 (285
amino acids) has about 60% identity from amino acids 1-277 with a protein
from Bacillus subtilis that is a ribose ABC transporter (permease)
(Accession Nos. NP--391476.1; NC--000964), about 59% identity
from amino acids 1-277 with a protein from Bacillus subtilis that is a
ribose transport system permease protein (rbsC) (Accession No.
sp|P36948|RBSC_BACSU), about 57% identity from amino acids 4-277 with a
protein from Bacillus halodurans that is a ribose ABC transporter
(permease) (Accession Nos. NP--244598.1; NC--002570), about 58%
identity from amino acids 4-277 with a protein from Lactococcus lactis
subsp. lactis that is a ribose ABC transporter permease protein
(Accession Nos. NP_267792.1; NC_002662), and 54% identity from
amino acids 4-278 with a protein from Haemophilus influenzae that is a
D-ribose ABC transporter, permease protein (rbsC) (Accession Nos.
NP--438661.1; NC--000907).

[0198]A Gapped BlastP sequence alignment showed that SEQ ID NO:56 (496
amino acids) has about 59% identity from amino acids 5-496 with a protein
from Lactococcus lactis subsp. lactis that is a ribose ABC transporter
ATP binding protein (Accession Nos. NP--267793.1; NC--002662),
about 57% identity from amino acids 5-496 with a protein from Bacillus
subtilis that is a ribose ABC transporter (ATP-binding protein)
(Accession Nos. NP--391475.1; NC--000964), about 51% identity
from amino acids 5-496 with a protein from Bacillus subtilis that is an
ATP binding protein (Accession No. pir||I40465), about 49%
identity from amino acids 5-495 with a protein from Bacillus halodurans
that is a ribose ABC transporter (ATP-binding protein) (Accession Nos.
NP--244597.1; NC--002570), and 45% identity from amino acids
7-494 with a protein from Agrobacterium tumefaciens that is an ABC
transporter, nucleotide binding/ATPase protein [ribose] (Accession Nos.
NP--533484.1; NC--003304).

[0199]A Gapped BlastP sequence alignment showed that SEQ ID NO:58 (134
amino acids) has about 58% identity from amino acids 4-134 with a protein
from Lactobacillus sakei that is a ribose permease (RbsD) (Accession Nos.
gb|AAD34337.1; AF115391), about 51% identity from amino acids 4-134 with
a protein from Clostridium perfringens that is homologous to a ribose ABC
transporter (Accession Nos. NP--562547.1; NC--003366), about
50% identity from amino acids 4-132 with a protein from Lactococcus
lactis subsp. lactis that is a ribose ABC transporter permease protein
(Accession Nos. NP--267794.1; NC--002662), about 45% identity
from amino acids 4-134 with a protein from Bacillus halodurans that is a
ribose ABC transporter (permease) (Accession Nos. NP--244596.1;
NC--002570), and 51% identity from amino acids 4-134 with a protein
from Staphylococcus aureus subsp. aureus that is a ribose permease
(Accession Nos. NP--370793.1; NC--002758).

[0200]A Gapped BlastP sequence alignment showed that SEQ ID NO:60 (308
amino acids) has about 51% identity from amino acids 4-301 with a protein
from Lactobacillus sakei that is a ribokinase (RbsK) (Accession Nos.
gb|AAD34338.1; AF115391), about 48% identity from amino acids 1-303 with
a protein from Staphylococcus aureus subsp. aureus that is homologous to
a ribokinase (Accession Nos. NP--370792.1; NC--002758), about
45% identity from amino acids 3-305 with a protein from Clostridium
perfringens that is a ribokinase (Accession Nos. NP--562548.1;
NC--003366), about 41% identity from amino acids 1-299 with a
protein from Haemophilus influenzae that is a ribokinase (RbsK)
(Accession Nos. NP--438663.1; NC--000907), and 38% identity
from amino acids 2-300 with a protein from Yersinia pestis that is a
ribokinase (Accession Nos. NP--403674.1; NC--003143).

[0201]A Gapped BlastP sequence alignment showed that SEQ ID NO:62 (285
amino acids) has about 63% identity from amino acids 1-285 with a protein
from Lactococcus lactis subsp. lactis that is a maltose ABC transporter
permease protein (Accession Nos. NP--267841.1; NC--002662),
about 54% identity from amino acids 6-284 with a protein from
Streptococcus pyogenes that is homologous to a maltose/maltodextrin ABC
transport system protein (permease) (Accession Nos. NP--269423.1;
NC--002737), about 38% identity from amino acids 12-284 with a
protein from Klebsiella oxytoca that is homologous to a malG protein
(Accession No. pir||S63616), about 39% identity from amino acids
9-285 with a protein from Bacillus halodurans that is a
maltose/maltodextrin transport system (permease) (Accession Nos.
NP--243790.1; NC--002570), and 36% identity from amino acids
7-285 with a protein from Bacillus subtilis that is homologous to a
maltodextrin transport system permease (Accession Nos. NP--391294.1;
NC--000964).

[0202]A Gapped BlastP sequence alignment showed that SEQ ID NO:64 (452
amino acids) has about 63% identity from amino acids 1-452 with a protein
from Lactococcus lactis subsp. lactis that is a maltose ABC transporter
permease protein (Accession Nos. NP--267840.1; NC--002662),
about 52% identity from amino acids 3-452 with a protein from
Streptococcus pyogenes that is homologous to a maltose/maltodextrin ABC
transport system protein (permease) (Accession Nos. NP--269422.1;
NC--002737), about 52% identity from amino acids 3-452 with a
protein from Streptococcus pyogenes that is homologous to a
maltose/maltodextrin ABC transport system (permease) (Accession Nos.
NP--607422.1; NC--003485), about 34% identity from amino acids
28-451 with a protein from Klebsiella oxytoca that is homologous to a
malF protein (Accession No. pir||S63615), and 33% identity from
amino acids 23-451 with a protein from Bacillus halodurans that is a
maltose/maltodextrin transport system permease (Accession Nos.
NP--243791.1; NC--002570).

[0203]A Gapped BlastP sequence alignment showed that SEQ ID NO:66 (408
amino acids) has about 49% identity from amino acids 1-407 with a protein
from Lactococcus lactis subsp. lactis that is a maltose ABC transporter
substrate binding protein (Accession Nos. NP--267839.1;
NC--002662), about 37% identity from amino acids 1-405 with a
protein from Streptococcus pyogenes that is homologous to a
maltose/maltodextrin-binding protein (Accession Nos. NP--607421.1;
NC--003485), about 36% identity from amino acids 1-405 with a
protein from Streptococcus pyogenes that is homologous to a
maltose/maltodextrin-binding protein (Accession Nos. NP--269421.1;
NC--002737), about 27% identity from amino acids 1-393 with a
protein from Listeria innocua that is homologous to a
maltose/maltodextrin ABC-transporter (binding protein) (Accession Nos.
NP--471563.1; NC--003212), and 26% identity from amino acids
1-403 with a protein from Bacillus subtilis that is homologous to a
maltose/maltodextrin-binding protein (Accession Nos. NP--391296.1;
NC--000964).

[0205]A Gapped BlastP sequence alignment showed that SEQ ID NO:70 (512
amino acids) has about 60% identity from amino acids 1-510 with a protein
from Streptococcus pyogenes that is homologous to a sugar ABC transporter
(ATP-binding protein) (Accession Nos. NP--269365.1;
NC--002737), about 60% identity from amino acids 1-510 with a
protein from Streptococcus pyogenes that is homologous to a sugar ABC
transporter (ATP-binding protein) (Accession Nos. NP--607296.1;
NC--003485), about 59% identity from amino acids 5-503 with a
protein from Lactococcus lactis subsp. lactis that is a sugar ABC
transporter ATP binding protein (Accession Nos. NP--267484.1;
NC--002662), about 61% identity from amino acids 7-503 with a
protein from Streptococcus pneumoniae that is a sugar ABC transporter,
ATP-binding protein (Accession Nos. NP--345337.1; NC--003028),
and 60% identity from amino acids 7-503 with a protein from Streptococcus
pneumoniae that is an ABC transporter ATP-binding protein-ribose/galactose
transport (Accession Nos. NP--358342.1; NC--003098).

[0206]A Gapped BlastP sequence alignment showed that SEQ ID NO:72 (383
amino acids) has about 49% identity from amino acids 7-351 with a protein
from Lactococcus lactis subsp. lactis that is a sugar ABC transporter
permease protein (Accession Nos. NP--267485.1; NC--002662),
about 47% identity from amino acids 4-351 with a protein from
Streptococcus pneumoniae that is an ABC transporter membrane-spanning
permease (ribose/galactose transport) (Accession Nos. NP--358343.1;
NC--003098), about 47% identity from amino acids 4-351 with a
protein from Streptococcus pneumoniae that is homologous to a sugar ABC
transporter, permease protein (Accession Nos. NP--345338.1;
NC--003028), about 49% identity from amino acids 4-342 with a
protein from Streptococcus pyogenes that is homologous to a sugar ABC
transporter (permease protein) (Accession Nos. NP--269364.1;
NC--002737), and 49% identity from amino acids 4-342 with a protein
from Streptococcus pyogenes that is homologous to a sugar ABC transporter
(permease protein) (Accession Nos. NP--607295.1; NC--003485).

[0207]A Gapped BlastP sequence alignment showed that SEQ ID NO:74 (318
amino acids) has about 67% identity from amino acids 1-318 with a protein
from Streptococcus pyogenes that is homologous to a sugar ABC transporter
(permease protein) (Accession Nos. NP--607294.1; NC--003485),
about 66% identity from amino acids 1-318 with a protein from
Streptococcus pyogenes that is homologous to a sugar ABC transporter
(permease protein) (Accession Nos. NP--269363.1; NC--002737),
about 65% identity from amino acids 1-318 with a protein from
Streptococcus pneumoniae that is homologous to a sugar ABC transporter,
permease protein (Accession Nos. NP--345339.1; NC--003028),
about 63% identity from amino acids 1-318 with a protein from Lactococcus
lactis subsp. lactis that is a sugar ABC transporter permease protein
(Accession Nos. NP--267486.1; NC--002662), and 61% identity
from amino acids 6-318 with a protein from Listeria innocua that is
homologous to a sugar ABC transporter (permease protein) (Accession Nos.
NP--470764.1; NC--003212).

[0208]A Gapped BlastP sequence alignment showed that SEQ ID NO:76 (450
amino acids) has about 68% identity from amino acids 11-448 with a
protein from Neisseria meningitidis that is homologous to a sugar
transporter (Accession Nos. NP--273437.1; NC--003112), about
68% identity from amino acids 11-448 with a protein from Neisseria
meningitidis that is homologous to an integral membrane transport protein
(Accession Nos. NP--284797.1; NC--003116), about 39% identity
from amino acids 17-229 with a protein from Caulobacter crescentus that
is homologous to a transporter (Accession Nos. NP--421086.1;
NC--002696), about 21% identity from amino acids 31-450 with a
protein from Lycopersicon esculentum that is a sucrose transporter
(Accession Nos. gb|AAG09270.1; AF176950), and 21% identity from amino
acids 31-442 with a protein from Arabidopsis thaliana that is a sucrose
transporter (Accession Nos. gb|AAG09191.1; AF175321).

[0209]A Gapped BlastP sequence alignment showed that SEQ ID NO:78 (495
amino acids) has about 32% identity from amino acids 8-482 with a protein
from Lactococcus lactis subsp. lactis that is a transporter protein
(Accession Nos. NP--266394.1; NC--002662), about 34% identity
from amino acids 8-482 with a protein from Listeria monocytogenes that is
homologous to an efflux transporter (Accession Nos. NP--464506.1;
NC--003210), about 34% identity from amino acids 8-482 with a
protein from Listeria innocua that is homologous to an efflux transporter
(Accession Nos. NP--470317.1; NC--003212), about 30% identity
from amino acids 7-422 with a protein from Clostridium acetobutylicum
that is an MDR related permease (Accession Nos. NP--149294.1;
NC--001988), and 29% identity from amino acids 8-425 with a protein
from Streptomyces coelicolor that is homologous to a membrane transport
protein (Accession Nos. emb|CAB89031.1; AL353870).

[0210]A Gapped BlastP sequence alignment showed that SEQ ID NO:80 (471
amino acids) has about 32% identity from amino acids 1-440 with a protein
from Lactococcus lactis subsp. lactis that is a transporter protein
(Accession Nos. NP--266394.1; NC--002662), about 34% identity
from amino acids 1-464 with a protein from Listeria monocytogenes that is
homologous to an efflux transporter (Accession Nos. NP--464506.1;
NC--003210), about 34% identity from amino acids 1-464 with a
protein from Listeria innocua that is homologous to an efflux transporter
(Accession Nos. NP--470317.1; NC--003212), about 29% identity
from amino acids 1-412 with a protein from Clostridium acetobutylicum
that is an MDR related permease (Accession Nos. NP--149294.1;
NC--001988), and 28% identity from amino acids 4-459 with a protein
from Streptomyces coelicolor that is homologous to an exporter (Accession
No. pir||T36377).

[0211]A Gapped BlastP sequence alignment showed that SEQ ID NO:82 (412
amino acids) has about 49% identity from amino acids 18-400 with a
protein from Listeria innocua that is homologous to a drug-efflux
transporter (Accession Nos. NP--472212.1; NC--003212), about
49% identity from amino acids 18-400 with a protein from Listeria
monocytogenes that is homologous to a drug-efflux transporter (Accession
Nos. NP--466263.1; NC--003210), about 48% identity from amino
acids 18-397 with a protein from Escherichia coli that is homologous to a
transport protein (Accession Nos. NP--415571.1; NC--000913),
about 47% identity from amino acids 15-399 with a protein from
Lactococcus lactis subsp. lactis that is a multidrug resistance efflux
pump (Accession Nos. NP--266282.1; NC--002662), and 48%
identity from amino acids 18-399 with a protein from Salmonella
typhimurium that is homologous to an MFS family transport protein
(Accession Nos. NP--460125.1; NC--003197).

[0212]A Gapped BlastP sequence alignment showed that SEQ ID NO:84 (462
amino acids) has about 38% identity from amino acids 9-413 with ORFC from
Oenococcus oeni (Accession Nos. emb|CAB61253.1; AJ250422), about 38%
identity from amino acids 2-378 with a protein from Lactococcus lactis
subsp. lactis that is a transporter protein (Accession Nos.
NP--267695.1; NC--002662), about 34% identity from amino acids
6-411 with a protein from Streptococcus pyogenes that is homologous to a
drug resistance protein (Accession Nos. NP--606824.1;
NC--003485), about 33% identity from amino acids 6-411 with a
protein from Streptococcus pyogenes that is homologous to a drug
resistance protein (Accession Nos. NP--268834.1; NC--002737),
and 34% identity from amino acids 2-454 with a protein from Lactococcus
lactis subsp. lactis that is a drug-export protein (Accession Nos.
NP--267504.1; NC--002662).

[0213]A Gapped BlastP sequence alignment showed that SEQ ID NO:86 (490
amino acids) has about 55% identity from amino acids 3-476 with a protein
from Listeria monocytogenes that is homologous to a drug-export protein
(Accession Nos. NP--466111.1; NC--003210), about 54% identity
from amino acids 3-476 with a protein from Listeria innocua that is
homologous to a drug-export protein (Accession Nos. NP--472062.1;
NC--003212), about 45% identity from amino acids 6-478 with a
protein from Lactococcus lactis subsp. lactis that is a multidrug
resistance protein (Accession Nos. NP--267065.1; NC--002662),
about 49% identity from amino acids 8-484 with a protein from Bacillus
subtilis that is homologous to a multidrug resistance protein (Accession
Nos. NP--388266.1; NC--000964), and 44% identity from amino
acids 18-425 with a protein from Bacillus subtilis that is homologous to
a multidrug resistance protein (Accession Nos. NP--388782.1;
NC--000964).

[0214]A Gapped BlastP sequence alignment showed that SEQ ID NO:88 (416
amino acids) has about 26% identity from amino acids 17-408 with a
protein from Desulfitobacterium hafniense (Accession Nos. gb|AAL87781.1;
AF403184), about 25% identity from amino acids 26-408 with a protein from
Streptococcus pneumoniae that is a transporter in the major facilitator
superfamily (Accession Nos. NP--359046.1; NC--003098), about
21% identity from amino acids 61-399 with a protein from Campylobacter
jejuni that is homologous to an efflux protein (Accession Nos.
NP--282813.1; NC--002163), about 19% identity from amino acids
25-368 with a protein from Agrobacterium tumefaciens that is homologous
to an MFS permease (Accession Nos. NP--533033.1; NC--003304),
and 25% identity from amino acids 19-205 with a protein from Bacillus
halodurans that is a multidrug resistance protein (Accession Nos.
NP--244175.1; NC--002570).

[0215]A Gapped BlastP sequence alignment showed that SEQ ID NO:90 (548
amino acids) has about 38% identity from amino acids 17-546 with a
protein from Listeria innocua that is homologous to a transporter protein
(Accession Nos. NP--471001.1; NC--003212), about 37% identity
from amino acids 17-546 with a protein from Listeria monocytogenes that
is homologous to a transporter protein (Accession Nos. NP--465149.1;
NC--003210), about 36% identity from amino acids 1-534 with a
protein from Streptococcus pneumoniae that is a polysaccharide
transporter (Accession Nos. NP--358976.1; NC--003098), about
36% identity from amino acids 17-534 with a protein from Streptococcus
pneumoniae that is homologous to a polysaccharide biosynthesis protein
(Accession Nos. NP--345978.1; NC--003028), and 35% identity
from amino acids 12-546 with a hypothetical protein from Lactococcus
lactis subsp. lactis (Accession Nos. NP--267962.1; NC--002662).

[0216]A Gapped BlastP sequence alignment showed that SEQ ID NO:92 (485
amino acids) has about 44% identity from amino acids 1-484 with a protein
from Listeria monocytogenes that is homologous to an efflux transporter
protein (Accession Nos. NP--464506.1; NC--003210), about 44%
identity from amino acids 1-484 with a protein from Listeria innocua that
is homologous to an efflux transporter protein (Accession Nos.
NP--470317.1; NC--003212), about 34% identity from amino acids
9-420 with a protein from Clostridium acetobutylicum that is an
MDR-related permease (Accession Nos. NP--149294.1; NC--001988),
about 33% identity from amino acids 12-475 with a protein from
Lactococcus lactis subsp. lactis that is a transporter protein (Accession
Nos. NP--266394.1; NC--002662), and 34% identity from amino
acids 1-457 with a hypothetical protein from Myxococcus xanthus
(Accession Nos. emb|CAB37973.1; X76640).

[0217]A Gapped BlastP sequence alignment showed that SEQ ID NO:94 (199
amino acids) has about 46% identity from amino acids 23-173 with a
protein from Listeria innocua that is homologous to a drug-efflux
transporter protein (Accession Nos. NP--472212.1; NC--003212),
about 45% identity from amino acids 23-173 with a protein from Listeria
monocytogenes that is homologous to a drug-efflux transporter protein
(Accession Nos. NP--466263.1; NC--003210), about 49% identity
from amino acids 23-173 with a protein from Lactococcus lactis subsp.
lactis that is a multidrug resistance efflux pump (Accession Nos.
NP--266282.1; NC--002662), about 46% identity from amino acids
23-173 with a protein from Salmonella enterica subsp. enterica serovar
Typhi that is homologous to an efflux pump (Accession Nos.
NP--454977.1; NC--003198), and 46% identity from amino acids
23-173 with a protein from Salmonella typhimurium that is homologous to a
permease (Accession Nos. NP--459377.1; NC--003197).

[0219]A Gapped BlastP sequence alignment showed that SEQ ID NO:98 (328
amino acids) has about 57% identity from amino acids 1-323 with a protein
from Pediococcus pentosaceus that is a sucrose operon regulatory protein
(scrR) (Accession No. sp|P43472|SCRR_PEDPE), about 51% identity from
amino acids 1-322 with a protein from Streptococcus pneumoniae that is a
sucrose operon repressor (Accession Nos. NP--346162.1;
NC--003028), about 49% identity from amino acids 1-326 with a
protein from Streptococcus mutans that is a sucrose operon regulatory
protein (scrR) (Accession No. sp|Q54430|SCRR_STRMU), about 49% identity
from amino acids 1-322 with a protein from Streptococcus pyogenes that is
homologous to a sucrose operon repressor (Accession Nos.
NP--607889.1; NC--003485), and 49% identity from amino acids
1-322 with a protein from Streptococcus pyogenes that is homologous to a
sucrose operon repressor (Accession Nos. NP--269821.1;
NC--002737).

[0220]A Gapped BlastP sequence alignment showed that SEQ ID NO:100 (485
amino acids) has about 50% identity from amino acids 1-466 with a protein
from Streptococcus sobrinus that is a sucrose-6-phosphate hydrolase
(ScrB) (Accession No. pir||S68598), about 49% identity from amino
acids 1-461 with a protein from Streptococcus pneumoniae that is a
sucrose-6-phosphate hydrolase (Accession Nos. NP--359160.1;
NC--003098), about 49% identity from amino acids 1-461 with a
protein from Streptococcus pneumoniae that is a sucrose-6-phosphate
hydrolase (Accession Nos. NP--346161.1; NC--003028), about 49%
identity from amino acids 1-466 with a protein from Streptococcus
pyogenes that is homologous to a sucrose-6-phosphate hydrolase (Accession
Nos. NP--607888.1; NC--003485), and 49% identity from amino
acids 1-466 with a protein from Streptococcus pyogenes that is homologous
to a sucrose-6-phosphate hydrolase (Accession Nos. NP--269820.1;
NC--002737).

[0221]A Gapped BlastP sequence alignment showed that SEQ ID NO:102 (649
amino acids) has about 65% identity from amino acids 1-645 with a protein
from Streptococcus mutans that is a phosphotransferase system enzyme II
(EC 2.7.1.69), sucrose-specific IIABC component (Accession No.
sp|P12655|PTSA_STRMU), about 56% identity from amino acids 1-647 with a
protein from Pediococcus pentosaceus that is a phosphotransferase system
enzyme II (EC 2.7.1.69), sucrose specific enzyme IIABC (Accession No.
sp|P43470|PTSA_PEDPE), about 52% identity from amino acids 1-643 with a
protein from Lactococcus lactis that is an enzyme II sucrose protein
(Accession Nos. emb|CAB09690.1; Z97015), about 52% identity from amino
acids 114-647 with a protein from Lactobacillus sakei that is a
sucrose-specific enzyme II of the PTS (Accession Nos. gb|AAK92528.1;
AF401046), and 45% identity from amino acids 1-621 with a protein from
Corynebacterium glutamicum that is a phosphotransferase system IIB
component (Accession Nos. NP--601842.1; NC--003450).

[0222]A Gapped BlastP sequence alignment showed that SEQ ID NO:104 (667
amino acids) has about 42% identity from amino acids 192-661 with a
protein from Lactococcus lactis subsp. lactis that is a
beta-glucoside-specific PTS system IIABC component (EC 2.7.1.69)
(Accession Nos. NP--266583.1; NC--002662), about 39% identity
from amino acids 191-652 with a protein from Listeria monocytogenes that
is homologous to a phosphotransferase system (PTS)
beta-glucoside-specific enzyme IIABC (Accession Nos. NP--464560.1;
NC--003210), about 37% identity from amino acids 191-662 with a
protein from Clostridium longisporum that is a PTS-dependent enzyme II
(Accession Nos. gb|AAC05713.1; L49336), about 36% identity from amino
acids 191-666 with a protein from Bacillus halodurans that is a PTS
system, beta-glucoside-specific enzyme II, ABC component (Accession Nos.
NP--241461.1; NC--002570), and 36% identity from amino acids
191-650 with a protein from Listeria innocua that is homologous to a PTS
system, beta-glucosides specific enzyme IIABC (Accession Nos.
NP--469373.1; NC--003212).

[0223]A Gapped BlastP sequence alignment showed that SEQ ID NO:106 (241
amino acids) has about 47% identity from amino acids 1-238 with a protein
from Bacillus subtilis that is a trehalose operon transcriptional
repressor (Accession No. sp|P39796|TRER_BACSU), about 41% identity from
amino acids 4-238 with a protein from Bacillus halodurans that is a
transcriptional repressor of the trehalose operon (Accession Nos.
NP--241739.1; NC--002570), about 44% identity from amino acids
9-237 with a protein from Listeria innocua that is homologous to a
transcription regulator GntR family (Accession Nos. NP--470558.1;
NC--003212), about 44% identity from amino acids 9-237 with a
protein from Listeria monocytogenes that is homologous to a transcription
regulator GntR family (Accession Nos. NP--464778.1;
NC--003210), and 41% identity from amino acids 5-238 with a protein
from Lactococcus lactis subsp. lactis that is a GntR family
transcriptional regulator (Accession Nos. NP--266581.1;
NC--002662).

[0224]A Gapped BlastP sequence alignment showed that SEQ ID NO:108 (570
amino acids) has about 56% identity from amino acids 22-566 with a
protein from Streptococcus pyogenes that is homologous to a dextran
glucosidase (Accession Nos. NP--608103.1; NC--003485), about
57% identity from amino acids 23-568 with a protein from Streptococcus
pneumoniae that is a dextran glucosidase (Accession Nos.
NP--359290.1; NC--003098), about 56% identity from amino acids
22-566 with a protein from Streptococcus pyogenes that is homologous to a
dextran glucosidase (Accession Nos. NP--270026.1; NC--002737),
about 57% identity from amino acids 23-568 with a protein from
Streptococcus pneumoniae that is homologous to a dextran glucosidase DexS
(Accession Nos. NP--346315.1; NC--003028), and 54% identity
from amino acids 17-570 with a protein from Clostridium perfringens that
is an alpha-glucosidase (Accession Nos. NP--561478.1;
NC--003366).

[0225]A Gapped BlastP sequence alignment showed that SEQ ID NO:110 (370
amino acids) has about 67% identity from amino acids 1-368 with a protein
from Streptococcus pneumoniae that is an ABC transporter ATP-binding
protein-multiple sugar transport (Accession Nos. NP--359030.1;
NC--003098), about 67% identity from amino acids 1-368 with a
protein from Streptococcus pneumoniae that is a sugar ABC transporter,
ATP-binding protein (Accession Nos. NP--346026.1; NC--003028),
about 66% identity from amino acids 1-368 with a protein from
Streptococcus mutans that is a multiple sugar-binding transport
ATP-binding protein (msmK) (Accession No. sp|Q00752|MSMK_STRMU), about
68% identity from amino acids 1-365 with a protein from Listeria innocua
that is homologous to a sugar ABC transporter, ATP-binding protein
(Accession Nos. NP--469649.1; NC--003212), and 67% identity
from amino acids 1-365 with a protein from Listeria monocytogenes that is
homologous to a sugar ABC transporter, ATP-binding protein (Accession
Nos. NP--463809.1; NC--003210).

[0226]A Gapped BlastP sequence alignment showed that SEQ ID NO:112 (278
amino acids) has about 81% identity from amino acids 2-278 with a protein
from Streptococcus mutans that is a multiple sugar-binding transport
system permease protein (msmG) (Accession No. sp|Q00751|MSMG_STRMU),
about 73% identity from amino acids 1-278 with a protein from
Streptococcus pneumoniae that is a sugar ABC transporter, permease
protein (Accession Nos. NP--346326.1; NC--003028), about 72%
identity from amino acids 2-278 with a protein from Streptococcus
pneumoniae that is an ABC transporter membrane spanning permease-multiple
sugars (Accession Nos. NP--359302.1; NC--003098), about 85%
identity from amino acids 72-278 with a hypothetical protein fragment
from Streptococcus mutans (Accession No. pir||B27626), and 44%
identity from amino acids 4-278 with a protein from Clostridium
acetobutylicum that is a sugar permease (Accession Nos.
NP--350251.1; NC--003030).

[0227]A Gapped BlastP sequence alignment showed that SEQ ID NO:114 (291
amino acids) has about 73% identity from amino acids 4-290 with a protein
from Streptococcus pneumoniae that is an ABC transporter
membrane-spanning permease-multiple sugars (Accession Nos.
NP--359303.1; NC--003098), about 73% identity from amino acids
4-290 with a protein from Streptococcus pneumoniae that is a sugar ABC
transporter, permease protein (Accession Nos. NP--346327.1;
NC--003028), about 73% identity from amino acids 1-290 with a
protein from Streptococcus mutans that is a multiple sugar-binding
transport system permease protein (msmF) (Accession No.
sp|Q00750|MSMF_STRMU), about 53% identity from amino acids 6-291 with a
protein from Clostridium acetobutylicum that is an ABC-type sugar
transport system, permease component (Accession Nos. NP--350252.1;
NC--003030), and 32% identity from amino acids 2-291 with a protein
from Thermoanaerobacterium thermosulfurigenes that is a potential starch
degradation products transport system permease protein (Accession No.
sp|P37730|AMYD_THETU).

[0228]A Gapped BlastP sequence alignment showed that SEQ ID NO:116 (423
amino acids) has about 60% identity from amino acids 8-421 with a protein
from Streptococcus mutans that is a multiple sugar-binding protein
precursor (Accession No. sp|Q00749|MSME_STRMU), about 56% identity from
amino acids 9-421 with a protein from Streptococcus pneumoniae that is a
sugar ABC transporter, sugar-binding protein (Accession Nos.
NP--346328.1; NC--003028), about 56% identity from amino acids
9-421 with a protein from Streptococcus pneumoniae that is an ABC
transporter substrate-binding protein-multiple sugars (Accession Nos.
NP--359304.1; NC--003098), about 29% identity from amino acids
9-420 with a protein from Clostridium acetobutylicum that is an ABC-type
sugar transport system, periplasmic sugar-binding component (Accession
Nos. NP--350253.1; NC--003030), and 24% identity from amino
acids 6-412 with a protein from Bacillus subtilis that is homologous to a
multiple sugar-binding protein (Accession Nos. NP--391140.1;
NC--000964).

[0229]A Gapped BlastP sequence alignment showed that SEQ ID NO:118 (279
amino acids) has about 57% identity from amino acids 1-273 with a protein
from Pediococcus pentosaceus that is a raffinose operon transcriptional
regulatory protein (rafr) (Accession No. sp|P43465|RAFR_PEDPE), about 35%
identity from amino acids 5-273 with a protein from Streptococcus mutans
that is homologous to a transcription regulator (msmR) (Accession No.
pir||A42400), about 35% identity from amino acids 5-273 with a
protein from Streptococcus mutans that is an msm operon regulatory
protein (Accession No. sp|Q00753|MSMR_STRMU), about 36% identity from
amino acids 19-273 with a protein from Streptococcus pneumoniae that is
an msm operon regulatory protein (Accession Nos. NP--346330.1;
NC--003028), and 36% identity from amino acids 19-273 with a protein
from Streptococcus pneumoniae that is an msm (multiple sugar metabolism)
operon regulatory protein (Accession Nos. NP--359306.1;
NC--003098).

[0230]A Gapped BlastP sequence alignment showed that SEQ ID NO:120 (277
amino acids) has about 28% identity from amino acids 37-141 with a
protein from Treponema pallidum that is homologous to an rRNA methylase
(Accession Nos. NP--218549.1; NC--000919), about 32% identity
from amino acids 74-141 with a protein from Guillardia theta that is a
GTP-binding nuclear protein RAN (Accession Nos. NP--113408.1;
NC--002753), about 29% identity from amino acids 75-141 with a
protein from Dictyostelium discoideum that is a GTP-binding nuclear
protein RAN/TC4 (Accession No. sp|P33519|RAN_DICDI), and about 25%
identity from amino acids 140-190 with a putative protein from
Arabidopsis thaliana (Accession Nos. NP--191798.1; NM--116104).

[0231]A Gapped BlastP sequence alignment showed that SEQ ID NO:122 (530
amino acids) has about 26% identity from amino acids 8-524 with a protein
from Lactococcus lactis subsp. lactis that is an ABC transporter ATP
binding and permease protein (Accession Nos. NP--267678.1;
NC--002662), about 25% identity from amino acids 49-518 with a
protein from Streptococcus pneumoniae that is an ABC transporter,
ATP-binding protein (Accession Nos. NP--344680.1; NC--003028),
about 25% identity from amino acids 49-518 with a protein from
Streptococcus pneumoniae that is an ABC transporter ATP-binding/membrane
spanning permease (Accession Nos. NP--357731.1; NC--003098),
about 24% identity from amino acids 47-511 with a protein from
Synechocystis sp. PCC 6803 that is an ABC transporter (Accession Nos.
NP--440626.1; NC--000911), and 24% identity from amino acids
7-511 with a protein from Bacillus subtilis that is homologous to an ABC
transporter (ATP-binding protein) (Accession Nos. NP--388852.1;
NC--000964).

[0232]A Gapped BlastP sequence alignment showed that SEQ ID NO:124 (530
amino acids) has about 24% identity from amino acids 4-524 with a protein
from Lactococcus lactis subsp. lactis that is an ABC transporter ATP
binding and permease protein (Accession Nos. NP--267678.1;
NC--002662), about 25% identity from amino acids 55-508 with a
protein from Streptococcus pneumoniae that is an ABC transporter,
ATP-binding protein (Accession Nos. NP--344680.1; NC--003028),
about 25% identity from amino acids 55-508 with a protein from
Streptococcus pneumoniae that is an ABC transporter ATP-binding/membrane
spanning permease (Accession Nos. NP--357731.1; NC--003098),
about 24% identity from amino acids 1-511 with a protein from
Streptococcus pneumoniae that is a drug efflux ABC transporter,
ATP-binding/permease (Accession Nos. NP--345800.1; NC--003028),
and 24% identity from amino acids 1-511 with a protein from Streptococcus
pneumoniae that is an ABC transporter ATP-binding/membrane spanning
protein (Accession Nos. NP--358796.1; NC--003098).

[0234]A Gapped BlastP sequence alignment showed that SEQ ID NO:128 (534
amino acids) has about 23% identity from amino acids 14-512 with a
protein from Streptococcus pneumoniae that is a comA protein (Accession
No. pir||A39203), about 26% identity from amino acids 3-512 with
a protein from Lactococcus lactis that is a Lactococcin A transport
ATP-binding protein (lcnC) (Accession No. sp|Q00564|LCNC_LACLA), about
23% identity from amino acids 14-512 with a protein from Streptococcus
pneumoniae that is a transport ATP-binding protein (ComA) (Accession Nos.
NP--357637.1; NC--003098), about 25% identity from amino acids
113-509 with a protein from Streptococcus salivarius that is an ABC
transporter (Accession Nos. gb|AAC72026.1; AF043280), and 22% identity
from amino acids 14-512 with a protein from Streptococcus pneumoniae that
is a competence factor transporting ATP-binding/permease protein (ComA)
(Accession Nos. NP--344591.1; NC--003028).

[0235]A Gapped BlastP sequence alignment showed that SEQ ID NO:130 (527
amino acids) has about 23% identity from amino acids 16-524 with a
protein from Lactococcus lactis subsp. lactis that is an ABC transporter
ATP binding and permease protein (Accession Nos. NP--267678.1;
NC--002662), about 25% identity from amino acids 6-520 with a
protein from Streptococcus pneumoniae that is an ABC transporter,
ATP-binding protein (Accession Nos. NP--344680.1; NC--003028),
about 25% identity from amino acids 6-520 with a protein from
Streptococcus pneumoniae that is an ABC transporter ATP-binding/membrane
spanning permease (Accession Nos. NP--357731.1; NC--003098),
about 24% identity from amino acids 105-511 with a protein from
Streptococcus pneumoniae that is an ABC transporter ATP-binding/membrane
spanning protein (Accession Nos. NP--358796.1; NC--003098), and
25% identity from amino acids 99-511 with a protein from Nostoc sp. PCC
7120 that is an ABC transporter ATP-binding protein (Accession Nos.
NP_490403.1; NC_003276).

[0236]A Gapped BlastP sequence alignment showed that SEQ ID NO:132 (529
amino acids) has about 25% identity from amino acids 10-526 with a
protein from Lactococcus lactis subsp. lactis that is an ABC transporter
ATP binding and permease protein (Accession Nos. NP_267678.1;
NC_002662), about 26% identity from amino acids 112-525 with a
protein from Streptococcus pneumoniae that is an ABC transporter
ATP-binding/membrane spanning permease (Accession Nos. NP--357731.1;
NC--003098), about 26% identity from amino acids 112-525 with a
protein from Streptococcus pneumoniae that is an ABC transporter,
ATP-binding protein (Accession Nos. NP--344680.1; NC--003028),
about 24% identity from amino acids 107-518 with a protein from
Brevibacillus brevis that is homologous to an ABC-transporter (TycD)
(Accession No. pir||T31077), and 24% identity from amino acids
83-521 with a protein from Streptococcus pneumoniae that is a drug efflux
ABC transporter, ATP-binding/permease (Accession Nos. NP--345800.1;
NC--003028).

[0237]A Gapped BlastP sequence alignment showed that SEQ ID NO:134 (600
amino acids) has about 23% identity from amino acids 2-600 with a protein
from Listeria innocua that is homologous to an ABC transporter (permease)
(Accession Nos. NP--471553.1; NC--003212), about 23% identity
from amino acids 1-598 with a protein from Listeria monocytogenes that is
homologous to an ABC transporter (permease) (Accession Nos.
NP--465271.1; NC--003210), about 22% identity from amino acids
1-599 with a protein from Clostridium perfringens that is homologous to
an ABC transporter (Accession Nos. NP--561767.1; NC--003366),
about 22% identity from amino acids 1-564 with a protein from Clostridium
perfringens that is homologous to an ABC-transporter (Accession Nos.
NP--561039.1; NC--003366), and 22% identity from amino acids
4-593 with a protein from Clostridium acetobutylicum that is homologous
to a permease (Accession Nos. NP--346868.1; NC--003030).

[0238]A Gapped BlastP sequence alignment showed that SEQ ID NO:136 (249
amino acids) has about 58% identity from amino acids 1-242 with a protein
from Clostridium perfringens that is homologous to an ABC transporter
(Accession Nos. NP--561766.1; NC--003366), about 55% identity
from amino acids 3-242 with a protein from Clostridium perfringens that
is homologous to an ABC transporter (Accession Nos. NP--561038.1;
NC--003366), about 51% identity from amino acids 1-242 with a
protein from Listeria monocytogenes that is homologous to an ABC
transporter (ATP-binding protein) (Accession Nos. NP--465638.1;
NC--003210), about 50% identity from amino acids 1-242 with a
protein from Listeria innocua that is homologous to an ABC-transporter
(ATP-binding protein) (Accession Nos. NP--471552.1;
NC--003212), and 54% identity from amino acids 3-242 with a protein
from Clostridium acetobutylicum that is an ABC transporter, ATP-binding
protein (Accession Nos. NP--346867.1; NC--003030).

[0239]A Gapped BlastP sequence alignment showed that SEQ ID NO:138 (423
amino acids) has about 21% identity from amino acids 2-391 with a
hypothetical protein from Streptococcus pyogenes (Accession Nos.
NP--270004.1; NC--002737), about 21% identity from amino acids
2-383 with a hypothetical protein from Streptococcus pyogenes (Accession
Nos. NP_608080.1; NC_003485), about 26% identity from amino acids
9-166 with a protein from Bacillus subtilis that is a yvbJ protein
(Accession Nos. NP--391268.1; NC--000964), about 25% identity
from amino acids 92-281 with a protein from Caprine
arthritis-encephalitis virus that is an env polyprotein precursor
(Accession No. pir||VCLJC6), and 24% identity from amino acids
92-281 with a protein from Caprine arthritis-encephalitis virus that is
an envelope glycoprotein (Accession Nos. gb|AAD14661.1; AF105181).

[0240]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:140 (438 amino acids) has about 27% identity from amino acids 86-216
with a protein from Brochothrix campestris that is a transport accessory
protein (Accession Nos. gb|AAC95141.1; AF075600), about 26% identity from
amino acids 107-219 with a protein from Streptococcus pneumoniae that is
a bacteriocin transport accessory protein (Accession Nos.
NP_345950.1; NC_003028), about 26% identity from amino acids
107-219 with a protein from Streptococcus pneumoniae that is a Bta
(Accession Nos. gb|AAD56628.1; AF165218), 23% identity from amino acids
88-201 with a hypothetical protein from Bacillus anthracis (Accession
Nos. NP--052783.1; NC--001496), and 32% identity from amino
acids 144-214 with a protein from Neisseria meningitidis that is a
thioredoxin (Accession Nos. NP--274384.1; NC--003112).

[0241]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:142 (196 amino acids) has about 56% identity from amino acids 1-196
with a protein from Lactobacillus gasseri (Accession Nos.
dbj|BAA82351.1; AB029612), about 49% identity from amino acids 10-196 with a
hypothetical protein from Lactobacillus sp. (Accession No.
sp|P29470|YLA1_LACAC), about 28% identity from amino acids 41-196 with a
protein from Lactobacillus casei that is an ABC-transporter accessory
factor (Accession Nos. NP--542220.1; NC--003320), 35% identity
from amino acids 90-196 with a protein from Lactobacillus plantarum that
is an accessory factor for ABC-transporter (PlnH) (Accession Nos.
emb|CAA64190.1; X94434), and 30% identity from amino acids 41-196 with a
protein from Lactobacillus sake that is homologous to an ABC exporter
accessory factor (SapE) (Accession No. pir||A56973).

[0242]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:144 (720 amino acids) has about 62% identity from amino acids 9-720
with a protein from Lactobacillus plantarum that is an ABC-transporter
(PlnG) (Accession Nos. emb|CAA64189.1; X94434), about 62% identity from
amino acids 6-720 with a protein from Lactobacillus sakei that is
homologous to a translocation protein (sppT), ATP-dependent (Accession
No. pir||S57913), about 62% identity from amino acids 2-720 with
a protein from Lactobacillus sakei that is an ATP-dependent transport
protein (SapT) (Accession No. pir||56273), 62% identity from
amino acids 9-720 with a protein from Lactobacillus casei that is an ABC
transporter (Accession Nos. NP--542219.1; NC--003320), and 57%
identity from amino acids 25-718 with a protein from Lactobacillus
acidophilus that is an ABC transporter (Accession Nos. NP--604412.1;
NC--003458).

[0243]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:146 (234 amino acids) has about 52% identity from amino acids 13-228
with a protein from Staphylococcus aureus subsp. aureus that is
homologous to an ABC transporter ATP-binding protein (Accession Nos.
NP--370833.1; NC--002758), about 50% identity from amino acids
11-234 with a protein from Streptococcus pyogenes that is homologous to
an ABC transporter (ATP-binding protein) (Accession Nos.
NP--606994.1; NC--003485), about 50% identity from amino acids
11-234 with a protein from Streptococcus pyogenes that is homologous to
an ABC transporter (ATP-binding protein) (Accession Nos.
NP--268993.1; NC--002737), 50% identity from amino acids 13-232
with a protein from Lactococcus lactis subsp. lactis that is an ABC
transporter ATP-binding protein (Accession Nos. NP--266815.1;
NC--002662), and 53% identity from amino acids 11-233 with a protein
from Lactococcus lactis subsp. lactis that is an ABC transporter
ATP-binding protein (Accession Nos. NP--268413.1; NC--002662).

[0245]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:150 (188 amino acids) has about 47% identity from amino acids 14-85
with a protein from Lactococcus lactis subsp. lactis that is a
transcriptional regulator (Accession Nos. NP--266817.1;
NC--002662), about 28% identity from amino acids 21-90 with a
protein from Aquifex aeolicus that is a transcriptional regulator in the
TetR/AcrR family (Accession Nos. NP--213195.1; NC--000918),
about 30% identity from amino acids 14-75 with a protein from Clostridium
acetobutylicum that is a transcriptional regulator in the AcrR family
(Accession Nos. NP--348163.1; NC--003030), 29% identity from
amino acids 25-109 with a protein from Streptomyces coelicolor that is
homologous to a transcriptional regulator (Accession Nos.
emb|CAB93030.1; AL357432), and 41% identity from amino acids 27-88 with a
protein from Clostridium acetobutylicum that is a transcriptional
regulator in the TetR/AcrR family (Accession Nos. NP--347457.1;
NC--003030).

[0246]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:152 (236 amino acids) has about 65% identity from amino acids 3-236
with a protein from Streptococcus pneumoniae that is an ABC transporter
ATP-binding protein (Accession Nos. NP--359090.1; NC--003098),
about 66% identity from amino acids 4-236 with a protein from
Streptococcus pneumoniae that is an ABC transporter, ATP-binding protein
(Accession Nos. NP--346092.1; NC--003028), about 65% identity
from amino acids 4-236 with a protein from Streptococcus pyogenes that is
homologous to an ABC transporter (ATP-binding protein) (Accession Nos.
NP--607321.1; NC--003485), 65% identity from amino acids 4-236
with a protein from Streptococcus pyogenes that is homologous to an ABC
transporter (ATP-binding protein) (Accession Nos. NP--269390.1;
NC--002737), and 62% identity from amino acids 4-236 with a protein
from Listeria monocytogenes that is homologous to an ABC transporter,
ATP-binding protein (Accession Nos. NP--464748.1; NC--003210).

[0250]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:160 (779 amino acids) has about 61% identity from amino acids 1-308
with a protein from Streptococcus mutans that is an ABC transporter ATP
binding subunit (Accession Nos. gb|AAD09218.1; U73183), about 37%
identity from amino acids 1-362 with a protein from Lactococcus lactis
subsp. lactis that is an ABC transporter ATP-binding and permease protein
(Accession Nos. NP--266870.1; NC--002662), about 39% identity
from amino acids 1-295 with a protein from Listeria monocytogenes that is
homologous to an ABC transporter, ATP-binding protein (Accession Nos.
NP--464271.1; NC--003210), 47% identity from amino acids 1-221
with a protein from Archaeoglobus fulgidus that is an ABC transporter,
ATP-binding protein (Accession Nos. NP--070298.1; NC--000917),
and 49% identity from amino acids 1-218 with a protein from Archaeoglobus
fulgidus that is an ABC transporter, ATP-binding protein (Accession Nos.
NP--069851.1; NC--000917).

[0251]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:162 (38 amino acids) has about 66% identity from amino acids 1-27 with
a protein from Clostridium acetobutylicum that is a mannose-specific
phosphotransferase system component (Accession Nos. NP--149230.1;
NC--001988), about 72% identity from amino acids 3-27 with a protein
from Listeria monocytogenes that is homologous to a PTS system
mannose-specific factor IIAB (Accession Nos. NP--463629.1;
NC--003210), about 72% identity from amino acids 3-27 with a protein
from Listeria innocua that is homologous to a PTS system mannose-specific
factor IIAB (Accession Nos. NP--469488.1; NC--003212), 66%
identity from amino acids 1-27 with a protein from Clostridium
perfringens that is a PTS system protein (Accession Nos.
NP--561737.1; NC--003366), and 65% identity from amino acids
2-27 with a protein from Streptococcus pyogenes that is a
mannose-specific phosphotransferase system component IIAB (Accession Nos.
NP--269761.1; NC--002737).

[0252]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:164 (105 amino acids) has about 60% identity from amino acids 1-103
with a protein from Listeria monocytogenes that is homologous to a PTS
system mannose-specific factor IIAB (Accession Nos. NP--463629.1;
NC--003210), about 59% identity from amino acids 1-103 with a
protein from Listeria innocua that is homologous to a PTS system
mannose-specific factor IIAB (Accession Nos. NP_469488.1;
NC_003212), about 57% identity from amino acids 1-104 with a
protein from Clostridium perfringens that is a PTS system protein
(Accession Nos. NP--561737.1; NC--003366), 53% identity from
amino acids 1-104 with a protein from Clostridium acetobutylicum that is
a mannose-specific phosphotransferase system component IIAB (Accession
Nos. NP--149230.1; NC--001988), and 54% identity from amino
acids 1-96 with a protein from Streptococcus pyogenes that is a
mannose-specific phosphotransferase system component IIAB (Accession Nos.
NP--607831.1; NC--003485).

[0253]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:166 (269 amino acids) has about 69% identity from amino acids 1-269
with a protein from Listeria innocua that is homologous to a PTS system
mannose-specific, factor IIC (Accession Nos. NP_469489.1;
NC_003212), about 69% identity from amino acids 1-269 with a
protein from Listeria monocytogenes that is homologous to a PTS system
mannose-specific, factor IIC (Accession Nos. NP--463630.1;
NC--003210), about 67% identity from amino acids 1-269 with a
protein from Streptococcus pneumoniae that is a PTS system,
mannose-specific IIC component (Accession Nos. NP--344821.1;
NC--003028), 65% identity from amino acids 1-269 with a protein from
Streptococcus pyogenes that is homologous to a mannose-specific
phosphotransferase system component IIC (Accession Nos.
NP--269762.1; NC--002737), and 64% identity from amino acids
1-269 with a protein from Clostridium acetobutylicum that is a
mannose/fructose-specific phosphotransferase system component IIC
(Accession Nos. NP--149231.1; NC--001988).

[0254]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:168 (307 amino acids) has about 67% identity from amino acids 5-307
with a protein from Listeria innocua that is homologous to a PTS system
mannose-specific factor IID (Accession Nos. NP_469490.1;
NC_003212), about 67% identity from amino acids 5-307 with a
protein from Listeria monocytogenes that is homologous to a PTS system
mannose-specific factor IID (Accession Nos. NP--463631.1;
NC--003210), about 64% identity from amino acids 6-303 with a
protein from Clostridium acetobutylicum that is a mannose-specific
phosphotransferase system component IID (Accession Nos.
NP--149232.1; NC--001988), 64% identity from amino acids 4-300
with a protein from Lactococcus lactis subsp. lactis that is a
mannose-specific PTS system component IID (EC 2.7.1.69) (Accession Nos.
NP--267864.1; NC--002662), and 64% identity from amino acids
5-307 with a protein from Streptococcus pneumoniae that is a PTS system,
mannose-specific IID component (Accession Nos. NP--344820.1;
NC--003028).

[0255]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:170 (111 amino acids) has about 51% identity from amino acids 4-105
with a protein from Streptococcus pyogenes that is homologous to a PTS
system enzyme II protein (Accession Nos. NP--269441.1;
NC--002737), about 54% identity from amino acids 4-110 with a
protein from Listeria monocytogenes that is homologous to a cellobiose
phosphotransferase enzyme IIB component (Accession Nos.
NP--466205.1; NC--003210), about 54% identity from amino acids
4-110 with a protein from Listeria innocua that is homologous to a
cellobiose phosphotransferase enzyme IIB component (Accession Nos.
NP--472159.1; NC--003212), 50% identity from amino acids 4-105
with a protein from Streptococcus pyogenes that is homologous to a PTS
system enzyme II (Accession Nos. NP--607438.1; NC--003485), and
50% identity from amino acids 1-109 with a protein from Lactococcus
lactis subsp. lactis that is a cellobiose-specific PTS system IIB
component (EC 2.7.1.69) (Accession Nos. NP--266569.1;
NC--002662).

[0256]A Gapped BlastP (version) sequence alignment showed that SEQ ID
NO:172 (256 amino acids) has about 53% identity from amino acids 1-250
with a protein from Streptococcus pneumoniae that is a phosphotransferase
system sugar-specific EII component (Accession Nos. NP--357876.1;
NC--003098), about 53% identity from amino acids 1-250 with a
protein from Streptococcus pneumoniae that is a PTS system IIC component
(Accession Nos. NP--344847.1; NC--003028), about 43% identity
from amino acids 1-255 with a protein from Clostridium acetobutylicum
that is a PTS cellobiose-specific component IIC (Accession Nos.
NP_347026.1; NC_003030), 38% identity from amino acids 1-249
with a protein from Lactococcus lactis subsp. lactis that is a
cellobiose-specific PTS system IIC component (EC 2.7.1.69) (Accession
Nos. NP--266572.1; NC--002662), and 37% identity from amino
acids 1-255 with a protein from Listeria innocua that is homologous to a
PTS system, cellobiose-specific IIC component (Accession Nos.
NP--470241.1; NC--003212).

[0262]Global gene expression patterns obtained from growth on eight
different carbohydrates were visualized by cluster analysis (Eisen et al.
(1998) Proc. Natl. Acad. Sci. USA 95:14863-14868) using Ward's
hierarchical clustering method, volcano plots and contour plots. Overall,
between 23 and 379 genes were differentially expressed between paired
treatment conditions (with p-values below the Bonferroni-corrected
threshold), representing between 1% and 20% of the genome, respectively. All possible
treatment comparisons were considered, and a gene was considered induced
above a particular level if it showed induction in at least one treatment
comparison. For genes that showed induction in more than one instance,
the highest induction level was selected. Although 342 genes (18% of the
genome) showed induction levels above two fold, only 63 genes (3% of the
genome) showed induction above 4 fold, indicating a relatively small
number of genes were highly induced. Although overall expression levels
of the majority of the genes remained consistent regardless of the growth
substrate (80% of the genome), select clusters showed differential
transcription of genes and operons. Nevertheless, for each sugar, a
limited number of genes showed specific induction.
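The selection rule described above (compare all treatment pairs, keep the highest induction per gene, then count genes above a fold threshold) can be sketched as follows. The gene names, expression values, significance level and test count are hypothetical placeholders chosen for illustration; the specification does not state them.

```python
from itertools import permutations

# Hypothetical normalized expression values per gene and carbohydrate.
expression = {
    "geneA": {"glucose": 100.0, "fructose": 110.0, "sucrose": 95.0},
    "geneB": {"glucose": 50.0, "fructose": 260.0, "sucrose": 55.0},
    "geneC": {"glucose": 300.0, "fructose": 90.0, "sucrose": 180.0},
}


def highest_induction(levels: dict) -> float:
    """Largest fold ratio over every ordered pair of treatments."""
    return max(levels[a] / levels[b] for a, b in permutations(levels, 2))


def count_induced(table: dict, fold: float) -> int:
    """Number of genes whose highest induction exceeds `fold`."""
    return sum(1 for levels in table.values() if highest_induction(levels) > fold)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff under a Bonferroni correction."""
    return alpha / n_tests


print(count_induced(expression, 2.0), count_induced(expression, 4.0))
# Assumed alpha and test count, not values from the specification.
print(bonferroni_threshold(0.05, 1000))
```

Because a gene is counted once at its highest induction level, the tally above two fold always includes every gene counted above four fold, which matches the nesting of the 342-gene and 63-gene sets reported in the text.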

[0263]In the presence of glucose, ORF 1679 (SEQ ID NO:133) and ORF 1680
(SEQ ID NO:135) were highly induced when compared to other
monosaccharides (fructose, galactose) and di-saccharides (sucrose,
lactose, trehalose). The induction levels compared to other sugars varied
between 3.5 and 6.3 for ORF 1679 (SEQ ID NO:133) and between 3.7 and 4.7
for ORF 1680 (SEQ ID NO:135). ORF 1679 (SEQ ID NO:133) encodes an ABC
nucleotide binding protein, including commonly found nucleotide binding
domain motifs, namely Walker A, Walker B, ABC signature sequence and Linton
and Higgins motif. ORF 1680 (SEQ ID NO:135) encodes an ABC permease, with
10 predicted membrane spanning domains. No solute binding protein is
encoded in their vicinity, suggesting a possible role as an exporter
rather than an importer. Several genes and operons were specifically
repressed by glucose, including ORF 680 (SEQ ID NO:239) through ORF 686, which
are involved in glycogen metabolism. Since glycogen is metabolized by the
cell in order to store energy, in the presence of a preferred carbon
source such as glucose, energy storage is not necessary. Other genes
repressed in the presence of glucose included proteins involved in uptake
of alternative carbohydrate sources, and enzymes involved in hydrolysis
of such carbohydrates.

[0272]For the five selected genes, induction levels were compared between
six different treatments, resulting in 15 induction levels for each gene.
The induction levels measured by microarrays were plotted against
induction levels measured by Q-PCR, in order to validate microarray data.
Individual R-square values ranged between 0.642 and 0.883 for each of the
tested genes (between 0.652 and 0.978 using data in a log2 scale).
When the data were combined, the global R-square value was 0.78 (0.88
using data in a log2 scale). A correlation analysis was run in SAS
(Cary, N.C.), and showed a correlation between the two methods with
P-values less than 0.001, for Spearman, Hoeffding and Kendall tests.
Additionally, a regression analysis was run in Excel (Microsoft, CA), and
showed a statistically highly significant (p<1.02×10.sup.-25)
correlation between microarray data and Q-PCR results. Nevertheless,
Q-PCR measurements revealed larger induction levels, which is likely due
to the smaller dynamic range of the microarray scanner, compared to that
of the Q-PCR cycler. Similar results have been reported previously
(Wagner et al. (2003) J. Bacteriol. 185:2080-2095).
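The arithmetic behind this validation can be sketched briefly. Six treatments yield 15 unordered pairs, and each pair contributes one induction level per gene. The R-square of a simple linear fit equals the squared Pearson correlation, which is computed below on both the raw and the log2 scale. The treatment labels and the induction values are hypothetical, and the sketch does not reproduce the SAS or Excel analyses.

```python
from itertools import combinations
from math import log2

# Six treatments give C(6, 2) = 15 pairwise comparisons per gene.
treatments = ["T1", "T2", "T3", "T4", "T5", "T6"]  # placeholder labels
pairs = list(combinations(treatments, 2))
print(len(pairs))


def r_squared(x: list, y: list) -> float:
    """Squared Pearson correlation, i.e. R-square of a simple linear fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical fold inductions for one gene measured by both methods.
microarray = [1.5, 2.0, 4.0, 8.0]
qpcr = [2.0, 3.0, 9.0, 30.0]

r2_linear = r_squared(microarray, qpcr)
r2_log2 = r_squared([log2(v) for v in microarray], [log2(v) for v in qpcr])
print(r2_linear, r2_log2)
```

Fold changes are multiplicative, so a log2 transform spreads small and large inductions more evenly, which is one reason the fit can differ between the two scales.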

EXAMPLE 6

Comparative Analysis

[0273]Comparative analyses of global transcription profiles determined for
growth on eight carbohydrates identified the basis for carbohydrate
transport and catabolism in L. acidophilus. Specifically, three different
types of carbohydrate transporters were differentially expressed, namely
phosphoenolpyruvate:sugar phosphotransferase system (PTS), ATP-binding
cassette (ABC) and galactoside-pentose-hexuronide (GPH) translocator,
illustrating the diversity of carbohydrate transporters used by
Lactobacillus acidophilus. Transcription profiles suggested that
galactosides were transported by a GPH translocator, while mono- and
di-saccharides were transported by members of the PTS, and
polysaccharides were transported by members of the ABC family.

[0275]A number of lactic acid bacteria take up glucose via a PTS
transporter. The EII.sup.Man PTS transporter has the ability to import
both mannose and glucose (Cochu et al. 2003). The Lactobacillus
acidophilus mannose PTS system is similar to that of Streptococcus
thermophilus, with proteins sharing 53-65% identity and 72-79%
similarity. Specifically, the EII.sup.Man is composed of three proteins
IIAB.sup.Man, IIC.sup.Man and IID.sup.Man, encoded by ORF 452 (SEQ ID
NOS:1 and 263) (manL), ORF 455 (SEQ ID NO:165) (manM) and ORF 456 (SEQ ID
NO:167) (manN), respectively (FIG. 2). Most of the carbohydrates examined
here specifically induced genes involved in their own transport and
hydrolysis, but glucose did not. Analysis of the mannose PTS revealed
that the genes encoding the EIIABCD.sup.Man were consistently highly
expressed, regardless of the carbohydrate source. This expression profile
suggests glucose is a preferred carbohydrate, and Lactobacillus
acidophilus is also designed for efficient utilization of different
carbohydrate sources, as suggested previously for Lactobacillus plantarum
(Kleerebezem et al. (2003) Proc. Natl. Acad. Sci. USA 100:1990-1995).

[0277]Although it was previously suggested that the phosphoenolpyruvate:
phosphotransferase system is the primary sugar transport system of
Gram-positive bacteria (Ajdic et al. (2002) Proc. Natl. Acad. Sci. USA
99:14434-14439; Warner and Lolkema (2003) Microbiol. Mol. Biol. Rev.
67:475-490), current microarray data indicate that ABC transport systems
are also important. While PTS transporters are involved in uptake of
mono- and di-saccharides, those carbohydrates are digested in the upper
GIT. In contrast, oligosaccharides reach the lower intestine, where
commensals are likely to compete for more complex and scarce nutrients.
Perhaps under such conditions ABC transporters are even more crucial than
the PTS, given their apparent roles in transport of oligosaccharides like
FOS and raffinose. In this regard, the ability to utilize nutrients that
are non-digestible by the host has been associated with
competitiveness and persistence of beneficial intestinal flora in the
colon (Schell et al. (2002) Proc. Natl. Acad. Sci. USA 99: 14422-14427).

[0279]Carbon catabolite repression has already been described in
lactobacilli (Mahr et al. 2000). The PTS is characterized by a phosphate
transfer cascade involving PEP, EI, HPr and EIIABC, whereby a phosphate is
ultimately transferred to the carbohydrate substrate (Saier, 2000; Warner
and Lolkema, 2003). HPr is an important component of CCR, which is
regulated via phosphorylation by enzyme I and HPrK/P. When HPr is
phosphorylated at His15, the PTS is active, and carbohydrates transported
via the PTS are phosphorylated via EIIABCs. In contrast, when HPr is
phosphorylated at Ser46, the PTS machinery is not functional (Mijakovic
et al. (2002) Proc. Natl. Acad. Sci. USA 99:13442-13447).

[0280]Although the phosphorylation cascade suggests regulation at the
protein level, several studies report transcriptional modulation of ccpA
and ptsHI. In S. thermophilus, CcpA production is induced by glucose (van
den Bogaart et al. 2000). In several bacteria, the carbohydrate source
modulates ptsHI transcription levels (Luesink et al. 1999). In contrast,
expression levels of ccpA, ptsH, ptsI and ptsK did not vary in the
presence of different carbohydrates in Lactobacillus acidophilus. These
results are consistent with regulation via phosphorylation at the protein
level. Similar results have been reported for ccpA expression levels in
Lactobacillus pentosus (Mahr et al. (2000) Appl. Environ. Microbiol.
66:277-283), and ptsHI transcription in S. thermophilus (Cochu et al.
(2003) Appl. Environ. Microbiol. 69:5423-5432).

[0282]In summary, a variety of carbohydrate uptake systems were identified
and characterized, with respect to expression profiles in the presence of
different carbohydrates, including PTS, ABC and GPH transporters. The
uptake and catabolic machinery is highly regulated at the transcription
level, suggesting the Lactobacillus acidophilus transcriptome is
flexible, dynamic and designed for efficient carbohydrate utilization.
Differential gene expression indicated the presence of a global carbon
catabolite repression regulatory network. Regulatory proteins were
consistently highly expressed, suggesting regulation at the protein
level, rather than the transcriptional level. Collectively, Lactobacillus
acidophilus appears to be able to efficiently adapt its metabolic
machinery to fluctuating carbohydrate sources available in the
nutritionally complex environment of the small intestine. In particular,
ABC transporters of the MsmEFG family involved in uptake of FOS and
raffinose likely play an important role in the ability of Lactobacillus
acidophilus to compete with intestinal commensals for complex sugars that
are not digested by the human host. Ultimately, this information provides
new insights into how undigested dietary compounds influence the
intestinal microbial balance. This study is a model for comparative
transcriptional analysis of a bacterium exposed to varying growth
substrates.

EXAMPLE 7

Multidrug Transporters

[0283]Microorganisms such as Lactobacillus acidophilus have developed
various methods by which to resist the toxic effects of antibiotics and
other deleterious compounds. One such method involves transporters that
promote the active efflux of drugs, by which drug resistance may be
effected for a particular microorganism. There are two major classes of
multidrug transporters: secondary multidrug transporters that utilize the
transmembrane electrochemical gradient of protons or sodium ions to drive
the extrusion of drugs from a cell; and ATP-binding cassette (ABC)-type
multidrug transporters that utilize the free energy of ATP hydrolysis to
pump drugs out of the cell.

[0287]All publications, patents and patent applications mentioned in the
specification are indicative of the level of those skilled in the art to
which this invention pertains. All publications, patents and patent
applications are herein incorporated by reference to the same extent as
if each individual publication or patent application was specifically and
individually indicated to be incorporated by reference.

[0288]Although the foregoing invention has been described in some detail
by way of illustration and example for purposes of clarity of
understanding, it will be obvious that certain changes and modifications
may be practiced within the scope of the appended claims.

SEQUENCE LISTING
The patent application contains a lengthy "Sequence Listing" section. A
copy of the "Sequence Listing" is available in electronic form from the
USPTO web site
(http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US20090093021A1).
An electronic copy of the "Sequence Listing" will also be available from
the USPTO upon request and payment of the fee set forth in 37 CFR
1.19(b)(3).