Title: Nucleic acid and amino acid sequences relating to Streptococcus
pneumoniae for diagnostics and therapeutics
United States Patent: 7,582,731
Issued: September 1, 2009
Inventors: Doucette-Stamm; Lynn (Framingham, MA), Bush; David
(Somerville, MA), Zeng; Qiandong (Waltham, MA), Opperman; Timothy
(Somerville, MA), Houseweart; Chad Eric (Waltham, MA)
Assignee: Sanofi Pasteur Limited (Toronto, Ontario, CA)
Appl. No.: 11/643,458
Filed: December 20, 2006
Abstract
The invention provides isolated
polypeptide and nucleic acid sequences derived from Streptococcus
pneumoniae that are useful in diagnosis and therapy of pathological
conditions; antibodies against the polypeptides; and methods for the
production of the polypeptides. The invention also provides methods for
the detection, prevention and treatment of pathological conditions
resulting from bacterial infection.
Description of the Invention
SUMMARY OF THE INVENTION
The present invention fulfills the need for diagnostic tools and
therapeutics by providing bacterial-specific compositions and methods for
detecting, treating, and preventing bacterial infection, in particular S.
pneumoniae infection.
The present invention encompasses isolated polypeptides and nucleic acids
derived from S. pneumoniae that are useful as reagents for diagnosis of
bacterial infection, components of effective antibacterial vaccines,
and/or as targets for antibacterial drugs, including anti-S. pneumoniae
drugs. The nucleic acids and peptides of the present invention also have
utility for diagnostics and therapeutics for S. pneumoniae and other
Streptococcus species. They can also be used to detect the presence of S.
pneumoniae and other Streptococcus species in a sample; and in screening
compounds for the ability to interfere with the S. pneumoniae life cycle
or to inhibit S. pneumoniae infection. More specifically, this invention
features compositions of nucleic acids corresponding to entire coding
sequences of S. pneumoniae proteins, including surface or secreted
proteins or parts thereof, nucleic acids capable of binding mRNA from S.
pneumoniae proteins to block protein translation, and methods for
producing S. pneumoniae proteins or parts thereof using peptide synthesis
and recombinant DNA techniques. This invention also features antibodies
and nucleic acids useful as probes to detect S. pneumoniae infection. In
addition, vaccine compositions and methods for the protection or treatment
of infection by S. pneumoniae are within the scope of this invention.
The nucleotide sequences provided in SEQ ID NO: 1-SEQ ID NO: 2661, a
fragment thereof, or a nucleotide sequence at least 99.5% identical to a
sequence contained within SEQ ID NO: 1-SEQ ID NO: 2661 may be "provided"
in a variety of media to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic acid
molecule, which contains a nucleotide sequence of the present invention,
i.e., the nucleotide sequence provided in SEQ ID NO: 1-SEQ ID NO: 2661, a
fragment thereof, or a nucleotide sequence at least 99.5% identical to a
sequence contained within SEQ ID NO: 1-SEQ ID NO: 2661. Uses for and
methods for providing nucleotide sequences in a variety of media are well
known in the art (see, e.g., EPO Publication No. EP 0 756 006).
In one application of this embodiment, a nucleotide sequence of the
present invention can be recorded on computer readable media. As used
herein, "computer readable media" refers to any media which can be read
and accessed directly by a computer. Such media include, but are not
limited to: magnetic storage media, such as floppy discs, hard disc
storage media, and magnetic tape; optical storage media such as CD-ROM;
electrical storage media such as RAM and ROM; and hybrids of these
categories such as magnetic/optical storage media. A person skilled in the
art can readily appreciate how any of the presently known computer
readable media can be used to create a manufacture comprising computer
readable media having recorded thereon a nucleotide sequence of the
present invention.
As used herein, "recorded" refers to a process for storing information on
computer readable media. A person skilled in the art can readily adopt any
of the presently known methods for recording information on computer
readable media to generate manufactures comprising the nucleotide sequence
information of the present invention.
A variety of data storage structures are available to a person skilled in
the art for creating a computer readable medium having recorded thereon a
nucleotide sequence of the present invention. The choice of the data
storage structure will generally be based on the means chosen to access
the stored information. In addition, a variety of data processor programs
and formats can be used to store the nucleotide sequence information of
the present invention on computer readable media. The sequence information
can be represented in a word processing text file, formatted in
commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database
application, such as DB2, Sybase, Oracle, or the like. A person skilled in
the art can readily adapt any number of data processor structuring formats
(e.g. text file or database) in order to obtain computer readable media
having recorded thereon the nucleotide sequence information of the present
invention.
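As a minimal sketch of the two storage options named above (an ASCII text file and a database application), the following Python formats a record as FASTA-style text and stores it in an SQLite table. The record name, the placeholder sequence, the table layout, and the choice of SQLite are illustrative assumptions; the text names DB2, Sybase, and Oracle and does not prescribe any schema.

```python
import sqlite3

def to_fasta(name, seq, width=60):
    """Format one record as FASTA-style ASCII text, wrapping at `width`."""
    lines = [">" + name]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"

def store_in_db(conn, name, seq):
    """Store one record in a hypothetical `sequences` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences (name TEXT PRIMARY KEY, seq TEXT)"
    )
    conn.execute("INSERT OR REPLACE INTO sequences VALUES (?, ?)", (name, seq))
    conn.commit()

def fetch_from_db(conn, name):
    """Return the stored sequence for `name`, or None if it is absent."""
    row = conn.execute(
        "SELECT seq FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

# Placeholder record, not a sequence from the Sequence Listing.
example_name = "example_record"
example_seq = "ATGGCTAAAGGTGAACTGTAA"
```

The text file suits sequential reading by analysis programs, while the table suits lookup by record name; which is preferable depends, as stated above, on the means chosen to access the stored information.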
By providing the nucleotide sequence of SEQ ID NO: 1-SEQ ID NO: 2661, a
fragment thereof, or a nucleotide sequence at least 99.5% identical to a
sequence contained within SEQ ID NO: 1-SEQ ID NO: 2661 in computer
readable form, a person skilled in the art can routinely access the
sequence information for a variety of purposes. Computer software is
publicly available which allows a person skilled in the art to access
sequence information provided in a computer readable medium. Examples of
such computer software include programs of the "Staden Package", "DNA
Star", "MacVector", GCG "Wisconsin Package" (Genetics Computer Group,
Madison, Wis.) and "NCBI toolbox" (National Center for Biotechnology
Information).
Computer algorithms enable the identification of S. pneumoniae open
reading frames (ORFs) within SEQ ID NO: 1-SEQ ID NO: 2661 which contain
homology to ORFs or proteins from other organisms. Examples of such
similarity-search algorithms include the BLAST [Altschul et al., J. Mol.
Biol. 215:403-410 (1990)] and Smith-Waterman [Smith and Waterman (1981)
Advances in Applied Mathematics, 2:482-489] search algorithms. These
algorithms are utilized on computer systems as exemplified below. The ORFs
so identified represent protein encoding fragments within the S.
pneumoniae genome and are useful in producing commercially important
proteins such as enzymes used in fermentation reactions and in the
production of commercially useful metabolites.
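The ORF identification step can be illustrated with a simple scan for ATG-to-stop reading frames. This is only a sketch under stated assumptions: it uses the standard stop codons, treats ATG as the sole start codon, and examines the forward strand only. It is not the procedure used in the patent, which relies on similarity searches against known ORFs and proteins.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG-to-stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon. `min_codons` counts
    codons from the start codon up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs
```

To cover both strands, the same function could be run a second time on the reverse complement of the sequence.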
The present invention further provides systems, particularly
computer-based systems, which contain the sequence information described
herein. Such systems are designed to identify commercially important
fragments of the S. pneumoniae genome. As used herein, "a computer-based
system" refers to the hardware means, software means, and data storage
means used to analyze the nucleotide sequence information of the present
invention. The minimum hardware means of the computer-based systems of the
present invention comprises a central processing unit (CPU), input means,
output means, and data storage means. A person skilled in the art can
readily appreciate that any one of the currently available computer-based
systems is suitable for use in the present invention. The computer-based
systems of the present invention comprise a data storage means having
stored therein a nucleotide sequence of the present invention and the
necessary hardware means and software means for supporting and
implementing a search means. As used herein, "data storage means" refers
to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having
recorded thereon the nucleotide sequence information of the present
invention.
As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or
target structural motif with the sequence information stored within the
data storage means. Search means are used to identify fragments or regions
of the S. pneumoniae genome which are similar to, or "match", a particular
target sequence or target motif. A variety of algorithms are known in the
art and have been disclosed publicly, and a variety of commercially
available software packages for conducting homology-based similarity
searches can be used in the computer-based systems of the present
invention. Examples of such software include, but are not limited to, FASTA
(GCG Wisconsin Package), Bic_SW (Compugen Bioccelerator), BLASTN2, BLASTP2
and BLASTX2 (NCBI), and Motifs (GCG). A person skilled in the art
can readily recognize that any one of the available algorithms or
implementing software packages for conducting homology searches can be
adapted for use in the present computer-based systems.
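To make the idea of a search means concrete, here is a minimal sketch of Smith-Waterman local alignment scoring, one of the algorithms cited above. The scoring values (match, mismatch, linear gap penalty) are illustrative assumptions, and the code returns only the best score rather than the alignment itself; production tools such as those listed use substitution matrices and affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of `a` and `b` with a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

A target sequence would be scored against each stored sequence in turn, and the stored sequences with the highest scores reported as matches.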
As used herein, a "target sequence" can be any DNA or amino acid sequence
of six or more nucleotides or two or more amino acids. A person skilled in
the art can readily recognize that the longer a target sequence is, the
less likely a target sequence will be present as a random occurrence in
the database. The most preferred sequence length of a target sequence is
from about 10 to 100 amino acids or from about 30 to 300 nucleotide
residues. However, it is well recognized that many genes are longer than
500 amino acids, or 1.5 kb in length, and that commercially important
fragments of the S. pneumoniae genome, such as sequence fragments involved
in gene expression and protein processing, will often be shorter than 30
nucleotides.
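The relationship between target length and chance occurrence can be estimated with a short calculation. The sketch below assumes independent, equally frequent residues and a single strand, which real genomes only approximate; the database length used in the usage note is a hypothetical figure, not a value from the patent.

```python
def expected_random_hits(target_len, db_len, alphabet_size=4):
    """Expected chance occurrences of one fixed word in a random sequence.

    Assumes independent, equally frequent residues and one strand only.
    Use alphabet_size=4 for nucleotides and 20 for amino acids.
    """
    positions = max(db_len - target_len + 1, 0)
    return positions / alphabet_size ** target_len
```

Under these assumptions, a 6-nucleotide target is expected to occur by chance several hundred times in a database of 2,000,000 nucleotides, whereas a 30-nucleotide target is expected to occur far less than once.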
As used herein, "a target structural motif," or "target motif," refers to
any rationally selected sequence or combination of sequences in which the
sequence(s) are chosen based on a specific functional domain or
three-dimensional configuration which is formed upon the folding of the
target polypeptide. There are a variety of target motifs known in the art.
Protein target motifs include, but are not limited to, enzymatic active
sites, membrane spanning regions, and signal sequences. Nucleic acid
target motifs include, but are not limited to, promoter sequences, hairpin
structures and inducible expression elements (protein binding sequences).
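A nucleic acid target motif search can be sketched as a pattern scan. The example below expands a subset of the IUPAC degenerate nucleotide codes into a regular expression and reports every match position. The function name and the motifs used in the usage note are illustrative assumptions and are not taken from the patent.

```python
import re

# Subset of IUPAC nucleotide codes, expanded to regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]",
         "W": "[AT]", "S": "[CG]", "N": "[ACGT]"}

def motif_positions(seq, motif):
    """Return 0-based start positions of a degenerate motif, overlaps included."""
    pattern = "".join(IUPAC[ch] for ch in motif.upper())
    # A lookahead reports overlapping matches as well as disjoint ones.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", seq.upper())]
```

For instance, `motif_positions("GGTATAATCC", "TATAAT")` scans for the hexamer TATAAT, a widely cited bacterial -10 promoter consensus, and returns `[2]`.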
A variety of structural formats for the input and output means can be used
to input and output the information in the computer-based systems of the
present invention. A preferred format for an output means ranks fragments
of the S. pneumoniae genome possessing varying degrees of homology to the
target sequence or target motif. Such presentation provides a person
skilled in the art with a ranking of sequences which contain various
amounts of the target sequence or target motif and identifies the degree
of homology contained in the identified fragment.
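The ranked output format described above might look like the following sketch, which scores each fragment by its best ungapped window identity to the target and sorts the results. The identity measure, the function names, and the fragment labels are assumptions chosen for brevity; the search programs named in this document report their own alignment scores instead.

```python
def best_window_identity(target, fragment):
    """Highest fraction of matching positions over all ungapped placements."""
    n = len(target)
    if n == 0 or len(fragment) < n:
        return 0.0
    best = 0
    for i in range(len(fragment) - n + 1):
        matches = sum(1 for x, y in zip(target, fragment[i:i + n]) if x == y)
        best = max(best, matches)
    return best / n

def rank_fragments(target, fragments):
    """Return (name, identity) pairs sorted from most to least similar."""
    scored = [(name, best_window_identity(target, seq))
              for name, seq in fragments.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Each entry in the returned list pairs a fragment with its degree of similarity, so the reader sees both the ranking and the score behind it.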
A variety of comparing means can be used to compare a target sequence or
target motif with the data storage means to identify sequence fragments of
the S. pneumoniae genome. In the present examples, software implementing
the BLASTP2 and Bic_SW algorithms (Altschul et al., J. Mol.
Biol. 215:403-410 (1990); Compugen Bioccelerator) was used to identify
open reading frames within the S. pneumoniae genome. A person skilled in
the art can readily recognize that any one of the publicly available
homology search programs can be used as the search means for the computer-
based systems of the present invention.
The invention features S. pneumoniae polypeptides, preferably a
substantially pure preparation of an S. pneumoniae polypeptide, or a
recombinant S. pneumoniae polypeptide. In preferred embodiments: the
polypeptide has biological activity; the polypeptide has an amino acid
sequence at least 60%, 70%, 80%, 90%, 95%, 98%, or 99% identical to an
amino acid sequence of the invention contained in the Sequence Listing,
preferably it has about 65% sequence identity with an amino acid sequence
of the invention contained in the Sequence Listing, and most preferably it
has about 92% to about 99% sequence identity with an amino acid sequence
of the invention contained in the Sequence Listing; the polypeptide has an
amino acid sequence essentially the same as an amino acid sequence of the
invention contained in the Sequence Listing; the polypeptide is at least
5, 10, 20, 50, 100, or 150 amino acid residues in length; the polypeptide
includes at least 5, preferably at least 10, more preferably at least 20,
more preferably at least 50, 100, or 150 contiguous amino acid residues of
the invention contained in the Sequence Listing. In yet another preferred
embodiment, the amino acid sequence which differs in sequence identity by
about 7% to about 8% from the S. pneumoniae amino acid sequences of the
invention contained in the Sequence Listing is also encompassed by the
invention.
In preferred embodiments: the S. pneumoniae polypeptide is encoded by a
nucleic acid of the invention contained in the Sequence Listing, or by a
nucleic acid having at least 60%, 70%, 80%, 90%, 95%, 98%, or 99% homology
with a nucleic acid of the invention contained in the Sequence Listing.
In a preferred embodiment, the subject S. pneumoniae polypeptide differs
in amino acid sequence at 1, 2, 3, 5, 10 or more residues from a sequence
of the invention contained in the Sequence Listing. The differences,
however, are such that the S. pneumoniae polypeptide exhibits an S.
pneumoniae biological activity, e.g., the S. pneumoniae polypeptide
retains a biological activity of a naturally occurring S. pneumoniae
enzyme.
In preferred embodiments, the polypeptide includes all or a fragment of an
amino acid sequence of the invention contained in the Sequence Listing;
fused, in reading frame, to additional amino acid residues, preferably to
residues encoded by genomic DNA 5' or 3' to the genomic DNA which encodes
a sequence of the invention contained in the Sequence Listing.
In yet other preferred embodiments, the S. pneumoniae polypeptide is a
recombinant fusion protein having a first S. pneumoniae polypeptide
portion and a second polypeptide portion, e.g., a second polypeptide
portion having an amino acid sequence unrelated to S. pneumoniae. The
second polypeptide portion can be, e.g., any of glutathione-S-transferase,
a DNA binding domain, or a polymerase activating domain. In a preferred
embodiment, the fusion protein can be used in a two-hybrid assay.
Polypeptides of the invention include those which arise as a result of
alternative transcription events, alternative RNA splicing events, and
alternative translational and posttranslational events.
In a preferred embodiment, the encoded S. pneumoniae polypeptide differs
(e.g., by amino acid substitution, addition or deletion of at least one
amino acid residue) in amino acid sequence at 1, 2, 3, 5, 10 or more
residues, from a sequence of the invention contained in the Sequence
Listing. The differences, however, are such that the encoded S. pneumoniae
polypeptide exhibits an S. pneumoniae biological activity, e.g.,
the encoded S. pneumoniae enzyme retains a biological activity of a
naturally occurring S. pneumoniae enzyme.
In preferred embodiments, the encoded polypeptide includes all or a
fragment of an amino acid sequence of the invention contained in the
Sequence Listing; fused, in reading frame, to additional amino acid
residues, preferably to residues encoded by genomic DNA 5' or 3' to the
genomic DNA which encodes a sequence of the invention contained in the
Sequence Listing.
The S. pneumoniae strain 14453, from which the genomic sequences were
determined, was deposited on June 26, 1997 with the American Type
Culture Collection, 10801 University Blvd., Manassas, Va. 20110-2209, and
assigned the ATCC designation #55987.
Included in the invention are: allelic variations; natural mutants;
induced mutants; proteins encoded by DNA that hybridizes under high or low
stringency conditions to a nucleic acid which encodes a polypeptide of the
invention contained in the Sequence Listing (for definitions of high and
low stringency see Current Protocols in Molecular Biology, John Wiley &
Sons, New York, 1989, 6.3.1-6.3.6, hereby incorporated by reference); and,
polypeptides specifically bound by antisera to S. pneumoniae polypeptides,
especially by antisera to an active site or binding domain of S.
pneumoniae polypeptide. The invention also includes fragments, preferably
biologically active fragments. These and other polypeptides are also
referred to herein as S. pneumoniae polypeptide analogs or variants.
The invention further provides nucleic acids, e.g., RNA or DNA, encoding a
polypeptide of the invention. This includes double stranded nucleic acids
as well as coding and antisense single strands.
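The relationship between the coding strand and the antisense strand can be shown with a reverse-complement helper. This is a minimal sketch for DNA sequences written 5' to 3' using only the letters A, C, G, and T; the function name is an illustrative choice.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the antisense strand of `seq`, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a convenient sanity check.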
In preferred embodiments, the subject S. pneumoniae nucleic acid will
include a transcriptional regulatory sequence, e.g. at least one of a
transcriptional promoter or transcriptional enhancer sequence, operably
linked to the S. pneumoniae gene sequence, e.g., to render the S.
pneumoniae gene sequence suitable for expression in a recombinant host
cell.
In yet a further preferred embodiment, the nucleic acid which encodes an
S. pneumoniae polypeptide of the invention, hybridizes under stringent
conditions to a nucleic acid probe corresponding to at least 8 consecutive
nucleotides of the invention contained in the Sequence Listing; more
preferably to at least 12 consecutive nucleotides of the invention
contained in the Sequence Listing; more preferably to at least 20
consecutive nucleotides of the invention contained in the Sequence
Listing; more preferably to at least 40 consecutive nucleotides of the
invention contained in the Sequence Listing.
In another aspect, the invention provides a substantially pure nucleic
acid having a nucleotide sequence which encodes an S. pneumoniae
polypeptide. In preferred embodiments: the encoded polypeptide has
biological activity; the encoded polypeptide has an amino acid sequence at
least 60%, 70%, 80%, 90%, 95%, 98%, or 99% homologous to an amino acid
sequence of the invention contained in the Sequence Listing; the encoded
polypeptide has an amino acid sequence essentially the same as an amino
acid sequence of the invention contained in the Sequence Listing; the
encoded polypeptide is at least 5, 10, 20, 50, 100, or 150 amino acids in
length; the encoded polypeptide comprises at least 5, preferably at least
10, more preferably at least 20, more preferably at least 50, 100, or 150
contiguous amino acids of the invention contained in the Sequence Listing.
In another aspect, the invention encompasses: a vector including a nucleic
acid which encodes an S. pneumoniae polypeptide or an S. pneumoniae
polypeptide variant as described herein; a host cell transfected with the
vector; and a method of producing a recombinant S. pneumoniae polypeptide
or S. pneumoniae polypeptide variant; including culturing the cell, e.g.,
in a cell culture medium, and isolating an S. pneumoniae polypeptide or an
S. pneumoniae polypeptide variant, e.g., from the cell or from the cell
culture medium.
In another series of embodiments, the invention provides isolated nucleic
acids comprising sequences at least about 8 nucleotides in length, more
preferably at least about 12 nucleotides in length, and most preferably at
least about 15-20 nucleotides in length, that correspond to a subsequence
of any one of SEQ ID NO: 1-SEQ ID NO: 2661 or complements thereof.
Alternatively, the nucleic acids comprise sequences contained within any
ORF (open reading frame), including a complete protein-coding sequence, of
which any of SEQ ID NO: 1-SEQ ID NO: 2661 forms a part. The invention
encompasses sequence-conservative variants and function-conservative
variants of these sequences. The nucleic acids may be DNA, RNA, DNA/RNA
duplexes, peptide nucleic acid (PNA), or derivatives thereof.
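Whether a candidate nucleic acid contains a subsequence of at least a given length that corresponds to a reference sequence or its complement can be checked computationally. The sketch below tests for an exact shared run of `min_len` nucleotides on either strand. The function name, the exact-match criterion, and the sequences in the usage note are assumptions for illustration and say nothing about how correspondence would be assessed legally or experimentally.

```python
def shares_subsequence(candidate, reference, min_len=8):
    """True if `candidate` contains a run of at least `min_len` nucleotides
    that occurs exactly in `reference` or in its reverse complement."""
    comp = str.maketrans("ACGT", "TGCA")
    reference = reference.upper()
    both_strands = (reference, reference.translate(comp)[::-1])
    candidate = candidate.upper()
    for i in range(len(candidate) - min_len + 1):
        word = candidate[i:i + min_len]
        if any(word in strand for strand in both_strands):
            return True
    return False
```

With the placeholder reference `"ATGGCTAAAGGTGAACTG"`, the candidate `"CCGCTAAAGGCC"` shares the 8-nucleotide run GCTAAAGG, so the check returns `True`.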
In another aspect, the invention features a purified recombinant nucleic
acid having at least 50%, 60%, 70%, 80%, 90%, 95%, 98%, or 99% homology
with a sequence of the invention contained in the Sequence Listing.
In another aspect, the invention features nucleic acids capable of binding
mRNA of S. pneumoniae. Such nucleic acid is capable of acting as antisense
nucleic acid to control the translation of mRNA of S. pneumoniae. A
further aspect features a nucleic acid which is capable of binding
specifically to an S. pneumoniae nucleic acid. These nucleic acids are
also referred to herein as complements and have utility as probes and as
capture reagents.
In another aspect, the invention features an expression system comprising
an open reading frame corresponding to S. pneumoniae nucleic acid. The
nucleic acid further comprises a control sequence compatible with an
intended host. The expression system is useful for making polypeptides
corresponding to S. pneumoniae nucleic acid.
In another aspect, the invention features a cell transformed with the
expression system to produce S. pneumoniae polypeptides.
In yet another embodiment, the invention encompasses reagents for
detecting bacterial infection, including S. pneumoniae infection, which
comprise at least one S. pneumoniae-derived nucleic acid defined by any
one of SEQ ID NO: 1-SEQ ID NO: 2661, or sequence-conservative or
function-conservative variants thereof. Alternatively, the diagnostic
reagents comprise polypeptide sequences that are contained within any open
reading frames (ORFs), including complete protein-coding sequences,
contained within any of SEQ ID NO: 1-SEQ ID NO: 2661, or polypeptide
sequences contained within any of SEQ ID NO: 2662-SEQ ID NO: 5322, or
polypeptides of which any of the above sequences forms a part, or
antibodies directed against any of the above peptide sequences or
function-conservative variants and/or fragments thereof.
The invention further provides antibodies, preferably monoclonal
antibodies, which specifically bind to the polypeptides of the invention.
Methods are also provided for producing antibodies in a host animal. The
methods of the invention comprise immunizing an animal with at least one
S. pneumoniae-derived immunogenic component, wherein the immunogenic
component comprises one or more of the polypeptides encoded by any one of
SEQ ID NO: 1-SEQ ID NO: 2661 or sequence-conservative or
function-conservative variants thereof; or polypeptides that are contained
within any ORFs, including complete protein-coding sequences, of which any
of SEQ ID NO: 1-SEQ ID NO: 2661 forms a part; or polypeptide sequences
contained within any of SEQ ID NO: 2662-SEQ ID NO: 5322; or polypeptides
of which any of SEQ ID NO: 2662-SEQ ID NO: 5322 forms a part. Host animals
include any warm blooded animal, including without limitation mammals and
birds. Such antibodies have utility as reagents for immunoassays to
evaluate the abundance and distribution of S. pneumoniae-specific
antigens.
In yet another aspect, the invention provides a method for detecting
bacterial antigenic components in a sample, which comprises the steps of:
(i) contacting a sample suspected to contain a bacterial antigenic
component with a bacterial-specific antibody, under conditions in which a
stable antigen-antibody complex can form between the antibody and
bacterial antigenic components in the sample; and (ii) detecting any
antigen-antibody complex formed in step (i), wherein detection of an
antigen-antibody complex indicates the presence of at least one bacterial
antigenic component in the sample. In different embodiments of this
method, the antibodies used are directed against a sequence encoded by any
of SEQ ID NO: 1-SEQ ID NO: 2661 or sequence-conservative or
function-conservative variants thereof, or against a polypeptide sequence
contained in any of SEQ ID NO: 2662-SEQ ID NO: 5322 or
function-conservative variants thereof.
In yet another aspect, the invention provides a method for detecting
antibacterial-specific antibodies in a sample, which comprises: (i)
contacting a sample suspected to contain antibacterial-specific antibodies
with an S. pneumoniae antigenic component, under conditions in which a
stable antigen-antibody complex can form between the S. pneumoniae
antigenic component and antibacterial antibodies in the sample; and (ii)
detecting any antigen-antibody complex formed in step (i), wherein
detection of an antigen-antibody complex indicates the presence of
antibacterial antibodies in the sample. In different embodiments of this
method, the antigenic component is encoded by a sequence contained in any
of SEQ ID NO: 1-SEQ ID NO: 2661 or sequence-conservative and
function-conservative variants thereof, or is a polypeptide sequence
contained in any of SEQ ID NO: 2662-SEQ ID NO: 5322 or
function-conservative variants thereof.
In another aspect, the invention features a method of generating vaccines
for immunizing an individual against S. pneumoniae. The method includes:
immunizing a subject with an S. pneumoniae polypeptide, e.g., a surface or
secreted polypeptide, or active portion thereof, and a pharmaceutically
acceptable carrier. Such vaccines have therapeutic and prophylactic
utilities.
In another aspect, the invention features a method of evaluating a
compound, e.g. a polypeptide, e.g., a fragment of a host cell polypeptide,
for the ability to bind an S. pneumoniae polypeptide. The method includes:
contacting the candidate compound with an S. pneumoniae polypeptide and
determining if the compound binds or otherwise interacts with an S.
pneumoniae polypeptide. Compounds which bind S. pneumoniae are candidates
as activators or inhibitors of the bacterial life cycle. These assays can
be performed in vitro or in vivo.
In another aspect, the invention features a method of evaluating a
compound, e.g. a polypeptide, e.g., a fragment of a host cell polypeptide,
for the ability to bind an S. pneumoniae nucleic acid, e.g., DNA or RNA.
The method includes: contacting the candidate compound with an S.
pneumoniae nucleic acid and determining if the compound binds or otherwise
interacts with an S. pneumoniae nucleic acid. Compounds which bind S.
pneumoniae are candidates as activators or inhibitors of the bacterial
life cycle. These assays can be performed in vitro or in vivo.
Claim 1 of 7 Claims
1. An isolated polypeptide comprising the amino acid sequence as set forth
in SEQ ID NO: 5179.