Title: Synthetic hepatitis C genes
United States Patent: 6,653,125
Issued: November 25, 2003
Inventors: Donnelly; John J. (Moraga, CA); Liu; Margaret A. (Lafayette, CA); Shiver; John W. (Doylestown, PA); Fu; Tong-Ming (Lansdale, PA)
Assignee: Merck & Co., Inc. (Rahway, NJ)
Appl. No.: 194949
Filed: February 17, 2000

Abstract

The present invention relates to polynucleotides comprising a DNA sequence encoding an HCV protein and fragments thereof that contain codons optimized for expression in a vertebrate host. Uses of the polynucleotides include eliciting an immune response specifically recognizing HCV.

SUMMARY OF THE INVENTION

This invention relates to novel formulations of nucleic acid pharmaceutical products, specifically nucleic acid vaccine products. The nucleic acid products, when introduced directly into muscle cells, induce the production of immune responses which specifically recognize Hepatitis C virus (HCV).

DETAILED DESCRIPTION OF THE INVENTION

This invention relates to novel formulations of nucleic acid pharmaceutical products, specifically nucleic acid vaccine products. The nucleic acid vaccine products, when introduced directly into muscle cells, induce the production of immune responses which specifically recognize Hepatitis C virus (HCV).

Non-A, Non-B hepatitis (NANBH) is a transmissible disease (or family of diseases) that is believed to be virally induced, and is distinguishable from other forms of virus-associated liver disease, such as those caused by hepatitis A virus (HAV), hepatitis B virus (HBV), delta hepatitis virus (HDV), cytomegalovirus (CMV) or Epstein-Barr virus (EBV). Epidemiologic evidence suggests that there may be three types of NANBH: the water-borne epidemic type; the blood- or needle-associated type; and the sporadically occurring (community-acquired) type. However, the number of causative agents is unknown.
Recently, a new viral species, hepatitis C virus (HCV), has been identified as the primary (if not only) cause of blood-associated NANBH (BB-NANBH). Hepatitis C appears to be the major form of transfusion-associated hepatitis in a number of countries, including the United States and Japan. There is also evidence implicating HCV in induction of hepatocellular carcinoma. Thus, a need exists for an effective method for preventing or treating HCV infection: currently, there is none.

HCV may be distantly related to the Flaviviridae. The Flavivirus family contains a large number of viruses which are small, enveloped pathogens of man. The morphology and composition of Flavivirus particles are known, and are discussed in M. A. Brinton, in "The Viruses: The Togaviridae And Flaviviridae" (Series eds. Fraenkel-Conrat and Wagner, vol. eds. Schlesinger and Schlesinger, Plenum Press, 1996), pp. 327-374. Generally, with respect to morphology, Flaviviruses contain a central nucleocapsid surrounded by a lipid bilayer. Virions are spherical and have a diameter of about 40-50 nm. Their cores are about 25-30 nm in diameter. Along the outer surface of the virion envelope are projections measuring about 5-10 nm in length with terminal knobs about 2 nm in diameter. Typical examples of the family include Yellow Fever virus, West Nile virus, and Dengue Fever virus. They possess positive-stranded RNA genomes (about 11,000 nucleotides) that are slightly larger than that of HCV and encode a polyprotein precursor of about 3500 amino acids. Individual viral proteins are cleaved from this precursor polypeptide.

The genome of HCV appears to be single-stranded RNA containing about 10,000 nucleotides. The genome is positive-stranded, and possesses a continuous translational open reading frame (ORF) that encodes a polyprotein of about 3,000 amino acids.
In the ORF, the structural proteins appear to be encoded in approximately the first quarter of the N-terminal region, with the majority of the polyprotein attributed to non-structural proteins. When compared with all known viral sequences, small but significant co-linear homologies are observed with the nonstructural proteins of the Flavivirus family, and with the pestiviruses (which are now also considered to be part of the Flavivirus family).

Intramuscular inoculation of polynucleotide constructs, i.e., DNA plasmids encoding proteins, has been shown to result in the generation of the encoded protein in situ in muscle cells. By using cDNA plasmids encoding viral proteins, both antibody and CTL responses were generated, providing homologous and heterologous protection against subsequent challenge with either the homologous strain or a different strain, respectively. Each of these types of immune responses offers a potential advantage over existing vaccination strategies. The use of PNVs (polynucleotide vaccines) to generate antibodies may result in an increased duration of the antibody responses as well as the provision of an antigen that can have both the exact sequence of the clinically circulating strain of virus and the proper post-translational modifications and conformation of the native protein (vs. a recombinant protein). The generation of CTL responses by this means offers the benefits of cross-strain protection without the use of a live, potentially pathogenic vector or attenuated virus.

The standard techniques of molecular biology for preparing and purifying DNA constructs enable the preparation of the DNA therapeutics of this invention.
While standard techniques of molecular biology are therefore sufficient for the production of the products of this invention, the specific constructs disclosed herein provide novel therapeutics which surprisingly produce cross-strain protection, a result heretofore unattainable with standard inactivated whole virus or subunit protein vaccines.

The amount of expressible DNA to be introduced to a vaccine recipient will depend on the strength of the transcriptional and translational promoters used in the DNA construct, and on the immunogenicity of the expressed gene product. In general, an immunologically or prophylactically effective dose of about 1 µg to 1 mg, and preferably about 10 µg to 300 µg, is administered directly into muscle tissue. Subcutaneous injection, intradermal introduction, impression through the skin, and other modes of administration such as intraperitoneal, intravenous, or inhalation delivery are also contemplated. It is also contemplated that booster vaccinations are to be provided.

The DNA may be naked, that is, unassociated with any proteins, adjuvants or other agents which impact on the recipient's immune system. In this case, it is desirable for the DNA to be in a physiologically acceptable solution, such as, but not limited to, sterile saline or sterile buffered saline. Alternatively, the DNA may be associated with surfactants or liposomes, such as lecithin liposomes or other liposomes known in the art, as a DNA-liposome mixture (see for example WO93/24640), or the DNA may be associated with an adjuvant known in the art to boost immune responses, such as a protein or other carrier. Agents which assist in the cellular uptake of DNA, such as, but not limited to, calcium ions, detergents, viral proteins and other transfection facilitating agents may also be used to advantage. These agents are generally referred to as transfection facilitating agents and as pharmaceutically acceptable carriers.
As used herein, the term gene refers to a segment of nucleic acid which encodes a discrete polypeptide. The terms pharmaceutical and vaccine are used interchangeably to indicate compositions useful for inducing immune responses. The terms construct and plasmid are used interchangeably. The term vector is used to indicate a DNA into which genes may be cloned for use according to the method of this invention.

The following examples are provided to further define the invention, without limiting the invention to the specifics of the examples.

EXAMPLE 1

V1J Expression Vectors

V1J is derived from vectors V1 and pUC18, a commercially available plasmid. V1 was digested with SspI and EcoRI restriction enzymes, producing two fragments of DNA. The smaller of these fragments, containing the CMVintA promoter and Bovine Growth Hormone (BGH) transcription termination elements which control the expression of heterologous genes, was purified from an agarose electrophoresis gel. The ends of this DNA fragment were then "blunted" using the T4 DNA polymerase enzyme in order to facilitate its ligation to another "blunt-ended" DNA fragment.

pUC18 was chosen to provide the "backbone" of the expression vector. It is known to produce high yields of plasmid, is well-characterized by sequence and function, and is of minimum size. We removed the entire lac operon from this vector, which was unnecessary for our purposes and may be detrimental to plasmid yields and heterologous gene expression, by partial digestion with the HaeII restriction enzyme. The remaining plasmid was purified from an agarose electrophoresis gel, blunt-ended with the T4 DNA polymerase, treated with calf intestinal alkaline phosphatase, and ligated to the CMVintA/BGH element described above. Plasmids exhibiting either of two possible orientations of the promoter elements within the pUC backbone were obtained. One of these plasmids gave much higher yields of DNA in E. coli and was designated V1J.
This vector's structure was verified by sequence analysis of the junction regions and was subsequently demonstrated to give comparable or higher expression of heterologous genes compared with V1. The ampicillin resistance marker was replaced with the neomycin resistance marker to yield vector V1Jneo.

An Sfi I site was added to V1Jneo to facilitate integration studies. A commercially available 13 base pair Sfi I linker (New England BioLabs) was added at the Kpn I site within the BGH sequence of the vector. V1Jneo was linearized with Kpn I, gel purified, blunted by T4 DNA polymerase, and ligated to the blunt Sfi I linker. Clonal isolates were chosen by restriction mapping and verified by sequencing through the linker. The new vector was designated V1Jns. Expression of heterologous genes in V1Jns (with Sfi I) was comparable to expression of the same genes in V1Jneo (with Kpn I).

Vector V1Ra was derived from vector V1R, a derivative of the V1Jns vector. Multiple cloning sites (BglII, KpnI, EcoRV, EcoRI, SalI, and NotI) were introduced into V1R to create the V1Ra vector to improve the convenience of subcloning. V1Ra vector derivatives containing the tpa leader sequence and ubiquitin sequence were generated (Vtpa (FIG. 3) and VUb (FIG. 4), respectively). Expression of viral antigen from the Vtpa vector will target the antigen protein into the exocytic pathway, thus producing a secretable form of the antigen proteins. These secreted proteins are likely to be captured by professional antigen presenting cells, such as macrophages and dendritic cells, and processed and presented by class II molecules to activate CD4+ Th cells. They also are more likely to efficiently stimulate antibody responses. Expression of viral antigen from the VUb vector will produce a ubiquitin and antigen fusion protein.
The uncleavable ubiquitin segment (glycine to alanine change at the cleavage site, Butt et al., JBC 263:16364, 1988) will target the viral antigen to ubiquitin-associated proteasomes for rapid degradation. The resulting peptide fragments will be transported into the ER for antigen presentation by class I molecules. This modification is intended to enhance the class I molecule-restricted CTL responses against the viral antigen (Townsend et al, JEM 168:1211, 1988).

EXAMPLE 2

Design and Construction of the Synthetic Genes

A. Design of Synthetic Gene Segments for HCV Gene Expression

Gene segments were converted to sequences having identical translated sequences (except where noted) but with alternative codon usage as defined by R. Lathe in a research article from J. Molec. Biol. Vol. 183, pp. 1-12 (1985) entitled "Synthetic Oligonucleotide Probes Deduced from Amino Acid Sequence Data: Theoretical and Practical Considerations". The methodology described below was based on our hypothesis that the known inability to express a gene efficiently in mammalian cells is a consequence of the overall transcript composition. Thus, using alternative codons encoding the same protein sequence may remove the constraints on HCV gene expression. Inspection of the codon usage within the HCV genome revealed that a high percentage of codons were among those infrequently used by highly expressed human genes. The specific codon replacement method employed may be described as follows, employing data from Lathe:

1. Identify placement of codons for proper open reading frame.
2. Compare wild type codon for observed frequency of use by human genes.
3. If codon is not the most commonly employed, replace it with an optimal codon for high expression.
4. Inspect the third nucleotide of the new codon and the first nucleotide of the adjacent codon immediately 3'- of the first. If a 5'-CG-3' pairing has been created by the new codon selection, replace it with an optimal codon for high expression.
5. Repeat this procedure until the entire gene segment has been replaced.
6. Inspect new gene sequence for undesired sequences generated by these codon replacements (e.g., "ATTTA" sequences, inadvertent creation of intron splice recognition sites, unwanted restriction enzyme sites, etc.) and substitute codons that eliminate these sequences.
7. Assemble synthetic gene segments and test for improved expression.

B. HCV Core Antigen Sequence

The consensus core sequence of HCV was adopted from a generalized core sequence reported by Bukh et al. (PNAS, 91:8239, 1994). This core sequence contains all the identified CTL epitopes in both human and mouse. The gene is composed of 573 nucleotides and encodes 191 amino acids. The predicted molecular weight is about 23 kDa.

The codon replacement was conducted to eliminate codons which may hinder the expression of the HCV core protein in transfected mammalian cells, in order to maximize the translational efficiency of the DNA vaccine. Twenty three point two percent (23.2%) of the nucleotide sequence (133 out of 573 nucleotides) was altered, resulting in changes of 61.3% of the codons (117 out of 191 codons) in the core antigen sequence.

C. Construction of the Synthetic Core Gene

The optimized HCV core gene was constructed as a synthetic gene annealed from multiple synthetic oligonucleotides. To facilitate the identification and evaluation of the synthetic gene expression in cell culture and its immunogenicity in mice, a CTL epitope derived from influenza virus nucleoprotein residues 366-374 and an antibody epitope sequence derived from SV40 T antigen residues 684-699 were tagged to the carboxyl terminus of the core sequence. For clinical use it may be desired to express the core sequence without the nucleoprotein 366-374 and SV40 T 684-698 sequences. For this reason, the sequence of the two epitopes is flanked by two EcoRI sites which will be used to excise this fragment of sequence at a later time.
Thus an embodiment of the invention for clinical use could consist of the V1Ra.HCV1CorePAb, Vtpa.HCV1CorePAb, or VUb.HCV1CorePAb plasmids that had been cut with EcoRI, annealed, and ligated to yield plasmids V1Ra.HCV1Core, Vtpa.HCV1Core, and VUb.HCV1Core.

The synthetic gene was built as three separate segments in three vectors: nucleotides 1 to 80 in V1Ra, nucleotides 80 to 347 (BstXI site) in pUC18, and nucleotides 347 to 573 plus the two epitope sequences in pUC18. All the segments were verified by DNA sequencing, and joined together in the V1Ra vector.

D. HCV Gene Expression Constructs

In each case, the junction sequences from the 5' promoter region (CMVintA) into the cloned gene are shown. The position at which the junction occurs is demarcated by a "/", which does not represent any discontinuity in the sequence. The nomenclature for these constructs follows the convention: "Vector name-HCV strain-gene".

V1Ra.HCV1.CorePAb
---IntA--AGA TCT ACC ATG AGC (SEQ. ID. NO. 17)--HCV.Core.--
GCC GAA TTC GCT TCC (SEQ. ID. NO. 18)--PAb Sequence--
TAA ACC CGG GAA TTC TAA A GTC GAC (SEQ. ID. NO. 19)--BGH---
Vtpa.HCV1.CorePAb
---IntA--ATC ACC ATG GAT (SEQ. ID. NO. 20)--tpa leader--
GAG ATC-TTC ATG AGC (SEQ. ID. NO. 21)--HCV.Core.--
GCC GAA TTC GCT TCC (SEQ. ID. NO. 18)--PAb Sequence--
TAA ACC CGG GAA TTC TAA A GTC GAC (SEQ. ID. NO. 19)--BGH---
VUb.HCV1.CorePAb
---IntA--AGA TCC ACC ATG CAG (SEQ. ID. NO. 22)--Ubiquitin--
GGT GCA GAT CTG ATG AGC (SEQ. ID. NO. 23)--HCV.Core.--
GCC GAA TTC GCT TCC (SEQ. ID. NO. 18)--PAb Sequence--
TAA ACC CGG GAA TTC TAA A GTC GAC--BGH--
V1Ra.HCV1.Core
---IntA--AGA TCT ACC ATG AGC (SEQ. ID. NO. 17)--HCV.Core.--
GCC TAA A GTC GAC (SEQ. ID. NO. 24)--BGH---
Vtpa.HCV1.Core
---IntA--ATC ACC ATG GAT (SEQ. ID. NO. 20)--tpa leader--
GAG ATC-TTC ATG AGC (SEQ. ID. NO. 21)--HCV.Core.--
GCC TAA A GTC GAC (SEQ. ID. NO. 24)--BGH---
VUb.HCV1.Core
---IntA--AGA TCC ACC ATG CAG (SEQ. ID. NO. 22)--Ubiquitin--
GGT GCA GAT CTG ATG AGC (SEQ. ID. NO. 23)--HCV.Core.--
GCC TAA A GTC GAC (SEQ. ID. NO. 24)--BGH--
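
The codon replacement procedure of Example 2A (steps 1 to 5) can be sketched roughly as follows. This is only an illustration under stated assumptions: the ranked codon choices in RANKED_CODONS are hypothetical placeholders, not the values taken from Lathe, and the function names are invented for this sketch. The screening of step 6 (ATTTA motifs, splice sites, restriction sites) is not implemented.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: illustrative ranked codon choices per amino acid. The first
# entry stands in for the "optimal" codon; the second is a fallback that does
# not end in C, used when the first would create a 5'-CG-3' junction.
RANKED_CODONS = {
    "A": ["GCC", "GCT"], "R": ["AGG"], "N": ["AAC", "AAT"], "D": ["GAC", "GAT"],
    "C": ["TGC", "TGT"], "Q": ["CAG"], "E": ["GAG"], "G": ["GGC", "GGA"],
    "H": ["CAC", "CAT"], "I": ["ATC", "ATT"], "L": ["CTG"], "K": ["AAG"],
    "M": ["ATG"], "F": ["TTC", "TTT"], "P": ["CCC", "CCT"], "S": ["TCC", "TCT"],
    "T": ["ACC", "ACT"], "W": ["TGG"], "Y": ["TAC", "TAT"], "V": ["GTG"],
    "*": ["TAA"],
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def optimize(dna):
    """Replace each codon with a preferred synonym, avoiding CG at junctions."""
    protein = translate(dna)          # step 1: fix the reading frame
    chosen = []
    next_first = ""                   # first base of the codon on the 3' side
    # Work from the 3' end so the downstream codon is already decided.
    for aa in reversed(protein):
        for codon in RANKED_CODONS[aa]:          # steps 2-3: preferred codon
            creates_cg = codon[2] == "C" and next_first == "G"
            if not creates_cg:                   # step 4: avoid 5'-CG-3'
                break
        chosen.append(codon)
        next_first = codon[0]
    return "".join(reversed(chosen))  # step 5: whole segment replaced


if __name__ == "__main__":
    # First 60 nucleotides of the wild-type core sequence (SEQ ID NO 5).
    wild_type = "".join(
        "atgagcacga atcctaaacc tcaaagaaaa accaaacgta acaccaaccg ccgcccacag".split()
    )
    print(optimize(wild_type))
    print(translate(wild_type) == translate(optimize(wild_type)))
```

Because the table above is a placeholder, the output is not expected to reproduce SEQ ID NO 2 exactly; the sketch only shows that the encoded protein is preserved while codon choice and CG junctions are controlled.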
E. Other Synthetic HCV Genes

Using similar codon optimization techniques, synthetic genes encoding the HCV E1, HCV E2, HCV E1+E2, HCV NS5a and HCV NS5b proteins were created.

SEQUENCE LISTING
<100> GENERAL INFORMATION:
<160> NUMBER OF SEQ ID NOS: 25
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 1
<211> LENGTH: 3610
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 1
gatattggct attggccatt gcatacgttg tatccatatc ataatatgta catttatatt 60
ggctcatgtc caacattacc gccatgttga cattgattat tgactagtta ttaatagtaa 120
tcaattacgg ggtcattagt tcatagccca tatatggagt tccgcgttac ataacttacg 180
gtaaatggcc cgcctggctg accgcccaac gacccccgcc cattgacgtc aataatgacg 240
tatgttccca tagtaacgcc aatagggact ttccattgac gtcaatgggt ggagtattta 300
cggtaaactg cccacttggc agtacatcaa gtgtatcata tgccaagtac gccccctatt 360
gacgtcaatg acggtaaatg gcccgcctgg cattatgccc agtacatgac cttatgggac 420
tttcctactt ggcagtacat ctacgtatta gtcatcgcta ttaccatggt gatgcggttt 480
tggcagtaca tcaatgggcg tggatagcgg tttgactcac ggggatttcc aagtctccac 540
cccattgacg tcaatgggag tttgttttgg caccaaaatc aacgggactt tccaaaatgt 600
cgtaacaact ccgccccatt gacgcaaatg ggcggtaggc gtgtacggtg ggaggtctat 660
ataagcagag ctcgtttagt gaaccgtcag atcgcctgga gacgccatcc acgctgtttt 720
gacctccata gaagacaccg ggaccgatcc agcctccgcg gccgggaacg gtgcattgga 780
acgcggattc cccgtgccaa gagtgacgta agtaccgcct atagagtcta taggcccacc 840
cccttggctt cttatgcatg ctatactgtt tttggcttgg ggtctataca cccccgcttc 900
ctcatgttat aggtgatggt atagcttagc ctataggtgt gggttattga ccattattga 960
ccactcccct attggtgacg atactttcca ttactaatcc ataacatggc tctttgccac 1020
aactctcttt attggctata tgccaataca ctgtccttca gagactgaca cggactctgt 1080
atttttacag gatggggtct catttattat ttacaaattc acatatacaa caccaccgtc 1140
cccagtgccc gcagttttta ttaaacataa cgtgggatct ccacgcgaat ctcgggtacg 1200
tgttccggac atgggctctt ctccggtagc ggcggagctt ctacatccga gccctgctcc 1260
catgcctcca gcgactcatg gtcgctcggc agctccttgc tcctaacagt ggaggccaga 1320
cttaggcaca gcacgatgcc caccaccacc agtgtgccgc acaaggccgt ggcggtaggg 1380
tatgtgtctg aaaatgagct cggggagcgg gcttgcaccg ctgacgcatt tggaagactt 1440
aaggcagcgg cagaagaaga tgcaggcagc tgagttgttg tgttctgata agagtcagag 1500
gtaactcccg ttgcggtgct gttaacggtg gagggcagtg tagtctgagc agtactcgtt 1560
gctgccgcgc gcgccaccag acataatagc tgacagacta acagactgtt cctttccatg 1620
ggtcttttct gcagtcaccg tccttagatc taggtaccag atatcagaat tcagtcgaca 1680
gcggccgcga tctgctgtgc cttctagttg ccagccatct gttgtttgcc cctcccccgt 1740
gccttccttg accctggaag gtgccactcc cactgtcctt tcctaataaa atgaggaaat 1800
tgcatcgcat tgtctgagta ggtgtcattc tattctgggg ggtggggtgg ggcagcacag 1860
caagggggag gattgggaag acaatagcag gcatgctggg gatgcggtgg gctctatggg 1920
tacggccgca gcggccttaa ttaaggccgc agcggccgta cccaggtgct gaagaattga 1980
cccggttcct cgacccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc 2040
cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta 2100
taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg 2160
ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcaatgc 2220
tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac 2280
gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac 2340
ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg 2400
aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga 2460
aggacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt 2520
agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag 2580
cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacgtgatcc 2640
cgtaatgctc tgccagtgtt acaaccaatt aaccaattct gattagaaaa actcatcgag 2700
catcaaatga aactgcaatt tattcatatc aggattatca ataccatatt tttgaaaaag 2760
ccgtttctgt aatgaaggag aaaactcacc gaggcagttc cataggatgg caagatcctg 2820
gtatcggtct gcgattccga ctcgtccaac atcaatacaa cctattaatt tcccctcgtc 2880
aaaaataagg ttatcaagtg agaaatcacc atgagtgacg actgaatccg gtgagaatgg 2940
caaaagctta tgcatttctt tccagacttg ttcaacaggc cagccattac gctcgtcatc 3000
aaaatcactc gcatcaacca aaccgttatt cattcgtgat tgcgcctgag cgagacgaaa 3060
tacgcgatcg ctgttaaaag gacaattaca aacaggaatc gaatgcaacc ggcgcaggaa 3120
cactgccagc gcatcaacaa tattttcacc tgaatcagga tattcttcta atacctggaa 3180
tgctgttttc ccggggatcg cagtggtgag taaccatgca tcatcaggag tacggataaa 3240
atgcttgatg gtcggaagag gcataaattc cgtcagccag tttagtctga ccatctcatc 3300
tgtaacatca ttggcaacgc tacctttgcc atgtttcaga aacaactctg gcgcatcggg 3360
cttcccatac aatcgataga ttgtcgcacc tgattgcccg acattatcgc gagcccattt 3420
atacccatat aaatcagcat ccatgttgga atttaatcgc ggcctcgagc aagacgtttc 3480
ccgttgaata tggctcataa caccccttgt attactgttt atgtaagcag acagttttat 3540
tgttcatgat gatatatttt tatcttgtgc aatgtaacat cagagatttt gagacacaac 3600
gtggctttcc 3610
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 2
<211> LENGTH: 573
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Optimized sequence encoding HCV core antigen
<400> SEQUENCE: 2
atgagcacca accccaagcc ccagaggaag accaagagga acaccaacag gaggccccag 60
gatgtgaagt tccctggggg aggccagatt gtgggagggg tctacctgct gcccaggagg 120
ggccccaggc tgggggtgag ggctaccagg aagacctctg agaggtccca gcccaggggc 180
aggaggcagc ccatccccaa ggccaggagg cctgagggcc gctcctgggc ccagcctggc 240
tacccctggc ccctgtatgg caatgaaggc tttggctggg ctggctggct gctgtccccc 300
aggggctcca ggccctcctg gggccccaca gaccccagga ggaggtccag gaacctgggc 360
aaggtgattg acaccctgac ctgtggcttt gctgacctga tgggctacat ccccctggtg 420
ggggctcctg tgggaggggt ggctagggct ctggctcatg gggtgagggt gctggaggat 480
ggggtgaact atgctactgg caacctgcct ggctgctcct tctccatctt cctgctggcc 540
ctgctctcct gcctgacagt gcctgcttct gcc 573
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 3
<211> LENGTH: 191
<212> TYPE: PRT
<213> ORGANISM: Hepatitis C Virus
<400> SEQUENCE: 3
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60
Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Ser Trp Ala Gln Pro Gly
65 70 75 80
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Phe Gly Trp Ala Gly Trp
85 90 95
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Val
130 135 140
Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160
Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175
Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala
180 185 190
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 4
<211> LENGTH: 103
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 4
gaattcgctt ccaatgagaa catggagacc atgaaccagc cctaccacat ctgccgcggc 60
ttcacctgct tcaagaagta aacccgggaa ttctaaagtc gac 103
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 5
<211> LENGTH: 573
<212> TYPE: DNA
<213> ORGANISM: Hepatitis C Virus
<400> SEQUENCE: 5
atgagcacga atcctaaacc tcaaagaaaa accaaacgta acaccaaccg ccgcccacag 60
gacgtcaagt tcccgggcgg tggtcagatc gttggtggag tttacttgtt gccgcgcagg 120
ggccccaggt tgggtgtgcg cgcgactagg aagacttccg agcggtcgca acctcgtgga 180
aggcgacagc ctatccccaa ggctcgccgg cccgagggca ggtcctgggc tcagcccggg 240
tacccttggc ccctctatgg caatgagggc ttcgggtggg caggatggct cctgtccccc 300
cgcggctctc ggcctagttg gggccccact gacccccggc gtaggtcgcg caatttgggt 360
aaggtcatcg ataccctcac gtgcggcttc gccgacctca tggggtacat cccgctcgtc 420
ggcgcccccg tagggggcgt cgccagggcc ctggcgcatg gcgtcagggt tctggaggac 480
ggggtgaact atgcaacagg gaatttgccc ggttgctctt tctctatctt cctcctggct 540
ctgctgtcct gcctgaccgt cccagcttct gct 573
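
The extent of codon replacement described in Example 2B can be checked by comparing the optimized sequence (SEQ ID NO 2) with the wild-type sequence (SEQ ID NO 5) position by position. The following is a minimal sketch that uses only the first 60 nucleotides of each listing; the helper name count_changes is an assumption introduced for illustration, and the full 573-nucleotide sequences would be substituted to examine the whole gene.

```python
# First 60 nucleotides copied from the listings above (spaces removed).
OPT_60 = "".join(
    "atgagcacca accccaagcc ccagaggaag accaagagga acaccaacag gaggccccag".split()
)  # SEQ ID NO 2, optimized
WT_60 = "".join(
    "atgagcacga atcctaaacc tcaaagaaaa accaaacgta acaccaaccg ccgcccacag".split()
)  # SEQ ID NO 5, wild type


def count_changes(seq_a, seq_b):
    """Return (changed nucleotides, changed codons) for two in-frame sequences."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be equal length and a multiple of 3")
    nt_changed = sum(a != b for a, b in zip(seq_a, seq_b))
    codons_changed = sum(
        seq_a[i:i + 3] != seq_b[i:i + 3] for i in range(0, len(seq_a), 3)
    )
    return nt_changed, codons_changed


if __name__ == "__main__":
    nt, codons = count_changes(OPT_60, WT_60)
    total_codons = len(OPT_60) // 3
    print(f"nucleotides changed: {nt}/{len(OPT_60)} ({100 * nt / len(OPT_60):.1f}%)")
    print(f"codons changed: {codons}/{total_codons} ({100 * codons / total_codons:.1f}%)")
```

The percentages for this 60-nucleotide window need not match the whole-gene figures quoted in Example 2B, since codon changes are not evenly distributed along the sequence.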
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 6
<211> LENGTH: 582
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Optimized sequence encoding HCV E1 protein
<400> SEQUENCE: 6
atgtatgagg tgaggaatgt ctctggcgtc taccatgtga ccaatgactg ctccaactcc 60
tgcattgtct atgaggctgc tgacatgatc atgcacaccc ctggctgtgt gccatgtgtg 120
agggagggca actcctccag gtgctgggtg gccctgaccc ccaccctggc tgccaggaac 180
tcctccatcc ccaccaccac catcaggagg catgtggacc tgctggtggg cgctgctgcc 240
ctgtgctctg ccatgtatgt gggcgacctg tgtggctctg tcttcctggt gtcccagctg 300
ttcaccttct cccccaggag gtatgagact gtgcaggact gcaactgctc cctgtaccct 360
ggccatgtct ctggccacag gatggcctgg gacatgatga tgaactggtc ccccaccact 420
gccctggtgg tctcccagct gctgaggatc ccccaggctg tggtggacat ggtggtgggc 480
gcccactggg gcgtgctggc tggcctggcc tactactcca tggtgggcaa ctgggccaag 540
gtgctgattg tgatgctgct gtttgctggc gtggatggct aa 582
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 7
<211> LENGTH: 193
<212> TYPE: PRT
<213> ORGANISM: Hepatitis C Virus
<400> SEQUENCE: 7
Met Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
1 5 10 15
Cys Ser Asn Ser Cys Ile Val Tyr Glu Ala Ala Asp Met Ile Met His
20 25 30
Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ser Ser Arg Cys
35 40 45
Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser Ser Ile Pro
50 55 60
Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala
65 70 75 80
Leu Cys Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu
85 90 95
Val Ser Gln Leu Phe Thr Phe Ser Pro Arg Arg Tyr Glu Thr Val Gln
100 105 110
Asp Cys Asn Cys Ser Leu Tyr Pro Gly His Val Ser Gly His Arg Met
115 120 125
Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val
130 135 140
Ser Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Val Gly
145 150 155 160
Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly
165 170 175
Asn Trp Ala Lys Val Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp
180 185 190
Gly
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 8
<211> LENGTH: 1044
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Optimized sequence encoding HCV E2 protein
<400> SEQUENCE: 8
atgaccacct atgtctctgt gggccatgcc tcccagacca ccaggagggt ggcctccttc 60
ttctcccctg gctctgccca gaagatccag ctggtgaaca ccaatggctc ctggcacatc 120
aacaggactg ccctgaattg caacgagtcc atcaacactg gcttctttgc tgccctgttc 180
tatgtgaaga agttcaactc ctctggctgc tctgagagga tggcctcctg caggcccatt 240
gacaggtttg cccagggctg gggccccatc acccatgctg agtccaggtc ctctgaccag 300
aggccatact gctggcacta tgccccccag ccatgtggca ttgtgcctgc cctgcaggtc 360
tgtggccctg tctactgctt caccccatcc cctgtggtgg tgggcaccac tgacaggttt 420
ggcgtgccca cctacaactg gggcgacaat gagactgatg tgctgctgct gaacaacacc 480
aggccccccc agggcaactg gtttggctgc acctggatga actccactgg cttcaccaag 540
acctgtggcg gccccccatg caacattggc ggcgctggca acaacaccct gacctgcccc 600
actgactgct tcaggaagca tcctgaggcc acctacacca agtgtggctc tggcccatgg 660
ctgaccccca ggtgcatggt ggactaccca tacaggctgt ggcactaccc atgcaccttc 720
aacttcacca tcttcaagat caggatgtat gtgggcggcg tggagcacag gctgaatgct 780
gcctgcaact ggaccagggg cgagaggtgc aacattgagg acagggacag gtctgagctg 840
tcccccctgc tgctgtccac cactgagtgg cagatcctgc catgctcctt caccaccctg 900
cctgccctgt ccactggcct gatccatctg catcagaaca ttgtggatgt gcagtacctg 960
tacggcgtgg gctccgctgt ggtctccatt gtgatcaagt gggagtatgt gctgctgctg 1020
ttcctgctgc tggctgatgc ctaa 1044
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 9
<211> LENGTH: 347
<212> TYPE: PRT
<213> ORGANISM: Hepatitis C Virus
<400> SEQUENCE: 9
Met Thr Thr Tyr Val Ser Val Gly His Ala Ser Gln Thr Thr Arg Arg
1 5 10 15
Val Ala Ser Phe Phe Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val
20 25 30
Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn
35 40 45
Glu Ser Ile Asn Thr Gly Phe Phe Ala Ala Leu Phe Tyr Val Lys Lys
50 55 60
Phe Asn Ser Ser Gly Cys Ser Glu Arg Met Ala Ser Cys Arg Pro Ile
65 70 75 80
Asp Arg Phe Ala Gln Gly Trp Gly Pro Ile Thr His Ala Glu Ser Arg
85 90 95
Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Gln Pro Cys
100 105 110
Gly Ile Val Pro Ala Leu Gln Val Cys Gly Pro Val Tyr Cys Phe Thr
115 120 125
Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr
130 135 140
Tyr Asn Trp Gly Asp Asn Glu Thr Asp Val Leu Leu Leu Asn Asn Thr
145 150 155 160
Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr
165 170 175
Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala
180 185 190
Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro
195 200 205
Glu Ala Thr Tyr Thr Lys Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg
210 215 220
Cys Met Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Phe
225 230 235 240
Asn Phe Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu His
245 250 255
Arg Leu Asn Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asn Ile
260 265 270
Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr
275 280 285
Glu Trp Gln Ile Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser
290 295 300
Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu
305 310 315 320
Tyr Gly Val Gly Ser Ala Val Val Ser Ile Val Ile Lys Trp Glu Tyr
325 330 335
Val Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala
340 345
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 10
<211> LENGTH: 1620
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Optimized sequence encoding HCV E1 + E2 proteins
<400> SEQUENCE: 10
atgtatgagg tgaggaatgt ctctggcgtc taccatgtga ccaatgactg ctccaactcc 60
tgcattgtct atgaggctgc tgacatgatc atgcacaccc ctggctgtgt gccatgtgtg 120
agggagggca actcctccag gtgctgggtg gccctgaccc ccaccctggc tgccaggaac 180
tcctccatcc ccaccaccac catcaggagg catgtggacc tgctggtggg cgctgctgcc 240
ctgtgctctg ccatgtatgt gggcgacctg tgtggctctg tcttcctggt gtcccagctg 300
ttcaccttct cccccaggag gtatgagact gtgcaggact gcaactgctc cctgtaccct 360
ggccatgtct ctggccacag gatggcctgg gacatgatga tgaactggtc ccccaccact 420
gccctggtgg tctcccagct gctgaggatc ccccaggctg tggtggacat ggtggtgggc 480
gcccactggg gcgtgctggc tggcctggcc tactactcca tggtgggcaa ctgggccaag 540
gtgctgattg tgatgctgct gtttgctggc gtggatggca ccacctatgt ctctgtgggc 600
catgcctccc agaccaccag gagggtggcc tccttcttct cccctggctc tgcccagaag 660
atccagctgg tgaacaccaa tggctcctgg cacatcaaca ggactgccct gaattgcaac 720
gagtccatca acactggctt ctttgctgcc ctgttctatg tgaagaagtt caactcctct 780
ggctgctctg agaggatggc ctcctgcagg cccattgaca ggtttgccca gggctggggc 840
cccatcaccc atgctgagtc caggtcctct gaccagaggc catactgctg gcactatgcc 900
ccccagccat gtggcattgt gcctgccctg caggtctgtg gccctgtcta ctgcttcacc 960
ccatcccctg tggtggtggg caccactgac aggtttggcg tgcccaccta caactggggc 1020
gacaatgaga ctgatgtgct gctgctgaac aacaccaggc ccccccaggg caactggttt 1080
ggctgcacct ggatgaactc cactggcttc accaagacct gtggcggccc cccatgcaac 1140
attggcggcg ctggcaacaa caccctgacc tgccccactg actgcttcag gaagcatcct 1200
gaggccacct acaccaagtg tggctctggc ccatggctga cccccaggtg catggtggac 1260
tacccataca ggctgtggca ctacccatgc accttcaact tcaccatctt caagatcagg 1320
atgtatgtgg gcggcgtgga gcacaggctg aatgctgcct gcaactggac caggggcgag 1380
aggtgcaaca ttgaggacag ggacaggtct gagctgtccc ccctgctgct gtccaccact 1440
gagtggcaga tcctgccatg ctccttcacc accctgcctg ccctgtccac tggcctgatc 1500
catctgcatc agaacattgt ggatgtgcag tacctgtacg gcgtgggctc cgctgtggtc 1560
tccattgtga tcaagtggga gtatgtgctg ctgctgttcc tgctgctggc tgatgcctaa 1620
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 11
<211> LENGTH: 539
<212> TYPE: PRT
<213> ORGANISM: Hepatitis C Virus
<400> SEQUENCE: 11
Met Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
1 5 10 15
Cys Ser Asn Ser Cys Ile Val Tyr Glu Ala Ala Asp Met Ile Met His
20 25 30
Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ser Ser Arg Cys
35 40 45
Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser Ser Ile Pro
50 55 60
Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val Gly Ala Ala Ala
65 70 75 80
Leu Cys Ser Ala Met Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu
85 90 95
Val Ser Gln Leu Phe Thr Phe Ser Pro Arg Arg Tyr Glu Thr Val Gln
100 105 110
Asp Cys Asn Cys Ser Leu Tyr Pro Gly His Val Ser Gly His Arg Met
115 120 125
Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val
130 135 140
Ser Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met Val Val Gly
145 150 155 160
Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly
165 170 175
Asn Trp Ala Lys Val Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp
180 185 190
Gly Thr Thr Tyr Val Ser Val Gly His Ala Ser Gln Thr Thr Arg Arg
195 200 205
Val Ala Ser Phe Phe Ser Pro Gly Ser Ala Gln Lys Ile Gln Leu Val
210 215 220
Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn
225 230 235 240
Glu Ser Ile Asn Thr Gly Phe Phe Ala Ala Leu Phe Tyr Val Lys Lys
245 250 255
Phe Asn Ser Ser Gly Cys Ser Glu Arg Met Ala Ser Cys Arg Pro Ile
260 265 270
Asp Arg Phe Ala Gln Gly Trp Gly Pro Ile Thr His Ala Glu Ser Arg
275 280 285
Ser Ser Asp Gln Arg Pro Tyr Cys Trp His Tyr Ala Pro Gln Pro Cys
290 295 300
Gly Ile Val Pro Ala Leu Gln Val Cys Gly Pro Val Tyr Cys Phe Thr
305 310 315 320
Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr
325 330 335
Tyr Asn Trp Gly Asp Asn Glu Thr Asp Val Leu Leu Leu Asn Asn Thr
340 345 350
Arg Pro Pro Gln Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr
355 360 365
Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala
370 375 380
Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro
385 390 395 400
Glu Ala Thr Tyr Thr Lys Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg
405 410 415
Cys Met Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Phe
420 425 430
Asn Phe Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu His
435 440 445
Arg Leu Asn Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asn Ile
450 455 460
Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr
465 470 475 480
Glu Trp Gln Ile Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser
485 490 495
Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu
500 505 510
Tyr Gly Val Gly Ser Ala Val Val Ser Ile Val Ile Lys Trp Glu Tyr
515 520 525
Val Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala
530 535
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 12
<211> LENGTH: 1350
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Optimized sequence encoding HCV NS5a protein
<400> SEQUENCE: 12
atgtctggct cctggctgag ggatgtctgg gactggatct gcactgtgct gactgacttc 60
aagacctggc tgcattccaa gctgctgccc aggctgcctg gcgacccatt cttctcctgc 120
cagaggggct acaggggcgt ctggaggggc gatggcgtga tgcagaccac ctgcccatgt 180
ggcgcccaga tcactggcca tgtgaagaat ggctccatga ggattgtggg ccccaagacc 240
tgctccaaca cctggcatgg caccttcccc atcaatgcct acaccactgg cccatgcacc 300
ccatcccctg cccccaacta ctccagggcc ctgtggaggg tggctgctga ggagtatgtg 360
gaggtgacca gggtgggcga cttccactat gtgactggca tgaccactga caatgtgaag 420
tgcccatgcc aggtgcctgc ccctgagttc ttcactgagg tggatggcgt gaggctgcac 480
aggtatgccc ctgcctgcaa gcccctgctg agggatgagg tgaccttcca ggtgggcctg 540
aaccagttcc ctgtgggctc ccagctgcca tgtgagcctg agcctgatgt gactgtgctg 600
acctccatgc tgactgagcc atcccacatc actgctgaga ctgccaagag gaggctggcc 660
aggggctccc ctccatccct ggcctcctcc tctgcctccc agctgtctgc tccatccctg 720
aaggccacct gcaccaccag gcatgactcc cctgatgctg acctgattga ggccaacctg 780
ctgtggaggc aggagatggg cggcaacatc accagggtgg agtctgagaa caaggtggtg 840
atcctggact cctttgagcc cctgagggct gaggaggatg agagggaggt ctctgtggct 900
gctgagatcc tgaggaagtc caggaagttc ccccctgccc tgcccatctg ggcgaggcca 960
tcctacaacc cacccctgct ggagtcctgg aaggaccctg actatgtgcc ccctgtggtg 1020
catggctgcc ccctgccccc caccatggcc ccacccatcc ccccacccag gaggaagagg 1080
actgtggtgc tgactgagtc cactgtctcc tctgccctgg ctgagctggc caccaagacc 1140
ttcggctcct ctggctcctc tgctgtggac tctggcactg ccacggcccc ccctgaccag 1200
ccatctgatg atggcgacag gggctctgat gatgagtcct actcctccat gccccccctg 1260
gagggcgagc ctggcgaccc tgacctgtct gatggctcct ggtccactgt ctctgaggag 1320
gcctctgagg atgtggcctg ctgctcctaa 1350
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 13
<211> LENGTH: 449
<212> TYPE: PRT
<213> ORGANISM: Hepatitis C Virus
<400> SEQUENCE: 13
Met Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys Thr Val
1 5 10 15
Leu Thr Asp Phe Lys Thr Trp Leu His Ser Lys Leu Leu Pro Arg Leu
20 25 30
Pro Gly Asp Pro Phe Phe Ser Cys Gln Arg Gly Tyr Arg Gly Val Trp
35 40 45
Arg Gly Asp Gly Val Met Gln Thr Thr Cys Pro Cys Gly Ala Gln Ile
50 55 60
Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro Lys Thr
65 70 75 80
Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr
85 90 95
Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala Leu Trp
100 105 110
Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly Asp Phe
115 120 125
His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro Cys Gln
130 135 140
Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg Leu His
145 150 155 160
Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Asp Glu Val Thr Phe
165 170 175
Gln Val Gly Leu Asn Gln Phe Pro Val Gly Ser Gln Leu Pro Cys Glu
180 185 190
Pro Glu Pro Asp Val Thr Val Leu Thr Ser Met Leu Thr Glu Pro Ser
195 200 205
His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro
210 215 220
Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu
225 230 235 240
Lys Ala Thr Cys Thr Thr Arg His Asp Ser Pro Asp Ala Asp Leu Ile
245 250 255
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg
260 265 270
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Glu Pro Leu
275 280 285
Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Ala Ala Glu Ile Leu
290 295 300
Arg Lys Ser Arg Lys Phe Pro Pro Ala Leu Pro Ile Trp Ala Arg Pro
305 310 315 320
Ser Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp Tyr Val
325 330 335
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Thr Met Ala Pro Pro
340 345 350
Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu Ser Thr
355 360 365
Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly Ser Ser
370 375 380
Gly Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Pro Pro Asp Gln
385 390 395 400
Pro Ser Asp Asp Gly Asp Arg Gly Ser Asp Asp Glu Ser Tyr Ser Ser
405 410 415
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly
420 425 430
Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Ala Cys Cys
435 440 445
Ser
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 14
<211> LENGTH: 1773
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Optimized sequence encoding HCV NS5b protein
<400> SEQUENCE: 14
atgtcctaca cctggactgg cgccctgatc accccatgtg ctgctgagga gtccaagctg 60
cccatcaacc ccctgtccaa ctccctgctg aggcatcaca acatggtcta tgccaccacc 120
tccaggtctg ctggcctgag gcagaagaag gtgacctttg acaggctgca tgtgcctgat 180
gaccactaca gggatgtgct gaaggagatg aaggccaagg cctccactgt gaaggcgaag 240
ctgctgtctg tggaggaggc ctgcaagctg acccctcccc actctgccag gtccaagttt 300
ggctatggcg ccaaggatgt gaggaacctg tcctccaagg ctgtgaacca catccactct 360
gtctggaagg acctgctgga ggacactgag acccccattg acaccaccat catggccaag 420
aatgaggtct tctgtgtgca gcctgagaag ggcggcagga agcctgccag gctgattgtc 480
ttccctgagc tgggcgtgag ggtgtgtgag aagatggccc tgtatgatgt ggtctccacc 540
ctgccccagg ctgtgatggg ctcctcctat ggcttccagt actcccctgg ccagagggtg 600
gagttcctgg tgaatgcctg gaagtccaag aagaacccca tgggctttgc ctactgcacc 660
aggtgctttg actccactgt gactgagtct gacatcaggg tggaggagtc catctaccag 720
tgctgtgacc tggctcctga ggccaggcag gtgatcaggt ccctgactga gaggctgtac 780
attggcggcc ccctgaccaa ctccaagggc cagaactgtg gctacaggag gtgcagggcc 840
tctggcgtgc tgaccactaa ctgtggcaac accctgacct gctacctgaa ggcctctgct 900
gcttgcaggg ctgccaagct gcatgactgc accatgctgg tctgtggcga tgacctggtg 960
gtgatctgtg agtctgctgg cacccaggag gatgctgcct ccctgagggt cttcactgag 1020
gccatgacca ggtactctgc cccccctggc gaccctcccc agcctgagta tgacctggag 1080
ctgatcacct cctgctcctc caatgtctct gtggcccatg atgcctctgg caagagggtc 1140
tactacctga ccagggaccc caccaccccc ctggccaggg ctgcctggga gactgccagg 1200
cacacccctg tgaactcctg gctgggcaac atcatcatgt atgcccccac cctgtgggcc 1260
aggatgatcc tgatgaccca cttcttctcc atcctgctgg cccaggagca gctggagaag 1320
gccctgggct gccagattta tggcgccacc tacttcattg agcccctgga cctgccccag 1380
atcatccaga ggctgcatgg cctgtctgcc ttctccctgc actcctactc ccctggcgag 1440
atcaacaggg tggcctcctg cctgaggaag ctgggcgtgc cccccctgag ggtgtggagg 1500
cacagggcca ggtctgtgag ggccaagctg ctgtcccagg gcggcagggc tgccacctgt 1560
ggcaagtacc tgttcaactg ggctgtgagg accaagctga agctgacccc catccctgct 1620
gcctcccagc tggacctgtc tggctggttt gtggctggct actctggcgg cgacatctac 1680
cactccctgt ccagggccag gcccaggtgg ttcatgtggt gcctgctgct gctgtctgtg 1740
ggcgtgggca tctacctgct gcccaacagg tga 1773
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 15
<211> LENGTH: 590
<212> TYPE: PRT
<213> ORGANISM: Hepatitis C Virus
<400> SEQUENCE: 15
Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu
1 5 10 15
Glu Ser Lys Leu Pro Ile Asn Pro Leu Ser Asn Ser Leu Leu Arg His
20 25 30
His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly Leu Arg Gln
35 40 45
Lys Lys Val Thr Phe Asp Arg Leu His Val Pro Asp Asp His Tyr Arg
50 55 60
Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys
65 70 75 80
Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala
85 90 95
Arg Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser
100 105 110
Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu Leu Glu Asp
115 120 125
Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe
130 135 140
Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val
145 150 155 160
Phe Pro Glu Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp
165 170 175
Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser Ser Tyr Gly Phe
180 185 190
Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn Ala Trp Lys
195 200 205
Ser Lys Lys Asn Pro Met Gly Phe Ala Tyr Cys Thr Arg Cys Phe Asp
210 215 220
Ser Thr Val Thr Glu Ser Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln
225 230 235 240
Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Val Ile Arg Ser Leu Thr
245 250 255
Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn
260 265 270
Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Asn Cys
275 280 285
Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Cys Arg Ala
290 295 300
Ala Lys Leu His Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val
305 310 315 320
Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala Ser Leu Arg
325 330 335
Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro
340 345 350
Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn
355 360 365
Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr Tyr Leu Thr
370 375 380
Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg
385 390 395 400
His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Tyr Ala Pro
405 410 415
Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Ile Leu
420 425 430
Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Gly Cys Gln Ile Tyr Gly
435 440 445
Ala Thr Tyr Phe Ile Glu Pro Leu Asp Leu Pro Gln Ile Ile Gln Arg
450 455 460
Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu
465 470 475 480
Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val Pro Pro Leu
485 490 495
Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Lys Leu Leu Ser
500 505 510
Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe Asn Trp Ala
515 520 525
Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala Ser Gln Leu
530 535 540
Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly Asp Ile Tyr
545 550 555 560
His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Trp Cys Leu Leu
565 570 575
Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
580 585 590
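Note on the listing: each optimized DNA entry above is exactly one codon longer than three times the length of the protein entry that follows it (for example, SEQ ID NO 10 declares 1620 nucleotides and SEQ ID NO 11 declares 539 residues), which is what one would expect from an open reading frame that ends in a single stop codon. The Python sketch below is an illustrative check only and is not part of the patent. It assumes the standard genetic code, assumes that each DNA entry pairs with the protein entry immediately after it, and reproduces only the first and last rows of SEQ ID NO 10; the names `translate`, `first_row`, `last_row` and `declared` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; spaces are ignored, '*' marks a stop."""
    seq = dna.replace(" ", "").lower()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# First and last rows of SEQ ID NO 10, copied from the listing. Each full row
# holds 60 nucleotides (20 codons), so every row starts in frame.
first_row = "atgtatgagg tgaggaatgt ctctggcgtc taccatgtga ccaatgactg ctccaactcc"
last_row = "tccattgtga tcaagtggga gtatgtgctg ctgctgttcc tgctgctggc tgatgcctaa"

# Declared lengths: (DNA SEQ ID NO, nucleotides, protein SEQ ID NO, residues).
declared = [(10, 1620, 11, 539), (12, 1350, 13, 449), (14, 1773, 15, 590)]

for dna_id, nt, prt_id, aa in declared:
    # One codon per residue plus one stop codon.
    consistent = nt == 3 * (aa + 1)
    print(f"SEQ ID NO {dna_id} vs {prt_id}: {nt} nt, {aa} aa, consistent={consistent}")

print(translate(first_row))
print(translate(last_row))
```

The two translated rows can be compared by eye with the first 20 and last 19 residues of SEQ ID NO 11, written there in three-letter code.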
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 16
<211> LENGTH: 103
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 16
cttaagcgaa ggttactctt gtacctctgg tacttggtcg ggatggtgta gacggcgccg 60
aagtggacga agttcttcat ttgggccctt aagatttcag ctg 103
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 17
<211> LENGTH: 15
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 17
agatctacca tgagc 15
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 18
<211> LENGTH: 15
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 18
gccgaattcg cttcc 15
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 19
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 19
taaacccggg aattctaaag tcgac 25
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 20
<211> LENGTH: 12
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 20
atcaccatgg at 12
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 21
<211> LENGTH: 15
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 21
gagatcttca tgagc 15
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 22
<211> LENGTH: 15
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 22
agatccacca tgcag 15
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 23
<211> LENGTH: 18
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 23
ggtgcagatc tgatgagc 18
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 24
<211> LENGTH: 13
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 24
gcctaaagtc gac 13
<200> SEQUENCE CHARACTERISTICS:
<210> SEQ ID NO 25
<211> LENGTH: 4261
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Modified Vector Sequence
<400> SEQUENCE: 25
gatattggct attggccatt gcatacgttg tatccatatc ataatatgta catttatatt 60
ggctcatgtc caacattacc gccatgttga cattgattat tgactagtta ttaatagtaa 120
tcaattacgg ggtcattagt tcatagccca tatatggagt tccgcgttac ataacttacg 180
gtaaatggcc cgcctggctg accgcccaac gacccccgcc cattgacgtc aataatgacg 240
tatgttccca tagtaacgcc aatagggact ttccattgac gtcaatgggt ggagtattta 300
cggtaaactg cccacttggc agtacatcaa gtgtatcata tgccaagtac gccccctatt 360
gacgtcaatg acggtaaatg gcccgcctgg cattatgccc agtacatgac cttatgggac 420
tttcctactt ggcagtacat ctacgtatta gtcatcgcta ttaccatggt gatgcggttt 480
tggcagtaca tcaatgggcg tggatagcgg tttgactcac ggggatttcc aagtctccac 540
cccattgacg tcaatgggag tttgttttgg caccaaaatc aacgggactt tccaaaatgt 600
cgtaacaact ccgccccatt gacgcaaatg ggcggtaggc gtgtacggtg ggaggtctat 660
ataagcagag ctcgtttagt gaaccgtcag atcgcctgga gacgccatcc acgctgtttt 720
gacctccata gaagacaccg ggaccgatcc agcctccgcg gccgggaacg gtgcattgga 780
acgcggattc cccgtgccaa gagtgacgta agtaccgcct atagagtcta taggcccacc 840
cccttggctt cttatgcatg ctatactgtt tttggcttgg ggtctataca cccccgcttc 900
ctcatgttat aggtgatggt atagcttagc ctataggtgt gggttattga ccattattga 960
ccactcccct attggtgacg atactttcca ttactaatcc ataacatggc tctttgccac 1020
aactctcttt attggctata tgccaataca ctgtccttca gagactgaca cggactctgt 1080
atttttacag gatggggtct catttattat ttacaaattc acatatacaa caccaccgtc 1140
cccagtgccc gcagttttta ttaaacataa cgtgggatct ccacgcgaat ctcgggtacg 1200
tgttccggac atgggctctt ctccggtagc ggcggagctt ctacatccga gccctgctcc 1260
catgcctcca gcgactcatg gtcgctcggc agctccttgc tcctaacagt ggaggccaga 1320
cttaggcaca gcacgatgcc caccaccacc agtgtgccgc acaaggccgt ggcggtaggg 1380
tatgtgtctg aaaatgagct cggggagcgg gcttgcaccg ctgacgcatt tggaagactt 1440
aaggcagcgg cagaagaaga tgcaggcagc tgagttgttg tgttctgata agagtcagag 1500
gtaactcccg ttgcggtgct gttaacggtg gagggcagtg tagtctgagc agtactcgtt 1560
gctgccgcgc gcgccaccag acataatagc tgacagacta acagactgtt cctttccatg 1620
ggtcttttct gcagtcaccg tccttagatc taccatgagc accaacccca agccccagag 1680
gaagaccaag aggaacacca acaggaggcc ccaggatgtg aagttccctg ggggaggcca 1740
gattgtggga ggggtctacc tgctgcccag gaggggcccc aggctggggg tgagggctac 1800
caggaagacc tctgagaggt cccagcccag gggcaggagg cagcccatcc ccaaggccag 1860
gaggcctgag ggccgctcct gggcccagcc tggctacccc tggcccctgt atggcaatga 1920
aggctttggc tgggctggct ggctgctgtc ccccaggggc tccaggccct cctggggccc 1980
cacagacccc aggaggaggt ccaggaacct gggcaaggtg attgacaccc tgacctgtgg 2040
ctttgctgac ctgatgggct acatccccct ggtgggggct cctgtgggag gggtggctag 2100
ggctctggct catggggtga gggtgctgga ggatggggtg aactatgcta ctggcaacct 2160
gcctggctgc tccttctcca tcttcctgct ggccctgctc tcctgcctga cagtgcctgc 2220
ttctgccgaa ttcgcttcca atgagaacat ggagaccatg aaccagccct accacatctg 2280
ccgcggcttc acctgcttca agaagtaaac ccgggaattc taaagtcgac agcggccgcg 2340
atctgctgtg ccttctagtt gccagccatc tgttgtttgc ccctcccccg tgccttcctt 2400
gaccctggaa ggtgccactc ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca 2460
ttgtctgagt aggtgtcatt ctattctggg gggtggggtg gggcagcaca gcaaggggga 2520
ggattgggaa gacaatagca ggcatgctgg ggatgcggtg ggctctatgg gtacggccgc 2580
agcggcctta attaaggccg cagcggccgt acccaggtgc tgaagaattg acccggttcc 2640
tcgacccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 2700
catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 2760
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 2820
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcaatg ctcacgctgt 2880
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 2940
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 3000
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 3060
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 3120
tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 3180
tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 3240
cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacgtgatc ccgtaatgct 3300
ctgccagtgt tacaaccaat taaccaattc tgattagaaa aactcatcga gcatcaaatg 3360
aaactgcaat ttattcatat caggattatc aataccatat ttttgaaaaa gccgtttctg 3420
taatgaagga gaaaactcac cgaggcagtt ccataggatg gcaagatcct ggtatcggtc 3480
tgcgattccg actcgtccaa catcaataca acctattaat ttcccctcgt caaaaataag 3540
gttatcaagt gagaaatcac catgagtgac gactgaatcc ggtgagaatg gcaaaagctt 3600
atgcatttct ttccagactt gttcaacagg ccagccatta cgctcgtcat caaaatcact 3660
cgcatcaacc aaaccgttat tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc 3720
gctgttaaaa ggacaattac aaacaggaat cgaatgcaac cggcgcagga acactgccag 3780
cgcatcaaca atattttcac ctgaatcagg atattcttct aatacctgga atgctgtttt 3840
cccggggatc gcagtggtga gtaaccatgc atcatcagga gtacggataa aatgcttgat 3900
ggtcggaaga ggcataaatt ccgtcagcca gtttagtctg accatctcat ctgtaacatc 3960
attggcaacg ctacctttgc catgtttcag aaacaactct ggcgcatcgg gcttcccata 4020
caatcgatag attgtcgcac ctgattgccc gacattatcg cgagcccatt tatacccata 4080
taaatcagca tccatgttgg aatttaatcg cggcctcgag caagacgttt cccgttgaat 4140
atggctcata acaccccttg tattactgtt tatgtaagca gacagtttta ttgttcatga 4200
tgatatattt ttatcttgtg caatgtaaca tcagagattt tgagacacaa cgtggctttc 4260
c 4261
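Note on the listing: several of the short "Modified Vector Sequence" entries can be located inside the longer vector of SEQ ID NO 25. As an illustrative check that is not part of the patent, the Python sketch below searches three rows copied from SEQ ID NO 25 for SEQ ID NOS 17, 18 and 19. The helper name `find_in_rows` and the choice of rows are assumptions made for this example, and rows are searched one at a time, so a match that spans two rows would be missed.

```python
# Rows copied from SEQ ID NO 25. Each key is the running count printed at the
# end of the row, i.e. the position of the last base in that row.
rows = {
    1680: "ggtcttttct gcagtcaccg tccttagatc taccatgagc accaacccca agccccagag",
    2280: "ttctgccgaa ttcgcttcca atgagaacat ggagaccatg aaccagccct accacatctg",
    2340: "ccgcggcttc acctgcttca agaagtaaac ccgggaattc taaagtcgac agcggccgcg",
}

# Short entries copied from the listing, keyed by SEQ ID NO.
short = {
    17: "agatctacca tgagc",
    18: "gccgaattcg cttcc",
    19: "taaacccggg aattctaaag tcgac",
}


def find_in_rows(query: str):
    """Return the 1-based start position of query within the vector, or None."""
    q = query.replace(" ", "")
    for end, row in rows.items():
        bases = row.replace(" ", "")
        idx = bases.find(q)
        if idx != -1:
            # Convert the offset inside the row to a position in the full vector.
            return end - len(bases) + idx + 1
    return None


for seq_id, seq in short.items():
    print(f"SEQ ID NO {seq_id} starts at position {find_in_rows(seq)}")
```

Searching row by row keeps the example short; a fuller check would concatenate all rows of SEQ ID NO 25 before searching.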
Claim 1 of 3 Claims
What is claimed is:
1. A polynucleotide comprising the nucleotide sequence of SEQ ID NO: 2.