Nucleotide Sequence of the Rubella Virus Capsid Protein Gene Reveals an Unusually High G/C Content

Abstract

Summary

The nucleotide sequence of the rubella virus capsid protein (C) gene has been determined from a cDNA clone derived from the 40S genomic RNA. The sequence covers the coding region of the C protein (831 nucleotides), 70 nucleotides of the 5′ untranslated region, and the 5′ end of the downstream E2 membrane protein gene. The capsid gene is unusually rich in C (41.6%) and G (31.2%) residues (G + C 72.8%), and poor in A (15.4%) and U (11.8%) residues. There are regions with long runs of up to 45% C or 35% G residues. The codon usage is non-random, with a strong preference for C and G residues in the third position. Starting from two in-frame AUG codons (seven amino acid residues apart), an open reading frame (ORF) was identified that extended in frame into the ORF coding for the downstream E2 membrane protein. Since the amino terminus of the capsid protein is blocked, we could not determine which of the AUGs serves as the initiating codon. To verify that the deduced ORF was correct, we determined the amino acid sequence of 13 tryptic peptides corresponding to one-third of the C protein. Our data show that the C protein is about 277 residues in length (Mr about 30750). It is very hydrophilic and rich in prolines (14.1%) and arginines (14.4%). Clusters of these amino acids are concentrated in the amino-terminal third of the C protein. No sequence homology to the capsid proteins of several alphaviruses was observed. Together with our previous sequence data, we have now completed the sequence of the genes coding for the structural proteins C, E2 and E1 of rubella virus.
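The composition figures quoted in the summary (percentage of each base, and the G/C preference at the third codon position) are simple counts over the sequence. The following is a minimal sketch of how such figures could be computed; the function names and the short `toy` sequence are hypothetical illustrations, not the rubella capsid sequence and not the authors' software. The last lines only re-check arithmetic that the summary itself states (831 nucleotides give 277 codons; 41.6% C plus 31.2% G gives 72.8% G + C).

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and U in a nucleotide sequence."""
    seq = seq.upper().replace("T", "U")  # accept cDNA or RNA notation
    counts = Counter(seq)
    total = len(seq)
    return {base: 100.0 * counts.get(base, 0) / total for base in "ACGU"}


def gc3_percent(cds):
    """Return the percentage of codons with G or C in the third position."""
    cds = cds.upper().replace("T", "U")
    usable = len(cds) - len(cds) % 3      # ignore a trailing partial codon
    third_bases = cds[2:usable:3]         # every third base, starting at index 2
    return 100.0 * sum(b in "GC" for b in third_bases) / len(third_bases)


# Hypothetical 18-nucleotide toy sequence (six codons), for illustration only.
toy = "AUGGCCUCCACCACGCCC"
comp = base_composition(toy)
gc_total = comp["G"] + comp["C"]
gc3 = gc3_percent(toy)

# Consistency checks on the figures reported in the summary.
codons_in_coding_region = 831 // 3        # 277, matching "about 277 residues"
reported_gc = round(41.6 + 31.2, 1)       # 72.8, matching "G + C 72.8%"
```

Applied to a real coding sequence, `gc3_percent` well above `gc_total` would indicate the kind of third-position bias the summary describes; the toy sequence here was chosen only to keep the example short.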

/content/journal/jgv/10.1099/0022-1317-69-3-603
1988-03-01
2024-03-29