Abstract

Determination of the DNA G+C content of prokaryotic genomes using traditional methods is time-consuming, and results may vary from laboratory to laboratory depending on the technique used. We explored the possibility of extrapolating the genomic DNA G+C content of prokaryotes from gene sequences. For this, 127 universally conserved genes were studied from 50 prokaryotic genomes in the Clusters of Orthologous Groups (COG) database. Of these, 57 genes were present as a single copy in the genomes of 157 different prokaryote species available in GenBank. There was a strong correlation [coefficient of determination (r²) >95 %] between the DNA G+C contents of 20 genes and their corresponding genomes. For each of the 157 prokaryotic genomes studied, the DNA G+C content of the 20 genes was used to determine a ‘calculated’ genome DNA G+C content (CGC), and this value was compared with the ‘real’ genome DNA G+C content (RGC). In order to select the most suitable gene for the determination of CGC values, we compared the r² and the median mol% difference between CGC and RGC, as well as the sensitivity of each gene, defined here as its ability to provide CGC values that differ by less than 5 mol% from the RGC of the corresponding prokaryotic genomes. The highly conserved ftsY gene (median size 1144 nucleotides), a vertically inherited member of the GTPase superfamily, showed the highest r² value of 0.98, the smallest median mol% difference between CGC and RGC of 1.06 and a sensitivity of 100 %. Using ftsY DNA G+C content values, the CGC values of 100 genomes not included in the calculation of r² differed by less than 5 mol% from their RGC values. These data suggest that the genomic DNA G+C content of prokaryotes may be estimated easily and reliably from the ftsY gene sequence.
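
The evaluation described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the abstract does not give the form of the regression, so a simple least-squares line relating gene G+C to genome G+C is assumed here, and the function names (`gc_content`, `fit_line`, `evaluate`) are hypothetical rather than taken from the study. No data from the paper are reproduced.

```python
from statistics import median


def gc_content(seq: str) -> float:
    """Return the G+C content in mol%, counting only A, C, G and T bases."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    total = gc + s.count("A") + s.count("T")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total


def fit_line(gene_gc, genome_gc):
    """Least-squares fit genome_gc = slope * gene_gc + intercept.

    Assumption: a simple linear model; the abstract does not state the
    model actually used. Returns (slope, intercept, r_squared).
    """
    n = len(gene_gc)
    mx = sum(gene_gc) / n
    my = sum(genome_gc) / n
    sxx = sum((x - mx) ** 2 for x in gene_gc)
    syy = sum((y - my) ** 2 for y in genome_gc)
    sxy = sum((x - mx) * (y - my) for x, y in zip(gene_gc, genome_gc))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def evaluate(gene_gc, genome_gc, slope, intercept, tolerance=5.0):
    """Compute CGC values, the median |CGC - RGC| and the sensitivity.

    Sensitivity is the percentage of genomes whose calculated G+C content
    (CGC) differs from the real one (RGC) by less than `tolerance` mol%.
    """
    cgc = [slope * g + intercept for g in gene_gc]
    diffs = [abs(c - r) for c, r in zip(cgc, genome_gc)]
    sensitivity = 100.0 * sum(d < tolerance for d in diffs) / len(diffs)
    return cgc, median(diffs), sensitivity
```

In use, one would fit the line on a training set of genomes and then call `evaluate` on genomes that were not part of the fit, mirroring the 157-genome and 100-genome split described above.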

/content/journal/ijsem/10.1099/ijs.0.63903-0
2006-05-01
2020-01-19

Supplements

A non-redundant list of 157 prokaryotic species for which a complete genome sequence was available in GenBank. (PDF, 31 KB)

Differences between RGC and CGC values obtained for 100 prokaryotic genomes. (PDF, 25 KB)

A list of 127 genes conserved in 50 prokaryotic genomes in the COG database. (PDF, 99 KB)
