
Abstract

Identification of short genes that encode peptides of fewer than 60 aa is challenging, both experimentally and in silico. As a consequence, the universe of these short coding sequences (CDSs) remains largely unknown, although some are acknowledged to play important roles in cell–cell communication, particularly in Gram-positive bacteria. This paper reports a thorough search for short CDSs across streptococcal genomes. Our bioinformatic approach relied on a combination of advanced intrinsic and extrinsic methods. In the first step, intrinsic sequence information (nucleotide composition and presence of RBSs) served to identify new short putative CDSs (spCDSs) and to eliminate the differences between annotation policies. In the second step, pseudogene fragments and false predictions were filtered out. The last step consisted of screening the remaining spCDSs for lines of extrinsic evidence involving sequence and gene-context comparisons. A total of 789 spCDSs across 20 complete genomes (19 Streptococcus and one Enterococcus) received the support of at least one line of extrinsic evidence, which corresponds to an average of 20 short CDSs per million base pairs. Most of these have no known function, and a significant fraction (31 %) are not even annotated as hypothetical genes in GenBank records. As an illustration of the value of this list, we describe a new family of CDSs, encoding very short hydrophobic peptides (20–23 aa) situated just upstream of some of the positive transcriptional regulators of the Rgg family. The expression of seven other short CDSs from Streptococcus thermophilus CNRZ1066 that encode peptides ranging in length from 41 to 56 aa was confirmed by real-time quantitative RT-PCR and revealed a variety of expression patterns. Finally, one peptide from this list, encoded by a gene that is not annotated in GenBank, was identified in a cell-envelope-enriched fraction of S. thermophilus CNRZ1066.
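
As a rough illustration of what the intrinsic step works with, the Python sketch below scans one DNA strand for open reading frames that would encode peptides of fewer than 60 aa and flags those preceded by a Shine-Dalgarno-like motif. It is not the method used in the paper: it ignores nucleotide composition entirely, and the motif pattern, the start codon set, the length bounds, the upstream window and all function names are assumptions chosen for the example.

```python
import re

# Assumed start/stop codons (bacterial code); hypothetical defaults for illustration.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumed Shine-Dalgarno-like cores; real RBS models are usually probabilistic.
RBS_PATTERN = re.compile(r"AGGAG|GGAGG|AAGGA")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement, so the other strand can be scanned too."""
    return seq.translate(_COMPLEMENT)[::-1]


def short_orfs_one_strand(seq, min_aa=10, max_aa=59, rbs_window=20):
    """Return (start, end, has_rbs) tuples for short ORFs on one strand.

    start/end are 0-based, end-exclusive and include the stop codon.
    Every in-frame start codon is reported, so nested candidates that
    share a stop codon appear as separate entries.
    """
    hits = []
    for start in range(len(seq) - 5):
        if seq[start:start + 3] not in START_CODONS:
            continue
        # Walk codon by codon until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_aa = (pos - start) // 3  # codons before the stop
                if min_aa <= n_aa <= max_aa:
                    upstream = seq[max(0, start - rbs_window):start]
                    has_rbs = bool(RBS_PATTERN.search(upstream))
                    hits.append((start, pos + 3, has_rbs))
                break
    return hits
```

Calling `short_orfs_one_strand(seq)` and then `short_orfs_one_strand(reverse_complement(seq))` covers both strands; coordinates from the second call refer to the reverse-complemented sequence. A naive scan like this over-predicts heavily, which is why candidates need the kind of filtering and extrinsic evidence the abstract describes.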

/content/journal/micro/10.1099/mic.0.2007/006205-0
2007-11-01
2019-10-13

vol. 153, part 11, pp. 3631–3644

Supplementary data [Excel file](746 KB)


