
Abstract

A large part of our current understanding of gene regulation in Gram-positive bacteria is based on Bacillus subtilis, as it is one of the best-studied bacterial model systems. The rapidly growing body of data on its molecular and genomic biology is distributed across multiple annotation resources. Consequently, interpreting data from further experiments becomes increasingly challenging in both small- and large-scale analyses. Additionally, annotation of structured RNA and non-coding RNA (ncRNA), as well as of the operon structure, still lags behind the annotation of the coding sequences. To address these challenges, we created the B. subtilis genome atlas, BSGatlas, which integrates and unifies multiple existing annotation resources. Compared to any of the individual resources, the BSGatlas contains twice as many ncRNAs, while improving the positional annotation for 70 % of the ncRNAs. Furthermore, we combined known transcription start and termination sites with lists of known co-transcribed gene sets to create a comprehensive transcript map. This combination resulted in 717 new sets of co-transcribed genes and 5335 untranslated regions (UTRs). In comparison to existing resources, the number of 5′ and 3′ UTRs increased nearly fivefold, and the number of internal UTRs doubled. The transcript map is organized into 2266 operons, which provide transcriptional annotation for 92 % of all genes in the genome, compared to at most 82 % in previous resources. We also predicted an off-target-aware genome-wide library of CRISPR–Cas9 guide RNAs, which we linked to polycistronic operons. We provide the BSGatlas in multiple forms: as a website (https://rth.dk/resources/bsgatlas/), as an annotation hub for display in the UCSC genome browser, as supplementary tables and in standardized GFF3 format, which can be used in large-scale -omics studies.
By complementing existing resources, the BSGatlas supports analyses of the B. subtilis genome and its molecular biology with respect to not only non-coding genes but also the genome-wide transcriptional relationships of all genes.
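Since the atlas is distributed in GFF3 format, a downstream analysis might start by tallying the annotated features per type. The following is a minimal sketch that relies only on the generic nine-column GFF3 layout. The sample records, the feature type names and the identifiers are made up for illustration and are not taken from the BSGatlas files; the actual file may use different type names and attributes.

```python
import io
from collections import Counter

# The nine tab-separated columns defined by the GFF3 format.
GFF3_COLUMNS = ["seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes"]


def parse_attributes(field):
    """Split a GFF3 attribute column such as 'ID=x;Parent=y' into a dict."""
    pairs = (item.split("=", 1) for item in field.strip().split(";") if "=" in item)
    return {key: value for key, value in pairs}


def read_gff3(handle):
    """Yield one dict per feature line, skipping comments and blank lines."""
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue
        values = line.rstrip("\n").split("\t")
        if len(values) != len(GFF3_COLUMNS):
            continue  # ignore malformed lines in this sketch
        record = dict(zip(GFF3_COLUMNS, values))
        record["start"] = int(record["start"])
        record["end"] = int(record["end"])
        record["attributes"] = parse_attributes(record["attributes"])
        yield record


def count_feature_types(handle):
    """Count how many features of each type the file contains."""
    return Counter(record["type"] for record in read_gff3(handle))


# Hypothetical sample records (NOT real BSGatlas content).
SAMPLE = (
    "##gff-version 3\n"
    "chr\texample\toperon\t100\t900\t.\t+\t.\tID=operon-1\n"
    "chr\texample\tgene\t150\t600\t.\t+\t.\tID=gene-1;Parent=operon-1\n"
    "chr\texample\tgene\t650\t850\t.\t+\t.\tID=gene-2;Parent=operon-1\n"
    "chr\texample\tncRNA\t1200\t1300\t.\t-\t.\tID=rna-1\n"
)

counts = count_feature_types(io.StringIO(SAMPLE))
for feature_type, n in sorted(counts.items()):
    print(feature_type, n)
```

To apply the same functions to a downloaded annotation file, the in-memory sample can be replaced by an opened file handle, for example `open("annotation.gff3")`, where the file name is a placeholder.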

Funding
This study was supported by:
  • Innovationsfonden (Award 5163-00010B)
    • Principal Award Recipient: Jan Gorodkin

This is an open-access article distributed under the terms of the Creative Commons Attribution License.
/content/journal/mgen/10.1099/mgen.0.000524
2021-02-04
2021-10-17

Supplements

Supplementary material 1

PDF

Supplementary material 2

EXCEL
