
Abstract

A large part of our current understanding of gene regulation in Gram-positive bacteria is based on Bacillus subtilis, as it is one of the most well-studied bacterial model systems. The rapidly growing body of data on its molecular and genomic biology is distributed across multiple annotation resources. Consequently, interpreting data from further experiments becomes increasingly challenging in both small- and large-scale analyses. Additionally, the annotation of structured RNA and non-coding RNA (ncRNA), as well as of the operon structure, still lags behind the annotation of the coding sequences. To address these challenges, we created the B. subtilis genome atlas, BSGatlas, which integrates and unifies multiple existing annotation resources. Compared to any of the individual resources, the BSGatlas contains twice as many ncRNAs, while improving the positional annotation for 70 % of the ncRNAs. Furthermore, we combined known transcription start and termination sites with lists of known co-transcribed gene sets to create a comprehensive transcript map. This combination resulted in 717 new sets of co-transcribed genes and 5335 untranslated regions (UTRs). In comparison to existing resources, the number of 5′ and 3′ UTRs increased nearly fivefold, and the number of internal UTRs doubled. The transcript map is organized into 2266 operons, which provide transcriptional annotation for 92 % of all genes in the genome, compared to at most 82 % in previous resources. We predicted an off-target-aware genome-wide library of CRISPR–Cas9 guide RNAs, which we also linked to polycistronic operons. We provide the BSGatlas in multiple forms: as a website (https://rth.dk/resources/bsgatlas/), as an annotation hub for display in the UCSC genome browser, as supplementary tables, and in standardized GFF3 format, which can be used in large-scale -omics studies.
By complementing existing resources, the BSGatlas supports analyses of the B. subtilis genome and its molecular biology with respect to not only non-coding genes but also the genome-wide transcriptional relationships of all genes.
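
As a rough illustration of how a GFF3 annotation file might be consumed in a downstream analysis, the following minimal Python sketch parses the nine tab-separated GFF3 columns and tallies features by type. The example lines, feature identifiers, and function names are hypothetical and are not taken from the BSGatlas files; only the general GFF3 layout is assumed.

```python
from collections import Counter
from urllib.parse import unquote


def parse_gff3_line(line):
    """Parse one GFF3 line into a dict; return None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    # Column 9 holds semicolon-separated key=value pairs (URL-escaped).
    attributes = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[unquote(key)] = unquote(value)
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),  # GFF3 coordinates are 1-based and inclusive
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


def count_feature_types(lines):
    """Count how many features of each type occur in an iterable of GFF3 lines."""
    records = (parse_gff3_line(line) for line in lines)
    return Counter(rec["type"] for rec in records if rec is not None)


# Hypothetical example lines, invented for illustration only.
EXAMPLE = [
    "##gff-version 3",
    "chr\texample\tgene\t100\t400\t.\t+\t.\tID=gene1;Name=geneA",
    "chr\texample\tfive_prime_UTR\t60\t99\t.\t+\t.\tID=utr1;Parent=gene1",
    "chr\texample\tncRNA\t900\t1050\t.\t-\t.\tID=rna1",
]

if __name__ == "__main__":
    for feature_type, n in sorted(count_feature_types(EXAMPLE).items()):
        print(feature_type, n)
```

To apply the same tally to a downloaded annotation, the lines of an opened file can be passed to `count_feature_types` in place of `EXAMPLE`.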

Funding
This study was supported by:
  • Jan Gorodkin, Innovationsfonden (Award 5163-00010B)
/content/journal/mgen/10.1099/mgen.0.000524
2021-02-04
2021-02-26

Supplements

Supplementary material 1

PDF

Supplementary material 2

EXCEL