Abstract

The use of k-mers to capture genetic variation in bacterial genome-wide association studies (bGWAS) has proved effective in overcoming the plasticity of bacterial genomes, because k-mers provide a comprehensive array of genetic variants across a genome set and are not confined to a single reference genome. However, little attempt has been made to interpret k-mers in the context of genome rearrangements, partly because genome structure and individual rearrangement events are difficult to identify exhaustively and at high throughput. Here, we present GWarrange, a pre- and post-bGWAS processing methodology that leverages the unique properties of k-mers to facilitate bGWAS for genome rearrangements. Repeat sequences commonly instigate genome rearrangements through intragenomic homologous recombination, and they are frequently found at rearrangement boundaries. In whole-genome sequences, repeat sequences are replaced by short placeholder sequences, allowing the regions flanking repeats to be incorporated into relatively short k-mers. The locations of flanking regions in significant k-mers are then mapped back to complete genome sequences to visualise genome rearrangements. Four case studies, based on two bacterial species and a simulated genome set, are presented to demonstrate the ability to identify phenotype-associated rearrangements. GWarrange is available at https://github.com/DorothyTamYiLing/GWarrange.
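
The placeholder idea described in the abstract can be illustrated with a minimal Python sketch. This is not the GWarrange implementation: the function names, the `NNNNN` placeholder, the toy sequences and the forward-strand-only lookup are all assumptions made for illustration. The sketch replaces known repeat intervals with a short placeholder, collects the k-mers that span the placeholder with flanking sequence on both sides, and looks up where each flank sits in a complete genome.

```python
from typing import List, Tuple

# Hypothetical placeholder; the actual pipeline may use a different sequence.
PLACEHOLDER = "N" * 5


def mask_repeats(genome: str, repeats: List[Tuple[int, int]],
                 placeholder: str = PLACEHOLDER) -> str:
    """Replace each repeat interval [start, end) with a short placeholder."""
    pieces = []
    cursor = 0
    for start, end in sorted(repeats):
        if start < cursor:
            raise ValueError("repeat intervals must not overlap")
        pieces.append(genome[cursor:start])  # sequence before the repeat
        pieces.append(placeholder)           # stands in for the repeat
        cursor = end
    pieces.append(genome[cursor:])           # sequence after the last repeat
    return "".join(pieces)


def spanning_kmers(masked: str, k: int,
                   placeholder: str = PLACEHOLDER) -> List[str]:
    """Return k-mers that hold the placeholder with sequence on both sides."""
    kmers = []
    for i in range(len(masked) - k + 1):
        kmer = masked[i:i + k]
        pos = kmer.find(placeholder)
        # Require at least one flanking base on each side of the placeholder.
        if pos > 0 and pos + len(placeholder) < k:
            kmers.append(kmer)
    return kmers


def flank_positions(kmer: str, genome: str,
                    placeholder: str = PLACEHOLDER) -> Tuple[int, int]:
    """Locate the left and right flanks of a k-mer in a complete genome.

    Forward strand only; returns -1 for a flank that is not found.
    """
    left, _, right = kmer.partition(placeholder)
    return genome.find(left), genome.find(right)


# Toy genome: a 10 bp repeat between two unique flanks.
genome = "ACGTAC" + "T" * 10 + "GGATCC"
masked = mask_repeats(genome, [(6, 16)])
print(masked)
print(spanning_kmers(masked, 9))
print(flank_positions("GTACNNNNNGGAT", genome))
```

In this toy case a 9-mer can cover both flanks only because the 10 bp repeat has been collapsed to 5 bp. Comparing the flank positions of a significant k-mer across genomes is what would reveal whether the two flanks remain adjacent to the same repeat copy or have been separated by a rearrangement.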

Funding
This study was supported by the:
  • Leverhulme Trust (Award RPG-2019-373)
    • Principal Award Recipient: Andrew Preston

This is an open-access article distributed under the terms of the Creative Commons Attribution License. This article was made open access via a Publish and Read agreement between the Microbiology Society and the corresponding author’s institution.
/content/journal/mgen/10.1099/mgen.0.001268
2024-07-09
2025-05-19
Supplements

  • Supplementary material 1 (PDF)
  • Supplementary material 2 (Excel)