Abstract

Plasmids are a key vector of antibiotic resistance, but the current bioinformatics toolkit is not well suited to tracking them. The rapid structural changes seen in plasmid genomes present considerable challenges to evolutionary and epidemiological analysis. Typical approaches are either low resolution (replicon typing) or use shared k-mer content to define a genetic distance. However, this distance can both overestimate plasmid relatedness by ignoring rearrangements and underestimate it by over-penalizing gene gain/loss. A model is therefore needed that captures the key components of how plasmid genomes evolve structurally – through gene/block gain or loss, and rearrangement. A secondary requirement is to prevent promiscuous transposable elements (TEs) from causing over-clustering of unrelated plasmids. We choose the ‘Double Cut and Join Indel’ (DCJ-Indel) model, in which plasmids are studied at a coarse level, as a sequence of signed integers (representing genes or aligned blocks), and the distance between two plasmids is the minimum number of rearrangement events or indels needed to transform one into the other. We show how this gives much more meaningful distances between plasmids. We introduce a software workflow, pling (https://github.com/iqbal-lab-org/pling), which uses the DCJ-Indel model to calculate distances between plasmids and then cluster them. In our approach, we combine containment distances and DCJ-Indel distances to build a TE-aware plasmid network. We demonstrate superior performance and interpretability compared with other plasmid clustering tools on the ‘Russian Doll’ dataset and a hospital transmission dataset.
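
To make the signed-integer representation concrete, the following is a minimal sketch and not pling's implementation. It computes the plain DCJ distance (rearrangements only, no indels) between two circular plasmids that carry exactly the same blocks, each once, using the standard adjacency-graph result that the distance equals the number of blocks minus the number of cycles. The function names `adjacencies`, `dcj_distance` and `block_jaccard_distance` are hypothetical and chosen for illustration; handling gene gain/loss or duplicated blocks, as the full DCJ-Indel model does, is outside the scope of this sketch.

```python
from typing import Dict, List, Tuple

# An extremity is one end of a block: (block id, "h" for head or "t" for tail).
Extremity = Tuple[int, str]


def adjacencies(genome: List[int]) -> List[Tuple[Extremity, Extremity]]:
    """Adjacencies of a circular genome given as signed, nonzero block ids."""
    def right_end(g: int) -> Extremity:
        return (abs(g), "h" if g > 0 else "t")

    def left_end(g: int) -> Extremity:
        return (abs(g), "t" if g > 0 else "h")

    n = len(genome)
    return [(right_end(genome[i]), left_end(genome[(i + 1) % n])) for i in range(n)]


def dcj_distance(a: List[int], b: List[int]) -> int:
    """DCJ distance (no indels) between two circular genomes.

    Assumes both genomes contain the same blocks, each exactly once.
    Distance = number of blocks - number of cycles in the adjacency graph.
    """
    blocks_a, blocks_b = sorted(map(abs, a)), sorted(map(abs, b))
    if blocks_a != blocks_b or len(set(blocks_a)) != len(blocks_a) or 0 in blocks_a:
        raise ValueError("sketch requires identical, duplicate-free, nonzero blocks")

    # Union-find over extremities: joining the two ends of every adjacency
    # from both genomes leaves one connected component per cycle.
    parent: Dict[Extremity, Extremity] = {}
    for g in a:
        for end in ("h", "t"):
            parent[(abs(g), end)] = (abs(g), end)

    def find(x: Extremity) -> Extremity:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in adjacencies(a) + adjacencies(b):
        parent[find(u)] = find(v)

    cycles = len({find(x) for x in list(parent)})
    return len(a) - cycles


def block_jaccard_distance(a: List[int], b: List[int]) -> float:
    """Content-only distance that ignores block order and orientation."""
    sa, sb = set(map(abs, a)), set(map(abs, b))
    return 1.0 - len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    reference = [1, 2, 3, 4]
    inverted = [1, -2, 3, 4]   # block 2 inverted
    swapped = [1, 3, 2, 4]     # blocks 2 and 3 interchanged
    for other in (reference, inverted, swapped):
        print(other, block_jaccard_distance(reference, other), dcj_distance(reference, other))
```

In this toy example the content-only distance is zero for all three comparisons, whereas the DCJ distance separates the rearranged plasmids from the reference. This mirrors the point made above that distances based only on shared content can overestimate relatedness by ignoring rearrangements.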

Funding
This study was supported by the:
  • H2020 Marie Skłodowska-Curie Actions (Award 956229)
    • Principal Award Recipient: Zamin Iqbal
  • This is an open-access article distributed under the terms of the Creative Commons Attribution License. This article was made open access via a Publish and Read agreement between the Microbiology Society and the corresponding author’s institution.
DOI: 10.1099/mgen.0.001300
2024-10-14
2024-11-14
