Abstract

Plasmids are extrachromosomal genetic elements that replicate independently of the chromosome and play a vital role in the environmental adaptation of bacteria. Due to potential mobilization or conjugation capabilities, plasmids are important genetic vehicles for antimicrobial resistance genes and virulence factors with huge and increasing clinical implications. They are therefore subject to large genomic studies within the scientific community worldwide. As a result of rapidly improving next-generation sequencing methods, the quantity of sequenced bacterial genomes is constantly increasing, in turn raising the need for specialized tools to (i) extract plasmid sequences from draft assemblies, (ii) derive their origin and distribution, and (iii) further investigate their genetic repertoire. Recently, several bioinformatic methods and tools have emerged to tackle this issue; however, a combination of high sensitivity and specificity in plasmid sequence identification is rarely achieved in a taxon-independent manner. In addition, many software tools are not appropriate for large high-throughput analyses or cannot be included in existing software pipelines due to their technical design or software implementation. In this study, we investigated differences in the replicon distributions of protein-coding genes on a large scale as a new approach to distinguish plasmid-borne from chromosome-borne contigs. We defined and computed statistical discrimination thresholds for a new metric: the replicon distribution score (RDS), which achieved an accuracy of 96.6 %. The final performance was further improved by the combination of the RDS metric with heuristics exploiting several plasmid-specific higher-level contig characterizations. We implemented this workflow in a new high-throughput taxon-independent bioinformatics software tool called Platon for the recruitment and characterization of plasmid-borne contigs from short-read draft assemblies. 
Compared to PlasFlow, Platon achieved a higher accuracy (97.5 %) and more balanced predictions (F1=82.6 %) tested on a broad range of bacterial taxa and better or equal performance against the targeted tools PlasmidFinder and PlaScope on sequenced isolates. Platon is available at: http://platon.computational.bio/.
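The accuracy and F1 figures quoted above measure different things, and a short sketch may help show why both are reported. The example below assumes that plasmid-borne contigs are the positive class and uses purely hypothetical counts, not data from the study. The function name `classification_metrics` is illustrative and is not part of Platon.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Assumption: 'positive' means a contig labelled plasmid-borne,
    'negative' means chromosome-borne.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical, imbalanced toy counts (NOT results from the paper):
# 100 plasmid contigs and 900 chromosomal contigs.
metrics = classification_metrics(tp=80, fp=20, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these toy counts the accuracy is dominated by the many correctly classified chromosomal contigs, whereas F1 depends only on how well the minority plasmid class is recovered. This is one reason a tool can show high accuracy and a noticeably lower F1 on the same data.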

Funding
This study was supported by the:
  • Deutsche Forschungsgemeinschaft (Award HA 5225/1-1)
    • Principal Award Recipient: Torsten Hain
  • Deutsche Forschungsgemeinschaft (Award SFB 1021/2 2017)
    • Principal Award Recipient: Torsten Hain
  • Deutsche Forschungsgemeinschaft (Award TRR84/3 2018)
    • Principal Award Recipient: Torsten Hain
  • Deutsche Forschungsgemeinschaft (Award GO 2037/5-1)
    • Principal Award Recipient: Alexander Goesmann
  • Deutsche Forschungsgemeinschaft (Award TRR84/3 2018)
    • Principal Award Recipient: Trinad Chakraborty
  • de.NBI (Award FKZ 031A533B)
    • Principal Award Recipient: Alexander Goesmann
  • Deutsches Zentrum für Infektionsforschung (Award 8032808820)
    • Principal Award Recipient: Trinad Chakraborty
  • Deutsches Zentrum für Infektionsforschung (Award 8032808811)
    • Principal Award Recipient: Trinad Chakraborty
  • Deutsches Zentrum für Infektionsforschung (Award TI06.001)
    • Principal Award Recipient: Trinad Chakraborty
  • Deutsches Zentrum für Infektionsforschung (DE) (Award 8000 701–3)
    • Principal Award Recipient: Trinad Chakraborty
