
Abstract

Plasmids play an important role in bacterial evolution and mediate horizontal transfer of genes, including virulence and antimicrobial resistance genes. Although short-read sequencing technologies have enabled large-scale bacterial genomics, the resulting draft genome assemblies are often fragmented into hundreds of discrete contigs. Several tools and approaches have been developed to identify plasmid sequences in such assemblies, but they require a trade-off between sensitivity and specificity. Here we propose using the Kraken classifier, together with a custom Kraken database comprising known chromosomal and plasmid sequences of the Klebsiella pneumoniae species complex (KpSC), to identify plasmid-derived contigs in draft assemblies. We assessed performance using Illumina-based draft genome assemblies for 82 KpSC isolates, for which complete genomes were available to provide the ground truth. When benchmarked against five other classifiers (Centrifuge, RFPlasmid, mlplasmids, PlaScope and Platon), Kraken showed balanced performance in terms of overall sensitivity and specificity (90.8 and 99.4 %, respectively, for contig count; 96.5 and >99.9 %, respectively, for cumulative contig length), the highest accuracy (96.8 % vs 91.8–96.6 % for contig count; 99.8 % vs 99.0–99.7 % for cumulative contig length) and the highest F1-score (94.5 % vs 84.5–94.1 % for contig count; 98.0 % vs 88.9–96.7 % for cumulative contig length). Kraken also achieved consistent performance across our genome collection. Furthermore, we demonstrate that expanding the Kraken database with additional known chromosomal and plasmid sequences can further improve classification performance. Although we have focused here on the KpSC, this methodology could easily be applied to other species with a sufficient number of completed genomes.
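
The metrics reported above (sensitivity, specificity, accuracy and F1-score, each by contig count and by cumulative contig length) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the `Contig` record, the function names and the six example contigs are hypothetical, and the sketch assumes that "plasmid" is the positive class and that any contig not called "plasmid" counts as a negative call.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    length: int      # contig length in bp
    truth: str       # "plasmid" or "chromosome", taken from the complete genome
    predicted: str   # classifier call for this contig


def confusion(contigs, weight_by_length=False):
    """Return (tp, fp, tn, fn), counting contigs or summing their lengths."""
    tp = fp = tn = fn = 0
    for c in contigs:
        w = c.length if weight_by_length else 1
        is_plasmid = c.truth == "plasmid"
        called_plasmid = c.predicted == "plasmid"  # assumption: anything else is negative
        if is_plasmid and called_plasmid:
            tp += w
        elif is_plasmid:
            fn += w
        elif called_plasmid:
            fp += w
        else:
            tn += w
    return tp, fp, tn, fn


def ratio(num, den):
    """Divide, returning NaN when the denominator is zero."""
    return num / den if den else float("nan")


def metrics(tp, fp, tn, fn):
    sensitivity = ratio(tp, tp + fn)   # recall on true plasmid contigs
    specificity = ratio(tn, tn + fp)   # recall on true chromosomal contigs
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical draft assembly: made-up values for illustration only.
contigs = [
    Contig(200_000, "chromosome", "chromosome"),
    Contig(150_000, "chromosome", "chromosome"),
    Contig(3_000, "chromosome", "plasmid"),
    Contig(50_000, "plasmid", "plasmid"),
    Contig(8_000, "plasmid", "plasmid"),
    Contig(1_000, "plasmid", "chromosome"),
]

by_count = metrics(*confusion(contigs))
by_length = metrics(*confusion(contigs, weight_by_length=True))

for name in ("sensitivity", "specificity", "accuracy", "f1"):
    print(f"{name}: count={by_count[name]:.3f} length={by_length[name]:.3f}")
```

In this toy example the misclassified contigs are short, so the length-weighted values are higher than the count-based ones. The abstract's figures show the same direction of difference, with higher values for cumulative contig length than for contig count.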

Funding
This study was supported by:
  • the Bill and Melinda Gates Foundation (Award OPP1175797)
    • Principal Award Recipient: Kathryn E. Holt
  • the Viertel Charitable Foundation of Australia
    • Principal Award Recipient: Kathryn E. Holt
  • the National Health and Medical Research Council of Australia (Award APP1176192)
    • Principal Award Recipient: Kelly L. Wyres
  • the Japan Society for the Promotion of Science (Award JP19K20461)
    • Principal Award Recipient: Ryota Gomi
  • the John Mung Program from Kyoto University
    • Principal Award Recipient: Ryota Gomi
/content/journal/mgen/10.1099/mgen.0.000550
2021-04-07
2021-04-15
