rMAP: the Rapid Microbial Analysis Pipeline for ESKAPE bacterial group whole-genome sequence data
Open Access

Abstract

The recent re-emergence of multidrug-resistant pathogens has exacerbated their threat to worldwide public health. The ever-falling cost of whole-genome sequencing (WGS) in the genomics era has led to the generation of huge volumes of sequencing data at an unprecedented rate. We have developed the Rapid Microbial Analysis Pipeline (rMAP), a user-friendly pipeline capable of profiling the resistomes of ESKAPE pathogens (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa and Enterobacter species) using WGS data generated from Illumina’s sequencing platforms. rMAP is designed for individuals with little bioinformatics expertise, and automates the steps required for WGS analysis directly from the raw genomic sequence data, including adapter and low-quality sequence read trimming, genome assembly, genome annotation, single-nucleotide polymorphism (SNP) variant calling, phylogenetic inference by maximum likelihood, antimicrobial resistance (AMR) profiling, plasmid profiling, virulence factor determination, multi-locus sequence typing (MLST), pangenome analysis and insertion sequence (IS) characterization. Once the analysis is finished, rMAP generates an interactive, web-like HTML report. rMAP is simple to install and can be run using simple commands. It represents a rapid and easy way to perform comprehensive bacterial WGS analysis on a personal laptop in low-income settings where high-performance computing infrastructure is limited.

Funding
This study was supported by the:
  • Grand Challenges Africa (CA) (Award GCA/AMR/rnd2/058)
    • Principal Award Recipient: Gerald Mboowa

/content/journal/mgen/10.1099/mgen.0.000583
2021-06-10
2024-03-28
