Abstract

Whole genome sequencing (WGS) has become the reference standard for bacterial outbreak investigation and pathogen typing, providing a resolution unattainable with conventional molecular methods. However, data generated with Illumina sequencers can only be analysed after the sequencing run has finished, which costs valuable time during emergency situations. We evaluated both the effect of decreasing the overall run time and a protocol to transfer and convert intermediary files generated by Illumina sequencers, enabling real-time data analysis for multiple samples that are part of the same ongoing sequencing run as soon as the forward reads have been sequenced. To facilitate implementation in laboratories operating under strict quality systems, we extensively validated several bioinformatics assays (16S rRNA species confirmation, gene detection against virulence factor and antimicrobial resistance databases, SNP-based antimicrobial resistance detection, serotype determination, and core genome multilocus sequence typing) for three bacterial pathogens (Mycobacterium tuberculosis, Neisseria meningitidis, and Shiga-toxin producing Escherichia coli) by evaluating performance as a function of the two most critical sequencing parameters, i.e. read length and coverage. For the majority of evaluated bioinformatics assays, actionable results could be obtained between 14 and 22 h of sequencing, decreasing the overall sequencing-to-results time by more than half. This study helps reduce the turn-around time of WGS analysis, facilitating a faster response in time-critical scenarios, and provides recommendations for time-optimized WGS with respect to the read length and coverage required to achieve a minimum level of performance for the considered bioinformatics assay(s). These recommendations can also be used to maximize the cost-effectiveness of routine surveillance sequencing when response time is not essential.
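The abstract names read length and coverage as the two critical sequencing parameters. As a rough illustration of how they interact when a run is stopped early, the minimal sketch below uses the standard estimate C = L × N / G (mean coverage = read length × number of reads / genome size). The function names, the read count and the genome size are hypothetical values chosen for illustration and are not taken from the study.

```python
def estimated_coverage(read_length: int, read_count: int, genome_size: int) -> float:
    """Estimate mean coverage with C = L * N / G.

    read_length : bases per read (equal to the number of cycles completed so far)
    read_count  : number of reads for the sample
    genome_size : genome length in bases
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return read_length * read_count / genome_size


def coverage_by_cycle(read_count: int, genome_size: int, cycles: list[int]) -> dict[int, float]:
    """Coverage available if analysis starts after each given number of cycles."""
    return {c: estimated_coverage(c, read_count, genome_size) for c in cycles}


if __name__ == "__main__":
    # Hypothetical sample: 1,000,000 forward reads on a 5 Mb genome (assumed values).
    table = coverage_by_cycle(1_000_000, 5_000_000, [50, 100, 150, 250])
    for n_cycles, cov in table.items():
        print(f"{n_cycles} cycles -> ~{cov:.0f}x coverage")
```

With a fixed number of reads, coverage grows linearly with the number of completed cycles, so an analysis started on shorter reads works with proportionally lower coverage.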

Keyword(s): Illumina, NGS, real-time, validation and WGS
  • This is an open-access article distributed under the terms of the Creative Commons Attribution License.

/content/journal/mgen/10.1099/mgen.0.000699
2021-11-05
2024-10-11
