Abstract

Metagenome community analyses, driven by continued developments in sequencing technology, are rapidly providing insights into many aspects of microbiology and are becoming a cornerstone tool. Illumina, Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) are the leading technologies, each with its own advantages and drawbacks. Illumina provides accurate reads at a low cost, but the reads are too short to close bacterial genomes. Long reads overcome this limitation, but these technologies produce reads with lower accuracy (ONT) or at lower throughput (PacBio high-fidelity reads). In a critical first analysis step, reads are assembled to reconstruct genomes or individual genes within the community. However, to date, the performance of existing assemblers has never been challenged with a complex mock metagenome. Here, we evaluate the performance of current assemblers that use short reads, long reads or both on a complex mock metagenome consisting of 227 bacterial strains with varying degrees of relatedness. We show that many of the current assemblers are not suited to handling such a complex metagenome. In addition, hybrid assemblies do not fulfil their potential. We conclude that ONT reads assembled with CANU offer the best value for reconstructing genomes of complex metagenomes, and that Illumina reads assembled with SPAdes offer the best value for reconstructing individual genes.

  • This is an open-access article distributed under the terms of the Creative Commons Attribution License.

DOI: 10.1099/mic.0.001469
2024-06-25
2024-07-15


Supplements

  • Supplementary material 1 (Excel)
  • Supplementary material 2 (Excel)
  • Supplementary material 3 (Excel)
  • Supplementary material 4 (PDF)
  • Supplementary material 5 (PDF)
  • Supplementary material 6 (PDF)
  • Supplementary material 7 (PDF)
  • Supplementary material 8 (PDF)
  • Supplementary material 9 (PDF)
  • Supplementary material 10 (PDF)
  • Supplementary material 11 (PDF)