Abstract

Metagenomics and marker gene approaches, coupled with high-throughput sequencing technologies, have revolutionized the field of microbial ecology. Metagenomics is a culture-independent method that allows the identification and characterization of organisms from all kinds of samples. Whole-genome shotgun sequencing analyses the total DNA of a chosen sample to determine the presence of micro-organisms from all domains of life and their genomic content. Importantly, the whole-genome shotgun sequencing approach reveals the genomic diversity present, but can also give insights into the functional potential of the micro-organisms identified. The marker gene approach is based on the sequencing of a specific gene region. It allows one to describe the microbial composition based on the taxonomic groups present in the sample. It is frequently used to analyse the biodiversity of microbial ecosystems. Despite its importance, the analysis of metagenomic sequencing and marker gene data remains challenging. Here we review the primary workflows and software used for both approaches and discuss the current challenges in the field.

Funding
This study was supported by:
  • Carmen Buchrieser, Fondation pour la Recherche Médicale (Award EQU201903007847)
  • Carmen Buchrieser, Agence Nationale de la Recherche (Award ANR-10-LABX-62-IBEID)
/content/journal/mgen/10.1099/mgen.0.000409
2020-07-24
2020-08-09

References

  1. Roumpeka DD, Wallace RJ, Escalettes F, Fotheringham I, Watson M. A review of bioinformatics tools for bio-prospecting from metagenomic sequence data. Frontiers in Genetics. Epub ahead of print 2017
    [Google Scholar]
  2. Case RJ, Boucher Y, Dahllo I, Holmstro C, Doolittle WF et al. Use of 16S rRNA and rpoB genes as molecular markers for microbial ecology studies. Appl Environ Microbiol 2007; 73:278–288 [CrossRef]
    [Google Scholar]
  3. Schoch CL, Seifert KA, Huhndorf S, Robert V, Spouge JL et al. Nuclear ribosomal internal transcribed spacer (its) region as a universal DNA barcode marker for fungi. Proc Natl Acad Sci 2012; 109:6241–6246
    [Google Scholar]
  4. Nilsson RH, Tedersoo L, Abarenkov K, Carlsen T, Pennanen T et al. Methods Fungal community analysis by high-throughput sequencing of amplified markers – a user’ guide.
  5. Wilkins LGE, Ettinger CL, Jospin G, Eisen JA. Metagenome-assembled genomes provide new insight into the microbial diversity of two thermal pools in Kamchatka. Russia 20191–15
    [Google Scholar]
  6. Bishara A, Moss EL, Kolmogorov M, Parada AE, Weng Z et al. Hhs public access.
  7. Stewart RD, Auffret MD, Warr A, Wiser AH, Press MO et al. Metagenomic sequencing of the cow rumen. Nat Commun1–11
    [Google Scholar]
  8. Callahan BJ, Mcmurdie PJ, Rosen MJ, Han AW, AJ A. Hhs public access.; 2016; 13581–583
  9. Edgar R. UCHIME2: improved chimera prediction for amplicon sequencing. bioRxiv Epub ahead of print 2016
    [Google Scholar]
  10. Single- DRR, Sequence NC. Deblur rapidly resolves single-; 2017; 21–7
  11. Dilthey AT. With MetaMaps. Nat Commun
    [Google Scholar]
  12. Scholz M, Ward D V, Pasolli E, Tolio T, Zolfo M et al. Strain-level microbial epidemiology and population genomics from shotgun metagenomics; Epub ahead of print 2016; 13
  13. Fang X, Monk JM, Nurk S, Akseshina M, Zhu Q et al. Analysis of Escherichia coli from a time-series of microbiome samples from a Crohns disease patient; 2018; 91–14
  14. Walker AW, Martin JC, Scott P, Parkhill J, Flint HJ et al. 16S rRNA gene-based profiling of the human infant gut microbiota is strongly influenced by sample processing and PCR primer choice. Microbiome 20151–11
    [Google Scholar]
  15. Chen Z, Hui C, Hui M, Yeoh K, Wong Y. crossm impact of preservation method and 16S rRNA hypervariable region on gut microbiota profiling; 2019; 41–15
  16. Sze MA. The impact of DNA polymerase and number of rounds of amplification in PCR on 16S rRNA gene sequence data; 2019; 49–12
  17. Sabina J, Leamon JH. Bias in whole genome amplification: causes and considerations. Methods Mol Biol 2015; 1347:15–41
    [Google Scholar]
  18. Ross MG, Russ C, Costello M, Hollinger A, Lennon NJ et al. Characterizing and measuring bias in sequence data. Genome Biol 2013; 14:R51
    [Google Scholar]
  19. Kim D, Hofstaedter CE, Zhao C, Mattei L, Tanes C et al. Optimizing methods and dodging pitfalls in microbiome research.; 20171–14
  20. Perez-Cobas AE, Buchrieser C. Analysis of the pulmonary microbiome composition of Legionella pneumophila-Infected patients. Methods Mol Biol 1921; 2019:429–443
    [Google Scholar]
  21. Knight R, Vrbanac A, Taylor BC, Aksenov A, Callewaert C et al. Best practices for analysing microbiomes. Nat Rev Microbiol 2018; 16:410–422
    [Google Scholar]
  22. Jiao X, Zheng X, Ma L, Kutty G, Gogineni E et al. A benchmark study on error assessment and quality control of CCS reads derived from the PacBio RS. J Data Mining Genomics Proteomics Epub ahead of print July 2013; 4:
    [Google Scholar]
  23. Laver T, Harrison J, O’Neill PA, Moore K, Farbos A et al. Assessing the performance of the Oxford nanopore technologies MinION. Biomol Detect Quantif 2015; 3:1–8
    [Google Scholar]
  24. Edgar RC. Accuracy of microbial community diversity estimated by closed- and open- reference Otus.; Epub ahead of print 2017
  25. Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics. Epub ahead of print 2011
    [Google Scholar]
  26. Bolger AM, Lohse M, Usadel B. Genome analysis Trimmomatic : a flexible trimmer for Illumina sequence data; 2014; 302114–2120
  27. Andrews S. FASTQC a quality control tool for high throughput sequence data. Babraham Inst
    [Google Scholar]
  28. Shen W, Le S, Li Y, Hu F. SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA / Q File Manipulation; 20161–10
  29. Aronesty E. Comparison of sequencing utility programs. Open Bioinforma J Epub ahead of print 2013
    [Google Scholar]
  30. Stamatakis A, Zhang J, Kobert K. Genome analysis PEAR: a fast and accurate Illumina Paired-End reAd mergeR; 2014; 30614–620
  31. Quince C, Walker AW, Simpson JT, Loman NJ, Segata N. Shotgun metagenomics, from sampling to analysis. Nat Biotechnol 2017; 35:833–844
    [Google Scholar]
  32. Miller JR, Koren S, Sutton G. Assembly algorithms for next-generation sequencing data. Genomics 2010; 95:315–327
    [Google Scholar]
  33. Venter JC, Adams MD, Myers EW, PW L, Mural RJ et al. The sequence of the human genome. Science 2001; 291:1304–1351
    [Google Scholar]
  34. Ghurye JS, Cepeda-Espinoza V, Pop M. Metagenomic assembly: overview, challenges and applications. Yale J Biol Med 2016; 89:353–362
    [Google Scholar]
  35. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 2012; 19:455–477
    [Google Scholar]
  36. Boisvert S, Raymond F, Godzaridis E, Laviolette F, Corbeil J. Ray meta: scalable de novo metagenome assembly and profiling. Genome Biol 2012; 13:R122 [CrossRef][PubMed]
    [Google Scholar]
  37. Luo R, Liu B, Xie Y, Li Z, Huang W et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience 2012; 1:18
    [Google Scholar]
  38. Chikhi R, Medvedev P. Informed and automated k-mer size selection for genome assembly. Bioinformatics Epub ahead of print 2014.
    [Google Scholar]
  39. Sun H, DIng J, Piednoël M, Schneeberger K. FindGSE: estimating genome size variation within human and Arabidopsis using K -mer frequencies. Bioinformatics. Epub ahead of print 2018
    [Google Scholar]
  40. Vurture GW, Sedlazeck FJ, Nattestad M, Underwood CJ, Fang H et al. GenomeScope: Fast Reference-Free Genome Profiling from Short Reads. In: Bioinformatics. 2017 Epub ahead of print 2017
    [Google Scholar]
  41. Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA - A practical iterative De Bruijn graph De Novo assembler. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2010 Epub ahead of print 2010.
    [Google Scholar]
  42. Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 2012; 28:1420–1428
    [Google Scholar]
  43. Mahadik K, Wright C, Kulkarni M, Bagchi S, Chaterji S. Scalable genome assembly through parallel de Bruijn graph construction for multiple k-mers. Sci Rep Epub ahead of print 2019
    [Google Scholar]
  44. Afiahayati SK, Sakakibara Y. MetaVelvet-SL: an extension of the velvet assembler to a de novo metagenomic assembler utilizing supervised learning. DNA Res 2015; 22:69–77
    [Google Scholar]
  45. Namiki T, Hachiya T, Tanaka H, Sakakibara Y. MetaVelvet: an extension of velvet assembler to de novo metagenome assembly from short sequence reads. Nucleic Acids Res 2012; 40:e155
    [Google Scholar]
  46. IKS Y, Li J V SJ, Martin F-P, Davies H et al. Metabonomic and microbiological analysis of the dynamic effect of vancomycin-induced gut microbiota modification in the mouse. J Proteome Res 2008; 7:3718–3728
    [Google Scholar]
  47. Chikhi R, Rizk G. Space-efficient and exact de Bruijn graph representation based on a Bloom filter. Algorithms Mol Biol 2013; 8:22
    [Google Scholar]
  48. Zimin A, Marçais G, Puiu D, Roberts M, Salzberg SL et al. The MaSuRCA genome assembler. Bioinformatics 2013; 29:2669–2677
    [Google Scholar]
  49. Vollmers J, Wiegand S, Kaster A-K. Comparing and Evaluating Metagenome Assembly Tools from a Microbiologist’s Perspective - Not Only Size Matters!. PLoS One 2017; 12:e0169662
    [Google Scholar]
  50. Wang Z, Wang Y, Fuhrman JA, Sun F, Zhu S. Assessment of metagenomic assemblers based on hybrid reads of real and simulated metagenomic sequences. Brief Bioinform 2019; 00:1–14
    [Google Scholar]
  51. Forouzan E, Shariati P, Mousavi Maleki MS, Karkhane AA, Yakhchali B. Practical evaluation of 11 de novo assemblers in metagenome assembly. J Microbiol Methods 2018; 151:99–105
    [Google Scholar]
  52. van der Walt AJ, van Goethem MW, Ramond JB, Makhalanyane TP, Reva O et al. Assembling metagenomes, one community at a time. BMC Genomics Epub ahead of print 2017
    [Google Scholar]
  53. Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. metaSPAdes: a new versatile metagenomic assembler. Genome Res 2017; 27:824–834
    [Google Scholar]
  54. Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 2015; 31:1674–1676
    [Google Scholar]
  55. Wang Z, Wang Y, Fuhrman JA, Sun F, Zhu S. Assessment of metagenomic assemblers based on hybrid reads of real and simulated metagenomic sequences. Brief Bioinform.
  56. Sczyrba A, Hofmann P, Belmann P, Koslicki D, Janssen S et al. Critical assessment of metagenome Interpretation—a benchmark of metagenomics software. Nat Methods 2017; 14:1063–1071
    [Google Scholar]
  57. Chapman JA, Ho I, Sunkara S, Luo S, Schroth GP et al. Meraculous: de novo genome assembly with short Paired-End reads. PLoS One 2011; 6:e23501
    [Google Scholar]
  58. Zerbino DR, Birney E. Velvet : Algorithms for de novo short read assembly using de Bruijn graphs; 2008821–829
  59. Bertrand D, Shaw J, Kalathiyappan M, AHQ N, Kumar MS et al. Hybrid metagenomic assembly enables high-resolution analysis of resistance determinants and mobile elements in human microbiomes. Nat Biotechnol 2019; 37:937–944
    [Google Scholar]
  60. Treangen TJ, Koren S, Sommer DD, Liu B, Astrovskaya I et al. MetAMOS: a modular and open source metagenomic assembly and analysis pipeline. Genome Biol 2013; 14:R2
    [Google Scholar]
  61. Scholz M, Lo C-C CPSG. Improved assemblies using a Source-Agnostic pipeline for metagenomic assembly by merging (MeGAMerge) of contigs. Sci Rep 2015; 4:6480
    [Google Scholar]
  62. Vicedomini R, Vezzi F, Scalabrin S, Arvestad L, Policriti A. GAM-NGS: genomic assemblies merger for next generation sequencing. BMC Bioinformatics 2013; 14:S6
    [Google Scholar]
  63. Mikheenko A, Saveliev V, Gurevich A. MetaQUAST: evaluation of metagenome assemblies. Bioinformatics 2016; 32:1088–1090
    [Google Scholar]
  64. Gerlach W, Stoye J. Taxonomic classification of metagenomic shotgun sequences with CARMA3. Nucleic Acids Res. Epub ahead of print 2011
    [Google Scholar]
  65. Liu B, Gibbons T, Ghodsi M, Pop M. MetaPhyler: taxonomic profiling for metagenomic sequences. Proc - 2010 IEEE Int Conf Bioinforma Biomed BIBM 2010; 2010:95–100
    [Google Scholar]
  66. Mohammed MH, Ghosh TS, Singh NK, Mande SS. SPHINX—an algorithm for taxonomic binning of metagenomic sequences. Bioinformatics 2011; 27:22–30
    [Google Scholar]
  67. Diaz NN, Krause L, Goesmann A, Niehaus K, Nattkemper TW. TACOA – taxonomic classification of environmental genomic fragments using a kernelized nearest neighbor approach. BMC Bioinformatics 2009; 10:56
    [Google Scholar]
  68. Gregor I, Dröge J, Schirmer M, Quince C, McHardy AC. PhyloPythiaS+ : a self-training method for the rapid reconstruction of low-ranking taxonomic bins from metagenomes. PeerJ 2016; 4:e1603
    [Google Scholar]
  69. Chen I-MA, Chu K, Palaniappan K, Pillay M, Ratner A et al. IMG/M v.5.0: an integrated data management and comparative analysis system for microbial genomes and microbiomes. Nucleic Acids Res 2019; 47:D666–D677
    [Google Scholar]
  70. Meyer F, Bagchi S, Chaterji S, Gerlach W, Grama A et al. MG-RAST version 4—lessons learned from a decade of low-budget ultra-high-throughput metagenome analysis. Brief Bioinform 2019; 20:1151–1159
    [Google Scholar]
  71. Huson DH, Beier S, Flade I, Górska A, El-Hadidi M et al. MEGAN Community Edition - Interactive Exploration and Analysis of Large-Scale Microbiome Sequencing Data. PLoS Comput Biol 2016; 12:e1004957
    [Google Scholar]
  72. Sedlar K, Kupkova K, Provaznik I. Bioinformatics strategies for taxonomy independent binning and visualization of sequences in shotgun metagenomics. Comput Struct Biotechnol J 2017; 15:48–55
    [Google Scholar]
  73. Dick GJ, Andersson AF, Baker BJ, Simmons SL, Thomas BC et al. Community-Wide analysis of microbial genome sequence signatures. Genome Biol 2009; 10:R85
    [Google Scholar]
  74. Laczny CC, Muller EEL, Heintz-Buschart A, Herold M, Lebrun LA et al. Identification, recovery, and refinement of hitherto undescribed population-level genomes from the human gastrointestinal tract. Front Microbiol 2016; 7:884
    [Google Scholar]
  75. Strous M, Kraft B, Bisdorf R, Tegetmeyer HE. The binning of metagenomic contigs for microbial physiology of mixed cultures. Front Microbiol 2012; 3:410
    [Google Scholar]
  76. Kelley DR, Salzberg SL. Clustering metagenomic sequences with interpolated Markov models. BMC Bioinformatics 2010; 11:544
    [Google Scholar]
  77. Kislyuk A, Bhatnagar S, Dushoff J, Weitz JS. Unsupervised statistical clustering of environmental shotgun sequences. BMC Bioinformatics 2009; 10:316
    [Google Scholar]
  78. Y-W W, Ye Y. A novel abundance-based algorithm for binning metagenomic sequences using l-tuples. J Comput Biol 2011; 18:523–534
    [Google Scholar]
  79. Wang Y, Hu H, Li X. MBBC: an efficient approach for metagenomic binning based on clustering. BMC Bioinformatics Epub ahead of print 2015
    [Google Scholar]
  80. Nielsen HB, Almeida M, Juncker AS, Rasmussen S, Li J et al. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes. Nat Biotechnol 2014; 32:822–828
    [Google Scholar]
  81. Wang Y, Leung HCM, Yiu SM, Chin FYL. MetaCluster 4.0: a novel binning algorithm for NGS reads and huge number of species. J Comput Biol 2012; 19:241–249
    [Google Scholar]
  82. Chatterji S, Yamazaki I, Bai Z, Eisen J. CompostBin: a DNA composition-based algorithm for binning environmental shotgun reads..
  83. Y-W W, Simmons BA, Singer SW. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics 2016; 32:605–607
    [Google Scholar]
  84. Kang DD, Li F, Kirton E, Thomas A, Egan R et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. Peer J 2019; 7:e7359 [CrossRef][PubMed]
    [Google Scholar]
  85. Alneberg J, Bjarnason BS, Bruijn de I, Schirmer M, Quick J et al. Binning metagenomic contigs by coverage and composition. Nat Methods 2014; 11:1144–1146
    [Google Scholar]
  86. YY L, Chen T, Fuhrman JA, Sun F. COCACOLA: binning metagenomic contigs using sequence composition, read coverage, CO-alignment and paired-end read linkage. Bioinformatics 2016; 33:btw290
    [Google Scholar]
  87. Lin H-H, Liao Y-C. Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes. Sci Rep 2016; 6:24175
    [Google Scholar]
  88. Dröge J, Gregor I, McHardy AC. Taxator-tk: precise taxonomic assignment of metagenomes by fast approximation of evolutionary neighborhoods. Bioinformatics 2015; 31:817–824
    [Google Scholar]
  89. Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol; Epub ahead of print 2014; 15:
    [Google Scholar]
  90. Yu G, Jiang Y, Wang J, Zhang H, Luo H. BMC3C: binning metagenomic contigs using codon usage, sequence composition and read coverage. Bioinformatics 2018; 34:4172–4179
    [Google Scholar]
  91. Ma T, Xiao D, Xing X. MetaBMF: a scalable binning algorithm for large-scale reference-free metagenomic studies. Bioinformatics 11: [CrossRef]
    [Google Scholar]
  92. Wang Z, Wang Z, Lu YY, Sun F, Zhu S. SolidBin: improving metagenome binning with semi-supervised normalized cut. Bioinformatics 2019; 35:4229–4238 [CrossRef]
    [Google Scholar]
  93. Breitwieser FP, Lu J, Salzberg SL. A review of methods and databases for metagenomic classification and assembly. Brief Bioinform Epub ahead of print 2018
    [Google Scholar]
  94. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 2015; 25:1043
    [Google Scholar]
  95. Meyer F, Hofmann P, Belmann P, Garrido-Oter R, Fritz A et al. Amber: assessment of metagenome BinnERs. Gigascience Epub ahead of print June 2018; 7:
    [Google Scholar]
  96. Song W-Z, Thomas T. Binning_refiner: improving genome bins through the combination of different binning programs. Bioinformatics 2017; 33:1873–1875
    [Google Scholar]
  97. Sieber CMK, Probst AJ, Sharrar A, Thomas BC, Hess M et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat Microbiol 2018; 3:836–843
    [Google Scholar]
  98. Uritskiy G V, DiRuggiero J, Taylor J. MetaWRAP—a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 2018; 6:158
    [Google Scholar]
  99. Broeksema B, Calusinska M, McGee F, Winter K, Bongiovanni F et al. ICoVeR – an interactive visualization tool for verification and refinement of metagenomic bins. BMC Bioinformatics 2017; 18:233
    [Google Scholar]
  100. Miller IJ, Rees ER, Ross J, Miller I, Baxa J et al. Autometa: automated extraction of microbial genomes from individual shotgun metagenomes. Nucleic Acids Res 2019; 47:1–12
    [Google Scholar]
  101. Hugerth LW, Larsson J, Alneberg J, Lindh MV, Legrand C et al. Metagenome-assembled genomes uncover a global brackish microbiome. Genome Biol 2015; 16: [CrossRef]
    [Google Scholar]
  102. Noguchi H, Park J, Takagi T. Metagene: prokaryotic gene finding from environmental genome shotgun sequences. Nucleic Acids Res 2006; 34:5623–5630 [CrossRef]
    [Google Scholar]
  103. Noguchi H, Taniguchi T, Itoh T. MetaGeneAnnotator: detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes. DNA Res 2008; 15:387–396 [CrossRef]
    [Google Scholar]
  104. Zhu W, Lomsadze A, Borodovsky M. Ab initio gene identification in metagenomic sequences. Nucleic Acids Res 2010; 38:e132 [CrossRef]
    [Google Scholar]
  105. Lomsadze A, Gemayel K, Tang S, Borodovsky M. Modeling leaderless transcription and atypical genes results in more accurate gene prediction in prokaryotes. Genome Res 2018; 28:1079–1089 [CrossRef]
    [Google Scholar]
  106. Delcher A, Harmon D, Kasif S, White O, Salzberg SL. Improved microbial gene identification with glimmer. Nucleic Acids Res 1999; 27:4636–4641 [CrossRef]
    [Google Scholar]
  107. Hyatt D, Chen G-L, LoCascio PF, Land ML, Larimer FW et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 2010; 11:119 [CrossRef]
    [Google Scholar]
  108. Hyatt D, LoCascio PF, Hauser LJ, Uberbacher EC. Gene and translation initiation site prediction in metagenomic sequences. Bioinformatics 2012; 28:2223–2230 [CrossRef]
    [Google Scholar]
  109. Kelley DR, Liu B, Delcher AL, Pop M, Salzberg SL. Gene prediction with glimmer for metagenomic sequences augmented by classification and clustering. Nucleic Acids Res 2012; 40:e9 [CrossRef]
    [Google Scholar]
  110. Brady A, Salzberg SL, Phymm SSL. Phymm and PhymmBL: metagenomic phylogenetic classification with interpolated Markov models. Nat Methods 2009; 6:673–676 [CrossRef]
    [Google Scholar]
  111. Rho M, Tang H, Ye Y. FragGeneScan: predicting genes in short and error-prone reads. Nucleic Acids Res 2010; 38:e191 [CrossRef]
    [Google Scholar]
  112. Yok NG, Rosen GL. Combining gene prediction methods to improve metagenomic gene annotation. BMC Bioinformatics 2011; 12:20 [CrossRef]
    [Google Scholar]
  113. Trimble WL, Keegan KP, D’Souza M, Wilke A, Wilkening J et al. Short-Read reading-frame predictors are not created equal: sequence error causes loss of signal. BMC Bioinformatics 2012; 13: [CrossRef]
    [Google Scholar]
  114. Huntemann M, Ivanova NN, Mavromatis K, Tripp HJ, Paez-Espino D et al. The standard operating procedure of the DOE-JGI metagenome annotation pipeline (MAP v.4). Stand Genomic Sci 2016; 11:17 [CrossRef]
    [Google Scholar]
  115. Ter-Hovhannisyan V, Lomsadze A, Chernoff YO, Borodovsky M. Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res 2008; 18:1979–1990 [CrossRef]
    [Google Scholar]
  116. Lomsadze A, Ter-Hovhannisyan V, Chernoff YO, Borodovsky M. Gene identification in novel eukaryotic genomes by self-training algorithm. Nucleic Acids Res 2005; 33:6494–6506 [CrossRef]
    [Google Scholar]
  117. Stanke M, Morgenstern B. AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res 2005; 33:W465–W467 [CrossRef]
    [Google Scholar]
  118. Souvorov A, Kapustin Y, Kiryutin B, Chetvernin V, Tatusova T et al. Gnomon–NCBI eukaryotic gene prediction tool. Natl Cent Biotechnol Inf 20101–24
    [Google Scholar]
  119. Korf I. Gene finding in novel genomes. BMC Bioinformatics 2004; 5:59 [CrossRef]
    [Google Scholar]
  120. Sallet E, Gouzy J, Schiex T. EuGene: an automated integrative gene finder for eukaryotes and prokaryotes. In Clifton NJ. editor Methods in Molecular Biology 2019 pp 97–120
    [Google Scholar]
  121. Holt C, Yandell M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 2011; 12:491 [CrossRef]
    [Google Scholar]
  122. Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY et al. Pfam: the protein families database. Nucleic Acids Res 2014; 42:D222–D230 [CrossRef]
    [Google Scholar]
  123. Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A et al. InterPro: the integrative protein signature database. Nucleic Acids Res 2009; 37:D211–D215 [CrossRef]
    [Google Scholar]
  124. Claudel-Renard C, Chevalet C, Faraut T, Kahn D. Enzyme-Specific profiles for genome annotation: PRIAM. Nucleic Acids Res 2003; 31:6633–6639 [CrossRef]
    [Google Scholar]
  125. Karp PD, Riley M, Paley SM, Pellegrini-Toole A. The MetaCyc database. Nucleic Acids Res 2002; 30:59–61 [CrossRef]
    [Google Scholar]
  126. Alcock BP, Raphenya AR, TTY L, Tsang KK, Bouchard M et al. Card 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database. Nucleic Acids Res Epub ahead of print 2020
    [Google Scholar]
  127. Pal C, Bengtsson-Palme J, Rensing C, Kristiansson E, Larsson DGJ. BacMet: antibacterial biocide and metal resistance genes database. Nucleic Acids Res 2014; 42:D737–D743 [CrossRef]
    [Google Scholar]
  128. Vallenet D, Calteau A, Dubois M, Amours P, Bazin A et al. Microscope: an integrated platform for the annotation and exploration of microbial gene functions through genomic, pangenomic and metabolic comparative analysis. Nucleic Acids Res Epub ahead of print 2020
    [Google Scholar]
  129. Kultima JR, Coelho LP, Forslund K, Huerta-Cepas J, Li SS et al. MOCAT2: a metagenomic assembly, annotation and profiling framework. Bioinformatics 2016; 32:2520–2523 [CrossRef][PubMed]
    [Google Scholar]
  130. Bengtsson-Palme J. Strategies for Taxonomic and Functional Annotation of Metagenomes. In: Metagenomics 2018 pp 55–79
    [Google Scholar]
  131. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 2014; 30:2068–2069 [CrossRef]
    [Google Scholar]
  132. Tanizawa Y, Fujisawa T, Nakamura Y. DFAST: a flexible prokaryotic genome annotation pipeline for faster genome publication. Bioinformatics. 2018; 34:1037–1039 [CrossRef]
    [Google Scholar]
  133. Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP et al. Ncbi prokaryotic genome annotation pipeline. Nucleic Acids Res 2016; 44:6614–6624 [CrossRef][PubMed]
    [Google Scholar]
  134. Humann JL, Lee T, Ficklin S, Main D. Structural and Functional Annotation of Eukaryotic Genomes with GenSAS New York, NY: Humana; 2019 pp 29–51
    [Google Scholar]
  135. Dong X, Strous M. An integrated pipeline for annotation and visualization of metagenomic contigs. Front Genet Epub ahead of print 2019; 10: [CrossRef]
    [Google Scholar]
  136. Goll J, Rusch DB, Tanenbaum DM, Thiagarajan M, Li K et al. METAREP: JCVI metagenomics reports-an open source tool for high-performance comparative metagenomics. Bioinformatics 2010; 26:2631–2632 [CrossRef]
    [Google Scholar]
  137. Lesker TR, Durairaj AC, Gálvez EJC, Lagkouvardos I, Baines JF et al. An integrated metagenome catalog reveals new insights into the murine gut microbiome. Cell Rep Epub ahead of print 2020
    [Google Scholar]
  138. Jia S, Wu J, Ye L, Zhao F, Li T et al. Metagenomic assembly provides a deep insight into the antibiotic resistome alteration induced by drinking water chlorination and its correlations with bacterial host changes. J Hazard Mater 2019; 379:120841 [CrossRef]
    [Google Scholar]
  139. Almeida A, Mitchell AL, Boland M, Forster SC, Gloor GB et al. A new genomic blueprint of the human gut microbiota. Nature 2019; 568:499–504 [CrossRef][PubMed]
    [Google Scholar]
  140. Pasolli E, Asnicar F, Manara S, Zolfo M, Karcher N et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 2019; 176:649–662 [CrossRef][PubMed]
    [Google Scholar]
  141. Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using diamond. Nat Methods 2015; 12:59–60 [CrossRef]
    [Google Scholar]
  142. Keegan KP, Glass EM, Meyer F. MG-RAST, a Metagenomics Service for Analysis of Microbial Community Structure and Function New York, NY: Humana Press; 2016 pp 207–233
    [Google Scholar]
  143. Stewart RD, Auffret MD, Snelling TJ, Roehe R, Watson M. MAGpy: a reproducible pipeline for the downstream analysis of metagenome-assembled genomes (MAGs). Bioinformatics. 2019; 35:2150–2152 [CrossRef]
    [Google Scholar]
  144. Quince C, Delmont TO, Raguideau S, Alneberg J, Darling AE et al. DESMAN: a new tool for de novo extraction of strains from metagenomes. Genome Biol 2017; 18:181 [CrossRef][PubMed]
    [Google Scholar]
  145. Segata N. On the road to Strain-Resolved comparative Metagenomics. mSystems Epub ahead of print 2018; 3: [CrossRef]
    [Google Scholar]
  146. Zolfo M, Tett A, Jousson O, Donati C, Segata N. MetaMLST: multi-locus strain-level bacterial typing from metagenomic samples. Nucleic Acids Res 2017; 45:e7 [CrossRef]
    [Google Scholar]
  147. Truong DT, Tett A, Pasolli E, Huttenhower C, Segata N. Microbial strain-level population structure & genetic diversity from metagenomes. Genome Res Epub ahead of print 2017
    [Google Scholar]
  148. Costea PI, Munch R, Coelho LP, Paoli L, Sunagawa S et al. metaSNV: a tool for metagenomic strain level analysis. PLoS One Epub ahead of print 2017; 12:e0182392 [CrossRef]
    [Google Scholar]
  149. Ounit R, Wanamaker S, Close TJ, Lonardi S. Clark: fast and accurate classification of metagenomic and genomic sequences using discriminative k-mers. BMC Genomics 2015; 16: [CrossRef]
    [Google Scholar]
  150. Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol 2019; 20: [CrossRef]
    [Google Scholar]
  151. Flygare S, Simmon K, Miller C, Qiao Y, Kennedy B et al. Taxonomer: an interactive metagenomics analysis portal for universal pathogen detection and host mRNA expression profiling. Genome Biol 2016; 17: [CrossRef]
    [Google Scholar]
  152. Kim D, Song L, Breitwieser FP, Salzberg SL. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome Res Epub ahead of print 2016
    [Google Scholar]
  153. Menzel P, Lee Ng K, Krogh A. Kaiju: fast and sensitive taxonomic classification for metagenomics. bioRxiv Epub ahead of print 2015
    [Google Scholar]
  154. Corvelo A, Clarke WE, Robine N, Zody MC. taxMaps: comprehensive and highly accurate taxonomic classification of short-read data in reasonable time. Genome Res. Epub ahead of print 2018
    [Google Scholar]
  155. Dröge J, Mchardy AC. Taxonomic binning of metagenome samples generated by next-generation sequencing technologies. Brief Bioinform Epub ahead of print 2012
    [Google Scholar]
  156. Segata N, Waldron L, Ballarini A, Narasimhan V, Jousson O et al. 1. Segata, N. et al. metagenomic microbial community profiling using unique clade-specific marker genes. nat. methods 9, 811–4 (2012).Metagenomic microbial community profiling using unique clade-specific marker genes. Nat Methods. Epub ahead of print 2012
    [Google Scholar]
  157. Truong DT, Franzosa EA, Tickle TL, Scholz M, Weingart G et al. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nat Methods 2015; 12:902–903 [CrossRef][PubMed]
    [Google Scholar]
  158. SH Y, Siddle KJ, Park DJ, Sabeti PC. Benchmarking Metagenomics tools for taxonomic classification. Cell 2019
    [Google Scholar]
  159. Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 2010; 26:2460–2461 [CrossRef]
    [Google Scholar]
  160. Kent WJ. BLAT-the BLAST-like alignment tool. Genome Res 2002; 12:656–664 [CrossRef]
    [Google Scholar]
  161. Zhao Y, Tang H, Ye Y. RAPSearch2: a fast and memory-efficient protein similarity search tool for next-generation sequencing data. Bioinformatics 2012; 28:125–126 [CrossRef]
    [Google Scholar]
  162. Westbrook A, Ramsdell J, Schuelke T, Normington L, Bergeron RD et al. PALADIN: protein alignment for functional profiling whole metagenome shotgun data. Bioinformatics 2017; 33:1473–1478 [CrossRef]
    [Google Scholar]
  163. Zhong C, Yang Y, Yooseph S. GRASP2: fast and memory-efficient gene-centric assembly and homolog search for metagenomic sequencing data. BMC Bioinformatics 2019; 20:276 [CrossRef]
    [Google Scholar]
  164. Sharifi F, Ye Y. From gene annotation to function prediction for metagenomics. In: Methods in Molecular Biology; Epub ahead of print 2017
    [Google Scholar]
  165. Ye Y, Doak TG. A parsimony approach to biological pathway reconstruction/inference for genomes and metagenomes. PLoS Comput Biol 2009; 5:e1000465 [CrossRef]
    [Google Scholar]
  166. Brown SM, Chen H, Hao Y, Laungani BP, Ali TA et al. MGS-Fast: metagenomic shotgun data fast annotation using microbial gene catalogs. Gigascience 2019; 8: [CrossRef]
    [Google Scholar]
  167. Franzosa EA, McIver LJ, Rahnavard G, Thompson LR, Schirmer M et al. Species-level functional profiling of metagenomes and metatranscriptomes. Nat Methods 2018; 15:962–968 [CrossRef]
    [Google Scholar]
  168. Nazeen S, Yu YW, Berger B. Carnelian uncovers hidden functional patterns across diverse study populations from whole metagenome sequencing reads. Genome Biol 2020; 21: [CrossRef]
    [Google Scholar]
  169. Arango-Argoty G, Singh G, Heath LS, Pruden A, Xiao W et al. MetaStorm: a public resource for customizable metagenomics annotation. PLoS One 2016; 11:e0162442 [CrossRef]
    [Google Scholar]
  170. Nayfach S, Bradley PH, Wyman SK, Laurent TJ, Williams A et al. Automated and accurate estimation of gene family abundance from shotgun metagenomes. PLoS Comput Biol Epub ahead of print 2015
    [Google Scholar]
  171. Simmonds P, Adams MJ, Benk M, Breitbart M, Brister JR et al. Consensus statement: virus taxonomy in the age of metagenomics. Nat Rev Microbiol. Epub ahead of print 2017
    [Google Scholar]
  172. Simmonds P. Methods for virus classification and the challenge of incorporating metagenomic sequence data. Journal of General Virology 2015; 96:1193–1206 [CrossRef]
    [Google Scholar]
  173. Fox G, Stackebrandt E, Hespell R, Gibson J, Maniloff J et al. The phylogeny of prokaryotes. Science 1980; 209:457–463 [CrossRef]
    [Google Scholar]
  174. Woese CR, Fox GE. Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc Natl Acad Sci U S A 1977; 74:5088–5090 [CrossRef]
    [Google Scholar]
  175. Strzelecka J. Genetic and functional diversity of bacterial microbiome in soils with long term impacts of petroleum hydrocarbons; 2018; 91–17
  176. Bruno A, Sandionigi A, Bernasconi M, Panio A, Labra M et al. Changes in the drinking water microbiome: effects of water treatments along the flow of two drinking water treatment plants in a urbanized area, Milan (Italy). Front Microbiol 2018; 9:1–12 [CrossRef]
    [Google Scholar]
  177. Fish KE, Boxall JB. Biofilm Microbiome (Re)Growth Dynamics in Drinking Water Distribution Systems Are Impacted by Chlorine Concentration. Front Microbiol 2018; 9:1–21 [CrossRef]
    [Google Scholar]
  178. Bergelson J, Mittelstrass J, Horton MW. Characterizing both bacteria and fungi improves understanding of the Arabidopsis root microbiome. Sci Rep 2019; 9:1–11 [CrossRef]
    [Google Scholar]
  179. Schmitt S, Tsai P, Bell J, Fromont J, Ilan M et al. Assessing the complex sponge microbiota: core, variable and species-specific bacterial communities in marine sponges. Isme J 2012; 6:564–576 [CrossRef]
    [Google Scholar]
  180. Lu D, Tiezzi F, Schillebeeckx C, McNulty NP, Schwab C et al. Host contributes to longitudinal diversity of fecal microbiota in swine selected for lean growth. Microbiome 2018; 6:1–15 [CrossRef]
    [Google Scholar]
  181. Harris VC, Haak BW, Handley SA, Van LEMM et al. Effect of antibiotic-mediated microbiome modulation on rotavirus vaccine immunogenicity; 2018; 197–207
  182. Nearing JT, Connors J, Whitehouse S, Van Limbergen J, Macdonald T et al. Infectious complications are associated with alterations in the gut microbiome in pediatric patients with acute lymphoblastic leukemia. Front Cell Infect Microbiol 2019; 9:1–14 [CrossRef]
    [Google Scholar]
  183. Zarul M, Zoqratt H, Wei W, Eng H, Thai BT et al. Microbiome analysis of Pacific white shrimp gut and rearing water from Malaysia and Vietnam: implications for aquaculture research and management; 2018; 1–22
  184. Mukherjee C, Beall CJ, Griffen AL, Leys EJ. High-resolution ISR amplicon sequencing reveals personalized oral microbiome; 2018; 1–15
  185. Thompson LR, Sanders JG, McDonald D, Amir A, Ladau J et al. A communal catalogue reveals Earth’s multiscale microbial diversity. Epub ahead of print 2017
  186. Kantor RS, Miller SE, Nelson KL, Paul CJ, Nelson KL. The water microbiome through a pilot scale advanced treatment facility for direct potable reuse. Front Microbiol 2019; 10:1–15 [CrossRef]
    [Google Scholar]
  187. Rosen MJ, Callahan BJ, Fisher DS, Holmes SP. Denoising PCR-amplified metagenome data. BMC Bioinformatics 2012; 13: [CrossRef]
    [Google Scholar]
  188. Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat Biotechnol 2019; 37
  189. Nearing JT, Douglas GM, Comeau AM, Langille MGI. Denoising the Denoisers: an independent evaluation of microbiome sequence error-correction approaches. PeerJ 2018; 6:e5364–22 [CrossRef][PubMed]
    [Google Scholar]
  190. Wang Q, Garrity GM, Tiedje JM, Cole JR. Naïve Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. Epub ahead of print 2007
    [Google Scholar]
  191. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. Epub ahead of print 1990
    [Google Scholar]
  192. Bokulich NA, Kaehler BD, Rideout JR, Dillon M, Bolyen E et al. Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin. Microbiome 2018; 6:1–17 [CrossRef]
    [Google Scholar]
  193. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J et al. BLAST+: architecture and applications. BMC Bioinformatics 2009; 10:421
  194. Rognes T, Flouri T, Nichols B, Quince C, Mahé F. VSEARCH: a versatile open source tool for metagenomics. PeerJ 2016; 4:e2584–22 [CrossRef][PubMed]
    [Google Scholar]
  195. DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL et al. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol 2006; 72:5069–5072 [CrossRef]
    [Google Scholar]
  196. Cole JR, Wang Q, Fish JA, Chai B, McGarrell DM et al. Ribosomal database project: data and tools for high throughput rRNA analysis. Nucleic Acids Res 2014; 42:D633–D642 [CrossRef]
    [Google Scholar]
  197. Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W et al. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res 2007; 35:7188–7196 [CrossRef]
    [Google Scholar]
  198. Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M et al. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol 2009; 75:7537–7541 [CrossRef]
    [Google Scholar]
  199. del Campo J, Kolisko M, Boscaro V, Santoferrara LF, Nenarokov S et al. EukRef: phylogenetic curation of ribosomal RNA to enhance understanding of eukaryotic diversity and distribution; 2018; 1–14
  200. Bass D, De VC, Bittner L, Boutte C, Decelle J et al. The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote small sub-unit rRNA sequences with curated taxonomy. Nucleic Acids Res 2013; 41:D597–D604
  201. Darling KF, FR E, Douady CJ, Escarguel G, De T et al. PFR2: a curated database of planktonic foraminifera 18S ribosomal DNA as a resource for studies of plankton ecology, biogeography and evolution; 2015; 491472–1485
  202. del Campo J, Kolisko M, Boscaro V, Santoferrara LF, Nenarokov S et al. EukRef: phylogenetic curation of ribosomal RNA to enhance understanding of eukaryotic diversity and distribution; 2018; 1–14
  203. Nilsson RH, Larsson K-H, Taylor AFS, Bengtsson-Palme J, Jeppesen TS et al. The unite database for molecular identification of fungi: handling dark taxa and parallel taxonomic classifications. Nucleic Acids Res 2019; 47:D259–D264 [CrossRef]
    [Google Scholar]
  204. Deshpande V, Wang Q, Greenfield P, Charleston M, Porras-Alfaro A et al. Fungal identification using a Bayesian classifier and the Warcup training set of internal transcribed spacer sequences. Mycologia 2016; 108:1–5 [CrossRef]
    [Google Scholar]
  205. Pollock J, Glendinning L, Wisedchanwet T, Watson M. The madness of microbiome: attempting to find consensus "best practice" for 16S microbiome studies. Appl Environ Microbiol 2018
  206. Ritari J, Salojärvi J, Lahti L, de Vos WM. Improved taxonomic assignment of human intestinal 16S rRNA sequences by a dedicated reference database. BMC Genomics 2015; 16:1–10 [CrossRef]
    [Google Scholar]
  207. Tsuchiya Y, Kiriyama C, Itoh M, Morisaki H, Okuda S. Virtual metagenome reconstruction from 16S rRNA gene sequences. Nat Commun Epub ahead of print 2012
    [Google Scholar]
  208. Langille MGI, Zaneveld J, Caporaso JG, McDonald D, Knights D et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol. Epub ahead of print 2013
    [Google Scholar]
  209. Jun S, Robeson MS, Hauser LJ, Schadt CW, Gorin AA. PanFP: pangenome-based functional profiles for microbial communities. BMC Res Notes 2015; 1–7
    [Google Scholar]
  210. Aßhauer KP, Wemheuer B, Daniel R, Meinicke P. Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data. Bioinformatics 2015; 31:2882–2884
  211. Douglas GM, Maffei VJ, Zaneveld J, Yurgel SN, Brown JR et al. PICRUSt2: an improved and extensible approach for metagenome inference. bioRxiv.
    [Google Scholar]
  212. Paulson JN, Stine OC, Bravo HC, Pop M. Differential abundance analysis for microbial marker-gene surveys. Nat Methods 2013; 10:1200–1202 [CrossRef][PubMed]
    [Google Scholar]
  213. Lovell D, Pawlowsky-Glahn V, Egozcue JJ, Marguerat S, Bähler J. Proportionality: a valid alternative to correlation for relative data. PLoS Comput Biol 2015; 11:e1004075 [CrossRef][PubMed]
    [Google Scholar]
  214. Weiss S, Xu ZZ, Peddada S, Amir A, Bittinger K et al. Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome 2017; 5:1–18 [CrossRef]
    [Google Scholar]
  215. Bullard JH, Purdom E, Hansen KD, Dudoit S. Evaluation of statistical methods for normalization and differential expression in mRNA-seq experiments. BMC Bioinformatics 2010; 11:94 [CrossRef]
    [Google Scholar]
  216. McMurdie PJ, Holmes S. Waste not, want not: why rarefying microbiome data is inadmissible. PLoS Comput Biol 2014; 10:e1003531 [CrossRef]
    [Google Scholar]
  217. Kurtz ZD, Müller CL, Miraldi ER, Littman DR, Blaser MJ et al. Sparse and compositionally robust inference of microbial ecological networks. PLoS Comput Biol. Epub ahead of print 2015
    [Google Scholar]
  218. Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol 2010; 11:R106 [CrossRef]
    [Google Scholar]
  219. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010; 26:139–140 [CrossRef]
    [Google Scholar]
  220. Badri M, Kurtz ZD, Müller CL, Bonneau R. Normalization methods for microbial abundance data strongly affect correlation estimates. bioRxiv. Epub ahead of print 2018
    [Google Scholar]
  221. Pereira MB, Wallroth M, Jonsson V, Kristiansson E. Comparison of normalization methods for the analysis of metagenomic gene abundance data. BMC Genomics 2018; 19:274 [CrossRef]
    [Google Scholar]
  222. Farrelly V, Rainey FA, Stackebrandt E. Effect of genome size and rrn gene copy number on PCR amplification of 16S rRNA genes from a mixture of bacterial species. Appl Environ Microbiol 1995; 61:2798–2801
  223. Acinas SG, Marcelino LA, Klepac-Ceraj V, Polz MF. Divergence and redundancy of 16S rRNA sequences in genomes with multiple rrn operons. J Bacteriol 2004; 186:2629–2635
  224. Stoddard SF, Smith BJ, Hein R, Roller BRK. rrnDB: improved tools for interpreting rRNA gene abundance in bacteria and archaea and a new foundation for future development. Nucleic Acids Res 2015; 43:D593–D598
  225. Angly FE, Dennis PG, Skarshewski A, Vanwonterghem I, Hugenholtz P et al. CopyRighter: a rapid tool for improving the accuracy of microbial community profiles through lineage-specific gene copy number correction; 2014; 1–13
  226. Kembel SW, Wu M, Eisen JA, Green JL. Incorporating 16S gene copy number information improves estimates of microbial diversity and abundance. PLoS Comput Biol 2012; 8:e1002743 [CrossRef]
    [Google Scholar]
  227. Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics 2010; 11:538 [CrossRef]
    [Google Scholar]
  228. Caron DA, Countway PD, Jones AC, Kim DY, Schnetzer A. Marine Protistan diversity.
  229. Gong W, Marchetti A. Estimation of 18S gene copy number in marine eukaryotic plankton using a next-generation sequencing approach; 2019; 6:1–5
  230. Louca S, Doebeli M, Parfrey LW. Correcting for 16S rRNA gene copy numbers in microbiome surveys remains an unsolved problem; 2018; 1–12
  231. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods 2010; 7:335–336 [CrossRef]
    [Google Scholar]
  232. Meyer F, Paarmann D, D'Souza M, Olson R, Glass EM et al. The metagenomics RAST server a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics 2008; 9:386 [CrossRef]
    [Google Scholar]
  233. Nilakanta H, Drews KL, Firrell S, Foulkes MA, Jablonski KA. A review of software for analyzing molecular sequences. BMC Res Notes 2014; 7:830 [CrossRef]
    [Google Scholar]
  234. Hardge K, Neuhaus S, Kilias ES, Wolf C, Metfies K et al. Impact of sequence processing and taxonomic classification approaches on eukaryotic community structure from environmental samples with emphasis on diatoms. Mol Ecol Resour 2018; 18:204–216 [CrossRef]
    [Google Scholar]
  235. Halwachs B, Madhusudhan N, Krause R, Nilsson RH, Moissl-Eichinger C et al. Critical issues in mycobiota analysis. Front Microbiol 2017; 8:180 [CrossRef]
    [Google Scholar]
  236. Chao A. Nonparametric estimation of the number of classes in a population. Scand J Stat 1984; 11:265–270
    [Google Scholar]
  237. Chao A, Hwang WH, Chen YC, Kuo CY. Estimating the number of shared species in two communities. Stat Sin
    [Google Scholar]
  238. Shannon CE. A mathematical theory of communication. Bell Syst Tech J 1948; 27:379–423 [CrossRef]
    [Google Scholar]
  239. Simpson EH. Measurement of diversity. Nature 1949; 163:688 [CrossRef]
    [Google Scholar]
  240. Faith DP. Conservation evaluation and phylogenetic diversity. Biol Conserv 1992; 61:1–10
  241. Bray JR, Curtis JT. An Ordination of the upland forest communities of southern Wisconsin. Ecol Monogr Epub ahead of print 1957
    [Google Scholar]
  242. Real R, Vargas JM. The probabilistic basis of Jaccard’s index of similarity. Syst Biol 1996; 45:380–385
  243. Lozupone CA, Hamady M, Kelley ST, Knight R. Quantitative and qualitative diversity measures lead to different insights into factors that structure microbial communities. Appl Environ Microbiol 2007; 73:1576–1585 [CrossRef]
    [Google Scholar]
  244. Goodrich JK, Di RSC, Poole AC, Koren O, William A et al. Conducting a microbiome study; 2016; 158:250–262
  245. Knight R, Vrbanac A, Taylor BC, Aksenov A, Callewaert C et al. Best practices for analysing microbiomes. Nat Rev Microbiol 2018; 16:410–422 [CrossRef]
    [Google Scholar]
  246. Koren O, Knights D, Gonzalez A, Waldron L, Segata N et al. A guide to enterotypes across the human body: meta-analysis of microbial community structures in human microbiome datasets. PLoS Comput Biol 2013; 9:e1002863 [CrossRef]
    [Google Scholar]
  247. McMurdie PJ, Holmes S. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS One 2013; 8:e61217 [CrossRef]
    [Google Scholar]
  248. Lahti L, Shetty S, Blake T. Tools for microbiome analysis in R. Microbiome Packag Version 099.
  249. Oksanen J. Multivariate analysis of ecological communities in R: vegan tutorial. R Doc 2015; 43:
    [Google Scholar]
  250. Wilcoxon F. Individual comparisons by ranking methods. Biometrics Bull 1945; 1:80–83
  251. Kruskal WH, Wallis WA. Use of ranks in One-Criterion variance analysis. J Am Stat Assoc 1952; 47:583–621 [CrossRef]
    [Google Scholar]
  252. Anderson MJ. A new method for non-parametric multivariate analysis of variance. Austral Ecol 2001; 26:32–46
  253. Clarke KR. Non-Parametric multivariate analyses of changes in community structure. Austral Ecol 1993; 18:117–143 [CrossRef]
    [Google Scholar]
  254. Mantel N. The detection of disease clustering and a generalized regression approach. Cancer Res 1967; 27:209–220
    [Google Scholar]
  255. Anderson MJ, Walsh DCI. PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: what null hypothesis are you testing? Ecol Monogr 2013; 83:557–574 [CrossRef]
    [Google Scholar]
  256. Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R et al. Analysis of composition of microbiomes: a novel method for studying microbial composition. Microb Ecol Heal Dis 2015; 26:1–7 [CrossRef]
    [Google Scholar]
  257. Segata N, Izard J, Waldron L, Gevers D, Miropolsky L et al. Metagenomic biomarker discovery and explanation. Genome Biol 2011; 12:R60 [CrossRef]
    [Google Scholar]
  258. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-Seq data with DESeq2. Genome Biol 2014; 15:550 [CrossRef]
    [Google Scholar]
  259. Law CW, Chen Y, Shi W, Smyth GK. voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol 2014; 15:R29 [CrossRef]
    [Google Scholar]
  260. Perez-Cobas AE, Artacho A, Ott SJ, Moya A, Gosalbes MJ et al. Structural and functional changes in the gut microbiota associated to Clostridium difficile infection. Front Microbiol 2014; 5:335
    [Google Scholar]
  261. Leung MHY, Chan KCK, Lee PKH. Skin fungal community and its correlation with bacterial community of urban Chinese individuals. Microbiome 2016; 4:1–15 [CrossRef]
    [Google Scholar]
  262. Barberán A, Bates ST, Casamayor EO, Fierer N. Using network analysis to explore co-occurrence patterns in soil microbial communities. ISME J 2012; 6:343–351 [CrossRef]
    [Google Scholar]
  263. Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput Biol 2012; 8:e1002687–11 [CrossRef]
    [Google Scholar]
  264. Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal Complex Systems 2006; 1695
    [Google Scholar]
  265. Qu K, Guo F, Liu X, Lin Y, Zou Q. Application of machine learning in microbiology. Front Microbiol 2019; 10: [CrossRef]
    [Google Scholar]
  266. Zhou Y-H, Gallins P. A review and tutorial of machine learning methods for microbiome host trait prediction. Front Genet 2019; 10: [CrossRef]
    [Google Scholar]
  267. Breiman L. Random forests. Mach Learn 2001
    [Google Scholar]
  268. Subramanian S, Huq S, Yatsunenko T, Haque R, Alam MA et al. Persistent gut microbiota immaturity in malnourished Bangladeshi children. Nature 2014; 510:417–421
  269. Thompson J, Johansen R, Dunbar J, Munsky B. Machine learning to predict microbial community functions: an analysis of dissolved organic carbon from litter decomposition. PLoS One 2019; 14:e0215502 [CrossRef]
    [Google Scholar]
  270. Oudah M, Henschel A. Taxonomy-aware feature engineering for microbiome classification. BMC Bioinformatics 2018; 19:227 [CrossRef][PubMed]
    [Google Scholar]