
Abstract

Results published in an article by Poore et al. (Nature 2020;579:567–574) suggested that machine learning models can almost perfectly distinguish between tumour types based on their microbial composition. Whilst we believe that there is the potential for microbial composition to be used in this manner, we have concerns with the paper that make us question the certainty of the conclusions drawn. We believe there are issues in four areas: the contribution of contamination, the handling of batch effects, false positive classifications, and limitations in the machine learning approaches used. These issues make it difficult to determine whether the authors have identified true biological signal and how robust these models would be in use as clinical biomarkers. We commend Poore et al. on their approach to open data and reproducibility, which has enabled this analysis. We hope that this discourse assists the future development of machine learning models and hypothesis generation in microbiome research.

Funding
This study was supported by the:
  • Bob Champion Cancer Trust (Award NA)
    • Principal Award Recipient: Colin S. Cooper
  • Prostate Cancer UK (Award MA-ETNA19-003)
    • Principal Award Recipient: Daniel S. Brewer
  • Big C Cancer Charity (Award 16-09R)
    • Principal Award Recipient: Daniel S. Brewer
This is an open-access article distributed under the terms of the Creative Commons Attribution License. This article was made open access via a Publish and Read agreement between the Microbiology Society and the corresponding author’s institution.
/content/journal/mgen/10.1099/mgen.0.001088
2023-08-09
2024-05-10

References

  1. Poore GD, Kopylova E, Zhu Q, Carpenter C, Fraraccio S et al. Microbiome analyses of blood and tissues suggest cancer diagnostic approach. Nature 2020; 579:567–574
  2. Whalen S, Schreiber J, Noble WS, Pollard KS. Navigating the pitfalls of applying machine learning in genomics. Nat Rev Genet 2022; 23:169–181
  3. Cotmore SF, Agbandje-McKenna M, Chiorini JA, Mukha DV, Pintel DJ et al. The family Parvoviridae. Arch Virol 2014; 159:1239–1247
  4. Hosoya S, Adachi K, Kasai H. Thalassomonas actiniarum sp. nov. and Thalassomonas haliotis sp. nov., isolated from marine animals. Int J Syst Evol Microbiol 2009; 59:686–690
  5. Liu T, Zhang Y, Zhang X, Zhou L, Meng C et al. Leucothrix sargassi sp. nov., isolated from a marine alga [Sargassum natans (L.) Gaillon]. Int J Syst Evol Microbiol 2019; 69:3857–3862
  6. Wittmann J, Klumpp J, Moreno Switt AI, Yagubi A, Ackermann H-W et al. Taxonomic reassessment of N4-like viruses using comparative genomics and proteomics suggests a new subfamily - “Enquartavirinae”. Arch Virol 2015; 160:3053–3062
  7. Merabishvili M, Vandenheuvel D, Kropinski AM, Mast J, De Vos D et al. Characterization of newly isolated lytic bacteriophages active against Acinetobacter baumannii. PLoS One 2014; 9:e104853
  8. Ringelhan M, McKeating JA, Protzer U. Viral hepatitis and liver cancer. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160274
  9. Zapatka M, Borozan I, Brewer DS, Iskar M, Grundhoff A et al. The landscape of viral associations in human cancers. Nat Genet 2020; 52:320–330
  10. Salter SJ, Cox MJ, Turek EM, Calus ST, Cookson WO et al. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol 2014; 12:87
  11. de Goffau MC, Lager S, Sovio U, Gaccioli F, Cook E et al. Human placenta has no microbiome but can contain potential pathogens. Nature 2019; 574:329–334
  12. Bedarf JR, Beraza N, Khazneh H, Özkurt E, Baker D et al. Much ado about nothing? Off-target amplification can lead to false-positive bacterial brain microbiome detection in healthy and Parkinson’s disease individuals. Microbiome 2021; 9:75
  13. de Goffau MC, Lager S, Salter SJ, Wagner J, Kronbichler A et al. Recognizing the reagent microbiome. Nat Microbiol 2018; 3:851–853
  14. de Goffau MC, Charnock-Jones DS, Smith GCS, Parkhill J. Batch effects account for the main findings of an in utero human intestinal bacterial colonization study. Microbiome 2021; 9:6
  15. Gihawi A, Rallapalli G, Hurst R, Cooper CS, Leggett RM et al. SEPATH: benchmarking the search for pathogens in human tissue whole genome sequence data leads to template pipelines. Genome Biol 2019; 20:208
  16. Dohlman AB, Arguijo Mendoza D, Ding S, Gao M, Dressman H et al. The cancer microbiome atlas: a pan-cancer comparative analysis to distinguish tissue-resident microbiota from contaminants. Cell Host Microbe 2021; 29:281–298
  17. Gerber GK. The dynamic microbiome. FEBS Lett 2014; 588:4131–4139
  18. McMurdie PJ. Normalization of microbiome profiling data. Methods Mol Biol 2018; 1849:143–168
  19. Paper W, Jahn U, Hohn MJ, Kronner M, Näther DJ et al. Ignicoccus hospitalis sp. nov., the host of “Nanoarchaeum equitans”. Int J Syst Evol Microbiol 2007; 57:803–808
  20. Eisenhofer R, Minich JJ, Marotz C, Cooper A, Knight R et al. Contamination in low microbial biomass microbiome studies: issues and recommendations. Trends Microbiol 2019; 27:105–117
  21. Knight R, Vrbanac A, Taylor BC, Aksenov A, Callewaert C et al. Best practices for analysing microbiomes. Nat Rev Microbiol 2018; 16:410–422
  22. Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI et al. Bacterial community variation in human body habitats across space and time. Science 2009; 326:1694–1697
  23. Gihawi A, Cooper CS, Brewer DS. Caution regarding the specificities of pan-cancer microbial structure. Bioinformatics 2023
  24. Sepich-Poore GD, Kopylova E, Zhu Q, Carpenter C, Fraraccio S et al. Reply to: caution regarding the specificities of pan-cancer microbial structure. Bioinformatics 2023
  25. Breitwieser FP, Lu J, Salzberg SL. A review of methods and databases for metagenomic classification and assembly. Brief Bioinform 2019; 20:1125–1136
  26. Berg G, Rybakova D, Fischer D, Cernava T, Vergès M-CC et al. Microbiome definition re-visited: old concepts and new challenges. Microbiome 2020; 8:119
  27. Wickham H, Averick M, Bryan J, Chang W, McGowan L et al. Welcome to the Tidyverse. J Open Source Softw 2019; 4:1686
  28. Kassambara A. ggpubr: “ggplot2” Based Publication Ready Plots; 2022
  29. Clarke E, Sherrill-Mix S, Dawson C. ggbeeswarm: Categorical Scatter (Violin Point) Plots; 2022
  30. Wilke C. cowplot: Streamlined Plot Theme and Plot Annotations for “ggplot2”; 2020
  31. Millard SP. EnvStats: An R Package for Environmental Statistics. New York, NY: Springer; 2013
  32. Parks DH, Chuvochina M, Rinke C, Mussig AJ, Chaumeil PA et al. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res 2022; 50:D785–D794
  33. Yu H, Qi S, Chang Z, Rong Q, Akinyemi IA et al. Complete genome sequence of a novel velarivirus infecting areca palm in China. Arch Virol 2015; 160:2367–2370
  34. Rabenstein F, Seifers DL, Schubert J, French R, Stenger DC. Phylogenetic relationships, strain diversity and biogeography of tritimoviruses. J Gen Virol 2002; 83:895–906
  35. Roundy CM, Azar SR, Rossi SL, Weaver SC, Vasilakis N. Insect-specific viruses: a historical overview and recent developments. Adv Virus Res 2017; 98:119–146
  36. Short SM, Staniewski MA, Chaban YV, Long AM, Wang D. Diversity of viruses infecting eukaryotic algae. Curr Issues Mol Biol 2020; 39:29–62
  37. Webster DE, Beck DL, Rabenstein F, Forster RLS, Guy PL. An improved polyclonal antiserum for detecting Ryegrass mosaic rymovirus. Arch Virol 2005; 150:1921–1926
  38. Nedashkovskaya OI, Vancanneyt M, Kim SB, Han J, Zhukova NV et al. Salinimicrobium marinum sp. nov., a halophilic bacterium of the family Flavobacteriaceae, and emended descriptions of the genus Salinimicrobium and Salinimicrobium catena. Int J Syst Evol Microbiol 2010; 60:2303–2306
Supplements

Supplementary material 1
