Abstract

The accessory genes of prokaryote and eukaryote pangenomes accumulate through horizontal gene transfer, differential gene loss, and the effects of selection and drift. We have developed Coinfinder, a software program that assesses whether sets of homologous genes (gene families) in pangenomes associate or dissociate with each other (i.e. are ‘coincident’) more often than would be expected by chance. Coinfinder uses a user-supplied phylogenetic tree to assess the lineage-dependence (i.e. the phylogenetic distribution) of each accessory gene. This allows Coinfinder to focus on coincident gene pairs whose joint presence is not simply a consequence of both genes occurring in the same clade, but which instead appear together more often than expected across the phylogeny. Coinfinder is implemented in C++, Python3 and R and is freely available under the GNU license from https://github.com/fwhelan/coinfinder.
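
To make the idea of testing coincidence against chance concrete, the following is a minimal sketch in Python, not Coinfinder's own code. It assumes a one-sided binomial test in which the expected co-occurrence rate of two genes is the product of their individual frequencies; the abstract does not state which test Coinfinder uses, so that choice is an assumption. The function names and the toy presence/absence vectors are hypothetical, and the tree-based lineage-dependence step is omitted.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binom_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


def coincidence(presence_a, presence_b):
    """Compare observed co-occurrence of two genes with a chance expectation.

    Each argument is a list of 0/1 flags, one per genome, in the same order.
    Assumption: under the null, the genes occur independently, so the chance
    of both being present in a genome is freq(a) * freq(b).
    """
    n = len(presence_a)
    observed = sum(1 for a, b in zip(presence_a, presence_b) if a and b)
    expected_rate = (sum(presence_a) / n) * (sum(presence_b) / n)
    return {
        "observed": observed,
        "expected": expected_rate * n,
        # Small value: the pair co-occurs more often than chance predicts.
        "p_associate": binom_upper_tail(observed, n, expected_rate),
        # Small value: the pair co-occurs less often than chance predicts.
        "p_dissociate": binom_lower_tail(observed, n, expected_rate),
    }


# Hypothetical toy data: presence (1) or absence (0) across eight genomes.
gene_x = [1, 1, 1, 1, 0, 0, 0, 0]
gene_y = [1, 1, 1, 1, 0, 0, 0, 0]  # always found with gene_x
gene_z = [0, 0, 0, 0, 1, 1, 1, 1]  # never found with gene_x

together = coincidence(gene_x, gene_y)
apart = coincidence(gene_x, gene_z)
```

A test of this kind cannot tell whether two genes co-occur only because they sit in the same clade, which is the gap the lineage-dependence assessment described above is meant to close. A real analysis over many gene pairs would also need a multiple-testing correction.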

Funding
This study was supported by:
  • Fiona Jane Whelan, H2020 Marie Skłodowska-Curie Actions (Award 793818)
  • James O McInerney, Biotechnology and Biological Sciences Research Council (Award BB/N018044/2)

DOI: 10.1099/mgen.0.000338
2020-02-25
2020-06-04
