Abstract

A homoplasy is a nucleotide identity resulting from a process other than inheritance from a common ancestor. Importantly, by distorting the ancestral relationships between nucleotide sequences, homoplasies can change the structure of the phylogeny. Homoplasies can emerge naturally, especially under high selection pressures and/or high mutation rates, or be created during the generation and processing of sequencing data. Identification of homoplasies is critical, both to understand their influence on the analyses of phylogenetic data and to allow an investigation into how they arose. Here we present HomoplasyFinder, a Java application that can be used as a stand-alone tool or within the statistical programming environment R. Both within R and as a stand-alone Java tool, HomoplasyFinder is shown to identify, automatically and quickly, any homoplasies present in simulated and real phylogenetic data. HomoplasyFinder can easily be incorporated into existing analysis pipelines, either within or outside of R, allowing the user to quickly identify homoplasies to inform downstream analyses and interpretation.
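To make the definition concrete, the following minimal Python sketch flags a homoplasious site on a small rooted binary tree. It uses Fitch parsimony to count the minimum number of state changes a site requires on the tree, and the consistency index to compare that count with the minimum possible for the observed states. This is an illustration under stated assumptions, not necessarily the procedure HomoplasyFinder implements: the tree encoding, the function names `fitch_changes` and `consistency_index`, and the four-tip example data are all hypothetical.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical encoding: a tree is a tip name (str) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_changes(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Fitch parsimony for one site on a rooted binary tree.

    Returns the candidate state set at this node and the minimum
    number of state changes required in the subtree below it.
    """
    if isinstance(tree, str):
        # Tip: the observed nucleotide, no changes needed.
        return {states[tree]}, 0
    left_set, left_changes = fitch_changes(tree[0], states)
    right_set, right_changes = fitch_changes(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change here.
        return shared, left_changes + right_changes
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1


def consistency_index(tree: Tree, states: Dict[str, str]) -> float:
    """Minimum possible changes divided by minimum changes on the tree.

    A value below 1 means the site needs more changes on this tree
    than its number of states requires, i.e. the site is homoplasious.
    """
    _, changes = fitch_changes(tree, states)
    if changes == 0:
        # Invariant site: treated as fully consistent by convention here.
        return 1.0
    distinct_states = len(set(states.values()))
    return (distinct_states - 1) / changes


# Hypothetical example: four tips on the tree ((A,B),(C,D)).
example_tree: Tree = (("A", "B"), ("C", "D"))
consistent_site = {"A": "G", "B": "G", "C": "T", "D": "T"}
homoplasious_site = {"A": "G", "B": "T", "C": "G", "D": "T"}

print(consistency_index(example_tree, consistent_site))    # 1.0
print(consistency_index(example_tree, homoplasious_site))  # 0.5
```

In the second site, the nucleotide T appears in tips B and D, which sit in different clades, so the tree requires two changes where one would suffice for two states. That shared identity without shared ancestry is what the consistency index below 1 signals.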

/content/journal/mgen/10.1099/mgen.0.000245
2019-01-21
2020-01-18