Abstract
The occurrence of multiple strains of a bacterial pathogen such as M. tuberculosis or C. difficile within a single human host, referred to as a mixed infection, has important implications for both healthcare and public health. However, methods for detecting mixed infections from whole-genome sequencing (WGS) data, and especially for determining the proportions and identities of the underlying strains, have been limited. In this paper we introduce SplitStrains, a novel method for addressing these challenges. Grounded in a rigorous statistical model, SplitStrains not only outperforms existing methods in proportion estimation on both simulated and real M. tuberculosis data, but also successfully determines the identities of the underlying strains. We conclude that SplitStrains is a powerful addition to the existing toolkit of analytical methods for bacterial pathogen data and holds the promise of enabling previously inaccessible conclusions in public health microbiology.
Funding
- Natural Sciences and Engineering Research Council of Canada (Award Discovery)
  - Principal Award Recipient: Maxwell Libbrecht
- MRC Centre for Global Infectious Disease Analysis (Award MR/R015600/1)
  - Principal Award Recipient: Leonid Chindelevitch
- Alfred P. Sloan Foundation (Award FG-2016-6392)
  - Principal Award Recipient: Leonid Chindelevitch
- Genome Canada (Award Machine Learning Methods to Predict Drug Resistance in Pathogenic Bacteria)
  - Principal Award Recipient: Leonid Chindelevitch
- CANSSI (Award Statistical methods for challenging problems in public health microbiology)
  - Principal Award Recipient: Leonid Chindelevitch