- Volume 2, Issue 8, 2016
- Research Paper
-
- Microbial evolution and epidemiology
- Phylogeography
-
-
Massive dispersal of Coxiella burnetii among cattle across the United States
Q-fever is an underreported disease caused by the bacterium Coxiella burnetii, which is highly infectious and has the ability to disperse great distances. It is a completely clonal pathogen with low genetic diversity and requires whole-genome analysis to identify discriminating features among closely related isolates. C. burnetii, and in particular one genotype (ST20), is commonly found in cow’s milk across the entire dairy industry of the USA. This single genotype dominance is suggestive of host-specific adaptation, rapid dispersal and persistence within cattle. We used a comparative genomic approach to identify SNPs for high-resolution and high-throughput genotyping assays to better describe the dispersal of ST20 across the USA. We genotyped 507 ST20 cow milk samples and discovered three subgenotypes, all of which were present across the entire country and over the complete time period studied. Only one of these subgenotypes was observed in a single dairy herd. The temporal and geographic distribution of these subgenotypes is consistent with a model of large-scale, rapid, frequent and continuous dissemination on a continental scale. The distribution of subgenotypes is not consistent with wind-based dispersal alone, and it is likely that animal husbandry and transportation practices, including pooling of milk from multiple herds, have also shaped the patterns. On the scale of an entire country, there appear to be few barriers to rapid, frequent and large-scale dissemination of the ST20 subgenotypes.
-
- Systems Microbiology
- Transcriptomics, proteomics, networks
-
-
DNA uptake sequences in Neisseria gonorrhoeae as intrinsic transcriptional terminators and markers of horizontal gene transfer
DNA uptake sequences are widespread throughout the Neisseria gonorrhoeae genome. These short, conserved sequences facilitate the exchange of endogenous DNA between members of the genus Neisseria. Often the DNA uptake sequences are present as inverted repeats that are able to form hairpin structures. It has been suggested previously that DNA uptake sequence inverted repeats present 3′ of genes play a role in rho-independent termination and attenuation. However, there is conflicting experimental evidence to support this role. The aim of this study was to determine the role of DNA uptake sequences in transcriptional termination. Both bioinformatics predictions, conducted using TransTermHP, and experimental evidence, from RNA-seq data, were used to determine which inverted repeat DNA uptake sequences are transcriptional terminators and in which direction. Here we show that DNA uptake sequences in the inverted repeat configuration occur in N. gonorrhoeae both where the DNA uptake sequence precedes the inverted version of the sequence and also, albeit less frequently, in reverse order. Due to their symmetrical configuration, inverted repeat DNA uptake sequences can potentially act as bi-directional terminators, therefore affecting transcription on both DNA strands. This work also provides evidence that gaps in DNA uptake sequence density in the gonococcal genome coincide with areas of DNA that are foreign in origin, such as prophage. This study differentiates for the first time, to our knowledge, between DNA uptake sequences that form intrinsic transcriptional terminators and those that do not, providing characteristic features within the flanking inverted repeat that can be identified.
-
- Microbe-Niche Interactions
- Pathogenesis
-
-
Integrated computational prediction and experimental validation identifies promiscuous T cell epitopes in the proteome of Mycobacterium bovis
The discovery of novel antigens is an essential requirement in devising new diagnostics or vaccines for use in control programmes against human tuberculosis (TB) and bovine tuberculosis (bTB). Identification of potential epitopes recognised by CD4+ T cells requires prediction of peptide binding to MHC class-II, an obligatory prerequisite for T cell recognition. To comprehensively prioritise potential MHC-II-binding epitopes from Mycobacterium bovis, the agent of bTB and zoonotic TB in humans, we integrated three binding prediction methods with the M. bovis proteome using a subset of human HLA alleles to approximate the binding of epitope-containing peptides to the bovine MHC class II molecule BoLA-DRB3. Two parallel strategies were then applied to filter the resulting set of binders: identification of the top-scoring binders or clusters of binders. Our approach was tested experimentally by assessing the capacity of predicted promiscuous peptides to drive interferon-γ secretion from T cells of M. bovis-infected cattle. Thus, 376 20-mer peptides were synthesised (270 predicted epitopes, 94 random peptides with low predictive scores and 12 positive controls of known epitopes). The results of this validation demonstrated significant enrichment (>24 %) of promiscuously recognised peptides predicted in our selection strategies, compared with randomly selected peptides with low prediction scores. Our strategy offers a general approach to the identification of promiscuous epitopes tailored to target populations where there is limited knowledge of MHC allelic diversity.
-
- Microbial communities
- Environmental
-
-
The electrically conductive pili of Geobacter species are a recently evolved feature for extracellular electron transfer
The electrically conductive pili (e-pili) of Geobacter sulfurreducens have environmental and practical significance because they can facilitate electron transfer to insoluble Fe(III) oxides; to other microbial species; and through electrically conductive biofilms. E-pili conductivity has been attributed to the truncated PilA monomer, which permits tight packing of aromatic amino acids to form a conductive path along the length of e-pili. In order to better understand the evolution and distribution of e-pili in the microbial world, type IVa PilA proteins from various Gram-negative and Gram-positive bacteria were examined with a particular emphasis on Fe(III)-respiring bacteria. E-pilin genes are primarily restricted to a tight phylogenetic group in the order Desulfuromonadales. The downstream gene in all but one of the Desulfuromonadales that possess an e-pilin gene is a gene previously annotated as ‘pilA–C’ that has characteristics suggesting that it may encode an outer-membrane protein. Other genes associated with pilin function are clustered with e-pilin and ‘pilA–C’ genes in the Desulfuromonadales. In contrast, in the few bacteria outside the Desulfuromonadales that contain e-pilin genes, the other genes required for pilin function may have been acquired through horizontal gene transfer. Of the 95 known Fe(III)-reducing micro-organisms for which genomes are available, 80 % lack e-pilin genes, suggesting that e-pili are just one of several mechanisms involved in extracellular electron transport. These studies provide insight into where and when e-pili are likely to contribute to extracellular electron transport processes that are biogeochemically important and involved in bioenergy conversions.
-
- Microbial evolution and epidemiology
- Communicable disease genomics
-
-
The diversity of Klebsiella pneumoniae surface polysaccharides
Klebsiella pneumoniae is considered an urgent health concern due to the emergence of multi-drug-resistant strains for which vaccination offers a potential remedy. Vaccines based on surface polysaccharides are highly promising but need to address the high diversity of surface-exposed polysaccharides, synthesized as O-antigens (lipopolysaccharide, LPS) and K-antigens (capsule polysaccharide, CPS), present in K. pneumoniae. We present a comprehensive and clinically relevant study of the diversity of O- and K-antigen biosynthesis gene clusters across a global collection of over 500 K. pneumoniae whole-genome sequences and the seroepidemiology of human isolates from different infection types. Our study defines the genetic diversity of O- and K-antigen biosynthesis cluster sequences across this collection, identifying sequences for known serotypes as well as identifying novel LPS and CPS gene clusters found in circulating contemporary isolates. Serotypes O1, O2 and O3 were most prevalent in our sample set, accounting for approximately 80 % of all infections. In contrast, K serotypes showed an order of magnitude higher diversity and differ among infection types. In addition we investigated a potential association of O or K serotypes with phylogenetic lineage, infection type and the presence of known virulence genes. K1 and K2 serotypes, which are associated with hypervirulent K. pneumoniae, were associated with a higher abundance of virulence genes and more diverse O serotypes compared to other common K serotypes.
-
- Genomic Methodologies
- Novel phylogenetic methods
-
-
Bayesian identification of bacterial strains from sequencing data
Rapidly assaying the diversity of a bacterial species present in a sample obtained from a hospital patient or an environmental source has become possible after recent technological advances in DNA sequencing. For several applications it is important to accurately identify the presence and estimate relative abundances of the target organisms from short sequence reads obtained from a sample. This task is particularly challenging when the set of interest includes very closely related organisms, such as different strains of pathogenic bacteria, which can vary considerably in terms of virulence, resistance and spread. Using advanced Bayesian statistical modelling and computation techniques we introduce a novel pipeline for bacterial identification that is shown to outperform the currently leading pipeline for this purpose. Our approach enables fast and accurate sequence-based identification of bacterial strains while using only modest computational resources. Hence it provides a useful tool for a wide spectrum of applications, including rapid clinical diagnostics to distinguish among closely related strains causing nosocomial infections. The software implementation is available at https://github.com/PROBIC/BIB.
-
- Microbial evolution and epidemiology
- Phylogeography
-
-
Molecular and biochemical characterization of the NS1 protein of non-cultured influenza B virus strains circulating in Singapore
In this study we compared the NS1 protein of Influenza B/Lee/40 and several non-cultured Influenza B virus clinical strains detected in Singapore. In B/Lee/40 virus-infected cells and in cells expressing the recombinant B/Lee/40 NS1 protein a full-length 35 kDa NS1 protein and a 23 kDa NS1 protein species (p23) were detected. Mutational analysis of the NS1 gene indicated that p23 was generated by a novel cleavage event within the linker domain between an aspartic acid and proline at amino acid residues at positions 92 and 93 respectively (DP92–93), and that p23 contained the first 92 amino acids of the NS1 protein. Sequence analysis of the Singapore strains indicated the presence of either DP92–93 or NP92–93 in the NS1 protein, but protein expression analysis showed that p23 was only detected in NS1 proteins with DP92–93. An additional adjacent proline residue at position 94 (P94) was present in some strains and correlated with increased p23 levels, suggesting that P94 has a synergistic effect on the cleavage of the NS1 protein. The first 145 amino acids of the NS1 protein are required for inhibition of ISG15-mediated ubiquitination, and our analysis showed that Influenza B viruses circulating in Singapore with DP92–93 expressed truncated NS1 proteins and may differ in their capacity to inhibit ISG15 activity. Thus, DP92–93 in the NS1 protein may confer a disadvantage to Influenza B viruses circulating in the human population and, interestingly, the low frequency of DP92–93 detection in the NS1 protein since 2004 is consistent with this suggestion.
-
- Short Paper
-
- Microbial evolution and epidemiology
- Communicable disease genomics
-
-
Phylogenetic structure of European Salmonella Enteritidis outbreak correlates with national and international egg distribution network
Tim Dallman, Thomas Inns, Thibaut Jombart, Philip Ashton, Nicolas Loman, Carol Chatt, Ute Messelhaeusser, Wolfgang Rabsch, Sandra Simon, Sergejs Nikisins, Helen Bernard, Simon le Hello, Nathalie Jourdan da-Silva, Christian Kornschober, Joel Mossong, Peter Hawkey, Elizabeth de Pinna, Kathie Grant and Paul Cleary
Outbreaks of Salmonella Enteritidis have long been associated with contaminated poultry and eggs. In the summer of 2014 a large multi-national outbreak of Salmonella Enteritidis phage type 14b occurred with over 350 cases reported in the United Kingdom, Germany, Austria, France and Luxembourg. Egg supply network investigation and microbiological sampling identified the source to be a Bavarian egg producer. As part of the international investigation into the outbreak, over 400 isolates were sequenced including isolates from cases, implicated UK premises and eggs from the suspected source producer. We were able to show a clear statistical correlation between the topology of the UK egg distribution network and the phylogenetic network of outbreak isolates. This correlation can most plausibly be explained by different parts of the egg distribution network being supplied by eggs solely from independent premises of the Bavarian egg producer (Company X). Microbiological sampling from the source premises, traceback information and information on the interventions carried out at the egg production premises all supported this conclusion. The level of insight into the outbreak epidemiology provided by whole-genome sequencing (WGS) would not have been possible using traditional microbial typing methods.
-
-
-
NGMASTER: in silico multi-antigen sequence typing for Neisseria gonorrhoeae
Whole-genome sequencing (WGS) provides the highest resolution analysis for comparison of bacterial isolates in public health microbiology. However, although increasingly being used routinely for some pathogens such as Listeria monocytogenes and Salmonella enterica, the use of WGS is still limited for other organisms, such as Neisseria gonorrhoeae. Multi-antigen sequence typing (NG-MAST) is the most widely performed typing method for epidemiological surveillance of gonorrhoea. Here, we present NGMASTER, a command-line software tool for performing in silico NG-MAST on assembled genome data. NGMASTER rapidly and accurately determined the NG-MAST of 630 assembled genomes, facilitating comparisons between WGS and previously published gonorrhoea epidemiological studies. The source code and user documentation are available at https://github.com/MDU-PHL/ngmaster.
-
- Genomic Methodologies
- Genome variation detection
-
-
Phase variable DNA repeats in Neisseria gonorrhoeae influence transcription, translation, and protein sequence variation
There are many types of repeated DNA sequences in the genomes of the species of the genus Neisseria, from homopolymeric tracts to tandem repeats of hundreds of bases. Some of these have roles in the phase-variable expression of genes. When a repeat mediates phase variation, reversible switching between tract lengths occurs, which in the species of the genus Neisseria most often causes the gene to switch between on and off states through frame shifting of the open reading frame. Changes in repeat tract lengths may also influence the strength of transcription from a promoter. For phenotypes that can be readily observed, such as expression of the surface-expressed Opa proteins or pili, verification that repeats are mediating phase variation is relatively straightforward. For other genes, particularly those where the function has not been identified, gathering evidence of repeat tract changes can be more difficult. Here we present analysis of the repetitive sequences that could mediate phase variation in the Neisseria gonorrhoeae strain NCCP11945 genome sequence and compare these results with other gonococcal genome sequences. Evidence is presented for an updated phase-variable gene repertoire in this species, including a class of phase variation that causes amino acid changes at the C-terminus of the protein, not previously described in N. gonorrhoeae.
-
- Methods Paper
-
- Genomic Methodologies
- Genome variation detection
-
-
NASP: an accurate, rapid method for the identification of SNPs in WGS datasets that supports flexible input and output formats
Whole-genome sequencing (WGS) of bacterial isolates has become standard practice in many laboratories. Applications for WGS analysis include phylogeography and molecular epidemiology, using single nucleotide polymorphisms (SNPs) as the unit of evolution. NASP was developed as a reproducible method that scales well with the hundreds to thousands of WGS data typically used in comparative genomics applications. In this study, we demonstrate how NASP compares with other tools in the analysis of two real bacterial genomics datasets and one simulated dataset. Our results demonstrate that NASP produces similar, and often better, results in comparison with other pipelines, but is much more flexible in terms of data input types, job management systems, diversity of supported tools and output formats. We also demonstrate differences in results based on the choice of the reference genome and choice of inferring phylogenies from concatenated SNPs or alignments including monomorphic positions. NASP represents a source-available, version-controlled, unit-tested method and can be obtained from tgennorth.github.io/NASP.
-
- Systems Microbiology
- Large-scale comparative genomics
-
-
Robust high-throughput prokaryote de novo assembly and improvement pipeline for Illumina data
The rapidly reducing cost of bacterial genome sequencing has led to its routine use in large-scale microbial analysis. Though mapping approaches can be used to find differences relative to the reference, many bacteria are subject to constant evolutionary pressures resulting in events such as the loss and gain of mobile genetic elements, horizontal gene transfer through recombination and genomic rearrangements. De novo assembly is the reconstruction of the underlying genome sequence, an essential step to understanding bacterial genome diversity. Here we present a high-throughput bacterial assembly and improvement pipeline that has been used to generate nearly 20 000 annotated draft genome assemblies in public databases. We demonstrate its performance on a public data set of 9404 genomes. We find all the genes used in multi-locus sequence typing schema present in 99.6 % of assembled genomes. When tested on low-, neutral- and high-GC organisms, more than 94 % of genes were present and completely intact. The pipeline has been proven to be scalable and robust with a wide variety of datasets without requiring human intervention. All of the software is available on GitHub under the GNU GPL open source license.
-