Volume 10, Issue 3, 2024
- Research Articles
-
- Genomic Methodologies
-
-
Target enrichment improves culture-independent detection of Neisseria gonorrhoeae and antimicrobial resistance determinants direct from clinical samples with Nanopore sequencing
Multi-drug-resistant Neisseria gonorrhoeae infection is a significant public health risk. Rapidly detecting N. gonorrhoeae and antimicrobial-resistant (AMR) determinants by metagenomic sequencing of urine is possible, although high levels of host DNA and overgrowth of contaminating species hamper sequencing and limit N. gonorrhoeae genome coverage. We performed Nanopore sequencing of nucleic acid amplification test-positive urine samples and culture-positive urethral swabs with and without probe-based target enrichment, using a custom SureSelect panel, to investigate whether selective enrichment of N. gonorrhoeae DNA improves detection of both species and AMR determinants. Probes were designed to cover the entire N. gonorrhoeae genome, with tenfold enrichment of probes covering selected AMR determinants. Multiplexing was tested in a subset of samples. The proportion of sequence bases classified as N. gonorrhoeae increased in all samples after enrichment, from a median (IQR) of 0.05 % (0.01–0.1 %) to 76 % (42–82 %), giving a corresponding median improvement in fold genome coverage of 365 times (112–720). Over 20-fold coverage, required for robust AMR determinant detection, was achieved in 13/15 (87 %) samples, compared to 2/15 (13 %) without enrichment. The four samples multiplexed together also achieved >20-fold genome coverage. Coverage of AMR determinants was sufficient to predict resistance conferred by changes in chromosomal genes, where present, and genome coverage also enabled phylogenetic relationships to be reconstructed. Probe-based target enrichment can improve N. gonorrhoeae genome coverage when sequencing DNA extracts directly from urine or urethral swabs, allowing for detection of AMR determinants. Additionally, multiplexing prior to enrichment provided enough genome coverage for AMR detection while reducing the costs associated with this method.
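The coverage arithmetic behind these figures can be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the genome length of about 2.2 Mb is an approximate value for N. gonorrhoeae, the 1 Gb run yield is a hypothetical number, and the function names are invented for this sketch.

```python
# Rough sketch of fold-coverage arithmetic; not the authors' pipeline.
GENOME_LENGTH_BP = 2_200_000   # approximate N. gonorrhoeae genome size (assumption)
MIN_FOLD_FOR_AMR = 20          # threshold quoted in the abstract


def fold_coverage(total_bases, target_fraction, genome_length=GENOME_LENGTH_BP):
    """Mean fold coverage if `target_fraction` of all bases come from the target."""
    return total_bases * target_fraction / genome_length


def sufficient_for_amr(fold, threshold=MIN_FOLD_FOR_AMR):
    """True when mean coverage exceeds the threshold for AMR determinant detection."""
    return fold > threshold


run_yield = 1_000_000_000  # hypothetical 1 Gb of sequence per sample
before = fold_coverage(run_yield, 0.0005)  # 0.05 % of bases on target
after = fold_coverage(run_yield, 0.76)     # 76 % of bases on target
print(f"unenriched: {before:.2f}x, enriched: {after:.0f}x")
print(sufficient_for_amr(before), sufficient_for_amr(after))
```

The sketch holds the yield constant for both libraries. Real runs differ in yield, so the ratio of the two on-target fractions should not be read as the fold improvement reported in the study.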
-
-
-
Assembly collapsing versus heterozygosity oversizing: detection of homokaryotic and heterokaryotic Laccaria trichodermophora strains by hybrid genome assembly
Genome assembly and annotation using short-paired reads is challenging for eukaryotic organisms due to their large size, variable ploidy and large number of repetitive elements. The use of single-molecule long reads improves assembly quality (completeness and contiguity), but haplotype duplications still pose assembly challenges. To address the effect of read length on genome assembly quality, gene prediction and annotation, we compared genome assemblers and sequencing technologies with four strains of the ectomycorrhizal fungus Laccaria trichodermophora. By analysing the predicted repertoire of carbohydrate enzymes, we investigated the effects of assembly quality on functional inferences. Libraries were generated using three different sequencing platforms (Illumina Next-Seq, Mi-Seq and PacBio Sequel), and genomes were assembled using single and hybrid assemblies/libraries. Long reads or hybrid assembly resolved the collapsing of repeated regions, but the nuclear heterozygous versions remained unresolved. In dikaryotic fungi, each cell includes two nuclei and each nucleus has differences not only in allelic gene version but also in gene composition and synteny. These heterokaryotic cells produce fragmentation and size overestimation of the genome assembly of each nucleus. Hybrid assembly revealed a wider functional diversity of genomes. Here, several predicted oxidizing activities on glycosyl residues of oligosaccharides and several chitooligosaccharide acetylase activities would have passed unnoticed in short-read assemblies. Also, the size and fragmentation of the genome assembly, in combination with heterozygosity analysis, allowed us to distinguish homokaryotic and heterokaryotic strains isolated from L. trichodermophora fruit bodies.
-
-
-
Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae
Minimum Inhibitory Concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines which change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach which overcomes some of these difficulties, albeit the interpretation of MIC is still needed. Nevertheless, precisely how we should handle MIC data when dealing with predictive models remains unclear, since they are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely, Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. Then we assessed how model prediction accuracy was affected when MICs were framed as regression and classification. Our results showed that treating the MICs differently depending on the number of concentration levels of antibiotic available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend considering the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification. 
Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, due to the varying genetic architecture of each antibiotic resistance trait. Finally, we emphasise that expanding the population database is pivotal for the future clinical implementation of these models to support routine machine-learning-based diagnostics.
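The recommendation to choose between regression and classification by the number of observed concentration levels can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the cut-off of six levels is hypothetical (the abstract gives no number for "large"), the function names are invented, and MICs are placed on a log2 scale, a common convention for two-fold dilution series.

```python
import math

# Hypothetical cut-off: the abstract gives no number for "large".
MIN_LEVELS_FOR_REGRESSION = 6

# Two-character prefixes come first so that "<=" is not read as "<".
_PREFIXES = (("<=", "left"), (">=", "right"), ("<", "left"), (">", "right"))


def parse_mic(value):
    """Turn an MIC string such as '<=0.5', '4' or '>32' into (log2 MIC, censoring)."""
    text = value.strip()
    for prefix, kind in _PREFIXES:
        if text.startswith(prefix):
            return math.log2(float(text[len(prefix):])), kind
    return math.log2(float(text)), "none"


def choose_framing(mic_values, min_levels=MIN_LEVELS_FOR_REGRESSION):
    """Pick 'regression' when many distinct levels were observed, else 'classification'."""
    levels = {parse_mic(v) for v in mic_values}  # censored values count as their own level
    return "regression" if len(levels) >= min_levels else "classification"


print(choose_framing(["0.25", "0.5", "1", "2", "4", "8", "16"]))
print(choose_framing(["<=1", "2", ">2"]))
```

Keeping the censoring flag alongside the log2 value means that ">2" and "2" are counted as different levels, which matters for antibiotics tested over a narrow range.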
-
- Functional Genomics and Microbe–Niche Interactions
-
-
Phylometagenomics of cycad coralloid roots reveals shared symbiotic signals
Cycads are known to host symbiotic cyanobacteria, including Nostocales species, as well as other sympatric bacterial taxa within their specialized coralloid roots. Yet, it is unknown if these bacteria share a phylogenetic origin and/or common genomic functions that allow them to engage in facultative symbiosis with cycad roots. To address this, we obtained metagenomic sequences from 39 coralloid roots sampled from diverse cycad species and origins in Australia and Mexico. Culture-independent shotgun metagenomic sequencing was used to validate sub-community co-cultures as an efficient approach for functional and taxonomic analysis. Our meta-analysis shows a host-independent microbiome core consisting of seven bacterial orders with high species diversity within the identified taxa. Moreover, we recovered 43 cyanobacterial metagenome-assembled genomes, and in addition to Nostoc spp., symbiotic cyanobacteria of the genus Aulosira were identified for the first time. With this robust dataset, we used phylometagenomic analysis to reveal three monophyletic cyanobiont clades, two host-generalist and one cycad-specific that includes Aulosira spp. Although the symbiotic clades have independently arisen, they are enriched in certain functional genes, such as those related to secondary metabolism. Furthermore, the taxonomic composition of associated sympatric bacterial taxa remained constant. Our research quadruples the number of cycad cyanobiont genomes and provides a robust framework to decipher cyanobacterial symbioses, with the potential of improving our understanding of symbiotic communities. This study lays a solid foundation to harness cyanobionts for agriculture and bioprospection, and assist in conservation of critically endangered cycads.
-
-
-
Identifying the suite of genes central to swimming in the biocontrol bacterium Pseudomonas protegens Pf-5
Swimming motility is a key bacterial trait, important to success in many niches. Biocontrol bacteria, such as Pseudomonas protegens Pf-5, are increasingly used in agriculture to control crop diseases, where motility is important for colonization of the plant rhizosphere. Swimming motility typically involves a suite of flagella and chemotaxis genes, but the specific gene set employed for both regulation and biogenesis can differ substantially between organisms. Here we used transposon-directed insertion site sequencing (TraDIS), a genome-wide approach, to identify 249 genes involved in P. protegens Pf-5 swimming motility. In addition to the expected flagella and chemotaxis genes, we also identified a suite of additional genes important for swimming, including genes related to peptidoglycan turnover, O-antigen biosynthesis, cell division, signal transduction, c-di-GMP turnover and phosphate transport, and 27 conserved hypothetical proteins. Gene knockout mutants and TraDIS data suggest that defects in the Pst phosphate transport system lead to enhanced swimming motility. Overall, this study expands our knowledge of pseudomonad motility and highlights the utility of a TraDIS-based approach for analysing the functions of thousands of genes. This work sets a foundation for understanding how swimming motility may be related to the inconsistency in biocontrol bacteria performance in the field.
-
-
-
The influence of flocculation upon global gene transcription in a yeast CYC8 mutant
The transcriptome from a Saccharomyces cerevisiae tup1 deletion mutant was one of the first comprehensive yeast transcriptomes published. Subsequent transcriptomes from tup1 and cyc8 mutants firmly established the Tup1-Cyc8 complex as predominantly acting as a repressor of gene transcription. However, transcriptomes from tup1/cyc8 gene deletion or conditional mutants would all have been influenced by the striking flocculation phenotypes that these mutants display. In this study, we have separated the impact of flocculation from the transcriptome in a cyc8 conditional mutant to reveal those genes (i) subject solely to Cyc8p-dependent regulation, (ii) regulated by flocculation only and (iii) regulated by Cyc8p and further influenced by flocculation. We reveal a more accurate list of Cyc8p-regulated genes that includes newly identified Cyc8p-regulated genes that were masked by the flocculation phenotype and excludes genes which were indirectly influenced by flocculation and not regulated by Cyc8p. Furthermore, we show evidence that flocculation exerts a complex and potentially dynamic influence upon global gene transcription. These data should be of interest to future studies into the mechanism of action of the Tup1-Cyc8 complex and to studies involved in understanding the development of flocculation and its impact upon cell function.
-
-
-
Integrative methylome and transcriptome analysis reveals epigenetic regulation of Fusobacterium nucleatum in laryngeal cancer
The aetiological mechanisms of Fusobacterium nucleatum in laryngeal cancer remain unclear. This study aimed to reveal the epigenetic signature induced by F. nucleatum in laryngeal squamous cell carcinoma (LSCC). Combined analysis of methylome and transcriptome data was performed to address the functional role of F. nucleatum in laryngeal cancer. Twenty-nine differentially expressed methylation-driven genes were identified by mapping the methylation levels of significant differential methylation sites to the expression levels of related genes. The combined analysis revealed that F. nucleatum promoted Janus kinase 3 (JAK3) gene expression in LSCC. Further validation found decreased methylation and elevated expression of JAK3 in the F. nucleatum-treated LSCC cell group; F. nucleatum abundance and JAK3 gene expression had a positive correlation in tumour tissues. This analysis provides a novel understanding of the impact of F. nucleatum in the methylome and transcriptome of laryngeal cancer. Identification of these epigenetic regulatory mechanisms opens up new avenues for mechanistic studies to explore novel therapeutic strategies.
-
- Pathogens and Epidemiology
-
-
Genomic analysis of clinical Aeromonas isolates reveals genetic diversity but little evidence of genetic determinants for diarrhoeal disease
Aeromonas spp. are associated with a number of infectious syndromes in humans including gastroenteritis and dysentery. Our understanding of the genetic diversity, population structure, virulence determinants and antimicrobial resistance of the genus has been limited by a lack of sequenced genomes linked to metadata. We performed a comprehensive analysis of the whole genome sequences of 447 Aeromonas isolates from children in Karachi, Pakistan, with moderate-to-severe diarrhoea (MSD) and from matched controls without diarrhoea that were collected as part of the Global Enteric Multicenter Study (GEMS). Human-associated Aeromonas isolates exhibited high species diversity and extensive antimicrobial and virulence gene content. Aeromonas caviae, A. dhakensis, A. veronii and A. enteropelogenes were all significantly associated with MSD in at least one cohort group. The maf2 and lafT genes that encode components of polar and lateral flagella, respectively, exhibited a weak association with isolates originating from cases of gastroenteritis.
-
-
-
A metagenomic investigation of phytoplasma diversity in Australian vegetable growing regions
In this study, metagenomic sequence data was used to investigate the phytoplasma taxonomic diversity in vegetable-growing regions across Australia. Metagenomic sequencing was performed on 195 phytoplasma-positive samples, originating either from historic collections (n=46) or during collection efforts between January 2015 and June 2022 (n=149). The sampled hosts were classified as crop (n=155), weed (n=24), ornamental (n=7), native plant (n=6), and insect (n=3) species. Most samples came from Queensland (n=78), followed by Western Australia (n=46), the Northern Territory (n=32), New South Wales (n=17), and Victoria (n=10). Of the 195 draft phytoplasma genomes, 178 met our genome criteria for comparison using an average nucleotide identity approach. Ten distinct phytoplasma species were identified and could be classified within the 16SrII, 16SrXII (PCR only), 16SrXXV, and 16SrXXXVIII phytoplasma groups, which have all previously been recorded in Australia. The most commonly detected phytoplasma taxa in this study were species and subspecies classified within the 16SrII group (n=153), followed by strains within the 16SrXXXVIII group (‘Ca. Phytoplasma stylosanthis’; n=6). Several geographic- and host-range expansions were reported, as well as mixed phytoplasma infections of 16SrII taxa and ‘Ca. Phytoplasma stylosanthis’. Additionally, six previously unrecorded 16SrII taxa were identified, including five putative subspecies of ‘Ca. Phytoplasma australasiaticum’ and a new putative 16SrII species. PCR and sequencing of the 16S rRNA gene was a suitable triage tool for preliminary phytoplasma detection. Metagenomic sequencing, however, allowed for higher-resolution identification of the phytoplasmas, including mixed infections, than was afforded by only direct Sanger sequencing of the 16S rRNA gene. Since the metagenomic approach theoretically obtains sequences of all organisms in a sample, this approach was useful to confirm the host family, genus, and/or species. 
In addition to improving our understanding of the phytoplasma species that affect crop production in Australia, the study also significantly expands the genomic sequence data available in public sequence repositories to contribute to phytoplasma molecular epidemiology studies, revision of taxonomy, and improved diagnostics.
-
-
-
Metagenomic sequencing sheds light on microbes putatively associated with pneumonia-related fatalities of white-tailed deer (Odocoileus virginianus)
With emerging infectious disease outbreaks in human, domestic and wild animal populations on the rise, improvements in pathogen characterization and surveillance are paramount for the protection of human and animal health, as well as the conservation of ecologically and economically important wildlife. Genomics offers a range of suitable tools to meet these goals, with metagenomic sequencing facilitating the characterization of whole microbial communities associated with emerging and endemic disease outbreaks. Here, we use metagenomic sequencing in a case-control study to identify microbes in lung tissue associated with newly observed pneumonia-related fatalities in 34 white-tailed deer (Odocoileus virginianus) in Wisconsin, USA. We identified 20 bacterial species that occurred in more than a single individual. Of these, only Clostridium novyi was found to substantially differ (in number of detections) between case and control sample groups; however, this difference was not statistically significant. We also detected several bacterial species associated with pneumonia and/or other diseases in ruminants (Mycoplasma ovipneumoniae, Trueperella pyogenes, Pasteurella multocida, Anaplasma phagocytophilum, Fusobacterium necrophorum); however, these species did not substantially differ between case and control sample groups. On average, we detected a larger number of bacterial species in case samples than controls, supporting the potential role of polymicrobial infections in this system. Importantly, we did not detect DNA of viruses or fungi, suggesting that they are not significantly associated with pneumonia in this system. Together, these results highlight the utility of metagenomic sequencing for identifying disease-associated microbes. This preliminary list of microbes will help inform future research on pneumonia-associated fatalities of white-tailed deer.
-
-
-
Fusobacterium nucleatum subsp. polymorphum recovered from malignant and potentially malignant oral disease exhibit heterogeneity in adhesion phenotypes and adhesin gene copy number, shaped by inter-subspecies horizontal gene transfer and recombination-derived mosaicism
Fusobacterium nucleatum is an anaerobic commensal of the oral cavity associated with periodontitis and extra-oral diseases, including colorectal cancer. Previous studies have shown an increased relative abundance of this bacterium associated with oral dysplasia or within oral tumours. Using direct culture, we found that 75 % of Fusobacterium species isolated from malignant or potentially malignant oral mucosa were F. nucleatum subsp. polymorphum. Whole genome sequencing and pangenome analysis with Panaroo was carried out on 76 F. nucleatum subsp. polymorphum genomes. F. nucleatum subsp. polymorphum was shown to possess a relatively small core genome of 1604 genes in a pangenome of 7363 genes. Phylogenetic analysis based on the core genome shows the isolates can be separated into three main clades with no obvious genotypic associations with disease. Isolates recovered from healthy and diseased sites in the same patient are generally highly related. A large repertoire of adhesins belonging to the type V secretion system (TVSS) could be identified with major variation in repertoire and copy number between strains. Analysis of intergenic recombination using fastGEAR showed that adhesin complement is shaped by horizontal gene transfer and recombination. Recombination events at TVSS adhesin genes were not only common between lineages of subspecies polymorphum, but also between different subspecies of F. nucleatum. Strains of subspecies polymorphum with low copy numbers of TVSS adhesin encoding genes tended to have the weakest adhesion to oral keratinocytes. This study highlights the genetic heterogeneity of F. nucleatum subsp. polymorphum and provides a new framework for defining virulence in this organism.
-
-
-
Exploration of low-frequency allelic variants of SARS-CoV-2 genomes reveals coinfections in Mexico occurred during periods of VOCs turnover
A total of 14 973 alleles in 29 661 sequenced samples collected between March 2021 and January 2023 by the Mexican Consortium for Genomic Surveillance (CoViGen-Mex) and collaborators were used to construct a thorough map of mutations of the Mexican SARS-CoV-2 genomic landscape containing Intra-Patient Minor Allelic Variants (IPMAVs), which are low-frequency alleles not ordinarily present in a genomic consensus sequence. This additional information proved critical in identifying putative coinfecting variants included alongside the most common variants, B.1.1.222, B.1.1.519, and variants of concern (VOCs) Alpha, Gamma, Delta, and Omicron. A total of 379 coinfection events were recorded in the dataset (a rate of 1.28 %), resulting in the first such catalogue in Mexico. The most common putative coinfections occurred during the spread of Delta or after the introduction of Omicron BA.2 and its descendants. Coinfections occurred constantly during periods of variant turnover when more than one variant shared the same niche and high infection rate was observed, which was dependent on the local variants and time. Coinfections might occur at a higher frequency than customarily reported, but they are often ignored as only the consensus sequence is reported for lineage identification.
-
-
-
Virulence genes, resistome and mobilome of Streptococcus suis strains isolated in France
Streptococcus suis is a leading cause of infection in pigs, causing extensive economic losses. It can also infect wild fauna, and can be responsible for severe infections in humans. Increasing antimicrobial resistance (AMR) has been described in S. suis worldwide and most of the AMR genes are carried by mobile genetic elements (MGEs). This contributes to their dissemination by horizontal gene transfer. A collection of 102 strains isolated from humans, pigs and wild boars in France was subjected to whole genome sequencing in order to: (i) study their genetic diversity, (ii) evaluate their content in virulence-associated genes, (iii) decipher the mechanisms responsible for their AMR and their association with MGEs, and (iv) study their ability to acquire extracellular DNA by natural transformation. Analysis by hierarchical clustering on principal components identified a few virulence-associated factors that distinguish invasive CC1 strains from the other strains. A plethora of AMR genes (n=217) was found in the genomes. Apart from the frequently reported erm(B) and tet(O) genes, more recently described AMR genes were identified [vga(F)/sprA, vat(D)]. Modifications in PBPs/MraY and GyrA/ParC were detected in the penicillin- and fluoroquinolone-resistant isolates respectively. New AMR gene–MGE associations were detected. The majority of the strains have the full set of genes required for competence, i.e. for the acquisition of extracellular DNA (that could carry AMR genes) by natural transformation. Hence the risk of dissemination of these AMR genes should not be neglected.
-
-
-
A core genome multi-locus sequence typing scheme for Streptococcus uberis: an evolution in typing a genetically diverse pathogen
Streptococcus uberis is a globally endemic and poorly controlled cause of bovine mastitis impacting the sustainability of the modern dairy industry. A core genome was derived from 579 newly sequenced S. uberis isolates, along with 305 publicly available genome sequences of S. uberis isolated from 11 countries around the world and used to develop a core genome multi-locus sequence typing (cgMLST) scheme. The S. uberis core genome comprised 1475 genes, and these were used to identify 1447 curated loci that were indexed into the cgMLST scheme. This was able to type 1012 of 1037 (>97 %) isolates used and differentiated the associated sequences into 932 discrete core genome sequence types (cgSTs). Analysis of the phylogenetic relationships of cgSTs revealed no clear clustering of isolates based on metadata such as disease status or year of isolation. Geographical clustering of cgSTs was limited to identification of a UK-centric clade, but cgSTs from UK isolates were also dispersed with those originating from other geographical regions across the entire phylogenetic topology. The cgMLST scheme offers a new tool for the detailed analysis of this globally important pathogen of dairy cattle. Initial analysis has re-emphasized and exemplified the genetically diverse nature of the global population of this opportunistic pathogen.
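The general idea of assigning cgSTs from allele profiles can be sketched as below. This is a toy illustration, not the published scheme or its software: the function name, the isolate names and the three-locus profiles are hypothetical, and the rule that an isolate with any uncalled locus stays untyped is a simplifying assumption (real schemes may tolerate a proportion of missing loci).

```python
def assign_cgsts(profiles):
    """Map isolate -> cgST number; isolates with a missing locus (None) stay untyped."""
    st_by_profile = {}   # allele profile (tuple) -> cgST number
    assignments = {}
    for isolate, alleles in profiles.items():
        if any(allele is None for allele in alleles):
            assignments[isolate] = None  # simplifying assumption: treat as untypeable
            continue
        key = tuple(alleles)
        if key not in st_by_profile:
            st_by_profile[key] = len(st_by_profile) + 1  # next unused cgST number
        assignments[isolate] = st_by_profile[key]
    return assignments


# Toy data: three loci instead of the 1447 indexed in the scheme.
toy = {
    "isolate_A": [1, 4, 2],
    "isolate_B": [1, 4, 2],     # identical profile, so same cgST as isolate_A
    "isolate_C": [1, 5, 2],     # one allele differs, so a new cgST
    "isolate_D": [3, None, 2],  # one locus not called
}
result = assign_cgsts(toy)
typed = sum(st is not None for st in result.values())
print(result, f"{typed}/{len(result)} typed")
```

Because a single allele difference creates a new type, a diverse population yields many cgSTs relative to the number of isolates, which is consistent with the 932 cgSTs among 1012 typed isolates reported above.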
-
-
-
Unveiling genome plasticity and a novel phage in Mycoplasma felis: Genomic investigations of four feline isolates
Mycoplasma felis has been isolated from diseased cats and horses, but to date only a single fully assembled genome of this species, of an isolate from a horse, has been characterized. This study aimed to characterize and compare the completely assembled genomes of four clinical isolates of M. felis from three domestic cats, assembled with the aid of short- and long-read sequencing methods. The completed genomes encoded a median of 759 ORFs (range 743–777) and had a median average nucleotide identity of 98.2 % with the genome of the available equid origin reference strain. Comparative genomic analysis revealed the occurrence of multiple horizontal gene transfer events and significant genome reassortment. This had resulted in the acquisition or loss of numerous genes within the Australian felid isolate genomes, encoding putative proteins involved in DNA transfer, metabolism, DNA replication, host cell interaction and restriction modification systems. Additionally, a novel mycoplasma phage was detected in one Australian felid M. felis isolate by genomic analysis and visualized using cryo-transmission electron microscopy. This study has highlighted the complex genomic dynamics in different host environments. Furthermore, the sequences obtained in this work will enable the development of new diagnostic tools, and identification of future infection control and treatment options for the respiratory disease complex in cats.
-
- Evolution and Responses to Interventions
-
-
Streptococcus pneumoniae serotype 3 population structure in the era of conjugate vaccines, 2001–2018
Background. Despite use of highly effective conjugate vaccines, invasive pneumococcal disease (IPD) remains a leading cause of morbidity and mortality and disproportionately affects Indigenous populations. Although included in the 13-valent pneumococcal conjugate vaccine (PCV13), which was introduced in 2010, serotype 3 continues to cause disease among Indigenous communities in the Southwest USA. In the Navajo Nation, serotype 3 IPD incidence increased among adults (3.8/100 000 in 2001–2009 and 6.2/100 000 in 2011–2019); in children the disease persisted although the rates dropped from 5.8/100 000 to 2.3/100 000.
Methods. We analysed the genomic epidemiology of serotype 3 isolates collected from 129 adults and 63 children with pneumococcal carriage (n=61) or IPD (n=131) from 2001 to 2018 of the Navajo Nation. Using whole-genome sequencing data, we determined clade membership and assessed changes in serotype 3 population structure over time.
Results. The serotype 3 population structure was characterized by three dominant subpopulations: clade II (n=90, 46.9 %) and clade Iα (n=59, 30.7 %), which fall into Clonal Complex (CC) 180, and a non-CC180 clade (n=43, 22.4 %). The proportion of clade II-associated IPD cases increased significantly from 2001–2010 to 2011–2018 among adults (23.1–71.8 %; P<0.001) but not in children (27.3–33.3 %; P=0.84). Over the same period, the proportion of clade II-associated carriage increased; this was statistically significant among children (23.3–52.6 %; P=0.04) but not adults (0–50.0 %, P=0.08).
Conclusions. In this setting with persistent serotype 3 IPD and carriage, clade II has increased since 2010. Genomic changes may be contributing to the observed trends in serotype 3 carriage and disease over time.
-
-
-
Genomic insights into local-scale evolution of ocular Chlamydia trachomatis strains within and between individuals in Gambian trachoma-endemic villages
Trachoma, a neglected tropical disease caused by Chlamydia trachomatis (Ct) serovars A–C, is the leading infectious cause of blindness worldwide. Africa bears the highest burden, accounting for over 86 % of global trachoma cases. We investigated Ct serovar A (SvA) and B (SvB) whole genome sequences prior to the introduction of mass antibiotic drug administration in The Gambia. Here, we explore the factors contributing to Ct strain diversification and the implications for Ct evolution within the context of ocular infection. A cohort study in 2002–2003 collected ocular swabs across nine Gambian villages during a 6-month follow-up study. To explore the genetic diversity of Ct within and between individuals, we conducted whole-genome sequencing (WGS) on a limited number (n=43) of Ct-positive samples with an omcB load ≥10 from four villages. WGS was performed using target enrichment with SureSelect and Illumina paired-end sequencing. Out of 43 WGS samples, 41 provided sufficient quality for further analysis. ompA analysis revealed that 11 samples had highest identity to ompA from strain A/HAR13 (NC_007429) and 30 had highest identity to ompA from strain B/Jali20 (NC_012686). While SvB genome sequences formed two distinct village-driven subclades, the heterogeneity of SvA sequences led to the formation of many individual branches within the Gambian SvA subclade. Comparing the Gambian SvA and SvB sequences with their reference strains, Ct A/HAR13 and Ct B/Jali20, indicated a single nucleotide polymorphism accumulation rate of 2.4×10⁻⁵ per site per year for the Gambian SvA and 1.3×10⁻⁵ per site per year for SvB variants (P<0.0001). Variant calling resulted in a total of 1371 single nucleotide variants (SNVs) with a frequency >25 % in SvA sequences, and 438 SNVs in SvB sequences. 
Of note, in SvA variants, highest evolutionary pressure was recorded on genes responsible for host cell modulation and intracellular survival mechanisms, whereas in SvB variants this pressure was mainly on genes essential for DNA replication/repair mechanisms and protein synthesis. A comparison of the sequences between observed separate infection events (4–20 weeks between infections) suggested that the majority of the variations accumulated in genes responsible for host–pathogen interaction such as CTA_0166 (phospholipase D-like protein), CTA_0498 (TarP) and CTA_0948 (deubiquitinase). This comparison of Ct SvA and SvB variants within a trachoma endemic population focused on their local evolutionary adaptation. We found a different variation accumulation pattern in the Gambian SvA chromosomal genes compared with SvB, hinting at the potential of Ct serovar-specific variation in diversification and evolutionary fitness. These findings may have implications for optimizing trachoma control and prevention strategies.
-
-
-
Conserved patterns of sequence diversification provide insight into the evolution of two-component systems in Enterobacteriaceae
Two-component regulatory systems (TCSs) are a major mechanism used by bacteria to sense and respond to their environments. Many of the same TCSs are used by biologically diverse organisms with different regulatory needs, suggesting that the functions of TCS must evolve. To explore this topic, we analysed the amino acid sequence divergence patterns of a large set of broadly conserved TCS across different branches of Enterobacteriaceae, a family of Gram-negative bacteria that includes biomedically important genera such as Salmonella, Escherichia, Klebsiella and others. Our analysis revealed trends in how TCS sequences change across different proteins or functional domains of the TCS, and across different lineages. Based on these trends, we identified individual TCS that exhibit atypical evolutionary patterns. We observed that the relative extent to which the sequence of a given TCS varies across different lineages is generally well conserved, unveiling a hierarchy of TCS sequence conservation with EnvZ/OmpR as the most conserved TCS. We provide evidence that, for the most divergent of the TCS analysed, PmrA/PmrB, different alleles were horizontally acquired by different branches of this family, and that different PmrA/PmrB sequence variants have highly divergent signal-sensing domains. Collectively, this study sheds light on how TCS evolve, and serves as a compendium for how the sequences of the TCS in this family have diverged over the course of evolution.
