- Volume 10, Issue 2, 2024
- Bioresources
-
- Genomic Methodologies
-
-
A validated pangenome-scale metabolic model for the Klebsiella pneumoniae species complex
The Klebsiella pneumoniae species complex (KpSC) is a major source of nosocomial infections globally with high rates of resistance to antimicrobials. Consequently, there is growing interest in understanding virulence factors and their association with cellular metabolic processes for developing novel anti-KpSC therapeutics. Phenotypic assays have revealed metabolic diversity within the KpSC, but metabolism research has been neglected due to experiments being difficult and cost-intensive. Genome-scale metabolic models (GSMMs), which compile genomic and biochemical data to reconstruct the metabolic network of an organism, represent a rapid and scalable in silico approach for exploring metabolic diversity. Here we use a diverse collection of 507 KpSC isolates, including representatives of globally distributed clinically relevant lineages, to construct the most comprehensive KpSC pan-metabolic model to date, KpSC pan v2. Candidate metabolic reactions were identified using gene orthology to known metabolic genes, prior to manual curation via extensive literature and database searches. The final model comprised 3550 reactions and 2403 genes, and can simulate growth on 360 unique substrates. We used KpSC pan v2 as a reference to derive strain-specific GSMMs for all 507 KpSC isolates, and compared these to GSMMs generated using a prior KpSC pan-reference (KpSC pan v1) and two single-strain references. We show that KpSC pan v2 includes a greater proportion of accessory reactions (8.8 %) than KpSC pan v1 (2.5 %). GSMMs derived from KpSC pan v2 also generate more accurate growth predictions, with high median accuracies of 95.4 % (aerobic, n=37 isolates) and 78.8 % (anaerobic, n=36 isolates) for 124 matched carbon substrates.
KpSC pan v2 is freely available at https://github.com/kelwyres/KpSC-pan-metabolic-model, representing a valuable resource for the scientific community, both as a source of curated metabolic information and as a reference to derive accurate strain-specific GSMMs. The latter can be used to investigate the relationship between KpSC metabolism and traits of interest, such as reservoirs, epidemiology, drug resistance or virulence, and ultimately to inform novel KpSC control strategies.
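The median accuracy figures quoted above can be illustrated with a minimal sketch. It assumes that accuracy for one isolate is the fraction of matched substrates on which the predicted growth call agrees with the observed phenotype, and that the median is then taken across isolates. The function name, isolate names, substrate names and toy values are hypothetical and are not taken from the KpSC pan v2 repository.

```python
from statistics import median


def prediction_accuracy(predicted, observed):
    """Fraction of matched substrates where predicted and observed growth agree.

    Assumption: both arguments map substrate name -> bool (growth / no growth).
    Only substrates present in both mappings are compared.
    """
    shared = predicted.keys() & observed.keys()
    if not shared:
        raise ValueError("no matched substrates to compare")
    return sum(predicted[s] == observed[s] for s in shared) / len(shared)


# Hypothetical toy data for two isolates (not real phenotype data).
predicted = {
    "isolate_1": {"glucose": True, "citrate": True, "sorbitol": False, "lactose": True},
    "isolate_2": {"glucose": True, "citrate": False, "sorbitol": True, "lactose": True},
}
observed = {
    "isolate_1": {"glucose": True, "citrate": True, "sorbitol": False, "lactose": False},
    "isolate_2": {"glucose": True, "citrate": False, "sorbitol": True, "lactose": True},
}

per_isolate = {name: prediction_accuracy(predicted[name], observed[name]) for name in predicted}
median_accuracy = median(per_isolate.values())
```

Reporting the median rather than the mean keeps a single poorly predicted isolate from dominating the summary.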
-
- Pathogens and Epidemiology
-
-
Mobilisation and analyses of publicly available SARS-CoV-2 data for pandemic responses
Nadim Rahman, Colman O'Cathail, Ahmad Zyoud, Alexey Sokolov, Bas Oude Munnink, Björn Grüning, Carla Cummins, Clara Amid, David F. Nieuwenhuijse, Dávid Visontai, David Yu Yuan, Dipayan Gupta, Divyae K. Prasad, Gábor Máté Gulyás, Gabriele Rinck, Jasmine McKinnon, Jeena Rajan, Jeff Knaggs, Jeffrey Edward Skiby, József Stéger, Judit Szarvas, Khadim Gueye, Krisztián Papp, Maarten Hoek, Manish Kumar, Marianna A. Ventouratou, Marie-Catherine Bouquieaux, Martin Koliba, Milena Mansurova, Muhammad Haseeb, Nathalie Worp, Peter W. Harrison, Rasko Leinonen, Ross Thorne, Sandeep Selvakumar, Sarah Hunt, Sundar Venkataraman, Suran Jayathilaka, Timothée Cezard, Wolfgang Maier, Zahra Waheed, Zamin Iqbal, Frank Møller Aarestrup, Istvan Csabai, Marion Koopmans, Tony Burdett and Guy Cochrane
The COVID-19 pandemic has seen large-scale pathogen genomic sequencing efforts, becoming part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and facilitated vaccine development and drug reuse studies and design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. In this paper we describe (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCiD), (3) the systematic analysis of these datasets, at scale via the SARS-CoV-2 Data Hubs as well as (4) lessons learnt. This paper describes a component of the Platform, the SARS-CoV-2 Data Hubs, which enable the extension and set up of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.
-
- Research Articles
-
- Genomic Methodologies
-
-
Beyond blast: enabling microbiologists to better extract literature, taxonomic distributions and gene neighbourhood information for protein families
Capturing the published corpus of information on all members of a given protein family should be an essential step in any study focusing on specific members of that family. Using a previously gathered dataset of more than 280 references mentioning a member of the DUF34 (NIF3/Ngg1-interacting Factor 3) family, we evaluated the efficiency of different databases and search tools, and devised a workflow that experimentalists can use to capture the most information published on members of a protein family in the least amount of time. To complement this workflow, web-based platforms allowing for the exploration of protein family members across sequenced genomes or for the analysis of gene neighbourhood information were reviewed for their versatility and ease of use. Recommendations that can be used for experimentalist users, as well as educators, are provided and integrated within a customized, publicly accessible Wiki.
-
-
-
PlasmidEC and gplas2: an optimized short-read approach to predict and reconstruct antibiotic resistance plasmids in Escherichia coli
Accurate reconstruction of Escherichia coli antibiotic resistance gene (ARG) plasmids from Illumina sequencing data has proven to be a challenge with current bioinformatic tools. In this work, we present an improved method to reconstruct E. coli plasmids using short reads. We developed plasmidEC, an ensemble classifier that identifies plasmid-derived contigs by combining the output of three different binary classification tools. We showed that plasmidEC is especially suited to classify contigs derived from ARG plasmids with a high recall of 0.941. Additionally, we optimized gplas, a graph-based tool that bins plasmid-predicted contigs into distinct plasmid predictions. Gplas2 is more effective at recovering plasmids with large sequencing coverage variations and can be combined with the output of any binary classifier. The combination of plasmidEC with gplas2 showed a high completeness (median=0.818) and F1-Score (median=0.812) when reconstructing ARG plasmids and exceeded the binning capacity of the reference-based method MOB-suite. In the absence of long-read data, our method offers an excellent alternative to reconstruct ARG plasmids in E. coli.
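The idea of an ensemble classifier that combines three binary classifiers can be sketched as follows. The abstract does not state the combination rule, so majority voting is an assumption made here for illustration. The function names, contig names and toy labels are hypothetical and do not reproduce plasmidEC's interface or data.

```python
from collections import Counter


def ensemble_vote(predictions):
    """Majority vote over per-tool labels for each contig.

    Assumption: `predictions` maps contig id -> list of three labels,
    each either "plasmid" or "chromosome", so a tie cannot occur.
    """
    return {
        contig: Counter(labels).most_common(1)[0][0]
        for contig, labels in predictions.items()
    }


def recall(truth, predicted, positive="plasmid"):
    """Recall for the positive class: TP / (TP + FN)."""
    positives = [c for c, label in truth.items() if label == positive]
    if not positives:
        return 0.0
    true_positives = sum(1 for c in positives if predicted.get(c) == positive)
    return true_positives / len(positives)


# Hypothetical toy calls from three tools for three plasmid-derived contigs.
tool_calls = {
    "contig_1": ["plasmid", "plasmid", "chromosome"],
    "contig_2": ["chromosome", "chromosome", "plasmid"],
    "contig_3": ["plasmid", "plasmid", "plasmid"],
}
truth = {"contig_1": "plasmid", "contig_2": "plasmid", "contig_3": "plasmid"}

consensus = ensemble_vote(tool_calls)
plasmid_recall = recall(truth, consensus)
```

In this toy case two of the three plasmid contigs are recovered by the consensus, so recall is 2/3. A voting scheme like this tolerates a single tool misclassifying a contig.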
-
-
-
Closing the genome of unculturable cable bacteria using a combined metagenomic assembly of long and short sequencing reads
Many environmentally relevant micro-organisms cannot be cultured, and even with the latest metagenomic approaches, achieving complete genomes for specific target organisms of interest remains a challenge. Cable bacteria provide a prominent example of a microbial ecosystem engineer that is currently unculturable. They occur in low abundance in natural sediments, but due to their capability for long-distance electron transport, they exert a disproportionately large impact on the biogeochemistry of their environment. Current available genomes of marine cable bacteria are highly fragmented and incomplete, hampering the elucidation of their unique electrogenic physiology. Here, we present a metagenomic pipeline that combines Nanopore long-read and Illumina short-read shotgun sequencing. Starting from a clonal enrichment of a cable bacterium, we recovered a circular metagenome-assembled genome (5.09 Mbp in size), which represents a novel cable bacterium species with the proposed name Candidatus Electrothrix scaldis. The closed genome contains 1109 novel identified genes, including key metabolic enzymes not previously described in incomplete genomes of cable bacteria. We examined in detail the factors leading to genome closure. Foremost, native, non-amplified long reads are crucial to resolve the many repetitive regions within the genome of cable bacteria, and by analysing the whole metagenomic assembly, we found that low strain diversity is key for achieving genome closure. The insights and approaches presented here could help achieve genome closure for other keystone micro-organisms present in complex environmental samples at low abundance.
-
-
-
The long and short of it: benchmarking viromics using Illumina, Nanopore and PacBio sequencing technologies
Viral metagenomics has fuelled a rapid change in our understanding of global viral diversity and ecology. Long-read sequencing and hybrid assembly approaches that combine long- and short-read technologies are now being widely implemented in bacterial genomics and metagenomics. However, the use of long-read sequencing to investigate viral communities is still in its infancy. While Nanopore and PacBio technologies have been applied to viral metagenomics, it is not known to what extent different technologies will impact the reconstruction of the viral community. Thus, we constructed a mock bacteriophage community of previously sequenced phage genomes and sequenced them using Illumina, Nanopore and PacBio sequencing technologies and tested a number of different assembly approaches. When using a single sequencing technology, Illumina assemblies were the best at recovering phage genomes. Nanopore- and PacBio-only assemblies performed poorly in comparison to Illumina in both genome recovery and error rates, which both varied with the assembler used. The best Nanopore assembly had errors that manifested as SNPs and INDELs at frequencies 41 and 157 % higher, respectively, than found in Illumina-only assemblies, while the best PacBio assemblies had SNPs and INDELs at frequencies 12 and 78 % higher, respectively. Despite high read coverage, long-read-only assemblies recovered a maximum of one complete genome from any assembly, unless reads were down-sampled prior to assembly. Overall the best approach was assembly by a combination of Illumina and Nanopore reads, which reduced error rates to levels comparable with short-read-only assemblies. When using a single technology, Illumina only was the best approach. The differences in genome recovery and error rates between technology and assembler had downstream impacts on gene prediction, viral prediction, and subsequent estimates of diversity within a sample.
These findings will provide a starting point for others in the choice of reads and assembly algorithms for the analysis of viromes.
-
- Microbial Communities
-
-
On the limits of 16S rRNA gene-based metagenome prediction and functional profiling
Molecular profiling techniques such as metagenomics, metatranscriptomics or metabolomics offer important insights into the functional diversity of the microbiome. In contrast, 16S rRNA gene sequencing, a widespread and cost-effective technique to measure microbial diversity, only allows for indirect estimation of microbial function. To mitigate this, tools such as PICRUSt2, Tax4Fun2, PanFP and MetGEM infer functional profiles from 16S rRNA gene sequencing data using different algorithms. Prior studies have cast doubts on the quality of these predictions, motivating us to systematically evaluate these tools using matched 16S rRNA gene sequencing, metagenomic datasets, and simulated data. Our contribution is threefold: (i) using simulated data, we investigate if technical biases could explain the discordance between inferred and expected results; (ii) considering human cohorts for type two diabetes, colorectal cancer and obesity, we test if health-related differential abundance measures of functional categories are concordant between 16S rRNA gene-inferred and metagenome-derived profiles and; (iii) since 16S rRNA gene copy number is an important confounder in functional profiles inference, we investigate if a customised copy number normalisation with the rrnDB database could improve the results. Our results show that 16S rRNA gene-based functional inference tools generally do not have the necessary sensitivity to delineate health-related functional changes in the microbiome and should thus be used with care. Furthermore, we outline important differences in the individual tools tested and offer recommendations for tool selection.
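The copy number normalisation mentioned in point (iii) can be illustrated with a minimal sketch. It assumes the common approach of dividing each taxon's read count by its 16S rRNA gene copy number and then rescaling to relative abundances. The function name, the default of one copy for taxa without a database entry, and the toy values are assumptions for illustration; they are not the procedure or data used in the study.

```python
def normalise_by_copy_number(counts, copy_numbers, default_copies=1.0):
    """Correct 16S read counts for gene copy number, then rescale to sum to 1.

    Assumptions: `counts` maps taxon -> read count, `copy_numbers` maps
    taxon -> 16S rRNA gene copies per genome (for example looked up in rrnDB),
    and taxa missing from `copy_numbers` are treated as single-copy.
    """
    corrected = {
        taxon: n / copy_numbers.get(taxon, default_copies)
        for taxon, n in counts.items()
    }
    total = sum(corrected.values())
    if total == 0:
        return corrected
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical toy example: taxon_A carries seven 16S copies, taxon_B one.
raw_counts = {"taxon_A": 700, "taxon_B": 100}
copies = {"taxon_A": 7, "taxon_B": 1}

relative_abundance = normalise_by_copy_number(raw_counts, copies)
```

Without the correction taxon_A would appear seven times more abundant than taxon_B; after it the two are estimated to be equally abundant, which shows why copy number acts as a confounder.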
-
- Pathogens and Epidemiology
-
-
Exploring the effects of antimicrobial treatment on the gut and oral microbiomes and resistomes from elderly long-term care facility residents via shotgun DNA sequencing
Monitoring antibiotic-resistant bacteria (ARB) and understanding the effects of antimicrobial drugs on the human microbiome and resistome are crucial for public health. However, no study has investigated the association between antimicrobial treatment and the microbiome–resistome relationship in long-term care facilities, where residents act as reservoirs of ARB but are not included in the national surveillance for ARB. We conducted shotgun metagenome sequencing of oral and stool samples from long-term care facility residents and explored the effects of antimicrobial treatment on the human microbiome and resistome using two types of comparisons: cross-sectional comparisons based on antimicrobial treatment history in the past 6 months and within-subject comparisons between stool samples before, during and 2–4 weeks after treatment using a single antimicrobial drug. Cross-sectional analysis revealed two characteristics in the group with a history of antimicrobial treatment: the archaeon Methanobrevibacter was the only taxon that significantly increased in abundance, and the total abundance of antimicrobial resistance genes (ARGs) was also significantly higher. Within-subject comparisons showed that taxonomic diversity did not decrease during treatment, suggesting that the effect of the prescription of a single antimicrobial drug in usual clinical treatment on the gut microbiota is likely to be smaller than previously thought, even among very elderly people. Additional analysis of the detection limit of ARGs revealed that they could not be detected when contig coverage was <2.0. This study is the first to report the effects of usual antimicrobial treatments on the microbiome and resistome of long-term care facility residents.
-
-
-
Whole-Genome sequencing in routine Mycobacterium bovis epidemiology – scoping the potential
Adrian Allen, Ryan Magee, Ryan Devaney, Tara Ardis, Caitlín McNally, Carl McCormick, Eleanor Presho, Michael Doyle, Purnika Ranasinghe, Philip Johnston, Raymond Kirke, Roland Harwood, Damien Farrell, Kevin Kenny, Jordy Smith, Stephen Gordon, Tom Ford, Suzan Thompson, Lorraine Wright, Kerri Jones, Paulo Prodohl and Robin Skuce
Mycobacterium bovis, the main agent of bovine tuberculosis (bTB), presents as a series of spatially-localised micro-epidemics across landscapes. Classical molecular typing methods applied to these micro-epidemics, based on genotyping a few variable loci, have significantly improved our understanding of potential epidemiological links between outbreaks. However, they have limited utility owing to low resolution. Conversely, whole-genome sequencing (WGS) provides the highest resolution data available for molecular epidemiology, producing richer outbreak tracing, insights into phylogeography and epidemic evolutionary history. We illustrate these advantages by focusing on a common single lineage of M. bovis (1.140) from Northern Ireland. Specifically, we investigate the spatial sub-structure of 20 years of herd-level multi-locus VNTR analysis (MLVA) surveillance data and WGS data from a down-sampled subset of isolates of this MLVA type over the same time frame. We mapped 2108 isolate locations of MLVA type 1.140 over the years 2000–2022. We also mapped the locations of 148 contemporary WGS isolates from this lineage, over a similar geographic range, stratifying by single nucleotide polymorphism (SNP) relatedness cut-offs of 15 SNPs. We determined a putative core range for the 1.140 MLVA type and SNP-defined sequence clusters using a 50 % kernel density estimate, using cattle movement data to inform on likely sources of WGS isolates found outside of core ranges. Finally, we applied Bayesian phylogenetic methods to investigate past population history and reproductive number of the 1.140 M. bovis lineage.
We demonstrate that WGS SNP-defined clusters exhibit smaller core ranges than the established MLVA type, facilitating superior disease tracing. We also demonstrate the superior functionality of WGS data in determining how this lineage was disseminated across the landscape, likely via cattle movement, and in inferring how its effective population size and reproductive number have been in flux since its emergence. These initial findings highlight the potential of WGS data for routine monitoring of bTB outbreaks.
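Grouping isolates by an SNP relatedness cut-off, as described above, can be sketched with single-linkage clustering. The abstract gives the 15 SNP cut-off but not the clustering rule, so single linkage and the inclusive comparison (distance of 15 or fewer) are assumptions. The function name, isolate names and pairwise distances are hypothetical.

```python
def snp_clusters(isolates, distances, cutoff=15):
    """Single-linkage clusters: isolates joined by any chain of pairs within the cutoff.

    Assumptions: `distances` maps an (isolate_a, isolate_b) tuple to a pairwise
    SNP distance, and a pair is linked when its distance is <= cutoff.
    """
    parent = {isolate: isolate for isolate in isolates}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), distance in distances.items():
        if distance <= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for isolate in isolates:
        clusters.setdefault(find(isolate), []).append(isolate)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise SNP distances between four isolates.
isolates = ["A", "B", "C", "D"]
pairwise = {
    ("A", "B"): 3, ("B", "C"): 12, ("A", "C"): 18,
    ("A", "D"): 40, ("B", "D"): 42, ("C", "D"): 39,
}

clusters = snp_clusters(isolates, pairwise)
```

Here A and C end up in the same cluster even though they differ by 18 SNPs, because both are within the cut-off of B. That chaining behaviour is a property of single linkage and is worth keeping in mind when interpreting cluster ranges.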
-
-
-
Compensatory mutations are associated with increased in vitro growth in resistant clinical samples of Mycobacterium tuberculosis
Mutations in Mycobacterium tuberculosis associated with resistance to antibiotics often come with a fitness cost for the bacteria. Resistance to the first-line drug rifampicin leads to lower competitive fitness of M. tuberculosis populations when compared to susceptible populations. This fitness cost, introduced by resistance mutations in the RNA polymerase, can be alleviated by compensatory mutations (CMs) in other regions of the affected protein. CMs are of particular interest clinically since they could lock in resistance mutations, encouraging the spread of resistant strains worldwide. Here, we report the statistical inference of a comprehensive set of CMs in the RNA polymerase of M. tuberculosis, using over 70 000 M. tuberculosis genomes that were collated as part of the CRyPTIC project. The unprecedented size of this data set gave the statistical tests more power to investigate the association of putative CMs with resistance-conferring mutations. Overall, we propose 51 high-confidence CMs by means of statistical association testing and suggest hypotheses for how they exert their compensatory mechanism by mapping them onto the protein structure. In addition, we were able to show an association of CMs with higher in vitro growth densities, and hence presumably with higher fitness, in resistant samples in the more virulent M. tuberculosis lineage 2. Our results suggest the association of CM presence with significantly higher in vitro growth than for wild-type samples, although this association is confounded with lineage and sub-lineage affiliation. Our findings emphasize the integral role of CMs and lineage affiliation in resistance spread and increase the urgency of antibiotic stewardship, which implies accurate, cheap and widely accessible diagnostics for M. tuberculosis infections to not only improve patient outcomes but also prevent the spread of resistant strains.
-
-
-
Investigating the rise of Omicron variant through genomic surveillance of SARS-CoV-2 infections in a highly vaccinated university population
Novel variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) continue to emerge as the coronavirus disease 2019 (COVID-19) pandemic extends into its fourth year. Understanding SARS-CoV-2 circulation in university populations is vital for effective interventions in higher education settings and will inform public health policy during pandemics. In this study, we generated 793 whole-genome sequences collected over an entire academic year in a university population in Indiana, USA. We clearly captured the rapidity with which the Delta variant was wholly replaced by the Omicron variant across the West Lafayette campus over the length of two academic semesters in a community with high vaccination rates. This mirrored the emergence of Omicron throughout the state of Indiana and the USA. Further, phylogenetic analyses demonstrated that there was a more diverse set of potential geographic origins for the introduction of Omicron viruses onto campus when compared to Delta. Lastly, statistics indicated that there was a more significant role for international and out-of-state migration in the establishment of Omicron variants at Purdue. This surveillance workflow, coupled with viral genomic sequencing and phylogeographic analyses, provided critical insights into SARS-CoV-2 transmission dynamics and variant arrival.
-
-
-
A novel synthetic nucleic acid mixture for quantification of microbes by mNGS
Metagenomic next-generation sequencing (mNGS) provides considerable advantages in identifying emerging and re-emerging, difficult-to-detect and co-infected pathogens; however, the clinical application of mNGS remains limited primarily due to the lack of quantitative capabilities. This study introduces a novel approach, the KingCreate-Quantification (KCQ) system, for quantitative analysis of microbes in clinical specimens by mNGS, which co-sequences the target DNA extracted from the specimens along with a set of synthetic dsDNA molecules used as an Internal-Standard (IS). The assay facilitates the conversion of microbial reads into their copy numbers based on IS reads utilizing a mathematical model proposed in this study. The performance of KCQ was systematically evaluated using commercial mock microbes with varying IS input amounts, different proportions of human genomic DNA, and varying amounts of sequence analysis data. Subsequently, KCQ was applied in microbial quantitation in 36 clinical specimens including blood, bronchoalveolar lavage fluid, cerebrospinal fluid and oropharyngeal swabs. A total of 477 microbe genetic fragments were screened using the bioinformatic system. Of these, 83 fragments were quantitatively compared with digital droplet PCR (ddPCR), revealing a correlation coefficient of 0.97 between the quantitative results of KCQ and ddPCR. Our study demonstrated that KCQ presents a practical approach for the quantitative analysis of microbes by mNGS in clinical samples.
-
-
-
HAIviz: an interactive dashboard for visualising and integrating healthcare-associated genomic epidemiological data
Existing tools for phylogeographic and epidemiological visualisation primarily provide a macro-geographic view of epidemic and pandemic transmission events but offer little support for detailed investigation of outbreaks in healthcare settings. Here, we present HAIviz, an interactive web-based application designed for integrating and visualising genomic epidemiological information to improve the tracking of healthcare-associated infections (HAIs). HAIviz displays and links the outbreak timeline, building map, phylogenetic tree, patient bed movements, and transmission network on a single interactive dashboard. HAIviz has been developed for bacterial outbreak investigations but can be utilised for general epidemiological investigations focused on built environments for which visualisation to customised maps is required. This paper describes and demonstrates the application of HAIviz for HAI outbreak investigations.
-
-
-
Identification of a novel CG307 sub-clade in third-generation-cephalosporin-resistant Klebsiella pneumoniae causing invasive infections in the USA
Despite the notable clinical impact, recent molecular epidemiology regarding third-generation-cephalosporin-resistant (3GC-R) Klebsiella pneumoniae in the USA remains limited. We performed whole-genome sequencing of 3GC-R K. pneumoniae bacteraemia isolates collected from March 2016 to May 2022 at a tertiary care cancer centre in Houston, TX, USA, using Illumina and Oxford Nanopore Technologies platforms. A comprehensive comparative genomic analysis was performed to dissect population structure, transmission dynamics and pan-genomic signatures of our 3GC-R K. pneumoniae population. Of the 178 3GC-R K. pneumoniae bacteraemias that occurred during our study time frame, we were able to analyse 153 (86 %) bacteraemia isolates, 126 initial and 27 recurrent isolates. While isolates belonging to the widely prevalent clonal group (CG) 258 were rarely observed, the predominant CG, 307, accounted for 37 (29 %) index isolates and displayed a significant correlation (Pearson correlation test P value=0.03) with the annual frequency of 3GC-R K. pneumoniae bacteraemia. Interestingly, only 11 % (4/37) of CG307 isolates belonged to the commonly detected ‘Texas-specific’ clade that has been observed in previous Texas-based K. pneumoniae antimicrobial-resistance surveillance studies. We found that nearly half of our CG307 isolates (n=18) belonged to a novel, monophyletic CG307 sub-clade characterized by the chromosomally encoded blaSHV-205 and unique accessory genome content. This CG307 sub-clade was detected in various regions of the USA, with genome sequences from 24 additional strains becoming recently available in the National Center for Biotechnology Information (NCBI) SRA database. Collectively, this study underscores the emergence and dissemination of a distinct CG307 sub-clade that is a prevalent cause of 3GC-R K. pneumoniae bacteraemia among cancer patients seen in Houston, TX, and has recently been isolated throughout the USA.
-
-
-
Characterizing the diversity and commensal origins of penA mosaicism in the genus Neisseria
Mosaic penA alleles formed through horizontal gene transfer (HGT) have been instrumental to the rising incidence of ceftriaxone-resistant gonococcal infections. Although interspecies HGT of regions of the penA gene between Neisseria gonorrhoeae and commensal Neisseria species has been described, knowledge concerning which species are the most common contributors to mosaic penA alleles is limited, with most studies examining only a small number of alleles. Here, we investigated the origins of recombinant penA alleles through in silico analyses that incorporated 1700 penA alleles from 35 513 Neisseria isolates, comprising 15 different Neisseria species. We identified Neisseria subflava and Neisseria cinerea as the most common source of recombinant sequences in N. gonorrhoeae penA. This contrasted with Neisseria meningitidis penA, for which the primary source of recombinant DNA was other meningococci, followed by Neisseria lactamica. Additionally, we described the distribution of polymorphisms implicated in antimicrobial resistance in penA, and found that these are present across the genus. These results provide insight into resistance-related changes in the penA gene across human-associated Neisseria species, illustrating the importance of genomic surveillance of not only the pathogenic Neisseria, but also of the oral niche-associated commensals from which these pathogens are sourcing key genetic variation.
-
- Evolution and Responses to Interventions
-
-
Exploring the genetic basis of natural resistance to microcins
Enterobacteriaceae produce an arsenal of antimicrobial compounds including microcins, ribosomally produced antimicrobial peptides showing diverse structures and mechanisms of action. Microcins target close relatives of the producing strain to promote its survival. Their narrow spectrum of antibacterial activity makes them a promising alternative to conventional antibiotics, as it should decrease the probability of resistance dissemination and collateral damage to the host’s microbiota. To assess the therapeutic potential of microcins, there is a need to understand the mechanisms of resistance to these molecules. In this study, we performed genomic analyses of the resistance to four microcins [microcin C, a nucleotide peptide; microcin J25, a lasso peptide; microcin B17, a linear azol(in)e-containing peptide; and microcin E492, a siderophore peptide] on a collection of 54 Enterobacteriaceae from three species: Escherichia coli, Salmonella enterica and Klebsiella pneumoniae. A gene-targeted analysis revealed that about half of the microcin-resistant strains presented mutations of genes involved in the microcin mechanism of action, especially those involved in their uptake (fhuA, fepA, cirA and ompF). A genome-wide association study did not reveal any significant correlations, yet relevant genetic elements were associated with microcin resistance. These were involved in stress responses, biofilm formation, transport systems and acquisition of immunity genes. Additionally, microcin-resistant strains exhibited several mutations within genes involved in specific metabolic pathways, especially for S. enterica and K. pneumoniae.
-
-
-
Pneumococcal population genomics changes during the early time period of conjugate vaccine uptake in southern India
Streptococcus pneumoniae is a major cause of invasive disease of young children in low- and middle-income countries. In southern India, pneumococcal conjugate vaccines (PCVs) that can prevent invasive pneumococcal disease began to be used more frequently after 2015. To characterize pneumococcal evolution during the early time period of PCV uptake in southern India, genomes were sequenced and selected characteristics were determined for 402 invasive isolates collected from children <5 years of age during routine surveillance from 1991 to 2020. Overall, the prevalence and diversity of vaccine type (VT) and non-vaccine type (NVT) isolates did not significantly change post-uptake of PCV. Individually, serotype 1 and global pneumococcal sequence cluster (GPSC or strain lineage) 2 significantly decreased, whereas serotypes 6B, 9V and 19A and GPSCs 1, 6, 10 and 23 significantly increased in proportion post-uptake of PCV. Resistance determinants to penicillin, erythromycin, co-trimoxazole, fluoroquinolones and tetracycline, and multidrug resistance significantly increased in proportion post-uptake of PCV and especially among VT isolates. Co-trimoxazole resistance determinants were common pre- and post-uptake of PCV (85 and 93 %, respectively) and experienced the highest rates of recombination in the genome. Accessory gene frequencies were seen to be changing by small amounts across the frequency spectrum specifically among VT isolates, with the largest changes linked to antimicrobial resistance determinants. In summary, these results indicate that as of 2020 this pneumococcal population was not yet approaching a PCV-induced equilibrium and they highlight changes related to antimicrobial resistance. Augmenting PCV coverage and prudent use of antimicrobials are needed to counter invasive pneumococcal disease in this region.
-
-
-
Comparative genomics reveals distinct diversification patterns among LysR-type transcriptional regulators in the ESKAPE pathogen Pseudomonas aeruginosa
Pseudomonas aeruginosa, a harmful nosocomial pathogen associated with cystic fibrosis and burn wounds, encodes a large number of LysR-type transcriptional regulator (LTTR) proteins. To understand how and why LTTR proteins evolved with such frequency and to establish whether any relationships exist within the distribution, we set out to identify the patterns underpinning LTTR distribution in P. aeruginosa and to uncover cluster-based relationships within the pangenome. Comparative genomic studies revealed that in the JGI IMG database alone ~86 000 LTTRs are present across the sequenced genomes (n=699). They are widely distributed across the species, with core LTTRs present in >93 % of the genomes and accessory LTTRs present in <7 %. Analysis showed that subsets of core LTTRs can be classified as either variable (typically specific to P. aeruginosa) or conserved (and found to be distributed in other Pseudomonas species). Extending the analysis to the more extensive Pseudomonas database, PA14 rooted analysis confirmed the diversification patterns and revealed PqsR, the receptor for the Pseudomonas quinolone signal (PQS) and 2-heptyl-4-quinolone (HHQ) quorum-sensing signals, to be amongst the most variable in the dataset. Successful complementation of the PAO1 pqsR⁻ mutant using representative variant pqsR sequences suggests a degree of structural promiscuity within the most variable of LTTRs, several of which play a prominent role in signalling and communication. These findings provide a new insight into the diversification of LTTR proteins within the P. aeruginosa species and suggest a functional significance to the cluster, conservation and distribution patterns identified.
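The core and accessory thresholds quoted above (>93 % and <7 % of genomes) can be applied to a presence table with a short sketch. The thresholds come from the abstract; the function name, the "intermediate" label for families falling between the two thresholds, and the toy data are assumptions added for illustration.

```python
def classify_by_prevalence(presence, n_genomes, core_min=0.93, accessory_max=0.07):
    """Label each gene family by the fraction of genomes that carry it.

    Assumptions: `presence` maps family name -> set of genome ids carrying it.
    Families found in more than `core_min` of genomes are "core", those in fewer
    than `accessory_max` are "accessory", and the rest are labelled
    "intermediate" (a label introduced here, not taken from the source).
    """
    labels = {}
    for family, genomes in presence.items():
        fraction = len(genomes) / n_genomes
        if fraction > core_min:
            labels[family] = "core"
        elif fraction < accessory_max:
            labels[family] = "accessory"
        else:
            labels[family] = "intermediate"
    return labels


# Hypothetical presence data across 100 genomes.
toy_presence = {
    "family_1": set(range(99)),  # 99 % of genomes
    "family_2": set(range(3)),   # 3 % of genomes
    "family_3": set(range(50)),  # 50 % of genomes
}

labels = classify_by_prevalence(toy_presence, n_genomes=100)
```

Keeping both thresholds as parameters makes it easy to check how sensitive the core and accessory counts are to the chosen cut-offs.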
-
- Methods
-
- Genomic Methodologies
-
-
ViromeFlowX: a Comprehensive Nextflow-based Automated Workflow for Mining Viral Genomes from Metagenomic Sequencing Data
Understanding the link between the human gut virome and diseases has garnered significant interest in the research community. Extracting virus-related information from metagenomic sequencing data is crucial for unravelling virus composition, host interactions, and disease associations. However, current metagenomic analysis workflows for viral genomes vary in effectiveness, posing challenges for researchers seeking the most up-to-date tools. To address this, we present ViromeFlowX, a user-friendly Nextflow workflow that automates viral genome assembly, identification, classification, and annotation. This streamlined workflow integrates cutting-edge tools for processing raw sequencing data for taxonomic annotation and functional analysis. Application to a dataset of 200 metagenomic samples yielded high-quality viral genomes. ViromeFlowX enables efficient mining of viral genomic data, offering a valuable resource to investigate the gut virome’s role in virus-host interactions and virus-related diseases.
-