- Volume 8, Issue 3, 2022
- Research Articles
-
- Genomic Methodologies
-
-
Finding the right fit: evaluation of short-read and long-read sequencing approaches to maximize the utility of clinical microbiome data
A long-standing challenge in human microbiome research is achieving the taxonomic and functional resolution needed to generate testable hypotheses about the gut microbiota’s impact on health and disease. With a growing number of live microbial interventions in clinical development, this challenge is renewed by a need to understand the pharmacokinetics and pharmacodynamics of therapeutic candidates. While short-read sequencing of the bacterial 16S rRNA gene has been the standard for microbiota profiling, recent improvements in the fidelity of long-read sequencing underscore the need to re-evaluate the value of distinct microbiome-sequencing approaches. We leveraged samples from participants enrolled in a phase 1b clinical trial of a novel live biotherapeutic product to perform a comparative analysis of short-read and long-read amplicon and metagenomic sequencing approaches to assess their utility for generating clinical microbiome data. Across all methods, overall community taxonomic profiles were comparable and relationships between samples were conserved. Comparison of ubiquitous short-read 16S rRNA amplicon profiling to long-read profiling of the 16S-ITS-23S rRNA amplicon showed that only the latter provided strain-level community resolution and insight into novel taxa. All methods identified an active ingredient strain in treated study participants, though detection confidence was higher for long-read methods. Read coverage from both metagenomic methods provided evidence of active-ingredient strain replication in some treated participants. Compared to short-read metagenomics, approximately twice the proportion of long reads were assigned functional annotations. Finally, compositionally similar bacterial metagenome-assembled genomes (MAGs) were recovered from short-read and long-read metagenomic methods, although more MAGs, and more complete MAGs, were recovered from long reads.
Despite higher costs, both amplicon and metagenomic long-read approaches yielded added microbiome data value in the form of higher-confidence taxonomic and functional resolution and improved recovery of microbial genomes compared to traditional short-read methodologies.
-
-
-
Benchmarking the topological accuracy of bacterial phylogenomic workflows using in silico evolution
Phylogenetic analyses are widely used in microbiological research, for example to trace the progression of bacterial outbreaks based on whole-genome sequencing data. In practice, multiple analysis steps such as de novo assembly, alignment and phylogenetic inference are combined to form phylogenetic workflows. Comprehensive benchmarking of the accuracy of complete phylogenetic workflows is lacking. To benchmark different phylogenetic workflows, we simulated bacterial evolution under a wide range of evolutionary models, varying the relative rates of substitution, insertion, deletion, gene duplication, gene loss and lateral gene transfer events. The generated datasets corresponded to a genetic diversity usually observed within bacterial species (≥95 % average nucleotide identity). We replicated each simulation three times to assess replicability. In total, we benchmarked 19 distinct phylogenetic workflows using 8 different simulated datasets. We found that recently developed k-mer alignment methods such as kSNP and ska achieve accuracy similar to that of reference mapping. The high accuracy of k-mer alignment methods can be explained by the large fractions of genomes these methods can align, relative to other approaches. We also found that the choice of de novo assembly algorithm influences the accuracy of phylogenetic reconstruction, with workflows employing SPAdes or skesa outperforming those employing Velvet. Finally, we found that the results of phylogenetic benchmarking are highly variable between replicates. We conclude that for phylogenomic reconstruction, k-mer alignment methods are relevant alternatives to reference mapping at the species level, especially in the absence of suitable reference genomes. We show de novo genome assembly accuracy to be an underappreciated parameter required for accurate phylogenomic reconstruction.
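The abstract above does not say how topological accuracy was scored. One widely used measure is the Robinson-Foulds distance, which counts the bipartitions present in one tree but not the other. The sketch below is an illustration under that assumption, not the authors' benchmarking code; the nested-tuple tree encoding and the function names `splits` and `rf_distance` are hypothetical.

```python
def leaf_set(node):
    """Collect the leaf names below a node (leaves are plain strings)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def splits(tree):
    """Return the non-trivial bipartitions of a tree given as nested tuples."""
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)  # fixed leaf used to pick a canonical side
    found = set()

    def walk(node):
        if isinstance(node, str):
            return
        side = leaf_set(node)
        other = all_leaves - side
        # Keep only splits with at least two leaves on each side.
        if len(side) > 1 and len(other) > 1:
            found.add(other if reference in side else side)
        for child in node:
            walk(child)

    walk(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: size of the symmetric difference of splits."""
    return len(splits(tree_a) ^ splits(tree_b))


# Toy example: a simulated "true" tree and a tree with one misplaced leaf.
true_tree = ((("A", "B"), "C"), ("D", "E"))
inferred_tree = ((("A", "C"), "B"), ("D", "E"))
distance = rf_distance(true_tree, inferred_tree)
```

In a benchmark of this kind, the tree known from the simulation would play the role of `true_tree`, and each workflow's output would be scored against it; a distance of 0 means the two topologies agree.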
-
- Functional Genomics and Microbe–Niche Interactions
-
-
Population structure of ocular Streptococcus pneumoniae is highly diverse and formed by lineages that escape current vaccines
Streptococcus pneumoniae is a leading cause of ocular infections including serious and sight-threatening conditions. The use of pneumococcal conjugate vaccines (PCV) has substantially reduced the incidence of pneumonia and invasive pneumococcal diseases, but has had limited impact on ocular infections. Additionally, widespread vaccine use has resulted in ongoing selective pressure and serotype replacement in carriage and disease. To gain insight into the population structure of pneumococcal isolates causing ocular infections in a post-PCV-13 time period, we investigated the genomic epidemiology of ocular S. pneumoniae isolates (n=45) collected at Massachusetts Eye and Ear between 2014 and 2017. By applying a series of molecular typing methods to draft genomes, we found that the population structure of ocular S. pneumoniae is highly diverse, with 27 sequence types (grouped into 18 clonal complexes) and 17 serotypes identified. Distribution of these lineages diverged according to the site of isolation, with conjunctivitis commonly caused by isolates grouped in the Epidemic Conjunctivitis Cluster (ECC) (60 %), and ST448 (53.3 %) most frequently identified. Conversely, S. pneumoniae keratitis cases were caused by a highly diverse population of isolates grouping within 15 different clonal complexes. Serotyping inference demonstrated that 95.5 % of the isolates were non-PCV-13 vaccine types. Most of the conjunctivitis isolates (80 %) were unencapsulated, with the remainder belonging to serotypes 15B, 3 and 23B. On the other hand, S. pneumoniae causing keratitis were predominantly encapsulated (95.2 %) with 13 different serotypes identified, most of them non-vaccine types. Carriage of macrolide resistance genes was common in our ocular S. pneumoniae population (42.2 %), and usually associated with the mefA+msrD genotype (n=15).
These genes were located in the Macrolide Efflux Genetic Assembly cassette and were associated with low-level in vitro resistance to 14- and 15-membered macrolides. Less frequently, macrolide-resistant isolates carried an ermB gene (n=4), which was co-located with the tetM gene in a Tn-916-like transposon. Our study demonstrates that the population structure of ocular S. pneumoniae is highly diverse, mainly composed of isolates that escape the PCV-13 vaccine, with patterns of tissue/niche segregation, adaptation and specialization. These findings suggest that the population structure of ocular pneumococcus may be shaped by multiple factors including PCV-13 selective pressure, microbial-related and niche-specific host-associated features.
-
-
-
Fatal affairs – conjugational transfer of a dinoflagellate-killing plasmid between marine Rhodobacterales
The roseobacter group of marine bacteria is characterized by a mosaic distribution of ecologically important phenotypes. These are often encoded on mobile extrachromosomal replicons. So far, conjugation had only been experimentally proven between the two model organisms Phaeobacter inhibens and Dinoroseobacter shibae. Here, we show that two large natural RepABC-type plasmids from D. shibae can be transferred into representatives of all known major Rhodobacterales lineages. Complete genome sequencing of the newly established Phaeobacter inhibens transconjugants confirmed their genomic integrity. The conjugated plasmids were stably maintained as single copy number replicons in the genuine as well as the new host. Co-cultivation of Phaeobacter inhibens and the transconjugants with the dinoflagellate Prorocentrum minimum demonstrated that Phaeobacter inhibens is a probiotic strain that improves the yield and stability of the dinoflagellate culture. The transconjugant carrying the 191 kb plasmid, but not the 126 kb sister plasmid, killed the dinoflagellate in co-culture.
-
- Microbial Communities
-
-
Novel canine high-quality metagenome-assembled genomes, prophages and host-associated plasmids provided by long-read metagenomics together with Hi-C proximity ligation
The human gut microbiome has been extensively studied, yet the canine gut microbiome is still largely unknown. The availability of high-quality genomes is essential in the fields of veterinary medicine and nutrition to unravel the biological role of key microbial members in the canine gut environment. Our aim was to evaluate nanopore long-read metagenomics and Hi-C (high-throughput chromosome conformation capture) proximity ligation to provide high-quality metagenome-assembled genomes (HQ MAGs) of the canine gut environment. By combining nanopore long-read metagenomics and Hi-C proximity ligation, we retrieved 27 HQ MAGs and 7 medium-quality MAGs of a faecal sample of a healthy dog. Canine MAGs (CanMAGs) improved genome contiguity of representatives from the animal and human MAG catalogues – short-read MAGs from public datasets – for the species they represented: they were more contiguous with complete ribosomal operons and at least 18 canonical tRNAs. Both canine-specific bacterial species and gut generalists inhabit the dog’s gastrointestinal environment. Most of them belonged to Firmicutes, followed by Bacteroidota and Proteobacteria. We also assembled one Actinobacteriota and one Fusobacteriota MAG. CanMAGs harboured antimicrobial-resistance genes (ARGs) and prophages and were linked to plasmids. ARGs conferring resistance to tetracycline were most predominant within CanMAGs, followed by lincosamide and macrolide ones. At the functional level, carbohydrate transport and metabolism was the most variable within the CanMAGs, and mobilome function was abundant in some MAGs. Specifically, we assigned the mobilome functions and the associated mobile genetic elements to the bacterial host. The CanMAGs harboured 50 bacteriophages, providing novel bacterial-host information for eight viral clusters, and Hi-C proximity ligation data linked the six potential plasmids to their bacterial host.
Long-read metagenomics and Hi-C proximity ligation are likely to become a comprehensive approach to HQ MAG discovery and assignment of extra-chromosomal elements to their bacterial host. This will provide essential information for studying the canine gut microbiome in veterinary medicine and animal nutrition.
-
- Pathogens and Epidemiology
-
-
Unusual SARS-CoV-2 intrahost diversity reveals lineage superinfection
Filipe Zimmer Dezordi, Paola Cristina Resende, Felipe Gomes Naveca, Valdinete Alves do Nascimento, Victor Costa de Souza, Anna Carolina Dias Paixão, Luciana Appolinario, Renata Serrano Lopes, Ana Carolina da Fonseca Mendonça, Alice Sampaio Barreto da Rocha, Taina Moreira Martins Venas, Elisa Cavalcante Pereira, Marcelo Henrique Santos Paiva, Cassia Docena, Matheus Filgueira Bezerra, Laís Ceschini Machado, Richard Steiner Salvato, Tatiana Schäffer Gregianini, Leticia Garay Martins, Felicidade Mota Pereira, Darcita Buerger Rovaris, Sandra Bianchini Fernandes, Rodrigo Ribeiro-Rodrigues, Thais Oliveira Costa, Joaquim Cesar Sousa Jr, Fabio Miyajima, Edson Delatorre, Tiago Gräf, Gonzalo Bello, Marilda Mendonça Siqueira and Gabriel Luz Wallau
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) had infected almost 200 million people worldwide by July 2021, and the pandemic has been characterized by infection waves of viral lineages showing distinct fitness profiles. The simultaneous infection of a single individual by two distinct SARS-CoV-2 lineages may impact COVID-19 disease progression and provides a window of opportunity for viral recombination and the emergence of new lineages with differential phenotype. Several hundred SARS-CoV-2 lineages are currently well phylogenetically defined, but two main factors have precluded major coinfection/codetection and recombination analysis thus far: (i) the low diversity of SARS-CoV-2 lineages during the first year of the pandemic, which limited the identification of lineage-defining mutations necessary to distinguish coinfecting/recombining viral lineages; and (ii) the limited availability of raw sequencing data where abundance and distribution of intrasample/intrahost variability can be accessed. Here, we assembled a large sequencing dataset from Brazilian samples covering the period from 18 May 2020 to 30 April 2021 and probed it for unexpected patterns of high intrasample/intrahost variability.
This approach enabled us to detect nine cases of SARS-CoV-2 coinfection with well-characterized lineage-defining mutations, representing 0.61 % of all samples investigated. In addition, we matched these SARS-CoV-2 coinfections with spatio-temporal epidemiological data, confirming their plausibility given the lineages cocirculating in the timeframe investigated. Our data suggest that coinfection with distinct SARS-CoV-2 lineages is a rare phenomenon, although this is certainly a lower-bound estimate considering the difficulty of detecting coinfections with very similar SARS-CoV-2 lineages and the low number of samples sequenced relative to the total number of infections.
-
-
-
Phase variation in the glycosyltransferase genes of Pasteurella multocida associated with outbreaks of fowl cholera on free-range layer farms
Fowl cholera caused by Pasteurella multocida has re-emerged in Australian poultry production with the increasing adoption of free-range production systems. Currently, autogenous killed whole-cell vaccines prepared from isolates previously obtained from each farm are the main preventative measures used. In this study, we use whole-genome sequencing and phylogenomic analysis to investigate outbreak dynamics, and to monitor and compare the variations in the lipopolysaccharide (LPS) outer core biosynthesis loci of the outbreak and vaccine strains. In total, 73 isolates from two different free-range layer farms were included. Our genomic analysis revealed that all investigated isolates within the two farms (layer A and layer B) carried LPS type L3, albeit with a high degree of genetic diversity between them. Additionally, the isolates belonged to five different sequence types (STs), with isolates belonging to ST9 and ST20 being the most prevalent. The isolates carried ST-specific mutations within their LPS type L3 outer core biosynthesis loci, including frameshift mutations in the outer core heptosyltransferase gene (htpE) (ST7 and ST274) or galactosyltransferase gene (gatG) (ST20). The ST9 isolates could be separated into three groups based on their LPS outer core biosynthesis loci sequences, with evidence for potential phase variation mechanisms identified. The potential phase variation mechanisms included a tandem repeat insertion in natC and a single base deletion in a homopolymer region of gatG. Importantly, our results demonstrated that two of the three ST9 groups shared identical rep-PCR (repetitive extragenic palindromic PCR) patterns, while carrying differences in their LPS outer core biosynthesis loci region. In addition, we found that ST9 isolates with and without the natC tandem repeat insertion were both associated with a single outbreak, which indicates the importance of screening more than one isolate within an outbreak.
Our results strongly suggest the need for a metagenomics culture-independent approach, as well as a genetic typing scheme for LPS, to ensure an appropriate vaccine strain with a matching predicted LPS structure is used.
-
-
-
Genomic characterization of multidrug-resistant Salmonella serovar Kentucky ST198 isolated in poultry flocks in Spain (2011–2017)
Salmonella Kentucky is commonly found in poultry and rarely associated with human disease. However, a multidrug-resistant (MDR) S. Kentucky clone [sequence type (ST)198] has been increasingly reported globally in humans and animals. Our aim here was to assess if the recently reported increase of S. Kentucky in poultry in Spain was associated with the ST198 clone and to characterize this MDR clone and its distribution in Spain. Sixty-six isolates retrieved from turkey, laying hen and broiler in 2011–2017 were subjected to whole-genome sequencing to assess their sequence type, genetic relatedness, and presence of antimicrobial resistance genes (ARGs), plasmid replicons and virulence factors. Thirteen strains were further analysed using long-read sequencing technologies to characterize the genetic background associated with ARGs. All isolates belonged to the ST198 clone and were grouped in three clades associated with the presence of a specific point mutation in the gyrA gene, their geographical origin and isolation year. All strains carried between one and 16 ARGs whose presence correlated with the resistance phenotype to between two and eight antimicrobials. The ARGs were located in the Salmonella genomic island (SGI-1) and in some cases (blaSHV-12, catA1, cmlA1, dfrA and multiple aminoglycoside-resistance genes) in IncHI2/IncI1 plasmids, some of which were consistently detected in different years/farms in certain regions, suggesting they could persist over time. Our results indicate that the MDR S. Kentucky ST198 is present in all investigated poultry hosts in Spain, and that certain strains also carry additional plasmid-mediated ARGs, thus increasing its potential public health significance.
-
-
-
Genomic and antigenic diversity of colonizing Klebsiella pneumoniae isolates mirrors that of invasive isolates in Blantyre, Malawi
Members of the Klebsiella pneumoniae species complex, particularly K. pneumoniae subsp. pneumoniae, are antimicrobial resistance (AMR)-associated pathogens of global importance, and polyvalent vaccines targeting Klebsiella O-antigens are in development. Whole-genome sequencing has provided insight into O-antigen distribution in the K. pneumoniae species complex, as well as population structure and virulence determinants, but genomes from sub-Saharan Africa are underrepresented in global sequencing efforts. We therefore carried out a genomic analysis of extended-spectrum beta-lactamase (ESBL)-producing K. pneumoniae species complex isolates colonizing adults in Blantyre, Malawi. We placed these isolates in a global genomic context, and compared colonizing to invasive isolates from the main public hospital in Blantyre. In total, 203 isolates from stool and rectal swabs from adults were whole-genome sequenced and compared to a publicly available multicountry collection and previously sequenced Malawian and Kenyan isolates from blood or sterile sites. We inferred phylogenetic relationships and analysed the diversity of genetic loci linked to AMR, virulence, capsule and LPS O-antigen (O-types). We find that the diversity of Malawian K. pneumoniae subsp. pneumoniae isolates represents the species’ population structure, but shows distinct local signatures concerning clonal expansions. Siderophore and hypermucoidy genes were more frequent in invasive versus colonizing isolates (present in 13 % vs 1 %) but still generally lacking in most invasive isolates. O-antigen population structure and distribution were similar in invasive and colonizing isolates, with O4 more common (14 %) than in previously published studies (2–5 %). We conclude that host factors, pathogen opportunity or alternate virulence loci not linked to invasive disease elsewhere are likely to be the major determinants of invasive disease in Malawi.
Distinct ST and O-type distributions in Malawi highlight the need to sample locations where the burden of invasive Klebsiella disease is greatest to robustly define secular trends in Klebsiella diversity to assist in the development of a useful vaccine. Colonizing and invasive isolates in Blantyre are similar, hence O-typing of colonizing Klebsiella isolates may be a rapid and cost-effective approach to describe global diversity and guide vaccine development.
-
-
-
Identifying large-scale recombination and capsular switching events in Streptococcus agalactiae strains causing disease in adults in the UK between 2014 and 2015
Cases of invasive group B streptococcal infection in the adult UK population have steadily increased over recent years, with the most common serotypes being V, III and Ia, but less is known of the genetic background of these strains. We have carried out in-depth analysis of the whole-genome sequences of 193 clinically important group B Streptococcus (GBS) isolates (184 were from invasive infection, 8 were from non-invasive infection and 1 had no information on isolation site) isolated from adults and submitted to the National Reference Laboratory at the UK Health Security Agency between January 2014 and December 2015. We have determined that capsular serotypes III (26.9 %), Ia (26.4 %) and V (15.0 %) were most commonly identified, with slight differences in gender and age distribution. Most isolates (n=182) grouped to five clonal complexes (CCs), CC1, CC8/CC10, CC17, CC19 and CC23, with common associations between specific serotypes and virulence genes. Additionally, we have identified large recombination events mediating potential capsular switching events between sequence type (ST)1 serotype V and serotypes Ib (n=2 isolates), II (n=2 isolates) and VI (n=2 isolates); between ST19 serotype III and serotype V (n=5 isolates); and between CC17 serotype III and serotype IV (n=1 isolate). The high genetic diversity of disease-causing isolates and multiple recombination events reported in this study highlight the need for routine surveillance of the circulating disease-causing GBS strains. This information is crucial to better understand the global spread of GBS serotypes and genotypes, and will form the baseline information for any future GBS vaccine research in the UK and worldwide.
-
-
-
Genomic diversity and antimicrobial resistance among non-typhoidal Salmonella associated with human disease in The Gambia
Non-typhoidal Salmonella associated with multidrug resistance cause invasive disease in sub-Saharan Africa. Specific lineages of serovars Typhimurium and Enteritidis have been implicated. Here we characterized the genomic diversity of 100 clinical non-typhoidal Salmonella collected from 93 patients in 2001 from the eastern region and in 2006–2018 from the western region of The Gambia. A total of 93 isolates (64 invasive, 23 gastroenteritis and six other sites), each representing a single infection episode, were phenotypically tested for antimicrobial susceptibility using the Kirby–Bauer disc diffusion technique. Whole genome sequencing of 100 isolates was performed using Illumina, and the reads were assembled and analysed using SPAdes. The Salmonella in Silico Typing Resource (SISTR) was used for serotyping. SNP differences among the 93 isolates were determined using Roary, and phylogenetic analysis was performed in the context of 495 African strains from the European Nucleotide Archive. Salmonella serovars Typhimurium (26/64; 30.6 %) and Enteritidis (13/64; 20.3 %) were associated with invasive disease, whilst other serovars were mainly responsible for gastroenteritis (17/23; 73.9 %). The presence of three major serovar Enteritidis clades was confirmed, including the invasive West African clade, which made up more than half (11/16; 68.8 %) of the genomes. Multidrug resistance was confined to the serovar Enteritidis West African clade. The presence of this epidemic virulent clade has potential for spread of resistance and thus important implications for systematic patient management. Surveillance and epidemiological investigations to inform control are warranted.
-
-
-
Meta-analysis of the Ralstonia solanacearum species complex (RSSC) based on comparative evolutionary genomics and reverse ecology
Ralstonia solanacearum species complex (RSSC) strains are bacteria that colonize plant xylem tissue and cause vascular wilt diseases. However, individual strains vary in host range, optimal disease temperatures and physiological traits. To increase our understanding of the evolution, diversity and biology of the RSSC, we performed a meta-analysis of 100 representative RSSC genomes. These 100 RSSC genomes contain 4940 genes on average, and a pangenome analysis found that there are 3262 genes in the core genome (~60 % of the mean RSSC genome) with 13 128 genes in the extensive flexible genome. A core genome phylogenetic tree and a whole-genome similarity matrix aligned with the previously named species (R. solanacearum, R. pseudosolanacearum, R. syzygii) and phylotypes (I–IV). These analyses also highlighted a third unrecognized sub-clade of phylotype II. Additionally, we identified differences between phylotypes with respect to gene content and recombination rate, and we delineated population clusters based on the extent of horizontal gene transfer. Multiple analyses indicate that phylotype II is the most diverse phylotype, and it may thus represent the ancestral group of the RSSC. We also used our genome-based framework to test whether the RSSC sequence variant (sequevar) taxonomy is a robust method to define within-species relationships of strains. The sequevar taxonomy is based on alignments of a single conserved gene (egl). Although sequevars in phylotype II describe monophyletic groups, the sequevar system breaks down in the highly recombinogenic phylotype I, which highlights the need for an improved, cost-effective method for genotyping strains in phylotype I. Finally, we enabled quick and precise genome-based identification of newly sequenced RSSC strains by assigning Life Identification Numbers (LINs) to the 100 strains and by circumscribing the RSSC and its sub-groups in the LINbase Web service.
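The core/flexible split reported in this abstract can be illustrated with a small sketch. It assumes a strict definition in which a core gene is present in every genome; the abstract does not state the threshold or software used, so the `core_fraction` parameter, the function name `split_pangenome`, and the toy strain and gene labels are all hypothetical.

```python
def split_pangenome(genomes, core_fraction=1.0):
    """Partition gene families into core and flexible sets.

    genomes: dict mapping a strain name to the set of gene families it carries.
    core_fraction: minimum share of genomes that must carry a gene for it to
        count as core (1.0 means present in every genome; an assumption here).
    """
    counts = {}
    for genes in genomes.values():
        for gene in set(genes):
            counts[gene] = counts.get(gene, 0) + 1

    required = core_fraction * len(genomes)
    core = {gene for gene, n in counts.items() if n >= required}
    flexible = set(counts) - core
    return core, flexible


# Toy presence/absence data (labels are illustrative, not from the study).
toy_genomes = {
    "strainA": {"gene1", "gene2", "gene3"},
    "strainB": {"gene1", "gene2", "gene4"},
    "strainC": {"gene1", "gene2"},
}
core, flexible = split_pangenome(toy_genomes)
```

With real data, the sizes of `core` and `flexible` correspond to the core-genome and flexible-genome gene counts that a pangenome analysis reports.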
-
-
-
Kaptive 2.0: updated capsule and lipopolysaccharide locus typing for the Klebsiella pneumoniae species complex
The outer polysaccharide capsule and lipopolysaccharide (LPS) antigens are key targets for novel control strategies targeting Klebsiella pneumoniae and related taxa from the K. pneumoniae species complex (KpSC), including vaccines, phage and monoclonal antibody therapies. Given the importance and growing interest in these highly diverse surface antigens, we had previously developed Kaptive, a tool for rapidly identifying and typing capsule (K) and outer LPS (O) loci from whole genome sequence data. Here, we report two significant updates, now freely available in Kaptive 2.0 (https://github.com/katholt/kaptive): (i) the addition of 16 novel K locus sequences to the K locus reference database following an extensive search of >17 000 KpSC genomes; and (ii) enhanced O locus typing to enable prediction of the clinically relevant O2 antigen (sub)types, for which the genetic determinants have been recently described. We applied Kaptive 2.0 to a curated dataset of >12 000 public KpSC genomes to explore for the first time, to the best of our knowledge, the distribution of predicted O (sub)types across species, sampling niches and clones, which highlighted key differences in the distributions that warrant further investigation. As the uptake of genomic surveillance approaches continues to expand globally, the application of Kaptive 2.0 will generate novel insights essential for the design of effective KpSC control strategies.
-
- Evolution and Responses to Interventions
-
-
Targeted Sanger sequencing to recover key mutations in SARS-CoV-2 variant genome assemblies produced by next-generation sequencing
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is adaptively evolving to ensure its persistence within human hosts. It is therefore necessary to continuously monitor the emergence and prevalence of novel variants that arise. Importantly, some mutations have been associated with both molecular diagnostic failures and reduced or abrogated next-generation sequencing (NGS) read coverage in some genomic regions. Such impacts are particularly problematic when they occur in genomic regions such as those that encode the spike (S) protein, which are crucial for identifying and tracking the prevalence and dissemination dynamics of concerning viral variants. Targeted Sanger sequencing presents a fast and cost-effective means to accurately extend the coverage of whole-genome sequences. We designed a custom set of primers to amplify a 401 bp segment of the receptor-binding domain (RBD) (between positions 22698 and 23098 relative to the Wuhan-Hu-1 reference). We then designed a Sanger sequencing wet-laboratory protocol. We applied the primer set and wet-laboratory protocol to sequence 222 samples that were missing positions with key mutations K417N, E484K, and N501Y due to poor coverage after NGS. Finally, we developed SeqPatcher, a Python-based computational tool to analyse the trace files yielded by Sanger sequencing to generate consensus sequences, or take preanalysed consensus sequences in fasta format, and merge them with their corresponding whole-genome assemblies. We successfully sequenced 153 samples of 222 (69 %) using Sanger sequencing and confirmed the occurrence of key beta variant mutations (K417N, E484K, N501Y) in the S genes of 142 of 153 (93 %) samples. Additionally, one sample had the Y508F mutation and four samples the S477N mutation. Samples with RT-PCR Ct scores ranging from 13.85 to 37.47 (mean=25.70) could be Sanger sequenced efficiently.
These results show that our method and pipeline can be used to improve the quality of whole-genome assemblies produced using NGS and can be used with any pairs of the most used NGS and Sanger sequencing platforms.
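The coordinates and proportions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is not part of SeqPatcher; it assumes that the spike coding sequence begins at position 21563 in the Wuhan-Hu-1 reference (NC_045512.2, 1-based), a value taken from the public reference annotation rather than from the abstract, and the helper names `codon_span` and `covered` are hypothetical.

```python
# Assumption: 1-based start of the spike (S) coding sequence in NC_045512.2.
S_START = 21563
# Inclusive amplicon bounds quoted in the abstract.
AMPLICON = (22698, 23098)


def codon_span(aa_position, cds_start=S_START):
    """Return the 1-based genomic coordinates of a spike codon (first, last)."""
    first = cds_start + 3 * (aa_position - 1)
    return first, first + 2


def covered(aa_position, amplicon=AMPLICON):
    """True if the whole codon lies inside the amplicon."""
    first, last = codon_span(aa_position)
    return amplicon[0] <= first and last <= amplicon[1]


amplicon_length = AMPLICON[1] - AMPLICON[0] + 1

# Mutations named in the abstract, keyed by spike amino-acid position.
mutations = {"K417N": 417, "S477N": 477, "E484K": 484, "N501Y": 501, "Y508F": 508}
coverage = {name: covered(position) for name, position in mutations.items()}

# Proportions reported in the abstract, recomputed from the raw counts.
sequenced_pct = round(100 * 153 / 222)
confirmed_pct = round(100 * 142 / 153)
```

Under the stated assumption, every listed codon falls inside the amplicon, the amplicon length matches the quoted 401 bp, and the recomputed percentages agree with the 69 % and 93 % given in the text.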
-
-
-
Global population structure of the Serratia marcescens complex and identification of hospital-adapted lineages in the complex
Serratia marcescens is an important nosocomial pathogen causing various opportunistic infections, such as urinary tract infections, bacteremia and sometimes even hospital outbreaks. The recent emergence and spread of multidrug-resistant (MDR) strains further pose serious threats to global public health. This bacterium is also ubiquitously found in natural environments, but the genomic differences between clinical and environmental isolates are not clear, including those between S. marcescens and its close relatives. In this study, we performed a large-scale genome analysis of S. marcescens and closely related species (referred to as the ‘S. marcescens complex’), including more than 200 clinical and environmental strains newly sequenced here. Our analysis revealed their phylogenetic relationships and complex global population structure, comprising 14 clades, which were defined based on whole-genome average nucleotide identity. Clades 10, 11, 12 and 13 corresponded to S. nematodiphila, S. marcescens sensu stricto, S. ureilytica and S. surfactantfaciens, respectively. Several clades exhibited distinct genome sizes and GC contents, and a negative correlation between these genomic parameters was observed in each clade. This correlation was associated with the acquisition of mobile genetic elements (MGEs), but different types of MGEs, plasmids or prophages (and other integrative elements), were found to contribute to the generation of these genomic variations. Importantly, clades 1 and 2 mostly comprised clinical or hospital environment isolates and accumulated a wide range of antimicrobial resistance genes, including various extended-spectrum β-lactamase and carbapenemase genes, and fluoroquinolone target site mutations, leading to a high proportion of MDR strains. This finding suggests that clades 1 and 2 represent hospital-adapted lineages in the S. marcescens complex, although their potential virulence is currently unknown.
These data provide an important genomic basis for reconsidering the classification of this group of bacteria and reveal novel insights into their evolution, biology and differential importance in clinical settings.
- Methods
- Genomic Methodologies
Methods for the targeted sequencing and analysis of integrons and their gene cassettes from complex microbial communities
Integrons are microbial genetic elements that can integrate mobile gene cassettes. They are mostly known for spreading antibiotic resistance cassettes among human pathogens. However, beyond clinical settings, gene cassettes encode an extraordinarily diverse range of functions important for bacterial adaptation. The recovery and sequencing of cassettes have promising applications, including: surveillance of clinically important genes, particularly antibiotic resistance determinants; investigating the functional diversity of integron-carrying bacteria; and novel enzyme discovery. Although gene cassettes can be directly recovered using PCR, there are no standardised methods for their amplification and, importantly, for validating sequences as genuine integron gene cassettes. Here, we present reproducible methods for the amplification, sequence processing, and validation of gene cassette amplicons from complex communities. We describe two different PCR assays that amplify either cassettes together with integron integrases, or gene cassettes together within cassette arrays. We compare the performance of Nanopore and Illumina sequencing, and present bioinformatic pipelines that filter sequences to ensure that they represent amplicons from genuine integrons. Using a diverse set of environmental DNAs, we show that our approach can consistently recover thousands of unique cassettes per sample and up to hundreds of different integron integrases. Recovered cassettes confer a wide range of functions, including antibiotic resistance, with as many as 300 resistance cassettes found in a single sample. In particular, we show that class 1 integrons are collecting and concentrating resistance genes out of the broader diversity of cassette functions. The methods described here can be applied to any environmental or clinical microbiome sample.
- Research Articles
- Pathogens and Epidemiology
A high-quality reference genome for the fish pathogen Streptococcus iniae
Fish mortality caused by Streptococcus iniae is a major economic problem in aquaculture in warm and temperate regions globally. There is also a risk of zoonotic infection by S. iniae through handling of contaminated fish. In this study, we present the complete genome sequence of S. iniae strain QMA0248, isolated from farmed barramundi in South Australia. The 2.12 Mb genome of S. iniae QMA0248 carries a 32 kb prophage, a 12 kb genomic island and 92 discrete insertion sequence (IS) elements. These include nine novel IS types that belong mostly to the IS3 family. Comparative and phylogenetic analysis between S. iniae QMA0248 and publicly available complete S. iniae genomes revealed discrepancies that are probably due to misassembly in the genomes of isolates ISET0901 and ISNO. Long-range PCR confirmed five rRNA loci in the PacBio assembly of QMA0248, and, unlike S. iniae 89353, no tandemly repeated rRNA loci in the consensus genome. However, we found sequence read evidence that the tandem rRNA repeat existed within a subpopulation of the original QMA0248 culture. Subsequent nanopore sequencing revealed that the tandem rRNA repeat was the most prevalent genotype, suggesting that there is selective pressure to maintain fewer rRNA copies under uncertain laboratory conditions. Our study not only highlights assembly problems in existing genomes, but also provides a high-quality reference genome for S. iniae QMA0248, including manually curated mobile genetic elements, that will assist future S. iniae comparative genomic and evolutionary studies.
Use of genomics to explore AMR persistence in an outdoor pig farm with low antimicrobial usage
Food animals may be reservoirs of antimicrobial resistance (AMR) passing through the food chain, but little is known about AMR prevalence in bacteria when selective pressure from antimicrobials is low or absent. We monitored antimicrobial-resistant Escherichia coli over 1 year in a UK outdoor pig farm with low antimicrobial usage (AMU) compared to conventional pig farms in the United Kingdom. Short-read and selected long-read whole-genome sequencing (WGS) was performed to identify AMR genes, phylogeny and mobile elements in 385 E. coli isolates purified mainly from pig and some seagull faeces. Generally, low levels of antimicrobial-resistant E. coli were present, probably due to low AMU. Those present were likely to be multi-drug resistant (MDR) and to belong to particular Sequence Types (STs) such as ST744, ST88 or ST44, with shared clones (<14 Single Nucleotide Polymorphisms (SNPs) apart) isolated from different time points, indicating epidemiological linkage within pigs of different ages, and between pig and wild bird faeces. Although the importance of horizontal transmission of AMR is well established, there was limited evidence of plasmid-mediated dissemination between different STs. Non-conjugable MDR plasmids or large AMR gene-bearing transposons were stably integrated within the chromosome and remained associated with particular STs/clones over the time period sampled. Heavy metal resistance genes were also detected within some genetic elements. This study highlights that although low levels of antimicrobial-resistant E. coli correlate with low AMU, a basal level of MDR E. coli can still persist on farm, potentially due to transmission and recycling of particular clones within different pig groups. Environmental factors such as wild birds and heavy metal contaminants may also play important roles in this recycling and dissemination, and hence enable persistence of MDR E. coli.
All such factors need to be considered, as any rise in AMU on low-usage farms could, in future, result in a significant increase in their AMR burden.
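As an illustration of the "shared clones (<14 SNPs apart)" criterion, the sketch below groups isolates whose pairwise SNP distance falls under a threshold. It is a minimal example under stated assumptions, not the study's pipeline: the abstract does not say how distances were computed or how clones were grouped, so the Hamming distance over pre-aligned sequences, the single-linkage grouping, and the names `snp_distance` and `clone_groups` are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences.

    Assumption: sequences come from the same alignment (equal length).
    Positions where either sequence has a non-ACGT symbol (N, gap) are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x in valid and y in valid and x != y
    )


def clone_groups(seqs: Dict[str, str], threshold: int = 14) -> List[List[str]]:
    """Single-linkage grouping: isolates fewer than `threshold` SNPs apart
    are placed in the same group (an assumed grouping rule)."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Toy data (invented for illustration); a threshold of 2 is used because the
# toy sequences are only 8 bases long.
toy = {
    "pig_A": "ACGTACGT",
    "gull_B": "ACGTACGA",
    "pig_C": "TTTTACGT",
}
toy_groups = clone_groups(toy, threshold=2)
```

With real data, the input would be a core-genome alignment and the default threshold of 14 would apply; single linkage can chain distant isolates together through intermediates, so a stricter grouping rule may be preferable in practice.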
- Evolution and Responses to Interventions
Global evolutionary dynamics and resistome analysis of Clostridioides difficile ribotype 017
Clostridioides difficile PCR ribotype (RT) 017 ranks among the most successful strains of C. difficile in the world. In the past three decades, it has caused outbreaks on four continents, more than other ‘epidemic’ strains, but our understanding of the genomic epidemiology underpinning the spread of C. difficile RT 017 is limited. Here, we performed high-resolution phylogenomic and Bayesian evolutionary analyses on an updated and more representative dataset of 282 non-clonal C. difficile RT 017 isolates collected worldwide between 1981 and 2019. These analyses place the estimated time of global dissemination between 1953 and 1983 and identify the acquisition of the ermB-positive transposon Tn6194 as a key factor behind global emergence. This coincided with the introduction of clindamycin, a key inciter of C. difficile infection, into clinical practice in the 1960s. Based on the genomic data alone, the origin of C. difficile RT 017 could not be determined; however, geographical data and records of population movement suggest that C. difficile RT 017 had been moving between Asia and Europe since the Middle Ages and was later transported to North America around 1860 (95 % confidence interval: 1622–1954). A focused epidemiological study of 45 clinical C. difficile RT 017 genomes from a cluster in a tertiary hospital in Thailand revealed that the population consisted of two groups of multidrug-resistant (MDR) C. difficile RT 017 and a group of early, non-MDR C. difficile RT 017. The significant genomic diversity within each MDR group suggests that although they were all isolated from hospitalized patients, there was probably a reservoir of C. difficile RT 017 in the community that contributed to the spread of this pathogen.
- Personal Views
- Genomic Methodologies
Software testing in microbial bioinformatics: a call to action
Computational algorithms have become an essential component of research, with great efforts by the scientific community to raise standards on development and distribution of code. Despite these efforts, sustainability and reproducibility are major issues since continued validation through software testing is still not a widely adopted practice. Here, we report seven recommendations that help researchers implement software testing in microbial bioinformatics. We have developed these recommendations based on our experience from a collaborative hackathon organised prior to the American Society for Microbiology Next Generation Sequencing (ASM NGS) 2020 conference. We also present a repository hosting examples and guidelines for testing, available from https://github.com/microbinfie-hackathon2020/CSIS.
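To make the idea of software testing concrete, the following is a minimal sketch of known-answer, property and error-handling tests for two small sequence utilities. It is not taken from the repository above; the functions `reverse_complement` and `gc_content` and the test cases are hypothetical examples chosen for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    try:
        return "".join(complement[base] for base in reversed(seq.upper()))
    except KeyError as err:
        raise ValueError(f"unexpected base: {err.args[0]!r}") from None


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases; an empty sequence gives 0.0."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def run_tests() -> None:
    # Known-answer tests: small inputs whose results can be checked by hand.
    assert reverse_complement("ATGC") == "GCAT"
    assert reverse_complement("acgtn") == "NACGT"
    assert gc_content("ATGC") == 0.5

    # Edge cases: empty input should not crash.
    assert reverse_complement("") == ""
    assert gc_content("") == 0.0

    # Property test: applying the function twice returns the original sequence.
    for seq in ("A", "GATTACA", "NNNN"):
        assert reverse_complement(reverse_complement(seq)) == seq

    # Error handling: invalid symbols should fail loudly, not silently.
    try:
        reverse_complement("ATXG")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for an invalid base")


run_tests()
```

The same assertions could be moved into `test_*` functions and collected by a test runner such as pytest, then run automatically on every change through continuous integration.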