Volume 7, Issue 1, 2021
- BioResource
- Pathogens and Epidemiology
ActDES – a curated Actinobacterial Database for Evolutionary Studies
Actinobacteria is a large and diverse phylum of bacteria that contains medically and ecologically relevant organisms. Many members are valuable sources of bioactive natural products and chemical precursors that are exploited in the clinic and made using the enzyme pathways encoded in their complex genomes. Whilst the number of sequenced genomes has increased rapidly in the last 20 years, the large size, complexity and high G+C content of many actinobacterial genomes mean that the sequences remain incomplete and consist of large numbers of contigs with poor annotation, which hinders large-scale comparative genomic and evolutionary studies. To enable greater understanding and exploitation of actinobacterial genomes, specialized genomic databases must be linked to high-quality genome sequences. Here, we provide a curated database of 612 high-quality actinobacterial genomes from 80 genera, chosen to represent a broad phylogenetic group with equivalent genome re-annotation. Utilizing this database will provide researchers with a framework for evolutionary and metabolic studies, a foundation for genome and metabolic engineering, and a means to facilitate discovery of novel bioactive therapeutics and studies on gene family evolution. This article contains data hosted by Microreact.
- Research Article
- Genomic Methodologies
Translatability of WGS typing results can simplify data exchange for surveillance and control of Listeria monocytogenes
Where classical epidemiology has proven to be inadequate for surveillance and control of foodborne pathogens, molecular epidemiology, using genomic typing methods, can add value. However, the analysis of whole genome sequencing (WGS) data varies widely and is not yet fully harmonised. We used genomic data on 494 Listeria monocytogenes isolates from ready-to-eat food products and food processing environments deposited in the strain collection of the German National Reference Laboratory to compare various procedures for WGS data analysis and to evaluate compatibility of results. Two different core genome multilocus sequence typing (cgMLST) schemes, different reference genomes in single nucleotide polymorphism (SNP) analysis and commercial as well as open-source software were compared. Correlation of allele distances from the different cgMLST approaches was high, ranging from 0.97 to 1, and unified thresholds yielded higher clustering concordance than scheme-specific thresholds. The number of detected SNP differences could be increased up to a factor of 3.9 using a specific reference genome compared with a general one. Additionally, specific reference genomes improved comparability of SNP analysis results obtained using different software tools. The use of a closed or a draft specific reference genome did not make a difference. The harmonisation of WGS data analysis will finally guarantee seamless data exchange, but, in the meantime, knowledge on threshold values that lead to comparable clustering of isolates by different methods may improve communication between laboratories. We therefore established a translation code between commonly applied cgMLST and SNP methods based on optimised clustering concordances. This code can work as a first filter to identify WGS-based typing matches resulting from different methods, which opens up a new perspective for data exchange and thereby accelerates time-critical analyses, such as in outbreak investigations.
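The threshold-based clustering that this abstract refers to can be illustrated with a minimal sketch. It is not the study's pipeline. It assumes that a cgMLST profile is a list of allele numbers per locus, that loci with a missing call (`None`) are ignored in pairwise comparisons, and that isolates are grouped by single linkage. The isolate names, profiles and the threshold of 1 are hypothetical toy values, not figures from the article.

```python
from itertools import combinations


def allele_distance(profile_a, profile_b):
    """Count loci where both profiles have a called allele and the alleles differ."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def single_linkage_clusters(profiles, threshold):
    """Group isolates linked by chains of pairwise distances <= threshold."""
    parent = {name: name for name in profiles}

    def find(x):
        # Follow parent pointers to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if allele_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy profiles (five loci each); None marks a missing allele call.
profiles = {
    "iso1": [1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 2],
    "iso3": [1, 1, 1, 2, 2],
    "iso4": [5, 6, 7, 8, None],
}

clusters = single_linkage_clusters(profiles, threshold=1)
print(clusters)
```

With single linkage, iso1 and iso3 end up in the same cluster even though they differ at two loci, because each is within the threshold of iso2. This chaining effect is one reason why the choice of threshold affects how well clusterings from different methods agree.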
Determining the serotype composition of mixed samples of pneumococcus using whole-genome sequencing
Serotyping of Streptococcus pneumoniae is a critical tool in the surveillance of the pathogen and in the development and evaluation of vaccines. Whole-genome DNA sequencing and analysis is becoming increasingly common and is an effective method for pneumococcal serotype identification of pure isolates. However, because of the complexities of the pneumococcal capsular loci, current analysis software requires samples to be pure (or nearly pure) and only contain a single pneumococcal serotype. We introduce a new software tool called SeroCall, which can identify and quantitate the serotypes present in samples, even when several serotypes are present. The sample preparation, library preparation and sequencing follow standard laboratory protocols. The software runs as fast as or faster than existing identification tools on typical computing servers and is freely available under an open source licence at https://github.com/knightjimr/serocall. Using samples with known concentrations of different serotypes as well as blinded samples, we were able to accurately quantify the abundance of different serotypes of pneumococcus in mixed cultures, with 100 % accuracy for detecting the major serotype and up to 86 % accuracy for detecting minor serotypes. We were also able to track changes in serotype frequency over time in an experimental setting. This approach could be applied in both epidemiological field studies of pneumococcal colonization and experimental laboratory studies, and could provide a cheaper and more efficient method for serotyping than alternative approaches.
Identifying novel β-lactamase substrate activity through in silico prediction of antimicrobial resistance
Diagnosing antimicrobial resistance (AMR) in the clinic is based on empirical evidence and current gold standard laboratory phenotypic methods. Genotypic methods have the potential advantages of being faster and cheaper, and having improved mechanistic resolution over phenotypic methods. We generated and applied rule-based and logistic regression models to predict the AMR phenotype from Escherichia coli and Pseudomonas aeruginosa multidrug-resistant clinical isolate genomes. By inspecting and evaluating these models, we identified previously unknown β-lactamase substrate activities. In total, 22 unknown β-lactamase substrate activities were experimentally validated using targeted gene expression studies. Our results demonstrate that generating and analysing predictive models can help guide researchers to the mechanisms driving resistance and improve annotation of AMR genes and phenotypic prediction, and suggest that we cannot solely rely on curated knowledge to predict resistance phenotypes.
- Microbial Communities
Intraspecies plasmid and genomic variation of Mycobacterium kubicae revealed by the complete genome sequences of two clinical isolates
Mycobacterium kubicae is one of nearly 200 species of nontuberculous mycobacteria (NTM), environmental micro-organisms that in some situations can infect humans and cause severe lung, skin and soft tissue infections. Although numerous studies have investigated the genetic variation among prevalent clinical NTM species, including Mycobacterium abscessus and Mycobacterium avium, many of the less common but clinically relevant NTM species, including M. kubicae, still lack complete genomes to serve as a comparative reference. Well-characterized representative genomes for each NTM species are important both for investigating the pathogenic potential of NTM and for use in diagnostic methods, even for species that less frequently cause human disease. Here, we report the complete genomes of two M. kubicae strains, isolated from two unrelated patients. Hybrid short-read and long-read sequencing and assembly, using sequence reads from Illumina and Oxford Nanopore Technologies platforms, were utilized to resolve the chromosome and plasmid sequences of each isolate. The genome of NJH_MKUB1 had 5135 coding sequences (CDSs), a circular chromosome of length 5.3 Mb and two plasmids. The genome of NJH_MKUB2 had 5957 CDSs, a circular chromosome of 6.0 Mb and five plasmids. We compared our completed genomic assemblies to four recently released draft genomes of M. kubicae in order to better understand intraspecies genomic conservation and variability. We also identified genes implicated in drug resistance, virulence and persistence in the M. kubicae chromosome and plasmids. Virulence factors encoded in the genome and in the plasmids of M. kubicae provide a foundation for investigating how opportunistic environmental NTM may cause disease.
- Pathogens and Epidemiology
Cryptic prophages within a Streptococcus pyogenes genotype emm4 lineage
The major human pathogen Streptococcus pyogenes shares an intimate evolutionary history with mobile genetic elements, which in many cases carry genes encoding bacterial virulence factors. During recent whole-genome sequencing of a longitudinal sample of S. pyogenes isolates in England, we identified a lineage within emm4 that clustered with the reference genome MEW427. Like MEW427, this lineage was characterized by substantial gene loss within all three prophage regions, compared to MGAS10750 and isolates outside of the MEW427-like lineage. Gene loss primarily affected lysogeny, replicative and regulatory modules, and to a lesser and more variable extent, structural genes. Importantly, prophage-encoded superantigen and DNase genes were retained in all isolates. In isolates where the prophage elements were complete, like MGAS10750, they could be induced experimentally, but not in MEW427-like isolates with degraded prophages. We also found gene loss within the chromosomal island SpyCIM4 of MEW427-like isolates, although surprisingly, the SpyCIM4 element could not be experimentally induced in either MGAS10750-like or MEW427-like isolates. This did not, however, appear to abolish expression of the mismatch repair operon, within which this element resides. The inclusion of further emm4 genomes in our analyses ratified our observations and revealed an international emm4 lineage characterized by prophage degradation. Intriguingly, the USA population of emm4 S. pyogenes appeared to constitute predominantly MEW427-like isolates, whereas the UK population comprised both MEW427-like and MGAS10750-like isolates. The degraded and cryptic nature of these elements may have important phenotypic and fitness ramifications for emm4 S. pyogenes, and the geographical distribution of this lineage raises interesting questions on the population dynamics of the genotype.
Genomic diversity of Escherichia coli isolates from backyard chickens and guinea fowl in the Gambia
Chickens and guinea fowl are commonly reared in Gambian homes as affordable sources of protein. Using standard microbiological techniques, we obtained 68 caecal isolates of Escherichia coli from 10 chickens and 9 guinea fowl in rural Gambia. After Illumina whole-genome sequencing, 28 sequence types were detected in the isolates (4 of them novel), of which ST155 was the most common (22/68, 32 %). These strains span four of the eight main phylogroups of E. coli, with phylogroups B1 and A being most prevalent. Nearly a third of the isolates harboured at least one antimicrobial resistance gene, while most of the ST155 isolates (14/22, 64 %) encoded resistance to ≥3 classes of clinically relevant antibiotics, as well as putative virulence factors, suggesting pathogenic potential in humans. Furthermore, hierarchical clustering revealed that several Gambian poultry strains were closely related to isolates from humans. Although the ST155 lineage is common in poultry from Africa and South America, the Gambian ST155 isolates belong to a unique cgMLST cluster comprising closely related (38–39 allele differences) isolates from poultry and livestock from sub-Saharan Africa, suggesting that strains can be exchanged between poultry and livestock in this setting. Continued surveillance of E. coli and other potential pathogens in rural backyard poultry from sub-Saharan Africa is warranted.
Genomic epidemiology of Escherichia coli isolates from a tertiary referral center in Lilongwe, Malawi
Antimicrobial resistance (AMR) is a global threat, including in sub-Saharan Africa. However, little is known about the genetics of resistant bacteria in the region. In Malawi, there is growing concern about increasing rates of antimicrobial resistance to most empirically used antimicrobials. The highly drug resistant Escherichia coli sequence type (ST) 131, which is associated with the extended spectrum β-lactamase blaCTX-M-15, has been increasing in prevalence globally. Previous data from isolates collected between 2006 and 2013 in southern Malawi have revealed the presence of ST131 and the blaCTX-M-15 gene in the country. We performed whole genome sequencing (WGS) of 58 clinical E. coli isolates at Kamuzu Central Hospital, a tertiary care centre in central Malawi, collected from 2012 to 2018. We used Oxford Nanopore Technologies (ONT) sequencing, which was performed in Malawi. We show that ST131 is observed more often (14.9% increasing to 32.8%) and that the blaCTX-M-15 gene is occurring at a higher frequency (21.3% increasing to 44.8%). Phylogenetics indicates that isolates are highly related between the central and southern geographic regions and confirms that ST131 isolates are contained in a single group. All AMR genes, including blaCTX-M-15, were widely distributed across sequence types. We also identified an increased number of ST410 isolates, which in this study tend to carry a plasmid-located copy of the blaCTX-M-15 gene at a higher frequency than blaCTX-M-15 occurs in ST131. This study confirms the expanding nature of ST131 and the wide distribution of the blaCTX-M-15 gene in Malawi. We also highlight the feasibility of conducting longitudinal genomic epidemiology studies of important bacteria with the sequencing done on site using a nanopore platform that requires minimal infrastructure.
Quantitative analysis of the splice variants expressed by the major hepatitis B virus genotypes
Hepatitis B virus (HBV) is a major human pathogen that causes liver diseases. The main HBV RNAs are unspliced transcripts that encode the key viral proteins. Recent studies have shown that some of the HBV spliced transcript isoforms are predictive of liver cancer, yet the roles of these spliced transcripts remain elusive. Furthermore, there are nine major HBV genotypes common in different regions of the world, and these genotypes may express different spliced transcript isoforms. To systematically study the HBV splice variants, we transfected human hepatoma cells, Huh7, with four HBV genotypes (A2, B2, C2 and D3), followed by deep RNA-sequencing. We found that 13–28 % of HBV RNAs were splice variants, which were reproducibly detected across independent biological replicates. These comprised 6 novel and 10 previously identified splice variants. In particular, a novel, singly spliced transcript was detected in genotypes A2 and D3 at high levels. The biological relevance of these splice variants was supported by their identification in HBV-positive liver biopsy and serum samples, and in HBV-infected primary human hepatocytes. Interestingly, the levels of HBV splice variants varied across the genotypes, but the spliced pregenomic RNA SP1 and SP9 were the two most abundant splice variants. Counterintuitively, these singly spliced SP1 and SP9 variants had a suboptimal 5′ splice site, supporting the idea that splicing of HBV RNAs is tightly controlled by the viral post-transcriptional regulatory RNA element.
Comparative genomics revealed adaptive admixture in Cryptosporidium hominis in Africa
Swapnil Tichkule, Aaron R. Jex, Cock van Oosterhout, Anna Rosa Sannella, Ralf Krumkamp, Cassandra Aldrich, Oumou Maiga-Ascofare, Denise Dekker, Maike Lamshöft, Joyce Mbwana, Njari Rakotozandrindrainy, Steffen Borrmann, Thorsten Thye, Kathrin Schuldt, Doris Winter, Peter G. Kremsner, Kwabena Oppong, Prince Manouana, Mirabeau Mbong, Samwel Gesase, Daniel T. R. Minja, Ivo Mueller, Melanie Bahlo, Johanna Nader, Jürgen May, Raphael Rakotozandrindrain, Ayola Akim Adegnika, John P. A. Lusingu, John Amuasi, Daniel Eibach and Simone Mario Caccio
Cryptosporidiosis is a major cause of diarrhoeal illness among African children, and is associated with childhood mortality, malnutrition, cognitive development and growth retardation. Cryptosporidium hominis is the dominant pathogen in Africa, and genotyping at the glycoprotein 60 (gp60) gene has revealed a complex distribution of different subtypes across this continent. However, a comprehensive exploration of the metapopulation structure and evolution based on whole-genome data has yet to be performed. Here, we sequenced and analysed the genomes of 26 C. hominis isolates, representing different gp60 subtypes, collected at rural sites in Gabon, Ghana, Madagascar and Tanzania. Phylogenetic and cluster analyses based on single-nucleotide polymorphisms showed that isolates predominantly clustered by their country of origin, irrespective of their gp60 subtype. We found a significant isolation-by-distance signature that shows the importance of local transmission, but we also detected evidence of hybridization between isolates of different geographical regions. We identified 37 outlier genes with exceptionally high nucleotide diversity, and this group is significantly enriched for genes encoding extracellular proteins and signal peptides. Furthermore, these genes are found more often than expected in recombinant regions, and they show a distinct signature of positive or balancing selection. We conclude that: (1) the metapopulation structure of C. hominis can only be accurately captured by whole-genome analyses; (2) local anthroponotic transmission underpins the spread of this pathogen in Africa; (3) hybridization occurs between distinct geographical lineages; and (4) genetic introgression provides novel substrate for positive or balancing selection in genes involved in host–parasite coevolution.
Kill and cure: genomic phylogeny and bioactivity of Burkholderia gladioli bacteria capable of pathogenic and beneficial lifestyles
Burkholderia gladioli is a bacterium with a broad ecology spanning disease in humans, animals and plants, but also encompassing multiple beneficial interactions. It is a plant pathogen, a toxin-producing food-poisoning agent, and causes lung infections in people with cystic fibrosis (CF). Contrasting beneficial traits include antifungal production exploited by insects to protect their eggs, plant protective abilities and antibiotic biosynthesis. We explored the genomic diversity and specialized metabolic potential of 206 B. gladioli strains, phylogenomically defining 5 clades. Historical disease pathovars (pv.) B. gladioli pv. allicola and B. gladioli pv. cocovenenans were distinct, while B. gladioli pv. gladioli and B. gladioli pv. agaricicola were indistinguishable; soft-rot disease and CF infection were conserved across all pathovars. Biosynthetic gene clusters (BGCs) for toxoflavin, caryoynencin and enacyloxin were dispersed across B. gladioli, but bongkrekic acid and gladiolin production were clade-specific. Strikingly, 13 % of CF infection strains characterized were bongkrekic acid-positive, uniquely linking this food-poisoning toxin to this aspect of B. gladioli disease. Mapping the population biology and metabolite production of B. gladioli has shed light on its diverse ecology, and by demonstrating that the antibiotic trimethoprim suppresses bongkrekic acid production, a potential therapeutic strategy to minimize poisoning risk in CF has been identified.
