
Abstract

Haemophilus influenzae is part of the human nasopharyngeal microbiota and a pathogen causing invasive disease. The extensive genetic diversity observed in H. influenzae necessitates discriminatory analytical approaches to evaluate its population structure. This study developed a core genome multilocus sequence typing (cgMLST) scheme for H. influenzae using pangenome analysis tools and validated the scheme using datasets consisting of complete reference genomes (n = 14) and high-quality draft genomes (n = 2297). The draft genome dataset was divided into a development dataset (n = 921) and a validation dataset (n = 1376). The development dataset was used to identify potential core genes, and the validation dataset was used to refine the final core gene list to ensure the reliability of the proposed cgMLST scheme. Functional classifications were made for all the resulting core genes. Phylogenetic analyses were performed using both allelic profiles and nucleotide sequence alignments of the core genome, and congruence between the two was assessed by Spearman’s correlation and ordinary least squares linear regression tests. Preliminary analyses using the development dataset identified 1067 core genes, which were refined to 1037 with the validation dataset. More than 70% of core genes were predicted to encode proteins essential for metabolism or genetic information processing. Phylogenetic and statistical analyses indicated that the core genome allelic profile accurately represented phylogenetic relatedness among the isolates (coefficient = 0.945). We used this cgMLST scheme to define a high-resolution population structure for H. influenzae, which enhances the genomic analysis of this clinically relevant human pathogen.
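The congruence test described in the abstract can be illustrated with a small sketch: compute pairwise distances between isolates from their allelic profiles and from their aligned core genome sequences, then compare the two sets of distances with Spearman's rank correlation and the R² of a simple linear regression. This is not the authors' pipeline. The function names, the four toy isolates, their allele numbers and their sequences are all invented for illustration, and the sketch uses only the Python standard library.

```python
from itertools import combinations


def allelic_distance(p, q):
    """Number of loci with different allele numbers; loci missing (None) in either profile are skipped."""
    return sum(1 for a, b in zip(p, q) if a is not None and b is not None and a != b)


def nucleotide_distance(s, t):
    """Number of mismatched positions between two aligned, equal-length sequences."""
    return sum(1 for x, y in zip(s, t) if x != y)


def average_ranks(values):
    """Rank values starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant (non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def ols_r_squared(x, y):
    """R^2 of the simple linear regression y ~ x, which equals the squared Pearson correlation."""
    return pearson(x, y) ** 2


# Hypothetical toy data: four isolates typed at four core loci, with a short core alignment.
profiles = {
    "iso1": [1, 1, 1, 1],
    "iso2": [1, 1, 1, 2],
    "iso3": [1, 2, 2, 2],
    "iso4": [3, 2, 2, 3],
}
alignment = {
    "iso1": "AAAAAAAA",
    "iso2": "AAAAAAAT",
    "iso3": "AACCAATT",
    "iso4": "GACCAGTT",
}

pairs = list(combinations(sorted(profiles), 2))
allelic = [allelic_distance(profiles[a], profiles[b]) for a, b in pairs]
nucleotide = [nucleotide_distance(alignment[a], alignment[b]) for a, b in pairs]

rho = spearman(allelic, nucleotide)
r2 = ols_r_squared(allelic, nucleotide)
print(f"Spearman rho = {rho:.3f}, OLS R^2 = {r2:.3f}")
```

In practice the pairwise distances would more likely come from the trees or distance matrices produced by the phylogenetic software, and a statistics library would replace the hand-written rank and regression helpers. Note also that pairwise distances are not independent observations, so any p-value from a standard test on them should be read with care.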

Funding
This study was supported by the:
  • National Institute for Health and Care Research (Award MEVacP)
    • Principal Award Recipient: Martin C. J. Maiden
  • Wellcome Trust (Award 218205/Z/19/Z)
    • Principal Award Recipient: Martin C. J. Maiden

This is an open-access article distributed under the terms of the Creative Commons Attribution License. This article was made open access via a Publish and Read agreement between the Microbiology Society and the corresponding author’s institution.
/content/journal/mgen/10.1099/mgen.0.001281
2024-08-09
2024-11-10


Supplements

Supplementary material 1

PDF

Supplementary material 2

EXCEL