%0 Journal Article
%A Nagaraj, Geetha
%A Govindan, Vandana
%A Ganaie, Feroze
%A Venkatesha, V. T.
%A Hawkins, Paulina A.
%A Gladstone, Rebecca A.
%A McGee, Lesley
%A Breiman, Robert F.
%A Bentley, Stephen D.
%A Klugman, Keith P.
%A Lo, Stephanie W.
%A Ravikumar, K. L.
%T Streptococcus pneumoniae genomic datasets from an Indian population describing pre-vaccine evolutionary epidemiology using a whole genome sequencing approach
%D 2021
%J Microbial Genomics
%V 7
%N 9
%@ 2057-5858
%C 000645
%R https://doi.org/10.1099/mgen.0.000645
%K global pneumococcal sequence cluster
%K S. pneumoniae
%K pre-vaccine
%K genomic dataset
%K India
%I Microbiology Society
%X Globally, India has a high burden of pneumococcal disease, and pneumococcal conjugate vaccine (PCV) has been rolled out in different phases across the country since May 2017 in the national infant immunization programme (NIP). To provide a baseline for assessing the impact of the vaccine on circulating pneumococci in India, genetic characterization of pneumococcal isolates detected prior to introduction of PCV would be helpful. Here we present a population genomic study of 480 Streptococcus pneumoniae isolates collected across India and from all age groups before vaccine introduction (2009–2017), including 294 isolates from pneumococcal disease and 186 collected through nasopharyngeal surveys. Population genetic structure, serotype and antimicrobial susceptibility profile were characterized and predicted from whole-genome sequencing data. Our findings revealed high levels of genetic diversity represented by 110 Global Pneumococcal Sequence Clusters (GPSCs) and 54 serotypes. Serotype 19F and GPSC1 (CC320) were the most common serotype and pneumococcal lineage, respectively. Coverage of PCV13 (Pfizer) and 10-valent Pneumosil (Serum Institute of India) serotypes in the age groups of ≤2 and 3–5 years was 63–75 % and 60–69 %, respectively. Coverage of PPV23 (Merck) serotypes in the age group of ≥50 years was 62 % (98/158).
Among the top five lineages causing disease, GPSC10 (CC230), which ranked second, is the only lineage that expressed both PCV13 (serotypes 3, 6A, 14, 19A and 19F) and non-PCV13 (7B, 13, 10A, 11A, 13, 15B/C, 22F, 24F) serotypes. It exhibited multidrug resistance and was the largest contributor (17 %, 18/103) of non-vaccine types (NVTs) in the disease-causing population. Overall, 42 % (202/480) of isolates were penicillin-resistant (minimum inhibitory concentration ≥0.12 µg ml−1) and 45 % (217/480) were multidrug-resistant. Nine GPSCs (GPSC1, 6, 9, 10, 13, 16, 43, 91, 376) were penicillin-resistant and among them six were multidrug-resistant. Pneumococci expressing PCV13 serotypes had a higher prevalence of antibiotic resistance. Sequencing of pneumococcal genomes has significantly improved our understanding of the biology of these bacteria. This study, describing the pneumococcal disease and carriage epidemiology pre-PCV introduction, demonstrates that 60–75 % of pneumococcal serotypes in children ≤5 years are covered by PCV13 and Pneumosil. Vaccination against pneumococci is very likely to reduce antibiotic resistance. A multidrug-resistant pneumococcal lineage, GPSC10 (CC230), is a high-risk clone that could mediate serotype replacement.
%U https://www.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.000645