
Enteropathogenic E. coli (EPEC) were the first E. coli strains linked to human disease (in 1945) and pose a serious health threat. Their public health burden has changed over the past century, as it has for other diarrheal diseases: whilst EPEC are still endemic in large parts of South America, cases in Europe are commonly associated with recent travel. Despite their long-standing importance for global health and their significant impact on child mortality in low- and middle-income countries (LMICs), we still have a very limited understanding of what defines an EPEC beyond the diagnostic LEE island, or indeed whether there are any shared characteristics, as most in-depth analyses of their pathogenic determinants are confined to a few model strains. We present a study analysing ∼1300 whole-genome sequences, combining published and newly sequenced datasets of EPEC and non-EPEC strains, to map the evolutionary history and molecular determinants of EPEC. Importantly, we have expanded the published data with whole-genome sequences of 300 historical and contemporary EPEC strains from a large collection of clinical isolates, mainly from Brazil and England, which enables us to compare the epidemiology in a high-income country with that in an LMIC over an extended time frame. We furthermore present a molecular definition of EPEC, including mapping of phage islands and de novo prediction of effectors in this large-scale dataset, and investigate the patterns of adhesins and other secretion systems, thereby characterising the different EPEC lineages that have emerged numerous times during the evolution of E. coli.