The Saccharomyces complex is a well-studied group of yeasts of considerable academic and industrial importance. Many species within the complex have no publicly available genome sequence and have yet to be formally compared at the genomic level with related species. We have recently sequenced the genomes of 40 species from 11 clades within the Saccharomyces complex. Considerable genomic diversity was observed in this dataset, including variation in predicted genome size (9-22 Mbp), coding proportion (42-76%) and gene number (4,194-11,001).

We evaluated the completeness of our gene sets and draft genome assemblies using the BUSCO tool; a few species were found to have a large number of duplicated and missing genes. These same species also had larger than average genome sizes and gene numbers, which could indicate a duplication or hybridisation event. The total GC content was also found to vary considerably across the dataset, from Hanseniaspora uvarum (31.3%) to Lachancea thermotolerans (46%). Differences between GC content in the coding regions and in the whole genome were also observed, which could indicate differences in evolutionary pressures between species according to their environmental niches.
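As an illustration of the coding versus whole-genome GC comparison described above, the following is a minimal sketch and not the pipeline used in this study. The function names `gc_content` and `gc_difference`, and the representation of coding regions as half-open `(start, end)` coordinate pairs, are assumptions made for the example.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Ambiguous bases (e.g. N) are excluded from the denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def gc_difference(genome: str, coding_regions: list[tuple[int, int]]) -> float:
    """Return coding GC% minus whole-genome GC%.

    coding_regions holds hypothetical 0-based, half-open (start, end) pairs.
    """
    coding = "".join(genome[start:end] for start, end in coding_regions)
    return gc_content(coding) - gc_content(genome)
```

A positive value from `gc_difference` would mean the coding regions are more GC-rich than the genome as a whole; a real analysis would read sequences and annotations from assembly and annotation files rather than from in-memory strings.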

Here, we highlight some of the key differences found within this fascinating dataset, investigating the origins of some of the more extreme results and touching upon the opportunities and challenges they present in investigating the evolution of these yeasts.

