
Abstract

In recent years it has become apparent that prokaryotic genomes contain large numbers of pseudogenised genes, which may provide valuable insights into the recent functional history of an organism. However, pseudogenes are difficult to detect ab initio and are not routinely reported by gene prediction tools.

We present StORF-R (Stop-ORF-Reporter), a tool that takes as input an annotated genome and returns putative missed genes (functional and/or pseudogenised) from the intergenic regions. We show that this methodology can recover gene families that state-of-the-art methods continue to misreport or completely omit.
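The abstract does not spell out the algorithm. As a rough illustration only, assume a "Stop-ORF" is the span between two consecutive in-frame stop codons, and that an intergenic sequence has already been extracted from the annotated genome. The function name, the `min_len` threshold, and the forward-strand-only scan are hypothetical choices for this sketch, not StORF-R's actual interface or defaults.

```python
# Illustrative sketch only: not the StORF-R implementation.
# Assumption: a candidate is the region between two consecutive
# in-frame stop codons on the forward strand of an intergenic sequence.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_stop_to_stop_orfs(seq, min_len=30):
    """Return (start, end, frame) tuples for stop-to-stop spans.

    start is the first base after the opening stop codon, end is the
    first base of the closing stop codon (exclusive), both 0-based.
    min_len is a hypothetical minimum span length in nucleotides.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        prev_stop = None  # position of the last stop seen in this frame
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if prev_stop is not None:
                    start = prev_stop + 3
                    end = i
                    if end - start >= min_len:
                        orfs.append((start, end, frame))
                prev_stop = i
    return orfs
```

For example, `find_stop_to_stop_orfs("TAA" + "GCC" * 4 + "TAG", min_len=12)` yields a single 12-nucleotide span in frame 0. A fuller treatment would also scan the reverse complement and compare candidates against a protein database, as the similarity search against Swiss-Prot described in this abstract suggests.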

We applied StORF-R to the intergenic regions of 2,665 E. coli genomes and found on average 244 previously missed pseudogenised genes (with in-frame stop codons) per genome, many of which had high-scoring similarity to known Swiss-Prot proteins. Many of these pseudogenised genes form widespread gene families across E. coli strains.

To investigate whether this phenomenon exists in other taxa, we further applied the methodology to 44,048 bacterial genomes representing 8,244 species from Ensembl. This revealed many gene families spanning multiple species with large (>10,000) numbers of copies of both intact and pseudogenised versions. Many of these families had previously been reported in only a single or a few genomes, though we detected many hundreds of pseudogenised versions with StORF-R, changing our understanding of how widespread these genes truly are.

These pseudogenised genes represent a pangenomic ‘graveyard’ which may alter our understanding of the definition of core and accessory genes for many species.

  • This is an open-access article distributed under the terms of the Creative Commons Attribution License.
/content/journal/acmi/10.1099/acmi.ac2021.po0147
2022-05-27
2024-04-25