genomeRxiv: a microbial whole-genome database and diagnostic marker design resource for classification, identification, and data sharing

Leighton Pritchard; Parul Sharma; Reza Mazloom; Tessa Pierce; Luiz Irber; Bailey Harrington; Lenwood Heath; C Titus Brown; Boris Vinatzer

doi:10.1099/acmi.ac2021.po0165

Abstract

genomeRxiv is a newly-funded US-UK collaboration to provide a web-accessible database of publicly available genome sequences, accurately catalogued and classified by whole-genome similarity, independent of their taxonomic affiliation. Our goal is to supply the basic and applied research community with rapid, precise and accurate identification of unknown isolates based on genome sequence alone, and with molecular tools for environmental analysis.

The DNA sequencing revolution enabled the use of cultured and uncultured microorganism genomes for fast and precise identification. However, precise identification is impossible without:

1. reference databases that precisely circumscribe classes of microorganisms, and label these with their uniquely-shared characteristics; and

2. fast algorithms that can handle the volumes of genome data.

Our approach integrates the highly-resolved classification framework of Life Identification Numbers (LINs) with the speed and computational efficiency of sourmash and k-mer hashing algorithms, and the precision and filtering of average nucleotide identity (ANI). We aim to construct a single genome-based indexing scheme that extends from phylum to strain, enabling the unique and consistent placement of any sequenced prokaryote genome.
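As a rough illustration of the two ideas combined above, the following minimal Python sketch estimates genome similarity from hashed k-mers and compares two LIN-style position vectors. It is not the sourmash implementation or the genomeRxiv pipeline: the function names, the k-mer length, the sketch size and the example LIN values are all assumptions chosen for illustration, and strand (reverse-complement) handling is omitted.

```python
import hashlib


def kmers(seq, k):
    """Return the set of overlapping k-mers in a sequence (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer using SHA-1 (illustrative choice of hash)."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def sketch(seq, k=5, size=50):
    """Bottom-k MinHash sketch: keep only the `size` smallest k-mer hashes."""
    return set(sorted(hash_kmer(km) for km in kmers(seq, k))[:size])


def jaccard_estimate(a, b, size=50):
    """Estimate Jaccard similarity from two bottom-k sketches."""
    merged = sorted(a | b)[:size]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in a and h in b)
    return shared / len(merged)


def shared_lin_prefix(lin_a, lin_b):
    """Count leading positions at which two LIN-style vectors agree.

    A longer shared prefix corresponds to a higher level of genome similarity.
    """
    n = 0
    for x, y in zip(lin_a, lin_b):
        if x != y:
            break
        n += 1
    return n


# Hypothetical toy sequences and LIN vectors, for illustration only.
genome_a = "ATGCGTACGTTAGCCGATCGATCGGATCCA"
genome_b = "ATGCGTACGTTAGCCGATCGATCGGTTCCA"
similarity = jaccard_estimate(sketch(genome_a), sketch(genome_b))
prefix_len = shared_lin_prefix([0, 1, 2, 0], [0, 1, 3, 0])
```

In a workflow of this shape, a cheap sketch comparison would narrow the search to a few candidate reference genomes, and a more precise measure such as ANI would then decide how long a LIN prefix the new genome shares with its closest match.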

genomeRxiv includes protocols for confidentiality, allowing groups to identify and announce the identities of newly-sequenced organisms without sharing genome data directly. This protects communities working with commercially and ethically sensitive organisms (e.g. production engineering strains and potential bioweapons), and enables benefit sharing with indigenous communities.

genomeRxiv will also provide online capability to design molecular diagnostic tools for metabarcoding and qPCR, to enable tracking of specific groupings of bacteria directly in the environment.

Published Online: 27/05/2022

This is an open-access article distributed under the terms of the Creative Commons Attribution License.

Volume 4, Issue 5

Meeting Report

Open Access
