Volume 2, Issue 4, 2016
- Research Paper
- Genomic Methodologies: Genome variation detection
Identifying copy number variation of the dominant virulence factors msa and p22 within genomes of the fish pathogen Renibacterium salmoninarum
Renibacterium salmoninarum is the causative agent of bacterial kidney disease, an important disease of farmed and wild salmonid fish worldwide. Despite the wide spatiotemporal distribution of this disease and habitat pressures ranging from the natural environment to aquaculture and rivers to marine environments, little variation has been observed in the R. salmoninarum genome. Here we use the coverage depth from genomic sequencing corroborated by real-time quantitative PCR to detect copy number variation (CNV) among the genes of R. salmoninarum. CNV was primarily limited to the known dominant virulence factors msa and p22. Among 68 isolates representing the UK, Norway and North America, the msa gene ranged from two to five identical copies and the p22 gene ranged from one to five copies. CNV for these two genes co-occurred, suggesting they may be functionally linked. Isolates carrying CNV were phylogenetically restricted and originated predominantly from sites in North America, rather than the UK or Norway. Although both phylogenetic relationship and geographical origin were found to correlate with CNV status, geographical origin was a much stronger predictor than phylogeny, suggesting a role for local selection pressures in the repeated emergence and maintenance of this trait.
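The abstract does not spell out how coverage depth is converted into a copy number, so the following is only a minimal sketch of one common heuristic: divide the mean read depth over a gene by the median depth across the genome. The function name `estimate_copy_number`, its parameters, and the assumption that reads were mapped to a reference holding a single copy of the gene are all illustrative and not taken from the paper.

```python
from statistics import median


def estimate_copy_number(gene_depths, genome_depths):
    """Return (ratio, rounded_copies) for one gene.

    gene_depths:   per-base read depths across the gene of interest
    genome_depths: per-base read depths across the whole genome

    Assumption (not from the paper): the genome-wide median depth
    represents single-copy sequence, so gene depth divided by that
    baseline approximates the number of copies.
    """
    if not gene_depths or not genome_depths:
        raise ValueError("depth lists must not be empty")

    baseline = median(genome_depths)
    if baseline <= 0:
        raise ValueError("genome-wide median depth must be positive")

    gene_mean = sum(gene_depths) / len(gene_depths)
    ratio = gene_mean / baseline
    return ratio, round(ratio)
```

A ratio close to a whole number is the easy case; values that fall between integers would, as in the study, need corroboration by an independent method such as quantitative PCR.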
- Methods Paper
- Systems Microbiology: Large-scale comparative genomics
SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments
Rapidly decreasing genome sequencing costs have led to a proportionate increase in the number of samples used in prokaryotic population studies. Extracting single nucleotide polymorphisms (SNPs) from a large whole genome alignment is now a routine task, but existing tools have failed to scale efficiently with the increased size of studies. These tools are slow, memory inefficient and are installed through non-standard procedures. We present SNP-sites, which can rapidly extract SNPs from a multi-FASTA alignment using modest resources and can output results in multiple formats for downstream analysis. SNPs can be extracted from an 8.3 GB alignment file (1842 taxa, 22 618 sites) in 267 seconds using 59 MB of RAM and 1 CPU core, making it feasible to run on modest computers. It is easy to install through the Debian and Homebrew package managers, and has been successfully tested on more than 20 operating systems. SNP-sites is implemented in C and is available under the open source license GNU GPL version 3.
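To make the task concrete, here is a small Python sketch of what extracting SNPs from a multi-FASTA alignment involves: find the alignment columns where more than one nucleotide occurs, then keep only those columns. This is not the SNP-sites implementation (which is written in C), and it makes no attempt at the memory efficiency the abstract reports, since it holds the whole alignment in memory. The function names and the rule that only A, C, G and T count towards variation are assumptions made for illustration.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from a multi-FASTA text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def variable_sites(records):
    """Return 0-based column indices holding more than one of A, C, G, T.

    Assumption: gaps and ambiguity codes do not count as variation.
    """
    records = list(records)
    if not records:
        return []
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("all aligned sequences must have the same length")

    sites = []
    for i in range(length):
        bases = {seq[i].upper() for _, seq in records} & set("ACGT")
        if len(bases) > 1:
            sites.append(i)
    return sites


def snp_alignment(records):
    """Reduce each sequence to its characters at the variable columns."""
    records = list(records)
    sites = variable_sites(records)
    return [(name, "".join(seq[i] for i in sites)) for name, seq in records]


if __name__ == "__main__":
    demo = StringIO(">a\nACGTA\n>b\nACCTA\n>c\nAC-TT\n")
    for name, snps in snp_alignment(read_fasta(demo)):
        print(f">{name}\n{snps}")
```

The column-by-column scan is the simplest way to express the idea; a tool aimed at multi-gigabyte alignments would need to avoid loading every sequence at once.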
