The genomes of spp. mosquitoes contain integrated sequences from nonretroviral RNA viruses. These integrations are enriched in piRNA clusters, are embedded next to transposable elements (TEs) and produce piRNAs. The parallelism between TEs and viral integrations led to the hypothesis that viral integrations may constitute an archive of past viral infections and may influence immunity against new infections with cognate viruses, much as the piRNA pathway interacts with TEs. A corollary of this hypothesis is that the landscape of viral integrations should vary across populations depending on their viral exposure. The highly repetitive nature of spp. genomes makes the discovery of viral integrations from whole-genome sequencing data of wild mosquitoes a daunting task.

Here we describe a novel bioinformatic pipeline to rigorously identify viral integrations using Next Generation Sequencing (NGS) data. The analysis draws on libraries from single mosquitoes or pools of mosquitoes, reference genome statistics, the landscape of TEs and the geographic origin of the analyzed samples.

This pipeline has been tested in and mosquitoes, allowing us to compare the performance of the analyses on genome assemblies of different completeness. We identified novel viral integrations in both genomes. Additionally, we show that the landscape of viral integrations is dynamic, with population-specific behavior that we can leverage to formulate hypotheses about the mechanisms of integration and the biological role of viral integrations.

  • This is an open-access article distributed under the terms of the Creative Commons Attribution License.
