As sequencing read length has increased, researchers have quickly adopted longer reads for their experiments. Here, we examine 14 pathogen or host–pathogen differential gene expression data sets to assess whether using longer reads is warranted. The data sets were chosen to vary in the genomic attributes that might affect the outcome of differential gene expression analysis, including gene density, operons, gene length, number of introns/exons and intron length. All comparisons were made on the same reads, using every combination of paired and unpaired reads trimmed to 36, 54, 72 and 101 bp. No genome attribute was found to influence the results in principal components analysis, hierarchical clustering with bootstrap support, or regression analyses of pairwise comparisons. Read pairing had the greatest effect when there was little variation between samples from different conditions or between their replicates (e.g. little differential gene expression). Overall, however, results from 54 and 72 bp reads were typically the most similar. Given differences in costs and mapping percentages, we recommend 54 bp reads for organisms with no or few introns and 72 bp reads for all others. In a third of the data sets, read pairing had no effect, despite paired reads providing twice as much data. Single-end reads therefore seem robust for differential expression analyses. In eukaryotes, however, paired-end reads are likely to be desirable for analysing splice variants, and should be preferred for data sets that are intended as community resources that might be used in secondary analyses.
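The comparison design described above (the same reads, trimmed to four lengths, each analysed as single-end or paired-end) can be sketched in a few lines. This is only an illustrative sketch: the function names `trim_fastq_record` and `configurations` are hypothetical, the example read is made up, and the trimming tool actually used in the study is not stated in this section.

```python
from itertools import combinations, product

# Trim lengths and layouts named in the abstract.
READ_LENGTHS = (36, 54, 72, 101)
LAYOUTS = ("single", "paired")


def trim_fastq_record(record, length):
    """Trim one FASTQ record (header, sequence, '+', quality) to `length` bases.

    Hypothetical helper: keeps the first `length` bases and quality scores.
    """
    header, seq, plus, qual = record
    return (header, seq[:length], plus, qual[:length])


def configurations():
    """Every layout/read-length combination applied to the same reads."""
    return list(product(LAYOUTS, READ_LENGTHS))


# A made-up 120 bp read, used only to demonstrate trimming.
record = ("@read1", "ACGT" * 30, "+", "I" * 120)

configs = configurations()
trimmed = {length: trim_fastq_record(record, length) for length in READ_LENGTHS}

# Pairwise comparisons between configurations, as inputs to a regression
# of one configuration's per-gene results against another's.
pairs = list(combinations(configs, 2))
```

Because every configuration is derived from the same underlying reads, any difference between two configurations reflects read length or pairing rather than a separate sequencing run.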