RNAseq or microarrays - which technique to choose?

For the past two decades, microarrays have been used extensively for high-throughput quantification of mRNA abundance. In the last few years, transcriptomic profiling via next generation sequencing (RNAseq) has emerged as a powerful competitor.

Microarrays typically consist of a large number of short oligonucleotide probes, representing genomic regions of interest, attached to a chip, and the signal intensity from each probe is used to estimate the RNA abundance for the corresponding region. RNAseq, in contrast, uses next generation sequencing methods to directly determine the nucleotide sequence of millions of short pieces of sampled RNA (called reads). Typically, the short reads are then mapped to a reference genome and the number of reads that map within a given region is used as a measure of the abundance of RNA from that region.
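To make the read-counting idea concrete, here is a minimal sketch in Python. It assumes the reads have already been mapped and reduced to (chromosome, position) pairs. The input format, the region names and the function name are hypothetical and chosen only for illustration; real pipelines work on aligned BAM files and handle read length, strand and overlapping features.

```python
def count_reads_per_region(read_positions, regions):
    """Count mapped reads whose position falls inside each named region.

    read_positions: iterable of (chromosome, position) tuples (assumed format)
    regions: dict of region name -> (chromosome, start, end), end inclusive
    """
    counts = {name: 0 for name in regions}
    for chrom, pos in read_positions:
        for name, (region_chrom, start, end) in regions.items():
            if chrom == region_chrom and start <= pos <= end:
                counts[name] += 1
    return counts


# Made-up example data: two regions and five mapped reads.
regions = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 500, 900)}
reads = [("chr1", 120), ("chr1", 150), ("chr1", 600), ("chr2", 150), ("chr1", 950)]

region_counts = count_reads_per_region(reads, regions)
print(region_counts)
```

The resulting count per region is the raw abundance measure described above. Reads that fall outside every listed region are simply ignored in this sketch.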

One of the most prominent advantages of RNAseq compared to array-based techniques is that with RNAseq, we are not limited to measuring the abundance of only the pre-defined sequences on the array, and we don't necessarily need extensive knowledge of the genomic sequence and the location of genes or other features of interest. On the other hand, one of the biggest hurdles for RNAseq to overcome is still the higher cost compared to microarrays. The higher cost of RNAseq may lead researchers to reduce the number of biological replicates, which would make it harder to perform reliable statistical analyses.

The reliance on hybridization makes arrays susceptible to cross-hybridization, which can have a measurable effect on observed expression levels, particularly for genes with low expression. Hybridization also limits the dynamic range of measurable expression levels. RNAseq, in contrast, is based on sampling (i.e., the reads to be sequenced are sampled from the pool of available short segments). For weakly expressed features, the sampling variation dominates the biological variation, and such features may not be detected at all.
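The effect of sampling variation on weakly expressed features can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that the read count for a feature follows a Poisson distribution. That model is an assumption made here, not something the text above specifies. Under it, a feature with an expected count of lambda reads has a relative sampling error of 1/sqrt(lambda) and is missed entirely with probability exp(-lambda).

```python
import math


def poisson_cv(expected_reads):
    """Relative sampling error (std / mean) under the assumed Poisson model."""
    return 1.0 / math.sqrt(expected_reads)


def poisson_prob_zero(expected_reads):
    """Probability of observing no reads at all under the assumed Poisson model."""
    return math.exp(-expected_reads)


# Compare weakly and strongly expressed features (expected read counts are made up).
for expected in (1, 10, 100, 10000):
    print(expected, round(poisson_cv(expected), 3), round(poisson_prob_zero(expected), 5))
```

Under this model, a feature expected to yield a single read has a relative sampling error of 100% and goes undetected more than a third of the time. A feature expected to yield 10,000 reads has a relative sampling error of 1%.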

The choice of technique is normally not critical (except for very highly or very weakly expressed features). Several studies have reported a high correlation between expression measurements from RNAseq and different types of microarrays, as well as a large overlap between the differentially expressed genes found by the two techniques.

Both microarray data and RNAseq data can be analyzed with Qlucore Omics Explorer (OE). Microarray data from Affymetrix or Agilent arrays can be normalized automatically when imported into OE, and aligned BAM files can also be imported directly and normalized.

For RNAseq data, a suite of new statistical methods, adapted to the count nature of the data, has been introduced. However, it can also be shown that statistical methods combining a variance-stabilizing transformation with a regular t-test perform very well under many different conditions and also seem to be more robust to outliers.
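As a rough illustration of the second approach, the sketch below applies a variance-stabilizing transformation to read counts and then computes an ordinary two-sample t statistic. The Anscombe transform used here is one classical variance-stabilizing transformation for Poisson-like counts. It is an assumption chosen for the example and is not necessarily the transformation used in the studies referred to above or in OE. The counts are made up.

```python
import math
from statistics import mean, variance


def anscombe(count):
    """Anscombe transform: approximately stabilizes the variance of Poisson counts."""
    return 2.0 * math.sqrt(count + 3.0 / 8.0)


def pooled_t_statistic(group_a, group_b):
    """Classical two-sample t statistic with a pooled (equal) variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err


# Made-up read counts for one gene in four control and four treated samples.
control_counts = [95, 110, 102, 98]
treated_counts = [205, 190, 220, 198]

t_stat = pooled_t_statistic(
    [anscombe(c) for c in control_counts],
    [anscombe(c) for c in treated_counts],
)
print(t_stat)
```

A p-value would follow from the t distribution with n_a + n_b - 2 degrees of freedom. In practice a library routine such as scipy.stats.ttest_ind could replace the hand-written statistic, and counts would normally be normalized for sequencing depth before the transformation.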

More details and references can be found in the Qlucore White Paper.

The Qlucore Omics Explorer version 3.0 is now available as a beta version and it includes direct import of aligned BAM files as well as a lot more exciting functionality.