The Steps for RNA-seq analysis

Transcriptomic profiling via next generation sequencing (RNA-seq) is currently the most used data type by Qlucore Omics Explorer users according to a recent user study.

Qlucore Omics Explorer includes a direct and streamlined workflow for RNA-seq data import, which makes it easy to get started. You can start directly from BAM files with aligned reads. The only additional file needed is a 'gtf' file that defines the genomic coordinates of the genes. The gtf file should correspond to the same reference genome that the BAM files were aligned to.

The normalization and transformation that are applied to the imported data make it possible to directly apply all statistical methods and plots that are available in the program. No special statistics are required; t-tests and ANOVA typically perform well.

The workflow includes the following steps:

  1. Reads are assigned to genes using the 'intersection-strict' method.
  2. For between-sample normalization, TMM is used (Robinson & Oshlack, 2010). Since longer genes typically generate more reads than equally expressed, shorter genes, the observed counts are also divided by the length of the corresponding genes.
  3. A logarithmic transformation is applied.
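
As an illustration of step 1, the sketch below assigns a single read to a gene using the 'intersection-strict' rule as that mode is defined in HTSeq-count: the sets of genes covering each base of the read are intersected, and the read is counted only if exactly one gene remains. It is an assumption that Qlucore Omics Explorer follows this definition exactly. The function name, the half-open interval convention and the example genes are hypothetical, and the sketch ignores strand, spliced alignments and paired-end reads.

```python
def assign_read(read_start, read_end, genes):
    """Assign one ungapped read to a gene with an intersection-strict rule.

    genes maps a gene name to a list of (start, end) exon intervals.
    All intervals, including the read, are half-open: start <= pos < end.
    Returns a gene name, "no_feature" or "ambiguous".
    """
    common = None
    for pos in range(read_start, read_end):
        # Set of genes whose exons cover this base of the read.
        covering = {
            name
            for name, exons in genes.items()
            if any(start <= pos < end for start, end in exons)
        }
        common = covering if common is None else common & covering
        if not common:
            # Strict mode: a single uncovered base rejects the read.
            return "no_feature"
    if not common:
        return "no_feature"  # empty read
    if len(common) == 1:
        return next(iter(common))
    return "ambiguous"


# Hypothetical annotation with two overlapping single-exon genes.
example_genes = {"geneA": [(100, 200)], "geneB": [(150, 300)]}
```

With this annotation, a read covering 110 to 140 is assigned to geneA, a read covering 160 to 190 lies entirely inside both genes and is ambiguous, and a read covering 90 to 110 extends outside any gene and is not counted.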

Users are currently using different methods to pre-process and analyze RNA-seq data. Below we discuss how the proposed Qlucore workflow relates to some other established methods.

The TMM normalization and log-transformation approach is similar to the one used by the voom method (Law et al, 2014), but it also incorporates a normalization for gene length, which voom does not. In this sense, the approach used by Qlucore Omics Explorer is also similar to the RPKM representation (Mortazavi et al, 2008), but it incorporates the TMM normalization factor for improved between-sample normalization.
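
The relation between these representations can be sketched in a few lines. The function below is a minimal illustration, not the Qlucore implementation: it takes the TMM normalization factor of a sample as a precomputed input, scales the library size by it, divides by gene length in kilobases, and takes a base-2 logarithm. The function name, the parameter names, the base of the logarithm and the offset `prior` added before the logarithm (to avoid taking the log of zero) are all assumptions made for the example.

```python
import math


def log_expression(counts, gene_lengths, lib_size, tmm_factor=1.0, prior=1.0):
    """Length-normalized, log-transformed expression for one sample.

    counts:       gene name -> read count
    gene_lengths: gene name -> gene length in bases
    lib_size:     total number of mapped reads in the sample
    tmm_factor:   precomputed TMM normalization factor (assumed input)
    prior:        offset added before the log (an assumption of this sketch)
    """
    # TMM rescales the library size into an "effective" library size.
    effective_lib_size = lib_size * tmm_factor
    values = {}
    for gene, count in counts.items():
        per_million = count / effective_lib_size * 1e6
        per_kilobase = per_million / (gene_lengths[gene] / 1e3)
        values[gene] = math.log2(per_kilobase + prior)
    return values
```

With `tmm_factor=1.0` and `prior=0.0`, the value inside the logarithm is the plain RPKM. For example, 1000 reads on a 2000-base gene in a library of one million reads gives an RPKM of 500. Dropping the division by gene length gives a log counts-per-million value, which is closer in spirit to what voom works with.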

Transformation-based approaches (as opposed to, for example, count-based statistical methods) have been shown to give reliable results when used together with linear modeling for differential expression analysis (Soneson & Delorenzi, 2013). In the evaluations performed in that paper, the voom transformation was used together with limma (Smyth, 2004), which can incorporate the estimated precision weights from voom when fitting its linear model (this is not done by the traditional general linear model framework within Qlucore Omics Explorer). However, based on the simulated data from Soneson & Delorenzi (2013), it can be shown that the voom transformation also performs well when used with a standard t-test (which does not use the precision weights).
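
To make "a standard t-test" concrete, the sketch below computes the equal-variance two-sample t statistic for one gene from its log-transformed expression values in two groups. Every sample contributes with the same weight, which is the difference from a fit that uses precision weights. The function name and the example values are hypothetical, and the sketch does not compute a p-value, which would additionally require the t distribution with the returned degrees of freedom.

```python
from statistics import mean, variance


def t_statistic(group_a, group_b):
    """Equal-variance two-sample t statistic and its degrees of freedom.

    group_a, group_b: log-transformed expression values of one gene
    in the two groups being compared (at least two values each).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance; every observation has the same weight.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = (pooled * (1 / n_a + 1 / n_b)) ** 0.5
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical log-expression values for one gene in two groups.
t, df = t_statistic([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
```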

More details and references are presented in the Qlucore White Paper - “Analyzing RNA-seq data with Qlucore Omics Explorer”, which can be found at www.qlucore.com/documentation.

For more information, have a look at the Qlucore films, webinars and how-to documents.