Comparison between RNASeq and microarray for gene expression analysis

Post on 08-May-2015

TRANSCRIPT

Comparing and interconverting expression profiles from Microarrays to RNASeq

Yaoyu E. Wang, Ph.D.
Center for Cancer Computational Biology, DFCI

SPECSII webinar
June 05, 2013

Transcriptome Profiling by High Throughput Technologies

- Transcriptome profiling represents a static gene expression state of a biological sample across the genome

- Allows for direct genomic comparisons across multiple samples to determine genes that exhibit differential expression in different states (e.g. normal vs. tumor)

- Allows for hypothesis generation on molecular abnormalities and mechanisms that may contribute to the tumor phenotype

- Provides information on molecular subtypes, the development of prognostic and predictive molecular signatures

- Two main technologies:
  a. Microarray
  b. RNA-Sequencing (RNASeq) using next generation sequencing

Microarray and RNASeq Experimental Flows

Affymetrix GeneChip

scanner

Microarray and RNASeq Experimental Flows

Blencowe B J et al. Genes Dev. 2009;23:1379-1386

Illumina HiSeq

The run is finished. How are sequence files created?

.bcl files (raw files containing base calls and quality scores)

CASAVA processing
- Demultiplexing
- Fastq file generation
- Sequence filtering (Illumina defined quality filters)

Split into Project and Sample Folders

- Jones_Lab: ChIP_A, ChIP-B (Fastq Files)
- Marcus_Lab: RNA-SeqA, RNA-SeqB, RNA-SeqC (Fastq Files)
- Williams_Lab: Exome1, Exome2 (Fastq Files)
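As a rough illustration of what a Fastq file holds, the sketch below reads four-line Fastq records (header, bases, separator, qualities) and decodes the quality string into Phred scores. It assumes the Phred+33 encoding; the function name `read_fastq` and the example read are hypothetical and not taken from the slides.

```python
from io import StringIO

PHRED_OFFSET = 33  # assumption: Phred+33 quality encoding


def read_fastq(handle):
    """Yield (read_id, bases, phred_scores) for each four-line Fastq record."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break  # end of file
        bases = handle.readline().rstrip()
        handle.readline()  # '+' separator line, ignored
        quals = handle.readline().rstrip()
        scores = [ord(ch) - PHRED_OFFSET for ch in quals]
        yield header[1:], bases, scores  # drop the leading '@'


# Hypothetical one-read example; 'I' is ASCII 73, i.e. Phred 40.
example = StringIO("@read1\nACGT\n+\nIIII\n")
records = list(read_fastq(example))
```

Each yielded tuple pairs the base calls with one quality score per base, which is the information the quality filters operate on.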

Possibilities for RNASeq Analysis

Haas & Zody. Nature Biotechnology 28, 421–423 (2010)

Using known annotations

And compare to known annotations

- Differential Expression
- Differential Isoform Abundance
- RNA editing
- SNP, indel detection

Microarray and RNASeq Comparison

Technology | RNASeq | Microarray
High run-to-run reproducibility | Yes | Yes
Dynamic range comparable to actual transcript abundance | >8000-fold | Hundred-fold
Able to detect alternative splice sites and novel isoforms | Yes | No
De novo analysis of samples without reference genome | Yes | No
Multiplexing samples in one run | Yes | No
Required amount of total RNA | >100 ng | ~1 ug
Re-analyzable data | Yes | No

Microarray and RNASeq Comparison

Technology | RNASeq | Microarray
Heterogeneity of read coverage across an expressed region | Yes | No
Well understood sources of experimental bias | No | Yes
Data portable on a flash drive (~4 GB) | No | Yes
Data is analyzable by any PC | No | Yes
Cheaper cost per sample | No(?) | Yes(?)

Why should microarray data not be ignored?

RNA-Seq Experiment

GEO Database

Comparing Expression Profiles from Microarrays to RNASeq

White paper, Illumina

Comparing Expression Profiles from Microarrays to RNASeq

n=7532 n=4537

Mooney M, PLoS ONE (2013)

RNA-Seq vs. Microarray in Lymphoma of Canis familiaris

10 Lymphoma (3 T-cell, 7 B-cell)
4 Normal lymph node

Total RNA

PE100 run
50-100 million mapped reads

Compare 15,092 annotated genes on chip

RNA-Seq vs. Microarray in B-Cell Lymphomas of Canis familiaris

Mooney M, PLoS ONE (2013)

T

NB

r = 0.6; p < 10^-15
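A correlation like the one reported here summarizes per-gene agreement between the two platforms. The sketch below shows one common way to compute such a value, assuming a Pearson correlation between log2 microarray intensities and log2-transformed RNASeq counts with a pseudocount of 1. The slide does not say which correlation or transformation was used, and the four data points are toy values, not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy, hypothetical values for four genes (not from the study).
array_log2_intensity = [5.0, 7.5, 9.0, 11.0]
rnaseq_counts = [3, 40, 90, 1200]

# Assumption: log2(count + 1) puts counts on a scale comparable to array data.
rnaseq_log2 = [math.log2(c + 1) for c in rnaseq_counts]

r = pearson(array_log2_intensity, rnaseq_log2)
```

The log transform matters because raw counts span several orders of magnitude, so a few highly expressed genes would otherwise dominate the coefficient.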

Comparison of transcriptome profile from RNA-Seq and Tiling arrays

C. elegans

Biological Replicates for L2 and YA stages

Affy Tiling Arrays* Illumina RNASeq

Agarwal, BMC Genomics (2010)

* Covers whole C. elegans genome

Differentially expressed genes between the L2 and YA stages

Agarwal, BMC Genomics (2010)

Comparison of transcriptome profile from RNA-Seq and tiling arrays

Tiling Array

Microarray Maximum Sensitivity

RNASeq 11-plex

RNASeq 6-plex

Agarwal, BMC Genomics (2010)

Comparison of transcriptome profile from RNA-Seq and tiling arrays using qPCR as gold standard

Microarray and RNASeq Comparison

Microarray is cheaper?

RNASeq vs. Microarray Cost Analysis

Per Sample | Microarray | Illumina HiSeq
1 per Chip/Lane | $670 | $4,010.00
2-plex | NA | $2,097.50
4-plex | NA | $1,141.25
6-plex | NA | $822.50
8-plex | NA | $663.13
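The HiSeq figures in this table are consistent with a simple cost model: a fixed lane cost shared by all multiplexed samples, plus a fixed cost per sample. The sketch below uses $3,825 per lane and $185 per sample; both numbers are fitted to the table rows and are an assumption, since the slide does not state how the prices were derived. The function names are hypothetical.

```python
import math

# Assumptions fitted to the table, not stated on the slide:
LANE_COST = 3825.00        # shared by all samples in one lane
PER_SAMPLE_COST = 185.00   # e.g. library preparation, paid once per sample
MICROARRAY_COST = 670.00   # per-sample microarray price from the table


def rnaseq_cost_per_sample(plex):
    """Per-sample RNASeq cost when `plex` samples share one lane."""
    return LANE_COST / plex + PER_SAMPLE_COST


def break_even_plex(array_cost):
    """Smallest multiplexing level at which RNASeq is no dearer than the array."""
    return math.ceil(LANE_COST / (array_cost - PER_SAMPLE_COST))


costs = {plex: rnaseq_cost_per_sample(plex) for plex in (1, 2, 4, 6, 8)}
min_plex = break_even_plex(MICROARRAY_COST)
```

Under these assumptions the per-sample RNASeq cost first drops below the microarray price at 8-plex, which matches the last row of the table.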

6-plex

11-plex

Microarray and RNASeq Comparison

RNASeq is cheaper?

Microarray vs. RNASeq: Big Data Challenge

Technology | Data per Sample | Time to Download 1 Sample | Time to Download 100 Samples | Cost to Store on the Cloud per Month
RNASeq | 30-65 GB | 1 hr | 6 days | $270
Microarray | 30 MB | 5 seconds | 8 minutes | $0.30
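The download and storage figures can be reproduced approximately with back-of-the-envelope arithmetic. The sketch below assumes a bandwidth of 6 MB/s (implied by 30 MB in 5 seconds) and a storage price of $0.10 per GB per month (implied by 100 microarray samples costing $0.30). Both rates are inferred from the microarray row and are not stated on the slide, so the RNASeq results only roughly match the table.

```python
# Assumptions inferred from the microarray row, not stated on the slide:
BANDWIDTH_MB_PER_S = 6.0           # 30 MB downloaded in 5 seconds
STORAGE_USD_PER_GB_MONTH = 0.10    # 100 x 30 MB = 3 GB for $0.30 per month


def download_seconds(size_mb, n_samples=1):
    """Seconds needed to download n_samples files of size_mb each."""
    return size_mb * n_samples / BANDWIDTH_MB_PER_S


def monthly_storage_usd(size_mb, n_samples):
    """Monthly cloud storage cost for n_samples files of size_mb each."""
    total_gb = size_mb * n_samples / 1000
    return total_gb * STORAGE_USD_PER_GB_MONTH


# Microarray: 30 MB per sample.
array_minutes_100 = download_seconds(30, 100) / 60
array_storage_100 = monthly_storage_usd(30, 100)

# RNASeq: lower bound of 30 GB per sample.
rnaseq_hours_1 = download_seconds(30_000) / 3600
rnaseq_days_100 = download_seconds(30_000, 100) / 86_400
rnaseq_storage_100 = monthly_storage_usd(30_000, 100)
```

The thousand-fold difference in file size carries straight through to both transfer time and storage cost, which is the point of the table.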

http://www.ncbi.nlm.nih.gov/genbank/statistics

Microarray vs. RNASeq: Bioinformatics Challenge

- Applications with a user interface for RNA-Seq analysis (e.g. Galaxy) can handle only a few samples

- Knowledge of Linux servers, a scripting language, and a programming language is absolutely REQUIRED

- Lack of detailed understanding of NGS technology and data leads to diverse bioinformatics tools with different characteristics

Law WC, Voom!, Bioconductor (2013)

Bioinformatics will become the single largest cost in sequencing

Conclusions

The answer is Yes
- Transcriptome profiles generated by microarray and RNASeq are in strong concordance
- Microarray data generated in the last decades is durable
- RNASeq offers a lot more biological information than microarray, and its data is re-analyzable
- NGS is getting cheaper

However, the devil is in the data
- NGS data is a lot more expensive to store and analyze
- Specialized computing infrastructure and personnel are required to take advantage of the information from NGS data
