Comparison between RNASeq and Microarray for Gene Expression Analysis

Comparing and interconverting expression profiles from Microarrays to RNAseq. Yaoyu E. Wang, Ph.D., Center for Cancer Computational Biology, DFCI. SPECSII webinar, June 05, 2013

Upload: yaoyu-wang

Post on 08-May-2015


TRANSCRIPT

Page 1: Comparison between RNASeq and Microarray for Gene Expression Analysis

Comparing and interconverting expression profiles from Microarrays to RNAseq

Yaoyu E. Wang, Ph.D.
Center for Cancer Computational Biology, DFCI

SPECSII webinar
June 05, 2013

Page 2: Comparison between RNASeq and Microarray for Gene Expression Analysis

Transcriptome Profiling by High Throughput Technologies

- Transcriptome profiling represents a static gene expression state of a biological sample across the genome

- Allows for direct genomic comparisons across multiple samples to determine genes that exhibit differential expression in different states (e.g. normal vs. tumor)

- Allows for hypothesis generation on molecular abnormalities and mechanisms that may contribute to the tumor phenotype

- Provides information on molecular subtypes and supports the development of prognostic and predictive molecular signatures

- Two main technologies:
  a. Microarray
  b. RNA-Sequencing (RNASeq) using next generation sequencing

Page 3: Comparison between RNASeq and Microarray for Gene Expression Analysis

Microarray and RNASeq Experimental Flows

Affymetrix GeneChip

scanner

Page 4: Comparison between RNASeq and Microarray for Gene Expression Analysis

Microarray and RNASeq Experimental Flows

Blencowe B J et al. Genes Dev. 2009;23:1379-1386

Illumina HiSeq

Page 5: Comparison between RNASeq and Microarray for Gene Expression Analysis

The run is finished. How are sequence files created?

.bcl files

CASAVA processing
- Demultiplexing
- Fastq file generation
- Sequencing filtering

Raw files containing base calls and quality scores

Illumina defined quality filters

Split into Project and Sample Folders

Jones_Lab

ChIP_A ChIP-B

Marcus_Lab

RNA-SeqA RNA-SeqB RNA-SeqC

Williams_Lab

Exome1 Exome2

Fastq Files Fastq Files Fastq Files
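The demultiplexing and filtering steps above can be illustrated on FASTQ text. This is only a sketch of the idea, not how CASAVA works internally (CASAVA starts from .bcl files). It assumes the CASAVA 1.8-style read header, where the part after the space is `read:is_filtered:control:index` and `Y` marks a read that failed the quality filter. The barcode-to-sample mapping and the example reads are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample sheet: index (barcode) sequence -> sample name.
SAMPLE_SHEET = {"ATCACG": "RNA-SeqA", "CGATGT": "RNA-SeqB"}

# Made-up reads for illustration only.
EXAMPLE_FASTQ = """\
@SIM:1:FCX:1:15:6329:1045 1:N:0:ATCACG
GATTTGGGG
+
IIIIIIIII
@SIM:1:FCX:1:15:6330:1046 1:Y:0:CGATGT
TTCAAAGCA
+
#########
@SIM:1:FCX:1:15:6331:1047 1:N:0:TTAGGC
CCGTAGGCA
+
IIIIIIIII
"""


def parse_fastq(text):
    """Yield (header, sequence, quality) for each 4-line FASTQ record."""
    lines = text.strip().splitlines()
    for i in range(0, len(lines) - 3, 4):
        yield lines[i], lines[i + 1], lines[i + 3]


def demultiplex(text, sample_sheet=SAMPLE_SHEET):
    """Group reads by sample, dropping reads flagged as filtered."""
    by_sample = defaultdict(list)
    for header, seq, qual in parse_fastq(text):
        # Tags after the space: read number, filter flag, control bits, index.
        _, _, tags = header.partition(" ")
        _read, is_filtered, _control, index = tags.split(":")
        if is_filtered == "Y":  # failed the quality filter
            continue
        sample = sample_sheet.get(index, "Undetermined")
        by_sample[sample].append((seq, qual))
    return dict(by_sample)


if __name__ == "__main__":
    for sample, reads in demultiplex(EXAMPLE_FASTQ).items():
        print(sample, len(reads))
```

Reads whose index is not in the sample sheet are collected under "Undetermined" rather than discarded, so barcode problems stay visible.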

Page 6: Comparison between RNASeq and Microarray for Gene Expression Analysis

Possibilities for RNASeq Analysis

Haas & Zody. Nature Biotechnology 28, 421–423 (2010)

Using known annotations

And compare to known annotations

- Differential Expression
- Differential Isoform Abundance
- RNA editing
- SNP, indel detection

Page 7: Comparison between RNASeq and Microarray for Gene Expression Analysis

Microarray and RNASeq Comparison

Technology | RNASeq | Microarray
High run-to-run reproducibility | Yes | Yes
Dynamic range comparable to actual transcript abundance | >8000-fold | Hundred-fold
Able to detect alternative splice sites and novel isoforms | Yes | No
De novo analysis of samples without reference genome | Yes | No
Multiplexing samples in one run | Yes | No
Required amount of total RNA | >100 ng | ~1 ug
Re-analyzable data | Yes | No

Page 8: Comparison between RNASeq and Microarray for Gene Expression Analysis

Microarray and RNASeq Comparison

Technology | RNASeq | Microarray
Heterogeneity of read coverage across an expressed region | Yes | No
Well understood sources of experimental bias | No | Yes
Data portable on a flash drive (~4 GB) | No | Yes
Data is analyzable by any PC | No | Yes
Cheaper cost per sample | No(?) | Yes(?)

Page 9: Comparison between RNASeq and Microarray for Gene Expression Analysis

Why should microarray data not be ignored?

RNA-Seq Experiment

GEO Database

Page 10: Comparison between RNASeq and Microarray for Gene Expression Analysis

Comparing Expression Profiles from Microarrays to RNASeq

White paper, Illumina

Page 11: Comparison between RNASeq and Microarray for Gene Expression Analysis

Comparing Expression Profiles from Microarrays to RNASeq

White paper, Illumina

Page 12: Comparison between RNASeq and Microarray for Gene Expression Analysis

Comparing Expression Profiles from Microarrays to RNASeq

n=7532 n=4537

Page 13: Comparison between RNASeq and Microarray for Gene Expression Analysis

Mooney M, PLoS ONE (2013)

RNA-Seq vs. Microarray in Lymphoma of Canis familiaris

10 Lymphoma (3 T-cell, 7 B-cell)
4 Normal lymph node

Total RNA

PE100 run
50-100 million mapped reads

Compare 15,092 annotated genes on chip

Page 14: Comparison between RNASeq and Microarray for Gene Expression Analysis

RNA-Seq vs. Microarray in B-Cell Lymphomas of Canis familiaris

Mooney M, PLoS ONE (2013)

T

NB

r = 0.6; p < 10^-15
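A correlation like the one reported above can be computed from paired per-gene values on the two platforms. The following is a minimal sketch that assumes Pearson correlation on log2-scale values; the slide does not say exactly which quantities were correlated, and the gene values below are toy numbers invented for illustration, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy per-gene values (hypothetical): RNASeq read counts and
# microarray probe intensities for the same five genes.
rnaseq_counts = [12, 150, 900, 35, 4800]
array_intensity = [110.0, 420.0, 1500.0, 95.0, 6100.0]

# Compare on the log2 scale; +1 avoids log2(0) for unexpressed genes.
log_counts = [math.log2(c + 1) for c in rnaseq_counts]
log_intensity = [math.log2(v) for v in array_intensity]

if __name__ == "__main__":
    print(f"r = {pearson_r(log_counts, log_intensity):.2f}")
```

Working on the log2 scale keeps a handful of very highly expressed genes from dominating the coefficient.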

Page 15: Comparison between RNASeq and Microarray for Gene Expression Analysis

Comparison of transcriptome profile from RNA-Seq and Tiling arrays

C. elegans

Biological Replicates for L2 and YA stages

Affy Tiling Arrays* Illumina RNASeq

Agarwal, BMC Genomics (2010)

* Covers whole C. elegans genome

Page 16: Comparison between RNASeq and Microarray for Gene Expression Analysis

Differentially expressed genes between the L2 and YA stages

Agarwal, BMC Genomics (2010)

Comparison of transcriptome profile from RNA-Seq and tiling arrays

Page 17: Comparison between RNASeq and Microarray for Gene Expression Analysis

RNA-Seq and tiling arrays

Tiling Array

Microarray Maximum Sensitivity

RNASeq 11-plex

RNASeq 6-plex

Agarwal, BMC Genomics (2010)

Comparison of transcriptome profile from RNA-Seq and tiling arrays using qPCR as gold standard

Page 18: Comparison between RNASeq and Microarray for Gene Expression Analysis

Microarray and RNASeq Comparison

Microarray is cheaper?

Page 19: Comparison between RNASeq and Microarray for Gene Expression Analysis

RNASeq vs. Microarray Cost Analysis

Per Sample | Microarray | Illumina HiSeq
1 per Chip/Lane | $670 | $4,010.00
2-plex | NA | $2,097.50
4-plex | NA | $1,141.25
6-plex | NA | $822.50
8-plex | NA | $663.13

6-plex

11-plex
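The HiSeq per-sample prices in the cost table are reproduced exactly by a fixed lane cost of $3,825 shared among the multiplexed samples plus $185 per sample. That split is inferred from the listed numbers and is not stated in the slides, so treat the two constants below as assumptions. Under that model, a short sketch can extend the table to other plex levels and find where RNASeq drops below the $670 microarray price.

```python
import math

# Inferred from the listed prices, not stated in the slides (assumption).
LANE_COST = 3825.00       # shared cost of one HiSeq lane, USD
PER_SAMPLE_COST = 185.00  # cost that does not shrink with multiplexing, USD
MICROARRAY_COST = 670.00  # per-sample microarray price from the table, USD


def rnaseq_cost_per_sample(plex):
    """Per-sample RNASeq cost when `plex` samples share one lane."""
    if plex < 1:
        raise ValueError("plex must be at least 1")
    return LANE_COST / plex + PER_SAMPLE_COST


def break_even_plex(microarray_cost=MICROARRAY_COST):
    """Smallest plex level at which RNASeq costs no more than a microarray."""
    if microarray_cost <= PER_SAMPLE_COST:
        raise ValueError("RNASeq can never reach this price under the model")
    return math.ceil(LANE_COST / (microarray_cost - PER_SAMPLE_COST))


if __name__ == "__main__":
    for plex in (1, 2, 4, 6, 8, 11):
        print(f"{plex:>2}-plex: ${rnaseq_cost_per_sample(plex):,.3f}")
    print("break-even plex:", break_even_plex())
```

Under these assumptions the break-even point is 8-plex, and an 11-plex run comes to roughly $533 per sample. The trade-off is sequencing depth per sample, which falls as more samples share a lane.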

Page 20: Comparison between RNASeq and Microarray for Gene Expression Analysis

RNASeq vs. Microarray Cost Analysis

Per Sample | Microarray | Illumina HiSeq
1 per Chip/Lane | $670 | $4,010.00
2-plex | NA | $2,097.50
4-plex | NA | $1,141.25
6-plex | NA | $822.50
8-plex | NA | $663.13

Page 21: Comparison between RNASeq and Microarray for Gene Expression Analysis

Microarray and RNASeq Comparison

RNASeq is cheaper?

Page 22: Comparison between RNASeq and Microarray for Gene Expression Analysis

Microarray vs. RNASeq: Big Data Challenge

Technology | Data per sample | Time to download 1 sample | Time to download 100 samples | Cost to store on the cloud per month
RNASeq | 30-65 GB | 1 hr | 6 days | $270
Microarray | 30 MB | 5 seconds | 8 minutes | $0.30

http://www.ncbi.nlm.nih.gov/genbank/statistics
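The download and storage figures above follow from simple arithmetic on file size. The sketch below is a back-of-the-envelope model only: the transfer rate of 6 MB/s is the one implied by the microarray row (30 MB in 5 seconds), while the $0.10 per GB-month storage price and the reading of the cost column as covering 100 samples are assumptions not stated in the slides.

```python
MB_PER_GB = 1000

# Implied by the microarray row: 30 MB downloaded in 5 seconds.
TRANSFER_MB_PER_S = 30 / 5
# Assumed cloud storage price; not given in the slides.
USD_PER_GB_MONTH = 0.10


def download_seconds(size_mb, n_samples=1, rate_mb_per_s=TRANSFER_MB_PER_S):
    """Time in seconds to download n_samples files of size_mb each."""
    return size_mb * n_samples / rate_mb_per_s


def monthly_storage_cost(size_mb, n_samples, usd_per_gb_month=USD_PER_GB_MONTH):
    """Monthly storage cost in USD for n_samples files of size_mb each."""
    return size_mb / MB_PER_GB * n_samples * usd_per_gb_month


if __name__ == "__main__":
    for name, size_mb in (("Microarray", 30), ("RNASeq (30 GB)", 30 * MB_PER_GB)):
        one = download_seconds(size_mb)
        hundred = download_seconds(size_mb, n_samples=100)
        cost = monthly_storage_cost(size_mb, n_samples=100)
        print(f"{name}: {one:.0f} s per sample, "
              f"{hundred / 3600:.2f} h per 100 samples, ${cost:.2f}/month")
```

With these assumptions, a 30 GB RNASeq sample takes about 1.4 hours to download and 100 of them just under 6 days, which is in line with the table. The assumed storage price gives about $300 per month for 100 such samples, close to but not identical to the $270 listed.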

Page 23: Comparison between RNASeq and Microarray for Gene Expression Analysis

Microarray vs. RNASeq: Bioinformatics Challenge

- Applications with a user interface for RNA-Seq analysis (e.g. Galaxy) can handle only a few samples

- Knowledge of Linux servers, a scripting language, and a programming language is absolutely REQUIRED

- Lack of detailed understanding of NGS technology and data leads to diverse bioinformatics tools with different characteristics

Law WC, Voom!, Bioconductor (2013)

Page 24: Comparison between RNASeq and Microarray for Gene Expression Analysis

Bioinformatics will become the single largest cost in sequencing

Page 25: Comparison between RNASeq and Microarray for Gene Expression Analysis

Conclusions

The answer is Yes

- Transcriptome profiles generated by microarray and RNASeq are in strong concordance

- Microarray data generated in the last decades is durable

- RNASeq offers a lot more biological information than microarray, and its data is re-analyzable

- NGS is getting cheaper

However, the devil is in the data

- NGS data is a lot more expensive to store and analyze

- Specialized computing infrastructure and personnel are required to take advantage of the information from NGS data