A Bioinformatics Meta-analysis of Differentially Expressed Genes in Colorectal Cancer

Simon Chan, [email protected]

Thursday Trainee Seminar – October 11th, 2007

Introduction to Colorectal Cancer (CRC)

• Cancerous growths in the colon, rectum or appendix

• In 2007, an estimated 20,800 Canadians will be diagnosed with CRC and approximately 8,700 will die of it (Source: Canadian Cancer Society)

• Stages of CRC (Image Source: Cardoso J, et al. 2007)

High throughput gene expression analysis

• Many high throughput gene expression analyses have been performed and published:

– Cancer versus Normal

– Cancer versus Adenoma

– Adenoma versus Normal

• Various technologies used:

– Serial Analysis of Gene Expression (SAGE)

– Oligo-nucleotide microarrays

– cDNA microarrays

• Goal: To determine candidate diagnostic and prognostic molecular biomarkers in CRC

Problems

• Unfortunately, low overlap between expression profiling studies

• Why?

– Different methods of obtaining tissues (e.g. Laser Capture Microdissection vs Microdissection)

– Tissue heterogeneity

– Inadequate sample numbers

– Use of different gene expression platforms (SAGE, microarray, etc)

– Different statistical methods, fold change thresholds, etc applied

• Questions:

– Which genes are actually differentially expressed in CRC?

– Which genes would make good CRC biomarkers?

One Solution

• Determine the intersection of a comprehensive collection of high throughput gene expression studies.

• Expect that genes biologically relevant to CRC will be reported the most often.

• System-specific spurious genes should be under-represented.

• However, the statistical significance of this overlap is often not considered

• A certain level of overlap among studies can be expected due to chance alone

Table source: Cardoso J et al, 2007
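To make the "overlap by chance" point concrete, here is a worked formula for the simplest case. It assumes (this is a simplification, not the method used in this talk) that two studies share one platform of $N$ genes and that each reports its genes uniformly at random; the symbols $N$, $n_1$, $n_2$ and $K$ are introduced here for illustration only.

```latex
% K = number of genes reported by both studies under random selection
P(K = k) = \frac{\binom{n_1}{k}\,\binom{N - n_1}{n_2 - k}}{\binom{N}{n_2}},
\qquad
\mathbb{E}[K] = \frac{n_1\, n_2}{N}
% Hypothetical numbers: N = 10000, n_1 = n_2 = 500
% gives E[K] = 500 * 500 / 10000 = 25 genes shared by chance alone.
```

With many studies on different platforms this closed form no longer applies directly, which is one reason to estimate the chance overlap by simulation instead.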

Meta-analysis Method

• Developed a vote-counting strategy to rank differentially expressed genes based on the following criteria, in order of importance:

– Number of studies reporting a gene as differentially expressed

– Number of tissue samples showing this differential expression

– Fold Change of differential expression
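The ranking criteria above can be sketched as a lexicographic sort. This is a minimal Python illustration, not the scripts used in the study; the `GeneSummary` type and `rank_genes` function are hypothetical names, and comparing fold changes by absolute value is an assumption. The example rows are the five genes tabulated later in this talk.

```python
from typing import NamedTuple


class GeneSummary(NamedTuple):
    """Per-gene summary across studies (hypothetical structure)."""
    symbol: str
    n_studies: int           # number of studies reporting the gene
    total_samples: int       # total tissue samples across those studies
    mean_fold_change: float  # mean fold change across those studies


def rank_genes(genes):
    """Sort by study count, then sample size, then fold-change magnitude.

    All three keys are descending. Using abs() on the fold change is an
    assumption; the source does not say how direction is handled.
    """
    return sorted(
        genes,
        key=lambda g: (g.n_studies, g.total_samples, abs(g.mean_fold_change)),
        reverse=True,
    )


# Values taken from the gene table in this talk, deliberately shuffled.
genes = [
    GeneSummary("GDF15", 7, 230, 7.42),
    GeneSummary("MYC", 7, 329, 5.02),
    GeneSummary("IFITM1", 9, 351, 7.52),
    GeneSummary("SPARC", 7, 244, 6.30),
    GeneSummary("TGFBI", 9, 369, 8.94),
]

ranked = rank_genes(genes)
for position, gene in enumerate(ranked, start=1):
    print(position, gene.symbol, gene.n_studies, gene.total_samples)
```

Because the sort key is a tuple, sample size only matters when two genes are reported by the same number of studies, and fold change only breaks ties that remain after that.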

Published Gene Expression Studies

• Collected 25 published gene expression studies

– 23 studies compared Cancer versus Normal

– 7 studies compared Adenoma versus Normal

– 5 studies compared Cancer versus Adenoma

| Platform | Count (Total: 25) |
|---|---|
| Commercial cDNA microarrays | 12 |
| Custom cDNA microarrays | 7 |
| Affymetrix oligo-nucleotide microarrays | 3 |
| Oligo-nucleotide microarrays | 2 |
| SAGE | 1 |

[Figure: each study (Study 1, Study 2, ... Study 25) yields a differentially expressed gene list and a platform gene list.]

Example

• Croner RS et al, 2005

– Compared Cancer versus Normal

– Utilized Affymetrix HG-U133A GeneChip

• Obtained platform annotation file for HG-U133A from Affymetrix website

– Mapped Affy probe IDs to Entrez Gene IDs (platform gene list)

• Mapped differentially expressed genes to Entrez Gene IDs (differentially expressed gene list)
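The mapping step in this example could look roughly like the Python sketch below. It assumes a CSV annotation file with "Probe Set ID" and "Entrez Gene" columns, multiple IDs separated by "///", and "#" comment lines at the top; check these against the actual file before relying on them. The function name, the probe names and the gene symbols in the toy data are hypothetical.

```python
import csv
import io


def load_probe_to_entrez(handle, probe_col="Probe Set ID",
                         entrez_col="Entrez Gene", sep="///"):
    """Return {probe ID: set of Entrez Gene IDs} from an annotation CSV.

    Column names and the separator are assumptions about the file layout.
    Probes without a numeric Entrez Gene ID are dropped.
    """
    data_lines = (line for line in handle if not line.startswith("#"))
    mapping = {}
    for row in csv.DictReader(data_lines):
        ids = {tok.strip() for tok in row[entrez_col].split(sep)}
        ids = {tok for tok in ids if tok.isdigit()}
        if ids:
            mapping[row[probe_col]] = ids
    return mapping


# Toy annotation excerpt with hypothetical probe names and symbols.
TOY_ANNOTATION = '''\
# toy annotation excerpt (hypothetical)
"Probe Set ID","Gene Symbol","Entrez Gene"
"probe_1","GENE_A","759"
"probe_2","GENE_B","1434 /// 1112"
"probe_3","---","---"
'''

probe_to_entrez = load_probe_to_entrez(io.StringIO(TOY_ANNOTATION))

# Platform gene list: every Entrez Gene ID reachable from any probe.
platform_genes = set().union(*probe_to_entrez.values())

# Differentially expressed gene list: map the probes a study reported.
reported_probes = ["probe_2", "probe_3"]
de_genes = set()
for probe in reported_probes:
    de_genes |= probe_to_entrez.get(probe, set())

print(sorted(platform_genes, key=int), sorted(de_genes, key=int))
```

Mapping both lists to the same identifier space is what makes gene lists from different platforms comparable.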

• Therefore, for each study, two files would be produced:

– File 1: All genes (represented by Entrez Gene IDs) covered on the platform:

  • 759
  • 10581
  • 11234
  • 76013
  • etc

– File 2: Differentially expressed Entrez Gene IDs

  • 759 UP
  • 1434 DOWN
  • 1112 UP
  • etc
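As a rough illustration of how these two files might be read, here is a short Python sketch. It assumes one Entrez Gene ID per line in File 1 and an ID followed by UP or DOWN, separated by whitespace, in File 2; the exact file layout is not given in the talk, and both function names are hypothetical.

```python
def read_platform_genes(lines):
    """File 1: one Entrez Gene ID per line (assumed layout)."""
    return {line.strip() for line in lines if line.strip()}


def read_de_genes(lines):
    """File 2: 'ID DIRECTION' per line, DIRECTION is UP or DOWN (assumed)."""
    calls = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2 or parts[1].upper() not in {"UP", "DOWN"}:
            raise ValueError(f"unexpected line: {line!r}")
        calls[parts[0]] = parts[1].upper()
    return calls


# The example entries shown on the slide.
platform = read_platform_genes(["759", "10581", "11234", "76013"])
de_calls = read_de_genes(["759 UP", "1434 DOWN", "1112 UP"])
print(len(platform), de_calls)
```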

Simulations

– Developed custom Perl scripts to perform Monte Carlo simulation.

– For 10,000 iterations,

  • For each study,

    – Determine number of up-regulated (X) and down-regulated (Y) genes reported in the study

    – Randomly choose X genes from the platform gene list and label as up-regulated

    – Randomly choose Y genes from the platform gene list and label as down-regulated

  • Determine number of overlapping genes across the studies in this simulation

– Calculate the average number of genes with overlap of 2, 3, 4, etc and associated P-values
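The simulation steps above can be sketched in Python as follows. The original work used custom Perl scripts that are not shown, so this is an illustrative reconstruction under stated assumptions: a gene counts as overlapping only when studies agree on direction, the X and Y genes within one study are drawn without replacement, and the P-value is the fraction of iterations with at least as many multi-study genes as observed. All function names are hypothetical.

```python
import random
from collections import Counter


def overlap_histogram(studies):
    """Map k -> number of (gene, direction) calls made by exactly k studies.

    `studies` is a list of dicts {gene: "UP" or "DOWN"}. Only k >= 2 is kept.
    Requiring the same direction is an assumption.
    """
    votes = Counter()
    for calls in studies:
        for gene, direction in calls.items():
            votes[(gene, direction)] += 1
    return Counter(k for k in votes.values() if k >= 2)


def simulate(platforms, n_up, n_down, iterations=10_000, seed=0):
    """Monte Carlo estimate of chance overlap.

    platforms: one list of gene IDs per study (lists, for random.sample)
    n_up, n_down: per-study counts X and Y of reported genes
    Returns (mean genes per overlap level, multi-study gene count per run).
    """
    rng = random.Random(seed)
    totals = Counter()
    multi_counts = []
    for _ in range(iterations):
        fake_studies = []
        for platform, x, y in zip(platforms, n_up, n_down):
            picked = rng.sample(platform, x + y)  # without replacement
            calls = {gene: "UP" for gene in picked[:x]}
            calls.update({gene: "DOWN" for gene in picked[x:]})
            fake_studies.append(calls)
        hist = overlap_histogram(fake_studies)
        totals.update(hist)  # Counter.update adds counts
        multi_counts.append(sum(hist.values()))
    mean_hist = {k: v / iterations for k, v in sorted(totals.items())}
    return mean_hist, multi_counts


def empirical_p(multi_counts, observed):
    """Fraction of simulated runs with at least `observed` multi-study genes."""
    return sum(c >= observed for c in multi_counts) / len(multi_counts)


# Toy run with hypothetical sizes: 3 studies on a 1,000-gene platform.
toy_platform = [str(i) for i in range(1000)]
demo_mean, demo_counts = simulate(
    [toy_platform] * 3, n_up=[20, 20, 20], n_down=[20, 20, 20],
    iterations=200, seed=1,
)
print(demo_mean, empirical_p(demo_counts, observed=10))
```

With 10,000 iterations, the smallest nonzero P-value this estimator can return is 1/10,000, so an observed overlap that no simulated run reaches would be reported as P < .0001.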

[Figure: Cancer versus Normal. Bar chart of Number of genes (y-axis, 0 to 450) against Number of studies (x-axis: 2, 3, 4, 5, 6, 7, 9, 11), comparing Actual Overlap with Simulation. Data labels: 410, 95, 30, 20, 10, 5; 258.3, 18.37, 1.14; 1, 2.]

Summary of Comparisons Analyzed for Overlap

| Comparison | Total Num of Studies | Total Num of Differentially Expressed Genes Reported (mapped) | Total Num of Differentially Expressed Genes with Multi-study Confirmation | P-value |
|---|---|---|---|---|
| Cancer versus Normal | 23 | 6537 (5886) | 573 | < .0001 |
| Adenoma versus Normal | 7 | 1101 (986) | 39 | < .0001 |
| Cancer versus Adenoma | 5 | 538 (415) | 5 | .08 |

| Gene Name | Description | Studies Reporting this Gene | Total Sample Sizes | Mean Fold Change | Validation |
|---|---|---|---|---|---|
| TGFβI | Transforming growth factor, beta-induced, 68kDa | 9 | 369 | 8.94 | RT-PCR |
| IFITM1 | Interferon induced transmembrane protein 1 (9-27) | 9 | 351 | 7.52 | RT-PCR |
| MYC | V-myc myelocytomatosis viral oncogene homolog (avian) | 7 | 329 | 5.02 | RT-PCR |
| SPARC | Secreted protein, acidic, cysteine-rich (osteonectin) | 7 | 244 | 6.30 | IHC |
| GDF15 | Growth differentiation factor 15 | 7 | 230 | 7.42 | RT-PCR |

Future Studies

• Purchased antibodies for certain high ranking candidates

• Validate protein expression level on colorectal tissue microarrays

• Correlate to certain prognostic outcomes

Conclusions

• Low overlap of results between many colorectal cancer high throughput gene expression studies

• Meta-analysis method identified consistently reported differentially expressed genes

• Cancer versus Normal and Adenoma versus Normal, but not Cancer versus Adenoma, studies resulted in genes consistently reported at a statistically significant frequency

Acknowledgements:

• Dr. Steven Jones

• Dr. Isabella Tai

• Obi Griffith

• Chan SK, Griffith OL, Tai IT, Jones SJM. Meta-analysis of Colorectal Cancer Gene Expression Profiling Studies Identifies Consistently Reported Candidate Biomarkers. Manuscript in review with Cancer Epidemiology, Biomarkers & Prevention.