
Head and Neck Cancer: microRNA analysis

Amy Li

Monti Lab Rotation

Boston University

11/25/13

Dataset

• Head and Neck Cancer Dataset from The Cancer Genome Atlas (TCGA)
o Contains large and well-documented data in many cancer subtypes
o DNA-methylation, SNP Array, RNA-seq, miRNA-seq, low pass DNA-seq, Reverse Phase Protein Array

• Normalized miRNAseq data
o 463 samples
  - 39 patients: tumor tissue and adjacent normal tissue
  - 385 patients: tumor tissue only
o 1046 miRNA

• Clinical information
o 360 samples, 71 clinical attributes (e.g. anatomic subdivision, gender, grade, race, stage)

Goals

• Identify miRNA markers of:
o Cancer status: normal vs. tumor
o Cancer progression: differentially expressed in each stage or grade

• Integrate miRNA expression with gene expression data
o Gene set enrichment in miRNA targets
o mRNA data (Vinay)

Tumor Progression Classification

• Tumor Grade:
o Assigned based on how abnormal the tumor cells look under a microscope
o Ranges from G1 (well-differentiated) to G4 (undifferentiated)
o Well-differentiated tumor cells from a lower grade resemble normal cells, tend to spread slowly, and are generally indicative of a better prognosis

• Tumor Stage:
o Based on size or extent (reach) of the primary tumor

http://www.cancer.gov/cancertopics/factsheet/detection/tumor-grade

Analysis Overview

• Exploratory data analysis
o Mean vs. standard deviation
o Boxplots of miRNA expression
o Data filtering
o Clinical demographics

• Unsupervised clustering
o Heatmaps
o Fisher test for association between clusters and sample attribute assignment (tumor status, grade, stage, etc.)

• Tests for confounders
o Association between grade and other attributes, e.g. ethnicity, gender, smoking history, age, alcohol consumption

• Differential Analysis
o Look for differentially expressed genes with respect to grade or stage

• miRNA targets and Gene Set Enrichment Analysis
o Identify sets of miRNA targets and see whether such gene sets are enriched with respect to disease phenotype

Exploratory Data Analysis: Mean vs. standard deviation

Exploratory Data Analysis: Boxplot of Expression

Exploratory Data Analysis: Data filtering

• Sample filtering:
o Samples without clinical labels

• Gene filtering:
o Lowly expressed genes: filtered by row maximum
o Genes with constant expression: filtered by standard deviation

• Full Matrix: 1046 × 463
• Filtered Matrix: 692 × 393

Exploratory Data Analysis: Clinical Demographics


Unsupervised Clustering: Paired samples: Tumor vs. Adjacent Normal

• Fisher Test: test for association between cluster assignment and actual class label
• P-val ~ 0

          Cluster 1   Cluster 2
Normal    38          1
Tumor     2           37
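The contingency table above can be checked with a small stand-alone computation. The following is a minimal sketch of a two-sided Fisher exact test written with the Python standard library only; the function name `fisher_exact_2x2` is hypothetical, and the software actually used for the slide is not stated in the source.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely
    # than the observed one (small tolerance for float comparison).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Rows: Normal, Tumor. Columns: Cluster 1, Cluster 2 (counts from the slide).
p_value = fisher_exact_2x2(38, 1, 2, 37)
print(f"p = {p_value:.3g}")
```

With 75 of 78 samples falling in the cluster that matches their label, the p-value is vanishingly small, which is consistent with the "P-val ~ 0" reported on the slide.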

Unsupervised Clustering: Grades: G1, G2, G3, G4

• Fisher Test: tested for association between grades and cluster assignments, for total number of clusters ranging from 2 to 5
• P-values were not significant in any case

Tests for confounders

• Tested for association between grade and each putative confounding variable using the Fisher test (discrete variables) or ANOVA (continuous variables)

• Ethnicity (p=0.57), race (p=0.84), gender (p=0.09), age (p=0.55), alcohol consumption (p=0.63)

• Corrected gene expression for gender using a linear regression model prior to performing differential analysis
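One way to picture the gender correction is to regress each miRNA's expression on a 0/1 gender indicator and keep what the covariate does not explain. The sketch below is an assumption about the general approach, not the lab's actual code: the function name `regress_out` is hypothetical, the toy values are made up, and it adds the overall mean back so corrected values stay on the original scale.

```python
def regress_out(values, covariate):
    """Remove the linear effect of one covariate from a list of values.

    Fits values ~ intercept + slope * covariate by ordinary least squares
    and returns the values with the fitted covariate effect subtracted,
    centred so that the overall mean is unchanged.
    """
    n = len(values)
    mean_x = sum(covariate) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        # Covariate is constant, so there is nothing to correct for.
        return list(values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, values))
    slope = sxy / sxx
    return [y - slope * (x - mean_x) for x, y in zip(covariate, values)]


# Hypothetical toy data: one miRNA, six samples, gender coded as 0/1.
expression = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
gender = [0, 0, 0, 1, 1, 1]
corrected = regress_out(expression, gender)
print(corrected)
```

For a binary covariate this amounts to aligning the two group means, so any difference that remains between samples is not attributable to gender alone.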

Data Processing for Differential Analysis

Sample Filtering
• Removed samples without clinical labels
• Removed samples sequenced on IlluminaGA (kept IlluminaHiseq samples)
• Removed samples with minority races (kept “white”)

Gene Filtering
• Removed miRNAs with low expression (90% quantile < 100)
• Removed miRNAs with constant expression (sd < 0.1)

Attribute Filtering
• Grade:
o Removed GX and “Not Available”
o Kept G1, G2, G3, G4
• Stage:
o Removed “Not Available”
o Kept S1, S2, S3, S4A, S4B
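The two gene filters above can be sketched as a single pass over the expression matrix. This is a minimal illustration, assuming the matrix is held as a dictionary from miRNA name to per-sample values; the function name `filter_mirnas`, the toy miRNA names, and the choice of quantile interpolation method are all assumptions, since the slides do not say how the quantile was computed or on which scale.

```python
from statistics import quantiles, stdev


def filter_mirnas(expression, min_q90=100.0, min_sd=0.1):
    """Keep miRNAs whose 90% quantile and standard deviation pass the cutoffs."""
    kept = {}
    for name, values in expression.items():
        # quantiles(..., n=10) returns the 9 decile cut points; index 8 is the 90% one.
        q90 = quantiles(values, n=10, method="inclusive")[8]
        if q90 >= min_q90 and stdev(values) >= min_sd:
            kept[name] = values
    return kept


# Hypothetical toy matrix with three miRNAs and ten samples each.
toy = {
    "mir-high": [200.0] * 5 + [300.0] * 5,               # expressed and variable: kept
    "mir-low": [float(v) for v in range(1, 11)],          # 90% quantile below 100: removed
    "mir-flat": [500.0] * 10,                             # constant expression: removed
}
print(sorted(filter_mirnas(toy)))
```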

Differential Analysis: Grade

• diffAnal.R
o Performs permutation tests to identify significant genes differentially regulated in one of two classes

• Normalized expression matrix corrected for gender
• Class label: grade attribute binarized to “low” vs. “high”
• Run diffAnal for each high vs. low cutoff:
o G0 (adjacent normal) vs. G1-G4 (tumor)
o G1 vs. G2-G4
o G1-G2 vs. G3-G4
o G1-G3 vs. G4
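The binarization and the idea of a permutation test can be illustrated as follows. This is not diffAnal.R, whose internals are not shown in the slides; it is a generic sketch that uses the absolute difference in group means as the test statistic, and the names `binarize` and `permutation_p_value` as well as the toy data are hypothetical.

```python
import random

GRADE_ORDER = ["G0", "G1", "G2", "G3", "G4"]


def binarize(grades, highest_low):
    """Label each grade "low" if it is at or below highest_low, else "high"."""
    cut = GRADE_ORDER.index(highest_low)
    return ["low" if GRADE_ORDER.index(g) <= cut else "high" for g in grades]


def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """Permutation p-value for the absolute difference in group means."""

    def mean_diff(lbls):
        high = [v for v, l in zip(values, lbls) if l == "high"]
        low = [v for v, l in zip(values, lbls) if l == "low"]
        return sum(high) / len(high) - sum(low) / len(low)

    observed = abs(mean_diff(labels))
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        # Shuffling keeps the group sizes fixed and breaks any real association.
        rng.shuffle(shuffled)
        if abs(mean_diff(shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: one miRNA across 16 tumors, higher in G3 and G4.
grades = ["G1"] * 4 + ["G2"] * 4 + ["G3"] * 4 + ["G4"] * 4
values = [1.0, 1.2, 0.9, 1.1, 1.3, 1.0, 1.2, 1.1,
          3.0, 3.2, 2.9, 3.1, 3.3, 3.0, 3.2, 3.1]
labels = binarize(grades, "G2")  # G1-G2 vs. G3-G4
p = permutation_p_value(values, labels)
print(labels.count("low"), labels.count("high"), p)
```

Repeating the call with `highest_low` set to "G0", "G1", "G2" and "G3" corresponds to the four cutoffs listed above. In a full analysis the per-miRNA p-values would still need a multiple-testing correction such as FDR.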

Differential Analysis: Grade: G0 vs. G1-G4

Differential Analysis: Grade: G1 vs. G2-G4

Differential Analysis: Grade: G1-G2 vs. G3-G4

Differential Analysis: Grade: G1-G3 vs. G4

Differential Analysis: Trends

• Found more significant markers for tumor vs. normal than for distinguishing between low and high grades

• Performed the same analysis for stage; significant markers for stage are weaker than those for grade

• For both grade and stage, most significant markers found by diffAnal show upregulation in the later disease state

Differential Analysis: Tumor Classification Marker

• 148 total genes (90% quantile > 100) used for diffAnal
• 65 significant genes upregulated in tumors
• 37 significant genes downregulated in tumors
• Cutoff: FDR < 0.01

Differential Analysis: Cancer Progression Marker for Grades

miRNA           G1-_vs_G2+_fdr   G2-_vs_G3+_fdr   G3-_vs_G4+_fdr
hsa-mir-106b    0.03             0.01             0.04
hsa-mir-15b     0.03             0.05             0.81
hsa-mir-582     0.03             0.01             0.04
hsa-mir-151     0.03             0.61             -0.25
hsa-mir-196b    0.03             0.01             0.18
hsa-mir-10a     0.03             0.32             -0.96
hsa-mir-374a    0.03             0.05             -0.6
hsa-mir-128-2   0.04             0.61             0.26
hsa-mir-25      0.03             0.01             0.02
hsa-mir-128-1   0.03             0.47             0.17
hsa-mir-28      0.04             0.01             0.44

A cancer progression marker must satisfy ALL of:
1. Tumor classification marker
2. Significant FDR in 2/3 runs of diffAnal
3. Monotonic increase or decrease across grades

4 miRNA markers identified (all are upregulated with increasing grade)
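The three criteria can be written as a small predicate. The sketch below is illustrative only: the function names are hypothetical, the FDR cutoff of 0.05 is an assumption (the slides do not state the cutoff used for the grade comparisons), the inputs are made-up toy values rather than the table above, and it treats FDR values as plain non-negative numbers, so it does not interpret the negative entries in that table.

```python
def is_monotonic(values):
    """True if the values strictly increase or strictly decrease."""
    pairs = list(zip(values, values[1:]))
    return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)


def is_progression_marker(is_tumor_marker, fdrs, grade_means,
                          fdr_cutoff=0.05, min_runs=2):
    """Apply the three progression-marker criteria to one miRNA.

    is_tumor_marker: whether the miRNA passed the tumor vs. normal analysis
    fdrs:            FDR values from the three low vs. high grade comparisons
    grade_means:     mean expression per grade, ordered G1..G4
    """
    significant_runs = sum(1 for f in fdrs if f < fdr_cutoff)
    return (
        is_tumor_marker
        and significant_runs >= min_runs
        and is_monotonic(grade_means)
    )


# Hypothetical miRNA: tumor marker, significant in all runs, rising with grade.
print(is_progression_marker(True, [0.03, 0.01, 0.04], [5.0, 5.4, 5.9, 6.3]))
```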


Finding miRNA Targets

• miRWalk
• TargetScan
• miRBase

• miRWalk “Validated targets” module
• Targets for differentially expressed miRNAs: 162 targets
• Intersect targets found by miRWalk with AhR targets
• 6 matches: NQO1, NFE2, IL1B, TNF, TGFB1, MYC

[Venn diagram: miRNA markers (4); miRNA targets (162); AhR targets (54); overlap (6)]

Work in Progress

• Gene set enrichment analysis:
o Consider targets of strong miRNA markers as a gene set
o Is there an enrichment of this defined gene set in certain disease phenotypes, e.g. high grade?

• Pathway analysis:
o Which pathways are these miRNA markers involved in?

• Modeling tumor progression:
o Explore other definitions of tumor progression markers
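As a rough illustration of the enrichment question, the sketch below uses a hypergeometric over-representation test. This is a simpler stand-in, not the rank-based Gene Set Enrichment Analysis method itself. It asks how surprising the overlap is between a gene set (for example, miRNA targets) and a list of genes associated with a phenotype. The function name and the numbers in the example are hypothetical.

```python
from math import comb


def overrepresentation_p_value(universe_size, set_size, hits_size, overlap):
    """P(overlap or more) when hits_size genes are drawn at random.

    universe_size: number of genes measured in total
    set_size:      number of genes in the gene set of interest
    hits_size:     number of genes associated with the phenotype
    overlap:       number of genes that are in both lists
    """
    total = comb(universe_size, hits_size)
    upper = min(set_size, hits_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20 genes measured, 5 in the set, 5 phenotype genes, all 5 shared.
print(overrepresentation_p_value(20, 5, 5, 5))
```

A small p-value suggests the gene set is over-represented among the phenotype-associated genes. A rank-based method would instead use the full ordering of genes by their association with the phenotype.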
