Investigations into Etiology of Breast, Esophageal, and Gastric Cancers: Allele-specific Gene Expression and DNA Methylation Signature
Maxwell Lee, Ph.D.
National Cancer Institute, Center for Cancer Research
Laboratory of Population Genetics and Program in Bioinformatics and Computational Biology
May 14, 2013
Part 1: Large-scale analyses of allele-specific gene expression and chromatin modifications
Part 2: DNA methylation signatures for tumor classification and tumor progression
Part 3: Functional characterization of a novel oncogene identified through our genomic copy number analyses
Analyzing Allele-specific Gene Expression on a Large Scale Using Affymetrix SNP Arrays
Genomic imprinting
X chromosome inactivation
cDNA
Affymetrix SNP array
normal human fetal tissues
Allele-specific Gene Expression and Implication for Genome Wide Association Studies
Allele-specific gene expression versus genomic imprinting and X-chromosome inactivation
• quantitative difference (2-4 fold)
• 20%~50% of the human genes
• no parental origin preference
Implication of allele-specific gene expression for genome wide association studies
• SNPs that don’t change amino acid sequence
• regulatory SNPs
277 genes (46%): equal expression
326 genes (54%): > 2-fold difference
Lo et al. Genome Res. 2003
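The 2-fold cutoff above can be illustrated with a small sketch. It assumes, hypothetically, that each heterozygous SNP has signal intensities for allele A and allele B measured in both cDNA and genomic DNA, and that the genomic DNA ratio is used to divide out probe bias. The function names, the input layout, and the normalization are illustrative assumptions and are not taken from Lo et al.

```python
def allelic_fold_difference(cdna_a, cdna_b, gdna_a, gdna_b):
    """Fold difference between two alleles in cDNA, corrected by genomic DNA.

    Assumption: in a heterozygote the genomic DNA ratio should be 1:1, so any
    deviation there is treated as probe bias and divided out.
    """
    if min(cdna_a, cdna_b, gdna_a, gdna_b) <= 0:
        raise ValueError("intensities must be positive")
    ratio = (cdna_a / cdna_b) / (gdna_a / gdna_b)
    # Report the fold difference as >= 1 regardless of which allele is higher.
    return max(ratio, 1.0 / ratio)


def classify_gene(fold, threshold=2.0):
    """Label a gene using the 2-fold cutoff quoted on the slide."""
    return "> 2-fold difference" if fold > threshold else "equal expression"


# Hypothetical intensities (cDNA A, cDNA B, gDNA A, gDNA B), for illustration only.
examples = {
    "gene_1": (1200.0, 400.0, 1000.0, 1000.0),
    "gene_2": (900.0, 800.0, 1000.0, 950.0),
}
labels = {g: classify_gene(allelic_fold_difference(*v)) for g, v in examples.items()}
```

Taking the larger of the ratio and its reciprocal keeps the label independent of which allele happens to be called A, which fits the "no parental origin preference" point on the slide.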
Allele-specific ChIP-on-chip Experimental Design
Families: 1347, 1362
Control: input DNA
Active marks: Pol II, H3Ac, H3K4
Inactive marks: H3K9, H3K27di, H3K27tri
96 microarray data
Genetic background influences the global epigenetic state
Samples cluster by family using allele-specific chromatin-binding activity
Family 1 Family 2
Kadota et al. PLoS Genet. 2007
1347
1362
active chromatin marks
inactive chromatin marks
Somatic Mutations Identified through RNA-seq
4 pairs of breast tumor and normal samples
140 million reads
Reads mapped to genome and transcriptome
342 somatic mutations
Elevated Expression of Mutant Alleles in Breast Tumors
[Figure panels: cDNA and genomic DNA for INO80B and ARID1B]
[Figure axis: relative mutant allele intensity in cDNA, normalized to genomic DNA]
Mean = 2.2, p-value = 0.05
Implication for identifying driver mutations
Summary of Functional Data for Genes That Displayed Elevated Expression of Somatic Mutations
gene      mutation   description
G3BP2     S48T       GTPase activating protein (SH3 domain) binding protein 2; oncogene, sequestering TP53
INO80B    P306L      INO80 complex subunit B; involved in prostate cancer
ARID1B    Y1345N     AT rich interactive domain 1B (SWI1-like); its homolog, ARID1A, is frequently mutated in ovarian clear cell carcinoma
OSTF1     L20P       osteoclast stimulating factor 1
GPRC5A    S59C       G protein-coupled receptor, family C, group 5, member A; involved in lung cancer
RYBP      K217Q      RING1 and YY1 binding protein; stabilizing TP53
Identification of Novel Oncogenes through Focal Amplification Analysis
161 breast tumors, Affymetrix SNP5 array
[Figure axes: chromosome, 161 tumors; labeled regions: 1q, 8q]
putative novel oncogenes
                                          traditional approach    my approach
size of focal amplification               multiple genes          1 gene
frequency of tumors with amplification    common                  high frequency not required, but must occur in ≥ 1 tumor
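One way to read the single-gene criterion above is sketched here. The segment and gene tuples, the log2 ratio cutoff of 1.0, and the function name are all hypothetical; the slide does not give the actual thresholds or the algorithm used.

```python
def single_gene_focal_amplifications(segments, genes, min_log2_ratio=1.0):
    """Return {gene: set of tumor ids} for amplified segments covering one gene.

    segments: iterable of (tumor_id, chrom, start, end, log2_ratio)
    genes:    iterable of (name, chrom, start, end)
    min_log2_ratio is an assumed cutoff, not a value from the talk.
    """
    hits = {}
    for tumor_id, chrom, start, end, log2_ratio in segments:
        if log2_ratio < min_log2_ratio:
            continue  # not amplified under the assumed cutoff
        overlapping = [
            name
            for name, g_chrom, g_start, g_end in genes
            if g_chrom == chrom and g_start < end and start < g_end
        ]
        # Keep only amplicons that touch exactly one gene.
        if len(overlapping) == 1:
            hits.setdefault(overlapping[0], set()).add(tumor_id)
    return hits


# Toy coordinates, invented for illustration.
genes = [
    ("GENE_A", "chr3", 100, 200),
    ("GENE_B", "chr3", 300, 400),
    ("GENE_C", "chr8", 100, 200),
]
segments = [
    ("T1", "chr3", 90, 210, 1.5),   # covers GENE_A only
    ("T2", "chr3", 50, 450, 1.8),   # covers GENE_A and GENE_B: too broad
    ("T3", "chr8", 120, 180, 0.3),  # below the assumed amplification cutoff
]
candidates = single_gene_focal_amplifications(segments, genes)
```

Because a gene enters the result as soon as one tumor carries a qualifying segment, the sketch mirrors the "must occur in ≥ 1 tumor" condition without requiring a high frequency across the cohort.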
Focal Amplification of TBL1XR1 in Breast Tumors
Tumor 1
Tumor 2
TBL1XR1-shRNA Knockdown Suppresses In Vivo Tumor Growth
[Figure axis: tumor volume (mm³), Day 39]
implants: N=10, N=10, N=14
tumor incidence: 9 of 10, 7 of 10, 1 of 14
p-value = 0.013
Western Blot
In collaboration with Lalage Wakefield
Kadota et al. Cancer Res. 2009
An algorithm for methylation and expression index (MEI)
Illumina Infinium HumanMethylation27 BeadChip
Illumina HumanRef-8 v2 Expression BeadChip
Differential methylation based on IHC (positive vs. negative for ER, PR, Her2, EGFR, or CK5)
2227 methylation markers in 1162 genes
Top 3% most variable gene expression
541 genes
128 methylation markers in 65 genes
MEI: the weighted sum of gene expression values, where each weight is the negative of the corresponding Spearman correlation.
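The MEI definition above can be written out as a minimal sketch. It assumes that the Spearman correlation is taken between methylation and expression of the same marker across samples, which the slide does not state explicitly, and that both data sets are stored as dictionaries of per-sample lists. The function names and the toy values are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks.

    Assumes neither input is constant (otherwise the variance is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mei_scores(methylation, expression):
    """Return (per-sample MEI, weights), with weight = -Spearman correlation."""
    weights = {g: -spearman(methylation[g], expression[g]) for g in expression}
    n_samples = len(next(iter(expression.values())))
    scores = [
        sum(weights[g] * expression[g][s] for g in expression)
        for s in range(n_samples)
    ]
    return scores, weights


# Toy data for four samples, invented for illustration.
methylation = {"g1": [0.1, 0.4, 0.6, 0.9], "g2": [0.2, 0.3, 0.6, 0.8]}
expression = {"g1": [9.0, 7.0, 6.0, 2.0], "g2": [1.0, 2.0, 4.0, 8.0]}
scores, weights = mei_scores(methylation, expression)
```

Under this reading, a gene whose expression falls as methylation rises gets a positive weight, so its expression adds to the index, while a positively correlated gene subtracts from it.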
Polish dataset: K-M survival based on MEI
[Figure axes: Survival Probability vs. Year]
p = 0.002
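For readers unfamiliar with K-M curves, the following sketch shows how a survival curve could be estimated for two MEI groups. The median split, the function names, and the toy cohort are assumptions for illustration; the slide does not say how patients were grouped, and the sketch does not compute the p-value reported there.

```python
from statistics import median


def kaplan_meier(times, events):
    """Kaplan-Meier curve as a list of (time, survival probability).

    times:  follow-up time per patient (e.g. years)
    events: 1 if the event was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
    return curve


def split_by_median(scores):
    """Assumed grouping rule: 'high' if the score is above the median."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Toy cohort, invented for illustration.
mei = [0.2, 1.5, -0.7, 2.1, 0.9, -1.2]
years = [5.0, 2.0, 8.0, 1.0, 4.0, 9.0]
event = [0, 1, 0, 1, 1, 0]

groups = split_by_median(mei)
curves = {}
for label in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curves[label] = kaplan_meier([years[i] for i in idx], [event[i] for i in idx])
```

A group with no observed events yields an empty list, meaning its estimated survival stays at 1.0 throughout follow-up. Comparing the two curves statistically would need a separate test such as the log-rank test, which is left out here.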
Polish dataset: K-M survival using MEI for ER+ and ER- samples
[Figure axes: Survival Probability vs. Year]
ER+ cases: p = 0.009
ER- cases: p = 0.360
Validation: K-M survival using MEI for ER+ samples
Panels: TCGA ER+, GSE6532 ER+, NKI ER+, BT2000 ER+
[Figure axes: Survival Probability vs. Year]
Endpoints shown: OS, OS, DMFS
p-values shown: 0.004, 0.001, 0.001, 0.00002
Collaborators
NCI/CCR
Lee Lab: Mitsutaka Kadota, Howard Yang, Hailong Wu, Beverly Duncan, Sheryl Gere, Guohong Song
Wakefield Lab: Misako Sato, Lalage Wakefield
Buetow Lab: Chunhua Yan, Michael Edmonson, Rich Finney, Daoud Meerzaman, Ken Buetow
Hunter Lab: Kent Hunter
Singer Lab: Dinah Singer
Hewitt Lab: Stephen Hewitt
NCI/DCEG: Nan Hu, Phil Taylor, Alisa Goldstein, Christian Abnet, Neal Freedman, Sandy Dawsey, Jonine Figueroa, Mark Sherman
NCI/DCP: Barbara Dunn, Ronald Lubet, Asad Umar
NCI/DCTD: Jiuping Ji, James Doroshow
Toyama University: Junya Fukuoka
Abia State University: Chris Obiora, Charles Adisa
Beijing Cancer Hospital: Jun Ren
Purdue University: Sulma Mohammed