![Page 1: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/1.jpg)
More on TF Motif Finding ChIP-chip / seq
Xiaole Shirley Liu
STAT115, STAT215, BIO298, BIST520
![Page 2: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/2.jpg)
De novo Sequence Motif Finding
• Goal: look for common sequence patterns enriched in the input data (compared to the genome background)
• Regular expression enumeration – Pattern driven approach
– Enumerate patterns, check significance in dataset
– Oligonucleotide analysis, MobyDick
• Position weight matrix update – Data driven approach, use data to refine motifs
– Consensus, EM & Gibbs sampling
– Motif score and Markov background
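As a rough illustration of "motif score and Markov background", the sketch below scores a candidate site by the log-likelihood ratio of a position weight matrix against a zeroth-order (single-nucleotide) background. The matrix values, the uniform background, and the names `log_odds_score` and `best_site` are made up for illustration; a higher-order Markov background would condition each base on the preceding bases.

```python
import math

# Hypothetical 4-position weight matrix: base probabilities per position.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
# Assumed uniform zeroth-order background.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

def log_odds_score(site, pwm=PWM, background=BACKGROUND):
    """Sum over positions of log2(P(base | motif) / P(base | background))."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal motif width")
    return sum(math.log2(col[b] / background[b]) for col, b in zip(pwm, site))

def best_site(sequence, pwm=PWM, background=BACKGROUND):
    """Return (score, start, site) for the highest-scoring window."""
    w = len(pwm)
    return max(
        (log_odds_score(sequence[i:i + w], pwm, background), i, sequence[i:i + w])
        for i in range(len(sequence) - w + 1)
    )

print(best_site("TTAGCTTT"))
```

A positive score means the site looks more like the motif than like background under these assumed frequencies.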
![Page 3: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/3.jpg)
Position Weight Matrix Update
• Advantage
– Can look for motifs of any width
– Flexible with base substitutions
• Disadvantage
– EM and Gibbs sampling: no guaranteed convergence time
– No guaranteed global optimum
![Page 4: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/4.jpg)
Motif Finding in Bacteria
• Promoter sequences are short (200-300 bp)
• Motifs are usually long (10-20 bases)
– Some have two blocks with a gap, some are palindromes
– Long motifs are usually very degenerate
• A single microarray experiment sometimes already provides enough information to search for TF motifs
![Page 5: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/5.jpg)
Motif Finding in Lower Eukaryotes
• Upstream sequences longer (500-1000 bp), with some simple repeats
• Motif width varies (5-17 bases)
• Expression clusters provide decent input sequence quality for TF motif finding
• Motif combination and redundancy appear, although single motifs are usually significant enough for identification
![Page 6: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/6.jpg)
Yeast Promoter Architecture
• Co-occurring regulators suggest physical interaction between the regulators
![Page 7: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/7.jpg)
Motif Finding in Higher Eukaryotes
• Upstream sequences very long (3KB-20KB) with repeats, TF motif could appear downstream
• Motifs can be short or long (6-20 bases), and appear in combination and clusters
• Gene expression clusters are not good enough input
• Need:
– Comparative genomics: PhastCons score
– Motif modules: motif clusters
– ChIP-chip/seq
![Page 8: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/8.jpg)
Yeast Regulatory Sequence Conservation
![Page 9: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/9.jpg)
UCSC PhastCons Conservation
• Functional regulatory sequences are under stronger evolutionary constraint
• Align orthologous sequences together
• PhastCons conservation score (0 – 1) for each nucleotide in the genome can be downloaded from UCSC
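A minimal sketch of how per-nucleotide conservation scores might be summarized over candidate regions. It assumes the scores for one chromosome have already been loaded into a plain Python list indexed by position; the function names and the toy numbers are hypothetical, and parsing the actual UCSC download is not shown.

```python
def mean_conservation(scores, start, end):
    """Mean per-base score over the half-open interval [start, end)."""
    region = scores[start:end]
    if not region:
        raise ValueError("empty region")
    return sum(region) / len(region)

def conserved_fraction(regions, scores, threshold):
    """Fraction of (start, end) regions whose mean score exceeds threshold."""
    hits = sum(mean_conservation(scores, s, e) > threshold for s, e in regions)
    return hits / len(regions)

# Toy per-base scores in [0, 1]; positions 2-4 look conserved.
toy_scores = [0.0, 0.1, 0.9, 1.0, 0.8, 0.0, 0.0, 0.1]
print(mean_conservation(toy_scores, 2, 5))
print(conserved_fraction([(2, 5), (5, 8)], toy_scores, threshold=0.2))
```

The threshold is a free parameter here; the right value depends on the genome and alignment used.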
![Page 10: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/10.jpg)
Conserved Motif Clusters
• First find conserved regions in the genome
• Then look for repeated transcription factor (TF) binding sites
• They form transcription factor modules
![Page 11: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/11.jpg)
Outline
• ChIP-chip on yeast
– Technology and data analysis: MDscan motif finding, regulatory network
• ChIP-X on human
– Tiling microarrays and peak finding
– High throughput sequencing and peak finding
– Data analysis and examples
• Analysis: peak finding, gene expression analysis, sequence motif finding, regulatory network
– Holistic picture of gene regulation
![Page 12: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/12.jpg)
Motivation
• Motif finding works well in bacteria, OK in yeast, marginal in worm/fly, and almost never in mammals
• Cistrome: Genome-wide in vivo binding sites of DNA-binding proteins
• ChIP-chip and ChIP-seq give cistrome results
![Page 13: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/13.jpg)
ChIP-chip Technology
• Chromatin ImmunoPrecipitation + microarray
– ChIP-on-chip or ChIP-chip
– Also known as Genome Scale Location Analysis
• Detect genome-wide in vivo location of TF and other DNA-binding proteins
– Find all the DNA sequences bound by TF-X?
– Cook all the dishes with cinnamon
• Can learn the regulatory mechanism of a transcription factor or DNA-binding protein much better and faster
![Page 14: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/14.jpg)
Chromatin ImmunoPrecipitation (ChIP)
![Page 15: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/15.jpg)
TF/DNA Crosslinking in vivo
![Page 16: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/16.jpg)
Sonication (~500bp)
![Page 17: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/17.jpg)
TF-specific Antibody
![Page 18: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/18.jpg)
Immunoprecipitation
![Page 19: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/19.jpg)
Reverse Crosslink and DNA Purification
![Page 20: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/20.jpg)
Promoter Array Hybridization
[Figure labels: Genes; Intergenic; ChIP]
![Page 21: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/21.jpg)
ChIP-DNA chip Detection
• Started in yeast, use promoter cDNA microarray
– ~ 6000 spots, each 800-1000 bp
• Two color assay
– Control: no antibody, or chromatin (a little bit of everything)
– Need triplicates to cancel noise
• Applied to all yeast TFs
– TF modified to contain a tag
– Tag can be precipitated with immunoglobulin
![Page 22: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/22.jpg)
ChIP-chip Motif Finding
• ChIP-chip gives 10-5000 binding regions ~600-1000bp long. Precise binding motif?
– Raw data is like perfect clustering, plus enrichment values
• MDscan
– High ChIP ranking => true targets, contain more sites
– Search TF motif from highest ranking targets first (high signal / background ratio)
– Refine candidate motifs with all targets
– Used successfully in ChIP-chip motif finding
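The seed step in the MDscan bullets above can be sketched roughly as follows. This is a simplified illustration, not the published algorithm: every w-mer in the top-ranked sequences is treated as a seed, its m-matches among the top sequences are collected, and seeds are ranked by how many sites they gather rather than by a proper motif score. The update with remaining sequences and the refinement step are not shown. All function names and the toy sequences are hypothetical.

```python
from collections import Counter

def wmers(seq, w):
    """All overlapping w-mers of a sequence."""
    return [seq[i:i + w] for i in range(len(seq) - w + 1)]

def matches(a, b):
    """Number of positions at which two equal-length w-mers agree."""
    return sum(x == y for x, y in zip(a, b))

def seed_motifs(top_seqs, w, m):
    """For each distinct w-mer (seed) in the top sequences, collect its m-matches.

    Returns a list of (seed, sites), largest site list first.
    Simplification: seeds are ranked by site count only.
    """
    pool = [k for s in top_seqs for k in wmers(s, w)]
    candidates = []
    for seed in sorted(set(pool)):
        sites = [k for k in pool if matches(seed, k) >= m]
        candidates.append((seed, sites))
    candidates.sort(key=lambda c: len(c[1]), reverse=True)
    return candidates

def count_matrix(sites):
    """Per-position base counts for a list of aligned sites."""
    return [Counter(col) for col in zip(*sites)]

# Toy input: three "top-ranked" sequences sharing a similar 8-mer.
top = ["ACGTTGCAAATCG", "GGTTGCGAATTCA", "CATTTGCAAATGG"]
seed, sites = seed_motifs(top, w=8, m=7)[0]
print(seed, sites)
print(count_matrix(sites))
```

Restricting the seed search to the top-ranked sequences keeps the enumeration small, which is the point of using ChIP ranking to guide the search.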
![Page 23: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/23.jpg)
Similarity Defined by m-match
For a given w-mer and any other random w-mer
TGTAACGT 8-mer
TGTAACGT matched 8
AGTAACGT matched 7
TGCAACAT matched 6
TGACACGG matched 5
AATAACAG matched 4
m-matches for TGTAACGT
Pick a reasonable m to call two w-mers similar
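The match counts listed above can be reproduced with a few lines of code. This is a minimal sketch; the function names `count_matches` and `is_m_match` are chosen here for illustration.

```python
def count_matches(a, b):
    """Number of positions at which two equal-length w-mers agree."""
    if len(a) != len(b):
        raise ValueError("w-mers must have the same width")
    return sum(x == y for x, y in zip(a, b))

def is_m_match(a, b, m):
    """True if the two w-mers agree at m or more positions."""
    return count_matches(a, b) >= m

seed = "TGTAACGT"
for other in ["TGTAACGT", "AGTAACGT", "TGCAACAT", "TGACACGG", "AATAACAG"]:
    print(other, "matched", count_matches(seed, other))
```

With m = 6, for example, the first three 8-mers would be called similar to the seed and the last two would not.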
![Page 24: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/24.jpg)
MDscan Seeds
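[Figure: ChIP-chip selected upstream sequences, ordered by enrichment (higher enrichment at the top). Each 9-mer seed is grouped with its m-matches into a seed motif pattern.]
A 9-mer: ATTGCAAAT
Seed motif pattern: ATTGCAAAT, TTTGCGAAT, TTTGCAAAT
Other seed groups shown in the figure:
TTGCAAATC: TTGCAAATC, TTGCGAATA, TTGCAAATT, TTGCCCATC
CAAATCCAA, CAAATCCAA, GAAATCCAC
GCAAATCCA, GCAAATTCG, GCAAATCCA, GGAAATCCA, GGAAATCCT
TGCAAATCC, TGCAAATTC
GCCACCGTACCACCGTACCACGGTGCCACGGC…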
![Page 25: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/25.jpg)
Update Motifs With Remaining Seqs
[Figure labels: Seed1 m-matches; Extreme High Rank; All ChIP-selected targets]
![Page 26: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/26.jpg)
Refine the Motifs
[Figure labels: Seed1 m-matches; Extreme High Rank; All ChIP-selected targets]
![Page 27: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/27.jpg)
Yeast TF Regulatory Network
[Figure labels: Protein; Gene; Regulate; Transcribe]
![Page 28: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/28.jpg)
ChIP-chip Better Explains Expression
[Figure panels: Ndt80 regulated genes; Sum1 regulated genes; Ndt80 & Sum1 regulated genes]
![Page 29: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/29.jpg)
Genome Tiling Microarrays
• Promoter arrays don't work for human ChIP-chip
• Binding could appear in much more distant intergenic sequences, introns, exons, or downstream sequences
[Figure labels: Genomic DNA on the chromosome; Tiling Probes]
![Page 30: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/30.jpg)
DNA Purification
![Page 31: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/31.jpg)
ChIP-chip on Tiling Microarray
[Figure labels: ChIP-DNA; Noise; ChIP; Ctrl; Chromosome]
![Page 32: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/32.jpg)
ChIP-chip
• Detect genome-wide location of transcription and epigenetic factors
• Affymetrix genome tiling arrays are cheaper
• $2000 7 arrays * 6 million probes * (3 ChIP + 3 Ctrl)
• But data is noisier and less informative
Two peaks? How about ChIP alone? Over 42M probes?
[Figure: log probe intensity versus chromosome coordinates, for ChIP and Ctrl]
![Page 33: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/33.jpg)
![Page 34: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/34.jpg)
ChIP-chip Analysis: Mann-Whitney U-test
• Affy TAS, Cawley et al (Cell 2004):
– Assign 1 to all probe pairs with MM > PM
– Each probe: rank probes within [-500bp, +500bp] window
– Check whether sum of ChIP ranks is much smaller
– Consider all probes equally
– Half of the probes have MM > PM
[Figure: histogram of (PM – MM)]
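The window-based rank test described above can be sketched with a plain rank-sum statistic. This is an illustrative approximation, not the TAS implementation: it floors PM – MM at 1, pools ChIP and control probes in a window, and computes a normal-approximation z-score without tie correction. Rank 1 is the smallest value here, so enrichment shows up as a large ChIP rank sum; ranking in the opposite direction gives the "much smaller" phrasing used above. All names and the toy numbers are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks (1 = smallest), with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_z(chip, ctrl):
    """Normal-approximation z-score for the rank sum of chip versus ctrl.

    Positive z means ChIP values tend to be larger than control values.
    No tie correction is applied to the variance (a simplification).
    """
    n1, n2 = len(chip), len(ctrl)
    ranks = average_ranks(list(chip) + list(ctrl))
    w = sum(ranks[:n1])  # rank sum of the ChIP group
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean) / sd

def window_values(probes, center, half_width=500):
    """max(PM - MM, 1) for probes within +/- half_width bp of center.

    probes: list of (position, pm, mm) tuples.
    """
    return [max(pm - mm, 1) for pos, pm, mm in probes
            if abs(pos - center) <= half_width]

print(rank_sum_z([10, 11, 12], [1, 2, 3]))
```

Because every probe contributes only its rank, all probes are weighted equally, which is the limitation noted in the bullets.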
![Page 35: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/35.jpg)
Affymetrix Tiling Array Peak Finding
• Challenges:
– Massive data, probe values noisy
– Only 1/3 of researchers get it to work the first time
– Previous algorithms only work by comparing 3 ChIP with 3 Ctrl
• Model-based Analysis of Tiling arrays (MAT)
– Work with single ChIP (no rep, no ctrl)
– Find individual failed samples
– More sensitive, specific, and quantitative with 3 ChIP & 3 Ctrl
MAT: Johnson et al, PNAS 2006
![Page 36: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/36.jpg)
MAT
• Most of the probes in ChIP-chip measure non-specific hybridization and background noise
• Estimate probe behavior by checking other probes with similar sequence on the same array
• Probe sequence plays a big role in signal value
![Page 37: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/37.jpg)
Model Sequence-Specific Probe Effect
• First detailed model of probe sequence on probe signal
• Example probe AATGC ACTGT GCACA GATCG GCCAT: 7 A, 7 C, 6 G, 5 T, maps to 2 places in genome
• Use all the probes on the array to estimate the parameters
$$\text{Probe signal} = 5\alpha + \beta_{1A} + \beta_{2A} + \beta_{4G} + \beta_{5C} + \dots + 49\gamma_A + 49\gamma_C + 36\gamma_G + 25\gamma_T + \log(2)\,\delta + \varepsilon$$
– $5\alpha$: # of T's, intercept
– $\beta_{jk}$: position-specific A, C, G effect
– $\gamma_k$: A, C, G, T count squared
– $\delta$: 25-mer copy number
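To make the terms of the example concrete, the sketch below breaks the 25-mer into the quantities that appear in the model: the T count, the position-specific A/C/G indicators, the squared base counts, and the copy number. It only builds the predictors; fitting the coefficients is not shown. The function name and the dictionary layout are hypothetical.

```python
from collections import Counter

def probe_model_terms(probe, copy_number):
    """Break a probe into the predictor terms of the probe model.

    Returns the T count, the (position, base) indicators for A/C/G
    (1-based positions), the squared base counts, and the copy number.
    """
    counts = Counter(probe)
    return {
        "n_T": counts["T"],
        "position_terms": [(j, b) for j, b in enumerate(probe, start=1) if b in "ACG"],
        "squared_counts": {b: counts[b] ** 2 for b in "ACGT"},
        "copy_number": copy_number,
    }

probe = "AATGCACTGTGCACAGATCGGCCAT"
terms = probe_model_terms(probe, copy_number=2)
print(terms["n_T"])                 # 5
print(terms["squared_counts"])      # {'A': 49, 'C': 49, 'G': 36, 'T': 25}
print(terms["position_terms"][:4])  # [(1, 'A'), (2, 'A'), (4, 'G'), (5, 'C')]
```

Position 3 is a T, so it contributes no position-specific term, which is why the example jumps from the second to the fourth position.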
![Page 38: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/38.jpg)
Probe Standardization
• Fit the probe model array by array
• 6M probes, 2K bins
$$t_i = \frac{\log(PM_i) - \hat{m}_i}{s_{i,\,\text{affinity bin}}}$$
– $\hat{m}_i$: model predicted probe intensity
– $PM_i$: observed probe intensity
– $s_{i,\,\text{affinity bin}}$: observed probe variance within each bin
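A minimal sketch of the standardization step, under stated assumptions: probes are sorted by model-predicted value and cut into bins of roughly equal size, and the spread within a bin is taken to be the standard deviation of the observed log intensities in that bin. The binning rule, the choice of standard deviation as the spread, and the function name are assumptions made for illustration.

```python
import statistics

def standardize_probes(log_pm, predicted, n_bins):
    """t_i = (log_pm_i - predicted_i) / s_bin(i).

    Probes are sorted by predicted value and split into n_bins groups;
    s_bin is the standard deviation of observed log_pm in that group.
    """
    n = len(log_pm)
    order = sorted(range(n), key=lambda i: predicted[i])
    t = [0.0] * n
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        if len(members) < 2:
            continue  # too few probes to estimate a spread; t stays 0.0
        s = statistics.stdev([log_pm[i] for i in members])
        for i in members:
            t[i] = (log_pm[i] - predicted[i]) / s if s > 0 else 0.0
    return t

# Toy data: 4 probes, 2 bins of 2 probes each.
t_values = standardize_probes(
    log_pm=[1.0, 3.0, 10.0, 14.0],
    predicted=[2.0, 2.0, 12.0, 12.0],
    n_bins=2,
)
print(t_values)
```

With 6M probes and 2K bins, each bin would hold about 3000 probes, enough to estimate the spread stably.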
![Page 39: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/39.jpg)
[Figure: probe values at two spike-in regions with concentration 2X]
• Raw probe values: ChIP, Ctrl
• Sequence-based probe behavior standardization: ChIP standardized, Ctrl standardized
• Window-based neighboring probe combination for ChIP-region detection: ChIP window, (ChIP – Ctrl), (3 ChIP – 3 Ctrl)
![Page 40: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/40.jpg)
MA2C: Model-based Analysis of 2-Color Arrays
• Normalize probes by GC bins within each array
– How much variance is observed in the GC bin
– Give high confidence probes more weight
• Running window average or median for peak finding
MA2C: Song et al, Genome Biol 2007
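The two ideas above, GC-bin normalization and a running-window summary, can be sketched as follows. This is a simplified stand-in, not MA2C's actual estimator: each probe value is centered and scaled by the mean and standard deviation of probes with the same GC count, and a window score is the median of the normalized values. Probe weighting by confidence is not shown. All names and the toy data are hypothetical.

```python
import statistics
from collections import defaultdict

def gc_count(seq):
    """Number of G or C bases in a probe sequence."""
    return sum(base in "GC" for base in seq)

def normalize_by_gc_bin(probe_seqs, values):
    """Center and scale each value by the mean and SD of its GC-count bin."""
    bins = defaultdict(list)
    for seq, v in zip(probe_seqs, values):
        bins[gc_count(seq)].append(v)
    stats = {}
    for gc, vs in bins.items():
        sd = statistics.pstdev(vs)
        stats[gc] = (statistics.mean(vs), sd if sd > 0 else 1.0)
    return [(v - stats[gc_count(seq)][0]) / stats[gc_count(seq)][1]
            for seq, v in zip(probe_seqs, values)]

def window_median(positions, scores, center, half_width):
    """Median score of probes within +/- half_width bp of center, or None."""
    inside = [s for p, s in zip(positions, scores) if abs(p - center) <= half_width]
    return statistics.median(inside) if inside else None

normalized = normalize_by_gc_bin(["AAAA", "AAAT", "GGCC", "GCGC"], [1.0, 3.0, 10.0, 14.0])
print(normalized)
print(window_median([0, 100, 900], [1.0, 5.0, 9.0], center=50, half_width=100))
```

Using the median rather than the mean makes the window score less sensitive to a single outlier probe.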
![Page 41: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/41.jpg)
Is a ChIP experiment working?
• MAT window scores ~ normal with long tails
• Estimate p-value of normal from left half of data
• FDR = A / B (Ctrl/ChIP peaks are all FPs)
• Spike-in shows MAT FDR estimate is accurate
• Can find individual failed replicate
MAT: Quality Control
[Figure labels: <1% enriched; Background; Enriched DNA; A; B]
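One way to read the FDR bullet is sketched below: assume the null distribution of window scores is symmetric, estimate its center and spread from the left half of the data, and take the number of windows in the left tail (A) over the number in the right tail (B) at the same distance from the center. This reading, the use of the median as the center, and the function names are assumptions made for illustration.

```python
import math
import statistics

def null_from_left_half(scores):
    """Estimate the null center and SD from scores at or below the median.

    Assumes a symmetric null with enrichment only in the right tail.
    """
    center = statistics.median(scores)
    left = [s for s in scores if s <= center]
    sd = math.sqrt(sum((s - center) ** 2 for s in left) / len(left))
    return center, sd

def empirical_fdr(scores, cutoff):
    """A / B: windows at or below center - cutoff over windows at or above center + cutoff."""
    center, _ = null_from_left_half(scores)
    a = sum(s <= center - cutoff for s in scores)
    b = sum(s >= center + cutoff for s in scores)
    return a / b if b else float("nan")

toy = [-3, -1, -1, 0, 0, 0, 1, 1, 3, 4, 5]
print(null_from_left_half(toy))
print(empirical_fdr(toy, cutoff=3))
```

A replicate whose right tail is no heavier than its left tail would give an FDR near 1 at every cutoff, which is one way a failed sample could be flagged.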
![Page 42: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/42.jpg)
ChIP-Seq
[Figure labels: ChIP-DNA; Noise]
• Sequence millions of 30-mer ends of fragments
• Map 30-mers back to the genome
![Page 43: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/43.jpg)
MACS: Model-based Analysis for ChIP-Seq
• Use confident peaks to model shift size
[Figure label: Binding]
![Page 44: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/44.jpg)
![Page 45: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/45.jpg)
Peak Calls
• Tag distribution along the genome ~ Poisson distribution (λBG = total tag / genome size)
• ChIP-Seq shows local biases in the genome
– Chromatin and sequencing bias
– 200-300bp control windows have too few tags
– But can look further
Dynamic λlocal = max(λBG, [λctrl, λ1k,] λ5k, λ10k)
[Figure labels: ChIP; Control; windows of 300bp, 1kb, 5kb, 10kb]
http://liulab.dfci.harvard.edu/MACS/
Zhang et al, Genome Biol, 2008
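The dynamic λ idea can be sketched as follows, assuming control tag counts are available in windows of several widths around a candidate peak. Each count is rescaled to the expected number of tags in a window the size of the peak, the largest rate (including the genome-wide background) is used as λlocal, and the ChIP tag count is scored with a Poisson upper tail. The function names and the toy numbers are hypothetical, and this is not the MACS code itself.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), by summing the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def lambda_local(lambda_bg, window_counts, peak_width):
    """max(lambda_BG, control-tag rates rescaled to peak_width bp).

    window_counts maps window width in bp to the control tag count
    in that window around the candidate peak.
    """
    rates = [count * peak_width / width for width, count in window_counts.items()]
    return max([lambda_bg] + rates)

# Toy example: 300 bp candidate peak with 20 ChIP tags.
lam = lambda_local(lambda_bg=2.0, window_counts={5000: 100, 10000: 120}, peak_width=300)
print(lam)                  # 6.0
print(poisson_sf(20, lam))  # upper-tail probability of seeing >= 20 tags
```

Taking the maximum keeps a locally elevated control rate from being averaged away by a quieter, wider window.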
![Page 46: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/46.jpg)
CEAS: Cis-regulatory Element Annotation System
• Data Analysis Button for Biologists
http://ceas.cbi.pku.edu.cn
![Page 47: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/47.jpg)
Estrogen Receptor
• Carroll et al, Cell 2005
• Overactive in > 70% of breast cancers
• Where does it go in the genome?
• ChIP-chip on chr21/22, motif and expression analysis found its partner FoxA1
[Figure labels: TF??; ER]
![Page 48: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/48.jpg)
Estrogen Receptor (ER) Cistrome in Breast Cancer
• Carroll et al, Nat Genet 2006
• ER may function far away (100-200KB) from genes
• Only 20% of ER sites have PhastCons > 0.2
• ER has different effect based on different collaborators
[Figure labels: AP1; ER; NRIP]
![Page 49: More on TF Motif Finding ChIP-chip / seq Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520](https://reader030.vdocuments.site/reader030/viewer/2022032605/56649e7d5503460f94b801b7/html5/thumbnails/49.jpg)