![Page 1: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/1.jpg)
Special Topics in Genomics
ChIP-chip and Tiling Arrays
![Page 2: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/2.jpg)
Traditional Method for Understanding Transcription Regulation
• Gene expression microarray analysis
• Cluster genes by expression profile
• Search for conserved sequence motifs in cluster promoters
• Very challenging for mammalian genomes
![Page 3: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/3.jpg)
ChIP-chip Technology
• Chromatin ImmunoPrecipitation + microarray
• Detects the genome-wide in vivo binding locations of transcription factors (TFs) and other DNA-binding proteins
• Reveals the regulatory mechanism of a transcription factor or other DNA-binding protein more directly and faster than the traditional approach
![Page 4: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/4.jpg)
Chromatin ImmunoPrecipitation (ChIP)
By Richard Bourgon at UC Berkeley
![Page 5: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/5.jpg)
TF/DNA Crosslinking in vivo
By Richard Bourgon at UC Berkeley
![Page 6: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/6.jpg)
Sonication (~500bp)
By Richard Bourgon at UC Berkeley
![Page 7: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/7.jpg)
TF-specific Antibody
By Richard Bourgon at UC Berkeley
![Page 8: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/8.jpg)
Immunoprecipitation
By Richard Bourgon at UC Berkeley
![Page 9: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/9.jpg)
Reverse Crosslink and DNA Purification
By Richard Bourgon at UC Berkeley
![Page 10: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/10.jpg)
Amplification
By Richard Bourgon at UC Berkeley
![Page 11: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/11.jpg)
Genome Tiling Arrays
| Platform | # Arrays (human genome) | # Probes / Array | # Total Probes | Probe Length | Probe Resolution | Price |
|---|---|---|---|---|---|---|
| Affymetrix | 7 | 6M | 42.0M | 25mer | 35 bp | $2,000 |
| Nimblegen | 38 | 390K | 14.8M | 50mer | 110 bp | $30,000 |
| Agilent | 21 | 244K | 5.1M | 60mer | 300 bp in genes; 500 bp in intergenic | $11,000 |
By Xiaole Shirley Liu at Harvard
![Page 12: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/12.jpg)
Genome Tiling Arrays
• Affymetrix genome tiling microarrays
– Tile the non-repeat regions of the genome
– Chr21/22 tiling (earlier version): 1 million probe pairs (PM & MM) at 35 bp resolution on 3 arrays
– Whole genome: 42 million PM probes on 7 arrays
[Figure: probes tiled along a chromosome]
PM CGACATTGATTCAAGACTACATACA
MM CGACATTGATTCTAGACTACATACA
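The PM/MM pair on this slide differs only at the middle (13th) base, which is replaced by its complement. A minimal sketch of that relationship follows; the function name `mismatch_probe` is hypothetical and only illustrates the pairing shown above.

```python
# Hypothetical helper: derive the mismatch (MM) probe from a perfect-match (PM) probe
# by complementing the middle base, as in the PM/MM pair shown on the slide.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mismatch_probe(pm: str) -> str:
    """Return the PM sequence with its middle base swapped for the complement."""
    mid = len(pm) // 2  # index 12, i.e. the 13th base, for a 25-mer
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]


pm = "CGACATTGATTCAAGACTACATACA"
print(mismatch_probe(pm))
```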
By Xiaole Shirley Liu at Harvard
![Page 13: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/13.jpg)
Chromatin ImmunoPrecipitation (ChIP)
By Richard Bourgon at UC Berkeley
![Page 14: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/14.jpg)
ChIP-chip Array Hybridization
• Map high-intensity probes back to the genome
• Locate TF binding locations
[Figure: probe intensities along a chromosome, showing ChIP-DNA signal versus noise]
By Xiaole Shirley Liu at Harvard
![Page 15: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/15.jpg)
Identify ChIP-enriched Region
• Controls: sonicated genomic Input DNA
• Often 3 ChIP and 3 Ctrl replicates are needed
[Figure: ChIP and Ctrl probe signal tracks]
By Xiaole Shirley Liu at Harvard
![Page 16: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/16.jpg)
Mann-Whitney U-test for ChIP-region Detection
• Affy TAS, Cawley et al. (Cell 2004):
– Each probe: rank probes (either PM-MM or PM) within a [-500 bp, +500 bp] window
– Check whether the sum of ChIP ranks is much smaller than expected by chance
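The windowed rank test can be sketched in a few lines. This is an illustrative version, not the TAS implementation: the function name, the data layout (one list of probe values per replicate), and the normal approximation without tie correction are all assumptions. It counts how often a ChIP value exceeds a control value inside the window, which is equivalent to comparing rank sums.

```python
import math


def window_u_test(positions, chip, ctrl, center, half_width=500):
    """One-sided Mann-Whitney U test (ChIP > Ctrl) for probes near `center`.

    positions: genomic coordinate of each probe
    chip, ctrl: lists of replicates, each a list of probe values (assumed layout)
    """
    idx = [k for k, pos in enumerate(positions) if abs(pos - center) <= half_width]
    x = [rep[k] for rep in chip for k in idx]
    y = [rep[k] for rep in ctrl for k in idx]
    # U = number of (ChIP, Ctrl) pairs where ChIP is larger; ties count one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction (assumption)
    z = (u - mean_u) / sd_u
    p = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal approximation
    return u, p


positions = [0, 35, 70, 105, 2000]
u, p = window_u_test(positions, [[5, 6, 7, 8, 1]], [[1, 2, 3, 4, 9]], center=35)
print(u, p)
```

The normal approximation is rough for windows with few probes; an exact test would be preferable there.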
By Xiaole Shirley Liu at Harvard
![Page 17: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/17.jpg)
TileMap (Ji and Wong, Bioinformatics 2005)
STEP 1: Compute a test statistic for each probe to summarize probe-level information
STEP 2: Combine probe-level test statistics of neighboring probes to help infer binding regions
![Page 18: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/18.jpg)
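Probe level test statistic: empirical Bayes approach
For probes $i = 1, 2, 3, \ldots, I$, compute the sample variances $s_1^2, s_2^2, s_3^2, \ldots, s_I^2$, each with $df$ degrees of freedom.
Mean and sum of squares:
$$\bar{s}^2 = \frac{1}{I}\sum_{i} s_i^2, \qquad S = \sum_{i} \left[ s_i^2 - \bar{s}^2 \right]^2$$
Shrinkage factor:
$$\hat{B} = \frac{2}{2 + df} \cdot \frac{I - 1}{I} + \frac{2}{2 + df} \cdot \frac{(I - 1)\,(\bar{s}^2)^2}{S}$$
Variance shrinkage estimator:
$$\hat{\sigma}_i^2 = (1 - \hat{B})\, s_i^2 + \hat{B}\, \bar{s}^2$$
This gives the variance estimates $\hat{\sigma}_1^2, \hat{\sigma}_2^2, \hat{\sigma}_3^2, \ldots, \hat{\sigma}_I^2$.
A modified t-statistic:
$$\tilde{t}_i = \frac{\bar{x}_{i1} - \bar{x}_{i2}}{\hat{\sigma}_i \sqrt{\dfrac{1}{K_1} + \dfrac{1}{K_2}}}$$
This gives the probe level test statistics $\tilde{t}_1, \tilde{t}_2, \tilde{t}_3, \ldots, \tilde{t}_I$.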
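The variance shrinkage and the modified t-statistic on this slide can be sketched as below. This is a minimal illustration, not the TileMap source: the function names are hypothetical, and capping the shrinkage factor at 1 (and setting it to 1 when all variances are equal) is a safeguard assumed here so the estimate stays between each probe's own variance and the mean variance.

```python
import math


def shrink_variances(s2, df):
    """Shrink per-probe sample variances toward their mean.

    s2: list of sample variances, one per probe; df: degrees of freedom of each.
    Returns (shrunken variances, shrinkage factor B).
    """
    n_probes = len(s2)
    mean_s2 = sum(s2) / n_probes
    sum_sq = sum((v - mean_s2) ** 2 for v in s2)
    if sum_sq == 0:
        b = 1.0  # all variances identical: nothing to distinguish (assumption)
    else:
        b = (2 / (2 + df)) * ((n_probes - 1) / n_probes) \
            + (2 / (2 + df)) * ((n_probes - 1) * mean_s2 ** 2 / sum_sq)
        b = min(1.0, b)  # cap at 1 (assumed safeguard)
    return [(1 - b) * v + b * mean_s2 for v in s2], b


def modified_t(mean1, mean2, sigma2_hat, k1, k2):
    """t-like statistic using the shrunken variance; k1, k2 are replicate counts."""
    return (mean1 - mean2) / math.sqrt(sigma2_hat * (1 / k1 + 1 / k2))


shrunk, b = shrink_variances([0.1, 1.0, 10.0, 0.5], df=10)
print(b, shrunk)
print(modified_t(3.0, 1.0, 1.0, 2, 2))
```

Probes with very small sample variance are pulled upward, which keeps a tiny variance from inflating the t-statistic when there are few replicates.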
![Page 19: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/19.jpg)
Combining neighboring probes
TileMap (MA)
1. Compute the probe level test statistic t for each probe;
2. Compute a moving average statistic to measure enrichment;
3. Estimate FDR.
TileMap (HMM)
1. Compute the probe level test statistic t for each probe;
2. Estimate the distribution of t under H0 and H1;
3. Model t by a Hidden Markov Model, and decode the HMM.
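Step 2 of TileMap (MA) can be illustrated with a simple moving average over probe-level statistics. The sketch below assumes the window is defined by probe count (`half_window` probes on each side) and is truncated at the ends of the sequence; the actual window definition and the FDR step are not specified on the slide and are not reproduced here.

```python
def moving_average(t, half_window=2):
    """Average each probe's statistic with its neighbors on both sides.

    t: probe-level statistics ordered by genomic position.
    half_window: number of neighboring probes on each side (assumed definition).
    """
    n = len(t)
    out = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        out.append(sum(t[lo:hi]) / (hi - lo))
    return out


print(moving_average([0, 0, 6, 0, 0], half_window=1))
```

A single high probe is spread over its neighbors, while a run of consistently high probes keeps a high average, which is what makes the smoothed statistic useful for calling regions.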
![Page 20: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/20.jpg)
Shrinking variance increases statistical power
[Figure panels: Mean(X1)-Mean(X2); t-statistic, canonical; t-statistic, variance shrinking; Moving Average]
![Page 21: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/21.jpg)
Peak 2 (180bp) transgenics
Neural tube expression Transgenics
![Page 22: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/22.jpg)
Comparisons between TileMap and previous methods
cMyc ChIP-chip Data: 6 IP + 6 CT1 + 6 CT2
Gold Standard: Using GTRANS and Keles’ method to analyze all 18 arrays
Test data: 4 arrays, 2 IP vs 2 CT1 (s2r2)
GTRANS or TAS (Kampa et al., 2004)
1. Set a window;
2. Perform a Wilcoxon signed rank test for each window.
Keles et al. (2004)
1. Compute a t-statistic t for each probe (no shrinking, two sample only);
2. Rank probes by a moving average.
TileMap-HMM (Ji & Wong, 2005)
![Page 23: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/23.jpg)
Shrinking variance saves money
Using non-shrinking method (Keles’ method) to analyze all probes
Using shrinking method to analyze half of the probes, i.e., reduce information by half
![Page 24: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/24.jpg)
MAT (Johnson W.E. et al. PNAS, 2006)
• Model-based Analysis of Tiling arrays for ChIP-chip
• Goal:
– Find ChIP-regions without replicates
– Find ChIP-regions without controls
– Find ChIP-regions without MM probes
– Can analyze data array by array
By Xiaole Shirley Liu at Harvard
![Page 25: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/25.jpg)
MAT
• Estimate probe behavior by checking other probes with similar sequence on the same array
• Probe sequence plays a big role in signal value
• Most of the probes in ChIP-chip measure non-specific hybridization
By Xiaole Shirley Liu at Harvard
![Page 26: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/26.jpg)
Probe Behavior Model
Baseline on number of Ts
A,C,G at each position of the 25mer
A,C,G,T Count Square
25mer Copy Number along the Genome
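The four labeled terms of the probe behavior model can be turned into a feature vector for a linear fit of log(PM). The sketch below is an assumption-laden illustration: the function name, the feature ordering, and the use of the logarithm of the copy number are choices made here, and the fitting step itself is omitted.

```python
import math


def probe_features(seq, copy_number=1):
    """Build sequence features for one probe (hypothetical layout).

    Order: [number of Ts] + [A/C/G indicator at each position]
           + [squared counts of A, C, G, T] + [log copy number].
    """
    features = [seq.count("T")]                       # baseline on number of Ts
    for base in seq:                                  # position-specific A, C, G
        features.extend(1 if base == k else 0 for k in "ACG")
    features.extend(seq.count(k) ** 2 for k in "ACGT")  # nucleotide count squares
    features.append(math.log(copy_number))            # copy number along the genome
    return features


f = probe_features("CGACATTGATTCAAGACTACATACA")
print(len(f), f[0], f[-5:])
```

For a 25mer this yields 1 + 75 + 4 + 1 = 81 features; T is left out of the position indicators because it serves as the baseline.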
By Xiaole Shirley Liu at Harvard
![Page 27: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/27.jpg)
Probe Standardization
• Fit the probe model array by array
• Divide array probes into bins (3k probes/bin)
• Background subtraction and standardization (normalization) on a single array:
$$t_i = \frac{\log(PM_i) - \hat{m}_i}{s_{\text{affinity bin } i}}$$
– $\log(PM_i)$: observed probe intensity
– $\hat{m}_i$: model-predicted probe intensity
– $s_{\text{affinity bin } i}$: estimated from the observed probe variance within each bin
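The binning and standardization step can be sketched as follows. This is not the MAT source code: the function name is hypothetical, probes are assumed to be binned by sorting on the model-predicted intensity, the scale is taken as the sample standard deviation of observed log(PM) in each bin, and a final bin that is too small is merged into the previous one.

```python
import statistics


def standardize(log_pm, predicted, bin_size=3000):
    """Return t_i = (log_pm_i - predicted_i) / s_bin for every probe."""
    order = sorted(range(len(log_pm)), key=lambda i: predicted[i])
    bins = [order[s:s + bin_size] for s in range(0, len(order), bin_size)]
    if len(bins) > 1 and len(bins[-1]) < 2:
        bins[-2].extend(bins.pop())  # avoid a one-probe bin (assumption)
    t = [0.0] * len(log_pm)
    for members in bins:
        s = statistics.stdev([log_pm[i] for i in members])
        for i in members:
            t[i] = (log_pm[i] - predicted[i]) / s
    return t


print(standardize([1, 3, 10, 14], [1, 1, 5, 5], bin_size=2))
```

Because each probe is scaled against probes with similar predicted behavior on the same array, the resulting t-values are comparable across probes without a separate between-array normalization.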
By Xiaole Shirley Liu at Harvard
![Page 28: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/28.jpg)
Eliminate Normalization
• Probe log(PM) values before and after standardization
• If normalized before model fitting
– Predicted the same ChIP-regions, although with less confidence
By Xiaole Shirley Liu at Harvard
![Page 29: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/29.jpg)
ChIP-region Detection
• Window-based MATscore
– TM: trimmed mean
– ChIP without Ctrl:
$$MAT(region) = \sqrt{n_{ChIP}}\; TM(t'\text{s in region})$$
– Multiple ChIP with multiple Ctrl:
$$MAT(region) = \sqrt{n}\, \left[ TM(t'\text{s in ChIP}) - TM(t'\text{s in Input}) \right]$$
– More probes, higher t values in ChIP, and less variance (fluctuation) give more confidence
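A window score built from a trimmed mean can be sketched as below. Several details are assumptions rather than facts from the slide: the trimming fraction (10% from each tail by default), scaling by the square root of the number of ChIP t-values in the window, and subtracting the Input trimmed mean when controls are available. The function names are hypothetical.

```python
import math


def trimmed_mean(values, trim=0.1):
    """Mean after dropping a fraction `trim` of values from each tail."""
    v = sorted(values)
    k = int(len(v) * trim)
    kept = v[k:len(v) - k]
    return sum(kept) / len(kept)


def mat_score(chip_t, input_t=None, trim=0.1):
    """Window score: sqrt(number of ChIP t-values) times the trimmed mean."""
    tm = trimmed_mean(chip_t, trim)
    if input_t:
        tm -= trimmed_mean(input_t, trim)  # control-adjusted variant (assumed form)
    return math.sqrt(len(chip_t)) * tm


print(mat_score([1, 2, 3, 4, 100], trim=0.2))
print(mat_score([1, 2, 3, 4, 100], input_t=[0, 1, 1, 1, 2], trim=0.2))
```

Trimming keeps a single outlier probe (the 100 above) from dominating the window, and the square-root factor rewards windows supported by more probes.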
By Xiaole Shirley Liu at Harvard
![Page 30: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/30.jpg)
Raw probe values at two spike-in regions with concentration 2X
[Figure panels: ChIP_1 Log(PM); Input_1 Log(PM)]
Sequence-based probe behavior standardization
[Figure panels: ChIP_1 t-value; Input_1 t-value]
Window-based neighboring probe combination for ChIP-region detection
[Figure panels: ChIP_1 MATscore; ChIP_1/Input_1 MATscore; 3 Reps ChIP/Input MATscore; both spike-in regions labeled 2X]
By Xiaole Shirley Liu at Harvard
![Page 31: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/31.jpg)
Statistical Significance of Hits
• P-value and FDR cutoff:
– P-value from MATscore distribution
– Estimate negative peaks under the same P-value cutoff
– Regional FDR = #negative_peaks / #positive_peaks
MAT: Quality Control
[Figure: MATscore distribution showing background and enriched DNA; <1% enriched]
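The regional FDR ratio can be illustrated with a short sketch. It assumes one score per candidate region, a background roughly symmetric around zero, and a score cutoff already derived from the chosen P-value; the function name is hypothetical.

```python
def regional_fdr(region_scores, cutoff):
    """Estimate FDR as (# regions below -cutoff) / (# regions above +cutoff)."""
    positive = sum(1 for s in region_scores if s >= cutoff)
    negative = sum(1 for s in region_scores if s <= -cutoff)
    return negative / positive if positive else 0.0


scores = [5, 6, 7, 8, -5.5, 1, -1]
print(regional_fdr(scores, cutoff=5))
```

Negative peaks serve as an empirical estimate of how many positive peaks would arise from background alone at the same cutoff.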
By Xiaole Shirley Liu at Harvard
![Page 32: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/32.jpg)
MAT summary
• Open-source Python: http://chip.dfci.harvard.edu/~wli/MAT/
• Runs faster than the array scanner
• Can work with single ChIP, multiple ChIP, and multiple ChIP with controls, with increasing accuracy
– Use single ChIP on promoter arrays to test antibody and protocol before going whole genome
• Can identify individual failed samples
By Xiaole Shirley Liu at Harvard
![Page 33: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/33.jpg)
Benchmark for ChIP-chip Target Detection (Johnson D.S. et al. Genome Research, 2008)
• ENCODE spike-in experiment: both amplified and unamplified
• Blind test: samples hybridized to different tiling arrays, predictions made before the key was released
• ChIP: 96 ENCODE clones at 2, 4, 8, ..., 256X enrichment + total chromatin DNA
• Input: total genomic DNA
![Page 34: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/34.jpg)
Comparison of platforms
![Page 35: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/35.jpg)
Comparison of algorithms
Combined Johnson D.S. et al. Genome Research 2008 with Ji H. et al. Nature Biotechnology 2008
![Page 36: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/36.jpg)
MBR: Microarray Blob Remover
By Xiaole Shirley Liu at Harvard
![Page 37: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/37.jpg)
xMAN: eXtreme MApping of oligoNucleotides
• http://chip.dfci.harvard.edu/~wli/xMAN
• xMAN maps ~42 M Affymetrix tiling probes to the newest human genome assembly in less than 6 CPU hours
– BLAST needs 20 CPU years; BLAT needs 55 CPU days
– Probe TCCCAGCACTTTGGGAGGCTGAGGC maps 50,660 times in the genome
• Can map long oligos and paired-tag high-throughput sequencing fragments
• Stores the copy number information of every probe
• xMAN filters tiling array probes to ensure one unique probe measurement per 1 kb, which improves peak detection
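The idea of storing a copy number for every probe can be illustrated with a toy exact-match counter. This is not xMAN's algorithm, which is not described on the slide; it is a naive sketch that counts occurrences of each probe on both strands of an in-memory sequence, with hypothetical function names.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def copy_numbers(genome, probes, k=25):
    """Count exact matches of each k-mer probe on both strands of `genome`."""
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    result = {}
    for probe in probes:
        rc = reverse_complement(probe)
        # A probe equal to its own reverse complement is counted only once.
        result[probe] = counts[probe] + (counts[rc] if rc != probe else 0)
    return result


print(copy_numbers("ACGTACGT", ["ACG"], k=3))
```

Holding every k-mer of a mammalian genome in a dictionary is not practical at full scale, so a real mapper needs a far more compact index; the sketch only shows what the stored copy number means.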
By Xiaole Shirley Liu at Harvard
![Page 38: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/38.jpg)
CEAS: Cis-regulatory Element Annotation System
• Data Analysis Button for Biologists
http://ceas.cbi.pku.edu.cn
By Xiaole Shirley Liu at Harvard
![Page 39: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/39.jpg)
CisGenome (Ji H. et al. Nature Biotechnology, 2008)
• Graphic User Interface
• CisGenome Browser
• Core Data Analysis Programs
![Page 40: Special Topics in Genomics ChIP-chip and Tiling Arrays](https://reader035.vdocuments.site/reader035/viewer/2022062409/5681474b550346895db48d91/html5/thumbnails/40.jpg)
Other applications of tiling arrays
• Transcriptome mapping
• MeDIP-chip
• DNase-chip
• Nucleosome localization
• Array CGH and copy number variation