chip seq presentation
8/6/2019 ChIP Seq Presentation
PRACTICAL: CHIP-SEQ DATA ANALYSIS
Andre Faure & Petra Schwalie
Paul Flicek Lab, Vertebrate Genomics, EMBL-EBI
9. March 2010
Wednesday, 9 March 2011
-
RESOURCES
http://www.bioconductor.org (R packages & workflows; help)
http://seqanswers.com (software overview; forum)
data repositories: ArrayExpress & GEO, ENA
ENCODE, modENCODE (collaborative efforts, ChIP-seq)
Reviews + Benchmarks (see last slide)
-
WORKFLOW
Raw data (ENCODE CTCF, H3K36me3, Input in K562 & HepG2)
Quality check & align (not discussed here)
(1) Peak-calling
(2) Genomic context
Read profile plots
(3) Motif analysis (de novo & scanning)
(4) Differential enrichment
-
(1) PEAK-CALLING
chipseq, GenomicRanges (Bioconductor)
estimating fragment length
extending reads
islands of enrichment
modeling the background (e.g. Poisson, negative binomial)
calling peaks (manual, MACS, SWEMBL)
genomic overlaps: comparison of peak-calling results
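The steps listed on this slide (extending reads, finding islands of enrichment, modeling the background) can be sketched in a few lines. The practical itself uses the chipseq and GenomicRanges packages in R; the plain-Python version below is only an illustration of the idea. All function names, the toy reads, the fixed fragment length of 30 bp and the significance level are assumptions made for the example, and the background rate is naively taken as the mean coverage rather than estimated from an Input sample.

```python
import math

def extend_read(start, strand, read_len, frag_len):
    """Extend a read to the (assumed) fragment length along its strand.
    Coordinates are 0-based and half-open."""
    if strand == "+":
        return start, start + frag_len
    end = start + read_len
    return max(0, end - frag_len), end

def coverage(intervals, chrom_len):
    """Per-base coverage of half-open intervals, via a difference array."""
    diff = [0] * (chrom_len + 1)
    for s, e in intervals:
        diff[min(s, chrom_len)] += 1
        diff[min(e, chrom_len)] -= 1
    cov, depth = [], 0
    for d in diff[:chrom_len]:
        depth += d
        cov.append(depth)
    return cov

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)

def depth_cutoff(lam, alpha):
    """Smallest depth k with P(X >= k) <= alpha under the Poisson background."""
    k = 1
    while poisson_sf(k, lam) > alpha:
        k += 1
    return k

def islands(cov, cutoff):
    """Contiguous runs with coverage >= cutoff, as (start, end, max_depth)."""
    out, start = [], None
    for pos, d in enumerate(cov + [0]):  # trailing 0 closes an open island
        if d >= cutoff and start is None:
            start = pos
        elif d < cutoff and start is not None:
            out.append((start, pos, max(cov[start:pos])))
            start = None
    return out

# Toy data (hypothetical): one cluster of reads plus two background reads.
reads = [(50, "+"), (52, "+"), (54, "+"), (56, "+"),
         (60, "-"), (62, "-"), (64, "-"), (120, "+"), (170, "-")]
fragments = [extend_read(s, strand, read_len=10, frag_len=30) for s, strand in reads]
cov = coverage(fragments, chrom_len=200)
lam = sum(cov) / len(cov)            # naive background rate: mean coverage
cutoff = depth_cutoff(lam, alpha=1e-3)
peaks = islands(cov, cutoff)
print(cutoff, peaks)
```

Dedicated peak callers such as MACS and SWEMBL estimate the fragment length and the background from the data and the control sample, which this sketch does not attempt.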
-
(2) GENOMIC CONTEXT
biomaRt, GenomicRanges (Bioconductor)
obtaining annotation (Ensembl)
overlaps with annotation (e.g. promoters)
enrichment of peaks in genomic areas (e.g. promoters) (not discussed here)
functional term enrichment (e.g. GREAT, McLean et al., Nat Biotechnol; not discussed here)
average profile plots on genomic feature/peak summit
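Overlapping peaks with annotation comes down to an interval-intersection test against strand-aware promoter windows. The sketch below shows that step in plain Python rather than with biomaRt and GenomicRanges. The gene names, TSS positions, peak coordinates and the promoter definition (1000 bp upstream, 200 bp downstream of the TSS) are all hypothetical values chosen for illustration.

```python
def promoter(tss, strand, upstream=1000, downstream=200):
    """Strand-aware promoter window around a TSS (half-open coordinates)."""
    if strand == "+":
        return max(0, tss - upstream), tss + downstream
    return max(0, tss - downstream), tss + upstream

def overlaps(a, b):
    """True if half-open intervals a and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]

def annotate(peaks, genes, upstream=1000, downstream=200):
    """Map each peak to the genes whose promoter window it overlaps."""
    result = {}
    for chrom, start, end in peaks:
        result[(chrom, start, end)] = [
            name for name, (g_chrom, tss, strand) in genes.items()
            if g_chrom == chrom
            and overlaps((start, end), promoter(tss, strand, upstream, downstream))
        ]
    return result

# Hypothetical annotation and peaks.
genes = {"geneA": ("chr1", 5000, "+"), "geneB": ("chr1", 20000, "-")}
peaks = [("chr1", 4500, 4700), ("chr1", 20500, 20650),
         ("chr1", 30000, 30200), ("chr2", 4500, 4700)]

annotated = annotate(peaks, genes)
fraction = sum(1 for hits in annotated.values() if hits) / len(peaks)
print(annotated)
print("fraction of peaks in promoters:", fraction)
```

The nested loop is quadratic in the number of peaks and genes; for genome-scale data an interval tree or a sorted sweep (as GenomicRanges provides) would be the usual choice.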
-
(3) MOTIF ANALYSIS
BSgenome, seqLogo, GenomicRanges (Bioconductor)
MEME (de novo motif discovery)
obtaining the peak sequences
de novo motif discovery
motif scanning: how many motifs per peak?
motif enrichment vs. background (not discussed here)
refining the PWM for a given factor
motif profile plot (distribution of motif around peak summit)
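Motif scanning, as listed above, can be illustrated with a small position weight matrix (PWM) built from aligned sites and slid along each peak sequence on both strands. This is a minimal sketch in plain Python, not the BSgenome/seqLogo/MEME workflow of the practical. The aligned sites, the peak sequences, the pseudocount, the uniform background and the score threshold are toy assumptions and do not represent a real factor's motif.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds PWM from equal-length aligned sites (uniform background)."""
    pwm = []
    for i in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def scan(seq, pwm, threshold):
    """Return (position, strand, score) for every window scoring >= threshold.
    Positions are reported on the forward strand (leftmost base of the hit)."""
    width = len(pwm)
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for i in range(len(s) - width + 1):
            score = sum(pwm[j][s[i + j]] for j in range(width))
            if score >= threshold:
                pos = i if strand == "+" else len(seq) - width - i
                hits.append((pos, strand, score))
    return sorted(hits)

# Toy aligned sites and peak sequences (hypothetical).
sites = ["TGACT", "TGACT", "TGAGT", "TGACT"]
pwm = build_pwm(sites)
peak_seqs = ["AATGACTGG", "CCAGTCACC", "ACGTACGTA"]

hits_per_peak = [scan(seq, pwm, threshold=5.0) for seq in peak_seqs]
motifs_per_peak = [len(hits) for hits in hits_per_peak]
print(motifs_per_peak)
```

Recording the hit positions relative to each peak summit, instead of only counting hits, gives the data needed for a motif profile plot.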
-
(4) DIFFERENTIAL ENRICHMENT
DESeq, GenomicRanges (Bioconductor)
defining regions of interest (ROIs)
obtaining counts per ROI (replicates & conditions)
estimating library sizes
estimating variation of counts per ROI
calling differentially modified regions (negative binomial distribution)
overview of significantly modified regions
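The library-size step can be illustrated with a median-of-ratios estimate, in the spirit of the size factors described for DESeq: each sample's counts are compared with the per-ROI geometric mean, and the median ratio is taken as that sample's scaling factor. The sketch below is plain Python with hypothetical counts and function names. It stops at normalized log2 fold changes and does not estimate dispersion or run the negative binomial test, which is what DESeq itself provides.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.
    counts: one row per ROI, one column per sample.
    ROIs with a zero count in any sample are skipped."""
    n = len(counts[0])
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo = [sum(math.log(c) for c in row) / n for row in usable]
    return [
        math.exp(median([math.log(row[j]) - g for row, g in zip(usable, log_geo)]))
        for j in range(n)
    ]

def log2_fold_change(row, factors, group_a, group_b, pseudo=0.5):
    """log2(B / A) of mean normalized counts; pseudo avoids division by zero."""
    norm = [c / f for c, f in zip(row, factors)]
    mean_a = sum(norm[i] for i in group_a) / len(group_a)
    mean_b = sum(norm[i] for i in group_b) / len(group_b)
    return math.log2((mean_b + pseudo) / (mean_a + pseudo))

# Hypothetical counts: columns are A1, A2, B1, B2; A2 and B2 were sequenced
# twice as deeply. The fourth ROI is about 4x higher in condition B.
counts = [
    [10, 20, 10, 20],
    [50, 100, 50, 100],
    [30, 60, 30, 60],
    [8, 16, 32, 64],
    [0, 0, 5, 10],
]
factors = size_factors(counts)
lfc = [log2_fold_change(row, factors, group_a=[0, 1], group_b=[2, 3]) for row in counts]
print([round(f, 3) for f in factors])
print([round(x, 2) for x in lfc])
```

Using the median makes the size factors robust to a minority of strongly changed ROIs, which is why the fourth row does not distort the estimate here.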
-
http://www.ebi.ac.uk/~schwalie/chipseqprac_0311/chipseq_practical.pdf
-
(1) PEAK-CALLING
-
PEAK ANALYSIS
-
(3) MOTIF ANALYSIS
[Figure panels: motif discovery; motif profile; motifs/peaks (MACS, SWEMBL)]
-
(4) DIFFERENTIAL HISTONE MODIFICATION
-
CHIP-SEQ REVIEWS + BENCHMARKS
ChIP-seq: advantages and challenges of a maturing technology (Park, Nat Rev Genet 2009)
Computation for ChIP-seq and RNA-seq studies (Pepke et al, Nat Methods 2009)
Design and analysis of ChIP-seq experiments for DNA-binding proteins (Kharchenko et al, Nat Biotechnol 2008)
Q&A: ChIP-seq technologies and the study of gene regulation (Liu et al, BMC Biol 2010)
Evaluation of algorithm performance in ChIP-seq peak detection (Wilbanks & Facciotti, PLoS ONE 2010)
A practical comparison of methods for detecting transcription factor binding sites in ChIP-seq experiments (Laajala et al, BMC Genomics 2009)