Regulatory variation and its functional consequences
Chris Cotsapas
![Page 2: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/2.jpg)
Motivating questions
• How do phenotypes vary across individuals?
– Regulatory changes drive cellular and organismal traits
– Likely also drive evolutionary differences
• How are genes (co)regulated?
– Pathways, processes, contexts
![Page 3: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/3.jpg)
![Page 4: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/4.jpg)
![Page 5: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/5.jpg)
Regulatory variation
• What do “interesting” variants do?
• Genetic changes to:
– Coding sequence **
– Gene expression levels
– Splice isomer levels
– Methylation patterns
– Chromatin accessibility
– Transcription factor binding kinetics
– Cell signaling
– Protein-protein interactions
~88% of GWAS hits are regulatory
![Page 6: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/6.jpg)
Genetic variation alters regulation
• Protein levels
– Maize (Damerval 94)
• Expression levels
– Yeast, maize, mouse, humans (Brem 02, Schadt 03, Stranger 05, Stranger 07)
• RNA splicing
– Humans (Pickrell 12, Lappalainen 13)
• Methylation and DNase I peak strength
– Humans (Degner 12; Gibbs 12)
![Page 7: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/7.jpg)
Genetics of gene expression (eQTL)
• cis-eQTL
– The position of the eQTL maps near the physical position of the gene.
– Promoter polymorphism?
– Insertion/deletion?
– Methylation, chromatin conformation?
• trans-eQTL
– The position of the eQTL does not map near the physical position of the gene.
– Regulator?
– Direct or indirect?
Modified from Cheung and Spielman 2009 Nat Gen
![Page 8: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/8.jpg)
Cis-eQTL analysis: test SNPs within a pre-defined distance of the gene
[Figure: SNPs within a 1 Mb window on either side of the gene (expression probe)]
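The window rule on this slide can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the talk: the `Gene` class, the `cis_snps` function, the SNP tuples and the coordinates are all hypothetical, and the 1 Mb flank on each side of the gene is taken from the figure.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    # Hypothetical gene record; coordinates are base-pair positions.
    name: str
    chrom: str
    start: int
    end: int


def cis_snps(gene, snps, window=1_000_000):
    """Return IDs of SNPs on the gene's chromosome within `window` bp of the gene."""
    lo = gene.start - window
    hi = gene.end + window
    return [snp_id for snp_id, chrom, pos in snps
            if chrom == gene.chrom and lo <= pos <= hi]


# Made-up example data: (SNP id, chromosome, position).
gene = Gene("GENE_A", "5", 40_000_000, 40_050_000)
snps = [
    ("rs_a", "5", 39_200_000),  # 0.8 Mb upstream: inside the window
    ("rs_b", "5", 41_500_000),  # about 1.45 Mb downstream: outside
    ("rs_c", "2", 40_010_000),  # different chromosome: outside
    ("rs_d", "5", 40_900_000),  # 0.85 Mb downstream: inside
]
selected = cis_snps(gene, snps)
print(selected)
```

Only the SNPs returned here would be tested against the gene's expression in a cis analysis; everything else is left to a trans scan.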
![Page 9: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/9.jpg)
![Page 10: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/10.jpg)
QT association
• Analysis of the relationship between a dependent or outcome variable (phenotype) with one or more independent or predictor variables (SNP genotype)
Linear Regression Equation
$$Y_i = b_0 + b_1 X_i + e_i$$
Logistic Regression Equation
$$\ln\left(\frac{p_i}{1 - p_i}\right) = b_0 + b_1 X_i + e_i$$
[Figure: continuous trait value against number of A1 alleles (0, 1, 2); intercept b0, slope b1]
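As a minimal sketch of the linear model above, the intercept b0 and slope b1 can be estimated by ordinary least squares on the number of A1 alleles. The function name `qt_association` and the genotype and trait values are made up for illustration; a real analysis would also report a test statistic and include covariates.

```python
import numpy as np


def qt_association(genotypes, trait):
    """Fit Y_i = b0 + b1 * X_i + e_i by least squares; return b0, b1 and residuals."""
    x = np.asarray(genotypes, dtype=float)   # number of A1 alleles: 0, 1 or 2
    y = np.asarray(trait, dtype=float)       # continuous trait value
    design = np.column_stack([np.ones_like(x), x])
    (b0, b1), *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - (b0 + b1 * x)
    return b0, b1, residuals


# Illustrative data: six individuals, two per genotype class.
genotypes = [0, 0, 1, 1, 2, 2]
trait = [1.0, 1.2, 1.6, 1.4, 2.1, 1.9]
b0, b1, residuals = qt_association(genotypes, trait)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

The slope b1 is the estimated change in trait value per additional copy of the A1 allele.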
![Page 11: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/11.jpg)
eQTL analysis: a GWAS for every gene
[Figure: one genome-wide association scan per gene (gene 1, gene 2, ..., gene N)]
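"A GWAS for every gene" amounts to testing every SNP against every expression trait. A rough sketch, assuming a simple correlation test with no covariates: the function `eqtl_scan` and the simulated data are hypothetical, and the t statistic is the standard transform of a Pearson correlation.

```python
import numpy as np


def eqtl_scan(genotypes, expression):
    """Correlate every gene with every SNP.

    genotypes:  samples x SNPs matrix of allele counts
    expression: samples x genes matrix of expression levels
    Returns (r, t), each of shape genes x SNPs.
    Assumes no SNP or gene is constant across samples.
    """
    g = np.asarray(genotypes, dtype=float)
    e = np.asarray(expression, dtype=float)
    n = g.shape[0]
    gz = (g - g.mean(axis=0)) / g.std(axis=0)
    ez = (e - e.mean(axis=0)) / e.std(axis=0)
    r = ez.T @ gz / n                          # Pearson correlations
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))  # t statistic with n - 2 df
    return r, t


# Simulated example: 50 samples, 4 SNPs, 3 genes; SNP 2 drives gene 0.
rng = np.random.default_rng(1)
genotypes = rng.binomial(2, 0.3, size=(50, 4))
expression = rng.normal(scale=0.3, size=(50, 3))
expression[:, 0] += genotypes[:, 2]
r, t = eqtl_scan(genotypes, expression)
print(np.round(r, 2))
```

Each row of `r` is one gene's scan across all SNPs, which is why the multiple-testing burden grows with the number of genes as well as the number of SNPs.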
![Page 12: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/12.jpg)
cis-eQTLs are rather common
Nica et al PLoS Genet 2011
![Page 13: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/13.jpg)
Cis-eQTLs cluster around TSS
Stranger et al PLoS Genet 2012
![Page 14: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/14.jpg)
trans hotspots (yeast)
Brem et al Science 2002
![Page 15: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/15.jpg)
Yvert et al Nat Genet 2003
![Page 16: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/16.jpg)
DOES REGULATORY VARIATION ALTER PHENOTYPE? APPLICATION TO GWAS
Candidate genes, perturbations underlying organismal phenotypes
![Page 17: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/17.jpg)
Rationale
• How do disease/trait variants actually alter biology?
• If they change regulation, then:
– Change in gene expression/isoform use
– Phenotypic consequence*
![Page 18: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/18.jpg)
Compare patterns of association
[Figure: GWAS peak compared with eQTL for gene 1 and eQTL for gene 2]
![Page 19: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/19.jpg)
Pearson’s covariance for windows of 51 SNPs between –log(p) in 2 traits
[Figure: CD GWAS p against eQTL p]
• Detect a peak when the effect is the same
• No peak when there are independent hits near each other
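The windowed comparison can be sketched as follows. This is an illustration under stated assumptions, not the analysis code behind the slide: it uses the Pearson correlation of -log10(p) within each 51-SNP window (the slide says "covariance" and does not give the log base), and `windowed_correlation`, `simulate_p` and the simulated p-values are all hypothetical.

```python
import numpy as np


def windowed_correlation(p_trait1, p_trait2, window=51):
    """Pearson correlation of -log10(p) for two traits in sliding SNP windows.

    Returns one value per SNP, centred on that SNP; NaN near the ends.
    """
    x = -np.log10(np.asarray(p_trait1, dtype=float))
    y = -np.log10(np.asarray(p_trait2, dtype=float))
    half = window // 2
    out = np.full(x.size, np.nan)
    for i in range(half, x.size - half):
        xs = x[i - half:i + half + 1]
        ys = y[i - half:i + half + 1]
        out[i] = np.corrcoef(xs, ys)[0, 1]
    return out


# Simulated region of 200 SNPs: null background plus one association peak.
rng = np.random.default_rng(0)
n_snps = 200
idx = np.arange(n_snps)


def simulate_p(peak_at):
    background = -np.log10(rng.uniform(size=n_snps))
    bump = 6.0 * np.exp(-((idx - peak_at) / 10.0) ** 2)
    return 10.0 ** -(background + bump)


gwas = simulate_p(100)
eqtl_same = simulate_p(100)    # same underlying effect as the GWAS
eqtl_other = simulate_p(160)   # independent nearby hit
shared = windowed_correlation(gwas, eqtl_same)[100]
distinct = windowed_correlation(gwas, eqtl_other)[100]
print(f"same effect: {shared:.2f}, independent hit: {distinct:.2f}")
```

In this toy region the statistic is high at the GWAS peak only when the eQTL signal has the same shape across the window, which is the behaviour the slide describes.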
![Page 20: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/20.jpg)
Crohn’s/eQTL analysis
• CD meta-analysis (GWAS only)
• CEU HapMap LCL eQTL data
• Overlapping SNPs only (eQTL data has 610K SNPs, most in CD meta-analysis)
• Test 133 associations (total 1054 tests)
[Figure: GWAS peak compared with eQTL for gene 1 and eQTL for gene 2]
![Page 21: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/21.jpg)
Crohn’s/eQTL analysis

| SNP | CHR | Gene |
|---|---|---|
| rs11742570 | 5 | PTGER4 |
| rs12994997 | 2 | ATG16L1 |
| rs11401 | 16 | SPNS1 |
| rs10781499 | 9 | INPP5E |
| rs2266959 | 2 | C22orf29 |

A peak implies that the same effect drives GWAS and eQTL
![Page 22: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/22.jpg)
![Page 23: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/23.jpg)
MS/eQTL analysis

| SNP | CHR | Gene |
|---|---|---|
| rs6880778 | 5 | PTGER4 |
| rs7132277 | 12 | CDK2AP |
| rs7665090 | 4 | CISD2 |
| rs2255214 | 3 | GOLGB1 & EAF2 |
| rs201202118 | 12 | METTL1 & TSFM |
| rs12946510 | 17 | ORMDL3, STARD3 & ZPBP2 |
| rs2283792 | 22 | PPM1F |
| rs7552544 | 1 | SLC30A7 |
| rs34536443 | 19 | SLC44A2 |

A peak implies that the same effect drives GWAS and eQTL
![Page 24: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/24.jpg)
![Page 25: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/25.jpg)
![Page 26: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/26.jpg)
DOES REGVAR REVEAL CO-REGULATION? A.K.A. WHERE ARE THE TRANS eQTLS?
Open question
![Page 27: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/27.jpg)
Whole-genome eQTL analysis is an independent GWAS for expression of each gene
[Figure: one genome-wide association scan per gene (gene 1, gene 2, ..., gene N)]
![Page 28: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/28.jpg)
Issues with trans mapping
• Power
– Genome-wide significance is 5e-8
– Multiple testing on ~20K genes
– Sample sizes clearly inadequate
• Data structure
– Bias corrections deflate variance
– Non-normal distributions
• Sample sizes
– Far too small
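To put numbers on the power problem, here is a small worked calculation. It assumes a Bonferroni-style correction across genes, which the slide does not specify; the variable names are illustrative.

```python
# Genome-wide significance for a single trait, from the slide.
genome_wide_alpha = 5e-8
# Approximate number of expression traits, from the slide.
n_genes = 20_000

# Assumption: Bonferroni correction across genes on top of the
# genome-wide threshold that already accounts for SNPs.
per_test_alpha = genome_wide_alpha / n_genes

# Number of independent SNP tests implied by 5e-8 at a 0.05 family-wise level.
implied_snp_tests = 0.05 / genome_wide_alpha

print(f"per-test threshold: {per_test_alpha:.1e}")
print(f"implied independent SNP tests: {implied_snp_tests:.0f}")
```

Under that assumption each SNP-gene test would need p below about 2.5e-12, which is out of reach for trans effects at sample sizes of around a hundred individuals.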
![Page 29: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/29.jpg)
But…
• Assume that trans eQTLs affect many genes…
• …and you can use cross-trait methods!
![Page 30: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/30.jpg)
Association data
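$$
\begin{pmatrix}
Z_{1,1} & Z_{1,2} & \cdots & Z_{1,p} \\
Z_{2,1} & \ddots & & \vdots \\
\vdots & & \ddots & \vdots \\
Z_{s,1} & \cdots & \cdots & Z_{s,p}
\end{pmatrix}
$$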
![Page 31: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/31.jpg)
Cross-phenotype meta-analysis
$$S_{CPMA} \sim \frac{L(\text{data} \mid \lambda \neq 1)}{L(\text{data} \mid \lambda = 1)}$$
Cotsapas et al, PLoS Genetics
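A minimal sketch of a likelihood-ratio statistic of this form follows. The slide gives only the ratio, so the details here are assumptions: that -ln(p) across traits is modelled as exponential with rate λ, that λ = 1 corresponds to uniform null p-values, and that twice the log-likelihood ratio is compared to a chi-square distribution with one degree of freedom. The function name `cpma` and the example p-values are hypothetical.

```python
import math


def cpma(p_values):
    """Likelihood-ratio test of lambda != 1 against lambda = 1.

    Assumes x = -ln(p) is exponential with rate lambda across traits.
    Returns (lambda_hat, statistic, chi-square 1-df p-value).
    """
    x = [-math.log(p) for p in p_values]
    n = len(x)
    total = sum(x)
    lam_hat = n / total  # maximum-likelihood estimate of the rate
    # 2 * [loglik(lam_hat) - loglik(1)], with loglik(lam) = n*ln(lam) - lam*total
    stat = max(2.0 * (n * math.log(lam_hat) - n + total), 0.0)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return lam_hat, stat, p_value


# One SNP tested against ten traits (made-up p-values).
null_like = [math.exp(-1.0)] * 10      # mean of -ln(p) is 1, so lambda_hat = 1
enriched = [1e-4] * 5 + [0.5] * 5      # excess of small p-values
lam_null, stat_null, p_null = cpma(null_like)
lam_enr, stat_enr, p_enr = cpma(enriched)
print(f"null-like: lambda = {lam_null:.2f}, S = {stat_null:.2f}")
print(f"enriched:  lambda = {lam_enr:.2f}, S = {stat_enr:.2f}, p = {p_enr:.1e}")
```

An estimated rate below 1 means the SNP has more small p-values across traits than the null predicts, which is the signature expected of a variant affecting many genes.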
![Page 32: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/32.jpg)
CPMA for correlated traits
• Empirical assessment to account for correlation
• Simulate Z scores under covariance, recalculate CPMA
• Construct distribution of CPMA for dataset, call significance
with Ben Voight, U Penn
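The three steps above can be sketched as a simulation. This is an illustration only: it assumes Z scores for one SNP across traits are multivariate normal with the trait covariance under the null, that p-values are two-sided, and that the statistic is the exponential-rate likelihood ratio. All function names and the covariance matrix are hypothetical.

```python
import math
import numpy as np

_erfc = np.vectorize(math.erfc)


def z_to_p(z):
    """Two-sided p-values for standard normal Z scores."""
    return _erfc(np.abs(z) / math.sqrt(2.0))


def cpma_stat(p_values):
    """Likelihood-ratio statistic, assuming -ln(p) is exponential with rate lambda."""
    x = -np.log(np.clip(p_values, 1e-300, 1.0))
    n, total = x.size, float(x.sum())
    return max(2.0 * (n * math.log(n / total) - n + total), 0.0)


def empirical_cpma_p(observed_z, trait_cov, n_sim=500, seed=0):
    """Compare the observed statistic with statistics simulated under correlation."""
    rng = np.random.default_rng(seed)
    observed_z = np.asarray(observed_z, dtype=float)
    observed = cpma_stat(z_to_p(observed_z))
    sims = rng.multivariate_normal(np.zeros(observed_z.size), trait_cov, size=n_sim)
    null = np.array([cpma_stat(z_to_p(row)) for row in sims])
    # Add-one estimate so the empirical p-value is never exactly zero.
    p_emp = (1 + np.sum(null >= observed)) / (n_sim + 1)
    return observed, null, p_emp


# Eight traits with pairwise correlation 0.5 (made-up covariance).
n_traits = 8
trait_cov = np.full((n_traits, n_traits), 0.5) + 0.5 * np.eye(n_traits)
observed, null, p_emp = empirical_cpma_p([4.0] * n_traits, trait_cov)
print(f"observed S = {observed:.1f}, empirical p = {p_emp:.4f}")
```

Because the null statistics are generated with the same correlation structure as the data, a SNP is called significant only if it exceeds what correlated expression traits produce by chance.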
![Page 33: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/33.jpg)
Experimental design
• 610,180 SNPs, MAF > 0.15 in CEU and YRI, LD pruned (r2 < 0.2)
• 8368 transcripts detectable on Illumina arrays
• 108 CEU individuals* and 109 YRI individuals*
• Association in each population with plink (Transcript ~ SNP, sex): CEU p-values and YRI p-values
• CPMA on each set of p-values: CEU CPMA scores and YRI CPMA scores
• Keep scores above the 95th percentile of simulated CPMA
* Stranger et al Nat Genet 2007 (LCL data; publicly available)
![Page 34: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/34.jpg)
Target sets of genes
• trans-acting variant: SNP with CPMA evidence
• Target genes: genes affected by trans-acting variant (i.e. regulon)
![Page 35: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/35.jpg)
Prediction 1
• Allelic effects should be conserved between two populations
– Binomial test on paired observations for all genes with P < 0.05 in at least one population
True for 1124/1311 SNPs (binomial p < 0.05)
[Figure: genes with pCEU < 0.05 and genes with pYRI < 0.05, with effect directions per gene. CEU: + + - - +; YRI: + + - - + or - - + + -]
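A binomial test on paired effect directions might look like the sketch below. It is an assumption-laden illustration: it counts genes whose effect has the same sign in both populations and tests that count against a chance rate of 0.5, one-sided. The slide does not say whether the test was one- or two-sided, and the function name and effect sizes are made up.

```python
from math import comb


def sign_concordance_test(effects_pop1, effects_pop2):
    """One-sided exact binomial test that effect directions agree more than chance.

    Returns (concordant genes, genes tested, p-value) under a null rate of 0.5.
    """
    pairs = [(a, b) for a, b in zip(effects_pop1, effects_pop2)
             if a != 0 and b != 0]
    n = len(pairs)
    k = sum(1 for a, b in pairs if (a > 0) == (b > 0))
    # P(X >= k) for X ~ Binomial(n, 0.5)
    p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return k, n, p_value


# Made-up allelic effects for ten target genes of one SNP.
ceu = [0.4, 0.2, -0.3, -0.1, 0.5, 0.3, -0.2, 0.6, 0.1, -0.4]
yri = [0.3, 0.1, -0.2, -0.3, 0.4, 0.2, -0.1, 0.5, -0.2, -0.3]
k, n, p_value = sign_concordance_test(ceu, yri)
print(f"{k}/{n} concordant, binomial p = {p_value:.4f}")
```

Running this once per candidate SNP gives the per-SNP binomial p-values that the slide summarises as 1124/1311.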
![Page 36: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/36.jpg)
Prediction 2
• Target genes should overlap
– Identify by mixture of Gaussians classification
– Empirical p from distribution of overlaps between NCEU and NYRI genes across SNPs.
True for 600/1311 SNPs (empirical p < 0.05)
[Figure: overlap between genes with pCEU < 0.05 and genes with pYRI < 0.05]
![Page 37: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/37.jpg)
What about the target genes?
• Regulons:
– Encode proteins more connected than expected by chance
www.broadinstitute.org/mpg/dapple.php
Rossin et al 2011 PLoS Genetics
![Page 38: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/38.jpg)
What about the target genes?
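• Regulons:
– Encode proteins enriched for TF targets (ENCODE LCL data)
– 24/67 filtered TFs significant
– Binomial overlap test

| TF | p-value |
|---|---|
| CEBPB | $3.7 \times 10^{-142}$ |
| HDAC8 | $7.8 \times 10^{-122}$ |
| FOS | $2.5 \times 10^{-96}$ |
| JUND | $3.7 \times 10^{-88}$ |
| NFYB | $3.3 \times 10^{-71}$ |
| ETS1 | $3.8 \times 10^{-63}$ |
| FAM48A | $2.1 \times 10^{-61}$ |
| FOXA1 | $1.4 \times 10^{-33}$ |
| GATA1 | $4.6 \times 10^{-33}$ |
| HEY1 | $7.8 \times 10^{-32}$ |

[Figure: overlap between trans target genes and ChIP-seq LCL target genes]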
![Page 39: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/39.jpg)
Summary
• Regulatory variation is common
• It affects gene expression levels
• Likely many other types:
– DNA accessibility, chromatin states
– Transcript splicing, processing, turnover
• Has phenotypic consequences
– GWAS
– Some cellular assays (not discussed here)
![Page 40: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/40.jpg)
Open questions
• Discover regulatory elements (cis)
– Promoters, enhancers etc
• Gene regulatory circuits (trans)
• Dynamics of regulation
– Splicing variation, processing, degradation
• Phenotypic consequences
– Cellular assays required
• Tie in to organismal phenotype
![Page 41: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/41.jpg)
NEXT-GEN SEQUENCING DATA
RNAseq, GTEx
![Page 42: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/42.jpg)
GTEx – Genotype-Tissue EXpression
An NIH Common Fund project
Current: 35 tissues from 50 donors
Scale up: 20K tissues from 900 donors.
Novel methods groups: 5 current + RFA
![Page 43: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/43.jpg)
How can we make RNAseq useful?
• Standard eQTLs
– Montgomery et al, Pickrell et al Nature 2010
• Isoform eQTLs
– Depth of sequence!
• Long genes are preferentially sequenced
• Abundant genes/isoforms ditto
• Power!?
• Mapping biases due to SNPs
![Page 44: Regulatory variation and its functional consequences](https://reader036.vdocuments.site/reader036/viewer/2022062501/568166be550346895ddac66e/html5/thumbnails/44.jpg)
RNAseq combined with other techs
• Regulons: TF gene sets via ChIP-seq
– Look for trans effects
• Open chromatin states (DNase I; methylation)
– Find active genes
– Changes in epigenetic marks correlated to RNA
– Genetic effects
• RNA/DNA comparisons
– Simultaneous SNP detection/genotyping
– RNA editing ???