massively parallel functional analysis of missense mutations in … · massively parallel...

39
Lea Starita, PhD @lea_starita Shendure and Fields Labs Department of Genome Sciences University of Washington Massively parallel functional analysis of missense mutations in BRCA1 for interpreting variants of uncertain significance

Upload: others

Post on 11-May-2020

17 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Lea Starita, PhD@lea_starita

Shendure and Fields LabsDepartment of Genome Sciences

University of Washington

Massively parallel functional analysis of missense mutations in BRCA1 for interpreting variants of uncertain significance

Page 2: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

~350 Variants of Uncertain Significance (VUS)

Freq

uenc

yVariants of uncertain significance (VUS)

Page 3: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Computational prediction

ThroughputGenetic analysis or one-off experiments

Validity

Massively parallel functional analysis

+++ +

+ +++

++++++

How do we interpret the impact of genetic variation at scale?

Page 4: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Massively parallel functional assaysfor assessing function of missense variants

Generate a library with mutations in a sequence of interest

Multiplexed functional assay

Quantify effect sizes of individual variants

var-1var-2

var-3

var-4

var-1var-2

var-3

var-4

Co-transfection

CRISPR

HDR Library

Array-synthesized mutations

Haploid Cells

Population of Cells w/ Many Different Edits

var-1

var-2

var-4

var-3

Co-transfection

CRISPR

HDR Library

Array-synthesized mutations

Haploid Cells

Population of Cells w/ Many Different Edits

var-1

var-2

var-4

var-3

var-2

var-4

var-4var-4

effect size

Sequence variants in input and selected populations

Page 5: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

BARD1RING

BRCA1RING

Biochemical functions of BRCA1

BRCA11-102

BRCA1 is required for homology-directed dsDNA break repair (HDR)

BRCA1 HDR activity is required for tumor suppression

BRCA1 must dimerize with BARD1 to function in HDR

The BRCA1:BARD1 dimer has ubiquitin ligase activity

Page 6: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

BARD1RING

BRCA1RING

Multiplex assays for BRCA1 protein function and splicing

Experiments 1 and 2:BARD1-BRCA1-RING E3 ligase activityBARD1-BRCA1-RING interaction

Experiment 3: Saturation genome editing to assess the effect of SNVs on splicing. exon 17 exon 19exon 18

CRISPRHDR Library (N = 4,096)

Page 7: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

BARD1RING

BRCA1RING

Multiplex assays for BRCA1 protein function and splicing

Experiments 1 and 2:BARD1-BRCA1-RING E3 ligase activityBARD1-BRCA1-RING interaction

Experiment 3: Saturation genome editing to assess the effect of SNVs on splicing. exon 17 exon 19exon 18

CRISPRHDR Library (N = 4,096)

Page 8: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

E3 ligase activity

variant library

BRCA1T7 coat BARD1(26-126) BRCA1(2-304)

A

ATP, E1, E2Flag-Ub capture elute

deep sequencing

calculate variant frequency

deep sequencing

calculate selected/input ratio for each variant for each round

calculate slope of log2 ratios over 5 rounds of selection

5XB

Massively parallel assays for the BRCA1-RINGE3 ligase and BARD1-binding activities

Page 9: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

E3 ligase activity

variant library

BRCA1T7 coat BARD1(26-126) BRCA1(2-304)

A

ATP, E1, E2Flag-Ub capture elute

deep sequencing

calculate variant frequency

deep sequencing

calculate selected/input ratio for each variant for each round

calculate slope of log2 ratios over 5 rounds of selection

5XB

BARD1-binding activity

selection in -histidine

transformation, Time 1 Time 2 Time 3deep sequencing deep sequencing deep sequencing deep sequencing

Time 0

B

Massively parallel assays for the BRCA1-RINGE3 ligase and BARD1-binding activities

Page 10: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

predictionmodel

How can we leverage these measurements to estimatethe likelihood that a BRCA1 variant would be pathogenic?

Page 11: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

to understand the homology-directed DNA repair (HDR) function of BRCA1?

predictionmodel

Ransburgh et al. Cancer Research 2010

HDR activity?

How can we leverage these measurements to estimatethe likelihood that a BRCA1 variant would be pathogenic?

Page 12: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Experimental data build a better predictor of BRCA1 HDR function

0.0 0.2 0.4 0.6

PolyPhen-2

CADD

Grantham

Align-GVGD

E3 + BARD1-binding

E3 + BARD1-binding

+ A-GVGD

Leave-One-Out Cross Validation R2

Page 13: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

pathogenic

VUS

benign

0.0 1.00.33 0.770

2

4

0

2

4

0

2

4

HDR predictions for clinical BRCA1 variants

Predicted HDR score

Cou

nts

Cou

nts

Cou

nts

splice

High HDR functionLow HDR function

Page 14: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

pathogenicVUS

benign

0.0 1.00.33 0.77

0

50

100

HDR predictions for 1,287 BRCA1 variants not yet seen in patients

Predicted HDR score

High HDR functionLow HDR function

Cou

nts

Page 15: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

HDR predictions for 1,287 BRCA1 variants not yet seen in patients

Cou

nts

High HDR functionLow HDR function

pathogenicVUS

benign

not yet seen

0.0 1.00.33 0.77

0

50

100

Predicted HDR score

Page 16: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

BARD1RING

BRCA1RING

Multiplex assays for BRCA1 protein function and splicing

Experiments 1 and 2:BARD1-BRCA1-RING E3 ligase activityBARD1-BRCA1-RING interaction

Experiment 3: Saturation genome editing to assess the effect of SNVs on splicing. exon 17 exon 19exon 18

CRISPRHDR Library (N = 4,096)

Page 17: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Cas9/gRNA

Edited repair templates(N = 4,096)

Multiplex genome editing to measurethe effects of SNVs on splicing

1. CRISPR-Cas9constructtargetingBRCA1 exon18

2. RepairtemplatelibrarytosubstituteSNVs withintheexon.

Findlay, Boyleetal.,Nature (2014).

Page 18: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Co-transfection,Multiplex editing

Cas9/gRNA

Edited repair templates(N = 4,096)

Multiplex genome editing to measurethe effects of SNVs on splicing

Heterogeneouspopulationofeditedcells

5 days, collect gDNA& RNA “SelectivePCR”– onlyeditedgDNA andcDNA

Page 19: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Sequence, gDNA and cDNA Variant Counts

gDNA cDNA

Co-transfection,Multiplex editing

Cas9/gRNA

Edited repair templates(N = 4,096)

5 days, collect gDNA& RNA

Multiplex genome editing to measurethe effects of SNVs on splicing

CountvariantgDNA andcDNA

cDN

A/ g

DN

A

Enrichment Scores

Rank-ordered variants

Calculateeffectsonsplicingforeachvariant

Page 20: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Variants that create splice enhancers and silencers or trigger nonsense-mediated decay behave as expected

Nature nature13695.3d 8/8/14 14:12:55

Co-transfection,multiplex editing

CRISPR

Random hexamerHDR library

5 days, gDNA &RNA prep (with RT)

ATGCTGAGTTTGTGTGTGAA CGG AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGCTGAGTTTGTGTG T GAACGG AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGCTGAGTTTGTGTGTGAA CGG AC

Endogenous locus

Donor plasmidswith random hexamers

Selective PCR

gDNA cDNA

Sequencing

Genome editingin many cells

Transcriptionand splicing

CRISPR

Hexamer Count gDNA cDNA

Enrichment scores

cDN

A/g

DN

A

a

Endogenous loci

WT

Edited

Rank-ordered hexamers

Log 2

enric

hmen

t - re

plic

ate

2a

Log 2

enric

hmen

t

Log2 enrichment - replicate 1a

–10

–5

0

5

ESE

ESS

Nonsense

Enrichment score rank

b

c

R = 0.659

–15

–10

–5

0

5

–15 –10 –5 0 5

ESEESSNonsense

Figure 1 | Saturation genome editing and multiplex functional analysis of ahexamer region influencing BRCA1 splicing. a, Experimental schematic.Cultured cells were co-transfected with a single Cas9-sgRNA construct(CRISPR) and a complex homology-directed repair (HDR) library containingan edited exon that harbours a random hexamer (blue, green, orange) and afixed selective PCR site (red). CRISPR-induced cutting stimulated homologousrecombination with the HDR library, inserting mutant exons into the genomesof many cells. At five days post-transfection, cells were harvested for gDNA

and RNA. After reverse transcription, selective PCR was performed followed bysequencing of gDNA- and cDNA-derived amplicons. Hexamer enrichmentscores were calculated by dividing cDNA counts normalized by gDNA counts.b, Correlation of enrichment scores between biological replicates for hexamersobserved in each experiment with positions of previously identified14 exonicsplicing enhancers (ESEs), exonic splicing silencers (ESSs) and stop codonsindicated. c, Rank-ordered plot of enrichment scores with positions of ESEs,ESSs and stop codons indicated.

a b

cLibrary R replicate 1 effect size

Libr

ary

R re

plic

ate

2 ef

fect

siz

e

–5–4–3–2–101

BRCA1 exon 18 position

Effe

ct s

ize

1 2 3 4 5 6 7 8 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77

A C G T

R = 0.847

–5

–4

–3

–2

–1

0

1

Library R effect size

Libr

ary

R2

effe

ct s

ize

nonsensesense

R = 0.846

–5

–4

–3

–2

–1

0

1

–5 –4 –3 –2 –1 0 1 –5 –4 –3 –2 –1 0 1

Figure 2 | Multiplex homology-directed repair reveals effects of singlenucleotide variants on transcript abundance. Three separate HDR libraries(R, R2, and L) containing a 3% mutation rate (97% WT, 1% each non-WT base)in either half of BRCA1 exon 18 were introduced to the genome via co-transfection with pCas9-sgBRCA1x18. Enrichment scores were calculated foreach haplotype observed at least 10 times in the gDNA, and effect sizes of SNVswere determined by weighted linear regression modelling. ‘Sense’ includesboth missense and synonymous SNVs. a, Effect sizes calculated from replicatetransfections of HDR library R, consisting of a 3% per-nucleotide mutation rate

in the 39-most 39 bases and the same selective PCR site used in Fig. 1, werehighly correlated (R 5 0.846). b, Library R2 harboured a selective PCR sitecomposed of 5 synonymous changes, none of which are present in library R.When effect sizes derived from experiments with library R2 were plottedagainst those from library R, there was a strong correlation (R 5 0.847),indicating reproducibility and demonstrating that differences between selectivePCR sites did not strongly influence scores. c, Effect sizes for SNVs across theexon are displayed. Data sets from libraries R and L were combined to spanthe entire exon. Dashed lines represent SNVs that introduce nonsense codons.

RESEARCH LETTER

2 | N A T U R E | V O L 0 0 0 | 0 0 M O N T H 2 0 1 4

Nature nature13695.3d 8/8/14 14:12:55

Co-transfection,multiplex editing

CRISPR

Random hexamerHDR library

5 days, gDNA &RNA prep (with RT)

ATGCTGAGTTTGTGTGTGAA CGG AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGCTGAGTTTGTGTG T GAACGG AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGCTGAGTTTGTGTGTGAA CGG AC

Endogenous locus

Donor plasmidswith random hexamers

Selective PCR

gDNA cDNA

Sequencing

Genome editingin many cells

Transcriptionand splicing

CRISPR

Hexamer Count gDNA cDNA

Enrichment scores

cDN

A/g

DN

A

a

Endogenous loci

WT

Edited

Rank-ordered hexamers

Log 2

enric

hmen

t - re

plic

ate

2a

Log 2

enric

hmen

t

Log2 enrichment - replicate 1a

–10

–5

0

5

ESE

ESS

Nonsense

Enrichment score rank

b

c

R = 0.659

–15

–10

–5

0

5

–15 –10 –5 0 5

ESEESSNonsense

Figure 1 | Saturation genome editing and multiplex functional analysis of ahexamer region influencing BRCA1 splicing. a, Experimental schematic.Cultured cells were co-transfected with a single Cas9-sgRNA construct(CRISPR) and a complex homology-directed repair (HDR) library containingan edited exon that harbours a random hexamer (blue, green, orange) and afixed selective PCR site (red). CRISPR-induced cutting stimulated homologousrecombination with the HDR library, inserting mutant exons into the genomesof many cells. At five days post-transfection, cells were harvested for gDNA

and RNA. After reverse transcription, selective PCR was performed followed bysequencing of gDNA- and cDNA-derived amplicons. Hexamer enrichmentscores were calculated by dividing cDNA counts normalized by gDNA counts.b, Correlation of enrichment scores between biological replicates for hexamersobserved in each experiment with positions of previously identified14 exonicsplicing enhancers (ESEs), exonic splicing silencers (ESSs) and stop codonsindicated. c, Rank-ordered plot of enrichment scores with positions of ESEs,ESSs and stop codons indicated.

a b

cLibrary R replicate 1 effect size

Libr

ary

R re

plic

ate

2 ef

fect

siz

e

–5–4–3–2–101

BRCA1 exon 18 position

Effe

ct s

ize

1 2 3 4 5 6 7 8 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77

A C G T

R = 0.847

–5

–4

–3

–2

–1

0

1

Library R effect size

Libr

ary

R2

effe

ct s

ize

nonsensesense

R = 0.846

–5

–4

–3

–2

–1

0

1

–5 –4 –3 –2 –1 0 1 –5 –4 –3 –2 –1 0 1

Figure 2 | Multiplex homology-directed repair reveals effects of singlenucleotide variants on transcript abundance. Three separate HDR libraries(R, R2, and L) containing a 3% mutation rate (97% WT, 1% each non-WT base)in either half of BRCA1 exon 18 were introduced to the genome via co-transfection with pCas9-sgBRCA1x18. Enrichment scores were calculated foreach haplotype observed at least 10 times in the gDNA, and effect sizes of SNVswere determined by weighted linear regression modelling. ‘Sense’ includesboth missense and synonymous SNVs. a, Effect sizes calculated from replicatetransfections of HDR library R, consisting of a 3% per-nucleotide mutation rate

in the 39-most 39 bases and the same selective PCR site used in Fig. 1, werehighly correlated (R 5 0.846). b, Library R2 harboured a selective PCR sitecomposed of 5 synonymous changes, none of which are present in library R.When effect sizes derived from experiments with library R2 were plottedagainst those from library R, there was a strong correlation (R 5 0.847),indicating reproducibility and demonstrating that differences between selectivePCR sites did not strongly influence scores. c, Effect sizes for SNVs across theexon are displayed. Data sets from libraries R and L were combined to spanthe entire exon. Dashed lines represent SNVs that introduce nonsense codons.

RESEARCH LETTER

2 | N A T U R E | V O L 0 0 0 | 0 0 M O N T H 2 0 1 4Nature nature13695.3d 8/8/14 14:12:55

Co-transfection,multiplex editing

CRISPR

Random hexamerHDR library

5 days, gDNA &RNA prep (with RT)

ATGCTGAGTTTGTGTGTGAA CGG AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGCTGAGTTTGTGTG T GAACGG AC

ATGC NNNNNN TGTGTG GATATCC AC

ATGCTGAGTTTGTGTGTGAA CGG AC

Endogenous locus

Donor plasmidswith random hexamers

Selective PCR

gDNA cDNA

Sequencing

Genome editingin many cells

Transcriptionand splicing

CRISPR

Hexamer Count gDNA cDNA

Enrichment scores

cDN

A/g

DN

A

a

Endogenous loci

WT

Edited

Rank-ordered hexamers

Log 2

enric

hmen

t - re

plic

ate

2a

Log 2

enric

hmen

t

Log2 enrichment - replicate 1a

–10

–5

0

5

ESE

ESS

Nonsense

Enrichment score rank

b

c

R = 0.659

–15

–10

–5

0

5

–15 –10 –5 0 5

ESEESSNonsense

Figure 1 | Saturation genome editing and multiplex functional analysis of ahexamer region influencing BRCA1 splicing. a, Experimental schematic.Cultured cells were co-transfected with a single Cas9-sgRNA construct(CRISPR) and a complex homology-directed repair (HDR) library containingan edited exon that harbours a random hexamer (blue, green, orange) and afixed selective PCR site (red). CRISPR-induced cutting stimulated homologousrecombination with the HDR library, inserting mutant exons into the genomesof many cells. At five days post-transfection, cells were harvested for gDNA

and RNA. After reverse transcription, selective PCR was performed followed bysequencing of gDNA- and cDNA-derived amplicons. Hexamer enrichmentscores were calculated by dividing cDNA counts normalized by gDNA counts.b, Correlation of enrichment scores between biological replicates for hexamersobserved in each experiment with positions of previously identified14 exonicsplicing enhancers (ESEs), exonic splicing silencers (ESSs) and stop codonsindicated. c, Rank-ordered plot of enrichment scores with positions of ESEs,ESSs and stop codons indicated.

a b

cLibrary R replicate 1 effect size

Libr

ary

R re

plic

ate

2 ef

fect

siz

e

–5–4–3–2–101

BRCA1 exon 18 position

Effe

ct s

ize

1 2 3 4 5 6 7 8 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77

A C G T

R = 0.847

–5

–4

–3

–2

–1

0

1

Library R effect size

Libr

ary

R2

effe

ct s

ize

nonsensesense

R = 0.846

–5

–4

–3

–2

–1

0

1

–5 –4 –3 –2 –1 0 1 –5 –4 –3 –2 –1 0 1

Figure 2 | Multiplex homology-directed repair reveals effects of singlenucleotide variants on transcript abundance. Three separate HDR libraries(R, R2, and L) containing a 3% mutation rate (97% WT, 1% each non-WT base)in either half of BRCA1 exon 18 were introduced to the genome via co-transfection with pCas9-sgBRCA1x18. Enrichment scores were calculated foreach haplotype observed at least 10 times in the gDNA, and effect sizes of SNVswere determined by weighted linear regression modelling. ‘Sense’ includesboth missense and synonymous SNVs. a, Effect sizes calculated from replicatetransfections of HDR library R, consisting of a 3% per-nucleotide mutation rate

in the 39-most 39 bases and the same selective PCR site used in Fig. 1, werehighly correlated (R 5 0.846). b, Library R2 harboured a selective PCR sitecomposed of 5 synonymous changes, none of which are present in library R.When effect sizes derived from experiments with library R2 were plottedagainst those from library R, there was a strong correlation (R 5 0.847),indicating reproducibility and demonstrating that differences between selectivePCR sites did not strongly influence scores. c, Effect sizes for SNVs across theexon are displayed. Data sets from libraries R and L were combined to spanthe entire exon. Dashed lines represent SNVs that introduce nonsense codons.

RESEARCH LETTER

2 | N A T U R E | V O L 0 0 0 | 0 0 M O N T H 2 0 1 4

Findlay, Boyleetal.,Nature (2014).

* defined from Ke et al. 2011

*Splice enhancers

*Splice silencers

Page 21: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Effects of SNVs across BRCA1 exon 18

Co-transfection,multiplex editing

CRISPR

Random hexamerHDR library

5 days, gDNA &RNA prep (with RT)

ATGCTGAGTTTGTGTGTGAA CGG AC

ATGC NNNNNNTGTGTGGATATCCAC

ATGCNNNNNN TGTGTG GATATCCAC

ATGC NNNNNNTGTGTGGATATCCAC

ATGCTGAGTTTGTGTGT GAACGG AC

ATGCNNNNNNTGTGTGGATATCC AC

ATGCTGAGTTTGTGTGTGAACGG AC

Endogenous locus

Donor plasmidswith random hexamers

Selective PCR

gDNA cDNA

Sequencing

Genome editingin many cells

Transcriptionand splicing

CRISPR

Hexamer Count gDNA cDNA

Enrichment scores

cDN

A/g

DN

A

a

Endogenous loci

WT

Edited

Rank-ordered hexamers

Log 2

enric

hmen

t - re

plic

ate

2aLo

g 2 en

richm

ent

Log2 enrichment - replicate 1a

–10

–5

0

5

ESE

ESS

Nonsense

Enrichment score rank

b

c

R = 0.659

–15

–10

–5

0

5

–15 –10 –5 0 5

ESEESSNonsense

Figure 1 | Saturation genome editing and multiplex functional analysis of ahexamer region influencing BRCA1 splicing. a, Experimental schematic.Cultured cells were co-transfected with a single Cas9-sgRNA construct(CRISPR) and a complex homology-directed repair (HDR) library containingan edited exon that harbours a random hexamer (blue, green, orange) and afixed selective PCR site (red). CRISPR-induced cutting stimulated homologousrecombination with the HDR library, inserting mutant exons into the genomesof many cells. At five days post-transfection, cells were harvested for gDNA

and RNA. After reverse transcription, selective PCR was performed followed bysequencing of gDNA- and cDNA-derived amplicons. Hexamer enrichmentscores were calculated by dividing cDNA counts by gDNA counts.b, Correlation of enrichment scores between biological replicates for hexamersobserved in each experiment with positions of previously identified14 exonicsplicing enhancers (ESEs), exonic splicing silencers (ESSs) and stop codonsindicated. c, Rank-ordered plot of enrichment scores with positions of ESEs,ESSs and stop codons indicated.

a b

cLibrary R replicate 1 effect size

Libr

ary

R re

plic

ate

2 ef

fect

siz

e

–5–4–3–2–101

BRCA1 exon 18 position

Effe

ct s

ize

1 2 3 4 5 6 7 8 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77

A C G T

R = 0.847

–5

–4

–3

–2

–1

0

1

Library R effect size

Libr

ary

R2

effe

ct s

ize

nonsensesense

R = 0.846

–5

–4

–3

–2

–1

0

1

–5 –4 –3 –2 –1 0 1 –5 –4 –3 –2 –1 0 1

Figure 2 | Multiplex homology-directed repair reveals effects of singlenucleotide variants on transcript abundance. Three separate HDR libraries(R, R2, and L) containing a 3% mutation rate (97% WT, 1% each non-WT base)in either half of BRCA1 exon 18 were introduced to the genome via co-transfection with pCas9-sgBRCA1x18. Enrichment scores were calculated foreach haplotype observed at least 10 times in gDNA sequencing, and effect sizesof SNVs were determined by weighted linear regression modelling. ‘Sense’includes both missense and synonymous SNVs. a, Effect sizes calculated fromreplicate transfections of HDR library R, consisting of a 3% per-nucleotide

mutation rate in the 39-most 39 bases and the same selective PCR site used inFig. 1, were highly correlated (R 5 0.846). b, Library R2 harboured a selectivePCR site composed of 5 synonymous changes, none of which are present inlibrary R. When effect sizes derived from experiments with library R2 wereplotted against those from library R, there was a strong correlation (R 5 0.847),indicating reproducibility and demonstrating that differences between selectivePCR sites did not strongly influence scores. c, Effect sizes for SNVs across theexon are displayed. Data sets from libraries R and L were combined to spanthe entire exon. Dashed lines represent SNVs that introduce nonsense codons.

LETTER RESEARCH

4 S E P T E M B E R 2 0 1 4 | V O L 5 1 3 | N A T U R E | 1 2 1

Macmillan Publishers Limited. All rights reserved©2014

- - - - *Dashedlinesrepresentnonsense mutations

Findlay, Boyleetal.,Nature (2014).

Page 22: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Effects of SNVs across BRCA1 exon 18

Co-transfection,multiplex editing

CRISPR

Random hexamerHDR library

5 days, gDNA &RNA prep (with RT)

ATGCTGAGTTTGTGTGTGAA CGG AC

ATGC NNNNNNTGTGTGGATATCCAC

ATGCNNNNNN TGTGTG GATATCCAC

ATGC NNNNNNTGTGTGGATATCCAC

ATGCTGAGTTTGTGTGT GAACGG AC

ATGCNNNNNNTGTGTGGATATCC AC

ATGCTGAGTTTGTGTGTGAACGG AC

Endogenous locus

Donor plasmidswith random hexamers

Selective PCR

gDNA cDNA

Sequencing

Genome editingin many cells

Transcriptionand splicing

CRISPR

Hexamer Count gDNA cDNA

Enrichment scores

cDN

A/g

DN

A

a

Endogenous loci

WT

Edited

Rank-ordered hexamers

Log 2

enric

hmen

t - re

plic

ate

2aLo

g 2 en

richm

ent

Log2 enrichment - replicate 1a

–10

–5

0

5

ESE

ESS

Nonsense

Enrichment score rank

b

c

R = 0.659

–15

–10

–5

0

5

–15 –10 –5 0 5

ESEESSNonsense

Figure 1 | Saturation genome editing and multiplex functional analysis of ahexamer region influencing BRCA1 splicing. a, Experimental schematic.Cultured cells were co-transfected with a single Cas9-sgRNA construct(CRISPR) and a complex homology-directed repair (HDR) library containingan edited exon that harbours a random hexamer (blue, green, orange) and afixed selective PCR site (red). CRISPR-induced cutting stimulated homologousrecombination with the HDR library, inserting mutant exons into the genomesof many cells. At five days post-transfection, cells were harvested for gDNA

and RNA. After reverse transcription, selective PCR was performed followed bysequencing of gDNA- and cDNA-derived amplicons. Hexamer enrichmentscores were calculated by dividing cDNA counts by gDNA counts.b, Correlation of enrichment scores between biological replicates for hexamersobserved in each experiment with positions of previously identified14 exonicsplicing enhancers (ESEs), exonic splicing silencers (ESSs) and stop codonsindicated. c, Rank-ordered plot of enrichment scores with positions of ESEs,ESSs and stop codons indicated.

a b

cLibrary R replicate 1 effect size

Libr

ary

R re

plic

ate

2 ef

fect

siz

e

–5–4–3–2–101

BRCA1 exon 18 position

Effe

ct s

ize

1 2 3 4 5 6 7 8 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77

A C G T

R = 0.847

–5

–4

–3

–2

–1

0

1

Library R effect size

Libr

ary

R2

effe

ct s

ize

nonsensesense

R = 0.846

–5

–4

–3

–2

–1

0

1

–5 –4 –3 –2 –1 0 1 –5 –4 –3 –2 –1 0 1

Figure 2 | Multiplex homology-directed repair reveals effects of singlenucleotide variants on transcript abundance. Three separate HDR libraries(R, R2, and L) containing a 3% mutation rate (97% WT, 1% each non-WT base)in either half of BRCA1 exon 18 were introduced to the genome via co-transfection with pCas9-sgBRCA1x18. Enrichment scores were calculated foreach haplotype observed at least 10 times in gDNA sequencing, and effect sizesof SNVs were determined by weighted linear regression modelling. ‘Sense’includes both missense and synonymous SNVs. a, Effect sizes calculated fromreplicate transfections of HDR library R, consisting of a 3% per-nucleotide

mutation rate in the 39-most 39 bases and the same selective PCR site used inFig. 1, were highly correlated (R 5 0.846). b, Library R2 harboured a selectivePCR site composed of 5 synonymous changes, none of which are present inlibrary R. When effect sizes derived from experiments with library R2 wereplotted against those from library R, there was a strong correlation (R 5 0.847),indicating reproducibility and demonstrating that differences between selectivePCR sites did not strongly influence scores. c, Effect sizes for SNVs across theexon are displayed. Data sets from libraries R and L were combined to spanthe entire exon. Dashed lines represent SNVs that introduce nonsense codons.

LETTER RESEARCH

4 S E P T E M B E R 2 0 1 4 | V O L 5 1 3 | N A T U R E | 1 2 1

Macmillan Publishers Limited. All rights reserved©2014Findlay, Boyleetal.,Nature (2014).

MutPred Spliceannotations:C49G “SpliceAffectingVariant”A53G “ESELoss/ESSGain”A56G “ESELoss”G63T “Cryptic5’SS”T67G “Cryptic5’SS”akaVUSV1714G

- - - - *Dashedlinesrepresentnonsense mutations

Page 23: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

In summary

Parallelized assays for the protein function of the RING domain of BRCA1

Saturation genome editing to understand the effect of missense variants on splicing

Page 24: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Freq

uenc

y

Next steps…

Suggestions?

Page 25: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Library construction and variant delivery

Parallelizable assays for protein function

Calculate likelihood estimates for pathogenicity

Sequencing of variants

Computational variant scoring pipeline

Challenges for scaling up

Page 26: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

How the results from massively parallelassays could get to the bedside…

Morescans

Betterdatabases

Bettervarianteffectprediction

Page 27: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Shendure lab

Fowler lab

Parvin labMuhtadi Islam

The Ohio State University

Kitzman labUniversity of MichiganFields lab

Dave Young Justin Gullingsrud

Thanks to:

Funding from theYeast Resource Center

NIH P41

Page 28: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

The effects of missense SNVs on splicing and protein function are difficult to predict

exon 17 exon 19exon 18CRISPR

HDR Library (N = 4,096)

Multiplex genome editing to determine effects of SNVs on splicing of exon 18 of

BRCA1Findlay et al. Nature 2014

Learning the Sequence Determinants of Alternative Splicing from Millions of

Random SequencesRosenberg et al. Cell 2015

Stop Gain √ Missense ?Frameshift √

Splicing effects

Page 29: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

I-SceI-GFP donor GFP

+ I-SceI to induce break

HDR

GFP broken GFP

+ functional

BRCA1 variant

error prone break repair

+ siRNA to target BRCA1 3’UTR

+ nonfunctional

BRCA1 variant

Scoring full-length BRCA1 variants for HDR function in human cells

pathogenic

control

benign

0.00

0.25

0.50

0.75

1.00

1.25

WT

Vecto

r o

nly

R7C

M18T

L22S

C39Y

H41R

C44F

C44S

K45Q

C61G

C64G

D67Y

% H

DR

resc

ue

HDR rescue assay

Muhtadi Islam and Jeff Parvin

Page 30: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

I-SceI-GFP donor GFP

+ I-SceI to induce break

HDR

GFP broken GFP

+ functional

BRCA1 variant

error prone break repair

+ siRNA to target BRCA1 3’UTR

+ nonfunctional

BRCA1 variant

Scoring full-length BRCA1 variants for HDR function in human cells

HDR rescue assay

Page 31: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Construction of the barcoded single amino acid substitution BRCA1-RING library

+

Page 32: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

ATGCTGAGTTTGTGTGTGAACGGAC…

ATGCNNNNNNTGTGTGGATATCCAC…

ATGCNNNNNNTGTGTGGATATCCAC…

ATGCNNNNNNTGTGTGGATATCCAC…

ATGCTGAGTTTGTGTGT GAACGGAC…

ATGCNNNNNNTGTGTGGATATCCAC…

ATGCTGAGTTTGTGTGTGAACGGAC…

BRCA1 locus

Donor Plasmids w/ random 6mers Genome editing

in many cells

Transcription and Splicing

CRISPR

Endogenous Loci Eacheditedexonreceives

1.ArandomSNV2.Afixed mutation

3%of106 cells=30,000events

Multiplex genome editing to measurethe effects of SNVs on splicing

Page 33: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Prospective functional map for 1,287 BRCA1 RING variants0

25

50

75

0

2

4

6

8

Predicted HDR rescue score

0.0 1.00.33 0.53 0.77

1,287 BRCA1 RING variants

likely HDR likely HDR non-functional functional

Predicted HDR rescue score

0.0 1.00.33 0.53 0.77

59 BRCA1 RING variants

likely HDR likely HDR non-functional functional

midpoint between mean pathogenic and benign scores

max pathogenicHDR rescue score

(experimental)

min benignHDR rescue score

(experimental)

A B

COSMICEVS

pathogenicVUS

benign

subs

titut

ing

amin

o ac

id

AVILMFYWSTNQCGPRHKDE*

Zn++Zn++

N- 4-helix bundle loop 1 central helix loop 2

5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100

C- 4-helix bundle

C

no dataWT aa

1

2.0

0

PRED

ICTE

DHD

R re

scue

soc

re

*

*

Page 34: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

E3 ligase activity

Massively parallel assays for BRCA1-RING E3 ligase activity

variant library

BRCA1T7 coat BARD1(26-126) BRCA1(2-304)

A

ATP, E1, E2Flag-Ub capture elute

deep sequencing

calculate variant frequency

deep sequencing

calculate selected/input ratio for each variant for each round

calculate slope of log2 ratios over 5 rounds of selection

5XB

no dataWT aa

1

2.0

0

E3 fu

nctio

nal s

core

su

bs

titu

tin

g a

min

o a

cid

AVILMFYWSTNQ

CGP

RH

KDE

*

M18

damaging

neutral

enhancing

M18

Page 35: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

E3 ligase activity

variant library

BRCA1T7 coat BARD1(26-126) BRCA1(2-304)

A

ATP, E1, E2Flag-Ub capture elute

deep sequencing

calculate variant frequency

deep sequencing

calculate selected/input ratio for each variant for each round

calculate slope of log2 ratios over 5 rounds of selection

5XB

Massively parallel assays for BRCA1-RING E3 ligase activity

Page 36: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

E3 ligase activity

variant library

BRCA1T7 coat BARD1(26-126) BRCA1(2-304)

A

ATP, E1, E2Flag-Ub capture elute

deep sequencing

calculate variant frequency

deep sequencing

calculate selected/input ratio for each variant for each round

calculate slope of log2 ratios over 5 rounds of selection

5XB

Massively parallel assays for the BRCA1-RINGE3 ligase and BARD1-binding activities

Page 37: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

E3 ligase activity

variant library

BRCA1T7 coat BARD1(26-126) BRCA1(2-304)

A

ATP, E1, E2Flag-Ub capture elute

deep sequencing

calculate variant frequency

deep sequencing

calculate selected/input ratio for each variant for each round

calculate slope of log2 ratios over 5 rounds of selection

5XB

selection in -histidine

transformation, Time 1 Time 2 Time 3deep sequencing deep sequencing deep sequencing deep sequencing

Time 0

B

BARD1-binding activity

Massively parallel assays for the BRCA1-RINGE3 ligase and BARD1-binding activities

Yeast two-hybrid

Page 38: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

Genetic testing is big business

*41.7% of tests revealed a VUS in at least one geneTung et al. FrequencyofmutationsinindividualswithbreastcancerreferredforBRCA1 andBRCA2

testingusingnext-generationsequencingwitha25-genepanel.Cancer 2015

More companies, lower costs, more genes*

pathogenicVUS

benign

not yet seen

0.0 1.00.33 0.77

0

50

100

Predicted HDR score

Cou

nts

High HDR functionLow HDR function

0.0 0.2 0.4 0.6

PolyPhen-2

CADD

Grantham

Align-GVGD

E3 + BARD1-binding

E3 + BARD1-binding

+ A-GVGD

Leave-One-Out Cross Validation R2

Page 39: Massively parallel functional analysis of missense mutations in … · Massively parallel functional assays for assessing function of missense variants Generate a library with mutations

We need new technologies to deliver

on the promises of genetic medicine

https://www.whitehouse.gov/precision-medicine

Massively parallel functional analyses are a

possible solution

0

25

50

75

0

2

4

6

8

Predicted HDR rescue score

0.0 1.00.33 0.53 0.77

1,287 BRCA1 RING variants

likely HDR likely HDR non-functional functional

Predicted HDR rescue score

0.0 1.00.33 0.53 0.77

59 BRCA1 RING variants

likely HDR likely HDR non-functional functional

midpoint between mean pathogenic and benign scores

max pathogenicHDR rescue score

(experimental)

min benignHDR rescue score

(experimental)

A B

COSMICEVS

pathogenicVUS

benign

subs

titut

ing

amin

o ac

id

AVILMFYWSTNQCGPRHKDE*

Zn++Zn++

N- 4-helix bundle loop 1 central helix loop 2

5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100

C- 4-helix bundle

C

no dataWT aa

1

2.0

0

PRED

ICTE

DHD

R re

scue

soc

re

*

*

Starita et al. Genetics, 2015Findlay et al. Nature, 2014Rosenberg et al. Cell, 2015 Patwardhan et al. Nature Biotech, 2012 Fowler et al. Nature Methods, 2010…