Staying on target with CRISPR-Cas

Dana Carroll

Four independent studies shed light on the specificity of RNA-guided genome editing tools based on the Streptococcus pyogenes Cas9 protein.

NEWS AND VIEWS
Nature Biotechnology, volume 31, number 9, September 2013, p. 807
© 2013 Nature America, Inc. All rights reserved.

Dana Carroll is at the Department of Biochemistry, University of Utah School of Medicine, Salt Lake City, Utah. e-mail: dana@biochem.utah.edu

We are in the midst of a revolution in genome engineering. The advent of targetable nucleases has given researchers the ability to induce specific double-strand breaks in chromosomal DNA whose repair either induces local mutations or stimulates homologous recombination with experimenter-provided donor DNA¹. Until now, zinc-finger nucleases (ZFNs)² and the more recently described transcription activator-like effector nuclease (TALEN) system³ have provided impetus to the field. In the past 12 months, a new entrant, the CRISPR-Cas RNA-guided nucleases⁴, has gained prominence. Compared with ZFNs and TALENs, these latter endonucleases not only offer a simpler means of attaining specificity (i.e., a guide RNA, rather than a DNA-binding protein domain that requires complex engineering) but also demonstrate equal or greater cleavage efficacy. One key question concerning the specificity of CRISPR-Cas RNA-guided nucleases is whether off-target cleavage is comparable to that of other endonuclease systems. In this issue, four studies⁴⁻⁷ provide new insights into the targeting efficiency of the CRISPR-Cas system, suggesting that the current generation of RNA-guided nucleases may not yet have adequate specificity to completely displace their ZFN and TALEN forebears.

CRISPR components derive from bacteria and archaea, where they are part of an adaptive immune system that protects the organism against invading DNA⁸. Small segments of plasmids or viral genomes from an earlier infection are incorporated into clustered, regularly interspaced, short palindromic repeats (CRISPRs). RNA transcribed from these sequences directs the Cas9 protein to cleave and inactivate a new intruder.

In the microbes from which it was first isolated, the Cas9 endonuclease binds two RNAs, a specificity-determining CRISPR RNA (crRNA) and an auxiliary trans-activating crRNA (tracrRNA). For genome editing purposes, however, a fusion of both is frequently used as a single guide RNA (sgRNA). The sgRNA consists of a sequence complementary to the target at its 5′ end and a Cas9-recognizable structure (derived from the tracrRNA) at its 3′ end. The target also needs to have a short sequence just outside the region of RNA-DNA hybridization, called the protospacer adjacent motif (PAM; Fig. 1). Most research has made use of Cas9 from Streptococcus pyogenes, which prefers the PAM sequence NGG and a guide RNA with 20 nucleotides of homology to the target. It has now been demonstrated that the CRISPR-Cas9 system efficiently cuts suitable targets in vitro⁹ and in cells⁹⁻¹².

The issue of specificity is paramount for all the targetable nucleases, particularly in applications to human therapy and to food sources. Off-target cleavage by ZFNs and TALENs has been reduced by modifying the cleavage domain to require the formation of heterodimers². In the CRISPR-Cas system, earlier studies showed that some base mismatches between the guide RNA and target DNA are tolerated, particularly when they are far from the PAM⁹,¹³. The work presented in this issue⁴⁻⁷ now offers much more extensive analyses of the specificity requirements of Cas9-based genomic tools.

Fu et al.⁴ and Hsu et al.⁵ each produced a battery of sgRNAs that carried mismatches to several target sequences. By assaying the induced mutation frequencies in cultured human cells for each guide RNA:Cas9 pair, they found that some single mismatches, particularly ones near the PAM, reduced cleavage activity, but the magnitude of the effect varied considerably among targets. The sgRNAs with two or more mismatches typically were much less effective, but again, the degree of inhibition was highly variable, with some sgRNAs carrying two mismatches showing cleavage comparable to the perfect match.

Pattanayak et al.⁷ selected from a partially randomized library those DNA sequences that were cut most efficiently by four different guide RNA:Cas9 combinations in vitro. They, too, found that mismatches were tolerated, particularly distal to the PAM and at high nuclease concentrations. Like Hsu et al.⁵, they saw that a NAG PAM could support cleavage, albeit less efficiently than NGG.

All three groups assayed off-target cleavage of endogenous sites in human cells for each of several sgRNA:Cas9 combinations. The most likely secondary targets in the genome were predicted based on having a small number of mismatches to the sgRNA. After nuclease treatment, the targets were amplified by PCR and tested for sequence alterations either with a gel assay or by deep sequencing. Many candidate sites were not mutated, but in each study, some sites with multiple mismatches were mutated at high frequencies. The most striking off-target cleavage was observed in the study by Fu et al.⁴. One site with two mismatches and one with three were mutated at levels equal to or higher than that of the intended target. Several sites with four, and even five, mismatches were mutated substantially⁴,⁵,⁷.

Mali et al.⁶ examined CRISPR specificity using a different approach. They produced a nuclease-inactivated version of Cas9 and endowed it with a transcription activation module (VP64), so association with a specific sgRNA would turn on expression of a reporter gene in cultured human cells. Using barcoding and deep sequencing, they determined which variant target sites were capable of activation with a given sgRNA. Similar to the nuclease studies, this one revealed that both single and multiple mismatches were tolerated, particularly far from the PAM. Each of the three sgRNAs examined showed a somewhat different pattern of residues that tolerated mismatches versus those that did not. Using the nuclease-active form of Cas9, this group also showed substantial cleavage of a reporter target with some guide RNA variants carrying one or two mismatches.

The degree of cleavage of secondary targets, in naked DNA and in cells, varies with the activity of the guide RNA:Cas9 nuclease. The ratio of on- to off-target cleavage is improved at lower concentrations, as expected. Cell type might also play a role. Fu et al.⁴ found that both on- and off-target mutagenesis was lower in human embryonic kidney 293 (HEK293) and human erythroleukemia K562 cells than in human osteosarcoma U2OS cells, and mutagenesis at some secondary targets fell below the level of detection. Pattanayak et al.⁷ suggest reducing the nuclease concentration or activity as a means to improve specificity, but a price is paid in reduced efficacy at the designed target.
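The targeting requirement described earlier (a 20-nucleotide match to the guide, immediately followed by an NGG PAM) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `find_target_sites` is hypothetical, only the strand given as input is scanned, and the example sequence is the target shown in Figure 1. It is not a tool used in any of the four studies.

```python
import re
from typing import Iterator, Tuple


def find_target_sites(dna: str, guide_len: int = 20) -> Iterator[Tuple[int, str, str]]:
    """Yield (start, protospacer, pam) for every site on the given strand
    where `guide_len` bases are immediately followed by an NGG PAM.

    Assumption: only the supplied strand is scanned; a fuller search would
    also scan the reverse complement.
    """
    dna = dna.upper()
    # The lookahead makes the match zero-width, so overlapping sites are
    # all reported rather than only non-overlapping ones.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    for match in pattern.finditer(dna):
        yield match.start(), match.group(1), match.group(2)


if __name__ == "__main__":
    # Target and PAM taken from the Figure 1 sequence.
    example = "GGTGAGTGAGTGTGTGCGTG" + "TGG"
    for start, protospacer, pam in find_target_sites(example):
        print(start, protospacer, pam)  # 0 GGTGAGTGAGTGTGTGCGTG TGG
```

The scan only identifies sites that satisfy the nominal rule; as the studies discussed here show, real cleavage also occurs at sites that break it.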

What lessons can be learned from these studies? The Cas9 nuclease is less sensitive to mismatches between the sgRNA and target than we might wish. Base pairs near the PAM are more important, but there is no identifiable seed sequence in the first 7–12 positions, in contrast to what was previously suggested⁹,¹³. Overall, it is difficult to define simple rules for sgRNA design based on the results of the four studies⁴⁻⁷. The pattern of tolerated substitutions varies considerably among sgRNA-target pairs, some being very sensitive to mismatches, others accepting many variants. In some cases, mismatched pairs supported enhanced activity, either in cleavage⁴,¹³ or in gene activation⁶. There is no evidence that individual positions within the 20-base-pair RNA-DNA duplex have special properties either for or against substitutions.

[Figure 1 graphic. Panel a: target DNA 5′-GGTGAGTGAGTGTGTGCGTG-3′ followed by the PAM TGG (complementary strand CCACTCACTCACACACGCAC, ACC), paired with the guide RNA 5′-GGUGAGUGAGUGUGUGCGUG and bound by the Cas9 protein. Panel b: screen and validation workflows for Fu et al.⁴, Hsu et al.⁵, Pattanayak et al.⁷ and Mali et al.⁶.]

The lack of perfect specificity in the CRISPR system may, in fact, be adaptive. If all positions of the 20-nucleotide guide sequence and two-base-pair PAM were strictly required, Cas9 would cut only one sequence out of more than 10¹³ possible 22-mers, which seems excessive. Viruses are subject to constant variation and selection, so the next invading genome will certainly differ slightly from the one that established a CRISPR insert. By allowing mismatches, the bacterial host is flexible in its defense.
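The count of possible 22-mers quoted above follows from four bases at each of 22 positions. The sketch below checks that arithmetic and, as a purely illustrative extension, counts how many 20-nucleotide sequences lie within a given number of mismatches of one guide. The function names are hypothetical, and the second calculation assumes that every position tolerates a mismatch equally, which the studies discussed here indicate is not the case.

```python
from math import comb

GUIDE_LEN = 20  # nucleotides of guide-target homology
PAM_LEN = 2     # the two fixed G positions of an NGG PAM


def total_sequences(length: int) -> int:
    """Number of distinct DNA sequences of the given length."""
    return 4 ** length


def sequences_within_mismatches(length: int, max_mismatches: int) -> int:
    """Number of sequences differing from one reference at no more than
    `max_mismatches` positions (3 alternative bases per mismatched position).

    Assumption: all positions are treated as equally tolerant.
    """
    return sum(comb(length, k) * 3 ** k for k in range(max_mismatches + 1))


if __name__ == "__main__":
    print(total_sequences(GUIDE_LEN + PAM_LEN))       # 17592186044416, about 1.8e13
    print(sequences_within_mismatches(GUIDE_LEN, 0))  # 1
    print(sequences_within_mismatches(GUIDE_LEN, 2))  # 1771
    print(sequences_within_mismatches(GUIDE_LEN, 3))  # 32551
```

Even under this crude assumption, tolerating a few mismatches enlarges the recognized set by several orders of magnitude, which is the flexibility the paragraph describes.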

What might be done to improve the specificity of CRISPR cleavage? It seems that the best prospects lie with the Cas9 protein. It might be possible to increase the stringency of recognition of the guide RNA-DNA hybrid, either by selection or by protein engineering. The latter would require some knowledge of the structure of the complex, but at the moment no high-quality structure of the protein, with or without the RNA, is available. Mali et al. describe an alternative that looks promising⁶. They use a Cas9 variant that cuts only one DNA strand, plus two sgRNAs simultaneously, for sequences close to each other in the genomic target. The offset nicks produced are effective in generating breaks that are substrates for both mutagenic and homologous repair. This strengthens the analogy to ZFNs and TALENs, both of which require independent binding of two DNA-recognition domains to assemble the nuclease and produce a break. In the meantime, direct assessment of effects at secondary targets in the genome is certainly required in most experimental settings.
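Choosing which secondary targets to assess starts, in the studies discussed here, from sites with a small number of mismatches to the sgRNA. A minimal Python sketch of that kind of candidate list follows. The names `mismatch_positions` and `candidate_sites` are hypothetical, the default cutoff of three mismatches is an arbitrary choice for illustration, only the given strand is scanned, and both NGG and NAG PAMs are accepted because the text reports that NAG can support cleavage. It does not reproduce the prediction method of any of the four papers.

```python
from typing import List, Tuple

Hit = Tuple[int, str, str, List[int]]  # (start, site, pam, mismatch positions)


def mismatch_positions(guide: str, site: str) -> List[int]:
    """Return mismatch positions counted from the PAM-proximal end
    (1 = the base adjacent to the PAM, len(guide) = the 5' end of the guide)."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    n = len(guide)
    return [n - i for i, (g, s) in enumerate(zip(guide, site)) if g != s]


def candidate_sites(guide_rna: str, genome: str, max_mismatches: int = 3,
                    pam_suffixes: Tuple[str, ...] = ("GG", "AG")) -> List[Hit]:
    """List sites on the given strand that carry an accepted PAM and differ
    from the guide at no more than `max_mismatches` positions, fewest first."""
    guide = guide_rna.upper().replace("U", "T")  # compare in the DNA alphabet
    genome = genome.upper()
    n = len(guide)
    hits: List[Hit] = []
    for start in range(len(genome) - n - 3 + 1):
        site = genome[start:start + n]
        pam = genome[start + n:start + n + 3]
        if pam[1:] not in pam_suffixes:  # N is unconstrained; check the last two bases
            continue
        mismatches = mismatch_positions(guide, site)
        if len(mismatches) <= max_mismatches:
            hits.append((start, site, pam, mismatches))
    hits.sort(key=lambda hit: len(hit[3]))  # stable sort keeps genome order within ties
    return hits


if __name__ == "__main__":
    guide = "GGUGAGUGAGUGUGUGCGUG"            # guide RNA from Figure 1
    on_target = "GGTGAGTGAGTGTGTGCGTG" + "TGG"
    off_target = "GCTGAGTGAGTGTGTGCGTG" + "AGG"  # made-up site, one PAM-distal mismatch
    for hit in candidate_sites(guide, on_target + "TTTTT" + off_target):
        print(hit)
```

Reporting mismatch positions relative to the PAM reflects the observation that PAM-distal mismatches tend to be better tolerated, although the studies found no simple positional rule; any candidate on such a list would still need to be tested experimentally.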

For applications to model organisms, specificity is less of an issue, and the simplicity of the CRISPR system remains very attractive. Only a single, constant protein is required for all targets; attacking new targets requires only knowledge of the Watson-Crick base pairing rules; and the relatively short sgRNAs are comparatively easy to construct. Simultaneous editing of multiple targets has already been achieved in human cells¹⁰ and in mice¹⁴, rats¹⁵,¹⁶, zebrafish¹⁷ and plants¹⁸. Given the dizzying rate at which CRISPR-targeting publications are appearing, researchers are clearly eager to capitalize on these advantages.

Figure 1 The sgRNA:Cas9 targeting complex. (a) The DNA target is shown in black, sgRNA in blue and the Cas9 protein is the orange oval. Red bases in the DNA are those recognized by the sgRNA and Cas9. Specificity is determined by how many residues in the sgRNA-DNA hybrid and in the PAM are required for recognition and cleavage. (Modified from Fu et al.⁴.) (b) Schematic representation of the experimental approaches to determining specificity taken in the papers by Fu et al.⁴, Hsu et al.⁵, Pattanayak et al.⁷ and Mali et al.⁶.

COMPETING FINANCIAL INTERESTS
The author declares no competing financial interests.

1. Jasin, M. Trends Genet. 12, 224–228 (1996).
2. Carroll, D. Genetics 188, 773–782 (2011).
3. Joung, J.K. & Sander, J.D. Nat. Rev. Mol. Cell Biol. 14, 49–55 (2013).
4. Fu, Y. et al. Nat. Biotechnol. 31, 822–826 (2013).
5. Hsu, P.D. et al. Nat. Biotechnol. 31, 827–832 (2013).
6. Mali, P. et al. Nat. Biotechnol. 31, 833–838 (2013).
7. Pattanayak, V. et al. Nat. Biotechnol. 31, 839–843 (2013).
8. Sorek, R., Lawrence, C.M. & Wiedenheft, B. Annu. Rev. Biochem. 82, 237–266 (2013).
9. Cong, L. et al. Science 339, 819–823 (2013).
10. Cho, S.W., Kim, S., Kim, J.M. & Kim, J.-S. Nat. Biotechnol. 31, 230–232 (2013).
11. Jinek, M. et al. eLife 2, e00471 (2013).
12. Mali, P. et al. Science 339, 823–826 (2013).
13. Jiang, W., Bikard, D., Cox, D., Zhang, F. & Marraffini, L.A. Nat. Biotechnol. 31, 233–239 (2013).
14. Wang, H. et al. Cell Reports 153, 910–918 (2013).
15. Li, D. et al. Nat. Biotechnol. 31, 681–683 (2013).
16. Li, W., Teng, F., Li, T. & Zhou, Q. Nat. Biotechnol. 31, 684–686 (2013).
17. Jao, L.-E., Wente, S.R. & Chen, W. Proc. Natl. Acad. Sci. USA (2013), epub ahead of print, doi:10.1073/pnas.1308335110.
18. Li, J.F. et al. Nat. Biotechnol. 31, 688–691 (2013).

Cages from coils

Bryan S. Der & Brian Kuhlman

The use of coiled coils could facilitate the modular, predictable design of protein nanocages.

Nanoscale cages assembled from proteins may find uses in vaccination, biocatalysis and targeted drug delivery, but it is challenging to engineer proteins that assemble into large and well-defined enclosures. Two recent papers in Nature Chemical Biology and Science describe the first approaches to assemble nanocages from the relatively simple protein building blocks of coiled coils. Gradisar et al.¹ assembled six pairs of coiled coils to form the edges of a protein tetrahedron with a 10-nm particle size. Fletcher et al.² used coiled-coil trimers and dimers to construct a honeycomb-like lattice, which spontaneously curved into three-dimensional 100-nm particles. Nanocages such as these are unique materials suited for biomedical applications because (i) they are soluble, biocompatible, genetically encoded, enclosed and semi-permeable and (ii) they are larger than individual proteins, dimers and multimers, but they still have a finite and well-defined three-dimensional structure, in contrast to nanofibers, nanotubes, hydrogels, surfactants and surface coatings³. Furthermore, the use of coiled coils yielded two very different cage structures, and diversity in cage structures will promote diversity in their applications.

Coiled coils are intertwined alpha helices that form elongated bundles, often with two to four helices in a bundle. These proteins are attractive because naturally occurring sequences and structures have been thoroughly characterized, and artificial sequences and structures have been rationally and computationally designed, so coiled coils can be repurposed in a modular and predictable manner⁴,⁵. This use of biological modularity mimics evolutionary diversification by repurposing components rather than generating new components from scratch. Moreover, it means that the structures of the new protein cages could be conceived using a pencil, paper and knowledge of modular protein building blocks, in contrast to previous approaches⁶⁻⁸ that have used computational methods to connect or assemble naturally occurring protein dimers and trimers (Table 1).

Despite using similar building blocks, Gradisar et al.¹ and Fletcher et al.² constructed two very different nanocage structures. Gradisar et al.¹ built a monomeric tetrahedron (TET12; Fig. 1a) from a single polypeptide sequence in which 12 coiled-coil segments connected with flexible Ser-Gly-Pro-Gly linkers folded into a particle 10 nm in diameter with a small interior. In contrast, Fletcher et al.² built a self-assembled cage-like particle (SAGE particle; Fig. 1b) by extending a previously published strategy that fused a trimer to a dimer for cage assembly⁶,⁷. In the new work, a homotrimeric coiled coil was fused to a heterodimeric coiled coil with disulfide linkers to assemble hexagonal repeats. Curvature in the two-dimensional array of repeats resulted in a three-dimensional particle 100 nm in diameter with a porous surface and large interior. The SAGE particle can be categorized as a stochastic assembly, flexible and irregular but having well-defined bulk properties, whereas the TET12 particle can be categorized as a deterministic assembly having a specific, atomically definable three-dimensional structure⁹.

What are the strengths and limitations of these two protein cage designs? The TET12 design innovatively uses protein-protein interactions for tertiary structure formation and folding rather than quaternary structure. Furthermore, it is made from a single polypeptide chain and does not require disulfide formation, making it easy to encode and fold inside of cells. However, this design architecture seems challenging to extend to additional polyhedra. The tetrahedron is the simplest polyhedron, and its six edges required 12 different coiled-coil segments. The cube and octahedron have 12 edges and would therefore require 24 different coiled-coil segments. Such designs are theoretically possible given the expansive toolkit of coiled-coil dimers, but the resulting large polypeptide would be prone to misfolding as a result of off-target coiled-coil interactions.

In comparison to TET12, the SAGE particle is simpler and seems easily extensible and amenable to sequence-structure manipulations. For example, tuning the affinity of the coiled-coil heterodimer resulted in tuning the size of the SAGE particle. In addition, homotrimeric and heterodimeric complexes other than coiled coils can be used as the building blocks⁶,⁷. Closure into a sphere was likely a serendipitous result requiring structural pliability in the hexagonal array, but this illustrates an important concept

Table 1 Comparison of engineered protein nanocages

Last author | Particle size | Particle geometry | Building blocks | Linkage | Engineered structure
Woolfson² | 100 nm | Curved hexagonal array | 3 coiled-coil sequences | Disulfide | Quaternary
Jerala¹ | 10 nm | Tetrahedron | 12 coiled-coil sequences | Ser-Gly-Pro-Gly | Tertiary
Yeates⁶,⁷ | 16 nm | Tetrahedron | Nativeᵃ dimer, native trimer | Rigid helix | Tertiary
Baker⁸ | 13 nm, 11 nm | Octahedron and tetrahedron | Nativeᵃ trimer | Designed protein interface | Quaternary

ᵃNative, naturally occurring.

Bryan S. Der and Brian Kuhlman are in the Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA, and Brian Kuhlman is at the Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA. e-mail: bkuhlman@email.unc.edu
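The edge-counting argument made for the TET12 architecture (each polyhedron edge is formed by a pair of coiled-coil segments, so a single-chain design needs two segments per edge) can be written out as a small Python calculation. The dictionary and function names are hypothetical and serve only to restate the arithmetic in the text.

```python
# Edge counts of the polyhedra mentioned in the text.
POLYHEDRON_EDGES = {
    "tetrahedron": 6,
    "cube": 12,
    "octahedron": 12,
}

SEGMENTS_PER_EDGE = 2  # one coiled-coil pair forms each edge


def segments_needed(polyhedron: str) -> int:
    """Number of coiled-coil segments a single-chain design of this
    polyhedron would need, assuming one coiled-coil pair per edge."""
    return SEGMENTS_PER_EDGE * POLYHEDRON_EDGES[polyhedron]


if __name__ == "__main__":
    for name in POLYHEDRON_EDGES:
        print(name, segments_needed(name))
    # tetrahedron 12
    # cube 24
    # octahedron 24
```

The doubling from 12 to 24 segments is what makes the larger polyhedra harder to build from one chain: more distinct segments mean more opportunities for unintended pairings.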

Page 2: Staying on target with CRISPR-Cas

808 volume 31 number 9 september 2013 nature biotechnology

on- to off-target cleavage is improved at lower concentrations as expected Cell type might also play a role Fu et al4 found that both on- and off-target mutagenesis was lower in human embryonic kidney 293 (HEK293) and human erythroleukemia K562 cells than in human osteosarcoma U2OS cells and mutagen-esis at some secondary targets fell below the level of detection Pattanayak et al7 suggest reduc-ing the nuclease concentration or activity as a means to improve specificity but a price is paid in reduced efficacy at the designed target

What lessons can be learned from these studies The Cas9 nuclease is less sensitive to

mismatches between the sgRNA and target than we might wish Base pairs near the PAM are more important but there is no identifiable seed sequence in the first 7ndash12 positions in contrast to what was previously suggested913 Overall it is difficult to define simple rules for sgRNA design based on the results of the four studies4ndash7 The pattern of tolerated substitutions varies consider-ably among sgRNA-target pairs some being very sensitive to mismatches others accepting many variants In some cases mismatched pairs sup-ported enhanced activity either in cleavage413 or in gene activation6 There is no evidence that individual positions within the 20-base-pair

GGTGAGTGAGTGTGTGCGTG

CCACTCACTCACACACGCAC

5prime GGUGAGUGAGUGUGUGCGUG

TGG ACC

||||||||||||||||||||

PAM

5prime3prime

Guide RNA

Target DNA

Cas9 protein

Fu et al4

Hsu et al5

Pattanayak et al7

Mali et al6

EGFP reporter

Variable gRNAs for 3 sites

+ Cas9Inactivated EGFP

Cleavage andmutagenesis

Genomic target

Predictsecondary sitesgRNA

Assay individually

Endogenous EMX1

Genomic target

gRNA

Assay individually

+ Cas9

Assay each site by deep sequencing

b

In vitro library of variant recognition sites

Genomic target

gRNA

Assay individually

Deep sequence to findwhich sequences were cut

Library of variantrecognition sites + Cas9-VP64

Inactivegene

+ Cas9

Homologous repair

gRNA

Active gene

Deep sequencemRNAs to find which

sequences were activated

Barcodes

Reporter gene

Minimal promoter

Variable gRNAs

Donor DNA

a

3prime

+ Cas9

Cleavage andmutagenesis

Predictsecondary sites

+ Cas9

Variable gRNAs for 4 sites

+ Cas9

Cleavage andcapture

Predictsecondary sites

+ Cas9

Scr

een

Val

idat

ion

Scr

een

Val

idat

ion

Scr

een

Val

idat

ion

Scr

een

Val

idat

ion

RNA-DNA duplex have special properties either for or against substitutions

The lack of perfect specificity in the CRISPR system may in fact be adaptive If all positions of the 20-nucleotide guide sequence and two-base-pair PAM were strictly required Cas9 would cut only one sequence out of more than 1013 pos-sible 22-mers which seems excessive Viruses are subject to constant variation and selection so the next invading genome will certainly differ slightly from the one that established a CRISPR insert By allowing mismatches the bacterial host is flexible in its defense

What might be done to improve the specific-ity of CRISPR cleavage It seems that the best prospects lie with the Cas9 protein It might be possible to increase the stringency of recogni-tion of the guide RNA-DNA hybrid either by selection or by protein engineering The latter would require some knowledge of the structure of the complex but at the moment no high-quality structure of the protein with or with-out the RNA is available Mali et al describe an alternative that looks promising6 They use a Cas9 variant that cuts only one DNA strand plus two sgRNAs simultaneously for sequences close to each other in the genomic target The offset nicks produced are effective in generat-ing breaks that are substrates for both mutagenic and homologous repair This strengthens the analogy to ZFNs and TALENs both of which require independent binding of two DNA-recognition domains to assemble the nuclease and produce a break In the meantime direct assessment of effects at secondary targets in the genome is certainly required in most experi-mental settings

For applications to model organisms speci-ficity is less of an issue and the simplicity of the CRISPR system remains very attractive Only a single constant protein is required for all targets attacking new targets requires only knowledge of the Watson-Crick base pairing rules and the relatively short sgRNAs are comparatively easy to construct Simultaneous editing of multiple targets has already been achieved in human cells10 and in mice14 rats1516 zebrafish17

and plants18 Given the dizzying rate at which CRISPR-targeting publications are appearing

Figure 1 The sgRNACas9 targeting complex (a) The DNA target is shown in black sgRNA in blue and the Cas9 protein is the orange oval Red bases in the DNA are those recognized by the sgRNA and Cas9 Specificity is determined by how many residues in the sgRNA-DNA hybrid and in the PAM are required for recognition and cleavage (Modified from Fu et al4) (b) Schematic representation of the experimental approaches to determining specificity taken in the papers by Fu et al4 Hsu et al5 Pattanayak et al7 and Mali et al6

New S AND v iew Snp

gcopy

201

3 N

atur

e A

mer

ica

Inc

All

right

s re

serv

ed

nature biotechnology volume 31 number 9 september 2013 809

8 Sorek R Lawrence CM amp wiedenheft B Annu Rev Biochem 82 237ndash266 (2013)

9 Cong L et al Science 339 819ndash823 (2013)10 Cho Sw Kim S Kim JM amp Kim J-S

Nat Biotechnol 31 230ndash232 (2013)11 Jinek M et al eLife 2 e00471 (2013)12 Mali P et al Science 339 823ndash826 (2013)13 Jiang w Bikard D Cox D Zhang F amp Marraffini LA

Nat Biotechnol 31 233ndash239 (2013)14 wang H et al Cell Reports 153 910ndash918 (2013)15 Li D et al Nat Biotechnol 31 681ndash683 (2013)16 Li w Teng F Li T amp Zhou Q Nat Biotechnol 31

684ndash686 (2013)17 Jao L-e wente SR amp Chen w Proc Natl Acad

Sci USA (2013) epub ahead of print doi101073pnas1308335110

18 Li JF et al Nat Biotechnol 31 688ndash691 (2013)

researchers are clearly eager to capitalize on these advantages

COMPETING FINANCIAL INTERESTSThe author declares no competing financial interests

1 Jasin M Trends Genet 12 224ndash228 (1996)2 Carroll D Genetics 188 773ndash782 (2011)3 Joung JK amp Sander JD Nat Rev Mol Cell Biol 14

49ndash55 (2013)4 Fu Y et al Nat Biotechnol 31 822ndash826 (2013)5 Hsu PD et al Nat Biotechnol 31 827ndash832

(2013)6 Mali P et al Nat Biotechnol 31 833ndash838 (2013)7 Pattanayak v et al Nat Biotechnol 31 839ndash843

(2013)

Table 1 Comparison of engineered protein nanocages

| Last author | Particle size | Particle geometry | Building blocks | Linkage | Engineered structure |
|---|---|---|---|---|---|
| Woolfson (ref. 2) | 100 nm | Curved hexagonal array | 3 coiled-coil sequences | Disulfide | Quaternary |
| Jerala (ref. 1) | 10 nm | Tetrahedron | 12 coiled-coil sequences | Ser-Gly-Pro-Gly | Tertiary |
| Yeates (refs. 6,7) | 16 nm | Tetrahedron | Native(a) dimer, native trimer | Rigid helix | Tertiary |
| Baker (ref. 8) | 13 nm, 11 nm | Octahedron and tetrahedron | Native(a) trimer | Designed protein interface | Quaternary |

(a) Native, naturally occurring.

Cages from coils

Bryan S Der & Brian Kuhlman

The use of coiled coils could facilitate the modular, predictable design of protein nanocages.

Bryan S. Der and Brian Kuhlman are in the Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA, and Brian Kuhlman is at the Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA. e-mail: bkuhlman@email.unc.edu

Nanoscale cages assembled from proteins may find uses in vaccination, biocatalysis and targeted drug delivery, but it is challenging to engineer proteins that assemble into large and well-defined enclosures. Two recent papers in Nature Chemical Biology and Science describe the first approaches to assemble nanocages from the relatively simple protein building blocks of coiled coils. Gradisar et al.1 assembled six pairs of coiled coils to form the edges of a protein tetrahedron with a 10-nm particle size. Fletcher et al.2 used coiled-coil trimers and dimers to construct a honeycomb-like lattice, which spontaneously curved into three-dimensional 100-nm particles. Nanocages such as these are unique materials suited for biomedical applications because (i) they are soluble, biocompatible, genetically encoded, enclosed and semi-permeable, and (ii) they are larger than individual proteins, dimers and multimers, but they still have a finite and well-defined three-dimensional structure, in contrast to nanofibers, nanotubes, hydrogels, surfactants and surface coatings3. Furthermore, the use of coiled coils yielded two very different cage structures, and diversity in cage structures will promote diversity in their applications.

Coiled coils are intertwined alpha helices that form elongated bundles, often with two to four helices in a bundle. These proteins are attractive because naturally occurring sequences and structures have been thoroughly characterized, and artificial sequences and structures have been rationally and computationally designed, so coiled coils can be repurposed in a modular and predictable manner4,5. This use of biological modularity mimics evolutionary diversification by repurposing components rather than generating new components from scratch. Moreover, it means that the structures of the new protein cages could be conceived using a pencil, paper and knowledge of modular protein building blocks, in contrast to previous approaches6–8 that have used computational methods to connect or assemble naturally occurring protein dimers and trimers (Table 1).

Despite using similar building blocks, Gradisar et al.1 and Fletcher et al.2 constructed two very different nanocage structures. Gradisar et al.1 built a monomeric tetrahedron (TET12; Fig. 1a) from a single polypeptide sequence in which 12 coiled-coil segments connected with flexible Ser-Gly-Pro-Gly linkers folded into a particle 10 nm in diameter with a small interior. In contrast, Fletcher et al.2 built a self-assembled cage-like particle (SAGE particle; Fig. 1b) by extending a previously published strategy that fused a trimer to a dimer for cage assembly6,7. In the new work, a homotrimeric coiled coil was fused to a heterodimeric coiled coil with disulfide linkers to assemble hexagonal repeats. Curvature in the two-dimensional array of repeats resulted in a three-dimensional particle 100 nm in diameter with a porous surface and large interior. The SAGE particle can be categorized as a stochastic assembly, flexible and irregular but having well-defined bulk properties, whereas the TET12 particle can be categorized as a deterministic assembly having a specific, atomically definable three-dimensional structure9.

What are the strengths and limitations of these two protein cage designs? The TET12 design innovatively uses protein-protein interactions for tertiary structure formation and folding rather than quaternary structure. Furthermore, it is made from a single polypeptide chain and does not require disulfide formation, making it easy to encode and fold inside of cells. However, this design architecture seems challenging to extend to additional polyhedra. The tetrahedron is the simplest polyhedron, and its six edges required 12 different coiled-coil segments. The cube and octahedron have 12 edges and would therefore require 24 different coiled-coil segments. Such designs are theoretically possible given the expansive toolkit of coiled-coil dimers, but the resulting large polypeptide would be prone to misfolding as a result of off-target coiled-coil interactions.

In comparison to TET12, the SAGE particle is simpler and seems easily extensible and amenable to sequence-structure manipulations. For example, tuning the affinity of the coiled-coil heterodimer resulted in tuning the size of the SAGE particle. In addition, homotrimeric and heterodimeric complexes other than coiled coils can be used as the building blocks6,7. Closure into a sphere was likely a serendipitous result requiring structural pliability in the hexagonal array, but this illustrates an important concept
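The edge-counting argument made for the TET12 architecture (each polyhedron edge is one coiled-coil pair, so a single-chain design needs two segments per edge) can be checked with a few lines of arithmetic. This is a minimal sketch rather than anything taken from the cited papers. The function names are hypothetical, and the edge counts are derived from Euler's polyhedron formula, V − E + F = 2, instead of being hard-coded.

```python
def edge_count(vertices: int, faces: int) -> int:
    """Edges of a convex polyhedron from Euler's formula V - E + F = 2."""
    return vertices + faces - 2


def segments_for_single_chain(edges: int) -> int:
    """Two coiled-coil segments pair up to form each edge."""
    return 2 * edges


# (vertices, faces) for the polyhedra mentioned in the text.
POLYHEDRA = {
    "tetrahedron": (4, 4),
    "cube": (8, 6),
    "octahedron": (6, 8),
}

for name, (v, f) in POLYHEDRA.items():
    e = edge_count(v, f)
    print(f"{name}: {e} edges -> {segments_for_single_chain(e)} coiled-coil segments")
```

The doubling from 12 to 24 segments is what makes the larger single-chain designs more prone to off-target pairing.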
