CRISPR-Cas Systems in Bacteria and Archaea: Versatile Small RNAs for Adaptive Defense and Regulation

Devaki Bhaya,1 Michelle Davison,1,2 and Rodolphe Barrangou3

1 Carnegie Institution for Science, Department of Plant Biology, Stanford, California 94305; email: [email protected]
2 Department of Biology, Stanford University, Stanford, California 94305; email: [email protected]
3 DANISCO, USA, Inc., Madison, Wisconsin 53716; email: [email protected]

Annu. Rev. Genet. 2011. 45:273–97

The Annual Review of Genetics is online at genet.annualreviews.org

This article’s doi: 10.1146/annurev-genet-110410-132430

Copyright © 2011 by Annual Reviews. All rights reserved

0066-4197/11/1201-0273$20.00

Keywords

coevolution, immunity, interference, RNAi, bacteriophage

Abstract

Bacteria and archaea have evolved defense and regulatory mechanisms to cope with various environmental stressors, including virus attack. This arsenal has been expanded by the recent discovery of the versatile CRISPR-Cas system, which has two novel features. First, the host can specifically incorporate short sequences from invading genetic elements (virus or plasmid) into a region of its genome that is distinguished by clustered regularly interspaced short palindromic repeats (CRISPRs). Second, when these sequences are transcribed and precisely processed into small RNAs, they guide a multifunctional protein complex (Cas proteins) to recognize and cleave incoming foreign genetic material. This adaptive immunity system, which uses a library of small noncoding RNAs as a potent weapon against fast-evolving viruses, is also used as a regulatory system by the host. Exciting breakthroughs in understanding the mechanisms of the CRISPR-Cas system and its potential for biotechnological applications and understanding evolutionary dynamics are discussed.


GE45CH13-Bhaya ARI 1 October 2011 14:42


Clustered regularly interspaced short palindromic repeats (CRISPR): hallmark of CRISPR-Cas systems

CRISPR-associated proteins (Cas): diverse types of proteins encoded by cas genes in the vicinity of CRISPRs

INTRODUCTION

Bacteria and archaea have evolved to cope and thrive as communities in dynamic environments that are stressful and fluctuating. These environmental stressors can be both abiotic (e.g., nonoptimal temperatures or nutrient levels, redox stress) and biotic (e.g., toxins, viruses, transmissible genetic elements) in nature. The abundant presence of viruses in almost all environments is a constant threat to the survival of bacteria and archaea (2, 87, 93, 107). Furthermore, viruses can have high rates of mutation and recombination, so a successful defense system has to be multilayered and have the ability to deal with variable and fast-evolving predators (2, 41, 66). In this context, the discovery of a novel adaptive defense system in bacteria and archaea has generated great interest and encouraged research into the molecular basis of the mechanism of action; consequently, there have been several recent breakthroughs. The CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated proteins) defense comprises a multistep process by which specific small fragments of foreign nucleic acids are first recognized as being nonself and incorporated into the host genome between short DNA repeats. Subsequently, these fragments or spacers, in conjunction with host Cas proteins, are used as a surveillance and adaptive immune system by which incoming foreign nucleic acids are recognized and destroyed or possibly silenced. The CRISPR-Cas system has primarily been investigated in its defensive role against foreign DNA (viruses and plasmids), but its versatile, modular architecture may allow it to play a regulatory role in host cells. The apparent parallels to the eukaryotic RNA interference (RNAi) system (see sidebar, CRISPRs and RNAi: Common Denominators and Differences) and the potential uses in biotechnology have also generated interest. The CRISPR-Cas system provides a unique opportunity to observe and model coevolution between host and virus in natural environments or in controlled settings because acquisition and immunity occur on short time scales and evidence of past genetic aggressions can be deduced in some cases. Finally, the ability to dynamically acquire foreign DNA and subsequently use it to fight off invading genetic material has elements of an acquired and heritable immunity system, reflecting a Lamarckian mode of evolution (64).

CRISPRS AND RNAI: COMMON DENOMINATORS AND DIFFERENCES

The hypothesis that CRISPR/Cas systems might be an adaptive immune system was based on in silico analyses that also hinted at an analogy to the eukaryotic RNA interference (RNAi) mechanism (74). Although there are common denominators between crRNA and RNAi-based mechanisms of action, there are also significant differences. The molecular commonalities between CRISPR and RNAi primarily consist of the fact that both are mediated by small noncoding RNAs, which in conjunction with a ribonucleoprotein complex target sequence-specific cleavage of nucleic acids. There are functional similarities between the proteins involved in the biogenesis of small interfering RNAs, as well as mechanistic and structural commonalities between the eukaryotic RNA-induced silencing complex (RISC) and the CRISPR-associated complex for antiviral defense (CASCADE). One notable difference is the fact that the primary target for CRISPR interference is dsDNA, although mRNA can be targeted by some CRISPR/Cas systems. Some key proteins do not have direct functional/structural equivalents, notably the eukaryotic Argonaute (AGO) and some of the CRISPR universal and “signature” proteins. Overall, although CRISPR is arguably most similar to PIWI-interacting RNAs (piRNAs), which protect genome integrity from parasites such as transposons (103), further studies that examine the molecular basis for CRISPR-based interference will likely expand the list of mechanistic idiosyncrasies and notable differences. Nevertheless, evidence of small RNA-based defense and regulatory systems in all three domains of life should foster research that transcends these boundaries.

We begin with a personal perspective of the discovery of the CRISPR-Cas system and then synthesize results in this fast-moving field, including an attempt to categorize the bewildering variety of Cas proteins and their functions. We focus on bacteria, although studies with archaea, most of which contain a CRISPR-Cas system, have provided important insights into its mode of action, diversity, and unique characteristics (for reviews with a focus on Archaea, see 32, 69, 99, 108). We present new evidence of the regulatory role of CRISPR-Cas systems in bacteria, which functions via small noncoding RNAs. The evolutionary implications of this rapidly evolving, heritable immune system in prokaryotes and the opportunity it affords to study the coevolution of viruses and hosts are discussed. We include a section on progress in engineering the CRISPR-Cas system for biotechnological and epidemiological applications. Several recent reviews have described the experiments that led to the discovery of this new defense system (5, 24, 49, 58, 104, 113) or have focused on the mechanism of action of the CRISPR-Cas system in bacteria (56, 76) or in archaea (32, 99, 108); others have examined the CRISPR-Cas system from an evolutionary perspective (63, 112) or placed it in the context of the burgeoning small RNA world (58).

Spacer: small, variable sequences (flanked by repeats) in the host genome that are acquired from foreign nucleic acids and play a role in defense, so the name is inaccurate but widely adopted

Bacteriophage: viruses that infect bacteria, also known as phage

MENAGE A TROIS: BIOINFORMATICS, BIOCHEMISTRY, AND BACTERIOPHAGES

Brief History of the CRISPR-Cas System

A timeline of unrelated observations made over the past twenty years provides a compelling example of the slow but satisfying path from hypotheses generated solely from sequence and genome context predictions (73, 80) to biochemical, structural, and genetic data that substantiated these initial ideas (11, 30). An unintended consequence of this trajectory has been the use of several confusing acronyms and synonyms [see Makarova (72) and Deveau et al. (24) for clarification]. Fifteen years elapsed between the initial report of the presence of DNA repeat arrays in the intergenic region adjacent to the alkaline phosphatase (iap) gene in Escherichia coli K12 (52) and the coining of the CRISPR acronym in 2002, following the observation that such arrays of repeats were common in bacteria and archaea (53, 54, 81). In the 1990s, bioinformatics-based tools allowed for easy identification of these highly conserved arrays of palindromic repeats as bacterial and archaeal genome sequencing/annotation projects greatly expanded (16, 47, 48, 59, 60, 62, 77, 81, 84, 98). In hindsight, the meager viral and plasmid sequence information available hindered the interpretation of the genetic content and potential function(s) of the CRISPRs and even today remains a limitation. Initial predictions based on bioinformatic analyses suggested involvement in chromosome partitioning (81) and DNA repair (73), and the bold conjecture was put forward by Makarova and colleagues that the CRISPR-Cas system might be a defense system akin to eukaryotic RNAi (74).

The year 2005 marked a turning point when three groups independently reported that the hypervariable spacers showed sequence homology to viruses (or bacteriophages) or plasmids and hypothesized that CRISPRs and associated proteins could play a role in immunity against transmissible genetic elements (13, 80, 90). Research groups working with genetically tractable bacterial systems and available host and/or viral genomes were soon able to provide experimental evidence. A report in 2007 documenting the ability of the CRISPR-Cas system to provide viral resistance (11) and a publication in 2008 showing the ability of CRISPRs to prevent plasmid transfer (75) provided the impetus to investigate the mechanism of action, leading to important new advances. Researchers also quickly realized the potential of these hypervariable and rapidly evolving genetic loci as a valuable tool for genotyping closely related pathogenic or environmentally relevant strains, and this is an area of active research.

THE CAST OF CHARACTERS

Novel Features of the CRISPR-Cas Defense System

The CRISPR-Cas defense system has the novel ability to incorporate short sequences of nonself genetic material known as spacers at specific locations within CRISPRs in the host genome. Spacers are transcribed and processed into small noncoding RNAs, which in conjunction with specific Cas protein complexes can bind to incoming foreign genetic material if there is a close or absolute sequence match between the small RNA and incoming nucleic acid. This sequence-specific recognition process culminates in destruction of the invading nucleic acid and requires several Cas proteins. The surveillance and attack process exploits previous exposure to a virus or plasmid to target incoming foreign DNA (or RNA). This provides the host heritable immunity to recently detected foreign DNA and hence has been termed an adaptive or acquired immune system. However, some obvious distinctions between the CRISPR-Cas system and the classical vertebrate immune response include the fact that the CRISPR-Cas system can readily acquire new spacers (or conversely, lose old spacers), and this time-resolved activity allows it to respond dynamically to a viral predator that is also evolving at high rates. Furthermore, spacer-derived immunity is inherited by daughter cells, reminiscent of a Lamarckian mode of evolution, which does not occur in eukaryotes.

CRISPR locus/array: region on genome or plasmid containing CRISPRs

Leader: 5′ end of crRNA preceding the first CRISPR repeat, which can contain long AT-rich tracts and include a promoter

Protospacer: sequence in foreign DNA which is acquired as a spacer into the host genome

Protospacer-associated motif (PAM): short conserved sequence in the immediate vicinity of a protospacer that is required for acquisition

pre-CRISPR RNA (pre-crRNA): full-length transcript produced from CRISPR array

CRISPR RNA (crRNA): small noncoding RNA produced by cleavage of pre-crRNA (also known as psiRNA or guide RNA)

Interference: the process by which incoming foreign DNA or RNA is targeted for destruction; also known as adaptive immunity

CASCADE (CRISPR-associated complex for antiviral defense): multisubunit protein complex required for interference

A functional CRISPR-Cas system has two distinguishable components required for activity (Figure 1). The first easily recognizable feature is the CRISPR locus/array located on the genome (either chromosome or plasmid), which contains the hypervariable spacers acquired from virus or plasmid DNA. The second feature is a diverse group of cas genes located in the vicinity of a CRISPR locus, which encode proteins (generically called Cas proteins) required for the multistep defense against invasive genetic elements.

The CRISPR-Cas defense process can be separated into either two or three stages (Figure 1). The first stage, referred to as adaptation (30, 76), immunization (49), or spacer acquisition (58, 113), involves the recognition and subsequent integration of spacers between two adjacent repeat units within the CRISPR locus. Spacers appear to be integrated primarily at one end (the leader end) of the CRISPR locus; thus, positional information represents a timeline of spacer acquisition events. To indicate the sequence on the viral genome that corresponds to a spacer, the term protospacer was coined (23). In several, but not all, cases, a very short stretch of conserved nucleotides in the immediate vicinity of the protospacer, referred to as the protospacer adjacent motif (PAM), or CRISPR motif, appears to be a recognition motif required for acquisition of the DNA fragment (24, 79) (Figure 2). This first phase minimally requires two nucleases, Cas1 and Cas2, both of which are universally present in genomes that have a functional CRISPR/Cas system and can be considered hallmarks of the system.
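The statement that positional information records a timeline of acquisition can be illustrated with a small sketch. The Python class below is a toy model under stated assumptions, not a description of the enzymatic mechanism: the class name `CrisprArray`, its methods, and any sequences passed to it are hypothetical, and the only behavior taken from the text is that new spacers enter at the leader end between repeat units.

```python
from dataclasses import dataclass, field


@dataclass
class CrisprArray:
    """Toy model of a CRISPR array (hypothetical helper, for illustration only)."""
    repeat: str
    # Index 0 is the leader-proximal position, i.e. the most recent spacer.
    spacers: list = field(default_factory=list)

    def acquire(self, protospacer):
        """Integrate a new spacer at the leader end of the array."""
        self.spacers.insert(0, protospacer)

    def acquisition_timeline(self):
        """Return spacers from oldest to newest, read off from position alone."""
        return list(reversed(self.spacers))

    def to_dna(self):
        """Render the array as repeat-spacer-repeat-...-repeat, leader end first."""
        units = [self.repeat]
        for spacer in self.spacers:
            units.extend([spacer, self.repeat])
        return "".join(units)
```

Reading `acquisition_timeline()` from a sequenced array is only as reliable as the assumption of strictly leader-end integration; the text notes that spacers appear to be integrated primarily, not exclusively, at that end.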

In the second stage, referred to as CRISPR expression, a primary transcript, or pre-CRISPR RNA (pre-crRNA), is transcribed from the CRISPR locus by RNA polymerase. Next, specific endoribonucleases cleave the pre-crRNAs into small CRISPR RNAs (crRNAs). Based on their function, these small RNAs have also been referred to as prokaryotic silencing RNAs (psiRNAs) (40, 74) or guide RNAs (14, 19). In the third and final stage, described as interference (24) or immunity (30), the crRNAs within a multiprotein complex [called CASCADE (CRISPR-associated complex for antiviral defense) in particular organisms, such as E. coli] can recognize and base-pair specifically with regions of incoming foreign DNA (or RNA) that have perfect (or almost perfect) complementarity (14). This initiates cleavage of the crRNA–foreign nucleic acid complex (30). On the other hand, if there are mismatches between the spacer and target DNA or if there are mutations in the PAM, then cleavage is not initiated. In this case, DNA is not targeted for attack, replication of the virus proceeds, and the host is not immune to virus attack (Figure 2). This leads to host lysis, and the released virus can attack other susceptible host cells. To operate as a defense system, all three phases (i.e., spacer acquisition, expression, and interference) must be functional, but it is important to note that each of these processes can work independently, both mechanistically and temporally.
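The interference rule described above (a matching spacer plus an intact PAM leads to cleavage; a mismatch or a mutated PAM does not) can be sketched as a simple string comparison. This is a minimal illustration under several assumptions that are not statements from the text: the function names are hypothetical, the PAM is taken to lie immediately 3′ of the protospacer on the scanned strand and to require an exact match, only one strand is scanned, and the tolerated number of mismatches is a free parameter. The spacer is compared directly with the strand that carries the protospacer, because the crRNA pairs with the complementary strand.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def is_targeted(spacer, target, pam, max_mismatches=0):
    """Return True if `target` contains a protospacer that matches `spacer`
    (within `max_mismatches`) and is immediately followed by `pam`.

    Assumptions (illustrative only): exact PAM match, PAM directly 3' of the
    protospacer, single strand scanned.
    """
    n, m = len(spacer), len(pam)
    for i in range(len(target) - n - m + 1):
        site = target[i:i + n]
        flank = target[i + n:i + n + m]
        if flank == pam and hamming(spacer, site) <= max_mismatches:
            return True  # cleavage initiated, host immune
    return False  # no cleavage, host not immune


# Spacer sequence as shown in Figure 2; the six nucleotides that follow the
# protospacer in the figure are used here as a stand-in PAM (an assumption).
SPACER = "AAGCAAGTTGATATATTTCTCTTTCTTTAT"
PAM = "AAGAAA"
VIRAL_DNA = "ACACTTTGCTACTGCATGCC" + SPACER + PAM + "CGTCGATTACGATCGGTA"
```

With these definitions, `is_targeted(SPACER, VIRAL_DNA, PAM)` reports a hit, whereas altering either the PAM or a single protospacer position in `VIRAL_DNA` removes it when `max_mismatches` is 0.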


[Figure 1 key: CRISPR array, leader, cas locus, repeat, spacer, transcription start, protospacer, PAM, plasmid DNA, virus DNA, Cas proteins, pre-crRNA, crRNA. Stages: 1, acquisition; 2, expression; 3, interference. Spacers are numbered 10 to 1.]

Figure 1
Features of the CRISPR-Cas adaptive immune system. Stage 1: CRISPR spacer acquisition. Specific fragments or protospacers (with an adjacent protospacer-associated motif; shown as red bar) of double-stranded DNA from a virus or plasmid are acquired at the leader end of a CRISPR array on host DNA. A CRISPR array consists of unique spacers (colored boxes; spacers are numbered sequentially with the most recently acquired spacer having the highest number) interspaced between repeats (black diamonds). Acquisition occurs by a process that minimally requires Cas1 and Cas2, encoded in the cas locus, usually located in the vicinity of the CRISPR array. Stage 2: CRISPR expression. Pre-CRISPR RNA (pre-crRNA) is transcribed from the leader region by RNA polymerase and further cleaved by Cas proteins into smaller crRNAs that contain a single spacer and a partial repeat (hairpin structures with colored spacers). Stage 3: CRISPR interference. crRNA containing a spacer that has a strong match to incoming foreign nucleic acid (plasmid or virus) initiates a cleavage event (shown by scissors); Cas proteins are required for this process. DNA cleavage interferes with virus replication or plasmid activity and imparts immunity to the host. Interference can be mechanistically and temporally separated from CRISPR acquisition and expression (depicted by wavy white bar across the cell). This figure is based on the CRISPR-Cas system in Streptococcus thermophilus, which represents a well-studied and relatively simple CRISPR-Cas system.


[Figure 2 panels]

(a) Viral DNA (protospacer with adjacent PAM)
5′–ACACTTTGCTACTGCATGCCAAGCAAGTTGATATATTTCTCTTTCTTTATAAGAAACGTCGATTACGATCGGTA–3′
3′–TGTGAAACGATGCCGTACGGTTCGTTCAACTATATAAAGAGAAAGAAATATTCTTTGCAGCTAATGCTAGCCAT–5′

(b) Host DNA (repeat, spacer, repeat); acquisition by Cas1, Cas2
5′–GTTTTTGTACTCTCAAGATTTAAGTAACTGTACAACAAGCAAGTTGATATATTTCTCTTTCTTTATGTTTTTGTACTCTCAAGATTTAAGTAACTGTACAAC–3′
3′–CAAAAACATGAGAGTTCTAAATTCATTGACATGTTGTTCGTTCAACTATATAAAGAGAAAGAAATACAAAAACATGAGAGTTCTAAATTCATTGACATGTTG–5′

(c) Pre-crRNA, transcribed by RNA polymerase and cleaved by endoribonucleases (e.g., Cas6e) into crRNA (ends labeled 5′ OH and 3′ P)
5′–GUUUUUGUACUCUCAAGAUUUAAGUAACUGUACAACAAGCAAGUUGAUAUAUUUCUCUUUCUUUAUGUUUUUGUACUCUCAAGAUUUAAGUAACUGUACAAC–3′

(d) 100% match (CASCADE + Cas3): foreign DNA cleaved, host immune
Host spacer: AAGCAAGTTGATATATTTCTCTTTCTTTAT
Viral DNA:   AAGCAAGTTGATATATTTCTCTTTCTTTATAAGAAA

(e) ≥1 mismatch (in or near PAM): foreign DNA not cleaved, host not immune
Host spacer: AAGCAAGTTGATATATTTCTCTTTCTTTAT
Viral DNA:   AAGCAAGTTGATATATTTCTCTTTATTGATTAATAAA

Figure 2
A closeup of the CRISPR-Cas system. Some of the basic steps in CRISPR-Cas defense are depicted here based on current understanding of the CRISPR Type I and II systems; other variations are shown in Figure 4. (a) Double-stranded viral DNA (gray) with a protospacer (green) and adjacent protospacer-associated motif (PAM, red). (b) Host DNA with the newly acquired spacer (blue) flanked by CRISPR repeats (black). Acquisition minimally requires Cas1 and Cas2. (c) RNA polymerase transcribes pre-crRNA (transcription start site not shown); orange arrows mark where cleavage of pre-CRISPR RNA (pre-crRNA) by endoribonucleases can occur to create mature crRNAs. This step is accomplished by different Cas proteins, depending on the particular CRISPR type. The putative secondary structure of crRNA is shown below, with a 5′ handle (black) followed by the spacer (blue) and a potential hairpin structure at the 3′ end (hairpin in black). (d) A perfect or almost perfect match between crRNA and foreign nucleic acid initiates a cleavage event within the protospacer. A number of Cas-encoded proteins, e.g., CASCADE and Cas3, are required for this process. (e) Mismatches (purple letters) between crRNA and foreign nucleic acid (either in the protospacer or PAM) prevent cleavage, and interference does not occur. Note that crRNA base-pairs with the complementary strand of target foreign DNA during the interference step (not shown here).
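The expression step in panels b and c (transcription of the repeat-spacer array, then cleavage within each repeat to leave a crRNA with a 5′ handle, the spacer, and a 3′ repeat-derived portion) can be mimicked with a few lines of Python. This is a sketch under assumptions, not the processing mechanism of any particular CRISPR type: the function names are hypothetical, cleavage is modeled as a single cut at a fixed offset in every repeat, and the 8-nucleotide handle length used in the example is an assumed value chosen for illustration. The repeat and spacer sequences are those shown in Figure 2.

```python
# Repeat and spacer as they appear in the host DNA of Figure 2 (panel b).
REPEAT = "GTTTTTGTACTCTCAAGATTTAAGTAACTGTACAAC"
SPACER = "AAGCAAGTTGATATATTTCTCTTTCTTTAT"


def transcribe(coding_strand):
    """Return the RNA with the same sequence as the DNA coding strand."""
    return coding_strand.upper().replace("T", "U")


def find_all(text, pattern):
    """Start positions of every occurrence of `pattern` in `text`."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def process_pre_crrna(pre_crrna, repeat_rna, handle_len):
    """Cut once inside every repeat, `handle_len` nt before its 3' end.

    Each returned fragment spans one cut site to the next, so it carries a
    5' handle (end of the upstream repeat), one spacer, and the first part
    of the downstream repeat. `handle_len` is an assumed parameter.
    """
    cut_offset = len(repeat_rna) - handle_len
    cuts = [p + cut_offset for p in find_all(pre_crrna, repeat_rna)]
    return [pre_crrna[a:b] for a, b in zip(cuts, cuts[1:])]


pre_crrna = transcribe(REPEAT + SPACER + REPEAT)
crrnas = process_pre_crrna(pre_crrna, transcribe(REPEAT), handle_len=8)
```

Because fragments are defined between consecutive cut sites, the material upstream of the first cut and downstream of the last cut is discarded, which is a simplification of the model rather than a claim about the fate of those RNA ends.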

CRISPRdb: CRISPR database; includes several tools to identify and analyze CRISPRs. Maintained by Université Paris Sud (http://crispr.u-psud.fr/)

CRISPI: database that identifies Cas proteins and CRISPRs; includes many features complementary to CRISPRdb (http://crispi.genouest.org/)

The CRISPR Locus

Although these loci were christened in several ways, the term CRISPR locus (or CRISPR array) has now been widely adopted and refers to a genomic region containing CRISPRs (24). Dedicated databases (CRISPRdb and CRISPI) (35, 94) that identify CRISPRs and Cas proteins on sequenced genomes indicate that they are present on most archaeal (∼90%) and many bacterial (∼50%) genomes or on resident plasmids. The number of spacers in a particular CRISPR locus can vary widely, from as few as one to several hundred (as many as 587 spacers at a specific CRISPR locus in the myxobacterium Haliangium ochraceum DSM 14365; NC_013440) (35). Among different species, the length of the repeat can vary from 21 bp to 48 bp, whereas spacers are typically between 26 bp and 72 bp (33, 35). The sequence of the repeat units in different CRISPR loci is not conserved, although there are partially conserved sequences such as a GTTTg/c motif at the 5′ end and a GAAAC motif at the 3′ end (24, 33, 54, 65). Because of the partially palindromic nature of the repeats, it was hypothesized that transcripts from these regions may form stable, highly conserved RNA secondary structures (65, 74), and recent structure/function data support this (see section entitled Interference).

Repeat-associated mysterious protein (RAMP): proteins with RNAse activity involved in crRNA processing

Type I, II, and III CRISPR-Cas system: new classification of CRISPR-Cas systems based on multiple criteria
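The repeat-spacer organization and the length ranges quoted above lend themselves to a short illustration. The sketch below pulls spacers out of a locus sequence when the repeat is already known and checks them against the reported ranges (repeats of 21 to 48 bp, spacers of 26 to 72 bp). It assumes exact, non-degenerate repeat copies, which real loci do not guarantee, and it is not how the databases named in the text detect CRISPRs; the function names, the second spacer, and the flanking bases are hypothetical.

```python
# Repeat and first spacer as shown in Figure 2; SPACER_B and the flanking
# bases are made-up placeholders for illustration.
REPEAT = "GTTTTTGTACTCTCAAGATTTAAGTAACTGTACAAC"
SPACER_A = "AAGCAAGTTGATATATTTCTCTTTCTTTAT"
SPACER_B = "ACGGTCATTGCAGGTTCAAGCCTAGGATCA"


def extract_spacers(locus, repeat):
    """Return the sequences lying between consecutive exact repeat copies."""
    spacers = []
    pos = locus.find(repeat)
    while pos != -1:
        nxt = locus.find(repeat, pos + len(repeat))
        if nxt == -1:
            break
        spacers.append(locus[pos + len(repeat):nxt])
        pos = nxt
    return spacers


def within_reported_ranges(repeat, spacers):
    """Check lengths against the ranges quoted in the text (in bp)."""
    return 21 <= len(repeat) <= 48 and all(26 <= len(s) <= 72 for s in spacers)


locus = "TTAACC" + REPEAT + SPACER_A + REPEAT + SPACER_B + REPEAT + "GG"
spacers = extract_spacers(locus, REPEAT)
```

A length check of this kind is only a sanity filter; the text describes the spacer range as typical rather than absolute.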

Although the majority of genomes contain a single CRISPR-Cas locus, there are several examples of bacterial or archaeal species that harbor multiple CRISPR loci on the chromosome. Some bacterial genomes contain as many as 13–15 CRISPR loci, and archaeal genomes also harbor numerous CRISPR loci, based on information available in CRISPRdb (35) and CRISPI (94). Not all CRISPR loci have adjoining cas genes; it is possible that only the subset of CRISPR loci that have adjacent cas genes are functionally active, whereas the others represent inactive loci, or that one set of Cas proteins suffices for the activity of related CRISPR loci in trans (25, 50). In cases where there are multiple CRISPR loci on a genome, the sequence of some of the CRISPR repeats may be identical or very similar, and these may utilize the same set of Cas proteins (44, 50, 65). CRISPR repeats have been classified into at least 12 groups, and there appears to be some correspondence between certain repeats and groups (or subtypes) of Cas proteins associated with them (65, 72).

Cas Proteins

CRISPR loci often have groups of conserved protein-encoding genes, named cas genes, in their vicinity (73). Based on computational analyses, Cas proteins were predicted to contain identifiable domains characteristic of helicases, nucleases, polymerases, and RNA-binding proteins, which led to the initial speculation that they may be part of a novel DNA repair system (73). The order, orientation, and groupings of cas genes appear to be extremely variable, and this picture grows ever more complex as the number of annotated genomes increases. Attempts to classify Cas proteins have been made, but this has proven difficult because of the diversity of the proteins involved (38, 72, 74). Initially, Jansen’s group identified four gene families, cas1–4 (53), which were then extended to include cas5 and cas6 (13, 38). Haft and colleagues (38) defined eight subtypes of Cas proteins based on the phylogeny of the highly conserved Cas1 protein and the operonic organization of cas genes, which were named after eight representative organisms that contained a single CRISPR-Cas locus [e.g., E. coli Cas proteins were designated cse1 (CRISPR system of E. coli gene 1); other subtypes included Aeropyrum (csa), Desulfovibrio (csd), Haloarcula (csh), Mycobacterium (csm), Neisseria (csn), Thermotoga (cst), and Yersinia (csy)].

These initial categories, although useful, cannot easily handle the relationships between homologous but distantly related Cas proteins, the extensive variability that exists in cas operons, or organisms that contain multiple CRISPR loci. In a new and unified classification system based on multiple criteria, including evolutionary relationships of conserved proteins and cas operon organization, several groups (72) working on CRISPR-Cas systems have proposed a consensus view that the CRISPR-Cas system can be divided into two partially independent subsystems. The first consists of an information processing module and requires the universally present core proteins, Cas1 and Cas2, which are involved in new spacer acquisition. The second, or executive, subsystem is required for processing of primary CRISPR transcripts (pre-crRNA) and recognition and degradation of invading foreign nucleic acid, and is quite diverse. For instance, in certain CRISPR subtypes, the multisubunit CASCADE is involved in the processing of the crRNA, whereas in other types a single multifunctional protein may play this role. In addition, there are several repeat-associated mysterious proteins (RAMPs) that constitute a large superfamily of Cas proteins. RAMPs contain at least one RNA recognition motif (RRM; also called the ferredoxin-fold domain), and some have been shown to be involved in pre-crRNA processing (14, 19, 26, 42). Based on this classification, which integrates phylogeny, sequence, locus organization, and content, three types have been distinguished: Type I, Type II, and Type III CRISPR-Cas systems (Table 1, Figure 3, Figure 4).
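As a rough illustration of how the three-type scheme can be applied to an annotated locus, the sketch below assigns a type from the signature genes listed in Table 1 (cas3 for Type I, cas9 for Type II, cas10 for Type III) and reports whether the universal cas1 and cas2 genes are present. It is a deliberate simplification: the published classification integrates phylogeny, sequence, and locus organization, none of which is modeled here, and the function name and return format are hypothetical.

```python
# Signature and universal genes as listed in Table 1.
SIGNATURE_GENES = {"cas3": "Type I", "cas9": "Type II", "cas10": "Type III"}
UNIVERSAL_GENES = {"cas1", "cas2"}


def classify_locus(gene_names):
    """Assign CRISPR-Cas type(s) from signature genes only (simplified)."""
    genes = {g.lower() for g in gene_names}
    types = sorted({SIGNATURE_GENES[g] for g in genes if g in SIGNATURE_GENES})
    return {
        "has_universal_core": UNIVERSAL_GENES <= genes,
        "types": types,
    }


# Gene content of the Type I-E operon of E. coli as listed in Figure 3.
ecoli_locus = ["cas3", "cse1", "cse2", "cas7", "cas5", "cas6e", "cas1", "cas2"]
result = classify_locus(ecoli_locus)
```

An empty `types` list does not mean that a locus is inactive; as noted earlier, CRISPR loci without adjoining cas genes may rely on Cas proteins encoded elsewhere in the genome.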


Table 1 Major Cas proteins (Cas1–Cas10). For other Cas proteins, please refer to Makarova (72). Blue, red, and purple designate universal, signature, and type-specific Cas proteins (as in Figure 3).

Protein | Distribution | COG | Process | Function
Cas1 | Universal | COG1518 | Spacer acquisition | DNAse, not sequence specific, can bind RNA; present in all Types; structure available for several Cas1 proteins
Cas2 | Universal | COG1343, COG3512 | Spacer acquisition | Small RNAse specific to U-rich regions; present in all Types; structure available from Thermus thermophilus and Sulfolobus solfataricus and others
Cas3 | Type I signature | COG1203, COG2254 | Target interference | DNA helicase; most proteins have a fusion to HD nuclease
Cas4 | Type I, II | COG1468 | Spacer acquisition | RecB-like nuclease with exonuclease activity homologous to RecB
Cas5 | Type I | COG1688, RAMP | crRNA expression | RAMP protein, endoribonuclease involved in crRNA biogenesis; part of CASCADE
Cas6 | Type I, III | COG1583, COG5551, RAMP | crRNA expression | RAMP protein, endoribonuclease involved in crRNA biogenesis; part of CASCADE; structure available from P. furiosus
Cas7 | Type I | COG1857, COG3649, RAMP | crRNA expression | RAMP protein, endoribonuclease involved in crRNA biogenesis; part of CASCADE
Cas8 | Type I | Not determined | crRNA expression | Large protein with McrA/HNH-nuclease domain and RuvC-like nuclease; part of CASCADE
Cas9 | Type II signature | COG3513 | Target interference | Large multidomain protein with McrA-HNH nuclease domain and RuvC-like nuclease domain; necessary for interference and target cleavage
Cas10 | Type III signature | COG1353 | crRNA expression and interference | HD nuclease domain, palm domain, Zn ribbon; some homologies with CASCADE elements

Type I CRISPR-Cas System

In addition to the presence of the conserved Cas1 and Cas2 proteins, Type I is defined by the ubiquitous presence of a signature protein, the Cas3 helicase/nuclease. Cas3 is a large multidomain protein with distinct DNA nuclease and helicase activities (102). In addition, there are multiple Cas proteins that form CASCADE-like complexes that are involved in the interference step (Figures 4 and 5). Many of these proteins are in distinct RAMP superfamilies (Cas5, Cas6, Cas7). Of the three systems, Type I is thus far the most diverse, with six different subtypes (Type I-A through Type I-F) (72). The Type I CRISPR system is believed to target DNA, and cleavage requires Cas3 [which has a histidine, aspartic acid (HD) nuclease domain] or Cas4, a RecB-family nuclease (102). The Type I CRISPR-Cas system in E. coli is one of the best characterized (Figures 4 and 5), and recent experiments using E. coli are described in later sections (14, 57). Multiple studies in Pseudomonas aeruginosa have also shed light on the Type I CRISPR-Cas mechanism of action (42, 118, 119). For DNA interference, CASCADE associates with processed crRNA to form a ribonucleoprotein complex that drives the formation of R-loops in invasive double-stranded DNA (dsDNA) (Figure 5) through seed sequence–driven base


Figure 3
[Operon diagrams with panels labeled Type I-E (Escherichia coli), Type II-B (Streptococcus thermophilus), and Type III-B (Pyrococcus furiosus); legend of protein types: universal, signature, type-dependent.]
Cas proteins in Type I, II, and III CRISPR-Cas systems. Archetypal Type I, II, and III systems are represented by the operon structure from Escherichia coli, Streptococcus thermophilus, and Pyrococcus furiosus, respectively. The universally present cas1 and cas2 genes required for acquisition are shown in blue. Signature genes for each type (Type I, cas3; Type II, cas9; and Type III, cas10) are shown in red. Type-dependent genes (i.e., cas4, 5, 6, 7) are in purple; cas8, which is not shown here, is found in Type I-A, I-B, and I-C. In Type III-B, cmr1, 3, 4, 5, 6 are type-dependent genes (Type III-A has a different set of type-dependent genes, several of which are repeat-associated mysterious proteins). Type I has six subtypes (Type I-A to I-F), and both Type II and III have two subtypes. Type-dependent proteins are typically involved in expression and/or interference; signature genes are involved in interference. However, there are exceptions to these categories; see Table 1 for Cas protein functions and Makarova et al. (72) for further details.

tracrRNA: Trans-encoded small RNA required for crRNA maturation

pairing. It was recently shown in E. coli (97) and P. aeruginosa (119) that CASCADE facilitates dsDNA target recognition by sequence-specific hybridization between crRNA and the target DNA over a 7–8 bp sequence (the seed sequence) at the 5′ end of the spacer (Figure 5).
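The seed requirement described above can be illustrated with a toy check. The following Python sketch is a deliberate simplification and not a model of CASCADE: `seed_matches` is a hypothetical helper, the spacer and protospacer are written as same-strand DNA strings read from the seed end, and the default seed length of 8 simply takes the upper end of the 7–8 bp range quoted above.

```python
def seed_matches(spacer: str, protospacer: str, seed_len: int = 8) -> bool:
    """Return True if the first seed_len positions of both strings are identical.

    Hypothetical helper for illustration only: both sequences are assumed to be
    same-strand DNA strings written starting at the seed end.
    """
    if seed_len <= 0:
        raise ValueError("seed_len must be positive")
    if len(spacer) < seed_len or len(protospacer) < seed_len:
        return False
    return spacer[:seed_len].upper() == protospacer[:seed_len].upper()


# Invented example sequences (not taken from any genome).
spacer = "ACGTTGCAGGATCCTTAGCA"
perfect = "ACGTTGCAGGATCCTTAGCA"
seed_mutant = "ACGATGCAGGATCCTTAGCA"    # mismatch at position 4, inside the seed
distal_mutant = "ACGTTGCAGGATCCTTAGCT"  # mismatch at the last position, outside the seed

print(seed_matches(spacer, perfect))        # True
print(seed_matches(spacer, seed_mutant))    # False
print(seed_matches(spacer, distal_mutant))  # True
```

A check of this kind only captures the idea that pairing is initiated at the seed; it says nothing about how the rest of the protospacer contributes to recognition.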

Type II CRISPR-Cas System

This system is typified by the Cas9 signature protein, a large multifunctional protein with the ability to generate crRNA, as well as to target phage and plasmid DNA for degradation (30). Cas9 appears to contain two nuclease domains: a RuvC-like nuclease at the N terminus and an HNH (McrA-like) nuclease domain in the middle section (which might be involved in target cleavage based on its endonuclease activity) (Table 1). Type II is the simplest of the three CRISPR-Cas types, with only four genes composing the operon (cas9, cas1, cas2, and either cas4 or csn2). There are two subtypes, Type II-A (or CASS4, which includes csn2) and Type II-B (or CASS4a, which includes cas4). The best-studied Type II system is that of Streptococcus thermophilus, which has been shown to provide defense against bacteriophage and plasmid DNA (11, 30). It was also recently established that a trans-encoded small CRISPR RNA (tracrRNA) is involved in the processing of pre-crRNA into crRNA in Type II systems through the formation of a duplex with the CRISPR repeat sequence (22). Mature crRNA, together with Cas9, interferes with matching invasive dsDNA by homology-driven cleavage within the protospacer sequence, in the direct vicinity of the PAM (30). Mismatches at the 3′ end of the protospacer and/or in the PAM allow phages and plasmids to circumvent CRISPR-encoded immunity (24, 30).


Figure 4
[Schematic with three columns (Type I, Type II, Type III) following transcription of the pre-crRNA through crRNA processing to cleavage of the target DNA or RNA.]
Model of Type I, II, and III CRISPR-Cas mechanism of action. Transcription of the primary transcript (pre-crRNA) by RNA polymerase is followed by the production of mature crRNAs. The interference process is different in the Type I, II, and III systems. In Type I, the multisubunit CASCADE binds pre-crRNA, which is cleaved by Cas6e in subtype I-E or by Cas6f in subtype I-F, to create crRNAs with a typical 8-nt extension or handle at the 5′ end, followed by the spacer (blue) and part of the repeat region, which can form a hairpin structure at the 3′ end. In Type II, a trans-encoded small RNA (tracrRNA) base-pairs with the repeat region and is cleaved by host RNase III. Additionally, Cas9 is required for this step and for subsequent maturation of the crRNA. In Type III, processing of crRNA requires Cas6, but the crRNAs appear to be transferred to a specific Cas complex (Cmr in subtype III-B and Csm in subtype III-A). In subtype III-B, the 3′ end of crRNA is further trimmed. The final step results in cleavage of targeted foreign nucleic acid and proceeds differently in all systems. In Type I, crRNA with CASCADE along with the Cas3 subunit can recognize (via the PAM, shown in red) complementary target DNA and is responsible for cleavage of target DNA. In Type II, Cas9 along with crRNA can probably target DNA for cleavage (open orange triangle) in a process that requires the PAM. Subtype III-A can target DNA, whereas subtype III-B can target RNA, and a PAM does not appear to be required for the activity of Type III systems. This figure is modified from Makarova et al. (72). Filled triangles represent nuclease activity that has been experimentally demonstrated; open triangles represent activity that has not yet been identified.

Type III CRISPR-Cas System

This system has a number of recognizable features, including the signature RAMP protein, Cas10, which is likely involved in the processing of crRNA and possibly also in target DNA cleavage (6), and is somewhat functionally analogous to the Type I CASCADE. The Type III system also contains the signature Cas6, involved in crRNA processing, and additional RAMP proteins likely to be involved in crRNA trimming. The universal cas1 and cas2 genes are mostly in operon-like structures with the rest of the cas genes but are not always in the same operon as the RAMP proteins in the Type III systems. So far, two Type III systems have been distinguished (Type III-A and III-B). In Pyrococcus furiosus, a Type III-B system, the target of CRISPR interference is mRNA (40), whereas in Staphylococcus epidermidis, a Type III-A system, the target is DNA (75). This highlights


Figure 5
[Panel a step labels, in order: crRNA-guided CASCADE recognition of PAM-protospacer region; CASCADE and crRNA disrupt DNA base-pairing (R-loop); Cas3 cleaves first strand of DNA; CASCADE displaced, second strand cleavage; double-strand break in DNA.]
Model of CASCADE function and involvement of seed sequences. (a) Initially, the CASCADE-crRNA complex identifies a protospacer-PAM region on target DNA. This promotes strand separation, and crRNA hybridizes to the complementary DNA strand, leading to R-loop formation. The single DNA strand in the R-loop structure is the target of Cas3 nuclease activity, resulting in single-strand breaks in the protospacer. Following cleavage of the first DNA strand, Cas3 helicase activity can possibly displace the CASCADE-crRNA complex and permit second DNA strand cleavage. This would result in both strands of target DNA being cleaved, and interference with virus replication or plasmid function. This model is adapted from Sinkunas et al. (102), but also incorporates information from Wiedenheft et al. (119), Jore et al. (57), and Semenova et al. (97). (b) Model of base pairing between crRNA spacer and target DNA that results in R-loop formation. The process is initiated at the seed sequence adjacent to the PAM and then propagated along the protospacer region in a 5′→3′ direction over the complete protospacer region (97, 119). (c) Structural model of CASCADE based on Jore et al. (57) showing the stoichiometry and the unusual seahorse architecture of the subunits of CASCADE, which includes CasA, B, C, D, and E.

the polymorphic nature of CRISPR-Cas systems, even within the Type III systems.

The distribution of the three CRISPR-Cas systems has some notable features, with the Type I system being found in both bacteria and archaea. In contrast, Type II is exclusively present in bacteria, whereas the Type III systems appear more commonly in archaea, although they are also found in bacteria (72, 108). Although no large-scale detailed distributional or functional analysis is yet available, there are several examples of species that contain more than one CRISPR-Cas type. Horizontal gene transfer (via plasmids that harbor CRISPR-Cas loci or by other gene transfer mechanisms such as transposon activity) has been implicated in the movement of CRISPR-Cas loci across widely diverged lineages (33, 50, 88).

A PLAY IN THREE ACTS: MECHANISM OF DEFENSE VIA CRISPR SPACER ACQUISITION, EXPRESSION, AND INTERFERENCE

Current understanding of the mechanism of action, which is divided into three phases (CRISPR spacer acquisition; CRISPR locus expression, including transcription and processing; and CRISPR activity or interference), is described below.

CRISPR Spacer Acquisition

Currently, the best-studied model organisms to investigate the mechanistic aspects of CRISPR acquisition include E. coli and P. aeruginosa (Type I), S. thermophilus (Type II)


and P. furiosus (Type III). The ability to acquire novel spacers has been experimentally shown in vivo in the S. thermophilus CRISPR1 and CRISPR3 loci, which represent Type II systems, and in Streptococcus mutans (114). In Streptococcus sp., during the natural generation of phage-resistant strains, one or more spacers were incorporated into the CRISPR loci (11, 114). These loci evolve via addition of new spacers at the leader end following exposure to lytic phages or plasmid transformation, and the spacers may be derived from both sense and antisense DNA strands (11, 23, 30). Concurrent internal addition and deletion of spacers appear to be rare events, whereas iterative additions of spacers increase both the level and spectrum of phage resistance in the host (11, 23). Internal deletions of repeat-spacer units likely occur via homologous recombination between CRISPR direct repeats that are physically close on the chromosome, as shown in the archaeon Sulfolobus sp. (37). When Sulfolobus sp. were challenged with various plasmid or viral genes, surviving mutants carried either partial or whole deletions of CRISPR loci (37). Intuitively, it seems disadvantageous and unlikely that CRISPR arrays grow ad infinitum, and the balance of polarized additions and internal deletions has been documented (23), although the mechanisms by which this occurs have not been identified. The greater likelihood of internal deletions occurring toward the trailer end, farther away from the leader end of a CRISPR spacer array, would preferentially delete spacers that target historically older phages. Conversely, this allows preferential conservation of spacers that provide immunity against contemporary phages. Internal deletions, which typically consist of the removal of several consecutive repeat-spacer units, have been observed in metagenomic studies (111) and when the spacer contents of multiple closely related strains spanning 11 genera were compared (50).
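The interplay of polarized addition and internal deletion described above can be pictured with a toy data structure. The sketch below is only an illustration under stated assumptions: a CRISPR array is represented as a Python list of spacer labels with the leader end at index 0, and the function names `acquire_spacer` and `delete_internal` are hypothetical. It makes no claim about the molecular mechanism, which the text notes is not yet identified.

```python
from typing import List


def acquire_spacer(array: List[str], new_spacer: str) -> List[str]:
    """Polarized addition: the new spacer is placed at the leader end (index 0)."""
    return [new_spacer] + array


def delete_internal(array: List[str], start: int, count: int) -> List[str]:
    """Remove `count` consecutive repeat-spacer units beginning at index `start`."""
    if start < 0 or count < 0 or start + count > len(array):
        raise ValueError("deletion must lie within the array")
    return array[:start] + array[start + count:]


# Successive exposures: phage_A is the oldest encounter, phage_E the most recent.
array: List[str] = []
for phage in ["phage_A", "phage_B", "phage_C", "phage_D", "phage_E"]:
    array = acquire_spacer(array, "spacer_vs_" + phage)

# The leader-proximal spacer records the most recent exposure.
print(array[0])   # spacer_vs_phage_E
print(array[-1])  # spacer_vs_phage_A

# A deletion of two consecutive units away from the leader end removes older spacers.
trimmed = delete_internal(array, start=2, count=2)
print(trimmed)
```

Because additions always occur at index 0, position in the list doubles as a relative timestamp, which is why deletions biased away from the leader end preferentially discard records of older encounters.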

Functionally, the process of spacer acquisition can be divided into distinct steps involving (a) recognition of the invasive nucleic acid and scanning of foreign DNA for potential PAMs that identify protospacers, (b) the generation of a new repeat-spacer unit by processing of the nucleic acid, and (c) the integration of the new CRISPR repeat-spacer unit at the leader end of the CRISPR locus. Of these processes, only the first step has been characterized, and the mechanism by which a new spacer is integrated into the host genome is very poorly understood. PAMs, also called CRISPR motifs (23), have been recognized in the direct vicinity of some protospacers (79). These conserved regions are short (typically only 2 to 5 nt long) and occur within 1 to 4 bp of the protospacer sequence, on either side, depending on the system (Figure 2). For the CRISPR1 and CRISPR3 systems of S. thermophilus, AGAAW and GGNG, respectively, have been identified as PAMs at the 3′ end of the protospacer (23, 50). Equivalent motifs were identified adjacent to protospacers of S. mutans, either downstream (NGG, NAA) or immediately upstream (TTC) of the protospacer (114). A similar TTC sequence immediately upstream of the protospacer was identified in Xanthomonas (96). In the archaeon Sulfolobus sp., a CC motif has been identified (69). Further bioinformatics analyses of CRISPR spacer-repeat units from a larger dataset of bacterial and archaeal species and the corresponding protospacers should provide a better understanding of the repertoire of PAMs, of whether specific PAMs are correlated with CRISPR repeats and/or particular Cas proteins, and of whether some CRISPR-Cas systems, such as some Type III systems, do not require PAMs. Currently, such an analysis is hampered by the limited range of host-virus systems under investigation and the lack of an extensive database of viral genomes.
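Step (a) above, scanning foreign DNA for PAMs that flag candidate protospacers, lends itself to a short bioinformatics sketch. The Python code below is a minimal illustration and not a published tool: `find_protospacers`, `spacer_len`, and `gap` are hypothetical names, the AGAAW motif is the S. thermophilus CRISPR1 PAM quoted in the text, and the protospacer length and the gap between protospacer and PAM are illustrative assumptions. Only the given strand is scanned, and the PAM is assumed to lie downstream (3′) of the protospacer.

```python
import re
from typing import List, Tuple

# Standard IUPAC nucleotide codes needed for the motifs quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}


def pam_to_regex(pam: str) -> str:
    """Translate an IUPAC motif such as AGAAW into a regular expression."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_protospacers(
    dna: str, pam: str = "AGAAW", spacer_len: int = 30, gap: int = 2
) -> List[Tuple[int, str, str]]:
    """Return (start, protospacer, pam_instance) for every PAM hit on this strand.

    The protospacer is taken as the `spacer_len` bases ending `gap` bases
    upstream of the PAM. Both values are assumptions made for illustration.
    """
    if spacer_len <= 0 or gap < 0:
        raise ValueError("spacer_len must be positive and gap non-negative")
    dna = dna.upper()
    # A lookahead lets overlapping PAM instances be reported as well.
    pattern = re.compile("(?=(" + pam_to_regex(pam) + "))")
    hits = []
    for match in pattern.finditer(dna):
        end = match.start() - gap
        start = end - spacer_len
        if start >= 0:
            hits.append((start, dna[start:end], match.group(1)))
    return hits


# Invented test sequence: 5 nt flank, 10 nt protospacer, 2 nt gap, PAM, 3 nt flank.
toy_dna = "CCCCC" + "ACGTACGTAC" + "TT" + "AGAAT" + "GGG"
hits = find_protospacers(toy_dna, pam="AGAAW", spacer_len=10, gap=2)
print(hits)
```

Swapping in another motif from the text (for example GGNG) only requires changing the `pam` argument; motifs located upstream of the protospacer, such as TTC, would need the coordinate arithmetic reversed.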

The Cas1 protein has repeatedly been linked with the acquisition and/or integration of novel spacers in the CRISPR locus (11, 14). This correlation is also consistent with the coevolutionary pattern and inherent functional linkage observed between Cas1 and specific CRISPR repeat sequences. The crystal structure of Cas1 is available from several organisms (Table 2), including P. aeruginosa, where studies have demonstrated that Cas1 is


Table 2 Available structures of Cas proteins

Protein | Function | Organism | PDB code (Gene)
Cas1 | Metal-dependent DNA-specific endonuclease | Pseudomonas aeruginosa | 3god (cas1)
Cas1 | Metal-dependent DNA-specific endonuclease | Thermotoga maritima | 3lfx (T1797)
Cas1 | Metal-dependent DNA-specific endonuclease | Aquifex aeolicus | 2yzs
Cas1 | Metal-dependent DNA-specific endonuclease | Pyrococcus horikoshii | 3pv9 (PH1245)
Cas1 | Metal-dependent DNA-specific endonuclease | Escherichia coli | 3nke (ygbT)
Cas2 | Metal-dependent ssRNA-specific endoribonuclease; ferredoxin fold | Sulfolobus solfataricus | 2i8e; 2ivy; 3exc (Sso1404)
Cas2 | Metal-dependent ssRNA-specific endoribonuclease; ferredoxin fold | Desulfovibrio vulgaris | 3oq2 (DvuCas2)
Cas2 | Metal-dependent ssRNA-specific endoribonuclease; ferredoxin fold | Pyrococcus furiosus | 2i0x (PF1117)
Cas2 | Metal-dependent ssRNA-specific endoribonuclease; ferredoxin fold | Thermus thermophilus | 1zpw (Tth1823)
Cas3 | Endonuclease (HD domain) | Thermus thermophilus | 3sk9 (TTHB187)
Cas3 | Endonuclease (HD domain) | Methanocaldococcus jannaschii | 3S4L
Cse2 (CasB) | Alpha helical protein, in CASCADE complex | Thermus thermophilus | 2zca (TTHB189)
Cas6 | Endoribonuclease that generates guide RNA | Pyrococcus furiosus | 3i4h, 3pkm (cas6)
Cas6e (CasE) | RNA-binding protein with ferredoxin fold, RAMP protein | Thermus thermophilus | 1wj9 (TTHB192)
Cas6f (Csy4) | Endoribonuclease processing crRNA | Pseudomonas aeruginosa | 2xli, 2xlj, 2xlk (csy4)
Cmr5 | Cmr complex, Type III-B | Thermus thermophilus | 2zop (ttCmr5)
Csm6/Csa3 | HTH-type transcriptional regulator | Sulfolobus solfataricus | 2wte (Sso1445)
Csa3 | HTH-type transcriptional regulator | Sulfolobus solfataricus | 3QYF (Sso1393)

a homodimeric, metal-dependent DNase that can process double-stranded DNA (dsDNA) to a size of ∼80 bp (42, 118). Cas1 also has the ability to interact with proteins involved in DNA recombination and repair, and can resolve Holliday junctions, implicating it in both CRISPR function and DNA repair (10). This is consistent with the proposed role of Cas1 in addition or removal of CRISPR repeat–spacer units and fits the initial prediction that Cas proteins were involved in DNA repair (73). The structure of Cas2 (which is often genetically associated with Cas1) from Sulfolobus solfataricus (12) and other organisms has also been resolved, and Cas2 was reported to have an RNA recognition domain and to exhibit endoribonuclease activity (Table 2). It has been implicated in new repeat-spacer acquisition and integration, although this requires further experimental validation. In addition to Cas1, it was shown that in S. thermophilus, Csn2 is necessary for the acquisition of novel spacers following exposure to phages (11) or plasmids (30).

CRISPR Locus Expression

We have differentiated expression into two steps: CRISPR locus transcription/regulation and crRNA processing, both of which are required for interference to occur.

CRISPR locus transcription and regulation. Transcription of a CRISPR locus into a primary transcript, or pre-crRNA, has been examined in a few organisms. These include the workhorse gram-negative bacterium E. coli (14, 89, 92), the plant pathogen Xanthomonas oryzae (89), the thermophilic bacterium T. thermophilus (3), and two archaeal species, P. furiosus (39) and the crenarchaeon Sulfolobus (69).


In E. coli, it appears that CRISPR loci are transcribed at constitutively low levels (89), but it is not yet known if this is generally true in most organisms. In T. thermophilus, CRISPR expression can be induced in the presence of a phage toward the peak of infection, near the beginning of host cell lysis. This is corroborated by proteomic characterizations in which peak amounts of Cas proteins are concurrent with phage proteins synthesized following phage challenge (70). Microarray analysis of the 12 CRISPR loci in T. thermophilus demonstrated a complex pattern of induction that was partly dependent on the small molecule cAMP in conjunction with the catabolite regulator protein (3, 101). In the few bacterial species that have been examined so far, unidirectional transcription occurs from the 5′ leader end, and promoters lie upstream (3, 92). In contrast, detailed transcript analyses of CRISPR loci from the archaeon Sulfolobus have shown that transcription initiates just upstream from the first repeat but that bidirectional transcription of the CRISPR locus also occurs (68, 69). Generally, cells appear to exhibit constitutive levels of Cas proteins and pre-crRNA, but under certain conditions these levels can be regulated, suggesting that cells can maintain background defensive monitoring for invasive nucleic acids while retaining the flexibility to mount a more concerted counterattack when necessary (see section titled The Expanding Repertoire).

crRNA processing. Once pre-crRNA has been transcribed, it is processed by endonucleolytic cleavage into smaller units that typically contain a single spacer flanked by partial CRISPR repeats. In the archaea P. furiosus (39) and Sulfolobus sp. (69), it has been demonstrated that fully processed crRNA units consist of a single spacer flanked by partial repeats on both sides. The cleavage of pre-crRNA occurs at the base of the hairpin formed by the palindromic CRISPR repeats, typically yielding a crRNA with an 8-nt tag or handle at the 5′ end and a less well-defined boundary at the 3′ end (14, 19, 42). In Pyrococcus furiosus, Cas6 is involved in the processing of pre-crRNA into crRNA units (19). In E. coli, the multimeric CASCADE complex, consisting of CasABCDE, processes pre-crRNA (14), whereas in P. aeruginosa the protein Csy4 is responsible for cleavage (42). Several studies are currently investigating the molecular basis for crRNA biogenesis across all three CRISPR-Cas system types, as well as the patterns that drive constitutive expression, regulated transcription, and overall abundance of crRNA in bacteria and archaea. Preliminary results suggest that CRISPR loci are constitutively expressed, can be induced by viral challenge, and often constitute quantitatively dominant amounts of small RNAs in the cell (22).
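The geometry of the cleavage described above (one cut per repeat, leaving an 8-nt repeat-derived handle at the 5′ end of each crRNA) can be sketched in a few lines of Python. This is a toy model under explicit assumptions: the repeat and spacer sequences are invented, `build_pre_crrna` and `process_pre_crrna` are hypothetical helpers, the repeat is assumed not to occur inside any spacer, and neither the hairpin structure nor any 3′ trimming is modeled.

```python
from typing import List


def build_pre_crrna(repeat: str, spacers: List[str]) -> str:
    """Assemble repeat-spacer-repeat-...-repeat as a single primary transcript."""
    return repeat + "".join(spacer + repeat for spacer in spacers)


def process_pre_crrna(pre_crrna: str, repeat: str, handle_len: int = 8) -> List[str]:
    """Cut once inside every repeat, handle_len bases before the repeat's 3' end.

    Each returned unit is: handle (last handle_len nt of the upstream repeat),
    then the spacer, then the remainder of the downstream repeat.
    """
    if not 0 < handle_len < len(repeat):
        raise ValueError("handle_len must be shorter than the repeat")
    cut_offset = len(repeat) - handle_len
    cuts = []
    pos = pre_crrna.find(repeat)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = pre_crrna.find(repeat, pos + len(repeat))
    # Fragments between consecutive cuts are the mature units; the two
    # terminal fragments (leader and trailer pieces) are discarded here.
    return [pre_crrna[a:b] for a, b in zip(cuts, cuts[1:])]


# Invented sequences for illustration only.
repeat = "GGGGCCCCAAAAUUUU"
spacers = ["ACACACAC", "UCUCUCUC"]

pre = build_pre_crrna(repeat, spacers)
units = process_pre_crrna(pre, repeat)
for unit in units:
    print(unit[:8], unit[8:-8], unit[-8:])  # handle, spacer, 3' repeat remnant
```

Each printed line separates the 5′ handle, the spacer, and the repeat-derived 3′ portion, matching the "single spacer flanked by partial repeats" layout described in the text.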

CRISPR Interference

The processed crRNA, together with specific Cas proteins, forms a CRISPR ribonucleoprotein (crRNP) complex that facilitates spacer base pairing to the target or matching invasive nucleic acid. The crRNA serves as a guide (hence the term guide RNA has also been used) to allow for specific base pairing between the exposed crRNA within the ribonucleoprotein interference complex and the corresponding protospacer on the foreign DNA (14, 86). It is likely that crRNA interacts directly with complementary sequences in the target. The unique occurrence of the PAM sequence on the invading foreign DNA (and conversely, its absence in the host spacer sequence) is likely to play a dual role: first, in spacer selection and acquisition and second, in the interference process for discrimination of self versus nonself, which highlights its importance. Indeed, it has been demonstrated that despite perfect matches between spacer and protospacer sequences, mutations in the PAM can circumvent CRISPR-encoded immunity (23, 30, 95). In archaea, PAMs are likely to be involved in crRNA-mediated targeting (37), which is consistent with crRNA-directed RNA cleavage reported in P. furiosus (19). Expanding the spectrum of identified PAMs will be an important step forward that requires greater sequence information about host spacers and corresponding


viral genomes. Given the current paucity of complete viral genomes and the fact that they evolve quickly, this is a challenge for future studies.

The CRISPR-Cas system is versatile and has the ability to interfere with foreign dsDNA and single-stranded (ss) mRNA. This is reflected by the bewildering diversity of Cas proteins and their enzymatic activities in various species (72) (Table 1). Most early evidence suggested that dsDNA was the primary target of CRISPR-encoded defense or immunity in bacteria, and in S. thermophilus, specific CRISPR spacers were found to match coding or template strands of dsDNA phages (11, 13, 23). Marraffini & Sontheimer (75) substantiated this observation when they demonstrated that inserting a self-splicing intron into a protospacer had an impact on CRISPR-encoded immunity in S. epidermidis. The CRISPR system prevented uptake of the native plasmid, whereas the intron-containing variant could be conjugated into the host (75). This was later confirmed in the S. thermophilus Type II system, with compelling biochemical evidence showing that dsDNA from phages and plasmids was directly cleaved by Csn1 (Cas9) in the vicinity of the PAM (30). On the other hand, it was also established that ssRNA was the primary target of the CRISPR-Cas system in the Type III system of the archaeon P. furiosus (40) and that mRNA was also the likely target in the archaeon Sulfolobus (32). Both these Type III CRISPR-Cas systems contain genes encoding RAMP proteins, suggesting that these systems can act on RNA. However, a large number of archaeal species contain more than one CRISPR-Cas system, which would expand the repertoire of targets that can be recognized by an organism and add more levels of regulatory control.

The strong link between the activity of various Cas proteins and the sequence of CRISPR repeats appears to be so specific that there is little apparent crosstalk between the different CRISPR-Cas systems; inactivation of a specific cas gene cannot be rescued by the activity of other Cas proteins. This is also consistent with observed coevolutionary patterns between the sequences of CRISPR repeats and cas genes (50), and appears to hold for both the acquisition of spacers and the interference process (11, 30). Thus, it seems likely that the diversity in Cas proteins is responsible for differential DNA versus RNA targeting and that some RAMPs might be specifically responsible for target RNA interference.

Initial experiments indicated that perfect sequence identity was required between spacer and protospacer sequences for CRISPR-encoded immunity to occur because the presence of a single nucleotide polymorphism in the protospacer or in the PAM sequence abrogated the defense response of the host (11, 23) (Figure 2). However, follow-up experiments in other systems have shown that in certain cases even several mismatches between spacer and protospacer still allowed for the immune response to occur (30, 96). Thus, in some cases, degeneracy in recognition can be tolerated, whereas in others, single nucleotide mutations in the virus genome can thwart CRISPR-encoded defense. The key appears to lie in the location of potential mismatches relative to the cleavage site. Mutations that are distant from the cleavage site do not impact activity, whereas mismatches occurring in the PAM or in the direct vicinity of the cleavage site have a strong impact (18, 30, 97). One can theorize that the short PAM sequences may be easily eroded by mutations; however, the rate at which this happens in the environment has not been tested. Preliminary results in S. thermophilus seem to indicate that mutations in phage genomes that circumvent CRISPR-encoded immunity may be costly given that the majority of mutations are either nonsynonymous or deleterious (23). Likewise, some circumstantial evidence points to independent acquisition of effective spacers targeting critical and highly conserved regions with stringent and sequence-dependent functionalities. Although PAM-dependent sampling of protospacers seems implicated in spacer selection, it remains to be determined whether there is strand specificity or other features that guide spacer selection and acquisition rates in different organisms.
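The position dependence of mismatches described above can be expressed as a simple rule of thumb in code. The sketch below is a toy classifier, not a validated predictor: `mismatch_positions` and `likely_escapes` are hypothetical functions, the set of "critical" positions is an assumption supplied by the caller (here the five PAM-proximal positions of an invented 20-nt protospacer), and the PAM pattern AGAA[AT] is the regular-expression form of the AGAAW motif mentioned earlier in the text.

```python
import re
from typing import List, Set


def mismatch_positions(spacer: str, protospacer: str) -> List[int]:
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(spacer) != len(protospacer):
        raise ValueError("sequences must have the same length")
    pairs = zip(spacer.upper(), protospacer.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def likely_escapes(
    spacer: str,
    protospacer: str,
    pam: str,
    pam_regex: str,
    critical_positions: Set[int],
) -> bool:
    """Toy rule: escape is flagged if the PAM is mutated or if any mismatch
    falls inside the caller-supplied set of critical positions."""
    pam_intact = re.fullmatch(pam_regex, pam.upper()) is not None
    critical_hit = any(
        p in critical_positions for p in mismatch_positions(spacer, protospacer)
    )
    return (not pam_intact) or critical_hit


# Invented sequences; positions 15-19 are ASSUMED to be the sensitive region.
spacer = "ACGTTGCAGGATCCTTAGCA"
critical = set(range(15, 20))
pam_pattern = "AGAA[AT]"

distal_mutant = "AGGTTGCAGGATCCTTAGCA"    # mismatch at position 1
proximal_mutant = "ACGTTGCAGGATCCTTAGGA"  # mismatch at position 18

print(likely_escapes(spacer, distal_mutant, "AGAAT", pam_pattern, critical))    # False
print(likely_escapes(spacer, proximal_mutant, "AGAAT", pam_pattern, critical))  # True
print(likely_escapes(spacer, spacer, "AGCAT", pam_pattern, critical))           # True
```

Which positions count as critical differs between systems, so the `critical_positions` argument is left to the user rather than hard-coded.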


THE EXPANDING REPERTOIRE: CRISPR-MEDIATED ACTIVITY IN REGULATION AND DEVELOPMENT

To date, the majority of research has focused on the novel ability of the CRISPR-Cas system to provide immunity against invasive genetic elements (virus or plasmid) via small noncoding RNAs that are derived from CRISPR loci. Yet, with the discovery of RNA interference (RNAi) in eukaryotes and the ever-expanding repertoire of roles of small noncoding RNAs in eukaryotes, it has become obvious that small RNAs can be multifunctional and versatile (29, 55, 61). Likewise, there is growing evidence of the diverse regulatory roles that small noncoding RNAs play in bacteria (34, 116). The flexibility and diversity of the CRISPR-Cas system as well as its proven ability to target RNA suggest that the system has the ability to regulate or silence transcript levels within the cell, although to date, such evidence is sparse. The presence of a transcribed CRISPR spacer that matches histidyl-tRNA synthetase was shown to contribute to attenuated histidyl-tRNA pools in the cell, with consequences for the synthesis of histidine-rich proteins (4), but it is unclear if this is a widely used regulatory mechanism in bacteria. A bioinformatics-based approach searching for evidence of the presence of self DNA in CRISPR arrays located a few examples, albeit at a very low frequency, which could represent errors in acquisition rather than a widely used regulatory system (106).

Recent reports indicate that the Cas1 protein (YgbT) of E. coli exhibits nuclease activity against single-stranded and branched DNAs, replication forks, and 5′ flaps, and interacts with RecB, RecC, and RuvB of the DNA repair system, suggesting that it may have a dual role, functioning both in the CRISPR-Cas defense system and in DNA repair (10). It was also shown in E. coli that the CRISPR-Cas system was triggered under specific conditions in which misfolded proteins accumulated in the membrane (86), so it was speculated that the CRISPR-Cas system could provide a defense against defective protein accumulation. It is unclear if this is a special case or whether this could hint at the potential use of the CRISPR-Cas system for a more widespread surveillance system in bacteria undergoing stress conditions (either via virus attack or because of other environmental stresses). Regulation of the multiple CRISPR-Cas systems in E. coli has also been investigated, and a recent study demonstrated that in E. coli K12 transcription from the casA and CRISPR I promoters is repressed by heat-stable nucleoid-structuring protein (H-NS) (92), which is a global repressor of transcription in many gram-negative bacteria. In a follow-up study, Westra et al. (117) provided experimental evidence that when LeuO (a LysR-type transcription factor) binds to two sites flanking the casA promoter and the H-NS nucleation site, it results in derepression. Thus, in E. coli H-NS and LeuO appear to be antagonistic regulators of CRISPR-based immunity (117), although the implications of this regulatory loop in the context of phage infections in natural environments await further study (25). In P. aeruginosa, the CRISPR loci appear to be involved in lysogeny-dependent inhibition of biofilm formation (18, 119), and CRISPR activity has also been implicated in swarming of myxobacteria (115), indicating that CRISPR-Cas systems may be involved in functions other than defense. We anticipate that there will be other examples of how this versatile system has been co-opted for other cellular functions as other organisms are examined.

ALL THE WORLD’S A STAGE: BIOTECHNOLOGICAL APPLICATIONS AND EVOLUTIONARY IMPLICATIONS OF THE CRISPR-CAS SYSTEM

Strain Typing and Epidemiological Studies

Incoming spacers are rapidly acquired at theleader end of the CRISPR array, so they provide


Spoligotyping:a modified form ofgenotyping based onstrain-dependenthybridization patternsof in vitro amplifiedDNA with multiplespaceroligonucleotides

a time-resolved window of spacer acquisition.Therefore, these hypervariable and rapidlyevolving genetic loci provide a valuable tool forgenotyping of strains, also known as spoligotyp-ing. This feature has been used for genotypingand epidemiological studies of pathogenic My-cobacterium tuberculosis (1, 15, 36, 121), Yersiniapestis (20, 90), Corynebacterium diphtheriae (82,83), P. aeruginosa (18), Legionella (21) Streptococ-cus pyogenes (48, 78), and Salmonella sp. (71), andfor industrially relevant organisms such as lac-tobacilli and streptococci (50, 51). CRISPR lociprovide the ability to segregate nearly identicalstrains over time or within clonal populations(8, 11, 51). The degree of spacer polymorphismin terms of both number of unique spacers andspacer arrangements usually correlates with theactivity level of a given locus; thus, in cases whenthere are multiple CRISPR loci in the genome,it is important to choose a genetically polymor-phic locus.
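As a rough illustration of how a spacer-based genotype can be represented, the following Python sketch encodes each strain as a presence/absence pattern over a fixed spacer panel, in the spirit of spoligotyping. The spacer labels, strain names, and the `spoligotype` and `hamming` helpers are hypothetical and chosen only for illustration; real typing schemes rely on hybridization or sequencing against defined spacer sets.

```python
from typing import Iterable, Sequence


def spoligotype(panel: Sequence[str], strain_spacers: Iterable[str]) -> str:
    """Return a binary string: '1' if the panel spacer is present in the strain, else '0'."""
    present = set(strain_spacers)
    return "".join("1" if spacer in present else "0" for spacer in panel)


def hamming(pattern_a: str, pattern_b: str) -> int:
    """Count panel positions at which two equal-length patterns differ."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must cover the same spacer panel")
    return sum(a != b for a, b in zip(pattern_a, pattern_b))


# Hypothetical spacer panel and strains (illustrative data only).
PANEL = ["sp1", "sp2", "sp3", "sp4", "sp5"]
STRAINS = {
    "strainA": ["sp1", "sp2", "sp3", "sp4", "sp5"],
    "strainB": ["sp1", "sp2", "sp4", "sp5"],  # lacks sp3
    "strainC": ["sp4", "sp5"],                # retains only two panel spacers
}

patterns = {name: spoligotype(PANEL, spacers) for name, spacers in STRAINS.items()}
```

With these toy data, `patterns` maps strainA, strainB, and strainC to `11111`, `11011`, and `00011`, and `hamming` gives a simple count of how many panel spacers distinguish two strains.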

The temporal and spatial hypervariability of CRISPR spacers can also be exploited to resolve population-level genotypes in complex samples where strain diversity is difficult to determine, as was shown in Leptospirillum population analyses in acid mine drainage acidophilic biofilm samples (8, 111). Similar observations were made in natural samples containing mixed and dynamic populations of Sulfolobus sp. (45, 46) and Synechococcus sp. (44), and also in human subjects, where streptococcal populations exposed to phage predation showed significant changes (91). Such approaches could monitor complex, dynamic systems over time to identify ancestral relationships or infer events such as blooms, selective sweeps, and bottlenecks. It is likely that active and hypervariable CRISPR loci will be increasingly used in complex metagenomic studies to genetically characterize microbial population content and dynamics (7, 105). As the understanding of the role viruses play in shaping bacterial and archaeal communities grows, we foresee that CRISPRs will not only assist in resolving population dynamics of the microbes, but also shed light on the coevolutionary dynamics between host and virus (7, 31, 44).
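One way to picture how spacer arrays might hint at ancestral relationships is to compare two arrays from the trailer (older) end, since new spacers are added at the leader end. The Python sketch below is a simplified illustration under stated assumptions: arrays are written leader end first, spacers are represented by hypothetical labels, and internal spacer loss or rearrangement is ignored. The helper names are illustrative and not taken from any published tool.

```python
from typing import List, Sequence


def shared_trailer_spacers(array_a: Sequence[str], array_b: Sequence[str]) -> List[str]:
    """Longest run of identical spacers counted from the trailer (oldest) end."""
    shared: List[str] = []
    for a, b in zip(reversed(array_a), reversed(array_b)):
        if a != b:
            break
        shared.append(a)
    shared.reverse()  # restore leader-to-trailer order
    return shared


def recent_acquisitions(array: Sequence[str], shared: Sequence[str]) -> List[str]:
    """Leader-end spacers that precede the shared trailer block."""
    return list(array[: len(array) - len(shared)])


# Hypothetical arrays, listed from leader end (newest) to trailer end (oldest).
strain_x = ["n3", "n2", "s3", "s2", "s1"]
strain_y = ["m1", "s3", "s2", "s1"]

ancestral = shared_trailer_spacers(strain_x, strain_y)
new_in_x = recent_acquisitions(strain_x, ancestral)
new_in_y = recent_acquisitions(strain_y, ancestral)
```

In this toy case the shared trailer block suggests a common history, while the leader-end spacers unique to each array suggest acquisitions after divergence. Real data would also need to account for spacer deletions, which this sketch deliberately leaves out.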

Microbial Populations, Population Dynamics, and the Modeling of the CRISPR-Associated Arms Race

CRISPR loci provide information about the exposure to foreign genetic elements and insight into the relationships between bacteria or archaea and their biotic environments. A key feature of the CRISPR defense system is that CRISPR loci can rapidly acquire novel spacers. Phages also have the ability to mutate their genomes, allowing them to circumvent the host CRISPR-encoded immunity system, which relies on close matches between spacers and incoming nucleic acid. Phages may escape CRISPR spacers by either mutating or deleting bases in the protospacer and/or the PAM (23, 30), or by shuffling sequences targeted by CRISPR spacers (8). These rapid evolutionary dynamics can provide important insights into genome evolution of both the host and phage populations (but see References 109, 110 for a different perspective). They set the stage for mathematical modeling of their evolutionary interplay (43) and short- or long-term experimental analyses of phage-host coevolution. These experiments can be carried out in closed, controlled laboratory conditions or in open environmental systems (67, 112).
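The escape routes described above can be sketched as a toy matching rule. The Python example below assumes, purely for illustration, that a target is recognized when a window of the phage sequence matches the spacer within a mismatch tolerance and is immediately followed by a PAM. The sequences, the two-base PAM, its position relative to the protospacer, and the `is_targeted` function are all hypothetical; actual PAM sequences, positions, and mismatch tolerance differ among CRISPR-Cas systems.

```python
def is_targeted(genome: str, spacer: str, pam: str, max_mismatches: int = 0) -> bool:
    """True if any window matches the spacer closely and is followed directly by the PAM."""
    n, m = len(spacer), len(pam)
    for start in range(len(genome) - n - m + 1):
        protospacer = genome[start:start + n]
        flank = genome[start + n:start + n + m]
        mismatches = sum(x != y for x, y in zip(protospacer, spacer))
        if mismatches <= max_mismatches and flank == pam:
            return True
    return False


# Toy sequences (hypothetical, far shorter than real spacers).
SPACER = "ACGTTGCAAC"
PAM = "GG"

WILD_TYPE = "TTT" + SPACER + PAM + "AAA"
PROTOSPACER_MUTANT = "TTT" + "ACGTAGCAAC" + PAM + "AAA"  # one base changed in the protospacer
PAM_MUTANT = "TTT" + SPACER + "GA" + "AAA"               # one base changed in the PAM

wild_type_hit = is_targeted(WILD_TYPE, SPACER, PAM)
protospacer_escape = not is_targeted(PROTOSPACER_MUTANT, SPACER, PAM)
pam_escape = not is_targeted(PAM_MUTANT, SPACER, PAM)
```

Under these assumptions, a single substitution in either the protospacer or the PAM is enough to evade the toy rule, which mirrors the qualitative point that small phage mutations can circumvent spacer-encoded immunity. Raising `max_mismatches` shows how a more tolerant matching rule would change which mutants escape.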

Natural Defense and Immunity Against Phages and Plasmids

The ability of the CRISPR-Cas system to target plasmids that contain antimicrobial resistance markers and to target sequences from antibiotic resistance genes has been documented (30). Consequently, the CRISPR-Cas system provides a natural means to develop strains refractory to the uptake of plasmids that carry undesirable genes. There is potential to develop CRISPRs in strains so as to preclude uptake and prevent dissemination of undesirable genetic elements such as prophages, antibiotic resistance markers, and pathogenicity islands (27, 85, 100). Indeed, given the negative correlation between the occurrence of CRISPR and acquired antibiotic resistance in multidrug-resistant enterococci, the CRISPR-Cas system appears able to mitigate the spread of mobile elements, notably plasmids and prophages, which account for up to 25% of the genome of pathogenic Enterococcus faecalis (85). This functionality could be exploited to reduce the dissemination of antimicrobial resistance genes and virulence factors in widely distributed bacterial pathogens, such as methicillin-resistant Staphylococcus aureus (76).

The negative effect of CRISPR-encoded immunity on plasmid occurrence and dissemination is apparent in the relative scarcity of plasmids in dairy S. thermophilus, as compared with Lactococcus lactis. Both species have comparable genome sizes and occupy identical environments where phages are a recurring problem, but the former relies primarily on chromosomally encoded CRISPR loci, whereas the latter is mostly dependent on CRISPR-independent strategies that are widely encoded on plasmids, such as restriction-modification systems and abortive infection systems (2, 24, 66). As the mechanistic understanding of CRISPR-Cas defense rapidly grows, it is important to place it in the context of other important host defense systems, such as abortive infection, toxin-antitoxin, and restriction-modification (66). Furthermore, the little-appreciated role of phage ecology (2) and the fledgling field of viral ecogenomics will need to be integrated into our understanding of the evolution of the CRISPR-Cas systems (9, 28, 87, 93).

Prospects for Developing CRISPR-Enabled Technologies

The potential to harness the natural ability of the CRISPR-Cas immune system to develop increased phage resistance in vitro provides an experimental framework to iteratively build up phage resistance for perennial use of valuable cultures and domesticated industrial microbes. Further, the unique spacer combination obtained through several consecutive rounds of CRISPR mutant screening can be seen as a natural genetic tag for valuable proprietary strains. Although there are several examples in the literature that highlight the functional features of the CRISPR-Cas defense system in a variety of organisms, it is important to note that the propensity of these systems for mutations, deletions, and loss of cas genes has resulted in loss of function in a large array of CRISPR loci (50). There is evidence that CRISPR-Cas systems can be moved by horizontal transfer and, conversely, that they can also be rapidly lost from (or reorganized in) an organism (33, 44, 88). There are examples of CRISPR loci located on plasmids, and transposons and insertion sequences are known to flank CRISPR loci (33, 44, 50), so these loci also represent compelling examples of various driving forces in the evolution of genomes, including horizontal gene transfer.

SUMMARY POINTS

1. Bacteria and archaea have evolved several mechanisms to deal with abundant, evolving virus populations and invasive genetic elements such as plasmids. The recently discovered CRISPR-Cas system utilizes exposure to foreign nucleic acids to subsequently target and destroy incoming related viruses or plasmids. This provides the host with heritable resistance, and hence it has been termed an adaptive immune response system.

2. This novel and widely occurring defense system can incorporate specific small fragments of foreign nucleic acids into the host genome between conserved short DNA repeats known as CRISPRs. Later, these fragments or spacers are transcribed and processed into small RNAs, which, in conjunction with Cas proteins, are used to recognize and destroy nonself nucleic acid.

3. CRISPR-Cas systems have been identified in most archaeal (∼90%) and many bacterial (∼50%) genomes and on resident plasmids. Most genomes contain a single CRISPR-Cas locus, but can also harbor multiple CRISPR loci.

4. The CRISPR-Cas system has recently been categorized into three types (Type I, II, and III) based on phylogeny, sequence, locus organization, and content of the CRISPRs and associated cas genes (which encode various DNases, RNases, and other proteins).

5. The CRISPR-Cas defense process can be operationally distinguished into three phases: spacer acquisition, CRISPR expression, and CRISPR interference, which are temporally separated. Each step requires one or more Cas proteins, and deciphering the role and structure of specific Cas proteins is an area of active research.

6. In addition to defense against incoming viruses or plasmids, CRISPRs may play a role in host regulatory and developmental processes.

7. The hypervariable CRISPR spacers have been effectively used in pathogen and environmental genotyping and in the future may have broader applications.

FUTURE ISSUES

1. Many mechanistic aspects of the defense system remain unclear. Of these, the details of the acquisition process and whether PAMs are universally required for recognition still need significant experimental input. What controls the size of CRISPR loci? Are specific proteins involved in orchestrating loss of CRISPR spacers, or is it a stochastic process? A clearer understanding of how both RNA and DNA can be targeted by the Cas proteins is work for the future.

2. A few well-developed model systems have been successfully used for the characterization of the basic mechanisms of CRISPR-Cas mediated defense, but appreciating the full scope of CRISPR-Cas mediated function will require exploring a wide range of genetically tractable model systems.

3. Several open questions pertain to the distribution of the CRISPR loci in varied environmental niches. For example, many genomes contain multiple functional CRISPR-Cas loci. What are the evolutionary driving forces for the concurrent maintenance of multiple functional systems? Do they provide additive immunity, and are CRISPR-Cas loci more common in environments with certain characteristics, e.g., high population sizes or high temperatures? Conversely, why do certain bacterial species lack this versatile and robust defense system? Some of these questions will benefit from targeted metagenomic studies as well as theoretical modeling approaches.

4. The features of CRISPR-Cas systems readily lend themselves to a broad range of applications, notably leveraging hypervariability for strain typing and epidemiological studies, resolving the content and dynamics of complex microbial communities, building natural defenses against viruses and plasmids, engineering immunity against undesirable genetic elements, and expanding the shelf life of industrial workhorses and other organisms being developed for biotechnological applications.

DISCLOSURE STATEMENT

R.B. is a coinventor on several patent applications related to the use of CRISPR for various applications.

ACKNOWLEDGMENTS

We thank Philippe Horvath for a critical reading of the manuscript and several members of the CRISPR community for discussing and sharing their data. D.B. and M.D. were supported by the National Science Foundation and the Carnegie Institution for Science, and R.B. by DANISCO.

LITERATURE CITED

1. Abadia E, Zhang J, dos Vultos T, Ritacco V, Kremer K, et al. 2010. Resolving lineage assignation on Mycobacterium tuberculosis clinical isolates classified by spoligotyping with a new high-throughput 3R SNPs based method. Infect. Genet. Evol. 10:1066–74

2. Abedon ST. 2009. Phage evolution and ecology. Adv. Appl. Microbiol. 67:1–45

3. Agari Y, Sakamoto K, Tamakoshi M, Oshima T, Kuramitsu S, Shinkai A. 2010. Transcription profile of Thermus thermophilus CRISPR systems after phage infection. J. Mol. Biol. 395:270–81

4. Aklujkar M, Lovley DR. 2010. Interference with histidyl-tRNA synthetase by a CRISPR spacer sequence as a factor in the evolution of Pelobacter carbinolicus. BMC Evol. Biol. 10:230

5. Al-Attar S, Westra ER, van der Oost J, Brouns SJ. 2011. Clustered regularly interspaced short palindromic repeats (CRISPRs): the hallmark of an ingenious antiviral defense mechanism in prokaryotes. Biol. Chem. 392:277–89

6. Anantharaman V, Iyer LM, Aravind L. 2010. Presence of a classical RRM-fold palm domain in Thg1-type 3′–5′ nucleic acid polymerases and the origin of the GGDEF and CRISPR polymerase domains. Biol. Direct 5:43

7. Anderson RE, Brazelton WJ, Baross JA. 2011. Using CRISPRs as a metagenomic tool to identify microbial hosts of a diffuse flow hydrothermal vent viral assemblage. FEMS Microbiol. Ecol. 77:120–33

8. Andersson AF, Banfield JF. 2008. Virus population dynamics and acquired virus resistance in natural microbial communities. Science 320:1047–50

9. Angly FE, Felts B, Breitbart M, Salamon P, Edwards RA, et al. 2006. The marine viromes of four oceanic regions. PLoS Biol. 4:e368

10. Babu M, Beloglazova N, Flick R, Graham C, Skarina T, et al. 2011. A dual function of the CRISPR-Cas system in bacterial antivirus immunity and DNA repair. Mol. Microbiol. 79:484–502

11. Barrangou R, Fremaux C, Deveau H, Richards M, Boyaval P, et al. 2007. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315:1709–12

12. Beloglazova N, Brown G, Zimmerman MD, Proudfoot M, Makarova KS, et al. 2008. A novel family of sequence-specific endoribonucleases associated with the clustered regularly interspaced short palindromic repeats. J. Biol. Chem. 283:20361–71

13. Bolotin A, Quinquis B, Sorokin A, Ehrlich SD. 2005. Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology 151:2551–61

14. Brouns SJ, Jore MM, Lundgren M, Westra ER, Slijkhuis RJ, et al. 2008. Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321:960–64

15. Brudey K, Driscoll JR, Rigouts L, Prodinger WM, Gori A, et al. 2006. Mycobacterium tuberculosis complex genetic diversity: mining the fourth international spoligotyping database (SpolDB4) for classification, population genetics and epidemiology. BMC Microbiol. 6:23

16. Bult CJ, White O, Olsen GJ, Zhou L, Fleischmann RD, et al. 1996. Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273:1058–73

17. Cady KC, White AS, Hammond JH, Abendroth MD, Karthikeyan RS, et al. 2011. Prevalence, conservation and functional analysis of Yersinia and Escherichia CRISPR regions in clinical Pseudomonas aeruginosa isolates. Microbiology 157:430–37

18. Cady KC, O’Toole GA. 2011. Non-identity targeting of Yersinia-subtype CRISPR-prophage interaction requires the Csy and Cas3 proteins. J. Bacteriol. 193:3433–45

19. Carte J, Wang R, Li H, Terns RM, Terns MP. 2008. Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes. Genes Dev. 22:3489–96

20. Cui Y, Li Y, Gorge O, Platonov ME, Yan Y, et al. 2008. Insight into microevolution of Yersinia pestis by clustered regularly interspaced short palindromic repeats. PLoS ONE 3:e2652

21. D’auria G, Jimenez-Hernandez N, Peris-Bondia F, Moya A, Latorre A. 2010. Legionella pneumophila pangenome reveals strain-specific virulence factors. BMC Genomics 11:181

22. Deltcheva E, Chylinski K, Sharma CM, Gonzales K, Chao Y, et al. 2011. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471:602–7

23. Deveau H, Barrangou R, Garneau JE, Labonte J, Fremaux C, et al. 2008. Phage response to CRISPR-encoded resistance in Streptococcus thermophilus. J. Bacteriol. 190:1390–400

24. Deveau H, Garneau JE, Moineau S. 2010. CRISPR/Cas system and its role in phage-bacteria interactions. Annu. Rev. Microbiol. 64:475–93

25. Diez-Villasenor C, Almendros C, Garcia-Martinez J, Mojica FJ. 2010. Diversity of CRISPR loci in Escherichia coli. Microbiology 156:1351–61

26. Ebihara A, Yao M, Masui R, Tanaka I, Yokoyama S, Kuramitsu S. 2006. Crystal structure of hypothetical protein TTHB192 from Thermus thermophilus HB8 reveals a new protein family with an RNA recognition motif-like domain. Protein Sci. 15:1494–99

27. Edgar R, Qimron U. 2010. The Escherichia coli CRISPR system protects from lambda lysogenization, lysogens, and prophage induction. J. Bacteriol. 192:6291–94

28. Edwards RA, Rohwer F. 2005. Viral metagenomics. Nat. Rev. Microbiol. 3:504–10

29. Farazi TA, Juranek SA, Tuschl T. 2008. The growing catalog of small RNAs and their association with distinct Argonaute/Piwi family members. Development 135:1201–14

30. Garneau JE, Dupuis ME, Villion M, Romero DA, Barrangou R, et al. 2010. The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 468:67–71

31. Garrett RA, Prangishvili D, Shah SA, Reuter M, Stetter KO, Peng X. 2010. Metagenomic analyses of novel viruses and plasmids from a cultured environmental sample of hyperthermophilic neutrophiles. Environ. Microbiol. 12:2918–30

32. Garrett RA, Shah SA, Vestergaard G, Deng L, Gudbergsdottir S, et al. 2011. CRISPR-based immune systems of the Sulfolobales: complexity and diversity. Biochem. Soc. Trans. 39:51–57

33. Godde JS, Bickerton A. 2006. The repetitive DNA elements called CRISPRs and their associated genes: evidence of horizontal transfer among prokaryotes. J. Mol. Evol. 62:718–29

34. Gottesman S, Storz G. 2010. Bacterial small RNA regulators: versatile roles and rapidly evolving variations. Cold Spring Harb. Perspect. Biol. doi: 10.1101/cshperspect.a003798

35. Grissa I, Vergnaud G, Pourcel C. 2007. CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res. 35:W52–57

36. Groenen PM, Bunschoten AE, van Soolingen D, van Embden JD. 1993. Nature of DNA polymorphism in the direct repeat cluster of Mycobacterium tuberculosis: application for strain differentiation by a novel typing method. Mol. Microbiol. 10:1057–65

37. Gudbergsdottir S, Deng L, Chen Z, Jensen JV, Jensen LR, et al. 2011. Dynamic properties of the Sulfolobus CRISPR/Cas and CRISPR/Cmr systems when challenged with vector-borne viral and plasmid genes and protospacers. Mol. Microbiol. 79:35–49

38. Haft DH, Selengut J, Mongodin EF, Nelson KE. 2005. A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput. Biol. 1:e60

39. Hale C, Kleppe K, Terns RM, Terns MP. 2008. Prokaryotic silencing (psi)RNAs in Pyrococcus furiosus. RNA 14:2572–79

40. Hale CR, Zhao P, Olson S, Duff MO, Graveley BR, et al. 2009. RNA-guided RNA cleavage by a CRISPR RNA–Cas protein complex. Cell 139:945–56

41. Hambly E, Suttle CA. 2005. The viriosphere, diversity, and genetic exchange within phage communities. Curr. Opin. Microbiol. 8:444–50

42. Haurwitz RE, Jinek M, Wiedenheft B, Zhou K, Doudna JA. 2010. Sequence- and structure-specific RNA processing by a CRISPR endonuclease. Science 329:1355–58

43. He J, Deem MW. 2010. Heterogeneous diversity of spacers within CRISPR (clustered regularly interspaced short palindromic repeats). Phys. Rev. Lett. 105:128102

44. Heidelberg JF, Nelson WC, Schoenfeld T, Bhaya D. 2009. Germ warfare in a microbial mat community: CRISPRs provide insights into the co-evolution of host and viral genomes. PLoS ONE 4:e4169

45. Held NL, Herrera A, Cadillo-Quiroz H, Whitaker RJ. 2010. CRISPR associated diversity within a population of Sulfolobus islandicus. PLoS ONE 5:e12988

46. Held NL, Whitaker RJ. 2009. Viral biogeography revealed by signatures in Sulfolobus islandicus genomes. Environ. Microbiol. 11:457–66

47. Hermans PW, van Soolingen D, Bik EM, de Haas PE, Dale JW, van Embden JD. 1991. Insertion element IS987 from Mycobacterium bovis BCG is located in a hot-spot integration region for insertion elements in Mycobacterium tuberculosis complex strains. Infect. Immun. 59:2695–705

48. Hoe N, Nakashima K, Grigsby D, Pan X, Dou SJ, et al. 1999. Rapid molecular genetic subtyping of serotype M1 group A Streptococcus strains. Emerg. Infect. Dis. 5:254–63

49. Horvath P, Barrangou R. 2010. CRISPR/Cas, the immune system of bacteria and archaea. Science 327:167–70

50. Horvath P, Coute-Monvoisin AC, Romero DA, Boyaval P, Fremaux C, Barrangou R. 2009. Comparative analysis of CRISPR loci in lactic acid bacteria genomes. Int. J. Food Microbiol. 131:62–70

51. Horvath P, Romero DA, Coute-Monvoisin AC, Richards M, Deveau H, et al. 2008. Diversity, activity, and evolution of CRISPR loci in Streptococcus thermophilus. J. Bacteriol. 190:1401–12

52. Ishino Y, Shinagawa H, Makino K, Amemura M, Nakata A. 1987. Nucleotide sequence of the iap gene, responsible for alkaline phosphatase isozyme conversion in Escherichia coli, and identification of the gene product. J. Bacteriol. 169:5429–33

53. Jansen R, Embden JD, Gaastra W, Schouls LM. 2002. Identification of genes that are associated with DNA repeats in prokaryotes. Mol. Microbiol. 43:1565–75

54. Jansen R, van Embden JD, Gaastra W, Schouls LM. 2002. Identification of a novel family of sequence repeats among prokaryotes. OMICS 6:23–33

55. Jinek M, Doudna JA. 2009. A three-dimensional view of the molecular machinery of RNA interference. Nature 457:405–12

56. Jore MM, Brouns SJ, van der Oost J. 2011. RNA in defense: CRISPRs protect prokaryotes against mobile genetic elements. Cold Spring Harb. Perspect. Biol. doi: 10.1101/cshperspect.a003657

57. Jore MM, Lundgren M, van Duijn E, Bultema JB, Westra ER, et al. 2011. Structural basis for CRISPR RNA-guided DNA recognition by CASCADE. Nat. Struct. Mol. Biol. 18:529–36

58. Karginov FV, Hannon GJ. 2010. The CRISPR system: small RNA-guided defense in bacteria and archaea. Mol. Cell 37:7–19

59. Kawarabayasi Y, Hino Y, Horikawa H, Yamazaki S, Haikawa Y, et al. 1999. Complete genome sequence of an aerobic hyper-thermophilic crenarchaeon, Aeropyrum pernix K1. DNA Res. 6:83–101, 145–52

60. Kawarabayasi Y, Sawada M, Horikawa H, Haikawa Y, Hino Y, et al. 1998. Complete sequence and gene organization of the genome of a hyper-thermophilic archaebacterium, Pyrococcus horikoshii OT3 (supplement). DNA Res. 5:147–55

61. Ketting RF. 2011. The many faces of RNAi. Dev. Cell 20:148–61

62. Klenk HP, Clayton RA, Tomb JF, White O, Nelson KE, et al. 1997. The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus. Nature 390:364–70

63. Koonin EV, Makarova KS. 2009. CRISPR-Cas: an adaptive immunity system in prokaryotes. F1000 Biol. Rep. 1:95

64. Koonin EV, Wolf YI. 2009. Is evolution Darwinian or/and Lamarckian? Biol. Direct 4:42

65. Kunin V, Sorek R, Hugenholtz P. 2007. Evolutionary conservation of sequence and secondary structures in CRISPR repeats. Genome Biol. 8:R61

66. Labrie SJ, Samson JE, Moineau S. 2010. Bacteriophage resistance mechanisms. Nat. Rev. Microbiol. 8:317–27

67. Levin BR. 2010. Nasty viruses, costly plasmids, population dynamics, and the conditions for establishing and maintaining CRISPR-mediated adaptive immunity in bacteria. PLoS Genet. 6:e1001171

68. Lillestol RK, Redder P, Garrett RA, Brugger K. 2006. A putative viral defence mechanism in archaeal cells. Archaea 2:59–72

69. Lillestol RK, Shah SA, Brugger K, Redder P, Phan H, et al. 2009. CRISPR families of the crenarchaeal genus Sulfolobus: bidirectional transcription and dynamic properties. Mol. Microbiol. 72:259–72

70. Lintner NG, Frankel KA, Tsutakawa SE, Alsbury DL, Copie V, et al. 2011. The structure of the CRISPR-associated protein Csa3 provides insight into the regulation of the CRISPR/Cas system. J. Mol. Biol. 405:939–55

71. Liu F, Barrangou R, Gerner-Smidt P, Ribot EM, Knabel SJ, et al. 2011. Novel virulence gene and clustered regularly interspaced short palindromic repeat (CRISPR) multilocus sequence typing scheme for subtyping of the major serovars of Salmonella enterica subsp. enterica. Appl. Environ. Microbiol. 77:1946–56

72. Makarova KS, Haft DH, Barrangou R, Brouns SJ, Charpentier E, et al. 2011. Evolution and classification of the CRISPR/Cas systems. Nat. Rev. Microbiol. 9:467–77

73. Makarova KS, Aravind L, Grishin NV, Rogozin IB, Koonin EV. 2002. A DNA repair system specific for thermophilic Archaea and bacteria predicted by genomic context analysis. Nucleic Acids Res. 30:482–96

74. Makarova KS, Grishin NV, Shabalina SA, Wolf YI, Koonin EV. 2006. A putative RNA-interference-based immune system in prokaryotes: computational analysis of the predicted enzymatic machinery, functional analogies with eukaryotic RNAi, and hypothetical mechanisms of action. Biol. Direct 1:7

75. Marraffini LA, Sontheimer EJ. 2008. CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA. Science 322:1843–45

76. Marraffini LA, Sontheimer EJ. 2010. CRISPR interference: RNA-directed adaptive immunity in bacteria and archaea. Nat. Rev. Genet. 11:181–90

77. Masepohl B, Gorlitz K, Bohme H. 1996. Long tandemly repeated repetitive (LTRR) sequences in the filamentous cyanobacterium Anabaena sp. PCC 7120. Biochim. Biophys. Acta 1307:26–30

78. McShan WM, Ferretti JJ, Karasawa T, Suvorov AN, Lin S, et al. 2008. Genome sequence of a nephritogenic and highly transformable M49 strain of Streptococcus pyogenes. J. Bacteriol. 190:7773–85

79. Mojica FJ, Diez-Villasenor C, Garcia-Martinez J, Almendros C. 2009. Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology 155:733–40

80. Mojica FJ, Diez-Villasenor C, Garcia-Martinez J, Soria E. 2005. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J. Mol. Evol. 60:174–82

81. Mojica FJ, Ferrer C, Juez G, Rodriguez-Valera F. 1995. Long stretches of short tandem repeats are present in the largest replicons of the Archaea Haloferax mediterranei and Haloferax volcanii and could be involved in replicon partitioning. Mol. Microbiol. 17:85–93

82. Mokrousov I, Limeschenko E, Vyazovaya A, Narvskaya O. 2007. Corynebacterium diphtheriae spoligotyping based on combined use of two CRISPR loci. Biotechnol. J. 2:901–6

83. Mokrousov I, Vyazovaya A, Kolodkina V, Limeschenko E, Titov L, Narvskaya O. 2009. Novel macroarray-based method of Corynebacterium diphtheriae genotyping: evaluation in a field study in Belarus. Eur. J. Clin. Microbiol. Infect. Dis. 28:701–3

84. Nelson KE, Clayton RA, Gill SR, Gwinn ML, Dodson RJ, et al. 1999. Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature 399:323–29

85. Palmer KL, Gilmore MS. 2010. Multidrug-resistant enterococci lack CRISPR-cas. mBio 1:e00227–10

86. Perez-Rodriguez R, Haitjema C, Huang Q, Nam KH, Bernardis S, et al. 2011. Envelope stress is a trigger of CRISPR RNA-mediated DNA silencing in Escherichia coli. Mol. Microbiol. 79:584–99

87. Pignatelli M, Aparicio G, Blanquer I, Hernandez V, Moya A, Tamames J. 2008. Metagenomics reveals our incomplete knowledge of global diversity. Bioinformatics 24:2124–25

88. Portillo MC, Gonzalez JM. 2009. CRISPR elements in the Thermococcales: evidence for associated horizontal gene transfer in Pyrococcus furiosus. J. Appl. Genet. 50:421–30

89. Pougach K, Semenova E, Bogdanova E, Datsenko KA, Djordjevic M, et al. 2010. Transcription, processing and function of CRISPR cassettes in Escherichia coli. Mol. Microbiol. 77:1367–79

90. Pourcel C, Salvignol G, Vergnaud G. 2005. CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology 151:653–63

91. Pride DT, Sun CL, Salzman J, Rao N, Loomer P, et al. 2011. Analysis of streptococcal CRISPRs from human saliva reveals substantial sequence diversity within and between subjects over time. Genome Res. 21:126–36

92. Pul U, Wurm R, Arslan Z, Geissen R, Hofmann N, Wagner R. 2010. Identification and characterizationof E. coli CRISPR-Cas promoters and their silencing by H-NS. Mol. Microbiol. 75:1495–512

93. Rohwer F. 2003. Global phage diversity. Cell 113:14194. Rousseau C, Gonnet M, Le Romancer M, Nicolas J. 2009. CRISPI: a CRISPR interactive database.

Bioinformatics 25:3317–1895. Sapranauskas R, Gasiunas G, Fremaux C, Barrangou R, Horvath P, Siksnys V. 2011. The Streptococcus

thermophilus CRISPR/Cas system provides plasmid immunity in Escherichia coli. Nucl. Acids Res. doi:10.1093/nar/gkr606

96. Semenova E, Nagornykh M, Pyatnitskiy M, Artamonova II, Severinov K. 2009. Analysis of CRISPRsystem function in plant pathogen Xanthomonas oryzae. FEMS Microbiol. Lett. 296:110–16

97. Semenova E, Jore MM, Datsenko KA, Semenova A, Westra ER, et al. 2011. Interference by clusteredregularly interspaced short palindromic repeat (CRISPR) RNA is governed by a seed sequence. Proc.Natl. Acad. Sci. USA 108:10098–103

98. Sensen CW, Charlebois RL, Chow C, Clausen IG, Curtis B, et al. 1998. Completing the sequence ofthe Sulfolobus solfataricus P2 genome. Extremophiles 2:305–12

99. Shah SA, Garrett RA. 2011. CRISPR/Cas and Cmr modules, mobility and evolution of adaptive immunesystems. Res. Microbiol. 162:27–38

100. Shimomura Y, Okumura K, Yamagata Murayama S, Yagi J, Ubukata K, et al. 2011. Complete genomesequencing and analysis of a Lancefield group G Streptococcus dysgalactiae subsp. equisimilis strain causingstreptococcal toxic shock syndrome (STSS). BMC Genomics 12:17

101. Shinkai A, Kira S, Nakagawa N, Kashihara A, Kuramitsu S, Yokoyama S. 2007. Transcription activationmediated by a cyclic AMP receptor protein from Thermus thermophilus HB8. J. Bacteriol. 189:3891–901

102. Sinkunas T, Gasiunas G, Fremaux C, Barrangou R, Horvath P, Siksnys V. 2011. Cas3 is a single-strandedDNA nuclease and ATP-dependent helicase in the CRISPR/Cas immune system. EMBO J. 30:1335–42

103. Siomi MC, Sato K, Pezic D, Aravin AA. 2011. PIWI-interacting small RNAs: the vanguard of genomedefence. Nat. Rev. Mol. Cell Biol. 12:246–58

104. Sorek R, Kunin V, Hugenholtz P. 2008. CRISPR: a widespread system that provides acquired resistance against phages in bacteria and archaea. Nat. Rev. Microbiol. 6:181–86

105. Sorokin VA, Gelfand MS, Artamonova II. 2010. Evolutionary dynamics of clustered irregularly interspaced short palindromic repeat systems in the ocean metagenome. Appl. Environ. Microbiol. 76:2136–44

106. Stern A, Keren L, Wurtzel O, Amitai G, Sorek R. 2010. Self-targeting by CRISPR: gene regulation or autoimmunity? Trends Genet. 26:335–40

107. Suttle CA. 2005. Viruses in the sea. Nature 437:356–61

108. Terns MP, Terns RM. 2011. CRISPR-based adaptive immune systems. Curr. Opin. Microbiol. 14:321–27

109. Touchon M, Charpentier S, Clermont O, Rocha EP, Denamur E, Branger C. 2011. CRISPR distribution within the Escherichia coli species is not suggestive of immune-associated diversifying selection. J. Bacteriol. 193:2460–67

110. Touchon M, Rocha EP. 2010. The small, slow and specialized CRISPR and anti-CRISPR of Escherichia and Salmonella. PLoS ONE 5:e11126

111. Tyson GW, Banfield JF. 2008. Rapidly evolving CRISPRs implicated in acquired resistance of microorganisms to viruses. Environ. Microbiol. 10:200–7

112. Vale PF, Little TJ. 2010. CRISPR-mediated phage resistance and the ghost of coevolution past. Proc. Biol. Sci. 277:2097–103

113. van der Oost J, Jore MM, Westra ER, Lundgren M, Brouns SJ. 2009. CRISPR-based adaptive and heritable immunity in prokaryotes. Trends Biochem. Sci. 34:401–7

114. van der Ploeg JR. 2009. Analysis of CRISPR in Streptococcus mutans suggests frequent occurrence of acquired immunity against infection by M102-like bacteriophages. Microbiology 155:1966–76

115. Viswanathan P, Murphy K, Julien B, Garza AG, Kroos L. 2007. Regulation of dev, an operon that includes genes essential for Myxococcus xanthus development and CRISPR-associated genes and repeats. J. Bacteriol. 189:3738–50

116. Waters LS, Storz G. 2009. Regulatory RNAs in bacteria. Cell 136:615–28

117. Westra ER, Pul U, Heidrich N, Jore MM, Lundgren M, et al. 2010. H-NS-mediated repression of CRISPR-based immunity in Escherichia coli K12 can be relieved by the transcription activator LeuO. Mol. Microbiol. 77:1380–93

118. Wiedenheft B, Zhou K, Jinek M, Coyle SM, Ma W, Doudna JA. 2009. Structural basis for DNase activity of a conserved protein implicated in CRISPR-mediated genome defense. Structure 17:904–12

119. Wiedenheft B, van Duijn E, Bultema J, Waghmare S, Zhou K, et al. 2011. RNA-guided complex from a bacterial immune system enhances target recognition through seed sequence interactions. Proc. Natl. Acad. Sci. USA 108:10092–97

120. Zegans ME, Wagner JC, Cady KC, Murphy DM, Hammond JH, O’Toole GA. 2009. Interaction between bacteriophage DMS3 and host CRISPR region inhibits group behaviors of Pseudomonas aeruginosa. J. Bacteriol. 191:210–19

121. Zhang J, Abadia E, Refregier G, Tafaj S, Boschiroli ML, et al. 2010. Mycobacterium tuberculosis complex CRISPR genotyping: improving efficiency, throughput and discriminative power of “spoligotyping” with new spacers and a microbead-based hybridization assay. J. Med. Microbiol. 59:285–94
