    Structural basis of viral RNA-dependent RNA polymerase catalysis and translocation

    Bo Shu(a,b) and Peng Gong(a,1)

    (a) Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, Hubei 430071, China; and (b) University of Chinese Academy of Sciences, Beijing 100049, China

    Edited by Thomas A. Steitz, Howard Hughes Medical Institute, Yale University, New Haven, CT, and approved May 19, 2016 (received for review February 15, 2016)

    Viral RNA-dependent RNA polymerases (RdRPs) play essential roles in viral genome replication and transcription. We previously reported several structural states of the poliovirus RdRP nucleotide addition cycle (NAC) that revealed a unique palm domain-based active site closure mechanism and proposed a six-state NAC model including a hypothetical state representing translocation intermediates. Using the RdRP from another human enterovirus, enterovirus 71, here we report seven RdRP elongation complex structures derived from a crystal lattice that allows three NAC events. These structures suggested a key order of events in initial NTP binding and NTP-induced active site closure and revealed a bona fide translocation intermediate featuring asymmetric movement of the template–product duplex. Our work provides essential missing links in understanding NTP recognition and translocation mechanisms in viral RdRPs and emphasizes the uniqueness of the viral RdRPs compared with other processive polymerases.

    RNA-dependent RNA polymerase | nucleotide addition cycle | translocation intermediate | enterovirus 71 | crystal structure

    In recent years, several notable emerging infectious diseases have been caused by RNA viruses, including highly pathogenic avian influenza viruses, Ebola virus, and Middle East respiratory syndrome coronavirus. RNA viruses are quite diverse in virus particle and genome structure and in virus entry and assembly mechanisms. However, they do share fundamental features in their genome replication and transcription, using a virally encoded RNA-dependent RNA polymerase (RdRP) to carry out the biosynthesis of an RNA product directed by an RNA template. Although the genome replication machinery often requires the participation of other factors, typically at the initiation phase of synthesis, the RdRP governs the elongation phase of synthesis that includes thousands of efficient nucleotide addition cycles (NACs). Viral RdRPs vary greatly in size and structural organization, from the ∼50-kDa picornavirus 3Dpol (1, 2), to the ∼100-kDa flavivirus NS5 that contains a naturally fused methyltransferase domain (3), to the ∼250-kDa nonsegmented negative-strand RNA virus L protein harboring at least three enzyme modules (4) and the ∼260-kDa three-subunit PA-PB1-PB2 influenza virus replicase complex (5). On the other hand, all RdRPs share a 50- to 70-kDa polymerase core that forms a unique encircled right-hand structure with palm, fingers, and thumb domains. Among the seven classic RdRP catalytic motifs, A–E are within the most conserved palm domain, and F and G are located in the fingers; they are all arranged similarly around the active site (6–9). The structural conservation of the RdRP polymerase core and the seven motifs forms the basis for understanding the common features in viral RdRP catalytic mechanism and for finding intervention strategies targeting these enzymes with possible broad-spectrum potential.

    As with other classes of nucleic acid polymerases, the viral RdRP elongation NAC comprises sequential steps of initial NTP binding, active site closure, catalysis, and translocation. In a recent study using CTP and deoxy-CTP analogs in the poliovirus (PV) RdRP elongation complex (EC) crystal-soaking experiments, the polymerase having a guanine base at the +1 template position was successfully trapped at different stages of a single NAC, leading to the proposal of a working NAC model featuring six reference states (10). The model starts with a state 1 (S1) complex with a vacant active site that is in the catalytically open conformation and upon NTP binding proceeds to a state 2 (S2) complex with the active site still in the open conformation. An important conformational change then takes place to position key catalytic residues and two magnesium ions around the priming nucleotide and the substrate NTP to achieve proper geometry of a closed active site for catalysis, yielding state 3 (S3) immediately before and state 4 (S4) immediately after the phosphoryl transfer reaction. As the catalytic geometry starts to disintegrate, the structural changes in the palm domain result in state 5 (S5) with an open conformation active site. State 6 (S6) is then a hypothetical translocation intermediate state that bridges the pretranslocation S5 and the posttranslocation S1 in the next NAC.

    The six-state model has provided a framework for understanding the molecular details and the unique features of the viral RdRP elongation NAC. In the active site closure step, the notable backbone conformational changes are limited to motifs A and D in the palm domain. These conformational changes are in drastic contrast to those taken by the well-characterized A-family polymerases that use a large-scale rotational movement of the O-helix–containing fingers domain to achieve the same closure step (11, 12). One implication behind this apparent difference in the active site closure mode is that the mechanisms by which the polymerase selects the correct NTP substrate and induces active site reorganization for catalysis are also quite different. In A-family polymerases, the translocation step is coupled to the postcatalysis reopening of the active site when a conserved tyrosine residue in the O-helix “pushes” the nascent base pair upstream in a motion that is the reverse of that observed during the active site closure (11, 13). However, no intermediate structure between the pre-

    Significance

    RNA viruses encode a unique class of RNA-dependent RNA polymerases (RdRPs) to carry out their fully RNA-based genome replication and transcription. Although the chemical nature of nucleotide addition is essentially shared by all nucleic acid polymerases, the structural and mechanistic details taken by each polymerase class differ to various extents. Here we report seven crystal structures of enterovirus 71 RdRP elongation complex at 2.5–2.8 Å resolution. In these structures the polymerases are poised at various distinct stages to reveal mechanistic details of initial NTP binding, key amino acid side-chain conformational switches during active site closure, and in particular the postcatalysis movement of the RNA duplex on the way to vacate the active site for the next nucleotide addition cycle.

    Author contributions: B.S. and P.G. designed research; B.S. performed research; B.S. and P.G. analyzed data; and B.S. and P.G. wrote the paper.

    The authors declare no conflict of interest.

    This article is a PNAS Direct Submission.

    Data deposition: Crystallography, atomic coordinates, and structure factors reported in this paper have been deposited in the Protein Data Bank, www.pdb.org (PDB ID codes 5F8G–5F8J, 5F8L, 5F8M, and 5F8N).

    (1) To whom correspondence should be addressed. Email: [email protected].

    www.pnas.org/cgi/doi/10.1073/pnas.1602591113 PNAS | Published online June 23, 2016 | E4005–E4014

    BIOCHEMISTRY | PNAS PLUS

    and posttranslocation states to refine the translocation process further has been captured in A-family polymerases by crystallography. Without an O-helix counterpart, viral RdRPs likely have established unique conserved components to control translocation, and their ECs may experience metastable intermediates that provide valuable details for identifying translocation-related protein components and the details of RNA movement during translocation.

    In this study, we obtained a crystal form of the enterovirus 71 (EV71) RdRP EC that allows multiple nucleotide incorporations in NTP-soaking trials. By using natural NTP substrate combinations and controlling the incubation time, we were able to capture EC species that provide previously unidentified mechanistic details for initial NTP binding, active site closure, and, in particular, the RNA motion during translocation that shows an asymmetric movement of the two strands in the template–product duplex.
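
    As an illustrative aside, the six reference states of the NAC model summarized above can be written down as a small cyclic state machine. This is only a sketch of the bookkeeping, not part of the reported analysis; the class and function names (`NacState`, `next_state`, `walk_cycle`) are hypothetical, and the one-line state descriptions paraphrase the text.

```python
from enum import Enum


class NacState(Enum):
    """Reference states of the six-state nucleotide addition cycle (NAC)."""
    S1 = "open active site, no NTP bound (posttranslocation)"
    S2 = "open active site, NTP bound"
    S3 = "closed active site, immediately before phosphoryl transfer"
    S4 = "closed active site, immediately after phosphoryl transfer"
    S5 = "open active site, pretranslocation"
    S6 = "translocation intermediate"


def next_state(state: NacState) -> NacState:
    """Return the following reference state; S6 leads to S1 of the next cycle."""
    order = list(NacState)  # Enum members iterate in definition order
    return order[(order.index(state) + 1) % len(order)]


def walk_cycle(start: NacState = NacState.S1) -> list:
    """Names of the states visited in one full cycle, ending back at the start."""
    names = [start.name]
    current = next_state(start)
    while current is not start:
        names.append(current.name)
        current = next_state(current)
    names.append(start.name)
    return names


print(" -> ".join(walk_cycle()))
```

    Intermediate species such as S2/3 or S4/5 lie between two neighboring members of this sequence rather than adding new members to it.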

    Results

    A Unique RdRP EC Lattice That Is Capable of in Situ Elongation for Multiple NACs. Recently we developed an RNA-mediated crystallization strategy that was highly effective in crystallizing picornaviral RdRP ECs (14). By providing a GC or GU sequence overhanging the upstream end of the template–product duplex to facilitate inter-EC contacts, an EC dimer typically becomes the minimal crystallizing unit, with the two upstream RNA duplexes interacting in the middle and two polymerases facing away from each other. These RNA–RNA interactions play important roles in crystallization but also impose steric constraints for in-crystal soaking experiments designed to go through multiple NACs. By attempting to crystallize EV71 RdRP EC using a combination of RdRP from different viral genotypes and RNA with different lengths of template–product duplex, we have obtained a picornaviral RdRP EC crystal form within which ECs are no longer organized as dimers. Instead, the upstream duplex points toward a spacious solvent channel (Fig. 1A) that may allow multiple incorporation and translocation events to occur in NTP-soaking experiments.

    The EC in this crystal was obtained after incorporating a (GA)3 hexa-nucleotide sequence into an RdRP–RNA binary complex containing an 8-bp template–primer duplex. The remaining “GGACCU . . .” template sequence was designed to direct subsequent in-crystal elongation, as may be allowed by this particular lattice (Fig. 1B). When CTP, UTP, and GTP are provided, the EC is expected to incorporate five nucleotides (i.e., CCUGG). However, even with overnight incubations, the EC incorporated three nucleotides (i.e., CCU) at most, indicating a threshold for the crystal lattice to accommodate growth of the template–product duplex. We then explored using this crystal lattice as a platform for capturing important EC states within three consecutive NACs. By combining an incubation time scanning strategy (15, 16) and the use of different NTP combinations in the native EC-soaking trials, we obtained seven representative EC structures distributed in the first and the third NACs (Fig. 1B and Tables 1 and 2). The resulting crystal lattices were nearly isomorphous based on unit cell dimensions and intercomplex packing modes (Tables 1 and 2 and Fig. 1C). This observation is drastically different from the output of the soaking experiments using a PV RdRP EC crystal that also allows multiple nucleotide incorporation (14); in that lattice (named “PV_r5”) every translocation event resulted in obvious translational movement between RdRPs within the aforementioned EC dimer as the growing RNA duplexes collided with each other.
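
    The expected extent of elongation for a given NTP mix follows from simple template-directed base pairing, which the sketch below makes explicit. It is a toy illustration under stated assumptions: the template bases are taken in the order the polymerase reads them (as the quoted “GGACCU” sequence implies), only Watson-Crick pairs are considered, and the lattice-imposed limit of three incorporations is deliberately not modeled. The function name `expected_product` is hypothetical.

```python
# Watson-Crick partner incorporated opposite each RNA template base.
PAIR = {"G": "C", "C": "G", "A": "U", "U": "A"}


def expected_product(template: str, ntps: set) -> str:
    """Nucleotides added before the first template position whose
    complementary NTP is missing from the soaking solution."""
    added = []
    for base in template:
        partner = PAIR[base]
        if partner not in ntps:
            break  # elongation stalls: the required NTP was not supplied
        added.append(partner)
    return "".join(added)


# Template bases listed in the order they are read by the polymerase.
print(expected_product("GGACCU", {"C", "U", "G"}))  # CCUGG
print(expected_product("GGACCU", {"C", "U"}))       # CCU
print(expected_product("GGACCU", {"C"}))            # CC
```

    The first call reproduces the five-nucleotide CCUGG product expected from base pairing alone; the crystal itself stopped at CCU, which is why the text attributes the limit to the lattice rather than to the template.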

    A Plausible Order of Events During Initial NTP Binding and Active Site Closure. The isomorphous feature of the EV71 RdRP EC lattice and the fact that the EC can enter the third NAC mean the three consecutive NACs occurred in an environment free of constraints brought by lattice variation, providing a valid platform for time-resolved NTP-soaking experiments. We tested CTP soaking with various incubation times and obtained three representative states in the first NAC (denoted “C1” for “cycle 1”; Table 1). In the prior PV RdRP EC work, a 2′,3′-dideoxy CTP (ddCTP)-derived structure showed clear density for the entire ddCTP molecule, whereas the RdRP conformation remained essentially unchanged around the active site. This structure was assigned as the reference state 2 (S2) to represent a fully open conformation active site with a bound NTP. However, such a state was not observed in any of the CTP-soaked EV71 RdRP EC C1 structures. If the active site conformation remained fully open, only medium-level density for the CTP base moiety and weak-level density for the ribose were evident (C1S1/2 structure, CTP not modeled; Fig. 2 A and B). This observation indicates that the template–NTP substrate base-pairing appears to be sampled before the rearrangement of NTP ribose, triphosphate, and the surrounding active site motifs toward the in-line catalytic geometry. Defined density next to residue D330 (the second aspartate in the RdRP hallmark motif C sequence XGDD) with an interacting distance of 2.1 Å to a side-chain oxygen indicated a bound magnesium ion (“metal A” or MeA according to consensus nomenclature) that is required for subsequent catalysis.

    Fig. 1. A unique EV71 RdRP EC crystal lattice allows multiple nucleotide-incorporation events. (A) 2Fo-Fc electron density map (contoured at 1.5 σ with a radius of 30 Å, in cyan) around the upstream end of one EC (polymerase in green, RNA strands in red and blue) shows a spacious channel to accommodate the growth of RNA duplex. The black dot indicates the center of the map, and symmetry-related neighboring ECs are labeled as Sym1/2/3. (B) RNA sequence flanking the active site of the native EC (C1S1) and those of the other six complexes derived from the native EC in crystal-soaking trials. The template is in cyan, and product is in green. The black box indicates the nucleotides incorporated during soaking. “C” and “S” in complex names stand for “cycle” and “state,” respectively, and the subscript numbers reflect the assigned cycle/state numbers. (C) The isomorphous nature of the EC lattices. C1S1/2 and C3S6 structures (bold-faced in B) were chosen as representatives to indicate the very limited lattice alteration upon three rounds of nucleotide incorporation. The superimposed polymerases (using the traditional least-squares method) are on the right side with the C1S1/2 complex taking a coloring scheme indicated by individual parts of the EC and the C3S6 complex in black; their symmetry-related neighboring ECs are colored in orange (C1S1/2) and purple (C3S6).

    If clear triphosphate density was visible, then key conformational changes around the CTP ribose 2′ and 3′ hydroxyl groups had already occurred, but the active site had not yet fully closed, leading to an intermediate state we call “S2/3” (C1S2/3 structure; Fig. 2A). The conserved motif A residue D238 experienced a hallmark side-chain rotamer change to accommodate the ribose hydroxyls and established a hydrogen-bonding network with the ribose and motif B residue S289 (Fig. 2C). As in the C1S1/2 structure, metal A still resided several ångströms away from its catalytic position, whereas metal B was observed within coordination-distance range of the CTP phosphates and the motif C residue D329. D329 (the first aspartate in sequence XGDD) is one of two universally conserved aspartic acid residues in polymerases following the two-metal-ion catalytic mechanism (17). However, this structure is not in the fully closed catalytically competent state because the other universal aspartic acid residue, located at position 233 in motif A, is not in place. As characterized in the PV RdRP EC study, D233 features the most notable backbone movement during active site closure to achieve coordination with both metal ions. The capture of EV71 RdRP in a partially closed state, S2/3, provides further evidence that the likely order of events during active site closure is as originally suggested by the PV study (10). In such a proposal, the precise placement of ribose 2′ and 3′ hydroxyls triggers the reorganization around the ribose, including residue D238, which in turn induces the movement of the D233 region within the same motif for metal ion coordination and catalysis.

    A fully closed postcatalysis state in the EV71 structure was also obtained (C1S4 structure; Fig. 2A) in which the details of the active site were essentially identical to the S4 structures seen in the PV study. Relative to their location in the C1S2/3 structure, the D233 side-chain carboxylate group moved about 3.8 Å and rotated ∼120°, and metal A moved 5.5 Å to be coordinated simultaneously by D233, D329, and D330 in the catalytically competent conformation (Fig. 2C).

    Postcatalysis Complexes in Two NACs Provide the NTP Selection Details by +1 Template Purine Bases. Many DNA-dependent polymerase ECs contain a preinsertion site (the “E-site” where “E” stands for “entry”) in which the +1 template nucleotide is poised to allow initial NTP binding (13, 18–20). This site allows the nascent base pair to form without establishing the stacking interactions with the −1 base pair. To achieve active site closure, an NTP repositioning step needs to take place to move the NTP into the insertion site (the “A-site” where “A” stands for “addition”), and this repositioning usually is accompanied by relatively large conformational changes in the vicinity of the active site (11–13, 18). These features in general permit the NTP selection process to occur at two distinct sites that have different sets of interactions, possibly improving nucleotide selection fidelity. In contrast, in viral RdRP ECs the +1 template nucleotide is prestacked on the −1 base pair, and as a result the initial NTP binding is to a site nearly identical to the catalytic insertion site. Active site closure involves only limited backbone shifts in motifs A and D with several hallmark side-chain rotamer changes as mentioned above. The NTP selection by the RdRP EC therefore is structurally less complicated, because the initial NTP binding and subsequent catalysis occur in very similar protein environments. In addition to the C1S1 native structure and C1S4 structure that show the details of CTP selection with a guanine base at the template +1 position (denoted “+1G:C”), we obtained C3S1 and C3S4/5 structures showing UTP selection with a +1 adenine base (+1A:U) (Fig. 3). When comparing the NTP-free S1 structures to the NMP-incorporated S4 and S4/5 structures by a maximum likelihood superimposition of the polymerase molecules (21), the

    Table 1. X-ray diffraction data collection and structure refinement statistics (set 1)

    NAC state* (PDB ID code)  C1S1 (5F8G)             C1S1/2 (5F8H)           C1S2/3 (5F8I)           C1S4 (5F8J)

    Data collection†
    Space group               P212121                 P212121                 P212121                 P212121
    Cell dimensions
      a, b, c, Å              63.1, 77.6, 153.6       63.3, 77.1, 149.9       63.7, 77.3, 149.7       63.4, 76.6, 149.3
      α, β, γ, °              90, 90, 90              90, 90, 90              90, 90, 90              90, 90, 90
    Resolution, Å‡            60.0–2.78 (2.88–2.78)   60.0–2.45 (2.54–2.45)   60.0–2.50 (2.59–2.50)   60.0–2.66 (2.76–2.66)
    Rmerge                    0.117 (0.50)            0.062 (0.51)            0.060 (0.48)            0.073 (0.50)
    I/σI                      13.6 (3.2)              21.4 (3.3)              20.1 (2.9)              17.7 (2.5)
    Completeness, %           97.6 (98.4)             99.7 (99.9)             99.6 (100.0)            98.1 (96.4)
    Redundancy                5.4 (5.5)               4.7 (4.8)               4.3 (4.3)               4.9 (4.9)

    Refinement
    Resolution, Å             2.78                    2.45                    2.50                    2.66
    No. reflections           19,258                  27,772                  26,095                  20,430
    Rwork/Rfree§, %           19.3/24.2               19.8/24.4               19.9/24.4               18.3/22.8
    No. atoms
      Protein/RNA             3,681/471               3,681/471               3,681/471               3,685/491
      Ligand/ion/water        –/1/115                 5/2/141                 40/3/144                25/3/111
    B-factors
      Protein/RNA             42.6/53.2               52.3/60.8               57.5/72.2               50.4/62.8
      Ligand/ion/water        –/18.6/38.9             36.3/44.4/49.8          53.0/53.2/55.5          43.6/39.1/48.1
    RMSD
      Bond lengths, Å         0.008                   0.008                   0.008                   0.008
      Bond angles, °          1.13                    1.13                    1.12                    1.17
    Ramachandran statistics¶  91.2/8.3/0.2/0.2        93.1/6.6/0.0/0.2        92.9/6.9/0.0/0.2        92.6/6.9/0.2/0.2

    *Coding scheme: C, cycle; S, state; subscript numbers reflect the cycle and state numbers; x/(x+1) indicates an NAC species between state x and the next reference state. Soaking strategy: C1S1/2: CTP for 4 min; C1S2/3: CTP for 4 min, and then transfer to UTP for 10 min; C1S4: CTP for 5 min 10 s.
    †One crystal was used for data collection for each structure.
    ‡Values in parentheses are for highest-resolution shell.
    §5% of data are taken for the Rfree set, and the same Rfree set is applied for all structures.
    ¶Values are in percentage and are for most favored, additionally allowed, generously allowed, and disallowed regions in Ramachandran plots, respectively.
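
    The complex-naming scheme described in the table footnote (cycle number, state number, and an optional x/(x+1) form for species between two reference states) is regular enough to parse mechanically. The following is a minimal sketch for readers who want to sort or group the seven structures programmatically; the names `NacSpecies` and `parse_complex_name` are hypothetical and not taken from the paper.

```python
import re
from typing import NamedTuple, Optional


class NacSpecies(NamedTuple):
    cycle: int                 # which nucleotide addition cycle (C)
    state: int                 # reference state x (S)
    next_state: Optional[int]  # x+1 if the species lies between two states


def parse_complex_name(name: str) -> NacSpecies:
    """Parse names such as 'C1S4' or 'C1S2/3' into cycle and state numbers."""
    match = re.fullmatch(r"C(\d+)S([1-6])(?:/([1-6]))?", name)
    if match is None:
        raise ValueError(f"not a C<cycle>S<state> name: {name!r}")
    cycle, state = int(match.group(1)), int(match.group(2))
    nxt = int(match.group(3)) if match.group(3) else None
    if nxt is not None and nxt != state + 1:
        raise ValueError(f"intermediate must have the form x/(x+1): {name!r}")
    return NacSpecies(cycle, state, nxt)


for label in ["C1S1", "C1S1/2", "C1S2/3", "C1S4", "C3S1", "C3S4/5", "C3S6"]:
    print(label, parse_complex_name(label))
```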

    placement of the +1 template base was subjected only to very subtle movement toward the major groove side (Fig. 3A). The polymerase active site conformation is essentially identical in the two product structures, with key residues D238, D233, S289, G290, and R174 following the same conformational switches for both NTPs. These residues, together with motif F residue K159 sitting on the major groove side and residue I176 that stacks onto the template base from the downstream side, define a compact substrate pocket for the final precatalysis fidelity checkpoint. The 2′ and 3′ hydroxyls of the nascent NMP ribose were precisely placed toward D238, with the positional deviation of both hydroxyl oxygen atoms in C1S4 and C3S4/5 complexes being within 0.2 Å. This observation suggests that the establishment of the interaction network around the NTP ribose hydroxyls is a key determinant for active site closure and likely precedes and triggers the structural rearrangements around the catalytic metal ions.

    The two postcatalysis structures reported here are nearly identical in polymerase active site conformation. However, only the C1S4 structure contains all 12 coordination partners for the two metal ions, with 11 of these found within Mg2+ coordination distances (Fig. 3B). Therefore, we consider C1S4 a bona fide postcatalysis S4 structure. In contrast, the coordination geometry has begun to disintegrate in the C3S4/5 structure, and it therefore should be considered an intermediate between S4 and S5, even if the protein active site conformation remains closed. Note that two water molecules participate in Mg2+ coordination in the C1S4 structure. The water molecule coordinating metal A is about 3.5 Å away from the 3′ hydroxyl oxygen. In the norovirus RdRP-RNA-CTP crystal structure, a structurally equivalent water molecule coordinates the metal A (a Mn2+ in that case) and was suggested to serve as the general base in the proposed two-proton transfer mechanism (22, 23). The water molecule coordinating metal B in the C1S4 structure is about 3.0–3.1 Å away from two phosphate oxygen atoms of the pyrophosphate, and a structurally equivalent water molecule has not been observed in RdRP structures. Along with the previously proposed motif D lysine (K359 in PV RdRP) (24) and motif F arginine (R174 in PV RdRP) (10) residues, this water molecule also may be a candidate to protonate the β-phosphate of the substrate NTP as a general acid.
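
    Counting how many candidate ligands lie within coordination distance of a metal ion, as done above for the two Mg2+ ions, reduces to a distance filter. The sketch below shows the idea only. The coordinates are toy values invented for illustration and are not taken from any deposited structure; the 2.5 Å cutoff is likewise an assumption, chosen to leave some tolerance around the roughly 2.1 Å Mg–O contact mentioned earlier, and `ligands_within` is a hypothetical helper name.

```python
import math

# Assumed cutoff (Å) for treating an atom as an inner-sphere Mg2+ ligand.
CUTOFF = 2.5


def ligands_within(metal, atoms, cutoff=CUTOFF):
    """Labels of atoms whose distance to the metal is at most `cutoff` Å."""
    return [label for label, xyz in atoms.items()
            if math.dist(metal, xyz) <= cutoff]


# Toy coordinates in Å, invented for illustration; not from any PDB entry.
metal_a = (0.0, 0.0, 0.0)
candidates = {
    "carboxylate O (toy)": (2.1, 0.0, 0.0),
    "water O (toy)":       (0.0, 2.2, 0.0),
    "distant O (toy)":     (0.0, 0.0, 3.4),
}

print(ligands_within(metal_a, candidates))
```

    With real data, the coordinates would come from the deposited models, and the count for each metal could then be compared between the C1S4 and C3S4/5 complexes.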

    A Previously Unidentified Translocation Intermediate Suggests an Asymmetric Movement of Template–Product RNA Duplex. Translocation has the net effect of changing the footprint of the polymerase on nucleic acid by one nucleotide register and represents an important postcatalysis event that completes the NAC. Local, if not major, conformational changes must take place to achieve translocation, and higher-energy intermediates may exist between the pre- and posttranslocation states. However, such translocation intermediates often are difficult to capture by experimental biology because of their short-lived nature. Compared with in-solution approaches, in-crystal polymerase synthesis may increase the possibility of capturing such intermediates by providing extra constraints through the packing environment of each polymerase. Indeed, the time scale for a single NAC in the current lattice is on the order of minutes (Tables 1 and 2), at least 100-fold slower than that measured in the solution studies in PV RdRP (25, 26). Ideally, the EV71 RdRP EC lattice reported here could allow the capture of such intermediates in all three NACs. To date, we have not successfully obtained any translocation intermediate within the first or the second NAC, even after numerous

    Table 2. X-ray diffraction data collection and structure refinement statistics (set 2)

    NAC state* (PDB ID code)      C3S1 (5F8L)             C3S4/5 (5F8M)           C3S6 (5F8N)

    Data collection†
      Space group                 P212121                 P212121                 P212121
      Cell dimensions
        a, b, c, Å                62.3, 76.7, 151.2       63.6, 76.7, 150.1       63.7, 77.6, 151.4
        α, β, γ, °                90, 90, 90              90, 90, 90              90, 90, 90
      Resolution, Å‡              60.0–2.81 (2.91–2.81)   60.0–2.83 (2.93–2.83)   60.0–2.47 (2.56–2.47)
      Rmerge                      0.102 (0.50)            0.065 (0.50)            0.047 (0.51)
      I/σI                        18.6 (4.0)              18.0 (2.6)              23.1 (2.6)
      Completeness, %             99.9 (100.0)            95.8 (96.1)             95.1 (94.8)
      Redundancy                  6.4 (6.6)               3.7 (3.8)               4.1 (3.9)

    Refinement
      Resolution, Å               2.81                    2.83                    2.47
      No. reflections             18,195                  17,637                  25,817
      Rwork/Rfree§, %             20.0/23.8               19.3/23.2               19.4/23.5
      No. atoms
        Protein/RNA               3,677/468               3,681/465               3,677/360
        Ligand/ion/water          –/1/96                  20/3/78                 20/1/138
      B-factors
        Protein/RNA               50.4/73.5               61.6/79.4               60.2/70.4
        Ligand/ion/water          –/46.9/45.2             56.3/49.2/49.8          54.5/42.2/58.5
      RMSD
        Bond lengths, Å           0.009                   0.009                   0.008
        Bond angles, °            1.16                    1.20                    1.13
      Ramachandran statistics¶    91.4/8.3/0.0/0.2        91.7/7.8/0.2/0.2        91.7/7.8/0.2/0.2

    *Coding scheme is the same as in Table 1. Soaking strategy: C3S1: CTP for 16 h; C3S4/5: CTP for 5 min, and then transfer to CTP/UTP/GTP for 20 min; C3S6: CTP/UTP for 16 h.
    †One crystal was used for data collection for each structure.
    ‡Values in parentheses are for highest-resolution shell.
    §5% of data are taken for the Rfree set, and the same Rfree set is applied for all structures.
    ¶Values are in percentage and are for most favored, additionally allowed, generously allowed, and disallowed regions in Ramachandran plots, respectively.

    E4008 | www.pnas.org/cgi/doi/10.1073/pnas.1602591113 Shu and Gong


    rounds of attempts. However, with overnight incubation in the presence of CTP and UTP, the EC could be paused at a clear translocation-intermediate state in which all regions of the template–product RNA duplex except for the backbone region from position −3 to position +1 of the template strand had undergone movement in the upstream direction (Fig. 4A, C3S6). To the best of our knowledge, this is the first reported crystallography-derived polymerase EC intermediate that demonstrates the global motion of the template–product duplex during translocation, and it is conceptually different from the crystal structure of yeast RNA polymerase II (Pol II) EC in complex with α-amanitin, in which only the intermediate conformation of the downstream DNA is observed (27). Note that the intermediate RNA conformation in our C3S6 structure most likely represents a naturally occurring state rather than an artifact induced by crystal-packing constraints. In the previous work describing the method used in crystallizing picornaviral polymerase EC, we demonstrated that the polymerase with consistent global conformation maintains the native RNA conformation from the active site to about position −7, despite various extents of RNA conformational distortion occurring further upstream (14). The polymerase global conformation of the C3S6 complex is highly consistent with the other six complexes presented in this study (Fig. 4A). The RNA regions upstream of position −7 and downstream of position +2 are largely disordered in the C3S6 structure (Fig. 4B). Although a portion of RNA backbone electron density of the disordered upstream RNA is somewhat traceable, defined intercomplex contacts are lacking at both ends of the RNA construct. Taken together, the consistent polymerase conformation, the absence of defined crystal contacts at both ends of the RNA constructs, and the ability of polymerase to absorb the conformational distortion induced at the upstream end of its RNA collectively support the validity of the C3S6 structure.

    In comparison with the C3S4/5 pretranslocation structure within the same NAC, in the C3S6 structure the entire product strand moved in the upstream direction (Fig. 5). With all base-pairing hydrogen bonds maintained between the two strands, the product strand phosphates moved about a half-register on average, with some variation

    Fig. 2. CTP-derived sequential EC structures demonstrate a plausible order of precatalysis NAC events. (A) Four cycle 1 (C1) NAC structures arranged in a sequential order with composite simulated-annealing (SA) omit electron density maps contoured at 1.2 σ. C1S1 represents the starting state 1 with a vacant active site; C1S1/2 shows evidence of initial NTP binding via base-pairing; C1S2/3 shows the initial conformational change around D238 that is triggered by ribose hydroxyls during active site closure; C1S4 represents the bona fide catalytically closed state 4 with 11 of 12 Mg2+ ion contacts intact. Coloring scheme: template in cyan (+1 nucleotide in orange), product in green, palm in gray (YGDD sequence in magenta), ring finger in yellow, and metal ions in cyan. (B) Electron density around the NTP site in the C1S1/2 structure indicates that base-pairing interactions and placement of the NTP ribose likely form before the correct alignment of triphosphate moiety. (C) The change in the interaction network around NTP ribose (ribose cluster) and catalytic metal sites (metal cluster) demonstrates the order of events upon NTP binding and active site closure. Red spheres indicate water molecules coordinating with Mg2+.

    Shu and Gong PNAS | Published online June 23, 2016 | E4009


    (Fig. 5 B and C). The product riboses and bases also moved in the same direction but with larger variation. The upstream-most nucleotide resolved in the structure is the −7 position, whose base almost reached the −8 position of the pretranslocation structure, but the downstream-most +1 base has moved by only about one fifth of a register (Fig. 5 B and C). The template strand lagged behind, with its backbone phosphates at positions −2 to +1 remaining locked at their pretranslocational positions, whereas the −7 to −4 phosphates moved upstream by less than a half-register on average (Fig. 5). Globally assessed, the template–product RNA duplex had undergone an asymmetric movement with the product strand preceding the template strand and the upstream region leading the downstream region.

    With the aim of understanding this apparent strand asymmetry in viral RdRP translocation, we compared the details of the protein–RNA interactions from the active site to position −7 in all three C3 structures (Fig. 5C). Except for the interactions with the newly incorporated UMP in the C3S4/5 structure, the interactions are largely the same in the pre- and posttranslocation states. Interactions between the product strand and the polymerase are fairly evenly distributed from −7 to −1 without being concentrated in any one region. In contrast, polymerase interactions with the template strand are clustered in two distinct regions around the −5 to −3 and −2 to +1 nucleotides, with the latter having the most extensive contacts. Motifs B, F, and G converge at the −2 to +1 region and create a sharp turn in the template strand (Fig. 5A) that also is observed in DNA-dependent RNA polymerases (28, 29). As a result, the +2 template base is fully unstacked from the upstream nucleotides and is tucked into a small surface pocket created by the index and ring fingers (Fig. 5A) (14). Motif G residues T114 and S115 pack against the template strand +1/+2 backbone linkage and likely serve as a control point for template strand movement during translocation (9). At the upstream end of this interaction cluster there is an unusual and conserved backbone conformation for the ribose–phosphate linkage of the template strand −2 nucleotide (14). Together, these two interactions appear to lock the −2 to +1 sequence in place, specifically preventing this region of the template from moving while the remainder of the product–template RNA duplex undergoes translocation in an asymmetric fashion. To achieve the final posttranslocation state, i.e., S1 of the next NAC, the special backbone conformations at the S6 template +1/+2 linkage and the

    Fig. 3. Watson–Crick base-pair geometry as a major determinant of NTP selection. (A) Interaction details for NTP selection directed by +1 guanine (C1S1 and C1S4) and adenine (C3S1 and C3S4/5). All polymerase molecules were superimposed, and the structures are shown as individual panels with the S1 complexes on the left and S4/S4/5 complexes on the right. The coloring scheme is the same for each complex and is the same as in Fig. 2 with the product RNA in green, template RNA in cyan, and the +1 template nucleotide in orange. The ribose C4′ atoms of the +1 nucleotides are shown as spheres to aid the structural comparison with the alignment grids. Key interactions around the ribose cluster upon active site closure, and base-pairing hydrogen-bonding interactions at the +1 position are indicated by green and pink dashed lines, respectively. K159- and R174-related interactions are shown in purple. (B) A comparison of the metal cluster of the two product complexes. Mg2+–oxygen distances greater than 2.5 Å are shown in green text to highlight the disintegration of the ion-coordination interactions in the postcatalysis S4/5 complex. Red spheres indicate water molecules coordinating with Mg2+.


    −2 position need to transition to the corresponding regions one base step downstream, and the energy to achieve the requisite conformational changes is likely an important contributor to the ultimate energy barrier for translocation.

    We next compared the pretranslocation C3S4/5 structure with the translocation-intermediate C3S6 structure. Although the interactions between the template strand and the polymerase remained largely unchanged, more than half of the residues participating in product strand contacts changed their interaction modes by interacting with a different site, losing original interactions or establishing additional interactions (Fig. 5C). This finding is consistent with the observation that the interactions between the polymerase and the product RNA are relatively nonintensive and evenly distributed. We therefore propose that the product RNA strand may be capable of sliding back and forth between its pre- and posttranslocation states. In contrast, the movement of the template RNA is controlled more stringently by the polymerase, in particular around the −2 to +2 region. The backbone conformational changes required to translocate the +1/+2 junction and the −2 ribose–phosphate linkage are likely accompanied or assisted by local polymerase conformational changes that also require energy, making the final step in translocation rate-limiting and largely irreversible (Fig. 5C).
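Movements such as "about a half-register" or "one fifth of a register" can be expressed numerically. The sketch below is one plausible way to do so and is an assumption, not necessarily the procedure used for Fig. 5C: the displacement of an atom in the intermediate is projected onto the vector joining its pre- and posttranslocation positions. The function name and all coordinates are hypothetical.

```python
def register_fraction(pre, intermediate, post):
    """Fraction of the pre->post displacement covered by the intermediate.

    0.0 means the atom is still at its pretranslocation position and
    1.0 means it has moved by one full register.
    """
    step = [b - a for a, b in zip(pre, post)]
    moved = [b - a for a, b in zip(pre, intermediate)]
    norm_sq = sum(s * s for s in step)
    if norm_sq == 0:
        raise ValueError("pre and post positions coincide")
    # Scalar projection of `moved` onto `step`, in units of one register.
    return sum(m * s for m, s in zip(moved, step)) / norm_sq


# Hypothetical phosphate positions (Å), for illustration only.
pre = (0.0, 0.0, 0.0)
post = (6.0, 0.0, 0.0)
half = register_fraction(pre, (3.0, 1.0, 0.0), post)  # moved halfway
locked = register_fraction(pre, pre, post)            # did not move
```

Averaging this fraction over the phosphates of each strand would give one number per strand, which makes the lag of the template strand behind the product strand easy to compare.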

    Discussion

    An Improved View of Viral RdRP Elongation NAC. Using the advantages provided by a particular EV71 RdRP EC lattice, the current study makes important advances toward the understanding of the viral RdRP elongation NAC (Fig. 6). The current NAC working model starts with an S1 complex in the absence of NTP substrate with an open active site; after diffusing into the NTP entry channel formed between the motif F in the ring finger and motifs A and D in the palm, the NTP establishes its initial interaction through base-pairing with the +1 template base and stacking with the −1 priming base (S1/2), leading to the rearrangement around the NTP ribose and motif A residue D238 and finally bringing the NTP triphosphate, two divalent metal ions, and another motif A residue, D233, into place to achieve the catalytically closed conformation (S3) as observed in a norovirus polymerase–RNA–CTP complex structure (23). Immediately after the phosphoryl transfer reaction (S4), the catalytic geometry around the metal ions starts to disintegrate, and the active site reopens (S5); translocation begins with the less-restrained product RNA and the upstream region of the template–product duplex (S6) and then finally needs to overcome the conformational transition around the −2 to +2 region of the template strand to become the posttranslocation S1 complex of the next NAC (Fig. 6A).
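The six-state cycle and the structure labels used in this study (for example, C3S4/5 for an intermediate between states 4 and 5 of cycle 3) can be represented with a small sketch. The helper names `next_state` and `parse_label` are hypothetical conveniences introduced here for illustration.

```python
import re

# The six states of the working model, in order; S6 leads back to S1
# of the next nucleotide addition cycle.
NAC_STATES = ["S1", "S2", "S3", "S4", "S5", "S6"]


def next_state(state):
    """Return the state that follows `state` in the cycle."""
    i = NAC_STATES.index(state)
    return NAC_STATES[(i + 1) % len(NAC_STATES)]


def parse_label(label):
    """Split a label such as 'C3S4/5' into (cycle, (state, ...))."""
    m = re.fullmatch(r"C(\d+)S(\d)(?:/(\d))?", label)
    if m is None:
        raise ValueError(f"unrecognized structure label: {label}")
    cycle = int(m.group(1))
    states = tuple(f"S{g}" for g in m.groups()[1:] if g is not None)
    return cycle, states


cycle, states = parse_label("C3S4/5")  # cycle 3, between S4 and S5
after_s6 = next_state("S6")            # wraps around to S1
```

A label with two state numbers denotes a structure captured between two reference states, which is why `parse_label` returns a tuple rather than a single state.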

    On NTP Selection and Fidelity Control by Viral RdRP. Viral RdRPs typically have misincorporation rates in the range of 10^−4–10^−5 (30–33) and therefore ought to be considered as medium-fidelity polymerases compared with high- and low-fidelity representatives (26, 34, 35). Therefore, the NTP selection by viral RdRP, in particular for its EC, is expected to be reasonably stringent. Because a preinsertion site is missing in viral RdRP ECs, the NTP selection occurs in a limited space in an induced-fit manner with only local conformational changes. It has been proposed that the precise recognition of the equivalent geometry of the Watson–Crick base pairs may be the most important factor in NTP selection by polymerases (36). Although we have not obtained structural data for all four RNA base pairs at the RdRP +1 position, the +1G:C and +1A:U product complexes (C1S4 and C3S4/5) strongly support this proposal, because the spatial placement of the +1 nucleotides, the edges of the +1 base pair, and the shape of the active site complementing the +1 nucleotides are highly analogous (Fig. 3A). As suggested by the C1 structures in the current study, the three sequential events of initial base-pairing, ribose hydroxyl-induced conformational change around D238 (hereafter the “ribose–Asp switch”), and the alignment of NTP triphosphate and D233 around the two metal ions (hereafter the “metal–Asp switch”) likely provide the fidelity checkpoints. The precise placement of the nascent NMP ribose hydroxyl groups in the two product complexes emphasizes the importance of the ribose–Asp switch in substrate selection. We propose that the RdRP-specific ribose–Asp switch is a positive contributor to the overall fidelity for viral RdRPs and to some extent may compensate for the loss of one fidelity checkpoint resulting from the lack of a preinsertion site in these polymerases.

    On the Translocation Mechanism by Polymerases. Two major theories have been proposed to describe the central mechanism of polymerase translocation, namely the Brownian ratchet model and the power stroke model (11, 37–39). The Brownian ratchet model features a fast equilibrium between the pre- and posttranslocation states, and it allows possible intermediates in the absence of the next incoming NTP, but the presence of a bound NTP strongly stabilizes the polymerase in the posttranslocation state. The power stroke model emphasizes the correlation between pyrophosphate release and the required conformational changes to convert the pretranslocation state to the posttranslocation state in a single and largely irreversible step. Although we are not trying to reconcile these two models here, our structures suggest that the RdRP translocation process uses essential features of both. As suggested

    Fig. 4. The C3S6 complex is a translocation intermediate, has a global polymerase conformation consistent with other complexes, and is free of defined intercomplex interactions at either end of its RNA constructs. (A) Stereo-pair images of all seven complexes with their polymerases superimposed. The C3S6 complex is shown in black, and the other six complexes are colored as the C1S1/2 complex in Fig. 1C. The +2 template nucleotides in all complexes are omitted for clarity. The −1 and −8 position labels correspond to the regular translocational positions defined by the six nontranslocation-intermediate complexes. (B, Upper) The complete sequence of the RNA constructs in the C3S6 complex is shown with gray font indicating unresolved nucleotides in the structure. (Lower Left) A portion of the unresolved upstream RNA backbone in the C3S6 structure is somewhat traceable. (Lower Right) The downstream RNA is essentially disordered. Electron densities for defined crystal contacts are lacking at both ends of the RNA construct. Cyan mesh indicates the 2Fo-Fc electron density map contoured at 1 σ; green/red mesh indicates the Fo-Fc electron density map contoured at 3 σ. Green ribbons indicate polymerase; red/blue ribbons indicate template/product RNA. Black dots indicate the center of the map. The map radius is 30 Å.


    by the C3S6 structure, the postcatalysis product strand is subject to Brownian motion (fast equilibrium) between the pre- and posttranslocation states because of the evenly distributed interactions with the polymerase (Figs. 5 and 6B). In contrast, the movement of the template strand −2 to +2 region toward the upstream direction requires overcoming the sharp backbone turn at +1/+2 and the special conformation of the −2 phosphate linkage. This movement is likely the rate-limiting step during translocation and therefore is largely irreversible. Therefore, the newly established protein interactions at the −2 to +2 region in the posttranslocation state could be regarded as the ratchet to prevent back-translocation (Fig. 6B). In the C3S6 structure, pyrophosphate is present with a refined occupancy of 0.55, comparable to the 0.66–0.72 occupancies seen in the pretranslocation structures (Figs. 3A and 5A), but pyrophosphate is totally absent both in the S1 structures in this study and in previously determined picornavirus polymerase EC S1 structures (10, 14). Thus, the release of pyrophosphate may coincide with the final upstream movement of the template −2 to +2 region, although whether and how the pyrophosphate release and the

    Fig. 5. A previously unidentified translocation intermediate supports the asymmetric movement of the template–product duplex during translocation. (A) Stereo-pair images of the translocation intermediate complex C3S6 with a composite SA omit electron density map (contoured at 1.2 σ) of RNA, pyrophosphate (PPi), and residue D238 overlaid. The coloring scheme is as in Fig. 3A. The C3S4/5 pretranslocation (dark gray) and C3S1 posttranslocation (brown) complexes are shown with polymerases superimposed on the intermediate complex for comparisons. (B) RNA-only comparison of the translocation intermediate (Left: template in cyan; Right: product in green) and the pretranslocation complex (dark gray) in the same NAC. (C) Schematic illustration of the different protein–RNA interactions in the pretranslocation C3S4/5, intermediate C3S6, and posttranslocation C3S1 complexes. RNA movement of the C3S6 structure was estimated using the base, ribose, and phosphate components of the pre- and posttranslocation complexes as references. The zig-zagged red symbol indicates the irregular backbone conformation of the template −2 position. Phosphate, ribose, and base are shown as circles, pentagons, and blocks, respectively. Solid arrows indicate hydrogen bonding, electrostatic, or hydrophobic interactions. Gray fonts indicate weaker interactions (judged by distance) compared with interactions involving the same residue in other structures. Underlining indicates a change of interaction partner(s) for a polymerase residue when switching from the posttranslocation complex to the intermediate complex. Strikethroughs indicate nonexistent interactions.


    conformational changes controlling the template strand movement are mechanically related is not obvious.

    It is plausible that the asymmetric movement of the two RNA strands observed in the current study also may be a general feature of other classes of polymerases. Although the special backbone conformation of the template −2 ribose–phosphate linkage is observed only in viral RdRP–RNA complexes (2, 10, 14, 23), the sharp turn at the +1/+2 junction of the template strand is shared by several classes of polymerases (28, 29, 40, 41). The polymerases need to create the active site at the downstream end of the +1 site, and therefore the helical trace of the template strand backbone must deform around the +1/+2 junction, making the postcatalysis translocation around the junction less fluid. Very interestingly, the eukaryotic RNA Pol II initiation complex (IC) has been captured with a template–product duplex adopting a similar asymmetric conformation when the transcript length is 4–5 nt (42). Note that these Pol II IC complexes are not translocation intermediates, because the asymmetric conformation was achieved through slippage-mode movement of the product strand in the downstream direction. As a result, the −1 and −2 (“i-1” and “i-2” in Pol II nomenclature) nucleotides in the product strand seem to form mismatches with the −1 and +1 nucleotides in the template strand. These obvious conformational dynamics of the template–product duplex may be explained in part by the intrinsic instability of a DNA-dependent RNA polymerase IC with a short transcript. In fact, a similar conformation has not been observed in Pol II IC structures with longer transcripts or in EC structures that have established intensive transcript–template and transcript–polymerase interactions (29, 42–44). Therefore, the mechanism for achieving a similar asymmetric template–product conformation in these Pol II ICs appears to be different from that suggested by the translocation intermediate in the current study. More high-resolution translocation-intermediate structures are needed to refine further the understanding of the polymerase translocation in general. As mentioned, capturing translocation intermediates by crystallography is somewhat serendipitous, but perhaps molecular dynamics approaches could serve as a valid tool for testing whether other classes of polymerases also exhibit asymmetric RNA-strand movements during translocation.

    Materials and Methods

    Cloning and Protein Expression. The EV71 3Dpol gene within the DNA clone of SK-EV006-LPS1 (GenBank accession no. AB550335.1, genotype B) was cloned into a pET26b-Ub vector (45). The resulting plasmid was transformed into Escherichia coli strain BL21(DE3) pCG1 (kindly supplied by Craig Cameron, Pennsylvania State University, State College, PA) for expression of 3Dpol with a C-terminal hexa-histidine tag as described previously (3, 45). 3Dpol was produced as a ubiquitin-fused protein, and the ubiquitin was cleaved in vivo by a coexpressed ubiquitin-specific carboxyl terminal protease Ubp1 to produce the 3Dpol with homogenous native N-terminal glycine residue. Cells were grown at 25 °C overnight in LB medium with 50 μg/mL kanamycin (KAN50) and 17 μg/mL chloramphenicol (CHL17) until the OD600 was 1.0. The overnight culture was used to inoculate 1 L of LB medium with KAN50 and CHL17 to reach an initial OD600 around 0.025. The cells were grown at 37 °C to an OD600 of 0.6 and then were cooled to room temperature. Isopropyl-β-D-thiogalactopyranoside was added to a final concentration of 0.5 mM, and the cells were grown for an additional 11–12 h before harvesting.
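The inoculation step above implies a simple dilution calculation. The sketch below assumes that OD600 scales linearly with cell density and neglects the volume added by the inoculum; the function name is hypothetical.

```python
def inoculum_volume_ml(od_overnight, od_target, final_volume_ml):
    """Volume of overnight culture needed, from C1 * V1 = C2 * V2."""
    if od_overnight <= 0:
        raise ValueError("overnight OD600 must be positive")
    return od_target * final_volume_ml / od_overnight


# Overnight culture at OD600 1.0, target initial OD600 0.025 in 1 L of medium.
volume = inoculum_volume_ml(1.0, 0.025, 1000.0)  # about 25 mL
```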

    Purification of EV71 3Dpol. Cell lysis, subsequent purification, and storage procedures were as described previously (3), except that a HiTrap Q HP column (GE Healthcare) was used in the second chromatographic purification step. This column was equilibrated with a buffer containing 50 mM NaCl, 25 mM Tris (pH 8.5), 0.1 mM EDTA, 20% (vol/vol) glycerol, and 0.02% (wt/vol) NaN3, and the protein was eluted by a linear gradient to 600 mM NaCl. The final buffer condition for protein storage was 300 mM NaCl, 5 mM Tris (pH 7.5), 0.02% (wt/vol) NaN3, and 5 mM Tris(2-carboxyethyl)phosphine. The molar extinction coefficient for 3Dpol protein was calculated based on protein sequence using the ExPASy ProtParam program (www.expasy.org/tools/protparam.html). The yield is typically in the range of 8–15 mg of pure protein per liter of bacterial culture.
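A sequence-based extinction coefficient of the kind reported by ProtParam can be sketched as follows. The residue contributions at 280 nm (5,500 for Trp, 1,490 for Tyr, and 125 per cystine, in M−1 cm−1) are the commonly used values; the toy sequence is hypothetical and is not the 3Dpol sequence, and the function names are introduced here for illustration.

```python
def molar_extinction_280(sequence, n_cystines=0):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    seq = sequence.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * n_cystines


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = A / (epsilon * path length)."""
    return a280 / (epsilon * path_cm)


toy_sequence = "MWYGGWAY"  # hypothetical peptide: 2 Trp, 2 Tyr
epsilon = molar_extinction_280(toy_sequence)
conc = molar_concentration(0.699, epsilon)  # molar concentration for A280 = 0.699
```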

    RNA Preparation. The template strand RNA to assemble the r5 construct (14) solely used in the current study was obtained by in vitro T7 RNA polymerase transcription using a parental plasmid pRAV23 (kindly supplied by Jeffrey Kieft, University of Colorado Denver, Denver, and Robert Batey, University of Colorado Boulder, Boulder, CO) and approaches modified from protocols described previously (10, 46, 47). The 10mer RNA primer (P10) was purchased from Integrated DNA Technologies. The procedures for the self-annealing of the template strand, the subsequent annealing with P10, and the r5 construct storage were as previously described (14).

    EC Assembly, Purification, and Storage. EC assembly, purification, and storage were carried out using the protocols in the PV work (10), except that KCl was provided at a concentration of 70 mM, the NaCl concentration was reduced

    Fig. 6. Working models for viral RdRP elongation NAC and translocation. (A) The NAC model shown in a circle format. Previously reported native PV RdRP EC and three of its derivative structures obtained by CTP, 3′-deoxy-CTP (3dCTP), or ddCTP soaking serve as the reference states 1, 2, 4, and 5 (gray fonts; PDB ID codes are listed). All seven structures reported in the current study obtained by natural NTP soaking were assigned at corresponding positions in the cycle. A norovirus (NV) polymerase–RNA–CTP complex exhibiting a precatalysis closed-conformation active site represents reference state 3. (B) A schematic free energy diagram for translocation. The pretranslocation state could establish fast equilibrium with the S6 intermediate state, and the subsequent transition to the posttranslocation state 1 of the next NAC is rate-limiting. Empty triangles indicate the interactions needed to maintain the irregular backbone conformation of the template −2 position and the +1/+2 bend of the template. These interactions include those between the motif G T114–S115 backbone and the +1/+2 junction of the template strand backbone and those between pinky finger K127 and R188 side chain and the template strand backbone phosphates stabilizing the irregular conformation of the ribose–phosphate linkage at the −2 position. These interactions must be broken (indicated by the unlocked symbol) during the final step toward the posttranslocation state. P, product; T, template; T1/T2, transition states.


    to 40 mM for the assembly reaction, and the Hepes pH was increased to 7.0 for both the assembly reaction and complex storage.

    EC Crystallization and NTP Soaking of the EC Crystals. The EC crystals were grown by sitting-drop vapor diffusion at 16 °C using 7.8 mg/mL EC sample. Crystals grew in 1–2 wk with a precipitant solution containing 0.17 M ammonium sulfate, 0.085 M Mes (pH 6.5), 25.5% (wt/vol) PEG 5000 monomethyl ether, and 15% (vol/vol) glycerol. NTP-soaking experiments were done under the precipitant solution using 5 mM NTP and 10 mM MgCl2. For each NAC complex obtained, the NTP combination and incubation time are listed in Tables 1 and 2.

    Crystallographic Data Processing and Structure Determination. All final diffraction data for all crystals were collected at the Shanghai Synchrotron Radiation Facility (SSRF) beamline BL17U1 at 100 K [wavelengths: 0.9791 Å for Protein Data Bank (PDB) entries 5F8G and 5F8L; 0.9789 Å for PDB entries 5F8H, 5F8I, 5F8M, and 5F8N; and 0.9792 Å for PDB entry 5F8J]. Data (100–150°) were typically collected in 0.5° oscillation steps. Reflections were integrated, merged, and scaled using HKL2000 or D*Trek v9.9 (48, 49). The initial structure solution was obtained using the molecular replacement program PHASER (50) using coordinates derived from the PV EC structure (PDB ID code 3OL6) as the search model (10). Manual model building and structure refinement were done using Coot and Phenix, respectively (51, 52). The 3,500-K composite simulated-annealing omit 2Fo-Fc electron density maps were generated using CNS (53). In this process, the entire asymmetric unit was first divided into small boxes, each including a fraction not exceeding 5% of the model. For each box, the model in the box was omitted for calculating the corresponding omit maps. The composite map then was generated by stitching all the individual omit maps together in order to make the entire map, not only a specific region, less model-biased. Unless otherwise indicated, all polymerase superimpositions were done using the maximum likelihood-based structure superimpositioning program THESEUS (21).
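The 5% limit on each omitted box sets a lower bound on how many omit maps must be computed and stitched together. The sketch below illustrates only that bookkeeping and is not how CNS partitions the asymmetric unit, which is done spatially; it splits a list of atom indices into consecutive chunks of at most 5% of the model. The atom count of 4,145 is the sum of the protein and RNA atoms listed for C3S1 in Table 2, and the function name is hypothetical.

```python
def omit_chunks(n_atoms, max_percent=5):
    """Split atom indices 0..n_atoms-1 into chunks of at most max_percent %."""
    max_size = n_atoms * max_percent // 100  # integer floor of the limit
    if max_size < 1:
        raise ValueError("model too small for the requested fraction")
    indices = list(range(n_atoms))
    return [indices[i:i + max_size] for i in range(0, n_atoms, max_size)]


chunks = omit_chunks(3677 + 468)       # protein + RNA atoms of C3S1 (Table 2)
n_maps = len(chunks)                   # one omit map per chunk
largest = max(len(c) for c in chunks)  # never exceeds 5% of the model
```

Because the chunk size is rounded down, a model whose atom count is not a multiple of 20 needs slightly more than 20 omit maps.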

    ACKNOWLEDGMENTS. We thank Dr. Olve Peersen for helpful discussions and valuable input on the manuscript content; Dr. Zhiyong Lou for providing the cloning material for the EV71 polymerase gene; Dr. Craig Martin and Dr. Zhongzhou Chen for critical reading of the manuscript; Dr. Hanzhong Wang, Dr. Bo Zhang, and Dr. Huimin Yan for help in initiating this EV71 RdRP project; Wei Shi for contributions in optimizing the EC assembly reaction condition; Liu Deng for laboratory assistance; the Shanghai Synchrotron Radiation Facility (beamline BL17U1, Shanghai, China) and the Beijing Synchrotron Radiation Facility (beamline 3W1A, Beijing, China) synchrotrons for access to beamlines; and the Core Facility and Technical Support of the Wuhan Institute of Virology for access to instruments. This work was supported by National Key Basic Research Program of China Grant 2013CB911100, National Natural Science Foundation of China Grant 31370198, and the Hundred Talents Program of the Chinese Academy of Sciences.

    1. Thompson AA, Peersen OB (2004) Structural basis for proteolysis-dependent activa-tion of the poliovirus RNA-dependent RNA polymerase. EMBO J 23(17):3462–3471.

    2. Ferrer-Orta C, et al. (2004) Structure of foot-and-mouth disease virus RNA-dependentRNA polymerase and its complex with a template-primer RNA. J Biol Chem 279(45):47212–47221.

    3. Lu G, Gong P (2013) Crystal structure of the full-length Japanese encephalitis virus NS5reveals a conservedmethyltransferase-polymerase interface. PLoS Pathog 9(8):e1003549.

    4. Liang B, et al. (2015) Structure of the L protein of vesicular stomatitis virus fromelectron cryomicroscopy. Cell 162(2):314–327.

    5. Pflug A, Guilligay D, Reich S, Cusack S (2014) Structure of influenza A polymerasebound to the viral RNA promoter. Nature 516(7531):355–360.

    6. Gorbalenya AE, et al. (2002) The palm subdomain-based active site is internally permutedin viral RNA-dependent RNA polymerases of an ancient lineage. J Mol Biol 324(1):47–62.

    7. Bruenn JA (2003) A structural and primary sequence comparison of the viral RNA-dependent RNA polymerases. Nucleic Acids Res 31(7):1821–1829.

    8. te Velthuis AJ (2014) Common and unique features of viral RNA-dependent poly-merases. Cell Mol Life Sci 71(22):4403–4420.

    9. Wu J, Liu W, Gong P (2015) A structural overview of RNA-dependent RNA Polymerasesfrom the Flaviviridae family. Int J Mol Sci 16(6):12943–12957.

    10. Gong P, Peersen OB (2010) Structural basis for active site closure by the poliovirusRNA-dependent RNA polymerase. Proc Natl Acad Sci USA 107(52):22505–22510.

    11. Yin YW, Steitz TA (2004) The structural mechanism of translocation and helicase ac-tivity in T7 RNA polymerase. Cell 116(3):393–404.

    12. Li Y, Korolev S, Waksman G (1998) Crystal structures of open and closed forms ofbinary and ternary complexes of the large fragment of Thermus aquaticus DNA po-lymerase I: Structural basis for nucleotide incorporation. EMBO J 17(24):7514–7525.

    13. Temiakov D, et al. (2004) Structural basis for substrate selection by T7 RNA polymerase. Cell 116(3):381–391.

    14. Gong P, Kortus MG, Nix JC, Davis RE, Peersen OB (2013) Structures of coxsackievirus, rhinovirus, and poliovirus polymerase elongation complexes solved by engineering RNA mediated crystal contacts. PLoS One 8(5):e60272.

    15. Nakamura T, Zhao Y, Yamagata Y, Hua YJ, Yang W (2012) Watching DNA polymerase η make a phosphodiester bond. Nature 487(7406):196–201.

    16. Basu RS, Murakami KS (2013) Watching the bacteriophage N4 RNA polymerase transcription by time-dependent soak-trigger-freeze X-ray crystallography. J Biol Chem 288(5):3305–3311.

    17. Beese LS, Steitz TA (1991) Structural basis for the 3′-5′ exonuclease activity of Escherichia coli DNA polymerase I: A two metal ion mechanism. EMBO J 10(1):25–33.

    18. Westover KD, Bushnell DA, Kornberg RD (2004) Structural basis of transcription: Nucleotide selection by rotation in the RNA polymerase II active center. Cell 119(4):481–489.

    19. Johnson SJ, Beese LS (2004) Structures of mismatch replication errors observed in a DNA polymerase. Cell 116(6):803–816.

    20. Vassylyev DG, et al. (2007) Structural basis for substrate loading in bacterial RNA polymerase. Nature 448(7150):163–168.

    21. Theobald DL, Wuttke DS (2006) THESEUS: Maximum likelihood superpositioning and analysis of macromolecular structures. Bioinformatics 22(17):2171–2172.

    22. Castro C, et al. (2007) Two proton transfers in the transition state for nucleotidyl transfer catalyzed by RNA- and DNA-dependent RNA and DNA polymerases. Proc Natl Acad Sci USA 104(11):4267–4272.

    23. Zamyatkin DF, et al. (2008) Structural insights into mechanisms of catalysis and inhibition in Norwalk virus polymerase. J Biol Chem 283(12):7705–7712.

    24. Castro C, et al. (2009) Nucleic acid polymerases use a general acid for nucleotidyl transfer. Nat Struct Mol Biol 16(2):212–218.

    25. Gong P, Campagnola G, Peersen OB (2009) A quantitative stopped-flow fluorescence assay for measuring polymerase elongation rates. Anal Biochem 391(1):45–55.

    26. Arnold JJ, Cameron CE (2004) Poliovirus RNA-dependent RNA polymerase (3Dpol): Pre-steady-state kinetic analysis of ribonucleotide incorporation in the presence of Mg2+. Biochemistry 43(18):5126–5137.

    27. Brueckner F, Cramer P (2008) Structural basis of transcription inhibition by alpha-amanitin and implications for RNA polymerase II translocation. Nat Struct Mol Biol 15(8):811–818.

    28. Yin YW, Steitz TA (2002) Structural basis for the transition from initiation to elongation transcription in T7 RNA polymerase. Science 298(5597):1387–1395.

    29. Gnatt AL, Cramer P, Fu J, Bushnell DA, Kornberg RD (2001) Structural basis of transcription: An RNA polymerase II elongation complex at 3.3 Å resolution. Science 292(5523):1876–1882.

    30. Campagnola G, McDonald S, Beaucourt S, Vignuzzi M, Peersen OB (2015) Structure-function relationships underlying the replication fidelity of viral RNA-dependent RNA polymerases. J Virol 89(1):275–286.

    31. Ward CD, Stokes MA, Flanegan JB (1988) Direct measurement of the poliovirus RNA polymerase error frequency in vitro. J Virol 62(2):558–562.

    32. Graci JD, et al. (2012) Mutational robustness of an RNA virus influences sensitivity to lethal mutagenesis. J Virol 86(5):2869–2873.

    33. Sanjuán R, Nebot MR, Chirico N, Mansky LM, Belshaw R (2010) Viral mutation rates. J Virol 84(19):9733–9748.

    34. Boudsocq F, Iwai S, Hanaoka F, Woodgate R (2001) Sulfolobus solfataricus P2 DNA polymerase IV (Dpo4): An archaeal DinB-like DNA polymerase with lesion-bypass properties akin to eukaryotic poleta. Nucleic Acids Res 29(22):4607–4616.

    35. Umar A, Kunkel TA (1996) DNA-replication fidelity, mismatch repair and genome instability in cancer cells. FEBS 238(2):297–307.

    36. Sloane DL, Goodman MF, Echols H (1988) The fidelity of base selection by the polymerase subunit of DNA polymerase III holoenzyme. Nucleic Acids Res 16(14A):6465–6475.

    37. Guajardo R, Sousa R (1997) A model for the mechanism of polymerase translocation. J Mol Biol 265(1):8–19.

    38. Gelles J, Landick R (1998) RNA polymerase as a molecular motor. Cell 93(1):13–16.

    39. Komissarova N, Kashlev M (1997) RNA polymerase switches between inactivated and activated states by translocating back and forth along the DNA and the RNA. J Biol Chem 272(24):15329–15338.

    40. Tahirov TH, et al. (2002) Structure of a T7 RNA polymerase elongation complex at 2.9 Å resolution. Nature 420(6911):43–50.

    41. Beese LS, Derbyshire V, Steitz TA (1993) Structure of DNA polymerase I Klenow fragment bound to duplex DNA. Science 260(5106):352–355.

    42. Liu X, Bushnell DA, Silva DA, Huang X, Kornberg RD (2011) Initiation complex structure and promoter proofreading. Science 333(6042):633–637.

    43. Kettenberger H, Armache KJ, Cramer P (2004) Complete RNA polymerase II elongation complex structure and its interactions with NTP and TFIIS. Mol Cell 16(6):955–965.

    44. Wang D, Bushnell DA, Westover KD, Kaplan CD, Kornberg RD (2006) Structural basis of transcription: Role of the trigger loop in substrate specificity and catalysis. Cell 127(5):941–954.

    45. Gohara DW, et al. (1999) Production of “authentic” poliovirus RNA-dependent RNA polymerase (3Dpol) by ubiquitin-protease-mediated cleavage in Escherichia coli. Protein Expr Purif 17(1):128–138.

    46. Wu J, Lu G, Zhang B, Gong P (2015) Perturbation in the conserved methyltransferase-polymerase interface of flavivirus NS5 differentially affects polymerase initiation and elongation. J Virol 89(1):249–261.

    47. Batey RT, Kieft JS (2007) Improved native affinity purification of RNA. RNA 13(8):1384–1389.

    48. Pflugrath JW (1999) The finer things in X-ray diffraction data collection. Acta Crystallogr D Biol Crystallogr 55(Pt 10):1718–1725.

    49. Otwinowski Z, Minor W (1997) Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol 276:307–326.

    50. McCoy AJ, et al. (2007) Phaser crystallographic software. J Appl Cryst 40(Pt 4):658–674.

    51. Emsley P, Cowtan K (2004) Coot: Model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 60(Pt 12 Pt 1):2126–2132.

    52. Adams PD, et al. (2010) PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr 66(Pt 2):213–221.

    53. Brünger AT, et al. (1998) Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr 54(Pt 5):905–921.

    E4014 | www.pnas.org/cgi/doi/10.1073/pnas.1602591113 Shu and Gong
