

Towards Structural Genomics of RNA: Rapid NMR Resonance Assignment and Simultaneous RNA Tertiary Structure Determination Using Residual Dipolar Couplings

Hashim M. Al-Hashimi1*, Andrey Gorin2, Ananya Majumdar1, Yuying Gosser1 and Dinshaw J. Patel1

1Cellular Biochemistry and Biophysics Program, Memorial Sloan-Kettering Cancer Center, New York, NY 10021, USA

2Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA

We report a new residual dipolar coupling (RDC)-based NMR procedure for rapidly determining RNA tertiary structure, demonstrated on a uniformly ¹⁵N/¹³C-labeled 27 nt variant of the trans-activation response element (TAR) RNA from HIV-I. In this procedure, the time-consuming nuclear Overhauser enhancement (NOE)-based sequential assignment step is replaced by a fully automated RDC-based assignment strategy. This approach involves examination of all allowed sequence-specific resonance assignment permutations for best-fit agreement between measured RDCs and coordinates for sub-structures in a target RNA. Using idealized A-form geometries to model Watson–Crick helices and coordinates from a previous X-ray structure to model a hairpin loop in TAR, the best-fit RDC assignment solutions are determined very rapidly (<five minutes of computational time) and are in complete agreement with corresponding NOE-based assignments. Orientational constraints derived from RDCs are used simultaneously to assemble sub-structures into an RNA tertiary conformation. Through enhanced speed of application and reduced reliance on chemical shift dispersion, this RDC-based approach lays the foundation for rapidly determining RNA conformations in a structural genomics context, and may increase the size limit of RNAs that can be examined by NMR.

© 2002 Elsevier Science Ltd. All rights reserved

Keywords: resonance assignments; residual dipolar couplings; ribonomics; structural genomics; HIV-I TAR

*Corresponding author

Introduction

Structural genomics and the quest for comprehensive information about biomolecular function through large-scale structure elucidation1 have so far focused on the high-throughput determination of protein structures using X-ray crystallography,2 NMR spectroscopy3,4 and computational homology modeling.5 However, RNAs play a fundamental role in gene processing and regulation, and adopt three-dimensional architecture and recognition elements with a complexity approaching that observed in proteins.6,7 RNA constitutes a drug target, therapeutic agent, and catalytic enzyme used in a variety of chemical and biochemical applications.7–9 For these reasons, launching a structural genomics program and developing methods for high-throughput structure determination are as important for RNAs10,11 as they are for proteins.

Effort in protein structural genomics is focusing on experimental determination of novel protein folds, because an expansion in "protein structural space" is expected to pave the way for future determination of protein structures based on computational comparative homology technologies.5 Similarly, methods for high-throughput RNA structure determination should focus on elements that would help advance the development of computational methods for predicting RNA structures.12 While relatively robust computational methods exist for predicting RNA secondary structures from sequence,13 RNA tertiary structures are more generally resistant to computational characterization. Tertiary interactions in RNA are often scarce and energetically small compared to interactions stabilizing secondary structure, and are frequently mediated by divalent cations or backbone interaction groups that are very difficult to predict a priori.

0022-2836/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved

E-mail address of the corresponding author: [email protected]

Abbreviations used: RDC, residual dipolar coupling; TAR, trans-activation response element; HIV-I, human immunodeficiency virus type I; NOE, nuclear Overhauser enhancement.

doi: 10.1016/S0022-2836(02)00160-2, available online at http://www.idealibrary.com

J. Mol. Biol. (2002) 318, 637–649

RNA structures can be determined experimentally using X-ray crystallography, but nucleic acids frequently fail to yield well-diffracting crystals or are affected by crystal-packing forces. NMR spectroscopy, on the other hand, does not require crystallization, but its application is limited to relatively small RNAs (<60 nt). This, however, does not represent a major drawback, as an increasing number of regulatory RNAs that fall within this size limit are being characterized.14 A more severe limitation of the NMR approach is the over-reliance on short-range (<5 Å) distance constraints derived from the measurement of nuclear Overhauser enhancements (NOEs). Structure determination using NOE-based methods often consumes several months of data acquisition and analysis, most of which is expended on assignment of resonances and NOEs. In particular, sequential assignments in nucleic acids continue to rely heavily on the observation of inter-residue NOEs between base and sugar protons in typically overcrowded spectra. The density of inter-proton distance constraints is also small, and short distance constraints are ineffective in defining extended RNA tertiary conformation. This is in stark contrast to the situation in proteins, where an overabundance of robust through-bond NMR experiments is available for rapidly establishing sequential connectivities, and the density of inter-proton distance constraints is high, many of which correspond to long-range NOEs that can effectively define a globular protein fold.

Recent developments in NMR methodology involving the measurement of residual dipolar couplings (RDCs) in partially oriented systems provide novel long-range orientational constraints15–18 that can remedy many of the aforementioned limitations. So far, applications of RDCs in studies of nucleic acids have focused primarily on improving the accuracy and precision of structures determined by NMR.19–21 In one approach,22,23 the measurement of five or more independent RDCs ($D$) in sub-structures with known molecular geometry is used to determine five order matrix elements ($S_{pq}$) describing average sub-structure alignment:24

$$D = D_{\mathrm{int}} \sum_{pq} \left(\cos\alpha_{p}^{ij}\,\cos\alpha_{q}^{ij}\right) S_{pq} \qquad (1)$$

where $D_{\mathrm{int}}$ is a constant and $\cos\alpha_{p}^{ij}$ are the direction cosines that define inter-nuclear vector orientation relative to an arbitrary sub-structure frame. Three

of the order tensor elements can then be used to orient sub-structures relative to a common ordering frame, and hence relative to one another. Because this approach can be applied effectively in determining the alignment of remotely positioned sub-structures in modular biomolecules, it is particularly well suited for determining extended RNA tertiary conformation.25,26 RNA structures are uniquely hierarchical, and are composed of a limited number of autonomously folding structural motifs.27,28 These sub-structures can be identified readily using RNA secondary structure prediction programs,13 and can often be modeled with reasonable accuracy using idealized geometries or using coordinates from previously determined homologous sub-structures. The recent structure determination of the ribosome has greatly expanded the available RNA structural database from which model homologous sub-structures can be derived.29–31
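As a minimal sketch of how equation (1) can be used in practice, the snippet below fits the five independent order matrix elements to a set of RDCs by linear least squares and reports the rmsd of the back-calculated values. The function names, the use of NumPy, the elimination of $S_{xx}$ through the traceless condition, and the synthetic vectors, tensor and $D_{\mathrm{int}}$ value are all assumptions made for illustration; none of them is taken from the authors' software.

```python
import numpy as np

def design_matrix(unit_vectors):
    """Coefficients multiplying (Syy, Szz, Sxy, Sxz, Syz) in equation (1)."""
    x, y, z = np.asarray(unit_vectors, dtype=float).T
    # Sxx is eliminated with the traceless condition Sxx = -Syy - Szz.
    return np.column_stack([y**2 - x**2, z**2 - x**2, 2*x*y, 2*x*z, 2*y*z])

def fit_order_matrix(unit_vectors, rdcs, d_int=1.0):
    """Return the least-squares order matrix and the rmsd of the fit."""
    rdcs = np.asarray(rdcs, dtype=float)
    A = d_int * design_matrix(unit_vectors)
    s, *_ = np.linalg.lstsq(A, rdcs, rcond=None)
    syy, szz, sxy, sxz, syz = s
    S = np.array([[-syy - szz, sxy, sxz],
                  [sxy, syy, syz],
                  [sxz, syz, szz]])
    rmsd = float(np.sqrt(np.mean((A @ s - rdcs) ** 2)))
    return S, rmsd

# Synthetic demonstration (all numbers hypothetical).
rng = np.random.default_rng(0)
vectors = rng.normal(size=(8, 3))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
S_true = 1e-4 * np.array([[-3.0, 1.0, 0.5],
                          [1.0, -5.0, -2.0],
                          [0.5, -2.0, 8.0]])   # symmetric and traceless
d_int = 2.0e4                                  # hypothetical constant, in Hz
rdcs = d_int * np.einsum("ip,pq,iq->i", vectors, S_true, vectors)

S_fit, rmsd = fit_order_matrix(vectors, rdcs, d_int)
```

With fewer than five independent RDCs the least-squares problem is underdetermined, which is consistent with the requirement of five or more couplings per sub-structure stated above.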

While relative sub-structure orientations can be determined rapidly using RDCs, this process remains limited by the time-consuming requirements for sequence-specific resonance assignments. Here, we introduce a new procedure based on order matrix analysis of RDCs that allows rapid and simultaneous sequence-specific resonance assignments and RNA structure determination. Application is demonstrated on a uniformly ¹⁵N/¹³C-labeled 27 nt hairpin-loop variant of the trans-activation response element (TAR) RNA found at the 5′-end of all pre-messenger RNA transcripts,32 for which we have recently reported an RDC-based structural and dynamic analysis.26

Results and Discussion

Assignment of RNA sub-structures with known geometries using RDCs

The use of RDCs in resonance assignments has been demonstrated recently in application to protein systems. In one study, RDC contributions were used to overcome limitations associated with chemical shift degeneracies33 in the so-called molecular fragment replacement approach,34 and in another study,35 RDCs were used directly in simultaneous resonance assignment and protein fold determination. The latter study relied on having a large number of independent RDCs measured between protein backbone nuclei that are constrained by only two degrees of freedom (φ and ψ). Unfortunately, application of this method to nucleic acids is difficult, owing to the proportionally smaller number of independent RDCs that can be measured and the significantly larger degree of conformational freedom that must be overcome.

Assuming model geometries for sub-structures in RNA can overcome the latter limitations.11 When resonances are assigned correctly, and more than five independent RDCs are measured, RDC values will agree with an accurately modeled sub-structure, allowing the determination of an order matrix that reproduces experimentally measured RDCs satisfactorily using equation (1). Conversely, misassignment of resonances can lead to disagreement. This can be appreciated when visualizing RDC constraints on inter-nuclear vector orientation. As shown in Figure 1 for an axially symmetric order tensor, inter-nuclear vectors (depicted as white circles on the surface of a globe) are constrained along two cones of allowed orientations relative to the order tensor frame ($S_{zz}$). The misassignment of spin-pairs effectively interchanges the corresponding inter-nuclear vector orientations, and when these have different orientations and RDC values, this leads to disagreement between RDCs and the assumed geometry (Figure 1, A ↔ A″). A measure of this agreement is given by the rmsd between measured RDCs and values calculated using best-fit order tensor parameters determined using equation (1). Due to the orientational degeneracy in RDCs, resonances can be misassigned without affecting agreement between RDCs and an assumed geometry (Figure 1, A ↔ A′). Such limitations can be overcome by measuring RDCs in a different ordering medium,23,36 or by establishing through-bond correlations among spin pairs between which RDCs are measured. Having groups of correlated RDCs reduces the likelihood of such degeneracies (Figure 1, A,B ↔ A′,B′). For example, in nucleic acids, inter-nuclear vectors in nucleotides will seldom all point along identical cones of orientations for different positions in an RNA structure. Even for regular A-form helices, the most common secondary structural element in RNA, variations in inter-nuclear vector orientations in a given residue type arise due to rotation of base-pairs about the helix axis across the first 11 residues in the double helix (~33° per residue), due to variations in sequence (for example GC versus CG), and due to departures from coaxial helix alignment. This is the basis of the assignment strategy that we employ here for accelerating tertiary structure determination of RNA using NMR.
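The cone degeneracy described above can be illustrated with a short calculation. For an axially symmetric tensor, equation (1) written in the principal frame reduces to $D = D_{\mathrm{int}} S_{zz}(3\cos^2\theta - 1)/2$, where $\theta$ is the angle between the inter-nuclear vector and the principal axis. The sketch below uses a hypothetical function name and arbitrary angles, with $D_{\mathrm{int}}$ and $S_{zz}$ set to 1 for convenience.

```python
import math

def axial_rdc(theta_deg, phi_deg=0.0, d_int=1.0, szz=1.0):
    """RDC for a vector at polar angle theta in an axially symmetric frame.

    phi_deg is accepted only to make explicit that the azimuth has no effect.
    """
    cos_t = math.cos(math.radians(theta_deg))
    return d_int * szz * (3.0 * cos_t**2 - 1.0) / 2.0

# Two vectors on the paired cones (theta and 180 - theta), any azimuth:
a = axial_rdc(40.0, phi_deg=10.0)
a_prime = axial_rdc(140.0, phi_deg=250.0)

# A vector on a different cone gives a different RDC:
a_double_prime = axial_rdc(75.0)

# Rotation of successive base-pairs in an 11-residue helical repeat:
rotation_per_residue = 360.0 / 11
```

Swapping `a` and `a_prime` leaves the fit unchanged, whereas swapping `a` and `a_double_prime` does not, which mirrors the A ↔ A′ and A ↔ A″ cases of Figure 1.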

Figure 1. Resonance assignments in known sub-structures using correlated RDCs. The orientation of a given inter-nuclear vector type (A or B) is shown as circles pointing along the surface of a sphere relative to an axially symmetric order tensor frame ($S_{zz}$). RDCs constrain inter-nuclear vector orientations along two continuous cones, shown using different colors for different RDC values. Misassignment of two inter-nuclear vectors having different corresponding RDC values leads to disagreement between RDCs and the assumed sub-structure geometry (A ↔ A″). Misassignment of inter-nuclear vectors having similar RDCs does not lead to disagreement with an assumed geometry (A ↔ A′). This degeneracy can be minimized by correlating (indicated by broken red lines) two different types of RDC (A,B ↔ A′,B′).


RDC basis set and intra-residue correlations

Decomposition of our target TAR RNA into sub-structures is shown in Figure 2(a). Based on the secondary structure predicted using the program mfold version 3.1,37 four contiguous sub-structures can be identified: two Watson–Crick helices (designated stem 1 and stem 2), a UUCG hairpin loop, and a three-residue UCU bulge. Continuous Watson–Crick regions in helices can be modeled using idealized A-form geometries, while model coordinates for the hairpin loop can be obtained from a 2.8 Å resolution X-ray structure of a similar hairpin loop.38 While no predefined conformation can be assumed for the bulge, this does not preclude determination of RNA tertiary conformation based on the alignment of the other three sub-structures.

A target spin basis set in RNA nucleotides that has many of the attributes needed for obtaining resonance assignments and determining tertiary structure using RDCs is shown in Figure 2(b) (in red). Independent RDCs can be measured accurately between many of these directly bonded spin-pairs [C8–H8 ($^{1}D_{\mathrm{C8H8}}$), C2–H2 ($^{1}D_{\mathrm{C2H2}}$), C5–H5 ($^{1}D_{\mathrm{C5H5}}$), C6–H6 ($^{1}D_{\mathrm{C6H6}}$), N3–H3 ($^{1}D_{\mathrm{N3H3}}$), N1–H1 ($^{1}D_{\mathrm{N1H1}}$), and C1′–H1′ ($^{1}D_{\mathrm{C1'H1'}}$)], and NMR correlations among members of this spin basis set are possible using robust intra-residue correlation experiments.21,39 RDCs can be measured very efficiently using simple variants of HSQC experiments (see Materials and Methods) as additional contributions to scalar coupling ($J$) in an aligned state ($J + D$) relative to an isotropic state ($J$).17,18 Partial alignment for TAR was achieved using a phage ordering medium.40,41 A total of 18/27 ($^{1}D_{\mathrm{C1'H1'}}$), 3/3 ($^{1}D_{\mathrm{C2H2}}$), 9/11 ($^{1}D_{\mathrm{C8H8}}$), 12/16 ($^{1}D_{\mathrm{C5H5}}$) and 7/16 ($^{1}D_{\mathrm{C6H6}}$) values could be measured with an average uncertainty estimated to be ~1.5 Hz. The total acquisition time for these experiments under the conditions described in Materials and Methods (~1.2 mM TAR using a 500 MHz NMR spectrometer) was ~five days. RDCs were recorded on a TAR sample dissolved in ²H₂O, and hence no $^{1}D_{\mathrm{NH}}$ values were measured.26
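The bookkeeping in this paragraph is simple enough to sketch. An RDC is obtained as the difference between the splitting in the aligned state ($J + D$) and in the isotropic state ($J$), and the measured fractions quoted above can be summed to give the size of the data set. The function name and the two splitting values are hypothetical; only the measured/possible counts come from the text.

```python
def rdc_from_splittings(aligned_hz, isotropic_hz):
    """D = (J + D) - J, with both splittings given in Hz."""
    return aligned_hz - isotropic_hz

# Hypothetical splittings for a single C-H spin pair (Hz).
example_rdc = rdc_from_splittings(172.5, 160.0)

# (measured, possible) counts per coupling type, as quoted in the text.
coverage = {
    "C1'H1'": (18, 27),
    "C2H2": (3, 3),
    "C8H8": (9, 11),
    "C5H5": (12, 16),
    "C6H6": (7, 16),
}
total_measured = sum(measured for measured, _ in coverage.values())
total_possible = sum(possible for _, possible in coverage.values())
```

The totals computed this way show that roughly two-thirds of the one-bond C–H couplings in the basis set were available for the assignment calculations.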

Resonances belonging to our spin basis set can be intra-residue correlated using a suite of well-established through-bond NMR experiments involving intra-base and base-sugar correlations.21,39 Experiments used in application to TAR are summarized in Table 1. Correlations need to be established only for resonances having corresponding RDC values. Because correlations can be made with greater tolerance to spectral overlap compared to RDC measurement, all spins having corresponding RDC values could be correlated unambiguously in TAR over a total acquisition period of ~three days. With these experiments, it was possible to group all RDC data belonging to individual residues. While intra-residue correlation of RDCs may be sufficient to allow both assignment and tertiary structure determination, it is now possible to establish inter-residue correlations in nucleic acids in a fairly robust and rapid manner.

Figure 2. (a) Secondary structure and molecular sub-structures in the 27-mer HIV-I TAR variant used in this study (the six-residue hairpin loop in wt-TAR, CUGGGA, is replaced with the UUCG tetra-loop). (b) Atom numbering in A–U and G–C Watson–Crick base-pairs and a spin basis set (shown in red) between which independent RDCs can be measured and correlated in RNA. Correlations can be established among spin-pairs within a residue (intra-residue correlations) using through-bond scalar couplings and between base-paired residues (inter-residue correlations) using trans-hydrogen bond-mediated scalar couplings (see Table 1).

Trans-hydrogen bond-mediated scalar couplings: a direct approach to inter-residue correlations in nucleic acids

The discovery of trans-hydrogen bond-mediated scalar couplings provides a direct approach for establishing a variety of base-pair alignments in nucleic acids.42–45 These experiments rely on magnetization transfer across the hydrogen bond ($\mathrm{N_d}$–H···$\mathrm{N_a}$) mediated by trans-hydrogen bond scalar couplings ($^{2h}J_{\mathrm{N_dN_a}}$). Typically, in an HNN-COSY spectrum, the detection of a cross-peak between the hydrogen-bonded proton ($\mathrm{H_d}$) and the acceptor nitrogen atom ($\mathrm{N_a}$) signifies hydrogen bond formation between the corresponding donor and acceptor bases. For example, A–U and G–C base-pairs can be identified through U(H3):A(N1) and G(H1):C(N3) cross-peaks, respectively (Figure 2(b)). These provide a robust approach for establishing inter-residue correlations between our target spins and a means of discriminating between base-paired and single nucleotides. For example, in an A–U pair, inter-nucleotide U(H3):A(N1) and intra-nucleotide A(H2):A(N1) correlations identify the U(H3) and A(H2) protons as belonging to a single A–U linkage (Figure 2(b), Table 1). Alternatively, direct U(H3):A(H2) correlations (Table 1) may be obtained using more recently developed methodology.46 Once an inter-nucleotide ¹H–¹H connection is established in this way, additional intra-nucleotide correlations on either base (e.g. U(H5,H6):U(H3), U(H5):U(H6) and A(H8):A(H2), see Table 1) are used to connect all of these basis set protons and their associated C–H vectors into a single A–U base-pair unit. Similarly, spins across G–C base-pairs may be correlated using either a combination of G(H1):C(N3) and C(H5,H6):C(N3) spectra or direct G(H1):C(H5,H6) connectivities, followed by appropriate intra-residue correlations (e.g. G(H1):G(H8), C(H5):C(H6), Table 1).

In application to TAR, three A–U and seven G–C trans-hydrogen bond cross-peaks could be detected using a suite of trans-hydrogen bond NMR experiments listed in Table 1 and recorded over a period of ~three days. Hence, all Watson–Crick base-pairs predicted in the TAR secondary structure (Figure 2(a)) could be accounted for experimentally, and inter-residue correlations among our target spins could be established unambiguously for all base-pairs. The seven residues for which no trans-hydrogen bonds could be detected were assumed to belong to either the bulge or the hairpin loop. On the basis of these experiments, RDCs could be grouped into individual Watson–Crick base-pairs or single nucleotides deemed not to be involved in detectable hydrogen bond alignments, but sequence-specific assignments still need to be determined.

Sequence-specific assignment of resonances in RNA

Table 1. NMR experiments used in establishing intra- and inter-residue correlations in nucleic acids

| Experiment | Correlations in AU-pairs | Correlations in GC-pairs | Reference/source |
|---|---|---|---|
| Intra-nucleotide | | | |
| Bid_hCNcH_py | H6/C6 to H1′/C1′ | H6/C6 to H1′/C1′ | 61 |
| Bid_hCNcH_pu | H8/C8 to H1′/C1′ | H8/C8 to H1′/C1′ | 61 |
| H8(C8)N9N3 | A(H8) to A(N3,N9) | G(H8) to G(N3,N9) | 62 |
| H2(C4)N9 | A(H2) to A(N9) | NA | Majumdar et al., unpublished results |
| H1(N1C2)N3 | NA | G(H1) to G(N1,N3) | 62 |
| H1(N1C6C4)N9 | NA | G(H1) to G(N9) | 62 |
| Inter-nucleotide | | | |
| HNN-COSY | U(H3–N3) to A(N1) | G(H1–N1) to C(N3) | 42,43 |
| H5NN | U(H5–N3) to A(N1) | NA | Majumdar et al., unpublished results |
| H2NN | A(H2–N1) to U(N3) | NA | 63,64 |
| H3(N3N1)H2 | U(H3) to A(H2) | NA | 46 |
| H5(N3N1)H1 | NA | C(H5) to G(H1) | 46 |
| H6(N3N1)H1 | NA | C(H6) to G(H1) | 46 |
| H5(N3N1)H2 | U(H5) to A(H2) | NA | Majumdar et al., unpublished results |

Using idealized A-form geometries generated using the program Insight II, and a previous 2.8 Å resolution X-ray structure38 to model Watson–Crick helices and the hairpin loop, respectively, all allowed permutations of resonance assignments were examined for best-fit agreement with correspondingly measured RDCs. Candidate residues for assignments and permutations are shown schematically in Figure 3 using circles (guanine), rectangles (cytosine), squares (adenine) and diamonds (uracil), which are linked for experimentally determined base-pairs. For a given assignment permutation, the best-fit order tensor solution was calculated using all corresponding RDC data independently for stems 1 and 2, and the hairpin loop. Sub-structure-specific best-fit order tensors were then used to back-calculate "best-fit" RDC values. The rmsd values between measured RDCs and best-fit back-calculated values for allowed assignment permutations are shown in Figure 3 as histogram plots. These calculations could be carried out for all three sub-structures in less than five minutes of computational time on an SGI Origin with an R10000 processor. The average rmsd values (stem 1 = 7.9 Hz, stem 2 = 7.1 Hz, and hairpin loop = 4.3 Hz) are significantly larger than the estimated experimental uncertainty in measured RDCs (~1.5 Hz), indicating that many of the allowed resonance assignment permutations lead to significant disagreement between RDCs and the assumed sub-structure geometry. For stem 1, a single solution with a uniquely small rmsd value of 1.8 Hz is obtained (Figure 3(a)), while two solutions with comparably small rmsd values (2.2 Hz and 2.3 Hz) are obtained for stem 2 (Figure 3(b)). A reasonably unique best-fit assignment solution having an rmsd value of 1.8 Hz is also obtained for the hairpin loop (Figure 3(c)).
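The exhaustive search described above can be sketched as follows: every ordered placement of candidate RDC groups onto the positions of a sub-structure is scored by the rmsd of a best-fit order tensor. This is a toy illustration under stated assumptions rather than the authors' program. The function names are hypothetical, the geometry is random instead of an A-form helix, each position carries only two inter-nuclear vectors, and the candidate data are noise-free synthetic values with one decoy.

```python
import itertools
import numpy as np

def design_matrix(vectors):
    """Coefficients of (Syy, Szz, Sxy, Sxz, Syz), with D_int absorbed into S."""
    x, y, z = np.asarray(vectors, dtype=float).T
    return np.column_stack([y**2 - x**2, z**2 - x**2, 2*x*y, 2*x*z, 2*y*z])

def fit_rmsd(vectors, rdcs):
    """rmsd between RDCs and values back-calculated from the best-fit tensor."""
    A = design_matrix(vectors)
    s, *_ = np.linalg.lstsq(A, rdcs, rcond=None)
    return float(np.sqrt(np.mean((A @ s - rdcs) ** 2)))

def rank_assignments(site_vectors, candidate_rdcs):
    """Score every ordered placement of candidates onto sites, best first."""
    site_vectors = np.asarray(site_vectors, dtype=float)
    candidate_rdcs = np.asarray(candidate_rdcs, dtype=float)
    n_sites = site_vectors.shape[0]
    flat_vectors = site_vectors.reshape(-1, 3)
    ranked = []
    for perm in itertools.permutations(range(len(candidate_rdcs)), n_sites):
        rdcs = candidate_rdcs[list(perm)].ravel()
        ranked.append((fit_rmsd(flat_vectors, rdcs), perm))
    return sorted(ranked)

# Synthetic demonstration: 4 sites, 2 vectors per site, 5 candidates.
rng = np.random.default_rng(1)
sites = rng.normal(size=(4, 2, 3))
sites /= np.linalg.norm(sites, axis=-1, keepdims=True)
true_s = np.array([3.0, -8.0, 2.0, -1.0, 4.0])   # hypothetical tensor, in Hz
site_rdcs = (design_matrix(sites.reshape(-1, 3)) @ true_s).reshape(4, 2)

true_perm = (2, 0, 4, 1)                         # candidate index for each site
candidates = rng.normal(scale=5.0, size=(5, 2))  # candidate 3 stays a decoy
for site, cand in enumerate(true_perm):
    candidates[cand] = site_rdcs[site]

ranked = rank_assignments(sites, candidates)
best_rmsd, best_perm = ranked[0]
```

In this toy case the correct placement fits exactly while every other placement leaves a residual. With real data the discrimination depends on measurement error and on how different the vector orientations are between positions, as discussed in the text.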

The five best-fit sequence-specific assignment solutions (lowest rmsd values) are shown in Figure 4 for stem 1 (Figure 4(a)), stem 2 (Figure 4(b)) and the hairpin loop (Figure 4(c)). Assignments determined using NOE methods are indicated by color-matching base-paired/single nucleotides and their corresponding position in the RNA secondary structure (Figure 4). For the three sub-structures, the best-fit assignment solutions (solutions 1a, 2a, and 3a) are in complete agreement with the NOE assignments, as shown by the placement of residues into their color-matching positions. For stem 1, solution 1a has a considerably lower rmsd (1.8 Hz) compared to the other four best-fit solutions (3.7 Hz–5.0 Hz), and solutions 1b and 1c simply represent exchange of assignments between neighboring residues. The latter can be expected because it leads to minimal changes in the orientation of inter-nuclear vectors in a given base-pair. On the other hand, solutions 1d and 1e have significantly higher rmsd values (5.0 Hz) and include residues that belong to stem 2. For the shorter stem 2, the two best-fit assignment solutions 2a and 2b have almost identical rmsd values (Figure 4(b), 2.1 Hz and 2.2 Hz, respectively) but differ considerably in assignments. However, solution 2b differs from all other solutions (2c–e) and, more importantly, includes residues that are assigned to stem 1 in all of the five best-fit solutions (Figure 4(a), solutions 1a–e). On the other hand, solutions 1a and 2a allocate the complete set of Watson–Crick assignments to unique positions in stems 1 and 2, respectively.

Figure 3. The rmsds between measured RDCs and values calculated using best-fit order tensors for allowed assignment permutations for three sub-structures in TAR. Candidate residues and permutations are shown using differently colored circles (guanine), rectangles (cytosine), squares (adenine) and diamonds (uracil), which are linked for Watson–Crick base-pairs determined using JNN-NMR. The rmsds are shown as histogram plots for (a) stem 1, using coordinates from an idealized A-form geometry. A total of 5040 resonance assignment permutations were examined, involving the assignment of a total of ten base-pairs (seven G–C and three A–U base-pairs) and associated RDCs to six positions (four G–C and two A–U). (b) Stem 2, using coordinates from an idealized A-form geometry. A total of 540 resonance assignment permutations were examined, involving the assignment of a total of ten base-pairs (seven G–C and three A–U base-pairs) and associated RDCs to four positions (three G–C and one A–U). (c) The hairpin loop, using coordinates from a previous X-ray structure. A total of 24 resonance assignment permutations were examined, involving the assignment of a total of seven residues (four U, two C and one G) and associated RDCs to four positions (two U, one C and one G).
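The permutation counts quoted in the Figure 3 legend can be checked with a simple counting rule: for each residue or base-pair type, count the ordered selections of candidates for the available positions and multiply across types. The sketch below assumes this rule and uses a hypothetical helper name. It reproduces the quoted 5040 for stem 1 and 24 for the hairpin loop, but gives 630 rather than the quoted 540 for stem 2, so the stem 2 figure may reflect a restriction that this simple rule does not capture.

```python
from math import perm

def count_assignments(candidates, positions):
    """Product over types of ordered selections P(n_candidates, n_positions)."""
    total = 1
    for residue_type, n_positions in positions.items():
        total *= perm(candidates[residue_type], n_positions)
    return total

base_pairs = {"GC": 7, "AU": 3}                  # candidates detected by NMR
stem1 = count_assignments(base_pairs, {"GC": 4, "AU": 2})
stem2 = count_assignments(base_pairs, {"GC": 3, "AU": 1})
loop = count_assignments({"U": 4, "C": 2, "G": 1},
                         {"U": 2, "C": 1, "G": 1})
```

Because the count is a product of falling factorials, it grows quickly with the number of candidates and positions, which is why grouping RDCs into base-pairs before the search matters.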

For the hairpin loop, the only guanine residue not involved in Watson–Crick alignment, G34, is assigned a priori (Figure 4(c), colored brown), but candidates for assignments include bulge residues U23, C24, and U25. While solution 3a (rmsd = 1.8 Hz) is in agreement with the NOE assignments and has a smaller rmsd value compared to solutions 3b and 3c (2.3 Hz and 2.4 Hz, respectively), we illustrate an additional avenue for selecting assignments. Besides the low rmsd criterion, the derived principal order matrix parameter $S_{zz}$, which defines the degree of alignment for a sub-structure, should be similar for two rigidly attached sub-structures. While we previously demonstrated that this assumption of rigidity does not hold for the two stems separated by a flexible bulge in TAR,26 stem 2 should be attached rigidly to the directly linked hairpin loop. Notwithstanding the small number of RDCs measured in the hairpin loop, which limits the ability to accurately determine an $S_{zz}$ value, comparing the $S_{zz}$ values for solutions 3a (7.08(±0.35) × 10⁻⁴), 3b (3.96(±0.29) × 10⁻⁴), and 3c (4.28(±0.35) × 10⁻⁴), solution 3a has the best agreement with the $S_{zz}$ value for stem 2 (using solution 2a, 8.76(±0.16) × 10⁻⁴). With these sequence-specific assignments, the remaining cytosine residue can, by elimination, be assigned unambiguously to the bulge residue (C24), leaving two interchangeable assignments for two uracil residues in the bulge (U23/U25).
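The $S_{zz}$ comparison used here as a second selection criterion amounts to picking the hairpin-loop solution whose degree of alignment is closest to that of stem 2. A minimal sketch with the values quoted above follows; the variable names are illustrative, and the quoted uncertainties are ignored for simplicity.

```python
# Szz values quoted in the text (dimensionless).
szz_stem2 = 8.76e-4                       # stem 2, solution 2a
szz_loop = {"3a": 7.08e-4, "3b": 3.96e-4, "3c": 4.28e-4}

# Ratio to the stem 2 value; 1.0 would indicate identical alignment.
ratios = {name: value / szz_stem2 for name, value in szz_loop.items()}

# Solution whose Szz lies closest to the stem 2 value.
closest = min(szz_loop, key=lambda name: abs(szz_loop[name] - szz_stem2))
```

A fuller treatment would propagate the quoted uncertainties before ranking, which this sketch does not attempt.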

Figure 4. Sequence-specific assignment arrays determined using RDCs. Candidate residues/base-pairs for assignments are shown using circles (guanine), rectangles (cytosine), squares (adenine) and diamonds (uracil), and are linked for experimentally determined Watson–Crick base-pairs. The total number of RDCs measured in a base-pair or single residue is shown inside individual symbols. Sequence-specific assignment solutions are shown as a function of the TAR secondary structure, which is represented as a series of color-coded wells. NOE-based assignments are indicated by color-matching base-pairs and residues with their corresponding wells in the secondary structure. The rmsd between measured and calculated RDCs is shown to the right of the assignment array. Five assignment solutions having the lowest rmsds are shown for (a) stem 1, (b) stem 2, and (c) the hairpin loop. Bulge residues U23 and U25 are colored light blue and purple, respectively.


RDC-based sequential assignments with reduced reliance on intra- and inter-residue correlations

While a detailed examination of the limits of applicability of this RDC-based sequential assignment strategy is beyond the scope of this work and is deferred for later publication (Gorin et al., unpublished results), we explored two important scenarios in the present application. One potentially significant advantage of this RDC-based approach over traditional NOE-based methods is the reduced reliance on adequate chemical shift dispersion for specific NOE reporter nuclei. Because RDCs were measured only for well-resolved resonances in the 2D NMR experiments, only 65%, 80%, 20% and 90% of the total measurable $^{1}D_{\mathrm{C1'H1'}}$, $^{1}D_{\mathrm{C5H5}}$, $^{1}D_{\mathrm{C6H6}}$, and $^{1}D_{\mathrm{C8H8}}$ values, respectively, were measured in the two stems, and no RDC was measured for residue C19 in stem 1. While assignments could be established using this incomplete data set, we further examined whether assignments could be determined when reducing the $^{1}D_{\mathrm{C1'H1'}}$ data (and hence the requirements for C1′ and H1′ chemical shift resolution) to only 20% of the total measurable RDCs (stem 1 = 2/12 and stem 2 = 2/8) (Figure 5(a)). As shown in Figure 5(a), the correct assignments are determined for stem 1 with a significant degree of confidence. While the correct assignments cannot be determined independently with similar confidence for the shorter stem 2, insisting on consistency with assignments for stem 1 alleviates the uncertainty, as shown in Figure 5(a), where only assignments for stem 2 that are compatible with the best-fit assignment solution in stem 1 are shown.

Figure 5. RDC assignment arrays using non-optimal data. (a) Using only 20% of RDCs that can be measured between C1′ and H1′ nuclei (stem 1 = 2/12 and stem 2 = 2/8). Only assignment solutions for stem 2 that are consistent with assignments in stem 1 are shown. (b) Where base-pairing is defined only between C19–G43 (orange residues) and C29–G36 (purple residues). Eight out of ten base-pairs (G17, C45, G18, C44, A20, U42, G21, C41, A22 and U40 in stem 1, and G26, C39, A27, U38, G28, C37, and C29 in stem 2) are included without defining their hydrogen-bonding partners. A total of 259,299 and 72,000 assignment permutations were examined for stems 1 and 2, respectively.

Here, trans-hydrogen bond correlations were established for all Watson–Crick base-pairs. This may not always be possible even for canonical base-pairs, due to departures from perfect canonical geometries (especially for terminal residues), insufficient chemical shift resolution and/or poor sensitivity in the NMR experiments. Moreover, some mismatched base-pair alignments (e.g. G·U) lacking N–H···N hydrogen bonds cannot be detected easily using current NMR methods. We therefore explored RDC-based assignments where base-pair alignment is defined explicitly for only two base-pairs, C19–G43 (orange residues) and C29–G36 (purple residues), and eight out of ten base-pairs were included without explicitly defining their hydrogen-bonding partners. The correct assignments can be determined with reasonable resolution, particularly for stem 1, where the best-fit solution differs by at least 0.4 Hz from the four next-best solutions (Figure 5(b)). To display results more completely, we do not filter assignment solutions for stem 2 for consistency with stem 1, but simply point out that the latter allows determination of the correct set of assignments for stem 2. Together, these results indicate that RDC-based sequential assignments are highly tolerant to inadequate chemical shift resolution and incomplete trans-hydrogen-bond correlations, and this may allow examination of larger RNAs than is possible currently using NOE methods alone.

Determining RNA tertiary conformation

The analysis used in examining assignment permutations simultaneously provides the three Euler angles needed to define relative sub-structure orientations. We have reported the RDC-derived inter-helical conformation for TAR (average inter-helical angle ranging between 44° and 54°).26 The latter study also employed idealized A-form geometries for the two stems but more extensive experimental data that included ¹D_C2′H2′, ¹D_C3′H3′, ¹D_C4′H4′ and ¹D_C5C6. We do not revisit this discussion, but point out that orientational solutions determined here are in excellent agreement with our previous findings (data not shown), and with previous structural studies of TAR.51–54

Here, the orientation of stem 2 relative to the hairpin loop, which was previously omitted from analysis, can be determined and compared with a high-resolution X-ray structure of a similar hairpin loop.38 This is shown in Figure 6, where the orientation of stem 2 (in blue) relative to the loop (in green) determined by superimposing the centers of their orientational solutions (RDC-NMR) is compared with the 2.8 Å resolution X-ray structure of the TL1 loop in a 16S ribosomal RNA fragment (X-ray).38 The orientation of the hairpin loop is superimposed for the two structures. Because the order tensor for TAR is close to axially symmetric and hence the Sxx and Syy orientations are not well defined, the relative orientation of the two sub-structures about the Szz direction is not well defined (±45°). However, this uncertainty and the degeneracy arising from allowed inversions about principal axes are minimized when insisting on proper chemical linkage between the two sub-structures (C29(stem 2)–U31(loop) and G36(stem 2)–G34(loop)). On the other hand, the principal Szz orientation is well defined (±8°) and the RDC alignment determined by superposition of the Szz axes is in excellent agreement with the X-ray structure (the difference in orientations is ~5° about the Syy direction), as can be seen from comparison of the backbone alignments.

Figure 6. The alignment of stem 2 relative to the hairpin loop in TAR determined using RDCs is compared to the structure of a similar hairpin loop determined by X-ray crystallography. The orientation of the loops is superimposed in the two structures. The average RDC alignment differs by ~5° from the X-ray structure.


Geometry of sub-structures and limits of applicability

For TAR, the assumption of idealized A-form geometry for the two Watson–Crick stems appears to be a very good one, because the rmsd between measured and calculated RDCs approaches the experimental uncertainty in measuring RDCs. Agreement between RDCs and idealized RNA A-form geometries has been reported.25,47 However, deviations from idealized A-form geometries can arise, and this can potentially limit the accuracy and applicability of our RDC procedure. For example, when using a family of 20 NOE-derived NMR geometries51 as input coordinates for stem 1, 5/20 structures yielded lowest rmsd values for the correct assignment solution, while 15/20 resulted in aberrations. Nevertheless, the two lowest rmsd values across the 20 structures (2.7 Hz and 3.2 Hz for models 9 and 1) corresponded to the correct set of assignments. The third best-fit solution corresponded to incorrect assignments, but also displayed a proportionally higher rmsd (3.8 Hz), and no competitive alternative assignment solutions were determined consistently with low rmsd values. This emphasizes the need to generally examine assignments against a large pool of candidate sub-structures. These candidate sub-structure geometries can be derived from homologous RNA sub-structures in the Protein Data Bank, which has greatly expanded owing to the recent X-ray structure determination of the ribosome.29–31 This would allow application of RDC-based assignments with greater accuracy, and would simultaneously allow for the refinement of the local geometry of sub-structures, in analogy to the so-called molecular fragment replacement methodology developed by Bax and co-workers for protein systems.34,50 While this may also allow for more general applicability to non-helical motifs, such as bulges, loops, and junctions, structure determination as well as assignments of such motifs will likely necessitate additional NOE-based NMR data. In this regard, it is important to note that such residues tend to have better chemical shift dispersion compared to helical regions.

Another approach to reduce assumptions about sub-structure conformation (for example in accommodating bent helices) would be to minimize the size of individual sub-structures.48 For TAR, the correct resonance assignments (albeit with weaker resolution, see Figure S1 in the Supplementary Material) could be determined using RDCs when dividing stems 1 and 2 into sub-structures composed of only three base-pairs. Successful application using smaller fragments would benefit tremendously from the additional measurement of ¹D_NH and other RDCs, which can be measured between our target nuclei,49 and/or using different ordering media.23,36 Such additional measurements of RDCs will be critically important for ensuring consistent measurement of RDCs for all residues in a sub-structure, because reduced RDC representation of residues can lead to severe ambiguities in the assignment process. Similarly, enhancing the distribution of measured RDCs will be important for overcoming potential assignment ambiguities arising from degeneracies in RDCs. For example, for a limiting case of having one idealized helix with Szz perfectly coincident with the helix axis and axial symmetry of alignment, RDCs measured for a given residue type will all have identical values and various assignment permutations will be indistinguishable using RDCs. Having said that, as long as more than five independent RDCs have been measured, all assignment permutations will also lead to determination of identical and accurate order tensor solutions, which can be used to determine the alignment of the helix. Therefore, using our RDC procedure, an RNA tertiary structure could, in principle, be determined while still having some ambiguous resonance assignments. Although this limiting scenario will not be encountered frequently for folded RNAs, it argues that general applicability of our RDC procedure in resonance assignments may require some additional NOE data.
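The limiting case described above can be made explicit with a short derivation. This is a sketch under stated assumptions: K_ij is used here as a placeholder for the bond-specific dipolar prefactor (its explicit form in equation (1) is not reproduced in this section), the α angles are those between the inter-nuclear vector and the principal axes of the order tensor, and θ_ij is the angle between the vector and the Szz axis.

```latex
\begin{align}
% Coupling written in the principal axis system of the traceless order tensor
D_{ij} &= K_{ij}\left(S_{xx}\cos^{2}\alpha_{x} + S_{yy}\cos^{2}\alpha_{y}
          + S_{zz}\cos^{2}\alpha_{z}\right),
       \qquad S_{xx} + S_{yy} + S_{zz} = 0 \\
% Axial symmetry: S_xx = S_yy = -S_zz/2, and cos^2(a_x) + cos^2(a_y) = 1 - cos^2(theta)
D_{ij} &= K_{ij}\,S_{zz}\left[\cos^{2}\theta_{ij}
          - \tfrac{1}{2}\left(1 - \cos^{2}\theta_{ij}\right)\right]
        = K_{ij}\,S_{zz}\,\frac{3\cos^{2}\theta_{ij} - 1}{2}
\end{align}
```

If Szz coincides with the axis of an idealized helix, the screw symmetry of the helix gives every residue the same θ_ij for a given bond type, so the corresponding couplings are equal and exchanging their assignments leaves the fit unchanged. The same symmetry and tracelessness conditions leave the order tensor with five independent elements, which is the origin of the five-RDC threshold quoted in the text.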

Finally, the number of assignment permutations that need to be examined will rise significantly (10⁵–10¹²) with increasing RNA size (>40 nt), or when having a limited number of trans-hydrogen bond correlations, posing a computational challenge to implementation of our RDC-based procedure. We anticipate that a combination of parallel computing, optimization of the size of molecular fragments and protocols for reducing redundant assignment permutations (for example, excluding candidate residues from an assignment pool once they have been assigned to a sub-structure), should help to enhance computational efficiency to allow rapid application to larger RNAs.
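To give a feel for this combinatorial growth, the following Python sketch counts assignments under a deliberately simplified model. It is an illustration only and does not reproduce the permutation counts reported in this work: the function name `count_assignments`, the hypothetical 10 base-pair helix composition, and the rule that each site simply draws one unused candidate of matching type (ignoring strand orientation of pairs and any pruning) are all assumptions.

```python
from math import perm, prod

def count_assignments(sites, pool):
    """Ordered ways to fill the sites of each type from the matching candidate pool."""
    # perm(m, n) = m! / (m - n)!: choose and order n candidates out of m.
    return prod(perm(pool[kind], n) for kind, n in sites.items())

# Hypothetical 10 base-pair helix treated as 20 unlinked residues ...
unlinked = {"G": 6, "C": 6, "A": 4, "U": 4}
# ... versus the same helix with every base-pair known from trans-hydrogen-bond data.
paired = {"G:C": 6, "A:U": 4}

n_unlinked = count_assignments(unlinked, unlinked)
n_paired = count_assignments(paired, paired)
print(n_unlinked, n_paired)
```

In this toy model, knowing the pairing partners reduces the search by more than four orders of magnitude, which is consistent in spirit with the dependence on trans-hydrogen bond correlations noted above.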

Conclusions

The foundation of de novo determination of biomolecular structure by NMR has traditionally relied on establishing resonance assignments to allow interpretation of geometry-dependent interactions in terms of specific structural constraints.55 Here, we have sought to reverse this tradition, and to exploit a priori structural information to expedite resonance assignments and structure determination. The current pace of progress in computational methods for predicting biomolecular structure, as well as the increasing number of biomolecular structures determined by X-ray and NMR, provides the impetus to integrate a priori structural information in NMR structure determination.11 While we have presented an application for nucleic acids, this RDC-assignment approach is extendable to proteins for which structures have already been determined using X-ray or computational homology modeling. There is often the incentive to carry out resonance assignments for such proteins, because NMR can be used to probe inter-molecular interactions and molecular dynamics.11 For application to nucleic acids, sequential assignments using RDCs are almost entirely computer-automated, consuming less than five minutes of computational time for RNAs such as TAR (~27 nt), while the entire data collection and spectral analysis consumed less than two weeks. The advent of cryogenic probes coupled with higher magnetic fields can be expected to reduce data acquisition by a factor of ~4.

Tertiary structures determined using the presented RDC approach may lack some of the intricate details associated with high-resolution structures but nevertheless capture important elements of RNA conformation pertinent to function. For example, the inter-helical angle in TAR determined using RDCs is ~44–54°, and this is known to undergo coaxial alignment upon complex formation with relevant targets.52 High-resolution structures can be obtained through further refinement of best-fit sub-structures against RDCs and readily assignable NOE data. Bax and co-workers have demonstrated the feasibility of refining model protein structures based primarily on RDCs in cases where the model structure is reasonably similar to the final target.50 Such a refinement approach will be critically important for allowing general structure determination of non-helical linking motifs, such as bulges, loops, and junctions. Variations in RDCs (RDC mapping) can be used to probe inter-molecular interactions,56 thereby complementing information from chemical shift perturbations, which are often more difficult to interpret for RNA compared to proteins. Finally, both RDC-derived tertiary structure and dynamics26 will provide an important database upon which RNA structure prediction tools may be developed. We anticipate that such applications will be critical components of an RNA structural genomics program.

Materials and Methods

Sample preparation and NMR spectroscopy

Uniformly ¹⁵N/¹³C-labeled TAR RNA was prepared using standard procedures as described.57 NMR samples contained ~1.2 mM uniformly ¹⁵N/¹³C-labeled TAR, 15 mM sodium phosphate (pH 6.0–6.2), 25 mM sodium sulfate and 0.1 mM EDTA. A second NMR sample was prepared that also contained 22 mg/ml of Pf1 phage for inducing molecular alignment.58

All NMR data were acquired on Varian Inova spectrometers operating at ¹H frequencies of 500 MHz and 600 MHz at 25 °C, equipped with actively shielded triple resonance z-gradient probeheads possessing a signal-to-noise ratio (standard ethyl benzene sample) of 800:1 and 1200:1, respectively. NOE-based resonance assignments were obtained using standard homonuclear and heteronuclear, 2D and 3D NMR experiments closely following a previously reported procedure57 (see Table 1 for a list of experiments). The measurement of RDCs in TAR has been described.26 Briefly, one-bond ¹D_CH splittings between C8–H8, C6–H6, C2–H2, and C1′–H1′ were measured using the ¹J_CH-CT-CE-HSQC experiment,59 and using IPAP versions of a regular HSQC without ¹H decoupling in the indirect (¹³C) dimension.60 C5–H5 splittings were measured using a regular CT-HSQC without ¹H decoupling in the indirect (¹³C) dimension. In all cases, RDC values were calculated as the difference between splittings measured in the absence and presence of phage aligning medium. The random uncertainty in ¹D_CH estimated from multiple measurements was, on average, ~1.5 Hz.

Sequence-specific assignments using RDCs

Software for evaluating all allowed assignment permutations based on RDCs was written using Perl and C++. This software requires three input informational batches: (1) coordinates of sub-structures; (2) sequence of sub-structures; and (3) RDC data. Coordinates for sub-structures need to be in PDB format. The sequence batch is a linear list of all residues in the RNA and need not follow any particular order. Residues known to be involved in H-bond alignments are specified in the sequence batch as A:U and G:C, and are distinguished from single residues deemed not to be involved in H-bond alignments as A, U, G and C. The data batch includes the atom types between which RDCs are measured. RDCs are listed as a series of data belonging to either a single residue or base-pair. A combinatorial assignment program (CAP) was written to collect the three informational batches from separate files and construct all possible permutations for assignments. Specifically, the CAP program matches the series of RDC measurements and corresponding inter-nuclear vectors provided in the PDB file for all allowed assignment permutations. For a given set of assignments, the best-fit order matrix elements are evaluated using a previously described singular value decomposition (SVD) procedure22 and RDC values are back-calculated using equation (1). The rmsd values between calculated and experimental RDC values are then evaluated for a given set of assignments. The CAP program outputs the rmsd values for a given set of assignments. This program includes additional features for accommodating "gaps" in the RDC data, and incorporating RDCs from two or more ordering media, and this will be presented in more detail elsewhere (Gorin et al., unpublished results).
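The evaluation loop described above can be sketched in a few lines of Python with NumPy. This is not the CAP program itself, which was written in Perl and C++; the function names (`design_matrix`, `fit_rmsd`, `rank_assignments`), the synthetic inter-nuclear vectors, the use of prefactor-free "reduced" couplings, and the unrestricted permutation over all sites (rather than over residues of matching type) are assumptions made for illustration.

```python
import itertools
import numpy as np

def design_matrix(vectors):
    """One row per inter-nuclear vector; columns multiply (Syy, Szz, Sxy, Sxz, Syz)."""
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    x, y, z = v[:, 0], v[:, 1], v[:, 2]
    # The order matrix is symmetric and traceless, so Sxx = -(Syy + Szz).
    return np.column_stack([y*y - x*x, z*z - x*x, 2*x*y, 2*x*z, 2*y*z])

def fit_rmsd(A, rdcs):
    """Best-fit order-matrix elements by SVD-based least squares, then the rmsd."""
    s, *_ = np.linalg.lstsq(A, rdcs, rcond=None)
    return float(np.sqrt(np.mean((A @ s - rdcs) ** 2)))

def rank_assignments(vectors, rdcs):
    """Score every assignment; p[i] is the measured RDC index given to site i."""
    A = design_matrix(vectors)
    rdcs = np.asarray(rdcs, dtype=float)
    scored = [(fit_rmsd(A, rdcs[list(p)]), p)
              for p in itertools.permutations(range(len(rdcs)))]
    return sorted(scored)

# Synthetic demonstration (no experimental data): 7 sites, noise-free couplings.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(7, 3))
true_s = np.array([2.0, -5.0, 1.0, 0.5, -1.5])   # arbitrary reduced units
by_site = design_matrix(vectors) @ true_s         # couplings in site order
shuffle = rng.permutation(7)
measured = by_site[shuffle]                       # site labels "forgotten"
best_rmsd, best_perm = rank_assignments(vectors, measured)[0]
print(best_rmsd, best_perm)
```

With noise-free synthetic data the lowest-rmsd permutation recovers the original site order. With experimental data the separation between the best and next-best rmsd, as displayed in the assignment arrays, is what indicates how much confidence a solution deserves.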

Determining the relative alignment of stem 2 and the hairpin loop using RDCs

Order tensor frames were calculated independently for stem 2 and the hairpin loop using the program ORDERTEN_SVD.22 Idealized A-form coordinates and the previous X-ray structure of the TL1 hairpin loop38 were used as input coordinates for stem 2 and the hairpin loop, respectively. A total of 16 and seven RDCs were included in the calculations for stem 2 and the hairpin loop, respectively, using a total of 100,000 reiterations. Input uncertainties for RDCs were on average 4 Hz, and were increased relative to experimental uncertainties (~1.5 Hz) to allow for possible departures from the assumed sub-structure geometry. Order tensor frames determined independently for stem 2 and the loop were superimposed using Insight II to carry out the rotations. Although this results in four allowed conformations that are related by inversions about principal axes, three of these could readily be discarded due to violations in linkage geometry (Figure 6).
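The four-fold degeneracy mentioned above follows from the fact that an order tensor is unchanged by 180° rotations about any of its principal axes. The Python sketch below illustrates this with synthetic tensors. It is not the ORDERTEN_SVD or Insight II workflow; the names `principal_frame`, `candidate_rotations` and `FLIPS` are hypothetical, and the linkage-geometry check used to discard three of the four candidates is not implemented here.

```python
import numpy as np

# Identity plus 180-degree rotations about the x, y and z principal axes.
FLIPS = [np.diag(d) for d in ([1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1])]

def principal_frame(S):
    """Principal values and a right-handed frame ordered |Sxx| <= |Syy| <= |Szz|."""
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(np.abs(vals))
    vals, R = vals[order], vecs[:, order]
    if np.linalg.det(R) < 0:          # flip one axis to make the frame right-handed
        R[:, 0] = -R[:, 0]
    return vals, R

def candidate_rotations(R_ref, R_mov):
    """Rotations taking the moving fragment's principal axes onto the reference's."""
    return [R_ref @ F @ R_mov.T for F in FLIPS]

# Synthetic demonstration: one rigid molecule, two fragments in different frames.
rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3))
S_ref = (M + M.T) / 2
S_ref -= np.trace(S_ref) / 3 * np.eye(3)       # symmetric and traceless
Q_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q_true) < 0:
    Q_true[:, 0] = -Q_true[:, 0]               # proper rotation
S_mov = Q_true.T @ S_ref @ Q_true              # same tensor seen from fragment 2

_, R_ref = principal_frame(S_ref)
_, R_mov = principal_frame(S_mov)
cands = candidate_rotations(R_ref, R_mov)
```

All four candidate rotations superimpose the two tensors equally well, and one of them equals the true relative orientation; additional information such as chemical linkage between the fragments is needed to select it.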


Acknowledgments

We thank Weijun Xu for preparation of uniformly ¹³C,¹⁵N-labeled TAR. This work was supported by NIH (to D.J.P.) and MICS Division of OASCR DOE under DE-AC05-00OR22725 (to A.G.).

References

1. Brenner, S. E. (2001). A tour of structural genomics. Nature Rev. Genet. 2, 801–809.

2. Heinemann, U., Illing, G. & Oschkinat, H. (2001). High-throughput three-dimensional protein structure determination. Curr. Opin. Biotechnol. 12, 348–354.

3. Prestegard, J. H., Valafar, H., Glushka, J. & Tian, F. (2001). Nuclear magnetic resonance in the era of structural genomics. Biochemistry, 40, 8677–8685.

4. Montelione, G. T., Zheng, D., Huang, Y. J., Gunsalus, K. C. & Szyperski, T. (2000). Protein NMR spectroscopy in structural genomics. Nature Struct. Biol. 7, 982–985.

5. Baker, D. & Sali, A. (2001). Protein structure prediction and structural genomics. Science, 294, 93–96.

6. Perez-Canadillas, J. M. & Varani, G. (2001). Recent advances in RNA–protein recognition. Curr. Opin. Struct. Biol. 11, 53–58.

7. Hermann, T. & Patel, D. J. (2000). Adaptive recognition by nucleic acid aptamers. Science, 287, 820–825.

8. Hermann, T. (2000). Strategies for the design of drugs targeting RNA and RNA–protein complexes. Angew. Chem. Int. Edit. 39, 1891–1905.

9. Famulok, M. & Jenne, A. (1999). Catalysis based on nucleic acid structures. Implementation and redesign of catalytic function in biopolymers. Topics Curr. Chem. 202, 101–131.

10. Doudna, J. A. (2000). Structural genomics of RNA. Nature Struct. Biol. 7, 954–956.

11. Al-Hashimi, H. M. & Patel, D. J. (2002). Residual dipolar couplings: synergy between NMR and structural genomics. J. Biomol. NMR, 22, 1–8.

12. Vitkup, D., Melamud, E., Moult, J. & Sander, C. (2001). Completeness in structural genomics. Nature Struct. Biol. 8, 559–566.

13. Zuker, M. (2000). Calculating nucleic acid secondary structure. Curr. Opin. Struct. Biol. 10, 303–310.

14. Lau, N. C., Lim, L. P., Weinstein, E. G. & Bartel, D. P. (2001). An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science, 294, 858–862.

15. Tolman, J. R., Flanagan, J. M., Kennedy, M. A. & Prestegard, J. H. (1995). Nuclear magnetic dipole interactions in field-oriented proteins: information for structure determination in solution. Proc. Natl Acad. Sci. USA, 92, 9279–9283.

16. Tjandra, N. & Bax, A. (1997). Direct measurement of distances and angles in biomolecules by NMR in a dilute liquid crystalline medium. Science, 278, 1111–1114.

17. Prestegard, J. H., Al-Hashimi, H. M. & Tolman, J. R. (2000). NMR structures of biomolecules using field oriented media and residual dipolar couplings. Quart. Rev. Biophys. 33, 371–424.

18. Bax, A., Kontaxis, G. & Tjandra, N. (2001). Dipolar couplings in macromolecular structure determination. Methods Enzymol. 339, 127–174.

19. Mollova, E. T. & Pardi, A. (2000). NMR solution structure determination of RNAs. Curr. Opin. Struct. Biol. 10, 298–302.

20. Tolman, J. R., Al-Hashimi, H. M., Kay, L. E. & Prestegard, J. H. (2001). Structural and dynamic analysis of residual dipolar coupling data for proteins. J. Am. Chem. Soc. 123, 1416–1424.

21. Zidek, L., Stefl, R. & Sklenar, V. (2001). NMR methodology for the study of nucleic acids. Curr. Opin. Struct. Biol. 11, 275–281.

22. Losonczi, J. A., Andrec, M., Fischer, M. W. F. & Prestegard, J. H. (1999). Order matrix analysis of residual dipolar couplings using singular value decomposition. J. Magn. Reson. 138, 334–342.

23. Al-Hashimi, H. M., Valafar, H., Terrell, M., Zartler, E. R., Eidsness, M. K. & Prestegard, J. H. (2000). Variation of molecular alignment as a means of resolving orientational ambiguities in protein structures from dipolar couplings. J. Magn. Reson. 143, 402–406.

24. Saupe, A. (1968). Recent results in the field of liquid crystals. Angew. Chem., Int. Ed. Engl. 7, 97–112.

25. Mollova, E. T., Hansen, M. R. & Pardi, A. (2000). Global structure of RNA determined with residual dipolar couplings. J. Am. Chem. Soc. 122, 11561–11562.

26. Al-Hashimi, H. M., Gosser, Y., Gorin, A., Hu, W., Majumdar, A. & Patel, D. J. (2002). Concerted motions in HIV-1 TAR RNA may allow access to bound state conformations: RNA dynamics from NMR residual dipolar couplings. J. Mol. Biol. 315, 95–102.

27. Tinoco, I., Jr & Bustamante, C. (1999). How RNA folds. J. Mol. Biol. 293, 271–281.

28. Moore, P. B. (1999). Structural motifs in RNA. Annu. Rev. Biochem. 68, 287–300.

29. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. (2000). The complete atomic structure of the large ribosomal subunit at 2.4 angstrom resolution. Science, 289, 905–920.

30. Wimberly, B. T., Brodersen, D. E., Clemons, W. M., Morgan-Warren, R. J., Carter, A. P., Vonrhein, C. et al. (2000). Structure of the 30S ribosomal subunit. Nature, 407, 327–339.

31. Ramakrishnan, V. & Moore, P. B. (2001). Atomic structures at last: the ribosome in 2000. Curr. Opin. Struct. Biol. 11, 144–154.

32. Muesing, M. A., Smith, D. H. & Capon, D. J. (1987). Regulation of mRNA accumulation by a human immunodeficiency virus trans-activator protein. Cell, 48, 691–701.

33. Zweckstetter, M. & Bax, A. (2001). Single-step determination of protein substructures using dipolar couplings: aid to structural genomics. J. Am. Chem. Soc. 123, 9490–9491.

34. Delaglio, F., Kontaxis, G. & Bax, A. (2000). Protein structure determination using molecular fragment replacement and NMR dipolar couplings. J. Am. Chem. Soc. 122, 2142–2143.

35. Tian, F., Valafar, H. & Prestegard, J. H. (2001). A dipolar coupling based strategy for simultaneous resonance assignment and structure determination of protein backbones. J. Am. Chem. Soc. 123, 11791–11796.

36. Ramirez, B. E. & Bax, A. (1998). Modulation of the alignment tensor of macromolecules dissolved in a dilute liquid crystalline medium. J. Am. Chem. Soc. 120, 9106–9107.

37. Mathews, D. H., Sabina, J., Zuker, M. & Turner, D. H. (1999). Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 288, 911–940.

38. Ennifar, E., Nikulin, A., Tishchenko, S., Serganov, A., Nevskaya, N., Garber, M. et al. (2000). The crystal structure of UUCG tetraloop. J. Mol. Biol. 304, 35–42.

39. Cromsigt, J., van Buuren, B., Schleucher, J. & Wijmenga, S. (2001). Resonance assignment and structure determination for RNA. Nuclear magnetic resonance of biological macromolecules. Methods Enzymol. 338, 371–399.

40. Hansen, M. R., Mueller, L. & Pardi, A. (1998). Tunable alignment of macromolecules by filamentous phage yields dipolar coupling interactions. Nature Struct. Biol. 5, 1065–1074.

41. Clore, G. M., Starich, M. R. & Gronenborn, A. M. (1998). Measurement of residual dipolar couplings of macromolecules aligned in the nematic phase of a colloidal suspension of rod-shaped viruses. J. Am. Chem. Soc. 120, 10571–10572.

42. Dingley, A. J. & Grzesiek, S. (1998). Direct observation of hydrogen bonds in nucleic acid base pairs by internucleotide (2)J(NN) couplings. J. Am. Chem. Soc. 120, 8293–8297.

43. Pervushin, K., Ono, A., Fernandez, C., Szyperski, T., Kainosho, M. & Wuthrich, K. (1998). NMR scalar couplings across Watson–Crick base pair hydrogen bonds in DNA observed by transverse relaxation optimized spectroscopy. Proc. Natl Acad. Sci. USA, 95, 14147–14151.

44. Grzesiek, S., Cordier, F. & Dingley, A. J. (2001). Scalar couplings across hydrogen bonds. Methods Enzymol. 338, 111–133.

45. Majumdar, A. & Patel, D. J. (2002). Identifying hydrogen bond alignments in multistranded DNA architectures by NMR. Accts Chem. Res. 35, 1–11.

46. Majumdar, A., Gosser, Y. & Patel, D. J. (2001). H-1–H-1 correlations across N–H···N hydrogen bonds in nucleic acids. J. Biomol. NMR, 21, 289–306.

47. Sibille, N., Pardi, A., Simorre, J. P. & Blackledge, M. (2001). Refinement of local and long-range structural order in theophylline-binding RNA using C-13–H-1 residual dipolar couplings and restrained molecular dynamics. J. Am. Chem. Soc. 123, 12135–12146.

48. Fowler, C. A., Tian, F., Al-Hashimi, H. M. & Prestegard, J. H. (2000). Rapid determination of protein folds using residual dipolar couplings. J. Mol. Biol. 304, 447–460.

49. Zidek, L., Wu, H. H., Feigon, J. & Sklenar, V. (2001). Measurement of small scalar and dipolar couplings in purine and pyrimidine bases. J. Biomol. NMR, 21, 153–160.

50. Chou, J. J., Li, S. P. & Bax, A. (2000). Study of conformational rearrangement and refinement of structural homology models by the use of heteronuclear dipolar couplings. J. Biomol. NMR, 18, 217–227.

51. Aboul-ela, F., Karn, J. & Varani, G. (1996). Structure of HIV-1 TAR RNA in the absence of ligands reveals a novel conformation of the trinucleotide bulge. Nucl. Acids Res. 24, 3974–3981.

52. Puglisi, J. D., Tan, R., Calnan, B. J., Frankel, A. D. & Williamson, J. R. (1992). Conformation of the TAR RNA–arginine complex by NMR spectroscopy. Science, 257, 76–80.

53. Zacharias, M. & Hagerman, P. J. (1995). The bend in RNA created by the transactivation response element bulge of human-immunodeficiency-virus is straightened by arginine and by Tat-derived peptide. Proc. Natl Acad. Sci. USA, 92, 6052–6056.

54. Long, K. S. & Crothers, D. M. (1999). Characterization of the solution conformations of unbound and Tat peptide-bound forms of HIV-1 TAR RNA. Biochemistry, 38, 10059–10069.

55. Wuthrich, K. (1986). NMR of Proteins and Nucleic Acids, Wiley, New York.

56. Lynch, S. R. & Puglisi, J. D. (2000). Application of residual dipolar coupling measurements to identify conformational changes in RNA induced by antibiotics. J. Am. Chem. Soc. 122, 7853–7854.

57. Gosser, Y., Hermann, T., Majumdar, A., Hu, W. D., Frederick, R., Jiang, F. et al. (2001). Peptide-triggered conformational switch in HIV-1 RRE RNA complexes. Nature Struct. Biol. 8, 146–150.

58. Hansen, M. R., Hanson, P. & Pardi, A. (2000). Filamentous bacteriophage for aligning RNA, DNA, and proteins for measurement of nuclear magnetic resonance dipolar coupling interactions. Methods Enzymol. 317, 220–240.

59. Tian, F., Al-Hashimi, H. M., Craighead, J. L. & Prestegard, J. H. (2001). Conformational analysis of a flexible oligosaccharide using residual dipolar couplings. J. Am. Chem. Soc. 123, 485–492.

60. Ottiger, M., Delaglio, F. & Bax, A. (1998). Measurement of J and dipolar couplings from simplified two-dimensional NMR spectra. J. Magn. Reson. 131, 373–378.

61. Hu, W., Gosser, Y. Q., Xu, W. & Patel, D. J. (2001). Novel 2D and 3D multiple-quantum bi-directional HCNCH experiments for the correlation of ribose and base protons/carbons in ¹³C/¹⁵N labeled RNA. J. Biomol. NMR, 20, 167–172.

62. Majumdar, A., Kettani, A., Skripkin, E. & Patel, D. J. (2001). Pulse sequences for detection of NH2···N hydrogen bonds in sheared G·A mismatches via remote, non-exchangeable protons. J. Biomol. NMR, 19, 103–113.

63. Majumdar, A., Kettani, A., Skripkin, E. & Patel, D. J. (1999). Observation of internucleotide NH···N hydrogen bonds in the absence of directly detectable protons. J. Biomol. NMR, 15, 207–211.

64. Hennig, M. & Williamson, J. R. (2000). Detection of N–H···N hydrogen bonding in RNA via scalar couplings in the absence of observable imino proton resonances. Nucl. Acids Res. 28, 1585–1593.

Edited by M. F. Summers

(Received 22 January 2002; received in revised form 27 February 2002; accepted 27 February 2002)

http://www.academicpress.com/jmb

Supplementary Material comprising one Figure is available on IDEAL
