


Protein Science (1994), 3:372-390. Cambridge University Press. Printed in the USA. Copyright © 1994 The Protein Society

YOUNG INVESTIGATOR AWARD LECTURE

Structures of larger proteins, protein-ligand and protein-DNA complexes by multidimensional heteronuclear NMR*

G. MARIUS CLORE AND ANGELA M. GRONENBORN
Laboratory of Chemical Physics, Building 5, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892

(RECEIVED December 22, 1993; ACCEPTED January 7, 1994)

Abstract

The recent development of a whole panoply of multidimensional heteronuclear-edited and -filtered NMR experiments has revolutionized the field of protein structure determination by NMR, making it possible to extend the methodology from the 10-kDa limit of conventional 2-dimensional NMR to systems up to potentially 35-40 kDa. The basic strategy for solving 3-dimensional structures of larger proteins and protein-ligand complexes in solution using 3- and 4-dimensional NMR spectroscopy is summarized, and the power of these methods is illustrated using 3 examples: interleukin-1β, the complex of calmodulin with a target peptide, and the specific complex of the transcription factor GATA-1 with its cognate DNA target site.

Keywords: calmodulin; GATA-1; heteronuclear NMR; interleukin-1β; multidimensional NMR; protein-DNA complexes; protein-peptide complexes; proteins; solution structure

The last few years have seen a quantum jump both in the size and accuracy of protein structures that can be determined by NMR (Clore & Gronenborn, 1991a). Thus, it is now possible to determine the structures of proteins in the 15-25-kDa range at a resolution comparable to 2 Å resolution crystal structures (Clore & Gronenborn, 1991b). This is attributable to the development of 3- and 4-dimensional heteronuclear NMR techniques to circumvent problems associated with chemical-shift overlap and degeneracy on the one hand and large linewidths on the other (see Clore & Gronenborn, 1991a, 1991c, 1991d; Bax & Grzesiek, 1993, for reviews). In this short review, we summarize some of these developments and illustrate their application to the structure determination of interleukin-1β (Clore et al., 1991b), a complex of calmodulin with a target peptide (Ikura et al., 1992), and a complex of the DNA binding domain of the transcription factor GATA-1 with its cognate DNA target site (Omichinski et al., 1993a).

Reprint requests to: G. Marius Clore, Laboratory of Chemical Physics, Building 5, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892; e-mail: [email protected].

*This paper is based on the lecture presented by Dr. Clore at the Seventh Annual Symposium of The Protein Society in San Diego on July 26, 1993, as one of the co-recipients of the Young Investigator Award sponsored by the DuPont Merck Pharmaceutical Company. The other recipient was Dr. Ad Bax.

General strategy for the determination of 3-dimensional structures of larger proteins and protein complexes by NMR

The main source of geometric information used in protein structure determination lies in the nuclear Overhauser effect (NOE), which can be used to identify protons separated by less than 5 Å (Ernst et al., 1987). This distance limit arises from the fact that the NOE is proportional to the inverse sixth power of the distance between the protons. Hence the NOE intensity falls off very rapidly with increasing distance between proton pairs. Despite the short-range nature of the observed interactions, the short approximate interproton distance restraints derived from the NOE measurements can be highly conformationally restrictive, particularly when they involve residues that are far apart in the sequence but close together in space.
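The steep distance dependence can be made concrete with a small numerical sketch. It assumes only the inverse sixth power proportionality stated above and ignores spin diffusion and internal motion; the function name and the 2.5 Å reference distance are illustrative choices, not values taken from the paper.

```python
def relative_noe(r, r_ref=2.5):
    """NOE intensity at distance r (in Å) relative to a proton pair at r_ref.

    Assumes intensity is proportional to r**-6 and nothing else.
    """
    return (r_ref / r) ** 6


if __name__ == "__main__":
    for r in (2.5, 3.0, 4.0, 5.0):
        print(f"{r:.1f} Å: {relative_noe(r):.3f}")
```

Under this assumption, doubling the separation from 2.5 Å to 5 Å reduces the intensity by a factor of 2^6 = 64, which is consistent with cross peaks becoming unobservable beyond roughly 5 Å.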

The power of NMR over other spectroscopic techniques results from the fact that every proton gives rise to an individual resonance in the spectrum, which can be resolved by higher dimensional (i.e., 2D, 3D, and 4D) techniques. Bearing this in mind, the principles of structure determination by NMR can be summarized very simply by the scheme depicted in Figure 1. The first step is to obtain sequential resonance assignments using a combination of through-bond and through-space correlations; the second step is to obtain stereospecific assignments at chiral centers and torsion angle restraints using 3-bond scalar couplings


[Figure 1 (flowchart): resonance assignment, (a) sequential and (b) side chains; identification of secondary structure elements; stereospecific assignment and torsion angle determination from (a) coupling constants, (b) intraresidue and sequential distance restraints, and (c) a systematic conformational grid search of φ, ψ, χ space; tertiary long-range distance restraints; 3D structure determination by hybrid distance geometry-simulated annealing or simulated annealing; high resolution 3D structure. Iterative cycles link the steps.]

Fig. 1. Summary of general strategy employed to solve 3D structures of macromolecules by NMR.

combined with intraresidue and sequential interresidue NOE data; the third step is to identify through-space connectivities between protons separated by less than 5 Å; and finally, the fourth step involves calculating 3D structures on the basis of the amassed interproton distance and torsion angle restraints using one or more of a number of algorithms (Havel et al., 1983; Braun, 1987; Clore & Gronenborn, 1989) such as distance geometry and/or simulated annealing. It is not essential to assign all the NOEs initially. Indeed, many may be ambiguous and several possibilities may exist for their assignments. Once a low resolution structure has been calculated from a subset of the NOE data that can be interpreted unambiguously, however, it is then possible to employ iterative methods to resolve the vast majority of ambiguities. Consider, for example, an NOE cross peak that could be attributable to a through-space interaction between either protons A and B or protons A and C. Once a low resolution structure is available, it is usually possible to discriminate between these 2 possibilities. Thus, if protons A and C are significantly more than 5 Å apart, and protons A and B are less than 5 Å apart, it is clear that the cross peak must arise from an NOE between protons A and B.
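The A/B versus A/C argument above can be sketched as a simple distance filter over a trial structure. This is a minimal illustration under stated assumptions, not the authors' procedure: the coordinates are invented, the function name is hypothetical, and a real application would check distances across a whole ensemble of structures with some safety margin rather than a single model with a hard 5 Å cutoff.

```python
import math


def resolve_ambiguous_noe(coords, candidates, cutoff=5.0):
    """Keep the candidate proton pairs that are closer than cutoff (Å) in a trial structure.

    coords maps a proton label to its (x, y, z) position in Å;
    candidates is a list of (label, label) pairs that could explain one cross peak.
    """
    return [
        (p, q)
        for p, q in candidates
        if math.dist(coords[p], coords[q]) < cutoff
    ]


# Invented low resolution coordinates for protons A, B and C.
TRIAL_COORDS = {
    "A": (0.0, 0.0, 0.0),
    "B": (3.0, 0.0, 0.0),   # 3.0 Å from A
    "C": (9.0, 2.0, 0.0),   # about 9.2 Å from A
}

if __name__ == "__main__":
    print(resolve_ambiguous_noe(TRIAL_COORDS, [("A", "B"), ("A", "C")]))
```

If more than one candidate survives the filter, the cross peak stays ambiguous and is left out of the restraint list until a later, better defined round of structures.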

The quality of an NMR protein structure determination increases as the number of restraints increases (Havel & Wüthrich, 1985; Clore & Gronenborn, 1991a, 1991d; Havel, 1991; Clore et al., 1993). This progression in coordinate precision is illustrated in Figure 2, which shows 4 generations of structures ranging from the first generation, which simply provides a picture of the polypeptide fold with little detail, to the fourth generation, which is broadly equivalent to a 2-Å resolution X-ray structure.

Sequential resonance assignment

Conventional sequential resonance assignment relies on 2D homonuclear ¹H-¹H correlation experiments to identify amino acid spin systems coupled with 2D ¹H-¹H NOE experiments to identify sequential connectivities along the backbone of the type CαH(i)-NH(i + 1, 2, 3, 4), NH(i)-NH(i ± 2), and CαH(i)-CβH(i + 3) (Wüthrich, 1986; Clore & Gronenborn, 1987). This methodology has been successfully applied to proteins of less than 100 residues. For larger proteins, the spectral complexity is such that 2D experiments no longer suffice, and it is essential to increase the spectral resolution by increasing the dimensionality of the spectra (Oschkinat et al., 1988). In some cases it is still possible to apply the same strategy by making use of 3D heteronuclear (¹⁵N or ¹³C) edited experiments to increase the spectral resolution, as illustrated in Figure 3 (Fesik & Zuiderweg, 1988, 1990; Marion et al., 1989; Driscoll et al., 1990a). In many cases, however, numerous ambiguities still remain and it is advisable to adopt a sequential assignment strategy based solely on well-defined heteronuclear scalar couplings (Montelione & Wagner, 1989, 1990; Ikura et al., 1990; Clore & Gronenborn, 1991c; Bax & Grzesiek, 1993). The double and triple resonance experiments that we currently use, together with the correlations that they demonstrate, are summarized in Table 1. With the advent of pulsed field gradients to eliminate undesired coherence transfer pathways (Bax & Pochapsky, 1992), as opposed to selecting desired coherence transfer pathways (Hurd & John, 1991; Vuister et al., 1991), it is now possible to employ only 2-step phase-cycles without any loss in sensitivity (other than that due to the reduction in measurement time) such that each 3D experiment can be recorded in as little as 7 h. In most cases, however, signal-to-noise requirements necessitate 1-3 days measuring time depending on the experiment.

Stereospecific assignments and torsion angle restraints

It is often possible to obtain stereospecific assignments of β-methylene protons on the basis of a qualitative interpretation of the homonuclear ³Jαβ coupling constants and the intraresidue NOE data involving the NH, CαH, and CβH protons (Hyberts et al., 1987; Wagner et al., 1987). A more rigorous approach, which also permits one to obtain φ, ψ, and χ1 restraints, involves the application of a conformational grid search of φ, ψ, χ1 space on the basis of the homonuclear ³JHNα and ³Jαβ coupling constants (which are related to φ and χ1, respectively), and the intraresidue and sequential (i ± 1) interresidue NOEs involving the NH, CαH, and CβH protons (Güntert et al., 1989; Nilges et al., 1990). This information can be supplemented by the measurement of heteronuclear ³JNHβ and ³JCOHβ couplings, which are also related to χ1 (Vuister et al., 1994). Stereospecific assignment of valine methyl groups can be made on the basis of ³JCγCO and ³JNCγ couplings (Vuister et al., 1994), as well as on the basis of the pattern of intraresidue NOEs involving the NH, CαH, and CγH protons (Zuiderweg et al., 1985). Finally, stereospecific assignments of leucine methyl groups can be made on the basis of heteronuclear ³JCδCα and ³JCδHα couplings (Vuister et al., 1994) in combination with the pattern of intraresidue NOEs, provided that the stereospecific assignment of the β-methylene protons and the χ1 rotamer have been previously determined (Powers et al., 1993).
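To illustrate how a 3-bond coupling constant restricts a torsion angle, the sketch below scans φ on a grid and keeps the values whose predicted coupling agrees with a measured one. It is only a one-dimensional caricature of the grid search described above, which also uses NOE data and covers ψ and χ1. The Karplus form and its coefficients are assumptions here (commonly quoted literature values for the HN-Hα coupling, not given in this review), and the function names, tolerance, and grid step are hypothetical.

```python
import math

# Assumed Karplus coefficients in Hz for the HN-Halpha 3-bond coupling.
# These are commonly quoted values, not numbers taken from this review.
A, B, C = 6.4, -1.4, 1.9


def j_hn_ha(phi_deg):
    """Predicted HN-Halpha coupling (Hz) for backbone angle phi, with theta = phi - 60 degrees."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C


def phi_candidates(j_obs, tol=1.0, step=10):
    """Grid values of phi (degrees) whose predicted coupling lies within tol Hz of j_obs."""
    return [
        phi
        for phi in range(-180, 180, step)
        if abs(j_hn_ha(phi) - j_obs) <= tol
    ]


if __name__ == "__main__":
    print(round(j_hn_ha(-120.0), 2))   # extended-like phi, large coupling
    print(round(j_hn_ha(-60.0), 2))    # helix-like phi, small coupling
    print(phi_candidates(9.7, tol=0.5))
```

Because the Karplus curve is multivalued, a single coupling usually leaves several φ ranges open; combining it with intraresidue and sequential NOEs is what narrows the choice in practice.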


[Figure 2 panels]
1st generation: ~7 restraints per residue; rmsd 1.5 Å for backbone atoms, 2.0 Å for all atoms; example: purothionin.
2nd generation: ~10 restraints per residue; rmsd 0.9 Å for backbone atoms, 1.2 Å for all atoms; example: BDS-I.
3rd generation: ~13 restraints per residue; rmsd 0.7 Å for backbone atoms, 0.9 Å for all atoms; example: BDS-I.
4th generation: ~16 restraints per residue; rmsd 0.4 Å for backbone atoms, 0.9 Å for all atoms, < 0.5 Å for ordered side chains; example: interleukin-8.

Fig. 2. Illustration of the progressive improvement in the precision and accuracy of NMR structure determinations with increasing number of experimental restraints. All the structures have been calculated using the hybrid distance geometry-simulated annealing method, and in each case the NOE-derived interproton distance restraints have been grouped into 3 broad ranges (1.8-2.7 Å, 1.8-3.3 Å, and 1.8-5.0 Å) corresponding to strong, medium, and weak NOEs, respectively.


[Figure 3: 2D spectrum compared with a single plane of the 3D ¹⁵N-edited NOESY spectrum at ¹⁵N (F2) = 123.7 ppm; horizontal axis ¹H (F3).]

Fig. 3. Comparison of the NH-CαH/CβH region of a 2D ¹⁵N-edited NOESY spectrum with that of a single plane taken from the 3D ¹⁵N-edited NOESY spectrum, illustrating the increase in spectral resolution afforded by increasing the dimensionality from 2 to 3.

onances to a single NH resonance position. In the 2D spectrum it is impossible to ascertain whether this involves 1 NH proton or many. Extending the spectrum to 3D by separating the NOE interactions according to the ¹⁵N chemical shift of the nitrogen attached to each amide proton reveals that there are 3 NH protons involved. The identity of the originating aliphatic protons, however, is only specified by their proton chemical shifts. Yet the extent of spectral overlap in the aliphatic region of the spectrum vastly exceeds that in the amide region. This can be resolved by adding a further dimension in which each plane of the 3D spectrum now constitutes a cube in the 4D spectrum edited by the ¹³C shift of the carbon atom attached to each aliphatic proton. In this manner, each ¹H-¹H NOE interaction is specified by 4 chemical shift coordinates, the 2 protons giving rise to the NOE and the heavy atoms to which they are attached. The resolving power of 4D heteronuclear-edited NOE spectroscopy is illustrated in Figure 5.
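The gain from specifying each NOE by 4 chemical shift coordinates can be sketched as a lookup against a table of assigned protons. Everything in this example is hypothetical: the labels, shifts, and tolerances are invented for illustration, and the matching rule is a plain tolerance window rather than any published peak-picking method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proton:
    name: str      # hypothetical label for an assigned proton
    h_ppm: float   # 1H chemical shift
    x_ppm: float   # shift of the attached heavy atom (13C or 15N)


def assign(peak, protons, h_tol=0.03, x_tol=0.3):
    """Return the proton pairs compatible with a peak (h1, x1, h2, x2) within tolerance."""
    h1, x1, h2, x2 = peak
    return [
        (a.name, b.name)
        for a in protons
        for b in protons
        if a is not b
        and abs(a.h_ppm - h1) <= h_tol and abs(a.x_ppm - x1) <= x_tol
        and abs(b.h_ppm - h2) <= h_tol and abs(b.x_ppm - x2) <= x_tol
    ]


# Invented shifts: P1 and P2 overlap exactly in 1H but differ in 13C.
PROTONS = [
    Proton("P1", 1.40, 23.6),
    Proton("P2", 1.40, 30.1),
    Proton("P3", 1.67, 34.6),
]
PEAK = (1.40, 23.6, 1.67, 34.6)

# Ignoring the heavy-atom shifts mimics a spectrum resolved by 1H shifts only.
proton_only = assign(PEAK, PROTONS, x_tol=float("inf"))
# Using all 4 coordinates mimics the 4D heteronuclear-edited spectrum.
four_d = assign(PEAK, PROTONS)

if __name__ == "__main__":
    print(proton_only)
    print(four_d)
```

With proton shifts alone the peak is compatible with both P1 and P2 as the first proton; adding the heavy-atom coordinates leaves a single pair.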

Because the number of NOE interactions present in each 2D plane of a 4D ¹³C/¹⁵N- or ¹³C/¹³C-edited NOESY spectrum is so small, the inherent resolution in a 4D spectrum is extremely high, despite the low level of digitization. Indeed, spectra with equivalent resolution can be recorded at magnetic field strengths considerably lower than 600 MHz, although this would obviously lead to a reduction in sensitivity. Further, it can be calculated that 4D spectra with virtual lack of resonance overlap and good sensitivity can be obtained on proteins with as many as 400 residues. Thus, once complete ¹H, ¹⁵N, and ¹³C assignments are obtained, analysis of 4D ¹⁵N/¹³C- (Kay et al., 1990) and ¹³C/¹³C- (Clore et al., 1991a; Zuiderweg et al., 1991; Vuister et al., 1993) edited NOE spectra should permit the automated assignment of almost all NOE interactions.

Application of 3D and 4D NMR to protein structure determination of larger proteins: the structure of interleukin-1β

Although the potential of heteronuclear 3D and 4D NMR methods in resolving problems associated with both extensive resonance overlap and large linewidths is obvious, how does this new approach fare in practice? In this regard it should be borne in mind that resonance assignments are only a means to an end, and the true test of multidimensional NMR lies in examining its success in solving the problem that it was originally designed to tackle, namely the determination of high resolution 3D structures of larger proteins in solution.

The first successful demonstration of these new methods was the determination of the high resolution solution structure of interleukin-1β (IL-1β), a cytokine of 153 residues and molecular weight 17.4 kDa, which plays a key role in the immune and inflammatory responses (see Kinemage 1; Clore et al., 1991b). At the time IL-1β was 50% larger, in terms of number of residues, than the previously largest protein structures solved by NMR, namely human (Forman-Kay et al., 1991) and E. coli (Dyson et al., 1990) thioredoxin, which have 105 and 108 residues, respectively. Moreover, IL-1β still represents one of the most highly refined and precise structures for proteins of this size solved by NMR.

Table 1. Summary of correlations observed in the 3D double and triple resonance experiments used for sequential and side-chain assignments in our laboratory

Experiment           Correlation                                   J coupling (a)
¹⁵N-edited HOHAHA    CαH(i)-¹⁵N(i)-NH(i)                           ³JHNα
                     CβH(i)-¹⁵N(i)-NH(i)                           ³JHNα and ³Jαβ
HNHA                 CαH(i)-¹⁵N(i)-NH(i)                           ³JHNα
H(CA)NH              CαH(i)-¹⁵N(i)-NH(i)                           ¹JNCα
                     CαH(i - 1)-¹⁵N(i)-NH(i)                       ²JNCα
HNCA                 ¹³Cα(i)-¹⁵N(i)-NH(i)                          ¹JNCα
                     ¹³Cα(i - 1)-¹⁵N(i)-NH(i)                      ²JNCα
HN(CO)CA             ¹³Cα(i - 1)-¹⁵N(i)-NH(i)                      ¹JNCO and ¹JCαCO
HNCO                 ¹³CO(i - 1)-¹⁵N(i)-NH(i)                      ¹JNCO
HCACO                CαH(i)-¹³Cα(i)-¹³CO(i)                        ¹JCαCO
HCA(CO)N             CαH(i)-¹³Cα(i)-¹⁵N(i + 1)                     ¹JCαCO and ¹JNCO
CBCA(CO)NH           ¹³Cβ(i - 1)/¹³Cα(i - 1)-¹⁵N(i)-NH(i)          ¹JCαCO, ¹JNCO, and ¹JCC
CBCANH               ¹³Cβ(i)/¹³Cα(i)-¹⁵N(i)-NH(i)                  ¹JNCα and ¹JCC
                     ¹³Cβ(i - 1)/¹³Cα(i - 1)-¹⁵N(i)-NH(i)          ²JNCα and ¹JCC
HBHA(CO)NH           CβH(i - 1)/CαH(i - 1)-¹⁵N(i)-NH(i)            ¹JCαCO, ¹JNCO, and ¹JCC
HBHANH               CβH(i)/CαH(i)-¹⁵N(i)-NH(i)                    ¹JNCα and ¹JCC
                     CβH(i - 1)/CαH(i - 1)-¹⁵N(i)-NH(i)            ²JNCα and ¹JCC
C(CO)NH              ¹³Cj(i - 1)-¹⁵N(i)-NH(i)                      ¹JCαCO, ¹JNCO, and ¹JCC
H(CCO)NH             Hj(i - 1)-¹⁵N(i)-NH(i)                        ¹JCαCO, ¹JNCO, and ¹JCC
HCCH-COSY            Hj-¹³Cj-¹³Cj±1-Hj±1                           ¹JCC
HCCH-TOCSY           Hj-¹³Cj ... ¹³Cj±n-Hj±n                       ¹JCC

(a) In addition to the couplings indicated, all the experiments make use of the ¹JCH (~140 Hz) and/or ¹JNH (~95 Hz) couplings. The values of the couplings employed are as follows: ³JHNα ~ 3-10 Hz, ¹JCC ~ 35 Hz, ¹JCαCO ~ 55 Hz, ¹JNCO ~ 15 Hz, ¹JNCα ~ 11 Hz, ²JNCα ~ 7 Hz.

Despite extensive analysis of 2D spectra obtained at different pH values and temperatures, as well as examination of 2D spectra of mutant proteins, it did not prove feasible to obtain unambiguous ¹H assignments for more than about 30% of the residues of IL-1β (Driscoll et al., 1990a). Thus, any further progress could only be made by resorting to higher dimensionality heteronuclear NMR. The initial step involved the complete assignment of the ¹H, ¹⁵N, and ¹³C resonances of the backbone and side chains using many of the double and triple resonance 3D experiments listed in Table 1 (Clore et al., 1990a; Driscoll et al., 1990a, 1990b). In the second step, backbone and side-chain torsion angle restraints, as well as stereospecific assignments for β-methylene protons, were obtained by means of a 3D systematic grid search of φ, ψ, χ1 space (Nilges et al., 1990). In the third step, approximate interproton distance restraints between nonadjacent residues were derived from analysis of 3D and 4D heteronuclear-edited NOE spectra. Analysis of the 3D heteronuclear-edited NOE spectra alone was sufficient to derive a low resolution structure on the basis of a small number of NOEs involving solely NH, CαH, and CβH protons (Clore et al., 1990c). However, further progress using 3D NMR was severely hindered by the numerous ambiguities still present in these spectra, in particular for NOEs arising from the large number of aliphatic protons. Thus, the 4D heteronuclear-edited NOE spectra proved to be absolutely essential for the successful completion of this task. In addition, the proximity of backbone NH protons to bound structural water molecules was ascertained from a 3D ¹⁵N-separated rotating frame Overhauser (ROE) spectrum, which permits one to distinguish specific protein-water NOE interactions from chemical exchange with bulk solvent (Clore et al., 1990b). In this regard it should be emphasized that all the NOE data were interpreted in as conservative a manner as possible and were simply classified into 3 distance ranges, 1.8-2.7 Å, 1.8-3.3 Å, and 1.8-5.0 Å, corresponding to strong, medium, and weak intensity NOEs.
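The conservative three-class treatment of NOE intensities described above maps directly onto lower and upper distance bounds. The following sketch encodes the three ranges quoted in the text and reports by how much a given interproton distance falls outside its range; the dictionary and function names are illustrative, and the piecewise-linear violation measure is an assumption chosen for simplicity, not the target function used in the actual calculations.

```python
# Distance ranges in Å for the three NOE intensity classes quoted in the text.
NOE_BOUNDS = {
    "strong": (1.8, 2.7),
    "medium": (1.8, 3.3),
    "weak": (1.8, 5.0),
}


def violation(distance, intensity_class):
    """Return how far (Å) a distance lies outside the range for its NOE class, or 0.0."""
    lower, upper = NOE_BOUNDS[intensity_class]
    if distance < lower:
        return lower - distance
    if distance > upper:
        return distance - upper
    return 0.0


if __name__ == "__main__":
    for cls in ("strong", "medium", "weak"):
        print(cls, round(violation(3.0, cls), 2))
```

Sharing the 1.8 Å lower bound across all classes means a weak NOE never excludes a short distance; only the upper bound tightens as the intensity class gets stronger.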

With an initial set of experimental restraints in hand, 3D structure calculations were initiated using the hybrid distance geometry-dynamical simulated annealing method (Nilges et al., 1988). A key aspect of the overall strategy lies in the use of an iterative approach whereby the experimental data are reexamined in the light of the initial set of calculated structures in order to resolve ambiguities in NOE assignments, to obtain more stereospecific assignments (e.g., the α-methylene protons of glycine and the methyl groups of valine and leucine) and torsion angle restraints, and to assign backbone hydrogen bonds associated with slowly exchanging NH protons as well as with bound water molecules. The iterative cycle comes to an end when all the experimental data have been interpreted.

The final experimental data set for IL-1β comprised a total of 3,146 approximate and loose experimental restraints made up of 2,780 distance and 366 torsion angle restraints (Clore et al., 1991b). This represents an average of ~21 experimental restraints per residue. If one takes into account that interresidue NOEs affect 2 residues, whereas intraresidue NOE and torsion angle restraints only affect individual residues, the average number of restraints influencing the conformation of each residue is approximately 33. Superpositions of the backbone atoms and selected side chains for 32 independently calculated structures are shown in Figure 6B and D. All 32 structures satisfy the experimental restraints within their specified errors, display very small deviations from idealized covalent geometry, and have good nonbonded contacts. It can be seen that both the backbone and the ordered side chains are exceptionally well defined. Indeed, the atomic RMS distribution about the mean coordinate


[Figure 4: schematic 2D and 3D panels; axes F2 (¹H) and F4 (¹H), with δ¹³C or δ¹⁵N along F3.]

Fig. 4. Schematic illustration of the progression and relationship between 2D, 3D, and 4D heteronuclear-edited NMR spectroscopy.

positions is 0.4 Å for the backbone atoms, 0.8 Å for all atoms, and 0.5 Å for side chains with ≤40% of their surface (relative to that in a tripeptide Gly-X-Gly) accessible to solvent (Clore et al., 1991b).
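The restraint statistics quoted for IL-1β can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text (2,780 distance restraints, 366 torsion angle restraints, 153 residues); the function name is illustrative. The figure of approximately 33 restraints influencing each residue cannot be reproduced here because it depends on the split between interresidue and intraresidue NOEs, which is not given in this section.

```python
def restraints_per_residue(n_distance, n_torsion, n_residues):
    """Return (total restraints, average restraints per residue)."""
    total = n_distance + n_torsion
    return total, total / n_residues


# Values quoted in the text for IL-1β.
TOTAL, PER_RESIDUE = restraints_per_residue(2780, 366, 153)

if __name__ == "__main__":
    print(TOTAL, round(PER_RESIDUE, 1))
```

The result, about 20.6, rounds to the ~21 restraints per residue stated in the text.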

The structure of IL-1β itself resembles a tetrahedron and displays 3-fold internal pseudosymmetry (Kinemage 1). There are 12 β-strands arranged in an exclusively antiparallel β-structure, and 6 of the strands form a β-barrel (seen in the front of Fig. 6A), which is closed off at the back of the molecule by the other 6 strands. Each repeating topological unit is composed of 5 strands arranged in an antiparallel manner with respect to each other, and one of these units is shown in Figure 6C. Water molecules occupy very similar positions in all 3 topological units, as well as at the interface of the 3 units, and are involved in bridging backbone hydrogen bonds. Thus, in the case of the topological unit shown in Figure 6C, the water molecule labeled W5 accepts a hydrogen bond from the NH of Phe-112 in strand IX and donates 2 hydrogen bonds to the backbone carbonyls of Ile-122 in strand X and Thr-144 in strand XII. The packing of some internal residues with respect to one another, as well as the excellent definition of internal side chains, is illustrated in Figure 6D. Because of the high resolution of the IL-1β structure it was possible to analyze in detail side chain-side chain interactions involved in stabilizing the structure. In addition, examination of the structure in the light of mutational data permitted us to propose the presence of 3 distinct sites involved in the binding of IL-1β to its cell surface receptor (Clore et al., 1991b).


Fig. 5. Example of the increase in spectral resolution afforded by 4D ¹³C/¹³C-edited NOE spectroscopy, illustrated with interleukin-1β. A: ¹H(F2)-¹H(F4) plane of the 4D spectrum at δ¹³C(F1) = 44.3 ppm and δ¹³C(F3) = 34.6 ppm; the region between 1 and 2 ppm is boxed in and the arrow indicates the position of the Lys-77 CγH-CβH NOE cross peak. B: 2D ¹H-¹H NOE spectrum between 1 and 2 ppm; the X marks the chemical shift position of the Lys-77 CγH-CβH NOE cross peak seen in A. C, D: Positive and negative contours in the ¹³C(F1)-¹³C(F3) plane of the 4D spectrum at the ¹H chemical shift coordinates, δ¹H(F2) = 1.39 ppm and δ¹H(F4) = 1.67 ppm, corresponding to the Lys-77 CγH-CβH NOE cross peak seen in A and the X mark shown in B. Because extensive folding is employed, the ¹³C chemical shifts are given by x ± nSW, where x is the ppm value listed in the figure, n an integer, and SW the spectral width (20.71 ppm). Peaks folded an even number of times are of opposite sign to those folded an odd number of times. All the peaks in A are positive except for the two indicated by an asterisk, which are negative.
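The folding rule in the caption, true shift = x ± nSW, can be turned into a short enumeration of candidate ¹³C shifts for a listed value. This is a minimal sketch: the spectral width of 20.71 ppm and the example value of 44.3 ppm come from the caption, whereas the function name and the choice to look at up to 2 folds in either direction are assumptions. Which candidate is the true shift still has to be decided from the known assignments.

```python
def unfolded_candidates(listed_ppm, sw_ppm=20.71, max_folds=2):
    """List (n, candidate shift in ppm, fold parity) for listed_ppm + n * SW.

    Parity is reported because peaks folded an even number of times have the
    opposite sign to peaks folded an odd number of times.
    """
    candidates = []
    for n in range(-max_folds, max_folds + 1):
        parity = "even" if n % 2 == 0 else "odd"
        candidates.append((n, round(listed_ppm + n * sw_ppm, 2), parity))
    return candidates


if __name__ == "__main__":
    for n, shift, parity in unfolded_candidates(44.3):
        print(f"n = {n:+d}: {shift:7.2f} ppm ({parity} number of folds)")
```

Combining the peak sign with the parity column rules out half of the candidates before any assignment information is used.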

Combining experimental information from crystal and resolution structures of small to medium-sized proteins of less solution studies: joint X-ray and NMR refinement than about 35 kDa. IL-10 offers an ideal system for compar-

ing the results of NMR and X-ray crystallography as, in addi- It is clear from the preceding discussion that NMR is a valid tion to the solution structure, there are 3 independently solved method, alongside X-ray crystallography, for determining high X-ray structures at 2 A resolution of the same crystal form (Fin-

Page 8: YOUNG INVESTIGATOR AWARD LECTURE Structures of larger ... · Protein structure determination by multidimensional heteronuclear NMR 375 3 4 rz z M 5 6 3D 15N-edited NOESY E83&82a e

Protein structure determination by multidimensional heteronuclear NMR

A

B (40

h

379

Fig. 6. Solution structure of interleukin-10 determined by 3D and 4D heteronuclear NMR spectroscopy. A: Ribbon diagram of the polypeptide fold. B: Superposition of the backbone (N, C", C) atoms of 32 sim- ulated annealing structures calculated from the experimental NMR data. C: Superpo- sition of the backbone (N, C", C, 0) at- oms of one of the 3 repeating topological units, illustrating the position of tightly bound water at the interface of the 3 cen- tral strands of the unit. D: Superposition of all atoms (excluding protons) for selected side chains. The diagram in (A) was made with the program MOLSCRIPT (Kraulis, 1991). The coordinates are from Clore et al. (1991a) (PDB accession code 611B).

D

Page 9: YOUNG INVESTIGATOR AWARD LECTURE Structures of larger ... · Protein structure determination by multidimensional heteronuclear NMR 375 3 4 rz z M 5 6 3D 15N-edited NOESY E83&82a e

3 80

zel et al., 1989; Priestle et al., 1989; Veerapandian et al., 1992). The backbone atomic RMS difference between the NMR and the X-ray structures is about 1 Å, with the largest differences being confined to some of the loops and turns connecting the β-strands (Clore & Gronenborn, 1991b). Interestingly, however, the atomic RMS distribution of the 32 calculated solution structures about their mean coordinate positions (~0.4 Å for the backbone atoms, ~0.8 Å for all atoms, and ~0.5 Å for all atoms of internal residues) is approximately the same as the atomic RMS differences between the 3 X-ray structures, indicating that the positional errors in the atomic coordinates determined by the 2 methods are similar (Clore & Gronenborn, 1991b). Upon initial inspection, the X-ray structures appear to be incompatible with the NMR data, as manifested by a relatively large number of NOE and torsion angle violations, and conversely, the NMR structure fits the X-ray data poorly, with an R-factor of 40-50%. Because of the very different nature of the 2 methods, it is not immediately apparent whether these discrepancies reflect genuine differences between the solution and X-ray structures or differences in the computational procedures employed. To analyze this in more detail, we have developed a new method of structure determination in which the NMR and X-ray data are combined and used simultaneously in the structure refinement (Shaanan et al., 1992). Using this approach, we have shown that a model can readily be generated from a joint NMR/X-ray refinement that is compatible with the data from both techniques. Thus, there are only minimal violations of the NMR restraints (NOEs and torsion angles), the value of the crystallographic R-factor is comparable to, if not better than, that derived from refinement against the crystallographic data alone, and the deviations from idealized covalent geometry are small. In addition, the Rfree (Brünger, 1992) for the model refined with the NMR and X-ray restraints is smaller than that of the model obtained by conventional crystallographic refinement, indicating that the crystallographic phases obtained by the joint NMR/X-ray refinement are more accurate. Moreover, the few NMR observations that are still violated by the model serve as an indicator for genuine differences between the crystal and solution structures.
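The R-factor and Rfree comparisons above can be illustrated with a minimal sketch. It assumes the conventional definition R = Σ||Fobs| - |Fcalc|| / Σ|Fobs|, with Rfree being the same quantity evaluated over reflections excluded from refinement. The function names and the toy amplitudes are hypothetical, the amplitudes are assumed to be already on a common scale, and this is not the refinement procedure used for interleukin-1β.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor: sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_indices):
    """Split reflections into a working set and a free (test) set and
    return (R_work, R_free). free_indices are 0-based reflection indices
    that are assumed to have been left out of the refinement."""
    free = set(free_indices)
    work_obs, work_calc, free_obs, free_calc = [], [], [], []
    for i, (o, c) in enumerate(zip(f_obs, f_calc)):
        if i in free:
            free_obs.append(o)
            free_calc.append(c)
        else:
            work_obs.append(o)
            work_calc.append(c)
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)


# Toy structure-factor amplitudes (hypothetical values, arbitrary units).
F_OBS = [100.0, 50.0, 25.0, 80.0]
F_CALC = [90.0, 55.0, 20.0, 80.0]

if __name__ == "__main__":
    print("R (all reflections):", r_factor(F_OBS, F_CALC))
    print("R_work, R_free:", r_work_and_free(F_OBS, F_CALC, free_indices=[0]))
```

Because the free reflections play no part in the refinement, a lower Rfree is less likely to arise from overfitting than a lower working R-factor, which is why it serves as the comparison criterion here.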


The implications of the joint NMR/X-ray refinement method for structural biology are of considerable significance. In particular, the full potential and future use of the method will be for structure determinations of multidomain proteins for which only low resolution X-ray data for the entire protein are available but for which detailed structural information may be obtained by NMR on the individual domains. Using the joint X-ray/NMR refinement approach in such cases will open the way to the study of proteins that may otherwise never be structurally accessible by either of the two methods alone.

Structure determination of protein-peptide and protein-DNA complexes

Provided that the ligand (e.g., a peptide, an oligonucleotide, or a drug) presents a relatively simple spectrum that can be assigned by 2D methods, the most convenient strategy for dealing with protein-ligand complexes is one in which the protein is labeled with 15N and 13C and the ligand is unlabeled (i.e., at natural isotopic abundance) (Ikura & Bax, 1992; Ikura et al., 1992). It is then possible to use a combination of heteronuclear filtering and editing to design experiments in which correlations involving only protein resonances, only ligand resonances, or only through-space interactions between ligand and protein are observed. These experiments are summarized in Table 2 and were first applied successfully to a complex of calmodulin with a target peptide from skeletal muscle myosin light chain kinase (see Kinemage 2; Ikura et al., 1992), and subsequently to the specific complex of the DNA binding domain of the transcription factor GATA-1 with its cognate DNA target site (see Kinemage 3; Omichinski et al., 1993a).

Structure of the calmodulin-target peptide complex

Calmodulin (CaM) is a ubiquitous Ca2+ binding protein of 148 residues that is involved in a wide range of cellular Ca2+-dependent signaling pathways, thereby regulating the activity of a large number of proteins (Cohen & Klee, 1988). The crystal

Table 2. Summary of heteronuclear-filtered and -edited NOE experiments used to study protein-ligand complexes comprising a uniformly 15N/13C labeled protein and an unlabeled ligand

Type of contact                                         Connectivity

A. Intramolecular protein contacts
  4D 13C/13C-edited NOE in D2O                          H(j)-13C(j)-H(i)-13C(i)
  4D 15N/13C-edited NOE in H2O                          H(j)-15N(j)-H(i)-13C(i)
  3D 15N/15N-edited NOE in H2O                          H(j)-15N(j)-H(i)-15N(i)

B. Intramolecular ligand contacts
  2D 12C,14N(F1)/12C,14N(F2)-filtered NOE in H2O(a)     H(j)-12C(j)-H(i)-12C(i)
                                                        H(j)-14N(j)-H(i)-12C(i)
                                                        H(j)-12C(j)-H(i)-14N(i)
                                                        H(j)-14N(j)-H(i)-14N(i)
  2D 12C(F1)/12C(F2)-filtered NOE in D2O(a)             H(j)-12C(j)-H(i)-12C(i)

C. Intermolecular protein-ligand contacts
  3D 15N-edited(F1)/14N,12C(F3)-filtered NOE in H2O     H(j)-15N(j)-H(i)-12C(i)
                                                        H(j)-15N(j)-H(i)-14N(i)
  3D 13C-edited(F1)/12C(F3)-filtered NOE in D2O         H(j)-13C(j)-H(i)-12C(i)

(a) Similar heteronuclear-filtered 2D correlation and Hartmann-Hahn spectra can also be recorded to assign the spin systems of the ligand.


structure of Ca2+-CaM had been solved a number of years ago (Babu et al., 1985). It is a dumbbell-shaped molecule with an overall length of ~65 Å consisting of 2 globular domains, each of which contains 2 Ca2+ binding sites of the helix-loop-helix type, connected by a long, solvent-exposed, rigid central helix some 8 turns in length (residues 66-92). In solution, on the other hand, 1H-15N NMR relaxation measurements have demonstrated unambiguously that the central helix is disrupted near its midpoint, with residues 78-81 adopting an essentially unstructured “random coil” conformation that is so flexible that the N- and C-terminal domains of Ca2+-CaM effectively tumble independently of each other (Barbato et al., 1992). Thus, in solution, the so-called “central helix” is not a helix at all but a “flexible tether” whose purpose is to keep the 2 domains in close proximity for binding to their target.

In order to understand the way in which Ca2+-CaM recognizes its target sites, we set out to solve, in collaboration with Ad Bax, the solution structure of a complex of Ca2+-CaM with a 26-residue peptide (known as M13) comprising residues 577-602 of the CaM binding domain of skeletal muscle myosin light chain kinase (Kinemage 2). The solution structure was determined on the basis of 1,995 experimental NMR restraints, including 133 interproton distance restraints between the peptide and the protein. The N- (residues 1-5) and C- (residues 147-148) termini of CaM, the tether connecting the 2 domains of CaM (residues 74-82), and the N- (residues 1-2) and C- (residues 22-26) termini of M13 were ill-defined by the NMR data and appear to be disordered in solution. The atomic RMS distribution about the mean coordinate positions for the rest of the structure (i.e., residues 6-73 and 83-146 of CaM and residues 3-21 of M13) is 1.0 Å for the backbone atoms and 1.4 Å for all atoms. Thus, this structure represents a second generation structure in the classification (Clore & Gronenborn, 1991a). A stereo view showing a best fit superposition of the 24 calculated structures is shown in Figure 7A.

The major conformational change in Ca2+-CaM that occurs upon binding M13 involves an extension of the flexible tether (residues 78-81) in the middle of the central helix of the solution structure of free Ca2+-CaM to a long flexible loop extending from residues 74 to 81, flanked by 2 helices (residues 65-73


Fig. 7. Solution structure of the Ca2+-CaM-M13 peptide complex determined by 3D and 4D heteronuclear NMR spectroscopy. A: Superposition of the backbone (N, Cα, C) atoms of 24 simulated annealing structures calculated from the experimental NMR data; the N- and C-terminal domains of calmodulin are shown in blue and red, respectively, and the M13 peptide is in green; the restrained regularized average structure is highlighted. B, C: Two orthogonal views of a schematic ribbon drawing representation of the structure with the N- and C-terminal domains of CaM in blue and purple, respectively, the M13 peptide in yellow, the hydrophobic side chains of the protein in red, and Trp-4, Phe-8, Val-11, and Phe-17 side chains of the peptide in green. The diagrams in B and C were generated with the program VISP (de Castro & Edelstein, 1992). The coordinates are from Ikura et al. (1992) (PDB accession code 1BBM).


and 83-93), thereby enabling the 2 domains to come together, gripping the peptide rather like 2 hands capturing a rope. The hydrophobic channel formed by the 2 domains is complementary in shape to that of the peptide helix. This is clearly illustrated by the schematic ribbon drawings shown in Figure 7B and C, which also serve to highlight the approximate 2-fold pseudosymmetry of the complex. Thus, whereas the 2 domains of CaM are arranged in an approximately orthogonal manner to each other in the crystal structure of Ca2+-CaM (Babu et al., 1985), in the Ca2+-CaM-M13 complex they are almost symmetrically related by a 180° rotation about a 2-fold axis. A large conformational change also occurs in the M13 peptide upon complexation, from a random coil state to a well-defined helical conformation. Indeed, the helix involves all the residues (3-21) of M13 that interact with CaM, whereas the N- (residues 1-2) and C- (residues 22-26) termini of the peptide, which do not interact with CaM, remain disordered.

Upon complexation there is a decrease in the accessible surface area of CaM and M13 of 1,848 and 1,477 Å², respectively, which corresponds to a decrease in the calculated solvation free energy of folding (Eisenberg & McLachlan, 1986) of 18 and 20 kcal·mol⁻¹, respectively. This large decrease in solvation free energy would account for the very tight binding (Kass ~ 10⁹ M⁻¹) of M13 to calmodulin. In addition, the accessible surface area of the portion of M13 (residues 3-21) in direct contact with CaM in the complex is only 494 Å² compared to an accessible surface area of 3,123 Å² for a random coil and 2,250 Å² for a helix. Thus, over 80% of the surface of the peptide in contact with CaM is buried.
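The buried fraction quoted above follows directly from the accessible surface areas given in the text. The short sketch below reproduces the arithmetic. The function name is hypothetical, and the choice of reference state (random coil or free helix) is an assumption made explicit in the code rather than something the text specifies.

```python
def buried_fraction(asa_in_complex, asa_reference):
    """Fraction of the reference accessible surface area that is no
    longer accessible in the complex."""
    if asa_reference <= 0:
        raise ValueError("reference area must be positive")
    return 1.0 - asa_in_complex / asa_reference


# Accessible surface areas (in square angstroms) for residues 3-21 of M13,
# taken from the text.
ASA_COMPLEX = 494.0
ASA_RANDOM_COIL = 3123.0
ASA_HELIX = 2250.0

if __name__ == "__main__":
    coil = buried_fraction(ASA_COMPLEX, ASA_RANDOM_COIL)
    helix = buried_fraction(ASA_COMPLEX, ASA_HELIX)
    print(f"buried relative to random coil: {coil:.1%}")
    print(f"buried relative to free helix:  {helix:.1%}")
```

With these numbers, the buried fraction is about 84% relative to the random coil and about 78% relative to the isolated helix, so the figure of over 80% corresponds to the random coil reference.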

In the view shown in Figure 7B, the roof of the channel is formed by helices II (residues 29-38) and VI (residues 102-111) of the N- and C-terminal domains, respectively, which run antiparallel to each other; the floor is formed by the flexible loop (residues 74-82) connecting the 2 domains and by helix VIII (residues 138-146) of the C-terminal domain. The front of the channel in Figure 7B and the left wall of the channel in Figure 7C is formed by helices I (residues 7-19) and IV (residues 65-73) and the mini-antiparallel β-sheet comprising residues 26-28 and 62-64, all from the N-terminal domain; the back of the channel in Figure 7B and the right wall of the channel in Figure 7C is formed by helices V (residues 83-93) and VIII (residues 138-146) and the mini-antiparallel β-sheet comprising residues 99-101 and 135-137, all from the C-terminal domain. The 2 domains of CaM are staggered with a small degree of overlap such that the hydrophobic face of the N-terminal domain mainly contacts the C-terminal half of the M13 peptide, whereas the C-terminal domain principally interacts with the N-terminal half of M13 (Fig. 7B).

The overall Ca2+-CaM-M13 complex has a compact globular shape approximating to an ellipsoid with dimensions 47 × 32 × 30 Å. The helical M13 peptide passes through the center of the ellipsoid at an angle of ~45° to its long axis. By way of contrast, the approximate dimensions of the Ca2+-CaM X-ray structure are 65 × 30 × 30 Å (Babu et al., 1985). In addition, the calculated radius of gyration for Ca2+-CaM-M13 is ~17 Å, which is completely consistent with the decrease in the radius of gyration from ~21 Å to ~16 Å observed by both small angle X-ray and neutron scattering upon complexation of Ca2+-CaM with M13 (Heidorn et al., 1989).
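A radius of gyration of the kind quoted above can be computed from a set of atomic coordinates as the (optionally mass-weighted) RMS distance of the atoms from their center of mass. The following is a minimal sketch of that definition only. The function name and the two-point test coordinates are hypothetical, and no assumption is made about how the value for the complex was actually obtained (for example, which atoms or weights were used).

```python
import math


def radius_of_gyration(coords, masses=None):
    """Root-mean-square distance of points from their center of mass.

    coords: list of (x, y, z) tuples, in angstroms.
    masses: optional list of weights; equal weights are used if omitted.
    """
    if not coords:
        raise ValueError("no coordinates given")
    if masses is None:
        masses = [1.0] * len(coords)
    if len(masses) != len(coords):
        raise ValueError("coords and masses differ in length")
    total = sum(masses)
    center = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)
    ]
    weighted_sq = sum(
        m * sum((c[k] - center[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


if __name__ == "__main__":
    # Two equal points 2 units apart: each lies 1 unit from the center.
    print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

In practice the coordinate list would be read from the deposited structure; a value calculated this way from the bare atomic model need not coincide exactly with a scattering-derived value, which also reflects the solvent.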

The Ca2+-CaM-M13 complex is stabilized by numerous hydrophobic interactions, which are summarized in Figure 8.

Particularly striking are the interactions of Trp-4 and Phe-17 of the peptide, which serve to anchor the N- and C-terminal halves of M13 to the C-terminal and N-terminal hydrophobic patches of CaM, respectively (Fig. 7C). These interactions also involve a large number of methionine residues that are unusually abundant in CaM, in particular 4 methionines in the C-terminal domain (Met-109, Met-124, Met-144, and Met-145) and 3 methionines in the N-terminal domain (Met-36, Met-51, and Met-71). Because methionine is an unbranched hydrophobic residue extending over 4 heavy atoms (Cβ, Cγ, Sδ, Cε), the abundance of methionines can generate a hydrophobic surface whose detailed topology is readily adjusted by minor changes in side-chain conformation, thereby providing a mechanism to accommodate and recognize different bound peptides (O'Neil & DeGrado, 1990).

In addition to hydrophobic interactions, there are a number of possible electrostatic interactions that can be deduced from the calculated NMR structures. Putative interactions exist between the Arg and Lys residues of M13 and the Glu residues of CaM, and these are also included in Figure 8. Glu-11 and Glu-14 in helix I are within 5 Å of Lys-5 and Lys-6 of M13; Glu-83, Glu-84, and Glu-87 in helix V of CaM are close to Lys-19, Arg-16, and Lys-18, respectively, of M13; and Glu-127 in helix VII of CaM is close to Arg-3 of M13.

The solution structure of the Ca2+-CaM-M13 complex explains a number of interesting observations. Studies of backbone amide exchange behavior have shown that upon complexation with M13, the amide exchange rates of residues 75-79 are substantially increased (Spera et al., 1991). Prior NMR studies on Ca2+-CaM indicated that the long central helix is already disrupted near its middle (from Asp-78 to Ser-81) in solution (Ikura et al., 1991) and that large variations in the orientation of one domain relative to the other occur randomly with time (Barbato et al., 1992). The further disruption of the central helix upon complexation seen in the structure of the complex is manifested by the increased amide exchange rates and supports the view of the central helix serving as a flexible linker between the 2 domains. Similarly, the structure of the complex explains the finding that as many as 4 residues can be deleted from the middle of the central helix without dramatically altering the stability or shape of the Ca2+-CaM-M13 complex (Persechini et al., 1989; Kataoka et al., 1991a), as the long flexible loop connecting the 2 domains can readily be shortened without causing any alteration in the structure (cf. Fig. 7). The observation from photoaffinity labeling studies that the 2 domains of CaM interact simultaneously with opposite ends of the peptide, such that residue 4 of the peptide (numbering for M13) can be crosslinked to Met-124 or Met-144 of the C-terminal domain and residue 13 of the peptide can be crosslinked to Met-71 of the N-terminal domain (O'Neil et al., 1989), is readily explained by the structural finding that the N-terminal half of the peptide interacts predominantly with the C-terminal domain, whereas the C-terminal half of the peptide interacts predominantly with the N-terminal domain (Figs. 7, 8). The observation that at least 17 residues of the M13 peptide from either skeletal muscle or smooth muscle are necessary for high affinity binding (Lukas et al., 1986; Blumenthal & Krebs, 1987) is readily explained by the intimate interactions of the C-terminal hydrophobic residue (i.e., Phe-17) with the N-terminal domain of CaM by which the peptide is anchored. Finally, the structure accounts for experiments in which crosslinking of residues 3 and 146 of CaM, mutated to Cys, has no effect on the activation of myosin light chain kinase, even if the central helix is cleaved proteolytically at Lys-77 by trypsin (Persechini & Kretsinger, 1988). Thus, although the Cα carbons of residues 3 and 146 are 37 Å apart in the X-ray structure of Ca2+-CaM, they are only ~20 Å apart in the solution structure of the Ca2+-CaM-M13 complex, which is close enough to permit crosslinking to occur.

Fig. 8. Summary of residue pairs for which intermolecular NOEs between CaM and M13 are observed. CaM residues involved in hydrophobic interactions are boxed. Also included are potential electrostatic interactions between negatively charged Glu residues of CaM (shown in parentheses) and positively charged Lys and Arg residues of M13.

A large body of experimental data shows that CaM binds to numerous proteins whose binding domains exhibit a propensity for α-helix formation (Cohen & Klee, 1988). A comparison of these sequences reveals little homology. Nevertheless, many of the very tightly binding peptides (Kass ≥ 5 × 10⁷ M⁻¹) have the common property of containing either aromatic residues or long chain hydrophobic residues (Leu, Ile, or Val) separated by 12 residues, as summarized in Figure 9. In the case of M13, these 2 residues are Trp-4 and Phe-17, which are exclusively in contact with the C- and N-terminal domains of CaM, respectively (Figs. 7, 8). Given that these 2 residues are involved in more hydrophobic interactions with CaM than any other residues of the peptide (cf. Fig. 8), it seems likely that this feature of the sequence can be used to align the CaM binding sequences listed in Figure 9, thereby permitting one to predict their interaction with CaM. It is clear from this alignment that the pattern of hydrophobic and hydrophilic residues is in general comparable for the various peptides, suggesting that the mode of binding and the structure of the corresponding complexes with Ca2+-CaM are also likely to be similar. For example, there is, in general, conservation of hydrophobic residues at the positions equivalent to Phe-8, which interacts with the C-terminal domain, and Val-11, which interacts with both domains (cf. Figs. 7, 8). In addition, there are no acidic residues present that would result in unfavorable electrostatic interactions with the negatively charged Glu residues on the surface of CaM (cf. Fig. 7). The minimum length of peptide required for high affinity binding to Ca2+-CaM is defined by the 14-residue mastoparans, which comprise the 2 hydrophobic residues at the N- and C-termini (Fig. 9) and have approximately the same equilibrium association constant (Kass ~ 1-3 × 10⁹ M⁻¹) as M13 (Cox et al., 1985). This structural alignment also predicts that a peptide stopping just short of the second hydrophobic residue of the pair (i.e., the residue equivalent to Phe-17) would only bind to the C-terminal domain and that the resulting complex would therefore retain the dumbbell shape of Ca2+-CaM. This is exactly what has been observed by small angle X-ray scattering using 2 synthetic peptides, C24W and C20W (Fig. 9), comprising different portions of the CaM binding domain of the plasma membrane Ca2+ pump (Kataoka et al., 1991b). The complex with the C24W peptide, which corresponds to residues 1-24 of M13 and contains a Trp at position 4 and a Val at position 17, has a globular shape similar to that of Ca2+-CaM-M13. The complex with the C20W peptide, on the other hand, which corresponds to residues -4 to 16 of M13 and therefore lacks the C-terminal hydrophobic residue of the pair, retains the dumbbell shape of Ca2+-CaM, suggesting that the peptide only binds to the C-terminal domain.
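The alignment criterion described above, 2 aromatic or long chain hydrophobic residues with 12 residues between them, can be sketched as a simple sequence scan. The function name and the residue set are illustrative assumptions. The M13 string below is deliberately partial: it contains only the residues named in the text (Arg-3, Trp-4, Lys-5, Lys-6, Phe-8, Val-11, Arg-16, Phe-17, Lys-18, Lys-19), with every other position written as a lowercase "x" placeholder rather than guessed.

```python
# Aromatic (Trp, Phe, Tyr) and long chain hydrophobic (Leu, Ile, Val)
# residues, in one-letter code. Including Tyr among the aromatics is an
# assumption; the text names the category, not the individual residues.
ANCHOR_RESIDUES = set("WFYLIV")


def find_anchor_pairs(sequence, intervening=12):
    """Return (pos_i, res_i, pos_j, res_j) for every pair of anchor
    residues with exactly `intervening` residues between them.
    Positions are 1-based, matching the numbering used in the text."""
    offset = intervening + 1
    pairs = []
    for i, first in enumerate(sequence):
        j = i + offset
        if j >= len(sequence):
            break
        second = sequence[j]
        if first in ANCHOR_RESIDUES and second in ANCHOR_RESIDUES:
            pairs.append((i + 1, first, j + 1, second))
    return pairs


# Partial 26-residue M13 sequence; "x" marks positions not given in the text.
M13_PARTIAL = "xxRWKKxFxxVxxxxRFKKxxxxxxx"

if __name__ == "__main__":
    print(find_anchor_pairs(M13_PARTIAL))  # [(4, 'W', 17, 'F')]
```

A peptide truncated before the second anchor, as described for C20W, would return no pair from such a scan, which is the sequence-level counterpart of the prediction that it binds only one domain.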

Thus the solution structure of the complex of Ca2+-CaM with M13 reveals an unusual binding mode in which the target


[Fig. 9 alignment: sequences of SK-MLCK M13, SM-MLCK M13, Ca2+ pump C24W, C20W, calspermin, calcineurin, mastoparan, mastoparan X, and melittin, numbered 1-25 according to M13, with the interacting domain of CaM (C or N) indicated.]

Fig. 9. Alignment of tightly binding (Kass > 5 × 10⁷ M⁻¹) CaM binding sequences based on the structural role of Trp-4 and Phe-17 in anchoring the M13 peptide to the C- and N-terminal domains of CaM, respectively.

peptide is sequestered in a hydrophobic channel formed by the 2 domains of CaM, with interactions involving 19 residues of the target peptide (i.e., residues 3-21 of M13). In addition, a key requirement appears to be the presence of 2 long chain hydrophobic or aromatic residues separated by 12 residues in order to anchor the peptide to the 2 domains of CaM (Fig. 7). By analogy, the rope (i.e., the CaM binding domain of the target) has to be long enough and have a knot at each end for the 2 hands (i.e., domains) of CaM to grip it. This particular mode of binding is therefore only likely to occur if the CaM binding site is located either at an easily accessible C- or N-terminus or in a long exposed surface loop of the target protein. An example of the former is myosin light chain kinase and of the latter is calcineurin, and, in accordance with their location, the CaM binding sites are susceptible to proteolysis (Blumenthal & Krebs, 1987; Guerini & Klee, 1991). Clearly, other types of complexes between Ca2+-CaM and its target proteins are possible given the inherent flexibility of the central helix. For example, in the case of the γ subunit of phosphorylase kinase, it appears that there are 2 discontinuous CaM binding sites that are capable of binding to Ca2+-CaM simultaneously (Dasgupta et al., 1989), and binding of a peptide derived from one of these sites causes elongation rather than contraction of Ca2+-CaM (Trewhella et al., 1990), indicating that the complex is of a quite different structural nature. Similarly, in the case of cyclic nucleotide phosphodiesterase (Charbonneau et al., 1991) and CaM kinase II (Bennett & Kennedy, 1987), the CaM binding sequences do not have the same spacing of hydrophobic residues seen in M13 and the other sequences listed in Figure 9, and, in addition, CaM kinase II is not susceptible to proteolysis in the absence of phosphorylation (Kwiatkowski & King, 1989), suggesting that the mode of binding is different again. Thus, in all likelihood, the complex of Ca2+-CaM with the M13 peptide from skeletal muscle myosin light chain kinase represents one of a range of Ca2+-CaM binding modes achieving CaM-target protein interactions in an efficient and elegant manner.

Structure of the specific complex of the transcription factor GATA-1 with DNA

The erythroid-specific transcription factor GATA-1 is responsible for the regulation of transcription of erythroid-expressed genes and is an essential component required for the generation of the erythroid lineage (Orkin, 1992). GATA-1 binds specifically as a monomer to the asymmetric consensus target sequence (T/A)GATA(A/G) found in the cis-regulatory elements of all globin genes and most other erythroid-specific genes that have been examined (Evans & Felsenfeld, 1989). GATA-1 was the first member of a family of proteins, which now includes regulatory proteins expressed in other cell lineages, characterized by their recognition of the GATA DNA sequence and by the presence of 2 metal-binding regions of the form Cys-X-X-Cys-(X)17-Cys-X-X-Cys separated by 29 residues. Mutation and deletion studies on GATA-1 have indicated that the N-terminal metal-binding region is not required for specific DNA binding (Martin & Orkin, 1986), and studies with synthetic peptides have demonstrated conclusively that a 59-residue fragment (residues 158-216 of chicken GATA-1) comprising the C-terminal metal binding region complexed to zinc and 28 residues C-terminal to the last Cys constitutes the minimal unit required for specific binding (Kass ~ 1.2 × 10⁸ M⁻¹) (Omichinski et al., 1993b). In order to understand the mechanism of specific DNA recognition by GATA-1, we set out to solve the solution structure of the specific complex of a 66-residue fragment (residues 158-223) comprising the DNA binding domain of chicken GATA-1 (cGATA-1) with a 16-bp oligonucleotide containing the target sequence AGATAA, by means of multidimensional heteronuclear filtered and separated NMR spectroscopy (see Kinemage 3; Omichinski et al., 1993a).

The structure calculations were based on a total of 1,772 experimental NMR restraints, including 117 intermolecular interproton distance restraints between the protein and the DNA. A stereo view of a best-fit superposition of 30 calculated structures (residues 2-59 of the protein and base pairs 6-13 of the DNA) is shown in Figure 10. The N- (residue 1) and C- (residues 60-66) termini of the protein are disordered. Base pairs 6-13 of the DNA are in contact with the cGATA-1 DNA binding domain and are well defined both locally and globally. The orientation, however, of the first 5 and last 3 bp of the DNA, which are not in contact with the protein, is poorly defined with respect to the core of the complex, although the conformation of each of these bases at a local level is reasonably well defined. This is due to the fact that, in addition to their approximate nature, the interproton distance restraints within the DNA are solely sequential.


Fig. 10. Stereo view showing a superposition of the 30 simulated annealing structures of the specific complex of the DNA binding domain of cGATA-1 with DNA calculated on the basis of the experimental NMR data derived from 3D and 4D heteronuclear NMR spectroscopy. The backbone (N, Cα, C) atoms of cGATA-1 are shown in red and all the non-hydrogen atoms of the DNA in blue. The restrained regularized mean structure of the complex is highlighted. The coordinates are from Omichinski et al. (1993a) (PDB accession code 1GAT).

Hence, they are inadequate to ascertain the relative orientation of base pairs separated by more than 5-6 steps with any great degree of precision and accuracy. The global conformation of the central 8 bp, on the other hand, is determined not only by the restraints within the DNA, but more importantly by the large number of intermolecular interproton distance restraints between the protein and DNA. The atomic RMS distribution of the 30 SA structures about the mean coordinate positions for the complex proper (i.e., residues 2-59 of the protein and base pairs 6-13 of the DNA) is 0.70 ± 0.13 Å and 1.13 ± 0.08 Å for protein backbone plus DNA and all protein atoms plus DNA, respectively.
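An atomic RMS distribution about the mean coordinate positions, reported as a mean ± spread over the ensemble, can be sketched as follows. The sketch assumes that the structures have already been best-fit superimposed and list the same atoms in the same order; the function name is hypothetical, and taking the ± value to be a sample standard deviation over the per-structure RMS values is an assumption, not something stated in the text.

```python
import math
import statistics


def rms_about_mean(ensemble):
    """Per-structure RMS deviation from the mean coordinates.

    ensemble: list of structures, each a list of (x, y, z) tuples for the
    same atoms in the same order, assumed to be superimposed beforehand.
    Returns (mean_rms, sample_sd) over the structures.
    """
    n_struct = len(ensemble)
    if n_struct < 2:
        raise ValueError("need at least two structures")
    n_atoms = len(ensemble[0])
    if n_atoms == 0 or any(len(s) != n_atoms for s in ensemble):
        raise ValueError("structures must share a non-empty atom list")
    # Mean position of every atom across the ensemble.
    mean = [
        tuple(sum(s[a][k] for s in ensemble) / n_struct for k in range(3))
        for a in range(n_atoms)
    ]
    # RMS deviation of each structure from the mean coordinates.
    rms_values = []
    for s in ensemble:
        sq = sum(
            (s[a][k] - mean[a][k]) ** 2 for a in range(n_atoms) for k in range(3)
        )
        rms_values.append(math.sqrt(sq / n_atoms))
    return statistics.mean(rms_values), statistics.stdev(rms_values)


if __name__ == "__main__":
    # Toy ensemble: two single-atom "structures" 2 units apart.
    toy = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
    print(rms_about_mean(toy))  # (1.0, 0.0)
```

Restricting the atom list to the well-defined region (here residues 2-59 and base pairs 6-13) before the calculation is what keeps the disordered termini and flanking base pairs from dominating the value.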

The protein can be divided into 2 modules: the protein core, which consists of residues 2-51 and contains the zinc coordination site, and an extended C-terminal tail (residues 52-59).

A schematic ribbon drawing of the core is presented in Figure 11A. The core starts out with a turn (residues 2-5), followed by 2 short irregular antiparallel β-sheets, a helix (residues 28-38), and a long loop (residues 39-51), which includes a helical turn (residues 44-47), as well as an Ω loop (residues 47-51). β-Strands 1 (residues 5-7) and 2 (residues 11-14) form the first β-sheet, and β-strands 3 (residues 18-21) and 4 (residues 24-27) form the second β-sheet.

Part of the core of the cGATA-1 DNA binding domain is structurally similar to that of the N-terminal zinc-containing module of the DNA binding domain of the glucocorticoid receptor (Luisi et al., 1991). Thus, the Cα atoms of 30 residues of these 2 proteins can be superimposed with an RMS difference of only 1.4 Å (Fig. 11B). Apart from the 4 Cys residues that coordinate the zinc atom, only 1 residue (Lys-36 in the cGATA-1 DNA binding domain and Lys-465 in the glucocorticoid receptor) is conserved between the 2 proteins. The structural similarity extends from the N-terminus up to the end of the helix (residues 3-39 of the cGATA-1 DNA binding domain and residues 436-468 of the glucocorticoid receptor), and the Zn-Sγ geometry, as well as the side-chain conformations of the 4 coordinating cysteines, are identical. The loop between strands β2 and β3 has 3 deletions, and the turn between strands β3 and β4 has 1 deletion in the glucocorticoid receptor with respect to cGATA-1. The topology and polypeptide trace following the carboxy end of the helix, however, are entirely different in the 2 proteins. Thus, in the DNA binding domain of the glucocorticoid receptor there is a second compact zinc-containing module (residues 470-514) made up of 2 strands and 2 helices, whereas in the cGATA-1 DNA binding domain there is a long loop (residues 38-51) and extended strand (residues 52-59).

The overall topology and structural organization of the complex is shown in Figure 12A and B. The conformation of the oligonucleotide is B-type. The helix and the loop connecting strands β2 and β3 (which is located directly beneath the helix) are located in the major groove, whereas the C-terminal tail wraps around the DNA and lies in the minor groove, directly opposite the helix. The overall appearance is analogous to that of a right hand holding a rope, with the rope representing the DNA, the palm and fingers of the hand the core of the protein, and the thumb the C-terminal tail. It is this pincer-like configuration of the protein that causes a small 10° kink in the DNA. The long axis of the helix lies at an angle of ~40° to the base planes of the DNA (Fig. 12A), whereas the C-terminal tail is approximately parallel to the base planes (Fig. 12B).

Views of side-chain contacts with the DNA in the major and minor grooves are shown in Figure 12C and D, respectively, and a schematic representation of all the contacts is provided in Figure 13. The cGATA-1 DNA binding domain makes specific contacts with 8 bases, 7 in the major groove (A6, G7, A8, T25, A24, T23, and T22) and 1 in the minor groove (T9). All the base contacts in the major groove involve the helix and the loop connecting β-strands 2 and 3. In contrast to other DNA binding proteins, the majority of base contacts involve hydrophobic interactions. Thus, Leu-17 interacts with A6, G7, and T25, Thr-16 with A24 and T25, Leu-33 with A24 and T23, and Leu-37 with T23 and T22. This accounts for the predominance of thymidines in the DNA target site. Indeed, there are only 3 hydrogen bonding interactions: namely, between the side chain of Asn-29 and the N6 atoms of A24 and A8 in the major groove, and between the NζH3+ of Lys-57 and the O2 atom of T9 in the minor groove. In this regard, it is interesting to note that there is a reduction of 1,127 Å² in the surface-accessible area of the cGATA-1 DNA binding domain in the presence of DNA (corresponding to a 20% decrease in the accessible surface), and a decrease in the calculated solvation free energy of folding (Eisenberg & McLachlan, 1986) of 13 kcal·mol⁻¹. This latter effect can clearly make a sizeable contribution to the specific binding constant (Kass ~ 1.2 × 10⁸ M⁻¹).

Fig. 11. A: Schematic ribbon drawing of the core of the cGATA-1 DNA binding domain. B: Superposition of the Cα atoms of the cGATA-1 (green) and glucocorticoid receptor (red) DNA binding domains. The zinc and coordinating cysteines are shown in yellow for cGATA-1 and in purple for the glucocorticoid receptor; the residues are labeled according to the numbering in cGATA-1. The alignment of cGATA-1 with the glucocorticoid receptor is as follows: residues 3-13, 18-21, 25-39, and 46 of cGATA-1 are superimposed on residues 436-446, 448-451, 454-468, and 490, respectively, of the glucocorticoid receptor with a Cα atomic RMS deviation of 1.4 Å. The diagram in A was made with the program MOLSCRIPT (Kraulis, 1991). The coordinates of the glucocorticoid receptor DNA binding domain shown in B are taken from Luisi et al. (1991). The cGATA-1 coordinates are from Omichinski et al. (1993a) (PDB accession code 1GAT).

The remaining contacts involve the sugar-phosphate backbone, the majority of which are located on the second strand (that is, G20 to T27). Salt bridges and/or hydrogen bonds with the phosphates of G7, A24, and T22 are made by Arg-19, Arg-47, and His-38, respectively, in the major groove, and with the phosphates of C13, T25, C26, and T27 by Arg-54, Thr-53, Arg-56, and Ser-59, respectively, in the minor groove. The interactions of Arg-54 and Arg-56 above and below the polypeptide chain span the full length of the target site and are probably responsible for the bending of the DNA in the direction of the minor groove. Likewise, all the sugar contacts involve the second strand. In the major groove they are hydrophobic in nature and involve contacts between the sugars of T22, T23, and A24 with Tyr-34, Leu-33, and Ala-30, and Ile-51 and Thr-16, respectively. In the minor groove, hydrophobic sugar DNA-protein interactions are made by C13 with the aliphatic portion of the side chain of Arg-54, T23 and A24 with Gln-52, T25 and C26 with the aliphatic portion of the side chain of Arg-56, and C26 with Ser-59. In addition, there is a hydrogen bond between the side-chain amide of Gln-52 and the sugar O3′ atom of T23.

The mode of specific DNA binding that is revealed in this structure is distinct from that observed for the other 3 classes of zinc-containing DNA binding domains whose structures have previously been solved (Luisi et al., 1991; Pavletich & Pabo, 1991, 1993; Marmorstein et al., 1992; Fairall et al., 1993; Schwabe et al., 1993). Features specific to the complex with the DNA binding domain of cGATA-1 include the relatively small size of the DNA target site (8 base pairs, of which only a contiguous stretch of 6 is involved in specific contacts), the monomeric nature of the complex in which only a single zinc-binding module is required for specific binding, the predominance of hydrophobic interactions involved in specific base contacts in the major groove, the presence of a basic C-terminal tail that interacts with the DNA in the minor groove and constitutes a key component of specificity, and finally the pincer-like nature of the complex in which the core and tail subdomains are opposed and surround the DNA just like a hand gripping a rope. The structure of the cGATA-1 DNA binding domain reveals a modular design. The fold of residues 3-39 is similar to that of the N-terminal zinc binding module of the DNA binding domain of the glucocorticoid receptor, although, with the exception of the 4 Cys residues that coordinate zinc, there is no significant sequence identity between these regions of the 2 proteins. Residues 40-66 are part of a separate structural motif. In this regard it is interesting to note that, in addition to both zinc-binding modules being encoded on separate exons in the


Fig. 12. A, B: Schematic ribbon drawings illustrating the interactions of cGATA-1 with DNA. C, D: Side-chain interactions between cGATA-1 and the DNA in the major and minor grooves, respectively. The protein backbone is shown in green and the protein side chains in yellow; the color code for the DNA bases is as follows: red for A, lilac for T, dark blue for G, and light blue for C. The diagrams were made using the program VISP (de Castro & Edelstein, 1992). The coordinates of the cGATA-1-DNA complex are from Omichinski et al. (1993a) (PDB accession code 1GAT).

cGATA-1 gene (exons 4 and 5), the next intron/exon boundary lies between amino acids 39 and 40 (current numbering scheme) of the DNA binding domain, thereby separating the C-terminal zinc-binding domain from the basic tail (Hannon et al., 1991).

Concluding remarks

From the examples presented in this review it should be clear that the recent development of a whole range of highly sensitive multidimensional heteronuclear-edited and -filtered NMR experiments has revolutionized the field of protein structure determination by NMR. Proteins and protein complexes in the 15-25-kDa range are now amenable to detailed structure analysis in solution. Moreover, the potential of the current methods can probably be extended to systems even up to 40 kDa, provided that they are very well behaved from an NMR perspective. Nevertheless, despite these advances, it should always be borne in mind that there are a number of key requirements that have to be satisfied to permit a successful structure determination of larger proteins and protein complexes by NMR. The protein in hand must be soluble and should not aggregate up to concentrations of about 1 mM, it must be stable at room temperature or slightly higher for many weeks, it should not exhibit significant conformational heterogeneity that could result in extensive line broadening, and finally it must be amenable to uniform 15N and 13C labeling. At the present time there are only a few examples in the literature of proteins in the 15-25-kDa range that have been solved by multidimensional heteronuclear NMR spectroscopy. In addition to the 3 examples presented here, only the structures of interleukin-4 (Powers et al., 1992, 1993; Smith et al., 1992), glucose permease IIA (Fairbrother et al., 1992), and the complex of cyclophilin with cyclosporin (Theriault et al., 1993) have been published. It is hoped that over the next few years, the widespread use of these multidimensional heteronuclear experiments coupled with semi-automated assignment procedures will result in many more NMR structures of such larger proteins and protein-ligand complexes.

Acknowledgments

We thank our many colleagues, past and present, who have contributed to the work carried out in our laboratory. Above all, we thank Ad Bax,


with whom we have shared numerous stimulating discussions, fruitful experiments, and a continuous and most enjoyable collaboration in the best of scientific spirits. This work was supported in part by the AIDS Targeted Anti-Viral Program of the Office of the Director of the National Institutes of Health.

References

Babu YS, Sack JS, Greenhough TJ, Bugg CE, Means AR, Cook WJ. 1985. Three dimensional structure of calmodulin. Nature 315:37-40.

Barbato G, Ikura M, Kay L, Pastor RW, Bax A. 1992. Backbone dynamics of calmodulin studied by 15N relaxation using inverse detected two-dimensional NMR spectroscopy: The central helix is flexible. Biochemistry 31:5269-5278.

Bax A, Grzesiek S. 1993. Methodological advances in protein NMR. Acc Chem Res 26:131-138.

Bax A, Pochapsky SS. 1992. Optimized recording of heteronuclear multidimensional NMR spectra using pulse field gradients. J Magn Reson 99:638-643.

Bennett MK, Kennedy MB. 1987. Deduced primary structure of the β subunit of brain type II Ca2+/calmodulin dependent protein kinase determined by molecular cloning. Proc Natl Acad Sci USA 84:1794-1798.

Blumenthal DK, Krebs EG. 1987. Preparation and properties of the calmodulin binding domain of skeletal muscle myosin light chain kinase. Methods Enzymol 139:115-126.

Braun W. 1987. Distance geometry and related methods for protein structure determination from NMR data. Q Rev Biophys 19:115-157.

Brünger AT. 1992. The free R value: A novel statistical quantity for assessing the accuracy of crystal structures. Nature (Lond) 355:472-474.

Charbonneau H, Kumar S, Novack JP, Blumenthal DK, Griffin PR, Shabanowitz J, Hunt DF, Beavo JA, Walsh KA. 1991. Evidence for domain organization within the 61-kDa calmodulin-dependent cyclic nucleotide phosphodiesterase from bovine brain. Biochemistry 30:7931-7940.

Clore GM, Bax A, Driscoll PC, Wingfield PT, Gronenborn AM. 1990a. Assignment of the side chain 1H and 13C resonances of interleukin-1β using double and triple resonance heteronuclear three-dimensional NMR spectroscopy. Biochemistry 29:8172-8184.

Clore GM, Bax A, Wingfield PT, Gronenborn AM. 1990b. Identification and localization of bound internal water in the solution structure of interleukin-1β by heteronuclear three-dimensional 1H rotating frame Overhauser 15N-1H multiple quantum coherence NMR spectroscopy. Biochemistry 29:5671-5676.

Clore GM, Driscoll PC, Wingfield PT, Gronenborn AM. 1990c. Low resolution structure of interleukin-1β in solution derived from 1H-15N heteronuclear three-dimensional NMR spectroscopy. J Mol Biol 214:811-817.

Clore GM, Gronenborn AM. 1987. Determination of three-dimensional structures of proteins in solution by nuclear magnetic resonance spectroscopy. Protein Eng 1:275-288.

Clore GM, Gronenborn AM. 1989. Determination of three-dimensional structures of proteins and nucleic acids in solution by nuclear magnetic resonance spectroscopy. CRC Crit Rev Biochem Mol Biol 24:479-564.

Clore GM, Gronenborn AM. 1991a. Structures of larger proteins in solution: Three- and four-dimensional heteronuclear NMR spectroscopy. Science 252:1390-1399.


Clore GM, Gronenborn AM. 1991b. Comparison of the solution nuclear magnetic resonance and X-ray crystal structures of human recombinant interleukin-1β. J Mol Biol 221:47-53.

Clore GM, Gronenborn AM. 1991c. Applications of three- and four-dimensional heteronuclear NMR spectroscopy to protein structure determination. Progr Nucl Magn Reson Spectrosc 23:43-92.

Clore GM, Gronenborn AM. 1991d. Two, three and four dimensional NMR methods for obtaining larger and more precise three-dimensional structures of proteins in solution. Annu Rev Biophys Biophys Chem 20:29-63.

Clore GM, Kay LE, Bax A, Gronenborn AM. 1991a. Four dimensional 13C/13C-edited nuclear Overhauser enhancement spectroscopy of a protein in solution: Application to interleukin-1β. Biochemistry 30:12-18.

Clore GM, Robien MA, Gronenborn AM. 1993. Exploring the limits of precision and accuracy of protein structures determined by nuclear magnetic resonance spectroscopy. J Mol Biol 231:82-102.

Clore GM, Wingfield PT, Gronenborn AM. 1991b. High resolution three-dimensional structure of interleukin-1β in solution by three and four dimensional nuclear magnetic resonance spectroscopy. Biochemistry 30:2315-2323.

Cohen P, Klee CB. 1988. Molecular aspects of cellular recognition, vol 5. New York: Elsevier.

Cox JA, Comte M, Fitton JE, DeGrado WF. 1985. The interaction of calmodulin with amphiphilic peptides. J Biol Chem 260:2527-2534.

Dasgupta M, Honeycutt T, Blumenthal DK. 1989. The γ-subunit of skeletal muscle phosphorylase kinase contains two noncontiguous domains that act in concert to bind calmodulin. J Biol Chem 264:17156-17163.

de Castro E, Edelstein S. 1992. VISP 1.0 user's guide. Geneva: University of Geneva.

Driscoll PC, Clore GM, Marion D, Wingfield PT, Gronenborn AM. 1990a. Complete resonance assignment for the polypeptide backbone of interleukin-1β using three-dimensional heteronuclear NMR spectroscopy. Biochemistry 29:3542-3556.

Driscoll PC, Gronenborn AM, Wingfield PT, Clore GM. 1990b. Determination of the secondary structure and molecular topology of interleukin-1β using two- and three-dimensional heteronuclear 15N-1H NMR spectroscopy. Biochemistry 29:4468-4482.

Dyson HJ, Gippert GP, Case DA, Holmgren A, Wright PE. 1990. Three-dimensional solution structure of the reduced form of Escherichia coli thioredoxin determined by nuclear magnetic resonance spectroscopy. Biochemistry 29:4129-4136.

Eisenberg D, McLachlan AD. 1986. Solvation energy in protein folding and binding. Nature 319:199-203.

Ernst RR, Bodenhausen G, Wokaun A. 1987. Principles of nuclear magnetic resonance in one and two dimensions. Oxford, UK: Clarendon Press.

Evans T, Felsenfeld G. 1989. The erythroid-specific transcription factor Eryf1: A new finger protein. Cell 58:877-885.

Fairall L, Schwabe JWR, Chapman L, Finch JT, Rhodes D. 1993. The crystal structure of a two zinc-finger peptide reveals an extension to the rules for zinc-finger/DNA recognition. Nature 366:483-487.

Fairbrother WJ, Gippert GP, Reizer J, Saier MH, Wright PE. 1992. Low resolution structure of the Bacillus subtilis glucose permease IIA domain derived from heteronuclear three-dimensional NMR spectroscopy. FEBS Lett 296:148-152.

Fesik SW, Zuiderweg ERP. 1988. Heteronuclear three-dimensional NMR spectroscopy: A strategy for the simplification of homonuclear two-dimensional NMR spectra. J Magn Reson 78:588-593.

Fesik SW, Zuiderweg ERP. 1990. Heteronuclear three-dimensional NMR spectroscopy of isotopically labelled biological macromolecules. Q Rev Biophys 23:97-131.

Finzel BC, Clancy LL, Holland DR, Muchmore SW, Watenpaugh KD, Einspahr HM. 1989. Crystal structure of recombinant human interleukin-1β at 2.0 Å resolution. J Mol Biol 209:779-791.

Forman-Kay JD, Clore GM, Wingfield PT, Gronenborn AM. 1991. The high resolution three-dimensional structure of reduced recombinant human thioredoxin in solution. Biochemistry 30:2685-2698.

Guerini D, Klee CB. 1991. Structural diversity of calcineurin Ca2+-calmodulin stimulated phosphatases. Adv Protein Phosphatases 6:391-410.

Güntert P, Braun W, Billeter M, Wüthrich K. 1989. Automated stereospecific 1H assignments and their impact on the precision of protein structure determinations in solution. J Am Chem Soc 111:3997-4004.

Hannon R, Evans T, Felsenfeld G, Gould H. 1991. Structure and promoter activity of the gene for the erythroid transcription factor GATA-1. Proc Natl Acad Sci USA 88:3004-3008.

Havel TF. 1991. An evaluation of computational strategies for use in the determination of protein structure from distance constraints obtained by nuclear magnetic resonance. Progr Biophys Mol Biol 56:43-78.

Havel TF, Kuntz ID, Crippen GM. 1983. Theory and practice of distance geometry. Bull Math Biol 45:665-720.

Havel TF, Wüthrich K. 1985. An evaluation of the combined use of nuclear magnetic resonance and distance geometry for the determination of protein conformation in solution. J Mol Biol 182:281-294.

Heidorn DB, Seeger PA, Rokop SE, Blumenthal DK, Means AR, Crespi H, Trewhella J. 1989. Changes in the structure of calmodulin induced by a peptide based on the calmodulin-binding domain of myosin light chain kinase. Biochemistry 28:6757-6764.

Hurd RE, John BK. 1991. Gradient-enhanced proton-detected heteronuclear multiple-quantum coherence spectroscopy. J Magn Reson 91:648-653.

Hyberts SG, Märki W, Wagner G. 1987. Stereospecific assignment of side chain protons and characterization of torsion angles in eglin c. Eur J Biochem 164:625-635.

Ikura M, Bax A. 1992. Isotope filtered 2D NMR of a protein-peptide complex: Study of a skeletal muscle myosin light chain kinase fragment bound to calmodulin. J Am Chem Soc 114:2433-2440.

Ikura M, Clore GM, Gronenborn AM, Zhu G, Klee CB, Bax A. 1992. Solution structure of a calmodulin-target peptide complex by multidimensional NMR. Science 256:632-638.

Ikura M, Kay LE, Bax A. 1990. A novel approach for sequential assignment of 1H, 13C and 15N spectra of larger proteins: Heteronuclear triple resonance NMR spectroscopy. Application to calmodulin. Biochemistry 29:4659-4667.

Ikura M, Kay LE, Krinks M, Bax A. 1991. Triple-resonance multidimensional NMR study of calmodulin complexed with the binding domain of skeletal muscle myosin light-chain kinase: Indication of a conformational change in the central helix. Biochemistry 30:5498-5504.

Kataoka M, Head JF, Persechini A, Kretsinger RH, Engelman DM. 1991a. Small-angle X-ray scattering studies of calmodulin mutants with deletions in the linker region of the central helix indicate that the linker region retains predominantly α-helical conformation. Biochemistry 30:1188-1192.

Kataoka M, Head JF, Vorherr T, Krebs J, Carafoli E. 1991b. Small-angle X-ray scattering study of calmodulin bound to two peptides corresponding to parts of the calmodulin-binding domain of the plasma membrane Ca2+ pump. Biochemistry 30:6247-6251.

Kay LE, Clore GM, Bax A, Gronenborn AM. 1990. Four-dimensional heteronuclear triple resonance NMR spectroscopy of interleukin-1β in solution. Science 249:411-414.

Kraulis PJ. 1991. MOLSCRIPT: A program to produce both detailed and schematic plots of protein structures. J Appl Crystallogr 24:946-950.

Kwiatkowski AP, King MM. 1989. Autophosphorylation of the type II calmodulin dependent protein kinase is essential for the formation of a proteolytic fragment with catalytic activity: Implications for long-term synaptic potentiation. Biochemistry 28:5380-5385.

Luisi BF, Xu WX, Otwinowski Z, Freedman LP, Yamamoto KR, Sigler PB. 1991. Crystallographic analysis of the interaction of the glucocorticoid receptor with DNA. Nature 352:497-505.

Lukas TJ, Burgess WH, Prendergast FG, Lau W, Watterson DM. 1986. Calmodulin binding domains: Characterization of a phosphorylation and calmodulin binding site from myosin light chain kinase. Biochemistry 25:1458-1464.

Marmorstein R, Carey M, Ptashne M, Harrison SC. 1992. DNA recognition by GAL4: Structure of a protein-DNA complex. Nature 356:408-414.

Marion D, Driscoll PC, Kay LE, Wingfield PT, Bax A, Gronenborn AM, Clore GM. 1989. Overcoming the overlap problem in the assignment of 1H-NMR spectra of larger proteins using three-dimensional heteronuclear 1H-15N Hartmann-Hahn and nuclear Overhauser-multiple quantum coherence spectroscopy. Application to interleukin-1β. Biochemistry 28:6150-6156.

Martin D, Orkin S. 1986. Transcriptional activation and DNA binding by the erythroid factor GF-1/NF-El/Eryf 1. Genes Dev 4:1886-1989.

Montelione GT, Wagner G. 1989. Accurate measurement of homonuclear HN-Hα coupling constants in polypeptides using heteronuclear 2D NMR experiments. J Am Chem Soc 111:5474-5475.

Montelione GT, Wagner G. 1990. Conformation independent sequential NMR connections in isotope-enriched polypeptides by 1H-13C-15N triple resonance experiments. J Magn Reson 87:183-188.

Nilges M, Clore GM, Gronenborn AM. 1988. Determination of three-dimensional structures of proteins from interproton distance data by hybrid distance geometry-dynamical simulated annealing calculations. FEBS Lett 229:317-324.

Nilges M, Clore GM, Gronenborn AM. 1990. 1H-NMR stereospecific assignments by conformational database searches. Biopolymers 29:813-822.

Omichinski JG, Clore GM, Schaad O, Felsenfeld G, Trainor C, Appella E, Stahl SJ, Gronenborn AM. 1993a. NMR structure of a specific DNA complex of the Zn-containing DNA binding domain of GATA-1. Science 261:438-446.

Omichinski JG, Trainor C, Evans T, Gronenborn AM, Clore GM, Felsenfeld G. 1993b. A small single-'finger' peptide from the erythroid factor GATA-1 binds specifically to DNA as a zinc or iron complex. Proc Natl Acad Sci USA 90:1676-1680.

O'Neil KT, DeGrado WF. 1990. How calmodulin binds its targets: Sequence independent recognition of amphiphilic α-helices. Trends Biochem Sci 15:59-64.

O'Neil KT, Erickson-Viitanen S, DeGrado WF. 1989. Photolabeling of calmodulin with basic, amphiphilic α-helical peptides containing p-benzoylphenylalanine. J Biol Chem 264:14571-14578.

Orkin SH. 1992. GATA-binding transcription factors in hematopoietic cells. Blood 80:575-581.

Oschkinat H, Griesinger C, Kraulis PJ, Sørensen OW, Ernst RR, Gronenborn AM, Clore GM. 1988. Three-dimensional NMR spectroscopy of a protein in solution. Nature (Lond) 332:374-376.

Pavletich NP, Pabo CO. 1991. Zinc finger-DNA recognition: Crystal structure of a Zif268-DNA complex at 2.1 Å. Science 252:809-817.

Pavletich NP, Pabo CO. 1993. Crystal structure of a five-finger GLI-DNA complex: New perspectives on zinc fingers. Science 261:1701-1707.

Persechini A, Blumenthal DK, Jarrett HW, Klee CB, Hardy DO, Kretsinger RH. 1989. The effects of deletions in the central helix of calmodulin on enzyme activation and peptide binding. J Biol Chem 264:8052-8058.

Persechini A, Kretsinger RH. 1988. The central helix of calmodulin functions as a flexible tether. J Biol Chem 263:12175-12178.

Powers R, Garrett DS, March CJ, Frieden EA, Gronenborn AM, Clore GM. 1992. Three-dimensional structure of interleukin-4 by multi-dimensional heteronuclear magnetic resonance spectroscopy. Science 256:1673-1677.

Powers R, Garrett DS, March CJ, Frieden EA, Gronenborn AM, Clore GM. 1993. The high resolution three-dimensional solution structure of human interleukin-4 determined by multi-dimensional heteronuclear magnetic resonance spectroscopy. Biochemistry 32:6744-6762.

Priestle JP, Schär HP, Grütter MG. 1989. Crystallographic refinement of interleukin-1β at 2.0 Å resolution. Proc Natl Acad Sci USA 86:9667-9671.

Schwabe JWR, Chapman L, Finch JT, Rhodes D. 1993. The crystal structure of the estrogen receptor DNA-binding domain bound to DNA: How receptors discriminate between their response elements. Cell 75:567-568.

Shaanan B, Gronenborn AM, Cohen GH, Gilliland GL, Veerapandian B, Davies DR, Clore GM. 1992. Combining experimental information from crystal and solution studies: Joint X-ray and NMR refinement. Science 257:961-964.

Smith LJ, Redfield C, Boyd J, Lawrence GMP, Edwards RG, Smith RAG, Dobson CM. 1992. Human interleukin-4: The solution structure of a four-helix bundle protein. J Mol Biol 224:899-904.

Spera S, Ikura M, Bax A. 1991. Measurement of the exchange rates of rapidly exchanging amide protons: Application to the study of calmodulin and its complex with a myosin light chain kinase fragment. J Biomol NMR 1:155-165.

Theriault Y, Logan TM, Meadows R, Yu L, Olejniczak ET, Holzman T, Simmer RL, Fesik SW. 1993. Solution structure of the cyclosporin A/cyclophilin complex by NMR. Nature 361:88-91.

Trewhella J, Blumenthal DK, Rokop SE, Seeger PA. 1990. Small-angle scattering studies show distinct conformations of calmodulin in its complex with two peptides based on the regulatory domain of the catalytic subunit of phosphorylase kinase. Biochemistry 29:9316-9324.

Veerapandian B, Gilliland GL, Raag R, Svensson AL, Masui Y, Hirai Y, Poulos TL. 1992. Functional implications of interleukin-1β based on the three dimensional structure. Proteins Struct Funct Genet 12:10-24.

Vuister GW, Boelens R, Kaptein R, Hurd RE, John BK, van Zijl PCM. 1991. Gradient enhanced HMQC and HSQC spectroscopy: Applications to 15N-labeled Mnt repressor. J Am Chem Soc 113:9688-9690.

Vuister GW, Clore GM, Gronenborn AM, Powers R, Garrett DS, Tschudin R, Bax A. 1993. Increased resolution and improved quality in four-dimensional 13C/13C separated HMQC-NOE-HMQC spectra using pulse field gradients. J Magn Reson Ser B 101:210-213.

Vuister GW, Grzesiek S, Delaglio F, Wang AC, Tschudin R, Zhu G, Bax A. 1994. Measurement of homo- and heteronuclear J couplings from quantitative J correlation. Methods Enzymol. Forthcoming.

Wagner G, Braun W, Havel TF, Schauman T, Go N, Wüthrich K. 1987. Protein structures in solution by nuclear magnetic resonance and distance geometry: The polypeptide fold of the basic pancreatic trypsin inhibitor determined using two different algorithms: DISGEO and DISMAN. J Mol Biol 196:611-639.

Wüthrich K. 1986. NMR of proteins and nucleic acids. New York: Wiley.

Zuiderweg ERP, Boelens R, Kaptein R. 1985. Stereospecific assignment of 1H-NMR methyl lines and conformation of valyl residues in the lac repressor headpiece. Biopolymers 24:601-611.

Zuiderweg ERP, Petros AM, Fesik SW, Olejniczak ET. 1991. Four-dimensional [13C, 1H, 13C, 1H] HMQC-NOE-HMQC NMR spectroscopy: Resolving tertiary NOE distance restraints in spectra of larger proteins. J Am Chem Soc 113:370-372.