4 experimental protein structure determination 2014 · protein structure determination using nmr...
Programme 8.00-8.15 Good morning and quiz results 8.15-8.20 Break 8.20-8.50 Experimental structure determination – part I 8.50-9.00 Break
9.00-9.30 Experimental structure determination – part II 9.30-9.45 Break 9.45-11.00 Exercise: Reading protein structure papers 11.00-11.30 Summary & Discussion 11.30-12.00 Quiz!
Feedback Persons
http://www.bio-evaluering.dk
Experimental Protein Structure Determination
X-ray crystallography &
NMR Spectroscopy
Learning Objectives
After today you should be able to:
• Give an outline of the most important steps (and obstacles) in protein structure determination by X-ray crystallography and NMR spectroscopy.
• Identify relevant parameters for evaluating the quality of protein structures determined by X-ray crystallography and NMR spectroscopy.
Part I
X-ray crystallography
Outline
• Protein X-ray Crystallography
– X-rays and diffraction
– Crystallisation
– The phase problem and solutions
– Model refinement
– Key parameters
– Bottlenecks!
Growing Crystals
• Hanging drop technique (vapour diffusion)
• Protein drop slowly equilibrates with reservoir solution leading to slow supersaturation and precipitation of the protein.
X-Ray Sources
• Synchrotron
ESRF
Grenoble, France
X-rays
Fourier transform
Recording Data
• The Diffraction Experiment
Object and Diffraction Image
• Diffraction is a Fourier transform of the object (phase colouring)
Diffraction Image
Figure labels: individual reflections, resolution rings
Structure factor:
$$F(hkl) = \sum_{j=1}^{N} f_j \exp\left[2\pi i\,(hx_j + ky_j + lz_j)\right]$$
Reflection(s):
$$I_{hkl} \propto |F_{obs}(hkl)|^2$$
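The structure factor sum can be tried out numerically. The sketch below is only an illustration: the two atoms, their scattering factors and their fractional coordinates are made-up values, and the function name `structure_factor` is hypothetical. It evaluates the sum of f_j exp[2πi(hx_j + ky_j + lz_j)] over the atoms and separates the result into the amplitude (recoverable from the measured intensity) and the phase (not recorded).

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """Sum f_j * exp[2*pi*i*(h*x_j + k*y_j + l*z_j)] over all atoms.

    atoms: list of (f_j, x_j, y_j, z_j) with fractional coordinates.
    """
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, x, y, z in atoms)

# Hypothetical two-atom "structure"; scattering factors are made-up constants.
atoms = [(6.0, 0.10, 0.20, 0.30), (8.0, 0.40, 0.15, 0.75)]

F = structure_factor((1, 2, 0), atoms)
amplitude = abs(F)          # obtainable from the measured intensity
phase = cmath.phase(F)      # not recorded in the diffraction experiment
intensity = amplitude ** 2  # proportional to what the detector records
```

Shifting every atom by the same translation changes `phase` but leaves `intensity` unchanged, which is one way to see why intensities alone do not fix the structure.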
The Phase Problem
• Loss of phase information when recording data.
• For crystals
Diffraction experiment
$$I_{hkl} \propto |F_{obs}(hkl)|^2$$
Phase Information
• Our Fourier friends • Phases switched
From Data to Structure
• Phasing methods
– MIR: multiple isomorphous replacement
• Depends on the scattering contribution from a few heavy atoms; native and derivative crystals must be isomorphous (same unit cell and packing)
– MAD (SAD): multiple-wavelength (single-wavelength) anomalous diffraction
• Depends on anomalous scattering at wavelengths above and below an absorption edge
• Analogous to MIR
• Molecular replacement
– “Stealing” the phases from a (very) similar known structure to phase the data of a homologous structure.
The Importance of Resolution
[Figure: panels at 1 Å, 2 Å, 3 Å and 4 Å, ranging from high (1 Å) to low (4 Å) resolution]
Phase Information
Missing Data
• Data completeness • Simulated oscillation data with crystal decay
Crystal Contents vs. Structure • Lattice and unit cell • Unit cell contents
• Repeated in all three dimensions by translation
Crystal Symmetry
• Asymmetric unit
– Generates the unit cell by symmetry.
– Usually one or more molecules/subunits.
• Biological unit
– Biomolecule entry in the PDB file.
– May be more than, less than or the same as the asymmetric unit.
• Symmetry and symbols
– Examples: P2₁, P4₃2₁2, C222
– Numbers refer to rotation axes; subscripts mark screw axes.
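How an asymmetric unit generates the unit cell can be sketched for the simplest example above, space group P2₁. In the standard setting (unique axis b) its two equivalent positions are (x, y, z) and (-x, y + 1/2, -z). The atom coordinates and the function name `p21_mates` below are hypothetical and only for illustration.

```python
def p21_mates(xyz):
    """Return the two symmetry-equivalent positions in space group P2_1
    (unique axis b): (x, y, z) and (-x, y + 1/2, -z), wrapped into the cell."""
    x, y, z = xyz

    def wrap(v):
        # Bring a fractional coordinate back into the range [0, 1).
        return v % 1.0

    return [(wrap(x), wrap(y), wrap(z)),
            (wrap(-x), wrap(y + 0.5), wrap(-z))]

# Hypothetical asymmetric unit with two atoms (fractional coordinates).
asymmetric_unit = [(0.10, 0.20, 0.30), (0.25, 0.40, 0.05)]

# Applying the symmetry operators to every atom fills the unit cell.
unit_cell = [pos for atom in asymmetric_unit for pos in p21_mates(atom)]
```

The whole crystal then follows by repeating `unit_cell` through lattice translations in all three dimensions.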
Asymmetric Unit vs. Crystal
Non-Crystallographic Symmetry
• NCS arises when the symmetry of multimers does NOT coincide with crystal symmetry.
• Useful in refinement of low-resolution structures
– Constraint (=): NCS copies are kept identical
– Restraint (~): NCS copies are kept similar
Solvent in Protein Structures
Two levels of solvent description:
• Bulk solvent in solvent channels
– Implicit description
– Not part of the final model, but part of the data processing/handling
• Crystallographic solvent
– Visible in electron density maps
• Found on the surface and especially inside proteins.
• Integral part of protein structure!
– Explicit atomic description
B-factors
• Measure of atomic vibrations
$$B = 8\pi^2 \langle u^2 \rangle$$
where ⟨u²⟩ is the mean square displacement
• Also contains contributions from:
– Crystal defects/decay
– Data incompleteness
– Multiple conformations
– Model/refinement errors
– Occupancy
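To get a feeling for the numbers, the relation B = 8π²⟨u²⟩ can be inverted to give the root-mean-square displacement for a given B-factor. This is a minimal sketch; the function names are hypothetical, and it treats B purely as vibration, ignoring the other contributions listed above.

```python
import math

def b_factor(mean_square_displacement):
    """B (in Å^2) from the mean square displacement <u^2> (in Å^2)."""
    return 8 * math.pi ** 2 * mean_square_displacement

def rms_displacement(b):
    """Root-mean-square displacement (in Å) implied by a B-factor (in Å^2)."""
    return math.sqrt(b / (8 * math.pi ** 2))

# A B-factor of 20 Å^2 corresponds to an RMS displacement of roughly 0.5 Å.
u_rms = rms_displacement(20.0)
```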
Model Refinement
• Model refinement consists of several rounds of simultaneous
– Positional refinement
– B-factor refinement
– Addition of water molecules
– Assignment of multiple conformations
• Goal: To minimize the difference between observed data and model.
• Measure: R-factor
Model vs. Data Agreement
• R-factor:
$$R = \frac{\sum \left|\,|F_{obs}| - |F_{calc}|\,\right|}{\sum |F_{obs}|}$$
• Rfree:
– Unbiased measure.
– Calculated on the 5-10% of data not included in refinement.
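The R-factor is a simple ratio of sums over reflections: the summed absolute differences between observed and calculated structure factor amplitudes, divided by the summed observed amplitudes. The sketch below uses made-up amplitudes and a hypothetical function name. Rfree is the same calculation applied only to the reflections that were set aside before refinement.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over matching reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Made-up amplitudes for reflections used in refinement ("work" set) ...
work_obs, work_calc = [100.0, 50.0], [90.0, 55.0]
# ... and for reflections excluded from refinement ("free" set).
free_obs, free_calc = [80.0, 20.0], [70.0, 25.0]

r_work = r_factor(work_obs, work_calc)
r_free = r_factor(free_obs, free_calc)
```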
Key Parameters
• Resolution
• R values
– Agreement between data and model.
– Usually between 0.15 and 0.25, should not exceed 0.30.
• Rule of thumb: R ~ Resolution (in Å) / 10
• R + 0.05 > Rfree > R.
• Ramachandran plot
• B factors
– Contributions from static and dynamic disorder
• Well determined ~10-20 Å², intermediate ~20-30 Å², flexible 30-50 Å², invisible >60 Å².
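The rules of thumb for R values can be collected into a small checklist when reading a structure paper. The sketch below is only that: the function name is hypothetical, and the 0.05 slack allowed around "R ~ Resolution / 10" is an assumption added here, not a value from the slides.

```python
def check_xray_quality(resolution, r_work, r_free):
    """Return a list of warnings based on the rules of thumb above.

    resolution is in Å; r_work and r_free are fractions (e.g. 0.20).
    """
    warnings = []
    if r_work > 0.30:
        warnings.append("R exceeds 0.30")
    if not (r_work < r_free < r_work + 0.05):
        warnings.append("Rfree is not between R and R + 0.05")
    # Assumed slack of 0.05 around the "R ~ Resolution / 10" rule of thumb.
    if r_work > resolution / 10 + 0.05:
        warnings.append("R is well above Resolution / 10")
    return warnings

good = check_xray_quality(resolution=2.0, r_work=0.20, r_free=0.23)
suspicious = check_xray_quality(resolution=2.0, r_work=0.32, r_free=0.40)
```

An empty list means none of these particular rules is broken; it does not replace looking at the Ramachandran plot and the B factors.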
Bottlenecks
• Getting the protein in sufficient quantity and purity
• Crystallisation (trial and error)
• Diffraction/resolution limit
Part II
NMR Spectroscopy
Protein Structure Determination Using
NMR spectroscopy
Learning Objectives & Outline
After this lesson you should be able to:
• Analyze an NMR structure, evaluate the overall quality and find well-defined and less well-defined regions of the structure
Outline
• A short introduction to the technique
• What is NMR used for?
• From protein to spectra to structure
• A comparison to X-ray crystallography
• How to evaluate NMR structures?
NMR Basics
• NMR is nuclear magnetic resonance
• NMR spectroscopy is done on proteins IN SOLUTION
• In protein NMR, mainly the spin-1/2 nuclei 1H, 13C, 15N (and 31P) are detected
• Proteins up to 30 kDa
• Proteins stable at high concentration (0.5-1 mM), preferably at room temperature
What Is NMR Used For ?
• Structural information
• Studies of protein folding
• Mapping interaction sites
• Dynamics of proteins in solution
NMR Structural Genomics Projects
• Explore structure space
• Help drug-design
• Facilitate protein structure prediction
• Development of newer more efficient and less time-consuming NMR techniques
Example: Structural genomics of the SARS-virus genome
1YSY, Peti et al, JVI, 2005
The Basics of Data Collection
• In a magnetic field the nuclei align like small magnets
• The nuclei are excited by radio frequency (RF) pulses
• When the RF pulse stops, the nuclei relax by (re-)emitting RF radiation
• The measured emitted RF radiation is the NMR data
• It is Fourier-transformed into a spectrum
[Figure: 1D spectrum, axis: 1H (ppm)]
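The last step, turning the measured time signal into a spectrum, can be sketched with a plain discrete Fourier transform. Everything numerical below is made up for illustration: a single nucleus emitting at a 125 Hz offset, a decay time of 0.05 s, and 128 points sampled every millisecond. Real processing software uses fast Fourier transforms and many additional steps.

```python
import cmath
import math

def dft(signal):
    """Plain discrete Fourier transform (O(n^2), fine for a small example)."""
    n = len(signal)
    return [sum(s * cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(signal))
            for k in range(n)]

n_points = 128
dwell = 0.001      # seconds between samples (assumed)
frequency = 125.0  # Hz, offset of the single simulated nucleus (assumed)
decay = 0.05       # seconds, relaxation time constant (assumed)

# Simulated emitted signal: an oscillation that decays as the nuclei relax.
fid = [cmath.exp(2j * math.pi * frequency * t * dwell) * math.exp(-t * dwell / decay)
       for t in range(n_points)]

spectrum = [abs(value) for value in dft(fid)]
peak_bin = max(range(n_points), key=spectrum.__getitem__)
peak_hz = peak_bin / (n_points * dwell)  # position of the peak in Hz
```

The spectrum shows one peak at the frequency of the simulated nucleus; a protein sample gives one such peak per distinguishable nucleus.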
NMR Spectrometers
The resolution of the spectra depends on the strength of the magnetic field.
Spectrometers operating at 500-900 MHz (1H frequency) are used for protein NMR spectroscopy.
How Can an NMR Spectrum Give Information About the Structure?
• The signal is seen as “peaks” in spectra.
• Most hydrogen atoms in a folded protein resonate at a unique frequency, reported as the chemical shift and measured in ppm.
[Figure: 1H,15N-HSQC spectrum, axes: 1H (ppm) and 15N (ppm)]
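The ppm scale expresses a resonance frequency relative to a reference, divided by the spectrometer frequency, so the chemical shift of a nucleus is the same on every spectrometer. A minimal sketch with made-up frequencies (the function name is hypothetical):

```python
def chemical_shift_ppm(frequency_hz, reference_hz, spectrometer_mhz):
    """Chemical shift in ppm: (frequency - reference) / spectrometer frequency.

    Dividing a difference in Hz by a frequency in MHz gives parts per million.
    """
    return (frequency_hz - reference_hz) / spectrometer_mhz

# The same (hypothetical) amide proton measured on two spectrometers:
shift_600 = chemical_shift_ppm(4800.0, 0.0, 600.0)  # 600 MHz instrument
shift_900 = chemical_shift_ppm(7200.0, 0.0, 900.0)  # 900 MHz instrument
```

Both calls give 8.0 ppm, while the separation in Hz between two peaks grows with the field, which is one reason stronger magnets give better resolved spectra.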
Summary of the Technique
• Proteins in solution
• Only 13C, 15N and 1H atoms are observed
• The nuclei are aligned in a magnetic field
• Nuclei are excited using RF radiation
• The emitted RF signal is measured
• The signal is Fourier-transformed into a spectrum
• Chemical environment influences the chemical shift of the nuclei
• High magnetic field strength = high resolution of spectra
NMR Experiments for Structure Determination
• Magnetization transfer between atoms which are chemically bonded (e.g. COSY, 1H,15N-HSQC, HNCACB)
[Figure: magnetization transfer in HNCACB]
• Magnetization transfer through space between nearby atoms (NOESY spectra)
[Figure: magnetization transfer in 1H,15N NOESY]
Distance Constraints from NOESY Spectra
• NOE (Nuclear Overhauser Effect)
• The distance (H-H) is estimated from the intensity (volume) of the NOE peaks in the NMR spectrum and used as a RANGE
• The NOESY quality determines the number of constraints
• The NOESY quality depends on:
– Overlap of peaks in spectra
– Molecular rotation (tumbling) time
– Local (loop) dynamics
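Why the distance is used as a range rather than an exact value can be illustrated with the usual approximation that the NOE intensity falls off as 1/r⁶. A peak is calibrated against a reference peak of known distance and then placed in a distance class. The class boundaries below (1.8 Å lower bound; 2.7, 3.3 and 5.0 Å upper bounds) are assumed, commonly quoted values and vary between studies; the function names are hypothetical.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate an H-H distance from NOE intensity, assuming I ~ 1 / r^6."""
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)

# Assumed upper bounds (Å) for strong / medium / weak NOEs; conventions vary.
UPPER_BOUNDS = (2.7, 3.3, 5.0)
LOWER_BOUND = 1.8  # roughly the closest approach of two hydrogen atoms

def distance_range(distance):
    """Turn an estimated distance into a (lower, upper) restraint range."""
    for upper in UPPER_BOUNDS:
        if distance <= upper:
            return (LOWER_BOUND, upper)
    return None  # beyond the range where an NOE would normally be used

# A peak 64 times weaker than a reference peak at 2.0 Å is about 4.0 Å away.
estimate = noe_distance(intensity=1.0, ref_intensity=64.0, ref_distance=2.0)
restraint = distance_range(estimate)
```

Because of the sixth root, a large error in the measured intensity gives only a modest error in the distance, but peak overlap and dynamics still make a range the safer choice.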
Different Classes of NOEs
• Intra-residue (|i-j| = 0)
• Sequential (|i-j| = 1)
• Medium range (1 < |i-j| <= 5)
• Long range (|i-j| > 5)
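The classes depend only on the sequence separation |i-j| of the two residues involved, so the classification is easy to write down. A minimal sketch using the boundaries given above (the function name is hypothetical):

```python
def noe_class(residue_i, residue_j):
    """Classify an NOE by the sequence separation of the two residues."""
    separation = abs(residue_i - residue_j)
    if separation == 0:
        return "intra-residue"
    if separation == 1:
        return "sequential"
    if separation <= 5:
        return "medium range"
    return "long range"

# Example residue pairs (numbers chosen for illustration only).
classes = [noe_class(252, 252), noe_class(252, 253),
           noe_class(252, 256), noe_class(252, 307)]
```

Long-range NOEs connect residues far apart in the sequence and are therefore the ones that define the overall fold.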
Other Constraints Used in NMR Structure Determination of Proteins
• Dihedral angle restraints (phi, psi, chi)
– Ranges of dihedral angles based on chemical shifts and J-couplings
• Hydrogen bonds
– The NOE pattern can indicate hydrogen bonds
– Slow exchange rates indicate hydrogen bonds
• Residual dipolar couplings (RDCs)
– Constraints on the orientation of e.g. helical elements relative to each other.
Distance Constraints
[Figure: protein (grey, white, blue, red) with distance constraints (green, yellow, cyan). Creator: Dr. Shauna Farr-Jones]
Structure Calculation
• All constraints are used as input in a program for protein structure calculation (X-plor, Cyana, CNS)
• The programs are based on energy terms
• Programs include energy terms for general geometrical features of proteins (bond lengths, angles etc.)
• The total energy is a sum of the energy based on the constraints and the ideal geometry terms
Energy Minimization
• The total energy is minimized to get representations of the structure which obey geometrical (chemical) and experimental constraints
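The idea of minimizing an energy built from restraints can be shown with a toy problem. The sketch below is not how X-plor, Cyana or CNS work internally: it places one atom on a line between two fixed atoms, uses only flat-bottomed distance restraint terms (no geometry terms), and minimizes by plain gradient descent. All positions, ranges and names are made up. Real programs search far more thoroughly, for example with simulated annealing.

```python
def restraint_energy(distance, lower, upper, k=1.0):
    """Zero inside the allowed range, quadratic penalty outside it."""
    if distance < lower:
        return k * (lower - distance) ** 2
    if distance > upper:
        return k * (distance - upper) ** 2
    return 0.0

# Toy system: a movable atom at x, fixed atoms at 0.0 and 10.0 (all in Å).
# Each restraint is (partner position, lower bound, upper bound).
RESTRAINTS = [(0.0, 3.0, 4.0), (10.0, 5.5, 6.5)]

def total_energy(x):
    """Sum of all restraint terms for the atom at position x."""
    return sum(restraint_energy(abs(x - pos), lo, up) for pos, lo, up in RESTRAINTS)

def minimize(x, step=0.1, n_steps=500, h=1e-6):
    """Gradient descent using a numerical (central difference) gradient."""
    for _ in range(n_steps):
        gradient = (total_energy(x + h) - total_energy(x - h)) / (2 * h)
        x -= step * gradient
    return x

x_start = 8.0                # violates both restraints
x_final = minimize(x_start)  # ends where both restraints are satisfied
```

Any position between 3.5 and 4.0 Å satisfies both restraints, so different starting points can end at different, equally valid positions. This is the one-dimensional analogue of an NMR ensemble.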
Iterations Between Structure Calculation and Interpretation of Spectra
• Typically 100-200 structures are calculated
• VIOLATIONS = constraints that are not fulfilled
• Correction for wrong assignments
• Some spectroscopists just remove violated constraints...
“Consistently violated restraints were identified and eliminated in subsequent rounds of structure calculations until a consistent set of restraints was obtained with few violations in the ensemble.”
Keeler C et al., JMB, 2003
Deposited NMR Structure
• Finally, an ensemble of the 10-25 structures with the lowest total energy and the fewest violations is deposited in the Protein Data Bank
• The restraints are deposited in the BioMagResBank (BMRB)
• The ensemble shows possible structures within the space defined by the constraints
The ensemble of 20 NMR structures of the U-box domain from Atpub14 deposited in PDB (1T1H)
Andersen et al. , JBC, 2004
Summary of NMR Structure Determination
• Assign peaks
• Assign NOESY spectra
• Obtain other restraints (RDCs, angles)
• Structure calculation
• Check violations/bad assignments
• Deposit structure + restraints
NMR Spectroscopy vs. X-ray Crystallography
• Hydrogen atoms are observed!
• Only 13C, 15N and 1H are observed
• Study of proteins in solution
• Only proteins up to 30-40 kDa
• No total “map” of the structure
• Information used is incomplete and used as restraints
• An ensemble of structures is submitted to PDB
• The solved structure can be used for further dynamics characterization with NMR
How to Evaluate NMR Structures?
• What types of experiments are used?
– Homo-nuclear (COSY, NOESY, TOCSY)
– Hetero-nuclear (HSQC, HNCACB, HCCH-TOCSY)
Homo-nuclear: poor resolution of spectra
=> Best for proteins < 10 kDa
Hetero-nuclear: better resolution of spectra
=> Better in general
How to Evaluate NMR Structures?
• Atomic backbone RMSD
$$\mathrm{RMSD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - x_i'\right)^2}$$
Well-defined structures: RMSDs < 0.6 Å
Less well-defined structures: RMSDs > 0.6 Å
Examples shown: 3GF1, Cooke et al., Biochemistry, 1991; 1T1H, Andersen et al., JBC, 2004
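The RMSD is the square root of the mean squared distance between corresponding atoms in two structures. The sketch below assumes the two structures have already been superimposed and contain the same atoms in the same order; the coordinates and the function name are made up for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superimposed.
    """
    n = len(coords_a)
    squared = sum((a - b) ** 2
                  for atom_a, atom_b in zip(coords_a, coords_b)
                  for a, b in zip(atom_a, atom_b))
    return math.sqrt(squared / n)

# Two hypothetical backbone fragments (coordinates in Å).
model_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model_2 = [(0.0, 0.3, 0.0), (1.5, 0.0, 0.4), (3.0, 0.0, 0.0)]

value = rmsd(model_1, model_2)  # below 0.6 Å for these made-up coordinates
```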
How to Evaluate NMR Structures?
• What types of constraints? • Numbers of constraints?
• Distance restraints
• Angle restraints
• RDCs
• Hydrogen bonds
In well-defined regions each residue has at least 20 restraints
Best if there are no distance restraint violations > 0.5 Å
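Checking for violations means comparing each distance in a calculated model with the allowed range of its restraint. A minimal sketch, with made-up atom pair labels, ranges and model distances (the function names are hypothetical):

```python
def violation(distance, lower, upper):
    """Amount (in Å) by which a model distance falls outside [lower, upper]."""
    return max(lower - distance, distance - upper, 0.0)

def large_violations(model_distances, restraints, threshold=0.5):
    """Return the restraints violated by more than the threshold (in Å)."""
    result = {}
    for pair, (lower, upper) in restraints.items():
        amount = violation(model_distances[pair], lower, upper)
        if amount > threshold:
            result[pair] = amount
    return result

# Hypothetical restraint ranges and the distances found in one model (Å).
restraints = {"A-B": (1.8, 5.0), "A-C": (1.8, 3.3), "B-C": (1.8, 2.7)}
model_distances = {"A-B": 4.5, "A-C": 3.5, "B-C": 3.6}

bad = large_violations(model_distances, restraints)
```

Here "A-C" is violated by 0.2 Å, which stays below the 0.5 Å threshold, while "B-C" is violated by 0.9 Å and would need a closer look at the assignment.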
How to Evaluate NMR Structures?
Which regions in the structure are most well-defined?
Look at the PDB ensembles to see which regions are well-defined
1RJH
Nielbo et al., Biochemistry, 2003
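Looking at an ensemble by eye can be complemented by a number per atom: how far the atom strays from its average position across the models. The sketch below assumes the models are already superimposed and uses a tiny made-up ensemble; the function name is hypothetical.

```python
import math

def per_atom_spread(ensemble):
    """RMS distance of each atom from its mean position across all models.

    ensemble: list of models, each a list of (x, y, z) in the same atom order.
    """
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    spread = []
    for i in range(n_atoms):
        mean = [sum(model[i][k] for model in ensemble) / n_models for k in range(3)]
        msd = sum(sum((model[i][k] - mean[k]) ** 2 for k in range(3))
                  for model in ensemble) / n_models
        spread.append(math.sqrt(msd))
    return spread

# Hypothetical two-model ensemble: atom 0 is well-defined, atom 1 is not.
ensemble = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
spread = per_atom_spread(ensemble)
```

Low values mark well-defined regions; high values typically appear in loops and termini, where few restraints are available.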
• What regions in the structure are most well-defined? Look at restraints in PDB (Experimental Method → data)
Example of restraints
ASSIGN (RESID 252 AND NAME HA ) (RESID 260 AND NAME HD# ) 3.38 1.58 1.58 !
ASSIGN (RESID 252 AND NAME HA ) (RESID 261 AND NAME HE# ) 3.23 1.44 1.44 !
ASSIGN (RESID 252 AND NAME HA ) (RESID 307 AND NAME HD# ) 3.56 1.77 1.77 !
ASSIGN (RESID 252 AND NAME HA ) (RESID 307 AND NAME HG ) 3.65 1.85 1.85 !
ASSIGN (RESID 252 AND NAME HA ) (RESID 311 AND NAME HG2# ) 3.65 1.85 1.85 !
ASSIGN (RESID 252 AND NAME HA ) (RESID 252 AND NAME HD# ) 2.96 1.17 1.17 !
ASSIGN (RESID 252 AND NAME HA ) (RESID 252 AND NAME HE# ) 3.65 1.85 1.85 !
!
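The restraint lines above follow the X-PLOR/CNS style, in which the three numbers are a distance d followed by the amounts that may be subtracted from and added to it, so the allowed range is d - dminus to d + dplus. The sketch below parses one such line; the regular expression only covers the simple form shown in this example, and the function name is hypothetical.

```python
import re

# Matches lines of the form shown above; more complex selections are not handled.
PATTERN = re.compile(
    r"ASSIGN\s*\(\s*RESID\s+(\d+)\s+AND\s+NAME\s+(\S+)\s*\)\s*"
    r"\(\s*RESID\s+(\d+)\s+AND\s+NAME\s+(\S+)\s*\)\s*"
    r"([\d.]+)\s+([\d.]+)\s+([\d.]+)"
)

def parse_restraint(line):
    """Return residue numbers, atom names and the allowed distance range."""
    match = PATTERN.search(line)
    if match is None:
        return None
    d, d_minus, d_plus = (float(match.group(k)) for k in (5, 6, 7))
    return {
        "res_i": int(match.group(1)), "atom_i": match.group(2),
        "res_j": int(match.group(3)), "atom_j": match.group(4),
        "lower": d - d_minus, "upper": d + d_plus,
    }

line = "ASSIGN (RESID 252 AND NAME HA ) (RESID 260 AND NAME HD# ) 3.38 1.58 1.58 !"
restraint = parse_restraint(line)  # allowed range: about 1.80 to 4.96 Å
```

Counting parsed restraints per residue gives a quick view of which regions are supported by many restraints and which by few.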
A Word of Caution
• RMSD depends on:
– Number of structures in ensemble (PDB file).
– Total number of structures generated
• Ensemble RMSD should be maximized within the experimental restraints, NOT minimized!
Spronk 2003, Journal of Biomolecular NMR
Summary on Evaluation
• Homo-/hetero-nuclear NMR?
• RMSD (be careful!)?
• Types of constraints?
• Numbers of constraints?
• Violations?
• Specific regions?
Summary: X-rays vs. NMR
X-ray crystallography
• X-rays
• Proteins of any size
• Proteins in crystal
• Complete data/total map of structure
• Many details – one model
• Resolution, R-values, Ramachandran plot
NMR spectroscopy
• Radio waves + magnetic field
• Proteins below 50 kDa
• Proteins in solution
• Incomplete data
• Fewer details – many models (ensemble)
• Restraint violations, RMSD, Ramachandran plot