CENTER FOR BIOLOGICAL SEQUENCE ANALYSIS TECHNICAL UNIVERSITY OF DENMARK DTU

Thomas Blicher, Center for Biological Sequence Analysis
Anne Mølgaard, Kemisk Institut, Københavns Universitet

Details of Protein Structure

Function, evolution & experimental methods

Learning Objectives

Outline the basic levels of protein structure.

Outline key differences between X-ray crystallography and NMR spectroscopy.

Identify relevant parameters for evaluating the quality of protein structures determined by X-ray crystallography and NMR spectroscopy.

Outline

Protein structure evolution and function
  Inferring function from structure
  Modifying function

Experimental techniques
  X-ray crystallography
  NMR spectroscopy

Structure validation

Watson, Crick and DNA, 1953

DNA Conclusions

"We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest. … It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material."

J.D. Watson & F.H.C. Crick (1953) Nature, 171, 737.

Once Upon a Time…

“Could the search for ultimate truth really have revealed so hideous and visceral-looking an object?” Max Perutz, 1964, on protein structure

John Kendrew, 1959, with myoglobin model

Why are Protein Structures so Interesting?

They provide a detailed picture of interesting biological features, such as active sites, substrate specificity, allosteric regulation, etc.

They aid in rational drug design and protein engineering.

They can elucidate evolutionary relationships undetectable by sequence comparisons.

Structure & Evolution

In evolution, structure is conserved longer than both function and sequence.

Structure > Function > Sequence

Structure & Evolution

Rhamnogalacturonan acetylesterase (A. aculeatus) (1K7C)

Platelet-activating factor acetylhydrolase (B. taurus) (1WAB)

Serine esterase (S. scabies) (1ESC)

Structure to Function

Inferring biological features from the structure

Figure labels (1DEO): NH2, COOH, Asp, His, Ser, topological switchpoint.

Structure & Evolution

Platelet-activating factor acetylhydrolase

Serine esterase

Rhamnogalacturonan acetylesterase

Mølgaard, Kauppinen & Larsen (2000) Structure, 8, 373-383.

Why Fold?

Hydrophobic collapse
  Hydrophobic residues cluster to “escape” interactions with water.
  Indirect effect of attraction between water molecules.

Polar backbone groups form secondary structure to satisfy hydrogen bonding donors and acceptors.
Interactions with
Initially formed structure is in molten globule state (ensemble).
Molten globule condenses to native fold via transition state.

Hydrophobic Effect and Folding

Oil and water

Clathrate structures

Entropy

Indirect consequence of attraction between water molecules

Hydrophobic Core

Hydrophobic side chains go into the core of the molecule, but the main chain is highly polar.
The polar groups (C=O and NH) are neutralized through formation of H-bonds.

Figure: myoglobin, surface and interior.

Hydrophobic vs. Hydrophilic

Globular protein (in solution): myoglobin

Membrane protein (in membrane): aquaporin

Figure panels: myoglobin and aquaporin, each shown whole and in cross-section.
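The contrast between a soluble globular protein and a membrane protein can be made concrete with a hydropathy profile. The sketch below is an illustration that is not part of the original slides: it assumes the Kyte-Doolittle hydropathy scale, uses a hypothetical helper name `hydropathy_profile`, and works on made-up sequences. Long stretches of high average hydropathy hint at membrane-spanning segments.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of every full window along the sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Made-up sequences, for illustration only.
hydrophobic_stretch = "LLIVAVLLGIAFLLV"
polar_stretch = "KDESRNQKDEEKRSD"
print(max(hydropathy_profile(hydrophobic_stretch)))
print(max(hydropathy_profile(polar_stretch)))
```

The window length of 9 is an arbitrary default; wider windows are often used when looking for transmembrane helices.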

Characteristics of Helices

Aligned peptide units → dipolar moment
Ion/ligand binding
Secondary and quaternary structure packing
Capping residues
The α-helix (i → i+4)
Other helix types! (3₁₀, π)

Figure labels: N and C termini of the helix.
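The (i → i+4) notation describes which residues are joined by backbone hydrogen bonds: in an α-helix the C=O of residue i accepts a hydrogen bond from the N-H of residue i+4, while the offset is 3 in a 3₁₀-helix and 5 in a π-helix. A minimal sketch of that bookkeeping for an ideal helix, with the function name `helix_hbond_pairs` chosen here purely for illustration:

```python
# Offset between the C=O acceptor residue i and the N-H donor residue i+n.
HELIX_OFFSETS = {"3-10": 3, "alpha": 4, "pi": 5}


def helix_hbond_pairs(first, last, helix_type="alpha"):
    """List (acceptor, donor) residue numbers for an ideal helix first..last."""
    offset = HELIX_OFFSETS[helix_type]
    return [(i, i + offset) for i in range(first, last - offset + 1)]


print(helix_hbond_pairs(1, 8))          # alpha-helix spanning residues 1-8
print(helix_hbond_pairs(1, 8, "3-10"))  # the same span as a 3-10 helix
```

Real helices deviate from this ideal pattern at their ends, which is where capping residues come in.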

β-Sheets

Multiple strands → sheet

Parallel vs. antiparallel
Twist

Flexibility
Vs. helices
Folding
Structure propagation (amyloids)
Other…

Thioredoxin

β-Sheets

Multiple strands → sheet

Parallel vs. antiparallel
Twist

Strand interactions are non-local

Flexibility
Vs. helices
Folding

Figure panels: antiparallel and parallel sheets.

Turns, Loops & Bends Revisited

Between helices and sheets

On protein surface

Intrinsically “unstructured” proteins

Structure Levels

Primary structure = Sequence, e.g. MSSVLLGHIKKLEMGHS…

Secondary structure = Helix, sheets/strands, loops & turns

Structural motif = Small, recurrent arrangement of secondary structure, e.g.
  Helix-loop-helix
  Beta hairpins
  EF hand (calcium binding motif)
  Etc.

Tertiary structure = Arrangement of secondary structure elements

Quaternary Structure

Assembly of monomers/subunits into protein complex (figure: myoglobin and hemoglobin)

Backbone-backbone, backbone-side-chain & side-chain-side-chain interactions:

Intramolecular vs. intermolecular contacts.
For ligand binding, side chains may or may not contribute. When they do not, mutations have little effect.

Grouping Amino Acids

Livingstone & Barton, CABIOS, 9, 745-756, 1993

A – Ala   C – Cys   D – Asp   E – Glu   F – Phe
G – Gly   H – His   I – Ile   K – Lys   L – Leu
M – Met   N – Asn   P – Pro   Q – Gln   R – Arg
S – Ser   T – Thr   V – Val   W – Trp   Y – Tyr
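The one-letter and three-letter codes listed above translate directly into a lookup table. A small sketch, with helper names (`to_three_letter`, `to_one_letter`) that are hypothetical and chosen only for this example:

```python
# One-letter to three-letter codes for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}
# Reverse lookup, keyed in upper case as residue names appear in PDB files.
THREE_TO_ONE = {three.upper(): one for one, three in ONE_TO_THREE.items()}


def to_three_letter(sequence):
    """Convert a one-letter sequence string to hyphen-joined three-letter codes."""
    return "-".join(ONE_TO_THREE[aa] for aa in sequence.upper())


def to_one_letter(residues):
    """Convert an iterable of three-letter residue names to a one-letter string."""
    return "".join(THREE_TO_ONE[name.upper()] for name in residues)


print(to_three_letter("MSSVLLGH"))
print(to_one_letter(["ASP", "HIS", "SER"]))
```

Non-standard residues are not covered and raise a `KeyError` in this sketch.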

http://www.ch.cam.ac.uk/magnus/molecules/amino/

Proteins Are Polypeptides

Figure panels: the peptide bond; a polypeptide chain.

Ramachandran Plot

Allowed backbone torsion angles in proteins

Torsion Angles
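The backbone torsion angles plotted in a Ramachandran plot, φ and ψ, are dihedral angles defined by four consecutive backbone atoms: C(i-1), N, Cα, C for φ and N, Cα, C, N(i+1) for ψ. The sketch below shows one common way to compute such an angle from four points. The function name `torsion_angle` and the test coordinates are illustrative assumptions, not taken from the slides.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_angle(p0, p1, p2, p3):
    """Dihedral angle in degrees about the p1-p2 bond, in the range -180..180."""
    b0 = _sub(p0, p1)            # from the central bond back to the first atom
    b1 = _sub(p2, p1)            # the central bond
    b2 = _sub(p3, p2)            # from the central bond on to the last atom
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Remove the components along the central bond, then measure the angle
    # between what is left of the two outer bonds.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Four points with the outer bonds at right angles about the central bond.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying the function to every residue of a structure gives the (φ, ψ) pairs that make up a Ramachandran plot.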

Ramachandran Plots

Engineering Thermostability

Example: Serpin (serine protease inhibitor)
  Overpacking
  Buried polar groups
  Cavities

Im, Ryu & Yu (2004) Engineering thermostability in serine protease inhibitors. PEDS, 17, 325-331.

Experimental Methods

Crystallography & NMR spectroscopy

Methods for Structure Determination

X-ray crystallography
Nuclear Magnetic Resonance (NMR)
Modelling techniques

More exotic techniques
  Cryo electron microscopy (Cryo EM)
  Small angle X-ray scattering (SAXS)
  Neutron scattering

X-ray Crystallography

No size limitation.
Protein molecules are "stuck" in a crystal lattice.
Some proteins seem to be uncrystallizable.
Slow.

Especially suited for studying structural details.

X-rays

Fourier transform

The Importance of Resolution

Figure panels: 4 Å (low resolution), 3 Å, 2 Å, 1 Å (high resolution).

Key Parameters

Resolution

R values
  Agreement between data and model.
  Usually between 0.15 and 0.25, should not exceed 0.30.

B factors
  Contributions from static and dynamic disorder.
  Well determined ~10-20 Å², intermediate ~20-30 Å², flexible 30-50 Å², invisible >60 Å².

No. of observations vs. parameters

Ramachandran plot
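Two of these parameters reduce to short formulas. The conventional crystallographic R value is R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, and an isotropic B factor is related to the mean-square atomic displacement by B = 8π²⟨u²⟩. The sketch below assumes the observed and calculated structure factor amplitudes are already on a common scale. The function names and the amplitudes are made up for illustration.

```python
import math


def r_value(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over matching reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def rms_displacement(b_factor):
    """Root-mean-square displacement in Å from B = 8 * pi^2 * <u^2> (B in Å^2)."""
    return math.sqrt(b_factor / (8 * math.pi ** 2))


# Made-up amplitudes, for illustration only.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 66.0, 35.0]
print(round(r_value(f_obs, f_calc), 3))
print(round(rms_displacement(20.0), 2))
```

By this relation a B factor of 20 Å² corresponds to an r.m.s. displacement of roughly half an ångström.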

NMR Spectroscopy

Upper limit for structure determination currently ~50 kDa.
Protein molecules are in solution.
Dynamics, protein folding.
Slow.

Especially suited for studies of protein dynamics of small to medium size proteins.

NMR Basics

NMR is nuclear magnetic resonance

NMR spectroscopy is done on proteins IN SOLUTION

Only the nuclei 1H, 13C, 15N (and 31P) can be detected in NMR experiments

Proteins up to 30 kDa

Proteins stable at high concentration (0.5-1 mM), preferably at room temperature

NMR Spectroscopy

Evaluation of NMR Structures

Atomic backbone RMSD:

$$\mathrm{RMSD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - x_i'\right)^2}$$

Well-defined structures: RMSDs < 0.6 Å

Less well-defined structures: RMSDs > 0.6 Å

Examples: 3GF1, Cooke et al. Biochemistry, 1991; 1T1H, Andersen et al. JBC, 2004
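The backbone RMSD is the square root of the mean squared distance between equivalent atoms in two structures. A minimal sketch, assuming the two coordinate sets list the same atoms in the same order and have already been superimposed; the function name `rmsd` and the coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        sum((a - b) ** 2 for a, b in zip(atom_a, atom_b))
        for atom_a, atom_b in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up coordinates: the second set is the first shifted by 1 Å along z.
model_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(model_a, model_b))
```

Without prior superposition the value also reflects the overall rotation and translation between the two structures, which is usually not what one wants to measure.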

Evaluation of NMR Structures

What regions in the structure are best defined?

Look at the PDB ensembles to see which regions are well-defined.

Example: 1RJH, Nielbo et al., Biochemistry, 2003
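One way to put numbers on "well-defined" is to measure, atom by atom, how far the models of an ensemble scatter around their mean position. The sketch below assumes the models are already superimposed and list the same atoms in the same order. The function name `per_atom_spread` and the coordinates are made up for illustration.

```python
import math


def per_atom_spread(models):
    """RMS deviation of each atom from its mean position across all models."""
    n_models = len(models)
    n_atoms = len(models[0])
    if any(len(model) != n_atoms for model in models):
        raise ValueError("all models must contain the same number of atoms")
    spread = []
    for atom in range(n_atoms):
        mean = [sum(m[atom][k] for m in models) / n_models for k in range(3)]
        msd = sum(
            sum((m[atom][k] - mean[k]) ** 2 for k in range(3)) for m in models
        ) / n_models
        spread.append(math.sqrt(msd))
    return spread


# Two made-up models: atom 0 agrees exactly, atom 1 differs by 2 Å along x.
ensemble = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0)],
]
print(per_atom_spread(ensemble))
```

Atoms with a small spread belong to the well-defined regions; loops and termini typically show the largest values.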

Which Structural Model?

Normally NMR structure models are listed according to the total energy and the number of violations.
Model 1 in the PDB file is often the one with lowest energy and fewest violations.
Use that model as template for modelling.
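In a PDB-format file the individual NMR models are delimited by MODEL and ENDMDL records, so picking model 1 comes down to splitting the file on those records. A rough sketch, assuming well-formed PDB text; the function name `split_models` and the coordinates in the example are invented for illustration.

```python
def split_models(pdb_text):
    """Return {model_number: [ATOM/HETATM lines]} from PDB-format text."""
    models = {}
    current = None
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = int(line[6:].split()[0])
            models[current] = []
        elif record == "ENDMDL":
            current = None
        elif record in ("ATOM", "HETATM") and current is not None:
            models[current].append(line)
    return models


# Made-up two-model fragment, for illustration only.
example = """\
MODEL        1
ATOM      1  N   MET A   1      11.104   6.134  -6.504  1.00  0.00           N
ATOM      2  CA  MET A   1      11.639   6.071  -5.147  1.00  0.00           C
ENDMDL
MODEL        2
ATOM      1  N   MET A   1      10.892   6.301  -6.377  1.00  0.00           N
ATOM      2  CA  MET A   1      11.512   6.118  -5.079  1.00  0.00           C
ENDMDL
"""

models = split_models(example)
template = models[1]  # model 1, often (not always) the lowest-energy model
print(len(models), len(template))
```

Whether model 1 really is the best representative depends on how the depositors ordered the ensemble, so it is worth checking the file's remarks before relying on it.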

NMR versus X-ray Crystallography

Hydrogen atoms are observed!

Only 13C, 15N and 1H are observed

Study of proteins in solution

Only proteins up to 30-40 kDa

No total “map” of the structure

Information used is incomplete and used as restraints

An ensemble of structures is submitted to PDB

The solved structure can be used for further dynamics characterization with NMR

PDB

Holdings of the Protein Data Bank (PDB):

         Sep. 2001   May 2006   Oct. 2007
X-ray        13116      30860       39706
NMR           2451       5368        6862
Other          338        200         250
Total        15905      36428       46818

The PDB also contains nucleotide and nucleotide analogue structures.

Summary

In evolution, structure is conserved longer than both function and sequence.

X-ray crystallography
  Proteins in crystal lattice
  Many details – one model
  Resolution, R-values, Ramachandran plot

NMR spectroscopy
  Proteins in solution
  Fewer details – many models
  Violations, RMSD, Ramachandran plot

Links

PDB (protein structure database): www.pdb.org/

PyMOL home: http://pymol.sourceforge.net/

PyMOL manual: http://pymol.sourceforge.net/newman/user/toc.html

PyMOL Wiki: http://www.pymolwiki.org/index.php/Main_Page

PyMOL settings (documented): http://cluster.earlham.edu/detail/bazaar/software/pymol/modules/pymol/setting.py

Other Courses

27617 Protein Structure and Computational Biology
Master’s level course
13 weeks, spring semester
5 ECTS
Max. 40 students