Post on 18-Dec-2015
Protein Structure, Structure Classification and Prediction
Bioinformatics X3
January 2005
P. Johansson, D. Madsen
Dept. of Cell & Molecular Biology,
Uppsala University

Overview
• Introduction to proteins, structure & classification
• Protein Folding
• Experimental techniques for structure determination
• Structure prediction

Proteins
• Proteins play a crucial role in virtually all biological processes with a broad range of functions.
• The activity of an enzyme or the function of a protein is governed by its three-dimensional structure

20 amino acids - the building blocks

The Amino Acids

Hydrophilic or hydrophobic..?
• Virtually all soluble proteins feature a hydrophobic core surrounded by a hydrophilic surface
• But the peptide backbone is inherently polar?
• Solution: neutralize potential H-bond donors & acceptors using ordered secondary structure

Secondary Structure: α-helix

Secondary Structure: α-helix
• 3.6 residues / turn
• Axial dipole moment
• Not proline & glycine
• Protein surfaces
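
The figure of 3.6 residues per turn fixes the rotation between consecutive residues at 360/3.6 = 100 degrees, which is what a helical-wheel diagram plots. The short Python sketch below computes those angles; the function name `helical_wheel_angles` is illustrative and does not come from the slides.

```python
RESIDUES_PER_TURN = 3.6  # value quoted on the slide


def helical_wheel_angles(n_residues, residues_per_turn=RESIDUES_PER_TURN):
    """Angular position (degrees, 0-360) of each residue when the helix
    is viewed down its axis, starting with residue i at 0 degrees."""
    step = 360.0 / residues_per_turn  # about 100 degrees per residue
    return [(i * step) % 360.0 for i in range(n_residues)]


if __name__ == "__main__":
    for offset, angle in enumerate(helical_wheel_angles(8)):
        print(f"residue i+{offset}: {angle:6.1f} degrees")
```

Residues i, i+3, i+4 and i+7 come out within 60 degrees of residue i, so they sit on the same face of the helix. This is one way to see how a helix on a protein surface can present a hydrophobic side to the core and a hydrophilic side to the solvent.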

Secondary Structure: β-sheets

Secondary Structure: β-sheets
• Parallel or antiparallel
• Alternating side-chains
• Parallel and antiparallel strands usually do not mix within a sheet
• Loops often have polar amino acids

Structural classification
• Databases
– SCOP, ’Structural Classification of Proteins’, manual classification
– CATH, ’Class Architecture Topology Homology’, based on the SSAP algorithm
– FSSP, ’Family of Structurally Similar Proteins’, based on the DALI algorithm
– PClass, ’Protein Classification’, based on the LOCK and 3Dsearch algorithms

Structural classification, CATH
• Class, four types:
– Mainly α
– α/β structures
– Mainly β
– No secondary structure
• Architecture (fold)
• Topology (superfamily)
• Homology (family)

Structural classification..
• Two types of algorithms
– Inter-molecular, 3D, rigid body: structural alignment in a common coordinate system (hard), e.g. the VAST, LOCK.. algorithms
– Intra-molecular, 2D, internal geometry: structural alignment using internal distances and angles, e.g. the DALI, STRUCTURAL, SSAP.. algorithms
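
One reason internal-geometry methods avoid the hard part of the rigid-body approach is that distances inside a molecule do not change when the molecule is rotated or translated, so no common coordinate system has to be found first. The sketch below illustrates this with plain Python; the coordinates and the helper names (`distance_matrix`, `rotate_z_and_shift`) are made up for the example.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D points (e.g. C-alpha atoms)."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def rotate_z_and_shift(coords, angle_deg, shift):
    """Rigid-body motion: rotate about the z axis, then translate."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


# Hypothetical coordinates, for illustration only.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0), (8.1, 4.2, 2.0)]
moved = rotate_z_and_shift(coords, 40.0, (10.0, -5.0, 2.5))

d_original = distance_matrix(coords)
d_moved = distance_matrix(moved)
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(d_original, d_moved)
    for a, b in zip(row_a, row_b)
)
print(f"largest change in any internal distance: {max_diff:.2e}")
```

The coordinates themselves change completely under the motion, while the distance matrix changes only by floating-point rounding.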

Structural classification, SSAP
• SSAP, ‘Sequential Structure Alignment Program’
Basic idea: the similarity between residue i in molecule A and residue k in molecule B is characterised in terms of their structural surroundings.
This similarity can be quantified into a score, $S_{i,k}$.
Based on this similarity score and some specified gap penalty, dynamic programming is used to find the optimal structural alignment.

Structural classification, SSAP
The structural neighborhood of residue i in A compared to residue k in B

Structural classification, SSAP..
Distance between residues i & j in molecule A: $d^A_{i,j}$
Similarity for two pairs of residues, i, j in A & k, l in B:
$$s_{ij,kl} = \frac{a}{\left|d^A_{i,j} - d^B_{k,l}\right| + b}, \qquad a, b \text{ constants}$$
Similarity between residue i in A and residue k in B:
$$S_{i,k} = \sum_{m=-n}^{n} \frac{a}{\left|d^A_{i,i+m} - d^B_{k,k+m}\right| + b}$$
Idea: $S_{i,k}$ is big if the distances from residue i in A to the 2n nearest neighbours are similar to the corresponding distances around k in B
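
The score on this slide can be sketched in a few lines of Python. The version below is an illustration under stated assumptions, not the SSAP implementation: it takes the absolute difference of the two distances, uses arbitrary values for a, b and n, skips neighbours that fall off the end of a chain, and runs on made-up coordinates.

```python
import math


def pair_score(d_a, d_b, a=1.0, b=1.0):
    """s = a / (|d_a - d_b| + b): largest when the two distances agree."""
    return a / (abs(d_a - d_b) + b)


def residue_similarity(coords_a, coords_b, i, k, n=2, a=1.0, b=1.0):
    """S_ik: sum of pair scores over offsets m = -n..n around i (in A) and k (in B).

    Offsets outside either chain are skipped (an assumption; the slides do
    not say how chain ends are handled).
    """
    total = 0.0
    for m in range(-n, n + 1):
        if 0 <= i + m < len(coords_a) and 0 <= k + m < len(coords_b):
            d_a = math.dist(coords_a[i], coords_a[i + m])
            d_b = math.dist(coords_b[k], coords_b[k + m])
            total += pair_score(d_a, d_b, a, b)
    return total


# Hypothetical chains: B is A translated, C is A stretched along x.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.5, 3.4, 0.0),
           (9.0, 4.0, 1.5), (11.0, 7.0, 2.0)]
chain_b = [(x + 1.0, y + 2.0, z + 3.0) for x, y, z in chain_a]
chain_c = [(2.0 * x, y, z) for x, y, z in chain_a]

print("same shape:     ", residue_similarity(chain_a, chain_b, 2, 2))
print("distorted shape:", residue_similarity(chain_a, chain_c, 2, 2))
```

With these parameters the highest possible value for an interior residue is (2n+1)·a/b, reached when every neighbour distance matches. Any structural difference lowers the score.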

Structural classification, SSAP..
This works well for small structures and local structural alignments. However, insertions and deletions cause problems: unrelated distances are compared.
A: HSERAHVFIM..   (i=5)
B: GQ-VMAC-NW..   (k=4)
• The real algorithm uses dynamic programming on two levels: first to find which distances to compare when computing $S_{i,k}$, then to align the structures using these scores
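
The second of the two levels, aligning the structures once residue-residue scores are available, can be illustrated with a plain Needleman-Wunsch-style global alignment. This is a simplified stand-in, not SSAP's double dynamic programming: the linear gap penalty, the function name `align` and the score matrix are all assumptions made for the example.

```python
def align(score, gap=1.0):
    """Global alignment by dynamic programming over a residue-residue score
    matrix score[i][k]; returns (total score, list of aligned (i, k) pairs)."""
    n, m = len(score), len(score[0])
    # best[i][k] = best total for the first i residues of A and first k of B
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = -gap * i
    for k in range(1, m + 1):
        best[0][k] = -gap * k
    for i in range(1, n + 1):
        for k in range(1, m + 1):
            best[i][k] = max(
                best[i - 1][k - 1] + score[i - 1][k - 1],  # pair i with k
                best[i - 1][k] - gap,                      # residue of A unpaired
                best[i][k - 1] - gap,                      # residue of B unpaired
            )
    # Trace back from the corner to recover which residues were paired.
    pairs, i, k = [], n, m
    while i > 0 and k > 0:
        if best[i][k] == best[i - 1][k - 1] + score[i - 1][k - 1]:
            pairs.append((i - 1, k - 1))
            i, k = i - 1, k - 1
        elif best[i][k] == best[i - 1][k] - gap:
            i -= 1
        else:
            k -= 1
    return best[n][m], pairs[::-1]


# Hypothetical S_ik values for 3 residues in A and 4 in B.
scores = [
    [5, 1, 0, 0],
    [0, 1, 5, 0],
    [0, 0, 1, 5],
]
total, pairs = align(scores, gap=1.0)
print(total, pairs)
```

In this toy matrix the best alignment leaves the second residue of B unpaired, which is the kind of insertion that a fixed-offset comparison of neighbours would get wrong.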

Experimental techniques for structure determination
• X-ray Crystallography
• Nuclear Magnetic Resonance spectroscopy (NMR)
• Electron Microscopy/Diffraction
• Free electron lasers?

X-ray Crystallography

X-ray Crystallography..
• From small molecules to viruses
• Information about the positions of individual atoms
• Limited information about dynamics
• Requires crystals

NMR
• Limited to molecules up to ~50 kDa (good quality up to 30 kDa)
• Distances between pairs of hydrogen atoms
• Lots of information about dynamics
• Requires soluble, non-aggregating material
• Assignment problem

Electron Microscopy / Diffraction
• Low to medium resolution
• Limited information about dynamics
• Can use very small crystals (nm range)
• Can be used for very large molecules and complexes

Structure Prediction
GPSRYIV…?

Protein Folding
• Different sequence → Different structure
• Free energy difference small due to large entropy decrease, $\Delta G = \Delta H - T\,\Delta S$

Structure Prediction
Why are structure prediction, and especially ab initio calculations, hard..?
• Many degrees of freedom / residue
• Remote noncovalent interactions
• Nature does not go through all conformations
• Folding assisted by enzymes & chaperones
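
The point that nature does not go through all conformations is usually made with a back-of-the-envelope count, often called Levinthal's paradox. The sketch below does that arithmetic; the three states per residue and the sampling rate of 10^13 conformations per second are illustrative assumptions, not values from the slides.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_years(n_residues, states_per_residue=3,
                            samples_per_second=1e13):
    """Years needed to visit every conformation once, under the stated
    (deliberately crude) assumptions."""
    conformations = states_per_residue ** n_residues
    return conformations / samples_per_second / SECONDS_PER_YEAR


years = exhaustive_search_years(100)
print(f"100 residues: about {years:.1e} years")
```

Even with these generous assumptions the estimate exceeds the age of the universe by many orders of magnitude, so folding must follow a directed pathway rather than a random search. The same combinatorial growth is what makes brute-force ab initio prediction impractical.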

Molecular dynamics
Ab initio calculations used for smaller problems:
• Calculation of affinity
• Enzymatic pathways

Sequence Classification rev.
• Class : Secondary structure content
• Fold : Major structural similarity.
• Superfamily : Probable common evolutionary origin.
• Family : Clear evolutionary relationship.

Structure Prediction
• Search sequence data banks for homologs
• Search methods, e.g. BLAST, PSI-BLAST, FASTA…
• Homolog in PDB..?
IVTY…PGGG HYW…QHG

Structure Prediction
Multiple sequence / structure alignment
• Contains more information than a single sequence for applications like homology modeling and secondary structure prediction
• Gives location of conserved parts and residues likely to be buried in the protein core or exposed to solvent
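
As a rough illustration of how a multiple alignment exposes conserved positions, the sketch below reports, for each column, the most common residue and the fraction of sequences that carry it. The four aligned sequences are invented for the example, and the scoring (simple majority fraction, gaps counted as mismatches) is only one of many possible conservation measures.

```python
from collections import Counter

# Hypothetical aligned sequences, for illustration only ('-' marks a gap).
alignment = [
    "MKVFDLAG",
    "MRVFDIAG",
    "MKVFDLSG",
    "MQVFD-AG",
]


def column_conservation(seqs):
    """For each column, return (most common residue, fraction of sequences
    carrying it). Gaps count towards the column size but are never reported
    as the consensus residue."""
    result = []
    for column in zip(*seqs):
        counts = Counter(c for c in column if c != "-")
        residue, n = counts.most_common(1)[0] if counts else ("-", 0)
        result.append((residue, n / len(column)))
    return result


conserved = [
    i for i, (_, frac) in enumerate(column_conservation(alignment))
    if frac == 1.0
]
print("fully conserved columns:", conserved)
```

Fully conserved columns are candidates for functionally or structurally important residues. Variable columns that keep a hydrophobic or polar character hint at buried or exposed positions, respectively.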

Multiple alignment example
HFD fingerprint

Secondary Structure Prediction
• Statistical analysis (old-fashioned):
– For each amino acid type, assign its ‘propensity’ to be in a helix, sheet, or coil.
• Limited accuracy, ~55-60%.
• Random prediction, ~38%.
MTLLALGINHKTAP...
CCEEEEEECCCCCC...
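
One common way to define such a propensity is the fraction of a residue type found in a given state divided by the overall fraction of that state. The sketch below applies that definition to the 14 residues visible on the slide; the function name is illustrative, and a fragment this short is far too small for meaningful statistics, so the numbers only demonstrate the calculation.

```python
from collections import Counter

# Sequence and secondary-structure string shown on the slide
# (C = coil, E = strand); only the visible 14 positions are used.
sequence = "MTLLALGINHKTAP"
states = "CCEEEEEECCCCCC"


def propensity(sequence, states, state):
    """Propensity of each amino acid type for `state`:
    (fraction of that residue type found in `state`) / (overall fraction of `state`)."""
    overall = states.count(state) / len(states)
    totals = Counter(sequence)
    in_state = Counter(aa for aa, s in zip(sequence, states) if s == state)
    return {aa: (in_state[aa] / totals[aa]) / overall for aa in totals}


strand = propensity(sequence, states, "E")
for aa in sorted(strand, key=strand.get, reverse=True):
    print(f"{aa}: {strand[aa]:.2f}")
```

Values above 1 mean the residue type occurs in that state more often than average, values below 1 less often. Real propensity tables are derived from large sets of known structures.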

The Chou & Fasman Method
• Each residue is classified as:
– H/H, strong helix / strand former.
– h/h, weak helix / strand former.
– I, indifferent.
– b/b, weak helix / strand breaker.
– B/B, strong helix / strand breaker.

The Chou & Fasman Method..
• Score each residue:
– H/h = 1, I = 0 or ½, B/b = -1.
• Helix nucleation:
– Score > 4 in a “window” of 6 residues.
• Strand nucleation:
– Score > 3 in a “window” of 5 residues.
• Propagate until score < 1 in a 4 residue “window”.

The Chou & Fasman Method..
Sequence:       G    P    S    R    Y    I    V    T    L    A    N    G    K
Helix scores:  -1   -1    0    0   -1    1    1    0    1    1   -1   -1    1
Strand scores: -1   -1   -1   .5    1    1    1    1    1    0    0   -1   -1
Nucleation, helix (window of 6):  -2 0 1 2 3 3 1 (No nucl.)
Nucleation, strand (window of 5): -1.5 .5 2.5 4.5 5 4 3 1 -1
Propagate: -2.5 -.5 1.5 … 3 1 -1
Result: GPSRYIVTLANGK
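
The nucleation step of this worked example can be reproduced with a sliding-window sum. The sketch below uses the per-residue scores from the slide and the thresholds quoted for the simplified method (helix: more than 4 in a window of 6; strand: more than 3 in a window of 5). The function names are illustrative, and the propagation step is left out because the slides do not specify it in enough detail.

```python
# Per-residue scores copied from the slide for the sequence GPSRYIVTLANGK.
sequence = "GPSRYIVTLANGK"
helix_scores = [-1, -1, 0, 0, -1, 1, 1, 0, 1, 1, -1, -1, 1]
strand_scores = [-1, -1, -1, 0.5, 1, 1, 1, 1, 1, 0, 0, -1, -1]


def window_sums(scores, width):
    """Sum of scores in every window of `width` consecutive residues."""
    return [sum(scores[i:i + width]) for i in range(len(scores) - width + 1)]


def nucleation_sites(scores, width, threshold):
    """Start positions (0-based) of windows whose summed score exceeds `threshold`."""
    return [
        i for i, total in enumerate(window_sums(scores, width))
        if total > threshold
    ]


helix_sites = nucleation_sites(helix_scores, width=6, threshold=4)
strand_sites = nucleation_sites(strand_scores, width=5, threshold=3)

print("helix window sums: ", window_sums(helix_scores, 6))
print("strand window sums:", window_sums(strand_scores, 5))
print("helix nucleation at:", helix_sites or "none")
for i in strand_sites:
    print("strand nucleation window:", sequence[i:i + 5])
```

The computed sums agree with the rows on the slide: no helix window exceeds 4, while three overlapping strand windows (RYIVT, YIVTL, IVTLA) exceed 3 and nucleate a strand.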

Modern methods
• Neural networks (e.g. the PHD server):
– Input: a number of protein sequences + secondary structure.
– Output: a trained network that predicts secondary structure elements with ~70% accuracy.
• Use many different methods and compare (e.g. the JPred server)!
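
Comparing several methods can be as simple as a per-residue majority vote, with the level of agreement serving as a rough confidence indicator. The sketch below shows that idea only; it is not how JPred or any particular server combines predictions, and the three prediction strings are invented.

```python
from collections import Counter

# Hypothetical per-residue predictions from three methods
# (H = helix, E = strand, C = coil).
predictions = [
    "CCEEEEEECCCCCC",
    "CCCEEEEECCCHCC",
    "CEEEEEECCCCCCC",
]


def consensus(preds):
    """Per-residue majority vote; ties fall back to coil ('C')."""
    result = []
    for column in zip(*preds):
        (top, n), *rest = Counter(column).most_common()
        tied = rest and rest[0][1] == n
        result.append("C" if tied else top)
    return "".join(result)


def agreement(preds):
    """Fraction of positions where all methods agree."""
    return sum(len(set(col)) == 1 for col in zip(*preds)) / len(preds[0])


print("consensus:", consensus(predictions))
print(f"full agreement at {agreement(predictions):.0%} of positions")
```

Positions where the methods disagree, typically the ends of helices and strands, are the ones to treat with caution.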

Summary
• The function of a protein is governed by its structure
• Different sequence → Different structure
• PDB, protein data bank
• Secondary structure prediction is hard, tertiary structure prediction is even harder
• Use homologs whenever possible or different methods to assess quality