predict protein-protein interaction sites - rostlab · © burkhard rost (tum munich) /110...
TRANSCRIPT
© Burkhard Rost (TUM Munich) /1101
title: Prediction of PPI2 Predict protein-protein interaction sitesshort title: pp2_ppi2
lecture: Protein Prediction 2 - Protein function TUM Winter 2013/2014
© Burkhard Rost (TUM Munich) /110
Announcements
Videos: SciVe / www.rostlab.orgTHANKS : Tim Karl + Jona ReebSpecial lectures:
• Oct 29: Tobias Hamp• Nov 21: Tanya Goldberg• Nov 28: Arthur Dong• Dec 03+05: Marco De Vivo/Marco Punta• Dec 17+19: Andrea Schafferhans
No lecture:• Oct 10 Thu (Reformation)• Nov 12 Tue (Student assembly)• Dec 12 Thu (TUM Dies Academicus)
LAST lecture: Jan 30Examen: Feb 4 or Feb 6 (likely this room)
• Makeup: Apr 9 - morning
CONTACT: Marlena Drabik [email protected]
2
© Burkhard Rost (TUM Munich) /110
IV.Predict protein interactions
3
© Burkhard Rost (TUM Munich) /110
ISIS
4
© Burkhard Rost (TUM Munich) /110
IV.1a protein interactionsProtein-protein
interactions (PPI):terminology
5
© Burkhard Rost (TUM Munich) /110
Different interfaces = different physics?
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
gp120
CD4
antibody-1antibody-2
HIV gp120 / CD4 / FAB
6
© Burkhard Rost (TUM Munich) /110
Protein association
7
A activates B activates C activates D activates ....
ABCDare
associated
© Burkhard Rost (TUM Munich) /1108
Physical interaction NOT association
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
gp120
CD4
antibody-1antibody-2
HIV gp120 / CD4 / FAB
YES
YES
NO
© Burkhard Rost (TUM Munich) /110
IV.1b protein interactionsPPI - homology-based inference
9
© Burkhard Rost (TUM Munich) /110
Sarah Gilman
10
© Burkhard Rost (TUM Munich) /11011
Physical interaction NOT association
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
gp120
CD4
antibody-1antibody-2
HIV gp120 / CD4 / FAB
YES
YES
NO
© Burkhard Rost (TUM Munich) /110
A B
A’ ? B’
similarity > X similarity > X
Can we transfer binding through homology?
12
Obviously, otherwise no value in model organisms ...
©Sven Mika & Burkhard Rost (Columbia New York)S Mika & B Rost 2006 PLoS Genetics, Vol 2, e29
© Burkhard Rost (TUM Munich) /110
Inter and Intra-species the same?
13
similarity > X Worm
A B
A’ B’?
A’’
similarity > X
B’’ Human?
©Sven Mika & Burkhard Rost (Columbia New York)
© Burkhard Rost (TUM Munich) /110
Much better intra-species
14S Mika & B Rost 2006 PLoS Genetics, Vol 2, e29
more similarless similar
Worm
Human
A B
?A’’ B’’
?A’ B’
© Burkhard Rost (TUM Munich) /110
Much better intra-species
15S Mika & B Rost 2006 PLoS Genetics, Vol 2, e29
more similarless similar
Worm
Human
A B
?A’’ B’’
?A’ B’
© Burkhard Rost (TUM Munich) /110
Much better intra-species
16S Mika & B Rost 2006 PLoS Genetics, Vol 2, e29
more similarless similar
Fruitfly (Drosophila Melanogaster)
© Burkhard Rost (TUM Munich) /110
Why?
17
© Burkhard Rost (TUM Munich) /110
Why?
17
© Burkhard Rost (TUM Munich) /110
“Paralogs” conserve interactions
“orthologs” don’t?
18
© Burkhard Rost (TUM Munich) /110
Model organisms pose problems for protein-protein interactions
19S Mika & B Rost 2006 PLoS Genetics, Vol 2, e29
© Burkhard Rost (TUM Munich) /110
IV.2 protein interactionsPPI de novo?
20
© Burkhard Rost (TUM Munich) /11021
1999: Want to predict protein-protein partners
Simple method for prediction:
© Burkhard Rost (TUM Munich) /11022
Road to predicting protein-protein partners
Implement simple method to do this failed entirely: too many false positives
Reduce false positives:
predict surface residues (PROFacc, 1999) note: 1/2 of residues -> 1/4 of false positives!
© Burkhard Rost (TUM Munich) /11023
Prediction of solvent accessibility
Inside
solvent
coreprotein
• 50% of residues somehow accessible to solvent
• 10% not at all
© Burkhard Rost (TUM Munich) /11024
Road to predicting protein-protein partners
Implement simple method to do this failed entirely: too many false positives
Reduce false positives:
predict surface residues (PROFacc, 1999) note: 1/2 of residues -> 1/4 of false positives!
© Burkhard Rost (TUM Munich) /11025
Road to predicting protein-protein partners
Implement simple method to do this failed entirely: too many false positives
Reduce false positives:
predict surface residues (PROFacc, 1999) note: 1/2 of residues -> 1/4 of false positives!
predict residues in external interfaces (ISIS, 2004)
© Burkhard Rost (TUM Munich) /11026
Predict protein-protein binding partners
Reducing false positives:predict surface residues (PROFacc, 1999)
predict residues in external interfaces (ISIS, 2004)
predict residues saturated internally (PROFcon, 2004)
localization (e.g. only all nuclear, LOCtree, 2004)
© Burkhard Rost (TUM Munich) /110
Reduction by localization
27
Extra-cellular
Cytoplasm
Organelles
Mitochondria Nuclear
TMtransmembrane
Extra-cellular
Cytoplasm
Organelles
Mitochondria
Nuclear
TMtransmembrane
Y Ofran et al. & Rost 2006 Bioinformatics 22:e402-7
e.g. nuclear:=15% nuclear + 15% cytoplasm=> • come in with nuclear protein: maximally 30% of proteins to test
© Burkhard Rost (TUM Munich) /110
Predict subcellular localization: LOCtree 2: 18 classes!
28
TatyanaGoldberg
T Goldberg, T Hamp & B Rost (2012) submitted
k-mer profile kernel SVM
SVM$
SVM$SVM$SVM$
SVM$SVM$SVM$SVM$ SVM$
SVM$
SVM$SVM$
SVM$
SVM$
SVM
SVM$
SVM$
SVM$ SVM$
SVM$
SVM
SVM$
SVM$
SVM$
SVM$
SVM$SVM$SVM
SVM$
SVM$
SVM$
SVM$SVM$SVM$
SVM$SVM$SVM$SVM$
SVM$
SVM$ER$
EXT$ GOLGI$ NUC$CYT$VAC$
PERO$
MITO$
CHLO$PLAST$
PLAS$ GOLGI$ VAC$
ER$ NUC$
PERO$
MITO$CHLO$
SVM$SVM$
Non$membrane$
Eukaryo1c$Protein$Sequence$
Transmembrane$
Intra:cellular$ Secretory$pathway$ Intra:cellular$
TobiasHamp
B Alberts et al. 1994 The Cell Garland
© Burkhard Rost (TUM Munich) /11029
Predict protein-protein binding partners
Reducing false positives:predict surface residues (PROFacc, 1999)
predict residues in external interfaces (ISIS, 2004)
predict residues saturated internally (PROFcon, 2004)
localization (e.g. only all nuclear, LOCtree, 2004)
predict residues in protein-substrate interfaces (active)
© Burkhard Rost (TUM Munich) /11030
Predict protein-protein binding partners
Reducing false positives:predict surface residues (PROFacc, 1999)
predict residues in external interfaces (ISIS, 2004)
predict residues saturated internally (PROFcon, 2004)
localization (e.g. only all nuclear, LOCtree, 2004)
predict residues in protein-substrate interfaces (active)
predict protein domains/improve alignments
© Burkhard Rost (TUM Munich) /11031
Predict protein-protein binding partners
Reducing false positives:predict surface residues (PROFacc, 1999)
predict residues in external interfaces (ISIS, 2004)
predict residues saturated internally (PROFcon, 2004)
localization (e.g. only all nuclear, LOCtree, 2004)
predict residues in protein-substrate interfaces (active)
predict protein domains/improve alignments
Put it all together and predict binding partners
TobiasHamp
© Burkhard Rost (TUM Munich) /110
IV.3 protein interactions sites
PPI - data collection
32
© Burkhard Rost (TUM Munich) /110
Different interfaces, different physics!
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
gp120
CD4
antibody-1antibody-2
HIV gp120 / CD4 / FAB
33
© Burkhard Rost (TUM Munich) /110
Different interfaces, different physics!
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
gp120
CD4
antibody-1antibody-2
HIV gp120 / CD4 / FAB
At least 6 types of interfaces differ in sequence!
Internal (inter-domain and intra-domain)External homomers (permanent/transient)External heteromers (permanent/transient)
Y Ofran & B Rost (2003) J Mol Biol 325, 377-87
33
© Burkhard Rost (TUM Munich) /110
How to answer the question?
34
??
???
© Burkhard Rost (TUM Munich) /110
PDB: numbers
Molecules of experimentally determined structure (3D co-ordinates)www.pdb.org• check out: Molecule of the MonthStat 2010/04:• ~65,000 structures• 60K proteins• 2K DNA/RNA• 3K complexes• 56K X-ray• 8K NMR• 0.3K Electron microscopy
35
© Burkhard Rost (TUM Munich) /110
extract interactionshow?
36
© Burkhard Rost (TUM Munich) /110
How 1: take all or subset?
37
© Burkhard Rost (TUM Munich) /110
Remove redundancy
38
PDB
CD-Hit / UniqueProt, e.g. 70% PIDE
© Burkhard Rost (TUM Munich) /110
Histogram of protein family sizes
39
2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34
#
© Burkhard Rost (TUM Munich) /110
Histogram of protein family sizes
39
2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34
#
?
© Burkhard Rost (TUM Munich) /110
Histogram of protein family sizes
40
2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34
#
© Burkhard Rost (TUM Munich) /110
Zipf law / Power law
41
2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34
Wikipedia:George Kingsley Zipf (Jan 2, 1902-Sep 25, 1950; Harvard)
#
© Burkhard Rost (TUM Munich) /110
Size of protein families (assumption)
42
Wikipedia:George Kingsley Zipf (Jan 2, 1902-Sep 25, 1950; Harvard)
PDB “nature”
© Burkhard Rost (TUM Munich) /110
Size of protein families: assumed relation
43
PDB1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35
“nature”
© Burkhard Rost (TUM Munich) /110
Size of protein families: assumed relation
43
PDB1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35
“nature”
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35
© Burkhard Rost (TUM Munich) /110
Problems with redundancy reduction
What approximates “nature”?
How far to move “is” toward “should be”?
44
© Burkhard Rost (TUM Munich) /110
Develop method
1. PDB->Unique
45
© Burkhard Rost (TUM Munich) /110
How 2:distance
46
© Burkhard Rost (TUM Munich) /110
Different interfaces = different physics
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
gp120
CD4
antibody-1antibody-2
HIV gp120 / CD4 / FAB
47
© Burkhard Rost (TUM Munich) /110
Different interfaces = different physics?
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
gp120
CD4
antibody-1antibody-2
HIV gp120 / CD4 / FAB
48
gp120
CD4
antibody-1antibody-2
<6.5 Å?
© Burkhard Rost (TUM Munich) /110
Develop method
1. PDB->Unique2. parse heavy atoms <6.5 Ångstrøm (0.65 nm)
49
gp120
CD4
antibody-1antibody-2
© Burkhard Rost (TUM Munich) /110
Different interfaces = different physics
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
gp120
CD4
antibody-1antibody-2
HIV gp120 / CD4 / FAB
50
© Burkhard Rost (TUM Munich) /110
PDB chains
Each atom in the PDB belongs to a residueEach residue belongs to a chainChains may have “breaks”
51
© Burkhard Rost (TUM Munich) /110
Different interfaces = different physics
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
gp120
CD4
antibody-1antibody-2
HIV gp120 / CD4 / FAB
52
© Burkhard Rost (TUM Munich) /110
Map PDB to annotations
53
PDB SWISS-PROT
BLAST 10-10, >90% PIDE over >90% of length
chain A maps to SPa, chain B maps to SPbif (SPa=SPb) assume A and B from same proteinelse A and B from two different proteins
© Burkhard Rost (TUM Munich) /110
Develop method
1. PDB->Unique2. parse heavy atoms <6.5 Ångstrøm (0.65 nm)3. map chains to SWISS-PROT, distinguish transient protein-protein interactions from others
54
© Burkhard Rost (TUM Munich) /110
Remove redundancy for transient PP
55
PDB-PP
CD-Hit / UniqueProt, e.g. 70% PIDE
© Burkhard Rost (TUM Munich) /110
Develop method
1. parse heavy atoms <6.5 Ångstrøm (0.65 nm)2. map chains to SWISS-PROT, distinguish transient protein-protein interactions from others3. PDB sub(PP)->Unique
56
© Burkhard Rost (TUM Munich) /110
Develop method
1. parse heavy atoms <6.5 Ångstrøm (0.65 nm)2. map chains to SWISS-PROT, distinguish transient protein-protein interactions from others3. PDB sub(PP)->Unique
NOW we have a data set and can apply machine learning
57
© Burkhard Rost (TUM Munich) /110
PPI interfaces use local segments
58Y Ofran & B Rost (2003) FEBS Lett 544, 236-9
© Burkhard Rost (TUM Munich) /110
1st question:external and
internal interactions the same biophysics?
59
© Burkhard Rost (TUM Munich) /110
Different interfaces = different physics?
60
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
HIV gp120 / CD4 / FAB
gp120
CD4
antibody-1antibody-2
Stru
ctur
e: H
endr
icks
on la
b
© Burkhard Rost (TUM Munich) /110
Different interfaces = different physics?
60
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
HIV gp120 / CD4 / FAB
gp120
CD4
antibody-1antibody-2
Stru
ctur
e: H
endr
icks
on la
b
At least 6 types of interfaces differ in sequence!
Internal (inter-domain and intra-domain)External homomers (permanent/transient)External heteromers (permanent/transient)
Y Ofran & B Rost (2003) J Mol Biol 325, 377-87
© Burkhard Rost (TUM Munich) /110
Interface types: composition
61Y Ofran & B Rost (2003) J Mol Biol 325, 377-87
gp120
CD4
antibody-1antibody-2
© Burkhard Rost (TUM Munich) /110
Interface types differ in composition
62Y Ofran & B Rost (2003) J Mol Biol 325, 377-87
gp120
CD4
antibody-1antibody-2
© Burkhard Rost (TUM Munich) /110
Interface types differ in composition
63Y Ofran & B Rost (2003) J Mol Biol 325, 377-87
gp120
CD4
antibody-1antibody-2
They obviously differ!But, are these differences meaningful?
© Burkhard Rost (TUM Munich) /110
How to answer the meaningful question?
64
??
???
© Burkhard Rost (TUM Munich) /11065
Are these differences statistically significant?
Y Ofran & B Rost 2005 submitted
© Burkhard Rost (TUM Munich) /11065
Are these differences statistically significant?Chi-square test:
known problem: small data setshere millions of points
Y Ofran & B Rost 2005 submitted
© Burkhard Rost (TUM Munich) /11065
Are these differences statistically significant?Chi-square test:
known problem: small data setshere millions of points
all differences < 10-300-> SIGNIFICANT
Y Ofran & B Rost 2005 submitted
© Burkhard Rost (TUM Munich) /11065
Are these differences statistically significant?Chi-square test:
known problem: small data setshere millions of points
all differences < 10-300-> SIGNIFICANT
… unfortunately also:proteins [a-b] vs [c-d]1 vs 2 authors random subsets ...
Y Ofran & B Rost 2005 submitted
© Burkhard Rost (TUM Munich) /11066
Find-self test (statistical significance)
Y Ofran & B Rost 2005 submitted
© Burkhard Rost (TUM Munich) /110
Find-self test on six types of interfaces
67Y Ofran & B Rost (2003) J Mol Biol 325, 377-87
gp120
CD4
antibody-1antibody-2
© Burkhard Rost (TUM Munich) /110
IV.4 protein interactionsPPI predict binding
sites
68
© Burkhard Rost (TUM Munich) /110
Machine learning
69
© Burkhard Rost (TUM Munich) /110
how to choose the input features?
70
© Burkhard Rost (TUM Munich) /110
ask your friend(ideally in the group)
71
© Burkhard Rost (TUM Munich) /110
Strength of prediction reflects reliability?
72
© Burkhard Rost (TUM Munich) /110
Strength of prediction reflects reliability?
0.6
0.4
0.9
0.1
weakstrong
72
© Burkhard Rost (TUM Munich) /110
More complex system to predict structure
Sequence PSI-BLAST FilterPROFsecPROFacc
1999
73
© Burkhard Rost (TUM Munich) /11074
What it takes to realize this as a server
2004
PDB SWISS-PROT TrEMBL HSSP
© Burkhard Rost (TUM Munich) /110
Much more complex system for function
75
© Burkhard Rost (TUM Munich) /110
Much more complex system for function
75
2004
© Burkhard Rost (TUM Munich) /110
Much more complex system for function
75
2004
© Burkhard Rost (TUM Munich) /110
Few features
76
Profilepredicted 1D structure• secondary structure• solvent accessibility• membrane regions• disorderpredicted aspects of function
© Burkhard Rost (TUM Munich) /110
Publish not to perish -
tie it up
77
© Burkhard Rost (TUM Munich) /110
Are we there yet?
78
© Burkhard Rost (TUM Munich) /110
Let neural networks figure it out ...
79
© Burkhard Rost (TUM Munich) /110
Let neural networks figure it out ...
79
Train
Test
© Burkhard Rost (TUM Munich) /110
Cross-validation: how?
80
PDB-PP
CD-Hit / UniqueProt, e.g. 70% PIDE
© Burkhard Rost (TUM Munich) /110
Random split not enough
81
© Burkhard Rost (TUM Munich) /110
avoid overlap training/cross-
training vs. testing
82
© Burkhard Rost (TUM Munich) /110
Now, are we there yet?
83
© Burkhard Rost (TUM Munich) /110
PP interfaces predicted from sequence
Y Ofran & B Rost 2003 FEBS Lett 544:236-9Y Ofran & B Rost 2007 Bioinformatics e13-16
84
© Burkhard Rost (TUM Munich) /110
0.6
0.4
0.9
0.1
weakstrong
85
Strength of prediction reflects reliability?
© Burkhard Rost (TUM Munich) /110
PP interfaces predicted from sequence
Y Ofran & B Rost 2003 FEBS Lett 544:236-9Y Ofran & B Rost 2007 Bioinformatics e13-16
86
© Burkhard Rost (TUM Munich) /110
PP interfaces predicted from sequence
87
Accuracy:>94% for 1 in 10>70% for 2 in 10
Y Ofran & B Rost (2003) FEBS Lett 544, 236-9
© Burkhard Rost (TUM Munich) /110
Successful prediction: skp1-skp2
88
Uniquitin ligase skp1-skp2 complexSchulman, B.A. et al. (2000) Nature 408, 381-6
Green: 2 correctly predicted residues
(pocket binding TRP109 of SKP-2 F-box protein)
Y Ofran & B Rost (2003) FEBS Lett 544, 236-9
Accuracy:>94% for 1 in 10>70% for 2 in 10
© Burkhard Rost (TUM Munich) /110
Prediction system
Level 1:Neural networksinput: alignment profile/predicted secondary structure + accessibility (PROF)/predicted sequence complexity/overall features (protein length, amino acid composition, asf.)output: 2 units: is or is not P=P
Level 2:Neural networks using input from previous levelLevel 3:simple clustering
89
© Burkhard Rost (TUM Munich) /110
PPI interfaces use local segments
90Y Ofran & B Rost (2003) FEBS Lett 544, 236-9
© Burkhard Rost (TUM Munich) /110
PP interfaces predicted from sequence
Y Ofran & B Rost 2003 FEBS Lett 544:236-9Y Ofran & B Rost 2007 Bioinformatics e13-16
91
© Burkhard Rost (TUM Munich) /110
PP interfaces predicted from sequence
Y Ofran & B Rost 2003 FEBS Lett 544:236-9Y Ofran & B Rost 2007 Bioinformatics e13-16
92
© Burkhard Rost (TUM Munich) /110
PPI hot spots?
93
© Burkhard Rost (TUM Munich) /110
Interaction HOT SPOTS
94
residues that are essential for protein-protein interactionsoperational: • 1. residue in the interface• 2. mutation of the residue knocks out interaction
© Burkhard Rost (TUM Munich) /110
PP interfaces predicted from sequence
Very strong =
hot spots?
Y Ofran & B Rost 2003 FEBS Lett 544:236-9Y Ofran & B Rost 2007 Bioinformatics e13-16
95
© Burkhard Rost (TUM Munich) /110
Prediction of hot spots for CD4
• alanine scan for V1 domain of CD4 (bound to gp120)(A Ashkenazi et al. & DJ Capon (1990) PNAS 87, 7150)red: observed
Y Ofran & B Rost 2007 PLoS CB 3:e11996
© Burkhard Rost (TUM Munich) /110
Prediction of hot spots for CD4
• alanine scan for V1 domain of CD4 (bound to gp120)(A Ashkenazi et al. & DJ Capon (1990) PNAS 87, 7150)red: observed
• structure:PD Kwong et al. & WA Hendrickson (2000) Structure 8, 1329-1339.
purple: predicted(Y Ofran & B Rost (2006) ISIS submitted)
Y Ofran & B Rost 2007 PLoS CB 3:e11996
© Burkhard Rost (TUM Munich) /110
enough to publish?
97
© Burkhard Rost (TUM Munich) /110
Hot spots reliably predicted from sequence!
worst:~60%right
hottest of hot = no error!
Y Ofran & B Rost 2007 PLoS CB 3:e11998
© Burkhard Rost (TUM Munich) /11099
What makes it work?
Evolutionary information:• Optimally choosing profile
• Explicitly using conserved residues
(Predicted) 1D Structure important: good prediction + used correctly• Surface residues
• Secondary structure
Mark low-complexity and sticky
Filtering “isolated predictions”
© Burkhard Rost (TUM Munich) /110
Hot spots prediction requires full information
100
0 10 20 30 40 50 60 70 80 90
89
85
36
35
12
Sequence+Evolution+Exp. structure
Sequence+Evolution+Pred. structure
Evolution only
Sequence only
Hydrophobic Moment
Y Ofran & B Rost 2007 PLoS CB 3:e119
© Burkhard Rost (TUM Munich) /110
Functionally important residues - interactions sites
101
© Marco Punta & Yanay Ofran & Burkhard Rost (Columbia New York)
..LNDRA. ..LNDRA...---P-.
Y Ofran & B Rost 2007 PLoS CB 3:e119
© Burkhard Rost (TUM Munich) /110
Find non-homologous competitive binder
102
© Burkhard Rost (TUM Munich) /110
IV.5 protein interactionsPPI - hubs
103
© Burkhard Rost (TUM Munich) /110104
Ta-Tsen Soong
Eyal MozesSara Gilman
© Burkhard Rost (TUM Munich) /110
Connect micro- and macro-level
105
micro level:residuesRIGHT: more hotspots
macro level:networks
UP: more partners
© Burkhard Rost (TUM Munich) /110
Connect micro- and macro-level
105
micro level:residuesRIGHT: more hotspots
macro level:networks
UP: more partners
© Burkhard Rost (TUM Munich) /110
Date- and Party-hubs
106
Hubs: promiscuous proteins
Date/Party hubsNotation introduced by Marc VidalJD Han et al. & M Vidal 2004 Nature 430:88-93
• Date hubs interactions at different times/same location?
• Party hubs interactions at same time/different location
© Burkhard Rost (TUM Munich) /110
More hotspots -> more party-hub like!
107
Non-hubsParty hubsDate hubs
micro: more hotspots
macro: more partners
Y Ofran, A Schlessinger & B Rost submitted
© Burkhard Rost (TUM Munich) /110
More hotspots -> more party-hub like!
107
0
0.125
0.250
0.375
0.500
926
42
Non-hubsParty hubsDate hubs
micro: more hotspots
macro: more partners
Y Ofran, A Schlessinger & B Rost submitted
© Burkhard Rost (TUM Munich) /110
More hotspots -> more party-hub like!
107
0
0.125
0.250
0.375
0.500
926
42
Non-hubsParty hubsDate hubs
00.10.20.30.40.5
9 26 42
micro: more hotspots
macro: more partners
Y Ofran, A Schlessinger & B Rost submitted
© Burkhard Rost (TUM Munich) /110
More unstructured -> more date-hub like!
108
Non-hubsParty hubsDate hubs
micro: more hotspots
macro: more partners
Y Ofran, A Schlessinger & B Rost submitted
© Burkhard Rost (TUM Munich) /110
More unstructured -> more date-hub like!
108
0
0.14
0.28
0.42
0.56
0.70
NORSnet
Non-hubsParty hubsDate hubs
micro: more hotspots
macro: more partners
Y Ofran, A Schlessinger & B Rost submitted
© Burkhard Rost (TUM Munich) /110
Examples for Date & Party hubs
109Y Ofran, A Schlessinger & B Rost submitted
FUS3 MAP kinase - date hub (PDB 2b9f) right complex with MSG5binding motif (light blue) ISIS unstructured
ABC10-beta subunit of RNA polymerase - party hub (PDB 1r9sJ
right: RNA Polymerase II elongation complex (ABC10-
beta in red)
© Burkhard Rost (TUM Munich) /110
Lecture plan (PP2 function)01: 2013/10/15: no lecture02: 2013/10/17: welcome: who we are03: 2013/10/22: Intro - function 1: concepts04: 2013/10/24: Intro - function 2: homology05: 2013/10/29: Tobias Hamp: Homology-based prediction of function06: 2013/10/31: no lecture Reformation07: 2013/11/05: Tobias Hamp: Homology-based prediction of function 208: 2013/11/07: Intro - function 3: motifs09: 2013/11/12: no lecture: SVV (student reps)10: 2013/11/14: Localization 111: 2013/11/19: Localization 212: 2013/11/21: Localization 3 - Tatyana Goldberg13: 2013/11/26: Arthur Dong: Protein-protein interaction networks14: 2013/11/28: Localization 4 - Tatyana Goldberg15: 2013/12/03: Marco De Vivo (ISS Genoa) - Drug Design16: 2013/12/05: Marco Punta (Pfam, EBI) - Pfam17: 2013/12/10: Exercise/Presentations18: 2013/12/12: no lecture: Dies Academicus / Protein-protein interaction 119: 2013/12/17: Andrea Schafferhans: 3D function prediction20: 2013/12/19: Andrea Schafferhans: Docking21-24: no lectures - winter break (2013/12/23 - 2014/01/06)25: 2014/01/07: Protein-protein interaction 126: 2014/01/09: Protein-protein interaction 227: 2014/01/14: Protein-protein interaction 3 / Protein-DNA/RNA interaction28: 2014/01/16: Protein-DNA/RNA interaction 229: 2014/01/21: SNP effect 1 30: 2014/01/23: SNP effect 231: 2014/01/28: SNP effect 332: 2014/01/30: wrap-up33: 2014/02/04: examen34: 2014/02/06: no lecture
110