Membrane Bioinformatics SS09
V9 – orientation of TM helices
- Modelling 3D structures of helical TM bundles
  Park, Staritzbichler, Elsner & Helms, Proteins (2004); Park & Helms, Proteins (2006)
- Beuming & Weinstein (2004)
  T. Beuming & H. Weinstein (2004) Bioinformatics 20, 1822
- Adamian & Liang (2006)
  L. Adamian & J. Liang (2006) BMC Struct. Biol. 6, 13
- TMX: predict lipid-accessible sides of TM helices from sequence
  Park & Helms, Bioinformatics (2007); Park, Hayat & Helms, BMC Bioinformatics (2007)
Structure modelling for helical membrane proteins
>P52202 RHO -- Rhodopsin.
MNGTEGPDFYIPFSNKTGVVRSPFEYPQYYLAEPWKYSALAAYMFMLIILGFPINFLTLYVTVQHKKLRSPLNYILLNLAVADLFMVLGGFTTTLYTSMNGYFVFGVTGCYFEGFFATLGGEVALWCLVVLAIERYIVVCKPMSNFRFGENHAIMGVVFTWIMALTCAAPPLVGWSRYIPEGMQCSCGVDYYTLKPEVNNESFVIYMFVVHFAIPLAVIFFCYGRLVCTVKEAAAQQQESATTQKAEKEVTRMVIIMVVSFLICWVPYASVAFYIFSNQGSDFGPVFMTIPAFFAKSSAIYNPVIYIVMNKQFRNCMITTLCCGKNPLGDDETATGSKTETSSVSTSQVSPA
[Figure: from 1D (sequence) to 2D (topology) to 3D (structure). Sources: www.gpcr.org; EMBO Reports (2002)]
Design helical bundles using effective energy functions
Aim: assemble TM bundles
Glycophorin A dimer, Erb/Neu dimer, phospholamban pentamer
Method: scan 6-D conformational space of dimers of ideal helices
Docking of helix dimers: energy scoring
Example of a parametrised energy function between 2 residues.
Search 5 degrees of freedom systematically.
Score conformations by a residue-residue energy function.
Park et al. Proteins (2004)
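The systematic search described above can be sketched as a plain grid scan. This is only an illustration: the slides do not say which 5 degrees of freedom are scanned, so the choice here (distance, crossing angle, two axial rotations, axial shift), the grid ranges, and the function `toy_energy` are all assumptions. `toy_energy` is a placeholder bowl, not the residue-residue energy function of Park et al.

```python
import itertools

def frange(start, stop, step):
    """Inclusive list of evenly spaced floats from start to stop."""
    n = int(round((stop - start) / step))
    return [start + k * step for k in range(n + 1)]

def grid_search(energy, grids):
    """Evaluate `energy` on every grid point; return (best_energy, best_point)."""
    best = None
    for point in itertools.product(*grids):
        e = energy(*point)
        if best is None or e < best[0]:
            best = (e, point)
    return best

def toy_energy(d, omega, phi1, phi2, shift):
    # Hypothetical placeholder with an arbitrary minimum; a real scan would
    # sum a residue-residue energy over all residue pairs of the two helices.
    return ((d - 8.0) ** 2 + ((omega + 40.0) / 10.0) ** 2
            + ((phi1 - 120.0) / 30.0) ** 2 + ((phi2 - 120.0) / 30.0) ** 2
            + shift ** 2)

# Assumed degrees of freedom and grid ranges (not taken from the slides).
grids = [
    frange(6.0, 12.0, 1.0),     # inter-helical distance (Å)
    frange(-60.0, 60.0, 10.0),  # crossing angle (degrees)
    frange(0.0, 330.0, 30.0),   # rotation of helix 1 about its axis (degrees)
    frange(0.0, 330.0, 30.0),   # rotation of helix 2 about its axis (degrees)
    frange(-3.0, 3.0, 1.5),     # shift along the helix axis (Å)
]
best_energy, best_point = grid_search(toy_energy, grids)
```

The cost grows as the product of the grid sizes (here 7 × 13 × 12 × 12 × 5 points), which is why a systematic scan is only feasible for a handful of degrees of freedom.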
Docking of helix dimers
Test on Glycophorin A, a dimer of two identical helices for which an NMR structure is available.
The RMSD between the best model and the NMR structure is only 0.8 Å.
Energy landscape around the minimum: the minimum is a true global minimum.
However, this is not the case for helix dimers in larger TMH proteins.
Park et al. Proteins (2004)
Need more/other information to orient helices
Early suggestion: TM proteins are "inside-out" proteins, i.e. they are hydrophobic on the outside and hydrophilic on the inside.
Compute the hydrophobic moment, which points in the direction of largest hydrophobicity:
$$\vec{\mu}_H = \frac{1}{N} \sum_{i=1}^{N} H(i)\,\bigl(\vec{r}_C(i) - \vec{r}_{proj}(i)\bigr)$$
Here, r_proj(i) is the projection of the side chain of residue i onto the helical axis, so the vector difference r_C(i) − r_proj(i) runs along the shortest distance between the helix axis and residue i. H(i) is the hydrophobicity of residue i.
This method was introduced by David Eisenberg (1982, Nature).
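A minimal sketch of the hydrophobic moment for a sequence placed on an ideal helix. Assumptions not taken from the slides: residues are spaced by 100° around the axis (3.6 residues per turn), every residue sits at unit distance from the axis, and the Kyte-Doolittle scale supplies H(i). The function name `hydrophobic_moment` is made up for this example.

```python
import math

# Kyte-Doolittle hydropathy values (the "kd" scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_moment(seq, deg_per_residue=100.0):
    """Return (magnitude, direction in degrees) of the hydrophobic moment.

    Residue i is placed at angle i * deg_per_residue on a unit circle
    around the helix axis, so r_C(i) - r_proj(i) is a unit radial vector.
    """
    mx = my = 0.0
    for i, aa in enumerate(seq):
        phi = math.radians(i * deg_per_residue)
        mx += KD[aa] * math.cos(phi)
        my += KD[aa] * math.sin(phi)
    mx /= len(seq)
    my /= len(seq)
    return math.hypot(mx, my), math.degrees(math.atan2(my, mx)) % 360.0

magnitude, direction = hydrophobic_moment("LSLLKKLLSLLKKL")
```

A uniformly hydrophobic stretch covering whole turns (for example 18 identical residues) gives a moment close to zero, which is one reason the measure carries little information for many TM helices.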
Role of hydrophobic moment
According to the concept of Eisenberg, all helices would orient their most hydrophobic side towards the bilayer.
However, this measure is quite imprecise (Park & Helms, Biopolymers 2006).
Hydrophobicity scales
ww: Wimley-White scale
eis: Eisenberg scale
ges: Goldman/Engelman/Steitz scale
kd: Kyte-Doolittle scale
Specialized scales
kP: kProt
bw: Beuming & Weinstein scale
tmlip1/2: Adamian & Liang
Beuming & Weinstein (2004): amino acid propensities
(1) Hydrophobic residues (A, I, L, V) make up 48.7% of all residues in TM proteins.
(2) Charged residues (D, E, H, K, R) constitute only 5.5%.
(3) Glycine (G) is relatively abundant.
(4) Small residues (A, C, S, T) form 30.6%.
(5) Aromatic residues (F, W, Y) represent 15.8%.
(6) β-branched residues (T, I, V) form 24.9%.
(7) Proline is a helix-breaker and is underrepresented.
(8) Cys, Gln, and Asn are also rarely found.
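Group percentages like the ones listed above can be computed with a few lines of counting. The sketch below assumes the TM segments are given as plain one-letter strings. The group definitions are copied from the list, while the function name and the example segments are made up.

```python
# Residue groups as listed on the slide; note that the groups overlap
# (e.g. A is both hydrophobic and small), so fractions need not sum to 1.
GROUPS = {
    "hydrophobic": "AILV",
    "charged": "DEHKR",
    "small": "ACST",
    "aromatic": "FWY",
    "beta-branched": "TIV",
}

def group_fractions(tm_segments):
    """Fraction of all TM residues that fall into each residue group."""
    total = sum(len(segment) for segment in tm_segments)
    counts = {name: 0 for name in GROUPS}
    for segment in tm_segments:
        for aa in segment:
            for name, members in GROUPS.items():
                if aa in members:
                    counts[name] += 1
    return {name: counts[name] / total for name in GROUPS}

# Made-up toy segments, only to show the call.
fractions = group_fractions(["AILV", "KFTG"])
```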
Amino acid propensities: conclusions
The overall amino acid composition deviates significantly from that of the whole genome.
Hydrophobic residues (A, F, G, I, L, M, V, W) occur more frequently in MPs than in whole genomes.
Conversely, residues C, D, E, K, N, P, Q, R are underrepresented in MPs.
H, S, T, and Y have equal distributions in MPs and whole genomes.
Beuming & Weinstein (2004): inside vs. outside
(1) Most of the exposed (lipid-facing) charged residues (D, E, K, H, R) found in TMs are located in the terminal regions (4.4%) rather than in the central region (2.7%).
(2) The exposed terminal parts are very rich in aromatic residues (21.3%) compared to the central part (16.1%).
Beuming & Weinstein (2004): surface propensity scale
The table shows the fraction SF of exposed residues for each residue type. Trp has the highest value of SF, His the smallest.
Normalize the SF values to a surface propensity SP, with His at SP = 0 and Trp at SP = 1:
$$SP_X = \frac{SF_X - SF_{HIS}}{SF_{TRP} - SF_{HIS}}$$
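The normalization is a simple min-max rescaling between the His and Trp values. In the sketch below the SF numbers are invented placeholders (the real table values are not reproduced in this transcript). Only the arithmetic is meant to be illustrative.

```python
def normalize_sp(sf):
    """Rescale exposed fractions SF so that His maps to 0 and Trp to 1."""
    lo, hi = sf["H"], sf["W"]
    return {aa: (value - lo) / (hi - lo) for aa, value in sf.items()}

# Hypothetical SF values for three residue types, NOT the published table.
sf_demo = {"H": 0.30, "L": 0.55, "W": 0.80}
sp_demo = normalize_sp(sf_demo)
```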
Correlation of SP scale with other scales
Compute the correlation coefficient.
The SP propensity scale has a high correlation with hydrophobicity or volume scales.
Combine the SP scale with a conservation index computed from the alignment:
$$CI_i = \sum_{a} f_{ia} \log \frac{f_{ia}}{p_a}$$
f_ia: frequency of residue type a at alignment position i
p_a: a priori distribution of residues
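A conservation index of this relative-entropy form can be computed per alignment column as sketched below. Assumptions: natural logarithm (the slide does not give the base), gaps are ignored, and a uniform prior stands in for the a priori distribution p_a. The function name is made up.

```python
import math
from collections import Counter

def conservation_index(column, prior):
    """Relative entropy of one alignment column with respect to `prior`."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    return sum((c / n) * math.log((c / n) / prior[aa]) for aa, c in counts.items())

# Uniform prior as a stand-in; a real application would use background
# frequencies estimated from a sequence database.
uniform_prior = {aa: 1.0 / 20.0 for aa in "ACDEFGHIKLMNPQRSTVWY"}

ci_conserved = conservation_index("AAAA", uniform_prior)  # fully conserved column
ci_mixed = conservation_index("ALAL", uniform_prior)      # two residue types
```

With a uniform prior, a fully conserved column reaches the maximum value log(20), and more variable columns score lower.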
Beuming & Weinstein (2004)
Add propensity score and conservation score:
total score(i) = SP_i + CI_i
The accuracy in detecting buried residues is ca. 70%.
Beuming & Weinstein (2004)
(top): correct SASA in the X-ray structure
(middle): prediction based on amino-acid propensity + conservation (BEST!)
(bottom): prediction based only on conservation
Adamian & Liang (2006): interacting helices
Example of two interacting TM helices in succinate dehydrogenase.
Interacting residues follow a heptad motif.
Note the periodicity of 3.6 residues per turn in an ideal α-helix.
Adamian & Liang (2006)
Heptad motifs are generally preferred for interacting helix pairs.
For left-handed helix pairs, about 94.7% and 92.4% of interacting residues can be mapped to heptad repeats for parallel and anti-parallel helices, respectively.
For right-handed pairs the numbers are slightly lower.
Assume that the residues of lipid-accessible helices follow a similar pattern.
Adamian & Liang (2006)
Each TM helix has "7 faces".
A: the anchoring residues are 0, 7, 14, and 21; contacts are also formed by residues 3, 4, 10, 11, 17, 18.
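The residue pattern of one face can be enumerated as follows. This sketch assumes that each face is defined by anchors spaced 7 residues apart plus the residues at anchor + 3 and anchor + 4, which reproduces the numbers listed above for a 22-residue helix. Whether the published method treats all seven faces exactly this way is an assumption, and the function name is made up.

```python
def heptad_face(start, helix_length):
    """Return (anchors, flanking contacts) for the face starting at `start`."""
    anchors = list(range(start, helix_length, 7))
    flanks = [a + d for a in anchors for d in (3, 4) if a + d < helix_length]
    return anchors, flanks

# The seven faces of a helix with 22 residues (positions 0..21).
faces = [heptad_face(start, 22) for start in range(7)]
```

For `start = 0` this gives anchors 0, 7, 14, 21 and contacts 3, 4, 10, 11, 17, 18, matching the face described on the slide.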
Adamian & Liang (2006)
Combine lipophilicity score Lf and positional entropy Ef of a helical face by simply multiplying them.
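As a rough illustration of the multiplication step, the sketch below scores one helical face from per-position values. Several details are assumptions because the slide does not give them: the face-level lipophilicity and entropy are taken as plain averages over the face positions, the entropy is the Shannon entropy of the alignment column with natural logarithm, and the per-position lipophilicity values are invented.

```python
import math
from collections import Counter

def shannon_entropy(column):
    """Shannon entropy (natural log) of the residues in one alignment column."""
    n = len(column)
    counts = Counter(column)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def face_score(face_positions, columns, lipophilicity):
    """Product of mean lipophilicity (Lf) and mean positional entropy (Ef)."""
    k = len(face_positions)
    lf = sum(lipophilicity[p] for p in face_positions) / k
    ef = sum(shannon_entropy(columns[p]) for p in face_positions) / k
    return lf * ef

# Toy input: 4 alignment columns and hypothetical lipophilicity values.
columns = ["AAAA", "ALAL", "AAAA", "LLLL"]
lipophilicity = [1.0, 2.0, 1.0, 3.0]
variable_face = face_score([1], columns, lipophilicity)
conserved_face = face_score([0, 2], columns, lipophilicity)
```

A face that is both lipophilic and variable gets a high product, while a fully conserved face scores zero regardless of its lipophilicity.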
Adamian & Liang (2006): Test for TRP channel
Adamian & Liang (2006): discuss failures
Sometimes, binding sites for individual lipids (e.g. cardiolipin) are formed on the surfaces of TM proteins. Those residues will also be highly conserved, and the method will therefore fail.
What is needed for true de novo design of helical bundles?
Aim: explore new TM protein topologies.
Distance-dependent residue-residue force field → generate energetically favorable geometries of helix dimers.
Overlap helix dimers → full protein structure.
Derivation of position scores
(1) For each test protein, retrieve 1000 similar sequences from the non-redundant database using the BLAST URLAPI.
(2) Generate an initial multiple sequence alignment (MSA) with ClustalW. Delete fragments shorter than 80% of the length of the query sequence.
To this refined MSA, apply 6 different % identity criteria, which gives 6 final MSAs for each test protein.
Pei & Grishin: ≥ 20 sequences need to be aligned to estimate conservation indices accurately from MSAs.
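The filtering steps can be sketched on an MSA held as a list of equal-length gapped strings with the query in row 0. Assumptions: "% identity criteria" is read here as a minimum identity to the query, and the threshold values in the example are invented (the slide does not list the six criteria). BLAST and ClustalW are not called. The sketch starts from an existing alignment.

```python
def ungapped_length(aligned):
    """Number of residues (non-gap characters) in an aligned sequence."""
    return sum(1 for c in aligned if c != "-")

def percent_identity(query, other):
    """Percent identical residues over columns where neither row has a gap."""
    pairs = [(q, o) for q, o in zip(query, other) if q != "-" and o != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(q == o for q, o in pairs) / len(pairs)

def refine_msa(msa, min_length_fraction=0.8):
    """Drop fragments shorter than the given fraction of the query length."""
    min_len = min_length_fraction * ungapped_length(msa[0])
    return [row for row in msa if ungapped_length(row) >= min_len]

def split_by_identity(msa, thresholds):
    """One MSA per threshold: the query plus rows at least that similar to it."""
    query = msa[0]
    return {t: [query] + [r for r in msa[1:] if percent_identity(query, r) >= t]
            for t in thresholds}

# Toy alignment; thresholds are hypothetical.
toy_msa = ["ACDEFGHIKL", "ACDEFGHIKV", "AC--------", "ACDEYGHWKL"]
refined = refine_msa(toy_msa)
final_msas = split_by_identity(refined, [85, 75])
```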
Predicting the TM-helix orientation from sequences
CI: conservation index in MSA
SASA: solvent-accessible surface area, relative to a single, free helix
$$CI_i = \frac{1}{\sigma}\left(\sqrt{\sum_j \bigl(f_j(i) - f_j\bigr)^2} - \bar{C}\right)$$
f_j(i): frequency of amino acid j in position i
f_j: frequency of amino acid j in the full alignment
C̄: average conservation index
σ: standard deviation
Positive values: conserved positions
Negative values: variable positions
Assumption: lipid-exposed positions are less conserved.
Test: the correct orientation (0,0) has the lowest score.
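A minimal sketch of a normalized conservation index of this kind: a per-column deviation from the overall residue frequencies, converted to a z-score over all columns. Assumptions: gaps are ignored, no column consists only of gaps, the columns do not all have the same raw value, and the population standard deviation is used (the slide does not say which variant).

```python
import math
from collections import Counter
from statistics import mean, pstdev

def normalized_conservation(msa):
    """Z-scored conservation per column of an MSA given as gapped strings."""
    overall = Counter(aa for row in msa for aa in row if aa != "-")
    total = sum(overall.values())
    f = {aa: c / total for aa, c in overall.items()}  # f_j over the full alignment
    raw = []
    for i in range(len(msa[0])):
        col = Counter(row[i] for row in msa if row[i] != "-")
        n = sum(col.values())
        # sqrt of summed squared differences between f_j(i) and f_j
        raw.append(math.sqrt(sum((col.get(aa, 0) / n - f[aa]) ** 2 for aa in f)))
    c_bar, sigma = mean(raw), pstdev(raw)
    return [(c - c_bar) / sigma for c in raw]

# Toy alignment: columns 0 and 2 are conserved, column 1 is variable.
z = normalized_conservation(["AAL", "AVL", "AIL", "AGL"])
```

In this toy case the variable middle column gets a negative value and the conserved columns positive values, in line with the sign convention stated above.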
Ab initio structure prediction of TM bundles
Aim: construct a structural model for a bundle of ideal transmembrane helices.
(1) Construct 12 good geometries for every helix pair AB, BC, CD, DE, EF, FG.
(2) Overlay ABCDEFG and "thin out" the solution space containing ca. 12^6 models:
(a) remove "solutions" where helices collide with each other
(b) delete non-compact "solutions"
(3) Score the remaining 10^6 solutions by sequence conservation.
(4) Cluster the 500 best solutions into 8 models.
(5) Rigid-body refinement; select the 5 models with the best sequence conservation.
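The size of the solution space follows from combining one of 12 geometries for each of the 6 sequential helix pairs. The sketch below enumerates these combinations lazily and applies two filter predicates. `collides` and `is_compact` are hypothetical callables standing in for the real geometric checks, which need 3D coordinates.

```python
import itertools

PAIRS = ["AB", "BC", "CD", "DE", "EF", "FG"]
GEOMETRIES_PER_PAIR = 12

def candidate_bundles():
    """Lazily yield one geometry index per helix pair (a 6-tuple)."""
    return itertools.product(range(GEOMETRIES_PER_PAIR), repeat=len(PAIRS))

def thin_out(candidates, collides, is_compact):
    """Keep candidates that have no helix clashes and form a compact bundle."""
    for candidate in candidates:
        if not collides(candidate) and is_compact(candidate):
            yield candidate

n_models = GEOMETRIES_PER_PAIR ** len(PAIRS)  # 12^6 combinations

# Dummy predicates, only to show the call pattern.
survivors = thin_out(candidate_bundles(),
                     collides=lambda c: c[0] == 0,
                     is_compact=lambda c: True)
first_survivor = next(survivors)
```

Because only sequential pairs are combined, the count stays at 12^6 (about 3 million) instead of growing with all possible helix-helix pairings.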
Rigid-body refinement
Compare best models with X-ray structures
[Figure panels: Halorhodopsin, Bacteriorhodopsin, Sensory Rhodopsin, Rhodopsin, NtpK. Dark: model; light: X-ray structure]
Additional input: known connectivity of the helices A-B-C-D-E-F-G.
Otherwise, the search space would have been too large.
Comparing the best models with X-ray structures
Can one select the best model?
These are our 4 best non-native models of bR.
Because contact between helices A and E was not imposed, very different topologies were obtained.
In 2006, our methods could not distinguish between these models, but they could serve as input for further experiments.
“Success case”: True de novo model of 4-helix bundle
Predicting lipid-exposure
Aim: derive an optimal scale to predict the exposure of residues to the hydrophobic part of the lipid bilayer.
The scale should correlate optimally with SASA → minimize the squared error.
Y: SASA values of the training set (N = 2901 residue positions)
X: profile of residue frequencies from the multiple sequence alignment (N × 21 matrix)
β: wanted propensity scale for 20 amino acids + 1 intercept value (21 values)
Solution of the minimization task: the least-squares estimate of β.
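If the model is Y ≈ Xβ, minimizing the squared error leads to the normal equations XᵀXβ = XᵀY. The sketch below solves them in plain Python for a toy problem with two frequency columns plus an intercept. The data are invented, and whether the original work solved the problem in exactly this way is an assumption. For real data a library routine such as `numpy.linalg.lstsq` would be the usual choice.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def least_squares(x_rows, y):
    """Least-squares coefficients from the normal equations (X^T X) b = X^T y."""
    p = len(x_rows[0])
    xtx = [[sum(row[i] * row[j] for row in x_rows) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x_rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)

# Toy data: columns are (freq of type 1, freq of type 2, intercept = 1).
X = [[0.2, 0.5, 1.0], [0.6, 0.1, 1.0], [0.3, 0.3, 1.0], [0.1, 0.8, 1.0]]
Y = [32.0, 16.0, 23.0, 46.0]  # generated as 10*f1 + 50*f2 + 5
beta = least_squares(X, Y)
```

Note that if all 20 frequency columns sum to exactly 1 in every row, they are collinear with the intercept column and XᵀX becomes singular, so this simple solver would fail on such input.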
What does MO scale capture?
Improved prediction of exposure by statistical learning
Prediction method        Prediction accuracy [%]
Beuming & Weinstein      68.7
TMX                      78.7
Yuan ... Teasdale        71.1
[Figure: Beuming & Weinstein (2004) method]
Improved method by statistical learning
The theory of support vector classifiers evolves from the simpler case of optimal separating hyperplanes which, while separating two separable classes, maximize the distance between the separating hyperplane and the closest point from either class.
A: The two classes are fully separable by a hyperplane, and the optimal separating hyperplane can be obtained by solving Eq. 9.
B: It is not possible to separate the two classes with a hyperplane, and the optimal hyperplane can be obtained by solving Eq. 17.
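The quantity that the optimal separating hyperplane maximizes, the distance to the closest point of either class, can be evaluated directly for any candidate hyperplane w·x + b = 0. The sketch below only computes this margin. It does not train a support vector classifier, and the points and hyperplanes are made up.

```python
import math

def margin(w, b, points, labels):
    """Smallest signed distance of any point to the hyperplane w.x + b = 0.

    Labels are +1 or -1; a negative result means at least one point lies
    on the wrong side of the hyperplane.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x, y in zip(points, labels))

points = [(0.0, 0.0), (2.0, 0.0)]
labels = [-1, 1]
centered = margin((1.0, 0.0), -1.0, points, labels)  # plane x = 1
shifted = margin((1.0, 0.0), -0.5, points, labels)   # plane x = 0.5
```

Both hyperplanes separate the two points, but the centered one has the larger margin and is therefore the better separating hyperplane in the sense described above.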
Stockholm Univ. Sept. 2008
http://service.bioinformatik.uni-saarland.de/tmx/
Input: putative TM helices
TopoView draws a snake plot (Master thesis, Nadine Schneider)
[Figure] Top: TMD11, Bottom: TMD12
[Figure] Top: TMD5, Bottom: TMD12
Summary TMX and related methods
Sequences of TM proteins reveal many powerful features that allow prediction of 2D and 3D structural features, function, and oligomerization status.
The TMX server can predict lipid exposure with ca. 78% accuracy: http://service.bioinformatik.uni-saarland.de/tmx/
Possible applications:
(1) predict transporter pores
(2) predict the lipid-exposed surface of TM proteins: correlate with different membrane composition → collaborate with us; do you have lots of solubility data?
(3) conserved surface residues may indicate interaction sites