membrane bioinformatics ss09 1 v9 – orientation of tm helices -modelling 3d structures of helical...

Membrane Bioinformatics SS09

V9 – orientation of TM helices

- Modelling 3D structures of helical TM bundles
  Park, Staritzbichler, Elsner & Helms, Proteins (2004); Park & Helms, Proteins (2006)

- Beuming & Weinstein (2004)
  T. Beuming & H. Weinstein (2004) Bioinformatics 20, 1822

- Adamian & Liang (2006)
  L. Adamian & J. Liang (2006) BMC Struct. Biol. 6, 13

- TMX: predict lipid-accessible sides of TM helices from sequence
  Park & Helms, Bioinformatics (2007); Park, Hayat & Helms, BMC Bioinformatics (2007)

2

Structure modelling for helical membrane proteins

>P52202 RHO -- Rhodopsin.
MNGTEGPDFYIPFSNKTGVVRSPFEYPQYYLAEPWKYSALAAYMFMLIILGFPINFLTLYVTVQHKKLRSPLNYILLNLAVADLFMVLGGFTTTLYTSMNGYFVFGVTGCYFEGFFATLGGEVALWCLVVLAIERYIVVCKPMSNFRFGENHAIMGVVFTWIMALTCAAPPLVGWSRYIPEGMQCSCGVDYYTLKPEVNNESFVIYMFVVHFAIPLAVIFFCYGRLVCTVKEAAAQQQESATTQKAEKEVTRMVIIMVVSFLICWVPYASVAFYIFSNQGSDFGPVFMTIPAFFAKSSAIYNPVIYIVMNKQFRNCMITT
LCCGKNPLGDDETATGSKTETSSVSTSQVSPA

[Figure: rhodopsin in 1D (sequence), 2D, and 3D views. Sources: www.gpcr.org; EMBO Reports (2002)]


3

Design helical bundles using effective energy functions

Aim: assemble TM bundles

Glycophorin A dimer, Erb/Neu dimer, phospholamban pentamer

Method: scan 6-D conformational space of dimers of ideal helices


4

Example for parametrised energy function between 2 residues

docking of helix-dimers: energy scoring

search 5 degrees of freedom systematically.

score conformations by residue-residue energy function.
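A systematic search of this kind can be sketched as an exhaustive scan over a grid. The example below is only an illustration under stated assumptions: the five degrees of freedom, their grid ranges, and the `toy_energy` function are hypothetical placeholders and not the parametrised residue-residue energy function of Park et al.

```python
import itertools

def grid_search(score, grids):
    """Evaluate score() on every point of the Cartesian product of the grids
    and return (best_energy, best_point) for the lowest energy found."""
    best = None
    for point in itertools.product(*grids):
        energy = score(point)
        if best is None or energy < best[0]:
            best = (energy, point)
    return best

# Hypothetical degrees of freedom of helix B relative to a fixed helix A.
grids = [
    [7.0, 8.0, 9.0, 10.0],        # inter-axis distance (Angstrom)
    list(range(-60, 61, 20)),     # crossing angle (degrees)
    list(range(0, 360, 60)),      # rotation of helix A about its axis (degrees)
    list(range(0, 360, 60)),      # rotation of helix B about its axis (degrees)
    [-3.0, -1.5, 0.0, 1.5, 3.0],  # vertical shift along the axis (Angstrom)
]

def toy_energy(point):
    """Placeholder score with one minimum; NOT a real residue-residue potential."""
    target = (8.0, -40, 120, 240, 0.0)
    return sum((p - t) ** 2 for p, t in zip(point, target))

best_energy, best_point = grid_search(toy_energy, grids)
```

With these grids the scan evaluates 4 x 7 x 6 x 6 x 5 = 5040 conformations; a real scoring function would replace `toy_energy`.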

Park et al. Proteins (2004)

5

Test for Glycophorin A, dimer of two identical helices, NMR structure available

docking of helix-dimers

RMSD between the best model and the NMR structure is only 0.8 Å.
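For reference, the RMSD between two structures can be computed as below. This is a minimal sketch that assumes both coordinate lists contain the same atoms in the same order and have already been superimposed; the optimal superposition step itself is not shown.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z)
    coordinates that are assumed to be superimposed already."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))
```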

Energy landscape around the minimum: the minimum is truly the global minimum.

Park et al. Proteins (2004)

However, this is not the case for dimers in larger TMH proteins.

6

Need more/other information to orient helices

Early suggestion: TM proteins are "inside-out" proteins. That means that they are hydrophobic outside and hydrophilic inside.

compute hydrophobic moment = the direction of largest hydrophobicity

$$ \vec{\mu}_H = \frac{1}{N} \sum_{i=1}^{N} H(i) \left( \vec{r}_{C\alpha}(i) - \vec{r}_{proj}(i) \right) $$

Here, rproj(i) is the projection of the side chain onto the helical axis, i.e. the vector difference describes the shortest distance between residue i and the helix axis. H(i) is the hydrophobicity of residue i.

This method was introduced by David Eisenberg (1982, Nature)
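The hydrophobic moment can be sketched in a few lines. The example below makes several assumptions that are not stated on the slide: an ideal helix with 100° between consecutive residues (3.6 residues per turn), radial vectors of unit length in the plane perpendicular to the helix axis, and the Kyte-Doolittle scale as H(i).

```python
import math

# Kyte-Doolittle hydropathy values.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_moment(sequence, scale=KD, degrees_per_residue=100.0):
    """Return (magnitude, direction in degrees) of the mean
    hydrophobicity-weighted radial unit vector of an ideal helix."""
    sum_x = sum_y = 0.0
    for i, residue in enumerate(sequence):
        angle = math.radians(i * degrees_per_residue)
        sum_x += scale[residue] * math.cos(angle)
        sum_y += scale[residue] * math.sin(angle)
    mean_x, mean_y = sum_x / len(sequence), sum_y / len(sequence)
    direction = math.degrees(math.atan2(mean_y, mean_x)) % 360.0
    return math.hypot(mean_x, mean_y), direction
```

The direction is measured relative to the radial vector of the first residue; a uniform 18-residue helix (five full turns) has a moment of zero.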


7

role of hydrophobic moment

According to the concept of Eisenberg, all helices would orient their most hydrophobic side towards the bilayer.

However, this measure is quite imprecise (Park & Helms, Biopolymers 2006).

Hydrophobicity scales
ww: Wimley-White scale
eis: Eisenberg scale
ges: Goldman/Engelman/Steitz scale
kd: Kyte-Doolittle scale

Specialized scales
kP: kProt
bw: Beuming & Weinstein scale
tmlip1/2: Adamian & Liang


8

Beuming & Weinstein (2004): amino acid propensities

(1) Hydrophobic residues (A, I, L, V) make up 48.7% of all residues in TM proteins

(2) Charged residues (D, E, H, K, R) constitute only 5.5%

(3) Glycine (G) is relatively abundant

(4) Small residues (A, C, S, T) form 30.6%

(5) Aromatic residues (F, W, Y) represent 15.8%

(6) β-branched residues (T, I, V) form 24.9%

(7) Proline is a helix-breaker and is underrepresented

(8) Also, Cys, Gln, and Asn are rarely found.
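Class fractions like those listed above can be computed directly from TM segment sequences. The sketch below uses the residue classes from the slide; the two input segments are made-up examples, not data from Beuming & Weinstein.

```python
# Residue classes as listed on the slide (classes overlap, so fractions
# do not add up to 1).
RESIDUE_CLASSES = {
    "hydrophobic": "AILV",
    "charged": "DEHKR",
    "small": "ACST",
    "aromatic": "FWY",
    "beta_branched": "TIV",
}

def class_fractions(tm_segments):
    """Fraction of residues in each class over all given TM segments."""
    residues = "".join(tm_segments).upper()
    if not residues:
        raise ValueError("no residues given")
    return {
        name: sum(residues.count(aa) for aa in members) / len(residues)
        for name, members in RESIDUE_CLASSES.items()
    }

# Hypothetical toy segments, for illustration only.
fractions = class_fractions(["AILV", "DFWT"])
```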


9

amino acid propensities: conclusions

The overall amino acid composition deviates significantly from that of the whole genome.

Hydrophobic residues (A, F, G, I, L, M, V, W) occur more frequently in MPs than in whole genomes.

Conversely, residues C, D, E, K, N, P, Q, R are underrepresented in MPs.

H, S, T, and Y have equal distributions in MPs and whole genomes.


10

Beuming & Weinstein (2004): inside vs. outside

(1) Most of the exposed (lipid-facing) charged residues (D, E, K, H, R) that are found in TMs are located in the terminal regions (4.4%) rather than in the central region (2.7%).

(2) The exposed terminal parts are very rich in aromatic residues (21.3%) compared to the central part (16.1%).


11

Beuming & Weinstein (2004): surface propensity scale

Table shows the exposed fraction SF of each residue type i. Trp has the highest value of SF, His has the smallest value.

Normalize SP values with respect to His (SP=0) and Trp (SP=1).

$$ SP_X = \frac{SF_X - SF_{HIS}}{SF_{TRP} - SF_{HIS}} $$
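The normalization maps the SF values linearly so that His becomes 0 and Trp becomes 1. A minimal sketch follows; the SF numbers in it are invented placeholders, not the values from the Beuming & Weinstein table.

```python
def surface_propensity(sf):
    """Rescale exposed fractions SF so that His maps to 0 and Trp maps to 1."""
    low, high = sf["H"], sf["W"]
    return {aa: (value - low) / (high - low) for aa, value in sf.items()}

# Hypothetical SF values, for illustration only.
example_sf = {"W": 0.80, "L": 0.65, "G": 0.45, "H": 0.30}
sp = surface_propensity(example_sf)
```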


12

correlation of SP scale with other scales

Compute correlation coefficient.

SP propensity scale has high correlation with hydrophobicity or volume scales.

Combine SP scale with conservation index:

$$ CI_i = \sum_{a=1}^{n} f_{ia} \log \frac{f_{ia}}{p_a} $$

f_ia: frequency of residue type a at alignment position i

p_a: a priori distribution of residues
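A relative-entropy conservation index of this form can be sketched as follows. The natural logarithm, the uniform background distribution, and the gap-free column are assumptions made for the example; the slide does not specify them.

```python
import math
from collections import Counter

def relative_entropy_ci(column, background):
    """Sum over residue types a of f_a * log(f_a / p_a) for one alignment
    column. Residue types absent from the column contribute zero."""
    counts = Counter(column)
    n = len(column)
    return sum(
        (count / n) * math.log((count / n) / background[aa])
        for aa, count in counts.items()
    )

# Assumed uniform a priori distribution over the 20 amino acids.
uniform = {aa: 1.0 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}

conserved = relative_entropy_ci("LLLL", uniform)  # fully conserved column
mixed = relative_entropy_ci("LLVV", uniform)      # two residue types
```

A fully conserved column gives the maximum value log(20) under a uniform background, and more variable columns give lower values.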


13

Beuming & Weinstein (2004)

Add propensity score and conservation score:

total score(i) = SPi + CIi

Accuracy to detect the buried residues is ca. 70%.


14

Beuming & Weinstein (2004)

(top): correct SASA in X-ray structure

(middle): prediction based on amino-acid propensity + conservation (BEST!)

(bottom): prediction based only on conservation


15

Adamian & Liang (2006): interacting helices

Example for two interacting TM helices in succinate dehydrogenase.

Interacting residues follow a heptad motif.

Note the periodicity of 3.6 residues per turn in an ideal α-helix.


16

Adamian & Liang (2006)

Heptad motifs are generally preferred for interacting helix pairs.

For left-handed helix pairs, about 94.7% and 92.4% of interacting residues can be mapped to heptad repeats for parallel and anti-parallel helices, respectively.

For right-handed pairs the numbers are slightly lower.

Assume that the residues of lipid-accessible helices follow a similar pattern.


17

Adamian & Liang (2006)

Each TM helix has "7 faces".

A: the anchoring residues are 0, 7, 14, and 21; contacts are also formed by residues 3, 4, 10, 11, 17, 18
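The residue indices belonging to one helical face can be enumerated as below. This is a sketch under the assumption that every face follows the same pattern as face A on the slide (anchors every 7 residues, additional contacts at offsets +3 and +4); the function name and the 22-residue helix length are chosen for illustration.

```python
def heptad_face(face, helix_length):
    """Return (anchors, flanking) residue indices for one of the 7 faces.
    Anchors are face, face + 7, face + 14, ...; flanking contacts sit at
    offsets +3 and +4 from each anchor, as for face A on the slide."""
    anchors = list(range(face, helix_length, 7))
    flanking = [
        anchor + offset
        for anchor in anchors
        for offset in (3, 4)
        if anchor + offset < helix_length
    ]
    return anchors, flanking

anchors, flanking = heptad_face(0, 22)
```

For a 22-residue helix, face 0 reproduces the indices given above: anchors 0, 7, 14, 21 and contacts 3, 4, 10, 11, 17, 18.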


18

Adamian & Liang (2006)

Combine lipophilicity score Lf and positional entropy Ef of a helical face by simply multiplying them.


19

Adamian & Liang (2006): Test for TRP channel


20

Adamian & Liang (2006): discuss failures

Sometimes, binding sites for individual lipids (e.g. cardiolipin) are formed on the surfaces of TM proteins. Those residues will also be highly conserved, and the method will therefore fail.


21

What is needed for true de novo design of helical bundles?

Aim: explore new TM protein topologies.

distance-dependent residue-residue force field → generate energetically favorable geometries of helix dimers.

Overlap helix dimers → full protein structure.


22

Derivation of position scores

(1) For each test protein, collect 1000 similar sequences from the non-redundant database using the BLAST URLAPI.

(2) Generate an initial multiple sequence alignment (MSA) with ClustalW. Delete fragments < 80% of the length of the query sequence.

From these refined MSAs, apply 6 different % identity criteria → 6 final MSAs for each test protein.

Pei & Grishin: need to align ≥ 20 sequences to accurately estimate conservation indices from MSAs.
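The fragment filter and the minimum-size check can be sketched as below. The function names are hypothetical, and the sketch works on plain sequence strings in which "-" marks an alignment gap; it does not call BLAST or ClustalW.

```python
def filter_fragments(query, hits, min_fraction=0.8):
    """Keep hits whose ungapped length is at least min_fraction of the
    query length (the 80% criterion from the slide)."""
    min_length = min_fraction * len(query)
    return [hit for hit in hits if len(hit.replace("-", "")) >= min_length]

def enough_sequences(msa, minimum=20):
    """Check the rule of thumb that at least 20 aligned sequences are
    needed to estimate conservation indices."""
    return len(msa) >= minimum

# Toy input: a 10-residue query and four hits of different lengths.
kept = filter_fragments("A" * 10, ["A" * 10, "A" * 8, "A" * 7, "AAAA------"])
```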


23

predicting the TM-helix-orientation from sequences

Assumption: lipid-exposed positions are less conserved.

CI: conservation index in MSA

$$ CI_i = \frac{1}{\sigma} \left[ \left( \sum_j \bigl( f_j(i) - f_j \bigr)^2 \right)^{1/2} - \bar{C} \right] $$

fj(i): frequency of amino acid j in position i.

fj: frequency of amino acid j in the full alignment.

C̄: average conservation index (CI)

σ: standard deviation

Positive values: conserved positions

Negative values: variable positions

SASA: solvent accessible surface area, relative to a single, free helix

Test: correct orientation (0,0) has lowest score.
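A normalized, variance-based conservation index with the properties described above (positive for conserved, negative for variable positions) can be sketched as follows. The sketch assumes equally long aligned sequences, treats any gap character as an ordinary symbol, and uses the population standard deviation; these details are assumptions, not taken from the slide.

```python
import math
from collections import Counter
from statistics import mean, pstdev

def conservation_z_scores(msa):
    """Per-column conservation: sqrt(sum_j (f_j(i) - f_j)^2), converted to
    z-scores using the mean and standard deviation over all columns."""
    n_seq, length = len(msa), len(msa[0])
    overall = Counter("".join(msa))  # symbol counts in the full alignment
    total = n_seq * length
    raw = []
    for i in range(length):
        column = Counter(seq[i] for seq in msa)
        raw.append(math.sqrt(sum(
            (column[aa] / n_seq - overall[aa] / total) ** 2 for aa in overall
        )))
    mu, sigma = mean(raw), pstdev(raw)
    if sigma == 0:  # all columns equally conserved
        return [0.0] * length
    return [(value - mu) / sigma for value in raw]

# Toy alignment: column 0 is fully conserved, column 1 is fully variable.
z = conservation_z_scores(["LAA", "LVA", "LGV", "LSV"])
```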


24

Ab initio structure prediction of TM bundles

Aim: construct a structural model for a bundle of ideal transmembrane helices.

(1) Construct 12 good geometries for every helix pair AB, BC, CD, DE, EF, FG

(2) Overlay ABCDEFG; "thin out" the solution space containing ca. 12^6 models
    (a) remove "solutions" where helices collide with each other
    (b) delete non-compact "solutions"

(3) Score the remaining 10^6 solutions by sequence conservation

(4) Cluster the 500 best solutions into 8 models

(5) Rigid-body refinement; select the 5 models with the best sequence conservation.
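With 12 geometries for each of the six consecutive helix pairs, the raw number of combinations is 12^6, about 3 million. The enumeration and filtering step can be sketched as below; the predicate `is_acceptable` is a hypothetical stand-in for the collision and compactness checks, which are not implemented here.

```python
import itertools

PAIRS = ["AB", "BC", "CD", "DE", "EF", "FG"]
N_GEOMETRIES = 12  # good geometries kept per helix pair

def candidate_bundles(is_acceptable):
    """Lazily yield one geometry index per helix pair for every combination
    that passes the (user-supplied, hypothetical) filter."""
    for choice in itertools.product(range(N_GEOMETRIES), repeat=len(PAIRS)):
        if is_acceptable(choice):
            yield choice

n_total = N_GEOMETRIES ** len(PAIRS)  # 12**6 raw combinations

# Example with a filter that accepts everything: take the first candidate.
first = next(candidate_bundles(lambda choice: True))
```

Using a generator avoids holding all combinations in memory before the filters are applied.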


25

Rigid-body refinement


26

Compare best models with X-ray structures

dark: model; light: X-ray structure

Panels: Halorhodopsin, Bacteriorhodopsin, Sensory Rhodopsin, Rhodopsin, NtpK

Additional input: known connectivity of the helices A-B-C-D-E-F-G.

Otherwise, the search space would have been too large.


27

Comparing the best models with X-ray structures


28

Can one select the best model?

These are our 4 best non-native models of bR.

Because contact between A and E was not imposed, very different topologies were obtained.

In 2006, our methods could not distinguish between these models, but they could serve as input for further experiments.


29

“Success case”: True de novo model of 4-helix bundle


30

Predicting lipid-exposure


31

Predicting lipid-exposure

Aim: derive an optimal scale to predict exposure of residues to the hydrophobic part of the lipid bilayer.

The scale should optimally correlate with SASA → minimize the squared error.

Y: SASA values of the training set (N = 2901 residue positions)

X: profile of residue frequencies from multiple sequence alignment (N × 21 matrix)

Sought: propensity scale for 20 amino acids + 1 intercept value (21 values)

Solution for minimization task
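The formula on the original slide is not recoverable from the extracted text. For a squared-error criterion of this kind, the standard ordinary least-squares solution is shown below as a reference; the symbol β for the 21-component scale vector is an assumption, as is the convention that the intercept corresponds to a constant column in X.

```latex
\hat{\beta}
  = \arg\min_{\beta} \, \lVert Y - X\beta \rVert^{2}
  = \left( X^{T} X \right)^{-1} X^{T} Y ,
\qquad
X \in \mathbb{R}^{N \times 21},\;
Y \in \mathbb{R}^{N},\;
\beta \in \mathbb{R}^{21}
```

The closed form requires that the 21 × 21 matrix X^T X is invertible.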


32

What does MO scale capture?


33

Improved prediction of exposure by statistical learning

Prediction method        Prediction accuracy [%]
Beuming & Weinstein      68.7
TMX                      78.7
Yuan ... Teasdale        71.1

Beuming & Weinstein (2004) method


34

Improved method by statistical learning

The theory of Support Vector Classifiers evolves from the simpler case of optimal separating hyperplanes that, while separating two separable classes, maximize the distance between the separating hyperplane and the closest point from either class.

A: The two classes can be fully separated by a hyperplane, and the optimal separating hyperplane can be obtained by solving Eq. 9.

B: It is not possible to separate the two classes with a hyperplane, and the optimal hyperplane can be obtained by solving Eq. 17.
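The idea of a soft-margin linear classifier can be illustrated with a small example. Note the swapped-in technique: the sketch below minimizes the hinge loss plus an L2 penalty by plain sub-gradient descent; it does not solve the quadratic programs of Eq. 9 or Eq. 17 and is not the TMX implementation. The parameter values and the toy data are arbitrary choices.

```python
def train_linear_svm(points, labels, lam=0.01, eta=0.01, epochs=200):
    """Sub-gradient descent on (lam/2)*|w|^2 + hinge loss; labels are +1/-1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 penalty shrinks w at every step.
            w = [wi - eta * lam * wi for wi in w]
            # Points inside the margin (or misclassified) pull the plane.
            if margin < 1:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b

def predict(w, b, x):
    """Classify x by the side of the hyperplane w.x + b = 0 it falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy, linearly separable data (case A above).
points = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
```

In the non-separable case B the same objective still applies: points that cannot reach a margin of 1 keep contributing to the hinge loss instead of making the problem infeasible.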


Stockholm Univ. Sept. 2008

35



36



37



38



39

http://service.bioinformatik.uni-saarland.de/tmx/

Input: putative TM helices

TopoView draws snake plot

Master thesis: Nadine Schneider


40


Top: TMD 11, Bottom: TMD 12

41


Top: TMD 5, Bottom: TMD 12

42

Summary TMX and related methods

Sequences of TM proteins reveal many powerful features to allow prediction of 2D- and 3D structural features, function, and oligomerization status.

The TMX server can predict lipid exposure with ca. 78% accuracy: http://service.bioinformatik.uni-saarland.de/tmx/

Possible applications:

(1) predict transporter pores

(2) predict lipid-exposed surface of TM proteins: correlate with different membrane composition → collaborate with us: do you have lots of solubility data?

(3) Conserved surface residues may indicate interaction sites

