A Global View of the Protein Structure Universe and Protein Evolution Sung-Hou Kim University of California, Berkeley, CA U.S.A. June 27, 2006


Page 1

A Global View of the Protein Structure Universe and Protein Evolution

Sung-Hou Kim

University of California, Berkeley, CA, U.S.A.

June 27, 2006

Page 2

Topics

I. Global view of the protein structure universe

II. Mapping of protein functions on the structural universe

III. Global view of the evolution of proteins

Page 3

J. Hou

G. Sims

I.-G. Choi

S.-R. Jun

C. Zhang

Page 4

I. Mapping the Protein Structure Universe: Structural Demography

Page 5

The Protein Universe

• 500 – 20,000 genes per organism

• >13.6 × 10^6 species

• >10^10 – 10^12 protein sequences

but...

• ~10^5 protein sequence families

• ~10^4 protein structure families

• ~10^3 protein fold domain families

Page 6

“Mapping” by Metric Matrix Distance Geometry (Classical Multidimensional Scaling)

Pair-wise relational distances with “errors”

Most likely (consistent) global relational “mapping”

[Figure: four points x1, x2, x3, x4 and the six pairwise distances d1,2, d1,3, d1,4, d2,3, d2,4, d3,4 between them]
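For reference, the standard classical (Torgerson) multidimensional scaling computation that this kind of mapping relies on can be written as below. The symbols n, D^(2), J, B, V, Λ, and k are notation introduced here, not taken from the slide. The slide does not say how negative eigenvalues were handled, so keeping only the k largest positive ones is an assumption.

```latex
\begin{align*}
D^{(2)} &= \bigl[\, d_{i,j}^{2} \,\bigr]_{i,j=1}^{n},
  \qquad J = I_n - \tfrac{1}{n}\,\mathbf{1}\mathbf{1}^{\mathsf T}, \\
B &= -\tfrac{1}{2}\, J\, D^{(2)} J \;=\; V \Lambda V^{\mathsf T}, \\
X &= V_k\, \Lambda_k^{1/2},
  \qquad \lVert x_i - x_j \rVert \approx d_{i,j}.
\end{align*}
```

For the four points in the figure, n = 4 and D^(2) holds the squares of the six distances. If the distances are exactly Euclidean, B is positive semidefinite and the rows of X reproduce them. When the distances carry errors, some eigenvalues of B can be negative, and truncating to the k largest positive eigenvalues gives a best low-rank fit to B.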

Page 7

Method

• Take all protein structures in PDB (>35,000)

• Construct a non-redundant set at 25% sequence identity (~2000 structures)

• Calculate all-to-all pair-wise structural similarities, then convert to dissimilarity scores

• Apply metric matrix distance geometry to find the global position of each structure in N-dimensional space

• 3-D plot to capture the major features of the protein structure space
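The last three steps above can be sketched roughly as follows. This is a minimal illustration, not the speaker's actual pipeline: the slides do not state how similarity scores were converted to dissimilarities, so `similarity_to_dissimilarity` (maximum score minus score) is an assumed placeholder. Both function names are hypothetical, and the input is a toy matrix rather than real structure-comparison scores.

```python
import numpy as np

def similarity_to_dissimilarity(S):
    """ASSUMPTION: d_ij = max(S) - s_ij. The slides do not give the formula."""
    S = np.asarray(S, dtype=float)
    D = S.max() - S
    D = 0.5 * (D + D.T)          # enforce symmetry
    np.fill_diagonal(D, 0.0)     # a structure is at distance 0 from itself
    return D

def classical_mds(D, k=3):
    """Classical (metric matrix) MDS: embed n items in k dimensions."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered metric matrix
    eigvals, eigvecs = np.linalg.eigh(B)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    top = np.clip(eigvals[:k], 0.0, None)    # guard against negative values
    coords = eigvecs[:, :k] * np.sqrt(top)   # one row of coordinates per item
    return coords, eigvals

# Toy usage: 5 hypothetical structures with made-up similarity scores.
S_toy = np.array([
    [10, 8, 7, 2, 1],
    [8, 10, 6, 2, 2],
    [7, 6, 10, 3, 1],
    [2, 2, 3, 10, 9],
    [1, 2, 1, 9, 10],
], dtype=float)
coords_3d, spectrum = classical_mds(similarity_to_dissimilarity(S_toy), k=3)
```

Each row of `coords_3d` is the 3-D position of one structure, and `spectrum` holds the full eigenvalue list used to judge how many dimensions are worth keeping.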

Page 8

Protein Structure Distance Matrix (~2000 structures with <25% sequence identity)

[Figure: square matrix with rows and columns labeled P1, P2, P3, ..., P1898; the entry D3,4 is marked as an example]

Page 9

[Figure: eigenvalues (vertical axis, 0 to 2,500) plotted against eigenvalue index (horizontal axis, 1 to 16)]

Positional coordinates in 1898-dimensional space. Major feature extraction in 3 dimensions.
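One common way to judge how much of the full embedding a 3-D plot retains is the cumulative share of the positive eigenvalues. The sketch below illustrates that idea only: `cumulative_fraction` is a hypothetical helper, discarding negative eigenvalues is an assumption, and the numbers in the usage line are made up rather than read off the slide.

```python
import numpy as np

def cumulative_fraction(eigenvalues):
    """Cumulative share of the positive eigenvalues, largest first."""
    ev = np.asarray(eigenvalues, dtype=float)
    positive = np.sort(ev[ev > 0])[::-1]   # ASSUMPTION: drop non-positive values
    return np.cumsum(positive) / positive.sum()

# Made-up spectrum for illustration only (not the values from the slide).
share = cumulative_fraction([4.0, 3.0, 2.0, 1.0, -0.5])
```

With these made-up values, `share[2]` is 0.9, meaning the first three dimensions would account for 90% of the positive eigenvalue sum.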

Page 10

The Protein Structure Universe (2005)

Page 11

[Figure: example structures A1 to A5 labeled on the map]

A1: (2ERL:_) MATING PHEROMONE ER-1;

A2: (1ELW:B) TPR1-DOMAIN OF HOP;

A3: (1A6M:_) MYOGLOBIN;

A4: (1E85:A) CYTOCHROME C’;

A5: (1M57:C) CYTOCHROME C OXIDASE;

Four demographic regions of the protein structure universe

Page 12

Four Protein Fold Classes

α, β, α+β, α/β

Page 13

Major Features of the Protein Structural Space

1. Protein structural space is sparsely populated

2. Four elongated regions corresponding to four protein “fold” classes

3. Small to large size distribution along three of four “feature axes”

Page 14

II. Mapping of Functions

(1) Enzymatic functions

Page 15

Molecular functions: Basic chemistry

EC

Page 16

EC3: Hydrolases

Page 17

EC6: Ligases

Page 18

II. Mapping of Functions

(2) Metal Binding

Page 19

Metal Binding

Legend: Ca, Co, Cu, Fe, Mn, Mo, Ni, Zn, multi-bound, not bound

Page 20

Zn

Page 21

Cu

Page 22

Major Features of Functional Mapping

Maximum diversity in architectural preference for a given molecular function:

“scaffold” selection vs. design

Page 23

III. Evolution of Proteins

(a) “Ages” of Protein Families

Page 24

Method: “Common Structural Ancestor”

Page 25

The “age” of the “common structural ancestor” of a protein family

“Age” of CSA

Page 26

Ages of the Common Structural Ancestors

Population-averaged chain length has a similar distribution

Page 27

III. Evolution of Proteins (b) Protein Fold Classes

Page 28

ML Relative “age” of common structural ancestors

Page 29

III. Evolution of Proteins (c) Protein Families

Page 30

Hypothesis: Multiple Origins of Protein Families

Page 31

Summary

• Mapping of protein structures: sparse except for four highly populated demographic regions (structural selection)

• Mapping of molecular functions: opportunistic use of structural features for molecular function (selection, not design)

• Mapping of CSA ages: (1) evolution of protein fold classes; (2) “multiple origin model” for the evolution of protein families

Page 32

Organismic evolution by natural selection for environment

may be founded on

Molecular evolution by structural selection for function