Maze in biology: the pathway problem

Page 1: Maze in biology: the pathway problem

YMIB

Maze in biology: the pathway problem

Ueng-Cheng Yang (楊永正)

Institute of Bioinformatics

National Yang-Ming University

Nov. 14, 2003

http://www.flint.umich.edu/Departments/ITS/crac/mazeorig.form.html

Page 2: Maze in biology: the pathway problem

Embryonic development

[Figure: oogenesis (mRNA localization) → oocyte + sperm → fertilization → 1st cleavage (2 identical cells) → 2nd cleavage (4 identical cells) → 3rd cleavage (8 cells with 2 different cell types)]

The genome is the complete set of genetic material, which is similar to the programs in a ROM

Page 3: Maze in biology: the pathway problem

Gene expression of eukaryotes

Picture taken from Lehninger’s “Principles of Biochemistry”

Page 4: Maze in biology: the pathway problem

Microarray (gene chip) is a high-throughput technique that can measure the expression of thousands of genes at a time

[Figure: Perturbation → Black box → Changes in gene expression]

Page 5: Maze in biology: the pathway problem

Presentation of life and knowledge management

[Figure: Sequence information → decompress → Expression level, with axes Genes, Tissue (spatial) and Development (temporal)]

Page 6: Maze in biology: the pathway problem

Transform or out of the game?

http://www.sciencemag.org/cgi/content/full/291/5507/1221/F1

Global: high-throughput analysis

Local: individual analysis

Page 7: Maze in biology: the pathway problem

Bioinformatics should provide the direction for future biology

[Figure labels: Bioinformatics research; Genome, transcriptome and proteome research; Collect data; Interpret data]

tatttctctactgatttgaacaagattgtcgagaaattcccaaaacaagccgaaaaattg

Data => Information => Knowledge => Technique => Economy

Page 8: Maze in biology: the pathway problem

Are there rules in biology?

* Picture made from screenshot of http://www.shef.ac.uk/~chem/web-elements/

Page 9: Maze in biology: the pathway problem

Should there be rules in biology?

Gene duplication

Variation (mutation)

Gene duplication

Recombination

+

Page 10: Maze in biology: the pathway problem

Pathway study is one of the most fundamental problems for biological research at the molecular level

• Metabolism
• Signal transduction
• Biosynthesis of macromolecules (mechanism study)
– Replication
– Transcription
– RNA processing
– Translation

Page 11: Maze in biology: the pathway problem

Similar chemistry can be re-used in different enzymes

HOOC-CH2-CH2-CO-COOH + NAD+ + CoASH → HOOC-CH2-CH2-CO-S-CoA + CO2 + NADH + H+

(α-ketoglutarate → succinyl CoA)

Page 12: Maze in biology: the pathway problem

Paralogous genes may have similar functions

Linear molecules (carbon number in parentheses)

pyruvate (3) → acetyl CoA (2) + CO2

α-ketobutyrate (4) → propionyl CoA (3) + CO2

α-ketoglutarate (5) → succinyl CoA (4) + CO2

α-ketoadipic acid (6) → glutaryl CoA (5) + CO2

Branched molecules

α-ketoisovalerate (5) → isobutyryl CoA (4) + CO2

α-ketoisocaproic acid (6) → isovaleryl CoA (5) + CO2

α-keto-β-methylvalerate (6) → α-methylbutyryl CoA (5) + CO2

Page 13: Maze in biology: the pathway problem

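Observation (III): “Dehydrogenation, hydration, dehydrogenation” is a pathway module

[Figure: TCA cycle]

acetyl CoA + OAA → citrate → isocitrate

isocitrate → α-ketoglutarate (−2H, −CO2)

α-ketoglutarate + CoA → succinyl CoA (−2H, −CO2)

succinyl CoA → succinate (CoA + GTP)

succinate → fumarate (−2H)

fumarate → malate (+H2O)

malate → OAA (−2H)

The first half of the cycle releases CO2; the second half re-forms the carrier.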

Page 14: Maze in biology: the pathway problem

A set of reactions can be “re-used” together

RCH2CH2CH2C(=O)-S-CoA

→ (−2H) RCH2CH=CHC(=O)-S-CoA

→ (+H2O) RCH2CH(OH)CH2C(=O)-S-CoA

→ (−2H) RCH2C(=O)CH2C(=O)-S-CoA

RCH2CH2CH2CH2CH2CH2C(=O)-S-CoA → RCH2CH2CH2CH2C(=O)-S-CoA + Acetyl CoA → RCH2CH2C(=O)-S-CoA + Acetyl CoA
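The repetition of this reaction set can be counted with a small sketch. It assumes a saturated, even-numbered acyl CoA chain; the function name `beta_oxidation` is illustrative and does not come from the slides.

```python
def beta_oxidation(carbons: int) -> dict:
    """Count how often the dehydrogenation/hydration/dehydrogenation
    module (plus thiolysis) runs on a saturated, even-chain acyl CoA."""
    if carbons < 4 or carbons % 2:
        raise ValueError("sketch assumes an even chain of at least 4 carbons")
    # Each round removes one acetyl CoA (2 carbons); the final round
    # splits a 4-carbon chain into two acetyl CoA.
    rounds = carbons // 2 - 1
    return {"rounds": rounds, "acetyl_coa": carbons // 2}


if __name__ == "__main__":
    print(beta_oxidation(16))  # a 16-carbon chain such as palmitoyl CoA
```

The same three-step module is applied in every round, which is the sense in which the reaction set is "re-used".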

Page 15: Maze in biology: the pathway problem

A single reaction may create a new pathway

[Figure: carbon-number schemes for photosynthesis and the pentose phosphate cycle. Transketolase: 5 + 5 → 3 + 7. Transaldolase: 3 + 7 → 6 + 4.]

Page 16: Maze in biology: the pathway problem

The pathway problems that might be obvious to physicists

• Pathway simulation => hypothetical cell
– Flux balance analysis
– S-system
– … etc.

Page 17: Maze in biology: the pathway problem

Complicated feedback regulation

[Figure: two linear pathways, A → B → C → D and W → X → Y → Z, with feedback inhibition (−)]

"x" (such as ADP) will accumulate if this reaction is inhibited.
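A minimal sketch of end-product feedback inhibition in a linear pathway A → B → C → D follows. The rate law, the rate constants and the function name `simulate` are assumptions made for illustration; the slides do not specify a model.

```python
def simulate(feedback: bool, steps: int = 5000, dt: float = 0.01) -> float:
    """Euler-integrate A -> B -> C -> D and return the final level of D.

    With feedback=True, the end product D inhibits the first step.
    All constants below are invented for this sketch.
    """
    a = 1.0    # substrate A is held constant
    k = 1.0    # same first-order rate constant for every step
    ki = 0.5   # hypothetical inhibition constant for D acting on step 1
    b = c = d = 0.0
    for _ in range(steps):
        v1 = k * a / (1.0 + d / ki) if feedback else k * a
        v2, v3, v4 = k * b, k * c, k * d   # v4 is the consumption of D
        b += dt * (v1 - v2)
        c += dt * (v2 - v3)
        d += dt * (v3 - v4)
    return d


if __name__ == "__main__":
    print("D without feedback:", round(simulate(False), 3))
    print("D with feedback:   ", round(simulate(True), 3))
```

With these constants the steady state of D solves k·a / (1 + D/ki) = k·D, which gives D = 0.5 with feedback and D = 1.0 without it, so the inhibited pathway settles at a lower end-product level.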

Page 18: Maze in biology: the pathway problem

Cell cycle and simulation of complex biological events

M → G1 → S → G2 → M (G1, S and G2 form interphase)

Page 19: Maze in biology: the pathway problem

Other types of pathway problems

• Pathway discovery
– From protein-protein interaction and microarray
• Pathway reconstruction
– Genome annotation and interpretation
• Pathway simulation => hypothetical cell
– Flux balance analysis
– S-system

Page 20: Maze in biology: the pathway problem

Information integration is the first step for data mining

DNA → (transcription) → RNA → (translation) → protein

DNA: Genomic seq.

RNA: EST, SAGE, Gene chips

protein: Modification, expression, interaction, structure

Annotation, comparison

Page 21: Maze in biology: the pathway problem

Different cells have the same genome, but they express different sets of genes after differentiation

Gene     Colon  Kidney  Lung  Ovary  Small intestine  Testis  Thyroid  …  Total
EGF          0      15     1      0                0       0        0  …     19
EGFR         3       4    19      9                0       0        0  …    103
PLCG1        1       3     7      1                2       1        0  …     68
SHC1         4      10    22      1                0       3        1  …    249
GRB2         1       1     3      2                0       0        2  …     77
SOS1         4       3     0      2                0       0        0  …     36
HRAS         1       7    10      0                2       1        0  …     58
RAF1         4       6    28      1                3       4        0  …    197
MAP3K1       2       8     2      2                0       0        0  …     44
MAP2K4       5       6     1      3                1       4        0  …     81
MAP2K1       4      10     3      2                0       2        0  …     82
MAPK8        1       2     2      0                0       1        0  …     33
STAT1       13      32    14      6                4       6        3  …    260
STAT3        3       7    17      7                0       1        0  …    135
MAPK3        9      10     9      4                1       1        0  …    181
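A table like this can be queried for components that were not observed in a given tissue. The sketch below copies a few rows from the table; the helper name `absent_in` is hypothetical, and a count of zero only means that no transcript was sampled, not that the gene is never expressed there.

```python
TISSUES = ["Colon", "Kidney", "Lung", "Ovary",
           "Small intestine", "Testis", "Thyroid"]

# A subset of rows copied from the table (counts per tissue).
COUNTS = {
    "EGF":  [0, 15, 1, 0, 0, 0, 0],
    "EGFR": [3, 4, 19, 9, 0, 0, 0],
    "GRB2": [1, 1, 3, 2, 0, 0, 2],
    "SOS1": [4, 3, 0, 2, 0, 0, 0],
    "HRAS": [1, 7, 10, 0, 2, 1, 0],
    "RAF1": [4, 6, 28, 1, 3, 4, 0],
}


def absent_in(tissue: str) -> list:
    """Return the genes with a zero count in the given tissue, sorted by name."""
    column = TISSUES.index(tissue)
    return sorted(gene for gene, row in COUNTS.items() if row[column] == 0)


if __name__ == "__main__":
    for tissue in ("Colon", "Lung"):
        print(tissue, absent_in(tissue))
```

For the rows shown, only EGF has a zero count in colon while its receptor and the downstream components are present.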

Page 22: Maze in biology: the pathway problem

Organizing the known information: Integrating different types of pathways

[Figure: Signal transduction (EGF) → Gene regulatory network (CDK, E2F) → Metabolic pathway (PFK: F6P → F1,6P, Glycolysis)]

Page 23: Maze in biology: the pathway problem

Steps in pathway discovery

Factors involved => Components

Molecular interaction => Events

Order of events => Pathways

Pathway interaction => Circuits

Page 24: Maze in biology: the pathway problem

The dream of molecular biologists

?

Cell., 100(1):57–70 Review, 2000.

PNAS, Vol. 95, 14863-14868

Science. Vol 292. May,2001

Page 25: Maze in biology: the pathway problem

Appropriate presentation format is essential for computation

[EGFR] + [EGF] <-> [EGF-EGFR]

[EGF-EGFR] + [EGF-EGFR] <-> [(EGF-EGFR)2]

[(EGF-EGFR)2] <-> [(EGF-EGFR*)2]

[(EGF-EGFR*)2] + [GAP] <-> [(EGF-EGFR*)2-GAP]

[(EGF-EGFR*)2-GAP] + [Grb2] <-> [(EGF-EGFR*)2-GAP-Grb2]

[(EGF-EGFR*)2-GAP-Grb2] + [Sos] <-> [(EGF-EGFR*)2-GAP-Grb2-Sos]

[(EGF-EGFR*)2-GAP-Grb2-Sos] + [Ras-GDP] <-> [(EGF-EGFR*)2-GAP-Grb2-Sos-Ras-GDP]

[(EGF-EGFR*)2-GAP-Grb2-Sos-Ras-GDP] <-> [(EGF-EGFR*)2-GAP-Grb2-Sos] + [Ras-GTP]

[Raf] + [Ras-GTP] <-> [Raf-Ras-GTP]

[Raf-Ras-GTP] <-> [Raf*] + [Ras-GTP*]

Nature biotechnology 20, 370-375
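One reason such a format helps computation is that it can be parsed mechanically. The following is a minimal sketch, assuming every species is wrapped in square brackets and both sides are separated by `<->`, as in the reactions listed; the function name `parse_reaction` is illustrative.

```python
import re

# A species is whatever sits between one pair of square brackets.
SPECIES = re.compile(r"\[([^\[\]]+)\]")


def parse_reaction(line: str):
    """Split '[A]+[B] <-> [C]' into (reactants, products) lists."""
    left, right = line.split("<->")
    return SPECIES.findall(left), SPECIES.findall(right)


if __name__ == "__main__":
    reactions = [
        "[EGFR]+[EGF] <-> [EGF-EGFR]",
        "[EGF-EGFR]+[EGF-EGFR] <->[(EGF-EGFR)2]",
        "[Raf]+[Ras-GTP]<->[Raf-Ras-GTP]",
    ]
    for reaction in reactions:
        print(parse_reaction(reaction))
```

The resulting reactant and product lists can then be loaded into a graph or a rate-equation model.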

Page 26: Maze in biology: the pathway problem

Strategy

[Figure: a cell with cell membrane and nucleus. Inward reconstruction: Receptor → adaptor → ? → ?. Connector: ?. Outward reconstruction: X, Y, Z.]

Page 27: Maze in biology: the pathway problem

Reconstructing pathways based on protein-protein interaction

Inward reconstruction: Receptor → adaptor → … etc.

Page 28: Maze in biology: the pathway problem

Identifying a new receptor is the starting point for inward reconstruction

Page 29: Maze in biology: the pathway problem

The distribution of death domain containing genes in the human genome

[Figure: numbered positions of the genes; layout not recoverable]

Page 30: Maze in biology: the pathway problem

Phylogenetic clusters correlate with protein functions

[Figure: phylogenetic tree with clusters A to F, scale bar 0.1. Leaf labels as extracted (index and gene; some are fused in the source): 16 UNC5D, 10 UNC5A, 21 UNC5B, 7 UNC5C, 23 NFKB231, 8 NFKB1, 19 DAPK1, 34 NY-REN-64, 36 MALT1, 33 IRAK2, 35 IRAK1, 26 IRAK-M, 12, 23 EDAR, 5, 29 NGFR, 27 CRADD, 6, 24 FADD, 28 TRADD, 11 RIPK1, 13 TNFRSF21, 32 LRDD, 1 TNFRSF12, 25 TNFRSF1A, 14 TNFRSF10A, 15 TNFRSF10B, 18 TNFRSF11B, 22 TNFRSF6, 30 P84, 4 MYD88, 20 ANK3, 17 ANK1, 9 ANK2]

Page 31: Maze in biology: the pathway problem

Functional correlation: Tissue specificity of gene expression

brain tissues

Paralogous genes

Page 32: Maze in biology: the pathway problem

Specificity of protein-protein interaction

[Figure: the phylogenetic tree from page 30, repeated]

TNFRSF1A, 12 --- TRADD --- FADD

TNFRSF6, 10A, 10B --- FADD

Page 33: Maze in biology: the pathway problem

Reconstructing pathways based on gene expression and pathway information

[Figure: outward reconstruction from the nucleus toward the cell membrane: Jun ← MAPK8-P* ← MAP2K4-P* ← ?]

Page 34: Maze in biology: the pathway problem

Related pathways in heart

Page 35: Maze in biology: the pathway problem

Related pathways can be discovered by looking for shared components among pathways

[Figure: pathway network labeled with shared-component counts (13 to 25)]

Pathway1       Pathway2          Index
pdgfPathway    egfPathway        1.96e-40
pdgfPathway    tpoPathway        9.89e-27
pdgfPathway    igf1Pathway       2.26e-22
pdgfPathway    insulinPathway    2.26e-22
egfPathway     igf1Pathway       2.26e-22
egfPathway     insulinPathway    2.26e-22
pdgfPathway    ngfPathway        2.20e-22
…              …                 …
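The slides do not say how the index is computed. One common way to score shared components is a hypergeometric tail probability, sketched below under that assumption; the function name `overlap_p_value` and its parameters are illustrative.

```python
from math import comb


def overlap_p_value(universe: int, size1: int, size2: int, shared: int) -> float:
    """Probability of seeing at least `shared` common components when
    pathway 2 (size2 components) is drawn at random from `universe`
    components, of which size1 belong to pathway 1.

    Assumption: a hypergeometric null model, not confirmed by the slides.
    """
    upper = min(size1, size2)
    total = comb(universe, size2)
    favourable = sum(
        comb(size1, k) * comb(universe - size1, size2 - k)
        for k in range(shared, upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Hypothetical numbers: 1000 known components, two pathways of 30
    # components each, 25 of them shared.
    print(overlap_p_value(1000, 30, 30, 25))
```

A smaller value means the overlap is less likely to arise by chance, which matches the ordering of the table, where the most strongly related pair comes first.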

Page 36: Maze in biology: the pathway problem

To die, or not to die? It’s a signaling problem

Page 37: Maze in biology: the pathway problem

If PDGF receptor does not exist in colon, why do we need the downstream components in PDGF signaling pathway?

Page 38: Maze in biology: the pathway problem

“MAP2K4, MAPK8, Jun” is a pathway module shared by at least 3 pathways

PDGF 11

EGF 11

TNF 21

EGF/PDGF 16

ALL 4

Page 39: Maze in biology: the pathway problem

Pathway modules

[Figure labels]

Signals: Death signal, Growth signal, Stress signal

HRAS, TRAF2

Modules: MAP3K1 (MEKK1) module, RAF1 (RAF) module, MAP3K7 (TAK) module

RPS6KA5

FOS, JUN, ATF2, SP1

Gene expression regulation (including transcription, splicing), translation and protein modification…

Page 40: Maze in biology: the pathway problem

Connector

Factors involved => Components

Molecular interaction => Events

Order of events => Pathways

Page 41: Maze in biology: the pathway problem

Inducible gene sets are co-regulated.

Picture taken from http://genomics.stanford.edu/yeast/additional_figures_link.html

Page 42: Maze in biology: the pathway problem

Most constitutively expressed genes are not regulated

Pyruvate kinase

Rate-limiting step is usually the target for regulation

Page 43: Maze in biology: the pathway problem

Microarray experiments are nature’s way to classify genes

Collect sections from different angles

Image reconstruction

http://www.npcc.gov.tw/npcc/chn/imaging/imaging.htm

Tomography

Page 44: Maze in biology: the pathway problem

In an extreme environment, the whole pathway can be turned on/off

ALPHA = alpha factor arrest 18; ELU = centrifugal elutriation 14; CDC15 = cdc15 ts 15; SPO = sporulation 7; HT = shock by high temp 6; D = reducing agent 4; C = low temp 4; DX = diauxic shift 7

Clustering is driven by these features

ALPHA ELU CDC15 SPO HT D C DX

Conflicts?

Page 45: Maze in biology: the pathway problem

Unrelated sequences of similar function cluster together

Eisen, M.B., Spellman, P.T., Brown, P.O., and Botstein, D. (1998) Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95, 14863-14868.

Page 46: Maze in biology: the pathway problem

How good is the classification?

• In microarray clustering
– hexokinase II
– phosphofructokinase
– aldolase
– triose phosphate isomerase
– GAPDH 1, 2, 3
– phosphoglycerate kinase
– phosphoglycerate mutase
– Enolase II
– pyruvate kinase
• In glycolysis, in total there are 10 enzymes involved
• Microarray experiment only missed phosphoglucose isomerase
• Pyruvate (de)carboxylase and transaldolase are misplaced

Pretty good
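"Pretty good" can be put into numbers with a short sketch. It assumes that isozymes (GAPDH 1, 2, 3; hexokinase II; Enolase II) are counted once per enzyme, and the set names below are illustrative.

```python
# The ten enzymes of glycolysis.
GLYCOLYSIS = {
    "hexokinase", "phosphoglucose isomerase", "phosphofructokinase",
    "aldolase", "triose phosphate isomerase", "GAPDH",
    "phosphoglycerate kinase", "phosphoglycerate mutase",
    "enolase", "pyruvate kinase",
}

# Members of the microarray cluster as listed on the slide,
# including the two misplaced enzymes.
CLUSTER = (GLYCOLYSIS - {"phosphoglucose isomerase"}) | {
    "pyruvate (de)carboxylase", "transaldolase",
}

hits = GLYCOLYSIS & CLUSTER
recall = len(hits) / len(GLYCOLYSIS)      # share of the pathway recovered
precision = len(hits) / len(CLUSTER)      # share of the cluster that belongs

if __name__ == "__main__":
    print(f"recall = {recall:.2f}, precision = {precision:.2f}")
```

Under this counting the cluster recovers 9 of the 10 glycolytic enzymes, and 9 of its 11 members belong to the pathway.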

Page 47: Maze in biology: the pathway problem

Pathway is a subset of components in a regulatory network

How can we reconstruct the network from partial pathways?

Page 48: Maze in biology: the pathway problem

Tri-component relation is better than bi-component relation

Page 49: Maze in biology: the pathway problem

Distinguishing branch and linear structures is sufficient

Page 50: Maze in biology: the pathway problem

Distinguish the branch and linear structures

Page 51: Maze in biology: the pathway problem

Exact order within a subset is not essential to reconstruct the pathway

Nodes: 3 4 5 6 7

Ordered subsets: {3,4,5} {4,5,6} {5,6,7}

Unordered subsets: {4,5,3} {5,4,6} {7,5,6}

Reconstructed pathway: 3=>4=>5=>6=>7
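The claim can be checked on the example with a brute-force sketch: try every ordering of the nodes and keep those whose consecutive triples match the given unordered subsets. This is an illustration for small examples only, not the method used in the talk (the summary mentions learning a Bayesian network); the function name `reconstruct` is hypothetical.

```python
from itertools import permutations


def reconstruct(triples):
    """Return every linear order whose consecutive 3-node windows are
    exactly the given unordered triples (brute force, small inputs only)."""
    wanted = {frozenset(t) for t in triples}
    nodes = sorted(set().union(*wanted))
    orders = []
    for order in permutations(nodes):
        windows = {frozenset(order[i:i + 3]) for i in range(len(order) - 2)}
        if windows == wanted:
            orders.append(order)
    return orders


if __name__ == "__main__":
    # The scrambled subsets from the slide.
    for order in reconstruct([{5, 4, 6}, {7, 5, 6}, {4, 5, 3}]):
        print("=>".join(map(str, order)))
```

Only the chain 3, 4, 5, 6, 7 and its mirror image survive, so the unordered triples fix the pathway up to direction; deciding the direction needs extra information such as which end is the receptor.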

Page 52: Maze in biology: the pathway problem

Integrating discontinuous tri-component relation

Page 53: Maze in biology: the pathway problem

Summary

• Inward reconstruction
– Look for novel receptors by protein domain search
– Look for possible pathways by protein-protein interaction information
• Connector
– Look for tri-component relations by learning a Bayesian network
• Outward reconstruction
– Look for pathway modules
– Establish transcription regulation network

Need a user-centric environment for information-driven biomedical research

Page 54: Maze in biology: the pathway problem

Acknowledgements

• Yuh-Fan Liu: Genome wide motif scanning

• Yung-Wen Deng: Death domain resource and cross talks among pathways

• Yu-Tai Wang: Pathway knowledge management system

• Kai-Lung Tang: Pathway visualization

• Shih-Te Yang: Pathway prediction

• Collaborator: Dr. Der-Ming Liou

Page 55: Maze in biology: the pathway problem

Complications in regulation

Alternative pathways caused by alternative splicing events

Page 56: Maze in biology: the pathway problem

Differential Processing of The Calcitonin Gene Transcript in Rats

Picture taken from Lehninger’s “Principles of Biochemistry”

Page 57: Maze in biology: the pathway problem

A tumor necrosis factor receptor that lacks a transmembrane region

Page 58: Maze in biology: the pathway problem

A FADD protein that lacks the DED domain

Page 59: Maze in biology: the pathway problem

Information-driven biomedical research

Make observations and working hypotheses by comparing information