special topics in computational biology : formal methods in systems biology

Special Topics in Computational Biology: Formal Methods in Systems Biology Chris Langmead Department of Computer Science Carnegie Mellon University James Faeder Department of Computational Biology University of Pittsburgh School of Medicine Spring, 2008

Page 1: Special Topics in Computational Biology : Formal Methods in Systems Biology

Special Topics in Computational Biology: Formal Methods in Systems Biology

Chris Langmead, Department of Computer Science, Carnegie Mellon University

James Faeder, Department of Computational Biology, University of Pittsburgh School of Medicine

Spring, 2008

Page 2: Special Topics in Computational Biology : Formal Methods in Systems Biology

General Info

• Course Numbers:
– CMU 15-872(A)
– CMU 02-730
– Pitt CMPBIO 2045 (Arts & Sciences)
– Pitt MSCBIO 2045 (School of Medicine)
• Location: Newell-Simon Hall (NSH) 3002 - OK?
• Time: Tu, Th 1:30-2:50 PM
• Instructors
– Chris Langmead ([email protected])
– Jim Faeder ([email protected])
• Office Hours: By appointment (please email)
• Course Wiki: http://bionetgen.org/index.php/Formal_Methods_in_Systems_Biology (email Jim for account)

Page 3: Special Topics in Computational Biology : Formal Methods in Systems Biology

Course Format: An Informal Course about Formal Methods

• Introductory lectures (two weeks)
• Students will read and present research papers
– Sign up for open dates on the wiki (25 - projects)
• Students will design and complete a course project on a subject of special interest
• Grading is based on completion of work
• Flexibility depending on course enrollment
– Journal club
– Focused project
– Review article

Page 4: Special Topics in Computational Biology : Formal Methods in Systems Biology

Encouragement

• Opportunity to learn about new areas and methods that will be of direct interest in your research.
• (True for the “instructors” as well)
• We will operate as a multi-disciplinary team
– Computer Scientists, Physicists, Chemists, Engineers, Mathematicians, …, Biologists
– Good communication essential

Page 5: Special Topics in Computational Biology : Formal Methods in Systems Biology

Products of the Course

• Comprehensive bibliography in wiki format
• Research projects leading to publishable results in the field
• Review article (?)
• Improved organization and presentation skills
• Participation on a multi-disciplinary team

Page 6: Special Topics in Computational Biology : Formal Methods in Systems Biology

Introductions

• Your name

• Your university, department, research area(s) and research advisor

• Your educational background
– Computer Science, Math, Physics, etc.

• Your goals in taking the course

Page 7: Special Topics in Computational Biology : Formal Methods in Systems Biology

Outline of Today’s Lecture

• Definition of terms

• Goals

• Examples of Successful Abstractions
– Flux Balance Analysis
– Mass Action Kinetics

• Brief survey of topics

Page 8: Special Topics in Computational Biology : Formal Methods in Systems Biology

Importance of Symbols

• Invention of symbol for zero and decimal system for writing numbers “among the greatest human inventions.”
• 3 known independent inventions
• In each case, development took centuries
• Major impact on trade, culture, and philosophy.
• Celebration of zero dot in Sanskrit poetry

“The dot on her forehead / Increases her beauty tenfold, / Just as a zero dot [sunya-bindu] / Increases a number tenfold.” - Biharilal

Page 9: Special Topics in Computational Biology : Formal Methods in Systems Biology

Key Definitions - Formal Methods

• In computer science and software engineering, formal methods are mathematically-based techniques for the specification, development and verification of software and hardware systems.

• The use of formal methods for software and hardware design is motivated by the expectation that, as in other engineering disciplines, performing appropriate mathematical analyses can contribute to the reliability and robustness of a design.

• However, the high cost of using formal methods means that they are usually only used in the development of high-integrity systems, where safety or security is important.

- WIKIPEDIA

Page 10: Special Topics in Computational Biology : Formal Methods in Systems Biology

Expanded View of Formal Methods

• Formal abstractions that may be used to model the system of interest

• In addition to systems that can be formally analyzed, we will consider representations that can only be fully explored by simulations.

Page 11: Special Topics in Computational Biology : Formal Methods in Systems Biology

Key Definitions - Systems Biology

• Systems biology is a relatively new biological study field that focuses on the systematic study of complex interactions in biological systems, thus using a new perspective (integration instead of reduction) to study them.

• Particularly from 2000 onwards, the term is used widely in the biosciences, and in a variety of contexts.

• Because the scientific method has been used primarily toward reductionism, one of the goals of systems biology is to discover new emergent properties that may arise from the systemic view used by this discipline in order to understand better the entirety of processes that happen in a biological system.

- WIKIPEDIA

Page 12: Special Topics in Computational Biology : Formal Methods in Systems Biology

Origin of Systems Biology

• Completion of genome projects is major inspiration

• Provided “parts list” for the cell

• Next obvious step is to ask how parts work together to carry out function?

Page 13: Special Topics in Computational Biology : Formal Methods in Systems Biology

Vision for Role of Computer Science in Systems Biology

• “Computer science could provide the abstraction[s] needed for consolidating knowledge of biomolecular systems”

• “...the abstractions, tools and methods used to specify and study computer systems should illuminate our accumulated knowledge about biomolecular systems.”

Regev and Shapiro, “Cells as Computation,” Nature (2002).

Page 14: Special Topics in Computational Biology : Formal Methods in Systems Biology

Abstract Representations in Biology

• DNA sequence represented by strings with 4 letter alphabet (ATGC)

• Protein sequence and structure
– Strings with 20 letter alphabet
– Set of 3D atomic coordinates (PDB file)

The KaiC hexamer, a Circadian clock protein. From pdb.org.

Page 15: Special Topics in Computational Biology : Formal Methods in Systems Biology

(Some) Desirable Properties of an Abstract Representation

1. Relevant / accurate

2. Computable

3. Understandable

4. Extensible

5. Scalable

– Modular

– Hierarchical

1-4 from Regev and Shapiro, “Cells as Computation,” Nature (2002).

Page 16: Special Topics in Computational Biology : Formal Methods in Systems Biology

An Irony

• CS community aims to provide powerful abstract representations to improve understanding of systems.

• Manner of reporting results - technical reports in conference proceedings - presents major barrier to wider adoption by science and engineering communities.

• There is a need for better communication among disciplines!

Page 17: Special Topics in Computational Biology : Formal Methods in Systems Biology

Sometimes formalism creates a barrier

Page 18: Special Topics in Computational Biology : Formal Methods in Systems Biology
Page 19: Special Topics in Computational Biology : Formal Methods in Systems Biology

Example: Red blood cell model

Page 20: Special Topics in Computational Biology : Formal Methods in Systems Biology

Agenda

• We are looking for useful abstractions that can improve our understanding of how biological systems behave

Page 21: Special Topics in Computational Biology : Formal Methods in Systems Biology

Goals

• Language(s) for constructing whole-cell models (comprehensive, system-wide)
• Formal analysis (reasoning) of such models
• Simulation of models on distributed systems
• Combination of analysis and simulation to predict behavior of models
– genotype → phenotype

Page 22: Special Topics in Computational Biology : Formal Methods in Systems Biology

Challenges

• Accuracy
– Missing interactions
• Computability
– Requirement to perform simulations for many properties of interest
– Poor scaling of simulations
• Understanding
– Problem of network visualization
• Extensibility
– Missing biophysics
• Scalability
– Need to compute behavior on multiple scales, e.g. tissue, cell, cytoplasm, nucleus

Page 23: Special Topics in Computational Biology : Formal Methods in Systems Biology

Mathematical vs. Computational Models

Fisher & Henzinger, Nat. Biotechnol. (2007).

Consider an elementary chemical reaction

r1: A + B -> C

Mathematical

$$\frac{d[A]}{dt} = -k[A][B]$$

Computational

module A: [0..N] init N;
[r1] (A > 0) -> k*A*B: (A' = A - 1);
…
endmodule

How important is this distinction?
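The two views of r1 on this slide can be put side by side in a few lines of Python. This is only a sketch under stated assumptions: the rate constant, initial counts, step size, and the function names `euler_ode` and `transitions` are illustrative and do not come from the slides. The first function treats A and B as continuous concentrations and integrates the rate equation with a forward Euler step; the second lists the discrete states and firing rates that the guarded command describes.

```python
def euler_ode(a0, b0, k, dt, steps):
    """Mathematical view: integrate d[A]/dt = d[B]/dt = -k*A*B (forward Euler)."""
    a, b = float(a0), float(b0)
    for _ in range(steps):
        consumed = k * a * b * dt   # amount of A and B converted in this step
        a, b = a - consumed, b - consumed
    return a, b


def transitions(a0, b0, k):
    """Computational view: list (state, rate, next_state) for each firing of r1.

    A state is a tuple of molecule counts (A, B, C). The slide shows only the
    update of A; decrementing B and incrementing C is filled in here as an
    assumption that follows from the reaction A + B -> C.
    """
    result = []
    a, b, c = a0, b0, 0
    while a > 0 and b > 0:          # guard: r1 needs at least one A and one B
        nxt = (a - 1, b - 1, c + 1)
        result.append(((a, b, c), k * a * b, nxt))
        a, b, c = nxt
    return result


print(euler_ode(3, 2, k=0.1, dt=0.01, steps=1000))
for state, rate, nxt in transitions(3, 2, k=0.1):
    print(state, "-- rate", rate, "->", nxt)
```

The continuous model returns real-valued amounts at a chosen time, while the discrete model is a finite list of states that ends when one reactant runs out. That finite state space is what makes exhaustive analysis possible.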

Page 24: Special Topics in Computational Biology : Formal Methods in Systems Biology

Tension between Accuracy and Computability

• Application of formal methods requires that elements of representation be relatively simple.
• For example, a representation that includes all analytical functions in mathematics might not be useful - impossible to make predictions.
• In general, increasing the complexity of the representation limits ability for analysis.
• Representations are sometimes chosen for amenability to analysis rather than realism - e.g. Boolean networks.
• Computational (“executable”) models tend to make restrictions explicit.

Page 25: Special Topics in Computational Biology : Formal Methods in Systems Biology

Some successful abstractions in systems biology

• Flux Balance Analysis
– Genome-wide models of metabolism
• Mass Action Kinetics
– Cell-cycle model
– Growth factor signaling model

Page 26: Special Topics in Computational Biology : Formal Methods in Systems Biology

Network Reconstruction (2D Annotation)

B. O. Palsson, Nature Biotechnology 22, 1218 - 1219 (2004)

Page 27: Special Topics in Computational Biology : Formal Methods in Systems Biology

Network Reconstruction (cont.)

• Wiring diagram for the components in a cell
• Elements are
– Molecular Components (Species)
– Interactions (Reactions)
• Additional detail can be added.
• Genome-wide reconstructions for metabolism are available for many model organisms (including Homo sapiens!)

• “All such interactions are ultimately represented by a genome-scale stoichiometric matrix—a two-dimensional genome annotation.”

B. O. Palsson, Nature Biotechnology 22, 1218 - 1219 (2004)
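As a rough illustration of how a list of species and reactions becomes a stoichiometric matrix, the sketch below builds one in plain Python. The two-reaction network and the helper name `stoichiometric_matrix` are hypothetical and chosen only to keep the example small.

```python
def stoichiometric_matrix(species, reactions):
    """Return S as a list of rows; S[j][i] is the net change of species j in reaction i."""
    row_of = {name: j for j, name in enumerate(species)}
    S = [[0] * len(reactions) for _ in species]
    for i, (reactants, products) in enumerate(reactions):
        for name in reactants:
            S[row_of[name]][i] -= 1   # consumed by reaction i
        for name in products:
            S[row_of[name]][i] += 1   # produced by reaction i
    return S


# Hypothetical network: one binding reaction and its reverse.
species = ["A", "B", "C"]
reactions = [
    (["A", "B"], ["C"]),   # A + B -> C
    (["C"], ["A", "B"]),   # C -> A + B
]
S = stoichiometric_matrix(species, reactions)
for name, row in zip(species, S):
    print(name, row)
```

Each column is one reaction and each row is one species, so adding a reaction to the reconstruction only appends a column.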

Page 28: Special Topics in Computational Biology : Formal Methods in Systems Biology

Overview of Flux Balance Analysis

• Genome-wide reconstruction of metabolic network

• Assume steady state

• Assume optimal growth (biomass production)

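$$r_i:\; s_1 + s_2 \xrightarrow{v_i} s_3$$

$$S_{1i} = S_{2i} = -1;\quad S_{3i} = 1;\quad S_{ji} = 0,\ \forall j \notin \{1, 2, 3\}$$

$$S \cdot v = b$$, where $b_i$ are known transport fluxes.

maximize $$f(v) = v \cdot v_{\text{out}}$$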
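The constraints above can be checked on a toy network without any solver. The sketch below is not a full flux balance analysis: it tests a handful of hand-picked flux vectors against the steady-state and capacity constraints and keeps the one with the largest objective. The network, the bounds, the objective weights `c`, and the candidate vectors are all assumptions made for illustration, and transport reactions are folded into the columns of S, so b is the zero vector in this variant. A real analysis would pass the same S, b, bounds, and objective to a linear programming solver.

```python
def matvec(S, v):
    """Multiply matrix S (list of rows) by the flux vector v."""
    return [sum(s * x for s, x in zip(row, v)) for row in S]


def is_feasible(S, v, b, upper, tol=1e-9):
    """True if S.v = b and every flux lies between 0 and its upper bound."""
    balanced = all(abs(lhs - rhs) <= tol for lhs, rhs in zip(matvec(S, v), b))
    bounded = all(0 <= x <= u for x, u in zip(v, upper))
    return balanced and bounded


# Columns: uptake (-> A), r1 (A -> B), r2 (A -> C), export of B, export of C.
S = [
    [1, -1, -1, 0, 0],   # metabolite A
    [0, 1, 0, -1, 0],    # metabolite B
    [0, 0, 1, 0, -1],    # metabolite C
]
b = [0, 0, 0]                  # transport is part of S here, so b = 0
upper = [10, 6, 10, 10, 10]    # hypothetical capacity limits
c = [0, 0, 0, 1, 0]            # objective: export of B stands in for growth

candidates = [
    (10, 6, 4, 6, 4),
    (10, 3, 7, 3, 7),
    (10, 10, 0, 10, 0),   # violates the bound on r1
    (10, 6, 4, 5, 4),     # B is produced faster than it is exported
]
feasible = [v for v in candidates if is_feasible(S, v, b, upper)]
best = max(feasible, key=lambda v: sum(ci * vi for ci, vi in zip(c, v)))
print(best)
```

In this toy problem the bound on r1 caps the export of B at 6, so the first candidate also happens to be the true optimum.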

Page 29: Special Topics in Computational Biology : Formal Methods in Systems Biology

Genome-Wide Reconstruction of Haemophilus influenzae

Edwards, J. S. et al. J. Biol. Chem. 1999;274:17410-17416

Page 30: Special Topics in Computational Biology : Formal Methods in Systems Biology

Single and double deletion in the central metabolic pathways of H. influenzae

Edwards, J. S. et al. J. Biol. Chem. 1999;274:17410-17416

Page 31: Special Topics in Computational Biology : Formal Methods in Systems Biology

What Accounts for Success?

• Knowledge Base
– Metabolic chemistry known from >50 years of biochemistry and genome sequence
• Simple Abstraction
– Biochemistry reduced to list of reaction stoichiometries
• Powerful Computation Method
– Highly optimized solvers for Linear Programming problem
• Extensibility
– Non-optimal growth in mutants
– Constraints arising from molecular crowding

Page 32: Special Topics in Computational Biology : Formal Methods in Systems Biology

Cellular Signal Transduction

[Figure: signaling at the plasma membrane. Labels: ligand, receptor, ligand-receptor binding, aggregation, signaling complex, transphosphorylation, adaptor, SH3 domain, SH2 domain, kinase, plasma membrane]

Page 33: Special Topics in Computational Biology : Formal Methods in Systems Biology

Mass Action Kinetics

$$R + L \underset{k_d}{\overset{k_a}{\rightleftharpoons}} RL$$

Differential Equations

$$\frac{d[R]}{dt} = -k_a[R][L] + k_d[RL]$$

$$\frac{d[L]}{dt} = -k_a[R][L] + k_d[RL]$$

$$\frac{d[RL]}{dt} = +k_a[R][L] - k_d[RL]$$
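The three rate equations for ligand-receptor binding can be integrated numerically in a few lines. The sketch below uses a fixed-step forward Euler scheme for simplicity; the parameter values and the function name `simulate_binding` are assumptions for illustration, and a production model would normally use an adaptive ODE solver.

```python
def simulate_binding(r0, l0, rl0, ka, kd, dt, steps):
    """Integrate the mass action equations for R + L <-> RL with forward Euler."""
    r, l, rl = float(r0), float(l0), float(rl0)
    for _ in range(steps):
        net = (ka * r * l - kd * rl) * dt   # net amount bound during this step
        r, l, rl = r - net, l - net, rl + net
    return r, l, rl


# Hypothetical parameters in arbitrary units.
ka, kd = 1.0, 0.5
r, l, rl = simulate_binding(1.0, 2.0, 0.0, ka, kd, dt=0.001, steps=20000)
print(r, l, rl)
print("total receptor:", r + rl, "total ligand:", l + rl)
print("binding rate minus unbinding rate:", ka * r * l - kd * rl)
```

Two checks follow directly from the equations: total receptor (R + RL) and total ligand (L + RL) stay constant, and at long times the binding and unbinding rates balance.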

Page 34: Special Topics in Computational Biology : Formal Methods in Systems Biology

Reaction Network Model of Signaling

Kholodenko et al., J. Biol. Chem. 274, 30169 (1999)

Page 35: Special Topics in Computational Biology : Formal Methods in Systems Biology

Comparing Model and Experiment

Experimental Data

Simulation Results

Page 36: Special Topics in Computational Biology : Formal Methods in Systems Biology

Benefits of Mass Action Kinetic Modeling

• Large knowledge base of signaling biochemistry
• Models dynamical behavior
• Computational Methods Well Established
– ODE solvers for continuous systems
• Nonlinear Dynamics Theory
• Extensibility
– Stochastic Simulation Algorithm for discrete systems
– Spatially-resolved models can be built on same mass action equations
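To illustrate the stochastic extension mentioned in the list above, here is a minimal sketch of Gillespie's direct method applied to the same binding reaction R + L <-> RL, with integer molecule counts instead of concentrations. The rate constants, initial counts, seed, and the function name `gillespie_binding` are assumptions for illustration only.

```python
import random


def gillespie_binding(r, l, rl, ka, kd, t_end, seed=0):
    """Simulate R + L <-> RL with integer counts; return a list of (t, R, L, RL)."""
    rng = random.Random(seed)
    t = 0.0
    trajectory = [(t, r, l, rl)]
    while True:
        a_bind = ka * r * l       # propensity of R + L -> RL
        a_unbind = kd * rl        # propensity of RL -> R + L
        a_total = a_bind + a_unbind
        if a_total == 0:
            break                 # no reaction can fire
        t += rng.expovariate(a_total)   # exponentially distributed waiting time
        if t > t_end:
            break
        if rng.random() * a_total < a_bind:
            r, l, rl = r - 1, l - 1, rl + 1
        else:
            r, l, rl = r + 1, l + 1, rl - 1
        trajectory.append((t, r, l, rl))
    return trajectory


traj = gillespie_binding(10, 20, 0, ka=0.1, kd=0.5, t_end=5.0, seed=1)
print(len(traj), "recorded states; final state:", traj[-1])
```

Each run gives one random trajectory. Averaging many runs with different seeds approaches the ODE solution when molecule counts are large.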

Page 37: Special Topics in Computational Biology : Formal Methods in Systems Biology

Limitations of Mass Action Kinetic Modeling

• Rapidly expanding knowledge base
– Many components and interactions unknown
• Lack of precision
– ad hoc assumptions to limit combinatorial explosion (next lecture)
• Large sets of nonlinear ODEs are difficult to simulate or analyze
• No comprehensive models yet

Page 38: Special Topics in Computational Biology : Formal Methods in Systems Biology


Map of Signaling Initiated by a Single Family of Receptors

Oda and Kitano (2006) Mol. Syst. Biol.

Page 39: Special Topics in Computational Biology : Formal Methods in Systems Biology


Map of Signaling Initiated by a Single Family of Receptors

Oda and Kitano (2006) Mol. Syst. Biol.

Analysis is limited to simple graph theoretic measures and qualitative discussions of architecture.

Page 40: Special Topics in Computational Biology : Formal Methods in Systems Biology

(Partial) List of Topics

• Boolean Networks
• Petri Nets
• Statecharts
• Process Algebras
• Agent-Based Modeling
• Hybrid Systems
• Model Checking
• Simulation Algorithms

Page 41: Special Topics in Computational Biology : Formal Methods in Systems Biology

Brief Overview of Two Useful Abstractions

• Boolean Networks
• Petri Nets
• Statecharts
• Process Algebras
• Agent-Based Modeling
• Hybrid Systems
• Model Checking
• Simulation Algorithms

Page 42: Special Topics in Computational Biology : Formal Methods in Systems Biology

Boolean Networks

Li, F., et al. PNAS 101, 4781–4786 (2004).

BN model of cell cycle in budding yeast

G1

Page 43: Special Topics in Computational Biology : Formal Methods in Systems Biology

Boolean Networks

Li, F., et al. PNAS 101, 4781–4786 (2004).

BN model of cell cycle in budding yeast

G1

Update:

$$b(t+1) = a_1(t) + a_2(t) - a_3(t) - a_4(t)$$

Page 44: Special Topics in Computational Biology : Formal Methods in Systems Biology

Boolean Networks

Li, F., et al. PNAS 101, 4781–4786 (2004).

BN model of cell cycle in budding yeast

Blue arrows form stable basin of attraction

G1

Update:

$$b(t+1) = a_1(t) + a_2(t) - a_3(t) - a_4(t)$$
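The update rule can be tried out on a small network. The sketch below assumes the sum of activating minus inhibiting inputs is thresholded in the usual way for this kind of model: a node turns on if the sum is positive, turns off if it is negative, and keeps its value if the sum is zero. The slide shows only the sum, so the thresholding is an assumption here, and the three-node wiring is hypothetical rather than the yeast cell-cycle network.

```python
from itertools import product


def step(state, weights):
    """Synchronously update every node from the weighted sum of its inputs."""
    new = []
    for i, current in enumerate(state):
        total = sum(w * s for w, s in zip(weights[i], state))
        new.append(1 if total > 0 else 0 if total < 0 else current)
    return tuple(new)


def fixed_points(weights):
    """Enumerate all 2^n states and keep those that map to themselves."""
    n = len(weights)
    return [s for s in product((0, 1), repeat=n) if step(s, weights) == s]


def trajectory(state, weights, max_steps=20):
    """Follow the dynamics from a start state until a fixed point or max_steps."""
    path = [state]
    for _ in range(max_steps):
        nxt = step(path[-1], weights)
        if nxt == path[-1]:
            break
        path.append(nxt)
    return path


# weights[i][j] is +1 if node j activates node i, -1 if it inhibits it, else 0.
weights = [
    [0, 0, -1],   # node 0 is inhibited by node 2
    [1, 0, 0],    # node 1 is activated by node 0
    [0, 1, 0],    # node 2 is activated by node 1
]
print(fixed_points(weights))
print(trajectory((1, 0, 0), weights))
```

Because the state space is finite, every trajectory ends in a fixed point or a cycle. Counting how many start states lead to each attractor gives the size of its basin of attraction.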

Page 45: Special Topics in Computational Biology : Formal Methods in Systems Biology

Balance Sheet for BNs

Pro
• Models may be constructed on basis of scant data*
• Fast computation
• Strong analysis tools (?)
• Good for reasoning about stability and robustness

Con
• Two levels may not be enough
• Lack of compositionality
• Not hierarchical, but may be embedded in more complex models.

*Li S, Assmann SM, Albert R (2006) Predicting Essential Components of Signal Transduction Networks: A Dynamic Model of Guard Cell Abscisic Acid Signaling. PLoS Biol 4(10): e312

Page 46: Special Topics in Computational Biology : Formal Methods in Systems Biology

Petri Nets

Chaouiya, C. Petri net modelling of biological networks. Brief. Bioinform. 8, 210–219 (2007).

Places

Transition

Tokens

Transition

Page 47: Special Topics in Computational Biology : Formal Methods in Systems Biology

Time Evolution

Petri Nets

Chaouiya, C. Petri net modelling of biological networks. Brief. Bioinform. 8, 210–219 (2007).

Places

Transition

Tokens

Transition
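The time evolution of a Petri net comes down to one rule: a transition is enabled when each of its input places holds enough tokens, and firing it moves tokens from input places to output places. The sketch below is a minimal illustration of that rule; the place names, the token counts, and the helper names `enabled` and `fire` are hypothetical.

```python
def enabled(marking, pre):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in pre.items())


def fire(marking, pre, post):
    """Return the marking reached by firing a transition once."""
    if not enabled(marking, pre):
        raise ValueError("transition is not enabled")
    new = dict(marking)
    for place, n in pre.items():
        new[place] -= n                        # consume input tokens
    for place, n in post.items():
        new[place] = new.get(place, 0) + n     # produce output tokens
    return new


# Hypothetical net for R + L -> RL: places hold molecule counts as tokens.
marking = {"R": 2, "L": 1, "RL": 0}
bind_pre, bind_post = {"R": 1, "L": 1}, {"RL": 1}

after = fire(marking, bind_pre, bind_post)
print(after)
print("can fire again:", enabled(after, bind_pre))
```

After one firing the single L token is used up, so the transition is no longer enabled even though a token remains in R.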

Page 48: Special Topics in Computational Biology : Formal Methods in Systems Biology

Petri Nets Generalize Network Reconstruction

Chaouiya, C. Brief. Bioinform. 8, 210–219 (2007).

p3

p4

t2

The incidence matrix C of the Petri net corresponds to the stoichiometric matrix S.

Page 49: Special Topics in Computational Biology : Formal Methods in Systems Biology

Some useful formal properties of PNs

• P-invariants ($C^T \cdot x = 0$) ~ Mass Conservation

• T-invariants ($C \cdot y = 0$) ~ Loops / Elementary Modes

• Reachability - whether a state can be reached

• Liveness - whether a transition can be fired
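Both invariant conditions are plain matrix-vector checks, so they are easy to try on a small net. The sketch below assumes an incidence matrix with one row per place and one column per transition. The two-transition binding net, the markings, and the helper names are hypothetical.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def is_p_invariant(C, x):
    """x weights the places; C^T . x = 0 means the weighted token sum is conserved."""
    return all(e == 0 for e in matvec(transpose(C), x))


def is_t_invariant(C, y):
    """y counts firings; C . y = 0 means those firings restore the marking."""
    return all(e == 0 for e in matvec(C, y))


# Rows: places R, L, RL. Columns: bind (R + L -> RL), unbind (RL -> R + L).
C = [
    [-1, 1],
    [-1, 1],
    [1, -1],
]
print(is_p_invariant(C, [1, 0, 1]))   # R + RL: total receptor
print(is_t_invariant(C, [1, 1]))      # bind once, unbind once

# State equation: firing bind twice and unbind once from m0 gives m1 = m0 + C.u
m0 = [5, 3, 0]
m1 = [m + d for m, d in zip(m0, matvec(C, [2, 1]))]
print(m0, "->", m1)
```

The P-invariant (1, 0, 1) says that tokens in R plus tokens in RL never change, which is the mass conservation reading from the list above. The T-invariant (1, 1) is the bind-unbind loop.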

Page 50: Special Topics in Computational Biology : Formal Methods in Systems Biology

Overview of PNs

• PNs are graphs, and provide tight connection between visualization and modeling

• PN formalism is isomorphic to network reconstruction formalism (reaction networks)

• Many extensions are possible to overcome limitations
– Colored Petri Nets, Hierarchical CPNs, Multi-level PN, Stochastic PNs, etc.

• Extensions provide further modeling capabilities at the expense of analysis.

Page 51: Special Topics in Computational Biology : Formal Methods in Systems Biology

Concluding Remarks

• Goal of course is to explore various representations from CS literature that can be used to model biomolecular systems.

• What opportunities do these representations offer in terms of analysis, simulation, understanding, and scalability?