Modelling the Randomness in
Biological Systems
Ole Schulz-Trieglaff
Master of Science
School of Informatics
University of Edinburgh
2005
Abstract
This dissertation deals with the modelling of biological processes using Stochastic Petri Nets
(SPNs). Petri Nets are a formalism from Computer Science (Petri 1962) that has been used for
many years to model systems such as computer networks. Recently, they have also been
applied to model biological processes such as genetic networks (Goss & Peccoud 1998).
The outcome of this dissertation is twofold. First, a software framework was implemented that
allows creating a SPN model in a graphical editor. This software is called PNK 2e and can be used
to simulate the behaviour of the net using the infrastructure of the Systems Biology Workbench,
a collection of simulation and analysis tools tailored for biological applications. PNK 2e is based
on the Petri Net Kernel (PNK), an Open Source project developed at the Humboldt University in
Berlin, Germany. PNK 2e is also released under an Open Source licence. It can import
Petri Net models from different XML description formats. The graphical representation of
biological models offered by SPNs is very intuitive. Furthermore, a SPN can be simulated using
algorithms that are commonly used in the field of Systems or Theoretical Biology. PNK 2e was the
topic of a scientific poster at the BioSysBio conference 2005 in Edinburgh. It is available online1
and has been announced in a forum dealing with Systems Biology software.
In the second part of this dissertation, PNK 2e was used to simulate genetic oscillators, networks
of genes and proteins that exhibit oscillations with a period close to 24 hours. These systems are
thought to represent the molecular basis of the internal clock of many organisms. Two oscillators of
very different architecture (Gonze, Halloy & Goldbeter 2002, Vilar, Kueh, Barkai & Leibler 2002)
were simulated with different numbers of molecules involved. This dependence on molecule
numbers is of special interest, since it is known that circadian clocks have to work reliably with
only very few molecules. The
obtained results support previous findings (Barkai & Leibler 2000) but also provide new insights
into design features of biomolecular clocks. It was found that one particular architecture is even
driven by fluctuations in the molecular populations. Usually, this noise is considered to be a source
of disturbance, but in this case it is essential for the functioning of the clock. This architecture also
reveals significant robustness in the case of mutations of key genes or changes of rate constants in the
model.
1 www.inf.fu-berlin.de/~trieglaf/PNK2e
Acknowledgements
I am very indebted to my supervisor Prof. Gordon Plotkin for his invaluable advice during my time
in Edinburgh and for reviewing this dissertation.
I would also like to thank Prof. Andrew Millar for his introduction to circadian clocks and many
helpful ideas.
Many thanks to Lucia Castellanos and Malcolm Leiva Gebhard, who reviewed parts of this
dissertation and gave much advice and much appreciated support.
Thanks to Stephen Ramsey (Institute for Systems Biology, Seattle), Frank Bergmann (Keck Gradu-
ate School, Claremont) and Michael Weber (German Aerospace Center) for making their software
available to be used in this project.
Funding was provided by the Students Awards Agency for Scotland and the "Landesregierung des
Saarlandes" (government of the German state of Saarland).
Declaration
I declare that this thesis was composed by myself, that the work contained herein is my own except
where explicitly stated otherwise in the text, and that this work has not been submitted for any other
degree or professional qualification except as specified.
(Ole Schulz-Trieglaff )
Table of Contents
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Scope of this Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Organisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Background 6
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Petri Net theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.1 Untimed Petri Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2.2 Stochastic Petri Nets and Markov Processes . . . . . . . . . . . . . . . . . 8
2.2.3 Representing Biological Processes with Petri Nets . . . . . . . . . . . . . 10
2.3 Kinetics of chemical reactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.1 Deterministic Kinetics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.2 Stochastic Kinetics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4 Simulation Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.1 The Gillespie Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.2 The Gibson-Bruck Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 19
3 Background on Stochastic Models and Tools in Biology 22
3.1 Previous Work on Stochastic Models . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1.1 A Stochastic Model of Pathway Bifurcation in Phage λ . . . . . . . . . . . 23
3.1.2 Stochastic analysis of Biological models with Petri Nets . . . . . . . . . . 24
3.1.3 Analysis of the E.coli Stress Circuit with Stochastic Nets . . . . . . . . . . 26
3.2 Review of other Petri Net Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4 Technical Methodology 32
4.1 The Petri Net Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.1.1 Design and Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1.2 The Extended Kernel (PNK 2e) . . . . . . . . . . . . . . . . . . . . . . . 38
4.2 The Systems Biology Workbench . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.3 Stochastic Simulations with Dizzy . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5 Experiments 50
5.1 The Volterra-Lotka Reactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.1.1 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.2 Stochastic models of circadian rhythms . . . . . . . . . . . . . . . . . . . . . . . 54
5.2.1 The delay-based Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.2.2 The hysteresis-based Model . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3 Synchronising several oscillating cells . . . . . . . . . . . . . . . . . . . . . . . . 69
5.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6 Conclusions 74
6.1 Concluding remarks and Observations . . . . . . . . . . . . . . . . . . . . . . . . 74
6.2 Unsolved Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.3 Suggestions for Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
A User guide to PNK 2e 78
B Delay-based Oscillatory Network 83
C Genetic Oscillator based on Hysteresis 88
D Glossary of biological terms 91
Bibliography 92
List of Figures
2.1 Firing of transitions in a Petri Net . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 A Petri Net Model of Gene Expression . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Example of Michaelis-Menten kinetics . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1 Simplified model of the E.Coli stress circuit . . . . . . . . . . . . . . . . . . . . . 26
4.1 Overview of the extended Petri Net Kernel . . . . . . . . . . . . . . . . . . . . . . 39
4.2 Broker architecture of the SBW . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.1 Petri Net representation of the Lotka-Volterra Reactions . . . . . . . . . . . . . . . 52
5.2 Stochastic simulation of the Lotka-Volterra Reactions . . . . . . . . . . . . . . . . 53
5.3 Core model for circadian rhythms based on delay . . . . . . . . . . . . . . . . . . 55
5.4 Delay-based circadian clock: stochastic and deterministic simulation . . . . . . . . 57
5.5 Stochastic simulation with changing numbers of molecules . . . . . . . . . . . . . 58
5.6 The hysteresis-based model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.7 Hysteresis-based circadian clock: stochastic and deterministic simulation . . . . . 62
5.8 Hysteresis-based circadian clock: simulation with low degradation rate of repressor 64
5.9 Stochastic simulation with changing numbers of molecules (hysteresis) . . . . . . 65
5.10 Simulation of the effects of gene duplication . . . . . . . . . . . . . . . . . . . . . 66
5.11 Simulation of both circadian clocks with low rate constants . . . . . . . . . . . . . 68
5.12 Synchronisation of several cells . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.13 Robustness of the oscillations in both models measured by half-life of autocorre-
lation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
A.1 Screenshot of PNK 2e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
A.2 Screenshot of PNK 2e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
B.1 SPN representation of the circadian clock in Neurospora . . . . . . . . . . . . . . 86
C.1 SPN representation of the hysteresis-based model . . . . . . . . . . . . . . . . . . 90
Chapter 1
Introduction
In recent years, the new scientific field of Systems Biology has aroused much interest. Systems
Biology deals with the application of methods from Mathematics, Computer Science and Physics
to improve our understanding of how biological systems function. It tries to understand the cell in
an integrated manner and to obtain an understanding of its single components by examining their
relationships and relating them to a global view of the cell.
An important aspect of Systems Biology is the search for representations of cellular processes that
can be used to compute their future behaviour. This idea is not new; simple models of
biochemical reactions have been studied for a long time. But the availability of high-throughput
experiments and recent advancements in computational methods and computer power have greatly
improved their potential. However, the search continues for modelling techniques that are able
to capture the full complexity of the cell and its components.
The first goal of this M.Sc. thesis was to develop a software tool that facilitates the modelling of a
biological system with Stochastic Petri Nets. But during the preparation of this project, its scope
was further extended. In addition, we provide a comparison of two general models representing
genetic oscillators. These oscillators are assumed to represent the basis of the circadian rhythm
that is observed in many organisms. The experiments give an example of a successful application
of PNK 2e, the software that was written during the first part of this dissertation. In addition, they
provide an insight into possible architectures for more detailed and realistic models of circadian
clocks.
Project Objectives
The tasks of this dissertation can be summarized as follows:
• Development of a software platform for experiments with Stochastic Petri Nets. In contrast
to software that is already available, this platform should be tailored for biological applica-
tions. As an example, it should support data exchange formats that are commonly used in
the scientific community.
• Application of this software to model an exemplary biological system. Evaluate the use-
fulness of Stochastic Petri Nets for this task and validate the software by comparing the
experimental results to findings published in the literature.
• Further experiment to compare two models describing circadian clocks and how their behav-
iour reacts to changes in parameters and structure of the model.
1.1 Motivation
Petri Net software tools
Currently there is a wealth of software environments available that can be used to create
representations of chemical reactions and to compute their future behaviour. In the early stage of
this M.Sc. dissertation, it was planned to develop SPN software completely from scratch. Then
it was discovered that several other tools with similar functionality already exist. However, none
of these tools was both freely available and suited to biological applications. Therefore it
was decided to develop a new Petri Net tool tailored to the modelling of biological processes.
Several Petri Net tools were found on the Internet that were developed during M.Sc. projects at
other universities. It was estimated that the development of a new tool from scratch would take
almost all of the three months available. It would leave no time to model interesting systems and
to conduct detailed experiments. That is why we decided to use the infrastructure from already
existing Open Source projects in this dissertation. There are some software projects that seemed
to be useful for this task. In the beginning, it was planned to extend the software PIPE (Platform
Independent Petri Net Editor) but this turned out to be far more difficult than expected. PIPE
only supports non-stochastic Petri Nets, and extending it with another net type would have been
very laborious. As a consequence, we decided to use the Petri Net Kernel instead and the Systems
Biology Workbench, a collection of analysis tools for Systems Biology, to simulate the behaviour
of the net. Both software tools are published under the Open Source licence and their authors
were very helpful and provided much advice. The Petri Net Kernel (PNK) was developed at the
Humboldt University in Berlin, Germany, and the Systems Biology Workbench is a collaborative
project between several institutes, among them the California Institute of Technology, the Keck
Graduate Institute in Claremont and the Institute for Systems Biology in Seattle.
Circadian Oscillators
In the second part of this M.Sc. project, we validate the PNK 2e software by modelling so-called
circadian oscillators. Many organisms are known to have an internal clock that has significant
influence on their behaviour and lifecycle. Circadian oscillators or oscillatory networks are be-
lieved to be the source of these clocks on a molecular level. They consist of a small set of genes
and proteins that interact with each other. At least one of the contained proteins exhibits some
rhythmic activity; for example, its concentration in the cell oscillates with a period of about 24 hours. This
clock protein is assumed to regulate the activity of other genes and to drive the circadian rhythm
of the organism. Understanding exactly how these circadian rhythms are created is very
important, for instance to stimulate the growth of food plants. Nevertheless, the exact behaviour of
all components in a typical clock is not yet understood and these details are often very difficult to
examine in biological experiments.
Therefore it has been decided to model two competing architectures for a genetic oscillator with
Stochastic Petri Nets and to simulate them under different conditions. As a first step, we tried to
recreate findings published elsewhere (Gonze, Halloy & Goldbeter 2002, Vilar et al. 2002) in order
to validate our approach. But we also expanded the experiments done by others and tried to gain
new insights into possible models for circadian clocks.
A group of experimental biologists at the School of Biosciences at the University of
Edinburgh works on mathematical models of circadian clocks. This group is led by Prof.
Andrew Millar, who provided much helpful advice and gave valuable hints for the experiments in
this dissertation.
1.2 Scope of this Dissertation
As remarked above, there are already many software tools that can be used to model biological
systems. However, (Stochastic) Petri Nets provide a more formal view of the modelling problem.
Their theory is well researched, and efficient algorithms exist not only to simulate their dynamic
behaviour but also to examine structural properties. The software that is presented in this dissertation
makes use of other Open Source projects and relies on common data exchange formats. This is
particularly important since there are many tools that merely repeat work that has already been
done elsewhere. In addition, many of these tools use their own formats to store their data and are
thus not compatible with other software.
When it comes to the experimental part, we focus on stochastic simulations with different numbers
of molecules and examine the influences of changes in the rate constants. This is important for two
reasons. It is known that genetic circuits have to function reliably under a variety of conditions,
even if only very few instances of the key proteins involved are present. Moreover, the reaction
rates can be influenced by changes in temperature or in the nutrition of the organism. The behaviour of
a genetic oscillator should not be influenced by moderate changes in these constants.
1.3 Organisation
This dissertation assumes that the reader does not have a background in chemical kinetics or
Stochastic Petri Nets. Therefore one chapter is dedicated to explaining the most relevant issues.
A basic knowledge of biology is nevertheless assumed, such as the regulation of genes and the
synthesis of proteins. A glossary of the most important biological
terms used in this dissertation is given in Appendix D. We will also provide a brief introduction
to Markov processes, which are closely related to SPNs. The organisation of this dissertation is given
below.
Chapter Two provides an introduction to the theory of Petri Nets. Stochastic Petri Nets are in-
troduced and their relationship to stochastic kinetics is explained. We also give a comparison of the
deterministic and the stochastic assumption in chemical kinetics. The most common algorithms
for the stochastic simulation of chemical reactions are presented. We will discuss their individual
advantages and how they can be used to simulate Stochastic Petri Nets.
Chapter Three provides an overview of stochastic models and tools in Biology. We present three
related efforts on stochastic modelling of biological processes. We aim at providing an overview
of previous work and discuss its relationship to this dissertation. The second part of this chapter
introduces two software tools that are similar to the software developed during this dissertation.
An overview of their functionality is given and a comparison to our software, PNK 2e, is drawn.
Chapter Four describes the Petri Net Kernel in detail. We enumerate the extensions that we made
and explain why they are necessary. Technical difficulties that were encountered are also discussed
in this chapter. We also give details of other software that was used such as the Systems Biology
Workbench and the description language SBML.
Chapter Five describes the experiments that were performed to validate the new version of the
Petri Net Kernel. We will give a brief introduction to circadian clocks. We successfully replicated
the results of other authors. Furthermore, an experimental comparison of two competing models
for circadian oscillators is given that underlines the usefulness of PNK 2e for experiments in Sys-
tems Biology.
Chapter Six contains an evaluation of this dissertation, summarizes the conclusions and gives
indications for future work.
Chapter 2
Background
2.1 Introduction
This chapter lays the theoretical foundations of this project. A short overview of Petri Net theory is
given and some of its properties are introduced. There are many extensions to this basic theoretical
concept. However, only the theory of Stochastic Petri Nets and its relation to Markov Processes
will be discussed, since this is the model used in this work.
In the second section, an introduction to mathematical modelling in Biology is given. The two
main competing approaches, deterministic models based on differential equations, and stochastic
simulations, are compared. The stochastic approach usually assumes that the behaviour of the
system fulfils the Markov property. Therefore an outline of the relationship
between Stochastic Petri Nets and Markovian random processes is also provided.
One of the advantages of Stochastic Petri Nets is that they can be simulated using efficient simu-
lation algorithms. Therefore, two of these algorithms will be introduced in the last section. These
algorithms are implemented in the Systems Biology Workbench and were used in our experiments.
2.2 Petri Net theory
Petri Nets provide a graphical notation for the formal description of the dynamic behaviour of
systems. Although Petri Nets have been used for the qualitative modelling of computer systems
and communication networks since the 1960s (Petri 1962), their use as a paradigm for quantitative
modelling started only about twenty years ago. Untimed Petri Nets (Place-Transition Nets) are
introduced first. Following that, Stochastic Petri Nets and their relation to Markov Processes are
described.
2.2.1 Untimed Petri Nets
A Petri Net (PN) is a directed bipartite graph with two sets of nodes, transitions and places. Tran-
sitions are drawn as bars and places as circles. Places can contain tokens that are drawn as black
dots. The state of a PN is given by the number of tokens in all places and is called a marking. The
initial placing of tokens is called the initial marking and represents the starting state of the net.
Transitions and places are connected by directed arcs. Input places are the places for which the
arcs point from the place to the transition. Output places are the places for which the arcs point
from the transition to the place. An arc can have an inscription, its multiplicity.
Tokens move in the net according to rules given by the transitions. A transition is said to be enabled
if each input place contains at least as many tokens as given by the multiplicity of the connecting
arc. An enabled transition is executed by removing as many tokens from each input place as given
by the inscription of the arc connecting input place and transition and by inserting as many tokens
into each output place as given by the inscription of the arc pointed from the transition to the output
place.
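The enabling and firing rules above can be sketched in a few lines of Python. This is an illustrative encoding (markings and arc multiplicities as plain dictionaries), not part of any of the tools discussed in this dissertation:

```python
def enabled(marking, inputs):
    """A transition is enabled if every input place holds at least
    as many tokens as the multiplicity of the connecting arc."""
    return all(marking.get(p, 0) >= m for p, m in inputs.items())

def fire(marking, inputs, outputs):
    """Execute an enabled transition: remove tokens from each input
    place and insert tokens into each output place, returning the
    new marking."""
    assert enabled(marking, inputs)
    new = dict(marking)
    for p, m in inputs.items():
        new[p] = new.get(p, 0) - m
    for p, m in outputs.items():
        new[p] = new.get(p, 0) + m
    return new

# Example: a transition consuming 2 tokens from p1 and 1 token
# from p2, producing 1 token on p3.
m0 = {"p1": 3, "p2": 1, "p3": 0}
m1 = fire(m0, {"p1": 2, "p2": 1}, {"p3": 1})
```

After firing, the transition is no longer enabled, since p2 is empty.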
[Figure: two Petri Nets with arc multiplicities 3, 2 and 2, shown (a) before and (b) after firing.]
Figure 2.1: Example of the execution of a transition in a Petri Net.
Starting from the initial marking and following the firing rules we can progress through all possible
states of the net. This procedure is called the token game. The set of all possible states of a net,
given a certain initial marking, is called the reachability set with respect to this initial marking.
Different initial markings may give rise to different reachability sets. For this reason, the initial
marking is an important part of the model.
The token game gives rise to the reachability graph. This graph contains all markings encountered
during the token game as nodes and the transitions between these markings as arcs.
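The token game that produces the reachability graph is in essence a breadth-first search over markings. The following sketch (an illustrative encoding, independent of PNK 2e) collects every reachable marking as a node and every firing as an arc:

```python
from collections import deque

def reachability_graph(initial, transitions):
    """Breadth-first 'token game': explore every marking reachable
    from the initial marking. `transitions` is a list of
    (inputs, outputs) pairs, each a dict {place: multiplicity}.
    Markings are stored as sorted tuples so they are hashable."""
    start = tuple(sorted(initial.items()))
    nodes, arcs = {start}, []
    queue = deque([start])
    while queue:
        marking = dict(queue.popleft())
        for i, (ins, outs) in enumerate(transitions):
            if all(marking.get(p, 0) >= m for p, m in ins.items()):
                new = dict(marking)
                for p, m in ins.items():
                    new[p] -= m
                for p, m in outs.items():
                    new[p] = new.get(p, 0) + m
                key = tuple(sorted(new.items()))
                arcs.append((tuple(sorted(marking.items())), i, key))
                if key not in nodes:
                    nodes.add(key)
                    queue.append(key)
    return nodes, arcs

# Two transitions shuttling a single token between places p and q:
nodes, arcs = reachability_graph(
    {"p": 1, "q": 0},
    [({"p": 1}, {"q": 1}), ({"q": 1}, {"p": 1})])
```

For this tiny net the reachability set has exactly two markings, connected by two arcs. For realistic nets the state space can grow very quickly, which is one reason simulation is often preferred over exhaustive exploration.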
2.2.2 Stochastic Petri Nets and Markov Processes
Stochastic Petri Nets (SPNs) are a popular extension to basic Petri Net theory. In a SPN each
transition fires with an exponentially distributed delay. In other words, each transition has an
associated firing rate µ, which is the parameter of a (negative) exponential distribution with the
density function f(x) = µe^(−µx). This distribution has the memoryless property. If X is an
exponentially distributed random variable, then this property states:

P(X > t + s | X > t) = P(X > s),   t > 0, s > 0
Proof:

P(X > t + s | X > t) = P(X > t + s) / P(X > t)
                     = e^(−µ(t+s)) / e^(−µt)
                     = e^(−µs)
                     = P(X > s)
The exponential distribution is often used to model waiting times until some event occurs. In a
SPN, such an event would be the execution of the next transition. In this context, knowing that
t time units have already elapsed without the execution of a transition does not give us any
additional information about when the next execution will occur. In other words, if no change of
state has happened by time t, then the distribution of the remaining sojourn time in the current
state is the same as if no time had passed.
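The memoryless property can also be checked empirically: among sampled exponential delays, those that have already survived past time t are distributed, beyond t, just like fresh delays. A small sampling sketch (the rate, sample size and test values are arbitrary choices):

```python
import random

random.seed(0)
mu = 2.0  # firing rate, i.e. parameter of the exponential distribution
samples = [random.expovariate(mu) for _ in range(200_000)]
t, s = 0.3, 0.5

# Unconditional probability P(X > s)
p_uncond = sum(x > s for x in samples) / len(samples)

# Conditional probability P(X > t + s | X > t): restrict attention
# to the samples that have already survived past time t.
survivors = [x for x in samples if x > t]
p_cond = sum(x > t + s for x in survivors) / len(survivors)

# By the memoryless property the two estimates should agree
# up to sampling noise.
print(p_uncond, p_cond)
```

For µ = 2 and s = 0.5 the exact value of both probabilities is e^(−1) ≈ 0.368.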
If one or several transitions are enabled in some state of the SPN, a delay for each enabled
transition is sampled from its exponential distribution. The transition with the smallest delay is
then executed first.
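This "race" between enabled transitions can be sketched as follows; the transition names and rates are made up for illustration:

```python
import random

def next_firing(enabled_rates, rng=random):
    """Given the firing rates of the currently enabled transitions,
    sample an exponentially distributed delay for each and return
    the transition that fires first, together with its delay."""
    delays = {t: rng.expovariate(rate) for t, rate in enabled_rates.items()}
    winner = min(delays, key=delays.get)
    return winner, delays[winner]

random.seed(1)
transition, delay = next_firing({"degrade": 0.1, "synthesize": 5.0})
# A transition with a much larger rate usually (but not always)
# wins the race.
```

A standard result is that the winner of such a race is transition i with probability µ_i / Σ µ_j, which is the basis of the Gillespie algorithm described in section 2.4.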
In recent years, there has been a lot of interest in the application of Markov processes to Biology;
for instance, in the modelling of the evolution of genome sequences. In fact, a SPN is nothing
other than a representation of a Markov process, and the reachability graph is simply the state transition
diagram of the Markov process. In order to underline this fact we will give a brief definition of a
Markov process and sketch the aforementioned relationship.
2.2.2.1 Markov Processes
A sequence X = (X(t))_{t ∈ N} of random variables X(t) is called a discrete-time stochastic process.
The state space of this process is the set of all possible values that X(t) can assume (Ross 1996). A
Markov process is a stochastic process X(t) which has the Markov or memoryless property: given
the value of X(t) at some time t, future values X(s) of the process for s > t do not depend on
knowledge of the past history X(u) for u < t:

P(X(t_{n+1}) = x_{n+1} | X(t_n) = x_n, ..., X(t_1) = x_1) = P(X(t_{n+1}) = x_{n+1} | X(t_n) = x_n)
This memoryless property means that once we have arrived in a particular state, the future behav-
iour is always the same regardless of how we arrived in the state. Markov processes are popular
since the underlying theory is relatively simple. The process can be visualised by its state-transition
diagram which contains the states of the process together with the connecting transitions.
If we are in a state s ∈ S of a Markov process, the distribution of time until the next change of state
is independent of the time of the previous change of state due to the Markov property. In other
words, the waiting or sojourn time in a state is memoryless. The only probability distribution that
has this attribute is the exponential distribution, and therefore the waiting times until a change of
state in a Markov process follow this distribution. Each transition in a Markov process therefore
has a rate which is the parameter of an exponential distribution.
As mentioned above, a SPN gives rise to a Markov process. If we compute the reachability graph
of the SPN, this graph is isomorphic to the state-transition diagram of the Markov Process. But
this also means that we can analyse the behaviour of this process and by doing this, obtain new
knowledge about the dynamics of the Petri Net model.
Steady State distribution of a Markov Process
An important property of a Markov process is its behaviour over a long period of time. Under
certain conditions, the process will settle to some regular or steady state behaviour. This does not
mean that the process has stopped and does not make any transitions. But it does mean that the
probability distribution of the process being in a certain state does not change anymore.
We denote the probability that the Markov process is in state x_k at time t by π_t(x_k). The steady
state has been reached if this probability no longer depends on the time. Thus we denote the
steady state distribution by π, and π(x_k) is then the probability that the model is in state x_k after
the steady state is reached.
Theorem (Ross 1996): A steady state distribution π(x_k), x_k ∈ S, exists for every Markov process
with the following properties:
• its transition rates are time homogeneous, i.e. they do not depend on the time at which we
observe the process,
• it has a finite number of states,
• and it is irreducible, meaning that all states in S can be reached from all other states by
following the transitions of the process.
The steady state distribution can be calculated by using the so-called global balance equations.
These equations give rise to a system of linear equations that can be solved by appropriate
algorithms.
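For a process with only two states the global balance equations can be solved by hand, which makes a compact illustration (the rates chosen here are arbitrary):

```python
# Two-state Markov process: state 0 -> 1 with rate q01,
# state 1 -> 0 with rate q10.
# Global balance: pi0 * q01 = pi1 * q10 (probability flow out of a
# state equals the flow into it), with normalisation pi0 + pi1 = 1.
# Solving the two equations gives pi0 = q10 / (q01 + q10).
def two_state_steady_state(q01, q10):
    pi0 = q10 / (q01 + q10)
    return pi0, 1.0 - pi0

pi0, pi1 = two_state_steady_state(q01=1.0, q10=3.0)
# The process spends three quarters of its time in state 0,
# since it leaves state 0 more slowly than it returns to it.
```

For larger state spaces the same balance equations are assembled into a matrix and handed to a linear solver, which is exactly where the size of the reachability set becomes the limiting factor.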
If we derive the underlying Markov process of the Petri Net by computing the reachability graph
of the net, we can compute the steady state of the Markov process. In practice, not every SPN has a
steady state distribution (because the state space might not be finite) or it would be too expensive to
compute it (if the number of states is very large). This makes observations obtained from stochastic
simulations even more valuable.
2.2.3 Representing Biological Processes with Petri Nets
The first research involving the application of Petri Nets to biological models was conducted by
Reddy, Liebman & Mavrovouniotis (1993). They modelled the combined glycolytic and pentose
phosphate pathway in the erythrocyte cell with a Petri Net and used Petri Net theory to analyse
qualitative properties of this pathway.
Figure 2.2: A Petri Net model of gene expression. This is an abstract model of gene expression and protein
synthesis. The gene becomes active with rate λ. The protein is synthesized with rate v and degraded with
rate δ. The example is taken from Goss & Peccoud (1998).
In general it is easy to represent chemical reactions with a Petri Net. We think of places as
chemical species and of transitions as reactions occurring between these species. The multiplicity
of an arc is given by the stoichiometric coefficient of the species involved in the reaction. Tokens
usually represent single molecules but can also be seen as a fixed amount of molecules such as a
mole (= 6.022 × 10^23 molecules), a common base unit in Chemistry.
In recent years, there have been some publications in which Stochastic Petri Nets were used to
model coupled chemical reactions (Goss & Peccoud 1998, Srivastava, Peterson & Bentley 2001).
This is due to the fact that if the participating chemical species occur only at very low
concentrations, then a stochastic model of the chemical kinetics is more accurate than a deterministic one. In
this context, we assume that each reaction occurs with a certain probability and the rates of firings
in the net are given by the stochastic rate laws. We will give details of this stochastic assumption
in chemical kinetics in the next section.
2.3 Kinetics of chemical reactions
The next two sections give an overview of chemical kinetics. The basics of the deterministic
formulation are outlined and a comparison to the stochastic approach is drawn.
2.3.1 Deterministic Kinetics
Chemical kinetics are concerned with the time evolution of a reaction system. Classical or deter-
ministic kinetics are expressed in terms of the concentrations of the chemicals. These concentra-
tions can vary continuously as the reactions progress.
We assume that the rate of a reaction follows the mass action law, which means that this rate is
proportional to the concentration (and in turn to the mass) of each reactant raised to the power
of its stoichiometric coefficient (Cox & Nelson 2004). In other words, the rate of change of a
product is proportional to the product of the reactant concentrations. As an example, in the
second-order reaction

A + B → C
the rate of change of C, dC/dt, is given by k · [A] · [B], where [A] and [B] denote the concentrations
of A and B respectively and k is some constant. Thus the rate of change of C can be modelled by
the differential equation

dC/dt = k · [A] · [B]
The rate constant k needs to be specified as well as the initial concentrations of A and B. This
procedure can of course be extended to several coupled reactions. In this case, the reactions give
rise to a set of coupled differential equations. These equations can be used to compute the time
evolution of the reactions either by solving them analytically (which is not often possible) or by
numerical integration.
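As an illustration of the numerical route, the single reaction A + B → C can be integrated with the forward Euler method (a minimal sketch; the rate constant, initial concentrations and step size are invented for this example):

```python
def integrate(k=0.1, a=1.0, b=0.8, c=0.0, dt=0.01, steps=1000):
    """Forward-Euler integration of d[C]/dt = k[A][B] for A + B -> C."""
    for _ in range(steps):
        rate = k * a * b   # mass-action rate law
        a -= rate * dt     # each reaction event consumes one A ...
        b -= rate * dt     # ... and one B ...
        c += rate * dt     # ... and produces one C
    return a, b, c

a, b, c = integrate()
# Mass is conserved: [A] + [C] and [B] + [C] keep their initial values.
```

Since the updates of the reactants and the product are symmetric, the conserved sums serve as a simple sanity check of the integration.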
The idea behind the deterministic formulation of chemical kinetics is that even if single molecules
move randomly, the overall behaviour of a large group of molecules follows a pattern and this
pattern can be modelled deterministically.
Even if a set of differential equations cannot be solved analytically, it is often possible to determine
characteristics of their steady state behaviour. This can be done by setting the right side of the
equations to zero and solving for the concentrations of the reactants. But even if we know the steady
states of the system, we do not know whether a particular set of initial conditions will lead to one of
these states, nor how it will be reached. An example is given in our experiments with the Lotka-Volterra
reactions presented in section 5.1.
Figure 2.3: Typical time course of a reaction following Michaelis-Menten kinetics (product
concentration plotted against time). The rate of the synthesis of the product P increases rapidly
but converges after some time. This is due to the saturation of the enzyme.
2.3.1.1 Michaelis-Menten kinetics
Michaelis-Menten kinetics are a special case of deterministic kinetics. They are named after
Leonor Michaelis (1875-1949) and Maud Menten (1879-1960). Since these kinetics occur several
times in the experimental section of this work, a brief explanation is given in this section.
We consider a set of reactions in which a substrate S is converted into a product P only in the
presence of an enzyme E:
S +E ↔ ES (2.1)
ES → P+E (2.2)
We assume that the reactions follow mass-action kinetics. The forward reaction of 2.1 has the rate
constant k1, the backward reaction k−1 and reaction 2.2 the constant k2. These reactions can be
modelled by a set of coupled differential equations (Cox & Nelson 2004).
Using several assumptions, it was shown that these reactions can be simplified to one single
equation expressing the rate of change of the product [P] in terms of the Michaelis-Menten constant
KM = (k−1 + k2)/k1 and the maximum rate Vmax of the reaction. Vmax is given by [E0] × k2 where
[E0] is the total concentration of the enzyme E. Thus the Michaelis-Menten equation is given by

d[P]/dt = Vmax × [S] / (KM + [S])
Both constants, KM and Vmax, can be determined experimentally. Vmax can be obtained by increasing
the substrate concentration until the reaction reaches its maximum rate, and KM is equal to the
substrate concentration at which d[P]/dt equals Vmax/2 (set [S] = KM in the equation above).
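The half-maximal property can be checked directly (a small sketch with invented values for Vmax and KM; these are not parameters from this work):

```python
def mm_rate(s, vmax=2.0, km=0.5):
    """Michaelis-Menten rate d[P]/dt = Vmax * [S] / (KM + [S])."""
    return vmax * s / (km + s)

# At [S] = KM the rate is exactly half-maximal ...
half_maximal = mm_rate(0.5)        # Vmax / 2 = 1.0
# ... and for [S] >> KM the rate saturates towards Vmax.
near_saturation = mm_rate(500.0)
```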
This approximation is also part of the SBML description language (section 4.2) and occurs several
times in the deterministic model of the biomolecular clock in Neurospora (section 5.2). In order
to simulate Michaelis-Menten kinetics stochastically, this equation is usually decomposed again
into its elementary steps. The problem is that the decomposition of the deterministic formulation
to the elementary stochastic steps leads to ambiguities since the elementary rate constants are not
specified by the deterministic model.
2.3.2 Stochastic Kinetics
As outlined before, classical mass-action kinetics assume that the behaviour of a large number
of molecules follows deterministic patterns. The reaction constants are regarded as rates and the
various species concentrations are represented by continuous, single-valued functions of time. In
many cases, random fluctuations and correlations do not play a significant role in the behaviour
of a system, and this assumption is adequate. Nevertheless, there are many examples for which this
approach turned out not to be correct (Arkin, Ross & McAdams 1998).
We will start this section by outlining the central assumptions of the stochastic approach to chem-
ical kinetics. After this we will describe how this approach relates to the deterministic model and
how stochastic rate constants can be converted into deterministic ones and vice versa.
In a stochastic context, the reaction constants are viewed as reaction probabilities per unit time.
The temporal behaviour of the system is modelled as a Markovian random walk on the space of the
molecular populations of the species. It was proved that the stochastic formulation reduces to the
deterministic formulation in the thermodynamic limit, i.e. when the numbers of molecules and the
volume approach infinity (Kurtz 1971). We consider a set of n chemical species X1,X2, . . . ,Xn and
a set of m reactions R1,R2, . . . ,Rm. If the container in which the reactions take place is well stirred
and in thermal equilibrium, it can be shown that the probability that two molecules Xi and Xj,
i, j ∈ 1, . . . , n, collide is constant (Gillespie 1977). Each reaction Ri can therefore be characterized
by a single constant ci which is defined as the average probability that a particular combination
of Ri reactant molecules will react according to reaction Ri. The probability of the next occurrence
of reaction Ri in the time interval dt is then ci×dt. As an example, let us consider again a simple
second-order reaction:
A+B → C (2.3)
The rate constant for this reaction gives the probability that a pair of molecules A and B reacts to
produce C. Since there are A×B different combinations of molecules of this type, the probability
that this reaction will occur somewhere inside the container in the next infinitesimal time interval
dt is given by A × B × ci × dt, where A and B denote the numbers of molecules of species A and B
and ci is the reaction constant of the reaction as defined above.
If the reaction had been of the form

A → B

then this probability would have been ci × A, and in the case

2A → B

the probability would have been A(A−1)/2 × ci ≈ (A²/2) × ci. We now examine the relationship
between the reaction parameter ci and the deterministic rate constant ki. This is important since much
of the literature on biochemical rate constants is dominated by a deterministic point of view. Fur-
thermore, if we want to compare deterministic and stochastic formulations of the same model, we
need to be able to convert these deterministic constants into their stochastic counterparts.
Referring again to the simple example of reaction 2.3, A × B × ci dt gives the probability that this
reaction will occur somewhere in the container in the next time interval dt. Dividing by the volume
V leads us to the average reaction rate per unit volume, A × B × ci/V. This is already close to the
deterministic formulation, in which the rate is defined as the average reaction rate per unit volume.
But the deterministic constants are expressed in terms of concentrations [A] = A/V and [B] = B/V,
not in terms of numbers of molecules. If we replace A and B in the stochastic rate law by [A] and
[B], we obtain [A] × [B] × V × ci, which is the rate of change in terms of the concentrations.
Since the deterministic rate law is defined as ki × [A] × [B], we can infer that

ki = V × ci
for a bimolecular reaction of the form 2.3. For a reaction 2A→ B, we would have obtained ki =
V × ci/2. For a monomolecular reaction such as A→ B, ki and ci are equal. In general, we can
conclude that the relationship between ci and ki is simple in a mathematical sense. Conversions
and comparisons between parameters in deterministic and stochastic approaches are possible.
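These conversions can be collected in a small helper (a sketch; V denotes the container volume, and the three reaction forms are exactly the cases discussed above):

```python
def deterministic_k(c, volume, reaction):
    """Convert a stochastic rate constant c into its deterministic
    counterpart k for the three reaction types discussed above."""
    if reaction == "A -> B":        # monomolecular: k = c
        return c
    if reaction == "A + B -> C":    # bimolecular:   k = V * c
        return volume * c
    if reaction == "2A -> B":       # dimerization:  k = V * c / 2
        return volume * c / 2
    raise ValueError("unknown reaction type: " + reaction)

def stochastic_c(k, volume, reaction):
    """Inverse conversion: deterministic k back to stochastic c.
    Works because k is linear in c for every reaction type."""
    return k / deterministic_k(1.0, volume, reaction)
```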
2.4 Simulation Algorithms
Starting from the stochastic approach to chemical kinetics described in the previous section, two
well-known stochastic simulation algorithms will be introduced: the First Reaction Method de-
veloped by Gillespie (1976) and the Next Reaction Method by Gibson & Bruck (2000). Both
algorithms are exact procedures for numerically simulating the time evolution of a well-stirred
chemically reacting system.
Both procedures are used in the experimental section in order to simulate the dynamic behaviour
of a Stochastic Petri Net model. The Gillespie algorithm is described in more detail, explaining its
relationship to a Stochastic Petri Net and drawing a comparison to the Next Reaction Method.
2.4.1 The Gillespie Algorithm
The Gillespie algorithm is the most popular algorithm for the stochastic simulation of coupled
chemical reactions. It was proposed by Gillespie (1976) and comes in two different versions: The
First Reaction Method and the Direct Method. Following the structure of this paper, the theoretical
concept that underlies the algorithm will be introduced first. Then, the traditional master-equation
and the simulation approach will be compared.
In a deterministic setting, we assume that the time evolution of a chemically reacting system is
continuous and deterministic. It is evident that this is not correct since the molecular population
levels can only change by discrete integer amounts. In addition, the time evolution is usually not a
deterministic process, but is governed by the random movements of single molecules. The Gille-
spie algorithm is useful if we want to simulate reactions with very few molecules involved. This
is the case for many regulatory networks. Furthermore, a stochastic approach is appropriate for
systems that exhibit unstable behaviour. In this case, even small fluctuations in the molecular
populations can drive the system out of its current state. This causes drastic changes in the system that
could not be predicted by a deterministic formulation.
The traditional approach, before Gillespie developed his algorithm, was the master-equation ap-
proach. In this approach, random variables are used to denote each possible state of the system,
i.e. the combinations of molecular populations. The master equation or Chapman-Kolmogorov equation
is a system of coupled differential equations that describes the transition probabilities in the sys-
tem. In principle, it is possible to write down and solve the master equation for a system, which
would give us complete knowledge of the system's dynamics. However, this is only possible for
simple systems with very few states. For larger systems this approach becomes intractable.
As outlined above, the Gillespie algorithm follows the assumption of stochastic kinetics. We as-
sume that for each reaction Rµ a stochastic reaction constant cµ exists that gives the probability that
a particular combination of molecules will react according to Rµ. This assumption requires that the
system is kept well mixed, either by direct stirring or simply by requiring that nonreactive molecular
collisions occur much more frequently than reactive molecular collisions (Gillespie 1977).
This is the fundamental hypothesis of the Gillespie algorithm.
The algorithm generates a single sample trajectory of the chemical process. This can be interpreted
as a random walk through the space of possible states. At each time step, the system is exactly
in one state defined by the molecular populations in the system. The Gillespie algorithm then
picks a reaction and executes it according to a probability distribution such that the probability of
the generated trajectory is the same as the one the master equation would assign to it. By
generating many trajectories and averaging their results, we can estimate any quantity of interest,
such as the average number of molecules of a species at some time t.
Gillespie (1976) proposed two methods for the simulation of the trajectories. The Direct Method
calculates explicitly which reaction occurs next and when it occurs. The First Reaction Method
generates for each reaction µ a time τµ at which it occurs and then executes the reaction which
occurs first. We will describe both methods now.
The Direct Method
This method relies on the probability density P(µ,τ) that the next reaction is µ and occurs at time τ.
We already introduced the stochastic reaction constant cµ which is the probability that a particular
combination of molecules will react according to reaction Rµ. Let hµ be the number of distinct
combinations of Rµ reactant molecules in a certain state. For a bimolecular reaction X +Y → Z,
hµ would have the form XY , for a reaction of the form 2X → Z, hµ would be X(X − 1)/2 etc.
We can then define aµ dt = hµ × cµ × dt as the probability that reaction Rµ will occur within the
next time interval dt, given the state (X1, X2, . . . , XN) at time t. It can be shown (Gillespie 1976) that

P(µ, τ) dτ = aµ × exp(−τ × ∑j aj) dτ
This equation can be used to compute directly the next reaction to occur. Integrating P(µ, τ) over
all τ from 0 to ∞ yields

P(Reaction = µ) = aµ / ∑j aj

In a similar way, we can obtain the distribution of the waiting time until the next reaction occurs by
summing P(µ, τ) over all µ, which gives us

P(τ) dτ = (∑j aj) × exp(−τ × ∑j aj) dτ
This simply means that the waiting time to the next reaction is exponentially distributed with
parameter ∑j aj. These two distributions give rise to the Direct Method:
1. Set the initial numbers of molecules and set t to 0.
2. Calculate aµ for all reactions µ.
3. Choose a reaction µ according to the distribution P(Reaction = µ).
4. Choose τ according to P(τ).
5. Execute reaction µ by changing the numbers of molecules accordingly. Update the time to
t + τ.
6. Go to step 2.
This algorithm needs two random numbers per iteration. It takes time proportional to the number of
reactions to update the aµ values, since they depend on the current state of the system, and time
proportional to the number of reactions to calculate ∑j aj.
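The six steps above can be sketched in a few lines of Python (a minimal illustration, not the implementation used in this project; the dict/lambda encoding of the reaction network and the constant c = 0.01 are assumptions of this sketch):

```python
import random

def direct_method(state, reactions, t_end, seed=0):
    """Gillespie Direct Method. state: dict of molecule counts;
    reactions: list of (propensity_function, update_function) pairs."""
    rng = random.Random(seed)
    t = 0.0
    while t < t_end:
        props = [p(state) for p, _ in reactions]   # step 2: all a_mu
        a0 = sum(props)
        if a0 == 0.0:
            break                                  # no reaction can fire
        t += rng.expovariate(a0)                   # step 4: tau ~ Exp(a0)
        r = rng.uniform(0.0, a0)                   # step 3: P(mu) = a_mu / a0
        acc = 0.0
        for (_, fire), a_mu in zip(reactions, props):
            acc += a_mu
            if r < acc:
                fire(state)                        # step 5: execute reaction
                break
        # step 6: loop back to step 2
    return state

# Example: the single reaction A + B -> C with stochastic constant 0.01.
final = direct_method(
    {"A": 100, "B": 100, "C": 0},
    [(lambda s: 0.01 * s["A"] * s["B"],
      lambda s: s.update(A=s["A"] - 1, B=s["B"] - 1, C=s["C"] + 1))],
    t_end=1000.0,
)
```

Note that the molecule counts change by whole integers at each firing, in contrast to the continuous trajectories of the deterministic formulation.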
The First Reaction Method
This algorithm computes a putative time τµ for each reaction µ to occur, a time the reaction would
occur at if no other reaction occurred first. The reaction with the lowest τµ is then executed first.
1. Set the initial numbers of molecules and set t to 0.
2. Calculate aµ for all reactions µ.
3. For each reaction µ, compute a delay τµ according to an exponential distribution with parameter aµ.
4. Let µ∗ be the reaction with the smallest delay τµ∗.
5. Execute reaction µ∗ by changing the numbers of molecules accordingly. Update the time to
t + τµ∗.
6. Go to step 2.
These two algorithms seem to be very different, but it can be proved that τ and µ are chosen ac-
cording to the same probability distribution and both approaches are therefore equivalent (Gillespie
1976). The First Reaction method computes one random number per reaction in each iteration, needs
time proportional to the number of reactions to update the ai values and needs time proportional to
the number of reactions to identify the reaction with the smallest putative time.
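The First Reaction steps above admit a similarly compact sketch (again illustrative, not the project's code; the state and reaction encoding is an assumption of this example):

```python
import random

def first_reaction_method(state, reactions, t_end, seed=0):
    """state: dict of molecule counts; reactions: list of
    (propensity_function, update_function) pairs."""
    rng = random.Random(seed)
    t = 0.0
    while True:
        # steps 2-3: draw a putative exponential delay for every reaction
        candidates = [(rng.expovariate(p(state)), fire)
                      for p, fire in reactions if p(state) > 0.0]
        if not candidates:
            break                       # no reaction can fire any more
        tau, fire = min(candidates, key=lambda c: c[0])   # step 4
        if t + tau > t_end:
            break
        t += tau
        fire(state)                     # step 5: execute earliest reaction
    return state

# Example: the single reaction A + B -> C with stochastic constant 0.01.
final = first_reaction_method(
    {"A": 50, "B": 50, "C": 0},
    [(lambda s: 0.01 * s["A"] * s["B"],
      lambda s: s.update(A=s["A"] - 1, B=s["B"] - 1, C=s["C"] + 1))],
    t_end=1000.0,
)
```

Every iteration discards all but one of the drawn delays, which is exactly the waste the Next Reaction Method described below removes.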
The Gillespie algorithm and Stochastic Petri Nets
One can already anticipate that there is a close relationship between the First Reaction method and
the simulation of a Stochastic Petri Net. In fact, the Gillespie algorithm fulfils the Markov property
since the transition probabilities to each new state depend on the current state only. If a SPN is
chosen to represent a set of coupled reactions and if its transition rates are chosen according to the
stochastic mass-action rate laws, then its behaviour can be simulated with the First Reaction method.
The SPN can
be seen as a direct graphical representation of the underlying Markov process.
In some cases the execution of a reaction might affect other reactions as well, for instance if some
reactions share the same reactants. A SPN gives some information about those dependencies since
reactions with the same reactants would be represented by transitions with the same input places
in the SPN. But the Gillespie algorithm does not make use of this information. In the next section,
an algorithm is introduced that takes these dependencies into consideration.
2.4.2 The Gibson-Bruck Algorithm
This algorithm is also called the Next Reaction Method (Gibson & Bruck 2000) and is an improve-
ment of the First Reaction Method as described in the last sections. The Gillespie algorithm in
both variants was extremely successful and is still commonly used. Its main disadvantage is the
amount of computational effort that is needed to conduct simulations with many molecular species
and many reactions.
The Next Reaction Method addresses this problem by offering a reduction of the running time
while still being exact. In our experiments it turned out that the size of the models was not large
enough to cause a substantial difference in the running time. Nevertheless, this algorithm was used
to obtain results averaged over a large number of simulation runs, where even a small saving per
run adds up.
If one examines the Gillespie algorithm carefully, one can see that it in fact does more work than
necessary. It recomputes the rate and delay of each reaction even in cases when they
have not changed. In contrast to this, the approach of Gibson and Bruck stores both values for
each reaction in an efficient data structure and only recomputes them when necessary. Normally it is
not legitimate to simply re-use random numbers such as the delay of each reaction; here it is,
because the delays follow an exponential distribution, which is memoryless.
The Next Reaction Method:
1. Set the initial numbers of molecules, set the time t = 0, calculate the stochastic rate ai for
each reaction i and calculate a delay τi for each reaction.
2. Let j be the index of the smallest τi.
3. Execute reaction j by changing the numbers of molecules accordingly.
4. Update a j according to the new state and compute a new putative delay for reaction j using
the new a j.
5. For each reaction i whose parameter ai is affected by the execution of reaction j:
(a) Update the rate ai, but store the old value as ai,old.
(b) Set τi = (ai,old/ai,new) × (τi − t) + t (see Gibson & Bruck 2000).
(c) Discard the old rate ai,old.
6. Go to step 2.
The Next Reaction method needs to know which ai’s are affected by the execution of a reaction.
Gibson and Bruck state that this can be achieved by using a data structure called dependency graph.
This graph has a node for each reaction. A directed arc connects nodes i and j if the rate of reaction
j is affected by the execution of reaction i. The graph can be constructed automatically before the
algorithm starts by searching for species that are reactants or products of one reaction and
reactants of another reaction. After the execution of a reaction i, the algorithm determines the rates
that need to be changed by examining the child nodes of node i in the graph.
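A dependency graph of this kind can be constructed automatically; the following sketch uses a set-based reaction encoding that is an assumption of this example (and, as a simplification, treats every reactant or product of a reaction as potentially changed, even species with zero net change):

```python
def dependency_graph(reactions):
    """reactions: list of (reactant_set, product_set) pairs.
    An arc i -> j means that firing reaction i may change the rate
    of j, because i alters a species among j's reactants."""
    graph = {i: set() for i in range(len(reactions))}
    for i, (reactants_i, products_i) in enumerate(reactions):
        changed = reactants_i | products_i     # species i may alter
        for j, (reactants_j, _) in enumerate(reactions):
            if changed & reactants_j:
                graph[i].add(j)
    return graph

# Example: R0: A + B -> C,  R1: C -> D,  R2: D -> A
deps = dependency_graph([({"A", "B"}, {"C"}),
                         ({"C"}, {"D"}),
                         ({"D"}, {"A"})])
# Firing R0 changes A, B and C, so it affects R0 itself and R1.
```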
In order to obtain the reaction with the smallest delay efficiently, Gibson and Bruck propose an
indexed priority queue. The queue is implemented by another graph which offers fast search and
insert operations.
The Next Reaction method fits a Stochastic Petri Net more naturally since it operates on a more
local level. The graph representation of a SPN is similar to the dependency graph proposed by
Gibson and Bruck. For a reaction which is represented by a transition in the SPN, the set of all
places which are connected to this transition are molecular species that are affected by this reac-
tion. These species might be reactants of other reactions and the rates of these reactions have to
be changed. This is an interesting relationship, but since the Systems Biology Workbench (Hucka,
Finney, Sauro, Bolouri, Doyle & Kitano 2002) already offers efficient implementations of the
Gibson-Bruck algorithm, it was not exploited here.
The Gibson-Bruck algorithm needs only one new random number per iteration and recomputes
reaction rates and delays only when necessary. It is the fastest exact simulation algorithm, but since
it is not easy to implement, the Gillespie algorithm is still the most commonly used approach.
Chapter 3
Background on Stochastic Models and
Tools in Biology
After laying the theoretical foundations of this project in the last chapter, some previous work on
stochastic modelling of biological phenomena will now be reviewed. It might be surprising, but
random fluctuations or noise play a very important role in the regulation of genes and in turn can
lead to a random behaviour on the level of metabolic or regulatory pathways. One of the most
famous examples is the Phage λ decision circuit (Ptashne 1992). Arkin et al. (1998) developed
a stochastic model of this gene network that was able to explain the apparently random decision
between lysogenic or lysic cycle. A brief description of this model, which was one of the first and
most comprehensive stochastic models of a gene network, is given, together with a review of two
other publications that used the same modelling formalism. A comparison of their approach with
the findings of this dissertation is given. To our knowledge, these two publications are the first
scientific projects in which Stochastic Petri Nets where used to model the dynamic behaviour of
biomolecular interactions.
The last section in this chapter is dedicated to the technical methodology of this project. Two
software tools that can be used to edit and simulate Petri Nets are presented. A comparison is
drawn between these tools and the Petri Net Kernel, the tool that was used and extended in this
study.
3.1 Previous Work on Stochastic Models
3.1.1 A Stochastic Model of Pathway Bifurcation in Phage λ
The following simplified description of the pathway bifurcation in phage λ follows the paper by
Arkin et al. (1998). They were the first to present such a detailed stochastic model of a regulatory
gene network. They also proved that random fluctuations in the molecular populations can have
drastic consequences for the organism as a whole. We provide a short description of the model and
draw a comparison to our work. In reality, the decision circuit is much more complex and contains
various other factors. But this description focuses on the most important ones.
The model
The phage λ is a bacteriophage, a parasite that attacks E. coli bacteria. The phage attaches its tail
to the surface of a bacterium and injects its chromosome into it. After this, the infected bacterium
can switch between two states: in some cases the phage chromosome replicates itself and new
phages are produced in the host cell. According to Ptashne (1992) it takes about 45 min until
the infected bacterium is destroyed (lysis) and about 100 new phages are released. In other cases
the phage chromosome enters the lysogenic state, integrates into the host DNA and is replicated
together with the bacterium. The phage chromosome therefore stays in a dormant state, and only
certain events, for instance ultraviolet irradiation, can lead to lysis of the bacterium. Apart from this
so-called stress-induced decision between two states, it was also observed that the change between
the two states can occur at random.
The decision whether an infected E.coli cell enters either the lysic or lysogenic pathway is mainly
controlled by two different proteins: Cro and CI. Cro starts the lytic cycle. If it is expressed con-
stantly over a longer period of time, the chromosome of the λ phage is replicated and the lysis of
the bacterium is inevitable. On the other hand, the CI protein controls the lysogenic pathway. If it is
expressed constantly, other phage genes are suppressed and the phage enters a dormant state and
the lysogenic cycle starts. However, CI is usually not expressed after infection with the phage. But its
expression can be induced by another protein, CII. This protein is usually degraded very quickly
and CI is not expressed. But if a fourth protein, CIII, is expressed at the same time, the degradation
of CII is slowed down and CII molecules are available long enough to induce the expression of CI,
which leads to the lysogenic state.
To conclude, if the proteins CII and CIII are expressed shortly after the infection by a λ phage, CI
is activated, the expression of Cro is suppressed and the bacterium enters the lysogenic pathway. If
not, CI is not expressed and lysis begins.
Results
Arkin et al. (1998) were able to show that the lysis-lysogeny decision is indeed influenced by ran-
dom bursts in the protein production. In all cases, a burst in the concentration of the CII protein
occurred after infection with the phage. But only in the lysogenic case was this burst by chance
accompanied by a burst of CIII production. The CIII could then stabilize the CII production, which
in turn led to an increase of CI and entry into the lysogenic state. In contrast, in the lytic-fated
case no CIII production occurred so the unprotected CII rapidly degraded and did not activate the
CI expression. Without expression of CI, Cro production continued and lysis ensued.
These results show that random fluctuations in the production of one single protein can influence
the fate of a simple organism. This work was the first comprehensive stochastic model of a reg-
ulatory network. It highlighted the need for stochastic models to capture the full complexity of
Biology. Even though the authors did not use Petri Nets to model the network, it would have been
possible, since they simulated the system with the Gillespie algorithm. This corresponds to the
execution of the corresponding Stochastic Petri Net as described in chapter 2.
3.1.2 Stochastic analysis of Biological models with Petri Nets
The first attempts to model biological systems with Petri Nets focussed on static properties of the
model. Reddy et al. (1993) were the first to use Petri Nets to model the combined glycolytic and
pentose phosphate pathway in the erythrocyte cell. Goss & Peccoud (1998) used Stochastic Petri
Nets (SPNs) for the first time to model the dynamics of molecular interactions. They were the
first to recognize the advantages that are offered by this representation. For these reasons, a brief
summary of their work and a comparison to our approach is given.
The model
Goss & Peccoud (1998) explain the terminology of Stochastic Petri Nets and illustrate how it can
be used to model molecular interactions. Furthermore they present a stochastic model of ColE1
plasmid replication and compare its simulation results to the deterministic solution.
They introduced the very intuitive mapping of molecular species to places and reactions to transi-
tions in the Petri Net. The same representation was used in the experimental section of this work.
Nevertheless, it is sometimes more convenient to understand a place as a certain state of the
system rather than as a species. For instance, in the model of a biological clock (section 5.2), a
repressor protein binds to the promoter region of a gene. This binding reaction is represented by
a transition whose input set consists of the protein and the gene place. The output place of this
transition represents the suppressed gene with the bound protein. But this place represents the
state "inhibited gene" rather than a molecular species, since gene and protein are still two separate
molecules.
Goss & Peccoud (1998) simulate the Petri Net with the UltraSAN software (Deavours, II, Qureshi,
Sanders & van Moorsel 1995). This simulation is similar to the Next Reaction Method (Gibson &
Bruck 2000). In addition, they also derive an analytic solution by solving the associated Markov
process for its steady state. They briefly mention that Petri Net theory can also be used to examine
structural properties of the net. We tried this as well but in our case the results were not very inter-
esting. It is possible, for instance, to examine the Petri Net for place invariants. These invariants
are, informally speaking, a set of places whose sum of tokens does not change during the simu-
lation of the net. Not surprisingly, the places that represent an enzyme and the enzyme-substrate
complex are part of an invariant since the number of enzyme molecules is always constant.
Results
To summarize, Goss & Peccoud (1998) were the first to apply SPNs to model chemical reactions.
Their work is the foundation of this project even if their objective was slightly different. They were
interested in analytical and simulation results of the Markov process represented by the Petri Net.
However, as already mentioned in chapter 2, obtaining the steady state distribution of the Markov
process is not always possible. Goss & Peccoud (1998) demonstrated that it is possible to restrict
the state space of a SPN by simply enforcing an upper bound on the number of tokens in the places.
It is important to find a reasonable value for this bound.
We did not follow this approach in our experiments since it is unlikely that the steady state
behaviour of the systems we considered would give interesting results. The models of circadian
clocks we present in chapter 5 exhibit an oscillatory behaviour, and the steady state distribution
would simply give an average over the oscillations. In addition, simulation results are easier and
faster to obtain. Therefore
we focus more on the evaluation of the results obtained from the stochastic simulation.
Goss & Peccoud (1998) compared analytical and deterministic results with the simulated behav-
iour of the net. We also made comparisons between the deterministic and stochastic solution but
following the work of Gonze, Halloy & Goldbeter (2002), we were also interested in the behaviour
of the model with different numbers of molecules involved. The behaviour of the stochastic model
under different conditions was not a topic in the work by Goss & Peccoud (1998).
3.1.3 Analysis of the E.coli Stress Circuit with Stochastic Nets
Figure 3.1: (a) An overview of the simplified E. coli stress circuit (Srivastava et al. 2001). The σ32 protein
is synthesized and forms a holoenzyme together with the mRNA polymerase (Eσ32). This complex binds to
the promoter regions of several chaperone proteins and proteases that degrade misfolded proteins. Some of
the chaperones can also bind to Eσ32, serve as a reservoir of the sigma factor and lead to its degradation.
(b) Ethanol stress response with and without σ32 antisense mRNA: results of the stochastic simulation
(thick solid line) with standard deviation (thin dotted lines) compared to experimental data (points with error
bars). The antisense mRNA binds to the σ32 mRNA and was included to see how the model behaves if
σ32 synthesis is inhibited.
Srivastava et al. (2001) developed a SPN model of the E.coli stress circuit and used this model
to characterize the behaviour of the bacterium under stress, i.e. when exposed to heat, ethanol, heavy
metals etc. σ32 is a protein that regulates the expression of other proteins in response to external
stress. Under stress, the rate of synthesis and the stability of σ32 increase. This in turn increases the
production of other proteins such as chaperones (enzymes which assist other proteins in achieving
proper folding, which is affected by stress) or proteases (enzymes that degrade other proteins).
Results
The problem is that many details of the σ32 mediated pathways are still not understood. The au-
thors validated their model by comparing its simulated behaviour to experimental results. But it
is not certain whether all parts of the model are correct, such as the exact binding sequence of the
different chaperones. Nevertheless, they were able to reproduce experimental results and to gain new insights
into the behaviour of the stress circuit. One of these results was that the σ32 response is mainly
controlled by the rate of mRNA translation and that large quantities of the protein are bound to
chaperones under non-stress conditions. This allows the cell to react rapidly to external stress by
releasing these proteins and not waiting until new ones are produced. The authors also underlined
the fact that a stochastic formulation can be used to generate estimates of the variance in the data.
This information cannot be obtained from a deterministic model.
Their approach was similar to ours since they created a simplified model of a biological system
and used simulation results to understand it better. They faced the same dilemma as we did since
many details of the real biological system are not understood, so one has to generalize and make
assumptions about some parameters. On the other hand, our experiments deal with rather general
features such as the architecture of the system, its resistance to noise etc. In contrast to this,
Srivastava et al. (2001) were interested in very specific features of the σ32 pathways such as the
partitioning of the σ32 within the cell and the time evolution of its concentration.
3.2 Review of other Petri Net Tools
An Open Source software tool was used to edit and simulate the Petri Net models in this project.
Developing new software completely from scratch would have taken too much of
the three months that were allocated for this project. Hence we decided to search for a suitable
software tool and to extend it with the features needed.
It turned out that there are many software tools available that have some of the required features.
There are some very good Petri Net editors such as GreatSPN (Chiola, Franceschinis, Gaeta &
Ribaudo 1995) or UltraSAN (Deavours et al. 1995). GreatSPN supports Generalized Stochastic
Petri Nets (Marsan, Balbo, Conte, Donatelli & Franceschinis 1995), Stochastic Petri Nets with
immediate transitions and inhibitor arcs. UltraSAN started as a tool for Stochastic Petri Nets but
now offers many extensions to this theory, such as arcs that are associated with a certain function
and various analytical solvers on the level of the Markov process.
Both tools are general tools with no specific application domain. That means that they do not offer
any features specific to a biological application. Nevertheless, they are widely used and UltraSAN
has also been used to model regulatory networks in E. coli (Goss & Peccoud 1998). The problem
is that they often have very restrictive licenses and, even though they are developed by research groups
at universities, the developers do not support extensions of their software.
Thus we searched for alternatives. A great help was the online archive of Petri Net tools available
at the University of Hamburg, Germany.1 In the remainder of this chapter, we will review some
of the Petri Net software tools that are available online and justify the decision to use the Petri Net
Kernel (Kindler & Weber 2001).
Requirements for this project
The software needed should fulfil the following requirements:
• a graphical interface that is easy to use,
• platform independence (which usually means that it is written in the programming language
Java),
• support for stochastic simulation,
• means to analyse the results of the simulation,
• and a modular structure, such that we could implement new features if necessary.
Most of the Petri Net tools at Sourceforge and at the Petri Nets World archive were tested for these
abilities. As an example, we will present two of these tools and the extent to which they fulfil these
requirements.
1http://www.informatik.uni-hamburg.de/TGI/PetriNets/
Cell Illustrator (Genomic Object Net)
The basis of this software project was an extension of the discrete Petri Net theory by continuous
features (Matsuno, Doi, Nagasaki & Miyano 2000). This can be useful since protein concentra-
tions vary continuously but are coupled with discrete switches (i.e. protein production is switched
on or off depending on the expression levels of some genes). Based on this observation, Matsuno,
Tanaka, Aoshima, Doi, Matsui & Miyano (2003) developed the theory of Hybrid Functional Petri
Nets (HFPN). In general, a hybrid Petri Net contains two sets of places and transitions, discrete
/ continuous places and discrete / continuous transitions. Discrete places and discrete transitions
are the same as in the discrete Petri Net model. In contrast to this, a continuous place holds a
nonnegative real number as its content. A continuous transition fires continuously and its firing
speed is a function of the values in the places. Finally, a hybrid functional Petri Net has discrete
and continuous input and output arcs, but also test input arcs. A test arc can be directed from a place
of any kind to a transition of any kind and does not consume the content of its source place. It
enables its target transition only if the source place contains at least as many tokens as given
by the arc's inscription.
The simulation software Genomic Object Net (GON) implements these Hybrid Functional Petri
Nets. Matsuno et al. (2003) claim that using GON it is possible to construct a computational
model directly from a map of the biological pathway taken from the literature. GON uses a system
of differential equations to simulate this pathway. The parameters of the reactions have to be
determined by experiments or found in the literature.
A trial version of GON was installed and tested. It is evident that a hybrid net is well suited to model
switching behaviour as it occurs in biology. The software is very easy to use and gives a very
professional impression. On the other hand, it does not implement any stochastic simulation
features, so it represents a very different modelling approach.
One of the biggest drawbacks of GON is that the latest version has become commercial. The
company Gene Networks International now sells this software under the name Cell Illustrator. Even
academic users have to pay a considerable sum for the full version but a trial version with limited
functionality is available.
Cell Illustrator is only available for Windows. It is very easy to use but cannot be extended and
does not offer any stochastic simulation features. Therefore it was not suitable for this project.
PIPE - Platform Independent Petri Net Editor
In contrast to GON, this tool is a classical editor for Petri Nets. It was developed during an M.Sc.
group project at Imperial College London in 2003 (Bloom 2003). After the end of that project,
PIPE was further extended and is now available as an Open Source project at the Sourceforge online
repository2.
The software was written with the aim of providing an easy-to-use application. The authors also
wanted to ensure that the program is extensible by creating a modular structure. New modules can
be added easily to extend the functionality of the program. So far, different analysis modules have
been developed such as modules for the identification of invariants, state space analysis etc.
In the beginning, it seemed that PIPE would be an ideal tool for this project. It is published under
an Open Source licence, written in Java and very intuitive to use. Nevertheless, it does not support
Stochastic Petri Nets and hence no stochastic simulation. At first we planned to implement these
features on our own and received considerable support from the current maintainer, James Bloom.
Unfortunately, it turned out that the features needed were far more difficult to implement than
expected. After some days it was decided to use another software tool that was easier to extend.
The problem was that PIPE was written in such a way that new analysis modules could be added
easily, but it was difficult to introduce new types of Petri Nets. In its current version, PIPE only
supports non-stochastic Place-Transition Nets and is therefore not suitable for this project.
3.3 Conclusions
The idea that a stochastic simulation of chemical reactions can be more accurate and natural from
a physical perspective is not new. But the fact that variations on the molecular level can have
substantial influence on the high-level pathways and even on the phenotype of the organism was
demonstrated only a few years ago (Arkin et al. 1998). We were able to make use of the experience and
the work of others in this project, even though our approach differs slightly from the work published so
far.
2http://www.petri-net.net/htdocs/
From a software engineering perspective, we faced the difficult task of finding a software tool that
fulfilled most of our requirements but was also modular, so that it could be extended easily if
needed. There is a clear tendency for every research group to write its own tools, mainly
because self-written tools appear to be more trustworthy. In the near future it might become even
more important to integrate different tools and their capabilities in order to avoid a waste of time and
resources.
For this project, it turned out that although many tools are available, none fulfilled all
requirements. We had the option either to write our own tool or to extend an existing one. Writing
useful software from scratch takes a considerable amount of time, probably even more than the
three months that were allocated for this project, and therefore we decided to use an existing tool.
This also gave us enough time to put more effort into the experimental and modelling part of the
dissertation.
Chapter 4
Technical Methodology
The first half of this chapter describes the Petri Net Kernel (PNK) version 2 (Kindler & Weber
2001) and the extensions made during this project. We start by giving a general overview of the
architecture and functions of the Kernel in the first section. After this, the extended software as it
was used in this dissertation is presented. How the Kernel can be extended to support other
Petri Net types is documented in the remainder of the first section.
In its current version, the Kernel can only be used to visualize and edit Petri Nets but not to simulate
their behaviour. An extension of PNK was developed that uses the infrastructure of the Systems
Biology Workbench (Hucka et al. 2002) to simulate the Markov process represented by the Petri
Net. We call this extended version PNK 2e (extended Petri Net Kernel 2).
The second half of this chapter is therefore dedicated to the Systems Biology Workbench (Hucka
et al. 2002). An overview of this project and its components is given, as well as of the XML-based
language SBML (Finney & Hucka 2003), which is used as the workbench's data exchange
format.
4.1 The Petri Net Kernel
The Petri Net Kernel (PNK) was developed by the Theory of Programming research group at the
Humboldt University of Berlin, Germany. It is intended to provide an infrastructure for the
development of Petri Net tools rather than to be a tool in its own right. The current version is 2.2 and written
in Java. It is available at http://www.informatik.hu-berlin.de/top/pnk/.
In its standard version, PNK supports Place-Transition Nets and Coloured Petri Nets, Petri Nets in
which the tokens belong to different classes and are distinguishable. It also offers a rudimentary
graphical editor that can be used to create and modify Petri Nets. The rationale for the development
of PNK was the fact that implementing a new Petri Net tool can be very time consuming. Most
of the implementation effort for a new Petri Net tool is spent on almost the same functionality: a
graphical editor, a visualization device, functions to save and load the net, etc. The effort for
implementing this standard functionality is spent over and over again when new tools are developed
by different groups. The aim of the PNK project was to provide a common infrastructure for Petri
Net tools and to avoid repeating the same implementation effort.
The PNK project itself is now finished, but a new and improved version is currently being developed at
the University of Paderborn, Germany. Some tools have been developed using PNK,
such as PNVis, which supports the 3D visualization of Petri Nets, and the Petri Net Cube, a tool
that implements parametric Petri Nets.
4.1.1 Design and Concepts
In the following an outline of the architecture of PNK and its modules is given. New Petri Net
types can be added to the Kernel by implementing classes for places, transitions and arcs. Each of
these classes can have labels or extensions which are also Java classes.
As an example, a place has an extension for its current marking, and a transition has another
extension which represents the rate of the exponential distribution from which its delay is sampled.
The net as a whole can also have one or several extensions, for instance its name and additional
information about the net type. The dependencies between classes and their labels are defined in an
XML file called netTypeSpecification. An example of a definition file for a Stochastic Petri Net is
given below.
This concept of a net type is very general. Its advantage is its close relationship to the Petri Net
Markup Language (Weber & Kindler 2002), which is a common description language to exchange
Petri Nets between different software tools. It is also used by the Kernel to save nets created by the
user. Nevertheless, this markup language defines only the syntax: which labels and combinations
of labels are allowed for a certain net type. In order to provide the semantics, methods can be
defined for each label. For instance the label representing the marking of a place has methods that
define the addition and subtraction of tokens. These methods are used by the simulator to move
the tokens through the net. Each net also has an associated instance of the class FiringRule. This
class defines how transitions are executed and how conflicts between transitions are resolved. In a
SPN, the class StochasticRule defines that the delays in the net follow an exponential distribution
and that in case of a conflict, the transition with the smallest delay is executed first.
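This race policy can be sketched in a few lines of Java. This is an illustrative sketch only: the class RacePolicy and its methods are invented for this example and do not reproduce PNK's actual StochasticRule implementation.

```java
import java.util.List;
import java.util.Random;

// Sketch of the race policy of a stochastic firing rule: every enabled
// transition samples a delay from an exponential distribution with its own
// rate, and the transition with the smallest delay fires first.
public class RacePolicy {

    // Inverse-transform sampling of an exponential delay with the given rate.
    static double sampleDelay(double rate, Random rng) {
        return -Math.log(1.0 - rng.nextDouble()) / rate;
    }

    // Returns the index of the enabled transition that wins the race.
    static int nextTransition(List<Double> rates, Random rng) {
        int winner = -1;
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < rates.size(); i++) {
            double d = sampleDelay(rates.get(i), rng);
            if (d < best) {
                best = d;
                winner = i;
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        // Two enabled transitions; the one with the much larger rate
        // should win the race in the vast majority of trials.
        int winsOfFast = 0;
        for (int i = 0; i < 1000; i++) {
            if (nextTransition(List.of(100.0, 0.1), rng) == 0) winsOfFast++;
        }
        System.out.println("fast transition won " + winsOfFast + " of 1000 races");
    }
}
```

Because the minimum of independent exponential delays is again exponential, this race policy is equivalent to a step of Gillespie-style stochastic simulation.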
Application Modules and I/O Modules
Each instance of a net type that is loaded into the editor is checked against its definition. Further-
more, the Petri Net Kernel provides a net type interface. This interface can be used to derive new
types. It basically states that each label and node in a net must be defined by a Java class. Once a
new net type is defined that fulfils the requirements of the net type interface, an instance of the net
can be loaded into the Kernel.
PNK also contains templates that define how a Petri Net can be saved to and retrieved from a file,
the so-called I/O-modules. So far, there is a module that saves a net into a PNML file (Petri Net
Markup Language). Support for SBML and CMDL was added. SBML is used in the Systems
Biology Workbench to exchange models between different components and CMDL is a simplified
description language for chemical reactions supported by the stochastic simulator Dizzy (Ramsey,
Orrell & Bolouri 2005). The aim of the PNK project was to provide a common infrastructure for a
variety of Petri Net tools by offering templates for tasks that occur repeatedly. The architecture of
the Petri Net Kernel consists of five parts corresponding to the different steps encountered during
the development of a Petri Net tool:
• The net interface provides methods to access and to modify a net. It is also used to synchro-
nize the activity of different applications accessing the same net.
• The net dialog interface provides means to visualize information within a net and to interact
with the end user. This interface is used by applications that require an input by the user.
• The net type interface, already mentioned above, states how to define labels and firing rule
of a net type. It is used to check the validity of a net type.
• The application interface is a template for the definition of a new application based on the
PNK project. The related InOut interface defines the minimal functionality an I/O-module
has to offer.
New tools and their relation to existing net types can be defined in an XML file, the tool
specification. This file contains an entry for each application developed using the infrastructure of the
Kernel. Each of these application entries has a sub-entry for each net type the application supports.
For instance, a stochastic simulator only supports Stochastic Petri Nets and therefore its entry in
the tool specification has only one sub-entry for Stochastic Petri Nets.
These interfaces will be discussed in more detail in the remainder of this section. We do not aim
at giving a complete review of all methods and classes, but rather a general idea of the available
functions and how they were used in this dissertation.
The Net Interface
This interface consists of a collection of Java classes that represent a Petri Net together with its
extensions. The basic classes are Net, Place, Transition, Arc, and Extension.
Each class provides methods for accessing and modifying the corresponding net and its elements.
As an example, for an instance "net" of class Net, the method call net.getPlaces() returns a list
of all its places. Likewise, there are methods for returning the in-going and out-going arcs of a
transition or of a place. There are also methods to access the label of an element.
For all these methods, the PNK takes care of keeping the net consistent. This means that
the Kernel synchronizes the state of the net between the different applications accessing it. The
Java class ApplicationControl, which is described at the end of this section, keeps track of changes
to the net made by different tools and guarantees the consistency of the net in all situations.
Definition of Petri Net types
As outlined before, a net type in PNK is defined by a collection of classes for nodes, transitions
and extensions. The dependencies of these classes are given in an XML file. The net type interface
checks a net type definition for its validity. In addition, the declaration of a new Petri Net type also
requires the definition of its firing rule, i.e. how its transitions are executed.
While extending the Kernel to support Stochastic Petri Nets, classes representing
stochastic transitions were implemented. Each such class has a rate, a label that is derived from the
super-class Extension. This class provides a set of basic methods that need to be provided by all
extensions, such as a method to convert the internal value of the class into a string. A concrete
instance of a class stores all of its extensions in a hash. This hash maps the name of each
extension to the corresponding extension object.
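The extension mechanism can be illustrated with a small sketch. The classes Extension and ExtendableElement below are simplified stand-ins invented for this example; they mirror the idea of a name-keyed hash of extensions, not PNK's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for an extension: every extension must at least be convertible
// to a string, as described above.
class Extension {
    private final String value;
    Extension(String value) { this.value = value; }
    @Override public String toString() { return value; }
}

// Stand-in for a net element that stores its extensions in a hash,
// keyed by the extension's name.
public class ExtendableElement {
    private final Map<String, Extension> extensions = new HashMap<>();

    void setExtension(String name, Extension ext) { extensions.put(name, ext); }
    Extension getExtension(String name) { return extensions.get(name); }

    public static void main(String[] args) {
        ExtendableElement place = new ExtendableElement();
        place.setExtension("marking", new Extension("3"));
        place.setExtension("initialMarking", new Extension("3"));
        System.out.println(place.getExtension("marking"));
    }
}
```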
The net type itself is specified in an XML file. The file enumerates all possible labels for each kind
of Petri Net element. The following listing shows an example of a Petri Net type specification. The
listing is simplified for the sake of clarity.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE netTypeSpecification SYSTEM "netTypeSpecification.dtd">
<netTypeSpecification name="StochasticNet">
  <extendable class=".kernel.Net">
    <extension name="firingRule" class="StochasticNetRule"/>
  </extendable>
  <extendable class="Place">
    <extension name="marking" class="NaturalNumber"/>
    <extension name="initialMarking" class="NaturalNumber"/>
  </extendable>
  <extendable class="Arc">
    <extension name="inscription" class="NaturalNumber1"/>
  </extendable>
  <extendable class="Transition">
    <extension name="rate" class="DoubleValue"/>
  </extendable>
</netTypeSpecification>
The file contains an XML-specific header and mentions the name of the Petri Net type. It then
introduces the two types of nodes in the SPN together with their extensions. The net itself has the
extension firingRule, a place has an initialMarking and a current marking. The distinction between
the initial marking and the current marking is useful to track the time evolution of the number of tokens
in a place. Each stochastic transition has a rate, which is the parameter of an exponential
distribution, and each arc in the net has an inscription.
It is important to distinguish between the net interface and the net type interface. The net interface
as described in the last section offers the basic classes needed for the implementation of a new
Petri Net such as places, transitions etc. In contrast to this, the net type interface is used to check
the validity of a new net type. One requirement is that each component of the net as defined in the
type specification file is also represented by a Java class.
The Dialog Interface
This interface is needed by applications that require an action by the user or want to present the
results of some computation to the user. It offers functions to display textual information in the
net, to highlight a set of nodes or to request a decision from the user. We used it in the Simulator
application which plays the token game by moving the tokens through the net. While performing
the simulation, enabled transitions are highlighted and the flow of tokens is illustrated by changing
the node labels. In a net type that does not define how conflicts between simultaneously enabled
transitions are resolved, the user can be asked to decide which transition to execute by clicking on
the corresponding node.
By default, if an application requests a dialog, this request will be passed to the editor which then
displays a dialog to the user, colours the nodes, etc. But it is also possible to implement one’s own
dialogs.
Creating new Applications with PNK
This section briefly summarizes how a new Petri Net application can be created by using the in-
frastructure of PNK. A new application is a small module in a Petri Net tool that implements some
functionality. As an example, the Editor is an application, and so is the stochastic simulator.
Technically, an application is a class derived from the PNK class MetaApplication or
MetaBigApplication. In the simplest case, an application module implements only a single method
run(). The current net can be accessed via the method getNet(). In a more complex
application, there can be arbitrarily many methods. In that case, however, the application must implement
a method getMenus() which provides the PNK with the necessary information about the available
methods, so that the PNK can offer corresponding menus to the end user and start a method
on user request.
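A minimal application module in this style might look as follows. The MetaApplication base class shown here is a stand-in written for this sketch; in PNK the real class is imported from the Kernel, and getNet() returns the current net object rather than a placeholder string.

```java
// Stand-in for the Kernel's application base class: an application has a
// run() method and can access the current net via getNet().
abstract class MetaApplication {
    abstract void run();
    Object getNet() { return "current net"; } // placeholder for the real net object
}

// In the simplest case, an application module implements only run().
public class HelloApplication extends MetaApplication {
    @Override
    void run() {
        System.out.println("running on: " + getNet());
    }

    public static void main(String[] args) {
        new HelloApplication().run();
    }
}
```

A more complex application would additionally implement getMenus() so that the Kernel can build menus for its methods.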
There are also two more special interfaces, NetObserver and ApplicationNetDialog. The
NetObserver interface allows an application to keep track of changes that are made to the net. The
ApplicationNetDialog interface can be used to display graphical information screens or questions
to the user. The editor is an example of an application that implements both interfaces.
Input/Output-modules are classes derived from the class InOut and are similar to the
aforementioned applications. They have to implement two methods, load() and save(), that load or save
a net to or from a given URL. Support for PNML is implemented in the class PNMLInOut.
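A skeleton of such an I/O module could look as follows. The InOut base class here is a stand-in written for this sketch (the real class is provided by the Kernel and works with URLs rather than plain strings), and DummyInOut is an invented example, not one of the classes mentioned above.

```java
// Stand-in for the Kernel's I/O base class: every module must be able to
// load a net from, and save a net to, a given location.
abstract class InOut {
    abstract Object load(String url);
    abstract void save(Object net, String url);
}

public class DummyInOut extends InOut {
    @Override
    Object load(String url) {
        // A real module would parse PNML, SBML or CMDL from the URL here.
        return "net loaded from " + url;
    }

    @Override
    void save(Object net, String url) {
        // A real module would serialize the net to the URL here.
        System.out.println("saving " + net + " to " + url);
    }

    public static void main(String[] args) {
        DummyInOut io = new DummyInOut();
        System.out.println(io.load("file:example.pnml"));
    }
}
```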
Definition of new Tools
Again we have to clarify an important point. In the framework of the Petri Net Kernel, an appli-
cation is a small module that is part of a larger program or tool. A tool is a collection of at least
one new net type and at least one application. The relationships between net types and applications
are defined in an XML file, the tool type definition. This file contains an entry for each application.
Each of these entries has a sub-entry for the net types the application supports. The following is part of the
tool definition for the extended Petri Net Kernel that was developed in this dissertation.
<nettype id="n10" typeSpecification="file:netTypeSpecifications/StochasticNet.xml"/>
<nettype id="n11" typeSpecification="file:netTypeSpecifications/GSPN.xml"/>
<application id="a1" mainClass="de.huberlin.informatik.pnk.app.StochasticSimulator">
  <allowedNettypes>
    <ntref ref="n10"/>
    <ntref ref="n11"/>
  </allowedNettypes>
</application>
The file specifies references to the definitions of the Stochastic Net and the Generalized Stochastic
Net. The next lines refer to the Stochastic Simulator (token game) which is available for Stochastic
Nets and Generalized Stochastic Nets. There are also additional attributes that can be defined for
an application, such as the maximum number of instances and a default net type.
The Application Control
The Java class ApplicationControl is the mediator between the different applications in a
Petri Net tool. When the Kernel is started, this class loads all applications and net types as given by
the tool definition. The application control also coordinates the interaction between the different
applications and starts an application on request by the user.
4.1.2 The Extended Kernel (PNK 2e)
This section summarizes the extensions implemented for the Petri Net Kernel. The Kernel was
developed to provide an infrastructure for new Petri Net tools and has a modular composition. It
Figure 4.1: Overview of the extended Petri Net Kernel. The interface to the INA software was
developed at the Humboldt University of Berlin but is also available in the extended version. The
simulation interface to the Systems Biology Workbench was developed in this work. For a description
see below.
was therefore very suitable for this project.
Developing new Petri Net software from scratch takes a considerable amount of time. The idea
behind this project was to extend an existing software tool and spend more time on the application
of the software. It turned out that this approach also has some shortcomings. The Petri Net Kernel
is a mature software tool. Nevertheless, it still contains some bugs that appeared during this work.
Despite its modularity, more changes had to be made than expected and these changes took more
time than anticipated.
Nonetheless the Kernel turned out to be useful and many helpful ideas and support were received
from its developers. It is available online1 and has been registered in a database of Systems Biology
software2. The remainder of this section will be used to present the changes that were made to the
Petri Net Kernel.
Print / Export
A function was implemented that allows the user of PNK to print a Petri Net or to export it to a
Postscript file. This function relies on the methods provided by the Java API 1.4.2. The plots of
Petri Nets in this report were created with this function.
1www.inf.fu-berlin.de/~trieglaf/PNK2e
2www.sbml.org
Stochastic Petri Nets / Generalized Petri Nets
Classes for a Stochastic Petri Net, such as delayed transitions and a stochastic firing rule, were
implemented. In addition, Java classes that define the graphical representation of immediate tran-
sitions and inhibitor arcs in the editor were also developed.
Generalized Stochastic Petri Nets (GSPNs) are an extension of Stochastic Petri Nets. In addition
to delayed transitions, a GSPN also comprises immediate transitions and inhibitor arcs.
• An immediate transition fires, if enabled, before all delayed transitions. If several immediate
transitions are enabled at the same time, the transition to execute is determined at random.
• An inhibitor arc is an arc that disables a transition if its input place(s) contain(s) at least one
token. Inhibitor arcs may have multiplicities in the same way as ordinary arcs.
Immediate transitions can be used to model decisions that are assumed to take no time. Inhibitor
arcs can be used to represent control actions that are necessary to ensure the correct behaviour of
the model. They can also result in a smaller state space of the underlying Markov process since they
can be used to exclude certain states.
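The enabling test for a GSPN transition with inhibitor arcs can be sketched as follows; the class and method names are invented for this illustration and are not PNK 2e's actual code.

```java
import java.util.Map;

// Sketch of the enabling test for a GSPN transition: every ordinary input
// arc requires at least its multiplicity in tokens, and every inhibitor arc
// disables the transition once its place holds at least the arc's multiplicity.
public class GspnEnabling {

    static boolean isEnabled(Map<String, Integer> marking,
                             Map<String, Integer> inputArcs,
                             Map<String, Integer> inhibitorArcs) {
        for (Map.Entry<String, Integer> arc : inputArcs.entrySet()) {
            if (marking.getOrDefault(arc.getKey(), 0) < arc.getValue()) return false;
        }
        for (Map.Entry<String, Integer> arc : inhibitorArcs.entrySet()) {
            if (marking.getOrDefault(arc.getKey(), 0) >= arc.getValue()) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> marking = Map.of("X", 2, "I", 0);
        // Enabled: input place X has enough tokens, inhibitor place I is empty.
        System.out.println(isEnabled(marking, Map.of("X", 1), Map.of("I", 1)));
    }
}
```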
We also implemented support for GSPNs since they seemed to be a useful extension of the
standard SPN theory. But this formalism does not seem to be necessary for biological applications,
at least not in the examples used. Furthermore, their simulation is more difficult and the
correspondence to a Markov process is less obvious. For these reasons this approach was not further
pursued.
Token Game for Stochastic Nets
If we start from an initial marking and execute enabled transitions according to the firing rule, we
call this simulation the token game. The token game that was implemented is essentially
nothing other than a stochastic simulation of the Markov process, but at a much lower speed. Every time a
transition is executed, it is highlighted for some milliseconds. The movement of the tokens through
the net is visualized.
The dialog and observer interfaces of the Petri Net Kernel allow implementing a token game simu-
lator easily. It is useful to get an idea of the general behaviour of the net. But for larger networks, a
token game simulation takes a lot of time and one should resort to a standard stochastic simulation.
SBML and CMDL support
SBML and CMDL are data exchange formats that are used in the Systems Biology Workbench.
Before a Petri Net can be simulated in the workbench, it needs to be translated into one of these
formats. We will give details of both languages in the next section.
As outlined above, new In- and Output classes can be derived from the InOut class that is provided
by the Kernel. We developed two new classes, SBMLInOut, which translates a Petri Net into SBML,
and CMDLInOut, which translates it into CMDL.
In addition, the extended Kernel can import Stochastic Petri Nets from SBML. This is particularly
useful since many published models of chemical reactions are available in this data format. There
are also efforts to provide a collection of annotated SBML models online3. These models can now
be imported into the Kernel. The problem is that SBML does not contain any layout information. If
a Petri Net representation is created from an SBML file, PNK 2e creates a single cluster of all nodes
in the net. The Kernel contains a simple layout algorithm that was implemented by the developers
at the Humboldt University. This algorithm is capable of rearranging the nodes and tries to create
a clearer layout of the graph. In our experience, this works only for simple and small Petri Nets.
The algorithmic drawing of complex graphs is a topic of current research.
Interface to Systems Biology Workbench
The Systems Biology Workbench (SBW) includes an interface that allows other applications to
use its infrastructure. An application only needs to use the API offered by the SBW to contact the
so-called broker, through which it can access any of the available services.
A new application was developed that derives from the class MetaApplication, translates the
Petri Net into SBML or CMDL and calls the simulation service of the SBW. This service can then
be used to simulate the net either stochastically or deterministically. Details of this simulation
service will be given in section 4.3.
Simplified Petri Nets for Biological Reactions
After the first few experiments with the extended Petri Net Kernel, it turned out that Petri Nets
representing networks of chemical reactions can become very large and complex. For this reason, two
simplifications for reactions that occur very frequently in biological models were introduced.
3 Biomodels database: http://www.biomodels.net/ (developed by the European Bioinformatics Institute, UK)
The first simplification comprises a new Petri Net transition. It represents a reversible chemical
reaction X ←→ Y. It is defined in the Java class ReversibleTransition and has two associated rates
instead of one. Before this simplified model can be simulated, the reversible transition has to be
decomposed into its forward and backward reactions. The first rate is assigned to the forward and
the second to the backward reaction.
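To make the decomposition concrete, here is a minimal sketch of the expansion step. The class and method names are ours and not part of the actual PNK 2e source: a reversible transition X ←→ Y with rates (k1, k2) expands into two irreversible reactions, whose mass-action propensities depend on the current token counts.

```java
// Hypothetical sketch of the expansion step; not the real ReversibleTransition class.
public class ReversibleExpansion {

    // Propensities of the forward (X -> Y) and backward (Y -> X) reactions
    // for rate constants kForward, kBackward and token counts x, y.
    public static double[] propensities(double kForward, double kBackward, long x, long y) {
        return new double[] { kForward * x, kBackward * y };
    }

    public static void main(String[] args) {
        double[] a = propensities(0.5, 0.1, 100, 20);
        System.out.println("forward: " + a[0] + ", backward: " + a[1]); // forward: 50.0, backward: 2.0
    }
}
```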
The second simplification represents the reaction X → Y which is catalysed by an enzyme
(indicated by the place with the inscription "Enzyme"). The fact that the enzyme catalyses this
reaction is indicated by the black box at the end of the arc that connects the enzyme place with
the transition. This arc is implemented in the class EnzymeArc and has three associated rates
r1, r2 and r3. In order to simulate this reaction stochastically, the Kernel decomposes the
reaction into three separate steps:
X + Enzyme  --r1-->  Cm          (4.1)
Cm  --r2-->  X + Enzyme          (4.2)
Cm  --r3-->  Y + Enzyme          (4.3)
This set of reactions represents the synthesis of an intermediate complex Cm (4.1), the dissociation
of this complex back into the reactant X and the enzyme (4.2), and the production of the product Y
(4.3). Each of the rates in class EnzymeArc is associated with one of these reaction steps. More
complex reactions with several reactants and/or products are also possible.
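As an illustration of this expansion (the class and method names below are ours, not the Kernel's), the three steps 4.1-4.3 and their rate assignments can be sketched as follows:

```java
// Hypothetical sketch of expanding an EnzymeArc into the steps 4.1-4.3.
public class EnzymeArcExpansion {

    // Expand an enzyme-catalysed reaction X -> Y into three elementary
    // steps, each paired with one of the rates r1, r2, r3.
    public static String[] expand(String x, String y, String enzyme,
                                  double r1, double r2, double r3) {
        String cm = "Cm"; // the intermediate complex
        return new String[] {
            x + " + " + enzyme + " -> " + cm + " @ " + r1, // synthesis of the complex (4.1)
            cm + " -> " + x + " + " + enzyme + " @ " + r2, // dissociation back to X (4.2)
            cm + " -> " + y + " + " + enzyme + " @ " + r3  // production of Y (4.3)
        };
    }

    public static void main(String[] args) {
        for (String step : expand("X", "Y", "Enzyme", 1.0, 0.5, 0.2)) {
            System.out.println(step);
        }
    }
}
```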
4.2 The Systems Biology Workbench
This section gives an outline of the Systems Biology Workbench, the framework used to perform
the simulations in this dissertation. The ERATO Systems Biology Workbench (SBW) project
was originally funded by a grant from the Japan Science and Technology Corporation (Hucka
et al. 2002). The aim of this project was to build a software infrastructure that allows simulation
and analysis programs for Systems Biology to share resources. Currently, financial
support comes from the U.S. Department of Energy. The principal investigators of the project are
situated at the Keck Graduate Institute in Claremont, the California Institute of Technology and
the Institute for Systems Biology in Seattle.
The Systems Biology Workbench was used to simulate the time evolution of Stochastic Petri Nets.
As outlined in chapter 2, an SPN maps directly onto the Gillespie or Gibson-Bruck algorithm and
can be executed by them, since all of these stochastic models are based on a Markov process. The
next sections describe the architecture of the SBW, give an overview of its API and describe its
simulation service in more detail.
Overview
The current version of the SBW is 2.3.1. Due to large scale architectural changes, this version is
only available for the Windows operating system. All parts are available as open-source under the
GNU LGPL.
Software engineers often duplicate each other's effort when implementing different packages.
Many research groups write their own programs that fulfil their very specific needs and reflect
the particular expertise and preferences of the group. This results in many small software projects,
each having niche strengths which are different from, but complementary to, the strengths of other
tools. On the other hand, since certain basic functions are needed by all programs (data
input/output, visualisation of results, etc.), developers often have to re-implement general
functionality in their tools.
There is currently no single piece of software that can answer all the questions of the Systems
Biology community. Many researchers use a variety of tools at the same time to look at their
problems from different perspectives. Software that eases the exchange of information between
these tools will therefore become even more important.
The Systems Biology Workbench tries to address this issue by offering a common infrastructure
for software tools within the field of Systems or Theoretical Biology. The approach is very similar
to that of the Petri Net Kernel, which comprises an infrastructure for Petri Net tools in order to
avoid a waste of resources. We decided to combine these two tools, which are so similar in their
approach.
The SBW Broker
Figure 4.2: The broker architecture of the SBW: modules written in Python, Java and C++
communicate with the SBW Broker through the corresponding SBW language interfaces. Grey areas
indicate SBW components. Adapted from (Hucka et al. 2002).
The SBW distribution consists of the SBW Broker and several small modules that illustrate the
use of the workbench. Among these modules are Jarnac, a deterministic simulator, and a
bifurcation analysis tool. Furthermore, there are many programs that have been developed
independently of the SBW but support its communication protocols. They are SBW-enabled and
can communicate through the SBW. Among these programs is Dizzy, a stochastic simulator that
implements the Gillespie algorithm. The centrepiece of the SBW is the SBW Broker. It is a
background program that is started automatically when needed by a module of the SBW. The
Broker maintains a list of all registered modules. A program needs to be registered with the
Broker only once and can then be started on demand.
Architecture of the SBW
The communication of the different components in the SBW is realized by a message-passing
architecture. Messages are exchanged as structured data bundles. All interactions are defined
on a very high level, and the whole framework is independent of any programming language. The
Broker itself is written in C++, but the modules can be written in any language as long as they
can send, retrieve and process messages according to the conventions given by the SBW.
A module can implement one or more services. Services are interfaces to the functions of the
module. These interfaces are made visible to other modules and can be executed through the
Broker. Each service belongs to a hierarchically organized category. This organization allows
other applications to list all services that belong to a certain category without knowing the
names of the individual services. As an example, the SBW module that we developed for the Petri
Net Kernel simply lists all simulation services that are known to the Broker, and the user can
decide which service to execute.
The following listing is taken from PNK 2e and illustrates the use of the Java API provided by
the Systems Biology Workbench:

SBW.connect(); // open connection to SBW broker

selection = (ServiceDescriptor) list.getSelectedValue();

Service service = selection.getServiceInModuleInstance();
Analysis analysis = (Analysis) service.getServiceObject(Analysis.class);

analysis.doAnalysis(sbml); // variable sbml stores the SBML representation
The first statement opens the connection to the SBW Broker. Then the service chosen by the user
is retrieved; the description of this service is encapsulated in the class ServiceDescriptor, and
an instance of the service is requested from the SBW.
The simulation service is a subclass of the more general Analysis category. We retrieve an
instance of this category and perform the simulation with a call to the method doAnalysis. This
source code is very general, since our intention was to offer the user a choice not only among
services of the very specific Simulation category but also among those of the larger Analysis
category, which also contains deterministic simulators and export modules.
Visualisation and Analysis of Results
The Systems Biology Workbench contains some modules that can perform a rudimentary analysis
of the results. The graphs in the experimental section of this work were produced by exporting
the simulation results to Matlab, which is possible using the Matlab export module of the SBW.
The Systems Biology Markup Language
The Systems Biology Markup Language (SBML) (Finney & Hucka 2003) complements the Systems
Biology Workbench described in the last section. Most of the modules contained in the SBW
require the model to be described in this language. SBML is a standard description language for
the representation of models of biological networks. Mathematical models described in SBML for
a variety of systems are available at http://www.sbml.org. This webpage also contains an SBML
test suite that allows users to verify their own models and eases the development of new
applications. Packages such as libSBML are available and can be used to parse SBML files and to
map them onto an internal representation. Although SBML (currently implemented in XML) aims at
machine readability, it can also be read by eye.
Currently SBML exists in two versions, level 1 and level 2. Level 2 is the newer release, but
level 1 is still widely supported. Conversion between the two levels is possible, but level 2
contains some improvements that cannot be translated into level 1. Most of the available tools,
including the Systems Biology Workbench, only support level 1. We will therefore restrict this
overview to the features available in level 1. An SBML model consists of the following entities:
Unit Definition determines the unit of the participating species. For stochastic simulations, this
has to be the number of molecules. A deterministic model requires concentrations, which can
be calculated from the volume of the container and the number of molecules.
Compartment is a finite container in which the reactions take place. An example could be a
biological cell or a cellular compartment such as the nucleus. This definition also contains
the volume of the container.
Species are the types of molecules involved in this model. An attribute of each species is its initial
amount given in a unit as defined above together with the compartment in which it is located.
Reaction is a statement describing some transformation, transport or binding process that can
change the amount of one or more species. Each reaction has an associated rate law that
describes its kinetics.
Parameters are variables in the model. They can relate to a single reaction only (such as rate
constants) but can also act on a global level.
Rules are a general concept that can be used to express relations between the amounts of species,
constraints on the rate law of a reaction, etc.
The SBML language definition is available online at www.sbml.org and gives further details about
the different entities in an SBML document, along with some examples. Here we will only show a
small example of a reaction and how it can be represented in SBML. Consider the reaction

X + 2Y  --k1-->  Z

which is a hypothetical synthesis of a molecule Z from the reactants X and Y. The reaction
occurs at a rate constant k1. A representation of this reaction in SBML level 1 would look like
this:
....
<listOfReactions>
  <reaction name="reaction_1" reversible="false">
    <listOfReactants>
      <speciesReference species="X" stoichiometry="1"/>
      <speciesReference species="Y" stoichiometry="2"/>
    </listOfReactants>
    <listOfProducts>
      <speciesReference species="Z" stoichiometry="1"/>
    </listOfProducts>
    <kineticLaw formula="k1 * X * Y">
      <listOfParameters>
        <parameter name="k1" value="5"/>
      </listOfParameters>
    </kineticLaw>
  </reaction>
</listOfReactions>
...
The XML header and the declarations of units and compartments were omitted for clarity. SBML
is clearly useful, but so far not all tools support all features of the language. For instance,
the stochastic simulator Dizzy does not recognize the attribute reversible of a reaction. As a
result, a reversible reaction has to be represented by two separate reactions, one for the forward
and one for the backward direction.
There is also no consensus on the representation of enzymes or inhibitors, but this issue is
expected to be resolved in future releases of SBML.
Mapping of the Petri Net to SBML
With the information given in the last section, it is relatively easy to see how a Petri Net can
be stored in SBML: places are stored as species and transitions as reactions. The Java class
SBMLInOut implements this mapping. This class uses the aforementioned library libSBML provided
by the developers of SBML. This library simplifies the export to SBML since the programmer does
not have to write the XML file directly but instead creates a set of Java objects that represent
the document. The library handles all read and write accesses on the level of the XML file and
is very fast.
During this dissertation, the same approach of mapping Petri Nets to SBML was published elsewhere
(Shaw, Koelmans, Steggles & Wipat 2004). However, we developed our own implementation and did
not use any other software besides the libSBML library.
There is only one issue that needs to be clarified, and that is the question of how to map the
tokens in the Petri Net to the units in the SBML model. In general, a stochastic simulation
requires the units to be given in terms of numbers of molecules, but there are several approaches
in the literature. The most common one is to think of one token as one molecule, but there are
also studies in which one token represents a fixed number of molecules (Srivastava et al. 2001,
Arkin et al. 1998).
In this implementation and all subsequent experiments, it is assumed that one token represents one
molecule. If the user wants to obtain deterministic information about the model, the concentration
is computed from the numbers of tokens and the volume of the compartment.
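The conversion just described can be sketched in a few lines (a minimal illustration of the one-token-one-molecule convention; the class name is ours, not part of PNK 2e):

```java
// Sketch of the token-to-concentration conversion assumed above.
public class TokenUnits {

    public static final double AVOGADRO = 6.02214076e23; // molecules per mole

    // Concentration in mol/L for a marking of `tokens` molecules
    // in a compartment of the given volume (in litres).
    public static double concentration(long tokens, double volumeLitres) {
        return tokens / (AVOGADRO * volumeLitres);
    }

    public static void main(String[] args) {
        // 1000 tokens in a bacterium-sized compartment of 1e-15 L
        // gives a concentration of roughly 1.66e-6 mol/L.
        System.out.println(concentration(1000, 1e-15));
    }
}
```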
4.3 Stochastic Simulations with Dizzy
The software Dizzy (Ramsey et al. 2005) is developed at the Institute for Systems Biology in
Seattle. It implements the Gillespie algorithm and its improvement by Gibson and Bruck. Dizzy
also contains a tool for the integration of differential equations and is capable of importing
models from SBML. The user can plot the results or write them to a file for further processing
by other tools. The program is written in Java and published as Open Source software.
Dizzy was used to perform the simulations in the next chapter. There are several tools with
similar functionality that are supported by the SBW. It was decided to use Dizzy since it has a
very convenient graphical interface. In addition, it also offers a so-called programmatic
interface. This interface can be used by Java programs to use the algorithms implemented by Dizzy
directly, without the SBW, simply by importing the Dizzy libraries. This was useful since the
first release of the SBW was not very stable and the connection to the Broker was interrupted
frequently, which also interrupted the simulation. We reported these problems to the developers
of the workbench, but it took them some time to correct this bug. In the meantime, Dizzy could be
used directly through its programmatic interface. Later it was decided to switch back to the
SBW-mediated communication: a communication of the Petri Net Kernel and Dizzy through the
workbench is more flexible and allows other tools to be included in the analysis as well.
Dizzy is also capable of reading CMDL, which is a simplified version of SBML suitable only for
the description of small models. An export module for the Petri Net Kernel was written since
CMDL is easier to read and to debug if the model is small. But SBML is much more commonly used,
and therefore no examples of CMDL will be given.
4.4 Conclusions
In this chapter, the technical methodology used in this project has been described. The Petri Net
Kernel and its extensions made during this dissertation were detailed, and a survey of the Systems
Biology Workbench was given. Both projects are similar in their approach and have proven to be
useful during this dissertation. There seems to be a clear trend in academic software engineering
to develop frameworks that integrate different tools and avoid duplicated effort.
It was decided to extend existing tools instead of writing new software in order to have more time
for the experimental part of this project. This approach was successful. On the other hand, it
turned out that one has to spend considerable time to understand programs written by other people.
This is particularly important if one wants to make changes to the software or to implement new
features. Furthermore, if the software contains bugs, they are often very difficult to locate, and
one is usually dependent on the help of the developers of the software. This is not the case if
one uses only self-written software.
During this dissertation, considerable help was received from the developers of the Systems
Biology Workbench, especially Frank Bergmann at the Keck Graduate Institute in Claremont, USA,
and Stephen Ramsey, who is developing the simulation software Dizzy. Both responded quickly
to any questions and were eager to extend their software and to provide additional documentation.
Reported bugs were often corrected within a few days. The friendly and open attitude that is
characteristic of the Open Source community greatly facilitated the work on this dissertation.
The extension to the Petri Net Kernel that was developed during this dissertation is available
online at www.inf.fu-berlin.de/˜trieglaf/PNK2e/. This web page also provides detailed
documentation of the software generated with the javadoc tool. Appendix A contains a user guide
to PNK 2e which includes screenshots and a short introduction. This document is also available
online at the aforementioned web page.
Chapter 5
Experiments
This chapter describes the experiments conducted with the extended Petri Net Kernel. We start
with a simulation of the Lotka-Volterra reactions. This well-known model has been extensively
used as an example of a system whose behaviour cannot be accurately computed with deterministic
simulations (Gillespie 1976). In this dissertation, it serves as a first example of an application
of the Petri Net Kernel. It highlights the usefulness of Stochastic Petri Nets for the modelling
of biological reactions and, since the simulation results have already been published, is used to
validate our approach.
The second section deals with a more comprehensive example, a stochastic simulation of a
regulatory gene network that exhibits circadian rhythms. It is known that many organisms exhibit
a circadian rhythm on the level of genes and proteins that follows the daily cycle of day and
night. However, the details of the molecular mechanisms upon which these rhythms rely are not yet
understood. Two competing models and their dynamic behaviour are compared. These models differ
heavily in their architecture and the assumptions they make. Consequently, they also exhibit very
different behaviour when simulated stochastically under changing conditions. This chapter
concludes with some experiments dealing with the synchronisation of several biomolecular clocks.
5.1 The Lotka-Volterra Reactions
The so-called Lotka-Volterra reactions are described by a set of three coupled, autocatalytic
reactions:

Y  --c1-->  2Y           (5.1)
Y + X  --c2-->  2X       (5.2)
X  --c3-->  ∅            (5.3)
This model can also be seen as a simple description of a predator-prey ecosystem. In this case,
Y represents the prey which reproduces itself in reaction 5.1. Reaction 5.2 describes how X , the
predator species, reproduces by feeding on the prey. Reaction 5.3 models the demise of X through
natural causes.
These reactions have been extensively modelled using different deterministic and stochastic
approaches. The corresponding reaction-rate equations are given by

dY/dt = c1 Y - c2 X Y        (5.4)
dX/dt = c2 X Y - c3 X        (5.5)

The nontrivial steady state of this system is given by dY/dt = dX/dt = 0, and one can show that
it is characterized by Y = c3/c2 and X = c1/c2.
For X = Y = 1000, c1 = c3 = 10 and c2 = 0.01, a deterministic approach predicts that this
situation will persist indefinitely (Gillespie 1976). Figure 5.1(b) shows the result of a
numerical integration of equations 5.4 and 5.5 with these initial conditions: the number of
molecules indeed remains constant over time. But this is not what one would expect from a good
model of a predator-prey ecosystem.
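The deterministic prediction is easy to reproduce with a few lines of code. The sketch below is a minimal Euler integrator of our own, not the solver used for figure 5.1(b); it integrates equations 5.4 and 5.5 from the steady state, where both derivatives vanish, so the populations do not move:

```java
public class LotkaVolterraOde {

    // Integrate dY/dt = c1*Y - c2*X*Y and dX/dt = c2*X*Y - c3*X with a
    // simple Euler scheme; returns {Y, X} after `steps` steps of size dt.
    public static double[] integrate(double y0, double x0, double dt, int steps) {
        double c1 = 10, c2 = 0.01, c3 = 10;
        double y = y0, x = x0;
        for (int i = 0; i < steps; i++) {
            double dy = c1 * y - c2 * x * y; // eq. 5.4
            double dx = c2 * x * y - c3 * x; // eq. 5.5
            y += dt * dy;
            x += dt * dx;
        }
        return new double[] { y, x };
    }

    public static void main(String[] args) {
        // Start exactly at the steady state Y = c3/c2 = 1000, X = c1/c2 = 1000.
        double[] r = integrate(1000, 1000, 1e-4, 100000);
        System.out.println("Y = " + r[0] + ", X = " + r[1]); // both remain at (essentially) 1000
    }
}
```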
It was also shown that this system exhibits an oscillatory behaviour if modelled stochastically.
The system was simulated by mapping the Stochastic Petri Net to its SBML representation using
PNK 2e.

Figure 5.1: (a) the SPN representation of the Lotka-Volterra reactions; (b) the deterministic
solution of the reaction-rate equations with X = Y = 1000, c1 = c3 = 10 and c2 = 0.01, obtained
by numerical integration of the differential equations 5.4 and 5.5.

This SBML representation was used to simulate the dynamic behaviour of the model with
the Gillespie algorithm. Figure 5.2(a) shows the result of this simulation with the expected
oscillations. We conducted our experiments with initial conditions that match the steady state.
Further experiments have shown that the behaviour of the system is heavily influenced by its
initial conditions: the system does not always reach one of the steady states. Nonetheless, the
stochastic and deterministic formulations come to different results in many cases (data not
shown). We decided to present simulations with the same parameters as Gillespie (1976) because
this example is very illustrative.
5.1.1 Results and Discussion
It is not difficult to see why the Lotka-Volterra model exhibits an oscillatory behaviour. If one
examines figure 5.2(a) carefully, one can see that each rise in the prey population is followed
by an increase of the predator population and a subsequent decrease in the prey population. If the
prey population increases, the amount of food available to the predators rises as well. This leads
to an increase of the predator population, followed by a decrease in the prey population. The
resultant food shortage for the predators leads to a decline in their population, which permits
the prey population to increase again, and so on.
Figure 5.2: Stochastic simulation of the Lotka-Volterra reactions with the Gillespie algorithm:
(a) the time course over 30 time steps; (b) the number of X molecules plotted against the number
of Y molecules. The number of X and Y molecules oscillates between 150 and 2600.
Gillespie (1976) gives a more formal analysis of the simulation results. If one solves the
differential equations 5.4 and 5.5 for an arbitrary initial state (X0, Y0), the solution is an
orbit in the (X, Y) plane that passes through (X0, Y0). One can show that this solution is only
neutrally stable in a mathematical sense: if the system is perturbed by some random fluctuation,
it is driven out of this orbit and ends up on a new solution orbit passing through a state
(X0', Y0'). Figure 5.2(b) illustrates this behaviour. It shows a plot of the number of X
(predator) versus the number of Y (prey) molecules. The system passes through several neutrally
stable, concentric solution orbits. Fluctuations can drive it either outward or inward onto a new
orbit. Sooner or later, a random fluctuation will drive the system onto one of the two axes in
figure 5.2(b); that is, either the prey or the predator population will die out. If the prey dies
out first, the predator population will die out as well. If the predator population dies out
first, the prey population will increase indefinitely.
The Lotka-Volterra reactions are of course a very simple example. They were used by Gillespie
(1976) to underline that there are (bio-)chemical systems whose behaviour cannot be reliably
predicted using deterministic approaches. The reason is that a deterministic formulation does not
take into account the random fluctuations occurring at the microscopic level. In contrast, the
stochastic simulation is a much more natural framework and gives results that are closer to
intuition. In this work, it was decided to use this example again to show that the Petri Net
representation is able to represent chemical reactions in a natural way. The SPN representation
of the reactions is given in figure 5.1(a) and is easy to understand. Furthermore, if the
transition rates in the Stochastic Petri Net are chosen according to mass-action rate kinetics
(Cox & Nelson 2004), the net can be efficiently simulated using the Gillespie algorithm, as done
in our experiments.
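For reference, the essence of such a simulation can be sketched in a few lines. This is a minimal direct-method implementation of our own, with the mass-action propensities of the three transitions, and not the SBW simulator used for the experiments:

```java
import java.util.Random;

public class GillespieLotkaVolterra {

    // One trajectory of the Lotka-Volterra net with Gillespie's direct method;
    // the propensities correspond to the transition rates of the SPN.
    public static long[] simulate(long prey, long predator, double tEnd, long seed) {
        double c1 = 10, c2 = 0.01, c3 = 10;
        long y = prey, x = predator;
        double t = 0;
        Random rng = new Random(seed);
        while (t < tEnd && (y > 0 || x > 0)) {
            double a1 = c1 * y;      // Y -> 2Y     (reproduction of prey)
            double a2 = c2 * y * x;  // Y + X -> 2X (consumption of prey)
            double a3 = c3 * x;      // X -> 0      (predator death)
            double a0 = a1 + a2 + a3;
            if (a0 == 0) break;
            t += -Math.log(1.0 - rng.nextDouble()) / a0; // exponential waiting time
            double r = rng.nextDouble() * a0;            // pick the next reaction
            if (r < a1) {
                y++;
            } else if (r < a1 + a2) {
                y--; x++;
            } else {
                x--;
            }
        }
        return new long[] { y, x };
    }

    public static void main(String[] args) {
        long[] s = simulate(1000, 1000, 5.0, 42);
        System.out.println("prey = " + s[0] + ", predators = " + s[1]);
    }
}
```

Recording (t, y, x) inside the loop yields trajectories like the one in figure 5.2(a).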
5.2 Stochastic models of circadian rhythms
5.2.1 The delay-based Model
The daily alternation of day and night affects nearly all life-forms. Many organisms have evolved
rhythmic responses that follow this day-night cycle. These responses range from behaviour
(sleep-wake cycles, feeding rhythms) to molecular rhythms (e.g. gene expression and
enzyme-activity rhythms).
This section deals with a core molecular model capable of generating circadian rhythms in
Neurospora crassa, a red bread mould. This mould is often used as a model organism since it is
easy to grow, and its genome is simple and fully sequenced. The model itself represents a rhythmic
response on the level of gene expression (i.e. the response is given by oscillations in the
concentration of a regulatory protein with a period close to 24 h). One has to keep in mind that
this model is fairly general. It does not aim at capturing all the details in a real cell. It is
also not specific to Neurospora but represents an architecture that is thought to be
representative of a biomolecular clock in simple organisms. As an example, Gonze, Halloy &
Goldbeter (2002) state that, with small modifications, this system is equivalent to the
biomolecular clock in Drosophila melanogaster, the fruit fly.
There are several models of genetic oscillators, almost all of which are based on a negative
feedback loop in which a regulatory protein inhibits its own expression. The model we are
examining was first published by Leloup, Gonze & Goldbeter (1999). This deterministic description
was later transformed into a stochastic model by Barkai & Leibler (2000). The simulation of this
model revealed a very noisy behaviour and no stable oscillations. Their conclusion was that the
model is wrong since it is only able to exhibit circadian rhythms if solved deterministically;
the deterministic formulation, however, is assumed to be inadequate since the regulatory protein
occurs in very few copies only. Two years later, the same model was simulated again using the
same algorithm but with different rate constants for the protein-DNA interactions (Gonze, Halloy
& Goldbeter 2002). These kinetic constants are thought to be critical for the noise resistance of
the system. Under these conditions the model revealed stable oscillations with a period of about
24 hours.
Gonze, Halloy & Goldbeter (2002) argue that the kinetic rate constants used by Barkai & Leibler
(2000) in their stochastic model were too small. Using different parameters, they were able to
produce stable oscillations in a stochastic simulation even with very few molecules. In the
experiments that follow, the same kinetic constants and experimental setup are used as by Gonze,
Halloy & Goldbeter (2002). It is, however, not clear which parameters are actually correct, since
no experimental data is available so far.
Figure 5.3: Core model for circadian rhythms based on delay. This plot is taken from Gonze,
Halloy & Goldbeter (2002). The model incorporates gene transcription as well as transport,
degradation and translation of mRNA (Mp). The clock protein (P0) that is synthesized from the
mRNA is reversibly phosphorylated into the forms P1 and P2 successively. P2 is either degraded or
transported into the nucleus (PN), where it exerts a negative feedback on its own gene. The
inhibition is cooperative (see explanation in the text below). This general model accounts for
circadian oscillations in Neurospora but also in Drosophila.
The model
An overview of the model is given in figure 5.3. Essentially, a protein is phosphorylated in two
steps, diffuses back into the nucleus and inhibits its own synthesis. This negative feedback loop
leads, if timed correctly, to oscillations in the concentration of the protein with a period close
to 24 hours. The phosphorylation induces a delay between the translation of the mRNA and the
diffusion of the protein back into the nucleus. Its role in the biological clock is not yet clear
(Barkai & Leibler 2000), and there are theoretical models of circadian oscillations that can do
without it (Gonze, Halloy & Gaspard 2002). Up to four proteins must bind successively to the gene
promoter to repress transcription. The resulting inhibition of the gene is cooperative. This means
that each bound protein facilitates the binding of the next protein, i.e. the rate constants of
these reactions
are increased. This is a phenomenon that occurs very frequently in nature. Cooperativity is not
required for the oscillations to take place either; nevertheless, a cooperative inhibition
increases the robustness of the oscillations. The experiments of Gonze, Halloy & Goldbeter (2002)
revealed that the highest robustness is achieved with three proteins binding to the gene. They
understand robustness as an informal measure of the regularity of the oscillations. Whenever we
use the term robustness in the remainder of this dissertation, we refer to this regularity as
well. This regularity is determined by the deviation in period and amplitude of the oscillations.
The differential equations that correspond to the individual reaction steps in the model are given
in Appendix B. From this deterministic model, a description of the detailed reaction steps was
derived. Reactions following Michaelis-Menten kinetics are decomposed into single steps. The
Petri Net representation of the reaction steps is large and is therefore also included in
Appendix B. The probabilities of the reactions are, if not present in the deterministic model,
taken from the literature. As outlined before, the aim of the stochastic formulation was to show
that this model is adequate, a fact that was questioned in a previous study (Barkai & Leibler
2000).
The volume of the system, Ω, was changed systematically in order to modulate the number of molecules in the model. To understand why an increase of the volume results in a larger number of molecules, one needs to recall how the deterministic rate constants are converted into stochastic reaction constants (see section 2.3.2 in chapter 2). Intuitively, enlarging the container should dilute the system and involve fewer molecules. But this is not the case in this scenario. Since the deterministic constants are expressed in terms of the concentrations of the molecular species, we multiply these constants by the volume of the system and obtain constants in terms of the number of molecules. The deterministic constants are fixed, and therefore an increase in the volume of the system increases the magnitude of the reaction probabilities and effectively the number of molecules involved. Gonze, Halloy & Goldbeter (2002) refer to Ω as the size of the system rather than its volume in order to avoid confusion.
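For mass-action reactions, the conversion just described can be written down compactly. The following sketch assumes that Ω already absorbs Avogadro's number and treats the special rescaling of gene-promoter reactions (cf. table B.2) as a simple extra factor; all names here are illustrative:

```python
# Sketch of the volume scaling described above, assuming mass-action
# kinetics and that Omega already absorbs Avogadro's number. The
# promoter flag mirrors the scaling of table B.2 that keeps the gene
# count at one; this is an assumption for illustration only.

def stochastic_constant(k_det, order, omega, involves_promoter=False):
    """Convert a deterministic constant of the given reaction order.

    order 0 (synthesis): c = k * Omega
    order 1 (decay):     c = k
    order 2 (binding):   c = k / Omega
    """
    c = k_det * omega ** (1 - order)
    if involves_promoter:
        c *= omega  # undo one factor of Omega for the single gene copy
    return c
```

Increasing Ω therefore raises the propensities of the synthesis reactions, and with them the steady-state molecule numbers, as described above.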
The problem with this approach is that if we modify all reaction probabilities in this way, we would also increase the number of genes in the model. Since this is not realistic, all reaction constants involving the gene promoter G are scaled by Ω to keep its number equal to unity. For details see table B.2 in the appendix. The experiments were started by creating a Petri Net representation of this model. The Petri Net is then translated into SBML and its behaviour simulated by the Gillespie algorithm. This algorithm is implemented in the Systems Biology Workbench
(Hucka et al. 2002). We tried first to recreate the results published by Gonze, Halloy & Goldbeter
(2002) and performed simulations with different values of Ω. The aim was to check whether the
stochastic simulations produce results similar to those obtained with the deterministic model.
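The Gillespie direct method used for all stochastic simulations in this chapter can be sketched in a few lines. The following is a minimal illustration on a toy birth-death system, not the Systems Biology Workbench implementation:

```python
import random

def gillespie(state, reactions, t_end, seed=0):
    """Direct-method stochastic simulation (Gillespie 1977).

    `reactions` is a list of (propensity, update) pairs, where
    propensity(state) returns the current rate of the reaction and
    update(state) applies the state change of one firing.
    """
    rng = random.Random(seed)
    t = 0.0
    trajectory = [(t, dict(state))]
    while t < t_end:
        props = [a(state) for a, _ in reactions]
        a0 = sum(props)
        if a0 == 0.0:
            break                    # no reaction can fire any more
        t += rng.expovariate(a0)     # waiting time ~ Exp(a0)
        r, acc = rng.uniform(0.0, a0), 0.0
        for p, (_, update) in zip(props, reactions):
            acc += p
            if r <= acc:             # reaction chosen with prob p / a0
                update(state)
                break
        trajectory.append((t, dict(state)))
    return trajectory

# Toy birth-death process: synthesis at 5/h, degradation at 0.1/h per molecule
state = {"X": 0}
reactions = [
    (lambda s: 5.0,          lambda s: s.update(X=s["X"] + 1)),
    (lambda s: 0.1 * s["X"], lambda s: s.update(X=s["X"] - 1)),
]
traj = gillespie(state, reactions, t_end=200.0)
# the population settles around 5 / 0.1 = 50 molecules
```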
Estimates of period in the oscillations were obtained with the Matlab Signal Processing Toolbox.
The Fourier transform was applied to the time course of the protein population. This transform
aims at decomposing a noisy signal into a linear combination of sine and cosine functions. The
power spectrum or spectral density was obtained and used to estimate the strength of the different
frequencies that form the signal, in this case the time evolution of the numbers of proteins. From
the frequency estimate, the period of the signal was obtained.
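This period estimation can be reproduced with any FFT library. The sketch below uses NumPy instead of the Matlab Signal Processing Toolbox and a synthetic noisy 24-hour signal, since the simulated time courses themselves are not reproduced here:

```python
import numpy as np

def estimate_period(signal, dt):
    """Estimate the dominant period of a (noisy) periodic signal.

    Computes the power spectrum with an FFT and returns the period
    corresponding to the strongest non-zero frequency.
    """
    x = np.asarray(signal) - np.mean(signal)   # remove DC component
    power = np.abs(np.fft.rfft(x)) ** 2        # power spectrum
    freqs = np.fft.rfftfreq(len(x), d=dt)
    k = np.argmax(power[1:]) + 1               # skip the zero frequency
    return 1.0 / freqs[k]

# Synthetic "circadian" signal: 24 h period sampled every 0.5 h plus noise
t = np.arange(0, 480, 0.5)
rng = np.random.default_rng(0)
y = np.sin(2 * np.pi * t / 24.0) + 0.3 * rng.standard_normal(t.size)
period = estimate_period(y, dt=0.5)  # close to 24 hours
```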
[Figure 5.4: four panels. Top row: deterministic simulation (concentrations in nM of mRNA, nuclear protein and all protein molecules over 0-200 h) and the limit cycle of nuclear protein concentration PN against mRNA concentration Mp. Bottom row: stochastic simulation with Ω = 500 (numbers of mRNA and protein molecules over time) and the corresponding limit cycle of PN against MP.]
Figure 5.4: Delay-based circadian clock: stochastic and deterministic simulation The first row
gives the results obtained in the absence of noise. These curves are generated by numerical integration of the
kinetic equations as given in Appendix B. The oscillations of mRNA (Mp), nuclear (PN) and total clock protein
(Pt) correspond to the evolution towards a limit cycle shown as a projection onto the (Mp,PN) plane. The
results in the second row are obtained by stochastic simulation of the chemical reactions corresponding to
the deterministic model. The number of mRNA molecules oscillates between a few and about 1000 whereas
nuclear and total clock protein oscillate in the ranges of 200 - 4000 and 800 - 8000, respectively.
[Figure 5.5: three rows of plots for Ω = 100, 50 and 10. Each row shows the time course of the numbers of mRNA and protein molecules, the limit cycle of nuclear protein molecules PN against mRNA molecules MP, and the sample autocorrelation function.]
Figure 5.5: Effect of the number of molecules on the robustness of the oscillations. The plots
show the results of stochastic simulations with Ω changing from 100 to 50 and 10. The left plot in each row
shows the oscillations for mRNA, nuclear protein and all proteins during a simulation time of 484 hours. The
middle plot shows the corresponding limit cycle and the right plot the time evolution of the autocorrelation
function. The autocorrelation function was computed for time lags from 0 to 480. For Ω = 50 and 100, one
can still observe robust circadian oscillations. Only for Ω = 10 do the oscillations become very noisy. This fact
is underlined by the rapid decrease of the autocorrelation function. The simulations were conducted with the
Gillespie algorithm. The plots of the autocorrelation function were created with Matlab.
Experiments and Results
The aim of this first experiment was to check whether for sufficiently large numbers of molecules,
a stochastic simulation will produce results similar to the deterministic model. A simulation of
the SPN representing the model was performed with the Gillespie algorithm (Figure 5.4). The left
plot in each row shows the oscillations of mRNA (MP), nuclear protein (PN) and the sum of all
protein molecules obtained with the deterministic model (first row) and the stochastic simulation.
These oscillations evolve towards a limit cycle which is shown in the second plot in each row. The
stochastic simulation was conducted for Ω = 500.
For this particular value of Ω, the oscillations of the stochastic model are quite stable with a pe-
riod of 24.5 hours and a standard deviation of 1.1 hours. Similar values were obtained by Gonze,
Halloy & Goldbeter (2002). The number of mRNA molecules varies in the range of 0−1000 and
the number of proteins in the range of 200− 4000 (nuclear form) and 800− 8000 (all proteins).
The deterministic model shows stable oscillations with a period of 23.8 hours. In this model, the
concentration of mRNA varies in the range of 0−2 nM whereas the protein concentrations change
in a range of 1−6.5 nM (PN) and 2.4−14.8 nM (all proteins). It seems that the molecular noise, which is considered only by the stochastic model, merely induces a change in the amplitude of the oscillations and not in the period.
The next experiments dealt with the influence of decreasing numbers of molecules on the results
of the stochastic simulation. Stochastic simulations were performed for Ω = 10,50 and 100. The
results in figure 5.5 reveal that stable oscillations occur with Ω = 100 (first row) and 50 (second row). With these parameters, the number of mRNA molecules oscillates in the range of 25−930 (Ω = 100) and 15−800 (Ω = 50). For smaller numbers of molecules, however, the circadian rhythms are increasingly obscured by noise. The bottom row shows the result of a simulation with Ω = 10. The number of mRNA molecules varies from 0 to 200 and the number of proteins changes from 0 to 200 (PN) and from 50 to 400 (all proteins). The limit cycle is no longer visible and the circadian oscillations have become very noisy.
These observations are underlined by the time evolution of the autocorrelation function, which is given in the third column of each row. The autocorrelation function measures the degree of periodicity of a signal. It is the correlation of a discrete process against a time-shifted version of itself, for a time lag τ, and is defined by:
R_f(τ) = E[(X_i − µ)(X_{i+τ} − µ)] / σ²
where E is the expected value, µ the mean and σ² the variance of the process. For a deterministic and periodic time series, the autocorrelation function oscillates between −1 and 1. In the presence of noise, the more periodic the signal, the more slowly the autocorrelation function decays to zero. This loss of correlation
is due to the phenomenon of phase diffusion. In the presence of noise the phase of free-running
oscillations varies in such a way that it eventually covers the whole range of possible values over a
period (Gonze, Halloy & Goldbeter 2002).
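The sample autocorrelation plotted in the figures can be computed directly from a simulated time course. A minimal NumPy sketch (the dissertation used Matlab for these plots):

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Sample autocorrelation R(tau) for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    mu, var = x.mean(), x.var()
    n = len(x)
    return np.array([
        np.mean((x[: n - tau] - mu) * (x[tau:] - mu)) / var
        for tau in range(max_lag + 1)
    ])

# A pure sine keeps full correlation at lags equal to its period
t = np.arange(0, 200, 0.5)                 # 200 h, sampled every half hour
x = np.sin(2 * np.pi * t / 24.0)
r = sample_autocorrelation(x, max_lag=96)
# r[0] is 1; r[48] (a lag of 24 h) stays close to 1 for this noiseless signal
```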
If many molecules are present in the system, the autocorrelation decreases slowly, as can be observed for Ω = 100. If noise starts to obliterate the oscillatory behaviour, the autocorrelation function decreases more rapidly. This can be seen for lower numbers of molecules (Ω = 10, 50). So far, we have merely repeated experiments that were published elsewhere (Gonze, Halloy & Goldbeter 2002). We can now hope that our approach of modelling the reaction steps with a SPN is valid, and we can apply this method and the experimental setup to other models. This has not yet been done.
5.2.2 The hysteresis-based Model
As mentioned at the beginning of this chapter, the validity of the delay-based model has been
questioned (Barkai & Leibler 2000). In this part of the dissertation, we will present a different
model based on hysteresis (Vilar et al. 2002). In general, hysteresis is a property of a system that
describes a memory or lagging effect. In contrast to the previous model, this genetic oscillator
consists of two different components, an activator and a repressor protein. The expression of the
activator leads to a delayed expression of the repressor. The repressor inhibits the synthesis of the
activator protein and is the source of the oscillations.
The delay-based and the hysteresis-based model are very different, but both are general models that try to explain how circadian oscillations might be generated at the molecular level. A short comparison of both models has already been made (Barkai & Leibler 2000). Recently, it has been suggested that the rate of the binding reaction between protein and DNA has a significant influence on the noise resistance of the oscillations (Forger & Peskin 2005). We therefore repeat previous experiments with faster rate constants for these binding reactions, which are assumed to increase the noise resistance.
We also change the size of the hysteresis-based model in the same fashion as for the delay-based
model in the last section. This was not part of the experiments in Barkai & Leibler (2000).
The Model
Figure 5.6: The hysteresis-based model. Illustration taken from Barkai & Leibler (2000). This model consists of two active components, an activator (A) and a repressor (R). A stimulates the synthesis of R. If the concentration of R rises, A is degraded quickly. After the slow degradation of R, A is synthesized again.
Figure 5.6 above gives an illustration of the second oscillatory network that is examined in this
section. The model also includes the degradation of both messenger RNAs synthesized from genes
PA and PR. The activator protein A binds to the promoter region of its own gene, which leads to an increase of the transcription rate. This type of feedback loop increases the noise resistance
of the oscillator and seems to be a common feature among several competing models (Barkai &
Leibler 2000). Protein A also binds to the promoter of gene PR and induces its expression. The
gene is transcribed into mRNA and the resulting protein R binds to protein A. The degradation
complex (C) is formed (not shown in the simplified illustration above) and A is degraded; that is, the degradation complex C decays into R. The cycle is completed by the degradation of the repressor R and the subsequent re-expression of the activator. For a detailed description of all reaction steps in the
model, see Appendix C.
As for the previous model, this oscillator is assumed to represent the core architecture of a biomolecular clock. It is not specific to any organism, but contains features, such as a positive feedback loop and two competing components, that were found experimentally in a variety of organisms, from cyanobacteria to mammals (Dunlap 1999). The dynamics of this system are captured
by a set of differential equations which are given in Appendix C. These equations have been de-
composed into the elementary steps in the same way as for the delay-based model. The reaction
probabilities are given in table C.1 and the SPN representation of the reactions in figure C.1 in the
third appendix.
[Figure 5.7: four panels. (a) Deterministic simulation (numbers of C and R molecules over 0-200 h); (b) limit cycle of the deterministic simulation (R against C); (c) stochastic simulation; (d) limit cycle of the stochastic simulation.]
Figure 5.7: Hysteresis-based circadian clock: stochastic and deterministic simulation Oscillations in the repressor protein (R) and the degradation complex (C) obtained by numerical simulation of the deterministic (5.7(a)) and stochastic (5.7(c)) descriptions of the model.
Experiments and Results
The delay-based model that was examined in the last section reveals stable oscillations at high numbers of molecules, i.e. if the size of the system is increased to Ω = 500. On the other hand, if Ω is decreased to 10, oscillations can still be perceived, but with a very irregular period and amplitude. The delay-based and the hysteresis-based model have already been compared (Barkai &
Leibler 2000) but with slower rate constants and without changing the system size. We present a
more thorough comparison with faster rates for the protein and DNA binding reactions and with
changing numbers of proteins and mRNA in the model. All stochastic simulations were started with one gene each for the repressor and the activator protein, and with no mRNA or protein molecules, as initial conditions.
To begin, we compare the stochastic formulation of the new hysteresis-based model with its deter-
ministic formulation. Figure 5.7 shows the results of this experiment. Following the work of Vilar
et al. (2002), we give the time course of repressor protein (R) and degradation complex (C) which
consists of activator and repressor protein. Deterministic formulation and stochastic simulation are
in good agreement and correspond to the results obtained by Vilar et al. (2002). The degradation
complex is formed as soon as activator and repressor proteins are available. It decays into R and
therefore a peak in the concentration of C is followed by a peak in the concentration of R. Later
on, R is degraded and the cycle starts again. This experiment was performed with Ω = 1, as done
by Vilar et al. (2002). With these parameter settings, the stochastic simulation results in a change
of amplitude in the oscillations but the period remains remarkably stable.
These experiments have already been performed elsewhere (Barkai & Leibler 2000), but in contrast to our experiments, slow rate constants were used for the binding reaction between protein and DNA. Whereas the noise resistance of the delay-based model is greatly improved with our higher rate constants, we cannot observe any change in the behaviour of the hysteresis-based model.
Similar to the experiments in the last section, we now examine the behaviour of the biomolecular clock for different numbers of molecules. The delay-based model exhibited very noisy oscillations at low values of Ω. If Ω is increased, the behaviour of the stochastic model approaches the time course of the deterministic simulation. The fact that random fluctuations on the molecular
level are averaged out if enough molecules are involved has been proven formally (Kurtz 1971)
and the behaviour of the delay-based model confirms this.
In contrast to these results, the hysteresis-based model reveals a very different behaviour. For
Ω = 10 and 50, the oscillations are very stable (third and second row of figure 5.9). But if the size
of the system is further increased, the oscillations stop completely (first row of figure 5.9). For
this model, higher numbers of molecules do not seem to improve the oscillatory behaviour of the
model. A limited amount of noise seems to have a positive influence on the oscillator. For higher
numbers of molecules, the system seems to approach a steady state.
These results confirm the findings of Vilar et al. (2002), who performed a theoretical analysis of a simplified version of the hysteresis-based oscillator. They were able to show that molecular fluctuations can actually enhance the oscillator. Essentially, small perturbations can drive the system out of a stable state and initiate a new phase. In a deterministic setting, these perturbations are not considered and the system remains in the stable state once it has arrived there. This behaviour was observed for a particularly low value of the degradation rate of the repressor protein R
[Figure 5.8: two panels showing the number of repressor molecules over 0-400 h for (a) the deterministic and (b) the stochastic simulation.]
Figure 5.8: Time evolution of the repressor protein (R) for deterministic (a) and stochastic (b) formulation
of the model. Parameter values are as given in Appendix C except for the degradation rate of R (δR) which is
now 0.05h−1.
(δR = 0.05 h−1). Figure 5.8 shows the results of a deterministic and a stochastic simulation with
these parameters. It is not completely clear whether a similar situation is created if the size of
the system is increased as in our experiments. The simulation shows that the abundance of the
key proteins A and R oscillates between 0 and several thousands of molecules. Vilar et al. (2002)
observed a similar behaviour for Ω = 1. The fact that very few instances of the key proteins are
present only during a very short time interval might be the reason for the noise resistance of the
system at low values of Ω. On the other hand, the perturbations that will necessarily occur at
these low abundances might just be enough to drive the system out of the steady state and into
the next period. The simulation with Ω = 100 reveals that the numbers of both key proteins (data
for A not shown) do not decrease to zero but oscillate around 1000 molecules. This might be why the system approaches a steady state: the influence of fluctuations in the molecular populations is too low to drive the system out of its stable state and initiate a new oscillation.
It is very difficult to obtain coherent conclusions about a complex nonlinear system just from observations. To our knowledge, most of the theoretical results about the behaviour of a model under different parameters were obtained for simplified versions of the model only. In this case, various assumptions were made, such as treating some molecular species as being at steady state, in the hope that both the full and the simplified model exhibit a similar behaviour over a wide range of conditions. It might require further advancement in the theoretical sciences
to obtain new insights. Nevertheless, our simulations have shown that the hysteresis-based model
exhibits an unexpected behaviour if the size of the system is increased.
[Figure 5.9: three rows of plots for Ω = 100, 50 and 10. Each row shows the time course of the numbers of C and R molecules, the limit cycle of R against C, and the sample autocorrelation function over time lags of 0-500 h.]
Figure 5.9: Stochastic simulation with changing numbers of molecules (hysteresis-based
model) The plots show the results of stochastic simulations with Ω changing from 100 to 50 and 10. The
left plot in each row shows the oscillations for repressor (R) and degradation complex (C) during a simulation
time of 200 hours. The middle plot is the limit cycle and the right plot the time evolution of the autocorrelation
function.
Simulating the effects of gene duplication
[Figure 5.10: two panels. (a) Delay-based model, stochastic simulation with Ω = 100 (numbers of mRNA and PN molecules over 0-200 h); (b) hysteresis-based model, stochastic simulation with Ω = 1 (numbers of C and R molecules over 0-200 h).]
Figure 5.10: Both models are simulated with a second copy of the clock gene (activator gene in case of
the hysteresis model). The transcription rate of the gene copy is increased by a factor of 10 in both models. Both models are
simulated with a value of Ω for which they should exhibit stable oscillations (100 for delay-based and 1 for
hysteresis-based model).
Gene duplication is thought to have a major role in evolution. It can happen when an error occurs during DNA replication and a copy of a functional gene is inserted into a different part of the DNA. This copy might be identical to the original gene or mutated. If both copies are functional, one of the genes might mutate later on and acquire a different function, since it is no longer required for the survival of the organism. But both genes can also remain active during the further evolution of the species (Cox & Nelson 2004).
The effect of gene duplication on a genetic oscillator has already been examined by Forger & Pe-
skin (2005). They increased the number of genes in their stochastic model of a circadian clock in
mammals and found that the robustness of the oscillations is improved if more genes are present
in the model. They measured the robustness of the oscillations in terms of the deviation of the
period over many runs. Forger & Peskin (2005) argue that a low number of genes always leads to
some “residual stochasticity” in the model. Even in the limit of a large volume, when all reactions
can be modelled deterministically, the reactions involving the gene and its promoter will still occur
stochastically since the number of genes will not be influenced by the increase of the volume. This
might also explain the fact that faster binding rates between protein and DNA lead to a reduction
of the stochasticity in the model since the randomness of these reactions will be averaged out if
they occur on a very fast time scale (Gonze, Halloy & Goldbeter 2002).
Gene duplication might also involve the mutation of the duplicated gene. This was not considered in the simulations of Forger & Peskin (2005). We simulate gene duplication by introducing a second mutated gene into both models. A mutation can have several effects on a gene. It might lead to a dysfunctional protein product. But it is also possible that the promoter region is modified and that the gene is transcribed at a much higher rate than its original. We modelled mutation by increasing the transcription rate of the gene copy by a factor of 10.
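In terms of the reaction constants fed to the simulator, this modification amounts to duplicating the transcription reaction with a scaled constant. A sketch with hypothetical transition names (not the actual transitions of the SPN models):

```python
# Sketch: simulating gene duplication with a mutated promoter.
# `reactions` maps a transition name to its stochastic rate constant;
# the names and values are hypothetical, for illustration only.

def duplicate_gene(reactions, transcription, factor=10.0):
    """Add a second gene copy whose transcription is `factor` times faster."""
    copy = dict(reactions)
    copy[transcription + "_dup"] = reactions[transcription] * factor
    return copy

original = {"transcription_A": 50.0, "translation_A": 5.0}
mutant = duplicate_gene(original, "transcription_A")
# mutant contains transcription_A_dup with a tenfold rate constant
```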
Figure 5.10 shows the results of this experiment. The hysteresis-based model is apparently not severely perturbed by the modification. The amplitude of the oscillations is decreased and fewer proteins are produced than in the model with only one gene (a maximum of 2043 compared to 2495). It seems that a higher overall transcription rate of the activator gene leads to a faster expression of the repressor protein and in turn to a faster degradation of the activator protein. This might be the reason for the damping of the amplitudes. But the genetic circuit still exhibits regular oscillations.
In contrast to this, the delay-based model is severely affected by the introduction of a second mutated gene. It does not exhibit any oscillations; instead, the number of clock proteins keeps increasing. A maximum value of 390000 molecules was observed during a simulation time of 200 hours.
We tested the behaviour of both genetic oscillators when a second gene with increased transcription
rate is introduced. This modification can be interpreted as a simulation of a duplication of the clock
gene. However, the main intention of this experiment was to examine the behaviour of both models
if key genes are copied and mutated. It is questionable whether the modification that we introduced is a good model of a real gene duplication. Nevertheless, it was shown that the hysteresis-based model is less susceptible to structural modifications. This is a characteristic that is favoured by evolution, since gene networks that are easily affected by mutations might die out quickly. The ability to
function reliably even if key components are mutated is probably necessary for the circadian clock
to be successfully embedded within the cell.
Influence of changes in the Protein-DNA binding rates
In a previous study (Barkai & Leibler 2000), the delay-based model of a biomolecular clock
(Leloup et al. 1999) has been criticised since it exhibits very unstable and noisy oscillations if
simulated stochastically. Later on, Gonze, Halloy & Goldbeter (2002) repeated this stochastic simulation with different rate constants. These experiments revealed that the delay-based oscillator is able to oscillate reliably if the rates of the reaction between the clock protein and its own
gene are set to very high values. Nowadays it is believed that high rate constants in these reac-
tions are crucial for the robustness of the oscillation in many models of circadian clocks (Forger &
Peskin 2005).
We used the same values for the binding reactions in our experiments for both models, hysteresis-
and delay-based. But we briefly repeat the simulations conducted by Barkai & Leibler (2000) with
low rate constants to finalise our comparison of both models.
[Figure 5.11: two panels. (a) Delay-based model, stochastic simulation with Ω = 500 (numbers of mRNA and PN molecules over 0-500 h); (b) hysteresis-based model (numbers of C and R molecules over 0-200 h).]
Figure 5.11: Simulation of both circadian clocks with low rate constants Stochastic simulation of both
models with low rate constants for the binding reactions between DNA and proteins. The rates were set to
50 (binding) and 10h−1 (dissociation).
Our findings match the results obtained by Barkai & Leibler (2000). Whereas the hysteresis-based model exhibits stable and pronounced oscillations even with low rate constants, the delay-based model still oscillates, but in a very noisy manner. No period even close to 24 hours is visible.
The problem here is that the true values of the protein-DNA binding rates are not known. Even if it has been observed that high rate constants are crucial for the robustness of the model, it is not clear whether these rate constants reflect reality. Currently there are no experimental data about the kinetics of these reactions. However, the obtained results can be seen as an indication of the soundness of both models, since real circadian clocks have to function reliably if their parameters are changed due to external influences such as changes in temperature or the current state of the organism (hunger, stress, etc.).
5.3 Synchronising several oscillating cells
The two models presented in this chapter are oscillators with a very general structure. They do not
contain features specific to any organism. They exhibit somewhat contradictory behaviour, showing only noisy or no oscillations at all under certain conditions. However, the ability to create stable circadian oscillations under a large variety of external conditions is thought to be a key feature of biological clocks (Barkai & Leibler 2000). Changes in transcription and translation rates may arise from variations in nutrition, growth conditions or temperature. It seems reasonable that evolution favours designs of cellular systems that function reliably despite global changes in their environment.
There might be several factors that can help biomolecular clocks to achieve this task. The entrain-
ment by daylight is certainly such a factor. In the biomolecular clock of Neurospora, light enhances
the degradation of the clock protein and by doing this, exerts a periodic forcing of the clock that
was found to improve the noise resistance (Gonze, Halloy & Goldbeter 2002). Light can be seen
as some kind of external synchronisation. There are also theories of an internal synchronisation, e.g. a synchronisation between different cells such that the oscillations of a group of cells are more stable than those of a single cell (Forger & Peskin 2005).
Experiments and Results
This section presents a short experiment about a possible mechanism of synchronisation between
different clocks. One possibility is to simply average the oscillations of several cells. But we would expect that noisy oscillations which are averaged over a large number of cells simply cancel out due to the different phase shift of each cell's oscillation.
In order to cope with this problem, a single run of the delay-based model was observed at Ω = 50.
After this system left the transient phase and settled into some more or less stable limit cycle, the
numbers of each molecular species were recorded and used as initial state for a new run of 100
cells. Again, the average was taken of the individual oscillations. The rationale of this approach
is that each cell should start at a common initial state such that the initial shift of the oscillations
against each other is zero. The results of these experiments are shown in figure 5.12.
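The washing-out effect that these experiments exhibit can be illustrated without the full SPN model: if each cell's phase diffuses independently, the population average flattens even though every single cell keeps oscillating. A synthetic sketch under an assumed random-walk phase model (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0, 480, 0.5)            # 480 h sampled every half hour
n_cells, period, sigma = 100, 24.0, 0.1

cells = []
for _ in range(n_cells):
    # each cell starts in phase, but its phase drifts as a random walk
    phase = np.cumsum(sigma * rng.standard_normal(t.size))
    cells.append(np.sin(2 * np.pi * t / period + phase))
cells = np.array(cells)

average = cells.mean(axis=0)
# early on the cells are still aligned; later the average flattens
early = np.ptp(average[: t.size // 4])   # peak-to-peak, first quarter
late = np.ptp(average[-t.size // 4 :])   # peak-to-peak, last quarter
```

The peak-to-peak amplitude of the average is large at the start and decays as the phases drift apart, which is exactly the behaviour observed in the averaged simulations.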
In both cases, the experiments were not very successful. In fact, the oscillations that were averaged over several cells are even weaker than the oscillations of a single cell. The autocorrelation function decays very fast and a limit cycle is nonexistent. However, we can also observe that the very
first oscillations are clearly pronounced in each experiment. In the later course of the simulation,
the oscillations become noisier and their average converges towards a straight line.
What can be concluded from these experiments? First of all, there is almost certainly some kind
of synchronisation of oscillating cells (Forger & Peskin 2005). But given the results from these
experiments and given the fact that cells are clearly separated compartments, it does not appear
to be reasonable to simply average the oscillations of several noisy cells. It might be possible to
improve the results of these experiments by combining external and internal synchronisation. We
could introduce a factor that simulates the influence of daylight, for instance by changing the
degradation rate of the clock protein in the delay-based model every 12 hours, and then average
the oscillations of several cells. However, this was not possible in our experimental setup: once
the SPN model is created, it is simulated in a single run, and there is no possibility to change
parameters during a simulation in the Systems Biology Workbench.
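If a simulator allowed its state to be modified between steps, such a light-forcing experiment could be scripted directly. The following sketch illustrates the idea on a deliberately simple birth-death process (all rate values are invented for illustration; they are not parameters of either clock model): the degradation rate is switched every 12 hours inside a Gillespie-style loop.

```python
import random

random.seed(1)

def simulate(t_end=96.0, light_period=12.0):
    """Birth-death sketch: a protein is produced at a constant rate and
    degraded at a rate that doubles during the 'daylight' half-periods.
    All rate values here are illustrative, not model parameters."""
    t, p = 0.0, 50
    k_prod = 100.0                      # production, molecules per hour
    k_deg_night, k_deg_day = 1.0, 2.0   # degradation, per molecule per hour
    trace = [(t, p)]
    while t < t_end:
        daylight = int(t // light_period) % 2 == 0
        k_deg = k_deg_day if daylight else k_deg_night
        a_prod, a_deg = k_prod, k_deg * p
        a_total = a_prod + a_deg
        t += random.expovariate(a_total)        # waiting time to next event
        if random.random() * a_total < a_prod:  # choose the event to fire
            p += 1
        else:
            p -= 1
        trace.append((t, p))
    return trace

trace = simulate()
t_final, p_final = trace[-1]
print(f"t = {t_final:.1f} h, copy number = {p_final}")
```

Note that this sketch only re-reads the rate at event times; an exact treatment would truncate the waiting time at the next light-dark boundary so that the rate changes precisely every 12 hours.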
Recent findings suggest that several noisy oscillatory cells are synchronised by a messenger sub-
stance (Gonze, Bernard, Waltermann, Kramer & Herzel 2005). As an example, cells in the
suprachiasmatic nucleus of the hypothalamus, which is assumed to be
the circadian pacemaker in mammals, exhibit oscillations with free-running periods if examined
in isolation. But the suprachiasmatic nucleus as a whole exhibits regular oscillations with a stable
period close to 24 hours. In a different study, a model of a cell population in the hypothalamus was
developed and it was shown that this population can be synchronised by introducing a global vari-
able representing a neurotransmitter which influences directly the transcription rate of the clock
gene (Gonze et al. 2005). However, the details of the real synchronisation of oscillating cells in
mammals are still unknown. It is not known which messenger substance actually enforces the
synchronisation, nor how this substance interacts with the molecular clocks in the individual
cells. Our results suggest that there must be some form of global synchronisation
to ensure a stable circadian rhythm on a tissue level since the simple averaging of single oscillators
does not improve the stability of the circadian rhythm.
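The effect of such a global coupling can be illustrated with a toy mean-field model of phase oscillators (a Kuramoto-type sketch with invented values; this is not the model of Gonze et al. 2005): without coupling the population stays incoherent, while even a modest global coupling pulls the phases together.

```python
import math, random

random.seed(2)

def sync_level(coupling, n=100, t_end=48.0, dt=0.01):
    """Euler-integrate n mean-field coupled phase oscillators and return the
    order parameter R (0 = incoherent, 1 = perfectly synchronised)."""
    # natural frequencies around 2*pi/24 rad/h (a 24 h period), ~5% spread
    freqs = [2 * math.pi / 24 * (1 + 0.05 * random.gauss(0, 1))
             for _ in range(n)]
    phases = [random.uniform(0, 2 * math.pi) for _ in range(n)]
    for _ in range(int(t_end / dt)):
        c = sum(math.cos(p) for p in phases) / n   # mean-field components
        s = sum(math.sin(p) for p in phases) / n
        phases = [p + dt * (w + coupling * (s * math.cos(p) - c * math.sin(p)))
                  for p, w in zip(phases, freqs)]
    c = sum(math.cos(p) for p in phases) / n
    s = sum(math.sin(p) for p in phases) / n
    return math.hypot(c, s)

print("uncoupled R:", round(sync_level(0.0), 2))
print("coupled   R:", round(sync_level(0.5), 2))
```

The update rule uses the identity (1/n)·Σ sin(θj − θi) = S·cos(θi) − C·sin(θi), so each step costs O(n) rather than O(n²).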
[Figure 5.12, nine panels in three rows, all from stochastic simulations with Ω = 50: in each row, a time course of the number of molecules (mRNA MP, nuclear protein PN and all protein molecules) against time in hours, a plot of the number of mRNA molecules against the number of PN molecules, and the sample autocorrelation function against the lag.]
Figure 5.12: Synchronisation of several cells. The first row represents an experiment in which the
molecule numbers were simply averaged over 100 runs. The second row shows the results of the second
experiment, in which the initial numbers of all molecular species were set to a value within the limit cycle
and the results were again averaged over 100 runs. The last row gives a simulation of a single cell at the same
system size (Ω = 50). The first plot in each row gives the time evolution of mRNA, nuclear protein (PN) and
the whole protein population. The second plot shows mRNA versus nuclear protein abundance, and the
third the time course of the autocorrelation function.
[Figure 5.13, two panels: (a) the delay-based model (Gonze et al. 2002) and (b) the hysteresis-based model (Barkai and Leibler 2000), each plotting the half-life of the autocorrelation against the system size Ω.]
Figure 5.13: Robustness of the oscillations in both models measured by half-life of autocorrelation.
The half-life of the autocorrelation is plotted against Ω, the size of the system.
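The robustness measure used here, the half-life of the autocorrelation, can be estimated from a simulated time series roughly as follows. This is a generic sketch (the test signal and its decay time are invented), not the exact procedure used to produce figure 5.13:

```python
import numpy as np

def autocorrelation(x):
    """Sample autocorrelation of a 1-D series, normalised so acf[0] == 1."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    acf = np.correlate(x, x, mode="full")[x.size - 1:]
    return acf / acf[0]

def half_life(acf, dt=1.0):
    """Lag at which the envelope of the autocorrelation (traced by the
    local maxima of |acf|) first drops below one half."""
    a = np.abs(acf)
    for i in range(1, a.size - 1):
        if a[i] >= a[i - 1] and a[i] >= a[i + 1] and a[i] < 0.5:
            return i * dt
    return None

# Damped 24 h oscillation with a 40 h decay time (illustrative values):
dt = 0.5
t = np.arange(0, 400, dt)
x = np.exp(-t / 40) * np.cos(2 * np.pi * t / 24)
print(half_life(autocorrelation(x), dt=dt))
```

Tracking the local maxima of |acf| rather than the first crossing of 0.5 is important: for an oscillatory signal the autocorrelation itself dips below one half within a quarter period, while the envelope decays on the much slower timescale one actually wants to measure.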
5.4 Discussion
The primary intention of these experiments was to show a practical application for the extended
Petri Net Kernel and to give a comprehensive example that underlines the need for stochastic mod-
els in biology. But our objective was also to present a more detailed comparison of two stochastic
models of circadian clocks and to give new insights into the architecture of the true clock.
The first aim has been achieved. The Petri Net Kernel in its extended version has proven to be use-
ful in these experiments. The Petri Net representation is a visualisation of the reaction steps that is
easy to understand and it can be simulated efficiently using the Systems Biology Workbench. One
of the biggest advantages of the Kernel is that it does not require any knowledge of programming
languages to create a model. The user is only required to use a graphical editor with an intuitive in-
terface and to create a network of nodes and arcs. Furthermore, the experimental results published
by Gonze, Halloy & Goldbeter (2002) and Vilar et al. (2002) have been successfully recreated.
When it comes to the second objective, the comparison of two clock architectures, the results are
more difficult to evaluate. We were able to recreate simulation results for the delay-based model
that were published by Gonze, Halloy & Goldbeter (2002). Furthermore, we extended this approach
of simulating a stochastic model at varying system sizes to the hysteresis-based model.
Although both models have already been compared by Barkai & Leibler (2000), the size of the
system was not varied in their study. We also simulated the effects of gene duplication and mutation
in both models and investigated the synchronisation of several noisy oscillators. In addition, our
simulations support the results from Barkai & Leibler (2000), that the delay-based model is sus-
ceptible to changes in the protein-DNA binding reactions.
In general, the hysteresis-based model seems to be less susceptible to changes in its rate constants
and is also hardly affected by duplication and mutation of its key gene. The delay-based model is
severely affected by both modifications. In addition, if the size of the system is small and degrada-
tion rate of the repressor protein is low, the hysteresis-based model is enhanced by the fluctuations
on a molecular level. Changes in transcription and translation rates may arise in a real cell and
gene duplication is a common event in the evolution of simple organisms. The robustness in the
presence of noise is also an important factor since it was shown in biological experiments that the
circadian clock in Neurospora is able to work reliably if the numbers of key proteins are in the
order of 20 molecules (Merrow, Garceau & Dunlap 1997). According to Barkai & Leibler (2000),
the ability to resist such uncertainties was probably one of the "decisive factors in the evolution
of circadian clocks" and should be "reflected in the underlying oscillation mechanism". From this
perspective, the hysteresis-based model seems to be more sound.
On the other hand, we were able to repeat experiments that reveal that the delay-based model
(Gonze, Halloy & Goldbeter 2002) approaches the oscillatory behaviour of its deterministic for-
mulation if the size of the system is increased. The oscillations are more robust and exhibit a stable
period of about 24 hours. The hysteresis-based model reveals a somewhat contradictory behaviour.
We chose a parameter setting with high degradation rates of the repressor protein and Ω = 1. Sto-
chastic and deterministic simulation revealed robust oscillations for these parameters. But when
we increased Ω in the same way as we did for the delay-based model, the hysteresis-based model
did not oscillate anymore.
To conclude, we cannot draw any final conclusions about the validity of each model due to limited
time, the generality of both architectures and the fact that the true rate constants of many reactions are
unknown. The delay-based model does not capture all details of the circadian clock in Neurospora
or Drosophila. The hysteresis-based model is not specific to any organism but contains components
that were found in several genetic oscillators. It seems to us that the hysteresis-based model is more
sound since it is less susceptible to changes in its environment and mutations. Its characteristics,
positive feedback loops and two active proteins, could serve as starting points for the construction
of better models.
Chapter 6
Conclusions
6.1 Concluding remarks and Observations
The outcome of this dissertation is twofold. First, a software for the modelling and simulation of
biological processes with Stochastic Petri Nets was created. Second, this software was used to
model genetic circuits that are of current scientific interest.
The implementation of the experimental framework took about one month. A new version
of the Petri Net Kernel, called PNK 2e, was developed. This version can import Petri Nets from
SBML. Petri Nets can also be created using a graphical editor and their behaviour can be simu-
lated in the Systems Biology Workbench, either stochastically or deterministically. The SPN can
be written to either SBML or CMDL, two description languages for biological models. The aim
was to keep the usage of the software as simple as possible. No programming experience is re-
quired to create a model and to simulate it. PNK 2e was presented during the poster session at
the BioSysBio conference 2005 in Edinburgh. Since Stochastic Petri Nets provide an intuitive
representation of stochastic models that are commonly used in the Systems Biology community, the
tool attracted some interest and we received encouragement for our work. Furthermore, PNK 2e
has been announced on the webpage and mailing list of the SBML project. It is available on the
internet1, together with a manual and a step-by-step user guide.
We hoped that the use of freely available software would save us some time. This expectation was
met. On the other hand, we found out that it can be difficult to be dependent on the work and good
will of others. During the first part of this project, some bugs were found in the Systems Biology
Workbench. It was difficult to correct the errors without a deep knowledge of the program.
1 http://www.inf.fu-berlin.de/~trieglaf/PNK2e
Therefore we contacted the developers of the workbench and asked for help. Fortunately, they
were very helpful and corrected the problem within days. But this is certainly not always the case.
In summary, the software aroused interest during its first presentation because Markov processes
and graphical models are approaches biologists are very familiar with. The need for
stochastic models is widely accepted but software tools that are really user-friendly and offer all
the functionality needed by biologists are still rare.
When it comes to the experimental part, conclusions are more difficult to draw. We were able to
recreate the simulation results of others and PNK 2e has proven its usefulness during these ex-
periments. In addition, we conducted a comparison of two competing architectures for circadian
clocks. A comparison at this level of detail has not been done before. We could also present some
results about hypothetical models of the synchronisation of different cells.
We compared a model based on delay induced by phosphorylation of the clock protein and a model
based on hysteresis or lag caused by slow degradation of a repressor protein. The delay-based
model is very sensitive to slow rate constants in the binding reactions of DNA and protein and to
duplication of the clock protein. Its oscillations become very noisy if few molecules are involved.
On the other hand, the hysteresis-based model seems to be enhanced by noise. For some parameter
settings, this model oscillates only in the stochastic simulation. The deterministic solution, which
does not take noise into account, arrives in a steady state. On the other hand, if we start from
a setting of parameters that leads to oscillations in a stochastic and deterministic simulation, and
further increase the size of the system, the genetic circuit does not exhibit any oscillations. This
behaviour contradicts our expectations but might be due to the lack of detail in the model.
Both models capture only core features of real circadian clocks and we can therefore only draw
general conclusions about possible architectures. It is known that evolution favours designs that
are robust to noise and work well under a variety of external influences. In our experiments, it was
shown that the hysteresis-based model is less susceptible to changes in the rate constants. From
this perspective, one might prefer this model. It seems also to be enhanced by molecular fluctua-
tions which are known to be an important factor in the cell (Arkin et al. 1998). On the other hand,
our experiments revealed that the hysteresis-based model does not oscillate if many molecules are
contained in the system. This fact contradicts common expectations since it should function better
with increasing size of the system.
In contrast to this, the delay-based model oscillates independently of the system size. But this
model seems to be heavily affected by changes in its rate constants, especially in changes of the
binding rates between protein and DNA. We are sure that further advancement, both in compu-
tational but also experimental sciences, is necessary before we can draw final conclusions about
the architecture of biomolecular clocks. Nonetheless, both scientific fields are developing at a fast
pace and we hope these advances will be made very soon.
6.2 Unsolved Problems
We were able to obtain some interesting results. But it is certainly difficult to deliver a coherent piece
of work during three months. The software, PNK 2e, has scope for further improvement. As an
example, it would be very convenient if the editor could assign several rate constants to a stochas-
tic transition. Each rate could be assigned to a different environment such as size of the system,
temperature etc. The user could choose a set of rates for an experiment and compare simulation
results with different settings easily. The editor itself could also be extended. It might be useful to
have the option to merge different nets or to use hierarchical nets e.g. nets that contain subnets in
a transition.
Concerning the experiments on stochastic models of genetic oscillators, there is certainly a lot of
work to be done. First of all, the search for rate constants that truly reflect the real velocity of
the reactions is still continuing. Furthermore, many details of real circadian clocks are still to be
uncovered by experimental means. We also need new formal methods that are able to analyse
the complex behaviour of nonlinear systems. There is also a lack of methods that can capture
the reliability of the oscillations in an adequate way. Mere visual inspection of the oscillations is
not sufficient and the autocorrelation is often misleading since it is always decreasing for noisy
oscillations.
6.3 Suggestions for Future Work
The advantage of Petri Nets compared to other modelling formalisms used in Biology is that
their theory has been researched for decades. There exist algorithms that can be used not only to
examine their dynamic behaviour but also to search for structural properties and to derive steady
state information by algebraic means. However, in this dissertation, we focussed on results from
stochastic and deterministic simulations. This is because, for the models examined here, structural
analysis did not yield very interesting results and steady state information could not be obtained.
Yet here lies the true advantage of Petri Nets.
The full use of this potential will emerge once the models become complex enough that a topological
analysis gives more interesting results. So far, it can at least be used to check whether the system
fulfils the assumptions made, such as invariants on the number of enzyme molecules. An algebraic
computation of the steady state distribution requires an upper bound on the number of states in the
Markov process. This can be enforced by simply limiting the state space. But this threshold has
to be chosen carefully. The efficient derivation of the steady state distribution is also an interesting
topic and there is certainly scope for future research.
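Checking a place invariant of this kind is mechanical: a P-invariant is a non-negative weight vector y with yᵀC = 0, where C is the incidence matrix of the net. A sketch on a toy enzymatic net (E + S ⇌ ES → E + P; this is not one of the models in this dissertation):

```python
import itertools

# Incidence matrix C of a toy enzymatic net with places (E, S, ES, P) and
# transitions (bind: E+S->ES, unbind: ES->E+S, produce: ES->E+P);
# C[p][t] is the net number of tokens place p gains when transition t fires.
C = [
    [-1,  1,  1],   # E
    [-1,  1,  0],   # S
    [ 1, -1, -1],   # ES
    [ 0,  0,  1],   # P
]

def is_p_invariant(C, y):
    """y is a P-invariant iff y^T C = 0: firing any transition leaves the
    weighted token sum unchanged."""
    return all(sum(y[p] * C[p][t] for p in range(len(C))) == 0
               for t in range(len(C[0])))

# Brute-force search over 0/1 weight vectors (fine for toy nets; real tools
# compute the integer null space of the transposed incidence matrix).
invariants = [y for y in itertools.product((0, 1), repeat=4)
              if any(y) and is_p_invariant(C, y)]
# Finds enzyme conservation (E + ES) and substrate conservation (S + ES + P).
print(invariants)
```

In the enzymatic degradation steps of the clock models, the first invariant corresponds exactly to the conserved enzyme totals such as Em + Cm.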
Some attempts have been made in this dissertation to simplify the representation of biological re-
actions with Petri Nets. The problem is that the net becomes very large with increasing complexity
of the reactions. Further attempts could be made to develop new representations that maintain the
advantages of Petri Nets and capture the complexity inherent in biological processes even more
easily.
It might also be interesting to develop new simulation algorithms that make use of the information
that is captured by the net. As it was outlined in chapter 2, some useful information is lost if the
net is translated into SBML e.g. information about dependencies among the reactions that can be
used to perform efficient simulations. Even if the use of other software can save a lot of time and
effort, it is also an advantage to use one's own implementation, which can be adapted more easily. For
instance, if we could stop the simulation at a time of our choice, we could simulate the influence
of daylight by changing the rate constants during the simulation.
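The dependency information mentioned here is exactly what the Gibson-Bruck method exploits: after a reaction fires, only the propensities of reactions that share a changed species need to be recomputed. A sketch of building such a dependency graph from reactant and product sets (toy reactions with unit stoichiometry, invented for illustration):

```python
# Reaction lists for a toy enzymatic net (illustrative only).
reactions = {
    "bind":    {"reactants": {"E", "S"}, "products": {"ES"}},
    "unbind":  {"reactants": {"ES"},     "products": {"E", "S"}},
    "produce": {"reactants": {"ES"},     "products": {"E", "P"}},
    "decay_P": {"reactants": {"P"},      "products": set()},
}

def dependency_graph(reactions):
    """graph[s] = reactions whose propensity must be recomputed after s
    fires. With unit stoichiometry, the symmetric difference of reactants
    and products is exactly the set of species whose counts change
    (catalysts appearing on both sides are correctly excluded)."""
    graph = {}
    for s, rs in reactions.items():
        changed = rs["reactants"] ^ rs["products"]
        graph[s] = {r for r, rr in reactions.items()
                    if rr["reactants"] & changed}
    return graph

g = dependency_graph(reactions)
print(g["decay_P"])   # only the reaction consuming P is affected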
Appendix A
User guide to PNK 2e
This is the manual for PNK 2e, a software developed using the Petri Net Kernel (PNK) version 2.2.
The Petri Net Kernel is a framework for the development of Petri Net tools. It was developed at
the Humboldt University of Berlin, Germany. Its extended version, PNK 2e, was developed by Ole
Schulz-Trieglaff during his M.Sc. dissertation at the University of Edinburgh, UK. PNK 2e fea-
tures Stochastic Petri Nets, a modelling formalism that stems from Computer Science. Stochastic
Petri Nets (SPNs) are closely related to Markov Jump Processes. Their behaviour can be simulated
using the Gillespie Algorithm and its improved versions (Gibson-Bruck, Tau Leap).
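For orientation, the core of Gillespie's direct method fits in a few lines. The following is a generic sketch on a made-up birth-death process, not PNK 2e's actual implementation:

```python
import random

def gillespie_step(state, reactions, rng):
    """One step of Gillespie's direct method.
    reactions: list of (propensity_fn, state_change_fn) pairs."""
    props = [a(state) for a, _ in reactions]
    a0 = sum(props)
    if a0 == 0:
        return None                       # no reaction can fire
    tau = rng.expovariate(a0)             # waiting time ~ Exp(a0)
    r = rng.random() * a0                 # pick a reaction by its share of a0
    acc = 0.0
    for p, (_, apply) in zip(props, reactions):
        acc += p
        if r < acc:
            apply(state)
            break
    return tau

# Birth-death example: production at 5/h, degradation at 0.1 per molecule.
state = {"X": 0}
reactions = [
    (lambda s: 5.0, lambda s: s.__setitem__("X", s["X"] + 1)),
    (lambda s: 0.1 * s["X"], lambda s: s.__setitem__("X", s["X"] - 1)),
]
rng = random.Random(0)
t = 0.0
while t < 500.0:
    t += gillespie_step(state, reactions, rng)
print(state["X"])   # fluctuates around the steady state 5/0.1 = 50
```

The improved versions mentioned above change only how tau and the next reaction are chosen: Gibson-Bruck reuses propensities via a dependency graph and an indexed priority queue, and tau-leaping fires many reactions per step.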
PNK 2e extends the PNK by features for the modelling of biological processes. PNK 2e stands for
"extended Petri Net Kernel version 2". The software is able to create a Petri Net representation of
a model described in SBML (Systems Biology Markup Language). The net is drawn by using
a simple algorithm implemented by Alexander Gruenewald, Humboldt University of Berlin. The
dynamic behaviour of the Petri Net can be simulated using the Systems Biology Workbench. In
order to achieve this, PNK 2e translates the net back into its SBML description and passes this
description automatically to the Workbench. Alternatively, a Petri Net can be created using the
graphical editor of the Kernel.
Licence Agreement
PNK 2e is free software; you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation; version 2 of the license.
PNK 2e is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY. You are
NOT ALLOWED to CHANGE THE ORIGINAL COPYRIGHT NOTICE. See the GNU General
Public License for more details. You should have received a copy of the GNU General Public
License along with PNK 2e; if not see http://www.gnu.org/.
Quick start
This is a brief introduction to PNK 2e. The software requires Java version 1.2.2 or higher. The archive
PNK2e.zip can be downloaded from www.inf.fu-berlin.de/~trieglaf/PNK2e and contains
all necessary files. The Systems Biology Workbench is required to perform the simulation of the
Petri Net and is available at http://sbw.kgi.edu/.
Download and run the software
If the archive PNK2e.zip is extracted, a new directory PNK2e should be created in the current
directory. This directory contains several libraries in .jar format, the file PNK2e.jar which is the
program itself and several subdirectories:
• sampleNets - contains some example nets
• netTypeSpecifications - contains examples for a net’s specification
• toolSpecification - contains some tool specification examples
If anything goes wrong, first check if you have the correct version of Java installed by executing
java -version. Then try to find out if all necessary libraries are contained in the same direc-
tory as the .jar file. PNK 2e needs at least the libraries jaxp.jar, crimson.jar, SBWcore.jar and
SBMLreader.jar. The remaining libraries are needed for the translation of Petri Nets into CMDL
only.
Open, edit and save a Petri Net
The software can be started by double-clicking on the file PNK2e.jar (Windows) or by executing
java -jar PNK2e.jar (Linux and other operating systems). The main menu of PNK 2e should
appear. By clicking on the File menu, the user can open a file and load the net into the Kernel (see
screenshot A.1). The editor is opened automatically and displays the net.
Alternatively, the user can select New in the menu Open of the main menu to create a new Petri Net.
PNK 2e can edit Stochastic Petri Nets, Biological Nets (SPNs with simplifications for biological
reactions) and Generalized Stochastic Nets (SPNs with inhibitor arcs and immediate transitions).
Depending on this choice, the main menu changes to the editor menu. This menu offers the user
the possibility to draw places and transitions by simply choosing the type of node to be drawn and
by clicking into the editor pane. Arcs can be drawn by first clicking on the source node and then
on the target node. PNK 2e also contains a function to automatically arrange a net. This function
is called DoNetLayout and is available in the main menu.
Figure A.1: PNK 2e after loading the SPN representation of a genetic oscillator.
Simulating a net
PNK 2e can simulate a Stochastic Net with or without the developed simplifications for biological
reactions. The simulation is conducted in the form of a "token game": a transition that fires is
coloured for a few milliseconds, and the flow of tokens through the net is visualised.
This simulation is available in the main menu under stochastic simulation.
This simulation is the correct way to simulate the net and gives a good idea of its dynamics.
However, it is not well suited for large nets since it is very slow. Furthermore, no data is collected
during the simulation run. If a more detailed analysis of the simulation is required, the user can choose
the entry ConnectToSBW in the main menu. PNK 2e then opens a window with a list of all services
in the Systems Biology Workbench that are available on this computer. For a description of the
SBW and on how to install new modules and services, have a look at the manual of the SBW
which is available at http://sbw.kgi.edu/. We recommend installing the simulator Dizzy in
addition to the workbench because this software offers several simulation algorithms, stochastic
and deterministic, and works very well with PNK 2e. However, any simulation software that is
compatible with the Systems Biology Workbench and implements the Gillespie algorithm or one
of its improved versions can be used.
Figure A.2: The simulation interface of PNK 2e.
If Dizzy is installed, the list of SBW services should contain an entry Dizzy simulation service. Af-
ter clicking on this entry, the Dizzy simulation window opens (see screenshot A.2). The window
contains a list with all available simulation algorithms. Start and end time of the simulation can
be chosen. In case of a stochastic simulation, the user can also decide to average the result over
several runs. If a deterministic simulation is chosen, the user also has to choose the step size and
the maximum relative and absolute error.
The simulation is started by clicking on the Start button on the left side of the simulation window.
The simulation can be paused by clicking on Pause and resumed by clicking on Resume. At the
end of the simulation, results can be plotted or written to a file.
Appendix B
Delay-based Oscillatory Network
Parameters in the stochastic model
Reaction number | Reaction step | Probability of reaction | Description
1 | G + PN --a1--> GPN | w1 = a1×G×PN/Ω | First protein binds to gene
2 | GPN --d1--> G + PN | w2 = d1×GPN | Dissociation of first protein
3 | GPN + PN --a2--> GPN2 | w3 = a2×GPN×PN/Ω | Second protein binds to gene
4 | GPN2 --d2--> GPN + PN | w4 = d2×GPN2 | Dissociation of second protein
5 | GPN2 + PN --a3--> GPN3 | w5 = a3×GPN2×PN/Ω | Third protein binds to gene
6 | GPN3 --d3--> GPN2 + PN | w6 = d3×GPN3 | Dissociation of third protein
7 | GPN3 + PN --a4--> GPN4 | w7 = a4×GPN3×PN/Ω | Fourth protein binds to gene
8 | GPN4 --d4--> GPN3 + PN | w8 = d4×GPN4 | Dissociation of fourth protein
9 | G --vs--> MP + G | w9 = vs×G | Gene expression
10 | GPN --vs--> MP + GPN | w10 = vs×GPN | Gene expression
11 | GPN2 --vs--> MP + GPN2 | w11 = vs×GPN2 | Gene expression
12 | GPN3 --vs--> MP + GPN3 | w12 = vs×GPN3 | Gene expression
13 | MP + Em --km1--> Cm | w13 = km1×MP×Em/Ω | Degradation of mRNA (1)
14 | Cm --km2--> MP + Em | w14 = km2×Cm | Degradation of mRNA (2)
15 | Cm --km3--> Em | w15 = km3×Cm | Degradation of mRNA (3)
Table B.1: The reaction steps of the stochastic model (part 1)
Reaction number | Reaction step | Probability of reaction | Description
16 | MP --ks--> MP + P0 | w16 = ks×MP | Translation of mRNA
17 | P0 + E1 --k11--> C1 | w17 = k11×P0×E1/Ω | First Phosphorylation (1)
18 | C1 --k12--> P0 + E1 | w18 = k12×C1 | First Phosphorylation (2)
19 | C1 --k13--> P1 + E1 | w19 = k13×C1 | First Phosphorylation (3)
20 | P1 + E2 --k21--> C2 | w20 = k21×P1×E2/Ω | First Dephosphorylation (1)
21 | C2 --k22--> P1 + E2 | w21 = k22×C2 | First Dephosphorylation (2)
22 | C2 --k23--> P0 + E2 | w22 = k23×C2 | First Dephosphorylation (3)
23 | P1 + E3 --k31--> C3 | w23 = k31×P1×E3/Ω | Second Phosphorylation (1)
24 | C3 --k32--> P1 + E3 | w24 = k32×C3 | Second Phosphorylation (2)
25 | C3 --k33--> P2 + E3 | w25 = k33×C3 | Second Phosphorylation (3)
26 | P2 + E4 --k41--> C4 | w26 = k41×P2×E4/Ω | Second Dephosphorylation (1)
27 | C4 --k42--> P2 + E4 | w27 = k42×C4 | Second Dephosphorylation (2)
28 | C4 --k43--> P1 + E4 | w28 = k43×C4 | Second Dephosphorylation (3)
29 | P2 + Ed --kd1--> Cd | w29 = kd1×P2×Ed/Ω | Degradation of P2 (1)
30 | Cd --kd2--> P2 + Ed | w30 = kd2×Cd | Degradation of P2 (2)
31 | Cd --kd3--> Ed | w31 = kd3×Cd | Degradation of P2 (3)
32 | P2 --k1--> PN | w32 = k1×P2 | Diffusion of protein into nucleus
33 | PN --k2--> P2 | w33 = k2×PN | Diffusion of protein into cytosol
Table B.2: The reaction steps of the stochastic model (part 2)
This table shows the detailed reaction steps in the stochastic model together with their transition
probabilities and a short description. The kinetic constants are given in table B.3. Figure B.1 shows
the SPN representation of these reaction steps.
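To make the table concrete, the propensities of the binding and dissociation steps 1-8 can be evaluated from a given state as follows. This is a sketch using the rate constants of table B.3; the state below is invented for illustration:

```python
OMEGA = 50

def binding_propensities(state, omega=OMEGA):
    """Propensities w1..w8 of the protein-DNA binding steps in table B.1,
    with rate constants from table B.3 (a1 = Omega, d1 = 160*Omega, ...)."""
    a = [1 * omega, 10 * omega, 100 * omega, 100 * omega]   # a1..a4
    d = [160 * omega, 100 * omega, 10 * omega, 10 * omega]  # d1..d4
    G = [state["G"], state["GPN"], state["GPN2"],
         state["GPN3"], state["GPN4"]]
    w = []
    for i in range(4):
        w.append(a[i] * G[i] * state["PN"] / omega)  # binding step 2i+1
        w.append(d[i] * G[i + 1])                    # dissociation step 2i+2
    return w

# Invented state: the free gene plus 120 nuclear protein molecules.
state = {"G": 1, "GPN": 0, "GPN2": 0, "GPN3": 0, "GPN4": 0, "PN": 120}
w = binding_propensities(state)
print(w)   # w1 = a1*G*PN/Omega = 50*1*120/50 = 120; all other steps are 0
```

Since only one copy of the gene exists in each of its states, exactly one binding or dissociation step is possible at any time, which is what the zero propensities express.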
Reaction step | Parameter values
1/2 | a1 = Ω mol−1 h−1, d1 = (160×Ω) h−1
3/4 | a2 = (10×Ω) mol−1 h−1, d2 = (100×Ω) h−1
5/6 | a3 = (100×Ω) mol−1 h−1, d3 = (10×Ω) h−1
7/8 | a4 = (100×Ω) mol−1 h−1, d4 = (10×Ω) h−1
9-12 | vs = (0.5×Ω) mol−1 h−1
13-15 | km1 = 165 mol−1 h−1, km2 = 30 h−1, km3 = 3 h−1; Em tot = Em + Cm = (0.1×Ω) mol
16 | ks = 2.0 h−1
17-19 | k11 = 146.6 mol−1 h−1, k12 = 200 h−1, k13 = 20 h−1; E1 tot = E1 + C1 = (0.3×Ω) mol
20-22 | k21 = 82.5 mol−1 h−1, k22 = 150 h−1, k23 = 15 h−1; E2 tot = E2 + C2 = (0.2×Ω) mol
23-25 | k31 = 146.6 mol−1 h−1, k32 = 200 h−1, k33 = 20 h−1; E3 tot = E3 + C3 = (0.3×Ω) mol
26-28 | k41 = 82.5 mol−1 h−1, k42 = 150 h−1, k43 = 15 h−1; E4 tot = E4 + C4 = (0.2×Ω) mol
29-31 | kd1 = 1650 mol−1 h−1, kd2 = 150 h−1, kd3 = 15 h−1; Ed tot = Ed + Cd = (0.1×Ω) mol
32/33 | k1 = 2.0 h−1, k2 = 1.0 h−1
Table B.3: Parameter values used for the stochastic simulations (mol = molecules). The numbered
steps refer to the reaction steps in tables B.1 and B.2.
Figure B.1: SPN representation of the circadian clock in Neurospora. Transitions are numbered
according to the reaction steps in table B.2. Places are numbered according to the chemical species
in the model. The figure gives the simplified Petri Net representation of this model. Double squares
represent reversible reactions. The numbers refer to the reactions in table B.3.
Kinetic equations of the deterministic model
In the model given by the scheme in 5.3, the time evolution of the concentrations of mRNA (MP) and
clock protein (P0, P1, P2 or PN) is governed by the following set of differential equations:

dMP/dt = vs × KI^n / (KI^n + PN^n) − vm × MP / (Km + MP)    (B.1)
dP0/dt = ks × MP − v1 × P0 / (K1 + P0) + v2 × P1 / (K2 + P1)    (B.2)
dP1/dt = v1 × P0 / (K1 + P0) − v2 × P1 / (K2 + P1) − v3 × P1 / (K3 + P1) + v4 × P2 / (K4 + P2)    (B.3)
dP2/dt = v3 × P1 / (K3 + P1) − v4 × P2 / (K4 + P2) − vd × P2 / (Kd + P2) − k1 × P2 + k2 × PN    (B.4)
dPN/dt = k1 × P2 − k2 × PN    (B.5)

The results in 5.4(a) have been obtained by numerical integration of the equations above for the following
parameter values: KI = 2 nM, n = 4, vs = 0.5 nM h−1, vm = 0.3 nM h−1, Km = 0.2 nM, ks = 2.0 h−1,
v1 = 6.0 nM h−1, K1 = 1.5 nM, v2 = 3.0 nM h−1, K2 = 2.0 nM, v3 = 6.0 nM h−1, K3 = 1.5 nM,
v4 = 3.0 nM h−1, K4 = 2.0 nM, vd = 1.5 nM h−1, Kd = 0.1 nM, k1 = 2.0 h−1, k2 = 1.0 h−1.
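For these parameter values, the deterministic system can be integrated numerically, for instance with SciPy. This is a sketch of the setup; the solver, tolerances and initial conditions are our own choices, so the trajectories may differ in detail from the original figures:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Parameters from the list above
KI, n = 2.0, 4
vs, vm, Km, ks = 0.5, 0.3, 0.2, 2.0
v1, K1, v2, K2 = 6.0, 1.5, 3.0, 2.0
v3, K3, v4, K4 = 6.0, 1.5, 3.0, 2.0
vd, Kd, k1, k2 = 1.5, 0.1, 2.0, 1.0

def clock(t, y):
    """Right-hand side of equations (B.1)-(B.5)."""
    MP, P0, P1, P2, PN = y
    dMP = vs * KI**n / (KI**n + PN**n) - vm * MP / (Km + MP)
    dP0 = ks * MP - v1 * P0 / (K1 + P0) + v2 * P1 / (K2 + P1)
    dP1 = (v1 * P0 / (K1 + P0) - v2 * P1 / (K2 + P1)
           - v3 * P1 / (K3 + P1) + v4 * P2 / (K4 + P2))
    dP2 = (v3 * P1 / (K3 + P1) - v4 * P2 / (K4 + P2)
           - vd * P2 / (Kd + P2) - k1 * P2 + k2 * PN)
    dPN = k1 * P2 - k2 * PN
    return [dMP, dP0, dP1, dP2, dPN]

sol = solve_ivp(clock, (0, 240), [0.5, 0.5, 0.5, 0.5, 0.5],
                dense_output=True, rtol=1e-8, atol=1e-10)
MP = sol.sol(np.linspace(120, 240, 2400))[0]   # sample after the transients
print("MP range: %.2f to %.2f nM" % (MP.min(), MP.max()))
```

Sampling only after t = 120 h discards the transient, so the printed range reflects the limit-cycle oscillation of the mRNA concentration.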
Appendix C
Genetic Oscillator based on Hysteresis
Reaction number | Reaction step | Probability of reaction | Description
1 | A + DA --γA--> D′A | w1 = γA×A×DA/Ω | Protein A activates gene DA
2 | D′A --θA--> A + DA | w2 = θA×D′A | Protein A dissociates from D′A
3 | DA --αA--> MA + DA | w3 = αA×DA | DA is transcribed
4 | D′A --α′A--> MA + D′A | w4 = α′A×D′A | D′A is transcribed
5 | MA --δMA--> ∅ | w5 = δMA×MA | Degradation of MA
6 | MA --βA--> A + MA | w6 = βA×MA | Translation of MA
7 | A + DR --γR--> D′R | w7 = γR×DR×A/Ω | Protein A activates DR
8 | D′R --θR--> DR + A | w8 = θR×D′R | Protein A dissociates from D′R
9 | DR --αR--> MR + DR | w9 = αR×DR | DR is transcribed
10 | D′R --α′R--> MR + D′R | w10 = α′R×D′R | D′R is transcribed
11 | MR --δMR--> ∅ | w11 = δMR×MR | Degradation of MR
12 | MR --βR--> R | w12 = βR×MR | Translation of MR
13 | A --δA--> ∅ | w13 = δA×A | Degradation of A
14 | R --δR--> ∅ | w14 = δR×R | Degradation of R
15 | A + R --γC--> C | w15 = γC×A×R/Ω | Synthesis of C
16 | C --δA--> R | w16 = δA×C | Degradation of A (2)
Table C.1: Reaction steps in the stochastic formulation of the circadian clock model by Vilar et al. (2002)
together with their probabilities.
Table C.1 gives the reaction steps in the stochastic model of the genetic oscillator by (Vilar et al. 2002).
These steps have been obtained by decomposing the deterministic model governed by a set of differential
equations into its elementary steps. Compared to the model by Gonze, Halloy & Goldbeter (2002),
this model is far more general. Enzymatic degradations are not completely decomposed into all elementary
steps as described by the Michaelis-Menten kinetics but are represented by a single reaction only.
Reaction step | Parameter values
1/2 | γA = Ω mol−1 h−1, θA = (50×Ω) h−1
3/4 | αA = (50×Ω) h−1, α′A = (500×Ω) h−1
5/6 | δMA = 10 h−1, βA = 50 h−1
7/8 | γR = Ω mol−1 h−1, θR = (100×Ω) h−1
9/10 | αR = (0.01×Ω) h−1, α′R = (50×Ω) h−1
11/12 | δMR = 0.5 h−1, βR = 5 h−1
13/14 | δA = 1 h−1, δR = 0.2 h−1
15/16 | γC = 2 h−1, δA = 1 h−1
Table C.2: Parameter values for the circadian clock based on hysteresis
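The reaction probabilities of Table C.1 map directly onto Gillespie's direct method (Gillespie 1977): the waiting time to the next reaction is exponentially distributed with parameter w1 + … + w16, and reaction j fires with probability wj divided by that sum. The following pure-Python sketch illustrates this for the table above; the species ordering, variable names and the choice Ω = 1 are assumptions made for this illustration, not part of the PNK 2e implementation.

```python
import random

# Species vector: (DA, D'A, MA, A, DR, D'R, MR, R, C).
# Parameter values follow Table C.2 with Omega = 1 (assumed here).
OMEGA = 1.0
gA, thA, aA, aA2, dMA, bA = 1.0, 50.0, 50.0, 500.0, 10.0, 50.0
gR, thR, aR, aR2, dMR, bR = 1.0, 100.0, 0.01, 50.0, 0.5, 5.0
dA, dR, gC = 1.0, 0.2, 2.0

def propensities(s):
    """Reaction probabilities w1..w16 of Table C.1 for state s."""
    DA, DAp, MA, A, DR, DRp, MR, R, C = s
    return [gA * A * DA / OMEGA, thA * DAp, aA * DA, aA2 * DAp,
            dMA * MA, bA * MA,
            gR * DR * A / OMEGA, thR * DRp, aR * DR, aR2 * DRp,
            dMR * MR, bR * MR,
            dA * A, dR * R, gC * A * R / OMEGA, dA * C]

# State-change vectors, one per reaction, in the same order.
CHANGES = [
    (-1, 1, 0, -1, 0, 0, 0, 0, 0), (1, -1, 0, 1, 0, 0, 0, 0, 0),
    (0, 0, 1, 0, 0, 0, 0, 0, 0),   (0, 0, 1, 0, 0, 0, 0, 0, 0),
    (0, 0, -1, 0, 0, 0, 0, 0, 0),  (0, 0, 0, 1, 0, 0, 0, 0, 0),
    (0, 0, 0, -1, -1, 1, 0, 0, 0), (0, 0, 0, 1, 1, -1, 0, 0, 0),
    (0, 0, 0, 0, 0, 0, 1, 0, 0),   (0, 0, 0, 0, 0, 0, 1, 0, 0),
    (0, 0, 0, 0, 0, 0, -1, 0, 0),  (0, 0, 0, 0, 0, 0, 0, 1, 0),
    (0, 0, 0, -1, 0, 0, 0, 0, 0),  (0, 0, 0, 0, 0, 0, 0, -1, 0),
    (0, 0, 0, -1, 0, 0, 0, -1, 1), (0, 0, 0, 0, 0, 0, 0, 1, -1),
]

def gillespie(state, t_end, seed=0):
    """Gillespie's direct method; returns the trajectory [(t, state), ...]."""
    rng = random.Random(seed)
    t, traj = 0.0, [(0.0, state)]
    while t < t_end:
        w = propensities(state)
        w0 = sum(w)
        if w0 == 0.0:                       # no reaction can fire any more
            break
        t += rng.expovariate(w0)            # exponential waiting time
        r, acc = rng.random() * w0, 0.0     # pick reaction j with prob w_j / w0
        for j, wj in enumerate(w):
            acc += wj
            if r < acc:
                break
        state = tuple(x + dx for x, dx in zip(state, CHANGES[j]))
        traj.append((t, state))
    return traj

# One copy of each gene, no mRNA or protein at t = 0; simulate 0.2 h.
traj = gillespie((1, 0, 0, 0, 1, 0, 0, 0, 0), t_end=0.2)
```

Note that the reversible binding reactions 1/2 and 7/8 conserve the total copy number of each gene, so DA + D′A and DR + D′R remain constant along any trajectory.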
Kinetic equations of the deterministic model
The deterministic formulation of the genetic oscillator is given by the following equations:
dDA/dt  = θA D′A − γA DA A                                          (C.1)
dDR/dt  = θR D′R − γR DR A                                          (C.2)
dD′A/dt = γA DA A − θA D′A                                          (C.3)
dD′R/dt = γR DR A − θR D′R                                          (C.4)
dMA/dt  = α′A D′A + αA DA − δMA MA                                  (C.5)
dA/dt   = βA MA + θA D′A + θR D′R − A (γA DA + γR DR + γC R + δA)   (C.6)
dMR/dt  = α′R D′R + αR DR − δMR MR                                  (C.7)
dR/dt   = βR MR − γC A R + δA C − δR R                              (C.8)
dC/dt   = γC A R − δA C                                             (C.9)
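For comparison with the stochastic trajectories, the deterministic system (C.1)-(C.9) can be integrated with any standard ODE solver. The sketch below uses a fixed-step classical Runge-Kutta (RK4) scheme in pure Python; the initial condition, step size, variable names and the choice Ω = 1 are assumptions made for this illustration.

```python
# Fixed-step classical Runge-Kutta (RK4) integration of (C.1)-(C.9).
# Parameter values follow Table C.2 with Omega = 1 (assumed here).
gA, thA, aA, aA2, dMA, bA = 1.0, 50.0, 50.0, 500.0, 10.0, 50.0
gR, thR, aR, aR2, dMR, bR = 1.0, 100.0, 0.01, 50.0, 0.5, 5.0
dA, dR, gC = 1.0, 0.2, 2.0

def rhs(y):
    """Right-hand sides of (C.1)-(C.9); y = [DA, DR, D'A, D'R, MA, A, MR, R, C]."""
    DA, DR, DAp, DRp, MA, A, MR, R, C = y
    return [thA * DAp - gA * DA * A,                          # (C.1)
            thR * DRp - gR * DR * A,                          # (C.2)
            gA * DA * A - thA * DAp,                          # (C.3)
            gR * DR * A - thR * DRp,                          # (C.4)
            aA2 * DAp + aA * DA - dMA * MA,                   # (C.5)
            bA * MA + thA * DAp + thR * DRp
                - A * (gA * DA + gR * DR + gC * R + dA),      # (C.6)
            aR2 * DRp + aR * DR - dMR * MR,                   # (C.7)
            bR * MR - gC * A * R + dA * C - dR * R,           # (C.8)
            gC * A * R - dA * C]                              # (C.9)

def rk4(y, t_end, h=5e-4):
    """Integrate y' = rhs(y) from t = 0 to t_end with step size h."""
    t = 0.0
    while t < t_end:
        k1 = rhs(y)
        k2 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
        k3 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
        k4 = rhs([yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + (h / 6.0) * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

# One copy of each gene, everything else zero; integrate half an hour.
y = rk4([1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], t_end=0.5)
```

Since equations (C.1)/(C.3) and (C.2)/(C.4) sum to zero, the totals DA + D′A and DR + D′R are conserved, which provides a simple sanity check on any numerical solution.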
SPN representation of the reaction steps
Figure C.1: SPN representation of the hysteresis-based model. This net is the graphical representation of
the individual reaction steps. The values of the rate constants are given in Table C.2.
Appendix D
Glossary of biological terms
This appendix provides an overview of the most important biological terms used in this dissertation. It is
aimed at readers who are not familiar with the foundations of molecular biology. The explanations are adapted
from Alberts, Johnson, Lewis, Raff, Roberts & Walter (2002).
DNA (deoxyribonucleic acid) Long chain of acidic molecules formed from covalently linked deoxyri-
bonucleotide units. It serves as the store of hereditary information within a cell and the carrier of
this information from generation to generation.
gene Region of DNA that controls a hereditary characteristic, usually corresponding to a single protein or
RNA. This definition includes the entire functional unit, encompassing coding DNA sequences and
noncoding regulatory DNA sequences (promoter sequences).
messenger RNA (mRNA) RNA molecule that specifies the amino acid sequence of a protein. Produced by the
enzyme RNA polymerase as a complementary copy of DNA. It is translated into protein in a process
catalyzed by ribosomes.
protein The major macromolecular constituent of cells. A linear polymer of amino acids linked together
by peptide bonds in a specific sequence.
transcription Copying of one strand of DNA into a complementary RNA sequence by the enzyme RNA
polymerase.
translation Process by which the sequence of nucleotides in a messenger RNA molecule directs the incor-
poration of amino acids into protein. It occurs on a ribosome.
Bibliography
Alberts, Bruce, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts & Peter Walter (2002), Mole-
cular Biology of the Cell, Garland Publishing.
Arkin, Adam, John Ross & Harley H. McAdams (1998), 'Stochastic kinetic analysis of developmental
pathway bifurcation in phage lambda-infected Escherichia coli cells', Genetics 149(4), 1633–1648.
Barkai, Naama & Stanislas Leibler (2000), ‘Biological rhythms: Circadian clocks limited by noise’, Nature
403, 267–268.
Bloom, James D. (2003), PIPE - a platform independent Petri net editor, Master's thesis, Imperial College
London, http://freshmeat.net/projects/petri-net.
Chiola, G., G. Franceschinis, R. Gaeta & M. Ribaudo (1995), 'GreatSPN 1.7 - a graphical editor and analyzer
for timed and stochastic Petri nets', Performance Evaluation 24(1-2), 47–68.
Cox, Michael & David L. Nelson (2004), Lehninger - Principles of Biochemistry, 4 edn, Palgrave Macmil-
lan.
Deavours, D. D., W. D. Obal II, M. A. Qureshi, W. H. Sanders & A. P. A. van Moorsel (1995), 'UltraSAN
version 3 overview', Proceedings of the AIAA Computing in Aerospace 10 Conference pp. 327–338.
Dunlap, Jay C. (1999), 'Molecular bases for circadian clocks', Cell 96(2), 271–290.
Finney, Andrew & Michael Hucka (2003), Systems Biology Markup Language (SBML) level 2: Structures and
facilities for model definitions, Technical report, Systems Biology Workbench Development Group,
California Institute of Technology.
Forger, Daniel B. & Charles S. Peskin (2005), 'Stochastic simulation of the mammalian circadian clock', PNAS
102(2), 321–324.
Gibson, Michael A. & Jehoshua Bruck (2000), ‘Efficient exact stochastic simulation of chemical systems
with many species and many channels’, J. Phys. Chem. A 104, 1876–1889.
Gillespie, Daniel T. (1976), ‘A general method for numerically simulating the stochastic time evolution of
coupled chemical reactions’, J. Comput. Phys. 22, 403–434.
Gillespie, Daniel T. (1977), ‘Exact stochastic simulation of coupled chemical reactions’, J. Phys. Chem.
81, 2340–2361.
Gonze, D., J. Halloy & P. Gaspard (2002), ‘Biochemical clocks and molecular noise: Theoretical study of
robustness factors’, Journal of Chemical Physics 116(24), 10997–11010.
Gonze, Didier, Jose Halloy & Albert Goldbeter (2002), ‘Robustness of circadian rhythms with respect to
molecular noise’, PNAS 99(2), 673–678.
http://www.pnas.org/cgi/content/abstract/99/2/673
Gonze, Didier, Samuel Bernard, Christian Waltermann, Achim Kramer & Hanspeter Herzel (2005), ‘Spon-
taneous synchronization of coupled circadian oscillators’, Biophys. J. 89(1), 120–129.
Goss, Peter J.E. & Jean Peccoud (1998), ‘Quantitative modeling of stochastic systems in molecular biology
by using stochastic petri nets’, PNAS 95(12), 6750–6755.
Hucka, M., A. Finney, H.M. Sauro, H. Bolouri, J. Doyle & H. Kitano (2002), 'The ERATO Systems Biology
Workbench: Enabling interaction and exchange between software tools for computational biology',
Proceedings of the Pacific Symposium on Biocomputing.
Kindler, E. & M. Weber (2001), 'The Petri Net Kernel - an infrastructure for building Petri net tools', Software
Tools for Technology Transfer 3(4), 486–497.
Kurtz, Thomas G. (1971), ‘The relationship between stochastic and deterministic models for chemical reac-
tions’, The Journal of Chemical Physics 57(7), 2976–2978.
Leloup, Jean-Christophe, Didier Gonze & Albert Goldbeter (1999), 'Limit cycle models for circadian
rhythms based on transcriptional regulation in Drosophila and Neurospora', J Biol Rhythms 14(6), 433–
448.
Marsan, M. Ajmone, G. Balbo, G. Conte, S. Donatelli & G. Franceschinis (1995), Modelling with General-
ized Stochastic Petri Nets, Wiley Series in Parallel Computing.
Matsuno, H., A. Doi, M. Nagasaki & S. Miyano (2000), 'Hybrid Petri net representation of gene regulatory
networks', Proc. Pacific Symposium on Biocomputing pp. 338–349.
Matsuno, H., Y. Tanaka, H. Aoshima, A. Doi, M. Matsui & S. Miyano (2003), 'Biopathways representation
and simulation on hybrid functional Petri net', In Silico Biology 3(3), 389–404.
Merrow, Martha W., Norman Y. Garceau & Jay C. Dunlap (1997), Dissection of a circadian oscillation into
discrete domains, Vol. 94.
Petri, Carl Adam (1962), Kommunikation mit Automaten (Communicating with automata), PhD thesis,
University of Bonn, Institut für Instrumentelle Mathematik.
Ptashne, M. (1992), A genetic switch: Phage λ and Higher Organisms, Blackwell Science.
Ramsey, S., D. Orrell & H. Bolouri (2005), 'Dizzy: stochastic simulation of large-scale genetic regulatory
networks', J. Bioinf. Comp. Biol. 3(2), 415–436.
Reddy, Venkatramana N., Michael N. Liebman & Michael L. Mavrovouniotis (1993), 'Petri net representations
in metabolic pathways', Proc Int Conf Intell Syst Mol Biol 1, 328–336.
Ross, Sheldon M. (1996), Stochastic Processes, 2 edn, Wiley Series in Probability and Mathematical Statistics.
Shaw, O., A. Koelmans, J. Steggles & A. Wipat (2004), 'Applying Petri nets to systems biology using
XML technologies', Proceedings of the Workshop on the Definition, Implementation and Application of
a Standard Interchange Format for Petri Nets (26), 11–25.
Srivastava, R., M.S. Peterson & W.E. Bentley (2001), 'Stochastic kinetic analysis of the Escherichia coli
stress circuit using σ32-targeted antisense', Biotechnol. Bioeng. 75(1), 120–129.
Vilar, Jose M.G., Hao Yuan Kueh, Naama Barkai & Stanislas Leibler (2002), ‘Mechanisms of noise-
resistance in genetic oscillators’, Proc Natl Acad Sci USA 99(9), 5988–5992.
Weber, Michael & Ekkart Kindler (2002), 'The Petri Net Markup Language', Petri Net Technology for
Communication Based Systems, Advances in Petri Nets.