Modelling the Randomness in
Biological Systems
Ole Schulz-Trieglaff
Master of Science
School of Informatics
University of Edinburgh
2005
Abstract
This dissertation deals with the modelling of biological processes using Stochastic Petri Nets
(SPNs). Petri Nets are a formalism from Computer Science (Petri 1962) that has been used for
many years to model systems such as computer networks. Recently, they have also been
applied to model biological processes such as genetic networks (Goss & Peccoud 1998).
The outcome of this dissertation is twofold. First, a software framework was implemented that
allows creating a SPN model in a graphical editor. This software is called PNK 2e and can be used
to simulate the behaviour of the net using the infrastructure of the Systems Biology Workbench,
a collection of simulation and analysis tools tailored for biological applications. PNK 2e is based
on the Petri Net Kernel (PNK), an Open Source project developed at the Humboldt University in
Berlin, Germany. PNK 2e is also released under an Open Source licence. It can import
Petri Net models from different XML description formats. The graphical representation of
biological models offered by SPNs is very intuitive. Furthermore, a SPN can be simulated using
algorithms that are commonly used in the field of Systems or Theoretical Biology. PNK 2e was the
topic of a scientific poster at the BioSysBio conference 2005 in Edinburgh. It is available online1
and has been announced in a forum dealing with Systems Biology software.
In the second part of this dissertation, PNK 2e was used to simulate genetic oscillators, networks
of genes and proteins that exhibit oscillations with a period close to 24 hours. These systems are
thought to represent the molecular basis of the internal clock of many organisms. Two oscillators of
very different architecture (Gonze, Halloy & Goldbeter 2002, Vilar, Kueh, Barkai & Leibler 2002)
were simulated with different numbers of molecules involved. This dependence on molecule
numbers is of special interest, since it is known that circadian clocks have to work reliably with
only very few molecules. The
obtained results support previous findings (Barkai & Leibler 2000) but also provide new insights
into design features of biomolecular clocks. It was found that one particular architecture is even
driven by fluctuations in the molecular populations. Usually, this noise is considered to be a source
of disturbance, but in this case it is essential for the functioning of the clock. This architecture also
reveals significant robustness in the case of mutations of key genes or changes of rate constants in the
model.
1 www.inf.fu-berlin.de/~trieglaf/PNK2e
Acknowledgements
I am very indebted to my supervisor Prof. Gordon Plotkin for his invaluable advice during my time
in Edinburgh and for reviewing this dissertation.
I would also like to thank Prof. Andrew Millar for his introduction to circadian clocks and many
helpful ideas.
Many thanks to Lucia Castellanos and Malcolm Leiva Gebhard, who reviewed parts of this
dissertation and gave much advice and much appreciated support.
Thanks to Stephen Ramsey (Institute for Systems Biology, Seattle), Frank Bergmann (Keck Gradu-
ate School, Claremont) and Michael Weber (German Aerospace Center) for making their software
available to be used in this project.
Funding was provided by the Students Awards Agency for Scotland and the "Landesregierung des
Saarlandes" (government of the German state of Saarland).
Declaration
I declare that this thesis was composed by myself, that the work contained herein is my own except
where explicitly stated otherwise in the text, and that this work has not been submitted for any other
degree or professional qualification except as specified.
(Ole Schulz-Trieglaff )
Table of Contents
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Scope of this Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Organisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Background 6
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Petri Net theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.1 Untimed Petri Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2.2 Stochastic Petri Nets and Markov Processes . . . . . . . . . . . . . . . . . 8
2.2.3 Representing Biological Processes with Petri Nets . . . . . . . . . . . . . 10
2.3 Kinetics of chemical reactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.1 Deterministic Kinetics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.2 Stochastic Kinetics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4 Simulation Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.1 The Gillespie Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.2 The Gibson-Bruck Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 19
3 Background on Stochastic Models and Tools in Biology 22
3.1 Previous Work on Stochastic Models . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1.1 A Stochastic Model of Pathway Bifurcation in Phage λ . . . . . . . . . . . 23
3.1.2 Stochastic analysis of Biological models with Petri Nets . . . . . . . . . . 24
3.1.3 Analysis of the E.coli Stress Circuit with Stochastic Nets . . . . . . . . . . 26
3.2 Review of other Petri Net Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4 Technical Methodology 32
4.1 The Petri Net Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.1.1 Design and Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1.2 The Extended Kernel (PNK 2e) . . . . . . . . . . . . . . . . . . . . . . . 38
4.2 The Systems Biology Workbench . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.3 Stochastic Simulations with Dizzy . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5 Experiments 50
5.1 The Volterra-Lotka Reactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.1.1 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.2 Stochastic models of circadian rhythms . . . . . . . . . . . . . . . . . . . . . . . 54
5.2.1 The delay-based Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.2.2 The hysteresis-based Model . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3 Synchronising several oscillating cells . . . . . . . . . . . . . . . . . . . . . . . . 69
5.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6 Conclusions 74
6.1 Concluding remarks and Observations . . . . . . . . . . . . . . . . . . . . . . . . 74
6.2 Unsolved Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.3 Suggestions for Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
A User guide to PNK 2e 78
B Delay-based Oscillatory Network 83
C Genetic Oscillator based on Hysteresis 88
D Glossary of biological terms 91
Bibliography 92
List of Figures
2.1 Firing of transitions in a Petri Net . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 A Petri Net Model of Gene Expression . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Example of Michaelis-Menten kinetics . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1 Simplified model of the E.Coli stress circuit . . . . . . . . . . . . . . . . . . . . . 26
4.1 Overview of the extended Petri Net Kernel . . . . . . . . . . . . . . . . . . . . . . 39
4.2 Broker architecture of the SBW . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.1 Petri Net representation of the Lotka-Volterra Reactions . . . . . . . . . . . . . . . 52
5.2 Stochastic simulation of the Lotka-Volterra Reactions . . . . . . . . . . . . . . . . 53
5.3 Core model for circadian rhythms based on delay . . . . . . . . . . . . . . . . . . 55
5.4 Delay-based circadian clock: stochastic and deterministic simulation . . . . . . . . 57
5.5 Stochastic simulation with changing numbers of molecules . . . . . . . . . . . . . 58
5.6 The hysteresis-based model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.7 Hysteresis-based circadian clock: stochastic and deterministic simulation . . . . . 62
5.8 Hysteresis-based circadian clock: simulation with low degradation rate of repressor 64
5.9 Stochastic simulation with changing numbers of molecules (hysteresis) . . . . . . 65
5.10 Simulation of the effects of gene duplication . . . . . . . . . . . . . . . . . . . . . 66
5.11 Simulation of both circadian clocks with low rate constants . . . . . . . . . . . . . 68
5.12 Synchronisation of several cells . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.13 Robustness of the oscillations in both models measured by half-life of autocorre-
lation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
A.1 Screenshot of PNK 2e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
A.2 Screenshot of PNK 2e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
B.1 SPN representation of the circadian clock in Neurospora . . . . . . . . . . . . . . 86
C.1 SPN representation of the hysteresis-based model . . . . . . . . . . . . . . . . . . 90
Chapter 1
Introduction
In recent years, the new scientific field of Systems Biology has aroused much interest. Systems
Biology deals with the application of methods from Mathematics, Computer Science and Physics
to improve our understanding of how biological systems function. It tries to understand the cell in
an integrated manner and to obtain an understanding of its single components by examining their
relationships and relating them to a global view of the cell.
An important aspect of Systems Biology is the search for representations of cellular processes that
can be used to compute their future behaviour. This idea is not new; simple models of
biochemical reactions have been studied for a long time. But the availability of high-throughput
experiments and recent advancements in computational methods and computer power have greatly
improved their potential. However, the search continues for modelling techniques that are able
to capture the full complexity of the cell and its components.
The first goal of this M.Sc. thesis was to develop a software tool that facilitates the modelling of a
biological system with Stochastic Petri Nets. But during the preparation of this project, its scope
was further extended. In addition, we provide a comparison of two general models representing
genetic oscillators. These oscillators are assumed to represent the basis of the circadian rhythm
that is observed in many organisms. The experiments give an example of a successful application
of PNK 2e, the software that was written during the first part of this dissertation. In addition, they
provide an insight into possible architectures for more detailed and realistic models of circadian
clocks.
Project Objectives
The tasks of this dissertation can be summarized as follows:
• Development of a software platform for experiments with Stochastic Petri Nets. In contrast
to software that is already available, this platform should be tailored for biological applica-
tions. As an example, it should support data exchange formats that are commonly used in
the scientific community.
• Application of this software to model an exemplary biological system. Evaluate the use-
fulness of Stochastic Petri Nets for this task and validate the software by comparing the
experimental results to findings published in the literature.
• Further experiment to compare two models describing circadian clocks and how their behav-
iour reacts to changes in parameters and structure of the model.
1.1 Motivation
Petri Net software tools
Currently there is a wealth of software environments available that can be used to create
representations of chemical reactions and to compute their future behaviour. In the early stage of
this M.Sc. dissertation, it was planned to develop SPN software completely from scratch. Then
it was discovered that several other tools with similar functionality already exist. However, none
of these tools was both freely available and suited to biological applications. Therefore it
was decided to develop a new Petri Net tool tailored to the modelling of biological processes.
Several Petri Net tools were found on the Internet that were developed during M.Sc. projects at
other universities. It was estimated that the development of a new tool from scratch would take
almost all of the three months available. It would leave no time to model interesting systems and
to conduct detailed experiments. That is why we decided to use the infrastructure from already
existing Open Source projects in this dissertation. There are some software projects that seemed
to be useful for this task. In the beginning, it was planned to extend the software PIPE (Platform
Independent Petri Net Editor) but this turned out to be far more difficult than expected. PIPE
only supports non-stochastic Petri Nets, and extending it with another net type would have been
very laborious. As a consequence, we decided to use the Petri Net Kernel instead and the Systems
Biology Workbench, a collection of analysis tools for Systems Biology, to simulate the behaviour
of the net. Both software tools are published under the Open Source licence and their authors
were very helpful and provided much advice. The Petri Net Kernel (PNK) was developed at the
Humboldt University in Berlin, Germany, and the Systems Biology Workbench is a collaborative
project between several institutes, among them the California Institute of Technology, the Keck
Graduate Institute in Claremont and the Institute for Systems Biology in Seattle.
Circadian Oscillators
In the second part of this M.Sc. project, we validate the PNK 2e software by modelling so-called
circadian oscillators. Many organisms are known to have an internal clock that has significant
influence on their behaviour and lifecycle. Circadian oscillators or oscillatory networks are be-
lieved to be the source of these clocks on a molecular level. They consist of a small set of genes
and proteins that interact with each other. At least one of the contained proteins exhibits some
rhythmic activity; for example, its concentration in the cell oscillates with a period of about 24 hours. This
clock protein is assumed to regulate the activity of other genes and to drive the circadian rhythm
of the organism. Understanding exactly how these circadian rhythms are created is very
important, for instance to stimulate the growth of food plants. Nevertheless, the exact behaviour of
all components in a typical clock is not yet understood and these details are often very difficult to
examine in biological experiments.
Therefore it has been decided to model two competing architectures for a genetic oscillator with
Stochastic Petri Nets and to simulate them under different conditions. As a first step, we tried to
recreate findings published elsewhere (Gonze, Halloy & Goldbeter 2002, Vilar et al. 2002) in order
to validate our approach. But we also expanded the experiments done by others and tried to gain
new insights into possible models for circadian clocks.
A group of experimental biologists at the School of Biosciences at the University of
Edinburgh works on mathematical models of circadian clocks. This group is led by Prof.
Andrew Millar, who provided much helpful advice and gave valuable hints for the experiments in
this dissertation.
1.2 Scope of this Dissertation
As remarked above, there are already many software tools that can be used to model biological
systems. However, (Stochastic) Petri Nets provide a more formal view of the modelling problem.
Their theory is well researched, and efficient algorithms exist not only to simulate their dynamic
behaviour but also to examine structural properties. The software that is presented in this dissertation
makes use of other Open Source projects and relies on common data exchange formats. This is
particularly important since there are many tools that merely repeat work that has already been
done elsewhere. In addition, many of these tools use their own formats to store their data and are
thus not compatible with other software.
When it comes to the experimental part, we focus on stochastic simulations with different numbers
of molecules and examine the influences of changes in the rate constants. This is important for two
reasons. It is known that genetic circuits have to function reliably under a variety of conditions,
even if only very few instances of the key proteins involved are present. Moreover, the reaction
rates can be influenced by changes in temperature or in the nutrition of the organism. The behaviour of
a genetic oscillator should not be influenced by moderate changes in these constants.
1.3 Organisation
This dissertation assumes that the reader does not have a background in chemical kinetics or
Stochastic Petri Nets. Therefore one chapter is dedicated to explaining the most relevant issues.
A basic knowledge of biology is nevertheless assumed, such as the regulation of genes and the
synthesis of proteins. A glossary of the most important biological
terms used in this dissertation is given in Appendix D. We will also provide a brief introduction
to Markov processes, which are closely related to SPNs. The organisation of this dissertation is given
below.
Chapter Two provides an introduction to the theory of Petri Nets. Stochastic Petri Nets are in-
troduced and their relationship to stochastic kinetics is explained. We also give a comparison of the
deterministic and the stochastic assumption in chemical kinetics. The most common algorithms
for the stochastic simulation of chemical reactions are presented. We will discuss their individual
advantages and how they can be used to simulate Stochastic Petri Nets.
Chapter Three provides an overview of stochastic models and tools in Biology. We present three
related efforts on stochastic modelling of biological processes. We aim at providing an overview
of previous work and discuss its relationship to this dissertation. The second part of this chapter
introduces two software tools that are similar to the software developed during this dissertation.
An overview of their functionality is given and a comparison to our software, PNK 2e, is drawn.
Chapter Four describes the Petri Net Kernel in detail. We enumerate the extensions that we made
and explain why they are necessary. Technical difficulties that were encountered are also discussed
in this chapter. We also give details of other software that was used such as the Systems Biology
Workbench and the description language SBML.
Chapter Five describes the experiments that were performed to validate the new version of the
Petri Net Kernel. We will give a brief introduction to circadian clocks. We successfully replicated
the results of other authors. Furthermore, an experimental comparison of two competing models
for circadian oscillators is given that underlines the usefulness of PNK 2e for experiments in Sys-
tems Biology.
Chapter Six contains an evaluation of this dissertation, summarizes the conclusions and gives
indications for future work.
Chapter 2
Background
2.1 Introduction
This chapter lays the theoretical foundations of this project. A short overview of Petri Net theory is
given and some of its properties are introduced. There are many extensions to this basic theoretical
concept. However, only the theory of Stochastic Petri Nets and its relation to Markov Processes
will be discussed, since this is the model used in this work.
In the second section, an introduction to mathematical modelling in Biology is given. The two
main competing approaches, deterministic models based on differential equations, and stochastic
simulations, are compared. The stochastic approach usually assumes that the behaviour of the
system fulfils the Markov property. Therefore an outline of the relationship
between Stochastic Petri Nets and Markovian random processes is also provided.
One of the advantages of Stochastic Petri Nets is that they can be simulated using efficient simu-
lation algorithms. Therefore, two of these algorithms will be introduced in the last section. These
algorithms are implemented in the Systems Biology Workbench and were used in our experiments.
2.2 Petri Net theory
Petri Nets provide a graphical notation for the formal description of the dynamic behaviour of
systems. Although Petri Nets have been used for the qualitative modelling of computer systems
and communication networks since the 1960s (Petri 1962), their use as a paradigm for quantitative
modelling started only about twenty years ago. Untimed Petri Nets (Place-Transition Nets) are
introduced first. Following that, Stochastic Petri Nets and their relation to Markov Processes are
described.
2.2.1 Untimed Petri Nets
A Petri Net (PN) is a directed bipartite graph with two sets of nodes, transitions and places. Tran-
sitions are drawn as bars and places as circles. Places can contain tokens that are drawn as black
dots. The state of a PN is given by the number of tokens in all places and is called a marking. The
initial placing of tokens is called the initial marking and represents the starting state of the net.
Transitions and places are connected by directed arcs. Input places are the places for which the
arcs point from the place to the transition. Output places are the places for which the arcs point
from the transition to the place. An arc can have an inscription, its multiplicity.
Tokens move in the net according to rules given by the transitions. A transition is said to be enabled
if each input place contains at least as many tokens as given by the multiplicity of the connecting
arc. An enabled transition is executed by removing as many tokens from each input place as given
by the inscription of the arc connecting input place and transition and by inserting as many tokens
into each output place as given by the inscription of the arc pointed from the transition to the output
place.
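The enabling and firing rules above can be sketched in a few lines of Python. This is an illustrative encoding (markings and arc multiplicities as plain dictionaries), not part of any of the tools discussed in this dissertation:

```python
def enabled(marking, inputs):
    """A transition is enabled if every input place holds at least
    as many tokens as the multiplicity of the connecting arc."""
    return all(marking.get(p, 0) >= m for p, m in inputs.items())

def fire(marking, inputs, outputs):
    """Execute an enabled transition: remove tokens from each input
    place and insert tokens into each output place, returning the
    new marking."""
    assert enabled(marking, inputs)
    new = dict(marking)
    for p, m in inputs.items():
        new[p] = new.get(p, 0) - m
    for p, m in outputs.items():
        new[p] = new.get(p, 0) + m
    return new

# Example: a transition consuming 2 tokens from p1 and 1 token
# from p2, producing 1 token on p3.
m0 = {"p1": 3, "p2": 1, "p3": 0}
m1 = fire(m0, {"p1": 2, "p2": 1}, {"p3": 1})
```

After firing, the transition is no longer enabled, since p2 is empty.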
[Figure: two Petri Nets with arc multiplicities 3, 2 and 2, shown (a) before and (b) after firing.]
Figure 2.1: Example of the execution of a transition in a Petri Net.
Starting from the initial marking and following the firing rules we can progress through all possible
states of the net. This procedure is called the token game. The set of all possible states of a net,
given a certain initial marking, is called the reachability set with respect to this initial marking.
Different initial markings may give rise to different reachability sets. For this reason, the initial
marking is an important part of the model.
The token game gives rise to the reachability graph. This graph contains all markings encountered
during the token game as nodes and the transitions between these markings as arcs.
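The token game that produces the reachability graph is in essence a breadth-first search over markings. The following sketch (an illustrative encoding, independent of PNK 2e) collects every reachable marking as a node and every firing as an arc:

```python
from collections import deque

def reachability_graph(initial, transitions):
    """Breadth-first 'token game': explore every marking reachable
    from the initial marking. `transitions` is a list of
    (inputs, outputs) pairs, each a dict {place: multiplicity}.
    Markings are stored as sorted tuples so they are hashable."""
    start = tuple(sorted(initial.items()))
    nodes, arcs = {start}, []
    queue = deque([start])
    while queue:
        marking = dict(queue.popleft())
        for i, (ins, outs) in enumerate(transitions):
            if all(marking.get(p, 0) >= m for p, m in ins.items()):
                new = dict(marking)
                for p, m in ins.items():
                    new[p] -= m
                for p, m in outs.items():
                    new[p] = new.get(p, 0) + m
                key = tuple(sorted(new.items()))
                arcs.append((tuple(sorted(marking.items())), i, key))
                if key not in nodes:
                    nodes.add(key)
                    queue.append(key)
    return nodes, arcs

# Two transitions shuttling a single token between places p and q:
nodes, arcs = reachability_graph(
    {"p": 1, "q": 0},
    [({"p": 1}, {"q": 1}), ({"q": 1}, {"p": 1})])
```

For this tiny net the reachability set has exactly two markings, connected by two arcs. For realistic nets the state space can grow very quickly, which is one reason simulation is often preferred over exhaustive exploration.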
2.2.2 Stochastic Petri Nets and Markov Processes
Stochastic Petri Nets (SPNs) are a popular extension to basic Petri Net theory. In a SPN each
transition fires with an exponentially distributed delay. In other words, each transition has an
associated firing rate µ, which is the parameter of a (negative) exponential distribution with the
density function f(x) = µe^(−µx). This distribution has the memoryless property. If X is an
exponentially distributed random variable, then this property states:

P(X > t + s | X > t) = P(X > s),   t > 0, s > 0
Proof:

P(X > t + s | X > t) = P(X > t + s) / P(X > t)
                     = e^(−µ(t+s)) / e^(−µt)
                     = e^(−µs)
                     = P(X > s)
The exponential distribution is often used to model waiting times until some event occurs. In a
SPN, such an event would be the execution of the next transition. In this context, knowing that
t time units have already elapsed without the execution of a transition does not give us any
additional information about when the next execution will occur. In other words, if no change of
state has happened by time t, then the distribution of the remaining sojourn time in the current
state is the same as if no time had passed.
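The memoryless property can also be checked empirically: among sampled exponential delays, those that have already survived past time t are distributed, beyond t, just like fresh delays. A small sampling sketch (the rate, sample size and test values are arbitrary choices):

```python
import random

random.seed(0)
mu = 2.0  # firing rate, i.e. parameter of the exponential distribution
samples = [random.expovariate(mu) for _ in range(200_000)]
t, s = 0.3, 0.5

# Unconditional probability P(X > s)
p_uncond = sum(x > s for x in samples) / len(samples)

# Conditional probability P(X > t + s | X > t): restrict attention
# to the samples that have already survived past time t.
survivors = [x for x in samples if x > t]
p_cond = sum(x > t + s for x in survivors) / len(survivors)

# By the memoryless property the two estimates should agree
# up to sampling noise.
print(p_uncond, p_cond)
```

For µ = 2 and s = 0.5 the exact value of both probabilities is e^(−1) ≈ 0.368.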
If one or several transitions are enabled in some state of the SPN, a delay for each enabled
transition is sampled from its exponential distribution. The transition with the smallest delay is
then executed first.
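This "race" between enabled transitions can be sketched as follows; the transition names and rates are made up for illustration:

```python
import random

def next_firing(enabled_rates, rng=random):
    """Given the firing rates of the currently enabled transitions,
    sample an exponentially distributed delay for each and return
    the transition that fires first, together with its delay."""
    delays = {t: rng.expovariate(rate) for t, rate in enabled_rates.items()}
    winner = min(delays, key=delays.get)
    return winner, delays[winner]

random.seed(1)
transition, delay = next_firing({"degrade": 0.1, "synthesize": 5.0})
# A transition with a much larger rate usually (but not always)
# wins the race.
```

A standard result is that the winner of such a race is transition i with probability µ_i / Σ µ_j, which is the basis of the Gillespie algorithm described in section 2.4.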
In recent years, there has been a lot of interest in the application of Markov processes to Biology;
for instance, in the modelling of the evolution of genome sequences. In fact, a SPN is nothing
other than a representation of a Markov process, and the reachability graph is simply the state transition
diagram of the Markov process. In order to underline this fact we will give a brief definition of a
Markov process and sketch the aforementioned relationship.
2.2.2.1 Markov Processes
A sequence X = (X(t))_{t ∈ N} of random variables X(t) is called a discrete-time stochastic process.
The state space of this process is the set of all possible values that X(t) can assume (Ross 1996). A
Markov process is a stochastic process X(t) which has the Markov or memoryless property: given
the value of X(t) at some time t, future values X(s) of the process for s > t do not depend on
knowledge of the past history X(u) for u < t:

P(X(t_{n+1}) = x_{n+1} | X(t_n) = x_n, ..., X(t_1) = x_1) = P(X(t_{n+1}) = x_{n+1} | X(t_n) = x_n)
This memoryless property means that once we have arrived in a particular state, the future behav-
iour is always the same regardless of how we arrived in the state. Markov processes are popular
since the underlying theory is relatively simple. The process can be visualised by its state-transition
diagram which contains the states of the process together with the connecting transitions.
If we are in a state s ∈ S of a Markov process, the distribution of time until the next change of state
is independent of the time of the previous change of state due to the Markov property. In other
words, the waiting or sojourn time in a state is memoryless. The only probability distribution that
has this attribute is the exponential distribution, and therefore the waiting times until a change of
state in a Markov process follow this distribution. Each transition in a Markov process therefore
has a rate which is the parameter of an exponential distribution.
As mentioned above, a SPN gives rise to a Markov process. If we compute the reachability graph
of the SPN, this graph is isomorphic to the state-transition diagram of the Markov Process. But
this also means that we can analyse the behaviour of this process and by doing this, obtain new
knowledge about the dynamics of the Petri Net model.
Steady State distribution of a Markov Process
An important property of a Markov process is its behaviour over a long period of time. Under
certain conditions, the process will settle to some regular or steady state behaviour. This does not
mean that the process has stopped and does not make any transitions. But it does mean that the
probability distribution of the process being in a certain state does not change anymore.
We denote the probability that the Markov process is in state x_k at time t by π_t(x_k). The steady
state has been reached if this probability no longer depends on the time. Thus we denote the
steady state distribution by π, and π(x_k) is then the probability that the model is in state x_k after
the steady state is reached.
Theorem (Ross 1996): A steady state distribution π(x_k), x_k ∈ S, exists for every Markov process
with the following properties:
• its transition rates are time homogeneous, i.e. they do not depend on the time at which we
observe the process,
• it has a finite number of states,
• and it is irreducible, meaning that all states in S can be reached from all other states by
following the transitions of the process.
The steady state distribution can be calculated by using the so-called global balance equations.
These equations give rise to a system of linear equations that can be solved by appropriate
algorithms.
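For a process with only two states the global balance equations can be solved by hand, which makes a compact illustration (the rates chosen here are arbitrary):

```python
# Two-state Markov process: state 0 -> 1 with rate q01,
# state 1 -> 0 with rate q10.
# Global balance: pi0 * q01 = pi1 * q10 (probability flow out of a
# state equals the flow into it), with normalisation pi0 + pi1 = 1.
# Solving the two equations gives pi0 = q10 / (q01 + q10).
def two_state_steady_state(q01, q10):
    pi0 = q10 / (q01 + q10)
    return pi0, 1.0 - pi0

pi0, pi1 = two_state_steady_state(q01=1.0, q10=3.0)
# The process spends three quarters of its time in state 0,
# since it leaves state 0 more slowly than it returns to it.
```

For larger state spaces the same balance equations are assembled into a matrix and handed to a linear solver, which is exactly where the size of the reachability set becomes the limiting factor.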
If we derive the underlying Markov process of the Petri Net by computing the reachability graph
of the net, we can compute the steady state of the Markov process. In practice, not every SPN has a
steady state distribution (because the state space might not be finite) or it would be too expensive to
compute it (if the number of states is very large). This makes observations obtained from stochastic
simulations even more valuable.
2.2.3 Representing Biological Processes with Petri Nets
The first research involving the application of Petri Nets to biological models was conducted by
Reddy, Liebman & Mavrovouniotis (1993). They modelled the combined glycolytic and pentose
phosphate pathway in the erythrocyte cell with a Petri Net and used Petri Net theory to analyse
qualitative properties of this pathway.
Figure 2.2: A Petri Net model of gene expression. This is an abstract model of gene expression and protein
synthesis. The gene becomes active with rate λ. The protein is synthesized with rate v and degraded with
rate δ. The example is taken from Goss & Peccoud (1998).
In general it is easy to represent chemical reactions with a Petri Net. We think of places as
chemical species and of transitions as reactions occurring between these species. The multiplicity
of an arc is given by the stoichiometric coefficient of the species involved in the reaction. Tokens
usually represent single molecules but can also be seen as a fixed amount of molecules such as a
mole (= 6.022 × 10^23 molecules), a common base unit in Chemistry.
In recent years, there have been some publications in which Stochastic Petri Nets were used to
model coupled chemical reactions (Goss & Peccoud 1998, Srivastava, Peterson & Bentley 2001).
This is due to the fact that if the participating chemical species occur only at very low
concentrations, then a stochastic model of the chemical kinetics is more accurate than a deterministic one. In
this context, we assume that each reaction occurs with a certain probability and the rates of firings
in the net are given by the stochastic rate laws. We will give details of this stochastic assumption
in chemical kinetics in the next section.
2.3 Kinetics of chemical reactions
The next two sections give an overview of chemical kinetics. The basics of the deterministic
formulation are outlined and a comparison to the stochastic approach is drawn.
2.3.1 Deterministic Kinetics
Chemical kinetics are concerned with the time evolution of a reaction system. Classical or deter-
ministic kinetics are expressed in terms of the concentrations of the chemicals. These concentra-
tions can vary continuously as the reactions progress.
We assume that the rate of a reaction follows the mass action law, which means that this rate is
proportional to the concentration (and in turn to the mass) of each reactant raised to the power
of its stoichiometric coefficient (Cox & Nelson 2004). In other words, the rate of change of a
product is proportional to the product of the reactant concentrations. As an example, in the
second-order reaction

A + B → C
the rate of change of C, dC/dt, is given by k · [A] · [B], where [A] and [B] denote the concentrations
of A and B respectively and k is some constant. Thus the rate of change of C can be modelled by
the differential equation

dC/dt = k · [A] · [B]
The rate constant k needs to be specified as well as the initial concentrations of A and B. This
procedure can of course be extended to several coupled reactions. In this case, the reactions give
rise to a set of coupled differential equations. These equations can be used to compute the time
evolution of the reactions either by solving them analytically (which is not often possible) or by
numerical integration.
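As an illustration of the numerical route, the single reaction A + B → C can be integrated with the forward Euler method (a minimal sketch; the rate constant, initial concentrations and step size are invented for this example):

```python
def integrate(k=0.1, a=1.0, b=0.8, c=0.0, dt=0.01, steps=1000):
    """Forward-Euler integration of d[C]/dt = k[A][B] for A + B -> C."""
    for _ in range(steps):
        rate = k * a * b   # mass-action rate law
        a -= rate * dt     # each reaction event consumes one A ...
        b -= rate * dt     # ... and one B ...
        c += rate * dt     # ... and produces one C
    return a, b, c

a, b, c = integrate()
# Mass is conserved: [A] + [C] and [B] + [C] keep their initial values.
```

Since the updates of the reactants and the product are symmetric, the conserved sums serve as a simple sanity check of the integration.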
The idea behind the deterministic formulation of chemical kinetics is that even if single molecules
move randomly, the overall behaviour of a large group of molecules follows a pattern and this
pattern can be modelled deterministically.
Even if a set of differential equations cannot be solved analytically, it is often possible to determine
characteristics of their steady state behaviour. This can be done by setting the right side of the
equations to zero and solving for the concentrations of the reactants. But even if we know the steady
states of the system, we do not know whether a particular set of initial conditions will lead to one of
these states, nor how it will be reached. An example is given in our experiments with the Lotka-Volterra
reactions presented in section 5.1.
Figure 2.3: Typical time course of a reaction following Michaelis-Menten kinetics (product
concentration plotted against time). The rate of the synthesis of the product P increases rapidly
but converges after some time. This is due to the saturation of the enzyme.
2.3.1.1 Michaelis-Menten kinetics
Michaelis-Menten kinetics are a special case of deterministic kinetics. They are named after
Leonor Michaelis (1875-1949) and Maud Menten (1879-1960). Since these kinetics occur several
times in the experimental section of this work, a brief explanation is given in this section.
We consider a set of reactions in which a substrate S is converted into a product P only in the
presence of an enzyme E:
S +E ↔ ES (2.1)
ES → P+E (2.2)
We assume that the reactions follow mass-action kinetics. The forward reaction of 2.1 has the rate
constant k1, the backward reaction k−1 and reaction 2.2 the constant k2. These reactions can be
modelled by a set of coupled differential equations (Cox & Nelson 2004).
Using several assumptions, it was shown that these reactions can be simplified to one single
equation expressing the rate of change of the product [P] in terms of the Michaelis-Menten constant
KM = (k−1 + k2)/k1 and the maximum rate Vmax of the reaction. Vmax is given by [E0] × k2 where
[E0] is the total concentration of the enzyme E. Thus the Michaelis-Menten equation is given by

d[P]/dt = Vmax × [S] / (KM + [S])
Both constants, KM and Vmax, can be determined experimentally. Vmax can be obtained by increasing
the substrate concentration until the reaction reaches its maximum rate, and KM is equal to the
substrate concentration at which d[P]/dt equals Vmax/2 (set [S] = KM in the equation above).
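The half-maximal property can be checked directly (a small sketch with invented values for Vmax and KM; these are not parameters from this work):

```python
def mm_rate(s, vmax=2.0, km=0.5):
    """Michaelis-Menten rate d[P]/dt = Vmax * [S] / (KM + [S])."""
    return vmax * s / (km + s)

# At [S] = KM the rate is exactly half-maximal ...
half_maximal = mm_rate(0.5)        # Vmax / 2 = 1.0
# ... and for [S] >> KM the rate saturates towards Vmax.
near_saturation = mm_rate(500.0)
```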
This approximation is also part of the SBML description language (section 4.2) and occurs several
times in the deterministic model of the biomolecular clock in Neurospora (section 5.2). In order
to simulate Michaelis-Menten kinetics stochastically, this equation is usually decomposed again
into its elementary steps. The problem is that the decomposition of the deterministic formulation
to the elementary stochastic steps leads to ambiguities since the elementary rate constants are not
specified by the deterministic model.
2.3.2 Stochastic Kinetics
As outlined before, classical mass-action kinetics assume that the behaviour of a large number
of molecules follows deterministic patterns. The reaction constants are regarded as rates and the
various species concentrations are represented by continuous, single-valued functions of time. In
many cases, random fluctuations and correlations do not play a significant role in the behaviour
of a system, and this assumption is adequate. Nevertheless, there are many examples for which this
approach turned out not to be correct (Arkin, Ross & McAdams 1998).
We will start this section by outlining the central assumptions of the stochastic approach to chem-
ical kinetics. After this we will describe how this approach relates to the deterministic model and
how stochastic rate constants can be converted into deterministic ones and vice versa.
In a stochastic context, the reaction constants are viewed as reaction probabilities per unit time.
The temporal behaviour of the system is modelled as a Markovian random walk on the space of the
molecular populations of the species. It was proved that the stochastic formulation reduces to the
deterministic formulation in the thermodynamic limit, i.e. when the numbers of molecules and the
volume approach infinity (Kurtz 1971). We consider a set of n chemical species X1,X2, . . . ,Xn and
a set of m reactions R1,R2, . . . ,Rm. If the container in which the reactions take place is well stirred
and in thermal equilibrium, it can be shown that the probability that two molecules Xi and Xj,
i, j ∈ 1, . . . , n, collide is constant (Gillespie 1977). Each reaction Ri can therefore be characterized
by a single constant ci which is defined as the average probability that a particular combination
of Ri reactant molecules will react according to reaction Ri. The probability of the next occurrence
of reaction Ri in the time interval dt is then ci×dt. As an example, let us consider again a simple
second-order reaction:
A+B → C (2.3)
The rate constant for this reaction gives the probability that a pair of molecules A and B reacts to
produce C. Since there are A×B different combinations of molecules of this type, the probability
that this reaction will occur somewhere inside the container in the next infinitesimal time interval
dt is given by A × B × ci × dt, where A and B denote the numbers of molecules of species A and B
and ci is the reaction constant of the reaction as defined above.
If the reaction had been of the form

A → B

then this probability would have been ci × A, and in the case

2A → B

the probability would have been A(A−1)/2 × ci ≈ (A²/2) × ci. We now examine the relationship
between the reaction parameter ci and the deterministic rate constant ki. This is important since much
of the literature on biochemical rate constants is dominated by a deterministic point of view. Fur-
thermore, if we want to compare deterministic and stochastic formulations of the same model, we
need to be able to convert these deterministic constants into their stochastic counterparts.
Referring again to the simple example of reaction 2.3, A × B × ci dt gives the probability that this
reaction will occur somewhere in the container in the next time interval dt. Dividing by the volume
V leads us to the average reaction rate per unit volume, A × B × ci/V. This is already close to the
deterministic formulation, in which the rate is defined as the average reaction rate per unit volume.
But the deterministic constants are expressed in terms of concentrations [A] = A/V and [B] = B/V,
not in terms of numbers of molecules. If we replace A and B in the stochastic rate law by [A] and
[B], we obtain [A] × [B] × V × ci, which is the rate of change in terms of the concentrations.
Since the deterministic rate law is defined as ki × [A] × [B], we can infer that

ki = V × ci
for a bimolecular reaction of the form 2.3. For a reaction 2A→ B, we would have obtained ki =
V × ci/2. For a monomolecular reaction such as A→ B, ki and ci are equal. In general, we can
conclude that the relationship between ci and ki is simple in a mathematical sense. Conversions
and comparisons between parameters in deterministic and stochastic approaches are possible.
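These conversions can be collected in a small helper (a sketch; V denotes the container volume, and the three reaction forms are exactly the cases discussed above):

```python
def deterministic_k(c, volume, reaction):
    """Convert a stochastic rate constant c into its deterministic
    counterpart k for the three reaction types discussed above."""
    if reaction == "A -> B":        # monomolecular: k = c
        return c
    if reaction == "A + B -> C":    # bimolecular:   k = V * c
        return volume * c
    if reaction == "2A -> B":       # dimerization:  k = V * c / 2
        return volume * c / 2
    raise ValueError("unknown reaction type: " + reaction)

def stochastic_c(k, volume, reaction):
    """Inverse conversion: deterministic k back to stochastic c.
    Works because k is linear in c for every reaction type."""
    return k / deterministic_k(1.0, volume, reaction)
```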
2.4 Simulation Algorithms
Starting from the stochastic approach to chemical kinetics described in the previous section, two
well-known stochastic simulation algorithms will be introduced: the First Reaction Method de-
veloped by Gillespie (1976) and the Next Reaction Method by Gibson & Bruck (2000). Both
algorithms are exact procedures for numerically simulating the time evolution of a well-stirred
chemically reacting system.
Both procedures are used in the experimental section in order to simulate the dynamic behaviour
of a Stochastic Petri Net model. The Gillespie algorithm is described in more detail, explaining its
relationship to a Stochastic Petri Net and drawing a comparison to the Next Reaction Method.
2.4.1 The Gillespie Algorithm
The Gillespie algorithm is the most popular algorithm for the stochastic simulation of coupled
chemical reactions. It was proposed by Gillespie (1976) and comes in two different versions: The
First Reaction Method and the Direct Method. Following the structure of this paper, the theoretical
concept that underlies the algorithm will be introduced first. Then, the traditional master-equation
and the simulation approach will be compared.
In a deterministic setting, we assume that the time evolution of a chemically reacting system is
continuous and deterministic. It is evident that this is not correct since the molecular population
levels can only change by discrete integer amounts. In addition, the time evolution is usually not a
deterministic process, but is governed by the random movements of single molecules. The Gille-
spie algorithm is useful if we want to simulate reactions with very few molecules involved. This
is the case for many regulatory networks. Furthermore, a stochastic approach is appropriate for
systems that exhibit unstable behaviour. In this case, even small fluctuations in the molecular
populations can drive the system out of its current state. This causes drastic changes in the system that
could not be predicted by a deterministic formulation.
The traditional approach, before Gillespie developed his algorithm, was the master-equation ap-
proach. In this approach, random variables are used to denote each possible state of the system,
i.e. the combinations of molecular populations. The master equation or Chapman-Kolmogorov equation
is a system of coupled differential equations that describes the transition probabilities in the sys-
tem. In principle, it is possible to write down and solve the master equation for a system, which
would give us complete knowledge of the system's dynamics. However, this is only possible for
simple systems with very few states. For larger systems this approach becomes intractable.
As outlined above, the Gillespie algorithm follows the assumption of stochastic kinetics. We as-
sume that for each reaction Rµ a stochastic reaction constant cµ exists that gives the probability that
a particular combination of molecules will react according to Rµ. This assumption requires that the
system is kept well mixed, either by direct stirring or simply by requiring that nonreactive molecular
collisions occur much more frequently than reactive molecular collisions (Gillespie 1977).
This is the fundamental hypothesis of the Gillespie algorithm.
The algorithm generates a single sample trajectory of the chemical process. This can be interpreted
as a random walk through the space of possible states. At each time step, the system is exactly
in one state defined by the molecular populations in the system. The Gillespie algorithm then
picks a reaction and executes it according to a probability distribution such that the probability of
the generated trajectory is the same as the one the master equation would assign to it. By
generating many trajectories and averaging their results, we can estimate any quantity of interest,
such as the average number of molecules of a species at some time t.
Gillespie (1976) proposed two methods for the simulation of the trajectories. The Direct Method
calculates explicitly which reaction occurs next and when it occurs. The First Reaction Method
generates for each reaction µ a time τµ at which it occurs and then executes the reaction which
occurs first. We will describe both methods now.
The Direct Method
This method relies on the probability density P(µ,τ) that the next reaction is µ and occurs at time τ.
We already introduced the stochastic reaction constant cµ which is the probability that a particular
combination of molecules will react according to reaction Rµ. Let hµ be the number of distinct
combinations of Rµ reactant molecules in a certain state. For a bimolecular reaction X +Y → Z,
hµ would have the form XY , for a reaction of the form 2X → Z, hµ would be X(X − 1)/2 etc.
We can then define aµ dt = hµ × cµ × dt as the probability that reaction Rµ will occur within the
next time interval dt, given the state (X1, X2, . . . , XN) at time t. It can be shown (Gillespie 1976) that

P(µ, τ) dτ = aµ × exp(−τ × ∑j aj) dτ
This equation can be used to compute directly the next reaction to occur. Integrating P(µ, τ) over
all τ from 0 to ∞ yields

P(Reaction = µ) = aµ / ∑j aj

In a similar way, we can obtain the distribution of the waiting time until the next reaction occurs by
summing P(µ, τ) over all µ, which gives us

P(τ) dτ = (∑j aj) × exp(−τ × ∑j aj) dτ
This simply means that the waiting time to the next reaction is exponentially distributed with
parameter ∑j aj. These two distributions give rise to the Direct Method:
1. Set the initial numbers of molecules and set t to 0.
2. Calculate aµ for all reactions µ.
3. Choose a reaction µ according to the distribution P(Reaction = µ).
4. Choose τ according to P(τ).
5. Execute reaction µ by changing the numbers of molecules accordingly. Update the time to
t + τ.
6. Go to step 2.
This algorithm needs two random numbers per iteration. It takes time proportional to the number of
reactions to update the aµ values, since they depend on the current state of the system, and time
proportional to the number of reactions to calculate ∑j aj.
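The six steps above can be sketched in a few lines of Python (a minimal illustration, not the implementation used in this project; the dict/lambda encoding of the reaction network and the constant c = 0.01 are assumptions of this sketch):

```python
import random

def direct_method(state, reactions, t_end, seed=0):
    """Gillespie Direct Method. state: dict of molecule counts;
    reactions: list of (propensity_function, update_function) pairs."""
    rng = random.Random(seed)
    t = 0.0
    while t < t_end:
        props = [p(state) for p, _ in reactions]   # step 2: all a_mu
        a0 = sum(props)
        if a0 == 0.0:
            break                                  # no reaction can fire
        t += rng.expovariate(a0)                   # step 4: tau ~ Exp(a0)
        r = rng.uniform(0.0, a0)                   # step 3: P(mu) = a_mu / a0
        acc = 0.0
        for (_, fire), a_mu in zip(reactions, props):
            acc += a_mu
            if r < acc:
                fire(state)                        # step 5: execute reaction
                break
        # step 6: loop back to step 2
    return state

# Example: the single reaction A + B -> C with stochastic constant 0.01.
final = direct_method(
    {"A": 100, "B": 100, "C": 0},
    [(lambda s: 0.01 * s["A"] * s["B"],
      lambda s: s.update(A=s["A"] - 1, B=s["B"] - 1, C=s["C"] + 1))],
    t_end=1000.0,
)
```

Note that the molecule counts change by whole integers at each firing, in contrast to the continuous trajectories of the deterministic formulation.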
The First Reaction Method
This algorithm computes a putative time τµ for each reaction µ to occur, a time the reaction would
occur at if no other reaction occurred first. The reaction with the lowest τµ is then executed first.
1. Set the initial numbers of molecules and set t to 0.
2. Calculate aµ for all reactions µ.
3. For each reaction µ, compute a delay τµ according to an exponential distribution with parameter aµ.
4. Let µ∗ be the reaction with the smallest delay τµ∗.
5. Execute reaction µ∗ by changing the numbers of molecules accordingly. Update the time to
t + τµ∗.
6. Go to step 2.
These two algorithms seem to be very different, but it can be proved that τ and µ are chosen ac-
cording to the same probability distribution and both approaches are therefore equivalent (Gillespie
1976). The First Reaction method computes one random number per reaction in each iteration, needs
time proportional to the number of reactions to update the ai values and needs time proportional to
the number of reactions to identify the reaction with the smallest putative time.
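The First Reaction steps above admit a similarly compact sketch (again illustrative, not the project's code; the state and reaction encoding is an assumption of this example):

```python
import random

def first_reaction_method(state, reactions, t_end, seed=0):
    """state: dict of molecule counts; reactions: list of
    (propensity_function, update_function) pairs."""
    rng = random.Random(seed)
    t = 0.0
    while True:
        # steps 2-3: draw a putative exponential delay for every reaction
        candidates = [(rng.expovariate(p(state)), fire)
                      for p, fire in reactions if p(state) > 0.0]
        if not candidates:
            break                       # no reaction can fire any more
        tau, fire = min(candidates, key=lambda c: c[0])   # step 4
        if t + tau > t_end:
            break
        t += tau
        fire(state)                     # step 5: execute earliest reaction
    return state

# Example: the single reaction A + B -> C with stochastic constant 0.01.
final = first_reaction_method(
    {"A": 50, "B": 50, "C": 0},
    [(lambda s: 0.01 * s["A"] * s["B"],
      lambda s: s.update(A=s["A"] - 1, B=s["B"] - 1, C=s["C"] + 1))],
    t_end=1000.0,
)
```

Every iteration discards all but one of the drawn delays, which is exactly the waste the Next Reaction Method described below removes.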
The Gillespie algorithm and Stochastic Petri Nets
One can already anticipate that there is a close relationship between the First Reaction method and
the simulation of a Stochastic Petri Net. In fact, the Gillespie algorithm fulfils the Markov property
since the transition probabilities to each new state depend on the current state only. If a SPN is
chosen to represent a set of coupled reactions and if its transition rates are chosen according to the
stochastic mass-action rate laws, then its behaviour can be simulated with the First Reaction method.
The SPN can
be seen as a direct graphical representation of the underlying Markov process.
In some cases the execution of a reaction might affect other reactions as well, for instance if some
reactions share the same reactants. A SPN gives some information about those dependencies since
reactions with the same reactants would be represented by transitions with the same input places
in the SPN. But the Gillespie algorithm does not make use of this information. In the next section,
an algorithm is introduced that takes these dependencies into consideration.
2.4.2 The Gibson-Bruck Algorithm
This algorithm is also called the Next Reaction Method (Gibson & Bruck 2000) and is an improve-
ment of the First Reaction Method as described in the last sections. The Gillespie algorithm in
both variants was extremely successful and is still commonly used. Its main disadvantage is the
amount of computational effort that is needed to conduct simulations with many molecular species
and many reactions.
The Next Reaction Method addresses this problem by offering a reduction of the running time
while still being exact. In our experiments it turned out that the size of the models was not large
enough to cause a substantial difference in the running time. Nevertheless, this algorithm was used
to obtain results averaged over a large number of simulation runs, where even a small saving per
run adds up.
If one examines the Gillespie algorithm carefully, one can see that it in fact does more work than
necessary. It recomputes the rate and delay of each reaction even in cases when they
have not changed. In contrast to this, the approach of Gibson and Bruck stores both values for
each reaction in an efficient data structure and only recomputes them when necessary. Normally it is
not legitimate to simply re-use random numbers such as the delay of each reaction; here it is,
because the delays follow an exponential distribution, which is memoryless.
The Next Reaction Method:
1. Set the initial numbers of molecules, set the time t = 0, calculate the stochastic rate ai for
each reaction i and calculate a delay τi for each reaction.
2. Let j be the index of the smallest τi.
3. Execute reaction j by changing the numbers of molecules accordingly.
4. Update a j according to the new state and compute a new putative delay for reaction j using
the new a j.
5. For each reaction i whose parameter ai is affected by the execution of reaction j:
(a) Update the rate ai, but store the old value as ai,old.
(b) Set τi = (ai,old/ai,new) × (τi − t) + t (see Gibson & Bruck 2000).
(c) Discard the old rate ai,old.
6. Go to step 2.
The Next Reaction method needs to know which ai’s are affected by the execution of a reaction.
Gibson and Bruck state that this can be achieved by using a data structure called dependency graph.
This graph has a node for each reaction. A directed arc connects nodes i and j if the rate of reaction
j is affected by the execution of reaction i. The graph can be constructed automatically before the
algorithm starts by searching for species that are reactants or products of one reaction and
reactants of another reaction. After the execution of a reaction i, the algorithm determines the rates
that need to be changed by examining the child nodes of node i in the graph.
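A dependency graph of this kind can be constructed automatically; the following sketch uses a set-based reaction encoding that is an assumption of this example (and, as a simplification, treats every reactant or product of a reaction as potentially changed, even species with zero net change):

```python
def dependency_graph(reactions):
    """reactions: list of (reactant_set, product_set) pairs.
    An arc i -> j means that firing reaction i may change the rate
    of j, because i alters a species among j's reactants."""
    graph = {i: set() for i in range(len(reactions))}
    for i, (reactants_i, products_i) in enumerate(reactions):
        changed = reactants_i | products_i     # species i may alter
        for j, (reactants_j, _) in enumerate(reactions):
            if changed & reactants_j:
                graph[i].add(j)
    return graph

# Example: R0: A + B -> C,  R1: C -> D,  R2: D -> A
deps = dependency_graph([({"A", "B"}, {"C"}),
                         ({"C"}, {"D"}),
                         ({"D"}, {"A"})])
# Firing R0 changes A, B and C, so it affects R0 itself and R1.
```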
In order to obtain the reaction with the smallest delay efficiently, Gibson and Bruck propose an
indexed priority queue. The queue is implemented by another graph which offers fast search and
insert operations.
The Next Reaction method fits a Stochastic Petri Net more naturally since it operates on a more
local level. The graph representation of a SPN is similar to the dependency graph proposed by
Gibson and Bruck. For a reaction which is represented by a transition in the SPN, the set of all
places which are connected to this transition are molecular species that are affected by this reac-
tion. These species might be reactants of other reactions and the rates of these reactions have to
be changed. This is an interesting relationship, but since the Systems Biology Workbench (Hucka,
Finney, Sauro, Bolouri, Doyle & Kitano 2002) already offers efficient implementations of the
Gibson-Bruck algorithm, it was not exploited here.
The Gibson-Bruck algorithm needs only one new random number per iteration and recomputes
reaction rates and delays only when necessary. It is the fastest exact simulation algorithm, but since
it is not easy to implement, the Gillespie algorithm is still the most commonly used approach.
Chapter 3
Background on Stochastic Models and
Tools in Biology
After laying the theoretical foundations of this project in the last chapter, some previous work on
stochastic modelling of biological phenomena will now be reviewed. It might be surprising, but
random fluctuations or noise play a very important role in the regulation of genes and in turn can
lead to a random behaviour on the level of metabolic or regulatory pathways. One of the most
famous examples is the Phage λ decision circuit (Ptashne 1992). Arkin et al. (1998) developed
a stochastic model of this gene network that was able to explain the apparently random decision
between lysogenic or lysic cycle. A brief description of this model, which was one of the first and
most comprehensive stochastic models of a gene network, is given, together with a review of two
other publications that used the same modelling formalism. A comparison of their approach with
the findings of this dissertation is given. To our knowledge, these two publications are the first
scientific projects in which Stochastic Petri Nets where used to model the dynamic behaviour of
biomolecular interactions.
The last section in this chapter is dedicated to the technical methodology of this project. Two
software tools that can be used to edit and simulate Petri Nets are presented. A comparison is
drawn between these tools and the Petri Net Kernel, the tool that was used and extended in this
study.
3.1 Previous Work on Stochastic Models
3.1.1 A Stochastic Model of Pathway Bifurcation in Phage λ
The following simplified description of the pathway bifurcation in phage λ follows the paper by
Arkin et al. (1998). They were the first to present such a detailed stochastic model of a regulatory
gene network. They also proved that random fluctuations in the molecular populations can have
drastic consequences for the organism as a whole. We provide a short description of the model and
draw a comparison to our work. In reality, the decision circuit is much more complex and contains
various other factors. But this description focuses on the most important ones.
The model
The phage λ is a bacteriophage, a parasite that attacks E. coli bacteria. The phage attaches its tail
to the surface of a bacterium and injects its chromosome into it. After this, the infected bacterium
can switch between two states: in some cases the phage chromosome replicates itself and new
phages are produced in the host cell. According to Ptashne (1992) it takes about 45 min until
the infected bacterium is destroyed (lysis) and about 100 new phages are released. In other cases
the phage chromosome enters the lysogenic state, integrates into the host DNA and is replicated
together with the bacterium. The phage chromosome therefore stays in a dormant state, and only
certain events, for instance ultraviolet irradiation, can lead to lysis of the bacterium. Apart from this
so-called stress-induced decision between two states, it was also observed that the change between
the two states can occur at random.
The decision whether an infected E.coli cell enters either the lysic or lysogenic pathway is mainly
controlled by two different proteins: Cro and CI. Cro starts the lytic cycle. If it is expressed con-
stantly over a longer period of time, the chromosome of the λ phage is replicated and the lysis of
the bacterium is inevitable. On the other hand, the CI protein controls the lysogenic pathway. If it is
expressed constantly, other phage genes are suppressed and the phage enters a dormant state and
the lysogenic cycle starts. However, CI is usually not expressed after infection with the phage. But its
expression can be induced by another protein, CII. This protein is usually degraded very quickly
and CI is not expressed. But if a fourth protein, CIII, is expressed at the same time, the degradation
of CII is slowed down and CII molecules are available long enough to induce the expression of CI,
which leads to the lysogenic state.
To conclude, if the proteins CII and CIII are expressed shortly after the infection by a λ phage, CI
is activated, the expression of Cro is suppressed and the bacterium enters the lysogenic pathway. If
not, CI is not expressed and lysis begins.
Results
Arkin et al. (1998) were able to show that the lysis-lysogeny decision is indeed influenced by ran-
dom bursts in the protein production. In all cases, a burst in the concentration of the CII protein
occurred after infection with the phage. But only in the lysogenic case was this burst by chance
accompanied by a burst of CIII production. The CIII could then stabilize the CII production, which
in turn led to an increase of CI and entry into the lysogenic state. In contrast, in the lytic-fated
case no CIII production occurred so the unprotected CII rapidly degraded and did not activate the
CI expression. Without expression of CI, Cro production continued and lysis ensued.
These results show that random fluctuations in the production of one single protein can influence
the fate of a simple organism. This work was the first comprehensive stochastic model of a reg-
ulatory network. It highlighted the need for stochastic models to capture the full complexity of
Biology. Even though the authors did not use Petri Nets to model the network, it would have been
possible, since they simulated the system with the Gillespie algorithm. This corresponds to the
execution of the corresponding Stochastic Petri Net as described in chapter 2.
3.1.2 Stochastic analysis of Biological models with Petri Nets
The first attempts to model biological systems with Petri Nets focussed on static properties of the
model. Reddy et al. (1993) were the first to use Petri Nets to model the combined glycolytic and
pentose phosphate pathway in the erythrocyte cell. Goss & Peccoud (1998) used Stochastic Petri
Nets (SPNs) for the first time to model the dynamics of molecular interactions. They were the
first to recognize the advantages that are offered by this representation. For these reasons, a brief
summary of their work and a comparison to our approach is given.
The model
Goss & Peccoud (1998) explain the terminology of Stochastic Petri Nets and illustrate how it can
be used to model molecular interactions. Furthermore they present a stochastic model of ColE1
plasmid replication and compare its simulation results to the deterministic solution.
They introduced the very intuitive mapping of molecular species to places and reactions to transi-
tions in the Petri Net. The same representation was used in the experimental section of this work.
Nevertheless, it is sometimes more convenient to understand a place as a certain state of the
system rather than as a species. For instance, in the model of a biological clock (section 5.2), a
repressor protein binds to the promoter region of a gene. This binding reaction is represented by
a transition whose input set consists of the protein and the gene place. The output place of this
transition represents the suppressed gene with the bound protein. But this place represents the
state "inhibited gene" rather than a molecular species, since gene and protein are still two separate
molecules.
Goss & Peccoud (1998) simulate the Petri Net with the UltraSAN software (Deavours, II, Qureshi,
Sanders & van Moorsel 1995). This simulation is similar to the Next Reaction Method (Gibson &
Bruck 2000). In addition, they also derive an analytic solution by solving the associated Markov
process for its steady state. They briefly mention that Petri Net theory can also be used to examine
structural properties of the net. We tried this as well but in our case the results were not very inter-
esting. It is possible, for instance, to examine the Petri Net for place invariants. These invariants
are, informally speaking, a set of places whose sum of tokens does not change during the simu-
lation of the net. Not surprisingly, the places that represent an enzyme and the enzyme-substrate
complex are part of an invariant since the number of enzyme molecules is always constant.
Results
To summarize, Goss & Peccoud (1998) were the first to apply SPNs to model chemical reactions.
Their work is the foundation of this project even if their objective was slightly different. They were
interested in analytical and simulation results of the Markov process represented by the Petri Net.
However, as already mentioned in chapter 2, obtaining the steady state distribution of the Markov
process is not always possible. Goss & Peccoud (1998) demonstrated that it is possible to restrict
the state space of a SPN by simply enforcing an upper bound on the number of tokens in the places.
It is important to find a reasonable value for this bound.
We did not follow this approach in our experiments since it is unlikely that the steady state
behaviour of the systems we considered would give interesting results. The models of circadian
clocks we present in chapter 5 exhibit an oscillatory behaviour, and the steady state distribution
would simply give an average over the oscillations. In addition, simulation results are easier and
faster to obtain. Therefore
we focus more on the evaluation of the results obtained from the stochastic simulation.
Goss & Peccoud (1998) compared analytical and deterministic results with the simulated behav-
iour of the net. We also made comparisons between the deterministic and stochastic solution but
following the work of Gonze, Halloy & Goldbeter (2002), we were also interested in the behaviour
of the model with different numbers of molecules involved. The behaviour of the stochastic model
under different conditions was not a topic in the work by Goss & Peccoud (1998).
3.1.3 Analysis of the E.coli Stress Circuit with Stochastic Nets
Figure 3.1: (a) An overview of the simplified E. coli stress circuit (Srivastava et al. 2001). The σ32 protein
is synthesized and forms a holoenzyme together with the mRNA polymerase (Eσ32). This complex binds to
the promoter regions of several chaperone proteins and proteases that degrade misfolded proteins. Some of
the chaperones can also bind to Eσ32, serve as a reservoir of the sigma factor and lead to its degradation.
(b) Ethanol stress response with and without σ32 antisense mRNA: results of the stochastic simulation
(thick solid line) with standard deviation (thin dotted lines) compared to experimental data (points with error
bars). The antisense mRNA binds to the σ32 mRNA and was included to see how the model behaves if
σ32 synthesis is inhibited.
Srivastava et al. (2001) developed a SPN model of the E.coli stress circuit and used this model
to characterize the behaviour of the bacterium under stress, i.e. when exposed to heat, ethanol, heavy
metals etc. σ32 is a protein that regulates the expression of other proteins in response to external
stress. Under stress, the rate of synthesis and the stability of σ32 increase. This in turn increases the
production of other proteins such as chaperones (enzymes which assist other proteins in achieving
proper folding, which is affected by stress) or proteases (enzymes that degrade other proteins).
Results
The problem is that many details of the σ32 mediated pathways are still not understood. The au-
thors validated their model by comparing its simulated behaviour to experimental results. But it
is not certain whether all parts of the model are correct, such as the exact binding sequence of the
different chaperones. Nevertheless, they were able to reproduce experimental results and to gain new insights
into the behaviour of the stress circuit. One of these results was that the σ32 response is mainly
controlled by the rate of mRNA translation and that large quantities of the protein are bound to
chaperones under non-stress conditions. This allows the cell to react rapidly to external stress by
releasing these proteins and not waiting until new ones are produced. The authors also underlined
the fact that a stochastic formulation can be used to generate estimates of the variance in the data.
This information cannot be obtained from a deterministic model.
Their approach was similar to ours since they created a simplified model of a biological system
and used simulation results to understand it better. They faced the same dilemma as we did since
many details of the real biological system are not understood, so one has to generalize and make
assumptions about some parameters. On the other hand, our experiments deal with rather general
features such as the architecture of the system, its resistance to noise etc. In contrast to this,
Srivastava et al. (2001) were interested in very specific features of the σ32 pathways such as the
partitioning of the σ32 within the cell and the time evolution of its concentration.
3.2 Review of other Petri Net Tools
An Open Source software tool was used to edit and simulate the Petri Net models in this project.
Developing new software completely from scratch would have taken too much of
the three months that were allocated for this project. Hence we decided to search for a suitable
software tool and to extend it with the features needed.
It turned out that there are many software tools available that have some of the required features.
There are some very good Petri Net editors such as GreatSPN (Chiola, Franceschinis, Gaeta &
Ribaudo 1995) or UltraSAN (Deavours et al. 1995). GreatSPN supports Generalized Stochastic
Petri Nets (Marsan, Balbo, Conte, Donatelli & Franceschinis 1995), Stochastic Petri Nets with
immediate transitions and inhibitor arcs. UltraSAN started as a tool for Stochastic Petri Nets but
now offers many extensions to this theory, such as arcs that are associated with a certain function
and various analytical solvers on the level of the Markov process.
Both tools are general tools with no specific application domain. That means that they do not offer
any features specific to a biological application. Nevertheless, they are widely used and UltraSAN
has also been used to model regulatory networks in E. coli (Goss & Peccoud 1998). The problem
is that they often have very restrictive licenses and, even though they are developed by research groups
at universities, the developers do not support extensions of their software.
Thus we searched for alternatives. A great help was the online archive of Petri Net tools available
at the University of Hamburg, Germany.1 In the remainder of this chapter, we will review some
of the Petri Net software tools that are available online and justify the decision to use the Petri Net
Kernel (Kindler & Weber 2001).
Requirements for this project
The software needed should fulfil the following requirements:
• a graphical interface that is easy to use,
• platform independence (which usually means that it is written in the programming language
Java),
• support for stochastic simulation,
• means to analyse the results of the simulation,
• and a modular structure, such that we could implement new features if necessary.
Most of the Petri Net tools at Sourceforge and at the Petri Nets World archive were tested for these
abilities. As an example, we will present two of these tools and the extent to which they fulfil these
requirements.
1http://www.informatik.uni-hamburg.de/TGI/PetriNets/
Cell Illustrator (Genomic Object Net)
The basis of this software project was an extension of the discrete Petri Net theory by continuous
features (Matsuno, Doi, Nagasaki & Miyano 2000). This can be useful since protein concentra-
tions vary continuously but are coupled with discrete switches (i.e. protein production is switched
on or off depending on the expression levels of some genes). Based on this observation, Matsuno,
Tanaka, Aoshima, Doi, Matsui & Miyano (2003) developed the theory of Hybrid Functional Petri
Nets (HFPN). In general, a hybrid Petri Net contains two sets of places and transitions, discrete
/ continuous places and discrete / continuous transitions. Discrete places and discrete transitions
are the same as in the discrete Petri Net model. In contrast to this, a continuous place holds a
nonnegative real number as its content. A continuous transition fires continuously and its firing
speed is a function of the values in the places. Finally, a hybrid functional Petri Net has discrete
and continuous input and output arcs, but also test input arcs. A test arc can be directed from a place
of any kind to a transition of any kind and does not consume the content of its source place. It
enables its target transition only if the source place contains at least as many tokens as given
by the arc's inscription.
The simulation software Genomic Object Net (GON) implements these Hybrid Functional Petri
Nets. Matsuno et al. (2003) claim that using GON it is possible to construct a computational
model directly from a map of the biological pathway taken from the literature. GON uses a system
of differential equations to simulate this pathway. The parameters of the reactions have to be
determined by experiments or found in the literature.
A trial version of GON was installed and tested. It is evident that a hybrid net is well suited to model
switching behaviour as it occurs in biology. The software is very easy to use and gives a very
professional impression. On the other hand, it does not implement any stochastic simulation
features, so it represents a very different modelling approach.
One of the biggest drawbacks of GON is that the latest version has become commercial. The
company Gene Networks International now sells this software under the name Cell Illustrator. Even
academic users have to pay a considerable sum for the full version but a trial version with limited
functionality is available.
Cell Illustrator is only available for Windows. It is very easy to use but cannot be extended and
does not offer any stochastic simulation features. Therefore it was not suitable for this project.
PIPE - Platform Independent Petri Net Editor
In contrast to GON, this tool is a classical editor for Petri Nets. It was developed during an M.Sc.
group project at Imperial College London in 2003 (Bloom 2003). After the end of that project,
PIPE was further extended and is now available as an Open Source project at the Sourceforge online
repository2.
The software was written with the aim of providing an easy-to-use application. The authors also
wanted to ensure that the program is extensible by creating a modular structure. New modules can
be added easily to extend the functionality of the program. So far, different analysis modules have
been developed such as modules for the identification of invariants, state space analysis etc.
In the beginning, it seemed that PIPE would be an ideal tool for this project. It is published under
an Open Source licence, written in Java and very intuitive to use. Nevertheless, it does not support
Stochastic Petri Nets and hence no stochastic simulation. At first we planned to implement these
features on our own and received considerable support from the current maintainer, James Bloom.
Unfortunately, it turned out that the features needed were far more difficult to implement than
expected. After some days it was decided to use another software tool that was easier to extend.
The problem was that PIPE was written in such a way that new analysis modules could be added
easily, but it was difficult to introduce new types of Petri Nets. In its current version, PIPE only
supports non-stochastic Place-Transition Nets and is therefore not suitable for this project.
3.3 Conclusions
The idea that a stochastic simulation of chemical reactions can be more accurate and natural from
a physical perspective is not new. But the fact that variations on the molecular level can have
substantial influence on the high-level pathways and even on the phenotype of the organism was
demonstrated only a few years ago (Arkin et al. 1998). We were able to make use of the experience and
the work of others in this project, even though our approach differs slightly from the work published so
far.
2http://www.petri-net.net/htdocs/
From a software engineering perspective, we faced the difficult task of finding a software tool that
fulfilled most of our requirements but was also modular, so that it could be extended easily if
needed. There is a clear tendency for every research group to write its own tools, mainly
because self-written tools appear to be more trustworthy. In the near future it might become even
more important to integrate different tools and their capabilities in order to avoid a waste of time and
resources.
For this project, it turned out that although many tools are available, none fulfilled all
requirements. We had the option either to write our own tool or to extend an existing one. Writing
useful software from scratch takes a considerable amount of time, probably even more than the
three months that were allocated for this project, and therefore we decided to use an existing tool.
This also gave us enough time to put more effort into the experimental and modelling part of the
dissertation.
Chapter 4
Technical Methodology
The first half of this chapter describes the Petri Net Kernel (PNK) version 2 (Kindler & Weber
2001) and the extensions made during this project. We start by giving a general overview of the
architecture and functions of the Kernel in the first section. After this, the extended software as it
was used in this dissertation is presented. How the Kernel can be extended to support other
Petri Net types is documented in the remainder of the first section.
In its current version, the Kernel can only be used to visualize and edit Petri Nets but not to simulate
their behaviour. An extension of PNK was developed that uses the infrastructure of the Systems
Biology Workbench (Hucka et al. 2002) to simulate the Markov process represented by the Petri
Net. We call this extended version PNK 2e (extended Petri Net Kernel 2).
The second half of this chapter is therefore dedicated to the Systems Biology Workbench (Hucka
et al. 2002). An overview of this project and its components is given, as well as of the XML-based
language SBML (Finney & Hucka 2003), which is used as the workbench's data exchange
format.
4.1 The Petri Net Kernel
The Petri Net Kernel (PNK) was developed by the Theory of Programming research group at the
Humboldt University of Berlin, Germany. It is intended to provide an infrastructure for the
development of Petri Net tools rather than to be a tool in its own right. The current version is 2.2 and written
in Java. It is available at http://www.informatik.hu-berlin.de/top/pnk/.
In its standard version, PNK supports Place-Transition Nets and Coloured Petri Nets, Petri Nets in
which the tokens belong to different classes and are distinguishable. It also offers a rudimentary
graphical editor that can be used to create and modify Petri Nets. The rationale for the development
of PNK was the fact that implementing a new Petri Net tool can be very time consuming. Most
of the implementation effort for a new Petri Net tool is spent on almost the same functionality: a
graphical editor, a visualization device, functions to save and load the net, etc. The effort for
implementing this standard functionality is spent over and over again when new tools are developed
by different groups. The aim of the PNK project was to provide a common infrastructure for Petri
Net tools and to avoid repeating the same implementation effort.
The PNK project itself is now finished, but a new and improved version is currently being developed at
the University of Paderborn, Germany. Some tools have been developed using PNK,
such as PNVis, which supports the 3D visualization of Petri Nets, and the Petri Net Cube, a tool
that implements parametric Petri Nets.
4.1.1 Design and Concepts
In the following an outline of the architecture of PNK and its modules is given. New Petri Net
types can be added to the Kernel by implementing classes for places, transitions and arcs. Each of
these classes can have labels or extensions which are also Java classes.
As an example, a place has an extension for its current marking, and a transition has another
extension which represents the rate of the exponential distribution from which its delay is sampled.
The net as a whole can also have one or several extensions, for instance its name and additional
information about the net type. The dependencies between classes and their labels are defined in an
XML file called netTypeSpecification. An example of a definition file for a Stochastic Petri Net is
given below.
This concept of a net type is very general. Its advantage is its close relationship to the Petri Net
Markup Language (Weber & Kindler 2002), which is a common description language to exchange
Petri Nets between different software tools. It is also used by the Kernel to save nets created by the
user. Nevertheless, this markup language defines only the syntax: which labels and combinations
of labels are allowed for a certain net type. In order to provide the semantics, methods can be
defined for each label. For instance the label representing the marking of a place has methods that
define the addition and subtraction of tokens. These methods are used by the simulator to move
the tokens through the net. Each net also has an associated instance of the class FiringRule. This
class defines how transitions are executed and how conflicts between transitions are resolved. In a
SPN, the class StochasticRule defines that the delays in the net follow an exponential distribution
and that in case of a conflict, the transition with the smallest delay is executed first.
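This race policy can be sketched in a few lines of Java. This is an illustrative sketch only: the class RacePolicy and its methods are invented for this example and do not reproduce PNK's actual StochasticRule implementation.

```java
import java.util.List;
import java.util.Random;

// Sketch of the race policy of a stochastic firing rule: every enabled
// transition samples a delay from an exponential distribution with its own
// rate, and the transition with the smallest delay fires first.
public class RacePolicy {

    // Inverse-transform sampling of an exponential delay with the given rate.
    static double sampleDelay(double rate, Random rng) {
        return -Math.log(1.0 - rng.nextDouble()) / rate;
    }

    // Returns the index of the enabled transition that wins the race.
    static int nextTransition(List<Double> rates, Random rng) {
        int winner = -1;
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < rates.size(); i++) {
            double d = sampleDelay(rates.get(i), rng);
            if (d < best) {
                best = d;
                winner = i;
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        // Two enabled transitions; the one with the much larger rate
        // should win the race in the vast majority of trials.
        int winsOfFast = 0;
        for (int i = 0; i < 1000; i++) {
            if (nextTransition(List.of(100.0, 0.1), rng) == 0) winsOfFast++;
        }
        System.out.println("fast transition won " + winsOfFast + " of 1000 races");
    }
}
```

Because the minimum of independent exponential delays is again exponential, this race policy is equivalent to a step of Gillespie-style stochastic simulation.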
Application Modules and I/O Modules
Each instance of a net type that is loaded into the editor is checked against its definition. Further-
more, the Petri Net Kernel provides a net type interface. This interface can be used to derive new
types. It basically states that each label and node in a net must be defined by a Java class. Once a
new net type is defined that fulfils the requirements of the net type interface, an instance of the net
can be loaded into the Kernel.
PNK also contains templates that define how a Petri Net can be saved to and retrieved from a file,
the so-called I/O-modules. So far, there is a module that saves a net into a PNML file (Petri Net
Markup Language). Support for SBML and CMDL was added. SBML is used in the Systems
Biology Workbench to exchange models between different components and CMDL is a simplified
description language for chemical reactions supported by the stochastic simulator Dizzy (Ramsey,
Orrell & Bolouri 2005). The aim of the PNK project was to provide a common infrastructure for a
variety of Petri Net tools by offering templates for tasks that occur repeatedly. The architecture of
the Petri Net Kernel consists of five parts corresponding to the different steps encountered during
the development of a Petri Net tool:
• The net interface provides methods to access and to modify a net. It is also used to synchro-
nize the activity of different applications accessing the same net.
• The net dialog interface provides means to visualize information within a net and to interact
with the end user. This interface is used by applications that require an input by the user.
• The net type interface, already mentioned above, states how to define labels and firing rule
of a net type. It is used to check the validity of a net type.
• The application interface is a template for the definition of a new application based on the
PNK project. The related InOut interface defines the minimal functionality an I/O-module
has to offer.
New tools and their relation to existing net types can be defined in an XML file, the tool
specification. This file contains an entry for each application developed using the infrastructure of the
Kernel. Each of these application entries has a sub-entry for each net type the application supports.
For instance, a stochastic simulator only supports Stochastic Petri Nets and therefore its entry in
the tool specification has only one sub-entry for Stochastic Petri Nets.
These interfaces will be discussed in more detail in the remainder of this section. We do not aim
at giving a complete review of all methods and classes, but rather a general idea of the available
functions and how they were used in this dissertation.
The Net Interface
This interface consists of a collection of Java classes that represent a Petri Net together with its
extensions. The basic classes are Net, Place, Transition, Arc, and Extension.
Each class provides methods for accessing and modifying the corresponding net and its elements.
As an example, for an instance "net" of class Net, the method call net.getPlaces() returns a list
of all its places. Likewise, there are methods for returning the in-going and out-going arcs of a
transition or of a place. There are also methods to access the label of an element.
For all these methods, the PNK takes care of keeping the net consistent. This means that
the Kernel synchronizes the state of the net between the different applications accessing it. The
Java class ApplicationControl, which is described at the end of this section, keeps track of changes
to the net made by different tools and guarantees the consistency of the net in all situations.
Definition of Petri Net types
As outlined before, a net type in PNK is defined by a collection of classes for nodes, transitions
and extensions. The dependencies of these classes are given in an XML file. The net type interface
checks a net type definition for its validity. In addition, the declaration of a new Petri Net type also
requires the definition of its firing rule, i.e. how its transitions are executed.
While extending the Kernel to support Stochastic Petri Nets, classes representing
stochastic transitions were implemented. Each such class has a rate, a label that is derived from the
super-class Extension. This class provides a set of basic methods that need to be provided by all
extensions, such as a method to convert the internal value of the class into a string. A concrete
instance of a class stores all of its extensions in a hash. This hash maps the name of each
extension to the corresponding extension object.
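The extension mechanism can be illustrated with a small sketch. The classes Extension and ExtendableElement below are simplified stand-ins invented for this example; they mirror the idea of a name-keyed hash of extensions, not PNK's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for an extension: every extension must at least be convertible
// to a string, as described above.
class Extension {
    private final String value;
    Extension(String value) { this.value = value; }
    @Override public String toString() { return value; }
}

// Stand-in for a net element that stores its extensions in a hash,
// keyed by the extension's name.
public class ExtendableElement {
    private final Map<String, Extension> extensions = new HashMap<>();

    void setExtension(String name, Extension ext) { extensions.put(name, ext); }
    Extension getExtension(String name) { return extensions.get(name); }

    public static void main(String[] args) {
        ExtendableElement place = new ExtendableElement();
        place.setExtension("marking", new Extension("3"));
        place.setExtension("initialMarking", new Extension("3"));
        System.out.println(place.getExtension("marking"));
    }
}
```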
The net type itself is specified in an XML file. The file enumerates all possible labels for each kind
of Petri Net element. The following listing shows an example of a Petri Net type specification. The
listing is simplified for the sake of clarity.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE netTypeSpecification SYSTEM "netTypeSpecification.dtd">
<netTypeSpecification name="StochasticNet">
  <extendable class=".kernel.Net">
    <extension name="firingRule" class="StochasticNetRule"/>
  </extendable>
  <extendable class="Place">
    <extension name="marking" class="NaturalNumber"/>
    <extension name="initialMarking" class="NaturalNumber"/>
  </extendable>
  <extendable class="Arc">
    <extension name="inscription" class="NaturalNumber1"/>
  </extendable>
  <extendable class="Transition">
    <extension name="rate" class="DoubleValue"/>
  </extendable>
</netTypeSpecification>
The file contains an XML-specific header and mentions the name of the Petri Net type. It then
introduces the two types of nodes in the SPN together with their extensions. The net itself has the
extension firingRule, a place has an initialMarking and a current marking. The distinction between
the initial marking and the current marking is useful to track the time evolution of the number of tokens
in a place. Each stochastic transition has a rate, which is the parameter of an exponential
distribution, and each arc in the net has an inscription.
It is important to distinguish between the net interface and the net type interface. The net interface
as described in the last section offers the basic classes needed for the implementation of a new
Petri Net such as places, transitions etc. In contrast to this, the net type interface is used to check
the validity of a new net type. One requirement is that each component of the net as defined in the
type specification file is also represented by a Java class.
The Dialog Interface
This interface is needed by applications that require an action by the user or want to present the
results of some computation to the user. It offers functions to display textual information in the
net, to highlight a set of nodes or to request a decision from the user. We used it in the Simulator
application which plays the token game by moving the tokens through the net. While performing
the simulation, enabled transitions are highlighted and the flow of tokens is illustrated by changing
the node labels. In a net type that does not define how conflicts between simultaneously enabled
transitions are resolved, the user can be asked to decide which transition to execute by clicking on
the corresponding node.
By default, if an application requests a dialog, this request will be passed to the editor which then
displays a dialog to the user, colours the nodes, etc. But it is also possible to implement one’s own
dialogs.
Creating new Applications with PNK
This section briefly summarizes how a new Petri Net application can be created by using the in-
frastructure of PNK. A new application is a small module in a Petri Net tool that implements some
functionality. As an example, the Editor is an application, and so is the stochastic simulator.
Technically, an application is a class derived from the PNK class MetaApplication or
MetaBigApplication. In the simplest case, an application module implements only a single method
run(). The current net can be accessed via the method getNet(). In a more complex
application, there can be arbitrarily many methods. In that case, however, the application must implement
a method getMenus() which provides the PNK with the necessary information about the available
methods, so that the PNK can offer corresponding menus to the end user and start a method
on user request.
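A minimal application module in this style might look as follows. The MetaApplication base class shown here is a stand-in written for this sketch; in PNK the real class is imported from the Kernel, and getNet() returns the current net object rather than a placeholder string.

```java
// Stand-in for the Kernel's application base class: an application has a
// run() method and can access the current net via getNet().
abstract class MetaApplication {
    abstract void run();
    Object getNet() { return "current net"; } // placeholder for the real net object
}

// In the simplest case, an application module implements only run().
public class HelloApplication extends MetaApplication {
    @Override
    void run() {
        System.out.println("running on: " + getNet());
    }

    public static void main(String[] args) {
        new HelloApplication().run();
    }
}
```

A more complex application would additionally implement getMenus() so that the Kernel can build menus for its methods.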
There are also two more special interfaces, NetObserver and ApplicationNetDialog. The
NetObserver interface allows an application to keep track of changes that are made to the net. The
ApplicationNetDialog interface can be used to display graphical information screens or questions
to the user. The editor is an example of an application that implements both interfaces.
Input/Output-modules are classes derived from the class InOut and are similar to the
aforementioned applications. They have to implement two methods, load() and save(), that load or save
a net to or from a given URL. Support for PNML is implemented in the class PNMLInOut.
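A skeleton of such an I/O module could look as follows. The InOut base class here is a stand-in written for this sketch (the real class is provided by the Kernel and works with URLs rather than plain strings), and DummyInOut is an invented example, not one of the classes mentioned above.

```java
// Stand-in for the Kernel's I/O base class: every module must be able to
// load a net from, and save a net to, a given location.
abstract class InOut {
    abstract Object load(String url);
    abstract void save(Object net, String url);
}

public class DummyInOut extends InOut {
    @Override
    Object load(String url) {
        // A real module would parse PNML, SBML or CMDL from the URL here.
        return "net loaded from " + url;
    }

    @Override
    void save(Object net, String url) {
        // A real module would serialize the net to the URL here.
        System.out.println("saving " + net + " to " + url);
    }

    public static void main(String[] args) {
        DummyInOut io = new DummyInOut();
        System.out.println(io.load("file:example.pnml"));
    }
}
```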
Definition of new Tools
Again we have to clarify an important point. In the framework of the Petri Net Kernel, an appli-
cation is a small module that is part of a larger program or tool. A tool is a collection of at least
one new net type and at least one application. The relationships between net types and applications
are defined in an XML file, the tool type definition. This file contains an entry for each application.
Each of these entries has a sub-entry for the net types the application supports. The following is part of the
tool definition for the extended Petri Net Kernel that was developed in this dissertation.
<nettype id="n10" typeSpecification="file:netTypeSpecifications/StochasticNet.xml"/>
<nettype id="n11" typeSpecification="file:netTypeSpecifications/GSPN.xml"/>
<application id="a1" mainClass="de.huberlin.informatik.pnk.app.StochasticSimulator">
  <allowedNettypes>
    <ntref ref="n10"/>
    <ntref ref="n11"/>
  </allowedNettypes>
</application>
The file specifies references to the definitions of the Stochastic Net and the Generalized Stochastic
Net. The next lines refer to the Stochastic Simulator (token game) which is available for Stochastic
Nets and Generalized Stochastic Nets. There are also additional attributes that can be defined for
an application, such as the maximum number of instances and a default net type.
The Application Control
The Java class ApplicationControl is the mediator between the different applications in a
Petri Net tool. When the Kernel is started, this class loads all applications and net types as given by
the tool definition. The application control also coordinates the interaction between the different
applications and starts an application on request by the user.
4.1.2 The Extended Kernel (PNK 2e)
This section summarizes the extensions implemented for the Petri Net Kernel. The Kernel was
developed to provide an infrastructure for new Petri Net tools and has a modular composition. It
Figure 4.1: Overview of the extended Petri Net Kernel. The interface to the INA software was
developed at the Humboldt University of Berlin but is also available in the extended version. The
simulation interface to the Systems Biology Workbench was developed in this work. For a description
see below.
was therefore very suitable for this project.
Developing new Petri Net software from scratch takes a considerable amount of time. The idea
behind this project was to extend an existing software tool and spend more time on the application
of the software. It turned out that this approach also has some shortcomings. The Petri Net Kernel
is a mature software tool. Nevertheless, it still contains some bugs that appeared during this work.
Despite its modularity, more changes had to be made than expected and these changes took more
time than anticipated.
Nonetheless the Kernel turned out to be useful and many helpful ideas and support were received
from its developers. It is available online1 and has been registered in a database of Systems Biology
software2. The remainder of this section will be used to present the changes that were made to the
Petri Net Kernel.
Print / Export
A function was implemented that allows the user of PNK to print a Petri Net or to export it to a
Postscript file. This function relies on the methods provided by the Java API 1.4.2. The plots of
Petri Nets in this report were created with this function.
1www.inf.fu-berlin.de/~trieglaf/PNK2e
2www.sbml.org
Stochastic Petri Nets / Generalized Petri Nets
Classes for a Stochastic Petri Net, such as delayed transitions and a stochastic firing rule, were
implemented. In addition, Java classes that define the graphical representation of immediate tran-
sitions and inhibitor arcs in the editor were also developed.
Generalized Stochastic Petri Nets (GSPNs) are an extension of Stochastic Petri Nets. In addition
to delayed transitions, a GSPN also comprises immediate transitions and inhibitor arcs.
• An immediate transition fires, if enabled, before all delayed transitions. If several immediate
transitions are enabled at the same time, the transition to execute is determined at random.
• An inhibitor arc is an arc that disables a transition if its input place(s) contain(s) at least one
token. Inhibitor arcs may have multiplicities in the same way as ordinary arcs.
Immediate transitions can be used to model decisions that are assumed to take no time. Inhibitor
arcs can be used to represent control actions that are necessary to ensure the correct behaviour of
the model. They can also result in a smaller state space of the underlying Markov process since they
can be used to exclude certain states.
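The enabling test for a GSPN transition with inhibitor arcs can be sketched as follows; the class and method names are invented for this illustration and are not PNK 2e's actual code.

```java
import java.util.Map;

// Sketch of the enabling test for a GSPN transition: every ordinary input
// arc requires at least its multiplicity in tokens, and every inhibitor arc
// disables the transition once its place holds at least the arc's multiplicity.
public class GspnEnabling {

    static boolean isEnabled(Map<String, Integer> marking,
                             Map<String, Integer> inputArcs,
                             Map<String, Integer> inhibitorArcs) {
        for (Map.Entry<String, Integer> arc : inputArcs.entrySet()) {
            if (marking.getOrDefault(arc.getKey(), 0) < arc.getValue()) return false;
        }
        for (Map.Entry<String, Integer> arc : inhibitorArcs.entrySet()) {
            if (marking.getOrDefault(arc.getKey(), 0) >= arc.getValue()) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> marking = Map.of("X", 2, "I", 0);
        // Enabled: input place X has enough tokens, inhibitor place I is empty.
        System.out.println(isEnabled(marking, Map.of("X", 1), Map.of("I", 1)));
    }
}
```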
We also implemented support for GSPNs since they seemed to be a useful extension of the
standard SPN theory. But this formalism does not seem to be necessary for biological applications,
at least not in the examples used. Furthermore, their simulation is more difficult and the
correspondence to a Markov process is less obvious. For these reasons this approach was not further
pursued.
Token Game for Stochastic Nets
If we start from an initial marking and execute enabled transitions according to the firing rule, we
call this simulation the token game. The token game that was implemented is essentially
nothing other than a stochastic simulation of the Markov process, but at a much lower speed. Every time a
transition is executed, it is highlighted for some milliseconds. The movement of the tokens through
the net is visualized.
The dialog and observer interfaces of the Petri Net Kernel allow implementing a token game simu-
lator easily. It is useful to get an idea of the general behaviour of the net. But for larger networks, a
token game simulation takes a lot of time and one should resort to a standard stochastic simulation.
SBML and CMDL support
SBML and CMDL are data exchange formats that are used in the Systems Biology Workbench.
Before a Petri Net can be simulated in the workbench, it needs to be translated into one of these
formats. We will give details of both languages in the next section.
As outlined above, new In- and Output classes can be derived from the InOut class that is provided
by the Kernel. We developed two new classes, SBMLInOut, which translates a Petri Net into SBML,
and CMDLInOut, which translates it into CMDL.
In addition, the extended Kernel can import Stochastic Petri Nets from SBML. This is particularly
useful since many published models of chemical reactions are available in this data format. There
are also efforts to provide a collection of annotated SBML models online3. These models can now
be imported into the Kernel. The problem is that SBML does not contain any layout information. If
a Petri Net representation is created from an SBML file, PNK 2e creates a single cluster of all nodes
in the net. The Kernel contains a simple layout algorithm that was implemented by the developers
at the Humboldt University. This algorithm is capable of rearranging the nodes and tries to create
a clearer layout of the graph. In our experience, this works only for simple and small Petri Nets.
The algorithmic drawing of complex graphs is a topic of current research.
Interface to Systems Biology Workbench
The Systems Biology Workbench (SBW) includes an interface that allows other applications to
use its infrastructure. An application only needs to use the API offered by the SBW to contact the
so-called broker, through which it can access any of the available services.
A new application was developed that derives from the class MetaApplication, translates the
Petri Net into SBML or CMDL and calls the simulation service of the SBW. This service can then
be used to simulate the net either stochastically or deterministically. Details of this simulation
service will be given in section 4.3.
Simplified Petri Nets for Biological Reactions
After the first few experiments with the extended Petri Net Kernel, it turned out that Petri Nets
representing networks of chemical reactions can become very large and complex. For this reason, two
simplifications for reactions that occur very frequently in biological models were introduced.
3 Biomodels database: http://www.biomodels.net/ (developed by the European Bioinformatics Institute, UK)
The first simplification comprises a new Petri Net transition. It represents a reversible chemical
reaction X ←→ Y. It is defined in the Java class ReversibleTransition and has two associated rates
instead of one. Before this simplified model can be simulated, the reversible transition has to be
decomposed into its forward and backward reactions. The first rate is assigned to the forward and
the second to the backward reaction.
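To make the decomposition concrete, here is a minimal sketch of the expansion step. The class and method names are ours and not part of the actual PNK 2e source: a reversible transition X ←→ Y with rates (k1, k2) expands into two irreversible reactions, whose mass-action propensities depend on the current token counts.

```java
// Hypothetical sketch of the expansion step; not the real ReversibleTransition class.
public class ReversibleExpansion {

    // Propensities of the forward (X -> Y) and backward (Y -> X) reactions
    // for rate constants kForward, kBackward and token counts x, y.
    public static double[] propensities(double kForward, double kBackward, long x, long y) {
        return new double[] { kForward * x, kBackward * y };
    }

    public static void main(String[] args) {
        double[] a = propensities(0.5, 0.1, 100, 20);
        System.out.println("forward: " + a[0] + ", backward: " + a[1]); // forward: 50.0, backward: 2.0
    }
}
```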
The second simplification represents the reaction X → Y which is catalysed by an enzyme
(indicated by the place with the inscription "Enzyme"). The fact that the enzyme catalyses this
reaction is indicated by the black box at the end of the arc that connects the enzyme place with
the transition. This arc is implemented in the class EnzymeArc and has three associated rates
r1, r2 and r3. In order to simulate this reaction stochastically, the Kernel decomposes the
reaction into three separate steps:
X + Enzyme  --r1-->  Cm          (4.1)
Cm  --r2-->  X + Enzyme          (4.2)
Cm  --r3-->  Y + Enzyme          (4.3)
This set of reactions represents the synthesis of an intermediate complex Cm (4.1), the dissociation
of this complex back into the reactant X and the enzyme (4.2), and the production of the product Y
(4.3). Each of the rates in class EnzymeArc is associated with one of these reaction steps. More
complex reactions with several reactants and/or products are also possible.
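As an illustration of this expansion (the class and method names below are ours, not the Kernel's), the three steps 4.1-4.3 and their rate assignments can be sketched as follows:

```java
// Hypothetical sketch of expanding an EnzymeArc into the steps 4.1-4.3.
public class EnzymeArcExpansion {

    // Expand an enzyme-catalysed reaction X -> Y into three elementary
    // steps, each paired with one of the rates r1, r2, r3.
    public static String[] expand(String x, String y, String enzyme,
                                  double r1, double r2, double r3) {
        String cm = "Cm"; // the intermediate complex
        return new String[] {
            x + " + " + enzyme + " -> " + cm + " @ " + r1, // synthesis of the complex (4.1)
            cm + " -> " + x + " + " + enzyme + " @ " + r2, // dissociation back to X (4.2)
            cm + " -> " + y + " + " + enzyme + " @ " + r3  // production of Y (4.3)
        };
    }

    public static void main(String[] args) {
        for (String step : expand("X", "Y", "Enzyme", 1.0, 0.5, 0.2)) {
            System.out.println(step);
        }
    }
}
```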
4.2 The Systems Biology Workbench
This section gives an outline of the Systems Biology Workbench, the framework used to perform
the simulations in this dissertation. The ERATO Systems Biology Workbench (SBW) project
was originally funded by a grant from the Japan Science and Technology Corporation (Hucka
et al. 2002). The aim of this project was to build a software infrastructure that allows simulation
and analysis programs for Systems Biology to share resources. Currently, financial
support comes from the U.S. Department of Energy. The principal investigators of the project are
situated at the Keck Graduate Institute in Claremont, the California Institute of Technology and
the Institute for Systems Biology in Seattle.
The Systems Biology Workbench was used to simulate the time evolution of Stochastic Petri Nets.
As outlined in chapter 2, an SPN maps directly onto the Gillespie or Gibson-Bruck algorithm and
can be executed by them, since all of these stochastic models are based on a Markov process. The
next sections describe the architecture of the SBW, give an overview of its API and describe its
simulation service in more detail.
Overview
The current version of the SBW is 2.3.1. Due to large scale architectural changes, this version is
only available for the Windows operating system. All parts are available as open-source under the
GNU LGPL.
Software engineers often duplicate each other's effort when implementing different packages.
Many research groups write their own programs that fulfil their very specific needs and reflect
the particular expertise and preferences of the group. This results in many small software projects,
each having niche strengths which are different from, but complementary to, the strengths of other
tools. On the other hand, since certain basic functions are needed by all programs (data
input/output, visualisation of results, etc.), developers often have to re-implement general
functionality in their tools.
There is currently no single piece of software that can answer all the questions of the Systems
Biology community. Many researchers use a variety of tools at the same time to look at their
problems from different perspectives. Software that eases the exchange of information between
these tools will therefore become even more important.
The Systems Biology Workbench tries to address this issue by offering a common infrastructure
for software tools within the field of Systems or Theoretical Biology. The approach is very similar
to that of the Petri Net Kernel, which comprises an infrastructure for Petri Net tools in order to
avoid a waste of resources. We decided to combine these two tools, which are so similar in their
approach.
The SBW Broker
Figure 4.2: The broker architecture of the SBW: modules written in Python, Java and C++
communicate with the SBW Broker through the corresponding SBW language interfaces. Grey areas
indicate SBW components. Adapted from (Hucka et al. 2002).
The SBW distribution consists of the SBW Broker and several small modules that illustrate the
use of the workbench. Among these modules are Jarnac, a deterministic simulator, and a
bifurcation analysis tool. Furthermore, there are many programs that have been developed
independently of the SBW but support its communication protocols. They are SBW-enabled and
can communicate through the SBW. Among these programs is Dizzy, a stochastic simulator that
implements the Gillespie algorithm. The centrepiece of the SBW is the SBW Broker. It is a
background program that is started automatically when needed by a module of the SBW. The
Broker maintains a list of all registered modules. A program needs to be registered with the
Broker only once and can then be started on demand.
Architecture of the SBW
The communication of the different components in the SBW is realized by a message-passing
architecture. Messages are exchanged as structured data bundles. All interactions are defined
on a very high level, and the whole framework is independent of any programming language. The
Broker itself is written in C++, but the modules can be written in any language as long as they
can send, retrieve and process messages according to the conventions given by the SBW.
A module can implement one or more services. Services are interfaces to the functions of the
module. These interfaces are made visible to other modules and can be executed through the
Broker. Each service belongs to a hierarchically organized category. This organization allows
other applications to list all services that belong to a certain category without knowing the
names of the individual services. As an example, the SBW module that we developed for the Petri
Net Kernel simply lists all simulation services that are known to the Broker, and the user can
decide which service to execute.
The following listing is taken from PNK 2e and illustrates the use of the Java API provided by
the Systems Biology Workbench:

SBW.connect(); // open connection to SBW broker

selection = (ServiceDescriptor) list.getSelectedValue();

Service service = selection.getServiceInModuleInstance();
Analysis analysis = (Analysis) service.getServiceObject(Analysis.class);

analysis.doAnalysis(sbml); // variable sbml stores the SBML representation
The first statement opens the connection to the SBW Broker. Then the service chosen by the user
is retrieved; the description of this service is encapsulated in the class ServiceDescriptor, and
an instance of the service is requested from the SBW.
The simulation service is a subclass of the more general Analysis category. We retrieve an
instance of this category and perform the simulation with a call to the method doAnalysis. This
source code is very general, since our intention was to offer the user a choice not only among
services of the very specific Simulation category but also among those of the larger Analysis
category, which also contains deterministic simulators and export modules.
Visualisation and Analysis of Results
The Systems Biology Workbench contains some modules that can perform a rudimentary analysis
of the results. The graphs in the experimental section of this work were produced by exporting
the simulation results to Matlab, which is possible using the Matlab export module of the SBW.
The Systems Biology Markup Language
The Systems Biology Markup Language (SBML) (Finney & Hucka 2003) complements the Systems
Biology Workbench described in the last section. Most of the modules contained in the SBW
require the model to be described in this language. SBML is a standard description language for
the representation of models of biological networks. Mathematical models described in SBML for
a variety of systems are available at http://www.sbml.org. This webpage also contains an SBML
test suite that allows users to verify their own models and eases the development of new
applications. Packages such as libSBML are available and can be used to parse SBML files and to
map them onto an internal representation. Although SBML (currently implemented in XML) aims at
machine readability, it can also be read by eye.
Currently SBML exists in two versions, level 1 and level 2. Level 2 is the newer release, but
level 1 is still widely supported. Conversion between the two levels is possible, but level 2
contains some improvements that cannot be translated into level 1. Most of the available tools,
including the Systems Biology Workbench, only support level 1. We will therefore restrict this
overview to the features available in level 1. An SBML model consists of the following entities:
Unit Definition determines the unit of the participating species. For stochastic simulations, this
has to be the number of molecules. A deterministic model requires concentrations, which can
be calculated from the volume of the container and the number of molecules.
Compartment is a finite container in which the reactions take place. An example could be a
biological cell or a cellular compartment such as the nucleus. This definition also contains
the volume of the container.
Species are the types of molecules involved in this model. An attribute of each species is its initial
amount given in a unit as defined above together with the compartment in which it is located.
Reaction is a statement describing some transformation, transport or binding process that can
change the amount of one or more species. Each reaction has an associated rate law that
describes its kinetics.
Parameters are variables in the model. They can relate to a single reaction only (such as rate
constants) but can also act on a global level.
Rules are a general concept that can be used to express relations between the amounts of species,
constraints on the rate law of a reaction, etc.
The SBML language definition is available online at www.sbml.org and gives further details about
the different entities in an SBML document, along with some examples. Here we will only show a
small example of a reaction and how it can be represented in SBML. Consider the reaction

X + 2Y  --k1-->  Z

which is a hypothetical synthesis of a molecule Z from the reactants X and Y. The reaction
occurs at a rate constant k1. A representation of this reaction in SBML level 1 would look like
this:
....
<listOfReactions>
  <reaction name="reaction_1" reversible="false">
    <listOfReactants>
      <speciesReference species="X" stoichiometry="1"/>
      <speciesReference species="Y" stoichiometry="2"/>
    </listOfReactants>
    <listOfProducts>
      <speciesReference species="Z" stoichiometry="1"/>
    </listOfProducts>
    <kineticLaw formula="k1 * X * Y">
      <listOfParameters>
        <parameter name="k1" value="5"/>
      </listOfParameters>
    </kineticLaw>
  </reaction>
</listOfReactions>
...
The XML header and the declarations of units and compartments were omitted for clarity. SBML
is clearly useful, but so far not all tools support all features of the language. For instance,
the stochastic simulator Dizzy does not recognize the attribute reversible of a reaction. As a
result, a reversible reaction has to be represented by two separate reactions, one for the forward
and one for the backward direction.
There is also no consensus on the representation of enzymes or inhibitors, but this issue is
expected to be resolved in future releases of SBML.
Mapping of the Petri Net to SBML
With the information given in the last section, it is relatively easy to see how a Petri Net can
be stored in SBML: places are stored as species and transitions as reactions. The Java class
SBMLInOut implements this mapping. This class uses the aforementioned library libSBML provided
by the developers of SBML. This library simplifies the export to SBML since the programmer does
not have to write the XML file directly but instead creates a set of Java objects that represent
the document. The library handles all read and write accesses on the level of the XML file and
is very fast.
During this dissertation, the same approach of mapping Petri Nets to SBML was published elsewhere
(Shaw, Koelmans, Steggles & Wipat 2004). However, we developed our own implementation and did
not use any other software besides the libSBML library.
There is only one issue that needs to be clarified, and that is the question of how to map the
tokens in the Petri Net to the units in the SBML model. In general, a stochastic simulation
requires the units to be given in terms of numbers of molecules, but there are several approaches
in the literature. The most common one is to think of one token as one molecule, but there are
also studies in which one token represents a fixed number of molecules (Srivastava et al. 2001,
Arkin et al. 1998).
In this implementation and all subsequent experiments, it is assumed that one token represents one
molecule. If the user wants to obtain deterministic information about the model, the concentration
is computed from the numbers of tokens and the volume of the compartment.
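The conversion just described can be sketched in a few lines (a minimal illustration of the one-token-one-molecule convention; the class name is ours, not part of PNK 2e):

```java
// Sketch of the token-to-concentration conversion assumed above.
public class TokenUnits {

    public static final double AVOGADRO = 6.02214076e23; // molecules per mole

    // Concentration in mol/L for a marking of `tokens` molecules
    // in a compartment of the given volume (in litres).
    public static double concentration(long tokens, double volumeLitres) {
        return tokens / (AVOGADRO * volumeLitres);
    }

    public static void main(String[] args) {
        // 1000 tokens in a bacterium-sized compartment of 1e-15 L
        // gives a concentration of roughly 1.66e-6 mol/L.
        System.out.println(concentration(1000, 1e-15));
    }
}
```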
4.3 Stochastic Simulations with Dizzy
The software Dizzy (Ramsey et al. 2005) is developed at the Institute for Systems Biology in
Seattle. It implements the Gillespie algorithm and its improvement by Gibson and Bruck. Dizzy
also contains a tool for the integration of differential equations and is capable of importing
models from SBML. The user can plot the results or write them to a file for further processing
by other tools. The program is written in Java and published as Open Source software.
Dizzy was used to perform the simulations in the next chapter. There are several tools with
similar functionality that are supported by the SBW. It was decided to use Dizzy since it has a
very convenient graphical interface. In addition, it also offers a so-called programmatic
interface. This interface can be used by Java programs to use the algorithms implemented by Dizzy
directly, without the SBW, simply by importing the Dizzy libraries. This was useful since the
first release of the SBW was not very stable and the connection to the Broker was interrupted
frequently, which also interrupted the simulation. We reported these problems to the developers
of the workbench, but it took them some time to correct this bug. In the meantime, Dizzy could be
used directly through its programmatic interface. Later it was decided to switch back to the
SBW-mediated communication: a communication of the Petri Net Kernel and Dizzy through the
workbench is more flexible and allows other tools to be included in the analysis as well.
Dizzy is also capable of reading CMDL, which is a simplified version of SBML suitable only for
the description of small models. An export module for the Petri Net Kernel was written since
CMDL is easier to read and to debug if the model is small. But SBML is much more commonly used,
and therefore no examples of CMDL will be given.
4.4 Conclusions
In this chapter, the technical methodology used in this project has been described. The Petri Net
Kernel and its extensions made during this dissertation were detailed, and a survey of the Systems
Biology Workbench was given. Both projects are similar in their approach and have proven to be
useful during this dissertation. There seems to be a clear trend in academic software engineering
to develop frameworks that integrate different tools and avoid duplicated effort.
It was decided to extend existing tools instead of writing new software in order to have more time
for the experimental part of this project. This approach was successful. On the other hand, it
turned out that one has to spend considerable time to understand programs written by other people.
This is particularly important if one wants to make changes to the software or to implement new
features. Furthermore, if the software contains bugs, they are often very difficult to locate, and
one is usually dependent on the help of the developers of the software. This is not the case if
one uses only self-written software.
During this dissertation, considerable help was received from the developers of the Systems
Biology Workbench, especially Frank Bergmann at the Keck Graduate Institute in Claremont, USA,
and Stephen Ramsey, who is developing the simulation software Dizzy. Both responded quickly
to any questions and were eager to extend their software and to provide additional documentation.
Reported bugs were often corrected within a few days. The friendly and open attitude that is
characteristic of the Open Source community greatly facilitated the work on this dissertation.
The extension to the Petri Net Kernel that was developed during this dissertation is available
online at www.inf.fu-berlin.de/˜trieglaf/PNK2e/. This web page also provides detailed
documentation of the software generated with the javadoc tool. Appendix A contains a user guide
to PNK 2e which includes screenshots and a short introduction. This document is also available
online at the aforementioned web page.
Chapter 5
Experiments
This chapter describes the experiments conducted with the extended Petri Net Kernel. We start
with a simulation of the Lotka-Volterra reactions. This well-known model has been extensively
used as an example of a system whose behaviour cannot be accurately computed with deterministic
simulations (Gillespie 1976). In this dissertation, it serves as a first example of an application
of the Petri Net Kernel. It highlights the usefulness of Stochastic Petri Nets for the modelling
of biological reactions and, since the simulation results have already been published, is used to
validate our approach.
The second section deals with a more comprehensive example, a stochastic simulation of a
regulatory gene network that exhibits circadian rhythms. It is known that many organisms exhibit
a circadian rhythm on the level of genes and proteins that follows the daily cycle of day and
night. However, the details of the molecular mechanisms upon which these rhythms rely are not yet
understood. Two competing models and their dynamic behaviour are compared. These models differ
heavily in their architecture and the assumptions they make. Consequently, they also exhibit very
different behaviour when simulated stochastically under changing conditions. This chapter
concludes with some experiments dealing with the synchronisation of several biomolecular clocks.
5.1 The Lotka-Volterra Reactions
The so-called Lotka-Volterra reactions are described by a set of three coupled, autocatalytic
reactions:

Y  --c1-->  2Y           (5.1)
Y + X  --c2-->  2X       (5.2)
X  --c3-->  ∅            (5.3)
This model can also be seen as a simple description of a predator-prey ecosystem. In this case,
Y represents the prey which reproduces itself in reaction 5.1. Reaction 5.2 describes how X , the
predator species, reproduces by feeding on the prey. Reaction 5.3 models the demise of X through
natural causes.
These reactions have been extensively modelled using different deterministic and stochastic
approaches. The corresponding reaction-rate equations are given by

dY/dt = c1 Y - c2 X Y        (5.4)
dX/dt = c2 X Y - c3 X        (5.5)

The nontrivial steady state of this system is given by dY/dt = dX/dt = 0, and one can show that
it is characterized by Y = c3/c2 and X = c1/c2.
For X = Y = 1000, c1 = c3 = 10 and c2 = 0.01, a deterministic approach predicts that this
situation will persist indefinitely (Gillespie 1976). Figure 5.1(b) shows the result of a
numerical integration of equations 5.4 and 5.5 with these initial conditions: the number of
molecules indeed remains constant over time. But this is not what one would expect from a good
model of a predator-prey ecosystem.
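The deterministic prediction is easy to reproduce with a few lines of code. The sketch below is a minimal Euler integrator of our own, not the solver used for figure 5.1(b); it integrates equations 5.4 and 5.5 from the steady state, where both derivatives vanish, so the populations do not move:

```java
public class LotkaVolterraOde {

    // Integrate dY/dt = c1*Y - c2*X*Y and dX/dt = c2*X*Y - c3*X with a
    // simple Euler scheme; returns {Y, X} after `steps` steps of size dt.
    public static double[] integrate(double y0, double x0, double dt, int steps) {
        double c1 = 10, c2 = 0.01, c3 = 10;
        double y = y0, x = x0;
        for (int i = 0; i < steps; i++) {
            double dy = c1 * y - c2 * x * y; // eq. 5.4
            double dx = c2 * x * y - c3 * x; // eq. 5.5
            y += dt * dy;
            x += dt * dx;
        }
        return new double[] { y, x };
    }

    public static void main(String[] args) {
        // Start exactly at the steady state Y = c3/c2 = 1000, X = c1/c2 = 1000.
        double[] r = integrate(1000, 1000, 1e-4, 100000);
        System.out.println("Y = " + r[0] + ", X = " + r[1]); // both remain at (essentially) 1000
    }
}
```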
It was also shown that this system exhibits an oscillatory behaviour if modelled stochastically.
The system was simulated by mapping the Stochastic Petri Net to its SBML representation using
PNK 2e.

Figure 5.1: (a) the SPN representation of the Lotka-Volterra reactions; (b) the deterministic
solution of the reaction-rate equations with X = Y = 1000, c1 = c3 = 10 and c2 = 0.01, obtained
by numerical integration of the differential equations 5.4 and 5.5.

This SBML representation was used to simulate the dynamic behaviour of the model with
the Gillespie algorithm. Figure 5.2(a) shows the result of this simulation with the expected
oscillations. We conducted our experiments with initial conditions that match the steady state.
Further experiments have shown that the behaviour of the system is heavily influenced by its
initial conditions: the system does not always reach one of the steady states. Nonetheless, the
stochastic and deterministic formulations come to different results in many cases (data not
shown). We decided to present simulations with the same parameters as Gillespie (1976) because
this example is very illustrative.
5.1.1 Results and Discussion
It is not difficult to see why the Lotka-Volterra model exhibits an oscillatory behaviour. If one
examines figure 5.2(a) carefully, one can see that each rise in the prey population is followed
by an increase of the predator population and a subsequent decrease in the prey population. If the
prey population increases, the amount of food available to the predators rises as well. This leads
to an increase of the predator population, followed by a decrease in the prey population. The
resultant food shortage for the predators leads to a decline in their population, which permits
the prey population to increase again, and so on.
Figure 5.2: Stochastic simulation of the Lotka-Volterra reactions with the Gillespie algorithm:
(a) the time course over 30 time steps; (b) the number of X molecules plotted against the number
of Y molecules. The number of X and Y molecules oscillates between 150 and 2600.
Gillespie (1976) gives a more formal analysis of the simulation results. If one solves the
differential equations 5.4 and 5.5 for an arbitrary initial state (X0, Y0), the solution is an
orbit in the (X, Y) plane that passes through (X0, Y0). One can show that this solution is only
neutrally stable in a mathematical sense: if the system is perturbed by some random fluctuation,
it is driven out of this orbit and ends up on a new solution orbit passing through a state
(X0', Y0'). Figure 5.2(b) illustrates this behaviour. It shows a plot of the number of X
(predator) versus the number of Y (prey) molecules. The system passes through several neutrally
stable, concentric solution orbits. Fluctuations can drive it either outward or inward onto a new
orbit. Sooner or later, a random fluctuation will drive the system onto one of the two axes in
figure 5.2(b); that is, either the prey or the predator population will die out. If the prey dies
out first, the predator population will die out as well. If the predator population dies out
first, the prey population will increase indefinitely.
The Lotka-Volterra reactions are of course a very simple example. They were used by Gillespie
(1976) to underline that there are (bio-)chemical systems whose behaviour cannot be reliably
predicted using deterministic approaches. The reason is that a deterministic formulation does not
take into account the random fluctuations occurring at the microscopic level. In contrast, the
stochastic simulation is a much more natural framework and gives results that are closer to
intuition. In this work, it was decided to use this example again to show that the Petri Net
representation is able to represent chemical reactions in a natural way. The SPN representation
of the reactions is given in figure 5.1(a) and is easy to understand. Furthermore, if the
transition rates in the Stochastic Petri Net are chosen according to mass-action rate kinetics
(Cox & Nelson 2004), the net can be efficiently simulated using the Gillespie algorithm, as done
in our experiments.
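For reference, the essence of such a simulation can be sketched in a few lines. This is a minimal direct-method implementation of our own, with the mass-action propensities of the three transitions, and not the SBW simulator used for the experiments:

```java
import java.util.Random;

public class GillespieLotkaVolterra {

    // One trajectory of the Lotka-Volterra net with Gillespie's direct method;
    // the propensities correspond to the transition rates of the SPN.
    public static long[] simulate(long prey, long predator, double tEnd, long seed) {
        double c1 = 10, c2 = 0.01, c3 = 10;
        long y = prey, x = predator;
        double t = 0;
        Random rng = new Random(seed);
        while (t < tEnd && (y > 0 || x > 0)) {
            double a1 = c1 * y;      // Y -> 2Y     (reproduction of prey)
            double a2 = c2 * y * x;  // Y + X -> 2X (consumption of prey)
            double a3 = c3 * x;      // X -> 0      (predator death)
            double a0 = a1 + a2 + a3;
            if (a0 == 0) break;
            t += -Math.log(1.0 - rng.nextDouble()) / a0; // exponential waiting time
            double r = rng.nextDouble() * a0;            // pick the next reaction
            if (r < a1) {
                y++;
            } else if (r < a1 + a2) {
                y--; x++;
            } else {
                x--;
            }
        }
        return new long[] { y, x };
    }

    public static void main(String[] args) {
        long[] s = simulate(1000, 1000, 5.0, 42);
        System.out.println("prey = " + s[0] + ", predators = " + s[1]);
    }
}
```

Recording (t, y, x) inside the loop yields trajectories like the one in figure 5.2(a).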
5.2 Stochastic models of circadian rhythms
5.2.1 The delay-based Model
The daily alternation of day and night affects nearly all life-forms. Many organisms have evolved
rhythmic responses that follow this day-night cycle. These responses range from behaviour
(sleep-wake cycles, feeding rhythms) to molecular rhythms (e.g. gene expression and
enzyme-activity rhythms).
This section deals with a core molecular model capable of generating circadian rhythms in
Neurospora crassa, a red bread mould. This mould is often used as a model organism since it is
easy to grow, and its genome is simple and fully sequenced. The model itself represents a rhythmic
response on the level of gene expression (i.e. the response is given by oscillations in the
concentration of a regulatory protein with a period close to 24 h). One has to keep in mind that
this model is fairly general. It does not aim at capturing all the details in a real cell. It is
also not specific to Neurospora but represents an architecture that is thought to be
representative of a biomolecular clock in simple organisms. As an example, Gonze, Halloy &
Goldbeter (2002) state that, with small modifications, this system is equivalent to the
biomolecular clock in Drosophila melanogaster, the fruit fly.
There are several models of genetic oscillators, almost all of which are based on a negative
feedback loop in which a regulatory protein inhibits its own expression. The model we are
examining was first published by Leloup, Gonze & Goldbeter (1999). This deterministic description
was later transformed into a stochastic model by Barkai & Leibler (2000). The simulation of this
model revealed a very noisy behaviour and no stable oscillations. Their conclusion was that the
model is wrong since it is only able to exhibit circadian rhythms if solved deterministically;
the deterministic formulation, however, is assumed to be inadequate since the regulatory protein
occurs in very few copies only. Two years later, the same model was simulated again using the
same algorithm but with different rate constants for the protein-DNA interactions (Gonze, Halloy
& Goldbeter 2002). These kinetic constants are thought to be critical for the noise resistance of
the system. Under these conditions the model revealed stable oscillations with a period of about
24 hours.
Gonze, Halloy & Goldbeter (2002) argue that the kinetic rate constants used by Barkai & Leibler
(2000) in their stochastic model were too small. Using different parameters, they were able to
produce stable oscillations in a stochastic simulation even with very few molecules. In the
experiments that follow, the same kinetic constants and experimental setup are used as by Gonze,
Halloy & Goldbeter (2002). It is, however, not clear which parameters are actually correct, since
no experimental data is available so far.
Figure 5.3: Core model for circadian rhythms based on delay. This plot is taken from Gonze,
Halloy & Goldbeter (2002). The model incorporates gene transcription as well as transport,
degradation and translation of mRNA (Mp). The clock protein (P0) that is synthesized from the
mRNA is reversibly phosphorylated into the forms P1 and P2 successively. P2 is either degraded or
transported into the nucleus (PN), where it exerts a negative feedback on its own gene. The
inhibition is cooperative (see explanation in the text below). This general model accounts for
circadian oscillations in Neurospora but also in Drosophila.
The model
An overview of the model is given in figure 5.3. Essentially, a protein is phosphorylated in two
steps, diffuses back into the nucleus and inhibits its own synthesis. This negative feedback loop
leads, if timed correctly, to oscillations in the concentration of the protein with a period close
to 24 hours. The phosphorylation induces a delay between the translation of the mRNA and the
diffusion of the protein back into the nucleus. Its role in the biological clock is not yet clear
(Barkai & Leibler 2000), and there are theoretical models of circadian oscillations that can do
without it (Gonze, Halloy & Gaspard 2002). Up to four proteins must bind successively to the gene
promoter to repress transcription. The resulting inhibition of the gene is cooperative. This means
that each bound protein facilitates the binding of the next protein, i.e. the rate constants of
these reactions
are increased. This is a phenomenon that occurs very frequently in nature. Cooperativity is not
required for the oscillations to take place either; nevertheless, a cooperative inhibition
increases the robustness of the oscillations. The experiments of Gonze, Halloy & Goldbeter (2002)
revealed that the highest robustness is achieved with three proteins binding to the gene. They
understand robustness as an informal measure of the regularity of the oscillations. Whenever we
use the term robustness in the remainder of this dissertation, we refer to this regularity as
well. This regularity is determined by the deviation in period and amplitude of the oscillations.
The differential equations that correspond to the individual reaction steps in the model are given
in Appendix B. From this deterministic model, a description of the detailed reaction steps was
derived. Reactions following Michaelis-Menten kinetics are decomposed into single steps. The
Petri Net representation of the reaction steps is large and is therefore also included in
Appendix B. The probabilities of the reactions are, if not present in the deterministic model,
taken from the literature. As outlined before, the aim of the stochastic formulation was to show
that this model is adequate, a fact that was questioned in a previous study (Barkai & Leibler
2000).
The volume of the system, Ω, was changed systematically in order to modulate the number of molecules in the model. To understand why an increase of the volume results in a larger number of molecules, one needs to recall how the deterministic rate constants are converted into stochastic reaction constants (see section 2.3.2 in chapter 2). Intuitively, enlarging the container should dilute the system and involve fewer molecules. But this is not the case in this scenario. Since the deterministic constants are expressed in terms of the concentrations of the molecular species, we multiply these constants by the volume of the system and obtain constants in terms of the number of molecules. The deterministic constants are fixed, and therefore an increase in the volume of the system increases the magnitude of the reaction probabilities and effectively the number of molecules involved. Gonze, Halloy & Goldbeter (2002) refer to Ω as the size of the system rather than its volume in order to avoid confusion.
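For mass-action reactions, the conversion just described can be written down compactly. The following sketch assumes that Ω already absorbs Avogadro's number and treats the special rescaling of gene-promoter reactions (cf. table B.2) as a simple extra factor; all names here are illustrative:

```python
# Sketch of the volume scaling described above, assuming mass-action
# kinetics and that Omega already absorbs Avogadro's number. The
# promoter flag mirrors the scaling of table B.2 that keeps the gene
# count at one; this is an assumption for illustration only.

def stochastic_constant(k_det, order, omega, involves_promoter=False):
    """Convert a deterministic constant of the given reaction order.

    order 0 (synthesis): c = k * Omega
    order 1 (decay):     c = k
    order 2 (binding):   c = k / Omega
    """
    c = k_det * omega ** (1 - order)
    if involves_promoter:
        c *= omega  # undo one factor of Omega for the single gene copy
    return c
```

Increasing Ω therefore raises the propensities of the synthesis reactions, and with them the steady-state molecule numbers, as described above.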
The problem with this approach is that if we modify all reaction probabilities in this way, we would also increase the number of genes in the model. Since this is not realistic, all reaction constants involving the gene promoter G are scaled by Ω to keep its number equal to unity. For details see table B.2 in the appendix. The experiments were started by creating a Petri Net representation of this model. The Petri Net is then translated into SBML and its behaviour simulated by the Gillespie algorithm. This algorithm is implemented in the Systems Biology Workbench
(Hucka et al. 2002). We tried first to recreate the results published by Gonze, Halloy & Goldbeter
(2002) and performed simulations with different values of Ω. The aim was to check whether the
stochastic simulations produce results similar to those obtained with the deterministic model.
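The Gillespie direct method used for all stochastic simulations in this chapter can be sketched in a few lines. The following is a minimal illustration on a toy birth-death system, not the Systems Biology Workbench implementation:

```python
import random

def gillespie(state, reactions, t_end, seed=0):
    """Direct-method stochastic simulation (Gillespie 1977).

    `reactions` is a list of (propensity, update) pairs, where
    propensity(state) returns the current rate of the reaction and
    update(state) applies the state change of one firing.
    """
    rng = random.Random(seed)
    t = 0.0
    trajectory = [(t, dict(state))]
    while t < t_end:
        props = [a(state) for a, _ in reactions]
        a0 = sum(props)
        if a0 == 0.0:
            break                    # no reaction can fire any more
        t += rng.expovariate(a0)     # waiting time ~ Exp(a0)
        r, acc = rng.uniform(0.0, a0), 0.0
        for p, (_, update) in zip(props, reactions):
            acc += p
            if r <= acc:             # reaction chosen with prob p / a0
                update(state)
                break
        trajectory.append((t, dict(state)))
    return trajectory

# Toy birth-death process: synthesis at 5/h, degradation at 0.1/h per molecule
state = {"X": 0}
reactions = [
    (lambda s: 5.0,          lambda s: s.update(X=s["X"] + 1)),
    (lambda s: 0.1 * s["X"], lambda s: s.update(X=s["X"] - 1)),
]
traj = gillespie(state, reactions, t_end=200.0)
# the population settles around 5 / 0.1 = 50 molecules
```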
Estimates of period in the oscillations were obtained with the Matlab Signal Processing Toolbox.
The Fourier transform was applied to the time course of the protein population. This transform
aims at decomposing a noisy signal into a linear combination of sine and cosine functions. The
power spectrum or spectral density was obtained and used to estimate the strength of the different
frequencies that form the signal, in this case the time evolution of the numbers of proteins. From
the frequency estimate, the period of the signal was obtained.
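This period estimation can be reproduced with any FFT library. The sketch below uses NumPy instead of the Matlab Signal Processing Toolbox and a synthetic noisy 24-hour signal, since the simulated time courses themselves are not reproduced here:

```python
import numpy as np

def estimate_period(signal, dt):
    """Estimate the dominant period of a (noisy) periodic signal.

    Computes the power spectrum with an FFT and returns the period
    corresponding to the strongest non-zero frequency.
    """
    x = np.asarray(signal) - np.mean(signal)   # remove DC component
    power = np.abs(np.fft.rfft(x)) ** 2        # power spectrum
    freqs = np.fft.rfftfreq(len(x), d=dt)
    k = np.argmax(power[1:]) + 1               # skip the zero frequency
    return 1.0 / freqs[k]

# Synthetic "circadian" signal: 24 h period sampled every 0.5 h plus noise
t = np.arange(0, 480, 0.5)
rng = np.random.default_rng(0)
y = np.sin(2 * np.pi * t / 24.0) + 0.3 * rng.standard_normal(t.size)
period = estimate_period(y, dt=0.5)  # close to 24 hours
```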
[Figure 5.4: four panels. Top row: deterministic simulation (concentrations in nM of mRNA, nuclear protein and all protein molecules over 0-200 h) and the limit cycle of nuclear protein concentration PN against mRNA concentration Mp. Bottom row: stochastic simulation with Ω = 500 (numbers of mRNA and protein molecules over time) and the corresponding limit cycle of PN against MP.]
Figure 5.4: Delay-based circadian clock: stochastic and deterministic simulation The first row
gives the results obtained in the absence of noise. These curves are generated by numerical integration of the
kinetic equations as given in Appendix B. The oscillations of mRNA (Mp), nuclear (PN) and total clock protein
(Pt) correspond to the evolution towards a limit cycle shown as a projection onto the (Mp,PN) plane. The
results in the second row are obtained by stochastic simulation of the chemical reactions corresponding to
the deterministic model. The number of mRNA molecules oscillates between a few and about 1000 whereas
nuclear and total clock protein oscillate in the ranges of 200 - 4000 and 800 - 8000, respectively.
[Figure 5.5: three rows of plots for Ω = 100, 50 and 10. Each row shows the time course of the numbers of mRNA and protein molecules, the limit cycle of nuclear protein molecules PN against mRNA molecules MP, and the sample autocorrelation function.]
Figure 5.5: Effect of the number of molecules on the robustness of the oscillations. The plots
show the results of stochastic simulations with Ω changing from 100 to 50 and 10. The left plot in each row
shows the oscillations for mRNA, nuclear protein and all proteins during a simulation time of 484 hours. The
middle plot shows the corresponding limit cycle and the right plot the time evolution of the autocorrelation
function. The autocorrelation function was computed for time lags from 0 to 480. For Ω = 50 and 100, one
can still observe robust circadian oscillations. Only for Ω = 10 do the oscillations become very noisy. This fact
is underlined by the rapid decrease of the autocorrelation function. The simulations were conducted with the
Gillespie algorithm. The plots of the autocorrelation function were created with Matlab.
Experiments and Results
The aim of this first experiment was to check whether for sufficiently large numbers of molecules,
a stochastic simulation will produce results similar to the deterministic model. A simulation of
the SPN representing the model was performed with the Gillespie algorithm (Figure 5.4). The left
plot in each row shows the oscillations of mRNA (MP), nuclear protein (PN) and the sum of all
protein molecules obtained with the deterministic model (first row) and the stochastic simulation.
These oscillations evolve towards a limit cycle which is shown in the second plot in each row. The
stochastic simulation was conducted for Ω = 500.
For this particular value of Ω, the oscillations of the stochastic model are quite stable with a pe-
riod of 24.5 hours and a standard deviation of 1.1 hours. Similar values were obtained by Gonze,
Halloy & Goldbeter (2002). The number of mRNA molecules varies in the range of 0−1000 and
the number of proteins in the range of 200− 4000 (nuclear form) and 800− 8000 (all proteins).
The deterministic model shows stable oscillations with a period of 23.8 hours. In this model, the
concentration of mRNA varies in the range of 0−2 nM whereas the protein concentrations change
in a range of 1−6.5 nM (PN) and 2.4−14.8 nM (all proteins). It seems that the molecular noise, which is considered only by the stochastic model, merely induces a change in the amplitude of the oscillations and not in the period.
The next experiments dealt with the influence of decreasing numbers of molecules on the results
of the stochastic simulation. Stochastic simulations were performed for Ω = 10,50 and 100. The
results in figure 5.5 reveal that stable oscillations occur with Ω = 100 (first row) and 50 (second row). With these parameters, the number of mRNA molecules oscillates in the range of 25−930 (Ω = 100) and 15−800 (Ω = 50). For smaller numbers of molecules, however, the circadian rhythms are increasingly obscured by noise. The bottom row shows the result of a simulation with Ω = 10. The number of mRNA molecules varies from 0 to 200 and the number of proteins changes from 0 to 200 (PN) and from 50 to 400 (all proteins). The limit cycle is no longer visible and the circadian oscillations have become very noisy.
These observations are underlined by the time evolution of the autocorrelation function, which is given in the third column of each row. The autocorrelation function measures the degree of periodicity of a signal. It is the correlation of a discrete process against a time-shifted version of itself, for a time lag τ, and is defined by:
R_f(τ) = E[(X_i − µ)(X_{i+τ} − µ)] / σ²
where E is the expected value, µ the mean and σ² the variance of the process. For a deterministic and periodic time series, the autocorrelation function oscillates between −1 and 1. In the presence of noise, the more periodic the signal, the more slowly the autocorrelation function decays to zero. This loss of correlation
is due to the phenomenon of phase diffusion. In the presence of noise the phase of free-running
oscillations varies in such a way that it eventually covers the whole range of possible values over a
period (Gonze, Halloy & Goldbeter 2002).
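The sample autocorrelation plotted in the figures can be computed directly from a simulated time course. A minimal NumPy sketch (the dissertation used Matlab for these plots):

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Sample autocorrelation R(tau) for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    mu, var = x.mean(), x.var()
    n = len(x)
    return np.array([
        np.mean((x[: n - tau] - mu) * (x[tau:] - mu)) / var
        for tau in range(max_lag + 1)
    ])

# A pure sine keeps full correlation at lags equal to its period
t = np.arange(0, 200, 0.5)                 # 200 h, sampled every half hour
x = np.sin(2 * np.pi * t / 24.0)
r = sample_autocorrelation(x, max_lag=96)
# r[0] is 1; r[48] (a lag of 24 h) stays close to 1 for this noiseless signal
```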
If many molecules are present in the system, the autocorrelation decreases slowly, as can be observed for Ω = 100. If noise starts to obliterate the oscillatory behaviour, the autocorrelation function decreases more rapidly. This can be seen for lower numbers of molecules (Ω = 10, 50). So far, we have merely repeated experiments that were published elsewhere (Gonze, Halloy & Goldbeter 2002). We can now hope that our approach of modelling the reaction steps with a SPN is valid, and we can apply this method and the experimental setup to other models. This has not yet been done.
5.2.2 The hysteresis-based Model
As mentioned at the beginning of this chapter, the validity of the delay-based model has been
questioned (Barkai & Leibler 2000). In this part of the dissertation, we will present a different
model based on hysteresis (Vilar et al. 2002). In general, hysteresis is a property of a system that
describes a memory or lagging effect. In contrast to the previous model, this genetic oscillator
consists of two different components, an activator and a repressor protein. The expression of the
activator leads to a delayed expression of the repressor. The repressor inhibits the synthesis of the
activator protein and is the source of the oscillations.
The delay-based and the hysteresis-based model are very different, but both are general models that try to explain how circadian oscillations might be generated at the molecular level. A short comparison of both models has already been made (Barkai & Leibler 2000). Recently, it has been suggested that the rate of the binding reaction between protein and DNA has a significant influence on the noise resistance of the oscillations (Forger & Peskin 2005). We therefore repeat previous experiments with faster rate constants for these binding reactions, which are assumed to increase the noise resistance.
We also change the size of the hysteresis-based model in the same fashion as for the delay-based
model in the last section. This was not part of the experiments in Barkai & Leibler (2000).
The Model
Figure 5.6: The hysteresis-based model. Illustration taken from Barkai & Leibler (2000). This model consists of two active components, an activator (A) and a repressor (R). A stimulates the synthesis of R. If the concentration of R rises, A is degraded quickly. After the slow degradation of R, A is synthesized again.
Figure 5.6 above gives an illustration of the second oscillatory network that is examined in this
section. The model also includes the degradation of both messenger RNAs synthesized from genes
PA and PR. The activator protein A binds to the promoter region of its own gene, which leads to an increase of the transcription rate. This type of feedback loop increases the noise resistance
of the oscillator and seems to be a common feature among several competing models (Barkai &
Leibler 2000). Protein A also binds to the promoter of gene PR and induces its expression. The
gene is transcribed into mRNA and the resulting protein R binds to protein A. The degradation
complex (C) is formed (not shown in the simplified illustration above) and A is degraded; that is, the degradation complex C decays into R. The cycle is completed by the degradation of the repressor R and the subsequent re-expression of the activator. For a detailed description of all reaction steps in the
model, see Appendix C.
As for the previous model, this oscillator is assumed to represent the core architecture of a biomolecular clock. It is not specific to any organism, but contains features, such as a positive feedback loop and two competing components, that were found experimentally in a variety of organisms, from cyanobacteria to mammals (Dunlap 1999). The dynamics of this system are captured
by a set of differential equations which are given in Appendix C. These equations have been de-
composed into the elementary steps in the same way as for the delay-based model. The reaction
probabilities are given in table C.1 and the SPN representation of the reactions in figure C.1 in the
third appendix.
[Figure 5.7: four panels. (a) Deterministic simulation (numbers of C and R molecules over 0-200 h); (b) limit cycle of the deterministic simulation (R against C); (c) stochastic simulation; (d) limit cycle of the stochastic simulation.]
Figure 5.7: Hysteresis-based circadian clock: stochastic and deterministic simulation Oscillations in the repressor protein (R) and the degradation complex (C) obtained by numerical simulation of the deterministic (5.7(a)) and stochastic (5.7(c)) descriptions of the model.
Experiments and Results
The delay-based model that was examined in the last section reveals stable oscillations at high numbers of molecules, i.e. if the size of the system is increased to Ω = 500. On the other hand, if Ω is decreased to 10, oscillations can still be perceived, but with a very irregular period and amplitude. The delay-based and the hysteresis-based model have already been compared (Barkai &
Leibler 2000) but with slower rate constants and without changing the system size. We present a
more thorough comparison with faster rates for the protein and DNA binding reactions and with
changing numbers of proteins and mRNA in the model. All stochastic simulations were started with one gene each for the repressor and the activator protein, and with no mRNA or protein molecules, as initial conditions.
To begin, we compare the stochastic formulation of the new hysteresis-based model with its deter-
ministic formulation. Figure 5.7 shows the results of this experiment. Following the work of Vilar
et al. (2002), we give the time course of repressor protein (R) and degradation complex (C) which
consists of activator and repressor protein. Deterministic formulation and stochastic simulation are
in good agreement and correspond to the results obtained by Vilar et al. (2002). The degradation
complex is formed as soon as activator and repressor proteins are available. It decays into R and
therefore a peak in the concentration of C is followed by a peak in the concentration of R. Later
on, R is degraded and the cycle starts again. This experiment was performed with Ω = 1, as done
by Vilar et al. (2002). With these parameter settings, the stochastic simulation results in a change
of amplitude in the oscillations but the period remains remarkably stable.
These experiments have already been performed elsewhere (Barkai & Leibler 2000), but in contrast to our experiments, slow rate constants were used for the binding reaction between protein and DNA. Whereas the noise resistance of the delay-based model is greatly improved with our higher rate constants, we cannot observe any change in the behaviour of the hysteresis-based model.
Similar to the experiments in the last section, we now examine the behaviour of the biomolecular clock for different numbers of molecules. The delay-based model exhibited very noisy oscillations at low values of Ω. If Ω is increased, the behaviour of the stochastic model approaches the time course of the deterministic simulation. The fact that random fluctuations on the molecular
level are averaged out if enough molecules are involved has been proven formally (Kurtz 1971)
and the behaviour of the delay-based model confirms this.
In contrast to these results, the hysteresis-based model reveals a very different behaviour. For
Ω = 10 and 50, the oscillations are very stable (third and second row of figure 5.9). But if the size
of the system is further increased, the oscillations stop completely (first row of figure 5.9). For
this model, higher numbers of molecules do not seem to improve the oscillatory behaviour of the
model. A limited amount of noise seems to have a positive influence on the oscillator. For higher
numbers of molecules, the system seems to approach a steady state.
These results confirm the findings of Vilar et al. (2002), who performed a theoretical analysis of a simplified version of the hysteresis-based oscillator. They were able to show that molecular fluctuations can actually enhance the oscillator. Essentially, small perturbations can drive the system out of a stable state and initiate a new phase. In a deterministic setting, these perturbations are not considered and the system remains in the stable state once it has arrived there. This behaviour was observed for a particularly low value of the degradation rate of the repressor protein R
[Figure 5.8: two panels showing the number of repressor molecules over 0-400 h for (a) the deterministic and (b) the stochastic simulation.]
Figure 5.8: Time evolution of the repressor protein (R) for deterministic (a) and stochastic (b) formulation
of the model. Parameter values are as given in Appendix C except for the degradation rate of R (δR) which is
now 0.05h−1.
(δR = 0.05 h−1). Figure 5.8 shows the results of a deterministic and a stochastic simulation with
these parameters. It is not completely clear whether a similar situation is created if the size of
the system is increased as in our experiments. The simulation shows that the abundance of the
key proteins A and R oscillates between 0 and several thousands of molecules. Vilar et al. (2002)
observed a similar behaviour for Ω = 1. The fact that very few instances of the key proteins are
present only during a very short time interval might be the reason for the noise resistance of the
system at low values of Ω. On the other hand, the perturbations that will necessarily occur at
these low abundances might just be enough to drive the system out of the steady state and into
the next period. The simulation with Ω = 100 reveals that the numbers of both key proteins (data
for A not shown) do not decrease to zero but oscillate around 1000 molecules. This might be why the system approaches a steady state: the influence of fluctuations in the molecular populations is too low to drive the system out of its stable state and initiate a new oscillation.
It is very difficult to obtain coherent conclusions about a complex nonlinear system just from observations. To our knowledge, most of the theoretical results about the behaviour of a model under different parameters were obtained for simplified versions of the model only. In this case, various assumptions were made, such as treating some molecular species as being at steady state, in the hope that both the full and the simplified model exhibit a similar behaviour over a wide range of conditions. It might require further advancement in the theoretical sciences
to obtain new insights. Nevertheless, our simulations have shown that the hysteresis-based model
exhibits an unexpected behaviour if the size of the system is increased.
[Figure 5.9: three rows of plots for Ω = 100, 50 and 10. Each row shows the time course of the numbers of C and R molecules, the limit cycle of R against C, and the sample autocorrelation function over time lags of 0-500 h.]
Figure 5.9: Stochastic simulation with changing numbers of molecules (hysteresis-based
model) The plots show the results of stochastic simulations with Ω changing from 100 to 50 and 10. The
left plot in each row shows the oscillations for repressor (R) and degradation complex (C) during a simulation
time of 200 hours. The middle plot is the limit cycle and the right plot the time evolution of the autocorrelation
function.
Simulating the effects of gene duplication
[Figure 5.10: two panels. (a) Delay-based model, stochastic simulation with Ω = 100 (numbers of mRNA and PN molecules over 0-200 h); (b) hysteresis-based model, stochastic simulation with Ω = 1 (numbers of C and R molecules over 0-200 h).]
Figure 5.10: Both models are simulated with a second copy of the clock gene (activator gene in case of
the hysteresis model). The transcription rate of the gene copy is increased by a factor of 10 in both models. Both models are
simulated with a value of Ω for which they should exhibit stable oscillations (100 for delay-based and 1 for
hysteresis-based model).
Gene duplication is thought to have a major role in evolution. It can happen when an error occurs during DNA replication and a copy of a functional gene is inserted into a different part of the DNA. This copy might be identical to the original gene or mutated. If both copies are functional, one of the genes might mutate later on and acquire a different function, since it is no longer required for the survival of the organism. But both genes can also remain active during the further evolution of the species (Cox & Nelson 2004).
The effect of gene duplication on a genetic oscillator has already been examined by Forger & Pe-
skin (2005). They increased the number of genes in their stochastic model of a circadian clock in
mammals and found that the robustness of the oscillations is improved if more genes are present
in the model. They measured the robustness of the oscillations in terms of the deviation of the
period over many runs. Forger & Peskin (2005) argue that a low number of genes always leads to
some “residual stochasticity” in the model. Even in the limit of a large volume, when all reactions
can be modelled deterministically, the reactions involving the gene and its promoter will still occur
stochastically since the number of genes will not be influenced by the increase of the volume. This
might also explain the fact that faster binding rates between protein and DNA lead to a reduction
of the stochasticity in the model since the randomness of these reactions will be averaged out if
they occur on a very fast time scale (Gonze, Halloy & Goldbeter 2002).
Gene duplication might also involve the mutation of the duplicated gene. This was not considered in the simulations of Forger & Peskin (2005). We simulate gene duplication by introducing a second mutated gene into both models. A mutation can have several effects on a gene. It might lead to a dysfunctional protein product. But it is also possible that the promoter region is modified and that the gene is transcribed at a much higher rate than its original. We modelled mutation by increasing the transcription rate of the gene copy by a factor of 10.
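In terms of the reaction constants fed to the simulator, this modification amounts to duplicating the transcription reaction with a scaled constant. A sketch with hypothetical transition names (not the actual transitions of the SPN models):

```python
# Sketch: simulating gene duplication with a mutated promoter.
# `reactions` maps a transition name to its stochastic rate constant;
# the names and values are hypothetical, for illustration only.

def duplicate_gene(reactions, transcription, factor=10.0):
    """Add a second gene copy whose transcription is `factor` times faster."""
    copy = dict(reactions)
    copy[transcription + "_dup"] = reactions[transcription] * factor
    return copy

original = {"transcription_A": 50.0, "translation_A": 5.0}
mutant = duplicate_gene(original, "transcription_A")
# mutant contains transcription_A_dup with a tenfold rate constant
```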
Figure 5.10 shows the results of this experiment. The hysteresis-based model is apparently not severely perturbed by the modification. The amplitude of the oscillations is decreased and fewer proteins are produced than in the model with only one gene (a maximum of 2043 compared to 2495). It seems that a higher overall transcription rate of the activator gene leads to a faster expression of the repressor protein and in turn to a faster degradation of the activator protein. This might be the reason for the damping of the amplitudes. But the genetic circuit still exhibits regular oscillations.
In contrast to this, the delay-based model is severely affected by the introduction of a second mutated gene. It does not exhibit any oscillations; instead, the number of clock proteins keeps increasing. A maximum value of 390000 molecules was observed during a simulation time of 200 hours.
We tested the behaviour of both genetic oscillators when a second gene with increased transcription
rate is introduced. This modification can be interpreted as a simulation of a duplication of the clock
gene. However, the main intention of this experiment was to examine the behaviour of both models
if key genes are copied and mutated. It is questionable whether the modification that we introduced is a good model of a real gene duplication. Nevertheless, it was shown that the hysteresis-based model is less susceptible to structural modifications. This is a characteristic that is favoured by evolution, since gene networks that are easily affected by mutations might die out quickly. The ability to
function reliably even if key components are mutated is probably necessary for the circadian clock
to be successfully embedded within the cell.
Influence of changes in the Protein-DNA binding rates
In a previous study (Barkai & Leibler 2000), the delay-based model of a biomolecular clock
(Leloup et al. 1999) has been criticised since it exhibits very unstable and noisy oscillations if
simulated stochastically. Later on, Gonze, Halloy & Goldbeter (2002) repeated this stochastic simulation with different rate constants. These experiments revealed that the delay-based oscillator is able to oscillate reliably if the rates of the reaction between the clock protein and its own
gene are set to very high values. Nowadays it is believed that high rate constants in these reac-
tions are crucial for the robustness of the oscillation in many models of circadian clocks (Forger &
Peskin 2005).
We used the same values for the binding reactions in our experiments for both models, hysteresis-
and delay-based. But we briefly repeat the simulations conducted by Barkai & Leibler (2000) with
low rate constants to finalise our comparison of both models.
[Figure 5.11: two panels. (a) Delay-based model, stochastic simulation with Ω = 500 (numbers of mRNA and PN molecules over 0-500 h); (b) hysteresis-based model (numbers of C and R molecules over 0-200 h).]
Figure 5.11: Simulation of both circadian clocks with low rate constants Stochastic simulation of both
models with low rate constants for the binding reactions between DNA and proteins. The rates were set to
50 (binding) and 10h−1 (dissociation).
Our findings match the results obtained by Barkai & Leibler (2000). Whereas the hysteresis-based model exhibits stable and pronounced oscillations even with low rate constants, the delay-based model still oscillates, but in a very noisy manner. No period even close to 24 hours is visible.
The problem here is that the true values of the protein-DNA binding rates are not known. Even if it has been observed that high rate constants are crucial for the robustness of the model, it is not clear whether these rate constants reflect reality. Currently there are no experimental data about the kinetics of these reactions. However, the obtained results can be seen as an indication of the soundness of both models, since real circadian clocks have to function reliably if their parameters are changed due to external influences such as changes in temperature or the current state of the organism (hunger, stress, etc.).
5.3 Synchronising several oscillating cells
The two models presented in this chapter are oscillators with a very general structure. They do not
contain features specific to any organism. They exhibit somewhat contradictory behaviour, showing only noisy or no oscillations at all under certain conditions. However, the ability to create stable circadian oscillations under a large variety of external conditions is thought to be a key feature of biological clocks (Barkai & Leibler 2000). Changes in transcription and translation rates may arise from variations in nutrition, growth conditions or temperature. It seems reasonable that evolution favours designs of cellular systems that function reliably despite global changes in their environment.
There might be several factors that can help biomolecular clocks to achieve this task. The entrain-
ment by daylight is certainly such a factor. In the biomolecular clock of Neurospora, light enhances
the degradation of the clock protein and by doing this, exerts a periodic forcing of the clock that
was found to improve the noise resistance (Gonze, Halloy & Goldbeter 2002). Light can be seen
as some kind of external synchronisation. There are also theories of an internal synchronisation, e.g. a synchronisation between different cells such that the oscillations of a group of cells are more stable than those of a single cell (Forger & Peskin 2005).
Experiments and Results
This section presents a short experiment about a possible mechanism of synchronisation between
different clocks. One possibility is to simply average the oscillations of several cells. But we would expect that noisy oscillations which are averaged over a large number of cells simply cancel out due to the different phase shift of each cell's oscillation.
In order to cope with this problem, a single run of the delay-based model was observed at Ω = 50.
After this system left the transient phase and settled into some more or less stable limit cycle, the
numbers of each molecular species were recorded and used as initial state for a new run of 100
cells. Again, the average was taken of the individual oscillations. The rationale of this approach
is that each cell should start at a common initial state such that the initial shift of the oscillations
against each other is zero. The results of these experiments are shown in figure 5.12.
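The washing-out effect that these experiments exhibit can be illustrated without the full SPN model: if each cell's phase diffuses independently, the population average flattens even though every single cell keeps oscillating. A synthetic sketch under an assumed random-walk phase model (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0, 480, 0.5)            # 480 h sampled every half hour
n_cells, period, sigma = 100, 24.0, 0.1

cells = []
for _ in range(n_cells):
    # each cell starts in phase, but its phase drifts as a random walk
    phase = np.cumsum(sigma * rng.standard_normal(t.size))
    cells.append(np.sin(2 * np.pi * t / period + phase))
cells = np.array(cells)

average = cells.mean(axis=0)
# early on the cells are still aligned; later the average flattens
early = np.ptp(average[: t.size // 4])   # peak-to-peak, first quarter
late = np.ptp(average[-t.size // 4 :])   # peak-to-peak, last quarter
```

The peak-to-peak amplitude of the average is large at the start and decays as the phases drift apart, which is exactly the behaviour observed in the averaged simulations.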
In both cases, the experiments were not very successful. In fact, the oscillations that were averaged over several cells are even weaker than the oscillations of a single cell. The autocorrelation function decays very fast and a limit cycle is nonexistent. However, we can also observe that the very
first oscillations are clearly pronounced in each experiment. In the later course of the simulation,
the oscillations become noisier and their average converges towards a straight line.
What can be concluded from these experiments? First of all, there is almost certainly some kind
of synchronisation of oscillating cells (Forger & Peskin 2005). But given the results from these
experiments and given the fact that cells are clearly separated compartments, it does not appear
to be reasonable to simply average the oscillations of several noisy cells. It might be possible to
improve the results of these experiments by combining external and internal synchronisation. We
could introduce a factor that simulates the influence of daylight, for instance by changing the
degradation rate of the clock protein in the delay-based model every 12 hours, and then average
the oscillations of several cells. However, this was not possible in our experimental setup: once
the SPN model is created, it is simulated in a single run, and there is no possibility to change
parameters during a simulation in the Systems Biology Workbench.
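If a simulator allowed its state to be modified between steps, such a light-forcing experiment could be scripted directly. The following sketch illustrates the idea on a deliberately simple birth-death process (all rate values are invented for illustration; they are not parameters of either clock model): the degradation rate is switched every 12 hours inside a Gillespie-style loop.

```python
import random

random.seed(1)

def simulate(t_end=96.0, light_period=12.0):
    """Birth-death sketch: a protein is produced at a constant rate and
    degraded at a rate that doubles during the 'daylight' half-periods.
    All rate values here are illustrative, not model parameters."""
    t, p = 0.0, 50
    k_prod = 100.0                      # production, molecules per hour
    k_deg_night, k_deg_day = 1.0, 2.0   # degradation, per molecule per hour
    trace = [(t, p)]
    while t < t_end:
        daylight = int(t // light_period) % 2 == 0
        k_deg = k_deg_day if daylight else k_deg_night
        a_prod, a_deg = k_prod, k_deg * p
        a_total = a_prod + a_deg
        t += random.expovariate(a_total)        # waiting time to next event
        if random.random() * a_total < a_prod:  # choose the event to fire
            p += 1
        else:
            p -= 1
        trace.append((t, p))
    return trace

trace = simulate()
t_final, p_final = trace[-1]
print(f"t = {t_final:.1f} h, copy number = {p_final}")
```

Note that this sketch only re-reads the rate at event times; an exact treatment would truncate the waiting time at the next light-dark boundary so that the rate changes precisely every 12 hours.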
Recent findings suggest that several noisy oscillatory cells are synchronised by a messenger sub-
stance (Gonze, Bernard, Waltermann, Kramer & Herzel 2005). As an example, cells in the
suprachiasmatic nucleus of the hypothalamus, which is assumed to be
the circadian pacemaker in mammals, exhibit oscillations with free-running periods if examined
in isolation. But the suprachiasmatic nucleus as a whole exhibits regular oscillations with a stable
period close to 24 hours. In a different study, a model of a cell population in the hypothalamus was
developed and it was shown that this population can be synchronised by introducing a global vari-
able representing a neurotransmitter which influences directly the transcription rate of the clock
gene (Gonze et al. 2005). However, the details of the real synchronisation of oscillating cells in
mammals are still unknown. It is not known which messenger substance actually enforces the
synchronisation, nor how this substance interacts with the molecular clocks in the individual
cells. Our results suggest that there must be some form of global synchronisation
to ensure a stable circadian rhythm on a tissue level since the simple averaging of single oscillators
does not improve the stability of the circadian rhythm.
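The effect of such a global coupling can be illustrated with a toy mean-field model of phase oscillators (a Kuramoto-type sketch with invented values; this is not the model of Gonze et al. 2005): without coupling the population stays incoherent, while even a modest global coupling pulls the phases together.

```python
import math, random

random.seed(2)

def sync_level(coupling, n=100, t_end=48.0, dt=0.01):
    """Euler-integrate n mean-field coupled phase oscillators and return the
    order parameter R (0 = incoherent, 1 = perfectly synchronised)."""
    # natural frequencies around 2*pi/24 rad/h (a 24 h period), ~5% spread
    freqs = [2 * math.pi / 24 * (1 + 0.05 * random.gauss(0, 1))
             for _ in range(n)]
    phases = [random.uniform(0, 2 * math.pi) for _ in range(n)]
    for _ in range(int(t_end / dt)):
        c = sum(math.cos(p) for p in phases) / n   # mean-field components
        s = sum(math.sin(p) for p in phases) / n
        phases = [p + dt * (w + coupling * (s * math.cos(p) - c * math.sin(p)))
                  for p, w in zip(phases, freqs)]
    c = sum(math.cos(p) for p in phases) / n
    s = sum(math.sin(p) for p in phases) / n
    return math.hypot(c, s)

print("uncoupled R:", round(sync_level(0.0), 2))
print("coupled   R:", round(sync_level(0.5), 2))
```

The update rule uses the identity (1/n)·Σ sin(θj − θi) = S·cos(θi) − C·sin(θi), so each step costs O(n) rather than O(n²).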
[Figure 5.12, nine panels in three rows, all from stochastic simulations with Ω = 50: in each row, a time course of the number of molecules (mRNA MP, nuclear protein PN and all protein molecules) against time in hours, a plot of the number of mRNA molecules against the number of PN molecules, and the sample autocorrelation function against the lag.]
Figure 5.12: Synchronisation of several cells. The first row represents an experiment in which the
molecule numbers were simply averaged over 100 runs. The second row shows the results of the second
experiment, in which the initial numbers of all molecular species were set to a value within the limit cycle
and the results were again averaged over 100 runs. The last row gives a simulation of a single cell at the same
system size (Ω = 50). The first plot in each row gives the time evolution of mRNA, nuclear protein (PN) and
the whole protein population. The second plot shows mRNA versus nuclear protein abundance, and the
third the time course of the autocorrelation function.
[Figure 5.13, two panels: (a) the delay-based model (Gonze et al. 2002) and (b) the hysteresis-based model (Barkai and Leibler 2000), each plotting the half-life of the autocorrelation against the system size Ω.]
Figure 5.13: Robustness of the oscillations in both models measured by half-life of autocorrelation.
The half-life of the autocorrelation is plotted against Ω, the size of the system.
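The robustness measure used here, the half-life of the autocorrelation, can be estimated from a simulated time series roughly as follows. This is a generic sketch (the test signal and its decay time are invented), not the exact procedure used to produce figure 5.13:

```python
import numpy as np

def autocorrelation(x):
    """Sample autocorrelation of a 1-D series, normalised so acf[0] == 1."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    acf = np.correlate(x, x, mode="full")[x.size - 1:]
    return acf / acf[0]

def half_life(acf, dt=1.0):
    """Lag at which the envelope of the autocorrelation (traced by the
    local maxima of |acf|) first drops below one half."""
    a = np.abs(acf)
    for i in range(1, a.size - 1):
        if a[i] >= a[i - 1] and a[i] >= a[i + 1] and a[i] < 0.5:
            return i * dt
    return None

# Damped 24 h oscillation with a 40 h decay time (illustrative values):
dt = 0.5
t = np.arange(0, 400, dt)
x = np.exp(-t / 40) * np.cos(2 * np.pi * t / 24)
print(half_life(autocorrelation(x), dt=dt))
```

Tracking the local maxima of |acf| rather than the first crossing of 0.5 is important: for an oscillatory signal the autocorrelation itself dips below one half within a quarter period, while the envelope decays on the much slower timescale one actually wants to measure.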
5.4 Discussion
The primary intention of these experiments was to show a practical application for the extended
Petri Net Kernel and to give a comprehensive example that underlines the need for stochastic mod-
els in biology. But our objective was also to present a more detailed comparison of two stochastic
models of circadian clocks and to give new insights into the architecture of the true clock.
The first aim has been achieved. The Petri Net Kernel in its extended version has proven to be use-
ful in these experiments. The Petri Net representation is a visualisation of the reaction steps that is
easy to understand and it can be simulated efficiently using the Systems Biology Workbench. One
of the biggest advantages of the Kernel is that it does not require any knowledge of programming
languages to create a model. The user is only required to use a graphical editor with an intuitive in-
terface and to create a network of nodes and arcs. Furthermore, the experimental results published
by Gonze, Halloy & Goldbeter (2002) and Vilar et al. (2002) have been successfully recreated.
When it comes to the second objective, the comparison of two clock architectures, the results are
more difficult to evaluate. We were able to recreate simulation results for the delay-based model
that were published by Gonze, Halloy & Goldbeter (2002). Furthermore, we extended this approach
of simulating a stochastic model at varying system sizes to the hysteresis-based model.
Although both models have already been compared by Barkai & Leibler (2000), the size of the
system was not varied in their study. We also simulated the effects of gene duplication and mutation
in both models and investigated the synchronisation of several noisy oscillators. In addition, our
simulations support the results from Barkai & Leibler (2000), that the delay-based model is sus-
ceptible to changes in the protein-DNA binding reactions.
In general, the hysteresis-based model seems to be less susceptible to changes in its rate constants
and is also hardly affected by duplication and mutation of its key gene. The delay-based model is
severely affected by both modifications. In addition, if the size of the system is small and degrada-
tion rate of the repressor protein is low, the hysteresis-based model is enhanced by the fluctuations
on a molecular level. Changes in transcription and translation rates may arise in a real cell and
gene duplication is a common event in the evolution of simple organisms. The robustness in the
presence of noise is also an important factor since it was shown in biological experiments that the
circadian clock in Neurospora is able to work reliably if the numbers of key proteins are in the
order of 20 molecules (Merrow, Garceau & Dunlap 1997). According to Barkai & Leibler (2000),
the ability to resist such uncertainties was probably one of the "decisive factors in the evolution
of circadian clocks" and should be "reflected in the underlying oscillation mechanism". From this
perspective, the hysteresis-based model seems to be more sound.
On the other hand, we were able to repeat experiments that reveal that the delay-based model
(Gonze, Halloy & Goldbeter 2002) approaches the oscillatory behaviour of its deterministic for-
mulation if the size of the system is increased. The oscillations are more robust and exhibit a stable
period of about 24 hours. The hysteresis-based model reveals a somewhat contradictory behaviour.
We chose a parameter setting with high degradation rates of the repressor protein and Ω = 1. Sto-
chastic and deterministic simulation revealed robust oscillations for these parameters. But when
we increased Ω in the same way as we did for the delay-based model, the hysteresis-based model
did not oscillate anymore.
To conclude, we cannot draw any final conclusions about the validity of each model due to limited
time, the generality of both architectures and the fact that the true rate constants of many reactions are
unknown. The delay-based model does not capture all details of the circadian clock in Neurospora
or Drosophila. The hysteresis-based model is not specific to any organism but contains components
that were found in several genetic oscillators. It seems to us that the hysteresis-based model is more
sound since it is less susceptible to changes in its environment and mutations. Its characteristics,
positive feedback loops and two active proteins, could serve as starting points for the construction
of better models.
Chapter 6
Conclusions
6.1 Concluding remarks and Observations
The outcome of this dissertation is twofold. First, a software for the modelling and simulation of
biological processes with Stochastic Petri Nets was created. Second, this software was used to
model genetic circuits that are of current scientific interest.
The implementation of the experimental framework took about one month. A new version
of the Petri Net Kernel, called PNK 2e, was developed. This version can import Petri Nets from
SBML. Petri Nets can also be created using a graphical editor and their behaviour can be simu-
lated in the Systems Biology Workbench, either stochastically or deterministically. The SPN can
be written to either SBML or CMDL, two description languages for biological models. The aim
was to keep the usage of the software as simple as possible. No programming experience is re-
quired to create a model and to simulate it. PNK 2e was presented during the poster session at
the BioSysBio conference 2005 in Edinburgh. Since Stochastic Petri Nets provide an intuitive
representation of stochastic models that are commonly used in the Systems Biology community, the
tool attracted some interest and we received encouragement for our work. Furthermore, PNK 2e
has been announced on the webpage and mailing list of the SBML project. It is available on the
internet1, together with a manual and a step-by-step user guide.
We hoped that the use of freely available software would save us some time. This expectation was
met. On the other hand, we found out that it can be difficult to be dependent on the work and good
will of others. During the first part of this project, some bugs were found in the Systems Biology
Workbench. It was difficult to correct the errors without a deep knowledge of the program.
1 http://www.inf.fu-berlin.de/~trieglaf/PNK2e
Therefore we contacted the developers of the workbench and asked for help. Fortunately, they
were very helpful and corrected the problem within days. But this is certainly not always the case.
In summary, the software aroused interest during its first presentation because Markov processes
and graphical models are approaches biologists are very familiar with. The need for
stochastic models is widely accepted but software tools that are really user-friendly and offer all
the functionality needed by biologists are still rare.
When it comes to the experimental part, conclusions are more difficult to draw. We were able to
recreate the simulation results of others and PNK 2e has proven its usefulness during these ex-
periments. In addition, we conducted a comparison of two competing architectures for circadian
clocks. A comparison at this level of detail has not been done before. We could also present some
results about hypothetical models of the synchronisation of different cells.
We compared a model based on delay induced by phosphorylation of the clock protein and a model
based on hysteresis or lag caused by slow degradation of a repressor protein. The delay-based
model is very sensitive to slow rate constants in the binding reactions of DNA and protein and to
duplication of the clock protein. Its oscillations become very noisy if few molecules are involved.
On the other hand, the hysteresis-based model seems to be enhanced by noise. For some parameter
settings, this model oscillates only in the stochastic simulation. The deterministic solution, which
does not take noise into account, arrives in a steady state. On the other hand, if we start from
a setting of parameters that leads to oscillations in a stochastic and deterministic simulation, and
further increase the size of the system, the genetic circuit does not exhibit any oscillations. This
behaviour contradicts our expectations but might be due to the lack of detail in the model.
Both models capture only core features of real circadian clocks and we can therefore only draw
general conclusions about possible architectures. It is known that evolution favours designs that
are robust to noise and work well under a variety of external influences. In our experiments, it was
shown that the hysteresis-based model is less susceptible to changes in the rate constants. From
this perspective, one might prefer this model. It seems also to be enhanced by molecular fluctua-
tions which are known to be an important factor in the cell (Arkin et al. 1998). On the other hand,
our experiments revealed that the hysteresis-based model does not oscillate if many molecules are
contained in the system. This fact contradicts common expectations since it should function better
with increasing size of the system.
In contrast to this, the delay-based model oscillates independently of the system size. But this
model seems to be heavily affected by changes in its rate constants, especially in changes of the
binding rates between protein and DNA. We are sure that further advancement, both in compu-
tational but also experimental sciences, is necessary before we can draw final conclusions about
the architecture of biomolecular clocks. Nonetheless, both scientific fields are developing at a fast
pace and we hope these advances will be made very soon.
6.2 Unsolved Problems
We were able to obtain some interesting results. But it is certainly difficult to deliver a coherent piece
of work during three months. The software, PNK 2e, has scope for further improvement. As an
example, it would be very convenient if the editor could assign several rate constants to a stochas-
tic transition. Each rate could be assigned to a different environment such as size of the system,
temperature etc. The user could choose a set of rates for an experiment and compare simulation
results with different settings easily. The editor itself could also be extended. It might be useful to
have the option to merge different nets or to use hierarchical nets e.g. nets that contain subnets in
a transition.
Concerning the experiments on stochastic models of genetic oscillators, there is certainly a lot of
work to be done. First of all, the search for rate constants that truly reflect the real velocity of
the reactions is still continuing. Furthermore, many details of real circadian clocks are still to be
uncovered by experimental means. We also need new formal methods that are able to analyse
the complex behaviour of nonlinear systems. There is also a lack of methods that can capture
the reliability of the oscillations in an adequate way. Mere visual inspection of the oscillations is
not sufficient and the autocorrelation is often misleading since it is always decreasing for noisy
oscillations.
6.3 Suggestions for Future Work
The advantage of Petri Nets compared to other modelling formalisms used in Biology is that
their theory has been researched for decades. There exist algorithms that can be used not only to
examine their dynamic behaviour but also to search for structural properties and to derive steady
state information by algebraic means. However, in this dissertation, we focussed on results from
stochastic and deterministic simulations. This is because, for the models examined here, structural
analysis did not yield very interesting results and steady state information could not be obtained.
Yet here lies the true advantage of Petri Nets.
The full use of this potential will emerge once the models become complex enough that a topological
analysis gives more interesting results. So far, it can at least be used to check whether the system
fulfils the assumptions made, such as invariants on the number of enzyme molecules. An algebraic
computation of the steady state distribution requires an upper bound on the number of states in the
Markov process. This can be enforced by simply limiting the state space. But this threshold has
to be chosen carefully. The efficient derivation of the steady state distribution is also an interesting
topic and there is certainly scope for future research.
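Checking a place invariant of this kind is mechanical: a P-invariant is a non-negative weight vector y with yᵀC = 0, where C is the incidence matrix of the net. A sketch on a toy enzymatic net (E + S ⇌ ES → E + P; this is not one of the models in this dissertation):

```python
import itertools

# Incidence matrix C of a toy enzymatic net with places (E, S, ES, P) and
# transitions (bind: E+S->ES, unbind: ES->E+S, produce: ES->E+P);
# C[p][t] is the net number of tokens place p gains when transition t fires.
C = [
    [-1,  1,  1],   # E
    [-1,  1,  0],   # S
    [ 1, -1, -1],   # ES
    [ 0,  0,  1],   # P
]

def is_p_invariant(C, y):
    """y is a P-invariant iff y^T C = 0: firing any transition leaves the
    weighted token sum unchanged."""
    return all(sum(y[p] * C[p][t] for p in range(len(C))) == 0
               for t in range(len(C[0])))

# Brute-force search over 0/1 weight vectors (fine for toy nets; real tools
# compute the integer null space of the transposed incidence matrix).
invariants = [y for y in itertools.product((0, 1), repeat=4)
              if any(y) and is_p_invariant(C, y)]
# Finds enzyme conservation (E + ES) and substrate conservation (S + ES + P).
print(invariants)
```

In the enzymatic degradation steps of the clock models, the first invariant corresponds exactly to the conserved enzyme totals such as Em + Cm.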
Some attempts have been made in this dissertation to simplify the representation of biological re-
actions with Petri Nets. The problem is that the net becomes very large with increasing complexity
of the reactions. Further attempts could be made to develop new representations that maintain the
advantages of Petri Nets and capture the complexity inherent in biological processes even more
easily.
It might also be interesting to develop new simulation algorithms that make use of the information
that is captured by the net. As it was outlined in chapter 2, some useful information is lost if the
net is translated into SBML e.g. information about dependencies among the reactions that can be
used to perform efficient simulations. Even if the use of other software can save a lot of time and
effort, it is also an advantage to use one's own implementation, which can be adapted more easily. For
instance, if we could stop the simulation at a time of our choice, we could simulate the influence
of daylight by changing the rate constants during the simulation.
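The dependency information mentioned here is exactly what the Gibson-Bruck method exploits: after a reaction fires, only the propensities of reactions that share a changed species need to be recomputed. A sketch of building such a dependency graph from reactant and product sets (toy reactions with unit stoichiometry, invented for illustration):

```python
# Reaction lists for a toy enzymatic net (illustrative only).
reactions = {
    "bind":    {"reactants": {"E", "S"}, "products": {"ES"}},
    "unbind":  {"reactants": {"ES"},     "products": {"E", "S"}},
    "produce": {"reactants": {"ES"},     "products": {"E", "P"}},
    "decay_P": {"reactants": {"P"},      "products": set()},
}

def dependency_graph(reactions):
    """graph[s] = reactions whose propensity must be recomputed after s
    fires. With unit stoichiometry, the symmetric difference of reactants
    and products is exactly the set of species whose counts change
    (catalysts appearing on both sides are correctly excluded)."""
    graph = {}
    for s, rs in reactions.items():
        changed = rs["reactants"] ^ rs["products"]
        graph[s] = {r for r, rr in reactions.items()
                    if rr["reactants"] & changed}
    return graph

g = dependency_graph(reactions)
print(g["decay_P"])   # only the reaction consuming P is affected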
Appendix A
User guide to PNK 2e
This is the manual for PNK 2e, a software developed using the Petri Net Kernel (PNK) version 2.2.
The Petri Net Kernel is a framework for the development of Petri Net tools. It was developed at
the Humboldt University of Berlin, Germany. Its extended version, PNK 2e, was developed by Ole
Schulz-Trieglaff during his M.Sc. dissertation at the University of Edinburgh, UK. PNK 2e fea-
tures Stochastic Petri Nets, a modelling formalism that stems from Computer Science. Stochastic
Petri Nets (SPNs) are closely related to Markov Jump Processes. Their behaviour can be simulated
using the Gillespie Algorithm and its improved versions (Gibson-Bruck, Tau Leap).
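For orientation, the core of Gillespie's direct method fits in a few lines. The following is a generic sketch on a made-up birth-death process, not PNK 2e's actual implementation:

```python
import random

def gillespie_step(state, reactions, rng):
    """One step of Gillespie's direct method.
    reactions: list of (propensity_fn, state_change_fn) pairs."""
    props = [a(state) for a, _ in reactions]
    a0 = sum(props)
    if a0 == 0:
        return None                       # no reaction can fire
    tau = rng.expovariate(a0)             # waiting time ~ Exp(a0)
    r = rng.random() * a0                 # pick a reaction by its share of a0
    acc = 0.0
    for p, (_, apply) in zip(props, reactions):
        acc += p
        if r < acc:
            apply(state)
            break
    return tau

# Birth-death example: production at 5/h, degradation at 0.1 per molecule.
state = {"X": 0}
reactions = [
    (lambda s: 5.0, lambda s: s.__setitem__("X", s["X"] + 1)),
    (lambda s: 0.1 * s["X"], lambda s: s.__setitem__("X", s["X"] - 1)),
]
rng = random.Random(0)
t = 0.0
while t < 500.0:
    t += gillespie_step(state, reactions, rng)
print(state["X"])   # fluctuates around the steady state 5/0.1 = 50
```

The improved versions mentioned above change only how tau and the next reaction are chosen: Gibson-Bruck reuses propensities via a dependency graph and an indexed priority queue, and tau-leaping fires many reactions per step.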
PNK 2e extends the PNK by features for the modelling of biological processes. PNK 2e stands for
"extended Petri Net Kernel version 2". The software is able to create a Petri Net representation of
a model described in SBML (Systems Biology Markup Language). The net is drawn by using
a simple algorithm implemented by Alexander Gruenewald, Humboldt University of Berlin. The
dynamic behaviour of the Petri Net can be simulated using the Systems Biology Workbench. In
order to achieve this, PNK 2e translates the net back into its SBML description and passes this
description automatically to the Workbench. Alternatively, a Petri Net can be created using the
graphical editor of the Kernel.
Licence Agreement
PNK 2e is free software; you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation; version 2 of the license.
PNK 2e is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY. You are
NOT ALLOWED to CHANGE THE ORIGINAL COPYRIGHT NOTICE. See the GNU General
Public License for more details. You should have received a copy of the GNU General Public
License along with PNK 2e; if not see http://www.gnu.org/.
Quick start
This is a brief introduction to PNK 2e. The software requires Java version 1.2.2 or higher. The archive
PNK2e.zip can be downloaded from www.inf.fu-berlin.de/~trieglaf/PNK2e and contains
all necessary files. The Systems Biology Workbench is required to perform the simulation of the
Petri Net and is available at http://sbw.kgi.edu/.
Download and run the software
If the archive PNK2e.zip is extracted, a new directory PNK2e should be created in the current
directory. This directory contains several libraries in .jar format, the file PNK2e.jar which is the
program itself and several subdirectories:
• sampleNets - contains some example nets
• netTypeSpecifications - contains examples for a net’s specification
• toolSpecification - contains some tool specification examples
If anything goes wrong, first check if you have the correct version of Java installed by executing
java -version. Then try to find out if all necessary libraries are contained in the same direc-
tory as the .jar file. PNK 2e needs at least the libraries jaxp.jar, crimson.jar, SBWcore.jar and
SBMLreader.jar. The remaining libraries are needed for the translation of Petri Nets into CMDL
only.
Open, edit and save a Petri Net
The software can be started by double-clicking on the file PNK2e.jar (Windows) or by executing
java -jar PNK2e.jar (Linux and other operating systems). The main menu of PNK 2e should
appear. By clicking on the File menu, the user can open a file and load the net into the Kernel (see
screenshot A.1). The editor is opened automatically and displays the net.
Alternatively, the user can select New in the menu Open of the main menu to create a new Petri Net.
PNK 2e can edit Stochastic Petri Nets, Biological Nets (SPNs with simplifications for biological
reactions) and Generalized Stochastic Nets (SPNs with inhibitor arcs and immediate transitions).
Depending on this choice, the main menu changes to the editor menu. This menu offers the user
the possibility to draw places and transitions by simply choosing the type of node to be drawn and
by clicking into the editor pane. Arcs can be drawn by first clicking on the source node and then
on the target node. PNK 2e also contains a function to automatically arrange a net. This function
is called DoNetLayout and is available in the main menu.
Figure A.1: PNK 2e after loading the SPN representation of a genetic oscillator.
Simulating a net
PNK 2e can simulate a Stochastic Net with or without the developed simplifications for biological
reactions. The simulation is conducted in the form of a "token game": a transition that fires is
coloured for a few milliseconds, and the flow of tokens through the net is visualised.
This simulation is available in the main menu under stochastic simulation.
This simulation is the correct way to simulate the net and gives a good idea of its dynamics.
However, it is not well suited for large nets since it is very slow. Furthermore, no data is collected
during the simulation run. If a more detailed analysis of the simulation is required, the user can choose
the entry ConnectToSBW in the main menu. PNK 2e then opens a window with a list of all services
in the Systems Biology Workbench that are available on this computer. For a description of the
SBW and on how to install new modules and services, have a look at the manual of the SBW
which is available at http://sbw.kgi.edu/. We recommend installing the simulator Dizzy in
addition to the workbench because this software offers several simulation algorithms, stochastic
and deterministic, and works very well with PNK 2e. However, any simulation software that is
compatible with the Systems Biology Workbench and implements the Gillespie algorithm or one
of its improved versions can be used.
Figure A.2: The simulation interface of PNK 2e.
If Dizzy is installed, the list of SBW services should contain an entry Dizzy simulation service. Af-
ter clicking on this entry, the Dizzy simulation window opens (see screenshot A.2). The window
contains a list with all available simulation algorithms. Start and end time of the simulation can
be chosen. In case of a stochastic simulation, the user can also decide to average the result over
several runs. If a deterministic simulation is chosen, the user also has to choose the step size and
the maximum relative and absolute error.
The simulation is started by clicking on the Start button on the left side of the simulation window.
The simulation can be paused by clicking on Pause and resumed by clicking on Resume. At the
end of the simulation, results can be plotted or written to a file.
Appendix B
Delay-based Oscillatory Network
Parameters in the stochastic model
Reaction number | Reaction step | Probability of reaction | Description
1 | G + PN --a1--> GPN | w1 = a1×G×PN/Ω | First protein binds to gene
2 | GPN --d1--> G + PN | w2 = d1×GPN | Dissociation of first protein
3 | GPN + PN --a2--> GPN2 | w3 = a2×GPN×PN/Ω | Second protein binds to gene
4 | GPN2 --d2--> GPN + PN | w4 = d2×GPN2 | Dissociation of second protein
5 | GPN2 + PN --a3--> GPN3 | w5 = a3×GPN2×PN/Ω | Third protein binds to gene
6 | GPN3 --d3--> GPN2 + PN | w6 = d3×GPN3 | Dissociation of third protein
7 | GPN3 + PN --a4--> GPN4 | w7 = a4×GPN3×PN/Ω | Fourth protein binds to gene
8 | GPN4 --d4--> GPN3 + PN | w8 = d4×GPN4 | Dissociation of fourth protein
9 | G --vs--> MP + G | w9 = vs×G | Gene expression
10 | GPN --vs--> MP + GPN | w10 = vs×GPN | Gene expression
11 | GPN2 --vs--> MP + GPN2 | w11 = vs×GPN2 | Gene expression
12 | GPN3 --vs--> MP + GPN3 | w12 = vs×GPN3 | Gene expression
13 | MP + Em --km1--> Cm | w13 = km1×MP×Em/Ω | Degradation of mRNA (1)
14 | Cm --km2--> MP + Em | w14 = km2×Cm | Degradation of mRNA (2)
15 | Cm --km3--> Em | w15 = km3×Cm | Degradation of mRNA (3)
Table B.1: The reaction steps of the stochastic model (part 1)
Reaction number | Reaction step | Probability of reaction | Description
16 | MP --ks--> MP + P0 | w16 = ks×MP | Translation of mRNA
17 | P0 + E1 --k11--> C1 | w17 = k11×P0×E1/Ω | First Phosphorylation (1)
18 | C1 --k12--> P0 + E1 | w18 = k12×C1 | First Phosphorylation (2)
19 | C1 --k13--> P1 + E1 | w19 = k13×C1 | First Phosphorylation (3)
20 | P1 + E2 --k21--> C2 | w20 = k21×P1×E2/Ω | First Dephosphorylation (1)
21 | C2 --k22--> P1 + E2 | w21 = k22×C2 | First Dephosphorylation (2)
22 | C2 --k23--> P0 + E2 | w22 = k23×C2 | First Dephosphorylation (3)
23 | P1 + E3 --k31--> C3 | w23 = k31×P1×E3/Ω | Second Phosphorylation (1)
24 | C3 --k32--> P1 + E3 | w24 = k32×C3 | Second Phosphorylation (2)
25 | C3 --k33--> P2 + E3 | w25 = k33×C3 | Second Phosphorylation (3)
26 | P2 + E4 --k41--> C4 | w26 = k41×P2×E4/Ω | Second Dephosphorylation (1)
27 | C4 --k42--> P2 + E4 | w27 = k42×C4 | Second Dephosphorylation (2)
28 | C4 --k43--> P1 + E4 | w28 = k43×C4 | Second Dephosphorylation (3)
29 | P2 + Ed --kd1--> Cd | w29 = kd1×P2×Ed/Ω | Degradation of P2 (1)
30 | Cd --kd2--> P2 + Ed | w30 = kd2×Cd | Degradation of P2 (2)
31 | Cd --kd3--> Ed | w31 = kd3×Cd | Degradation of P2 (3)
32 | P2 --k1--> PN | w32 = k1×P2 | Diffusion of protein into nucleus
33 | PN --k2--> P2 | w33 = k2×PN | Diffusion of protein into cytosol
Table B.2: The reaction steps of the stochastic model (part 2)
This table shows the detailed reaction steps in the stochastic model together with their transition
probabilities and a short description. The kinetic constants are given in table B.3. Figure B.1 shows
the SPN representation of these reaction steps.
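To make the table concrete, the propensities of the binding and dissociation steps 1-8 can be evaluated from a given state as follows. This is a sketch using the rate constants of table B.3; the state below is invented for illustration:

```python
OMEGA = 50

def binding_propensities(state, omega=OMEGA):
    """Propensities w1..w8 of the protein-DNA binding steps in table B.1,
    with rate constants from table B.3 (a1 = Omega, d1 = 160*Omega, ...)."""
    a = [1 * omega, 10 * omega, 100 * omega, 100 * omega]   # a1..a4
    d = [160 * omega, 100 * omega, 10 * omega, 10 * omega]  # d1..d4
    G = [state["G"], state["GPN"], state["GPN2"],
         state["GPN3"], state["GPN4"]]
    w = []
    for i in range(4):
        w.append(a[i] * G[i] * state["PN"] / omega)  # binding step 2i+1
        w.append(d[i] * G[i + 1])                    # dissociation step 2i+2
    return w

# Invented state: the free gene plus 120 nuclear protein molecules.
state = {"G": 1, "GPN": 0, "GPN2": 0, "GPN3": 0, "GPN4": 0, "PN": 120}
w = binding_propensities(state)
print(w)   # w1 = a1*G*PN/Omega = 50*1*120/50 = 120; all other steps are 0
```

Since only one copy of the gene exists in each of its states, exactly one binding or dissociation step is possible at any time, which is what the zero propensities express.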
Reaction step | Parameter values
1/2 | a1 = Ω mol−1 h−1, d1 = (160×Ω) h−1
3/4 | a2 = (10×Ω) mol−1 h−1, d2 = (100×Ω) h−1
5/6 | a3 = (100×Ω) mol−1 h−1, d3 = (10×Ω) h−1
7/8 | a4 = (100×Ω) mol−1 h−1, d4 = (10×Ω) h−1
9-12 | vs = (0.5×Ω) mol−1 h−1
13-15 | km1 = 165 mol−1 h−1, km2 = 30 h−1, km3 = 3 h−1; Em tot = Em + Cm = (0.1×Ω) mol
16 | ks = 2.0 h−1
17-19 | k11 = 146.6 mol−1 h−1, k12 = 200 h−1, k13 = 20 h−1; E1 tot = E1 + C1 = (0.3×Ω) mol
20-22 | k21 = 82.5 mol−1 h−1, k22 = 150 h−1, k23 = 15 h−1; E2 tot = E2 + C2 = (0.2×Ω) mol
23-25 | k31 = 146.6 mol−1 h−1, k32 = 200 h−1, k33 = 20 h−1; E3 tot = E3 + C3 = (0.3×Ω) mol
26-28 | k41 = 82.5 mol−1 h−1, k42 = 150 h−1, k43 = 15 h−1; E4 tot = E4 + C4 = (0.2×Ω) mol
29-31 | kd1 = 1650 mol−1 h−1, kd2 = 150 h−1, kd3 = 15 h−1; Ed tot = Ed + Cd = (0.1×Ω) mol
32/33 | k1 = 2.0 h−1, k2 = 1.0 h−1
Table B.3: Parameter values used for the stochastic simulations (mol = molecules). The numbered
steps refer to the reaction steps in tables B.1 and B.2.
Figure B.1: SPN representation of the circadian clock in Neurospora. Transitions are numbered
according to the reaction steps in table B.2. Places are numbered according to the chemical species
in the model. The figure gives the simplified Petri Net representation of this model. Double squares
represent reversible reactions. The numbers refer to the reactions in table B.3.
Kinetic equations of the deterministic model
In the model given by the scheme in 5.3, the time evolution of the concentrations of mRNA (MP) and
clock protein (P0, P1, P2 or PN) is governed by the following set of differential equations:

dMP/dt = vs × KI^n / (KI^n + PN^n) − vm × MP / (Km + MP)    (B.1)
dP0/dt = ks × MP − v1 × P0 / (K1 + P0) + v2 × P1 / (K2 + P1)    (B.2)
dP1/dt = v1 × P0 / (K1 + P0) − v2 × P1 / (K2 + P1) − v3 × P1 / (K3 + P1) + v4 × P2 / (K4 + P2)    (B.3)
dP2/dt = v3 × P1 / (K3 + P1) − v4 × P2 / (K4 + P2) − vd × P2 / (Kd + P2) − k1 × P2 + k2 × PN    (B.4)
dPN/dt = k1 × P2 − k2 × PN    (B.5)

The results in 5.4(a) have been obtained by numerical integration of the equations above for the following
parameter values: KI = 2 nM, n = 4, vs = 0.5 nM h−1, vm = 0.3 nM h−1, Km = 0.2 nM, ks = 2.0 h−1,
v1 = 6.0 nM h−1, K1 = 1.5 nM, v2 = 3.0 nM h−1, K2 = 2.0 nM, v3 = 6.0 nM h−1, K3 = 1.5 nM,
v4 = 3.0 nM h−1, K4 = 2.0 nM, vd = 1.5 nM h−1, Kd = 0.1 nM, k1 = 2.0 h−1, k2 = 1.0 h−1.
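For these parameter values, the deterministic system can be integrated numerically, for instance with SciPy. This is a sketch of the setup; the solver, tolerances and initial conditions are our own choices, so the trajectories may differ in detail from the original figures:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Parameters from the list above
KI, n = 2.0, 4
vs, vm, Km, ks = 0.5, 0.3, 0.2, 2.0
v1, K1, v2, K2 = 6.0, 1.5, 3.0, 2.0
v3, K3, v4, K4 = 6.0, 1.5, 3.0, 2.0
vd, Kd, k1, k2 = 1.5, 0.1, 2.0, 1.0

def clock(t, y):
    """Right-hand side of equations (B.1)-(B.5)."""
    MP, P0, P1, P2, PN = y
    dMP = vs * KI**n / (KI**n + PN**n) - vm * MP / (Km + MP)
    dP0 = ks * MP - v1 * P0 / (K1 + P0) + v2 * P1 / (K2 + P1)
    dP1 = (v1 * P0 / (K1 + P0) - v2 * P1 / (K2 + P1)
           - v3 * P1 / (K3 + P1) + v4 * P2 / (K4 + P2))
    dP2 = (v3 * P1 / (K3 + P1) - v4 * P2 / (K4 + P2)
           - vd * P2 / (Kd + P2) - k1 * P2 + k2 * PN)
    dPN = k1 * P2 - k2 * PN
    return [dMP, dP0, dP1, dP2, dPN]

sol = solve_ivp(clock, (0, 240), [0.5, 0.5, 0.5, 0.5, 0.5],
                dense_output=True, rtol=1e-8, atol=1e-10)
MP = sol.sol(np.linspace(120, 240, 2400))[0]   # sample after the transients
print("MP range: %.2f to %.2f nM" % (MP.min(), MP.max()))
```

Sampling only after t = 120 h discards the transient, so the printed range reflects the limit-cycle oscillation of the mRNA concentration.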
Appendix C
Genetic Oscillator based on Hysteresis
Reaction number | Reaction step | Probability of reaction | Description
1 | A + DA --γA--> D′A | w1 = γA×A×DA/Ω | Protein A activates gene DA
2 | D′A --θA--> A + DA | w2 = θA×D′A | Protein A dissociates from D′A
3 | DA --αA--> MA + DA | w3 = αA×DA | DA is transcribed
4 | D′A --α′A--> MA + D′A | w4 = α′A×D′A | D′A is transcribed
5 | MA --δMA--> ∅ | w5 = δMA×MA | Degradation of MA
6 | MA --βA--> A + MA | w6 = βA×MA | Translation of MA
7 | A + DR --γR--> D′R | w7 = γR×DR×A/Ω | Protein A activates DR
8 | D′R --θR--> DR + A | w8 = θR×D′R | Protein A dissociates from D′R
9 | DR --αR--> MR + DR | w9 = αR×DR | DR is transcribed
10 | D′R --α′R--> MR + D′R | w10 = α′R×D′R | D′R is transcribed
11 | MR --δMR--> ∅ | w11 = δMR×MR | Degradation of MR
12 | MR --βR--> R | w12 = βR×MR | Translation of MR
13 | A --δA--> ∅ | w13 = δA×A | Degradation of A
14 | R --δR--> ∅ | w14 = δR×R | Degradation of R
15 | A + R --γC--> C | w15 = γC×A×R/Ω | Synthesis of C
16 | C --δA--> R | w16 = δA×C | Degradation of A (2)
Table C.1: Reaction steps in the stochastic formulation of the circadian clock model by Vilar et al. (2002)
together with their probabilities.
Table C.1 gives the reaction steps in the stochastic model of the genetic oscillator by (Vilar et al. 2002).
These steps have been obtained by decomposing the deterministic model governed by a set of differential
equations into its elementary steps. Compared to the model by Gonze, Halloy & Goldbeter (2002),
this model is far more general. Enzymatic degradations are not completely decomposed into all elementary
steps as described by the Michaelis-Menten kinetics but are represented by a single reaction only.
Reaction step | Parameter values
1/2 | γA = Ω mol−1 h−1, θA = (50×Ω) h−1
3/4 | αA = (50×Ω) h−1, α′A = (500×Ω) h−1
5/6 | δMA = 10 h−1, βA = 50 h−1
7/8 | γR = Ω mol−1 h−1, θR = (100×Ω) h−1
9/10 | αR = (0.01×Ω) h−1, α′R = (50×Ω) h−1
11/12 | δMR = 0.5 h−1, βR = 5 h−1
13/14 | δA = 1 h−1, δR = 0.2 h−1
15/16 | γC = 2 h−1, δA = 1 h−1
Table C.2: Parameter values for the circadian clock based on hysteresis
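The reaction probabilities of Table C.1 map directly onto Gillespie's direct method (Gillespie 1977): the waiting time to the next reaction is exponentially distributed with parameter w1 + … + w16, and reaction j fires with probability wj divided by that sum. The following pure-Python sketch illustrates this for the table above; the species ordering, variable names and the choice Ω = 1 are assumptions made for this illustration, not part of the PNK 2e implementation.

```python
import random

# Species vector: (DA, D'A, MA, A, DR, D'R, MR, R, C).
# Parameter values follow Table C.2 with Omega = 1 (assumed here).
OMEGA = 1.0
gA, thA, aA, aA2, dMA, bA = 1.0, 50.0, 50.0, 500.0, 10.0, 50.0
gR, thR, aR, aR2, dMR, bR = 1.0, 100.0, 0.01, 50.0, 0.5, 5.0
dA, dR, gC = 1.0, 0.2, 2.0

def propensities(s):
    """Reaction probabilities w1..w16 of Table C.1 for state s."""
    DA, DAp, MA, A, DR, DRp, MR, R, C = s
    return [gA * A * DA / OMEGA, thA * DAp, aA * DA, aA2 * DAp,
            dMA * MA, bA * MA,
            gR * DR * A / OMEGA, thR * DRp, aR * DR, aR2 * DRp,
            dMR * MR, bR * MR,
            dA * A, dR * R, gC * A * R / OMEGA, dA * C]

# State-change vectors, one per reaction, in the same order.
CHANGES = [
    (-1, 1, 0, -1, 0, 0, 0, 0, 0), (1, -1, 0, 1, 0, 0, 0, 0, 0),
    (0, 0, 1, 0, 0, 0, 0, 0, 0),   (0, 0, 1, 0, 0, 0, 0, 0, 0),
    (0, 0, -1, 0, 0, 0, 0, 0, 0),  (0, 0, 0, 1, 0, 0, 0, 0, 0),
    (0, 0, 0, -1, -1, 1, 0, 0, 0), (0, 0, 0, 1, 1, -1, 0, 0, 0),
    (0, 0, 0, 0, 0, 0, 1, 0, 0),   (0, 0, 0, 0, 0, 0, 1, 0, 0),
    (0, 0, 0, 0, 0, 0, -1, 0, 0),  (0, 0, 0, 0, 0, 0, 0, 1, 0),
    (0, 0, 0, -1, 0, 0, 0, 0, 0),  (0, 0, 0, 0, 0, 0, 0, -1, 0),
    (0, 0, 0, -1, 0, 0, 0, -1, 1), (0, 0, 0, 0, 0, 0, 0, 1, -1),
]

def gillespie(state, t_end, seed=0):
    """Gillespie's direct method; returns the trajectory [(t, state), ...]."""
    rng = random.Random(seed)
    t, traj = 0.0, [(0.0, state)]
    while t < t_end:
        w = propensities(state)
        w0 = sum(w)
        if w0 == 0.0:                       # no reaction can fire any more
            break
        t += rng.expovariate(w0)            # exponential waiting time
        r, acc = rng.random() * w0, 0.0     # pick reaction j with prob w_j / w0
        for j, wj in enumerate(w):
            acc += wj
            if r < acc:
                break
        state = tuple(x + dx for x, dx in zip(state, CHANGES[j]))
        traj.append((t, state))
    return traj

# One copy of each gene, no mRNA or protein at t = 0; simulate 0.2 h.
traj = gillespie((1, 0, 0, 0, 1, 0, 0, 0, 0), t_end=0.2)
```

Note that the reversible binding reactions 1/2 and 7/8 conserve the total copy number of each gene, so DA + D′A and DR + D′R remain constant along any trajectory.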
Kinetic equations of the deterministic model
The deterministic formulation of the genetic oscillator is given by the following equations:
dDA/dt  = θA D′A − γA DA A                                          (C.1)
dDR/dt  = θR D′R − γR DR A                                          (C.2)
dD′A/dt = γA DA A − θA D′A                                          (C.3)
dD′R/dt = γR DR A − θR D′R                                          (C.4)
dMA/dt  = α′A D′A + αA DA − δMA MA                                  (C.5)
dA/dt   = βA MA + θA D′A + θR D′R − A (γA DA + γR DR + γC R + δA)   (C.6)
dMR/dt  = α′R D′R + αR DR − δMR MR                                  (C.7)
dR/dt   = βR MR − γC A R + δA C − δR R                              (C.8)
dC/dt   = γC A R − δA C                                             (C.9)
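For comparison with the stochastic trajectories, the deterministic system (C.1)-(C.9) can be integrated with any standard ODE solver. The sketch below uses a fixed-step classical Runge-Kutta (RK4) scheme in pure Python; the initial condition, step size, variable names and the choice Ω = 1 are assumptions made for this illustration.

```python
# Fixed-step classical Runge-Kutta (RK4) integration of (C.1)-(C.9).
# Parameter values follow Table C.2 with Omega = 1 (assumed here).
gA, thA, aA, aA2, dMA, bA = 1.0, 50.0, 50.0, 500.0, 10.0, 50.0
gR, thR, aR, aR2, dMR, bR = 1.0, 100.0, 0.01, 50.0, 0.5, 5.0
dA, dR, gC = 1.0, 0.2, 2.0

def rhs(y):
    """Right-hand sides of (C.1)-(C.9); y = [DA, DR, D'A, D'R, MA, A, MR, R, C]."""
    DA, DR, DAp, DRp, MA, A, MR, R, C = y
    return [thA * DAp - gA * DA * A,                          # (C.1)
            thR * DRp - gR * DR * A,                          # (C.2)
            gA * DA * A - thA * DAp,                          # (C.3)
            gR * DR * A - thR * DRp,                          # (C.4)
            aA2 * DAp + aA * DA - dMA * MA,                   # (C.5)
            bA * MA + thA * DAp + thR * DRp
                - A * (gA * DA + gR * DR + gC * R + dA),      # (C.6)
            aR2 * DRp + aR * DR - dMR * MR,                   # (C.7)
            bR * MR - gC * A * R + dA * C - dR * R,           # (C.8)
            gC * A * R - dA * C]                              # (C.9)

def rk4(y, t_end, h=5e-4):
    """Integrate y' = rhs(y) from t = 0 to t_end with step size h."""
    t = 0.0
    while t < t_end:
        k1 = rhs(y)
        k2 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
        k3 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
        k4 = rhs([yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + (h / 6.0) * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

# One copy of each gene, everything else zero; integrate half an hour.
y = rk4([1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], t_end=0.5)
```

Since equations (C.1)/(C.3) and (C.2)/(C.4) sum to zero, the totals DA + D′A and DR + D′R are conserved, which provides a simple sanity check on any numerical solution.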
SPN representation of the reaction steps
Figure C.1: SPN representation of the hysteresis-based model. This net is the graphical representation of
the individual reaction steps. The values of the rate constants are given in Table C.2.
Appendix D
Glossary of biological terms
This appendix provides an overview of the most important biological terms used in this dissertation. It is
aimed at readers who are not familiar with the foundations of molecular biology. The explanations are adapted
from Alberts, Johnson, Lewis, Raff, Roberts & Walter (2002).
DNA (deoxyribonucleic acid) Long chain of acidic molecules formed from covalently linked deoxyri-
bonucleotide units. It serves as the store of hereditary information within a cell and the carrier of
this information from generation to generation.
gene Region of DNA that controls a hereditary characteristic, usually corresponding to a single protein or
RNA. This definition includes the entire functional unit, encompassing coding DNA sequences and
noncoding regulatory DNA sequences (promoter sequences).
messenger RNA (mRNA) RNA molecule that specifies the amino acid sequence of a protein. Produced by the
enzyme RNA polymerase as a complementary copy of DNA. It is translated into protein in a process
catalyzed by ribosomes.
protein The major macromolecular constituent of cells. A linear polymer of amino acids linked together
by peptide bonds in a specific sequence.
transcription Copying of one strand of DNA into a complementary RNA sequence by the enzyme RNA
polymerase.
translation Process by which the sequence of nucleotides in a messenger RNA molecule directs the incor-
poration of amino acids into protein. It occurs on a ribosome.
Bibliography
Alberts, Bruce, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts & Peter Walter (2002), Mole-
cular Biology of the Cell, Garland Publishing.
Arkin, Adam, John Ross & Harley H. McAdams (1998), 'Stochastic kinetic analysis of developmental
pathway bifurcation in phage lambda-infected Escherichia coli cells', Genetics 149(4), 1633–1648.
Barkai, Naama & Stanislas Leibler (2000), ‘Biological rhythms: Circadian clocks limited by noise’, Nature
403, 267–268.
Bloom, James D. (2003), PIPE - a platform independent Petri net editor, Master's thesis, Imperial College
London, http://freshmeat.net/projects/petri-net.
Chiola, G., G. Franceschinis, R. Gaeta & M. Ribaudo (1995), 'GreatSPN 1.7 - a graphical editor and analyzer
for timed and stochastic Petri nets', Performance Evaluation 24(1-2), 47–68.
Cox, Michael & David L. Nelson (2004), Lehninger - Principles of Biochemistry, 4 edn, Palgrave Macmil-
lan.
Deavours, D. D., W. D. Obal II, M. A. Qureshi, W. H. Sanders & A. P. A. van Moorsel (1995), 'UltraSAN
version 3 overview', Proceedings of the AIAA Computing in Aerospace 10 Conference pp. 327–338.
Dunlap, Jay C. (1999), 'Molecular bases for circadian clocks', Cell 96(2), 271–290.
Finney, Andrew & Michael Hucka (2003), Systems Biology Markup Language (SBML) level 2: Structures and
facilities for model definitions, Technical report, Systems Biology Workbench Development Group,
California Institute of Technology.
Forger, Daniel B. & Charles S. Peskin (2005), 'Stochastic simulation of the mammalian circadian clock', PNAS
102(2), 321–324.
Gibson, Michael A. & Jehoshua Bruck (2000), ‘Efficient exact stochastic simulation of chemical systems
with many species and many channels’, J. Phys. Chem. A 104, 1876–1889.
Gillespie, Daniel T. (1976), ‘A general method for numerically simulating the stochastic time evolution of
coupled chemical reactions’, J. Comput. Phys. 22, 403–434.
Gillespie, Daniel T. (1977), ‘Exact stochastic simulation of coupled chemical reactions’, J. Phys. Chem.
81, 2340–2361.
Gonze, D., J. Halloy & P. Gaspard (2002), ‘Biochemical clocks and molecular noise: Theoretical study of
robustness factors’, Journal of Chemical Physics 116(24), 10997–11010.
Gonze, Didier, Jose Halloy & Albert Goldbeter (2002), ‘Robustness of circadian rhythms with respect to
molecular noise’, PNAS 99(2), 673–678.
http://www.pnas.org/cgi/content/abstract/99/2/673
Gonze, Didier, Samuel Bernard, Christian Waltermann, Achim Kramer & Hanspeter Herzel (2005), ‘Spon-
taneous synchronization of coupled circadian oscillators’, Biophys. J. 89(1), 120–129.
Goss, Peter J.E. & Jean Peccoud (1998), ‘Quantitative modeling of stochastic systems in molecular biology
by using stochastic petri nets’, PNAS 95(12), 6750–6755.
Hucka, M., A. Finney, H.M. Sauro, H. Bolouri, J. Doyle & H. Kitano (2002), 'The ERATO Systems Biology
Workbench: Enabling interaction and exchange between software tools for computational biology',
Proceedings of the Pacific Symposium on Biocomputing.
Kindler, E. & M. Weber (2001), 'The Petri Net Kernel - an infrastructure for building Petri net tools', Software
Tools for Technology Transfer 3(4), 486–497.
Kurtz, Thomas G. (1971), ‘The relationship between stochastic and deterministic models for chemical reac-
tions’, The Journal of Chemical Physics 57(7), 2976–2978.
Leloup, Jean-Christophe, Didier Gonze & Albert Goldbeter (1999), 'Limit cycle models for circadian
rhythms based on transcriptional regulation in Drosophila and Neurospora', J Biol Rhythms 14(6), 433–
448.
Marsan, M. Ajmone, G. Balbo, G. Conte, S. Donatelli & G. Franceschinis (1995), Modelling with General-
ized Stochastic Petri Nets, Wiley Series in Parallel Computing.
Matsuno, H., A. Doi, M. Nagasaki & S. Miyano (2000), 'Hybrid Petri net representation of gene regulatory
networks', Proc. Pacific Symposium on Biocomputing pp. 338–349.
Matsuno, H., Y. Tanaka, H. Aoshima, A. Doi, M. Matsui & S. Miyano (2003), 'Biopathways representation
and simulation on hybrid functional Petri net', In Silico Biology 3(3), 389–404.
Merrow, Martha W., Norman Y. Garceau & Jay C. Dunlap (1997), Dissection of a circadian oscillation into
discrete domains, Vol. 94.
Petri, Carl Adam (1962), Kommunikation mit Automaten (Communicating with automata), PhD thesis,
University of Bonn, Institut für Instrumentelle Mathematik.
Ptashne, M. (1992), A genetic switch: Phage λ and Higher Organisms, Blackwell Science.
Ramsey, S., D. Orrell & H. Bolouri (2005), 'Dizzy: stochastic simulation of large-scale genetic regulatory
networks', J. Bioinf. Comp. Biol. 3(2), 415–436.
Reddy, Venkatramana N., Michael N. Liebman & Michael L. Mavrovouniotis (1993), 'Petri net representations
in metabolic pathways', Proc Int Conf Intell Syst Mol Biol 1, 328–336.
Ross, Sheldon M. (1996), Stochastic Processes, 2 edn, Wiley Series in Probability and Mathematical Statistics.
Shaw, O., A. Koelmans, J. Steggles & A. Wipat (2004), 'Applying Petri nets to systems biology using
XML technologies', Proceedings of the Workshop on the Definition, Implementation and Application of
a Standard Interchange Format for Petri Nets (26), 11–25.
Srivastava, R., M.S. Peterson & W.E. Bentley (2001), 'Stochastic kinetic analysis of the Escherichia coli
stress circuit using σ32-targeted antisense', Biotechnol. Bioeng. 75(1), 120–129.
Vilar, Jose M.G., Hao Yuan Kueh, Naama Barkai & Stanislas Leibler (2002), ‘Mechanisms of noise-
resistance in genetic oscillators’, Proc Natl Acad Sci USA 99(9), 5988–5992.
Weber, Michael & Ekkart Kindler (2002), 'The Petri Net Markup Language', Petri Net Technology for
Communication Based Systems, Advances in Petri Nets.