Modeling for (physical) biologists: an introduction to the rule-based approach

2015 Phys. Biol. 12 045007 (http://iopscience.iop.org/1478-3975/12/4/045007)


Phys. Biol. 12 (2015) 045007 doi:10.1088/1478-3975/12/4/045007

PAPER

Modeling for (physical) biologists: an introduction to the rule-based approach

Lily A Chylek1,2, Leonard A Harris3, James R Faeder4 and William S Hlavacek2,5

1 Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853, USA

2 Theoretical Biology and Biophysics Group, Theoretical Division and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA

3 Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, TN 37212, USA

4 Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, USA

5 New Mexico Consortium, Los Alamos, NM 87544, USA

E-mail: [email protected] [email protected]

Keywords: rule-based modeling, systems biology, cell signaling

Supplementary material for this article is available online

Abstract

Models that capture the chemical kinetics of cellular regulatory networks can be specified in terms of rules for biomolecular interactions. A rule defines a generalized reaction, meaning a reaction that permits multiple reactants, each capable of participating in a characteristic transformation and each possessing certain, specified properties, which may be local, such as the state of a particular site or domain of a protein. In other words, a rule defines a transformation and the properties that reactants must possess to participate in the transformation. A rule also provides a rate law. A rule-based approach to modeling enables consideration of mechanistic details at the level of functional sites of biomolecules and provides a facile and visual means for constructing computational models, which can be analyzed to study how system-level behaviors emerge from component interactions.

Introduction

A model is a representation or imitation of a system. A model is especially useful when the system of interest is both complex and difficult to probe experimentally. Models have long been used to study biological systems, which are among the most complex systems studied in science and engineering. Models take many different forms. Two types of models that are familiar to biologists are (1) a living system amenable to experimental study (e.g., an animal in which characteristic features of a human disease are recapitulated, a mammalian cell line, or a model organism, such as Escherichia coli, Saccharomyces cerevisiae or Caenorhabditis elegans) and (2) a diagram, which is often used to graphically summarize one’s understanding of how a system operates.

Aspects of these two types of models, living systems and diagrams, are combined in mathematical/computational models. (We will hereafter refer primarily to computational models, because the vast majority of mathematical models are now analyzed with the aid of computers, and models formulated as computer programs, meaning computational models, are becoming more common and almost always have an underlying mathematical basis.) Like a diagram, a computational model is based on what a modeler knows or hypothesizes about a system. A difference is the greater precision of a computational model. Whereas a diagram tends to be ambiguous and qualitative, a computational model requires a modeler to make concrete, quantitative statements about how a system is believed to behave. The precision and rigor required to build a computational model can help to identify gaps in knowledge and understanding. When such gaps are encountered, a modeler is required to generate hypotheses that fill the gaps. Knowledge and hypotheses are formalized and used to create a set of equations and/or a computer program, which can be used, for example, to simulate how the state of a system evolves over time in response to a stimulus or perturbation. In this way, a computational model shares an important feature of a living model system: both a computational model and a living model enable biological phenomena to be studied in a controlled manner, in a setting that allows questions about system behavior and mechanisms to be answered with relative ease. Both types of models suffer from the pitfall that the insights obtained may depend in an unknown way on the degree to which the model at hand represents the true object of study, the biological/physiological system of interest. Nevertheless, models are essential for progress in biological research, because many questions simply cannot be studied without them.

Received 20 November 2014; revised 25 February 2015; accepted for publication 9 March 2015; published 15 July 2015. © 2015 IOP Publishing Ltd

Biologists routinely use diagrams to reason about biological systems, especially cellular regulatory networks. In the context of discussions about cell signaling systems, these diagrams are sometimes called pathway maps, or simply maps. Pathway maps, which are commonly drawn ad hoc, despite attempts to establish standardized conventions [1], can be viewed as a type of conceptual model. A map can be used to summarize and convey one’s understanding of how a system works or to provide a broad overview of what is known about a system. A limitation of a map is that its interpretation ultimately relies on the intuition of its reader, which is a serious weakness given the complexity of cellular regulatory networks.

The complexity of cellular networks arises from several sources. One is the ubiquitous presence of feedback and feedforward loops, which can generate exotic nonlinear behaviors, such as hysteresis and oscillations [2, 3]. Another complicating feature is the tendency of the biomolecules that comprise cellular regulatory networks to interact with multiple binding partners [4, 5]. This molecular promiscuity, or polyspecificity, can generate crosstalk, meaning influences between networks with distinct functions. Another complication is that biomolecular interactions and their consequences are sensitive to quantitative factors, including the copy numbers of binding partners, binding cooperativity, and binding affinities and lifetimes [6, 7]. In addition to these quantitative properties of biomolecular interactions, which can be viewed as intrinsic properties, interactions are also likely to be affected by extrinsic properties, such as compartmentalization [8] and binding competition within a compartment [9], which is a function of network connectivity. The features mentioned above can either alone or in combination give rise to non-intuitive system behaviors. These complicating features limit our ability to understand and manipulate cellular regulatory networks, and they beckon us to consider models more sophisticated and powerful than diagrams, such as computational models.

Computational models go beyond diagrams, because the logical consequences of the information used to formulate a model can usually be elucidated through computer-aided calculations, or simulations. The information that serves as the basis for a model may include a synthesis of multiple mechanistic insights, an integration of different types of quantitative data (e.g., measurements of protein copy numbers and dissociation constants), and plausible assumptions, which are often necessary to fill knowledge gaps as noted earlier. The analysis of a computational model can identify a potential system behavior that one would not have expected and suggest experiments to test the non-obvious (i.e., interesting) predictions of the model. Validation of such predictions increases one’s confidence in the model. Falsified predictions can also be useful if they lead to revisions that increase a model’s reliability and generality.

Modeling is not a monolithic practice. Different techniques are used to address different problems. Even the same problems are commonly attacked using a variety of techniques. Some models are formulated to study specific systems, whereas others are formulated to study phenomena found in many systems. The physical and chemical principles captured in models vary, and models incorporate biological knowledge at differing resolutions, from abstract and phenomenological to detailed and mechanistic [10]. Some approaches, such as logical modeling [11], are unconnected or only loosely coupled to physicochemical principles. This diversity reflects not only a need for different approaches to address different questions but also a lack of consensus about best practices.

Arguably, the field of computational systems biology is at a stage where many different modeling approaches are being tried partly because traditional modeling approaches, meaning those long used, mostly in non-biological fields, have shortcomings when applied to biological systems. Models based on ordinary differential equations (ODEs) are arguably the most popular type of physicochemical model employed in biology [12]; these models represent a traditional modeling approach. ODEs have been used to model (bio)chemical systems since before the existence of molecules was widely accepted [13, 14]. Although ODE models are undoubtedly useful, the ODE modeling approach when applied to cellular regulatory networks entails complicating requirements that call for new ideas about how to model [15–17]. Namely, this approach requires one to enumerate the potentially populated chemical species in a system. Because biomolecules can usually be found in numerous states and complexes, it is desirable to avoid this requirement.

Rule-based modeling allows one to avoid enumerating the potentially populated chemical species in a system; it is an approach tailored for modeling a biomolecular interaction network, or indeed, any system where structured objects interact via component parts in a modular way. Here, we provide a brief introduction to rule-based modeling in systems biology, which is characterized by the use of local rules to represent biomolecular interactions [18–24]. In biology, rule-based modeling has most often been used to study cell signaling systems, but this modeling paradigm is more broadly applicable [23, 25–27]. Indeed, rule-based modeling has antecedents in physics, chemistry, and computer science [22, 28–32]. Rules formalize mechanistic understanding of biomolecular interactions according to conventions that allow a set of rules to serve as an executable model, meaning a model that can be used to obtain simulations of the behavior of the system represented by the model. Simulations may leverage different algorithms to capture system behavior at varying levels of abstraction of the underlying physicochemical laws. For example, simulations can either account for or neglect fluctuations in copy numbers, depending on how the simulation algorithm relates to the chemical master equation [33]. Once a rule-based model is specified, it is relatively easy to perform multiple types of simulations, which facilitates analysis. For example, one can readily switch between stochastic and deterministic simulation methods [34]. A rule-based approach to model specification is most justified when one is concerned with biomolecular interactions that depend on and impact site-specific details, or biomolecular site dynamics [23], and moreover the interactions of interest are modular, meaning somewhat independent of each other.
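As a minimal sketch of what switching between deterministic and stochastic simulation can mean in practice, the Python code below treats one hypothetical first-order conversion, A -> B, in two ways: by integrating the rate equation dA/dt = -kA, and by a direct-method stochastic simulation that tracks integer copy numbers. The reaction, the rate constant, and the function names are assumptions made for illustration only; the code does not reproduce how BioNetGen or any other BNGL-compatible tool is implemented.

```python
import math
import random


def simulate_ode(a0, k, t_end, dt=1e-3):
    """Integrate dA/dt = -k*A with the forward Euler method; return A(t_end)."""
    a = float(a0)
    steps = int(round(t_end / dt))
    for _ in range(steps):
        a += -k * a * dt
    return a


def simulate_ssa(a0, k, t_end, rng):
    """Direct-method stochastic simulation of A -> B; return copies of A at t_end."""
    a = int(a0)
    t = 0.0
    while a > 0:
        propensity = k * a                # total event rate in the current state
        t += rng.expovariate(propensity)  # exponentially distributed waiting time
        if t > t_end:
            break
        a -= 1                            # one molecule of A is converted to B
    return a


if __name__ == "__main__":
    k, t_end, a0 = 1.0, 1.0, 1000         # hypothetical parameter values
    print("deterministic:", simulate_ode(a0, k, t_end))
    print("stochastic:   ", simulate_ssa(a0, k, t_end, random.Random(1)))
    print("analytic mean:", a0 * math.exp(-k * t_end))
```

The deterministic result is a real-valued concentration-like quantity, whereas each stochastic run returns an integer copy number that fluctuates around the same mean. Both functions take the same reaction and rate constant as input, which is the sense in which a single model specification can support more than one simulation method.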

Many rule-based models reported in the literature are simulated using an ODE solver operating on rule-derived ODEs, and so can be viewed as equivalent to ODE models that produce the same simulation results. However, rule-based models are specified in terms of rules instead of equations. The different approaches used to specify ODE and rule-based models are not superficial; the differences in the way these models are written (i.e., specified) encourage and usually entail different modeling assumptions.

This review, mainly through discussion of examples of rules and simple but complete executable models, is meant to help biologists, including those who do not specialize in modeling, create models for the systems they study. Models can serve as valuable reasoning aids, and with rule-based modeling, more widespread use of models is foreseeable. This review may also be helpful for experienced modelers new to rule-based modeling, or even practitioners of the approach, because the example models illustrate a variety of rule-based modeling capabilities and may therefore serve a valuable reference purpose. We will focus on one particular approach that can be used to specify and analyze rule-based models, the approach enabled by the BioNetGen language (BNGL) and BNGL-compatible software tools, such as BioNetGen [34–36]. Our focus will also be on models appropriate for well-mixed reaction compartments, because such models are generally easier to specify and analyze. With such models, there is no need to specify reaction compartment geometry or boundary conditions, for example. Later, we will mention other model-specification approaches and software tools (many with capabilities beyond what is offered by the BNGL framework) and point to sources of information about these methods. A tight focus is necessary to provide a practical introduction to rule-based modeling and concrete examples. After learning the principles taught here, we encourage readers to explore other approaches.

What makes models useful?

Models have the potential to aid biologists in several ways. First, a model can be used as a roadmap for experimental design. Experiments are used to test hypotheses, and the more complex a hypothesis, the more complicated and numerous the necessary experimental tests are likely to be. Models can potentially be used to carefully design experimental tests that would be optimal for supporting or disproving a hypothesis [37–39]. Second, models can also be used to reconcile surprising or conflicting data. In some cases, seemingly contradictory results may actually be compatible when quantitative details are taken into account [40]. Third, models can be used to consolidate knowledge about a system [41, 42]. Analogous to assembly of a jigsaw puzzle, models can be used to piece together available information to form a more complete picture of how a system works. Furthermore, by comparing model-based simulations to experimental data, discrepancies between understanding and reality can be identified, which may point to areas where additional pieces of information need to be discovered through further experiments.

What constitutes a useful model? The answer to this question depends partly, or even largely, on current modeling capabilities. For example, models that require computer-aided operations to obtain predictions of system behavior (e.g., generation of random numbers as in Monte Carlo methods [43]) only became useful after technological advances made analyses of such models practical. It has recently been argued that those pursuing useful models in biology should adhere to the following guidelines [44]: (1) ‘ask a question’, (2) ‘keep it simple’, and (3) ‘if the model cannot be falsified, it is not telling you anything’. The latter guideline represents valuable advice, because if a model cannot be falsified, it lies outside the realm of science. However, this guideline should perhaps be augmented with an admonition to seek models that make not only testable predictions but also unexpected predictions. The first two guidelines also represent valuable advice, although their interpretation should not be static. These guidelines must be tempered with an awareness of what current modeling methodology allows. For example, although ‘asking a question’ is necessary as a starting point, it does not mean that a model need be designed/used for only a single purpose to the exclusion of model reuse [45], which can be valuable and time saving. Similarly, ‘keeping it simple’ is partly defined by the tools and data available to a modeler; what appears complex using one modeling approach may actually be simple using another. For example, the development of rule-based modeling approaches, as we will see below, has made it relatively easy to specify models that would be cumbersome if not impossible to specify as ODEs.

Simplicity is viewed as a virtue, perhaps especially so in mathematical pursuits, such as modeling. In his famous lecture on mathematical problems, Hilbert stated [46], ‘what is clear and easily comprehended attracts, the complicated repels us’. A view among some, if not most, modelers is that many biochemical details elucidated by biologists are too complicated to contemplate including in models. Accordingly, many known biochemical details are omitted in models. Simple, abstract models, which may focus, for example, on capturing certain limited influences among molecular entities and processes, have certainly been useful [47, 48], and are likely to continue to be useful for a long time. However, there are many important questions that can now be feasibly addressed that depend on consideration of biochemical details to an extent beyond what is usually considered by modelers. Easing the consideration of mechanistic biochemical details in models for cellular regulatory systems was one of the driving motivations for the development of the rule-based modeling approach in systems biology [18]. With this approach, there is a new definition of ‘simple’. We should heed the saying attributed to Einstein [49], ‘everything should be as simple as possible, but not simpler’.

The ability to capture mechanistic details in a rule-based model has opened new frontiers, such as the development of ‘standard models’, which do not currently exist in most if not all of biology. Standard models in other fields, such as the Standard Model of particle physics [50], drive the activities of whole communities and tend to be detailed, because they consolidate knowledge and are useful in large part because they identify the outstanding gaps in understanding. Would standard models benefit biologists? An affirmative answer is suggested by the fact that there are many intricate cellular regulatory systems that have attracted enduring interest, such as the epidermal growth factor receptor (EGFR) signaling network [51, 52], which has been studied for decades for diverse reasons. Efforts to model EGFR signaling have been hailed as paradigmatic of systems biology [53]. A comprehensive, extensively tested, and largely validated model for the EGFR signaling network or any other well-studied system, meaning a standard model, would aid modelers by providing a trusted reusable starting point for asking not one but many questions.

Uses and advantages of rule-based modeling

Rules describe the interactions and processes in a system. Various established conventions exist for writing rules, such as BNGL [34], but in each of these, a modeler specifies (1) the properties required of reactants, (2) the outcome of a reaction, or, equivalently, the transformation that is applied to reactants to obtain products, and (3) the rate law that governs all reactions implied by the rule. A rule is minimal if it does not depend on any properties of the reactants other than those of the components that are modified by the reaction. For example, a minimal rule for association might only require that each of the two sites participating in formation of a non-covalent bond be free. The set of sites or molecular components modified by a rule is called the reaction center. Any additional requirements on the reactants are considered contextual. When the interactions in a system are known to be modular, the rules describing the system will incorporate relatively little information about molecular context and the contextual requirements that are expressed in rules will be local, such as a requirement that a site neighboring a reaction center be bound or in a particular modification state. Rules for the interactions in a cellular regulatory network provide a high-level and relatively easy-to-obtain representation of the network. As we will discuss later, the rules of a model can often be translated automatically into equations, which can then be analyzed using conventional numerical methods. In cases where translation of rules into equations is not possible, the rules of a model can be used as event generators in a discrete-event (stochastic) simulation algorithm.
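The three ingredients of a rule, together with the distinction between reaction center and context, can be sketched as a small Python data structure. This is an illustrative assumption about how one might represent a rule in code, not BNGL syntax and not the internal representation used by any BNGL-compatible tool; the site names (Y1, Y2), the state labels (U for unmodified, P for phosphorylated), and the rate constant values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """A rule: required site states, a transformation, and a rate constant."""
    name: str
    center: dict          # site -> (required state, new state): the reaction center
    rate_constant: float  # shared by every reaction implied by the rule
    context: dict = field(default_factory=dict)  # site -> required state, left unchanged

    def is_minimal(self):
        """A rule is minimal if it imposes no contextual requirements."""
        return not self.context

    def applies_to(self, molecule):
        """Check the requirements on the reaction center and on the context."""
        center_ok = all(molecule.get(site) == old
                        for site, (old, _) in self.center.items())
        context_ok = all(molecule.get(site) == state
                         for site, state in self.context.items())
        return center_ok and context_ok

    def apply(self, molecule):
        """Return a new molecule in which only reaction-center sites have changed."""
        if not self.applies_to(molecule):
            raise ValueError(f"rule {self.name!r} does not apply")
        result = dict(molecule)
        for site, (_, new) in self.center.items():
            result[site] = new
        return result


# A minimal rule: only the modified site is mentioned.
minimal = Rule("phosphorylate Y1", {"Y1": ("U", "P")}, rate_constant=0.1)

# A contextual rule: Y2 is modified only if the neighboring site Y1 is phosphorylated.
contextual = Rule("phosphorylate Y2 given Y1~P", {"Y2": ("U", "P")},
                  rate_constant=0.1, context={"Y1": "P"})

protein = {"Y1": "U", "Y2": "U"}
step1 = minimal.apply(protein)    # Y1 becomes phosphorylated
step2 = contextual.apply(step1)   # now the contextual rule can fire
```

In this sketch the minimal rule applies to the protein regardless of the state of Y2, whereas the contextual rule cannot be applied until Y1 has been phosphorylated, which mirrors the local contextual requirement described above.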

Because rules can include information about specific biomolecular sites, such as a constraint limiting an interaction between two proteins to cases where a particular tyrosine residue in one of the proteins is phosphorylated, rule-based modeling is ideal for representing biomolecular site dynamics, the changes in states occurring at the functional sites of biomolecules [23]. For proteins, these sites include conserved domains [54], such as Src homology 2 (SH2) and SH3 domains [55], as well as catalytic domains; short linear motifs [56]; and sites of post-translational modifications [57], such as an amino acid residue modified through covalent enzyme-catalyzed addition of a chemical group (e.g., a phosphoryl group) or a peptide bond that can be cleaved by a protease. The concept of a rule, which provides an abstract representation of an interaction (or process), should become clearer as we discuss examples.

Rules can be specified using many different conventions and means [35, 58–70]. Here, as indicated earlier, we will focus on use of BNGL [34] for specifying rules. Many of the available rule-based models, which are essentially collections of rules, have been formulated using this language [71] or can be readily recast in this language. Much of what is said here about BNGL and BNGL-compliant software also applies to other model-specification languages and the software compatible with these languages, which will be briefly discussed later.

Several thorough but informal descriptions of BNGL are available [34, 72, 73]. A formal description is also available [74]. BNGL allows for the specification of complete, executable models, which again can be viewed as collections of rules, as well as parameter values and other ancillary information (e.g., initial conditions). BNGL also allows for the specification of actions, which include simulation protocols, and simulation outputs. BNGL is a machine-readable (i.e., text-based) language designed for modeling cellular regulatory systems (especially cell signaling systems), which is akin to a domain-specific programming language. It is compatible with a number of general-purpose software tools, including BioNetGen [34, 35], RuleBuilder [75], RuleBender [76, 77], MOSBIE [78], DYNSTOC [79], RuleMonkey [80], and NFsim [81]. Once specified, a BNGL-encoded model can be processed to perform different operations, including model visualization; translation of a BNGL-encoded model specification into different formats, such as an equivalent systems biology markup language (SBML) encoding [12] or a MATLAB M-file or MEX-file [82]; and simulation [34]. A variety of simulation methods (and simulators) are available [23], as we will discuss later.

It is important to understand how rule-based models differ from traditionally formulated models to appreciate when a rule-based approach is warranted and most likely to be helpful. As indicated earlier, a key distinction between a rule-based model and a traditional equation-based model is the way that it is specified, but there tend to be important differences beyond the means of specification. We note that equation-based models that are comparable to rule-based models may take the form of ODEs for systems with well-mixed chemical kinetics or the form of partial differential equations (PDEs) for systems in which reaction and diffusion are coupled. Although one could in principle use either rules or equations to specify essentially the same model for a system of interest, the different model-specification approaches tend to engender different modeling assumptions. Indeed, when one compares models in the literature, such as the ODE-based models of Kholodenko et al [83] and Chen et al [84] for ErbB receptor signaling and the rule-based models of comparable scope of Blinov et al [85] and Creamer et al [86], one sees that the different types of models are based on very different assumptions. Equation-based models are based on assumptions about which chemical species are populated and therefore tracked. Thus, the size of a model corresponds to the number of chemical species taken to be populated in a system. Rule-based models, in contrast, are based on assumptions about the modularity of interactions, which affect the forms taken by rules. The size of a model corresponds to the number of interactions of interest. These differences arise from the distinct approaches to model specification, which have different requirements.

To specify an ODE or PDE model, for example, one must enumerate the chemical species that are potentially populated and write an equation for each. Thus, in modeling a cell signaling system with ODEs or PDEs, a modeler is obligated to specify which protein states and signaling complexes are populated. Unfortunately, there is usually no empirical data available to guide a modeler in deciding which chemical species and reactions are important, and a comprehensive enumeration of all possibly populated chemical species is often impracticable. When site dynamics are considered there may be more chemical species that could potentially be populated than there are molecules in a cell. This barrier to model specification has been called combinatorial complexity [15, 16]. We note that lack of information about which chemical species are populated in a system is an issue of concern even if one takes a rule-based modeling approach (which eliminates the requirement to enumerate chemical species), because this information, if available, would impose useful constraints on the rules of a model. However, the rules and parameters of a rule-based model implicitly identify important species and reactions, which can be revealed through simulation [87]. Thus, a rule-based modeling approach does not eliminate the problem of absence of knowledge about the important chemical species and reactions in a system but it does provide a tool for discovering this information starting from the known biomolecular interactions of the system.
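The growth that underlies combinatorial complexity can be made concrete with a short calculation. The Python sketch below assumes, purely for illustration, a single protein with n independent sites, where each site can be unmodified, phosphorylated, or phosphorylated and bound (three states linked by two unidirectional transitions). Under these assumptions the number of species is 3^n and the number of reactions is 2n·3^(n-1), whereas the number of modular rules is only 2n. The function name and default arguments are hypothetical.

```python
def network_size(n_sites, states_per_site=3, transitions_per_site=2):
    """Count species, reactions, and modular rules for n independent sites.

    Assumes every site passes through the same linear chain of states and
    that the state of one site never restricts what happens at another.
    """
    species = states_per_site ** n_sites
    # Each transition at one site can occur in every state of the other sites.
    reactions = n_sites * transitions_per_site * states_per_site ** (n_sites - 1)
    rules = n_sites * transitions_per_site
    return species, reactions, rules


for n in (1, 2, 5, 10):
    print(n, network_size(n))
```

For two sites the counts are 9 species, 12 reactions, and 4 rules; for ten sites they are 59049 species and 393660 reactions, but still only 20 rules. The number of equations in an ODE model grows exponentially with the number of sites, while the number of rules grows linearly.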

An example of combinatorial complexity is provided in figure 1(A), which illustrates the reaction network that arises in a model for a protein with two phosphorylation sites that can bind other proteins when phosphorylated. This model implies a reaction network with 9 distinct chemical species and 12 possible reactions, but it can be encoded with a set of four rules under the assumption that phosphorylation or binding at one site does not affect the rate of reactions taking place at the other site. The use of rules (figure 1(B)) circumvents the requirement to identify the possible chemical species and reactions in advance and greatly reduces the size of the model specification in this case (from 12 reactions to 4 rules). Each rule specifies the necessary and sufficient conditions for occurrence of the interaction represented by the rule. In our example, each rule only specifies the state of the site being modified by phosphorylation or binding, but not that of the other site. Each rule generates a reaction for each allowable set of reactant species it encounters, and furthermore, each reaction generated by a rule is parameterized by the same rate constant. The latter represents a coarse graining of the chemical kinetics, because in principle, each species may be associated with a unique reactivity, as suggested by figure 1(A). The effect of this modularity assumption is shown in figure 1(C), where each of the three reactions generated by each rule is governed by the same rate constant. This rule-based version of the model is thus not quite as general as the full reaction scheme.
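The way four modular rules expand into the network of figure 1(C) can be sketched in Python. The code below is an illustrative assumption, not the network-generation algorithm of BioNetGen: it represents a species as a pair of site states (U for unmodified, P for phosphorylated, B for phosphorylated and bound), starts from the fully unmodified protein, and applies every rule to every species it discovers. The rate constant labels k1 to k4 are placeholders and are not meant to match the indexing used in the figure.

```python
from collections import Counter

# Four modular rules: (site index, required state, new state, rate constant label).
# Each rule mentions only the site it modifies.
RULES = [
    (0, "U", "P", "k1"),  # phosphorylation of site 1
    (1, "U", "P", "k2"),  # phosphorylation of site 2
    (0, "P", "B", "k3"),  # binding at phosphorylated site 1
    (1, "P", "B", "k4"),  # binding at phosphorylated site 2
]


def generate_network(rules, seed):
    """Apply the rules repeatedly, starting from a seed species, until no
    new species appear. Returns the species and reactions that were found."""
    species = [seed]
    reactions = []
    frontier = [seed]
    while frontier:
        reactant = frontier.pop()
        for site, old, new, rate in rules:
            if reactant[site] != old:
                continue  # the rule's requirement on its own site is not met
            result = reactant[:site] + (new,) + reactant[site + 1:]
            reactions.append((reactant, result, rate))
            if result not in species:
                species.append(result)
                frontier.append(result)
    return species, reactions


species, reactions = generate_network(RULES, ("U", "U"))
per_rule = Counter(rate for _, _, rate in reactions)
print(len(species), len(reactions), dict(per_rule))
```

Under these assumptions the procedure finds 9 species and 12 reactions, with each rule contributing three reactions that share one rate constant, in agreement with the counts given above.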

Although rules tend to involve assumptions of modularity, rules can be made as precise as necessary to represent knowledge or hypotheses about the underlying biochemistry. It would be possible, for example, to use rules to fully specify the reaction network shown in figure 1(A) using a set of 12 rules having more contextual constraints than the four rules discussed above, making each of these rules as specific with regard to reactants as the 12 distinct reactions. Thus, these rules would be equivalent to or essentially the same as the reactions shown in figure 1(A). Such a specification would not be more concise than a traditional one, but it would have the advantage that the precise structural (i.e., site-specific) requirements for each reaction would be recorded in the rules.

Rule-based modeling can be viewed as a general-ization of traditional approaches to modeling

biochemical kinetics that lifts the burden of needing tomake an explicit listing of the chemical species andreactions that are important in a system. The burdenof enumerating species and reactions often leads toad hoc assumptions to limit the number of species andreactions included in a model. Two such assumptionsare illustrated in figure 1(D). Both reaction schemesinclude the same types of reactions (binding andphosphorylation) used in the full and rule-basedversions, but the majority of species and reactionsare eliminated. Although assumptions of sequential

Figure 1. Reaction network-based and rule-based representations of a simple biochemical system. (a) Reaction network for a protein (rounded box) that can be phosphorylated and modified at two sites. The unmodified forms and phosphorylated forms of each site are represented by an open circle and filled circle, respectively. Binding of another protein to a phosphorylated site is indicated by addition of an edge. Because of combinatorial complexity, there are a total of nine possible chemical species (labeled s1–s9) corresponding to distinct states of phosphorylation and binding, with a total of 12 possible (bio)chemical reactions (labeled r1–r12) connecting these states. Although such transformations are reversible in general, we have made the reactions unidirectional here for simplicity. (b) Rules with modularity assumptions that represent the network with simplifying rate assumptions. Each of the four rules describes a transformation, either phosphorylation or binding, that occurs at one of the two sites. Because each rule refers to only one site, the state of the other site affects neither the applicability of the rule nor the rate of the reactions that are generated. (c) Reaction network generated by the modular rules in (b). Labels on the reaction arrows indicate the rate constants governing each reaction. Each rule generates three reactions with identical rate constants as indicated. Although the number of species and reactions is the same as the network shown in (a), the number of parameters governing the rate equations in the model has been reduced from 12 to 4 by neglecting possible cooperative interactions between the sites. (d) Simplified reaction network models with assumptions to limit combinatorial complexity. Two types of common implicit assumptions are illustrated: sequential modification, where the modifications of the sites are assumed to occur in a particular order, and competitive binding, where only one binding partner is allowed to bind at a time. These assumptions limit both the reaction network size and the number of parameters. All but one of the reactions depicted has a corresponding transformation in the rule-based version of the model (c), as indicated by the index of the rate constants. A prime is used to denote approximate correspondence, because the underlying assumptions of the models differ. The competitive binding model also includes a lumped reaction in which both sites are assumed to be phosphorylated to enable binding of one or the other of the protein’s binding partners.


modification or competitive binding may be justified in some cases, expedience is often the (unstated) rationale. Because traditional modeling approaches also lack standard nomenclature for tracking bonds or post-translational modifications, the composition of species can be ambiguous and the nature of assumptions made in constructing a model difficult to assess.
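The species and reaction counts quoted for figure 1 can be checked with a short enumeration. The Python sketch below is illustrative only: it assumes that each site is in one of three states (unphosphorylated, phosphorylated, or phosphorylated with a partner bound), and the rate constant names k1 to k4 are placeholders that need not match the labels used in the figure.

```python
from itertools import product

# Assumed site states: "0" unphosphorylated, "P" phosphorylated,
# "PB" phosphorylated with a binding partner attached.
SITE_STATES = ("0", "P", "PB")

# Four modular rules: (site index, state before, state after, rate constant).
# The names k1..k4 are placeholders chosen for this sketch.
RULES = [
    (0, "0", "P", "k1"),   # phosphorylation of the first site
    (0, "P", "PB", "k2"),  # binding at the phosphorylated first site
    (1, "0", "P", "k3"),   # phosphorylation of the second site
    (1, "P", "PB", "k4"),  # binding at the phosphorylated second site
]


def enumerate_species():
    """List every combination of states of the two sites."""
    return list(product(SITE_STATES, repeat=2))


def generate_reactions(species):
    """Apply each rule to each species, ignoring the state of the other site."""
    reactions = []
    for reactant in species:
        for site, before, after, rate_constant in RULES:
            if reactant[site] == before:
                product_state = list(reactant)
                product_state[site] = after
                reactions.append((reactant, tuple(product_state), rate_constant))
    return reactions


species = enumerate_species()
reactions = generate_reactions(species)
rate_constants = {k for _, _, k in reactions}
print(len(species), len(reactions), len(rate_constants))
```

Under these assumptions the enumeration yields nine species and 12 reactions that share only four rate constants, in line with panels (a) to (c) of figure 1.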

Although cell signaling has been studied for decades, we have limited knowledge about how modifications and binding at different sites of a protein are coupled (e.g., [20, 24, 88]). We argue that rules expressing a high degree of modularity represent a more natural starting point for model development than making assumptions for the sake of limiting network size. At the same time we caution that assumptions of modularity may not always be valid, as exemplified by the model of Ullah et al [89] for the behavior of a ligand-gated ion channel, the inositol 1,4,5-trisphosphate receptor. This model includes only a fraction of the possible states of the receptor, those identified as being critical for reproducing an impressive collection of data. Although the model is inconsistent with ligand binding sites in the receptor behaving independently, or modularly, the states included in the model are highly idiosyncratic, meaning that the importance of these particular states is not at all apparent. Thus, this model is representative of models having structures that can only be found through inference from data. Models, especially at early stages of investigation, are usually taken to have less labyrinthine structures, regardless of modeling approach. Successive refinement of modular rules represents one principled approach that may prove useful in the future for the inference of such complex cooperative interactions from data.

Examples of rules for interactions in a cell signaling system

To make the concept of rules more concrete, let us introduce and then discuss three sets of BNGL-encoded rules, which describe various processes involved in IgE receptor (FcεRI) signaling (figure 2). (As we discuss these rules, we will also discuss the underlying biological details, hopefully in an intelligible manner for readers unfamiliar with FcεRI signaling.) These rules are drawn from a larger set of such rules [41], discussed below, and do not by themselves form a coherent model of the system. Before discussing the example rules and presenting them in BNGL format, it should be noted that rules generally encode molecular mechanisms, which may seem arcane and which can make the writing or interpretation of a rule somewhat challenging. Of course, the ability to naturally capture (i.e., formalize) molecular mechanisms of biomolecular interactions is an advantage of rules, but because of the intimate link between the form of a rule and the underlying biological details, the presentation of rules below is necessarily entwined with a discussion of molecular mechanisms (i.e., site-specific details of biomolecular interactions) involved in FcεRI signaling. If these details obscure our introduction of rules (which we hope is not the case), we encourage the reader to return to this section after examining the complete models discussed later; these models consist of rules that can be interpreted more easily without background knowledge about a particular biological system.

FcεRI is a multichain antigen recognition receptor that is expressed, for example, on mast cells [90]. The antigen specificity of a given FcεRI receptor is determined by the IgE antibody with which the receptor is associated. IgE-FcεRI complexes are long lived relative to the time scale of the signaling events initiated by the interactions of IgE-FcεRI complexes with a multivalent antigen. Signaling may be initiated in the laboratory by crosslinking of FcεRI through use of various reagents, such as bivalent haptens, multivalent haptenated proteins (or other carrier molecules), chemically crosslinked oligomers of IgE, anti-IgE antibodies, and anti-FcεRI antibodies [91].

Here, we will discuss rules for the following signaling processes, which can all be considered early steps in FcεRI signaling (figure 2(A)): (1) binding and crosslinking of receptors by a bivalent anti-FcεRI IgG antibody (figure 2(B)); (2) recruitment of a protein tyrosine kinase, Lyn, to a docking site at Y218 in the β chain of an activated receptor (UniProt [92] numbering for the rodent protein Ms4a2), which depends on phosphorylation of Y218 (figure 2(C)); and (3) phosphorylation of a membrane-localized substrate (Y175) in Lat by a second receptor-associated protein tyrosine kinase, Syk (figure 2(D)). Rules for these processes can be written in BNGL as follows:

    Lig(r,r) + Rec(l) -> Lig(r!1,r).Rec(l!1) kp1    (1a)

    Lig(r!+,r) + Rec(l) -> Lig(r!+,r!1).Rec(l!1) kp2    (1b)

    Lig(r!1).Rec(l!1) -> Lig(r) + Rec(l) koff    (1c)

    Lyn(SH2) + Rec(b_Y218~P) <-> Lyn(SH2!1).Rec(b_Y218~P!1) kpL,kmL    (2)

    Syk(tSH2!+,kin) + Lat(Y175~0) -> Syk(tSH2!+,kin!1).Lat(Y175~0!1) kf    (3a)

    Syk(kin!1).Lat(Y175~0!1) -> Syk(kin) + Lat(Y175~0) kr    (3b)

    Syk(kin!1).Lat(Y175~0!1) -> Syk(kin) + Lat(Y175~P) kcat    (3c)


These rules, which represent direct binding and catalytic interactions, are visualized in figure 3 in accordance with the conventions of Faeder et al [93]. Note that bonds between molecular components are indicated by sharing of bond indices, which are prefixed by the ‘!’ symbol. The notation ‘!+’ indicates a bond without indicating the binding partner. Similarly, internal states of molecular components are prefixed by the ‘~’ symbol. The conventions of BNGL-encoded rules are discussed further below.
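To make these notational conventions concrete, the following Python sketch splits a pattern into molecules and reads off, for each component, its name, its optional internal state (after ‘~’), and its optional bond label (after ‘!’). It is a minimal illustration that handles only the subset of syntax used in the rules above; the function names are hypothetical and the code is not part of BioNetGen.

```python
import re

# Name(component list), e.g. Rec(b_Y218~P!1)
MOLECULE_RE = re.compile(r"^\s*(\w+)\((.*)\)\s*$")
# component, optional ~state, optional !bond where the bond is an index or '+'
COMPONENT_RE = re.compile(r"^(\w+)(?:~(\w+))?(?:!(\+|\w+))?$")


def parse_molecule(text):
    """Parse one molecule pattern into a name and a list of components."""
    match = MOLECULE_RE.match(text)
    if match is None:
        raise ValueError(f"not a molecule pattern: {text!r}")
    name, body = match.groups()
    components = []
    for part in (p.strip() for p in body.split(",")):
        if not part:
            continue  # empty component list, e.g. "Lig()"
        comp = COMPONENT_RE.match(part)
        if comp is None:
            raise ValueError(f"unsupported component syntax: {part!r}")
        comp_name, state, bond = comp.groups()
        components.append({"name": comp_name, "state": state, "bond": bond})
    return {"name": name, "components": components}


def parse_complex(text):
    """Parse molecules joined by '.', which marks members of one complex."""
    return [parse_molecule(part) for part in text.split(".")]


lyn, rec = parse_complex("Lyn(SH2!1).Rec(b_Y218~P!1)")
syk = parse_molecule("Syk(tSH2!+,kin)")
print(lyn["components"], rec["components"], syk["components"])
```

A shared bond label, such as the ‘1’ attached to both SH2 and b_Y218, is how such a representation records that two components are connected, whereas the label ‘+’ records only that a bond exists.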

The rule of equation (1a) represents capture of a free ligand by a free receptor site, meaning capture of free anti-FcεRI IgG antibody by the epitope in FcεRIα recognized by the antibody. The rule of equation (1b) represents crosslinking of two receptors by a receptor-tethered ligand. The rule of equation (1c) represents dissociation of a ligand–receptor (non-covalent) bond. Note that in the graphical depictions of these interactions, which are shown in figures 2(B) and 3(A), (B), the two binding rules are depicted as reversible, whereas in equation (1c) we have combined the two types of dissociation events captured in the reversible rules into a single dissociation rule. The rule of equation (2) represents reversible recruitment of the Src-family protein tyrosine kinase Lyn to a phosphotyrosine docking site in the receptor via Lyn’s SH2 domain. The docking site is located at Y218 in the receptor’s β chain, which is part of an immunoreceptor tyrosine-based activation motif (ITAM). The rules of equations (3a) through (3c) represent Syk-mediated phosphorylation of a substrate in the membrane adaptor protein Lat (Y175, UniProt numbering for the rodent protein) via a Michaelis–Menten mechanism [94, 95]. The rule of equation (3a) indicates that a component in Syk, which is named tSH2 to suggest ‘tandem SH2 domains,’ is bound, which is indicated by the notation ‘!+.’ (The tandem SH2 domains in Syk interact with a pair of ITAM phosphotyrosines in each of the receptor’s γ chains.) Thus, the rule of equation (3c) represents an activity of Syk that is restricted to receptor-associated forms of this protein tyrosine kinase, because binding of the kinase domain of Syk, denoted kin, to its substrate, Y175 in Lat, depends on Syk association with the receptor, as indicated by the rule of equation (3a).

As can be seen from the example rules presented above, formal elements of a rule are often in one-to-one correspondence with material entities. For example, Lyn in the rule of equation (2) refers to Lyn and SH2 refers to the SH2 domain of Lyn.

Each of our example rules ends with a listing of either one or two names of parameters. These rule-associated parameters are, by convention, taken to be rate constants in mass-action (elementary) rate laws. (Other types of rate laws can also be associated with rules, but when only a parameter name is given, a mass-action rate law is implied by convention.) The numerical values (and the units) of the rate constants associated with rules are usually defined separately from the rules within a model-specification file. There is at least one rate law associated with each rule. Reversible rules, such as the rule of equation (2), may be associated with two rate laws, one for the forward transformation defined by the rule and one for the reverse transformation defined by the rule.

The rule of equation (2) is associated with two rate constants and has two directions. It can be viewed as a generalized reversible reaction, or a reaction generator that defines forward and reverse transformations arising from a reversible interaction between two molecules named Lyn and Rec. The rule indicates that the interaction between Lyn and the receptor is mediated by molecular components named SH2 (a constituent of Lyn) and b_Y218 (a constituent of FcεRI’s β-chain ITAM). The nomenclature of rules is similar to that of standard chemical reactions; however, rules differ from standard chemical reactions in that they do not uniquely identify reactants (or products), which allows for concise model specification. For example, the rule of equation (2) is silent about functional components of Lyn and FcεRI other than the SH2 domain in Lyn and its phosphotyrosine docking site in FcεRI. Thus, under a ‘don’t care, don’t write’ convention [23], the rule potentially applies to multiple forms and states of these proteins and consequently defines multiple reactions if multiple forms and states are encompassed within a model specification.

The rule of equation (2) has left- and right-hand sides, which are separated by the symbol ‘<->.’ This symbol indicates that the interaction represented by the rule is reversible and that the rule defines transformations/reactions in forward and reverse directions. The other rules presented above are each unidirectional, which is indicated by the symbol ‘->.’ By convention, unidirectional rules are written (and read) from left to right. In equations (1a), (1b), (2) and (3a), the plus sign on the left-hand side indicates that the rule defines bimolecular (association) reactions, reading from left to right. The absence of a plus sign on the right-hand side of equation (2) indicates that the rule defines unimolecular (dissociation) reactions when it is read in the opposite direction. The rules of equations (1c), (3b), and (3c) also define unidirectional reactions. When two rate constants are associated with a rule, as is the case for equation (2), two elementary mass-action rate laws are implied. In the rule of equation (2), one rate law, with rate constant kpL, is implied for all association reactions defined by the rule, and a second rate law, with rate constant kmL, is implied for all dissociation reactions defined by the rule. The use of a single rate law for all association (or dissociation) reactions is a simplification, a type of coarse graining, as noted earlier.
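The two mass-action rate laws implied by a reversible rule can be illustrated numerically. The Python sketch below treats a single association/dissociation pair with rate constants kpL and kmL, integrates the resulting rate equation with a hand-written fourth-order Runge–Kutta step, and compares the result with the analytical equilibrium. The parameter values and abundances are arbitrary assumptions in consistent but unspecified units; they are not taken from any model discussed here.

```python
import math


def net_rate(kpL, kmL, lyn, rec, cplx):
    """Mass-action association rate minus mass-action dissociation rate."""
    return kpL * lyn * rec - kmL * cplx


def integrate_complex(kpL, kmL, lyn_total, rec_total, t_end, n_steps):
    """Integrate d(cplx)/dt with classical RK4, starting from no complex."""
    def f(c):
        return net_rate(kpL, kmL, lyn_total - c, rec_total - c, c)

    c = 0.0
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1 = f(c)
        k2 = f(c + 0.5 * dt * k1)
        k3 = f(c + 0.5 * dt * k2)
        k4 = f(c + dt * k3)
        c += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return c


def equilibrium_complex(kpL, kmL, lyn_total, rec_total):
    """Physical root of kpL*(lyn_total - c)*(rec_total - c) = kmL*c."""
    kd = kmL / kpL
    b = lyn_total + rec_total + kd
    return (b - math.sqrt(b * b - 4.0 * lyn_total * rec_total)) / 2.0


# Illustrative values only (assumptions of this sketch).
kpL, kmL = 1.0, 0.5
c_numeric = integrate_complex(kpL, kmL, 2.0, 1.0, t_end=50.0, n_steps=5000)
c_exact = equilibrium_complex(kpL, kmL, 2.0, 1.0)
print(c_numeric, c_exact)
```

In a rule-based model the same pair of rate laws would be applied to every association and dissociation reaction that the rule generates, which is the coarse graining referred to above.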

The left-hand side of a rule defines necessary and sufficient conditions. In the case of equation (2), this rule defines necessary and sufficient conditions for Lyn association with FcεRI. Namely, association may


Figure 2. Early events in IgE receptor (FcεRI) signaling. (a) An activation cascade in the signaling network of FcεRI, a multichain antigen-recognition receptor. A ligand or crosslinking reagent, such as a multivalent antigen or, as shown, an IgG antibody specific for the receptor’s α chain, activates the receptor by inducing receptor clustering. Receptor clustering allows Lyn, a Src-family protein kinase that constitutively associates with the receptor’s β chain, to phosphorylate tyrosines in neighboring, co-clustered receptors, which serves to recruit more Lyn to receptors as part of a positive feedback loop. Lyn kinase activity also serves to recruit a second protein kinase, Syk, to receptors, which enhances Syk’s ability to phosphorylate tyrosines in the transmembrane adaptor protein Lat. (b) Pictorial representation of the steps involved in receptor crosslinking by a bivalent ligand (anti-FcεRIα): capture of free ligand (top) and crosslinking of receptors by a receptor-tethered ligand (bottom). (c) Pictorial representation of reversible Lyn association with FcεRIβ. (d) Pictorial representation of phosphorylation of Lat by receptor-bound Syk via a Michaelis–Menten mechanism.

Figure 3. Visualization of rules. The rules of equations (1)–(3) are visualized in accordance with the graphical conventions of BNGL [93]. According to these conventions, functional components of molecules (e.g., binding sites and sites of phosphorylation) are represented as vertices of colored graphs. The color of a graph corresponds to the type of molecule represented by the graph. The components of a given type of molecule share the same color (or equivalently, they share the same molecule type name). Vertices have optional attributes (or equivalently, labels), which indicate internal states (i.e., local properties, such as phosphorylation status). Bonds between molecular components are represented by (undirected) edges. As a simplification, one normally only uses edges to represent those bonds that break and/or form under conditions of interest. (a), (b) Graphical depiction of equations (1a) and (1b). These rules formalize the pictorial representations of figure 2(B). (c) Graphical depiction of equation (2). This rule formalizes the pictorial representation of figure 2(C). (d)–(f) Graphical depiction of equations (3a), (3b) and (3c). These rules formalize the pictorial representations of figure 2(D). It should be noted that rules, which are used to represent the molecular interactions in a system, are simply collections of possibly connected subgraphs of the graphs used to represent the molecules in a system. The subgraphs comprising a rule identify the molecular components that affect, or are affected by, the interaction represented by the rule.


occur if and only if the SH2 domain of Lyn is free (indicated by an absence of a bond index appended to the component name SH2, which when present in a rule is prefixed by a ‘!’ symbol) and Y218 in FcεRI’s β-chain is both free (again indicated by the absence of a bond index) and in an internal state labeled ‘P.’ Internal states are convenient abstractions, which can be used to represent local properties of molecular components. Here, the label P is used to represent a phosphorylated tyrosine; the label 0 (which is intended to suggest no modification) could be used to represent an unphosphorylated tyrosine. Internal state labels are prefixed by a tilde (~). Similarly, as already noted, bond indices are prefixed by an exclamation mark (!). The right-hand side of the rule defines the outcome of Lyn association with FcεRI: the formation of a bond between Lyn’s SH2 domain and the phosphotyrosine in FcεRI’s β-chain. The bond is indicated through name sharing. In BNGL, associating a common bond index, such as ‘1’ in the rules presented above, with a pair of component names indicates that the components are connected. A dot (.), which appears on the right-hand side of equations (1a), (1b), (2) and (3a), serves as a separator. A dot also indicates connection. For example, the dot in equation (2) indicates that Lyn and FcεRI are connected (without specifying how). In the case of this particular rule, the dot is redundant, because sharing of the bond index 1 by SH2 and b_Y218 also indicates that Lyn and FcεRI are connected. When read from right to left, the rule of equation (2) indicates that a bond between Lyn’s SH2 domain and pY218 in FcεRI’s β-chain can be broken and that breaking of the bond causes dissociation of the proteins.
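The matching conditions just described, together with the ‘don’t care, don’t write’ convention, can be sketched as a small predicate. In the Python illustration below, molecules and patterns are plain dictionaries; a component omitted from a pattern is unconstrained, a component listed without a bond must be free, and a bond value of ‘+’ requires the component to be bound. The data layout, the function name, and the extra components g_Y65 and U given to the example molecules are assumptions of this sketch, not BioNetGen data structures.

```python
def matches(pattern, molecule):
    """Return True if `molecule` satisfies every constraint written in `pattern`."""
    if pattern["name"] != molecule["name"]:
        return False
    for comp_name, wanted in pattern["components"].items():
        actual = molecule["components"].get(comp_name)
        if actual is None:
            return False  # the molecule lacks a component the pattern names
        if "state" in wanted and wanted["state"] != actual["state"]:
            return False  # an internal state was written and differs
        is_bound = actual["bond"] is not None
        wants_bound = wanted.get("bond") == "+"
        if wants_bound != is_bound:
            return False  # written without a bond means free; '+' means bound
    return True


# Left-hand side of equation (2): Lyn(SH2) + Rec(b_Y218~P)
LHS_LYN = {"name": "Lyn", "components": {"SH2": {}}}
LHS_REC = {"name": "Rec", "components": {"b_Y218": {"state": "P"}}}

# Example molecules; components beyond SH2 and b_Y218 are hypothetical extras.
lyn_free_sh2 = {"name": "Lyn", "components": {
    "SH2": {"state": None, "bond": None},
    "U": {"state": None, "bond": 3}}}
rec_phospho = {"name": "Rec", "components": {
    "b_Y218": {"state": "P", "bond": None},
    "g_Y65": {"state": "0", "bond": None}}}
rec_unphos = {"name": "Rec", "components": {
    "b_Y218": {"state": "0", "bond": None},
    "g_Y65": {"state": "0", "bond": None}}}
rec_occupied = {"name": "Rec", "components": {
    "b_Y218": {"state": "P", "bond": 7},
    "g_Y65": {"state": "P", "bond": None}}}

print(matches(LHS_LYN, lyn_free_sh2), matches(LHS_REC, rec_phospho),
      matches(LHS_REC, rec_unphos), matches(LHS_REC, rec_occupied))
```

Because the pattern for Rec says nothing about g_Y65, and the pattern for Lyn says nothing about U, the state and bond status of those components have no effect on whether the rule applies.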

A complete, executable model and its analysis

The rules discussed above are simplified versions of rules taken from a recently developed library of rules for interactions involved in FcεRI signaling [41]. Each rule in this library is typically associated with comments that provide annotation, including a rationale for the formalization of the interaction represented by the rule and supporting references. This library was developed with the intention of enabling agile development of models for studying the system-level behavior of the FcεRI signaling network. A limiting step in model development is often a literature search and review, conducted for the purpose of collecting, organizing and formalizing the available mechanistic knowledge that is relevant for addressing a specific research question or other purpose. With a reliable rule library, this aspect of model development is streamlined. In supplementary file S1, which is a plain-text BioNetGen input file, we have combined the rules given in equations (1)–(3) with additional rules (sourced from the FcεRI rule library), and other elements of an executable model specification, such as parameter values, to obtain an executable model encompassing the rules illustrated in figure 3. This model can be used, for example, to predict how the binding properties of a bivalent ligand, which we take here to be an FcεRI-specific IgG antibody, affect FcεRI signaling events downstream of FcεRI crosslinking. An example of the type of behavior we can investigate using the model is provided in figure 4, which shows how the extent of Lat phosphorylation depends on the lifetime of a ligand–receptor bond.

As noted above, the model specification of supplementary file S1 was constructed by reusing rules collected in the FcεRI rule library of Chylek et al [41]. The ability to construct a new model by assembling existing rules in a new way, albeit with some modifications, illustrates the compositionality of the rule-based modeling approach. In table 1, we compare the rules of supplementary file S1 with their parent rules in the FcεRI rule library. The modifications of the parent rules were introduced as simplifications and to obtain a self-consistent set of rules. The molecule types and interactions considered in the model are illustrated in figure 5, which shows a contact map automatically generated from supplementary file S1 (stacks.iop.org/PB/12/045007/mmedia) by RuleBender [76, 77] (figure 5(A)) and an extended contact map manually constructed in accordance with recommended guidelines [96] (figure 5(B)). Boxes in the maps illustrate molecules and the functional components of molecules considered in the model. Arrows in the maps represent interactions captured by rules in the model.
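The idea that a contact map can be derived mechanically from rules can be illustrated with a few lines of Python. The sketch below scans patterns for components that share a numeric bond index and reports each pair as an edge between (molecule, component) nodes. It is a simplified stand-in written for this discussion, using the right-hand sides of equations (1a), (2) and (3a) as input; it does not reproduce the algorithm used by RuleBender, and the function name is hypothetical.

```python
import re

MOLECULE_RE = re.compile(r"(\w+)\(([^)]*)\)")
# component with optional ~state and a numeric bond index, e.g. b_Y218~P!1
BONDED_COMPONENT_RE = re.compile(r"^(\w+)(?:~\w+)?!(\d+)$")


def contact_edges(patterns):
    """Collect (molecule, component) pairs that share a numeric bond index."""
    edges = set()
    for pattern in patterns:
        bond_ends = {}
        for molecule, body in MOLECULE_RE.findall(pattern):
            for part in body.split(","):
                hit = BONDED_COMPONENT_RE.match(part.strip())
                if hit:
                    component, index = hit.groups()
                    bond_ends.setdefault(index, []).append((molecule, component))
        for ends in bond_ends.values():
            if len(ends) == 2:  # a bond joins exactly two components
                edges.add(tuple(sorted(ends)))
    return sorted(edges)


rule_products = [
    "Lig(r!1,r).Rec(l!1)",              # equation (1a)
    "Lyn(SH2!1).Rec(b_Y218~P!1)",       # equation (2)
    "Syk(tSH2!+,kin!1).Lat(Y175~0!1)",  # equation (3a)
]
edges = contact_edges(rule_products)
for edge in edges:
    print(edge)
```

Components carrying the ‘!+’ label, such as tSH2 here, contribute no edge in this sketch because the pattern does not name their binding partner.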

It should be noted that a complete model specification invariably consists of several other parts in addition to rules. These parts of a model may include, for example, a declaration of the different types of molecules considered in the model, parameter values, and user-defined functions for rate laws. A model is also usually associated with specifications of model outputs, which are called observables, and one or more (simulation) commands, which are called actions. Tables 2 and 3 provide listings of some of the available BNGL actions and their arguments.

Although the model of figure 5 and supplementary file S1 was constructed quickly, this model is fairly elaborate. To introduce and discuss the various parts of a complete executable BNGL-encoded model specification, let us consider a simpler model, a model for interaction of a soluble monovalent ligand with a bivalent cell-surface receptor (figure 6). This model corresponds to the experimental system considered in the study of Erickson et al [97], in which the ligand was DCT (a monovalent hapten) and the receptor was hapten-specific cell-surface IgE (bound to FcεRI). In this study, Erickson et al [97] studied the effect of diffusion on ligand capture by the cell-surface receptor and escape of the ligand from the receptor to the bulk extracellular fluid.


The model of figure 6 (and supplementary file S2), as is typical of BNGL-encoded models, consists of several named blocks of code, as well as comments, which are each preceded by the ‘#’ symbol. In figure 6, comments are used to indicate the units of parameters, which must be specified in a self-consistent unit system. Each block of code in figure 6 is identified and delineated by a block name associated with the ‘begin’ and ‘end’ keywords. The model specification, which consists of six blocks, is contained within a block of code that is delineated by ‘begin model’ and ‘end model’ and followed by two commands or actions, which together define a simulation protocol. This protocol consists of two parts: (1) network generation (as specified by the ‘generate_network’ command), which involves finding the chemical reaction network (i.e., the reactions and species) implied by the rules of the model; and (2) construction and numerical integration of the corresponding ODEs for the chemical kinetics of the network (as specified by the ‘simulate’ command).
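Network generation can be pictured as repeated application of rules to a growing list of species until no new species appear. The Python sketch below is a toy version of this idea, not BioNetGen's implementation: species are plain strings, each rule is a function that proposes reactions for the current species list, and a max_iter cap (named after the BioNetGen argument, but implemented here independently) stops the loop if it has not converged. The ligand, receptor, and polymer species names are assumptions made for illustration.

```python
def generate_network(seed_species, rules, max_iter=100):
    """Apply rules until no new species appear or max_iter is reached."""
    species = list(seed_species)
    reactions = set()
    for _ in range(max_iter):
        new_species = []
        for rule in rules:
            for reactants, products in rule(species):
                reactions.add((reactants, products))
                for p in products:
                    if p not in species and p not in new_species:
                        new_species.append(p)
        if not new_species:
            return species, sorted(reactions), True   # converged
        species.extend(new_species)
    return species, sorted(reactions), False          # cap reached


def bind(species):
    """Ligand L binds a receptor with a free site: L + Rn -> R(n+1), n < 2."""
    if "L" in species:
        for s in species:
            if s in ("R0", "R1"):
                yield (("L", s), (f"R{int(s[1]) + 1}",))


def unbind(species):
    """A bound ligand dissociates: Rn -> L + R(n-1), n > 0."""
    for s in species:
        if s in ("R1", "R2"):
            yield ((s,), ("L", f"R{int(s[1]) - 1}"))


def polymerize(species):
    """Chain growth by monomer addition: An + A1 -> A(n+1), without bound."""
    if "A1" in species:
        for s in species:
            if s.startswith("A"):
                yield ((s, "A1"), (f"A{int(s[1:]) + 1}",))


lr_species, lr_reactions, lr_done = generate_network(["L", "R0"], [bind, unbind])
poly_species, _, poly_done = generate_network(["A1"], [polymerize], max_iter=5)
print(lr_species, len(lr_reactions), lr_done)
print(poly_species, poly_done)
```

The ligand–receptor rules close after a few iterations, whereas the polymerization-like rule keeps producing longer chains and is stopped only by the iteration cap.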

The blocks of code comprising the model specification of figure 6 (i.e., the six blocks within the model block) are the parameters, molecule types, seed species, observables, functions, and reaction rules blocks. These blocks of code provide the following information: (i) parameters: values for useful physical/mathematical constants (e.g., π) and parameters of the model (e.g., ligand and receptor abundances, an equilibrium association constant for ligand–receptor binding, and a dissociation rate constant) in a consistent unit system, one in which concentrations are expressed on a per cell basis, which has certain advantages discussed elsewhere [34]; (ii) molecule types: definitions of the types of molecules considered in the model (here, a monovalent ligand and a bivalent receptor), including their relevant component parts (one ligand binding site and two identical cognate receptor binding sites); (iii) seed species: the species initially present in the system (here, the free form of the ligand and the free form of the receptor) and their abundances, which for this model serve to define an initial condition for an initial value problem; (iv) observables: the model outputs of interest, which are defined as sums over the concentrations of species matching a specified pattern or set of patterns (e.g., the amount of bound ligand); (v) functions: mathematical functions defined using parameters and observables that are used to specify complex observables (i.e., observables that cannot be represented by a sum or weighted sum of concentrations) and non-elementary rate laws (here, functions are used to define forward and reverse ligand–receptor binding coefficients in accordance with the theory of Berg and Purcell [98]); (vi) reaction rules: local rules that model/represent interactions (here, a reversible rule is defined for ligand–receptor association and dissociation and the rule is associated with a pair of forward and reverse rate laws specified using functions).
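To show how seed species, rate laws, and observables fit together, the following Python sketch integrates the rate equations for a monovalent ligand binding a bivalent receptor with two identical, independent sites. It is a simplified illustration and not a transcription of the model in figure 6: it uses constant per-site rate constants kf and kr rather than the Berg and Purcell functions mentioned above, and all numerical values are arbitrary assumptions in per-cell units.

```python
def derivatives(y, kf, kr):
    """Rate equations for L and receptors with 0, 1 or 2 sites occupied."""
    L, R0, R1, R2 = y
    v_bind0 = 2.0 * kf * L * R0   # two free sites available on R0
    v_bind1 = kf * L * R1         # one free site available on R1
    v_off1 = kr * R1              # one bound site on R1
    v_off2 = 2.0 * kr * R2        # two bound sites on R2
    return (-v_bind0 - v_bind1 + v_off1 + v_off2,
            -v_bind0 + v_off1,
            v_bind0 - v_bind1 - v_off1 + v_off2,
            v_bind1 - v_off2)


def rk4_step(y, dt, kf, kr):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(state, slope, h):
        return tuple(s + h * k for s, k in zip(state, slope))

    k1 = derivatives(y, kf, kr)
    k2 = derivatives(shifted(y, k1, dt / 2.0), kf, kr)
    k3 = derivatives(shifted(y, k2, dt / 2.0), kf, kr)
    k4 = derivatives(shifted(y, k3, dt), kf, kr)
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(y, k1, k2, k3, k4))


def simulate(lig_total, rec_total, kf, kr, t_end, n_steps):
    """Start from free ligand and free receptor (the seed species)."""
    y = (float(lig_total), float(rec_total), 0.0, 0.0)
    dt = t_end / n_steps
    for _ in range(n_steps):
        y = rk4_step(y, dt, kf, kr)
    return y


def observables(y):
    """Observables as weighted sums over species abundances."""
    L, R0, R1, R2 = y
    return {"Lig_free": L, "Lig_bound": R1 + 2.0 * R2, "Site_free": 2.0 * R0 + R1}


# Illustrative values only: 1000 ligands and 300 receptors per cell.
final_state = simulate(1000.0, 300.0, kf=1e-3, kr=0.1, t_end=60.0, n_steps=6000)
obs = observables(final_state)
print(obs)
```

With these assumed values the site-level balance kf × Lig_free × Site_free = kr × Lig_bound holds at long times, which provides a simple check on the integration.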

The use of rules does not simplify the specification of this model, but the model-specification file shown in figure 6 introduces the various parts of BNGL-encoded models and illustrates the use of functions, which is a fairly new capability [23, 81]. Below, after briefly discussing available simulation methods and software tools, we will consider additional examples of model specifications and actions, including a (simple) model that would be difficult to specify in a traditional manner (supplementary file S3).

A brief survey of useful methods and software tools

Various methods and software tools are available to support rule-based modeling. The vast majority of the presently available methods and tools are designed to support model specification and/or simulation. These tasks are somewhat interdependent but less so than is typical for traditional modeling approaches.

Methods

To specify models in terms of rules, one needs conventions and compatible software tools for operating on and analyzing compliant model specifications. Because of the newness of rule-based modeling, conventions are still emerging. However, BNGL and BNGL-compliant tools are used commonly [22, 71]. Alternative methods for specifying models include languages/formats with syntactic differences that tend to offer similar functionality [58, 62, 99, 100]; domain-specific languages that offer higher-level abstractions that can ease the task of model specification [101]; and embedded languages, which allow a modeler to leverage the power of a general-purpose programming language [69]. There are also BNGL extensions and other languages that enable modeling capabilities beyond what is available within the BNGL framework [67, 68, 102–107]. We note that rules and rule-based models are readily visualized [76–78, 93, 96, 105, 108, 109] and software tools exist that enable a visual approach to model specification, such as RuleBuilder [75] and Simmune [63, 104, 105]. (RuleBuilder is not currently supported but the Java code is available.)

As we have discussed above, the rule-based modeling paradigm allows site-specific details about biomolecular interactions to be captured and represented using an understandable language, such as BNGL. The understandability of BNGL stems from its underlying graphical formalism [36, 74, 110], which makes individual rules and collections of rules (models) amenable to visualization [76–78, 93, 96, 105, 108, 109], as well as annotation [96]. Computer-aided analysis of BNGL-encoded models is enabled by general-purpose


BNGL-compatible software tools that can process model specifications to obtain simulation results. These tools implement simulation algorithms that fall into one of two categories: direct methods or indirect methods, both of which are based on physicochemical principles [23]. (Some researchers refer to direct methods, or even rule-based modeling, as a form of agent-based modeling; we discourage this practice, because agent-based models are not typically based on physicochemical principles.)

Indirect methods involve the interpretation of the rules of a model to obtain (through network generation, i.e., through enumerating the reactions implied by rules) an equivalent model specification that has a traditional form, such as that of a coupled system of ODEs [34–36, 74, 111]. Once a traditional formulation has been obtained, then standard tools can be applied, such as the ODE solvers suitable for stiff problems available within SUNDIALS [112] or MATLAB (MathWorks, Natick, MA). Use of an indirect method allows one to leverage mature, sophisticated, and diverse simulation and analysis tools. However, indirect methods have limited applicability, because a set of rules sometimes cannot be practically translated into an equivalent, traditional model form, often because of memory requirements [74]. (Recall the problem of combinatorial complexity.) A not unrelated issue is runaway network generation. In general, one cannot determine beforehand whether the network-generation step of an indirect method will finish running (yielding a finite-size list of reactions and/or equations) or continue indefinitely. In other words, when one tries to apply an indirect method, network generation may or may not terminate. In practice, one can anticipate a failure to terminate if one can detect rules that introduce a polymerization-like process. Trying to determine if network generation will terminate is equivalent to trying to solve the famous halting problem. Rule-based modeling systems have been shown to be Turing complete [113], and for such systems, the halting problem is known to be undecidable. In cases where indirect methods are inapplicable,

Figure 4. Sensitivity of phosphorylation to the lifetime of a ligand–receptor bond. The model of supplementary file S1 (stacks.iop.org/PB/12/045007/mmedia) was used to predict how the lifetime of a ligand–receptor bond influences the (steady-state) level of phosphorylation of Y218 in the β chain of FcεRI (dotted line), which is a docking site of Lyn’s SH2 domain; the steady-state level of phosphorylation of Y65 and Y76 in the γ chain of FcεRI (broken line), which are docking sites of Syk’s tandem SH2 domains; and the steady-state level of phosphorylation of Y136 in Lat (solid line), which is a substrate of Syk. The simulation results underlying the plots shown here are reported in a .gdat file that is created by BioNetGen after processing the model specified in supplementary file S1 and then executing the actions defined in supplementary file S1. The lifetime of a ligand–receptor bond corresponds to the inverse of the rate constant associated with the rule of equation (1c), i.e., 1/koff. The dependence of phosphorylation levels on ligand–receptor bond lifetime arises from the interplay between kinetic proofreading, which tends to increase phosphorylation levels as lifetime increases, and serial engagement, which tends to decrease phosphorylation levels as lifetime increases [17, 151].

Table 1. Comparison of two parent rules in the library of [41] and the derived rules of supplementary file S1 available at stacks.iop.org/PB/12/045007/mmedia. Each derived rule was obtained from its parent rule through a simplification, removal of a constraint requiring the SH3 domain in Lyn to be free, because in the model of supplementary file S1, the binding partners of Lyn’s SH3 domain are not considered.

# Lyn unique domain binds receptor
Parent rule:  Rec(b_Y218~0) + Lyn(U,SH3,SH2) -> Rec(b_Y218~0!1).Lyn(U!1,SH3,SH2) kfRecLyn1
Derived rule: Lyn(U,SH2) + Rec(b_Y218~0) -> Lyn(U!1,SH2).Rec(b_Y218~0!1)

# Lyn SH2 domain binds receptor
Parent rule:  Rec(b_Y218~P) + Lyn(U,SH3,SH2) -> Rec(b_Y218~P!1).Lyn(U,SH3,SH2!1) kfRecLyn2
Derived rule: Lyn(U,SH2) + Rec(b_Y218~P) -> Lyn(U,SH2!1).Rec(b_Y218~P!1)


Figure 5. Visualization of a rule-based model. Rules can be precisely visualized, as illustrated in figure 3, but a coarse description of the rules comprising a model is often useful for obtaining an overview of what is captured in a model. Two visualizations of the model of supplementary file S1 (stacks.iop.org/PB/12/045007/mmedia), a contact map and an extended contact map, are shown here. (a) Contact map. This type of visualization can be generated automatically from the rules of a model. RuleBender provides this functionality. A contact map shows the types of molecules considered in a model, their functional components, and the bonds that can form and break between these components. (b) Extended contact map. This type of visualization adds information beyond that encoded directly in rules and serves an annotation purpose. At present, extended contact maps must be manually constructed. In an extended contact map, one represents the hierarchical substructures of molecules; enzyme-substrate relationships, which are usually implicit in rules; and the dependence of interactions on the internal states of molecular components. Each arrow in an extended contact map represents a set of rules that share a common reaction center. (Multiple rules correspond to one arrow if a transformation is taken to occur in multiple molecular contexts in a context-dependent manner.) Arrows that begin and end with arrowheads represent direct binding interactions. Arrows that begin at a nested box and end at a small circle represent catalytic interactions. The box associated with such an arrow represents an enzyme; the circle points to a substrate of the enzyme. Components taken to have internal states are attached to flags. Here, flags are used to identify tyrosine residues taken to have phosphorylated and unphosphorylated internal states. If an arrow representing a direct binding interaction points to a solid dot on a flag, then the arrow should be understood to represent an interaction that depends on the internal state indicated by the flag. Note that phosphatase activities are not represented here because phosphatases are implicit in the model of supplementary file S1.

Table 2. Summary of selected BioNetGen actions and arguments. For a complete listing of actions and arguments, see the documentation available at [73].

| Action/argument | Description | Default |
| --- | --- | --- |
| generate_network | Process the rules of a model to generate the reaction network implied by the rules | |
| overwrite=>1/0 | Allow an existing .net file to be overwritten | 0 |
| max_iter=>int | Specify the maximum number of iterations for the network generation procedure | 100 |
| max_agg=>int | Specify the maximum number of molecules in an aggregate or complex | 1e9 |
| simulate | Perform a deterministic or stochastic simulation (the ode, ssa and pla methods require prior execution of generate_network and the nf method requires NFsim) | |
| prefix=>‘string’ | Specify the base filename for output | Base name of bngl file |
| suffix=>‘string’ | Specify a suffix to be added to output filenames | Empty |
| verbose=>1/0 | Request verbose output | 0 |
| method=>‘string’ | Specify simulation method, string must be ode, ssa, pla or nf | Required |
| t_start=>float | Start time for simulation | 0 |
| t_end=>float | End time for simulation | Required |
| n_steps=>int | Number of report times between t_start and t_end | 1 |
| print_functions=>1/0 | Request that function evaluations be sent to .gdat file | 0 |
| atol=>float | Absolute tolerance (used with ode) | 1e-8 |
| rtol=>float | Relative tolerance (used with ode) | 1e-8 |
| steady_state=>1/0 | Check for steady state (used with ode) | 0 |
| seed=>int | Seed for generation of pseudo random numbers (used with ssa, pla, and nf) | Random |
| parameter_scan, bifurcate | Perform simulations to a specified end time over a specified range of values for a parameter (these commands are useful for making steady-state dose-response curves and bifurcation diagrams); use bifurcate if bistability is suspected. The arguments of simulate are available, plus those described below | |
| parameter=>‘string’ | Name of the parameter to be scanned | Required |
| par_min=>float | Minimum value of parameter | Required |
| par_max=>float | Maximum value of parameter | Required |
| n_scan_pts=>int | Number of points between par_min and par_max to sample | Required |
| log_scale=>1/0 | Sample points logarithmically/linearly | 0 |


direct methods, which are specialized for simulation of rule-based models, are used.

Direct methods are fairly new, and methodology tailored for simulation and analysis of rule-based models is still being developed. Thus, compared to the toolbox of methods available for traditionally formulated models, direct method options are relatively limited. The available direct methods are all particle-based stochastic simulation algorithms [79, 80, 114, 115]. Particle-based stochastic simulation allows the state of a system to be tracked in terms of molecular site states instead of population levels of chemical species, which obviates the need to enumerate the chemical species (and reactions) implied by rules. In direct methods, rules are used as event generators. An event changes the state of at least one molecular site, and the state of the system is found as a function of time through the firing of probabilistically chosen rule-defined reaction events, one event at a time. Although this type of simulation algorithm can be less efficient than an indirect method (because of the drawbacks of discrete-event procedures [116]), direct methods are the most generally applicable type of simulation method available for rule-based models. Some models that have been developed to study cellular regulatory systems can only be simulated using a direct method [86, 117–119].
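The idea of using rules as event generators can be illustrated with a minimal Python sketch. It is not the algorithm implemented in NFsim or any other cited tool; it is a toy under stated assumptions: a hypothetical model in which a monovalent ligand binds reversibly to independent receptor sites, with made-up rate constants and copy numbers. The state is stored per site, and a rule's propensity is its rate constant multiplied by the number of ways the rule currently matches the particles, so no list of chemical species is ever built.

```python
import random

def simulate_network_free(n_receptors=50, sites_per_receptor=3, n_ligand=100,
                          k_on=0.01, k_off=1.0, t_end=5.0, seed=1):
    """Particle-based stochastic simulation of ligand binding to receptor sites.

    All names and parameter values are hypothetical. The state is kept per
    molecular site (True = ligand-bound), never as counts of chemical species.
    """
    rng = random.Random(seed)
    # One entry per molecular site: (receptor index, site index) -> bound?
    bound = {(r, s): False
             for r in range(n_receptors) for s in range(sites_per_receptor)}
    free_sites = list(bound)   # sites currently matched by the binding rule
    occupied_sites = []        # sites currently matched by the unbinding rule
    free_ligand = n_ligand
    t = 0.0
    trajectory = [(t, 0)]      # (time, number of occupied sites)
    while True:
        # Rule propensity = rate constant x number of matches of the rule.
        a_bind = k_on * free_ligand * len(free_sites)
        a_unbind = k_off * len(occupied_sites)
        a_total = a_bind + a_unbind
        if a_total == 0.0:
            break
        t += rng.expovariate(a_total)  # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * a_total < a_bind:
            # Fire the binding rule at one randomly chosen free site.
            i = rng.randrange(len(free_sites))
            free_sites[i], free_sites[-1] = free_sites[-1], free_sites[i]
            site = free_sites.pop()
            bound[site] = True
            occupied_sites.append(site)
            free_ligand -= 1
        else:
            # Fire the unbinding rule at one randomly chosen occupied site.
            i = rng.randrange(len(occupied_sites))
            occupied_sites[i], occupied_sites[-1] = (occupied_sites[-1],
                                                     occupied_sites[i])
            site = occupied_sites.pop()
            bound[site] = False
            free_sites.append(site)
            free_ligand += 1
        trajectory.append((t, len(occupied_sites)))
    return bound, free_ligand, trajectory

bound, free_ligand, trajectory = simulate_network_free()
```

Each event changes the state of exactly one site, and the cost per event does not depend on how many distinct receptor states are possible, which is the property that makes this class of method applicable when the implied reaction network is too large to enumerate.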

A hybrid particle/population (HPP) method has recently been implemented that lies intermediate between indirect and direct methods [74]. Like indirect methods, HPP expands the rules of a model to obtain an alternative, equivalent model specification. However, unlike with indirect methods, the expansion with HPP is only partial and the resulting model is still rule-based, although with a larger number of rules than the original. The advantage of this partial expansion is that it allows a subset of the species to be treated as population variables rather than particles. This approach can significantly reduce computational memory requirements [74]. Simulations of the partially expanded model can be performed using a recent version of NFsim (1.11 or later) that can handle population variables.

Software tools

Two useful simulation tools for rule-based compartmental models are BioNetGen [34] and NFsim [81]. BioNetGen is a BNGL-compatible simulation tool that implements a variety of indirect methods, including both deterministic and stochastic methods. The options available include methods suitable for stiff problems, such as the default ODE solver (invoked with the ‘ode’ method flag) and the partitioned-leaping algorithm for stochastic simulation (invoked with the ‘pla’ method flag) (table 2). The latter method is a variant of Gillespie’s tau-leaping algorithm [116, 120]. NFsim is a BNGL-compatible simulation tool that implements a (stochastic) direct method [81, 121]. BioNetGen and NFsim, which offer the ability to specify rate laws that have arbitrary functional forms [23, 81], can each be accessed from the command line, through scripting, or through a graphical user interface (GUI), which is provided by RuleBender [76, 77]. The Python-based framework PySB (for systems biology modeling) provides an additional means for using these and other tools [69].

Table 3. Summary of additional BioNetGen actions and arguments. For a complete listing of actions and arguments, see the documentation available at [73].

| Action/argument | Description |
| --- | --- |
| readFile | Import text from a designated file |
| file=>‘string’ | Name of file to be read and path (required) |
| blocks=>[‘string’,…] | An optional list of blocks to be imported from a file (by default, all available blocks are imported) |
| writeSBML, writeMfile, writeMexfile | Report as output a generated reaction network in SBML, M-file, or MEX-file format |
| prefix=>‘string’ | Attach a prefix to the base filename used for output |
| suffix=>‘string’ | Attach a suffix to the base filename used for output |
| setConcentration(‘species’,val) | Set the value of the concentration of a species |
| species | The BNGL name of the species |
| val | A numerical value |
| saveConcentrations(‘label’) | Save in memory all current species concentrations |
| label | An optional label that can be used to reference saved information |
| resetConcentrations(‘label’) | Set all species concentrations to the values stored in memory (at the last save, by default, or to those values referenced by the indicated label) |
| label | An optional label that can be used to reference saved information |
| setParameter(‘param’,val) | Set the value of a parameter |
| param | The name of the parameter |
| val | A numerical value |
| saveParameters(‘label’) | Save in memory all current parameter values |
| label | An optional label that can be used to reference saved information |
| resetParameters(‘label’) | Set all parameters to the values stored in memory (at the last save, by default, or to those values referenced by the indicated label) |
| label | An optional label that can be used to reference saved information |


The RuleBender/BioNetGen/NFsim software stack can be freely downloaded as a bundle, along with documentation, and installed (typically without a need for compilation) on commonly used platforms, including Windows, Mac OS, and Linux [122]. We should emphasize that this software stack is designed and intended for the analysis of models in which reaction compartments are spatially homogeneous (i.e., well-mixed). Consequently, other tools, such as Simmune [104, 105], must be used for spatial modeling. There are a number of software tools for rule-based spatial modeling based on physicochemical principles, which account for coupling between reaction and diffusion in different ways (viz., the next subvolume method, PDEs, and Brownian dynamics) [66, 67, 102–105, 123]. Although the definition of a reaction network does not provide a complete specification of a spatial model, it is often a significant part of a spatial model specification. Thus, it is perhaps worth noting that BioNetGen currently supports export of rule-derived reaction networks in a number of formats that can be loaded directly into various spatial modeling tools, including VCell [66], SSC [67], Smoldyn [123], and MCell [124]. Most spatial modeling tools implement indirect simulation methods (i.e., most tools rely on network generation). However, there are exceptions [102, 103].

Figure 6. A model for ligand–receptor binding with diffusion effects. An annotated, executable version of this model is provided as supplementary file S2 available at stacks.iop.org/PB/12/045007/mmedia. The model specification consists of several parts, which define parameters, molecule types, seed species, observables, functions, and rules. The model specification is accompanied by several commands, or actions. If supplementary file S2 is submitted to BioNetGen for processing, these actions will be performed automatically after the model specification is parsed. The actions shown here instruct BioNetGen to perform three simulations of ligand–receptor binding kinetics with different values of the ligand diffusion coefficient.

More example models

To further illustrate the parts of a BNGL-encoded model, various features of BNGL, and some of the available capabilities of BNGL-compatible software tools, a series of example models is presented in figures 7–9. The model specifications shown in these figures are complemented by annotated BioNetGen input files, which are provided in the supplementary material (supplementary files S3–S5). Figure 10 shows simulation results obtained from the models of figures 7–9, as well as the model of figure 6 considered earlier (figure 10(A)). These results were obtained by passing the model specifications and actions of supplementary files S2–S5 to BioNetGen for processing. Processing can be initiated either at the command line via a command of the form ‘path/BNG2.pl filename.bngl’ or via a point-and-click procedure within the GUI of the RuleBender environment [76, 77].

The model of figure 7 (supplementary file S3) illustrates the conciseness that is possible with a rule-based approach to model specification. This model, although it accounts for interactions among only a handful of molecules, is rather large when expressed as an equivalent reaction network. The network implied by the rules of the model, which can be found automatically by BioNetGen and reported in the various formats mentioned above, consists of 156 chemical species and 1218 unidirectional reactions. The corresponding system of coupled ODEs for the mass-action kinetics of this network consists of 156 equations, one for each species, which collectively contain 1218 distinct right-hand-side terms, one for each unidirectional reaction. As indicated by its molecule types block, the model of figure 7 accounts explicitly for four molecules. These molecules are a receptor tyrosine kinase (RTK), which has two sites of autophosphorylation; an SH2 domain-containing adaptor protein, which interacts with the two sites in the RTK when they are phosphorylated; a phosphatase; and a phosphatase inhibitor. The rules of the model capture the following picture. The two sites in the RTK are phosphorylated constitutively in a non-specific manner by cytoplasmic protein tyrosine kinases, which are implicitly considered, and dephosphorylated by a constitutively active phosphatase. The two sites in the RTK are also phosphorylated in a specific manner when the RTK is dimeric. In the absence of a ligand, the RTK constitutively dimerizes through a (weak) self-interaction.
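The correspondence between a generated reaction network and its mass-action ODEs (one equation per species, one flux term per unidirectional reaction) can be sketched in a few lines of Python. The four-reaction network below is a hypothetical toy, not the 1218-reaction network of figure 7, and the function name and data layout are assumptions made only for illustration.

```python
def mass_action_rhs(concentrations, reactions):
    """Return d[species]/dt for a list of unidirectional mass-action reactions.

    Each reaction is a tuple (reactants, products, rate_constant), where
    reactants and products are lists of species names.
    """
    derivs = {name: 0.0 for name in concentrations}
    for reactants, products, k in reactions:
        # One flux term per unidirectional reaction.
        flux = k
        for name in reactants:
            flux *= concentrations[name]
        # The same flux enters the equation of every participating species.
        for name in reactants:
            derivs[name] -= flux
        for name in products:
            derivs[name] += flux
    return derivs

# Hypothetical toy network: reversible binding and reversible phosphorylation.
reactions = [
    (["L", "R"], ["LR"], 2.0),
    (["LR"], ["L", "R"], 1.0),
    (["LR"], ["LRp"], 0.5),
    (["LRp"], ["LR"], 0.1),
]
conc = {"L": 1.0, "R": 2.0, "LR": 0.5, "LRp": 0.0}
rhs = mass_action_rhs(conc, reactions)
```

Writing such a list by hand is feasible for four reactions but not for more than a thousand, which is why the network is derived from rules automatically.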

The model of figure 7, because of constitutive phosphorylation and dephosphorylation of the RTK, has a non-trivial basal steady state, meaning the steady state of the system before a perturbation. This steady state is non-trivial in the sense that its specification requires the definition of the population levels of each of the 156 chemical species implied by the rules of the model, and many of these species have concentrations that are significantly different from zero. The basal steady state can be found through an equilibration procedure, which is specified in the actions block of figure 7. In this procedure, the population levels specified in the seed species block of the model are taken to define an initial condition, and a simulation starting from this initial condition and continuing for a sufficiently long time is then performed to find the basal steady state. After the phosphatase inhibitor abundance is set to a non-zero level, a second (production) simulation is performed, starting from the basal steady state, to predict the effect of phosphatase inhibition. Results from the equilibration and production simulations are shown in figure 10(B).
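The equilibrate-then-perturb procedure can be sketched in Python for a much smaller system than that of figure 7. The sketch assumes a single phosphorylation site, a hypothetical saturable form for phosphatase inhibition, and made-up parameter values; it only illustrates the two-stage workflow, in which the end point of the first simulation becomes the starting point of the second.

```python
def integrate(f, y0, t_end, dt=0.01):
    """Fixed-step fourth-order Runge-Kutta for a scalar ODE dy/dt = f(y)."""
    y = y0
    for _ in range(int(round(t_end / dt))):
        k1 = f(y)
        k2 = f(y + 0.5 * dt * k1)
        k3 = f(y + 0.5 * dt * k2)
        k4 = f(y + dt * k3)
        y += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y

# Hypothetical parameters (not taken from supplementary file S3).
K_KIN = 0.1    # constitutive phosphorylation rate constant (1/s)
K_PASE = 1.0   # dephosphorylation rate constant at full phosphatase activity (1/s)
K_I = 1.0      # inhibitor level giving half-maximal phosphatase activity

def make_rhs(inhibitor):
    """Rate of change of the phosphorylated fraction p at a given inhibitor level."""
    active = K_I / (K_I + inhibitor)  # assumed fraction of active phosphatase
    def rhs(p):
        return K_KIN * (1.0 - p) - K_PASE * active * p
    return rhs

# Stage 1 (equilibration): start from an easy-to-specify state, no inhibitor.
basal = integrate(make_rhs(0.0), 0.0, t_end=100.0)
# Stage 2 (production): add inhibitor and restart from the basal steady state.
inhibited = integrate(make_rhs(9.0), basal, t_end=100.0)
fold_change = inhibited / basal
```

Normalizing the production result by the basal level, as in `fold_change`, mirrors the normalization described for figure 10(b).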

The model of figure 8 captures processes involved in a genetic toggle switch [125], such as regulated mRNA/protein synthesis. This model, as parameterized, exhibits bistability. The ‘bifurcate’ action in the actions block of figure 8 invokes a series of simulations that serve to find stable steady states of the system as a function of a bifurcation parameter (figure 10(C)). The ‘writeMfile’ command creates an M-file that can be used within MATLAB together with MatCont [126], a MATLAB toolbox for numerical continuation and bifurcation analysis, to find unstable steady states.
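The scan-and-continue strategy behind a simulation-based bifurcation diagram can be sketched in Python. The sketch does not reproduce BioNetGen's implementation or the toggle-switch model; it uses a generic one-variable bistable system, dx/dt = r + x - x^3, chosen only because its behavior is easy to check, and it assumes that the scan includes both end points. Each simulation starts from the steady state found at the previous parameter value, first with the parameter increasing and then decreasing.

```python
def relax(x, r, t_end=50.0, dt=0.01):
    """Integrate dx/dt = r + x - x**3 by forward Euler for a long time."""
    for _ in range(int(round(t_end / dt))):
        x += dt * (r + x - x ** 3)
    return x

def bifurcate(par_min, par_max, n_scan_pts, x_start):
    """Scan a parameter up and then down, continuing from the last steady state."""
    values = [par_min + (par_max - par_min) * i / (n_scan_pts - 1)
              for i in range(n_scan_pts)]
    x = x_start
    forward = []
    for r in values:            # minimum to maximum
        x = relax(x, r)
        forward.append((r, x))
    backward = []
    for r in reversed(values):  # maximum back to minimum
        x = relax(x, r)
        backward.append((r, x))
    return forward, backward

forward, backward = bifurcate(-1.0, 1.0, 21, x_start=-1.0)
```

In the bistable range the two sweeps settle on different branches (here, near x = -1 on the way up and near x = +1 on the way down at r = 0), which is the hysteresis that a single one-directional parameter scan would miss. Unstable steady states are not found by this kind of procedure.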

The model of figure 9 is taken from the review of Munsky et al [127]. It is intended to capture and illustrate fluctuations in gene expression arising from the inherent stochasticity of (bio)chemical reactions and small population sizes. A constitutively expressed gene and three regulated genes, which each have ‘on’ and ‘off’ states, are considered. The state of each gene is controlled by a transcription factor, which is implicitly considered. Parameters of the model are such that the regulated genes all have the same average level of expression; however, distributions of transcript abundance are distinct. As illustrated in the actions block of figure 9, an (indirect) stochastic simulation method is invoked by simply specifying the appropriate method flag, ‘ssa’, as an argument of the ‘simulate’ command (table 2). The results of a single stochastic simulation run are shown in figure 10(D). An alternative, accelerated stochastic simulation method [120] can be invoked by setting the method flag to ‘pla’ (table 2) [73]. In addition, NFsim [81], which implements a network-free (i.e., direct) stochastic simulation method, can be invoked using the ‘nf’ method (table 2). For cases where there are large differences in the population sizes of chemical species, the HPP method is available [73, 74].
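A stochastic simulation of a gene with ‘on’ and ‘off’ states can be sketched in Python with Gillespie's direct method applied to a fixed list of four reactions. This is a generic two-state (telegraph) gene with hypothetical rate constants, not the parameterization of supplementary file S5, and it is not BioNetGen's own code.

```python
import random

def ssa_telegraph(k_on, k_off, k_tx, k_deg, t_end, seed=0):
    """Gillespie direct method for a two-state gene producing a transcript."""
    rng = random.Random(seed)
    t, gene_on, mrna = 0.0, 0, 0
    times, counts = [t], [mrna]
    while True:
        propensities = [
            k_on * (1 - gene_on),  # 0: gene switches on
            k_off * gene_on,       # 1: gene switches off
            k_tx * gene_on,        # 2: transcription (only in the 'on' state)
            k_deg * mrna,          # 3: transcript degradation
        ]
        total = sum(propensities)
        if total == 0.0:
            break
        t += rng.expovariate(total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Choose a reaction with probability proportional to its propensity,
        # skipping reactions that cannot fire.
        threshold = rng.random() * total
        cumulative = 0.0
        chosen = None
        for index, a in enumerate(propensities):
            if a <= 0.0:
                continue
            chosen = index
            cumulative += a
            if threshold < cumulative:
                break
        if chosen == 0:
            gene_on = 1
        elif chosen == 1:
            gene_on = 0
        elif chosen == 2:
            mrna += 1
        else:
            mrna -= 1
        times.append(t)
        counts.append(mrna)
    return times, counts

# Hypothetical rate constants (1/s).
times, counts = ssa_telegraph(k_on=0.1, k_off=0.1, k_tx=5.0, k_deg=0.5,
                              t_end=200.0, seed=3)
```

For this model the mean transcript level at steady state is (k_tx/k_deg) × k_on/(k_on + k_off), so scaling k_on and k_off by a common factor leaves the mean unchanged while altering the shape of the distribution, which is one way to obtain genes with the same average expression but distinct fluctuations.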

Figure 7. A model for receptor signaling with a non-trivial basal steady state. An annotated, executable version of this model is provided as supplementary file S3 available at stacks.iop.org/PB/12/045007/mmedia. Because of space limitations, the parameters block of the model specification is not shown. Instead, a ‘readFile’ command is shown where the parameters block would normally appear. This command directs BioNetGen to read a (plain-text) file named ‘supplementaryfileS3.bngl’ in the current working directory and to import the parameters block from that file. (If the block is not present in the designated file, or the file cannot be found, an error report is generated.) The actions associated with the model specification instruct BioNetGen to find the basal steady state of the system, in which the receptor is constitutively phosphorylated, and then to simulate the response to addition of a phosphatase inhibitor.

Three further example models, which illustrate powerful but uncommonly used (to date) modeling capabilities, are provided in supplementary files S6–S8. The first of these files is simply a preprocessed version of the model of figure 7. The preprocessing step performed to obtain supplementary file S6 (stacks.iop.org/PB/12/045007/mmedia) is called fragmentation [111], which can be likened to a model reduction technique. However, the technique is not based on typical model reduction strategies, such as timescale separation [128]. Fragmentation simplifies a model specification, such that when the rules of the model are processed to derive the implied reaction network, a minimal network, which can be viewed as a reduced form of the network that would be found through naïve rule specification and application, is found. Simplifications are obtained, for example, by representing independent sites in a molecule separately. Fragmentation is compatible with deterministic simulation methods. It is a fairly simple procedure that can be performed manually using the recipe provided by Feret et al [111]. Related methods have also been described recently [129–133]. We note that fragmentation is connected to modeling approaches that have long been used in polymer chemistry (see [88] and references therein).
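The benefit of representing independent sites separately can be illustrated with a simple counting argument in Python. This is not the fragmentation recipe of Feret et al [111]; it is a sketch that assumes a hypothetical molecule whose sites are fully independent, and all function names are made up for illustration.

```python
from itertools import product

def count_species(n_sites, n_states=2):
    """Number of molecular states when every combination of site states is enumerated."""
    return sum(1 for _ in product(range(n_states), repeat=n_sites))

def count_fragments(n_sites, n_states=2):
    """Number of variables when each independent site is tracked separately."""
    return n_sites * n_states

def species_fraction(site_states, marginals):
    """Fraction of molecules in one full state, assuming independent sites.

    site_states[i] is 1 if site i is phosphorylated, else 0;
    marginals[i] is the fraction of molecules with site i phosphorylated.
    """
    fraction = 1.0
    for phosphorylated, p in zip(site_states, marginals):
        fraction *= p if phosphorylated else (1.0 - p)
    return fraction

n_full = count_species(10)        # 1024 states for ten two-state sites
n_reduced = count_fragments(10)   # 20 site-level variables
example = species_fraction((1, 0), (0.5, 0.2))
```

Under the independence assumption, the level of any fully specified state can be recovered from the site-level quantities, so nothing is lost by tracking the smaller set of variables in a deterministic simulation.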

Figure 8. A model for a genetic toggle switch. An annotated, executable version of this model is provided as supplementary file S4 available at stacks.iop.org/PB/12/045007/mmedia. The actions associated with the model specification instruct BioNetGen to find (stable) steady states through simulation for a long time for a range of values of a bifurcation parameter. Simulations are performed as the bifurcation parameter is varied from the designated minimum value to the designated maximum value, and then in the reverse direction. Simulations are initiated from the previously found steady state. The first simulation is initiated from the starting condition specified via ‘setConcentration’ commands.

Supplementary file S7 (stacks.iop.org/PB/12/045007/mmedia) provides an example of a model specified using compartmental BNGL (cBNGL) [134], which enables the definition of reaction compartments and transport between them. The topology of the compartments is assumed to be composed of a hierarchy of membrane-enclosed volumes. Each surface encloses a single volume, which in turn can contain an arbitrary number of surfaces. Each compartment has a specified reaction volume, which for surface compartments is given by the product of the surface area and an effective width that has typical values corresponding to the molecular dimensions of proteins (see Harris et al [33, 134] for more details). The use of effective volumes for surface compartments allows all bimolecular rate constants to be specified in the same units (e.g., M⁻¹ s⁻¹), with automatic conversion being performed to obtain the correct rates for surface-surface, surface-volume, and volume-volume reactions. In cBNGL, both species and their molecular constituents have a location attribute that maps them to a specific compartment. Membrane-associated species may have molecular constituents that are located in volume compartments on either side of the membrane compartment, which facilitates correct localization of product species upon dissociation. Most cBNGL rules do not explicitly reference location attributes but rely instead on additional semantics that only allow reactions between species in the same or adjacent compartments and automatically determine the correct volume of reaction for determining reaction rates. Currently, simulations of cBNGL models can only be performed using indirect methods. In addition to ODE and stochastic simulations based on well-mixed compartments, it is possible to perform spatially resolved simulations of cBNGL models using MCell, a particle-based spatial stochastic simulator [135]. A tutorial describing how to perform spatial simulations of cBNGL models using realistic geometries is available online [136], which also describes how to obtain realistic geometries for cell simulation from statistical models built from experimental imaging data using a tool called CellOrganizer [137].
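The role of the effective volume of a surface compartment in rate conversion can be illustrated with a short Python calculation. How the cBNGL tools carry out the conversion internally is not described here; the sketch assumes the standard conversion of a bimolecular rate constant from M⁻¹ s⁻¹ to a per-molecule-pair rate by division by Avogadro's number and the reaction volume, and the compartment sizes and function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def surface_effective_volume_liters(area_um2, width_um):
    """Effective reaction volume of a surface compartment: area x effective width."""
    return area_um2 * width_um * 1e-15  # 1 cubic micrometer = 1e-15 L

def per_molecule_rate(k_per_molar_per_s, volume_liters):
    """Convert a rate constant in 1/(M s) to a per-molecule-pair rate in 1/s."""
    return k_per_molar_per_s / (AVOGADRO * volume_liters)

# Hypothetical compartments: a 1 pL cytoplasm and a 1000 um^2 membrane
# with an effective width of 10 nm (0.01 um).
v_cytoplasm = 1e-12
v_membrane = surface_effective_volume_liters(1000.0, 0.01)

k = 1.0e6  # the same rate constant, in 1/(M s), used in both compartments
rate_cytoplasm = per_molecule_rate(k, v_cytoplasm)
rate_membrane = per_molecule_rate(k, v_membrane)
```

With these numbers the membrane's effective volume is one hundredth of the cytoplasmic volume, so the same rate constant yields a per-pair rate that is 100 times larger for molecules confined to the membrane.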

Figure 9. A model for stochastic gene expression. An annotated, executable version of this model is provided as supplementary file S5 available at stacks.iop.org/PB/12/045007/mmedia. This model considers a gene expressed at a low constitutive level, and three regulated genes, which are each expressed at an identical average level. Although the regulated genes each have the same expected level of expression at steady state, their distributions of expression are different. The actions associated with the model specification instruct BioNetGen to perform a stochastic simulation.

Supplementary file S7 (stacks.iop.org/PB/12/045007/mmedia) defines a simplified model that encompasses many key events in signal transduction, including ligand–receptor binding, receptor trafficking, transcriptional activation, and negative feedback through expression of an inhibitor of receptor activity. The model demonstrates that oscillations may arise in the system when there is a sharp threshold for transcriptional activation, which can occur, for example, if there is an inhibitor that binds and inactivates the transcription factor with high affinity [138].

Figure 10. Simulation results obtained from the models of figures 6–9. (a) Simulation results for the model of figure 6 (deterministic time courses of ligand–receptor binding for different values of the ligand diffusion coefficient). The solid, broken and dotted lines correspond to large, medium and small values for the ligand diffusion coefficient. Time is indicated on the horizontal axis, and the amount of bound ligand is indicated on the vertical axis in dimensionless units. The vertical axis is scaled such that a value of 1 corresponds to 100 000 bound ligands per cell. (b) Simulation results for the model of figure 7 (deterministic time courses for equilibration to a basal steady state and for the response to phosphatase inhibition). An equilibration simulation was started at an arbitrary, easy-to-specify starting condition at time t = 0. The purpose of this simulation is to find the non-trivial, difficult-to-specify basal steady state. A production simulation was started at time t = 100 s, with the addition of a phosphatase inhibitor. The horizontal axis indicates time, and the vertical axis indicates the amount of receptor phosphorylated at site Y1 (solid curve) and site Y2 (dotted curve) relative to the basal levels of phosphorylation for these sites (i.e., phosphorylation levels are normalized). (c) Simulation results for the model of figure 8 (a bifurcation diagram summarizing the stable steady states found through simulation and parameter scanning). In the top and bottom panels, the horizontal axis indicates the value of the bifurcation parameter. In the top panel, the vertical axis indicates the level of repressor X. In the bottom panel, the vertical axis indicates the level of repressor Y. Units are the same for all axes (copies per cell). Each point corresponds to a stable steady state (i.e., a fixed point) at the indicated value of the bifurcation parameter (Kxy). Only stable steady states are represented in these plots. (d) Simulation results for the model of figure 9 (stochastic time courses). The black curve indicates the fluctuating level of expression from the unregulated gene. The colored curves indicate the fluctuating levels of expression from the three regulated genes (red, gene 1; blue, gene 2; green, gene 3). The average level of expression is the same for each regulated gene. The horizontal axis has been scaled such that a value of 1 corresponds to 1000 s.

Our last example uses a recent extension of BNGL called eBNGL [139], in which one can represent interactions using minimalist rules together with energy patterns that define cooperative interactions [139]. eBNGL was inspired by and builds upon an earlier tool called the allosteric network compiler [140] and the work of Danos et al [141]. An eBNGL-encoded model is parameterized in terms of thermodynamic quantities, which ensures that constraints of detailed balance [142–145] are satisfied. Kinetics can still be predicted on the basis of assumptions about activation energy barriers and transition state theory. The model of supplementary file S8 (stacks.iop.org/PB/12/045007/mmedia) captures coupled folding and binding of a disordered protein to a second, folded protein in accordance with a reaction scheme discussed by Kiefhaber et al [146]. This scheme encompasses a two-step folding-after-binding mechanism, as well as a two-step conformational selection mechanism, in which the disordered protein first folds and then binds.

Conclusions

Rule-based modeling is different from traditional equation-based modeling. Equations are jettisoned for the most part. Instead, one relies on sophisticated software packages, which are more accessible than equations (or software that requires equations as input). Use of these packages, by design, does not require specialized mathematical training or extensive programming skills. Models are specified using a specialized model-specification language, which can be understood by both a computer and a user/biologist. Using one of the available languages to specify a model is similar to writing a simple program, or a script. Thus, some aptitude for programming is required, but learning a model-specification language, such as BNGL, is considerably easier than learning, say, a general-purpose programming language, such as Python [147]. BNGL, like most available model-specification languages, by design, is streamlined and less complex than a general-purpose programming language.

BNGL, which is representative of other rule-based modeling languages, has been designed for use by (physical) biologists; it is intended to allow easy representation of knowledge about biomolecules and their interactions. Within the framework of BNGL, molecules are represented as structured objects, with the component parts of these objects corresponding to the functional sites in biomolecules that are responsible for binding and catalytic interactions of interest. BNGL-encoded rules define necessary and sufficient conditions for interactions and provide rate laws. Capturing the site-specific details of interactions is relatively straightforward. The problem of combinatorial complexity, which quickly stymies traditional modeling approaches and severely limits the biochemical details that can be considered in traditional models, is sidestepped, provided that assumptions of modularity can be plausibly justified or used as a first-order or initial approximation. Because BNGL is based on a graphical formalism for representing molecules and interactions, models and the formal elements of models can be readily visualized with precision. Indeed, automatic visualization capabilities are available for both individual rules and entire rule-based models [76–78, 108, 109]. It is even possible to use a simple drawing tool to specify a rule-based model [75, 105].

A strong advantage of rule-based modeling, which has yet to be fully leveraged, is the ease with which these models can be annotated [38, 41, 42, 86, 96, 119], as there is often a one-to-one correspondence between the formal elements of a model and entries in public databases that collect information about biomolecules and their interactions. Thus, rule-based modeling could provide a foundation for building standard models for well-studied cellular regulatory networks of outstanding interest, such as those implicated in diseases.

Long ago, mathematicians developed formalisms to model and analyze natural phenomena of interest at the time, through the methods available at the time, such as calculations possible through reasoning and the use of pen and paper. These methods were eventually applied to study chemical and biochemical systems [148, 149], and they remain popular to this day. However, biologists now have at hand more powerful modeling tools, which also happen to be easier to use. It is time to pick up these tools and use them as complements to intuition and reasoning and quantitative experimental approaches. For those interested in taking up this challenge, a number of resources are available beyond this review, including a primer [72], a wiki with extensive documentation and links to additional resources [73], and an annual short course where hands-on instruction in rule-based modeling is offered [150].

Acknowledgments

This work was supported by grants R25GM105608, P50GM085273, and P41GM103712 from the National Institute of General Medical Sciences (NIGMS) of the National Institutes of Health (NIH) and by an Expeditions in Computing Grant (award 0926181) from the National Science Foundation (NSF).

References

[1] Le Novère N et al 2009 The systems biology graphical notation Nat. Biotechnol. 27 735–41

[2] Tyson J J, Chen K C and Novak B 2003 Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell Curr. Opin. Cell Biol. 15 221–31

[3] Tsai T Y, Choi Y S, Ma W, Pomerening J R, Tang C and Ferrell J E Jr 2008 Robust, tunable biological oscillations from interlinked positive and negative feedback loops Science 321 126–9

[4] Nobeli I, Favia A D and Thornton J M 2009 Protein promiscuity and its implications for biotechnology Nat. Biotechnol. 27 157–67

[5] Schreiber G and Keating A E 2011 Protein binding specificity versus promiscuity Curr. Opin. Struct. Biol. 21 50–61

[6] Aldridge B B, Burke J M, Lauffenburger D A and Sorger P K 2006 Physicochemical modelling of cell signalling pathways Nat. Cell Biol. 8 1195–203

[7] Kastritis P L and Bonvin A M 2013 On the binding affinity of macromolecular interactions: daring to ask why proteins interact J. R. Soc. Interface 10 20120835

[8] Cowan A E, Moraru I I, Schaff J C, Slepchenko B M and Loew L M 2012 Spatial modeling of cell signaling networks Methods Cell Biol. 110 195–221

[9] Rowland M A, Fontana W and Deeds E J 2012 Crosstalk and competition in signaling networks Biophys. J. 103 2389–98


[10] de Jong H 2002 Modeling and simulation of genetic regulatory systems: a literature review J. Comput. Biol. 9 67–103

[11] Miskov-Zivanov N, Turner M S, Kane L P, Morel P A and Faeder J R 2013 The duration of T cell stimulation is a critical determinant of cell fate and plasticity Sci. Signal. 6 ra97

[12] Hucka M et al 2003 The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models Bioinformatics 19 524–31

[13] Laidler K J 1985 Chemical kinetics and the origins of physical chemistry Arch. Hist. Exact. Sci. 32 43–75

[14] Gillespie D T, Hellander A and Petzold L R 2013 Perspective: stochastic algorithms for chemical kinetics J. Chem. Phys. 138 170901

[15] Hlavacek W S, Faeder J R, Blinov M L, Perelson A S and Goldstein B 2003 The complexity of complexes in signal transduction Biotechnol. Bioeng. 84 783–94

[16] Mayer B J, Blinov M L and Loew L M 2009 Molecular machines or pleiomorphic ensembles: signaling complexes revisited J. Biol. 8 81

[17] Goldstein B, Faeder J R and Hlavacek W S 2004 Mathematical and computational models of immune-receptor signalling Nat. Rev. Immunol. 4 445–56

[18] Hlavacek W S, Faeder J R, Blinov M L, Posner R G, Hucka M and Fontana W 2006 Rules for modeling signal-transduction systems Sci. STKE 2006 re6

[19] Danos V, Feret J, Fontana W, Harmer R and Krivine J 2007 Rule-based modelling of cellular signalling Lecture Notes Comput. Sci. 4703 17–41

[20] Meier-Schellersheim M, Fraser I D and Klauschen F 2009 Multiscale modeling for biologists Wiley Interdiscip. Rev. Syst. Biol. Med. 1 4–14

[21] Bachman J A and Sorger P 2011 New approaches to modeling complex biochemistry Nat. Methods 8 130–1

[22] Chylek L A, Stites E C, Posner R G and Hlavacek W S 2013 Innovations of the rule-based modeling approach Systems Biology: Integrative Biology and Simulation Tools ed A Prokop and B Csukás vol 1 (Dordrecht: Springer) pp 273–300

[23] Chylek L A, Harris L A, Tung C S, Faeder J R, Lopez C F and Hlavacek W S 2014 Rule-based modeling: a computational approach for studying biomolecular site dynamics in cell signaling systems Wiley Interdiscip. Rev. Syst. Biol. Med. 6 13–36

[24] Stefan M I, Bartol T M, Sejnowski T J and Kennedy M B 2014 Multi-state modeling of biomolecules PLoS Comput. Biol. 10 e1003844

[25] Marchisio M A, Colaiacovo M, Whitehead E and Stelling J 2013 Modular, rule-based modeling for the design of eukaryotic synthetic gene circuits BMC Syst. Biol. 7 42

[26] Rangarajan S, Kaminski T, Van Wyk E, Bhan A and Daoutidis P 2014 Language-oriented rule-based reaction network generation and analysis: algorithms of RING Comput. Chem. Eng. 64 124–37

[27] Ligon T S, Leonhardt C and Radler J O 2014 Multi-level kinetic model of mRNA delivery via transfection of lipoplexes PLoS One 9 e107148

[28] Broadbelt L J, Stark S M and Klein M T 1994 Computer-generated pyrolysis modeling: on-the-fly generation of species, reactions, and rates Ind. Eng. Chem. Res. 33 790–9

[29] Klinke D J and Broadbelt L J 1997 Mechanism reduction during computer generation of compact reaction models AIChE J. 43 1828–37

[30] Faulon J L and Sault A G 2001 Stochastic generator of chemical structure: III. Reaction network generation J. Chem. Inf. Comput. Sci. 41 894–908

[31] Fontana W and Buss L W 1996 The barrier of objects: from dynamical systems to bounded organizations Boundaries and Barriers ed J Casti and A Karlqvist (Reading, MA: Addison-Wesley) pp 55–116

[32] Regev A, Silverman W and Shapiro E 2001 Representation and simulation of biochemical processes using the π-calculus process algebra Pac. Symp. Biocomput. 6 459–70

[33] Gillespie D T 2000 The chemical Langevin equation J. Chem. Phys. 113 297–306

[34] Faeder J R, Blinov M L and Hlavacek W S 2009 Rule-based modeling of biochemical systems with BioNetGen Methods Mol. Biol. 500 113–67

[35] Blinov M L, Faeder J R, Goldstein B and Hlavacek W S 2004 BioNetGen: software for rule-based modeling of signal transduction based on the interactions of molecular domains Bioinformatics 20 3289–91

[36] Faeder J R, Blinov M L, Goldstein B and Hlavacek W S 2005 Rule-based modeling of biochemical networks Complexity 10 22–41

[37] Apgar J F, Witmer D K, White F M and Tidor B 2010 Sloppy models, parameter uncertainty, and the role of experimental design Mol. Biosyst. 6 1890–900

[38] Barua D, Hlavacek W S and Lipniacki T 2012 A computational model for early events in B cell antigen receptor signaling: analysis of the roles of Lyn and Fyn J. Immunol. 189 646–58

[39] Tonsing C, Timmer J and Kreutz C 2014 Cause and cure of sloppiness in ordinary differential equation models Phys. Rev. E 90 023303

[40] Barua D and Goldstein B 2012 A mechanistic model of early FcεRI signaling: lipid rafts and the question of protection from dephosphorylation PLoS One 7 e51669

[41] Chylek L A, Holowka D A, Baird B A and Hlavacek W S 2014 An interaction library for the FcεRI signaling network Front. Immunol. 5 172

[42] Barua D and Hlavacek W S 2013 Modeling the effect of APC truncation on destruction complex function in colorectal cancer cells PLoS Comput. Biol. 9 e1003217

[43] Metropolis N, Rosenbluth A W, Rosenbluth M N, Teller A H and Teller E 1953 Equation of state calculations by fast computing machines J. Chem. Phys. 21 1087–92

[44] Gunawardena J 2014 Models in biology: ‘accurate descriptions of our pathetic thinking’ BMC Biol. 12 29

[45] Chelliah V, Laibe C and Le Novère N 2013 BioModels database: a repository of mathematical models of biological processes Methods Mol. Biol. 1021 189–99

[46] Hilbert D 2000 Mathematical problems (reprinted) Bull. Am. Math. Soc. 37 407–36

[47] Alon U 2007 Network motifs: theory and experimental approaches Nat. Rev. Genetics 8 450–61

[48] Ferrell J E Jr, Pomerening J R, Kim S Y, Trunnell N B, Xiong W, Huang C Y and Machleder E M 2009 Simple, realistic models of complex biological processes: positive feedback and bistability in a cell fate switch and a cell cycle oscillator FEBS Lett. 583 3999–4005

[49] Sessions R 1979 How a ‘difficult’ composer gets that way (reprinted) Roger Sessions on Music: Collected Essays ed E T Cone (Princeton, NJ: Princeton University Press) pp 169–71

[50] Beringer J et al 2012 Review of particle physics Phys. Rev. D 86 010001

[51] Jorissen R N, Walker F, Pouliot N, Garrett T P, Ward C W and Burgess A W 2003 Epidermal growth factor receptor: mechanisms of activation and signalling Exp. Cell Res. 284 31–53

[52] Lemmon M A and Schlessinger J 2010 Cell signaling by receptor tyrosine kinases Cell 141 1117–34

[53] Wiley H S, Shvartsman S Y and Lauffenburger D A 2003 Computational modeling of the EGF-receptor system: a paradigm for systems biology Trends Cell Biol. 13 43–50

[54] Pawson T and Nash P 2003 Assembly of cell regulatory systems through protein interaction domains Science 300 445–52

[55] Schlessinger J 1994 SH2/SH3 signaling proteins Curr. Opin. Genetics Dev. 4 25–30

[56] Van Roey K, Uyar B, Weatheritt R J, Dinkel H, Seiler M, Budd A, Gibson T J and Davey N E 2014 Short linear motifs: ubiquitous and functionally diverse protein interaction modules directing cell regulation Chem. Rev. 114 6733–78

[57] Walsh C T, Garneau-Tsodikova S and Gatto G J Jr 2005 Protein posttranslational modifications: the chemistry of proteome diversifications Angew. Chem., Int. Ed. Engl. 44 7342–72

[58] Le Novère N and Shimizu T S 2001 STOCHSIM: modelling of stochastic biomolecular processes Bioinformatics 17 575–6

[59] Priami C, Regev A, Shapiro E and Silverman W 2001 Application of a stochastic name-passing calculus to representation and simulation of molecular processes Inf. Process. Lett. 80 25–31

[60] Danos V and Laneve C 2004 Formal molecular biology Theor. Comput. Sci. 325 69–110

[61] Talcott C, Eker S, Knapp M, Lincoln P and Laderoute K 2004 Pathway logic modeling of protein functional domains in signal transduction Pac. Symp. Biocomput. 9 568–80

[62] Lok L and Brent R 2005 Automatic generation of cellular reaction networks with Moleculizer 1.0 Nat. Biotechnol. 23 131–6

[63] Meier-Schellersheim M, Xu X, Angermann B, Kunkel E J, Jin T and Germain R N 2006 Key role of local regulation in chemosensing revealed by a new molecular interaction-based modeling method PLoS Comput. Biol. 2 e82

[64] Phillips A and Cardelli L 2007 Efficient, correct simulation of biological processes in the stochastic pi-calculus Lecture Notes Comput. Sci. 4695 184–99

[65] Dematte L, Priami C and Romanel A 2008 The Beta Workbench: a computational tool to study the dynamics of biological systems Brief. Bioinform. 9 437–49

[66] Moraru I I, Schaff J C, Slepchenko B M, Blinov M L, Morgan F, Lakshminarayana A, Gao F, Li Y and Loew L M 2008 Virtual cell modelling and simulation software environment IET Syst. Biol. 2 352–62

[67] Lis M, Artyomov M N, Devadas S and Chakraborty A K 2009 Efficient stochastic simulation of reaction–diffusion processes via direct compilation Bioinformatics 25 2289–91

[68] Maus C, Rybacki S and Uhrmacher A M 2011 Rule-based multi-level modeling of cell biological systems BMC Syst. Biol. 5 166

[69] Lopez C F, Muhlich J L, Bachman J A and Sorger P K 2013 Programming biological models in Python using PySB Mol. Syst. Biol. 9 646

[70] Palmisano A, Hoops S, Watson L T, Jones T C Jr, Tyson J J and Shaffer C A 2014 Multistate Model Builder (MSMB): a flexible editor for compact biochemical models BMC Syst. Biol. 8 42

[71] Chylek L A, Wilson B S and Hlavacek W S 2014 Modeling biomolecular site dynamics in immunoreceptor signaling systems Adv. Exp. Med. Biol. 844 245–62

[72] Sekar J A and Faeder J R 2012 Rule-based modeling of signal transduction: a primer Methods Mol. Biol. 880 139–218

[73] BioNetGen wiki (http://bionetgen.org/)

[74] Hogg J S, Harris L A, Stover L J, Nair N S and Faeder J R 2014 Exact hybrid particle/population simulation of rule-based models of biochemical systems PLoS Comput. Biol. 10 e1003544

[75] Hu B, Fricke G M, Faeder J R, Posner R G and Hlavacek W S 2009 GetBonNie for building, analyzing and sharing rule-based models Bioinformatics 25 1457–60

[76] Xu W, Smith A M, Faeder J R and Marai G E 2011 RuleBender: a visual interface for rule-based modeling Bioinformatics 27 1721–2

[77] Smith A M, Xu W, Sun Y, Faeder J R and Marai G E 2012 RuleBender: integrated modeling, simulation and visualization for rule-based intracellular biochemistry BMC Bioinform. 13 S3

[78] Wenskovitch J E Jr, Harris L A, Tapia J J, Faeder J R and Marai G E 2014 MOSBIE: a tool for comparison and analysis of rule-based biochemical models BMC Bioinform. 15 316

[79] Colvin J, Monine M I, Faeder J R, Hlavacek W S, Von Hoff D D and Posner R G 2009 Simulation of large-scale rule-based models Bioinformatics 25 910–7

[80] Colvin J, Monine M I, Gutenkunst R N, Hlavacek W S, Von Hoff D D and Posner R G 2010 RuleMonkey: software for stochastic simulation of rule-based models BMC Bioinform. 11 404

[81] Sneddon M W, Faeder J R and Emonet T 2011 Efficient modeling, simulation and coarse-graining of biological complexity with NFsim Nat. Methods 8 177–83

[82] MATLAB Documentation (www.mathworks.com/help/matlab/)

[83] Kholodenko B N, Demin O V, Moehren G and Hoek J B 1999 Quantification of short term signaling by the epidermal growth factor receptor J. Biol. Chem. 274 30169–81

[84] Chen W W, Schoeberl B, Jasper P J, Niepel M, Nielsen U B, Lauffenburger D A and Sorger P K 2009 Input–output behavior of ErbB signaling pathways as revealed by a mass action model trained against dynamic data Mol. Syst. Biol. 5 239

[85] Blinov M L, Faeder J R, Goldstein B and Hlavacek W S 2006 A network model of early events in epidermal growth factor receptor signaling that accounts for combinatorial complexity Biosystems 83 136–51

[86] Creamer M S et al 2012 Specification, annotation, visualization and simulation of a large rule-based model for ERBB receptor signaling BMC Syst. Biol. 6 107

[87] Faeder J R, Blinov M L, Goldstein B and Hlavacek W S 2005 Combinatorial complexity and dynamical restriction of network flows in signal transduction Syst. Biol. 2 5–15

[88] Perelson A S and Delisi C 1980 Receptor clustering on a cell surface: I. Theory of receptor cross-linking by ligands bearing two chemically identical functional groups Math. Biosci. 48 71–110

[89] Ullah G, Mak D O and Pearson J E 2012 A data-driven model of a modal gated ion channel: the inositol 1,4,5-trisphosphate receptor in insect Sf9 cells J. Gen. Physiol. 140 159–73

[90] da Silva E Z, Jamur M C and Oliver C 2014 Mast cell function: a new vision of an old cell J. Histochem. Cytochem. 62 698–738

[91] Holowka D and Baird B 1996 Antigen-mediated IgE receptor aggregation and signaling: a window on cell surface structure and dynamics Annu. Rev. Biophys. Biomol. Struct. 25 79–112

[92] UniProt (www.uniprot.org/)

[93] Faeder J R, Blinov M L and Hlavacek W S 2005 Graphical rule-based representation of signal-transduction networks Proc. 2005 ACM Symp. on Applied Computing ed L M Liebrock (New York: ACM Press) pp 133–40

[94] Chen W W, Niepel M and Sorger P K 2010 Classic and contemporary approaches to modeling biochemical reactions Genes Dev. 24 1861–75

[95] Schnell S 2014 Validity of the Michaelis–Menten equation – steady-state or reactant stationary assumption: that is the question FEBS J. 281 464–72

[96] Chylek L A et al 2011 Guidelines for visualizing and annotating rule-based models Mol. Biosyst. 7 2779–95

[97] Erickson J, Goldstein B, Holowka D and Baird B 1987 The effect of receptor density on the forward rate constant for binding of ligands to cell surface receptors Biophys. J. 52 657–62

[98] Berg H C and Purcell E M 1977 Physics of chemoreception Biophys. J. 20 193–219

[99] Danos V, Feret J, Fontana W, Harmer R and Krivine J 2008 Rule-based modelling, symmetries, refinements Lecture Notes Comput. Sci. 5054 103–22

[100] Pedersen M, Oury N, Gravill C and Phillips A 2014 BioSimulators: a web UI for biological simulation Bioinformatics 30 1491–2

[101] Danos V, Feret J, Fontana W, Harmer R and Krivine J 2009 Rule-based modelling and model perturbation Lecture Notes Comput. Sci. 5750 116–37

[102] Gruenert G, Ibrahim B, Lenser T, Lohel M, Hinze T and Dittrich P 2010 Rule-based spatial modeling with diffusing, geometrically constrained molecules BMC Bioinform. 11 307

[103] Sorokina O, Sorokin A, Douglas Armstrong J and Danos V 2013 A simulator for spatially extended kappa models Bioinformatics 29 3105–6

[104] Angermann B R, Klauschen F, Garcia A D, Prustel T, Zhang F, Germain R N and Meier-Schellersheim M 2012 Computational modeling of cellular signaling processes embedded into dynamic spatial contexts Nat. Methods 9 283–9

[105] Zhang F, Angermann B R and Meier-Schellersheim M 2013 The Simmune Modeler visual interface for creating signaling networks based on bi-molecular interactions Bioinformatics 29 1229–30

[106] Mjolsness E 2013 Time-ordered product expansions for computational stochastic system biology Phys. Biol. 10 035009

[107] Bittig A T, Matschegewski C, Nebe J, Stahlke S and Uhrmacher A M 2014 Membrane related dynamics and the formation of actin in cells growing on micro-topographies: a spatial computational model BMC Syst. Biol. 8 106

[108] Tiger C F, Krause F, Cedersund G, Palmer R, Klipp E, Hohmann S, Kitano H and Krantz M 2012 A framework for mapping, visualisation and automatic model creation of signal-transduction networks Mol. Syst. Biol. 8 578

[109] Cheng H C, Angermann B R, Zhang F and Meier-Schellersheim M 2014 NetworkViewer: visualizing biochemical reaction networks with embedded rendering of molecular interaction rules BMC Syst. Biol. 8 70

[110] Blinov M L, Yang J, Faeder J R and Hlavacek W S 2006 Graph theory for rule-based modeling of biochemical networks Lecture Notes Comput. Sci. 4230 89–106

[111] Feret J, Danos V, Krivine J, Harmer R and Fontana W 2009 Internal coarse-graining of molecular systems Proc. Natl Acad. Sci. USA 106 6453–8

[112] Hindmarsh A C, Brown P N, Grant K E, Lee S L, Serban R, Shumaker D E and Woodward C S 2005 SUNDIALS: suite of nonlinear and differential/algebraic equation solvers ACM Trans. Math. Softw. 31 363–96

[113] Cardelli L and Zavattaro G 2010 Turing universality of the biochemical ground form Math. Struct. Comput. Sci. 20 45–73

[114] Danos V, Feret J, Fontana W and Krivine J 2007 Scalable simulation of cellular signaling networks Lecture Notes Comput. Sci. 4807 139–57

[115] Yang J, Monine M I, Faeder J R and Hlavacek W S 2008 Kinetic Monte Carlo method for rule-based modeling of biochemical networks Phys. Rev. E 78 031910

[116] Gillespie D T 2001 Approximate accelerated stochastic simulation of chemically reacting systems J. Chem. Phys. 115 1716–33

[117] Monine M I, Posner R G, Savage P B, Faeder J R and Hlavacek W S 2010 Modeling multivalent ligand–receptor interactions with steric constraints on configurations of cell-surface receptor aggregates Biophys. J. 98 48–56

[118] Mahajan A et al 2014 Optimal aggregation of FcεRI with a structurally defined trivalent ligand overrides negative regulation driven by phosphatases ACS Chem. Biol. 9 1508–19

[119] Chylek L A, Akimov V, Dengjel J, Rigbolt K T, Hu B, Hlavacek W S and Blagoev B 2014 Phosphorylation site dynamics of early T-cell receptor signaling PLoS One 9 e104240

[120] Harris L A and Clancy P 2006 A ‘partitioned leaping’ approach for multiscale modeling of chemical reaction dynamics J. Chem. Phys. 125 144107

[121] Yang J and Hlavacek W S 2011 The efficiency of reactant site sampling in network-free simulation of rule-based models for biochemical systems Phys. Biol. 8 055009

[122] RuleBender (http://visualizlab.org/rulebender/)

[123] Andrews S S 2012 Spatial and stochastic cellular modeling with the Smoldyn simulator Methods Mol. Biol. 804 519–42

[124] Bartol T, Dittrich M and Faeder J 2015 MCell Encyclopedia of Computational Neuroscience ed D Jaeger and R Jung (New York: Springer) pp 1–5

[125] Gardner T S, Cantor C R and Collins J J 2000 Construction of a genetic toggle switch in Escherichia coli Nature 403 339–42

[126] MatCont (http://sourceforge.net/projects/matcont/)

[127] Munsky B, Neuert G and van Oudenaarden A 2012 Using gene expression noise to understand gene regulation Science 336 183–7

[128] Klinke D J II and Finley S D 2012 Timescale analysis of rule-based biochemical reaction networks Biotechnol. Prog. 28 33–44

[129] Borisov N M, Chistopolsky A S, Faeder J R and Kholodenko B N 2008 Domain-oriented reduction of rule-based network models IET Syst. Biol. 2 342–51

[130] Conzelmann H, Fey D and Gilles E D 2008 Exact model reduction of combinatorial reaction networks BMC Syst. Biol. 2 78

[131] Harmer R, Danos V, Feret J, Krivine J and Fontana W 2010 Intrinsic information carriers in combinatorial dynamical systems Chaos 20 037108

[132] Bugenhagen S M and Beard D A 2012 Specification, construction, and exact reduction of state transition system models of biochemical processes J. Chem. Phys. 137 154108

[133] Camporesi F, Feret J and Hayman J 2013 Context-sensitive flow analyses: a hierarchy of model reductions Lecture Notes Comput. Sci. 8130 220–33

[134] Harris L A, Hogg J S and Faeder J R 2009 Compartmental rule-based modeling of biochemical systems Proc. 2009 Winter Simulation Conference ed M D Rossetti, R R Hill, B Johansson, A Dunkin and R G Ingalls (New York: ACM) pp 908–19

[135] Kerr R A, Bartol T M, Kaminsky B, Dittrich M, Chang J C J, Baden S B, Sejnowski T J and Stiles J R 2008 Fast Monte Carlo simulation methods for biological reaction–diffusion systems in solution and on surfaces SIAM J. Sci. Comput. 30 3126–49

[136] MCell (www.mcell.org/)

[137] Murphy R F 2012 Cell organizer: image-derived models of subcellular organization and protein distribution Methods Cell Biol. 110 179–93

[138] Buchler N E and Cross F R 2009 Protein sequestration generates a flexible ultrasensitive response in a genetic network Mol. Syst. Biol. 5 272

[139] Hogg J S 2013 Advances in rule-based modeling: compartments, energy, and hybrid simulation, with application to sepsis and cell signaling PhD Thesis University of Pittsburgh, Pittsburgh

[140] Ollivier J F, Shahrezaei V and Swain P S 2010 Scalable rule-based modelling of allosteric proteins and biochemical networks PLoS Comput. Biol. 6 e1000975

[141] Danos V, Harmer R and Honorato-Zimmer R 2013 Thermodynamic graph-rewriting Lecture Notes Comput. Sci. 8052 380–94

[142] Lewis G N 1925 A new principle of equilibrium Proc. Natl Acad. Sci. USA 11 179–83

[143] Yang J, Bruno W J, Hlavacek W S and Pearson J E 2006 On imposing detailed balance in complex reaction mechanisms Biophys. J. 91 1136–41

[144] Ederer M and Gilles E D 2007 Thermodynamically feasible kinetic models of reaction networks Biophys. J. 92 1846–57

[145] Gorban A N and Yablonsky G S 2011 Extended detailed balance for systems with irreversible reactions Chem. Eng. Sci. 66 5388–99

[146] Kiefhaber T, Bachmann A and Jensen K S 2012 Dynamics and mechanisms of coupled protein folding and binding reactions Curr. Opin. Struct. Biol. 22 21–9

[147] Python (www.python.org/)

[148] Harcourt A V and Esson W 1865 On the laws of connexion between the conditions of a chemical change and its amount Proc. R. Soc. London 14 470–4

[149] van’t Hoff J H 1884 Études de Dynamique Chimique (Amsterdam: Frederik Muller)

[150] The q-bio Summer School and Conference wiki (http://q-bio.org/)

[151] Goldstein B, Coombs D, Faeder J R and Hlavacek W S 2008 Kinetic proofreading model Adv. Exp. Med. Biol. 640 82–94
