
Expert Systems With Applications, Vol. 3, pp. 367-377, 1991 0957-4174/91 $3.00 + .00 Printed in the USA. © 1991 Pergamon Press plc

The Kineticist's Workbench: Qualitative/Quantitative Simulation of Chemical Reaction Mechanisms

MICHAEL EISENBERG

MIT Laboratory for Computer Science, Cambridge

Abstract: The Kineticist's Workbench is a program combining symbolic and numerical techniques, intended to assist chemists in simulating, understanding, and analyzing chemical reaction mechanisms. This paper describes the three major modules that currently comprise the Workbench: a numerical simulation module; a graphical analysis module that uses mechanism structure to make predictions about behavior; and a qualitative analysis module that interprets the numerical results of simulation and produces qualitative descriptions of those results.

1. INTRODUCTION

THE KINETICIST'S WORKBENCH is a program currently being developed by the author, and intended to assist chemists in simulating, understanding, and analyzing complex chemical reaction mechanisms. To this end, the Workbench combines a variety of numerical and symbolic techniques. A chemist may use the program to simulate some given mechanism numerically; she may use it to perform certain types of analysis on the mechanism a priori, in the absence of simulation; or she may use the program to interpret numerical results qualitatively.

This paper describes the current state of the Kineticist's Workbench, and its capabilities. The remainder of this introductory section is a (very highly condensed) introduction to the domain of chemical kinetics, and an explanation of why a program like the Workbench may prove useful to chemists. Section 2 provides an overview of the Workbench program and its several component modules; the following three sections describe these modules in greater detail. Section 6 illustrates the use of the Workbench with an example, and the final section discusses ongoing and future work with the program.

1.1. Chemical Kinetics: An Introduction

In their commonly encountered "textbook" form, chemical reactions appear to be relatively simple discrete events. For instance, the decomposition of dinitrogen pentoxide is typically expressed by the following notation:

$$2\,\mathrm{N_2O_5} \rightarrow 4\,\mathrm{NO_2} + \mathrm{O_2} \tag{1}$$

In this form, the reaction appears to proceed by a collision of two N2O5 molecules to yield five product molecules; but the geometry of the reactant molecules, among other considerations, suggests that this simple picture is highly improbable. In point of fact, a more accurate picture of the decomposition process is given by the following set of four reactions:

$$\mathrm{N_2O_5} \rightleftharpoons \mathrm{NO_2} + \mathrm{NO_3} \rightarrow \mathrm{NO_2} + \mathrm{NO} + \mathrm{O_2}$$

$$\mathrm{NO} + \mathrm{N_2O_5} \rightarrow 3\,\mathrm{NO_2} \tag{2}$$

The four reactions depicted in (2) are known as elementary steps; unlike (1), they are intended to denote discrete collision or decomposition events.¹ Together these reactions constitute a mechanism for the process summarized by (1); and they specify a set of differential equations that describe the rates of change of concentration for all species:²

$$\frac{d[\mathrm{N_2O_5}]}{dt} = -k_1[\mathrm{N_2O_5}] + k_2[\mathrm{NO_2}][\mathrm{NO_3}] - k_4[\mathrm{NO}][\mathrm{N_2O_5}] \tag{3.1}$$

Acknowledgments--This work has benefitted from the guidance and support of many people. Special thanks go to Harold Abelson, Gerald Sussman, Irving Epstein, and Ken Yip.

Requests for reprints should be sent to: Michael Eisenberg, MIT Lab for Computer Science, 545 Technology Square, Rm 429, Cambridge, MA 02139.

1 The notation for mechanisms used in this paper is a bit more compact than that usually found in the chemical literature. The briefer notation used here is somewhat more helpful for understanding the graphical analysis techniques to be described later. Occasionally, for the sake of clarity, the typical "expanded" notation will be employed.

2 The values k1, k2, and so on are known as rate constants; briefly, these are proportionality constants reflecting the speed of each elementary reaction. Also, the units of concentration, time, and rate constants are chosen to be internally consistent; typical choices of units are moles/liter for concentration, seconds for time, seconds^-1 for first-order rate constants like k1, and so on.


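$$\frac{d[\mathrm{NO_2}]}{dt} = k_1[\mathrm{N_2O_5}] - k_2[\mathrm{NO_2}][\mathrm{NO_3}] + 3k_4[\mathrm{NO}][\mathrm{N_2O_5}] \tag{3.2}$$

$$\frac{d[\mathrm{NO_3}]}{dt} = k_1[\mathrm{N_2O_5}] - k_2[\mathrm{NO_2}][\mathrm{NO_3}] - k_3[\mathrm{NO_2}][\mathrm{NO_3}] \tag{3.3}$$

$$\frac{d[\mathrm{NO}]}{dt} = k_3[\mathrm{NO_2}][\mathrm{NO_3}] - k_4[\mathrm{NO}][\mathrm{N_2O_5}] \tag{3.4}$$

$$\frac{d[\mathrm{O_2}]}{dt} = k_3[\mathrm{NO_2}][\mathrm{NO_3}] \tag{3.5}$$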
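The passage from a mechanism such as (2) to rate equations such as (3.1)-(3.5) is mechanical once mass-action kinetics is assumed: each elementary step contributes a rate equal to its rate constant times the product of its reactant concentrations. The Python sketch below illustrates that bookkeeping. It is not the Workbench's own code; the function names, the dictionary representation of steps, and the numeric values are assumptions made for the example.

```python
def mass_action_derivatives(steps, conc):
    """Return d[X]/dt for every species, assuming mass-action kinetics.

    steps: list of (reactants, products, k); reactants and products map a
           species name to its stoichiometric coefficient.
    conc:  dict mapping species name to current concentration.
    """
    deriv = {name: 0.0 for name in conc}
    for reactants, products, k in steps:
        # Rate of this step: k times the product of reactant concentrations.
        rate = k
        for name, coeff in reactants.items():
            rate *= conc[name] ** coeff
        # Reactants are consumed, products are formed.
        for name, coeff in reactants.items():
            deriv[name] -= coeff * rate
        for name, coeff in products.items():
            deriv[name] += coeff * rate
    return deriv


def n2o5_steps(k1, k2, k3, k4):
    """The four elementary steps of mechanism (2)."""
    return [
        ({"N2O5": 1}, {"NO2": 1, "NO3": 1}, k1),
        ({"NO2": 1, "NO3": 1}, {"N2O5": 1}, k2),
        ({"NO2": 1, "NO3": 1}, {"NO2": 1, "NO": 1, "O2": 1}, k3),
        ({"NO": 1, "N2O5": 1}, {"NO2": 3}, k4),
    ]


# Hypothetical rate constants and concentrations, for illustration only.
example_conc = {"N2O5": 2.0, "NO2": 0.5, "NO3": 0.25, "NO": 0.1, "O2": 0.0}
example_derivs = mass_action_derivatives(n2o5_steps(1.0, 2.0, 3.0, 4.0), example_conc)
for species in sorted(example_derivs):
    print(species, example_derivs[species])
```

Note that NO2 appears on both sides of the third step, so that step's contributions cancel and no k3 term survives in the rate of change of [NO2], in agreement with (3.2).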

A mechanism such as (2) is inevitably a hypothesis: it represents the chemist's current working model of how some "real-world" reactive system behaves. Typically, a kineticist interested in a process such as the decomposition of N2O5 will first propose a mechanism for that process; derive a set of differential equations (such as (3.1)-(3.5)) from the proposed mechanism; and finally attempt to corroborate (or disprove) the hypothetical mechanism by comparing the quantitative predictions of the model with experimental results from the laboratory.

The paragraphs above represent an extremely condensed and skeletal treatment of the fundamentals of chemical kinetics (fuller treatments may be found in Laidler (1987) and Mahan (1972)), but they should suggest to the reader the essential nature of much of the kineticist's work. An important point to note is that chemical mechanisms (and the systems they model) are often extremely complex, and their behavior may be hard to predict. This is not too surprising when one considers that even the relatively simple mechanism (2) is represented mathematically by the set of nonlinear ordinary differential equations (3.1)-(3.5); and, as is well known, systems of nonlinear ODEs can give rise to a bewildering range of behavior. Over the last few decades, there has been a burgeoning interest in chemical systems exhibiting strange phenomena such as bistability, oscillations, "chemical chaos," and so forth (Rössler, 1979). The difficulty of the kineticist's job is further exacerbated by the fact that, for many mechanisms, the number of elementary reactions may be large, sometimes in the hundreds.³

1.2. Qualitative and Quantitative Understanding of Mechanisms

Suppose that a chemist is presented with a mechanism consisting of a few dozen reactions, and represented by a large set of differential equations such as (3) above. How can the chemist begin to get some understanding of the behavior of this model?

3 See, for instance, Isbarn, Ederer, & Ebert (1981): their mechanism for the decomposition of n-hexane involves 240 elementary steps. It should also be mentioned that there are difficulties arising from inhomogeneous systems: these must be described with partial differential equations rather than ordinary differential equations. The Kineticist's Workbench, and this paper, deal with homogeneous systems only.

The first and simplest answer is to perform a numerical simulation of the model on a computer. This is not quite as straightforward as it may sound--chemical systems are frequently "stiff" and hence may pose special problems for numerical integration--but by now a variety of chemical simulation programs exist for this purpose (Byrne, 1981; Shacham, 1985). The output of such a program will generally be a graph (or table) of numbers representing the concentrations of species of interest over time.

Of course, once the integration routine has produced this huge set of numbers, the chemist's job is not over. The numbers have to be scanned for interesting features: perhaps certain species are at a constant (or near-constant) concentration throughout much of the simulation; or there may be rapid jumps in concentrations, or sustained oscillations. It may be that much of the interesting behavior of the system occurred during a relatively brief portion of the simulation.

The chemist's work cannot even have been said to start with the simulation: a certain amount of interpretation inevitably precedes the numerical work. The chemist may use certain graphical techniques, or heuristics, to anticipate the behavior of the mechanism, or may look for possible simplifications to the mechanism. (It may be, for instance, that some elementary steps can be dropped from the mechanism while leaving its probable behavior unchanged.) And of course, the chemist's work will typically require not just one numerical simulation but many: he may want to vary some parameter (e.g., the rate constant of a particular elementary step) to see how that parameter affects the qualitative behavior of the model.

In general, then, the work of understanding how chemical reactions proceed consists of a mixture of "styles" of reasoning, involving both quantitative calculations and qualitative judgment in varying degrees. Historically, only the quantitative aspect of the work has been delegated to computers; but some of the qualitative reasoning is itself routine and can be profitably automated. The Kineticist's Workbench program is designed to realize this goal by augmenting the standard techniques of numerical simulation with methods for predicting and interpreting the results of simulations. In this way, the Workbench can free the scientist from the more mundane tasks of simulation (whether those tasks happen to be quantitative or not). With this motivation in mind, we now turn to a description of the program itself.

2. THE KINETICIST'S WORKBENCH: AN OVERVIEW

The Kineticist's Workbench currently consists of three main modules:


• A numerical integration module. This portion of the Workbench program is responsible for performing numerical integration and graphing numerical results.

• A graphical analysis module. This portion of the program examines mechanisms symbolically and makes predictions about mechanism behavior based solely on the components of the reactions.

• A qualitative analysis module. This portion of the program interprets the results of numerical simulations and attempts to find features of interest (such as steady states, rapid changes in concentrations, or oscillations) in those results.

These three modules will be described in the following sections;⁴ but before proceeding with that discussion, it is worth mentioning several of the design criteria underlying the construction of the Workbench as a whole.

First, as noted, the Workbench employs a wide range of techniques for working with reaction mechanisms. It is an important feature of the program that these techniques not merely coexist, but cooperate: that they are capable of sharing their results where appropriate. For instance, the numerical simulation routines can use the results of graphical analysis to decide when to halt a simulation; while the qualitative analysis routines use numerical results as their input. The point of having multiple modules, then, is not merely to provide a battery of results for the user, but to combine those results in meaningful ways.

A second design principle can be expressed as something that the Workbench is not: namely, it is not an attempt to develop a purely qualitative formalism for studying chemical mechanisms. Numerical results are the fundamental data for the Workbench: they are predicted by graphical analysis and interpreted by qualitative analysis, but no attempt is made to circumvent numerical integration altogether. In this sense, the Workbench may be contrasted with attempts to develop formalisms for qualitative physics (such as an algebra of confluences (de Kleer & Brown, 1984)). Work in qualitative physics is both fascinating and useful, but its application to complex chemical systems is still unclear, since the behavior of these systems often varies tremendously based on the fine tuning of numerical parameters; an exclusively qualitative vocabulary for simulating these nonlinear systems might prove so wordy that its usual advantages over numerical simulation (compactness, cognitive perspicuity) could disappear altogether.

Finally, the Workbench uses certain kinds of domain-related knowledge in its analysis of mechanisms, and likewise uses domain-related concepts in expressing its results. For example, some of the graphical analysis routines employ heuristics derived from standard assumptions of reactive chemistry (e.g., that reactions obey "mass action" kinetics (Feinberg, 1980)); while the qualitative results are expressed in terms ("steady state," "equilibrium," and so on) that are meaningful to the working chemist. That is to say, the Workbench is intended as a tool for chemists, and as such it should not require its users to develop entirely new formalisms for supplying input to the program or reading its output.

4 A fourth module, for spotting "patterns" such as fast equilibria in mechanisms, is currently under construction and will not be described in this paper.

Before going on to examine the program in detail, some words on the implementation are in order. The Kineticist's Workbench is written in Scheme and currently runs on a Hewlett-Packard Series 9000/350 workstation (based on a 68030 processor). The program makes use of the IMSL FORTRAN library for some numerical operations, and the Scheme graphics operations are compatible with the X Window System. Because almost all of the Workbench is embedded in a Scheme system that freely mixes interpreted and compiled code, it is easy to write new procedures on the fly while using the program and to examine and test the code interactively.

3. NUMERICAL INTEGRATION

The Kineticist's Workbench program contains an integration module that permits the user to simulate chemical mechanisms numerically. Basically, the user provides the program with the relevant input values: the mechanism to be simulated (written as a Scheme list, which will be described shortly); starting concentrations of all species; and some simulation parameters (including the integration algorithm to be used). With this information, the program derives the appropriate differential equations and then simulates the mechanism. The numerical results may be presented in any of several ways: they may be printed or graphed on the screen, or stored in a file whose name is supplied by the user.

Currently the program supports two integration algorithms: a fourth-order Runge-Kutta algorithm and an adaptive-step Gear algorithm. The latter is a predictor-corrector method appropriate for particularly large or stiff systems of equations, while the former is easier to use for simple systems.⁵ In either case, the user must also supply additional parameters: starting and ending times for the simulation, and a time-step value (in the case of the Gear algorithm, this value is the initial time-step to use).

5 The Gear integrator is part of the IMSL Mathematical Library (supplied by IMSL, Inc.) and is written in FORTRAN. When a Gear integration is requested, the Workbench automatically constructs and compiles the appropriate FORTRAN code while linking in the IMSL Library; runs the code; and then retrieves the numbers produced by running that code.

As a brief example of how this Workbench module operates, consider the following tiny sample mechanism:

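$$\mathrm{A} \rightarrow \mathrm{B} + \mathrm{C}, \qquad k_1 = 1$$

$$\mathrm{B} + \mathrm{C} \rightarrow \mathrm{A}, \qquad k_2 = 2$$

$$\mathrm{B} \rightarrow 2\,\mathrm{D}, \qquad k_3 = 1.5$$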

The user can supply this mechanism as input to the Workbench in several ways; the most compact method is to define a new mechanism using the Scheme interpreter, as follows:

    (define sample-mechanism
      (make-mechanism-from-step-list
       '((((a 1)) ((b 1) (c 1)) 1)      ; list of steps
         (((b 1) (c 1)) ((a 1)) 2)
         (((b 1)) ((d 2)) 1.5))))

The argument to the procedure make-mechanism-from-step-list is a list composed of three "single-step elements," each of which is itself a list consisting of reactants, products, and rate constant for the given step. The mechanism structure created by this expression actually has a few additional slots (in this case, all empty) for additional information (species whose concentration is constant, species that are supplied from an external source, and so forth).

Now the user must specify a few integration parameters; an easy method is to take a parameter list provided with one of the sample files in the program, and edit that list to contain the appropriate values:

    (define sample-parameter-list
      '((starting-dt 0.05)
        (start-time 0.)
        (end-time 20.)
        (actual-starting-concs ((a 2.) (b 0.) (c 0.) (d 0.)))
        (integration-method runge-kutta)
        (focus-species (a b d))
        (steps-per-display 20)))

The only elements in this list that perhaps need explication are the final two. The idea here is that we wish to display the concentration values of species A, B, and D every 20 time steps (corresponding to one second of simulated time).

Finally, by evaluating the following expression:

(do-simple-mechanism-run sample-mechanism sample-parameter-list)

the user can see the result of simulating the mechanism numerically (the first few values are shown below).⁶

    Species: d    Concentration: 0.
    Species: b    Concentration: 0.
    Species: a    Concentration: 2.

    Species: d    Concentration: 1.0829351858700385
    Species: b    Concentration: .38061488796522547
    Species: a    Concentration: 1.0779175190997552
    Current time: 1.[digits illegible]

    Species: d    Concentration: 1.97060533068859
    Species: b    Concentration: .22755722640644568
    Species: a    Concentration: .7871401082492592
    Current time: 2.[digits illegible]
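For readers without access to the Workbench, the same run can be approximated with a few lines of ordinary code. The Python sketch below hand-codes the mass-action rate equations for the three-step sample mechanism and advances them with a classical fourth-order Runge-Kutta step, using the step size and display interval from the parameter list above. It is an independent illustration, not the Workbench's integrator; the function names are hypothetical.

```python
def rhs(state, k1=1.0, k2=2.0, k3=1.5):
    """Mass-action rate equations for A -> B + C, B + C -> A, B -> 2 D."""
    a, b, c, d = state
    r1 = k1 * a          # A -> B + C
    r2 = k2 * b * c      # B + C -> A
    r3 = k3 * b          # B -> 2 D
    return (-r1 + r2, r1 - r2 - r3, r1 - r2, 2.0 * r3)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    s1 = f(state)
    s2 = f(tuple(x + 0.5 * dt * s for x, s in zip(state, s1)))
    s3 = f(tuple(x + 0.5 * dt * s for x, s in zip(state, s2)))
    s4 = f(tuple(x + dt * s for x, s in zip(state, s3)))
    return tuple(
        x + dt / 6.0 * (p + 2.0 * q + 2.0 * r + w)
        for x, p, q, r, w in zip(state, s1, s2, s3, s4)
    )


def simulate(state, dt=0.05, n_steps=400, steps_per_display=20):
    """Return [(time, (a, b, c, d)), ...], recorded every steps_per_display steps."""
    records = [(0.0, state)]
    for i in range(1, n_steps + 1):
        state = rk4_step(rhs, state, dt)
        if i % steps_per_display == 0:
            records.append((i * dt, state))
    return records


# Starting concentrations from the sample parameter list: a = 2, others 0.
for time, (a, b, c, d) in simulate((2.0, 0.0, 0.0, 0.0))[:3]:
    print("Current time:", time, " a:", a, " b:", b, " d:", d)
```

If the sketch matches the model used in the paper, the printed concentrations at times 1 and 2 should land close to the values shown above. A useful sanity check on any such run is that the quantity [A] + [B] + [D]/2 stays at its initial value, since each step of the mechanism conserves it.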

There are some other features of the numerical integration module that relate to its interaction with other portions of the program; these will be discussed later in this paper.

4. GRAPHICAL ANALYSIS OF MECHANISMS

Often there is a good deal of useful information to be gleaned from a mechanism even before that mechanism is simulated numerically. Consider, as an example, the following very simple pair of elementary reactions:

$$\mathrm{A} \rightleftharpoons \mathrm{B} \tag{4}$$

This graph represents two unimolecular elementary steps: a reaction from A to B and a reverse reaction from B to A. By inspection, we can tell that this system must approach equilibrium at nonzero concentrations of A and B (assuming that we start with a nonzero concentration of either species, and that the system follows typical mass action kinetics). We did not need to run a simulation of the system to reach this conclusion: rather, we were able to predict this feature of the system's behavior from the structure of the reactions themselves (along with the domain knowledge of kinetics that interprets mechanism graphs such as (4)). Indeed, we did not even need the exact rate constants of the reactions to know that the system would reach equilibrium; we merely used the assumption that those rate constants were nonzero. The equilibrium concentrations of A and B in (4) do, of course, depend on the rates of the reactions, but the fact that the system reaches equilibrium at all is derived from mechanism structure alone.
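A short worked calculation makes the last point concrete. The symbols $k_f$ and $k_r$ (forward and reverse rate constants) and $T$ (the conserved total concentration) are labels introduced here for illustration; they do not appear in the original notation. Under mass action kinetics,

$$\frac{d[\mathrm{A}]}{dt} = -k_f[\mathrm{A}] + k_r[\mathrm{B}], \qquad [\mathrm{A}] + [\mathrm{B}] = T .$$

Setting the derivative to zero and substituting $[\mathrm{B}] = T - [\mathrm{A}]$ gives

$$[\mathrm{A}]_{eq} = \frac{k_r}{k_f + k_r}\,T, \qquad [\mathrm{B}]_{eq} = \frac{k_f}{k_f + k_r}\,T .$$

Both values are strictly positive whenever $k_f$, $k_r$, and $T$ are positive, which is the structural conclusion; the particular values depend on the ratio of the rate constants.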

6 If the user wishes to see the concentration values of certain species graphed on the screen, some additional parameters must be specified dictating (among other things) which species to graph, the ordinate and abscissa ranges of the various graphs, and so forth.


There are actually a wide variety of such structural insights that can be brought to bear upon mechanisms, and a wide variety of mechanisms to which at least some structural analysis may be applied. Another example may convey some flavor of the sort of reasoning involved. Consider the following mechanism:

$$\mathrm{A} \rightleftharpoons \mathrm{B} \rightleftharpoons \text{External World}$$

$$\mathrm{B} + \mathrm{D} \rightarrow \mathrm{E} \rightleftharpoons \mathrm{F} \tag{5}$$

The only new wrinkle in the notation here is represented by the graph vertex labeled "External World"; this indicates that there is both an external source and sink for species B. Without going into painstaking detail, we can again inspect the structure of this mechanism and deduce, among other things, that species D will eventually approach a concentration of zero; that species E and F will approach equilibrium concentrations that depend on (among other things) the initial concentration of D; and that A and B reach equilibrium concentrations that are in fact independent of their initial concentrations and depend only on the rate constants of the four reactions shown in the upper row of (5). Again, we have arrived at several potentially useful conclusions without actually bothering to run a simulation of the mechanism; indeed, in this particular case we could deduce the (asymptotically approached) equilibrium concentrations of A and B by a variety of means other than simulation.
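One such means is a direct steady-state calculation, sketched here under stated assumptions. The labels $k_{AB}$, $k_{BA}$, $k_{out}$, and $k_{in}$ are introduced for illustration only, and the external source is assumed to supply B at a constant (zero-order) rate $k_{in}$, which the paper does not state explicitly. Once $[\mathrm{D}]$ has fallen to zero, the step $\mathrm{B} + \mathrm{D} \rightarrow \mathrm{E}$ no longer consumes B, and

$$\frac{d[\mathrm{A}]}{dt} = -k_{AB}[\mathrm{A}] + k_{BA}[\mathrm{B}], \qquad \frac{d[\mathrm{B}]}{dt} = k_{AB}[\mathrm{A}] - k_{BA}[\mathrm{B}] - k_{out}[\mathrm{B}] + k_{in} .$$

Adding the two equations gives $d([\mathrm{A}] + [\mathrm{B}])/dt = k_{in} - k_{out}[\mathrm{B}]$, so at the steady state

$$[\mathrm{B}]_{ss} = \frac{k_{in}}{k_{out}}, \qquad [\mathrm{A}]_{ss} = \frac{k_{BA}}{k_{AB}} \cdot \frac{k_{in}}{k_{out}} .$$

Neither expression involves an initial concentration; only the four rate constants of the upper row of (5) appear.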

4.1. Consistent Zero Sets of Species

The Workbench performs graphical analysis of mechanisms using notions such as "necessary nonzero species sets" and "necessarily declining sets of species." These and several other concepts built into the program are described in greater detail in Eisenberg (1990). For the purpose of illustration, this paper will discuss two concepts that come up in graphical analysis: consistent zero sets of species, and the zero deficiency theorem.

The notion of consistent zero sets comes up when we ask a question of the form: "Given that a mechanism approaches a steady state in which the concentration of some species X is zero, what other species in the mechanism must likewise be at zero concentration in that steady state?" To take an example, consider the following mechanism:

$$\mathrm{A} + \mathrm{B} \rightleftharpoons \mathrm{C} + \mathrm{D} \rightleftharpoons \mathrm{E}$$

Here, if A has zero concentration at some steady state, then so must E and at least one of C and D; but species B (and at most one of C and D) might well have nonzero concentration. We could conclude from this reasoning that the sets [A D E] and [A C E] (and supersets of these) are both consistent zero sets for the mechanism.
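The reasoning in this example can be phrased as a closure rule: a set Z of species can remain at zero only if every elementary step that produces a member of Z has at least one reactant in Z, so that the step cannot proceed. The Python sketch below applies that rule by brute-force enumeration. It is an illustration under that stated assumption and is not claimed to be the Workbench's algorithm; the function names and the tuple representation of steps are hypothetical.

```python
from itertools import combinations


def is_consistent_zero_set(steps, zero_set):
    """True if no step that can still proceed produces a species in zero_set.

    steps: list of (reactants, products), each a tuple of species names.
    """
    zero = set(zero_set)
    for reactants, products in steps:
        can_proceed = not (set(reactants) & zero)
        if can_proceed and (set(products) & zero):
            return False
    return True


def minimal_consistent_zero_sets(steps, must_include=()):
    """Enumerate minimal consistent zero sets that contain must_include."""
    species = sorted({name for lhs, rhs in steps for name in (*lhs, *rhs)})
    required = set(must_include)
    found = []
    # Visiting candidates by increasing size means any superset of an
    # already-found set can be skipped as non-minimal.
    for size in range(1, len(species) + 1):
        for combo in combinations(species, size):
            candidate = set(combo)
            if not required <= candidate:
                continue
            if any(smaller <= candidate for smaller in found):
                continue
            if is_consistent_zero_set(steps, candidate):
                found.append(candidate)
    return found


# The mechanism A + B <-> C + D <-> E, written as four one-way steps.
example_steps = [
    (("A", "B"), ("C", "D")),
    (("C", "D"), ("A", "B")),
    (("C", "D"), ("E",)),
    (("E",), ("C", "D")),
]
for zero_set in minimal_consistent_zero_sets(example_steps, must_include=("A",)):
    print(sorted(zero_set))
```

Exhaustive enumeration is exponential in the number of species, so this form is only suitable for small mechanisms; it is meant to make the definition concrete rather than to be efficient.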

4.2. The Zero Deficiency Theorem

The "zero deficiency theorem," derived by M. Feinberg and colleagues at the University of Rochester (Feinberg, 1980), is a useful and often powerful tool for graphical analysis of mechanisms. Space does not permit a complete explanation of the theorem, but essentially the purpose of the deficiency concept is to identify a certain class of mechanisms for which it is possible to conclude (on the basis of structural information alone) whether the mechanism has a locally stable equilibrium state in which the concentrations of all species are nonzero.

To give a telegraphic treatment of the theorem here: first, we assume that a mechanism is represented by the kind of graph structure used throughout this paper. As an example, we can use the mechanism (5) introduced earlier.

$$\mathrm{A} \rightleftharpoons \mathrm{B} \rightleftharpoons \text{External World}$$

$$\mathrm{B} + \mathrm{D} \rightarrow \mathrm{E} \rightleftharpoons \mathrm{F} \tag{5}$$

We now introduce the following concepts:

1. The number of distinct vertices (that is, left or right sides of reactions) in the graph, which will be denoted as n. (For our example, this number is 6.)

2. The number of connected components (portions linked by arrows) in the graph, denoted by l. (For our example, this number is 2.)

3. Whether the mechanism has the feature that for every pair of vertices v1, v2, if there is some directed path of reaction arrows leading from v1 to v2 then there is some pathway back from v2 to v1. This feature is denoted weak reversibility. (Our example is not weakly reversible, since there is a path from the vertex B + D to the vertex E, but no path from E to B + D.)

4. By treating the reactions in the mechanism as vectors in a particular vector space (the space in question is R^m, where m is the number of distinct species in the mechanism), we compute the dimension s of the space spanned by the reaction vectors. (For our example, this number turns out to be 4.)

5. Finally, we find the value of n - s - l, and call this value the deficiency. If this number is zero, then we can deduce the following interesting information: namely, that if the mechanism is weakly reversible then it has a unique locally stable equilibrium in which all species are at nonzero concentration. If, on the other hand, the zero-deficiency mechanism is not weakly reversible, then there exists no equilibrium state in which all species are at nonzero concentration. (In our example, the mechanism has a deficiency of zero but is not weakly reversible; therefore it does not have a stable equilibrium in which all species have nonzero concentration. Indeed, as we have already concluded, the concentration of species D must approach zero over time.)
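The five quantities above are straightforward to compute for a small mechanism. The Python sketch below does so for mechanism (5), representing the "External World" vertex as an empty complex. It is a minimal illustration of the definitions as stated in this section, not the Workbench's implementation; all function names and the dictionary representation of steps are assumptions made for the example.

```python
from fractions import Fraction


def complex_key(side):
    """Canonical, hashable form of one side of a step (a graph vertex)."""
    return tuple(sorted(side.items()))


def rank(rows):
    """Rank of a list of integer vectors, by elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def reachable(start, edges):
    """All vertices reachable from start by following edges (start included)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def analyze(steps):
    """steps: list of (reactant_dict, product_dict), one entry per one-way step."""
    arrows = [(complex_key(lhs), complex_key(rhs)) for lhs, rhs in steps]
    vertices = {v for arrow in arrows for v in arrow}
    species = sorted({name for v in vertices for name, _ in v})
    directed, undirected = {}, {}
    for a, b in arrows:
        directed.setdefault(a, set()).add(b)
        undirected.setdefault(a, set()).add(b)
        undirected.setdefault(b, set()).add(a)
    # Count connected components of the graph, ignoring arrow direction.
    n_components, unvisited = 0, set(vertices)
    while unvisited:
        unvisited -= reachable(next(iter(unvisited)), undirected)
        n_components += 1
    # Reaction vector of a step: product coefficients minus reactant coefficients.
    vectors = [[dict(b).get(x, 0) - dict(a).get(x, 0) for x in species] for a, b in arrows]
    s = rank(vectors)
    return {
        "n": len(vertices),
        "l": n_components,
        "s": s,
        "deficiency": len(vertices) - n_components - s,
        # Weakly reversible iff every arrow a -> b has some path from b back to a.
        "weakly_reversible": all(a in reachable(b, directed) for a, b in arrows),
    }


# Mechanism (5); {} stands for the "External World" vertex.
mechanism_5 = [
    ({"A": 1}, {"B": 1}), ({"B": 1}, {"A": 1}),
    ({"B": 1}, {}), ({}, {"B": 1}),
    ({"B": 1, "D": 1}, {"E": 1}),
    ({"E": 1}, {"F": 1}), ({"F": 1}, {"E": 1}),
]
print(analyze(mechanism_5))
```

Checking weak reversibility arrow by arrow is equivalent to the pairwise definition in item 3, since any directed path is a sequence of arrows and each arrow can be retraced individually.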


The reader may be wondering, based on the example shown, whether the zero deficiency theorem involves a great deal of mathematical baggage to achieve an intuitively obvious result. In point of fact, the theorem is often applicable in far more subtle cases. Consider the following example (Feinberg, 1980):

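$$\mathrm{A} + \mathrm{B} \leftarrow \mathrm{D} \rightleftharpoons 2\,\mathrm{C}$$

$$\mathrm{B} + \mathrm{C} \rightleftharpoons \mathrm{E} \rightleftharpoons \mathrm{A} + \mathrm{D} \tag{6}$$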

This is a zero-deficiency mechanism whose graph is not weakly reversible; hence it cannot reach equilibrium with all species at nonzero concentrations. This conclusion is far from being intuitively obvious, but again we did not require simulation to arrive at it. As an aside, it is worth noting that we can now profitably use the earlier concept of consistent zero sets to analyze this mechanism: in particular, now that we know that any steady state for the mechanism will include some species with zero concentration, we can ask which sets of species can possibly have zero concentration at a steady state. This question is pursued in the following subsection.

4.3. An Example of Graphical Analysis

As a brief example of how the Workbench can be used to perform graphical analysis, we can examine an excerpt of the program's treatment of mechanism (6). First we create the mechanism (as shown in Section 3), and then evaluate the following expression:

    (analyze-mechanism-graphically sample-mechanism-2)

Some of the program's printed output is shown below:

    Number of complexes: 6
    Number of linkage classes: 2
    Dimension of stoichiometric subspace: 4
    Deficiency: 0
    Reversibility: none

    Rule number 2.3 -- Deficiency Theorem part 2 is firing

    If the deficiency of the mechanism is 0, and the mechanism is not weakly reversible, then the mechanism cannot reach equilibrium at positive concentrations of all species.

    Rule number 4.3 -- Looking for asymptotic zero concentrations is firing

    The mechanism cannot reach equilibrium with all nonzero concentrations. It also does not contain any *obvious* declining species or sets. We now look for those sets of species that might asymptotically have zero concentration.

    Possible sets of zero-concentration species: ((d e c))
    Subtracting the following zero set: (d e c)

    This mechanism contains no elementary steps.

In summary: the program first concludes that the mechanism will not reach equilibrium with positive concentrations of all species, and then attempts to subtract from the mechanism a consistent set of zero-concentration species. It finds that there is in fact only one consistent set, namely the set [C D E]. The program now subtracts from mechanism (6) all those steps in which these species are reactants, in the hopes of deriving a simpler mechanism to study; after doing so, the program notes that there are no remaining steps from mechanism (6) that can possibly proceed once the three species approach zero concentration.

5. QUALITATIVE ANALYSIS OF SIMULATION RESULTS

Once a simulation has been performed, the Workbench can (if the user desires) examine the numerical results and attempt to produce a qualitative summary of the events of the simulation. The program first groups the simulation's results into discrete periods of time corresponding to distinguishable "operational states" of the mechanism, and then looks for patterns in the sequence of time periods that it has identified.

5.1. Grouping Results into "Episodes"

The central concept in the Workbench's strategy for producing qualitative summaries is that of the episode. The idea is best illustrated through an example.⁷ Consider the following simple mechanism (here expressed in an "expanded" notation):

$$\mathrm{A} \rightarrow \mathrm{B} \tag{7.1}$$

$$\mathrm{B} \rightarrow \mathrm{C} \tag{7.2}$$

$$\mathrm{C} \rightarrow \mathrm{B} \tag{7.3}$$

The differential equations specifying the behavior of this system are as follows:

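$$\frac{d[\mathrm{A}]}{dt} = -k_1[\mathrm{A}] \tag{8.1}$$

$$\frac{d[\mathrm{B}]}{dt} = k_1[\mathrm{A}] - k_2[\mathrm{B}] + k_3[\mathrm{C}] \tag{8.2}$$

$$\frac{d[\mathrm{C}]}{dt} = k_2[\mathrm{B}] - k_3[\mathrm{C}] \tag{8.3}$$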

Examination of equations (8.1)-(8.3) reveals that each separate term in each equation corresponds to the change in some species' concentration due to the contribution of a particular reaction. For instance, the two terms on the right of (8.3) correspond to the increase in the concentration of species C due to reaction B → C, and the decrease in the concentration of C due to the reaction C → B. This correspondence between differential equation terms and contributions from individual elementary steps is a staple of mass action kinetics; in general, the differential equation for a given species X will have as many terms as there are reactions in which X occurs either as a reactant or product.

7 More detail is given in Eisenberg (1989).

At any point during a simulation, then, we can examine the concentrations of all species and plug those concentration values into all the terms of the differential equations governing the system. For each species, we can derive an ordering on the terms in that species' differential equation: the ordering corresponds to increasing absolute value of each term. To continue our example, suppose that the values of k1, k2, and k3 are 2, 0.5, and 1, respectively; and suppose that at a given moment in time, the concentrations of A, B, and C are 3, 1, and 0.6. In this case, the three terms in the differential equation for B have the values 6, -0.5, and 0.6, respectively; while the two terms in the equation for C have the values 0.5 and -0.6. Thus, ordering the terms by absolute value, we would say that the term corresponding to reaction (7.1) is largest for species B, while the term for reaction (7.3) is largest for species C. An episode corresponds to a period of time during the simulation during which these reaction orderings remain stable for each species. If at some point during the simulation, the concentrations of A, B, and C change so that (say) the contribution of reaction (7.2) becomes larger than that for (7.3) in determining the change in the concentration of C, then we would say that a new episode of the simulation has begun; and this new episode remains in effect until the reaction orderings again change for either B or C.

To sum up, then: at any given time during the simulation, we look to see how each reaction contributes to the concentration change of each species; and for each species, we order the contributing reactions according to their importance. An episode corresponds to a period of time during which these orderings remain stable.
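The episode definition can be made concrete with a short sketch. The Python code below computes, for each species, the signed contribution of every elementary step at a given moment, orders the steps by increasing absolute contribution, and then groups successive time points of a trajectory into episodes whenever those orderings are unchanged. It follows the description in this section only; the function names, the step labels, and the data layout are hypothetical, and the Workbench's actual episode data structure is not reproduced here.

```python
def term_values(steps, conc):
    """Map species -> {step label: signed contribution to d[species]/dt}.

    steps: list of (label, reactants, products, k), with reactants and
           products as dicts of species name -> stoichiometric coefficient.
    """
    terms = {}
    for label, reactants, products, k in steps:
        rate = k
        for name, coeff in reactants.items():
            rate *= conc[name] ** coeff
        for name in set(reactants) | set(products):
            net = products.get(name, 0) - reactants.get(name, 0)
            if net != 0:  # a species unchanged by the step gets no term
                terms.setdefault(name, {})[label] = net * rate
    return terms


def orderings(steps, conc):
    """Map species -> step labels ordered by increasing absolute contribution."""
    return {
        name: tuple(sorted(contrib, key=lambda label: abs(contrib[label])))
        for name, contrib in term_values(steps, conc).items()
    }


def split_into_episodes(steps, trajectory):
    """Group a list of (time, conc) points into (t_start, t_end, orderings)."""
    episodes = []
    for time, conc in trajectory:
        current = orderings(steps, conc)
        if episodes and episodes[-1][2] == current:
            episodes[-1] = (episodes[-1][0], time, current)
        else:
            episodes.append((time, time, current))
    return episodes


# Mechanism (7.1)-(7.3) with the rate constants used in the text.
steps_7 = [
    ("7.1", {"A": 1}, {"B": 1}, 2.0),
    ("7.2", {"B": 1}, {"C": 1}, 0.5),
    ("7.3", {"C": 1}, {"B": 1}, 1.0),
]
snapshot = {"A": 3.0, "B": 1.0, "C": 0.6}
print(term_values(steps_7, snapshot)["B"])  # contributions of the three steps to d[B]/dt
print(orderings(steps_7, snapshot))
```

For the snapshot in the text, the last label in each ordering is the dominant step: (7.1) for species B and (7.3) for species C. Exact ties between terms are broken here by the order in which steps are listed, which is an arbitrary choice of this sketch.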

5.2. Analyzing Episode Sequences

Having seen how episodes are defined, we can now examine how the Workbench produces qualitative summaries of behavior. First, the program identifies the episode boundaries within the numerical results; it then looks for specific features (such as rapid concentration growth or decline for some species) within each episode. For instance, the program may find that the concentration of species X increased by more than 50 percent during a particular episode, and that no decline in X was recorded for any individual time step during that episode. In this case, the Workbench would record that X experienced a large and steady (monotonic) increase during that episode, and would attach a feature descriptor of this sort to the episode.
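The feature test just described can be sketched as a small predicate. The Python function below is an illustration of the stated criterion only (growth of more than 50 percent with no decline at any individual time step); the function name, the default threshold argument, and the treatment of a zero starting concentration are assumptions made here, not details taken from the Workbench.

```python
def large_steady_increase(values, threshold=0.5):
    """True if a species grew by more than `threshold` (as a fraction of its
    starting value) over an episode without declining at any time step.

    values: concentrations of one species at successive time steps
            within a single episode.
    """
    if len(values) < 2 or values[0] <= 0:
        # Relative growth from a zero start is undefined; this sketch
        # simply reports no feature in that case (an assumption).
        return False
    no_decline = all(later >= earlier for earlier, later in zip(values, values[1:]))
    grew_enough = (values[-1] - values[0]) / values[0] > threshold
    return no_decline and grew_enough


print(large_steady_increase([1.0, 1.2, 1.4, 1.6]))  # steady growth of 60 percent
print(large_steady_increase([1.0, 1.7, 1.6, 1.8]))  # grows overall, but dips once
```

A descriptor for a large and steady decline could be written symmetrically.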

The question of identifying relatively "rapid" growth or decline of concentration depends on forming a particular time scale for the simulation. A 50% growth in concentration may be deemed rapid if it occurs in-a relatively brief time, or slow if it occurs over an extremely long time. The Workbench derives an implicit time scale for the entire simulation by using the lengths of the longest and shortest episodes as bounds; thus, if a 1 second episode is the shortest found in analyzing some simulation's results, then that episode will be deemed a brief time for the purposes of identifying interesting features. Conversely, if a 1 second episode is much longer than all others found (say, all other episodes are under a millisecond in length), then that episode would be treated as a relatively long time. This heuristic for determining time-scale appears reasonable based on experience with the Workbench so far; it en- ables the program to analyze features in time records that span anywhere from seconds (appropriate for most laboratory systems) to years (which might be appropriate for, say, reactions in the upper atmosphere).

Having annotated the sequence of episodes found in the simulation with features of this kind, the Workbench now looks for patterns in the sequence itself. Specifically, the program notes features such as a long final episode with relatively little concentration change among species (indicative of a system that has reached equilibrium), or repeating patterns among episodes (indicative of the presence of oscillations).[8] This technique will be illustrated by the example in the next section.
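One simple way to look for repeating patterns among episodes is to search for the earliest point after which the sequence of episode descriptors repeats with a fixed period. The Python sketch below does this for a list of labels (one per episode, with the final episode left out). The function name and the label strings are hypothetical, and the Workbench's actual matching procedure may well differ.

```python
from typing import Optional, Sequence, Tuple


def find_repeating_unit(labels: Sequence[str],
                        min_repeats: int = 2) -> Optional[Tuple[int, int]]:
    """Return (start, period) of the earliest periodic run reaching the end, or None."""
    n = len(labels)
    for start in range(n):
        # The run must be long enough to hold min_repeats copies of the unit.
        for period in range(1, (n - start) // min_repeats + 1):
            if all(labels[i] == labels[i + period]
                   for i in range(start, n - period)):
                return start, period
    return None


# Two start-up episodes followed by alternating short and long episodes.
labels = ["long", "other", "short", "long", "short", "long", "short"]
print(find_repeating_unit(labels))  # (2, 2)
```

Here the result `(2, 2)` says that a two-episode unit repeats from the third episode onward (index 2 counting from zero).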

Besides describing the way in which the Workbench uses episode structures, the preceding paragraphs should also provide the reader with some feeling for the rationale behind the episode notion itself. Episodes are intended to correspond to distinct identifiable periods of potentially interesting behavior during a simulation; they are also intended as analogs to the chemist's notion of periods during which some particular reaction "takes over" in importance in determining a mechanism's behavior. The episode data structure provides the Workbench (and the user) with a link between interesting numerical features of some simulation's results, and the underlying reactions that produced those features.

[8] The Workbench is able to note the occurrence of some complex patterns such as birhythmicity, insofar as it will identify two succeeding oscillating sequences; but it currently has no vocabulary for such cases (and for some other patterns, such as "damped oscillations").

6. AN EXAMPLE OF THE WORKBENCH IN USE

Just to see how the Workbench can be used in a complete example, we consider the "Brusselator" mechanism, originally devised by Prigogine and Lefever as a simple model of an oscillating reaction (Nicolis & Prigogine, 1977):

$$A \rightarrow X, \qquad k_1 = 1 \tag{9.1}$$

$$B + X \rightarrow Y + D, \qquad k_2 = 1 \tag{9.2}$$

$$2X + Y \rightarrow 3X, \qquad k_3 = 1 \tag{9.3}$$

$$X \rightarrow E, \qquad k_4 = 1. \tag{9.4}$$

Here, we assume that species A and B are held constant at concentrations of 1 and 3, respectively; as for species D and E, they can be treated as "driven-off species," that is, species whose concentrations are identically 0, inasmuch as they do not enter as reactants in any steps. The behavior of this mechanism will thus be determined by the concentration changes in X and Y.
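For reference, the rate equations implied by steps (9.1)-(9.4) can be written out explicitly. The following assumes mass-action kinetics for each elementary step, and uses the species names to stand for their concentrations; the second form on each line substitutes the rate constants and the constant concentrations A = 1 and B = 3 given above.

```latex
\begin{aligned}
\frac{dX}{dt} &= k_1 A - k_2 B X + k_3 X^2 Y - k_4 X \;=\; 1 - 4X + X^2 Y, \\
\frac{dY}{dt} &= k_2 B X - k_3 X^2 Y \;=\; 3X - X^2 Y.
\end{aligned}
```

Setting both derivatives to zero gives the single steady state X = 1, Y = 3.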

We can input the mechanism by typing in the following expression at the Scheme interpreter:

(define brusselator
  '((
     (((a)) ((x)) 1)
     (((b) (x)) ((y) (d)) 1)
     (((x 2) (y)) ((x 3)) 1)
     (((x)) ((e)) 1))
    ((a 1) (b 3))   ;constant species
    ()              ;sources
    ()              ;sinks
    (d e)           ;driven off
    ()              ;functions of time
    ))

This expression is a bit more complicated than our previous examples, because we are using some of the additional slots available in the description of mechanisms; here, we are specifying in the definition of the Brusselator mechanism that species A and B will have constant concentrations, and that species D and E are to be ignored.

As a first step, we can perform graphical analysis on the mechanism:

(analyze-mechanism-graphically brusselator).

Among the information printed out by the program, we note the following:

Number of complexes: 5
Number of linkage classes: 2
Dimension of stoichiometric subspace: 2
Deficiency: 1
Reversibility: none

Since this mechanism has a deficiency value of 1, we cannot use the zero deficiency theorem to predict its behavior. Informally, we can say that this mechanism might be capable of "exotic" behavior.[9] It should be noted, parenthetically, that the computation of deficiency is a bit more involved than indicated by the brief summary earlier in this paper: In particular, constant and driven-off species have to be handled specially. Thus, the number of complexes (vertices) in the Brusselator mechanism is 5 for the purposes of deficiency calculation (though a naive examination of (9.1)-(9.4) would suggest that there are 7 distinct vertices).
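The figures printed by the graphical analysis module can be reproduced with a short calculation, using the standard definition of deficiency as the number of complexes, minus the number of linkage classes, minus the dimension of the stoichiometric subspace. The Python sketch below is not the Workbench's implementation. In particular, the paper does not spell out here how constant and driven-off species are "handled specially"; the sketch assumes they are simply deleted from every complex, which happens to yield the values reported above. All function names are hypothetical.

```python
from fractions import Fraction


def make_complex(terms, ignored):
    """A complex as a frozenset of (species, coefficient); empty = zero complex."""
    return frozenset((s, n) for s, n in terms.items() if s not in ignored)


def rank(rows):
    """Matrix rank by Gaussian elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def deficiency(reactions, ignored):
    """Return (complexes, linkage classes, subspace dimension, deficiency)."""
    # ASSUMPTION: species in `ignored` are dropped from every complex.
    species = sorted({s for lhs, rhs in reactions for s in (*lhs, *rhs)} - ignored)
    edges = [(make_complex(l, ignored), make_complex(r, ignored)) for l, r in reactions]
    complexes = {c for edge in edges for c in edge}
    parent = {c: c for c in complexes}  # union-find over complexes

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    for a, b in edges:
        parent[find(a)] = find(b)
    linkage = len({find(c) for c in complexes})
    vectors = [[dict(b).get(s, 0) - dict(a).get(s, 0) for s in species]
               for a, b in edges]
    dim = rank(vectors)
    return len(complexes), linkage, dim, len(complexes) - linkage - dim


brusselator = [
    ({"a": 1}, {"x": 1}),
    ({"b": 1, "x": 1}, {"y": 1, "d": 1}),
    ({"x": 2, "y": 1}, {"x": 3}),
    ({"x": 1}, {"e": 1}),
]
print(deficiency(brusselator, {"a", "b", "d", "e"}))  # (5, 2, 2, 1)
```

Calling `deficiency(brusselator, set())` instead counts the 7 vertices that a naive reading of (9.1)-(9.4) suggests.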

We now specify run parameters for the program:

(define brusselator-run-parameters
  `((starting-dt 0.05)
    (start-time 0.)
    (end-time 40.)
    (starting-concs . ((a 1.) (b 3.) (d 0.) (e 0.) (x 0.) (y 0.)))
    (integration-method runge-kutta)
    (focus-species (x y))
    (steps-or-time-per-display 20)
    (end-conditions (end-time))
    (maintain-episode-history? ,true)
    (episode-change-depth 0)
    (graph-window ,display-graph-window)))

There are a few parameters in this list that have not appeared in previous examples. Without going into too much detail, the final few parameters indicate that we will use the Workbench to distinguish episode boundaries during the simulation itself; that episode boundaries will be associated with the single most important reaction for each species (rather than a complete ordering of all reactions); and that a running graph of concentrations will be shown as the simulation proceeds. Another list is now needed to specify some graph parameters (axis limits, graph colors, and so forth); but these are not especially interesting and will not be discussed here.

We now can run a simulation of the mechanism by evaluating the following expression:

(do-simple-graphed-mechanism-run brusselator brusselator-run-parameters brusselator-graph-parameters).
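Readers without access to the Workbench can approximate this run with a few lines of Python. The sketch below integrates the mass-action rate equations for X and Y (with A = 1, B = 3 and all rate constants equal to 1) using a classical fourth-order Runge-Kutta step. It assumes a fixed step of 0.05 throughout, whereas the parameter name `starting-dt` suggests the Workbench may adjust its step; the function names are hypothetical, and no episode bookkeeping is attempted.

```python
def brusselator_rhs(x, y, a=1.0, b=3.0):
    """Mass-action rates for X and Y with all rate constants equal to 1."""
    dx = a - b * x + x * x * y - x
    dy = b * x - x * x * y
    return dx, dy


def rk4_step(x, y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = brusselator_rhs(x, y)
    k2x, k2y = brusselator_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = brusselator_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = brusselator_rhs(x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)


def simulate(dt=0.05, end_time=40.0, x=0.0, y=0.0):
    """Return a list of (time, x, y) samples from time 0 to end_time."""
    steps = round(end_time / dt)
    history = [(0.0, x, y)]
    for n in range(1, steps + 1):
        x, y = rk4_step(x, y, dt)
        history.append((n * dt, x, y))
    return history


history = simulate()
for t, x, y in history[::100]:  # print every 100th sample
    print(f"t={t:5.1f}  x={x:6.3f}  y={y:6.3f}")
```

Plotting the `x` and `y` columns of `history` against time gives a picture comparable in kind to the Workbench's running graph.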

[9] There are in fact additional graphical techniques for examining some deficiency one mechanisms (Feinberg (1980) includes an example); but these have not yet been incorporated into the graphical analysis module of the Workbench.

FIGURE 1. Concentrations of X and Y versus time as graphed by the Workbench.

Running the simulation produces the graph shown in Figure 1; it appears as though the concentrations of species X and Y are oscillating at a regular rate (at least after the very outset of simulation). We can view the episodes constructed by the running simulation by evaluating:

(draw-episodes-from-history).

This adds vertical lines to our graph corresponding to the episode boundaries detected by the running program, as shown in Figure 2. A glance at the episode boundaries indicates that starting with the third episode, there seem to be regular divisions of the graph into two-episode "chunks."

Finally, we can ask the Workbench to analyze the constructed episode history:

(display-feature-analysis).

This causes the interpreter to print out (among other information) the following lines:

Now examine the reaction history for possible oscillations:

((long) (large-increase x) (steady-increase x) (large-increase y) (steady-increase y))
((large-increase x) (steady-increase x))
((short) (large-increase x) (rapid-increase x) (large-decrease y) (rapid-decrease y) (steady-decrease y))
((long) (large-decrease x) (rapid-decrease x) (large-increase y))
((short) (large-increase x) (rapid-increase x) (large-decrease y) (rapid-decrease y))
((long) (large-decrease x) (rapid-decrease x) (large-increase y))

((short) (large-increase x) (rapid-increase x) (large-decrease y) (rapid-decrease y))

((final) (large-decrease x) (rapid-decrease x) (steady-decrease x) (large-increase y) (rapid-increase y) (steady-increase y))

done.

FIGURE 2. The same graph as in Figure 1, but with episode boundaries shown.

Here, the Workbench has indicated that it has found apparent oscillations starting with the third episode. We can describe the events of each unit of the oscillation as consisting of two episodes: the first is short, and involves a large rapid increase in X and large rapid decrease in Y, while the second is longer and involves a large rapid decrease in X and large increase in Y. (These would correspond to the episodes in Fig. 2 beginning at the times 8.05 and 9.3, respectively.)

7. ONGOING AND FUTURE WORK

Space and time do not permit a complete description of all the currently implemented features of the Workbench. Much of the additional structure in the program relates to communication between the various modules, allowing information determined by one module to be used by another. For example, there are cases in which the graphical analysis module may determine that a given mechanism will reach an equilibrium state that is independent of starting concentrations; in this case, the Workbench can use a "rapid search" procedure to find a numerical approximation to the various equilibrium concentrations; and, if a simulation is still desired, the numerical simulation module can be told to halt when the state of the system approaches its equilibrium position. (An example along these lines is included in Eisenberg (1990).)

Communication between modules is still incomplete, however, and many extensions to the program can be envisioned. There are situations, for instance, in which qualitative analysis of a simulation's numerical results might guide an automatic reexamination of "interesting" portions of the results. To make this idea a bit more concrete, consider again the Brusselator example. Here, the simulation revealed an "induction" period, followed by sustained oscillations. One might want to perform some analysis on the oscillating portion of the numeric data, ignoring the induction period data; for instance, one might run an FFT algorithm on the postinduction data to find the fundamental frequency of oscillation (if we were to run the FFT on the entire data set, the induction period would introduce complications in the interpretation of the frequency spectrum).
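The proposed analysis might look roughly like the following Python sketch, which discards the samples before a given boundary and reports the strongest frequency in what remains. Several things here are assumptions made for illustration: the data are synthetic (a ramp standing in for the induction period, then a sine wave), the boundary index is supplied by hand where the Workbench would take it from the episode history, and a direct discrete Fourier transform is used in place of an FFT to keep the example dependency-free (the two give the same spectrum; the FFT is merely faster).

```python
import cmath
import math


def dominant_frequency(samples, dt):
    """Frequency (cycles per unit time) of the largest non-constant DFT component."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant offset
    best_k, best_size = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(centered))
        if abs(coeff) > best_size:
            best_k, best_size = k, abs(coeff)
    return best_k / (n * dt)


dt = 0.1
induction = [0.05 * i for i in range(80)]  # synthetic start-up ramp
oscillation = [math.sin(2 * math.pi * 0.25 * i * dt) for i in range(320)]
signal = induction + oscillation

first_oscillating_sample = 80  # hypothetical: would come from the episode history
print(dominant_frequency(signal[first_oscillating_sample:], dt))  # approximately 0.25
```

Restricting the transform to the postinduction samples is the whole point of the exercise: the episode boundary tells the numerical routine where the periodic data begin.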

To take another example along these lines, it would be desirable to have the program's qualitative analysis checked against the earlier results of graphical analysis, something that the Workbench cannot do at present. Suppose, for instance, that our graphical analysis tells us that the mechanism will reach a stable equilibrium, but the qualitative analysis fails to discern this equilibrium condition in the numerical results. There are several conceivable explanations for this, but one of the most likely is that the simulation simply did not proceed for a long enough time. In this case, the qualitative analysis module might report the discrepancy between its conclusions and the predictions of the graphical analysis module, and might automatically extend or redo the numerical simulation.

Perhaps one of the most important needed additions to the current program is the ability to compare episode histories between simulations. For instance, we might want to vary a parameter (like a rate constant) over a certain range in order to see how this variation affects the qualitative behavior of the mechanism. In this case, it would be desirable to have the Workbench automatically detect changes between episode histories as the parameter is varied. In some cases, commonly observed patterns of change (such as period-doubling for some oscillating systems) could be detected and reported. The ability to compare episode histories would also allow us to compare distinct mechanisms: for instance, we might want to change a mechanism (say, by dropping or adding an elementary step) in order to see how the qualitative behavior of the system changes. We could compare simulations of the original and "perturbed" mechanisms to see how the qualitative behavior of the two simulations differs.

Additions of the kind described in the preceding paragraphs are planned for future versions of the Workbench. But besides these ambitious prospects, there are a large number of relatively mundane improvements that need to be made in the Workbench. These include: a larger repertoire of heuristics for graphical analysis (including mechanisms with a deficiency of one or more); a larger library of recognized patterns for qualitative analysis (including concepts such as "damped oscillations"); and a more user-friendly interface. There is, in short, much work to be done; but progress so far suggests that chemical simulation can indeed extend well beyond the realm of number crunching.

REFERENCES

Byrne, G. (1981). Software for differential systems and applications involving macroscopic kinetics. Computers and Chemistry, 5(4), 151-158.

de Kleer, J. & Brown, J.S. (1985). Qualitative physics based on confluences. In D. Bobrow (Ed.), Qualitative reasoning about physical systems. Cambridge, MA: MIT Press.

Eisenberg, M. (1989). Descriptive simulation: Combining symbolic and numerical methods in the analysis of chemical reaction mechanisms. MIT A.I. Memo No. 1171.

Eisenberg, M. (1990). Combining qualitative and quantitative techniques in the simulation of chemical reaction mechanisms. In W. Webster & R. Uttamsingh (Eds.), AI and simulation: Theory and applications. San Diego, CA: Society for Computer Simulation.

Feinberg, M. (1980). Chemical oscillations, multiple equilibria, and reaction network structure. In W. Stewart, W. Ray, & C. Conley (Eds.), Dynamics and modelling of reactive systems. Orlando, FL: Academic Press.

Isbarn, G., Ederer, H., & Ebert, K. (1981). The thermal decomposition of n-hexane: Kinetics, mechanism, and simulation. In K. Ebert, P. Deuflhard, & W. Jäger (Eds.), Modelling of chemical reaction systems. Berlin: Springer-Verlag.

Laidler, K. (1987). Chemical kinetics. New York: Harper and Row.

Mahan, B. (1972). University chemistry (Second Ed.). Reading, MA: Addison-Wesley.

Nicolis, G. & Prigogine, I. (1977). Self-organization in nonequilibrium systems. New York: Wiley.

Rössler, O. (1979). Chaos and strange attractors in chemical kinetics. In A. Pacault & C. Vidal (Eds.), Synergetics: Far from equilibrium. Berlin: Springer-Verlag.

Shacham, M. (1985). Comparing software for the solution of systems of nonlinear algebraic equations arising in chemical engineering. Computers and Chemical Engineering, 9(2), 103-112.
