the kineticist's workbench: qualitative/quantitative simulation of chemical reaction mechanisms



Expert Systems With Applications, Vol. 3, pp. 367-377, 1991 0957-4174/91 $3.00 + .00 Printed in the USA. © 1991 Pergamon Press plc

The Kineticist's Workbench: Qualitative/Quantitative Simulation of Chemical Reaction Mechanisms

MICHAEL EISENBERG

MIT Laboratory for Computer Science, Cambridge

Abstract--The Kineticist's Workbench is a program combining symbolic and numerical techniques, and intended to assist chemists in simulating, understanding, and analyzing chemical reaction mechanisms. This paper describes the three major modules that currently comprise the Workbench: a numerical simulation module; a graphical analysis module that uses mechanism structure to make predictions about behavior; and a qualitative analysis module that interprets the numerical results of simulation and produces qualitative descriptions of those results.

1. INTRODUCTION

THE KINETICIST'S WORKBENCH is a program currently being developed by the author, and intended to assist chemists in simulating, understanding, and analyzing complex chemical reaction mechanisms. To this end, the Workbench combines a variety of numerical and symbolic techniques. A chemist may use the program to simulate some given mechanism numerically; she may use it to perform certain types of analysis on the mechanism a priori, in the absence of simulation; or she may use the program to interpret numerical results qualitatively.

This paper describes the current state of the Kineticist's Workbench, and its capabilities. The remainder of this introductory section is a (very highly condensed) introduction to the domain of chemical kinetics, and an explanation of why a program like the Workbench may prove useful to chemists. Section 2 provides an overview of the Workbench program and its several component modules; the following three sections describe these modules in greater detail. Section 6 illustrates the use of the Workbench with an example, and the final section discusses ongoing and future work with the program.

1.1. Chemical Kinetics: An Introduction

In their commonly encountered "textbook" form, chemical reactions appear to be relatively simple discrete events. For instance, the decomposition of dinitrogen pentoxide is typically expressed by the following notation:

$$2\,\mathrm{N_2O_5} \rightarrow 4\,\mathrm{NO_2} + \mathrm{O_2} \tag{1}$$

In this form, the reaction appears to proceed by a collision of two N2O5 molecules to yield five product molecules; but the geometry of the reactant molecules, among other considerations, suggests that this simple picture is highly improbable. In point of fact, a more accurate picture of the decomposition process is given by the following set of four reactions:

$$\mathrm{N_2O_5} \rightleftharpoons \mathrm{NO_2} + \mathrm{NO_3} \rightarrow \mathrm{NO_2} + \mathrm{NO} + \mathrm{O_2}$$

$$\mathrm{NO} + \mathrm{N_2O_5} \rightarrow 3\,\mathrm{NO_2} \tag{2}$$

The four reactions depicted in (2) are known as elementary steps--unlike (1), they are intended to denote discrete collision or decomposition events.¹ Together these reactions constitute a mechanism for the process summarized by (1); and they specify a set of differential equations that describe the rates of change of concentration for all species:²

$$\frac{d[\mathrm{N_2O_5}]}{dt} = -k_1[\mathrm{N_2O_5}] + k_2[\mathrm{NO_2}][\mathrm{NO_3}] - k_4[\mathrm{NO}][\mathrm{N_2O_5}] \tag{3.1}$$

Acknowledgments--This work has benefitted from the guidance and support of many people. Special thanks go to Harold Abelson, Gerald Sussman, Irving Epstein, and Ken Yip.

Requests for reprints should be sent to: Michael Eisenberg, MIT Lab for Computer Science, 545 Technology Square, Rm 429, Cambridge, MA 02139.

1 The notation for mechanisms used in this paper is a bit more compact than that usually found in the chemical literature. The briefer notation used here is somewhat more helpful for understanding the graphical analysis techniques to be described later. Occasionally, for the sake of clarity, the typical "expanded" notation will be employed.

2 The values k1, k2, and so on are known as rate constants; briefly, these are proportionality constants reflecting the speed of each elementary reaction. Also, the units of concentration, time, and rate constants are chosen to be internally consistent; typical choices of units are moles/liter for concentration, seconds for time, seconds⁻¹ for first-order rate constants like k1, and so on.


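$$\frac{d[\mathrm{NO_2}]}{dt} = k_1[\mathrm{N_2O_5}] - k_2[\mathrm{NO_2}][\mathrm{NO_3}] + 3k_4[\mathrm{NO}][\mathrm{N_2O_5}] \tag{3.2}$$

$$\frac{d[\mathrm{NO_3}]}{dt} = k_1[\mathrm{N_2O_5}] - k_2[\mathrm{NO_2}][\mathrm{NO_3}] - k_3[\mathrm{NO_2}][\mathrm{NO_3}] \tag{3.3}$$

$$\frac{d[\mathrm{NO}]}{dt} = k_3[\mathrm{NO_2}][\mathrm{NO_3}] - k_4[\mathrm{NO}][\mathrm{N_2O_5}] \tag{3.4}$$

$$\frac{d[\mathrm{O_2}]}{dt} = k_3[\mathrm{NO_2}][\mathrm{NO_3}] \tag{3.5}$$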

A mechanism such as (2) is inevitably a hypothesis: it represents the chemist's current working model of how some "real-world" reactive system behaves. Typically, a kineticist interested in a process such as the decomposition of N2O5 will first propose a mechanism for that process; derive a set of differential equations (such as (3.1)-(3.5)) from the proposed mechanism; and finally attempt to corroborate (or disprove) the hypothetical mechanism by comparing the quantitative predictions of the model with experimental results from the laboratory.
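The step from a proposed mechanism to its differential equations is mechanical under mass-action kinetics, and can be sketched in a few lines. The following Python fragment is only an illustration and is not the Workbench's own (Scheme) code; the dictionary-based step representation and the names `MECHANISM_2`, `rate_expression`, `derive_odes`, and `format_ode` are assumptions made for this sketch.

```python
from collections import defaultdict

# Mechanism (2) as elementary steps: (reactants, products, rate-constant name).
# Each side maps a species name to its stoichiometric coefficient.
# This representation is hypothetical, chosen only for the illustration.
MECHANISM_2 = [
    ({"N2O5": 1}, {"NO2": 1, "NO3": 1}, "k1"),
    ({"NO2": 1, "NO3": 1}, {"N2O5": 1}, "k2"),
    ({"NO2": 1, "NO3": 1}, {"NO2": 1, "NO": 1, "O2": 1}, "k3"),
    ({"NO": 1, "N2O5": 1}, {"NO2": 3}, "k4"),
]


def rate_expression(reactants, k):
    """Mass-action rate of one step as text, e.g. 'k2[NO2][NO3]'."""
    return k + "".join(f"[{species}]" * coeff for species, coeff in reactants.items())


def derive_odes(mechanism):
    """Return {species: [(net stoichiometric change, rate expression), ...]}."""
    odes = defaultdict(list)
    for reactants, products, k in mechanism:
        rate = rate_expression(reactants, k)
        for species in sorted(set(reactants) | set(products)):
            net = products.get(species, 0) - reactants.get(species, 0)
            if net != 0:  # a species on both sides with equal coefficients drops out
                odes[species].append((net, rate))
    return dict(odes)


def format_ode(species, terms):
    """Render one equation in the style of (3.1)-(3.5)."""
    parts = []
    for net, rate in terms:
        sign = "-" if net < 0 else "+"
        magnitude = "" if abs(net) == 1 else str(abs(net))
        parts.append(f"{sign} {magnitude}{rate}")
    return f"d[{species}]/dt = " + " ".join(parts)


odes = derive_odes(MECHANISM_2)
for name in ["N2O5", "NO2", "NO3", "NO", "O2"]:
    print(format_ode(name, odes[name]))
```

Note that NO2 appears on both sides of the third step with the same coefficient, so that step contributes no term to the NO2 equation, in agreement with (3.2).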

The paragraphs above represent an extremely condensed and skeletal treatment of the fundamentals of chemical kinetics (fuller treatments may be found in Laidler (1987) and Mahan (1972)), but they should suggest to the reader the essential nature of much of the kineticist's work. An important point to note is that chemical mechanisms (and the systems they model) are often extremely complex, and their behavior may be hard to predict. This is not too surprising when one considers that even the relatively simple mechanism (2) is represented mathematically by the set of nonlinear ordinary differential equations (3.1)-(3.5); and, as is well known, systems of nonlinear ODEs can give rise to a bewildering range of behavior. Over the last few decades, there has been a burgeoning interest in chemical systems exhibiting strange phenomena such as bistability, oscillations, "chemical chaos," and so forth (Rössler, 1979). The difficulty of the kineticist's job is further exacerbated by the fact that, for many mechanisms, the number of elementary reactions may be large--sometimes in the hundreds.³

1.2. Qualitative and Quantitative Understanding of Mechanisms

Suppose that a chemist is presented with a mechanism consisting of a few dozen reactions, and represented by a large set of differential equations such as (3) above. How can the chemist begin to get some understanding of the behavior of this model?

3 See, for instance, Isbarn, Ederer, & Ebert (1981): their mechanism for the decomposition of n-hexane involves 240 elementary steps. It should also be mentioned that there are difficulties arising from inhomogeneous systems: these must be described with partial differential equations rather than ordinary differential equations. The Kineticist's Workbench, and this paper, deal with homogeneous systems only.

The first and simplest answer is to perform a numerical simulation of the model on a computer. This is not quite as straightforward as it may sound--chemical systems are frequently "stiff" and hence may pose special problems for numerical integration--but by now a variety of chemical simulation programs exist for this purpose (Byrne, 1981; Shacham, 1985). The output of such a program will generally be a graph (or table) of numbers representing the concentrations of species of interest over time.

Of course, once the integration routine has produced this huge set of numbers, the chemist's job is not over. The numbers have to be scanned for interesting features: Perhaps certain species are at a constant (or near-constant) concentration throughout much of the simulation; or there may be rapid jumps in concentrations, or sustained oscillations. It may be that much of the interesting behavior of the system occurred during a relatively brief portion of the simulation.

The chemist's work cannot even have been said to start with the simulation: a certain amount of interpretation inevitably precedes the numerical work. The chemist may use certain graphical techniques, or heuristics, to anticipate the behavior of the mechanism, or may look for possible simplifications to the mechanism. (It may be, for instance, that some elementary steps can be dropped from the mechanism while leaving its probable behavior unchanged.) And of course, the chemist's work will typically require not just one numerical simulation but many: he may want to vary some parameter (e.g., the rate constant of a particular elementary step) to see how that parameter affects the qualitative behavior of the model.

In general, then, the work of understanding how chemical reactions proceed consists of a mixture of "styles" of reasoning, involving both quantitative calculations and qualitative judgment in varying degrees. Historically, only the quantitative aspect of the work has been delegated to computers; but some of the qualitative reasoning is itself routine and can be profitably automated. The Kineticist's Workbench program is designed to realize this goal by augmenting the standard techniques of numerical simulation with methods for predicting and interpreting the results of simulations. In this way, the Workbench can free the scientist from the more mundane tasks of simulation (whether those tasks happen to be quantitative or not). With this motivation in mind, we now turn to a description of the program itself.

2. THE KINETICIST'S WORKBENCH: AN OVERVIEW

The Kineticist's Workbench currently consists of three main modules:


• A numerical integration module. This portion of the Workbench program is responsible for performing numerical integration and graphing numerical results.

• A graphical analysis module. This portion of the program examines mechanisms symbolically and makes predictions about mechanism behavior based solely on the components of the reactions.

• A qualitative analysis module. This portion of the program interprets the results of numerical simulations and attempts to find features of interest (such as steady states, rapid changes in concentrations, or oscillations) in those results.

These three modules will be described in the following sections⁴; but before proceeding with that discussion, it is worth mentioning several of the design criteria underlying the construction of the Workbench as a whole.

First, as noted, the Workbench employs a wide range of techniques for working with reaction mechanisms. It is an important feature of the program that these techniques not merely coexist, but cooperate--that they are capable of sharing their results where appropriate. For instance, the numerical simulation routines can use the results of graphical analysis to decide when to halt a simulation; while the qualitative analysis routines use numerical results as their input. The point of having multiple modules, then, is not merely to provide a battery of results for the user, but to combine those results in meaningful ways.

A second design principle can be expressed as something that the Workbench is not: Namely, it is not an attempt to develop a purely qualitative formalism for studying chemical mechanisms. Numerical results are the fundamental data for the Workbench: they are predicted by graphical analysis and interpreted by qualitative analysis, but no attempt is made to circumvent numerical integration altogether. In this sense, the Workbench may be contrasted with attempts to develop formalisms for qualitative physics (such as an algebra of confluences (de Kleer & Brown, 1984)). Work in qualitative physics is both fascinating and useful, but its application to complex chemical systems is still unclear, since the behavior of these systems often varies tremendously based on the fine tuning of numerical parameters; an exclusively qualitative vocabulary for simulating these nonlinear systems might prove so wordy that its usual advantages over numerical simulation (compactness, cognitive perspicuity) could disappear altogether.

Finally, the Workbench uses certain kinds of domain-related knowledge in its analysis of mechanisms, and likewise uses domain-related concepts in expressing its results. For example, some of the graphical analysis routines employ heuristics derived from standard assumptions of reactive chemistry (e.g., that reactions obey "mass action" kinetics (Feinberg, 1980)); while the qualitative results are expressed in terms ("steady state," "equilibrium," and so on) that are meaningful to the working chemist. That is to say, the Workbench is intended as a tool for chemists, and as such it should not require its users to develop entirely new formalisms for supplying input to the program or reading its output.

4 A fourth module, for spotting "patterns" such as fast equilibria in mechanisms, is currently under construction and will not be described in this paper.

Before going on to examine the program in detail, some words on the implementation are in order. The Kineticist's Workbench is written in Scheme and currently runs on a Hewlett-Packard Series 9000/350 workstation (based on a 68030 processor). The program makes use of the IMSL FORTRAN library for some numerical operations, and the Scheme graphics operations are compatible with the X Window System. Because almost all of the Workbench is embedded in a Scheme system that freely mixes interpreted and compiled code, it is easy to write new procedures on the fly while using the program and to examine and test the code interactively.

3. NUMERICAL INTEGRATION

The Kineticist's Workbench program contains an integration module that permits the user to simulate chemical mechanisms numerically. Basically, the user provides the program with relevant input values--namely, the mechanisms to be simulated (written as a Scheme list which will be described shortly); starting concentrations of all species; and some simulation parameters (including the integration algorithm to be used). With this information, the program derives the appropriate differential equations and then simulates the mechanism. The numerical results may be presented in any of several ways: They may be printed or graphed on the screen, or stored in a file whose name is supplied by the user.

Currently the program supports two integration algorithms: a fourth-order Runge-Kutta algorithm and an adaptive-step Gear algorithm. The latter is a predictor-corrector method appropriate for particularly large or stiff systems of equations, while the former is easier to use for simple systems.⁵ In either case, the user must also supply additional parameters: starting and ending times for the simulation, and a time-step value (in the case of the Gear algorithm, this value is the initial time step to use).

5 The Gear integrator is part of the IMSL Mathematical Library (supplied by IMSL, Inc.) and is written in FORTRAN. When a Gear integration is requested, the Workbench automatically constructs and compiles the appropriate FORTRAN code while linking in the IMSL Library; runs the code; and then retrieves the numbers produced by running that code.

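As a brief example of how this Workbench module operates, consider the following tiny sample mechanism:

$$\mathrm{A} \rightarrow \mathrm{B} + \mathrm{C} \qquad k_1 = 1$$

$$\mathrm{B} + \mathrm{C} \rightarrow \mathrm{A} \qquad k_2 = 2$$

$$\mathrm{B} \rightarrow 2\,\mathrm{D} \qquad k_3 = 1.5$$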

The user can supply this mechanism as input to the Workbench in several ways; the most compact method is to define a new mechanism using the Scheme interpreter, as follows:

    (define sample-mechanism
      (make-mechanism-from-step-list
       '(                                ; list of steps
         (((a 1)) ((b 1) (c 1)) 1)
         (((b 1) (c 1)) ((a 1)) 2)
         (((b 1)) ((d 2)) 1.5))))

The argument to the procedure make-mechanism-from-step-list is a list composed of three "single-step elements," each of which is itself a list consisting of reactants, products, and rate constant for the given step. The mechanism structure created by this expression actually has a few additional slots--in this case, all empty--for additional information (species whose concentration is constant, species that are supplied from an external source, and so forth).

Now the user must specify a few integration parameters; an easy method is to take a parameter list provided with one of the sample files in the program, and edit that list to contain the appropriate values:

    (define sample-parameter-list
      '((starting-dt 0.05)
        (start-time 0.)
        (end-time 20.)
        (actual-starting-concs
         ((a 2.) (b 0.) (c 0.) (d 0.)))
        (integration-method runge-kutta)
        (focus-species (a b d))
        (steps-per-display 20)))

The only elements in this list that perhaps need explication are the final two. The idea here is that we wish to display the concentration values of species A, B, and D every 20 time steps (corresponding to one second of simulated time).

Finally, by evaluating the following expression:

(do-simple-mechanism-run sample-mechanism sample-parameter-list)

the user can see the result of simulating the mechanism numerically (the first few values are shown below).⁶

    Species: d    Concentration: 0.
    Species: b    Concentration: 0.
    Species: a    Concentration: 2.

    Species: d    Concentration: 1.0829351858700385
    Species: b    Concentration: .38061488796522547
    Species: a    Concentration: 1.0779175190997552
    Current time: 1.

    Species: d    Concentration: 1.97060533068859
    Species: b    Concentration: .22755722640644568
    Species: a    Concentration: .7871401082492592
    Current time: 2.

There are some other features of the numerical integration module that relate to its interaction with other portions of the program; these will be discussed later in this paper.
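For readers without access to the Workbench, the sample run can be approximated with a small stand-alone script. The sketch below is written in Python rather than Scheme and is not the Workbench's implementation: it assumes standard mass-action rate laws and the classical fourth-order Runge-Kutta scheme, and every name in it (`STEPS`, `derivatives`, `rk4_step`, `simulate`) is hypothetical.

```python
# Steps as (reactants, products, rate constant), mirroring the sample mechanism.
STEPS = [
    ({"a": 1}, {"b": 1, "c": 1}, 1.0),
    ({"b": 1, "c": 1}, {"a": 1}, 2.0),
    ({"b": 1}, {"d": 2}, 1.5),
]
SPECIES = ["a", "b", "c", "d"]


def derivatives(conc):
    """Mass-action right-hand side: d[X]/dt for every species."""
    dcdt = {s: 0.0 for s in SPECIES}
    for reactants, products, k in STEPS:
        rate = k
        for species, coeff in reactants.items():
            rate *= conc[species] ** coeff
        for species, coeff in reactants.items():
            dcdt[species] -= coeff * rate
        for species, coeff in products.items():
            dcdt[species] += coeff * rate
    return dcdt


def rk4_step(conc, dt):
    """Advance the concentrations by one classical Runge-Kutta step."""
    def shifted(slope, h):
        return {s: conc[s] + h * slope[s] for s in SPECIES}

    k1 = derivatives(conc)
    k2 = derivatives(shifted(k1, dt / 2))
    k3 = derivatives(shifted(k2, dt / 2))
    k4 = derivatives(shifted(k3, dt))
    return {
        s: conc[s] + dt / 6 * (k1[s] + 2 * k2[s] + 2 * k3[s] + k4[s])
        for s in SPECIES
    }


def simulate(start, dt, n_steps):
    """Apply n_steps fixed-size steps and return the final concentrations."""
    conc = dict(start)
    for _ in range(n_steps):
        conc = rk4_step(conc, dt)
    return conc


# 20 steps of 0.05 correspond to one second of simulated time.
state = simulate({"a": 2.0, "b": 0.0, "c": 0.0, "d": 0.0}, dt=0.05, n_steps=20)
for name in ("d", "b", "a"):
    print(f"Species: {name}  Concentration: {state[name]}")
```

A useful sanity check on any such run is that the quantities [A] + [C] and [A] + [B] + [D]/2 stay constant, since every step of the sample mechanism preserves them; the concentrations printed in the listing above satisfy the second relation.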

4. GRAPHICAL ANALYSIS OF MECHANISMS

Often there is a good deal of useful information to be gleaned from a mechanism even before that mechanism is simulated numerically. Consider, as an example, the following very simple pair of elementary reactions:

$$\mathrm{A} \rightleftharpoons \mathrm{B} \tag{4}$$

This graph represents two unimolecular elementary steps: a reaction from A to B and a reverse reaction from B to A. By inspection, we can tell that this system must approach equilibrium at nonzero concentrations of A and B (assuming that we start with a nonzero concentration of either species, and that the system follows typical mass action kinetics). We did not need to run a simulation of the system to reach this conclusion: Rather, we were able to predict this feature of the system's behavior from the structure of the reactions themselves (along with the domain knowledge of kinetics that interprets mechanism graphs such as (4)). Indeed, we did not even need the exact rate constants of the reactions to know that the system would reach equilibrium; we merely used the assumption that those rate constants were nonzero. The equilibrium concentrations of A and B in (4) do, of course, depend on the rates of the reactions, but the fact that the system reaches equilibrium at all is derived from mechanism structure alone.
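The claim can be checked with a short worked calculation. The symbols $k_f$ and $k_r$ (forward and reverse rate constants) and $T$ (the conserved total concentration) are introduced here for illustration and do not appear in the original. Under mass action kinetics,

$$\frac{d[\mathrm{A}]}{dt} = -k_f[\mathrm{A}] + k_r[\mathrm{B}], \qquad [\mathrm{A}] + [\mathrm{B}] = T.$$

Setting the derivative to zero and substituting $[\mathrm{B}] = T - [\mathrm{A}]$ gives

$$[\mathrm{A}]_{eq} = \frac{k_r}{k_f + k_r}\,T, \qquad [\mathrm{B}]_{eq} = \frac{k_f}{k_f + k_r}\,T.$$

Both values are strictly positive whenever $k_f$, $k_r$, and $T$ are positive: the existence of a nonzero equilibrium follows from structure alone, while its location depends on the rate constants.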

6 If the user wishes to see the concentration values of certain species graphed on the screen, some additional parameters must be specified dictating (among other things) which species to graph, the ordinate and abscissa ranges of the various graphs, and so forth.


There are actually a wide variety of such structural insights that can be brought to bear upon mechanisms, and a wide variety of mechanisms to which at least some structural analysis may be applied. Another example may convey some flavor of the sort of reasoning involved. Consider the following mechanism:

$$\mathrm{A} \rightleftharpoons \mathrm{B} \rightleftharpoons \text{External World}$$

$$\mathrm{B} + \mathrm{D} \rightarrow \mathrm{E} \rightleftharpoons \mathrm{F} \tag{5}$$

The only new wrinkle in the notation here is represented by the graph vertex labeled "External World"; this indicates that there is both an external source and sink for species B. Without going into painstaking detail, we can again inspect the structure of this mechanism and deduce, among other things, that species D will eventually approach a concentration of zero; that species E and F will approach equilibrium concentrations that depend on (among other things) the initial concentration of D; and that A and B reach equilibrium concentrations that are in fact independent of their initial concentrations and depend only on the rate constants of the four reactions shown in the upper row of (5). Again, we have arrived at several potentially useful conclusions without actually bothering to run a simulation of the mechanism; indeed, in this particular case we could deduce the (asymptotically approached) equilibrium concentrations of A and B by a variety of means other than simulation.

4.1. Consistent Zero Sets of Species

The Workbench performs graphical analysis of mechanisms using notions such as "necessary nonzero species sets" and "necessarily declining sets of species." These and several other concepts built into the program are described in greater detail in Eisenberg (1990). For the purpose of illustration, this paper will discuss two concepts that come up in graphical analysis: consistent zero sets of species, and the zero deficiency theorem.

The notion of consistent zero sets comes up when we ask a question of the form: "Given that a mechanism approaches a steady state in which the concentration of some species X is zero, what other species in the mechanism must likewise be at zero concentration in that steady state?" To take an example, consider the following mechanism:

$$\mathrm{A} + \mathrm{B} \rightleftharpoons \mathrm{C} + \mathrm{D} \rightleftharpoons \mathrm{E}$$

Here, if A has zero concentration at some steady state, then so must E and at least one of C and D; but species B (and at most one of C and D) might well have nonzero concentration. We could conclude from this reasoning that the sets [A D E] and [A C E] (and supersets of these) are both consistent zero sets for the mechanism.
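One way to make this reasoning concrete is the following rule of thumb, which is an interpretation offered here and not necessarily the rule the Workbench applies: a step whose reactants are all at nonzero concentration keeps running, so none of its products can remain at zero. The Python sketch below searches by brute force for the smallest sets satisfying that rule. The step list assumes that both arrows in the mechanism above are reversible, and the function names are hypothetical.

```python
from itertools import combinations

# Assumed reading of the mechanism: A + B <=> C + D <=> E,
# written as four one-way steps (reactant species, product species).
STEPS = [
    ({"A", "B"}, {"C", "D"}),
    ({"C", "D"}, {"A", "B"}),
    ({"C", "D"}, {"E"}),
    ({"E"}, {"C", "D"}),
]


def is_consistent(zero_set, steps):
    """True if no step with all reactants nonzero feeds a zero species."""
    for reactants, products in steps:
        if not (reactants & zero_set) and (products & zero_set):
            return False
    return True


def minimal_zero_sets(steps, must_contain):
    """Smallest consistent zero sets that include every species in must_contain."""
    species = sorted(set().union(*(r | p for r, p in steps)))
    others = [s for s in species if s not in must_contain]
    found = []
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            candidate = set(must_contain) | set(extra)
            if any(prev <= candidate for prev in found):
                continue  # already covered by a smaller consistent set
            if is_consistent(candidate, steps):
                found.append(candidate)
    return found


print(minimal_zero_sets(STEPS, {"A"}))
```

Under these assumptions the search returns the two sets named in the text, [A C E] and [A D E]. The enumeration is exponential in the number of species, so it is suitable only for small illustrative mechanisms.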

4.2. The Zero Deficiency Theorem

The "zero deficiency theorem," derived by M. Feinberg and colleagues at the University of Rochester (Feinberg, 1980) is a useful and often powerful tool for graphical analysis of mechanisms. Space does not permit a complete explanation of the theorem, but essentially the purpose of the deficiency concept is to identify a certain class of mechanisms for which it is possible to conclude (on the basis of structural information alone) whether the mechanism has a locally stable equilibrium state in which the concentrations of all species are nonzero.

To give a telegraphic treatment of the theorem here: first, we assume that a mechanism is represented by the kind of graph structure used throughout this paper. As an example, we can use the mechanism (5) introduced earlier.

$$\mathrm{A} \rightleftharpoons \mathrm{B} \rightleftharpoons \text{External World}$$

$$\mathrm{B} + \mathrm{D} \rightarrow \mathrm{E} \rightleftharpoons \mathrm{F} \tag{5}$$

We now introduce the following concepts:

1. The number of distinct vertices--that is, left or right sides of reactions--in the graph, which will be denoted as n. (For our example, this number is 6.)

2. The number of connected components (portions linked by arrows) in the graph, denoted by l. (For our example, this number is 2.)

3. Whether the mechanism has the feature that for every pair of vertices v1, v2, if there is some directed path of reaction arrows leading from v1 to v2 then there is some pathway back from v2 to v1. This feature is denoted weak reversibility. (Our example is not weakly reversible, since there is a path from the vertex B + D to the vertex E, but no path from E to B + D.)

4. By treating the reactions in the mechanism as vectors in a particular vector space (the space in question is R^m, where m is the number of distinct species in the mechanism), we compute the dimension s of the space spanned by the reaction vectors. (For our example, this number turns out to be 4.)

5. Finally, we find the value of n - s - l, and call this value the deficiency. If this number is zero, then we can deduce the following interesting information: namely, that if the mechanism is weakly reversible then it has a unique locally stable equilibrium in which all species are at nonzero concentration. If, on the other hand, the zero-deficiency mechanism is not weakly reversible, then there exists no equilibrium state in which all species are at nonzero concentration. (In our example, the mechanism has a deficiency of zero but is not weakly reversible; therefore it does not have a stable equilibrium in which all species have nonzero concentration. Indeed, as we have already concluded, the concentration of species D must approach zero over time.)
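The five steps above can be sketched as a short computation. The Python code below is an illustrative reconstruction under stated assumptions, not the Workbench's algorithm: complexes are encoded as tuples of (species, coefficient) pairs, the empty tuple stands for the "External World" vertex, and the function names are hypothetical. Exact rational arithmetic is used for the rank so that no floating-point tolerance is needed.

```python
from fractions import Fraction

# Mechanism (5) as one-way steps between complexes (graph vertices).
A, B, E, F = (("A", 1),), (("B", 1),), (("E", 1),), (("F", 1),)
BD = (("B", 1), ("D", 1))
WORLD = ()  # the "External World" vertex contributes no species
STEPS = [(A, B), (B, A), (B, WORLD), (WORLD, B), (BD, E), (E, F), (F, E)]


def reachable(start, steps):
    """All complexes reachable from start along directed reaction arrows."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for src, dst in steps:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def rank(rows):
    """Rank of a list of rational row vectors, by Gaussian elimination."""
    rows = [list(r) for r in rows]
    count = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(count, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[count], rows[pivot] = rows[pivot], rows[count]
        for i in range(count + 1, len(rows)):
            factor = rows[i][col] / rows[count][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[count])]
        count += 1
    return count


def analyze(steps):
    complexes = sorted({c for step in steps for c in step})
    species = sorted({name for c in complexes for name, _ in c})
    # Linkage classes: connected components with arrows treated as undirected.
    undirected = steps + [(dst, src) for src, dst in steps]
    classes = {frozenset(reachable(c, undirected)) for c in complexes}
    # Reaction vectors: products minus reactants, one coordinate per species.
    vectors = []
    for src, dst in steps:
        vec = {name: Fraction(0) for name in species}
        for name, coeff in src:
            vec[name] -= coeff
        for name, coeff in dst:
            vec[name] += coeff
        vectors.append([vec[name] for name in species])
    n_complexes, n_classes, dim = len(complexes), len(classes), rank(vectors)
    # Weakly reversible iff every arrow can be "undone" by some directed path.
    weakly_reversible = all(src in reachable(dst, steps) for src, dst in steps)
    return {"n": n_complexes, "l": n_classes, "s": dim,
            "deficiency": n_complexes - dim - n_classes,
            "weakly_reversible": weakly_reversible}


print(analyze(STEPS))
```

For mechanism (5) this reports n = 6, l = 2, s = 4, a deficiency of zero, and no weak reversibility, matching the values quoted in the list above.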


The reader may be wondering, based on the example shown, whether the zero deficiency theorem involves a great deal of mathematical baggage to achieve an intuitively obvious result. In point of fact, the theorem is often applicable in far more subtle cases. Consider the following example (Feinberg, 1980):

$$\mathrm{A} + \mathrm{B} \leftarrow \mathrm{D} \rightleftharpoons 2\,\mathrm{C}$$

$$\mathrm{B} + \mathrm{C} \rightleftharpoons \mathrm{E} \rightleftharpoons \mathrm{A} + \mathrm{D} \tag{6}$$

This is a zero-deficiency mechanism whose graph is not weakly reversible; hence it cannot reach equilibrium with all species at nonzero concentrations. This conclusion is far from being intuitively obvious, but again we did not require simulation to arrive at it. As an aside, it is worth noting that we can now profitably use the earlier concept of consistent zero sets to analyze this mechanism: in particular, now that we know that any steady state for the mechanism will include some species with zero concentration, we can ask which sets of species can possibly have zero concentration at a steady state. This question is pursued in the following subsection.

4.3. An Example of Graphical Analysis

As a brief example of how the Workbench can be used to perform graphical analysis, we can examine an excerpt of the program's treatment of mechanism (6). First we create the mechanism (as shown in Section 3), and then evaluate the following expression:

    (analyze-mechanism-graphically sample-mechanism-2)

Some of the program's printed output is shown below:

    Number of complexes: 6
    Number of linkage classes: 2
    Dimension of stoichiometric subspace: 4
    Deficiency: 0
    Reversibility: none

    Rule number 2.3 -- Deficiency Theorem part 2 is firing

    If the deficiency of the mechanism is 0, and the mechanism is not weakly reversible, then the mechanism cannot reach equilibrium at positive concentrations of all species.

    Rule number 4.3 -- Looking for asymptotic zero concentrations is firing

    The mechanism cannot reach equilibrium with all nonzero concentrations. It also does not contain any *obvious* declining species or sets. We now look for those sets of species that might asymptotically have zero concentration.

    Possible sets of zero-concentration species: ((d e c))
    Subtracting the following zero set: (d e c)

    This mechanism contains no elementary steps.

In summary: the program first concludes that the mechanism will not reach equilibrium with positive concentrations of all species, and then attempts to subtract from the mechanism a consistent set of zero-concentration species. It finds that there is in fact only one consistent set--namely, the set [C D E]. The program now subtracts from mechanism (6) all those steps in which these species are reactants, in the hopes of deriving a simpler mechanism to study; after doing so, the program notes that there are no remaining steps from mechanism (6) that can possibly proceed once the three species approach zero concentration.
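The subtraction step is simple enough to sketch directly. The Python fragment below is an illustration only: it assumes that mechanism (6) consists of the seven one-way steps listed in the comment (an interpretation of the graph, consistent with the program output above), and the names `STEPS_6` and `subtract_zero_set` are hypothetical.

```python
# Assumed reading of mechanism (6): A + B <- D <=> 2C and B + C <=> E <=> A + D,
# written as one-way steps (reactant species, product species).
STEPS_6 = [
    ({"D"}, {"A", "B"}),
    ({"D"}, {"C"}),
    ({"C"}, {"D"}),
    ({"B", "C"}, {"E"}),
    ({"E"}, {"B", "C"}),
    ({"E"}, {"A", "D"}),
    ({"A", "D"}, {"E"}),
]


def subtract_zero_set(steps, zero_set):
    """Keep only steps that can still proceed, i.e. those with no
    reactant held at zero concentration."""
    return [(r, p) for r, p in steps if not (r & zero_set)]


remaining = subtract_zero_set(STEPS_6, {"C", "D", "E"})
print(remaining if remaining else "This mechanism contains no elementary steps.")
```

Every step in this reading has at least one of C, D, or E among its reactants, which is why nothing is left after the subtraction.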

5. QUALITATIVE ANALYSIS OF SIMULATION RESULTS

Once a simulation has been performed, the Workbench can (if the user desires) examine the numerical results and attempt to produce a qualitative summary of the events of the simulation. The program first groups the simulation's results into discrete periods of time corresponding to distinguishable "operational states" of the mechanism, and then looks for patterns in the sequence of time periods that it has identified.

5.1. Grouping Results into "Episodes"

The central concept in the Workbench's strategy for producing qualitative summaries is that of the episode. The idea is best illustrated through an example (see footnote 7). Consider the following simple mechanism (here expressed in an "expanded" notation):

\[ A \rightarrow B \tag{7.1} \]

\[ B \rightarrow C \tag{7.2} \]

\[ C \rightarrow B \tag{7.3} \]

The differential equations specifying the behavior of this system are as follows:

\[ \frac{d[A]}{dt} = -k_1[A] \tag{8.1} \]

\[ \frac{d[B]}{dt} = k_1[A] - k_2[B] + k_3[C] \tag{8.2} \]

\[ \frac{d[C]}{dt} = k_2[B] - k_3[C] \tag{8.3} \]

Examination of equations (8.1)-(8.3) reveals that each separate term in each equation corresponds to the change in some species' concentration due to the contribution of a particular reaction. For instance, the two terms on the right of (8.3) correspond to the increase in the concentration of species C due to reaction B → C, and the decrease in the concentration of C due to the reaction C → B. This correspondence between differential equation terms and contributions from individual elementary steps is a staple of mass action kinetics; in general, the differential equation for a given species X will have as many terms as there are reactions in which X occurs either as a reactant or product.

Footnote 7. More detail is given in Eisenberg (1989).

At any point during a simulation, then, we can examine the concentrations of all species and plug those concentration values into all the terms of the differential equations governing the system. For each species, we can derive an ordering on the terms in that species' differential equation: the ordering corresponds to increasing absolute value of each term. To continue our example, suppose that the values of k1, k2, and k3 are 2, 0.5, and 1, respectively; and suppose that at a given moment in time, the concentrations of A, B, and C are 3, 1, and 0.6. In this case, the three terms in the differential equation for B have the values 6, -0.5, and 0.6, respectively; while the two terms in the equation for C have the values 0.5 and -0.6. Thus, ordering the terms by absolute value, we would say that the term corresponding to reaction (7.1) is largest for species B, while the term for reaction (7.3) is largest for species C. An episode corresponds to a period of time during the simulation during which these reaction orderings remain stable for each species. If at some point during the simulation, the concentrations of A, B, and C change so that (say) the contribution of reaction (7.2) becomes larger than that for (7.3) in determining the change in the concentration of C, then we would say that a new episode of the simulation has begun; and this new episode remains in effect until the reaction orderings again change for either B or C.

To sum up, then: At any given time during the simulation, we look to see how each reaction contributes to the concentration change of each species; and for each species, we order the contributing reactions according to their importance. An episode corresponds to a period of time during which these orderings remain stable.
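The ordering step for mechanism (7.1)-(7.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Workbench itself is written in Scheme, and the function names (`terms`, `orderings`, `episode_starts`) and the second concentration state are hypothetical, not taken from the program.

```python
# Rate constants from the worked example in the text.
K = {"k1": 2.0, "k2": 0.5, "k3": 1.0}


def terms(conc):
    """Signed contribution of each reaction to each species, per (8.1)-(8.3)."""
    a, b, c = conc["A"], conc["B"], conc["C"]
    return {
        "A": {"7.1": -K["k1"] * a},
        "B": {"7.1": K["k1"] * a, "7.2": -K["k2"] * b, "7.3": K["k3"] * c},
        "C": {"7.2": K["k2"] * b, "7.3": -K["k3"] * c},
    }


def orderings(conc):
    """For each species, reactions sorted by increasing absolute contribution."""
    return {
        species: tuple(sorted(contrib, key=lambda r: abs(contrib[r])))
        for species, contrib in terms(conc).items()
    }


def episode_starts(states):
    """Indices of the states at which some species' ordering changes."""
    starts = [0]
    previous = orderings(states[0])
    for i, conc in enumerate(states[1:], start=1):
        current = orderings(conc)
        if current != previous:
            starts.append(i)
            previous = current
    return starts


# Concentrations from the text: A = 3, B = 1, C = 0.6.
example = {"A": 3.0, "B": 1.0, "C": 0.6}
example_terms = terms(example)
example_orderings = orderings(example)

# Hypothetical later state: C has fallen, so (7.2) now outweighs (7.3) for C.
later = {"A": 3.0, "B": 1.0, "C": 0.4}
boundaries = episode_starts([example, example, later])
```

Here the last entry of each ordering is the dominant reaction for that species, and `boundaries` marks the third state as the start of a new episode because the orderings for B and C both change there.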

5.2. Analyzing Episode Sequences

Having seen how episodes are defined, we can now examine how the Workbench produces qualitative summaries of behavior. First, the program identifies the episode boundaries within the numerical results; it then looks for specific features (such as rapid concentration growth or decline for some species) within each episode. For instance, the program may find that the concentration of species X increased by more than 50 percent during a particular episode, and that no decline in X was recorded for any individual time step during that episode. In this case, the Workbench would record that X experienced a large and steady (monotonic) increase during that episode, and would attach a feature descriptor of this sort to the episode.
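A minimal Python sketch of this kind of feature test follows. It assumes the 50 percent threshold mentioned above and, as a further assumption not stated in the text, treats decreases symmetrically; the function name and the handling of a zero starting concentration are hypothetical.

```python
def describe_change(values, large_fraction=0.5):
    """Feature descriptors for one species over one episode.

    values: concentrations of the species at successive time steps.
    """
    features = []
    start, end = values[0], values[-1]
    steps = list(zip(values, values[1:]))
    if start <= 0:
        # Relative change is undefined here; this sketch reports nothing.
        return features
    relative = (end - start) / start
    if relative > large_fraction:
        features.append("large-increase")
        if all(later >= earlier for earlier, later in steps):
            features.append("steady-increase")
    elif relative < -large_fraction:
        # Assumption: decreases are judged by the same threshold.
        features.append("large-decrease")
        if all(later <= earlier for earlier, later in steps):
            features.append("steady-decrease")
    return features


monotonic_rise = describe_change([1.0, 1.2, 1.6])
rise_with_dip = describe_change([1.0, 0.9, 1.6])
small_rise = describe_change([1.0, 1.1, 1.2])
```

The first call yields both descriptors, the second only the large increase (one time step shows a decline), and the third yields none.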

The question of identifying relatively "rapid" growth or decline of concentration depends on forming a particular time scale for the simulation. A 50% growth in concentration may be deemed rapid if it occurs in a relatively brief time, or slow if it occurs over an extremely long time. The Workbench derives an implicit time scale for the entire simulation by using the lengths of the longest and shortest episodes as bounds; thus, if a 1 second episode is the shortest found in analyzing some simulation's results, then that episode will be deemed a brief time for the purposes of identifying interesting features. Conversely, if a 1 second episode is much longer than all others found (say, all other episodes are under a millisecond in length), then that episode would be treated as a relatively long time. This heuristic for determining time scale appears reasonable based on experience with the Workbench so far; it enables the program to analyze features in time records that span anywhere from seconds (appropriate for most laboratory systems) to years (which might be appropriate for, say, reactions in the upper atmosphere).

Having annotated the sequence of episodes found in the simulation with features of this kind, the Workbench now looks for patterns in the sequence itself. Specifically, the program notes features such as a long final episode with relatively little concentration change among species (indicative of a system that has reached equilibrium), or repeating patterns among episodes (indicative of the presence of oscillations; see footnote 8). This technique will be illustrated by the example in the next section.

Besides describing the way in which the Workbench uses episode structures, the preceding paragraphs should also provide the reader with some feeling for the rationale behind the episode notion itself. Episodes are intended to correspond to distinct identifiable periods of potentially interesting behavior during a simulation; they are also intended as analogs to the chemist's notion of periods during which some particular reaction "takes over" in importance in determining a mechanism's behavior. The episode data structure provides the Workbench (and the user) with a link between interesting numerical features of some simulation's results, and the underlying reactions that produced those features.

Footnote 8. The Workbench is able to note the occurrence of some complex patterns such as birhythmicity, insofar as it will identify two succeeding oscillating sequences; but it currently has no vocabulary for such cases (and for some other patterns, such as "damped oscillations").

6. AN EXAMPLE OF THE WORKBENCH IN USE

To see how the Workbench can be used in a complete example, we consider the "Brusselator" mechanism, originally devised by Prigogine and Lefever as a simple model of an oscillating reaction (Nicolis & Prigogine, 1977):

\[ A \rightarrow X, \quad k_1 = 1 \tag{9.1} \]

\[ B + X \rightarrow Y + D, \quad k_2 = 1 \tag{9.2} \]

\[ 2X + Y \rightarrow 3X, \quad k_3 = 1 \tag{9.3} \]

\[ X \rightarrow E, \quad k_4 = 1 \tag{9.4} \]

Here, we assume that species A and B are held constant at concentrations of 1 and 3, respectively; as for species D and E, they can be treated as "driven-off species," that is, species whose concentrations are identically 0, inasmuch as they do not enter as reactants in any steps. The behavior of this mechanism will thus be determined by the concentration changes in X and Y.
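As a worked illustration (derived here from mass action kinetics, not quoted from the paper), steps (9.1)-(9.4) give the following rate equations for the two free species, with one term per step in which the species takes part, exactly as in Section 5.1:

```latex
\[
\frac{d[X]}{dt} = k_1[A] - k_2[B][X] + k_3[X]^2[Y] - k_4[X],
\qquad
\frac{d[Y]}{dt} = k_2[B][X] - k_3[X]^2[Y].
\]
% With [A] = 1, [B] = 3 and all rate constants equal to 1:
\[
\frac{d[X]}{dt} = 1 - 4[X] + [X]^2[Y],
\qquad
\frac{d[Y]}{dt} = 3[X] - [X]^2[Y].
\]
```

Step (9.3) contributes a net gain of one X (three produced, two consumed) and a loss of one Y, which is why its rate term enters the two equations with opposite signs.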

We can input the mechanism by typing in the following expression at the Scheme interpreter:

(define brusselator
  '(
    ((((a)) ((x)) 1)
     (((b) (x)) ((y) (d)) 1)
     (((x 2) (y)) ((x 3)) 1)
     (((x)) ((e)) 1))
    ((a 1) (b 3))  ; constant species
    ()             ; sources
    ()             ; sinks
    (d e)          ; driven off
    ()             ; functions of time
    ))

This expression is a bit more complicated than our previous examples, because we are using some of the additional slots available in the description of mechanisms; here, we are specifying in the definition of the Brusselator mechanism that species A and B will have constant concentrations, and that species D and E are to be ignored.

As a first step, we can perform graphical analysis on the mechanism:

(analyze-mechanism-graphically brusselator)

Among the information printed out by the program, we note the following:

Number of complexes: 5
Number of linkage classes: 2
Dimension of stoichiometric subspace: 2
Deficiency: 1
Reversibility: none

Since this mechanism has a deficiency value of 1, we cannot use the zero deficiency theorem to predict its behavior. Informally, we can say that this mechanism might be capable of "exotic" behavior (see footnote 9). It should be noted, parenthetically, that the computation of deficiency is a bit more involved than indicated by the brief summary earlier in this paper: In particular, constant and driven-off species have to be handled specially. Thus, the number of complexes (vertices) in the Brusselator mechanism is 5 for the purposes of deficiency calculation (though a naive examination of (9.1)-(9.4) would suggest that there are 7 distinct vertices).
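The printed figures are consistent with the usual definition of deficiency as the number of complexes minus the number of linkage classes minus the dimension of the stoichiometric subspace (5 - 2 - 2 = 1). The Python sketch below recomputes them. It assumes, as one plausible reading of "handled specially", that constant and driven-off species are simply dropped from every complex; that treatment is an assumption that happens to reproduce the printed counts, not a description of the Workbench's code, and all names are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of integer vectors, by elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def deficiency(reactions, species):
    """Return (complexes, linkage classes, subspace dimension, deficiency).

    reactions: list of (reactant, product) pairs; each complex is a dict
    mapping a species name to its stoichiometric coefficient.
    """
    def vec(cplx):
        return tuple(cplx.get(s, 0) for s in species)

    edges = [(vec(r), vec(p)) for r, p in reactions]
    complexes = sorted({c for edge in edges for c in edge})

    # Linkage classes are connected components of the reaction graph.
    parent = {c: c for c in complexes}

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    for r, p in edges:
        parent[find(r)] = find(p)

    n = len(complexes)
    l = len({find(c) for c in complexes})
    s = rank([[pi - ri for ri, pi in zip(r, p)] for r, p in edges])
    return n, l, s, n - l - s


# Steps (9.1)-(9.4) with A, B (constant) and D, E (driven off) dropped.
reduced = [
    ({}, {"X": 1}),                # (9.1)
    ({"X": 1}, {"Y": 1}),          # (9.2)
    ({"X": 2, "Y": 1}, {"X": 3}),  # (9.3)
    ({"X": 1}, {}),                # (9.4)
]
reduced_result = deficiency(reduced, ["X", "Y"])

# The same steps with every species kept, for the naive vertex count.
naive = [
    ({"A": 1}, {"X": 1}),
    ({"B": 1, "X": 1}, {"Y": 1, "D": 1}),
    ({"X": 2, "Y": 1}, {"X": 3}),
    ({"X": 1}, {"E": 1}),
]
naive_complexes = deficiency(naive, ["A", "B", "D", "E", "X", "Y"])[0]
```

Under this assumption the reduced network has the five complexes 0, X, Y, 2X + Y and 3X, while the unreduced steps give the seven vertices mentioned in the text.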

We now specify run parameters for the program:

(define brusselator-run-parameters
  `((starting-dt 0.05)
    (start-time 0.)
    (end-time 40.)
    (starting-concs
     '((a 1.) (b 3.) (d 0.) (e 0.) (x 0.) (y 0.)))
    (integration-method runge-kutta)
    (focus-species (x y))
    (steps-or-time-per-display 20)
    (end-conditions (end-time))
    (maintain-episode-history? ,true)
    (episode-change-depth 0)
    (graph-window ,*display-graph-window*)))

There are a few parameters in this list that have not appeared in previous examples. Without going into too much detail, the final few parameters indicate that we will use the Workbench to distinguish episode boundaries during the simulation itself; that episode boundaries will be associated with the single most important reaction for each species (rather than a complete ordering of all reactions); and that a running graph of concentrations will be shown as the simulation proceeds. Another list is now needed to specify some graph parameters (axis limits, graph colors, and so forth); but these are not especially interesting and will not be discussed here.
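For readers without access to the Workbench, the numerical part of such a run can be approximated with a short Python sketch. It assumes a classical fourth-order Runge-Kutta scheme with a fixed step of 0.05 (the parameter name `starting-dt` suggests the Workbench may adjust its step, which is not modelled here), mass action rate laws with A = 1 and B = 3 held constant, and the starting concentrations X = Y = 0 from the parameter list. Function names are hypothetical.

```python
def brusselator_rhs(x, y, a=1.0, b=3.0, k1=1.0, k2=1.0, k3=1.0, k4=1.0):
    """Mass action rates of change of X and Y for steps (9.1)-(9.4)."""
    dx = k1 * a - k2 * b * x + k3 * x * x * y - k4 * x
    dy = k2 * b * x - k3 * x * x * y
    return dx, dy


def rk4_step(x, y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = brusselator_rhs(x, y)
    k2x, k2y = brusselator_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = brusselator_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = brusselator_rhs(x + dt * k3x, y + dt * k3y)
    return (
        x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
        y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0,
    )


def simulate(dt=0.05, end_time=40.0, x0=0.0, y0=0.0):
    """Integrate from time 0 to end_time; returns a list of (t, x, y)."""
    steps = round(end_time / dt)
    history = [(0.0, x0, y0)]
    x, y = x0, y0
    for i in range(1, steps + 1):
        x, y = rk4_step(x, y, dt)
        history.append((i * dt, x, y))
    return history


history = simulate()
xs = [x for _, x, _ in history]
# Each upward crossing of X = 1 marks the start of one burst in X.
upward_crossings = sum(1 for a, b in zip(xs, xs[1:]) if a < 1.0 <= b)
```

Counting upward crossings of X = 1 is only a crude check that the trajectory keeps cycling; it is not the episode analysis described in Section 5.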

We now can run a simulation of the mechanism by evaluating the following expression:

(do-simple-graphed-mechanism-run
 brusselator
 brusselator-run-parameters
 brusselator-graph-parameters)

Running the simulation produces the graph shown in Figure 1; it appears as though the concentrations of species X and Y are oscillating at a regular rate (at least after the very outset of simulation).

Footnote 9. There are in fact additional graphical techniques for examining some deficiency one mechanisms (Feinberg (1980) includes an example); but these have not yet been incorporated into the graphical analysis module of the Workbench.

FIGURE 1. Concentrations of X and Y versus time as graphed by the Workbench.

We can view the episodes constructed by the running simulation by evaluating:

(draw-episodes-from-history)

This adds vertical lines to our graph corresponding to the episode boundaries detected by the running program, as shown in Figure 2. A glance at the episode boundaries indicates that starting with the third episode, there seem to be regular divisions of the graph into two-episode "chunks."

Finally, we can ask the Workbench to analyze the constructed episode history:

(display-feature-analysis)

This causes the interpreter to print out (among other information) the following lines:

Now examine the reaction history for possible oscillations:

((long) (large-increase x) (steady-increase x) (large-increase y) (steady-increase y))
((large-increase x) (steady-increase x))
((short) (large-increase x) (rapid-increase x) (large-decrease y) (rapid-decrease y) (steady-decrease y))
((long) (large-decrease x) (rapid-decrease x) (large-increase y))
((short) (large-increase x) (rapid-increase x) (large-decrease y) (rapid-decrease y))
((long) (large-decrease x) (rapid-decrease x) (large-increase y))
((short) (large-increase x) (rapid-increase x) (large-decrease y) (rapid-decrease y))
((long) (large-decrease x) (rapid-decrease x) (large-increase y))
((short) (large-increase x) (rapid-increase x) (large-decrease y) (rapid-decrease y))
((long) (large-decrease x) (rapid-decrease x) (large-increase y))
((short) (large-increase x) (rapid-increase x) (large-decrease y) (rapid-decrease y))
((final) (large-decrease x) (rapid-decrease x) (steady-decrease x) (large-increase y) (rapid-increase y) (steady-increase y))

done.

FIGURE 2. The same graph as in Figure 1, but with episode boundaries shown.

Here, the Workbench has indicated that it has found apparent oscillations starting with the third episode. We can describe the events of each unit of the oscillation as consisting of two episodes: the first is short, and involves a large rapid increase in X and large rapid decrease in Y, while the second is longer and involves a large rapid decrease in X and large increase in Y. (These would correspond to the episodes in Fig. 2 beginning at the times 8.05 and 9.3, respectively.)
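One simple way to find such a repeating unit is sketched below in Python. It is not the Workbench's procedure: each episode from the printed history is reduced, as an assumption, to a (length, X direction, Y direction) triple, and the search returns the earliest, shortest unit that repeats at least twice in a row. The function name and the encoding are hypothetical.

```python
def find_oscillation(labels, min_repeats=2):
    """Return (start index, unit length, repeat count) or None.

    Scans for the earliest start, then the shortest unit, such that the
    unit occurs at least min_repeats times back to back.
    """
    n = len(labels)
    for start in range(n):
        for period in range(1, (n - start) // min_repeats + 1):
            unit = labels[start:start + period]
            repeats = 0
            pos = start
            while labels[pos:pos + period] == unit:
                repeats += 1
                pos += period
            if repeats >= min_repeats:
                return start, period, repeats
    return None


# The twelve episodes printed above, reduced to (length, X, Y) summaries.
SHORT = ("short", "x+", "y-")
LONG = ("long", "x-", "y+")
episodes = [
    ("long", "x+", "y+"),
    (None, "x+", None),
    SHORT, LONG, SHORT, LONG, SHORT, LONG, SHORT, LONG, SHORT,
    ("final", "x-", "y+"),
]
oscillation = find_oscillation(episodes)
```

With this encoding the search reports a two-episode unit starting at index 2, that is, at the third episode, in agreement with the Workbench's printed conclusion. Comparing the full feature lists instead would shift the match by one episode, because the third episode carries an extra steady-decrease descriptor for Y.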

7. ONGOING AND FUTURE WORK

Space and time do not permit a complete description of all the currently implemented features of the Workbench. Much of the additional structure in the program relates to communication between the various modules, allowing information determined by one module to be used by another. For example, there are cases in which the graphical analysis module may determine that a given mechanism will reach an equilibrium state that is independent of starting concentrations; in this case, the Workbench can use a "rapid search" procedure to find a numerical approximation to the various equilibrium concentrations; and, if a simulation is still desired, the numerical simulation module can be told to halt when the state of the system approaches its equilibrium position. (An example along these lines is included in Eisenberg (1990).)

Communication between modules is still incomplete, however, and many extensions to the program can be envisioned. There are situations, for instance, in which qualitative analysis of a simulation's numerical results might guide an automatic reexamination of "interesting" portions of the results. To make this idea a bit more concrete, consider again the Brusselator example. Here, the simulation revealed an "induction" period, followed by sustained oscillations. One might want to perform some analysis on the oscillating portion of the numeric data, ignoring the induction period data; for instance, one might run an FFT algorithm on the postinduction data to find the fundamental frequency of oscillation (if we were to run the FFT on the entire data set, the induction period would introduce complications in the interpretation of the frequency spectrum).
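The idea can be illustrated with a small Python sketch. Several things here are assumptions made for the illustration: the signal is synthetic (a slow rise standing in for the induction period, followed by a sine wave of known frequency), the boundary index is taken as given rather than derived from an episode history, and a direct discrete Fourier transform is used in place of an FFT, which gives the same spectrum more slowly.

```python
import cmath
import math


def dominant_frequency(samples, dt):
    """Frequency of the largest nonzero-frequency DFT component."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant offset
    best_k, best_magnitude = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * i / n)
            for i, v in enumerate(centered)
        )
        if abs(coeff) > best_magnitude:
            best_k, best_magnitude = k, abs(coeff)
    return best_k / (n * dt)


dt = 0.1
induction_steps = 80  # hypothetical boundary between induction and oscillation
true_frequency = 0.25

# Synthetic record: a slow rise, then a steady oscillation.
signal = [0.2 * i * dt for i in range(induction_steps)]
signal += [
    1.0 + math.sin(2 * math.pi * true_frequency * i * dt) for i in range(320)
]

post_induction = signal[induction_steps:]
estimated = dominant_frequency(post_induction, dt)
```

In a real run the boundary index would come from the episode history (for the Brusselator example, the start of the first repeating unit), which is exactly the kind of communication between modules the paragraph above proposes.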

To take another example along these lines, it would be desirable to have the program's qualitative analysis checked against the earlier results of graphical analysis, something that the Workbench cannot do at present. Suppose, for instance, that our graphical analysis tells us that the mechanism will reach a stable equilibrium, but the qualitative analysis fails to discern this equilibrium condition in the numerical results. There are several conceivable explanations for this, but one of the most likely is that the simulation simply did not proceed for a long enough time. In this case, the qualitative analysis module might report the discrepancy between its conclusions and the predictions of the graphical analysis module, and might automatically extend or redo the numerical simulation.

Perhaps one of the most important needed additions to the current program is the ability to compare episode histories between simulations. For instance, we might want to vary a parameter (like a rate constant) over a certain range in order to see how this variation affects the qualitative behavior of the mechanism. In this case, it would be desirable to have the Workbench automatically detect changes between episode histories as the parameter is varied. In some cases, commonly observed patterns of change (such as period-doubling for some oscillating systems) could be detected and reported. The ability to compare episode histories would also allow us to compare distinct mechanisms: for instance, we might want to change a mechanism (say, by dropping or adding an elementary step) in order to see how the qualitative behavior of the system changes. We could compare simulations of the original and "perturbed" mechanisms to see how the qualitative behavior of the two simulations differs.

Additions of the kind described in the preceding paragraphs are planned for future versions of the Workbench. But besides these ambitious prospects, there are a large number of relatively mundane improvements that need to be made in the Workbench. These include: a larger repertoire of heuristics for graphical analysis (including mechanisms with a deficiency of one or more); a larger library of recognized patterns for qualitative analysis (including concepts such as "damped oscillations"); and a more user-friendly interface. There is, in short, much work to be done; but progress so far suggests that chemical simulation can indeed extend well beyond the realm of number crunching.

REFERENCES

Byrne, G. (1981). Software for differential systems and applications involving macroscopic kinetics. Computers and Chemistry, 5(4), 151-158.

de Kleer, J. & Brown, J.A. (1985). Qualitative physics based on confluences. In D. Bobrow (Ed.), Qualitative reasoning about physical systems. Cambridge, MA: MIT Press.

Eisenberg, M. (1989). Descriptive simulation: Combining symbolic and numerical methods in the analysis of chemical reaction mechanisms. MIT A.I. Memo No. 1171.

Eisenberg, M. (1990). Combining qualitative and quantitative techniques in the simulation of chemical reaction mechanisms. In W. Webster & R. Uttamsingh (Eds.), AI and simulation: Theory and applications. San Diego, CA: Society for Computer Simulation.

Feinberg, M. (1980). Chemical oscillations, multiple equilibria, and reaction network structure. In W. Stewart, W. Ray, & C. Conley (Eds.), Dynamics and modelling of reactive systems. Orlando, FL: Academic Press.

Isbarn, G., Ederer, H., & Ebert, K. (1981). The thermal decomposition of n-hexane: Kinetics, mechanism, and simulation. In K. Ebert, P. Deuflhard, & W. Jäger (Eds.), Modelling of chemical reaction systems. Berlin: Springer-Verlag.

Laidler, K. (1987). Chemical kinetics. New York: Harper and Row.

Mahan, B. (1972). University chemistry (Second Ed.). Reading, MA: Addison-Wesley.

Nicolis, G. & Prigogine, I. (1977). Self-organization in nonequilibrium systems. New York: Wiley.

Rössler, O. (1979). Chaos and strange attractors in chemical kinetics. In A. Pacault & C. Vidal (Eds.), Synergetics: Far from equilibrium. Berlin: Springer-Verlag.

Shacham, M. (1985). Comparing software for the solution of systems of nonlinear algebraic equations arising in chemical engineering. Computers and Chemical Engineering, 9(2), 103-112.