SYMBOLIC AND NUMERICAL COMPUTING FOR CHEMICAL KINETIC REACTION SCHEMES

by

Mark H. Holmes
Yuklun Au
J. W. Stayman

Department of Mathematical Sciences
Rensselaer Polytechnic Institute, Troy, NY, 12180

Abstract

The idea of using a symbolic manipulator as the engine for analyzing chemical reactions is developed. With the reactions as the input, it is shown how to symbolically derive the governing differential equations for the concentration of each species, the conservation laws for the scheme, and certain steady states. It is also shown how to use the information derived symbolically to construct, and then execute, a numerical scheme for solving the problem. This is done by linking the symbolic manipulator with an X-window display program that produces a graphical interface relating the data, the numerical solution, and the plotting routines. The routines are demonstrated using the Oregonator model that arises in the study of excitable systems.
Introduction

A major difficulty in chemical kinetics is having to deal with large systems of nonlinear ordinary differential equations. The size of the systems encountered can be significant. For example, the reaction scheme for the formation of water involves 30 reversible reactions [Yablonskii, 1991], the scheme proposed for methanol pyrolysis consists of 66 reversible reactions [Norton, 1990], and the scheme for ethylene oxidation and pyrolysis has 307 reactions [Dagaut, 1990]. The reduction of such systems using the conservation laws, or an analysis to determine the conditions that would lead some species to an equilibrium state, is therefore limited. In fact, the only viable approach for dealing with such problems is to solve them numerically. The limitation with this is that a numerical solution is confined to specific cases, and it is therefore not conducive to a general analysis that determines how the solution depends on the parameters.

An alternative approach is to use a symbolic program, like Maple, to study the problem ([Heck, 1993]; [Ellis, 1992]). An advantage in doing this is that general, or exact, expressions can be derived. So, for example, the conservation laws of very large kinetic schemes are easy to determine. With such information, when the solution needs to be evaluated, the system that actually has to be solved can be significantly smaller. It is worth pointing out that computing languages like Maple are a comparatively recent development, and general purpose programs have been commercially available for only about ten years. Consequently earlier studies, such as the one by [Szamosi, 1984], rely on more conventional floating point calculations.

Stated in general terms, the objective of this paper is to demonstrate how to use a symbolic manipulator as an interface between the mathematical formulation of a physical problem and the determination of the quantitative characteristics of the solution.
Of particular interest here is how to use a symbolic manipulator as the engine for analyzing chemical reactions. For the algorithms that are described, all that is needed are the reactions. For example, the Oregonator model used in the study of excitable systems consists of the five reactions [Field, 1974]

$$
\begin{aligned}
A + Y &\rightarrow X + P, \\
X + Y &\rightarrow 2P, \\
A + X &\rightarrow 2X + 2Z, \\
2X &\rightarrow A + P, \\
Z &\rightarrow \tfrac{1}{2} Y.
\end{aligned} \tag{1}
$$

2 9/29/04
Given this information the program can symbolically derive the governing differential equations for the concentration of each species, the conservation laws for the scheme, and certain steady states. It can also solve the equations numerically and then display the results in a variety of formats (e.g., as a function of time, in a two-dimensional phase plane, in three-dimensional phase space, etc.). Because of the ubiquitous nature of the laws of mass action, such capabilities are useful both for teaching and for research.

It is important to point out that some of the problems discussed here can be solved by other means using commercially available software, e.g., DASAL and CRAMS ([Marsili, 1990]). Also, symbolic programs can be somewhat slower than routines that are run using programming languages like FORTRAN or C. However, what is gained by using a symbolic system is an exact result, so the calculations need only be done once (unlike a FORTRAN or C program, which must be rerun every time the parameters are changed). This is particularly important when one wants to determine the rate parameters from experiment. Also, it is easier to understand the physics since one obtains an answer that depends on the material parameters in a general way. For complex problems, carrying out a symbolic analysis may not produce the solution, but it can result in a significant reduction in the system. This is valuable as it can simplify the problem that then must be solved numerically. Consequently, symbolic manipulators provide an important tool for analyzing chemical kinetics problems that can be used in conjunction with classical scientific computing methods.

In the next section the theory from chemical kinetics, on which the algorithms are based, is discussed. In conjunction with this, the basic constituents of the symbolic program are described. To help demonstrate what is involved, certain of the commands used in the program are written out for the Oregonator model in (1).
After this the X-window graphical interface used to numerically solve the differential equations is described. In the last section, as an example, the complete program is demonstrated using the Oregonator model.

Chemical Kinetics and Symbolic Computing

For the general form of the schemes considered here we assume there are $n$ reactions involving $m$ distinct species $X_1, X_2, \ldots, X_m$. Therefore, the scheme is composed of individual reactions of the form

$$
\sum_{j=1}^{m} \alpha_{ij} X_j \rightarrow \sum_{j=1}^{m} \beta_{ij} X_j \quad \text{for } i = 1, \ldots, n. \tag{2}
$$
It is not assumed that these are elementary reactions, but we do assume that the stoichiometric coefficients (i.e., the $\alpha_{ij}$'s and $\beta_{ij}$'s) are non-negative.

Given the reaction scheme, the first step in the symbolic algorithm is the determination of the species involved. For example, the input for the scheme in (1) is a list of reactions such as the following

    > R := [ A + Y > X + P, X + Y > 2*P, A + X > 2*X + 2*Z, 2*X > A + P, Z > 0.5*Y ];    (3)

Identifying the species that are present is simply a matter of searching the input list, R, to find the indeterminates. In Maple this is accomplished by issuing the command

    > species := indets(R);
    species := {X, P, A, Y, Z}

To determine the number of species (i.e., the value of $m$ in (2)) the following command is used

    > m := nops(species);
    m := 5

With these results, m = 5 and species[1] = X, species[2] = P, ..., species[m] = Z.

Once the species are known, the stoichiometric coefficient product and reactant matrices, $M_p$ and $M_r$, are constructed. Taken together, they are algebraically equivalent to the reaction scheme. Assuming there are $n$ reactions and $m$ species, as in (2), the $n \times m$ matrix $M_p$ has $i,j$-entry $\beta_{ij}$ and the $n \times m$ matrix $M_r$ has $i,j$-entry $\alpha_{ij}$. Determining these coefficients is a sequential process that involves running through the list of reactions, R, and identifying the coefficients of the reactants and products. As an example, for the Oregonator model given in (1) the coefficient matrices are

$$
M_r = \begin{array}{ccccc}
X & P & A & Y & Z \\ \hline
0 & 0 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 \\
2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1
\end{array}
\qquad
M_p = \begin{array}{ccccc}
X & P & A & Y & Z \\ \hline
1 & 1 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 \\
2 & 0 & 0 & 0 & 2 \\
0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0.5 & 0
\end{array}
$$
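The bookkeeping just described (find the species, then fill in $M_r$ and $M_p$ row by row) can be sketched outside Maple as well. The following Python fragment is only an illustration under stated assumptions, not the code of the kinetics program: the reactions are taken as strings that use `>` as the arrow, as in (3), and the names `parse_side`, `stoichiometric_matrices`, and `OREGONATOR` are hypothetical. Exact fractions are used so that a coefficient such as 0.5 is stored without rounding.

```python
import re
from fractions import Fraction

# The reaction list from (3), written as strings (an assumption of this sketch).
OREGONATOR = [
    "A + Y > X + P",
    "X + Y > 2*P",
    "A + X > 2*X + 2*Z",
    "2*X > A + P",
    "Z > 0.5*Y",
]

def parse_side(side):
    """Turn one side of a reaction, e.g. '2*X + 2*Z', into {'X': 2, 'Z': 2}."""
    coeffs = {}
    for term in side.split("+"):
        term = term.strip()
        if not term:
            continue
        match = re.fullmatch(r"(\d*\.?\d*)\s*\*?\s*([A-Za-z]\w*)", term)
        if match is None:
            raise ValueError(f"cannot parse term {term!r}")
        number, name = match.groups()
        # A missing coefficient means 1; repeated species on one side are summed.
        coeffs[name] = coeffs.get(name, Fraction(0)) + Fraction(number or "1")
    return coeffs

def stoichiometric_matrices(reactions, species=None):
    """Return (species, Mr, Mp); row i is reaction i, column j is species j."""
    parsed = []
    for reaction in reactions:
        left, right = reaction.split(">")
        parsed.append((parse_side(left), parse_side(right)))
    if species is None:
        # Default column order: order of first appearance in the list.
        species = []
        for left, right in parsed:
            for name in list(left) + list(right):
                if name not in species:
                    species.append(name)
    Mr = [[left.get(s, Fraction(0)) for s in species] for left, _ in parsed]
    Mp = [[right.get(s, Fraction(0)) for s in species] for _, right in parsed]
    return species, Mr, Mp

# Use the column order X, P, A, Y, Z of the text so the matrices can be compared.
species, Mr, Mp = stoichiometric_matrices(OREGONATOR, ["X", "P", "A", "Y", "Z"])
for name, matrix in (("Mr", Mr), ("Mp", Mp)):
    print(name)
    for row in matrix:
        print("  ", [str(entry) for entry in row])
```

Passing the species order explicitly is a design choice of the sketch: Maple's `indets` returns a set whose ordering is decided by Maple, so reproducing the column order of the text requires stating it.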
To help identify where the entries of the matrices originate, the species associated with each column are included in the expressions for $M_r$ and $M_p$ given above. In general, the $i$th row of each matrix corresponds to the $i$th reaction and the $j$th column is connected with the $j$th species. What is important here is that, given the two stoichiometric matrices, it is possible to uniquely reconstruct the reaction scheme. This is not possible if one uses only the stoichiometric matrix $M_s$, which is defined as

$$
M_s = M_p - M_r. \tag{4}
$$

For example, the stoichiometric matrix $M_s = (-1 \;\; -1 \;\; 1)$ applies to the scheme consisting of the single reaction $A + 2B \rightarrow B + C$. However, it also applies to the scheme consisting of the single reaction $A + B \rightarrow C$.

Once $M_p$ and $M_r$ are known it is then an easy matter to determine the rate $r_i$ of the $i$th reaction. To state the formula obtained using the laws of mass action, consider the system given in (2). If the $i$th row of $M_r$ contains only zeros then $r_i = 0$, otherwise

$$
r_i = k_i \prod_{j=1}^{m} x_j^{M_r[i,j]}. \tag{5}
$$

In this expression, $k_i$ is the rate constant for the $i$th reaction and $M_r[i,j]$ is the $i,j$-th element of $M_r$. Also, $x_j$ designates the concentration of species $X_j$. The Maple commands used to implement the formula in (5) are the following

    > for i from 1 to n do
    >   r.i := k.i:
    >   for j from 1 to m do
    >     if (Mr[i,j]<>0) then r.i := r.i*(x[j]^Mr[i,j]) fi:
    >   od:
    > od:

In the above statements, r.i, k.i, Mr[i,j] and x[j] designate $r_i$, $k_i$, $M_r[i,j]$ and $x_j$, respectively. Once the rates have been determined, the kinetic equation for the time evolution of the concentration of the $j$th species is

$$
\frac{d}{dt} x_j = \sum_{i=1}^{n} \left( M_p[i,j] - M_r[i,j] \right) r_i. \tag{6}
$$
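As a numerical counterpart to (5) and (6), the sketch below evaluates the reaction rates and the right-hand sides of the kinetic equations at a given set of concentrations. It is a minimal Python illustration rather than the symbolic construction used in the program; the function names `reaction_rates` and `kinetic_rhs` are hypothetical, and the matrices are the Oregonator matrices given earlier, with columns ordered X, P, A, Y, Z. The sample concentrations and rate constants are the ones used later in the example session.

```python
# Oregonator stoichiometric matrices; columns are X, P, A, Y, Z.
MR = [[0, 0, 1, 1, 0],
      [1, 0, 0, 1, 0],
      [1, 0, 1, 0, 0],
      [2, 0, 0, 0, 0],
      [0, 0, 0, 0, 1]]
MP = [[1, 1, 0, 0, 0],
      [0, 2, 0, 0, 0],
      [2, 0, 0, 0, 2],
      [0, 1, 1, 0, 0],
      [0, 0, 0, 0.5, 0]]

def reaction_rates(Mr, k, x):
    """Formula (5): r_i = k_i * prod_j x_j**Mr[i][j], and r_i = 0 for a zero row."""
    rates = []
    for row, k_i in zip(Mr, k):
        if not any(row):
            rates.append(0.0)
            continue
        r = k_i
        for x_j, exponent in zip(x, row):
            if exponent != 0:
                r *= x_j ** exponent
        rates.append(r)
    return rates

def kinetic_rhs(Mr, Mp, k, x):
    """Formula (6): dx_j/dt = sum_i (Mp[i][j] - Mr[i][j]) * r_i."""
    r = reaction_rates(Mr, k, x)
    return [sum((Mp[i][j] - Mr[i][j]) * r[i] for i in range(len(r)))
            for j in range(len(x))]

# Values from the example session: k1..k5 and X(0), P(0), A(0), Y(0), Z(0).
k = [100, 1, 100, 1, 0.01]
x0 = [1, 0, 0.1, 1, 1]
print(reaction_rates(MR, k, x0))
print(kinetic_rhs(MR, MP, k, x0))
```

At these values the five rates are 10, 1, 10, 1 and 0.01, so, for instance, the first component of (6) is 10 - 1 + 10 - 2 = 17, up to floating point rounding.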
The construction of the reaction rates and subsequent determination of the differential equations is simply an iteration process that exhausts all the possibilities in the scheme using the formulas in (5) and (6).

The coefficient matrices can also be used to determine the conservation laws for the system [Holmes, 1991]. These can be found by finding the linearly independent solutions of the equation

$$
M_s \mathbf{c} = \mathbf{0}, \tag{7}
$$

where $M_s$ is the stoichiometric matrix given in (4). In other words, if $\mathbf{c} = (c_1, \ldots, c_m)^T$ is a nonzero solution of (7) then a conservation law for the system is

$$
\sum_{j=1}^{m} c_j x_j = \text{constant}. \tag{8}
$$

The determination of the linearly independent solutions of (7) is equivalent to finding a basis for the kernel, or null space, of the stoichiometric matrix defined in (4). Using Maple, the command to find a basis for the kernel of $M_s$ is

    > kernel( Ms, 'nlaws' );

The output from this command is a listing of the linearly independent solutions of (7) and the value of nlaws, which is the number of solutions found. The rest of the procedure for finding the conservation laws consists of a few lines of code for writing out the laws as given in (8). As a final comment, in physical chemistry conservation laws are sometimes restricted to cases in which the $c_j$'s are non-negative [Érdi, 1989], although this condition is not universal [Othmer, 1985]. The above algorithm allows for negative coefficients in the conservation laws.

The determination of equilibrium states, and obtaining analytic expressions that involve the rate parameters, are tasks for which symbolic computing is ideally suited. There is more information in these analytic expressions, if they can be obtained, than in a numerical approximation. A case for this has already been made by Kreye, et al. (1988), where they use SMP to find rate constants for linear reaction schemes [Kreye, 1988], and by Liddell, et al.
(1990), who find rate constants for certain types of schemes using equilibrium states [Liddell, 1990]. Just how successful the program is in this endeavor, however, depends on the particular problem since the reaction schemes can produce highly nonlinear systems that may be unsolvable in closed form. The Oregonator
model in (1) is such an example. Even so, it is still possible to reduce the system considerably. To illustrate how, suppose the system is known to approach an equilibrium state. In this case, $x_j \to 0$ as $t \to \infty$, irrespective of the initial concentrations, whenever the scheme contains an equation of the form

$$
\frac{d}{dt} x_i = k x_j^{\alpha}, \tag{9}
$$

where $\alpha$ is positive and $k$ is a rate constant. It is a simple matter to find all such reactions using the stoichiometric coefficient matrices. This is done by first finding which columns of $M_s$ have a single nonzero entry and then checking the appropriate row of $M_r$ to determine whether the rate depends on a single species. The algorithm does this and then sets the respective concentration to zero. It then goes back to see if any new species with a zero equilibrium state can be found. This process is continued until no new species are obtained.

To summarize the above discussion, the symbolic procedures in the kinetics program, and their functions, are as follows:

1) procedure "odes"
   a) determine the species and number of reactions from an input list containing the reactions;
   b) loop through the list and construct the stoichiometric coefficient matrices;
   c) determine the rate of each reaction;
   d) construct the kinetic equations;
2) procedure "laws"
   a) find the system's linearly independent conservation laws;
3) procedure "steady"
   a) find the zero equilibrium states.

The program, which is written for Maple, is approximately 550 lines long (this includes the help pages as well as the graphical interface described below). It is available through anonymous ftp at Internet site ftp.rpi.edu, and the file to download is pub/math/kinetics.tar.z.

Interface Between Symbolic and Numerical Computing

Even though symbolic manipulators, such as Maple and Mathematica, have the capability of finding exact solutions to certain types of differential equations, the nonlinear nature of most reactions makes this a remote possibility.
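Before the numerical side is taken up, it may help to see the "laws" and "steady" procedures summarized above in executable form. The Python sketch below is not the Maple code of the kinetics program. The null space in (7) is computed here by Gauss-Jordan elimination in exact rational arithmetic, standing in for Maple's kernel command, and the zero steady state search is one reading of the column test described above, with the added assumption that once a species is set to zero every reaction consuming it is dropped from further consideration. The names `null_space`, `conservation_laws`, and `zero_steady_states` are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
SPECIES = ["X", "P", "A", "Y", "Z"]
# Oregonator matrices from the text; columns follow SPECIES.
MR = [[0, 0, 1, 1, 0], [1, 0, 0, 1, 0], [1, 0, 1, 0, 0],
      [2, 0, 0, 0, 0], [0, 0, 0, 0, 1]]
MP = [[1, 1, 0, 0, 0], [0, 2, 0, 0, 0], [2, 0, 0, 0, 2],
      [0, 1, 1, 0, 0], [0, 0, 0, HALF, 0]]

def null_space(M):
    """Basis of {c : M c = 0}, found by exact Gauss-Jordan elimination."""
    rows = [[Fraction(v) for v in row] for row in M]
    ncols = len(rows[0])
    pivots = []          # pivot column of each reduced row, in row order
    r = 0                # index of the next row to receive a pivot
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        scale = rows[r][col]
        rows[r] = [v / scale for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
        if r == len(rows):
            break
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for row_index, col in enumerate(pivots):
            vec[col] = -rows[row_index][free]
        basis.append(vec)
    return basis

def conservation_laws(Mr, Mp):
    """Solutions c of Ms c = 0 with Ms = Mp - Mr, as in (4) and (7)."""
    Ms = [[p - q for p, q in zip(prow, rrow)] for prow, rrow in zip(Mp, Mr)]
    return null_space(Ms)

def zero_steady_states(Mr, Mp):
    """Indices of species forced to zero at equilibrium by the test around (9)."""
    m = len(Mr[0])
    active = set(range(len(Mr)))   # reactions not yet known to have zero rate
    zero = set()
    changed = True
    while changed:
        changed = False
        for col in range(m):
            hits = [i for i in active if Mp[i][col] != Mr[i][col]]
            if len(hits) != 1:
                continue
            reactants = [j for j in range(m) if Mr[hits[0]][j] != 0]
            if len(reactants) == 1:
                zero.add(reactants[0])
                # Assumption: a reaction consuming a zero species has zero rate.
                active -= {i for i in active if Mr[i][reactants[0]] != 0}
                changed = True
    return zero

for law in conservation_laws(MR, MP):
    print("constant =", " + ".join(f"{c}*{s}" for c, s in zip(law, SPECIES)))
print("zero steady states:", sorted(SPECIES[j] for j in zero_steady_states(MR, MP)))
```

For the Oregonator matrices the kernel is one-dimensional and no column of $M_s$ has a single nonzero entry, so the sketch finds one conservation law and no zero steady states. For the hypothetical chain $A \rightarrow B \rightarrow C$, the same search marks both A and B.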
Since exact solutions are rarely obtainable, the issue becomes how to employ the information derived symbolically to construct, and then execute, a numerical
scheme. The easiest approach is to use the numerical solver available within the symbolic program. In the case of Maple this involves the Runge-Kutta-Fehlberg (RKF45) method, although other methods, such as a seventh-eighth order continuous Runge-Kutta method, are also available ([Burden, 1993]). One can use other algorithms, but this aspect of the procedure is not of concern here as numerical methods for initial value problems are well studied [Gear, 1971]. The real difficulty is keeping track of the information needed to compute the solution, particularly for large reaction schemes. Our approach is to link the symbolic manipulator with system files that contain the necessary data to compute the solution (e.g., rate constants, initial concentrations, etc.).

To facilitate the handling of the data, and to make displaying the results as simple as possible, a Motif X-window display program has been written. This program is launched from within Maple and it produces a graphical interface relating the data, the numerical solution, and the plotting routines. An example of the window that comes up is shown in Fig. 1. The data window has entries for all the parameters needed to compute the solution, such as the initial concentration for each species and the rate constants (the latter are labeled k1, k2, k3, ... in the program). The user can also specify the time interval over which the problem is solved and the number of points used in computing the solution. On the left side of the window are toggle buttons that enable the user to specify how the solution is displayed. The possibilities include: i) a 2-D plot showing one or more species as a function of time, ii) a 2-D plot showing one or more species as a function of one of the other species, and iii) a 3-D plot where X and Y are specified species (or time) and there are one or more Z selections. The user can also specify the color of each solution curve using a pop-up menu.
Finally, there are four buttons for saving, plotting, quitting, and help.

The display program waits until the user enters the required data and then presses the plot button. At this point all of the data is collected from the fields and buttons, and the interface program writes a Maple procedure using this information. Within the procedure the system of differential equations, and associated initial conditions, are defined and numerically solved using the "dsolve" command. After this the procedure creates a PLOT or PLOT3D Maple structure based on the solution and the plotting data provided by the interface. Once the procedure is complete the display program writes the code to a file that Maple reads and then executes. The X-window commands used to display the data window make use of the Motif X window libraries and a variety of widgets for the different fields in the data window.
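The numerical step that the generated procedure hands to "dsolve" can be imitated outside Maple. The sketch below is only an illustration under stated assumptions: it uses the classical fixed-step fourth-order Runge-Kutta method, swapped in for Maple's adaptive RKF45 to keep the code short, and it integrates the Oregonator equations over the short interval 0 to 1 with 1000 steps, values chosen only to keep the run quick. The rate constants and initial concentrations are those of the example session, and the names `rhs`, `rk4_step`, and `integrate` are hypothetical.

```python
# Oregonator matrices; columns are X, P, A, Y, Z.
MR = [[0, 0, 1, 1, 0], [1, 0, 0, 1, 0], [1, 0, 1, 0, 0],
      [2, 0, 0, 0, 0], [0, 0, 0, 0, 1]]
MP = [[1, 1, 0, 0, 0], [0, 2, 0, 0, 0], [2, 0, 0, 0, 2],
      [0, 1, 1, 0, 0], [0, 0, 0, 0.5, 0]]
K = [100.0, 1.0, 100.0, 1.0, 0.01]   # k1, ..., k5 from the example session
X0 = [1.0, 0.0, 0.1, 1.0, 1.0]       # X(0), P(0), A(0), Y(0), Z(0)

def rhs(x):
    """Mass-action right-hand side: rates from (5), equations from (6)."""
    rates = []
    for row, k_i in zip(MR, K):
        r = k_i
        for x_j, exponent in zip(x, row):
            r *= x_j ** exponent      # a zero exponent contributes a factor 1
        rates.append(r)
    return [sum((MP[i][j] - MR[i][j]) * rates[i] for i in range(len(rates)))
            for j in range(len(x))]

def rk4_step(f, x, h):
    """One step of the classical fourth-order Runge-Kutta method."""
    s1 = f(x)
    s2 = f([xi + 0.5 * h * si for xi, si in zip(x, s1)])
    s3 = f([xi + 0.5 * h * si for xi, si in zip(x, s2)])
    s4 = f([xi + h * si for xi, si in zip(x, s3)])
    return [xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, s1, s2, s3, s4)]

def integrate(f, x0, t_end, steps):
    """Return the list of states at t = 0, h, 2h, ..., t_end."""
    h = t_end / steps
    path = [list(x0)]
    for _ in range(steps):
        path.append(rk4_step(f, path[-1], h))
    return path

def law(x):
    """Left side of the conservation law reported by the laws() command."""
    return 6 * x[0] + 4 * x[1] + 8 * x[2] + 2 * x[3] + x[4]

path = integrate(rhs, X0, 1.0, 1000)
drift = max(abs(law(x) - law(X0)) for x in path)
print("final state:", path[-1])
print("largest drift in the conservation law:", drift)
```

Monitoring the conservation law is a convenient consistency check, since a Runge-Kutta method preserves a linear invariant of the equations up to rounding error. A drift much larger than rounding would point to an error in the matrices or in the right-hand side rather than in the step size.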
The interface was written and tested with the GNU C compiler and should be widely portable across most machines running UNIX with the Motif X window libraries.

Example

To demonstrate the program, a sample session is given below using the Oregonator model in (1). The session was carried out using Maple V, Release 3, on a SPARCstation 2 with 16 MBytes of RAM and running X-windows. The Maple commands begin with a > and each command is preceded by a comment line that indicates what the command does (the comment lines begin with a # ). Also, where appropriate, the response from Maple follows the command.

    # Read in the program...
    > read kinetics;

    # Enter the Oregonator reaction scheme...
    > R := [ A + Y > X + P, X + Y > 2*P, A + X > 2*X + 2*Z, 2*X > A + P, Z > 0.5*Y ];

    R := [X + P < A + Y, 2 P < X + Y, 2 X + 2 Z < A + X, A + P < 2 X, .5 Y < Z]

    # Determine the differential equations for the scheme...
    > odes();

    The differential equations are:

      d                                           2
    ---- X(t) = k1 A Y - k2 X Y + k3 X A - 2 k4 X
     dt

      d                                  2
    ---- P(t) = k1 A Y + 2 k2 X Y + k4 X
     dt

      d                                  2
    ---- A(t) = - k1 A Y - k3 X A + k4 X
     dt
      d
    ---- Y(t) = - k1 A Y - k2 X Y + .5 k5 Z
     dt

      d
    ---- Z(t) = 2 k3 X A - k5 Z
     dt

    # Determine the conservation laws...
    > laws();

    The conservation laws are:
    1. constant = 6.*X+4.*P+8.*A+2.*Y+Z

    # Check to see what species can have a zero steady state...
    > steady();

    No species were found to have a zero steady state.

    # Compute, and then plot, the solution...
    > ksolve();

In response to the command "ksolve" the plot data window shown in Fig. 1 appears. When the window first comes up there are no entries in the Initial Value and Rate Value columns, and there are no selections in the X, Y, Z columns. The labels X, Y, Z are used here only to indicate the conventional cartesian coordinates (and not specific chemical species). In any case, it is now necessary to specify the initial values for each species. For this example, we take X(0) = 1, P(0) = 0, A(0) = 0.1, Y(0) = 1, and Z(0) = 1. The rate constants to be used in this example are $k_1 = 100$, $k_2 = 1$, $k_3 = 100$, $k_4 = 1$, and $k_5 = 0.01$. Also, the solution is to be calculated at 200 points over the time interval $0 < t \le 40$ (time in this example is measured in seconds). This information, as it would appear in the data window, is shown in Fig. 1.

The first plot to be constructed gives Y(t) and Z(t) as functions of time t. To accomplish this, T is selected in the X column and both Y and Z are selected in the Y column (see Fig. 1). By default the window assigns a color to each species and these are
(in order): white, red, green, yellow, orange, blue, and violet. In the Color column it is possible to reassign the color for each curve. For this example, the curve for Y is to be drawn in green and the one for Z in orange. After entering this information the "Plot Graph" button is pushed and the result is shown in Fig. 2. This plot is done by Maple and there are several options on the plot window, which is not shown, for manipulating the plot (e.g., line styles, type of axes, printing, etc.).

It is just as easy to produce 3-D plots using the data window. To illustrate, suppose one wants to plot the curve consisting of the points (Y(t), Z(t), A(t)) for $0 < t \le 40$. In regard to Fig. 1, out of the X column select Y, out of the Y column select Z, and out of the Z column select A. The curve is also to be drawn in violet and this is indicated in the Color column. The other aspects of the data window are not changed. Again, the "Plot Graph" button is pushed and the result is shown in Fig. 3.

It is worth mentioning that the symbolic procedures used here are relatively fast. In particular, for the Oregonator model the "odes" command takes about 0.2 seconds, the "laws" command about a second, and the "steady" command less than 0.05 seconds. Moreover, these procedures will work on any computer system which has Maple. The "ksolve" procedure, because of its graphical nature, requires X-windows and the Motif X window libraries.

Acknowledgments: We would like to thank the URP at Rensselaer for their support of this work.
References
Figure Captions

Fig. 1 The X-window launched from within Maple. This window is the graphical interface used for entering data for plotting the solution of the kinetic equations.

Fig. 2 Solution curves for Y(t) and Z(t) from the Oregonator model using the data values shown in Fig. 1. As indicated in Fig. 1, Y is drawn in green and Z in orange.

Fig. 3 Solution curve (Y, Z, A), for $0 < t \le 40$, from the Oregonator model using the data values shown in Fig. 1.