A New Parallel Numerically Improved KIVA3V Program for Diesel Combustion Computations

P. Belardini, C. Bertoli, P. DeMarino, G. Avolio
Istituto Motori, National Research Council of Italy
Via Marconi 8, 8125 Napoli
Tel. +39 81 7177131, Fax +39 81 239697
E-mail: p.belardini@im.cnr.it

INTRODUCTION

Because of the computational cost, especially on high-resolution computational domains, three-dimensional simulation codes have in recent years mostly used simplified combustion chemistry. Handling detailed chemical kinetics means dealing with a very large number of intermediates whose concentrations are very low but which strongly affect the reaction rates. Numerical problems become a major issue: the large number of species complicates the solution of the chemical equations; the reaction intermediates typically exhibit very high creation and destruction rates but very low net creation rates; and the species concentrations change on widely different time scales, which adds further stiffness to the problem and complicates the choice of the integration time step.

The KIVA3V code, an updated extension of earlier versions of the same code developed in the 1980s, was designed for scalar-architecture computers at a time when the programmers' main concern was to reduce memory requirements as much as possible [1]. This need led to an architecture in which the time step, chosen at the beginning of each integration cycle, cannot be changed if the convergence criteria are not met, neither while integrating the Navier-Stokes equations nor in the chemical kinetics phase. The integration method was therefore selected favouring stability (the Lagrangian phase, unconditionally stable but not efficient in terms of computational time) over computational cost (the Eulerian phase, time-efficient but not always stable), and consequently using partially implicit methods whenever possible.
This approach, useful when the convergence criteria can be evaluated a priori, introduces many difficulties when dealing with stiff equations. Today the availability of parallel-architecture computers, moreover at low cost as with the Beowulf architecture, makes a detailed description of the combustion chemistry attractive and makes it possible to exploit more advanced computational methods.

THE COMBUSTION MODEL

The way KIVA3V solves chemical problems was not designed to handle detailed chemical models. Not only is the number of chemical species and chemical equations limited, but no effort is made to solve the set of equations simultaneously. Therefore a different, more complex reaction mechanism was introduced. It is based on the Gustavsson and Golovitchev model [2], which uses 57 chemical species and 29 chemical reactions, most of them reversible. It was used to simulate a FIAT 1.9 JTD Common Rail Diesel engine, whose characteristics and selected test point are reported in Table 1; the diesel fuel is represented as n-heptane.

Bore                 82. [mm]
Stroke               9.4 [mm]
Displacement         482.2 [cm^3]
Compression ratio    17.5:1
Engine speed         15 [rpm]
PME                  2 [bar]
Electro-injector     VCO, 6 holes, 0.156 mm

Table 1
The full set of model equations is reported in [2]. The model includes different n-heptane oxidation mechanisms: on the one hand the formation of primary and secondary radicals, with fragmentation of the resulting ketones; on the other hand thermal cracking, producing C4 and C3 species. Aromatic compounds are not modelled; at present the soot formation model has been removed because it cannot easily be managed by the CHEMKIN library.

The KIVA3V code has been integrated with the CHEMKIN-II chemistry solver [3]: the CFD code, responsible for the flow solution, supplies the species and thermodynamic information; the CHEMKIN solver, responsible for the chemistry, returns the chemical reaction rates. In this way the flow and chemistry calculations have been separated, giving a more flexible program and a more logical separation between the program and the thermokinetic data. A conversion link between the two sections, which use species densities and mass fractions respectively, was added, together with additional checks and filters to deal both with negative species concentrations of numerical origin and with inaccuracies in the mass balance.

Regarding the kinetics, CHEMKIN can manage a large number of reactions, treated as elementary steps. The reaction rates may be expressed in the Arrhenius, Troe or Landau-Teller form, all of which look like:

$$\omega_i = K_{fi} \prod_{k=1}^{K} [X_k]^{\nu'_{ki}} - K_{ri} \prod_{k=1}^{K} [X_k]^{\nu''_{ki}}$$

where $[X_k]$ is the molar concentration of the k-th species, $\nu'_{ki}$ and $\nu''_{ki}$ are the reaction orders of that species in reaction i, and $K_{fi}$ and $K_{ri}$ are the forward and reverse rate coefficients. The chosen form of the reaction rate (Arrhenius, Troe or Landau-Teller) determines the form of the $K_{fi}$ coefficients.

THE NUMERICAL SOLVERS

The differential equations involved in the combustion process have widely different time scales and are very stiff.
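What stiffness implies for an explicit integrator can be illustrated on the linear test equation dy/dt = -λy. The following is a minimal sketch and is not taken from KIVA3V or CHEMKIN: the rate constant `lam`, the step `h` and the function names are illustrative assumptions. With hλ = 10 the explicit Euler step amplifies the solution at every step, whereas the implicit (backward) Euler step damps it, as the exact solution does.

```python
def explicit_euler(lam, y0, h, n_steps):
    """Explicit Euler for dy/dt = -lam * y: y_new = (1 - h*lam) * y."""
    y = y0
    for _ in range(n_steps):
        y = y + h * (-lam * y)
    return y


def implicit_euler(lam, y0, h, n_steps):
    """Backward Euler for dy/dt = -lam * y: solve y_new = y - h*lam*y_new."""
    y = y0
    for _ in range(n_steps):
        y = y / (1.0 + h * lam)
    return y


# Illustrative (assumed) values: a fast reaction and a step too large for it.
lam, h, n_steps = 1.0e4, 1.0e-3, 10
y_explicit = explicit_euler(lam, 1.0, h, n_steps)  # grows in magnitude
y_implicit = implicit_euler(lam, 1.0, h, n_steps)  # decays towards zero
print(y_explicit, y_implicit)
```

An explicit method would need h below 2/λ to remain stable here, which is why a few fast reactions can dictate the step size of the whole system.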
The stiffness of these equations compelled the original and subsequent versions of the KIVA code to deal with the chemistry in two different sections, solving the fastest equations, treated as though at equilibrium, in a separate module, an approach that reduces the accuracy of the solution. In our version of the code this limitation was removed: the full set of equations is solved simultaneously in each computational cell.

Handling such a computationally intensive problem required several steps. First, we chose to use the VODE library, developed at the Lawrence Livermore National Laboratory [4] [5]. This is a collection of subroutines that solve systems of ordinary differential equations using a wide variety of numerical methods, all sharing a common interface. It thus became possible to change the integration method in every cell and at every time step, adapting the algorithm to the local conditions of the reacting system. VODE offers two families of methods: the implicit Adams-Moulton methods [6] and the BDF (Backward Differentiation Formulas) [7], which are better suited to stiff problems. The algorithms used to solve the systems of algebraic corrector equations range from functional iteration to the chord method, with either an explicitly supplied or a numerically computed Jacobian, full or banded. All the systems of linear equations are solved using direct methods (mainly LU decomposition [8]). With a qualitative understanding of the nature of the equations, it is also possible to accelerate convergence by letting the solver assume a slowly varying Jacobian matrix. Under that hypothesis, which VODE nevertheless verifies periodically, the partial derivatives need not be recomputed at every step, which drastically reduces the number of evaluations of the kinetic reaction rates.

The huge difference between the chemical reaction and Navier-Stokes time scales also suggested the introduction of segregated integration algorithms.
In the classical KIVA3V approach, the time step is the same for the transport phenomena and for the solution of the chemical kinetic equations. We chose to use the overall time step as the upper limit of integration for the chemical equations rather than reducing the overall time step to cope with the stiffness of the equations. As a result, the chemical time step depends on the local conditions found in each cell rather than on the overall average conditions of the reacting system, so that the computational effort follows the complexity of the physical phenomena taking place in each cell. This choice has been particularly effective: it drastically reduced simulation times, because the program is no longer forced to reduce the time step in the parts of the problem that do not need it. It was found that typically only a few tens of cells out of the many thousands that make up the grid show a stiffness that would require a drastic reduction of the time step; this method takes advantage of that situation, concentrating computational resources where they are most needed.

Another advantage of this method is that it let us bypass one of the most severe limitations of KIVA3V: the monotonic simulation time. Since the program cannot repeat a non-converging pass under any circumstances, it was essential to develop an algorithm that completes the integration of the chemical equations in every case, and a variable-step sub-integrator ensures this.
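The idea of keeping the overall time step fixed and sub-cycling the chemistry locally can be sketched as follows. This is a hedged illustration and not the authors' implementation: it uses explicit Euler with a step-doubling error estimate in place of VODE, treats a single scalar "species", and the function name `subcycle_cell`, the tolerances and the two rate constants are assumptions made only for the example.

```python
def subcycle_cell(rate, y0, dt_global, rtol=1.0e-3, atol=1.0e-9):
    """Integrate dy/dt = rate(y) over [0, dt_global] with local sub-steps.

    The global step is never shortened: the local step h starts at
    dt_global and is halved only while the error estimate is too large.
    Returns the final value and the number of accepted sub-steps.
    """
    t, y, h, n_accepted = 0.0, y0, dt_global, 0
    while dt_global - t > 1.0e-12 * dt_global:
        h = min(h, dt_global - t)           # never step past the global limit
        full = y + h * rate(y)              # one explicit Euler step of size h
        half = y + 0.5 * h * rate(y)        # two Euler steps of size h/2
        two_half = half + 0.5 * h * rate(half)
        error = abs(two_half - full)        # step-doubling error estimate
        if error <= rtol * abs(two_half) + atol:
            t, y, n_accepted = t + h, two_half, n_accepted + 1
            h *= 2.0                        # try a larger step next time
        else:
            h *= 0.5                        # reject, retry with a smaller step
    return y, n_accepted


def slow_rate(y):
    """Slow kinetics (assumed rate constant)."""
    return -1.0 * y


def stiff_rate(y):
    """Fast kinetics (assumed rate constant)."""
    return -1.0e6 * y


dt_global = 1.0e-3                          # assumed overall time step
y_slow, n_slow = subcycle_cell(slow_rate, 1.0, dt_global)
y_stiff, n_stiff = subcycle_cell(stiff_rate, 1.0, dt_global)
print(n_slow, n_stiff)
```

In this sketch the slow cell completes the global step in a single sub-step, while the stiff cell takes many sub-steps on its own; no cell ever asks the flow solver to repeat or shorten its step.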
Regarding the mathematical methods used to integrate the chemical equations over the interval given by the overall KIVA3V time step, we employed an adaptive strategy. First the cheaper explicit Euler and 4th-order Runge-Kutta methods are attempted. Only if both fail to converge is an implicit method used, at first assuming a mainly diagonal Jacobian matrix; this assumption is removed as a last resort. This approach lets the computational load be adapted even more finely to the complexity of the problem.

THE PARALLELIZATION

A preliminary study was performed to identify the most time-consuming parts of the code. In a computational domain of about 12 cells, more than 95% of the computing time is spent on the chemistry: the time needed for the fluid-dynamic and thermodynamic parts of the code is therefore negligible, as already pointed out by Kong and others [9]. As a consequence, the parallelization efforts were mainly focused on the chemical subroutines. The KIVA3V code structure made this choice particularly efficient: each cell is treated as a separate adiabatic CSTR reactor, physically and formally independent of the surrounding cells. In such a context, the way the computational domain is shared among the computing nodes is critical for load balancing, since the integration algorithm chosen for each cell may differ both in the method (Euler, 4th-order Runge-Kutta, BDF) and in the time sub-step, depending on the stiffness of the equations under the local conditions. The code modifications were therefore mainly applied to the chemical module. The simulation runs only on the master node until the chemical phase is reached. Then the job is split among all the available nodes. At the end of the chemistry phase, all the data are collected on the master node, where the rest of the computation takes place (Fig. 1). Standard message passing software (LAM/MPI: standard MPI 1.0
and 1.1) was employed to control the nodes and to synchronize data among the processors. A few routines were also added to collect statistical information about the numerical choices made in the cells. In this regard it is interesting that most cells use either very simple methods (explicit Euler, with local time step equal to the global time step) wherever the chemical kinetics are slow, or more stable methods (BDF) when forced to by faster kinetics. Only a negligible number of cells use the Runge-Kutta method successfully.

Fig. 1. The KIVA3V parallelization (flow chart: Spray Dynamics; Grid Partitioning; Data Transmission; Chemical Kinetics Evaluation on each processor up to Processor n; Data Receive; Transport Phenomena).

The grid sharing criterion was chosen with load balance in mind. Sequential grid partition methods (Fig. 2) did not seem suitable, because the structure of the most common grid generators makes it very likely that consecutive cell numbers are assigned to geometrically neighbouring cells. Given the particular combustion conditions of injection engines, it would therefore have been very probable that all the cells where fast kinetics take place would be assigned to the same node, which would have made all parallelization efforts useless. A block-cyclic distribution was chosen instead [1] (Fig. 3). The reaction volume is thus assigned to the nodes more fairly, and the computational load is much better balanced on average. The load related to the fluid dynamics was wholly delegated to one node, without any parallelization, given its much lesser importance. This strategy proved to be right: all measurements showed that each of the eight nodes of the Beowulf cluster used for the simulation had a CPU load oscillating between 90 and 100%. The networking load was found to be negligible.
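The difference between the two grid sharing criteria can be made concrete with a small sketch. It is an illustration under assumed numbers, not the partitioning code of the program: the cell count, the position of the "stiff" cells, the cost model and the function names are all hypothetical, and a block size of 1 corresponds to a plain round-robin assignment.

```python
def sequential_owner(cell, n_cells, n_nodes):
    """Contiguous chunks: consecutive cell numbers go to the same node."""
    return cell * n_nodes // n_cells


def block_cyclic_owner(cell, n_nodes, block=1):
    """Blocks of `block` consecutive cells are dealt to the nodes in turn."""
    return (cell // block) % n_nodes


def node_loads(owners, costs, n_nodes):
    """Sum the per-cell cost carried by each node."""
    loads = [0] * n_nodes
    for owner, cost in zip(owners, costs):
        loads[owner] += cost
    return loads


n_cells, n_nodes = 800, 8
# Assumed cost model: 80 neighbouring "stiff" cells cost 100 units, the rest 1.
costs = [100 if 100 <= c < 180 else 1 for c in range(n_cells)]

seq_owners = [sequential_owner(c, n_cells, n_nodes) for c in range(n_cells)]
cyc_owners = [block_cyclic_owner(c, n_nodes) for c in range(n_cells)]

seq = node_loads(seq_owners, costs, n_nodes)
cyc = node_loads(cyc_owners, costs, n_nodes)
print(seq)  # one node carries all the stiff cells
print(cyc)  # the stiff cells are spread evenly over the nodes
```

With the sequential criterion the elapsed time of the chemistry phase is set by the single overloaded node; with the cyclic criterion every node carries the same share of the expensive cells.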
Fig. 2. Sequential criterion of grid sharing.
Fig. 3. Round-robin criterion of grid sharing.

To evaluate the quality of the parallelization, a load balance analysis was performed on two different test cases. The results showed a uniform distribution of the load among the nodes for both test cases (Figs. 4 and 5, left); the master node, which is also involved in the fluid-dynamic part of the simulation, is slightly more loaded. In both test cases the time spent on data exchange among the nodes is short: from 0.2% to 0.5% for the master node, depending on the test case, and negligible for the other processors (Figs. 4 and 5, right).

Fig. 4. Test case #1: % CPU Time (left) and % System Time (right).
Fig. 5. Test case #2: % CPU Time (left) and % System Time (right).

RESULTS

The results may be considered in terms of computational time savings and prediction quality. First, the time needed to solve the full set of equations was reduced from 72 to 9 hours by parallelization alone. It falls further to 6 hours with the additional contribution of the numerical integration refinements (Fig. 6).
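The timings quoted above can be restated as speed-up factors. This is only a worked restatement of the 72, 9 and 6 hour figures: the symbols S and E are introduced here, and the efficiency value assumes that the 72-hour run is the single-node baseline for the eight-node cluster, which the text does not state explicitly.

$$
S_{\mathrm{par}} = \frac{72}{9} = 8, \qquad
S_{\mathrm{num}} = \frac{9}{6} = 1.5, \qquad
S_{\mathrm{tot}} = S_{\mathrm{par}} \, S_{\mathrm{num}} = \frac{72}{6} = 12, \qquad
E = \frac{S_{\mathrm{par}}}{8} = 1
$$

Under that assumption the parallel part scales almost ideally over the eight nodes, and the numerical refinements contribute a further factor of 1.5.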
Fig. 6. KIVA3V improved version speed-up. Computing time [h]: 72 for the full set of equations, scalar; 9 for the full set of equations solved in parallel; 6 for the full set of equations solved in parallel with the numerical solver improvement.

Fig. 7 shows the comparison between the simulation and the experimental tests: the coincidence between the predicted and the measured ignition angle proves the correct tuning of the kinetics, while the agreement between the predicted and measured pressure after combustion has taken place (around +2 degrees ATDC) confirms the quality of the thermodynamic data describing the species. The disagreement between the maximum measured pressure and the predicted one, even though less than 1% in absolute value, deserves a more detailed analysis, as it indicates an overestimate of the chemical kinetics in the high-temperature phases. This overestimate is at least partly due to the fact that the program currently includes no model of the interaction between turbulence and chemical kinetics: as a consequence, the assumption of perfect mixing within the cells raises the estimated reaction rates in highly turbulent zones. A turbulent combustion model is under development.

Fig. 7. Comparison between numerical and experimental results.

CONCLUSIONS

The computational efficiency of the KIVA3V code was significantly improved by combining parallel techniques with the adaptive choice of the integration method, selected cell by cell as the fastest method that remains accurate. The computational time gained may therefore be spent on more detailed kinetics, expressed not only in the Arrhenius form and able to model the whole low- and high-temperature oxidation process, as required by the new generation of combustion engines. Because the chemical combustion model is no longer embedded in the CFD code, the chemistry can easily be changed, so that increasingly sophisticated kinetics can be used and tested.
The kinetics part of the program is conceptually separated from the numerical solvers: it is therefore possible to evaluate the quality of the kinetics and the quality of the numerical solutions separately.
The code was enriched with the capability to reduce the local time step for cells with faster kinetics, with more efficient controls of the species mass balance, and with greater robustness of the computation in terms of stability. The next steps are to tune and test the kinetic model on a good experimental basis and to introduce a mechanism for handling the non-gaseous phases, which is necessary to model soot formation. The coupling of turbulence with combustion is also under development.

REFERENCES

[1] Amsden, A.A., O'Rourke, P.J., Butler, T.D., KIVA3V-II: A Computer Program for Chemically Reactive Flows with Sprays, Los Alamos National Laboratory Report No. LA-1156-MS, 1989
[2] Gustavsson, J., Golovitchev, V.I., Spray Combustion Simulation Based on Detailed Chemistry Approach for Diesel Fuel Surrogate Model, SAE Paper 23-137, (23)
[3] Kee, R.J., Rupley, F.M., Miller, J.A., Chemkin-II: A Fortran Chemical Kinetics Package for the Analysis of Gas Phase Chemical Kinetics, Sandia Report SAND89-89B, 1991
[4] Brown, P.N., Byrne, G.D., Hindmarsh, A.C., VODE: A Variable Coefficient ODE Solver, SIAM J. Sci. Stat. Comput., 1 (1989), pp. 138-151
[5] Byrne, G.D., Hindmarsh, A.C., A Polyalgorithm for the Numerical Solution of Ordinary Differential Equations, ACM Trans. Math. Software, 1 (1975), pp. 71-96
[6] Shampine, L.F., Gordon, M.K., Computer Solution of Ordinary Differential Equations: the Initial Value Problem, Freeman, 1975
[7] Hairer, E., Wanner, G., Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, Springer-Verlag, 1991
[8] Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., Numerical Recipes in C, Cambridge University Press, 1992
[9] Amr Ali, Cazzoli, G., Kong, S.C., Reitz, R.R., Montgomery, C., Improvement in Computational Efficiency for HCCI Engine Modelling by Using Reduced Mechanisms and Parallel Computing, Central State Section Meeting 23
[10] Kee, R.J., Miller, J.A., Jefferson, T.H., A Structured Approach to the Computational Modeling of Chemical Kinetics and Molecular Transport in Flowing Systems, Springer Series in Chemical Physics 47, 196 (also known as Sandia National Laboratories Report, SAND8-83), 1986