
The Pennsylvania State University
The Graduate School

NEWTON-KRYLOV METHODS FOR THE SOLUTION OF THE k-EIGENVALUE PROBLEM IN MULTIGROUP NEUTRONICS CALCULATIONS

A Dissertation in Nuclear Engineering
by
Daniel F. Gill

© 2009 Daniel F. Gill

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

December 2009

The thesis of Daniel F. Gill was reviewed and approved by the following:

Yousry Y. Azmy
Adjunct Professor of Nuclear Engineering
Dissertation Advisor, Co-Chair of Committee

Seungjin Kim
Assistant Professor of Nuclear Engineering
Co-Chair of Committee

Kostadin Ivanov
Professor of Nuclear Engineering

Ludmil Zikatanov
Associate Professor of Mathematics

Robert Grove
Rickover Fellowship Laboratory Mentor
Knolls Atomic Power Laboratory, Schenectady, NY
Special Member

Brian Aviles
Knolls Atomic Power Laboratory, Schenectady, NY
Special Member

Jack Brenizer
Professor of Mechanical and Nuclear Engineering
Chair of Nuclear Engineering

Signatures are on file in the Graduate School.

Abstract

In this work we propose using Newton's method, specifically the inexact Newton-GMRES formulation, to solve the k-eigenvalue problem in both transport and diffusion neutronics problems. This is achieved by choosing a nonlinear function whose roots are the eigenpairs of the k-eigenvalue calculation and then using Newton's method to solve the nonlinear system. The flexibility resulting from the use of a Krylov subspace method to solve the linear Newton step can be further extended via the use of the Jacobian-Free Newton-Krylov (JFNK) approximation, which requires no knowledge of the system's Jacobian; instead, only the ability to evaluate the system residual is necessary. This makes it possible to avoid the computational and memory costs associated with the construction and storage of the Jacobian, resulting in an efficient solution algorithm.

Writing the k-eigenvalue problem as a nonlinear function yields a number of formulations, all of which have the desired roots. For the diffusion approximation, the nonlinear function is written in the form of the generalized eigenvalue problem, and a set of preconditioners is developed and applied to the GMRES iterations that are used to solve the linearized Newton problem. Most of the developed methods can be implemented as either Newton-Krylov (NK) methods, where the Jacobian-vector product is formed using the explicitly constructed Jacobian, or via the JFNK approximation, where a finite-difference perturbation is used to approximate the Jacobian-vector product. One particularly effective preconditioning option comprises the use of the standard power iteration to precondition the GMRES iteration on either the right or the left. Preconditioning on the left, denoted JFNK(PI), results in a modified nonlinear system whose implementation only requires the ability to perform a single traditional outer

iteration, making this approach relatively simple to wrap around an existing diffusion theory k-eigenvalue problem solver. Other preconditioning options, such as the Incomplete Cholesky decomposition of the within-group diffusion matrix, are also considered.

Similar methods were developed for transport theory, cast using an operator notation that greatly simplifies their presentation. All of the nonlinear functions developed are written in terms of a generic fixed-point iteration, with a number of specific fixed-point formulations considered. Each fixed-point scheme represents a viable k-eigenvalue problem solution method, with two of the techniques corresponding to traditionally used iterative schemes. The new methods developed can also be wrapped around existing software in most instances, simplifying the implementation process. Ultimately it is seen that the most effective of the Newton formulations in transport theory is wrapped around a k-eigenvalue formulation that is a very special instance of traditional methods: no upscattering iterations are performed and only one inner iteration is completed per outer, using source iteration with the previous outer iterate as the initial guess. This results in a fixed-point iteration that collapses the three possible iteration levels (outer iterations, upscattering iterations, inner iterations) into a single level of iteration. While this formulation of the k-eigenvalue problem converges very slowly if solved as a traditional fixed-point iteration, when coupled with Newton's method it results in very inexpensive Jacobian-vector products.

In the Newton approach an extra degree of freedom is introduced by including the eigenvalue as an unknown, meaning an additional relation is necessary to close the system. In the diffusion theory case a normalization condition on the eigenvector was generally used; in transport theory, however, a number of so-called constraint relations were considered. These fall into two categories: normalization relations and eigenvalue update formulations. It was observed that the most effective of these constraint relations is the fission-rate eigenvalue update, derived directly from the eigenvalue update formula traditionally used to solve the k-eigenvalue problem.

Numerical results, including measured performance quantified in number of iterations and execution time, were generated for suites of benchmark problems using the various Newton's method formulations for the k-eigenvalue problem in both transport and diffusion theories. These results showed that the choice of the perturbation parameter in the JFNK approximation has very little impact on the calculation, while the choice of GMRES stopping criterion significantly affects the total cost of the calculation. The size of the GMRES subspace and the maximum number of restarts permitted were also seen to play an important role in the cost of a calculation. While the diffusion formulations showed little sensitivity to the initial guess of the Newton iterations, the transport formulations were seen to

potentially diverge or converge to a non-fundamental mode if a poor initial guess was used. This behavior was avoided by performing a single traditional fixed-point iteration prior to initializing Newton's method.

Overall, the numerical results showed that the Newton formulation of the k-eigenvalue problem in diffusion theory is competitive with the Chebyshev-accelerated power iteration, with the JFNK(PI) formulation generally resulting in quicker execution times. The transport results showed that a number of the Newton formulations developed result in methods that are significantly less computationally expensive than traditional techniques. Results for the well-known C5G7-MOX benchmark problem demonstrate that the Newton approach reduces by a factor of 5 the total number of sweeps necessary to converge the point-wise fission source error to a fixed tolerance. The numerical results generated in this work show that the Newton approach is superior to existing techniques for both transport and diffusion calculations. Furthermore, the newly developed methods have been formulated in such a way that they can be implemented as wrappers around existing code sections, requiring little access and modification to existing code, where the computational kernel is typically some variation of a traditional outer iteration. Based on these results, it is plausible that more advanced (via numerical optimization and acceleration techniques) implementations of these approaches could prove to be more efficient than the methods currently used to solve the k-eigenvalue problem in production-level software.

TABLE OF CONTENTS

List of Figures
List of Tables
Acknowledgments

Chapter 1 Introduction
  The k-eigenvalue Problem
  Numerical Solutions to Eigenvalue Problems
    Power Iteration
    Subspace Iteration
    Krylov Subspace Methods
    Other Methods
  Krylov Subspace Methods
    Stationary Methods
    Projection Methods
  Preconditioning Linear Problems
  Inexact Newton Methods
    Forcing Factor
    Globally Convergent Newton Methods
    Jacobian-Free Newton-Krylov Methods
  The k-eigenvalue Problem in Neutronics
  Newton's Method in Neutronics Calculations

Chapter 2 The k-eigenvalue Problem in Diffusion Theory
  The Diffusion Approximation
  Inexact Newton Methods
    Evaluating Γ
    Evaluating Γ′
  Generalized Eigenvalue Problem
    Preconditioning with the Diffusion Operator
    Preconditioning with the IC Factorization of the Diffusion Operator
    Including Fission Terms in the Preconditioner
  Preconditioning with Power Iterations
    JFNK Acceleration of Power Iteration
    PI as Fixed-Point Iteration
  Summary of Methods
  Practical Issues
    Convergence of Newton Iteration
    Initial Newton Guess
    Backtracking
    Implementation
    Potential Difficulties

Chapter 3 Diffusion Theory Numerical Results
  Test Code
  Test Problem Descriptions
    IAEA Benchmark
    Biblis Problem
    BWR
    CANDU
  Numerical Experiments
    Perturbation Parameter
    Forcing Factor
    Initial Power Iterations
    Inner Iterations
  Convergence of Newton-Based Methods
    Full Convergence
    GMRES Performance
    Practical Considerations
  Comparison with Power Iteration

Chapter 4 The k-eigenvalue Problem in Transport Theory
  Neutron Transport Theory
    α-eigenvalues
    k-eigenvalues
  Problem Discretization
    Operator Notation
  Traditional Solution Techniques
    Inner Iterations
    Upscattering Treatment
    Realistic Implementation
  Newton's Method and Transport Theory
    Evaluating Γ(P)
    Accelerating Power Iteration
    Accelerating Fixed-Point Iteration
    Accelerating Flattened Fixed-Point Iteration
    Variations on Flattened Fixed-Point Iteration
    Constraint Equation
    Evaluating Γ′(P)
  The α-eigenvalue Problem
  Summary of Newton Approaches

Chapter 5 Transport Theory Numerical Results
  S_N Transport Code
    Convergence Criteria
    Performance Measures
    Algorithmic Parameters
  Benchmark Problem Suite
    Takeda Benchmarks
    C5G7-MOX Benchmark
    Summary of Benchmarks
  JFNK/NK Parametric Studies
    Perturbation Parameter
    Inexact Newton Forcing Factor
    Initial Guess
    GMRES Iterations
  Newton Formulations
    Angular and Spatial Discretizations
    Inner Iterations

    5.4.3 Formulations & Constraints
  Comparing Traditional & Newton Methods
    Takeda Results
    C5G7-MOX Results
    Summary of Comparisons to Traditional Schemes

Chapter 6 Conclusions
  Summary
  Conclusions
  Future Work

Appendix A Mathematical Definitions
  A.1 Types of Matrices

Appendix B Benchmark Suite Specification

Bibliography

LIST OF FIGURES

3.1 Diffusion Test Problems Material Assignment
3.2 Impact of ɛ on JFNK(PI)
3.3 Impact of ɛ on JFNK(IC)
3.4 Impact of ɛ on JFNK(GEP)
3.5 Impact of η on JFNK(PI)
3.6 Impact of η on JFNK(IC)
3.7 Impact of η on NK(IC)
3.8 Effect of Number of Initial Power Iterations to Initialize JFNK(PI)
3.9 Effect of Number of Initial Power Iterations to Initialize NK(IC)
3.10 Effect of Maximum Number of Inner Iterations on Preconditioned Power-Iteration-Based Methods Measured in Terms of Cumulative Inner Iterations
3.11 Effect of Inner Iteration Tolerance on Preconditioned Power-Iteration-Based Methods Measured in Terms of Cumulative Inner Iterations
3.12 Convergence of Newton Methods for IAEA Benchmark
3.13 Convergence of Newton-Based Methods for IAEA Benchmark as a Function of GMRES Iterative Properties, 1000 Iteration Maximum
3.14 Convergence of Newton-Based Methods for IAEA Benchmark as a Function of GMRES Iterative Properties, 100 Iteration Maximum
3.15 Convergence of Inner Iterations using Power Iteration as a Preconditioner
3.16 Convergence of Newton-Based Methods for IAEA Benchmark as a Function of η
3.17 Potential Impact of GMRES Implementation on Convergence
3.18 Performance of JFNK Preconditioners using An Algorithm Without Backtracking
3.19 Performance of JFNK Preconditioners using Eisenstat-A Algorithm Without Backtracking

3.20 Performance of NK Preconditioners using An Algorithm Without Backtracking
3.21 Eisenstat-B Without Backtracking
3.22 Eisenstat-B With Backtracking when r <
3.23 Comparison of Power Iteration and Newton's Method: Coarse Mesh, PARCS Convergence
3.24 Comparison of Power Iteration and Newton's Method: Fine Mesh, PARCS Convergence
3.25 Comparison of Power Iteration and Newton's Method: Coarse Mesh, ɛ_mach Convergence
3.26 Comparison of Power Iteration and Newton's Method: Fine Mesh, ɛ_mach Convergence
4.1 PARTISN Iterative Strategy
5.1 Discretization of C5G7-MOX Problem in x-y Plane
5.2 Effect of η on Takeda-1, Case 1 Convergence
5.3 Effect of η on Takeda-1, Case 2 Convergence
5.4 C5G7-MOX Benchmark: JFNK-F-FR for Varying η
5.5 C5G7-MOX Benchmark: JFNK-FDF-FR for Varying η
5.6 Convergence of Takeda-1 for Varying Angular/Spatial Discretization
5.7 Performance of JFNK Formulations Relative to Traditional Methods
5.8 Performance of JFNK Constraints Relative to Traditional Methods
5.9 Performance of JFNK-FDF-FR Relative to Traditional Methods, Tightly Converged
B.1 Takeda-1 Geometry
B.2 Takeda-2 Geometry
B.3 Takeda-3 Rodded Geometry
B.4 Takeda-3 No Rod Positions Geometry
B.5 C5G7-MOX 3-D Geometry
B.6 C5G7-MOX Pin Description/Layout
B.7 C5G7-MOX Axial Reflector
B.8 C5G7-MOX: Unrodded, Rodded A, Rodded B

LIST OF TABLES

2.1 Summary of Inexact Newton Methods for Diffusion Theory
3.1 Diffusion Test Problems Cross Sections
3.2 Base Case for Convergence Tests
3.3 Computational k-eigenvalue Results for Diffusion Benchmarks
4.1 Derivatives of (φ − Pφ) with respect to φ
4.2 Derivatives of (φ − Pφ) with respect to λ
4.3 Derivatives of ρ with respect to φ
4.4 Derivatives of ρ with respect to λ
4.5 Summary of Methods for Transport k-eigenvalue Problem
5.1 Benchmark Suite: Reference Eigenvalues
5.2 Degrees of Freedom for Transport Benchmark Problems
5.3 Effect of ɛ on Sweep Counts
5.4 Effect of Initial Fixed-Point Iterations on Sweep Count for Takeda-1 Configurations
5.5 Effect of Initial Eigenvalue Guess on Sweep Count for Takeda-1 Configurations
5.6 C5G7-MOX-Unrodded Total Sweep Counts Using Fixed-Point IG with Converged Inners (GMRES)
5.7 C5G7-MOX-Unrodded Total Sweep Counts Using Flat Fixed-Point IG
5.8 C5G7-MOX-Unrodded Total Sweep Counts Using k^(0) from a Single Converged Fixed-Point Iteration
5.9 Sweep Count for Flat-Flux IG C5G7-MOX-Unrodded Problem Without Upscattering and k^(0) =
5.10 Effect of Number of GMRES Restarts on Takeda-3 Problems, Subspace Size

5.11 Eigenvalues for Takeda-3 Problem with Varying Number of GMRES Restarts
5.12 Varying Angular/Spatial Discretization for Takeda-1 using JFNK-FDF-FR Newton Formulation
5.13 Effect of Spatial Discretization on k
5.14 Effect of Inner Iteration Treatment on Takeda-1 With Upscattering
5.15 Using a Fixed Number of Inners for C5G7-MOX-Unrodded
5.16 Comparison of NK and JFNK for Takeda-1 With Upscattering
5.17 Comparison of Formulations & Constraints Sweep Count for Takeda-1 With Upscattering
5.18 Sweeps per Second for Takeda-1 With Upscattering
5.19 Comparison of Formulations & Constraints for Takeda-1, Case 1
5.20 Comparison of Formulations & Constraints for Takeda-1, Case 2
5.21 Comparison of Formulations & Constraints for Takeda-2, Case 1
5.22 Comparison of Formulations & Constraints for Takeda-2, Case 2
5.23 Comparison of Formulations & Constraints for Takeda-3, Case 1
5.24 Comparison of Formulations & Constraints for Takeda-3, Case 2
5.25 Comparison of Formulations & Constraints for Takeda-3, Case 3
5.26 Comparison of Formulations & Constraints for C5G7-MOX, Unrodded
5.27 Comparison of Formulations & Constraints for C5G7-MOX, Rodded A
5.28 Comparison of Formulations & Constraints for C5G7-MOX, Rodded B
5.29 Accuracy of Benchmark Solutions
B.1 Fission Spectrum for Benchmark Suite
B.2 Takeda-2 & Takeda-3 Total and Fission Cross Sections
B.3 Takeda-1 Cross Sections
B.4 Takeda-2 & Takeda-3 Scattering Matrices
B.5 C5G7-MOX Control Rod Cross Sections
B.6 C5G7-MOX UO2 Fuel-Clad Cross Sections
B.7 C5G7-MOX 4.3% MOX Fuel-Clad Cross Sections
B.8 C5G7-MOX 7.0% MOX Fuel-Clad Cross Sections
B.9 C5G7-MOX 8.7% MOX Fuel-Clad Cross Sections
B.10 C5G7-MOX Fission Chamber Cross Sections
B.11 C5G7-MOX Guide Tube Cross Sections
B.12 C5G7-MOX Moderator Cross Sections

Acknowledgments

First and foremost, I would like to thank my academic advisor, Dr. Yousry Azmy, for his guidance and encouragement throughout my academic career. He has shown himself to be a wonderful teacher and his influence on my development cannot be overstated. Little did I know what I was getting myself into the first time I walked into his office all those years ago...

I would like to thank Dr. Robert Grove for his mentorship in the Rickover fellowship program and for always staying involved and interested in my progress; his comments on this dissertation were invaluable. I would also like to thank my other committee members, Dr. Brian Aviles, Dr. Kostadin Ivanov, Dr. Seungjin Kim, and Dr. Ludmil Zikatanov, for their suggestions and comments. I am deeply indebted to Dr. James Warsa for sharing with me both his expertise and patience during this research. I found our conversations to be extremely illuminating, and the imprint that they left pervades this work.

I would like to express my gratitude to all of my office-mates, present and past: Jess, Double, Mike, Max, Jose, Bielen, Kursat, Josh, and Sebastian. Somehow, between inane debates and improvised nerf-ball challenges, I managed to learn quite a bit from each of you; it was a pleasure. I hope that even in my absence the Green Army sees continued success in its quest for world domination.

To my parents: thank you for instilling in me the desire to learn and succeed; your influence has made this possible. I would also like to thank Matt, Becca, Erin, Eric, and Odin for their steadfast support. To the rest of my family and all of my friends: thanks for being there! I wish Goldie and Genevieve were here to see this; I think they would be proud.

This research was performed under appointment to the Rickover Graduate Fellowship Program sponsored by the Naval Reactors Division of the U.S. Department of Energy.

CHAPTER 1

Introduction

1.1 The k-eigenvalue Problem

Matrix eigenvalue problems are an important class of problems, commonly encountered in a variety of scientific and engineering disciplines. Determining structural stability due to vibrations, analyzing the properties of certain numerical methods, calculating the criticality of a nuclear fission chain reaction, and even ranking web search results are all examples of matrix eigenvalue problems. The fast and efficient numerical determination of eigenvalues and eigenvectors is therefore quite important in many applications, and thus these numerical methods constitute an active field of research.

In the field of nuclear engineering the matrix eigenvalue problem most often considered is the so-called k-eigenvalue problem. Finding the k-eigenvalue, or multiplication factor, of a nuclear system is essential because it provides insight into the self-sustaining nature of the chain reaction. The multiplication factor is the ratio of the number of neutrons born in the current generation to the number in the preceding generation. If the multiplication factor exceeds one, then the number of neutrons present in the system increases with each generation, a situation termed supercritical, while a multiplication factor of less than one indicates a chain reaction which will die out in the absence of a fixed source, known as a subcritical system.

A value of one points to a steady-state situation where each generation of fissions produces the same number of neutrons in the next; a system in this state is called critical. The criticality determined by the k-eigenvalue is an important criterion in both the design and analysis of fissioning systems.

The numerical method of choice to solve the k-eigenvalue problem in the nuclear engineering community has long been the classical fixed-point power method in conjunction with some form of acceleration. The power method is robust and has the advantage of stable convergence properties, but it can converge extremely slowly in certain situations. However, research into more advanced methods has also been conducted, with encouraging results. In this work, new methods for finding the k-eigenpair using the Jacobian-Free Newton-Krylov (JFNK) method will be introduced. There are a variety of ways in which the k-eigenvalue problem can be written as a nonlinear function, resulting in several Newton formulations. Many formulations can be written such that the computational kernel is similar to existing iterative techniques, making it possible to utilize existing software. Other formulations make it possible to avoid the solution of the within-group problem entirely. Initially, the performance of these methods will be analyzed for the k-eigenvalue problem posed using the diffusion approximation to neutron transport theory. Extension of this approach to the S_N approximation to the transport equation constitutes the centerpiece of this research.

In this chapter literature relevant to the work at hand will be reviewed, beginning with a general introduction to the numerical methods available for the solution of the matrix eigenvalue problem and linear systems of equations. The discussion will then turn to the Newton family of methods, with most of the attention given to the inexact and Jacobian-Free variations of Newton's method for nonlinear systems. The k-eigenvalue problem in nuclear applications will be briefly presented, followed by an overview of the methods currently in use for such problems and those which have been published as alternatives to the current methods.

1.2 Numerical Solutions to Eigenvalue Problems

Given the ubiquity of the eigenvalue problem in the fields of science and engineering, it is not surprising that the study of its solution techniques represents an

important branch of numerical analysis. Nearly any textbook on matrix computations or scientific computing will have a section outlining methods used to compute eigenvectors and eigenvalues; there are also entire textbooks devoted solely to numerical methods for eigenvalue problems. Methods which are directly relevant to the work at hand will be reviewed, while others which are only tangentially related will be briefly discussed. This section, however, is not intended to be a comprehensive review of numerical eigenvalue problem techniques. The most important of the methods which will be discussed is the power method, followed by reviews of the related methods of inverse iteration, Rayleigh quotient iteration, and subspace iteration. The discussion of subspace iteration will lead to a quick description of QR iteration, at which point Krylov subspace methods will be introduced, primarily the Arnoldi and Lanczos methods. Finally, the Jacobi-Davidson method will be briefly mentioned.

1.2.1 Power Iteration

The basic matrix eigenvalue problem can be compactly written as

    Ax = λx,    (1.1)

where A is an n × n matrix, x is a vector of length n, and λ is a scalar. Any nonzero x and λ pair which satisfy this equation are known collectively as an eigenpair, where x is the eigenvector and λ is the eigenvalue. The largest (in magnitude) eigenvalue is denoted by λ_1, such that |λ_1| ≥ |λ_2| ≥ ... ≥ |λ_n|, with the corresponding eigenvectors denoted by [x_1, x_2, ..., x_n]. The sequence of eigenvalues of A is often referred to as the spectrum of A.

Power iteration is perhaps the simplest and one of the oldest techniques used to compute an eigenvalue of a square matrix and its corresponding eigenvector. The method basically consists of multiplying some arbitrary nonzero vector, v^(0), by A repeatedly. In Heath [1] the normalized power iteration is written as in Algorithm 1.1. Upon convergence, x_1 = v^(k) and λ_1 = ‖z^(k)‖. In the power method algorithm, division by any properly defined vector norm, ‖·‖, can be used to avoid geometric growth of the eigenvector during the iterative process; here ‖·‖ denotes the L_∞ norm.

Algorithm 1.1 Power Method [1]
  v^(0) = arbitrary nonzero vector
  for k = 1, 2, ... do
    z^(k) = A v^(k-1)
    v^(k) = z^(k) / ‖z^(k)‖
  end for

To study the convergence properties of the power method, v^(0) is written as a linear combination of the eigenvectors,

    v^(0) = a_1 x_1 + a_2 x_2 + ... + a_n x_n,

where the a_j, j = 1, ..., n, are scalars representing the projection of the initial guess v^(0) on the corresponding eigenvectors, x_j, and where we assume a_1 ≠ 0. It follows that

    A^k v^(0) = v^(k) = λ_1^k ( a_1 x_1 + Σ_{j=2}^{n} (λ_j/λ_1)^k a_j x_j ).

As k increases, (λ_j/λ_1)^k → 0, assuming |λ_1| > |λ_2|, meaning only the component corresponding to x_1 remains. As long as |λ_1| > |λ_2|, λ_1 is called the dominant, or fundamental, eigenvalue. A slightly expanded analysis in Golub and Van Loan [2] goes on to show that the convergence of the power method is formally given by

    |λ_1 − ‖z^(k)‖| = O( |λ_2/λ_1|^k ).    (1.2)

The ratio |λ_2/λ_1| is known as the dominance ratio [3], d, and determines the rate of convergence of the power method. In cases where there is little separation between the first two eigenvalues, |λ_2/λ_1| ≈ 1 and thus convergence is very slow.
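For concreteness, a minimal Python sketch of Algorithm 1.1 follows. The function name and stopping test are illustrative choices (they are not part of this dissertation's software), and the magnitude-based eigenvalue estimate assumes a positive dominant eigenvalue, as holds for the k-eigenvalue problems of interest here.

import numpy as np

def power_iteration(A, tol=1e-10, max_iter=1000):
    """Sketch of Algorithm 1.1: normalized power iteration.

    Returns an approximation to the dominant eigenpair (lambda_1, x_1).
    """
    v = np.random.rand(A.shape[0])            # arbitrary nonzero v^(0)
    lam_old = 0.0
    for _ in range(max_iter):
        z = A @ v                             # z^(k) = A v^(k-1)
        lam = np.linalg.norm(z, np.inf)       # eigenvalue estimate (L-infinity norm)
        v = z / lam                           # v^(k) = z^(k) / ||z^(k)||
        if abs(lam - lam_old) < tol:          # convergence governed by |lambda_2/lambda_1|
            break
        lam_old = lam
    return lam, v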

Both [1] and [2] list a number of reasons why the power method may fail, though none of these reasons is particularly alarming for the type of applications of interest in this work. It is necessary that v^(0) have a component in the direction of x_1, else a_1 will equal zero. In practice, even if v^(0) has no such component, floating-point errors incurred during the process will introduce the necessary component. In the case where λ_1 is not a simple eigenvalue, i.e., it has algebraic multiplicity greater than one, meaning that it permits more than one eigenvector, it is possible for the power method to converge to an eigenvector that is a linear combination of the corresponding eigenvectors. Finally, it is noted that given a real matrix and a real v^(0), the iterations cannot converge to a complex vector; however, in determining the criticality of nuclear systems via the k-eigenvalue formulation we are only concerned with real quantities.

Often the eigenvalue which is smallest in magnitude is sought, in which case the inverse of power iteration, called inverse iteration, is used. This method takes advantage of the fact that the eigenvalues of A^-1 are the reciprocals of the eigenvalues of A. Replacing A with A^-1 in Algorithm 1.1 defines the inverse iteration process. Of course, the inverse of A does not need to be directly computed: the linear system can be solved using a decomposition method or an iterative procedure. In [4] it is explained how inverse iteration can be used to rapidly find any eigenpair given a good initial guess, using a process known as shifting. To find the eigenvalue closest to some scalar, σ, A^-1 is replaced with (A − σI)^-1, where I is the identity matrix of order n; the iteration then converges to 1/(λ − σ), with λ being the eigenvalue of A closest to σ. The algorithm for the shifted-inverse power iteration (SIP) is given by Algorithm 1.2.

Algorithm 1.2 Shifted-Inverse Power Method, Adapted from [1]
  v^(0) = arbitrary nonzero vector
  for k = 1, 2, ... do
    z^(k) = (A − σI)^-1 v^(k-1)    {inverse not explicitly computed}
    v^(k) = z^(k) / ‖z^(k)‖
  end for

The SIP method offers the potential for very fast convergence because the convergence rate is now determined by

    d = |λ_a − σ| / |λ_b − σ|.

If the chosen shift, σ, is much closer to λ_a (the nearest eigenvalue to σ) than to λ_b (the next nearest), then the iterative procedure will converge quickly.

A variation of the SIP method is the Rayleigh quotient iteration. The quantity

    λ = (v^T A v) / (v^T v)

is known as the Rayleigh quotient, and it provides a good estimate of the eigenvalue, λ, if a good estimate of the corresponding eigenvector, v, is known. Using the Rayleigh quotient as the shift in the SIP method results in Algorithm 1.3.

Algorithm 1.3 Rayleigh Quotient Iteration [1]
  v^(0) = arbitrary nonzero vector
  for k = 1, 2, ... do
    σ^(k) = ( v^(k-1)T A v^(k-1) ) / ( v^(k-1)T v^(k-1) )
    z^(k) = (A − σ^(k) I)^-1 v^(k-1)    {inverse not explicitly computed}
    v^(k) = z^(k) / ‖z^(k)‖
  end for

The Rayleigh quotient iteration has the advantage of converging extremely quickly: at least quadratically and, for the class of normal matrices, cubically [1]. However, due to the large number of operations required per iteration, the method is rarely used in practice.

The power method, inverse iteration, and Rayleigh quotient iteration are all eigenproblem-solving algorithms which operate on a single vector, v, and are capable of producing a single eigenpair. It has been shown how shifting can be used to find the eigenvalue closest to some scalar σ; however, in some cases λ_2 may be desired when a good estimate is not known for use with the SIP method. Using a process called deflation, a known eigenvalue can be effectively removed; this process is not directly relevant to this work and will not be explained in any detail. For a detailed discussion on deflation and an explanation of the deflation techniques available, see [4].
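The single-vector machinery of Algorithms 1.2 and 1.3 can be sketched compactly in Python; the function name and convergence test below are illustrative assumptions, and the shift-and-solve step uses a direct solve in place of forming any inverse, exactly as the algorithms prescribe.

import numpy as np

def shifted_inverse_iteration(A, sigma, rayleigh=False, tol=1e-10, max_iter=100):
    """Sketch of Algorithms 1.2/1.3: shifted-inverse power iteration.

    With rayleigh=True the shift is refreshed from the Rayleigh quotient
    each pass, giving Rayleigh quotient iteration.
    """
    n = A.shape[0]
    I = np.eye(n)
    v = np.random.rand(n)
    v /= np.linalg.norm(v)
    for _ in range(max_iter):
        if rayleigh:
            sigma = v @ A @ v / (v @ v)       # Rayleigh quotient shift
        z = np.linalg.solve(A - sigma * I, v)  # solve, never invert
        v_new = z / np.linalg.norm(z)
        if np.linalg.norm(v_new - np.sign(v_new @ v) * v) < tol:
            v = v_new
            break
        v = v_new
    lam = v @ A @ v / (v @ v)                 # eigenvalue estimate of A
    return lam, v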

1.2.2 Subspace Iteration

The methods described in the previous section are basic, yet effective. However, it is often the case that multiple eigenpairs are sought, in which case these methods can be cumbersome, especially as the number of eigenpairs sought increases. The methods described in this section are capable of simultaneously solving for multiple eigenpairs and, although more sophisticated, they are related to the single-vector methods.

Of all the methods that produce several eigenpairs, the most straightforward method, and variations thereof, is known by several names: simultaneous iteration [1], orthogonal iteration [2], and subspace iteration [4]. Throughout this document such methods will be referred to as subspace iteration because this is the name with which they are associated in the nuclear engineering community. The method is described by Saad [4] as a block generalization of the power method introduced in Algorithm 1.1. Rather than initializing power iterations with the single vector v^(0), an n × p matrix, V^(0), is used, with 1 ≤ p ≤ n, and with the columns of V^(0) being a set of linearly independent, but otherwise arbitrary, nonzero vectors. Suppose S_0 is the subspace spanned by the columns of V^(0) and S is the subspace spanned by the eigenvectors x_1, ..., x_p. Repeatedly multiplying V^(0) by the matrix A, it follows that for k > 0, the columns of V^(k) = A^k V^(0) form a basis for the subspace S_k = A^k S_0, with dimension p. It can then be shown that as long as |λ_p| > |λ_p+1|, the subspace S_k converges to S as k increases, giving rise to the method's name, subspace iteration [1]. The downfall of this approach is that as k increases, the columns of V^(k) become an increasingly ill-conditioned basis for S_k. In fact, if |λ_1| > |λ_j|, j > 1, then by the same argument presented in Section 1.2.1 each column of V converges to x_1 and, if p = 1, this method simply reduces to the power method. The difficulty is addressed by orthonormalizing the columns of V^(k) periodically; the orthonormalization process is also known as a reduced QR factorization. This factorization decomposes an n × m matrix A into an n × m matrix Q with orthonormal columns and a square m × m matrix R which is upper triangular. The reduced QR factorization can be found in several ways, including Householder transformations, Givens rotations, or Gram-Schmidt orthogonalization [2]. The differences between these techniques are unimportant here; it is only necessary to know that the reduced QR factorization can be calculated when necessary. The algorithm for the iteration process is provided as Algorithm 1.4.

Algorithm 1.4 Subspace Iteration [2]
  Q^(0) = n × p matrix with orthonormal columns
  for k = 1, 2, ... do
    V^(k) = A Q^(k-1)
    Q^(k) R^(k) = V^(k)    {QR factorization of V^(k); can be computed every i iterations}
  end for

The result will be a p × p matrix, R, with the p largest eigenvalues of A on the diagonal, and an orthonormal n × p matrix Q, whose columns span the subspace S; thus the invariant subspace spanned by the first p eigenvectors is the same as span(Q^(k)) = span(V^(k)) upon convergence. The eigenvectors are then given by the columns of Q, Q = [q_1, ..., q_p], which are also known as Schur vectors. Further discussion of the significance of Schur vectors in eigenvalue calculations can be found in the referenced texts. Both Golub and Van Loan [2] and Saad [4] supplement their presentation of subspace iteration with more rigorous analysis regarding the convergence conditions and rate. The presentation in Heath [1] is generally the simplest; as an introductory text it is less rigorous. Saad's focus is on practical application of the algorithms, and Golub and Van Loan provide a concise, yet mathematically complete, description along with a comprehensive set of references for each subject. These general descriptions apply not only to the presentation of subspace iteration but to the entirety of the three cited references above.

It is easy to see that subspace iteration could be used to find all the eigenvalues and eigenvectors of A if p = n. In this case it then makes sense to use as an initial guess Q^(0) = I. With p = n, and the identity matrix as the initial guess, the subspace iteration algorithm can be simplified using recurrence relations, resulting in a compact, elegant, and very powerful computational method given by Algorithm 1.5, known as QR iteration.

Algorithm 1.5 QR Iteration [1]
  A^(0) = A
  for k = 1, 2, ... do
    Q^(k) R^(k) = A^(k-1)    {QR factorization}
    A^(k) = R^(k) Q^(k)
  end for

Upon convergence the eigenvalues of the original A will be the diagonal entries of A^(k), and the eigenvectors can be recovered from the Schur vectors, which are the columns of Q̂^(k), where Q̂^(k) = Q^(1) Q^(2) ... Q^(k).
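A compact Python sketch of Algorithm 1.5 follows; it is the textbook unshifted iteration, without the shifts and deflation a practical implementation would employ, and the fixed iteration count is an arbitrary illustrative choice.

import numpy as np

def qr_iteration(A, iters=500):
    """Sketch of Algorithm 1.5: unshifted QR iteration.

    Returns the final iterate A_k (eigenvalues on its diagonal upon
    convergence) and the accumulated Schur vectors Q_hat.
    """
    Ak = A.astype(float).copy()
    Q_hat = np.eye(A.shape[0])
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)   # Q^(k) R^(k) = A^(k-1)
        Ak = R @ Q                # A^(k) = R^(k) Q^(k)
        Q_hat = Q_hat @ Q         # Q_hat = Q^(1) Q^(2) ... Q^(k)
    return Ak, Q_hat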

In fact, the QR iteration converges to the Schur form of A, so if A is Hermitian the eigenvectors will be equal to the Schur vectors and make up the columns of Q̂^(k). Otherwise, the columns of Q̂^(k) only form a basis for the subspace S spanned by the eigenvectors of A. These eigenvectors can be recovered by finding the eigenvectors, Y, of the upper-triangular, or upper block-triangular, matrix A^(k), such that X = Q̂^(k) Y.

Though the basic QR algorithm is indeed simple and elegant, it is far from practical. Heath [1] provides some discussion of improved QR algorithms, while Golub and Van Loan [2] offer a more substantive treatment of the topic. A few important disadvantages of the method are noted, basically having to do with the number of operations required per iteration and the memory requirement of the method. Another distinct disadvantage is the all-or-nothing approach: using QR iteration, the entire eigenvalue problem must be solved. A practical alternative to the QR algorithm, if only a few eigenvalues are sought, is the family of Krylov subspace methods.

1.2.3 Krylov Subspace Methods

Projection Methods

Krylov methods belong to a family of solution techniques called projection methods. Before specifically discussing the use of Krylov methods for the eigenvalue problem, some of the most basic concepts of projection techniques [4] will be presented. In projection techniques, an approximation x̃ to the exact eigenvector x is sought in some subspace K while enforcing the condition that the residual vector be orthogonal to some subspace L. The subspace K is referred to as the right subspace and L is referred to as the left subspace. The choice of L determines the type of projection method: in an orthogonal method L is the same as K, while in an oblique projection method L is not the same as K.

An orthogonal projection of the eigenvalue problem, given by Eq. (1.1), onto the m-dimensional subspace K seeks an approximate eigenvector x̃ ∈ K that satisfies

    (A x̃ − λ̃ x̃, v) = 0,  for all v ∈ K,    (1.3)

which is known as a Petrov-Galerkin condition. Given that {v_1, v_2, ..., v_m} is an orthonormal basis of K, which makes up the columns of an n × m matrix V, the

problem can be formulated using this basis as x̃ = V y, in which case Eq. (1.3) becomes

    (A V y − λ̃ V y, v_i) = 0,  i = 1, ..., m,

which is equivalent to

    V^H (A V y − λ̃ V y) = V^H A V y − λ̃ y = 0,

where V^H is the conjugate transpose of the matrix V, whose columns are given by the v_i. If the matrix V^H A V is denoted by B_m, this relationship can be written

    B_m y = λ̃ y.    (1.4)

The eigenvalues of B_m then approximate those of A, and the approximate eigenvectors x̃ can be found from the eigenvectors, y, of B_m.

In an oblique projection Eq. (1.3) becomes

    (A x̃ − λ̃ x̃, v) = 0,  for all v ∈ L.    (1.5)

An analogous process to the one described above can be used; however, two orthonormal bases are now needed: V for K and W for L. These bases are assumed to be biorthogonal, i.e., W^H V = I. Using the same problem transformation, x̃ = V y, results in Eq. (1.4), where B_m is defined as B_m = W^H A V. An additional condition must hold for the pair V and W to exist, namely det(W^H V) ≠ 0.
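The orthogonal-projection recipe of Eqs. (1.3)-(1.4) is short enough to state directly in code. The following Python sketch, with illustrative names, forms B_m = V^H A V for a given orthonormal basis V and extracts the approximate eigenpairs.

import numpy as np

def projected_eigenpairs(A, V):
    """Orthogonal projection of A onto span(V): solve B_m y = lambda y
    with B_m = V^H A V (Eq. 1.4), then lift eigenvectors back as x = V y."""
    Bm = V.conj().T @ A @ V        # small m x m projected problem
    vals, Y = np.linalg.eig(Bm)    # dense eigensolve on B_m
    X = V @ Y                      # approximate eigenvectors of A
    return vals, X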

An analysis of this subject and a more comprehensive introduction to projection methods is provided by Saad [4]. As projection methods are also important tools in the solution of linear systems, a similar description is provided in [5], with a slightly different focus.

Arnoldi and Lanczos Methods

Krylov subspace methods are an important tool in the solution of the eigenvalue problem and also play an important role as iterative methods used to solve linear systems of equations, as will be seen in Section 1.3. Formal mathematical descriptions of the Krylov subspace and its properties can be found in Saad [4] and [5]. The Krylov subspace is defined as

    K_m(A, v) ≡ span[ v, Av, A^2 v, ..., A^(m-1) v ],    (1.6)

where A is an n × n matrix, v is a vector of length n, and m is the dimension of the subspace. There are two important things to note from the definition of this subspace: first, it is built incrementally, and second, the basis vectors are simply the sequence of vectors generated during the basic power iteration. Krylov subspace methods are referred to as projection methods because these methods work by projecting onto the subspace K_m. The two most important Krylov methods are Arnoldi's method and the Hermitian Lanczos method.

Arnoldi's method [6] is a method for general nonsymmetric (non-Hermitian) matrices based on an orthogonal projection onto the Krylov subspace. It was originally developed by Arnoldi as a means of reducing a matrix to Hessenberg form. An excellent explanation of the method, both theoretical and practical, can be found in both of Saad's books, although the focus shifts between Arnoldi as an eigenvalue algorithm [4] and as an iterative algorithm for solving linear systems [5]. The Hessenberg reduction consists of finding V such that V^T A V = H, where H is a Hessenberg matrix and V is orthogonal. A Hessenberg matrix is a matrix that is nearly triangular: an upper Hessenberg matrix has all zeros below the first subdiagonal, while a lower Hessenberg matrix has all zeros above the first superdiagonal. The salient feature of the Arnoldi method is that the reduction to Hessenberg form occurs by generating one column of V at a time. Consequently,

H is also constructed incrementally, and it is H which is used to determine the eigenvalues of A. The columns of V are known as Arnoldi vectors, and for an n × m matrix V_m the columns form an orthonormal basis for K_m. Although the set of vectors described by Eq. (1.6) generates a basis for the Krylov subspace, it becomes increasingly ill-conditioned, as discussed in Section 1.2.2, and an orthogonalization process is again necessary. A very basic implementation of Arnoldi's method is given by Algorithm 1.6, where the columns of V are generated via v_1, v_2, ... and the elements of H are given by h_j,m.

Algorithm 1.6 Arnoldi Iteration [5]
  v_1 = vector of length n, ‖v_1‖_2 = 1
  for m = 1, 2, ... do
    w_m = A v_m
    for j = 1, 2, ..., m do
      h_j,m = (w_m, v_j)
      w_m = w_m − h_j,m v_j
    end for
    h_m+1,m = ‖w_m‖_2
    if h_m+1,m = 0 then stop
    v_m+1 = w_m / h_m+1,m
  end for

(A note on notation: although the Arnoldi process is iterative in nature, the Arnoldi vectors, v, are not computed iteratively, nor are the elements h_j,m of H. Each iteration does not produce new estimates of V and H but instead increases their dimension. It is consistent with the notation of this text, then, that the index m is subscripted rather than superscripted.)

Saad [4] emphasizes that an implementation of the method is not equivalent to the method itself, particularly due to complications arising from floating-point arithmetic. Algorithm 1.6 orthogonalizes the Arnoldi vectors using the Modified Gram-Schmidt method, which seeks to avoid introducing the significant floating-point errors that often arise from the classical Gram-Schmidt procedure.
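A minimal Python sketch of Algorithm 1.6 follows, assuming a real matrix for brevity; the function name and array layout are illustrative choices, not the dissertation's implementation.

import numpy as np

def arnoldi(A, v1, m):
    """Sketch of Algorithm 1.6: m Arnoldi steps with Modified Gram-Schmidt.
    Returns V (n x (m+1)) and the extended Hessenberg H ((m+1) x m)."""
    n = len(v1)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for k in range(m):
        w = A @ V[:, k]                     # w_m = A v_m
        for j in range(k + 1):              # Modified Gram-Schmidt sweep
            H[j, k] = w @ V[:, j]
            w = w - H[j, k] * V[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] == 0.0:              # invariant subspace found
            return V[:, :k + 1], H[:k + 1, :k]
        V[:, k + 1] = w / H[k + 1, k]
    return V, H

# Ritz values: eigenvalues of the square part H[:m, :m] approximate
# the extremal eigenvalues of A.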

The approximate eigenvalues of A in this process, λ_i^(m) (i = 1, ..., m), are the eigenvalues of H_m. These eigenvalues themselves need to be calculated using some other method, which generally speaking is the QR iteration. H_m is upper Hessenberg by construction, which greatly reduces the computational cost of the QR iteration; the O(n^3) operations per iteration become only O(n^2) [2], and generally m ≪ n. Thus QR iteration is a very efficient method for finding the eigenvalues of H_m. Approximations to the eigenvectors of A, x_i^(m), can be found using the relation x_i^(m) = V_m y_i^(m), where y_i^(m) is the eigenvector of H_m associated with λ_i^(m). The quantities λ_i^(m) and y_i^(m) are known as the Ritz value and Ritz vector, respectively. In an orthogonal projection method the following relations hold: x_i^(m) ∈ K_m and A x_i^(m) − λ_i^(m) x_i^(m) ⊥ K_m. As m increases, a number of the extremal Ritz eigenvalues will closely approximate the eigenvalues of A, where the number of λ^(m)'s which are good approximations of eigenvalues of A is much less than m. It is then possible to increase m until all of the desired eigenvalues are found.

There are a few obvious reasons why the Arnoldi method as described is nonviable for use as an eigenvalue solution technique. As m increases, the sizes of V_m and H_m both increase, and in each iteration the new v_m must be orthogonalized against all previous Arnoldi vectors. Another disadvantage is that the quality of the approximation of λ_i^(m) to λ_i depends largely on the starting vector, v_1. A variation of the Arnoldi method, known as the Implicitly Restarted Arnoldi Method (IRAM), was developed [7] and addresses these concerns. After a set number of iterations, i, the Arnoldi iteration is restarted with a guess for v_1 which has been carefully constructed from span[v_1, ..., v_i]. The restart vector is specifically chosen so that components from wanted eigenvectors are included while unwanted contributions are excluded. As IRAM solves for a subset of the eigenvalues, the wanted eigenvectors are those which correspond to eigenvalues in this subset, while unwanted eigenvectors correspond to eigenvalues which are not being solved for. A detailed description of IRAM and the manner in which a restart vector is chosen is outside the scope of this review, but the details of the method are widely available and are even briefly discussed in [2].

The Implicitly Restarted Arnoldi Method is a very popular method and represents one of the more powerful methods available today for the solution of the non-Hermitian eigenvalue problem when only a limited number of extremal values are sought. A freely available software implementation of IRAM and its variants, known as ARPACK, is particularly popular.
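For perspective, ARPACK's IRAM is what SciPy exposes as scipy.sparse.linalg.eigs, so a few extremal eigenpairs of a large sparse matrix can be obtained as in the usage sketch below (an illustration only, not part of the dissertation's software; the test matrix is an arbitrary choice).

import scipy.sparse as sp
from scipy.sparse.linalg import eigs

# Random sparse test matrix; eigs calls ARPACK's implicitly restarted Arnoldi.
A = sp.random(2000, 2000, density=1e-3, format='csr') + sp.eye(2000)
vals, vecs = eigs(A, k=4, which='LM')   # four eigenvalues of largest magnitude
print(vals)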

ARPACK is distributed as a set of Fortran 77 subroutines or C++ classes and is also available in parallel. The ARPACK User's Guide [8] is the best source for implementation details of IRAM; not only does it describe the software interface, but it also contains theory sections which overlap with many of the topics discussed up to this point. ARPACK is also capable of solving the Hermitian eigenvalue problem using a variant of the Lanczos method, which is a simplification of the previously described Arnoldi method.

If the matrix A used in the Arnoldi method is Hermitian, then many of the disadvantages of the Arnoldi method vanish. The Hessenberg matrix H_m becomes tridiagonal (denoted T_m) and it is no longer necessary to store all of the previous Arnoldi vectors; instead, a three-term recurrence can be found relating them. This new algorithm, the Lanczos iteration, drastically reduces the storage and computational costs of the Arnoldi method. A simple implementation is provided in Algorithm 1.7.

Algorithm 1.7 Lanczos Iteration [5]
  v_0 = 0, β_0 = 0
  v_1 = vector of length n, ‖v_1‖_2 = 1
  for m = 1, 2, ... do
    w_m = A v_m
    α_m = v_m^H w_m
    w_m = w_m − β_m-1 v_m-1 − α_m v_m
    β_m = ‖w_m‖_2
    if β_m = 0 then stop
    v_m+1 = w_m / β_m
  end for

Now, rather than finding the eigenvalues of an upper Hessenberg matrix, as with the Arnoldi method, the eigenvalues of the Hermitian tridiagonal matrix T_m must be found, where the α_m and β_m are the diagonal and subdiagonal entries. This is a very manageable problem, especially when m ≪ n, and generally the Lanczos method produces high-quality approximations of the eigenvalues of A after relatively few iterations. The largest potential disadvantage of the Lanczos method is the possibility that the Lanczos vectors could lose orthogonality due to rounding errors, in which case they could be reorthogonalized, incurring a substantial computational cost.
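The three-term recurrence makes Algorithm 1.7 especially compact; the Python sketch below assumes a real symmetric A and, as warned above, performs no reorthogonalization (names and structure are illustrative).

import numpy as np

def lanczos_ritz(A, v1, m):
    """Sketch of Algorithm 1.7 for real symmetric A: m Lanczos steps,
    then Ritz values from the tridiagonal T_m."""
    alpha = np.zeros(m)
    beta = np.zeros(m)
    v_prev = np.zeros(len(v1))
    v = v1 / np.linalg.norm(v1)
    b_prev = 0.0
    steps = m
    for k in range(m):
        w = A @ v                                # w_m = A v_m
        alpha[k] = v @ w                         # alpha_m = v_m^T w_m
        w = w - b_prev * v_prev - alpha[k] * v   # three-term recurrence
        beta[k] = np.linalg.norm(w)              # beta_m = ||w_m||_2
        if beta[k] == 0.0:                       # invariant subspace found
            steps = k + 1
            break
        v_prev, v, b_prev = v, w / beta[k], beta[k]
    T = (np.diag(alpha[:steps])
         + np.diag(beta[:steps - 1], 1)
         + np.diag(beta[:steps - 1], -1))
    return np.linalg.eigvalsh(T)                 # Ritz values approximating eig(A)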

Algorithms 1.6 and 1.7 seem to break down if either h_m+1,m = 0 or β_m = 0, respectively, at any step m. This circumstance actually indicates that the algorithm has found an invariant subspace, K_m, and that the Ritz values and Ritz vectors of H_m or T_m are in fact equal to the eigenvalues and eigenvectors of A, respectively. Though the Lanczos method is a simplification of the Arnoldi method for a certain class of problems, it is much more amenable to rigorous analysis: while it is difficult to analyze properties of the Arnoldi method, the Hermitian Lanczos method is much better understood [4]. The Lanczos method has also been extended to the case of non-Hermitian matrices, where it differs from Arnoldi's method mainly in the fact that it is an oblique, not orthogonal, projection technique [2].

1.2.4 Other Methods

Of the methods that have been omitted from the above review, one in particular should be mentioned. This method, along with IRAM, is one of the most recent numerical methods developed to solve the matrix eigenvalue problem, formulated by Sleijpen and Van der Vorst [9] and termed the Jacobi-Davidson method. This method is a synergistic combination of two older methods: Jacobi's method for eigenvalue problems and Davidson's method. Saad [4] describes Davidson's method as a preconditioned Lanczos-type method which is implemented in a manner more akin to Arnoldi's method due to the accumulation of an orthogonal basis.

Only the most basic forms of the algorithms discussed have been presented, and very little of the underlying analysis has been examined. The body of work associated with numerical methods for matrix eigenvalue problems is indeed extensive, and ever growing. Many methods exist which are useful only in limited circumstances, and many techniques are used to transform the algorithms in this section into practical computational tools. The amount of information available on these subjects is overwhelming and the surface has barely been scratched in this review. For a more complete presentation, the text by Heath [1] is a simplified introduction to many of these techniques, while Golub and Van Loan [2] provide a firmer mathematical footing. The text by Saad [4], devoted to the eigenvalue problem, is an indispensable resource and is complemented by the manuscript of Van der Vorst [10].

1.3 Krylov Subspace Methods

It has already been seen that Krylov subspace methods play an important role in the solution of the matrix eigenvalue problem. These methods are also quite effective in the iterative solution of linear systems of equations, a problem almost universally encountered in solving differential equations on digital computers via discretized formulations. Given an n × n nonsingular matrix A and a vector b of length n, the problem

    Ax = b    (1.7)

is concerned with finding the unique vector x of length n which satisfies this relationship. At the most fundamental level, the methods used to solve this problem can be divided into two types: direct and iterative methods. Direct methods include Gaussian elimination and factorizations such as LU decomposition [1]. A direct method is any method that will produce the exact result in a finite number of steps, assuming exact arithmetic. Iterative methods, on the other hand, may only approach the exact solution after an infinite number of iterations. In an iterative method some approximation to the solution is made and then continually improved through the iterative process until some measure of the error falls below a chosen tolerance deemed sufficient for that error to not adversely affect the usefulness of the computed solution. The storage and computational costs of direct methods become intractable as the size of the matrix A increases, and in these cases iterative methods may be the only practical option.

The use of iterative methods to solve large problems is a well-developed area of mathematics. The basics are covered in any numerical analysis text [1, 2, 11] and rather detailed analysis can be found in specialized texts. Varga [12] and Hageman and Young [3] are two classical examples of these types of works, published before many of today's techniques, such as Krylov methods and multigrid, became popular or were even conceived. More recent texts are those by Saad [5] and Van der Vorst [13]. Saad's focus is iterative methods for sparse linear systems, where sparsity is a qualitative measure of the number of zero elements in a matrix: the more zero entries, the sparser the matrix. Van der Vorst's text is more limited in scope, only presenting various Krylov methods. Saad's book especially concerns itself with implementations of the algorithms described and it is invaluable in this regard. In

this review of iterative methods, however, only basic concepts and algorithms will be considered, with more advanced topics discussed in the main text should they arise.

1.3.1 Stationary Methods

Iterative methods themselves can be split into two categories: stationary and projection methods. Stationary methods include the classical iterative methods: the Jacobi method, the Gauss-Seidel method, and the Successive Over-Relaxation (SOR) method. These methods are treated by most scientific computing textbooks, not just those specifically on iterative methods. Heath [1] and Golub and Van Loan [2] both provide good descriptions of these simple stationary methods, termed the standard iterations in [2]. The following descriptions most closely resemble the derivations given by Heath.

In general form, stationary methods can be written

    x^(k+1) = G x^(k) + c,

where G and c are constant throughout the iterations. The matrix G and vector c are chosen such that x = Gx + c is a solution to Ax = b. The matrix G is obtained via a splitting of the matrix A into multiple components, A = M − N, where M must be nonsingular; the iteration scheme then becomes

    x^(k+1) = M^-1 N x^(k) + M^-1 b.

Different stationary iterative schemes are determined by the particular splitting of A. The convergence properties of the scheme are also a direct result of the choice of splitting: it is the spectral radius of M^-1 N which drives the convergence, hence choosing a splitting which minimizes this spectral radius is desirable. In practice, however, there is a tradeoff between the number of iterations and the computational cost per iteration.

The Point Jacobi iterative scheme is the most basic and simplest to implement

of any iterative scheme. It is an inherently parallel process as well, since the computation of the current iterate depends only on values found during the previous iteration. The splitting of A in the Point Jacobi scheme is

    M = D,  N = −(L + U),

where D is a diagonal matrix with entries mirroring the diagonal of A, and L and U are the strictly lower and upper triangular components of A, respectively. The Point Jacobi iteration is then written as

    x^(k+1) = D^-1 ( b − (L + U) x^(k) ).

In component form the Point Jacobi method is given by

    x_i^(k+1) = ( b_i − Σ_{j≠i} a_ij x_j^(k) ) / a_ii,  i = 1, ..., n.

The next logical step in the iterative process is to use the most current information available during the calculation of the right-hand side, i.e., x_j^(k+1) for j < i, which is the basis of the Gauss-Seidel method. This iterative scheme is slightly more complicated than the Jacobi method, but has superior convergence properties. The splitting of A in the Gauss-Seidel scheme is

    M = D + L,  N = −U,

where again D is a diagonal matrix with entries mirroring the diagonal of A, and L and U are the strictly lower and upper triangular components of A. The Gauss-Seidel method is then written in matrix form as

    x^(k+1) = (D + L)^-1 ( b − U x^(k) ).

In component form the Gauss-Seidel method is given by

    x_i^(k+1) = ( b_i − Σ_{j<i} a_ij x_j^(k+1) − Σ_{j>i} a_ij x_j^(k) ) / a_ii,  i = 1, ..., n.
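The two component-form updates translate directly into Python; the following sketch (illustrative helper names, dense arrays assumed) performs one sweep of each method.

import numpy as np

def jacobi_step(A, b, x):
    """One Point Jacobi sweep: x_i <- (b_i - sum_{j != i} a_ij x_j) / a_ii."""
    D = np.diag(A)
    return (b - A @ x + D * x) / D        # A@x - D*x reproduces (L+U)x

def gauss_seidel_step(A, b, x):
    """One Gauss-Seidel sweep, consuming updated entries as they appear."""
    x = x.copy()
    for i in range(len(b)):
        s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
        x[i] = (b[i] - s) / A[i, i]
    return x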

Aside from exhibiting superior convergence properties, the Gauss-Seidel method also has an advantage from the standpoint of required memory: each iterate can overwrite the previous iterate, thus eliminating the need to store separate arrays for both the new and previous iterates.

The SOR method is not derived from a different splitting of A, but rather is a method used to accelerate the convergence of a stationary iterative method, usually the Gauss-Seidel method. The successive relaxation class of methods uses the Gauss-Seidel iteration to determine the next search direction but employs a parameter ω to scale the step size, such that for Gauss-Seidel-based SOR, ω = 1 is simply Gauss-Seidel iteration. Given the previous iterate x^(k), the new iterate for successive relaxation methods is determined by

    x^(k+1) = x^(k) + ω ( x_GS^(k+1) − x^(k) ),

where x_GS^(k+1) is the just-computed Gauss-Seidel iterate. The relaxation parameter, ω, generally ranges between 0 and 2, with values below 1 giving the under-relaxation class of methods, while values between 1 and 2 result in over-relaxation, i.e., SOR methods. The question of which ω to use is not easily answered, and theoretical expressions exist only for certain classes of matrices. In practice ω can be optimized via numerical studies, or an optimum ω value can be approximated using theory developed for similar classes of problems. With an optimum ω value, SOR displays convergence behavior superior to that of Gauss-Seidel.
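The relaxation step above is a one-line extension of the Gauss-Seidel sweep sketched earlier (gauss_seidel_step is the illustrative helper defined there); this mirrors the vector-level update formula as written in the text.

def sor_step(A, b, x, omega=1.5):
    """One successive-relaxation sweep: relax the Gauss-Seidel update by omega.
    omega = 1 recovers Gauss-Seidel; 1 < omega < 2 gives over-relaxation (SOR)."""
    x_gs = gauss_seidel_step(A, b, x)
    return x + omega * (x_gs - x)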

1.3.2 Projection Methods

Projection iterative methods utilize the same principles as those used to solve eigenvalue problems. In fact, many of the popular Krylov subspace (projection) methods are direct results of the Arnoldi and Lanczos (Hermitian and non-Hermitian) algorithms discussed in Section 1.2.3. The definitions of orthogonal and oblique projections previously stated are still applicable, and the subspaces K and L still refer to the right and left subspaces, respectively. Now Eq. (1.1) is replaced with Eq. (1.7), where x̃ is sought such that x̃ ∈ x_0 + K and b − A x̃ ⊥ L, where x_0 is some initial guess to the linear system and x_0 + K is a vector space. This can be written more conveniently as x̃ = x_0 + δ by defining the initial residual as r_0 = b − A x_0. The problem is then given by

    x̃ = x_0 + δ,  δ ∈ K,    (1.8a)
    (r_0 − Aδ, w) = 0,  for all w ∈ L.    (1.8b)

It is also helpful to look at this process in matrix form, where V and W have the same meanings as in Section 1.2.3. The approximate value is extracted from the projected space as before by

    x̃ = x_0 + V_m y.    (1.9)

Using the new Petrov-Galerkin condition, this implies that

    W^T A V y = W^T r_0,    (1.10)

such that the approximate solution is given by

    x̃ = x_0 + V ( W^T A V )^-1 W^T r_0,    (1.11)

assuming W^T A V is nonsingular, which is true if A is positive definite and L = K, or if A is nonsingular and L = AK [5]. It can also be shown that if A is symmetric positive definite and L = K, then the approximate solution, x̃, is the vector which minimizes the matrix norm (A(x − x̃), x − x̃)^1/2, where x is the true solution and x̃ ∈ x_0 + K. For the oblique projection where L = AK, the approximate solution is the vector which minimizes the L_2 norm of the residual, ‖b − A x̃‖_2, with x̃ ∈ x_0 + K. These two properties, proven in [5], provide insight into some of the projection methods which prove to be useful in this work. The specific class of projection methods most relevant to the field of iterative methods is based on the Krylov subspace,

    K_m = span[ r_0, A r_0, A^2 r_0, ..., A^(m-1) r_0 ].

As projection methods these can be classified by the type of projection, orthogonal or oblique, but according to Saad a more basic taxonomy is whether the method is based on Arnoldi orthogonalization or Lanczos biorthogonalization.

Arnoldi-Based Methods

Using the Arnoldi process to find an orthonormal basis for the m-dimensional subspace, K_m, has already been introduced and is described in Algorithm 1.6. Arnoldi's algorithm actually leads directly to the full orthogonalization method (FOM), a technique which can be used to solve Ax = b. Knowing that at step m in the Arnoldi process V_m^T A V_m = H_m, Eq. (1.10) becomes

    H_m y_m = V_m^T r_0,

where V = W in this case, since Arnoldi is an orthogonal projection where the left and right subspaces are equal. If v_1 is chosen to be r_0 / ‖r_0‖_2, this can be simplified further using the orthonormality of the basis, such that

    H_m y_m = ‖r_0‖_2 e_1,

where e_1 is the first column of the m × m identity matrix. Thus, if the indicated value for v_1 is chosen, Arnoldi's method can be used to find an approximate solution, x̃, to Ax = b after m steps, where m is usually chosen by monitoring the norm of the residual. The approximation x^(m) does not need to be calculated at each step, nor does the residual, but generally some objective stopping criterion is desired. Interestingly, if m = n then the solution (in exact arithmetic) is exact [11]. This is in fact true of all Krylov subspace methods; however, it is prohibitively expensive to run this many iterations, and the strength of Krylov methods lies in the fact that a good approximation of the solution is often available for m ≪ n.

The Arnoldi method also lies at the heart of one of the best-known Krylov subspace methods, the Generalized Minimal Residual method (GMRES), which solves a linear system for a nonsymmetric A. GMRES is an oblique projection method using bases K = K_m and L = AK_m. To derive GMRES, note that any vector x ∈ x_0 + K_m can be written using Eq. (1.9). The system residual

The system residual for x̃ is then given as a function, G(y), where

G(y) = ‖b − A x̃‖_2 = ‖b − A(x_0 + V_m y)‖_2 = ‖r_0 − A V_m y‖_2.

If the (m+1) × m matrix H̄_m and V_m are defined by the Arnoldi algorithm, then it is true that A V_m = V_{m+1} H̄_m. Substituting this relation into G(y) and expanding r_0,

G(y) = ‖ ‖r_0‖_2 v_1 − V_{m+1} H̄_m y ‖_2 = ‖ V_{m+1} ( ‖r_0‖_2 e_1 − H̄_m y ) ‖_2 = ‖ ‖r_0‖_2 e_1 − H̄_m y ‖_2.

The orthonormality of the columns of V_{m+1} has been exploited, and the assumption has been made that v_1 = r_0/‖r_0‖_2. It has already been stated that for the particular pair of left and right subspaces used in GMRES the approximate solution is the one which minimizes the L_2 norm of the residual, ‖b − A x̃‖_2, with x̃ ∈ x_0 + K, here given by G(y). So the y which minimizes G(y) defines the GMRES iterate x^{(m)}; in other words,

x^{(m)} = x_0 + V_m y_m,   (1.12a)
y_m = arg min_y ‖ ‖r_0‖_2 e_1 − H̄_m y ‖_2,   (1.12b)

where V_m and H̄_m are determined by the m-th step of the Arnoldi process. In algorithmic form GMRES looks very similar to the previously defined Arnoldi method, as indicated in Algorithm 1.8. Again the value of k is chosen based on the residual in some way; often a tolerance ε is set such that the iterations stop when ‖b − A x^{(m)}‖_2 / ‖r_0‖_2 < ε. This quantity can be computed during the iteration in an inexpensive manner, as explained by Saad [5]. Finding the minimizer y_k is equivalent to solving a least-squares problem with an upper Hessenberg matrix, which can be done very efficiently using a procedure known as Givens rotations [2]. To reduce the amount of computation, neither the minimizer nor the solution x^{(m)} needs to be found until the residual indicates sufficient convergence.

Algorithm 1.8 GMRES [5]
r_0 = b − A x_0; v_1 = r_0/‖r_0‖_2
H̄_k = {h_{jm}}, 1 ≤ j ≤ k+1, 1 ≤ m ≤ k, a (k+1) × k matrix initialized with zeros
for m = 1, 2, ..., k do
  w_m = A v_m
  for j = 1, 2, ..., m do
    h_{jm} = (w_m, v_j)
    w_m = w_m − h_{jm} v_j
  end for
  h_{m+1,m} = ‖w_m‖_2
  if h_{m+1,m} = 0 then compute y_m and x^{(m)} and stop end if
  v_{m+1} = w_m / h_{m+1,m}
end for
Find the y_k that minimizes ‖ ‖r_0‖_2 e_1 − H̄_k y ‖_2 and obtain x^{(k)} via x^{(k)} = x_0 + V_k y_k

GMRES, like the Arnoldi method, retains the Arnoldi vectors and orthogonalizes each new Arnoldi vector against all those previously calculated. As m increases, the storage cost for the vectors and the computational cost per iteration become too much of a burden; using restarts, as described for the Arnoldi method, results in a much more practical GMRES implementation. It is important to note that the matrix A appears in the algorithm only as the multiplier of a vector, meaning GMRES does not require any knowledge of A itself other than the result of its action on (i.e., product with) a vector. This is especially useful for sparse matrices, where the number of operations required for the matrix-vector multiplication is small. Only needing matrix-vector products is essential to the success of the Jacobian-Free Newton-Krylov method, which will be discussed shortly. This trait is shared among all Krylov subspace methods, though some may require the action of A^T as well.

Significant improvements can be made to Arnoldi based Krylov methods when the matrix A is Hermitian, in which case the orthogonalization process is based on Algorithm 1.7, the Hermitian Lanczos method. The Lanczos method for linear systems can be derived from this algorithm the same way that FOM was derived from the Arnoldi orthogonalization process. In the Hermitian case the upper Hessenberg matrix H_m reduces to a tridiagonal matrix T_m, and a short recurrence can be exploited to reduce the storage requirements. The solution of the tridiagonal linear system using Gaussian elimination with no pivoting can be directly incorporated into this procedure, resulting in the Direct Lanczos algorithm for linear systems.
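To make the structure of Algorithm 1.8 concrete, the following sketch is a minimal, unrestarted GMRES in Python; it solves the small least-squares problem with a dense solver rather than Givens rotations, and the diagonally dominant test matrix is an arbitrary example.

import numpy as np

def gmres_basic(A, b, x0, k, tol=1e-10):
    # Minimal unrestarted GMRES following Algorithm 1.8.
    n = len(b)
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))          # (k+1) x k upper Hessenberg matrix
    V[:, 0] = r0 / beta
    for m in range(k):
        w = A @ V[:, m]               # only the action of A is needed
        for j in range(m + 1):        # modified Gram-Schmidt orthogonalization
            H[j, m] = w @ V[:, j]
            w -= H[j, m] * V[:, j]
        H[m + 1, m] = np.linalg.norm(w)
        if H[m + 1, m] < tol:         # happy breakdown: exact solution in K_m
            break
        V[:, m + 1] = w / H[m + 1, m]
    m = m + 1
    e1 = np.zeros(m + 1); e1[0] = beta
    # y minimizes ||beta*e1 - H_bar y||_2 (dense least squares in lieu of Givens)
    y, *_ = np.linalg.lstsq(H[:m + 1, :m], e1, rcond=None)
    return x0 + V[:, :m] @ y

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)) + 10 * np.eye(50)
b = rng.standard_normal(50)
x = gmres_basic(A, b, np.zeros(50), k=40)
print(np.linalg.norm(b - A @ x))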

The residual vectors in this algorithm are orthogonal to each other, as in the case of FOM, and the basis vectors of K_m defined by the Lanczos process are conjugate with respect to A; two vectors w and y are conjugate if (Ay, w) = 0, or equivalently w^H A y = 0. If these properties are built into the direct Lanczos algorithm, the result is the conjugate gradient method, an extremely efficient and well known iterative method for symmetric positive definite systems. The method is derived in this manner in [5], but there are many different paths one can take to arrive at this result. Golub and Van Loan [2] provide a thorough description, spending nearly an entire chapter on the derivation of the conjugate gradient method and its properties. The conjugate gradient method is given by Algorithm 1.9.

Algorithm 1.9 Conjugate Gradient [1]
r_0 = b − A x_0; p_0 = r_0
for m = 0, 1, 2, ... do
  α_m = (r_m, r_m)/(A p_m, p_m)
  x_{m+1} = x_m + α_m p_m
  r_{m+1} = r_m − α_m A p_m
  β_m = (r_{m+1}, r_{m+1})/(r_m, r_m)
  p_{m+1} = r_{m+1} + β_m p_m
end for

The conjugate gradient method can also be used to estimate the two extremal (maximum and minimum modulus) eigenvalues of A via the extremal eigenvalues of the tridiagonal matrix T_m, though solving linear systems is the most popular use of the algorithm. Just as the conjugate gradient method is a simplified version of FOM for Hermitian matrices, the GMRES algorithm can also be simplified, yielding the Conjugate Residual method, where, as implied, the residuals are now conjugate rather than orthogonal and the basis vectors satisfy (A p_i, A p_j) = 0. This type of basis, known as A^T A-orthogonal, can be generalized to non-Hermitian problems, resulting in a few methods that are mathematically equivalent to GMRES: the Generalized Conjugate Residual method, ORTHOMIN, and ORTHODIR. This is not a comprehensive review of Arnoldi based Krylov subspace methods but does cover the more important aspects. These methods can also be used in block form, and restarts are often necessary in practice. For more rigorous and thorough discussions of these methods, Saad [5] and Van der Vorst [13] are excellent resources.
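A direct transcription of Algorithm 1.9 into Python follows; the symmetric positive definite test matrix is an arbitrary example, and the loop terminates on a residual tolerance rather than a fixed iteration count.

import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=1000):
    # Conjugate gradient per Algorithm 1.9, for symmetric positive definite A.
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    for _ in range(max_iter):
        Ap = A @ p
        alpha = (r @ r) / (Ap @ p)
        x = x + alpha * p            # x_{m+1} = x_m + alpha_m p_m
        r_new = r - alpha * Ap       # r_{m+1} = r_m - alpha_m A p_m
        if np.linalg.norm(r_new) < tol:
            return x
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p         # p_{m+1} = r_{m+1} + beta_m p_m
        r = r_new
    return x

n = 100
A = (np.diag(2.0 * np.ones(n)) + np.diag(-np.ones(n - 1), 1)
     + np.diag(-np.ones(n - 1), -1))
b = np.ones(n)
x = conjugate_gradient(A, b, np.zeros(n))
print(np.linalg.norm(b - A @ x))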

Lanczos Based Methods

The Lanczos branch of Krylov subspace methods relies on creating mutually orthogonal bases for two Krylov subspaces, as opposed to the single basis created by the Arnoldi algorithm. This results in an algorithm known as the Lanczos biorthogonalization procedure, which is used in a manner analogous to the previously mentioned Arnoldi and Hermitian Lanczos approaches but produces a pair of bases. From this orthogonalization procedure arises a method to solve linear systems, much like the FOM and Lanczos methods for linear systems were direct extensions of the Arnoldi and Hermitian Lanczos orthogonalization procedures. Many of the methods based on Lanczos biorthogonalization can be arrived at in a manner similar to their Arnoldi based counterparts. For instance, the well-known Biconjugate Gradient (BiCG) algorithm can be derived by applying to the Lanczos biorthogonalization procedures similar to those applied to the Arnoldi process. The Quasi-Minimal Residual method (QMR) is the Lanczos analog of GMRES; it seeks to minimize an equivalent quantity in order to obtain an approximate solution. Both BiCG and QMR require operations with the transpose of A, making them unsuitable for applications where the matrix-vector product is treated as a function call (i.e., the matrix is never constructed or stored). For this reason, variations of some of the Lanczos methods have been derived which do not require the action of the matrix transpose, the two best known being Conjugate Gradient Squared (CGS) and Biconjugate Gradient Stabilized (BICGSTAB), although a transpose-free QMR method has also been developed. CGS can be derived directly from BiCG, and BICGSTAB is a variation of CGS which seeks to avoid some of its shortcomings. Both CGS and BICGSTAB are important methods for the solution of general linear systems.

Another useful resource regarding Krylov iterative methods is the Templates book [14], which cleanly presents base algorithms for nearly all of the iterative methods in use today. The theory section on Krylov methods is valuable, and a list is provided which details the essential properties of and differences between the major Krylov subspace methods (CG, GMRES, BiCG, QMR, CGS, BICGSTAB).

Preconditioning Linear Problems

Preconditioning of linear systems is mentioned extensively throughout this work and refers to the process of employing one or more transformations intended to make a system more amenable to efficient solution via an iterative method. Ideally, the preconditioned system has better spectral properties (eigenvalue distribution, field of values, condition of the eigensystem) and thus an increased rate of convergence. More discussion of preconditioning can be found in [5, 13]. In the instance of Krylov methods, preconditioning is essential to the overall performance of the method. Systems can be preconditioned on the left, on the right, or both. Left preconditioning is accomplished by

P^{-1} A x = P^{-1} b,   (1.13)

and right preconditioning by A P^{-1} P x = b, which yields a two-step solution process

A P^{-1} y = b,  x = P^{-1} y.   (1.14)

Using both left and right preconditioning (sometimes called split preconditioning), the preconditioned system is given by a similar process such that

P_l^{-1} A P_r^{-1} y = P_l^{-1} b,  x = P_r^{-1} y,   (1.15)

where P_l and P_r, like P, are square matrices with the same dimensions as A. The main idea is to select a P that is easy to form and invert. For the same P, the different ways to implement preconditioning will result in differing convergence behavior; thus the choice between left, right, and split preconditioning is not trivial. With regard to GMRES and its variants, left preconditioning will minimize ‖P^{-1}(b − A x_k)‖ while right preconditioning minimizes ‖b − A x_k‖. Right preconditioning is preferred in many applications because it is applied only to the operator and not the right-hand side. Since left preconditioning does alter the right-hand side, the norm used to measure convergence is altered, and this must be taken into account when using this type of preconditioning.
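As a small illustration of Eqs. (1.13) and (1.14), the sketch below wraps an arbitrary matrix with a Jacobi (diagonal) preconditioner, a deliberately simple choice, and shows the two-step recovery of x from the right-preconditioned solution y; dense solves stand in for the Krylov iteration.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6)) + 6 * np.eye(6)
b = rng.standard_normal(6)
P_inv = np.diag(1.0 / np.diag(A))   # Jacobi preconditioner: easy to form and invert

# Left preconditioning, Eq. (1.13): solve P^{-1} A x = P^{-1} b
x_left = np.linalg.solve(P_inv @ A, P_inv @ b)

# Right preconditioning, Eq. (1.14): solve A P^{-1} y = b, then x = P^{-1} y
y = np.linalg.solve(A @ P_inv, b)
x_right = P_inv @ y

print(np.linalg.norm(x_left - x_right))   # both recover the same solution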

Discussions of preconditioning methods often accompany discussions of iterative Krylov methods. The texts by Saad [5] and Van der Vorst [13] explicitly discuss preconditioned formulations of many Krylov subspace algorithms and the types of preconditioners used with them; Saad especially is an excellent resource on the implementation of practical preconditioned methods. In general it is extremely difficult to predict the performance enhancement due to a particular preconditioner based on theoretical analysis, and for this reason finding a good preconditioner for a given problem is nearly an art form. A comprehensive survey of progress in preconditioning large linear systems is given by Benzi [15], with the topics covered similar to those touched on by Saad and Van der Vorst: incomplete factorization methods, sparse approximate inverses, and block/multilevel extensions of these methods. The common theme among these types of preconditioners is that they depend in some form or another on the coefficient matrix A.

Incomplete factorizations are the most widely used of the class of preconditioners based on the structure of the coefficient matrix. Generally either Incomplete LU (ILU) or, for SPD systems, Incomplete Cholesky (IC) factorizations are used. An ILU factorization is one such that A ≈ L̄Ū, where L̄ and Ū are approximate LU factors, lower and upper triangular, respectively. For a sparse A there is no reason to think that the full LU factors will be sparse themselves; thus an ILU factorization is simply one that does not retain all of the LU fill but only as much as is deemed necessary. The most basic factorizations are termed ILU(0) and IC(0), and these indicate incomplete factorizations which retain the exact sparsity pattern of A: the locations of the nonzeros of A dictate the locations of the nonzeros of L̄ and Ū, and likewise for the Incomplete Cholesky factors. More advanced incomplete factorizations have a larger amount of fill-in, that is, the factors fill in the zero locations of A with nonzeros. The amount of fill-in can be determined based on location or numerical value. A general class of ILU (and IC) preconditioners based on location is denoted by ILU(l), where l indicates the level of fill-in. The level-fill strategies are based purely on the structure of A, while a threshold strategy (ILUT) also depends on the numerical values: entries that are smaller than some chosen criterion are dropped, in such a manner that the nonzero pattern of the factors is determined dynamically.
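For reference, a threshold-type incomplete LU factorization of the kind just described is available in SciPy; the sketch below builds an arbitrary sparse test matrix and applies the incomplete factors as a preconditioner for GMRES. The drop tolerance and fill factor shown are illustrative values only.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Arbitrary sparse test system: a 1-D Laplacian-like tridiagonal matrix
n = 200
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# ILUT-style incomplete factorization: small entries are dropped and
# fill-in is capped, so the factors stay sparse
ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)

# The action of P^{-1} is applied via triangular solves with the incomplete factors
M = spla.LinearOperator((n, n), matvec=ilu.solve)
x, info = spla.gmres(A, b, M=M)
print(info, np.linalg.norm(b - A @ x))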

Sparse approximate inverses are another type of preconditioner which can be formulated based solely on the structure of A, though they are of little interest in this work. These preconditioners explicitly construct matrices to formulate the action of P^{-1}, which is what is needed in the preconditioning steps found in Eqs. (1.13)-(1.15). However, the process can be viewed more abstractly as one where only the action of P^{-1} is required. In many cases this makes it possible to develop preconditioners which are based not on the structure of A but on particular knowledge of the underlying problem. For example, a lower-order formulation of some problem can be used to precondition a higher-order formulation. Physics-based preconditioners can be extremely powerful, though they are not easily found, since significant knowledge of the problem is usually required.

When using a preconditioner it is rare that P^{-1} would ever be formed and used explicitly (except in the case of sparse approximate inverses). With the incomplete factorizations, the action of P^{-1} is applied directly through the factors (mainly forward and backward substitution). For other types of preconditioners, however, the action of P^{-1} may be found using another iterative method, such that a nested set of iterations is formed. This does not pose a problem so long as P is fixed, i.e., a stationary iterative method. However, when an iterative method is used in which P^{-1} is not the same operator across iterations, such as another Krylov method, difficulties arise. To deal with this, a family of Krylov methods has been developed which is built to handle a preconditioner that varies from step to step, known as flexible methods [5]. For instance, if GMRES is to be used to approximate the action of P^{-1}, then Flexible GMRES (FGMRES) must be used to solve the preconditioned problem. Though preconditioning is a very important part of increasing the robustness of Krylov subspace methods, there is no prescribed manner in which the best, or even an effective, preconditioner can be found. The mathematical community continues to study the topic, and specialized knowledge on preconditioning can often be found within the community studying a particular area of physics. Despite the difficulties associated with finding preconditioners, the enhanced performance and robustness they provide make them an essential part of any practical use of Krylov methods.

Inexact Newton Methods

Newton's method, also known as the Newton-Raphson method, is a very well known method that can be used to find the root of a function. In its most basic form the method is used to find a root of a real-valued function, but it can also be generalized to find a root of a nonlinear system of equations. Newton's method can also be viewed as a method to solve an unconstrained optimization problem, as will be seen. Nearly any textbook on numerical analysis or scientific computing [1, 11] presents Newton's method and several variations thereof. Although not commonly described as a solution technique for eigenvalue problems, it is possible to find an eigenpair using Newton's method, an idea that plays a primary role in this work. Stewart [16] devotes a brief section to the solution of eigensystems using Newton's method, mainly as an introduction to the Jacobi-Davidson method, which is related to Newton's method to the extent that Stewart claims it could also be termed the Newton-Rayleigh-Ritz method. Anselone and Ball [17] also consider Newton's method as a tool for solving the eigenvalue problem. Peters and Wilkinson [18] were able to show a relationship between inverse iteration and Newton iteration, while Zhou [19] derived a Newton method directly from the Rayleigh quotient iteration. Stewart [16] also notes connections between Newton's method and the QR algorithm, and points to a theorem of Dennis and Schnabel [20] which effectively says that two superlinearly convergent methods beginning at the same point, i.e., initial guess, will produce the Newton direction, up to higher-order terms. So, though Newton's method may not commonly be associated with the eigenvalue problem, the mathematical community has certainly not neglected the issue. However, the application of Newton's method to the k-eigenvalue problem in neutronics calculations (discrete-ordinates transport or the diffusion approximation) has not been explored beyond the IRAM/JFNK hybrid scheme proposed by Mahadevan and Ragusa [21] and the work of Knoll et al. [22], which will be discussed in Section 1.5.

In its most basic form, used to find the root of a nonlinear equation in a single variable, Newton's method is simple to derive. Using a Taylor series, a nonlinear function f can be expanded as

f(x^{(0)} + δx) = f(x^{(0)}) + f′(x^{(0)}) δx + (1/2) f″(x^{(0)}) δx^2 + ⋯.

If terms higher than first order are truncated, then a linear approximation of f near x = x^{(0)} + δx results, which has a root when δx = −f(x^{(0)})/f′(x^{(0)}). This can be used to find an update, x^{(1)}, to the root of the original function f, which can in turn be used to find a better correction, with the process repeating until convergence is achieved. Simply stated,

x^{(k+1)} = x^{(k)} − f(x^{(k)})/f′(x^{(k)}).

Although Newton's method converges quadratically [1], it is extremely sensitive to the choice of x^{(0)}. Some choices may cause the method to diverge, and in the case of multiple roots the choice determines which root the method converges to. If Newton's method is to be used as an optimization tool (to find a minimum) in one dimension, then the root of f′ and not f is sought, such that

x^{(k+1)} = x^{(k)} − f′(x^{(k)})/f″(x^{(k)}).

Again Newton's method converges quadratically, if it converges, and is sensitive to the initial guess; the converged value in this instance may indicate a minimum, a maximum, or an inflection point of the function.
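A minimal sketch of the scalar iteration, using an arbitrary test function whose root is known, is:

def newton_scalar(f, fprime, x0, tol=1e-12, max_iter=50):
    # x_{k+1} = x_k - f(x_k)/f'(x_k), stopped when |f(x)| is small
    x = x0
    for _ in range(max_iter):
        if abs(f(x)) < tol:
            break
        x = x - f(x) / fprime(x)
    return x

# Example: f(x) = x^2 - 2 has the root sqrt(2); convergence depends on x0
print(newton_scalar(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0))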

The true power of Newton's method is realized when it is generalized to an n-dimensional problem. Consider a vector-valued function F(x), where x is a vector of length n, x = [x_1, ..., x_n]^T. The Taylor series expansion of this function around some vector x^{(k)} is given by

F(x^{(k)} + δx) = F(x^{(k)}) + F′(x^{(k)}) δx + ⋯,

which gives

F(x^{(k)} + δx) ≈ F(x^{(k)}) + J(x^{(k)}) δx   (1.16)

when higher-order terms are truncated. The matrix J(x^{(k)}) is known as the Jacobian matrix, and its elements are defined by

J(x^{(k)}) = \begin{bmatrix} \partial F_1/\partial x_1 & \partial F_1/\partial x_2 & \cdots & \partial F_1/\partial x_n \\ \partial F_2/\partial x_1 & \partial F_2/\partial x_2 & \cdots & \partial F_2/\partial x_n \\ \vdots & & & \vdots \\ \partial F_n/\partial x_1 & \partial F_n/\partial x_2 & \cdots & \partial F_n/\partial x_n \end{bmatrix}_{x = x^{(k)}}.   (1.17)

If δx is set to the solution of the linear equation J(x^{(k)}) δx = −F(x^{(k)}), then the vector x^{(k)} + δx is an approximation to a zero of the function F. This defines an iterative scheme similar to the one described above for Newton's method in one dimension, given by Algorithm 1.10.

Algorithm 1.10 Newton's Method [20]
x^{(0)} = arbitrary initial guess
for k = 0, 1, 2, ... do
  Solve J(x^{(k)}) δx^{(k)} = −F(x^{(k)})
  x^{(k+1)} = x^{(k)} + δx^{(k)}
end for

Although this algorithm reduces the nonlinear problem to a series of linear ones, the Jacobian matrix must be formed and a linear system solved at each step of the iteration. Like the one-dimensional Newton's method, the generalized variant for nonlinear systems converges quadratically, assuming an initial guess has been chosen that is sufficiently close to a zero of F. Oftentimes so-called globalization procedures are used with Newton's method; these add an extra calculation that seeks to ensure the step δx meets some criterion, with the goal of aiding convergence. One simple approach is to multiply δx by a scalar chosen to ensure the new x value actually decreases the residual ‖F(x^{(k+1)})‖, while the trust region method attempts to keep the step size within some radius where the Taylor series approximation is deemed accurate [1]. Dennis and Schnabel [20] contains a detailed discussion of the globalization of Newton's method.
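The following sketch implements Algorithm 1.10 for a small, arbitrary two-equation system, forming the Jacobian analytically and solving the Newton step with a dense solver:

import numpy as np

def newton_system(F, J, x0, tol=1e-12, max_iter=50):
    # Algorithm 1.10: solve J(x) dx = -F(x), then update x <- x + dx
    x = x0.copy()
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        dx = np.linalg.solve(J(x), -Fx)
        x = x + dx
    return x

# Arbitrary test system: x0^2 + x1^2 = 4 and x0*x1 = 1
F = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] * x[1] - 1.0])
J = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]], [x[1], x[0]]])
print(newton_system(F, J, np.array([2.0, 0.5])))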

Due to the stiff computational requirements of Newton's method, variations have been constructed which attempt to reduce the computational cost. The solution of the linear system, also called the linearized Newton step, or just the Newton step, and the formation of the Jacobian are the dominant costs which these variations seek to reduce. The class of methods that reduce the computational cost by approximately solving the linear Newton step are termed Inexact Newton methods. A generic algorithm for the Inexact Newton method is given by Algorithm 1.11, where the imposed condition simply indicates the accuracy required of the approximate solution to the Newton step.

Algorithm 1.11 Inexact Newton Method [23]
x^{(0)} = arbitrary initial guess
for k = 0, 1, 2, ... do
  Solve J(x^{(k)}) δx^{(k)} = −F(x^{(k)}) such that ‖F(x^{(k)}) + J(x^{(k)}) δx^{(k)}‖ ≤ η^{(k)} ‖F(x^{(k)})‖, with η^{(k)} ∈ [0, 1)
  x^{(k+1)} = x^{(k)} + δx^{(k)}
end for

Inexact Newton methods are often referred to by names such as Newton-Jacobi, Newton-SOR, and Newton-Krylov or Newton-GMRES, etc., which indicate the type of iterative method being used for the Newton step. For large problems Krylov methods are most suitable, since they are able to take advantage of the sparsity of the Jacobian matrix.

Forcing Factor

Regardless of the manner in which an Inexact Newton method is implemented, the decision about how accurately to solve the linear equation persists. The criterion used to decide this is often referred to as a forcing factor or forcing term in the context of these methods. The choice is important because it affects both the robustness and the convergence rate of the underlying Newton method. The forcing factor is represented by η^{(k)} in the relationship

‖F(x^{(k)}) + J(x^{(k)}) δx^{(k)}‖ ≤ η^{(k)} ‖F(x^{(k)})‖,   (1.18)

which requires a reduction of the linear residual norm by a factor of η^{(k)} before the linear system is considered converged. Here ‖F(x^{(k)})‖ is the residual at the beginning of the Newton step, and ‖F(x^{(k)}) + J(x^{(k)}) δx^{(k)}‖ is similarly the residual of the linear system J(x^{(k)}) δx^{(k)} = −F(x^{(k)}).

The mathematical literature contains many valuable papers regarding inexact Newton methods. Dembo, Eisenstat, and Steihaug proved [24] that for a sequence of η^{(k)} which is uniformly less than one, the inexact Newton method converges locally, i.e., as long as x^{(0)} is sufficiently close to the true solution. The same paper also classifies inexact Newton methods, using the sequence of η^{(k)} to determine the convergence rate of a given inexact Newton method; briefly, x^{(k)} converges to x* superlinearly when η^{(k)} → 0, and x^{(k)} converges to x* quadratically when η^{(k)} = O(‖F(x^{(k)})‖) and F′(x) is Lipschitz continuous at x*. Brown and Saad [25] performed a more general analysis of a Newton-Krylov method with globalization, while Brown [26] analyzed a JFNK method using either Arnoldi (FOM), GMRES, or GCR as the Krylov iterative method.

Along with determining the convergence rate, the forcing factor also determines how much computational work is done within each Newton iteration. In early iterations, when x^{(k)} is not near a solution, the linear approximation of F may not be accurate, in which case choosing η^{(k)} too small will effectively oversolve the problem. That is, a good deal of work may be done to find a δx^{(k)} which does little to reduce ‖F(x^{(k)} + δx^{(k)})‖. The motivation then becomes to find a sequence of η^{(k)} which avoids oversolving but maintains a fast local convergence rate. Many algorithms have been presented which yield a sequence of η^{(k)} values, ranging from constants to elaborately detailed functions. Dembo and Steihaug [27] proposed the choice

η^{(k)} = min{ 1/(k + 2), ‖F(x^{(k)})‖ }.   (1.19)

Eisenstat and Walker [28] proposed two methods which depend not only on the current Newton residual, ‖F(x^{(k)})‖, but also on the residual of the previous Newton iterate as well as the residual of the linear Newton step, J(x^{(k)}) δx^{(k)} = −F(x^{(k)}). Numerically modified versions of the Eisenstat-Walker algorithms were given to ensure η^{(k)} does not decrease too quickly. More recently, An et al. [29] proposed a new algorithm to choose the sequence of forcing factors more effectively, which can be experimentally tuned for a specific problem through the choice of three parameters. Although the convergence rate of the Newton method has been shown to be a function of the sequence of η^{(k)}, it has not been shown that there is an algorithm for choosing the forcing factors which is guaranteed to give better results than any other method.
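As an illustration, the Dembo-Steihaug rule of Eq. (1.19) can be encoded directly; inside an inexact Newton iteration the resulting η^{(k)} would be handed to the Krylov solver as its relative residual tolerance. A minimal sketch with fabricated residual values:

def forcing_factor(k, F_norm):
    # Dembo-Steihaug choice, Eq. (1.19): eta_k = min{1/(k+2), ||F(x_k)||}
    return min(1.0 / (k + 2), F_norm)

# Inside an inexact Newton loop the factor sets the linear stopping criterion:
#   eta = forcing_factor(k, norm(F(x)))
#   solve J(x) dx = -F(x) until ||F(x) + J(x) dx|| <= eta * ||F(x)||
for k, F_norm in enumerate([1.0e1, 2.0e0, 3.0e-1, 4.0e-3]):   # illustrative residuals
    print(k, forcing_factor(k, F_norm))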

Globally Convergent Newton Methods

The most desirable attribute of Newton's method is its fast local convergence rate, while the difficulty of choosing a starting guess inside the local convergence radius is one of its least appealing. A good deal of literature exists on additions to the (exact and inexact) Newton method which seek to yield a globally convergent algorithm. Many texts which cover Newton's method also contain brief sections on common globalization approaches, such as line searches and trust regions [1], as mentioned previously. The text by Dennis and Schnabel [20], which focuses on nonlinear problems, devotes a significant amount of discussion to the theory of globalization techniques and practical applications. Most of the discussions of globalization techniques in standard textbooks refer to the exact formulation of Newton's method, whereas the inexact formulation is much more relevant to this work. Globalization of the inexact Newton method, though not included in many textbooks, has been discussed in a number of journal articles. The manuscripts of Brown and Saad [25] and Eisenstat and Walker [23] contain detailed analyses regarding the convergence of inexact Newton methods when used in conjunction with globalization strategies. Brown and Saad consider only the subset of Newton-Krylov (more generally, Newton-Projection) methods, while Eisenstat and Walker take a more general look at the class of inexact Newton methods. Brown and Saad supply the theoretical foundation for the class of Newton-Krylov methods, answering a number of theoretical questions raised in their previous work [30]. The more general treatment of Eisenstat and Walker provides a number of modifications to the inexact Newton method which result in algorithms that converge from an arbitrary starting point, given that some common conditions are met. These modifications can all be considered special instances of the Global Inexact Newton (GIN) framework used by the authors. The proposed modifications are analyzed in detail, always emphasizing the consequences of the inexactness. The practical methods proposed in the paper that fit the GIN framework are backtracking methods and equality curve methods.

A generic implementation of inexact Newton backtracking as described by Eisenstat and Walker is shown in Algorithm 1.12.

Algorithm 1.12 Inexact Newton Backtracking Method [28]
x^{(0)} = arbitrary initial guess
for k = 0, 1, 2, ... do
  Solve J(x^{(k)}) δx̄^{(k)} = −F(x^{(k)}) such that ‖F(x^{(k)}) + J(x^{(k)}) δx̄^{(k)}‖ ≤ η̄^{(k)} ‖F(x^{(k)})‖
  Set δx^{(k)} = δx̄^{(k)}, η^{(k)} = η̄^{(k)}
  while ‖F(x^{(k)} + δx^{(k)})‖ > (1 − α(1 − η^{(k)})) ‖F(x^{(k)})‖ do
    Choose θ ∈ [θ_min, θ_max]
    θ δx^{(k)} → δx^{(k)},  1 − θ(1 − η^{(k)}) → η^{(k)}
  end while
  x^{(k+1)} = x^{(k)} + δx^{(k)}
end for

Subsequent work by Bellavia and Morini [31] presents a hybrid backtracking / equality curve backtracking algorithm specifically for Newton-GMRES, which offers an advantage over the classical backtracking algorithm of Eisenstat and Walker. In this technique, if classical backtracking along the inexact Newton direction fails to sufficiently reduce the residual, then backtracking is performed along a new direction along which the inexact Newton conditions hold with equality. More recent work by An and Bai [32] presents a similar hybrid strategy where equality curve backtracking is replaced by a new method termed quasi-conjugate-gradient backtracking.

Jacobian-Free Newton-Krylov Methods

Other than approximately solving the Newton step, the computational cost of Newton's method can be reduced by decreasing the cost associated with forming the Jacobian matrix. One way to do this is to construct the Jacobian only periodically, not at each step of the Newton iteration; this is sometimes called the stale Jacobian method. It is also possible to create an approximate Jacobian matrix through the use of a difference approximation. A first-order divided difference approximates the j-th column of the Jacobian as

[J(x^{(k)})]_j ≈ ( F(x^{(k)} + h_j^{(k)} e_j) − F(x^{(k)}) ) / h_j^{(k)},

where h_j^{(k)} is chosen to be small enough to sufficiently reduce truncation error, but not so small that it introduces rounding error [11]. This variation of Newton's method still converges quadratically as long as a suitable value of h is chosen. The idea of using divided differences can also be extended to create estimates of the Jacobian of increasingly higher order.

Using Newton-Krylov methods it is possible to avoid any explicit approximation to the Jacobian matrix itself; only an approximation of the action of the Jacobian matrix on a vector is necessary. Newton-Krylov methods where only the action of the Jacobian is used are usually referred to as Jacobian-Free Newton-Krylov (JFNK) methods, though occasionally they are called Matrix-Free Newton-Krylov methods. These are inexact Newton methods which utilize a Krylov solver for the linear Newton step and approximate the Jacobian-vector product by

J(x) y ≈ ( F(x + εy) − F(x) ) / ε,   (1.20)

where ε is a small perturbation. The JFNK method without a globalization step is given by Algorithm 1.13, though the addition of a globalization step as in Algorithm 1.12 is straightforward.

Algorithm 1.13 JFNK Method [33]
x^{(0)} = arbitrary initial guess
for k = 0, 1, 2, ... do
  Solve J(x^{(k)}) δx^{(k)} = −F(x^{(k)}) using an iterative Krylov method, where ‖F(x^{(k)}) + J(x^{(k)}) δx^{(k)}‖ ≤ η^{(k)} ‖F(x^{(k)})‖ and Jy ≈ (F(x + εy) − F(x))/ε
  x^{(k+1)} = x^{(k)} + δx^{(k)}
end for

The best single resource on JFNK methods is the survey paper by Knoll and Keyes [33]. This article summarizes the development of Jacobian-Free Newton-Krylov methods to date, detailing the mechanics of the method and its potential problems. The aim of that article is to introduce the method and its possible uses to a larger audience in the computational physics community. The advantages of the JFNK method include the ability to use existing solution subroutines and inexpensive linear solvers as preconditioners. However, the effectiveness of the preconditioner should be monitored, and the storage required by the preconditioner, and possibly the Krylov solver, is not trivial.
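The finite-difference product of Eq. (1.20) is simple to implement; the sketch below checks it against the analytic Jacobian-vector product for an arbitrary test function, using the square root of machine precision as a common rule of thumb for ε, purely for illustration.

import numpy as np

def jfnk_matvec(F, x, y, eps=None):
    # Approximates J(x) y ~ [F(x + eps*y) - F(x)] / eps, Eq. (1.20);
    # no Jacobian is ever formed, only residual evaluations are needed.
    if eps is None:
        eps = np.sqrt(np.finfo(float).eps)
    return (F(x + eps * y) - F(x)) / eps

# Arbitrary test function and a vector to act on
F = lambda x: np.array([x[0]**2 + x[1], np.sin(x[0]) + x[1]**3])
x = np.array([0.5, 1.0])
y = np.array([1.0, -2.0])

J_exact = np.array([[2 * x[0], 1.0], [np.cos(x[0]), 3 * x[1]**2]])
print(jfnk_matvec(F, x, y))   # finite-difference approximation
print(J_exact @ y)            # analytic product for comparison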

The simplest form of the JFNK algorithm requires two nested levels, but sometimes four are necessary; from outermost to innermost, these comprise a globalization loop to ensure Newton convergence, the loop over the Newton corrections, a preconditioner for the Krylov solver, and at the innermost level a Krylov subspace solver that addresses the linearized Newton step. The purpose of the preconditioning step in the algorithm is to reduce the number of iterations required by the Krylov solver. Both left and right preconditioning can be used in conjunction with the JFNK method, and neither choice is inherently better than the other. Some standard preconditioning techniques are mentioned, such as the stale Jacobian, incomplete lower-upper (ILU) factorization, Newton-Krylov-Schwarz (NKS), multigrid, and physics-based approaches; an example of a physics-based approach is provided in [33] for a stiff wave system. The remainder of the survey article discusses areas in computational physics where JFNK has already been applied, including but not limited to fluid dynamics and aerodynamics, edge plasma modeling, the Fokker-Planck equation, the magnetohydrodynamics equations, reactive flows and flows with phase change, radiation diffusion and radiation hydrodynamics, and geophysical flows.

Perturbation Parameter

The choice of the perturbation parameter, ε, in Eq. (1.20) is unique to the JFNK formulation of Newton's method. This parameter is essentially the finite-difference perturbation used to approximate the action of the Jacobian, the derivative of F. If ε is chosen too large, Eq. (1.20) becomes a poor approximation to the derivative, and if it is too small the calculation risks the introduction of floating-point errors. Knoll and Keyes [33] list a number of empirically based formulas that can be used to determine an appropriate ε. In general, a perturbation in the neighborhood of the square root of ε_mach is used, where ε_mach refers to the floating-point precision of the machine, assuming the floating-point system is based on the machine's native precision. Xu and Downar [34] have developed an algorithm which is capable of estimating the ε value that optimizes the JFNK solution by minimizing the total error contribution from the finite-difference truncation and rounding errors.
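One commonly quoted heuristic from the JFNK literature scales the perturbation by the size of the iterate relative to the direction vector; the exact formula varies between references, so the version below should be read as a representative example rather than the specific choice used in this work.

import numpy as np

def choose_eps(x, y):
    # Representative heuristic: eps ~ sqrt(eps_mach) * (1 + ||x||) / ||y||,
    # keeping the perturbation x + eps*y meaningful relative to x.
    eps_mach = np.finfo(float).eps
    ynorm = np.linalg.norm(y)
    if ynorm == 0.0:
        return np.sqrt(eps_mach)
    return np.sqrt(eps_mach) * (1.0 + np.linalg.norm(x)) / ynorm

print(choose_eps(np.ones(10), np.full(10, 0.1)))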

Fixed-Point Iteration

Another situation where the JFNK variant of Newton's method can be applied is in the context of fixed-point iterations. If a fixed-point iteration is defined by u^{(n+1)} = Φ(u^{(n)}), so that the solution is given by u* = Φ(u*), then this problem can be solved using the JFNK approximation. This is done by defining the nonlinear function

F(u) = u − Φ(u)

and solving F(u) = 0 using the JFNK algorithm. Generally it is not possible to generate the Jacobian of Φ, making the application of Newton's method to this problem via JFNK attractive. This can either be considered an acceleration method for the fixed-point iteration or a nonlinearly preconditioned version of the original equation system [25]. This approach has been used by Kerkhoven and Saad [35] as an acceleration method for elliptic coupled nonlinear systems, and by Wigton, Yu, and Young [36] as an acceleration method for computational fluid dynamics codes. Xu and Downar [37] considered this as a manner in which a problem involving multiple coupled physics models could be solved while retaining the individual solver for each set of physics. Rather than reformulating the problem as a larger nonlinear problem, each physically distinct model could then be solved using preexisting software which utilizes the best methods available for that particular problem. A local convergence analysis of such an inexact Newton method agrees with [24], and the convergence analysis of the JFNK method mirrors that of [26].
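A brief sketch of this reformulation, using an arbitrary contractive map Φ and building the small Jacobian column-by-column from the finite-difference products of Eq. (1.20) (a Krylov solver would use the products directly instead), is:

import numpy as np

# Arbitrary contractive map: Phi(u) = cos(u) componentwise; its fixed point
# solves u = cos(u). Define F(u) = u - Phi(u) and drive F to zero with Newton.
Phi = lambda u: np.cos(u)
F = lambda u: u - Phi(u)

def fd_jacobian(F, u, eps=1.0e-7):
    # Toy sizes only: assemble J from one finite-difference product per column
    n = len(u)
    return np.column_stack([(F(u + eps * e) - F(u)) / eps for e in np.eye(n)])

u = np.zeros(3)
for _ in range(8):
    u = u + np.linalg.solve(fd_jacobian(F, u), -F(u))
print(u, np.linalg.norm(F(u)))   # converges to the fixed point of cos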

1.5 The k-eigenvalue Problem in Neutronics

Ultimately the numerical methods discussed are used to solve a problem, be it an eigenvalue problem or a (non)linear system of equations, which serves as a model of some physical phenomenon. In this work the physical problem of interest is the determination of the multiplication factor and the fundamental eigenmode of a nuclear system containing multiplying, i.e., fissionable, nuclides. The multiplication factor can be thought of as the ratio of the total neutron population in the current generation to that in the preceding generation. Consider a neutron resulting from the fissioning of a nucleus. This neutron will likely scatter around the system until it either leaks out of the system or is absorbed by another nucleus, potentially causing another fission. If the neutrons created in the preceding generation result in the same number of neutrons in the current generation, it is said that the system is critical, or that the multiplication factor, k, equals 1, in which case the chain reaction is self-sustaining, and in the absence of external sources of neutrons it is time independent (ignoring the depletion of the finite number of fissionable nuclides in the system). If the number of neutrons increases from one generation to the next the reaction is supercritical, k > 1, while if it decreases it is subcritical, k < 1. However, a subcritical system will not necessarily see a decrease in the neutron population over time if an external neutron source is present.

Determining the multiplication factor, as well as the fundamental mode of the neutron distribution throughout the system, depends on how the neutron distribution in the problem is modeled. Generally one of two mathematical formulations is used: neutron transport theory or diffusion theory (also called the diffusion approximation). Neutron transport theory is the more exact of the models available, where the dependence of the neutron distribution on the spatial position, direction of travel, and energy of the particles is accounted for. Diffusion theory is an approximation of transport theory which can be derived by making some assumptions about the behavior of the nuclear system. Classical texts on neutron transport include [38, 39, 40], with the book by Lewis and Miller [41] focusing on the numerical solution of the transport problem. Diffusion theory is well treated in Duderstadt and Hamilton [42], with a more practical treatment given by Stamm'ler and Abbate [43]. The texts on iterative methods and analysis by Hageman and Young [3], Varga [12], and Wachspress [44] also discuss the numerical solution of the neutron diffusion problem. The specifics of the eigenvalue problem resulting from diffusion theory will be outlined in Chapter 2, while the transport eigenvalue problem is presented in Chapter 4. It is sufficient at this point to note that the neutronics eigenvalue problem is of the general form

M φ = (1/k) F φ,

where the operators have not yet been defined.

This is often known as the generalized eigenvalue problem, and in nuclear engineering it is converted to the form

A φ = k φ,  A = M^{-1} F,

where the inversion of M is never explicitly carried out, but effected via iterations. Traditionally this problem is solved using the fixed-point power iteration method for both transport [41] and diffusion [42] theories, through a process using outer and inner iterations. Outer iterations pertain to updating the fission source (or flux) and the multiplication factor, while inner iterations are associated with the solution of a within-group linear system, i.e., the application of M_gg^{-1}. Most reactor systems result in a value of the dominance ratio, d, that is near one, indicating a slow rate of convergence for the power method and requiring the use of acceleration methods to improve the convergence rate. One common type of acceleration is extrapolation, which uses previous iterates to improve the current iterate. Lewis and Miller mention overrelaxation as an acceleration technique in transport theory, and the computer code TORT [45] uses just such a method. Diffusion theory can be accelerated using Chebyshev acceleration, which uses Chebyshev polynomials to determine the extrapolation parameters. Chebyshev acceleration has been used extensively in diffusion theory, with analytical treatment provided by [3, 12, 44]. An excellent explanation of the inner-outer structure of the diffusion eigenvalue problem and of acceleration via Chebyshev polynomials is given by Ferguson and Derstine [46] and implemented in the code DIF3D [47]. Another well-known diffusion code employing Chebyshev acceleration is NESTLE [48]. Another type of extrapolation technique, originally proposed by Wachspress, is described by Hébert [49, 50]. While Chebyshev acceleration requires an estimate of the dominance ratio to determine the extrapolation coefficients, the method of Hébert does not require any such estimate but is instead derived from a variational principle. Comparisons by Hébert show that the variational approach offers a modest improvement over the Chebyshev method without requiring a dominance ratio estimate.

Another type of acceleration used often in diffusion calculations is Wielandt iteration, which in principle is the same as the eigenvalue shift discussed earlier. This method is also known as fractional iteration [44] and has been discussed

in the context of the nodal expansion method in diffusion by Sutton [51]. The well-known core simulator PARCS [52] uses the Wielandt method to accelerate the outer iterations.

Other methods besides these variations of the traditional power method have been researched as solutions to the k-eigenvalue problem in nuclear engineering. Suetomi and Sekimoto [53] proposed a method to solve the generalized eigenvalue problem resulting from the monoenergetic diffusion equation using a conjugate gradient algorithm. This was done by recognizing that the minimal value of the Rayleigh quotient gives the desired eigenvalue (multiplication factor) and the corresponding eigenvector (neutron flux). The eigenvalue problem was then posed as a minimization problem for the Rayleigh quotient. Since in a monoenergetic diffusion problem both matrices are symmetric positive definite, the CG algorithm can be used to solve the linear system associated with the minimization of the Rayleigh quotient. This approach was extended to the multigroup diffusion equations, with the CG method replaced by ORTHOMIN, mentioned earlier. Both the CG and ORTHOMIN algorithms can be preconditioned in the proposed scheme. Several sample problems were solved and compared to unaccelerated power iteration using SOR for the inner iterations; execution time was decreased using the new methods. Gupta and Modak [54] extended the ORTHOMIN method employed by Suetomi to the three-dimensional multigroup transport problem. A scheme was developed in which the method could be implemented without explicitly forming any matrices, requiring only that the transport code be capable of solving a fixed-source problem. The source iterations were accelerated using a CG-based transport synthetic acceleration (TSA) scheme previously developed by the authors. Numerical results for two example problems confirm that the ORTHOMIN method yields better performance than the power method (both without acceleration and with the same TSA(CG) acceleration).

Vidal et al. [55] employed two different variations of subspace iteration to find multiple eigenmodes of the multigroup diffusion equations. One of the motivations for finding multiple modes is modal analysis of the equations, which requires the first few dominant modes and not just the fundamental mode. Both subspace iterations used the Modified Gram-Schmidt procedure for orthogonalization.

One used a regular Rayleigh-Ritz projection and the other a symmetric Rayleigh-Ritz projection. A variational acceleration (extrapolation) technique based on the assumption that the dominant eigenvalues are real numbers was developed. Numerical results from a two-dimensional and a three-dimensional reactor configuration show that the symmetric projection leads to improved performance and that the acceleration technique provides speedup factors in the range of two to four.

The Implicitly Restarted Arnoldi Method, mentioned earlier, is the eigenvalue method which has most recently been applied to both the diffusion and transport problems. In 1999 Verdú et al. [56] used the IRAM technique to solve for the subcritical modes in the diffusion eigenvalue problem. The software library ARPACK [8] was used in the implementation of IRAM for the diffusion problem, and a parallel version of the software was developed. Numerical results were generated for several different configurations of the Three Mile Island core, and IRAM was compared with the subspace iteration based methods developed by the authors in [55]. It was found that IRAM is more efficient and more robust than subspace iteration, and the authors believe it to be an advancement over the traditionally used methods. Warsa et al. [57] subsequently succeeded in implementing IRAM for the transport problem. With transport theory an explicit matrix is generally not available; however, IRAM can still be implemented by using power iterations, which are present in most criticality analysis codes. The IRAM method is wrapped around the power method, resulting in a solution technique which has a superior rate of convergence while utilizing preexisting software. Increased memory requirements and work per iteration are both outweighed by the improved convergence rate, as shown by a number of numerical test problems. The only downside noted is the mediocre performance of IRAM in problems with upscattering. Most recently, Mahadevan and Ragusa [21] have developed a hybrid IRAM/JFNK method which uses IRAM to develop good initial approximations to a few of the dominant eigenmodes and then uses a Newton-based method to find the fully converged modes. Preliminary numerical results have been generated for some diffusion problems and show that the Newton method does succeed in finding fully converged eigenmodes, though few performance results have been published at this time.

Newton's Method in Neutronics Calculations

Though Newton's method has not previously been considered as a means of accelerating or replacing the traditional k-eigenvalue calculation, there are instances where it has been used in conjunction with neutronics problems, aside from the work of Mahadevan and Ragusa. The most common setting is the treatment of the strong nonlinearities which arise when coupling the core neutronics and thermal-hydraulics problems. Theoretical work on higher-order coupling methods using the JFNK approximation has been done by Mousseau [58] and Pope [59]. More applied work has been done by Kastanya and Turinsky [60] and by Gan, Xu, and Downar [61]. Both [60] and [61] consider a coupled neutronics/thermal-hydraulics problem where Newton-Krylov methods are used, with the latter specifically considering the JFNK approximation. While the work by Gan et al. considers the coupling of the spatial kinetics equations to the thermal-hydraulics problem, the paper by Kastanya and Turinsky considers the two-group neutron diffusion equation coupled with the thermal-hydraulics problem. This problem necessarily contains the k-eigenvalue problem but is augmented further by the thermal-hydraulics equations and the associated unknown quantities. Kastanya and Turinsky show that the calculation of the coupled problem undergoes speedup when a Newton-Krylov method, specifically Newton-BICGSTAB, is employed. Several approximations to the exact Jacobian treatment were considered and were found to show a marginal speedup, though the JFNK approximation was not among those employed. The Jacobian was generally written explicitly in their work, and preconditioners, such as block ILU, were constructed for BICGSTAB using this knowledge.

More recently, Knoll [62] has used JFNK methods to create a new type of nonlinear acceleration method for the within-group transport problem. This work was continued in [22], where along with the nonlinear acceleration scheme the JFNK approximation was considered for the k-eigenvalue problem for a one-group, one-dimensional diffusion equation. This was presented at the same conference, M&C 2009, at which preliminary diffusion results from this work were presented [63]. Preliminary transport results were subsequently presented at the 2009 ANS Annual Meeting [64].

The derivation of, and results generated with, the new methods are presented in the following chapters. A definition of the k-eigenvalue problem in diffusion theory and a presentation of traditional techniques are given in Chapter 2, along with the development of the new Newton formulations of the problem. Numerical results for the Newton method and comparisons to traditional solution techniques are given in Chapter 3. A thorough introduction to transport theory and a discussion of the traditional techniques used to solve the k-eigenvalue problem are given in Chapter 4. That chapter also contains the derivation and explanation of the newly developed classes of Newton methods which are the focus of this work. Numerical results presenting the behavior and effectiveness of the Newton formulations of the transport problem are given in Chapter 5, and finally the work is summarized and conclusions are stated in Chapter 6.

CHAPTER 2

The k-eigenvalue Problem in Diffusion Theory

The central problem of nuclear reactor theory is the determination of the distribution of neutrons in the reactor, for it is this distribution which determines the local rate density at which nuclear reactions occur. From detailed knowledge of this distribution it is also possible to determine the stability of the nuclear chain reaction. The streaming of neutrons between collisions, and the processes by which neutrons interact, be it through scattering off of nuclei of the host material, absorption by other nuclei, or leakage out of the system, is known as neutron transport. Deterministic transport methods generally comprise a mathematical relationship possessing a solution which describes the expected neutron distribution, with the two major approaches being diffusion theory and transport theory. Diffusion theory treats neutrons in the reactor much like gas molecules in air, with the process being driven by the net migration of neutrons from areas of high concentration to areas of low concentration. In the case of neutron transport, however, the process is driven by collisions of neutrons with host nuclei and not neutron-neutron collisions. Neutron transport theory treats the particle transport problem with fewer approximations and is applicable in many types of problems where the diffusion approximation is simply inadequate.

Diffusion theory can in fact readily be derived from transport theory if a few approximations are made, which is why it is also referred to as the diffusion approximation.

2.1 The Diffusion Approximation

Basic diffusion theory is clearly presented in the textbook by Duderstadt and Hamilton [42], and the notation used here when dealing with the diffusion approximation will most closely resemble theirs. The steady-state (criticality) diffusion equation can be written

−∇·D∇φ(r, E) + Σ_t φ(r, E) = ∫_0^∞ dE′ Σ_s(E′ → E) φ(r, E′) + λ χ(E) ∫_0^∞ dE′ ν(E′) Σ_f(E′) φ(r, E′),

where r is the position vector, E the neutron energy, φ(r, E) the scalar neutron flux, D the diffusion coefficient, Σ_t the total interaction macroscopic cross section, Σ_s the differential scattering macroscopic cross section, Σ_f the fission macroscopic cross section, χ(E) the relative fission yield at energy E, λ = 1/k with k being the criticality eigenvalue or multiplication factor, and ν(E′) the average number of fission neutrons born in a fission induced by a neutron with energy E′. Discretizing into G energy groups, with group 1 covering the highest energy range and group G the lowest, results in the multigroup diffusion equations

−∇·D_g∇φ_g + Σ_{tg} φ_g = ∑_{g′=1}^{G} Σ_{sg′g} φ_{g′} + λ χ_g ∑_{g′=1}^{G} νΣ_{fg′} φ_{g′},  g = 1, ..., G.   (2.1)

Formal definitions of the group scalar flux and group constants are provided in [42]. Supposing the group removal cross section is defined to be Σ_{Rg} = Σ_{tg} − Σ_{sgg}, the multigroup equations are then given by

−∇·D_g∇φ_g + Σ_{Rg} φ_g − ∑_{g′≠g} Σ_{sg′g} φ_{g′} = λ χ_g ∑_{g′=1}^{G} νΣ_{fg′} φ_{g′},  g = 1, ..., G.   (2.2)

In order to numerically solve this eigenvalue problem, the spatial domain in the multigroup equations must be discretized. This can be done using methods such as finite-difference approximations of the differential operators or the finite element method. Duderstadt and Hamilton [42] and Stamm'ler and Abbate [43] derive the finite-differenced form of the multigroup equations, the former using mesh-edge quantities and the latter using mesh-centered quantities. A three-dimensional discretization of the multigroup equations in Cartesian geometry results in a total problem size of N = n_x n_y n_z G, where (n_x, n_y, n_z) is the number of spatial cells in the x, y, and z directions, respectively, and G is the number of energy groups. The size of the within-group problem is specified by n = n_x n_y n_z, so that N = nG. The left side of Eq. (2.2) can be written as the block matrix M,

M = \begin{bmatrix} M_1 & −Σ_{s21} & −Σ_{s31} & \cdots & −Σ_{sG1} \\ −Σ_{s12} & M_2 & −Σ_{s32} & \cdots & −Σ_{sG2} \\ \vdots & & \ddots & & \vdots \\ −Σ_{s1G} & −Σ_{s2G} & −Σ_{s3G} & \cdots & M_G \end{bmatrix},   (2.3)

where M_g is an n × n matrix representing the discretized form of −∇·D_g∇ + Σ_{Rg}, including homogeneous boundary conditions. Generally this is a sparse matrix, and in the case of finite differencing it is a symmetric banded matrix with three, five, or seven bands for one-, two-, and three-dimensional geometries, respectively. The matrices of the form Σ_{sg′g} are also n × n but are simply diagonal matrices with elements (Σ_{sg′g})_{ijk}, which denote the scattering cross section from group g′ to group g at the spatial cell labeled ijk. The total dimension of the matrix M is then N × N. The right side of Eq. (2.2) is given by the matrix F, which has the block structure

F = \begin{bmatrix} χ_1 νΣ_{f1} & χ_1 νΣ_{f2} & \cdots & χ_1 νΣ_{fG} \\ χ_2 νΣ_{f1} & χ_2 νΣ_{f2} & \cdots & χ_2 νΣ_{fG} \\ \vdots & & & \vdots \\ χ_G νΣ_{f1} & χ_G νΣ_{f2} & \cdots & χ_G νΣ_{fG} \end{bmatrix},   (2.4)

where each block is a diagonal matrix of dimension n × n, such that F is N × N. The elements of χ_g νΣ_{fg′} are χ_g (νΣ_{fg′})_{ijk}, where fissions induced in cell ijk by neutrons in group g′ result in ν neutrons, of which a fraction equal to χ_g appears in energy group g.
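As an illustration of this block structure, the sketch below assembles M and F for a toy two-group, one-dimensional, finite-differenced problem with downscatter only; all cross-section values, the mesh, and the boundary treatment are fabricated for demonstration and do not correspond to any real material.

import scipy.sparse as sp

# Toy two-group, 1-D, finite-difference diffusion operators.
# All data below are fabricated illustration values, not real cross sections.
n, h = 50, 1.0                      # cells and cell width
D = [1.3, 0.5]                      # group diffusion coefficients
sig_R = [0.10, 0.12]                # group removal cross sections
sig_s12 = 0.08                      # downscatter, group 1 -> group 2
nu_sigf = [0.005, 0.15]             # nu * Sigma_f per group
chi = [1.0, 0.0]                    # fission spectrum (all births fast)

def within_group(Dg, sRg):
    # -d/dx Dg d/dx + Sigma_Rg with zero-flux boundaries: tridiagonal, SPD
    return sp.diags([-Dg / h**2, 2.0 * Dg / h**2 + sRg, -Dg / h**2],
                    [-1, 0, 1], shape=(n, n))

I = sp.identity(n)
# Block matrix M, Eq. (2.3): within-group blocks on the diagonal,
# minus the scattering blocks off the diagonal (no upscatter here)
M = sp.bmat([[within_group(D[0], sig_R[0]), None],
             [-sig_s12 * I, within_group(D[1], sig_R[1])]], format="csr")
# Block matrix F, Eq. (2.4): the (g, g') block is chi_g * nuSigma_f,g' * I
F = sp.bmat([[chi[g] * nu_sigf[gp] * I for gp in range(2)]
             for g in range(2)], format="csr")
print(M.shape, F.shape)   # both (2n, 2n) = (N, N)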

The scalar neutron flux is given by

φ = [φ_1^T φ_2^T ⋯ φ_G^T]^T,   (2.5)

so that φ is a vector of length N, composed of vectors φ_g of length n. Using these definitions, the k-eigenvalue problem for the multigroup diffusion equations is given by

M φ = λ F φ.   (2.6)

Traditionally this problem is reformulated as a standard eigenvalue problem and the usual power iteration is applied, such that

φ^{(l+1)} = λ^{(l)} ( M^{-1} F ) φ^{(l)},   (2.7)

with φ^{(l+1)} equivalent to the solution of the linear system M φ^{(l+1)} = λ^{(l)} F φ^{(l)}. The structure of M is heavily dependent on whether the host material and prevalent conditions, e.g., temperature, support upscattering, i.e., scattering from lower to higher energy groups. Consider a scenario where the groups are chosen such that no upscattering occurs; in this case M is a block lower triangular matrix and the system can be solved in a group-wise fashion. The quantity f is a vector of length n which represents the spatial fission-source distribution and is defined by

f^{(l)} ≡ ∑_{g′=1}^{G} νΣ_{fg′} φ_{g′}^{(l)}.   (2.8)

The shape of the fission source is the same for each group; thus the fission source for group g is simply given by f_g = χ_g f, where χ_g is just the scalar value of the fission spectrum for that group. The quantity f is useful because it only needs to be calculated once per outer iteration. Using f, the sequence of equations solved during a single outer iteration in the absence of upscattering is illustrated by

M_1 φ_1^{(l+1)} = λ^{(l)} f_1^{(l)} = q_1^{(l)},
M_2 φ_2^{(l+1)} = λ^{(l)} f_2^{(l)} + Σ_{s12} φ_1^{(l+1)} = q_2^{(l)},
⋮
M_g φ_g^{(l+1)} = λ^{(l)} f_g^{(l)} + ∑_{g′=1}^{g−1} Σ_{sg′g} φ_{g′}^{(l+1)} = q_g^{(l)},
⋮
M_G φ_G^{(l+1)} = λ^{(l)} f_G^{(l)} + ∑_{g′=1}^{G−1} Σ_{sg′G} φ_{g′}^{(l+1)} = q_G^{(l)}.   (2.9)

In this manner each within-group equation can be solved using any iterative method capable of solving Ax = b, with the added benefit that for many spatial discretizations M_g is symmetric positive definite and efficient algorithms such as conjugate gradient can be used. Using the updated scalar flux φ^{(l+1)}, a new fission source, f^{(l+1)}, can be calculated. The updated eigenvalue is usually specified through some weighting function, such as

k^{(l+1)} = k^{(l)} ( f^{(l+1)}, f^{(l+1)} ) / ( f^{(l)}, f^{(l+1)} )   or   k^{(l+1)} = k^{(l)} ‖f^{(l+1)}‖ / ‖f^{(l)}‖,

as specified by Duderstadt and Hamilton [42] and by Ferguson [46], respectively. Note that to start the iterative process it is necessary to choose an initial guess for the eigenvalue, k^{(0)}, and an initial guess for either the fission source, f^{(0)}, or for the scalar flux, from which the fission source is calculated.

If the group structure of the problem is chosen so that neutrons only scatter to the next lowest energy group, then M becomes block bidiagonal, meaning that the total group source q_g is simply the sum of the fission source and the scattering source due to the next highest energy group. However, if upscattering is considered, the systematic solution by the process represented in Eqs. (2.9) is not possible. In this scenario an iterative loop must be placed around the lower energy groups which permit upscattering until suitable convergence of the scattering sources has been attained, though Stamm'ler and Abbate [43] note that groups which have only weak contributions from upscattering need not be included in the upscattering iteration.
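A compact, self-contained sketch of this outer iteration for a fabricated two-group problem with no upscattering follows; each outer iteration performs one group-wise sweep in the style of Eqs. (2.9) and updates k with the fission-source norm ratio. All data are illustrative only.

import numpy as np

# Power iteration for the two-group problem; dense matrices for simplicity
n, h = 50, 1.0
def wg(Dg, sRg):
    # Within-group operator: tridiagonal finite-difference diffusion matrix
    A = np.diag((2 * Dg / h**2 + sRg) * np.ones(n))
    A += np.diag(-Dg / h**2 * np.ones(n - 1), 1)
    A += np.diag(-Dg / h**2 * np.ones(n - 1), -1)
    return A

M1, M2 = wg(1.3, 0.10), wg(0.5, 0.12)
sig_s12, nu_sigf, chi = 0.08, np.array([0.005, 0.15]), np.array([1.0, 0.0])

k = 1.0
phi = [np.ones(n), np.ones(n)]
for outer in range(500):
    f = nu_sigf[0] * phi[0] + nu_sigf[1] * phi[1]    # fission source, Eq. (2.8)
    lam = 1.0 / k
    # Group-wise sweep, Eqs. (2.9): group 1 first, its new flux feeds group 2
    phi1 = np.linalg.solve(M1, lam * chi[0] * f)
    phi2 = np.linalg.solve(M2, lam * chi[1] * f + sig_s12 * phi1)
    f_new = nu_sigf[0] * phi1 + nu_sigf[1] * phi2
    k_new = k * np.linalg.norm(f_new) / np.linalg.norm(f)   # eigenvalue update
    done = abs(k_new - k) < 1e-10
    k, phi = k_new, [phi1, phi2]
    if done:
        break
print(k)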

By retaining a few previous iterates the power method can be accelerated using Chebyshev polynomials, often resulting in a significant decrease in the total number of iterations required. The theory behind this practice is discussed in iterative methods texts [3, 12, 44], while Ferguson and Derstine [46] fully describe a means to implement Chebyshev acceleration along with additional steps to enhance the robustness of the algorithm. In a general sense the process works by setting the new iterate of some quantity $s$ using the relationship

$$s^{(l+1)} = s^{(l)} + \alpha \left( \bar{s}^{(l+1)} - s^{(l)} \right) + \beta \left( s^{(l)} - s^{(l-1)} \right), \quad (2.10)$$

where $\bar{s}$ is the non-accelerated quantity and the scalars $\alpha$ and $\beta$ are chosen using a method based on Chebyshev polynomial interpolation. It is possible to accelerate either the scalar flux vector $\phi$ or the fission source $f$, although in practice it seems that $f$ is the more frequently chosen option.

2.2 Inexact Newton Methods

In this section we describe the application of an inexact Newton method as described in Chapter 1, specifically Newton-GMRES, to the k-eigenvalue problem in diffusion theory. As shown above, the multigroup k-eigenvalue problem in diffusion theory can be written as a generalized eigenvalue problem such that

$$M\phi = \lambda F\phi, \quad (2.11)$$

where $M$, $F$, $\phi$, and $\lambda$ are defined in Section 2.1. If $\phi$ and $\lambda$ are considered to be the unknowns then the k-eigenvalue problem represents a system of nonlinear equations that is under-determined by one equation. Assuming that an additional relationship, some unspecified function $\rho(\phi, \lambda)$, is included, a nonlinear function $\Gamma$ can be written

$$\Gamma(u) = \begin{bmatrix} M\phi - \lambda F\phi \\ \rho(\phi, \lambda) \end{bmatrix}, \quad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}, \quad (2.12)$$

where both $\Gamma$ and $u$ are of length $N + 1$, $N$ being the size of the diffusion problem defined previously. This problem can be solved using Newton's method via the iterative sequence

$$\Gamma'(u^{(m)})\,\delta u^{(m)} = -\Gamma(u^{(m)}), \qquad u^{(m+1)} = u^{(m)} + \delta u^{(m)}, \quad (2.13)$$

where $\Gamma'$ is the Jacobian of the vector-valued function $\Gamma$.

If the first $N$ equations of $\Gamma$ are denoted by $\Gamma_\phi$ and the last by $\Gamma_\lambda$, such that $\Gamma = [\Gamma_\phi^T \; \Gamma_\lambda]^T$, then this equation can be written

$$\begin{bmatrix} \dfrac{\partial \Gamma_\phi}{\partial \phi} & \dfrac{\partial \Gamma_\phi}{\partial \lambda} \\ \dfrac{\partial \Gamma_\lambda}{\partial \phi} & \dfrac{\partial \Gamma_\lambda}{\partial \lambda} \end{bmatrix}_{u=u^{(m)}} \delta u^{(m)} = -\begin{bmatrix} M\phi^{(m)} - \lambda^{(m)} F\phi^{(m)} \\ \rho(\phi^{(m)}, \lambda^{(m)}) \end{bmatrix}, \quad u^{(m)} = \begin{bmatrix} \phi^{(m)} \\ \lambda^{(m)} \end{bmatrix}, \quad (2.14)$$

which is equal to

$$\begin{bmatrix} M - \lambda F & -F\phi \\ \rho_\phi(\phi, \lambda) & \rho_\lambda(\phi, \lambda) \end{bmatrix} \delta u = -\begin{bmatrix} M\phi - \lambda F\phi \\ \rho(\phi, \lambda) \end{bmatrix}, \quad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}, \quad (2.15)$$

where the Newton iteration index is suppressed as long as the meaning is clear and $\rho_\phi$ signifies the derivative $\partial\rho/\partial\phi$. This system of equations can be used in conjunction with any of the Newton algorithms (Newton, Inexact Newton, Inexact Newton with JFNK), with or without globalization, described in Chapter 1 to find an eigenmode of the multigroup diffusion equations once an appropriate relationship $\rho$ is chosen. However, as the cost of using an exact Newton method is prohibitive for large problem sizes, Inexact Newton methods, specifically Newton-Krylov methods using GMRES (Newton-GMRES), are most suited to solving this problem. The use of a Krylov method suggests that some type of preconditioning will likely be necessary, in which case the problem can be written using generic preconditioners as

$$P_l^{-1} \begin{bmatrix} M - \lambda F & -F\phi \\ \dfrac{\partial\rho(\phi,\lambda)}{\partial\phi} & \dfrac{\partial\rho(\phi,\lambda)}{\partial\lambda} \end{bmatrix} P_r^{-1} P_r\, \delta u = -P_l^{-1} \begin{bmatrix} M\phi - \lambda F\phi \\ \rho(\phi, \lambda) \end{bmatrix}, \quad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}, \quad (2.16)$$

which is implemented as

$$P_l^{-1} \begin{bmatrix} M - \lambda F & -F\phi \\ \dfrac{\partial\rho(\phi,\lambda)}{\partial\phi} & \dfrac{\partial\rho(\phi,\lambda)}{\partial\lambda} \end{bmatrix} P_r^{-1} y = -P_l^{-1} \begin{bmatrix} M\phi - \lambda F\phi \\ \rho(\phi, \lambda) \end{bmatrix}, \quad \delta u = P_r^{-1} y, \quad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}. \quad (2.17)$$

This preconditioned system of nonlinear equations, meant to be solved using Newton-GMRES, serves as the foundation for a family of algorithms which can be

used to solve the k-eigenvalue problem. These algorithms can be distinguished in a number of ways. The type of preconditioning (left, right, split, or none) used and the specific preconditioner lead to fundamental differences between methods. The choice of $\rho$ which closes the system of equations is also a defining feature of any complete specification of the problem. Another important decision is the treatment of the Jacobian-vector product, which can either be treated exactly or by the finite-difference JFNK approximation, since a Newton-Krylov method is assumed. A myriad of other details can separate different implementations of Eq. (2.16), but some of the more fundamental distinctions will be examined first. It is important to note that whenever possible these methods seek to utilize procedures and data structures which are part of the standard power iteration described previously. Any overlap between the power method and the class of inexact Newton methods for the k-eigenvalue problem is beneficial because it reduces the amount of new code writing necessary if the new methods were implemented in an existing code.

2.2.1 Evaluating $\Gamma$

The first step in implementing any variation of Eq. (2.16) is considering the evaluation of the nonlinear function $\Gamma(u)$. This operation is trivial when $M$ and $F$ are explicitly stored, assuming a $\rho$ is chosen that is simple to evaluate; however, in most codes $M$ and $F$ will not be directly available. Consider separating the vector-valued function $\Gamma(u)$, of length $N + 1$, into $G$ components of length $n$, $\Gamma_g(u)$ $(g = 1, \ldots, G)$, and a scalar component, $\Gamma_\lambda(u)$, such that

$$\Gamma = [\Gamma_1^T \; \Gamma_2^T \; \ldots \; \Gamma_G^T \; \Gamma_\lambda]^T. \quad (2.18)$$

In block form the first $G$ blocks ($N$ elements) of $\Gamma(u)$ are given by

$$\begin{bmatrix} \Gamma_1 \\ \Gamma_2 \\ \vdots \\ \Gamma_G \end{bmatrix} = \left( \begin{bmatrix} M_1 & -\Sigma_{s21} & \cdots & -\Sigma_{sG1} \\ -\Sigma_{s12} & M_2 & \cdots & -\Sigma_{sG2} \\ \vdots & \vdots & \ddots & \vdots \\ -\Sigma_{s1G} & -\Sigma_{s2G} & \cdots & M_G \end{bmatrix} - \lambda \begin{bmatrix} \chi_1\nu\Sigma_{f1} & \cdots & \chi_1\nu\Sigma_{fG} \\ \chi_2\nu\Sigma_{f1} & \cdots & \chi_2\nu\Sigma_{fG} \\ \vdots & \ddots & \vdots \\ \chi_G\nu\Sigma_{f1} & \cdots & \chi_G\nu\Sigma_{fG} \end{bmatrix} \right) \begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_G \end{bmatrix} \quad (2.19)$$

so that the operations associated with evaluating $\Gamma$ can be broken down on a group-by-group basis. Consider evaluating $\Gamma$ for group 1:

$$\Gamma_1(u) = M_1\phi_1 - \sum_{g'=2}^{G} \Sigma_{sg'1}\,\phi_{g'} - \lambda\chi_1 \sum_{g'=1}^{G} \nu\Sigma_{fg'}\,\phi_{g'} = M_1\phi_1 - s_1^{up} - \lambda f_1. \quad (2.20)$$

For a general group, $g$,

$$\Gamma_g(u) = M_g\phi_g - \sum_{g'=g+1}^{G} \Sigma_{sg'g}\,\phi_{g'} - \sum_{g'=1}^{g-1} \Sigma_{sg'g}\,\phi_{g'} - \lambda f_g = M_g\phi_g - \left( s_g^{up} + s_g^{down} + \lambda f_g \right). \quad (2.21)$$

The final component of $\Gamma(u)$ depends on the choice of $\rho$. One way to choose $\rho$ is to impose some normalization condition on the scalar flux vector, $\phi$. For instance, throughout most of this chapter we use $\rho$ to seek a scalar flux which satisfies $\|\phi\|_2 = 1$, in which case we can write $\rho = -\frac{1}{2}\phi^T\phi + \frac{1}{2}$, such that $\rho$ is a measure of how far from unity the $L_2$ norm of the scalar flux is. In this case

$$\Gamma(u) = \begin{bmatrix} M\phi - \lambda F\phi \\ -\frac{1}{2}\phi^T\phi + \frac{1}{2} \end{bmatrix}. \quad (2.22)$$

Any normalization condition could be used in place of the one used here, but Stewart [16] points out that choosing this specific normalization condition results in a Newton correction that is optimal in the sense that it is orthogonal to the previous correction. In this case $\Gamma_\lambda$ is given by

$$\Gamma_\lambda(u) = -\frac{1}{2}\left( \phi_1^T\phi_1 + \phi_2^T\phi_2 + \cdots + \phi_G^T\phi_G \right) + \frac{1}{2}, \quad (2.23)$$

resulting in a method to evaluate $\Gamma(u)$ that emulates the group-wise procedure of the outer-inner iteration scheme.

The capability to construct each of the components of $\Gamma(u)$ is presumed to exist in any diffusion code capable of performing power iterations. The matrix $M_g$ will likely be stored explicitly or in some format which takes advantage of its sparsity. If a Krylov method is being used to solve the within-group linear equations in the traditional outer-inner structure then it is certainly possible to find $M_g\phi_g$, since this is the matrix-vector product required by the Krylov method.

The upscattering and downscattering sources, $s_g^{up}$ and $s_g^{down}$ respectively, should also be available. In the previous example of a system with no upscattering, $s_g^{down}$ was explicitly calculated during the construction of the source term, $q_g$, of the within-group linear system. The upscattering source, when present, is calculated in the same manner, and codes which perform upscattering iterations should have the ability to construct this term. The fission source term, $f_g$, is also available since it too is an explicit component in the source term $q_g$.

Algorithm 2.1 Group-wise Evaluation of $\Gamma(u)$
  $\Gamma_\lambda = 0$
  calculate $f = \sum_{g'=1}^{G} \nu\Sigma_{fg'}\,\phi_{g'}$
  for $g = 1, 2, \ldots, G$ do
    $\Gamma_g = M_g\phi_g - \left( s_g^{up} + s_g^{down} + \lambda f_g \right)$
    $\Gamma_\lambda = \Gamma_\lambda + \phi_g^T\phi_g$
  end for
  $\Gamma_\lambda = \frac{1}{2}\left( 1 - \Gamma_\lambda \right)$

The nonlinear function $\Gamma = [\Gamma_1^T \; \Gamma_2^T \; \ldots \; \Gamma_G^T \; \Gamma_\lambda]^T$ can thus be evaluated via Algorithm 2.1 using operations which are also used in the standard outer-inner solution procedure.

2.2.2 Evaluating $\Gamma'$

If a Jacobian-Free Newton-Krylov method is being used then no consideration need be given to ways of evaluating the true Jacobian-vector product of the nonlinear system; all that is necessary is the capability to evaluate $\Gamma$, as was demonstrated previously. However, we will show that the true Jacobian-vector product can be evaluated using a set of operations similar to those used to evaluate $\Gamma$. Using the same $\rho$ (seeking $\|\phi\|_2 = 1$) the Jacobian, $\Gamma'$, is exactly given by

$$J(u) = \begin{bmatrix} M - \lambda F & -F\phi \\ -\phi^T & 0 \end{bmatrix}. \quad (2.24)$$

However, in the context of a Newton-Krylov method it is unnecessary to explicitly calculate the Jacobian in this manner; instead, the construction of the Jacobian-vector product $J(u)\,v$ is all that is needed. Suppose the vector $v$ is partitioned by energy group as $\Gamma$ was in the previous section, i.e., $v = [v_1^T \; v_2^T \; \ldots \; v_G^T \; v_\lambda]^T$, or equivalently $v = [v_\phi^T \; v_\lambda]^T$, where $v_\phi = [v_1^T \; \ldots \; v_G^T]^T$. Using this notation the Jacobian-vector product can be written

$$J(u)\,v = \begin{bmatrix} (J(u)v)_1 \\ \vdots \\ (J(u)v)_g \\ \vdots \\ (J(u)v)_G \\ (J(u)v)_\lambda \end{bmatrix} = \begin{bmatrix} M_1 v_1 - \sum_{g'=2}^{G} \Sigma_{sg'1}\, v_{g'} - \lambda f_1(v_\phi) - v_\lambda f_1(\phi) \\ \vdots \\ M_g v_g - \sum_{g'=1}^{g-1} \Sigma_{sg'g}\, v_{g'} - \sum_{g'=g+1}^{G} \Sigma_{sg'g}\, v_{g'} - \lambda f_g(v_\phi) - v_\lambda f_g(\phi) \\ \vdots \\ M_G v_G - \sum_{g'=1}^{G-1} \Sigma_{sg'G}\, v_{g'} - \lambda f_G(v_\phi) - v_\lambda f_G(\phi) \\ -\left( \phi_1^T v_1 + \phi_2^T v_2 + \cdots + \phi_G^T v_G \right) \end{bmatrix} \quad (2.25)$$

which can again be evaluated on a group-wise basis. Finding $J(u)\,v$ is a very similar process to finding $\Gamma(u)$ as determined by Algorithm 2.1, with the only modification comprising the substitution of the arbitrary vector $v$ for the iterate of the eigenpair, $u$, in some instances as indicated in Eq. (2.25). The vectors $(Jv)_g$ contain a matrix-vector product of $M_g$ with $v_g$, the scattering sources which are calculated using $v_\phi$ rather than $\phi$, the product of the actual eigenvalue iterate, $\lambda$, with the group fission source due to $v_\phi$ rather than $\phi$, and finally the product of the actual group fission source and the last component of the arbitrary vector $v$, namely $v_\lambda$. Thus, evaluating $J(u)\,v$ is nearly identical to the evaluation of $\Gamma(u)$ as defined for $M\phi = \lambda F\phi$, except that $v_\phi$ has replaced $\phi$ and an additional term $v_\lambda f_g(\phi)$ must be calculated. As was shown earlier, though, $f(v_\phi)$ or $f(\phi)$ only needs to be calculated once since $f_g = \chi_g f$. In algorithmic form, the calculation of $J(u)v$ looks as shown in Algorithm 2.2 and requires very few additional calculations beyond what is required in Algorithm 2.1. Again, if the choice of $\rho$ were different, then the $(Jv)_\lambda$ component would necessarily be calculated differently, just as with the evaluation of $\Gamma$. However, it is unlikely that a $\rho$ would be chosen which is any more difficult to calculate than any of the components of $(Jv)_\phi$. The evaluation of $\Gamma'$ as described uses many of the procedures necessary in the traditional power iteration solution of the k-eigenvalue problem.

Algorithm 2.2 Group-wise Evaluation of $\Gamma'(u)v$
  $(J(u)v)_\lambda = 0$
  calculate $f(\phi) = \sum_{g'=1}^{G} \nu\Sigma_{fg'}\,\phi_{g'}$
  calculate $f(v_\phi) = \sum_{g'=1}^{G} \nu\Sigma_{fg'}\, v_{g'}$
  for $g = 1, 2, \ldots, G$ do
    $(J(u)v)_g = M_g v_g - \sum_{g'=1}^{g-1} \Sigma_{sg'g}\, v_{g'} - \sum_{g'=g+1}^{G} \Sigma_{sg'g}\, v_{g'} - \lambda f_g(v_\phi) - v_\lambda f_g(\phi)$
    $(J(u)v)_\lambda = (Jv)_\lambda - \phi_g^T v_g$
  end for

The scattering sources and fission source are constructed, although the sources are due to an arbitrary vector $v$ that is determined by the Krylov linear solver, and not the current estimate of the scalar flux. The difficulty in adapting an existing code to calculate these quantities for an arbitrary input is wholly dependent upon the specific implementation of the power iteration, but in an ideal situation one would have access to these calculations via a subroutine or function call. With well-defined means of evaluating $\Gamma$ and $\Gamma'$ (or more specifically $\Gamma' v$, as required by GMRES) for a specific choice of $\rho$, it is now possible to begin constructing specific implementations of Eq. (2.16) which can be used to solve the k-eigenvalue problem.

2.3 Generalized Eigenvalue Problem

The most obvious implementation of Eq. (2.16) is a direct implementation with no preconditioning, given that $\Gamma$ can be evaluated and the Jacobian can be accessed via the JFNK approximation or directly via a matrix multiplication inside GMRES, with the choice of $\rho$ specified previously. A single Newton step of this system is given by

$$\begin{bmatrix} M - \lambda F & -F\phi \\ -\phi^T & 0 \end{bmatrix} \delta u = -\begin{bmatrix} M\phi - \lambda F\phi \\ -\frac{1}{2}\phi^T\phi + \frac{1}{2} \end{bmatrix}, \quad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}, \quad (2.26)$$

where the Newton iteration index has been suppressed.
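To make Algorithms 2.1 and 2.2 concrete, the sketch below evaluates $\Gamma(u)$ and $J(u)v$ group-wise for a flattened iterate $u = [\phi^T \; \lambda]^T$ and wraps them in a bare Newton-GMRES loop for Eq. (2.26). It is a minimal sketch under the same assumptions as before (SciPy sparse blocks, illustrative names), not the implementation used to generate the results of Chapter 3; substituting the finite-difference product of the upcoming Eq. (2.27) for `jac_vec` turns NK(GEP) into JFNK(GEP).

```python
import numpy as np
import scipy.sparse.linalg as spla
from scipy.sparse.linalg import LinearOperator

def gamma(u, M, S, nu_sigf, chi, n, G):
    """Algorithm 2.1: group-wise evaluation of Gamma(u), Eq. (2.22)."""
    phi, lam = u[:-1], u[-1]
    f = sum(nu_sigf[g] * phi[g*n:(g+1)*n] for g in range(G))   # Eq. (2.8)
    out = np.empty_like(u)
    for g in range(G):
        r = M[g] @ phi[g*n:(g+1)*n] - lam * chi[g] * f
        for gp in range(G):
            if gp != g:                     # up- and down-scattering sources
                r -= S[(gp, g)] @ phi[gp*n:(gp+1)*n]
        out[g*n:(g+1)*n] = r
    out[-1] = 0.5 * (1.0 - phi @ phi)       # Gamma_lambda, Eq. (2.23)
    return out

def jac_vec(u, v, M, S, nu_sigf, chi, n, G):
    """Algorithm 2.2: exact Jacobian-vector product of Eq. (2.25)."""
    phi, lam = u[:-1], u[-1]
    vphi, vlam = v[:-1], v[-1]
    f_phi = sum(nu_sigf[g] * phi[g*n:(g+1)*n] for g in range(G))
    f_v = sum(nu_sigf[g] * vphi[g*n:(g+1)*n] for g in range(G))
    out = np.empty_like(v)
    for g in range(G):
        r = (M[g] @ vphi[g*n:(g+1)*n]
             - lam * chi[g] * f_v - vlam * chi[g] * f_phi)
        for gp in range(G):
            if gp != g:
                r -= S[(gp, g)] @ vphi[gp*n:(gp+1)*n]
        out[g*n:(g+1)*n] = r
    out[-1] = -(phi @ vphi)
    return out

def newton_gmres(u0, args, tol=1e-8, eta=1e-2, max_newton=50):
    """Unpreconditioned NK(GEP): inexact Newton with GMRES on Eq. (2.26)."""
    u = u0.copy()
    for _ in range(max_newton):
        F = gamma(u, *args)
        if np.linalg.norm(F) < tol:
            break
        J = LinearOperator((u.size, u.size),
                           matvec=lambda v, u=u: jac_vec(u, v, *args))
        # 'rtol' is named 'tol' in SciPy releases before 1.12.
        du, _ = spla.gmres(J, -F, rtol=eta, restart=30, maxiter=3)
        u = u + du
    return u[:-1], u[-1]
```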

The process of solving this system using Newton's method then proceeds as described in either Algorithm 1.11 or 1.13: accessing the true Jacobian or using the JFNK approximation,

$$J(u)\,v \approx \frac{\Gamma(u + \epsilon v) - \Gamma(u)}{\epsilon}, \quad (2.27)$$

where $\Gamma$ is given by Eq. (2.22). The backtracking globalization procedure detailed in Algorithm 1.12 can also trivially be used in conjunction with these methods. The two methods which arise from this implementation of Eq. (2.16) will be referred to as NK(GEP) and JFNK(GEP), where GEP refers to the Generalized Eigenvalue Problem with no preconditioning, NK indicates the Newton-Krylov solution, and JFNK indicates that the Jacobian-Free Newton-Krylov approximation is used. The general naming scheme used will be similar in the sense that NK refers to methods where the Jacobian is accessed directly via matrix-vector multiplication while JFNK indicates the use of the JFNK approximation. A reference to the type of preconditioning used will be given as a parenthetical suffix, where in this instance GEP refers to the base problem, i.e. no preconditioning.

The preconditioning options available are too numerous to survey exhaustively, and there is no viable way of determining the best, or even an effective, preconditioner for all problems without trying them out. As discussed in Chapter 1, incomplete factorizations are often used with Krylov methods, though they are of little use here since $J(u)$ is never explicitly constructed, and is instead approximated with JFNK or accessed only through matrix-vector multiplication. Therefore the preconditioning options explored were those easiest to implement assuming implementation in a code already capable of performing standard power iterations. Also, while it has been seen that linear systems can be preconditioned on the left, right, or using split preconditioning, all of the preconditioners used were implemented via right preconditioning unless otherwise stated.

2.3.1 Preconditioning with the Diffusion Operator

One of the simplest choices of a preconditioner is the group diffusion operator, $M_g$. This is a matrix that one must necessarily have access to in order to solve the problem.

Consider the matrix $\widetilde{M}$, defined by

$$\widetilde{M} = \begin{bmatrix} M_1 & & & & \\ & M_2 & & & \\ & & \ddots & & \\ & & & M_G & \\ & & & & 1 \end{bmatrix}. \quad (2.28)$$

This is simply the block diagonal of $M$ with an additional row and column added to account for the $\rho$ relation in the nonlinear system. Since this is a block-diagonal matrix, we know that $\widetilde{M}^{-1}$ is simply

$$\widetilde{M}^{-1} = \begin{bmatrix} M_1^{-1} & & & & \\ & M_2^{-1} & & & \\ & & \ddots & & \\ & & & M_G^{-1} & \\ & & & & 1 \end{bmatrix}, \quad (2.29)$$

where finding $M_g^{-1}$ is already a capability of any code with standard power iterations. As seen at the beginning of the chapter, the inner iterations in diffusion theory refer to the solution of $M_g\phi_g = q_g$, which effectively implies computing $M_g^{-1}$ operating on the vector $q_g$. This capability can therefore be recycled for use as a preconditioner in the context of the Inexact Newton family of methods. This preconditioner can be implemented in exactly the same way inner iterations are solved in diffusion theory; if the linear within-group equations are solved using a Krylov method, however, FGMRES must be used instead of GMRES in the Newton step. Implementation of this preconditioner can be described by Eq. (2.17), where $\rho$ is chosen so that the $L_2$ norm of the scalar flux is unity upon convergence, the $P_l^{-1}$ matrix is ignored (set to the identity matrix) and $P_r^{-1} = \widetilde{M}^{-1}$. Staying true to the naming scheme developed previously, the two variants of this right-preconditioned system of equations are termed NK(M) and JFNK(M).
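A sketch of how the $\widetilde{M}^{-1}$ application might reuse the inner-solve machinery, running a few CG sweeps per group (the iteration cap here is illustrative). Because the result of such an iterative application varies from one Krylov vector to the next, this is precisely the case in which FGMRES must replace GMRES, as noted above.

```python
import numpy as np
import scipy.sparse.linalg as spla

def apply_Mtilde_inv(y, M, n, G, inner_iters=5):
    """Right preconditioner of Eq. (2.29): z = Mtilde^{-1} y, group by group."""
    z = np.empty_like(y)
    for g in range(G):
        zg, _ = spla.cg(M[g], y[g*n:(g+1)*n], maxiter=inner_iters)
        z[g*n:(g+1)*n] = zg
    z[-1] = y[-1]   # the appended entry for the rho relation passes through
    return z
```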

2.3.2 Preconditioning with the IC Factorization of the Diffusion Operator

A logical extension of the previous preconditioner is to consider an incomplete factorization of the $M_g$ matrices. Since it is known that these matrices are SPD it is possible to apply to them the Incomplete Cholesky factorization. In numerical tests the specific case of no fill-in, IC(0), is implemented due to its simplicity, though higher levels of fill may offer enhanced performance:

$$P_r = \begin{bmatrix} L_1 L_1^T & & & & \\ & L_2 L_2^T & & & \\ & & \ddots & & \\ & & & L_G L_G^T & \\ & & & & 1 \end{bmatrix}. \quad (2.30)$$

In a typical outer-inner power iteration solution of the k-eigenvalue problem in diffusion theory the within-group matrices, $M_g$, may actually use IC(0) as a preconditioner for CG. Since this choice of $P_r$ is still based on $\widetilde{M}$, the collection of within-group diffusion operators with no fission or scattering components, we can generalize the concept of using the preconditioner of the within-group iterations as a preconditioner for GMRES; any preconditioner developed for $M_g$ can be applied to the GMRES solution. While $\widetilde{M}$ can be thought of as a preconditioner based on solving the within-group problem, the preconditioner of $\widetilde{M}$ can itself be directly applied to the Inexact Newton method, potentially resulting in a saving in computational cost. When the IC(0) factorization of $M_g$ is used as a preconditioner for Newton's method it will be referred to as JFNK(IC) and NK(IC). The IC approach, unlike the others considered, does not rely on any iterative methods during the preconditioning process since direct substitution can be used instead.
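SciPy exposes no incomplete Cholesky routine, so the sketch below stands in ILU with no fill (`spilu` with `fill_factor=1`) for the IC(0) factorization of each SPD block; a true IC(0) would halve the factor storage but is applied the same way, by one forward and one back substitution per group with no inner iterations.

```python
import scipy.sparse.linalg as spla

def build_ic0_like_preconditioner(M, n, G):
    """Eq. (2.30)-style preconditioner: factor each M_g once up front,
    then apply by direct substitution (no iterative method involved)."""
    facs = [spla.spilu(M[g].tocsc(), drop_tol=0.0, fill_factor=1)
            for g in range(G)]
    def apply(y):
        z = y.copy()
        for g in range(G):
            z[g*n:(g+1)*n] = facs[g].solve(y[g*n:(g+1)*n])
        return z          # last entry (the rho row) is left unchanged
    return apply
```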

2.3.3 Including Fission Terms in the Preconditioner

To this point the preconditioners considered only approximated the $M$ component in the main block of the Jacobian, $M - \lambda F$, and even then the scattering terms in $M$ were dropped. Instead we now drop $\lambda$ but retain $F$, resulting in the preconditioning matrix

$$P_r = \begin{bmatrix} M - F & 0 \\ 0 & 1 \end{bmatrix}. \quad (2.31)$$

Generally $F$ is not available as a specific matrix, instead only being used to build the spatial distribution of the fission source as necessary. However, if this preconditioner is implemented using a Krylov method (GMRES, since there is no reason to think it will be symmetric), then only $(M - F)\,v$ is needed, where $v$ is an arbitrary vector. Thus if Algorithm 2.1, used to evaluate $\Gamma$, accepts $v$ and $\lambda = 1$ rather than $\phi$ and the current $\lambda$ value, then it can be used in conjunction with GMRES for preconditioning purposes. Again, if a Krylov method is being used for preconditioning it is essential that the linear Newton step be solved using FGMRES and not the standard GMRES iteration. This approach has the benefit of requiring almost no additional coding on top of the implementation of the Inexact Newton method, but could potentially provide little or no benefit since $P_r^{-1}$ is quite similar to the true Jacobian. This type of preconditioning is signified by M-F, such that the Inexact Newton methods resulting from it are termed JFNK(M-F) and NK(M-F). One trivial variation of this approach is to use $M - \lambda F$, where $\lambda$ is the actual current eigenvalue iterate.

2.3.4 Preconditioning with Power Iterations

The preconditioners developed to this point have been relatively simple modifications of the original Jacobian. Now instead we consider the entire $M$ operator, scattering included, and examine its use as a preconditioner. As in previous situations $M$ is augmented such that

$$P_r = \begin{bmatrix} M & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad P_r^{-1} = \begin{bmatrix} M^{-1} & 0 \\ 0 & 1 \end{bmatrix}, \quad (2.32)$$

where applying $M^{-1}$ is the major calculation associated with the power iteration. Recall the traditional power iteration is given by

$$\phi^{(l+1)} = \lambda^{(l)} \left( M^{-1} F \right) \phi^{(l)}, \quad (2.33)$$

which is solved using the technique described at the beginning of this chapter. When power iteration is used as a right preconditioner the within-group fission source $f_g$ (a sub-vector of $F\phi$) is replaced by the appropriate sub-vector of the vector being acted on by $P_r^{-1}$. In other words, if the right-preconditioned system is given by

$$\begin{bmatrix} M - \lambda F & -F\phi \\ \dfrac{\partial\rho(\phi,\lambda)}{\partial\phi} & \dfrac{\partial\rho(\phi,\lambda)}{\partial\lambda} \end{bmatrix} \begin{bmatrix} M^{-1} & 0 \\ 0 & 1 \end{bmatrix} y = -\begin{bmatrix} M\phi - \lambda F\phi \\ \rho(\phi, \lambda) \end{bmatrix}, \quad \delta u = \begin{bmatrix} M^{-1} & 0 \\ 0 & 1 \end{bmatrix} y, \quad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}, \quad (2.34)$$

the evaluation of

$$\begin{bmatrix} M^{-1} & 0 \\ 0 & 1 \end{bmatrix} y$$

is then simply the result of a standard power iteration, with $\lambda f_g$ in the within-group equation replaced by $y_g$, where $y = [y_1^T \; \ldots \; y_G^T \; y_\lambda]^T$, and with $y_\lambda$ unchanged during the evaluation. This approach, however, runs into difficulties if upscattering is present. In the traditional power iteration the upscattering source can be constructed from the previous scalar flux vector because it is assumed that $\phi^{(l)}$ is an acceptable approximation to $\phi^{(l+1)}$ for a sufficiently large value of $l$. Such an approximation is not readily available when applying $M^{-1}$ to an arbitrary vector, and the physical interpretation of the inversion of $M$ is of no use. A good approximation of $M^{-1}$ in the presence of upscattering would likely require another level of nested iteration, analogous to upscattering iterations in the power method. For this reason left preconditioning is favored over right preconditioning when using power iterations, because the mechanics of the entire power iteration can be retained. In fact, left preconditioning using power iteration is a sufficiently important option to warrant a detailed discussion in the following section. Preconditioning on the right using power iterations has nevertheless been implemented and tested numerically in this work, and is denoted by the names JFNK(rPI) and NK(rPI).
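For a downscatter-only problem, applying this right preconditioner amounts to one power-iteration-style sweep in which the block $y_g$ of the incoming Krylov vector takes the place of $\lambda f_g$; a rough sketch, with illustrative names and an arbitrary inner-iteration cap:

```python
import numpy as np
import scipy.sparse.linalg as spla

def apply_rPI(y, M, S, n, G, inner_iters=5):
    """P_r^{-1} of Eq. (2.32), no upscattering: forward substitution by
    group through M (scattering included), with y_g replacing lambda*f_g."""
    z = np.empty_like(y)
    for g in range(G):
        q = y[g*n:(g+1)*n].copy()
        for gp in range(g):                     # downscattering from above
            q += S[(gp, g)] @ z[gp*n:(gp+1)*n]
        zg, _ = spla.cg(M[g], q, maxiter=inner_iters)
        z[g*n:(g+1)*n] = zg
    z[-1] = y[-1]                               # y_lambda is unchanged
    return z
```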

2.4 JFNK Acceleration of Power Iteration

If the full $M$ operator is used as a left preconditioner for Eq. (2.16) the resulting Newton system is

$$\begin{bmatrix} M^{-1} & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} M - \lambda F & -F\phi \\ \dfrac{\partial\rho(\phi,\lambda)}{\partial\phi} & \dfrac{\partial\rho(\phi,\lambda)}{\partial\lambda} \end{bmatrix} \delta u = -\begin{bmatrix} M^{-1} & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} M\phi - \lambda F\phi \\ \rho(\phi, \lambda) \end{bmatrix}. \quad (2.35)$$

Carrying out this operation symbolically results in

$$\begin{bmatrix} I - \lambda M^{-1}F & -M^{-1}F\phi \\ \dfrac{\partial\rho(\phi,\lambda)}{\partial\phi} & \dfrac{\partial\rho(\phi,\lambda)}{\partial\lambda} \end{bmatrix} \delta u = -\begin{bmatrix} (I - \lambda M^{-1}F)\phi \\ \rho(\phi, \lambda) \end{bmatrix}. \quad (2.36)$$

For the moment we neglect the Jacobian and just consider the nonlinear equation in this instance, which we will refer to as $\Gamma_{acc}$ for reasons that will become clear shortly,

$$\Gamma_{acc} = \begin{bmatrix} (I - \lambda M^{-1}F)\phi \\ \rho(\phi, \lambda) \end{bmatrix}. \quad (2.37)$$

Again recalling that the matrix representation of the traditional power iteration is given by

$$\phi^{(l+1)} = \lambda^{(l)} \left( M^{-1}F \right) \phi^{(l)}, \quad (2.38)$$

$\Gamma_{acc}$ can be written

$$\Gamma_{acc} = \begin{bmatrix} \phi^{(l)} - \phi^{(l+1)} \\ \rho(\phi, \lambda) \end{bmatrix}. \quad (2.39)$$

Suppose that $\rho$ is chosen such that

$$\Gamma_{acc} = \begin{bmatrix} \phi^{(l)} - \phi^{(l+1)} \\ \lambda^{(l)} - \lambda^{(l+1)} \end{bmatrix}, \quad (2.40)$$

which could just as easily be used to measure the difference between $k$ estimates by replacing $\lambda$ with its reciprocal. In any case, this nonlinear function $\Gamma_{acc}$ now represents the difference between some $u = [\phi^T \; \lambda]^T$ and the $\bar{u} = [\bar{\phi}^T \; \bar{\lambda}]^T$ that results from the application of a single power iteration. Note that the specific update formula used to map $k^{(l)}$ to $k^{(l+1)}$ has not been mentioned explicitly but only implied to exist. Given $\Gamma_{acc}$, an Inexact Newton method can again be constructed

which is capable of solving the k-eigenvalue problem. The Jacobian of this Newton method appears in Eq. (2.36), where $\rho$ is a function of the update formula used for the eigenvalue. The action of this Jacobian can be calculated since we know how to approximate $M^{-1}$ via modifications to the power method. However, it appears that using the JFNK approximation is advantageous in this situation because it reduces the total number of times the action of $M^{-1}$ is required per GMRES iteration.

2.4.1 PI as Fixed-Point Iteration

To better understand this approach, and to fully appreciate the manner in which it differs from those using other preconditioners, we will derive the same system from a much different starting point: by considering the power iteration to be a fixed-point iteration on the vector $u$, which is composed of the scalar flux vector and the scalar eigenvalue, such that

$$u^{(l+1)} = \Phi(u^{(l)}), \quad (2.41)$$

where

$$\Phi(u) = \begin{bmatrix} \lambda M^{-1}F\phi \\ \hat{\lambda}(\phi, \lambda) \end{bmatrix}, \quad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}. \quad (2.42)$$

The vector denoted by $\Phi$ contains an updated estimate of the eigenpair, $\phi$ and $\lambda$, and is of length $N + 1$. The function $\hat{\lambda}(\phi, \lambda)$ represents some weighting function which updates the previous estimate of $k$ to the new one, generally something along the lines of the update formulas mentioned during the description of the k-eigenvalue problem in diffusion theory. This is the exact same power iteration that was described before, now compactly written as $u^{(l+1)} = \Phi(u^{(l)})$, given some initial guess $u^{(0)} = [\phi^{(0)T} \; \lambda^{(0)}]^T$. In the instance of no upscattering the evaluation of $\Phi$ updates the scalar flux using the procedure described by Eqs. (2.9) and updates the eigenvalue using $\hat{\lambda}$, such that each evaluation of $\Phi$ is equivalent to a single update of $\phi$ and $\lambda$, otherwise known as an outer iteration. Let the function $\Gamma_{acc}(u)$ be given by

$$\Gamma_{acc}(u) = u - \Phi(u) \quad (2.43)$$

$$= \begin{bmatrix} (I - \lambda M^{-1}F)\phi \\ \lambda - \hat{\lambda}(\phi, \lambda) \end{bmatrix}, \quad (2.44)$$

so that $\Gamma_{acc}(u) = 0$ leads to $u = \Phi(u)$, which implies that the fixed-point iteration is converged. So by solving the nonlinear system $\Gamma_{acc}(u) = u - \Phi(u) = 0$, it is possible to find an eigenpair.

Actual implementation of this method requires very little beyond what is already implemented in codes capable of performing outer iterations. The overwhelming majority of operations in this algorithm deal with evaluations of $\Gamma_{acc}$. Each Newton iteration requires one evaluation of $\Gamma_{acc}$ to find the residual of the linear system, and each Krylov iteration requires one evaluation of $\Gamma_{acc}$ for each Jacobian-vector product, $Jv$, formed. The number of outer iterations (evaluations of $\Gamma_{acc}$) and inner iterations resulting from this algorithm can thus be tabulated and the performance compared to the traditional and accelerated power iteration. The iterative solver used for the within-group equations and any associated preconditioning already existing in a code would generally be used as-is; the function $\Phi$ can be treated like a black box in this regard. It must be emphasized that the implementation of the power iteration, $\Phi$, is of no interest when using this method, as only the difference between some vector $u$ and $\Phi(u)$ is ever required to evaluate the nonlinear function. This means that any normalization of the flux and any particular update formula for the eigenvalue are compatible with the method as coded.

It has been shown that this particular implementation of the Inexact Newton method can be considered a left-preconditioned variation of the Newton method defined by $\Gamma$, and can be derived by reinterpreting the power method as a fixed-point iteration. For this reason we refer to the new nonlinear function as $\Gamma_{acc}$, because it yields a Newton method capable of accelerating the fixed-point iteration $\Phi$, which in this case is the power method. Implementations of this approach are termed JFNK(PI), with no NK version due to the nature of the Jacobian for this preconditioner.
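Because only the difference $u - \Phi(u)$ is needed, JFNK(PI) can be sketched in a few lines around any routine that performs one outer iteration. Here the hypothetical black box `one_outer_iteration` stands for that routine, and SciPy's `newton_krylov` supplies the inexact Newton-GMRES machinery with finite-difference Jacobian-vector products; this is an illustration of the idea, not the wrapping used around the test code of Chapter 3.

```python
import numpy as np
from scipy.optimize import newton_krylov

def make_gamma_acc(one_outer_iteration):
    """Gamma_acc(u) = u - Phi(u), Eq. (2.43); `one_outer_iteration` maps a
    stacked iterate u = [phi, lambda] to the next power iterate Phi(u)."""
    def gamma_acc(u):
        return u - one_outer_iteration(u)
    return gamma_acc

# Hypothetical usage, with a flat flux and k^(0) = 1 (so lambda^(0) = 1):
#   u0 = np.concatenate([phi_flat / np.linalg.norm(phi_flat), [1.0]])
#   u = newton_krylov(make_gamma_acc(one_outer_iteration), u0,
#                     method='gmres', f_tol=1e-8)
#   phi, lam = u[:-1], u[-1]
```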

Interestingly, a function very similar to $\Gamma_{acc}$ can be arrived at via the method proposed by Mahadevan and Ragusa [21]. If the eigenvalue problem $M\phi = \frac{1}{k}F\phi$ is written $\left( M^{-1}F \right)\phi = k\phi$ then it can be considered an unconstrained optimization problem of order $N + 1$. Using Newton's method to solve this problem results in a linear system built on

$$\hat{\Gamma}_{acc}(u) = \begin{bmatrix} \left( M^{-1}F - kI \right)\phi \\ -\frac{1}{2}\phi^T\phi + \frac{1}{2} \end{bmatrix}, \quad u = \begin{bmatrix} \phi \\ k \end{bmatrix}, \quad (2.45)$$

where the normalization to unity is imposed by the final equation in the nonlinear system. Using $\hat{\Gamma}_{acc}(u)$ instead of $\Gamma_{acc}$ with the JFNK approximation to Newton's method also results in a JFNK method which can be used to find an eigenpair of the diffusion system. If one takes $\hat{\Gamma}_{acc}(u) = 0$ and multiplies the first $N$ equations by $(-\lambda)$ then these $N$ equations will be identical to those in $\Gamma_{acc}$. Mathematically, $\Gamma_{acc}$ and $\hat{\Gamma}_{acc}(u)$ are quite similar, though they result from two distinct approaches: one regards the eigenvalue problem as an optimization problem, yielding $\hat{\Gamma}_{acc}(u)$, while the new function $\Gamma_{acc}$ arises from a desire to accelerate the preexisting power iteration algorithm. Though mathematically similar, the implementations need not be, since nothing about the evaluation of $M^{-1}F\phi$ is dictated by the functions themselves; $\Gamma_{acc}$ has been formulated with the intent of employing preexisting outer-inner iteration structures.

2.4.2 Summary of Methods

The number of possible implementations of Eq. (2.16) is practically endless because of the number of preconditioning schemes conceivable. Examining the schemes discussed so far, we see a number of ways which can be used to distinguish between implementations. Perhaps the most interesting is the difference between JFNK(PI) and all of the others. Arriving at JFNK(PI) from Eq. (2.16) required the same transformation (by $M^{-1}$) used to obtain the power method from the natural statement of the k-eigenvalue problem, $M\phi = \lambda F\phi$. The Newton method that resulted

from this transformation is likewise a Newton method that solves the standard eigenvalue problem, hence the calculation contains many of the same quantities as the traditional power method. So while this method may seem to stand out as unique, it has more in common with the standard solution algorithm for the neutron k-eigenvalue problem than any of the others. Treating the k-eigenvalue problem as a generalized eigenvalue problem is in fact a much more drastic departure from standard techniques.

The advantage of solving the generalized eigenvalue problem is that there are no inversions of $M$ involved in the solution process. Using JFNK(GEP) or NK(GEP) requires only one level of iteration (GMRES) per Newton iteration. When preconditioning is included in GMRES another level of iteration potentially exists, and as we have seen this preconditioning can involve $M^{-1}$. However, the cost of preconditioning iterations is generally cheaper than applying $M^{-1}$ because simpler matrices are used. Even in the case where $M$ itself is used as a preconditioner, its purpose is simply to reduce the number of GMRES iterations, so the accuracy of $M^{-1}$, if implemented iteratively, need only be high enough to achieve this task, unlike in the power method where the action of $M^{-1}$ is an important part of the entire process. However, in practical applications the power method does not require a high-quality inversion of $M$ at each iteration, given that $\phi_g^{(l-1)}$ is used as the initial guess for the within-group problem, as this would lead to oversolving in early iterations. Ultimately the inner iterations are converged loosely at the cost of an increased number of power iterations. The Inexact Newton methods presented may utilize any number of calculations similar to the solution of the within-group problem for the purposes of preconditioning, allowing for the reuse of code from power iterations, but fundamentally the two approaches are quite distinct.

It is difficult to estimate the relative computational cost of the various methods without implementation and testing, mainly because the cost and effect of the iterative tolerances necessary to ensure convergence to the desired eigenmode cannot be guessed beforehand. For instance, in the JFNK(PI) approach, how good must the estimate of $M^{-1}$ be? For NK(M-F), how accurately must the action of $(M - F)^{-1}$ be approximated to result in a well-conditioned system? Due to the interplay of potentially nested levels of iteration it is not possible to decide a priori which approach is least expensive. It could be said that methods using the true Jacobian

as opposed to the JFNK approximation should prove to be more accurate, but the price of this accuracy may be too high. In all cases it has been shown how to construct the Inexact Newton method, along with its preconditioner, using calculations that can safely be assumed to exist in traditional power-iteration-based codes, though the scope of these calculations varies greatly. In the case of JFNK(PI) practically nothing is required of the code other than the capability to perform a single power iteration; the only data necessary to drive the new iterative scheme is the difference between two successive power iterates. Quantities like $\Gamma$ and $\Gamma'$ demand more access to the source code, at times requiring access to functions that calculate the scattering sources or fission source due to an arbitrary flux and eigenvalue estimate.

The treatment of upscattering is also a point of interest when comparing the methods developed to this point. The standard power iteration, depending on the implementation, either performs additional iterations over the thermal groups or commingles the upscattering source with the eigenvalue (power) iteration. JFNK(PI) mimics this approach when applying $M^{-1}$ since it relies on the stock power iteration solution present in a code. All of the other Inexact Newton methods do not require that upscattering be paid any extra attention. The forward-substitution (by group) solution approach favored by power iteration is abandoned for the Newton approach, where $\phi$ is updated simultaneously via the Newton step: no energy group is treated differently than the others.

A listing of the various implementations of Eq. (2.16) described so far is presented in Table 2.1. The name of each method is given along with the nonlinear system the method is based on, which includes the choice of $\rho$. The treatment of the Jacobian is given by referencing the actual Jacobian matrix, in which case GMRES will perform the $J(u)\,v$ multiplication as described by Algorithm 2.2, or by referencing the formula for the JFNK approximation. If the JFNK approximation is used then $\Gamma$ will be whatever nonlinear function is given in the Nonlinear Equation column. The type of preconditioning used is given in the final column by referencing the preconditioning matrix and listing the type (right, left, or split) in parentheses. The actual implementation of the indicated preconditioner is not discussed at this point but will be mentioned before numerical results are presented.

Table 2.1: Summary of Inexact Newton Methods for Diffusion Theory

    Name        Nonlinear Equation    Jacobian      Preconditioning
    JFNK(GEP)   Eq. (2.22)            Eq. (2.27)    None
    NK(GEP)     Eq. (2.22)            Eq. (2.24)    None
    JFNK(M)     Eq. (2.22)            Eq. (2.27)    Eq. (2.28) (Right)
    NK(M)       Eq. (2.22)            Eq. (2.24)    Eq. (2.28) (Right)
    JFNK(IC)    Eq. (2.22)            Eq. (2.27)    Eq. (2.30) (Right)
    NK(IC)      Eq. (2.22)            Eq. (2.24)    Eq. (2.30) (Right)
    JFNK(M-F)   Eq. (2.22)            Eq. (2.27)    Eq. (2.31) (Right)
    NK(M-F)     Eq. (2.22)            Eq. (2.24)    Eq. (2.31) (Right)
    JFNK(rPI)   Eq. (2.22)            Eq. (2.27)    Eq. (2.32) (Right)
    NK(rPI)     Eq. (2.22)            Eq. (2.24)    Eq. (2.32) (Right)
    JFNK(PI)    Eq. (2.43)            Eq. (2.27)    None

An overview of the entire process is given in Algorithm 2.3, which combines many of the items previously discussed.

Algorithm 2.3 Inexact Newton for k-eigenvalue Problem
  1: $u^{(0)}$ = arbitrary initial guess
  2: for $m = 1, 2, \ldots$ do
  3:   Solve $J(u^{(m)})\,\delta u = -\Gamma(u^{(m)})$
  4:     where $\|\Gamma(u^{(m)}) + J(u^{(m)})\,\delta u\| \le \eta^{(m)} \|\Gamma(u^{(m)})\|$
  5:     and $Jy \approx \frac{1}{\epsilon}\left[ \Gamma(u + \epsilon y) - \Gamma(u) \right]$  {if Jacobian-Free}
  6:   Set $\delta u^{(m)} = \delta u$, $\eta^{(m)} = \eta$
  7:   while $\|\Gamma(u^{(m)}) + J(u^{(m)})\,\delta u^{(m)}\| > (1 - \alpha(1 - \eta))\,\|\Gamma(u^{(m)})\|$ do
  8:     Choose $\theta \in [\theta_{min}, \theta_{max}]$
  9:     $\theta\,\delta u^{(m)} \rightarrow \delta u^{(m)}$,  $1 - \theta(1 - \eta) \rightarrow \eta$
  10:  end while
  11:  $u^{(m+1)} = u^{(m)} + \delta u^{(m)}$
  12: end for

83 69 important part of the GMRES iterations with the specific details of the preconditioning determined by the method in use. Line 4 imposes the Inexact Newton criteria, i.e. determines the criteria used to decide when δu is good enough. The choice of η is important because it drives the convergence rate as previously discussed. Lines 6 10 are only present in globalized (also called globally convergent) implementations of the Inexact Newton method. The choice of θ in Line 8 will be discussed in Section 2.5. Line 11 updates the current flux and eigenvalue pair to values corresponding to the iterate. To this point a broad class of Newton-based algorithms has been proposed for the calculation of the fundamental eigenpair in multigroup diffusion theory. A number of practical issues surrounding JFNK/Inexact Newton methods have been discussed in Chapter 1, such as the JFNK perturbation parameter and the forcing factor in the Inexact Newton method. Still, a number of issues need to be covered before a practical implementation of the method is feasible. A number of these issues are addressed in the following section. 2.5 Practical Issues Some minor practical aspects of Algorithm 2.3 will be discussed briefly in this section, including: convergence criteria for the Newton iteration, initial guess for the Newton iteration, and the backtracking (globalization) process. Potential problems in using Newton s method to determine a specific eigenvalue (in this case an extreme eigenvalue) will also be mentioned Convergence of Newton Iteration Any use of an iterative method ultimately leads to the question of when a calculation can be considered converged and Newton s method is no different. However Newton s method can easily calculate one error measure explicitly, the residual of the nonlinear system at each iteration. Thus, if Γ(u (m) ) = r (m) = 0 then u (m) is considered an exact solution. In practice, it is rarely possible to calculate an exact solution due to the nature of floating-point arithmetic, but the residual vector is still the simplest way to measure convergence. The simplest test is r (m) < τ,

where $\|\cdot\|$ is some vector norm and $\tau$ some pre-chosen real positive scalar value. This only requires the additional calculation of a vector norm at each Newton iteration. An alternative is to require a decrease in the initial residual, $\|r^{(0)}\| / \|r^{(m)}\|$, by a factor $\tau$, or to cease iterating when $\|\delta u\|$ drops below some specified value. Though these tests are all valid (and simple to perform), they are tests specific to Newton's method. For the purposes of this work it is desirable that the power iteration method and the Newton-based methods terminate based on the same criteria, so that the performance of the different approaches can be directly and fairly compared. Rather than introducing the residual of the nonlinear system into the power method, the Newton iterations will be terminated based on tests similar to those used in power iteration. The popular simulation software PARCS [52] uses three criteria to determine convergence: the absolute error in the eigenvalue, an $L_2$ measure of the flux error, and an $L_\infty$ measure of the flux error. The spatial distribution of the fission source can also be used in place of the flux when measuring error, as done in DIF3D [47]. Using the fission source, the convergence tests can be written

$$\left| k^{(l+1)} - k^{(l)} \right| < \epsilon_k \quad (2.46a)$$

$$\frac{\left\| f^{(l+1)} - f^{(l)} \right\|_2}{\left( f^{(l+1)}, f^{(l)} \right)^{1/2}} < \epsilon_2 \quad (2.46b)$$

$$\max_i \frac{\left| f_i^{(l+1)} - f_i^{(l)} \right|}{f_i^{(l+1)}} < \epsilon_\infty \quad (2.46c)$$

where the subscript $i$ denotes a spatial index. Since these are simply measures of the changes in $f$ and $k$ between iterations they can easily be implemented in the Newton-based approaches, with the measurements being taken between two successive Newton iterates. Acceleration of the power method using JFNK can actually use successive power iterates in the error measurements, which are calculated when finding the residual of $\Gamma_{acc}$ during each Newton step. From an implementation (and elegance) standpoint it is simpler to base convergence on some property of the Newton iteration (a specified decrease in the residual and/or some criterion placed on the Newton step), but using Eqs. (2.46) ensures that solutions produced by the Newton approaches meet the same criteria as the solution of the power method.
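As a small illustration, the three tests of Eqs. (2.46) can be bundled into a single routine applied between successive iterates — outer iterates for the power method or Newton iterates for the new methods; the tolerance defaults shown here are placeholders, not the values adopted in Chapter 3.

```python
import numpy as np

def converged(k_new, k_old, f_new, f_old,
              eps_k=1e-6, eps_2=1e-5, eps_inf=1e-4):
    """Stopping tests of Eqs. (2.46), measured on the fission source f."""
    test_k = abs(k_new - k_old) < eps_k                             # (2.46a)
    test_2 = (np.linalg.norm(f_new - f_old)
              / np.sqrt(f_new @ f_old)) < eps_2                     # (2.46b)
    test_inf = np.max(np.abs(f_new - f_old) / f_new) < eps_inf      # (2.46c)
    return test_k and test_2 and test_inf
```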

2.5.2 Initial Newton Guess

Though the initial guess can be truly arbitrary, it is best to choose an initial guess which would also be appropriate if using the power method. Specifically this means a real, positive scalar flux, since the solution will also have these properties. Ideally an eigenvalue can be chosen which is close to the iterative limit, though in the case of reactor problems a uniform unit value for both the fission source and the eigenvalue is usually a good choice. Often in the power method a flat-flux initial guess is chosen, and this works well in conjunction with the Inexact Newton methods described as well. This flat flux can even be normalized so that it satisfies the normalization condition sought (if any) via the $\rho$ equation. It is also possible to perform a small number of power iterations and use the result as the initial guess for the Newton method. If this approach is adopted then the question becomes finding the number of initial power iterations which decreases the number of Newton iterations required but does not significantly contribute to the overall computational cost. Newton's method is more sensitive with regard to the initial guess than the power method.

2.5.3 Backtracking Implementation

The implementation of backtracking as shown in Algorithm 2.3 is rather straightforward, with the exception of choosing the factor $\theta$, which is selected based on a method proposed by Eisenstat and Walker [28]. The idea is to find the $\theta$ in some range $\theta \in [\theta_{min}, \theta_{max}]$ that minimizes a polynomial $p(\theta)$. The polynomial is quadratic and constructed by setting $p(0) = g(0)$, $p(1) = g(1)$, and $p'(0) = g'(0)$, where the function $g(\theta)$ is given by $\|\Gamma(u + \theta\,\delta u)\|_2^2$. The $\theta$ which minimizes this quadratic is

$$\theta = \frac{-\left( \Gamma(u), J(u)\,\delta u \right)}{\left\| \Gamma(u + \delta u) \right\|_2^2 - 2\left( \Gamma(u), J(u)\,\delta u \right) - \left\| \Gamma(u) \right\|_2^2}, \quad (2.47)$$

though bounds are usually chosen, as mentioned earlier, to keep $\theta$ in a range which is numerically stable; in their work Eisenstat and Walker chose $\theta \in [0.1, 0.5]$.
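A sketch of the resulting step-length selection, clipping the minimizer of the quadratic model to the Eisenstat-Walker range; all inputs are vectors already produced by the Newton step.

```python
def backtrack_theta(G0, G1, Jdu, theta_min=0.1, theta_max=0.5):
    """theta of Eq. (2.47): G0 = Gamma(u), G1 = Gamma(u + du), Jdu = J(u) du."""
    num = -(G0 @ Jdu)
    den = (G1 @ G1) - 2.0 * (G0 @ Jdu) - (G0 @ G0)
    theta = num / den if den != 0.0 else theta_max
    # Clip to [theta_min, theta_max] for numerical stability.
    return min(max(theta, theta_min), theta_max)
```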

In a practical backtracking implementation it may also happen that the while loop beginning on Line 7 of Algorithm 2.3 exits only after too many iterations, proving to be unacceptably expensive. For this reason the loop is often terminated after some fixed number of backtracking steps even if the condition on Line 7 is not satisfied. Backtracking early in the Newton process may also be prohibitive, motivating the practice of executing the backtracking step only once the Newton residual falls below some predetermined value.

2.5.4 Potential Difficulties

Another interesting practical consideration regarding the Newton-type approaches concerns the converged eigenvalue and its associated eigenvector. The intent is to use these methods to determine the fundamental mode, but there is no guarantee that the converged solution will correspond to this mode, as any eigenpair, i.e. the higher harmonics, will perfectly satisfy the nonlinear equations which have been defined. It is known that the multigroup diffusion equations have a unique eigenvalue with largest modulus and an associated positive eigenvector which is distinct [12, 44]. In practice, using an initial value of $k^{(0)} = 1$ and a flat $\phi$ such that $\|\phi^{(0)}\|_2 = 1$ has, in the numerical experiments detailed in the next chapter, invariably resulted in convergence to the fundamental mode. However, by choosing an initial guess which is artificially close to a higher-harmonic mode it is possible to converge to that mode using any of the Newton-based approaches, and even the power method (assuming iterations are terminated because Eqs. (2.46) are satisfied). Mahadevan and Ragusa [21] in fact depend on this behavior when they use IRAM to generate initial guesses for a JFNK-based solver. It seems likely that the only way to guarantee convergence to the fundamental mode is to begin with an initial guess which is sufficiently close to that mode, where "sufficiently" is used in the sense of the local convergence analysis of Newton's method. However, spending any significant amount of computational effort on generating a good initial guess defeats the purpose of the acceleration method. Numerical results will show that, in general, using a flat-flux initial guess and a reasonable guess for $k$ is almost always sufficient to yield convergence to the fundamental mode, though initializing the Newton run with a small number of standard power iterations can result in an improvement, i.e. a reduced number of iterations, versus the Newton-based methods by themselves.

In the following chapter numerical results for a number of well-defined test

problems will be used to compare the Newton-based approaches to two implementations of the power method: unaccelerated and Chebyshev accelerated. Many details regarding practical implementations of these methods, not considered in this chapter, will be discussed. Many of the numerical parameters necessary are specific to the Inexact Newton approach and do not apply to traditional power iteration solutions. One such parameter, briefly mentioned, was the choice of initializing Newton's method with standard power iterations. Other Newton-specific parameters will be examined, such as, but not limited to, the inexact Newton forcing factor, the JFNK perturbation parameter, and the convergence of the inner iterations when using JFNK as an acceleration technique. While these quantities are not especially significant in the presentation of the methods, it will be seen that from a performance perspective some parameters are extremely important.

CHAPTER 3

Diffusion Theory Numerical Results

Numerical results were generated for a number of sample problems using a code which implements each of the methods listed in Table 2.1. The test code is also capable of performing standard power iterations and Chebyshev-accelerated power iterations, using the implementation of Chebyshev acceleration of the fission source described by Ferguson and Derstine [46]. This code will be used to examine the performance of the Newton-based methods relative to that of the unaccelerated and accelerated power method. The numerical parameters involved in the Newton-based approaches will also be studied to determine which values result in the best performance. Other aspects of the Newton calculation will also be examined, such as the effect of backtracking on convergence and overall execution time. The performance of the preconditioners developed will also be compared in order to determine which is most effective at reducing the computational cost of the eigenvalue calculation.

3.1 Test Code

The test code used to implement the various eigenproblem solution algorithms for diffusion theory is written in Fortran 90/95 and is intended to solve the two-dimensional multigroup diffusion equations. The spatial discretization used is

mesh-centered finite-difference in Cartesian geometry, with support for vacuum (Marshak) and reflective boundary conditions. Sparse storage is utilized for the $M_g$ matrices; specifically, the Compressed Sparse Column (CSC) [5] format is implemented. A corresponding matrix-vector multiplication routine takes advantage of the sparse storage, greatly reducing the number of operations necessary to perform matrix-vector multiplications. The conjugate gradient algorithm was used to solve the within-group problem, $M_g\phi_g = q_g$, during the power iteration process, with the option to precondition these equations using the Incomplete Cholesky factorization of $M_g$; the DLAP [65] Fortran implementations of CG and preconditioned CG were used. Power iteration was implemented as described by Duderstadt and Hamilton [42], with $\lambda$ updated via

$$\lambda^{(l+1)} = \lambda^{(l)} \frac{\left( f^{(l+1)}, f^{(l)} \right)}{\left( f^{(l+1)}, f^{(l+1)} \right)} \quad (3.1)$$

and Chebyshev acceleration implemented as described by Ferguson [46]. The $L_2$ norm of the new scalar flux iterate, $\phi^{(l+1)}$, was set to unity at each outer iteration.

The module associated with the Newton methods was designed to interface with the diffusion code primarily through function calls. The JFNK acceleration of the power method, JFNK(PI), requires only a function call to the subroutine that performs outer iterations. The other Newton-based approaches require access to more intermediary functions in the outer-inner iterative process, such as multiplication of a vector by $M_g$ and construction of the fission and scattering sources from an arbitrary vector. GMRES was used to solve the linear Newton step in all of the Newton-based processes, utilizing Saad's SPARSKIT [66]. When the Krylov iterations associated with the linear Newton step are preconditioned with GMRES or CG (or other Krylov methods), GMRES is replaced by FGMRES, with the SPARSKIT implementations used for both levels of iteration. However, when power iterations were used as a preconditioner the DLAP solution of the within-group problem was retained. SPARSKIT also contains the implementation of FGMRES that was used when necessary. The numerical parameters associated with the Newton-based methods and with the inner iterations are supplied to the program through input, with appropriate values determined by the numerical experiments described in the following sections.
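For reference, the outer-iteration bookkeeping described above — the $\lambda$ update of Eq. (3.1) followed by renormalization of the flux — reduces to a couple of lines; this is a sketch of the update step only, not an excerpt from the Fortran test code.

```python
import numpy as np

def update_lambda_and_normalize(lam, f_old, f_new, phi_new):
    """Eq. (3.1) eigenvalue update, then enforce ||phi||_2 = 1."""
    lam_new = lam * (f_new @ f_old) / (f_new @ f_new)   # Eq. (3.1)
    phi_new = phi_new / np.linalg.norm(phi_new)
    return lam_new, phi_new
```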

The code outputs relevant iteration data for each of the methods so that the convergence rate can be estimated based on the error reduction per iteration. Upon convergence, the converged eigenvalue and eigenvector are stored, along with the error, number of iterations, and execution time. The types of iterations counted and reported in the output file differ for each method: the power method (with and without Chebyshev acceleration) counts both traditional outer and inner iterations; the JFNK-accelerated power iteration tracks outer and inner iterations as well as Newton and GMRES iterations; while both of the algorithms for the generalized eigenvalue problem record the number of Newton and GMRES iterations. Though the methods share many common operations, there is no good common iteration count which can be used in direct comparisons — one based solely on the number of iterations, where a single iteration represents a fixed amount of computational load across the various algorithms. For this reason the methods' performance ultimately needs to be compared via total execution time. However, when comparing the performance of the Newton methods to one another, iteration counts prove to be useful.

3.2 Test Problem Descriptions

Four reactor models were considered as test problems for the Newton-based methods developed in this work. Two Pressurized Water Reactor (PWR) problems are considered: the IAEA 2-D Benchmark and the Biblis Reactor Benchmark. Also considered are a Boiling Water Reactor (BWR) model and a CANDU core, which is a Heavy Water Reactor (HWR). Brief descriptions of these problems, along with cross section and geometry data, and solution information, are presented in this section.

3.2.1 IAEA Benchmark

The IAEA 2-D benchmark is specified in the Argonne Benchmark Problem Book [67], Supplement 2, identification 11-A2. It is a static benchmark which is a simplification of a larger 3-D benchmark. A quarter-core symmetric model was used with reflective boundary conditions on the inner surfaces and vacuum conditions

on the outside surfaces, where the reflector was extended into the void region to create a square model. The assembly pitch is 20 cm and the total dimensions of the model are 170 cm × 170 cm, as seen in Figure 3.1a, which contains the geometry and the material assignment to regions for each of the test problems. Two energy groups are modeled with no upscattering. The buckling value provided in the benchmark specification has been incorporated into the cross sections, which are listed in Table 3.1, along with the cross sections for all of the problems considered.

Table 3.1: Diffusion Test Problems Cross Sections. For each material $M$ of the IAEA 2-D Benchmark, the Biblis Test Problem, the 2-D BWR Problem, and the 2-D CANDU Model, the table lists $D_1$, $D_2$ (cm), $\Sigma_{R1}$, $\Sigma_{R2}$ (cm$^{-1}$), $\Sigma_{12}$ (cm$^{-1}$), and $\nu\Sigma_{f1}$, $\nu\Sigma_{f2}$ (cm$^{-1}$). (The numerical entries did not survive transcription.)

The IAEA Benchmark model is characterized by sharp flux gradients due to the

presence of absorbent rods. The reference value of $k$ for the IAEA 2-D problem is $k = $ …. Unless stated otherwise the mesh used for this problem is a uniform mesh of 170 × 170 cells, corresponding to a uniform cell size per dimension, $\Delta$, of 1 cm.

Figure 3.1: Diffusion Test Problems Material Assignment — panels (a) IAEA, (b) Biblis, (c) BWR, (d) CANDU. In all models the top and right edges are modeled with vacuum boundary conditions and the bottom and left boundaries are modeled with reflective boundary conditions. Black lines indicate the original extent of the reflector, which was extended to fit the square domain pictured.

3.2.2 Biblis Problem

The Biblis 2-D test case is fully specified in the paper by Hébert [68], although it was originally proposed by Finneman and Wagner. Eight distinct materials and two energy groups are modeled in the Biblis test case, which has a checkerboard pattern due to fuel reloading; see Figure 3.1b. The pitch is … cm and the reactor is modeled using quarter-core symmetry. The overall dimensions of the model are … cm × … cm. Again, the reflector region has been extended into the void to complete the square, and reflective and vacuum boundary conditions have been used on the interior and outer surfaces, respectively. The reference solution for the test case has a multiplication factor of $k = $ …. The Biblis test case used a uniform mesh, unless otherwise indicated, which yields a $\Delta$ of about 1.05 cm.

3.2.3 BWR

The BWR configuration tested was taken from a 3-D BWR transient problem in the Argonne Benchmark Problem Book [67], Supplement 2, identification 14-A1. Only one plane of the BWR is modeled, using the initial conditions of the transient problem; see Figure 3.1c. Two types of fuel are modeled, each with and without absorbing material, as well as the reflector. The reflector has again been extended into the void to complete the square, and the interior and outer surfaces are modeled by reflective and vacuum boundary conditions, respectively. The pitch in the BWR model is 15 cm, the total dimensions of the quarter-core symmetric model are 165 cm × 165 cm, and again two energy groups are modeled. The two sets of results provided for this problem indicate a criticality factor, $k$, in the range of …. BWR results were generated on a uniform 165 × 165 mesh, i.e. $\Delta$ = 1 cm. Any other meshes used will be stated explicitly.

3.2.4 CANDU

The CANDU core model is a variation of the multi-dimensional HWR problem described in the Argonne Benchmark Problem Book [67], Supplement 3, identification 17-A1. This problem is the reduction of a 3-D transient to a 2-D problem, and here only the initial condition of the transient problem is considered. The

model is composed of only three materials: inner and outer cores and the reflector, which has once again been extended into the void region; see Figure 3.1d. The dimensions of the quarter-core symmetric model are 390 cm × 390 cm, where reflective and vacuum boundary conditions have been used on the interior and outer surfaces, respectively. Cross sections are given for a two-group energy discretization and for the initial flux configuration; the reference value of the multiplication factor given is $k = $ …. The mesh spacing for the CANDU problem is larger, with $\Delta$ = 2 cm, corresponding to a uniform 195 × 195 mesh. Variations of this mesh will be clearly indicated.

3.3 Numerical Experiments

Input files were created for these test problems and the eigenproblem solution methods detailed in Chapter 2 were tested with the software developed. The models were used to run preliminary experiments designed to gauge the sensitivity of the JFNK and Newton-Krylov methods to the various parameters set at the user's discretion. The convergence rate of the Newton-based methods was also examined using these problems. Finally, performance comparisons between the Newton-based methods and the power method will be drawn based upon numerical results.

To study the effect various parameters have on the performance of the considered methods, an extensive set of results was generated as each parameter was varied while all others were kept constant. The base case is given by $\epsilon_{outer} = \{5 \times 10^{-6}, …, …\}$, where $\epsilon_{outer} = \{\epsilon_k, \epsilon_2, \epsilon_\infty\}$, with the selected values taken from the PARCS manual [52]. For methods requiring inner iterations the maximum number of iterations per group was set at 5 and the tolerance on the iterations was $\epsilon_{inner} = $ …. The GMRES iterations (to solve the linear Newton step) used a constant $\eta = 10^{-2}$ and a maximum of 90 iterations, with a restart every 30 iterations. The Newton-based methods were started using an initial guess generated by 10 traditional power iterations, which themselves were initialized using a flat-flux guess and $k^{(0)} = 1$. The perturbation parameter $\epsilon$ was set using the empirical formula $\epsilon = (1 + \|u\|)\sqrt{\epsilon_{mach}}$, where $u$ is the current Newton iterate and $\epsilon_{mach}$ is the machine precision.

One more parameter worth mentioning is the preconditioning of the inner (within-group) iterations, which is actually more of an implementation issue. The methods which utilize inner iterations use the Incomplete Cholesky (IC) factorization as a preconditioner. While the preconditioning succeeds in greatly reducing the number of iterations required, it can potentially lead to an increase in execution time when compared to a non-preconditioned approach, depending on the problem size. This is due to the fact that there is a non-negligible upfront cost associated with the factorization of the $M_g$ matrices. All results shown in this chapter which use power iteration as a preconditioner include the IC preconditioner implementation from DLAP [65].

3.3.1 Perturbation Parameter

The parameter $\epsilon$ is the perturbation size in the finite-difference approximation to the Jacobian-vector product. Five different methods to find a proper $\epsilon$ were used, taken from Knoll and Keyes [33] and Xu and Downar [34]. The empirical formulas tested were

$$\epsilon_0 = \sqrt{\epsilon_{mach}}, \qquad \epsilon_1 = \sqrt{\epsilon_{mach}}\,\max\left( \|u\|, 1 \right),$$
$$\epsilon_2 = \left( 1 + \|u\| \right)\sqrt{\epsilon_{mach}}, \qquad \epsilon_3 = \frac{\sqrt{\epsilon_{mach}}}{N+1} \sum_i |u_i|, \quad (3.2)$$

where $u$ is the most recent Newton iterate and $N + 1$ is the size of the nonlinear system defined by $\Gamma$. A fifth perturbation parameter tested was determined using a formula developed by Xu and Downar that seeks to find an optimal perturbation size by minimizing the round-off and truncation errors [34]. This set of numerical experiments used the JFNK(PI) and JFNK(IC) preconditioned forms of Newton's method along with the JFNK(GEP) formulation, which is not preconditioned. Running each set of parametric experiments with each preconditioner would have been prohibitively expensive, so the conclusions drawn using these three implementations are taken to be representative of the entire family of methods. Results for this set of runs are given in Figures 3.2 and 3.3 for JFNK(PI) and JFNK(IC), respectively.
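The four empirical candidates of Eq. (3.2) are trivial to compute; a sketch (the fifth, the Xu-Downar optimal $\epsilon$, requires error estimates and is omitted here):

```python
import numpy as np

def perturbation_candidates(u, N):
    """The four empirical choices of Eq. (3.2) for the JFNK perturbation."""
    root = np.sqrt(np.finfo(float).eps)      # sqrt of machine precision
    norm_u = np.linalg.norm(u)
    eps0 = root
    eps1 = root * max(norm_u, 1.0)
    eps2 = (1.0 + norm_u) * root
    eps3 = root * np.sum(np.abs(u)) / (N + 1)
    return eps0, eps1, eps2, eps3
```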

Figure 3.2: Impact of ε on JFNK(PI)

Figure 3.3: Impact of ε on JFNK(IC)

Figure 3.4: Impact of ε on JFNK(GEP)

Insensitivity to ε is a desirable result because it ultimately reduces the number of free parameters a user of the method would be responsible for choosing. The work by Xu and Downar highlights instances where using a less-than-optimal value of ε is associated with an increased computational cost, but when using the JFNK approximation to solve the k-eigenvalue problem associated with the multigroup diffusion equations, this does not appear to be a concern. Both of these preconditioners are applied effectively to the problem at hand. If one were to apply a less effective preconditioner, however, the results look slightly different. Taking the extreme example of using no preconditioner, JFNK(GEP), and running the same set of experiments results in Figure 3.4. Here we see an increased sensitivity to ε, but still nothing too drastic. Possibly, under certain conditions, the choice of ε could be the limiting factor when comparing one run to another, but the large number of numerical and problem parameters that have a stronger impact on performance makes encountering this situation unlikely.
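To make the role of ε concrete, the sketch below shows how the candidate formulas of Eq. (3.2) would enter a finite-difference Jacobian-vector product. This is a minimal illustration, not the dissertation's implementation: the residual evaluator gamma is a hypothetical callable standing in for Γ, and the GMRES vector v is assumed to have unit norm, as Krylov basis vectors do.

```python
import numpy as np

EPS_MACH = np.finfo(float).eps

def perturbation(u, choice="eps2"):
    """Candidate perturbation sizes reconstructed from Eq. (3.2)."""
    if choice == "eps0":
        return np.sqrt(EPS_MACH)
    if choice == "eps1":
        return np.sqrt(EPS_MACH) * max(np.linalg.norm(u), 1.0)
    if choice == "eps2":
        return np.sqrt((1.0 + np.linalg.norm(u)) * EPS_MACH)
    if choice == "eps3":  # average magnitude of the N + 1 unknowns
        return np.sqrt(EPS_MACH) * np.sum(np.abs(u)) / u.size
    raise ValueError(choice)

def jfnk_matvec(gamma, u, v, choice="eps2"):
    """First-order finite-difference approximation to J(u) v."""
    eps = perturbation(u, choice)
    return (gamma(u + eps * v) - gamma(u)) / eps
```

In an actual solver Γ(u) would be cached between matrix-vector products, so each GMRES iteration costs one additional residual evaluation.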

Forcing Factor

Eight different methods of determining the forcing factor η were tested. The forcing factor is the tolerance which effectively determines the accuracy of the inversion of the Jacobian. That is, a Newton direction δu is chosen such that

\[ \|\Gamma(u^{(m)}) + J(u^{(m)})\,\delta u^{(m)}\| \le \eta^{(m)} \|\Gamma(u^{(m)})\|, \]

where η is the forcing factor being examined in this set of results. Three of the η values considered were fixed values of 10⁻¹, 10⁻², and 10⁻³, which result in increasingly better representations of J⁻¹. Two methods developed in Eisenstat [28] were used which dynamically change η depending on the values of certain quantities from the Newton iteration, termed Eisenstat-A and Eisenstat-B. Although very similar, the methods require that slightly different quantities be known, with method B being simpler to implement in this scenario because it does not require any additional Jacobian-vector products.

Figure 3.5: Impact of η on JFNK(PI)

Basic versions of Eisenstat-A and Eisenstat-B, respectively, are given by

\[ \eta^{(m)} = \frac{\bigl|\,\|\Gamma(u^{(m)})\| - \|\Gamma(u^{(m-1)}) + J(u^{(m-1)})\,\delta u^{(m-1)}\|\,\bigr|}{\|\Gamma(u^{(m-1)})\|} \]

and

\[ \eta^{(m)} = \gamma \left( \frac{\|\Gamma(u^{(m)})\|}{\|\Gamma(u^{(m-1)})\|} \right)^{\alpha}, \]

though the implemented versions are slightly altered, as suggested by the authors, to increase numerical stability. The simple η value of Dembo [27], given by Eq. (1.19),

\[ \eta^{(m)} = \min\left\{ \frac{1}{m + 2},\; \|\Gamma(u^{(m)})\| \right\}, \]

was also considered. One choice specific to this problem was considered, where η⁽ᵐ⁾ = ε₂⁽ᵐ⁻¹⁾, with ε₂ defined by Eq. (2.46b). Lastly, the method of An et al. [29] was used, with p₁, p₂, and p₃ set to 0.3, 0.6, and 0.9, respectively.

Figure 3.6: Impact of η on JFNK(IC)

The An et al. method works by taking the ratio of the actual to the predicted Newton step size and using this quantity to choose η as a function of the previous η. This method is too involved to discuss in detail here but is clearly explained in the cited work.

All of the Newton-based approaches are sensitive to the choice of η, regardless of whether the JFNK approximation or the action of the actual Jacobian is used. In fact, the choice of η is the critical factor in the local convergence of an inexact Newton method, and thus the choice impacts both the efficiency and robustness of Newton's method. For this set of runs three different implementations of the newly developed algorithms were used: JFNK(PI), JFNK(IC), and NK(IC). Results are given in Figures 3.5, 3.6, and 3.7, respectively. Unlike the choice of ε, the choice of η can be seen to have a significant impact on performance for all of the methods tested. Unfortunately, there is no overwhelmingly superior choice evident. Poor choices, however, are much easier to spot. The constant value of 10⁻³ is the least efficient of the fixed values tested, consistently resulting in more GMRES iterations than 10⁻¹ and, with very few exceptions, 10⁻².

Figure 3.7: Impact of η on NK(IC)
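The dynamic rules reconstructed above are simple enough to sketch. The snippet below is an illustrative implementation, not the dissertation's code; the safeguard bounds are assumptions, and Eisenstat-A is omitted because it needs the linear-model residual from the previous GMRES solve.

```python
def forcing_factor(res_new, res_old, m, method="eisenstat_b",
                   gamma=0.9, alpha=2.0):
    """Choose eta for the next inexact Newton step from the residual
    norms ||Gamma(u^(m))|| (res_new) and ||Gamma(u^(m-1))|| (res_old)."""
    if method == "eisenstat_b":
        # Eta tracks the observed residual reduction rate.
        eta = gamma * (res_new / res_old) ** alpha
    elif method == "dembo":
        # Tighten as 1/(m+2), but no looser than the residual norm itself.
        eta = min(1.0 / (m + 2), res_new)
    else:
        raise ValueError(method)
    # Assumed safeguards: keep eta inside a sane interval.
    return min(max(eta, 1e-10), 0.9)
```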

Using the algorithm of Dembo, the maximum point-wise error in the fission source, and the An algorithm are all expensive at times. Though occasionally resulting in a low iteration count, these three choices are rarely counted among the better performers for a given problem. This leaves the constant values of 10⁻¹ and 10⁻² and the two algorithms of Eisenstat as serious contenders. None of these is consistently a bad choice, and choosing from among them using this limited set of numerical results is nearly arbitrary, though it could be argued that Eisenstat-A offers a slight advantage. However, as the figures demonstrate, the problem-dependent nature of the best η choice practically guarantees that no single option is the best for consistent use. Ideally one of the dynamically changing algorithms for η would be used, because such an algorithm requires very little from the user and can be trusted to adjust η as necessary given the convergence behavior of the problem. These approaches are certainly competitive but by no means dominant. The behavior of these algorithms will be explored further shortly, in conjunction with an examination of the convergence behavior of the Newton iterations for different preconditioners.

Initial Power Iterations

Another quantity that directly impacts the performance of Newton-based methods is the initial guess used. The proximity of the initial guess to the solution (or in this case a solution) is extremely important, as convergence of Newton's method is a local and not a global property. For this reason the Newton-based approaches have been tested using a variety of initial guesses. These are generated by first performing some number, n_pi, of standard power iterations, which themselves were initiated with a flat flux such that ||φ⁽⁰⁾||₂ = 1 and k⁽⁰⁾ = 1. The sequence of numbers of initial power iterations used was set to {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25}. Figures 3.8 and 3.9 show selected results from this set of experiments, for JFNK(PI) and NK(IC), respectively (JFNK(IC) was also tested and found to nearly mirror the NK(IC) results). Although the Newton-based methods are theoretically sensitive to the initial guess, it does not appear that the number of initial power iterations used systematically results in an initial guess close enough to the true solution that the asymptotic convergence rate of Newton's method is achieved. Still, using some number of power iterations to initialize the Newton method is clearly beneficial; a sketch of this initialization is given below.
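The initialization procedure itself is simple. In the minimal sketch below, power_iteration is a hypothetical callable performing one traditional outer iteration of an existing diffusion solver.

```python
import numpy as np

def initial_guess(power_iteration, n_pi, n_dof):
    """Build the Newton starting point u = (phi, k) from n_pi power
    iterations, starting from a flat flux with unit 2-norm and k = 1."""
    phi = np.ones(n_dof) / np.sqrt(n_dof)
    k = 1.0
    for _ in range(n_pi):
        phi, k = power_iteration(phi, k)
    return np.concatenate([phi, [k]])
```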

Figure 3.8: Effect of Number of Initial Power Iterations to Initialize JFNK(PI)

Figure 3.9: Effect of Number of Initial Power Iterations to Initialize NK(IC)

This benefit is most obvious for JFNK(IC) and NK(IC), which share the trend seen in Figure 3.9. Generally, somewhere in the range of 5 to 10 power iterations results in the fewest total GMRES iterations. The savings in total number of iterations can be substantial in some cases. For instance, in Figure 3.9 the number of GMRES iterations is reduced by 50% when performing 7 or 8 power iterations to generate an initial guess (compared to performing 1 or 2). The trouble is that it is difficult to know beforehand how many power iterations will result in a minimal number of GMRES iterations. One should also keep in mind that these results only pertain to the test code used to generate them, since the effect of the power iterations is heavily dependent on the implementation. For instance, any acceleration of the power method will alter the behavior shown, and the convergence tolerance of the within-group iterations will also have an impact. In practical applications one would certainly opt to use accelerated power iterations to generate an initial guess, since they result in a smaller system residual for the same amount of work. Ideally some simple algorithm based on either the system residual or the power iteration relative errors could be developed so that Newton's method would begin whenever some near-optimal initial guess was calculated, though this idea is not pursued in this manuscript.

Inner Iterations

Inner iterations, in the context of power iteration, refer to the iterative process used to invert M_g during the solution of each within-group equation; if a direct solution method were used to invert M_g there would be no inner iterations necessary. The number of inner iterations (or the tolerance on the iterative process) determines how well M_g⁻¹ is approximated. This is only tangentially related to the Newton-Krylov family of methods, specifically only when power iteration is being used as a preconditioner. However, it is worth examining the effect of inner iterations even if only to ensure acceptable values are being used for the power method (and Chebyshev accelerated power method) when comparing execution times with the Newton methods. In much the same manner as η is used to prevent oversolving the Newton step, the inner iterations are also terminated in an attempt to avoid oversolving.

Early in the power iteration process the fission source is very inaccurate, and there is no benefit to solving the within-group problem to a high degree of precision in this case. Codes such as PARCS [52] generally place a small limit on the maximum number of iterations per group before terminating the inner iterative (in this case CG) process. Therefore, these numerical experiments examine what the appropriate maximum number of inner iterations is and whether it is beneficial in any case to fully converge the CG iterations to some error level.

These numerical experiments are performed only for the JFNK(PI), power, and Chebyshev accelerated power methods. The JFNK(rPI) and NK(rPI) methods are not included, even though they utilize inner iterations, because for them the preconditioner cannot be considered the dominant cost. For JFNK(PI), however, the power iteration is certainly the dominant cost, since the evaluation of the nonlinear function is equivalent to one power iteration. The number of power iterations used in JFNK(PI) can actually be compared directly to the power method to judge performance (confirmed by nearly identical power-iterations-per-second timings). First, the maximum number of inner iterations per group was varied, using values of {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25}. Converging to an error tolerance ε_inner, rather than placing a ceiling on the number of iterations, was also done, using the error tolerances {10⁻¹, 10⁻², 10⁻³, 10⁻⁴}.

The effect of capping the number of inner iterations used in the within-group problem is shown in Figure 3.10. The number of power iterations required undergoes a minimum, generally in the range of 2 to 7 maximum inner iterations per group, and eventually begins to increase as the maximum number of iterations allowed increases, with the peak occurring at the maximum value tested. The trend is not as pronounced in the case of Chebyshev acceleration but the results are similar; for the most part a maximum number of inner iterations in the range of 4 to 10 is preferable. JFNK(PI) in most cases requires the fewest iterations of the three methods, with the notable exception of cases where the maximum iteration number is small. For most problems JFNK(PI) also performs best when the maximum iteration number is in the range of 4 to 10, though the trend is less pronounced. Figure 3.11 shows the effect of the inner iteration tolerance on the total number of inner iterations for the power method (with and without Chebyshev acceleration) and JFNK(PI).

Figure 3.10: Effect of Maximum Number of Inner Iterations on Preconditioned Power-Iteration-Based Methods Measured in Terms of Cumulative Inner Iterations

For the unaccelerated power method, tightening the inner iteration convergence increases the total number of inner iterations required, while the Chebyshev accelerated power method is relatively insensitive to this parameter. The JFNK acceleration of the power method, JFNK(PI), trends the same as the power method for the most part, although the increases in the number of iterations are small enough to be almost negligible. A looser tolerance in this case appears to benefit JFNK(PI) when comparing it to Chebyshev acceleration, though as always there are exceptions, evidenced by the CANDU results.

In general, the results of this set of experiments indicate that a loose tolerance is a good choice, as is a relatively low maximum number of inner iterations per group. Converging the inner iterations too tightly, either by using a strict tolerance or by performing a larger number of iterations, is detrimental to performance because time is spent in early iterations accurately solving a within-group problem that has a very inaccurate fission source. Setting tolerances and iteration counts too low, however, can introduce instabilities into the Newton approach.
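As an illustration of the capped, loosely converged within-group solve just described, a minimal sketch follows. The interface is assumed, not taken from the dissertation's code: M_g stands for the within-group diffusion matrix, and the keyword rtol follows recent SciPy (older versions call it tol).

```python
from scipy.sparse.linalg import cg

def within_group_solve(M_g, b, phi_prev, max_inner=5, tol=1e-1):
    """Loosely invert M_g with CG, capping the iteration count.
    Warm-starting from the previous outer iterate means the 'skipped'
    work is recovered across later outer iterations."""
    phi, info = cg(M_g, b, x0=phi_prev, rtol=tol, maxiter=max_inner)
    return phi  # info > 0 (cap reached) is acceptable for loose solves
```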

Figure 3.11: Effect of Inner Iteration Tolerance on Preconditioned Power-Iteration-Based Methods Measured in Terms of Cumulative Inner Iterations

It should also be noted that the reason it is possible to use only a single inner iteration per outer iteration is that the initial guess for the inner iteration is always the most current flux. Therefore, for the fixed-point power iteration, whether the inner iterations are performed consecutively in a single block or separated by eigenvalue updates (outer iterations), they will eventually be performed and the converged solution is no less accurate.

3.4 Convergence of Newton-Based Methods

Prior to comparing results of the Newton-based approaches to power iteration, it is worthwhile to examine the convergence rate of the Newton-based approaches and the factors that determine its prominent features: specifically, which preconditioning schemes are most effective in terms of both iteration count and execution time, and what effect nested (within GMRES) iterative processes have on performance.

Full Convergence

The first case in this set of tests converges all iterative processes in the nested sequence as tightly as possible, since execution time is not of concern in the current test. This set of experiments seeks to confirm that if the GMRES iterations are converged tightly within each Newton step, then the superlinear local convergence rate of Newton's method is achieved. The values selected for the free parameters in the calculation are summarized in Table 3.2.

Table 3.2: Base Case for Convergence Tests
  Initial Power Iterations: 10
  Perturbation (ε): √((1 + ||u||) ε_mach)
  Eigenpair Criteria (ε_k, ε_2, ε_∞): (…, …, …)
  GMRES (dimension, restarts, η): (1000, 10, …)

Using convergence values this tight for the GMRES iterations closely resembles the direct solution of the linear Newton step. Also, since such a large number of GMRES iterations is permitted, it is expected that the effects of preconditioning will be small; even a non-preconditioned problem will converge when given a sufficiently large number of iterations. For preconditioners which utilize a nested level of iteration (power, M, and M-F), 25 iterations were used in each preconditioning step, which is more than sufficient. The backtracking loop was only used when the Newton residual was below 10⁻¹², mainly to avoid fluctuations in the residual due to precision issues. Plotting the L₂ norm of the Newton residual for the IAEA benchmark problem against the Newton iteration for all of the Newton methods, using the values in Table 3.2, results in Figure 3.12.

The results in this plot show that the Newton-based approaches converge quickly once in the asymptotic regime, beginning at Newton iteration 2 or 3. Convergence to machine precision is quite rapid in all cases, within 5 Newton iterations. All of the methods tested except for JFNK(PI) are located in a relatively tightly packed clump, and up to iteration 3 are nearly identical. This agrees with expectations: these are all solutions to the same problem using different preconditioners and Jacobian approximations. As mentioned previously, when using such a large subspace during the GMRES iteration, preconditioning is not really necessary.

Figure 3.12: Convergence of Newton Methods for IAEA Benchmark

The JFNK(PI) curve, marked by the cyan line with circles, does not coincide with the other methods because it solves the problem using a fundamentally different nonlinear equation due to the use of left preconditioning; left preconditioning effectively changes the iterative operator of the problem while right preconditioning does not. It can be seen that the convergence rate for this method is only slightly lower than the rate seen for the other methods. It is likely this rate could be improved by using a better approximation of M⁻¹, and results to this effect will be shown later in this section.

While this plot confirms that the Newton-based approaches are capable of achieving impressive convergence rates, the iteration tolerances that allow this are quite impractical. Ways to reduce the computational expense are examined in the remainder of this section. One simple way to reduce this cost is to use fewer GMRES iterations, mainly by loosening convergence of the GMRES iterations (larger η). Also, the most effective preconditioner is the one which results in the smallest execution time and not just the minimum number of iterations. This is an important distinction, since some preconditioners result in a very small total number of GMRES iterations but are actually expensive to apply at each iteration.
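The distinction between the two preconditioning sides can be summarized schematically. Writing P for a generic preconditioner (to avoid clashing with the within-group matrices M_g), the linear Newton step can be preconditioned as

\[ \underbrace{J P^{-1}}_{\text{right}} \,(P\,\delta u) = -\Gamma(u) \qquad \text{or} \qquad \underbrace{P^{-1} J}_{\text{left}}\; \delta u = -P^{-1}\,\Gamma(u). \]

Right preconditioning leaves the residual Γ(u), and hence the plotted Newton convergence history, unchanged, while left preconditioning alters the residual that the iteration monitors; this is why JFNK(PI), a left-preconditioned formulation, traces a visibly different curve.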

GMRES Performance

The previous numerical results show the convergence rates possible when execution time is not a concern. The results in this section show how the size of the Krylov subspace used by GMRES and the restart strategy affect the convergence of Newton's method. The effect of the conditioning of the linear system and the accuracy of J⁻¹ are also examined.

Subspace Size

One of the most important parameters for efficient use of GMRES is how many total iterations (or matrix-vector multiplies) to allow and how often to restart. Since GMRES retains basis vectors as it proceeds, it is periodically restarted to reduce the storage associated with the method. In theory, with no rounding error, GMRES with N + 1 iterations and no restarts would produce the exact solution of an (N + 1)-order linear set of equations. However, this is impractical, and the challenge is to find a subspace size that does not inhibit the efficient solution of the problem but is also small enough not to require unreasonable amounts of memory. Also, as the subspace size between restarts increases, each iteration becomes more expensive due to the orthogonalization of the new basis vector against all previous vectors.

The following results were generated before SPARSKIT was included in the code and thus use the DLAP implementation of GMRES. Ideally, nearly identical results would be obtained with SPARSKIT, but this is not the case. In fact, the SPARSKIT implementation of GMRES appears to be even more sensitive to the number of GMRES iterations performed (subspace size and number of restarts), which will be discussed shortly. Still, the point to be made is based on trends rather than hard numbers, so the specific implementation of GMRES used is not particularly important; the behavior of the Newton iteration with respect to the properties of the GMRES iterations is the trend meant to be illustrated through these numerical results.

In the Unlimited case of this section unreasonably large values were used, with 1000 basis vectors accumulated (1000 iterations between restarts) and 50 restarts allowed.

Two cases are now considered: one in which the total number of GMRES iterations is limited to 1000 and the maximum number of basis vectors is taken to be 500, 100, or 10, and a more restrictive case corresponding to 100 total GMRES iterations with the number of iterations between restarts set to 100, 50, and 10. All of these experiments were performed with η set to a fixed constant value. JFNK(PI), JFNK(GEP), and NK(GEP) were used in these experiments to show results both for poorly conditioned methods (JFNK(GEP) and NK(GEP)) and for a well-conditioned method (JFNK(PI)).

Figure 3.13: Convergence of Newton-Based Methods for IAEA Benchmark as a Function of GMRES Iterative Properties, 1000 Iteration Maximum

Results for the case where 1000 total GMRES iterations were allowed are given in Figure 3.13; the instances with 500 and 100 iterations between restarts are quite similar to the Unlimited case.
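These subspace and restart controls map directly onto standard GMRES interfaces. The sketch below uses SciPy rather than DLAP or SPARSKIT, purely for illustration, and reuses the hypothetical jfnk_matvec from the earlier sketch; in SciPy's convention restart is the number of basis vectors kept between restarts and maxiter counts restart cycles (and rtol is tol in older SciPy).

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def newton_step(gamma, u, eta, subspace=100, cycles=10):
    """Solve J(u) du = -Gamma(u) with restarted GMRES, where the
    Jacobian is only available through finite-difference products."""
    n = u.size
    J = LinearOperator((n, n), matvec=lambda v: jfnk_matvec(gamma, u, v),
                       dtype=float)
    du, info = gmres(J, -gamma(u), rtol=eta,
                     restart=subspace, maxiter=cycles)
    return du, info  # info > 0 signals the iteration budget was exhausted
```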

This similarity indicates that a subspace size of 100 is not prohibitive to the solution of the problem, i.e., few of the Newton solves require 100 GMRES iterations to converge the linear problem to the specified η. If GMRES were restarting often we would expect to see some behavioral differences among the results, as in the case where only 10 iterations are performed between restarts; this results in drastic changes to the previous trends. The performance of JFNK(GEP) and NK(GEP) degrades very noticeably. Again the JFNK(PI) method is seen to be much less sensitive to parameters which cause a near breakdown of the JFNK(GEP) and NK(GEP) methods, which is reasonable since it is based on a nonlinear system distinct from that of the other two methods and is known to be a preconditioned formulation of the GEP approaches.

For the case where only 100 GMRES iterations are permitted, the results are plotted alongside the Unlimited case, where no limit was placed on the subspace size; results are provided in Figure 3.14.

Figure 3.14: Convergence of Newton-Based Methods for IAEA Benchmark as a Function of GMRES Iterative Properties, 100 Iteration Maximum

In this scenario the trends are the same but even more pronounced. Using 100 GMRES iterations with no restarts, as opposed to the GMRES properties given in Table 3.2, does not affect JFNK(PI) at all, though the number of JFNK(GEP) and NK(GEP) iterations increases by more than a factor of four. Decreasing the subspace size to a maximum of 50 basis vectors again has little effect on JFNK(PI) but further increases the number of Newton iterations required for the other two methods. In the case of a maximum of 10 iterations per restart, with a maximum of 100 total iterations, JFNK(PI) remains insensitive, yet the number of iterations required to achieve the same rather tight convergence by the other two methods increases by more than two orders of magnitude. The JFNK(GEP) and NK(GEP) methods actually fail to converge within the given number of Newton iterations, though they assuredly would have converged eventually if afforded far more iterations.

Conditioning of Problem

The previous results clearly display the sensitivity of the Newton convergence rate to the size of the Krylov subspace and the number of restarts. The JFNK(PI) approach, which is a left-preconditioned variant of JFNK(GEP), is not nearly as susceptible to failure as JFNK(GEP) and NK(GEP), both of which were unable to converge the GMRES iterations in a reasonable number of iterations. The failure of the GMRES iterations to sufficiently converge can be explained by the conditioning of the linear system associated with the linear Newton step. Assuming that the Jacobian is diagonalizable with J = XΛX⁻¹, it can be shown [5] that

\[ \|r^{(m)}\|_2 \le \kappa_2(X)\,\varepsilon^{(m)} \|r^{(0)}\|_2, \]

where the condition number is defined by κ₂(X) ≡ ||X||₂ ||X⁻¹||₂ and ε⁽ᵐ⁾ is some bounded scalar. Thus, the convergence rate of GMRES is directly tied to the condition number of the matrix X.

One plausible explanation for the behavior of JFNK(GEP) and NK(GEP) is that the associated nonlinear system Γ_gep(u) results in a Jacobian (or Jacobian approximation) that has a large condition number κ₂. The obvious remedy for this problem is to use one of the previously developed methods to precondition the GMRES iteration associated with the linear Newton step. It would be convenient if one were able to employ the type of incomplete factorization often used to precondition Krylov methods. There are several problems with this approach, however: first, with JFNK(GEP) there is no matrix available, so its structure cannot be manipulated directly as many factorization-based methods require; also, the Jacobian changes each Newton iteration, so some types of preconditioners would need to be built repeatedly, which can be expensive. Still, the alternative, i.e., not preconditioning, has been demonstrated to produce unacceptable results when a practical number of GMRES iterations is used. Saad discusses some preconditioning options for Krylov methods in his text, while Knoll and Keyes [33] review what others have done to precondition the Newton step of JFNK methods. The types of preconditioning developed for use with the JFNK and NK approaches here are based on simple linearizations of the problem and approximations of J and are not subject to any of the difficulties just mentioned. Because of this simplicity they are unlikely to be the most efficient preconditioners available, though they have the advantage of being easy to implement (especially preconditioning using power iteration).

It is not surprising to see the insensitivity of JFNK(PI) to the GMRES iteration properties. The nonlinear system associated with this method differs in a fundamental way from the system corresponding to JFNK(GEP) and NK(GEP), such that JFNK(PI) produces a Jacobian that is well-conditioned with regard to the GMRES iteration while the Jacobian resulting from the other two methods is ill-conditioned. This is expected, since it has been shown that JFNK(PI) is simply JFNK(GEP) preconditioned on the left by the standard power iteration; these results show that power iteration is an effective preconditioner. To confirm that this is the case, we now consider results for JFNK(PI) where the number of inner iterations (iterations on M_g in the within-group problem) is varied. If the power iteration is a good preconditioner, one would expect better convergence properties as the power iteration operator, M⁻¹, is represented with increasingly better accuracy.
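Part of the appeal of JFNK(PI) is that its preconditioned nonlinear residual costs exactly one traditional outer iteration to evaluate, which is what makes it easy to wrap around an existing solver. A minimal sketch, again treating power_iteration as a hypothetical callable for one outer iteration:

```python
import numpy as np

def gamma_pi(power_iteration, u):
    """Residual of the power-iteration fixed point: a root of gamma_pi
    is an eigenpair (phi, k) of the k-eigenvalue problem."""
    phi, k = u[:-1], u[-1]
    phi_new, k_new = power_iteration(phi, k)
    return np.concatenate([phi_new - phi, [k_new - k]])
```

Pairing gamma_pi with the jfnk_matvec sketch above yields the JFNK(PI) Jacobian-vector product at the cost of one extra power iteration per GMRES iteration.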

Figure 3.15: Convergence of Inner Iterations using Power Iteration as a Preconditioner

Figure 3.15 confirms that when M⁻¹ is poorly approximated, the GMRES algorithm exhibits poor convergence properties, which in turn adversely affect the convergence rate of Newton's method. As the approximation of M⁻¹ is improved by using an increasing number of CG iterations to solve the within-group problem, the overall convergence properties of the Newton-Krylov algorithm greatly improve.

Forcing Term

Preliminary results examining the effect of η on the convergence rate of the Newton-based methods were generated. The experiments were done using constant η values of 10⁻⁸, 10⁻⁴, and 10⁻¹, with results plotted in Figure 3.16. These results were also calculated prior to the inclusion of SPARSKIT in the test code and so use the DLAP GMRES implementation. The base case is very similar to the case where η is 10⁻⁸, although the differences between JFNK(GEP) and NK(GEP) are now more pronounced and JFNK(PI) takes an extra Newton iteration to fully converge.

Figure 3.16: Convergence of Newton-Based Methods for IAEA Benchmark as a Function of η

For the case where η is 10⁻⁴, it can be seen that JFNK(PI) has changed very little while JFNK(GEP) and NK(GEP) have shifted much closer to JFNK(PI). Comparing η = 10⁻⁴ to the base case, it is apparent that from iteration three onwards the smaller forcing factor can be associated with a smaller residual. For the even more relaxed case where η is 10⁻¹, the trends are not preserved and the total iteration count necessary to converge more than doubles for all three methods. Interestingly, in this case it can be seen that JFNK(PI) now performs much better, requiring only half the number of Newton iterations as JFNK(GEP) and NK(GEP). This plot supports the concept of dynamically choosing an η for each Newton step that tightens as the problem converges, allowing one to converge loosely in early iterations, where it can be seen to have little effect, and to tighten as the calculation proceeds.

However, these results were simply intended to convey that the convergence rate of Newton's method is affected by η as well as by the properties of the GMRES iteration. Still, the most likely breakdown of the newly developed algorithm is clearly related to the convergence of the GMRES iterative process, which can fail due to the conditioning of the linear problem arising from a poorly conditioned Jacobian (or approximate Jacobian). This potential problem can be alleviated in one of two ways: increasing the number of GMRES iterations or finding an efficient preconditioner for the linearized Newton problem. In terms of computational cost it will almost always be more effective to precondition. Increasing the size of the subspace only increases the total number of GMRES iterations, which is directly proportional to the cost of the algorithm. On top of that, the larger subspace increases the per-iteration cost due to the orthogonalization process. Effective preconditioning is therefore essential to any practical implementation of the Newton-Krylov family of methods. Shortly, in the section on practical considerations, the preconditioners developed thus far will be evaluated based on their impact on both convergence rate and total execution time. The influence of the forcing term η will also be revisited to see if earlier conclusions hold when the runs are designed to minimize execution time rather than analyze the method.

GMRES Implementation

Another aspect of GMRES to consider is the specific implementation. In any practical application it is unlikely one would choose to write one's own GMRES solver, simply because a number of highly optimized implementations are available, such as those incorporated into the Trilinos package. When using such a prepackaged solver it is important to remember something that Saad [5] points out a number of times: there is a distinction between the mathematical description of GMRES and any individual implementation of the method. In other words, if two implementations of GMRES are given identical problems to solve, there is no guarantee that the same solution will be found. For instance, the orthogonalization procedure used could differ between implementations, leading to the breakdown of one and not the other in certain circumstances.

There are many other aspects of the algorithm (such as restart) where implementation choices can ultimately affect GMRES performance. A prime example of this can be seen in numerical results generated using both the DLAP and SPARSKIT implementations of GMRES for the same problem.

Figure 3.17: Potential Impact of GMRES Implementation on Convergence

In Figure 3.17, results generated using the IAEA problem for three cases are shown, where for each case DLAP and SPARSKIT are both used while all other parameters are kept equal. The three cases are as follows: the first uses the An method to determine η with no backtracking; the second uses the Eisenstat-A algorithm to determine η with backtracking implemented when the Newton residual falls below 10⁻⁴; and the third uses the Dembo formula to find η with backtracking implemented when the norm of the Newton residual falls below 10⁻⁸. From the figure it becomes very clear that in certain circumstances the behavior of the GMRES implementation can have a significant effect on the overall calculation. For the first two cases we see that the SPARSKIT implementation allows Newton's method to converge more or less smoothly to machine precision, while in the DLAP implementation the Newton residual either flattens out or fluctuates without converging further.

On the other hand, for the third case it can be seen that smooth convergence is obtained using the DLAP implementation and it is the SPARSKIT version which fails to reduce the Newton residual. This behavior is not pointed out in order to determine a better implementation; in fact they usually produce indistinguishable results. SPARSKIT is ultimately used for the majority of runs because it is more amenable to preconditioning.

Practical Considerations

A good deal of discussion has already been provided regarding the various parameters used in the Newton's method approach to the k-eigenvalue problem, and a number of potential failure modes for the method have been identified, such as the GMRES behavior and settings. In this section we examine which of the available methods are best at not only avoiding these problems but doing so quickly. The first thing to consider is the effectiveness of the various preconditioners presented in the last chapter; the need for some type of preconditioning cannot be emphasized enough. The impact of the globalization step will also be examined in an attempt to determine whether or not it is necessary, and if so, when it should be used. Finally, the behavior of some of these methods is examined using a more traditional convergence criterion. It is valuable to converge to machine precision so that the convergence rate of Newton's method can be verified, but in a practical k-eigenvalue calculation this is far too precise a solution. For this reason looser criteria will be used so that the behavior relevant to practical applications can be studied.

All of the discussion and results presented in this section were generated via one rather large set of numerical experiments. These experiments were performed using the IAEA problem with a fixed spatial mesh. The JFNK and NK methods were used, along with all of the preconditioning approaches discussed in Chapter 2. The Eisenstat-A, Eisenstat-B, An, and Dembo methods were used to determine η, in addition to setting it to a constant value of 10⁻². The effect of backtracking was also explored by performing each run under one of three backtracking regimes: none at all, backtracking when the norm of the Newton residual is below 10⁻⁸, and backtracking when the Newton residual is below 10⁻⁴.

This produced far too many results to include in their entirety, so in the following discussion specific combinations of parameters are selected for inclusion because they illustrate certain points most clearly.

Preconditioners

The effect of the various preconditioners was explored for both the JFNK and NK approaches. For the JFNK approach, results are shown for the Eisenstat-A and An methods of determining η with no backtracking. For the NK method, η is determined by the An approach, also without backtracking. Figure 3.18 shows the results for An with no backtracking. From this figure it is easy to draw a distinction between the various preconditioners. The JFNK(GEP) method, in which GMRES is not preconditioned in any way, is predictably the worst option by far. In fact, it can be seen that within the time frame in which all of the other options converge, the non-preconditioned option is nowhere near convergence.

Figure 3.18: Performance of JFNK Preconditioners using An Algorithm Without Backtracking

The convergence rate is much poorer than one would expect for Newton's method, which is the result of a poor approximation to J⁻¹ due to the lack of GMRES convergence at each Newton step. The M and M-F preconditioners are the next two worst options, ignoring IC for the moment. This is reasonable, since in both of these approaches a preconditioning matrix is built and inverted and thus an additional level of iteration is introduced into the computation. This is not the case with the other options, which helps to explain their relative success. The IC preconditioner suffers from a failure to fully converge to machine precision; however, from the plot it is obvious that if the calculation were considered converged at a looser level, then the IC option would perform very well. The best of the choices are the PI and rPI preconditioners, i.e., power iteration applied as a left and right preconditioner, respectively. This is convenient, since power iteration is widely available in existing codes capable of solving k-eigenvalue problems and, even if not available, is simple to implement.

Results generated using Eisenstat-A with no backtracking are provided in Figure 3.19.

Figure 3.19: Performance of JFNK Preconditioners using Eisenstat-A Algorithm Without Backtracking

The same trends can be seen even more clearly in this case. No preconditioning is clearly the worst choice, followed by the M and M-F options. Again the power-iteration-based preconditioners prove to be most effective, though IC is just as good in this case.

Figure 3.20: Performance of NK Preconditioners using An Algorithm Without Backtracking

The preconditioning effect on the NK method using the An method for η can be found in Figure 3.20. This plot confirms the conclusions drawn for the JFNK approach, though the PI preconditioner is not an option here due to the unavailability of the Jacobian. The separation between preconditioners is very distinct for the NK method. Again the basic, non-preconditioned formulation fares worst, though it does show better convergence than its JFNK counterpart in Figure 3.18 over the same execution time span. The M and M-F options again fall in the middle of the pack. In all cases, though, M-F outperforms M alone, which is expected since the inverse of M-F is a much better approximation to the inverse of the actual Jacobian at any point; the M preconditioner neglects any fission or scattering contributions. The rPI and IC options are superior by a factor of somewhere between 3 and 4 when compared to the next best choice (M-F), and the difference between them in this instance is negligible.

It is interesting to note here that the convergence of the NK methods is much smoother (with regard to monotonicity) than in the analogous JFNK cases, which indicates that the JFNK approximation, though successful, does contribute to some of the numerical instability witnessed in these results.

These results help to narrow down the formulations that should be considered for any practical implementation. The non-preconditioned variations are generally very poor choices and should never be used, and while the M and M-F options offer an improvement they are still largely unsatisfactory. The five most competitive choices are then JFNK(PI), JFNK(rPI), JFNK(IC), NK(rPI), and NK(IC), and these are in fact what will ultimately be used to compare the performance of the Newton approach to that of traditional power iteration.

Globalization

The topic of globalization, specifically backtracking, has been mentioned many times to this point. The essence of the technique is to ensure that the Newton step being taken is a good step, by some definition of good, and if not, to reduce the step size until a good step is found. Using backtracking it is possible to overcome one of the greatest difficulties associated with Newton's method: finding an initial guess sufficiently close to a root of the nonlinear function. In that sense globalization is unnecessary for this problem, since divergence of Newton's method has not been witnessed regardless of the initial guess chosen (though only real, positive guesses have been used). Unlike many nonlinear problems, the eigenvalue problem has many true minima, so the concern of getting caught in a local minimum is less pronounced; instead, the worry is converging to the desired eigenpair, which in this case is the fundamental mode. Globalization may be useful, however, to smooth out some of the numerical problems that occasionally arise, such as the JFNK(IC) behavior seen earlier. Figure 3.21 shows results using the JFNK approach with multiple preconditioners for the IAEA problem using Eisenstat-B without backtracking. In this instance the only method in the pack that converges completely is the JFNK(PI) formulation, while all of the other preconditioners fluctuate around a fixed residual norm. If backtracking is implemented when the residual norm is below 10⁻⁴, this plot changes drastically, as shown in Figure 3.22.

Figure 3.21: Eisenstat-B Without Backtracking

Though the non-preconditioned option still suffers from slow convergence, the convergence itself is much smoother, even in the 10⁻⁸ region. For the rest of the preconditioners, aside from PI, the effect of backtracking is quite significant. All of the approaches that suffered fluctuations and failed to converge no longer fluctuate. The previously seen trends among the preconditioners emerge once again, with the erratic behavior eliminated for the most part. The only exception is PI, which behaves rather peculiarly in this instance, spending the majority of the calculation at a plateau where the Newton residual is neither increasing nor decreasing. This indicates multiple Newton steps where, even with backtracking, an acceptable step size could not be found. The shape of the plots in the other cases is also worth discussing because it illuminates exactly how backtracking works. Note that none of these lines are smooth; rather, they have a staircase shape. Using the behavior in Figure 3.21 as a reference, we can see that backtracking permits the steps that decrease the residual; but rather than allowing the next step to increase the residual (resulting in the fluctuations seen), backtracking imposes a small step size, such that we end up with this series of drops and plateaus.
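The mechanism just described can be captured in a few lines. The sketch below is a deliberately simple halving strategy under the assumption that "good" means "the residual norm decreased"; production backtracking schemes instead test a sufficient-decrease condition and often fit polynomial models of the step length.

```python
import numpy as np

def backtrack(gamma, u, du, res_old, max_halvings=8):
    """Halve the Newton step until the residual norm decreases.
    Accepted steps give the drops; rejected trials give the plateaus."""
    step = 1.0
    for _ in range(max_halvings):
        if np.linalg.norm(gamma(u + step * du)) < res_old:
            break           # first 'good' step is accepted
        step *= 0.5         # otherwise shrink and try again
    return u + step * du
```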

Figure 3.22: Eisenstat-B With Backtracking when ||r|| < 10⁻⁴

Backtracking works quite well in this example, with the exception of JFNK(PI), and that is a good illustration of the influence of backtracking: it works well, except when it does not work at all. It is certainly something that should be considered, but one must keep in mind that, just as with JFNK(PI) here, sometimes the Newton approach will quickly recover from a bad step on its own, and in such cases enforcing backtracking actually prevents this from occurring.

Convergence Criteria

It is worth briefly discussing the effect of the convergence criteria on many of the numerical results presented to this point. The Newton residual has often been pushed to machine-precision convergence so that the asymptotic behavior of Newton's method could be witnessed, because ultimately methods with faster convergence rates are preferable to those with slower convergence rates. However, this statement can be misleading, because in practice it is not common to converge solutions to the point that asymptotic behavior is so clearly exhibited. So while discussing numerical problems that arise in the region of small residuals (< 10⁻⁶) and finding fixes is both interesting and useful, these problems would likely never be encountered in the first place when using one of the Newton approaches for a realistic application.

3.5 Comparison with Power Iteration

To attain a definitive comparison between the Newton-based approaches and power iteration (with and without Chebyshev acceleration), another set of numerical experiments was performed, taking into account the behavior studied in previous sections. The eigenvalue problem was solved for each reactor model using two meshes, and the eigenvalue was converged to either machine precision or the PARCS default values. For a given model, mesh, and convergence criterion the problem is solved using the power method, Chebyshev acceleration of the power method, JFNK(PI), JFNK(rPI), JFNK(IC), NK(rPI), and NK(IC). The non-preconditioned versions and the M and M-F preconditioned versions are excluded due to their poor performance in earlier experiments. Again this results in a large amount of information which cannot all be included in this document; however, the only real measure that needs to be compared is the execution time of the various methods to achieve comparable convergence levels. The so-called coarse mesh set of runs uses a coarser spatial mesh for each of the Biblis, BWR, CANDU, and IAEA problems, while the fine mesh results use a refined mesh for each problem. The ε_mach convergence criteria push ε_k, ε_2, and ε_∞, as defined in Chapter 2, to near machine precision; the PARCS criteria are the looser default values taken from the PARCS manual. Power iteration and Chebyshev acceleration are performed using a maximum of 6 IC-preconditioned CG iterations for each within-group solution, with a loose convergence tolerance, as this resulted in near-optimal run times for these methods. A flat flux with k = 1 was used as the initial guess for these methods. All of the Newton approaches were initialized using 2 power iterations, and the perturbation factor used was ε = √((1 + ||u||) ε_mach). The forcing factor, η, was chosen to be a fixed value to avoid any complications which could arise from the failure of a forcing factor algorithm.

The Krylov subspace size allowed before restart was 30 for all Newton runs, and one restart was allowed (for a maximum of 60 GMRES iterations). The JFNK(PI), JFNK(rPI), and NK(rPI) methods all used power iterations in which a maximum of 10 IC-preconditioned CG iterations were used to solve the within-group problem. No backtracking strategy was implemented in these runs, though it is possible a good backtracking strategy would have been beneficial.

Figure 3.23: Comparison of Power Iteration and Newton's Method: Coarse Mesh, PARCS Convergence

Figure 3.23 shows the results of the coarse mesh, loose convergence runs. This figure offers little in the way of interesting results. Chebyshev acceleration is seen to improve upon the power iterations by a small amount, with the exception of the CANDU problem. Generally the methods are all quite comparable, though certainly not equal. The run times are relatively short, which does not provide much chance for distinguishing the best-performing methods from the rest. It can be seen that JFNK(IC) at best matches the rest of the Newton methods, and for the BWR and CANDU problems it is the worst of the bunch. Generally JFNK(IC), NK(IC), and JFNK(rPI) perform similarly for these problems, while JFNK(PI) consistently outperforms all other methods, including the power method and Chebyshev acceleration, except in the case of the CANDU problem.

The difficulty in drawing general conclusions from these results is that the performance of each method is heavily dependent on the specific problem and the parameters chosen. This is further complicated by the fact that run time is generally an unreliable indicator of performance. For instance, these plots give no indication of what fraction of the run time goes into the power or Newton iterations and what fraction is spent performing the IC factorization of the M_g matrices.

Figure 3.24: Comparison of Power Iteration and Newton's Method: Fine Mesh, PARCS Convergence

Results using the same convergence criteria but a finer mesh are shown in Figure 3.24. The execution times are now slightly longer than in the previous case, allowing for a bit more disparity between solution algorithms. Again we see that JFNK(IC) is generally the worst of the Newton methods while JFNK(PI) is again the best. In this set of results it also seems that JFNK(rPI) and NK(rPI) are clear runners-up, performing almost identically. This indicates that for these problems and this set of parameters the PI preconditioning choice is more efficient than using the IC factorization as a preconditioner.

The difference between the execution time of the power methods and JFNK(PI) is similar to that seen with the coarser mesh, with the JFNK(PI) method offering a modest reduction in execution time when compared to Chebyshev acceleration. Still, even for the finer mesh size the run times involved are all below five minutes. Ideally, very large problems would at some point be used to test these methods, so that the vast majority of the execution time is spent in the solver. With small run times it is always possible that system processes pollute the execution time measurements or that initialization steps (such as the IC factorization) are disproportionately represented.

Figure 3.25: Comparison of Power Iteration and Newton's Method: Coarse Mesh, ε_mach Convergence

Figure 3.25 shows the results of the set of calculations using the coarse mesh and the extremely tight convergence criteria. In the figure it can be seen that Chebyshev acceleration is quite effective for this convergence criterion; for all problems it offers a significant improvement over standard power iteration. Again, one can see that JFNK(PI) is still the dominant Newton approach, though it performs very similarly to JFNK(rPI) and NK(rPI). The NK(IC) approach seems to be the worst of the Newton methods presented.

Figure 3.26: Comparison of Power Iteration and Newton's Method: Fine Mesh, ε_mach Convergence

The JFNK(IC) approach is not included in the figures using ε_mach convergence because of a failure to converge within 100 Newton iterations. For all problems, the JFNK(IC) Newton residual would not fully converge, spending the bulk of the calculation fluctuating near 10⁻⁶. It was observed that the maximum number of GMRES iterations was being used in each case, which indicates that to see convergence for the JFNK(IC) approach the subspace size and maximum number of iterations would need to be increased. This is indicative of a failure of the IC approach to properly precondition the problem in these cases. In the best cases an approximately 35% reduction can again be achieved using Newton's method. Compared to Figure 3.23, the time savings due to Newton's method are more substantial, confirming the benefits of the increased convergence rate of Newton's method.

The results from the final set of experiments can be found in Figure 3.26, in which the fine mesh is used to spatially discretize the problem and the outer iterations are converged to near machine precision. Again the effectiveness of Chebyshev acceleration of the power method can be seen, with the exception of the CANDU problem.

Table 3.3: Computational k-eigenvalue Results for Diffusion Benchmarks (k_eff for the Biblis, BWR, CANDU, and IAEA problems as computed by JFNK(PI), JFNK(rPI), JFNK(IC), NK(rPI), NK(IC), the power method, and Chebyshev acceleration, alongside the reference values; entries are from the fine mesh, PARCS convergence runs)

As expected, based on the previous experiment sets, a savings in execution time is witnessed when using Newton's method. In this case even the worst choice of Newton's method, NK(IC), is generally comparable to Chebyshev acceleration. Time savings of about 35% are realized in most problems when comparing Chebyshev acceleration to JFNK(PI). Among the Newton methods it seems that JFNK(PI), JFNK(rPI), and NK(rPI) are the most effective, with NK(IC) significantly worse. Though these results portray the newly developed Newton approaches in a positive light, the convergence criteria in these cases are impractically restrictive; a more realistic picture is given by the PARCS-convergence results. The eigenvalues that accompany the results of Figure 3.24 are given in Table 3.3. This table shows that the Newton results are quite consistent with each other and also with the power method and Chebyshev acceleration for the same problem. The purpose of the experiments in this chapter was to study the behavior of the k-eigenvalue algorithms and not to model the benchmark problems with extreme accuracy, though it can be seen that the calculated results show acceptable agreement with the provided reference solutions.

It is worth noting that in all of the numerical experiments run to this point using the Newton-based approaches, the only time the fundamental mode was not obtained as the converged solution was the case where the initial guess was artificially selected to be close to a non-fundamental mode.

While it cannot be stated with any certainty that this will always be the case, it is encouraging to know that the ability of the equations to converge to any mode has not posed a problem in practical calculations using diffusion theory.

The numerical results generated in this chapter are insufficient to fully characterize any of the Newton-based methods developed. These methods are composed of relatively basic matrix computations, but when taken as a whole their behavior is understandably complex. The Newton-based methods are extremely sensitive to the choice of η, and effective preconditioning of the GMRES iteration is essential. Still, these rudimentary results do indicate that this family of methods has the potential to improve on the methods currently used without requiring a complete rewrite of existing solution processes. The ultimate utility of the methods cannot be understood until all avenues for improvement have been explored. It is also clear that the only way to really compare the Newton approach to existing approaches is to implement it in a production-level code and solve real-size problems. The use of execution time as a measure in these experiments is less than optimal due to the elementary nature of the implemented diffusion solution methodology. The presence of more energy groups or upscattering is also something that will have an important impact on the calculation. Still, given the modest success of the approach using diffusion theory, the use of Newton's method for the k-eigenvalue problem in transport theory is the next logical step.

CHAPTER 4

The k-eigenvalue Problem in Transport Theory

The neutron diffusion equation considered in Chapter 2 is an approximation to the more accurate neutron transport equation. The transport equation is generally taken to be the starting point for any neutronics analysis, with various simplifications introduced depending on the target application. The neutral particle transport equation is often referred to as the linear Boltzmann equation, since the transport equation is itself a linearization of the Boltzmann equation used to study the kinetic theory of gas mixtures. Much of the original research on radiation transport was performed by astrophysicists studying the phenomenon in stellar atmospheres; however, the discovery of the fission reaction led to the investigation and design of nuclear reactors, which was accompanied by a strong interest in the solution of neutral particle transport problems. Initially, analytical solutions were studied for very limited geometrical configurations and physical properties. These types of analytic studies, along with detailed examinations of the underlying mathematics, can be found in many excellent texts on linear transport theory, such as those by Davison and Sykes [39] and Case and Zweifel [38], which focus on the transport of neutrons, and the classic text by Chandrasekhar [69], which is geared more towards atmospheric radiation transport.

133 119 Duderstadt and Martin [70] considers a wider variety of transport problems but is a useful resource for the study of radiation transport. With the advent of digital computing and the explosion in the amount of computing power available in the last 50 years the focus of neutron transport research has shifted nearly entirely to numerical solution techniques. During this time the field of nuclear engineering itself has matured into a full-fledged engineering discipline. To supplement this field a number of valuable texts dealing with transport theory have come about. Bell and Glasstone [40] is a magnificent study of reactor theory while Duderstadt and Hamilton [71] offers a broader view of reactor analysis. For the study of numerical techniques, the manuscript by Lewis and Miller [41] is unparalleled. The description of the transport problem and the subsequent approximations used to transform the problem from continuous to discrete variables given in this chapter will closely follow that provided by Lewis and Miller. This chapter will first present the k-eigenvalue formulation of the neutron transport equations and then describe the techniques used to transform the problem in the continuum to one which is amenable to solution on a digital computer. After this is complete the transport equation, discretized in space, angle and energy will be written in a convenient operator notation for the sake of compactness in subsequent discussions. The traditional method of solving the k-eigenvalue problem will then be introduced along with some recent variations, and the necessary terminology. A brief description of standard implementations will be given and then the focus will shift to the novel family of Newton methods which is the focus of this work. The idea behind the method will be briefly reviewed and then the specifics of the various formulations in transport theory will be covered. Numerical results and discussion will be presented in Chapter Neutron Transport Theory The general form of the neutron transport equation in a multiplying medium is given by [ ] 1 v t + ˆΩ + σ( r, E) ψ( r, ˆΩ, E, t) =

134 q( r, ˆΩ, E, t) + de dω σ s ( r, E E, ˆΩ ˆΩ)ψ( r, ˆΩ, E, t) + χ(e) de νσ f ( r, E ) dω ψ( r, ˆΩ, E, t), 120 (4.1a) where these quantities represent, ψ( r, ˆΩ, E, t) = Γ, r V, ˆΩ n < 0, (4.1b) r Position vector, ˆΩ Unit vector along the direction of neutron travel, E Neutron energy, t Time, v Neutron speed, ψ( r, ˆΩ, E, t) Angular neutron flux, σ Total interaction macroscopic cross section, σ s Double-differential scattering macroscopic cross section, q External source, χ Fission spectrum, ν Mean number of fission neutrons produced per fission event, σ f Fission macroscopic cross section, Γ Incoming flux on the boundary of V, V Domain surface, n Outward normal to surface. As mentioned previously, this problem is practically impossible to solve analytically for all but the most simplified scenarios. Still, it serves as the starting point for nearly all applications utilizing neutron transport theory. To fully define the above problem it is necessary that all cross sections and boundary conditions are defined along with any external (or boundary) source, if present. The presence

135 121 of a multiplying medium complicates this problem and the critical state of the reactor must be considered when seeking a solution, which ultimately necessitates a discussion on eigenvalue problems α-eigenvalues As has been stated previously, in a system containing fissionable material there are three possible criticality states: subcritical, critical, and supercritical. In a critical system the chain reaction is self-sustaining in the absence of an external neutron source. In other words, if neutrons are introduced as an initial condition into a critical system eventually a steady state will be achieved where the neutron distribution is time-independent without the presence of any external sources. A subcritical system cannot sustain the chain reaction without the presence of an external source, while a supercritical system is self-sustaining but will not achieve a time-independent distribution. In a subcritical system (with no external sources) the neutron distribution decays exponentially in time while it increases exponentially in a supercritical system. Using these definitions, a system is said to be critical if a non-trivial solution can be found to the equation ] [ˆΩ + σ( r, E) ψ( r, ˆΩ, E) = de dω σ s ( r, E E, ˆΩ ˆΩ)ψ( r, ˆΩ, E ) + χ(e) de νσ f ( r, E ) dω ψ( r, ˆΩ, E ), (4.2) with appropriately defined boundary conditions. For a subcritical system a solution to the above equation in the presence of an external source can be found. In the absence of an external source, Eq. (4.2) only has a solution if the system is exactly critical. Thus, this formulation of the problem can only indicate whether a system is perfectly critical or not, no other information on the system criticality is evident. For this reason the criticality problem is often cast as an eigenvalue problem where the eigenvalue is an indication of how far from criticality a system is. There are generally two types of eigenvalues considered in transport theory: α and k eigenvalues. While the focus of this work is on finding k-eigenvalues, some work has been done to adapt these methods to α-eigenvalues and so they will be briefly

136 122 discussed. If one seeks a solution to Eq. (4.1a) of the form ψ( r, ˆΩ, E, t) = ψ α ( r, ˆΩ, E)e αt (4.3) then Eq. (4.1a) becomes [ˆΩ α ] + σ( r, E) + ψ α ( r, v ˆΩ, E) = q( r, ˆΩ, E, t) + de dω σ s ( r, E E, ˆΩ ˆΩ)ψ α ( r, ˆΩ, E ) + χ(e) de νσ f ( r, E ) dω ψ α ( r, ˆΩ, E ), (4.4) which is an eigenvalue problem with the eigenpair (α,ψ). While a spectrum of eigenvalues will exist, only the one with the largest real component matters in an asymptotic sense as it will dominate the system behavior. The value of α is a measure of the system criticality; if α is zero then a non-trivial steady state exists and the system is critical. Similarly, if the real component of α is greater than zero the distribution is unbounded with increasing time and the system is supercritical, while if the real component of α is less than zero the distribution is decreasing with increasing time and thus the system is subcritical k-eigenvalues The k-eigenvalue problem, which has already been discussed for diffusion theory is also commonly solved in transport theory. To arrive at the k-eigenvalue the assumption is made that ν, the average number of neutrons produced per fission, can be adjusted such that a time-independent solution to Eq. (4.2) can be found. Thus, ν is replaced by ν/k such that ] [ˆΩ + σ( r, E) ψ( r, ˆΩ, E) = de dω σ s ( r, E E, ˆΩ ˆΩ)ψ( r, ˆΩ, E ) de νσ f ( r, E ) dω ψ( r, ˆΩ, E ). (4.5) + χ(e) k
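The α-eigenvalue interpretation above can be illustrated with a small numerical toy. The sketch below is not a transport discretization; the matrix `T` is an arbitrary stand-in for a discretized, source-free operator governing $d\psi/dt = T\psi$, and serves only to show that the sign of the dominant eigenvalue's real part classifies the system.

```python
import numpy as np

# Hypothetical 2x2 stand-in for a discretized, source-free operator T
# in d(psi)/dt = T psi; alpha is the eigenvalue with largest real part.
T = np.array([[-0.2,  0.3],
              [ 0.1, -0.4]])
alpha = np.max(np.linalg.eigvals(T).real)
state = ("critical" if np.isclose(alpha, 0.0)
         else "supercritical" if alpha > 0.0   # flux grows without bound
         else "subcritical")                   # flux decays in time
print(f"alpha = {alpha:.4f} -> {state}")
```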

The eigenvalue k, like α, is a measure of the criticality of the system. Again a spectrum of eigenvalues will exist, but only the largest which allows a nonnegative solution is of immediate interest. It is easy to see that if k = 1 then the system is critical, as this is the same as finding a solution to Eq. (4.2). If k < 1 it indicates that the average number of neutrons produced per fission must increase for the system to become critical, which implies a subcritical system. Likewise, if k > 1 it implies ν is too large, which points to a supercritical system. While the same information can be gleaned from both eigenvalue calculations, the k-eigenvalue calculation is much more common, for a number of reasons which are discussed in [41].

4.1.3 Problem Discretization

The continuous form of the k-eigenvalue problem is given in Eq. (4.5), but ultimately we are interested in a discretized form of the problem which can be solved numerically. The first independent variable to be discretized is the energy, E. The starting point for this exercise is Eq. (4.5). Energy discretization is accomplished via the multigroup approximation, which splits the continuous span of the energy variable into G energy ranges, referred to as groups. The total range of energy considered is from $E_0$ to $E_G = 0$, where $E_0$ should be large enough to include all non-negligible particle populations. Group g comprises the energy range from $E_{g-1}$ to $E_g$, so that group 1 contains the most energetic neutrons while group G contains the least energetic. A full derivation of this approximation can be found in [40] or [41]. The crux is that the group-dependent flux is defined as

\[
\psi_g(\vec{r},\hat{\Omega}) = \int_{E_g}^{E_{g-1}} \psi(\vec{r},\hat{\Omega},E)\, dE, \tag{4.6}
\]

with the energy-integrated angular flux given by

\[
\psi(\vec{r},\hat{\Omega}) = \int_0^\infty \psi(\vec{r},\hat{\Omega},E)\, dE = \sum_{g=1}^{G} \psi_g(\vec{r},\hat{\Omega}), \tag{4.7}
\]

which ultimately allows Eq. (4.5) to be written as

\[
\left[ \hat{\Omega}\cdot\nabla + \sigma_g(\vec{r}) \right] \psi_g(\vec{r},\hat{\Omega}) = \sum_{g'=1}^{G} \int d\Omega'\, \sigma_{g'g}(\vec{r},\hat{\Omega}'\cdot\hat{\Omega})\, \psi_{g'}(\vec{r},\hat{\Omega}') + \frac{\chi_g}{k} \sum_{g'=1}^{G} \nu\sigma_{fg'}(\vec{r}) \int d\Omega'\, \psi_{g'}(\vec{r},\hat{\Omega}'), \tag{4.8}
\]

with appropriately defined multigroup cross sections, fission spectrum, and boundary conditions. Here the double-differential scattering cross section has been written as $\sigma_{g'g}$ to denote scattering from group g′ to group g.

For our purposes the angular dependence is treated using the Discrete Ordinates angular approximation, regularly called $S_N$, which is a collocation method in angle. In the case of isotropic scattering there is no angular dependence in the scattering cross section and the treatment of the scattering integral is much simpler. While the actual transport implementation used in the course of this research does indeed treat scattering as isotropic, the methods developed are applicable to problems with anisotropic scattering. For this reason the scattering treatment is briefly discussed before the $S_N$ approximation is introduced. Generally the scattering is treated using the so-called $P_N$ expansion, which works by expanding the scattering-angle dependence of the differential scattering cross section, $\sigma_{g'g}$, in Legendre polynomials. A brief description of the $S_N$ discretization and $P_N$ expansion processes is provided here, though a more complete discussion can be found in [72], from which the following is adapted. The expansion of the scattering cross section results in

\[
\sigma_{g'g}(\hat{\Omega}'\cdot\hat{\Omega}) = \sum_{l=0}^{L} \frac{2l+1}{4\pi} P_l(\hat{\Omega}'\cdot\hat{\Omega})\, \sigma^l_{g'g}. \tag{4.9}
\]

Substituting this expansion into the scattering source of Eq. (4.8) yields

\[
\sum_{g'=1}^{G} \int d\Omega' \sum_{l=0}^{L} \frac{2l+1}{4\pi} P_l(\hat{\Omega}'\cdot\hat{\Omega})\, \sigma^l_{g'g}(\vec{r})\, \psi_{g'}(\vec{r},\hat{\Omega}'). \tag{4.10}
\]
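The truncated expansion of Eq. (4.9) is straightforward to evaluate numerically. The following sketch (the function name and arguments are illustrative; `sigma_l[l]` holds the Legendre moments $\sigma^l_{g'g}$ for one group pair) reconstructs the kernel at a given scattering cosine using NumPy's Legendre-series evaluator.

```python
import numpy as np
from numpy.polynomial import legendre

def scattering_kernel(mu0, sigma_l):
    """Evaluate Eq. (4.9) at scattering cosine mu0 = Omega'.Omega for one
    g'->g pair, given the Legendre moments sigma_l[0..L]."""
    coeffs = [(2 * l + 1) / (4.0 * np.pi) * s for l, s in enumerate(sigma_l)]
    return legendre.legval(mu0, coeffs)   # sum_l c_l P_l(mu0)
```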

Using the addition theorem of spherical harmonics,

\[
P_l(\hat{\Omega}'\cdot\hat{\Omega}) = \frac{4\pi}{2l+1} \sum_{m=-l}^{l} Y^*_{lm}(\hat{\Omega}')\, Y_{lm}(\hat{\Omega}), \tag{4.11}
\]

the Legendre function can be evaluated. Inserting Eq. (4.11) into Eq. (4.10) and splitting the spherical harmonics into even and odd functions, the scattering source becomes

\[
\sum_{g'=1}^{G} \sum_{l=0}^{L} \sigma^l_{g'g}(\vec{r}) \left[ Y^e_{l0}(\hat{\Omega})\, \phi^{g'}_{l0}(\vec{r}) + \sum_{m=1}^{l} \left( Y^e_{lm}(\hat{\Omega})\, \phi^{g'}_{lm}(\vec{r}) + Y^o_{lm}(\hat{\Omega})\, \upsilon^{g'}_{lm}(\vec{r}) \right) \right], \tag{4.12}
\]

where the angular flux moments φ and υ are defined by

\[
\phi^{g}_{lm}(\vec{r}) = \int d\hat{\Omega}'\, Y^e_{lm}(\hat{\Omega}')\, \psi_g(\vec{r},\hat{\Omega}'), \quad m \ge 0, \tag{4.13a}
\]
\[
\upsilon^{g}_{lm}(\vec{r}) = \int d\hat{\Omega}'\, Y^o_{lm}(\hat{\Omega}')\, \psi_g(\vec{r},\hat{\Omega}'), \quad m > 0. \tag{4.13b}
\]

The order of anisotropic scattering is then determined by L and the total number of angular flux moments is $(L+1)^2$. In the case of isotropic scattering, which is used exclusively in the numerical implementation in the following chapter, this simplifies to a single angular flux moment and scattering cross section. For the sake of clarity, the rest of the discussion assumes isotropic scattering, though the introduction of operator notation will generalize the problem so that anisotropic scattering is once again accounted for. With isotropic scattering, the multigroup transport equation reduces to

\[
\left[ \hat{\Omega}\cdot\nabla + \sigma_g(\vec{r}) \right] \psi_g(\vec{r},\hat{\Omega}) = \sum_{g'=1}^{G} \sigma_{g'g}(\vec{r}) \int d\Omega'\, \psi_{g'}(\vec{r},\hat{\Omega}') + \frac{\chi_g}{k} \sum_{g'=1}^{G} \nu\sigma_{fg'}(\vec{r}) \int d\Omega'\, \psi_{g'}(\vec{r},\hat{\Omega}'). \tag{4.14}
\]

The $S_N$ approximation consists of solving the transport equation at discrete angles $n \in [1,N]$, such that $\psi_n(\vec{r}) = \psi(\vec{r},\hat{\Omega}_n)$. Angular integrals of angular-dependent quantities are approximated via a quadrature rule so that

\[
\int d\hat{\Omega} \;\rightarrow\; \sum_{n=1}^{N} w_n = 4\pi, \tag{4.15}
\]

which yields the set of N equations

\[
\left[ \hat{\Omega}_n\cdot\nabla + \sigma_g(\vec{r}) \right] \psi_{n,g}(\vec{r}) = \sum_{g'=1}^{G} \sigma_{g'g}(\vec{r})\, \phi_{g'}(\vec{r}) + \frac{\chi_g}{k} \sum_{g'=1}^{G} \nu\sigma_{fg'}(\vec{r})\, \phi_{g'}(\vec{r}), \qquad n = 1,\dots,N, \tag{4.16}
\]

where

\[
\phi(\vec{r}) = \sum_{n=1}^{N} w_n\, \psi(\vec{r},\hat{\Omega}_n) \approx \int d\hat{\Omega}\, \psi(\vec{r},\hat{\Omega}). \tag{4.17}
\]

If anisotropic scattering were still being considered, then the integrals in Eqs. (4.13) would become summations. It is important during the angular discretization to ensure a consistent normalization of the numerical quadrature, spherical harmonics, and cross sections; when operator notation is introduced it will be assumed that this is the case. See [41] or [72] for a more complete discussion of angular approximations.

The transport equation has now been fully discretized in both the energy and angle dependencies in Eq. (4.16). Once the spatial dependence on $\vec{r}$ has been discretized the problem will be fully discrete and amenable to numerical solution. There is a large number of schemes used to spatially discretize the multigroup $S_N$ transport equation; only two will be mentioned here, and even then only a cursory description will be provided, since information on spatial discretization is widely available in the literature. The two specific methods that will be used in the numerical implementation here are Diamond-Difference (DD) [41, 72] and Arbitrarily High Order Transport of the Nodal type (AHOT-N) [73]. The solution technique for the transport problem is not dependent on the discretization used, so it is sufficient to briefly describe these two, as the solution algorithm can be generalized to other schemes.

In Cartesian geometry, the spatial dependence of the transport equation can be discretized by defining a spatial grid of mesh points with I points in the x-direction, J in the y-direction, and K in the z-direction. This creates $N_c$ spatial cells with the coordinate of the cell center denoted by $(x_i, y_j, z_k)$, where

\[
x_i = \tfrac{1}{2}(x_{i+1/2} + x_{i-1/2}), \quad 1 \le i \le I,
\]
\[
y_j = \tfrac{1}{2}(y_{j+1/2} + y_{j-1/2}), \quad 1 \le j \le J,
\]
\[
z_k = \tfrac{1}{2}(z_{k+1/2} + z_{k-1/2}), \quad 1 \le k \le K,
\]

with the half-integer indices representing cell edges. Subscripts i, j, and k are used to denote the cell index in the x, y, and z directions, respectively. The width of a cell in any given dimension is denoted by Δ, e.g. $\Delta_i = x_{i+1/2} - x_{i-1/2}$.

To briefly show how to discretize the $\vec{r}$-dependence in Eq. (4.16) using the DD approach, we first drop the group index, since the solution process in each group is the same given that the scattering and fission sources are combined into an effective group-source s, which is assumed known for each group by design of the solution algorithm. The manner in which this source is known will be discussed shortly. Considering this within-group problem for a single ordinate, such that the n subscript can be omitted,

\[
\left[ \hat{\Omega}\cdot\nabla + \sigma(\vec{r}) \right] \psi(\vec{r}) = s(\vec{r}), \tag{4.18}
\]

the equation is integrated over a single spatial cell,

\[
\int_{x_{i-1/2}}^{x_{i+1/2}} dx \int_{y_{j-1/2}}^{y_{j+1/2}} dy \int_{z_{k-1/2}}^{z_{k+1/2}} dz \left( \left[ \hat{\Omega}\cdot\nabla + \sigma(\vec{r}) \right] \psi(\vec{r}) = s(\vec{r}) \right), \tag{4.19}
\]

which yields

\[
\frac{\mu}{\Delta_i}\left( \psi_{i+1/2} - \psi_{i-1/2} \right) + \frac{\eta}{\Delta_j}\left( \psi_{j+1/2} - \psi_{j-1/2} \right) + \frac{\xi}{\Delta_k}\left( \psi_{k+1/2} - \psi_{k-1/2} \right) + \sigma_{ijk}\, \psi_{ijk} = s_{ijk}, \tag{4.20}
\]

where the cell-edge fluxes are defined by

\[
\psi_{i\pm1/2,j,k} = \frac{1}{\Delta_j \Delta_k} \int_{y_{j-1/2}}^{y_{j+1/2}} dy \int_{z_{k-1/2}}^{z_{k+1/2}} dz\, \psi(x_{i\pm1/2}, y, z), \tag{4.21}
\]
\[
\psi_{i,j\pm1/2,k} = \frac{1}{\Delta_i \Delta_k} \int_{x_{i-1/2}}^{x_{i+1/2}} dx \int_{z_{k-1/2}}^{z_{k+1/2}} dz\, \psi(x, y_{j\pm1/2}, z), \tag{4.22}
\]
\[
\psi_{i,j,k\pm1/2} = \frac{1}{\Delta_i \Delta_j} \int_{x_{i-1/2}}^{x_{i+1/2}} dx \int_{y_{j-1/2}}^{y_{j+1/2}} dy\, \psi(x, y, z_{k\pm1/2}), \tag{4.23}
\]

and the cell-averaged fluxes by

\[
\psi_{i,j,k} = \frac{1}{\Delta_i \Delta_j \Delta_k} \int_{x_{i-1/2}}^{x_{i+1/2}} dx \int_{y_{j-1/2}}^{y_{j+1/2}} dy \int_{z_{k-1/2}}^{z_{k+1/2}} dz\, \psi(x,y,z). \tag{4.24}
\]

In the DD approximation the cell-edge fluxes are related to the cell-averaged flux via a simple average so that

\[
\psi_{ijk} = \tfrac{1}{2}\left( \psi_{i+1/2,j,k} + \psi_{i-1/2,j,k} \right), \tag{4.25}
\]
\[
\psi_{ijk} = \tfrac{1}{2}\left( \psi_{i,j+1/2,k} + \psi_{i,j-1/2,k} \right), \tag{4.26}
\]
\[
\psi_{ijk} = \tfrac{1}{2}\left( \psi_{i,j,k+1/2} + \psi_{i,j,k-1/2} \right), \tag{4.27}
\]

which are known as the diamond-difference relations. The angular flux is typically known on the incoming faces, either through boundary conditions or from the solution in neighboring cells, leaving four equations and four unknowns per cell per discrete ordinate, so that upon solution of this system the cell-averaged flux will be known along with the outgoing flux on the cell faces. The outgoing fluxes are then known quantities for the adjacent downwind spatial cells. Thus, for a given angle it is possible to start at a boundary where the flux is known from boundary conditions and traverse through the mesh, finding the cell-averaged flux in each spatial cell. When this process is completed for each angle in the $S_N$ approximation it constitutes a mesh sweep, or simply a sweep in transport jargon. The path taken is directionally dependent and determined by the values of μ, η, and ξ. The physical interpretation of this is very clear, since the sweep direction mimics the direction of neutron travel. This type of solution is also referred to as a wavefront solver. A more thorough description of the transport sweep and DD in particular can be found in [41]; a code sketch of the sweep is given at the end of this subsection.

The AHOT-N spatial discretization [73] is much more complex and the reference should be consulted for the details. Briefly, AHOT-N is a formalism by which the flux is expanded spatially, resulting in a set of balance equations for the spatial moments and a set of transverse-averaged equations. The spatial order can be truncated at any value and the transverse-averaged equations can be written in weighted diamond-difference form, so that a sweep process similar to the one previously described can be used within a given cell. In the numerical results to be presented in the next chapter, AHOT-N1 is primarily used to represent the AHOT-N class of methods, which indicates a linear expansion of the within-cell and edge angular fluxes.

The source term for the k-eigenvalue problem in discrete form is given by

\[
s_{ijk,g} = \sum_{g'=1}^{G} \sigma_{ijk,g'g}\, \phi_{ijk,g'} + \frac{\chi_g}{k} \sum_{g'=1}^{G} \nu\sigma_{ijk,fg'}\, \phi_{ijk,g'}, \tag{4.28}
\]

though in reality the scalar flux will also be unknown, since it is a function of the angular flux, which is unknown. This source can also be written as

\[
s_{ijk,g} = \sum_{g'=1}^{g-1} \sigma_{ijk,g'g}\, \phi_{ijk,g'} + \sigma_{ijk,gg}\, \phi_{ijk,g} + \sum_{g'=g+1}^{G} \sigma_{ijk,g'g}\, \phi_{ijk,g'} + \frac{\chi_g}{k} \sum_{g'=1}^{G} \nu\sigma_{ijk,fg'}\, \phi_{ijk,g'} \tag{4.29}
\]

by separating the contributions from downscattering, self-scattering, and upscattering, respectively. To solve the k-eigenvalue problem multiple nested levels of iteration are actually needed. However, now that the problem has been fully discretized in space, angle, and energy, a convenient operator notation will be introduced to simplify the discussion of k-eigenvalue solution techniques.
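Returning to the sweep referenced above, the following sketch performs a diamond-difference sweep for a single positive ordinate on a one-dimensional slab, the 1-D analogue of Eqs. (4.20) and (4.25). The routine and its argument names are illustrative only, and the group source `s` is assumed already built as in Eq. (4.28).

```python
import numpy as np

def dd_sweep_1d(mu, sigma_t, dx, s, psi_in):
    """Diamond-difference sweep for one ordinate with mu > 0 on a 1-D slab.
    Combining the 1-D balance equation with the diamond relation
    psi_cell = (psi_right + psi_left)/2 gives one explicit solve per cell."""
    psi_cell = np.zeros_like(s)
    psi_edge = psi_in                            # incoming flux on the left face
    for i in range(len(s)):                      # march along the flight direction
        c = 2.0 * mu / dx[i]
        psi_cell[i] = (s[i] + c * psi_edge) / (sigma_t[i] + c)
        psi_edge = 2.0 * psi_cell[i] - psi_edge  # diamond relation: outgoing edge
    return psi_cell, psi_edge                    # cell averages, exiting flux

# For mu < 0 the loop runs from the right boundary to the left; repeating
# this for every ordinate and every group constitutes one mesh sweep.
```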

4.2 Operator Notation

The transport equation in operator notation can be written much more compactly, which will expedite the discussion of solution techniques. This operator notation may not be standard, but it is steadily gaining acceptance. This introduction to the notation mirrors that found in [72], which is an excellent source of information on both the discretization of the transport problem and the translation from the discrete equations to the operator notation being utilized.

The first operator we will introduce was discussed in the previous section. This is the streaming-plus-removal operator, denoted by L, which signifies the spatially discretized form of $[\hat{\Omega}_n\cdot\nabla + \sigma_g(\vec{r})]$, including the problem boundary conditions. L represents a lower triangular matrix that is never constructed or stored explicitly; rather, the matrix is inverted implicitly as necessary via the transport sweep described earlier. If we consider the spatially discretized formulation of the within-group problem, Eq. (4.18), with some specified source vector q, the problem can be written

\[
\mathbf{L}\psi = q, \tag{4.30}
\]

which is solved by finding $\mathbf{L}^{-1} q$. In this case ψ is a vector whose dimension equals the product of the number of ordinates and the number of spatial cells, i.e. ψ has an entry corresponding to each spatial cell for each ordinate. The vector q contains the fixed sources for the corresponding cells and ordinates; the actual construction of q will be discussed shortly. This problem is solved using the transport sweep process. Regardless of the specifics of the sweep, which are determined by the discretization scheme chosen, the problem can be written in this manner, and thus L, also referred to as the transport operator, is used to represent the transport process. L can be used to represent the within-group transport operator and it can also be used to represent the discretization of $[\hat{\Omega}_n\cdot\nabla + \sigma_g(\vec{r})]$ over all energy groups. In this case L becomes a block-diagonal matrix with $\mathbf{L}_g$ on the diagonal,

\[
\mathbf{L} = \begin{bmatrix} \mathbf{L}_1 & & & \\ & \mathbf{L}_2 & & \\ & & \ddots & \\ & & & \mathbf{L}_G \end{bmatrix}. \tag{4.31}
\]
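Because L is block diagonal in energy, applying $\mathbf{L}^{-1}$ amounts to an independent sweep in each group; a minimal sketch (with a hypothetical per-group sweep routine standing in for an existing sweeper) is:

```python
def apply_L_inverse(q, sweep_group):
    """Apply L^{-1} to a source partitioned by group (Eq. 4.31): each
    diagonal block L_g is inverted by its own transport sweep and the
    matrix is never formed. `sweep_group(g, q_g)` is a placeholder."""
    return [sweep_group(g, q_g) for g, q_g in enumerate(q)]
```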

This quirk in notation will be utilized often in the present chapter, though if there is any ambiguity whether the equation in question is for a specific energy group or over all groups it will be explicitly stated. The M and D operators, which will be discussed shortly, are similar in that they can be used to operate at the within-group level, and if used to operate on the entire problem they become block-diagonal operators. Using this notation for the transport operator, the fully discretized k-eigenvalue problem in operator notation can now be written as

\[
\mathbf{L}\psi = \mathbf{MSD}\psi + \frac{1}{k}\mathbf{MFD}\psi. \tag{4.32}
\]

The operators can be described as follows: L is the transport operator, M the moments-to-discrete operator, S the scattering operator, D the discrete-to-moments operator, and F the fission operator. The vector ψ can be written

\[
\psi = \left[ \psi_1^T\ \psi_2^T\ \cdots\ \psi_g^T\ \cdots\ \psi_G^T \right]^T, \tag{4.33}
\]

where each group's vector $\psi_g$ is given by

\[
\psi_g = \left[ \psi_{g,1}^T\ \psi_{g,2}^T\ \psi_{g,3}^T\ \cdots\ \psi_{g,N}^T \right]^T, \tag{4.34}
\]

with the vector $\psi_{g,n}$ containing all the unknowns for group g and angle n over all spatial cells. For the DD discretization these unknowns are simply the angular fluxes, but for the AHOT-N methods spatial moments increase the size of the vector $\psi_{g,n}$. The scattering operator is given by the block matrix

\[
\mathbf{S} = \begin{bmatrix} \mathbf{S}_{11} & \mathbf{S}_{12} & \cdots & \mathbf{S}_{1G} \\ \mathbf{S}_{21} & \mathbf{S}_{22} & \cdots & \mathbf{S}_{2G} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{S}_{G1} & \mathbf{S}_{G2} & \cdots & \mathbf{S}_{GG} \end{bmatrix}, \tag{4.35}
\]

where each block is itself a block-diagonal matrix defined by

\[
\mathbf{S}_{gg'} = \begin{bmatrix} [\sigma^0_{g'g}] & & & & \\ & [\sigma^1_{g'g}] & & & \\ & & [\sigma^1_{g'g}] & & \\ & & & [\sigma^1_{g'g}] & \\ & & & & \ddots \\ & & & & & [\sigma^L_{g'g}] \end{bmatrix}. \tag{4.36}
\]

Eq. (4.35) is a square matrix which can be interpreted in physical terms quite easily. Consider the splitting of the scattering source shown in Eq. (4.29), where the downscattering, upscattering, and self-scattering have been separated. When written for all groups, angles, and spatial unknowns this leads to the S operator. The strictly lower triangular portion of S corresponds to downscattering, the diagonal to self-scattering, and the strictly upper triangular portion to upscattering. Thus, for problems without upscattering S is a lower triangular matrix. In the case that energy groups are only directly coupled, S will be a banded block matrix with only the diagonal and a single subdiagonal. In Eq. (4.36) each diagonal entry is a diagonal matrix that maps cross sections to spatial cells and scattering anisotropy order; it is a square matrix whose order is equal to the number of spatial unknowns in the problem. Equivalent in dimension to the scattering operator is the fission operator, F, written

\[
\mathbf{F} = \begin{bmatrix} \chi_1 \mathbf{F}_1 & \chi_1 \mathbf{F}_2 & \cdots & \chi_1 \mathbf{F}_G \\ \chi_2 \mathbf{F}_1 & \chi_2 \mathbf{F}_2 & \cdots & \chi_2 \mathbf{F}_G \\ \vdots & \vdots & \ddots & \vdots \\ \chi_G \mathbf{F}_1 & \chi_G \mathbf{F}_2 & \cdots & \chi_G \mathbf{F}_G \end{bmatrix}. \tag{4.37}
\]

Again the block matrix is composed of group-dependent blocks, $\mathbf{F}_g$ in this case. Each $\mathbf{F}_g$, like the group scattering operators from Eq. (4.36), is a diagonal block matrix whose diagonal is composed of matrices which map cross sections to spatial cells and scattering anisotropy order,

\[
\mathbf{F}_g = \begin{bmatrix} [\nu\sigma_{f,g}] & & & \\ & 0 & & \\ & & \ddots & \\ & & & 0 \end{bmatrix}. \tag{4.38}
\]

The group fission operators are, however, composed primarily of zero blocks, since the fission cross sections are isotropic and thus there are only entries for the l = 0 angular moment. Still, it is convenient to define the fission operator as shown so the k-eigenvalue problem can be written as in Eq. (4.32). It is useful from an implementation standpoint to note that the application of F to some vector can be simplified, since the only difference between the block rows of the matrix operator appearing in Eq. (4.37) is the scalar factor $\chi_g$. Using this knowledge, only

\[
y = \left[ \mathbf{F}_1\ \mathbf{F}_2\ \cdots\ \mathbf{F}_G \right] v \tag{4.39}
\]

needs to be calculated, where v is some appropriately sized vector. The application of F to v is then given by

\[
\mathbf{F}v = \left[ \chi_1 y^T\ \chi_2 y^T\ \cdots\ \chi_G y^T \right]^T. \tag{4.40}
\]
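The simplification of Eqs. (4.39)-(4.40) is easy to exploit in code. The sketch below (the array shapes are assumptions of this illustration) computes the group-independent fission rate y once and then scales it by the spectrum:

```python
import numpy as np

def apply_F(v, nu_sigma_f, chi):
    """Apply the fission operator without forming F (Eqs. 4.39-4.40).
    v:          l = 0 flux moments, shape (G, Nc*Nu); higher moments do not
                contribute since fission is isotropic
    nu_sigma_f: nu*sigma_f mapped to groups and cells, shape (G, Nc*Nu)
    chi:        fission spectrum, shape (G,)
    """
    y = (nu_sigma_f * v).sum(axis=0)    # y = [F_1 ... F_G] v, computed once
    return chi[:, None] * y[None, :]    # block row g of Fv is chi_g * y
```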

Both the S and F operators operate upon angular flux moments, while L operates upon discrete angular fluxes. The process of mapping discrete angular fluxes to angular flux moments is done via the D operator, while harmonic moment quantities can be mapped to discrete ordinates by applying the M operator. The M operator is comprised of the spherical harmonics, and the dimensions of this operator are determined by the number of Legendre moments used in the scattering expansion and the number of angles used in the discrete ordinates approximation. For the within-group problem the moments-to-discrete operator is given by

\[
\mathbf{M} = \begin{bmatrix}
[Y^e_{00}(\hat{\Omega}_1)] & [Y^e_{10}(\hat{\Omega}_1)] & [Y^o_{11}(\hat{\Omega}_1)] & [Y^e_{11}(\hat{\Omega}_1)] & [Y^e_{20}(\hat{\Omega}_1)] & \cdots & [Y^o_{LL}(\hat{\Omega}_1)] & [Y^e_{LL}(\hat{\Omega}_1)] \\
[Y^e_{00}(\hat{\Omega}_2)] & [Y^e_{10}(\hat{\Omega}_2)] & [Y^o_{11}(\hat{\Omega}_2)] & [Y^e_{11}(\hat{\Omega}_2)] & [Y^e_{20}(\hat{\Omega}_2)] & \cdots & [Y^o_{LL}(\hat{\Omega}_2)] & [Y^e_{LL}(\hat{\Omega}_2)] \\
[Y^e_{00}(\hat{\Omega}_3)] & [Y^e_{10}(\hat{\Omega}_3)] & [Y^o_{11}(\hat{\Omega}_3)] & [Y^e_{11}(\hat{\Omega}_3)] & [Y^e_{20}(\hat{\Omega}_3)] & \cdots & [Y^o_{LL}(\hat{\Omega}_3)] & [Y^e_{LL}(\hat{\Omega}_3)] \\
\vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
[Y^e_{00}(\hat{\Omega}_N)] & [Y^e_{10}(\hat{\Omega}_N)] & [Y^o_{11}(\hat{\Omega}_N)] & [Y^e_{11}(\hat{\Omega}_N)] & [Y^e_{20}(\hat{\Omega}_N)] & \cdots & [Y^o_{LL}(\hat{\Omega}_N)] & [Y^e_{LL}(\hat{\Omega}_N)]
\end{bmatrix}, \tag{4.41}
\]

where each [·] quantity is a square diagonal matrix whose diagonal entries are all identical: the spherical harmonic for the indicated moment and ordinate. This is because the spherical harmonics have no spatial or energy dependence, which implies that the full multigroup M is a block-diagonal matrix whose blocks are given by the within-group M in Eq. (4.41). In the case of isotropic scattering M reduces to a scalar quantity, since only the l = 0 moment is retained. Using the D operator, the relationship between the flux moments and discrete fluxes within an energy group can be written as

\[
\phi_g = \mathbf{D}\psi_g, \tag{4.42}
\]

where

\[
\phi = \left[ \phi_1^T\ \phi_2^T\ \cdots\ \phi_G^T \right]^T, \tag{4.43}
\]

such that each $\phi_g$ contains all of the group's angular moments,

\[
\phi_g = \left[ \phi_{00}^{g\,T}\ \phi_{10}^{g\,T}\ \upsilon_{11}^{g\,T}\ \phi_{11}^{g\,T}\ \phi_{20}^{g\,T}\ \cdots\ \upsilon_{LL}^{g\,T}\ \phi_{LL}^{g\,T} \right]^T. \tag{4.44}
\]

Each angular moment vector above contains all of the unknowns for a given group and flux moment for all cells in the adopted mesh. The operator D can be written

\[
\mathbf{D} = \mathbf{M}^T \mathbf{W}, \tag{4.45}
\]

which has no energy dependence, so the full D operator is again a block-diagonal matrix with the diagonal blocks determined by the transpose of the within-group operator M, given by Eq. (4.41). In Eq. (4.45) the block matrix W is composed of the weights of the angular quadrature set used for the $S_N$ approximation,

\[
\mathbf{W} = \begin{bmatrix} [w_1] & & & \\ & [w_2] & & \\ & & \ddots & \\ & & & [w_N] \end{bmatrix}. \tag{4.46}
\]

Each of the blocks $[w_n]$ is a diagonal matrix whose entries are all $w_n$, the quadrature weight for angle n. Using the multigroup form of D, Eq. (4.42) can be written

\[
\phi = \mathbf{D}\psi. \tag{4.47}
\]

It is necessary to note that, though discrete fluxes can be mapped to angular moments with φ = Dψ, it is not true that M can be used similarly, such that ψ = Mφ. We now denote the total number of spatial cells by $N_c$, the number of unknowns per cell by $N_u$, the number of energy groups by $N_g$, the number of angles in the $S_N$ approximation by $N_a$, and the number of flux moments in the scattering source expansion by $N_m$. Using these quantities we define

\[
g_m = N_c N_u N_m, \tag{4.48}
\]
\[
g_o = N_c N_u N_a, \tag{4.49}
\]
\[
t_m = N_c N_u N_m N_g, \tag{4.50}
\]
\[
t_o = N_c N_u N_a N_g, \tag{4.51}
\]

which permit the operators and quantities introduced to be summarized as follows:

Discrete Angular Fluxes (ψ): vector of length $t_o$ composed of discrete angular fluxes and spatial flux moments.

Angular Flux Moments (φ): vector of length $t_m$ composed of angular-spatial flux moments; in the case of isotropic scattering, simply the scalar flux.

Transport Operator (L): block-diagonal matrix of dimension $t_o \times t_o$; the diagonal blocks are the within-group operator of dimension $g_o \times g_o$. The specific structure of L is determined by the manner in which the transport equation is discretized in space. $\mathbf{L}^{-1}$ represents an $S_N$ sweep over all energy groups.

Moment-to-Discrete Operator (M): dimensions are $t_o \times t_m$; maps harmonic moments to discrete angles. The within-group moment-to-discrete operator, also denoted by M, has dimension $g_o \times g_m$.

Discrete-to-Moment Operator (D): dimensions are $t_m \times t_o$; maps discrete angular fluxes to angular flux moments. The within-group discrete-to-moment operator, also denoted by D, has dimension $g_m \times g_o$.

Scattering Operator (S): potentially full block matrix with dimensions $t_m \times t_m$. Individual blocks, $\mathbf{S}_{gg'}$, have dimensions $g_m \times g_m$. These blocks are composed of blocks denoted $[\sigma^l_{g'g}]$, which have dimensions $(N_c N_u) \times (N_c N_u)$.

Fission Operator (F): full block matrix with dimensions $t_m \times t_m$. Individual blocks, $\mathbf{F}_g$, have dimensions $g_m \times g_m$. These blocks are composed of blocks denoted $[\nu\sigma_{f,g}]$, which have dimensions $(N_c N_u) \times (N_c N_u)$. Each block $\mathbf{F}_g$ is mostly filled with zeros due to the fact that fission is isotropic.

Though these operators and matrices are well defined, it is important to remember that they are never actually constructed. The sweep is used whenever $\mathbf{L}^{-1}$ is necessary; the lower triangular matrix L is never stored. The scattering and fission operators are never actually constructed, since they are mostly zeros; instead the sources are built directly using the known mappings of cross sections to spatial cells and anisotropic scattering order. Likewise, M and D should be thought of as operations performed and not explicit matrices stored. The operator notation is convenient because it allows the discretized transport equation to be written compactly, but in any numerical solution literally implementing the equations as written would be wasteful.
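Since every operator dimension above is expressed in the quantities of Eqs. (4.48)-(4.51), it may help to pin the bookkeeping down in a one-line helper (a trivial sketch; the function name is illustrative):

```python
def unknown_counts(Nc, Nu, Ng, Na, Nm):
    """Vector lengths from Eqs. (4.48)-(4.51): per-group moment and ordinate
    unknowns (g_m, g_o) and their multigroup totals (t_m, t_o)."""
    g_m, g_o = Nc * Nu * Nm, Nc * Nu * Na
    return g_m, g_o, g_m * Ng, g_o * Ng
```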

4.3 Traditional Solution Techniques

If Eq. (4.32) is written using the relationship in Eq. (4.47), then the multigroup $S_N$ transport equation can be written as

\[
\mathbf{L}\psi = \mathbf{MS}\phi + \frac{1}{k}\mathbf{MF}\phi, \tag{4.52}
\]

with the spatial discretization left unspecified. The solution to this problem, comprising an eigenpair (k, ψ), or more typically (k, φ), is generally found through an iterative process, normally the classic power iteration. If we apply $\mathbf{DL}^{-1}$ to this equation and rearrange, we obtain

\[
\left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS} \right)\phi = \frac{1}{k}\, \mathbf{DL}^{-1}\mathbf{MF}\phi. \tag{4.53}
\]

Defining the operators A and B as

\[
\mathbf{A} = \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS}, \tag{4.54}
\]
\[
\mathbf{B} = \mathbf{DL}^{-1}\mathbf{MF}, \tag{4.55}
\]

the problem can be written

\[
\mathbf{A}\phi = \frac{1}{k}\mathbf{B}\phi, \tag{4.56}
\]

which is the generalized eigenvalue problem. This can be written as a standard eigenvalue problem via

\[
\mathbf{A}^{-1}\mathbf{B}\phi = k\phi. \tag{4.57}
\]

Using the power method this problem is solved by performing the iterations

\[
\phi^{(l+1)} = \frac{1}{k^{(l)}}\, \mathbf{A}^{-1}\mathbf{B}\phi^{(l)}, \tag{4.58}
\]
\[
k^{(l+1)} = k^{(l)}\, \frac{\left\| \mathbf{F}\phi^{(l+1)} \right\|_1}{\left\| \mathbf{F}\phi^{(l)} \right\|_1}, \tag{4.59}
\]

where Eq. (4.59) is only one of many possible update formulas for the eigenvalue k and l is the iteration index. This index points to the outermost level of iteration in the typical implementation of the power method for solving the k-eigenvalue problem and is aptly called the outer iteration.
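A sketch of this outer iteration follows. Here `apply_Ainv_B` and `apply_F` are placeholders for an existing transport implementation of the actions $\mathbf{A}^{-1}\mathbf{B}$ and $\mathbf{F}$, and the stopping test on k alone is a simplification of the convergence criteria discussed later.

```python
import numpy as np

def power_iteration(apply_Ainv_B, apply_F, phi, k, tol=1e-6, max_outer=500):
    """Classic outer (power) iteration, Eqs. (4.58)-(4.59): update the flux
    with A^{-1}B, then update k from the ratio of fission-source 1-norms."""
    for _ in range(max_outer):
        phi_new = apply_Ainv_B(phi) / k
        k_new = k * np.linalg.norm(apply_F(phi_new), 1) \
                  / np.linalg.norm(apply_F(phi), 1)
        if abs(k_new - k) < tol:
            return phi_new, k_new
        phi, k = phi_new, k_new
    return phi, k
```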

4.3.1 Inner Iterations

Unlike in Chapter 2 with diffusion theory, A in this case is not an explicitly defined matrix, so the action of $\mathbf{A}^{-1}$ on B must be examined further. The inverse of A applied to a vector b, such that $\mathbf{A}x = b$, is

\[
x = \left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS} \right)^{-1} b. \tag{4.60}
\]

The action of this inverse is generally calculated through a basic block Gauss-Seidel iteration, of which a brief description can be found in [11]. The block matrix A contains $N_g \times N_g$ blocks, which requires the solution of $N_g$ coupled linear systems at each Gauss-Seidel iteration. It is the solution of these linear systems that requires what are commonly known as inner iterations. The Gauss-Seidel iteration itself is considered an intermediate level of iteration, and in fact is only necessary in the presence of upscattering; otherwise simple forward substitution can be used to solve the $N_g$ coupled linear systems. Returning to Eq. (4.58), the Gauss-Seidel iteration for a block g is given by

\[
\left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS}_{gg} \right) \phi_g^{(m+1,l)} = \mathbf{DL}^{-1}\mathbf{M}\left( \frac{1}{k^{(l)}}\mathbf{F}\phi^{(l)} + \sum_{g'=1}^{g-1} \mathbf{S}_{gg'}\phi_{g'}^{(m+1,l)} + \sum_{g'=g+1}^{G} \mathbf{S}_{gg'}\phi_{g'}^{(m,l)} \right), \tag{4.61}
\]

where m is the Gauss-Seidel iteration index. If the source term Q is defined as

\[
Q = \mathbf{M}\left( \frac{1}{k^{(l)}}\mathbf{F}\phi^{(l)} + \sum_{g'=1}^{g-1} \mathbf{S}_{gg'}\phi_{g'}^{(m+1,l)} + \sum_{g'=g+1}^{G} \mathbf{S}_{gg'}\phi_{g'}^{(m,l)} \right), \tag{4.62}
\]

then the linear system for each group that is embedded in the block Gauss-Seidel iteration can be compactly written

\[
\left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS}_{gg} \right) \phi_g = \mathbf{DL}^{-1} Q, \tag{4.63}
\]

where the iteration superscripts of the intermediate and outer iterations have been suppressed. Generally this system is solved using a fixed-point iteration (Richardson iteration), referred to as source iteration in the context of numerical solutions to the transport equation. The fixed-point iteration for $\mathbf{A}x = b$ is

\[
x^{(n+1)} = (\mathbf{I} - \mathbf{A})x^{(n)} + b, \tag{4.64}
\]

such that the fixed-point solution of Eq. (4.63) is given by

\[
\phi_g^{(n+1)} = \mathbf{DL}^{-1}\mathbf{MS}_{gg}\phi_g^{(n)} + \mathbf{DL}^{-1} Q, \tag{4.65}
\]

where n is the inner iteration index. The more traditional formulation of source iteration is instead given by the iterative scheme

\[
\mathbf{L}\psi_g^{(n+1)} = \mathbf{MS}_{gg}\phi_g^{(n)} + Q, \tag{4.66}
\]
\[
\phi_g^{(n+1)} = \mathbf{D}\psi_g^{(n+1)}, \tag{4.67}
\]

which can easily be shown to be equivalent to Eq. (4.65). This iterative process is known to have convergence issues in scattering-dominated regimes in large (weakly-leaking) configurations, where the spectral radius limits to unity. Acceleration of source iterations is an important numerical tool, as the source iteration is the primary unit of work done in any transport solution. Each source iteration requires one transport sweep; thus, reducing the number of source iterations reduces the total number of sweeps. The cost of computing the source vectors is negligible compared to the cost of a sweep, so halving the number of source iterations (and thus sweeps) effectively halves the cost of obtaining a solution, though the total cost also includes the cost of the acceleration instructions. Lewis and Miller [41] briefly discuss some of the methods used to accelerate inner iterations, such as rebalance and synthetic acceleration methods. A great deal more about acceleration can be found in the literature, as it has been studied extensively in the past 30 years. Among the more well-known techniques are Diffusion Synthetic Acceleration (DSA), Transport Synthetic Acceleration (TSA), and Coarse Mesh Rebalance. The specifics of these methods are not directly related to this work, though it is important to note that the production codes in use today invariably use some form of acceleration in conjunction with source iteration.

Recently, an alternative to source iteration has been proposed which solves the within-group equation using a Krylov subspace method, generally GMRES [74].
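Both the classic source iteration of Eq. (4.65) and this GMRES alternative, detailed next, hinge on the same kernel: one sweep with a self-scattering source built from the current iterate. A minimal sketch of the two options follows; the callable `DLinvMS` and the right-hand side b are placeholders wrapping an existing sweeper, and the GMRES tolerance keyword is `rtol` in recent SciPy releases (older releases call it `tol`).

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def solve_within_group(DLinvMS, b, use_gmres=True, tol=1e-8):
    """Solve (I - D L^{-1} M S_gg) phi_g = b for one group.
    DLinvMS(v) applies D L^{-1} M S_gg, i.e. one transport sweep on a
    scattering source built from v; b = D L^{-1} Q comes from a sweep on
    the fixed group source Q."""
    n = b.size
    if use_gmres:
        A_tilde = LinearOperator((n, n), matvec=lambda v: v - DLinvMS(v),
                                 dtype=float)
        phi, _ = gmres(A_tilde, b, rtol=tol)
        return phi
    phi = np.zeros(n)                       # source iteration, Eq. (4.65)
    while True:
        phi_new = DLinvMS(phi) + b
        if np.linalg.norm(phi_new - phi) <= tol * np.linalg.norm(b):
            return phi_new
        phi = phi_new
```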

The Krylov approach does not suffer the same convergence issues that plague source iteration, and DSA techniques can easily be used as preconditioners without meeting the stringent consistency requirement imposed by the DSA formalism. This alternative to the standard source iteration is promising enough to be included in the ORNL Denovo transport code [72]. It is easy to see how the GMRES alternative works by starting with Eq. (4.63) and making the following definitions:

\[
\tilde{\mathbf{A}} = \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS}_{gg}, \tag{4.68}
\]
\[
\tilde{b} = \mathbf{DL}^{-1} Q, \tag{4.69}
\]

such that Eq. (4.63) can be written as $\tilde{\mathbf{A}}\phi_g = \tilde{b}$. As described in previous chapters, GMRES only requires the action of a matrix on a vector, so only $\tilde{\mathbf{A}}v$ needs to be calculated. For the inner iterations this can easily be done using existing instructions in a typical implementation of the mesh sweep; $(\mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS}_{gg})v$ can be calculated in three simple steps:

1. $y = \mathbf{MS}_{gg} v$
2. $z = \mathbf{DL}^{-1} y$
3. $v - z = (\mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS}_{gg})v$

Thus each within-group solve requires a sweep to build $\tilde{b}$ and a sweep for each matrix-vector multiply (GMRES iteration); this is exactly the matrix-vector product used in the sketch above. The method used to perform within-group iterations is secondary to the k-eigenvalue methods we will solve with Newton's method. In the following chapter on numerical results both source iteration (SI) and the GMRES formulation of the within-group problem will be used at different points; which methodology is being used in any set of results will be explicitly stated.

4.3.2 Upscattering Treatment

The foundation of the numerical approach to k-eigenvalue problems has been described to this point. The outer iterations update the fission source and the eigenvalue, while intermediate block Gauss-Seidel iterations are used to converge the energy dependence when upscattering is present. Often in the presence of upscattering the full block Gauss-Seidel iteration is not performed, due to the computational expense, and instead only a single iteration is performed. This is equivalent to lagging the upscattering source by one outer iteration, i.e. the upscattering source is constructed using the flux moments from the previous outer iteration. This source is combined with the fission source, and the energy dependence in the self-scattering and downscattering is treated through the forward substitution. This can be written explicitly in operator notation by splitting the scattering operator into self-scattering, downscattering, and upscattering components: $\mathbf{S}_D$, $\mathbf{S}_L$, and $\mathbf{S}_U$, respectively. The self-scattering operator is block-diagonal, the downscattering operator is strictly block-lower-triangular, and the upscattering operator is strictly block-upper-triangular, such that $\mathbf{S} = \mathbf{S}_L + \mathbf{S}_D + \mathbf{S}_U$. With this splitting the k-eigenvalue problem can be written

\[
\mathbf{L}\psi = \mathbf{M}(\mathbf{S}_L + \mathbf{S}_D + \mathbf{S}_U)\phi + \frac{1}{k}\mathbf{MF}\phi. \tag{4.70}
\]

If $\mathbf{DL}^{-1}$ is applied, this equation can be rearranged such that

\[
\left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{M}(\mathbf{S}_L + \mathbf{S}_D) \right)\phi = \mathbf{DL}^{-1}\mathbf{M}\left( \frac{1}{k}\mathbf{F} + \mathbf{S}_U \right)\phi. \tag{4.71}
\]

Using the superscript l to again represent the outer iteration, this can be written as an iterative process:

\[
\left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{M}(\mathbf{S}_L + \mathbf{S}_D) \right)\phi^{(l+1)} = \mathbf{DL}^{-1}\mathbf{M}\left( \frac{1}{k^{(l)}}\mathbf{F} + \mathbf{S}_U \right)\phi^{(l)}, \tag{4.72}
\]

where k is updated using Eq. (4.59) or a similar formula. This variation of Eq. (4.58) is often used in the nuclear community and mistakenly referred to as power iteration. In downscattering-only problems this formulation is equivalent to Eq. (4.58), but when upscattering is present the inverse of A is not being applied to B, which results in an iterative process that resembles fixed-point iteration and not the well-known power iteration. If the A operator is split such that

\[
\mathbf{A}_L = \mathbf{I} - \mathbf{DL}^{-1}\mathbf{M}(\mathbf{S}_L + \mathbf{S}_D), \tag{4.73}
\]
\[
\mathbf{A}_U = \mathbf{DL}^{-1}\mathbf{MS}_U, \tag{4.74}
\]

then the eigenvalue problem can be written

\[
\mathbf{A}_L\phi^{(l+1)} = \left( \frac{1}{k^{(l)}}\mathbf{B} + \mathbf{A}_U \right)\phi^{(l)}. \tag{4.75}
\]

The inverse of $\mathbf{A}_L$ is then found using block forward substitution, which still requires the solution of the within-group problem for each energy group. It can easily be seen that upon convergence the solution to Eq. (4.75) satisfies the same condition as Eq. (4.58). If we use $\phi^{(\infty)}$ and $k^{(\infty)}$ to denote the converged quantities, then Eq. (4.58) becomes

\[
\mathbf{A}\phi^{(\infty)} = \frac{1}{k^{(\infty)}}\mathbf{B}\phi^{(\infty)}, \tag{4.76}
\]

which we know to satisfy both the intermediate iterations over energy and the within-group iterations on the self-scattering source. If we use the same notation to denote the convergence of Eq. (4.75), then

\[
\mathbf{A}_L\phi^{(\infty)} = \left( \frac{1}{k^{(\infty)}}\mathbf{B} + \mathbf{A}_U \right)\phi^{(\infty)}. \tag{4.77}
\]

Knowing that $\mathbf{A} = \mathbf{A}_L - \mathbf{A}_U$, this relationship can trivially be shown to be identical to Eq. (4.76). So, though they will have different convergence properties and computational costs, upon convergence the traditional power iteration and the fixed-point iteration due to the splitting of A will have the same set of eigenpairs. Before moving on to the Newton approach to the transport k-eigenvalue problem, we will briefly review what a production-level implementation looks like by describing the solution scheme used in the PARTISN code from Los Alamos [75].

4.3.3 Realistic Implementation

PARTISN (PARallel, TIme-Dependent SN) is a production-level numerical solver for the neutron transport equation developed and maintained by Los Alamos National Laboratory [75]. Among the many transport problems it is capable of solving is the k-eigenvalue problem using the $S_N$ angular approximation. The solution algorithm uses diffusion synthetic acceleration of the outer iterations and provides the option of either diffusion or transport synthetic acceleration for the inner iterations. The diffusion solver uses either conjugate gradient or a multigrid approach, and the diffusion k-eigenvalue problem is accelerated using Chebyshev acceleration. The scattering term in PARTISN can be treated via the $P_N$ approximation shown earlier in this chapter. Not only is the combination of numerical methods in PARTISN rather intricate, the iterative tolerances have also been finely tuned, resulting in an iterative process that is heavily optimized with respect to reducing execution time. The new methods developed in this work should not be seen as a direct replacement for this optimized process but as the core of an alternative method which seems likely to be more efficient, or at the least competitive, once it is as finely tuned. While many production-level codes have sophisticated k-eigenvalue strategies, PARTISN is chosen here primarily for the clarity of its documentation.

Figure 4.1 is a simplified view of the iterative strategy PARTISN uses to solve the k-eigenvalue problem. In the diagram we are only concerned with problems containing fission or upscattering; otherwise the information flow for a fixed-point problem is given. It should be noted that the first branching point in the flow chart indicates that the upscattering source is treated at the same iterative level as the fission source. This implies that there is no intermediate iteration over energy and instead the k-eigenvalue problem is solved using the formulation of Eq. (4.75). We can see that the first calculation performed is the diffusion k-eigenvalue problem, which is converged using Chebyshev acceleration. The resulting fission source and scalar flux are used to create the fission and upscattering sources in the transport calculation. A transport sweep is done and the results are used to create a diffusion coefficient for a diffusion calculation according to the DSA process. This diffusion calculation results in a new scalar flux, constituting a single accelerated inner iteration. If multiple iterations were being performed, then this scalar flux would be used to update the self-scattering source and the cycle would repeat until the specified convergence. Upon completion of the inner iterations another k-eigenvalue diffusion calculation is performed, and the combination of inner iterations and diffusion sub-outer iterations is continued until convergence is achieved. To further improve this process the iterative tolerances are carefully chosen.

Figure 4.1: PARTISN Iterative Strategy, taken from [75]

By default the convergence tolerance on the inner iterations is a relative error of $10^{-4}$ in the cell-wise scalar flux. However, the default maximum number of inner iterations is set at 1, so that initially only one accelerated inner per outer is performed. In the diffusion sub-outer calculation two criteria are checked, one on the cell-wise fission source and one on the eigenvalue: the relative point-wise error in the fission source is required to be less than $3\max(10^{-4}, \epsilon_p)$, where $\epsilon_p$ is a tolerance tied to the outer iteration index p, and the eigenvalue is converged to $\max(10^{-4}, \epsilon_p)$. The transport fission source is also monitored for convergence; if the relative point-wise fission source error is less than $10^{-3}$, this triggers an increase in the number of inners per outer from 1 to a preset larger number, and it is expected that only 1 to 2 additional outers will be necessary at this point. The ultimate convergence criterion requires both the eigenvalue error and the relative point-wise error in the fission source to be less than $10^{-4}$, and by default the maximum number of outer iterations performed is 20. As numerical results in the following chapters will show, the number of outers necessary when inners are not accelerated and there are no diffusion sub-outers is invariably greater than 20; the fact that 20 is the default is indicative of the effectiveness of the acceleration used by PARTISN.

This discussion is meant to portray the intricacies associated with k-eigenvalue problems in practice. The relatively easy-to-follow formulation of the problem in Eq. (4.58) or Eq. (4.75) becomes quite complicated when implemented in an optimized manner. Results generated using the basic formulation of Newton's method that we will develop in the remaining sections of the present chapter are not compared to a PARTISN-like implementation of traditional power or fixed-point iteration but instead to an implementation of Eq. (4.58) or Eq. (4.75) as described, which can be considered a fair comparison.

4.4 Newton's Method and Transport Theory

With the transport equation discretized and conveniently written in operator notation, it is almost trivial to rewrite the k-eigenvalue problem as a nonlinear system to be solved with Newton's method, following the examples provided in Chapter 2. Though the actual operators have changed drastically, they can still be used to write either a generalized eigenvalue problem or the standard eigenvalue problem. If the loss and production diffusion operators are replaced by their transport analogues, A and B, then Newton's method for the generalized eigenvalue problem,

\[
\mathbf{A}\phi = \lambda\mathbf{B}\phi, \tag{4.78}
\]

can be written in terms of

\[
\Gamma(u) = \begin{bmatrix} \mathbf{A}\phi - \lambda\mathbf{B}\phi \\ \rho(\phi,\lambda) \end{bmatrix}, \qquad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}, \tag{4.79}
\]

where all of the quantities were defined in Section 4.2, such that φ contains all of the flux moments and A and B are the transport operators defined by Eqs. (4.54) and (4.55). The relationship ρ(φ, λ), used to complete the system of equations, is unspecified at this point. Again the eigenvalue, k, has been replaced with its reciprocal, as in diffusion theory. Though the k-eigenvalue is used exclusively, it is often simpler to work with λ when working with Newton's method, since reciprocals unnecessarily complicate the derivatives involved in the Jacobian. Since neither λ nor k is used to represent any other quantity, it will be clear from the equations if one is replaced by its reciprocal. This is directly analogous to Eq. (2.12) for diffusion theory. Any root of Eq. (4.79) is an eigenpair, and so solving the nonlinear system using Newton's method will result in a solution to the k-eigenvalue problem. Newton's method is given by

\[
\Gamma'(u^{(m)})\,\delta u^{(m)} = -\Gamma(u^{(m)}), \qquad u^{(m+1)} = u^{(m)} + \delta u^{(m)}, \tag{4.80}
\]

where Γ′ is the Jacobian of Eq. (4.79). Using the previous notation, the first $t_m$ equations of Γ are denoted by $\Gamma_\phi$ and the last equation by $\Gamma_\lambda$, such that $\Gamma = [\Gamma_\phi^T\ \Gamma_\lambda]^T$; this equation can be written

\[
\begin{bmatrix} \dfrac{\partial\Gamma_\phi}{\partial\phi} & \dfrac{\partial\Gamma_\phi}{\partial\lambda} \\[1ex] \dfrac{\partial\Gamma_\lambda}{\partial\phi} & \dfrac{\partial\Gamma_\lambda}{\partial\lambda} \end{bmatrix}_{u=u^{(m)}} \delta u^{(m)} = -\begin{bmatrix} \mathbf{A}\phi^{(m)} - \lambda^{(m)}\mathbf{B}\phi^{(m)} \\ \rho(\phi^{(m)},\lambda^{(m)}) \end{bmatrix}, \qquad u^{(m)} = \begin{bmatrix} \phi^{(m)} \\ \lambda^{(m)} \end{bmatrix}, \tag{4.81}
\]

where $t_m$ represents the number of unknowns over all spatial cells, energy groups, and flux moments. Thus, for isotropic scattering with only one spatial unknown per cell (such as with DD) the transport form of Γ is no larger than in the diffusion problem. That is not to say that they require the same amount of computer memory, since the discrete angular fluxes used in the $S_N$ approximation are not present in diffusion theory, meaning more intermediate storage is required for the transport problem. Leaving ρ unspecified, the linear system for a single Newton step can be written

\[
\begin{bmatrix} \mathbf{A} - \lambda\mathbf{B} & -\mathbf{B}\phi \\[0.5ex] \dfrac{\partial\rho(\phi,\lambda)}{\partial\phi} & \dfrac{\partial\rho(\phi,\lambda)}{\partial\lambda} \end{bmatrix} \delta u = -\begin{bmatrix} \mathbf{A}\phi - \lambda\mathbf{B}\phi \\ \rho(\phi,\lambda) \end{bmatrix}, \qquad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}. \tag{4.82}
\]

While this allows the problem to be written in a simple manner, it does not help to illuminate the actual calculation process of any of the quantities required by Newton's method. Regardless of whether Newton-Krylov (exact Jacobian-vector product with GMRES) or Jacobian-Free Newton-Krylov is being used, it is necessary to evaluate Γ repeatedly. Writing the A and B operators out fully, Γ becomes

\[
\Gamma(u) = \begin{bmatrix} \left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS} \right)\phi - \lambda\mathbf{DL}^{-1}\mathbf{MF}\phi \\ \rho(\phi,\lambda) \end{bmatrix}, \qquad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}. \tag{4.83}
\]

Using the linearity of the transport sweep to our advantage, we can rewrite the previous equation such that only one sweep (per group) is required during each evaluation of Γ:

\[
\Gamma(u) = \begin{bmatrix} \phi - \mathbf{DL}^{-1}\mathbf{M}(\mathbf{S} + \lambda\mathbf{F})\phi \\ \rho(\phi,\lambda) \end{bmatrix}, \qquad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}. \tag{4.84}
\]

This can be simplified by defining a new operator

\[
\mathbf{P} = \mathbf{DL}^{-1}\mathbf{M}(\mathbf{S} + \lambda\mathbf{F}), \tag{4.85}
\]

such that

\[
\Gamma(u) = \begin{bmatrix} \phi - \mathbf{P}\phi \\ \rho(\phi,\lambda) \end{bmatrix}, \qquad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}, \tag{4.86}
\]

which looks remarkably similar to Eq. (2.43), the acceleration of the power iteration in diffusion theory by the JFNK approach. Here we are similarly accelerating a fixed-point iteration P using Newton's method. The fixed-point iteration in this case would be

\[
\phi^{(l+1)} = \mathbf{DL}^{-1}\mathbf{M}\left( \mathbf{S} + \lambda^{(l)}\mathbf{F} \right)\phi^{(l)}, \tag{4.87}
\]

which does not really look similar to Eq. (4.58) or Eq. (4.75), the fixed-point solutions that we have already examined. This is indeed a different iteration, which is briefly mentioned in [41] as the starting point for a class of k-eigenvalue DSA schemes. In this scheme the inner iterations on self-scattering and the intermediate iterations over energy have been flattened, such that what was previously three levels of iteration can now be represented by the single index l. This assessment agrees with the whole idea behind using Newton iterations to solve Aφ = λBφ, as opposed to the standard eigenvalue problem where the inverse of A must be calculated. The first question one must ask is whether this equation produces the same set of eigenpairs as power iteration. We can show this to be true in the same way it was shown that Eq. (4.75) produces the same solution as Eq. (4.58). If we again use $\phi^{(\infty)}$ and $k^{(\infty)}$ (or $\lambda^{(\infty)}$) to denote the converged quantities, then Eq. (4.87) becomes

\[
\phi^{(\infty)} = \mathbf{DL}^{-1}\mathbf{M}\left( \mathbf{S} + \lambda^{(\infty)}\mathbf{F} \right)\phi^{(\infty)}. \tag{4.88}
\]

Rearranging this relationship we can write

\[
\left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS} \right)\phi^{(\infty)} = \lambda^{(\infty)}\mathbf{DL}^{-1}\mathbf{MF}\phi^{(\infty)}, \tag{4.89}
\]
\[
\mathbf{A}\phi^{(\infty)} = \frac{1}{k^{(\infty)}}\mathbf{B}\phi^{(\infty)}, \tag{4.90}
\]

which is equivalent to the power method solution

\[
\phi^{(\infty)} = \frac{1}{k^{(\infty)}}\mathbf{A}^{-1}\mathbf{B}\phi^{(\infty)}. \tag{4.91}
\]

Thus, Eq. (4.87) provides an alternative to both the power and fixed-point formulations of the k-eigenvalue problem previously discussed. While this formulation removes the nested levels of iteration, making a single outer iteration relatively inexpensive, it will converge much more slowly. However, since the convergence rate of Newton's method is not dependent on the spectral radius of Eq. (4.87), and Γ will be relatively cheap to evaluate, it is possible that this problem formulation will work well in conjunction with Newton's method.
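A compact sketch of the resulting JFNK algorithm is given below. Here `gamma` stands for any of the residual formulations of this section (e.g., Eq. (4.86)), the forward-difference perturbation is one common way to approximate the Jacobian-vector product, and the loose GMRES tolerance reflects the inexact-Newton philosophy; all parameter choices are illustrative, not the tuned values studied in Chapter 5.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def jfnk(gamma, u0, tol=1e-8, max_newton=50, eps=1e-7, gmres_rtol=1e-2):
    """Inexact Newton-GMRES with a Jacobian-free matvec, Eq. (4.80):
    Gamma'(u) v is approximated by (Gamma(u + h v) - Gamma(u)) / h."""
    u = np.asarray(u0, dtype=float).copy()
    for _ in range(max_newton):
        r = gamma(u)
        if np.linalg.norm(r) < tol:
            break
        def jv(v, u=u, r=r):
            h = eps * max(np.linalg.norm(u), 1.0) / max(np.linalg.norm(v), 1e-30)
            return (gamma(u + h * v) - r) / h
        J = LinearOperator((u.size, u.size), matvec=jv, dtype=float)
        du, _ = gmres(J, -r, rtol=gmres_rtol)   # loose, inexact linear solve
        u = u + du
    return u[:-1], 1.0 / u[-1]                  # (phi, k), with k = 1/lambda
```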

4.4.1 Evaluating Γ(P)

In fact, Eq. (4.86) can be used to describe a family of formulations of the k-eigenvalue problem using Newton's method if P is taken to represent some operator that maps an input set of flux moments to an updated set of flux moments which are a better approximation to the eigenvector. Three different options for P have been presented so far,

\[
\mathbf{P}_{P} = \lambda\mathbf{A}^{-1}\mathbf{B}, \tag{4.92}
\]
\[
\mathbf{P}_{FP} = \mathbf{A}_L^{-1}(\lambda\mathbf{B} + \mathbf{A}_U), \tag{4.93}
\]
\[
\mathbf{P}_{F} = \mathbf{DL}^{-1}\mathbf{M}(\mathbf{S} + \lambda\mathbf{F}), \tag{4.94}
\]

resulting in the nonlinear functions

\[
\Gamma_P(u) = \begin{bmatrix} \phi - \lambda\mathbf{A}^{-1}\mathbf{B}\phi \\ \rho(\phi,\lambda) \end{bmatrix}, \tag{4.95}
\]
\[
\Gamma_{FP}(u) = \begin{bmatrix} \phi - \mathbf{A}_L^{-1}(\lambda\mathbf{B} + \mathbf{A}_U)\phi \\ \rho(\phi,\lambda) \end{bmatrix}, \tag{4.96}
\]
\[
\Gamma_F(u) = \begin{bmatrix} \phi - \mathbf{DL}^{-1}\mathbf{M}(\mathbf{S} + \lambda\mathbf{F})\phi \\ \rho(\phi,\lambda) \end{bmatrix}, \tag{4.97}
\]

due to the operator P of Eqs. (4.58), (4.75), and (4.87), respectively. Again $u = [\phi^T\ \lambda]^T$. To gain a better understanding of the cost involved in evaluating each of these nonlinear functions for some given vector v, we break down the operations associated with each. It is worth noting that when Γ is evaluated at u the operators have some physical meaning, since the contents of u are the scalar flux and the multiplication factor. However, when a JFNK approach is employed and Γ is evaluated at some arbitrary vector v, the physical interpretation becomes less clear, since v is some perturbation of the vector u, i.e. $v = u + \epsilon y$, where y is the vector generated by GMRES.
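In code, all three formulations share one template: a residual built from whatever fixed-point map P is supplied, closed by a constraint ρ. The sketch below assumes P is available as a callable wrapping one of Eqs. (4.92)-(4.94); the normalization shown is just one possible choice of ρ (constraint relations are examined later in this work).

```python
import numpy as np

def make_gamma(apply_P, rho):
    """Build Gamma(u) = [phi - P(phi, lam); rho(phi, lam)] for a generic
    fixed-point map, covering Eqs. (4.95)-(4.97); apply_P(phi, lam) wraps
    whichever of P_P, P_FP, or P_F has been implemented."""
    def gamma(u):
        phi, lam = u[:-1], u[-1]
        return np.concatenate([phi - apply_P(phi, lam), [rho(phi, lam)]])
    return gamma

def rho_norm(phi, lam):
    """One possible constraint: fix the eigenvector normalization."""
    return 0.5 * (1.0 - phi @ phi)
```

This pairs directly with the `jfnk` sketch above, e.g. `jfnk(make_gamma(apply_P_F, rho_norm), u0)` for the flattened formulation.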

4.4.1.1 Accelerating Power Iteration

The cost of using Newton's method to accelerate the power iteration is relatively high. Each evaluation of Γ in this instance requires the application of $\mathbf{A}^{-1}\mathbf{B}$ to the vector v, which is the same as performing a single outer iteration in the traditional sense: nested levels of iteration on self-scattering and energy. The full iteration over energy is generally seen as prohibitively expensive in the context of traditional techniques, so performing this operation at each GMRES iteration is likely to be extremely inefficient. This can be seen by considering the evaluation process for Γ. Consider some $v = [v_\phi^T\ v_\lambda]^T$, where $v_\phi$ has the same dimensions as φ and $v_\lambda$ is a scalar. The component $v_\phi$ can be considered to be an augmented vector composed of blocks $v_g$ such that $v_\phi = [v_1 \cdots v_G]$, where $v_g$ and $\phi_g$ have the same dimension. Using this notation, $\Gamma_P(v)$ is written

\[
\Gamma_P(v) = \begin{bmatrix} v_\phi - v_\lambda\mathbf{A}^{-1}\mathbf{B}v_\phi \\ \rho(v_\phi, v_\lambda) \end{bmatrix}. \tag{4.98}
\]

This would be evaluated in the same manner as a single outer iteration, which is equivalent to solving a fixed-source problem. The evaluation process begins in the first energy group, where there is no downscattering by definition, such that

\[
\left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS}_{11} \right) v_1^{(m+1)} = \mathbf{DL}^{-1}\mathbf{M}\left( v_\lambda\mathbf{F}v_\phi + \sum_{g'=2}^{G} \mathbf{S}_{1g'}\, v_{g'}^{(m)} \right). \tag{4.99}
\]

In this process $v_\phi^{(0)}$ is simply the vector passed from GMRES. To find $v_1^{(m+1)}$ it is necessary to apply the operation

\[
\left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS}_{11} \right)^{-1}, \tag{4.100}
\]

which results in a second level of iteration, the inner iterations associated with $S_N$ transport theory. The remaining groups, $g = 2,\dots,G$, are solved by

\[
\left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS}_{gg} \right) v_g^{(m+1)} = \mathbf{DL}^{-1}\mathbf{M}\left( v_\lambda\mathbf{F}v_\phi + \sum_{g'=1}^{g-1} \mathbf{S}_{gg'}\, v_{g'}^{(m+1)} + \sum_{g'=g+1}^{G} \mathbf{S}_{gg'}\, v_{g'}^{(m)} \right), \tag{4.101}
\]

where again there are inner iterations associated with each group which must meet some specified criteria. If $v_\phi^{(m+1)}$ meets the convergence criteria for the iterations over energy, then the iterations are terminated and $v_\phi^{(m+1)}$ is denoted by $v_\phi^*$; otherwise the process begins again for group 1. With the iterations over self-scattering and energy converged, the vector-valued function Γ can be written as

\[
\Gamma_P(v) = \begin{bmatrix} v_\phi - v_\phi^* \\ \rho(v_\phi, v_\lambda) \end{bmatrix}, \tag{4.102}
\]

where ρ is still unspecified. When this Γ is used with the JFNK approximation, this process must be carried out for each Jacobian-vector multiplication, which means once per GMRES iteration. This results in a valid, but understandably expensive, technique. If the splitting of the scattering source used in Eq. (4.75) is instead used as P, then one of the two levels of iteration in the algorithm described above can be removed.

4.4.1.2 Accelerating Fixed-Point Iteration

If we now consider the evaluation of Eq. (4.96) for some vector v, we see that it will be much less expensive, as each group will necessarily only be considered once. Replacing u in Eq. (4.96) with v, we write

\[
\Gamma_{FP}(v) = \begin{bmatrix} v_\phi - \mathbf{A}_L^{-1}(v_\lambda\mathbf{B} + \mathbf{A}_U)v_\phi \\ \rho(v_\phi, v_\lambda) \end{bmatrix}. \tag{4.103}
\]

The source for group 1 is now completely specified, since there will be no iterations due to upscattering. The equation for this group is then given by

\[
\left( \mathbf{I} - \mathbf{DL}^{-1}\mathbf{MS}_{11} \right) v_1^* = \mathbf{DL}^{-1}\mathbf{M}\left( v_\lambda\mathbf{F}v_\phi + \sum_{g'=2}^{G} \mathbf{S}_{1g'}\, v_{g'} \right). \tag{4.104}
\]

166 152 The remaining groups, g = 2,..., G are solved by ( g 1 ( ) I DL 1 MS gg vg = DL 1 M v λ Fv φ + S gg vg + G g =1 g =g+1 ) S gg v g (4.105) where v is again used to denote the new value of v, v = A 1 L (v λb + A U ) v φ. Again each group requires the solution of a linear problem due to the application of ( I DL 1 MS gg ) 1 such that Γ(v) is given by Γ FP (v) = [ vφ v φ ρ(v φ, v λ ) ]. (4.106) The expense of the evaluation of Γ is now completely determined by the number of inner iterations performed in each group. If too many inners are performed then Γ will be expensive to evaluate while if few inners are performed, either due to rapid convergence or loosening the stopping criterion, the Γ evaluation will be inexpensive. More discussion on the convergence of the inner iterations will be given shortly Accelerating Flattened Fixed-Point Iteration The flattened k-eigenvalue iteration given by Eq. (4.87) for an arbitrary vector v is [ ] vφ DL 1 M (S + v λ F) v φ Γ F (v) = (4.107) ρ(v φ, v λ ) where now there are no nested levels of iteration. Each group is solved using ( ) g 1 G vg = DL 1 M v λ Fv φ + S gg v g + S gg v g + S gg v g (4.108) g =1 g =g+1 to find v. It is not even necessary that the groups be solved in any certain order since they are not coupled in any way through this iterative scheme. This is worth noting because it implies the the resulting solution algorithm is amenable to syn-

167 153 chronous coarse-grain parallelization providing good potential for highly efficient parallel performance on non-massively parallel platforms. Out of the three choices of P considered, see Eqs. (4.92), (4.93), and (4.94), this is by far the least expensive computationally since no iterations at all are required to evaluate Γ. This is the transport equivalent of the diffusion formulation JFNK(GEP) which only requires matrix-vector multiplies. We can also show that a simple variation of this approach is intimately related to the fixed-point formulation of P, given by Eq. (4.93) Variations on Flattened Fixed-Point Iteration The flattened formulation could be considered the logical endpoint of the simplification process for Eq. (4.95) where the cost per iteration is lessened with the expectations that the total number of iterations will increase. If the iterations over upscattering are dropped such that one Gauss-Seidel iteration is performed then Eq. (4.95) reduces to Eq. (4.96) given that v φ is used as the initial guess. We can also show that Eq. (4.96) yields a variation of Eq. (4.97) if a single inner iteration is performed, using source iteration, with v g (0) as the initial guess. that We begin by writing the first source iteration for an arbitrary guess v (0) such (1) L ψ g = MS gg v g (0) + Q (4.109) (1) where ψ g is the angular-flux-like quantity resulting from the initial guess v g (0). If v g (0) (1) were the scalar flux then ψ g would equal ψ g (1). Equation (4.109) can be manipulated to show that v (1) g = ( DL 1 MS gg v (0) + DL 1 Q). (4.110) The group source Q can now be replaced using the template in Eq. (4.62), where the upscattering index can be dropped from the in-scattering sources as the upscattering is being treated like in the fixed-point formulation, Eq. (4.75), and not the full power method, Eq. (4.58). The quantity v g (which replaces φ g in Eq. (4.62)) in the upscattering and downscattering sources is known from the vector v or from the solution to a previous group, though it is important to note that it

is not related to the source iteration initial guess, $v_g^{(0)}$, in any way; the latter is still arbitrary. Thus, as in the previous section, $v_g^*$ represents the solution in group $g$:

\[
v_g^{(1)} = DL^{-1}M\Big( S_{gg}\, v_g^{(0)} + \lambda F v_\phi + \sum_{g'=1}^{g-1} S_{gg'}\, v_{g'}^{(1)} + \sum_{g'=g+1}^{G} S_{gg'}\, v_{\phi g'} \Big). \tag{4.111}
\]

This can be written more compactly as

\[
v_\phi^{(1)} = \big(I - DL^{-1}MS_L\big)^{-1} DL^{-1}M\,\big[\, S_D\, v^{(0)} + (\lambda F + S_U)\, v_\phi \,\big]. \tag{4.112}
\]

Now the impact of the initial guess can be seen very clearly. If the initial guess for the inner iterations is chosen to be $v^{(0)} = v_\phi$, then Eq. (4.96) using a single source iteration per group is equivalent to Eq. (4.97) where the downscattering source is constructed using previously computed group fluxes. The same of course holds true for the actual iterative formulations, i.e., Eq. (4.75) using a single source iteration per group is equivalent to this variation of Eq. (4.87) for the particular initial guess $\phi^{(0)} = \phi^{(l)}$, where $l$ is the outer iteration index. Thus it can easily be shown that Eq. (4.75), using a single source iteration per group per outer with $v^{(0)} = v_\phi$ as the initial guess, converges to an eigenpair in the same manner it was shown Eq. (4.87) will converge to an eigenpair.

However, consider the initial guess $v_g^{(0)} = 0$. A single source iteration in this case yields

\[
v_\phi^{(1)} = \big(I - DL^{-1}MS_L\big)^{-1} DL^{-1}M\,(\lambda F + S_U)\, v_\phi, \tag{4.113}
\]

which contains no contribution from the self-scattering. If the fixed-point formulation in Eq. (4.75) is used to define a new iteration scheme by using one source iteration per group per outer iteration (with a zero initial guess for source iteration), then this defines the iteration

\[
\phi^{(l+1)} = \big(I - DL^{-1}MS_L\big)^{-1} DL^{-1}M\,\big( \lambda^{(l)} F + S_U \big)\, \phi^{(l)} \tag{4.114}
\]

which will not converge to an eigenpair. This demonstrates how the choice of initial guess allows the inner iterations to be commingled with the outer and upscattering iterations, such that there is only one level of iteration present. This

also indicates that there is some transition from Eq. (4.112) to Eq. (4.75). Multiple SI inner iterations could be done using $v_\phi$ as the initial guess; though this is not equivalent to Eq. (4.112) and not exactly equivalent to Eq. (4.75) (because the self-scattering may not be fully converged), it lies somewhere in between. However, should some other initial guess be used for the inners, the distinction becomes very clear, because Eq. (4.112) will converge to an eigenpair while Eq. (4.75) will converge only if the inners are converged. Performing multiple inners per outer is intuitively wasteful in the Newton sense because it only serves to increase the computational expense of each $\Gamma$ evaluation. Still, the point is not that this is better, only that it is possible.

We now look more closely at how the previous discussion relates to the flattened iterative scheme. Assuming the groups are solved in order from fast to thermal, updated values of $v_g^*$ are available for computation of the downscattering source. Likewise, the fission source could be recalculated for each energy group using the newest information. Since the fission source term forms the right-hand side in the Gauss-Seidel iteration of the formulations in Eq. (4.58), it cannot be updated during that iteration or the Gauss-Seidel iterations will never converge. However, since we are not performing the Gauss-Seidel iteration in Eqs. (4.75) or (4.87), this restriction is obsolete. Updating the downscattering and fission sources (or performing multiple inner iterations) can be formulated in practice without any difficulty or additional storage; however, additional manipulation is required so that they can be represented by an operator $P$.

Starting with the flattened approach and using the updated values for the downscattering source, but treating the inners as described above, results in a k-eigenvalue problem of the form

\[
\big(I - DL^{-1}MS_L\big)\, \phi^{(l+1)} = DL^{-1}M\,\big( \lambda^{(l)} F + S_D + S_U \big)\, \phi^{(l)} \tag{4.115}
\]

which can be shown to converge to an eigenpair in the same manner described previously. This is identical to the iterative scheme in Eq. (4.112), explicitly showing that this variation on the flattened iteration is equivalent to Eq. (4.75) using source iteration for inners, with one source iteration per group per outer and using $\phi^{(l)}$ as the initial guess for SI. Using this to formulate a Newton method we define

$P$ as

\[
P_{FD} = \big(I - DL^{-1}MS_L\big)^{-1} DL^{-1}M\,(\lambda F + S_D + S_U). \tag{4.116}
\]

The operator $(I - DL^{-1}MS_L)$ in this case is inverted using simple forward substitution, where the linear system associated with each group is trivial to solve since the block diagonal of the operator is the identity matrix. Updating the fission source requires that it be split into $F = F_L + F_D + F_U$: a strictly block-lower-triangular matrix, a block-diagonal matrix, and a strictly block-upper-triangular matrix, respectively. Doing so results in the fixed-point iteration

\[
\big[\, I - DL^{-1}M\big( S_L + \lambda^{(l)} F_L \big) \big]\, \phi^{(l+1)} = DL^{-1}M\,\big[\, \lambda^{(l)} (F_D + F_U) + S_D + S_U \,\big]\, \phi^{(l)} \tag{4.117}
\]

yielding the fixed-point operator

\[
P_{FDF} = \big[\, I - DL^{-1}M\,(S_L + \lambda F_L) \big]^{-1} DL^{-1}M\,\big[\, \lambda (F_D + F_U) + S_D + S_U \,\big] \tag{4.118}
\]

for use with Newton's method. Again, forward substitution can be used to solve the linear system associated with each group. These two updating schemes result in two more formulations of Newton's method which can be used to solve the k-eigenvalue problem:

\[
\Gamma_{FD}(u) = \begin{bmatrix} \phi - \big(I - DL^{-1}MS_L\big)^{-1} DL^{-1}M\,(\lambda F + S_D + S_U)\, \phi \\ \rho(\phi, \lambda) \end{bmatrix} \tag{4.119}
\]

\[
\Gamma_{FDF}(u) = \begin{bmatrix} \phi - \big[\, I - DL^{-1}M\,(S_L + \lambda F_L) \big]^{-1} DL^{-1}M\,\big[\, \lambda (F_D + F_U) + S_D + S_U \,\big]\, \phi \\ \rho(\phi, \lambda) \end{bmatrix}. \tag{4.120}
\]

These methods do not require any additional sweeps compared to Eq. (4.97), so their impact on the convergence rate will determine their effectiveness. Though in the form written above, namely with two appearances of $DL^{-1}$ in each of Eqs. (4.119) and (4.120), it looks as if additional sweeps will be required, this is actually not the case: the downscattering and fission sources created using the most recent data do not need to be swept separately, since they can be grouped with the sweep over all other sources, their values being known via the forward substitution.
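To illustrate why no extra sweeps are needed, the sketch below applies $P_{FD}$ of Eq. (4.116) as a block forward substitution; the per-group hooks (sweep and the three source builders) are hypothetical stand-ins for components an existing S_N code would already contain:

    def apply_P_FD(phi, lam, sweep, fission_source, self_scatter,
                   up_scatter, down_scatter):
        # Apply P_FD by forward substitution over groups (fast to thermal).
        # The block diagonal of (I - D L^{-1} M S_L) is the identity, so each
        # group costs only the single sweep hidden in sweep(g, q).
        G = phi.shape[0]
        phi_new = phi.copy()
        for g in range(G):
            q = (lam * fission_source(g, phi)   # lagged fission source
                 + self_scatter(g, phi)         # S_D acting on the input iterate
                 + up_scatter(g, phi)           # S_U acting on the input iterate
                 + down_scatter(g, phi_new))    # S_L acting on updated fluxes
            phi_new[g] = sweep(g, q)
        return phi_new

A $P_{FDF}$ variant would differ only in also drawing the block-lower fission contribution from phi_new rather than phi.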

Five formulations of the k-eigenvalue problem using Newton's method have been presented, all possessing the same general form

\[
\Gamma(u) = \begin{bmatrix} \phi - P\phi \\ \rho(\phi, \lambda) \end{bmatrix}, \qquad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}. \tag{4.121}
\]

In each of the five methods this $P$ is some kind of fixed-point operator that is related to the power iteration traditionally used to solve the k-eigenvalue problem in transport theory. This is significant because it allows all of the Newton methods considered in this chapter to rely on an existing k-eigenvalue $S_N$ code for the bulk of the computation. It is important, however, to understand what the k-eigenvalue calculation comprises, so that the cost of each application of $P$ can be assessed. The k-eigenvalue iterative schemes determined by Eqs. (4.92) and (4.93) most closely resemble the traditional power iteration; in the absence of upscattering they are equivalent. Eq. (4.92) iterates on upscattering when present, while Eq. (4.93) lags the upscattering source one outer iteration behind, meaning the fission source and upscattering source are created using the same flux estimate. The iterative scheme defined by Eq. (4.94) also lags the upscattering source, but if based on adapting an existing code it would require only one inner iteration per outer, given that the initial guess for that inner is $\phi$ or $v_\phi$. This scheme also uses the same $\phi$ or $v_\phi$ to construct the downscattering source, even if $\phi_g^*$ or $v_g^*$ is known, which is a variation from a traditional outer iteration. The operator $P_{FD}$ in Eq. (4.116) represents the more traditional treatment of downscattering, where the results from higher energy groups are used to construct the downscatter source. Equation (4.118) takes this even further by updating the fission source in each group using information from higher energy groups. This is never done in the context of traditional solution techniques but is perfectly acceptable in the nonlinear formulation of the k-eigenvalue problem.

4.6 Constraint Equation

To this point we have used $\rho$ to represent the final equation in the nonlinear system, leaving the actual function unspecified. In the preliminary diffusion theory work of this project two separate approaches were used.

One approach used a relationship to impose a normalization condition on the eigenvector, $\phi$, while the other used a relationship which sought to accelerate the convergence of the eigenvalue. This condition will be referred to as the constraint equation in future discussions, in light of the fact that if one considers the nonlinear form of the k-eigenvalue problem as an optimization problem, $\rho$ is an optimization constraint.

The simplest constraint is one that is often used when Newton's method is applied to eigenvalue problems in the mathematical community. This constraint is here referred to as the normalization constraint (N) and is given by

\[
\rho_N(\phi, \lambda) = \tfrac{1}{2}\big( \phi^T \phi - 1 \big). \tag{4.122}
\]

This constraint ensures that the converged eigenvector will have an $L_2$ norm of unity. The normalization has no physical interpretation and is used primarily for mathematical convenience; it would be just as reasonable to normalize the flux moments using the $L_1$ or $L_\infty$ norm. For a reactor it is often more meaningful to normalize the flux using the current power level, though in multiplying systems the normalization condition used is frequently to normalize the fission source to unit production, i.e., one fission neutron born. These are also valid conditions for use in the constraint equation and would still be classified as normalization-based constraints.

Another type of relationship which may be used as a constraint is an updated eigenvalue estimate. In diffusion theory this kind of constraint was written $\rho(\phi, \lambda) = \lambda - \hat\lambda(\phi, \lambda)$. Upon convergence $\rho = 0$, since $\hat\lambda$ is an updated eigenvalue estimate and at convergence $\lambda = \hat\lambda$. Once in the asymptotic regime, as the residual $\Gamma(u)$ decreases with each Newton iteration, this type of constraint results in increasingly accurate eigenvalue estimates. Writing the k-eigenvalue update formula of Eq. (4.59) in terms of $\lambda$ results in

\[
\lambda^{(l+1)} = \lambda^{(l)}\, \frac{\big\| F\phi^{(l)} \big\|_1}{\big\| F\phi^{(l+1)} \big\|_1}. \tag{4.123}
\]

Written as a constraint equation this becomes

\[
\rho\big(\phi^{(l)}, \phi^{(l+1)}, \lambda^{(l)}\big) = \lambda^{(l)} - \lambda^{(l)}\, \frac{\big\| F\phi^{(l)} \big\|_1}{\big\| F\phi^{(l+1)} \big\|_1}, \tag{4.124}
\]

which can be simplified further by recognizing that $\phi^{(l+1)} = P\phi^{(l)}$, where $P$ is whatever operator is being used to form the nonlinear function in Newton's method. This allows the previous equation to be simplified to

\[
\rho_{FR}(\phi, \lambda) = \lambda - \lambda\, \frac{\| F\phi \|_1}{\| FP\phi \|_1}. \tag{4.125}
\]

This constraint equation is called the Fission Rate (FR) constraint, and it has a clear physical interpretation since it involves the ratio of fission sources. It is advantageous because it is commonly used in existing implementations, whereas the Flux-Weighted constraint, discussed next, is based on an eigenvalue update not generally used for the k-eigenvalue problem.

The Flux-Weighted (FW) approach is also an eigenvalue update formula and can be viewed as a generalization of the FR constraint. The FW constraint is very similar to the Rayleigh quotient, which was described in Chapter 1. The Rayleigh quotient iteration is built around the Rayleigh quotient update formula, which yields an eigenvalue approximation given an approximation to the eigenvector. For the eigenvalue problem

\[
Ax = \lambda x \tag{4.126}
\]

the Rayleigh quotient is given by

\[
\lambda = \frac{x^T A x}{x^T x}, \tag{4.127}
\]

which is an estimate of the eigenvalue, $\lambda$, due solely to the eigenvector estimate $x$. A generalized form of the Rayleigh quotient is not possible using the $P$ options we have developed, but it is possible to develop a Rayleigh-quotient-like constraint, the Flux-Weighted constraint, using the FR constraint as a model. The FW constraint is of the general weighted form

\[
\rho_{FW}(\phi, \lambda) = \lambda - \lambda\, \frac{w\,\phi}{w\,P\phi} \tag{4.128}
\]

where $w$ is some weighting operator. In the FR constraint the weighting is effected via the operator $E^T F$, where $E^T = [1, 1, \ldots, 1]$.
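Written out, each constraint is an inexpensive scalar function of the current iterate. A sketch (Python; F is the fission matrix, apply_P is a hypothetical application of the chosen fixed-point operator, and all fission-source entries are assumed nonnegative so that $E^T F\phi$ equals the 1-norm):

    import numpy as np

    def rho_N(phi, lam):
        # normalization constraint, Eq. (4.122): zero when ||phi||_2 = 1
        return 0.5 * (phi @ phi - 1.0)

    def rho_FR(phi, lam, F, apply_P):
        # fission-rate constraint, Eq. (4.125)
        return lam - lam * np.sum(F @ phi) / np.sum(F @ apply_P(phi, lam))

    def rho_FW(phi, lam, apply_P):
        # weighted form of Eq. (4.128) with w = phi^T, which yields the
        # flux-weighted constraint introduced next
        return lam - lam * (phi @ phi) / (phi @ apply_P(phi, lam))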

However, we can also use the flux moments vector itself, such that

\[
\rho(\phi, \lambda) = \lambda - \lambda\, \frac{\phi^T \phi}{\phi^T P\phi}, \tag{4.129}
\]

to obtain an expression resembling the Rayleigh quotient. Just as with the FR constraint, this can also be written as an update formula for a fixed-point iteration, such that

\[
\lambda^{(l+1)} = \lambda^{(l)}\, \frac{\phi^T \phi}{\phi^T P\phi}. \tag{4.130}
\]

Though all of the eigenvalue-update constraints are written in terms of $\lambda$, they can just as easily be formulated for $k$, such that

\[
k^{(l+1)} = k^{(l)}\, \frac{\| FP\phi \|_1}{\| F\phi \|_1} \tag{4.131}
\]

\[
k^{(l+1)} = k^{(l)}\, \frac{\phi^T P\phi}{\phi^T \phi}. \tag{4.132}
\]

Using the general idea of weighting as in Eq. (4.128), any number of eigenvalue update schemes could be generated. The two types of constraints presented here, normalization and eigenvalue updates, are not the only conceivable options; any other constraint on the eigenpair could be imposed. However, in this work we will only consider the FR and FW schemes, along with the N normalization constraint, in which the $L_2$ norm of the flux moments is forced to unity. Each of these constraints changes the nonlinear function $\Gamma$ and thus directly impacts the convergence rate of Newton's method. We will also see during the presentation of numerical results that the choice of constraint can affect the robustness of the method as well.

4.7 Evaluating Γ(P)

Using the techniques discussed up to this point it is possible to fully implement any of the Newton approaches using the JFNK approximation, given that $\rho$ is specified, since in this case the ability to evaluate $\Gamma$ is all that is necessary to solve the problem. That is, the combination of Eq. (4.80), Eq. (4.86), one of the

definitions of $P$, and the JFNK approximation,

\[
J(u)v \approx \frac{\Gamma(u + \epsilon v) - \Gamma(u)}{\epsilon}, \tag{4.133}
\]

can be used together to solve the k-eigenvalue problem. However, for the different operators $P$ discussed earlier it is also possible to operate directly with the exact Jacobian-vector product, with the final form of the Jacobian depending both on the selected $P$ and on the constraint equation used. The Newton problem can be written in a generic manner such that

\[
\begin{bmatrix} \dfrac{\partial}{\partial\phi}(\phi - P\phi) & \dfrac{\partial}{\partial\lambda}(\phi - P\phi) \\[6pt] \dfrac{\partial}{\partial\phi}\,\rho(\phi, \lambda) & \dfrac{\partial}{\partial\lambda}\,\rho(\phi, \lambda) \end{bmatrix} \delta u = - \begin{bmatrix} \phi - P\phi \\ \rho(\phi, \lambda) \end{bmatrix}, \qquad u = \begin{bmatrix} \phi \\ \lambda \end{bmatrix}. \tag{4.134}
\]

If the Jacobian is considered a $2 \times 2$ block matrix, then the derivatives in the two blocks of the first row are given by Tables 4.1 and 4.2. At first glance it may seem that multiplication of the upper blocks of the Jacobian by some vector will be at least twice as expensive as the JFNK counterpart, since each block contains a sweep. However, the source terms can be combined before the sweep is performed, using the linearity of the operator, so that only one sweep per group is still necessary. As an example, multiplying the top row of the Jacobian formed by the flattened operator $P_F$, Eq. (4.94), by a vector $v$ gives

\[
\big(I - DL^{-1}M(S + \lambda F)\big)\, v_\phi - v_\lambda\, DL^{-1}MF\phi \tag{4.135}
\]

where $v$ is the vector the Jacobian is multiplied by, i.e. $J(u)\,v$, and $\phi$ and $\lambda$ are the current values of $u$. These sweeps can be combined so that the operation can be written

\[
v_\phi - DL^{-1}M\big[\, (S + \lambda F)\, v_\phi + v_\lambda F\phi \,\big]. \tag{4.136}
\]

This can be done for all choices of $P$, so the application of the true Jacobian should always bear a cost equivalent to that of the JFNK method. However, since the JFNK approach is much simpler to implement, it is preferred unless it can be shown to have disadvantages that are not shared by the Newton-Krylov formulations.
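The practical appeal of Eq. (4.133) is that GMRES only ever sees the Jacobian through a matrix-free product. A minimal sketch of one inexact Newton step built this way (Python with recent SciPy; gamma(u) is the nonlinear residual, and the eps and eta values shown are illustrative placeholders rather than the settings used in this work):

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    def jfnk_step(gamma, u, eps=1.0e-7, eta=1.0e-2):
        # One inexact Newton step: solve J(u) du = -Gamma(u) without forming J.
        g0 = gamma(u)
        n = u.size

        def jv(v):
            # finite-difference Jacobian-vector product, Eq. (4.133),
            # with the perturbation scaled by the size of v
            nv = np.linalg.norm(v)
            if nv == 0.0:
                return np.zeros(n)
            e = eps / nv
            return (gamma(u + e * v) - g0) / e

        J = LinearOperator((n, n), matvec=jv)
        du, _ = gmres(J, -g0, rtol=eta)   # forcing factor eta, cf. Eq. (5.10)
        return u + du

Each GMRES iteration therefore costs one extra residual evaluation, i.e., one application of $P$, which is why the cost of evaluating $\Gamma$ dominates the entire calculation.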

The two blocks in the second row of the Jacobian are specific to the selected constraint equation and, for the eigenvalue-update constraints, also specific to the operator $P$.

Table 4.1: Derivatives of $(\phi - P\phi)$ with respect to $\phi$

  P         $(\partial/\partial\phi)(\phi - P\phi)$
  (4.92)    $I - \lambda A^{-1}B$
  (4.93)    $I - A_L^{-1}(\lambda B + A_U)$
  (4.94)    $I - DL^{-1}M(S + \lambda F)$
  (4.116)   $I - (I - DL^{-1}MS_L)^{-1}\, DL^{-1}M(\lambda F + S_D + S_U)$
  (4.118)   $I - [\,I - DL^{-1}M(S_L + \lambda F_L)\,]^{-1}\, DL^{-1}M[\,\lambda(F_D + F_U) + S_D + S_U\,]$

Table 4.2: Derivatives of $(\phi - P\phi)$ with respect to $\lambda$

  P         $(\partial/\partial\lambda)(\phi - P\phi)$
  (4.92)    $-A^{-1}B\phi$
  (4.93)    $-A_L^{-1}B\phi$
  (4.94)    $-DL^{-1}MF\phi$
  (4.116)   $-(I - DL^{-1}MS_L)^{-1}\, DL^{-1}MF\phi$
  (4.118)   $-[\,I - DL^{-1}M(S_L + \lambda F_L)\,]^{-1}\, DL^{-1}M(F_D + F_U)\,\phi$

Tables 4.3 and 4.4 show the constraint equation derivatives that comprise the corresponding blocks of the Jacobian. In Table 4.4 the term $P_\lambda$ refers to the derivative of $P$ with respect to $\lambda$, which can be determined from Table 4.2 for all of the $P$ operators considered; specifically, $P_\lambda$ can easily be extracted from that table by removing a factor of $\phi$ (together with the leading minus sign). The normalization constraint introduces very simple entries into the Jacobian, explaining its popularity. We can see that the FW constraint requires the application of $P^T$ (Table 4.3), an operator not generally available in traditional transport codes, making the implementation of the FW constraint with the NK method rather impractical. The FR constraint, on the other hand, does not introduce any difficult-to-compute quantities into the Jacobian. However, if the multiplication $J(u)\,v$ is considered, the FR constraint introduces significant expense into the NK calculation, since $P$ must be applied to $v_\phi$, which doubles the computational cost compared to the JFNK equivalent.

It is important to remember that the Jacobian would never need to be formed

Table 4.3: Derivatives of $\rho$ with respect to $\phi$

  $\rho$    $(\partial/\partial\phi)\,\rho(\phi, \lambda)$
  N         $\phi^T$
  FR        $-\lambda\,\dfrac{(E^T FP\phi)(E^T F) - (E^T F\phi)(E^T FP)}{(E^T FP\phi)^2}$
  FW        $-\lambda\,\dfrac{2\phi^T(\phi^T P\phi) - (\phi^T\phi)\,\phi^T(P + P^T)}{(\phi^T P\phi)^2}$

Table 4.4: Derivatives of $\rho$ with respect to $\lambda$

  $\rho$    $(\partial/\partial\lambda)\,\rho(\phi, \lambda)$
  N         $0$
  FR        $1 + \lambda\,\dfrac{(E^T F\phi)(E^T FP_\lambda\phi)}{(E^T FP\phi)^2} - \dfrac{E^T F\phi}{E^T FP\phi}$
  FW        $1 + \lambda\,\dfrac{(\phi^T\phi)(\phi^T P_\lambda\phi)}{(\phi^T P\phi)^2} - \dfrac{\phi^T\phi}{\phi^T P\phi}$

or stored; rather, it would only be necessary to know how the multiplication of the Jacobian by a vector is structured. This is the advantage of using a Newton-Krylov method rather than some other iterative technique in which component-wise knowledge of the Jacobian would need to be accessible. If the Jacobian for $P_F$, defined by Eq. (4.94), using the N constraint were formed, it would be written as

\[
J(u) = \begin{bmatrix} I - DL^{-1}M(S + \lambda F) & -DL^{-1}MF\phi \\ \phi^T & 0 \end{bmatrix} \tag{4.137}
\]

such that the corresponding Jacobian-vector multiplication is written

\[
J(u)\,v = \begin{bmatrix} v_\phi - DL^{-1}M\big[\, (S + \lambda F)\, v_\phi + v_\lambda F\phi \,\big] \\ \phi^T v_\phi \end{bmatrix}. \tag{4.138}
\]

Even though in an NK method none of the Jacobian matrices defined here would ever be constructed or stored, as the components themselves are never stored, implementing the Jacobian-vector multiply still adds a layer of complication beyond the JFNK approach. Due to these additional complications introduced by the NK formalism, the NK approach is quite unattractive as a means of solving the k-eigenvalue problem in transport theory; the preferred approach is the JFNK approximation, due to its ease of implementation. While performing the Jacobian-vector multiplies above is quite possible for the N and FR constraints using standard computer-code components, i.e., the $S_N$ sweep with scattering and fission sources, it is more convenient to piggyback off an existing k-eigenvalue solution algorithm, so that some sort of outer iteration can be treated as a function of the vector $v$ or $u$.

4.8 The α-eigenvalue Problem

Before concluding the transport theory discussion we again briefly turn our attention to the α-eigenvalue problem discussed earlier in the chapter. The α-eigenvalue problem can be solved in a manner similar to that presented for the k-eigenvalue problem. The idea behind this method began with the k-eigenvalue formulations above, and at the time of this writing Jim Warsa and Jeff Densmore [76] at Los Alamos National Laboratory are researching its feasibility. If we consider the α-eigenvalue problem from Eq. (4.4) written in operator notation, then

\[
A(\alpha)\,\phi = B\phi \tag{4.139}
\]

where $A(\alpha)$ is defined by

\[
A(\alpha) = I - DL(\alpha)^{-1}MS \tag{4.140}
\]

and $L(\alpha)$ by

\[
L(\alpha) = L + \alpha V \tag{4.141}
\]

where $V$ is a diagonal matrix containing the reciprocals of the group velocities along its diagonal. This problem is typically solved by adding another level of iteration to the k-eigenvalue problem discussed earlier [77]. That is, a search is conducted for the $\alpha$ that yields $k = 1$, such that at each step of the search a full k-eigenvalue calculation must be performed. This is a daunting computation, and the outermost level of iteration can be removed when the problem is written as a nonlinear function and solved using Newton's method. The most basic nonlinear formulation is given by

\[
\Gamma(u) = \begin{bmatrix} A(\alpha)\,\phi - B\phi \\ \rho(\alpha, \phi) \end{bmatrix}, \qquad u = \begin{bmatrix} \phi \\ \alpha \end{bmatrix}. \tag{4.142}
\]

If the α-eigenvalue problem is written in this manner, or in a possibly more efficient variation of it, the outermost level of iteration in the α-eigenvalue search is effectively removed, avoiding a substantial amount of computation. This approach to the α-eigenvalue problem is currently under investigation; it serves to further highlight the flexibility of Newton-Krylov methods and to demonstrate one of their many potential uses in neutronics calculations.

4.9 Summary of Newton Approaches

In this chapter neutron transport theory was presented and the behavior of neutrons in multiplying media was considered. The presence of multiplying media results in a formulation of the transport equation that only has solutions under very restrictive conditions. Traditionally the criticality problem is treated as an eigenvalue problem so that solutions can be found, with the eigenvalue providing some measure of how far from critical the system is. Two types of eigenvalues can be found for these systems depending on the problem formulation: the k-eigenvalue and the α-eigenvalue. We are concerned with the k-eigenvalue formulation, as it is overwhelmingly used in nuclear engineering applications. The k-eigenvalue formulation was subsequently discretized in space, angle, and energy, resulting in a set of discrete equations which can be solved numerically. This discretized set of equations was then expressed in an operator notation that allows the traditional iterative scheme to be presented in a compact manner. In general, this iterative scheme consists of three levels of iteration: an outer

iteration, which is a fixed-point iteration that updates the fission source and the $k$ estimate (the eigenpair); an intermediate Gauss-Seidel iteration over energy, in which the upscattering source is converged; and a level of inner iterations for each energy group, in which the self-scattering is converged. The inner iterations are traditionally performed using a fixed-point iteration, though recently the use of a Krylov method, particularly GMRES, as a solution technique has been gaining acceptance. The iterations over energy are quite expensive computationally, and often the upscattering source is commingled with the outer iteration such that the previous outer iterate is used to build both the fission source and the upscattering source. This is equivalent to performing only a single Gauss-Seidel iteration over the energy dependence. This formulation of the problem forms the basis for the solution algorithm used by most transport codes today. However, it is possible to remove the inner iterations as well, by using the previous outer iterate to form both the fission and total scattering sources. In this case there is only one iteration index, and the eigenpair, upscattering, and self-scattering are all converged simultaneously. While this results in a slowly convergent process, a single iteration is quite inexpensive, making it an attractive option for use with a Newton formulation. Variations of this method utilize the most recently updated information in the formulation of the scattering and/or fission source.

Two classes of constraint equations were presented as options to form the relationship which completes the nonlinear system: normalization conditions and eigenvalue update conditions. In particular, three constraint equations were considered in detail: the N constraint, the FR constraint, and the FW constraint. The N constraint forces the $L_2$ norm of the flux moments to unity; the FR constraint uses the traditional fission-rate eigenvalue update formula to accelerate convergence of the eigenvalue; and the FW constraint uses a flux-weighted update formula for the eigenvalue that is similar to the Rayleigh quotient.

The five iterative options along with the three constraints comprise the set of methods considered in this work, though it should be emphasized that there are truly no inherent limits on formulations of this method. The number of possible constraints is vast, and even the formulation of $P$ can be varied in any number of ways; in fact, it is not even necessary to pose the problem as a function of $P$. The Jacobian was constructed symbolically for each $P$ and constraint equation so that an NK formulation could be implemented if desired.

Table 4.5: Summary of Methods for Transport k-eigenvalue Problem

  Solution Method   Formulation (P)      Eq.       Constraint   Eq.
  Traditional       Power (P)            (4.92)    N            (4.122)
  NK                Fixed-Point (FP)     (4.93)    FR           (4.125)
  JFNK              Flat (F)             (4.94)    FW           (4.129)
                    Flat-D (FD)          (4.116)
                    Flat-DF (FDF)        (4.118)

The FW constraint requires transposes that are generally not available, while the FR constraint results in a significantly higher workload per Jacobian-vector multiply than the JFNK approach. For these reasons, and due to the simplicity of its implementation, the JFNK approach is the preferred means of applying Newton's method to the k-eigenvalue problem. Table 4.5 provides a concise description of the basis for all of the methods described in this chapter. The traditional solution method indicates a fixed-point formulation of the form

\[
\phi^{(l+1)} = P\phi^{(l)} \tag{4.143}
\]

where the eigenvalue can be updated using the FR and FW update formulas. The NK method refers to a Newton-Krylov method of the form in Eq. (4.134), where $P$ is given by one of the equations listed in the second column of Table 4.5; the Jacobian matrices for this approach can be constructed using Tables 4.1, 4.2, 4.3, and 4.4. The JFNK method refers to using a nonlinear function of the form found in Eq. (4.86), solved using Newton's method in conjunction with the JFNK approximation. The five formulations of $P$ are listed in the second column and given semi-descriptive acronyms so that they can be easily referenced, and the constraint equations developed are listed in the final column. Using this table, a method can be fully specified by choosing a solution method, a $P$ formulation, and a constraint equation, though N is not a valid choice for the traditional iterative techniques. The JFNK and NK approaches, when discussed, will be referenced via the format Solution Method - Formulation - Constraint, e.g., JFNK-F-FR for the JFNK solution method with $P_F$, given by Eq. (4.94), and constraint $\rho_{FR}$, given by Eq. (4.125).
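To show how little wiring the JFNK-F-FR designation implies, the sketch below assembles it end to end on a small synthetic surrogate in which a dense matrix H stands in for $DL^{-1}M$; scipy.optimize.newton_krylov (a standard SciPy routine) supplies the Newton-GMRES machinery. All matrices and sizes here are invented for illustration, and convergence to the fundamental mode is not guaranteed for arbitrary synthetic data.

    import numpy as np
    from scipy.optimize import newton_krylov

    n = 50
    rng = np.random.default_rng(0)
    H = np.linalg.inv(2.0 * np.eye(n) + 0.1 * rng.random((n, n)))  # ~ D L^{-1} M
    S = 0.5 * np.eye(n) + 0.01 * rng.random((n, n))                # scattering
    F = 0.2 * np.eye(n) + 0.01 * rng.random((n, n))                # fission

    def apply_P(phi, lam):
        # flattened operator P_F of Eq. (4.94): one "sweep" per evaluation
        return H @ ((S + lam * F) @ phi)

    def gamma(u):
        # JFNK-F-FR residual: Eq. (4.121) with the FR constraint, Eq. (4.125)
        phi, lam = u[:-1], u[-1]
        res = np.empty_like(u)
        res[:-1] = phi - apply_P(phi, lam)
        res[-1] = lam - lam * np.sum(F @ phi) / np.sum(F @ apply_P(phi, lam))
        return res

    u0 = np.concatenate([np.full(n, 1.0 / n), [1.0]])  # flat flux, k0 = 1
    u = newton_krylov(gamma, u0, method='gmres', f_tol=1.0e-10)
    print('k =', 1.0 / u[-1])

In a real implementation, apply_P would simply call the existing outer-iteration routine of an S_N code with the trial vector in place of the previous outer iterate.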

As mentioned previously, not all of these methods will be implemented numerically, particularly the NK methods, as the JFNK approximation provides easy and effective alternatives. The traditional iterative form will also only be used to explore formulations of $P$ which are similar to standard practice in the current generation of $S_N$ codes. The ultimate goal is to show that it is possible to use one or more of the Newton methods from Table 4.5 to solve k-eigenvalue problems for the fundamental eigenpair. Once successful, we seek to show that this approach can produce the correct result in fewer sweeps than traditional techniques require, while making it possible to rely on standard $S_N$ operations such that existing coding can be adapted. The next chapter will provide the numerical results generated for a variety of benchmark problems, allowing the behavior of the various methods to be examined and their overall effectiveness to be tested.

CHAPTER 5

Transport Theory Numerical Results

Results for a number of benchmark problems were generated using the newly developed Newton methods for the transport k-eigenvalue problem. The benchmark problems cover a wide range of reactor types and include models with and without upscattering. The use of these benchmarks is meant to provide meaningful evidence of the behavior of the Newton approach, and of the JFNK approximation in particular, in realistic three-dimensional transport problems. The performance of the Newton methods as a function of the different formulations, constraints, and numerical parameters will be tested so that ultimately the Newton approach can be contrasted with the traditional outer-inner (power/fixed-point iteration) solution scheme.

5.1 S_N Transport Code

The methods which can be constructed using the combinations of operators in Table 4.5 were implemented in a Fortran 90/95 code. The code was constructed by extending an existing steady-state, multigroup, S_N transport code written for three-dimensional Cartesian meshes with fixed sources and no fissioning [78]. The scattering treatment in the initial code was isotropic and this limitation is retained in the k-eigenvalue version of the code, though we have seen in the previous chapter

that the method can be written in general terms which account for anisotropic scattering. The spatial discretizations available are the Step Method, the Diamond Difference Method, and the AHOT-N method, though primarily DD and AHOT-N1 will be used in this chapter's numerical experiments. The adaptation of the existing code lends credence to the idea that these methods are not too difficult to implement. It is likely that operators used in the method description are already logically separated into functions and subroutines in the target code. For instance, the $DL^{-1}$ operator represents an $S_N$ sweep on some source vector followed by a summation of the discrete fluxes into moments of the resulting angular flux. This operation in many codes may be completely handled by some subroutine responsible for performing the transport sweep; in this case the spatial discretization used is immaterial, as the sweep operator is used as-is. The creation of sources (scattering and fission) is also easily implemented via function calls. In this manner the Newton methods can be built by stringing together these existing capabilities within the target code. The JFNK methods can be implemented in an even easier fashion, since in the end they only require a functional call to an existing outer iteration which is capable of accepting $v$ or $u$ as input in place of the previous outer iterate.

5.1.1 Convergence Criteria

There are multiple sets of convergence criteria used in the traditional solution of the k-eigenvalue problem, and even more are necessary when Newton's method is included. In the traditional scheme the outer iterations, upscattering iterations, and inner iterations (using source iteration or GMRES) must all be terminated based on some criteria. In the Newton scheme it is necessary to choose a convergence criterion for the GMRES iterations used to solve each linear Newton step, and a stopping point for the Newton iterations themselves must be defined. If there are any nested iterations required during the evaluation of $\Gamma$ then the associated criteria must also be decided. Furthermore, some solution schemes prosper from the combination of multiple criteria for the same quantity, depending on where in the calculation the iteration is taking place. The convergence criteria used in the traditional solution scheme are given first.

Outer Iterations

The convergence of the outer iterations is generally measured by placing a criterion on the eigenvalue convergence and another on the fission source convergence, such as

\[
\epsilon_\lambda = \big| \lambda^{(l+1)} - \lambda^{(l)} \big| < 10^{-5}, \tag{5.1}
\]

\[
\epsilon_F = \max_i \left| \frac{\big( FP\phi^{(l)} - F\phi^{(l)} \big)_i}{\big( F\phi^{(l)} \big)_i} \right| < 10^{-4}, \tag{5.2}
\]

where $l$ is the outer iteration index and $i$ represents the index over all unknowns (spatial, energy, and angular). Often the convergence of the group scalar flux is also required, though in this case only the fission source convergence is considered. The fission source error test used is actually equivalent to finding the maximum point-wise error in the spatial distribution of the fission source. This can be seen by considering that

\[
F\phi = \begin{bmatrix} \chi_1 f \\ \vdots \\ \chi_G f \end{bmatrix} \tag{5.3}
\]

where $f$ represents the spatial fission distribution. Thus it can be seen that Eq. (5.2) is equivalent to

\[
\max_i \left| \frac{\big( FP\phi^{(l)} - F\phi^{(l)} \big)_i}{\big( F\phi^{(l)} \big)_i} \right| = \max_j \left| \frac{f_j^{(l+1)} - f_j^{(l)}}{f_j^{(l)}} \right| \tag{5.4}
\]

where $j$ is an index over all spatial cells. The default tolerances for the error measures $\epsilon_\lambda$ and $\epsilon_F$ are $10^{-5}$ and $10^{-4}$ respectively, as indicated. These values are rather similar to the defaults used in PARTISN, which converges both quantities to a common tolerance. The eigenvalue criterion is set to a smaller value so that the results can be properly compared to the benchmark results of the employed test problems, though in practice the fission source converges much more slowly than the eigenvalue, such that the condition in Eq. (5.1) is almost always satisfied before the condition in Eq. (5.2). Notice that these stopping criteria can be applied to any $P$ formulation presented or any $\lambda$ update formula used. Any time non-default tolerances for $\epsilon_\lambda$ or $\epsilon_F$ are used it will be clearly stated, but mostly these defaults are used throughout, since they represent realistic stopping criteria.
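In code, the two outer-iteration tests reduce to a few lines; this sketch assumes the fission source is carried as a per-cell spatial array f, so that the point-wise test of Eq. (5.4) applies directly:

    import numpy as np

    def outers_converged(lam_new, lam_old, f_new, f_old,
                         tol_lam=1.0e-5, tol_f=1.0e-4):
        # Eq. (5.1): eigenvalue change; Eqs. (5.2)/(5.4): maximum point-wise
        # relative change of the spatial fission distribution
        eps_lam = abs(lam_new - lam_old)
        eps_f = np.max(np.abs(f_new - f_old) / np.abs(f_old))
        return eps_lam < tol_lam and eps_f < tol_f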

Upscattering Iterations

Though upscattering iterations are not performed in most runs, a default scheme for converging them exists. If the Gauss-Seidel iteration index for the iteration over energy is $m$ and the outer index is $l$, then

\[
\epsilon_{Up} = \max_j \left| \frac{\phi_j^{(m+1,l)} - \phi_j^{(m,l)}}{\phi_j^{(m,l)}} \right| < \min\!\left( 10^{-2},\; 10^{-2} \max_i \left| \frac{\big( FP\phi^{(l-1)} - F\phi^{(l-1)} \big)_i}{\big( F\phi^{(l-1)} \big)_i} \right| \right). \tag{5.5}
\]

Here the index $j$ represents the index over all spatial cells. The tolerance is determined by selecting the minimum of $10^{-2}$ and $10^{-2}$ times the point-wise fission source error from the previous outer iteration, thus limiting the loosest tolerance to $10^{-2}$. This choice of stopping criterion seeks to tighten the convergence of the upscattering iterations as the outer iterations converge, in order to avoid oversolving the upscattering source in early outer iterations. This is an option rarely exercised due to its enormous expense. If upscattering iterations were to be routinely performed, it would be sensible to at least include a relaxation step in the Gauss-Seidel iterative scheme to accelerate convergence of the upscattering source.

Inner Iterations

The convergence tolerance placed on the inner iterations is extremely important, since the majority of the work done in the traditional outer-inner iteration solution scheme arises from the $S_N$ sweeps associated with inner iterations. We have seen that there are two options for performing these iterations: a fixed-point iteration, known as source iteration, and reformulating the problem as a linear system so that a Krylov subspace method, likely GMRES, can be used. The criterion used to determine convergence differs between SI and GMRES, and both will be fully explained. In practice inner iterations are almost always done using accelerated SI, which is not an available option in the $S_N$ code used to generate this chapter's numerical results. This makes it difficult to use any production-level codes as models for choosing the best iterative strategy, i.e., for choosing optimal convergence criteria.

Source iteration, with iteration index $n$, is converged using the following criterion:

\[
\epsilon_{si,in} = \max_i \left| \frac{\phi_i^{(n+1)} - \phi_i^{(n)}}{\phi_i^{(n)}} \right| < \min\!\left( 10^{-4},\; 10^{-2} \max_i \left| \frac{\big( FP\phi^{(l-1)} - F\phi^{(l-1)} \big)_i}{\big( F\phi^{(l-1)} \big)_i} \right| \right). \tag{5.6}
\]

As in the case of upscattering, this criterion allows the convergence of the inner iterations to tighten as the outer iterations converge towards the eigenpair. This is meant to reduce the cost of performing inner iterations in early outer iterations, where the fission source is still quite inaccurate. If instead GMRES is used to perform the inners, convergence is determined by

\[
\epsilon_{gmres,in} = \frac{\big\| DL^{-1}Q - \big( I - DL^{-1}MS_{gg} \big)\, \phi^{(n)} \big\|}{\big\| DL^{-1}Q \big\|} < \min\!\left( 10^{-2},\; 10^{-2} \max_i \left| \frac{\big( FP\phi^{(l-1)} - F\phi^{(l-1)} \big)_i}{\big( F\phi^{(l-1)} \big)_i} \right| \right) \tag{5.7}
\]

where $Q$ is defined by Eq. (4.62). The default Krylov subspace size is 25, meaning that a restart is performed every 25 iterations until the criterion is satisfied or the maximum number of iterations is reached. These are the default convergence schemes for the inner iterations, depending on the type of iteration used, though it is often the case that the defaults are not used and instead a fixed number of iterations is applied.

Newton Iterations

Choosing the stopping criteria for the Newton iterations is an important issue, since each Newton iteration is expensive compared to a single outer iteration. The most logical choice is to monitor some norm of the Newton residual, $\Gamma(u)$, and/or some norm of the Newton step, $\delta u$, and cease iterating when the norm(s) fall below some predetermined value. The two downsides are that this provides no meaningful way to compare with traditional solution methods, which monitor the eigenvalue convergence and the point-wise relative error in the fission source, and that the residual norm does not provide any information about the error in the eigenvalue versus the error in the eigenvector. To overcome these problems, the Newton iterations in the code are converged

using criteria based on the traditional outer iteration. This allows for a direct comparison with traditional solution methods and provides commonly used error measurements. Still, this is expensive, since it requires an additional application of $P$ at each Newton iteration; in a practical application a convergence criterion based on the residual vector is more convenient. However, for the sake of comparison the criteria used are

\[
\epsilon_\lambda = \big| \rho_{FR}\big(\lambda^{(l)}, \phi^{(l)}\big) \big| < 10^{-5} \tag{5.8}
\]

\[
\epsilon_F = \max_i \left| \frac{\big( FP\phi^{(l)} - F\phi^{(l)} \big)_i}{\big( F\phi^{(l)} \big)_i} \right| < 10^{-4} \tag{5.9}
\]

where $l$ refers to the index of the Newton iteration. In the evaluations above the form of $P$ used is given by Eq. (4.93), and the inners are converged using the tolerance given in Eq. (5.7), since GMRES is used to perform inner iterations during the error calculation. These criteria are identical to those used for the outer iterations, which ensures the comparison between Newton's approach and the traditional scheme will be fair. The sweeps and execution time associated with the calculation of these error values are not included in the total costs of the Newton approach, since these criteria are used only for the purpose of comparison. If the residual norm is used to determine convergence instead, the cost of determining errors is negligible, since the residual norm is a simple and inexpensive quantity to calculate.

Due to the relatively high cost of a single Newton iteration it may also be desirable to define convergence windows instead of hard limits, so that if some error measure is within a defined percentage of the desired tolerance it is allowed to pass the test. For example, if in Newton iteration 4 the maximum point-wise fission source error sits just above the $10^{-4}$ tolerance, then the default criterion is not met; however, due to the convergence rate of Newton's method it is likely that after Newton iteration 5 the error will be an order of magnitude or more lower, and a significant number of sweeps will have been accrued in that single Newton iteration. However, this type of scheme was not implemented, in order to preserve the equivalence between the Newton and traditional stopping criteria.

Inner GMRES Iterations

The convergence of the GMRES iterations within each Newton step has previously been discussed. The tolerance in this iterative process is given by the parameter $\eta$, called the forcing factor, which has been discussed in the context of the diffusion calculation. The same discussion holds for the transport equation, as this criterion has less to do with the specific nonlinear function used than with Newton's method itself. Still, for the sake of completeness, the inner GMRES iterations are terminated when

\[
\frac{\big\| \Gamma(u_m) + J(u_m)\,\delta u_m \big\|}{\big\| \Gamma(u_m) \big\|} < \eta_m \tag{5.10}
\]

where the Newton iteration index is denoted by $m$. The value of $\eta$ can be a constant or can be varied from iteration to iteration via one of the algorithms previously discussed, such as Eisenstat-A or An. The impact of $\eta$ on the Newton calculation using transport theory will be explored later in this chapter.

In the flattened formulations of $P$, Eqs. (4.94), (4.116), and (4.118), there are no inner iterations, so the stopping criteria for all of the iterative processes have been fully defined. However, for $P$ given by Eq. (4.92) or Eq. (4.93) there are associated inner iterations, and in the case of Eq. (4.92) there are also upscattering iterations. The upscattering iterations are converged in the same manner as in the fixed-point implementation, described above by Eq. (5.5). The inner iterations are generally not converged using the default schemes defined above, but instead use a fixed tolerance, which seems to produce a more stable process. The details of these calculations will be discussed further when numerical results are presented.
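As one concrete example of a variable forcing strategy, the sketch below implements the second Eisenstat-Walker choice in its commonly published form; the parameter values shown (γ = 0.9, α = 2) are the ones usually quoted in the literature [28], not necessarily those used in this work.

    def ew_forcing(res_new, res_old, eta_old,
                   gamma=0.9, alpha=2.0, eta_max=0.9, eta_min=1.0e-4):
        # Eisenstat-Walker "choice 2": eta_m = gamma*(||G_m||/||G_{m-1}||)^alpha,
        # safeguarded so eta cannot collapse while the previous eta was large.
        eta = gamma * (res_new / res_old) ** alpha
        safeguard = gamma * eta_old ** alpha
        if safeguard > 0.1:
            eta = max(eta, safeguard)
        return min(max(eta, eta_min), eta_max)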

Performance Measures

Ultimately, what is desired from the code is some way to fairly compare the performance and costs of the various methods implemented. In the diffusion theory results execution time was the best option, though there were many inadequacies associated with that approach: execution time depends on the specific hardware used, runtime conditions on the system at the time of execution, and a number of other factors over which users have no control. Timing was nevertheless necessary in diffusion theory because a matrix-vector multiply was simply too inexpensive to classify as the major unit of work, i.e., building sources did not require any matrix-vector multiplies but was comparably expensive. Fortunately, in transport theory there is a dominant unit of computational work which is common to all of the methods and can be easily counted: the $S_N$ sweep. This can be confirmed by calculating the time required per sweep, from the measured total execution time of all mesh sweeps for a given test problem, using multiple methods: the values will be quite close regardless of the method used. When comparing one Newton formulation to another, the number of sweeps required to achieve a specific level of convergence is again the bottom line in judging a method's performance, but other quantities can give an indication of why one formulation performs better than another. By looking at the number of Krylov iterations per Newton iteration it may be clear that one formulation results in a better conditioned system than another. It may also be seen, by examining the total number of Newton iterations, that one constraint adversely affects the convergence rate of Newton's method compared to another.

Algorithmic Parameters

The algorithmic parameters of interest in transport theory are much the same as those in diffusion theory: the forcing factor, the finite-difference perturbation for the JFNK approximation, the use of backtracking, and the effect of the initial guess. One additional parameter present in transport theory is the treatment of inner iterations: one must choose between GMRES and SI, and an initial guess must be chosen for these iterations as well. These parameters are briefly reviewed below.

Forcing Factor (η)

The Newton forcing factor determines how tightly the GMRES iteration is converged at each Newton step. The values considered are increasingly restrictive constants of $10^{-1}$, $10^{-2}$, and smaller. The two algorithms developed by Eisenstat and Walker [28] are also tested, along with those of Dembo et al. [24] and An et al. [29].

Perturbation Parameter (ε)

Various empirical formulas were used to calculate the perturbation parameter used at each Newton step, and the algorithm of Xu and Downar [34] was also applied. The empirical formulas can be found in Eq. (3.2), where $\epsilon_1$ is not included due to its similarity with $\epsilon_0$.

Backtracking

Backtracking was implemented in the transport solution scheme when Newton's method is being used. The option was given to perform backtracking only after the norm of the Newton residual, $\|\Gamma(u)\|$, falls below some user-chosen value.

Newton Initial Guess

The Newton initial guess is formed by choosing $u^{(0)} = [\phi^{(0)T}\ k^{(0)}]^T$. Generally the initial guess for the eigenvalue is set to a physically reasonable value of order unity. The initial guess for the flux moments is one of three choices: a flat flux, a fission-source-based flux, or the result of some specified number of fixed-point iterations initialized using one of the previous two guesses (a sketch of these initializations follows at the end of this list). The flat-flux and fission-based initial guesses are given by

\[
\phi^{(0)} = \frac{E}{E^T E} \tag{5.11}
\]

and

\[
\phi^{(0)} = \frac{BE}{E^T E} \tag{5.12}
\]

respectively, where again $E^T = [1, \ldots, 1]$. The convergence of initial fixed-point iterations has previously been discussed, though it is important to note that those iterations are initialized using either of the previous equations and some user-chosen value of $k^{(0)}$.

Inner Iterations

The convergence of inner iterations using SI and GMRES has already been discussed, but the choice between them is still an option that must be considered for the formulations of $P$ in Eqs. (4.92) and (4.93). Recall that in all of the flattened formulations it has been implicitly assumed that one SI is done per group, using the same $u$ or $v$ to calculate all fission and scattering sources. Aside from choosing which inner formulation is used, it is necessary to choose an initial guess. The most sensible choice is to use some previously known value, $\phi^{(l-1)}$ or $v_\phi$. However, if the iterations using $A$ or $A_L$ are truly converged then any initial guess is acceptable; at times a zero initial guess will be used to show that this is true.
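A compact sketch of the three flux initializations noted in the Newton Initial Guess item above; B stands for the fission operator of Eq. (5.12), and fixed_point is a hypothetical single outer iteration returning an updated pair (phi, k):

    import numpy as np

    def newton_initial_guess(n, k0=1.0, B=None, fixed_point=None, n_fp=0):
        e = np.ones(n)
        phi = (e if B is None else B @ e) / (e @ e)  # Eq. (5.11) or Eq. (5.12)
        for _ in range(n_fp):                        # optional warm-up outers
            phi, k0 = fixed_point(phi, k0)
        return np.concatenate([phi, [k0]])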

5.2 Benchmark Problem Suite

As in diffusion theory, a number of previously developed benchmark problems were used to test the efficiency of the newly developed methods. These problems were intended to test a wide range of reactor configurations and parameters. However, the goal of modeling these benchmark problems was not to prove the capability of the $S_N$ code to accurately reproduce the benchmark results. That is, the ultimate goal was to compare the performance of the various Newton and fixed-point formulations to one another, not to tweak the input until the solution agrees with the benchmark reference within some accuracy. Still, it is desirable that the inputs created produce reasonably accurate results, which we successfully verify in all cases tested. The set of problems and their variations covers a wide range of scenarios, all in three-dimensional geometry. Axially homogeneous and axially heterogeneous cases are considered, and the number of energy groups ranges from 2 to 7, both with and without upscattering. Models with large absorption gradients due to control rods are considered, as are problems with no control rod positions. The geometrical complexity varies from a small research reactor with homogenized pin-cells to a four-assembly mixed-oxide fuel problem without any spatial homogenization of pin-cells. The specific problems that are modeled are: Takeda-1, a small light-water reactor (LWR); Takeda-2, a small fast breeder reactor (FBR); and Takeda-3, an axially heterogeneous FBR problem. The C5G7-MOX benchmark is also considered, which models a set of four assemblies containing mixed-oxide fuel.

Takeda Benchmarks

The specification of the set of Takeda Benchmark problems and participant results can be found in [79] and [80]. Brief descriptions of the individual problems in the set and any associated variations follow. Overall, seven unique geometrical configurations were used from the Takeda benchmarks: two configurations of Takeda-1, two configurations of Takeda-2, and three configurations of Takeda-3. Only Takeda-4 was excluded, because it is based on hexagonal geometry.

Takeda-1

The simplest and smallest of the benchmark problems is the Takeda-1 Benchmark, comprising cases 1 and 2: rod out and rod in, respectively. The Takeda-1 problem is a model of the Kyoto University Critical Assembly (KUCA) and its geometric configurations are depicted in Figure B.1 of Appendix B. The problem is meshed using a uniform Cartesian grid with $\Delta = 1$ cm. The computational model represents a 1/8-core, with the x-y plane measuring 25 cm x 25 cm, reflective boundary conditions on the two inside edges, and vacuum boundary conditions on the two outside edges. The core also measures 25 cm in the axial direction, with a reflective boundary condition on the inside and a vacuum boundary condition at the core edge. In this model the rod is either fully inserted or the rod position is void. The S4 and S8 quadrature sets used for this problem were distributed with the benchmark specification. The nuclear data for this problem consist of two energy groups, without upscattering.

A variation of Takeda-1 which contains upscattering is considered, so that a computationally inexpensive problem with upscattering is available for numerical experimentation. The geometry and cross sections for Takeda-1, Case 1 are used as the foundation for the upscattering variation. The only difference is that for each material the scattering from group 2 to group 1 is given by

\[
\sigma_{12} = \frac{\sigma_2 - \sigma_{22}}{2} \tag{5.13}
\]

such that the upscattering cross section is half of the difference between the total cross section and the self-scattering cross section for group 2. To generate a reference solution for this case, which is not included in the original benchmark set, the problem was solved using a traditional fixed-point iteration of the form of Eq. (4.75), with the reference eigenvalue given in Table 5.1.
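The modification of Eq. (5.13) amounts to a one-line change per material. A sketch operating on a numpy scattering matrix sigma_s indexed as sigma_s[g, gp] (transfer from group gp to group g, 0-indexed) with group total cross sections sigma_t:

    def add_takeda1_upscattering(sigma_s, sigma_t):
        # Eq. (5.13): set the group 2 -> group 1 transfer to half the gap
        # between the group-2 total and self-scattering cross sections
        out = sigma_s.copy()
        out[0, 1] = 0.5 * (sigma_t[1] - sigma_s[1, 1])
        return out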

Takeda-2

The Takeda-2 benchmark is a model of a small fast breeder reactor. There are two rod configurations available for this problem: Case 1, rod completely withdrawn and the rod position filled with sodium, and Case 2, where the rod is half-inserted and the rest of the rod tube is filled with sodium. The problem is meshed using a uniform Cartesian grid with $\Delta = 5$ cm. The core is modeled computationally via a 1/4-core representation in which both axial boundary conditions are vacuum. The length of the axial dimension is 150 cm, while the x-y dimensions are 70 cm x 70 cm with reflective boundary conditions on the interior faces and vacuum boundary conditions on the outside faces. Again, S4 and S8 quadrature sets are available from the benchmark specification. The Takeda-2 benchmark utilizes four energy groups with no upscattering. The geometrical configuration of the problem is given by Figure B.2 in Appendix B.

Takeda-3

The Takeda-3 benchmark is also a fast breeder reactor, using the same cross-section set as Takeda-2. Three configurations of this problem are specified: control rods inserted, control rods withdrawn, and no control rod positions. In the no-rods configuration the rod positions are replaced by either core or blanket cells. Again a uniform Cartesian mesh is used, with $\Delta = 5$ cm. The dimensions of the core in the x-y plane are 160 cm x 160 cm, with reflective boundary conditions on the interior faces and vacuum boundary conditions at the edges. The axial dimension has a length of 90 cm, with a reflective boundary condition on the inner surface and a vacuum boundary condition at the system's edge. The problem contains core material, control rods, blankets, and reflectors, making it the most detailed of the Takeda problems. It is also the largest of the Takeda problems, having more than double the number of computational cells of either Takeda-1 or Takeda-2. A diagram of the rodded configuration can be found in Figure B.3 and the unrodded configuration is pictured in Figure B.4. Again the quadrature set details can be found in the benchmark specification.

C5G7-MOX Benchmark

The C5G7-MOX problem was originally proposed as a 2-D benchmark [81] and extended to 3-D [82]. The full specification of the 3-D problem is found in [83], while a special issue of Progress in Nuclear Energy contains more detailed descriptions of the results reported by the benchmark participants [84]. All three configurations of the 3-D C5G7 Benchmark were modeled: Unrodded, Rodded A, and Rodded B.

Table 5.1: Benchmark Suite: Reference Eigenvalues

  Benchmark                 Case        k_ref    Uncertainty
  Takeda-1 (a)              1
  Takeda-1                  2
  Takeda-2                  1
  Takeda-2                  2
  Takeda-3                  1
  Takeda-3                  2
  Takeda-3                  3
  C5G7-MOX (b)              Unrodded             +/-0.003 (c)
  C5G7-MOX                  Rodded A             +/-0.003
  C5G7-MOX                  Rodded B             +/-0.003
  Takeda-1.1 U (d)                               (f)
  C5G7-MOX Unrodded-D (e)                        (f)

  (a) Takeda reference eigenvalues computed with reference meshes, S8
  (b) C5G7-MOX has a Monte Carlo reference solution
  (c) One standard deviation
  (d) Takeda-1.1 upscattering variation
  (e) C5G7-MOX Unrodded, no-upscattering variation
  (f) Computed with traditional fixed-point iteration

A general diagram of the 3-D C5G7-MOX problem is given in Figure B.5, which shows the axial dimensions of the problem and the layout of the minicore, comprised of four assemblies: two UO2 and two MOX. The total length of the model in the axial dimension is 64.26 cm and is subdivided into four regions: one axial reflector region and three core regions. The axial dimension has a vacuum boundary condition at the outer edge and a reflective boundary condition on the interior. A detailed view of the x-y plane is given in Figure B.6. The pitch of each pin-cell is 1.26 cm and each pin has a radius of 0.54 cm, while each assembly is a 17 x 17 grid of pin-cells. The x-y plane can be divided into a 3 x 3 grid of assembly-sized squares, where the 2 x 2 grid of fuel assemblies occupies a corner and the remaining squares are filled with moderator. The total dimensions of the plane are then 64.26 cm x 64.26 cm. The cylindrical shape of the fuel pin poses a meshing problem due to the Cartesian geometry used in the transport code.

Figure 5.1: Discretization of C5G7-MOX Problem in the x-y Plane

Since solution accuracy is not the primary goal, the mesh used is similar to the coarsest mesh used by Klingensmith et al. [85] in the TORT set of benchmark results. Specifically, this means that the axial direction in each region was discretized uniformly, using five axial cells in the reflector region and three in each of the core regions, resulting in 14 total axial cells. A pin-cell was discretized into a 3 x 3 non-uniform grid in which the pin is approximated as a square that conserves the volume of the true pin shape. This results in a square pin with sides of length 0.9571 cm. The four corners of the discretized pin-cell are squares with sides of 0.1514 cm, and the remaining four blocks are rectangles of dimension 0.9571 cm x 0.1514 cm. The moderator blocks at the assembly periphery are discretized in the same manner, with the square pin replaced by moderator, resulting in a mesh composed of 153 x 153 x 14 spatial cells. The quadrature set used is a level-symmetric S6 set which was derived from the TWOTRAN transport code. Again, this corresponds roughly to the coarsest of the cases used in [85].

The energy dependence in the C5G7-MOX Benchmark is discretized into seven groups. Unlike the Takeda problems, the original specification of the C5G7-MOX problem contains upscattering, and this is one of the primary motivations for modeling this problem. A variation of the Unrodded case was also considered in which there is no upscattering. This was done by simply setting all of the upscattering cross sections to zero, such that $\sigma_{gg'} = 0$ for $g' > g$. A reference solution to this problem was obtained using traditional fixed-point iteration, and the reference eigenvalue can be found in Table 5.1.
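The volume-preserving square pin and the resulting non-uniform 3 x 3 pin-cell grid follow directly from the pitch and pin radius; a short sketch of the arithmetic:

    import math

    pitch, radius = 1.26, 0.54                  # cm, from the C5G7 specification
    side = radius * math.sqrt(math.pi)          # square pin conserving pi*r^2
    strip = 0.5 * (pitch - side)                # width of the peripheral strips
    print(f"square pin side: {side:.4f} cm")    # ~0.9571
    print(f"corner squares : {strip:.4f} cm")   # ~0.1514
    print(f"edge rectangles: {side:.4f} x {strip:.4f} cm")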

Summary of Benchmarks

Table 5.2 lists the number of degrees of freedom in the benchmark problem suite for each problem with a given angular, spatial, and energy discretization.

Table 5.2: Degrees of Freedom for Transport Benchmark Problems

  Benchmark   No. of Groups   S_N   No. of Angles   Spatial Cells   Unknowns per Cell (a)   Total Unknowns
  Takeda-1    2               S4    24              15,625          1                       7.5x10^5
  Takeda-1    2               S4    24              15,625          8                       6.0x10^6
  Takeda-1    2               S8    80              15,625          1                       2.5x10^6
  Takeda-1    2               S8    80              15,625          8                       2.0x10^7
  Takeda-2    4               S8    80              5,880           1                       1.9x10^6
  Takeda-2    4               S8    80              5,880           8                       1.5x10^7
  Takeda-3    4               S8    80              18,432          1                       5.9x10^6
  Takeda-3    4               S8    80              18,432          8                       4.7x10^7
  C5G7-MOX    7               S6    48              327,726         1                       1.1x10^8
  C5G7-MOX    7               S6    48              327,726         8                       8.8x10^8

  (a) 1 corresponds to DD/AHOT-N0 and 8 to AHOT-N1

The number of unknowns per cell is either 1 or 8, corresponding to DD and AHOT-N0 or to AHOT-N1, respectively. The number of angles in the problem is given by $N(N+2)$, where $N$ is the order of the $S_N$ approximation. The total number of unknowns refers to the size of the transport problem, though generally discrete angular fluxes are not stored, making memory requirements smaller. The size of the Newton problem for each configuration is then given by the total number of unknowns divided by the number of angles, plus one to account for the constraint equation. If anisotropic scattering were included, the total number of unknowns in each case would increase by a factor of $N_m$, as discussed earlier, where $N_m = (N+1)^2$ and $N$ is the order of the $P_N$ expansion of the anisotropic scattering; in this case the size of the Newton problem would also increase by the same factor. It can be seen that the largest problem among the collection of test cases is the C5G7-MOX problem using AHOT-N1. The use of AHOT-N1 increases the number of unknowns by a factor of eight compared to DD and AHOT-N0, making it relatively expensive to use, in the hope of achieving higher solution accuracy. Numerical results will be

generated only once for each benchmark using AHOT-N1, to show that the DD results are comparable and that DD can be used to cut down on computational cost.

Though the focus is not on the accuracy of the solutions to the benchmark suite, it is comforting to know that the models have some basis in reality. The eigenvalues obtained using the convergence criteria discussed in Section 5.1.1 will be compared to eigenvalues published along with the benchmark results. In the case of the Takeda problems the calculated eigenvalues can be compared to reference $S_N$ eigenvalues which were computed on the same spatial mesh and with a similar quadrature. The C5G7-MOX benchmark eigenvalues are compared to the Monte Carlo-generated eigenvalues provided in the benchmark specification. The accuracy of the results produced in this chapter with respect to these reference eigenvalues will be discussed in a later section.

5.3 JFNK/NK Parametric Studies

Before closely examining the performance of any of the Newton methods, it is beneficial to examine the effect of some of the adjustable knobs that could potentially impact the behavior of Newton's method. These knobs are parameters that have been mentioned multiple times and have been explicitly studied in a similar manner in diffusion theory. The four main parameters considered are the forcing factor $\eta$, the finite-difference perturbation $\epsilon$, the Newton initial guess, and the number of GMRES iterations permitted. Due to the large number of combinations of parameters and methods available, it is unrealistic to comprehensively vary each parameter separately for every method and every problem. Instead, a small selection of methods and problems will be used to perform numerical experiments, and the results will be taken as representative of the broader behavior. The experiments can generally be broken up into a set using the Takeda-1 benchmark and another using the C5G7-MOX problems. Both configurations of the Takeda-1 problem were run using the flattened set of methods: JFNK-F-N, JFNK-F-FW, JFNK-F-FR, and NK-F-FR. The C5G7-MOX problem was run using two different methods, the JFNK-F-FR and JFNK-FDF-FR formulations. In most cases just the Unrodded configuration is run, but all three configurations

are used to explore the effect of the various forcing-factor choices on performance trends. To study the effect of the number of GMRES iterations the Takeda-3 benchmark (all configurations) is used. Using these combinations it is possible to get a feel for which values of the Newton parameters are more robust and which values lead to poor results.

Unless otherwise specified, each problem was run with the following settings: the default convergence of the outer iterations as specified previously, a constant η of 10^-2, ε calculated using the ε_2 formulation, and no backtracking. For the Takeda problems an initial eigenvalue guess of unity and a flat flux are used to initialize Newton's method, while in the C5G7-MOX problem k^(0) = 2 and the initial flux is given by Eq. (5.12). The purpose of using different initial guesses for the two problems is that not all initial guesses result in convergence of the k-eigenvalue problem, as will be seen in the discussion in Section 5.3.3. The maximum number of Newton iterations allowed was 15, and the GMRES iterations per Newton step used a subspace size of 25 with no restarts. The DD spatial discretization was used in all cases, with S_8 in the Takeda-1 problem and S_6 in the C5G7-MOX problem.

5.3.1 Perturbation Parameter

The first parameter we consider is ε, the finite-difference perturbation of the JFNK class of methods. Specifically, we consider the ε formulas defined by Eq. (3.2), with the exception of ε_1, because given the scaling of the Krylov vectors that formula is identical to ε_0. The results of this set of experiments are reported in Table 5.3. Very clearly, the total sweep count for each method and a given problem is completely insensitive to the perturbation parameter ε: for a given method, the number of sweeps necessary to achieve convergence is identical regardless of the ε value used. The insensitivity is much more pronounced here than in the case of diffusion theory, where the choice of ε had small but unpredictable effects on the computational load. This result points to the robustness of the JFNK approximation and of the transport k-eigenvalue problem.
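For reference, the heart of the JFNK approximation being exercised here is the first-order finite-difference Jacobian-vector product. The following is a minimal sketch, assuming a generic residual function Γ and one representative perturbation formula that stands in for the ε_0-ε_4 family of Eq. (3.2) (which is not reproduced here):

    import numpy as np

    def jacobian_vector_product(gamma, u, gamma_u, v):
        """Approximate J(u)*v by a forward finite difference of the residual."""
        norm_v = np.linalg.norm(v)
        if norm_v == 0.0:
            return np.zeros_like(v)
        # One common perturbation choice; the eps_2 formula used in this work
        # differs in detail but plays the same role.
        eps = np.sqrt(np.finfo(float).eps) * (1.0 + np.linalg.norm(u)) / norm_v
        return (gamma(u + eps * v) - gamma_u) / eps

Because the eigenpair is converged to tolerances far looser than ε, the first-order error in this difference is immaterial, consistent with the insensitivity observed in Table 5.3.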

Table 5.3: Effect of ε on Sweep Counts
[Columns: Newton formulation; sweep counts for the perturbation formulas ε_0, ε_2, ε_3, and ε_4. Rows: JFNK-F-N, NK-F-N, JFNK-F-FW, and JFNK-F-FR for Takeda-1 Cases 1 and 2, and JFNK-F-FR and JFNK-FDF-FR for C5G7-MOX-Unrodded. The tabulated counts are identical across the ε columns for each row.]

A number of other interesting observations can be made from this table regarding the various Newton formulations. First, we can see that JFNK-F-N and NK-F-N require the same amount of work. Since the only difference between the two approaches is the finite-difference approximation of the Jacobian-vector product in JFNK, this offers evidence that there is no performance penalty for using the JFNK approximation. This is a desired result, since the JFNK collection of methods lends itself to easier implementation than its NK counterparts. If this result holds true in a larger set of cases, it would confirm that the JFNK algorithms should comprise the primary formulations considered when choosing among the Newton methods. It can also be seen in the Takeda results that for a selected Newton formulation there is very little, if any, difference between the sweep counts in Case 1 and Case 2, which shows that the presence of the control rods has practically no impact on the convergence rate of Newton's method. In the case of the C5G7-MOX problem a large discrepancy can be seen between the two sets of entries. However, the difference between these cases is not the geometry but the Newton formulation: JFNK-F-FR uses the FR constraint relation and the flat fixed-point implementation, while JFNK-FDF-FR also uses the FR constraint and the flat fixed-point, but in the fixed-point application the fission and upscattering sources are updated on a group-wise basis, as discussed previously. This process of using updated sources has

little cost associated with it since the information is already calculated. This small change in the fixed-point iteration results in a reduction in the sweep count of slightly more than 10%. The differences caused by the constraint equation can also be seen in the Takeda results; however, the effect of the Newton formulation and the constraint equation will be examined more deeply later in the chapter.

5.3.2 Inexact Newton Forcing Factor

The choice of the forcing factor, η, has a direct impact on the expense of any inexact Newton approach because it determines how tightly the linear system is converged at each Newton step, and thus the amount of work done per step. The various strategies used to choose η have been discussed a number of times. The same set of strategies is used to test the impact of η on the transport k-eigenvalue problem using Newton's method, although the choice of the previous iteration's maximum point-wise fission error as η is not considered. For the Takeda-1 problem all of the η choices are examined using both control rod configurations. The Newton methods used are all based on the flattened fixed-point iteration P_F, given by Eq. (4.94), with all constraints being tested and both JFNK and NK methods included. The results for Takeda-1, Case 1 are given in Figure 5.2, while those for Case 2 are given in Figure 5.3. Again it can be seen in both cases that JFNK-F-N and NK-F-N produce practically identical results, confirming that there is no penalty for using the JFNK approximation, as it yields a very good approximation to the action of the Jacobian. For this reason any conclusion drawn regarding JFNK-F-N can be understood to apply equally to NK-F-N. For Case 1 the various strategies generally produce comparable results, the largest exception being the An [29] algorithm, which produces the worst results for the N and FR constraints and diverges for the FW constraint. In fact, for the FW constraint in both cases nearly every choice of η results in divergence; the only acceptable values are the constants 10^-2 and 10^-3. In Case 2 each of the constant values tested, both Eisenstat algorithms, and the Dembo algorithm perform comparably.
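As a point of reference for the adaptive strategies compared here, a sketch of a safeguarded Eisenstat-Walker-type update is shown below; the constants are the commonly quoted defaults from the literature, not values calibrated in this work:

    def forcing_term(res_new, res_old, eta_old, gamma=0.9, alpha=2.0,
                     eta_max=0.9, eta_min=1.0e-4):
        """Forcing factor from the nonlinear residual reduction ("Choice 2" style)."""
        eta = gamma * (res_new / res_old) ** alpha
        safeguard = gamma * eta_old ** alpha     # prevent eta from dropping too fast
        if safeguard > 0.1:
            eta = max(eta, safeguard)
        return min(max(eta, eta_min), eta_max)   # clip to sensible bounds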

Figure 5.2: Effect of η on Takeda-1, Case 1 Convergence (transport sweeps for each forcing-factor strategy — constant values, Eis-A, Eis-B, Dembo, and An — for JFNK-F-N, NK-F-N, JFNK-F-FR, and JFNK-F-FW)

Figure 5.3: Effect of η on Takeda-1, Case 2 Convergence (same layout as Figure 5.2)

For Case 1, a constant value of 10^-3 is the best for the FR and FW constraints, while it is a little worse than most for the N constraint. For Case 2 the FW results are the same, but for the FR constraint it is now the 10^-1 and 10^-2 constant values that result in the fewest S_N sweeps. In general, the fact that the constant values perform well is good news, because they are very simple to implement and computationally inexpensive to execute. On the other hand, which constant value performs best appears to depend on both the problem and the selected Newton method; fortunately, the differences between them are minor. It is also interesting to note that for both Case 1 and Case 2 of the Takeda problem, even the worst choice of η for the FR constraint is better than the best choice of η for the N or FW constraints. These plots make a good case for the argument that the FR constraint is the most effective. It is difficult to make an argument for the FW constraint using the JFNK-F formulation since it diverges more often than not. Likewise, the N constraint, though simple to understand and work with, results in a substantially more expensive solution: for some η strategies the N constraint requires twice the number of sweeps the FR constraint does.

The same set of η choices was used for the Unrodded, Rodded A, and Rodded B C5G7-MOX problems using the JFNK-F-FR and JFNK-FDF-FR formulations of the Newton method for the k-eigenvalue problem. The results for the JFNK-F-FR approach can be found in Figure 5.4, while those for the JFNK-FDF-FR approach are given in Figure 5.5.

Figure 5.4: C5G7-MOX Benchmark: JFNK-F-FR for Varying η (transport sweeps for each forcing-factor strategy for the Unrodded, Rodded A, and Rodded B configurations)

Figure 5.5: C5G7-MOX Benchmark: JFNK-FDF-FR for Varying η (transport sweeps for each forcing-factor strategy for the Unrodded, Rodded A, and Rodded B configurations)

This set of results differs in that only one constraint is used throughout, but different formulations, arising from different choices of P, are employed. The results for this problem are less uniform than those seen for the Takeda problem. This is to be expected, as the C5G7-MOX problems are much more challenging: many more spatial cells, more energy groups, and the presence of upscattering. In this case there is no single worst choice, but the Eisenstat-A and An approaches are generally poor performers regardless of the rod configuration or Newton formulation. The Eisenstat-B, Dembo, and constant 10^-1 strategies all fall in the middle of the pack for all of the situations examined. The best performers are the constant values of 10^-2 and 10^-3, though in most cases their advantage over the middle of the pack is not very pronounced. These four figures show that the forcing factor, η, does have a noticeable impact

on the overall expense of a given formulation of Newton's method. Though there are measurable differences among the strategies, there are no clear favorites and likewise no utter failures. Given the simplicity and the performance of the approach, these results indicate that using a constant value for η is quite acceptable, and that a fixed value in the range of 10^-3 to 10^-2 is appropriate. Though the actual eigenvalues have not been shown for any of these experiments, it is important to note that they all agreed with one another and with the fixed-point solution to the expected precision, meaning that each approach is correctly solving the problem posed. Furthermore, though it may seem that 10^-2 is too loose a tolerance for use with GMRES in the Newton step, there are no precision issues in the converged solution due to this choice, and the same is true for η = 10^-1. It is possible that with some tweaking any of the previously developed algorithms could show marked improvement, or that a new convergence scheme could be devised that would improve the performance. However, of those tested, 10^-2 is a safe choice, and it will be used for the majority of the runs that follow.

5.3.3 Initial Guess

Studying the effect of the initial guess on the convergence of Newton's method is difficult, to say the least. The most well-known stipulation of Newton's method is that convergence can be guaranteed only when the initial guess is sufficiently close to a root of the function. This is complicated when using Newton's method to solve the eigenvalue problem, because any eigenpair is a root of the nonlinear function, and it is further compounded by the number of available formulations of the k-eigenvalue problem, due to the different forms of P and the number of possible constraint equations.

The numerical experiments used to examine the effect of the initial guess again used both of the Takeda-1 configurations and the C5G7-MOX-Unrodded configuration, with the same Newton methods employed as in the previous section. For both of the Takeda-1 configurations the first set of results is given in Table 5.4. In this set of experiments each Newton method considered is started using either the flat-flux or fission-based initial guess with anywhere between 0 and 5 initial fixed-point iterations.
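The procedure being varied here can be sketched as follows; this is a schematic only, in which fixed_point_update and newton_solve are placeholders for the P_FP application and the JFNK driver, not functions from the test code:

    def initialize_and_solve(phi0, k0, n_init, fixed_point_update, newton_solve):
        """Run n_init fixed-point iterations, then switch to Newton's method."""
        phi, k = phi0, k0
        for _ in range(n_init):                 # 0 through 5 in these tests
            phi, k = fixed_point_update(phi, k)
        return newton_solve(phi, k)             # Newton starts from improved guess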

The fixed-point iterations are of the form of Eq. (4.75), P_FP, with GMRES used to perform the inner iterations and with the default convergence scheme. The initial guess for the eigenvalue, k^(0), is unity in all of these cases.

Table 5.4: Effect of Initial Fixed-Point Iterations on Sweep Count for Takeda-1 Configurations
[Columns: JFNK-F-N, NK-F-N, JFNK-F-FR, and JFNK-F-FW, each with flat and fission-based initial guesses. Rows: 0 through 5 initial fixed-point iterations, for Case 1 and Case 2; diverged runs are marked (a).]
(a) Diverged

If one examines the sweep counts as the number of initial fixed-point iterations increases, a blurred picture emerges. It is clear that 5 initial power iterations result in a high sweep count, likely because the fixed-point iteration is doing the majority of the work in this case. There is, in general, a good deal of evidence to suggest that 3 or 4 initial fixed-point iterations also result in a higher sweep count than a small number, though there are a few exceptions to this observation. It seems almost a toss-up between using 0, 1, and 2 fixed-point iterations to initialize the Newton method. However, we can see that with the FW constraint and a fission-based starting flux, divergence can be avoided by using a non-zero number of initial fixed-point iterations. Thus, these results indicate that either 1 or 2 initial fixed-point iterations is a good choice as a means to initialize Newton's method. Another trend that can be discerned from the rows of Table 5.4 is that in nearly every case a flat starting flux results in

fewer total sweeps than when the starting flux is determined by the B operator applied to a flat flux (the fission-based initial guess). These results also echo previous conclusions: the FR constraint for this formulation is more efficient than the N and FW constraints, and NK-F-N behaves identically to JFNK-F-N.

Table 5.5: Effect of Initial Eigenvalue Guess on Sweep Count for Takeda-1 Configurations
[Columns: JFNK-F-N, NK-F-N, JFNK-F-FR, and JFNK-F-FW, each with flat and fission-based initial guesses. Rows: k^(0) values from 1.0 to 3.0, for Case 1 and Case 2; diverged runs are marked (a).]
(a) Diverged

In Table 5.5, rather than varying the number of initial fixed-point iterations, the initial eigenvalue guess, k^(0), is varied between 1.0 and 3.0; no initial fixed-point iterations are performed in these runs. From this table it can be seen that k^(0) can have a significant impact on the convergence of the problem for the eigenvalue-update constraints. For the flat-flux initial guess the initial eigenvalue has practically no impact on the runs using the N constraint, and though large values of k^(0) increase the number of sweeps required when the fission-based initial guess is used with the N constraint, the method still converges. This may be related to the fact that the N constraint has no dependence on the eigenvalue estimate but is instead determined entirely by the eigenvector. It can be seen for the eigenvalue-update constraints that many values of k^(0) cause the FR and FW

constraints to diverge. The actual eigenvalue for both cases is below 1.0, so it would seem that for the FR constraint the sweep count increases as k^(0) moves further from the true eigenvalue, until the method begins to diverge. These results again confirm the difficulties seen with the FW constraint and indicate that either the FR or N constraint is the better choice. While these results may suggest the N constraint is more robust, it must be noted that no fixed-point iterations were used to initialize Newton's method in this set of results; we will see shortly that convergence difficulties can be overcome in this manner.

Table 5.6: C5G7-MOX-Unrodded Total Sweep Counts Using Fixed-Point IG with Converged Inners (GMRES)
[Columns: flat and fission-based initial guesses with 0, 1, and 3 initial fixed-point iterations. Rows: k^(0) values from 0.5 to 3.0 for the JFNK-F-FR and JFNK-FDF-FR formulations; entries are marked where the run converged to a higher-harmonic eigenvalue (a) or diverged (b).]
(a) Converged to higher-harmonic eigenvalue
(b) Diverged

Table 5.6 shows sweep counts for the unrodded C5G7-MOX problem using the JFNK-F-FR and JFNK-FDF-FR Newton formulations. In this set of experiments both k^(0) and the initial number of fixed-point iterations are varied for both a flat-flux and a fission-based flux initial guess. The k^(0) values vary from 0.5 to 3.0, and 0, 1, and 3 initial fixed-point iterations are considered. Again the initial fixed-point iterations are of the form of Eq. (4.75), with GMRES and the default convergence used

for the inner iterations. A cursory glance at Table 5.6 reveals a few major trends. The first is that for the C5G7-MOX problem with 0 initial fixed-point iterations, regardless of the starting flux, convergence is extremely unreliable: convergence to the fundamental mode is achieved only for certain values of k^(0), and divergence is seen in some cases. For the flat-flux initial guess those k^(0) closest to the fundamental eigenvalue converge, but for the fission-based starting flux the values of k^(0) that converge are unpredictable, though generally higher than the true eigenvalue. It can be seen that performing a single fixed-point iteration before commencing Newton iterations results in convergence to the fundamental mode for both starting-flux choices and all k^(0) values. Looking at these columns, another interesting trend appears for the flat-flux starting flux: k^(0) values near 2.0 seem to result in the fewest sweeps, not the k^(0) values closest to the fundamental eigenvalue. Using a single initial fixed-point iteration and a fission-based initial guess results in a computational load (as measured by the number of mesh sweeps) that is less sensitive to k^(0): for the flat-flux initial guess, differences of over 300 sweeps can be seen due to the choice of k^(0), while for the fission-based option the largest difference in sweep count due to k^(0) is less than 100 sweeps. Performing three fixed-point iterations prior to beginning Newton's method results in all runs converging, with sweep counts that are nearly independent of k^(0). This would indicate that after these three fixed-point iterations the solution is sufficiently converged that Newton's method is being started at essentially the same point, regardless of the initial flux or k^(0). This is confirmed by the fact that for a given method the sweep counts are practically identical for all k^(0) and both starting-flux options. The fixed-point iterations in this case serve to push the solution towards the fundamental mode, indicating that convergence to the fundamental mode can almost be guaranteed by performing enough fixed-point iterations before switching to Newton's method. In the case of the flat starting flux this is actually detrimental: the overall number of sweeps increases in going from 1 to 3 initial fixed-point iterations, while the opposite is true for the fission-based starting flux, where the additional initial fixed-point iterations actually lower the total sweep count. Based on these results it seems a flat-flux initial guess is likely to produce the best results. However, the picture painted by this table is anything but clear, and so the effect of the initial guess for the C5G7-MOX-Unrodded problem will be explored further.

Table 5.7: C5G7-MOX-Unrodded Total Sweep Counts Using Flat Fixed-Point IG
[Columns: flat and fission-based starting fluxes with 0 (a), 1, 5, or 10 initial flattened fixed-point iterations. Rows: k^(0) values from 0.5 to 3.0 for the JFNK-F-FR and JFNK-FDF-FR formulations; entries are marked where the run converged to a higher-harmonic eigenvalue (b) or diverged (c).]
(a) Full backtracking implemented
(b) Converged to higher-harmonic eigenvalue
(c) Diverged

One possibility is to replace the P given by Eq. (4.93) with that in Eq. (4.94) when performing the initial fixed-point iterations, which could be very beneficial since the application of the flattened P, Eq. (4.94), is relatively inexpensive. This is done in Table 5.7 for the C5G7-MOX benchmark with k^(0) between 0.5 and 3.0, a fission-based starting flux with 1 initial fixed-point iteration, and a flat starting flux with 5 and 10 initial fixed-point iterations. The results reported in Table 5.7 are not particularly encouraging for this approach. Even using 10 initial fixed-point iterations for this formulation results in many runs converging to a non-fundamental eigenvalue. In fact, it appears that performing 5 initial fixed-point iterations is more likely to yield the fundamental mode than performing 10, which indicates that the eigenpair estimate used to initialize Newton's method is much poorer in this case than that seen in the previous set of experiments, when Eq. (4.93) was used for P.

Even when the flattened fixed-point operator does result in Newton's method converging to the fundamental mode, none of these results indicates any benefit from this formulation: the sweep counts are generally higher than those seen when using Eq. (4.93) to initialize Newton's method. Another experiment was tried in which a flat starting flux was used and no initial iterations were performed, but backtracking was applied at each step of Newton's method. Comparing the first column of sweep counts in Table 5.7 to the corresponding column in Table 5.6 shows that the addition of the backtracking step is at best marginally effective. With backtracking the k^(0) = 2.0 cases converge, but the sweep counts have all risen, and this does not account for the full cost of backtracking, since some of that cost cannot be represented as sweeps. Thus, based on Table 5.7, performing a full standard fixed-point iteration starting from a flat flux prior to commencing Newton's method is most likely to result in convergence to the fundamental mode.

To delve deeper into why performing a single fixed-point iteration is so key to converging to the fundamental mode, two more sets of experiments were considered. The first examined the impact on convergence of the k estimate that results from the fixed-point iteration. This was done by performing the single fixed-point iteration for each of the previous k^(0) values and using the resulting eigenvalue estimate as k^(0) for a new calculation with no initial fixed-point iterations. This means the starting fluxes used are still either flat or fission-based, but the initial eigenvalue guesses are those determined by the fixed-point iteration. The results of these experiments are shown in Table 5.8 for the C5G7-MOX-Unrodded problem using the JFNK-F-FR and JFNK-FDF-FR methods. What stands out immediately is that this causes almost all of the fission-based runs to converge to a non-fundamental mode. This is not so surprising when one looks at the k^(0) values resulting from the fixed-point iteration in this case: the only k^(0) that is relatively close to the fundamental mode manages to converge; all of the others are much lower. The results for the flat starting flux do compare well with the results of Table 5.6; the new k^(0) values alone result in the majority of runs converging to the fundamental mode. Still, for the JFNK-F-FR method some k^(0) values converge to a higher-harmonic eigenpair and another value causes divergence, while in Table 5.6 all of the runs converge to the fundamental mode with 1 initial fixed-point iteration.

Table 5.8: C5G7-MOX-Unrodded Total Sweep Counts Using k^(0) from a Single Converged Fixed-Point Iteration
[Columns: flat and fission-based starting fluxes with 0 initial fixed-point iterations. Rows: for JFNK-F-FR and JFNK-FDF-FR, the k^(0) produced by the fixed-point iteration for each original k^(0), plus rows marked k_conv in which k^(0) was set to the eigenvalue the method ultimately converges to; entries are marked where the run converged to the wrong eigenvalue (a) or diverged (b).]
(a) Converged to wrong eigenvalue
(b) Diverged

This indicates that it is not solely k^(0) that impacts convergence; the updated flux vector from the fixed-point iteration must play a role as well. This is further confirmed by considering the rows marked k_conv: in these two cases the value of k^(0) chosen was the eigenvalue to which the chosen method ultimately converges. The fact that the sweep counts here are not minimal for the flat starting flux, and that the fission-based starting flux does not converge to the fundamental mode, indicates that the proximity of k^(0) to the fundamental eigenvalue is not a guaranteed indicator of convergence behavior. Even if k^(0) were the most important factor regarding convergence to the fundamental mode, comparing the best sweep counts from Tables 5.8 and 5.6 does not reveal any gross inconsistencies, so it could be argued that performing the single initial fixed-point iteration is ultimately no more expensive than performing none with a good k^(0) value.

Table 5.9: Sweep Count for Flat-Flux IG C5G7-MOX-Unrodded Problem Without Upscattering
[Columns: JFNK-F-FR and JFNK-FDF-FR with a flat starting flux and 0 initial fixed-point iterations. Rows: k^(0) values from 0.5 to 3.0; entries are marked where the run converged to the wrong eigenvalue (a) or diverged (b).]
(a) Converged to wrong eigenvalue
(b) Diverged

One question arising from the interpretation of the C5G7-MOX-Unrodded results is the impact of upscattering. The Takeda results show less sensitivity to the initial guess than the C5G7-MOX results: for the Takeda problems, convergence is affected mainly by k^(0) and very little by the number of initial fixed-point iterations. One possible explanation is the presence of upscattering in the C5G7-MOX-Unrodded problem. To test this, all of the upscattering cross sections in the C5G7-MOX-Unrodded problem were set to zero, as previously described, and the JFNK-F-FR and JFNK-FDF-FR methods were used with a flat-flux starting guess and no initial fixed-point iterations. The results of this experiment are given in Table 5.9 and quite definitively show that upscattering is not the reason for the behavior seen in the previous experiments. Both methods diverge for certain values of k^(0), and a non-fundamental mode is converged to in many cases; only for k^(0) = 1 does each method converge to the fundamental mode for this variation of the problem. It is most likely that the key difference between the Takeda and C5G7-MOX problems is sheer magnitude: Table 5.2 shows that the C5G7-MOX-Unrodded problem considered here is almost two orders of magnitude larger than the Takeda-1 problem, and thus there are many more eigenpairs that are roots of the nonlinear function on which Newton's method is built.
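Removing the upscattering, as done for Table 5.9, amounts to a one-line operation on the group-to-group scattering matrix. A sketch, assuming groups ordered from high to low energy and a dense NumPy layout (the test code's actual data structures may differ):

    import numpy as np

    G = 7                        # stand-in group count (the C5G7 library has 7 groups)
    S = np.random.rand(G, G)     # illustrative scattering matrix S[g_to, g_from]
    S_no_up = np.tril(S)         # keep only in-group and downscattering entries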

5.3.4 GMRES Iterations

As was seen in the diffusion theory results, any user of GMRES must choose how large to allow the subspace to grow before restarting and how many restarts to allow. While η determines when the GMRES iteration stops, it is possible that some Newton steps will take far too many GMRES iterations to reach the η tolerance, and it may be more economical to terminate the iterations and proceed to the next Newton step. The subspace size is important from a memory standpoint, while the number of restarts is important from an execution-time standpoint. The subspace size is not experimented with here, as 25 is a perfectly acceptable value that is neither too demanding on memory nor so small that it adversely affects computational performance. The impact of the number of restarts on the overall computational cost, however, will be explored.

This experiment used Cases 1, 2, and 3 of the Takeda-3 problem. The GMRES subspace size was set to 25, and 0, 1, and 2 restarts were considered, meaning that a maximum of 25, 50, and 75 GMRES iterations per Newton step were allowed, respectively. The total number of sweeps and the number of GMRES iterations per Newton iteration for each run are given in Table 5.10, while the resulting eigenvalues are given in Table 5.11.

Table 5.10: Effect of Number of GMRES Restarts on Takeda-3 Problems, Subspace Size 25
[Columns: for each of Cases 1, 2, and 3, the number of sweeps (a) and the GMRES iterations per Newton step (b). Rows: 0, 1, and 2 restarts.]
(a) Number of sweeps
(b) GMRES iterations per Newton iteration
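This restart bookkeeping maps directly onto standard GMRES interfaces. A sketch assuming SciPy's gmres, in which maxiter counts restart cycles (argument names vary slightly across SciPy versions):

    from scipy.sparse.linalg import LinearOperator, gmres

    def newton_step(jv, rhs, n, eta, restarts=0):
        """Solve J*dx = rhs loosely: subspace of 25, at most 25*(restarts+1) iterations."""
        J = LinearOperator((n, n), matvec=jv)
        dx, info = gmres(J, rhs, rtol=eta, restart=25, maxiter=restarts + 1)
        return dx    # info > 0 means the cap was hit; the step is used anyway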

Table 5.11: Eigenvalues for Takeda-3 Problem with Varying Number of GMRES Restarts
[Columns: Cases 1, 2, and 3. Rows: 0, 1, and 2 restarts.]

The resulting eigenvalues show that the number of restarts has no impact on the solution value; i.e., there is no loss in accuracy due to capping the number of GMRES iterations at these levels. The sweep counts in Table 5.10 clearly indicate that fewer restarts yield fewer GMRES iterations, ultimately consuming fewer sweeps. This is desirable and supports the decision not to perform any restarts, though the most interesting data in this experiment is not the total sweep count but the number of GMRES iterations per Newton iteration. In all cases except Case 3 with 2 restarts, a total of four Newton iterations is required to converge. In all cases the first Newton iteration is terminated when the maximum number of GMRES iterations is reached, regardless of whether 25, 50, or 75 iterations are permitted. When no restarts are performed, the maximum number of GMRES iterations is frequently reached, and when it is not, the iteration count is often very close to the limit. These results suggest two approaches that should certainly be pursued if one were attempting to minimize the cost of the k-eigenvalue calculation using Newton's method. The first is lowering the cost of the first Newton step, and the simplest way to do so is to devise a new η scheme. The cost of the first step confirms that 10^-2 is a relatively tight tolerance at this point in the calculation and indicates that a dynamic η should be chosen which starts off large in the early Newton steps but does not become too small in later iterations. A scheme similar to the Dembo scheme [24] could be used, with the constants tweaked to produce a more desirable sequence of η values. The second and more important approach is preconditioning. The effect of preconditioning was witnessed very clearly in the diffusion calculations, and the results shown in Table 5.10 clearly indicate that preconditioning would be beneficial in the transport problem as well.

In the diffusion calculations the matrix form of the operators was known explicitly, and preconditioners were constructed using the M and F matrices. In the transport case we have already formulated the problem using a fixed-point iteration, so using a traditional power or fixed-point iteration as a preconditioner is not likely to yield improved results. It is possible that preconditioners could be constructed from the operators presented, but due to the expense of a single S_N sweep (L^-1) the preconditioner would have to be very effective to lower the total sweep count. It seems likely that a very good preconditioner could be constructed using the diffusion approximation to transport theory. It has been shown [74] that traditional diffusion synthetic acceleration (DSA) techniques for inner iterations can be used as preconditioners for the inner iteration when it is solved via Krylov methods. It may be possible to use DSA techniques developed for outer iterations, such as the diffusion sub-outers in PARTISN [75], as preconditioners for the Newton step. Knoll and Keyes [33] specifically discuss using a low-order approximation as a preconditioner in their survey article, and further information on how to do so can be found there. The relatively inexpensive diffusion calculation could be very effective in reducing the number of GMRES iterations required at each Newton step, which could substantially lower the cost of the k-eigenvalue calculation using Newton's method; this is a wide-open area, not pursued in this work, where further research would be quite beneficial.
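Schematically, such a low-order preconditioner would enter the Krylov solve as a right preconditioner on the Jacobian-vector product. In the sketch below every name is a placeholder: apply_diffusion_inv stands for an approximate inverse of a diffusion-based surrogate of the transport Jacobian, which is not an operator developed in this work:

    def preconditioned_jv(jv_transport, apply_diffusion_inv, v):
        """Right-preconditioned product: GMRES sees J * M^{-1}."""
        # After GMRES converges to y, the true step is recovered as dx = M^{-1} y.
        return jv_transport(apply_diffusion_inv(v))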

5.4 Newton Formulations

The parametric studies conducted in the previous section hinted at the behavior of the different Newton formulations and constraints. However, those were not comprehensive tests designed to delineate the efficiency of the different formulations of P and the different constraint equations. In this section the various fixed-point and Newton formulations (with all constraint options) are explored, along with the effect of the angular and spatial discretizations. The first set of experiments uses the Takeda-1 configurations to examine the impact of the S_N order and the number of spatial unknowns per cell on the total sweep count, using the S_4 and S_8 quadrature sets and the DD and AHOT-N1 spatial discretizations. The solution accuracy of the AHOT-N1 and DD methods is then briefly compared for the C5G7-MOX problems in order to justify the use of DD, and the convergence rate is briefly examined. A comprehensive look at all available fixed-point and Newton formulations is then performed for the Takeda-1 problem with upscattering, which allows a clear comparison of the various solution methods for the k-eigenvalue problem. The treatment of inner iterations is explored in detail, and the most effective scheme for each problem formulation is identified. The conclusions and discussion following this experiment serve as an introduction to the larger set of results ultimately used to compare the newly developed Newton approaches to the traditional fixed-point iteration(s).

5.4.1 Angular and Spatial Discretizations

To determine the impact of the spatial discretization and S_N order on the behavior of Newton's method, the JFNK-FDF-FR formulation was used to solve both configurations of the Takeda-1 problem with the DD and AHOT-N1 spatial discretizations. The differences between the approaches are not trivial, as the AHOT-N1 method has eight times as many unknowns per cell (scalar-flux spatial moments) as DD in three dimensions. The problem was also solved using the S_4 and S_8 quadrature sets provided in the benchmark specification. The results of these experiments are given in Table 5.12, where the sweep count and eigenvalue of each run are listed. It can be seen that regardless of S_N order, spatial discretization, or problem configuration the sweep count is quite steady, between 124 and 126.

Table 5.12: Varying Angular/Spatial Discretization for Takeda-1 Using the JFNK-FDF-FR Newton Formulation
[Columns: spatial discretization; S_N order; and, for each of Case 1 and Case 2, the sweep count, eigenvalue, and Δk (a). Rows: DD and AHOT-N1 at S_4 and at S_8.]
(a) Difference between the DD and AHOT-N1 eigenvalues for a given S_N order (in pcm)

Figure 5.6: Convergence of Takeda-1 for Varying Angular/Spatial Discretization (Newton residual vs. Newton iteration on a semi-log scale for Cases 1 and 2 with S_4/S_8 and DD/AHOT-N1)

So while increasing the S_N order or choosing a spatial discretization with many more unknowns per cell makes each sweep more expensive, it does not substantially increase the total number of sweeps. This behavior applies to traditional fixed-point solution methods as well, and it is fortunate that it also holds for the Newton approaches. The same set of eight runs reported in Table 5.12 was repeated with both the eigenvalue and the point-wise fission source error converged tightly, so that the convergence behavior of the JFNK-FDF-FR approach could be seen for the different spatial and angular discretizations. The results of this convergence analysis are given in Figure 5.6, where the Newton residual is plotted against the number of Newton iterations on a semi-log scale. It can be seen that the convergence of each run is nearly identical and that the asymptotic convergence regime is reached after a single Newton iteration. The convergence rate is superlinear but sub-quadratic, which is to be expected since the forcing factor is not chosen in a manner that preserves the quadratic convergence rate of Newton's method.
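The observed order can be estimated directly from successive residual norms via p ≈ log(r_{k+1}/r_k) / log(r_k/r_{k-1}); the residual sequence below is purely illustrative and is not data read from Figure 5.6:

    import numpy as np

    r = np.array([1.0e-1, 1.0e-2, 3.0e-4, 3.0e-6])   # hypothetical Newton residuals
    p = np.log(r[2:] / r[1:-1]) / np.log(r[1:-1] / r[:-2])
    print(p)    # entries between 1 and 2: superlinear but sub-quadratic decay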

Such a convergence rate is attractive but ultimately not very important since, in practice, problems are rarely, if ever, converged more tightly than the default tolerance levels used for the eigenpair in all of the other calculations in this chapter.

Table 5.12 showed that the sweep counts depend only weakly on the spatial and angular discretizations, but it is also worth looking at the impact of the spatial discretization on the converged eigenvalue. The S_N order is generally fixed in the benchmark problems we consider, but the spatial discretization is left to the participant's discretion. The DD method is preferred because the run times, particularly for the C5G7-MOX problems, are significantly shorter. For example, for the C5G7-MOX-Unrodded problem using the JFNK-F-FR approach, a representative execution time using DD is 3-4 hours, while the corresponding AHOT-N1 run takes considerably longer. These execution times are for one of the most efficient Newton methods tested; for the traditional fixed-point iterations and the poorly performing Newton formulations the execution time of AHOT-N1 becomes a limiting factor, given how many distinct runs are performed to produce the results reported in this chapter. Table 5.13 shows the difference in the eigenvalues calculated using DD and AHOT-N1, all other options being equal, with the difference in pcm given in the rightmost column.

Table 5.13: Effect of Spatial Discretization on k
[Columns: benchmark; DD eigenvalue; AHOT-N1 eigenvalue; Δk (a). Rows: five Takeda configurations and the three C5G7-MOX configurations (Unrodded, Rodded A, Rodded B).]
(a) Difference between DD and AHOT-N1 (pcm)

It can be seen that for the Takeda problems the differences remain well below 100 pcm, while the difference is over 100 pcm for two of the C5G7-MOX problems.

Given that the goal in solving the benchmark problems is to measure performance (and not to demonstrate accuracy), there is not a compelling reason to use AHOT-N1 in favor of DD, as both methods result in reasonably accurate solutions. In fact, when compared to the eigenvalues in Table 5.1, the DD results are frequently more accurate than the AHOT-N1 values, though the reasons for this are likely related to cancellation of errors and how the reference solutions were computed. Regardless, for the intents and purposes of this work, Table 5.13 confirms that DD is a perfectly adequate spatial discretization, and it is thus used in the remainder of the numerical experiments in this chapter.

5.4.2 Inner Iterations

To this point both the source iteration (SI) and GMRES methods for performing inner iterations have been explained, though neither has been used numerically. In this section we examine the effect the formulation of the within-group problem has on the total sweep count and which initial guess results in the best convergence of the inner iterations. To generate a large number of data points for this experiment, the GMRES and SI inner formulations were started in every outer iteration or Newton step with either a zero initial guess or the previous outer iterate (or the value of the Krylov vector v in the case of a Jacobian-vector product evaluation). These four scenarios were run for all of the methods that require iterations on the within-group problem: the traditional approach using the P and FP formulations of P with the FR eigenvalue update formula, and the JFNK-P and JFNK-FP Newton formulations, each using the three previously discussed constraint equations. The sweep counts resulting from these numerical experiments are given in Table 5.14; the benchmark problem used in this case is the Takeda-1 problem with upscattering.

This table is extremely informative regarding the impact of the inner iterations on the various problem formulations. The most obvious conclusion is that the GMRES formulation of the inner-iteration problem is always less expensive than source iteration, no matter which formulation of the k-eigenvalue problem is being solved. In particular, comparing GMRES to SI with a zero initial guess, the GMRES formulation is always at least an order of magnitude less expensive.
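In operator terms, the within-group problem φ = DL^{-1}(Σ_gg φ + q) can either be iterated directly (SI) or recast as the linear system (I - DL^{-1}Σ_gg)φ = DL^{-1}q for GMRES. A schematic of both, in which apply_swept_scatter stands for one application of DL^{-1}Σ_gg and b for the swept fixed source (SciPy argument names vary by version):

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    def si_solve(apply_swept_scatter, b, phi0, tol=1.0e-5, max_it=500):
        """Source iteration: phi <- D L^{-1} (Sigma_gg phi) + b, warm-started."""
        phi = phi0.copy()
        for _ in range(max_it):
            phi_new = apply_swept_scatter(phi) + b
            if np.linalg.norm(phi_new - phi) <= tol * np.linalg.norm(phi_new):
                return phi_new
            phi = phi_new
        return phi

    def gmres_solve(apply_swept_scatter, b, phi0, tol=1.0e-5):
        """GMRES on (I - D L^{-1} Sigma_gg) phi = b, warm-started from phi0."""
        A = LinearOperator((b.size, b.size),
                           matvec=lambda x: x - apply_swept_scatter(x))
        phi, _ = gmres(A, b, x0=phi0, rtol=tol, restart=25)
        return phi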

Table 5.14: Effect of Inner Iteration Treatment on Takeda-1 With Upscattering
[Columns: sweep counts for GMRES and source-iteration inners, each started from a zero initial guess, φ^(0,l) = 0 (a), or from the previous outer iterate, φ^(0,l) = φ^(l-1) (b). Rows: Trad-P-FR, Trad-FP-FR, JFNK-P-N, JFNK-FP-N, JFNK-P-FW, JFNK-FP-FW, JFNK-P-FR, JFNK-FP-FR.]
(a) Inner iterations started with a zero initial guess for the group flux
(b) Inner iterations started using the current φ or v (depending on formulation)

For the traditional power/fixed-point implementation we can also see that using the previous outer iterate as the initial guess ultimately reduces the sweep count; in the case of SI the reduction is significant. We have already seen in the previous chapter how using this initial guess with source iteration can lead to the FD formulation of P. These results further expose the difference between SI and GMRES in this regard, something that will be explored in more detail shortly. Another definitive trend in the sweep counts is that the P formulation of the problem is always more expensive than the FP formulation. This is to be expected, since the two methods are equivalent when there is no upscattering, but P has a nested level of iteration when upscattering is present, making each evaluation of P more expensive. Comparing the JFNK results to those generated using traditional methods, it can be seen that for these formulations of the Newton problem the JFNK approaches are not even competitive with the k-eigenvalue problem formed using the equivalent P operators. Another interesting observation is that the traditional techniques produce the best results when GMRES is used with the previous outer iterate as the initial guess, while the JFNK methods produce the best results when a zero initial guess is used with GMRES.

In the further numerical tests described below, unless otherwise indicated, these choices are used for methods requiring inner iterations.

Table 5.15: Using a Fixed Number of Inners for C5G7-MOX-Unrodded
[Columns: for GMRES and source-iteration inners, each warm-started with φ^(0,l) = φ^(l-1) (a), the sweep count, total GMRES iterations (b), and Newton iterations. Rows: 1, 2, 5, and 10 inners per outer; runs that reached the maximum allowed number of Newton iterations are marked (c).]
(a) Inner iterations started using the current φ or v (depending on formulation)
(b) Total number of GMRES iterations in Newton's method, i.e., the sum over Newton steps, not the number of inner iterations
(c) Maximum number of Newton iterations allowed reached

The five formulations of P presented to this point are all well defined using operator notation. However, we briefly explore here some methods that fall in between the existing formulations and are related to the treatment of inner iterations. By using the previous outer iterate as the initial guess in conjunction with SI and performing only a single inner iteration per outer, we transition from the FP to the FD approach, with FDF being a variation of this. The question must be asked: what if two inner iterations were performed per outer? The resulting technique is certainly not the FD approach, but neither is it the FP approach, since two inner iterations are unlikely to converge the within-group problem, especially in early outer iterations. The same is true for 5 and 10 inner iterations; thus there is a large grey area between the FD/FDF and FP approaches where what is being done is technically neither, but is closest to an FP approach in which "converged" inners are loosely defined. This also leads to the question of what happens if SI is replaced with GMRES: will one inner iteration per outer suffice? It is difficult to show the resulting k-eigenvalue problem formulation analytically for this choice, since it requires carrying out a single GMRES iteration symbolically rather than numerically.
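The grey area just described is easiest to see in code: a within-group loop truncated at a fixed number of sweeps, warm-started from the previous outer iterate. With n_inner = 1 this reproduces the FD-type flattened scheme; run to convergence, it recovers FP. A sketch, with sweep standing for one application of DL^{-1} to a group source:

    def within_group(phi_prev_g, q_external, sigma_gg, sweep, n_inner):
        """Perform exactly n_inner source iterations on one group's equation."""
        phi = phi_prev_g                   # previous outer iterate as initial guess
        for _ in range(n_inner):           # 1, 2, 5, or 10 in Table 5.15
            phi = sweep(sigma_gg * phi + q_external)
        return phi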

To answer some of these questions numerically, we solve the Unrodded configuration of the C5G7-MOX problem using both GMRES and SI (with the previous outer iterate or v as the initial guess) with a fixed number of inner iterations per outer iteration: 1, 2, 5, and 10. The Newton formulation used is the JFNK-FDF-FR approach, and the sweep counts resulting from these experiments are given in Table 5.15, along with the total number of GMRES iterations and Newton iterations. Here the GMRES count does not refer to the GMRES iterations associated with the inner iterations (if performed) but to the GMRES iterations used to solve each linearized Newton step. For 1, 2, and 5 inners per outer the SI formulation is actually better than the GMRES formulation. For 1 and 2 inners per outer, using GMRES results in a solution that converges extremely slowly, while for SI this is where the total expense is lowest; it is not until 10 inners per outer that GMRES becomes comparable to SI. When using SI, the overall expense of the computation increases as the number of inners per outer increases, which suggests that the flattened approaches (F, FD, and FDF) will be less expensive than the FP (and thus the P) approaches. It is interesting to note that the total number of GMRES iterations decreases as the number of inners per outer increases, while the number of Newton iterations is constant. This is further evidence supporting the idea that the acceleration of a fixed-point iteration using Newton's method is equivalent to a preconditioned version of a related system. In other words, we know that the flattened problem formulation, P_F, given by Eq. (4.94), is equivalent to the generalized eigenvalue formulation of the nonlinear equation (the GEP formulation in diffusion). Doing a single inner per outer corresponds to this non-preconditioned system, even though the equation can be written in the same generalized form as the other fixed-point iterations (F and P). As the number of inner iterations per outer increases, we transition from a flattened approach to one where the within-group equation is actually being solved, and this solve acts as a preconditioner to the linearized Newton step. This explains the decrease in the total number of GMRES iterations. However, it is obvious that the cost of applying this preconditioning is too high to see any benefit from the decreased number of Krylov iterations.

The effectiveness of the true P and FP approaches can be understood by noting that all of the JFNK runs in Table 5.14 required only two Newton iterations to converge. So even though these methods are expensive, they are well conditioned and behave very well as far as the Newton and Krylov convergence properties are concerned. Still, it is the total computational cost, i.e., the sweep count, and not the number of iterations that determines the efficiency of a given formulation.

5.4.3 Formulations & Constraints

With the effect of inner iterations better understood, and knowing the best formulations of the within-group problem to use with both traditional and Newton methods, we can finally compare the different formulations and constraints to one another. The problem used for this purpose is the Takeda-1 problem with upscattering. The parameters are all chosen based on the results of the previous experiments: ε_2 for the perturbation parameter, 10^-2 for the forcing factor, and a single fixed-point iteration of the form of Eq. (4.75) (initialized with a flat flux) using the default convergence criteria to generate the initial guess. The Krylov subspace size is 25 with no restarts, and the default convergence criteria are used for all convergence tests. The only exception to the default values is the convergence of the inner iterations when the FP and P formulations are used with Newton's method: using the default inner convergence in this situation yields an unstable scheme that fails to converge, though the reason for the failure is unknown. A fixed constant inner tolerance works well for the tests in this section employing the JFNK-FP and JFNK-P methods, as well as in the results shown previously in Table 5.14.

The comprehensive set of runs used fixed-point iteration with the FR eigenvalue update for all five formulations of P. The JFNK approach was also used with all formulations of P and all constraints, and the NK-F-N approach was additionally included. Before comparing all of the results generated, we take a final look at the differences between the NK and JFNK approaches for the P formulation given by Eq. (4.94) with the N constraint. The results of this comparison are given in Table 5.16, where the eigenvalue, sweep count, Newton iteration count, total GMRES iteration count, Newton residual, and eigenvalue and point-wise fission source errors are compared for both methods.

Table 5.16: Comparison of NK and JFNK for Takeda-1 With Upscattering

Quantity             JFNK-F-N       NK-F-N
Newton Its.          6              6
Newton Residual      1.393x10^-7    1.393x10^-7
k-error (a)          2.486x10^-6    2.486x10^-6
FS-Error (b)         1.101x10^-5    1.101x10^-5

(a) As defined by Eq. (5.1)
(b) As defined by Eq. (5.2)
[The eigenvalue, sweep count, and total GMRES iteration entries of the table are likewise identical between the two methods.]

It can be seen that the eigenvalues agree to 10 digits, which means the difference between them is much smaller than both the eigenvalue convergence tolerance of 10^-5 and the finite-difference perturbation parameter. This is excellent agreement and indicates the calculations are nearly identical, which is further confirmed by the fact that the total numbers of sweeps, Newton iterations, and GMRES iterations incurred by the two formulations are equal; the same is true for all of the error measures included in the table. This is further confirmation that the JFNK alternative to NK-F-N is just as effective: the JFNK approximation does not degrade the solution or the algorithm performance in any meaningful way. This is extremely useful to show, since the NK approach can be quite difficult, if not impossible, to implement for constraints other than N; more complicated constraints cannot be implemented using operations one would expect to exist in most codes, due, for example, to the presence of transposes of certain operators. While it is certainly interesting that the action of the Jacobian can be formed and used directly by the Krylov method, there is no good reason to prefer this approach over the JFNK approximation for solving the same system. The JFNK approach is just as easy to implement from scratch as the NK approach and much simpler to implement in a preexisting code, ultimately affording the developer much greater flexibility in reusing existing solution algorithms and software. This is a rather broad conclusion to draw from a single test case, and it is entirely possible that there are problems where the quadratic terms in the finite-difference approximation are not negligible.

However, since we are converging the eigenpair to a tolerance that is in general larger than ε, there is good reason to believe the finite-difference Jacobian-vector product is an accurate approximation in the numerical results contained in this work.

The full set of sweep counts for all formulations and constraints for the Takeda-1 problem with upscattering is presented in Table 5.17, along with the eigenvalues for the FR constraint/update formula. A value of Δk is also given, which in this case is the difference in pcm between the reported eigenvalue and the reference eigenvalue of Table 5.1, calculated using the Traditional-FP formulation of the problem (the second row in Table 5.17).

Table 5.17: Comparison of Formulations & Constraints Sweep Count for Takeda-1 With Upscattering
[Columns: sweep counts for the N, FW, and FR constraints (a), plus the converged k (b) and Δk (c). Rows: Trad-P (d), Trad-FP (d), Trad-F, Trad-FD, and Trad-FDF under "Traditional Power/Fixed-Point", and JFNK-P (e), JFNK-FP (e), JFNK-F, JFNK-FD, and JFNK-FDF under "Newton Methods".]
(a) Update formula for the traditional power/fixed-point methods
(b) Converged k with the FR constraint/update; the FW and N constraints yield eigenvalues agreeing to 5 digits with the FR constraint, so the same k applies
(c) Difference from the reference solution of Table 5.1 (pcm)
(d) Using GMRES for inners, previous outer initial guess
(e) Using GMRES for inners, zero initial guess

Comparing first the five formulations using traditional iterative techniques, we see that the full power-iteration formulation (P) is more expensive than the fixed-point formulation (FP), in which the upscattering is not converged. This is exactly what one would expect, thus justifying the popularity of the FP approach, where the upscattering source is updated only once per outer iteration.

The three flattened approaches are not generally used to solve the k-eigenvalue problem in traditional fixed-point implementations. Though their sweep counts are comparable to those of P and FP, their eigenvalues are far from accurate compared to the convergence tolerance of 10^-5. This indicates that the fixed-point implementations of the F, FD, and FDF formulations converge so slowly that false convergence is witnessed. False convergence refers to the situation where the change from one iterate to the next is small enough to satisfy the convergence criteria even though the current iterate is still far from the true converged solution. While the P and FP eigenvalues agree to within the eigenvalue convergence criterion (1 pcm), the flattened approaches differ by over 60 pcm. This also suggests that the sweep counts given for the flattened approaches are lower than they should be, since more iterations are required to achieve the same solution accuracy as the P and FP formulations.

To confirm the claim that the flattened approaches suffer from false convergence when used in conjunction with a traditional iterative scheme, these methods were rerun with a stopping tolerance based on the fully converged eigenvalue rather than on the relative difference between successive iterates; specifically, a solution was considered converged when the difference between the current eigenvalue estimate and the fully converged eigenvalue fell below the tolerance. This provides a more realistic estimate of the cost of the flattened iteration; however, it only implies a converged eigenvalue, and it is very likely that the fission source is still not converged, meaning these results probably still underestimate the true cost of the flattened schemes. Using this stopping criterion the Trad-F method requires 1,336 sweeps, twice the number indicated in Table 5.17. Both the Trad-FD and Trad-FDF approaches require 1,326 sweeps to fully converge, slightly less than twice as many as the number shown in Table 5.17. These results confirm that, given a sufficient number of iterations, the flattened approaches do converge to the correct solution, and that the number of iterations necessary is substantially larger than that required by the Trad-P and Trad-FP formulations.

Looking at the JFNK formulations, we see results that reinforce earlier interpretations: the P and FP approaches are more expensive than their counterparts using traditional iterative techniques. However, when using the JFNK formulation the flattened approaches are less expensive than the fixed-point formulation using the same P by a factor of anywhere from 3 to 5, and much more accurate: the eigenvalues all agree perfectly with the fully converged solution. In all of the JFNK formulations Newton's method itself performs well, converging in a very small number of iterations. However, it is essential when using the JFNK approximation that the number of Krylov iterations be as small as possible and that the expense of evaluating the nonlinear function Γ not be too high. For the P and FP approaches the number of GMRES iterations is relatively low, but the expense of each Jacobian-vector multiply is quite large. For the flattened formulations a Jacobian-vector multiply is cheap (a single sweep) and the number of GMRES iterations is not excessive, though it could stand to be reduced by preconditioning, making these formulations a very attractive choice for use with Newton's method. In this case the FW and FR constraints produce very similar results, about 130 sweeps, while the N constraint results in around 180 sweeps. All of the JFNK methods compare favorably with the best fixed-point implementation, Trad-FP: the FW and FR constraints are over a factor of three less expensive, and the N constraint a factor of about 2.5. There is little variation among the different flattened formulations; i.e., updating the fission and/or scattering source during the calculation does not noticeably affect the calculation, though we will see shortly that this is problem specific.

While any of the JFNK approaches could be constructed around an existing solution technique, it is clear that the P and FP formulations offer no advantage when used as Newton methods, due to the expense associated with evaluating the nonlinear function. To construct any of the flattened formulations from an existing code, it is necessary that source iteration be used to solve the inner iterations, that the source iteration be initialized with the previous outer iterate, and that there be no upscattering iterations (upscattering must be commingled with fission). If this is the case, then simply setting the maximum number of inner iterations per group to 1 results in the FD approach. The loop over energy groups in the outer iteration would need to be modified if one wanted to not update the downscattering (F), or to update the fission source between groups (FDF).
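A schematic of the modified group loop makes the three flattened variants explicit. Everything here is a placeholder sketch of the logic just described, not the test code: sweep applies DL^{-1} once for a group, scatter_src builds the scattering source from a chosen flux iterate, and fission_src builds the fission distribution:

    def flattened_outer(phi_old, k, G, sweep, scatter_src, fission_src, chi, variant):
        """One flattened outer iteration; variant is 'F', 'FD', or 'FDF'."""
        phi_new = phi_old.copy()
        fission = fission_src(phi_old) / k          # built once from the old iterate
        for g in range(G):
            # F: sources from the old iterate; FD/FDF: use fluxes already updated.
            flux = phi_new if variant in ("FD", "FDF") else phi_old
            if variant == "FDF":
                fission = fission_src(flux) / k     # refresh fission group by group
            phi_new[g] = sweep(g, scatter_src(g, flux) + chi[g] * fission)
        return phi_new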

While updating the downscattering source incurs no additional computational cost compared to not updating it, using an updated fission source does require additional calculations. Generally the fission distribution is calculated once, prior to the outer iteration, and the fission source for a group is simply computed as the product of this distribution and the group's χ_g. If the fission source is updated, a new distribution must be calculated for each group, which increases the computational load without increasing the sweep count, since no sweeps are necessary to build the fission distribution. Thus, for the same number of sweeps the FD method will be less expensive than the FDF method, and it is easier to implement. Further tests will nonetheless be carried out to explore the performance of both formulations on more demanding problems.

Before moving on to the full benchmark suite, the issue of execution time is briefly considered. It has been posited that the sweep count is an adequate measure of the computational cost of a selected formulation; however, it has also just been noted that building a new fission source for each group is a type of expense not reflected in the sweep count. To further elucidate this question we consider the number of sweeps performed per second for each of the runs presented in Table 5.17, with the results given in Table 5.18.

Table 5.18: Sweeps per Second for Takeda-1 With Upscattering
[Columns: N, FW, and FR constraints. Rows: Trad-P, Trad-FP, Trad-F, Trad-FD, and Trad-FDF under "Traditional Power/Fixed-Point", and JFNK-P, JFNK-FP, JFNK-F, JFNK-FD, and JFNK-FDF under "Newton Methods".]

Before moving on to the full benchmark suite, the issue of execution time will briefly be considered. It has been posited previously that the sweep count is an adequate measure of the computational cost of a selected formulation; however, it has also just been mentioned that building a new fission source for each group is a type of expense that is not reflected in the sweep count. To further elucidate this question we consider the number of sweeps performed per second for each of the runs presented in Table 5.17, with the results given in Table 5.18. The timing data for the traditional methods is obtained by placing a timer outside of the loop over outer iterations, while the timing data for the Newton formulations is obtained by placing a timer around the loop over Newton iterations. In the Newton timing data the sweeps and execution time associated with the error determination are excluded, as previously mentioned. These results show very good agreement for all methods and all constraints, with almost all values falling in a narrow range. It is difficult to use these results to draw any conclusions about the possible overhead costs associated with each method, as many contradictory results can be seen. There does not appear to be a constraint relation that proves consistently faster than the others, and the costs of the various JFNK methods are all very similar, though it seems JFNK-FP is capable of performing more sweeps per second than any of the other methods tested, including the traditional implementations. However, since the variations in sweeps per second are so small, it is difficult to claim the presence of trends with certainty. Though the same machine was used to produce all of the results, no special care was taken to produce reliable timing data: multiple runs were performed simultaneously and background system processes may have occasionally interfered, resulting in distorted timing data. Still, even with this uncertainty, the numbers in Table 5.18 provide good support for the claim that sweep count is an acceptable measure of the total computational cost. This is desirable because it provides a way to compare the various formulations in a reproducible manner, since sweep counts are algorithmic and do not depend on any given machine or software. It also permits the conclusions of this study to be extended to more efficient implementations wherein the time to perform a mesh sweep is shorter than in the test code used in this work.

5.5 Comparing Traditional & Newton Methods

Comprehensive tests such as those described in the previous section were performed for the purpose of making a generalized comparison of the Newton formulations to the traditional fixed-point formulations for a variety of reactor models, in particular the 10 configurations in the transport benchmark problem suite detailed in Section 5.2. Based on the trends seen in previous numerical experiments, the P formulation was not considered, due to the fact that it is identical to the FP approach for the Takeda problems and invariably more expensive than the FP approach for the C5G7-MOX problems.

Also, the JFNK-FP approach was only considered for the FR constraint due to computational expense. However, SI and GMRES were both used, to again emphasize the cost of the inner iterations. For all of the problem formulations (both Newton and traditional) the inner iterations for each outer iteration were initialized using the previous outer iterate, or v_g in the case of a Jacobian-vector multiply. The reported results for each method, for a given problem, are the sweep count, the number of iterations, and the eigenvalue. For the traditional problem formulation the number of outer iterations is given (the number of inners being equal to the sweep count), while for the Newton approach both the number of Newton iterations and the total number of GMRES iterations are provided.

5.5.1 Takeda Results

Results for the Takeda-1 problem with the Case 1 configuration (rods out) are given in Table 5.19. In this case we see that the least expensive of the traditional approaches are the flattened iterations, though false convergence is apparent: the converged k differs by more than 40 pcm when compared to the solutions generated using the Trad-FP and Newton approaches. Looking at the difference in the number of outer iterations we see that the flattened approaches require more than an order of magnitude more outer iterations, though only one inner iteration per outer. This is consistent with expectations, since the approach effectively removes the inner-outer iteration structure and replaces it with a single level of iterations. The 185 outer iterations result in 370 total sweeps as a consequence of the Takeda-1 problem's 2-group energy structure. Again we see that the GMRES within-group formulation is much more efficient than standard source iteration: though the number of outer iterations is comparable, the GMRES formulation requires significantly fewer sweeps. Choosing which of these values to compare with the Newton results is a difficult proposition: the flattened approach, while cheapest, does not give a true estimate of the computational cost because of the issues with false convergence. At the same time, the convergence scheme used to converge the inners in the FP approach is likely not optimal, and in realistic implementations the inners are accelerated, which would require fewer sweeps. Still, the GMRES formulation, which has not yet gained widespread use, is a substantial improvement over standard SI, such that comparing against the Trad-FP(GMRES) sweep count is not a major stretch.
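The within-group comparison can be summarized with a small sketch: each application of the fixed-source operator costs exactly one sweep, and GMRES simply builds a Krylov subspace from those applications instead of iterating the source-iteration fixed point. The diagonal matrices below are synthetic stand-ins for the sweep (D L⁻¹) and within-group scattering operators; none of this is the dissertation's test code.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

rng = np.random.default_rng(1)
n = 200
sweep = np.diag(1.0 / (1.0 + rng.random(n)))  # stand-in for one mesh sweep, D L^{-1}
S_gg = np.diag(0.4 * rng.random(n))           # within-group scattering operator
q = rng.random(n)                             # fixed group source (fission + in-scatter)

sweep_count = [0]
def apply_A(x):
    # The within-group problem is (I - sweep @ S_gg) phi = sweep @ q;
    # each matrix-vector product consumes exactly one sweep.
    sweep_count[0] += 1
    return x - sweep @ (S_gg @ x)

A = LinearOperator((n, n), matvec=apply_A)
phi, info = gmres(A, sweep @ q, restart=25)
print(info, sweep_count[0])  # info = 0 on success; sweeps consumed by the solve
```

Unaccelerated source iteration applies the same operator once per iteration, so for scattering ratios near one it needs far more sweeps to reach the same residual, which is the pattern seen throughout the tables that follow.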

Table 5.19: Comparison of Formulations & Constraints for Takeda-1, Case 1 (sweeps, outer iterations, and k for Trad-FP(SI), Trad-FP(GMRES), Trad-F, Trad-FD, and Trad-FDF; sweeps, Newton iterations, GMRES iterations, and k for JFNK-FP(SI)-FR, JFNK-FP(GMRES)-FR, and the JFNK-F, JFNK-FD, and JFNK-FDF formulations under the N, FW, and FR constraints)

Examining the Newton methods, a number of expected trends are observed: the JFNK-FP-FR approach is much more expensive than any of the flattened approaches, regardless of which within-group problem formulation is used. Interestingly, in this instance we see that the JFNK and traditional formulations of the FP iteration have sweep counts which are quite comparable: the JFNK-FP(GMRES)-FR method requires 545 sweeps while the Trad-FP(GMRES) method requires 475 sweeps. There is also an interesting relationship between the number of outers required by the Trad-FP approaches and their JFNK equivalents. While all of the flattened methods require a total number of GMRES iterations in the neighborhood of 50 or 60, the number of GMRES iterations for the JFNK-FP approaches is around 10.

Table 5.20: Comparison of Formulations & Constraints for Takeda-1, Case 2 (same layout as Table 5.19)

Thus, just as in the traditional approach, in the JFNK approach there is some tradeoff between the total number of iterations and the amount of work done per iteration. The number of Newton iterations is also smaller for the JFNK-FP formulations when compared to the flattened formulations, likely due to the excellent convergence properties of the Krylov iterations at each Newton step. Among the flattened formulations there is virtually no difference in the sweep counts or iteration counts for a given constraint. However, the FR constraint is noticeably less expensive than either the FW or N constraints for this problem. In all of the Newton runs the eigenvalue converges to the reference value.

The results for the Takeda-1, Case 2 (rods in) configuration are almost identical to the Case 1 results, down to the sweep and iteration counts; these are provided in Table 5.20. For the traditional iterations the number of outers differs by 1 at most, while for the flattened methods the difference is less than 10 (so less than 20 sweeps). Again the traditional flattened iterations suffer from false convergence, with a slightly less than 40 pcm discrepancy between the converged eigenvalues and the fully-converged eigenvalue. Again the JFNK-FP results are comparable to the traditional FP results, and the JFNK-F* (JFNK-F/JFNK-FD/JFNK-FDF) methods are the fastest-converging of all methods tested. Again the FR constraint is the least expensive, though we see in this case that the FW constraint is clearly the worst of the three. This is easily explained by considering the number of Newton iterations for the FW constraint in Table 5.19 and Table 5.20: in Case 1 the FW constraint converges in 3 Newton iterations, while in Case 2 an extra Newton iteration is required, which results in a substantially larger number of GMRES iterations and ultimately around 50 more sweeps. Comparing Trad-FP(GMRES) to any of the JFNK-F*-FR methods, we see that the JFNK methods require about 75% fewer sweeps, though even comparing to the worst of the flattened Newton methods we still see an almost 65% reduction in the number of sweeps.

The results for Case 1 of the Takeda-2 problem (rods out), given in Table 5.21, exhibit trends similar to those noted above; however, the iteration counts and sweep counts are higher in this case. Though this problem has fewer total unknowns than the Takeda-1 problems, the number of sweeps done per outer iteration (or evaluation of Γ) is higher due to the increased number of energy groups. The same trends can be seen in the number of outer iterations and in the eigenvalues for the traditional iterative methods; the flattened methods are around 35 pcm away from the fully-converged eigenvalue. In this case the flattened iterations result in sweep counts that are very similar to the Trad-FP(GMRES) number of sweeps, though both of the Trad-FP approaches converge fully. Again the JFNK-FP methods are very expensive, more than their traditional counterparts. In the JFNK methods we see some minor differences between the F, FD, and FDF formulations of P. As in the Takeda-1 rods-in configuration, the performance of the constraint relations is well separated: the FR constraint proves itself superior, with the N constraint performing next best, while the FW constraint is again the worst of the bunch.

Table 5.21: Comparison of Formulations & Constraints for Takeda-2, Case 1 (same layout as Table 5.19)

For all of the JFNK-F* formulations the difference between the FW and FR constraints is around 200 sweeps, roughly a 50% increase over the number of sweeps required by the FR constraint. The best Newton approach is 65% less expensive than the Trad-FP(GMRES) approach, while the worst flattened Newton approach is slightly less than 50% cheaper.

Results for the rods-in configuration, Case 2, of the Takeda-2 problem are given in Table 5.22. There is a noticeable increase in the number of iterations and sweeps required in this configuration compared to the rods-out case, which is likely due to the presence of large thermal flux gradients induced by the strong absorption of the control rods. In this case the flattened formulations using traditional fixed-point iterations require a little more than half of the sweeps that Trad-FP(GMRES) does, though this is not a true representation of the cost of these methods, as false convergence again yields an approximately 40 pcm difference from the fully-converged eigenvalue.

Table 5.22: Comparison of Formulations & Constraints for Takeda-2, Case 2 (same layout as Table 5.19)

The JFNK equivalents of the Trad-FP approach are nearly twice as expensive in this case. Again, for the JFNK methods the constraint equation has a larger bearing on the sweep count than the specific variation of the flattened iteration being employed. As in previous cases the FR constraint is the most attractive in terms of sweep count, followed closely by the N constraint; the FW constraint is almost twice as expensive for this problem. Though it has occasionally been found to produce sweep counts comparable to those of the FR constraint, the FW constraint has exhibited undesirable behavior for the Takeda problems to this point. For this problem the JFNK-FD-FR formulation reduces the sweep count by more than 75% when compared to Trad-FP(GMRES), while the worst of the flattened Newton methods, JFNK-FD-FW, still reduces the sweep count by 55%.

Table 5.23: Comparison of Formulations & Constraints for Takeda-3, Case 1 (same layout as Table 5.19)

It is worth noting in this case that the Newton iteration count of the JFNK runs when using the FR constraint is 4, which is lower even than the Newton iteration counts of the JFNK-FP approach. This is a testament to the effectiveness of the particular combination of the flattened iterative formulation and the FR constraint. While it may be tied to the physical interpretation of the FR constraint, it is clear that for some reason this choice consistently produces good results. It is also the same, or close to the same, formula used to update the eigenvalue estimate in each outer iteration in traditional S_N transport codes.
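In the operator notation of this chapter, with F the fission production operator and ⟨·⟩ denoting a sum over all phase-space cells, that familiar update reads

\[
k^{(m+1)} \;=\; k^{(m)}\,
\frac{\left\langle F\,\phi^{(m+1)}\right\rangle}{\left\langle F\,\phi^{(m)}\right\rangle},
\]

and the FR constraint is simply the requirement that the residual of this relation vanish together with the flux equations.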

The Takeda-3 problem is the most demanding of the Takeda benchmark problems, containing the largest number of spatial cells. The results for Case 1 of Takeda-3, the rods-in case, are given in Table 5.23. The number of outer iterations required by the Trad-FP approaches is the largest of any problem considered so far and is again very similar for both the Trad-FP(SI) and Trad-FP(GMRES) approaches. As in previous cases, the Trad-FP(GMRES) sweep count is less than half that of the Trad-FP(SI) sweep count. The difference in eigenvalues between the flattened approach results and the fully-converged value is only around 20 pcm for this problem, while the sweep count for the flattened approaches is substantially less than that of the Trad-FP(GMRES) method. This indicates that the convergence strategy used for the inners in the Trad-FP(GMRES) approach is likely more expensive than it needs to be, though trying to truly optimize this type of nested iteration is likely a futile effort: the possibilities are endless. There are some more definitive trends for this problem regarding the various flattened JFNK formulations. For all constraints the FD and FDF formulations require slightly fewer sweeps than the F formulation; while the difference in sweep count is not remarkable, it is still behavior worth noting. Here we see that FR is again the most efficient of the constraints, though for this problem the FW constraint is quite comparable. In fact, all of the constraint relations behave quite similarly for this problem, though the N constraint requires a couple of extra Newton iterations. Comparing the best Newton method for this configuration, JFNK-FDF-FR, to the Trad-FP(GMRES) implementation, a sweep savings of almost 90% is realized, while using the worst of the flattened Newton methods, JFNK-F-N, still yields an 85% reduction in the number of sweeps; all very encouraging results.

The results for Takeda-3 Case 2, the rods-out configuration, are given in Table 5.24. The results for the traditional iterative techniques are very similar to those for Case 1: the flattened approaches are cheaper than Trad-FP(GMRES) but the eigenvalue is still 30 pcm from the fully-converged value. It should be noted that even if the flattened iterations were continued until the eigenvalue was equal to the fully-converged value, it is almost given that the fission source would still be far from convergence, as invariably the eigenvalue error falls below 10^-5 before the point-wise fission-source error falls below its own tolerance. Therefore, due to the occurrence of false convergence, there is no good way to estimate the true cost of these methods without comparing them to a known solution or using the system residual to determine convergence (which incurs additional costs).
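Written out, a representative pair of successive-iterate tests of the kind used here, with ψ = Fφ the fission source and ε_k and ε_ψ the respective tolerances, is

\[
\frac{\left|k^{(m+1)}-k^{(m)}\right|}{k^{(m+1)}} < \epsilon_k,
\qquad
\max_i \frac{\left|\psi_i^{(m+1)}-\psi_i^{(m)}\right|}{\psi_i^{(m+1)}} < \epsilon_\psi ,
\]

and false convergence is precisely the situation in which the first inequality is satisfied many iterations before the second, so a code that monitors only the eigenvalue stops too early.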

Table 5.24: Comparison of Formulations & Constraints for Takeda-3, Case 2 (same layout as Table 5.19)

The behavior of the JFNK results is similar to previous problems: we note again that the FW constraint usually offers the worst performance while the FR constraint results in the lowest sweep counts. Even though the N constraint requires an additional Newton iteration when compared to the FW constraint, the number of total GMRES iterations is higher for the FW constraint in 2 out of the 3 cases. This indicates that the system formed using the N constraint is better conditioned than that which results from using the FW constraint. Again it appears that, with the exception of the FW constraint, the FD and FDF formulations are more efficient than the F formulation, implying there is some benefit to using the most current values for the fission and/or downscattering sources. Yet again the JFNK-FDF-FR formulation produces the best results, requiring almost 90% fewer sweeps than the Trad-FP(GMRES) approach, though the worst of the flattened Newton methods, JFNK-FDF-FW, still results in an approximately 80% reduction in the number of sweeps necessary.

Table 5.25: Comparison of Formulations & Constraints for Takeda-3, Case 3 (same layout as Table 5.19)

Case 3 of the Takeda-3 model differs from Cases 1 and 2 in that the control rods and control rod positions are replaced by core material or blanket material as appropriate, resulting in a system with much less absorbing material. Results for Case 3 are given in Table 5.25 and are very similar to Cases 1 and 2. The eigenvalue calculated by the flattened iterations is still off by 30 pcm due to false convergence, while the sweep count for the flattened iterations is in the same range as previous cases, approximately 2000 sweeps. The Trad-FP(GMRES) method converges in fewer than half of the 8965 sweeps required when the inner iterations are solved using source iteration.

The iteration and sweep counts for the JFNK methods are nearly identical to those from Case 2; however, for JFNK-FDF it can be seen that one fewer Newton iteration is required by the FW constraint in Case 3, making it less expensive. The behavior of the FW constraint for the Takeda-3 problems shows what can happen when a method is almost converged but another Newton iteration is required: a non-negligible increase in the cost of the solution is incurred. Revising the stopping criteria to recognize this type of situation would be beneficial, in that additional Newton steps could be prevented if a solution is close enough to convergence. Unlike in any of the other configurations of this problem, the best Newton formulation is JFNK-F-FR while the worst flattened approach is JFNK-FD-FW; they offer sweep savings of 87% and 84%, respectively, compared to the traditional methods.

5.5.2 C5G7-MOX Results

The C5G7-MOX problems, which comprise the remainder of the benchmark suite, are distinct from the original Takeda problems in that upscattering is present and the spatial grid is much finer. The number of unknowns in the C5G7-MOX problems is almost 20 times larger than in the Takeda-3 problem. This is certainly the most realistic of the benchmark models considered and, due to its size, should provide a meaningful and definitive comparison of the Newton approach to traditional fixed-point iteration methods. The results for the Unrodded C5G7-MOX problem are given in Table 5.26. The first observation is that all of the sweep counts in the C5G7-MOX problem are higher than in the Takeda problems, due to the 7-group energy structure. In general the numbers of Newton and GMRES iterations are similar, though the number of outer iterations required by the traditional approaches is larger than in previous cases. The discrepancy between the Trad-FP(SI) and Trad-FP(GMRES) approaches is much larger for this problem, with the source iteration formulation of the problem requiring almost 4 times as many sweeps as the GMRES formulation: 12,466 as opposed to 3,511. The flattened iterations in this case require about as many sweeps as the Trad-FP(GMRES) approach and, surprisingly, the eigenvalue is only 8 pcm different from the fully-converged value.

Table 5.26: Comparison of Formulations & Constraints for C5G7-MOX, Unrodded (same layout as Table 5.19)

Were false convergence not an issue, this would be an excellent example of the tradeoff between the total number of outer iterations and the number of inner iterations per outer iteration. In the traditional formulation of the flattened iterations there is a significant difference between the F and FD/FDF approaches: almost 600 sweeps and 6 pcm of difference. This is the first problem to display such a strong dependence on the formulation of the flattened iteration. The JFNK methods for this problem also reveal some surprises compared to the Takeda problems. The first is that for the Unrodded C5G7-MOX problem the JFNK-FP(GMRES) and JFNK-FP(SI) methods are substantially more efficient than their fixed-point equivalents: JFNK-FP(GMRES)-FR requires 2,681 sweeps while Trad-FP(GMRES) requires 3,511 sweeps, and for the source iteration formulation the counts are 8,985 and 12,466 sweeps, respectively.

This in itself provides compelling evidence regarding the potential efficiency of Newton's methods. However, the sweep counts associated with the flattened formulations are considerably smaller. The FR constraint is by far the best of the constraints for this problem, while the N and FW constraints battle for a distant second place. There is also a sizeable difference between the F and FD/FDF formulations for the C5G7-MOX problem. The N constraint requires 1,254 sweeps for the F formulation but only 827 for the FD and FDF approaches. Likewise, the FW constraint converges in 918 sweeps for the F formulation but 890 for the FD/FDF formulations. Similarly, the FR constraint drops from 771 sweeps for the F iteration to 673/680 sweeps for the FD and FDF iterative formulations, respectively. This is fortunate, since the FD formulation is likely the simplest of the flattened formulations to adapt from an existing computer code. For the Unrodded configuration the best Newton method is JFNK-FD-FR, which converges in 80% fewer sweeps than Trad-FP(GMRES), while the least effective of the flattened Newton approaches, JFNK-FD-N, converges in approximately 65% fewer sweeps.

Most of the same discussion accompanying the Unrodded results can also be applied to the Rodded A results, found in Table 5.27. Again the Trad-FP formulations are bested by their JFNK counterparts, and again the Trad-FD and Trad-FDF approaches result in sweep counts very similar to the Trad-FP(GMRES) method, with the eigenvalue 9 pcm off of the fully-converged solution due to false convergence. For the JFNK results a clear dependence on both the formulation of P and on the constraint can be seen. While there is no difference between the JFNK-FD and JFNK-FDF results, there is a large difference between these formulations and the JFNK-F formulation: almost 250 sweeps, regardless of constraint. The FR constraint itself results in at least 100 fewer sweeps than the FW and N constraints, regardless of the flattened formulation. The best of the Newton approaches is JFNK-FD-FR or JFNK-FDF-FR, whose results are identical. These formulations require 701 sweeps compared to the 3,677 necessary in the Trad-FP(GMRES) formulation, a savings of a little over 80%. The JFNK-F-N and JFNK-F-FW methods are the most expensive of the flattened Newton formulations, with the JFNK-F-FW formulation requiring 1,100 sweeps, though this still results in a sweep savings of 70% compared to the traditional problem formulation.

Table 5.27: Comparison of Formulations & Constraints for C5G7-MOX, Rodded A (same layout as Table 5.19)

The final benchmark problem considered was the Rodded B configuration of the C5G7-MOX problem. The control rods are most deeply inserted in this configuration, resulting in large flux gradients due to the strong thermal absorption in the rods. The results of the comprehensive algorithm test for the Rodded B configuration are given in Table 5.28. The expense of the Trad-FP(SI) formulation is again more than three times the expense of the same problem solved using GMRES for the inner iterations (12,335 compared to 3,825 sweeps). The JFNK equivalent formulations of these problems are again less expensive, however, meaning that for the C5G7-MOX benchmark problem a savings in the total sweep count can be realized without any tweaking at all of an existing fixed-point iteration. More importantly, as in all of the benchmark problems, even greater reductions in computational cost can be seen when the JFNK approach is used in conjunction with the flattened fixed-point formulation and its variants.

Table 5.28: Comparison of Formulations & Constraints for C5G7-MOX, Rodded B (same layout as Table 5.19)

Yet again the FR constraint proves to be the best choice, significantly outperforming the N and FW constraints. As in the Rodded A case, the F formulation pales in comparison to the FD and FDF formulations, requiring additional sweeps. In this problem the FD and FDF formulations are identical for the FR constraint but not for the FW and N constraints, though the overall sweep counts are similar. The best Newton formulation is again JFNK-FD-FR / JFNK-FDF-FR, while JFNK-F-FW distinguishes itself as the costliest of the flattened approaches. The savings attained via JFNK-FD-FR (and JFNK-FDF-FR) is an 80% reduction in sweep count, as in previous configurations, while the JFNK-F-FW formulation still sees a 70% reduction in the number of sweeps.

Table 5.29: Accuracy of Benchmark Solutions (k [a] compared against k_ref, with the magnitude of Δk [b], for the seven Takeda cases and the Unrodded, Rodded A, and Rodded B C5G7-MOX configurations)
[a] JFNK-FD-FR value used. [b] Δk = k − k_ref, measured in pcm (1 pcm = 10^-5).

This concludes the presentation of the comprehensive set of transport results. Before drawing any final conclusions we briefly examine the accuracy of the benchmark solutions. As already stated multiple times, the agreement with the corresponding benchmark reference solutions was not an important part of these numerical experiments; however, it is reassuring to know that when solving the benchmark suite we are obtaining reasonable solutions. To that end, the eigenvalues from the previous ten tables for the JFNK-FD-FR method were compared to the reference eigenvalues from Table 5.1, and the magnitude of Δk in pcm is listed in Table 5.29. The Takeda reference eigenvalues are only reported to 4 digits and the last digit is not given with full certainty. Without considering this uncertainty, the JFNK-FD-FR computed eigenvalues for Takeda-1.1 and Takeda-2.2 agree perfectly with the reference values, while if the uncertainty is included all of the eigenvalues fall within the ranges of the given reference values with the exception of the Takeda-3 rods-in configuration. The Δk for the Unrodded, Rodded A, and Rodded B C5G7-MOX problems is 21, 57, and 369 pcm, respectively. The Unrodded and Rodded A cases are quite close to the reference solutions, closer than one would expect for such a coarse mesh and with the square approximation of the fuel pins. However, for the Rodded B case, in which the rods are inserted the farthest into the core, we see a larger discrepancy between the calculated and reference solutions.

This is likely due to the inability of the spatial grid to resolve flux gradients in strongly absorbing regions. Still, Table 5.29 shows that, judging by the proximity of the eigenvalue to its reference value, all of the benchmark solutions are more than adequate for the purposes of this work. Clearly this conclusion ignores the accuracy of the fundamental mode component of the solution. The reasoning here is that the new iterative schemes developed and demonstrated in this work perform well in configurations that are practically viable if not necessarily accurate. In other words, the good accuracy of the eigenvalue is evidence that the converged eigenmode is within the ballpark, and the agreement of all solutions with the eigenpair converged using standard power iteration is evidence that the limit of the new schemes is unchanged.

5.5.3 Summary of Comparisons to Traditional Schemes

To succinctly summarize the performance of the JFNK methods relative to the traditional fixed-point iterative scheme used to solve the k-eigenvalue problem, Figures 5.7 and 5.8 are provided. Figure 5.7 shows results for JFNK-F-FR, JFNK-FD-FR, and JFNK-FDF-FR for each benchmark problem, with the FR constraint being used because it was found to result in good performance and because it is the traditionally used update formula. The value reported is the ratio of the number of sweeps required by Trad-FP(GMRES) to that required by the JFNK approaches, such that a ratio of 10 implies the JFNK approach requires only a tenth of the number of sweeps that the traditional method does. Given the observed proportionality of the execution time to the number of sweeps (see Table 5.18), this ratio amounts to a speedup factor due to replacing PI with the corresponding Newton method in the same transport code. In Figure 5.7 the values of this ratio can be seen to fall in the range of 3 to 8 across all test problems in the benchmark suite. The Takeda-2 problem shows the smallest speedups while the Takeda-3 problem sees the largest benefit from the JFNK approach. For the C5G7-MOX problems the factor is generally in the neighborhood of 5, a notable improvement on existing techniques.
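Written out, the quantity plotted in these figures is

\[
R \;=\; \frac{N_{\text{sweeps}}\!\left[\text{Trad-FP(GMRES)}\right]}{N_{\text{sweeps}}\!\left[\text{JFNK}\right]},
\]

so that, given the proportionality of execution time to sweep count established in Table 5.18, R can be read directly as a speedup factor.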

This figure also shows that for the Takeda problems there is very little difference between the F, FD, and FDF formulations, while this is not the case for the C5G7-MOX problems. The F formulation of P is clearly a less than ideal choice for this benchmark problem, while the FD and FDF formulations both perform well and are very similar. Figure 5.8 uses the same speedup factor to compare the Newton methods to the traditional methods, but now the JFNK-FD formulation is presented for the N, FW, and FR constraints. The JFNK-FD approach is used as the basis because it was found to be the most effective of the Newton formulations developed, with the added benefit that it is possible to implement by making very minor changes to traditional fixed-point iteration. It is clear from this figure that the FR constraint is uniformly the best choice for the FD formulation, providing the largest decrease in the sweep count in each of the benchmark problems. Generally the N constraint is the next best choice, though at times it is bested by the FW constraint. While the FR constraint is obviously the most effective, it is clear that the N and FW constraints are not bad choices, always converging to the fundamental mode and, in the worst case (Takeda-2.1), still cutting the number of sweeps in half compared to traditional techniques.

Figure 5.7: Performance of JFNK Formulations Relative to Traditional Methods (ratio of the Trad-FP(GMRES) sweep count to the Newton sweep count for JFNK-F-FR, JFNK-FD-FR, and JFNK-FDF-FR across the ten benchmark configurations)

Figure 5.8: Performance of JFNK Constraints Relative to Traditional Methods (the same sweep-count ratio for the JFNK-FD formulation under the N, FW, and FR constraints)

It is also worth noting that, just as in diffusion theory, the benefit derived from Newton's method increases as the convergence tolerance on the eigenpair is tightened. To give an example, we consider the Takeda-1 problem with the DD and AHOT-N1 discretizations and the S_4 and S_8 quadrature sets. Solving each Takeda-1 configuration with these options but converging the eigenvalue and pointwise fission-source error to a much tighter tolerance, we obtain the speedup factors depicted in Figure 5.9. This shows little dependence of the performance of the JFNK-FDF-FR method on the spatial and angular discretization. In fact, the JFNK results are nearly identical, and the variations in the ratio of traditional to Newton sweeps are caused by variations in the number of sweeps required by the traditional methods. We can see that almost all the values here are larger than 8 and many are quite close to 10. Looking back at Figures 5.7 and 5.8, we see that the ratio for the Takeda-1 problem was near 4, meaning that for the same problem and numerical methods the Newton approach requires only an eighth rather than a quarter of the number of sweeps.

Figure 5.9: Performance of JFNK-FDF-FR Relative to Traditional Methods, Tightly Converged (sweep-count ratios for the DD-S_4, AHOTN1-S_4, DD-S_8, and AHOTN1-S_8 combinations applied to Takeda-1.1 and Takeda-1.2)

Though it is unlikely that a solution converged so precisely would ever be sought in practice, this behavior is a definite advantage of the superlinear convergence rate of the Jacobian-Free Newton-Krylov methods employed. The main conclusion which can be drawn from all these studies is that the Newton approach can solve the k-eigenvalue problem with a substantially smaller computational cost than traditional methods. The caveat to this statement is that the traditional methods implemented in this work do not contain DSA acceleration of the inner iterations or any methods intended to accelerate the convergence of the outer iterations. Still, the Newton approaches themselves are not preconditioned in any manner, and GMRES is used for the inner iterations in most circumstances, meaning the slow convergence of traditional source iteration in scattering-dominated regimes is not a major issue. It is clear that the F, FD, and FDF formulations of P, while inadequate for use as standard fixed-point iterations, result in robust and inexpensive Newton methods. Of the constraints developed and tested, the fission-rate constraint (FR) is an obvious best choice due to its superior performance.

It is also extremely convenient that, of the Newton methods tested, the JFNK-FD-FR formulation is among the best, as it is the simplest to derive from an existing implementation of the Trad-FP(SI) formulation used in most codes today. Additional research is necessary, but it seems plausible that an optimized and accelerated (via preconditioning) JFNK approach to the k-eigenvalue problem could result in a solution method that is competitive with the best techniques in use today, while allowing for simplified implementation in existing software, as the method can be constructed on the framework of the traditional outer-inner iterative scheme.

CHAPTER 6

Conclusions

6.1 Summary

In this work the classical Newton's method has been applied to the k-eigenvalue problem in neutron transport theory and the diffusion approximation to neutron transport theory. In this application the eigenvalue problem is cast into a nonlinear formulation whose solution comprises the eigenvalue-eigenvector pairs. To solve the resulting nonlinear form, a specific class of Newton's methods which allows for a computationally efficient solution was used; specifically, the class of methods known as Inexact Newton methods is employed. In an Inexact Newton method the linear system at each Newton iteration is solved approximately, usually via an iterative approach. The choice of a Krylov subspace method as the iterative technique for that stage is particularly advantageous, and these are referred to as Newton-Krylov methods. Utilizing the properties of Krylov methods, it is not necessary to know the Jacobian matrix (necessary for the corresponding exact Newton method) explicitly; only the product of the Jacobian and a vector is needed, and considerable computational and storage costs can be avoided if the Jacobian is neither formed nor stored. The Jacobian-vector product can be computed in one of two ways: knowing the form of the Jacobian and deriving the form of the product directly, or using the Jacobian-Free Newton-Krylov approximation.
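The JFNK approximation referred to throughout is the standard first-order finite difference: for a residual Γ, current iterate u, Krylov vector v, and small perturbation parameter ε,

\[
J(u)\,v \;\approx\; \frac{\Gamma(u+\epsilon v)-\Gamma(u)}{\epsilon}.
\]

Since Γ(u) is already available from the Newton residual, each Jacobian-vector product costs one additional evaluation of Γ.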

This approximation requires only the ability to evaluate the nonlinear function, which is especially useful because the form of the Jacobian is often difficult to determine explicitly. In this work the GMRES iterative method is employed as the Krylov solver, resulting in a Newton-GMRES method. The nonlinear functions in both the diffusion and transport equations are constructed such that the roots of the function are the eigenpairs of the k-eigenvalue problem. Thus, using the Newton-GMRES method (with or without the JFNK approximation) to find a root of the nonlinear function yields an eigenvalue and eigenvector pair that satisfies the standard k-eigenvalue formulation. The fundamental mode is the desired solution and, while the problem formulation does not guarantee convergence to this specific mode, it was shown that steps can be taken to make this the most likely outcome. The performance of Newton's method for the k-eigenvalue problem is intimately tied to the specific form chosen for the nonlinear function, as it is possible to write many nonlinear functions whose roots are the eigenpairs of the k-eigenvalue problem. For the diffusion approximation the nonlinear function was written as the residual of the generalized eigenvalue problem, with a normalization condition on the eigenvector (scalar flux) used to pose a constraint on the solution of the nonlinear system of equations.
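With M the multigroup diffusion (loss) operator, F the fission production operator, and u = (φ, k), one representative form of this residual, consistent with the description above though not necessarily matching Chapter 2 symbol for symbol, is

\[
\Gamma(u) \;=\;
\begin{pmatrix}
M\phi - \dfrac{1}{k}\,F\phi \\[6pt]
\tfrac{1}{2}\bigl(1 - \phi^{T}\phi\bigr)
\end{pmatrix},
\]

whose roots are exactly the normalized eigenpairs of the generalized eigenvalue problem Mφ = (1/k)Fφ.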

Due to poor conditioning of the linearized Newton problem, poor convergence was observed, which led to the implementation of a number of preconditioners for the Krylov solver. These preconditioners were applied both to the Newton-Krylov (NK) formulation of the problem, where the form of the Jacobian was manipulated to compute the exact Jacobian-vector product, and to the JFNK formulation, where the action of the Jacobian was probed using the finite-difference JFNK approximation. This resulted in a large class of Newton methods: JFNK(GEP), JFNK(M), JFNK(M-F), JFNK(IC), JFNK(PI), JFNK(rPI), NK(GEP), NK(M), NK(M-F), NK(IC), and NK(rPI), detailed in Chapter 2. The GEP formulations do not precondition the GMRES iterations in any way, while the M, M-F, and IC preconditioners rely on knowledge of the matrices associated with the generalized eigenvalue problem. The rPI preconditioner refers to using standard power iteration as a preconditioner for the GMRES iteration; this proves to be a very effective preconditioner, and adapting an existing code to incorporate rPI provides the added benefit that the preconditioner would already be constructed by the target code. The rPI designation refers to right preconditioning only. If the problem is left preconditioned with a standard power iteration, then the nonlinear function can be posed in a delta-form such that evaluating the nonlinear function is equivalent to taking the difference between two successive power iterates. Using this formulation along with the JFNK approximation, the Newton-GMRES iterations can be wrapped around an existing implementation, yielding a savings in computational cost over standard power iteration and Chebyshev accelerated power iteration.
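The delta-form is easy to demonstrate end to end. The sketch below wraps one power iteration of a small synthetic generalized eigenvalue problem Mφ = (1/k)Fφ in SciPy's Newton-Krylov driver; the operators, the problem size, and the helper names are all illustrative assumptions, and a real code would substitute its own outer-iteration routine for power_iteration_step.

```python
import numpy as np
from scipy.optimize import newton_krylov

# Synthetic stand-ins for the diffusion (loss) operator M and fission
# production operator F; the real objects would come from an existing solver.
rng = np.random.default_rng(0)
n = 50
M = (np.diag(2.0 + rng.random(n))
     - 0.5 * np.diag(rng.random(n - 1), 1)
     - 0.5 * np.diag(rng.random(n - 1), -1))
F = np.eye(n) + 0.1 * rng.random((n, n))

def power_iteration_step(phi, k):
    """One traditional outer iteration: invert M against the scaled fission
    source, update k with the fission-rate formula, then renormalize.
    Assumes positive iterates, as a fundamental-mode search provides."""
    phi_new = np.linalg.solve(M, F @ phi) / k
    k_new = k * np.sum(F @ phi_new) / np.sum(F @ phi)
    return phi_new / np.linalg.norm(phi_new), k_new

def gamma(u):
    """Delta-form residual: the change produced by a single power iteration."""
    phi, k = u[:-1], u[-1]
    phi_new, k_new = power_iteration_step(phi, k)
    return np.concatenate([phi - phi_new, [k - k_new]])

u0 = np.concatenate([np.ones(n) / np.sqrt(n), [1.0]])  # flat flux, k = 1
u = newton_krylov(gamma, u0, method="gmres", f_tol=1e-10)
print("converged k:", u[-1])
```

Because newton_krylov probes Γ with finite differences, this is exactly the JFNK(PI) pattern: the only problem-specific ingredient is the ability to take one power iteration.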

The formulation of the nonlinear function in transport theory utilized the various fixed-point iterations which can be used to solve the k-eigenvalue problem in transport theory. The varying treatments of upscattering and inner iterations in traditional techniques result in a diverse set of fixed-point iterations which can be used as the foundations of Newton approaches; thus, in the transport problem all of the nonlinear functions are written in a delta-form. Five different fixed-point iterations are used to create five different nonlinear formulations. The first is classic power iteration, in which the upscattering and inner iterations are fully converged at each outer iteration. The second is termed traditional fixed-point iteration, where the inner iterations are converged but the upscattering source is constructed using the flux values from the previous outer iterate; this form of the problem is what is most often employed in existing transport codes. The third formulation removes the iterations on upscattering and self-scattering such that there is only a single level of iteration in the solution algorithm; this form of the algorithm is rarely utilized because it exhibits poor convergence properties. Two variations of this method were also formed, in which the downscattering and fission sources are constructed using the most recently available information. Rather than using only a normalization condition, the final equation, referred to as the constraint equation, was considered in more detail for transport theory. Three formulations were developed, falling into two categories: a normalization constraint, which imposes some normalization on the eigenvector, and two eigenvalue-update constraints, which use a flux-weighted and a fission-weighted formula, respectively, to accelerate the eigenvalue convergence. The equations for both NK and JFNK versions of all the formulations were developed, though the NK approach was generally eschewed numerically due to the difficulties associated with its implementation. The simplest of the NK formulations (NK-F-N) was included and shown to behave comparably, numerically, to its JFNK counterpart.

All of the Newton formulations developed for both transport theory and the diffusion approximation were implemented in Fortran 90/95 codes, in two-dimensional Cartesian geometry for the diffusion implementation and three-dimensional geometry for transport. Though the transport equations were developed using anisotropic scattering, only isotropic scattering was considered numerically. For both transport and diffusion theory the finite-difference perturbation parameter, ε, in the JFNK approximation was studied, along with the Newton forcing factor, η, and the initial guess for Newton's method. The behavior of the diffusion calculation using Newton's method was thoroughly examined using a suite of four benchmark problems: the well-known IAEA problem, the Biblis benchmark, a CANDU model, and a BWR model. Using these models the previously mentioned parameters were studied, along with the convergence of the inner iterations, which in diffusion theory refer to the iterations necessary to invert the within-group diffusion matrix. The convergence rate of Newton's method was observed for all of the preconditioners developed and the impact of the GMRES implementation was studied. Ultimately, the best of the Newton approaches were compared to standard power iteration and Chebyshev accelerated power iteration. The benchmark set used to test the transport problems consisted of four models with 10 total control rod configurations: the Takeda-1, Takeda-2, Takeda-3, and C5G7-MOX benchmarks. The C5G7-MOX benchmark is the most demanding of the problems and is unique among all of the benchmarks considered due to the presence of non-zero upscattering cross sections. The effect of the initial guess on Newton convergence was seen to be very important to the transport calculation, and a large number of numerical experiments were run to determine an acceptable choice of initial guess. The effects of the GMRES subspace size, the angular/spatial discretization, and the formulation of the within-group problem were also considered. The transport results for all Newton formulations and constraint relations were then compared to results generated using traditional iterative methods via the total sweep count, which was shown to be a good measure of the total computational cost. The suites of benchmark problems provided realistic models that could be used as a basis for the assessment of the algorithms' performance for both transport and diffusion theories.

For the diffusion problem formulations, it was seen that the JFNK perturbation parameter, ε, has very little impact on the rate of convergence, while the choice of forcing factor has a very strong influence on the cost of a given implementation. Of the forcing factors tested, the Eisenstat-A and Eisenstat-B algorithms worked particularly well, as did simply using a constant value. However, there was no single choice of forcing factor that offered the best performance for each problem, as the diffusion calculation was extremely sensitive to a large number of parameters. It was seen that by performing a small number, 5 to 15, of power iterations using a flat-flux initial guess and using the resulting eigenpair estimate to initialize Newton's method, the overall cost could be reduced compared to starting Newton's method directly with a flat-flux initial guess. Overall, none of the initial guesses tested caused Newton's method to diverge completely or converge to an eigenvalue other than the fundamental mode, which demonstrates the robustness of Newton's method in conjunction with the diffusion approximation. As in any nested iteration scheme, there was the question of how accurately to invert the within-group diffusion operator in each outer iteration. Numerical tests showed that generally a few (fewer than 10) iterations were preferable, both for traditional power and Chebyshev accelerated power iteration and for the JFNK(PI) approach; in all cases the within-group problem was solved using the conjugate gradient method preconditioned with IC(0), the Incomplete Cholesky factorization with no fill-in. The size of the GMRES subspace was seen to have an important impact on the performance of the Newton calculation. While the best convergence was seen for an unlimited subspace size, this is impractical from a memory standpoint; ultimately a subspace size of 25 to 30, which means GMRES will restart every 25 to 30 iterations, appeared to be the best compromise, and if the subspace size is too small major convergence issues arise. The Newton approach was seen to converge nearly quadratically when η is sufficiently small, as predicted by analysis. It was also seen that not all GMRES implementations behave equally, based on behavior differences observed between the DLAP and SPARSKIT implementations of GMRES. Of the preconditioners tested, the best choices for a JFNK calculation appear to be left and right preconditioning using power iteration and using the Incomplete Cholesky factorization as a preconditioner.
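The preconditioned within-group solve just described follows a standard pattern, sketched below for a synthetic symmetric positive definite matrix. SciPy exposes an incomplete LU rather than an incomplete Cholesky factorization, so a zero-fill ILU is used here as a stand-in for IC(0); the matrix itself is illustrative.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg, spilu, LinearOperator

# Synthetic SPD stand-in for a within-group diffusion matrix:
# a 1-D diffusion stencil plus a removal term on the diagonal.
n = 1000
A = sp.diags([-1.0, 2.5, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Zero-fill incomplete factorization used as the preconditioner
# (ILU here in place of the IC(0) used in the dissertation's solver).
ilu = spilu(A, drop_tol=0.0, fill_factor=1.0)
prec = LinearOperator((n, n), matvec=ilu.solve)

phi, info = cg(A, b, M=prec)
print(info)  # 0 indicates convergence
```

The design point is the same one the summary makes: the preconditioner is built from operators an existing diffusion code already assembles, so no new machinery is needed to accelerate the Krylov iterations.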

For the NK method, right preconditioning using power iteration and using the Incomplete Cholesky factorization are both excellent choices. A backtracking globalization strategy was implemented but its effectiveness was marginal: while it can correct some convergence issues, it is responsible for introducing others. Comparison of execution times between the Newton approaches and Chebyshev accelerated power iteration shows that some Newton formulations are capable of solving the problem more efficiently, particularly the JFNK(PI) approach, which can be wrapped around existing solution procedures. The benefits of the Newton approach were most fully realized on spatial grids with a large number of cells and when the eigenpair was converged to a very tight tolerance.

The numerical results which accompany the transport formulations display much the same behavior. The perturbation parameter has no impact at all in the transport calculations, while a fixed forcing factor η equal to 10^-2 is seen to be an effective and simple choice. In general, the performance of the Newton-based iterations for the transport calculations appeared to be much less sensitive to most input parameters than the diffusion calculations, with the exception of the initial guess. While the rate of convergence for the Takeda benchmarks showed little sensitivity to the initial flux and k guess used to initialize Newton's method, the C5G7-MOX problems displayed very undesirable behavior for a number of starting fluxes and eigenvalues: often the Newton iteration would converge to a non-fundamental mode or would completely diverge. To overcome this failure it was found that performing a single fixed-point iteration, with converged inner iterations, at the outset of Newton's method was sufficient to cause the Newton iterations to converge to the fundamental mode. It was also found that the computational expense was generally lower for an initial eigenvalue guess of 2.0, though this finding is likely problem dependent. As in the diffusion results, the total cost of the calculation was seen to depend heavily on the Krylov subspace size and the maximum number of iterations allowed; it was generally seen that a subspace size of 25 with no restarts was inexpensive and capable of producing good performance. The computational cost, measured in sweeps, was shown to be independent of the angular quadrature or spatial discretization being used, though the number of energy groups has a direct impact on the sweep count. Though not directly related to this work, the formulation of the within-group problem was examined in a number of situations.

The results show conclusively that the GMRES formulation of the within-group problem is much more efficient than unaccelerated source iteration; thus the GMRES approach was ultimately used to compare traditional techniques to the Newton formulations. Though considered equally with the JFNK formulations in diffusion theory, the NK approach was only briefly considered in transport theory, with numerical results confirming no difference between NK results and JFNK results produced using an equivalent formulation. Since the JFNK approach is more practical from an implementation standpoint and the derivation of the NK equations is rather complicated, we recommend only the use of JFNK methods. The best set of JFNK methods was based on the flattened formulations of the k-eigenvalue problem: JFNK-F, JFNK-FD, and JFNK-FDF. Of the constraints tested, it is clear that the FR approach is the most robust and offers the best performance. The best of the Newton formulations was capable of reducing the number of sweeps by a factor of anywhere between 3 and 8 compared to traditional techniques.

6.2 Conclusions

In general it was seen that Newton-Krylov based formulations of the k-eigenvalue problem in both diffusion and transport theory are viable alternatives to existing solution methods. For the diffusion approximation, a number of preconditioners were used, resulting in overall computational costs lower than standard power iteration and Chebyshev accelerated power iteration. The two best preconditioners were one based on traditional power iteration and another based on the Incomplete Cholesky factorization, meaning that the Newton-Krylov framework could be implemented in an existing diffusion code and the existing machinery (power iteration or preconditioners of the within-group diffusion matrix) could be used to precondition the Newton problem. Based on the results reported in Chapter 3, the Newton-Krylov method shows promise and its use should be pursued in production diffusion calculations. The behavior of the Newton-based methods developed showed extreme sensitivity to many problem parameters and to the problem itself. The only way to truly gauge the effectiveness of the new approach is to implement one of the Newton formulations developed in this work in a production-level code and examine the effect on the total computational cost.

This type of implementation would also be able to demonstrate whether the sensitivity issues witnessed in this work are due to the problem sizes and the diffusion implementation used here, or whether they are inherent to Newton methods when used with diffusion theory. Many of the techniques employed by diffusion codes today, e.g. multigrid schemes and the Wielandt shift, were not considered in the implementation of the diffusion solution; ultimately, to be useful, the Newton approaches must be shown to be compatible with these techniques as well, and not just capable of performing better than Chebyshev acceleration. The most certain way to achieve this is to begin with mature diffusion software and add the Newton approaches as a wrapper around the existing k-eigenvalue solution scheme.

The transport theory formulations, whose performance is discussed in previous chapters, are even more promising. The transport sweep count provides an objective way to compare computational costs, and the JFNK-FD-FR/JFNK-FDF-FR approaches were clearly significantly faster-converging than the existing fixed-point iteration scheme (the traditional outer-inner structure). Just as in the diffusion case, these methods can be implemented in a manner such that existing code is used practically intact. In their present form, the Newton formulations for both transport and diffusion theory can be used to accelerate existing iterations relatively easily. To be truly competitive with other methods, however, the transport formulations must include some type of preconditioning: if a preconditioner were used to reduce the number of GMRES iterations per Newton step, the total sweep count could be significantly lowered. Still, the results in this work show that the formulations developed are capable of reducing the number of sweeps necessary by almost an order of magnitude in some cases, and this is without any preconditioning. Ultimately, this work has shown that the classical Newton method, utilizing Krylov subspace methods and the JFNK approximation, can be used to solve the k-eigenvalue problems which arise in both neutron transport theory and the diffusion approximation in a more efficient manner than existing methods allow. It is very possible that, when optimized to the level that traditional techniques are optimized today, the Newton approach could significantly reduce the cost of performing neutronics criticality calculations.

6.3 Future Work

There are many avenues for expanding this work; in fact, there are too many to list completely. However, a small sample of those which stand to provide the most benefit are briefly discussed:

Transport Preconditioning: Mentioned in many places, the transport formulation of the problem would likely benefit tremendously from an effective preconditioner for the GMRES iterations. There is a directly proportional relationship between the total number of GMRES iterations and the sweep count, so reducing the number of GMRES iterations has an immediate impact on the total number of sweeps. Since the operators are not explicitly available, many commonly used preconditioning techniques for Krylov iterations cannot be used. However, using the low-order diffusion approximation as a preconditioner may prove to be very effective, as there is a long history of using diffusion operators to accelerate transport calculations. It is also likely that the diffusion approximation can be used to generate a good initial guess more cheaply than can be done using a fixed-point transport iteration.

Extend to Other Criticality-Search Problems: Due to the flexibility allowed in the creation of the nonlinear function, there is no reason to limit application of the new approach to solving only the k-eigenvalue problem. It has been shown how the α-eigenvalue problem can be just as easily solved. However, it may also be possible to use the Newton approach to solve a variety of reactor physics problems. For instance, if the critical boron concentration is sought, it may be possible to set k to unity in the nonlinear function and replace k in u with the unknown boron concentration. Replacing the constraint equation with some relationship the concentration must satisfy could result in a Newton method that solves for the flux and the corresponding critical boron concentration. It may also be possible to extend this calculation to all types of criticality searches.

Production-Level Diffusion Implementation: Due to the sensitivity of the diffusion calculation to so many settings and parameters, and to the

261 247 wide variety of solution techniques in use the best way to determine whether the Newton formulations offer an advantage is to take a production-level or mature academic diffusion code and implement the Newton methods on top of the existing methods. In this way a comparison of the execution times would be a fairly objective measure of the computational costs of the various solution schemes. Transport Tweaks There are a number of places in the implementation of the Newton transport formulation where it may be possible to considerably lower the total cost of the calculation. Specifically the forcing factor η could be optimized for the transport problem so that the first Newton step is not excessively expensive. The convergence criteria could also be chosen more wisely so an additional Newton iteration is not performed if the current error has not met the criteria but is close enough to justify avoiding the computational cost of an additional iteration. As was shown in this work, the JFNK method makes it possible to perform many calculations which would be infeasible if it were necessary to form and store the Jacobian. The use of Newton methods and JFNK methods in particular has been explored very little with regards to neutronics calculations; most work in the nuclear community dealing with Newton methods has dealt with the problem of coupling neutronics-thermal-hydraulics software solvers. Therefore research into the possible applications of JFNK methods to stand-alone transport methods and reactor physics calculations is a fertile research area which should certainly be explored.
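To make the wrapper idea concrete, the sketch below layers a Newton-GMRES solve over a generic outer iteration for a toy one-group problem, writing the residual as F(u) = u - G(u) with the eigenvalue carried as an extra unknown and the fission-rate update serving as the closing relation. SciPy's newton_krylov (which forms Jacobian-vector products by finite differences internally, i.e., a JFNK implementation) and the operators A and B are illustrative assumptions of this sketch, not software or problems used in this work.

import numpy as np
from scipy.optimize import newton_krylov

# Toy one-group operators (illustrative stand-ins for the loss and fission
# operators; not the benchmark problems of this work).
n = 50
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # loss (removal + leakage)
B = np.diag(np.linspace(1.5, 2.5, n))                    # fission production

def outer_iteration(u):
    """One traditional outer iteration G(u): a flux solve, the fission-rate
    eigenvalue update, and a normalization fixing the eigenvector scale."""
    phi, k = u[:-1], u[-1]
    src = B @ phi
    phi_new = np.linalg.solve(A, src / k)
    k_new = k * (B @ phi_new).sum() / src.sum()    # fission-rate update
    phi_new = phi_new / np.linalg.norm(phi_new)
    return np.concatenate([phi_new, [k_new]])

def residual(u):
    """Nonlinear function F(u) = u - G(u) whose root is the eigenpair."""
    return u - outer_iteration(u)

u0 = np.concatenate([np.ones(n) / np.sqrt(n), [1.0]])
u = newton_krylov(residual, u0, method='gmres', f_tol=1e-10)
print("k_eff  =", u[-1])
print("direct =", np.linalg.eigvals(np.linalg.solve(A, B)).real.max())

The only problem-specific ingredient is outer_iteration, which is why an existing solver can be reused practically intact; a preconditioner, for instance a low-order diffusion operator, could also be passed to the inner GMRES iteration through newton_krylov's inner_M argument.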

APPENDIX A

Mathematical Definitions

This appendix briefly lists a number of the special matrix (or vector) types and properties mentioned in the text, with a short description of pertinent features. The list is organized alphabetically for ease of finding definitions when needed. For matrices whose structure is important, a structural illustration is provided. Saad [5] is the primary source for these definitions, with Heath [1] and Golub and Van Loan [2] also consulted. In general, matrices are denoted by capital letters, for example $A$, and are assumed to be square, $n \times n$; lower case letters, such as $a_{ij}$, refer to the element of the matrix in row $i$ and column $j$, where $1 \le i \le n$ and $1 \le j \le n$.

A.1 Types of Matrices

Biorthogonal: If one basis is given by $V = [v_1, \ldots, v_m]$ and another by $W = [w_1, \ldots, w_m]$, they are said to be biorthogonal when $(v_i, w_j) = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta and $(\cdot,\cdot)$ is the inner product. In matrix form this is written $W^H V = I$.

Block Matrix: A block matrix is a matrix whose elements are other matrices. The dimensions of the block components must be consistent. For example, a matrix $A$ could be written as a block matrix
\[
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\]
as long as the blocks are correctly dimensioned, i.e., the number of rows in $A_{11}$ is equal to the number of rows in $A_{12}$, and likewise for $A_{21}$ and $A_{22}$; the column dimensions of $A_{11}$ and $A_{21}$ must agree, and the same is true for $A_{12}$ and $A_{22}$. Special types of matrices also have block equivalents: a block-diagonal matrix has $A_{21} = A_{12} = 0$, while a block-upper triangular matrix has $A_{21} = 0$.

Cholesky Decomposition: A symmetric positive definite matrix $A$ can be decomposed into a lower triangular matrix and the conjugate transpose of that matrix, $A = LL^H$. Finding an approximate representation of $L$ can yield a preconditioner for solving linear systems using the conjugate gradient method.

Conjugate: Two vectors $u$ and $v$ are conjugate with respect to $A$ if $u^T A v = 0$.

Diagonal: $a_{ij} = 0$ for $j \neq i$:
\[
A = \begin{bmatrix} a_{11} & & & \\ & a_{22} & & \\ & & \ddots & \\ & & & a_{nn} \end{bmatrix}
\]

Hermitian: $A^H = A$, or $a_{ij} = \bar{a}_{ji}$, where the bar signifies the complex conjugate. The Hermitian operation, denoted by $H$, is also known as the conjugate transpose. The real part of $A$ is necessarily symmetric if $A$ is Hermitian.

Hessenberg: A matrix that is nearly triangular. For an upper Hessenberg matrix, $h_{ij} = 0$ for $i > j + 1$:
\[
H_u = \begin{bmatrix}
h_{11} & h_{12} & h_{13} & \cdots & h_{1n} \\
h_{21} & h_{22} & h_{23} & \cdots & h_{2n} \\
 & h_{32} & h_{33} & \cdots & h_{3n} \\
 & & \ddots & \ddots & \vdots \\
 & & & h_{n,n-1} & h_{nn}
\end{bmatrix}
\]
For a lower Hessenberg matrix, $h_{ij} = 0$ for $i < j - 1$:
\[
H_l = \begin{bmatrix}
h_{11} & h_{12} & & & \\
h_{21} & h_{22} & h_{23} & & \\
h_{31} & h_{32} & h_{33} & \ddots & \\
\vdots & \vdots & & \ddots & h_{n-1,n} \\
h_{n1} & h_{n2} & h_{n3} & \cdots & h_{nn}
\end{bmatrix}
\]

Inner Product: For $u$ and $v$ the inner product is given by $u^T v = v^T u$ and denoted by $(u, v)$.

Krylov Subspace: The Krylov subspace of dimension $m$ generated by the matrix $A$ and vector $x$ is
\[
\mathcal{K}_m(A, x) = \mathrm{span}\{x, Ax, A^2 x, \ldots, A^{m-1} x\}.
\]

Lower Triangular: $l_{ij} = 0$ for $i < j$:
\[
L = \begin{bmatrix}
l_{11} & & & \\
l_{21} & l_{22} & & \\
\vdots & & \ddots & \\
l_{n1} & l_{n2} & \cdots & l_{nn}
\end{bmatrix}
\]

Normal: $A^H A = A A^H$.

Orthogonal: For a matrix, $Q^H Q$ is diagonal; similar to a unitary matrix, but the diagonal elements of the product are not necessarily equal to unity. Two vectors $u$ and $v$ are orthogonal if $u^T v = 0$.

Orthonormal: $Q^H Q = I$, the same as unitary; this indicates that $(q_i, q_i) = 1$ and $(q_i, q_j) = 0$ for $j \neq i$.

Positive Definite: A real matrix $A$ is positive definite if $(Au, u) > 0$ for all $u \in \mathbb{R}^n$, $u \neq 0$. If $A$ is also symmetric then it is called symmetric positive definite (SPD).

QR Factorization: The QR factorization of $A$ (dimension $m \times n$) is given by $QR = A$, with $Q \in \mathbb{R}^{m \times m}$ orthogonal and $R \in \mathbb{R}^{m \times n}$ upper triangular. If $A$ has full column rank then the columns of $Q$ form an orthonormal basis of the range of $A$. Common methods of QR factorization include Householder, Givens, and Gram-Schmidt.

Rank: The maximum number of linearly independent columns of a matrix.

Schur Decomposition: For any square $A$ there exists a unitary matrix $Q$ such that $Q^H A Q = R$, where $R$ is upper triangular and known as the Schur form of $A$. $A$ and $R$ are similar and thus have the same eigenvalues, and since $R$ is triangular these eigenvalues are the diagonal entries of $R$.

Similar: Matrices $A$ and $B$ are similar if there is a nonsingular matrix $X$ such that $A = XBX^{-1}$. Similar matrices have the same eigenvalues.

Symmetric: $A^T = A$, or $a_{ij} = a_{ji}$. The transpose operation is denoted by $T$.

Tridiagonal: $a_{ij} = 0$ unless $i = j$ or $j = i \pm 1$:
\[
T = \begin{bmatrix}
t_{1,1} & t_{1,2} & & & \\
t_{2,1} & t_{2,2} & t_{2,3} & & \\
 & \ddots & \ddots & \ddots & \\
 & & t_{n-1,n-2} & t_{n-1,n-1} & t_{n-1,n} \\
 & & & t_{n,n-1} & t_{n,n}
\end{bmatrix}
\]

Unitary: $Q^H Q = I$; this implies that the columns of $Q$ are orthonormal.

Upper Triangular: $u_{ij} = 0$ for $i > j$:
\[
U = \begin{bmatrix}
u_{11} & u_{12} & \cdots & u_{1n} \\
 & u_{22} & \cdots & u_{2n} \\
 & & \ddots & \vdots \\
 & & & u_{nn}
\end{bmatrix}
\]
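Several of these definitions can be checked numerically. The short Python sketch below is an illustrative aside, not part of the original appendix: it builds a Krylov subspace basis, orthonormalizes it with a QR factorization, and verifies the orthonormality condition $Q^H Q = I$; the matrix, starting vector, and dimensions are arbitrary assumptions.

import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 4
A = rng.standard_normal((n, n))     # arbitrary square matrix
x = rng.standard_normal(n)          # arbitrary starting vector

# Columns spanning the Krylov subspace K_m(A, x) = span{x, Ax, ..., A^{m-1} x}.
K = np.column_stack([np.linalg.matrix_power(A, j) @ x for j in range(m)])

# QR factorization: the columns of Q form an orthonormal basis of range(K).
Q, R = np.linalg.qr(K)

# Orthonormality: Q^H Q = I (to rounding error), as in the definitions above.
print(np.allclose(Q.conj().T @ Q, np.eye(m)))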

APPENDIX B

Benchmark Suite Specification

Geometric diagrams for the Takeda and C5G7-MOX benchmarks are provided along with the problem cross sections. For the Takeda benchmark set, problems 1, 2, and 3 are modeled with rods inserted and rods out, and problem 3 is also modeled with no control rod positions in the matrix; the geometries for these cases are given in Figures B.1, B.2, B.3, and B.4, respectively. The C5G7-MOX problem was modeled for the Unrodded, Rodded A, and Rodded B cases. The general description of the 3-D geometry is found in Figure B.5, while further detail of the assembly- and pin-level geometry is found in Figure B.6. The layout of the axial reflector is given in Figure B.7. One should refer to the benchmark report for a full description of the dimensions. The rod configurations of the Unrodded, Rodded A, and Rodded B cases are sketched in Figure B.8, from top to bottom, respectively. The multigroup cross sections for the entire benchmark suite are given in Tables B.1 through B.12.

Figure B.1: Takeda-1 Geometry [79]

Figure B.2: Takeda-2 Geometry [79]

Figure B.3: Takeda-3 Rodded Geometry [79]

Figure B.4: Takeda-3 No Rod Positions Geometry [79]

Figure B.5: C5G7-MOX 3-D Geometry [83]

Figure B.6: C5G7-MOX Pin Description/Layout [83]

Figure B.7: C5G7-MOX Axial Reflector [83]

Figure B.8: C5G7-MOX: Unrodded, Rodded A, Rodded B [83]
