Math 411 Preliminaries


Provide a list of preliminary vocabulary and concepts.

Vocabulary: Newton's method, Taylor series expansion (for single and multiple variables), eigenvalue, eigenvector, vector norms, 2-norm, infinity norm, matrix norms, condition number, Gauss elimination

- Use of absolute error and relative error and their relation to the number of significant digits
- Solution of linear systems of equations via Gauss elimination
- Stopping criteria in iterations
- Rounding error and truncation error
- Floating point system

Floating Point System

- Understand and handle error from floating point calculation
- Role of stability in numerical calculation

Vocabulary: base, sign bit, exponent, mantissa, underflow, overflow, rounding, chopping, roundoff error, absolute error, relative error, significant digits, truncation error, machine epsilon

- Distribution of floating point numbers
- Addition of floating point numbers
- Loss of associativity and distributivity
- Catastrophic cancellation
- Prevention of catastrophic cancellation
- Forward error analysis
- Backward error analysis
- IEEE standard

References:
- Basic Issues in Floating Point Arithmetic and Error Analysis, supplementary lecture notes of J. Demmel on the floating point system, University of California at Berkeley, September 1995.
- Miscalculating Area and Angles of a Needle-like Triangle, notes of W. Kahan on common misconceptions about floating point calculations, University of California at Berkeley, September 1997.
- IEEE Floating Point Arithmetic, more advanced lecture notes of W. Kahan on the status of IEEE Standard 754 for Binary Floating-Point Arithmetic, University of California at Berkeley, September 1995.
- Goldberg, David, "What Every Computer Scientist Should Know About Floating-Point Arithmetic", ACM Computing Surveys, Vol. 23, No. 1 (March 1991), pp. 5-48.
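
A minimal Python sketch of two of these ideas, machine epsilon and the prevention of catastrophic cancellation (assuming NumPy; the quadratic-root example is illustrative, not from the notes):

```python
import numpy as np

# Machine epsilon: the gap between 1.0 and the next representable double.
print(np.finfo(float).eps)                 # about 2.22e-16

# Catastrophic cancellation: the small root of x^2 - 2bx + c with b = 1e8,
# c = 1 requires subtracting two numbers that agree to ~16 digits.
b, c = 1e8, 1.0
r_naive = b - np.sqrt(b*b - c)             # cancellation: returns 0.0
r_stable = c / (b + np.sqrt(b*b - c))      # rationalized form: ~5e-9, correct
print(r_naive, r_stable)
```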

Newton's Method for Several Variables

- Vector functions
- Newton's method derivations
- Strengths and weaknesses of the method
- Interpretation of quadratic convergence

Vocabulary: local convergence, quadratic rate of convergence, function iteration, Jacobian matrix

- Newton's method derivations via generalization of Newton's method for single variable functions, or via Taylor series
- Quadratic rate of convergence and number of significant digits
- The "derivative" appears as a matrix
- Requires solution of a "totally new" linear system at each iteration step
- Non-convergence and slow convergence of Newton's method:
  - Jacobian matrix may become singular
  - Oscillation similar to the one dimensional case
  - Flat surface
  - Multiple root
- Method properties:
  - Strengths:
    - Derivation of the method may be obtained from Taylor series
    - Local convergence when the approximation is close to a root
    - Quadratic rate of convergence when the approximation is close to a root
    - Reasonably easy to implement if the Jacobian matrix can be symbolically calculated
    - May be used to find complex roots
  - Weaknesses:
    - Determination of a starting guess may not be trivial
    - Convergence or rate of convergence is not guaranteed when not close to the root
    - Method may not converge or may converge very slowly
    - Method requires evaluation of the Jacobian matrix as well as solution of a linear system at each iteration
    - Choice of stopping criteria is not obvious
    - Requires the initial guess to be complex in order to find a complex root
- Can still view Newton's method as a fixed point iteration; quadratic convergence follows from the general theory of fixed point iteration
- Theory for systems of nonlinear equations is still incomplete
- In implementations, some approximate the Jacobian matrix via finite differencing (naive) or automatic differentiation (sophisticated)
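
A minimal sketch of the method in Python (assuming NumPy; the 2-by-2 test system, a circle intersected with a hyperbola, is a hypothetical example):

```python
import numpy as np

def newton_system(F, J, x0, tol=1e-12, max_iter=50):
    """Newton's method for F(x) = 0, where J(x) returns the Jacobian matrix."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        dx = np.linalg.solve(J(x), -F(x))   # a "totally new" linear system each step
        x = x + dx
        if np.linalg.norm(dx) < tol:        # one possible stopping criterion
            break
    return x

# Hypothetical test system: x^2 + y^2 = 4 and xy = 1.
F = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
J = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
print(newton_system(F, J, [2.0, 0.5]))
```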

Fixed Point Iteration in Higher Dimensions

- Contraction mapping and the function iteration method
- Seidel's method to speed convergence

Vocabulary: functional iteration, Seidel method, Brouwer fixed point theorem

- (Review) Uniqueness of the fixed point and derivative bounds
- Functional iteration and contraction mapping
- Use of Seidel's method to accelerate convergence
- Theoretical and computable (though often pessimistic) error bounds
- Choice of initial guess is not obvious
- Often used as a brute force approach or method of last resort, even if the derivative bounds are not satisfied
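
A minimal sketch of functional iteration with a Seidel-style update (assuming NumPy; the contraction below is a standard textbook-style example with fixed point (1, 1), chosen so the derivative bounds hold near the root):

```python
import numpy as np

def fixed_point_seidel(g_list, x0, tol=1e-10, max_iter=200):
    """Iterate x = g(x); each component update immediately uses the
    newest values of the earlier components (Seidel acceleration)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i, gi in enumerate(g_list):
            x[i] = gi(x)                    # uses already-updated components
        if np.linalg.norm(x - x_old) < tol:
            break
    return x

g = [lambda v: (v[0]**2 + v[1]**2 + 8.0) / 10.0,
     lambda v: (v[0]*v[1]**2 + v[0] + 8.0) / 10.0]
print(fixed_point_seidel(g, [0.0, 0.0]))    # converges to (1, 1)
```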

Quasi-Newton Methods: Broyden's Method

- Broyden's method as an efficient way to apply concepts from Newton's method
- Efficient computation of the inverse of a matrix with a rank-1 update

Vocabulary: rank-1 update, Sherman-Morrison formula, Hessian

- Entries in the Jacobian matrix may be approximated by finite differences
- Deterioration of the rate of convergence from quadratic to superlinear
- Initial computation of the inverse requires O(n^3) operations using Gauss elimination
- The Sherman-Morrison formula implies each subsequent calculation requires only O(n^2) operations
- Broyden's method is implemented by explicitly calculating the inverse using the Sherman-Morrison formula; Gauss elimination is used only for constructing the inverse of the initial (exact or approximate) Jacobian matrix
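
A minimal sketch of Broyden's method maintaining the inverse via Sherman-Morrison (assuming NumPy; the hypothetical test system from the Newton sketch is reused):

```python
import numpy as np

def broyden(F, J0, x0, tol=1e-10, max_iter=100):
    """Broyden's method: invert the initial Jacobian once (O(n^3)), then
    apply rank-1 Sherman-Morrison updates to the inverse (O(n^2) per step)."""
    x = np.asarray(x0, dtype=float)
    H = np.linalg.inv(J0(x))                # Gauss elimination only happens here
    Fx = F(x)
    for _ in range(max_iter):
        dx = -H @ Fx
        x = x + dx
        F_new = F(x)
        dF, Fx = F_new - Fx, F_new
        if np.linalg.norm(dx) < tol:
            break
        HdF = H @ dF                        # Sherman-Morrison update of H = B^{-1}
        H = H + np.outer(dx - HdF, dx @ H) / (dx @ HdF)
    return x

F = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
J = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
print(broyden(F, J, [2.0, 0.5]))            # superlinear rather than quadratic
```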

Steepest Descent Method

- General solution of nonlinear equations
- Special case: solution of linear equations with a symmetric matrix

Vocabulary: steepest descent direction, quadratic form, line search, quadratic interpolation

- Conversion of a nonlinear system of equations to a minimization problem
- Use of the gradient direction as the steepest descent direction
- Slow rate of convergence near the minimum point
- Use of the steepest descent method to generate an initial guess for Newton or quasi-Newton methods
- For quadratic forms, performance deteriorates as the condition number increases
- The line search has an exact value for quadratic forms; for general nonlinear problems it is approximated by a quadratic minimization problem
- Conjugate direction methods
- Even if the original set of nonlinear equations does not admit a solution, a solution to the minimization problem can always be found using the steepest descent method
- A theoretically optimal method (such as the steepest descent method) is not necessarily the best numerical method
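
For the quadratic form f(x) = (1/2) x^T A x - b^T x with A symmetric positive definite, the negative gradient is the residual r = b - Ax and the exact line search has the closed form alpha = (r^T r)/(r^T A r). A minimal sketch (assuming NumPy; the 2-by-2 system is illustrative):

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=500):
    """Steepest descent with exact line search for SPD quadratic forms.
    Convergence slows as the condition number of A grows."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = b - A @ x                       # residual = negative gradient
        if np.linalg.norm(r) < tol:
            break
        alpha = (r @ r) / (r @ (A @ r))     # exact minimizer along the line
        x = x + alpha * r
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
print(steepest_descent(A, b, np.zeros(2)), np.linalg.solve(A, b))
```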

Conjugate Gradient Method

Vocabulary: conjugate direction, A-conjugate, (theoretical) finite termination

Reference: Jonathan R. Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain: http://www.cs.cmu.edu/~jrs/jrspapers.html#cg
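
A minimal sketch of the plain conjugate gradient iteration for an SPD system (assuming NumPy; in exact arithmetic the A-conjugate search directions give termination in at most n steps):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-12):
    """Plain CG for symmetric positive definite A."""
    n = len(b)
    x = np.zeros(n)
    r = b.copy()                            # residual for x = 0
    p = r.copy()
    rs = r @ r
    for _ in range(n):                      # (theoretical) finite termination
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p           # next direction, A-conjugate to p
        rs = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b), np.linalg.solve(A, b))
```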

Eigenvalue Problems

- Applications giving rise to eigenvalue problems
- Properties of eigenvalues and eigenvectors

Vocabulary: eigenvalue problem, shift, similarity transform

- Eigenvalues of diagonal and triangular matrices
- Eigenvalues of the inverse matrix
- Gauss elimination does not preserve eigenvalues
- Similarity transformation preserves eigenvalues
- Change to eigenvalues through shifting, scalar product, matrix product
- Determinant as the product of eigenvalues
- Gerschgorin theorem: location of eigenvalues
- Eigenvalues of real symmetric matrices are real
- Positive definite matrices
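
A small sketch of the Gerschgorin theorem (assuming NumPy; the symmetric test matrix is illustrative): every eigenvalue lies in at least one disk centered at a diagonal entry, with radius the sum of the absolute off-diagonal entries in that row.

```python
import numpy as np

def gerschgorin_disks(A):
    """Return (center, radius) pairs of the Gerschgorin row disks."""
    A = np.asarray(A, dtype=float)
    centers = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)
    return list(zip(centers, radii))

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(gerschgorin_disks(A))                 # disks (4,1), (3,2), (2,1)
print(np.linalg.eigvalsh(A))                # real (A symmetric) and inside the disks
```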

Power Method

- Power method, Rayleigh quotient and inverse iteration
- Wielandt deflation

Vocabulary: infinity norm, 2-norm and inner product, deflation

- Convergence of the power method relies on the existence of a dominant eigenvalue
- Normalization is needed within the iteration to keep the entries of the eigenvector meaningful
- The eigenvector is obtained along with the eigenvalue
- The Rayleigh quotient (for symmetric matrices) converges at twice the rate of the regular power method
- An LU factorization of A may be performed prior to applying the inverse power method or inverse iteration in order to speed up the calculation; if the matrix is tridiagonal, each inverse iteration step is only O(n)
- Existence of multiple dominant eigenvalues of the same magnitude leads to either slow convergence or oscillation in the numerical result
- A matrix with the same set of eigenvalues as A, except that the dominant eigenvalue is zeroed out, may be constructed using Wielandt deflation

Variants (c denotes a normalizing factor):
- Power method: x_new = c A x_old, finds the largest (in magnitude) eigenvalue
- Inverse power method: A x_new = c x_old, finds the smallest (in magnitude) eigenvalue
- Shifted power method: x_new = c (A - sI) x_old, finds the eigenvalue farthest from the shift s
- Inverse iteration: (A - sI) x_new = c x_old, finds the eigenvalue closest to the shift s

- Even if the initial random vector does not have a component in the dominant direction, the presence of rounding error will eventually reintroduce a component in the dominant direction, thus ensuring the success of the power method
- If the shift s is close to an eigenvalue, the matrix A - sI is nearly singular and some care must be taken in the inverse iteration; however, even with poor conditioning, inverse iteration performs well

Reference: G. Peters and J.H. Wilkinson, Inverse Iteration, Ill-Conditioned Equations and Newton's Method, SIAM Review 21, No. 1 (1979), 339-360.
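
A minimal sketch of the power method and of inverse iteration (assuming NumPy; here (A - sI) is re-solved each step for clarity, whereas in practice one LU-factors it once, as noted above):

```python
import numpy as np

def power_method(A, num_iter=200):
    """Converges to the dominant eigenpair when one eigenvalue
    strictly dominates in magnitude."""
    x = np.random.default_rng(0).standard_normal(A.shape[0])
    for _ in range(num_iter):
        x = A @ x
        x /= np.linalg.norm(x)              # normalize to keep entries meaningful
    return x @ A @ x, x                     # Rayleigh quotient, eigenvector

def inverse_iteration(A, s, num_iter=50):
    """Solve (A - sI) x_new = x_old each step; converges to the
    eigenvalue closest to the shift s."""
    M = A - s * np.eye(A.shape[0])
    x = np.ones(A.shape[0])
    for _ in range(num_iter):
        x = np.linalg.solve(M, x)
        x /= np.linalg.norm(x)
    return x @ A @ x, x

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
print(power_method(A)[0])                   # largest-in-magnitude eigenvalue
print(inverse_iteration(A, 1.9)[0])         # eigenvalue closest to 1.9
```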

Householder Transformation

- Householder transformation to upper Hessenberg form
- QR factorization

Vocabulary: orthogonal matrix, upper Hessenberg form, QR factorization, catastrophic cancellation

- Orthogonal matrices have a nice stability property: ||Qx|| = ||x|| in the 2-norm, thus the condition number is 1
- The product of orthogonal matrices is orthogonal
- Each Householder matrix is constructed from the column vector that it needs to zero out
- A Householder transformation is symmetric and orthogonal
- Pre- and post-multiplication by a Householder transformation is a similarity transformation
- Construction of Householder transformations for a matrix: the first component of the vector in the Householder transformation should be constructed with a sign chosen to avoid catastrophic cancellation
- QR factorization is closely related to Gram-Schmidt orthogonalization
- A symmetric matrix in upper Hessenberg form is tridiagonal
- In QR factorization, the matrix Q is the transpose of the product of the Householder transformations
- It is usually not necessary to compute the Householder transformation matrix explicitly; instead, use the normalized vector u or the unnormalized vector x associated with the transformation
- In QR factorization, the upper triangular matrix R obtained by pre-multiplication of Householder transformations does not possess the same set of eigenvalues as the original matrix
- The QR factorization is not unique: by appropriately changing the signs of the entries in the matrices we can get many different factorizations
- QR factorization is applicable to rectangular matrices
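
A minimal sketch of Householder QR (assuming NumPy; Q is accumulated explicitly here for clarity, though as noted above one usually stores only the vectors u):

```python
import numpy as np

def householder_qr(A):
    """QR factorization by Householder reflections H = I - 2 u u^T."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q, R = np.eye(m), A.copy()
    for k in range(min(m - 1, n)):
        x = R[k:, k]
        u = x.copy()
        # sign chosen so u[0] = x[0] + sign(x[0])*||x|| avoids cancellation
        u[0] += np.copysign(np.linalg.norm(x), x[0])
        u /= np.linalg.norm(u)
        R[k:, :] -= 2.0 * np.outer(u, u @ R[k:, :])    # apply H from the left
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ u, u)    # accumulate Q = H1 H2 ...
    return Q, R

A = np.array([[4.0, 1.0, 2.0], [2.0, 3.0, 0.0], [1.0, 1.0, 5.0]])
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(3)))
```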

QR Algorithm

- Use of Givens rotation matrices to convert an upper Hessenberg matrix to an upper triangular matrix
- Basic QR algorithm for symmetric matrices as a series of similarity transformations

Vocabulary: Givens rotation matrix

- Real symmetric matrices have real eigenvalues
- The QR algorithm produces similarity transforms even though QR factorization itself does not preserve eigenvalues
- The QR algorithm applied to a matrix with real eigenvalues gives a diagonal matrix in the limit
- The QR algorithm is usually performed after the matrix is converted to upper Hessenberg form
- A Jacobi rotation matrix is orthogonal but usually non-symmetric
- The QR factorization process may be performed using Jacobi rotations (or the fast Givens method)
- Construction of rotation matrices
- The QR algorithm is closely related to the power method, and hence a shift may be employed to enhance performance; often the convergence rate is cubic
- Choice of shifts
- The QR algorithm is applicable to square matrices only

A discussion of the QR algorithm implementation in MATLAB: http://www.mathworks.com/publications/newsletter/pdf/sum95cleve.pdf
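
A minimal sketch of the basic (unshifted) QR algorithm (assuming NumPy; production codes first reduce to Hessenberg or tridiagonal form and use shifts, as noted above):

```python
import numpy as np

def qr_algorithm(A, num_iter=200):
    """A_{k+1} = R_k Q_k = Q_k^T A_k Q_k is a similarity transform, so
    eigenvalues are preserved; for symmetric A the iterates tend to a
    diagonal matrix."""
    Ak = np.array(A, dtype=float)
    for _ in range(num_iter):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return np.diag(Ak)

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
print(np.sort(qr_algorithm(A)))
print(np.sort(np.linalg.eigvalsh(A)))       # reference eigenvalues
```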

Euler's Method

Vocabulary: Lipschitz condition, Lipschitz constant, global error, local truncation error

- Euler's method is a first order Taylor series approximation
- Interpretation of Euler's method as a straight line approximation using the slope function f(t,y) at the "initial" time
- Global error estimate of Euler's method
- Effect of rounding error on Euler's method
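
A minimal sketch of Euler's method (the test problem y' = y, y(0) = 1 is illustrative; the error at t = 1 shrinks like h, consistent with first order):

```python
import math

def euler(f, t0, y0, t_end, n):
    """Forward Euler: follow the tangent line over each step of size h."""
    h = (t_end - t0) / n
    t, y = t0, y0
    for _ in range(n):
        y = y + h * f(t, y)                 # straight-line step along f(t, y)
        t = t + h
    return y

for n in (10, 100, 1000):
    err = abs(euler(lambda t, y: y, 0.0, 1.0, 1.0, n) - math.e)
    print(n, err)                           # error shrinks ~10x per line
```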

Taylor Series Methods

- Local truncation error and order of Taylor series methods
- Derivation of Taylor series methods
- Interpretation of the slope constants k_i in the classical Runge-Kutta method

Vocabulary: local truncation error, order of Taylor series methods, one step method

- Computation of derivatives of the slope function (f', f'', etc.) using the chain rule
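
A minimal sketch of a second order Taylor method for the specific problem y' = t + y (an illustrative choice), where the chain rule gives y'' = f_t + f_y f = 1 + (t + y):

```python
import math

def taylor2(t0, y0, t_end, n):
    """Second order Taylor method for y' = t + y."""
    h = (t_end - t0) / n
    t, y = t0, y0
    for _ in range(n):
        fval = t + y
        y = y + h * fval + 0.5 * h * h * (1.0 + fval)   # truncated Taylor series
        t = t + h
    return y

exact = 2.0 * math.e - 2.0                  # y = 2e^t - t - 1 at t = 1
for n in (10, 100):
    print(n, abs(taylor2(0.0, 1.0, 1.0, n) - exact))    # error shrinks like h^2
```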

Classical Runge-Kutta Methods

- Derivation of Runge-Kutta methods
- Interpretation of the slope constants k_i in the classical Runge-Kutta method

Vocabulary: implicit Euler's method, modified Euler's method, Heun's method, classical Runge-Kutta methods

- Use of Taylor series expansions in deriving Runge-Kutta methods leads to a system of nonlinear equations
- The system has an infinite number of solutions, which implies that Runge-Kutta methods of a given order form a whole family of methods
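
A minimal sketch of the classical fourth order Runge-Kutta method (the slopes k1..k4 sample f at the left end, twice at the midpoint, and at the right end of each step; the test problem is illustrative):

```python
import math

def rk4(f, t0, y0, t_end, n):
    h = (t_end - t0) / n
    t, y = t0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h/2, y + h/2 * k1)
        k3 = f(t + h/2, y + h/2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + (h / 6.0) * (k1 + 2*k2 + 2*k3 + k4)     # weighted average slope
        t = t + h
    return y

print(abs(rk4(lambda t, y: y, 0.0, 1.0, 1.0, 10) - math.e))  # tiny with 10 steps
```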

Runge-Kutta-Fehlberg Method

- Basic ideas behind the RKF method
- Stepsize control

- Estimation of the local truncation error by comparing solutions from methods of different order
- Calculation of the change of stepsize based on the tolerance for the local truncation error
- Overlapping of the slope constants in the RKF method minimizes computational effort
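
A minimal sketch of the embedded-pair idea, shown with a simpler Euler (order 1) / Heun (order 2) pair rather than Fehlberg's 4(5) pair: the same slope evaluations feed both formulas, and their difference estimates the local truncation error and drives the stepsize.

```python
def adaptive_step(f, t, y, h, tol):
    """One adaptive step with an embedded order-1/order-2 pair."""
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    y_low = y + h * k1                      # Euler (order 1)
    y_high = y + (h / 2.0) * (k1 + k2)      # Heun (order 2), reuses k1
    err = abs(y_high - y_low)               # local truncation error estimate
    if err <= tol:
        h_next = h * min(2.0, 0.9 * (tol / max(err, 1e-16)) ** 0.5)
        return t + h, y_high, h_next        # accept and suggest new stepsize
    return adaptive_step(f, t, y, 0.5 * h, tol)   # reject, retry with h/2

t, y, h = 0.0, 1.0, 0.5
while t < 1.0 - 1e-12:
    t, y, h = adaptive_step(lambda t, y: y, t, y, min(h, 1.0 - t), 1e-6)
print(y)                                    # approximates e
```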

Multistep Methods

- Derivation of multistep methods

Vocabulary: Adams-Bashforth methods, Adams-Moulton methods, predictor-corrector pair

- Derivation of Adams-Bashforth and Adams-Moulton methods based on numerical integration
- Local truncation error of multistep methods
- Need one-step methods of the same order to generate starting values
- Implementation of a predictor-corrector pair using methods of the same order
- P(EC)^n strategy
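
A minimal sketch of a predictor-corrector pair of matching (second) order: two step Adams-Bashforth predicts, the Adams-Moulton trapezoidal formula corrects, in PECE form; a single one-step (Heun) step supplies the missing starting value.

```python
import math

def ab2_am2_pece(f, t0, y0, t_end, n):
    h = (t_end - t0) / n
    k1 = f(t0, y0)
    y1 = y0 + (h / 2.0) * (k1 + f(t0 + h, y0 + h * k1))  # Heun starter
    t, y = [t0, t0 + h], [y0, y1]
    for i in range(1, n):
        fi, fim1 = f(t[i], y[i]), f(t[i-1], y[i-1])
        yp = y[i] + (h / 2.0) * (3.0 * fi - fim1)   # P: Adams-Bashforth 2
        fp = f(t[i] + h, yp)                         # E: evaluate at prediction
        yc = y[i] + (h / 2.0) * (fp + fi)            # C: Adams-Moulton (trapezoid)
        y.append(yc)                                 # final E happens next loop
        t.append(t[i] + h)
    return y[-1]

print(abs(ab2_am2_pece(lambda t, y: y, 0.0, 1.0, 1.0, 100) - math.e))
```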

Stepsize Control in Multistep Methods

- Stepsize control for predictor-corrector pairs

- Use of the solutions from the predictor and the corrector to estimate the local truncation error
- Determination of the stepsize change for a given error tolerance
- Necessity to restart using a one-step method whenever there is a change in stepsize

Extrapolation Methods

- Midpoint method

Vocabulary: asymptotic expansion formula, end point correction

- Elimination of successive powers of the stepsize from given asymptotic formulae relating the exact solution to the numerical solutions and the stepsize
1. Even though the first order Euler's method is used, the application of the end point correction generates solution values that are second order accurate.
2. It is not trivial to establish an asymptotic expansion for a general multistep method.
3. Without a rigorous asymptotic expansion formula, the extrapolation idea is only of limited use.
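
A minimal sketch of the elimination idea, shown as Richardson extrapolation of Euler's method (simpler than the midpoint scheme, but the same principle): Euler admits an expansion y_h = y + c1 h + c2 h^2 + ..., so the combination 2 y_{h/2} - y_h removes the O(h) term and is second order accurate.

```python
import math

def euler(f, t0, y0, t_end, n):
    h = (t_end - t0) / n
    t, y = t0, y0
    for _ in range(n):
        y, t = y + h * f(t, y), t + h
    return y

f = lambda t, y: y                          # illustrative problem, exact value e
for n in (10, 40, 160):
    yh = euler(f, 0.0, 1.0, 1.0, n)
    yh2 = euler(f, 0.0, 1.0, 1.0, 2 * n)
    extrap = 2.0 * yh2 - yh                 # eliminates the leading h term
    print(n, abs(yh - math.e), abs(extrap - math.e))
```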

Systems of ODEs

- Conversion of higher order differential equations to a first order system

- Runge-Kutta methods and multistep methods may be applied in a straightforward manner to a first order system of ODEs
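
A minimal sketch of the conversion (illustrative problem y'' = -y, so with u = [y, y'] the system is u' = [u2, -u1]); any one step method then applies verbatim to the vector system:

```python
import numpy as np

def rk4_system(f, t0, u0, t_end, n):
    """Classical RK4 applied unchanged to a first order vector system."""
    h = (t_end - t0) / n
    t, u = t0, np.asarray(u0, dtype=float)
    for _ in range(n):
        k1 = f(t, u)
        k2 = f(t + h/2, u + h/2 * k1)
        k3 = f(t + h/2, u + h/2 * k2)
        k4 = f(t + h, u + h * k3)
        u = u + (h / 6.0) * (k1 + 2*k2 + 2*k3 + k4)
        t = t + h
    return u

f = lambda t, u: np.array([u[1], -u[0]])    # y'' = -y as a first order system
print(rk4_system(f, 0.0, [0.0, 1.0], np.pi, 100))  # exact answer: [0, -1]
```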

Stability

- Relation between consistency, stability and convergence
- Determination of the consistency of a given method
- Determination of the stability of a given method

Vocabulary: convergence, consistency, root condition (zero-stability), first and second characteristic polynomials

- Consistency + stability = convergence
- Convergence means the difference between the exact solution and the numerical solution becomes small as the stepsize tends to zero
- Consistency means the difference between the differential equation and the difference equation becomes small as the stepsize tends to zero
- All Runge-Kutta methods are consistent
- Multistep methods are consistent if p(1) = 0 and p'(1) = q(1), where p and q are the first and second characteristic polynomials respectively
- For multistep methods, the root condition gives an easy means to determine the stability of the method
- Solution of a linear homogeneous difference equation: set y_n = ξ^n in the difference equation and solve for ξ
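
A small sketch of checking the root condition numerically (assuming NumPy; polynomial coefficients are listed from the highest power down):

```python
import numpy as np

def satisfies_root_condition(rho_coeffs):
    """All roots of the first characteristic polynomial must satisfy
    |xi| <= 1, and roots on the unit circle must be simple."""
    roots = np.roots(rho_coeffs)
    if np.any(np.abs(roots) > 1.0 + 1e-10):
        return False
    on_circle = roots[np.abs(np.abs(roots) - 1.0) < 1e-10]
    return len(on_circle) == len(np.unique(np.round(on_circle, 8)))

print(satisfies_root_condition([1.0, -1.0, 0.0]))  # xi^2 - xi: roots 1, 0 -> True
print(satisfies_root_condition([1.0, 0.0, -1.0]))  # xi^2 - 1: roots +-1 -> True
print(satisfies_root_condition([1.0, -2.0, 1.0]))  # (xi - 1)^2: double root -> False
```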

Stiff Equations

- Treatment of stiff systems by an A-stable method

Vocabulary: stiff systems, characteristic polynomial, region of absolute stability, implicit trapezoidal rule, A-stability

- Application of classical methods (Runge-Kutta or multistep methods) to stiff systems leads to an unstable solution unless an excessively small stepsize is used
- The implicit trapezoidal rule is the only A-stable multistep method
- A-stability is a very restrictive definition and not many methods possess this property; hence there are other definitions of stability
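
A small demonstration on the illustrative stiff problem y' = -100 y, y(0) = 1, with a stepsize far too large for explicit Euler: the implicit trapezoidal rule has amplification factor (1 + h*lam/2)/(1 - h*lam/2), which stays below 1 in magnitude for any h when Re(lam) < 0.

```python
lam, h, n = -100.0, 0.05, 20
y_euler, y_trap = 1.0, 1.0
for _ in range(n):
    y_euler = (1.0 + h * lam) * y_euler                  # factor 1 + h*lam = -4
    y_trap = y_trap * (1 + h*lam/2) / (1 - h*lam/2)      # |factor| < 1: A-stable
print(y_euler)    # explodes like (-4)^20
print(y_trap)     # bounded, decaying toward 0 like the true solution e^(-100t)
```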

Linear Shooting Methods

- Replacement of a boundary value problem by two initial value problems

Vocabulary: boundary value problems
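
A minimal sketch of linear shooting (assuming NumPy; the BVP y'' = y + x, y(0) = 0, y(1) = 1 is a hypothetical example with exact solution y = 2 sinh(x)/sinh(1) - x): one IVP carries the left boundary value with zero slope, a homogeneous IVP carries unit slope, and a linear combination matches the right boundary value.

```python
import numpy as np

def rk4_system(f, t0, u0, t_end, n):
    h = (t_end - t0) / n
    t, u = t0, np.asarray(u0, dtype=float)
    for _ in range(n):
        k1 = f(t, u); k2 = f(t + h/2, u + h/2 * k1)
        k3 = f(t + h/2, u + h/2 * k2); k4 = f(t + h, u + h * k3)
        u, t = u + (h / 6.0) * (k1 + 2*k2 + 2*k3 + k4), t + h
    return u

fp = lambda x, u: np.array([u[1], u[0] + x])  # IVP 1: y(0) = 0, y'(0) = 0
fh = lambda x, u: np.array([u[1], u[0]])      # IVP 2 (homogeneous): y(0) = 0, y'(0) = 1
y1 = rk4_system(fp, 0.0, [0.0, 0.0], 1.0, 100)[0]
y2 = rk4_system(fh, 0.0, [0.0, 1.0], 1.0, 100)[0]
c = (1.0 - y1) / y2                           # slope needed so that y(1) = 1
print(c, 2.0 / np.sinh(1.0) - 1.0)            # agrees with the exact y'(0)
```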

Nonlinear Shooting Methods

- Replacement of a boundary value problem by the iterative solution of two coupled initial value problems
- Application of Newton's method leads to an auxiliary initial value problem

Finite Difference Method for Linear Equations

- Construction of a linear system of equations

Vocabulary: finite difference, centered difference

- The system of equations is often tridiagonal and may thus be solved efficiently
- Basic steps:
  1. Partition the interval into equally spaced subintervals
  2. Replace the derivatives in the equation by finite differences
  3. Construct the linear equations
  4. Apply the boundary conditions to complete the system
  5. Solve the linear system
  6. Use an interpolation procedure etc. to extract the relevant information
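
A minimal sketch of these steps for the model problem y'' = -pi^2 sin(pi x), y(0) = y(1) = 0 (illustrative; exact solution y = sin(pi x)); a dense solve stands in for the tridiagonal solver mentioned above:

```python
import numpy as np

n = 50                                      # interior grid points
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
# centered difference: (y_{i-1} - 2 y_i + y_{i+1}) / h^2 = f(x_i),
# with the zero boundary values already folded in
A = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
b = -np.pi**2 * np.sin(np.pi * x)
y = np.linalg.solve(A, b)                   # in practice: O(n) tridiagonal solve
print(np.max(np.abs(y - np.sin(np.pi * x))))   # O(h^2) error
```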

Finite Element Methods for Linear Equations

- Finite element (Rayleigh-Ritz) method using linear elements or a B-spline basis

Vocabulary: variational form, linear elements, cubic B-splines, compact (or local) support, weak solution

- Self-adjoint equations usually correspond to a minimization problem over some function space
- When linear elements are used, one only gets a piecewise linear approximation to the original problem
- B-splines have continuity up to the second derivative
- The weak form of the differential equation may be obtained by computing the gradient of the functional of the minimization problem
- The weak form may also be obtained by multiplying the original differential equation by a test function, integrating, and then performing an integration by parts on the highest derivative term
- The use of linear elements leads to a tridiagonal system; the use of B-splines leads to a band matrix
- Numerical integration may be required when computing the matrix and vector entries
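
A minimal sketch of the Galerkin/Rayleigh-Ritz idea with linear (hat) elements for the model problem -u'' = f on (0,1), u(0) = u(1) = 0 (illustrative; f = pi^2 sin(pi x), exact u = sin(pi x)); the load integrals are approximated here by one-point quadrature:

```python
import numpy as np

n = 50                                      # interior nodes, uniform mesh
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
# stiffness matrix: integrals of phi_i' phi_j' give the tridiagonal (2, -1)/h
K = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h
# load vector: integral of f * phi_i, approximated by h * f(x_i)
b = h * np.pi**2 * np.sin(np.pi * x)
u = np.linalg.solve(K, b)                   # tridiagonal system, as noted above
print(np.max(np.abs(u - np.sin(np.pi * x))))
```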

Finite Difference Methods for Nonlinear Equations

- Application of finite difference methods to nonlinear equations

- Application of finite difference methods to nonlinear differential equations leads to a (large) system of nonlinear algebraic equations
- The nonlinear system may be linearized by "lagging" the nonlinear term during the iteration; this is essentially the Picard (fixed point) iteration. Convergence is usually slow, but the radius of convergence is larger than that of Newton's method.

Discrete Least Squares Methods

- Least squares approximation using a discrete data set

Vocabulary: normal equations

- The normal equations are obtained by computing the gradient of the minimization function
- Matrix interpretation: A^T A x = A^T b
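
A minimal sketch of the normal equations for a straight-line fit (assuming NumPy; the four data points are made up):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2, 3.9])
A = np.column_stack([np.ones_like(x), x])   # basis functions 1 and x
coef = np.linalg.solve(A.T @ A, A.T @ y)    # normal equations A^T A c = A^T y
print(coef)                                 # intercept and slope
# np.linalg.lstsq(A, y, rcond=None) solves the same problem via a more
# stable orthogonalization-based route
```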

Continuous Least Squares Methods

- Construction of a least squares approximation to a continuous function

Vocabulary: Hilbert matrix, orthogonal polynomials, weight function, Legendre polynomials, Gram-Schmidt process, three term recurrence relation

- Construction of the normal equations
- Use of the standard polynomial basis leads to the Hilbert matrix
- Gram-Schmidt process for generating orthogonal polynomials
- Generating orthogonal polynomials from the three-term recurrence relation
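
A minimal sketch of generating Legendre polynomials from their three-term recurrence and checking orthogonality numerically (assuming NumPy):

```python
import numpy as np

def legendre(n, x):
    """(k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x)."""
    p0, p1 = np.ones_like(x), x
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2*k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

x = np.linspace(-1.0, 1.0, 20001)
print(np.trapz(legendre(2, x) * legendre(3, x), x))   # ~0: orthogonal on [-1, 1]
```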

Chebyshev Polynomials

- Properties of Chebyshev polynomials
- The role of Chebyshev points in interpolation

Vocabulary: monic polynomials, Chebyshev points (roots of Chebyshev polynomials), uniform boundedness of interpolation error, economization

- Three term recurrence relation of Chebyshev polynomials
- Explicit definition using the cosine and arccosine functions
- Orthogonality property
- Zeros and extremal points of Chebyshev polynomials
- Maximum values of monic Chebyshev polynomials
- Use of Chebyshev points as interpolating points to minimize the Lagrange interpolation error
- Uniform boundedness of the interpolation error using Chebyshev points
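
A small demonstration of the role of Chebyshev points (assuming NumPy; the Runge function 1/(1 + 25 x^2) is the classic illustrative example):

```python
import numpy as np

f = lambda x: 1.0 / (1.0 + 25.0 * x**2)
n = 11                                      # interpolate through 11 points
x_equi = np.linspace(-1.0, 1.0, n)
x_cheb = np.cos((2 * np.arange(n) + 1) * np.pi / (2 * n))  # Chebyshev roots
xx = np.linspace(-1.0, 1.0, 2001)
for nodes in (x_equi, x_cheb):
    p = np.polyfit(nodes, f(nodes), n - 1)  # degree-10 interpolating polynomial
    print(np.max(np.abs(np.polyval(p, xx) - f(xx))))
# equispaced nodes: large error near the endpoints; Chebyshev nodes: small,
# uniformly bounded error
```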

Padé Approximation

- Construction of Padé approximations using Taylor series

Vocabulary: rational approximation

- Matching powers of x in the construction of the Padé approximation of f(x) using Taylor series approximations
- In r(x) = p(x)/q(x), set q(0) = 1 to ensure uniqueness of the rational approximation
- Rational approximations give a more accurate representation of a given function than polynomial approximations of the same degree
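
A worked illustration: matching the Taylor series 1 + x + x^2/2 of e^x through degree 2 with q(0) = 1 gives the [1/1] Padé approximant r(x) = (1 + x/2)/(1 - x/2). Comparing it with the degree 2 Taylor polynomial near 0:

```python
import math

pade = lambda x: (1.0 + x / 2.0) / (1.0 - x / 2.0)   # [1/1] Pade approximant of e^x
taylor = lambda x: 1.0 + x + x * x / 2.0             # Taylor polynomial, degree 2
for x in (0.1, 0.3, 0.5):
    print(x, abs(pade(x) - math.exp(x)), abs(taylor(x) - math.exp(x)))
# the rational approximation is closer at each of these points
```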

Approximation by Trigonometric Polynomials

- Construction of least squares approximations by trigonometric functions

Vocabulary: degree of trigonometric polynomials

- Continuous orthogonality of trigonometric functions
- Discrete orthogonality of trigonometric functions
- Need 2n+1 data points when computing a trigonometric polynomial of degree n
- The normal equations resulting from discrete least squares approximation by trigonometric polynomials correspond to a diagonal system

Fast Fourier Transform

- Computation of Fourier coefficients using the Fast Fourier Transform

Vocabulary: complex form of Fourier series, transformation matrix, bit reversal

- Relation between the complex (exponential) form and the real (sine and cosine) form of Fourier series
- The discrete Fourier transform requires O(N^2) operations; the fast Fourier transform requires O(N log2 N) operations
- The fast Fourier transform exploits the special structure of the transform matrix to calculate the matrix-vector product efficiently
- Generation of the transformation matrix may be done efficiently: all entries are powers of a single (complex) number
- The fast Fourier transform may be interpreted as the recursive decomposition of an even order polynomial into two polynomials of half the original order
- Fast Poisson solvers for separable PDEs are based on the FFT
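
A minimal sketch of computing Fourier coefficients with the FFT (assuming NumPy; the sampled signal is made up), including the relation between the complex coefficients and the real sine/cosine form:

```python
import numpy as np

N = 64
t = 2.0 * np.pi * np.arange(N) / N
f = 3.0 * np.sin(2.0 * t) + np.cos(5.0 * t)
c = np.fft.fft(f) / N                       # complex (exponential) coefficients
# real form, for 0 < k < N/2: a_k = 2 Re(c_k), b_k = -2 Im(c_k)
print(np.round(-2.0 * c.imag[:6], 10))      # b_2 = 3 appears at index 2
print(np.round(2.0 * c.real[:6], 10))       # a_5 = 1 appears at index 5
```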