Conditioning and Stability


Lab 17: Conditioning and Stability

Lab Objective: Explore the condition of problems and the stability of algorithms.

The condition number of a function measures how sensitive that function is to changes in its input. The stability of an algorithm, on the other hand, measures how well that algorithm computes the value of a function from exact input.

Condition number of a function

The (absolute) condition number of a function f : R^m → R^n is

    J(x) = lim_{δ→0} sup_{‖δx‖≤δ} ‖δf‖/‖δx‖,    (17.1)

where δf = f(x + δx) − f(x). In other words, the condition number of f is (the limit of) the change in output over the change in input. Similarly, the relative condition number of f is the limit of the relative change in output over the relative change in input, i.e.,

    κ(x) = lim_{δ→0} sup_{‖δx‖≤δ} (‖δf‖/‖f(x)‖) / (‖δx‖/‖x‖).    (17.2)

In fact,

    κ(x) = (‖x‖/‖f(x)‖) J(x).

A function is ill-conditioned if its condition number is large. Small changes to the input of an ill-conditioned function produce large changes in its output. In applications it is important to know whether a function is ill-conditioned, because there is usually some error in the parameters passed to it.

Example: the Wilkinson polynomial

Let f : C^(n+1) → C^n be the function that sends (a_1, ..., a_(n+1)) to the roots of a_1 x^n + a_2 x^(n−1) + ... + a_n x + a_(n+1). In other words, this function describes the problem of finding the roots of a polynomial. Unfortunately, root finding is extremely ill-conditioned.
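As a concrete illustration of (17.1) and (17.2), the limit can be approximated by evaluating the quotient for a single small perturbation. The sketch below (our own example, not part of the lab) does this for the scalar function f(x) = √x, whose relative condition number is known analytically to be κ(x) = 1/2 for every x > 0:

```python
import numpy as np

def approx_condition(f, x, delta=1e-8):
    """Approximate the absolute and relative condition numbers of a
    scalar function f at x using one finite perturbation of size delta."""
    df = f(x + delta) - f(x)
    J = abs(df) / delta              # approximates (17.1)
    kappa = J * abs(x) / abs(f(x))   # approximates (17.2)
    return J, kappa

# For f(x) = sqrt(x), kappa(x) = 1/2 exactly, for any x > 0.
J, kappa = approx_condition(np.sqrt, 2.0)
print(J, kappa)   # kappa should be very close to 0.5
```

Because a single finite δx only approximates the limit, the returned values are estimates; the choice of delta trades truncation error against floating-point cancellation.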

A classic example is the Wilkinson polynomial

    w(x) = ∏_{r=1}^{20} (x − r) = x^20 − 210x^19 + 20615x^18 − ⋯.

We will use NumPy to explore the condition number of f at the coefficients of w(x). These coefficients are contained in the NumPy array w_coeffs below. We also create an array w_roots containing the roots of w(x).¹

>>> import numpy as np
>>> from scipy import linalg as la
>>> w_coeffs = np.array([1, -210, 20615, -1256850, 53327946, -1672280820,
...                      40171771630, -756111184500, 11310276995381,
...                      -135585182899530, 1307535010540395,
...                      -10142299865511450, 63030812099294896,
...                      -311333643161390640, 1206647803780373360,
...                      -3599979517947607200, 8037811822645051776,
...                      -12870931245150988800, 13803759753640704000,
...                      -8752948036761600000, 2432902008176640000])
>>> w_roots = np.arange(1, 21)

Next we perturb the polynomial by changing the x^19-coefficient from −210 to −210.0000001.

>>> perturb = np.zeros(21)
>>> perturb[1] = 1e-7
>>> perturbed_coeffs = w_coeffs - perturb

Now we find the roots of the perturbed polynomial. The function np.poly1d() creates a polynomial object from an array of coefficients, and np.roots() finds the roots of a polynomial.

>>> perturbed_roots = np.roots(np.poly1d(perturbed_coeffs))

The new roots are plotted with the original roots in Figure 17.1. Finally, we compute an approximation to the condition number. We sort the roots before comparing them to ensure that they are in the same order.

>>> w_roots = np.sort(w_roots)
>>> perturbed_roots = np.sort(perturbed_roots)
>>> la.norm(perturbed_roots - w_roots)/la.norm(perturb)
68214100.15878984

Thus, the condition number of this problem is on the order of 10^7. When we try to estimate the relative condition number, NumPy will not compute the 2-norm of w_coeffs because its entries are so large; NumPy will not take the square root of such a large number. One easy solution is to use a different norm that does not require a square root.
>>> # Estimate the absolute condition number in the infinity norm
>>> k = la.norm(perturbed_roots - w_roots, np.inf)/la.norm(perturb, np.inf)
>>> k
28260604.34345689
>>> # Estimate the relative condition number in the infinity norm

¹ It is possible to create this array in NumPy by expanding the product ∏_{r=1}^{20} (x − r), for example using np.poly(). However, this approach is unstable and will give you the wrong coefficients!

[Figure 17.1: The blue dots are the roots of w(x). The red dots are the roots of the polynomial perturbed by 1e-7 in the x^19-coefficient.]

>>> k*la.norm(w_coeffs, np.inf)/la.norm(w_roots, np.inf)
1.9505129642488696e+25

As you can see, the order of magnitude of the absolute condition number is the same as when we computed it with the 2-norm. The relative condition number for this problem is approximately 2 × 10^25.

There are some caveats to this example. First, when we compute the quotients in (17.1) and (17.2) for a fixed δx, we are only approximating the condition number. The actual condition number is a limit of such quotients. We hope that when ‖δx‖ is small, a quotient like this is at least the same order of magnitude as the limit, but we have no way to be sure. Second, this example assumes that NumPy's root-finding algorithm is stable, so that the difference between the computed roots of w_coeffs and perturbed_coeffs is due to the difference in the coefficients, and not to numerical error in the root finding. We will return to this issue in the next section.

Problem 1. Write a Python function that investigates the condition number of the Wilkinson polynomial by doing the following.

1. Perform this experiment: Randomly perturb w(x) by replacing each coefficient a_i with a_i·r_i, where r_i is drawn from a normal distribution centered at 1 with variance 1e-10. Plot the results of 100 such experiments in a single graphic, along with the roots of the unperturbed polynomial w(x). The plot should look something like Figure 17.2. This exercise reproduces Figure 12.1 on p. 93 of Numerical Linear Algebra by Lloyd N. Trefethen and David Bau III.

2. Using the final experiment only, estimate the relative and absolute condition numbers (in any norm you prefer). Print these numbers to the screen.

[Figure 17.2: Sample result of Problem 1. The blue dots are the roots of w(x) and the black dots are roots of random perturbations. This figure replicates Figure 12.1 on p. 93 of Numerical Linear Algebra by Lloyd N. Trefethen and David Bau III.]

Example: calculating eigenvalues

Let f : C^(n×n) → C^n be the function that sends an n×n matrix to its n eigenvalues. This problem is well-conditioned for symmetric matrices, but can be extremely ill-conditioned for non-symmetric matrices. Let us use NumPy to calculate the condition number of the eigenvalue problem at the identity matrix. First we check that the eigenvalue solver is stable here.

>>> M = np.array([[1, 0], [0, 1]])
>>> eigs = la.eig(M)[0]
>>> eigs
array([ 1.+0.j,  1.+0.j])

Now we perturb M by adding a matrix drawn from a random normal distribution over the complex numbers, and calculate the eigenvalues of the perturbed matrix.

>>> perturb = np.random.normal(0, 1e-10, M.shape) + np.random.normal(0, 1e-10, M.shape)*1j
>>> eigsp = la.eig(M + perturb)[0]

Finally, we use this data to approximate the condition number.

>>> k = la.norm(eigs - eigsp)/la.norm(perturb)  # Absolute condition number

0.62957336119253127
>>> k*la.norm(M)/la.norm(eigs)  # Relative condition number
0.62957336119253127

The absolute and relative condition numbers are the same here because la.norm(M) and la.norm(eigs) are both √2.

Problem 2 (Optional). Let us explore the condition number of the eigenvalue problem.

1. (a) Write the following function.

    def eig_condit(M):
        '''Approximate the condition number of the eigenvalue problem at M.

        INPUT:
        M - A 2-D NumPy array, representing a matrix.

        RETURN:
        A tuple containing approximations to the absolute and relative
        condition numbers of the eigenvalue problem at M.
        '''

   (b) Find an example of a 2×2 matrix with a very large condition number. (Hint: Look at matrices whose off-diagonal entries are very different in magnitude.)

   (c) What is the order of magnitude of the condition number of a symmetric 2×2 matrix?

2. Write the following function.

    def plot_eig_condit(x0=-100, x1=100, y0=-100, y1=100, res=10):
        '''Plot the condition number of the eigenvalue problem on
        [x0, x1] x [y0, y1]. Specifically, use plt.pcolormesh to plot the
        relative condition number of the eigenvalue problem at
        [[1, x], [y, 1]] on this domain. The variable `res` should be the
        number of sample points taken along each axis, for a total of
        res**2 points in the plot.
        '''

3. Call your function for res = 10, 50, 100, 200, and 400 (the output for res=200 is pictured in Figure 17.3). Recall that matplotlib scales the colorbar of the output to fit the largest and smallest output values. What can you conclude about the condition number of the eigenvalue problem at a random 2×2 matrix?
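To see just how sensitive a non-symmetric eigenvalue problem can be, consider the sketch below (the matrix here is our own illustrative choice, not prescribed by the lab): perturbing a single entry of a 2×2 matrix by 1e-6 moves both eigenvalues by a full unit.

```python
import numpy as np
from scipy import linalg as la

# A non-symmetric matrix whose off-diagonal entries differ wildly in magnitude.
A = np.array([[1.0, 1e6],
              [0.0, 1.0]])
print(la.eig(A)[0])      # both eigenvalues equal 1

# Perturb a single entry by 1e-6.
B = A.copy()
B[1, 0] += 1e-6
print(la.eig(B)[0])      # eigenvalues are now approximately 0 and 2
```

For the perturbed matrix the characteristic equation is (1 − λ)² = 10^6 · 10^(−6) = 1, so the eigenvalues move to 0 and 2 exactly: a change of order 1 in the output from a change of order 10^(−6) in the input.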

[Figure 17.3: Output of plot_eig_condit(res=200).]

Stability of an Algorithm

The stability of an algorithm is measured by the error in its output. Suppose we have some algorithm to compute f : R^m → R^n. Let f̃(x) represent the value computed by the algorithm at x. Then the forward error of f at x is ‖f(x) − f̃(x)‖, and the relative forward error of f at x is

    ‖f(x) − f̃(x)‖ / ‖f(x)‖.

An algorithm is stable if this relative forward error is small.

As an example, let us examine the stability of NumPy's root-finding algorithm, which we used to investigate the Wilkinson polynomial. We know the exact roots of w(x), and we can also compute these roots with np.roots().

>>> roots = np.arange(1, 21)
>>> w_coeffs = np.array([1, -210, 20615, -1256850, 53327946, -1672280820,
...                      40171771630, -756111184500, 11310276995381,
...                      -135585182899530, 1307535010540395,
...                      -10142299865511450, 63030812099294896,
...                      -311333643161390640, 1206647803780373360,
...                      -3599979517947607200, 8037811822645051776,
...                      -12870931245150988800, 13803759753640704000,
...                      -8752948036761600000, 2432902008176640000])
>>> computed_roots = np.roots(np.poly1d(w_coeffs))

We sort the roots to ensure they are in the same order, then compute the absolute and relative forward error.

>>> roots = np.sort(roots)
>>> computed_roots = np.sort(computed_roots)
>>> la.norm(roots - computed_roots)  # Forward error

0.020612653126379665
>>> la.norm(roots - computed_roots)/la.norm(roots)  # Relative forward error
0.00038476268486104599

This analysis gives us hope that questions of stability did not interfere too much with our experiments in Problem 1.

Catastrophic Cancellation

Catastrophic cancellation occurs when a computer takes the difference of two very similar numbers and the result is stored with a small number of significant digits. Because of the way computers store and perform arithmetic on numbers, later computations can amplify a catastrophic cancellation into a huge error. You are at risk for catastrophic cancellation whenever you subtract floats or large integers that are very close to each other. You can avoid the problem either by rewriting your program to avoid the subtraction, or by increasing the number of significant digits that your computer tracks.

Here is an example of catastrophic cancellation. Suppose we wish to compute √a − √b. We can either do this subtraction directly or perform the equivalent division

    √a − √b = (√a − √b)(√a + √b)/(√a + √b) = (a − b)/(√a + √b).

Let us perform this computation both ways in NumPy with a = 10^20 + 1 and b = 10^20, so that a − b = 1.

>>> np.sqrt(1e20 + 1) - np.sqrt(1e20)
0.0
>>> 1/(np.sqrt(1e20 + 1) + np.sqrt(1e20))
5.0000000000000002e-11

Since a ≠ b, clearly √a − √b should be nonzero.

Problem 3. Let I(n) = ∫₀¹ xⁿ e^(x−1) dx.

1. Prove that 0 ≤ I(n) ≤ 1 for all n.

2. It can be shown that for n > 1,

    I(n) = (−1)ⁿ(!n) + (−1)^(n+1) n!/e,

where !n is the subfactorial of n. Use this formula to write the following function.

    def integral(n):
        '''Return I(n).'''

Hint: The subfactorial function can be imported from SymPy with the line from sympy import subfactorial.

3. The actual values of I(n) for many values of n are listed in the table below. Use your function integral() to compute I(n) for these same values of n, and create a table comparing the data. How can you explain what is happening?

     n   Actual value of I(n)
     1   0.367879441171
     5   0.145532940573
    10   0.0838770701034
    15   0.0590175408793
    20   0.0455448840758
    25   0.0370862144237
    30   0.0312796739322
    35   0.0270462894091
    40   0.023822728669
    45   0.0212860390856
    50   0.0192377544343
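The rewrite-to-avoid-subtraction trick from the Catastrophic Cancellation section applies in other settings too. A standard example (our own addition, not part of the lab) is the quadratic formula: when b² ≫ 4c, the root (−b + √(b² − 4c))/2 of x² + bx + c subtracts two nearly equal numbers, while the algebraically equivalent form 2c/(−b − √(b² − 4c)) does not.

```python
import numpy as np

# Solve x^2 + b*x + c = 0 with b large; the smaller root is about -c/b.
b, c = 1e8, 1.0

disc = np.sqrt(b*b - 4*c)

naive = (-b + disc) / 2       # subtracts two nearly equal numbers
stable = 2*c / (-b - disc)    # algebraically equivalent, no cancellation

print(naive)    # suffers catastrophic cancellation
print(stable)   # close to -1e-8
```

The exact smaller root here is approximately −10^(−8); the naive formula loses most of its significant digits, while the rewritten form is accurate to nearly full precision.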