Bindel, Fall 2016 Matrix Computations (CS 6210)
Notes for 2016-09-02

1 Notions of error

The art of numerics is finding an approximation with a fast algorithm, a form that is easy to analyze, and an error bound. Given a task, we want to engineer an approximation that is good enough, and that composes well with other approximations. To make these goals precise, we need to define types of errors and error propagation, and some associated notation, which is the point of this lecture.

1.1 Absolute and relative error

Suppose $\hat{x}$ is an approximation to $x$. The absolute error is
\[
  e_{\mathrm{abs}} = |\hat{x} - x|.
\]
Absolute error has the same dimensions as $x$, and can be misleading without some context. An error of one meter per second is dramatic if $x$ is my walking pace; if $x$ is the speed of light, it is a very small error.

The relative error is a measure with a more natural sense of scale:
\[
  e_{\mathrm{rel}} = \frac{|\hat{x} - x|}{|x|}.
\]
Relative error is familiar in everyday life: when someone talks about an error of a few percent, or says that a given measurement is good to three significant figures, she is describing a relative error.

We sometimes estimate the relative error in approximating $x$ by $\hat{x}$ using the relative error in approximating $\hat{x}$ by $x$:
\[
  \hat{e}_{\mathrm{rel}} = \frac{|\hat{x} - x|}{|\hat{x}|}.
\]
As long as $\hat{e}_{\mathrm{rel}} < 1$, a little algebra gives that
\[
  \frac{\hat{e}_{\mathrm{rel}}}{1 + \hat{e}_{\mathrm{rel}}}
  \leq e_{\mathrm{rel}} \leq
  \frac{\hat{e}_{\mathrm{rel}}}{1 - \hat{e}_{\mathrm{rel}}}.
\]
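To make the relation between $e_{\mathrm{rel}}$ and $\hat{e}_{\mathrm{rel}}$ concrete, here is a small numeric check. This is only an illustrative sketch, not part of the original notes, and the sample values are arbitrary.

```python
# Quick numeric check of the bracket relating e_rel and e_hat_rel
# (illustrative sketch; the sample values x and xhat are arbitrary).
x = 2.0          # "true" value
xhat = 2.1       # approximation

e_abs = abs(xhat - x)       # absolute error
e_rel = e_abs / abs(x)      # relative error, normalized by the truth
e_hat = e_abs / abs(xhat)   # relative error, normalized by the approximation

# For e_hat < 1, e_rel is bracketed by e_hat/(1+e_hat) and e_hat/(1-e_hat).
lower = e_hat / (1 + e_hat)
upper = e_hat / (1 - e_hat)
print(f"e_rel = {e_rel:.6f}, bracket = [{lower:.6f}, {upper:.6f}]")
# e_rel = 0.050000, bracket = [0.045455, 0.050000]
```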

If we know $\hat{e}_{\mathrm{rel}}$ is much less than one, then it is a good estimate for $e_{\mathrm{rel}}$. If $\hat{e}_{\mathrm{rel}}$ is not much less than one, we know that $\hat{x}$ is a poor approximation to $x$. Either way, $\hat{e}_{\mathrm{rel}}$ is often just as useful as $e_{\mathrm{rel}}$, and may be easier to estimate.

Relative error makes no sense for $x = 0$, and may be too pessimistic when the property of $x$ we care about is small enough. A natural intermediate between absolute and relative errors is the mixed error
\[
  e_{\mathrm{mixed}} = \frac{|\hat{x} - x|}{|x| + \tau}
\]
where $\tau$ is some natural scale factor associated with $x$.

1.2 Errors beyond scalars

Absolute and relative error make sense for vectors as well as scalars. If $\|\cdot\|$ is a vector norm and $\hat{x}$ and $x$ are vectors, then the (normwise) absolute and relative errors are
\[
  e_{\mathrm{abs}} = \|\hat{x} - x\|, \qquad
  e_{\mathrm{rel}} = \frac{\|\hat{x} - x\|}{\|x\|}.
\]
We might also consider the componentwise absolute or relative errors
\[
  e_{\mathrm{abs},i} = |\hat{x}_i - x_i|, \qquad
  e_{\mathrm{rel},i} = \frac{|\hat{x}_i - x_i|}{|x_i|}.
\]
The two concepts are related: the maximum componentwise relative error can be computed as a normwise error in a norm defined in terms of the solution vector:
\[
  \max_i e_{\mathrm{rel},i} = \|\hat{x} - x\|_*,
  \qquad \text{where } \|z\|_* = \|\operatorname{diag}(x)^{-1} z\|_\infty.
\]
More generally, absolute error makes sense whenever we can measure distances between the truth and the approximation; and relative error makes sense whenever we can additionally measure the size of the truth. However, there are often many possible notions of distance and size, and different ways to measure give different notions of absolute and relative error. In practice, this deserves some care.
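As a quick illustration (not from the notes; the vectors are made-up sample data), normwise and componentwise relative errors can disagree substantially when the entries of $x$ have very different magnitudes:

```python
import numpy as np

# Illustrative sketch: normwise vs. componentwise relative error.
x    = np.array([1.0, 1e-6])
xhat = np.array([1.0 + 1e-8, 2e-6])   # tiny absolute errors in both entries

err = xhat - x

# Normwise relative error (2-norm): small, dominated by the large entry.
e_rel_norm = np.linalg.norm(err) / np.linalg.norm(x)

# Componentwise relative errors: the second entry is off by 100%.
e_rel_comp = np.abs(err) / np.abs(x)

# Max componentwise relative error as a normwise error in the scaled norm
# ||diag(x)^{-1} (xhat - x)||_inf.
e_rel_scaled = np.linalg.norm(err / x, ord=np.inf)

print(e_rel_norm)     # ~1e-6: looks tiny normwise
print(e_rel_comp)     # [1e-8, 1.0]
print(e_rel_scaled)   # 1.0, equal to the max componentwise relative error
```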

1.3 Dimensions and scaling

The first step in analyzing many application problems is nondimensionalization: combining constants in the problem to obtain a small number of dimensionless constants. Examples include the aspect ratio of a rectangle, the Reynolds number in fluid mechanics [1], and so forth. There are three big reasons to nondimensionalize:

- Typically, the physics of a problem only really depends on dimensionless constants, of which there may be fewer than the number of dimensional constants. This is important for parameter studies, for example.
- For multi-dimensional problems in which the unknowns have different units, it is hard to judge an approximation error as small or large, even with a (normwise) relative error estimate. But one can usually tell what is large or small in a non-dimensionalized problem.
- Many physical problems have dimensionless parameters much less than one or much greater than one, and we can approximate the physics in these limits. Often when dimensionless constants are huge or tiny and asymptotic approximations work well, naive numerical methods work poorly. Hence, nondimensionalization helps us choose how to analyze our problems, and a purely numerical approach may be silly.

[1] Or any of a dozen other named numbers in fluid mechanics. Fluid mechanics is a field that appreciates the power of dimensional analysis.

2 Forward and backward error

We often approximate a function $f$ by another function $\hat{f}$. For a particular $x$, the forward (absolute) error is
\[
  |\hat{f}(x) - f(x)|.
\]
In words, forward error is the error in the function output. Sometimes, though, we can think of a slightly wrong input:
\[
  \hat{f}(x) = f(\hat{x}).
\]
In this case, $|x - \hat{x}|$ is called the backward error. An algorithm that always has small backward error is backward stable.
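As a small illustration (a sketch, not from the notes), consider approximating $f(x) = \sqrt{x}$ by a deliberately crude routine that rounds the result to four significant digits. The backward error is easy to read off because the square root is easy to invert:

```python
import math

# Illustrative sketch of forward vs. backward error for f(x) = sqrt(x).
# fhat deliberately rounds the true square root to 4 significant digits,
# standing in for an inexact algorithm.
def fhat(x):
    return float(f"{math.sqrt(x):.4g}")

x = 2.0
y = math.sqrt(x)       # (very nearly) the true value f(x)
yhat = fhat(x)         # approximate value

forward_rel = abs(yhat - y) / abs(y)

# Backward error: find xhat with f(xhat) = yhat exactly, i.e. xhat = yhat**2.
xhat = yhat**2
backward_rel = abs(xhat - x) / abs(x)

print(forward_rel, backward_rel)
# backward_rel is about twice forward_rel, consistent with the condition
# number of the square root being 1/2 (condition numbers are discussed next).
```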

A condition number a tight constant relating relative output error to relative input error For example, for the problem of evaluating a sufficiently nice function f(x) where x is the input and ˆx = x + h is a perturbed input (relative error h / x ), the condition number κ[f(x)] is the smallest constant such that f(x + h) f(x) f(x) If f is differentiable, the condition number is κ[f(x)] h x + o( h ) κ[f(x)] = lim h 0 f(x + h) f(x) / f(x) (x + h) x / x = f (x) x f(x) If f is Lipschitz in a neighborhood of x (locally Lipschitz), then κ[f(x)] = M f(x) x f(x) where M f is the smallest constant such that f(x+h) f(x) M f h +o( h ) When the problem has no linear bound on the output error relative to the input error, we sat the problem has an infinite condition number An example is x 1/3 at x = 0 A problem with a small condition number is called well-conditioned; a problem with a large condition number is ill-conditioned A backward stable algorithm applied to a well-conditioned problem has a small forward error 3 Perturbing matrix problems To make the previous discussion concrete, suppose I want y = Ax, but because of a small error in A (due to measurement errors or roundoff effects), I instead compute ŷ = (A + E)x where E is small The expression for the absolute error is trivial: = Ex But I usually care more about the relative error = Ex If we assume that A is invertible and that we are using consistent norms (which we will usually assume), then Ex = EA 1 y E A 1,
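Here is a quick numeric check of the differentiable case (an illustration, not from the notes): for $f(x) = \log x$, the formula gives $\kappa[f(x)] = 1/|\log x|$, which blows up as $x \to 1$, and a small perturbation experiment shows the matching error amplification.

```python
import math

# Illustrative check of kappa[f(x)] = |f'(x)| |x| / |f(x)| for f(x) = log(x).
# Near x = 1 the problem is ill-conditioned, since log(1) = 0.
def kappa_log(x):
    return abs(1.0 / x) * abs(x) / abs(math.log(x))   # = 1/|log(x)|

x = 1.001
h = 1e-8 * x                      # small input perturbation

rel_in  = abs(h) / abs(x)
rel_out = abs(math.log(x + h) - math.log(x)) / abs(math.log(x))

print(kappa_log(x))               # ~1000.5
print(rel_out / rel_in)           # ~1000.5: output error ~ kappa * input error
```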

3 Perturbing matrix problems

To make the previous discussion concrete, suppose I want $y = Ax$, but because of a small error in $A$ (due to measurement errors or roundoff effects), I instead compute
\[
  \hat{y} = (A + E) x
\]
where $E$ is small. The expression for the absolute error is trivial:
\[
  \|\hat{y} - y\| = \|E x\|.
\]
But I usually care more about the relative error:
\[
  \frac{\|\hat{y} - y\|}{\|y\|} = \frac{\|E x\|}{\|y\|}.
\]
If we assume that $A$ is invertible and that we are using consistent norms (which we will usually assume), then
\[
  \|E x\| = \|E A^{-1} y\| \leq \|E\| \, \|A^{-1}\| \, \|y\|,
\]
which gives us
\[
  \frac{\|\hat{y} - y\|}{\|y\|} \leq
  \|A\| \, \|A^{-1}\| \, \frac{\|E\|}{\|A\|} = \kappa(A) \frac{\|E\|}{\|A\|}.
\]
That is, the relative error in the output is (at most) the relative error in the input multiplied by the condition number $\kappa(A) = \|A\| \, \|A^{-1}\|$. Technically, this is the condition number for the problem of matrix multiplication (or solving linear systems, as we will see) with respect to a particular (consistent) norm; different problems have different condition numbers. Nonetheless, it is common to call this "the" condition number of $A$.

For some problems, we are given more control over the structure of the error matrix $E$. For example, we might suppose that $A$ is symmetric, and ask whether we can get a tighter bound if in addition to assuming a bound on $\|E\|$, we also assume $E$ is symmetric. In this particular case, the answer is no: we have the same condition number either way, at least for the 2-norm or Frobenius norm [2]. In other cases, assuming a structure to the perturbation does indeed allow us to achieve tighter bounds.

[2] This is left as an exercise for the student.

As an example of a refined bound, we consider moving from condition numbers based on small norm-wise perturbations to condition numbers based on small element-wise perturbations. Suppose $E$ is elementwise small relative to $A$, i.e. $|E| \leq \epsilon |A|$. Suppose also that we are dealing with a norm such that $\|X\| \leq \| \, |X| \, \|$, as is true of all the norms we have seen so far. Then
\[
  \|E A^{-1}\| \leq \| \, |A| \, |A^{-1}| \, \| \, \epsilon.
\]
The quantity $\kappa_{\mathrm{rel}}(A) = \| \, |A| \, |A^{-1}| \, \|$ is the relative condition number; it is closely related to the Skeel condition number, which we will see in our discussion of linear systems [3]. Unlike the standard condition number, the relative condition number is invariant under column scaling of $A$; that is, $\kappa_{\mathrm{rel}}(AD) = \kappa_{\mathrm{rel}}(A)$ where $D$ is a nonsingular diagonal matrix.

[3] The Skeel condition number involves the two factors in the reverse order.
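The following sketch checks the normwise bound and the column-scaling invariance of $\kappa_{\mathrm{rel}}$ numerically. It is not from the notes: the matrix, the perturbation, and the choice of the 2-norm are all arbitrary assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sketch with a random test matrix and a small perturbation of A.
n = 5
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
E = 1e-8 * rng.standard_normal((n, n))

y    = A @ x
yhat = (A + E) @ x

Ainv  = np.linalg.inv(A)
kappa = np.linalg.norm(A, 2) * np.linalg.norm(Ainv, 2)

rel_err = np.linalg.norm(yhat - y) / np.linalg.norm(y)
bound   = kappa * np.linalg.norm(E, 2) / np.linalg.norm(A, 2)
print(rel_err <= bound * (1 + 1e-12))    # True: the normwise bound holds
                                         # (tiny fudge factor guards roundoff)

# Relative condition number kappa_rel(A) = || |A| |A^{-1}| || and its
# invariance under column scaling A -> A D.
def kappa_rel(M):
    return np.linalg.norm(np.abs(M) @ np.abs(np.linalg.inv(M)), 2)

D = np.diag(rng.uniform(0.1, 10.0, n))
print(kappa_rel(A), kappa_rel(A @ D))    # equal up to roundoff
```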

What if, instead of perturbing $A$, we perturb $x$? That is, if $\hat{y} = A \hat{x}$ and $y = A x$, what is the condition number relating $\|\hat{y} - y\|/\|y\|$ to $\|\hat{x} - x\|/\|x\|$? We note that
\[
  \|\hat{y} - y\| = \|A (\hat{x} - x)\| \leq \|A\| \, \|\hat{x} - x\|,
\]
and
\[
  \|x\| = \|A^{-1} y\| \leq \|A^{-1}\| \, \|y\|,
  \quad \text{i.e.} \quad
  \|y\| \geq \|A^{-1}\|^{-1} \|x\|.
\]
Put together, this implies
\[
  \frac{\|\hat{y} - y\|}{\|y\|} \leq
  \|A\| \, \|A^{-1}\| \, \frac{\|\hat{x} - x\|}{\|x\|}.
\]
The same condition number appears again!
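A final numeric check of this bound, again just an illustration with random data and the 2-norm rather than anything prescribed by the notes:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sketch: the same kappa(A) bounds the error when x is perturbed.
n = 5
A  = rng.standard_normal((n, n))
x  = rng.standard_normal(n)
dx = 1e-8 * rng.standard_normal(n)       # small perturbation of the input vector

y    = A @ x
yhat = A @ (x + dx)

kappa = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

rel_out = np.linalg.norm(yhat - y) / np.linalg.norm(y)
rel_in  = np.linalg.norm(dx) / np.linalg.norm(x)

print(rel_out <= kappa * rel_in * (1 + 1e-12))   # True: same condition number at work
```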