Sparse least squares and Q-less QR


Notes for 2016-02-29

Sparse least squares and Q-less QR

Suppose we want to solve a full-rank least squares problem in which $A$ is large and sparse. In principle, we could solve the problem via the normal equations
$$A^T A x = A^T b,$$
or introduce $A = QR$ and multiply $A^T A x = R^T R x = A^T b$ by $R^{-T}$ to find
$$Rx = R^{-T} A^T b = Q^T b.$$
Note that there is a very close relation between these approaches: the matrix $R$ in the QR decomposition is a Cholesky factor of $A^T A$ in the normal equations (possibly scaled by a diagonal matrix with $\pm 1$ on the diagonal). As we have discussed, it may not be advantageous to use the normal equations directly, since forming and factoring $A^T A$ brings in the square of the condition number. On the other hand, in a sparse setting it is not necessarily such a good idea to use the usual QR approach, since even if $A$ and $A^T A$ are sparse, $Q$ in general will not be. Consequently, when $A$ is sparse, we would typically use the following steps:

1. Possibly permute the columns of $A$ so that the Cholesky factor of $A^T A$ (or the factor $R$, which has the same structure) remains sparse. See help colamd for an example of a generally good permutation.
2. Compute a Q-less QR decomposition, e.g. R = qr(A,0) in MATLAB, where A is sparse. This does not compute the (usually very dense) factor $Q$ explicitly. It also does not form $A^T A$ explicitly. If the right-hand side $b$ is known initially, the MATLAB qr function can compute $Q^T b$ implicitly at the same time it does the QR factorization.
3. Compute $Q^T b$ as $R^{-T} (A^T b)$, since the latter computation involves only a sparse multiply and a sparse triangular solve.
4. Solve $Rx = Q^T b$.

(In MATLAB, you can also use backslash to solve a least squares problem, and it will do the right thing if $A$ is sparse.)
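Putting these steps together, a minimal MATLAB sketch might look like the following; it assumes a sparse, full-rank A and a right-hand side b are already defined, and uses colamd for the fill-reducing column permutation.

    % Sketch of a sparse Q-less QR solve (assumes sparse full-rank A and vector b)
    p   = colamd(A);            % Fill-reducing column permutation
    R   = qr(A(:,p), 0);        % Q-less economy QR of the permuted matrix
    QTb = R'\(A(:,p)'*b);       % Q'*b computed as R^{-T} (A^T b)
    xp  = R\QTb;                % Solve R*xp = Q'*b in the permuted ordering
    x    = zeros(size(A,2), 1);
    x(p) = xp;                  % Undo the column permutation

A cheap check on the result is that the normal equations residual A'*(A*x - b) should be small.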

It's not a bad idea to do iterative refinement after this; see help qr in MATLAB:

    x = R\(R'\(A'*b));
    r = b - A*x;
    e = R\(R'\(A'*r));
    x = x + e;

Of course, just as some square sparse systems cannot be re-ordered so that the LU factorization will be sparse, some sparse least squares problems cannot be re-ordered so that the QR factorization will be sparse. Sometimes, these problems can be more effectively treated by iterative methods of the sort we will discuss later in the class.

Weight patiently

So far, we have considered least squares problems involving the standard Euclidean norm in $\mathbb{R}^n$. But while we clearly take advantage of Euclidean structure in thinking about least squares, nothing requires that we look to the standard inner product for that structure. For example, consider the problem
$$\mbox{minimize } \|Ax - b\|_M^2$$
where $M$ is a general symmetric positive definite matrix. We can approach this problem through a generalization of the normal equations, e.g.
$$A^T M A x = A^T M b.$$
Alternately, we can observe that if $M = R_M^T R_M$ is a Cholesky factorization, then the $M$-norm least squares problem is equivalent to a least squares problem with respect to the standard inner product:
$$\mbox{minimize } \|R_M (Ax - b)\|^2.$$
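As a small illustration, the Cholesky-based reduction can be written in two lines of MATLAB; here M, A, and b are assumed to be given, with M symmetric positive definite.

    RM = chol(M);            % Cholesky factor: M = RM'*RM, with RM upper triangular
    x  = (RM*A) \ (RM*b);    % Ordinary least squares on the transformed data

The weighted least squares problems discussed next are the special case in which M is diagonal.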

Frequently, we care about weighted least squares problems
$$\mbox{minimize } \sum_{i=1}^m w_i^2 r_i^2$$
where the $w_i^2$ are a set of weights. This may be appropriate if some rows represent measurements that are more certain than others; for example, we may try to identify and down-weight outliers that do not fit the general trend of the rest of the data. Alternately, we may incorporate weights if different rows are naturally scaled differently. A weighted least squares problem can be re-phrased as a standard least squares problem:
$$\mbox{minimize } \|D (Ax - b)\|^2$$
where $D = \operatorname{diag}(w_1, \ldots, w_m)$ is a diagonal scaling matrix corresponding to the weights.

Weighted least squares makes a good building block for solving more general fitting problems. The iteratively reweighted least squares procedure (IRLS) is used to solve problems of the form
$$\mbox{minimize } \sum_{i=1}^m \ell(r_i)$$
where $\ell$ is some positive-definite loss function. If we let $\ell(r) = r^2$, we get the standard least squares problem; but standard least squares problems are sensitive to outliers, and may not be appropriate when measurement errors come from a non-Gaussian distribution with heavy tails. In some procedures, one tries to identify outliers statically, then down-weights them in an ordinary least squares problem. In other procedures, one uses a loss function $\ell$ that grows less quickly as a function of the argument; for example, $\ell(r) = |r|$ corresponds to least absolute deviation fitting. A popular choice due to Huber combines the least squares and least absolute deviation losses:
$$\ell(r) = \begin{cases} \frac{r^2}{2\epsilon}, & |r| \leq \epsilon, \\ |r| - \frac{\epsilon}{2}, & |r| > \epsilon. \end{cases}$$
There are other choices as well, such as the Tukey biweight function. The idea in IRLS is to approximate the solution to the general loss minimization through a sequence of least squares problems; at each step, we use weights $w_j = \ell'(\hat{r}_j/s)/\hat{r}_j$, where $s$ is a scale parameter and $\hat{r}$ is the residual vector from the previous step. We will return to IRLS later in the class when we talk about iterations for nonlinear least squares problems.
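As a concrete (if simplified) sketch, the MATLAB loop below runs IRLS with Huber-style weights; the matrix A, vector b, threshold epsilon, scale s, and fixed iteration count are all assumptions made for the example, and a production code would also update the scale estimate and test for convergence.

    x = A\b;                                % Start from the ordinary least squares fit
    for k = 1:20
        r = b - A*x;                        % Residuals at the current iterate
        u = r/s;                            % Scaled residuals
        w = min(1, epsilon./abs(u));        % Proportional to the Huber weight l'(u)/u
        W = spdiags(sqrt(w), 0, length(w), length(w));
        x = (W*A) \ (W*b);                  % Weighted least squares step
    end

Rows are scaled by sqrt(w) so that backslash minimizes the weighted sum of squared residuals; rescaling all the weights by a common constant does not change the step.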

Consider constraint

Weighted least squares is a good way to construct fits in which some equations are more important than others. But what if we want to enforce some equations exactly? That is, we consider the linearly constrained least squares problem
$$\mbox{minimize } \|Ax - b\|^2 \mbox{ s.t. } Bx = c$$
where $B \in \mathbb{R}^{l \times n}$ with $l < n$. There are several approaches to solving this problem. One in particular relates to the picture we saw when discussing the QR approach to solving least squares problems. We start by finding some solution $y$ to the constraint equation $By = c$, and form a homogeneous problem in terms of a correction $z = x - y$:
$$\mbox{minimize } \|Az - s\|^2 \mbox{ s.t. } Bz = 0, \quad s = b - Ay.$$
In the usual QR approach, we consider an orthogonal transformation of the residual space such that part of the residual is independent of $x$ and part of the residual can be set to zero. In the constrained setting, we also want to transform the space where the variable lives. That is, suppose we write the full QR decomposition
$$B^T = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} R_B \\ 0 \end{bmatrix}.$$
Then the condition $Bz = 0$ is equivalent to the condition that $z = U_2 w$ for some $w$, since the columns of $U_2$ span the null space of $B$. Therefore, we want to solve the unconstrained problem
$$\mbox{minimize } \|A U_2 w - s\|^2.$$
This method of handling constraints is fine in the dense case. In the sparse case, we might prefer an alternate approach based on the method of Lagrange multipliers, which we will return to in more detail later in the semester. In this method, we want to find a stationary point of the Lagrangian
$$L(x, \lambda) = \frac{1}{2} \|Ax - b\|^2 + \lambda^T (Bx - c).$$
Differentiating with respect to $x$ and $\lambda$, we have
$$\begin{bmatrix} \delta x \\ \delta \lambda \end{bmatrix}^T \begin{bmatrix} A^T A x + B^T \lambda - A^T b \\ Bx - c \end{bmatrix} = 0 \mbox{ for all } \delta x, \delta \lambda,$$
i.e.
$$\begin{bmatrix} A^T A & B^T \\ B & 0 \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} A^T b \\ c \end{bmatrix}.$$
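One simple way to use this formulation in the sparse setting is to assemble the (symmetric indefinite) system and hand it to a sparse direct solver; the sketch below assumes A, B, b, and c are given, with A and B sparse. Note that it forms $A^T A$ explicitly, which squares the condition number; the block elimination that follows works with the QR factors of $A$ instead.

    % Sketch: assemble and solve the saddle point (KKT) system directly
    n = size(A, 2);  l = size(B, 1);
    K   = [A'*A, B'; B, sparse(l, l)];     % Saddle point matrix (forms A'*A explicitly)
    rhs = [A'*b; c];
    xl  = K \ rhs;                         % Sparse factorization via backslash
    x      = xl(1:n);
    lambda = xl(n+1:end);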

Applying block Gaussian elimination, we have
$$B (A^T A)^{-1} B^T \lambda = B (A^T A)^{-1} A^T b - c.$$
Using the economy factorization $A = QR$ and introducing $M = B R^{-1}$, we have
$$(M M^T) \lambda = M Q^T b - c.$$
From there, we can solve for $\lambda$ and then do some algebra, which leads to the following MATLAB code:

    % Factorizations independent of RHS
    [Q, R]  = qr(A, 0);
    M       = B/R;
    [Q2, T] = qr(M', 0);

    % Part that depends on the RHS
    x0     = R\(Q'*b);              % Unconstrained solve
    lambda = T\(T'\(B*x0 - c));     % Compute multipliers
    x      = x0 - R\(M'*lambda);    % Correction to unconstrained solve
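A quick sanity check is to run this on a small random problem and confirm that the constraints and the stationarity condition hold; the sizes and data below are arbitrary assumptions for illustration.

    % Hypothetical test problem
    n = 10;  m = 50;  l = 3;
    A = randn(m, n);  b = randn(m, 1);
    B = randn(l, n);  c = randn(l, 1);

    [Q, R]  = qr(A, 0);
    M       = B/R;
    [Q2, T] = qr(M', 0);
    x0      = R\(Q'*b);
    lambda  = T\(T'\(B*x0 - c));
    x       = x0 - R\(M'*lambda);

    norm(B*x - c)                    % Constraint residual: should be near machine precision
    norm(A'*(A*x - b) + B'*lambda)   % Stationarity: gradient of the Lagrangian in x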

Problems to ponder

1. Derive the generalized normal equations for minimizing $\|Ax - b\|_M^2$.
2. Write a MATLAB code to solve the linearly constrained least squares problem using the first approach described (QR decomposition of the constraint matrix).
3. Complete the algebra to get from the extended system formulation for the linearly constrained least squares problem to the MATLAB code at the end of the notes.