1 Subset Selection

Ilse Ipsen
North Carolina State University, USA

2 Subset Selection

Given: real or complex matrix $A$, integer $k$.
Determine a permutation matrix $P$ so that $AP = (A_1 \ A_2)$, where $A_1$ has $k$ columns.

Important columns $A_1$: the columns of $A_1$ are very linearly independent (wannabe basis vectors).
Redundant columns $A_2$: the columns of $A_2$ are well represented by $A_1$.

3 Outline

Two applications
Mathematical formulation
Bounds
Algorithms (two norm)
Redundant columns (Frobenius norm)
Randomized algorithms

4 First application

Joint work with Tim Kelley.
Solution of nonlinear least squares problems by a Levenberg-Marquardt trust-region algorithm.
Difficulties: ill-conditioned or rank-deficient Jacobians; errors in residuals and Jacobians.
Remedy: improve the conditioning with subset selection.
Damped driven harmonic oscillators allow us to construct different scenarios.

5 Harmonic Oscillators

$m\ddot{x} + c\dot{x} + kx = 0$
displacement $x(t)$, mass $m$, damping constant $c$, spring stiffness $k$

Given: displacement measurements $x_j$ at times $t_j$
Want: parameters $p = (m, c, k)$

Nonlinear least squares problem: $\min_p \sum_j |x(t_j, p) - x_j|^2$
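
To make the setup concrete, here is a minimal Python sketch (an illustration, not the talk's code): it generates synthetic measurements under assumed initial conditions $x(0) = 1$, $\dot{x}(0) = 0$ and fits $p = (m, c, k)$ with SciPy's Levenberg-Marquardt driver. Since the ODE depends only on the ratios $c/m$ and $k/m$, the recovered $p$ is one representative of a one-parameter family, which is exactly the rank deficiency discussed on the next slides.

    # A minimal sketch (illustration only): fit p = (m, c, k) to synthetic
    # displacement data; x(t, p) is computed by integrating the oscillator ODE.
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import least_squares

    t_obs = np.linspace(0.0, 10.0, 50)      # measurement times t_j (assumed)
    p_true = np.array([1.0, 0.5, 4.0])      # "true" (m, c, k) for the synthetic data

    def displacement(p, t):
        """Solve m x'' + c x' + k x = 0 with x(0) = 1, x'(0) = 0; return x(t)."""
        m, c, k = p
        rhs = lambda t, y: [y[1], -(c * y[1] + k * y[0]) / m]
        return solve_ivp(rhs, (t[0], t[-1]), [1.0, 0.0], t_eval=t, rtol=1e-8).y[0]

    x_obs = displacement(p_true, t_obs)     # synthetic measurements x_j

    def residual(p):                        # R(p)_j = x(t_j, p) - x_j
        return displacement(p, t_obs) - x_obs

    fit = least_squares(residual, x0=[2.0, 1.0, 1.0], method='lm')
    print(fit.x)    # some scaling of (m, c, k): only c/m and k/m are identifiable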

6 Nonlinear Least Squares Problem

Residual: $R(p) = \big(x(t_1, p) - x_1 \ \ x(t_2, p) - x_2 \ \ \dots\big)^T$
Nonlinear least squares problem: $\min_p f(p)$ where $f(p) = \tfrac{1}{2} R(p)^T R(p)$
At a minimizer $p_*$: $\nabla f(p_*) = 0$
Gradient: $\nabla f(p) = R'(p)^T R(p)$, with Jacobian $R'(p)$
Solve $\nabla f(p) = 0$ by the Levenberg-Marquardt algorithm:
$p_{\rm new} = p - \big(\nu I + R'(p)^T R'(p)\big)^{-1} R'(p)^T R(p)$
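
The update formula translates directly into code. A minimal sketch (assuming callables R and jac for the residual and Jacobian, and a fixed damping parameter nu; a real implementation would adapt nu as in a trust-region method):

    import numpy as np

    def lm_step(p, R, jac, nu):
        """One Levenberg-Marquardt step: p - (nu*I + J^T J)^{-1} J^T R(p)."""
        J = jac(p)                             # Jacobian R'(p), shape (m, n)
        g = J.T @ R(p)                         # gradient of f(p) = 0.5 * R(p)^T R(p)
        H = nu * np.eye(len(p)) + J.T @ J      # damped Gauss-Newton matrix
        return p - np.linalg.solve(H, g)       # solve the linear system, don't invert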

7 Jacobian

Levenberg-Marquardt algorithm: $p_{\rm new} = p - \big(\nu I + R'(p)^T R'(p)\big)^{-1} R'(p)^T R(p)$
But: the Jacobian $R'(p)$ does not have full column rank, because $m\ddot{x} + c\dot{x} + kx = 0$ holds for infinitely many $p = (m, c, k)$:
$\ddot{x} + \tfrac{c}{m}\dot{x} + \tfrac{k}{m}x = 0$,  $\tfrac{m}{c}\ddot{x} + \dot{x} + \tfrac{k}{c}x = 0$,  $\tfrac{m}{k}\ddot{x} + \tfrac{c}{k}\dot{x} + x = 0$

Which parameter should be kept fixed? Which column of the Jacobian is redundant? Subset selection!

8 Second Application

Scott Pope's Ph.D. thesis (NC State, 2009): cardiovascular and respiratory modeling.
Goal: identify parameters that are important for predicting blood flow and pressure.
Approach: solve a nonlinear least squares problem combined with subset selection.
Parameters identified as important:
  total systemic resistance
  cerebrovascular resistance
  arterial compliance
  time of peak systolic ventricular pressure

9 Mathematical Formulation of Subset Selection

Given: real or complex matrix $A$ with $n$ columns, integer $k$.
Determine a permutation matrix $P$ so that $AP = (A_1 \ A_2)$, where $A_1$ has $k$ columns and $A_2$ has $n-k$ columns.

Important columns $A_1$: very linearly independent; the smallest singular value of $A_1$ is large.
Redundant columns $A_2$: well represented by $A_1$; $\min_Z \|A_1 Z - A_2\|$ is small (two norm).

10 Singular Value Decomposition (SVD)

$m \times n$ matrix $A$, $m \ge n$: $A = U \Sigma V^T$
Singular vectors: $U^T U = I_m$, $V^T V = V V^T = I_n$
Singular values: $\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_n)$ with $\sigma_1 \ge \dots \ge \sigma_n \ge 0$
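
For reference, the decomposition and the ordering convention are easy to check numerically (a small Python illustration with NumPy; the thin SVD is used since $m \ge n$):

    import numpy as np

    A = np.random.randn(8, 5)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U @ diag(s) @ Vt
    assert np.allclose(A, U @ np.diag(s) @ Vt)
    assert np.all(np.diff(s) <= 0) and s[-1] >= 0      # sigma_1 >= ... >= sigma_n >= 0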

11 Ideal Matrices for Subset Selection

Singular value decomposition: $AP = (A_1 \ A_2) = U \begin{pmatrix} \Sigma_1 & \\ & \Sigma_2 \end{pmatrix} V^T$

If there exists a permutation $P$ so that $V = I$, then:
Important columns inherit the large singular values of $A$: $A_1 = U \begin{pmatrix} \Sigma_1 \\ 0 \end{pmatrix}$ and $\sigma_i(A_1) = \sigma_i(A)$, $1 \le i \le k$
Redundant columns inherit the small singular values of $A$: $A_2 = U \begin{pmatrix} 0 \\ \Sigma_2 \end{pmatrix}$ and $\min_Z \|A_1 Z - A_2\| = \|A_2\| = \sigma_{k+1}(A)$

12 Subset Selection Requirements

Important columns $A_1$: the $k$ columns of $A_1$ should be very linearly independent, i.e. the smallest singular value $\sigma_k(A_1)$ should be large:
$\sigma_k(A)/\gamma \le \sigma_k(A_1) \le \sigma_k(A)$ for some $\gamma$

Redundant columns $A_2$: the columns of $A_2$ should be well represented by $A_1$, i.e. $\min_Z \|A_1 Z - A_2\|$ should be small (two norm):
$\sigma_{k+1}(A) \le \min_Z \|A_1 Z - A_2\| \le \gamma\, \sigma_{k+1}(A)$ for some $\gamma$

13 Bounds for Subset Selection

Singular value decomposition: $AP = (A_1 \ A_2) = U \begin{pmatrix} \Sigma_1 & \\ & \Sigma_2 \end{pmatrix} \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}$

Important columns $A_1$: $\sigma_i(A)/\|V_{11}^{-1}\| \le \sigma_i(A_1) \le \sigma_i(A)$ for all $1 \le i \le k$
Redundant columns $A_2$: $\sigma_{k+1}(A) \le \min_Z \|A_1 Z - A_2\| \le \|V_{11}^{-1}\|\, \sigma_{k+1}(A)$

14 How Small Can $\|V_{11}^{-1}\|$ Be?

The matrix $(V_{11} \ V_{12})$ is $k \times n$ with orthonormal rows.
If $V_{11}$ is nonsingular then $\|V_{11}^{-1}\| = \sqrt{1 + \|V_{11}^{-1} V_{12}\|_2^2}$
(follows from $I = V_{11} V_{11}^T + V_{12} V_{12}^T$)
If $|\det(V_{11})|$ is maximal then $\|V_{11}^{-1} V_{12}\| \le \sqrt{k(n-k)}$
(follows from Cramer's rule: $|(V_{11}^{-1} V_{12})_{ij}| \le 1$)
There exists a permutation such that $\|V_{11}^{-1}\| \le \sqrt{1 + k(n-k)}$

15 Bounds for Subset Selection

$AP = (A_1 \ A_2) = U \begin{pmatrix} \Sigma_1 & \\ & \Sigma_2 \end{pmatrix} \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}$
Permute the right singular vectors so that $|\det(V_{11})|$ is maximal.

Important columns $A_1$: $\sigma_i(A)/\sqrt{1 + k(n-k)} \le \sigma_i(A_1) \le \sigma_i(A)$ for all $1 \le i \le k$
Redundant columns $A_2$: $\sigma_{k+1}(A) \le \min_Z \|A_1 Z - A_2\| \le \sqrt{1 + k(n-k)}\; \sigma_{k+1}(A)$

16 Algorithms for Subset Selection

QR decomposition with column pivoting: $AP = Q \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix}$ where $Q^T Q = I$

Important columns: $A_1 = Q \begin{pmatrix} R_{11} \\ 0 \end{pmatrix}$ and $\sigma_i(A_1) = \sigma_i(R_{11})$, $1 \le i \le k$
Redundant columns: $A_2 = Q \begin{pmatrix} R_{12} \\ R_{22} \end{pmatrix}$ and $\min_Z \|A_1 Z - A_2\| = \|R_{22}\|$
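
In Python this is a one-liner with SciPy (a small illustration; the column choice here is LAPACK's standard pivoting, not a strong RRQR):

    import numpy as np
    from scipy.linalg import qr

    A, k = np.random.randn(10, 6), 3
    Q, R, piv = qr(A, pivoting=True)            # A[:, piv] = Q @ R
    A1, A2 = A[:, piv[:k]], A[:, piv[k:]]       # important / redundant columns
    R22 = R[k:, k:]
    # sigma_i(A1) = sigma_i(R11), and min_Z ||A1 Z - A2||_2 = ||R22||_2:
    assert np.allclose(np.linalg.svd(A1, compute_uv=False),
                       np.linalg.svd(R[:k, :k], compute_uv=False))
    Z = np.linalg.lstsq(A1, A2, rcond=None)[0]  # minimizer Z = A1^+ A2
    assert np.isclose(np.linalg.norm(A1 @ Z - A2, 2), np.linalg.norm(R22, 2))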

17 QR Decomposition with Column Pivoting

Determines a permutation matrix $P$ so that $AP = Q \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix}$
where $R_{11}$ is well conditioned and $\|R_{22}\|$ is small.
Then the numerical rank of $A$ is $k$: the QR decomposition reveals the rank.

18 Rank Revealing QR (RRQR) Decompositions

Businger & Golub (1965): QR with column pivoting
Faddeev, Kublanovskaya & Faddeeva (1968)
Golub, Klema & Stewart (1976)
Gragg & Stewart (1976)
Stewart (1984)
Foster (1986)
T. Chan (1987)
Hong & Pan (1992)
Chandrasekaran & Ipsen (1994)
Gu & Eisenstat (1996): strong RRQR

19 Strong RRQR (Gu & Eisenstat 1996)

Input: $m \times n$ matrix $A$, $m \ge n$, integer $k$
Output: $AP = Q \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix}$ with $R_{11}$ of size $k \times k$, where:
$R_{11}$ is well conditioned: $\sigma_i(A)/\sqrt{1 + k(n-k)} \le \sigma_i(R_{11}) \le \sigma_i(A)$, $1 \le i \le k$
$R_{22}$ is small: $\sigma_{k+j}(A) \le \sigma_j(R_{22}) \le \sqrt{1 + k(n-k)}\; \sigma_{k+j}(A)$
The offdiagonal block is not too large: $|(R_{11}^{-1} R_{12})_{ij}| \le 1$

20 Strong RRQR Algorithm

1. Compute some QR decomposition with column pivoting: $AP_{\rm initial} = Q \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix}$
2. Repeat: exchange a column of $\begin{pmatrix} R_{11} \\ 0 \end{pmatrix}$ with a column of $\begin{pmatrix} R_{12} \\ R_{22} \end{pmatrix}$, update the permutation $P$, and retriangularize, until $|\det(R_{11})|$ stops increasing.
3. Output: $AP_{\rm final} = (A_1 \ A_2)$, $A_1$ with $k$ columns, $A_2$ with $n-k$

Operation count: $O(mn^2)$ when exchanges are accepted only if they increase $|\det(R_{11})|$ by at least a factor $\sqrt{n}$.
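
A naive Python sketch of the exchange idea (illustration only: it recomputes volumes from scratch instead of using the $O(mn^2)$ updating of Gu & Eisenstat, and accepts any volume-increasing swap):

    import numpy as np
    from scipy.linalg import qr

    def strong_rrqr_sketch(A, k):
        """Greedy column exchanges that increase |det(R11)|, i.e. vol(A1)."""
        _, _, piv = qr(A, pivoting=True)       # step 1: initial pivoted QR
        sel, rest = list(piv[:k]), list(piv[k:])
        vol = lambda cols: np.prod(np.linalg.svd(A[:, cols], compute_uv=False))
        improved = True
        while improved:                        # step 2: exchange until no gain
            improved = False
            for i in range(k):
                for j in range(len(rest)):
                    trial = list(sel); trial[i] = rest[j]
                    if vol(trial) > vol(sel) * (1.0 + 1e-12):
                        sel[i], rest[j] = rest[j], sel[i]
                        improved = True
        return np.array(sel + rest)            # step 3: final permutation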

21 Another Strong RRQR Algorithm

In the spirit of Golub, Klema and Stewart (1976):
1. Compute the SVD $A = U \begin{pmatrix} \Sigma_1 & \\ & \Sigma_2 \end{pmatrix} \begin{pmatrix} V_1 \\ V_2 \end{pmatrix}$
2. Apply strong RRQR to $V_1$: $V_1 P = (V_{11} \ V_{12})$ with $V_{11}$ of size $k \times k$ and $1/\sqrt{1 + k(n-k)} \le \sigma_i(V_{11}) \le 1$, $1 \le i \le k$
3. Output: $AP = (A_1 \ A_2)$, $A_1$ with $k$ columns, $A_2$ with $n-k$
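
A compact Python sketch of this route (with ordinary pivoted QR standing in for the strong RRQR in step 2, so the guarantees are weaker):

    import numpy as np
    from scipy.linalg import qr

    def gks_subset(A, k):
        _, _, Vt = np.linalg.svd(A, full_matrices=False)  # rows of Vt = right singular vectors
        _, _, piv = qr(Vt[:k, :], pivoting=True)          # pivoted QR on V1 (k x n)
        return piv[:k], piv[k:]                           # important / redundant column indices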

22 Summary: Deterministic Subset Selection

$m \times n$ matrix $A$, $m \ge n$: $AP = (A_1 \ A_2)$, $A_1$ with $k$ columns, $A_2$ with $n-k$
Important columns $A_1$: $\sigma_k(A)/p(k,n) \le \sigma_k(A_1) \le \sigma_k(A)$
Redundant columns $A_2$: $\sigma_{k+1}(A) \le \min_Z \|A_1 Z - A_2\| \le p(k,n)\, \sigma_{k+1}(A)$

$p(k,n)$ depends on the leading $k$ right singular vectors; best known value: $p(k,n) = \sqrt{1 + k(n-k)}$
Algorithms: rank revealing QR decompositions, SVD
Operation count: $O(mn^2)$ for $p(k,n) = \sqrt{1 + nk(n-k)}$

23 RRQR: Redundant Columns

$AP = (A_1 \ A_2)$; RRQR: $\min_Z \|A_1 Z - A_2\| \le p(k,n)\, \sigma_{k+1}(A)$
Orthogonal projector onto range($A_1$): $A_1 A_1^\dagger$, so $\min_Z \|A_1 Z - A_2\| = \|(I - A_1 A_1^\dagger) A_2\|$
Largest among all small singular values: $\sigma_{k+1}(A) = \|\Sigma_2\|$
RRQR: $\|(I - A_1 A_1^\dagger) A_2\| \le p(k,n)\, \|\Sigma_2\|$
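
The equality between the least squares residual and the projected block is easy to verify numerically (a small Python illustration; $Z = A_1^\dagger A_2$ attains the minimum):

    import numpy as np

    A1, A2 = np.random.randn(10, 3), np.random.randn(10, 4)
    Z = np.linalg.pinv(A1) @ A2      # least squares minimizer A1^+ A2
    P1 = A1 @ np.linalg.pinv(A1)     # orthogonal projector onto range(A1)
    assert np.isclose(np.linalg.norm(A1 @ Z - A2, 2),
                      np.linalg.norm((np.eye(10) - P1) @ A2, 2))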

24 Subset Selection for Redundant Columns

$AP = (A_1 \ A_2)$, $A_1$ with $k$ columns, $A_2$ with $n-k$
Among all $\binom{n}{k}$ choices, find the $A_1$ that minimizes $\|(I - A_1 A_1^\dagger) A_2\|_\xi$.
Best bounds:
Two norm: $\|(I - A_1 A_1^\dagger) A_2\|_2 \le \sqrt{1 + k(n-k)}\; \|\Sigma_2\|_2$
Frobenius norm: $\|(I - A_1 A_1^\dagger) A_2\|_F \le \sqrt{k + 1}\; \|\Sigma_2\|_F$

25 Frobenius Norm

There exist $k$ columns $A_1$ so that $\|(I - A_1 A_1^\dagger) A_2\|_F^2 \le (k+1) \sum_{j \ge k+1} \sigma_j(A)^2$
Idea: volume sampling (Deshpande, Rademacher, Vempala & Wang 2006)
$\|(I - A_1 A_1^\dagger) A_2\|_F^2 = \sum_{j \ge k+1} \|(I - A_1 A_1^\dagger) a_j\|_2^2$
Volume: $\mathrm{Vol}(a_j) = \|a_j\|_2$ and $\mathrm{Vol}\big((A_1 \ a_j)\big) = \mathrm{Vol}(A_1)\, \|(I - A_1 A_1^\dagger) a_j\|_2$
Volume and singular values: $\sum_{i_1 < \dots < i_k} \mathrm{Vol}(A_{i_1 \dots i_k})^2 = \sum_{i_1 < \dots < i_k} \sigma_{i_1}(A)^2 \cdots \sigma_{i_k}(A)^2$
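
The volume update used here (appending a column multiplies the volume by the length of its component orthogonal to range($A_1$)) can be checked directly; a small Python illustration, with $\mathrm{Vol}(M)$ taken as the product of the singular values:

    import numpy as np

    A1, a = np.random.randn(8, 3), np.random.randn(8)
    vol = lambda M: np.prod(np.linalg.svd(M, compute_uv=False))
    P1 = A1 @ np.linalg.pinv(A1)                       # projector onto range(A1)
    assert np.isclose(vol(np.column_stack([A1, a])),   # Vol((A1 a))
                      vol(A1) * np.linalg.norm(a - P1 @ a))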

26 Maximizing Volumes Is Really Hard

Given: matrix $A$ with $n$ columns of unit norm, integer $k$, real number $\nu \in [0, 1]$
Finding $k$ columns $A_1$ of $A$ such that $\mathrm{Vol}(A_1) \ge \nu$ is NP-hard, and there is no polynomial time approximation scheme. [Civril & Magdon-Ismail, 2007]

27 Randomized Subset Selection

Frieze, Kannan & Vempala 2004
Deshpande, Rademacher, Vempala & Wang 2006
Liberty, Woolfe, Martinsson, Rokhlin & Tygert 2007
Drineas, Mahoney & Muthukrishnan 2006, 2008
Boutsidis, Mahoney & Drineas 2009
Civril & Magdon-Ismail 2009

Applications: statistical data analysis (feature selection, principal component analysis); pass-efficient algorithms for large data sets

28 2-Phase Randomized Algorithm (Boutsidis, Mahoney & Drineas)

1. Randomized phase: sample a small number ($\approx k \log k$) of columns
2. Deterministic phase: apply rank revealing QR to the sampled columns

With 70% probability:
Two norm: $\min_Z \|A_1 Z - A_2\|_2 \le O\big(k^{3/4} \log^{1/2} k\, (n-k)^{1/4}\big)\, \|\Sigma_2\|_2$
Frobenius norm: $\min_Z \|A_1 Z - A_2\|_F \le O\big(k \log^{1/2} k\big)\, \|\Sigma_2\|_F$

29 2-Phase Randomized Algorithm

1. Compute the SVD $A = U \begin{pmatrix} \Sigma_1 & \\ & \Sigma_2 \end{pmatrix} \begin{pmatrix} V_1 \\ V_2 \end{pmatrix}$
2. Randomized phase: scale, $V_1 \to V_1 D$, and sample $c$ columns: $(V_1 D) P_s = (V_s \ \cdots)$, where $V_s$ has $\hat{c}$ columns
3. Deterministic phase: apply RRQR to $V_s$: $V_s P_d = (V_d \ \cdots)$, where $V_d$ has $k$ columns
4. Output: $A P_s P_d = (A_1 \ A_2)$, $A_1$ with $k$ columns, $A_2$ with $n-k$

30 Randomized Phase (Frobenius Norm)

Sampling: column $i$ of $V_1$ is sampled with probability $p_i = c \big( \|(V_1)_i\| / \|V_1\|_F \big)^2$, $1 \le i \le n$
Scaling: scaling matrix $D = \mathrm{diag}\big(1/\sqrt{p_1}, \dots, 1/\sqrt{p_n}\big)$
Scaled matrix $V_1 D$: all columns have the same norm; columns sampled with probability $1/n$
Purpose of scaling: makes sampling uniform; makes expected values easier to compute
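
A minimal Python sketch of this phase (an interpretation of the slide: each column is kept independently with probability $p_i$ proportional to its squared norm, so the number of kept columns $\hat{c}$ is random):

    import numpy as np

    def sample_and_scale(V1, c, seed=0):
        """Sample/scale columns of the k x n matrix V1; return V_s and kept indices."""
        rng = np.random.default_rng(seed)
        p = c * np.sum(V1**2, axis=0) / np.sum(V1**2)  # p_i = c (||(V1)_i|| / ||V1||_F)^2
        p = np.minimum(p, 1.0)                         # clip to valid probabilities
        keep = rng.random(V1.shape[1]) < p             # ~c columns in expectation
        return V1[:, keep] / np.sqrt(p[keep]), np.flatnonzero(keep)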

31 Analysis of 2-Phase Algorithm

1. Sample $c$ columns: $(V D) P_s = \begin{pmatrix} V_{1s} D_s \\ V_{2s} D_s \end{pmatrix}$ (sampled block)
2. RRQR selects $k$ columns: $(V_{1s} D_s) P_d = (V_d \ \cdots)$

Perturbation theory: $\min_Z \|A_1 Z - A_2\|_F \le \|\Sigma_2\|_F + \|\Sigma_2 V_{2s} D_s\|_F / \sigma_k(V_d)$

With high probability:
$\sigma_k(V_{1s} D_s) \ge 1/2$ for $c \approx k \log k$
RRQR: $\sigma_k(V_d) \ge \sigma_k(V_{1s} D_s) / \sqrt{1 + k(\hat{c} - k)}$
$\|\Sigma_2 V_{2s} D_s\|_F \le 4\, \|\Sigma_2\|_F$

Hence $\min_Z \|A_1 Z - A_2\|_F \le O\big(k \log^{1/2} k\big)\, \|\Sigma_2\|_F$

32 Expected Values of Frobenius Norms

If $D_{ii} = 1/\sqrt{p_i}$ with probability $p_i$ (and $0$ otherwise), then $E\big[\|X D\|_F^2\big] = \|X\|_F^2$
Frobenius norm: $\|X D\|_F^2 = \mathrm{trace}(X D^2 X^T)$
Linearity: $E\big[\|X D\|_F^2\big] = \mathrm{trace}\big(X\, E[D^2]\, X^T\big)$ with $E[D^2] = I$
Scaling: $E\big[D_{ii}^2\big] = p_i \cdot \tfrac{1}{p_i} + (1 - p_i) \cdot 0 = 1$
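
A quick Monte Carlo check of this identity (Python illustration with arbitrary probabilities; the sample mean should approach $\|X\|_F^2$):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((4, 6))
    p = rng.uniform(0.2, 0.9, size=6)          # arbitrary sampling probabilities

    def sample_norm2():
        d = (rng.random(6) < p) / np.sqrt(p)   # D_ii is 1/sqrt(p_i) w.p. p_i, else 0
        return np.sum((X * d)**2)              # ||X D||_F^2

    est = np.mean([sample_norm2() for _ in range(100000)])
    print(est, np.sum(X**2))                   # the two numbers should be close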

33 From Expected Values to Probability

$E\big[\|X D\|_F^2\big] = \|X\|_F^2$
Markov's inequality: for $x = \|X D\|_F^2$, $\mathrm{Prob}(x \ge a) \le E(x)/a$; choose $a = 10\, E(x)$
With probability at most 1/10: $\|X D\|_F^2 \ge 10\, \|X\|_F^2$
With probability at least 9/10: $\|X D\|_F^2 \le 10\, \|X\|_F^2$

34 Issues with Randomized Algorithms

How to choose $c$: $10^3\, k \log k$, $k \log k$, $17\, k \log k$, ...?
We don't know the number of sampled columns $\hat{c}$ in advance.
The number of sampled columns can be too small: $\hat{c} < k$.
No information about the singular values of the important columns.
How often does one have to run the algorithm to get a good result?
How accurately do the singular vectors and singular values have to be computed?
How sensitive is the algorithm to the choice of probabilities?
How does the randomized algorithm compare to the deterministic algorithms in accuracy and run time?

35 Summary

Given: real or complex matrix $A$, integer $k$
Want: $AP = (A_1 \ A_2)$, $A_1$ with $k$ columns, $A_2$ with $n-k$

Important columns $A_1$: singular values close to the $k$ largest singular values of $A$
Redundant columns $A_2$: $\|(I - A_1 A_1^\dagger) A_2\|_{2,F}$ close to the smallest singular values of $A$
Bounds depend on the dominant $k$ right singular vectors
Deterministic algorithms: RRQR, SVD
Randomized algorithm with 2 phases: 1. randomized sampling, 2. RRQR on the samples
Exact subset selection is hard
