Subset Selection. Ilse Ipsen. North Carolina State University, USA
1 Subset Selection Ilse Ipsen North Carolina State University, USA
2 Subset Selection
Given: a real or complex matrix A and an integer k.
Determine a permutation matrix P so that AP = (A_1 A_2), where A_1 has k columns.
Important columns A_1: the columns of A_1 are very linearly independent (wannabe basis vectors).
Redundant columns A_2: the columns of A_2 are well represented by A_1.
3 Outline
- Two applications
- Mathematical formulation
- Bounds
- Algorithms (two norm)
- Redundant columns (Frobenius norm)
- Randomized algorithms
4 First Application
Joint work with Tim Kelley: solution of nonlinear least squares problems with the Levenberg-Marquardt trust-region algorithm.
Difficulties: ill-conditioned or rank-deficient Jacobians; errors in residuals and Jacobians.
Remedy: improve the conditioning with subset selection.
Damped driven harmonic oscillators allow us to construct different scenarios.
5 Harmonic Oscillators
m ẍ + c ẋ + k x = 0
with displacement x(t), mass m, damping constant c, spring stiffness k.
Given: displacement measurements x_j at times t_j.
Want: parameters p = (m, c, k).
Nonlinear least squares problem: min_p Σ_j (x(t_j, p) − x_j)²
6 Nonlinear Least Squares Problem
Residual: R(p) = (x(t_1, p) − x_1, x(t_2, p) − x_2, …)ᵀ
Nonlinear least squares problem: min_p f(p), where f(p) = ½ R(p)ᵀ R(p).
At a minimizer p*: ∇f(p*) = 0.
Gradient: ∇f(p) = R′(p)ᵀ R(p), where R′(p) is the Jacobian.
Solve ∇f(p) = 0 by the Levenberg-Marquardt algorithm:
p_new = p − (νI + R′(p)ᵀ R′(p))⁻¹ R′(p)ᵀ R(p)
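The Levenberg-Marquardt update above can be sketched in a few lines of NumPy. A minimal sketch, assuming a made-up exponential-decay model y = a·exp(b·t) in place of the oscillator, with a fixed damping parameter ν = 1e-3 (both choices are hypothetical, not from the talk):

```python
import numpy as np

def lm_step(p, residual, jacobian, nu):
    """One Levenberg-Marquardt update: p_new = p - (nu*I + J^T J)^{-1} J^T R."""
    R = residual(p)
    J = jacobian(p)
    H = nu * np.eye(len(p)) + J.T @ J   # damped Gauss-Newton "Hessian"
    return p - np.linalg.solve(H, J.T @ R)

# Toy zero-residual problem: fit y = a*exp(b*t) to exact data
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)
residual = lambda p: p[0] * np.exp(p[1] * t) - y
jacobian = lambda p: np.column_stack([np.exp(p[1] * t),
                                      p[0] * t * np.exp(p[1] * t)])

p = np.array([1.0, -1.0])
for _ in range(50):
    p = lm_step(p, residual, jacobian, nu=1e-3)
```

For this well-conditioned two-parameter fit the iteration recovers (a, b) = (2, −1.5); the rank-deficient case on the next slide is exactly where a plain step like this breaks down.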
7 Jacobian
Levenberg-Marquardt update: p_new = p − (νI + R′(p)ᵀ R′(p))⁻¹ R′(p)ᵀ R(p)
But the Jacobian R′(p) does not have full column rank: m ẍ + c ẋ + k x = 0 holds for infinitely many p = (m, c, k), since
ẍ + (c/m) ẋ + (k/m) x = 0
(m/c) ẍ + ẋ + (k/c) x = 0
(m/k) ẍ + (c/k) ẋ + x = 0
Which parameter should be kept fixed? Which column of the Jacobian is redundant? Subset selection!
8 Second Application
Scott Pope's Ph.D. thesis (NC State, 2009): cardiovascular and respiratory modeling.
Goal: identify parameters that are important for predicting blood flow and pressure.
Approach: solve a nonlinear least squares problem combined with subset selection.
Parameters identified as important:
- total systemic resistance
- cerebrovascular resistance
- arterial compliance
- time of peak systolic ventricular pressure
9 Mathematical Formulation of Subset Selection
Given: a real or complex matrix A with n columns and an integer k.
Determine a permutation matrix P so that AP = (A_1 A_2), where A_1 has k columns and A_2 has n − k columns.
Important columns A_1: the columns of A_1 are very linearly independent, i.e. the smallest singular value of A_1 is large.
Redundant columns A_2: the columns of A_2 are well represented by A_1, i.e. min_Z ‖A_1 Z − A_2‖ is small (two norm).
10 Singular Value Decomposition (SVD)
m × n matrix A, m ≥ n:
AP = U Σ V
Singular vectors: Uᵀ U = I_m and Vᵀ V = V Vᵀ = I_n.
Singular values: Σ has diagonal σ_1 ≥ σ_2 ≥ … ≥ σ_n ≥ 0.
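These SVD properties are easy to check numerically. A small NumPy sketch (note that `numpy.linalg.svd` returns the transposed right factor, and here the thin factorization is used, so UᵀU = I_n rather than I_m):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))          # m x n with m >= n

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Orthonormal factors and nonincreasing, nonnegative singular values
assert np.allclose(U.T @ U, np.eye(4))
assert np.allclose(Vt @ Vt.T, np.eye(4))
assert np.all(s[:-1] >= s[1:]) and np.all(s >= 0)

# A is recovered exactly (to rounding) from its SVD
assert np.allclose(U @ np.diag(s) @ Vt, A)
```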
11 Ideal Matrices for Subset Selection
Singular value decomposition:
AP = (A_1 A_2) = U diag(Σ_1, Σ_2) V, with A_1 of k columns and A_2 of n − k columns.
If there exists a permutation P so that V = I, then:
Important columns ↔ large singular values of A:
A_1 = U (Σ_1; 0) and σ_i(A_1) = σ_i(A), 1 ≤ i ≤ k.
Redundant columns ↔ small singular values of A:
A_2 = U (0; Σ_2) and min_Z ‖A_1 Z − A_2‖ = ‖A_2‖ = σ_{k+1}(A).
12 Subset Selection Requirements
Important columns A_1: the k columns of A_1 should be very linearly independent, i.e. the smallest singular value σ_k(A_1) should be large:
σ_k(A)/γ ≤ σ_k(A_1) ≤ σ_k(A) for some γ ≥ 1.
Redundant columns A_2: the columns of A_2 should be well represented by A_1, i.e. min_Z ‖A_1 Z − A_2‖ should be small (two norm):
σ_{k+1}(A) ≤ min_Z ‖A_1 Z − A_2‖ ≤ γ σ_{k+1}(A).
13 Bounds for Subset Selection
Singular value decomposition:
AP = (A_1 A_2) = U diag(Σ_1, Σ_2) V, with V partitioned as (V_11 V_12; V_21 V_22), where V_11 is k × k.
Important columns A_1:
σ_i(A)/‖V_11⁻¹‖ ≤ σ_i(A_1) ≤ σ_i(A) for all 1 ≤ i ≤ k.
Redundant columns A_2:
σ_{k+1}(A) ≤ min_Z ‖A_1 Z − A_2‖ ≤ ‖V_11⁻¹‖ σ_{k+1}(A).
14 How Small Can ‖V_11⁻¹‖ Be?
The matrix (V_11 V_12) is k × n with orthonormal rows.
If V_11 is nonsingular, then ‖V_11⁻¹‖² = 1 + ‖V_11⁻¹ V_12‖²₂, which follows from I = V_11 V_11ᵀ + V_12 V_12ᵀ.
If det(V_11) is maximal over all column permutations, then |(V_11⁻¹ V_12)_{ij}| ≤ 1 by Cramer's rule, so ‖V_11⁻¹ V_12‖ ≤ √(k(n − k)).
Hence there exists a permutation such that ‖V_11⁻¹‖ ≤ √(1 + k(n − k)).
15 Bounds for Subset Selection
AP = (A_1 A_2) = U diag(Σ_1, Σ_2) (V_11 V_12; V_21 V_22)
Permute the right singular vectors so that det(V_11) is maximal. Then:
Important columns A_1:
σ_i(A)/√(1 + k(n − k)) ≤ σ_i(A_1) ≤ σ_i(A) for all 1 ≤ i ≤ k.
Redundant columns A_2:
σ_{k+1}(A) ≤ min_Z ‖A_1 Z − A_2‖ ≤ √(1 + k(n − k)) σ_{k+1}(A).
16 Algorithms for Subset Selection
QR decomposition with column pivoting:
AP = Q (R_11 R_12; 0 R_22), where Qᵀ Q = I.
Important columns:
Q (R_11; 0) = A_1 and σ_i(A_1) = σ_i(R_11), 1 ≤ i ≤ k.
Redundant columns:
Q (R_12; R_22) = A_2 and min_Z ‖A_1 Z − A_2‖ = ‖R_22‖.
17 QR Decomposition with Column Pivoting
Determines a permutation matrix P so that
AP = Q (R_11 R_12; 0 R_22)
where R_11 is well conditioned and ‖R_22‖ is small.
Then the numerical rank of A is k: the QR decomposition reveals the rank.
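Column-pivoted QR is available off the shelf. A sketch with SciPy on a made-up matrix of numerical rank 2, showing how a tiny |R_22| reveals the rank:

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(1)
# Matrix with numerical rank 2: the third column nearly depends on the first two
A = rng.standard_normal((8, 2))
A = np.column_stack([A, A @ [1.0, -2.0] + 1e-10 * rng.standard_normal(8)])

Q, R, piv = qr(A, pivoting=True)          # A[:, piv] = Q @ R
k = 2
A1 = A[:, piv[:k]]                        # the "important" columns

assert np.allclose(A[:, piv], Q @ R)
# |R22| is tiny, revealing numerical rank k = 2
assert abs(R[2, 2]) < 1e-8
# ... while the selected columns A1 stay well conditioned
assert np.linalg.svd(A1, compute_uv=False)[-1] > 1e-3
```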
18 Rank Revealing QR (RRQR) Decompositions
- Businger & Golub (1965): QR with column pivoting
- Faddeev, Kublanovskaya & Faddeeva (1968)
- Golub, Klema & Stewart (1976)
- Gragg & Stewart (1976)
- Stewart (1984)
- Foster (1986)
- T. Chan (1987)
- Hong & Pan (1992)
- Chandrasekaran & Ipsen (1994)
- Gu & Eisenstat (1996): strong RRQR
19 Strong RRQR (Gu & Eisenstat 1996)
Input: m × n matrix A, m ≥ n, and an integer k.
Output: AP = Q (R_11 R_12; 0 R_22), where
- R_11 is k × k and well conditioned:
σ_i(A)/√(1 + k(n − k)) ≤ σ_i(R_11) ≤ σ_i(A), 1 ≤ i ≤ k
- R_22 is small:
σ_{k+j}(A) ≤ σ_j(R_22) ≤ √(1 + k(n − k)) σ_{k+j}(A)
- the offdiagonal block is not too large: |(R_11⁻¹ R_12)_{ij}| ≤ 1
20 Strong RRQR Algorithm
1 Compute some QR decomposition with column pivoting:
AP_initial = Q (R_11 R_12; 0 R_22)
2 Repeat: exchange a column of (R_11; 0) with a column of (R_12; R_22), update the permutation P, and retriangularize, until det(R_11) stops increasing.
3 Output: AP_final = (A_1 A_2), with A_1 of k columns and A_2 of n − k columns.
Operation count: O(mn²) if each exchange is required to increase det(R_11) by at least a factor of √n.
21 Another Strong RRQR Algorithm
In the spirit of Golub, Klema and Stewart (1976):
1 Compute the SVD
A = U diag(Σ_1, Σ_2) (V_1; V_2)
where V_1 contains the leading k right singular vectors.
2 Apply strong RRQR to V_1:
V_1 P = (V_11 V_12), with V_11 of order k and σ_i(V_11) ≥ 1/√(1 + k(n − k)), 1 ≤ i ≤ k.
3 Output: AP = (A_1 A_2), with A_1 of k columns and A_2 of n − k columns.
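The two steps above can be sketched compactly. A minimal sketch, with ordinary column-pivoted QR standing in for strong RRQR (plain pivoted QR lacks the strong-RRQR guarantees but usually selects well in practice); the test matrix is a hypothetical rank-3 example:

```python
import numpy as np
from scipy.linalg import qr

def subset_select(A, k):
    """Pick k columns of A via pivoted QR on the leading k right singular vectors.

    Plain column-pivoted QR stands in for strong RRQR here.
    """
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    V1 = Vt[:k, :]                       # leading k right singular vectors (k x n)
    _, _, piv = qr(V1, pivoting=True)
    return piv[:k]                       # indices of the "important" columns

rng = np.random.default_rng(2)
B = rng.standard_normal((20, 3))
A = np.column_stack([B, B @ rng.standard_normal((3, 4))])   # 7 columns, rank 3

cols = subset_select(A, 3)
A1 = A[:, cols]
# The selected columns span (numerically) the full column space of A
resid = A - A1 @ np.linalg.lstsq(A1, A, rcond=None)[0]
assert np.linalg.norm(resid) < 1e-10
```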
22 Summary: Deterministic Subset Selection
For an m × n matrix A with m ≥ n: AP = (A_1 A_2), with A_1 of k columns and A_2 of n − k columns.
Important columns A_1: σ_k(A)/p(k, n) ≤ σ_k(A_1) ≤ σ_k(A)
Redundant columns A_2: σ_{k+1}(A) ≤ min_Z ‖A_1 Z − A_2‖ ≤ p(k, n) σ_{k+1}(A)
The factor p(k, n) depends on the leading k right singular vectors; the best known value is p(k, n) = √(1 + k(n − k)).
Algorithms: rank revealing QR decompositions, SVD.
Operation count: O(mn²) for p(k, n) = √(1 + nk(n − k)).
23 RRQR: Redundant Columns
AP = (A_1 A_2), with A_1 of k columns and A_2 of n − k columns, and
min_Z ‖A_1 Z − A_2‖ ≤ p(k, n) σ_{k+1}(A).
Orthogonal projector onto range(A_1): A_1 A_1⁺, so that
min_Z ‖A_1 Z − A_2‖ = ‖(I − A_1 A_1⁺) A_2‖.
Largest among the small singular values: σ_{k+1}(A) = ‖Σ_2‖.
RRQR: ‖(I − A_1 A_1⁺) A_2‖ ≤ p(k, n) ‖Σ_2‖.
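The identity min_Z ‖A_1 Z − A_2‖ = ‖(I − A_1 A_1⁺) A_2‖ can be verified directly: the least-squares minimizer Z* = A_1⁺ A_2 attains the projected residual. A quick NumPy check on random data:

```python
import numpy as np

rng = np.random.default_rng(3)
A1 = rng.standard_normal((10, 3))
A2 = rng.standard_normal((10, 4))

# Orthogonal projector onto range(A1), via the pseudoinverse
P = A1 @ np.linalg.pinv(A1)

# The least-squares minimizer Z* = A1^+ A2 attains the bound
Z = np.linalg.pinv(A1) @ A2
attained = np.linalg.norm(A1 @ Z - A2, 2)
projected = np.linalg.norm((np.eye(10) - P) @ A2, 2)
assert np.isclose(attained, projected)
```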
24 Subset Selection for Redundant Columns
AP = (A_1 A_2), with A_1 of k columns and A_2 of n − k columns.
Among all (n choose k) choices, find the A_1 that minimizes ‖(I − A_1 A_1⁺) A_2‖_ξ.
Best bounds:
Two norm: ‖(I − A_1 A_1⁺) A_2‖_2 ≤ √(1 + k(n − k)) ‖Σ_2‖_2
Frobenius norm: ‖(I − A_1 A_1⁺) A_2‖_F ≤ √(k + 1) ‖Σ_2‖_F
25 Frobenius Norm
There exist k columns A_1 so that
‖(I − A_1 A_1⁺) A_2‖²_F ≤ (k + 1) Σ_{j ≥ k+1} σ_j(A)².
Idea: volume sampling (Deshpande, Rademacher, Vempala & Wang 2006).
‖(I − A_1 A_1⁺) A_2‖²_F = Σ_{j ≥ k+1} ‖(I − A_1 A_1⁺) a_j‖²_2
Volumes: Vol(a_j) = ‖a_j‖_2 and Vol((A_1 a_j)) = Vol(A_1) ‖(I − A_1 A_1⁺) a_j‖_2
Volumes and singular values:
Σ_{i_1 < … < i_k} Vol(A_{i_1 … i_k})² = Σ_{i_1 < … < i_k} σ_{i_1}(A)² ⋯ σ_{i_k}(A)²
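Both volume identities are easy to check numerically: the volume of a column set is the product of its singular values, i.e. √det(A_1ᵀA_1), and appending a column a multiplies the volume by the distance of a to range(A_1). A small NumPy check on random data:

```python
import numpy as np

def vol(M):
    """Volume of the parallelepiped spanned by the columns: product of singular values."""
    return float(np.prod(np.linalg.svd(M, compute_uv=False)))

rng = np.random.default_rng(4)
A1 = rng.standard_normal((8, 3))
a = rng.standard_normal(8)

# Vol(A1) equals sqrt(det(A1^T A1))
assert np.isclose(vol(A1), np.sqrt(np.linalg.det(A1.T @ A1)))

# Appending a column multiplies the volume by the distance of a to range(A1)
P = A1 @ np.linalg.pinv(A1)
dist = np.linalg.norm(a - P @ a)
assert np.isclose(vol(np.column_stack([A1, a])), vol(A1) * dist)
```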
26 Maximizing Volumes Is Really Hard
Given: a matrix A with n columns of unit norm, an integer k, and a real number ν ∈ [0, 1].
Finding k columns A_1 of A such that Vol(A_1) ≥ ν is NP-hard, and there is no polynomial time approximation scheme [Civril & Magdon-Ismail, 2007].
27 Randomized Subset Selection
- Frieze, Kannan & Vempala 2004
- Deshpande, Rademacher, Vempala & Wang 2006
- Liberty, Woolfe, Martinsson, Rokhlin & Tygert 2007
- Drineas, Mahoney & Muthukrishnan 2006, 2008
- Boutsidis, Mahoney & Drineas 2009
- Civril & Magdon-Ismail 2009
Applications: statistical data analysis (feature selection, principal component analysis); pass-efficient algorithms for large data sets.
28 2-Phase Randomized Algorithm (Boutsidis, Mahoney & Drineas)
1 Randomized phase: sample a small number (≈ k log k) of columns.
2 Deterministic phase: apply rank revealing QR to the sampled columns.
With 70% probability:
Two norm: min_Z ‖A_1 Z − A_2‖_2 ≤ O(k^{3/4} log^{1/2} k (n − k)^{1/4}) ‖Σ_2‖_2
Frobenius norm: min_Z ‖A_1 Z − A_2‖_F ≤ O(k log^{1/2} k) ‖Σ_2‖_F
29 2-Phase Randomized Algorithm
1 Compute the SVD: A = U diag(Σ_1, Σ_2) (V_1; V_2).
2 Randomized phase: scale V_1 ← V_1 D, then sample c columns: (V_1 D) P_s = (V_s …), where V_s has ĉ columns.
3 Deterministic phase: apply RRQR to V_s: V_s P_d = (V_d …), where V_d has k columns.
4 Output: A P_s P_d = (A_1 A_2), with A_1 of k columns and A_2 of n − k columns.
30 Randomized Phase (Frobenius Norm)
Sampling: column i of V_1 is sampled with probability
p_i = c ‖(V_1)_i‖²_2 / ‖V_1‖²_F, 1 ≤ i ≤ n.
Scaling: scaling matrix D = diag(1/√p_1, …, 1/√p_n); scaled matrix V_1 D.
In V_1 D all columns have the same norm, so sampling them is like sampling uniformly with probability 1/n.
Purpose of scaling: makes the sampling uniform and makes expected values easier to compute.
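The sampling and scaling step can be sketched directly from the formulas above. A minimal sketch, assuming made-up sizes (k = 3, n = 50, c = 20) and capping each p_i at 1, which is a common convention not stated on the slide:

```python
import numpy as np

rng = np.random.default_rng(5)
k, n, c = 3, 50, 20
# k x n matrix with orthonormal rows, standing in for the right singular vectors V1
V1 = np.linalg.qr(rng.standard_normal((n, k)))[0].T

# Column-norm-squared probabilities p_i = c * ||(V1)_i||^2 / ||V1||_F^2, capped at 1
col_norms2 = np.sum(V1**2, axis=0)
p = np.minimum(1.0, c * col_norms2 / np.sum(col_norms2))

# Scale by D = diag(1/sqrt(p_i)), then keep column i with probability p_i
D = np.diag(1.0 / np.sqrt(p))
keep = rng.random(n) < p
Vs = (V1 @ D)[:, keep]
```

The number of kept columns ĉ = `Vs.shape[1]` is random, which is exactly the practical issue raised on slide 34.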
31 Analysis of the 2-Phase Algorithm
1 Sample c columns: (V D) P_s = (V_1s D_s; V_2s D_s).
2 RRQR selects k columns: (V_1s D_s) P_d = (V_d …).
Perturbation theory:
min_Z ‖A_1 Z − A_2‖_F ≤ ‖Σ_2‖_F + ‖Σ_2 V_2s D_s‖_F / σ_k(V_d)
With high probability (for c ≈ k log k):
σ_k(V_1s D_s) ≥ 1/2
σ_k(V_d) ≥ σ_k(V_1s D_s)/√(1 + k(ĉ − k))
‖Σ_2 V_2s D_s‖_F ≤ 4 ‖Σ_2‖_F
Hence: min_Z ‖A_1 Z − A_2‖_F ≤ O(k log^{1/2} k) ‖Σ_2‖_F
32 Expected Values of Frobenius Norms
If D_ii = 1/√p_i with probability p_i (and D_ii = 0 otherwise), then
E[‖X D‖²_F] = ‖X‖²_F.
Frobenius norm: ‖X D‖²_F = trace(X D² Xᵀ).
Linearity: E[‖X D‖²_F] = trace(X E[D²] Xᵀ).
Scaling: E[D²_ii] = p_i · (1/p_i) + (1 − p_i) · 0 = 1, so E[D²] = I.
33 From Expected Values to Probability
E[‖X D‖²_F] = ‖X‖²_F
Markov's inequality: for x = ‖X D‖²_F, Prob(x ≥ a) ≤ E(x)/a; take a = 10 E(x).
With probability at most 1/10: ‖X D‖²_F ≥ 10 ‖X‖²_F.
So with probability at least 9/10: ‖X D‖²_F ≤ 10 ‖X‖²_F.
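The expectation identity E‖XD‖²_F = ‖X‖²_F is easy to confirm with a Monte Carlo check; the matrix sizes, probabilities, and trial count below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 8
X = rng.standard_normal((5, n))
p = rng.random(n) * 0.9 + 0.1            # sampling probabilities in (0.1, 1.0)

trials = 20000
vals = np.empty(trials)
for t in range(trials):
    keep = rng.random(n) < p             # column i kept with probability p_i
    d2 = np.where(keep, 1.0 / p, 0.0)    # D_ii^2 = 1/p_i if kept, else 0
    vals[t] = np.sum(X**2 * d2)          # ||X D||_F^2 for this realization

# Monte Carlo mean matches ||X||_F^2 to within a few percent
assert abs(vals.mean() - np.sum(X**2)) / np.sum(X**2) < 0.05
```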
34 Issues with Randomized Algorithms
- How to choose c: 10³ k log k, k log k, 17 k log k, …?
- We do not know the number of sampled columns ĉ in advance.
- The number of sampled columns can be too small: ĉ < k.
- No information about the singular values of the important columns.
- How often does one have to run the algorithm to get a good result?
- How accurately do the singular vectors and singular values have to be computed?
- How sensitive is the algorithm to the choice of probabilities?
- How does the randomized algorithm compare to the deterministic algorithms in accuracy and run time?
35 Summary
Given: a real or complex matrix A and an integer k.
Want: AP = (A_1 A_2), with A_1 of k columns and A_2 of n − k columns.
Important columns A_1: singular values close to the k largest singular values of A.
Redundant columns A_2: ‖(I − A_1 A_1⁺) A_2‖_{2,F} on the order of the smallest singular values of A.
Bounds depend on the dominant k right singular vectors.
Deterministic algorithms: RRQR, SVD.
Randomized algorithm, 2 phases: 1. randomized sampling, 2. RRQR on the samples.
Exact subset selection is hard.
More informationMath 520 Exam 2 Topic Outline Sections 1 3 (Xiao/Dumas/Liaw) Spring 2008
Math 520 Exam 2 Topic Outline Sections 1 3 (Xiao/Dumas/Liaw) Spring 2008 Exam 2 will be held on Tuesday, April 8, 7-8pm in 117 MacMillan What will be covered The exam will cover material from the lectures
More informationDS-GA 1002 Lecture notes 10 November 23, Linear models
DS-GA 2 Lecture notes November 23, 2 Linear functions Linear models A linear model encodes the assumption that two quantities are linearly related. Mathematically, this is characterized using linear functions.
More informationLOW-RANK MATRIX APPROXIMATIONS DO NOT NEED A SINGULAR VALUE GAP
LOW-RANK MATRIX APPROXIMATIONS DO NOT NEED A SINGULAR VALUE GAP PETROS DRINEAS AND ILSE C. F. IPSEN Abstract. Low-rank approximations to a real matrix A can be conputed from ZZ T A, where Z is a matrix
More informationFundamentals of Engineering Analysis (650163)
Philadelphia University Faculty of Engineering Communications and Electronics Engineering Fundamentals of Engineering Analysis (6563) Part Dr. Omar R Daoud Matrices: Introduction DEFINITION A matrix is
More informationSparse Approximation of Signals with Highly Coherent Dictionaries
Sparse Approximation of Signals with Highly Coherent Dictionaries Bishnu P. Lamichhane and Laura Rebollo-Neira b.p.lamichhane@aston.ac.uk, rebollol@aston.ac.uk Support from EPSRC (EP/D062632/1) is acknowledged
More informationPRACTICE FINAL EXAM. why. If they are dependent, exhibit a linear dependence relation among them.
Prof A Suciu MTH U37 LINEAR ALGEBRA Spring 2005 PRACTICE FINAL EXAM Are the following vectors independent or dependent? If they are independent, say why If they are dependent, exhibit a linear dependence
More informationLeast-Squares Fitting of Model Parameters to Experimental Data
Least-Squares Fitting of Model Parameters to Experimental Data Div. of Mathematical Sciences, Dept of Engineering Sciences and Mathematics, LTU, room E193 Outline of talk What about Science and Scientific
More informationCompressed Sensing and Robust Recovery of Low Rank Matrices
Compressed Sensing and Robust Recovery of Low Rank Matrices M. Fazel, E. Candès, B. Recht, P. Parrilo Electrical Engineering, University of Washington Applied and Computational Mathematics Dept., Caltech
More informationKnowledge Discovery and Data Mining 1 (VO) ( )
Knowledge Discovery and Data Mining 1 (VO) (707.003) Review of Linear Algebra Denis Helic KTI, TU Graz Oct 9, 2014 Denis Helic (KTI, TU Graz) KDDM1 Oct 9, 2014 1 / 74 Big picture: KDDM Probability Theory
More informationto be more efficient on enormous scale, in a stream, or in distributed settings.
16 Matrix Sketching The singular value decomposition (SVD) can be interpreted as finding the most dominant directions in an (n d) matrix A (or n points in R d ). Typically n > d. It is typically easy to
More informationLarge Scale Data Analysis Using Deep Learning
Large Scale Data Analysis Using Deep Learning Linear Algebra U Kang Seoul National University U Kang 1 In This Lecture Overview of linear algebra (but, not a comprehensive survey) Focused on the subset
More informationExercise Sheet 1.
Exercise Sheet 1 You can download my lecture and exercise sheets at the address http://sami.hust.edu.vn/giang-vien/?name=huynt 1) Let A, B be sets. What does the statement "A is not a subset of B " mean?
More informationc 2015 Society for Industrial and Applied Mathematics
SIAM J. MATRIX ANAL. APPL. Vol. 36, No. 1, pp. 110 137 c 015 Society for Industrial and Applied Mathematics RANDOMIZED APPROXIMATION OF THE GRAM MATRIX: EXACT COMPUTATION AND PROBABILISTIC BOUNDS JOHN
More informationWhy the QR Factorization can be more Accurate than the SVD
Why the QR Factorization can be more Accurate than the SVD Leslie V. Foster Department of Mathematics San Jose State University San Jose, CA 95192 foster@math.sjsu.edu May 10, 2004 Problem: or Ax = b for
More informationScientific Computing: Dense Linear Systems
Scientific Computing: Dense Linear Systems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course MATH-GA.2043 or CSCI-GA.2112, Spring 2012 February 9th, 2012 A. Donev (Courant Institute)
More informationRandomization: Making Very Large-Scale Linear Algebraic Computations Possible
Randomization: Making Very Large-Scale Linear Algebraic Computations Possible Gunnar Martinsson The University of Colorado at Boulder The material presented draws on work by Nathan Halko, Edo Liberty,
More informationProvably Correct Algorithms for Matrix Column Subset Selection with Selectively Sampled Data
Journal of Machine Learning Research 18 (2018) 1-42 Submitted 5/15; Revised 9/16; Published 4/18 Provably Correct Algorithms for Matrix Column Subset Selection with Selectively Sampled Data Yining Wang
More information