SUPPLEMENTAL NOTES FOR ROBUST REGULARIZED SINGULAR VALUE DECOMPOSITION WITH APPLICATION TO MORTALITY DATA


By Lingsong Zhang, Haipeng Shen and Jianhua Z. Huang

Purdue University, University of North Carolina, and Texas A&M University

Correspondence to Lingsong Zhang. Shen's work was partially supported by NIH/NIDA (grant 1 RC1 DA) and NSF (CMMI and DMS grants). Huang's work was partially supported by NCI (CA57030), NSF (DMS grants), and Award No. KUS-C, made by King Abdullah University of Science and Technology (KAUST).

1. Derivation of the Criteria for Penalty Parameter Selection. We now derive the leave-one-row/column-out cross-validation and generalized cross-validation criteria for penalty parameter selection. The derivation extends the previous work of Huang, Shen and Buja (2009).

When estimating $v$ given $u$, we can select the penalty parameter $\lambda_v$ by using leave-one-column-out cross-validation. Let $\tilde{v}^{(-j)}$ be the estimate of $v$ obtained by leaving out the $j$th column of $X$, and let $\hat{Y}^{(-j)} = U\tilde{v}^{(-j)}$ be the fitted value of $Y$ using $\tilde{v}^{(-j)}$. Let $Y_j$ and $\hat{Y}^{(-j)}_j$ denote the $j$th blocks of $Y$ and $\hat{Y}^{(-j)}$, respectively, corresponding to the $j$th column of $X$. The leave-one-column-out cross-validation score for updating $v$ given $u$ is defined as

(1)   $\mathrm{CV}(\lambda_v \mid \lambda_u) = \frac{1}{n}\sum_{j=1}^{n} (u\tilde{v}^{(-j)}_j - x_j)^T W_j (u\tilde{v}^{(-j)}_j - x_j) = \frac{1}{n}\sum_{j=1}^{n} (Y_j - \hat{Y}^{(-j)}_j)^T W_j (Y_j - \hat{Y}^{(-j)}_j),$

where $W_j$ is the $j$th diagonal block of $W$. We now derive a shortcut formula for the cross-validation score so that there is no need to actually perform the leave-one-column-out operation when implementing the method, which would be computationally expensive.

Let $\tilde{v}$ denote the estimate of $v$ given in Equation (7) of the main paper, with $j$th element $\tilde{v}_j$. Let $X^*$ be the same as $X$, except that its $j$th column is replaced by $\tilde{v}^{(-j)}_j u$, and let $Y^* = S\,\mathrm{vec}(X^*)$. Note that this means the $j$th blocks (each consisting of $m$ elements) of $Y$ and $Y^*$ differ, while the remaining blocks are the same.

By the definition of $H$, we have that

$\hat{Y}^{(-j)} = U\tilde{v}^{(-j)} = HY^*.$

Thus,

(2)   $Y - \hat{Y}^{(-j)} = Y - HY^* = Y - HY - H(Y^* - Y).$

Denote by $H_{jj}$ the $j$th diagonal block of $H$ (an $m \times m$ square matrix). Since only the $j$th block of $Y^* - Y$ is nonzero, inspecting the $j$th block of both sides of equation (2) gives

$Y_j - \hat{Y}^{(-j)}_j = Y_j - \hat{Y}_j + H_{jj}(Y_j - \hat{Y}^{(-j)}_j).$

Rearranging this equation leads to the following leave-one-out lemma.

Lemma 1. The leave-$j$th-column-out cross-validated residual for estimating $v$ given $u$ is

$Y_j - \hat{Y}^{(-j)}_j = (I - H_{jj})^{-1}(Y_j - \hat{Y}_j).$

Next, we use Lemma 1 to give a simple expression for the leave-one-column-out cross-validation score. Let $O = (U^T W U + 2\Omega_{v|u})^{-1}$. We can show that $H_{jj} = O_{jj}\, u u^T W_j$, where $O_{jj}$ is the $(j,j)$th entry of the matrix $O$. This expression and Lemma 1 can be used to derive the following result, whose proof is given at the end of this section.

Lemma 2. The weighted $j$th leave-one-column-out cross-validation error sum of squares $(u\tilde{v}^{(-j)}_j - x_j)^T W_j (u\tilde{v}^{(-j)}_j - x_j)$ can be written as

(3)   $x_j^T W_j x_j - \dfrac{(x_j^T W_j u)^2}{u^T W_j u} + u^T W_j u\, \dfrac{(\tilde{v}_j - x_j^T W_j u / u^T W_j u)^2}{(1 - O_{jj}\, u^T W_j u)^2}.$

By definition, the leave-one-column-out cross-validation score is the average over $j$ of the expression given in (3). Since the first two terms do not depend on $\lambda_v$ when conditioning on $u$, the cross-validation score can be equivalently expressed as the average over $j$ of the last term in (3), which is (excluding the irrelevant factor $u^T W_j u$)

$\mathrm{CV}(\lambda_v \mid \lambda_u) = \frac{1}{n}\sum_{j=1}^{n} \dfrac{(\tilde{v}_j - x_j^T W_j u / u^T W_j u)^2}{(1 - O_{jj}\, u^T W_j u)^2}.$

Note that $O_{jj}\, u^T W_j u$ is a scalar, so $O_{jj}\, u^T W_j u = \mathrm{tr}(O_{jj}\, u u^T W_j) = \mathrm{tr}(H_{jj})$. Consequently, we obtain the following expression for the leave-one-column-out cross-validation score:

(4)   $\mathrm{CV}(\lambda_v \mid \lambda_u) = \frac{1}{n}\sum_{j=1}^{n} \dfrac{(\tilde{v}_j - x_j^T W_j u / u^T W_j u)^2}{(1 - \mathrm{tr}(H_{jj}))^2}.$

Replacing $\mathrm{tr}(H_{jj})$ in $\mathrm{CV}(\lambda_v \mid \lambda_u)$ with its average over $j$, $\mathrm{tr}(H)/n$, leads to the GCV criterion

(5)   $\mathrm{GCV}(\lambda_v \mid \lambda_u) = \frac{1}{n}\sum_{j=1}^{n} \dfrac{(\tilde{v}_j - x_j^T W_j u / u^T W_j u)^2}{(1 - \mathrm{tr}(H)/n)^2}.$

It can be shown that the $j$th component of $\hat{v} = (U^T W U)^{-1} U^T W Y$, the unregularized update of $v$, is $\hat{v}_j = x_j^T W_j u / u^T W_j u$. Thus, the GCV formula can also be written as

$\mathrm{GCV}(\lambda_v \mid \lambda_u) = \dfrac{\|\tilde{v} - \hat{v}\|^2 / n}{(1 - \mathrm{tr}(H)/n)^2}.$

The GCV formula for selecting the penalty parameter $\lambda_u$ when updating $u$ given $v$ can be derived in a similar manner.

1.1. Proof of Lemma 2. By the result in Lemma 1, the weighted $j$th leave-one-column-out cross-validation error sum of squares (denoted by $r_j$) is

(6)   $r_j = (u\tilde{v}^{(-j)}_j - x_j)^T W_j (u\tilde{v}^{(-j)}_j - x_j) = \{(I - H_{jj})^{-1}(u\tilde{v}_j - x_j)\}^T W_j \{(I - H_{jj})^{-1}(u\tilde{v}_j - x_j)\}.$

Since $H_{jj} = O_{jj}\, u u^T W_j$, using the Sherman--Morrison formula (see (A.2.1) on page 210 of Cook and Weisberg, 1982), we obtain

$(I - H_{jj})^{-1} = I + \dfrac{O_{jj}\, u u^T W_j}{1 - O_{jj}\, u^T W_j u}.$

Thus, letting $z = u\tilde{v}_j - x_j$, we have that

(7)   $(I - H_{jj})^{-1}(u\tilde{v}_j - x_j) = z + \dfrac{O_{jj}\, u u^T W_j z}{1 - O_{jj}\, u^T W_j u}.$

Moreover,

(8)   $z^T W_j z - \dfrac{(u^T W_j z)^2}{u^T W_j u} = x_j^T W_j x_j - \dfrac{(u^T W_j x_j)^2}{u^T W_j u}.$

Plugging (7) into (6), expanding the quadratic form, and applying some additional algebra, we obtain

$r_j = z^T W_j z - \dfrac{(u^T W_j z)^2}{u^T W_j u} + u^T W_j u\, \dfrac{(u^T W_j z / u^T W_j u)^2}{(1 - O_{jj}\, u^T W_j u)^2}.$

The desired result then follows from (8) and the definition of $z$.
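As an illustration of the shortcut, the GCV criterion (5) can be evaluated without any leave-one-out refitting. The following Python sketch assumes cellwise robustness weights, so that $W_j = \mathrm{diag}(w_{1j}, \dots, w_{mj})$, and an effective penalty matrix $2\lambda_v\Omega$ for updating $v$ given $u$; these assumptions, and all variable names, are illustrative rather than the authors' implementation.

    import numpy as np

    def gcv_for_v(X, u, w, Omega, lam_v):
        """GCV criterion (5) for selecting lambda_v given u (schematic sketch).

        X     : m x n data matrix
        u     : current left vector (length m)
        w     : m x n matrix of nonnegative robustness weights (assumption)
        Omega : n x n roughness penalty matrix for v (assumption)
        """
        n = X.shape[1]
        D = np.array([u @ (w[:, j] * u) for j in range(n)])        # u^T W_j u
        c = np.array([u @ (w[:, j] * X[:, j]) for j in range(n)])  # u^T W_j x_j
        v_hat = c / D                                    # unregularized update of v
        O = np.linalg.inv(np.diag(D) + 2.0 * lam_v * Omega)        # matrix O
        v_til = O @ c                                    # regularized update of v
        tr_H = np.sum(np.diag(O) * D)                    # tr(H) = sum_j tr(H_jj)
        return np.sum((v_til - v_hat) ** 2) / n / (1.0 - tr_H / n) ** 2

In practice one would evaluate this function over a grid of $\lambda_v$ values and select the minimizer; the criterion for $\lambda_u$ is handled analogously.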

2. Dealing with Missing Data via the MM Algorithm. We show that the missing-value handling approach described in Section 2.5 of the main paper is an application of the MM algorithm (Hunter and Lange, 2004) and has desirable properties. Let $I_o$ be the set of indices of all $(i, j)$ pairs for which $x_{ij}$ is observable; the indices outside $I_o$ correspond to missing observations. In the presence of missing observations, the RobRSVD criterion function to be minimized is

(9)   $R(u, v) = \sum_{(i,j)\in I_o} \rho\!\left(\dfrac{x_{ij} - u_i v_j}{\sigma}\right) + P_\lambda(u, v),$

where $P_\lambda(u, v)$ is the penalty function defined in Equation (4) of the main paper. Suppose $u^0$ and $v^0$ are some initial guesses of $u$ and $v$. Define $\tilde{X} = (\tilde{x}_{ij})$ with

$\tilde{x}_{ij} = \begin{cases} x_{ij}, & (i,j) \in I_o, \\ u^0_i v^0_j, & (i,j) \notin I_o. \end{cases}$

Define the surrogate criterion function

$\tilde{R}(u, v; u^0, v^0) = \sum_{i,j} \rho\!\left(\dfrac{\tilde{x}_{ij} - u_i v_j}{\sigma}\right) + P_\lambda(u, v).$

It is easily seen that $\tilde{R}(u, v; u^0, v^0) \ge R(u, v)$, with equality when $u = u^0$ and $v = v^0$, and so $\tilde{R}(u, v; u^0, v^0)$ is a majorizing function of $R(u, v)$. The MM algorithm starts from some initial guesses $u^0$ and $v^0$, minimizes the surrogate function $\tilde{R}(u, v; u^0, v^0)$ over $u$ and $v$, updates the initial guesses with the current minimizers, and iterates until convergence. Let $(u^m, v^m)$, $m = 0, 1, 2, \dots$, be the sequence of minimizers generated by the algorithm. We have that

$R(u^{m+1}, v^{m+1}) \le \tilde{R}(u^{m+1}, v^{m+1}; u^m, v^m) \le \tilde{R}(u^m, v^m; u^m, v^m) = R(u^m, v^m).$

Thus the criterion value decreases with the number of iterations, and the algorithm is guaranteed to converge to a local minimum.
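For illustration, the iteration just described can be sketched as follows. The rank-one robust fit is abstracted as a user-supplied function fit_rank_one, a hypothetical stand-in for the RobRSVD update of the main paper; only the fill-in/refit structure follows the text.

    import numpy as np

    def mm_impute(X, fit_rank_one, n_iter=50, tol=1e-6):
        """MM iteration for missing cells (schematic sketch of Section 2).

        X            : m x n array with np.nan marking cells outside I_o
        fit_rank_one : callable mapping a complete matrix to (u, v); a
                       hypothetical stand-in for the RobRSVD rank-one fit
        """
        obs = ~np.isnan(X)                          # observed index set I_o
        u, v = fit_rank_one(np.where(obs, X, 0.0))  # crude initial fill-in
        for _ in range(n_iter):
            X_tilde = np.where(obs, X, np.outer(u, v))  # x~_ij = u_i v_j off I_o
            u_new, v_new = fit_rank_one(X_tilde)        # minimize the surrogate
            done = np.linalg.norm(np.outer(u_new, v_new) - np.outer(u, v)) < tol
            u, v = u_new, v_new
            if done:
                break
        return u, v

Because each pass minimizes the majorizing surrogate $\tilde{R}(u, v; u^m, v^m)$, the criterion $R(u, v)$ in (9) cannot increase between iterations.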

3. Additional Simulation Studies.

3.1. Rank One Signal Matrix with Missing Values. This section provides detailed results for the simulation study reported in Section 3.2 of the main paper. Our motivating Spanish mortality data contain both outliers and missing values, which leads us to investigate the performance of RobRSVD in the presence of missing values. For each simulated data set in Section 3.1 of the main paper, we randomly select and delete 100 cells to form a new data set with missing values. We then use the imputation method described in Section 2.5 of the main paper to estimate $u$ and $v$ for SVD, RSVD and RobRSVD. Figures 1 and 2 compare the boxplots of the $L_2$ distances between the estimates and the truth for $u$ and $v$, respectively. We can clearly see that RobRSVD remains the winner across all the settings considered.

Fig 1: Rank One Simulation with Missing Values: Boxplots of the $L_2$ distance between $\hat{u}$ and $u$.
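The cell-deletion step used to create the missing-value data sets can be sketched as follows; the NaN encoding of missing cells and the random seed are illustrative assumptions, not the authors' code.

    import numpy as np

    def delete_cells(X, n_missing=100, seed=0):
        """Randomly delete n_missing cells from X, marking them as NaN."""
        rng = np.random.default_rng(seed)
        X_miss = X.astype(float).copy()
        idx = rng.choice(X.size, size=n_missing, replace=False)
        X_miss[np.unravel_index(idx, X.shape)] = np.nan
        return X_miss

The resulting matrix can then be passed to the MM imputation procedure of Section 2.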

Fig 2: Rank One Simulation with Missing Values: Boxplots of the $L_2$ distance between $\hat{v}$ and $v$.

3.2. Rank Two Signal Matrix. This section provides detailed results for the simulation study reported in Section 3.3 of the main paper. We now study the situation where the true signal matrix has rank two, using a simulation setting similar to the one considered by Huang, Shen and Buja (2009). In particular, let $U^*_1(y) = \sin(2\pi y)$, $U^*_2(y) = \sin(2\pi(y - 0.25))$, $V^*_1(z) = \exp(-4(z - 0.25)^2)$, $V^*_2(z) = \exp(-4(z - 0.75)^2)$, and consider the following true rank-two two-way functional model:

(10)   $X(y, z) = 100\, U_1(y) V_1(z) + 50\, U_2(y) V_2(z) + \epsilon(y, z),$

with $y \in [0, 1]$ and $z \in [0, 1]$. Here $U_k$ and $V_k$ are the normalized versions of $U^*_k$ and $V^*_k$. To simulate the functional data matrix with no outliers, we consider 100 equally spaced grid points in each direction and sample $\epsilon(y, z)$ independently from a mean-zero normal distribution. The simulation is again repeated 100 times. Similar to the rank-one setting in Section 3.1 of the main paper, we consider the following simulation scenarios:

1. No outliers: The benchmark setting to see how RobRSVD compares with the non-robust methods when there are no outliers.

2. Outlying cells: We randomly select 25 cells in the data and replace their entries with outlying values, in particular values that are randomly simulated from $U[C_1, 2C_1]$, the uniform distribution with support $[C_1, 2C_1]$, with $C_1$ defined similarly as in Section 3.1 of the main paper.

3. Outlying curves: We randomly select five rows and replace them by five outlying curves, defined as $Y_k \sin(4\pi z)$ plus noise, where $Y_k$ is a random number generated from $U[C_1, 2C_1]$.

4. Outlying block: We randomly select a contiguous square block of cells, at most a quarter of the whole matrix in size, and add to the cell entries a random number generated from $U[C_1, 2C_1]$.

5. Outlying diagonal: We replace the diagonal entries with numbers generated from $U[C_1, 2C_1]$.

As pointed out by Huang, Shen and Buja (2009), the defining decomposition in (10) is not in SVD form, as the components are not orthogonal. Hence, below we compare how the three methods estimate the true underlying rank-two signal and the corresponding rank-two subspaces spanned by the first two left and right singular vectors, respectively. We use the following two criteria to gauge the performance of the three methods. The first criterion is the Frobenius distance between the true rank-two signal matrix $X_0$ and the estimated best rank-two matrix $\hat{X}_0$, i.e., the Frobenius norm of the approximation error matrix $\hat{X}_0 - X_0$. Figure 3 summarizes the comparison of the Frobenius distances for the three methods. In all cases with outliers, RobRSVD performs the best, having the smallest average distance and variability. When the data have no outliers, RobRSVD and RSVD perform similarly and both are better than SVD.
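To make the setting concrete, the following sketch generates one data matrix from model (10) and applies the outlying-cells contamination of scenario 2. The grid size follows the text, while the noise standard deviation and the constant $C_1$ are placeholder values (the actual values are given in the main paper), so the sketch is illustrative only.

    import numpy as np

    def simulate_rank_two(n_grid=100, sigma=1.0, C1=50.0, n_outlier_cells=25, seed=0):
        """Simulate one matrix from model (10), optionally with outlying cells.

        sigma and C1 are placeholder values (assumptions); the values actually
        used are specified in the main paper.
        """
        rng = np.random.default_rng(seed)
        y = np.linspace(0.0, 1.0, n_grid)
        z = np.linspace(0.0, 1.0, n_grid)
        U1, U2 = np.sin(2 * np.pi * y), np.sin(2 * np.pi * (y - 0.25))
        V1, V2 = np.exp(-4 * (z - 0.25) ** 2), np.exp(-4 * (z - 0.75) ** 2)
        # U_k, V_k in (10) are the normalized versions of U*_k, V*_k
        U1, U2 = U1 / np.linalg.norm(U1), U2 / np.linalg.norm(U2)
        V1, V2 = V1 / np.linalg.norm(V1), V2 / np.linalg.norm(V2)
        X = 100 * np.outer(U1, V1) + 50 * np.outer(U2, V2)
        X = X + rng.normal(scale=sigma, size=X.shape)
        if n_outlier_cells > 0:                      # scenario 2: outlying cells
            idx = rng.choice(X.size, size=n_outlier_cells, replace=False)
            X.flat[idx] = rng.uniform(C1, 2 * C1, size=n_outlier_cells)
        return X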

Fig 3: Rank Two Simulation: The Frobenius norm of the difference matrices for the rank-two approximations by the different methods.

Fig 4: Rank Two Simulation: The principal angle between $\hat{\mathcal{U}}$ and $\mathcal{U}$.

As our second measure, we use the largest principal angle (Golub and Van Loan, 1996) between the true subspace and the subspace spanned by the corresponding singular vector estimates, which measures the closeness of the two subspaces. Specifically, let $\mathcal{U} = \mathrm{span}(U_1, U_2)$ denote the linear subspace spanned by $U_1(y)$ and $U_2(y)$ evaluated at the grid points, and let $\hat{\mathcal{U}}$ be the corresponding estimate of this subspace. The largest principal angle between $\mathcal{U}$ and $\hat{\mathcal{U}}$ can be computed as $\cos^{-1}(\rho) \times 180/\pi$, where $\rho$ is the minimum singular value of the matrix $Q_{\hat{U}}^T Q_U$, and $Q_{\hat{U}}$ and $Q_U$ are orthonormal basis matrices obtained from the QR decompositions of $\hat{U}$ and $U$, respectively. Similarly, we can define $\mathcal{V} = \mathrm{span}(V_1, V_2)$ and its estimate $\hat{\mathcal{V}}$, and calculate the principal angle between these two subspaces. Figures 4 and 5 compare the boxplots of the principal angles between $\mathcal{U}$ and its estimate $\hat{\mathcal{U}}$ and between $\mathcal{V}$ and its estimate $\hat{\mathcal{V}}$, respectively, obtained from the 100 simulation runs. When the data have no outliers, RSVD performs the best when estimating $\mathcal{U}$, and RSVD and RobRSVD perform similarly, both better than SVD, when estimating $\mathcal{V}$. For all the outlying settings, RobRSVD has the best performance.
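The largest principal angle described above can be computed directly from QR and singular value decompositions; the following is a minimal sketch, not the authors' code.

    import numpy as np

    def largest_principal_angle(A, B):
        """Largest principal angle (in degrees) between the column spaces of A and B.

        A, B : matrices whose columns span the two subspaces, e.g. the true and
               estimated first two (left or right) singular vectors on the grid.
        """
        QA, _ = np.linalg.qr(A)                   # orthonormal basis for span(A)
        QB, _ = np.linalg.qr(B)                   # orthonormal basis for span(B)
        s = np.linalg.svd(QA.T @ QB, compute_uv=False)
        rho = np.clip(s.min(), -1.0, 1.0)         # cosine of the largest angle
        return np.degrees(np.arccos(rho))

This is the quantity summarized in Figures 4 and 5 for the left and right subspaces, respectively.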

Fig 5: Rank Two Simulation: The principal angle between $\hat{\mathcal{V}}$ and $\mathcal{V}$.

4. Additional Analysis on the Spanish Mortality Data. In addition to the individual pairs of singular vectors, we also compare the cumulative approximation performance of the three methods, as well as the corresponding approximation errors, in Figure 6. The top row shows three-dimensional surface plots of the best rank-two approximations, where we focus on ages between 11 and 50 to highlight the comparison. Again, one can clearly see in the SVD and RSVD approximations the two outlying strips, including the one around 1918. As a comparison, the RobRSVD approximation depicts a smooth two-way pattern that is much less affected by the outliers. The color scales suggest that the range of the SVD and RSVD approximations is much wider than that of RobRSVD. The bottom row plots the corresponding approximation error surfaces. Note that the residuals in the outlier periods appear much larger in the RobRSVD plot.

Fig 6: Comparison of the Cumulative Rank-Two Approximations (top row) and the Corresponding Approximation Errors (bottom row) for the three methods; the axes are Year and Age.

References.

Cook, R. D. and Weisberg, S. (1982). Residuals and Influence in Regression. Chapman and Hall, New York.

Golub, G. H. and Van Loan, C. F. (1996). Matrix Computations, 3rd ed. The Johns Hopkins University Press.

Huang, J. Z., Shen, H. and Buja, A. (2009). The analysis of two-way functional data using two-way regularized singular value decompositions. Journal of the American Statistical Association.

Hunter, D. R. and Lange, K. (2004). A tutorial on MM algorithms. The American Statistician.

Department of Statistics, Purdue University, 150 N. University St., West Lafayette, IN

Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC

Department of Statistics, Texas A&M University, 3143 TAMU, College Station, TX
