Secrets of Matrix Factorization: Supplementary Document


Je Hyeong Hong, University of Cambridge        Andrew Fitzgibbon, Microsoft, Cambridge, UK

Abstract

This document provides the details referenced by the original submission article. These include a list of symbols, some matrix identities and calculus rules, a list of VarPro derivatives, a symmetry analysis of the un-regularized VarPro Gauss-Newton matrix, and the details of the MTSS computation.

Symbol      Meaning
R^{p×q}     The space of real matrices of size p × q
S^p         The space of symmetric real matrices of size p × p
M           The measurement matrix ∈ R^{m×n} (may contain noise and missing data)
W           The weight matrix ∈ R^{m×n}
U           The first decomposition matrix ∈ R^{m×r}
V           The second decomposition matrix ∈ R^{n×r}
u           vec(U) ∈ R^{mr}
v           vec(V^T) ∈ R^{nr}
m           The number of rows in M
n           The number of columns in M
p           The number of visible entries in M
p_j         The number of visible entries in the j-th column of M
f           The objective ∈ R
ε           The vector objective ∈ R^p
J           Jacobian
H           Hessian
V*          arg min_V f(U, V) ∈ R^{n×r}
Π           Projection matrix ∈ R^{p×mn} selecting the nonzero entries of vec(W)
W̃           Π diag(vec(W)) ∈ R^{p×mn}
m̃           W̃ vec(M) ∈ R^p
Ũ           W̃ (I_n ⊗ U) ∈ R^{p×nr}
Ṽ           W̃ (V ⊗ I_m) ∈ R^{p×mr}
K_{mr}      The permutation matrix satisfying vec(U^T) = K_{mr} vec(U), ∈ R^{mr×mr}
K_{nr}      The permutation matrix satisfying vec(V^T) = K_{nr} vec(V), ∈ R^{nr×nr}
R           W ⊙ (U V^T − M) ∈ R^{m×n}
Z           (W ⊙ R) ⊗ I_r ∈ R^{mr×nr}

Table 1: A list of symbols used in [1].
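As a minimal illustration of these definitions (not part of the original supplementary code), the NumPy sketch below forms W̃, m̃, Ũ and Ṽ from small dense inputs. The function name and variable layout are hypothetical; vec(·) is taken column-major throughout.

```python
import numpy as np

def projected_quantities(M, W, U, V):
    """Form W~, m~, U~ and V~ of Table 1 for small dense inputs."""
    m, n = M.shape
    w = W.flatten(order='F')                     # vec(W), column-major
    visible = np.flatnonzero(w)                  # indices of the p visible entries
    W_tilde = np.diag(w)[visible, :]             # W~ = Pi diag(vec(W)),   p x mn
    m_tilde = W_tilde @ M.flatten(order='F')     # m~ = W~ vec(M),         p
    U_tilde = W_tilde @ np.kron(np.eye(n), U)    # U~ = W~ (I_n kron U),   p x nr
    V_tilde = W_tilde @ np.kron(V, np.eye(m))    # V~ = W~ (V kron I_m),   p x mr
    return W_tilde, m_tilde, U_tilde, V_tilde

# With u = vec(U) and v = vec(V^T), the residual can be written either way:
# eps = U_tilde @ V.T.flatten(order='F') - m_tilde
#     = V_tilde @ U.flatten(order='F') - m_tilde
```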

1. Symbols

A list of symbols used in [1] can be found in Table 1.

2. Matrix identities and calculus rules

Some of the key matrix identities and calculus rules in [6, 5] are listed and extended below for convenience.

2.1. Vec and Kronecker product identities

Given that each bold lower-case letter denotes an arbitrary column vector of appropriate size, we have

vec(A ⊙ B) = diag(vec A) vec(B),                              (1)
vec(AXB) = (B^T ⊗ A) vec(X),                                  (2)
vec(AB) = (B^T ⊗ I) vec(A) = (I ⊗ A) vec(B),                  (3)
(A ⊗ B)(C ⊗ D) = AC ⊗ BD,                                     (4)
(A ⊗ b) C = AC ⊗ b,                                           (5)
vec(A^T) = K vec(A),                                          (6)
(A ⊗ B) = K_1 (B ⊗ A) K_2, and                                (7)
(A ⊗ b) = K (b ⊗ A),                                          (8)

where K, K_1 and K_2 are permutation (commutation) matrices of appropriate sizes.

2.2. Differentiation of the µ-pseudo-inverse

Let us first define the µ-pseudo-inverse X^{-µ} := (X^T X + µI)^{-1} X^T. The derivative of the µ-pseudo-inverse is

d[X^{-µ}] = d[(X^T X + µI)^{-1}] X^T + (X^T X + µI)^{-1} d[X^T]                                                    (9)
          = −(X^T X + µI)^{-1} d[X^T X + µI] (X^T X + µI)^{-1} X^T + (X^T X + µI)^{-1} d[X^T]                      (10)
          = −(X^T X + µI)^{-1} (d[X^T] X + X^T d[X]) (X^T X + µI)^{-1} X^T + (X^T X + µI)^{-1} d[X^T]              (11)
          = −(X^T X + µI)^{-1} d[X^T] X X^{-µ} − X^{-µ} d[X] X^{-µ} + (X^T X + µI)^{-1} d[X^T]                     (12)
          = −X^{-µ} d[X] X^{-µ} + (X^T X + µI)^{-1} d[X^T] (I − X X^{-µ}).                                         (13)

The derivative of the pseudo-inverse X^† := (X^T X)^{-1} X^T is then simply the above with µ = 0.

3. Derivatives for variable projection (VarPro)

A full list of derivatives for un-regularized VarPro can be found in Table 2. The derivations are given in [2], where the regularized setting is also discussed.

3.1. Differentiating v*(u)

From [1], we have

v*(u) := arg min_v f(u, v) = Ũ^{-µ} m̃.                        (14)

Substituting (14) into the original objective yields

ε(u, v*(u)) = Ũ v* − m̃ = −(I − Ũ Ũ^{-µ}) m̃.                   (15)

Taking d[v*] and applying (13) gives us

d[v*] = −Ũ^{-µ} d[Ũ] Ũ^{-µ} m̃ + (Ũ^T Ũ + µI)^{-1} d[Ũ^T] (I − Ũ Ũ^{-µ}) m̃                                        (16)
      = −Ũ^{-µ} d[Ũ] v* − (Ũ^T Ũ + µI)^{-1} d[Ũ^T] ε          / noting v*(u) = Ũ^{-µ} m̃ and (15)                  (17)
      = −Ũ^{-µ} Ṽ du − (Ũ^T Ũ + µI)^{-1} Z^T K_{mr} du,       / noting bilinearity and the results in [2]          (18)

and hence

dv*/du = −(Ũ^T Ũ + µI)^{-1} (Ũ^T Ṽ + Z^T K_{mr}).             (19)
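The differential (13) can be checked numerically. The sketch below (illustrative only; the dimensions, function names and tolerances are made up for the test) compares (13) against a central finite difference along a random perturbation.

```python
import numpy as np

def mu_pinv(X, mu):
    """mu-pseudo-inverse X^{-mu} := (X^T X + mu I)^{-1} X^T."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + mu * np.eye(n), X.T)

def d_mu_pinv(X, dX, mu):
    """Directional derivative of the mu-pseudo-inverse, eq. (13)."""
    m, n = X.shape
    Xmu = mu_pinv(X, mu)
    A = np.linalg.inv(X.T @ X + mu * np.eye(n))
    return -Xmu @ dX @ Xmu + A @ dX.T @ (np.eye(m) - X @ Xmu)

# Finite-difference check of eq. (13) along a random direction dX.
rng = np.random.default_rng(0)
X, dX, mu, t = rng.standard_normal((6, 3)), rng.standard_normal((6, 3)), 0.1, 1e-6
numeric = (mu_pinv(X + t * dX, mu) - mu_pinv(X - t * dX, mu)) / (2 * t)
print(np.max(np.abs(numeric - d_mu_pinv(X, dX, mu))))  # should be small (finite-difference error)
```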

Quantities, definitions and expressions for un-regularized VarPro are listed below. Untagged terms correspond to ALS (RW3); bracketed terms tagged [·]_{RW2}, [·]_{RW1} and [·]_{FN} are additionally included by approximate Gauss-Newton (RW2), full Gauss-Newton (RW1) and full Newton respectively. The expressions are shown with damping λ and without the projection constraint P unless stated.

v*(u) ∈ R^{nr}
    Definition: v* := arg min_v f(u, v)
    Expression: v* = Ũ^† m̃   (m̃ ∈ R^p consists of the weighted non-zero entries of vec(M).)

dv*/du ∈ R^{nr×mr}
    Definition: dv*/du := d vec(V*^T) / d vec(U)
    Expression: dv*/du = −[Ũ^† Ṽ]_{RW2} − [(Ũ^T Ũ)^{-1} Z^T K_{mr}]_{RW1}

Cost vector ε ∈ R^p
    Definition: ε := Ũ v* − m̃

Jacobian J ∈ R^{p×mr}
    Definition: J := dε/du
    Expression: J = (I_p − [Ũ Ũ^†]_{RW2}) Ṽ − [Ũ^{†T} Z^T K_{mr}]_{RW1}

Cost function f ∈ R
    Definition: f := ‖ε‖₂²
    Expression: f = ‖W ⊙ (U V*^T − M)‖_F²

Ordinary gradient g ∈ R^{mr}
    Definition: g := df/du
    Expression: (1/2) g = Ṽ^T ε = vec((W ⊙ R) V*)

Projected gradient g_p ∈ R^{mr}
    Definition: g_p := P_p^T g   (P_p := I_{mr} − I_r ⊗ U U^T)
    Expression: (1/2) g_p = (1/2) g

Reduced gradient g_r ∈ R^{mr−r²}
    Definition: g_r := P_r^T g   (P_r := I_r ⊗ U_⊥, with U_⊥ a basis of the complement of span(U))
    Expression: (1/2) g_r = (1/2) P_r^T g

Ordinary Hessian H_GN, H ∈ S^{mr}
    Definitions: (1/2) H_GN := J^T J + λI,   (1/2) H := (1/2) dg/du + λI
    Expressions:
    (1/2) H_GN = Ṽ^T (I_p − [Ũ Ũ^†]_{RW2}) Ṽ + [K_{mr}^T Z (Ũ^T Ũ)^{-1} Z^T K_{mr}]_{RW1} + λ I_{mr}
    (1/2) H    = (1/2) H_GN − [2 K_{mr}^T Z (Ũ^T Ũ)^{-1} Z^T K_{mr}]_{FN} − [K_{mr}^T Z Ũ^† Ṽ + Ṽ^T Ũ^{†T} Z^T K_{mr}]_{FN} + λ I_{mr}

Projected Hessian H_pGN, H_p ∈ S^{mr}
    Definitions: (1/2) H_pGN := P_p^T J^T J P_p + λI,   (1/2) H_p := (1/2) P_p^T (dg_p/du) P_p + λI   (P_p := I_r ⊗ (I_m − U U^T))
    Expressions:
    (1/2) H_pGN = (1/2) H_GN
    (1/2) H_p   = (1/2) H_pGN − [2 K_{mr}^T Z (Ũ^T Ũ)^{-1} Z^T K_{mr}]_{FN} − [K_{mr}^T Z Ũ^† Ṽ P_p + P_p^T Ṽ^T Ũ^{†T} Z^T K_{mr}]_{FN} + α (I_r ⊗ U U^T) + λ I_{mr}

Reduced Hessian H_r ∈ S^{mr−r²}
    Definition: (1/2) H_r := (1/2) P_r^T H_p P_r   (P_r := I_r ⊗ U_⊥)
    Expression: (1/2) H_r = (1/2) P_r^T H_p P_r = (1/2) P_r^T H P_r = (1/2) P_r^T (dg/du) P_r + λI

Auxiliary definitions: K_{mr} vec(U) = vec(U^T),   R := W ⊙ (U V*^T − M),   Z := (W ⊙ R) ⊗ I_r.

Ordinary update (update for U):                     U ← U − unvec(H^{-1} g)
Projected update (Grassmann manifold retraction):   U ← qf(U − unvec(H_p^{-1} g_p))
Reduced update (Grassmann manifold retraction):     U ← qf(U − unvec(P_r H_r^{-1} g_r))

Table 2: List of derivatives and extensions for un-regularized variable projection on low-rank matrix factorization with missing data, including the terms used by Ruhe and Wedin's algorithms (RW1-RW3).
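To make the first few rows of Table 2 concrete, here is a small dense NumPy sketch (not the authors' implementation; names are illustrative) that evaluates v*, ε, f and (1/2)g for µ = 0. It forms the Kronecker-structured matrices explicitly, so it is only suitable for tiny problems; practical implementations instead exploit the column-wise independence of v* described in Section 4.1.

```python
import numpy as np

def varpro_cost_and_gradient(M, W, U):
    """Evaluate v*(u), eps, f and (1/2)g from Table 2 (mu = 0, dense sketch)."""
    m, n = M.shape
    r = U.shape[1]
    w = W.flatten(order='F')
    keep = np.flatnonzero(w)
    W_tilde = np.diag(w)[keep, :]                               # W~ = Pi diag(vec W)
    m_tilde = W_tilde @ M.flatten(order='F')                    # m~
    U_tilde = W_tilde @ np.kron(np.eye(n), U)                   # U~
    v_star = np.linalg.lstsq(U_tilde, m_tilde, rcond=None)[0]   # v* = U~^+ m~
    V_star = v_star.reshape(n, r)                               # rows of V*, since v* = vec(V*^T)
    eps = U_tilde @ v_star - m_tilde                            # cost vector
    f = eps @ eps                                               # f = ||eps||_2^2
    R = W * (U @ V_star.T - M)                                  # R = W .* (U V*^T - M)
    half_g = ((W * R) @ V_star).flatten(order='F')              # (1/2) g = vec((W .* R) V*)
    return v_star, eps, f, half_g
```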

4. Additional symmetry of the Gauss-Newton matrix for un-regularized VarPro

To observe the additional symmetry, we first need to know how the un-regularized VarPro problem can be reformulated in terms of column-wise derivatives. Below is a summary of these derivatives from [2].

4.1. Summary of column-wise derivatives for VarPro

The un-regularized objective f(u, v*(u)) can be written as the sum of the squared norms of the columns of R(u, v*(u)), i.e.

f(u, v*(u)) := Σ_{j=1}^{n} ‖W̃_j (U v*_j − m_j)‖₂²,            (20)

where v*_j is the j-th row of V*(U) (written as a column vector), m_j is the j-th column of M, and W̃_j consists of the non-zero rows of diag(vec(w_j)), where w_j ∈ R^m is the j-th column of W. Hence, W̃_j is a p_j × m weight-projection matrix, where p_j is the number of non-missing entries in column j. Now define the following:

Ũ_j := W̃_j U,                                                 (21)
m̃_j := W̃_j m_j, and                                           (22)
ε_j := Ũ_j v*_j − m̃_j.                                        (23)

Hence,

f(u, v*(u)) = Σ_{j=1}^{n} ‖Ũ_j v*_j − m̃_j‖₂²,                 (24)

which shows that each row of V* is independent of the other rows, and

v*_j := arg min_{v_j} ‖Ũ_j v_j − m̃_j‖₂² = Ũ_j^† m̃_j.          (25)

4.1.1. Some definitions

We define

Ṽ_j := W̃_j (v*_j^T ⊗ I_m) ∈ R^{p_j × mr}, and                 (26)
Z_j := (w_j ⊙ r_j) ⊗ I_r ∈ R^{mr × r}.                        (27)

4.1.2. Column-wise Jacobian J_j

The Jacobian matrix is also independent for each column. From [2], the Jacobian matrix for column j, J_j, is

J_j = (I − Ũ_j Ũ_j^†) Ṽ_j − Ũ_j^{†T} Z_j^T K_{mr}              (28)
    = (I − Ũ_j Ũ_j^†) Ṽ_j − Ũ_j ((Ũ_j^T Ũ_j)^{-1} ⊗ (w_j ⊙ r_j)^T).   (29)

4.1.3. The column-wise form of the Hessian H

From [2], the column-wise form of H is

(1/2) H = Σ_{j=1}^{n} [ Ṽ_j^T (I − [Ũ_j Ũ_j^†]_{RW2}) Ṽ_j + [K_{mr}^T Z_j (Ũ_j^T Ũ_j)^{-1} Z_j^T K_{mr}]_{RW1}
          − [2 K_{mr}^T Z_j (Ũ_j^T Ũ_j)^{-1} Z_j^T K_{mr}]_{FN} − [Ṽ_j^T Ũ_j^{†T} Z_j^T K_{mr} + K_{mr}^T Z_j Ũ_j^† Ṽ_j]_{FN} ] + λI.   (30)

All Z_j^T K_{mr} terms can be replaced with I_r ⊗ (w_j ⊙ r_j)^T using (8).
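A per-column sketch of (21)-(29) is given below. It is illustrative rather than the released code: the argument names are hypothetical, derivatives are taken with respect to vec(U), and the column is assumed to have at least r visible entries.

```python
import numpy as np

def columnwise_solution_and_jacobian(U, W_col, m_col):
    """Column-wise quantities of Section 4.1 for one column j."""
    m, r = U.shape
    keep = np.flatnonzero(W_col)
    Wj = np.diag(W_col)[keep, :]                 # W~_j : p_j x m weight-projection
    Uj = Wj @ U                                  # U~_j = W~_j U        (21)
    mj = Wj @ m_col                              # m~_j = W~_j m_j      (22)
    vj = np.linalg.lstsq(Uj, mj, rcond=None)[0]  # v*_j = U~_j^+ m~_j   (25)
    rj = W_col * (U @ vj - m_col)                # j-th column of R
    # Column-wise Jacobian, eq. (29):
    # J_j = (I - U~_j U~_j^+) V~_j - U~_j ((U~_j^T U~_j)^{-1} kron (w_j .* r_j)^T)
    Vj = Wj @ np.kron(vj.reshape(1, r), np.eye(m))           # V~_j = W~_j (v*_j^T kron I_m)
    proj = np.eye(len(keep)) - Uj @ np.linalg.pinv(Uj)       # I - U~_j U~_j^+
    Jj = proj @ Vj - Uj @ np.kron(np.linalg.inv(Uj.T @ Uj), (W_col * rj).reshape(1, m))
    return vj, Jj
```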

4.2. Symmetry analysis

As derived in [2], the Gauss-Newton matrix for un-regularized VarPro is identical to its projection onto the tangent space of U. Its column-wise representation is

(1/2) H_GN = Σ_{j=1}^{n} [ Ṽ_j^T (I − [Ũ_j Ũ_j^†]_{RW2}) Ṽ_j + [K_{mr}^T Z_j (Ũ_j^T Ũ_j)^{-1} Z_j^T K_{mr}]_{RW1} ] + λI.   (31)

Simplifying the first term yields

Ṽ_j^T Ṽ_j = (v*_j ⊗ I) W̃_j^T W̃_j (v*_j^T ⊗ I)                 (32)
          = (v*_j ⊗ W̃_j^T)(v*_j^T ⊗ W̃_j) = v*_j v*_j^T ⊗ W̃_j^T W̃_j,   / noting (5)   (33)

which is the Kronecker product of two symmetric matrices. Now, defining Ũ_jQ := qf(Ũ_j) gives Ũ_j Ũ_j^† = Ũ_jQ Ũ_jQ^T. Hence, simplifying the second term yields

Ṽ_j^T (Ũ_j Ũ_j^†) Ṽ_j = (v*_j ⊗ W̃_j^T)(Ũ_jQ Ũ_jQ^T)(v*_j^T ⊗ W̃_j)                     (34)
                      = v*_j v*_j^T ⊗ W̃_j^T Ũ_jQ Ũ_jQ^T W̃_j.   / noting (5)           (35)

Given that Ũ_j = Ũ_jQ Ũ_jR, we can transform (Ũ_j^T Ũ_j)^{-1} to Ũ_jR^{-1} Ũ_jR^{-T}. The last term then becomes

K_{mr}^T Z_j (Ũ_j^T Ũ_j)^{-1} Z_j^T K_{mr} = (I ⊗ (w_j ⊙ r_j)) Ũ_jR^{-1} Ũ_jR^{-T} (I ⊗ (w_j ⊙ r_j)^T)   / noting (8)   (36)
                                           = Ũ_jR^{-1} Ũ_jR^{-T} ⊗ (w_j ⊙ r_j)(w_j ⊙ r_j)^T.             / noting (5)   (37)

Hence, all the terms inside the square brackets in (31) can be represented as Kronecker products of two symmetric matrices, which have an additional internal symmetry that can be exploited. This means that computing the Gauss-Newton matrix for un-regularized VarPro can be sped up by a factor of 4 at most.

5. Computing mean time to second success (MTSS)

As mentioned in [1], computing MTSS is based on the assumption that there exists a known best optimum for a dataset to which thousands or millions of past runs have converged. Given such a dataset, we first draw a set of random starting points, usually between 20 and 100, from an isotropic Normal distribution (i.e. U = randn(m, r) with some predefined seed). For each sample of starting points, we run each algorithm and observe two things:

1. whether the algorithm has successfully converged to the best-known optimum, and
2. the elapsed time.

Since the samples of starting points have been drawn in a pseudo-random manner with a monotonically-increasing seed number, we can estimate how long it would have taken each algorithm to observe two successes starting from each sample number. The average of these times yields MTSS. An example is included in the next section to demonstrate this procedure more clearly.

5.1. An example illustration

Table 3 shows some experimental data produced by an algorithm on a dataset for which the best-known optimum is .225. For each sample of starting points, the final cost and the elapsed time have been recorded. We first determine which samples have yielded a success, defined as convergence to the best-known optimum (.225). Then, for each sample number, we calculate up to which sample number we would have had to run to obtain two successes; e.g. when starting from sample number 5, we would have needed to run the algorithm up to sample number 7. This allows us to estimate the time required for two successes starting from each sample number (see Table 3). MTSS is simply the average of these times, 432.4/7 ≈ 61.8.
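Before looking at the worked example in Table 3, the procedure can be sketched in a few lines of Python (illustrative only; the function name and its inputs are not from the released evaluation code).

```python
def mean_time_to_second_success(times, successes):
    """MTSS as described in Section 5.

    times[i]     : elapsed time of the run started from sample i
    successes[i] : True if that run reached the best-known optimum
    Starting from each sample, accumulate run times until two successes
    have been seen; MTSS is the mean over all starting samples for which
    two successes are reachable (the others count as N/A).
    """
    totals = []
    for i in range(len(times)):
        hits, elapsed = 0, 0.0
        for t, ok in zip(times[i:], successes[i:]):
            elapsed += t
            hits += ok
            if hits == 2:
                totals.append(elapsed)
                break
    return sum(totals) / len(totals) if totals else float('nan')

# Example call with hypothetical numbers (not the values of Table 3):
# mean_time_to_second_success([3.1, 2.4, 5.0, 1.2], [True, False, True, True])
```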

Sample no.                            1       2       3       4       5       6       7       8      9      10
Success (S)                                   S               S               S       S              S
Sample nos. of next 2 successes       2 & 4   2 & 4   4 & 6   4 & 6   6 & 7   6 & 7   7 & 9   N/A    N/A    N/A

Table 3: Example data used to compute the MTSS of an algorithm; the table also records the final cost, elapsed time (s) and time required for 2 successes (s) of each sample. We assume that each sample starts from randomly-drawn initial points and that the best-known optimum for this dataset is .225 after millions of trials. In this case, MTSS is the average of the bottom-row times over the samples that are not N/A.

6. Results

The supplementary package includes a set of CSV files that were used to generate the results figures in [1]. They can be found in the <main>/results directory.

6.1. Limitations

Some results are incomplete for the following reasons:

1. RTRMC [3] fails to run on the FAC and Scu datasets. The algorithm has also failed on several occasions when running on other datasets. The suspected cause of these failures is the algorithm's use of the Cholesky decomposition, which cannot factorize matrices that are not positive definite.

2. The original Damped Wiberg code (TO_DW) [7] fails to run on the FAC and Scu datasets. The way in which the code implements the QR decomposition requires datasets to be trimmed in advance so that each column of the measurement matrix has at least as many visible entries as the estimated rank of the factorization model.

3. CSF [4] runs extremely slowly on the UGb dataset.

4. Our Ceres-implemented algorithms are not yet able to take in sparse measurement matrices, which is required to handle large datasets such as NET.

References

[1] Authors. Secrets of matrix factorization: Approximations, numerics and manifold optimization. 2015.
[2] Authors. Secrets of matrix factorization: Further derivations and comparisons. Technical report (further.pdf).
[3] N. Boumal and P.-A. Absil. RTRMC: A Riemannian trust-region method for low-rank matrix completion. In Advances in Neural Information Processing Systems 24 (NIPS 2011), 2011.
[4] P. F. Gotardo and A. M. Martinez. Computing smooth time trajectories for camera and deformable shape in structure from motion with occlusion. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 33(10), Oct. 2011.
[5] J. R. Magnus and H. Neudecker. Matrix Differential Calculus with Applications in Statistics and Econometrics. 3rd edition.
[6] T. P. Minka. Old and new matrix algebra useful for statistics. Technical report, Microsoft Research.
[7] T. Okatani, T. Yoshida, and K. Deguchi. Efficient algorithm for low-rank matrix factorization with missing components and performance comparison of latest algorithms. In Proceedings of the 2011 IEEE International Conference on Computer Vision (ICCV), 2011.
