Optimization on the Grassmann manifold: a case study


1 Optimization on the Grassmann manifold: a case study

Konstantin Usevich and Ivan Markovsky
Department ELEC, Vrije Universiteit Brussel

28 March 2013, 32nd Benelux Meeting on Systems and Control, Houffalize, Belgium

2 Introduction: Optimization on the Grassmann manifold

Grassmann manifold: Gr(d, m) := { d-dimensional subspaces of ℝ^m }

minimize f(L) over L ∈ Gr(d, m)

Example: fitting points x_1, ..., x_N ∈ ℝ^m to a d-dimensional subspace L:

f_1(L) = Σ_{k=1}^{N} ρ(x_k, L)

Examples: computer vision, machine learning, system identification, ..., matrix/tensor low-rank factorization/approximation, ...
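
As a concrete instance: if ρ(x, L) is the squared Euclidean distance (an assumption; the slide leaves ρ abstract), the best-fitting subspace has a closed form via the SVD. A minimal sketch:

```python
import numpy as np

def fit_subspace(points, d):
    """Fit a d-dimensional subspace (through 0) to the rows of `points`.

    For rho(x, L) = squared Euclidean distance, the optimal L is spanned
    by the top d right singular vectors (the classical PCA/SVD result).
    """
    _, _, Vt = np.linalg.svd(points, full_matrices=False)
    basis = Vt[:d].T                                 # m x d orthonormal basis of L
    residual = points - (points @ basis) @ basis.T   # components orthogonal to L
    return basis, np.sum(residual**2)                # basis and f_1(L)

points = np.random.default_rng(0).standard_normal((100, 5))
basis, f1 = fit_subspace(points, d=2)
print(f1)
```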

4 Introduction: What is this talk about

Grassmann manifold: Gr(d, m) := { d-dimensional subspaces of ℝ^m }

minimize f(L) over L ∈ Gr(d, m)

(Absil, Mahony, Sepulchre 2008): how to define derivatives, how to optimize; atlases, tangent bundles, Riemannian metric, ...

This talk: (re)statement of the problem, reformulations as min over X ∈ ℝ^n of f(X)

6 Introduction: problem (re)statement

minimize f(R) over R ∈ ℝ^{d×m} subject to R ∈ R_f,

where R_f := { R ∈ ℝ^{d×m} : rank R = d } and f(R) = f(UR) for all nonsingular U ∈ ℝ^{d×d}.

For d = 1: R_f = ℝ^m \ {0}.

Problems:
How to impose the constraint R ∈ R_f?
If R is a minimum, then UR is also a minimum ...

A solution: reduce the search space.
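
A quick numerical illustration of the invariance f(UR) = f(R); the projector-based cost below is my illustrative choice, not a cost from the talk:

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 2, 6
A = rng.standard_normal((m, m))    # fixed data defining the cost

def f(R):
    # Illustrative cost: depends on R only through the orthogonal
    # projector onto rowspan(R), hence f(U R) = f(R) for nonsingular U.
    P = R.T @ np.linalg.solve(R @ R.T, R)
    return np.linalg.norm(A @ P, 'fro') ** 2

R = rng.standard_normal((d, m))    # full row rank with probability 1
U = rng.standard_normal((d, d))    # nonsingular with probability 1
print(np.isclose(f(R), f(U @ R)))  # True: if R is a minimum, so is U R
```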

8 Introduction: reduction of the search space

minimize f(R) over R ∈ ℝ^{d×m} subject to R ∈ R_f,

where R_f := { R ∈ ℝ^{d×m} : rank R = d } and f(R) = f(UR) for all nonsingular U ∈ ℝ^{d×d}.

How to reduce the search space:
1. RRᵀ = I_d: an orthonormal basis represents all subspaces, but the constraint is nonlinear.
2. R = [X I_d], where X ∈ ℝ^{d×(m−d)}: represents almost all subspaces (all R with nonsingular R_{:,(m−d+1):m}).
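
Both parametrizations are easy to realize numerically. A short sketch mapping X to R = [X I_d] and to an orthonormal basis of the same row space (via a thin QR factorization, one standard choice):

```python
import numpy as np

rng = np.random.default_rng(2)
d, m = 2, 5
X = rng.standard_normal((d, m - d))

# Option 2: R = [X  I_d], linear in the free parameter X
R = np.hstack([X, np.eye(d)])

# Option 1: an orthonormal basis of the same row space,
# obtained from a thin QR factorization of R^T
Q, _ = np.linalg.qr(R.T)
R_orth = Q.T
print(np.allclose(R_orth @ R_orth.T, np.eye(d)))   # R R^T = I_d holds

# both represent the same subspace: the row-space projectors agree
P = lambda R: R.T @ np.linalg.solve(R @ R.T, R)
print(np.allclose(P(R), P(R_orth)))
```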

9 Outline

Introduction: problem (re)statement
Optimization over [X I_d]Π
Optimization with RRᵀ = I_d
A case study

14 Optimization over [X I_d]Π: parametrizations with permutation matrices

Matrices [X I_d] do not represent all d-dimensional subspaces.

Solution: consider all permutation matrices Π ∈ ℝ^{m×m} and matrices of the form [X I_d]Π.

This corresponds to fixing R_{:,J} = I_d for some index set J.

[Figure: the case d = 1 in ℝ^m]

Moreover, for d = 1 we can select Π such that |X_{1,j}| ≤ 1 for all j.

Question: for d > 1, can we consider only X from a bounded subset of ℝ^{d×(m−d)}?

16 Optimization over [X I_d]Π: bounded parametrizations with permutation matrices

Theorem (Knuth, 1980). For any R ∈ ℝ^{d×m} with rank R = d, there exist U and Π such that

UR = [X I_d]Π,  with |X_{ij}| ≤ 1 for all (i, j).

Proof (sketch):
1. Find the d×d submatrix of maximal determinant in absolute value (maximal volume): J_0 = argmax_J |det R_{:,J}|.
2. Take the permutation Π_0 that swaps R_{:,J_0} with R_{:,(m−d+1):m}.
3. Take [X I_d] = (R_{:,J_0})^{-1} RΠ_0.

By Cramer's rule, |X_{i,j}| ≤ 1.
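
The construction in the proof is easy to run for small m. A brute-force sketch (the max-volume search enumerates all column subsets, so it is exponential in m and for illustration only; the function name is mine, not from the slides):

```python
import numpy as np
from itertools import combinations

def bounded_parametrization(R):
    # Max-volume pivot columns J0, permuted to the back and normalized
    # by R[:, J0]^{-1}, exactly as in the proof sketch.
    d, m = R.shape
    J0 = max(combinations(range(m), d),
             key=lambda J: abs(np.linalg.det(R[:, list(J)])))
    perm = [j for j in range(m) if j not in J0] + list(J0)
    UR = np.linalg.solve(R[:, list(J0)], R[:, perm])  # = [X  I_d]
    Pi = np.eye(m)[perm]                              # U @ R == UR @ Pi
    return UR[:, :m - d], Pi

R = np.random.default_rng(3).standard_normal((2, 6))
X, Pi = bounded_parametrization(R)
print(np.max(np.abs(X)) <= 1 + 1e-12)   # True, by Cramer's rule
```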

18 Optimization over [X I_d]Π: optimization over permutations

min over R ∈ R_f of f(R) is equivalent to

min over permutations Π and X ∈ [−1, 1]^{d×(m−d)} of f([X I_d]Π).

There are m!/(m−d)! permutations Π ...

... but we can switch the permutation in the course of optimization whenever |X_{i,j}| > δ for some (i, j), with a fixed threshold δ ≥ 1.
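
A sketch of one such switch step, using the same brute-force max-volume pivot selection as above (the threshold value delta = 1.5 and the function name are illustrative assumptions):

```python
import numpy as np
from itertools import combinations

def switch_chart(X, Pi, delta=1.5):
    # If some |X_ij| exceeds the threshold delta, re-select the pivot
    # columns so the same subspace is represented with bounded X again.
    d = X.shape[0]
    if np.max(np.abs(X)) <= delta:
        return X, Pi                          # current chart is fine
    R = np.hstack([X, np.eye(d)]) @ Pi        # back to an unnormalized basis
    m = R.shape[1]
    J0 = max(combinations(range(m), d),
             key=lambda J: abs(np.linalg.det(R[:, list(J)])))
    perm = [j for j in range(m) if j not in J0] + list(J0)
    X_new = np.linalg.solve(R[:, list(J0)], R[:, perm])[:, :m - d]
    return X_new, np.eye(m)[perm]

X0 = 10 * np.random.default_rng(4).standard_normal((2, 4))  # badly scaled chart
X1, Pi1 = switch_chart(X0, np.eye(6))
print(np.max(np.abs(X1)) <= 1 + 1e-12)       # True after the switch
```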

19 Optimization with RRᵀ = I_d: outline

Introduction: problem (re)statement
Optimization over [X I_d]Π
Optimization with RRᵀ = I_d
A case study

20 Optimization with RRᵀ = I_d: orthonormal basis

minimize f(R) over R ∈ ℝ^{d×m} subject to RRᵀ = I_d

[Figure: the case d = 1 in ℝ^m]

Disadvantage: nonlinear constraint.

Possible solutions:
1. Lagrange multipliers,
2. penalty: f(R) + γ‖RRᵀ − I_d‖²_F. Do we need γ → ∞? Not at all.

22 Optimization with RRᵀ = I_d: penalty method for orthonormal bases

minimize f(R) over R ∈ R_f subject to RRᵀ = I_d,  where f(R) = f(UR) for all nonsingular U   (∗)

Theorem. For any γ > 0, the local minima of

minimize over R ∈ R_f:  f(R) + γ‖RRᵀ − I_d‖²_F

coincide with the local minima of (∗).

Proof. Let R = UΣVᵀ be an SVD of R. Then

f(R) + γ‖RRᵀ − I_d‖²_F ≥ f(Vᵀ) + γ‖VᵀV − I_d‖²_F = f(Vᵀ) = f(R),

since VᵀV = I_d and f(Vᵀ) = f(UΣVᵀ) = f(R) by invariance.
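
The proof can be checked numerically on any invariant f. The sketch below uses an illustrative projector-based cost (my choice, not the cost from the talk) and verifies that replacing R by the orthonormal factor Vᵀ never increases the penalized cost:

```python
import numpy as np

rng = np.random.default_rng(5)
d, m, gamma = 2, 6, 0.5
A = rng.standard_normal((m, m))

def f(R):
    # any cost invariant under R -> U R works; this projector-based
    # choice is an illustration only
    P = R.T @ np.linalg.solve(R @ R.T, R)
    return np.linalg.norm(A @ P, 'fro') ** 2

def penalized(R):
    return f(R) + gamma * np.linalg.norm(R @ R.T - np.eye(d), 'fro') ** 2

R = rng.standard_normal((d, m))
_, _, Vt = np.linalg.svd(R, full_matrices=False)   # R = U Sigma Vt

# f(R) = f(Vt) by invariance and Vt satisfies Vt Vt^T = I_d exactly,
# so replacing R by Vt never increases the penalized cost:
print(penalized(R) >= penalized(Vt) - 1e-12)   # True for any gamma > 0
print(np.isclose(penalized(Vt), f(R)))         # the penalty term vanishes
```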

23 A case study: outline

Introduction: problem (re)statement
Optimization over [X I_d]Π
Optimization with RRᵀ = I_d
A case study

25 A case study: structured low-rank approximation

System identification, model reduction, etc. → structured low-rank approximation.

Structured low-rank approximation: given p ∈ ℝ^{n_p}, a structure S, and r < m,

minimize over p̂:  ‖p − p̂‖²₂  subject to  rank S(p̂) ≤ r.   (SLRA)

Kernel representation + variable projection:

minimize over R ∈ ℝ^{d×m}, rank R = d:  f(R) := ( min over p̂ of ‖p − p̂‖²₂ subject to R S(p̂) = 0 ),

where d = m − r.

26 A case study: structured low-rank approximation, an example

Hankel SLRA: given w = (w_1, ..., w_n),

g([r_1 r_2 r_3]) := min over ŵ of ‖w − ŵ‖²₂ subject to

                  ⎡ ŵ_1  ŵ_2  ⋯  ŵ_{n−2} ⎤
  [r_1 r_2 r_3] · ⎢ ŵ_2  ŵ_3  ⋯  ŵ_{n−1} ⎥ = 0
                  ⎣ ŵ_3  ŵ_4  ⋯  ŵ_n     ⎦

Minimizing g over R = [r_1 r_2 r_3] solves the SLRA problem; g(αR) = g(R) for all α ≠ 0.
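
Since the constraint on ŵ is linear, the inner minimization is a least-distance problem with the closed-form solution ŵ = w − Gᵀ(GGᵀ)⁻¹Gw, where G stacks the shifted copies of r. A minimal sketch (the banded matrix G and the test signal are my construction, not from the slides):

```python
import numpy as np

def g(r, w):
    """Variable-projection cost for Hankel SLRA with a 3-element kernel r:
    min over w_hat of ||w - w_hat||_2^2 s.t. [r1 r2 r3] H(w_hat) = 0.
    The constraint is linear in w_hat: G(r) w_hat = 0 with G banded."""
    n = len(w)
    G = np.zeros((n - 2, n))
    for j in range(n - 2):
        G[j, j:j + 3] = r   # row j: r1*w_{j} + r2*w_{j+1} + r3*w_{j+2} = 0
    lam = np.linalg.solve(G @ G.T, G @ w)   # closed-form least-distance step
    w_hat = w - G.T @ lam                   # projection onto {G w_hat = 0}
    return np.sum((w - w_hat) ** 2)

rng = np.random.default_rng(6)
w = np.cos(0.3 * np.arange(10)) + 0.05 * rng.standard_normal(10)
r = np.array([1.0, -2 * np.cos(0.3), 1.0])  # kernel of the noiseless cosine
print(g(r, w))                          # small: w nearly satisfies the recursion
print(np.isclose(g(r, w), g(5 * r, w))) # homogeneity: g(alpha R) = g(R)
```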

27 A case study: f([x_1 x_2 1]) for a 3×10 Hankel matrix

[Figure: contour plot of the cost function f([x_1 x_2 1]) over (x_1, x_2); not recoverable from the transcription.]

28 A case study: some results

Mosaic Hankel low-rank approximation (LTI system identification on the DAISY datasets); the quality of approximation is measured as the percentage fit 100%·(1 − ‖p − p̂‖₂/‖p‖₂).

[Table: percentage fits of the methods [X I_d], [X I_d]Π, genrtr, RRᵀ = I_d, and the penalty ‖RRᵀ − I_d‖²_F per test dataset; the numeric entries are not recoverable from the transcription.]

29 Conclusions

Two reformulations as optimization over a Euclidean space
→ freedom in choosing the optimization method

Issues: bound on the number of switches, choosing the regularization parameter, ...

References:
I. Markovsky, K. Usevich (in press), Structured low-rank approximation with missing values, SIAM J. Matrix Anal. Appl.
K. Usevich, I. Markovsky (2012), Global and local optimization methods for structured low-rank approximation, submitted.
Software and reproducible experiments.
