Part I Generalized Principal Component Analysis


Part I: Generalized Principal Component Analysis
René Vidal, Center for Imaging Science, Institute for Computational Medicine, Johns Hopkins University

Principal Component Analysis (PCA)
Given a set of points x_1, x_2, ..., x_N:
Geometric PCA: find a subspace S passing through them.
Statistical PCA: find projection directions that maximize the variance.
Solution (Beltrami 1873, Jordan 1874, Hotelling 33, Eckart-Householder-Young 36): a basis for S is given by the top singular vectors of the mean-subtracted data matrix (equivalently, the top eigenvectors of the sample covariance).
Applications: data compression, regression, computer vision (eigenfaces), pattern recognition, genomics.
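
As a quick illustration of the solution stated above, here is a minimal PCA sketch via the SVD (not from the original slides; all names and the example data are illustrative): the top right singular vectors of the centered data give both the best-fitting subspace and the directions of maximal variance.

```python
import numpy as np

def pca(X, d):
    """X: N x D data matrix (one point per row); d: target dimension.
    Returns the mean, an orthonormal basis B for the subspace S, and the
    low-dimensional coordinates of the points."""
    mu = X.mean(axis=0)                          # center the data
    Xc = X - mu
    # Top right singular vectors of the centered data = principal directions
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:d].T                                 # D x d basis for S
    Y = Xc @ B                                   # N x d coordinates
    return mu, B, Y

# Example: points near a 2-D subspace of R^5
X = np.random.randn(200, 2) @ np.random.randn(2, 5) + 0.01 * np.random.randn(200, 5)
mu, B, Y = pca(X, d=2)
```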

Extensions of PCA
Higher-order SVD (Tucker 66, Davis 02).
Independent Component Analysis (Comon 94).
Probabilistic PCA (Tipping-Bishop 99): identify the subspace from noisy data; with Gaussian noise this reduces to standard PCA; noise in the exponential family (Collins et al. 01).
Nonlinear dimensionality reduction: multidimensional scaling (Torgerson 58), locally linear embedding (Roweis-Saul 00), Isomap (Tenenbaum 00).
Nonlinear PCA (Schölkopf-Smola-Müller 98): identify a nonlinear manifold by applying PCA to data embedded in a high-dimensional space.
Principal curves and principal geodesic analysis (Hastie-Stuetzle 89, Tibshirani 92, Fletcher 04).

Generalized Principal Component Analysis
Given a set of points lying in multiple subspaces, identify:
the number of subspaces and their dimensions,
a basis for each subspace,
the segmentation of the data points.
This is a chicken-and-egg problem: given the segmentation, we could estimate the subspaces; given the subspaces, we could segment the data.

Prior work on subspace clustering
Iterative algorithms: K-subspaces (Ho et al. 03), RANSAC, subspace selection and growing (Leonardis et al. 02).
Probabilistic approaches: learn the parameters of a mixture model using, e.g., EM; mixtures of PPCA (Tipping-Bishop 99); multi-stage learning (Kanatani 04); sensitive to initialization.
Geometric approaches: 2 planes in R^3 (Shizawa-Mase 91).
Factorization approaches: independent subspaces of equal dimension (Boult-Brown 91, Costeira-Kanade 98, Kanatani 01).
Spectral-clustering-based approaches: (Yan-Pollefeys 06).

Basic ideas behind GPCA
Towards an analytic solution to subspace clustering: can we estimate ALL models simultaneously using ALL the data? When can we do so analytically? In closed form? Is there a formula for the number of models?
We consider the most general case: subspaces of unknown and possibly different dimensions, which may intersect arbitrarily (not only at the origin).
GPCA is an algebraic-geometric approach to data segmentation:
number of subspaces = degree of a polynomial,
subspace basis = derivatives of a polynomial,
so subspace clustering is algebraically equivalent to polynomial fitting and polynomial differentiation.

Applications of GPCA in computer vision
Geometry: vanishing points.
Image compression.
Segmentation: intensity (black-white), texture, motion (2-D, 3-D), video (host-guest).
Recognition: faces (eigenfaces), man vs. woman, human gaits, dynamic textures (water-bird).
Biomedical imaging.
Hybrid system identification.

Introductory example: algebraic clustering in 1D
How many groups are there?

Introductory example: algebraic clustering in 1D
How do we compute n (the number of clusters), the c_i (the cluster centers), and the b_i (the polynomial coefficients)? When is the solution unique, and when is it available in closed form?
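
A hedged sketch of this 1-D example (my own illustrative code, under the assumption that every data point is a root of the degree-n polynomial whose roots are the cluster centers):

```python
import numpy as np

def cluster_1d(x, n):
    """x: array of N real data points; n: number of clusters.
    Every point is (approximately) a root of p(t) = prod_i (t - c_i), so the
    coefficient vector of p lies in the near-null space of the monomial matrix."""
    V = np.vander(x, n + 1)                       # rows [x^n, x^(n-1), ..., 1]
    _, _, Vt = np.linalg.svd(V)
    coeffs = Vt[-1]                               # smallest right singular vector
    centers = np.real(np.roots(coeffs))           # roots of p = cluster centers
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    return centers, labels

# Example: three clusters on the real line
x = np.concatenate([np.full(30, -2.0), np.full(40, 1.0), np.full(30, 3.0)])
x = x + 0.05 * np.random.randn(x.size)
centers, labels = cluster_1d(x, n=3)
```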

Introductory example: algebraic clustering in 2D
What about dimension 2? What about higher dimensions? Should we use complex numbers in higher dimensions? How would we find the roots of a polynomial over the quaternions? Instead: project the data onto a one- or two-dimensional space and apply the same algorithm to the projected data.

Representing one subspace
One plane, one line: a single subspace can be represented with a set of linear equations, i.e., a set of polynomials of degree 1.
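
In symbols (my notation, not necessarily the slide's), a subspace of codimension m is the common zero set of m linear forms:

```latex
S \;=\; \{\, x \in \mathbb{R}^D : b_1^\top x = \dots = b_m^\top x = 0 \,\}
  \;=\; \{\, x : B^\top x = 0 \,\}, \qquad B = [\, b_1, \dots, b_m \,].
```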

Representing n subspaces
Two planes; one plane and one line (plane: one linear equation; line: two linear equations). By De Morgan's rule, a union of n subspaces can be represented with a set of homogeneous polynomials of degree n.
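
For instance (an illustrative choice of subspaces, not necessarily the slide's), a plane {x : b^T x = 0} and a line {x : c_1^T x = c_2^T x = 0} in R^3 give, via De Morgan's rule, the degree-2 equations

```latex
p_1(x) = (b^\top x)(c_1^\top x) = 0, \qquad
p_2(x) = (b^\top x)(c_2^\top x) = 0,
```

so every point of the union is a common zero of a set of homogeneous polynomials of degree n = 2, each a product of one linear form per subspace.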

Fitting polynomials to data points
The polynomials can be written linearly in terms of their vector of coefficients by using a polynomial embedding, the Veronese map. The coefficients of the polynomials can then be computed from the null space of the embedded data matrix, solved, e.g., by least squares (N = number of data points).
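
A minimal sketch of the Veronese embedding and the null-space fit (illustrative Python, not the authors' code; `tol` is an assumed numerical threshold):

```python
from itertools import combinations_with_replacement
import numpy as np

def veronese(X, n):
    """Degree-n Veronese (polynomial) embedding: map each row x in R^D to the
    vector of all monomials of degree n in its entries.
    Output: N x M_n(D) matrix, with M_n(D) = C(n + D - 1, n)."""
    exps = list(combinations_with_replacement(range(X.shape[1]), n))
    return np.column_stack([np.prod(X[:, list(idx)], axis=1) for idx in exps])

def fit_vanishing_polynomials(X, n, tol=1e-6):
    """Coefficient vectors of all degree-n polynomials that (approximately)
    vanish on the data: the numerical null space of the embedded data matrix."""
    L = veronese(X, n)                            # N x M_n(D) embedded data
    _, s, Vt = np.linalg.svd(L, full_matrices=False)
    return Vt[s < tol * s[0]]                     # one coefficient vector per row
```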

Finding a basis for each subspace
Case of hyperplanes: there is only one polynomial; its degree is the number of subspaces, and the basis vectors are the normal vectors.
Polynomial Factorization (GPCA-PFA) [CVPR 2003]: find the roots of a polynomial of degree n in one variable and solve linear systems in the remaining variables; the solution is obtained in closed form for small n.
Problems: computing roots may be sensitive to noise; the estimated polynomial may not factor perfectly with noisy data; the method cannot be applied to subspaces of different dimensions; and the polynomials are estimated only up to a change of basis, hence they may not factor even with perfect data.

Finding a basis for each subspace
Polynomial Differentiation (GPCA-PDA) [CVPR 04]: the derivative (gradient) of a fitted polynomial, evaluated at a point of a subspace, is normal to that subspace. Thus, to learn a mixture of subspaces we just need one positive example per class.

Choosing one point per subspace
With noise and outliers, the polynomials may not define a perfect union of subspaces, but the normals can still be estimated correctly by choosing the points optimally. How do we measure the distance to the closest subspace without knowing the segmentation?

GPCA for hyperplane segmentation
The coefficients of the polynomial can be computed from the null space of the embedded data matrix, solved, e.g., by least squares (N = number of data points). The number of subspaces can be computed from the rank of the embedded data matrix, and the normals to the subspaces from the derivatives of the polynomial.
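
Putting the pieces together for hyperplanes, here is a hedged sketch in the spirit of GPCA-PDA, not the authors' implementation (it reuses `veronese` from the block above; the gradient is taken numerically, and the point-selection heuristic and the threshold `tol` are simplifying assumptions):

```python
import numpy as np

def poly_gradient(c, X, n, eps=1e-5):
    """Numerical gradient of p(x) = c . veronese_n(x) at each row of X."""
    N, D = X.shape
    G = np.zeros((N, D))
    for d in range(D):
        Xp, Xm = X.copy(), X.copy()
        Xp[:, d] += eps
        Xm[:, d] -= eps
        G[:, d] = (veronese(Xp, n) @ c - veronese(Xm, n) @ c) / (2 * eps)
    return G

def gpca_hyperplanes(X, n, tol=0.05):
    """Segment N points in R^D lying near a union of n hyperplanes."""
    # Single vanishing polynomial: last right singular vector of embedded data
    _, _, Vt = np.linalg.svd(veronese(X, n), full_matrices=False)
    c = Vt[-1]
    G = poly_gradient(c, X, n)                    # gradient at every data point
    # Point-selection score ~ distance to the zero set: |p(x)| / ||grad p(x)||
    score = np.abs(veronese(X, n) @ c) / (np.linalg.norm(G, axis=1) + 1e-12)
    normals, active = [], np.ones(len(X), dtype=bool)
    for _ in range(n):
        i = np.argmin(np.where(active, score, np.inf))
        b = G[i] / np.linalg.norm(G[i])           # gradient = hyperplane normal
        normals.append(b)
        active &= np.abs(X @ b) > tol             # drop points on this hyperplane
    normals = np.array(normals)
    labels = np.argmin(np.abs(X @ normals.T), axis=1)
    return normals, labels
```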

GPCA for subspaces of different dimensions
There are multiple polynomials fitting the data, and the derivative of each polynomial gives a different normal vector. A basis for each subspace can be obtained by applying PCA to these normal vectors.

GPCA for subspaces of different dimensions
Apply the polynomial embedding to the projected data; obtain a multiple-subspace model by polynomial fitting (solving for the null space requires knowing the number of subspaces); obtain the bases and dimensions by polynomial differentiation; and optimally choose one point per subspace using the distance.
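
A sketch of the differentiation step for subspaces of different dimensions (reusing `poly_gradient` from the block above and the coefficient vectors returned by `fit_vanishing_polynomials`; `rank_tol` is an assumed threshold): the gradients of all vanishing polynomials at a point span the normal space of the subspace through that point, and an SVD/PCA of those gradients yields the basis and dimension.

```python
import numpy as np

def subspace_at_point(x, C, n, rank_tol=0.1):
    """x: a point (length-D array); C: stack of coefficient vectors of the
    vanishing polynomials (m x M_n(D)); n: degree of the embedding.
    Returns an orthonormal basis of the subspace through x and its dimension."""
    # Gradients of all polynomials at x span the normal space of the subspace
    grads = np.vstack([poly_gradient(c, x[None, :], n)[0] for c in C])  # m x D
    _, s, Vt = np.linalg.svd(grads)
    codim = int(np.sum(s > rank_tol * s[0]))      # numerical rank = codimension
    basis = Vt[codim:].T                          # D x (D - codim), spans the subspace
    return basis, basis.shape[1]
```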

An example
Given data lying in the union of two subspaces, we can write the union as the set of points where a pair of products of linear forms vanishes; therefore, the union can be represented with two polynomials.

An example
The polynomials can be computed from the null space of the embedded data matrix, and the normals from their derivatives.
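
The specific subspaces and formulas on this slide are not recoverable from the transcript; a standard instance of the same construction (a line and a plane in R^3, chosen here purely for illustration) reads:

```latex
S_1 = \{x : x_1 = x_2 = 0\}, \qquad S_2 = \{x : x_3 = 0\},
\qquad
S_1 \cup S_2 = \{x : p_1(x) = p_2(x) = 0\}, \quad
p_1(x) = x_1 x_3, \quad p_2(x) = x_2 x_3.
```

The gradients are ∇p_1(x) = (x_3, 0, x_1)^T and ∇p_2(x) = (0, x_3, x_2)^T: at a point of the plane S_2 (where x_3 = 0) both gradients are parallel to (0, 0, 1)^T, the plane's normal, while at a point of the line S_1 (where x_1 = x_2 = 0) they span the line's two-dimensional normal space.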

Dealing with high-dimensional data
The minimum number of points depends on K (the dimension of the ambient space) and n (the number of subspaces). In practice the dimension k_i of each subspace is much smaller than K. The number and dimensions of the subspaces are preserved by a generic linear projection onto a subspace of sufficiently low dimension (one more than the largest subspace dimension), and outliers can be removed by robustly fitting this projection subspace. Open problem: how should the projection be chosen? PCA?

GPCA with spectral clustering
Spectral clustering: build a similarity matrix between pairs of points and use its eigenvectors to cluster the data. How do we define a similarity for subspaces? We want points in the same subspace to be close and points in different subspaces to be far. Use GPCA to get the bases; the distance is given by subspace angles.
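
A hedged sketch of this combination (illustrative Python; it assumes `subspace_at_point` from the previous block, uses `scipy.linalg.subspace_angles` for the principal angles, and `sigma` is an assumed scale parameter):

```python
import numpy as np
from scipy.linalg import subspace_angles
from sklearn.cluster import KMeans

def gpca_spectral(X, C, n_clusters, n, sigma=1.0):
    """Cluster points via spectral clustering on a subspace-angle similarity."""
    bases = [subspace_at_point(x, C, n)[0] for x in X]
    N = len(X)
    W = np.zeros((N, N))
    for i in range(N):
        for j in range(i, N):
            theta = subspace_angles(bases[i], bases[j])   # principal angles
            W[i, j] = W[j, i] = np.exp(-np.sum(theta ** 2) / sigma ** 2)
    # Normalized affinity D^{-1/2} W D^{-1/2}; its top eigenvectors embed the points
    d = W.sum(axis=1)
    A = W / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(A)
    V = vecs[:, -n_clusters:]                             # leading eigenvectors
    V = V / (np.linalg.norm(V, axis=1, keepdims=True) + 1e-12)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(V)
```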

Comparison of PFA, PDA, K-sub, and EM
Figure: error in the estimated normals [degrees] as a function of the noise level [%] for PFA, K-sub, PDA, EM, PDA+K-sub, PDA+EM, and PDA+K-sub+EM.

Dealing with outliers
GPCA works with perfect data but fails with outliers, because PCA itself fails: we must seek a robust estimate of Null(L_n), where L_n = [ν_n(x_1), ..., ν_n(x_N)] is the matrix of embedded data points.

Three approaches to tackling outliers
Probability-based (small-probability samples): probability plots [Healy 1968, Cox 1968]; PCs [Rao 1964, Gnanadesikan & Kettenring 1972]; M-estimators [Huber 1981, Campbell 1980]; multivariate trimming (MVT) [Gnanadesikan & Kettenring 1972].
Influence-based (large influence on the model parameters): parameter difference with and without a sample [Hampel et al. 1986, Critchley 1985].
Consensus-based (not consistent with models of high consensus): Hough transform [Ballard 1981, Lowe 1999]; RANSAC [Fischler & Bolles 1981, Torr 1997]; least median estimate (LME) [Rousseeuw 1984, Stewart 1999].

Robust GPCA

Robust GPCA
Figure: simulation of Robust GPCA (with the two parameters fixed at 0.3 rad and 0.4), showing segmentation results for RGPCA-Influence and RGPCA-MVT at 12%, 32%, and 48% outliers.

Robust GPCA: comparison with RANSAC
Accuracy: arrangements (2,2,1) in R^3, (4,2,2,1) in R^5, and (5,5,5) in R^6.
Speed. Table: average running time of RANSAC and RGPCA with 24% outliers.

  Arrangement        (2,2,1) in R^3   (4,2,2,1) in R^5   (5,5,5) in R^6
  RANSAC             44 s             5.1 min            3.4 min
  RGPCA-MVT          46 s             23 min             8 min
  RGPCA-Influence    3 min            58 min             146 min

Summary
GPCA is an algorithm for clustering subspaces that handles unknown and possibly different dimensions and arbitrary intersections among the subspaces. Our approach is based on projecting the data onto a low-dimensional subspace, fitting polynomials to the projected data, and differentiating these polynomials to obtain a basis for each subspace. Applications in image processing and computer vision include image segmentation (intensity and texture), image compression, and face recognition under varying illumination.

For more information, see the Vision, Dynamics and Learning Lab @ Johns Hopkins University. Thank you!