CS4495/6495 Introduction to Computer Vision. 8B-L2 Principal Component Analysis (and its use in Computer Vision)


Principal Components
Principal components are all about the directions in a feature space along which points have the greatest variance.
[Figure: two scatter plots of Wavelength 1 vs. Wavelength 2; the second has the directions PC 1 and PC 2 drawn over the data.]

Principal Components
The first PC is the direction of maximum variance. Technically (and mathematically) it's from the origin, but we actually mean from the mean. Subsequent PCs are orthogonal to previous PCs and describe the maximum residual variance.
[Figure: the same Wavelength 1 vs. Wavelength 2 scatter plots with PC 1 and PC 2 overlaid.]

2D example: Fitting a line
Fit the line $ax + by = d$ to the points $(x_i, y_i)$ by minimizing
$$E(a,b,d) = \sum_i (a x_i + b y_i - d)^2.$$
Setting $\frac{\partial E}{\partial d} = -2\sum_i (a x_i + b y_i - d) = 0$ gives
$$d = \frac{a}{n}\sum_i x_i + \frac{b}{n}\sum_i y_i = a\bar{x} + b\bar{y}.$$

2D example: Fitting a line
Substituting $d = a\bar{x} + b\bar{y}$ gives
$$E = \sum_i \left[a(x_i - \bar{x}) + b(y_i - \bar{y})\right]^2 = \lVert B\,\mathbf{n} \rVert^2, \qquad
B = \begin{bmatrix} x_1 - \bar{x} & y_1 - \bar{y} \\ x_2 - \bar{x} & y_2 - \bar{y} \\ \vdots & \vdots \\ x_n - \bar{x} & y_n - \bar{y} \end{bmatrix}, \quad
\mathbf{n} = \begin{bmatrix} a \\ b \end{bmatrix},$$
so we minimize $\lVert B\,\mathbf{n} \rVert^2$ subject to $\lVert \mathbf{n} \rVert^2 = 1$: the axis of least inertia.
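As a concrete illustration of this fit, here is a minimal NumPy sketch (my own code, not the lecture's): it builds B from the mean-centered points and, since here we are minimizing rather than maximizing, takes the eigenvector of BᵀB with the smallest eigenvalue as the unit normal n = (a, b), then recovers d = a·x̄ + b·ȳ.

```python
# Total-least-squares line fit ax + by = d via the axis of least inertia.
import numpy as np

def fit_line_total_least_squares(points):
    """points: (n, 2) array of (x, y). Returns (a, b, d) with ax + by = d."""
    mean = points.mean(axis=0)                 # (x_bar, y_bar)
    B = points - mean                          # rows are (x_i - x_bar, y_i - y_bar)
    evals, evecs = np.linalg.eigh(B.T @ B)     # symmetric 2x2; eigenvalues ascending
    a, b = evecs[:, 0]                         # smallest eigenvalue -> axis of least inertia
    d = a * mean[0] + b * mean[1]              # d = a*x_bar + b*y_bar
    return a, b, d

# Example: noisy points near the line y = 2x + 1
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + 0.1 * rng.standard_normal(50)
print(fit_line_total_least_squares(np.column_stack([x, y])))
```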

Sound familiar???

Another algebraic interpretation
Minimizing the sum of squares of distances to the line is the same as maximizing the sum of squares of the projections onto that line.
[Figure: a point P, a unit vector x along a line through the origin, and the projection of P onto that line, of length x^T P.]

Algebraic interpretation
Trick: how is the sum of squares of projection lengths expressed in algebraic terms?
Stack the points as the rows of a matrix $B = \begin{bmatrix} \mathbf{p}_1^T \\ \mathbf{p}_2^T \\ \vdots \\ \mathbf{p}_m^T \end{bmatrix}$ (Point 1, Point 2, ..., Point m). Then the sum of squared projections onto the line with unit vector $\mathbf{x}$ is
$$\sum_i (\mathbf{x}^T \mathbf{p}_i)^2 = (\mathbf{x}^T B^T)(B\,\mathbf{x}) = \mathbf{x}^T B^T B\,\mathbf{x}.$$

Algebraic interpretation
Our goal: maximize $\mathbf{x}^T B^T B\,\mathbf{x}$ subject to $\mathbf{x}^T\mathbf{x} = 1$, since $\sum_i (\mathbf{x}^T \mathbf{p}_i)^2 = \mathbf{x}^T B^T B\,\mathbf{x}$.

Algebraic interpretation
Maximize $\mathbf{x}^T B^T B\,\mathbf{x}$ subject to $\mathbf{x}^T\mathbf{x} = 1$. Write $E = \mathbf{x}^T M \mathbf{x}$ with $M = B^T B$, and form the Lagrangian
$$E' = \mathbf{x}^T M \mathbf{x} + \lambda(1 - \mathbf{x}^T\mathbf{x}).$$
Then
$$\frac{\partial E'}{\partial \mathbf{x}} = 2M\mathbf{x} - 2\lambda\mathbf{x} = 0 \;\Rightarrow\; M\mathbf{x} = \lambda\mathbf{x},$$
so $\mathbf{x}$ is an eigenvector of $B^T B$.
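As a quick numerical check of this result (my own sketch, not part of the lecture), the snippet below confirms that the unit vector maximizing xᵀMx is the eigenvector of M = BᵀB with the largest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((100, 3))      # rows are the points p_i (here in 3D)
M = B.T @ B                            # M = B^T B, symmetric positive semi-definite

evals, evecs = np.linalg.eigh(M)       # eigenvalues in ascending order
x_star = evecs[:, -1]                  # unit eigenvector with the largest eigenvalue

# x*^T M x* equals the largest eigenvalue; no random unit vector should beat it.
best = x_star @ M @ x_star
for _ in range(1000):
    v = rng.standard_normal(3)
    v /= np.linalg.norm(v)
    assert v @ M @ v <= best + 1e-9

print(best, evals[-1])                 # the two numbers agree
```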

Yet another algebraic interpretation
If the points are taken about the origin,
$$B^T B = \begin{bmatrix} \sum_{i=1}^n x_i^2 & \sum_{i=1}^n x_i y_i \\ \sum_{i=1}^n x_i y_i & \sum_{i=1}^n y_i^2 \end{bmatrix}.$$
So the principal components are the orthogonal directions of the covariance matrix of a set of points.

Yet one more algebraic interpretation
$$B^T B = \sum \mathbf{x}\mathbf{x}^T \ \text{(if about the origin)}, \qquad
B^T B = \sum (\mathbf{x} - \bar{\mathbf{x}})(\mathbf{x} - \bar{\mathbf{x}})^T \ \text{(otherwise)},$$
an outer product in either case. So the principal components are the orthogonal directions of the covariance matrix of a set of points.

Eigenvectors
How many eigenvectors are there? For a real symmetric matrix of size N×N, except in degenerate cases when eigenvalues repeat, there are N distinct eigenvectors.

PCA: Eigenvalues
[Figure: a 2D scatter plot with the two principal directions drawn, labeled with their eigenvalues λ1 and λ2.]

Dimensionality Reduction
[Figure: 2D data with directions v1 and v2; the points lie mostly along v1.]
We can represent the orange points with only their v1 coordinates.

Higher Dimensions
Suppose each data point is N-dimensional. The variance of the projection onto a direction $\mathbf{v}$ is
$$\mathrm{var}(\mathbf{v}) = \sum_{\mathbf{x}} \lVert (\mathbf{x} - \bar{\mathbf{x}})^T \mathbf{v} \rVert^2 = \mathbf{v}^T A\, \mathbf{v}, \qquad
A = \sum_{\mathbf{x}} (\mathbf{x} - \bar{\mathbf{x}})(\mathbf{x} - \bar{\mathbf{x}})^T \ \text{(an outer product)}.$$
The eigenvector with the largest eigenvalue λ captures the most variation among the training vectors x; the eigenvector with the smallest eigenvalue has the least variation.

The space of all face images When viewed as vectors of pixel values, face images are extremely high-dimensional 100x100 image = 10,000 dimensions

The space of all face images However, relatively few 10,000-dimensional vectors correspond to valid face images. We want to effectively model the subspace of face images.

The space of all face images We want to construct a low-dimensional linear subspace that best explains the variation in the set of face images

Principal Component Analysis
Given: M data points x_1, ..., x_M in R^d, where d is big. We want some directions in R^d that capture most of the variation of the x_i. The coefficient of x_i along a direction u would be $u^T(x_i - \mu)$ (μ: mean of the data points). What unit vector u in R^d captures the most variance of the data?

Principal Component Analysis
Direction that maximizes the variance of the projected data:
$$\mathrm{var}(\mathbf{u}) = \frac{1}{M}\sum_{i=1}^{M} \mathbf{u}^T(\mathbf{x}_i - \mu)\,\big(\mathbf{u}^T(\mathbf{x}_i - \mu)\big)^T
= \mathbf{u}^T \left[\frac{1}{M}\sum_{i=1}^{M} (\mathbf{x}_i - \mu)(\mathbf{x}_i - \mu)^T\right] \mathbf{u}
= \mathbf{u}^T \Sigma\, \mathbf{u},$$
where $\mathbf{u}^T(\mathbf{x}_i - \mu)$ is the projection of a data point and $\Sigma$ is the covariance matrix of the data.

Principal Component Analysis
Direction that maximizes the variance of the projected data: $\mathrm{var}(\mathbf{u}) = \mathbf{u}^T \Sigma\, \mathbf{u}$. Since u is a unit vector that can be expressed as a linear combination of the eigenvectors of Σ, the direction u that maximizes the variance is the eigenvector associated with the largest eigenvalue of Σ.

Principal component analysis
The direction that captures the maximum variance of the data is the eigenvector corresponding to the largest eigenvalue of the data covariance matrix. Furthermore, the top k orthogonal directions that capture the most variance of the data are the k eigenvectors corresponding to the k largest eigenvalues. But first, we'll need the PCA d >>> n trick.
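Before the trick, here is a minimal sketch of the direct approach just described (my code, not the lecture's), assuming the data fit in a matrix X with one data point per row: form the covariance matrix and take its top-k eigenvectors.

```python
import numpy as np

def pca(X, k):
    """X: (M, d) data matrix with rows as data points. Returns the mean and top-k directions."""
    mu = X.mean(axis=0)
    Xc = X - mu                              # center the data
    Sigma = (Xc.T @ Xc) / X.shape[0]         # d x d covariance matrix
    evals, evecs = np.linalg.eigh(Sigma)     # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]      # indices of the k largest eigenvalues
    return mu, evecs[:, order]               # columns are u_1, ..., u_k

# Example: 500 points in R^5 whose variance lies mostly in a 2D subspace
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 5)) \
    + 0.05 * rng.standard_normal((500, 5))
mu, U = pca(X, k=2)
print(U.shape)                               # (5, 2)
```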

The dimensionality trick
Let Φ_i be the (very big, length-d) vector that is face image i with the mean image subtracted. Define
$$C = \frac{1}{M}\sum_i \Phi_i \Phi_i^T = A A^T, \qquad A = [\Phi_1\ \Phi_2\ \cdots\ \Phi_M],$$
where A is the d × M matrix of faces (the 1/M factor is dropped here; it rescales the eigenvalues but does not change the eigenvectors). Note: C is a huge d × d matrix (remember, d is the length of the image vector).

The dimensionality trick
So C = AAᵀ is a huge matrix. But consider AᵀA: it is only M × M, so finding those eigenvalues is easy. Suppose v_i is an eigenvector of AᵀA: AᵀA v_i = λ v_i. Premultiply by A: AAᵀ(A v_i) = λ(A v_i). So the A v_i are eigenvectors of C = AAᵀ.
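A sketch of this trick in NumPy (variable names and the random stand-in data are mine): form the small M × M matrix AᵀA, take its eigenvectors, and map them through A to get the eigenfaces, never forming the d × d matrix C.

```python
import numpy as np

rng = np.random.default_rng(3)
d, M = 10_000, 50                        # e.g. 100x100-pixel images, 50 faces
faces = rng.standard_normal((d, M))      # stand-in for real face images, one per column
mean_face = faces.mean(axis=1)           # shape (d,)
A = faces - mean_face[:, None]           # columns are the Phi_i, shape (d, M)

small = A.T @ A                          # only M x M
evals, V = np.linalg.eigh(small)         # eigenvectors v_i of A^T A, eigenvalues ascending
k = 10
top = np.argsort(evals)[::-1][:k]        # indices of the k largest eigenvalues
U = A @ V[:, top]                        # the A v_i are eigenvectors of A A^T
U /= np.linalg.norm(U, axis=0)           # normalize each eigenface to unit length

# Check C u = lambda u for the top pair without ever forming the d x d matrix C.
u, lam = U[:, 0], evals[top[0]]
assert np.allclose(A @ (A.T @ u), lam * u)
```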

How many eigenvectors are there? If we had M > d, there would be d of them. But M << d, so intuition would say there are M of them. But wait: with 2 points in 3D, how many eigenvectors are there? With 3 points? Subtracting out the mean yields M − 1.
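A quick numerical illustration of the M − 1 count (my own example): after subtracting the mean, the data matrix for M generic points has rank M − 1, so only M − 1 eigenvalues of the covariance are nonzero.

```python
import numpy as np

rng = np.random.default_rng(4)
M, d = 3, 10                           # 3 points in 10 dimensions
X = rng.standard_normal((M, d))
Xc = X - X.mean(axis=0)                # subtract out the mean point
print(np.linalg.matrix_rank(Xc))       # prints 2, i.e. M - 1
```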

Eigenfaces: Key idea (Turk and Pentland, 1991)
Assume that most face images lie on a low-dimensional subspace determined by the first k (k <<< d) directions of maximum variance. Use PCA to determine the vectors, or eigenfaces, u_1, u_2, ..., u_k that span that subspace. Represent all face images in the dataset as linear combinations of eigenfaces; find the coefficients by dot product.

Eigenfaces example
Training images x_1, ..., x_M

Mean: μ. Top eigenvectors: u_1, ..., u_k

Eigenfaces example
Principal component (eigenvector) u_k

Eigenfaces example
Principal component (eigenvector) u_k, visualized as μ + 3σ_k u_k and μ − 3σ_k u_k

Eigenfaces example
Face x in face-space coordinates (dot products):
$$\mathbf{x} \rightarrow [\mathbf{u}_1^T(\mathbf{x} - \mu), \ldots, \mathbf{u}_k^T(\mathbf{x} - \mu)] = [w_1, \ldots, w_k]$$
This vector is the representation of the face.

Eigenfaces example
Reconstruction:
$$\hat{\mathbf{x}} = \mu + w_1\mathbf{u}_1 + w_2\mathbf{u}_2 + w_3\mathbf{u}_3 + w_4\mathbf{u}_4 + \cdots$$

Eigenfaces example But what about recognition?

Recognition with eigenfaces
Given a novel image x:
Project onto the subspace: $[w_1, \ldots, w_k] = [\mathbf{u}_1^T(\mathbf{x} - \mu), \ldots, \mathbf{u}_k^T(\mathbf{x} - \mu)]$.
Optional: check the reconstruction error $\lVert \mathbf{x} - \hat{\mathbf{x}} \rVert$ to determine whether the image is really a face.
Classify as the closest training face in the k-dimensional subspace. This is why it's a generative model.
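Putting the pieces together, here is a hedged end-to-end sketch of this recognition step (the function names and the face_threshold parameter are mine, not from the lecture): project a novel image onto the eigenfaces, optionally check the reconstruction error, then classify by the nearest training face in the k-dimensional coefficient space.

```python
import numpy as np

def project(x, mean_face, U):
    """w = [u_1^T (x - mu), ..., u_k^T (x - mu)]; U has the eigenfaces as columns."""
    return U.T @ (x - mean_face)

def reconstruct(w, mean_face, U):
    """x_hat = mu + w_1 u_1 + ... + w_k u_k."""
    return mean_face + U @ w

def recognize(x, mean_face, U, train_coeffs, train_labels, face_threshold):
    w = project(x, mean_face, U)
    if np.linalg.norm(x - reconstruct(w, mean_face, U)) > face_threshold:
        return None                                # reconstruction is poor: probably not a face
    dists = np.linalg.norm(train_coeffs - w, axis=1)
    return train_labels[int(np.argmin(dists))]     # closest training face in coefficient space

# Usage sketch, assuming mean_face (shape (d,)) and U come from the d >> M trick
# above and the training faces are the columns of a (d, M) array X_train:
# train_coeffs = np.stack([project(X_train[:, i], mean_face, U)
#                          for i in range(X_train.shape[1])])
```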

An old cast of characters

Limitations: PCA of global structure
It is a global appearance method, so it is not robust to misalignment or background variation.

Limitations: PCA in general
The direction of maximum variance is not always good for classification.