Dimensionality Reduction

Robot Image Credit: Viktoriya Sukhanova 123RF.com


Feature Selection vs. Dimensionality Reduction

Feature Selection (last time):
- Select a subset of the features.
- When classifying novel patterns, only a small number of features need to be computed (i.e., faster classification).
- The measurement units (length, weight, etc.) of the features are preserved.

Dimensionality Reduction (this time):
- Transform the features into a smaller set.
- When classifying novel patterns, all features need to be computed.
- The measurement units (length, weight, etc.) of the features are lost.

How Can We Visualize High-Dimensional Data? E.g., 53 blood and urine tests for each patient. [Table: instances A1-A9 (rows) by features H-WBC, H-RBC, H-Hgb, H-Hct, H-MCV, H-MCH, H-MCHC (columns).] Difficult to see the correlations between the features...

Data Visualization
- Is there a representation better than the raw features?
- Is it really necessary to show all 53 dimensions? What if there are strong correlations between the features?
- Could we find the smallest subspace of the 53-D space that keeps the most information about the original data?
One solution: Principal Component Analysis

Principal Component Analysis
Orthogonal projection of the data onto a lower-dimensional linear space that...
- maximizes the variance of the projected data (the purple line in the figure)
- minimizes the mean squared distance between the data points and their projections (the sum of the blue lines)

The Principal Components
- Vectors originating from the center of mass.
- Principal component #1 points in the direction of the largest variance.
- Each subsequent principal component is orthogonal to the previous ones and points in the direction of the largest variance of the residual subspace.

2D Gaussian Dataset

1st PCA axis

2nd PCA axis

PCA Algorithm
Given data {x_1, ..., x_n}, compute the covariance matrix S:
- X is the n x d data matrix.
- Compute the data mean (the average over all rows of X).
- Subtract the mean from each row of X (centering the data).
- Compute the covariance matrix S = X^T X (S is d x d).
The PCA basis vectors are the eigenvectors of S:
- Λ, Q = numpy.linalg.eig(S)
- {q_i, λ_i}, i = 1..d, are the eigenvectors/eigenvalues of S, with λ_1 ≥ λ_2 ≥ ... ≥ λ_d.
- Larger eigenvalue ⇒ more important eigenvector.
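As a concrete reference, here is a minimal numpy sketch of the procedure above (my own illustration, not code from the slides). Note that numpy.linalg.eig returns the eigenvalues before the eigenvectors and does not sort them; the sketch uses numpy.linalg.eigh, which is appropriate for a symmetric matrix like S, and sorts the result so that λ_1 ≥ λ_2 ≥ ... ≥ λ_d.

```python
import numpy as np

def pca(X):
    """PCA as outlined above: eigenvectors of the covariance of the centered data.

    X : (n, d) data matrix, one instance per row.
    Returns (Q, lam, mean): eigenvectors as columns of Q, eigenvalues lam sorted
    in decreasing order, and the data mean used for centering.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)              # data mean (average over all rows of X)
    Xc = X - mean                      # subtract the mean from each row (centering)
    S = Xc.T @ Xc                      # covariance matrix (up to a 1/(n-1) factor)
    lam, Q = np.linalg.eigh(S)         # S is symmetric; eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]      # reorder so that lam[0] >= lam[1] >= ...
    return Q[:, order], lam[order], mean
```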

Dimensionality Reduction
Can ignore the components of lesser significance. [Bar chart: variance (%) captured by each of PC1 through PC10.] You do lose some information, but if the eigenvalues are small, you don't lose much:
- choose only the first k eigenvectors, based on their eigenvalues
- the final data set has only k dimensions
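The slides do not prescribe how to choose k. One common convention, assumed here rather than stated above, is to keep the smallest k whose eigenvalues account for a fixed fraction of the total variance, say 90-95%:

```python
import numpy as np

def choose_k(eigenvalues, threshold=0.95):
    """Smallest k whose leading eigenvalues capture `threshold` of the total variance.

    `eigenvalues` must be sorted in decreasing order; each one is the variance
    of the data along the corresponding principal component.
    """
    lam = np.asarray(eigenvalues, dtype=float)
    explained = np.cumsum(lam) / lam.sum()          # cumulative fraction of variance
    return int(np.searchsorted(explained, threshold)) + 1

# Example: a spectrum that decays quickly; 90% of the variance sits in the first 3 PCs
print(choose_k([5.0, 2.0, 1.0, 0.5, 0.25, 0.1], threshold=0.90))   # -> 3
```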

PCA
X is the data matrix (X has d columns). Q holds the eigenvectors of Σ as its columns; the columns are ordered by importance! Q is d x d. [Illustration: the data matrix X and the eigenvector matrix Q.]
Slide by Eric Eaton

PCA
Each row of Q corresponds to a feature; keep only the first k columns of Q. [Illustration: the data matrix X and the first k columns of Q.]
Slide by Eric Eaton

PCA
Each column of Q gives the weights for a linear combination of the original features: the new feature is weight1 · feature1 + weight2 · feature2 + weight3 · feature3 + ..., where the weights are read down the corresponding column of Q. [The example weights shown on the slide are not reproduced here.]
Slide by Eric Eaton

PCA
We can apply these weights to get the new representation for each instance. For instance x_3 (a row of X), the new 2-D representation is:
x̂_31 = x_3 · (first column of Q) and x̂_32 = x_3 · (second column of Q).
The re-projected data matrix is given by X̂ = XQ.
Slide by Eric Eaton
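In code, the reduced representation is a single matrix product with the first k columns of Q, and mapping back to the original features uses the transpose. A small sketch with my own function names; the centering is made explicit here, whereas the slide folds it into X:

```python
import numpy as np

def pca_transform(X, Q, mean, k):
    """X_hat = (X - mean) Q_k: the k-dimensional representation of each row of X.

    Q holds the eigenvectors of the covariance matrix as columns, ordered by
    decreasing eigenvalue; Q_k is its first k columns.
    """
    return (np.asarray(X, dtype=float) - mean) @ Q[:, :k]

def pca_inverse_transform(X_hat, Q, mean):
    """Map k-dimensional representations back to the original d features."""
    k = X_hat.shape[1]
    return X_hat @ Q[:, :k].T + mean
```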

PCA Example
Data → Center Data → Covariance Matrix → Eigenvectors and Eigenvalues. [The worked 2-D numeric example is not reproduced here.]
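Because the slide's numbers are not reproduced above, the following illustrative 2-D example on synthetic data (not the slide's values) walks through the same steps: center the data, form the covariance matrix, and read off its eigenvectors and eigenvalues.

```python
import numpy as np

# Illustrative 2-D Gaussian data (synthetic; not the numbers from the original slide)
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[1.0, 2.0],
                            cov=[[2.0, 1.2], [1.2, 1.0]], size=500)

Xc = X - X.mean(axis=0)                  # center the data
S = np.cov(Xc, rowvar=False)             # 2x2 sample covariance matrix
lam, Q = np.linalg.eigh(S)               # eigenvalues (ascending) and eigenvectors
order = np.argsort(lam)[::-1]            # put the largest-variance direction first
lam, Q = lam[order], Q[:, order]

print("covariance matrix:\n", S)
print("eigenvalues:", lam)               # variance captured along each principal axis
print("eigenvectors (columns of Q):\n", Q)
```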

PCA Visualization of MNIST Digits

Challenge: Facial Recognition
Want to identify a specific person based on a facial image, robust to glasses, lighting, etc. ⇒ Can't just use the raw pixel values directly.

PCA applications - Eigenfaces
- Eigenfaces are the eigenvectors of the covariance matrix of the probability distribution of the vector space of human faces.
- Eigenfaces are the standardized face ingredients derived from the statistical analysis of many pictures of human faces.
- A human face may be considered to be a combination of these standard face ingredients.

PCA applications - Eigenfaces
To generate a set of eigenfaces:
1. A large set of digitized images of human faces is taken under the same lighting conditions.
2. The images are normalized to line up the eyes and mouths.
3. The eigenvectors of the covariance matrix of the statistical distribution of face image vectors are then extracted.
4. These eigenvectors are called eigenfaces.
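A rough numpy sketch of steps 3 and 4; the collection and alignment of steps 1 and 2 are assumed to have been done already, and the `faces` array and function name are illustrative rather than taken from the slides. It uses an SVD of the centered face vectors, whose right singular vectors are exactly the eigenvectors of the covariance matrix, so the full pixel-by-pixel covariance matrix never has to be formed:

```python
import numpy as np

def compute_eigenfaces(faces, n_components=20):
    """Eigenvectors of the covariance of face-image vectors (steps 3-4 above).

    `faces` is assumed to be an (n_images, height, width) array of aligned,
    equally-lit grayscale faces (steps 1-2 are not shown here).
    """
    n, h, w = faces.shape
    X = faces.reshape(n, h * w).astype(float)    # one face image per row
    mean_face = X.mean(axis=0)
    Xc = X - mean_face
    # Right singular vectors Vt[i] are eigenvectors of Xc.T @ Xc with eigenvalue s[i]**2,
    # already sorted by decreasing singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigenfaces = Vt[:n_components].reshape(-1, h, w)
    return mean_face.reshape(h, w), eigenfaces
```

For experimentation, the ORL faces used in the experiments later in these slides are available, for instance, through scikit-learn's fetch_olivetti_faces loader, whose `images` attribute has this (n_images, height, width) layout.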

PCA applications - Eigenfaces
The principal eigenface looks like a bland, androgynous average human face.
http://en.wikipedia.org/wiki/image:eigenfaces.png

Eigenfaces

Eigenfaces: Face Recognition
When properly weighted, eigenfaces can be summed together to create an approximate gray-scale rendering of a human face. Remarkably few eigenvector terms are needed to give a fair likeness of most people's faces. Hence eigenfaces provide a means of applying data compression to faces for identification purposes. (Similarly: Expert Object Recognition in Video.)
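Written out, that weighted sum is simple: project the centered face onto each eigenface to obtain its coefficients, then add the weighted eigenfaces back onto the mean face. A sketch under the same assumed array conventions as the eigenface sketch above (the function and its truncation parameter are hypothetical, not from the slides):

```python
import numpy as np

def approximate_face(face, mean_face, eigenfaces, n_terms):
    """Weighted sum of the first `n_terms` eigenfaces approximating `face`.

    All image arrays are (height, width). The weights are the projections of
    the centered face onto each eigenface; the weighted sum plus the mean face
    gives the approximate gray-scale rendering described above.
    """
    h, w = face.shape
    centered = (face - mean_face).reshape(-1)
    basis = eigenfaces[:n_terms].reshape(n_terms, -1)   # each row: one eigenface
    weights = basis @ centered                          # coefficients of this face
    approx = mean_face.reshape(-1) + weights @ basis    # mean + weighted sum
    return approx.reshape(h, w), weights
```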

Experiment and Results: Eigenfaces
Data used here are from the ORL database of faces. Facial images of 16 people, each with 10 views, are used.
- Training set contains 16 x 7 images.
- Test set contains 16 x 3 images.
First three eigenfaces:

Classification Using Nearest Neighbor
- Save the average coefficients for each person.
- Classify a new face as the person with the closest average.
Recognition accuracy increases with the number of eigenfaces until about 15. Best recognition rates: Training set 99%, Test set 9%.
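A sketch of that nearest-mean rule on eigenface coefficients; the function name and the use of Euclidean distance are my assumptions, since the slide only says "closest average":

```python
import numpy as np

def nearest_mean_classifier(train_weights, train_labels, test_weights):
    """Average the eigenface coefficients of each person's training images,
    then label each new face with the person whose average is closest.

    train_weights : (n_train, k) eigenface coefficients of the training faces
    train_labels  : (n_train,) person identifiers
    test_weights  : (n_test, k) coefficients of the faces to classify
    """
    people = np.unique(train_labels)
    means = np.stack([train_weights[train_labels == p].mean(axis=0) for p in people])
    # Euclidean distance from every test face to every person's mean coefficient vector
    dists = np.linalg.norm(test_weights[:, None, :] - means[None, :, :], axis=2)
    return people[np.argmin(dists, axis=1)]
```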

Image Compression

Original Image
Divide the original 372 x 492 image into patches: each patch is an instance that contains 12 x 12 pixels on a grid. View each patch as a 144-D vector.
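A sketch of the whole patch-based compression pipeline under these assumptions (a grayscale image whose height and width are multiples of the patch size; the patch size and the number of retained components are parameters, not values taken from the slides):

```python
import numpy as np

def compress_patches(image, patch=12, k=16):
    """Patch-wise PCA compression in the spirit of the slides that follow.

    Cuts a grayscale image into non-overlapping patch x patch blocks, treats
    each block as one instance (a patch*patch-dimensional vector), keeps only
    the top-k principal components, and reconstructs the image from them.
    Image height and width are assumed to be multiples of `patch`.
    """
    h, w = image.shape
    # (n_patches, patch*patch) data matrix: one row per block
    blocks = (image.reshape(h // patch, patch, w // patch, patch)
                   .transpose(0, 2, 1, 3)
                   .reshape(-1, patch * patch)
                   .astype(float))
    mean = blocks.mean(axis=0)
    Xc = blocks - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)   # principal directions, sorted
    Q_k = Vt[:k].T                                      # top-k eigenvectors as columns
    codes = Xc @ Q_k                                    # k numbers per patch
    approx = codes @ Q_k.T + mean                       # reconstruct each patch
    # reassemble the blocks back into an image
    restored = (approx.reshape(h // patch, w // patch, patch, patch)
                      .transpose(0, 2, 1, 3)
                      .reshape(h, w))
    return restored, codes
```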

L2 error and PCA dim

PCA compression: 144D → 60D

PCA compression: 144D → 16D

16 most important eigenvectors

PCA compression: 144D → 6D

6 most important eigenvectors

PCA compression: 144D → 3D

3 most important eigenvectors

PCA compression: 144D → 1D

60 most important eigenvectors: looks like the discrete cosine bases of JPG!

2D Discrete Cosine Basis
http://en.wikipedia.org/wiki/discrete_cosine_transform