Lecture 13 Visual recognition


Lecture 13 Visual recognition Announcements Silvio Savarese Lecture 13-20-Feb-14

Lecture 13 Visual recognition Object classification bag of words models Discriminative methods Generative methods Object classification by PCA and FLD Silvio Savarese Lecture 13-20-Feb-14

Challenges Variability due to: viewpoint, illumination, occlusions, intra-class variability

Challenges: intra-class variation

Basic properties Representation How to represent an object category; which classification scheme? Learning How to learn the classifier, given training data Recognition How the classifier is to be used on novel data

Definition of BoW: an image is represented as a histogram of visual words (codewords) drawn from a codewords dictionary

Learning / representation / recognition pipeline: feature detection & representation, codewords dictionary, image representation, category models (and/or) classifiers, category decision
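
A minimal sketch of this pipeline in Python (NumPy and scikit-learn assumed; descriptor extraction itself, e.g. SIFT keypoints, is left out):

import numpy as np
from sklearn.cluster import KMeans

def build_codebook(all_descriptors, k=200):
    # Learn a dictionary of k codewords by clustering local descriptors
    # pooled from the training images (all_descriptors: num_descriptors x descriptor_dim).
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(all_descriptors)

def bow_histogram(descriptors, codebook):
    # Represent one image as a normalized histogram of codeword occurrences.
    words = codebook.predict(descriptors)          # nearest codeword for each descriptor
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / hist.sum()                       # normalize so histograms are comparable

At test time the image histogram is passed to whatever category model or classifier was learned (nearest neighbor, SVM, Naive Bayes, ...).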

Classification Discriminative methods Nearest neighbors Linear classifier SVM Generative methods 20-Feb-14

SVM classification: each category (Class 1 ... Class N) is represented by a model (weight vector w) in model space

SVM classification: a query image is projected into model space and assigned to the winning class (here, pink)

Caltech 101

Caltech 101 BOW ~15%

Major drawback of BoW models: they don't capture spatial information!

Spatial Pyramid Matching. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid, and J. Ponce, 2006. Histogram intersection: I(h1, h2) = Σ_{i=1}^{N} min(h1(i), h2(i)), with the per-level contributions combined using weights of 1/2 per level so that matches at finer resolutions count more.
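
A sketch of the intersection and the weighted pyramid combination in Python (NumPy assumed); the level weights follow the published spatial pyramid kernel:

import numpy as np

def histogram_intersection(h1, h2):
    # I(h1, h2) = sum_i min(h1(i), h2(i))
    return np.minimum(h1, h2).sum()

def pyramid_match(hists1, hists2, L):
    # hists1[l], hists2[l]: concatenated cell histograms of the two images at pyramid level l.
    # Level 0 is weighted 1/2^L, level l >= 1 is weighted 1/2^(L - l + 1).
    score = histogram_intersection(hists1[0], hists2[0]) / 2.0 ** L
    for l in range(1, L + 1):
        score += histogram_intersection(hists1[l], hists2[l]) / 2.0 ** (L - l + 1)
    return score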

Caltech 101

Caltech 101 Pyramid matching

Discriminative models: Nearest neighbor (10^6 examples; Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005; ...), Neural networks (LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998), Support Vector Machines (Guyon, Vapnik; Heisele, Serre, Poggio), Latent SVM / Structural SVM (Felzenszwalb 00; Ramanan 03), Boosting (Viola, Jones 2001; Torralba et al. 2004; Opelt et al. 2006), Random forests. Courtesy of Vittorio Ferrari. Slide credit: Kristen Grauman. Slide adapted from Antonio Torralba

Lecture 13 Visual recognition Object classification bag of words models Discriminative methods Generative methods Object classification by PCA and FLD Silvio Savarese Lecture 13-20-Feb-14

Image classification: p(zebra | image) vs. p(no zebra | image). Bayes rule:
p(zebra | image) / p(no zebra | image) = [p(image | zebra) / p(image | no zebra)] x [p(zebra) / p(no zebra)]
posterior ratio = likelihood ratio x prior ratio

Discriminative methods: model the posterior ratio p(zebra | image) / p(no zebra | image) directly, i.e. learn the decision boundary between zebra and non-zebra

Generative methods: p(zebra | image) vs. p(no zebra | image). Bayes rule:
p(zebra | image) / p(no zebra | image) = [p(image | zebra) / p(image | no zebra)] x [p(zebra) / p(no zebra)]
posterior ratio = likelihood ratio x prior ratio

Generative models 1. Naïve Bayes classifier: Csurka, Bray, Dance & Fan, 2004 2. Hierarchical Bayesian text models (pLSA and LDA) Background: Hofmann 2001, Blei, Ng & Jordan, 2004 Object categorization: Sivic et al. 2005, Sudderth et al. 2005 Natural scene categorization: Fei-Fei et al. 2005

Some notation. w: the collection of all N codewords in the image, w = [w1, w2, ..., wN]. c: category of the image

The Naïve Bayes model: p(c | w) ∝ p(c) p(w | c) = p(c) p(w1, ..., wN | c), where p(c) is the prior probability of the object classes and p(w1, ..., wN | c) is the image likelihood given the class

The Naïve Bayes model: p(c | w) ∝ p(c) p(w1, ..., wN | c). Assume that each feature (codeword) is conditionally independent given the class: p(w1, ..., wN | c) = Π_{n=1}^{N} p(wn | c), where p(wn | c) is the likelihood of the n-th visual word given the class

The Naïve Bayes model, example: 2 classes, bananas vs. oranges, using a histogram of colors. wi = number of pixels colored yellow in the image; the x-axis shows the percentage of pixels that are yellow (25%, 50%, 75%), and the curves show the likelihoods p(wi | c1) and p(wi | c2)

The Naïve Bayes model: p(c | w) ∝ p(c) Π_{n=1}^{N} p(wn | c). How do we learn p(wi | cj)? From the empirical frequencies of codewords in images from a given class
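
A sketch of this counting step in Python (NumPy assumed); the Laplace smoothing term alpha is an addition, used here only to avoid zero probabilities for unseen codewords:

import numpy as np

def estimate_likelihoods(count_histograms, labels, num_classes, alpha=1.0):
    # count_histograms: (num_images, V) raw codeword counts; labels: (num_images,) class indices.
    V = count_histograms.shape[1]
    p_w_given_c = np.zeros((num_classes, V))
    for c in range(num_classes):
        counts = count_histograms[labels == c].sum(axis=0) + alpha
        p_w_given_c[c] = counts / counts.sum()   # empirical frequency of each codeword in class c
    return p_w_given_c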

Classification/Recognition: c* = argmax_c p(c | w) = argmax_c p(c) Π_{n=1}^{N} p(wn | c) (object class decision). Example: 2 classes, bananas vs. oranges; the query image contains a banana. Look at how many pixels are yellow (say 60%), read off the corresponding likelihood values p(wi | c1) and p(wi | c2) under the two class hypotheses, and pick the larger one: banana!
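
A sketch of the decision rule in Python (NumPy assumed); working in log space is a numerical-stability choice, not part of the slide:

import numpy as np

def classify_bow(count_hist, log_prior, p_w_given_c):
    # c* = argmax_c [ log p(c) + sum_n count(w_n) * log p(w_n | c) ]
    # count_hist: (V,) codeword counts; log_prior: (C,); p_w_given_c: (C, V).
    log_posterior = log_prior + count_hist @ np.log(p_w_given_c).T
    return int(np.argmax(log_posterior))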

Summary: Generative models Naïve Bayes Unigram models in document analysis Assumes conditional independence of words given class Parameter estimation: frequency counting

Csurka's dataset: 7 classes. Csurka et al. 2004

Results on Csurka's dataset: E = 28% (Naïve Bayes) vs. E = 15% (SVM)

Generative vs discriminative Discriminative methods Computationally efficient & fast Generative models Convenient for weakly- or un-supervised, incremental training Prior information Flexibility in modeling parameters

Weakness of the BoW models: all spatial arrangements of the same patches have equal probability under bag-of-words methods, yet location information is important; no rigorous geometric information about the object components; segmentation and localization unclear

Lecture 13 Visual recognition Object classification bag of words models Discriminative methods Generative methods Object classification by PCA and FLD Silvio Savarese Lecture 13-20-Feb-14

Object classification by Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Originally introduced for faces: Eigenfaces and Fisherfaces. Turk & Pentland, 91; Belhumeur et al.

The space of images (or histograms): an image (or histogram) H is a point in a high-dimensional space; an N x M image is a point in R^(NM) [Thanks to Chuck Dyer, Steve Seitz, Nishino]

Key idea: the images H in the possible set {x̂} are highly correlated, so compress them to a low-dimensional subspace that captures the key appearance characteristics of the visual degrees of freedom. Use PCA to estimate the subspace (dimensionality reduction). Compare two objects by projecting the images into the subspace and measuring the Euclidean distance between them. Eigenfaces: [Turk and Pentland 91]

Image space to face space: compute the n-dimensional subspace such that the projection of the data points onto the subspace has the largest variance among all n-dimensional subspaces, i.e. maximize the scatter of the training images in face space

Use PCA to estimate the subspace. PCA projection: computes the n-dimensional subspace such that the projection of the data points onto the subspace has the largest variance among all n-dimensional subspaces. [Figure: six 2-D data points (x1, x2) and their PCA projection onto a 1-D subspace.]

Use PCA to estimate the subspace. [Figure: the 1st principal component is the direction of largest variance in the (x1, x2) plane; the 2nd principal component is orthogonal to it.]

PCA mathematical formulation. PCA = eigenvalue decomposition of a data covariance (scatter) matrix. Define an orthonormal transformation W from n-dimensional to m-dimensional space: y_j = W^T x_j, j = 1, 2, ..., N.
Data scatter matrix: S_T = Σ_{j=1}^{N} (x_j − x̄)(x_j − x̄)^T.
Transformed-data scatter matrix: S̃_T = Σ_{j=1}^{N} (y_j − ȳ)(y_j − ȳ)^T = W^T S_T W.
W is chosen to maximize the scatter of the projected data; the solution is given by the eigenvectors of S_T: W = [v1 v2 ... vm]
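
A minimal sketch of this computation in Python (NumPy assumed):

import numpy as np

def pca_subspace(X, m):
    # X: (N, n) data matrix, one sample per row. Returns the mean and W = [v1 ... vm],
    # the m eigenvectors of the total scatter matrix S_T with the largest eigenvalues.
    mean = X.mean(axis=0)
    Xc = X - mean
    S_T = Xc.T @ Xc                          # S_T = sum_j (x_j - mean)(x_j - mean)^T
    eigvals, eigvecs = np.linalg.eigh(S_T)   # S_T is symmetric; eigenvalues come back ascending
    return mean, eigvecs[:, ::-1][:, :m]     # keep the top-m eigenvectors as columns of W

For face-sized images n is very large, so in practice the same eigenvectors are usually obtained from the smaller N x N Gram matrix Xc Xc^T or from an SVD of Xc rather than by forming S_T explicitly.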

[Figure: image space projected onto the face space spanned by eigenvectors v1, v2, v3, v4.]

Projecting onto the eigenfaces: the eigenfaces v1, ..., vK span the space of faces. A face x is converted to eigenface coordinates by projecting its difference from the mean face onto each eigenface: a_i = v_i^T (x − x̄), i = 1, ..., K, so that x ≈ x̄ + a_1 v1 + ... + a_K vK

Algorithm, training: 1. Align training images x_1, x_2, ..., x_N (note that each image is flattened into a long vector!). 2. Compute the average face x̄ = (1/N) Σ_i x_i. 3. Compute the difference images x_i − x̄

Algorithm 4. Compute the covariance matrix (total scatter matrix) S_T = Σ_{j=1}^{N} (x_j − x̄)(x_j − x̄)^T. 5. Compute the eigenvectors of S_T. 6. Compute the training projections a_1, a_2, ..., a_N. Testing: 1. Take a query image x. 2. Project x into eigenface space (W = {eigenfaces}) and compute its projection ω. 3. Compare the projection ω with all N training projections a_i
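
A sketch of these training and testing steps in Python (NumPy assumed; pca_subspace is the function from the PCA sketch above, and nearest-neighbor matching in eigenface space is one common choice for step 3 of testing):

import numpy as np

def train_eigenfaces(X, m):
    # X: (N, n), each aligned face flattened into a row vector (training steps 1-6).
    mean, W = pca_subspace(X, m)
    A = (X - mean) @ W                       # training projections a_1, ..., a_N, shape (N, m)
    return mean, W, A

def recognize(x_query, mean, W, A, train_labels):
    # Testing: project the query into eigenface space and return the label
    # of the nearest training projection.
    omega = (x_query - mean) @ W
    distances = np.linalg.norm(A - omega, axis=1)
    return train_labels[np.argmin(distances)]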

Illustration of Eigenfaces The visualization of eigenvectors: These are the first 4 eigenvectors from a training set of 400 images (ORL Face Database).

Eigenfaces look somewhat like generic faces.

Reconstruction and Errors P = 4 P = 200 P = 400 Only selecting the top P eigenfaces reduces the dimensionality. Fewer eigenfaces result in more information loss, and hence less discrimination between faces.
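
A sketch of reconstruction from the top P eigenfaces in Python (NumPy assumed; mean and W come from the eigenface training sketch above, with the columns of W sorted by decreasing eigenvalue):

import numpy as np

def reconstruct(x, mean, W, P):
    # Approximate a face using only its first P eigenface coefficients.
    coeffs = (x - mean) @ W[:, :P]
    x_hat = mean + W[:, :P] @ coeffs
    return x_hat, np.linalg.norm(x - x_hat)  # reconstruction error grows as P shrinks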

Summary for Eigenface Pros Non-iterative, globally optimal solution Limitations PCA projection is optimal for reconstruction from a low dimensional basis, but may NOT be optimal for discrimination

Extensions Generalized PCA: R. Vidal, Y. Ma, and S. Sastry. Generalized Principal Component Analysis (GPCA). IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 27, number 12, pages 1-15, 2005. Tensor Faces: "Multilinear Analysis of Image Ensembles: TensorFaces," M.A.O. Vasilescu, D. Terzopoulos, Proc. 7th European Conference on Computer Vision (ECCV'02), Copenhagen, Denmark, May, 2002 PCA-SIFT PCA-SIFT: A More Distinctive Representation for Local Image Descriptors - Y Ke, R Sukthankar - IEEE CVPR 04

Linear Discriminant Analysis (LDA) / Fisher's Linear Discriminant (FLD). Eigenfaces exploit the maximum scatter of the training images in face space; Fisherfaces attempt to maximize the between-class scatter while minimizing the within-class scatter.

Illustration of the projection, using two classes as an example: a poor projection direction mixes the two classes, while a good one separates them

Variables. N sample images: x_1, ..., x_N. c classes: χ_1, ..., χ_c. Average of each class: μ_i = (1/N_i) Σ_{x_k ∈ χ_i} x_k. Total average: μ = (1/N) Σ_{k=1}^{N} x_k

Scatters. Scatter of class i: S_i = Σ_{x_k ∈ χ_i} (x_k − μ_i)(x_k − μ_i)^T. Within-class scatter: S_W = Σ_{i=1}^{c} S_i. Between-class scatter: S_B = Σ_{i=1}^{c} N_i (μ_i − μ)(μ_i − μ)^T. Total scatter: S_T = S_B + S_W
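
A sketch of these scatter matrices in Python (NumPy assumed):

import numpy as np

def scatter_matrices(X, labels):
    # X: (N, n) samples, one per row; labels: (N,) class indices.
    mu = X.mean(axis=0)                                  # total average
    n = X.shape[1]
    S_W = np.zeros((n, n))
    S_B = np.zeros((n, n))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mu_c = Xc.mean(axis=0)                           # class average mu_i
        d = Xc - mu_c
        S_W += d.T @ d                                   # class scatter S_i added into S_W
        diff = (mu_c - mu).reshape(-1, 1)
        S_B += Xc.shape[0] * (diff @ diff.T)             # N_i (mu_i - mu)(mu_i - mu)^T
    return S_W, S_B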

Illustration: the within-class scatter S_W = S_1 + S_2 measures the spread of each class around its own mean, while the between-class scatter S_B measures the spread of the class means around the total mean

Mathematical formulation (1). After projection: y_k = W^T x_k. Between-class scatter (of the y's): S̃_B = W^T S_B W. Within-class scatter (of the y's): S̃_W = W^T S_W W

Illustration: after projection y_k = W^T x_k, the projected within-class scatter is S̃_W = S̃_1 + S̃_2 = W^T S_W W and the projected between-class scatter is S̃_B = W^T S_B W

Mathematical formulation. The desired projection: W_opt = argmax_W |S̃_B| / |S̃_W| = argmax_W |W^T S_B W| / |W^T S_W W|. How is it found? Via generalized eigenvectors: S_B w_i = λ_i S_W w_i, i = 1, ..., m. If S_W has full rank, the generalized eigenvectors are the eigenvectors of S_W^{-1} S_B with the largest eigenvalues
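
A sketch of this step in Python (NumPy and SciPy assumed; scipy.linalg.eigh solves the generalized symmetric eigenproblem directly, which avoids forming S_W^{-1} explicitly):

import numpy as np
from scipy.linalg import eigh

def fld_projection(S_W, S_B, m):
    # Solve S_B w_i = lambda_i S_W w_i (S_W assumed full rank / positive definite)
    # and keep the m eigenvectors with the largest eigenvalues as the columns of W_opt.
    eigvals, eigvecs = eigh(S_B, S_W)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order[:m]]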

Results: Eigenface vs. Fisherface. Input: 160 images of 16 people, with variation in facial expression, eyewear, and lighting (with/without glasses, 3 lighting conditions, 5 expressions). Train: 159 images; test: 1 image

Results: Eigenface vs. Fisherface. [Figure: error rate (%) of each method.]

Object detection Next lecture