CAP5415 Computer Vision, Lecture 18: Face Recognition. Dr. Ulas Bagci, bagci@ucf.edu

Lecture 18: Face Detection and Recognition. (Figure: detection locates a face; recognition identifies it, e.g., "Sally".)

Why wasn't the Massachusetts bomber identified by the Massachusetts Department of Motor Vehicles system from the video surveillance images? He was enrolled in the MA DMV database! DMV face recognition system? (Slide credits: Animetrics, Dr. Marc Valliant, VP & CTO)

Controlled Facial Photo: Today's FR technology will reliably find a controlled facial photo in a mugshot database of controlled photos. However, there are confounding variables in uncontrolled facial photos: resolution (not enough pixels), angulated facial pose, illumination, and occluded facial areas.

Further Difficulties

Three goals: feature computation (features must be computed as quickly as possible), feature selection (select the most discriminating features), and real-timeliness (focus on potentially positive image areas, i.e., those that contain faces).

Face Detection: Before face recognition can be applied to a general image, the locations and sizes of any faces must first be found. Rowley, Baluja, Kanade (1998).

Face Detection/Recognition using Mobile Devices: face detection (the camera automatically adjusts the focus based on detected faces) and auto-login with recognized faces.

Face detection approaches: feature-based (eyes, mouth, ...), template-based (e.g., AAM), and appearance-based (patches).

Some of the representative works

Rectangle (Haar-like) Features. Rectangle filters: value = (sum of pixels in white area) − (sum of pixels in black area).

Fast Computation with Integral Images: the integral image can be computed in one pass over the image. Formal definition: ii(x, y) = Σ_{x' ≤ x, y' ≤ y} i(x', y'). Recursive definition: s(x, y) = s(x, y − 1) + i(x, y) and ii(x, y) = ii(x − 1, y) + s(x, y). Example: the 4×4 image [0 1 1 1; 1 2 2 3; 1 2 1 1; 1 3 1 0] has integral image [0 1 2 3; 1 4 7 11; 2 7 11 16; 3 11 16 21].
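As a quick illustration of this idea, here is a minimal NumPy sketch (not the lecture's code; the helper names are mine): the integral image is built with two cumulative sums, any box sum then needs only four lookups, and a two-rectangle Haar-like feature is just the difference of two box sums.

```python
import numpy as np

def integral_image(img):
    """ii(x, y) = sum of i(x', y') over x' <= x, y' <= y (one pass via cumulative sums)."""
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in the inclusive rectangle [top..bottom, left..right] using four lookups."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

# The 4x4 example from the slide; its integral image matches the table shown above.
img = np.array([[0, 1, 1, 1],
                [1, 2, 2, 3],
                [1, 2, 1, 1],
                [1, 3, 1, 0]])
ii = integral_image(img)

# A two-rectangle Haar-like feature: (sum in left half) - (sum in right half).
feature_value = box_sum(ii, 0, 0, 3, 1) - box_sum(ii, 0, 2, 3, 3)
```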

Feature Selection: For a 24×24 detection region, the number of possible rectangle features is ~160,000! Dimensionality reduction (e.g., PCA) can help.

Local Binary Patterns (LBP): Alternative Features. A gray-scale invariant texture measure derived from the local neighborhood: a powerful texture descriptor, computationally simple, and robust against monotonic gray-scale changes.
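To make the descriptor concrete, here is a small sketch (my own illustration, using the basic 3×3 LBP definition) of the code for one pixel; because each neighbor is only compared against the center value, the code is unchanged by any monotonic gray-scale transformation.

```python
import numpy as np

def lbp_code(patch):
    """Basic 3x3 LBP: threshold the 8 neighbors at the center value and read the
    results off as an 8-bit code (here, clockwise starting at the top-left pixel)."""
    center = patch[1, 1]
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for bit, value in enumerate(neighbors):
        if value >= center:
            code |= 1 << bit
    return code

patch = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
print(lbp_code(patch))  # a histogram of such codes over a region gives the LBP descriptor
```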

Local Binary Patterns (LBP): Alternative Features (LBP from dynamic/video texture).

Principal Component Analysis (PCA): a mapping from the inputs in the original d-dimensional space to a new k-dimensional space (k < d), with minimum loss of information. PCA is an unsupervised method; it does not use output information. PCA centers the sample and then rotates the axes to line up with the directions of highest variance.

Principal Component Analysis (PCA): The projection of x on the direction of w is z = w^T x. (Figure: data points plotted against the original axes, with the first and second principal components marked.)

Principal Component Analysis (PCA): The projection of x on the direction of w is z = w^T x. The first principal component is the direction w_1 such that the sample, after projection on w_1, is most spread out, so that the differences between the sample points become most apparent. To have a unique solution, require ||w_1|| = 1. With z_1 = w_1^T x and Cov(x) = Σ, we get Var(z_1) = w_1^T Σ w_1. Seek w_1 such that Var(z_1) is maximized!

Solution of PCA: Write it as a Lagrange problem and take derivatives w.r.t. w. Then z = W^T (x − m), where m is the sample mean, and Cov(z) = W^T S W with S = Cov(x). The spectral decomposition of the scatter matrix, X^T X = W D W^T (D diagonal), gives the principal directions as the columns of W.

Solution of PCA: X^T X = W D W^T. To reduce the dimensionality to k < d, take the first k columns of W (those with the highest eigenvalues): z_i^t = w_i^T x^t for i = 1, ..., k and t = 1, ..., N. Equivalently, via the SVD X = U S V^T: U = evec(X X^T), V = evec(X^T X), S^2 = eval(X X^T), and (X^T X) w_i = λ_i w_i.
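To tie the two routes together, here is a minimal NumPy sketch (my illustration, not the lecture's code) showing that the eigenvectors of the scatter matrix X^T X and the right singular vectors of the centered data give the same projection directions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # N samples x d dimensions
Xc = X - X.mean(axis=0)                # center the sample (subtract m)

# Route 1: eigenvectors of the scatter matrix X^T X, sorted by decreasing eigenvalue
evals, W = np.linalg.eigh(Xc.T @ Xc)   # (X^T X) w_i = lambda_i w_i
order = np.argsort(evals)[::-1]
evals, W = evals[order], W[:, order]

# Route 2: SVD of the centered data, X = U S V^T; the rows of Vt are the same
# directions as the columns of W (up to sign), and S**2 equals the eigenvalues above.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 2
Z = Xc @ W[:, :k]                      # z_i^t = w_i^T (x^t - m), first k components
```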

Scree plot: the ability of PCs to explain the variation in the data. Keep enough PCs (principal components) that the cumulative variance explained is >50-70%. Kaiser criterion: keep PCs with eigenvalues > 1. The proportion of variance explained by the i-th PC is λ_i / Σ_{j=1}^{N} λ_j.
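A small sketch of both selection rules (my illustration; the eigenvalues below are made up for the example):

```python
import numpy as np

def choose_k(eigenvalues, threshold=0.7):
    """Smallest k whose first k eigenvalues explain at least `threshold` of the total variance."""
    ratios = np.cumsum(eigenvalues) / np.sum(eigenvalues)
    return int(np.searchsorted(ratios, threshold) + 1)

eigenvalues = np.array([4.2, 2.1, 0.9, 0.5, 0.3])   # hypothetical, sorted descending
k_variance = choose_k(eigenvalues, 0.7)              # cumulative-variance rule -> 2
k_kaiser = int(np.sum(eigenvalues > 1.0))            # Kaiser criterion: eigenvalues > 1 -> 2
```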

Recap: PCA calculations in cartoon. Step #1: Calculate the adjusted data set A from the data set D (n dimensions × data samples) by subtracting the mean values M, where M_i is the mean of the values in dimension i.

Recap: PCA calculations in cartoon. Step #2: Calculate the covariance matrix C from the adjusted data set A, with C_ij = cov(i, j). Note: since the means of the dimensions in the adjusted data set A are 0, the covariance matrix can simply be written as C = A A^T / (n − 1).

Recap: PCA calculations in cartoon. Step #3: Calculate the eigenvectors and eigenvalues of C; the eigenvectors form the columns of a matrix E. If some eigenvalues are 0 or very small, we can essentially discard those eigenvalues and the corresponding eigenvectors, hence reducing the dimensionality of the new basis.

Recap: PCA calculations in cartoon. Step #4: Transform the data set to the new basis: F = E^T A, where F is the transformed data set, E^T is the transpose of the matrix E containing the eigenvectors, and A is the adjusted data set. Note that the dimensionality of the new data set F is less than that of A. To recover A from F: (E^T)^{-1} F = (E^T)^{-1} E^T A; since E is orthogonal, E^{-1} = E^T, so (E^T)^T F = A, i.e., E F = A.
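The four steps above, written out as a short NumPy sketch (my illustration, following the cartoon's dimensions × samples layout):

```python
import numpy as np

rng = np.random.default_rng(1)
D = rng.normal(size=(3, 50))             # data set D: n dimensions x data samples

# Step 1: adjusted data set A (subtract the per-dimension mean M)
M = D.mean(axis=1, keepdims=True)
A = D - M

# Step 2: covariance matrix C = A A^T / (n - 1)
n = D.shape[1]
C = A @ A.T / (n - 1)

# Step 3: eigenvectors (columns of E) and eigenvalues of C, largest first
eigenvalues, E = np.linalg.eigh(C)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, E = eigenvalues[order], E[:, order]
# Columns of E with near-zero eigenvalues could be dropped here to reduce dimensionality.

# Step 4: transform to the new basis and recover the original (E is orthogonal, so E^-1 = E^T)
F = E.T @ A
A_recovered = E @ F                      # equals A up to floating-point error
```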

Holistic FR: Eigenfaces. Eigenfaces, Fisherfaces, Tensorfaces, ...

Gabor Feature-based FR: Earlier FR methods are mostly feature-based. The most successful feature-based FR is the elastic bunch graph matching system, with Gabor filter coefficients as features.

Gabor Features: Scale (5) × Orientation (8).

PCA on Faces: Eigenfaces. (Figure: average face, first principal component, and other components; for all except the average, gray = 0, white > 0, black < 0.)

Eigenfaces example. (Figure: training faces.)

Eigenfaces example. Top eigenvectors: u_1, ..., u_k. Mean: µ.

Application to faces: representing faces in this basis, and face reconstruction from the basis.
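The projection and reconstruction formulas on this slide did not survive transcription, so here is a standard sketch of the usual eigenface equations (my notation): the weights are w_i = u_i^T (x − µ) and the reconstruction is x̂ = µ + Σ_i w_i u_i.

```python
import numpy as np

def project_face(x, mu, U):
    """Weights of a face vector x in the eigenface basis: w_i = u_i^T (x - mu).
    U holds one eigenface per column."""
    return U.T @ (x - mu)

def reconstruct_face(w, mu, U):
    """Reconstruction from the k weights: x_hat = mu + sum_i w_i * u_i."""
    return mu + U @ w
```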

Simplest Approach to FR: The simplest approach is to treat it as a template matching problem. Problems arise when performing recognition in a high-dimensional space. Significant improvements can be achieved by first mapping the data into a lower-dimensional space.

FR using eigenfaces.

FR using eigenfaces: The distance e_r is called the distance within face space (difs). The Euclidean distance can be used to compute e_r, ||Ω − Ω_k||^2 = Σ_{i=1}^{K} (w_i − w_i^k)^2; however, the Mahalanobis distance, which weights each squared difference by the inverse of the corresponding eigenvalue, has been shown to work better.
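A small sketch of the two distances between weight vectors (my illustration; the Mahalanobis form shown is the common eigenvalue-weighted variant):

```python
import numpy as np

def euclidean_dist(w, w_k):
    """Squared Euclidean distance between two eigenface weight vectors."""
    return np.sum((w - w_k) ** 2)

def mahalanobis_dist(w, w_k, eigenvalues):
    """Mahalanobis-style distance: each component is scaled by the inverse of its
    eigenvalue (the variance along that eigenface)."""
    return np.sum((w - w_k) ** 2 / eigenvalues)
```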

Face detection (iPhoto).

Face Detection: Nikon S60.

Face Detection: the Nikon S60 finds 12 faces.

The Viola/Jones Face Detector: a seminal approach to real-time object detection. Training is slow, but detection is very fast. Key ideas: integral images for fast feature evaluation; boosting for feature selection; an attentional cascade for fast rejection of non-face windows. P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001. P. Viola and M. Jones. Robust real-time face detection. IJCV 57(2), 2004.

The Viola/Jones Face Detector - Training: Initially, weight each training example equally. In each boosting round: find the weak learner that achieves the lowest weighted training error, then raise the weights of the training examples misclassified by the current weak learner. Compute the final classifier as a linear combination of all weak learners (the weight of each learner is directly proportional to its accuracy). Exact formulas for re-weighting and combining weak learners depend on the particular boosting scheme (e.g., AdaBoost).

The Viola/Jones Face Detector - Testing: The first two features selected by boosting. This feature combination can yield a 100% detection rate with a 50% false positive rate.

The Viola/Jones Face Detector - Testing: A 200-feature classifier can yield a 95% detection rate and a false positive rate of 1 in 14084. Not good enough!

Attentional Cascade: We start with simple classifiers that reject many of the negative sub-windows while detecting almost all positive sub-windows. A positive response from the first classifier triggers the evaluation of a second (more complex) classifier, and so on. A negative outcome at any point leads to the immediate rejection of the sub-window. (Figures: a receiver operating characteristic curve, % detection vs. % false positives, with the trade-off between false positives and false negatives set per stage; and a cascade diagram: image sub-window → Classifier 1 → Classifier 2 → Classifier 3 → FACE, with a negative outcome at any stage going to NON-FACE.)
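In code, the cascade's control flow is just early rejection; a minimal sketch under my own naming (not the Viola/Jones implementation):

```python
def cascade_classify(window, stages):
    """Attentional cascade: `stages` is a list of callables, ordered from simplest
    to most complex, each returning True if the sub-window may contain a face."""
    for stage in stages:
        if not stage(window):
            return False      # negative at any stage: reject the sub-window immediately
    return True               # survived all stages: report a face
```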

Cascaded Classifiers (Boosting). (Figure: base-learners applied to the input and combined into the output.)

Boosting for FR: Weak Classifier 1.

Boosting for FR: weights increased.

Boosting for FR: Weak Classifier 2.

Boosting for FR: weights increased.

Boosting for FR: Weak Classifier 3.

Boosting for FR: the final classifier is a combination of the weak classifiers.

AdaBoost Algorithm. Given (x_1, y_1), ..., (x_m, y_m) with x_i ∈ X and y_i ∈ Y = {−1, +1}. Initialize D_1(i) = 1/m for i = 1, ..., m. For t = 1, ..., T: choose the classifier h_t : X → {−1, +1} that minimizes the error with respect to the distribution D_t, i.e., h_t = argmin_{h ∈ H} ε_t, where ε_t = Σ_i D_t(i) [y_i ≠ h(x_i)] is the weighted error rate of classifier h_t. If ε_t ≥ 0.5, then stop. Choose α_t ∈ R, typically α_t = (1/2) ln((1 − ε_t) / ε_t). Update D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)) / Z_t, where Z_t is a normalization factor (chosen so that D_{t+1} sums to 1).
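As a rough illustration of the algorithm above (a minimal NumPy sketch, not the course's code; the `weak_learners` argument, a list of candidate classifiers mapping samples to ±1, is a simplification introduced here):

```python
import numpy as np

def adaboost(X, y, weak_learners, T):
    """X: (m, d) samples, y: labels in {-1, +1}. Each weak learner is a callable
    h(X) -> array of {-1, +1}; the best one under the current weights is picked per round."""
    m = X.shape[0]
    D = np.full(m, 1.0 / m)                       # D_1(i) = 1/m
    alphas, chosen = [], []
    for _ in range(T):
        # weighted error eps_t of every candidate; keep the minimizer
        errors = [np.sum(D * (h(X) != y)) for h in weak_learners]
        best = int(np.argmin(errors))
        eps = float(errors[best])
        if eps >= 0.5:                            # no better than chance: stop
            break
        eps = max(eps, 1e-12)                     # guard against log(1/0) for a perfect learner
        alpha = 0.5 * np.log((1 - eps) / eps)     # alpha_t = (1/2) ln((1 - eps_t)/eps_t)
        D = D * np.exp(-alpha * y * weak_learners[best](X))
        D /= D.sum()                              # Z_t normalization
        alphas.append(alpha)
        chosen.append(weak_learners[best])

    def H(X_new):
        """Final strong classifier: sign of the weighted vote of the chosen weak learners."""
        votes = sum(a * h(X_new) for a, h in zip(alphas, chosen))
        return np.sign(votes)
    return H
```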

Boosting for FR: Define weak learners based on rectangle features. The final strong classifier is H(x) = sign(Σ_{t=1}^{T} α_t h_t(x)).

Boosting & SVM. Advantages of boosting: it integrates classification with feature selection; the complexity of training is linear instead of quadratic in the number of training examples; flexibility in the choice of weak learners and boosting scheme; testing is fast; easy to implement. Disadvantages: it needs many training examples; it often does not work as well as an SVM.

Simple FR for Mobile Devices (LBP: local binary patterns).

References & Slide Credits:
Animetrics, Dr. Marc Valliant, VP & CTO.
M. Turk and A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, vol. 3, no. 1, 1991.
Y. Freund and R. Schapire, "A short introduction to boosting", Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September 1999.
S. Li et al., Handbook of Face Recognition, Springer.
P. A. Viola and M. J. Jones, "Robust real-time face detection", International Journal of Computer Vision, 57(2):137-154, 2004 (originally in CVPR 2001).
Some slides adapted from Bill Freeman, MIT 6.869, April 2005.
J. Friedman, T. Hastie, and R. Tibshirani, "Additive Logistic Regression: a Statistical View of Boosting", http://www-stat.stanford.edu/~hastie/papers/boost.ps