CS 3710: Visual Recognition. Classification and Detection. Adriana Kovashka, Department of Computer Science, January 13, 2015

Plan for Today: visual recognition basics, part 2 (classification and detection); Adriana's research; next time: first student presentation

Classification vs. Detection. Classification: given an image or an image region, determine which of N categories it represents. Detection: determine where in the image a category is to be found.

Classification

James Hays

The machine learning framework. Apply a prediction function to a feature representation of the image to get the desired output: f([image]) = "apple", f([image]) = "tomato", f([image]) = "cow". Svetlana Lazebnik

The machine learning framework. y = f(x), where y is the output, f the prediction function, and x the image features. Training: given a training set of labeled examples {(x_1, y_1), ..., (x_N, y_N)}, estimate the prediction function f by minimizing the prediction error on the training set. Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x). Svetlana Lazebnik
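To make the framework concrete, here is a minimal sketch (my own toy example, not from the lecture) in which f is a one-dimensional linear predictor and "training" minimizes squared prediction error on the training set:

```python
# A bare-bones instance of y = f(x): fit f by least squares on training pairs,
# then apply it to a never-before-seen test example. Toy numbers only; the
# lecture's f would take image features as input.
import numpy as np

x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([1.1, 1.9, 3.2, 3.9])

# Fit y = a*x + b by minimizing sum_i (f(x_i) - y_i)^2.
a, b = np.polyfit(x_train, y_train, deg=1)
f = lambda x: a * x + b

x_test = 5.0                  # a never-before-seen example
print(f(x_test))              # predicted value y = f(x)
```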

Training steps. Training: training images + training labels → image features → training → learned model. Testing: test image → image features → learned model → prediction. Derek Hoiem, Svetlana Lazebnik

Recognition task and supervision. Images in the training set must be annotated with the correct answer that the model is expected to produce, e.g., "Contains a motorbike". Svetlana Lazebnik

Generalization. How well does a learned model generalize from the data it was trained on to a new test set? Training set (labels known) vs. test set (labels unknown). Svetlana Lazebnik

Classification. Assign an input vector to one of two or more classes. Any decision rule divides the input space into decision regions separated by decision boundaries. Svetlana Lazebnik

Supervised classification. Given a collection of labeled examples, come up with a function that will predict the labels of new examples ("four" and "nine" as training examples, "?" as the novel input). How good is some function that we come up with to do the classification? It depends on the mistakes made and the cost associated with the mistakes. Kristen Grauman

Supervised classification. Given a collection of labeled examples, come up with a function that will predict the labels of new examples. Consider the two-class (binary) decision problem: L(4→9) is the loss of classifying a 4 as a 9, and L(9→4) is the loss of classifying a 9 as a 4. The risk of a classifier s is its expected loss: R(s) = Pr(4→9 | using s) · L(4→9) + Pr(9→4 | using s) · L(9→4). We want to choose a classifier so as to minimize this total risk. Kristen Grauman

Supervised classification. The optimal classifier will minimize total risk. At the decision boundary (a point on the feature-value axis), either choice of label yields the same expected loss. If we choose class "four" at the boundary, the expected loss is P(class is 9 | x) · L(9→4) + P(class is 4 | x) · L(4→4) = P(class is 9 | x) · L(9→4), since L(4→4) = 0. If we choose class "nine" at the boundary, the expected loss is P(class is 4 | x) · L(4→9). So the best decision boundary is at the point x where P(class is 9 | x) · L(9→4) = P(class is 4 | x) · L(4→9). Kristen Grauman

Supervised classification. The optimal classifier will minimize total risk. At the decision boundary, either choice of label yields the same expected loss. To classify a new point x, choose the class with the lowest expected loss; i.e., choose "four" if P(9 | x) · L(9→4) < P(4 | x) · L(4→9): the left side is the loss for choosing "four", the right side the loss for choosing "nine". Kristen Grauman

Supervised classification. (Plot: the class posteriors P(4 | x) and P(9 | x) over the feature value x.) The optimal classifier will minimize total risk. At the decision boundary, either choice of label yields the same expected loss. To classify a new point x, choose the class with the lowest expected loss; i.e., choose "four" if P(9 | x) · L(9→4) < P(4 | x) · L(4→9). How do we evaluate these probabilities? Kristen Grauman
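As a concrete illustration of this minimum-expected-loss rule, here is a small sketch with made-up posteriors and losses (all the numbers are hypothetical, not from the slides):

```python
# Minimum-risk decision between "four" and "nine" at a feature value x.

def choose_label(p4_given_x, p9_given_x, loss_9_as_4, loss_4_as_9):
    """Pick the class with the lower expected loss at x."""
    loss_choose_four = p9_given_x * loss_9_as_4   # we say "4" but truth is "9"
    loss_choose_nine = p4_given_x * loss_4_as_9   # we say "9" but truth is "4"
    return "four" if loss_choose_four < loss_choose_nine else "nine"

# Example: mistaking a "9" for a "4" is 3x as costly, and the posteriors
# only slightly favor "4".
print(choose_label(p4_given_x=0.6, p9_given_x=0.4,
                   loss_9_as_4=3.0, loss_4_as_9=1.0))
# -> "nine": the asymmetric loss overrides the slightly higher P(4 | x)
```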

Classifiers: Nearest neighbor. (Illustration: training examples from class 1, training examples from class 2, and a test example.) f(x) = label of the training example nearest to x. All we need is a distance function for our inputs; no training is required! Svetlana Lazebnik

1-nearest neighbor. (Illustration: in the (x_1, x_2) plane, a query point takes the label of its single nearest neighbor among the + and o training points.) James Hays

5-nearest neighbor. (Illustration: the same query point is labeled by a majority vote over its five nearest neighbors.) James Hays
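A minimal numpy sketch covering both the 1-NN and k-NN cases above (the toy data and the choice of Euclidean distance are my own assumptions; any distance function would do):

```python
import numpy as np

def knn_predict(X_train, y_train, x_test, k=1):
    """Classify x_test by majority vote among its k nearest training examples."""
    dists = np.linalg.norm(X_train - x_test, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]                    # indices of k nearest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy 2-D data: class 1 near the origin, class 2 near (5, 5).
X = np.array([[0.0, 0.5], [1.0, 0.0], [0.5, 1.0],
              [5.0, 5.0], [4.5, 5.5], [5.5, 4.5]])
y = np.array([1, 1, 1, 2, 2, 2])
print(knn_predict(X, y, np.array([0.8, 0.6]), k=1))  # -> 1
print(knn_predict(X, y, np.array([4.0, 4.0]), k=5))  # -> 2 (three 2s, two 1s)
```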

Linear classifiers. C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Lines in R². Let w = (a, c) and x = (x, y). Then the line ax + cy + b = 0 can be written w·x + b = 0. The distance from a point (x_0, y_0) to the line is D = (a·x_0 + c·y_0 + b) / sqrt(a² + c²) = (w·x_0 + b) / ||w|| (a signed quantity; its magnitude is the point-to-line distance). Kristen Grauman
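The distance formula in a few lines of numpy (a sketch; the function name and example line are mine):

```python
import numpy as np

def point_line_distance(w, b, p):
    """Distance from point p = (x0, y0) to the line w·x + b = 0, w = (a, c):
    D = |a*x0 + c*y0 + b| / sqrt(a^2 + c^2) = |w·p + b| / ||w||."""
    return abs(np.dot(w, p) + b) / np.linalg.norm(w)

# Example: the line x + y - 1 = 0 and the origin -> D = 1/sqrt(2) ≈ 0.707.
print(point_line_distance(np.array([1.0, 1.0]), -1.0, np.array([0.0, 0.0])))
```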

Linear classifiers. Find a linear function to separate the positive and negative examples: positive examples: x_i·w + b ≥ 0; negative examples: x_i·w + b < 0. Which line is best? C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Support vector machines. A discriminative classifier based on the optimal separating line (for the 2-D case). Maximize the margin between the positive and negative training examples. C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Support vector machines. Want the line that maximizes the margin: positive (y_i = 1): x_i·w + b ≥ 1; negative (y_i = −1): x_i·w + b ≤ −1. For the support vectors, x_i·w + b = ±1. The distance between a point x_i and the line is (x_i·w + b) / ||w||, so for the support vectors, (x_i·w + b) / ||w|| = ±1/||w||. Therefore, the margin is M = 1/||w|| − (−1/||w||) = 2/||w||. C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Finding the maximum margin line. 1. Maximize the margin 2/||w||. 2. Correctly classify all training data points: positive (y_i = 1): x_i·w + b ≥ 1; negative (y_i = −1): x_i·w + b ≤ −1. This is a quadratic optimization problem: minimize (1/2) w^T w subject to y_i (x_i·w + b) ≥ 1, with one constraint for each training point. Note the sign trick: multiplying by y_i folds the positive and negative cases into a single constraint. C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Finding the maximum margin line. Solution: w = Σ_i α_i y_i x_i, a weighted combination of the support vectors x_i, where the α_i are the learned weights. C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Finding the maximum margin line. Solution: w = Σ_i α_i y_i x_i, and b = y_i − w·x_i (for any support vector). Classification function: f(x) = sign(w·x + b) = sign(Σ_i α_i y_i x_i·x + b). Notice that it relies on an inner product between the test point x and the support vectors x_i. (Solving the optimization problem also involves computing the inner products x_i·x_j between all pairs of training points.) If f(x) < 0, classify as negative; otherwise classify as positive. C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998
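As a sanity check on f(x) = sign(Σ_i α_i y_i x_i·x + b), here is a sketch using scikit-learn's SVC, whose dual_coef_ attribute stores the products α_i·y_i; a large C approximates the hard-margin problem above (the toy data is made up):

```python
# Recover the SVM decision function by hand from a fitted linear SVC.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [2.0, 2.5], [0.5, 2.0],
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.5]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)   # large C ~ hard margin

x_test = np.array([3.0, 3.5])
# dual_coef_[0] holds alpha_i * y_i for each support vector.
f = np.sum(clf.dual_coef_[0] * (clf.support_vectors_ @ x_test)) + clf.intercept_[0]
print(np.sign(f), clf.predict([x_test])[0])   # the two agree
```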

Nonlinear SVMs. Datasets that are linearly separable work out great. But what if the dataset is just too hard, e.g., 1-D points where no single threshold around 0 separates the classes? We can map it to a higher-dimensional space, e.g., x → (x, x²). Andrew Moore

Nonlinear SVMs. General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable: Φ: x → φ(x). Andrew Moore
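A tiny numeric illustration of such a lifting, using the x → (x, x²) map from the previous slide (the data and the separating threshold are my own toy choices):

```python
# 1-D points in [-1, 1] vs. points outside it are not linearly separable on
# the line, but become separable in the lifted (x, x^2) plane.
import numpy as np

x = np.array([-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0])
y = np.where(np.abs(x) <= 1, 1, -1)       # inner points are the positives

lifted = np.stack([x, x**2], axis=1)      # phi(x) = (x, x^2)
# In the lifted space, the horizontal line x2 = 2 separates the classes:
predictions = np.where(lifted[:, 1] <= 2.0, 1, -1)
print(np.array_equal(predictions, y))     # True
```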

Nonlinear SVMs. The kernel trick: instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that K(x_i, x_j) = φ(x_i)·φ(x_j). Andrew Moore

Examples of kernel functions. Linear: K(x_i, x_j) = x_i^T x_j. Gaussian RBF: K(x_i, x_j) = exp(−||x_i − x_j||² / (2σ²)). Histogram intersection: K(x_i, x_j) = Σ_k min(x_i(k), x_j(k)). Andrew Moore
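Minimal numpy sketches of the three kernels above (σ is a free parameter; for histogram intersection the inputs are assumed to be histograms, e.g., bag-of-features counts):

```python
import numpy as np

def linear_kernel(xi, xj):
    return np.dot(xi, xj)

def gaussian_rbf_kernel(xi, xj, sigma=1.0):
    return np.exp(-np.sum((xi - xj) ** 2) / (2 * sigma ** 2))

def histogram_intersection_kernel(xi, xj):
    # xi, xj are histograms; sum the bin-wise minima.
    return np.sum(np.minimum(xi, xj))

h1 = np.array([0.2, 0.5, 0.3])
h2 = np.array([0.3, 0.3, 0.4])
print(histogram_intersection_kernel(h1, h2))  # 0.2 + 0.3 + 0.3 = 0.8
```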

Summary: SVMs for image classification. 1. Pick an image representation. 2. Pick a kernel function for that representation. 3. Compute the matrix of kernel values between every pair of training examples. 4. Feed the kernel matrix into your favorite SVM solver to obtain support vectors and weights. 5. At test time: compute kernel values for your test example and each support vector, and combine them with the learned weights to get the value of the decision function. Svetlana Lazebnik
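Steps 3-5 map directly onto scikit-learn's precomputed-kernel interface; here is a sketch with the histogram-intersection kernel and random stand-in data (illustrative only, not the lecture's pipeline):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.random((20, 10))            # 20 "images", 10-bin histograms
y_train = rng.integers(0, 2, 20)
X_test = rng.random((5, 10))

def hist_intersection_matrix(A, B):
    # K[i, j] = sum_k min(A[i, k], B[j, k])
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)

K_train = hist_intersection_matrix(X_train, X_train)   # step 3
clf = SVC(kernel="precomputed").fit(K_train, y_train)  # step 4
K_test = hist_intersection_matrix(X_test, X_train)     # step 5
print(clf.predict(K_test))
```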

What about multi-class SVMs? Unfortunately, there is no definitive multi-class SVM formulation. In practice, we have to obtain a multi-class SVM by combining multiple two-class SVMs. One vs. others: Training: learn an SVM for each class vs. the others. Testing: apply each SVM to the test example, and assign it to the class of the SVM that returns the highest decision value. One vs. one: Training: learn an SVM for each pair of classes. Testing: each learned SVM votes for a class to assign to the test example. Svetlana Lazebnik
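A minimal sketch of the one-vs-others scheme using linear SVMs on toy 2-D blobs (in practice the inputs would be image features):

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
centers = np.array([[0, 0], [5, 0], [0, 5]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])
y = np.repeat([0, 1, 2], 30)

# Training: one binary SVM per class (class c vs. everything else).
svms = [LinearSVC().fit(X, (y == c).astype(int)) for c in range(3)]

# Testing: assign to the class whose SVM returns the highest decision value.
x_test = np.array([[4.5, 0.5]])
scores = [svm.decision_function(x_test)[0] for svm in svms]
print(int(np.argmax(scores)))  # -> 1, the class centered at (5, 0)
```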

Detection

PASCAL Visual Object Challenge: aeroplane, bike, bird, boat, bottle, bus, car, cat, chair, cow, table, dog, horse, motorbike, person, plant, sheep, sofa, train, tv. Deva Ramanan

Detection: scanning-window templates. Dalal and Triggs, CVPR 2005 (HOG); Papageorgiou and Poggio, ICIP 1999 (wavelets). Train on positive and negative windows; w = weights for orientation and spatial bins; classify a window with feature vector x as the object if w·x > 0. Deva Ramanan
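A bare-bones sliding-window scan as a sketch (real detectors score HOG features over an image pyramid and apply non-maximum suppression; the window size, stride, and random inputs here are hypothetical):

```python
import numpy as np

def scan_windows(feature_map, w, step=8, win=64):
    """Slide a win x win template over a 2-D feature map and return the
    top-left corners of windows whose linear score w·x exceeds 0."""
    detections = []
    H, W = feature_map.shape
    for r in range(0, H - win + 1, step):
        for c in range(0, W - win + 1, step):
            x = feature_map[r:r + win, c:c + win].ravel()
            if np.dot(w, x) > 0:           # linear template score
                detections.append((r, c))
    return detections

feature_map = np.random.default_rng(2).random((128, 128))
w = np.random.default_rng(3).normal(size=64 * 64)  # stand-in for learned weights
print(len(scan_windows(feature_map, w)))
```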

Deformable part models. The model encodes local appearance + spring deformations between parts. Deva Ramanan

Homework for Next Time. Paper review for "Features" due at 10pm on 1/14; send to kovashka@cs.pitt.edu