Image classification. Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?



Classifiers. Learn a decision rule assigning bag-of-features representations of images to different classes. [Figure: a decision boundary separating the "zebra" class from the "non-zebra" class]

Classification. Assign an input vector to one of two or more classes. Any decision rule divides the input space into decision regions separated by decision boundaries.

Nearest Neighbor Classifier. Assign the label of the nearest training data point to each test data point. [Figure from Duda et al.: Voronoi partitioning of the feature space for two-category 2D and 3D data] Source: D. Lowe

K-Nearest Neighbors. For a new point, find the k closest points from the training data. The labels of the k points vote to classify. Works well provided there is lots of data and the distance function is good. [Figure: example with k = 5] Source: D. Lowe
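As a concrete illustration (not part of the original slides), here is a minimal k-NN sketch in Python/NumPy; the function name knn_classify, the Euclidean distance, and the toy data are assumptions for illustration only.

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_test, k=5):
    # Euclidean distance from the test point to every training point
    dists = np.linalg.norm(X_train - x_test, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(dists)[:k]
    # The labels of the k nearest neighbors vote; the majority wins
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage: two 2-D classes
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_classify(X_train, y_train, np.array([0.95, 0.9]), k=3))  # -> 1
```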

Functions for comparing histograms (h1 and h2 are N-bin histograms):
L1 distance: D(h1, h2) = Σ_{i=1..N} |h1(i) − h2(i)|
χ² distance: D(h1, h2) = Σ_{i=1..N} (h1(i) − h2(i))² / (h1(i) + h2(i))
Quadratic distance (cross-bin distance): D(h1, h2) = Σ_{i,j} A_{ij} (h1(i) − h2(j))²
Histogram intersection (a similarity function, not a distance): I(h1, h2) = Σ_{i=1..N} min(h1(i), h2(i))
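A minimal NumPy sketch of these four comparison functions (added for illustration; it assumes same-length nonnegative histograms, and the bin-similarity matrix A is an arbitrary placeholder):

```python
import numpy as np

def l1_distance(h1, h2):
    return np.sum(np.abs(h1 - h2))

def chi2_distance(h1, h2, eps=1e-10):
    # eps guards against division by zero in empty bins
    return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def quadratic_distance(h1, h2, A):
    # cross-bin distance: sum over i, j of A[i, j] * (h1[i] - h2[j])**2
    diff = h1[:, None] - h2[None, :]
    return np.sum(A * diff ** 2)

def histogram_intersection(h1, h2):
    # a similarity, not a distance: larger means more similar
    return np.sum(np.minimum(h1, h2))
```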

Linear classifiers. Find a linear function (hyperplane) to separate the positive and negative examples:
positive examples: x_i · w + b ≥ 0
negative examples: x_i · w + b < 0
Which hyperplane is best?

Support vector machines. Find the hyperplane that maximizes the margin between the positive and negative examples. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998

Support vector machines. Find the hyperplane that maximizes the margin between the positive and negative examples:
positive (y_i = 1): x_i · w + b ≥ 1
negative (y_i = −1): x_i · w + b ≤ −1
For support vectors, x_i · w + b = ±1.
Distance between a point and the hyperplane: |x_i · w + b| / ||w||
Therefore, the margin is 2 / ||w||.
[Figure: support vectors lying on the margin]
C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998
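A one-step check of that margin value (added here, not on the original slide): take a point x_+ on the hyperplane x · w + b = +1 and a point x_− on x · w + b = −1, and project their difference onto the unit normal w/||w||.

```latex
\frac{w}{\|w\|}\cdot(x_+ - x_-)
  \;=\; \frac{(x_+\cdot w + b) - (x_-\cdot w + b)}{\|w\|}
  \;=\; \frac{(+1) - (-1)}{\|w\|}
  \;=\; \frac{2}{\|w\|}
```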

Finding the maximum-margin hyperplane.
1. Maximize the margin 2 / ||w||.
2. Correctly classify all training data: positive (y_i = 1): x_i · w + b ≥ 1; negative (y_i = −1): x_i · w + b ≤ −1.
Quadratic optimization problem: minimize (1/2) wᵀw subject to y_i (x_i · w + b) ≥ 1.
C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998
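A minimal sketch of this quadratic program in code, assuming the cvxpy solver and a made-up, linearly separable toy dataset (neither is mentioned on the slides):

```python
import numpy as np
import cvxpy as cp

# Made-up, linearly separable 2-D toy data
X = np.array([[2.0, 2.0], [2.5, 1.5], [0.0, 0.5], [-0.5, 1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = cp.Variable(2)
b = cp.Variable()

# Minimize (1/2) w^T w subject to y_i (x_i . w + b) >= 1 for all i
objective = cp.Minimize(0.5 * cp.sum_squares(w))
constraints = [cp.multiply(y, X @ w + b) >= 1]
cp.Problem(objective, constraints).solve()

print("w =", w.value, "b =", b.value)
```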

Finding the maximum-margin hyperplane. Solution: w = Σ_i α_i y_i x_i, where each α_i is a learned weight and the x_i with nonzero α_i are the support vectors. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998

Finding the maximum-margin hyperplane. Solution: w = Σ_i α_i y_i x_i, and b = y_i − w · x_i for any support vector. Classification function (decision boundary): w · x + b = Σ_i α_i y_i (x_i · x) + b. Notice that it relies on an inner product between the test point x and the support vectors x_i. Solving the optimization problem also involves computing the inner products x_i · x_j between all pairs of training points. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998
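To make the "weighted sum of inner products with support vectors" concrete, here is a small sketch using scikit-learn's SVC (a library choice not made on the slides); the toy data are made up. SVC stores the products α_i y_i in dual_coef_ and the support vectors in support_vectors_:

```python
import numpy as np
from sklearn.svm import SVC

# Made-up, linearly separable toy data
X = np.array([[2.0, 2.0], [2.5, 1.5], [0.0, 0.5], [-0.5, 1.0]])
y = np.array([1, 1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # very large C approximates a hard margin

# w = sum_i alpha_i y_i x_i, recovered from the dual coefficients
w = clf.dual_coef_ @ clf.support_vectors_
b = clf.intercept_

x_test = np.array([1.0, 1.5])
# Decision value computed from inner products with the support vectors
f_manual = clf.dual_coef_ @ (clf.support_vectors_ @ x_test) + b
f_sklearn = clf.decision_function(x_test.reshape(1, -1))
print(f_manual, f_sklearn)  # the two values should match
```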

Nonlinear SVMs. Datasets that are linearly separable work out great. But what if the dataset is just too hard? We can map it to a higher-dimensional space. [Figure: a 1-D dataset that is not linearly separable becomes separable after mapping to a higher-dimensional space] Slide credit: Andrew Moore

Nonlinear SVMs. General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable: Φ: x → φ(x). Slide credit: Andrew Moore
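A minimal sketch of that idea, assuming the illustrative lifting φ(x) = (x, x²) and made-up 1-D data (neither is taken from the slides):

```python
import numpy as np
from sklearn.svm import SVC

# 1-D data: class 1 sits in the middle, class -1 on both sides (not linearly separable in 1-D)
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([-1, -1, 1, 1, 1, -1, -1])

# Lift each point to 2-D with the illustrative mapping phi(x) = (x, x^2)
phi = np.column_stack([x, x ** 2])

clf = SVC(kernel="linear", C=10).fit(phi, y)
print(clf.score(phi, y))  # expect 1.0: the lifted data are linearly separable
```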

Nonlinear SVMs. The kernel trick: instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that K(x_i, x_j) = φ(x_i) · φ(x_j). (To be valid, the kernel function must satisfy Mercer's condition.) This gives a nonlinear decision boundary in the original feature space: Σ_i α_i y_i φ(x_i) · φ(x) + b = Σ_i α_i y_i K(x_i, x) + b. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998

Nonlinear kernel: Example. Consider the mapping φ(x) = (x, x²). Then φ(x) · φ(y) = xy + x²y², so the kernel can be computed directly in the original space as K(x, y) = xy + x²y².
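A quick numeric check of that identity (added for illustration; the input values are arbitrary):

```python
import numpy as np

def phi(x):
    # explicit lifting: x -> (x, x^2)
    return np.array([x, x ** 2])

def K(x, y):
    # kernel computed directly in the original 1-D space
    return x * y + (x ** 2) * (y ** 2)

x, y = 1.5, -2.0
print(np.dot(phi(x), phi(y)), K(x, y))  # both print 6.0
```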

Kernels for bags of features.
Histogram intersection kernel: I(h1, h2) = Σ_{i=1..N} min(h1(i), h2(i))
Generalized Gaussian kernel: K(h1, h2) = exp(−(1/A) D(h1, h2)²), where D can be the L1 distance, Euclidean distance, χ² distance, etc.
J. Zhang, M. Marszalek, S. Lazebnik, and C. Schmid, "Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study," IJCV 2007
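A minimal sketch of these two kernels over sets of histograms, assuming NumPy arrays with one histogram per row; the choice of χ² as the distance D and the function names are illustrative assumptions:

```python
import numpy as np

def intersection_kernel(H1, H2):
    # H1: (n1, N), H2: (n2, N) histograms -> (n1, n2) matrix of histogram intersections
    return np.array([[np.minimum(h1, h2).sum() for h2 in H2] for h1 in H1])

def chi2_distance(h1, h2, eps=1e-10):
    # eps guards against empty bins
    return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def generalized_gaussian_kernel(H1, H2, A=1.0):
    # K(h1, h2) = exp(-(1/A) * D(h1, h2)^2), with D taken here as the chi-squared distance
    return np.array([[np.exp(-chi2_distance(h1, h2) ** 2 / A) for h2 in H2] for h1 in H1])
```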

Summary: SVMs for image classification
1. Pick an image representation (in our case, bag of features)
2. Pick a kernel function for that representation
3. Compute the matrix of kernel values between every pair of training examples
4. Feed the kernel matrix into your favorite SVM solver to obtain support vectors and weights
5. At test time: compute kernel values for your test example and each support vector, and combine them with the learned weights to get the value of the decision function
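A minimal end-to-end sketch of steps 3-5, assuming scikit-learn's SVC with a precomputed kernel and made-up bag-of-features histograms (the slides do not prescribe a particular solver):

```python
import numpy as np
from sklearn.svm import SVC

# Made-up bag-of-features histograms: one image per row, one visual word per column
rng = np.random.default_rng(0)
train_hists = rng.random((20, 50))
train_labels = np.array([0] * 10 + [1] * 10)
test_hists = rng.random((5, 50))

def intersection_kernel(H1, H2):
    # histogram intersection kernel matrix (same sketch as above)
    return np.array([[np.minimum(h1, h2).sum() for h2 in H2] for h1 in H1])

# Steps 3-4: kernel values between every pair of training examples, fed to the SVM solver
K_train = intersection_kernel(train_hists, train_hists)
clf = SVC(kernel="precomputed").fit(K_train, train_labels)

# Step 5: kernel values between each test example and the training examples
K_test = intersection_kernel(test_hists, train_hists)
print(clf.predict(K_test))
```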

What about multi-class SVMs? Unfortunately, there is no definitive multi-class SVM formulation. In practice, we have to obtain a multi-class SVM by combining multiple two-class SVMs.
One vs. others:
Training: learn an SVM for each class vs. the others
Testing: apply each SVM to the test example and assign to it the class of the SVM that returns the highest decision value
One vs. one:
Training: learn an SVM for each pair of classes
Testing: each learned SVM votes for a class to assign to the test example
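Both strategies are available off the shelf; here is a minimal sketch using scikit-learn's wrappers (the library choice and toy data are assumptions, not from the slides):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier

# Made-up 3-class toy data for illustration
rng = np.random.default_rng(0)
X = rng.random((30, 5))
y = np.repeat([0, 1, 2], 10)  # ten examples per class

# One vs. others: one binary SVM per class; the highest decision value wins
ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)

# One vs. one: one binary SVM per pair of classes; the per-pair winners vote
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)

print(ovr.predict(X[:3]), ovo.predict(X[:3]))
```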

SVMs: Pros and cons
Pros:
Many publicly available SVM packages: http://www.kernel-machines.org/software
The kernel-based framework is very powerful and flexible
SVMs work very well in practice, even with very small training sample sizes
Cons:
No direct multi-class SVM; must combine two-class SVMs
Computation and memory: during training, must compute the matrix of kernel values for every pair of examples; learning can take a very long time for large-scale problems

Summary: Classifiers
Nearest-neighbor and k-nearest-neighbor classifiers: L1 distance, χ² distance, quadratic distance, histogram intersection
Support vector machines: linear classifiers, margin maximization, the kernel trick, kernel functions (histogram intersection, generalized Gaussian, pyramid match), multi-class
Of course, there are many other classifiers out there: neural networks, boosting, decision trees, ...