EE 6882 Visual Search Engine

Prof. Shih-Fu Chang, Feb. 13, 2012. Lecture #4: Local Feature Matching; Bag-of-Words image representation: coding and pooling. (Many slides from A. Efros, W. Freeman, C. Kambhamettu, L. Xie, and likely others. Slide preparation assisted by Rongrong Ji.)

Corner Detection

Types of local image windows:
- Flat: little or no brightness change
- Edge: strong brightness change in a single direction
- Flow: parallel stripes
- Corner/spot: strong brightness changes in orthogonal directions

Basic idea: find points where two edges meet, i.e., examine the gradient behavior over a small window. (Slide of A. Efros)

Harris Detector: Mathematics

Change of intensity for the shift [u, v]:

    E(u, v) = \sum_{x,y} w(x, y) \, [I(x + u, y + v) - I(x, y)]^2

where w(x, y) is the window function (1 inside the window and 0 outside, or a Gaussian), I(x + u, y + v) is the shifted intensity, and I(x, y) the original intensity.

Taylor expansion: for small shifts [u, v] we have the bilinear approximation

    E(u, v) \approx [u \; v] \, M \, [u \; v]^T

where M is a 2x2 matrix computed from image derivatives:

    M = \sum_{x,y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}

Harris Detector: Mathematics

Intensity change under a shifting window: eigenvalue analysis. With

    E(u, v) \approx [u \; v] \, M \, [u \; v]^T,

the level set E(u, v) = const is an ellipse whose axes have lengths \lambda_1^{-1/2} and \lambda_2^{-1/2}, where \lambda_1 \ge \lambda_2 are the eigenvalues of M. If we try every possible shift, the direction of fastest change corresponds to \lambda_1. (Slide of A. Efros)

Harris Detector: Mathematics

Measure of corner response:

    R = \det M - k \, (\operatorname{trace} M)^2

or

    R = \det M / \operatorname{trace} M

where \det M = \lambda_1 \lambda_2, \operatorname{trace} M = \lambda_1 + \lambda_2, and k is an empirical constant (k = 0.04-0.06).

Harris Detector

The algorithm:
- Find points with a large corner response function R (R > threshold)
- Take the points of local maxima of R

Models of Image Change

Geometry:
- Rotation
- Similarity (rotation + uniform scale)
- Affine (scale dependent on direction); valid for an orthographic camera and a locally planar object

Photometry:
- Affine intensity change (I -> a I + b)

(Slide of C. Kambhamettu)
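The response map and the thresholding step above can be sketched in a few lines. This is a minimal illustration, not the course's reference implementation: it assumes a grayscale float image, uses central-difference gradients, and approximates the window function w(x, y) with a 3x3 box filter instead of a Gaussian.

```python
import numpy as np

def harris_response(img, k=0.05):
    """Harris corner response R = det(M) - k * trace(M)^2 at every pixel.

    Minimal sketch: gradients via central differences; the window
    function w(x, y) is a 3x3 box sum rather than a Gaussian.
    """
    Iy, Ix = np.gradient(img.astype(float))
    # Products of derivatives, to be summed over the local window.
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy

    def box_sum(a):
        # Sum over a 3x3 neighbourhood (wraps at borders via np.roll).
        out = np.zeros_like(a)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
        return out

    Sxx, Syy, Sxy = box_sum(Ixx), box_sum(Iyy), box_sum(Ixy)
    det = Sxx * Syy - Sxy ** 2          # det(M) = lambda1 * lambda2
    trace = Sxx + Syy                   # trace(M) = lambda1 + lambda2
    return det - k * trace ** 2
```

Thresholding R and keeping local maxima then yields the detected corners. On a synthetic bright square, R is near zero on flat regions, negative along edges (det M ≈ 0), and positive at the corners.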

Harris Detector: Some Properties

But: non-invariant to image scale! A pattern detected as a corner at one scale may have all its points classified as edges at a finer scale. (Slide of C. Kambhamettu)

Scale Invariant Detection

Consider regions (e.g., circles) of different sizes around a point. Regions of corresponding sizes (at different scales, from fine/low to coarse/high) will look the same in both images. (Slide of C. Kambhamettu)

Scale Invariant Detection

The problem: how do we choose corresponding circles independently in each image? (Slide of C. Kambhamettu)

Scale-Space Pyramid

Scale Space: Difference of Gaussians

    G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)}

Scale Invariant Detection

Functions for determining scale: convolve the image with a kernel, f = Kernel * Image, where the kernel is either

    DoG = G(x, y, k\sigma) - G(x, y, \sigma)    (Difference of Gaussians)

or

    L = \sigma^2 \, (G_{xx}(x, y, \sigma) + G_{yy}(x, y, \sigma))    (Laplacian)

with G(x, y, \sigma) the Gaussian above. Note: both kernels are invariant to scale and rotation. (Slide of C. Kambhamettu)

Gaussian Kernel, DoG

[Figure: Gaussian kernels with sigma = 2 and sigma = 4, and their difference (sigma 2 minus sigma 4).]

Difference of Gaussians (DoG)
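The two kernels pictured above can be sampled directly from the formula for G(x, y, sigma). A minimal sketch, assuming sampling on an integer grid and a 4-sigma truncation radius (the radius is an arbitrary practical choice, not part of the definition):

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Sampled 2-D Gaussian G(x, y, sigma) = exp(-(x^2+y^2)/(2 sigma^2)) / (2 pi sigma^2)."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    return np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)

def dog_kernel(sigma, k=2.0):
    """Difference of Gaussians: DoG = G(x, y, k*sigma) - G(x, y, sigma)."""
    radius = int(4 * k * sigma)          # 4-sigma truncation of the wider Gaussian
    return gaussian_kernel(k * sigma, radius) - gaussian_kernel(sigma, radius)
```

Because each Gaussian sums to (approximately) 1, the DoG kernel has near-zero DC response: it ignores flat regions and responds to blob-like structure at the scale set by sigma.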

[Figure: DoG pyramid construction - repeatedly blur, subtract, and resample.]

Key Point Localization

Detect maxima and minima of the difference-of-Gaussians in scale space.

Scale Invariant Interest Point Detectors

Harris-Laplacian [1]: find local maxima of the Harris corner detector in space (image coordinates) and of the Laplacian in scale.

SIFT (Lowe) [2]: find local maxima of the Difference of Gaussians in both space and scale.

[1] K. Mikolajczyk and C. Schmid. Indexing Based on Scale Invariant Interest Points. ICCV 2001.
[2] D. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. IJCV 2004.

(Slide of C. Kambhamettu)
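The scale-space extrema test can be stated concretely: a candidate keypoint must beat all 26 neighbours in a 3x3x3 (scale, y, x) neighbourhood of the DoG stack. A brute-force sketch (a real implementation would also discard low-contrast and edge-like responses afterwards, as on the next slides):

```python
import numpy as np

def dog_extrema(dog):
    """Candidate SIFT keypoints: strict maxima/minima of a DoG stack
    with shape (scale, y, x) over their 26 neighbours."""
    S, H, W = dog.shape
    found = []
    for s in range(1, S - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                v = dog[s, y, x]
                # Strict extremum: the value is unique in its 3x3x3 cube.
                if v == cube.max() and (cube == v).sum() == 1:
                    found.append((s, y, x, "max"))
                elif v == cube.min() and (cube == v).sum() == 1:
                    found.append((s, y, x, "min"))
    return found
```

A bright blob produces a DoG maximum and a dark blob a minimum, so both polarities are kept.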

Scale Invariant Detectors

Experimental evaluation of detectors with respect to scale change. Repeatability rate:

    # correct correspondences / average # detected points

K. Mikolajczyk and C. Schmid. Indexing Based on Scale Invariant Interest Points. ICCV 2001.

SIFT keypoints

[Figure: SIFT keypoints after extrema detection, and after filtering by curvature to remove edge responses.]

Keypoint Orientation and Scale

SIFT Invariant Descriptors

Extract image patches relative to the local orientation (the dominant direction of the gradient).

Local Appearance Descriptor (SIFT)

- Compute the gradient in a local patch
- Histogram of oriented gradients over local grids, e.g., 4x4 grids and 8 directions -> 4 x 4 x 8 = 128 dimensions
- Scale invariant

[Lowe, ICCV 1999]

Point Descriptors

We know how to detect points. Next question: how to match them? A point descriptor should be:
- Invariant
- Distinctive
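The 4 x 4 x 8 = 128 layout above can be sketched as a stripped-down, SIFT-style descriptor. This is an illustrative simplification, not Lowe's full method: it omits the Gaussian weighting, trilinear interpolation, and orientation normalization that real SIFT adds, and the helper name is ours.

```python
import numpy as np

def grid_orientation_descriptor(patch, grids=4, bins=8):
    """Sketch of a SIFT-style descriptor: an 8-bin histogram of gradient
    orientations (weighted by magnitude) in each cell of a 4x4 grid
    -> 4*4*8 = 128 dimensions, L2-normalized."""
    Iy, Ix = np.gradient(patch.astype(float))
    mag = np.hypot(Ix, Iy)
    ori = np.mod(np.arctan2(Iy, Ix), 2 * np.pi)    # orientations in [0, 2*pi)
    cell = patch.shape[0] // grids
    hists = []
    for gy in range(grids):
        for gx in range(grids):
            ys = slice(gy * cell, (gy + 1) * cell)
            xs = slice(gx * cell, (gx + 1) * cell)
            # Quantize orientation into `bins` bins, weight by magnitude.
            idx = np.minimum((ori[ys, xs] / (2 * np.pi) * bins).astype(int),
                             bins - 1)
            hists.append(np.bincount(idx.ravel(),
                                     weights=mag[ys, xs].ravel(),
                                     minlength=bins))
    desc = np.concatenate(hists)
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

On a 16x16 patch each cell is 4x4 pixels, and a pure horizontal intensity ramp puts all gradient energy into orientation bin 0 of every cell.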

Feature matching? (Slide of A. Efros)

Feature-Space Outlier Rejection [Lowe, 1999]

- 1-NN: SSD of the closest match
- 2-NN: SSD of the second-closest match

Look at how much better the best match (1-NN) is than the second-best match (2-NN), e.g., via the ratio 1-NN / 2-NN. (Slide of A. Efros)
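The 1-NN/2-NN test above can be sketched with brute-force distances (Euclidean rather than SSD, which gives the same ranking; the 0.8 ratio is Lowe's commonly used value, not prescribed by the slide):

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """Lowe-style feature-space outlier rejection: keep a match only if
    the 1-NN distance is clearly smaller than the 2-NN distance."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)    # brute-force distances
        order = np.argsort(dists)
        if dists[order[0]] < ratio * dists[order[1]]:
            matches.append((i, int(order[0])))       # (query index, match index)
    return matches
```

A distinctive feature passes; a feature that is nearly equidistant to two vocabulary-like candidates is rejected as ambiguous.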

Feature-Space Outlier Rejection

Can we now compute the homography H from the blue points? No! There are still too many outliers. What can we do? (Slide of A. Efros)

RANSAC for Estimating a Homography

RANSAC loop:
1. Select four feature pairs (at random)
2. Compute the homography H (exact)
3. Compute inliers: pairs where SSD(p_i', H p_i) < eps
4. Keep the largest set of inliers
5. Re-compute the least-squares estimate of H on all of the inliers

(Slide of A. Efros)

Least Squares Fit

Find the average translation vector. (Slide of A. Efros)

RANSAC (Slide of A. Efros)
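The RANSAC loop above can be sketched for the simplest motion model, a pure translation, where a single correspondence determines the model and the least-squares re-fit is just the mean displacement. This is a stand-in for the homography case, which would need four pairs and a DLT solve instead:

```python
import numpy as np

def ransac_translation(p1, p2, n_iters=200, eps=1.0, seed=0):
    """RANSAC for a pure-translation model: a sampled pair proposes
    t = p2[i] - p1[i]; inliers are counted; the best consensus set is
    re-fit by least squares (for a translation: the mean displacement)."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(p1), dtype=bool)
    for _ in range(n_iters):
        i = rng.integers(len(p1))                     # minimal sample: 1 pair
        t = p2[i] - p1[i]
        inliers = np.linalg.norm(p1 + t - p2, axis=1) < eps
        if inliers.sum() > best.sum():
            best = inliers
    t = (p2[best] - p1[best]).mean(axis=0)            # least-squares re-fit
    return t, best
```

Even with a third of the correspondences grossly wrong, the consensus set recovers the true translation exactly.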

From Local Features to Visual Words

Clustering in the 128-D feature space yields a visual-word vocabulary.

K-Means Clustering

Training data x_1, x_2, ..., x_N with no labels: unsupervised learning. K-means clustering:
- Fix the value of K
- Initialize the representative (center) of each cluster
- Map each sample to its closest cluster: for i = 1, 2, ..., N, assign x_i to C_k if Dist(x_i, C_k) <= Dist(x_i, C_k') for all k' != k
- Re-compute the centers, and repeat

Can be used to initialize other clustering methods.
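The loop above translates almost line for line into code; a minimal sketch, assuming centers are initialized from random data points and a fixed iteration budget in place of a convergence test:

```python
import numpy as np

def kmeans(X, K, n_iters=20, seed=0):
    """K-means as on the slide: initialize centers from the data, map
    each sample to its closest center, recompute the centers, repeat."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    for _ in range(n_iters):
        # Distance of every sample to every center, then nearest-center labels.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):                   # skip empty clusters
                centers[k] = X[labels == k].mean(axis=0)
    return centers, labels
```

On two well-separated blobs the centers converge to the blob means and each blob gets its own label.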

Visual Words: Image Patch Patterns

Corners, blobs, eyes, letters, ... (Sivic and Zisserman, Video Google, 2006)

Represent Image as Bag of Words

keypoint features -> visual words (clustering) -> BoW histogram
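The keypoint features -> visual words -> histogram pipeline reduces to one quantization step once the vocabulary exists. A minimal sketch with hard assignment (soft coding, covered later, would spread each descriptor over several words):

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Bag-of-words: hard-assign each local descriptor to its nearest
    visual word, then accumulate an L1-normalized K-bin histogram."""
    dists = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :],
                           axis=2)
    words = dists.argmin(axis=1)                      # nearest visual word
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()
```

The normalized histogram is the fixed-length image representation fed to the classifier, regardless of how many keypoints the image produced.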

Pooling Binary Features

Y.-L. Boureau, Jean Ponce, Yann LeCun. A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML 2010.

Consider a P x K matrix (P: number of features, K: number of codewords). To begin with a simple model, assume the v_i are i.i.d.

Distribution Separability

Better separability is achieved by:
1. increasing the distance between the means of the two class-conditional distributions
2. reducing their standard deviations.

Distribution Separability

Average pooling vs. max pooling: compare the class separability of the resulting pooled-feature distributions.
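For P i.i.d. binary features with activation probability alpha, the pooled statistics that the analysis compares are E[average pool] = alpha and E[max pool] = 1 - (1 - alpha)^P. A minimal sketch of the two operators and the max-pool expectation (the closed form is standard for i.i.d. Bernoulli variables, not quoted from the slide):

```python
import numpy as np

def average_pool(V):
    """Average pooling over the P rows (features) of a P x K matrix."""
    return V.mean(axis=0)

def max_pool(V):
    """Max pooling over the P rows of a P x K matrix."""
    return V.max(axis=0)

def expected_max_binary(alpha, P):
    """E[max pool] for P i.i.d. binary features with activation
    probability alpha: 1 - (1 - alpha)^P. (E[average pool] = alpha.)"""
    return 1.0 - (1.0 - alpha) ** P
```

Note how quickly the max-pool mean saturates toward 1 as P grows, which is exactly why the separability analysis treats the two pooling schemes differently.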

For binary features the analysis above applies directly. For continuous features the modeling is more complex and the conclusions are slightly different.

Soft Coding

- Assign a feature to multiple visual words
- Weights are determined by feature-to-word similarity

Details in: Jiang, Ngo and Yang, ACM CIVR 2007. Image source: http://www.cs.joensuu.fi/pages/franti/vq/lkm15.gif

Multi-BoW: Spatial Pyramid Kernel

S. Lazebnik, et al., CVPR 2006.

Classifiers

- K Nearest Neighbors + voting
- Linear discriminative model (SVM)

Machine Learning: Build Classifier

Find the separating hyperplane w that maximizes the margin (e.g., airplane vs. non-airplane):

    w^T x + b = 0

Decision function: f(x) = sign(w^T x + b), with
    w^T x_i + b > 0 if label y_i = +1
    w^T x_i + b < 0 if label y_i = -1

Support Vector Machine (tutorial by Burges, 1998)

Look for the separating plane with the highest margin. Decision boundary H_0: w^T x + b = 0. In the linearly separable case,
    w^T x_i + b > +1 if label y_i = +1
    w^T x_i + b < -1 if label y_i = -1
i.e., y_i (w^T x_i + b) > 1 for all x_i.

Two parallel hyperplanes define the margin:
    H_1 (H_+): w^T x_i + b = +1
    H_2 (H_-): w^T x_i + b = -1

Margin: the sum of distances of the closest points to the separating plane, margin = 2 / ||w||. The best plane is defined by w and b.
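The decision function and the margin formula above are two one-liners; a minimal sketch for a given (w, b), leaving training aside:

```python
import numpy as np

def svm_decision(w, b, X):
    """Linear SVM decision function f(x) = sign(w^T x + b), row-wise over X."""
    return np.sign(X @ w + b)

def svm_margin(w):
    """Width of the band between H_1: w^T x + b = +1 and H_2: w^T x + b = -1."""
    return 2.0 / np.linalg.norm(w)
```

For w = (1, 0), b = 0 the boundary is the vertical axis, points with positive first coordinate get label +1, and the margin between H_1 and H_2 is 2/||w|| = 2.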

Max Margin Solution (Separable Case)

At the optimum, the weight sum from the positive class equals the weight sum from the negative class, \sum_{y_i = +1} a_i = \sum_{y_i = -1} a_i, and the direction of w points roughly from the negative support vectors to the positive ones:

    w = \sum_i a_i y_i x_i

If a_i > 0, then x_i lies on H_+ or H_- and is a support vector. How to compute w and b? How to classify new data?

Non-Separable Case

Add slack variables \xi_i >= 0; if \xi_i > 1, then x_i is misclassified (i.e., a training error). Minimize the new objective function via Lagrange multipliers, with additional multipliers ensuring the positivity of the \xi_i.

All the points located in the margin gap or on the wrong side get a_i = C. What happens if C increases? The bound 0 <= a_i <= C loosens, so samples with errors get more weight: better training accuracy, but a smaller margin and hence less generalization performance.

Generalized Linear Discriminant Functions

Include more than just the linear terms:

    g(x) = w_0 + \sum_{i=1}^d w_i x_i + \sum_{i=1}^d \sum_{j=1}^d w_{ij} x_i x_j = w_0 + w^T x + x^T W x

In general, g(x) = \sum_i a_i y_i(x) = a^T y(x). The shape of the decision boundary can be an ellipsoid, a hyperhyperboloid, a pair of lines, etc.

Examples:

    g(x) = a_1 + a_2 x + a_3 x^2 = [a_1 \; a_2 \; a_3] \, [1 \; x \; x^2]^T

    g(x) = a_1 x_1 + a_2 x_2 + a_3 x_1 x_2 = [a_1 \; a_2 \; a_3] \, [x_1 \; x_2 \; x_1 x_2]^T

Data become separable in a higher-dimensional space, but learning parameters in high dimension is hard (curse of dimensionality); instead, try to maximize margins: SVM. (Figure from Duda, Hart, and Stork)

Non-Linear Space

Map to a high-dimensional (embedding) space via \Phi to make the data separable, then find the SVM there:

    g(x) = \sum_{i=1}^{N_s} a_i y_i \, \Phi(s_i) \cdot \Phi(x) + b

Luckily, we don't have to compute \Phi explicitly. We can use the same method to maximize the dual

    L_D = \sum_{i=1}^{l} a_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} a_i a_j y_i y_j \, \Phi(x_i) \cdot \Phi(x_j)
        = \sum_{i=1}^{l} a_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} a_i a_j y_i y_j \, K(x_i, x_j)

so neither \Phi(s_i) nor \sum_i a_i y_i \Phi(s_i) is ever needed on its own. Instead, we define the kernel

    K(s, x) = \Phi(s) \cdot \Phi(x)
    => g(x) = \sum_{i=1}^{N_s} a_i y_i K(s_i, x) + b

Some popular kernels: polynomial, Gaussian Radial Basis Function (RBF), sigmoidal neural network.

[Figure: a linearly separable case vs. a non-separable case handled by a cubic polynomial kernel.]
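The kernelized decision function above can be sketched directly: the feature map Phi never appears, only kernel evaluations against the support vectors. A minimal sketch with the Gaussian RBF kernel (the alphas, labels, and support vectors are assumed given by training, which is omitted here):

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((np.asarray(a) - np.asarray(b)) ** 2))

def kernel_decision(x, support_vecs, alphas, labels, b=0.0, kernel=rbf_kernel):
    """g(x) = sum_i a_i y_i K(s_i, x) + b -- the kernel trick: only
    kernel evaluations against the support vectors are ever computed."""
    return sum(a * y * kernel(s, x)
               for a, y, s in zip(alphas, labels, support_vecs)) + b
```

With one positive and one negative support vector, a query near the positive one gets g(x) > 0 and a query near the negative one gets g(x) < 0, without ever constructing the embedding space.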

Support Vectors

The SVM classifier is completely determined by the training samples that lie on the margin hyperplanes or within the margin, i.e., those with y_i (w^T x_i + b) <= 1:

    w^* = \sum_{i=1}^{l} a_i y_i x_i,   with 0 <= a_i <= C

Reading List

- Lazebnik, S., C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In IEEE CVPR, 2006.
- Jiang, Y., C. Ngo, and J. Yang. Towards optimal bag-of-features for object categorization and semantic video retrieval. In ACM CIVR, 2007.
- Chang, S., et al. Columbia University/VIREO-CityU/IRIT TRECVID2008 high-level feature extraction and interactive video search. In NIST TRECVID Workshop, 2008.
- Jiang, Y., et al. Columbia-UCF TRECVID2010 Multimedia Event Detection: Combining Multiple Modalities, Contextual Concepts, and Temporal Matching. In TRECVID Workshop, 2010.
- Duda, R. O., P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed. Wiley, 2000. ISBN 0-471-05669-3.
- Viola, P. and M. Jones. Rapid Object Detection using a Boosted Cascade of Simple Features. In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition, 2001.
- Yan, R., J. Yang, and A. G. Hauptmann. Learning Query Class Dependent Weights in Automatic Video Retrieval. In ACM Multimedia, 2004, New York.