Scale & Affine Invariant Interest Point Detectors


Scale & Affine Invariant Interest Point Detectors Krystian Mikolajczyk and Cordelia Schmid Presented by Hunter Brown & Gaurav Pandey, February 19, 2009

Roadmap: Motivation, Scale Invariant Detector, Affine Invariant Detector, Applications, Conclusion.

Problem: it is easy to find good interest points, but hard to find good interest points that are invariant to changing viewing conditions.

Related Work:
- Kadir & Brady [1], 2001: scale selection (saliency)
- Lowe [2], 1999: SIFT
- Lindeberg [3], 1998: scale-invariant detectors (LoG)
- Lindeberg & Garding [4], 1997: affine blob features
- Harris & Stephens [5], 1988: Harris corner detector


Scale Invariant Detector. Idea: scale-adapted Harris corner detector + automatic scale selection.

Harris Corner Detector Derivation. Let the autocorrelation function be:

E(Δx, Δy) = Σ_(x,y)∈W w(x,y) [I(x+Δx, y+Δy) − I(x,y)]²   (1)

Now approximate the shifted image with a first-order Taylor series:

I(x+Δx, y+Δy) ≈ I(x,y) + I_x(x,y) Δx + I_y(x,y) Δy   (2)

where I_x and I_y are the partial derivatives in x and y. Then substitute (2) back into (1):

E(Δx, Δy) ≈ Σ_(x,y)∈W w(x,y) [I_x Δx + I_y Δy]²   (3)

Harris Corner Derivation (cont). Now we have:

E(Δx, Δy) ≈ [Δx Δy] M [Δx Δy]ᵀ,  where M = Σ_(x,y)∈W w(x,y) [I_x² I_xI_y; I_xI_y I_y²]   (4)

And finally, the scale-adapted second moment matrix:

μ(x, σ_I, σ_D) = σ_D² g(σ_I) ∗ [L_x²(x, σ_D) L_xL_y(x, σ_D); L_xL_y(x, σ_D) L_y²(x, σ_D)]   (5)

Here σ_D² is the scale (normalization) factor, and the squared derivatives are smoothed with a weighted Gaussian kernel of size σ_I.
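The scale-adapted Harris response can be sketched numerically. The following is a minimal numpy sketch (function names, kernel truncation, and the k = 0.04 constant are my own choices, not from the slides): smooth at the differentiation scale σ_D, form the second moment matrix entries, integrate them at scale σ_I, and compute det(M) − k·tr(M)².

```python
import numpy as np

def harris_response(img, sigma_d=1.0, sigma_i=1.4, k=0.04):
    """Scale-adapted Harris response: differentiate at sigma_d, then
    integrate the gradient products with a Gaussian of size sigma_i."""
    def gauss1d(sigma):
        r = max(1, int(3 * sigma))
        x = np.arange(-r, r + 1)
        g = np.exp(-x**2 / (2 * sigma**2))
        return g / g.sum()

    def smooth(a, sigma):
        g = gauss1d(sigma)
        a = np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 0, a)
        return np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 1, a)

    L = smooth(img.astype(float), sigma_d)
    Ly, Lx = np.gradient(L)
    # Entries of the second moment matrix, weighted by the integration
    # Gaussian and the sigma_d^2 scale factor from equation (5).
    Sxx = smooth(Lx * Lx, sigma_i) * sigma_d**2
    Syy = smooth(Ly * Ly, sigma_i) * sigma_d**2
    Sxy = smooth(Lx * Ly, sigma_i) * sigma_d**2
    det = Sxx * Syy - Sxy**2
    trace = Sxx + Syy
    return det - k * trace**2
```

On a synthetic bright square, the response peaks at the square's corners, is negative along its edges, and is zero in flat regions, which is the behaviour the derivation predicts.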

Automatic Scale Selection. LoG: Laplacian-of-Gaussian. Smooth via a Gaussian kernel, then apply a second-order differential operator (the Laplacian). Sneak peek: this computes only one characteristic scale per pixel. (Images courtesy of Image Metrology A/S, Denmark, and Don Matthys, S.J., Marquette University.)
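A scale-normalized LoG can be sketched in a few lines; this is my own simplified numpy version (names and the 5-point Laplacian stencil are my choices). The σ² factor makes responses comparable across scales, so the characteristic scale is the σ at which the magnitude of the response peaks.

```python
import numpy as np

def scale_normalized_log(img, sigma):
    """Gaussian smoothing followed by a discrete Laplacian, multiplied
    by sigma^2 so responses can be compared across scales."""
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    L = np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 0,
                            img.astype(float))
    L = np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 1, L)
    # 5-point discrete Laplacian
    lap = (np.roll(L, 1, 0) + np.roll(L, -1, 0) +
           np.roll(L, 1, 1) + np.roll(L, -1, 1) - 4.0 * L)
    return sigma**2 * lap
```

At the centre of a bright disk of radius r the response is negative, and its magnitude peaks near σ ≈ r/√2, which is exactly the "one scale per pixel" selected by this operator.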

Scale Invariant Algorithm:
- Build an image pyramid for pre-selected scales σ_n = 1.4ⁿ σ_0.
- For each level, compute Harris corners.
- For every detected point, find the extrema of the LoG over scale.
- Keep points for which the LoG is a local maximum.
(Pyramid figure courtesy of Berend Engelbrecht, TheCodeProject.com.)
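The full Harris-Laplace loop can be sketched as below. This is my own compact numpy sketch, not the authors' implementation: the scale step, the 0.1·max Harris threshold, and the 3×3 spatial maximum test are illustrative choices.

```python
import numpy as np

def _smooth(a, sigma):
    # Separable Gaussian smoothing, kernel truncated at 3 sigma.
    r = max(1, int(3 * sigma))
    g = np.exp(-np.arange(-r, r + 1)**2 / (2 * sigma**2))
    g /= g.sum()
    a = np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 0, a)
    return np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 1, a)

def _harris(img, sd, si, k=0.04):
    # Scale-adapted Harris response (derivative scale sd, integration scale si).
    Ly, Lx = np.gradient(_smooth(img, sd))
    Sxx, Syy, Sxy = _smooth(Lx * Lx, si), _smooth(Ly * Ly, si), _smooth(Lx * Ly, si)
    return Sxx * Syy - Sxy**2 - k * (Sxx + Syy)**2

def _log_abs(img, sigma):
    # Magnitude of the scale-normalized Laplacian-of-Gaussian.
    L = _smooth(img, sigma)
    lap = (np.roll(L, 1, 0) + np.roll(L, -1, 0) +
           np.roll(L, 1, 1) + np.roll(L, -1, 1) - 4.0 * L)
    return np.abs(sigma**2 * lap)

def harris_laplace(img, sigma0=1.0, step=1.4, nlevels=5):
    """Harris corners per pre-selected scale, kept only where the
    scale-normalized LoG magnitude is a local maximum over scale."""
    img = img.astype(float)
    scales = [sigma0 * step**n for n in range(nlevels)]
    logs = [_log_abs(img, s) for s in scales]
    points = []
    for i in range(1, nlevels - 1):          # interior scales only
        s = scales[i]
        R = _harris(img, s, 1.4 * s)
        thresh = 0.1 * R.max()
        for y in range(1, img.shape[0] - 1):
            for x in range(1, img.shape[1] - 1):
                if (R[y, x] >= thresh
                        and R[y, x] == R[y - 1:y + 2, x - 1:x + 2].max()
                        and logs[i][y, x] > logs[i - 1][y, x]
                        and logs[i][y, x] > logs[i + 1][y, x]):
                    points.append((y, x, s))
    return points
```

On a small bright disk, the detector fires near the disk centre at a scale close to the disk's characteristic scale, while the same spatial maxima at other pyramid levels are rejected by the LoG test.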


Affine Transformation: Characteristic Shape. Rotation + scale in x + scale in y. Cool math alert: the second moment matrix can be used to find the affine deformation of an isotropic (direction-invariant) structure.

(Figures courtesy of Silvio Savarese, EECS442 Lecture 17.)

Affine Transformation of the Second Moment Matrix: Goal.

Second Moment Matrix. Recall the gradient ∇L = [L_x L_y]ᵀ. Define the affine second moment matrix

μ(x, Σ_I, Σ_D) = det(Σ_D) g(Σ_I) ∗ ( (∇L)(x, Σ_D) (∇L)(x, Σ_D)ᵀ )

where Σ_I and Σ_D are the integration and differentiation covariance matrices.

Adjoint Transformation. P = DᵀSD is an adjoint transformation [6]. Given x_R = A x_L, the affine second moment matrix transforms as

μ(x_L, Σ_I,L, Σ_D,L) = Aᵀ μ(A x_L, A Σ_I,L Aᵀ, A Σ_D,L Aᵀ) A = Aᵀ μ(x_R, Σ_I,R, Σ_D,R) A

Write M_L = μ(x_L, Σ_I,L, Σ_D,L) and M_R = μ(x_R, Σ_I,R, Σ_D,R). This is the scale-adapted second moment matrix!

Adjoint Transformation (cont). Then

M_L = Aᵀ M_R A,  M_R = A⁻ᵀ M_L A⁻¹,  Σ_R = A Σ_L Aᵀ   (6)

Suppose that M_L can be computed such that

Σ_I,L = σ_I M_L⁻¹,  Σ_D,L = σ_D M_L⁻¹   (7)

Adjoint Transformation (cont). Then we can derive the following:

Σ_I,R = σ_I M_R⁻¹,  Σ_D,R = σ_D M_R⁻¹   (8)

If we estimate Σ_R and Σ_L such that (7) and (8) are true, then (6) must be true.

Affine Transformation. Finally, define:

A = M_R^(−1/2) R M_L^(1/2)

where R is orthogonal and represents an arbitrary rotation or mirror transformation, and the matrix square roots follow from the eigendecomposition M = R_M⁻¹ [λ₁ 0; 0 λ₂] R_M.

Normalized Isotropy Measure:

Q = λ_min(μ) / λ_max(μ),  0 ≤ Q ≤ 1

where λ_min and λ_max are the eigenvalues of the second moment matrix; Q = 1 for a perfectly isotropic structure.
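The isotropy measure and the normalization by M^(−1/2) can be illustrated with a small numpy sketch (function names and the example matrix are mine). Transforming the second moment matrix as μ' = Aᵀ μ A with A = M^(−1/2) (taking R = identity) drives Q to 1:

```python
import numpy as np

def isotropy(M):
    """Normalized isotropy measure Q = lambda_min / lambda_max of a 2x2
    symmetric second moment matrix; Q = 1 for an isotropic structure."""
    lam = np.linalg.eigvalsh(M)      # eigenvalues in ascending order
    return lam[0] / lam[1]

def inv_sqrt(M):
    """M^(-1/2) via the eigendecomposition M = V diag(l1, l2) V^T."""
    lam, V = np.linalg.eigh(M)
    return V @ np.diag(lam ** -0.5) @ V.T

# A hypothetical anisotropic second moment matrix:
M = np.array([[4.0, 1.0],
              [1.0, 2.0]])
A = inv_sqrt(M)            # normalizing transformation (R = identity)
M_norm = A.T @ M @ A       # transformed matrix mu' = A^T mu A
```

Since A is symmetric, M_norm = M^(−1/2) M M^(−1/2) is the identity, so the normalized structure is perfectly isotropic while the input is not.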


Feature Matching. Estimated scale factor: 4.9. Estimated rotation: 19°.

Feature Matching.

(Figures courtesy of Silvio Savarese, EECS442 Lecture 17.)


Conclusion. Scale invariance: Harris + automatic scale selection (LoG). Affine invariance: Harris + affine adaptation.

Computational Complexity.

Future Work: stability and convergence; invariance under occlusions; new applications.

References:
1. Kadir, T. and Brady, M. 2001. Scale, saliency and image description. International Journal of Computer Vision, 45(2):83-105.
2. Lowe, D.G. 1999. Object recognition from local scale-invariant features. In Proceedings of the 7th International Conference on Computer Vision, Kerkyra, Greece, pp. 1150-1157.
3. Lindeberg, T. 1998. Feature detection with automatic scale selection. International Journal of Computer Vision, 30(2):79-116.
4. Lindeberg, T. and Garding, J. 1997. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, 15(6):415-434.
5. Harris, C. and Stephens, M. 1988. A combined corner and edge detector. In Proceedings of the Alvey Vision Conference, pp. 147-152.
6. Hartley, R.I. and Zisserman, A. 2004. Multiple View Geometry in Computer Vision. Cambridge University Press, ISBN 0521540518.

Evaluation of Feature Detectors and Descriptors based on 3D Objects, by Pierre Moreels and Pietro Perona, California Institute of Technology, Pasadena, CA. Presented by Hunter Brown & Gaurav Pandey, January 20, 2009.

Objective: to explore the performance of a number of popular feature detectors and descriptors in matching 3D object features across viewpoints and lighting conditions.

Motivation. Critical issues in the detection, description and matching of features are:
- Robustness with respect to viewpoint and lighting changes.
- The number of features detected in a typical image.
- The frequency of false alarms and mismatches.
- The computational cost of each step.
Different applications weigh these requirements differently. For example, object recognition, SLAM and wide-baseline stereo all demand robustness to viewpoint, while the frequency of false matches is more critical in object recognition, where thousands of potentially matching images are considered, than in wide-baseline stereo and mosaicing, where only a few images are present.

Previous Work. The first extensive study of feature stability as a function of the feature detector was performed by Schmid et al. (2000). The database consisted of drawings and paintings photographed from a number of viewpoints. The key point here is that all scenes were planar, so the transformation between two images taken from different viewpoints was a homography. Ground truth (the homography) was computed from a grid of artificial points projected onto the paintings. Performance was measured by the repeatability rate, i.e. the percentage of locations that are selected as features in both images.

Previous Work (Schmid et al.): Repeatability Criterion. 3D points detected in one image should also be detected at approximately corresponding positions in subsequent images:

x_i = H_1i x_1,  where H_1i = P_i P_1⁻¹
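The repeatability criterion can be sketched as a small function; names and the pixel tolerance eps are my own choices, and points are (x, y) pairs mapped through the ground-truth homography H in homogeneous coordinates.

```python
import numpy as np

def repeatability(pts1, pts2, H, eps=1.5):
    """Fraction of points in image 1 that, mapped through the homography H,
    land within eps pixels of a point detected in image 2 (a simplified
    form of the repeatability rate)."""
    matched = 0
    for p in pts1:
        q = H @ np.array([p[0], p[1], 1.0])   # map to image 2
        q = q[:2] / q[2]                      # back to inhomogeneous coords
        d = min(np.hypot(q[0] - r[0], q[1] - r[1]) for r in pts2)
        if d <= eps:
            matched += 1
    return matched / len(pts1)
```

For a pure-translation homography, points that reappear near their mapped positions count as repeated and the rest do not.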

Previous Work (Schmid et al.): Ground Truth / Homography Estimation.

Previous Work. Mikolajczyk et al. (2004) performed a similar study of affine-invariant feature detectors. Again they used planar scenes, and the ground truth homography was computed using manually selected correspondences. Note: they completely excluded the descriptor from their study. A detector that fires at most points of the image would therefore appear to produce stable features under their criterion, which is not true. Mikolajczyk and Schmid (2005) provided a complementary study where the focus was not on the detector stage but on the descriptor.

In this paper:
- Focus on 3D objects instead of just planar surfaces, which allows greater variability in viewpoint.
- Focus on both the detector and the descriptor, rather than only one of them.
- Ground truth estimated from the epipolar constraint, because the dataset is no longer planar.

Ground Truth Estimation.

Ground Truth Estimation. Potential matches for p have to lie on the corresponding epipolar line l.
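The epipolar constraint used for ground truth reduces to a point-to-line distance: l = F p is the epipolar line of p, and a candidate match q is consistent if it lies close to l. A minimal sketch (the rectified-stereo fundamental matrix in the test is purely illustrative):

```python
import numpy as np

def epipolar_distance(F, p, q):
    """Distance from q (in image 2) to the epipolar line l = F p of a
    point p in image 1; points are (x, y), F is a 3x3 fundamental matrix."""
    l = F @ np.array([p[0], p[1], 1.0])   # line coefficients: a x + b y + c = 0
    a, b, c = l
    return abs(a * q[0] + b * q[1] + c) / np.hypot(a, b)
```

For rectified stereo (translation along x), epipolar lines are horizontal, so the distance is simply the difference in y coordinates.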

Ground Truth Estimation.

Feature Detectors Based on the Second Moment Matrix. The motivation for these detectors is to select points where the image intensity has high variability in both the x and y directions.
- Förstner Detector (1986): selects as features the local maxima of the corner strength function det(μ)/tr(μ).
- Harris Corner Detector (1988): selects as features the extrema of the cornerness function det(μ) − k·tr(μ)².
- Lucas-Tomasi-Kanade Feature Detector (1994): very similar to Harris, but with a greedy corner selection criterion.

Lucas-Tomasi-Kanade Corner Detector. Very similar to Harris, but with a greedy corner selection criterion. A corner is detected where the eigenvalues of the second moment matrix are large:
- Put all points for which λ_1 > thresh in a list L.
- Sort the list in decreasing order of λ_1.
- Declare the highest pixel p in L to be a corner, then remove all points from L that are within a D×D neighborhood of p.
- Continue until L is empty.
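The greedy selection loop above can be sketched directly (function name is mine; I interpret "within a D×D neighborhood" as Chebyshev distance at most D//2):

```python
import numpy as np

def greedy_corner_selection(lam1, thresh, D):
    """Greedy selection: keep the strongest-eigenvalue pixel, suppress
    everything within its DxD neighborhood, and repeat. lam1 is a 2-D
    array holding the eigenvalue used for ranking."""
    candidates = [(lam1[y, x], y, x)
                  for y in range(lam1.shape[0])
                  for x in range(lam1.shape[1])
                  if lam1[y, x] > thresh]
    candidates.sort(reverse=True)            # decreasing order of lam1
    corners, half = [], D // 2
    for val, y, x in candidates:
        # Keep only if outside the DxD neighborhood of every accepted corner.
        if all(abs(y - cy) > half or abs(x - cx) > half for cy, cx in corners):
            corners.append((y, x))
    return corners
```

A weaker response adjacent to an accepted corner is suppressed, while distant responses survive, which is exactly the non-maximum suppression the slide describes.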

Feature Detectors: interest point detection performed at different scales.
- The Difference-of-Gaussian Detector (1994): selects scale-space extrema of the image filtered by a difference of Gaussians.
- The Kadir-Brady Detector (2004): selects locations where the local entropy has a maximum over scale and where the intensity probability density function varies fastest.
- MSER Features (2002): based on watershed flooding performed on the image intensities.
- Scale and Affine Invariant Interest Point Detectors (2004): compute a multi-scale representation for the Harris interest point detector, then select points at which a local measure (the Laplacian) is maximal over scales.

Kadir-Brady Detector. The method consists of three steps:
1. Calculate the Shannon entropy H_D(x, s) of local image attributes for each x over a range of scales.
2. Select the scales s_p at which the entropy-over-scale function exhibits a peak.
3. Calculate the magnitude change W_D(x, s) of the PDF as a function of scale at each peak.
The final saliency Y_D(x, s_p) is the product of H_D(x, s_p) and W_D(x, s_p).
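The entropy term H_D(x, s) of this saliency can be sketched as the Shannon entropy of the grey-level histogram inside a circular window. This is a simplified sketch of mine (window shape, bin count, and names are my assumptions, not the authors' implementation):

```python
import numpy as np

def local_entropy(img, x, y, radius, bins=16):
    """Shannon entropy of the grey-level histogram inside a circular
    window of the given radius centred at (x, y) -- the H_D(x, s) term
    of the Kadir-Brady saliency, in simplified form."""
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    mask = (yy - y)**2 + (xx - x)**2 <= radius**2
    hist, _ = np.histogram(img[mask], bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]                       # drop empty bins (0 log 0 = 0)
    return float(-(p * np.log2(p)).sum())
```

A flat region has zero entropy, a textured region has high entropy; sweeping the radius and locating peaks of this function over scale gives the s_p of step 2.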

MSER: Maximally Stable Extremal Regions. Threshold the image at successive intensity levels, form blobs, and look for the intensity range over which a watershed basin remains relatively stable.

Descriptors. The role of the descriptor is to characterize the local image appearance around the location identified by the feature detector. Some popular descriptors are:
- SIFT features (Lowe, 2004) are computed from gradient information.
- PCA-SIFT (Ke and Sukthankar, 2004) computes a primary orientation similarly to SIFT. Local patches are then projected onto a lower-dimensional space using PCA.
- Steerable filters (Freeman and Adelson, 1991) steer derivatives in a particular direction given the components of the local jet. E.g., steering derivatives in the direction of the gradient makes them invariant to rotation. Scale invariance is achieved by using various filter sizes.
- The shape context descriptor (Belongie et al., 2002) is comparable to SIFT, but based on edges. Edges are extracted with the Canny filter, and their location and orientation are then quantized into histograms using log-polar coordinates.


Steerable Filters. Consider the 2D Gaussian G(x, y) = e^(−(x²+y²)). Its first derivative in the x direction is G₁⁰ = ∂G/∂x, and the same function rotated by 90° is G₁⁹⁰ = ∂G/∂y. The derivative steered to any arbitrary angle θ can then be written as

G₁^θ = cos(θ) G₁⁰ + sin(θ) G₁⁹⁰

Convolving these filters with the image in the direction of the gradient makes them invariant to rotation; scale invariance is achieved by using various filter sizes.
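The steering identity can be checked numerically: the steered combination of the two basis derivatives must equal the directional derivative of G along (cos θ, sin θ). A short sketch (function names and the finite-difference check are mine):

```python
import numpy as np

def g1(theta, x, y):
    """First Gaussian derivative steered to angle theta (radians):
    G1^theta = cos(theta) * G1^0 + sin(theta) * G1^90."""
    g = np.exp(-(x**2 + y**2))
    g1_0 = -2.0 * x * g       # dG/dx
    g1_90 = -2.0 * y * g      # dG/dy
    return np.cos(theta) * g1_0 + np.sin(theta) * g1_90

def directional_derivative(theta, x, y, h=1e-6):
    """Numerical derivative of G along the unit direction (cos t, sin t)."""
    G = lambda u, v: np.exp(-(u**2 + v**2))
    dx, dy = np.cos(theta), np.sin(theta)
    return (G(x + h * dx, y + h * dy) - G(x - h * dx, y - h * dy)) / (2 * h)
```

Only two basis filters are needed to synthesize the derivative at every orientation, which is what makes the gradient-aligned version rotation invariant.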

Performance Evaluation

Distance Measure in Appearance Space

Results

SIFT and shape context descriptors with different detectors

Hessian/Harris-Affine and DoG detectors with different descriptors

Best detector for a given descriptor

Stable keypoints as the complexity of the object increases

Conclusion. The best overall choice is an affine-rectified detector (Harris/Hessian-Affine, Mikolajczyk and Schmid) followed by a SIFT (Lowe) or shape context (Belongie et al.) descriptor.
