SIFT: Scale Invariant Feature Transform

Similar documents
Advanced Features. Advanced Features: Topics. Jana Kosecka. Slides from: S. Thurn, D. Lowe, Forsyth and Ponce. Advanced features and feature matching

Harris Corner Detector

SIFT: SCALE INVARIANT FEATURE TRANSFORM BY DAVID LOWE

INTEREST POINTS AT DIFFERENT SCALES

Feature detectors and descriptors. Fei-Fei Li

Feature detectors and descriptors. Fei-Fei Li

CSE 473/573 Computer Vision and Image Processing (CVIP)

Scale-space image processing

CS5670: Computer Vision

Feature extraction: Corners and blobs

Lecture 8: Interest Point Detection. Saad J Bedros

Edges and Scale. Image Features. Detecting edges. Origin of Edges. Solution: smooth first. Effects of noise

Blob Detection CSC 767

LoG Blob Finding and Scale. Scale Selection. Blobs (and scale selection) Achieving scale covariance. Blob detection in 2D. Blob detection in 2D

Detectors part II Descriptors

Vlad Estivill-Castro (2016) Robots for People --- A project for intelligent integrated systems

Achieving scale covariance

6.869 Advances in Computer Vision. Prof. Bill Freeman March 1, 2005

SIFT keypoint detection. D. Lowe, Distinctive image features from scale-invariant keypoints, IJCV 60 (2), pp , 2004.

Corners, Blobs & Descriptors. With slides from S. Lazebnik & S. Seitz, D. Lowe, A. Efros

Image Processing 1 (IP1) Bildverarbeitung 1

EECS150 - Digital Design Lecture 15 SIFT2 + FSM. Recap and Outline

Lecture 12. Local Feature Detection. Matching with Invariant Features. Why extract features? Why extract features? Why extract features?

Blobs & Scale Invariance

Lecture 7: Finding Features (part 2/2)

Lecture 7: Finding Features (part 2/2)

Lecture 8: Interest Point Detection. Saad J Bedros

Image Analysis. Feature extraction: corners and blobs

Recap: edge detection. Source: D. Lowe, L. Fei-Fei

Image matching. by Diva Sian. by swashford

SURF Features. Jacky Baltes Dept. of Computer Science University of Manitoba WWW:

Advances in Computer Vision. Prof. Bill Freeman. Image and shape descriptors. Readings: Mikolajczyk and Schmid; Belongie et al.

Interest Operators. All lectures are from posted research papers. Harris Corner Detector: the first and most basic interest operator

SIFT, GLOH, SURF descriptors. Dipartimento di Sistemi e Informatica

Extract useful building blocks: blobs. the same image like for the corners

Scale & Affine Invariant Interest Point Detectors

CS 3710: Visual Recognition Describing Images with Features. Adriana Kovashka Department of Computer Science January 8, 2015

Overview. Harris interest points. Comparing interest points (SSD, ZNCC, SIFT) Scale & affine invariant interest points

Properties of detectors Edge detectors Harris DoG Properties of descriptors SIFT HOG Shape context

Instance-level l recognition. Cordelia Schmid INRIA

Edge Detection. CS 650: Computer Vision

Scale & Affine Invariant Interest Point Detectors

ECE 468: Digital Image Processing. Lecture 8

Given a feature in I 1, how to find the best match in I 2?

Overview. Introduction to local features. Harris interest points + SSD, ZNCC, SIFT. Evaluation and comparison of different detectors

ECE Digital Image Processing and Introduction to Computer Vision

Orientation Map Based Palmprint Recognition

Instance-level recognition: Local invariant features. Cordelia Schmid INRIA, Grenoble

EE 6882 Visual Search Engine

Wavelet-based Salient Points with Scale Information for Classification

Lecture 6: Edge Detection. CAP 5415: Computer Vision Fall 2008

Lec 12 Review of Part I: (Hand-crafted) Features and Classifiers in Image Classification

Instance-level recognition: Local invariant features. Cordelia Schmid INRIA, Grenoble

Instance-level recognition: Local invariant features. Cordelia Schmid INRIA, Grenoble

Templates, Image Pyramids, and Filter Banks

Lecture 7: Edge Detection

Keypoint extraction: Corners Harris Corners Pkwy, Charlotte, NC

CS4670: Computer Vision Kavita Bala. Lecture 7: Harris Corner Detec=on

Laplacian Filters. Sobel Filters. Laplacian Filters. Laplacian Filters. Laplacian Filters. Laplacian Filters

Filtering and Edge Detection

Visual Object Recognition

Taking derivative by convolution

Introduction to Computer Vision

Lecture 6: Finding Features (part 1/2)

Invariant local features. Invariant Local Features. Classes of transformations. (Good) invariant local features. Case study: panorama stitching

Robert Collins CSE598G Mean-Shift Blob Tracking through Scale Space

Overview. Introduction to local features. Harris interest points + SSD, ZNCC, SIFT. Evaluation and comparison of different detectors

Instance-level l recognition. Cordelia Schmid & Josef Sivic INRIA

Edge Detection PSY 5018H: Math Models Hum Behavior, Prof. Paul Schrater, Spring 2005

Optical Flow, Motion Segmentation, Feature Tracking

Machine vision. Summary # 4. The mask for Laplacian is given

Corner detection: the basic idea

Lecture 04 Image Filtering

Machine vision, spring 2018 Summary 4

Maximally Stable Local Description for Scale Selection

Introduction to Computer Vision. 2D Linear Systems

Edge Detection. Computer Vision P. Schrater Spring 2003

Roadmap. Introduction to image analysis (computer vision) Theory of edge detection. Applications

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

Slide a window along the input arc sequence S. Least-squares estimate. σ 2. σ Estimate 1. Statistically test the difference between θ 1 and θ 2

TRACKING and DETECTION in COMPUTER VISION Filtering and edge detection

Corner. Corners are the intersections of two edges of sufficiently different orientations.

Feature Tracking. 2/27/12 ECEn 631

Feature detection.

CS 534: Computer Vision Segmentation III Statistical Nonparametric Methods for Segmentation

Edge Detection. Introduction to Computer Vision. Useful Mathematics Funcs. The bad news

Lesson 04. KAZE, Non-linear diffusion filtering, ORB, MSER. Ing. Marek Hrúz, Ph.D.

Lucas-Kanade Optical Flow. Computer Vision Carnegie Mellon University (Kris Kitani)

Lecture 3: The Shape Context

Multimedia Databases. Previous Lecture. 4.1 Multiresolution Analysis. 4 Shape-based Features. 4.1 Multiresolution Analysis

Lecture 3: Linear Filters

Is the Scale Invariant Feature Transform (SIFT) really Scale Invariant?

Multimedia Databases. Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig

Image Stitching II. Linda Shapiro CSE 455

Affine Adaptation of Local Image Features Using the Hessian Matrix

Face detection and recognition. Detection Recognition Sally

Reading. 3. Image processing. Pixel movement. Image processing Y R I G Q

Lecture 3: Linear Filters

Mixture Models and EM

Transcription:

1 SIFT: Scale Invariant Feature Transform With slides from Sebastian Thrun Stanford CS223B Computer Vision, Winter 2006

3 Pattern Recognition Want to find in here

SIFT Invariances: Scaling Rotation Illumination Deformation Provides Good localization Yes Yes Yes Maybe Yes Distinctive image features from scale-invariant keypoints. David G. Lowe, International Journal of Computer Vision, 60, 2 (2004), pp. 91-110. 4

5 Invariant Local Features Image content is transformed into local feature coordinates that are invariant to translation, rotation, scale, and other imaging parameters SIFT Features

Advantages of invariant local features Locality: features are local, so robust to occlusion and clutter (no prior segmentation) Distinctiveness: individual features can be matched to a large database of objects Quantity: many features can be generated for even small objects Efficiency: close to real-time performance Extensibility: can easily be extended to wide range of differing feature types, with each adding robustness

SIFT On-A-Slide 1. Enforce invariance to scale: Compute Gaussian difference max, for may different scales; non-maximum suppression, find local maxima: keypoint candidates 2. Localizable corner: For each maximum fit quadratic function. Compute center with sub-pixel accuracy by setting first derivative to zero. 3. Eliminate edges: Compute ratio of eigenvalues, drop keypoints for which this ratio is larger than a threshold. 4. Enforce invariance to orientation: Compute orientation, to achieve scale invariance, by finding the strongest second derivative direction in the smoothed image (possibly multiple orientations). Rotate patch so that orientation points up. 5. Compute feature signature: Compute a "gradient histogram" of the local image region in a 4x4 pixel region. Do this for 4x4 regions of that size. Orient so that largest gradient points up (possibly multiple solutions). Result: feature vector with 128 values (15 fields, 8 gradients). 6. Enforce invariance to illumination change and camera saturation: Normalize to unit length to increase invariance to illumination. Then threshold all gradients, to become invariant to camera saturation. 7

SIFT On-A-Slide 1. Enforce invariance to scale: Compute Gaussian difference max, for may different scales; non-maximum suppression, find local maxima: keypoint candidates 2. Localizable corner: For each maximum fit quadratic function. Compute center with sub-pixel accuracy by setting first derivative to zero. 3. Eliminate edges: Compute ratio of eigenvalues, drop keypoints for which this ratio is larger than a threshold. 4. Enforce invariance to orientation: Compute orientation, to achieve scale invariance, by finding the strongest second derivative direction in the smoothed image (possibly multiple orientations). Rotate patch so that orientation points up. 5. Compute feature signature: Compute a "gradient histogram" of the local image region in a 4x4 pixel region. Do this for 4x4 regions of that size. Orient so that largest gradient points up (possibly multiple solutions). Result: feature vector with 128 values (15 fields, 8 gradients). 6. Enforce invariance to illumination change and camera saturation: Normalize to unit length to increase invariance to illumination. Then threshold all gradients, to become invariant to camera saturation. 8

9 Find Invariant Corners 1. Enforce invariance to scale: Compute Gaussian difference max, for may different scales; non-maximum suppression, find local maxima: keypoint candidates

Finding Keypoints (Corners) Idea: Find Corners, but scale invariance Approach: Run linear filter (diff of Gaussians) At different resolutions of image pyramid 10

11 Difference of Gaussians Minus Equals

Difference of Gaussians surf(fspecial('gaussian',40,4)) surf(fspecial('gaussian',40,8)) surf(fspecial('gaussian',40,8) - fspecial('gaussian',40,4)) 12

Find Corners with DiffOfGauss im =imread('bridge.jpg'); bw = double(im(:,:,1)) / 256; for i = 1 : 10 gaussd = fspecial('gaussian',40,2*i) - fspecial('gaussian',40,i); res = abs(conv2(bw, gaussd, 'same')); res = res / max(max(res)); imshow(res) ; title(['\bf i = ' num2str(i)]); drawnow end 13

Gaussian Kernel Size i=1 14

Gaussian Kernel Size i=2 15

Gaussian Kernel Size i=3 16

Gaussian Kernel Size i=4 17

Gaussian Kernel Size i=5 18

Gaussian Kernel Size i=6 19

Gaussian Kernel Size i=7 20

Gaussian Kernel Size i=8 21

Gaussian Kernel Size i=9 22

Gaussian Kernel Size i=10 23

Res ample Blur Subtra ct 24 Key point localization Detect maxima and minima of difference-of-gaussian in scale space

25 Example of keypoint detection (a) 233x189 image (b) 832 DOG extrema (c) 729 above threshold

SIFT On-A-Slide 1. Enforce invariance to scale: Compute Gaussian difference max, for may different scales; non-maximum suppression, find local maxima: keypoint candidates 2. Localizable corner: For each maximum fit quadratic function. Compute center with sub-pixel accuracy by setting first derivative to zero. 3. Eliminate edges: Compute ratio of eigenvalues, drop keypoints for which this ratio is larger than a threshold. 4. Enforce invariance to orientation: Compute orientation, to achieve scale invariance, by finding the strongest second derivative direction in the smoothed image (possibly multiple orientations). Rotate patch so that orientation points up. 5. Compute feature signature: Compute a "gradient histogram" of the local image region in a 4x4 pixel region. Do this for 4x4 regions of that size. Orient so that largest gradient points up (possibly multiple solutions). Result: feature vector with 128 values (15 fields, 8 gradients). 6. Enforce invariance to illumination change and camera saturation: Normalize to unit length to increase invariance to illumination. Then threshold all gradients, to become invariant to camera saturation. 26

27 Example of keypoint detection Threshold on value at DOG peak and on ratio of principle curvatures (Harris approach) (c) 729 left after peak value threshold (from 832) (d) 536 left after testing ratio of principle curvatures

SIFT On-A-Slide 1. Enforce invariance to scale: Compute Gaussian difference max, for may different scales; non-maximum suppression, find local maxima: keypoint candidates 2. Localizable corner: For each maximum fit quadratic function. Compute center with sub-pixel accuracy by setting first derivative to zero. 3. Eliminate edges: Compute ratio of eigenvalues, drop keypoints for which this ratio is larger than a threshold. 4. Enforce invariance to orientation: Compute orientation, to achieve scale invariance, by finding the strongest second derivative direction in the smoothed image (possibly multiple orientations). Rotate patch so that orientation points up. 5. Compute feature signature: Compute a "gradient histogram" of the local image region in a 4x4 pixel region. Do this for 4x4 regions of that size. Orient so that largest gradient points up (possibly multiple solutions). Result: feature vector with 128 values (15 fields, 8 gradients). 6. Enforce invariance to illumination change and camera saturation: Normalize to unit length to increase invariance to illumination. Then threshold all gradients, to become invariant to camera saturation. 28

Select canonical orientation Create histogram of local gradient directions computed at selected scale Assign canonical orientation at peak of smoothed histogram Each key specifies stable 2D coordinates (x, y, scale, orientation) 0 2π 29

SIFT On-A-Slide 1. Enforce invariance to scale: Compute Gaussian difference max, for may different scales; non-maximum suppression, find local maxima: keypoint candidates 2. Localizable corner: For each maximum fit quadratic function. Compute center with sub-pixel accuracy by setting first derivative to zero. 3. Eliminate edges: Compute ratio of eigenvalues, drop keypoints for which this ratio is larger than a threshold. 4. Enforce invariance to orientation: Compute orientation, to achieve scale invariance, by finding the strongest second derivative direction in the smoothed image (possibly multiple orientations). Rotate patch so that orientation points up. 5. Compute feature signature: Compute a "gradient histogram" of the local image region in a 4x4 pixel region. Do this for 4x4 regions of that size. Orient so that largest gradient points up (possibly multiple solutions). Result: feature vector with 128 values (15 fields, 8 gradients). 6. Enforce invariance to illumination change and camera saturation: Normalize to unit length to increase invariance to illumination. Then threshold all gradients, to become invariant to camera saturation. 30

SIFT vector formation Thresholded image gradients are sampled over 16x16 array of locations in scale space Create array of orientation histograms 8 orientations x 4x4 histogram array = 128 dimensions 31

Nearest-neighbor matching to feature database Hypotheses are generated by approximate nearest neighbor matching of each feature to vectors in the database SIFT use best-bin-first (Beis & Lowe, 97) modification to k-d tree algorithm Use heap data structure to identify bins in order by their distance from query point Result: Can give speedup by factor of 1000 while finding nearest neighbor (of interest) 95% of the time 32

33 3D Object Recognition Extract outlines with background subtraction

3D Object Recognition Only 3 keys are needed for recognition, so extra keys provide robustness Affine model is no longer as accurate 34

Recognition under occlusion 35

36 Test of illumination invariance Same image under differing illumination 273 keys verified in final match

Examples of view interpolation 37

Location recognition 38

SIFT Invariances: Scaling Rotation Illumination Deformation Provides Good localization Yes Yes Yes Maybe Yes 39