Achieving scale covariance


Achieving scale covariance. Goal: independently detect corresponding regions in scaled versions of the same image. This needs a scale selection mechanism for finding a characteristic region size that is covariant with the image transformation. S.Lazebnik, UNC

Blobs (and scale selection) S.Lazebnik, UNC

Blob detection in 2D. Laplacian of Gaussian: circularly symmetric operator for blob detection in 2D, $\nabla^2 g = \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2}$. The $\sigma$ of the Gaussian controls the radius of the operator. S.Lazebnik, UNC

Blob detection in 2D. Scale-normalized Laplacian of Gaussian: $\nabla^2_{\mathrm{norm}} g = \sigma^2 \left( \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2} \right)$. The normalization is needed to make the filter response insensitive to the scale. S.Lazebnik, UNC

LoG Blob Finding and Scale. Laplacian of Gaussian (LoG) filter extrema locate blobs: maxima = dark blobs on light background, minima = light blobs on dark background. The scale of a blob (its size, i.e. radius in pixels) is determined by the sigma parameter of the LoG filter. [Figure panels: LoG sigma = 2; LoG sigma = 10]

Scale Selection Laplacian operator.

Scale selection. At what scale does the Laplacian achieve a maximum response to a binary circle of radius r? To get maximum response, the zeros of the Laplacian have to be aligned with the circle. The Laplacian is given by (up to scale): $(x^2 + y^2 - 2\sigma^2)\, e^{-(x^2 + y^2)/2\sigma^2}$. Its zeros lie on the circle $x^2 + y^2 = 2\sigma^2$, so the maximum response occurs at $\sigma = r/\sqrt{2}$. [Figure: binary circle image of radius r and the Laplacian profile]
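The claim above can be checked numerically. The following is a minimal sketch, assuming NumPy and SciPy are available; the helper name `center_response`, the image size, and the sigma grid are illustrative choices and not part of the slides. It evaluates the scale-normalized LoG at the center of a binary disk and reports the sigma with the strongest response.

```python
import numpy as np
from scipy import ndimage

def center_response(radius, sigmas, size=161):
    """Scale-normalized LoG response at the center of a binary disk."""
    c = size // 2
    yy, xx = np.mgrid[:size, :size]
    disk = ((xx - c) ** 2 + (yy - c) ** 2 <= radius ** 2).astype(float)
    # sigma**2 * LoG is the scale-normalized Laplacian.
    return np.array([s ** 2 * ndimage.gaussian_laplace(disk, s)[c, c]
                     for s in sigmas])

radius = 16                            # illustrative disk radius in pixels
sigmas = np.arange(6.0, 18.01, 0.5)    # illustrative sigma grid
responses = center_response(radius, sigmas)
best_sigma = sigmas[np.argmax(np.abs(responses))]
print(f"strongest response at sigma = {best_sigma:.2f}, "
      f"r / sqrt(2) = {radius / np.sqrt(2):.2f}")
```

The response is negative here because the disk is bright on a dark background, which is why the sketch compares absolute values.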

Characteristic scale. We define the characteristic scale of a blob as the scale that produces the peak of the Laplacian response at the blob center. T. Lindeberg (1998). "Feature detection with automatic scale selection." International Journal of Computer Vision 30 (2): pp 77--116.

Scale-space blob detector. 1. Convolve the image with the scale-normalized Laplacian at several scales. 2. Find maxima of the squared Laplacian response in scale-space.
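The two steps can be sketched as follows. This is a minimal illustration, assuming SciPy's `ndimage` module; the function name `detect_blobs`, the threshold, and the synthetic test image are assumptions made for the example.

```python
import numpy as np
from scipy import ndimage

def detect_blobs(image, sigmas, threshold):
    """Return (row, col, sigma) for maxima of the squared normalized LoG."""
    # Step 1: scale-normalized Laplacian at several scales, squared.
    stack = np.stack([(s ** 2 * ndimage.gaussian_laplace(image, s)) ** 2
                      for s in sigmas])
    # Step 2: keep points that are the maximum of their 3x3x3
    # scale-space neighborhood and exceed the threshold.
    is_peak = (stack == ndimage.maximum_filter(stack, size=3)) & (stack > threshold)
    ks, rows, cols = np.nonzero(is_peak)
    return [(int(r), int(c), float(sigmas[k])) for k, r, c in zip(ks, rows, cols)]

# Synthetic image: one bright disk of radius 10 centered at (row 40, col 60).
yy, xx = np.mgrid[:100, :100]
image = ((yy - 40) ** 2 + (xx - 60) ** 2 <= 10 ** 2).astype(float)
blobs = detect_blobs(image, np.arange(3.0, 12.01, 0.5), threshold=0.4)
print(blobs)
```

The detected sigma can be converted to a blob radius with r = sigma * sqrt(2), following the scale selection argument earlier in the transcript.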

Scale-space blob detector: Example S.Lazebnik, UNC

Efficient implementation. Approximating the Laplacian with a difference of Gaussians: $L = \sigma^2 \left( G_{xx}(x, y, \sigma) + G_{yy}(x, y, \sigma) \right)$ (Laplacian), $DoG = G(x, y, k\sigma) - G(x, y, \sigma)$ (Difference of Gaussians). Why do this: a 2D Gaussian filter is separable into two 1D filters, making it more efficient to compute.
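A quick way to see how close the approximation is: for k near 1, the DoG response is roughly (k - 1) times the scale-normalized Laplacian response. The sketch below compares the two on a random image; the values of `sigma` and `k` and the use of white noise are assumptions chosen for illustration.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
image = rng.standard_normal((128, 128))   # illustrative test image

sigma = 3.0
k = 2 ** (1 / 3)                          # illustrative scale ratio

# Difference of Gaussians: G(k * sigma) - G(sigma), each a separable filter.
dog = ndimage.gaussian_filter(image, k * sigma) - ndimage.gaussian_filter(image, sigma)
# Scale-normalized Laplacian, multiplied by (k - 1) for comparison.
log_scaled = (k - 1) * sigma ** 2 * ndimage.gaussian_laplace(image, sigma)

corr = np.corrcoef(dog.ravel(), log_scaled.ravel())[0, 1]
print(f"correlation between DoG and (k - 1) * sigma^2 * LoG: {corr:.3f}")
```

The match is not exact for a finite k, so the correlation is high but below 1.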

Structure: per-pixel transformations; interest points (edges, corners, blobs); feature descriptors. Computer vision: models, learning and inference. 2011 Simon J.D. Prince

Feature Descriptors. Most feature descriptors can be thought of as either templates (intensity, gradients, etc.), histograms (color, texture, SIFT descriptors, etc.), or combinations of both. James Hays, Brown U.

Textons. An attempt to characterize texture: replace each pixel with an integer representing the texture type.

Computing Textons. Take a bank of filters and apply it to lots of images. Cluster in filter space. For a new pixel, filter the surrounding region with the same filter bank and assign it to the nearest cluster.
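A minimal sketch of this procedure, under stated assumptions: the filter bank here (Gaussian derivatives and a LoG at two scales) is only an illustrative stand-in for whatever bank is actually used, the function names are hypothetical, and the `seed` argument of `kmeans2` needs a reasonably recent SciPy.

```python
import numpy as np
from scipy import ndimage
from scipy.cluster.vq import kmeans2, vq

def filter_responses(image, sigmas=(1.0, 2.0)):
    """Per-pixel responses of a small illustrative filter bank, shape (H*W, F)."""
    image = np.asarray(image, dtype=float)
    feats = []
    for s in sigmas:
        feats.append(ndimage.gaussian_filter(image, s, order=(0, 1)))  # d/dx
        feats.append(ndimage.gaussian_filter(image, s, order=(1, 0)))  # d/dy
        feats.append(ndimage.gaussian_laplace(image, s))
    return np.stack(feats, axis=-1).reshape(-1, len(feats))

def learn_textons(images, k, seed=0):
    """Cluster filter responses from many images; returns k cluster centers."""
    data = np.concatenate([filter_responses(im) for im in images])
    centers, _ = kmeans2(data, k, minit="++", seed=seed)
    return centers

def texton_map(image, centers):
    """Replace each pixel with the index of its nearest cluster center."""
    labels, _ = vq(filter_responses(image), centers)
    return labels.reshape(np.shape(image))

rng = np.random.default_rng(1)
training_images = [rng.random((32, 32)) for _ in range(3)]  # placeholder data
centers = learn_textons(training_images, k=4)
tmap = texton_map(rng.random((32, 32)), centers)
print(tmap.shape, tmap.min(), tmap.max())
```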

Bag of words descriptor. Compute visual features in the image. Compute a descriptor around each. Find the closest match in the library and assign its index. Compute a histogram of these indices over the region. The dictionary is computed using K-means.
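The histogram step can be sketched as below. The descriptors are random placeholders standing in for real feature descriptors, and the function names `build_dictionary` and `bag_of_words` are hypothetical.

```python
import numpy as np
from scipy.cluster.vq import kmeans2, vq

def build_dictionary(descriptors, k, seed=0):
    """Dictionary of k visual words computed with K-means."""
    centers, _ = kmeans2(descriptors, k, minit="++", seed=seed)
    return centers

def bag_of_words(descriptors, dictionary):
    """Histogram of nearest-word indices for the descriptors of one region."""
    indices, _ = vq(descriptors, dictionary)
    return np.bincount(indices, minlength=len(dictionary))

rng = np.random.default_rng(0)
training_descriptors = rng.random((500, 16))   # placeholder descriptors
dictionary = build_dictionary(training_descriptors, k=10)
region_descriptors = rng.random((40, 16))      # descriptors found in one region
histogram = bag_of_words(region_descriptors, dictionary)
print(histogram)
```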

SIFT Descriptor. Goal: produce a scale- and rotation-invariant vector that describes the region around an interest point. All calculations are relative to the orientation and scale of the keypoint, which makes the descriptor invariant to rotation and scale.

HoG Descriptor. HOG = Histogram of Oriented Gradients.

Shape Context Descriptor. Count the number of points inside each bin, e.g.: Count = 4 ... Count = 10. Log-polar binning: more precision for nearby points, more flexibility for farther points. Belongie & Malik, ICCV 2001. K. Grauman, B. Leibe

Shape Context Descriptor K. Grauman, B. Leibe

SIFT Detector/Descriptor Algorithm Steps. Let's look under the hood at one of the most popular image feature detectors. Steps: generating the Gaussian and DOG pyramid; locating scale-space extrema; eliminating edge responses (related to Harris corners); computing patch orientation; spatial histogram of gradients. References: Lowe, "Distinctive Image Features from Scale Invariant Keypoints", International Journal of Computer Vision, Vol. 60(2):91-110, 2004. Crowley et al., "Fast Computation of Characteristic Scale using a Half-Octave Pyramid", Cognitive Vision Workshop, Zurich, Switzerland, Sept 19-20, 2002.

Generating a Gaussian Pyramid. Basic functions: Blur (convolve with Gaussian to smooth image), DownSample (reduce image size by half), UpSample (double image size).

Using a Half-Octave Pyramid. From Crowley et al., "Fast Computation of Characteristic Scale using a Half-Octave Pyramid". General idea: cascaded filtering using the [1 4 6 4 1] kernel (sigma = 1) to generate a pyramid with two images per octave (power of 2 change in resolution). When we reach a full octave, downsample the image.

Half Octave Gaussian and DOG. [Figure: cascade of blur and downsample steps producing Gaussian images G1, G2, ... and DOG images L1, L2, ..., with an intermediate "tmp" image blurred before each downsample.] Image sizes: G1, L1: N x M; G2, L2: N x M; G3, L3: N/2 x M/2; G4, L4: N/2 x M/2; G5, L5: N/4 x M/4; and so on.
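The cascade can be sketched roughly as below. The exact wiring (which differences form each DOG level, and where the intermediate image is taken) is an assumption read off the figure labels, not Crowley et al.'s published schedule; the function names are hypothetical.

```python
import numpy as np
from scipy import ndimage

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0   # binomial kernel, variance 1

def blur(image):
    """Separable blur: the 1D kernel is applied along rows, then columns."""
    tmp = ndimage.convolve1d(image, KERNEL, axis=0, mode="reflect")
    return ndimage.convolve1d(tmp, KERNEL, axis=1, mode="reflect")

def downsample(image):
    """Reduce image size by half by keeping every second pixel."""
    return image[::2, ::2]

def half_octave_pyramid(image, octaves=3):
    """Two Gaussian images and two DOG images per octave (assumed wiring)."""
    gaussians, dogs = [], []
    current = blur(np.asarray(image, dtype=float))
    for _ in range(octaves):
        second = blur(current)     # second image of this octave
        tmp = blur(second)         # intermediate image, blurred before downsampling
        gaussians += [current, second]
        dogs += [current - second, second - tmp]
        current = downsample(tmp)
    return gaussians, dogs

gaussians, dogs = half_octave_pyramid(np.random.default_rng(0).random((64, 64)))
print([g.shape for g in gaussians])
```

Each DOG image has the same size as the Gaussian image with the same index, matching the size table in the slide.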

Find Extrema in Space and Scale. [Figure: DOG levels L-1, L, and L+1 stacked along the scale axis, with space as the other axes.] Hint: when finding maxima or minima at level L, we must DownSample or UpSample as necessary to make the DOG images at levels L-1 and L+1 the same size as L.
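A minimal sketch of the hint, assuming bilinear resampling with `scipy.ndimage.zoom` is an acceptable way to bring the neighboring levels to the same size. The function names and the threshold parameter are illustrative.

```python
import numpy as np
from scipy import ndimage

def resize_to(image, shape):
    """Resample an image to the given shape with bilinear interpolation."""
    factors = [t / s for t, s in zip(shape, image.shape)]
    return ndimage.zoom(image, factors, order=1)

def scale_space_extrema(below, current, above, threshold=0.0):
    """Pixels of `current` that are extrema of their 3x3x3 neighborhood."""
    stack = np.stack([resize_to(below, current.shape),
                      current,
                      resize_to(above, current.shape)])
    local_max = ndimage.maximum_filter(stack, size=3)[1]
    local_min = ndimage.minimum_filter(stack, size=3)[1]
    is_max = (current == local_max) & (current > threshold)
    is_min = (current == local_min) & (current < -threshold)
    return np.argwhere(is_max | is_min)

# Toy example: a single peak at level L, with flat neighbors of other sizes.
current = np.zeros((32, 32))
current[10, 12] = 5.0
points = scale_space_extrema(np.zeros((64, 64)), current, np.zeros((16, 16)),
                             threshold=1.0)
print(points)
```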

Filtering out edge responses. Either use the Harris corner matrix (2x2 matrix computed from first partial image derivatives), or use the Hessian matrix (2x2 matrix computed from second partial image derivatives): $H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}$ (Lowe's notation, Section 4.1).
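For the Hessian variant, Lowe's paper rejects a keypoint when the ratio Tr(H)^2 / Det(H) is too large. The sketch below uses finite differences on the DOG image and the ratio limit r = 10 from that paper; the function name and the synthetic test images are assumptions made for illustration.

```python
import numpy as np

def passes_edge_test(dog, row, col, r=10.0):
    """True if the keypoint is blob-like rather than edge-like (Hessian test)."""
    dxx = dog[row, col + 1] - 2.0 * dog[row, col] + dog[row, col - 1]
    dyy = dog[row + 1, col] - 2.0 * dog[row, col] + dog[row - 1, col]
    dxy = (dog[row + 1, col + 1] - dog[row + 1, col - 1]
           - dog[row - 1, col + 1] + dog[row - 1, col - 1]) / 4.0
    trace = dxx + dyy
    det = dxx * dyy - dxy ** 2
    if det <= 0:                     # curvatures of opposite sign: reject
        return False
    return trace ** 2 / det < (r + 1.0) ** 2 / r

yy, xx = np.mgrid[-10:11, -10:11].astype(float)
blob = np.exp(-(xx ** 2 + yy ** 2) / (2 * 3.0 ** 2))             # isotropic peak
ridge = np.exp(-xx ** 2 / (2 * 2.0 ** 2) - yy ** 2 / (2 * 20.0 ** 2))  # edge-like
keep_blob = passes_edge_test(blob, 10, 10)
keep_ridge = passes_edge_test(ridge, 10, 10)
print(keep_blob, keep_ridge)
```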

Filtering out edge responses Would be removed by Harris method Since the goal is filtering out edges, I like this one better Would be removed by Hessian method

Any Significant Difference? Not much difference Yellow=extrema, blue=removed by Harris corners, red=removed by Hessian

Computing Patch Orientation

Which Gaussian is Closest in Scale? [Figure: sigma values of DOG level L and Gaussian levels G(i) and G(i+1); recoverable labels: sigma = 1.18 (Crowley paper), and sigma = sqrt(2) for G(i+1).] G(i) is closest in scale to L.

Computing Orientation

Gradient Magnitude and Angle. [Figure panels: Dx, Dy, M]
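A minimal sketch of this step, assuming central differences via `np.gradient` as the derivative estimate (the slides do not say which derivative filter is used). The function name is hypothetical.

```python
import numpy as np

def gradient_magnitude_angle(image):
    """Per-pixel gradient magnitude M and angle in degrees in [0, 360)."""
    dy, dx = np.gradient(np.asarray(image, dtype=float))   # d/drow, d/dcol
    magnitude = np.hypot(dx, dy)
    angle = np.degrees(np.arctan2(dy, dx)) % 360.0
    return magnitude, angle

# Toy example: intensity ramp increasing along x with slope 2.
ramp_x = np.tile(2.0 * np.arange(8), (8, 1))
mag_x, ang_x = gradient_magnitude_angle(ramp_x)
# Same ramp along y (rows).
mag_y, ang_y = gradient_magnitude_angle(ramp_x.T)
print(mag_x[4, 4], ang_x[4, 4], ang_y[4, 4])
```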

Sampling/Weighting over a Circular Region. Gaussian-weighted circular window == 2D Gaussian kernel. Weight the magnitude by the Gaussian kernel. What size kernel? Lowe suggests the sigma value of the weighting kernel should be 3 times bigger than the scale of the keypoint. Design decision: make the Gaussian kernel have sigma = 3, in the pixel coordinate system of G(i), the image from the Gaussian pyramid that magnitude and angle were computed with. Plus or minus 3 sigma, with sigma = 3, gives a width of 19 pixels (a 19x19 kernel).

Example. Keypoint location = extremum location. Keypoint scale is the scale of the DOG image.

Example (continued). [Figure panels: Gaussian image (at closest scale, from pyramid); gradient magnitude; gradient orientation]

Example (continued). gradient magnitude .* 2D Gaussian kernel = weighted gradient magnitude (elementwise product).

Example (continued). Weighted orientation histogram: each bucket contains the sum of weighted gradient magnitudes corresponding to angles that fall within that bucket. 36 buckets, with a 10 degree range of angles in each bucket, i.e. 0 <= ang < 10: bucket 1; 10 <= ang < 20: bucket 2; 20 <= ang < 30: bucket 3; and so on. [Figure panels: weighted gradient magnitude; gradient orientation]

Example (continued). The peak of the weighted orientation histogram falls in the 20-30 degree bucket, so the orientation of the keypoint is approximately 25 degrees. [Figure: weighted orientation histogram with the peak and the 80%-of-peak level marked]

Example (continued). There may be multiple orientations: a second peak can reach 80% of the peak value. In this case, generate duplicate SIFT patches, one with orientation at 25 degrees, one at 155 degrees. Design decision: one might want to limit the number of possible multiple peaks to two.
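The orientation example can be sketched as follows, assuming a square patch with an odd side length (19x19 here) and reporting bucket centers as orientations. Buckets are 0-indexed in the code, so the slide's "bucket 3" is index 2. Function names are hypothetical.

```python
import numpy as np

def orientation_histogram(magnitude, angle_deg, sigma=3.0, bins=36):
    """Histogram of Gaussian-weighted gradient magnitudes over angle buckets."""
    half = magnitude.shape[0] // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    weight = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    index = ((angle_deg % 360.0) // (360.0 / bins)).astype(int) % bins
    return np.bincount(index.ravel(), weights=(weight * magnitude).ravel(),
                       minlength=bins)

def dominant_orientations(hist, ratio=0.8):
    """Bucket centers (degrees) of local peaks within `ratio` of the top peak."""
    bins = len(hist)
    width = 360.0 / bins
    peaks = []
    for i, h in enumerate(hist):
        is_local_peak = h > hist[i - 1] and h >= hist[(i + 1) % bins]
        if is_local_peak and h >= ratio * hist.max():
            peaks.append((i + 0.5) * width)
    return peaks

# Toy patch: alternating rows with gradient angles of 25 and 155 degrees.
magnitude = np.ones((19, 19))
angle = np.full((19, 19), 155.0)
angle[::2] = 25.0
hist = orientation_histogram(magnitude, angle)
peaks = dominant_orientations(hist)
print(peaks)
```

Limiting the result to the two strongest peaks, as the design decision above suggests, would be a small extension of `dominant_orientations`.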

Forming the SIFT Descriptor. 1. Compute image gradients. 2. Pool into local histograms. 3. Concatenate histograms. 4. Normalize histograms.

Spatial Histogram of Gradients. Computed on a rotated and scaled version of the window according to the computed orientation and scale (resample the window). Based on gradients weighted by a Gaussian whose sigma is half the window width (for smooth falloff).

Spatial Histogram of Gradients. 4x4 array of gradient orientation histograms; each orientation increment is weighted by magnitude. 8 orientations x 4x4 array = 128 dimensions. Motivation: some sensitivity to spatial layout, but not too much. [Figure shows only a 2x2 array, but the descriptor uses 4x4.]

Ensure smoothness. Gaussian weight. Trilinear interpolation: a given gradient contributes to 8 bins, 4 in space times 2 in orientation.

Reduce effect of illumination. The 128-dim vector is normalized to 1. Threshold gradient magnitudes to avoid excessive influence of high gradients: after normalization, clamp gradients > 0.2, then renormalize.
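The normalize, clamp, renormalize sequence is short enough to sketch directly. The function name and the small `eps` guard against division by zero are assumptions added for the example.

```python
import numpy as np

def normalize_descriptor(vec, clamp=0.2, eps=1e-12):
    """Unit-normalize, clamp large entries at `clamp`, then renormalize."""
    vec = np.asarray(vec, dtype=float)
    vec = vec / max(np.linalg.norm(vec), eps)
    vec = np.minimum(vec, clamp)
    return vec / max(np.linalg.norm(vec), eps)

rng = np.random.default_rng(0)
raw = rng.random(128)          # placeholder 128-dim histogram vector
raw[5] = 50.0                  # one very large gradient entry
desc = normalize_descriptor(raw)
desc_brighter = normalize_descriptor(3.0 * raw)   # same patch, higher contrast
print(np.linalg.norm(desc), np.allclose(desc, desc_brighter))
```

Scaling the input by a constant, as a contrast change would, leaves the result unchanged because of the first normalization.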

SIFT invariance and covariance properties. Laplacian blob response (detection) is invariant w.r.t. rotation and scaling. Blob location is covariant w.r.t. rotation and scale. The normalized patch is invariant w.r.t. rotation and scale. The normalized vector is insensitive to illumination. Not invariant/covariant with respect to affine transformations (such as those due to small changes in viewing angle). S.Lazebnik, UNC

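Achieving affine covariance. Consider the second moment matrix of the window containing the blob: $M = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} = R^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} R$. Recall: $\begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix} = \text{const}$ is an ellipse whose axis along the direction of the fastest change has length $(\lambda_{\max})^{-1/2}$ and whose axis along the direction of the slowest change has length $(\lambda_{\min})^{-1/2}$. This ellipse visualizes the characteristic shape of the window. S.Lazebnik, UNC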
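A minimal sketch of computing M and the ellipse axes for one patch, assuming a Gaussian window for w(x, y) and central-difference gradients. The function names and the synthetic elongated blob are illustrative.

```python
import numpy as np

def second_moment_matrix(patch, sigma):
    """M = sum_{x,y} w(x,y) [[Ix^2, Ix*Iy], [Ix*Iy, Iy^2]] with a Gaussian w."""
    iy, ix = np.gradient(np.asarray(patch, dtype=float))
    rows, cols = patch.shape
    yy, xx = np.mgrid[:rows, :cols]
    w = np.exp(-((xx - cols // 2) ** 2 + (yy - rows // 2) ** 2) / (2.0 * sigma ** 2))
    return np.array([[np.sum(w * ix * ix), np.sum(w * ix * iy)],
                     [np.sum(w * ix * iy), np.sum(w * iy * iy)]])

def ellipse_axes(M):
    """Axis lengths lambda^(-1/2) and directions; eigenvalues are ascending."""
    eigvals, eigvecs = np.linalg.eigh(M)
    return 1.0 / np.sqrt(eigvals), eigvecs

# Toy patch: a blob elongated along y, so intensity changes fastest along x.
yy, xx = np.mgrid[-10:11, -10:11].astype(float)
patch = np.exp(-(xx ** 2 / (2 * 2.0 ** 2) + yy ** 2 / (2 * 6.0 ** 2)))
M = second_moment_matrix(patch, sigma=5.0)
lengths, directions = ellipse_axes(M)
print(lengths, directions[:, 1])
```

Here `directions[:, 1]` is the direction of the fastest change (largest eigenvalue), and it pairs with the shorter axis `lengths[1]`.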

Affine adaptation example Scale-invariant regions (blobs) S.Lazebnik, UNC

Affine adaptation example Affine-adapted blobs S.Lazebnik, UNC

Affine adaptation. Problem: the second moment window determined by weights w(x,y) must match the characteristic shape of the region. Solution: iterative approach. Use a circular window to compute the second moment matrix. Perform affine adaptation to find an ellipse-shaped window. Recompute the second moment matrix using the new window and iterate. S.Lazebnik, UNC

Iterative affine adaptation K. Mikolajczyk and C. Schmid, Scale and Affine invariant interest point detectors, IJCV 60(1):63-86, 2004. http://www.robots.ox.ac.uk/~vgg/research/affine/

Affine covariance. Affinely transformed versions of the same neighborhood will give rise to ellipses that are related by the same transformation. What to do if we want to compare these image regions? Affine normalization: transform these regions into same-size circles. S.Lazebnik, UNC

Affine normalization. Problem: there is no unique transformation from an ellipse to a unit circle. We can rotate or flip a unit circle, and it still stays a unit circle. S.Lazebnik, UNC
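The ambiguity can be made concrete with a small numerical sketch. Taking the symmetric square root of M as the normalizing transform is one common choice assumed here, and the matrix values are made up for illustration.

```python
import numpy as np

def sqrt_spd(M):
    """Symmetric square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(M)
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T

M = np.array([[4.0, 1.0], [1.0, 2.0]])   # illustrative second moment matrix
A = sqrt_spd(M)                          # maps the ellipse x^T M x = 1 to the unit circle

theta = np.linspace(0.0, 2.0 * np.pi, 50)
circle = np.stack([np.cos(theta), np.sin(theta)])
ellipse = np.linalg.solve(A, circle)     # points satisfying x^T M x = 1

angle = np.radians(30.0)
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])

norms_A = np.linalg.norm(A @ ellipse, axis=0)        # normalized with A
norms_RA = np.linalg.norm(R @ A @ ellipse, axis=0)   # normalized with R A
print(norms_A.min(), norms_A.max(), norms_RA.min(), norms_RA.max())
```

Both A and R A send the ellipse to the unit circle, so the normalization is only defined up to a rotation.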

Eliminating rotation ambiguity. To assign a unique orientation to circular image windows: create a histogram of local gradient directions in the patch, then assign the canonical orientation at the peak of the smoothed histogram. [Figure: histogram over orientations from 0 to 2π] S.Lazebnik, UNC

From covariant regions to invariant features. Extract affine regions. Normalize regions. Eliminate rotational ambiguity. Compute appearance descriptors: SIFT (Lowe 04). S.Lazebnik, UNC