Instance-level recognition: Local invariant features. Cordelia Schmid INRIA, Grenoble

nstance-level recognition: ocal invariant features Cordelia Schmid NRA Grenoble

Overview ntroduction to local features Harris interest points + SSD ZNCC SFT Scale & affine invariant interest point detectors Evaluation and comparison of different detectors Region descriptors and their performance

ocal features local descriptor Several / man local descriptors per image Robust to occlusion/clutter + no object segmentation required Photometric : distinctive nvariant : to image transformations + illumination changes

ocal features: interest points

ocal features: Contours/segments

ocal features: segmentation

Application: Matching Find corresponding locations in the image

llustration Matching nterest points etracted with Harris detector ~ 500 points

llustration Matching Matching nterest points matched based on cross-correlation 188 pairs

llustration Global constraints Matching Global constraint - Robust estimation of the fundamental matri 99 inliers 89 outliers

Application: Panorama stitching mages courtes of A. Zisserman.

Application: nstance-level recognition Search for particular objects and scenes in large databases

nstance-level recognition: Approach mage content is transformed into local features invariant to geometric and photometric transformations Matching local invariant descriptors ocal Features e.g. SFT K. Grauman B. eibe Slide credit: David owe 13

Difficulties Finding the object despite possibl large changes in scale viewpoint lighting and partial occlusion requires invariant description Scale Viewpoint ighting Occlusion

Difficulties Ver large images collection need for efficient indeing Flickr has billion photographs more than 1 million added dail Facebook has 15 billion images ~7 million added dail arge personal collections Video collections i.e. YouTube

Applications Take a picture of a product or advertisement find relevant information on the web [Piee Milpi]

Applications Cop detection for images and videos Quer video Search in 00h of video

Applications Son Aibo Robotics Recognize docking station Place recognition oop closure in SAM K. Grauman B. eibe Slide credit: David owe 18

ocal features - histor ine segments [owe 87 Aache 90] nterest points & cross correlation [Z. Zhang et al. 95] Rotation invariance with differential invariants [Schmid&Mohr 96] Scale & affine invariant detectors [indeberg 98 owe 99 Tutelaars&VanGool 00 Mikolajczk&Schmid 0 Matas et al. 0] Dense detectors and descriptors [eung&malik 99 Fei-Fei& Perona 05 azebnik et al. 06] Contour and region segmentation descriptors [Shotton et al. 05 Opelt et al. 06 Ferrari et al. 06 eordeanu et al. 07]

ocal features 1 Etraction of local features Contours/segments nterest points & regions Regions b segmentation Dense features points on a regular grid Description of local features Dependant on the feature tpe Contours/segments angles length ratios nterest points grelevels gradient histograms Regions segmentation teture + color distributions

ine matching Etraction de contours Zero crossing of aplacian ocal maima of gradients Chain contour points hsteresis Etraction of line segments Description of segments Mi-point length orientation angle between pairs etc.

Eperimental results line segments images 600 600

Eperimental results line segments 48 / 1 line segments etracted

Eperimental results line segments 89 matched line segments - 100% correct

Eperimental results line segments 3D reconstruction

Problems of line segments Often onl partial etraction ine segments broken into parts Missing parts nformation not ver discriminative 1D information Similar for man segments Potential solutions Pairs and triplets of segments nterest points

Harris detector [Harris & Stephens 88] Based on the idea of auto-correlation mportant difference in all directions => interest point

Harris detector Auto-correlation function for a point and a shift A = k k k + k + k k W W

Harris detector Auto-correlation function for a point and a shift A = k k k + k + k k W W { A small in all directions large in one directions large in all directions uniform region contour interest point

Harris detector

Harris detector Discret shifts are avoided based on the auto-correlation matri + = + + with first order approimation = W k k k k k k + = + + k k k k k k k k A k k k W k k k + + =

Harris detector = W k k W k k k k W k k k k W k k k k k k k k k k Auto-correlation matri the sum can be smoothed with a Gaussian = G

Harris detector Auto-correlation matri A G = captures the structure of the local neighborhood measure based on eigenvalues of this matri strong eigenvalues 1 strong eigenvalue 0 eigenvalue => interest point => contour => uniform region

nterpreting the eigenvalues Classification of image points using eigenvalues of autocorrelation matri: λ Edge λ >> λ 1 Corner λ 1 and λ are large λ 1 ~ λ ; \ λ 1 and λ are small; Flat region Edge λ 1 >> λ λ 1

R Corner response function = det A A α: constant 0.04 to 0.06 α trace = λ1λ α λ1 + Edge R < 0 Corner R > 0 λ Flat region R small Edge R < 0

Harris detector Cornerness function f = det A k trace A = 1λ k λ λ + λ 1 Reduces the effect of a strong contour nterest point detection Treshold absolut relatif number of corners ocal maima f > thresh 8 neighbourhood f f

Harris Detector: Steps

Compute corner response R Harris Detector: Steps

Harris Detector: Steps Find points with large corner response: R>threshold

Harris Detector: Steps Take onl the points of local maima of R

Harris Detector: Steps

Harris detector: Summar of steps 1. Compute Gaussian derivatives at each piel. Compute second moment matri A in a Gaussian window around each piel 3. Compute corner response function R 4. Threshold R 5. Find local maima of response function non-maimum suppression

Harris - invariance to transformations Geometric transformations translation rotation similitude rotation + scale change affine valide for local planar objects Photometric transformations Affine intensit changes a + b

Harris Detector: nvariance Properties Rotation Ellipse rotates but its shape i.e. eigenvalues remains the same Corner response R is invariant to image rotation

Harris Detector: nvariance Properties Affine intensit change Onl derivatives are used => invariance to intensit shift + b ntensit scale: a R threshold R image coordinate image coordinate Partiall invariant to affine intensit change dependent on tpe of threshold

Harris Detector: nvariance Properties Scaling Corner All points will be classified as edges Not invariant to scaling

Comparison of patches - SSD Comparison of the intensities in the neighborhood of two interest points 1 1 image 1 image SSD : sum of square difference N N + i + j + i + N + 1 1 1 1 i= N j= N 1 j Small difference values similar patches

Comparison of patches 1 1 1 1 1 j i j i N N i N N j N + + + + = = + SSD : nvariance to photometric transformations? ntensit changes + b => Normalizing with the mean of each patch ntensit changes a + b 1 1 1 1 1 1 m j i m j i N N i N N j N + + + + = = + => Normalizing with the mean of each patch 1 1 1 1 1 1 1 = = + + + + + N N i N N j N m j i m j i σ σ => Normalizing with the mean and standard deviation of each patch

Cross-correlation ZNCC 1 1 1 1 1 1 1 = = + + + + + N N i N N j N m j i m j i σ σ zero normalized SSD ZNCC: zero normalized cross correlation + + + + = = + 1 1 1 1 1 1 1 σ σ m j i m j i N N i N N j N ZNCC values between -1 and 1 1 when identical patches in practice threshold around 0.5

ocal descriptors Grevalue derivatives Differential invariants [Koenderink 87] SFT descriptor [owe 99]

Grevalue derivatives: mage gradient The gradient of an image: The gradient points in the direction of most rapid increase in intensit The gradient direction is given b how does this relate to the direction of the edge? The edge strength is given b the gradient magnitude

Differentiation and convolution Recall for D function f: f = lim ε 0 f + ε ε f ε We could approimate this as f f n+1 f n Convolution with the filter -1 1

Finite difference filters Other approimations of derivative filters eist:

Effects of noise Consider a single row or column of the image Plotting intensit as a function of position gives a signal Where is the edge?

Solution: smooth first f g f * g d d f g To find edges look for peaks in d f g d

Derivative theorem of convolution Differentiation is convolution and convolution is associative: d d f g = d f g d This saves us one operation: f d d g f d d g

ocal descriptors σ σ σ G G Grevalue derivatives Convolution with Gaussian derivatives ep 1 σ πσ σ G + = = d d G G σ σ = M * * * * σ σ σ σ G G G G v

ocal descriptors * G G G σ σ σ Notation for grevalue derivatives [Koenderink 87] = = M M * * * * G G G G σ σ σ σ v nvariance?

ocal descriptors rotation invariance nvariance to image rotation : differential invariants [Koen87] + + + gradient magnitude + + + + + K K K K aplacian

aplacian of Gaussian OG OG = G σ + G σ

SFT descriptor [owe 99] Approach 8 orientations of the gradient 44 spatial grid Dimension 18 soft-assignment to spatial bins normalization of the descriptor to norm one comparison with Euclidean distance image patch gradient 3D histogram

ocal descriptors - rotation invariance Estimation of the dominant orientation etract gradient orientation histogram over gradient orientation peak in this histogram 0 π Rotate patch in dominant direction

ocal descriptors illumination change Robustness to illumination changes in case of an affine transformation = a + b 1

ocal descriptors illumination change Robustness to illumination changes in case of an affine transformation = a + b 1 Normalization of derivatives with gradient magnitude + +

ocal descriptors illumination change Robustness to illumination changes in case of an affine transformation = a + b 1 Normalization of derivatives with gradient magnitude + + Normalization of the image patch with mean and variance

nvariance to scale changes Scale change between two images Scale factor s can be eliminated Support region for calculation!! n case of a convolution with Gaussian derivatives defined b ep 1 σ πσ σ G + = = d d G G σ σ σ

Scale invariance - motivation Description regions have to be adapted to scale changes nterest points have to be repeatable for scale changes

Harris detector + scale changes Repeatabilit rate R ε = { a i b i dist H ai b ma a b i i i < ε}

Scale adaptation Scale change between two images 1 1 1 = = s s 1 1 Scale adapted derivative calculation

Scale adaptation = = 1 1 1 1 1 s s Scale change between two images 1 1 1 1 1 σ σ s G s G n n i i n i i K K = Scale adapted derivative calculation sσ n s

Scale adaptation σ i where are the derivatives with Gaussian convolution ~ σ σ σ σ σ G

Scale adaptation σ i where are the derivatives with Gaussian convolution ~ σ σ σ σ σ G ~ σ σ σ σ σ s s s s s G s Scale adapted auto-correlation matri

Harris detector adaptation to scale R ε { a b dist H a b < ε} = i i i i

Multi-scale matching algorithm s = 1 s = 3 s = 5

Multi-scale matching algorithm s = 1 8 matches

Multi-scale matching algorithm Robust estimation of a global affine transformation s = 1 3 matches

Multi-scale matching algorithm s = 1 3 matches s = 3 4 matches

Multi-scale matching algorithm s = 1 3 matches s = 3 4 matches highest number of matches correct scale s = 5 16 matches

Matching results Scale change of 5.7

Matching results 100% correct matches 13 matches

Scale selection We want to find the characteristic scale of the blob b convolving it with aplacians at several scales and looking for the maimum response However aplacian response decas as scale increases: original signal radius=8 increasing σ Wh does this happen?

Scale normalization The response of a derivative of Gaussian filter to a perfect step edge decreases as σ increases 1 σ π

Scale normalization The response of a derivative of Gaussian filter to a perfect step edge decreases as σ increases To keep response the same scale-invariant must multipl Gaussian derivative b σ aplacian is the second Gaussian derivative so it must be aplacian is the second Gaussian derivative so it must be multiplied b σ

Effect of scale normalization Original signal Unnormalized aplacian response Scale-normalized aplacian response maimum

Blob detection in D aplacian of Gaussian: Circularl smmetric operator for blob detection in D g = g + g

Blob detection in D aplacian of Gaussian: Circularl smmetric operator for blob detection in D Scale-normalized: g g g = σ + norm

Scale selection The D aplacian is given b / σ + σ + e up to scale For a binar circle of radius r the aplacian achieves a maimum at σ = r / r aplacian response image r / scale σ

Characteristic scale We define the characteristic scale as the scale that produces peak of aplacian response characteristic scale T. indeberg 1998. Feature detection with automatic scale selection. nternational Journal of Computer Vision 30 : pp 77--116.

Scale selection For a point compute a value gradient aplacian etc. at several scales Normalization of the values with the scale factor e.g. aplacian s + Select scale s at the maimum characteristic scale s + scale Ep. results show that the aplacian gives best results

Scale selection Scale invariance of the characteristic scale s norm. ap. scale

Scale selection Scale invariance of the characteristic scale s norm. ap. norm. ap. scale scale Relation between characteristic scales s s 1 = s

Scale-invariant detectors Harris-aplace Mikolajczk & Schmid 01 aplacian detector indeberg 98 Difference of Gaussian owe 99 Harris-aplace aplacian

Harris-aplace multi-scale Harris points selection of points at maimum of aplacian invariant points + associated regions [Mikolajczk & Schmid 01]

Matching results 13 / 190 detected interest points

Matching results 58 points are initiall matched

Matching results 3 points are matched after verification all correct

OG detector Convolve image with scalenormalized aplacian at several scales OG = s G σ + G σ Detection of maima and minima of aplacian in scale space

Hessian detector Hessian matri H = Determinant of Hessian matri DET = Penalizes/eliminates long structures with small derivative in a single direction

Efficient implementation Difference of Gaussian DOG approimates the aplacian DOG = G kσ G σ Error due to the approimation

DOG detector Fast computation scale space processed one octave at a time David G. owe. "Distinctive image features from scale-invariant kepoints. JCV 60.