Lecture 12. Local Feature Detection. Matching with Invariant Features. Why extract features? Why extract features? Why extract features?

Lecture 1 Why extract eatures? Motivation: panorama stitching We have two images how do we combine them? Local Feature Detection Guest lecturer: Alex Berg Reading: Harris and Stephens David Lowe IJCV We need to match (align) images Building a Panorama Matching with Invariant Features Darya Frolova, Denis Simakov he Weizmann Institute o Science March 004 M. Brown and D. G. Lowe. Recognising Panoramas. ICCV 003 http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/iles/invariantfeatures.ppt Why extract eatures? Motivation: panorama stitching We have two images how do we combine them? Why extract eatures? Motivation: panorama stitching We have two images how do we combine them? Step 1: Detect eature points in both images Step : Find corresponding pairs Step 1: Detect eature points in both images Step : Find corresponding pairs Step 3: Use these pairs to align images 1

Matching with Features Problem 1: Detect the same point independently in both images Matching with Features Problem : For each point correctly recognize the corresponding one? no chance to match! We need a repeatable detector We need a reliable and distinctive descriptor Selecting Good Features What s a good eature? Satisies brightness constancy Has suicient texture variation Does not have too much texture variation Corresponds to a real surace patch Does not deorm too much over time Applications Feature points are used or: Motion tracking Image alignment 3D reconstruction Object recognition Indexing and database retrieval Robot navigation Finding Corners Key property: in the region around a corner, image gradient has two or more dominant directions Corners are repeatable and distinctive C.Harris and M.Stephens. "A Combined Corner and Edge Detector. Proceedings o the 4th Alvey Vision Conerence: pages 147--151.

he Basic Idea An introductory example: Harris corner detector We should easily recognize the point by looking through a small window Shiting a window in any direction should give a large change in intensity C.Harris, M.Stephens. A Combined Corner and Edge Detector. 1988 Harris Detector: Basic Idea lat region: no change in all directions edge : no change along the edge direction corner : signiicant change in all directions Harris Detector: Mathematics Window-averaged change o intensity or the shit [u,v]: xy, [ ] E( uv, ) = wxy (, ) I( x+ uy, + v) I( xy, ) Harris Detector: Mathematics Change o intensity or the shit [u,v]: [ ] E( uv, ) = wxy (, ) I( x+ uy, + v) I( xy, ) xy, Window unction Shited intensity Intensity Second-order aylor expansion o E(u,v) about (0,0) (bilinear approximation or small shits): Window unction w(x,y) = or Eu (0,0) 1 Euu (0,0) E( u, v) E(0,0) + [ u v] + [ u v] Ev (0,0) Euv (0,0) Euv(0,0) u E vv(0,0) v 1 in window, 0 outside Gaussian 3

Harris Detector: Mathematics Expanding E(u,v) in a nd order aylor series expansion, we have,or small shits [u,v], a bilinear approximation: u Euv (, ) [ uv, ] M v Interpreting the second moment matrix First, consider an axis-aligned corner: where M is a matrix computed rom image derivatives: Ix IxI y M = w( x, y) xy, II x y Iy M Interpreting the second moment matrix First, consider an axis-aligned corner: M = I I x x I y I xi y λ1 = I y 0 0 λ his means dominant gradient directions align with x or y axis I either λ is close to 0, then this is not a corner, so look or locations where both are large. General Case Since M is symmetric, we have M = R 1 λ1 0 R 0 λ We can visualize M as an ellipse with axis lengths determined by the eigenvalues and orientation determined by R Ellipse E(u,v) = const u [ u v] M = const v direction o the astest change (λ max ) -1/ (λ min ) -1/ direction o the slowest change Slide credit: David Jacobs Harris Detector: Mathematics Harris Detector: Mathematics Classiication o image points using eigenvalues o M: λ 1 and λ are small; E is almost constant in all directions λ Edge λ >> λ 1 Flat region Corner λ 1 and λ are large, λ 1 ~ λ ; E increases in all directions Edge λ 1 >> λ Measure o corner response: ( trace ) R = det M k M det M = λ λ 1 trace M = λ + λ 1 (k empirical constant, k = 0.04-0.06) λ 1 4

Harris Detector: Mathematics R = det( M ) k trace( M ) = λ1λ k( λ1 + λ), 0.04 k.06 R depends only on eigenvalues o M R is large or a corner R is negative with large magnitude or an edge R is small or a lat region λ Edge R < 0 Flat R small Corner R > 0 Edge R < 0 Harris Detector: Summary Average intensity change in direction [u,v] can be expressed as a bilinear orm: u Euv (, ) [ uv, ] M v Describe a point in terms o eigenvalues o M: measure o corner response R = λ ( ) 1λ k λ1+ λ A good (corner) point should have a large intensity change in all directions, i.e. R should be large positive λ 1 Harris Detector Harris Detector: Worklow Algorithm: 1. Compute Gaussian derivatives at each pixel. Compute second moment matrix M in a Gaussian window around each pixel 3. Compute corner response unction R 4. hreshold R 5. Find local maxima o response unction (nonmaximum suppression) Harris Detector: Worklow Compute corner response R Harris Detector: Worklow Find points with large corner response: R>threshold 5

Harris Detector: Worklow ake only the points o local maxima o R Harris Detector: Worklow Harris Detector: Some Properties Rotation invariance Ellipse rotates but its shape (i.e. eigenvalues) remains the same Corner response R is invariant to image rotation Harris Detector: Some Properties Invariance to image intensity change? Harris Detector: Some Properties Partial invariance to additive and multiplicative intensity changes Only derivatives are used => invariance to intensity shit I I + b Intensity scale: I a I R threshold R x (image coordinate) x (image coordinate) 6

Harris Detector: Some Properties Harris Detector: Some Properties Invariant to image scale? All points will be classiied as edges Corner! Not invariant to scaling Harris Detector: Some Properties Quality o Harris detector or dierent scale changes Repeatability rate: # correspondences # possible correspondences C.Schmid et.al. Evaluation o Interest Point Detectors. IJCV 000 We want to: detect the same interest points regardless o image changes Invariance We want eatures to be detected despite geometric or photometric changes in the image: i we have two transormed versions o the same image, eatures should be detected in corresponding locations Darya Frolova, Denis Simakov http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/iles/invariantfeatures.ppt 7

Models o Image Change Geometric Rotation Scale Aine valid or: orthographic camera, locally planar object Photometric Aine intensity change (I a I + b) Rotation Invariant Detection C.Schmid et.al. Evaluation o Interest Point Detectors. IJCV 000 Scale Invariant Detection Scale Invariant Detection Consider regions (e.g. circles) o dierent sizes around a point Regions o corresponding sizes will look the same in both images he problem: how do we choose corresponding circles independently in each image? 8

Scale Invariant Detection Solution: Design a unction on the region (circle), which is scale invariant (the same or corresponding regions, even i they are at dierent scales) Example: average intensity. For corresponding regions (even o dierent sizes) it will be the same. For a point in one image, we can consider it as a unction o region size (circle radius) Scale Invariant Detection Common approach: ake a local maximum o this unction Observation: region size, or which the maximum is achieved, should be invariant to image scale. Important: this scale invariant region size is ound in each image independently! Image 1 scale = 1/ Image Image 1 scale = 1/ Image region size region size s 1 region size s region size Scale Invariant Detection Scale Invariant Detection A good unction or scale detection: has one stable sharp peak bad region size bad region size Good! region size For usual images: a good unction would be a one which responds to contrast (sharp local intensity change) Functions or determining scale Kernels: ( xx (,, ) yy (,, )) L= G x y + G x y σ σ σ (Laplacian) DoG = G( x, y, kσ ) G( x, y, σ ) (Dierence o Gaussians) where Gaussian x + y 1 πσ σ Gxy (,, σ ) = e = Kernel Image Note: both kernels are invariant to scale and rotation Scale Invariant Detectors Scale Invariant Detectors Harris-Laplacian 1 Find local maximum o: Harris corner detector in space (image coordinates) Laplacian in scale SIF (Lowe) Find local maximum o: Dierence o Gaussians in space and scale scale scale y y Harris x Laplacian DoG Experimental evaluation o detectors w.r.t. scale change Repeatability rate: # correspondences # possible correspondences DoG x 1 K.Mikolajczyk, C.Schmid. Indexing Based on Scale Invariant Interest Points. ICCV 001 D.Lowe. Distinctive Image Features rom Scale-Invariant Keypoints. Accepted to IJCV 004 K.Mikolajczyk, C.Schmid. Indexing Based on Scale Invariant Interest Points. ICCV 001 9

Scale Invariant Detection: Summary Given: two images o the same scene with a large scale dierence between them Goal: ind the same interest points independently in each image Solution: search or maxima o suitable unctions in scale and in space (over the image) Methods: 1. Harris-Laplacian [Mikolajczyk, Schmid]: maximize Laplacian over scale, Harris measure o corner response over the image. SIF [Lowe]: maximize Dierence o Gaussians over scale and space Aine Invariant Detection Above we considered: Similarity transorm (rotation + uniorm scale) Aine Invariant Detection ake a local intensity extremum as initial point Go along every ray starting rom this point and stop when extremum o unction is reached Now we go on to: Aine transorm (rotation + non-uniorm scale) points along the ray We will obtain approximately corresponding regions () t = 1 t t o It () I 0 I () t I dt 0 Remark: we search or scale in every direction.uytelaars, L.V.Gool. Wide Baseline Stereo Matching Based on Local, Ainely Invariant Regions. BMVC 000. Aine Invariant Detection Aine Invariant Detection Algorithm summary (detection o aine invariant region): Start rom a local intensity extremum point Go in every direction until the point o extremum o some unction Curve connecting the points is the region boundary Compute geometric moments o orders up to or this region Replace the region with ellipse.uytelaars, L.V.Gool. Wide Baseline Stereo Matching Based on Local, Ainely Invariant Regions. BMVC 000. he regions ound may not exactly correspond, so we approximate them with ellipses Geometric Moments: mpq p q = x y ( x, y) dxdy Fact: moments m pq uniquely determine the unction aking to be the characteristic unction o a region (1 inside, 0 outside), moments o orders up to allow to approximate the region by an ellipse his ellipse will have the same moments o orders up to as the original region 10

Aine Invariant Detection Covariance matrix o region points deines an ellipse: 1 p Σ = Σ = 1 1 p 1 pp region 1 ( p = [x, y] is relative to the center o mass) q= Ap Σ = AΣ A 1 q Σ = Ellipses, computed or corresponding regions, also correspond! Σ = 1 q 1 qq region Aine Invariant Detection : Summary Under aine transormation, we do not know in advance shapes o the corresponding regions Ellipse given by geometric covariance matrix o a region robustly approximates this region For corresponding regions ellipses also correspond. Methods: 1. Search or extremum along rays [uytelaars, Van Gool]:. Maximally Stable Extremal Regions [Matas et.al.] Point Descriptors We know how to detect points Next question: How to match them?? Point descriptor should be: 1. Invariant. Distinctive Descriptors Invariant to Rotation Harris corner response measure: depends only on the eigenvalues o the matrix M Ix IxI y M = w( x, y) xy, II x y Iy C.Harris, M.Stephens. A Combined Corner and Edge Detector. 1988 11

Descriptors Invariant to Rotation Image moments in polar coordinates k iθ l m r e I(, r θ ) drdθ kl = Descriptors Invariant to Rotation Find local orientation Dominant direction o gradient Rotation in polar coordinates is translation o the angle: θ θ + θ 0 his transormation changes only the phase o the moments, but not its magnitude Rotation invariant descriptor consists o magnitudes o moments: m kl Compute image derivatives relative to this orientation Matching is done by comparing vectors [ m kl ] k,l J.Matas et.al. Rotational Invariants or Wide-baseline Stereo. Research Report o CMP, 003 1 K.Mikolajczyk, C.Schmid. Indexing Based on Scale Invariant Interest Points. ICCV 001 D.Lowe. Distinctive Image Features rom Scale-Invariant Keypoints. Accepted to IJCV 004 Descriptors Invariant to Scale Use the scale determined by detector to compute descriptor in a normalized rame For example: moments integrated over an adapted window derivatives adapted to scale: si x Aine Invariant Descriptors Aine invariant color moments m abc p q a (, ) b (, ) c pq = x y R x y G x y B ( x, y) dxdy region Dierent combinations o these moments are ully aine invariant Also invariant to aine transormation o intensity I a I + b F.Mindru et.al. Recognizing Color Patterns Irrespective o Viewpoint and Illumination. CVPR99 1

Blur Subtract Aine Invariant Descriptors SIF Scale Invariant Feature ransorm 1 Find aine normalized rame Σ = 1 pp A Σ = qq Empirically ound to show very good perormance, invariant to image rotation, scale, intensity change, and to moderate aine transormations 1 1 Σ 1 = A1 A1 A 1 A Σ = A A Scale =.5 Rotation = 45 0 rotation Compute rotational invariant descriptor in this normalized rame J.Matas et.al. Rotational Invariants or Wide-baseline Stereo. Research Report o CMP, 003 1 D.Lowe. Distinctive Image Features rom Scale-Invariant Keypoints. Accepted to IJCV 004 K.Mikolajczyk, C.Schmid. A Perormance Evaluation o Local Descriptors. CVPR 003 CVPR 003 utorial Recognition and Matching Based on Local Invariant Features Invariant Local Features Image content is transormed into local eature coordinates that are invariant to translation, rotation, scale, and other imaging parameters David Lowe Computer Science Department University o British Columbia SIF Features Advantages o invariant local eatures Locality: eatures are local, so robust to occlusion and clutter (no prior segmentation) Distinctiveness: individual eatures can be matched to a large database o objects Quantity: many eatures can be generated or even small objects Eiciency: close to real-time perormance Extensibility: can easily be extended to wide range o diering eature types, with each adding robustness Scale invariance Requires a method to repeatably select points in location and scale: he only reasonable scale-space kernel is a Gaussian (Koenderink, 1984; Lindeberg, 1994) An eicient choice is to detect peaks in the dierence o Gaussian pyramid (Burt & Adelson, 1983; Crowley & Parker, 1984 but examining more scales) Dierence-o-Gaussian with constant ratio o scales is a close approximation to Lindeberg s scale-normalized Laplacian (can be shown rom the heat diusion equation) 13

Blur Subtract Scale space processed one octave at a time Key point localization Detect maxima and minima o dierence-o-gaussian in scale space Fit a quadratic to surrounding values or sub-pixel and sub-scale interpolation (Brown & Lowe, 00) aylor expansion around point: Oset o extremum (use inite dierences or derivatives): Select canonical orientation Create histogram o local gradient directions computed at selected scale Assign canonical orientation at peak o smoothed histogram Each key speciies stable D coordinates (x, y, scale, orientation) Example o keypoint detection hreshold on value at DOG peak and on ratio o principle curvatures (Harris approach) (a) 33x189 image (b) 83 DOG extrema (c) 79 let ater peak value threshold (d) 536 let ater testing ratio o principle curvatures 0 π SIF vector ormation hresholded image gradients are sampled over 16x16 array o locations in scale space Create array o orientation histograms 8 orientations x 4x4 histogram array = 18 dimensions Feature stability to noise Match eatures ater random change in image scale & orientation, with diering levels o image noise Find nearest neighbor in database o 30,000 eatures 14

Feature stability to aine change Match eatures ater random change in image scale & orientation, with % image noise, and aine distortion Find nearest neighbor in database o 30,000 eatures Distinctiveness o eatures Vary size o database o eatures, with 30 degree aine change, % image noise Measure % correct or single nearest neighbor match A good SIF eatures tutorial http://www.cs.toronto.edu/~jepson/csc503/tutsif04.pd By Estrada, Jepson, and Fleet. 15