CS 3710: Visual Recognition Describing Images with Features Adriana Kovashka Department of Computer Science January 8, 2015
Plan for Today Presentation assignments + schedule changes Image filtering Feature detection Feature description Feature matching Next time: Classification and detection Adriana's research
Announcements Open door policy Fixed office hours? Adriana's travel Clarification of experiment presentations
Presentation Assignments
Image Description
An image is a set of pixels What we see vs. what a computer sees Source: S. Narasimhan
Problems with pixel representation Not invariant to small changes Translation Illumination etc. Some parts of an image are more important than others What do we want to represent?
Preprocessing: Image filtering
Image filtering Compute a function of the local neighborhood at each pixel in the image Function specified by a filter or mask saying how to combine values from neighbors. Uses of filtering: Enhance an image (denoise, resize, etc) Extract information (texture, edges, etc) Detect patterns (template matching) Derek Hoiem, Kristen Grauman
Motivation: noise reduction Even multiple images of the same static scene will not be identical. Kristen Grauman
Common types of noise Salt and pepper noise: random occurrences of black and white pixels Impulse noise: random occurrences of white pixels Gaussian noise: variations in intensity drawn from a Gaussian (normal) distribution Kristen Grauman, Steve Seitz
Motivation: noise reduction How could we reduce the noise, i.e., give an estimate of the true intensities? What if there's only one image? Kristen Grauman
First attempt at a solution Let's replace each pixel with an average of all the values in its neighborhood Assumptions: Expect pixels to be like their neighbors Expect noise processes to be independent from pixel to pixel Kristen Grauman
First attempt at a solution Let's replace each pixel with an average of all the values in its neighborhood Moving average in 1D: Kristen Grauman, S. Marschner
Weighted Moving Average Can add weights to our moving average Weights [1, 1, 1, 1, 1] / 5 Non-uniform weights [1, 4, 6, 4, 1] / 16 Kristen Grauman, S. Marschner
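The two weightings above can be tried directly; a minimal NumPy sketch (the signal values are made up for illustration):

```python
import numpy as np

# A noisy 1D step signal (illustrative values)
signal = np.array([0., 0., 0., 90., 90., 90., 90., 90., 0., 0.])

# Uniform weights [1, 1, 1, 1, 1] / 5: a plain moving average.
uniform = np.convolve(signal, np.ones(5) / 5, mode='same')

# Non-uniform weights [1, 4, 6, 4, 1] / 16: center samples count more.
weighted = np.convolve(signal, np.array([1, 4, 6, 4, 1]) / 16, mode='same')

print(uniform)
print(weighted)
```

Both kernels sum to 1, so flat regions of the signal are preserved; the non-uniform weights blur the step edge less aggressively.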
Moving Average In 2D [Figure: a 3x3 box filter slides across a noisy binary image (background 0s, a block of 90s, one dropped pixel inside the block, one stray 90 outside it); the output fills in value by value — 0, 10, 20, 30, 30, ... — until the whole smoothed result appears, with the bright square blurred at its borders, the hole filled in, and the isolated noise pixel averaged away.] Kristen Grauman, Steve Seitz
Correlation filtering Say the averaging window size is (2k+1) x (2k+1). Uniform weights: G[i,j] = 1/(2k+1)^2 * sum over u,v = -k..k of F[i+u, j+v] — loop over all pixels in the neighborhood around image pixel F[i,j]. Now generalize to allow different weights depending on the neighboring pixel's relative position (non-uniform weights): G[i,j] = sum over u,v = -k..k of H[u,v] F[i+u, j+v]. Kristen Grauman
Correlation filtering This is called cross-correlation, denoted G = H ⊗ F. Filtering an image: replace each pixel with a linear combination of its neighbors. The filter kernel or mask H[u,v] is the prescription for the weights in the linear combination. Kristen Grauman
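Cross-correlation is just this weighted-sum loop over every pixel; a minimal NumPy sketch (the zero padding at the borders and the test image are my choices, not from the slides):

```python
import numpy as np

def cross_correlate(F, H):
    """Cross-correlation of image F with a (2k+1)x(2k+1) kernel H (zero padding)."""
    k = H.shape[0] // 2
    Fp = np.pad(F, k)                      # zero-pad the borders
    G = np.zeros_like(F, dtype=float)
    for i in range(F.shape[0]):
        for j in range(F.shape[1]):
            # Linear combination of the neighborhood around pixel (i, j)
            G[i, j] = np.sum(H * Fp[i:i + 2*k + 1, j:j + 2*k + 1])
    return G

F = np.zeros((5, 5)); F[2, 2] = 90.0       # single bright (noise) pixel
box = np.ones((3, 3)) / 9                  # uniform 3x3 averaging kernel
G = cross_correlate(F, box)
print(G)                                   # the spike is spread into a dim 3x3 patch
```

Real code would use a vectorized routine (e.g. `scipy.ndimage.correlate`); the explicit loop mirrors the definition on the slide.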
Gaussian filter What if we want the nearest neighboring pixels to have the most influence on the output? Use a kernel such as 1/16 x [1 2 1; 2 4 2; 1 2 1]. This kernel is an approximation of a 2D Gaussian function. It removes high-frequency components from the image (a "low-pass filter"). Kristen Grauman, Steve Seitz
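To see why the slide's 3x3 kernel approximates a Gaussian, one can compare it against a Gaussian sampled on the same grid; a small sketch (the value of sigma is a free choice, not from the slides):

```python
import numpy as np

# The 3x3 kernel from the slide, normalized to sum to 1:
binomial = np.outer([1, 2, 1], [1, 2, 1]) / 16.0

# A 2D Gaussian sampled on the same 3x3 grid (sigma chosen for illustration)
sigma = 0.85
xs = np.arange(-1, 2)
X, Y = np.meshgrid(xs, xs)
gauss = np.exp(-(X**2 + Y**2) / (2 * sigma**2))
gauss /= gauss.sum()                       # normalize so weights sum to 1

print(binomial)
print(gauss)                               # numerically close to the binomial kernel
```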
Kristen Grauman Smoothing with a Gaussian
Smoothing with a Gaussian Dali Antonio Torralba
Aude Oliva Marilyn Einstein
Describing images with features: Feature detection
Kristen Grauman What points would you choose?
Local features: desired properties Repeatability The same feature can be found in several images despite geometric and photometric transformations Saliency Each feature has a distinctive description Compactness and efficiency Many fewer features than image pixels Locality A feature occupies a relatively small area of the image; robust to clutter and occlusion Kristen Grauman
Goal: interest operator repeatability We want to detect (at least some of) the same points in both images. We have to be able to run the detection procedure independently per image — otherwise there is no chance of finding true matches. Kristen Grauman
Goal: descriptor distinctiveness We want to be able to reliably determine which point goes with which. The descriptor must provide some invariance to geometric and photometric differences between the two views. Kristen Grauman
Corners as distinctive interest points We should easily recognize the point by looking through a small window; shifting the window in any direction should give a large change in intensity. "Flat" region: no change in all directions. "Edge": no change along the edge direction. "Corner": significant change in all directions. Alyosha Efros, Darya Frolova, Denis Simakov
Harris Detector: Mathematics Window-averaged squared change of intensity induced by shifting the image data by [u,v]: E(u,v) = sum over (x,y) of w(x,y) [I(x+u, y+v) − I(x,y)]^2, where w(x,y) is the window function (1 in window, 0 outside; or a Gaussian), I(x+u, y+v) the shifted intensity, and I(x,y) the intensity. Darya Frolova, Denis Simakov
Harris Detector: Mathematics Expanding I(x,y) in a Taylor series, we have, for small shifts [u,v], a quadratic approximation to the error surface between a patch and itself, shifted by [u,v]: E(u,v) ≈ [u v] M [u v]^T, where M is a 2x2 matrix computed from image derivatives. Darya Frolova, Denis Simakov
Harris Detector: Mathematics Notation: Ix = ∂I/∂x, Iy = ∂I/∂y, so M = sum over (x,y) of w(x,y) [Ix^2, Ix Iy; Ix Iy, Iy^2]. Kristen Grauman
What does this matrix reveal? Since M is symmetric, we can diagonalize it: M = R^-1 [λ1 0; 0 λ2] R, with M x_i = λ_i x_i. The eigenvalues λ1, λ2 of M reveal the amount of intensity change in the two principal orthogonal gradient directions in the window. Kristen Grauman
Corner response function "Edge": λ1 >> λ2 or λ2 >> λ1. "Corner": λ1 and λ2 are large, λ1 ~ λ2. "Flat" region: λ1 and λ2 are small. Alyosha Efros, Darya Frolova, Denis Simakov, Kristen Grauman
Harris Detector: Mathematics Measure of corner response: R = det(M) − k (trace M)^2 = λ1 λ2 − k (λ1 + λ2)^2 (k is an empirical constant, k = 0.04-0.06). Darya Frolova, Denis Simakov
Harris Detector: Summary Compute image gradients Ix and Iy for all pixels. For each pixel, compute the matrix M by summing gradient products over the neighboring pixels (x,y), then compute the response R. Find points with a large corner response function R (R > threshold). Take the points of locally maximum R as the detected feature points (i.e., pixels where R is bigger than for all the 4 or 8 neighbors). Darya Frolova, Denis Simakov
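The summary above can be sketched directly in NumPy (window size, k, and the synthetic test image are illustrative choices; this omits the thresholding and non-maximum suppression steps):

```python
import numpy as np

def harris_response(I, k=0.05, win=1):
    """Harris corner response R = det(M) - k*trace(M)^2 at each interior pixel."""
    # Image gradients via central differences
    Iy, Ix = np.gradient(I.astype(float))
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy
    R = np.zeros_like(I, dtype=float)
    H, W = I.shape
    for i in range(win, H - win):
        for j in range(win, W - win):
            # Sum gradient products over the (2*win+1)^2 window -> matrix M
            s = np.s_[i - win:i + win + 1, j - win:j + win + 1]
            M = np.array([[Ixx[s].sum(), Ixy[s].sum()],
                          [Ixy[s].sum(), Iyy[s].sum()]])
            R[i, j] = np.linalg.det(M) - k * np.trace(M) ** 2
    return R

# A synthetic corner: bright quadrant in the lower-right, corner at (4, 4)
I = np.zeros((9, 9)); I[4:, 4:] = 1.0
R = harris_response(I)
print(np.unravel_index(R.argmax(), R.shape))  # response peaks at the corner
```

Along the edges of the bright square only one eigenvalue of M is large, so R is small or negative there; only at the corner are both large.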
Kristen Grauman Example of Harris application
Harris Detector: Some Properties Partial invariance to additive and multiplicative intensity changes. Only derivatives are used, so there is invariance to an intensity shift. Intensity scaling: fine, except for the threshold that's used to specify when R is large enough. Darya Frolova, Denis Simakov
Harris Detector: Some Properties Invariant to image scale? image zoomed image Antonio Torralba
Harris Detector: Some Properties Not invariant to image scale! When the corner is zoomed in, all points along it are classified as edges; only at the right scale is it detected as a corner. Darya Frolova, Denis Simakov
Scale Invariant Detection The problem: how do we choose corresponding circles independently in each image? Do objects in the image have a characteristic scale that we can identify? Darya Frolova, Denis Simakov
Solution: Scale Invariant Detection Design a function on the region (circle) which is scale invariant (the same for corresponding regions, even if they are at different scales), and take a local maximum of this function over region size: the region sizes s1 and s2 at which f peaks in the two images correspond (here the images are related by scale = 1/2). Antonio Torralba
Scale Invariant Detection A good function for scale detection has one stable, sharp peak as a function of region size (a flat or multi-peaked response is bad). For usual images, a good function is one which responds to contrast (sharp local intensity change). Antonio Torralba
Scale Invariant Detection Functions for determining scale — kernels: the Laplacian of Gaussian L = σ^2 (G_xx(x,y,σ) + G_yy(x,y,σ)) (2nd derivative of Gaussian) and the Difference of Gaussians DoG = G(x,y,kσ) − G(x,y,σ), where G(x,y,σ) is the Gaussian kernel. Note: both kernels are invariant to scale and rotation. Darya Frolova, Denis Simakov
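A Difference-of-Gaussians kernel can be built by subtracting two normalized Gaussians; a sketch (k = 1.6 is a common but not mandated choice, and the kernel radius is arbitrary):

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 2D Gaussian sampled on a (2*radius+1)^2 grid."""
    xs = np.arange(-radius, radius + 1)
    X, Y = np.meshgrid(xs, xs)
    G = np.exp(-(X**2 + Y**2) / (2 * sigma**2))
    return G / G.sum()

sigma, k, radius = 1.0, 1.6, 5
dog = gaussian_kernel(k * sigma, radius) - gaussian_kernel(sigma, radius)
print(dog.sum())   # ~0: the DoG is a band-pass kernel with zero DC response
```

Because each Gaussian sums to 1, the DoG sums to 0, so it responds to local intensity change (contrast) but not to constant regions — exactly the property the slide asks for.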
Scale Invariant Detectors Harris-Laplacian: find points that are a local maximum of the Harris corner measure in space (image coordinates) and of the Laplacian in scale. K. Mikolajczyk, C. Schmid. Indexing Based on Scale Invariant Interest Points. ICCV 2001
Describing images with features: Feature description
Raw patches as local descriptors The simplest way to describe the neighborhood around an interest point is to write down the list of intensities to form a feature vector. But this is very sensitive to even small shifts, rotations. Kristen Grauman
Geometric transformations Kristen Grauman e.g. scale, translation, rotation
Photometric transformations Kristen Grauman, Tinne Tuytelaars
SIFT descriptor [Lowe 2004] Use histograms to bin pixels within sub-patches according to their gradient orientation (0 to 2π). Kristen Grauman
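A much-simplified sketch of the orientation-histogram idea behind SIFT (it omits Gaussian weighting, trilinear interpolation, and descriptor normalization; the 16x16 patch with 4x4 sub-patches and 8 orientation bins follows the usual layout):

```python
import numpy as np

def orientation_histogram(patch, n_bins=8):
    """Histogram of gradient orientations in [0, 2pi), weighted by magnitude."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)           # map angles to [0, 2pi)
    bins = (ang / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())       # magnitude-weighted votes
    return hist

# SIFT-style layout: 4x4 grid of sub-patches x 8 bins = 128 values
rng = np.random.default_rng(0)
patch = rng.random((16, 16))
desc = np.concatenate([orientation_histogram(patch[i:i+4, j:j+4])
                       for i in range(0, 16, 4) for j in range(0, 16, 4)])
print(desc.shape)   # -> (128,)
```

Histogramming orientations within sub-patches is what buys tolerance to small shifts: a pixel can move within its sub-patch without changing the descriptor.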
Making the descriptor rotation invariant Rotate the patch according to its dominant gradient orientation. This puts the patches into a canonical orientation. Kristen Grauman, Matthew Brown, CSE 576: Computer Vision
Image features: Histograms of oriented gradients (HOG) Bin gradients from 8x8 pixel neighborhoods into 9 orientations Deva Ramanan (Dalal & Triggs CVPR 05)
http://web.mit.edu/vondrick/ihog/ What is this?
Kristen Grauman Filter Banks
Image from http://www.texasexplorer.com/austincap2.jpg Kristen Grauman
Kristen Grauman Showing magnitude of responses
Can you match the texture to the response? Filters A 1 B 2 C 3 Derek Hoiem Mean responses
Representing texture by mean response Filters Derek Hoiem Mean responses
[r1, r2, ..., r38] We can form a feature vector from the list of filter responses at each pixel. Kristen Grauman
Shape Context Belongie, Malik and Puzicha, PAMI 2002 Representation of the local shape around a feature location as histogram of edge points in an image relative to that location. Computed by counting edge points in log polar space. Tamara Berg
Color Histograms Representation of the distribution of colors in an image, derived by counting the number of pixels in each of a given set of color ranges in a (typically 3D) color space (RGB, HSV, etc.). Tamara Berg
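A minimal sketch of a 3D RGB histogram (the bin count and the random test image are illustrative choices):

```python
import numpy as np

def color_histogram(img, bins_per_channel=4):
    """3D RGB histogram: count pixels falling in each color-range bin."""
    # Quantize each 0-255 channel into bins_per_channel ranges
    q = (img.astype(float) / 256 * bins_per_channel).astype(int)
    hist = np.zeros((bins_per_channel,) * 3)
    np.add.at(hist, (q[..., 0].ravel(), q[..., 1].ravel(), q[..., 2].ravel()), 1)
    return hist / hist.sum()   # normalize so images of different sizes compare

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3))
h = color_histogram(img)
print(h.shape, h.sum())        # a 4x4x4 distribution summing to 1
```

Note the histogram discards all spatial layout: two images with the same colors in different arrangements get identical descriptors.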
Gist Oliva and Torralba, IJCV 2001 Captures the global energy of the scene. Computes edge orientation responses for multiple orientations and scales. Tamara Berg
Describing images with features: Feature matching
Correspondence and alignment Correspondence: matching points, patches, edges, or regions across images James Hays
Correspondence and alignment Alignment: find the parameters of the transformation that best align matched points Fitting: find the parameters of a model that best fit the data James Hays
Hough Transform Given a set of points, find the curve or line that explains the data points best. A point (x, y) in image space maps to a line in Hough space (m, b): y = mx + b, i.e., b = y − mx. P.V.C. Hough, Machine Analysis of Bubble Chamber Pictures, Proc. Int. Conf. High Energy Accelerators and Instrumentation, 1959. James Hays, Silvio Savarese
Hough Transform [Figure: each image point (x, y) casts votes along its line b = y − mx in (m, b) space; the accumulator cell where the lines intersect collects the most votes and identifies the best-fitting line.] James Hays, Silvio Savarese
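The voting scheme can be sketched with a simple (m, b) accumulator (parameter ranges, grid resolution, and test points are arbitrary choices; practical implementations use the polar (rho, theta) parameterization to handle vertical lines):

```python
import numpy as np

def hough_lines(points, m_range=(-5, 5), b_range=(-10, 10), n=101):
    """Vote in (m, b) space: each point (x, y) votes along b = y - m*x."""
    ms = np.linspace(*m_range, n)
    bs = np.linspace(*b_range, n)
    acc = np.zeros((n, n), dtype=int)
    for x, y in points:
        b_vals = y - ms * x                     # the point's line in Hough space
        b_idx = np.round((b_vals - b_range[0]) /
                         (b_range[1] - b_range[0]) * (n - 1))
        ok = (b_idx >= 0) & (b_idx < n)         # keep votes inside the grid
        acc[np.arange(n)[ok], b_idx[ok].astype(int)] += 1
    return ms, bs, acc

# Four points on y = 2x + 1, plus one outlier
pts = [(0, 1), (1, 3), (2, 5), (3, 7), (4, -4)]
ms, bs, acc = hough_lines(pts)
mi, bi = np.unravel_index(acc.argmax(), acc.shape)
print(ms[mi], bs[bi])   # the peak cell recovers (m, b) near (2, 1)
```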
RANSAC (RANdom SAmple Consensus): Fischler & Bolles, 1981. Algorithm: 1. Sample (randomly) the number of points required to fit the model 2. Solve for model parameters using samples 3. Score by the fraction of inliers within a preset threshold of the model Repeat 1-3 until the best model is found with high confidence James Hays, Silvio Savarese
RANSAC Line fitting example Algorithm: 1. Sample (randomly) the number of points required to fit the model (#=2) 2. Solve for model parameters using samples 3. Score by the fraction of inliers within a preset threshold of the model Repeat 1-3 until the best model is found with high confidence James Hays, Silvio Savarese
RANSAC Line fitting example (this sample yields N_I = 6 inliers). Algorithm: 1. Sample (randomly) the number of points required to fit the model (#=2) 2. Solve for model parameters using samples 3. Score by the fraction of inliers within a preset threshold of the model Repeat 1-3 until the best model is found with high confidence James Hays, Silvio Savarese
RANSAC (this sample yields N_I = 14 inliers). Algorithm: 1. Sample (randomly) the number of points required to fit the model (#=2) 2. Solve for model parameters using samples 3. Score by the fraction of inliers within a preset threshold of the model Repeat 1-3 until the best model is found with high confidence James Hays, Silvio Savarese
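The three RANSAC steps can be sketched for line fitting (the iteration count, inlier threshold, and test points are illustrative choices):

```python
import numpy as np

def ransac_line(points, n_iters=200, thresh=0.5, seed=0):
    """Fit y = m*x + b by RANSAC: sample 2 points, score by inlier count."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        # 1. Sample the minimum number of points needed for a line (#=2)
        (x1, y1), (x2, y2) = pts[rng.choice(len(pts), size=2, replace=False)]
        if x1 == x2:
            continue                              # skip vertical samples
        # 2. Solve for model parameters from the sample
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        # 3. Score: count points within thresh of the candidate line
        inliers = np.sum(np.abs(pts[:, 1] - (m * pts[:, 0] + b)) < thresh)
        if inliers > best_inliers:
            best_model, best_inliers = (m, b), inliers
    return best_model, best_inliers

# Five points near y = 2x + 1, plus two gross outliers
pts = [(0, 1.1), (1, 2.9), (2, 5.0), (3, 7.1), (4, 9.0), (1, 9.0), (3, -2.0)]
model, n_inliers = ransac_line(pts)
print(model, n_inliers)   # slope near 2, intercept near 1, 5 inliers
```

Unlike least squares, the outliers never corrupt the fit: a line through an outlier simply scores few inliers and is discarded.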
Example: solving for translation Given matched points A_i = (x_i^A, y_i^A) and B_i = (x_i^B, y_i^B), estimate the translation (t_x, t_y) of the object: x_i^B = x_i^A + t_x, y_i^B = y_i^A + t_y. Derek Hoiem
Example: solving for translation Least squares solution: 1. Write down the objective function. 2. Write it in the form Ax = b, stacking one pair of rows per match: [1 0; 0 1; ...; 1 0; 0 1] [t_x; t_y] = [x_1^B − x_1^A; y_1^B − y_1^A; ...; x_n^B − x_n^A; y_n^B − y_n^A]. 3. Solve using the pseudo-inverse or eigenvalue decomposition. Derek Hoiem
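The stacked least-squares system for translation can be solved in a few lines (the matched points here are synthetic, made up for illustration):

```python
import numpy as np

# Matched points: B_i = A_i + t for a translation t = (t_x, t_y)
A = np.array([[0., 0.], [1., 0.], [0., 2.]])
B = A + np.array([3., -1.])              # true translation (3, -1)

# Stack the system A_mat @ t = b: rows alternate [1 0] and [0 1] per match
n = len(A)
A_mat = np.tile(np.eye(2), (n, 1))
b = (B - A).ravel()                      # stacked (x_B - x_A, y_B - y_A)
t, *_ = np.linalg.lstsq(A_mat, b, rcond=None)
print(t)                                 # -> [ 3. -1.]
```

With noise-free matches the solution is exact; with noisy matches `lstsq` returns the translation minimizing the sum of squared residuals, which is simply the mean of the per-match displacements.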
Example: solving for translation Problem: outliers, multiple objects, and/or many-to-one matches. Hough transform solution: 1. Initialize a grid of parameter values. 2. Each matched pair casts a vote for consistent values (t_x = x^B − x^A, t_y = y^B − y^A). 3. Find the parameters with the most votes. 4. Solve using least squares with the inliers. Derek Hoiem
Example: solving for translation Problem: outliers. RANSAC solution: 1. Sample a set of matching points (1 pair). 2. Solve for the transformation parameters (t_x = x^B − x^A, t_y = y^B − y^A). 3. Score the parameters with the number of inliers. 4. Repeat steps 1-3 N times. Derek Hoiem
Local features: main components 1) Detection: identify the interest points. 2) Description: extract a vector feature descriptor around each interest point, x^(1) = [x_1^(1), ..., x_d^(1)]. 3) Matching: determine correspondence between descriptors x^(1), x^(2) in the two views. Kristen Grauman
Next Time Classification and detection Adriana's research