Robert Collins CSE598G Mean-Shift Blob Tracking through Scale Space


Mean-Shift Blob Tracking through Scale Space Robert Collins, CVPR 03

Abstract: In mean-shift tracking, choosing the scale of the kernel is an issue. Scale-space feature selection provides inspiration: perform mean-shift with a scale-space kernel to optimize for blob location and scale.

Nice Property: Running mean-shift with kernel K on weight image w is equivalent to performing gradient ascent in a (virtual) image formed by convolving w with some shadow kernel H:

Δx = [ Σ_a K(a-x) w(a) (a-x) ] / [ Σ_a K(a-x) w(a) ] = c ∇_x [ Σ_a H(a-x) w(a) ] / [ Σ_a K(a-x) w(a) ]
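To make the offset concrete, here is a minimal numpy sketch (my own toy example, not code from the talk) of repeated mean-shift steps on a synthetic weight image; the Gaussian kernel width, window radius, and blob parameters are illustrative assumptions.

```python
# Minimal sketch (not the author's code): mean-shift steps on a 2D weight
# image w, using a Gaussian mean-shift kernel K. Window size, sigma, and the
# synthetic blob are illustrative assumptions.
import numpy as np

def mean_shift_offset(w, x, sigma=10.0, radius=30):
    """Compute dx = sum_a K(a-x) w(a) (a-x) / sum_a K(a-x) w(a)."""
    rows, cols = w.shape
    r0, c0 = int(round(x[0])), int(round(x[1]))
    rs = np.arange(max(0, r0 - radius), min(rows, r0 + radius + 1))
    cs = np.arange(max(0, c0 - radius), min(cols, c0 + radius + 1))
    A_r, A_c = np.meshgrid(rs, cs, indexing='ij')        # pixel coordinates a
    d_r, d_c = A_r - x[0], A_c - x[1]                    # offsets (a - x)
    K = np.exp(-(d_r**2 + d_c**2) / (2 * sigma**2))      # Gaussian kernel K(a-x)
    Kw = K * w[A_r, A_c]
    return np.array([(Kw * d_r).sum(), (Kw * d_c).sum()]) / Kw.sum()

# Toy weight image: a single blob centered at (60, 40); start mean-shift nearby.
yy, xx = np.mgrid[0:100, 0:100]
w = np.exp(-((yy - 60)**2 + (xx - 40)**2) / (2 * 8.0**2))
x = np.array([50.0, 50.0])
for _ in range(20):
    x = x + mean_shift_offset(w, x)
print(x)   # converges toward the blob center (60, 40)
```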

Size Does Matter! Mean-shift is related to kernel density estimation (aka Parzen estimation), so choosing the correct scale of the mean-shift kernel is important. (Illustrations: a kernel that is too big; a kernel that is too small.)

Size Does Matter: comparison of fixed-scale tracking, ±10% scale adaptation, and our approach, tracking through scale space.

Some Approaches to Size Selection:
Choose one scale and stick with it.
Bradski's CAMSHIFT tracker computes principal axes and scales from the second moment matrix of the blob. Assumes one blob and little clutter.
CRM adapt the window size by +/- 10% and evaluate using the Bhattacharyya coefficient. Although this does stop the window from growing too big, it is not sufficient to keep the window from shrinking too much.
Comaniciu's variable bandwidth methods. Computationally complex.
Rasmussen and Hager: add a border of pixels around the window, and require that pixels in the window should look like the object, while pixels in the border should not. Center-surround.

Scale-Space Theory

Scale Space Basic idea: different scales are appropriate for describing different objects in the image, and we may not know the correct scale/size ahead of time.

Scale Selection: the Laplacian operator, ∇²g = ∂²g/∂x² + ∂²g/∂y².

LoG Operator (figure credit: M. Hebert, CMU)

Approximating LoG with DoG: LoG can be approximated by a Difference of two Gaussians (DoG) at different scales, which is most convenient if the ratio of the two sigmas is held fixed (e.g., σ2/σ1 = 1.6). We will come back to DoG later.
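As an aside, here is a small numpy sketch (mine, not from the slides) that builds a 1D DoG with the sqrt(1.6) sigma split used later in the talk and compares its shape to the scale-normalized LoG it approximates; the grid and base sigma are arbitrary choices.

```python
# Illustrative sketch: compare a 1D DoG to the scale-normalized LoG it
# approximates. The sigma ratio 1.6 follows the convention used later in the
# talk; everything else here is an assumption.
import numpy as np

x = np.linspace(-10, 10, 2001)
sigma = 2.0
s1, s2 = sigma / np.sqrt(1.6), sigma * np.sqrt(1.6)

def gauss(x, s):
    return np.exp(-x**2 / (2 * s**2)) / (np.sqrt(2 * np.pi) * s)

# Difference of Gaussians with fixed sigma ratio.
dog = gauss(x, s1) - gauss(x, s2)

# Scale-normalized Laplacian of Gaussian: sigma^2 * d^2/dx^2 G(x; sigma).
log_norm = sigma**2 * np.gradient(np.gradient(gauss(x, sigma), x), x)

# Up to a constant factor, the two profiles have nearly the same shape.
scale = dog[x.size // 2] / log_norm[x.size // 2]
rel_diff = np.max(np.abs(dog - scale * log_norm)) / np.max(np.abs(dog))
print(rel_diff)   # a few percent: the shapes closely match up to a scale factor
```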

Local Scale Space Maxima: Lindeberg proposes that the natural scale for describing a feature is the scale at which a normalized derivative for detecting that feature achieves a local maximum, both spatially and in scale. D_norm L is a normalized Laplacian of Gaussian operator, σ² LoG(x; σ). (Figure: example for blob detection, normalized response plotted against scale.)
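A quick way to see this scale selection in action is the classic disk example below (my own toy sketch, not from the talk): the scale-normalized LoG response at the blob center peaks near the blob's characteristic scale, roughly radius / sqrt(2) for a binary disk.

```python
# Sketch of Lindeberg-style scale selection on a synthetic disk.
import numpy as np
from scipy.ndimage import gaussian_laplace

# Binary disk of radius 10 centered in a 101x101 image.
radius = 10.0
yy, xx = np.mgrid[-50:51, -50:51]
img = ((xx**2 + yy**2) <= radius**2).astype(float)

sigmas = np.linspace(2, 14, 61)
responses = [abs((s**2) * gaussian_laplace(img, s)[50, 50]) for s in sigmas]

best = sigmas[int(np.argmax(responses))]
print(best, radius / np.sqrt(2))   # the two values should be close
```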

Extrema in Space and Scale (figure: locating extrema jointly in space and scale within the scale-space volume).

Example: Blob Detection

Why Normalized Derivatives? For the Laplacian of Gaussian (LoG), the amplitude of the response decreases with greater smoothing, so responses at different scales are not directly comparable without normalization.

Interesting Observation: If we approximate the LoG by a Difference of Gaussian (DoG) filter, we do not have to normalize to achieve constant amplitude across scale.

Another Explanation: Lowe, IJCV 2004 (the SIFT keypoint paper).

Anyhow... Scale-space theory says we should look for modes in a DoG-filtered image volume. Let's just think of the spatial dimensions for now. We want to look for modes in a DoG-filtered image, meaning a weight image convolved with a DoG filter. Insight: if we view the DoG filter as a shadow kernel, we could use mean-shift to find the modes. Of course, we'd have to figure out what mean-shift kernel corresponds to a shadow kernel that is a DoG.

Kernel-Shadow Pairs: Given a convolution (shadow) kernel H, what is the corresponding mean-shift kernel K? Perform the change of variables r = ||a-x||², rewriting H(a-x) => h(||a-x||²) => h(r). Then the kernel profile k must satisfy h'(r) = -c k(r). Examples (shadow → kernel): Epanechnikov → Flat; Gaussian → Gaussian. The kernel corresponding to a DoG shadow is derived next.
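A quick way to see the relation in action is to differentiate a shadow profile numerically; the sketch below (my own, with illustrative profiles and grid) recovers the flat kernel from the Epanechnikov shadow and the Gaussian kernel from the Gaussian shadow.

```python
# Hedged sketch of the kernel-shadow relation h'(r) = -c k(r): given a shadow
# profile h(r), recover the mean-shift kernel profile k(r) by numerical
# differentiation. Profiles and constants here are illustrative choices.
import numpy as np

r = np.linspace(0.0, 4.0, 4001)

# Gaussian shadow profile h(r) = exp(-r/2)  (r = ||a - x||^2).
h_gauss = np.exp(-r / 2)
k_gauss = -np.gradient(h_gauss, r)          # proportional to exp(-r/2): Gaussian kernel

# Epanechnikov shadow profile h(r) = max(1 - r, 0).
h_epan = np.maximum(1.0 - r, 0.0)
k_epan = -np.gradient(h_epan, r)            # flat (uniform) kernel for r < 1, zero outside

print(np.allclose(k_gauss, 0.5 * h_gauss, atol=1e-3))    # Gaussian shadow -> Gaussian kernel
print(k_epan[r < 0.99].min(), k_epan[r < 0.99].max())    # ~1 everywhere inside: flat kernel
```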

Kernel related to DoG Shadow: apply h'(r) = -c k(r) to the DoG shadow, a difference of Gaussians with σ1 = σ/sqrt(1.6) and σ2 = σ*sqrt(1.6), to obtain the corresponding mean-shift kernel.
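The same numerical trick applied to the DoG shadow (again my own sketch; the sigma and grid are arbitrary) produces the kernel profile and shows that it dips below zero, which sets up the problem discussed next.

```python
# Sketch (not the author's derivation): differentiate the DoG shadow profile
# numerically to get the corresponding mean-shift kernel profile, and check
# its sign.
import numpy as np

sigma = 1.0
s1, s2 = sigma / np.sqrt(1.6), sigma * np.sqrt(1.6)

r = np.linspace(0.0, 20.0, 20001)            # r = ||a - x||^2
h_dog = np.exp(-r / (2 * s1**2)) - np.exp(-r / (2 * s2**2))   # DoG shadow profile h(r)
k_dog = -np.gradient(h_dog, r)               # kernel profile, up to a positive constant c

print(k_dog.min() < 0)   # True: the DoG-shadow kernel takes negative values
```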

Kernel related to DoG Shadow, h'(r) = -c k(r): some values of the resulting kernel are negative. Is this a problem? Umm... yes it is.

Dealing with Negative Weights

(Show little demo with negative weights.) Mean-shift will sometimes converge to a valley rather than a peak. The behavior is sometimes even stranger than that (the step size becomes way too big and you end up in another part of the function).

Why we might want negative weights: Given an n-bucket model histogram {m_i, i = 1,...,n} and a data histogram {d_i, i = 1,...,n}, CRM suggest measuring similarity using the Bhattacharyya coefficient

ρ = Σ_{i=1..n} sqrt(m_i d_i)

They use the mean-shift algorithm to climb the spatial gradient of this function by weighting each pixel falling into bucket i by

w_i = sqrt(m_i / d_i)

Note the similarity to the likelihood ratio function w_i = log2(m_i / d_i).
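For concreteness, here is a toy numpy sketch (the bucket count, image sizes, and histograms are all made up) of the Bhattacharyya coefficient and the per-pixel CRM weights:

```python
# Toy sketch of the CRM similarity and per-pixel weights described above.
import numpy as np

n_buckets = 8
rng = np.random.default_rng(0)

model_pixels = rng.integers(0, n_buckets, size=(40, 40))   # pixels already quantized to buckets
data_pixels  = rng.integers(0, n_buckets, size=(40, 40))

m = np.bincount(model_pixels.ravel(), minlength=n_buckets) / model_pixels.size
d = np.bincount(data_pixels.ravel(),  minlength=n_buckets) / data_pixels.size

rho = np.sum(np.sqrt(m * d))                 # Bhattacharyya coefficient
eps = 1e-12
w_bucket = np.sqrt((m + eps) / (d + eps))    # CRM weight per bucket
w_image = w_bucket[data_pixels]              # weight image: each pixel gets its bucket's weight

print(rho, w_image.shape)
```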

Why we might want negative weights: w_i = log2(m_i / d_i). Using the likelihood ratio makes sense probabilistically. For example, using mean-shift with a uniform kernel on weights that are likelihood ratios would then be equivalent to using KL divergence to measure the difference between the model histogram m and the data histogram d: the sum over pixels becomes a sum over buckets, since a window of N pixels has roughly N·d_i pixels with value i.
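A quick numeric check of that equivalence (synthetic histograms; the bucket count and sample size are arbitrary): summing the log-ratio weights over the pixels in the window equals N times the negative KL divergence between the data and model histograms, measured in bits.

```python
# Quick numeric check of the KL-divergence connection (my notation).
import numpy as np

rng = np.random.default_rng(1)
n_buckets = 8
pixels = rng.integers(0, n_buckets, size=5000)            # data pixels, already bucketed

d = np.bincount(pixels, minlength=n_buckets) / pixels.size
m = rng.dirichlet(np.ones(n_buckets))                     # some model histogram

w = np.log2(m / d)                                        # per-bucket weight w_i
sum_over_pixels = w[pixels].sum()                         # sum of weights over pixels
kl_bits = np.sum(d * np.log2(d / m))                      # KL(d || m) in bits

print(sum_over_pixels, -pixels.size * kl_bits)            # the two agree
```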

Analysis: Scaling the Weights. Recall the mean-shift offset Δx = Σ_a K(a-x) w(a) (a-x) / Σ_a K(a-x) w(a). What if w(a) is scaled to c·w(a)? The factor c appears in both the numerator and the denominator and cancels, so mean shift is invariant to scaled weights.

Analysis: Adding a Constant. What if we add a constant to get w(a)+c? The constant contributes extra terms to the numerator and the denominator that do not cancel, so mean shift is not invariant to an added constant. This is annoying!
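A tiny 1D numeric check of both claims (my own toy kernel and weights):

```python
# The mean-shift offset is unchanged when weights are scaled, but changes when
# a constant is added to them.
import numpy as np

a = np.linspace(-5, 5, 101)                  # sample positions
x = 0.7                                      # current location
K = np.exp(-(a - x)**2 / 2)                  # Gaussian mean-shift kernel K(a - x)
w = np.exp(-(a - 2.0)**2)                    # some positive weight image

def offset(w):
    return np.sum(K * w * (a - x)) / np.sum(K * w)

print(offset(w), offset(3.0 * w))            # equal (up to floating point): invariant to scaling
print(offset(w), offset(w + 5.0))            # different: not invariant to an added constant
```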

Adding a Constant, result: it isn't a good idea to just add a large positive number to our weights to make sure they stay positive. (Show little demo again, adding a constant.)

Another Interpretation of Mean-shift Offset: Thinking of the offset as a weighted center of mass doesn't make sense for negative weights. Δx = Σ_a K(a-x) w(a) (a-x) / Σ_a K(a-x) w(a), where each term is a weight, K(a-x) w(a), times a point offset, (a-x).

Another Interpretation of Mean-shift Offset: Think of each term as a vector, which has a direction and a magnitude: Δx = Σ_a K(a-x) w(a) (a-x) / Σ_a K(a-x) w(a). A negative weight now just means a vector in the opposite direction. Interpret the mean-shift offset as an estimate of the average vector. Note: the numerator is interpreted as a sum of directions and magnitudes, but the denominator should then just be a sum of magnitudes (which should all be positive).

Absolute Value in Denominator: Δx = Σ_a K(a-x) w(a) (a-x) / Σ_a |K(a-x) w(a)|... or does it?
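For illustration only, here is a 1D toy comparison (my own construction, not the demo from the talk) of the standard denominator against the absolute-value denominator when some weights are negative:

```python
# Compare the standard mean-shift offset against the variant that puts the
# absolute value of each (kernel * weight) term in the denominator.
import numpy as np

a = np.linspace(-5, 5, 101)
x = 0.5
K = np.exp(-(a - x)**2 / 2)
w = np.log2((np.exp(-(a - 2.0)**2) + 0.05) / 0.2)   # likelihood-ratio-style weights, some negative

num = np.sum(K * w * (a - x))
print(num / np.sum(K * w))            # standard denominator (can misbehave)
print(num / np.sum(np.abs(K * w)))    # magnitude (absolute-value) denominator
```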

Back to the demo: there can be oscillations when there are negative weights. I'm not sure what to do about that.

Outline of Scale-Space Mean Shift. General idea: build a designer shadow kernel that generates the desired DoG scale space when convolved with the weight image w(x). Change variables and take derivatives of the shadow kernel to find the corresponding mean-shift kernels, using the relationship shown earlier. Given an initial estimate (x_0, s_0), apply the mean-shift algorithm to find the nearest local mode in scale space. Note that, using mean-shift, we DO NOT have to explicitly generate the scale space.

Scale-Space Kernel

Mean-Shift through Scale Space:
1) Input weight image w(a) with current location x_0 and scale s_0.
2) Holding s fixed, perform spatial mean-shift using the spatial mean-shift equation.
3) Let x be the location computed from step 2. Holding x fixed, perform mean-shift along the scale axis using the scale mean-shift equation.
4) Repeat steps 2 and 3 until convergence.
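Here is a structural sketch of that alternation (my own code, not the paper's; the kernels, bandwidths, and the clamping of negative DoG responses are simplifications). Unlike the designer-kernel formulation, this toy version does precompute a DoG-filtered stack of the weight image for clarity, whereas the whole point above is that you do not have to.

```python
# Alternate a Gaussian spatial mean-shift step at the current scale with a
# 1D mean-shift step along the scale axis at the current location.
import numpy as np
from scipy.ndimage import gaussian_filter

# Toy weight image: a single blob of spatial sigma ~12 centered at (60, 45).
yy, xx = np.mgrid[0:100, 0:100]
w = np.exp(-((yy - 60)**2 + (xx - 45)**2) / (2 * 12.0**2))

# DoG scale-space stack (sigma ratio 1.6, as in the talk).
sigmas = np.exp(np.linspace(np.log(4), np.log(30), 15))
dog = np.array([gaussian_filter(w, s / np.sqrt(1.6)) - gaussian_filter(w, s * np.sqrt(1.6))
                for s in sigmas])

def spatial_step(x, s):
    """Step 2: spatial mean-shift at fixed scale (Gaussian kernel with bandwidth ~ s)."""
    dy, dx = yy - x[0], xx - x[1]
    Kw = np.exp(-(dy**2 + dx**2) / (2 * s**2)) * w
    return x + np.array([(Kw * dy).sum(), (Kw * dx).sum()]) / Kw.sum()

def scale_step(x, s):
    """Step 3: 1D mean-shift along the scale axis at fixed location (Gaussian kernel in log-scale)."""
    score = dog[:, int(round(x[0])), int(round(x[1]))]        # DoG response through scale at x
    Ks = np.exp(-(np.log(sigmas) - np.log(s))**2 / (2 * 0.4**2))
    weights = Ks * np.maximum(score, 0)                        # clamp negatives for this toy demo
    return np.exp(np.sum(weights * np.log(sigmas)) / np.sum(weights))

x, s = np.array([48.0, 52.0]), 6.0
for _ in range(20):
    x = spatial_step(x, s)     # hold s fixed, move in space
    s = scale_step(x, s)       # hold x fixed, move in scale
print(x, s)                    # approaches the blob center and its characteristic scale
```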

Second Thoughts: Rather than being strictly correct about the kernel K, note that it is approximately Gaussian. (Plot: blue = the kernel associated with the shadow kernel of a DoG with sigma σ; red = a Gaussian kernel with sigma σ/sqrt(1.6).) So why not avoid issues with the negative kernel by just using a Gaussian to find the spatial mode?
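To reproduce something like that blue/red comparison, one can tabulate both profiles numerically (a sketch under my own choices of sigma and grid) and plot or inspect them:

```python
# Kernel profile derived from the DoG shadow vs. a Gaussian kernel profile
# with sigma/sqrt(1.6), peak-normalized for side-by-side comparison.
import numpy as np

sigma = 1.0
s1, s2 = sigma / np.sqrt(1.6), sigma * np.sqrt(1.6)
r = np.linspace(0.0, 8.0, 9)                          # a few values of r = ||a - x||^2

k_dog = (np.exp(-r / (2 * s1**2)) / (2 * s1**2)
         - np.exp(-r / (2 * s2**2)) / (2 * s2**2))    # kernel from the DoG shadow (h'(r) = -c k(r))
k_gauss = np.exp(-r / (2 * s1**2))                    # Gaussian kernel profile with sigma/sqrt(1.6)

# Peak-normalize both so their shapes can be compared (or plotted) directly;
# note the DoG-derived kernel eventually dips slightly below zero.
print(np.round(k_dog / k_dog[0], 3))
print(np.round(k_gauss / k_gauss[0], 3))
```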

scaledemo.m: interleaves Gaussian spatial mode finding with 1D DoG mode finding.

Summary: Mean-shift tracking. Choosing the scale of the kernel is an issue; scale-space feature selection provides inspiration. Perform mean-shift with a scale-space kernel to optimize for blob location and scale. Contributions: a natural mechanism for choosing scale WITHIN the mean-shift framework, and building designer kernels for efficient hill-climbing on (implicitly-defined) convolution surfaces.