The state of the art and beyond

Similar documents
Overview. Introduction to local features. Harris interest points + SSD, ZNCC, SIFT. Evaluation and comparison of different detectors

Maximally Stable Local Description for Scale Selection

Overview. Introduction to local features. Harris interest points + SSD, ZNCC, SIFT. Evaluation and comparison of different detectors

Advances in Computer Vision. Prof. Bill Freeman. Image and shape descriptors. Readings: Mikolajczyk and Schmid; Belongie et al.

Overview. Harris interest points. Comparing interest points (SSD, ZNCC, SIFT) Scale & affine invariant interest points

Invariant local features. Invariant Local Features. Classes of transformations. (Good) invariant local features. Case study: panorama stitching

Wavelet-based Salient Points with Scale Information for Classification

Feature detectors and descriptors. Fei-Fei Li

Feature detectors and descriptors. Fei-Fei Li

Corners, Blobs & Descriptors. With slides from S. Lazebnik & S. Seitz, D. Lowe, A. Efros

Recap: edge detection. Source: D. Lowe, L. Fei-Fei

Scale & Affine Invariant Interest Point Detectors

Lecture 8: Interest Point Detection. Saad J Bedros

Properties of detectors Edge detectors Harris DoG Properties of descriptors SIFT HOG Shape context

Blobs & Scale Invariance

Rotational Invariants for Wide-baseline Stereo

Detectors part II Descriptors

Fisher Vector image representation

Distinguish between different types of scenes. Matching human perception Understanding the environment

Pictorial Structures Revisited: People Detection and Articulated Pose Estimation. Department of Computer Science TU Darmstadt

Lecture 6: Finding Features (part 1/2)

INTEREST POINTS AT DIFFERENT SCALES

Feature extraction: Corners and blobs

CS5670: Computer Vision

LoG Blob Finding and Scale. Scale Selection. Blobs (and scale selection) Achieving scale covariance. Blob detection in 2D. Blob detection in 2D

Achieving scale covariance

Compressed Fisher vectors for LSVR

Kernel Density Topic Models: Visual Topics Without Visual Words

Edges and Scale. Image Features. Detecting edges. Origin of Edges. Solution: smooth first. Effects of noise

38 1 Vol. 38, No ACTA AUTOMATICA SINICA January, Bag-of-phrases.. Image Representation Using Bag-of-phrases

Affine Adaptation of Local Image Features Using the Hessian Matrix

Photo Tourism and im2gps: 3D Reconstruction and Geolocation of Internet Photo Collections Part II

SURVEY OF APPEARANCE-BASED METHODS FOR OBJECT RECOGNITION

Rapid Object Recognition from Discriminative Regions of Interest

Space-time Zernike Moments and Pyramid Kernel Descriptors for Action Classification

Human Action Recognition under Log-Euclidean Riemannian Metric

Local Features (contd.)

Learning Discriminative and Transformation Covariant Local Feature Detectors

Pedestrian Density Estimation by a Weighted Bag of Visual Words Model

EE 6882 Visual Search Engine

Global Scene Representations. Tilke Judd

Lecture 12. Local Feature Detection. Matching with Invariant Features. Why extract features? Why extract features? Why extract features?

Image Analysis. Feature extraction: corners and blobs

Lecture 8: Interest Point Detection. Saad J Bedros

SIFT keypoint detection. D. Lowe, Distinctive image features from scale-invariant keypoints, IJCV 60 (2), pp , 2004.

Interpreting the Ratio Criterion for Matching SIFT Descriptors

Riemannian Metric Learning for Symmetric Positive Definite Matrices

Accumulated Stability Voting: A Robust Descriptor from Descriptors of Multiple Scales

Probabilistic Latent Semantic Analysis

DISCRIMINATIVE DECORELATION FOR CLUSTERING AND CLASSIFICATION

On the Completeness of Coding with Image Features

CS4495/6495 Introduction to Computer Vision. 8C-L3 Support Vector Machines

Use Bin-Ratio Information for Category and Scene Classification

Kai Yu NEC Laboratories America, Cupertino, California, USA

Lecture 7: Finding Features (part 2/2)

arxiv: v1 [cs.cv] 10 Feb 2016

Blob Detection CSC 767

Given a feature in I 1, how to find the best match in I 2?

CSE 473/573 Computer Vision and Image Processing (CVIP)

Scale-space image processing

Recognition and Classification in Images and Video

Style-aware Mid-level Representation for Discovering Visual Connections in Space and Time

Region Moments: Fast invariant descriptors for detecting small image structures

CS 3710: Visual Recognition Describing Images with Features. Adriana Kovashka Department of Computer Science January 8, 2015

Comparative study of global invariant. descriptors for object recognition

Lecture 7: Finding Features (part 2/2)

Sparse Kernel Machines - SVM

Beyond Spatial Pyramids

Face detection and recognition. Detection Recognition Sally

Instance-level l recognition. Cordelia Schmid & Josef Sivic INRIA

A New Shape Adaptation Scheme to Affine Invariant Detector

Instance-level recognition: Local invariant features. Cordelia Schmid INRIA, Grenoble

SUBJECTIVE EVALUATION OF IMAGE UNDERSTANDING RESULTS

CITS 4402 Computer Vision

SURF Features. Jacky Baltes Dept. of Computer Science University of Manitoba WWW:

SCENE CLASSIFICATION USING SPATIAL RELATIONSHIP BETWEEN LOCAL POSTERIOR PROBABILITIES

SCENE CLASSIFICATION USING SPATIAL RELATIONSHIP BETWEEN LOCAL POSTERIOR PROBABILITIES

DYNAMIC TEXTURE RECOGNITION USING ENHANCED LBP FEATURES

Fantope Regularization in Metric Learning

Information Theory in Computer Vision and Pattern Recognition

Large-scale classification of traffic signs under real-world conditions

On the consistency of the SIFT Method

Lie Algebrized Gaussians for Image Representation

OBJECT DETECTION AND RECOGNITION IN DIGITAL IMAGES

Loss Functions and Optimization. Lecture 3-1

Object Recognition Using Local Characterisation and Zernike Moments

Kronecker Decomposition for Image Classification

Hilbert-Huang Transform-based Local Regions Descriptors

International Journal of Computer Engineering and Applications, Volume XII, Special Issue, August 18, ISSN

Boosting: Algorithms and Applications

Visual Object Recognition

Analysis on a local approach to 3D object recognition

Asaf Bar Zvi Adi Hayat. Semantic Segmentation

Lecture 13 Visual recognition

Detecting Humans via Their Pose

Image Processing 1 (IP1) Bildverarbeitung 1

Is the Scale Invariant Feature Transform (SIFT) really Scale Invariant?

Scale & Affine Invariant Interest Point Detectors

CSC 411: Lecture 03: Linear Classification

Transcription:

Feature Detectors and Descriptors The state of the art and beyond Local covariant detectors and descriptors have been successful in many applications Registration Stereo vision Motion estimation Matching Retrieval Classification Detection Action recognition Robot navigation However, they have still many issues to address 2/30

Different classes of detector in term of generality Harris, DoG, EBR, MSER. All possible task Sharable category-specific features Category specific Object specific Detector of a particular patch Of course, in the restricted domain, performance goes up, but at a cost (learning required, multiple detectors run, may overfit). Direct comparison in terms of e.g. repeatability not appropriate. 3/30

Detector & Descriptor evaluation issues Mikolajczyk et al. IJCV 05 protocol, but: Bias towards dense responses: Larger number of detection better performance (return all windows and you ll be perfect) Bias towards large regions: No parallax (either flat scene or same viewpoint) Bias towards (current) detector-friendly problems: Scenes selected so that standard detectors perform well. There are many image categories, where standard detectors are useless. Increasing generality (applicability) cannot be demonstrated on this dataset Limited number of scenes, risk of over-fitting 4/30

Detector & Descriptor evaluation issues Mikolajczyk k et al. IJCV 05 protocol (like all others we know) ignore important aspects of detectors: Memory requirements note papers dealing with the how do I reduce memory needs problem, very important for large scale problem Speed this might be overrated, most detector are not the bottleneck of a system, most can be parallelized easily, should be well motivated Coverage reasonable spatial distribution of features is beneficial ; metric not clear Complementarity even a underperforming feature in terms of repeatability is valuable, if complementary Geometric precision i Degrees of freedom fixed Can the Mikolajczyk et al. protocol be fixed? future work 5/30

Detector & Descriptor as Components Performance in different application/niches may vary significantly; ifi different aspects are important Evaluation in context of some (general) application such as categorization is important see Evaluation Report later In a system, yet another set of aspects of a detector becomes important: e.g. efficient feedback control, adaptability 6/30

Previous Evaluations 2D Scene Homography C. Schmid, R. Mohr, and C. Bauckhage, Evaluation of interest point detectors, IJCV, 2000. K. Mikolajczyk and C. Schmid, A performance evaluation of local descriptors, CVPR, 2003. T. Kadir, M. Brady, and A. Zisserman, An affine invariant method for selecting salient regions in images, in ECCV, 2004. K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky,T. Kadir, and L. Van Gool, A comparison of affine region detectors, IJCV, 2005. A. Haja, S. Abraham, and B. Jahne, Localization accuracy of region detectors, t CVPR 2008 7/30

Previous Evaluations 3D Scene - epipolar constraints t F. Fraundorfer and H. Bischof, Evaluation of local detectors on nonplanar, scenes, in AAPR, 2004. P. Moreels and P. Perona, Evaluation of features detectors and descriptors based on 3D objects, IJCV, 2007. S. Winder and M. Brown, Learning local image descriptors, CVPR, 2007,2009. 8/30

Previous Evaluations Image/object categories K. Mikolajczyk, B. Leibe, and B. Schiele, Local features for object class recognition, in ICCV, 2005 E. Seemann, B. Leibe, K. Mikolajczyk, and B. Schiele, An evaluation of local shape-based features for pedestrian detection, in BMVC, 2005. M. Stark and B. Schiele, How good are local features for classes of geometric objects, in ICCV, 2007. K. E. A. van de Sande, T. Gevers and C. G. M. Snoek, Evaluation of Color Descriptors for Object and Scene Recognition. CVPR, 2008. 9/30

Domains Recent Developments Registration Stereo vision Motion estimation Matching Retrieval Classification Detection Action recognition Robot navigation 10/30

Recent Developments Learning features from the data Interest point detectors - decision trees, AdaBoost Local descriptors exhaustive optimizations of descriptor parameters Improving discriminability dimensionality/memory reduction LDA, PCA Feature selection General saliency, application driven selection Quantization Clustering Efficient features extraction algorithms Integral images, sampling strategies, GPU implementations Domain specific features 3D, spatio-temporal, IR, biology, medical imaging 11/30

Feature Detectors and Descriptors Evaluation for image classification KrystianMikolajczyk and Mark Barnard University of Surrey, UK CVPR, Miami, 2009

Setting up an evaluation Which problem? Category recognition Bags-of features What dataset? Pascal VOC 2007 Protocol and criteria? Public dataset, Avoiding risk to over-fitting/optimizing g to the data 13/30

Dataset PASCAL VOC 2007, 20 object categories, 5011 training/validation images 4952 test images 24,640 annotated objects 14/30

Bags-of-features Approach 1. Interest point / region detector 2. Descriptors 3. K-means clustering (4000 clusters) 4. Histogram of cluster occurrences (NN assignment) 5. Chi-square distance and RBF kernel for KDA or SVM classifier J. Zhang and M. Marszalek and S. Lazebnik and C. Schmid, Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study, IJCV, 2007 K. E. A. van de Sande, T. Gevers and dc. G. M. Snoek, Evaluation of Color Descriptors for Object and Scene Recognition. CVPR, 2008 15/30

Evaluation PASCAL VOC measures Average precision for every object category Mean average precision Category AP recision prap output=> recall 16/30

Recognition Evaluation www.featurespace.org Protocol, data, binaries i 17/30

Evaluated Features 33 different features types University of Amsterdam TU Vienna CMP Prague ETH Zurich EPFL Lausanne University of Surrey Stanford University Harvard Medical School 18/30

Features Region detectors t (combined with SIFT descriptor) MSER CMP Prague Computed in opponent chromatic space Harris-Laplace, Hessian-Laplace, Surrey UK Improved localization and scale precision Dog (D. Lowe) Sparse color features, TU Vienna Color Harris detector Region descriptors (combined with Harris-Laplace detector) Original Sift (D. Lowe) Color descriptors (histograms, moments, SIFT) - UvA Amsterdam DAISY EPFL Lausanne SURF ETH Zurich Color descriptors - Surrey UK Ordinal SIFT Harvard CHoG Stanford 19/30

MAP Ranking color/gray, density, dimensionality... MAP #dimensions density 20/30

P07/DM Dimensionality MAP #dimensions density Observations Dimensionality not much correlated with the performance 21/30

P07/DS Density MAP Ranking density #dimensions Observations Strongly correlated (the more the better) 22/30

P07/GD Grayvalue descriptors MAP Ranking #dimensions density Observations Color improves All based on histograms of gradient locations and orientations Results biased by density Implementation details matter 23/30

P07/CD Color Descriptors MAP Ranking #dimensions density Observations SIFT still dominates(histograms of gradient locations and orientations) Opponent chromatic space (normalized red-green, blue-yellow, and intensity Y) 24/30

P07/OC per object category Ranked 1st Ranked 2nd Observations No single best feature, different features suitable for different categories Bias from the number of training images 25/30

P07/MK Multi-Kernel e Fusion per object category 26/30

Conclusions Histograms of gradient location-orientation dominate Color brings improvement for most classes opponent chromatic space Feature number the more the better Similar ranking in image matching performance generalizes across applications 27/30