Scene Recognition. Adriana Kovashka, UTCS, PhD student

Problem Statement: Distinguish between different types of scenes. Applications: matching human perception; understanding the environment; indexing of images / video; robotics; graphics; in-painting.

Background. Definition of a scene: "[A] scene is mainly characterized as a place in which we can move" [Oliva 2001]. Assumptions: human categorization. Approaches: parsing of the scene as a whole, or in parts.

Coast [Oliva 2001] Mountain [Oliva 2001]

Inside City [Oliva 2001] Street [Oliva 2001]

Kitchen [Lazebnik 2006] Industrial [Lazebnik 2006]

Scene?

Urban or natural?

Spatial Envelope [Oliva 2001]. Inspiration from human perception. Properties: naturalness, openness, roughness, expansion, ruggedness. Holistic, no recognition of objects. Three levels: cars and people vs. street vs. urban environment.

Spatial Envelope (cont'd). Scene modeling: Discrete Fourier Transform [Oliva 2001]; Windowed Fourier Transform [Oliva 2001]
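To make the two transforms named on this slide concrete, here is a minimal pure-Python sketch that computes the energy spectrum (squared magnitude of the 2-D discrete Fourier transform) of a tiny grayscale image, once globally and once per window. The function names `dft2_energy` and `windowed_energy`, the toy image, and the use of non-overlapping rectangular windows are all assumptions made for illustration; they are not taken from [Oliva 2001], and a practical version would use an FFT library and smoother windows.

```python
import cmath

def dft2_energy(img):
    """Squared magnitude of the 2-D DFT of a list-of-lists grayscale image."""
    h, w = len(img), len(img[0])
    energy = [[0.0] * w for _ in range(h)]
    for fy in range(h):
        for fx in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * cmath.pi * (fy * y / h + fx * x / w)
                    total += img[y][x] * cmath.exp(1j * angle)
            energy[fy][fx] = abs(total) ** 2
    return energy

def windowed_energy(img, win):
    """Energy spectrum of each non-overlapping win x win block.

    A crude stand-in for a windowed Fourier transform (assumption:
    rectangular, non-overlapping windows)."""
    h, w = len(img), len(img[0])
    blocks = {}
    for top in range(0, h - win + 1, win):
        for left in range(0, w - win + 1, win):
            patch = [row[left:left + win] for row in img[top:top + win]]
            blocks[(top, left)] = dft2_energy(patch)
    return blocks

# Toy 4x4 image with vertical stripes: columns alternate 0 and 1.
image = [[0, 1, 0, 1] for _ in range(4)]
spectrum = dft2_energy(image)       # global description
local = windowed_energy(image, 2)   # one spectrum per 2x2 block
```

For the striped toy image, all energy falls in the zero-frequency term and in the horizontal frequency that matches the stripe period. The global spectrum discards where structure occurs in the image, while the windowed version keeps a coarse record of location.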

Spatial Envelope (cont'd). Principal Components Analysis [Oliva 2001]

Spatial Envelope (cont'd) [Oliva 2001]

Spatial Envelope (cont'd). Properties of the spatial envelope. Discriminant spectral template (DST): relates spectral components to properties of the spatial envelope; parameter d learned through matching of feature vectors and property values. Windowed discriminant spectral template (WDST).
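One plausible reading of "parameter d learned through matching of feature vectors and property values" is a linear fit: each image has a spectral feature vector v and a property value s (for example a naturalness score), and d is chosen so that the dot product of v and d approximates s. The sketch below assumes an ordinary least-squares fit solved through the normal equations; the helper names and the toy numbers are hypothetical and the exact procedure in [Oliva 2001] may differ.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_template(features, scores):
    """Least-squares d minimizing the sum of (v . d - s)^2 over training pairs."""
    k = len(features[0])
    gram = [[sum(v[i] * v[j] for v in features) for j in range(k)]
            for i in range(k)]
    rhs = [sum(v[i] * s for v, s in zip(features, scores)) for i in range(k)]
    return solve_linear(gram, rhs)

def project(template, v):
    """Estimated property value for feature vector v."""
    return sum(d * x for d, x in zip(template, v))

# Toy training set (made-up numbers): 2-D feature vectors and property values.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
scores = [2.0, -1.0, 1.0, 3.0]
d = fit_template(features, scores)
estimate = project(d, [3.0, 1.0])
```

In this toy case the scores are exactly linear in the features, so the fit recovers the generating template. With real labels the fit would only be approximate, and the number of training pairs should comfortably exceed the feature dimensionality.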

Spatial Envelope (cont'd) [Oliva 2001]

Spatial Envelope (cont'd) [Oliva 2001]

Spatial Envelope (cont'd). Results: scene properties [Oliva 2001]

Spatial Envelope (cont'd) [Oliva 2001]

Spatial Envelope (cont'd) [Oliva 2001]

Spatial Envelope (cont'd). Classification: K-NN; 4 out of 7 neighbors picked by humans [Oliva 2001]
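As a reminder of how K-NN classification works in a low-dimensional property space, here is a small sketch. It assumes each scene is already described by a short vector of property values. The two-dimensional vectors, the labels, and the function name `knn_classify` are hypothetical, chosen only to illustrate the voting rule.

```python
from collections import Counter

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_classify(query, examples, k=3):
    """Label the query by majority vote among its k nearest examples.

    examples: list of (feature_vector, label) pairs."""
    ranked = sorted(examples, key=lambda ex: squared_distance(query, ex[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Made-up training scenes described by two property values each.
examples = [
    ([0.9, 0.8], "coast"), ([0.8, 0.9], "coast"), ([0.7, 0.7], "coast"),
    ([0.1, 0.2], "street"), ([0.2, 0.1], "street"), ([0.3, 0.3], "street"),
]
label_a = knn_classify([0.75, 0.85], examples, k=3)
label_b = knn_classify([0.15, 0.25], examples, k=3)
```

Using an odd k with two classes avoids tied votes; with more classes a tie-breaking rule would need to be chosen.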

Spatial Envelope (cont'd). Strengths: higher-level descriptions; low dimensionality; invariance to object composition; weak local information. Weaknesses: significant number of human labels.

[Oliva 2001]

Spatial Pyramid [Lazebnik 2006]. Global, locally orderless. Bag-of-features. Extension of the Pyramid Match Kernel in 2-D. Regular clustering of features.
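The idea of extending the pyramid match to 2-D image space can be sketched as follows: at level l the image is divided into a 2^l by 2^l grid, a visual-word histogram is built per cell, and two images are compared by histogram intersection at each level, with finer levels weighted more heavily. The sketch assumes features are already quantized into (x, y, word) triples with coordinates normalized to [0, 1); the function names and toy features are illustrative, and the level weights follow the kernel as described in [Lazebnik 2006].

```python
def pyramid_histograms(features, vocab_size, levels):
    """Build one concatenated histogram per pyramid level.

    features: iterable of (x, y, word) with x, y in [0, 1)
    and word in range(vocab_size)."""
    pyramid = []
    for level in range(levels + 1):
        cells = 2 ** level
        hist = [0] * (cells * cells * vocab_size)
        for x, y, word in features:
            cx = min(int(x * cells), cells - 1)
            cy = min(int(y * cells), cells - 1)
            hist[(cy * cells + cx) * vocab_size + word] += 1
        pyramid.append(hist)
    return pyramid

def intersection(h1, h2):
    """Histogram intersection: number of matched features."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def pyramid_match(p1, p2):
    """Weighted sum of per-level intersections; finer levels count more."""
    top = len(p1) - 1
    score = intersection(p1[0], p2[0]) / 2 ** top
    for level in range(1, top + 1):
        score += intersection(p1[level], p2[level]) / 2 ** (top - level + 1)
    return score

# Two toy images with two quantized features each (vocabulary of 2 words).
image_a = [(0.1, 0.1, 0), (0.9, 0.9, 1)]
image_b = [(0.2, 0.2, 0), (0.1, 0.9, 1)]
pa = pyramid_histograms(image_a, vocab_size=2, levels=1)
pb = pyramid_histograms(image_b, vocab_size=2, levels=1)
similarity = pyramid_match(pa, pb)
```

Both toy images contain the same words, so they match fully at level 0, but only one feature falls in the same quadrant at level 1. The score therefore drops below the self-match score, which is the "locally orderless but globally ordered" behavior the slide refers to.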

Spatial Pyramid (cont'd) [Grauman 2005] as quoted in [Lazebnik 2006] [Lazebnik 2006]

Spatial Pyramid (cont'd) [Lazebnik 2006]

Spatial Pyramid (cont'd) [Lazebnik 2006]

Spatial Pyramid (cont'd). Results: SVM classification. Scene recognition [Lazebnik 2006]; 65.2% for [Fei-Fei 2005].

Spatial Pyramid (cont'd) [Lazebnik 2006]

Spatial Pyramid (cont'd). Object recognition [Lazebnik 2006]

Spatial Pyramid (cont'd). Strengths: reasonable dimensionality; locally orderless; dense representation; robust to failures at individual levels. Weaknesses: no invariance to the composition of the image; not robust to clutter.

Scene Completion [Hays 2007]. http://graphics.cs.cmu.edu/projects/scene-completion/ Pipeline (figure): input image; scene descriptor; image collection; 200 matches; context matching + blending; 20 completions. Hays and Efros, SIGGRAPH 2007

[Oliva 2001]

Topic Models [Fei-Fei 2005]. Bayesian hierarchical model. Intermediate representations. Bag-of-features. 4 ways to extract regions; 2 types of features.

Topic Models (cont'd) [Fei-Fei 2005]

Hierarchical Bayesian text models [Fei-Fei 2005]. Latent Dirichlet Allocation (LDA). Figure: graphical model with plates D and N and nodes c, π, z, w; example class "beach". Fei-Fei et al. ICCV 2005

Topic Models (cont'd). η: distribution of class labels; θ: parameter (estimated by EM); c: class label; π: distribution of themes for image; z: theme; x: patch; β: parameter (estimated by EM) [Fei-Fei 2005]
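The symbols in this legend can be read as a generative story: draw a class label c from η, draw a theme mixture π for the image from a Dirichlet whose parameters depend on c (θ), then for each patch draw a theme z from π and a codeword x from that theme's distribution (β). The sketch below samples one image under that reading. All parameter values are made up for illustration; in the slides θ and β are estimated by EM, which is not shown here, and the function names are hypothetical.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_index(probs, rng):
    """Draw an index from a discrete distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_image(eta, theta, beta, n_patches, rng):
    """Sample a class, a theme mixture, and n_patches (theme, codeword) pairs."""
    c = sample_index(eta, rng)             # class label
    pi = sample_dirichlet(theta[c], rng)   # theme mixture for this image
    themes, patches = [], []
    for _ in range(n_patches):
        z = sample_index(pi, rng)          # theme of this patch
        x = sample_index(beta[z], rng)     # codeword index of this patch
        themes.append(z)
        patches.append(x)
    return c, pi, themes, patches

# Made-up parameters: 2 classes, 3 themes, 4 codewords.
eta = [0.5, 0.5]
theta = [[5.0, 1.0, 1.0], [1.0, 1.0, 5.0]]
beta = [[0.7, 0.1, 0.1, 0.1],
        [0.1, 0.7, 0.1, 0.1],
        [0.1, 0.1, 0.4, 0.4]]
rng = random.Random(0)
c, pi, themes, patches = generate_image(eta, theta, beta, n_patches=10, rng=rng)
```

Learning runs this story in reverse: given only the observed codewords and the class labels of training images, it looks for θ and β that make those observations likely, and classification picks the class under which a new image's patches are most probable.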

Topic Models (cont'd). Codebook: 174 codewords [Fei-Fei 2005]
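A codebook turns continuous patch descriptors into discrete codewords: each descriptor is replaced by the index of its nearest codeword, and an image becomes a histogram of those indices (a bag of features). The sketch below assumes the codebook has already been learned (for example by clustering) and uses a toy codebook of 3 two-dimensional codewords rather than the 174 on the slide; the function names and numbers are illustrative only.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def bag_of_features(descriptors, codebook):
    """Histogram of codeword assignments for one image."""
    hist = [0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_codeword(descriptor, codebook)] += 1
    return hist

# Toy codebook and patch descriptors (made-up values).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
patches = [[0.1, 0.1], [0.9, 0.2], [0.8, -0.1], [0.2, 1.1]]
histogram = bag_of_features(patches, codebook)
```

The resulting histogram records how often each codeword occurs but not where, which is why the topic-model slides list "no geometry" among the weaknesses.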

Topic Models (cont'd) [Fei-Fei 2005]

Topic Models (cont'd). Results [Fei-Fei 2005]

Topic Models (cont'd). Strengths: unsupervised; invariant to composition. Weaknesses: no geometry; matches of themes to categories; no correspondence to semantic categories.

Comparison. Global vs. local: Spatial Envelope, Spatial Pyramid vs. Topic Models. Viewpoint / location biases vs. invariance: Spatial Pyramid vs. Topic Models, Spatial Envelope.

Comparison (cont'd). Intermediate representations: Spatial Envelope, Topic Models. Supervision vs. no supervision: Spatial Envelope vs. Topic Models, Spatial Pyramid. Object recognition?

Discussion. Object recognition vs. scene recognition: global approaches; Spatial Pyramid, scenes vs. objects results; bag-of-features. Use of scene recognition: ambiguous scenes; human recognition of scenes; importance.

References
[Fei-Fei 2005] L. Fei-Fei and P. Perona. A Bayesian Hierarchical Model for Learning Natural Scene Categories. CVPR 2005.
[Grauman 2005] K. Grauman and T. Darrell. The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features. ICCV 2005.
[Hays 2007] J. Hays and A.A. Efros. Scene Completion Using Millions of Photographs. SIGGRAPH 2007.
[Lazebnik 2006] S. Lazebnik, C. Schmid, and J. Ponce. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. CVPR 2006.
[Oliva 2001] A. Oliva and A. Torralba. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope. IJCV 2001.