Graphical Object Models for Detection and Tracking


Graphical Object Models for Detection and Tracking (ls@cs.brown.edu), Department of Computer Science, Brown University. Joint work with: Ying Zhu, Siemens Corporate Research, Princeton, NJ; Dorin Comaniciu, Siemens Corporate Research, Princeton, NJ; Michael J. Black, Department of Computer Science, Brown University

Object Detection and Tracking Vehicle Pedestrian

Why Is Object Detection Hard? Many target objects. Appearance/lighting changes. Partial occlusions. Different orientations (articulations) of the object. Different scales of objects.


Object Detection Is This a Vehicle?


Object Detection: Machine Learning Approach. Features f_1, f_2, ..., f_N are fed to a Classifier, which outputs Vehicle / Not-Vehicle.

Object Detection: Using Pixel Values as Features. A Trainable Pedestrian Detection System ('98), Papageorgiou, Evgeniou, Poggio. Features f_1, f_2, ..., f_N feed an SVM classifier, where f_i is the pixel value at location i in the patch (N ~ 128x64 = 8192). Requires many training examples to learn, and many support vectors.

Object Detection: Feature Selection + Classification Pedestrian Class Examples Background Class Examples Learn Features to Use for Classification and the Classifier itself Classifier

Object Detection: AdaBoost Approach. A wavelet-like, over-complete set of features, with simple weak classifiers whose outputs lie in [-1, 1]; ~40,000 features to choose from.

Object Detection: AdaBoost Approach. Robust Real-time Object Detection ('01), P. Viola and M. Jones. Weak classifiers h_1(f_1), h_2(f_2), ..., h_N(f_N) with weights \alpha_1, \alpha_2, ..., \alpha_N are combined into a strong classifier H(Y) = \sum_{i=1}^{N} \alpha_i h_i(f_i); classify as Vehicle if H(Y) > (1/2) \sum_{i=1}^{N} \alpha_i, otherwise Not-Vehicle. N is much smaller than the number of pixels (~100).
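As a sanity check of the weighted-vote decision rule, here is a minimal Python sketch, assuming Viola-Jones-style decision stumps with outputs in {0, 1}; the features, thresholds, and weights below are invented for illustration, not the learned detector values:

```python
def weak_classifier(feature, threshold):
    """A decision stump: fires (returns 1.0) if the feature exceeds its threshold."""
    return 1.0 if feature > threshold else 0.0

def strong_classify(features, thresholds, alphas):
    """Weighted vote of N weak classifiers: declare 'Vehicle' when the
    weighted sum of weak responses exceeds half the total weight,
    i.e. H(Y) = sum_i alpha_i * h_i(f_i) > 0.5 * sum_i alpha_i."""
    score = sum(a * weak_classifier(f, t)
                for f, t, a in zip(features, thresholds, alphas))
    return score > 0.5 * sum(alphas)
```

With features [5, 3, 10], thresholds [4, 4, 4], and equal weights, two of the three stumps fire, so the weighted vote accepts the patch.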

Object Detection: AdaBoost Approach. Tends to produce many false positives (motion information is needed; Viola & Jones '04). Does not explicitly model object parts or their spatial relationships.

Why Are Parts Useful? Vehicle objects decompose into parts. Parts are easier to model. Parts are robust to appearance changes (due to articulations and lighting). Parts can be reused.

Part-Based Object Detection. Example-Based Object Detection in Images by Components ('01), A. Mohan, C. Papageorgiou, T. Poggio. Object Class Recognition by Unsupervised Scale-Invariant Learning ('03), R. Fergus, P. Perona, A. Zisserman. A Bayesian Approach to Unsupervised One-Shot Learning of Object Categories ('03), L. Fei-Fei, R. Fergus, P. Perona. Human detection based on a probabilistic assembly of robust part detectors ('04), K. Mikolajczyk, C. Schmid, A. Zisserman. Unlike all previous methods: we use a graphical model to represent an object, which results in an elegant inference algorithm; we incorporate temporal constraints; we use supervised learning (unlike Fergus, Perona, Zisserman).

Graphical Object Models. Object Layer: O. Component Layer: C_BL, C_TL, C_TR, C_BR. The object is represented as a 2-layer graphical model. Each part of the object (and the object itself) is a node. Spatial (and temporal) constraints are encoded using conditional distributions.

Graphical Object Models: Modeling Parts. Each part/object has an associated AdaBoost detector. A 3D parameter vector X_i = [x, y, s]^T, defining the position and scale of the part/object in an image, is to be estimated.

Graphical Object Models: Spatio-Temporal Extension. The spatial model can be extended in time: each frame has an object node O_t with component nodes C_TL, C_TR, C_BL, C_BR, and object nodes at consecutive frames (O_{t-1}, O_t, O_{t+1}) are linked by temporal constraints.

Graphical Object Models. The joint distribution of the 2-layer spatio-temporal model can be written:

P(X_0^O, X_0^{C_1}, \ldots, X_0^{C_N}, \ldots, X_T^O, X_T^{C_1}, \ldots, X_T^{C_N} \mid Y_0, \ldots, Y_T) = \frac{1}{Z} \prod_{ij} \psi^{OO}(X_i^O, X_j^O) \prod_{ik} \psi^{O C_k}(X_i^O, X_i^{C_k}) \prod_{ikl} \psi^{C_k C_l}(X_i^{C_k}, X_i^{C_l}) \prod_{i} \phi^{O}(X_i^O, Y_i) \prod_{ik} \phi^{C_k}(X_i^{C_k}, Y_i)

where X_T^O is the state of the object at time T, X_T^{C_1} is the state of component 1 at time T, and Y_T is the image at time T. The \psi^{OO} terms encode temporal constraints between object states, the \psi^{O C_k} terms encode spatial constraints between the object and its components, and the \psi^{C_k C_l} terms encode spatial constraints between components of the object. The \phi^{O} and \phi^{C_k} terms are the image evidence for the object and for each component, respectively.
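To make the factorization concrete, here is a toy numeric sketch in Python: the unnormalized joint is a product of temporal potentials between object states, object-component spatial potentials, and evidence terms. The 1-D states and Gaussian potentials are invented purely for illustration; the actual model uses 3-D [x, y, s] states and learned mixture potentials, and also includes component-component terms, omitted here for brevity:

```python
import math

def gauss(x, mu, sigma):
    """1-D Gaussian density, used here as a stand-in for a learned potential."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def unnormalized_joint(obj_states, comp_states, evidence):
    """Product of factors: obj_states[t] is the object state at frame t,
    comp_states[t][k] is the state of component k at frame t, and
    evidence maps a state to an image-likelihood value (phi)."""
    p = 1.0
    for t in range(len(obj_states)):
        if t > 0:
            p *= gauss(obj_states[t], obj_states[t - 1], 1.0)  # temporal psi(X^O_{t-1}, X^O_t)
        for c in comp_states[t]:
            p *= gauss(c, obj_states[t], 2.0)  # spatial psi(X^O_t, X^{C_k}_t)
            p *= evidence(c)                   # component evidence phi
        p *= evidence(obj_states[t])           # object evidence phi
    return p
```

Configurations where the object moves smoothly and components stay near the object score higher than inconsistent ones, which is exactly the structure that Belief Propagation exploits during inference.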

Inference Algorithm. Inference in such graphical models can be performed using Belief Propagation, but not when the state space is continuous and the messages are non-Gaussian. This forces the use of approximate inference algorithms (PAMPAS / Non-Parametric BP): M. Isard (CVPR '03); E. Sudderth, A. Ihler, W. Freeman, A. Willsky (CVPR '03).

Learning Temporal and Spatial Constraints. Constraints (conditional distributions) are modeled using a Mixture of Gaussians with a single Gaussian outlier process:

\psi_{ij}(X_j \mid X_i) = \lambda_0 N(\mu_{ij}, \Lambda_{ij}) + (1 - \lambda_0) \sum_{m=1}^{M_{ij}} q_{ijm} N(F_{ijm}(X_i), G_{ijm}(X_i))

learned from a set of labeled patterns.
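A hedged 1-D sketch of drawing a sample from such a conditional, assuming two mixture components plus a broad outlier Gaussian; the outlier weight, mean functions F_m, and bandwidths G_m below are invented, not the learned values:

```python
import random

def sample_conditional(x_i, lam0=0.1):
    """Sample X_j from psi(X_j | X_i) =
    lam0 * N(0, 10^2)  (broad outlier process, hypothetical parameters)
    + (1 - lam0) * sum_m q_m * N(F_m(x_i), G_m(x_i))."""
    # Hypothetical mixture components: (weight q_m, mean fn F_m, std fn G_m).
    mixture = [(0.7, lambda x: x + 1.0, lambda x: 0.5),
               (0.3, lambda x: x - 1.0, lambda x: 0.5)]
    if random.random() < lam0:          # outlier process
        return random.gauss(0.0, 10.0)
    r, acc = random.random(), 0.0
    for q, F, G in mixture:             # pick a mixture component
        acc += q
        if r <= acc:
            return random.gauss(F(x_i), G(x_i))
    q, F, G = mixture[-1]               # guard against rounding
    return random.gauss(F(x_i), G(x_i))
```

Repeated sampling from conditionals of this form is how a nonparametric BP implementation propagates a particle set through a learned constraint.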

AdaBoost Image Likelihood. Given a set of labeled patterns, AdaBoost learns a weighted combination of K base classifiers:

H(Y \mid X^i) = \sum_{k=1}^{K} \alpha_k h_k(Y \mid X^i)

The final strong classifier gives the confidence that a patch of the image Y, defined by the state X^i, is of the desired class.

AdaBoost Image Likelihood. We can convert the confidence score H(Y \mid X^i) into a likelihood by:

\phi_i(Y \mid X^i) \propto \exp( H(Y \mid X^i) / T )

where T is a temperature parameter that controls the smoothness of the likelihood function. Note that the image likelihoods are assumed to be independent (not strictly so, due to possible overlap).
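A minimal sketch of this conversion, assuming scalar confidence scores; the score values in the usage note are made up, only the exp(H/T) form comes from the talk:

```python
import math

def adaboost_likelihood(score, temperature=1.0):
    """phi(Y | X) proportional to exp(H(Y | X) / T): a higher temperature T
    flattens the likelihood, a lower one sharpens it around strong detections."""
    return math.exp(score / temperature)
```

At T = 1 a detection scoring 2 is e times as likely as one scoring 1; at T = 10 the ratio shrinks toward 1, which keeps weaker hypotheses alive during early inference iterations.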

Non-Parametric Belief Propagation (PAMPAS). Represent messages and beliefs by a discrete set of weighted samples/kernels (i.e., a Mixture of Gaussians). [Graph: component nodes TL, TR, BL, BR connected to the Obj node.]
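The kernel representation can be sketched as follows: a message (or belief) is a list of weighted Gaussian kernels, and evaluating it just sums the kernels. The particular 1-D kernels below are invented for illustration; the real messages are built over the 3-D [x, y, s] states:

```python
import math

def eval_message(x, kernels):
    """Evaluate a nonparametric message at x.
    kernels: list of (weight, mean, sigma) tuples, weights assumed to sum to 1."""
    total = 0.0
    for w, mu, sigma in kernels:
        total += w * math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return total
```

A two-kernel message peaks near each kernel mean and is low between them, so multimodal hypotheses (e.g., two candidate part locations) survive until the evidence resolves them.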

Non-Parametric Belief Propagation (PAMPAS). The Non-Parametric BP message updates can be computed approximately using Monte Carlo integration. For details, please see: Attractive people: Assembling loose-limbed models using nonparametric belief propagation (NIPS '03), L. Sigal, M. Isard, B. H. Sigelman, M. J. Black; Tracking Loose-limbed People (CVPR '04), L. Sigal, S. Bhatia, S. Roth, M. J. Black, M. Isard.

Preliminary Experiments: Vehicle Detection and Tracking. [Figure: original image alongside the Top-Left, Top-Right, Bottom-Left, Bottom-Right component estimates and the Object estimate.]

Preliminary Experiments: Vehicle Detection and Tracking Frame 12 Frame 52 Frame 32 Part detectors are unreliable

Preliminary Experiments: Vehicle Detection and Tracking

Preliminary Experiments: Pedestrian Detection. Initialization (based on likelihood) for: Head, Left-Upper Body, Right-Upper Body, Lower-Body, Object. [Figures: pedestrian parts/components; Object (GOM+BP); Parts (GOM+BP).]

Conclusions. A new framework that provides a unified approach to object detection and tracking. Tracking can benefit from object detection to resolve transient failures. Object detection can benefit from temporal consistency. Part-based object detection and tracking are formulated using graphical models and solved using approximate BP. We can successfully detect and track two classes of objects (pedestrians and vehicles).

Future Work. Image likelihoods are not really independent (correlations may be explicitly modeled): Distributed Occlusion Reasoning for Tracking with Nonparametric Belief Propagation, E. Sudderth, M. Mandel, W. Freeman, A. Willsky, NIPS '04. Multi-target detection: currently we can detect multiple targets only by exclusion (one target at a time). Unsupervised / semi-supervised learning of Graphical Object Models.

Thank you!!!