Visual Object Detection
1 Visual Object Detection
Ying Wu, Electrical Engineering and Computer Science, Northwestern University, Evanston, IL
1 / 47
2 Visual Object Detection
- Detecting an object in an image
  - output: location and size of all instances of this object class
- Challenges
  - what is an object?
  - how to describe the object?
  - how likely is an image to be an object of interest?
  - how to handle scale changes?
  - how to handle the orientation of the target?
  - how to handle all sorts of visual variability?
3 Outline
- Basics in Detection Theory
- Boosting-based Detection
- Feature Template-based Detection
- Deformable Parts Model (DPM) based Detection
- Deep Network based Detection
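4 Action and Risk
- Classes: $\{\omega_1, \omega_2, \ldots, \omega_c\}$
- Actions: $\{\alpha_1, \alpha_2, \ldots, \alpha_a\}$
- Loss: $\lambda(\alpha_k \mid \omega_i)$
- Conditional risk: $R(\alpha_k \mid x) = \sum_{i=1}^{c} \lambda(\alpha_k \mid \omega_i)\, p(\omega_i \mid x)$
- Decision function: $\alpha(x)$ specifies a decision rule.
- Overall risk: $R = \int_x R(\alpha(x) \mid x)\, p(x)\, dx$
  - It is the expected loss associated with a given decision rule.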
5 Bayesian Decision and Bayesian Risk
- Bayesian decision: $\alpha^* = \arg\min_k R(\alpha_k \mid x)$
  - This leads to the minimum overall risk. (why?)
- Bayesian risk: the minimum overall risk, $R^* = \int_x R(\alpha^* \mid x)\, p(x)\, dx$
  - Bayesian risk is the best one can achieve.
6 Example: Minimum-error-rate classification
- Let's take a specific example of Bayesian decision
  - In classification problems, action $\alpha_k$ corresponds to deciding $\omega_k$
  - Let's define a zero-one loss function: $\lambda(\alpha_k \mid \omega_i) = 0$ if $i = k$ and $1$ if $i \neq k$, for $i, k = 1, \ldots, c$
  - This means: no loss for correct decisions, and all errors are equal
- It is easy to see that the conditional risk is the error rate: $R(\alpha_k \mid x) = \sum_{i \neq k} P(\omega_i \mid x) = 1 - P(\omega_k \mid x)$
- The Bayesian decision rule gives minimum-error-rate classification: decide $\omega_k$ if $P(\omega_k \mid x) > P(\omega_i \mid x)$ for all $i \neq k$
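As a minimal sketch of the risk definitions above, the conditional risk and the Bayesian decision can be computed directly from a loss table and a posterior vector. The function names, the posterior values, and the asymmetric loss table are made up for illustration and are not from the slides.

```python
def conditional_risks(loss, posteriors):
    """R(alpha_k | x) = sum_i loss[k][i] * P(omega_i | x), one value per action k."""
    return [sum(l_ki * p_i for l_ki, p_i in zip(row, posteriors)) for row in loss]


def bayes_decision(loss, posteriors):
    """Index k of the action with the smallest conditional risk."""
    risks = conditional_risks(loss, posteriors)
    return min(range(len(risks)), key=risks.__getitem__)


# Hypothetical posteriors P(omega_i | x) for three classes.
posteriors = [0.2, 0.5, 0.3]

# Zero-one loss: the risk of action k is 1 - P(omega_k | x).
zero_one = [[0 if i == k else 1 for i in range(3)] for k in range(3)]

# Hypothetical asymmetric loss: failing to decide class 0 when it is true costs 10.
asymmetric = [[0, 1, 1],
              [10, 0, 1],
              [10, 1, 0]]

print(conditional_risks(zero_one, posteriors), bayes_decision(zero_one, posteriors))
print(conditional_risks(asymmetric, posteriors), bayes_decision(asymmetric, posteriors))
```

With the zero-one loss the decision is simply the class with the largest posterior. With the asymmetric table the decision moves to class 0 even though its posterior is the smallest, because the loss of missing it dominates the risk.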
7 Classifier and Discriminant Functions
- Discriminant function: $g_i(x)$, $i = 1, \ldots, c$, assigns $\omega_i$ to $x$
- Classifier: $x \to \omega_i$ if $g_i(x) > g_j(x)$ for all $j \neq i$
- Examples:
  - $g_i(x) = P(\omega_i \mid x)$
  - $g_i(x) = P(x \mid \omega_i)\, P(\omega_i)$
  - $g_i(x) = \ln P(x \mid \omega_i) + \ln P(\omega_i)$
- Note: the choice of discriminant function is not unique, but different choices may give equivalent classification results.
- Decision region: the partition of the feature space, $x \in R_i$ if $g_i(x) > g_j(x)$ for all $j \neq i$
- Decision boundary:
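The note that different discriminant functions can give the same classification can be checked numerically. The sketch below evaluates the three example discriminants on made-up likelihoods and priors (assumed values, not from the slides) and confirms they select the same class.

```python
import math

# Hypothetical class-conditional likelihoods p(x | omega_i) and priors P(omega_i).
likelihoods = [0.05, 0.20, 0.10]
priors = [0.6, 0.3, 0.1]

evidence = sum(l * p for l, p in zip(likelihoods, priors))

# Three discriminant functions that differ only by a monotonic transformation.
g_posterior = [l * p / evidence for l, p in zip(likelihoods, priors)]
g_joint = [l * p for l, p in zip(likelihoods, priors)]
g_log = [math.log(l) + math.log(p) for l, p in zip(likelihoods, priors)]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


decisions = [argmax(g_posterior), argmax(g_joint), argmax(g_log)]
print(decisions)
```

Dividing by the evidence and taking the logarithm are both monotonic, so the ordering of the classes is unchanged.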
8 Miss Detections vs. False Positives
- errors: miss detections AND false positives
- No free lunch!
9 Visual Detection: Three Key Issues
- Target representation
  - Rule-based models
  - Shape template-based models
  - Image appearance-based models
  - Visual feature-based models
- Pattern classification
  - various choices of classifiers
  - training
- Effective search
  - determining the location: scanning all pixel locations
  - determining the scale: scanning the scale space
  - how to make the search faster?
10 Outline
- Basics in Detection Theory
- Boosting-based Detection
- Feature Template-based Detection
- Deformable Parts Model (DPM) based Detection
- Deep Network based Detection
11 Example: front-view face detection
- locate the faces in an image
- challenges: large variations in visual appearance due to:
  - scale and/or rotation
  - illumination
  - facial expression
  - partial occlusion
12 Viola-Jones Detector
- Feature
  - Simple Haar wavelet features
- Classifier
  - AdaBoost feature selection
- Smart ideas to speed things up
  - integral image
  - cascading classifiers
13 Feature: Haar-like wavelet
- A bank of Haar-like wavelet filters
- Applying a filter to a pixel location produces a feature
- How many such features does a detection window generate?
- How to compute such features rapidly?
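14 A Smart Idea: Integral Image
- The value of the integral image at $(x, y)$ is the sum of all the pixels above and to the left: $II(x, y) = \sum_{u \le x,\, v \le y} I(u, v)$
- This is done only once for an image
- Then the sum of all pixels within any rectangular region can be computed in constant time
- Ex: the sum within a rectangle D is obtained from four integral-image lookups at its corners: bottom-right plus top-left, minus top-right and bottom-left (with the top and left corners taken just outside D)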
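The integral image and the four-lookup rectangle sum can be sketched in a few lines. This is an illustrative implementation, not the one used in the lecture: it pads the table with a leading row and column of zeros so that no boundary checks are needed, and the two-rectangle feature (left half minus right half) is one assumed sign convention for a Haar-like feature.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y+1][x+1] is the sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four lookups (constant time)."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect(ii, top, left, height, width):
    """Hypothetical two-rectangle feature: left half minus right half."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ii = integral_image(image)
print(rect_sum(ii, 1, 1, 2, 2))       # bottom-right 2 x 2 block
print(haar_two_rect(ii, 0, 0, 3, 2))  # first column minus second column
```

Because every rectangle sum costs four lookups regardless of its size, a two-rectangle feature costs a fixed handful of lookups at any scale.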
15 Weak Classifier
- Weak features and weak classifiers
  - a weak classifier uses only one simple feature for classification: $h_j(x) = 1$ if $p_j f_j(x) < p_j \theta_j$, and $0$ otherwise
  - a weak classifier is the triple $(f_j, \theta_j, p_j)$: feature, threshold, and parity
- Why not combine multiple weak classifiers?
16 AdaBoost for Feature Selection
17 Feature Selection and Combination
- Strong classifier: $h(x) = 1$ if $\sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t$, and $0$ otherwise
- Does the selection make sense?
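The weak and strong classifiers can be evaluated as follows. This is a sketch of the decision rules only (training is not shown); the stump parameters and feature values are invented for illustration.

```python
def weak_classify(feature_value, theta, parity):
    """h_j(x) = 1 if p_j * f_j(x) < p_j * theta_j, else 0."""
    return 1 if parity * feature_value < parity * theta else 0


def strong_classify(feature_values, stumps):
    """Weighted vote of weak classifiers against half of the total weight.

    stumps: list of (feature_index, theta, parity, alpha) tuples.
    """
    votes = sum(alpha * weak_classify(feature_values[j], theta, parity)
                for j, theta, parity, alpha in stumps)
    total = sum(alpha for _, _, _, alpha in stumps)
    return 1 if votes >= 0.5 * total else 0


# Hypothetical selected features: (feature index, threshold, parity, weight).
stumps = [(0, 0.5, 1, 1.0), (1, 0.2, -1, 0.5), (2, 0.0, 1, 0.3)]

print(strong_classify([0.1, 0.9, 1.0], stumps))  # two stumps fire, weight 1.5 of 1.8
print(strong_classify([0.9, 0.9, 1.0], stumps))  # one stump fires, weight 0.5 of 1.8
```

The parity flips the direction of the inequality, so a single threshold can express either "feature below threshold" or "feature above threshold".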
18 Speeding Up: Attentional Cascade
- Motivation
  - most detection windows contain non-faces
  - thus most computation is wasted
- Idea?
  - can we save computation on non-faces?
  - early rejection?
  - using simple classifiers for screening.
- Does the selection make sense?
19 Designing Cascade
- Design parameters
  - # of cascade stages
  - # of features for each stage
  - parameters of each stage
- Example: a 32-stage classifier
  - S1: 2-feature, detects 100% of faces and rejects 60% of non-faces
  - S2: 5-feature, detects 100% of faces and rejects 80% of non-faces
  - S3-5: 20-feature
  - S6-7: 50-feature
  - S8-12: 100-feature
  - S13-32: 200-feature
- Designing a good cascade needs tremendous engineering effort
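Early rejection in a cascade can be sketched as a loop that stops at the first stage that says "non-face". The stage tests and window statistics below are hypothetical placeholders for boosted stage classifiers. The surviving-fraction helper assumes each quoted rejection rate applies to the non-face windows that reach that stage, which the slide does not state explicitly.

```python
def cascade_classify(window, stages):
    """Run stages in order; return (accepted, number_of_stages_evaluated)."""
    for index, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, index  # rejected early, later stages are skipped
    return True, len(stages)


def surviving_fraction(rejection_rates):
    """Fraction of non-face windows still alive after the given stages."""
    fraction = 1.0
    for rate in rejection_rates:
        fraction *= 1.0 - rate
    return fraction


# Hypothetical stage tests on precomputed window statistics.
stages = [
    lambda w: w["mean"] > 0.2,
    lambda w: w["contrast"] > 0.5,
    lambda w: w["symmetry"] > 0.8,
]

face_like = {"mean": 0.6, "contrast": 0.7, "symmetry": 0.9}
background = {"mean": 0.1, "contrast": 0.9, "symmetry": 0.9}

print(cascade_classify(face_like, stages))
print(cascade_classify(background, stages))
print(surviving_fraction([0.6, 0.8]))  # after S1 and S2 of the example
```

Under that assumption, only about 8% of non-face windows reach the third stage, so the expensive later stages run on a small share of the image.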
20 Cascade Performance
- A 200-feature classifier vs. a 10-stage 20-feature cascade
- Similar accuracy, but the cascade is 10 times faster
21 Training Images
- Data collection: positive data and negative data
- Validation set
22 Results
23 Summary
- Advantages
  - simple: easy to implement
  - fast: real-time performance
- Limitations and open problems
  - difficult to design the cascade
  - cannot handle out-of-plane rotation
  - difficult to handle partial occlusion
24 Outline
- Basics in Detection Theory
- Boosting-based Detection
- Feature Template-based Detection
- Deformable Parts Model (DPM) based Detection
- Deep Network based Detection
25 Viola-Jones Detector for Pedestrian Detection
- use 45,000 possible features
- OK results, but still far from satisfactory
26 From Face to Pedestrian Detection
- articulated poses
- various views
- unpredictable clothing
27 Histogram of Gradient Orientations
- Binning of the gradient orientations within a cell
  - quantization of the orientations of the image gradient
  - weighted by the magnitude of the gradient (so not a histogram in the strict counting sense)
- Spatial combination (R-HoG and C-HoG) to form a block
  - the purpose is to normalize the local histograms within the block
  - this leads to a normalized descriptor
- A HoG descriptor represents a block
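A cell histogram and block normalization can be sketched as below. This is a simplified illustration under stated assumptions: each gradient votes into a single bin (no interpolation between neighboring bins), orientations are unsigned in $[0, \pi)$ with 9 bins, and the block is normalized by its L2 norm with a small constant `eps` to avoid division by zero. All function names are hypothetical.

```python
import math


def cell_histogram(gradients, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations in [0, pi)."""
    hist = [0.0] * bins
    bin_width = math.pi / bins
    for gx, gy in gradients:
        magnitude = math.hypot(gx, gy)
        angle = math.atan2(gy, gx) % math.pi  # fold opposite directions together
        index = min(int(angle / bin_width), bins - 1)
        hist[index] += magnitude  # vote with the magnitude, not with a count of 1
    return hist


def normalize_block(cell_histograms, eps=1e-6):
    """Concatenate the cell histograms of a block and L2-normalize the result."""
    vector = [v for hist in cell_histograms for v in hist]
    norm = math.sqrt(sum(v * v for v in vector) + eps * eps)
    return [v / norm for v in vector]


# Hypothetical (gx, gy) gradients for a few pixels of one cell.
gradients = [(1, 0), (0, 2), (-3, 0), (1, 1)]
hist = cell_histogram(gradients)
block = normalize_block([hist, hist, hist, hist])  # 2 x 2 cells, 36 values
print(hist)
print(len(block))
```

The gradients (1, 0) and (-3, 0) land in the same bin because the orientation is unsigned, and they contribute their magnitudes 1 and 3 rather than two unit votes.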
28 HoG Feature
- HoG descriptor dimension (a 36-D vector)
  - using 9 bins for the orientation quantization ($[0, \pi)$)
  - cell size: 8 × 8 pixels, and block size: 2 × 2 cells
- HoG-based human representation (an array of HoG vectors)
  - detection window size: 64 × 128 pixels
  - stride (i.e., block overlap): half of the block size (8 pixels)
  - a 7 × 15 array of blocks, i.e., a 3,780-D vector
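The descriptor sizes quoted on this slide follow from simple arithmetic. The sketch below assumes a 64 × 128 pixel detection window, which is the size consistent with a 7 × 15 array of 16-pixel blocks at an 8-pixel stride; the function name is hypothetical.

```python
def hog_descriptor_shape(window_w, window_h, cell=8, block_cells=2, bins=9, stride=8):
    """Return (blocks_x, blocks_y, block_dim, total_dim) for a dense HoG layout."""
    block_px = cell * block_cells                   # 16 pixels per block side
    blocks_x = (window_w - block_px) // stride + 1  # block positions across
    blocks_y = (window_h - block_px) // stride + 1  # block positions down
    block_dim = block_cells * block_cells * bins    # 4 cells x 9 bins
    return blocks_x, blocks_y, block_dim, blocks_x * blocks_y * block_dim


print(hog_descriptor_shape(64, 128))
```

Each block contributes 4 × 9 = 36 values, and 7 × 15 = 105 overlapping blocks give 105 × 36 = 3,780 values in total.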
29 HoG + Linear SVM
- (a) average gradient image over the training samples
- (b) each pixel is the max positive SVM weight for the block
- (c) each pixel is the max negative SVM weight for the block
30 HoG + Linear SVM
- (d) a test image
- (e) its R-HoG features (7 × 15 × 36)
- (f) the descriptor weighted by the positive SVM weights
- (g) the descriptor weighted by the negative SVM weights
31 Examples: before clustering
- Search over scale (scaling factor 1.05)
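Searching over scale with a factor of 1.05 means shrinking the image step by step until the detection window no longer fits. The sketch below lists the scale levels for a given image size; the 64 × 128 window default and the stopping rule are assumptions made for illustration.

```python
def pyramid_scales(image_w, image_h, window_w=64, window_h=128, factor=1.05):
    """Scales at which the downscaled image still contains the detection window."""
    scales = []
    scale = 1.0
    while image_w / scale >= window_w and image_h / scale >= window_h:
        scales.append(scale)
        scale *= factor
    return scales


scales = pyramid_scales(128, 256)
print(len(scales), scales[-1])
```

A small factor such as 1.05 gives a dense sampling of scales, which is one reason a single object typically triggers many overlapping detections before clustering.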
32 Examples
- Clustering needs to be performed to (1) group multiple detections, (2) reduce false positives
33 Outline
- Basics in Detection Theory
- Boosting-based Detection
- Feature Template-based Detection
- Deformable Parts Model (DPM) based Detection
- Deep Network based Detection
34 Deformable Parts Model
- Large variations in visual appearance challenge object detection
- Such variations are induced by:
  - deformation of the target's shape
  - structural composition
  - large appearance changes
  - view changes
  - etc.
- These variations may be tremendous
- Rigid templates and single deformable models are not able to capture such huge variations
- Part-based deformable models
  - modeling the structural composition and variations
  - strong expressive power
  - sharing computation
  - a rich model
35 Model: features and filters
- Scale-space image representation (or image pyramid)
  - $p = (x, y, l)$ is a position $(x, y)$ in the $l$-th level of the pyramid
  - $H(p)$ is the raw visual feature pyramid (a tensor)
- Visual features: $\phi(H, p, w, h)$
  - located at $p$
  - supported by the $w \times h$ subwindow (whose top-left corner is $p$)
  - using $H$ as the raw feature
  - represented as a vector by stacking the features in the subwindow
  - denoted by $\phi(p)$ for short; $w$ and $h$ are predefined parameters
  - $\phi(p)$ are the visual observations
- Filters: $F$
  - a 2D filter of size $w \times h$
  - represented as a 1D vector by stacking the elements
  - concept: weighting the features in the $w \times h$ subwindow
  - filter response: $\langle F, \phi(p) \rangle$
  - $F$ will be learned
- Existence of an object
  - an object is encoded by a filter
  - the response of such a filter at $p$ indicates how likely it is that this object exists at $p$
36 Model: configuration and springs
- The DPM model of an object with $n$ parts is defined by $(F_0, P_1, \ldots, P_n, b)$
  - a root filter $F_0$: covers an entire object at lower resolution
  - $n$ fine part filters $P_i$: cover smaller parts at higher resolution
  - $b$ is a real-valued bias term
- A part filter: $P_i = (F_i, v_i, d_i)$
  - $F_i$ is the filter for the $i$-th part
  - $v_i$ is a 2D vector: an anchor position for this part w.r.t. the root
  - $d_i$ is a 4D vector: coefficients for the deformation cost
  - displacement: $[dx_i, dy_i] = [x_i, y_i] - (2[x_0, y_0] + v_i)$
  - define $\phi_d(p_i, p_0) = \phi_d(dx_i, dy_i) = [dx_i, dy_i, dx_i^2, dy_i^2]$
- Configuration and spring
  - Configuration: $(p_0, p_1, p_2, \ldots, p_n)$
  - Spring: the strength of a spring is $d_i$
- A star topology
  - every part is connected to the root
  - no connection among parts
- The configuration is to be inferred (or estimated) for each image
- The springs are to be learned from all training images
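The displacement and deformation features can be computed directly from the definitions on this slide. The sketch below uses made-up positions, anchor, and spring coefficients; the function names are hypothetical.

```python
def deformation_features(part_xy, root_xy, anchor):
    """phi_d = [dx, dy, dx^2, dy^2] with [dx, dy] = part - (2 * root + anchor)."""
    dx = part_xy[0] - (2 * root_xy[0] + anchor[0])
    dy = part_xy[1] - (2 * root_xy[1] + anchor[1])
    return [dx, dy, dx * dx, dy * dy]


def deformation_cost(d, phi_d):
    """Spring cost <d_i, phi_d> for a part displaced from its anchor."""
    return sum(c * f for c, f in zip(d, phi_d))


# Hypothetical placement: root at (4, 3), anchor (2, 1), part found at (11, 7).
phi_d = deformation_features((11, 7), (4, 3), (2, 1))
print(phi_d)
print(deformation_cost([0, 0, 1, 1], phi_d))  # purely quadratic spring
```

The factor 2 on the root position reflects that part filters live at twice the resolution of the root filter, so the root coordinates are doubled before the anchor is added.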
37 Model: parameters and the linear model
- Model parameters: $\Lambda = (F_0, F_1, \ldots, F_n, d_1, \ldots, d_n, b)$
- Model observations (evidence): $Y = \phi(p)$
- Model target variable: $X = p_0$, the location of the root
- Model latent variable: $Z = (p_1, \ldots, p_n)$, the part locations
- To evaluate a complete hypothesis $(X, Z)$:
  $s(X, Z \mid Y) = s(p_0, p_1, \ldots, p_n) = \sum_{i=0}^{n} \langle F_i, \phi(p_i) \rangle - \sum_{i=1}^{n} \langle d_i, \phi_d(p_i, p_0) \rangle + b$
- Another way, a linear form: $s(X, Z \mid Y) = \langle \beta, \psi(X, Z) \rangle$, where
  - $\beta = [F_0, \ldots, F_n, d_1, \ldots, d_n, b]$
  - $\psi(X, Z) = [\phi(p_0), \ldots, \phi(p_n), -\phi_d(p_1, p_0), \ldots, -\phi_d(p_n, p_0), 1]$
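The two ways of writing the score can be checked against each other on a toy model with one part. All numbers below are invented, and the feature vectors are kept to two dimensions for readability. The deformation features enter the stacked vector with a negative sign so that the spring cost is subtracted from the filter responses.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical model with a root filter and one part filter.
F = [[1.0, 2.0], [0.5, -1.0]]  # F_0, F_1
d = [[0.0, 0.0, 1.0, 1.0]]     # d_1
b = -0.5

# Hypothetical observations for one configuration (p_0, p_1).
phi = [[3.0, 1.0], [2.0, 2.0]]   # phi(p_0), phi(p_1)
phi_d = [[1.0, 0.0, 1.0, 0.0]]   # phi_d(p_1, p_0)

# Sum form: filter responses minus deformation costs plus bias.
score_sum = (sum(dot(Fi, pi) for Fi, pi in zip(F, phi))
             - sum(dot(di, pdi) for di, pdi in zip(d, phi_d))
             + b)

# Linear form: one inner product between stacked parameters and stacked features.
beta = [v for Fi in F for v in Fi] + [v for di in d for v in di] + [b]
psi = ([v for pi in phi for v in pi]
       + [-v for pdi in phi_d for v in pdi]
       + [1.0])
score_linear = dot(beta, psi)

print(score_sum, score_linear)
```

Writing the score as a single inner product is what allows the model to be trained with a linear classifier once the part locations are fixed.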
38 Inference
- The objective of inference is to find $s(p_0)$: $s(p_0 \mid Y) = \max_{(p_1, \ldots, p_n)} s(p_0, p_1, \ldots, p_n \mid Y)$
- For each part, define $D_i(p_0) = \max_{p_i} \{ F_i^T \phi(p_i) - d_i^T \phi_d(p_i, p_0) \}$
- Then, it is easy to see: $s(p_0) = F_0^T \phi(p_0) + \sum_{i=1}^{n} D_i(p_0) + b$
- $D_i(p_0)$ is the maximum contribution of the $i$-th part to the score of the root at $p_0$ (i.e., the optimal subpath property)
- This is a very simple dynamic programming problem
- The part localization is done via backtracking in the DP: $p_i^* = \arg\max_{p_i} \{ F_i^T \phi(p_i) - d_i^T \phi_d(p_i, p_0) \}$
39 Inference: Computing $D_i(p_0)$
- The key in DPM inference is to compute $D_i(p_0) = \max_{p_i} \{ F_i^T \phi(p_i) - d_i^T \phi_d(p_i, p_0) \}$
- The first term, $F_i^T \phi(p_i)$
  - the response map of the part filter $F_i$
  - independent of the root location $p_0$
  - easy to compute
- The second term, $d_i^T \phi_d(p_i, p_0)$
  - the penalty on the placement of $p_i$ for a given root position $p_0$
  - easy to compute as well
- The major issue is the maximization over $p_i$
  - exhaustively considering all possible choices of $p_i$, although linear, wastes a lot of computation
- This is implemented via a generalized distance transform
  - this leads to a transformed response map (transforming the part filter response map $F_i^T \phi(p_i)$ into $D_i(p_0)$)
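To make the definition of $D_i(p_0)$ concrete, the sketch below computes it by brute force for one root location, scanning every part placement in a small response map. This is the exhaustive search that the slide says is wasteful, not the generalized distance transform; it is shown only because it follows the formula literally. The response values and spring coefficients are made up.

```python
def part_contribution(response, root_xy, anchor, d):
    """Brute-force D_i(p_0): best (score, (x, y)) over all part placements.

    response[y][x] is the part filter response F_i^T phi(p_i) at (x, y).
    """
    x0, y0 = root_xy
    best = None
    for y, row in enumerate(response):
        for x, r in enumerate(row):
            dx = x - (2 * x0 + anchor[0])
            dy = y - (2 * y0 + anchor[1])
            penalty = d[0] * dx + d[1] * dy + d[2] * dx * dx + d[3] * dy * dy
            score = r - penalty
            if best is None or score > best[0]:
                best = (score, (x, y))  # keep the argmax for backtracking
    return best


# Hypothetical part response map with a weak peak at the anchor and a strong one nearby.
response = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 5]]

print(part_contribution(response, (0, 0), (1, 1), (0, 0, 1, 1)))  # soft spring
print(part_contribution(response, (0, 0), (1, 1), (0, 0, 3, 3)))  # stiff spring
```

With a soft spring the part moves to the strong response one step away from its anchor; with a stiff spring the displacement penalty outweighs the gain and the part stays at the anchor. The returned location is what backtracking would report as the part position.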
40 Inference: process
41 Learning: Latent SVM
- Consider a classifier (discriminant function) of the following form: $f_\beta(x) = \max_{z \in Z(x)} \beta^T \Phi(x, z)$
  - where $\beta$ is a vector of the model parameters, and $z$ are the latent values
  - the set $Z(x)$ defines the domain of $z$ given an $x$
  - classification is obtained based on the sign of $f_\beta(x)$
- Given training data $D = ((x_1, y_1), \ldots, (x_n, y_n))$, where $y_i \in \{-1, 1\}$, minimize the following objective function: $L_D(\beta) = \frac{1}{2} \|\beta\|^2 + C \sum_{i=1}^{n} \max(0, 1 - y_i f_\beta(x_i))$
  - where $\max(0, 1 - y_i f_\beta(x_i))$ is the standard hinge loss
  - $C$ controls the regularization
- Note: if $|Z(x_i)| = 1$, then it degenerates to a linear SVM.
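The latent SVM score and objective can be evaluated on a toy data set as follows. In this sketch each example is represented by the list of its candidate feature vectors $\Phi(x, z)$, one per latent value $z$; that representation, the data, and the parameter values are assumptions made for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def latent_score(beta, candidates):
    """f_beta(x) = max over latent values z of beta^T Phi(x, z)."""
    return max(dot(beta, phi) for phi in candidates)


def lsvm_objective(beta, data, C):
    """0.5 * ||beta||^2 + C * sum of hinge losses on the latent scores."""
    regularizer = 0.5 * dot(beta, beta)
    hinge = sum(max(0.0, 1.0 - y * latent_score(beta, candidates))
                for candidates, y in data)
    return regularizer + C * hinge


# Hypothetical data: (candidate feature vectors, label) with two latent values each.
data = [
    ([[2.0, 0.0], [0.0, 1.0]], 1),
    ([[0.5, 0.0], [0.0, 2.0]], -1),
]
beta = [1.0, -1.0]

print(latent_score(beta, data[0][0]), latent_score(beta, data[1][0]))
print(lsvm_objective(beta, data, C=2.0))
```

If every example had a single candidate, the maximum would disappear and the objective would be that of an ordinary linear SVM, which is the degenerate case noted on the slide.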
42 Learning: Solving Latent SVM
- Denote by $Z_p$ the latent values for the positive training samples
  - for a positive example, set $Z(x_i) = \{z_i\}$, where $z_i$ is the latent value specified for $x_i$ by $Z_p$
- Define an auxiliary objective function $L_D(\beta, Z_p) = L_{D(Z_p)}(\beta)$
- Property: $L_D(\beta) = \min_{Z_p} L_D(\beta, Z_p)$, i.e., $L_D(\beta, Z_p)$ upper-bounds the LSVM objective. Now we minimize $L_D(\beta, Z_p)$ instead
- Minimizing $L_D(\beta, Z_p)$
  - Relabeling positive examples: optimize $L_D(\beta, Z_p)$ over $Z_p$ by selecting the highest-scoring latent value for each positive example: $z_i = \arg\max_{z \in Z(x_i)} \beta^T \Phi(x_i, z)$
  - Estimating $\beta$: optimize $L_D(\beta, Z_p)$ over $\beta$ by solving the convex optimization problem defined by $L_{D(Z_p)}(\beta)$:
    $L_D(\beta, Z_p) = \frac{1}{2} \|\beta\|^2 + C \sum_{\{x_i \mid y_i = 1\}} \max(0, 1 - \beta^T \Phi(x_i, z_i)) + C \sum_{\{x_i \mid y_i = -1\}} \max(0, 1 + f_\beta(x_i))$
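The relabeling step and the bounding property can be illustrated on the same kind of toy data. In this sketch the latent value of each positive example is fixed through a dictionary `z_p` (a hypothetical representation), while negative examples keep the maximum over their latent values. The step that re-estimates $\beta$ is not shown.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def relabel_positives(beta, data):
    """For each positive example, pick the index of its highest-scoring latent value."""
    z_p = {}
    for i, (candidates, y) in enumerate(data):
        if y == 1:
            scores = [dot(beta, phi) for phi in candidates]
            z_p[i] = scores.index(max(scores))
    return z_p


def auxiliary_objective(beta, data, z_p, C):
    """LSVM objective with the latent values of the positive examples fixed to z_p."""
    total = 0.5 * dot(beta, beta)
    for i, (candidates, y) in enumerate(data):
        if y == 1:
            score = dot(beta, candidates[z_p[i]])               # fixed latent value
        else:
            score = max(dot(beta, phi) for phi in candidates)  # still maximized
        total += C * max(0.0, 1.0 - y * score)
    return total


# Hypothetical data: (candidate feature vectors, label).
data = [
    ([[2.0, 0.0], [0.0, 1.0]], 1),
    ([[0.5, 0.0], [0.0, 2.0]], -1),
]
beta = [1.0, -1.0]

best = relabel_positives(beta, data)
print(best, auxiliary_objective(beta, data, best, C=2.0))
print(auxiliary_objective(beta, data, {0: 1}, C=2.0))  # a worse choice of latent value
```

Any fixed choice of latent values gives an objective at least as large as the relabeled one, which is why alternating between relabeling and re-estimating $\beta$ never increases the auxiliary objective.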
43 Some DPM models
44 DPM on PASCAL 2007 Dataset
45 Outline
- Basics in Detection Theory
- Boosting-based Detection
- Feature Template-based Detection
- Deformable Parts Model (DPM) based Detection
- Deep Network based Detection
46 Rowley-Baluja-Kanade's Detector
- Train a multilayer neural network (1998)
- Receptive fields
- An early attempt at using neural networks for face detection
- A tremendous number of deep networks are used for face detection nowadays
47 Some Results
More informationLinear Models for Classification
Linear Models for Classification Henrik I Christensen Robotics & Intelligent Machines @ GT Georgia Institute of Technology, Atlanta, GA 30332-0280 hic@cc.gatech.edu Henrik I Christensen (RIM@GT) Linear
More informationExample: Face Detection
Announcements HW1 returned New attendance policy Face Recognition: Dimensionality Reduction On time: 1 point Five minutes or more late: 0.5 points Absent: 0 points Biometrics CSE 190 Lecture 14 CSE190,
More informationMachine Learning 2017
Machine Learning 2017 Volker Roth Department of Mathematics & Computer Science University of Basel 21st March 2017 Volker Roth (University of Basel) Machine Learning 2017 21st March 2017 1 / 41 Section
More informationLinear Classifiers as Pattern Detectors
Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2014/2015 Lesson 16 8 April 2015 Contents Linear Classifiers as Pattern Detectors Notation...2 Linear
More information> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE GRAVIS 2016 BASEL. Logistic Regression. Pattern Recognition 2016 Sandro Schönborn University of Basel
Logistic Regression Pattern Recognition 2016 Sandro Schönborn University of Basel Two Worlds: Probabilistic & Algorithmic We have seen two conceptual approaches to classification: data class density estimation
More informationHow to do backpropagation in a brain
How to do backpropagation in a brain Geoffrey Hinton Canadian Institute for Advanced Research & University of Toronto & Google Inc. Prelude I will start with three slides explaining a popular type of deep
More informationTDT4173 Machine Learning
TDT4173 Machine Learning Lecture 3 Bagging & Boosting + SVMs Norwegian University of Science and Technology Helge Langseth IT-VEST 310 helgel@idi.ntnu.no 1 TDT4173 Machine Learning Outline 1 Ensemble-methods
More informationCS4495/6495 Introduction to Computer Vision. 8C-L3 Support Vector Machines
CS4495/6495 Introduction to Computer Vision 8C-L3 Support Vector Machines Discriminative classifiers Discriminative classifiers find a division (surface) in feature space that separates the classes Several
More informationLearning features by contrasting natural images with noise
Learning features by contrasting natural images with noise Michael Gutmann 1 and Aapo Hyvärinen 12 1 Dept. of Computer Science and HIIT, University of Helsinki, P.O. Box 68, FIN-00014 University of Helsinki,
More informationPerformance Evaluation and Comparison
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation
More informationIntroduction to Statistical Inference
Structural Health Monitoring Using Statistical Pattern Recognition Introduction to Statistical Inference Presented by Charles R. Farrar, Ph.D., P.E. Outline Introduce statistical decision making for Structural
More informationStructured Prediction
Structured Prediction Classification Algorithms Classify objects x X into labels y Y First there was binary: Y = {0, 1} Then multiclass: Y = {1,...,6} The next generation: Structured Labels Structured
More informationModeling Complex Temporal Composition of Actionlets for Activity Prediction
Modeling Complex Temporal Composition of Actionlets for Activity Prediction ECCV 2012 Activity Recognition Reading Group Framework of activity prediction What is an Actionlet To segment a long sequence
More informationNon-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines
Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2018 CS 551, Fall
More informationDISCRIMINATIVE DECORELATION FOR CLUSTERING AND CLASSIFICATION
DISCRIMINATIVE DECORELATION FOR CLUSTERING AND CLASSIFICATION ECCV 12 Bharath Hariharan, Jitandra Malik, and Deva Ramanan MOTIVATION State-of-the-art Object Detection HOG Linear SVM Dalal&Triggs Histograms
More informationBasis Expansion and Nonlinear SVM. Kai Yu
Basis Expansion and Nonlinear SVM Kai Yu Linear Classifiers f(x) =w > x + b z(x) = sign(f(x)) Help to learn more general cases, e.g., nonlinear models 8/7/12 2 Nonlinear Classifiers via Basis Expansion
More information9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering
Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make
More informationExpectation Maximization
Expectation Maximization Bishop PRML Ch. 9 Alireza Ghane c Ghane/Mori 4 6 8 4 6 8 4 6 8 4 6 8 5 5 5 5 5 5 4 6 8 4 4 6 8 4 5 5 5 5 5 5 µ, Σ) α f Learningscale is slightly Parameters is slightly larger larger
More informationRobotics 2 AdaBoost for People and Place Detection
Robotics 2 AdaBoost for People and Place Detection Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Wolfram Burgard v.1.0, Kai Arras, Oct 09, including material by Luciano Spinello and Oscar Martinez Mozos
More informationDeep Learning for Computer Vision
Deep Learning for Computer Vision Lecture 4: Curse of Dimensionality, High Dimensional Feature Spaces, Linear Classifiers, Linear Regression, Python, and Jupyter Notebooks Peter Belhumeur Computer Science
More informationError Rates. Error vs Threshold. ROC Curve. Biometrics: A Pattern Recognition System. Pattern classification. Biometrics CSE 190 Lecture 3
Biometrics: A Pattern Recognition System Yes/No Pattern classification Biometrics CSE 190 Lecture 3 Authentication False accept rate (FAR): Proportion of imposters accepted False reject rate (FRR): Proportion
More informationRecognition Performance from SAR Imagery Subject to System Resource Constraints
Recognition Performance from SAR Imagery Subject to System Resource Constraints Michael D. DeVore Advisor: Joseph A. O SullivanO Washington University in St. Louis Electronic Systems and Signals Research
More informationNeural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /9/17
3/9/7 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Single-layer neural network 3/9/7 Perceptron as a neural
More informationDeep learning on 3D geometries. Hope Yao Design Informatics Lab Department of Mechanical and Aerospace Engineering
Deep learning on 3D geometries Hope Yao Design Informatics Lab Department of Mechanical and Aerospace Engineering Overview Background Methods Numerical Result Future improvements Conclusion Background
More informationMachine Learning, Midterm Exam
10-601 Machine Learning, Midterm Exam Instructors: Tom Mitchell, Ziv Bar-Joseph Wednesday 12 th December, 2012 There are 9 questions, for a total of 100 points. This exam has 20 pages, make sure you have
More information