A Discriminatively Trained, Multiscale, Deformable Part Model

Size: px

Start display at page:

Download "A Discriminatively Trained, Multiscale, Deformable Part Model"

Gordon Barrett
6 years ago
Views:

1 A Discriminatively Trained, Multiscale, Deformable Part Model P. Felzenszwalb, D. McAllester, and D. Ramanan Edward Hsiao Learning Based Methods in Vision February 16, 2009 Images taken from P. Felzenszwalb, D. Ramanan, N. Dalal and B. Triggs

2 Overview

Histogram of Oriented Gradients (HOG) Split detection window into 8x8 non-overlapping pixel regions called cells Compute 1D histogram of gradients in each cell and discretize

3 Histogram of Oriented Gradients (HOG) Split detection window into 8x8 non-overlapping pixel regions called cells Compute 1D histogram of gradients in each cell and discretize into 9 orientation bins Normalize histogram of each cell with the total energy in the four 2x2 blocks that contain that cell -> 9x4 feature vector Apply a linear SVM classifier

4 Histogram of Oriented Gradients (HOG) block Feature vector f = [,,,, ] cell 9 orientation bins degrees normalize 9x4 feature vector per cell

5 Histogram of Oriented Gradients (HOG) block Feature vector f = [,,,, ] cell 9 orientation bins degrees normalize 9x4 feature vector per cell

6 Histogram of Oriented Gradients (HOG) block Feature vector f = [,,,, ] cell 9 orientation bins degrees normalize 9x4 feature vector per cell

7 Histogram of Oriented Gradients (HOG) block Feature vector f = [,,,, ] cell 9 orientation bins degrees normalize 9x4 feature vector per cell

8 SVM Review w x = 1 2 w c = 1 w x =1 c =1 minimize 1 2 c ( w x ) 1 i w 2 i subject to c ( w i xi ) 1

9 Hinge Loss max( 0,1 c ( w x i i )) incorrectly classified 1 marginal correctly classified 1 1 c ( w x ) > i i 0 if incorrectly classified or inside margin arg min λ w w 2 + n i= 1 max(0,1 c i ( w x i ))

10 HOG & Linear SVM not pedestrian w f < 0 pedestrian w f > 0 HOG space

11 Histogram of Oriented Gradients (HOG) w + Positive components w - Test image HOG descriptor Negative components

12 Average Gradients person car motorbike

13 Deformable Part Models Root filter 8x8 resolution

14 Deformable Part Models Root filter 8x8 resolution Part filter 4x4 resolution Quadratic spatial model 2 a x + a y + b x + 2 x, i i y, i i x, i i y, i i b y

15 HOG Pyramid φ( H, p) concatenation of HOG features in a subwindow of the HOG pyramid H at position p = (x,y,l)

16 Deformable Part Models score = Root filter F 0 Part filters P 1 P n P = F, v, s, a, b ) i ( i i i i i filter response part placement

17 Part Models v i root filter s i,1 s i,2 b x, i = by, i part filter F i b x, i < by, i P = i ( Fi, vi, si, ai, bi ) Quadratic spatial model a x + a y + b x + b 2 2 x, i i y, i i x, i i y, i i b i 0 y

18 Star Graph / 1-fan root filter position part filter positions

19 Distance Transforms part anchor location part position quadratic distance specified by a i and b i filter response

20 Quadratic 1-D Distance Transform

21 Quadratic 1-D Distance Transform

22 Quadratic 1-D Distance Transform

23 Distance Transforms in 2-D input column distance transform full distance transform

24 Latent SVM HOG & Linear SVM ) ( ) ( x w x f w Φ = w = F 0 ) ~, ~, ~, ~,..., ~, ~ ~, ~, ), ), ( ( ),..., ), ( ( ), ), ( ( ( ), ( n n n n n y x y x y x y x p x H p x H p x H z x φ φ φ = Φ )) ( max(0,1 arg min 1 2 * i n i w i w x f y w w = + = λ ), ( max ) ( ) ( z x w x f x Z z w Φ = ),,...,,,,..., ( n n n b a b a F F w = ) ), ( ( ) ( 0 p x H x φ = Φ Deformable Parts & Latent SVM )) ( max(0,1 arg min 1 2 * i n i w i w x f y w w = + = λ

25 Semi-convexity f w ( x) = max w Φ( x, z) z Z ( x) convex in w w * = arg min λ w w 2 + i pos max(0,1 f w ( x )) + i i neg max(0,1+ f w ( x )) i If f w (x) is linear in w, this is a standard SVM (convex) If f w (x) is arbitrary, this is in general not convex If f w (x) is convex in w, the hinge loss is convex for negative examples (semi-convex) - hinge loss is convex in w if positive examples are restricted to single choice of Z(x) wˆ = arg min λ w w 2 + i pos max(0,1 w Φ( x, z i i )) + i neg max(0,1+ f w ( x )) i convex Optimization is now convex!

26 Coordinate Descent 1. Hold w fixed, and optimize the latent values for the positive examples z i = arg max z Z ( x i ) w Φ( x, z) 2. Hold {z i } fixed for positive examples, optimize w by solving the convex problem wˆ = arg min λ w w 2 + i pos max(0,1 w Φ( x i, z i )) + i neg max(0,1+ f w ( x i ))

27 Data Mining Hard Negatives HOG Space positive examples negative examples

28 Data Mining Hard Negatives HOG Space positive examples negative examples

29 Data Mining Hard Negatives HOG Space positive examples negative examples

30 Data Mining Hard Negatives HOG Space positive examples negative examples

31 Data Mining Hard Negatives HOG Space positive examples negative examples

32 Data Mining Hard Negatives HOG Space positive examples negative examples

33 Data Mining Hard Negatives HOG Space positive examples negative examples

34 Model Learning Algorithm Initialize root filter Update root filter Initialize parts Update model

35 Root Filter Initialization Select aspect ratio and size by using a heuristic - model aspect is the mode of data - model size is largest size > 80% of the data Train initial root filter F 0 using an SVM with no latent variables - positive examples anisotropically scaled to aspect and size of filter - random negative examples

36 Root Filter Update Find best scoring placement of root filter that significantly overlaps the bounding box Retrain F 0 with new positive set

37 Part Initialization Greedily select regions in root filter with most energy Part filter initialized to subwindow at twice the resolution Quadratic deformation cost initialized to weak Gaussian

as many as can fit in memory) Train a new model using SVM Keep only hard examples

38 Model Update Positive examples highest scoring placement with > 50% overlap with bounding box Negative examples high scoring detections with no target object (add as many as can fit in memory) Train a new model using SVM Keep only hard examples and add more negative examples Iterate 10 times positive example hard negative example

39 Results PASCAL07 - Person

40 Results PASCAL07 - Bicycle

41 Results PASCAL07 - Car

42 Results PASCAL07 - Horse

43 Results - Person So do more realistic images give higher scores?

44 Superhuman 2.56!

45 Gradient Domain Editing g x g x ' g y g y ' f(x,y) = x, y f = g 2 f = g Decompose Edit Reconstruct (Solve Poisson Equation)

46 Generating a person 9 orientation bins 18 orientation bins for positive and negative

47 Generating a person g x initial orientation bin assignments g y initial person

48 Simulated Annealing higher cost P = exp ( c c T new current ) T is initially high and decreases with number of iterations

49 Person Score: 2.56 Score: 0.96

50 Generated Images Car Score: 3.14 Score: 1.57 Horse Score: 0.84 Score: -0.30

51 Generated Images Bicycle Score: 2.63 Score: 2.18 Cat Score: 0.80 Score: -0.71

52 Gradient Erasing Original Score: 0.83 Erased Score: 2.78 Difference image

53 Gradient Erasing Original Score: Erased Score: 0.26 Difference image

54 Gradient Addition Score: 0.83 Score: 3.03

55 Gradient Addition Score: 2.15

56 Discriminatively Trained Mixtures of Deformable Part Models P. Felzenszwalb, D. McAllester, and D. Ramanan Slide taken from P. Felzenszwalb

57 Questions?

58 Thank You

Discriminative part-based models. Many slides based on P. Felzenszwalb

More sliding window detection: ti Discriminative part-based models Many slides based on P. Felzenszwalb Challenge: Generic object detection Pedestrian detection Features: Histograms of oriented gradients