Mean-Shift Tracker Computer Vision (Kris Kitani) Carnegie Mellon University

Size: px

Start display at page:

Download "Mean-Shift Tracker Computer Vision (Kris Kitani) Carnegie Mellon University"

Loren Baldwin
5 years ago
Views:

1 Mean-Shift Tracker Computer Vision (Kris Kitani) Carnegie Mellon University

3 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975)

4 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Find the region of highest density

5 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Pick a point

6 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Draw a window

7 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Compute the mean

8 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Shift the window

9 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Compute the mean

10 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Shift the window

11 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Compute the mean

12 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Shift the window

13 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Compute the mean

14 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Shift the window

15 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Compute the mean

16 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Shift the window

17 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Compute the mean

18 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Shift the window

19 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Compute the mean

20 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Shift the window

21 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Compute the mean

22 Mean Shift Algorithm A mode seeking algorithm Fukunaga & Hostetler (1975) Shift the window

Kernel Density Estimation 16-385 Computer

23 Kernel Density Estimation Computer Vision (Kris Kitani) Carnegie Mellon University

24 To understand the mean shift algorithm Kernel Density Estimation Approximate the underlying PDF from samples Put bump on every sample to approximate the PDF

25 p(x) probability density function cumulative density function

26 1 randomly sample 0

27 1 randomly sample 0

28 1 randomly sample 0 samples

29 place Gaussian bumps on the samples samples

30 samples

31 samples

32 samples

33 Kernel Density Estimate approximates the original PDF samples

34 Kernel Density Estimation Approximate the underlying PDF from samples from it p(x) = X i (x x i ) 2 c i e 2 2 Gaussian bump aka kernel Put bump on every sample to approximate the PDF

35 Kernel Function K(x, x 0 ) a distance between two points

36 Epanechnikov kernel K(x, x 0 )= c(1 kx x 0 k 2 ) kx x 0 k 2 apple 1 0 otherwise Uniform kernel c kx x 0 k 2 apple 1 K(x, x 0 )= 0 otherwise Normal kernel K(x, x 0 )=c exp 1 2 kx x0 k 2 Radially symmetric kernels

37 Radially symmetric kernels can be written in terms of its profile K(x, x 0 )=c k(kx x 0 k 2 ) profile

38 Connecting KDE and the Mean Shift Algorithm

39 Mean-Shift Tracker Computer Vision (Kris Kitani) Carnegie Mellon University

40 Mean-Shift Tracking Given a set of points: {x s } S s=1 x s 2R d and a kernel: K(x, x 0 ) Find the mean sample point: x

41 Mean-Shift Algorithm Initialize x While v(x) > 1. Compute mean-shift P m(x) = s K(x, x s)x P s s K(x, x s) v(x) =m(x) x 2. Update x x + v(x) Where does this algorithm come from?

42 Mean-Shift Algorithm Initialize x While v(x) > 1. Compute mean-shift P m(x) = s K(x, x s)x P s s K(x, x s) Where does this come from? v(x) =m(x) x 2. Update x x + v(x) Where does this algorithm come from?

43 How is the KDE related to the mean shift algorithm? Recall: Kernel density estimate (radially symmetric kernels) P (x) = 1 N c X n k(kx x n k 2 ) We can show that: Gradient of the PDF is related to the mean shift vector rp (x) / m(x) The mean shift is a step in the direction of the gradient of the KDE

44 In mean-shift tracking, we are trying to find this which means we are trying to

45 We are trying to optimize this: x = arg max x = arg max x P (x) 1 N c X n k( x x n 2 ) usually non-linear non-parametric How do we optimize this non-linear function?

46 We are trying to optimize this: x = arg max x = arg max x P (x) 1 N c X n k( x x n 2 ) usually non-linear non-parametric How do we optimize this non-linear function? compute partial derivatives, gradient descent

47 P (x) = 1 N c X n k(kx x n k 2 ) Compute the gradient

48 P (x) = 1 N c X n k(kx x n k 2 ) Gradient rp (x) = 1 N c X n rk(kx x n k 2 ) Expand the gradient (algebra)

49 P (x) = 1 N c X n k(kx x n k 2 ) Gradient rp (x) = 1 N c X n rk(kx x n k 2 ) Expand gradient rp (x) = 1 N 2c X n (x x n )k 0 (kx x n k 2 )

50 P (x) = 1 N c X n k(kx x n k 2 ) Gradient rp (x) = 1 N c X n rk(kx x n k 2 ) Expand gradient rp (x) = 1 N 2c X n (x x n )k 0 (kx x n k 2 ) Call the gradient of the kernel function g k 0 ( ) = g( )

51 P (x) = 1 N c X n k(kx x n k 2 ) Gradient rp (x) = 1 N c X n rk(kx x n k 2 ) Expand gradient rp (x) = 1 N 2c X n (x x n )k 0 (kx x n k 2 ) change of notation (kernel-shadow pairs) rp (x) = 1 N 2c X n (x n x)g(kx x n k 2 ) keep this in memory: k 0 ( ) = g( )

52 rp (x) = 1 N 2c X n (x n x)g(kx x n k 2 ) multiply it out rp (x) = 1 N 2c X n x n g(kx x n k 2 ) 1 xg(kx x n k 2 ) N 2c X n too long! (use short hand notation) rp (x) = 1 N 2c X n x n g n 1 N 2c X n xg n

53 rp (x) = 1 N 2c X n x n g n 1 N 2c X n xg n rp (x) = 1 N 2c X n multiply by one! x n g n P P n g n n g n 1 N 2c X n xg n rp (x) = 1 N 2c X n collecting like terms P g n x ng n n P n g n x Does this look familiar?

54 mean shift rp (x) = 1 N 2c X n P g n x ng n n P n g n x { constant { mean shift! The mean shift is a step in the direction of the gradient of the KDE m(x) v(x) = P n x ng n P n g n x = rp (x) 1 N 2c P n g n Gradient ascent with adaptive step size

55 Mean-Shift Algorithm Initialize x While v(x) > 1. Compute mean-shift P m(x) = s K(x, x s)x P s s K(x, x s) v(x) =m(x) x 2. Update x x + v(x) gradient with adaptive step size rp (x) 1 N 2c P n g n

56 Everything up to now has been about distributions over samples

57 Dealing with images Pixels for a lattice, spatial density is the same everywhere! What can we do?

58 Consider a set of points: {x s } S s=1 x s 2R d Associated weights: w(x s ) Sample mean: m(x) = P s K(x, x s)w(x s )x s P s K(x, x s)w(x s ) Mean shift: m(x) x

59 Mean-Shift Algorithm Initialize x While v(x) > 1. Compute mean-shift P m(x) = s K(x, x s)w(x s )x P s s K(x, x s)w(x s ) v(x) =m(x) x 2. Update x x + v(x)

60 For images, each pixel is point with a weight

61 For images, each pixel is point with a weight

62 For images, each pixel is point with a weight

63 For images, each pixel is point with a weight

64 For images, each pixel is point with a weight

65 For images, each pixel is point with a weight

66 For images, each pixel is point with a weight

67 For images, each pixel is point with a weight

68 For images, each pixel is point with a weight

69 For images, each pixel is point with a weight

70 For images, each pixel is point with a weight

71 For images, each pixel is point with a weight

72 For images, each pixel is point with a weight

73 For images, each pixel is point with a weight

74 Finally mean shift tracking in video

75 Goal: find the best candidate location in frame 2 x center coordinate of target center coordinate y of candidate target candidate there are many candidates but only one target Frame 1 Frame 2 Use the mean shift algorithm to find the best candidate location

76 Non-rigid object tracking

77 Compute a descriptor for the target Target

78 Search for similar descriptor in neighborhood in next frame Target Candidate

79 Compute a descriptor for the new target Target

80 Search for similar descriptor in neighborhood in next frame Target Candidate

81 How do we model the target and candidate regions?

82 Modeling the target M-dimensional target descriptor q = {q 1,...,q M } (centered at target center) a fancy (confusing) way to write a weighted histogram q m = C X n Normalization factor sum over all pixels k(kx n k 2 ) [b(x n ) m] function of inverse distance (weight) Kronecker delta function quantization function bin ID A normalized color histogram (weighted by distance)

83 Modeling the candidate M-dimensional candidate descriptor p(y) ={p 1 (y),...,p M (y} (centered at location y) y 0 a weighted histogram at y p m = C h X n k y x n h 2! [b(x n ) m] bandwidth

84 Similarity between the target and candidate Distance function d(y) = p 1 [p(y), q] Bhattacharyya Coefficient (y) [p(y), q] = X m p pm (y)q u Just the Cosine distance between two unit vectors p(y) (y) = cos y = p(y)> q kpkkqk = X m p pm (y)q m q

85 Now we can compute the similarity between a target and multiple candidate regions

86 q target p(y) [p(y), q] image similarity over image

87 q target we want to find this peak p(y) [p(y), q] image similarity over image

88 Objective function min y d(y) same as max y [p(y), q] Assuming a good initial guess [p(y 0 + y), q] Linearize around the initial guess (Taylor series expansion) [p(y), q] 1 2 X p pm (y 0 )q m + 1 X r qm p m (y) 2 p m (y 0 ) m m function at specified value derivative

89 Linearized objective [p(y), q] 1 2 X m p pm (y 0 )q m X m p m (y) r qm p m (y 0 ) p m = C h X n k y x n h 2! [b(x n ) m] Remember definition of this? [p(y), q] 1 2 X m Fully expanded ( p pm (y 0 )q m + 1 X X C h k 2 m n y x n h 2! [b(x n ) m] ) r qm p m (y 0 )

90 [p(y), q] 1 2 X m Fully expanded linearized objective ( p pm (y 0 )q m + 1 X X C h k 2 m n y x n h 2! [b(x n ) m] ) r qm p m (y 0 ) [p(y), q] 1 2 X m Moving terms around p pm (y 0 )q m + C h 2 X n w n k y x n h 2! Does not depend on unknown y X r where w n = m Weighted kernel density estimate qm p m (y 0 ) [b(x n) m] Weight is bigger when q m >p m (y 0 )

91 OK, why are we doing all this math?

92 We want to maximize this max y [p(y), q]

93 We want to maximize this max y [p(y), q] [p(y), q] 1 2 X m where Fully expanded linearized objective p pm (y 0 )q m + C h 2 w n = X m r qm X n w n k y x n h p m (y 0 ) [b(x n) m] 2!

94 We want to maximize this max y [p(y), q] [p(y), q] 1 2 X m where Fully expanded linearized objective p pm (y 0 )q m + C h 2 doesn t depend on unknown y w n = X m r qm X n w n k y x n h p m (y 0 ) [b(x n) m] 2!

95 We want to maximize this max y [p(y), q] [p(y), q] 1 2 X m where Fully expanded linearized objective p pm (y 0 )q m + C h 2 doesn t depend on unknown y w n = X m r qm X n w n k y x n h p m (y 0 ) [b(x n) m] only need to maximize this! 2!

96 We want to maximize this max y [p(y), q] [p(y), q] 1 2 X m where Fully expanded linearized objective p pm (y 0 )q m + C h 2 doesn t depend on unknown y w n = X m r qm X n w n k y x n h p m (y 0 ) [b(x n) m] 2! Mean Shift Algorithm! what can we use to solve this weighted KDE?

97 C h 2 X n w n k y h x n 2! the new sample of mean of this KDE is y 1 = (new candidate location) P n x nw n g P n w ng y0 y0 x n h x n h 2 2 (this was derived earlier)

98 Mean-Shift Object Tracking For each frame: 1. Initialize location Compute q Compute p(y 0 ) y 0 2. Derive weights w n 3. Shift to new candidate location (mean shift) y 1 4. Compute p(y 1 ) 5. If ky 0 y 1 k < return Otherwise and go back to 2 y 0 y 1

99 Compute a descriptor for the target Target q

100 Search for similar descriptor in neighborhood in next frame Target Candidate max y [p(y), q]

101 Compute a descriptor for the new target Target q

102 Search for similar descriptor in neighborhood in next frame Target Candidate max y [p(y), q]

103

CS 556: Computer Vision. Lecture 21

CS 556: Computer Vision. Lecture 21 CS 556: Computer Vision Lecture 21 Prof. Sinisa Todorovic sinisa@eecs.oregonstate.edu 1 Meanshift 2 Meanshift Clustering Assumption: There is an underlying pdf governing data properties in R Clustering