Computer Vision I, CSE5A, Lecture 17: Tracking

Announcements
- HW 3 due today.
- HW 4 will be on the web site tomorrow: face recognition using 3 techniques.
- Today: Tracking. Reading: Sections 17.1-17.3.

The Motion Field

Rigid Motion: General Case
The position and orientation of a rigid body are given by a rotation matrix R and a translation vector T.

Rigid Motion: Velocity
Linear velocity vector T and angular velocity vector ω (or Ω):
    ṗ = T + ω × p

Motion Field Equation
    u̇ = (T_z u − T_x f)/Z − ω_y f + ω_z v + ω_x uv/f − ω_y u²/f
    v̇ = (T_z v − T_y f)/Z + ω_x f − ω_z u − ω_y uv/f + ω_x v²/f
where
- T: components of the 3-D linear motion
- ω: angular velocity vector
- (u, v): image point coordinates
- Z: depth
- f: focal length
Pure Translation (ω = 0)
    u̇ = (T_z u − T_x f)/Z
    v̇ = (T_z v − T_y f)/Z
- If T_z ≠ 0: the flow is radial about the focus of expansion (FOE) at (f T_x/T_z, f T_y/T_z).
- If T_z = 0 (motion parallel to the image plane): the flow vectors are parallel, and the FOE is a point at infinity.

Pure Rotation (T = 0)
    u̇ = −ω_y f + ω_z v + ω_x uv/f − ω_y u²/f
    v̇ = ω_x f − ω_z u − ω_y uv/f + ω_x v²/f
- Independent of T_x, T_y, T_z.
- Independent of depth Z: only a function of (u, v), f, and ω.

Pure Rotation: Motion Field on Sphere

Mathematical formulation
Let I(x, y, t) be the brightness at image point (x, y) at time t. Consider the scene (or camera) to be moving, so x(t), y(t).

Brightness constancy assumption:
    I(x + (dx/dt)δt, y + (dy/dt)δt, t + δt) = I(x, y, t)

Optical flow constraint equation:
    dI/dt = (∂I/∂x)(dx/dt) + (∂I/∂y)(dy/dt) + ∂I/∂t = 0

Solving for flow
    I_x u + I_y v + I_t = 0
One equation, two unknowns.
- We can measure: I_x = ∂I/∂x, I_y = ∂I/∂y, I_t = ∂I/∂t
- We want to solve for the flow vector: u = dx/dt, v = dy/dt
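As a sanity check on the motion field equation, the following sketch (plain Python; the function name `motion_field` and the numeric values are hypothetical, chosen for illustration) evaluates it and confirms the two special cases above: pure translation gives flow that is radial about the FOE, and pure rotation is independent of depth Z.

```python
def motion_field(u, v, T, w, Z, f):
    """Image velocity (du, dv) at image point (u, v) for camera translation
    T = (Tx, Ty, Tz), angular velocity w = (wx, wy, wz), depth Z, focal length f."""
    Tx, Ty, Tz = T
    wx, wy, wz = w
    du = (Tz*u - Tx*f)/Z - wy*f + wz*v + wx*u*v/f - wy*u*u/f
    dv = (Tz*v - Ty*f)/Z + wx*f - wz*u - wy*u*v/f + wx*v*v/f
    return du, dv

# Pure translation with Tz != 0: flow is radial about the FOE at (f*Tx/Tz, f*Ty/Tz).
f, Z = 1.0, 5.0
T = (0.2, -0.1, 0.5)
foe = (f*T[0]/T[2], f*T[1]/T[2])   # = (0.4, -0.2)
du, dv = motion_field(0.3, 0.1, T, (0.0, 0.0, 0.0), Z, f)
# The flow vector is parallel to the ray from the FOE to (u, v), so the
# 2-D cross product vanishes.
cross = du*(0.1 - foe[1]) - dv*(0.3 - foe[0])
print(cross)  # 0 (radial about the FOE)

# Pure rotation: the flow does not depend on depth Z at all.
w = (0.1, -0.2, 0.3)
print(motion_field(0.3, 0.1, (0, 0, 0), w, 1.0, f)
      == motion_field(0.3, 0.1, (0, 0, 0), w, 100.0, f))  # True
```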
Two ways to get flow
Minimize E(u, v) = Σ (I_x u + I_y v + I_t)² over a region; setting the derivatives to zero:
    dE/du = Σ 2 I_x (I_x u + I_y v + I_t) = 0
    dE/dv = Σ 2 I_y (I_x u + I_y v + I_t) = 0
1. Think globally, and regularize over the image.
2. Look over a window and assume constant motion in the window.
Some variants:
- Coarse to fine (image pyramids)
- Local/global motion models
- Robust estimation
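A minimal sketch of the second approach (a Lucas-Kanade-style window solve; not the lecture's own code). Setting the two derivatives above to zero gives a 2x2 linear system in (u, v); the gradients I_x, I_y, I_t over the window are assumed to be already measured, and here are synthesized so the true flow is known.

```python
import numpy as np

def flow_in_window(Ix, Iy, It):
    """Constant-motion window solve for the optical flow constraint:
        [sum Ix^2   sum Ix*Iy] [u]     [sum Ix*It]
        [sum Ix*Iy  sum Iy^2 ] [v] = - [sum Iy*It]
    """
    A = np.array([[np.sum(Ix*Ix), np.sum(Ix*Iy)],
                  [np.sum(Ix*Iy), np.sum(Iy*Iy)]])
    b = -np.array([np.sum(Ix*It), np.sum(Iy*It)])
    return np.linalg.solve(A, b)

# Synthetic 5x5 window: under brightness constancy, It = -(Ix*u + Iy*v),
# so gradients consistent with a true flow of (0.5, -0.25) are easy to build.
rng = np.random.default_rng(0)
Ix = rng.standard_normal((5, 5))
Iy = rng.standard_normal((5, 5))
It = -(Ix*0.5 + Iy*(-0.25))
u, v = flow_in_window(Ix, Iy, It)
print(u, v)  # recovers (0.5, -0.25)
```

Note that the 2x2 matrix must be well conditioned for the solve to work; this is exactly the aperture problem, and it is why good windows contain gradients in more than one direction.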
Motion Model Example: Affine Motion

Visual Tracking
[Figure: cost of visual tracking ($/Hz, log scale) falling from roughly $2000/Hz in 1988, through $150 (1991), $60 (1993), $5.08 (1995), and $0.80 (1998), to about $0.20/Hz by 2000.]
[Ho, Lee, Kriegman]

Visual Tracking: Main Challenges
1. 3-D pose variation
2. Occlusion of the target
3. Illumination variation
4. Camera jitter
5. Expression variation
etc.

Main tracking notions
- State: usually a finite number of parameters (a vector) that characterizes the state (e.g., location, size, pose, deformation) of the thing being tracked.
- Dynamics: How does the state change over time? How is that change constrained?
- Representation: How do you represent the thing being tracked?
- Prediction: Given the state at time t-1, what is an estimate of the state at time t?
- Correction: Given the predicted state at time t and a measurement at time t, update the state.

What is state? Examples:
- 2-D image location, Φ = (u, v)
- Image location + scale, Φ = (u, v, s)
- Image location + scale + orientation, Φ = (u, v, s, θ)
- Affine transformation
- 3-D pose
- 3-D pose plus internal shape parameters (some may be discrete); e.g., for a face: 3-D pose + facial expression using FACS + eye state (open/closed)
- Collections of control points specifying a spline
- Any of the above, but for multiple objects (e.g., tracking a formation of airplanes)
- Any of the above augmented with temporal derivatives, (φ, φ̇)

What is state here?
- Object is a ball; state is 3-D position + velocity; measurements are stereo pairs.
- Object is a person; state is body configuration; measurements are frames; the clock is in the camera (30 fps).

Example: Blob Tracker (Blob Tracking in IR Images)
- From the input image I(u, v) (possibly color) at time t, create a binary image by applying a function f(I(u, v)), e.g., a threshold about body temperature.
- Clean up the binary image using morphological operators.
- Perform connected component analysis to find blobs (connected regions).
- Compute their moments (mean and covariance of the coordinates of each region), giving position, scale, and orientation, and use these as the state.
- Using the state estimate from t-1, perform data association (temporal coherence) to identify the state at time t.
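The blob-tracker pipeline above might be sketched as follows. This is a toy implementation, not the lecture's code: thresholding, 4-connected component labeling by BFS (standing in for a morphology + connected-components library call), and per-blob moments. The data-association step against the t-1 state is omitted, and the image and temperature values are invented for illustration.

```python
import numpy as np
from collections import deque

def find_blobs(image, thresh):
    """Threshold, label 4-connected components, and return per-blob moments:
    (mean, covariance) of each region's pixel coordinates."""
    binary = image > thresh
    labels = np.zeros(binary.shape, dtype=int)
    blobs, next_label = [], 0
    for seed in zip(*np.nonzero(binary)):
        if labels[seed]:
            continue                      # pixel already belongs to a blob
        next_label += 1
        labels[seed] = next_label
        queue, pixels = deque([seed]), []
        while queue:                      # BFS over the connected region
            r, c = queue.popleft()
            pixels.append((r, c))
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if (0 <= rr < binary.shape[0] and 0 <= cc < binary.shape[1]
                        and binary[rr, cc] and not labels[rr, cc]):
                    labels[rr, cc] = next_label
                    queue.append((rr, cc))
        pts = np.array(pixels, dtype=float)
        blobs.append((pts.mean(axis=0), np.cov(pts.T)))  # moments as state
    return blobs

# A hypothetical "IR image": one warm rectangular blob on a cold background.
img = np.zeros((10, 10))
img[2:5, 3:7] = 37.0                      # body-temperature region
blobs = find_blobs(img, thresh=30.0)
print(len(blobs), blobs[0][0])            # 1 blob, centroid (3.0, 4.5)
```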
Tracking: Probabilistic Framework
A very general model:
- We assume there are moving objects, which have an underlying state X.
- There are measurements Y, some of which are functions of this state.
- There is a clock; at each tick the state changes, and we get a new observation.

Tracking
    X_0 -> X_1 -> ... -> X_{t-1} -> X_t -> X_{t+1}
     |      |              |         |        |
    Y_0    Y_1          Y_{t-1}     Y_t    Y_{t+1}

Instead of knowing the state at each instant, we treat the state as random variables X_t characterized by a pdf P(X_t), or perhaps conditioned on other random variables, e.g. P(X_t | X_{t-1}), etc. The observation (measurement) Y_t is a random variable conditioned on the state, P(Y_t | X_t). Generally, we don't observe the state: it's hidden.

Three main steps: prediction, data association, correction.

Simplifying assumptions
- Only the immediate past matters: P(X_t | X_0, ..., X_{t-1}) = P(X_t | X_{t-1}).
- Measurements depend only on the current state: P(Y_t | X_0, ..., X_t, Y_0, ..., Y_{t-1}) = P(Y_t | X_t).

Tracking as induction
- Assume data association is done (sometimes challenging in cluttered scenes; see work by Christopher Rasmussen on Joint Probabilistic Data Association Filters, JPDAF).
- Do correction for the 0th frame.
- Assume we have a corrected estimate for the ith frame; show we can do prediction for i+1, then correction for i+1.

Base case
- P(y | x) is our observation model: given the state x, what is the observation? For example, P(y | x) might be a Gaussian with mean x.
- P(X_0): the prior distribution of the initial state.
Induction step: prediction
Given the corrected estimate for frame i, P(X_i | y_0, ..., y_i), the dynamics give the prediction for frame i+1:
    P(X_{i+1} | y_0, ..., y_i) = ∫ P(X_{i+1} | X_i) P(X_i | y_0, ..., y_i) dX_i

Induction step: correction
Given a new measurement y_{i+1}, Bayes' rule gives the corrected estimate:
    P(X_{i+1} | y_0, ..., y_{i+1}) ∝ P(y_{i+1} | X_{i+1}) P(X_{i+1} | y_0, ..., y_i)

How is this formulation used?
1. It's ignored: at each time instant, the state is estimated directly (perhaps a maximum-likelihood estimate, or something nonprobabilistic).
2. The conditional distributions are represented by some convenient parametric form (e.g., Gaussian).
3. The pdfs are represented nonparametrically, and sampling techniques are used.

Linear dynamic models
Notation: x ~ N(a; b) means x has the pdf of a normal distribution with mean a and covariance b. A linear dynamic model has the form
    x_i ~ N(D_{i-1} x_{i-1}; Σ_{d_i})
    y_i ~ N(M_i x_i; Σ_{m_i})
This is much, much more general than it looks, and extremely powerful.

Example: drifting points
We assume the new position of the point is the old one, plus noise. For the measurement model, we may not need to observe the whole state of the object: e.g., for a point moving in 3D, at the 3k-th tick we see x, at the (3k+1)-th tick we see y, and at the (3k+2)-th tick we see z; we can still make decent estimates of all three coordinates at each tick. This property, which does not apply to every model, is called observability.

More examples
- Points moving with constant velocity
- Points moving with constant acceleration
- Periodic motion
- Etc.
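A small simulation of a linear dynamic model, here the 1-D drifting point (D = M = 1, so the position is the old position plus noise, observed with measurement noise). The noise variances are illustrative choices, not values from the lecture.

```python
import numpy as np

# Linear dynamic model: x_i ~ N(D x_{i-1}; Sd), y_i ~ N(M x_i; Sm),
# specialized to a drifting point in 1D with D = M = 1.
rng = np.random.default_rng(1)
D = np.array([[1.0]])
M = np.array([[1.0]])
Sd, Sm = 0.01, 0.25          # process / measurement noise variances (illustrative)

x = np.array([0.0])
xs, ys = [], []
for _ in range(100):
    x = D @ x + rng.normal(0.0, np.sqrt(Sd), size=1)   # dynamics tick
    y = M @ x + rng.normal(0.0, np.sqrt(Sm), size=1)   # noisy measurement
    xs.append(x[0])
    ys.append(y[0])

# Measurements scatter about the slowly drifting true state with
# standard deviation roughly sqrt(Sm) = 0.5.
print(float(np.std(np.array(ys) - np.array(xs))))
```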
Points moving with constant velocity
We have (the Greek letters denote noise terms):
    u_i = u_{i-1} + Δt v_{i-1} + ε_i
    v_i = v_{i-1} + ς_i
Stack (u, v) into a single state vector:
    (u_i, v_i)ᵀ = [1 Δt; 0 1] (u_{i-1}, v_{i-1})ᵀ + noise
which is the form we had above, with D_i = [1 Δt; 0 1].

Points moving with constant acceleration
We have (the Greek letters denote noise terms):
    u_i = u_{i-1} + Δt v_{i-1} + ε_i
    v_i = v_{i-1} + Δt a_{i-1} + ς_i
    a_i = a_{i-1} + ξ_i
Stack (u, v, a) into a single state vector:
    (u_i, v_i, a_i)ᵀ = [1 Δt 0; 0 1 Δt; 0 0 1] (u_{i-1}, v_{i-1}, a_{i-1})ᵀ + noise
which is again the form we had above.

The Kalman Filter
Key ideas:
- Linear models interact uniquely well with Gaussian noise: make the prior Gaussian, and everything else stays Gaussian, so the calculations are easy.
- Gaussians are really easy to represent: just a mean vector and a covariance matrix.

The Kalman Filter in 1D
Dynamic model: x_i ~ N(d_i x_{i-1}; σ_{d_i}²), y_i ~ N(m_i x_i; σ_{m_i}²).
Notation: x̄_i⁻ is the predicted mean and x̄_i⁺ the corrected mean; σ_i⁻ and σ_i⁺ are the corresponding standard deviations.

Prediction for the 1-D Kalman filter
The new state is obtained by multiplying the old state by a known constant and adding zero-mean noise. Therefore the predicted mean for the new state is the constant times the corrected mean of the old state, and the predicted variance is the constant squared times the old variance, plus the noise variance:
    x̄_i⁻ = d_i x̄⁺_{i-1}
    (σ_i⁻)² = σ_{d_i}² + d_i² (σ⁺_{i-1})²
Because the old state is a normal random variable: multiplying a normal rv by a constant multiplies its mean by that constant and its variance by the square of the constant; adding zero-mean noise adds zero to the mean; and adding independent rvs adds their variances.
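One tick of the 1-D Kalman filter can be sketched directly from these rules. This is a minimal sketch with hypothetical function names; the correction step uses the standard 1-D fusion formula for a measurement y_i = x_i + noise (i.e., m_i = 1), and the numbers in the example are invented.

```python
def kalman_1d_predict(x_plus, var_plus, d, var_d):
    """Prediction: x_i = d * x_{i-1} + zero-mean noise with variance var_d.
    Mean is scaled by d; variance is scaled by d^2 and the noise variance adds."""
    x_minus = d * x_plus
    var_minus = d * d * var_plus + var_d
    return x_minus, var_minus

def kalman_1d_correct(x_minus, var_minus, y, var_m):
    """Correction: fuse prediction N(x_minus, var_minus) with measurement
    y ~ N(x, var_m). Small measurement noise -> trust the measurement;
    large measurement noise -> trust the prediction."""
    x_plus = (var_m * x_minus + var_minus * y) / (var_m + var_minus)
    var_plus = var_minus * var_m / (var_minus + var_m)
    return x_plus, var_plus

# One tick: prior N(0, 1); dynamics x_i = x_{i-1} + noise (variance 1);
# then a measurement y = 2 with variance 2.
x_minus, var_minus = kalman_1d_predict(0.0, 1.0, d=1.0, var_d=1.0)  # N(0, 2)
x_plus, var_plus = kalman_1d_correct(x_minus, var_minus, y=2.0, var_m=2.0)
print(x_plus, var_plus)  # 1.0 1.0
```

With equal predicted and measurement variances (both 2 here), the corrected mean lands exactly halfway between prediction (0) and measurement (2), and the corrected variance (1.0) is smaller than either, as fusing two estimates should give.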
Correction for the 1D Kalman filter
Pattern match to the identities given in the book (basically, guess the integrals) to get, for a measurement y_i = x_i + noise (i.e., m_i = 1):
    x̄_i⁺ = (σ_{m_i}² x̄_i⁻ + (σ_i⁻)² y_i) / (σ_{m_i}² + (σ_i⁻)²)
    (σ_i⁺)² = σ_{m_i}² (σ_i⁻)² / (σ_{m_i}² + (σ_i⁻)²)
Notice: if the measurement noise is small, we rely mainly on the measurement; if it is large, mainly on the prediction.

Other approaches to state estimation
Condensation and its variants are the most popular (Isard & Blake). Represent the state using weighted samples:
- Sample from p(X)
- Evaluate p(I | X) at the samples (measurement)
- Keep the high-scoring samples
- Ascend the gradient and pick an exemplar
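A toy Condensation-style step, using the drifting-point model again, might look like the following. All parameters are illustrative; this sketch resamples by weight (keeping high-scoring samples), diffuses the samples through the dynamics, and reweights them by a Gaussian measurement likelihood, reporting the weighted mean rather than a gradient-ascent exemplar.

```python
import numpy as np

rng = np.random.default_rng(2)

def condensation_step(samples, weights, measurement, var_d=0.1, var_m=0.5):
    """One tick of a sampled (Condensation-style) filter."""
    # Resample with replacement: high-weight samples survive and multiply.
    idx = rng.choice(len(samples), size=len(samples), p=weights)
    samples = samples[idx]
    # Predict: drifting-point dynamics x_i = x_{i-1} + noise.
    samples = samples + rng.normal(0.0, np.sqrt(var_d), size=len(samples))
    # Correct: reweight each sample by a Gaussian likelihood of the measurement.
    weights = np.exp(-(measurement - samples) ** 2 / (2.0 * var_m))
    weights /= weights.sum()
    return samples, weights

n = 2000
samples = rng.normal(0.0, 1.0, n)       # draw from the prior p(X)
weights = np.full(n, 1.0 / n)
for y in [0.5, 0.8, 1.0]:               # a short, invented measurement sequence
    samples, weights = condensation_step(samples, weights, y)

estimate = float(np.sum(weights * samples))  # posterior mean
print(estimate)  # pulled from the prior mean 0 toward the measurements
```

Unlike the Kalman filter, nothing here requires the dynamics or the likelihood to be linear or Gaussian; those choices are made only to keep the example short.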