L09. PARTICLE FILTERING NA568 Mobile Robotics: Methods & Algorithms
Particle Filters
A different approach to state estimation: instead of a parametric description of the state (and its uncertainty), maintain a set of state samples. The distribution of these particles represents the posterior distribution, and can represent arbitrary PDFs, not just Gaussians.
Particle Filters
Represent belief by random samples; estimation of non-Gaussian, nonlinear processes.
Also known as: Sequential Monte Carlo filter, survival of the fittest, Condensation, bootstrap filter, point-mass (a.k.a. particle) filter.
Filtering: [Rubin, 88], [Gordon et al., 93], [Kitagawa, 96]
Computer vision: [Isard and Blake, 96, 98]
Dynamic Bayesian networks: [Kanazawa et al., 95]
Sample-based Localization (sonar)
Sequential Monte Carlo (SMC): Brief History
The basic idea of SMC has been around since the 1950s. It was explored through the 60s and 70s, but largely overlooked and ignored because of (1) the modest computational power available at the time, and (2) the fact that vanilla Sequential Importance Sampling (SIS) leads to degeneracy over time. The major contribution to the development of the SMC method was the inclusion of the resampling step [Gordon et al., 1993].
Roots of SMC are in MC Integration
Let I be the result of a multivariate integral:
    I = ∫ g(x) dx,  x ∈ R^{n_x}
Thought experiment: imagine discretizing the domain and evaluating I numerically on a grid with M points per dimension. What is the complexity? It is O(M^{n_x}), i.e., exponential in the dimension n_x.
MC Integration
Suppose we can factorize g(x) = f(x) π(x) such that π(x) can be interpreted as a pdf. Draw N >> 1 i.i.d. samples x^i ~ π(x); then
    I_N = (1/N) Σ_{i=1}^N f(x^i)
MC Integration, continued
If the samples x^i are i.i.d., then I_N is an unbiased estimate of the integral I, and according to the law of large numbers I_N will almost surely converge to I. If the variance of f(x) is finite, i.e.
    σ² = ∫ (f(x) − I)² π(x) dx < ∞,
then the CLT holds and the estimation error converges in distribution:
    √N (I_N − I) → N(0, σ²)  as N → ∞
MC Integration: Punch Line
The error e = I_N − I is on the order of O(N^{-1/2}), i.e., the rate of convergence is independent of the dimension n_x of the integrand!
Example
Integrate g(x, y) = sin(x) cos(y) over the domain 0 < x < π, 0 < y < π/2.
Continued
Equivalent MC integral: take π uniform over the domain, draw i.i.d. samples (x^i, y^i) ~ π, and compute the mean of f(x^i, y^i) for N = 10, 100, 1000, 10000, ..., 10^7. Expect the standard deviation of the error to shrink as N^{-1/2}.
It works!
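The experiment on the previous slides can be reproduced with a short NumPy sketch (the seed and sample counts are arbitrary choices, not from the slides). The true value of the integral is (∫₀^π sin x dx)(∫₀^{π/2} cos y dy) = 2 · 1 = 2, and sampling uniformly over the rectangle means f = g × (area of domain):

```python
import numpy as np

rng = np.random.default_rng(0)
I_true = 2.0                         # closed form: 2 * 1
area = np.pi * (np.pi / 2)           # uniform pi(x, y) = 1 / area

for N in [10, 100, 1_000, 10_000, 100_000]:
    x = rng.uniform(0, np.pi, N)          # samples from uniform pi
    y = rng.uniform(0, np.pi / 2, N)
    f = area * np.sin(x) * np.cos(y)      # f = g / pi
    I_N = f.mean()
    print(N, I_N, abs(I_N - I_true))      # error shrinks roughly as N^(-1/2)
```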
Importance Sampling
Ideally we want to sample from π(x) to estimate I. But suppose we can only generate samples from a density q(x) that is similar to π(x), i.e., has the same region of support. q(x) is called the importance (or proposal) density.
Importance weight: w(x) = π(x) / q(x)
Importance Sampling
(figure: target density π(x) vs. proposal density q(x); samples drawn from q(x) are weighted by w(x) = π(x)/q(x))
MC Integration with an Importance Density
Draw N >> 1 i.i.d. samples x^i ~ q(x) and compute
    I_N = (1/N) Σ_{i=1}^N f(x^i) w(x^i),  w(x^i) = π(x^i) / q(x^i)
What if the desired density π(x) is known only up to a proportionality constant? Use normalized importance weights:
    w̃^i = w(x^i) / Σ_{j=1}^N w(x^j),  so  I_N = Σ_{i=1}^N w̃^i f(x^i)
The unknown constant cancels in the normalization.
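A minimal NumPy sketch of self-normalized importance sampling. The target and proposal here are illustrative choices, not from the slides: the target is known only up to a constant, π(x) ∝ exp(−x²/2) (a standard normal), the proposal is the wider density q = N(0, 2²), and we estimate E_π[x²] = 1:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# target known only up to a constant: pi(x) ∝ exp(-x^2 / 2)
unnorm_pi = lambda x: np.exp(-0.5 * x**2)

# proposal q = N(0, 2^2): same support as the target, heavier spread
x = rng.normal(0.0, 2.0, N)
q = np.exp(-0.5 * (x / 2.0)**2) / (2.0 * np.sqrt(2.0 * np.pi))

w = unnorm_pi(x) / q            # unnormalized importance weights
w /= w.sum()                    # normalize -- the unknown constant cancels

est = np.sum(w * x**2)          # estimate of E_pi[x^2] (true value is 1)
print(est)
```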
Bayesian Inference
MC can be applied in a Bayesian framework where π(x) is the sought-after posterior density. To develop this, let us introduce:
x_{0:k} = {x_0, ..., x_k}: the sequence of all target states up to time k
z_{1:k} = {z_1, ..., z_k}: the sequence of all measurements up to time k
(Sources: PF tutorial by Gordon et al.; Probabilistic Robotics)
Posterior over the State Trajectory
Joint posterior density at time k: p(x_{0:k} | z_{1:k})
Discrete approximation:
    p(x_{0:k} | z_{1:k}) ≈ Σ_{i=1}^N w_k^i δ(x_{0:k} − x_{0:k}^i)
where δ(·) is the Dirac delta function and the normalized weights w_k^i are chosen using the principle of importance sampling.
Recursive Factorization: Sampling
Suppose at time step k−1 we have samples {x_{0:k−1}^i, w_{k−1}^i} approximating p(x_{0:k−1} | z_{1:k−1}). On reception of z_k at time k, we wish to approximate p(x_{0:k} | z_{1:k}). If the importance density is chosen to factorize as
    q(x_{0:k} | z_{1:k}) = q(x_k | x_{0:k−1}, z_{1:k}) q(x_{0:k−1} | z_{1:k−1}),
samples can be obtained by augmenting each existing sample x_{0:k−1}^i with the new state x_k^i ~ q(x_k | x_{0:k−1}^i, z_{1:k}).
Recursive Factorization: Weight Update
Target distribution:
    p(x_{0:k} | z_{1:k}) ∝ p(z_k | x_k) p(x_k | x_{k−1}) p(x_{0:k−1} | z_{1:k−1})
Hence, the importance weights become:
    w_k^i ∝ w_{k−1}^i · [ p(z_k | x_k^i) p(x_k^i | x_{k−1}^i) / q(x_k^i | x_{0:k−1}^i, z_{1:k}) ]
Filtered Posterior
If q(x_k | x_{0:k−1}, z_{1:k}) = q(x_k | x_{k−1}, z_k), then the importance density depends only on x_{k−1} and z_k, i.e., it is Markov. Hence
    w_k^i ∝ w_{k−1}^i · [ p(z_k | x_k^i) p(x_k^i | x_{k−1}^i) / q(x_k^i | x_{k−1}^i, z_k) ]
In such scenarios we only need to store the x_k^i's: we can silently discard the sampled state trajectories x_{0:k−1}^i and the history of observations z_{1:k−1}. The filtered posterior approximation becomes
    p(x_k | z_{1:k}) ≈ Σ_{i=1}^N w_k^i δ(x_k − x_k^i)
SIS Belief
Sequential Importance Sampling (SIS) Algorithm
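The SIS recursion above can be sketched in NumPy on an assumed toy model (a scalar random walk with direct noisy observation; the model, noise levels, and particle count are illustrative choices, not from the slides), using the bootstrap proposal q(x_k | x_{k−1}, z_k) = p(x_k | x_{k−1}), so the weight update reduces to w_k ∝ w_{k−1} p(z_k | x_k). Note there is no resampling, so the weights degenerate:

```python
import numpy as np

rng = np.random.default_rng(2)

# assumed toy model:  x_k = x_{k-1} + v_k, v_k ~ N(0,1);  z_k = x_k + e_k, e_k ~ N(0,1)
N, T = 500, 20

x_true = np.cumsum(rng.normal(0.0, 1.0, T))   # simulated ground truth
z = x_true + rng.normal(0.0, 1.0, T)          # simulated measurements

particles = np.zeros(N)                       # all particles start at x_0 = 0
w = np.full(N, 1.0 / N)

for k in range(T):
    # sample from the bootstrap proposal q = p(x_k | x_{k-1})
    particles = particles + rng.normal(0.0, 1.0, N)
    # weight update: w_k ∝ w_{k-1} * p(z_k | x_k), Gaussian likelihood
    w *= np.exp(-0.5 * (z[k] - particles)**2)
    w /= w.sum()

x_hat = np.sum(w * particles)                 # filtered mean estimate at time T
n_eff = 1.0 / np.sum(w**2)                    # effective sample size
print(x_hat, n_eff)                           # n_eff << N illustrates degeneracy
```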
Degeneracy Problem
Ideally, the importance density q(·) would be the posterior itself, i.e., q(x_k | x_{k−1}, z_k) = p(x_k | x_{k−1}, z_k). For the assumed factored form, it has been shown that the variance of the importance weights can only increase over time. In practical terms, all but one particle will have negligible weight after a fixed number of time steps. Effectively, a large computational effort is devoted to updating particles whose contribution to the approximation of p(x_k | z_{1:k}) is almost zero.
A. Doucet, S. Godsill, and C. Andrieu, "On sequential Monte Carlo sampling methods for Bayesian filtering," Statistics and Computing, vol. 10, no. 3, pp. 197-208, 2000.
Measure of Degeneracy: Effective Number of Particles
    N_eff = 1 / Σ_{i=1}^N (w_k^i)²
Two extremes: (i) uniform weights, w_k^i = 1/N for all i, give N_eff = N; (ii) a singular weight, w_k^j = 1 for a single j, gives N_eff = 1.
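The two extremes can be verified directly with a one-line NumPy helper (the function name is ours, for illustration):

```python
import numpy as np

def n_eff(w):
    """Effective number of particles, N_eff = 1 / sum(w_i^2), for normalized weights."""
    w = np.asarray(w, dtype=float)
    return 1.0 / np.sum(w**2)

N = 100
uniform = np.full(N, 1.0 / N)            # extreme (i): uniform weights
singular = np.zeros(N); singular[0] = 1  # extreme (ii): one particle has all the weight

print(n_eff(uniform))   # -> 100.0 (= N)
print(n_eff(singular))  # -> 1.0
```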
Resampling
Given: a set S of weighted samples {x_k^i, w_k^i}.
Wanted: a random sample set in which the probability of drawing x_k^i is given by w_k^i.
Typically done N times with replacement to generate the new sample set S'.
Resampling
Resampling eliminates particles with low weights and multiplies particles with high weights. It maps the random measure {x_k^i, w_k^i} to {x_k^{i*}, 1/N}: sample with replacement such that P(x_k^{i*} = x_k^i) = w_k^i, then assign all new particles the uniform weight 1/N.
Resampling
(figure: roulette wheel partitioned by the weights w_1, w_2, w_3, ..., w_{N−1}, w_N)
Roulette wheel: one independent draw per particle via binary search on the weight CDF, O(N log N).
Stochastic universal sampling (systematic resampling): linear time complexity, O(N); easy to implement; low variance.
Low-Variance Resampling Algorithm
Algorithm systematic_resampling(S, N):
 1. S' = ∅, c_1 = w^1
 2. For i = 2 ... N                  // generate CDF
 3.     c_i = c_{i−1} + w^i
 4. u_1 ~ U[0, 1/N], i = 1           // initialize threshold
 5. For j = 1 ... N                  // draw samples
 6.     While u_j > c_i              // skip until next threshold reached
 7.         i = i + 1
 8.     S' = S' ∪ {<x^i, 1/N>}       // insert
 9.     u_{j+1} = u_j + 1/N          // increment threshold
10. Return S'
Also called stochastic universal sampling.
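The algorithm above translates directly into NumPy (the example particles and weights are made up for illustration; the index clamp on the while loop is a small floating-point safeguard we added):

```python
import numpy as np

def systematic_resampling(particles, w, rng):
    """Low-variance (systematic / stochastic universal) resampling, O(N)."""
    N = len(w)
    c = np.cumsum(w)                       # CDF of the weights
    u = rng.uniform(0.0, 1.0 / N)          # single random offset in [0, 1/N)
    idx = np.zeros(N, dtype=int)
    i = 0
    for j in range(N):
        while i < N - 1 and u > c[i]:      # skip until threshold is covered
            i += 1
        idx[j] = i
        u += 1.0 / N                       # equally spaced thresholds
    return particles[idx]                  # new set; all weights become 1/N

rng = np.random.default_rng(3)
particles = np.array([0.0, 1.0, 2.0, 3.0])
w = np.array([0.1, 0.1, 0.7, 0.1])
print(systematic_resampling(particles, w, rng))  # the heavy particle 2.0 is replicated
```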
Idea Behind Low-Variance Sampling
(figure: a single random offset u_1 ~ U[0, 1/N] places N pointers, spaced 1/N apart, along the cumulative weight distribution)
Comments on Resampling
Pros:
Reduces the effects of degeneracy.
Cons:
Limits the opportunity to parallelize the implementation, since all particles must be combined.
Particles with high importance weights are replicated, leading to a loss of diversity among particles, a.k.a. sample impoverishment.
Because the diversity of particle paths is reduced, any smoothed estimate based on particle paths degenerates.
Trajectory Degeneracy Due to Resampling
Implicitly, each particle represents a guess at the realization of the state sequence x_{0:k}.
(figure: particle lineages A, B, C, D traced from k = 0 to k = 8; branches die off at each resampling step)
The resampling step causes some particle lineages to die out; trajectories can eventually collapse to a single source node.
Next Lecture Particle Filtering: Part II Selection of importance density Optimal Suboptimal Monte Carlo Localization (i.e., PF)