Latent state estimation using control theory

Similar documents
An efficient approach to stochastic optimal control. Bert Kappen SNN Radboud University Nijmegen the Netherlands

Smoothing Estimates of Diffusion Processes

Stochastic optimal control theory

Controlled sequential Monte Carlo

Information geometry for bivariate distribution control

Approximate Bayesian inference

Sequential Monte Carlo Methods for Bayesian Computation

Expectation Propagation in Dynamical Systems

Path Integral Stochastic Optimal Control for Reinforcement Learning

Markov Chain Monte Carlo Methods for Stochastic

Probabilistic Graphical Models for Image Analysis - Lecture 4

Fully Bayesian Deep Gaussian Processes for Uncertainty Quantification

Lecture 6: Bayesian Inference in SDE Models

Basic math for biology

Hierarchy. Will Penny. 24th March Hierarchy. Will Penny. Linear Models. Convergence. Nonlinear Models. References

Computer Intensive Methods in Mathematical Statistics

Probabilistic Graphical Models

Lecture 1a: Basic Concepts and Recaps

Gaussian Process Approximations of Stochastic Differential Equations

Policy Search for Path Integral Control

Markov Chain Monte Carlo Methods for Stochastic Optimization

Sequential Monte Carlo Samplers for Applications in High Dimensions

Particle Filtering Approaches for Dynamic Stochastic Optimization

Integrated Non-Factorized Variational Inference

Linear Dynamical Systems

Auxiliary Particle Methods

State-Space Methods for Inferring Spike Trains from Calcium Imaging

Effective Connectivity & Dynamic Causal Modelling

Statistical Inference and Methods

Probabilistic and Bayesian Machine Learning

Gaussian processes for inference in stochastic differential equations

Bayesian Inference Course, WTCN, UCL, March 2013

L 0 methods. H.J. Kappen Donders Institute for Neuroscience Radboud University, Nijmegen, the Netherlands. December 5, 2011.

Lecture 4: Dynamic models

Week 3: The EM algorithm

The concentration of a drug in blood. Exponential decay. Different realizations. Exponential decay with noise. dc(t) dt.

arxiv: v3 [math.oc] 18 Jan 2012

Bayesian Machine Learning

Advanced Computational Methods in Statistics: Lecture 5 Sequential Monte Carlo/Particle Filtering

An Brief Overview of Particle Filtering

Contrastive Divergence

L09. PARTICLE FILTERING. NA568 Mobile Robotics: Methods & Algorithms

Fast MCMC sampling for Markov jump processes and extensions

Real Time Stochastic Control and Decision Making: From theory to algorithms and applications

Part 2: Multivariate fmri analysis using a sparsifying spatio-temporal prior

Controlled Diffusions and Hamilton-Jacobi Bellman Equations

Policy Search for Path Integral Control

Towards a Bayesian model for Cyber Security

An Implementation of Dynamic Causal Modelling

Applications of Optimal Stopping and Stochastic Control

Unsupervised Learning

Stochastic optimal control theory

Bayesian Machine Learning - Lecture 7

Improved diffusion Monte Carlo

IEOR E4570: Machine Learning for OR&FE Spring 2015 c 2015 by Martin Haugh. The EM Algorithm

Exercises Tutorial at ICASSP 2016 Learning Nonlinear Dynamical Models Using Particle Filters

Winter 2019 Math 106 Topics in Applied Mathematics. Lecture 1: Introduction

Lecture 8: Bayesian Estimation of Parameters in State Space Models

F denotes cumulative density. denotes probability density function; (.)

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

arxiv: v1 [q-bio.nc] 15 Mar 2018

Variational Principal Components

Expectation propagation for signal detection in flat-fading channels

Calibration of Stochastic Volatility Models using Particle Markov Chain Monte Carlo Methods

Stochastic Generative Hashing

These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop

Variational Inference (11/04/13)

Deterministic Dynamic Programming

Sequential Monte Carlo Methods in High Dimensions

Introduction to Probabilistic Graphical Models: Exercises

HST 583 FUNCTIONAL MAGNETIC RESONANCE IMAGING DATA ANALYSIS AND ACQUISITION A REVIEW OF STATISTICS FOR FMRI DATA ANALYSIS

CS Lecture 19. Exponential Families & Expectation Propagation

Continuous Time Particle Filtering for fmri

Linearly-Solvable Stochastic Optimal Control Problems

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision

Numerical Solutions of ODEs by Gaussian (Kalman) Filtering

The Kalman Filter ImPr Talk

Linear-Quadratic-Gaussian (LQG) Controllers and Kalman Filters

Sequential Monte Carlo methods for system identification

Kolmogorov Equations and Markov Processes

Stochastic Optimal Control in Continuous Space-Time Multi-Agent Systems

Monte Carlo Approximation of Monte Carlo Filters

Dynamic Causal Modelling for fmri

Use of probability gradients in hybrid MCMC and a new convergence test

The Hierarchical Particle Filter

The Origin of Deep Learning. Lili Mou Jan, 2015

Sequential Monte Carlo and Particle Filtering. Frank Wood Gatsby, November 2007

COMP2610/COMP Information Theory

State Space Gaussian Processes with Non-Gaussian Likelihoods

Variational Inference. Sargur Srihari

Lecture 13 : Variational Inference: Mean Field Approximation

Variational Inference via Stochastic Backpropagation

1. Stochastic Processes and filtrations

Introduction to Machine Learning

Introduction. Stochastic Processes. Will Penny. Stochastic Differential Equations. Stochastic Chain Rule. Expectations.

Graphical Models and Kernel Methods

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Robust regression and non-linear kernel methods for characterization of neuronal response functions from limited data

Gaussian Process Approximations of Stochastic Differential Equations

CSC2535: Computation in Neural Networks Lecture 7: Variational Bayesian Learning & Model Selection

Transcription:

Latent state estimation using control theory Bert Kappen SNN Donders Institute, Radboud University, Nijmegen Gatsby Unit, UCL London August 3, 7 with Hans Christian Ruiz Bert Kappen

Smoothing problem Given observed data y :T, estimate (statistics of) marginal posteriors p(x t y :T ) with p(x :T y :T ) = p(x :T)p(y :T x :T ) p(y :T ) p(y :T ) = x :T p(x :T )p(y :T x :T ) NB is different from filtering problem p(x t y :t ). Bert Kappen

Smoothing problem Given observed data y :T, estimate (statistics of) marginal posteriors p(x t y :T ) with p(x :T y :T ) = p(x :T)p(y :T x :T ) p(y :T ) p(y :T ) = x :T p(x :T )p(y :T x :T ) NB is different from filtering problem p(x t y :t ). Common approaches: Kalman Filtering: exact for linear dynamics and Gaussian observations Variational, EP, extended Kalman Filter: valid for simple posterior distributions Particle filtering, SMC: generic, but inefficient Bert Kappen

Time series inference Prior process p(x :T x ): dx t = dw t x = Observation at end time only: p(y T x T ) = exp( βx T ).5.5.5 3 4 5 6 7 8 9 t Red: p(x t y :t, x ) (exact) Blue: p(x t y :t, x ) (PF) Black: p(x t y :T, x ) (exact) Bert Kappen 3

Time series inference Prior process p(x :T x ): dx t = dw t x = Observation at end time only: p(y T x T ) = exp( βx T ).5.5.5 3 4 5 6 7 8 9 t Red: p(x t y :t, x ) (exact) Blue: p(x t y :t, x ) (PF + resampling) Black: p(x t y :T, x ) (exact) Bert Kappen 4

Time series inference Prior process p(x :T x ): dx t = dw t x = Observation at end time only: p(y T x T ) = exp( βx T ).8.8.6.6.4.4.....4.4 3 4 5 6 7 8 9 t 3 4 5 6 7 8 9 t Samples from filtered distribution do not well represent smoothed distribution. Bert Kappen 5

Control approach to time-series inference Outline: Introduction to control theory and path integral control theory Optimal control optimal sampling Adaptive importance sampler: Estimate improved sampler from self-generated data fmri application Bert Kappen 6

Control theory.5.5 Consider a stochastic dynamical system dx t = f (X t, u)dt + dw t E(dW t,i dw t. j ) = ν i j dt Given X find control function u(x, t) that minimizes the expected future cost C = E ( φ(x T ) + T ) dtv(x t, u(x t, t)) Bert Kappen 7

Control theory.5.5.5.5 Standard approach: define J(x, t) is optimal cost-to-go from x, t. J(x, t) = min u t:t E u (φ(x T ) + T t ) dtv(x t, u(x t, t)) X t = x J satisfies a partial differential equation t J(t, x) = min u ( V(x, u) + f (x, u) x J(x, t) + ) ν xj(x, t) J(x, T) = φ(x) with u = u(x, t).this is HJB equation. Optimal control u (x, t) defines distribution over trajectories p (τ) (= p(τ x, )). Bert Kappen 8

Path integral control theory.5.5 dx t = f (X t, t)dt + g(x t, t)(u(x t, t)dt + dw t ) X = x Goal is to find function u(x, t) that minimizes C = E u ( S (τ) + T dt u(x t, t) ) S (τ) = φ(x T ) + T V(X t, t) Bert Kappen 9

Path integral control theory.5.5.5.5 Equivalent formulation: Find distribution over trajectories p that minimizes C(p) = dτp(τ) ( S (τ) + log p(τ) ) q(τ) q(τ x, ) is distribution over uncontrolled trajectories. The optimal solution is given by p (τ) = (τ) ψq(τ)e S E u T dt u(x t, t) = dτp(τ) log p(τ) q(τ). Bert Kappen

Path integral control theory.5.5.5.5 Equivalent formulation: Find distribution over trajectories p that minimizes C(p) = dτp(τ) ( S (τ) + log p(τ) ) q(τ) q(τ x, ) is distribution over uncontrolled trajectories. The optimal solution is given by p (τ) = ψ q(τ)e S (τ) = p(τ u ). Equivalence of optimal control and discounted cost (Girsanov) Bert Kappen

Path integral control theory.5.5.5.5 The optimal control cost is C(p ) = log ψ = J(x, ) with ψ = dτq(τ)e S (τ) = E q e S J(x, t) can be computed by forward sampling from q. Bert Kappen

Estimating ψ = Ee S ESS =.8, C=3.7.5.5 Sample N trajectories from uncontrolled dynamics τ i q(τ) w i = e S (τ i) ˆψ = N i w i ˆψ unbiased estimate of ψ. Sampling efficiency is inversely proportional to variance in (normalized) w i. ES S = N + N Var(w) Bert Kappen 3

Importance sampling ESS =.8, C=3.7 ESS = 3.5, C=5. ESS=9.5, C=..5.5.5.5.5.5 Sample N trajectories from controlled dynamics and reweight yields unbiased estimate of cost-to-go: τ i p(τ) w i = e S (τ i) q(τ i) p(τ i ) = e S u(τ i ) ˆψ = N i w i S u (τ) = S (τ) + T dt u(x t, t) + T u(x t, t)dw t Bert Kappen 4

Importance sampling ESS =.8, C=3.7 ESS = 3.5, C=5. ESS=9.5, C=..5.5.5.5.5.5 S u (τ) = S (τ) + T dt u(x t, t) + T u(x t, t)dw t Thm: Better u (in the sense of optimal control) provides a better sampler (in the sense of effective sample size). Optimal u = u (in the sense of optimal control) requires only one sample and S u (τ) deterministic! Thijssen, Kappen 5 Bert Kappen 5

Proof Control cost is C(p) = E p ( S (τ) + log p(τ) q(τ)) = ES u Using Jensen s inequality: C = log q(τ)e S (τ) = log τ τ p(τ) S (τ) log p(τ)e q(τ) τ p(τ) ( S (τ) + log p(τ) ) q(τ) Bert Kappen 6

( ) Control cost is C(p) = E p S (τ) + log p(τ) q(τ) Proof Using Jensen s inequality: C = log q(τ)e S (τ) = log τ τ p(τ) S (τ) log p(τ)e q(τ) τ p(τ) ( S (τ) + log p(τ) ) q(τ) = C(p) The inequality is saturated when S (τ) + log p(τ) q(τ) side evaluate to S (τ) + log p(τ) q(τ). has zero variance: left and right This is realized when p = p 3. 3 p exists when τ q(τ)e S (τ) < Bert Kappen 7

The Path Integral Cross Entropy (PICE) method We wish to estimate ψ = S (τ) dτq(τ)e The optimal (zero variance) importance sampler is p (τ) = ψ q(τ)e S (τ). We approximate p (τ) with p u (τ), where u(x, t θ) is a parametrized control function. Following the Cross Entropy method, we minimise KL(p p u ). θ KL(p p u ) θ E u e S u T dw t u(x t, t θ) θ u(x, t θ) is arbitrary. Estimate gradient by sampling. Kappen, Ruiz 6 Bert Kappen 8

Coordination of UAVs Chao Xu ACC 7 Bert Kappen 9

Control Inference Fleming, Mitter Bert Kappen

Path integral control theory.5.5.5.5 Equivalent formulation: Find distribution over trajectories p that minimizes C(p) = dτp(τ) ( S (τ) + log p(τ) ) q(τ) q(τ x, ) is distribution over uncontrolled trajectories. The optimal solution is given by p (τ) = ψ q(τ)e S (τ) = p(τ u ). q(τ) is the prior process p(x :T x ) and e S (τ) = t p(y t x t ). Bert Kappen

Control versus particle smoothing Linear stochastic dynamics, Gaussian observations at t =, T. Control method with linear time dependent controller u(x, t) = a(t)x+b(t). N = particles, 5 IS iterations. ESS from.5% to 98%. FS with N = particles. ESS = %. FFBSi with N = M = forward and backward particles. ESS = 7%. Control error in posterior mean is orders of magnitude smaller than FFBSi error. Bert Kappen

Control versus particle smoothing Linear stochastic dynamics, Gaussian observations at t =, T. FFBSi error in posterior mean (average over time) increases with more unlikely observations, while control error does not. Bert Kappen 3

Control method scales with many observations Linear stochastic dynamics, Gaussian observations at t =, T. Bert Kappen 4

Non linear problem 5 dimensional neural network model with 4 observation of one neuron. Left: Number of iterations needed to reach ESS=. versus number of observations (N = 6). Annealing vital for large number of observations. Right: Number of iterations needed to reach ESS=. versus number of particles (5 observations). Annealing allows significant less number of particles. Bert Kappen 5

Non linear problem 5 dimensional neural network model with 4 observation of one neuron. Variance of posterior mean for APIS, FS and FFBSi versus number of observations. N = 6 forward particles. Backward particles such that APIS and FFBSi have similar CPU time. Bert Kappen 6

Non linear problem 5 dimensional neural network model with 4 observation of one neuron. ESS for typical run of APIS with annealing. N = 75 Bert Kappen 7

Infering neural activity from BOLD fmri Subjects were asked to respond as fast as possible to a visual or auditory stimulus Measure fmri BOLD response in motor cortex Reconstruct neural signal from BOLD signal. Neural activity dz t = AZ t dt + σ z dw t Hemodynamics Vasodilation s(z, s, f ) Blood flow f (s) Balloon model Blood volume v( f, v) Deoxygenated hemoglobin q( f, v, q) BOLD y(v, q) Bert Kappen 8

Infering neural activity from BOLD fmri Subjects were asked to respond as fast as possible to a visual or auditory stimulus Measure fmri BOLD response in motor cortex Reconstruct neural signal from BOLD signal. Neural activity dz t = AZ t dt + σ z dw t Hemodynamics Vasodilation s(z, s, f ) Blood flow f (s) Balloon model Blood volume v( f, v) Deoxygenated hemoglobin q( f, v, q) BOLD y(v, q) Bert Kappen 9

Infering neural activity from BOLD fmri Subjects were asked to respond as fast as possible to a visual or auditory stimulus Measure fmri BOLD response in motor cortex Reconstruct neural signal from BOLD signal. Neural activity dz t = AZ t dt + σ z dw t Hemodynamics Vasodilation s(z, s, f ) Blood flow f (s) Balloon model Blood volume v( f, v) Deoxygenated hemoglobin q( f, v, q) BOLD y(v, q) Bert Kappen 3

Infering neural activity from BOLD fmri u(x, t) = a(t)x + b(t). 5. particles, iterations. Error in stimulus time reconstruction.5.77s. Error using fit of two Gamma functions.8.5s Bert Kappen 3

Summary Smoothing problem (Bayesian posterior computation in time series) can be formulated as a stochastic optimal control problem (Fleming 99). Bert Kappen 3

Summary Smoothing problem (Bayesian posterior computation in time series) can be formulated as a stochastic optimal control problem (Fleming 99). Better controller = Better sampler. Bert Kappen 33

Summary Smoothing problem (Bayesian posterior computation in time series) can be formulated as a stochastic optimal control problem (Fleming 99). Better controller = Better sampler. Parametrized controller found by optimizing Cross Entropy can be (in principle) arbitrary accurate. Bert Kappen 34

Summary Smoothing problem (Bayesian posterior computation in time series) can be formulated as a stochastic optimal control problem (Fleming 99). Better controller = Better sampler. Parametrized controller found by optimizing Cross Entropy can be (in principle) arbitrary accurate. Improved performance compared to state of the art SMC. Bert Kappen 35

Summary Smoothing problem (Bayesian posterior computation in time series) can be formulated as a stochastic optimal control problem (Fleming 99). Better controller = Better sampler. Parametrized controller found by optimizing Cross Entropy can be (in principle) arbitrary accurate. Improved performance compared to state of the art SMC. Excellent parallelization possible. Bert Kappen 36

Thank you! Kappen, Hilbert Johan, and Hans Christian Ruiz. Adaptive importance sampling for control and inference. Journal of Statistical Physics 6.5 (6): 44-66. Ruiz, Hans-Christian, and Hilbert J. Kappen. Particle Smoothing for Hidden Diffusion Processes: Adaptive Path Integral Smoother. IEEE Transactions on Signal Processing 65. (7): 39-33. Bert Kappen 37