Expectation Propagation in Dynamical Systems

1 Expectation Propagation in Dynamical Systems. Marc Peter Deisenroth, joint work with Shakir Mohamed (UBC). August 10, 2012.

2 Motivation. Figure: complex time series (motion capture, GDP, climate). Time series in economics, robotics, motion capture, etc. have unknown dynamical structure and are high-dimensional and noisy, so we need flexible and accurate models: nonlinear (Gaussian process) dynamical systems (GPDS). Accurate inference in (GP)DS is important for better knowledge about latent structures and for parameter learning.

3 Outline. 1. Inference in Time Series Models: Filtering and Smoothing; Expectation Propagation; Approximating the Partition Function; Relation to Smoothing. 2. EP in Gaussian Process Dynamical Systems: Gaussian Processes; Filtering/Smoothing in GPDS; Expectation Propagation in GPDS. 3. Results.

4 Time Series Models. Graphical model: latent states x_{t-1}, x_t, x_{t+1} with observations z_{t-1}, z_t, z_{t+1}. Model: x_t = f(x_{t-1}) + w, w ~ N(0, Q); z_t = g(x_t) + v, v ~ N(0, R). Latent state x ∈ R^D, measurement/observation z ∈ R^E, transition function f, measurement function g.
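To make the notation concrete, here is a minimal simulation sketch of such a state-space model in Python/NumPy; the particular nonlinearities f and g below are placeholders chosen for illustration, not the systems used later in the talk.

import numpy as np

def simulate_ssm(f, g, Q, R, x0, T, rng=None):
    # Simulate x_t = f(x_{t-1}) + w, z_t = g(x_t) + v for T steps.
    rng = np.random.default_rng() if rng is None else rng
    D, E = Q.shape[0], R.shape[0]
    xs, zs = np.zeros((T, D)), np.zeros((T, E))
    x = x0
    for t in range(T):
        x = f(x) + rng.multivariate_normal(np.zeros(D), Q)      # latent transition
        zs[t] = g(x) + rng.multivariate_normal(np.zeros(E), R)  # noisy measurement
        xs[t] = x
    return xs, zs

# example with placeholder nonlinearities (1-D state and measurement)
f = lambda x: np.sin(x)     # assumed transition function, illustration only
g = lambda x: 0.5 * x       # assumed measurement function, illustration only
xs, zs = simulate_ssm(f, g, Q=0.1*np.eye(1), R=0.1*np.eye(1), x0=np.zeros(1), T=50)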

5 Inference in Time Series Models: Filtering and Smoothing. Objective: posterior distribution over the latent variables x_t. Filtering (forward inference): compute p(x_t | z_{1:t}) for t = 1, ..., T. Smoothing (forward-backward inference): compute p(x_t | z_{1:t}) for t = 1, ..., T (forward sweep), then compute p(x_t | z_{1:T}) for t = T, ..., 1 (backward sweep). Examples: linear systems: Kalman filter/smoother (Kalman, 1960); nonlinear systems: approximate inference, e.g., the extended Kalman filter/smoother or the unscented Kalman filter/smoother (Julier & Uhlmann, 1997).
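For the linear-Gaussian special case, the filtering distribution is available in closed form via the Kalman filter; the following sketch shows the textbook predict/correct recursion for x_t = A x_{t-1} + w, z_t = C x_t + v (a baseline for comparison, not code from the talk).

import numpy as np

def kalman_filter(zs, A, C, Q, R, mu0, S0):
    # Filtered means and covariances p(x_t | z_{1:t}) for a linear-Gaussian model.
    mu, S = mu0, S0
    means, covs = [], []
    for z in zs:
        # predict: p(x_t | z_{1:t-1})
        mu_p = A @ mu
        S_p = A @ S @ A.T + Q
        # correct with the measurement z_t
        K = S_p @ C.T @ np.linalg.inv(C @ S_p @ C.T + R)   # Kalman gain
        mu = mu_p + K @ (z - C @ mu_p)
        S = S_p - K @ C @ S_p
        means.append(mu)
        covs.append(S)
    return np.array(means), np.array(covs)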

7 Filtering and Smoothing: A Machine Learning Perspective. Treat filtering/smoothing as an inference problem in a graphical model with hidden variables. This allows for efficient local message passing; the messages are unnormalized probability distributions. Iterative refinement of the posterior marginals p(x_t), t = 1, ..., T: multiple forward-backward sweeps until global consistency (convergence). Here: expectation propagation (Minka, 2001).

8 Expectation Propagation. Factor graph with transition factors p(x_{t+1} | x_t) and measurement factors p(z_t | x_t) connecting the latent states x_t to the observations z_t. Inference in factor graphs: p(x_t) = ∏_{i=1}^n t_i(x_t) is approximated by q(x_t) = ∏_{i=1}^n t̃_i(x_t). The approximate factors t̃_i are members of the exponential family (e.g., multinomial, gamma, Gaussian). Find a good approximation such that q ≈ p.

10 Expectation Propagation. Figure: moment matching vs. mode matching, borrowed from Bishop (2006). EP locally minimizes KL(p || q), where p is the true distribution and q is an approximation to it from the exponential family. EP = moment matching (unlike variational Bayes, i.e., mode matching, which minimizes KL(q || p)). EP exploits properties of the exponential family: the moments of a distribution can be computed via derivatives of the log-partition function.

11 Expectation Propagation. Figure: factor graph (left) and fully factored factor graph (right). Write down the (fully factored) factor graph: p(x_t) = ∏_{i=1}^n t_i(x_t) ≈ q(x_t) = ∏_{i=1}^n t̃_i(x_t). Find approximate factors t̃_i such that KL(p || q) is minimized. Multiple sweeps through the graph until global consistency of the messages is assured.

13 Messages in a Dynamical System. Approximate (factored) marginal: q(x_t) = ∏_i t̃_i(x_t). Here, our messages t̃_i have names: the forward message q_→(x_t), the measurement message q_↑(x_t), and the backward message q_←(x_t). Define the cavity distribution q^{\i}(x_t) = q(x_t)/t̃_i(x_t) = ∏_{k≠i} t̃_k(x_t).
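For Gaussian messages, the division that defines the cavity is easiest in natural parameters (precision and precision-adjusted mean); a minimal sketch:

import numpy as np

def gaussian_cavity(mu_q, S_q, mu_i, S_i):
    # Cavity q^{\i}(x) = q(x) / t~_i(x) for Gaussian q and message t~_i, via natural
    # parameters: precision Lambda = S^{-1} and shift eta = S^{-1} mu; division subtracts them.
    Lam_q, Lam_i = np.linalg.inv(S_q), np.linalg.inv(S_i)
    Lam_cav = Lam_q - Lam_i
    eta_cav = Lam_q @ mu_q - Lam_i @ mu_i
    S_cav = np.linalg.inv(Lam_cav)      # only valid if Lam_cav is positive definite
    mu_cav = S_cav @ eta_cav
    return mu_cav, S_cav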

14 Gaussian EP in More Detail.
1. Write down the factor graph.
2. Initialize all messages t̃_i, i ∈ {→, ↑, ←}.
Until convergence:
3. For all latent variables x_t and corresponding messages t̃_i(x_t) do:
   (a) Compute the cavity distribution q^{\i}(x_t) = N(x_t | µ_t^{\i}, Σ_t^{\i}) by Gaussian division.
   (b) Compute the moments of t_i(x_t) q^{\i}(x_t), which give the updated moments of q(x_t).
   (c) Compute the updated message t̃_i(x_t) = q(x_t)/q^{\i}(x_t).
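Putting steps (a)-(c) together for a single one-dimensional message, and assuming the moments of t_i(x) q^{\i}(x) are obtained by simple numerical integration on a grid, one EP update could look like the following sketch; it illustrates the generic recipe only and is not the implementation used in the talk.

import numpy as np

def ep_update_1d(mu_q, var_q, mu_i, var_i, true_factor, grid):
    # One EP update for a single Gaussian message in 1-D.
    # true_factor: callable t_i(x); grid: 1-D array used for numerical moment matching.
    # (a) cavity by Gaussian division (natural parameters)
    lam_cav = 1.0/var_q - 1.0/var_i
    eta_cav = mu_q/var_q - mu_i/var_i
    var_cav = 1.0/lam_cav
    mu_cav = var_cav * eta_cav
    # (b) moments of t_i(x) * cavity, i.e. the updated marginal q(x_t)
    tilted = true_factor(grid) * np.exp(-0.5*(grid - mu_cav)**2/var_cav)
    tilted /= np.trapz(tilted, grid)
    mu_new = np.trapz(grid*tilted, grid)
    var_new = np.trapz((grid - mu_new)**2*tilted, grid)
    # (c) updated message = updated marginal / cavity (precision can be ~0; guard in practice)
    lam_i_new = 1.0/var_new - lam_cav
    eta_i_new = mu_new/var_new - eta_cav
    return (mu_new, var_new), (eta_i_new/lam_i_new, 1.0/lam_i_new)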

20 Updating the Measurement Message. Measurement message: t̃_i(x_t) = proj[ t_i(x_t) q^{\i}(x_t) ] / q^{\i}(x_t), where t_i(x_t) = p(z_t | x_t) is the true factor and q^{\i}(x_t) the cavity distribution. The proj[·] operator projects onto exponential-family distributions. It is implemented by taking derivatives of the log-partition function log Z, where Z = ∫ t_i(x_t) q^{\i}(x_t) dx_t.
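For Gaussian cavities, this projection reduces to the standard Gaussian-EP moment identities (see, e.g., Minka, 2001): with Z(µ^{\i}, Σ^{\i}) = ∫ t_i(x) N(x | µ^{\i}, Σ^{\i}) dx, the moments of the tilted distribution t_i(x) q^{\i}(x) are

\mathbb{E}[x] = \mu^{\setminus i} + \Sigma^{\setminus i}\,\frac{\partial \log Z}{\partial \mu^{\setminus i}},
\qquad
\operatorname{Cov}[x] = \Sigma^{\setminus i} - \Sigma^{\setminus i}\left(\frac{\partial \log Z}{\partial \mu^{\setminus i}}\,\frac{\partial \log Z}{\partial \mu^{\setminus i}}^{\top} - 2\,\frac{\partial \log Z}{\partial \Sigma^{\setminus i}}\right)\Sigma^{\setminus i}.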

21 Updating in Context: The Forward Message. Figure: factor graph (left) and fully factored factor graph (right). The forward message needs to take the coupling between x_t and x_{t+1} into account; this coupling is lost when writing down the fully factored factor graph. Key insight: we want the close approximation q_→(x_{t+1}) q^{\→}(x_{t+1}) ≈ [ ∫ p(x_{t+1} | x_t) q_→(x_t) q_↑(x_t) dx_t ] q^{\→}(x_{t+1}), where the integral provides the context. We achieve this by projection: q_→(x_{t+1}) = proj[ q^{\→}(x_{t+1}) t_→(x_{t+1}) ] / q^{\→}(x_{t+1}), with true factor t_→(x_{t+1}) = ∫ p(x_{t+1} | x_t) q_→(x_t) q_↑(x_t) dx_t.

24 Key Points and Challenge: Approximating the Partition Function. EP is based on matching the moments of t_i(x_t) q^{\i}(x_t). Computing the partition function Z_i(µ_t^{\i}, Σ_t^{\i}) = ∫ t_i(x_t) q^{\i}(x_t) dx_t and its derivatives with respect to µ_t^{\i} and Σ_t^{\i} is sufficient for EP (a property of the exponential family). Tricky part: the integral is not solvable for nonlinear systems with continuous variables.

26 Approach: Approximating the Partition Function. Interpret the partition function Z_i as a probability distribution. Example (measurement message): Z = ∫ t_↑(x) q^{\↑}(x) dx = ∫ p(z | x) q^{\↑}(x) dx = p(z). Idea: approximate p(z) by a (Gaussian) distribution and take the derivatives of log Z with respect to the moments of the cavity distribution. This gives updated moments for the posterior and the messages. It fixes the intractability problem, but we are no longer exact.

29 Possible Gaussian Approximations. Example (measurement message): Z = ∫ t_↑(x) q^{\↑}(x) dx = ∫ t_↑(x) N(x | µ^{\↑}, Σ^{\↑}) dx with t_↑(x) = N(z | g(x), S). Two options: (i) linearize g at µ^{\↑}, which makes the integral tractable; (ii) Gaussian moment matching: compute the mean and variance of Z and approximate Z by a Gaussian with the correct mean/variance.
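As an illustration of the linearization option, the sketch below approximates Z = p(z) by a Gaussian by linearizing g around the cavity mean and propagating the cavity through the linearized model (the same computation an EKF-style update performs); the finite-difference Jacobian is only for the sketch.

import numpy as np

def linearized_evidence(g, mu_cav, S_cav, S_noise, eps=1e-5):
    # Gaussian approximation of Z = p(z) = int N(z | g(x), S_noise) N(x | mu_cav, S_cav) dx
    # obtained by a first-order expansion of g at the cavity mean.
    D = mu_cav.size
    g0 = np.atleast_1d(g(mu_cav))
    # finite-difference Jacobian J = dg/dx at mu_cav (illustration only)
    J = np.stack([(np.atleast_1d(g(mu_cav + eps*e)) - g0)/eps for e in np.eye(D)], axis=1)
    mu_z = g0                           # approximate mean of p(z)
    S_z = J @ S_cav @ J.T + S_noise     # approximate covariance of p(z)
    return mu_z, S_z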

31 Theoretical Results: Relation to Smoothing. Z = ∫ t_↑(x) q^{\↑}(x) dx = ∫ t_↑(x) N(x | µ^{\↑}, Σ^{\↑}) dx, t_↑(x) = N(z | g(x), S). Relation to common filters/smoothers: approximating Z by a Gaussian is equivalent to approximating p(x, z) by a Gaussian, an approximation that is common to almost all filtering algorithms (Deisenroth & Ohlsson, ACC 2011). Generalizing common smoothers: linearizing g(x) in Z generalizes the EKS to an iterative procedure; moment matching generalizes the ADS to an iterative procedure.

33 Interesting Side Effects. To minimize the KL divergence, the EP updates require the derivatives ∂ log Z / ∂µ^{\i} and ∂ log Z / ∂Σ^{\i}. The Gaussian approximation of Z = p(z) ≈ N(µ_z, Σ_z) is exact if and only if there is a linear relationship between x and z, i.e., z = Jx with x ~ N(µ^{\i}, Σ^{\i}) for some J; in that case µ_z and Σ_z have a special form. This linearity must be explicitly encoded in the partial derivatives. Example: ∂ log Z / ∂µ^{\i} = (∂ log Z / ∂µ_z)(∂µ_z / ∂µ^{\i}) = (z - µ_z)^T Σ_z^{-1} J. Even if µ_z is a general function of µ^{\i} and Σ^{\i}, this dependence must be ignored; otherwise the EP updates become inconsistent (Deisenroth & Mohamed, arXiv preprint, 2012).
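For reference, in the exactly linear-Gaussian case z = J x + v with v ~ N(0, S) and x ~ N(µ^{\i}, Σ^{\i}), the "special form" mentioned above and the corresponding derivative are the standard identities

\mu_z = J\,\mu^{\setminus i}, \qquad
\Sigma_z = J\,\Sigma^{\setminus i} J^{\top} + S, \qquad
\frac{\partial \log Z}{\partial \mu^{\setminus i}} = (z - \mu_z)^{\top}\,\Sigma_z^{-1}\,J.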

37 Illustration: Toy Tracking Problem. Figure: state plotted over time steps; left: ground truth and EKS posterior; right: ground truth with the EP-refined EKS posterior added. Iteratively improving the posteriors via EP can heal the EKS.

39 Gaussian Process Dynamical Systems. x_t = f(x_{t-1}) + w, w ~ N(0, Q); z_t = g(x_t) + v, v ~ N(0, R). State x (not observed), measurement/observation z. GP distribution p(f) over the transition function f, GP distribution p(g) over the measurement function g.

40 Gaussian Processes for Flexible Modeling. Non-parametric method: flexible, i.e., the shape of the function adapts to the data. Probabilistic method: consistently describes uncertainties about the unknown function. Sufficient: specification of high-level assumptions (e.g., smoothness). Automatic trade-off between data fit and complexity of the function (Occam's razor).

41 Gaussian Process Regression. Mathematically: a probability distribution over functions, with tractable Bayesian inference: 1. specify high-level prior beliefs p(f) about the function (e.g., smoothness); 2. observe data X, y = f(X) + ε; 3. compute the posterior distribution over functions via Bayes' theorem, p(f | X, y) = p(y | X, f) p(f) / p(y | X). Here p(f) is the prior (over functions), p(y | X, f) the likelihood (noise model), and p(f | X, y) the posterior (over functions).
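A minimal GP regression sketch with a squared-exponential kernel (hyperparameters fixed by hand rather than learned, one-dimensional inputs for brevity) makes the prior-to-posterior step concrete:

import numpy as np

def se_kernel(a, b, ell=1.0, sf2=1.0):
    # squared-exponential covariance k(a, b) = sf2 * exp(-(a - b)^2 / (2 ell^2))
    return sf2 * np.exp(-0.5*(a[:, None] - b[None, :])**2/ell**2)

def gp_posterior(X, y, Xs, noise=0.01):
    # GP posterior mean and variance at test inputs Xs, given training data (X, y).
    K = se_kernel(X, X) + noise*np.eye(len(X))
    Ks, Kss = se_kernel(X, Xs), se_kernel(Xs, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(Kss) - np.sum(v**2, axis=0)
    return mean, var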

42 Pictorial Introduction to Gaussian Processes. Figures: prior belief about the function; some observed function values; posterior belief about the function.

45 Filtering/Smoothing in GPDS: Gaussian Process Dynamical Systems. x_t = f(x_{t-1}) + w, w ~ N(0, Q); z_t = g(x_t) + v, v ~ N(0, R), with a GP distribution p(f) over the transition function f and a GP distribution p(g) over the measurement function g. Let us now talk about inference in GPDSs.

46 Inference in GPDS. Objective: Gaussian approximations to the joint distributions p(x_t, z_t | z_{1:t-1}) and p(x_{t-1}, x_t | z_{1:t-1}), which are sufficient for Gaussian filtering/smoothing (Deisenroth & Ohlsson, ACC 2011). Mapping distributions through a GP requires approximations, e.g., linearization of the posterior GP mean function or moment matching. Filtering/smoothing in GPDS: GP-EKS, GP-ADS, GP-CKS, ... (Deisenroth et al., ICML 2009; Deisenroth et al., IEEE-TAC 2012).
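The analytic moment-matching computations are fairly involved; the Monte Carlo sketch below only conveys the idea of mapping a Gaussian state distribution through a GP: sample the uncertain input, evaluate the GP predictive at each sample (e.g., with the gp_posterior sketch above), and combine the moments via the law of total variance.

import numpy as np

def gp_predict_uncertain_input_mc(predict, mu_x, var_x, n_samples=2000, rng=None):
    # Monte Carlo moment matching: mean/variance of the GP prediction when the
    # input is uncertain, x* ~ N(mu_x, var_x), 1-D case.
    # predict: callable mapping an array of inputs to (pred_mean, pred_var) arrays.
    rng = np.random.default_rng() if rng is None else rng
    xs = rng.normal(mu_x, np.sqrt(var_x), size=n_samples)
    m, v = predict(xs)
    return m.mean(), v.mean() + m.var()   # law of total variance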

48 Expectation Propagation in GPDS. Generalize single-sweep forward-backward smoothing in GPDSs to an iterative procedure using EP. This is slightly more involved than EP in nonlinear systems (e.g., EP-EKS): we also have to average over the function distribution (the GP). The key idea is the same as before: approximate the partition function by a Gaussian distribution (Deisenroth & Mohamed, arXiv preprint, 2012). Linearization of the posterior mean function (e.g., Ko & Fox, 2009) gives EP-GPEKS; moment matching (e.g., Quiñonero-Candela et al., 2003) gives EP-GPADS.

50 Results: Synthetic Data (1). Figure: GP model with training set and ground truth f(x). System: x_{t+1} = 4 sin(4 x_t) + w, w ~ N(0, 0.1^2); z_t = 4 sin(4 x_t) + v, v ~ N(0, 0.1^2). Initial state distribution p(x_1) = N(0, 1), i.e., very broad. 30 training points for the GP models, randomly selected. Tracking horizon: 20 time steps.
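A sketch of how such a data set could be generated, using the system exactly as stated on the slide; the input range for the training points and the random seed are assumptions made for the sketch.

import numpy as np

rng = np.random.default_rng(0)

# 30 randomly selected training points for the GP models
# (the input range is an assumption made for this sketch)
X_train = rng.uniform(-5.0, 5.0, size=30)
y_trans = 4*np.sin(4*X_train) + rng.normal(0.0, 0.1, size=30)   # transition targets
y_meas = 4*np.sin(4*X_train) + rng.normal(0.0, 0.1, size=30)    # measurement targets

# test trajectory over the tracking horizon of 20 time steps
T = 20
x, z = np.zeros(T), np.zeros(T)
x[0] = rng.normal(0.0, 1.0)     # broad initial state distribution p(x_1) = N(0, 1)
for t in range(T):
    z[t] = 4*np.sin(4*x[t]) + rng.normal(0.0, 0.1)
    if t + 1 < T:
        x[t + 1] = 4*np.sin(4*x[t]) + rng.normal(0.0, 0.1)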

51 Results: Synthetic Data (2). Figure: (a) posterior trajectories with confidence bounds (true state, EP-GPADS, and GPADS posterior state distributions); (b) average NLL per data point as a function of the EP iteration, with standard error. After convergence, the posterior is spot on (left); iterating EP greatly improves predictive power (right).

52 Results: Pendulum Tracking.

Method      NLL_x     MAE_x    LPU_x
GPEKS       0.29 ±    ±        ± 0.12
EP-GPEKS    0.24 ±    ±        ± 0.12
GPADS       0.75 ±    ±        ± 0.06
EP-GPADS    0.79 ±    ±        ± 0.04

NLL: negative log-likelihood (predictive performance). MAE: mean absolute error (error of the posterior mean). LPU: log posterior uncertainty (tightness of the posterior). Linearization-based inference: variances too small; EP makes things worse. Moment-matching-based inference: coherent estimates; EP improves the posterior.
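For reference, the first two metrics can be computed from a Gaussian state posterior as follows; these are generic per-data-point definitions, and the exact evaluation protocol of the experiments may differ.

import numpy as np

def gaussian_nll(x_true, mu, var):
    # average negative log-likelihood of the true states under N(mu, var)
    return np.mean(0.5*np.log(2*np.pi*var) + 0.5*(x_true - mu)**2/var)

def mean_absolute_error(x_true, mu):
    # mean absolute error of the posterior mean
    return np.mean(np.abs(x_true - mu))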

53 Results: Motion Capture Data. 10 trials of golf swings recorded at 40 Hz (mocap.cs.cmu.edu). Observations z ∈ R^56, latent space x ∈ R^3. 7 training sequences, 3 test sequences. GPDS learning via the GPDM approach (Wang et al., 2008).

54 Results: Motion Capture Data.

55 Summary. General framework for iterative inference in dynamical systems. Key: approximation of the partition function. Rederives classical filters/smoothers as special cases. Promising results in (GP)DS. Contact: marc@ias.tu-darmstadt.de.

56 References.
[1] C. M. Bishop. Pattern Recognition and Machine Learning. Information Science and Statistics. Springer-Verlag, 2006.
[2] M. P. Deisenroth, M. F. Huber, and U. D. Hanebeck. Analytic Moment-based Gaussian Process Filtering. In L. Bottou and M. L. Littman, editors, Proceedings of the 26th International Conference on Machine Learning, Montreal, QC, Canada, June 2009. Omnipress.
[3] M. P. Deisenroth and S. Mohamed. Expectation Propagation in Gaussian Process Dynamical Systems. arXiv preprint, July 2012.
[4] M. P. Deisenroth and H. Ohlsson. A General Perspective on Gaussian Filtering and Smoothing: Explaining Current and Deriving New Algorithms. In Proceedings of the American Control Conference, 2011.
[5] M. P. Deisenroth, R. Turner, M. Huber, U. D. Hanebeck, and C. E. Rasmussen. Robust Filtering and Smoothing with Gaussian Processes. IEEE Transactions on Automatic Control, 57(7), 2012.
[6] S. J. Julier and J. K. Uhlmann. A New Extension of the Kalman Filter to Nonlinear Systems. In Proceedings of AeroSense: 11th Symposium on Aerospace/Defense Sensing, Simulation and Controls, 1997.
[7] R. E. Kalman. A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME, Journal of Basic Engineering, 82(Series D):35-45, 1960.
[8] J. Ko and D. Fox. GP-BayesFilters: Bayesian Filtering using Gaussian Process Prediction and Observation Models. Autonomous Robots, 27(1):75-90, July 2009.
[9] T. P. Minka. A Family of Algorithms for Approximate Bayesian Inference. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, January 2001.
[10] J. Quiñonero-Candela, A. Girard, J. Larsen, and C. E. Rasmussen. Propagation of Uncertainty in Bayesian Kernel Models: Application to Multiple-Step Ahead Forecasting. In IEEE International Conference on Acoustics, Speech and Signal Processing, volume 2, April 2003.
[11] J. M. Wang, D. J. Fleet, and A. Hertzmann. Gaussian Process Dynamical Models for Human Motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 2008.
