Expectation Propagation in Dynamical Systems
Marc Peter Deisenroth, joint work with Shakir Mohamed (UBC). August 10, 2012.
Motivation
Figure: complex time series (motion capture, GDP, climate).
- Time series in economics, robotics, motion capture, etc. have unknown dynamical structure and are high-dimensional and noisy. We therefore need flexible and accurate models: nonlinear (Gaussian process) dynamical systems (GPDS).
- Accurate inference in (GP)DS is important for better knowledge about latent structures and for parameter learning.
Outline
1. Inference in Time Series Models: filtering and smoothing; expectation propagation; approximating the partition function; relation to smoothing.
2. EP in Gaussian Process Dynamical Systems: Gaussian processes; filtering/smoothing in GPDS; expectation propagation in GPDS.
3. Results
Time Series Models
(Graphical model: latent chain $x_{t-1} \to x_t \to x_{t+1}$ with observations $z_{t-1}, z_t, z_{t+1}$.)
$$x_t = f(x_{t-1}) + w, \quad w \sim \mathcal{N}(0, Q)$$
$$z_t = g(x_t) + v, \quad v \sim \mathcal{N}(0, R)$$
- Latent state $x \in \mathbb{R}^D$
- Measurement/observation $z \in \mathbb{R}^E$
- Transition function $f$; measurement function $g$
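To make the model concrete, here is a minimal simulation sketch in Python (1-D case; the function names and NumPy setup are my own, not from the talk):

```python
import numpy as np

def simulate_ssm(f, g, Q, R, x1, T, rng):
    """Sample a trajectory from x_t = f(x_{t-1}) + w, z_t = g(x_t) + v,
    with w ~ N(0, Q) and v ~ N(0, R); scalar state/measurement for brevity."""
    x = np.empty(T)
    z = np.empty(T)
    x[0] = x1
    z[0] = g(x[0]) + np.sqrt(R) * rng.standard_normal()
    for t in range(1, T):
        x[t] = f(x[t - 1]) + np.sqrt(Q) * rng.standard_normal()  # transition
        z[t] = g(x[t]) + np.sqrt(R) * rng.standard_normal()      # measurement
    return x, z

# Example: mildly nonlinear dynamics, identity measurement function
rng = np.random.default_rng(0)
x, z = simulate_ssm(np.tanh, lambda s: s, Q=0.01, R=0.04, x1=0.5, T=50, rng=rng)
```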
Inference in Time Series Models: Filtering and Smoothing
Objective: posterior distribution over the latent variables $x_t$.
- Filtering (forward inference): compute $p(x_t \mid z_{1:t})$ for $t = 1, \dots, T$.
- Smoothing (forward-backward inference): compute $p(x_t \mid z_{1:t})$ for $t = 1, \dots, T$ (forward sweep), then $p(x_t \mid z_{1:T})$ for $t = T, \dots, 1$ (backward sweep).
Examples (see the sketch below):
- Linear systems: Kalman filter/smoother (Kalman, 1960)
- Nonlinear systems, approximate inference: Extended Kalman Filter/Smoother; Unscented Kalman Filter/Smoother (Julier & Uhlmann, 1997)
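For the linear-Gaussian case these recursions are exact. The sketch below is my own minimal NumPy implementation, assuming $x_t = A x_{t-1} + w$ and $z_t = C x_t + v$; it shows both the forward (filter) and backward (smoother) sweeps:

```python
import numpy as np

def kalman_filter(z, A, C, Q, R, mu0, S0):
    """Forward sweep: p(x_t | z_{1:t}) for x_t = A x_{t-1} + w, z_t = C x_t + v.
    z has shape (T, E); mu0, S0 are the moments of the initial state."""
    T, D = len(z), mu0.shape[0]
    mu_f = np.zeros((T, D)); S_f = np.zeros((T, D, D))
    mu_p, S_p = mu0, S0                                   # predictive moments
    for t in range(T):
        if t > 0:                                         # time update
            mu_p = A @ mu_f[t - 1]
            S_p = A @ S_f[t - 1] @ A.T + Q
        K = S_p @ C.T @ np.linalg.inv(C @ S_p @ C.T + R)  # Kalman gain
        mu_f[t] = mu_p + K @ (z[t] - C @ mu_p)            # measurement update
        S_f[t] = S_p - K @ C @ S_p
    return mu_f, S_f

def rts_smoother(mu_f, S_f, A, Q):
    """Backward sweep: p(x_t | z_{1:T}) (Rauch-Tung-Striebel smoother)."""
    T = len(mu_f)
    mu_s = mu_f.copy(); S_s = S_f.copy()
    for t in range(T - 2, -1, -1):
        S_p = A @ S_f[t] @ A.T + Q                        # predicted covariance
        J = S_f[t] @ A.T @ np.linalg.inv(S_p)             # smoother gain
        mu_s[t] = mu_f[t] + J @ (mu_s[t + 1] - A @ mu_f[t])
        S_s[t] = S_f[t] + J @ (S_s[t + 1] - S_p) @ J.T
    return mu_s, S_s
```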
Filtering and Smoothing: a Machine Learning Perspective
- Treat filtering/smoothing as an inference problem in graphical models with hidden variables. This allows for efficient local message passing; the messages are unnormalized probability distributions.
- Iterative refinement of the posterior marginals $p(x_t)$, $t = 1, \dots, T$: multiple forward-backward sweeps until global consistency (convergence).
- Here: Expectation Propagation (Minka, 2001)
Expectation Propagation
Inference in factor graphs, with transition factors $p(x_{t+1} \mid x_t)$ and measurement factors $p(z_t \mid x_t)$.
$$p(x_t) = \prod_{i=1}^n t_i(x_t) \approx \prod_{i=1}^n \tilde{t}_i(x_t) = q(x_t)$$
- The approximate factors $\tilde{t}_i$ are members of the exponential family (e.g., multinomial, gamma, Gaussian).
- Find a good approximation such that $q \approx p$.
Expectation Propagation: Moment Matching
Figure: moment matching vs. mode matching (borrowed from Bishop, 2006).
- EP locally minimizes $\mathrm{KL}(p \,\|\, q)$, where $p$ is the true distribution and $q$ is an exponential-family approximation to it. EP therefore does moment matching, unlike variational Bayes ("mode matching"), which minimizes $\mathrm{KL}(q \,\|\, p)$.
- EP exploits properties of the exponential family: the moments of a distribution can be computed via derivatives of the log-partition function.
Expectation Propagation: Factor Graphs
Figure: factor graph (left) and fully factored factor graph (right).
- Write down the (fully factored) factor graph: $p(x_t) = \prod_{i=1}^n t_i(x_t) \approx \prod_{i=1}^n \tilde{t}_i(x_t) = q(x_t)$.
- Find approximate factors $\tilde{t}_i$ such that $\mathrm{KL}(p \,\|\, q)$ is minimized.
- Multiple sweeps through the graph until global consistency of the messages is assured.
Messages in a Dynamical System
Approximate (factored) marginal: $q(x_t) = \prod_i \tilde{t}_i(x_t)$. Here the messages $\tilde{t}_i$ have names:
- measurement message $q_\uparrow$
- forward message $q_\rightarrow$
- backward message $q_\leftarrow$
Define the cavity distribution: $q^{\setminus i}(x_t) = q(x_t) / \tilde{t}_i(x_t) = \prod_{k \neq i} \tilde{t}_k(x_t)$. For Gaussian messages this is a Gaussian division; see the sketch below.
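In the Gaussian case, message multiplication and division reduce to adding and subtracting natural parameters. A small sketch of the cavity computation (helper names are my own):

```python
import numpy as np

def to_natural(mu, Sigma):
    """Gaussian moments -> natural parameters (precision P, shift r = P mu)."""
    P = np.linalg.inv(Sigma)
    return P, P @ mu

def from_natural(P, r):
    """Natural parameters -> Gaussian moments (mu, Sigma)."""
    Sigma = np.linalg.inv(P)
    return Sigma @ r, Sigma

def cavity(mu_q, S_q, mu_i, S_i):
    """q^{\\i} = q / t_i: subtract the message's natural parameters from q's.
    Note: the resulting 'covariance' can fail to be positive definite,
    a well-known practical issue in EP implementations."""
    Pq, rq = to_natural(mu_q, S_q)
    Pi, ri = to_natural(mu_i, S_i)
    return from_natural(Pq - Pi, rq - ri)
```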
Gaussian EP in More Detail
1. Write down the factor graph.
2. Initialize all messages $\tilde{t}_i$, $i \in \{\rightarrow, \uparrow, \leftarrow\}$.
Until convergence:
3. For all latent variables $x_t$ and corresponding messages $\tilde{t}_i(x_t)$:
   a. Compute the cavity distribution $q^{\setminus i}(x_t) = \mathcal{N}(x_t \mid \mu_t^{\setminus i}, \Sigma_t^{\setminus i})$ by Gaussian division.
   b. Compute the moments of $t_i(x_t)\, q^{\setminus i}(x_t)$, which give the updated moments of $q(x_t)$.
   c. Compute the updated message $\tilde{t}_i(x_t) = q(x_t) / q^{\setminus i}(x_t)$.
Updating the Measurement Message
$$q_\uparrow(x_t) = \frac{\mathrm{proj}\big[\, \overbrace{t_\uparrow(x_t)}^{\text{true factor}}\; \overbrace{q^{\setminus\uparrow}(x_t)}^{\text{cavity distr.}} \,\big]}{q^{\setminus\uparrow}(x_t)}$$
- The $\mathrm{proj}[\cdot]$ operator projects onto exponential-family distributions.
- It is implemented by taking derivatives of the log-partition function $\log Z$, where $Z = \int t_\uparrow(x_t)\, q^{\setminus\uparrow}(x_t)\, dx_t$ and $t_\uparrow(x_t) = p(z_t \mid x_t)$. (A numerical check of this derivative-based update follows.)
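As a sanity check of the "moments from derivatives of $\log Z$" route, the sketch below applies the standard Gaussian EP identities $\mu_{\text{new}} = \mu + \Sigma g$ and $\Sigma_{\text{new}} = \Sigma - \Sigma (g g^\top - 2\Gamma) \Sigma$, with $g = \partial \log Z / \partial \mu$ and $\Gamma = \partial \log Z / \partial \Sigma$ taken by finite differences, to a linear-Gaussian measurement model, where the result must coincide with the exact Kalman measurement update. The setup and names are mine:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 2                                    # state and measurement dimension
C = rng.standard_normal((D, D))          # linear measurement model z = C x + v
R = 0.1 * np.eye(D)
mu, Sigma = rng.standard_normal(D), np.eye(D)   # cavity moments
z = rng.standard_normal(D)

def log_Z(m, V):
    """log Z = log N(z | C m, C V C^T + R); written so it tolerates the
    slightly asymmetric V produced by elementwise finite-difference probing."""
    S = C @ V @ C.T + R
    r = z - C @ m
    _, logdet = np.linalg.slogdet(2 * np.pi * S)
    return -0.5 * logdet - 0.5 * r @ np.linalg.solve(S, r)

eps = 1e-5
g = np.array([(log_Z(mu + eps * e, Sigma) - log_Z(mu - eps * e, Sigma)) / (2 * eps)
              for e in np.eye(D)])                 # dlogZ/dmu
Gamma = np.zeros((D, D))                           # dlogZ/dSigma, elementwise
for i in range(D):
    for j in range(D):
        dV = np.zeros((D, D)); dV[i, j] = eps
        Gamma[i, j] = (log_Z(mu, Sigma + dV) - log_Z(mu, Sigma - dV)) / (2 * eps)

mu_new = mu + Sigma @ g
Sigma_new = Sigma - Sigma @ (np.outer(g, g) - 2 * Gamma) @ Sigma

# Exact Kalman measurement update for comparison
S = C @ Sigma @ C.T + R
K = Sigma @ C.T @ np.linalg.inv(S)
assert np.allclose(mu_new, mu + K @ (z - C @ mu), atol=1e-4)
assert np.allclose(Sigma_new, Sigma - K @ C @ Sigma, atol=1e-4)
```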
Updating in Context: the Forward Message
Figure: factor graphs as before.
- The forward message needs to take the coupling between $x_t$ and $x_{t+1}$ into account, which was lost when writing down the fully factored factor graph.
- Key insight: we want a close approximation
$$q_\rightarrow(x_{t+1})\, \underbrace{q_\uparrow(x_{t+1})\, q_\leftarrow(x_{t+1})}_{q^{\setminus\rightarrow}(x_{t+1})} \approx \underbrace{\int p(x_{t+1} \mid x_t)\, q_\rightarrow(x_t)\, q_\uparrow(x_t)\, dx_t}_{\text{context}}\; q^{\setminus\rightarrow}(x_{t+1})$$
- Achieve this by projection:
$$q_\rightarrow(x_{t+1}) = \frac{\mathrm{proj}\big[\, q^{\setminus\rightarrow}(x_{t+1})\, t_\rightarrow(x_{t+1}) \,\big]}{q^{\setminus\rightarrow}(x_{t+1})}, \qquad t_\rightarrow(x_{t+1}) = \int p(x_{t+1} \mid x_t)\, q_\rightarrow(x_t)\, q_\uparrow(x_t)\, dx_t$$
Key Points and Challenge: Approximating the Partition Function
- EP is based on matching the moments of $t_i(x_t)\, q^{\setminus i}(x_t)$.
- Computing the partition function $Z_i(\mu_t^{\setminus i}, \Sigma_t^{\setminus i}) = \int t_i(x_t)\, q^{\setminus i}(x_t)\, dx_t$ and its derivatives with respect to $\mu_t^{\setminus i}$ and $\Sigma_t^{\setminus i}$ is sufficient for EP (a property of the exponential family).
- Tricky part: the integral is not solvable for nonlinear systems with continuous variables.
Approach: Approximating the Partition Function
Interpret the partition function $Z_i$ as a probability distribution. Example, measurement message:
$$Z = \int t_\uparrow(x)\, q^{\setminus\uparrow}(x)\, dx = \int p(z \mid x)\, q^{\setminus\uparrow}(x)\, dx = p(z)$$
Idea:
- Approximate $p(z)$ by a (Gaussian) distribution $\tilde{Z}$.
- Take the derivatives of $\log \tilde{Z}$ with respect to the moments of the cavity distribution.
- This yields updated moments for the posterior and the messages.
- It fixes the intractability problem, but we are no longer exact.
Possible Gaussian Approximations of the Partition Function
Example, measurement message:
$$Z = \int t_\uparrow(x)\, q^{\setminus\uparrow}(x)\, dx = \int t_\uparrow(x)\, \mathcal{N}(x \mid \mu^{\setminus\uparrow}, \Sigma^{\setminus\uparrow})\, dx, \qquad t_\uparrow(x) = \mathcal{N}(z \mid g(x), S)$$
Two options (compared in the sketch below):
- Linearize $g$ at $\mu^{\setminus\uparrow}$, which makes the integral tractable.
- Gaussian moment matching: compute the mean and variance of $Z$ and approximate $Z$ by a Gaussian with the correct mean/variance.
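A small 1-D comparison of the two options, using a toy setup of my own with $g = \sin$ (for GP models the moment-matching integrals are analytic; Monte Carlo is used here only to keep the sketch short):

```python
import numpy as np

# Assumed 1-D toy model: z = g(x) + v, cavity x ~ N(mu, s2), noise v ~ N(0, S)
g = np.sin
mu, s2 = 0.5, 0.3**2       # cavity moments (assumed values)
S = 0.1**2                 # measurement-noise variance

# Option 1: linearize g at the cavity mean -> Gaussian p(z) in closed form
# g(x) ~ g(mu) + g'(mu) (x - mu), with g'(x) = cos(x)
J = np.cos(mu)
mz_lin, vz_lin = g(mu), J**2 * s2 + S

# Option 2: moment matching, here via Monte Carlo (exact moments in the limit)
x = np.random.default_rng(1).normal(mu, np.sqrt(s2), 100_000)
mz_mm, vz_mm = g(x).mean(), g(x).var() + S

print(f"linearization:   N({mz_lin:.4f}, {vz_lin:.5f})")
print(f"moment matching: N({mz_mm:.4f}, {vz_mm:.5f})")
```

The two Gaussians agree when $g$ is nearly linear over the cavity's support and diverge as the cavity variance grows, which is exactly where moment matching pays off.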
Theoretical Results: Relation to Smoothing
$$Z = \int t_\uparrow(x)\, q^{\setminus\uparrow}(x)\, dx = \int t_\uparrow(x)\, \mathcal{N}(x \mid \mu^{\setminus\uparrow}, \Sigma^{\setminus\uparrow})\, dx, \qquad t_\uparrow(x) = \mathcal{N}(z \mid g(x), S)$$
Relation to common filters/smoothers:
- Approximating $Z$ by a Gaussian $\tilde{Z}$ is equivalent to approximating $p(x, z)$ by a Gaussian, an approximation that is common to almost all filtering algorithms (Deisenroth & Ohlsson, ACC 2011).
Generalizing common smoothers:
- Linearizing $g(x)$ in $Z$ generalizes the EKS (extended Kalman smoother) to an iterative procedure.
- Moment matching generalizes the ADS (assumed density smoother) to an iterative procedure.
Interesting Side Effects
To minimize the KL divergence, the EP updates require the derivatives $\partial \log Z / \partial \mu^{\setminus}$ and $\partial \log Z / \partial \Sigma^{\setminus}$ (Deisenroth & Mohamed, arXiv preprint, 2012).
- The Gaussian approximation of $Z = p(z)$ by $\mathcal{N}(\mu_z, \Sigma_z)$ is exact if and only if there is a linear relationship between $x$ and $z$, i.e., $z = Jx$ with $x \sim \mathcal{N}(\mu^{\setminus}, \Sigma^{\setminus})$ for some $J$; then $\mu_z$ and $\Sigma_z$ have a special form.
- This linearity must be explicitly encoded in the partial derivatives! Example:
$$\frac{\partial \log Z}{\partial \mu^{\setminus}} = \frac{\partial \log Z}{\partial \mu_z} \frac{\partial \mu_z}{\partial \mu^{\setminus}} = (z - \mu_z)^\top \Sigma_z^{-1} J$$
- Even if $\mu_z$ is in fact a general function of $\mu^{\setminus}$ and $\Sigma^{\setminus}$, this dependence must be ignored; otherwise the EP updates become inconsistent.
Illustration: Toy Tracking Problem
Figure: ground truth vs. EKS posterior (left) and ground truth vs. EP-refined posterior (right); state plotted over time steps.
Iteratively improving the posteriors via EP can "heal" the EKS.
Gaussian Process Dynamical Systems
(Same graphical model as before.)
$$x_t = f(x_{t-1}) + w, \quad w \sim \mathcal{N}(0, Q); \qquad z_t = g(x_t) + v, \quad v \sim \mathcal{N}(0, R)$$
- State $x$ (not observed); measurement/observation $z$
- GP distribution $p(f)$ over the transition function $f$
- GP distribution $p(g)$ over the measurement function $g$
Gaussian Processes for Flexible Modeling
- Non-parametric method: flexible, i.e., the shape of the function adapts to the data.
- Probabilistic method: consistently describes uncertainties about the unknown function.
- Sufficient: specification of high-level assumptions (e.g., smoothness).
- Automatic trade-off between data fit and complexity of the function (Occam's razor).
Figure: GP model of $x_t$ as a function of $(x_{t-1}, u_{t-1})$.
Gaussian Process Regression
Mathematically: a probability distribution over functions. Bayesian inference is tractable:
1. Specify high-level prior beliefs $p(f)$ about the function (e.g., smoothness).
2. Observe data $X$, $y = f(X) + \varepsilon$.
3. Compute the posterior distribution $p(f \mid X, y)$ over functions.
Bayes' theorem: $p(f \mid X, y) = \dfrac{p(y \mid X, f)\, p(f)}{p(y \mid X)}$
- $p(f)$: prior (over functions)
- $p(y \mid X, f)$: likelihood (noise model)
- $p(f \mid X, y)$: posterior (over functions)
Pictorial Introduction to Gaussian Processes
Figure (three stages, $f(x)$ vs. $x$): prior belief about the function; observe some function values; posterior belief about the function.
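A compact NumPy sketch of exactly this prior-to-posterior step, assuming a squared-exponential covariance and toy data of my own:

```python
import numpy as np

def rbf(a, b, ell=0.5, sf=1.0):
    """Squared-exponential covariance k(a,b) = sf^2 exp(-(a-b)^2 / (2 ell^2))."""
    d = a[:, None] - b[None, :]
    return sf**2 * np.exp(-0.5 * (d / ell)**2)

# Toy training data: noisy observations of an unknown function
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, 8)
y = np.sin(X) + 0.1 * rng.standard_normal(8)
Xs = np.linspace(-3, 3, 200)                 # test inputs
sn2 = 0.1**2                                 # noise variance

# Condition the GP prior on (X, y) to get the posterior over functions
K = rbf(X, X) + sn2 * np.eye(len(X))
Ks = rbf(X, Xs)
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
mean = Ks.T @ alpha                          # posterior mean at Xs
v = np.linalg.solve(L, Ks)
var = rbf(Xs, Xs).diagonal() - np.sum(v**2, axis=0)  # posterior variance at Xs
```

The posterior mean interpolates the data, and the posterior variance collapses near observations and reverts to the prior away from them, which is the behavior the three figure stages illustrate.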
Gaussian Process Dynamical Systems (recap)
$$x_t = f(x_{t-1}) + w, \quad w \sim \mathcal{N}(0, Q); \qquad z_t = g(x_t) + v, \quad v \sim \mathcal{N}(0, R)$$
with GP distributions $p(f)$ and $p(g)$ over the transition and measurement functions. Let's talk about inference in GPDSs.
Inference in GPDS
Figure: mapping a Gaussian input distribution through a GP.
- Objective: Gaussian approximations to the joints $p(x_t, z_t \mid z_{1:t-1})$ and $p(x_{t-1}, x_t \mid z_{1:t-1})$, which are sufficient for Gaussian filtering/smoothing (Deisenroth & Ohlsson, ACC 2011).
- Mapping distributions through a GP requires approximations (see the sketch below), e.g., linearization of the posterior GP mean function (red) or moment matching (blue).
- Filtering/smoothing in GPDS (Deisenroth et al., ICML 2009; Deisenroth et al., IEEE-TAC 2012): GP-EKS, GP-ADS, GP-CKS, ...
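A sketch of moment matching through a GP: propagate a Gaussian input through a toy GP posterior and fit a Gaussian to the output via the law of total variance. Monte Carlo is used here for brevity; closed-form moments exist for the SE kernel (cf. Quiñonero-Candela et al., 2003). All data and names are mine:

```python
import numpy as np

def rbf(a, b, ell=0.5):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

# Toy GP posterior fit to noisy samples of f(x) = sin(x) (assumed setup)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, 20)
y = np.sin(X) + 0.05 * rng.standard_normal(20)
Kinv = np.linalg.inv(rbf(X, X) + 0.05**2 * np.eye(len(X)))

def gp_mean_var(xs):
    """Posterior mean and variance of the GP at test inputs xs."""
    ks = rbf(X, xs)
    m = ks.T @ (Kinv @ y)
    v = np.clip(1.0 - np.sum(ks * (Kinv @ ks), axis=0), 0.0, None)
    return m, v

# Uncertain input x* ~ N(mu, s2): moment-match the output distribution
mu, s2 = 0.3, 0.2**2
xs = rng.normal(mu, np.sqrt(s2), 50_000)
m, v = gp_mean_var(xs)
out_mean = m.mean()                           # E[f(x*)]
out_var = v.mean() + m.var()                  # law of total variance
```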
Expectation Propagation in GPDS
- Generalize single-sweep forward-backward smoothing in GPDSs to an iterative procedure using EP (Deisenroth & Mohamed, arXiv preprint, 2012).
- Slightly more involved than EP in nonlinear systems (e.g., EP-EKS): we also have to average over the function distribution (GP).
- Key idea the same as before: approximate the partition function by a Gaussian distribution.
  - Linearization of the posterior mean function (e.g., Ko & Fox, 2009) yields EP-GPEKS.
  - Moment matching (e.g., Quiñonero-Candela et al., 2003) yields EP-GPADS.
Results: Synthetic Data (1)
Figure: GP model with training set and ground truth, $f(x)$ vs. $x$.
$$x_{t+1} = 4\sin(4 x_t) + w, \quad w \sim \mathcal{N}(0, 0.1^2); \qquad z_t = 4\sin(4 x_t) + v, \quad v \sim \mathcal{N}(0, 0.1^2)$$
- Initial state distribution $p(x_1) = \mathcal{N}(0, 1)$: very broad
- 30 randomly selected training points for the GP models
- Tracking horizon: 20 time steps
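A sketch of the data-generation step for this experiment, under my reading of the slide (the input range for the training points is an assumption):

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: 4 * np.sin(4 * x)      # transition = measurement function here
sig = 0.1                            # noise std for both w and v
T = 20                               # tracking horizon

x = np.empty(T); z = np.empty(T)
x[0] = rng.normal(0.0, 1.0)          # broad initial state p(x1) = N(0, 1)
z[0] = f(x[0]) + sig * rng.standard_normal()
for t in range(1, T):
    x[t] = f(x[t - 1]) + sig * rng.standard_normal()
    z[t] = f(x[t]) + sig * rng.standard_normal()

# 30 randomly selected input locations with noisy targets for the GP models
X_train = rng.uniform(-1.0, 1.0, 30)     # assumed input range
y_train = f(X_train) + sig * rng.standard_normal(30)
```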
Results: Synthetic Data (2)
Figure (a): posterior state trajectories with confidence bounds (true state; EP-GPADS; GPADS). Figure (b): average NLL per data point as a function of the EP iteration, with standard error.
- After convergence, the posterior is spot on (left).
- Iterating EP greatly improves predictive power (right).
Results: Pendulum Tracking
Method    | NLL_x      | MAE_x   | LPU_x
GPEKS     | 0.29 ± …   | … ± …   | … ± 0.12
EP-GPEKS  | 0.24 ± …   | … ± …   | … ± 0.12
GPADS     | 0.75 ± …   | … ± …   | … ± 0.06
EP-GPADS  | 0.79 ± …   | … ± …   | … ± 0.04
- NLL: negative log likelihood (predictive performance); MAE: mean absolute error (error of the posterior mean); LPU: log posterior uncertainty (tightness of the posterior). A sketch of these metrics follows.
- Linearization-based inference: variances too small; EP makes things worse.
- Moment-matching-based inference: coherent estimates; EP improves the posterior.
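For reference, the metrics could be computed as follows for scalar posterior marginals $\mathcal{N}(\mu_t, \sigma_t^2)$. This is a minimal sketch with names of my own; the slide only names LPU, so its exact definition here is one plausible reading:

```python
import numpy as np

def nll(x_true, mu, var):
    """Average negative log-likelihood of the ground truth under the
    Gaussian posterior marginals (lower = better predictive performance)."""
    return np.mean(0.5 * np.log(2 * np.pi * var) + 0.5 * (x_true - mu)**2 / var)

def mae(x_true, mu):
    """Mean absolute error of the posterior mean."""
    return np.mean(np.abs(x_true - mu))

def lpu(var):
    """Log posterior uncertainty: average log posterior standard deviation
    (one plausible definition; the talk does not spell it out)."""
    return np.mean(0.5 * np.log(var))
```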
Results: Motion Capture Data
- 10 trials of golf swings, recorded at 40 Hz (mocap.cs.cmu.edu)
- Observations $z \in \mathbb{R}^{56}$; latent space $x \in \mathbb{R}^3$
- 7 training sequences, 3 test sequences
- GPDS learning via the GPDM approach (Wang et al., 2008)
Results: Motion Capture Data (figure).
Summary
- General framework for iterative inference in dynamical systems
- Key: approximation of the partition function
- Classical filters/smoothers are rederived as special cases
- Promising results in (GP)DS
Contact: marc@ias.tu-darmstadt.de
References
[1] C. M. Bishop. Pattern Recognition and Machine Learning. Information Science and Statistics. Springer-Verlag, 2006.
[2] M. P. Deisenroth, M. F. Huber, and U. D. Hanebeck. Analytic Moment-based Gaussian Process Filtering. In L. Bottou and M. L. Littman, editors, Proceedings of the 26th International Conference on Machine Learning, Montreal, QC, Canada, June 2009. Omnipress.
[3] M. P. Deisenroth and S. Mohamed. Expectation Propagation in Gaussian Process Dynamical Systems. arXiv preprint, July 2012.
[4] M. P. Deisenroth and H. Ohlsson. A General Perspective on Gaussian Filtering and Smoothing: Explaining Current and Deriving New Algorithms. In Proceedings of the American Control Conference, 2011.
[5] M. P. Deisenroth, R. Turner, M. Huber, U. D. Hanebeck, and C. E. Rasmussen. Robust Filtering and Smoothing with Gaussian Processes. IEEE Transactions on Automatic Control, 57(7), 2012.
[6] S. J. Julier and J. K. Uhlmann. A New Extension of the Kalman Filter to Nonlinear Systems. In Proceedings of AeroSense: 11th Symposium on Aerospace/Defense Sensing, Simulation and Controls, 1997.
[7] R. E. Kalman. A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME, Journal of Basic Engineering, 82 (Series D):35-45, 1960.
[8] J. Ko and D. Fox. GP-BayesFilters: Bayesian Filtering using Gaussian Process Prediction and Observation Models. Autonomous Robots, 27(1):75-90, July 2009.
[9] T. P. Minka. A Family of Algorithms for Approximate Bayesian Inference. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, January 2001.
[10] J. Quiñonero-Candela, A. Girard, J. Larsen, and C. E. Rasmussen. Propagation of Uncertainty in Bayesian Kernel Models: Application to Multiple-Step Ahead Forecasting. In IEEE International Conference on Acoustics, Speech and Signal Processing, volume 2, April 2003.
[11] J. M. Wang, D. J. Fleet, and A. Hertzmann. Gaussian Process Dynamical Models for Human Motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 2008.