Linear-Quadratic-Gaussian (LQG) Controllers and Kalman Filters


Linear-Quadratic-Gaussian (LQG) Controllers and Kalman Filters

Emo Todorov
Applied Mathematics and Computer Science & Engineering
University of Washington

Emo Todorov (UW) AMATH/CSE 579, Winter 2014

LQG in continuous time

Recall that for problems with dynamics and cost

  dx = (a(x) + B(x)u) dt + C(x) dω
  ℓ(x, u) = q(x) + (1/2) uᵀ R(x) u

the optimal control law is u = -R⁻¹ Bᵀ v_x and the HJB equation is

  -v_t = q + aᵀ v_x + (1/2) tr(C Cᵀ v_xx) - (1/2) v_xᵀ B R⁻¹ Bᵀ v_x

We now impose further restrictions (LQG system):

  dx = (Ax + Bu) dt + C dω
  ℓ(x, u) = (1/2) xᵀ Q x + (1/2) uᵀ R u
  q_T(x) = (1/2) xᵀ Q_T x

Continuous-time Riccati equations

Substituting the LQG dynamics and cost in the HJB equation yields

  -v_t = (1/2) xᵀ Q x + xᵀ Aᵀ v_x + (1/2) tr(C Cᵀ v_xx) - (1/2) v_xᵀ B R⁻¹ Bᵀ v_x

We can now show that v is quadratic: v(x, t) = (1/2) xᵀ V(t) x + α(t). At the final time this holds with α(T) = 0 and V(T) = Q_T. Then

  -α̇ - (1/2) xᵀ V̇ x = (1/2) xᵀ Q x + xᵀ Aᵀ V x + (1/2) tr(C Cᵀ V) - (1/2) xᵀ V B R⁻¹ Bᵀ V x

Using the fact that xᵀ Aᵀ V x = xᵀ V A x and matching powers of x yields

Theorem (Riccati equation)
  -V̇ = Q + Aᵀ V + V A - V B R⁻¹ Bᵀ V
  -α̇ = (1/2) tr(C Cᵀ V)
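The Riccati ODE rarely has a closed-form solution, but it can be integrated numerically backward in time from V(T) = Q_T. A minimal sketch, not from the slides; explicit Euler is assumed adequate for well-conditioned problems:

```python
import numpy as np

def riccati_backward(A, B, Q, R, QT, T, n_steps):
    """Integrate -dV/dt = Q + A'V + VA - V B R^{-1} B' V
    backward from V(T) = QT with explicit Euler steps."""
    dt = T / n_steps
    V = QT.astype(float).copy()
    Rinv = np.linalg.inv(R)
    for _ in range(n_steps):
        # stepping from t to t - dt: V(t - dt) ≈ V(t) - dt * dV/dt
        V = V + dt * (Q + A.T @ V + V @ A - V @ B @ Rinv @ B.T @ V)
        V = 0.5 * (V + V.T)  # re-symmetrize against round-off
    return V  # approximates V(0)
```

As a sanity check: for the scalar example later in these slides (A = 0, B = 1, Q = 0, R = 1, Q_T = 5), the Riccati ODE reduces to V̇ = V², whose backward solution is V(t) = 1/(T + 1/5 - t), so V(0) = 1/(T + 0.2).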

Linear feedback control law

When v(x, t) = (1/2) xᵀ V(t) x + α(t), the optimal control u = -R⁻¹ Bᵀ v_x is

  u(x, t) = -L(t) x,   L(t) := R⁻¹ Bᵀ V(t)

The Hessian V(t) and the matrix of feedback gains L(t) are independent of the noise amplitude C. Thus the optimal control law u(x, t) is the same for stochastic and deterministic systems (the deterministic case is called LQR).

Example:
  dx = u dt + 0.2 dω
  ℓ(x, u) = 0.5 u²
  q_T(x) = 2.5 x²

LQG in discrete time

Consider an optimal control problem with dynamics and cost

  x_{k+1} = A x_k + B u_k
  ℓ(x, u) = (1/2) xᵀ Q x + (1/2) uᵀ R u

Substituting in the Bellman equation v_k(x) = min_u { ℓ(x, u) + v_{k+1}(x') } and making the ansatz v_k(x) = (1/2) xᵀ V_k x yields

  (1/2) xᵀ V_k x = min_u { (1/2) xᵀ Q x + (1/2) uᵀ R u + (1/2) (Ax + Bu)ᵀ V_{k+1} (Ax + Bu) }

The minimum is u_k(x) = -L_k x where L_k := (R + Bᵀ V_{k+1} B)⁻¹ Bᵀ V_{k+1} A.

Theorem (Riccati equation)
  V_k = Q + Aᵀ V_{k+1} (A - B L_k)
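The backward recursion is straightforward to implement. A minimal NumPy sketch (the function name and interface are illustrative, not from the slides):

```python
import numpy as np

def lqr_gains(A, B, Q, R, QT, N):
    """Finite-horizon discrete LQR: run the backward Riccati recursion
    V_k = Q + A' V_{k+1} (A - B L_k) from V_N = QT, collecting the
    feedback gains L_k = (R + B' V_{k+1} B)^{-1} B' V_{k+1} A."""
    V = QT.astype(float).copy()
    gains = []
    for _ in range(N):
        L = np.linalg.solve(R + B.T @ V @ B, B.T @ V @ A)
        V = Q + A.T @ V @ (A - B @ L)
        gains.append(L)
    gains.reverse()  # gains[k] is L_k, for the control u_k = -L_k x_k
    return gains, V  # V is V_0, the cost-to-go Hessian at k = 0
```

Note that the compact form Q + AᵀV(A - BL) expands, after substituting L, to the familiar Q + AᵀVA - AᵀVB(R + BᵀVB)⁻¹BᵀVA.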

Summary of Riccati equations

Finite horizon
  Continuous time:  -V̇ = Q + Aᵀ V + V A - V B R⁻¹ Bᵀ V
  Discrete time:    V_k = Q + Aᵀ V_{k+1} A - Aᵀ V_{k+1} B (R + Bᵀ V_{k+1} B)⁻¹ Bᵀ V_{k+1} A

Average cost
  Continuous time ("care" in Matlab):  0 = Q + Aᵀ V + V A - V B R⁻¹ Bᵀ V
  Discrete time ("dare" in Matlab):    V = Q + Aᵀ V A - Aᵀ V B (R + Bᵀ V B)⁻¹ Bᵀ V A

Discounted cost is similar; first exit does not yield Riccati equations.
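The average-cost discrete equation can be solved by iterating the finite-horizon recursion until it stops changing. A sketch of that fixed-point approach (in practice one would call Matlab's dare or SciPy's scipy.linalg.solve_discrete_are; the iteration is assumed to converge, which holds for stabilizable/detectable problems):

```python
import numpy as np

def solve_dare_iter(A, B, Q, R, tol=1e-12, max_iter=100_000):
    """Solve V = Q + A'VA - A'VB (R + B'VB)^{-1} B'VA by iterating
    the backward Riccati recursion to a fixed point."""
    V = Q.astype(float).copy()
    for _ in range(max_iter):
        L = np.linalg.solve(R + B.T @ V @ B, B.T @ V @ A)
        V_next = Q + A.T @ V @ (A - B @ L)
        if np.max(np.abs(V_next - V)) < tol:
            return V_next
        V = V_next
    raise RuntimeError("Riccati iteration did not converge")
```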

Relation between continuous and discrete time

The continuous-time system

  ẋ = Ax + Bu
  ℓ(x, u) = (1/2) xᵀ Q x + (1/2) uᵀ R u

can be represented in discrete time with time-step Δ as

  x_{k+1} = (I + ΔA) x_k + ΔB u_k
  ℓ(x, u) = Δ ((1/2) xᵀ Q x + (1/2) uᵀ R u)

In the limit Δ → 0 the discrete Riccati equation reduces to the continuous one:

  V_k = ΔQ + (I + ΔA)ᵀ V_{k+1} (I + ΔA) - Δ (I + ΔA)ᵀ V_{k+1} B (R + Δ Bᵀ V_{k+1} B)⁻¹ Bᵀ V_{k+1} (I + ΔA)
  V_k = V_{k+1} + Δ (Q + Aᵀ V_{k+1} + V_{k+1} A - V_{k+1} B (R + Δ Bᵀ V_{k+1} B)⁻¹ Bᵀ V_{k+1}) + o(Δ²)
  (V_k - V_{k+1}) / Δ = Q + Aᵀ V + V A - V B R⁻¹ Bᵀ V + o(Δ)

which recovers the continuous-time Riccati equation -V̇ = Q + Aᵀ V + V A - V B R⁻¹ Bᵀ V.

Encoding targets as quadratic costs

The matrices A, B, Q, R can be time-varying, which is useful for specifying reference trajectories x*_k and for approximating non-LQG problems.

The cost (1/2) ‖x_k - x*_k‖² can be represented in the LQG framework by augmenting the state vector as x̃ = [x; 1], with dynamics matrix Ã = [A 0; 0 1], etc., and writing the state cost as

  (1/2) x̃ᵀ Q̃_k x̃ = (1/2) x̃ᵀ Dᵀ_k D_k x̃,   where D_k = [I, -x*_k] and so D_k x̃_k = x_k - x*_k

If the target x* is stationary we can instead include it in the state, and use D = [I, -I]. This has the advantage that the resulting control law is independent of x* and therefore can be used for all targets.

Optimal estimation in linear-Gaussian systems

Consider the partially-observed system

  x_{k+1} = A x_k + C ω_k
  y_k = H x_k + D ε_k

with hidden state x_k, measurement y_k, and noise ω_k, ε_k ~ N(0, I). Given a Gaussian prior x_0 ~ N(x̂_0, Σ_0) and a sequence of measurements y_0, y_1, ..., y_k, we want to compute the posterior p_{k+1}(x_{k+1}).

We can show by induction that the posterior is Gaussian at all times. Let p_k(x_k) be N(x̂_k, Σ_k). This will act as a prior for estimating x_{k+1}. Now x_{k+1} and y_k are jointly Gaussian, with mean and covariance

  E[x_{k+1}; y_k] = [A x̂_k; H x̂_k]
  Cov[x_{k+1}; y_k] = [C Cᵀ + A Σ_k Aᵀ,  A Σ_k Hᵀ;
                       H Σ_k Aᵀ,         D Dᵀ + H Σ_k Hᵀ]

Kalman filter

Lemma
If u, v are jointly Gaussian with means û, v̂ and covariances Σ_uu, Σ_vv, Σ_uv = Σᵀ_vu, then u given v is Gaussian with mean and covariance

  E[u|v] = û + Σ_uv Σ⁻¹_vv (v - v̂)
  Cov[u|v] = Σ_uu - Σ_uv Σ⁻¹_vv Σ_vu

Applying this to our problem with u = x_{k+1} and v = y_k yields

Theorem (Kalman filter)
The mean x̂ and covariance Σ of the Gaussian posterior satisfy

  x̂_{k+1} = A x̂_k + K_k (y_k - H x̂_k)
  Σ_{k+1} = C Cᵀ + (A - K_k H) Σ_k Aᵀ
  K_k := A Σ_k Hᵀ (D Dᵀ + H Σ_k Hᵀ)⁻¹
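One step of this filter can be coded directly from the theorem. A minimal sketch (function name and interface are illustrative):

```python
import numpy as np

def kalman_step(xhat, Sigma, y, A, C, H, D):
    """One Kalman filter step in the slide's one-step form:
    predict through the dynamics and correct with measurement y_k."""
    S = D @ D.T + H @ Sigma @ H.T               # innovation covariance
    K = A @ Sigma @ H.T @ np.linalg.inv(S)      # Kalman gain K_k
    xhat_next = A @ xhat + K @ (y - H @ xhat)   # posterior mean
    Sigma_next = C @ C.T + (A - K @ H) @ Sigma @ A.T  # posterior covariance
    return xhat_next, Sigma_next
```

The covariance update agrees with the conditioning lemma: C Cᵀ + (A - K_k H) Σ_k Aᵀ equals Σ_uu - Σ_uv Σ⁻¹_vv Σ_vu for the joint Gaussian on the previous slide.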

Duality of LQG control and Kalman filtering

LQG controller
  State dynamics:            x_{k+1} = (A - B L_k) x_k + C ε_k
  Gain matrix:               L_k = (R + Bᵀ V_{k+1} B)⁻¹ Bᵀ V_{k+1} A
  Backward Riccati equation: V_k = Q + Aᵀ V_{k+1} (A - B L_k)

Kalman filter
  Estimated state dynamics:  x̂_{k+1} = (A - K_k H) x̂_k + K_k y_k
  Gain matrix:               K_k = A Σ_k Hᵀ (D Dᵀ + H Σ_k Hᵀ)⁻¹
  Forward Riccati equation:  Σ_{k+1} = C Cᵀ + (A - K_k H) Σ_k Aᵀ

This form of duality does not generalize to non-LQG systems. However there is a different duality which does generalize (see later). It involves an information filter, computing Σ⁻¹ instead of Σ.
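The correspondence in the table can be checked numerically: one filter Riccati step reproduces one controller Riccati step under the standard substitution A → Aᵀ, B → Hᵀ, Q → C Cᵀ, R → D Dᵀ, V → Σ. A sketch of that check (helper names are illustrative):

```python
import numpy as np

def controller_step(V, A, B, Q, R):
    """Backward Riccati step: V_k = Q + A' V_{k+1} (A - B L_k)."""
    L = np.linalg.solve(R + B.T @ V @ B, B.T @ V @ A)
    return Q + A.T @ V @ (A - B @ L)

def filter_step(Sigma, A, C, H, D):
    """Forward Riccati step: Sigma_{k+1} = CC' + (A - K_k H) Sigma_k A'."""
    K = A @ Sigma @ H.T @ np.linalg.inv(D @ D.T + H @ Sigma @ H.T)
    return C @ C.T + (A - K @ H) @ Sigma @ A.T
```

For any symmetric Σ, filter_step(Σ, A, C, H, D) equals controller_step(Σ, Aᵀ, Hᵀ, C Cᵀ, D Dᵀ); expanding both expressions gives C Cᵀ + A Σ Aᵀ - A Σ Hᵀ (D Dᵀ + H Σ Hᵀ)⁻¹ H Σ Aᵀ.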