Lecture 4 Continuous time linear quadratic regulator

EE363 Winter 2008-09

Lecture 4: Continuous time linear quadratic regulator

- continuous-time LQR problem
- dynamic programming solution
- Hamiltonian system and two point boundary value problem
- infinite horizon LQR
- direct solution of ARE via Hamiltonian

Continuous-time LQR problem

continuous-time system: $\dot x = Ax + Bu$, $x(0) = x_0$

problem: choose $u : [0,T] \to \mathbf{R}^m$ to minimize

$$J = \int_0^T \left( x(\tau)^T Q x(\tau) + u(\tau)^T R u(\tau) \right) d\tau + x(T)^T Q_f x(T)$$

- $T$ is the time horizon
- $Q = Q^T \ge 0$, $Q_f = Q_f^T \ge 0$, $R = R^T > 0$ are the state cost, final state cost, and input cost matrices

this is an infinite-dimensional problem: the variable is the trajectory $u : [0,T] \to \mathbf{R}^m$

Dynamic programming solution

we'll solve the LQR problem using dynamic programming

for $0 \le t \le T$ we define the value function $V_t : \mathbf{R}^n \to \mathbf{R}$ by

$$V_t(z) = \min_u \int_t^T \left( x(\tau)^T Q x(\tau) + u(\tau)^T R u(\tau) \right) d\tau + x(T)^T Q_f x(T)$$

subject to $x(t) = z$, $\dot x = Ax + Bu$

- the minimum is taken over all possible signals $u : [t,T] \to \mathbf{R}^m$
- $V_t(z)$ gives the minimum LQR cost-to-go, starting from state $z$ at time $t$
- $V_T(z) = z^T Q_f z$

fact: $V_t$ is quadratic, i.e., $V_t(z) = z^T P_t z$, where $P_t = P_t^T \ge 0$

similar to the discrete-time case:

- $P_t$ can be found from a differential equation running backward in time from $t = T$
- the LQR optimal $u$ is easily expressed in terms of $P_t$

we start with $x(t) = z$

let's take $u(t) = w \in \mathbf{R}^m$, a constant, over the time interval $[t, t+h]$, where $h > 0$ is small

the cost incurred over $[t, t+h]$ is

$$\int_t^{t+h} \left( x(\tau)^T Q x(\tau) + w^T R w \right) d\tau \approx h \left( z^T Q z + w^T R w \right)$$

and we end up at $x(t+h) \approx z + h(Az + Bw)$

the min-cost-to-go from where we land is approximately

$$V_{t+h}\bigl(z + h(Az+Bw)\bigr) = \bigl(z + h(Az+Bw)\bigr)^T P_{t+h} \bigl(z + h(Az+Bw)\bigr)$$
$$\approx \bigl(z + h(Az+Bw)\bigr)^T (P_t + h \dot P_t) \bigl(z + h(Az+Bw)\bigr)$$
$$\approx z^T P_t z + h\left( (Az+Bw)^T P_t z + z^T P_t (Az+Bw) + z^T \dot P_t z \right)$$

(dropping $h^2$ and higher terms)

cost incurred plus min-cost-to-go is approximately

$$z^T P_t z + h\left( z^T Q z + w^T R w + (Az+Bw)^T P_t z + z^T P_t (Az+Bw) + z^T \dot P_t z \right)$$

minimize over $w$ to get the (approximately) optimal $w$:

$$2 h w^T R + 2 h z^T P_t B = 0$$

so $w^* = -R^{-1} B^T P_t z$

thus the optimal $u$ is a time-varying linear state feedback:

$$u_{\rm lqr}(t) = K_t x(t), \qquad K_t = -R^{-1} B^T P_t$$

HJ equation

now let's substitute $w^*$ into the HJ equation:

$$z^T P_t z \approx z^T P_t z + h\left( z^T Q z + {w^*}^T R w^* + (Az+Bw^*)^T P_t z + z^T P_t (Az+Bw^*) + z^T \dot P_t z \right)$$

which yields, after simplification,

$$-\dot P_t = A^T P_t + P_t A - P_t B R^{-1} B^T P_t + Q$$

this is the Riccati differential equation for the LQR problem

we can solve it (numerically) using the final condition $P_T = Q_f$

Summary of cts-time LQR solution via DP

1. solve the Riccati differential equation

$$-\dot P_t = A^T P_t + P_t A - P_t B R^{-1} B^T P_t + Q, \qquad P_T = Q_f$$

(backward in time)

2. the optimal $u$ is $u_{\rm lqr}(t) = K_t x(t)$, where $K_t := -R^{-1} B^T P_t$

the DP method readily extends to time-varying $A$, $B$, $Q$, $R$, and to the tracking problem
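The two steps above can be sketched numerically with a simple Euler scheme. This is a minimal illustration, not the lecture's code; the double-integrator system and the weights $Q$, $R$, $Q_f$ are hypothetical choices.

```python
import numpy as np

# Hypothetical example: double integrator with unit weights.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
Qf = np.eye(2)
Rinv = np.linalg.inv(R)

T = 10.0          # horizon
N = 20000         # Euler steps
h = T / N

# Step 1: integrate the Riccati DE backward from P(T) = Qf:
#   P(t - h) ~= P(t) + h (A'P + PA - P B R^{-1} B' P + Q)
P = Qf.copy()
Ks = []           # step 2: gains K_t = -R^{-1} B' P_t, from t = T down to t = 0
for _ in range(N):
    Ks.append(-Rinv @ B.T @ P)
    P = P + h * (A.T @ P + P @ A - P @ B @ Rinv @ B.T @ P + Q)
```

With a long horizon, the resulting $P$ at $t=0$ has essentially reached its steady-state value, so the ARE residual is small.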

Steady-state regulator

usually $P_t$ rapidly converges as $t$ decreases below $T$

the limit $P_{ss}$ satisfies the (cts-time) algebraic Riccati equation (ARE)

$$A^T P + P A - P B R^{-1} B^T P + Q = 0$$

a quadratic matrix equation

- $P_{ss}$ can be found by (numerically) integrating the Riccati differential equation, or by direct methods
- for $t$ not close to the horizon $T$, the LQR optimal input is approximately a linear, constant state feedback

$$u(t) = K_{ss} x(t), \qquad K_{ss} = -R^{-1} B^T P_{ss}$$
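As a sketch of the "direct methods" route, SciPy's `solve_continuous_are` solves the ARE directly; the double-integrator data below is a hypothetical example, not from the lecture.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical double-integrator example.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Solve A'P + PA - P B R^{-1} B' P + Q = 0 for the PSD solution.
P_ss = solve_continuous_are(A, B, Q, R)

# Steady-state gain, with the sign convention of these notes: u = K_ss x.
K_ss = -np.linalg.solve(R, B.T @ P_ss)
```

For this particular example the ARE can be solved by hand, giving $P_{ss} = \begin{bmatrix} \sqrt3 & 1 \\ 1 & \sqrt3 \end{bmatrix}$, which matches the solver's output.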

Derivation via discretization

let's discretize using a small step size $h > 0$, with $Nh = T$:

$$x((k+1)h) \approx x(kh) + h \dot x(kh) = (I + hA)\, x(kh) + h B u(kh)$$

$$J \approx \sum_{k=0}^{N-1} h \left( x(kh)^T Q x(kh) + u(kh)^T R u(kh) \right) + x(Nh)^T Q_f x(Nh)$$

this yields a discrete-time LQR problem, with parameters

$$\tilde A = I + hA, \quad \tilde B = hB, \quad \tilde Q = hQ, \quad \tilde R = hR, \quad \tilde Q_f = Q_f$$

the solution to the discrete-time LQR problem is $u(kh) = \tilde K_k x(kh)$, with

$$\tilde K_k = -\left( \tilde R + \tilde B^T \tilde P_{k+1} \tilde B \right)^{-1} \tilde B^T \tilde P_{k+1} \tilde A$$

$$\tilde P_{k-1} = \tilde Q + \tilde A^T \tilde P_k \tilde A - \tilde A^T \tilde P_k \tilde B \left( \tilde R + \tilde B^T \tilde P_k \tilde B \right)^{-1} \tilde B^T \tilde P_k \tilde A$$

substituting and keeping only the $h^0$ and $h^1$ terms yields

$$\tilde P_{k-1} = hQ + \tilde P_k + h A^T \tilde P_k + h \tilde P_k A - h \tilde P_k B R^{-1} B^T \tilde P_k$$

which is the same as

$$-\frac{1}{h}\left( \tilde P_k - \tilde P_{k-1} \right) = Q + A^T \tilde P_k + \tilde P_k A - \tilde P_k B R^{-1} B^T \tilde P_k$$

letting $h \to 0$ we see that $\tilde P_k \to P_{kh}$, where

$$-\dot P = Q + A^T P + P A - P B R^{-1} B^T P$$

similarly, we have

$$\tilde K_k = -\left( \tilde R + \tilde B^T \tilde P_{k+1} \tilde B \right)^{-1} \tilde B^T \tilde P_{k+1} \tilde A
= -\left( hR + h^2 B^T \tilde P_{k+1} B \right)^{-1} h B^T \tilde P_{k+1} (I + hA)
\to -R^{-1} B^T P_{kh}$$

as $h \to 0$
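The discretization argument can be checked numerically: run the discrete-time Riccati recursion with the parameters $\tilde A = I + hA$, $\tilde B = hB$, $\tilde Q = hQ$, $\tilde R = hR$ for small $h$, and compare with a direct continuous-time ARE solve. The double-integrator data is a hypothetical example.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical double-integrator example.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
Qf = np.eye(2)

h = 1e-3
T = 10.0
N = int(T / h)

# Euler-discretized parameters.
Atil, Btil, Qtil, Rtil = np.eye(2) + h * A, h * B, h * Q, h * R

# Discrete-time Riccati recursion, backward from P_N = Qf to P_0.
P = Qf.copy()
for _ in range(N):
    S = Rtil + Btil.T @ P @ Btil
    P = Qtil + Atil.T @ P @ Atil \
        - Atil.T @ P @ Btil @ np.linalg.solve(S, Btil.T @ P @ Atil)

# For small h and T large, P_0 is close to the continuous-time steady-state P.
P_cts = solve_continuous_are(A, B, Q, R)
```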

Derivation using Lagrange multipliers

pose as a constrained problem:

minimize $\displaystyle J = \frac{1}{2}\int_0^T \left( x(\tau)^T Q x(\tau) + u(\tau)^T R u(\tau) \right) d\tau + \frac{1}{2} x(T)^T Q_f x(T)$

subject to $\dot x(t) = A x(t) + B u(t)$, $t \in [0,T]$

- the optimization variable is the function $u : [0,T] \to \mathbf{R}^m$
- there are an infinite number of equality constraints, one for each $t \in [0,T]$

introduce the Lagrange multiplier function $\lambda : [0,T] \to \mathbf{R}^n$ and form

$$L = J + \int_0^T \lambda(\tau)^T \left( A x(\tau) + B u(\tau) - \dot x(\tau) \right) d\tau$$

Optimality conditions

(note: you need distribution theory to really make sense of the derivatives here...)

from $\nabla_{u(t)} L = R u(t) + B^T \lambda(t) = 0$ we get $u(t) = -R^{-1} B^T \lambda(t)$

to find $\nabla_{x(t)} L$, we use

$$\int_0^T \lambda(\tau)^T \dot x(\tau)\, d\tau = \lambda(T)^T x(T) - \lambda(0)^T x(0) - \int_0^T \dot\lambda(\tau)^T x(\tau)\, d\tau$$

from $\nabla_{x(t)} L = Q x(t) + A^T \lambda(t) + \dot\lambda(t) = 0$ we get $\dot\lambda(t) = -A^T \lambda(t) - Q x(t)$

from $\nabla_{x(T)} L = Q_f x(T) - \lambda(T) = 0$, we get $\lambda(T) = Q_f x(T)$

Co-state equations

the optimality conditions are

$$\dot x = A x + B u, \quad x(0) = x_0, \qquad \dot\lambda = -A^T \lambda - Q x, \quad \lambda(T) = Q_f x(T)$$

using $u(t) = -R^{-1} B^T \lambda(t)$, we can write this as

$$\frac{d}{dt}\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} = \begin{bmatrix} A & -B R^{-1} B^T \\ -Q & -A^T \end{bmatrix} \begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}$$

- the $2n \times 2n$ matrix above is called the Hamiltonian for the problem
- with the conditions $x(0) = x_0$, $\lambda(T) = Q_f x(T)$, this is called a two-point boundary value problem

as in the discrete-time case, we can show that $\lambda(t) = P_t x(t)$, where

$$-\dot P_t = A^T P_t + P_t A - P_t B R^{-1} B^T P_t + Q, \qquad P_T = Q_f$$

in other words, the value function $P_t$ gives a simple relation between $x$ and $\lambda$

to show this, we show that $\lambda = P x$ satisfies the co-state equation $\dot\lambda = -A^T \lambda - Q x$:

$$\dot\lambda = \frac{d}{dt}(P x) = \dot P x + P \dot x$$
$$= -\left( Q + A^T P + P A - P B R^{-1} B^T P \right) x + P \left( A x - B R^{-1} B^T \lambda \right)$$
$$= -Q x - A^T P x + P B R^{-1} B^T P x - P B R^{-1} B^T P x$$
$$= -Q x - A^T \lambda$$

Solving the Riccati differential equation via the Hamiltonian

the (quadratic) Riccati differential equation

$$-\dot P = A^T P + P A - P B R^{-1} B^T P + Q$$

and the (linear) Hamiltonian differential equation

$$\frac{d}{dt}\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} = \begin{bmatrix} A & -B R^{-1} B^T \\ -Q & -A^T \end{bmatrix} \begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}$$

are closely related

$\lambda(t) = P_t x(t)$ suggests that $P$ should have the form $P_t = \lambda(t) x(t)^{-1}$ (but this doesn't make sense unless $x$ and $\lambda$ are scalars)

consider the Hamiltonian matrix (linear) differential equation

$$\frac{d}{dt}\begin{bmatrix} X(t) \\ Y(t) \end{bmatrix} = \begin{bmatrix} A & -B R^{-1} B^T \\ -Q & -A^T \end{bmatrix} \begin{bmatrix} X(t) \\ Y(t) \end{bmatrix}$$

where $X(t), Y(t) \in \mathbf{R}^{n \times n}$

then $Z(t) = Y(t) X(t)^{-1}$ satisfies the Riccati differential equation

$$-\dot Z = A^T Z + Z A - Z B R^{-1} B^T Z + Q$$

hence we can solve the Riccati DE by solving the (linear) matrix Hamiltonian DE, with final conditions $X(T) = I$, $Y(T) = Q_f$, and forming $P(t) = Y(t) X(t)^{-1}$

to verify this, compute

$$\dot Z = \frac{d}{dt}\left( Y X^{-1} \right) = \dot Y X^{-1} - Y X^{-1} \dot X X^{-1}$$
$$= \left( -Q X - A^T Y \right) X^{-1} - Y X^{-1} \left( A X - B R^{-1} B^T Y \right) X^{-1}$$
$$= -Q - A^T Z - Z A + Z B R^{-1} B^T Z$$

where we use the two identities

$$\frac{d}{dt}\bigl( F(t) G(t) \bigr) = \dot F(t) G(t) + F(t) \dot G(t), \qquad \frac{d}{dt}\left( F(t)^{-1} \right) = -F(t)^{-1} \dot F(t) F(t)^{-1}$$
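The construction $P(t) = Y(t)X(t)^{-1}$ can be sketched numerically: integrate the linear Hamiltonian matrix ODE backward from $X(T) = I$, $Y(T) = Q_f$ with Euler steps. The double-integrator data and step sizes are hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical double-integrator example.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
Qf = np.eye(2)
n = 2

# Hamiltonian matrix.
H = np.block([[A, -B @ np.linalg.inv(R) @ B.T],
              [-Q, -A.T]])

T_hor = 10.0
N = 20000
h = T_hor / N

# Integrate d/dt [X; Y] = H [X; Y] backward: M(t - h) ~= M(t) - h H M(t).
XY = np.vstack([np.eye(n), Qf])          # [X(T); Y(T)]
for _ in range(N):
    XY = XY - h * (H @ XY)
X, Y = XY[:n, :], XY[n:, :]

P0 = Y @ np.linalg.inv(X)                # P(t) = Y X^{-1} at t = 0
```

For a long horizon, $P(0)$ has essentially converged to the steady-state ARE solution (here $\begin{bmatrix}\sqrt3 & 1\\ 1 & \sqrt3\end{bmatrix}$ for this example).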

Infinite horizon LQR

we now consider the infinite horizon cost function

$$J = \int_0^\infty \left( x(\tau)^T Q x(\tau) + u(\tau)^T R u(\tau) \right) d\tau$$

we define the value function as

$$V(z) = \min_u \int_0^\infty \left( x(\tau)^T Q x(\tau) + u(\tau)^T R u(\tau) \right) d\tau$$

subject to $x(0) = z$, $\dot x = A x + B u$

we assume that $(A, B)$ is controllable, so $V$ is finite for all $z$

one can show that $V$ is quadratic: $V(z) = z^T P z$, where $P = P^T \ge 0$

the optimal $u$ is $u(t) = K x(t)$, where $K = -R^{-1} B^T P$ (i.e., a constant linear state feedback)

the HJ equation is the ARE

$$Q + A^T P + P A - P B R^{-1} B^T P = 0$$

which together with $P \ge 0$ characterizes $P$

we can solve for $P$ as the limiting value of the Riccati DE, or via a direct method

Closed-loop system

with $K$ the LQR optimal state feedback gain, the closed-loop system is

$$\dot x = A x + B u = (A + BK) x$$

fact: the closed-loop system is stable when $(Q, A)$ is observable and $(A, B)$ is controllable

we denote the eigenvalues of $A + BK$, called the closed-loop eigenvalues, as $\lambda_1, \ldots, \lambda_n$

with the assumptions above, $\Re \lambda_i < 0$
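The stability fact above is easy to check numerically. On a hypothetical double integrator with $Q = I > 0$ (so $(Q, A)$ is observable) and $(A, B)$ controllable, all closed-loop eigenvalues of $A + BK$ should have negative real part:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical double-integrator example; Q > 0 ensures (Q, A) observable.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = -np.linalg.solve(R, B.T @ P)        # u = K x

# Closed-loop eigenvalues of A + BK; here they are -sqrt(3)/2 +/- j/2.
cl_eigs = np.linalg.eigvals(A + B @ K)
```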

Solving the ARE via the Hamiltonian

$$\begin{bmatrix} A & -B R^{-1} B^T \\ -Q & -A^T \end{bmatrix} \begin{bmatrix} I \\ P \end{bmatrix}
= \begin{bmatrix} A - B R^{-1} B^T P \\ -Q - A^T P \end{bmatrix}
= \begin{bmatrix} A + BK \\ -Q - A^T P \end{bmatrix}$$

and so

$$\begin{bmatrix} I & 0 \\ -P & I \end{bmatrix} \begin{bmatrix} A & -B R^{-1} B^T \\ -Q & -A^T \end{bmatrix} \begin{bmatrix} I & 0 \\ P & I \end{bmatrix}
= \begin{bmatrix} A + BK & -B R^{-1} B^T \\ 0 & -(A + BK)^T \end{bmatrix}$$

where the 0 in the lower left corner comes from the ARE

note that

$$\begin{bmatrix} I & 0 \\ P & I \end{bmatrix}^{-1} = \begin{bmatrix} I & 0 \\ -P & I \end{bmatrix}$$

we see that:

- the eigenvalues of the Hamiltonian $H$ are $\lambda_1, \ldots, \lambda_n$ and $-\lambda_1, \ldots, -\lambda_n$
- hence, the closed-loop eigenvalues are the eigenvalues of $H$ with negative real part

let's assume $A + BK$ is diagonalizable, i.e.,

$$T^{-1} (A + BK) T = \Lambda = \mathop{\bf diag}(\lambda_1, \ldots, \lambda_n)$$

then we have $T^T \left( -(A + BK)^T \right) T^{-T} = -\Lambda$, so

$$\begin{bmatrix} T^{-1} & 0 \\ 0 & T^T \end{bmatrix} \begin{bmatrix} A + BK & -B R^{-1} B^T \\ 0 & -(A + BK)^T \end{bmatrix} \begin{bmatrix} T & 0 \\ 0 & T^{-T} \end{bmatrix}
= \begin{bmatrix} \Lambda & -T^{-1} B R^{-1} B^T T^{-T} \\ 0 & -\Lambda \end{bmatrix}$$

putting it together we get

$$\begin{bmatrix} T^{-1} & 0 \\ 0 & T^T \end{bmatrix} \begin{bmatrix} I & 0 \\ -P & I \end{bmatrix} H \begin{bmatrix} I & 0 \\ P & I \end{bmatrix} \begin{bmatrix} T & 0 \\ 0 & T^{-T} \end{bmatrix}
= \begin{bmatrix} T^{-1} & 0 \\ -T^T P & T^T \end{bmatrix} H \begin{bmatrix} T & 0 \\ P T & T^{-T} \end{bmatrix}
= \begin{bmatrix} \Lambda & -T^{-1} B R^{-1} B^T T^{-T} \\ 0 & -\Lambda \end{bmatrix}$$

and so

$$H \begin{bmatrix} T \\ P T \end{bmatrix} = \begin{bmatrix} T \\ P T \end{bmatrix} \Lambda$$

thus, the $n$ columns of $\begin{bmatrix} T \\ PT \end{bmatrix}$ are the eigenvectors of $H$ associated with the stable eigenvalues $\lambda_1, \ldots, \lambda_n$

Solving the ARE via the Hamiltonian

- find the eigenvalues of $H$, and let $\lambda_1, \ldots, \lambda_n$ denote the $n$ stable ones (there are exactly $n$ stable and $n$ unstable ones)
- find the associated eigenvectors $v_1, \ldots, v_n$, and partition as

$$\begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix} = \begin{bmatrix} X \\ Y \end{bmatrix} \in \mathbf{R}^{2n \times n}$$

- $P = Y X^{-1}$ is the unique PSD solution of the ARE

(this is very close to the method used in practice, which does not require $A + BK$ to be diagonalizable)
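The eigenvector method above can be sketched in a few lines; the double-integrator data is a hypothetical example, and a production code would use a more robust Schur-based variant.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical double-integrator example.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
n = 2

# Hamiltonian matrix.
H = np.block([[A, -B @ np.linalg.inv(R) @ B.T],
              [-Q, -A.T]])

# Take the eigenvectors for the n stable eigenvalues and partition as [X; Y].
evals, evecs = np.linalg.eig(H)
V = evecs[:, evals.real < 0]
X, Y = V[:n, :], V[n:, :]

# P = Y X^{-1}; imaginary parts cancel (up to roundoff) since stable
# eigenvalues come in conjugate pairs.
P = np.real(Y @ np.linalg.inv(X))
```

One can check numerically that this $P$ agrees with a direct ARE solver.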