OPTIMAL CONTROL
Sadegh Bolouki
Lecture slides for ECE 515
University of Illinois, Urbana-Champaign
Fall 2016
S. Bolouki (UIUC) 1 / 28
(Example from Optimal Control Theory, Kirk)
Objective: drive a car from position 0 to position e in the shortest time possible.
Physical constraints:
- No backing up.
- Start and end at rest.
- Bounded acceleration and bounded deceleration.
- Limited fuel; no gas station on the road.
(Example from Optimal Control Theory, Kirk)
States and inputs: $x_1$: position, $x_2$: velocity, $u_1$: acceleration, $u_2$: deceleration.
Dynamics:
$$\begin{bmatrix}\dot x_1\\ \dot x_2\end{bmatrix} = \begin{bmatrix}0 & 1\\ 0 & 0\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}0 & 0\\ 1 & 1\end{bmatrix}\begin{bmatrix}u_1\\ u_2\end{bmatrix}$$
Problem:
$$\min_u \; t_f$$
subject to
$$x(t_0)=\begin{bmatrix}0\\0\end{bmatrix},\qquad x(t_f)=\begin{bmatrix}e\\0\end{bmatrix},\qquad 0\le x_1\le e,\qquad 0\le x_2,$$
$$0\le u_1\le M_1,\qquad -M_2\le u_2\le 0,\qquad \int_{t_0}^{t_f}\big(k_1 u_1 + k_2 x_2\big)\,dt \le G$$
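The dynamics above are a double integrator, so trajectories under piecewise-constant inputs can be rolled out in closed form. Below is a minimal sketch; the accelerate-then-brake profile and all numerical values are illustrative assumptions, not the solution of the constrained optimal control problem.

```python
# Closed-form rollout of the double-integrator car model under a
# hand-picked accelerate-then-brake input (illustration only; this is
# not the time/fuel-constrained optimal control).

def rollout(a1, a2, T1, T2):
    """Accelerate at a1 for T1 seconds, then decelerate at a2 (< 0) for
    T2 seconds. Returns final (position, velocity), starting from rest at 0."""
    v1 = a1 * T1                        # velocity after phase 1
    x1 = 0.5 * a1 * T1 ** 2             # position after phase 1
    v2 = v1 + a2 * T2                   # velocity after phase 2
    x2 = x1 + v1 * T2 + 0.5 * a2 * T2 ** 2
    return x2, v2

# symmetric bang-bang: +1 m/s^2 for 2 s, then -1 m/s^2 for 2 s
pos, vel = rollout(1.0, -1.0, 2.0, 2.0)
print(pos, vel)  # 4.0 0.0 -- the car travels 4 units and ends at rest
```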
Optimal Control Problem
Optimal Control Problem Optimal Control Problem Assume that t 0 and x 0 are given. Cost : V(u) = where - the final time t 1 is given. Dynamics : ẋ(t) = f (x, u, t), x(t 0 ) = x 0 t1 - V, l, and m are real-valued functions. - x(t 1 ) is free. t 0 l(x, u, τ)dτ + m(x(t 1 )) Objective: Find a control input u defined over [t 0, t 1 ] that minimizes V. An important special case is the LQR problem: ẋ = A(t)x(t) + B(t)u(t), x(t 0 ) = x 0 t1 V(u) = (x T Q(t)x + u T R(t)u)dt + x T (t 1 )Mx(t 1 ) t 0 where Q(t), R(t), and M are positive semi-definite for any t. S. Bolouki (UIUC) 6 / 28
Hamilton-Jacobi-Bellman Equation
HJB Equation
Let $V(x,t)$ denote the value function
$$V(x,t) := \min_{u_{[t,t_1]}} \left[\int_t^{t_1} l\big(x(\tau),u(\tau),\tau\big)\,d\tau + m\big(x(t_1)\big)\right]$$
where $x = x(t)$ and $u_{[t,t_1]}$ indicates that $u$ is defined over $[t,t_1]$. For any $t_m$ with $t < t_m < t_1$ we now have
$$V(x,t) = \min_{u_{[t,t_1]}}\left[\int_t^{t_m} l\,d\tau + \int_{t_m}^{t_1} l\,d\tau + m\big(x(t_1)\big)\right]
= \min_{u_{[t,t_m]}}\left[\int_t^{t_m} l\big(x(\tau),u(\tau),\tau\big)\,d\tau + \underbrace{\min_{u_{[t_m,t_1]}}\left[\int_{t_m}^{t_1} l\big(x(\tau),u(\tau),\tau\big)\,d\tau + m\big(x(t_1)\big)\right]}_{V(x(t_m),\,t_m)}\right]$$
HJB Equation
$$V(x,t) = \min_{u_{[t,t_m]}}\left[\int_t^{t_m} l\big(x(\tau),u(\tau),\tau\big)\,d\tau + V\big(x(t_m),t_m\big)\right]$$
Principle of optimality: if $u^*_{[t,t_1]}$ is an optimal control input given $t$, $x$, and $t_1$ as the starting time, starting point, and final time, and the resulting optimal trajectory passes through $x_m$ at time $t_m$, then the restriction $u^*_{[t_m,t_1]}$ is an optimal control input given $t_m$, $x_m$, $t_1$.
HJB Equation
If we now let $t_m \to t$, we arrive at the HJB equation. Let
$$t_m = t + \Delta t,\qquad x_m = x(t_m) = x(t+\Delta t) = x(t) + \Delta x.$$
If $V$ is sufficiently smooth around $(x,t)$, we can write its Taylor series and conclude
$$V(x,t) = \min_{u_{[t,t_m]}}\Big[l\big(x(t),u(t),t\big)\,\Delta t + V\big(x(t_m),t_m\big)\Big]
= \min_{u_{[t,t_m]}}\Big[l\big(x(t),u(t),t\big)\,\Delta t + V(x,t) + \frac{\partial V}{\partial x}\big(x,t\big)\,\Delta x + \frac{\partial V}{\partial t}\big(x,t\big)\,\Delta t + \cdots\Big]$$
Thus
$$0 = \min_{u_{[t,t_m]}}\Big[l\big(x,u(t),t\big) + \frac{\partial V}{\partial x}\big(x,t\big)\,\frac{\Delta x}{\Delta t} + \frac{\partial V}{\partial t}\big(x,t\big)\Big]$$
HJB Equation
Letting $\Delta t \to 0$, we have
$$0 = \min_u\Big[l(x,u,t) + \nabla_x V(x,t)^T\,\dot x + \frac{\partial V}{\partial t}(x,t)\Big]$$
where $\nabla_x V = \left[\frac{\partial V}{\partial x}\right]^T$.
HJB equation:
$$-\frac{\partial V}{\partial t}(x,t) = \min_u \underbrace{\Big[l(x,u,t) + \nabla_x V(x,t)^T f(x,u,t)\Big]}_{\text{Hamiltonian}}$$
Boundary condition: $V\big(x(t_1),t_1\big) = m\big(x(t_1)\big)$.
Hamiltonian: $H(x,p,u,t) := l(x,u,t) + p^T f(x,u,t)$.
HJB Equation
Theorem. If the value function $V$ has continuous partial derivatives, then it satisfies the HJB equation
$$-\frac{\partial V}{\partial t}(x,t) = \min_u H\big(x, \nabla_x V(x,t), u, t\big).$$
Furthermore, the optimal control $u^*$ and the corresponding optimal trajectory $x^*$ satisfy
$$\min_u H\big(x^*, \nabla_x V(x^*(t),t), u, t\big) = H\big(x^*, \nabla_x V(x^*(t),t), u^*, t\big). \tag{1}$$
Conversely, if some function $V$ satisfies the HJB equation and its boundary condition, and $u^*$ satisfies (1), then $u^*$ is the optimal control and $V$ is the desired value function.
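As a worked sanity check (a scalar special case added here for illustration, not from the slides): substituting a quadratic ansatz into the HJB equation for $\dot x = ax + bu$ with running cost $qx^2 + ru^2$ recovers a scalar Riccati equation.

```latex
% Quadratic ansatz: V(x,t) = p(t) x^2, so \nabla_x V = 2 p x. The HJB equation reads
-\dot p\, x^2 = \min_u \big[\, q x^2 + r u^2 + 2 p x (a x + b u) \,\big]
% Minimizing over u: 2 r u + 2 p b x = 0, i.e. u^* = -(b p / r)\, x.
% Substituting u^* back and dividing by x^2:
-\dot p = q + 2 a p - \frac{b^2 p^2}{r}
```

This is exactly the scalar form of the Riccati differential equation that the LQR section derives in matrix form.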
Linear Quadratic Regulators
LQR Problem
In the LQR problem we have
$$\dot x = A(t)x(t)+B(t)u(t),\qquad x(t_0)=x_0$$
$$V(u) = \int_{t_0}^{t_1}\big(x^TQ(t)x + u^TR(t)u\big)\,dt + x^T(t_1)Mx(t_1)$$
We can compute the Hamiltonian as
$$H(x,p,u,t) = x^TQx + u^TRu + p^T(Ax+Bu)$$
Remember that $u^*$ minimizes the Hamiltonian. Thus
$$u^* = -\tfrac{1}{2}R^{-1}B^Tp,\qquad p = \nabla_x V$$
Therefore, the closed-loop dynamics are
$$\dot x = Ax - \tfrac{1}{2}BR^{-1}B^T\nabla_x V.$$
LQR Problem
The HJB equation now is
$$-\frac{\partial V}{\partial t} = H(x,\nabla_x V,u^*,t) = x^TQx + \tfrac14(\nabla_x V)^TBR^{-1}RR^{-1}B^T(\nabla_x V) + (\nabla_x V)^TAx - \tfrac12(\nabla_x V)^TBR^{-1}B^T(\nabla_x V)$$
which can be rephrased as
$$-\frac{\partial V}{\partial t} = x^TQx + (\nabla_x V)^TAx - \tfrac14(\nabla_x V)^TBR^{-1}B^T(\nabla_x V)$$
We know that at the final time $t_1$: $V(x,t_1)=x^TMx$. Thus we guess that, for some time-varying positive-definite matrix $P(t)$,
$$V = x^TP(t)x$$
We now have $\frac{\partial V}{\partial t} = x^T\dot P(t)x$ and $\nabla_x V = 2Px$.
LQR Problem
Thus we obtain the Riccati Differential Equation (RDE)
$$-\dot P = Q + PA + A^TP - PBR^{-1}B^TP$$
with the boundary condition
$$P(t_1) = M$$
- The RDE with its boundary condition has a unique solution if $A$, $B$, $Q$, $R$ are all piecewise continuous in time. This unique solution is symmetric.
- $u^* = -R^{-1}B^TPx =: -Kx$, which is a state feedback.
- $V(x_0,t_0) = x_0^TP(t_0)x_0$.
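In practice the RDE is integrated backward from the terminal condition. The sketch below does this numerically for a double-integrator system; the matrices and horizon are assumptions chosen for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Backward integration of the matrix RDE
#   -dP/dt = Q + P A + A^T P - P B R^{-1} B^T P,   P(t1) = M.
# Substituting s = t1 - t turns it into a forward ODE in s.
# Double-integrator example; all matrices are assumed for illustration.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
M = np.eye(2)
t1 = 10.0

def rde(s, p):
    P = p.reshape(2, 2)
    dP = Q + P @ A + A.T @ P - P @ B @ np.linalg.solve(R, B.T) @ P
    return dP.ravel()

sol = solve_ivp(rde, [0.0, t1], M.ravel(), rtol=1e-10, atol=1e-12)
P0 = sol.y[:, -1].reshape(2, 2)      # P at t = 0
print(P0)  # for this long horizon, close to [[sqrt(3), 1], [1, sqrt(3)]]
```

For a horizon this long, $P(0)$ has essentially converged to the infinite-horizon solution, foreshadowing the limit discussed later.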
Hamiltonian Matrix
Hamiltonian Matrix
Recall that the RDE is nonlinear. Our goal here is to compute its solution by solving a linear ODE.
Theorem. The solution $P(t)$ of the RDE with its boundary condition is $Y(t)X^{-1}(t)$, where $X(t)$ and $Y(t)$ are the unique solutions of
$$\begin{bmatrix}\dot X(t)\\ \dot Y(t)\end{bmatrix} = \begin{bmatrix}A & -BR^{-1}B^T\\ -Q & -A^T\end{bmatrix}\begin{bmatrix}X(t)\\ Y(t)\end{bmatrix},\qquad \begin{bmatrix}X(t_1)\\ Y(t_1)\end{bmatrix} = \begin{bmatrix}I\\ M\end{bmatrix}$$
The matrix $\begin{bmatrix}A & -BR^{-1}B^T\\ -Q & -A^T\end{bmatrix}$ is called the Hamiltonian matrix, denoted by $\mathcal H$. For the LTI case, the RDE can be solved explicitly.
Hamiltonian Matrix
How do we solve the RDE for the LTI case? Assume that
- the eigenvalues of $\mathcal H$ are pairwise distinct;
- if $\lambda$ is an eigenvalue of $\mathcal H$, so is $-\lambda$.
Diagonalize $\mathcal H$:
$$U^{-1}\mathcal H U = \begin{bmatrix}\Lambda_s & 0\\ 0 & -\Lambda_s\end{bmatrix}$$
where $\Lambda_s$ contains the stable eigenvalues and $U = \begin{bmatrix}U_{11} & U_{12}\\ U_{21} & U_{22}\end{bmatrix}$ is the matrix composed of the eigenvectors of $\mathcal H$. Then
$$\begin{bmatrix}X(t)\\ Y(t)\end{bmatrix} = U\begin{bmatrix}e^{\Lambda_s(t-t_1)} & 0\\ 0 & e^{-\Lambda_s(t-t_1)}\end{bmatrix}U^{-1}\begin{bmatrix}I\\ M\end{bmatrix}$$
Hamiltonian Matrix
Finally,
$$P(t) = Y(t)X^{-1}(t) = \big[U_{21} + U_{22}\,e^{-\Lambda_s(t-t_1)}G\,e^{-\Lambda_s(t-t_1)}\big]\big[U_{11} + U_{12}\,e^{-\Lambda_s(t-t_1)}G\,e^{-\Lambda_s(t-t_1)}\big]^{-1}$$
where
$$G = -[U_{22} - MU_{12}]^{-1}[U_{21} - MU_{11}]$$
Keep in mind that if $t_1 \to \infty$, then $P(t) \to U_{21}U_{11}^{-1}$, since $\Lambda_s$ is stable and hence $e^{-\Lambda_s(t-t_1)} = e^{\Lambda_s(t_1-t)} \to 0$. We will get back to this in the infinite-horizon case.
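A small numerical sketch of this construction (the double-integrator matrices below are assumptions for illustration): diagonalize the Hamiltonian matrix, keep the stable eigenvectors, and form $U_{21}U_{11}^{-1}$.

```python
import numpy as np

# Steady-state P = U21 U11^{-1} from the Hamiltonian matrix, for an
# assumed double-integrator example (A, B, Q, R chosen for illustration).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

H = np.block([[A, -B @ np.linalg.solve(R, B.T)],
              [-Q, -A.T]])
w, V = np.linalg.eig(H)
stable = w.real < 0                  # eigenvalues come in +/- lambda pairs
U11 = V[:2, stable]                  # top block of the stable eigenvectors
U21 = V[2:, stable]                  # bottom block
P = (U21 @ np.linalg.inv(U11)).real  # imaginary parts cancel numerically
print(P)  # [[sqrt(3), 1], [1, sqrt(3)]] up to rounding
```

Note that $P$ is invariant to how the eigenvectors are scaled, since any column scaling cancels in $U_{21}U_{11}^{-1}$.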
Infinite Horizon Regulators
Infinite Horizon Regulators
We consider the LTI case:
$$\dot x = Ax + Bu,\qquad x(0)=x_0$$
$$V(u) = \int_0^\infty \big[x^T(t)Qx(t) + u^T(t)Ru(t)\big]\,dt$$
Objective: find $u_{[0,\infty)}$ that minimizes $V$.
General idea: solve the finite-horizon problem with $t_1$ as the final time; then let $t_1 \to \infty$. Let $P(t,t_1)$, defined over $[0,t_1]$, be the solution of the Riccati equation
$$-\dot P = Q + PA + A^TP - PBR^{-1}B^TP$$
If $\bar P = \lim_{t_1\to\infty} P(t,t_1)$ exists, then $u(t) = -R^{-1}B^T\bar Px(t)$ is the optimal control law for the infinite-horizon problem.
Example: $\dot x = x + u$,
$$V(u) = \int_0^{t_1}(x^2+u^2)\,dt + 5x(t_1)^2$$
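A numerical sketch of this example ($A = B = Q = R = 1$ and $M = 5$ follow from the data above; the horizon $t_1 = 5$ is an assumed value for illustration):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Scalar RDE for this example:  -dP/dt = 1 + 2P - P^2,  P(t1) = 5
# (A = B = Q = R = 1, M = 5; t1 = 5.0 is an assumed horizon length).
t1 = 5.0

def rde(s, P):                      # s = t1 - t: integrate backward in time
    return 1.0 + 2.0 * P[0] - P[0] ** 2

sol = solve_ivp(rde, [0.0, t1], [5.0], rtol=1e-10, atol=1e-12)
P0 = sol.y[0, -1]                   # P at t = 0
print(P0)  # close to 1 + sqrt(2): far from t1, the terminal weight is forgotten
```

Away from the final time, $P(t)$ settles near the positive root of $1 + 2p - p^2 = 0$, i.e. $p = 1 + \sqrt 2$, which is the infinite-horizon value.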
Infinite Horizon Regulators
Recall that for the LTI case, if the eigenvalues of $\mathcal H$ are distinct, we have
$$\lim_{t_1\to\infty} P(t,t_1) = U_{21}U_{11}^{-1} =: \bar P$$
Notice that $\bar P$ does not depend on $t$. Now
$$u = -\underbrace{R^{-1}B^T\bar P}_{K}\,x$$
Let us now characterize the eigenvalues of $A - BK$. Remember that $\mathcal HU = U\Lambda$. Therefore
$$\mathcal H\begin{bmatrix}U_{11}\\ U_{21}\end{bmatrix} = \begin{bmatrix}U_{11}\\ U_{21}\end{bmatrix}\Lambda_s$$
which results in
$$AU_{11} - BR^{-1}B^TU_{21} = U_{11}\Lambda_s \;\Longrightarrow\; (A-BK)U_{11} = U_{11}\Lambda_s$$
Infinite Horizon Regulators
Theorem. Given the above:
- $A - BK$ is Hurwitz;
- the stable eigenvalues of $\mathcal H$ coincide with the eigenvalues of $A - BK$;
- the columns of $U_{11}$ are the eigenvectors of $A - BK$.
Infinite Horizon Regulators
Another way to compute $\bar P$: remembering $\mathcal HU = U\Lambda$, we have
$$AU_{11} - BR^{-1}B^TU_{21} = U_{11}\Lambda_s$$
$$-QU_{11} - A^TU_{21} = U_{21}\Lambda_s$$
Multiply the first equation by $\bar P$ on the left and $U_{11}^{-1}$ on the right (using $\bar PU_{11} = U_{21}$) to get
$$\bar PA - \bar PBR^{-1}B^T\bar P = U_{21}\Lambda_sU_{11}^{-1}$$
Multiply the second equation by $U_{11}^{-1}$ on the right to get
$$-Q - A^T\bar P = U_{21}\Lambda_sU_{11}^{-1}$$
Hence, we arrive at the algebraic Riccati equation (ARE)
$$A^T\bar P + \bar PA + Q - \bar PBR^{-1}B^T\bar P = 0$$
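The two identities can be checked numerically; the scalar system $A = B = Q = R = 1$ below is an assumed example (its Hamiltonian matrix is only $2\times 2$, so every block is a scalar).

```python
import numpy as np

# Numerical check of the two identities above on the assumed scalar
# system A = B = Q = R = 1 (so the Hamiltonian matrix is 2 x 2).
A = np.array([[1.0]]); B = np.array([[1.0]])
Q = np.array([[1.0]]); R = np.array([[1.0]])

H = np.block([[A, -B @ np.linalg.solve(R, B.T)],
              [-Q, -A.T]])
w, V = np.linalg.eig(H)
stable = w.real < 0
U11, U21 = V[:1, stable], V[1:, stable]
Ls = np.diag(w[stable].real)         # Lambda_s (here just -sqrt(2))
P = U21 @ np.linalg.inv(U11)         # = 1 + sqrt(2)

rhs = U21 @ Ls @ np.linalg.inv(U11)
lhs1 = P @ A - P @ B @ np.linalg.solve(R, B.T) @ P   # first identity
lhs2 = -Q - A.T @ P                                  # second identity
print(np.allclose(lhs1, rhs), np.allclose(lhs2, rhs))  # True True
```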
Infinite Horizon Regulators
Theorem. Consider
$$\dot x = Ax + Bu,\qquad V = \int_0^\infty\big(x^TQx + u^TRu\big)\,dt$$
If $Q = C^TC$ for some matrix $C$, where $(A,B)$ is stabilizable and $(A,C)$ is detectable, then:
- the ARE has a unique solution $\bar P$ among positive semi-definite matrices;
- the closed-loop system matrix $A - BR^{-1}B^T\bar P$ is Hurwitz;
- the optimal control is $u = -R^{-1}B^T\bar Px$ and the optimal cost is $x_0^T\bar Px_0$;
- the solution of the finite-horizon problem converges to the solution of the infinite-horizon problem as $t_1 \to \infty$.
If, additionally, $(A,C)$ is observable, $\bar P$ is positive definite.
S. Bolouki (UIUC) 28 / 28
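Under the theorem's hypotheses, the stabilizing ARE solution can be computed with SciPy's solver; the sketch below verifies the claims on an assumed double-integrator example (with $C = I$, so $(A,C)$ is even observable).

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Verify the theorem's claims with SciPy's ARE solver on an assumed
# double-integrator example. Q = C^T C with C = I, so (A, C) is observable.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)             # optimal gain, u = -Kx
residual = A.T @ P + P @ A + Q - P @ B @ K  # ARE residual, should be ~0
poles = np.linalg.eigvals(A - B @ K)

print(np.abs(residual).max() < 1e-8)        # True: P solves the ARE
print(poles.real.max() < 0)                 # True: A - BK is Hurwitz
print(np.all(np.linalg.eigvalsh(P) > 0))    # True: P is positive definite
```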