Control of Dynamical Systems

Yuan Gao
Applied Mathematics, University of Washington
yuangao@uw.edu

Spring 2015
Simplified Model for a Car

To start, let's consider the simple example of controlling a car. An extremely simplified model follows directly from Newton's law of motion, F = ma. We take F(t) = c u(t), where u(t) is the control (pressing the pedal/brake). The dynamics of the car are then
\[ \frac{d^2 x}{dt^2} = \frac{c}{m} u. \]

Possible goals:
- Drive the car to a specific speed.
- Drive the car to a desired location.
Controlling the Speed

The differential equation governing the speed is
\[ \frac{dv}{dt} = \frac{c}{m} u. \]
Define \( e(t) = v_d - v(t) \) as the error at time t, where \( v_d \) is the desired speed. The goal is to drive \( e(t) \to 0 \).

- If v(0) = v_d, what should u(0) be?
- If v(0) > v_d, what sign should u(0) take?
- If v(0) < v_d, what sign should u(0) take?
Bang-Bang Control

\[ u(t) := \begin{cases} u_{\max} & \text{if } e(t) > 0, \\ -u_{\max} & \text{if } e(t) < 0, \\ 0 & \text{if } e(t) = 0. \end{cases} \]
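In code, the bang-bang law is just a three-way branch on the sign of the error. A minimal sketch (the function name and `u_max` are illustrative):

```python
def bang_bang(e, u_max):
    """Bang-bang control: full effort in the direction of the error."""
    if e > 0:
        return u_max
    elif e < 0:
        return -u_max
    return 0.0
```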
Bang-Bang Control

[Figure: speed vs. time under bang-bang control.]
Difference Equation

In many situations, control happens at discrete times. Discretizing \( \frac{dv}{dt} = \frac{c}{m} u \) gives
\[ \frac{v(t + \Delta t) - v(t)}{\Delta t} = \frac{c}{m} u(t), \]
or
\[ v(t + \Delta t) = v(t) + \frac{c}{m} u(t)\,\Delta t. \]

Drawbacks of bang-bang control:
- suffers from overshooting
- overreacts to small errors
- hard on actuators
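To make the drawbacks concrete, here is a minimal Euler simulation of the difference equation under bang-bang control; all numerical values (c, m, u_max, v_d, dt) are illustrative, not from the slides:

```python
# Euler iteration v(t + dt) = v(t) + (c/m) u(t) dt under bang-bang control.
c, m, u_max = 1.0, 1000.0, 4800.0   # illustrative parameters
v_d, dt, T = 25.0, 0.1, 50.0        # desired speed, step size, horizon

v, t, trajectory = 0.0, 0.0, []
while t < T:
    e = v_d - v                                         # error e(t) = v_d - v(t)
    u = u_max if e > 0 else (-u_max if e < 0 else 0.0)  # bang-bang law
    v += (c / m) * u * dt                               # discrete dynamics update
    t += dt
    trajectory.append((t, v))
```

Once v is near v_d, the control keeps slamming between ±u_max, so the speed chatters around the target instead of settling: exactly the overshooting and actuator wear listed above.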
PE Control

PE control, or proportional-to-error control, chooses a control proportional to the error:
\[ u(t) = k e(t). \]

[Figure: speed vs. time under PE control.]
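A sketch of the same simulation loop with the proportional law swapped in (the gain k and the other values are illustrative):

```python
# Euler iteration with proportional-to-error control u = k e.
c, m, k = 1.0, 1000.0, 200.0   # illustrative parameters
v_d, dt, T = 25.0, 0.1, 50.0

v, t = 0.0, 0.0
while t < T:
    u = k * (v_d - v)          # u(t) = k e(t): effort shrinks as the error does
    v += (c / m) * u * dt
    t += dt
print(v)                        # smoothly approaches v_d, no chattering
```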
Analysis of PE Control

With PE control,
\[ \frac{dv}{dt} = \frac{c}{m} u(t) = \frac{c}{m} k (v_d - v). \]
Suppose v(0) = 0; the solution is
\[ v(t) = v_d \left( 1 - e^{-\frac{c}{m} k t} \right). \]
Does PE control suffer from overshooting?

In discrete time,
\[ v(t + \Delta t) = v(t) + \frac{c}{m} k (v_d - v(t))\,\Delta t. \]
The required condition for stability is
\[ \frac{c}{m} k\,\Delta t < 2. \]
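Both the closed-form solution and the stability bound can be checked numerically; a small sketch with illustrative values:

```python
import math

# Compare the Euler iteration with the exact solution
# v(t) = v_d (1 - exp(-(c/m) k t)); the step size satisfies (c/m) k dt < 2.
c, m, k, v_d, dt = 1.0, 1000.0, 200.0, 25.0, 0.1
assert (c / m) * k * dt < 2.0          # stability condition

v, steps = 0.0, 100
for _ in range(steps):
    v += (c / m) * k * (v_d - v) * dt
exact = v_d * (1.0 - math.exp(-(c / m) * k * steps * dt))
print(f"Euler: {v:.3f}  exact: {exact:.3f}")   # agree closely; no overshoot
```

Violating the bound (say dt = 12 here, so (c/m) k dt = 2.4) makes the iteration oscillate with growing amplitude: the discrete-time analogue of instability.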
Adding Air Resistance

A more realistic model:
\[ \frac{dv(t)}{dt} = \frac{c}{m} u(t) - \gamma v(t). \]

[Figure: speed vs. time under PE control with air resistance.]
Failure of PE Control

Plugging in u(t) = k e(t),
\[ \frac{dv(t)}{dt} = \frac{c}{m} k (v_d - v(t)) - \gamma v(t). \]
The equilibrium velocity is \( \frac{ck}{ck + m\gamma}\,v_d \), which falls short of \( v_d \).

Thinking along these lines suggests another way of controlling the system: set its equilibrium at the point we want!
\[ \frac{dv(t)}{dt} = \frac{c}{m} u(t) - \gamma v(t) = 0 \implies u(t) = \frac{m}{c} \gamma v_d. \]

Drawbacks:
- it requires much more information: the mass m, the coefficient c, and the friction constant γ
- it is not robust: these parameters change across different situations
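A sketch of this equilibrium-placement (feedforward) control; the values of m, c, γ, and v_d are illustrative:

```python
# Choose the constant control that makes v_d the equilibrium of
# dv/dt = (c/m) u - gamma v, namely u = (m/c) gamma v_d.
c, m, gamma, v_d = 1.0, 1000.0, 0.05, 25.0
u_ff = (m / c) * gamma * v_d          # feedforward control, no feedback

v, dt = 0.0, 0.1
for _ in range(4000):                  # 400 s, about 20 time constants 1/gamma
    v += ((c / m) * u_ff - gamma * v) * dt
print(v)   # reaches v_d, but only because m, c, gamma were known exactly
```

If the true γ differs from the one used to compute u_ff, the car settles at the wrong speed, which is precisely the robustness drawback noted above.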
PID Control

P: Proportional, I: Integral, D: Derivative.
\[ u(t) = k_p e(t) + k_i \int_0^t e(\tau)\,d\tau + k_d \frac{d}{dt} e(t). \]

[Figure: speed vs. time under PID control.]
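A minimal discrete-time sketch of the PID law, with the integral as a running Riemann sum and the derivative as a backward difference (the class name and all gains are illustrative):

```python
class PID:
    """Discrete PID control: u = kp*e + ki * sum(e dt) + kd * (de/dt)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0      # running approximation of the integral term
        self.prev_error = 0.0    # previous error, for the derivative term

    def control(self, e):
        self.integral += e * self.dt
        derivative = (e - self.prev_error) / self.dt
        self.prev_error = e
        return self.kp * e + self.ki * self.integral + self.kd * derivative
```

The integral term is what fixes the steady-state error seen with PE control under air resistance: it keeps accumulating until the error is actually driven to zero.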
Controlling Position

Goal: \( x \to x_d \), \( v \to 0 \). The system of equations is
\[ \frac{dx}{dt} = v, \qquad \frac{dv}{dt} = \frac{c}{m} u - \gamma v. \]
In matrix form:
\[ \frac{d}{dt} \begin{bmatrix} x \\ v \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & -\gamma \end{bmatrix} \begin{bmatrix} x \\ v \end{bmatrix} + \begin{bmatrix} 0 \\ \frac{c}{m} \end{bmatrix} u. \]
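A sketch that simulates the matrix form with feedback on both position and velocity (the gains k, k_v and all parameter values are illustrative):

```python
import numpy as np

# d/dt [x, v]^T = A [x, v]^T + B u, with A = [[0, 1], [0, -gamma]]
# and B = [0, c/m]^T.
c, m, gamma = 1.0, 1.0, 0.05   # illustrative parameters
A = np.array([[0.0, 1.0], [0.0, -gamma]])
B = np.array([0.0, c / m])

x_d, k, k_v = 100.0, 1.0, 2.0
state, dt = np.array([0.0, 0.0]), 0.01
for _ in range(2000):                             # simulate 20 s
    u = -k * (state[0] - x_d) - k_v * state[1]    # feed back position and velocity
    state = state + (A @ state + B * u) * dt
print(state)                                      # approaches [x_d, 0]
```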
Comparison of Two Controllers

[Figure: position vs. time for the two feedback laws labeled k(x - x_d) and k(x - x_d + v - v_d) in the legend.]
Optimal Control Theory

- x: state
- u(x): the control applied at state x
- l(x, u): cost of applying control u at state x

The goal is to minimize the total cost
\[ J(x, u) = \sum_{k=0}^{n} l(x_k, u_k), \]
where \( x_{k+1} = \mathrm{next}(x_k, u_k) \).
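In code, the total cost of a fixed policy is a forward rollout; a sketch where `next_state`, `step_cost`, and `policy` are placeholders for the problem's next(·, ·), l(·, ·), and controller:

```python
def total_cost(x0, policy, next_state, step_cost, n):
    """Accumulate J(x, u) = sum_{k=0}^{n} l(x_k, u_k) along a trajectory."""
    J, x = 0.0, x0
    for _ in range(n + 1):        # stages k = 0, ..., n
        u = policy(x)
        J += step_cost(x, u)
        x = next_state(x, u)      # x_{k+1} = next(x_k, u_k)
    return J
```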
Bellman's Principle of Optimality

"An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision."

Introduce the optimal value function (optimal cost-to-go):
\[ v(x) = \text{minimal total cost for completing the task starting from state } x. \]

Consider all possible choices of u at the current state x, add l(x, u) to v(next(x, u)), and choose the u for which the sum l(x, u) + v(next(x, u)) is minimal.
Bellman Equation and Dynamic Programming

Finite-horizon formulation: \( v(x^*) = 0 \) at the goal state \( x^* \). The Bellman equation:
\[ v(x) = \min_u \{\, l(x, u) + v(\mathrm{next}(x, u)) \,\}. \]

Infinite-horizon discounted-cost formulation: minimize
\[ J(x, u) = \lim_{n \to \infty} \sum_{k=0}^{n} \alpha^k l(x_k, u_k). \]
The Bellman equation:
\[ v(x) = \min_u \{\, l(x, u) + \alpha\, v(\mathrm{next}(x, u)) \,\}. \]
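The discounted Bellman equation can be solved by value iteration: apply its right-hand side repeatedly until v stops changing. A sketch on a toy chain of states; the dynamics, the cost, and every value here are illustrative:

```python
# Value iteration for v(x) = min_u { l(x, u) + alpha * v(next(x, u)) }
# on states 0..N-1 with moves u in {-1, 0, +1}.
N, alpha, goal = 11, 0.9, 5     # illustrative toy problem
actions = (-1, 0, 1)

def nxt(x, u):
    return min(max(x + u, 0), N - 1)      # move along the chain, clipped

def l(x, u):
    return abs(x - goal) + 0.1 * abs(u)   # cheapest to sit at the goal

v = [0.0] * N
for _ in range(200):   # the update is a contraction, so this converges
    v = [min(l(x, u) + alpha * v[nxt(x, u)] for u in actions) for x in range(N)]

policy = [min(actions, key=lambda u, x=x: l(x, u) + alpha * v[nxt(x, u)])
          for x in range(N)]
print(policy)   # steps toward the goal state from both sides
```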
Continuous Optimal Control

Assume the dynamics
\[ \frac{dx}{dt} = f(x, u). \]
The goal is to minimize
\[ J(x, u) = \int_0^T l(x(t), u(t), t)\,dt. \]
Choosing a time step \( \Delta t \), we have
\[ J(x, u) \approx \sum_{k=0}^{n} l(x_k, u_k, k \Delta t)\,\Delta t, \]
and \( \mathrm{next}(x_k, u_k) = x_k + f(x_k, u_k)\,\Delta t \).
Continuous Optimal Control

Applying the Bellman equation,
\[ v(x, k \Delta t) = \min_u \{\, l(x, u, k \Delta t)\,\Delta t + v(x + f(x, u) \Delta t,\ (k+1) \Delta t) \,\}. \]
A Taylor expansion of v in the first variable x gives
\[ v(x + f(x, u) \Delta t,\ (k+1) \Delta t) \approx v(x, (k+1) \Delta t) + v_x(x, (k+1) \Delta t)\,f(x, u)\,\Delta t. \]
This gives
\[ \frac{v(x, k \Delta t) - v(x, (k+1) \Delta t)}{\Delta t} = \min_u \{\, l(x, u, k \Delta t) + f(x, u)\,v_x(x, (k+1) \Delta t) \,\}. \]
Taking the limit as \( \Delta t \to 0 \),
\[ -v_t(x, t) = \min_u \{\, l(x, u, t) + f(x, u)\,v_x(x, t) \,\}. \]
This is the Hamilton-Jacobi-Bellman (HJB) equation!
Infinite-Horizon Formulation

The total cost becomes
\[ J(x, u) = \int_0^{\infty} e^{-\alpha t}\, l(x(t), u(t))\,dt. \]
Choosing a time step \( \Delta t \), we have
\[ J(x, u) \approx \sum_{k=0}^{\infty} e^{-\alpha k \Delta t}\, l(x_k, u_k)\,\Delta t. \]
Applying the infinite-horizon discounted-cost Bellman equation,
\[ v(x) = \min_u \{\, l(x, u)\,\Delta t + e^{-\alpha \Delta t}\, v(x + f(x, u) \Delta t) \,\}. \]
Using similar techniques (Taylor expansion, then \( \Delta t \to 0 \)) we get the infinite-horizon HJB equation:
\[ \alpha v(x) = \min_u \{\, l(x, u) + f(x, u)\,v'(x) \,\}. \]
A Simple Example

Back to the simplest example of controlling the speed of a car,
\[ \frac{dx}{dt} = \frac{c}{m} u, \]
where x now denotes the velocity, to stay consistent with the state notation above. We define the cost function as
\[ l(x, u) = \tfrac12 (x - x_d)^2 + \tfrac12 u^2. \]
Assume the value function has the form \( v(x) = \tfrac12 V (x - x_d)^2 \). Plugging into the infinite-horizon HJB equation, the minimizing control is \( u = -\frac{c}{m} V (x - x_d) \); matching the coefficients of \( (x - x_d)^2 \) then gives \( \frac{c^2}{m^2} V^2 + \alpha V - 1 = 0 \), whose positive root is
\[ V = \frac{m^2}{2c^2} \left( -\alpha + \sqrt{\alpha^2 + \frac{4c^2}{m^2}} \right). \]
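A sketch that solves this quadratic for V and runs the resulting controller; the parameter values are illustrative:

```python
import math

# Positive root of (c^2/m^2) V^2 + alpha V - 1 = 0, then apply
# the optimal control u = -(c/m) V (x - x_d).
c, m, alpha, x_d = 1.0, 1.0, 0.1, 25.0   # illustrative values
beta = (c / m) ** 2
V = (-alpha + math.sqrt(alpha**2 + 4.0 * beta)) / (2.0 * beta)

x, dt = 0.0, 0.01
for _ in range(2000):                     # simulate 20 s
    u = -(c / m) * V * (x - x_d)
    x += (c / m) * u * dt                 # dx/dt = (c/m) u
print(x, V)   # x approaches x_d; for small alpha, V is close to m/c
```

A smaller discount rate α weights future error more heavily, which yields a larger V and hence a more aggressive control, consistent with the comparison on the next slide.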
Varying α

[Figure: control and velocity trajectories over 10 s, for α = 0.1 (left panel) and α = 1 (right panel).]