

Model Predictive Control Short Course: Regulation

James B. Rawlings, Michael J. Risbeck, Nishith R. Patel
Department of Chemical and Biological Engineering
Milwaukee, WI, August 28-29, 2017
Copyright © 2017 by James B. Rawlings

Outline
1 Linear Quadratic Problem
2 Constraints
3 Dynamic Programming Solution
4 The Infinite Horizon LQ Problem
5 Controllability
6 Convergence of the Linear Quadratic Regulator
7 Constrained Regulation
8 Quadratic Programming

Linear quadratic regulation

We start by designing a controller to take the state of a deterministic, linear system to the origin. If the setpoint is not the origin, or we wish to track a time-varying setpoint trajectory, we will subsequently modify the zero-setpoint problem to account for that.

The system model is

    x(k+1) = Ax(k) + Bu(k)
    y(k) = Cx(k)                                                  (1)

in which x is an n-vector, u is an m-vector, and y is a p-vector. We can compress the notation by denoting the model as

    x+ = Ax + Bu
    y = Cx                                                        (2)

A point on dot and + notation

Just as in the continuous time model we use the dot notation ẋ to represent the time derivative dx/dt,

    dx/dt = Ax + Bu        ẋ = Ax + Bu
    y = Cx + Du            y = Cx + Du

in discrete time we use the + notation to represent the state at the next sample time:

    x(k+1) = Ax(k) + Bu(k)        x+ = Ax + Bu
    y(k) = Cx(k) + Du(k)          y = Cx + Du

State feedback

In this first problem, we assume that the state is measured, or C = I. We will handle the output measurement problem with state estimation in the next section.

Sequence of inputs (decision variables)

Using the model we can predict how the state evolves given any set of inputs we are considering. Consider N time steps into the future and collect the input sequence into u,

    u = (u(0), u(1), ..., u(N-1))

Constraints on the u sequence (i.e., valve saturations, etc.) are the main feature that distinguishes MPC from standard linear quadratic (LQ) control.

Constraints

The manipulated inputs (valve positions, voltages, torques, etc.) to most physical systems are bounded. We include these constraints as linear inequalities

    Eu(k) ≤ e,    k = 0, 1, ...

in which

    E = [  I ]        e = [  u_max ]
        [ -I ]            [ -u_min ]

are chosen to describe simple bounds such as

    u_min ≤ u(k) ≤ u_max,    k = 0, 1, ...

We sometimes wish to impose constraints on states or outputs for reasons of safety, operability, product quality, etc. These can be stated as

    Fx(k) ≤ f,    k = 0, 1, ...

Rate of change constraints

Practitioners find it convenient in some applications to limit the rate of change of the input, u(k) - u(k-1). To maintain the state space form of the model, we may augment the state as

    x̃(k) = [  x(k)  ]
           [ u(k-1) ]

and the augmented system model becomes

    x̃+ = Ãx̃ + B̃u
    y = C̃x̃

in which

    Ã = [ A  0 ]      B̃ = [ B ]      C̃ = [ C  0 ]
        [ 0  0 ]          [ I ]
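As a concrete illustration of these constraint matrices and the augmented model, here is a minimal Python/NumPy sketch; the system matrices and bounds are placeholder values chosen for illustration, not numbers from the course.

import numpy as np

# Placeholder system and bounds (assumed values, for illustration only)
n, m, p = 2, 1, 1
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
u_min, u_max = -1.0, 1.0

# Input-bound constraint E u(k) <= e encoding u_min <= u(k) <= u_max
E = np.vstack([np.eye(m), -np.eye(m)])
e = np.concatenate([u_max * np.ones(m), -u_min * np.ones(m)])

# Augmented model for rate-of-change constraints: x_tilde(k) = [x(k); u(k-1)]
A_aug = np.block([[A, np.zeros((n, m))],
                  [np.zeros((m, n)), np.zeros((m, m))]])
B_aug = np.vstack([B, np.eye(m)])
C_aug = np.hstack([C, np.zeros((p, m))])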

Rate of change constraints

A rate of change constraint such as

    Δu_min ≤ u(k) - u(k-1) ≤ Δu_max,    k = 0, 1, ...

is then stated as

    F̃x̃(k) + Eu(k) ≤ e

with

    F̃ = [ 0  -I ]      E = [  I ]      e = [  Δu_max ]
        [ 0   I ]          [ -I ]          [ -Δu_min ]

To simplify analysis, it pays to maintain linear constraints when using linear dynamic models. So if we want to consider fairly general constraints for a linear system, we choose the form

    Fx(k) + Eu(k) ≤ e,    k = 0, 1, ...

which subsumes all the forms listed previously.

Controller objective function

We first define an objective function V(·) to measure the deviation of the trajectory of x(k), u(k) from zero by summing the weighted squares

    V(x(0), u) = (1/2) Σ_{k=0}^{N-1} [ x(k)'Qx(k) + u(k)'Ru(k) ] + (1/2) x(N)'P_f x(N)

subject to

    x+ = Ax + Bu

The objective function depends on the input sequence and the state sequence. The initial state is available from the measurement. The remainder of the state trajectory, x(k), k = 1, ..., N, is determined by the model and the input sequence u. So we show the objective function's explicit dependence on the input sequence and the initial state.

Regulator tuning parameters

The tuning parameters in the controller are the matrices Q and R. Note that we are using the deviation of u from zero as the input penalty; we'll use a rate of change penalty S later. We allow the final state penalty to have a different weighting matrix, P_f, for generality.

Large values of Q in comparison to R reflect the designer's intent to drive the state to the origin quickly at the expense of large control action. Penalizing the control action through large values of R relative to Q is the way to reduce the control action and slow down the rate at which the state approaches the origin. Choosing appropriate values of Q and R (i.e., tuning) is not always obvious, and this difficulty is one of the challenges faced by industrial practitioners of LQ control. Notice that MPC inherits this tuning challenge.

Regulation problem definition

We then formulate the following optimal MPC control problem

    min_u V(x(0), u)                                              (3)

subject to

    x+ = Ax + Bu
    Fx(k) + Eu(k) ≤ e,    k = 0, 1, ..., N-1

The Q, P_f, and R matrices often are chosen to be diagonal, but we do not assume that here. We assume, however, that Q, P_f, and R are real and symmetric; Q and P_f are positive semidefinite; and R is positive definite. These assumptions guarantee that the solution to the optimal control problem exists and is unique.
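To make the dependence of V on (x(0), u) concrete, here is a small hedged sketch that rolls the model forward and evaluates the finite horizon objective; the matrices and input sequence are illustrative, not course data.

import numpy as np

def lq_objective(A, B, Q, R, Pf, x0, u_seq):
    # V(x(0), u) = 1/2 sum_{k=0}^{N-1} [x(k)'Qx(k) + u(k)'Ru(k)] + 1/2 x(N)'Pf x(N)
    x = np.asarray(x0, dtype=float)
    cost = 0.0
    for u in u_seq:                      # u_seq = (u(0), ..., u(N-1))
        cost += 0.5 * (x @ Q @ x + u @ R @ u)
        x = A @ x + B @ u                # x+ = Ax + Bu
    return cost + 0.5 * x @ Pf @ x       # terminal penalty on x(N)

A = np.array([[1.0, 0.1], [0.0, 1.0]]); B = np.array([[0.0], [0.1]])
Q = np.eye(2); R = np.eye(1); Pf = np.eye(2)
u_seq = [np.array([0.5]), np.array([-0.2]), np.array([0.0])]     # N = 3
print(lq_objective(A, B, Q, R, Pf, [1.0, 0.0], u_seq))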

Let's first examine a scalar system (n = 1) with horizon N = 1

We have only one move to optimize, u_0. The objective is

    x_1 = a x_0 + b u_0
    V = (1/2) ( q x_0^2 + r u_0^2 + p_f x_1^2 )

Expand the x_1 term to see its dependence on the control u_0

    V = (1/2) ( q x_0^2 + r u_0^2 + p_f (a x_0 + b u_0)^2 )
      = (1/2) ( q x_0^2 + r u_0^2 + p_f (a^2 x_0^2 + 2ab x_0 u_0 + b^2 u_0^2) )
    V = (1/2) ( (q + a^2 p_f) x_0^2 + 2(b a p_f x_0) u_0 + (b^2 p_f + r) u_0^2 )

Take the derivative dV/du_0, set it to zero, and solve for the optimal control

    dV/du_0 = b p_f a x_0 + (b^2 p_f + r) u_0 = 0

    u_0 = - ( b p_f a / (b^2 p_f + r) ) x_0

    u_0 = k x_0,    k = - b p_f a / (b^2 p_f + r)

Back to vectors and matrices

Quadratic functions in x and u:

    V = (1/2) ( x'(Q + A'P_f A)x + 2u'B'P_f Ax + u'(B'P_f B + R)u )

Take the derivative, set it to zero, and solve:

    dV/du = B'P_f Ax + (B'P_f B + R)u = 0
    u = -(B'P_f B + R)^(-1) B'P_f Ax

So we have the optimal linear control law

    u = Kx,    K = -(B'P_f B + R)^(-1) B'P_f A

Dynamic programming solution

To handle N > 1, we can go to the end of the horizon and work our way backwards to the beginning of the horizon. This trick is known as (backward) dynamic programming, developed by Bellman (1957). Each problem that we solve going backwards is an N = 1 problem, which we already know how to solve! When we arrive at k = 0, we have the optimal control as a function of the initial state. This is our control law. We next examine the outcome of applying dynamic programming to the LQ problem.
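A quick numerical sanity check of the one-step scalar result, with made-up values of a, b, q, r, and p_f; the closed-form gain is compared against a brute-force search over u_0.

import numpy as np

a, b, q, r, pf = 1.2, 0.5, 1.0, 0.1, 1.0    # illustrative scalars (assumed values)
x0 = 3.0

k_gain = -b * pf * a / (b**2 * pf + r)      # closed-form optimal gain
u_opt = k_gain * x0

# Brute-force check: minimize V(u0) = 1/2 [q x0^2 + r u0^2 + pf (a x0 + b u0)^2]
u_grid = np.linspace(-20.0, 20.0, 200001)
V = 0.5 * (q * x0**2 + r * u_grid**2 + pf * (a * x0 + b * u_grid)**2)
print(u_opt, u_grid[np.argmin(V)])          # the two values should nearly agree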

State feedback solution for the last stage

If we have x(N-1) available, which we denote by x:

Optimal control and feedback gain

    u^0_{N-1}(x) = K(N-1)x
    K(N-1) = -(B'P_f B + R)^(-1) B'P_f A

Optimal cost and Riccati equation

    V^0_{N-1}(x) = (1/2) x'Π(N-1)x
    Π(N-1) = Q + A'P_f A - A'P_f B (B'P_f B + R)^(-1) B'P_f A

The optimal cost function

The function V^0_{N-1}(x) defines the optimal cost to go from state x for the last stage under the optimal control law u^0_{N-1}(x). Having this function allows us to move to the next stage of the DP recursion. For the next stage we solve the optimization

    min_{u(N-2), x(N-1)}  ℓ(x(N-2), u(N-2)) + V^0_{N-1}(x(N-1))

subject to

    x(N-1) = Ax(N-2) + Bu(N-2)

Notice that this problem is identical in structure to the stage we just solved, and we can write out the solution by simply renaming variables.

Next step in the recursion

Now assume we have x(N-2) available, which we denote by x:

Optimal control and feedback gain

    u^0_{N-2}(x) = K(N-2)x
    K(N-2) = -(B'Π(N-1)B + R)^(-1) B'Π(N-1)A

Optimal cost and Riccati equation

    V^0_{N-2}(x) = (1/2) x'Π(N-2)x
    Π(N-2) = Q + A'Π(N-1)A - A'Π(N-1)B (B'Π(N-1)B + R)^(-1) B'Π(N-1)A

The regulator Riccati equation

The recursion from Π(N-1) to Π(N-2) is known as a backward Riccati iteration. To summarize, the backward Riccati iteration is defined as

    Π(k-1) = Q + A'Π(k)A - A'Π(k)B ( B'Π(k)B + R )^(-1) B'Π(k)A,
        k = N, N-1, ..., 1                                        (4)

with terminal condition

    Π(N) = P_f                                                    (5)

The terminal condition replaces the typical initial condition because the iteration is running backward.
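A minimal Python/NumPy sketch of the backward Riccati iteration (4)-(5), returning the cost matrices Π(k) and gains K(k); the function and variable names are mine, not the course's.

import numpy as np

def backward_riccati(A, B, Q, R, Pf, N):
    # Iterate (4)-(5) backward; return Pi[0..N] and K[0..N-1], with K(k) built from Pi(k+1)
    Pi = [None] * (N + 1)
    K = [None] * N
    Pi[N] = Pf                                            # terminal condition (5)
    for k in range(N, 0, -1):
        S = B.T @ Pi[k] @ B + R
        K[k - 1] = -np.linalg.solve(S, B.T @ Pi[k] @ A)   # gain K(k-1)
        Pi[k - 1] = Q + A.T @ Pi[k] @ A + A.T @ Pi[k] @ B @ K[k - 1]   # eq. (4)
    return Pi, K

# Example usage with placeholder matrices: the first move is u(0) = K[0] x(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]]); B = np.array([[0.0], [0.1]])
Q = np.eye(2); R = np.eye(1); Pf = np.eye(2)
Pi, K = backward_riccati(A, B, Q, R, Pf, N=5)
print(K[0] @ np.array([1.0, 0.0]))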

The optimal control policy

The optimal control policy at each stage is

    u^0_k(x) = K(k)x,    k = N-1, N-2, ..., 0                     (6)

The optimal gain at time k is computed from the Riccati matrix at time k+1

    K(k) = -( B'Π(k+1)B + R )^(-1) B'Π(k+1)A,    k = N-1, N-2, ..., 0    (7)

The optimal cost to go from time k to time N is

    V^0_k(x) = (1/2) x'Π(k)x,    k = N, N-1, ..., 0               (8)

Whew. OK, the controller is optimal, but is it useful?

We now have set up and solved the (unconstrained) optimal control problem. Next we are going to study some of the closed-loop properties of the controlled system under this controller. One of these properties is closed-loop stability. Before diving into the controller stability discussion, let's take a breather and see what kinds of things we should guard against when using the optimal controller.

We start with a simple scalar system with inverse response or, equivalently, a right-half-plane zero in the system transfer function

    y(s) = g(s)u(s),    g(s) = k (s - a)/(s + b),    a, b > 0

Step response of system with right-half-plane zero

Convert the transfer function to state space (notice a nonzero D)

    dx/dt = Ax + Bu      A = -b      B = -(a + b)
    y = Cx + Du          C = k       D = k

Solve the ODEs with u(t) = 1 and x(0) = 0.

Figure 1: Step response of a system with a right-half-plane zero (y versus time).

What input delivers a unit step change in the output?

Say I make a unit step change in the setpoint of y. What u(t) (if any) can deliver a (perfect) unit step in y(t)? In the Laplace domain, y(s) = 1/s. Solve for u(s). Since y = g(s)u we have that

    u(s) = y(s)/g(s) = (s + b) / ( ks(s - a) )

We can invert this signal back to the time domain (using partial fractions, for example) to obtain

    u(t) = (1/(ka)) [ -b + (a + b) e^{at} ]

Note that the second term is a growing exponential for a > 0. Hmmm... How well will that work?
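A hedged SciPy sketch of this step response; the parameters k, a, and b below are values chosen for illustration, not the ones behind Figure 1, and the realization is the one given above.

import numpy as np
from scipy import signal

k_gain, a, b = 1.0, 0.5, 2.0                     # assumed parameters, a, b > 0
sys = signal.StateSpace([[-b]], [[-(a + b)]], [[k_gain]], [[k_gain]])

t = np.linspace(0.0, 20.0, 1001)
t, y = signal.step(sys, T=t)
# y(0+) = D = k > 0 while y -> g(0) = -k*a/b < 0: an inverse response
print(y[0], y[-1])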

Simulating the system with chosen input

Figure 2: Output response with an exponential input for a system with RHP zero (u and y versus time).

Sure enough, that input moves the output right to its setpoint!

Will this work in practice?

So we can indeed achieve perfect tracking in y(t) with this growing input u(t). Because the system g(s) = k(s - a)/(s + b) has a zero at s = a, and u(s) has a 1/(s - a) term, they cancel and we see nice behavior rather than exponential growth in y(t). This cancellation is the so-called input blocking property of the transfer function zero.

But what happens if u(t) hits a constraint, u(t) ≤ u_max? Let's simulate the system again, but with an input constraint at u_max = 20. We achieve the following result.

Output behavior when the exponential input saturates

The input saturation destroys the nice output behavior.

Figure 3: Output response with input saturation for a system with RHP zero (u and y versus time).

So how can we reliably use optimal control?

Control theory to the rescue. The optimal controller needs to know to avoid exploiting this exponential input. But how do we tell it? We cannot let it start down the exponential path and find out only later that this was a bad idea. We have to eliminate this option by the structure of the optimal control problem.

More importantly, even if we fix this issue, how do we know that we have eliminated every other path to instability for every other kind of (large, multivariable) system? Addressing this kind of issue is where control theory is a lot more useful than brute-force simulation.
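Continuing the hedged example above, this sketch applies the zero-cancelling input with and without clipping; the parameters and the saturation level are again illustrative.

import numpy as np
from scipy import signal

k_gain, a, b = 1.0, 0.5, 2.0                     # same assumed parameters as before
sys = signal.StateSpace([[-b]], [[-(a + b)]], [[k_gain]], [[k_gain]])

t = np.linspace(0.0, 10.0, 4001)
u_exact = (1.0 / (k_gain * a)) * (-b + (a + b) * np.exp(a * t))   # zero-cancelling input
u_clip = np.clip(u_exact, None, 20.0)                             # saturate at u_max = 20

_, y_exact, _ = signal.lsim(sys, U=u_exact, T=t)
_, y_clip, _ = signal.lsim(sys, U=u_clip, T=t)
print(y_exact[-1], y_clip[-1])    # exact input holds y near 1; clipped input drifts away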

The infinite horizon LQ problem

Let us motivate the infinite horizon problem by showing a weakness of the finite horizon problem. Kalman (1960, p. 113) pointed out in his classic 1960 paper that optimality does not ensure stability: "In the engineering literature it is often assumed (tacitly and incorrectly) that a system with an optimal control law is necessarily stable."

Assume that we use as our control law the first feedback gain of the N-stage finite horizon problem, K_N(0),

    u(k) = K_N(0) x(k)

Then the stability of the closed-loop system is determined by the eigenvalues of A + BK_N(0). We now construct an example that shows choosing Q > 0, R > 0, and N ≥ 1 does not ensure stability. In fact, we can find reasonable values of these parameters such that the controller destabilizes a stable system.

An optimal controller that is closed-loop unstable

Let

    A = [ 4/3  -2/3 ]      B = [ 1 ]      C = [ -2/3  1 ]
        [  1     0  ]          [ 0 ]

This discrete-time system has a transfer function zero at z = 3/2, i.e., an unstable zero (or a right-half-plane zero in continuous time).

We would like to choose Q = C'C so that y itself is penalized, but that Q is only semidefinite. We add a small positive definite piece to C'C so that Q is positive definite, choose a small positive R penalty (to encourage the controller to misbehave), and take N = 5,

    Q = C'C + 0.001 I = [ 4/9 + 0.001    -2/3  ]      R = 0.001
                        [    -2/3        1.001 ]

Compute the control law

We now iterate the Riccati equation four times starting from Π = P_f = Q and compute K_5(0) for N = 5. Then we compute the eigenvalues of A + BK_5(0) and obtain (please check this answer with Octave or MATLAB)

    eig(A + BK_5(0)) = {1.307, 0.001}

Using this controller the closed-loop system evolution is x(k) = (A + BK_5(0))^k x(0). Since an eigenvalue of A + BK_5(0) is greater than unity, x(k) → ∞ as k → ∞. In other words, the closed-loop system is unstable.

Closed-loop eigenvalues for different N; Riccati iteration

Figure 4: Closed-loop eigenvalues of A + BK_N(0) for different horizon lengths N (o); open-loop eigenvalues of A (x); real versus imaginary parts.
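The slide invites checking the eigenvalues with Octave or MATLAB; here is an equivalent check in Python, with the Riccati iteration written out inline so the snippet is self-contained (read the printed numbers as a verification, not as a quotation of the slides).

import numpy as np

A = np.array([[4/3, -2/3], [1.0, 0.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[-2/3, 1.0]])
Q = C.T @ C + 0.001 * np.eye(2)
R = np.array([[0.001]])

def first_gain(N):
    Pi = Q.copy()                                  # Pi(N) = Pf = Q
    for _ in range(N):
        K = -np.linalg.solve(B.T @ Pi @ B + R, B.T @ Pi @ A)
        Pi = Q + A.T @ Pi @ A + A.T @ Pi @ B @ K
    return K                                       # last gain computed is K_N(0)

for N in (5, 7, 20):
    eigs = np.linalg.eigvals(A + B @ first_gain(N))
    print(N, np.sort(np.abs(eigs)))                # N = 5 should show a magnitude > 1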

Why is the closed-loop unstable?

A finite horizon objective function may not give a stable controller! How is this possible?

(Figure: sketch in the (x_1, x_2) plane comparing the open-loop forecast with the closed-loop trajectory at times k, k+1, k+2.)

What happens if we make the horizon N larger?

If we continue to iterate the Riccati equation, which corresponds to increasing the horizon in the controller, we obtain for N = 7

    eig(A + BK_7(0)) = {0.989, 0.001}

and the controller is stabilizing. If we continue iterating the Riccati equation, we converge to the following steady-state closed-loop eigenvalues

    eig(A + BK_∞(0)) = {0.664, 0.001}

This controller corresponds to an infinite horizon control law. Notice that it is stabilizing and has a reasonable stability margin. Nominal stability is a guaranteed property of infinite horizon controllers, as we prove in the next section.

Figure 5: The forecast versus the closed-loop behavior for N = 5 (y(k) versus k).

Figure 6: The forecast versus the closed-loop behavior for N = 7 (y(k) versus k).

Figure 7: The forecast versus the closed-loop behavior for N = 20 (y(k) versus k).

The infinite horizon regulator

With this motivation, we are led to consider directly the infinite horizon case

    V(x(0), u) = (1/2) Σ_{k=0}^{∞} [ x(k)'Qx(k) + u(k)'Ru(k) ]    (9)

in which x(k) is the solution at time k of x+ = Ax + Bu if the initial state is x(0) and the input sequence is u.

If we are interested in a continuous process (i.e., no final time), then the natural cost function is an infinite horizon cost. If we were truly interested in a batch process (i.e., the process does stop at k = N), then stability is not a relevant property, and we naturally would use the finite horizon LQ controller and the time-varying control law, u(k) = K(k)x(k), k = 0, 1, ..., N.

Checking with N = n moves is sufficient Controllability test For an unconstrained linear system, if we cannot reach z in n moves, we cannot reach z in any number of moves. The question of controllability of a linear time-invariant system is therefore a question of existence of solutions to linear equations for an arbitrary right-hand side u(n 1) [ B AB A n 1 B ] u(n 2). = z An x u() The matrix appearing in this equation is known as the controllability matrix C C = [ B AB A n 1 B ] (1) JCI 217 MPC short course 41 / 61 From the fundamental theorem of linear algebra, we know a solution exists for all right-hand sides if and only if the rows of the n nm controllability matrix are linearly independent. 2 Therefore, the system (A, B) is controllable if and only if rank(c) = n 2 See Section A.4 in Appendix A of (Rawlings and Mayne, 29) or (Strang, 198, pp.87 88) for a review of this result. JCI 217 MPC short course 42 / 61 Convergence of the linear quadratic regulator Convergence of the linear quadratic regulator We now show that the infinite horizon regulator asymptotically stabilizes the origin for the closed-loop system. Define the infinite horizon objective function subject to V (x, u) = 1 2 x(k) Qx(k) + u(k) Ru(k) k= x + = Ax + Bu x() = x If (A, B) is controllable, the solution to the optimization problem exists and is unique for all x. min u V (x, u) We denote the optimal solution by u (x), and the first input in the optimal sequence by u (x). The feedback control law κ ( ) for this infinite horizon case is then defined as u = κ (x) in which κ (x) = u (x) = u (; x). with Q, R >. JCI 217 MPC short course 43 / 61 JCI 217 MPC short course 44 / 61

LQR stability Controllability gives finite cost As stated in the following lemma, this infinite horizon linear quadratic regulator (LQR) is stabilizing. Lemma 1 (LQR convergence) For (A, B) controllable, the infinite horizon LQR with Q, R > gives a convergent closed-loop system x + = Ax + Bκ (x) The cost of the infinite horizon objective is bounded above for all x() because (A, B) is controllable. Controllability implies that there exists a sequence of n inputs (u(), u(1),..., u(n 1)) that transfers the state from any x() to x(n) =. A zero control sequence after k = n for (u(n + 1), u(n + 2),...) generates zero cost for all terms in V after k = n, and the objective function for this infinite control sequence is therefore finite. The cost function is strictly convex in u because R > so the solution to the optimization is unique. JCI 217 MPC short course 45 / 61 JCI 217 MPC short course 46 / 61 Proof of closed-loop convergence If we consider the sequence of costs to go along the closed-loop trajectory, we have for V k = V (x(k)) V k+1 = V k (1/2) ( x(k) Qx(k) + u(k) Ru(k) ) in which V k = V (x(k)) is the cost at time k for state value x(k) and u(k) = u (x(k)) is the optimal control for state x(k). The cost along the closed-loop trajectory is nonincreasing and bounded below (by zero). Therefore, the sequence (V k ) converges and x(k) Qx(k) u(k) Ru(k) as k Since Q, R >, we have x(k) u(k) as k and closed-loop convergence is established. JCI 217 MPC short course 47 / 61 Connection to Riccati equation In fact we know more. From the previous sections, we know the optimal solution is found by iterating the Riccati equation, and the optimal infinite horizon control law and optimal cost are given by in which u (x) = Kx K = (B ΠB + R) 1 B ΠA V (x) = (1/2)x Πx Π = Q + A ΠA A ΠB(B ΠB + R) 1 B ΠA (11) Proving Lemma 1 has shown also that for (A, B) controllable and Q, R >, a positive definite solution to the discrete algebraic Riccati equation (DARE), (11), exists and the eigenvalues of (A + BK) are asymptotically stable for the K corresponding to this solution (Bertsekas, 1987, pp.58 64). JCI 217 MPC short course 48 / 61

Extensions to constrained systems Enlarging the class of systems This basic approach to establishing regulator stability will be generalized to handle constrained (and nonlinear) systems The optimal cost is a Lyapunov function for the closed-loop system. We also can strengthen the stability for linear systems from asymptotic stability to exponential stability based on the form of the Lyapunov function. The LQR convergence result in Lemma 1 is the simplest to establish, but we can enlarge the class of systems and penalties for which closed-loop stability is guaranteed. The system restriction can be weakened from controllability to stabilizability, which is discussed in Exercises 1.19 and 1.2. The restriction on the allowable state penalty Q can be weakened from Q > to Q and (A, Q) detectable, which is also discussed in Exercise 1.2. The restriction R > is retained to ensure uniqueness of the control law. In applications, if one cares little about the cost of the control, then R is chosen to be small, but positive definite. JCI 217 MPC short course 49 / 61 JCI 217 MPC short course 5 / 61 Constrained regulation Adding constraints is a big change! Now we add the constraints to the control problem. Control Objective. V (x(), u) = 1 2 Constraints. Optimization. N 1 k= [ x(k) Qx(k) + u(k) Ru(k) ] + 1 2 x(n) P f x(n) Fx(k) + Eu(k) e, k =, 1,... N 1 min u V (x, u) subject to the model and constraints. We cannot solve the constrained problem in closed-form using dynamic programming. Now we have a quadratic objective subject to linear constraints, which is known as a quadratic program (Nocedal and Wright, 1999). We must compute the solution for industrial-sized multivariable problems online. We must compute all the u(k), k = 1, 2,..., N 1 simultaneously. That means a bigger online computation. But we have fast, reliable online computing available; why not use it? JCI 217 MPC short course 51 / 61 JCI 217 MPC short course 52 / 61

The geometry of quadratic programming The simplest possible constrained control law u 2 min u V (u) = (1/2)u Hu + h u subject to Du d V (u) = constant The optimal control law for the regulation problem with an active input constraint remains linear (affine) u = Kx + b. But the control laws are valid over only local polyhedral regions The simplest example would be a SISO, first-order system with input saturation. There are three regions. least squares u = H 1 h u 1.5 κ N(x) Du d quadratic program JCI 217 MPC short course 53 / 61 u 1.5 1 3 2 1 1 2 3 x Figure 8: The optimal control law for x + = x + u, N = 2, Q = R = 1, u [ 1, 1]. JCI 217 MPC short course 54 / 61 The constrained control law can be more complex Stability of the closed-loop system x 2 2 1.5 1.5.5 1 1.5 2 3 2 1 1 2 3 Figure 9: Explicit control law regions for a system with n = 2, m = 1, and N = 1. (Rawlings and Mayne, 29, p.5) x 1 The number of regions can increase exponentially with system order n, number of inputs, m, and horizon length N. For even modest state dimension and horizon length, we prefer to solve the QP online rather than compute and store this complex control law. We denote the solution to the QP, which depends on the initial state x as u (x) = κ N (x) The constrained controller is now nonlinear even though the model is linear The closed-loop system is then x + = Ax + Bκ N (x) And we would like to design the controller (choose N, Q, R, P f ) so that the closed-loop system is stable. JCI 217 MPC short course 55 / 61 JCI 217 MPC short course 56 / 61

Terminal constraint solution Adding a terminal constraint x(n) = ensures stability May cause infeasibility (cannot reach in only N moves) Open-loop predictions not equal to closed-loop behavior Infinite horizon solution The infinite horizon ensures stability Open-loop predictions equal to closed-loop behavior (desirable!) Challenge to implement x 2 V k+2 V k+1 l(x k+1, u k+1 ) x 2 V k+2 = V k+1 l(x k+1, u k+1 ) k + 2 k + 2 k + 1 k + 1 k k V k V k+1 V k l(x k, u k ) V k V k+1 = V k l(x k, u k ) x 1 x 1 JCI 217 MPC short course 57 / 61 JCI 217 MPC short course 58 / 61 Implementation challenges Lab Exercises The theory is now in reasonably good shape. In 199, there was no theory addressing the control problem with constraints. Numerical QP solution methods are efficient and robust. In the process industries, applications with hundreds of states and inputs are running today. But challenges remain: Scale. Methods must scale well with horizon N, number of states n, and inputs, m. The size of applications continues to grow. Ill-conditioning. Detect and repair nearly collinear constraints, badly conditioned Hessian matrix. Model accuracy. How to adjust the models automatically as conditions in the plant change. Exercise 1.25 Exercise 2.3 Exercise 2.19 Exercise 6.2 JCI 217 MPC short course 59 / 61 JCI 217 MPC short course 6 / 61

Further reading

R. E. Bellman. Dynamic Programming. Princeton University Press, Princeton, New Jersey, 1957.

D. P. Bertsekas. Dynamic Programming. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1987.

R. E. Kalman. Contributions to the theory of optimal control. Bull. Soc. Math. Mex., 5:102-119, 1960.

J. Nocedal and S. J. Wright. Numerical Optimization. Springer-Verlag, New York, 1999.

J. B. Rawlings and D. Q. Mayne. Model Predictive Control: Theory and Design. Nob Hill Publishing, Madison, WI, 2009. 576 pages, ISBN 978-0-9759377-0-9.

E. D. Sontag. Mathematical Control Theory. Springer-Verlag, New York, second edition, 1998.

G. Strang. Linear Algebra and its Applications. Academic Press, New York, second edition, 1980.