ESC794: Special Topics: Model Predictive Control


Nonlinear MPC Analysis: Part 1
Reference: Nonlinear Model Predictive Control (Ch. 3), Grüne and Pannek
Hanz Richter, Professor
Mechanical Engineering Department, Cleveland State University

Nonlinear MPC for Constant References

Here we consider equilibrium regulation of a nonlinear system under constraints. Consider the open-loop system $x^+ = f(x,u)$ and the MPC feedback law $u = \mu(x)$. The resulting closed-loop system is $x^+ = g(x) = f(x,\mu(x))$, and we assume that $x^*$ is an equilibrium for this system, that is, $g(x^*) = x^*$. Also, assume that there is no running cost associated with equilibrium-holding: $\ell(x^*,\mu(x^*)) = 0$. Finally, assume $\ell(x,u) > 0$ for $(x,u) \neq (x^*,\mu(x^*))$.

Optimal Control Problem (OCP$_N$, G&P 3.1) and NMPC Algorithm

minimize over $u \in U^N(x_0)$:  $J_N(x_0,u) = \sum_{k=0}^{N-1} \ell(x_u(k,x_0), u(k))$
subject to  $x_u(0,x_0) = x_0$,  $x_u(k+1,x_0) = f(x_u(k,x_0), u(k))$

Let $u^*(k)$ be the open-loop solution sequence for OCP$_N$. The MPC feedback law is defined by $\mu_N(x(n)) = u^*(0)$. Note that $n$ is the instant at which OCP$_N$ is solved, using $x_0 = x(n)$ (the feedback). The nominal closed-loop system resulting from this algorithm is $x^+ = f(x, \mu_N(x))$.
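To make the algorithm concrete, here is a minimal receding-horizon sketch using SciPy's general-purpose minimizer. The dynamics f and running cost l are illustrative placeholders (not the course's example), and state constraints are omitted for brevity; only input bounds are enforced.

```python
# Minimal sketch of the NMPC algorithm: at each instant n, solve OCP_N
# from the measured state and apply only the first element u*(0).
# f and l below are illustrative placeholders, not a specific example.
import numpy as np
from scipy.optimize import minimize

def f(x, u):
    # placeholder nonlinear dynamics x+ = f(x, u)
    return np.array([x[0] + 0.1 * x[1], x[1] + 0.1 * (u[0] - np.sin(x[0]))])

def l(x, u):
    # running cost, zero only at the target equilibrium (the origin)
    return x @ x + 0.1 * (u @ u)

def J_N(u_flat, x0, N, m):
    # J_N(x0, u) = sum_{k=0}^{N-1} l(x_u(k, x0), u(k))
    u_seq = u_flat.reshape(N, m)
    x, cost = np.asarray(x0, float), 0.0
    for k in range(N):
        cost += l(x, u_seq[k])
        x = f(x, u_seq[k])
    return cost

def mu_N(x0, N=10, m=1, u_max=2.0):
    # NMPC feedback: solve OCP_N with input bounds and return u*(0)
    bounds = [(-u_max, u_max)] * (N * m)
    res = minimize(J_N, np.zeros(N * m), args=(x0, N, m), bounds=bounds)
    return res.x.reshape(N, m)[0]

# nominal closed loop x+ = f(x, mu_N(x))
x = np.array([1.0, 0.0])
for n in range(50):
    x = f(x, mu_N(x))
print(x)  # should approach the origin
```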

Constraints and Related Definitions

Recalling the notational elements introduced in Handout 1, we consider $X$ to be the set of admissible states, decided by us as designers. $U(x)$ is the set of admissible inputs for state $x$, also specified by the designers; the simplest case is when there is no dependence on $x$. Define the set of admissible pairs by $Y = \{(x,u) : x \in X \text{ and } u \in U(x)\}$.

Let $N \in \mathbb{N}$ and $x_0 \in X$. A control sequence $u \in U^N$ and the resulting trajectory $x_u(k,x_0)$ are admissible for $x_0$ up to $N$ if $(x_u(k,x_0), u(k)) \in Y$ for $k = 0,1,\ldots,N-1$ and $x_u(N,x_0) \in X$. We need this separate condition on the last state because control sequences are one step shorter than the resulting trajectories. The set of all admissible control sequences for $x_0$ up to $N$ is denoted by $U^N(x_0)$.

Constraints and Related Definitions...

Let $x_0 \in X$ and $u \in U^\infty$. The control sequence and corresponding trajectory $x_u(k,x_0)$ are called admissible for $x_0$ if they are admissible for $x_0$ up to $N$ for every $N \in \mathbb{N}$. The set of all admissible sequences for $x_0$ is denoted $U^\infty(x_0)$.

A feedback law $\mu : \mathbb{N}_0 \times X \to U$ is called admissible if $\mu(n,x) \in U^1(x)$ for all $x \in X$ and all $n \in \mathbb{N}_0$.

Viability: we assume that for each $x \in X$ there is always some $u \in U(x)$ such that $f(x,u) \in X$ (there is always some admissible control to apply that will not result in a state constraint violation at the next step).

Footnote: Normally we require the entire predicted sequences out of OCP$_N$ to be admissible, even though only the first control sequence element will be applied. Not much else we can do!
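As a concrete reading of these definitions, the sketch below checks a control sequence for admissibility up to N. The sets X and U(x) and the dynamics f are illustrative placeholders (a box state constraint and an input bound), not objects from the course.

```python
# Sketch: admissibility check per the definitions above.
# X_ok, U_ok and f stand in for X, U(x) and the dynamics; here
# X = [-1,1]^2 and U(x) = [-1,1] purely for illustration.
import numpy as np

def X_ok(x):                  # is x in X?
    return bool(np.all(np.abs(x) <= 1.0))

def U_ok(x, u):               # is u in U(x)?
    return abs(u) <= 1.0

def f(x, u):                  # placeholder dynamics x+ = f(x, u)
    return np.array([x[0] + 0.1 * x[1], x[1] + 0.1 * u])

def admissible_up_to_N(x0, u_seq):
    # require (x_u(k), u(k)) in Y for k = 0..N-1, plus x_u(N) in X
    x = np.asarray(x0, float)
    for u in u_seq:
        if not (X_ok(x) and U_ok(x, u)):
            return False
        x = f(x, u)
    return X_ok(x)            # the extra condition on the last state

print(admissible_up_to_N([0.5, 0.0], [0.2, 0.2, -0.1]))  # True
```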

Viability...

Example: a car (point mass $m$) is situated between two walls separated by a distance $d$. The maximum acceleration and deceleration of the car are captured by the constraint $|u| \le \bar{U}$. The maximum allowable speed is $V$. Suppose the car obeys the double-integrator law $m\ddot{x} = u$. Sketch the viable subset of $X \subset \mathbb{R}^2$.

Note: G&P and other recent approaches to NMPC analysis make state admissibility a part of input admissibility. When we require $u \in U^N(x_0)$, we enforce not only input value constraints but also that the resulting state trajectories remain in $X$ up to time $N$. This shifts the burden of preserving state constraints to the numerical solvers, avoiding theoretical complications. We just assume we have viability; it is up to the solver to find the best viable solution.
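For the car example, the viable states are those from which the car can still brake to rest before hitting the wall it is moving toward. A grid sketch of this set follows (parameter values assumed for illustration; the braking distance $v^2/(2\bar{U}/m)$ comes from the continuous-time double integrator):

```python
# Grid sketch of the viable subset for the car-between-walls example.
# d, V, u_bar, m are assumed values; viability requires stopping within
# the remaining distance to the wall ahead.
import numpy as np
import matplotlib.pyplot as plt

m, d, V, u_bar = 1.0, 10.0, 3.0, 1.0
a_max = u_bar / m                          # maximum deceleration

pos = np.linspace(0.0, d, 400)             # position between the walls
vel = np.linspace(-V, V, 400)              # speed constraint |v| <= V
P, Vg = np.meshgrid(pos, vel)

# moving right (v > 0): must stop within d - p; moving left: within p
viable = np.where(Vg >= 0,
                  Vg**2 / (2 * a_max) <= d - P,
                  Vg**2 / (2 * a_max) <= P).astype(float)

plt.contourf(P, Vg, viable, levels=[0.5, 1.5])
plt.xlabel("position"); plt.ylabel("velocity")
plt.title("Viable subset (shaded)")
plt.show()
```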

Admissibility of the NMPC Feedback Law

This theorem shows that NMPC feedback generates admissible input sequences and corresponding trajectories provided viability is assumed. (At every point $x$ there is always some control $u$ that may be applied without violating $X$ at the next step; NMPC simply chooses $u = \mu_N(x)$ among them.)

Theorem (G&P 3.5): Consider OCP$_N$ 3.1 with constraints $u \in U^N(x_0)$. Suppose viability holds. Consider the nominal closed-loop system $x^+ = f(x, \mu_N(x))$ with $\mu_N(x(n)) = u^*(0)$ and suppose $x_0 = x_{\mu_N}(0) \in X$. Then

$(x_{\mu_N}(n), \mu_N(x_{\mu_N}(n))) \in Y$ for all $n \in \mathbb{N}_0$.

This key result leads to the recursive feasibility property, because the assumption $x_0 \in X$ will be automatically satisfied upon subsequent applications of the MPC feedback law, as a consequence of this very same theorem.

Time-Varying Optimal Control Problem (OCP$^n_N$)

For a time-varying reference $x^{\mathrm{ref}}(n)$, the running cost $\ell(n,x,u)$ is assumed to satisfy $\ell(n, x^{\mathrm{ref}}(n), u^{\mathrm{ref}}(n)) = 0$, where $x^{\mathrm{ref}}(n)$ has been generated by a suitable $u^{\mathrm{ref}}(n)$: $x^{\mathrm{ref}}(n+1) = f(x^{\mathrm{ref}}(n), u^{\mathrm{ref}}(n))$.

Also, the running cost must be positive away from the reference: $\ell(n,x,u) > 0$ for all $n$, all $u \in U$ and all $x \in X$ with $x \neq x^{\mathrm{ref}}(n)$.

A running cost that satisfies the above is $\ell(n,x,u) = \|x - x^{\mathrm{ref}}(n)\|^2 + \lambda \|u - u^{\mathrm{ref}}(n)\|^2$ with $\lambda \ge 0$.

Time-Varying NMPC Algorithm

Measure $x(n)$, set $x_0 = x(n)$ and solve:

minimize over $u \in U^N(x_0)$:  $J_N(n, x_0, u) = \sum_{k=0}^{N-1} \ell(n+k, x_u(k,x_0), u(k))$
subject to  $x_u(0,x_0) = x_0$,  $x_u(k+1,x_0) = f(x_u(k,x_0), u(k))$

Let $u^*(k)$ be the open-loop solution sequence for OCP$^n_N$. The MPC feedback law is defined by $\mu_N(n, x(n)) = u^*(0)$. The nominal closed-loop system resulting from this algorithm is $x^+ = f(x, \mu_N(n,x))$.

Terminal Constraint Sets

Terminal constraint sets provide a way to guarantee feasibility and closed-loop stability of NMPC. For a fixed terminal set $X_0$, a general terminal constraint is expressed as $x_u(N, x(n)) \in X_0$ for each $u \in U^N(x_0)$. In words, we are asking admissible predicted sequences to yield a predicted state at the end of the horizon that lies in a desired set $X_0$.

The terminal set need not be fixed; it may move with time: $X_0(n)$. In this case, the terminal constraint has the form $x_u(N, x(n)) \in X_0(n+N)$.

Terminal sets are used to define feasible sets (of initial conditions), denoted $X_N$: $X_N = \{x_0 \in X : \exists u \in U^N(x_0) \text{ such that } x_u(N,x_0) \in X_0\}$. The corresponding admissible control sequences available for $x_0$ are $U^N_{X_0}(x_0) = \{u \in U^N(x_0) : x_u(N,x_0) \in X_0\}$. Similar definitions apply to the time-varying case; see Def. 3.9(ii) in G&P.

Terminal Costs and Weighted Costs: the "Everything" Algorithm

The predicted state at the end of the horizon may be included as a separate term in the cost function, $F(x_u(N, x(n)))$. Again, this will be used as part of the stability analysis. Weighted costs are generated by using a sequence of non-negative weights $\omega_k$. A time-varying algorithm with weighted cost, terminal constraint and terminal cost (the "everything" algorithm) is: at time $n$, set $x_0 = x(n) \in X_N$ and solve

minimize over $u \in U^N_{X_0}(n, x_0)$:  $J_N(n,x_0,u) = \sum_{k=0}^{N-1} \omega_{N-k}\, \ell(n+k, x_u(k,x_0), u(k)) + F(n+N, x_u(N,x_0))$
subject to  $x_u(0,x_0) = x_0$,  $x_u(k+1,x_0) = f(x_u(k,x_0), u(k))$

Note that the terminal state constraint is part of the admissibility requirement for $u$, and thus not listed as a "subject to" constraint. But when coding for numerical solutions, terminal constraints are listed among the "subject to" constraints.
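A sketch of how this cost might be assembled numerically (the functions l, F, f and the weight sequence are placeholders); as remarked above, the terminal constraint $x_u(N,x_0) \in X_0(n+N)$ would be handed to the solver as an explicit constraint rather than folded into the cost.

```python
# Sketch: assemble the "everything" cost
#   J_N(n, x0, u) = sum_k omega_{N-k} l(n+k, x_u(k), u(k)) + F(n+N, x_u(N))
# l, F, f and omega are placeholders, not a fixed API; omega has length
# N+1 so that omega[N-k] is defined for k = 0..N-1.
import numpy as np

def J_everything(u_seq, x0, n, N, f, l, F, omega):
    x, cost = np.asarray(x0, float), 0.0
    for k in range(N):
        cost += omega[N - k] * l(n + k, x, u_seq[k])  # weighted stage cost
        x = f(x, u_seq[k])                            # predict one step
    return cost + F(n + N, x)                         # terminal cost on x_u(N)
```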

Recursive Feasibility of NMPC

Consider the "everything" algorithm. The following holds (Corollary 3.13 in G&P) for all $n \in \mathbb{N}$:

$x \in X_N(n) \Rightarrow f(x, \mu_N(n,x)) \in X_{N-1}(n+1)$

At any time $n$, if we solve the NMPC problem at some point of the current $N$-step feasible set, then $x^+$ under NMPC feedback will belong to the next $(N-1)$-step feasible set. When the terminal set is constant and consists of a single equilibrium point, $(N-1)$-step feasibility implies $N$-step feasibility, since we may simply apply the equilibrium-holding control input for one more step.

Bellman's Optimality Principle

Suppose a driver has been following a bad route due to confusion. Upon realizing the mistake, he takes the best route from where he is, regardless of what he did before; and had he started from that point, he would have followed that same best route. This empirical observation is reflected in Bellman's principle of optimality, which will be presented in precise mathematical form as the MPC dynamic programming equality and inequalities.

Optimal Value Function and Optimal Sequences

Define the optimal value function (or minimum cost) by $V_N(n, x_0) = \inf_{u \in U^N_{X_0}(n,x_0)} J_N(n, x_0, u)$.

We use inf instead of min because there may be no admissible $u$ that attains $V_N$ exactly. For example, the function $e^{-t}$ has infimum zero over $t \ge 0$, but there is no $t$ that produces zero.

A control sequence $u^* \in U^N_{X_0}(n, x_0)$ is optimal if it actually achieves the optimal value: $J_N(n, x_0, u^*) = V_N(n, x_0)$.

Dynamic Programming Principle

(Th. 3.15 in G&P): For OCP$^n_{N,e}$ with $x_0 \in X_N(n)$, all $n, N \in \mathbb{N}_0$, and $K = 1, \ldots, N$:

(i) $V_N(n, x_0) = \inf_{u \in U^K_{X_{N-K}}(n, x_0)} \left\{ \sum_{k=0}^{K-1} \omega_{N-k}\, \ell(n+k, x_u(k,x_0), u(k)) + V_{N-K}(n+K, x_u(K,x_0)) \right\}$

In words: the total optimal cost (starting at $x_0$ at time $n$ with horizon $N$) equals the infimum of (the cost from $x_0$ at $n$ over horizon $K$) plus (the optimal cost from $x_u(K,x_0)$ at time $n+K$, horizon $N-K$). Note that the terminal cost does not appear in the DP equation, but the principle is valid for an OCP containing such a term.

Footnote: Important: this principle applies to the predicted solutions of the OCP, not to their repeated use as feedback controls.

Dynamic Programming Principle...

(ii) If an optimal sequence $u^* \in U^N_{X_0}(n,x_0)$ exists for $x_0$, then

$V_N(n,x_0) = \sum_{k=0}^{K-1} \omega_{N-k}\, \ell(n+k, x_{u^*}(k,x_0), u^*(k)) + V_{N-K}(n+K, x_{u^*}(K,x_0))$

[Diagram: at time $n$, solving the OCP from $x_0$ yields $u^*$ at cost $V_N$; the first $K$ stage costs $\sum_{k=0}^{K-1} \omega_{N-k}\, \ell(n+k, x_{u^*}(k,x_0), u^*(k))$ drive the state to $x_{u^*}(K,x_0)$; at time $n+K$, solving the OCP returns the tail of $u^*$, at cost-to-go $V_{N-K}(n+K, x_{u^*}(K,x_0))$.]

Dynamic Programming Principle...

(Corollary 3.16 in G&P): If $u^*$ is an optimal solution to OCP$^n_{N,e}$ for $x_0 \in X_N(n)$ at time $n$ with $N \ge 2$, then for each $K = 1, 2, \ldots, N-1$ the shifted sequence $u^*_K(k) = u^*(k+K)$, $k = 0, 1, \ldots, N-K-1$, is an optimal solution for $x_{u^*}(K, x_0)$ at time $n+K$, with horizon $N-K$.

(Theorem 3.17): Consider OCP$^n_{N,e}$ with $x_0 \in X_N(n)$ and assume an optimal sequence $u^*$ exists. Then the NMPC feedback $\mu_N(n, x_0) = u^*(0)$ satisfies

$\mu_N(n, x_0) = \arg\min_{u \in U^1_{X_{N-1}}(n, x_0)} \left\{ \omega_N\, \ell(n, x_0, u) + V_{N-1}(n+1, f(x_0, u)) \right\}$

Note that the arg min is taken over all one-element sequences admissible at $x_0$ and time $n$.

Interpretation

At $x_0$ and time $n$, try all admissible one-element sequences $u$. Recall what admissibility entails (constrained controls, states, terminal state). Each $u$ results in a one-step stage cost, a next state $f(x_0, u)$ and an optimal cost-to-go $V_{N-1}(n+1, f(x_0, u))$. Tally all admissible $u$'s and, for each, add the stage cost and the cost-to-go. Locate the minimum sum. Theorem 3.17 says that the $u$ giving the minimum sum is the NMPC solution $u^*(0)$.

The next corollary says the following: suppose we apply the NMPC feedback for some time. If we form a sequence with the applied feedbacks, it will be a solution to the one-shot OCP at the initial time.

(Corollary 3.18): Consider OCP$^n_{N,e}$ with $x_0 \in X$ and consider the set of admissible feedback laws $\mu_{N-k}$ for $k = 0, 1, \ldots, N-1$. Then the sequence $u(k) = \mu_{N-k}(n+k, x_u(k, x_0))$, $k = 0, 1, \ldots, N-1$, is an optimal control sequence for OCP$^n_{N,e}$.

Interpretation...

Important: unless $N = \infty$, elements $2, 3, \ldots, N$ of the predicted control and state sequences at time $n$ do not match the feedback and closed-loop states at $n+1, n+2, \ldots, n+N-1$.

Example (Prob. 3.2 in G&P): Consider the system $x_1^+ = x_1 + 2x_2$, $x_2^+ = x_2 + 2u$ with running cost $\ell(x,u) = u^2$, $x_0 = [0\ 0]^T$, $x_N = [4\ 0]^T$. For $N = 4$, use the dynamic programming principle to obtain the first predicted trajectory and optimal cost. We then use numerical simulations to illustrate the validity of the above theorems and corollaries.
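Because the system is linear and the cost is $\sum_k u(k)^2$, this OCP is a minimum-energy (least-norm) problem, so the DP answer can be cross-checked with plain linear algebra; the sketch below uses the least-norm formula in place of the DP tabulation.

```python
# Cross-check of Prob. 3.2: minimize sum_k u(k)^2 driving x from [0,0]
# to [4,0] in N = 4 steps for x+ = A x + B u. The least-norm solution
# u = C^T (C C^T)^{-1} b replaces the DP tabulation here.
import numpy as np

A = np.array([[1.0, 2.0], [0.0, 1.0]])
B = np.array([[0.0], [2.0]])
N, x0, xN = 4, np.zeros(2), np.array([4.0, 0.0])

# x(N) = A^N x0 + [A^{N-1}B ... AB  B] u
C = np.hstack([np.linalg.matrix_power(A, N - 1 - k) @ B for k in range(N)])
b = xN - np.linalg.matrix_power(A, N) @ x0

u = C.T @ np.linalg.solve(C @ C.T, b)   # minimum-energy input sequence
print("u* =", u)                        # [0.3, 0.1, -0.1, -0.3]
print("cost =", u @ u)                  # 0.2

x = x0.copy()                           # first predicted trajectory
for k in range(N):
    x = A @ x + B.ravel() * u[k]
print("x(4) =", x)                      # [4. 0.]
```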

Dynamic Programming Principle, LTI DT Systems and the Finite-Horizon Quadratic Problem

Consider the LTI discrete-time system $x^+ = Ax + Bu$ with initial state $x(0) = x_0$ and quadratic cost

$J_N(x_0, u) = \sum_{k=0}^{N-1} \left\{ x_u^T(k,x_0)\, Q\, x_u(k,x_0) + u^T(k)\, R\, u(k) \right\} + x_u^T(N)\, Q_f\, x_u(N)$

We use the DP principle to find a solution for the unconstrained OCP and the corresponding optimal cost function. Standard solvability assumptions: $Q = Q^T \succeq 0$, $Q_f = Q_f^T \succeq 0$, $R = R^T \succ 0$.

Finite-Horizon DLQR...

Apply the DP principle (Th. 3.15) to this case with $K = 1$ at time $n$. The stage cost term is associated with the initial state and control: $x_0^T Q x_0 + u^T R u$. With $u$ as the initial control, the next state is $x^+ = A x_0 + B u$, and the cost-to-go to be minimized is $V_{N-1}(n+1, A x_0 + B u)$. The DP principle then reduces to

$V_N(n, x_0) = x_0^T Q x_0 + \min_{u \in U^1} \left\{ u^T R u + V_{N-1}(n+1, A x_0 + B u) \right\}$

It is guessed that $V_{N-1}(n+1, z)$ is a quadratic time-varying function: $V_{N-1}(n+1, z) = z^T P_{n+1} z$, where $P_n$ is a sequence of symmetric, positive definite matrices. Substituting the guess:

$V_N(n, x_0) = x_0^T Q x_0 + \min_{u \in U^1} \left\{ u^T R u + (A x_0 + B u)^T P_{n+1} (A x_0 + B u) \right\}$

Finite-Horizon DLQR...

Perform the indicated minimization by equating the gradient to zero: $2 u^T R + 2 (A x_0 + B u)^T P_{n+1} B = 0$, which gives the well-known optimal solution

$u^* = -(R + B^T P_{n+1} B)^{-1} B^T P_{n+1} A\, x_0 \triangleq -K_n x_0$

Substituting this solution into the DP principle equation gives the Riccati backward recursion:

$P_n = Q + A^T P_{n+1} A - A^T P_{n+1} B (R + B^T P_{n+1} B)^{-1} B^T P_{n+1} A$

To solve the above, note that $P_N = Q_f$. This is used as the initial value to find $P_{N-1}, P_{N-2}, \ldots, P_0$ in that order, along with the optimal control sequence.
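The recursion is a few lines of numpy, following the formulas above; the sketch returns the time-varying gains $K_n$ so that $u^*(n) = -K_n x(n)$.

```python
# Backward Riccati recursion: start from P_N = Qf and iterate
#   P_n = Q + A'P_{n+1}A - A'P_{n+1}B (R + B'P_{n+1}B)^{-1} B'P_{n+1}A,
# storing the gains K_n = (R + B'P_{n+1}B)^{-1} B'P_{n+1}A.
import numpy as np

def finite_horizon_dlqr(A, B, Q, R, Qf, N):
    P, gains = Qf, [None] * N
    for n in reversed(range(N)):
        S = R + B.T @ P @ B
        K = np.linalg.solve(S, B.T @ P @ A)      # K_n
        P = Q + A.T @ P @ A - A.T @ P @ B @ K    # P_n
        gains[n] = K
    return gains
```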

Example: Finite-Horizon DLQR

Consider a double-integrator plant discretized with ZOH. We simulate the finite-horizon optimal regulator with identity weights and observe the effect of $N$.

[Figure: state trajectory ($x_1$ vs. $x_2$) of the finite-time DLQR closed loop.]
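A sketch reproducing the experiment, reusing finite_horizon_dlqr from the previous sketch; the sample time and initial state are assumed for illustration.

```python
# Finite-horizon DLQR on the ZOH-discretized double integrator,
# identity weights, for several horizons N. T and x(0) are assumed.
import numpy as np

T = 0.1
A = np.array([[1.0, T], [0.0, 1.0]])     # ZOH discretization of xdd = u
B = np.array([[T**2 / 2], [T]])
Q, R, Qf = np.eye(2), np.eye(1), np.eye(2)

for N in (5, 20, 50):                    # observe the effect of N
    K = finite_horizon_dlqr(A, B, Q, R, Qf, N)
    x = np.array([[2.0], [0.0]])
    traj = [x.ravel().copy()]
    for n in range(N):
        x = A @ x - B @ (K[n] @ x)       # u(n) = -K_n x(n)
        traj.append(x.ravel().copy())
    print(N, traj[-1])                   # final state for each horizon
```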