Robotics. Control Theory. Marc Toussaint U Stuttgart

Robotics: Control Theory
Topics in control theory, optimal control, HJB equation, infinite horizon case, Linear-Quadratic optimal control, Riccati equations (differential, algebraic, discrete-time), controllability, stability, eigenvalue analysis, Lyapunov function
Marc Toussaint, U Stuttgart

Cart pole example
[figure: cart pole, with cart position x, pole angle θ, and control u]
state $x = (x, \dot x, \theta, \dot\theta)$
$$\ddot\theta = \frac{g \sin(\theta) + \cos(\theta) \left[ -c_1 u - c_2 \dot\theta^2 \sin(\theta) \right]}{\frac{4}{3} l - c_2 \cos^2(\theta)}$$
$$\ddot x = c_1 u + c_2 \left[ \dot\theta^2 \sin(\theta) - \ddot\theta \cos(\theta) \right]$$
2/44

Control Theory
Concerns controlled systems of the form
    ẋ = f(x, u) + noise(x, u)
and a controller of the form
    π : (x, t) ↦ u
We'll neglect stochasticity here.
When analyzing a given controller π, one analyzes the closed-loop system as described by the differential equation
    ẋ = f(x, π(x, t))
(e.g., analysis for convergence & stability)
3/44

Core topics in Control Theory
Stability*: Analyze the stability of a closed-loop system (eigenvalue analysis or Lyapunov function method)
Controllability*: Analyze which dimensions (DoFs) of the system can in principle be controlled
Transfer function: Analyze the closed-loop transfer function, i.e., how frequencies are transmitted through the system (→ Laplace transformation)
Controller design: Find a controller with desired stability and/or transfer function properties
Optimal control*: Define a cost function on the system behavior; optimize a controller to minimize costs
4/44

Control Theory references
Robert F. Stengel: Optimal Control and Estimation. Online lectures: http://www.princeton.edu/~stengel/mae546lectures.html (esp. lectures 3, 4 and 7-9)
From robotics lectures:
Stefan Schaal's lecture Introduction to Robotics: http://www-clmc.usc.edu/teaching/teachingintroductiontoroboticssyllabus
Drew Bagnell's lecture on Adaptive Control and Reinforcement Learning: http://robotwhisperer.org/acrls11/
5/44

Outline
We'll first consider optimal control. Goal: understand the Algebraic Riccati equation and its significance for local neighborhood control.
Then controllability & stability.
6/44

Optimal control (discrete time)
Given a controlled dynamic system
$$x_{t+1} = f(x_t, u_t)$$
we define a cost function
$$J^\pi = \sum_{t=0}^{T} c(x_t, u_t) + \phi(x_T)$$
where $x_0$ and the controller $\pi : (x, t) \mapsto u$ are given, which determines $x_{1:T}$ and $u_{0:T}$
7/44
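To make the definition concrete, here is a minimal rollout that evaluates $J^\pi$ for given toy dynamics f, stage cost c, terminal cost φ and policy π; the functions and numbers below are illustrative placeholders, not from the lecture.

```python
def evaluate_cost(f, c, phi, pi, x0, T):
    """Roll out x_{t+1} = f(x_t, u_t) under policy pi and accumulate J^pi."""
    x, J = x0, 0.0
    for t in range(T):
        u = pi(x, t)      # controller pi : (x, t) -> u
        J += c(x, u)      # stage cost c(x_t, u_t)
        x = f(x, u)       # system step
    return J + phi(x)     # add terminal cost phi(x_T)

# toy usage: scalar linear system, quadratic cost, proportional controller
f   = lambda x, u: 0.9 * x + 0.1 * u
c   = lambda x, u: x**2 + 0.1 * u**2
phi = lambda x: x**2
pi  = lambda x, t: -0.5 * x
print(evaluate_cost(f, c, phi, pi, x0=1.0, T=50))
```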

Dynamic Programming & Bellman principle
"An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision."
[figure: a small Start→Goal graph with edge costs, annotated with the optimal cost-to-go V at each node]
$$V(\text{state}) = \min_{\text{edge}} \left[ c(\text{edge}) + V(\text{next-state}) \right]$$
8/44
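A self-contained sketch of this recursion on a made-up graph (the edge costs below are illustrative only; they are not the values from the slide's figure):

```python
# Backward recursion V(s) = min_edge [ c(edge) + V(next) ] on a small DAG.
graph = {                       # node -> list of (successor, edge cost)
    'Start': [('A', 1), ('B', 3)],
    'A':     [('B', 1), ('Goal', 7)],
    'B':     [('Goal', 2)],
    'Goal':  [],
}

def cost_to_go(node, V):
    if node in V:
        return V[node]
    if not graph[node]:         # terminal node
        V[node] = 0.0
    else:
        V[node] = min(c + cost_to_go(nxt, V) for nxt, c in graph[node])
    return V[node]

V = {}
print(cost_to_go('Start', V), V)   # optimal cost Start -> Goal is 1 + 1 + 2 = 4
```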

Bellman equation (discrete time)
Define the value function or optimal cost-to-go function
$$V_t(x) = \min_\pi \left[ \sum_{s=t}^{T} c(x_s, u_s) + \phi(x_T) \right]_{x_t = x}$$
Bellman equation
$$V_t(x) = \min_u \left[ c(x, u) + V_{t+1}(f(x, u)) \right]$$
The argmin gives the optimal control signal: $\pi^*_t(x) = \operatorname{argmin}_u [\cdots]$
Derivation:
$$V_t(x) = \min_\pi \left[ \sum_{s=t}^{T} c(x_s, u_s) + \phi(x_T) \right]
= \min_{u_t} \left[ c(x, u_t) + \min_\pi \left[ \sum_{s=t+1}^{T} c(x_s, u_s) + \phi(x_T) \right] \right]
= \min_{u_t} \left[ c(x, u_t) + V_{t+1}(f(x, u_t)) \right]$$
9/44

Optimal Control (continuous time)
Given a controlled dynamic system
    ẋ = f(x, u)
we define a cost function with horizon T
$$J^\pi = \int_0^T c(x(t), u(t))\, dt + \phi(x(T))$$
where the start state x(0) and the controller π : (x, t) ↦ u are given, which determine the closed-loop system trajectory x(t), u(t) via ẋ = f(x, π(x, t)) and u(t) = π(x(t), t)
10/44

Hamilton-Jacobi-Bellman equation (continuous time)
Define the value function or optimal cost-to-go function
$$V(x, t) = \min_\pi \left[ \int_t^T c(x(s), u(s))\, ds + \phi(x(T)) \right]_{x(t) = x}$$
Hamilton-Jacobi-Bellman equation
$$-\frac{\partial}{\partial t} V(x, t) = \min_u \left[ c(x, u) + \frac{\partial V}{\partial x} f(x, u) \right]$$
The argmin gives the optimal control signal: $\pi^*(x) = \operatorname{argmin}_u [\cdots]$
Derivation:
$$\frac{dV(x, t)}{dt} = \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x} \dot x = -c(x, u^*)
\quad\Rightarrow\quad \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x} f(x, u^*) = -c(x, u^*)
\quad\Rightarrow\quad -\frac{\partial V}{\partial t} = c(x, u^*) + \frac{\partial V}{\partial x} f(x, u^*)$$
11/44

Infinite horizon case
$$J^\pi = \int_0^\infty c(x(t), u(t))\, dt$$
This cost function is stationary (time-invariant)!
→ the optimal value function is stationary (V(x, t) = V(x))
→ the optimal control signal depends on x but not on t
→ the optimal controller π* is stationary
The HJB and Bellman equations remain the same, but with the same (stationary) value function independent of t:
$$0 = \min_u \left[ c(x, u) + \frac{\partial V}{\partial x} f(x, u) \right] \qquad \text{(continuous time)}$$
$$V(x) = \min_u \left[ c(x, u) + V(f(x, u)) \right] \qquad \text{(discrete time)}$$
The argmin gives the optimal control signal: $\pi^*(x) = \operatorname{argmin}_u [\cdots]$
12/44

Infinite horizon examples
Cart-pole balancing:
You always want the pole to be upright (θ ≈ 0)
You always want the cart to be close to zero (x ≈ 0)
You want to spare energy (apply low torques) (u ≈ 0)
You might define a cost
$$J^\pi = \int_0^\infty \|\theta\|^2 + \epsilon \|x\|^2 + \rho \|u\|^2 \; dt$$
Reference following:
You always want to stay close to a reference trajectory r(t)
Define $\tilde x(t) = x(t) - r(t)$ with dynamics $\dot{\tilde x}(t) = f(\tilde x(t) + r(t), u) - \dot r(t)$
Define a cost
$$J^\pi = \int_0^\infty \|\tilde x\|^2 + \rho \|u\|^2 \; dt$$
Many, many problems in control can be framed this way
13/44

Comments
The Bellman equation is fundamental in optimal control theory, but also in Reinforcement Learning
The HJB equation is a differential equation for V(x, t) which is in general hard to solve
The (time-discretized) Bellman equation can be solved by Dynamic Programming starting backward:
$$V_T(x) = \phi(x), \qquad V_{T-1}(x) = \min_u \left[ c(x, u) + V_T(f(x, u)) \right], \quad \text{etc.}$$
But it might still be hard or infeasible to represent the functions $V_t(x)$ over continuous x!
Both become significantly simpler under linear dynamics and quadratic costs: → Riccati equation
14/44
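As an illustration of this backward recursion (not part of the lecture), a crude tabular sketch that discretizes a 1D state and control and computes $V_t$ on a grid; dynamics, costs and grid sizes are arbitrary choices:

```python
import numpy as np

xs = np.linspace(-2, 2, 81)           # discretized states
us = np.linspace(-1, 1, 21)           # discretized controls
f   = lambda x, u: 0.9 * x + 0.5 * u  # toy dynamics
c   = lambda x, u: x**2 + 0.1 * u**2  # quadratic stage cost
phi = lambda x: x**2                  # terminal cost
T = 20

V = phi(xs)                           # V_T on the grid
for t in range(T - 1, -1, -1):        # backward: V_t(x) = min_u [c(x,u) + V_{t+1}(f(x,u))]
    V_new = np.empty_like(V)
    for i, x in enumerate(xs):
        nxt = np.interp(f(x, us), xs, V)   # interpolate V_{t+1} at the successor states
        V_new[i] = np.min(c(x, us) + nxt)
    V = V_new

print(V[np.argmin(np.abs(xs))])       # approximate optimal cost-to-go from x = 0
```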

Linear-Quadratic Optimal Control
linear dynamics
    ẋ = f(x, u) = Ax + Bu
quadratic costs
$$c(x, u) = x^\top Q x + u^\top R u, \qquad \phi(x_T) = x_T^\top F x_T$$
Note: The dynamics neglect a constant term; the costs neglect linear and constant terms. This is because
constant costs are irrelevant,
linear cost terms can be removed by redefining x or u,
a constant dynamics term only introduces a constant drift.
15/44

Linear-Quadratic Control as Local Approximation
LQ control is important also for controlling non-LQ systems in the neighborhood of a desired state!
Let x* be such a desired state (e.g., cart-pole: x* = (0, 0, 0, 0))
linearize the dynamics around x*
use a 2nd-order approximation of the costs around x*
control the system locally as if it were LQ
pray that the system will never leave this neighborhood!
16/44
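One way to obtain the local linear model is numerical differentiation of f around (x*, u*). The sketch below is illustrative only (finite differences; the pendulum used for the test is an arbitrary stand-in, not the cart-pole):

```python
import numpy as np

def linearize(f, x_star, u_star, eps=1e-5):
    """Finite-difference linearization: xdot ≈ A (x - x*) + B (u - u*) near (x*, u*)."""
    n, m = len(x_star), len(u_star)
    A, B = np.zeros((n, n)), np.zeros((n, m))
    for i in range(n):
        dx = np.zeros(n); dx[i] = eps
        A[:, i] = (f(x_star + dx, u_star) - f(x_star - dx, u_star)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m); du[j] = eps
        B[:, j] = (f(x_star, u_star + du) - f(x_star, u_star - du)) / (2 * eps)
    return A, B

# test on a simple pendulum xdot = (theta_dot, -(g/l) sin(theta) + u), around the hanging rest state
g, l = 9.81, 1.0
f = lambda x, u: np.array([x[1], -(g / l) * np.sin(x[0]) + u[0]])
A, B = linearize(f, np.zeros(2), np.zeros(1))
print(A)   # ≈ [[0, 1], [-9.81, 0]]
print(B)   # ≈ [[0], [1]]
```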

Riccati differential equation = HJB equation in the LQ case
In the Linear-Quadratic (LQ) case, the value function is always a quadratic function of x!
Let $V(x, t) = x^\top P(t)\, x$; then the HJB equation becomes
$$-\frac{\partial V(x, t)}{\partial t} = \min_u \left[ c(x, u) + \frac{\partial V}{\partial x} f(x, u) \right]$$
$$-x^\top \dot P(t)\, x = \min_u \left[ x^\top Q x + u^\top R u + 2 x^\top P(t)(Ax + Bu) \right]$$
$$0 = \frac{\partial}{\partial u} \left[ x^\top Q x + u^\top R u + 2 x^\top P(t)(Ax + Bu) \right] = 2 u^\top R + 2 x^\top P(t) B
\quad\Rightarrow\quad u^* = -R^{-1} B^\top P\, x$$
→ Riccati differential equation
$$-\dot P = A^\top P + P A - P B R^{-1} B^\top P + Q$$
17/44

Riccati differential equation
$$-\dot P = A^\top P + P A - P B R^{-1} B^\top P + Q$$
This is a differential equation for the matrix P(t) describing the quadratic value function. If we solve it with the finite-horizon constraint P(T) = F, we have solved the optimal control problem.
The optimal control $u^* = -R^{-1} B^\top P\, x$ is called the Linear Quadratic Regulator.
Note: If the state is dynamic (e.g., x = (q, q̇)), this control is linear in the positions and linear in the velocities and is an instance of PD control.
The matrix $K = R^{-1} B^\top P$ is therefore also called gain matrix.
For instance, if $x(t) = (q(t) - r(t),\ \dot q(t) - \dot r(t))$ for a reference r(t) and $K = [K_p\ \ K_d]$, then
$$u^* = K_p (r(t) - q(t)) + K_d (\dot r(t) - \dot q(t))$$
18/44

Riccati equations
Finite horizon, continuous time: Riccati differential equation
$$-\dot P = A^\top P + P A - P B R^{-1} B^\top P + Q, \qquad P(T) = F$$
Infinite horizon, continuous time: Algebraic Riccati equation (ARE)
$$0 = A^\top P + P A - P B R^{-1} B^\top P + Q$$
Finite horizon, discrete time ($J^\pi = \sum_{t=0}^{T} \|x_t\|^2_Q + \|u_t\|^2_R + \|x_T\|^2_F$):
$$P_{t-1} = Q + A^\top \left[ P_t - P_t B (R + B^\top P_t B)^{-1} B^\top P_t \right] A, \qquad P_T = F$$
Infinite horizon, discrete time ($J^\pi = \sum_{t=0}^{\infty} \|x_t\|^2_Q + \|u_t\|^2_R$):
$$P = Q + A^\top \left[ P - P B (R + B^\top P B)^{-1} B^\top P \right] A$$
19/44
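Numerically, the two infinite-horizon equations can be solved with standard library routines; a minimal sketch using SciPy (scipy.linalg.solve_continuous_are and solve_discrete_are) on a double integrator, which is only an illustrative system choice:

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_discrete_are

# double integrator: xdot = [[0,1],[0,0]] x + [[0],[1]] u, quadratic costs
A = np.array([[0., 1.], [0., 0.]])
B = np.array([[0.], [1.]])
Q = np.eye(2)
R = np.eye(1)

# continuous-time ARE:  0 = A'P + PA - P B R^-1 B' P + Q
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)        # gain K = R^-1 B' P,  u* = -K x
print(P)
print(K)

# discrete-time ARE for a (crudely) Euler-discretized version, for comparison
dt = 0.01
Ad, Bd = np.eye(2) + dt * A, dt * B
Pd = solve_discrete_are(Ad, Bd, dt * Q, dt * R)
print(Pd)
```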

Example: 1D point mass
Dynamics:
$$\ddot q(t) = u(t)/m$$
$$x = \begin{pmatrix} q \\ \dot q \end{pmatrix}, \qquad
\dot x = \begin{pmatrix} \dot q \\ \ddot q \end{pmatrix} = \begin{pmatrix} \dot q \\ u(t)/m \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} x + \begin{pmatrix} 0 \\ 1/m \end{pmatrix} u
= A x + B u, \qquad
A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ 1/m \end{pmatrix}$$
Costs:
$$c(x, u) = \epsilon \|x\|^2 + \varrho \|u\|^2, \qquad Q = \epsilon I, \quad R = \varrho I$$
Algebraic Riccati equation:
$$P = \begin{pmatrix} a & c \\ c & b \end{pmatrix}, \qquad
u^* = -R^{-1} B^\top P\, x = -\frac{1}{\varrho m} \left[ c\, q + b\, \dot q \right]$$
$$0 = A^\top P + P A - P B R^{-1} B^\top P + Q
= \begin{pmatrix} 0 & 0 \\ a & c \end{pmatrix} + \begin{pmatrix} 0 & a \\ 0 & c \end{pmatrix}
- \frac{1}{\varrho m^2} \begin{pmatrix} c^2 & bc \\ bc & b^2 \end{pmatrix}
+ \epsilon \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
20/44

Example: 1D point mass (cont.)
Algebraic Riccati equation:
$$P = \begin{pmatrix} a & c \\ c & b \end{pmatrix}, \qquad
u^* = -R^{-1} B^\top P\, x = -\frac{1}{\varrho m} \left[ c\, q + b\, \dot q \right]$$
$$0 = \begin{pmatrix} 0 & 0 \\ a & c \end{pmatrix} + \begin{pmatrix} 0 & a \\ 0 & c \end{pmatrix}
- \frac{1}{\varrho m^2} \begin{pmatrix} c^2 & bc \\ bc & b^2 \end{pmatrix}
+ \epsilon \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
First solve the (1,1) entry for c, then the (2,2) entry for $b = m \sqrt{\varrho (2c + \epsilon)}$. Whether the resulting closed-loop damping ratio ξ is larger or smaller than 1 (over- vs. under-damped) depends on the choices of ϱ and ɛ.
The Algebraic Riccati equation is usually solved numerically. (E.g., are, care or dare in Octave.)
21/44
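In Python, a rough equivalent of the Octave routines is scipy.linalg.solve_continuous_are; the following sketch (with arbitrary values for m, ɛ, ϱ) checks the structure of P and the gain numerically:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

m, eps, rho = 2.0, 1.0, 0.1                 # mass and cost weights (arbitrary choices)
A = np.array([[0., 1.], [0., 0.]])
B = np.array([[0.], [1. / m]])
Q = eps * np.eye(2)
R = rho * np.eye(1)

P = solve_continuous_are(A, B, Q, R)        # P = [[a, c], [c, b]]
a, c, b = P[0, 0], P[0, 1], P[1, 1]
K = np.linalg.solve(R, B.T @ P)             # u* = -K x = -(1/(rho*m)) [c q + b qdot]
print(c, m * np.sqrt(eps * rho))            # the (1,1) entry of the ARE gives c = m*sqrt(eps*rho)
print(K, np.array([[c, b]]) / (rho * m))    # gain matches the closed form above
```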

Optimal control comments
The HJB and Bellman equations are very powerful concepts
Even if we can solve the HJB equation and have the optimal control, we still don't know how the system really behaves:
Will it actually reach a desired state?
Will it be stable?
Is it actually controllable at all?
Last note on optimal control: Formulate it as a constrained optimization problem with objective function J^π and constraint ẋ = f(x, u). The λ(t) are the Lagrange multipliers. It turns out that $\frac{\partial V(x, t)}{\partial x} = \lambda(t)$. (See Stengel.)
22/44

Relation to other topics
Optimal Control:
$$\min_\pi J^\pi = \int_0^T c(x(t), u(t))\, dt + \phi(x(T))$$
Inverse Kinematics:
$$\min_q f(q) = \|q - q_0\|^2_W + \|\phi(q) - y^*\|^2_C$$
Optimal operational space control:
$$\min_u f(u) = \|u\|^2_H + \|\ddot\phi(q) - \ddot y^*\|^2_C$$
(control hard constraints could be included)
Trajectory Optimization:
$$\min_{q_{0:T}} f(q_{0:T}) = \sum_{t=0}^{T} \|\Psi_t(q_{t-k}, .., q_t)\|^2 + \sum_{t=0}^{T} \|\Phi_t(q_t)\|^2$$
Reinforcement Learning:
Markov Decision Processes ↔ discrete-time stochastic controlled system $P(x_{t+1} \mid u_t, x_t)$
Bellman equation → basic RL methods (Q-learning, etc.)
23/44

Controllability 24/44

Controllability
As a starting point, consider the claim: "Intelligence means to gain maximal controllability over all degrees of freedom over the environment."
Note: controllability (the ability to control) ≠ control
What does controllability mean exactly?
I think the general idea of controllability is really interesting.
Linear control theory provides one specific definition of controllability, which we introduce next.
25/44

Controllability
Consider a linear controlled system
    ẋ = Ax + Bu
How can we tell from the matrices A and B whether we can control x to eventually reach any desired state?
Example: x is 2-dim, u is 1-dim. Is x controllable?
$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} u$$
Is x controllable?
$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} u$$
26/44

Controllability
We consider a linear stationary (= time-invariant) controlled system
    ẋ = Ax + Bu
Complete controllability: All elements of the state can be brought from arbitrary initial conditions to zero in finite time.
A system is completely controllable iff the controllability matrix
$$C := \begin{pmatrix} B & AB & A^2 B & \cdots & A^{n-1} B \end{pmatrix}$$
has full rank dim(x) (that is, all rows are linearly independent).
Meaning of C: The i-th row describes how the i-th element $x_i$ can be influenced by u:
B: $\dot x_i$ is directly influenced via B
AB: $\ddot x_i$ is "indirectly" influenced via AB (u directly influences some $\dot x_j$ via B; the dynamics A then influence $\ddot x_i$ depending on $\dot x_j$)
A²B: $\dddot x_i$ is "double-indirectly" influenced, etc.
Note: $\ddot x = A \dot x + B \dot u = A^2 x + A B u + B \dot u, \qquad \dddot x = A^3 x + A^2 B u + A B \dot u + B \ddot u$
27/44

Controllability
When all rows of the controllability matrix are linearly independent ⇒ (u, u̇, ü, ...) can influence all elements of x independently
If a row is zero ⇒ this element of x cannot be controlled at all
If 2 rows are linearly dependent ⇒ these two elements of x will remain coupled forever
28/44

Controllability examples
$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \end{pmatrix} u, \qquad
C = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix} \quad \text{rows linearly dependent}$$
$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} u, \qquad
C = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{2nd row zero}$$
$$\begin{pmatrix} \dot x_1 \\ \dot x_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} u, \qquad
C = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{good!}$$
29/44
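A quick numeric check of these three examples (a small sketch assuming numpy is available):

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix C = [B, AB, A^2 B, ..., A^(n-1) B]."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

examples = [
    (np.zeros((2, 2)),               np.array([[1.], [1.]])),   # rows linearly dependent
    (np.zeros((2, 2)),               np.array([[1.], [0.]])),   # 2nd row zero
    (np.array([[0., 1.], [0., 0.]]), np.array([[0.], [1.]])),   # fully controllable
]
for A, B in examples:
    C = ctrb(A, B)
    print(C, "rank =", np.linalg.matrix_rank(C))   # full rank (= 2) only in the last case
```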

Controllability
Why is it important/interesting to analyze controllability?
The Algebraic Riccati Equation will always return an "optimal" controller, but controllability tells us whether such a controller even has a chance to control x.
"Intelligence means to gain maximal controllability over all degrees of freedom over the environment."
... real environments are non-linear
... "to have the ability to gain controllability over the environment's DoFs"
30/44

Stability 31/44

Stability
One of the most central topics in control theory.
Instead of designing a controller by first designing a cost function and then applying Riccati, design a controller such that the desired state is provably a stable equilibrium point of the closed-loop system.
32/44

Stability
Stability is an analysis of the closed-loop system. That is: for this analysis we don't need to distinguish between system and controller explicitly. Both together define the dynamics
    ẋ = f(x, π(x, t)) =: f(x)
The following will therefore discuss stability analysis of general differential equations ẋ = f(x)
What follows:
3 basic definitions of stability
2 basic methods for analysis by Lyapunov
33/44

Aleksandr Lyapunov (1857–1918) 34/44

Stability: 3 definitions
ẋ = f(x) with equilibrium point x = 0
x₀ is an equilibrium point ⟺ f(x₀) = 0
Lyapunov stable or uniformly stable ⟺
    ∀ɛ ∃δ s.t. ‖x(0)‖ ≤ δ ⟹ ‖x(t)‖ ≤ ɛ
(when it starts off δ-near to x₀, it will remain ɛ-near forever)
asymptotically stable ⟺ Lyapunov stable and lim_{t→∞} x(t) = 0
exponentially stable ⟺ asymptotically stable and ∃α, a s.t. ‖x(t)‖ ≤ a e^{−αt} ‖x(0)‖
(⟹ the error time integral ∫₀^∞ ‖x(t)‖ dt ≤ (a/α) ‖x(0)‖ is bounded!)
35/44

Linear Stability Analysis
("Linear" ⟷ "local" for a system linearized at the equilibrium point.)
Given a linear system ẋ = Ax, let λᵢ be the eigenvalues of A
The system is asymptotically stable ⟺ ∀i: real(λᵢ) < 0
The system is unstable if ∃i: real(λᵢ) > 0
The system is marginally stable ⟺ ∀i: real(λᵢ) ≤ 0
Meaning: an eigenvalue describes how the system behaves along one state dimension (along the eigenvector): ẋᵢ = λᵢ xᵢ
As for the 1D point mass, the solution is xᵢ(t) = a e^{λᵢ t}, and
imaginary λᵢ → oscillation
negative real(λᵢ) → exponential decay e^{λᵢ t}
positive real(λᵢ) → exponential explosion e^{λᵢ t}
36/44

Linear Stability Analysis: Example
Let's take the 1D point mass q̈ = u/m in closed loop with a PD controller u = −K_p q − K_d q̇
Dynamics:
$$\dot x = \begin{pmatrix} \dot q \\ \ddot q \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} x + \frac{1}{m} \begin{pmatrix} 0 \\ 1 \end{pmatrix} u
= A x, \qquad
A = \begin{pmatrix} 0 & 1 \\ -K_p/m & -K_d/m \end{pmatrix}$$
Eigenvalues: The equation
$$\lambda \begin{pmatrix} q \\ \dot q \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -K_p/m & -K_d/m \end{pmatrix} \begin{pmatrix} q \\ \dot q \end{pmatrix}$$
leads to the equation $\lambda \dot q = \lambda^2 q = -K_p/m\, q - K_d/m\, \lambda q$, or
$$m \lambda^2 + K_d \lambda + K_p = 0$$
with solution (compare slide 05:10)
$$\lambda = \frac{-K_d \pm \sqrt{K_d^2 - 4 m K_p}}{2m}$$
For $K_d^2 - 4 m K_p$ negative, real(λ) = −K_d/2m
⟹ Positive derivative gain K_d makes the system stable.
37/44
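A numeric sanity check of this example (the gains and mass below are arbitrary illustrative values):

```python
import numpy as np

m, Kp, Kd = 1.0, 10.0, 2.0                      # arbitrary mass and PD gains
A = np.array([[0., 1.], [-Kp / m, -Kd / m]])    # closed-loop system matrix

lam = np.linalg.eigvals(A)
print(lam)                   # here: -1 ± 3j, matching (-Kd ± sqrt(Kd^2 - 4 m Kp)) / (2m)
print(np.all(lam.real < 0))  # True => asymptotically stable (requires Kd > 0)
```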

Side note: Stability for discrete-time systems
Given a discrete-time linear system
    x_{t+1} = A x_t
Let λᵢ be the eigenvalues of A
The system is asymptotically stable ⟺ ∀i: |λᵢ| < 1
The system is unstable if ∃i: |λᵢ| > 1
The system is marginally stable ⟺ ∀i: |λᵢ| ≤ 1
38/44
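The corresponding numeric check is a spectral-radius test (illustrative matrix):

```python
import numpy as np

A = np.array([[0.9, 0.2], [0.0, 0.7]])   # arbitrary example matrix
lam = np.linalg.eigvals(A)
print(np.abs(lam))                       # moduli of the eigenvalues
print(np.all(np.abs(lam) < 1))           # True => x_{t+1} = A x_t is asymptotically stable
```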

Linear Stability Analysis comments
The same type of analysis can be done locally for non-linear systems, as we will do for the cart-pole in the exercises
We can design a controller that minimizes the (negative) eigenvalues of A: → controller with fastest asymptotic convergence
This is a real alternative to optimal control!
39/44

Lyapunov function method
A method to analyze/prove stability for general non-linear systems is the famous "Lyapunov's second method"
Let D be a region around the equilibrium point x₀
A Lyapunov function V(x) for a system dynamics ẋ = f(x) is
positive, V(x) > 0, everywhere in D except at the equilibrium point, where V(x₀) = 0
always decreasing, $\dot V(x) = \frac{\partial V(x)}{\partial x} \dot x < 0$, in D except at the equilibrium point, where f(x) = 0 and therefore V̇(x) = 0
If there exists a D and a Lyapunov function ⟹ the system is asymptotically stable
If D is the whole state space, the system is globally stable
40/44

Lyapunov function method
The Lyapunov function method is very general. V(x) could be anything (energy, cost-to-go, whatever). Whenever one finds some V(x) that decreases, this proves stability.
The problem, though, is to think of some V(x) given a dynamics! (In that sense, the Lyapunov function method is rather a method of proof than a concrete method for stability analysis.)
In standard cases, a good guess for the Lyapunov function is either the energy or the value function.
41/44

Lyapunov function method: Energy Example
Let's take the 1D point mass q̈ = u/m in closed loop with a PD controller u = −K_p q − K_d q̇, which has the solution (slide 05:14):
$$q(t) = b\, e^{-\xi/\lambda\, t}\, e^{i \omega_0 \sqrt{1 - \xi^2}\, t}$$
Energy of the 1D point mass: $V(q, \dot q) := \tfrac{1}{2} m \dot q^2$
$$V(x(t)) = e^{-2 \xi/\lambda\, t}\, V(x(0))$$
(using that the energy of an undamped oscillator is conserved)
$$\dot V(x) < 0 \iff \xi > 0 \iff K_d > 0$$
⟹ Same result as for the eigenvalue analysis
42/44
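A small numeric illustration of the same point (a sketch, not the slide's derivation): taking the total closed-loop energy ½K_p q² + ½m q̇² as V, whose time derivative is −K_d q̇², the simulation below shows that it is dissipated exactly when K_d > 0.

```python
def final_energy(Kd, m=1.0, Kp=10.0, dt=1e-4, T=5.0):
    """Simulate m qdd = -Kp q - Kd qd and return V = 1/2 Kp q^2 + 1/2 m qd^2 at time T."""
    q, qd = 1.0, 0.0                    # start displaced, at rest (V(0) = 5.0 here)
    for _ in range(int(T / dt)):
        qdd = (-Kp * q - Kd * qd) / m   # closed-loop dynamics
        qd += dt * qdd                  # semi-implicit Euler step
        q += dt * qd
    return 0.5 * Kp * q**2 + 0.5 * m * qd**2

print(final_energy(Kd=2.0))   # ≈ 0: energy dissipated, asymptotically stable
print(final_energy(Kd=0.0))   # ≈ 5.0: undamped, energy conserved (only marginally stable)
```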

Lyapunov function method: value function Example
Consider infinite-horizon linear-quadratic optimal control. The solution of the Algebraic Riccati equation gives the optimal controller. The value function satisfies
$$V(x) = x^\top P x$$
$$\dot V(x) = [Ax + Bu^*]^\top P x + x^\top P\, [Ax + Bu^*], \qquad u^* = -R^{-1} B^\top P\, x =: K x$$
$$\dot V(x) = x^\top \left[ (A + BK)^\top P + P (A + BK) \right] x
= x^\top \left[ A^\top P + P A + (BK)^\top P + P (BK) \right] x$$
Using $0 = A^\top P + P A - P B R^{-1} B^\top P + Q$:
$$\dot V(x) = x^\top \left[ P B R^{-1} B^\top P - Q + (P B K)^\top + P B K \right] x
= -x^\top \left[ Q + K^\top R K \right] x$$
(We could have derived this more easily: $x^\top Q x$ are the immediate state costs, $x^\top K^\top R K x = u^\top R u$ are the immediate control costs, and $\dot V(x) = -c(x, u^*)$! See slide 11 bottom.)
That is: V is a Lyapunov function if $Q + K^\top R K$ is positive definite!
43/44
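A numeric verification of this argument on a double integrator (an arbitrary test system, assuming scipy is available):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0., 1.], [0., 0.]])
B = np.array([[0.], [1.]])
Q, R = np.eye(2), np.eye(1)

P = solve_continuous_are(A, B, Q, R)
K = -np.linalg.solve(R, B.T @ P)                 # u* = K x with K = -R^-1 B' P
lhs = (A + B @ K).T @ P + P @ (A + B @ K)        # matrix of Vdot along the closed loop
rhs = -(Q + K.T @ R @ K)
print(np.allclose(lhs, rhs))                             # True: Vdot = -x'(Q + K'RK)x
print(np.all(np.linalg.eigvalsh(Q + K.T @ R @ K) > 0))   # positive definite => V is a Lyapunov function
```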

Observability & Adaptive Control
When some state dimensions are not directly observable: analyze higher-order derivatives to infer them. This is very closely related to controllability: just as the controllability matrix tells whether state dimensions can (indirectly) be controlled, an observability matrix tells whether state dimensions can (indirectly) be inferred.
Adaptive Control: When the system dynamics ẋ = f(x, u, β) has unknown parameters β. One approach is to estimate β from the data so far and use optimal control. Another is to design a controller that has an additional internal update equation for an estimate β̂ and is provably stable. (See Schaal's lecture, for instance.)
44/44
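For completeness, a sketch of the observability counterpart (the standard linear-systems definition with measurements y = Cx; the slides only allude to it):

```python
import numpy as np

def obsv(A, C):
    """Observability matrix O = [C; CA; ...; CA^(n-1)]; full rank iff the state can be inferred."""
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

A = np.array([[0., 1.], [0., 0.]])   # double integrator
C = np.array([[1., 0.]])             # we measure only the position q
O = obsv(A, C)
print(O, "rank =", np.linalg.matrix_rank(O))   # full rank: the velocity can be inferred from q over time
```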