Dynamic programming using radial basis functions and Shepard approximations

Oliver Junge, Alex Schreiber
Fakultät für Mathematik, Technische Universität München

Workshop on Algorithms for Dynamical Systems and Lyapunov Functions, Reykjavik, July 2013

Problem

discrete-time control system
    x_{k+1} = f(x_k, u_k), k = 0, 1, 2, ...,
with f : Ω × U → Ω continuous, Ω ⊂ R^d and U ⊂ R^m compact,
target set T ⊂ Ω compact.

Goal: construct a feedback F : S → U, S ⊂ Ω, such that for the closed-loop system
    x_{k+1} = f(x_k, F(x_k)), x_k ∈ S,
the target T is asymptotically stable.

Optimal control

cost function c : Ω × U → [0, ∞) continuous, with c(x, u) ≥ δ > 0 for x ∉ T and any u ∈ U.

accumulated cost
    J(x_0, (u_k)_k) = Σ_{k=0}^∞ c(x_k, u_k),
with the trajectory (x_k)_k associated to x_0 ∈ Ω and (u_k)_k ∈ U^ℕ.

optimal value function
    V(x) = inf_{(u_k)_k} J(x, (u_k)_k)

The Bellman equation

V fulfills the Bellman equation
    V(x) = inf_{u∈U} { c(x, u) + V(f(x, u)) } =: L[V](x)
with boundary condition V(T) = 0.

optimal feedback
    F(x) = argmin_{u∈U} { c(x, u) + V(f(x, u)) }
(whenever the min exists)

Numerical treatment

assume V ∈ F
approximation space A ⊂ F, dim(A) < ∞
projection Π : F → A
discretized Bellman operator Π ∘ L : A → A

value iteration: choose V^(0) ∈ A with V^(0)(T) = 0,
    V^(n+1) := Π L[V^(n)], n = 0, 1, ...

typical A: finite differences, finite elements (order p)
problem: dim(A) = O(n^d) for error O(n^{−p})

Nonlinear approximation

Theorem [Girosi, Anzellotti '92]
If f ∈ H^{s,2}(R^d), s > d/2, we can find n coefficients c_i ∈ R, n centers x_i ∈ R^d, and n variances σ_i > 0 such that
    ‖ f − Σ_{i=1}^n c_i e^{−‖x − x_i‖² / (2σ_i²)} ‖₂² = O(n^{−1}).

Scattered data interpolation

Problem: Given sites X = {x_1, ..., x_N} ⊂ Ω ⊂ R^d and data f_1, ..., f_N ∈ R, find a function a ∈ A such that
    a(x_i) = f_i, i = 1, ..., N.

For A = span{a_1, ..., a_N} we get the linear system
    A c = f, with A_ij = a_j(x_i).

Radial basis functions

radial basis functions a(·, x_j) = ϕ(‖· − x_j‖₂)
examples:
    Gaussian: ϕ(r) = exp(−r²)
    Wendland function: ϕ(r) = (1 − r)₊⁴ (4r + 1)
scaling: a_j = a_j^ε = ϕ(ε ‖· − x_j‖₂)

[Figure: the scaled Gaussian basis function on [−1, 1] for shape parameters ε = 5 and ε = 1.]
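To make the interpolation system concrete, here is a minimal sketch (not from the talk; function names and test data are mine) that assembles A_ij = a_j(x_i) for the two kernels above and solves A c = f:

```python
import numpy as np

def gaussian(r):
    return np.exp(-r**2)

def wendland(r):
    # Wendland function (1 - r)_+^4 (4r + 1), compactly supported on [0, 1]
    return np.maximum(1.0 - r, 0.0)**4 * (4.0*r + 1.0)

def rbf_interpolant(X, f, phi, eps):
    """Interpolant s with s(x_i) = f_i for sites X (an N x d array)."""
    R = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    A = phi(eps * R)              # interpolation matrix A_ij = phi(eps ||x_i - x_j||)
    c = np.linalg.solve(A, f)     # coefficients of the interpolant
    return lambda x: phi(eps * np.linalg.norm(X - x, axis=-1)) @ c

# usage: interpolate f(x) = sin(pi x) on 20 random sites in [0, 1]
X = np.sort(np.random.rand(20))[:, None]
f = np.sin(np.pi * X[:, 0])
s = rbf_interpolant(X, f, gaussian, eps=5.0)
print(abs(s(np.array([0.5])) - np.sin(np.pi * 0.5)))  # small interpolation error
```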

The Kruzkov transform

problem: V(x) is increasing, but ϕ(x) is decreasing as ‖x‖ grows

Kruzkov transform: V ↦ V̂ = e^{−V(·)}

Kruzkov-Bellman equation
    V̂(x) = sup_{u∈U} { e^{−c(x,u)} V̂(f(x, u)) } =: L̂[V̂](x), x ∈ Ω \ T
with boundary condition V̂(T) = 1.

Under the assumption c(x, u) ≥ δ > 0 for x ∉ T, the Kruzkov-Bellman operator L̂ is a contraction on L^∞.
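Where the contraction constant comes from (a one-line check, not spelled out on the slide): for x ∉ T and any V̂, Ŵ ∈ L^∞,

    |L̂[V̂](x) − L̂[Ŵ](x)| ≤ sup_{u∈U} e^{−c(x,u)} |V̂(f(x,u)) − Ŵ(f(x,u))| ≤ e^{−δ} ‖V̂ − Ŵ‖_∞,

so L̂ contracts with constant K = e^{−δ} < 1.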

Dynamic programming using radial basis functions

approximation space A = A_{X,ε} = span{ ϕ(ε ‖· − x‖₂) : x ∈ X }
interpolation operator on X: Π : F → A
discretized Kruzkov-Bellman operator Π ∘ L̂ : A → A

value iteration: choose V̂^(0) ∈ A with V̂^(0)(0) = 1,
    V̂^(n+1) := Π L̂[V̂^(n)], n = 0, 1, ...

Example

x_{k+1} = (1 + a u_k) x_k, x_k ∈ X = [0, 1], u_k ∈ U = [−1, 1], with a ∈ (0, 1) fixed.
cost: g(x, u) = a x.

optimal control sequence: (u_k)_k = (−1, −1, ...)
value function: V(x) = x
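A minimal numerical check of this example (my own sketch; it uses a uniform grid with linear interpolation as the projection rather than an RBF interpolant, and treats states leaving [0, 1] as having V̂ = 0):

```python
import numpy as np

a = 0.5
X = np.linspace(0.0, 1.0, 101)      # grid on Omega = [0, 1], target T = {0}
U = np.linspace(-1.0, 1.0, 21)      # control samples in U = [-1, 1]
Vhat = np.ones_like(X)              # initial iterate, identically 1

for _ in range(500):
    # Kruzkov-Bellman operator: L[V](x) = sup_u exp(-c(x,u)) V(f(x,u))
    best = np.zeros_like(X)
    for u in U:
        xn = (1.0 + a * u) * X                       # image points f(x, u)
        val = np.where(xn <= 1.0,                    # leaving [0, 1]: value 0
                       np.exp(-a * X) * np.interp(xn, X, Vhat), 0.0)
        best = np.maximum(best, val)
    Vhat = best

V = -np.log(Vhat)                      # undo the Kruzkov transform
print(np.max(np.abs(V[1:] - X[1:])))   # close to 0: matches V(x) = x
```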

Example

[Figure: computed value function approximation for n = 3 centers, plotted as V over x ∈ [0, 1].]

Example: inverted pendulum

[Figure: pendulum with angle ϕ, mass m, and control u.]

state: x = (ϕ, ϕ̇)
system: f(x, u) = Φ^T(x, u), the time-T map of the controlled pendulum dynamics
cost: g(x, u) = ∫₀^T ( q₁ ϕ²(t) + q₂ ϕ̇²(t) ) dt + T q₃ u²

Example

[Figure: computed value function approximation for the inverted pendulum, n = 150² centers.]

Weighted least squares

Problem: Given sites X = {x_1, ..., x_N} ⊂ Ω ⊂ R^d, data f_1, ..., f_N ∈ R, an approximation space A = span{a_1, ..., a_m}, m < N, and a weight function w : Ω → R with associated scalar product
    ⟨f, g⟩_w := Σ_{k=1}^N f(x_k) g(x_k) w(x_k)
and induced norm ‖·‖_w, find a function a ∈ A such that
    ‖f − a‖_w = min!

Optimal coefficient vector c: solve G c = f_A, with Gram matrix G = (⟨a_i, a_j⟩_w)_{ij} and f_A = (⟨f, a_j⟩_w)_j.
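A minimal sketch of the Gram system (my own; basis, weights, and data are illustrative): fit a quadratic, A = span{1, x, x²} with m = 3 < N, to N = 30 weighted samples:

```python
import numpy as np

X = np.linspace(0, 1, 30)                    # sites x_k
f = np.sin(np.pi * X)                        # data f_k
w = np.ones_like(X)                          # weight function values w(x_k)

basis = [lambda x: np.ones_like(x),          # A = span{1, x, x^2}
         lambda x: x,
         lambda x: x**2]
B = np.column_stack([a(X) for a in basis])   # B_kj = a_j(x_k)

G = B.T @ (w[:, None] * B)                   # Gram matrix <a_i, a_j>_w
fA = B.T @ (w * f)                           # right-hand side <f, a_j>_w
c = np.linalg.solve(G, fA)                   # optimal coefficient vector
print(c)                                     # best quadratic fit to sin(pi x)
```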

Moving least squares

Idea: In computing an approximation to the function f : Ω → R at x ∈ Ω, only the values at sites x_j ∈ X close to x should play a role.

moving weight function w : Ω × Ω → R, with w(x, y) small for ‖x − y‖₂ large
inner product: ⟨f, g⟩_{w(·,x)} := Σ_{k=1}^N f(x_k) g(x_k) w(x_k, x)

the moving least squares approximation a of the data f is a(x) = a_x(x), where a_x ∈ A minimizes ‖f − a_x‖_{w(·,x)}; it is given by solving the Gram system G_x c_x = f_A^x.

Shepard's method

D. Shepard, A two-dimensional interpolation function for irregularly-spaced data, Proc. 23rd Nat. Conf. ACM, 1968.

simply choose A = span{1}
Gram matrix: G_x = ⟨1, 1⟩_{w(·,x)} = Σ_{i=1}^N w(x_i, x)
right-hand side: f_A^x = ⟨f, 1⟩_{w(·,x)} = Σ_{i=1}^N f(x_i) w(x_i, x)

thus
    c_x = f_A^x / G_x = Σ_{i=1}^N a_i(x) f(x_i), where a_i(x) := w(x_i, x) / Σ_{j=1}^N w(x_j, x),

and so the Shepard approximant is
    S f(x) = c_x · 1 = Σ_{i=1}^N f(x_i) a_i(x).

advantage: the Shepard approximation requires no solution of a linear system.
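A minimal sketch of the Shepard approximant (names are mine; the weight is the scaled Wendland function from earlier, so each a_i is compactly supported and the scaling must be coarse enough that every evaluation point sees at least one site):

```python
import numpy as np

def wendland(r):
    return np.maximum(1.0 - r, 0.0)**4 * (4.0*r + 1.0)

def shepard(X, f, eps):
    """Shepard approximant Sf(x) = sum_i f(x_i) a_i(x) of data f at sites X."""
    def Sf(x):
        w = wendland(eps * np.linalg.norm(X - x, axis=-1))  # weights w(x_i, x)
        return (w @ f) / np.sum(w)   # convex combination -- no linear solve
    return Sf

# usage: approximate f(x) = sin(pi x) from 50 random sites in [0, 1]
X = np.random.rand(50)[:, None]
f = np.sin(np.pi * X[:, 0])
Sf = shepard(X, f, eps=5.0)
print(Sf(np.array([0.5])), np.sin(np.pi * 0.5))   # approximation, not interpolation
```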

Shepard discretization of the Bellman equation

approximation space
    A = span{ w(x_i, ·) / Σ_{j=1}^N w(x_j, ·) : x_i ∈ X }

Shepard approximation operator S : F → A
discretized Kruzkov-Bellman operator S ∘ L̂ : A → A
value iteration as usual
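Putting the pieces together for the scalar example from above (my own sketch, same assumptions as before: states leaving [0, 1] are assigned V̂ = 0, and the boundary condition V̂ = 1 is enforced at the target site x = 0). Since S f only reads the values f(x_i), the iterate can be represented by its vector of site values, and S ∘ L̂ becomes a max over precomputed row-stochastic matrices:

```python
import numpy as np

def wendland(r):
    return np.maximum(1.0 - r, 0.0)**4 * (4.0*r + 1.0)

a, eps = 0.5, 20.0
X = np.linspace(0.0, 1.0, 101)          # sites x_i, target T = {0}
U = np.linspace(-1.0, 1.0, 21)          # control samples

# Precompute the Shepard weights at the image points f(x_i, u), so that
# S V(f(x_i, u)) = sum_j a_j(f(x_i, u)) v_j for the site values v_j = V(x_j).
M = {}
for u in U:
    xn = (1.0 + a * u) * X              # image points f(x_i, u)
    W = wendland(eps * np.abs(xn[:, None] - X[None, :]))
    W[xn > 1.0] = 0.0                   # leaving the domain: value 0
    s = W.sum(axis=1, keepdims=True)
    M[u] = np.divide(W, s, out=np.zeros_like(W), where=s > 0)

v = np.ones_like(X)                     # initial iterate, identically 1
for _ in range(500):
    v = np.max([np.exp(-a * X) * (M[u] @ v) for u in U], axis=0)
    v[0] = 1.0                          # boundary condition at the target

print(np.max(np.abs(-np.log(v[1:]) - X[1:])))   # close to 0 up to O(h): V(x) = x
```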

Convergence of the value iteration

f ↦ S f is linear, and for each x ∈ Ω, S f(x) is a convex combination of the values f(x_1), ..., f(x_N); therefore the Shepard operator S : (L^∞, ‖·‖_∞) → (A, ‖·‖_∞) has norm 1. Since L̂ is a contraction and ‖S‖ = 1, the composition S ∘ L̂ is again a contraction, and the Banach fixed point theorem yields:

Lemma
Value iteration with the discretized Kruzkov-Bellman operator S ∘ L̂ : (A, ‖·‖_∞) → (A, ‖·‖_∞) converges to the unique fixed point of S ∘ L̂.

Example

[Figure: approximate value function for the inverted pendulum on n = 10⁴ nodes, plotted over (ϕ, ϕ̇).]

Example

[Figure: value iteration history for the inverted pendulum; semi-logarithmic plot of the relative error ‖v − v_k‖ / ‖v‖ over the iteration index k, decaying roughly like C · 0.75^k.]

Example

[Figure: residual of the computed value function for the inverted pendulum, plotted over (ϕ, ϕ̇).]

Convergence for fill distance → 0

fill distance of X in Ω:
    h = h(X, Ω) = sup_{x∈Ω} min_{x_j∈X} ‖x − x_j‖₂

If f : Ω → R is Lipschitz continuous with constant L, then
    ‖f − S f‖_∞ ≤ C L h
for some constant C > 0.
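A minimal sketch (names mine) that estimates the fill distance of a node set in Ω = [0, 1]² by maximizing the nearest-site distance over a fine test grid:

```python
import numpy as np

def fill_distance(X, test_points):
    # h(X, Omega) ~ sup over test points of the distance to the nearest site
    d = np.linalg.norm(test_points[:, None, :] - X[None, :, :], axis=-1)
    return d.min(axis=1).max()

X = np.random.rand(200, 2)                               # scattered sites
g = np.linspace(0, 1, 50)
G = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)  # test grid in [0, 1]^2
print(fill_distance(X, G))
```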

Convergence for fill distance → 0

sequence (X_n)_n of node sets, X_n ⊂ Ω, with fill distances h_n, Shepard operators S_n,
K < 1 contraction constant of L̂, V̂ fixed point of L̂, V̂_n fixed point of S_n ∘ L̂

Theorem
If V̂ is Lipschitz continuous with constant L, then
    ‖V̂ − V̂_n‖_∞ ≤ (C L / (1 − K)) h_n.

Conclusion and future work

multi-level / greedy construction of sites
complexity in dependence of d ("low-discrepancy sites")
approximate / relaxed dynamic programming