Reading group: Calculus of Variations and Optimal Control Theory by Daniel Liberzon

Reading group: Calculus of Variations and Optimal Control Theory by Daniel Liberzon. 16th March 2017. 1 / 30

Contents: 1. Motivating examples; 2. Recall on finite-dimensional optimization (existence of a global minimum, first- and second-order conditions, Lagrange multipliers); 3. Infinite-dimensional optimization; 4. Exercises. 2 / 30

Dido's isoperimetric problem. How to build the biggest city with the hide of an ox? This can be formulated as a maximization problem:
$$\max_{y \in C^1} \int_a^b y(x)\,dx \quad \text{subject to} \quad y(a) = y(b) = 0, \qquad \int_a^b \sqrt{1 + (y'(x))^2}\,dx = C_0.$$
4 / 30

Catenary problem. What equation describes a chain hanging between two fixed points?
$$\min_{y \in C^1} \int_a^b y(x)\sqrt{1 + (y'(x))^2}\,dx \quad \text{subject to} \quad y(a) = y_a, \quad y(b) = y_b, \qquad \int_a^b \sqrt{1 + (y'(x))^2}\,dx = C_0.$$
5 / 30

Brachistochrone problem. What path between two points is such that a particle sliding without friction along it takes the shortest time?
$$\min_{y \in C^1} \int_a^b \frac{\sqrt{1 + (y'(x))^2}}{\sqrt{y(x)}}\,dx \quad \text{subject to} \quad y(a) = y_a, \quad y(b) = y_b.$$
6 / 30
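
As a small numerical aside (not from the slides), we can evaluate the brachistochrone functional for two candidate curves and see that the straight line is not optimal. The endpoints (0, 0) and (1, 1), the downward orientation of y, the dropped gravity constant, and the helper travel_time are all illustrative assumptions.

```python
# Numerical sketch (illustrative assumptions, not from the book): evaluate
# T[y] = \int_0^1 sqrt(1 + y'(x)^2) / sqrt(y(x)) dx for two curves joining
# (0, 0) to (1, 1), with y measured downward and the gravity constant dropped.
import numpy as np
from scipy.integrate import quad

def travel_time(y, dy):
    # The integrand has an integrable singularity at x = 0 since y(0) = 0.
    integrand = lambda x: np.sqrt(1.0 + dy(x) ** 2) / np.sqrt(y(x))
    value, _ = quad(integrand, 0.0, 1.0)
    return value

t_line = travel_time(lambda x: x, lambda x: 1.0)                    # y = x
t_parabola = travel_time(lambda x: x * (2.0 - x), lambda x: 2.0 - 2.0 * x)
print(f"straight line: {t_line:.4f}, parabola: {t_parabola:.4f}")
# The parabola should give the smaller value, showing that the straight
# line is not the minimizing curve.
```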

Existence of a minimum. We are usually interested not so much in local minima as in global minima. How can we be sure that they exist?
Theorem (Weierstrass): If f is a continuous function and D is a compact set, then f attains a global minimum over D.
A compact set D is defined as a set for which every open cover has a finite subcover. Equivalently, D is compact if every sequence in D has a subsequence converging in D. For a subset D of R^n, compactness is easy to characterize: D must be closed and bounded. 8 / 30
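
A minimal numerical sketch (not from the slides) of why compactness matters, assuming scipy is available: f(x) = exp(−x) is continuous, so over the compact interval [0, 5] it attains its minimum (at the right endpoint), while over the non-compact set [0, ∞) its infimum 0 is never attained.

```python
# Minimal sketch: Weierstrass' theorem on a compact interval.
import numpy as np
from scipy.optimize import minimize_scalar

f = lambda x: np.exp(-x)

res = minimize_scalar(f, bounds=(0.0, 5.0), method="bounded")
print(f"minimizer over [0, 5]: x* = {res.x:.4f}, f(x*) = {res.fun:.6f}")
# x* is (numerically) the endpoint x = 5. Over [0, infinity) no minimizer
# exists: the infimum 0 is approached but never attained.
```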

First-order conditions. Consider a C^1 function f : R^n → R, with D = dom f assumed open and non-empty. Denote by x* a local minimum of f on D (which must be an interior point). Fix an arbitrary direction d ∈ R^n; for α small enough, x* + αd ∈ D. Let us consider g(α) = f(x* + αd). By construction, g is C^1, so:
$$g(\alpha) = g(0) + \alpha g'(0) + o(\alpha).$$
9 / 30

First-order conditions.
$$g(\alpha) = g(0) + \alpha g'(0) + o(\alpha)$$
We can easily prove that g'(0) = 0. Suppose instead that g'(0) ≠ 0. Then, by the definition of o(α), there exists ε > 0 such that
$$|\alpha| < \varepsilon,\ \alpha \neq 0 \implies |o(\alpha)| < |g'(0)\alpha|,$$
and so
$$g(\alpha) - g(0) < \alpha g'(0) + |g'(0)\alpha|.$$
Restrict α to have the opposite sign to g'(0); the right-hand side is then zero, so g(α) − g(0) < 0, i.e.
$$f(x^* + \alpha d) < f(x^*),$$
which contradicts the fact that x* is a local minimum. So g'(0) = 0. 10 / 30

First-order conditions. A simple application of the chain rule gives:
$$g'(0) = \nabla f(x^*) \cdot d.$$
But since d is arbitrary, we conclude that:
$$\nabla f(x^*) = 0.$$
11 / 30
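
A quick numerical check (not from the slides) of the first-order condition, for a hypothetical smooth function of two variables: at the unconstrained minimizer found by scipy, the gradient is numerically zero.

```python
# Minimal sketch: verify grad f(x*) = 0 at an unconstrained minimizer of
# the example function f(x1, x2) = (x1 - 1)^2 + 2*(x2 + 0.5)^2 + x1*x2.
import numpy as np
from scipy.optimize import minimize

def f(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2 + x[0] * x[1]

def grad_f(x):
    return np.array([2.0 * (x[0] - 1.0) + x[1],
                     4.0 * (x[1] + 0.5) + x[0]])

x_star = minimize(f, x0=np.zeros(2), jac=grad_f).x
print("x* =", x_star)
print("grad f(x*) =", grad_f(x_star))   # numerically zero, as required
```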

Second-order conditions. Suppose now that f is C^2. As before:
$$g(\alpha) = g(0) + \alpha g'(0) + \tfrac{1}{2}\, g''(0)\,\alpha^2 + o(\alpha^2)$$
We can easily prove that g''(0) ≥ 0. As for the first-order condition, suppose g''(0) < 0. Then there exists ε > 0 such that
$$|\alpha| < \varepsilon,\ \alpha \neq 0 \implies |o(\alpha^2)| < \tfrac{1}{2}\, |g''(0)|\,\alpha^2,$$
and, recalling that g'(0) = 0, we prove exactly as before that g(α) − g(0) < 0, thus contradicting the fact that x* is a local minimum. 12 / 30

Second-order conditions. Once again, using the chain rule:
$$g''(0) = d^\top \nabla^2 f(x^*)\, d \ge 0,$$
and since d is arbitrary, ∇²f(x*) must be positive semidefinite. 13 / 30
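
The second-order condition can be checked the same way (again an illustrative sketch): the Hessian of the example function from the previous sketch is constant, and its eigenvalues are positive, so it is positive semidefinite at the minimizer.

```python
# Minimal sketch: the Hessian of f(x1, x2) = (x1 - 1)^2 + 2*(x2 + 0.5)^2 + x1*x2
# is constant; its eigenvalues are nonnegative, so it is positive semidefinite.
import numpy as np

hessian = np.array([[2.0, 1.0],
                    [1.0, 4.0]])

eigenvalues = np.linalg.eigvalsh(hessian)
print("eigenvalues:", eigenvalues)
print("positive semidefinite:", bool(np.all(eigenvalues >= 0)))
```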

Border conditions. We have supposed so far that D is an open set. What if D is closed and x* ∈ ∂D? Then there will be some directions d such that x* + αd ∉ D, no matter how small α is.
Definition (feasible direction): A feasible direction at x* is a vector d such that x* + αd ∈ D for all small enough α > 0. 14 / 30

Border conditions. We can actually prove quite easily that the first- and second-order conditions can be replaced by the following: if x* is a local minimum of f on D, then
- if f is C^1, then ∇f(x*) · d ≥ 0 for every feasible direction d;
- if f is C^2, then d^⊤ ∇²f(x*) d ≥ 0 for every feasible direction d satisfying ∇f(x*) · d = 0.
Also, if D is convex, then every point of D can be written as x* + αd with d a feasible direction, so this approach is clearly suitable for this kind of set. 15 / 30
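
A tiny sketch (not from the slides) of the border condition on a convex set, with an assumed example: f(x) = x_1 + x_2 on D = [0, 1]^2 has its minimum at the corner x* = (0, 0), where ∇f(x*) = (1, 1) is not zero, yet ∇f(x*) · d ≥ 0 for every feasible direction d.

```python
# Minimal sketch: at x* = (0, 0), the feasible directions into D = [0, 1]^2
# are exactly the vectors with nonnegative components, and grad f(x*) = (1, 1),
# so grad f(x*) . d >= 0 for all of them even though the gradient is nonzero.
import numpy as np

grad_at_xstar = np.array([1.0, 1.0])

rng = np.random.default_rng(0)
feasible_dirs = rng.random((1000, 2))        # random d with d1, d2 >= 0
dots = feasible_dirs @ grad_at_xstar
print("min of grad f(x*) . d over sampled feasible directions:", dots.min())
```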

Lagrange multipliers. We now want to minimize a function f on a set D defined by equality constraints:
$$h_1(x) = \dots = h_m(x) = 0.$$
Definition (regular point): x* is called a regular point if the gradients ∇h_i(x*), i = 1, ..., m, are linearly independent.
As we discussed before, the feasible-direction approach is not necessarily suitable for the constrained case, so here we would rather use feasible curves. 16 / 30

Lagrange multipliers. We consider a curve x(α) ∈ D, supposed C^1, with x(0) = x*. Given an arbitrary curve of this kind, we can consider
$$g(\alpha) = f(x(\alpha)).$$
For α = 0, we can prove:
$$g'(0) = \nabla f(x^*) \cdot x'(0) = 0.$$
The vector x'(0) is a tangent vector to D at x*. It lives in the tangent space to D at x*, which is denoted by T_{x*}D. 17 / 30

Lagrange multipliers. From this, we can prove that if x* is a (regular) local minimum of f on D, then there exists λ* ∈ R^m such that:
$$\nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla h_i(x^*) = 0.$$
This is a genuinely geometric point of view, but we will prove the result in a more algebraic way. 18 / 30

Lagrange multipliers. Suppose that m = 1. Given two arbitrary vectors d_1, d_2, consider:
$$F : (\alpha_1, \alpha_2) \mapsto \big(f(x^* + \alpha_1 d_1 + \alpha_2 d_2),\ h(x^* + \alpha_1 d_1 + \alpha_2 d_2)\big).$$
The Jacobian matrix of F at (0, 0) is:
$$J_F(0, 0) = \begin{pmatrix} \nabla f(x^*) \cdot d_1 & \nabla f(x^*) \cdot d_2 \\ \nabla h(x^*) \cdot d_1 & \nabla h(x^*) \cdot d_2 \end{pmatrix}.$$
Theorem (Inverse Function Theorem): If the total derivative of a continuously differentiable function F defined from an open set of R^n into R^n is invertible at a point p (i.e., the Jacobian determinant of F at p is non-zero), then F is an invertible function near p. Moreover, the inverse function F^{-1} is also continuously differentiable. 19 / 30

Lagrange multipliers. If J_F(0, 0) were nonsingular, then by the Inverse Function Theorem there would be neighbourhoods of (0, 0) and of F(0, 0) = (f(x*), 0) between which F is a bijection. This would imply that there is a point x = x* + α_1 d_1 + α_2 d_2, with (α_1, α_2) close to (0, 0), such that h(x) = 0 and f(x) < f(x*). This cannot be true since x* is a local minimum. So J_F(0, 0) must be singular. 20 / 30

Lagrange multipliers.
$$J_F(0, 0) = \begin{pmatrix} \nabla f(x^*) \cdot d_1 & \nabla f(x^*) \cdot d_2 \\ \nabla h(x^*) \cdot d_1 & \nabla h(x^*) \cdot d_2 \end{pmatrix}$$
We suppose that x* is a regular point; for m = 1 this means ∇h(x*) ≠ 0. Choose d_1 such that ∇h(x*) · d_1 ≠ 0, and let
$$\lambda^* := -\,\frac{\nabla f(x^*) \cdot d_1}{\nabla h(x^*) \cdot d_1}.$$
Since J_F(0, 0) is singular, its first row must be a constant multiple of its second row, and that multiple is −λ*. Thus:
$$\nabla f(x^*) \cdot d_2 + \lambda^* \nabla h(x^*) \cdot d_2 = 0 \qquad \forall d_2 \in \mathbb{R}^n.$$
Since d_2 is arbitrary, we obtain the well-known equation:
$$\nabla f(x^*) + \lambda^* \nabla h(x^*) = 0.$$
21 / 30
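
As a sanity check (not an example from the book), the Lagrange condition can be solved symbolically for a small assumed problem: minimize x_1 + x_2 subject to x_1^2 + x_2^2 = 2, whose minimum is at (−1, −1) with λ* = 1/2.

```python
# Minimal sketch: solve grad f(x) + lambda * grad h(x) = 0, h(x) = 0
# symbolically for  min x1 + x2  s.t.  x1^2 + x2^2 = 2.
import sympy as sp

x1, x2, lam = sp.symbols("x1 x2 lambda", real=True)
f = x1 + x2
h = x1 ** 2 + x2 ** 2 - 2

grad_f = [sp.diff(f, v) for v in (x1, x2)]
grad_h = [sp.diff(h, v) for v in (x1, x2)]

equations = [gf + lam * gh for gf, gh in zip(grad_f, grad_h)] + [h]
solutions = sp.solve(equations, [x1, x2, lam], dict=True)
print(solutions)
# Two stationary points: (-1, -1) with lambda = 1/2 (the minimum of f on the
# circle) and (1, 1) with lambda = -1/2 (the maximum).
```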

Can we use this in an infinite-dimensional framework? Consider now J : V → R, where V is an infinite-dimensional vector space (usually a function space). J is called a functional. The main difference with the finite-dimensional case: there is no "generic" space. There can be several spaces V on which the problem is well defined, and choosing V may be part of the problem. 23 / 30

Can we use this in an infinite-dimensional framework? In the finite-dimensional case, all norms are equivalent. This is not the case for function spaces. On C^0([a, b], R^n), we can use
$$\|y\|_0 = \max_{a \le x \le b} |y(x)|,$$
while on C^1([a, b], R^n) we can use
$$\|y\|_1 = \max_{a \le x \le b} |y(x)| + \max_{a \le x \le b} |y'(x)|.$$
There may be other norms (L^p, ...). Once a norm is chosen, we can clearly define a local optimum of the functional J over V. 24 / 30
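
A small numerical illustration (not from the slides) of the non-equivalence of these norms, using the assumed sequence y_k(x) = sin(kπx)/k on [0, 1]: its C^0 norm goes to 0 while its C^1 norm stays close to π.

```python
# Minimal sketch: for y_k(x) = sin(k*pi*x)/k on [0, 1], the C^0 norm shrinks
# like 1/k while the C^1 norm tends to pi (since max |y_k'| = pi).
import numpy as np

x = np.linspace(0.0, 1.0, 20001)

for k in (1, 10, 100):
    y = np.sin(k * np.pi * x) / k
    dy = np.pi * np.cos(k * np.pi * x)     # exact derivative of y_k
    norm_c0 = np.max(np.abs(y))
    norm_c1 = norm_c0 + np.max(np.abs(dy))
    print(f"k = {k:3d}:  ||y||_0 = {norm_c0:.4f},  ||y||_1 = {norm_c1:.4f}")
# Whether a perturbation counts as "small" therefore depends on the norm.
```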

First variation. Also, what do we call a derivative of a functional?
Definition (first variation): For a function y ∈ V, the first variation of J at y is the linear functional δJ|_y : V → R satisfying, for all η and all α:
$$J(y + \alpha\eta) = J(y) + \delta J|_y(\eta)\,\alpha + o(\alpha).$$
25 / 30
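
For a concrete functional, the first variation can be obtained by expanding J(y + αη) in α. A sketch with sympy, for the assumed example J(y) = ∫_0^1 (y'(x))^2 dx, recovers δJ|_y(η) = ∫_0^1 2 y'(x) η'(x) dx.

```python
# Minimal sketch: first variation of J(y) = \int_0^1 (y'(x))^2 dx, computed by
# differentiating the perturbed integrand with respect to alpha at alpha = 0.
import sympy as sp

x, alpha = sp.symbols("x alpha")
y = sp.Function("y")
eta = sp.Function("eta")

integrand = sp.diff(y(x) + alpha * eta(x), x) ** 2      # (y' + alpha*eta')^2
first_var_integrand = sp.diff(integrand, alpha).subs(alpha, 0)
print(sp.simplify(first_var_integrand))
# Output: 2*Derivative(eta(x), x)*Derivative(y(x), x), i.e. the first variation
# is delta J_y(eta) = \int_0^1 2 y'(x) eta'(x) dx.
```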

First-order condition. Can we define a first-order condition in the same way? Suppose we want to find a local minimum of a functional J over a subset A of V.
Definition (admissible perturbation): A perturbation η ∈ V is admissible if y* + αη ∈ A for all α close enough to 0.
Reasoning the same way as before, we can prove that if y* is a local minimum, then for all admissible perturbations η we must have
$$\delta J|_{y^*}(\eta) = 0.$$
Problem: how do we compute δJ|_{y^*}? 26 / 30

Second variation.
Definition (bilinear functional and quadratic form): B : V × V → R is bilinear if it is linear in each argument; Q(y) = B(y, y) is then called a quadratic form on V.
Definition (second variation): For a function y ∈ V, the second variation of J at y is the quadratic form δ²J|_y : V → R satisfying, for all η and all α:
$$J(y + \alpha\eta) = J(y) + \delta J|_y(\eta)\,\alpha + \delta^2 J|_y(\eta)\,\alpha^2 + o(\alpha^2).$$
27 / 30

Second-order condition. We still have that, if y* is a local minimum of J over A, then for all admissible perturbations η, we have
$$\delta^2 J|_{y^*}(\eta) \ge 0.$$
But this condition is not sufficient here! In the finite-dimensional case, sufficiency is proved using the fact that the unit ball is compact (and then applying the Weierstrass theorem). This is no longer true in the infinite-dimensional case! 28 / 30
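
A numerical illustration (not from the slides) of the failure of compactness, with the assumed sequence y_k(x) = sin(kπx): every y_k lies in the closed unit ball of (C^0[0, 1], ‖·‖_0), yet the functions stay far apart from each other in the sup norm, so no subsequence can converge.

```python
# Minimal sketch: the closed unit ball of (C^0[0,1], ||.||_0) is not compact.
# The functions y_k(x) = sin(k*pi*x) all have norm 1, but their pairwise
# sup-norm distances stay large, so (y_k) has no convergent subsequence.
import numpy as np
from itertools import combinations

x = np.linspace(0.0, 1.0, 5001)
ys = {k: np.sin(k * np.pi * x) for k in range(1, 13)}

min_pairwise = min(np.max(np.abs(ys[k] - ys[m]))
                   for k, m in combinations(ys, 2))
print("smallest pairwise sup-distance among y_1, ..., y_12:", round(min_pairwise, 3))
# The distances never become small, so no subsequence is Cauchy.
```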

Exercises.
- Pages 6-7: prove the necessary (and sufficient) conditions in the case where x* ∈ ∂D (suppose D convex).
- Page 15: prove the first-order condition in the constrained case with m > 1.
- Exercise 1.2: have a look at the following constrained problem:
$$\min\ x_1 + x_2 \quad \text{s.t.} \quad (x_1 + 2)^2 + x_2^2 = 2, \qquad (x_1 - 2)^2 + x_2^2 = 2.$$
Find the minimizing solution (you can first prove that it exists!). Is it a regular point? Have a look also at the KKT conditions (Lagrange multipliers).
- Exercises 1.5 and 1.6 (first and second variations).
30 / 30