Bellman's Curse of Dimensionality


Bellman's Curse of Dimensionality. In an n-dimensional state space, the number of states grows exponentially in n (for a fixed number of discretization levels per coordinate). In practice, discretization is considered computationally feasible only up to 5- or 6-dimensional state spaces, even when using variable-resolution discretization and highly optimized implementations.

Optimization for Optimal Control. Goal: find a sequence of control inputs (and corresponding sequence of states) that solves the optimal control problem. Generally hard to do. In this set of slides we consider convex problems, which means g is convex, the sets U_t and X_t are convex, and f is linear; the next set of slides will relax these assumptions. Note: iteratively applying LQR is one way to solve this problem, but it can get tricky when there are constraints on the control inputs and state. In principle (though not in our examples), u could be the parameters of a control policy rather than the raw control inputs.

Convex Optimization. Pieter Abbeel, UC Berkeley EECS. Many slides and figures adapted from Stephen Boyd. [Optional] Boyd and Vandenberghe, Convex Optimization, Chapters 9-11. [Optional] Betts, Practical Methods for Optimal Control Using Nonlinear Programming.

Outline. Convex optimization problems; unconstrained minimization (Gradient Descent, Newton's Method); equality constrained minimization; inequality and equality constrained minimization.

Convex Functions. A function f is convex if and only if $\forall x_1, x_2 \in \mathrm{Domain}(f),\ \forall t \in [0,1]:\ f(t x_1 + (1-t) x_2) \le t f(x_1) + (1-t) f(x_2)$. Image source: Wikipedia.

Convex Functions. Unique minimum. The set of points for which f(x) ≤ a (the sublevel set) is convex. Source: Thomas Jungblut's blog.

Convex Optimization Problems. Convex optimization problems are a special class of optimization problems of the following form: $\min_{x \in \mathbb{R}^n} f_0(x)$ s.t. $f_i(x) \le 0,\ i = 1, \ldots, m$, and $Ax = b$, with $f_i(x)$ convex for $i = 0, 1, \ldots, m$. A function f is convex if and only if $\forall x_1, x_2 \in \mathrm{Domain}(f),\ \forall \lambda \in [0,1]:\ f(\lambda x_1 + (1-\lambda) x_2) \le \lambda f(x_1) + (1-\lambda) f(x_2)$.
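
To make the definition concrete, here is a small numerical check of the convexity inequality (an illustrative sketch; the helper `convexity_holds` and the sampled grid of t values are my own, not from the slides):

```python
import numpy as np

def convexity_holds(f, x1, x2, num_t=50):
    """Check f(t*x1 + (1-t)*x2) <= t*f(x1) + (1-t)*f(x2) on a grid of t in [0, 1]."""
    ts = np.linspace(0.0, 1.0, num_t)
    lhs = np.array([f(t * x1 + (1 - t) * x2) for t in ts])
    rhs = ts * f(x1) + (1 - ts) * f(x2)
    return bool(np.all(lhs <= rhs + 1e-12))   # small tolerance for rounding

# f(x) = x^2 satisfies the inequality; f(x) = -x^2 violates it.
assert convexity_holds(lambda x: x**2, -3.0, 5.0)
assert not convexity_holds(lambda x: -x**2, -3.0, 5.0)
```

Of course this only tests the inequality along one segment; it is a sanity check on the definition, not a proof of convexity.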

Outline. Convex optimization problems; unconstrained minimization (Gradient Descent, Newton's Method); equality constrained minimization; inequality and equality constrained minimization.

Unconstrained Minimization. Consider the problem $\min_x f(x)$ (1). If $x^*$ is a local minimum of (differentiable) f, then it has to satisfy $\nabla f(x^*) = 0$ (2) and $\nabla^2 f(x^*) \succeq 0$ (3). In simple cases we can directly solve the system of n equations given by (2) to find candidate local minima, and then verify (3) for these candidates. In general, however, solving (2) is a difficult problem. Going forward we consider this more general setting and cover numerical solution methods for (1).

Steepest Descent. Idea: start somewhere; repeat: take a step in the steepest descent direction. Figure source: Mathworks.

Steepest Descent Algorithm. 1. Initialize x. 2. Repeat: (a) determine the steepest descent direction Δx; (b) line search: choose a step size t > 0; (c) update: x := x + t Δx. 3. Until stopping criterion is satisfied.

What is the Steepest Descent Direction? For the Euclidean norm, the steepest descent direction is the negative gradient, $\Delta x = -\nabla f(x)$, so steepest descent = gradient descent.

Stepsize Selection: Exact Line Search. Choose $t = \arg\min_{s > 0} f(x + s\,\Delta x)$. Used when the cost of solving this minimization problem with one variable is low compared to the cost of computing the search direction itself.

Stepsize Selection: Backtracking Line Search. Inexact: the step length is chosen to approximately minimize f along the ray {x + t Δx : t > 0}.

Stepsize Selection: Backtracking Line Search. Figure source: Boyd and Vandenberghe.
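
The backtracking scheme above can be sketched as follows (a minimal illustration; the parameter values alpha = 0.3, beta = 0.8 and the test quadratic are assumptions, not from the slides):

```python
import numpy as np

def backtracking_gradient_descent(f, grad, x0, alpha=0.3, beta=0.8,
                                  tol=1e-8, max_iters=10000):
    """Gradient descent with backtracking line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) < tol:                  # stopping criterion
            break
        dx = -g                                      # steepest descent direction
        t = 1.0
        # Backtrack until sufficient decrease: f(x + t dx) <= f(x) + alpha * t * g^T dx
        while f(x + t * dx) > f(x) + alpha * t * (g @ dx):
            t *= beta
        x = x + t * dx
    return x

# Quadratic with condition number 10: f(x) = 0.5 * (x1^2 + 10 * x2^2)
f = lambda x: 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)
grad = lambda x: np.array([x[0], 10.0 * x[1]])
x_star = backtracking_gradient_descent(f, grad, [10.0, 1.0])
# x_star ends up very close to the true minimizer [0, 0].
```

Since g @ dx = -||g||^2 < 0, the sufficient-decrease test always terminates for smooth f, which is why backtracking is the workhorse inexact line search.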

Steepest Descent (= Gradient Descent) Source: Boyd and Vandenberghe

Gradient Descent: Example 1 Figure source: Boyd and Vandenberghe

Gradient Descent: Example 2 Figure source: Boyd and Vandenberghe

Gradient Descent: Example 3 Figure source: Boyd and Vandenberghe

Gradient Descent Convergence. (Plots: condition number = 10 vs. condition number = 1.) For a quadratic function, convergence speed depends on the ratio of the highest to the lowest second derivative (the condition number). In high dimensions we are almost guaranteed to have a high (= bad) condition number. Rescaling coordinates (as could happen by simply expressing quantities in different measurement units) results in a different condition number.

Outline. Unconstrained minimization (Gradient Descent, Newton's Method); equality constrained minimization; inequality and equality constrained minimization.

Newton's Method. Use a 2nd-order Taylor approximation rather than a 1st-order one. Assuming $\nabla^2 f(x) \succ 0$ (the Hessian is at least positive semidefinite for convex f), the minimum of the 2nd-order approximation is achieved at the Newton step $\Delta x_{nt} = -\nabla^2 f(x)^{-1} \nabla f(x)$. Figure source: Boyd and Vandenberghe.

Newton's Method. Figure source: Boyd and Vandenberghe.
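
A minimal sketch of the resulting iteration (the helper `newton_method` and the quadratic test problem are illustrative assumptions, not from the slides):

```python
import numpy as np

def newton_method(grad, hess, x0, tol=1e-10, max_iters=50):
    """Pure Newton iteration x <- x + dx, with dx = -hess(x)^{-1} grad(x).
    Assumes the Hessian stays positive definite along the way."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad(x)
        dx = np.linalg.solve(hess(x), -g)    # Newton step
        if -(g @ dx) / 2.0 < tol:            # half the squared Newton decrement
            break
        x = x + dx
    return x

# For a convex quadratic f(x) = 0.5 x^T Q x - b^T x, a single Newton step
# lands exactly on the minimizer Q^{-1} b.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = newton_method(lambda v: Q @ v - b, lambda v: Q, np.zeros(2))
```

In practice one adds a backtracking line search (damped Newton) for robustness far from the optimum; the pure step above matches the slide's derivation.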

Affine Invariance. Consider the coordinate transformation y = A^{-1} x (x = Ay). If running Newton's method starting from x^(0) on f(x) results in x^(0), x^(1), x^(2), ..., then running Newton's method starting from y^(0) = A^{-1} x^(0) on g(y) = f(Ay) will result in the sequence y^(0) = A^{-1} x^(0), y^(1) = A^{-1} x^(1), y^(2) = A^{-1} x^(2), ... Exercise: try to prove this!

Affine Invariance: Proof.

Example 1: gradient descent with backtracking line search vs. Newton's method with backtracking line search. Figure source: Boyd and Vandenberghe.

Example 2: gradient descent vs. Newton's method. Figure source: Boyd and Vandenberghe.

Larger Version of Example 2 Figure source: Boyd and Vandenberghe

Gradient Descent: Example 3 Figure source: Boyd and Vandenberghe

Example 3: gradient descent vs. Newton's method (which converges in one step if f is a convex quadratic).

Quasi-Newton Methods. Quasi-Newton methods use an approximation of the Hessian. Example 1: compute only the diagonal entries of the Hessian and set the others to zero; note this also simplifies computations done with the Hessian. Example 2: the natural gradient (see next slide).

Natural Gradient. Consider a standard maximum likelihood problem: $\max_\theta f(\theta) = \sum_i \log p(x^{(i)}; \theta)$. Gradient: $\nabla f(\theta) = \sum_i \frac{\nabla p(x^{(i)}; \theta)}{p(x^{(i)}; \theta)}$. Hessian: $\nabla^2 f(\theta) = \sum_i \left[ \frac{\nabla^2 p(x^{(i)}; \theta)}{p(x^{(i)}; \theta)} - \nabla \log p(x^{(i)}; \theta)\, \nabla \log p(x^{(i)}; \theta)^\top \right]$. The natural gradient keeps only the 2nd term in the Hessian. Benefits: (1) faster to compute (only gradients needed); (2) guaranteed to be negative definite; (3) found to be superior in some experiments; (4) invariant to reparametrization.
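
A toy sketch of the idea for a one-parameter Gaussian family (the data, step size, and the `natural_gradient_step` helper are assumptions; the "empirical Fisher" below is the kept outer-product term of the Hessian with the sign flipped, which here reduces to a sum of squared scores):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=500)   # samples from N(2, 1)

def score(x, theta):
    """Per-sample score d/dtheta log p(x; theta) for p = N(theta, 1)."""
    return x - theta

def natural_gradient_step(theta, lr=1.0):
    s = score(data, theta)
    grad = s.sum()                       # gradient of the log-likelihood
    fisher = (s * s).sum()               # empirical Fisher: sum of squared scores
    return theta + lr * grad / fisher    # ascent step F^{-1} g

theta = 0.0
for _ in range(100):
    theta = natural_gradient_step(theta)
# theta converges to the maximum likelihood estimate, i.e. the sample mean.
```

Only gradients of log p are needed, matching benefit (1) on the slide; no second derivatives of p appear anywhere.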

Natural Gradient. Property: the natural gradient is invariant to the parametrization of the family of probability distributions p(x; θ), hence the name. Note this property is stronger than the corresponding property of Newton's method, which is invariant to affine reparametrizations only. Exercise: try to prove this property!

Natural Gradient Invariant to Reparametrization: Proof. Write the natural gradient for the parametrization with θ; then let Φ = f(θ) for an invertible (but otherwise unconstrained) reparametrization f and express the natural gradient in Φ. The natural gradient direction turns out to be the same, independent of the reparametrization f.

Outline. Unconstrained minimization (Gradient Descent, Newton's Method); equality constrained minimization; inequality and equality constrained minimization.

Equality Constrained Minimization. Problem to be solved: minimize f(x) subject to Ax = b. We will cover three solution methods: elimination, Newton's method, and the infeasible start Newton method.

Method 1: Elimination. From linear algebra we know that there exist a point $\hat{x}$ and a matrix F (in fact infinitely many) such that $\{x : Ax = b\} = \{\hat{x} + Fz\}$, where $\hat{x}$ is any solution to Ax = b and the columns of F span the null space of A. A way to find an F: compute the SVD of A, $A = U \Sigma V^\top$; for A having k nonzero singular values, set F = V(:, k+1:end). So we can solve the equality constrained minimization problem by solving an unconstrained minimization problem over a new variable z: $\min_z f(\hat{x} + Fz)$. Potential cons: (i) need to first find a solution to Ax = b; (ii) need to find F; (iii) elimination might destroy sparsity in the original problem structure.
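
A small sketch of elimination using the SVD (the example objective ||x||^2 and the variable names are illustrative assumptions; the null-space basis is taken from the columns of V corresponding to zero singular values):

```python
import numpy as np

# Minimize f(x) = ||x||^2 subject to Ax = b, by elimination.
A = np.array([[1.0, 1.0, 1.0]])          # one equality constraint in R^3
b = np.array([3.0])

# Particular solution x_hat and null-space basis F from the SVD A = U S V^T.
U, S, Vt = np.linalg.svd(A)
k = int(np.sum(S > 1e-10))               # number of nonzero singular values
F = Vt[k:].T                             # columns of V spanning null(A)
x_hat = np.linalg.lstsq(A, b, rcond=None)[0]   # any solution of Ax = b

# Unconstrained problem over z: minimize ||x_hat + F z||^2.
# For this quadratic the optimum satisfies F^T (x_hat + F z) = 0.
z = np.linalg.solve(F.T @ F, -F.T @ x_hat)
x = x_hat + F @ z                        # the minimum-norm point on the plane
```

Every feasible x is reached by some z, so the reduced problem is genuinely unconstrained; for this example the answer is the projection of the origin onto the constraint plane, x = (1, 1, 1).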

Methods 2 and 3: First Consider the Optimality Condition. Recall the problem to be solved: minimize f(x) subject to Ax = b. A point x* with Ax* = b is a (local) optimum if and only if $\nabla f(x^*)$ is orthogonal to the null space of A; equivalently, $\nabla f(x^*) + A^\top \nu^* = 0$ for some $\nu^*$.

Methods 2 and 3: First Consider the Optimality Condition. Recall the problem to be solved: minimize f(x) subject to Ax = b.

Method 2: Newton's Method. Problem to be solved: minimize f(x) subject to Ax = b. Assume x is feasible, i.e., it satisfies Ax = b; now use the 2nd-order approximation of f: $\hat{f}(x + \Delta x) = f(x) + \nabla f(x)^\top \Delta x + \frac{1}{2} \Delta x^\top \nabla^2 f(x) \Delta x$. The optimality condition for the 2nd-order approximation (subject to $A\Delta x = 0$): $\nabla^2 f(x)\, \Delta x + \nabla f(x) + A^\top \nu = 0$ and $A \Delta x = 0$.

Method 2: Newton's Method. The Newton step $\Delta x_{nt}$ is obtained by solving a linear system of equations: $\begin{bmatrix} \nabla^2 f(x) & A^\top \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x_{nt} \\ \nu \end{bmatrix} = \begin{bmatrix} -\nabla f(x) \\ 0 \end{bmatrix}$. This yields a feasible descent method: since $A \Delta x_{nt} = 0$, the iterates stay feasible while f decreases.
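
The KKT linear system for the Newton step can be assembled directly (a sketch; the equality-constrained quadratic example and the helper name are assumptions):

```python
import numpy as np

def eq_newton_step(H, g, A):
    """Newton step for  min f(x) s.t. Ax = b  at a feasible x:
    solve [H A^T; A 0] [dx; w] = [-g; 0], with H = hess f(x), g = grad f(x)."""
    n, p = H.shape[0], A.shape[0]
    K = np.block([[H, A.T], [A, np.zeros((p, p))]])
    rhs = np.concatenate([-g, np.zeros(p)])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]                  # (Newton step dx, dual variable w)

# minimize 0.5 ||x||^2  s.t.  x1 + x2 = 2, starting at the feasible point (2, 0).
H = np.eye(2)                                # Hessian of 0.5 ||x||^2
A = np.array([[1.0, 1.0]])
x = np.array([2.0, 0.0])
dx, w = eq_newton_step(H, x, A)              # gradient at x is just x here
x_new = x + dx                               # for a quadratic, one step suffices
```

Because the second block row enforces A dx = 0, the update x + t dx remains feasible for any step size t, which is exactly the "feasible descent" property on the slide.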

Method 3: Infeasible Start Newton Method. Problem to be solved: minimize f(x) subject to Ax = b. Use a 1st-order approximation of the optimality conditions at the current x (which need not satisfy Ax = b); equivalently, solve $\begin{bmatrix} \nabla^2 f(x) & A^\top \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \nu \end{bmatrix} = -\begin{bmatrix} \nabla f(x) \\ Ax - b \end{bmatrix}$.

Outline. Unconstrained minimization; equality constrained minimization; inequality and equality constrained minimization.

Equality and Inequality Constrained Minimization. Recall the problem to be solved: minimize $f_0(x)$ subject to $f_i(x) \le 0$, $i = 1, \ldots, m$, and $Ax = b$.

Equality and Inequality Constrained Minimization. Problem to be solved: minimize $f_0(x)$ s.t. $f_i(x) \le 0$, $Ax = b$. Approximation via the logarithmic barrier: first reformulate the inequality constraints with an indicator function, then smooth it. For t > 0, $-(1/t)\log(-u)$ is a smooth approximation of $I_-(u)$; the approximation improves as $t \to \infty$, but is better conditioned for smaller t. Result: no inequality constraints anymore, but a very poorly conditioned objective function.
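
A quick numerical illustration of how the scaled log barrier behaves at a feasible point (toy values, assumed for illustration):

```python
import numpy as np

def barrier(u, t):
    """Smooth approximation -(1/t) * log(-u) of the indicator I_-(u),
    which is 0 for u <= 0 and +infinity for u > 0."""
    return -(1.0 / t) * np.log(-u)

u = -0.5                                     # a strictly feasible value (u < 0)
penalties = [barrier(u, t) for t in (1, 10, 100, 1000)]
# As t grows, the penalty at a feasible point shrinks toward the indicator's
# value 0, while the curvature near u = 0 blows up: better approximation,
# worse conditioning, exactly the trade-off on the slide.
```

The barrier also diverges to +infinity as u approaches 0 from below for every t, which is what keeps iterates strictly feasible.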

Equality and Inequality Constrained Minimization.

Barrier Method. Given: strictly feasible x, t = t^(0) > 0, μ > 1, tolerance ε > 0. Repeat: 1. Centering step: compute x*(t) by solving $\min_x\ t f_0(x) + \phi(x)$ s.t. $Ax = b$ (with $\phi(x) = -\sum_i \log(-f_i(x))$), starting from x. 2. Update: x := x*(t). 3. Stopping criterion: quit if m/t < ε. 4. Increase t: t := μt.
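
A one-dimensional sketch of the barrier method for min x^2 s.t. x >= 1 (the problem, the Newton centering routine with its feasibility backtracking, and all parameter values are illustrative assumptions):

```python
import numpy as np

def centering(t, x, iters=50):
    """Newton's method on the centering objective t*x^2 - log(x - 1), in 1-D."""
    for _ in range(iters):
        g = 2.0 * t * x - 1.0 / (x - 1.0)        # gradient
        h = 2.0 * t + 1.0 / (x - 1.0) ** 2       # Hessian (always positive here)
        step = -g / h
        s = 1.0
        while x + s * step <= 1.0:               # backtrack to stay strictly feasible
            s *= 0.5
        x = x + s * step
    return x

# Barrier method for: minimize x^2  subject to  x >= 1  (optimum x* = 1).
x, t, mu, m = 2.0, 1.0, 10.0, 1                  # m = number of inequality constraints
while m / t >= 1e-8:                             # duality-gap stopping criterion m/t
    x = centering(t, x)                          # 1. centering step, x := x*(t)
    t *= mu                                      # 2. increase t
# x is now within roughly 1/(2t) of the true optimum x* = 1.
```

Each centering step starts from the previous central point, which is why warm-starting makes the inner Newton solves cheap; the central path x*(t) approaches the constrained optimum as t grows.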

Example 1: Inequality Form LP

Example 2: Geometric Program

Example 3: Standard LPs

Initialization. Basic phase I method: initialize by first solving $\min_{x,s}\ s$ s.t. $f_i(x) \le s,\ Ax = b$. The above problem is easy to initialize: pick some x such that Ax = b, and then simply set $s = \max_i f_i(x)$. We can stop early, whenever s < 0.
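
The basic phase I initialization can be sketched as follows (toy constraint data assumed; only the initialization step is shown, not the phase I solve itself):

```python
import numpy as np

# Phase I: find a feasible starting point for  f_i(x) <= 0,  Ax = b
# by solving  min_{x, s} s  s.t.  f_i(x) <= s,  Ax = b.
# Initializing that problem itself is easy:
A = np.array([[1.0, 1.0]])
b = np.array([2.0])
fis = [lambda x: x[0] - 0.5,                 # f_1(x) = x1 - 0.5
       lambda x: -x[1]]                      # f_2(x) = -x2

x = np.linalg.lstsq(A, b, rcond=None)[0]     # any solution of Ax = b
s = max(fi(x) for fi in fis)                 # now every f_i(x) <= s holds

# (x, s) is strictly feasible for the phase I problem once s is nudged up a
# little; if the phase I optimum drives s below 0, we have found a strictly
# feasible x for the original problem and can stop early.
```

Here the constraint f_2 is already satisfied at x, so s is set by the worst (most violated) constraint f_1, exactly as the slide prescribes.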

Initialization. Sum of infeasibilities phase I method: initialize by first solving $\min_{x,s}\ \sum_i s_i$ s.t. $f_i(x) \le s_i,\ s_i \ge 0,\ Ax = b$. Easy to initialize: pick some x such that Ax = b, and then simply set $s_i = \max(0, f_i(x))$. For infeasible problems, this produces a solution that satisfies many more inequalities than the basic phase I method.

Other Methods. We have covered a primal interior point method, one of several optimization approaches. Examples of others: primal-dual interior point methods and primal-dual infeasible interior point methods.

Optimal Control (Open Loop). For convex $g_t$ and $f_i$, we can now solve the finite-horizon problem, which gives an open-loop sequence of controls.

Optimal Control (Closed Loop). Given the current state, for k = 0, 1, 2, ..., T: solve $\min_{x,u} \sum_{t=k}^{T} g_t(x_t, u_t)$ s.t. $x_{t+1} = A_t x_t + B_t u_t,\ t \in \{k, k+1, \ldots, T-1\}$; $f_i(x, u) \le 0,\ i \in \{1, \ldots, m\}$; $x_k = \bar{x}_k$ (the observed current state). Execute $u_k$ and observe the resulting state. This is Model Predictive Control. Initialization with the solution from iteration k-1 can make the solver very fast (and would be done most conveniently with the infeasible start Newton method).
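
A toy receding-horizon loop for a scalar system (this sketch eliminates the dynamics and solves each subproblem by unconstrained least squares instead of the constrained convex solver the slides assume; `solve_horizon` and all numbers are illustrative):

```python
import numpy as np

def solve_horizon(xk, k, T):
    """Open-loop problem from step k for the scalar system x_{t+1} = x_t + u_t:
       min  sum_{t=k}^{T} x_t^2  +  sum_{t=k}^{T-1} u_t^2.
    Eliminating the dynamics (x_t = xk + sum of earlier u's) leaves least squares in u."""
    N = T - k                                   # number of remaining controls
    L = np.tril(np.ones((N, N)))                # maps u to (x_{k+1} - xk, ..., x_T - xk)
    M = np.vstack([L, np.eye(N)])               # stack state and control residuals
    rhs = np.concatenate([-xk * np.ones(N), np.zeros(N)])
    return np.linalg.lstsq(M, rhs, rcond=None)[0]

# MPC loop: at every step re-solve to the horizon, but execute only the first control.
T, x = 20, 5.0
for k in range(T):
    u = solve_horizon(x, k, T)
    x = x + u[0]                                # execute u_k, observe resulting state
# The state has been regulated very close to 0.
```

The structure is the point here: re-solving at every step and discarding all but the first control is what turns the open-loop plan into closed-loop (feedback) control.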

CVX. Disciplined convex programming = convex optimization problems of forms easily programmatically verified to be convex. Convenient high-level expressions; excellent for fast implementation. Designed by Michael Grant and Stephen Boyd, with input from Yinyu Ye. Current webpage: http://cvxr.com/cvx/

CVX Matlab Example for Optimal Control: see the course webpage.