Bellman's Curse of Dimensionality


Bellman's Curse of Dimensionality. In an n-dimensional state space, the number of states grows exponentially in n (for a fixed number of discretization levels per coordinate). In practice, discretization is considered computationally feasible only up to 5- or 6-dimensional state spaces, even when using variable-resolution discretization and highly optimized implementations.

Optimization for Optimal Control. Goal: find a sequence of control inputs (and corresponding sequence of states) that solves the optimal control problem. Generally hard to do. In this set of slides we consider convex problems, which means g is convex, the sets U_t and X_t are convex, and f is linear; the next set of slides will relax these assumptions. Note: iteratively applying LQR is one way to solve this problem, but it can get tricky when there are constraints on the control inputs and state. In principle (though not in our examples), u could be the parameters of a control policy rather than the raw control inputs.

Convex Optimization. Pieter Abbeel, UC Berkeley EECS. Many slides and figures adapted from Stephen Boyd. [Optional] Boyd and Vandenberghe, Convex Optimization, Chapters 9-11. [Optional] Betts, Practical Methods for Optimal Control Using Nonlinear Programming.

Outline. Convex optimization problems; unconstrained minimization (Gradient Descent, Newton's Method); equality constrained minimization; inequality and equality constrained minimization.

Convex Functions. A function f is convex if and only if $\forall x_1, x_2 \in \mathrm{Domain}(f),\ \forall t \in [0,1]:\ f(t x_1 + (1-t) x_2) \le t f(x_1) + (1-t) f(x_2)$. Image source: Wikipedia.

Convex Functions. Unique minimum. The set of points for which f(x) ≤ a (the sublevel set) is convex. Source: Thomas Jungblut's blog.

Convex Optimization Problems. Convex optimization problems are a special class of optimization problems of the following form: $\min_{x \in \mathbb{R}^n} f_0(x)$ s.t. $f_i(x) \le 0,\ i = 1, \ldots, m$, and $Ax = b$, with $f_i(x)$ convex for $i = 0, 1, \ldots, m$. A function f is convex if and only if $\forall x_1, x_2 \in \mathrm{Domain}(f),\ \forall \lambda \in [0,1]:\ f(\lambda x_1 + (1-\lambda) x_2) \le \lambda f(x_1) + (1-\lambda) f(x_2)$.
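
To make the definition concrete, here is a small numerical check of the convexity inequality (an illustrative sketch; the helper `convexity_holds` and the sampled grid of t values are my own, not from the slides):

```python
import numpy as np

def convexity_holds(f, x1, x2, num_t=50):
    """Check f(t*x1 + (1-t)*x2) <= t*f(x1) + (1-t)*f(x2) on a grid of t in [0, 1]."""
    ts = np.linspace(0.0, 1.0, num_t)
    lhs = np.array([f(t * x1 + (1 - t) * x2) for t in ts])
    rhs = ts * f(x1) + (1 - ts) * f(x2)
    return bool(np.all(lhs <= rhs + 1e-12))   # small tolerance for rounding

# f(x) = x^2 satisfies the inequality; f(x) = -x^2 violates it.
assert convexity_holds(lambda x: x**2, -3.0, 5.0)
assert not convexity_holds(lambda x: -x**2, -3.0, 5.0)
```

Of course this only tests the inequality along one segment; it is a sanity check on the definition, not a proof of convexity.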

Outline. Convex optimization problems; unconstrained minimization (Gradient Descent, Newton's Method); equality constrained minimization; inequality and equality constrained minimization.

Unconstrained Minimization. Consider the problem $\min_x f(x)$ (1). If $x^*$ is a local minimum of (differentiable) f, then it has to satisfy $\nabla f(x^*) = 0$ (2) and $\nabla^2 f(x^*) \succeq 0$ (3). In simple cases we can directly solve the system of n equations given by (2) to find candidate local minima, and then verify (3) for these candidates. In general, however, solving (2) is a difficult problem. Going forward we consider this more general setting and cover numerical solution methods for (1).

Steepest Descent. Idea: start somewhere; repeat: take a step in the steepest descent direction. Figure source: Mathworks.

Steepest Descent Algorithm. 1. Initialize x. 2. Repeat: (a) determine the steepest descent direction Δx; (b) line search: choose a step size t > 0; (c) update: x := x + t Δx. 3. Until stopping criterion is satisfied.

What is the Steepest Descent Direction? For the Euclidean norm, the steepest descent direction is the negative gradient, $\Delta x = -\nabla f(x)$, so steepest descent = gradient descent.

Stepsize Selection: Exact Line Search. Choose $t = \arg\min_{s > 0} f(x + s\,\Delta x)$. Used when the cost of solving this minimization problem with one variable is low compared to the cost of computing the search direction itself.

Stepsize Selection: Backtracking Line Search. Inexact: the step length is chosen to approximately minimize f along the ray {x + t Δx : t > 0}.

Stepsize Selection: Backtracking Line Search. Figure source: Boyd and Vandenberghe.
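
The backtracking scheme above can be sketched as follows (a minimal illustration; the parameter values alpha = 0.3, beta = 0.8 and the test quadratic are assumptions, not from the slides):

```python
import numpy as np

def backtracking_gradient_descent(f, grad, x0, alpha=0.3, beta=0.8,
                                  tol=1e-8, max_iters=10000):
    """Gradient descent with backtracking line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) < tol:                  # stopping criterion
            break
        dx = -g                                      # steepest descent direction
        t = 1.0
        # Backtrack until sufficient decrease: f(x + t dx) <= f(x) + alpha * t * g^T dx
        while f(x + t * dx) > f(x) + alpha * t * (g @ dx):
            t *= beta
        x = x + t * dx
    return x

# Quadratic with condition number 10: f(x) = 0.5 * (x1^2 + 10 * x2^2)
f = lambda x: 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)
grad = lambda x: np.array([x[0], 10.0 * x[1]])
x_star = backtracking_gradient_descent(f, grad, [10.0, 1.0])
# x_star ends up very close to the true minimizer [0, 0].
```

Since g @ dx = -||g||^2 < 0, the sufficient-decrease test always terminates for smooth f, which is why backtracking is the workhorse inexact line search.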

Steepest Descent (= Gradient Descent) Source: Boyd and Vandenberghe

Gradient Descent: Example 1 Figure source: Boyd and Vandenberghe

Gradient Descent: Example 2 Figure source: Boyd and Vandenberghe

Gradient Descent: Example 3 Figure source: Boyd and Vandenberghe

Gradient Descent Convergence. (Plots: condition number = 10 vs. condition number = 1.) For a quadratic function, convergence speed depends on the ratio of the highest to the lowest second derivative (the condition number). In high dimensions we are almost guaranteed to have a high (= bad) condition number. Rescaling coordinates (as could happen by simply expressing quantities in different measurement units) results in a different condition number.

Outline. Unconstrained minimization (Gradient Descent, Newton's Method); equality constrained minimization; inequality and equality constrained minimization.

Newton's Method. Use a 2nd-order Taylor approximation rather than a 1st-order one. Assuming $\nabla^2 f(x) \succ 0$ (the Hessian is at least positive semidefinite for convex f), the minimum of the 2nd-order approximation is achieved at the Newton step $\Delta x_{nt} = -\nabla^2 f(x)^{-1} \nabla f(x)$. Figure source: Boyd and Vandenberghe.

Newton's Method. Figure source: Boyd and Vandenberghe.
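
A minimal sketch of the resulting iteration (the helper `newton_method` and the quadratic test problem are illustrative assumptions, not from the slides):

```python
import numpy as np

def newton_method(grad, hess, x0, tol=1e-10, max_iters=50):
    """Pure Newton iteration x <- x + dx, with dx = -hess(x)^{-1} grad(x).
    Assumes the Hessian stays positive definite along the way."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad(x)
        dx = np.linalg.solve(hess(x), -g)    # Newton step
        if -(g @ dx) / 2.0 < tol:            # half the squared Newton decrement
            break
        x = x + dx
    return x

# For a convex quadratic f(x) = 0.5 x^T Q x - b^T x, a single Newton step
# lands exactly on the minimizer Q^{-1} b.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = newton_method(lambda v: Q @ v - b, lambda v: Q, np.zeros(2))
```

In practice one adds a backtracking line search (damped Newton) for robustness far from the optimum; the pure step above matches the slide's derivation.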

Affine Invariance. Consider the coordinate transformation y = A^{-1} x (x = Ay). If running Newton's method starting from x^(0) on f(x) results in x^(0), x^(1), x^(2), ..., then running Newton's method starting from y^(0) = A^{-1} x^(0) on g(y) = f(Ay) will result in the sequence y^(0) = A^{-1} x^(0), y^(1) = A^{-1} x^(1), y^(2) = A^{-1} x^(2), ... Exercise: try to prove this!

Affine Invariance: Proof.

Example 1: gradient descent with backtracking line search vs. Newton's method with backtracking line search. Figure source: Boyd and Vandenberghe.

Example 2: gradient descent vs. Newton's method. Figure source: Boyd and Vandenberghe.

Larger Version of Example 2 Figure source: Boyd and Vandenberghe

Gradient Descent: Example 3 Figure source: Boyd and Vandenberghe

Example 3: gradient descent vs. Newton's method (which converges in one step if f is a convex quadratic).

Quasi-Newton Methods. Quasi-Newton methods use an approximation of the Hessian. Example 1: compute only the diagonal entries of the Hessian and set the others to zero; note this also simplifies computations done with the Hessian. Example 2: the natural gradient (see next slide).

Natural Gradient. Consider a standard maximum likelihood problem: $\max_\theta f(\theta) = \sum_i \log p(x^{(i)}; \theta)$. Gradient: $\nabla f(\theta) = \sum_i \frac{\nabla p(x^{(i)}; \theta)}{p(x^{(i)}; \theta)}$. Hessian: $\nabla^2 f(\theta) = \sum_i \left[ \frac{\nabla^2 p(x^{(i)}; \theta)}{p(x^{(i)}; \theta)} - \nabla \log p(x^{(i)}; \theta)\, \nabla \log p(x^{(i)}; \theta)^\top \right]$. The natural gradient keeps only the 2nd term in the Hessian. Benefits: (1) faster to compute (only gradients needed); (2) guaranteed to be negative definite; (3) found to be superior in some experiments; (4) invariant to reparametrization.
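
A toy sketch of the idea for a one-parameter Gaussian family (the data, step size, and the `natural_gradient_step` helper are assumptions; the "empirical Fisher" below is the kept outer-product term of the Hessian with the sign flipped, which here reduces to a sum of squared scores):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=500)   # samples from N(2, 1)

def score(x, theta):
    """Per-sample score d/dtheta log p(x; theta) for p = N(theta, 1)."""
    return x - theta

def natural_gradient_step(theta, lr=1.0):
    s = score(data, theta)
    grad = s.sum()                       # gradient of the log-likelihood
    fisher = (s * s).sum()               # empirical Fisher: sum of squared scores
    return theta + lr * grad / fisher    # ascent step F^{-1} g

theta = 0.0
for _ in range(100):
    theta = natural_gradient_step(theta)
# theta converges to the maximum likelihood estimate, i.e. the sample mean.
```

Only gradients of log p are needed, matching benefit (1) on the slide; no second derivatives of p appear anywhere.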

Natural Gradient. Property: the natural gradient is invariant to the parametrization of the family of probability distributions p(x; θ), hence the name. Note this property is stronger than the corresponding property of Newton's method, which is invariant to affine reparametrizations only. Exercise: try to prove this property!

Natural Gradient Invariant to Reparametrization: Proof. Write the natural gradient for the parametrization with θ; then let Φ = f(θ) for an invertible (but otherwise unconstrained) reparametrization f and express the natural gradient in Φ. The natural gradient direction turns out to be the same, independent of the reparametrization f.

Outline. Unconstrained minimization (Gradient Descent, Newton's Method); equality constrained minimization; inequality and equality constrained minimization.

Equality Constrained Minimization. Problem to be solved: minimize f(x) subject to Ax = b. We will cover three solution methods: elimination, Newton's method, and the infeasible start Newton method.

Method 1: Elimination. From linear algebra we know that there exist a point $\hat{x}$ and a matrix F (in fact infinitely many) such that $\{x : Ax = b\} = \{\hat{x} + Fz\}$, where $\hat{x}$ is any solution to Ax = b and the columns of F span the null space of A. A way to find an F: compute the SVD of A, $A = U \Sigma V^\top$; for A having k nonzero singular values, set F = V(:, k+1:end). So we can solve the equality constrained minimization problem by solving an unconstrained minimization problem over a new variable z: $\min_z f(\hat{x} + Fz)$. Potential cons: (i) need to first find a solution to Ax = b; (ii) need to find F; (iii) elimination might destroy sparsity in the original problem structure.
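
A small sketch of elimination using the SVD (the example objective ||x||^2 and the variable names are illustrative assumptions; the null-space basis is taken from the columns of V corresponding to zero singular values):

```python
import numpy as np

# Minimize f(x) = ||x||^2 subject to Ax = b, by elimination.
A = np.array([[1.0, 1.0, 1.0]])          # one equality constraint in R^3
b = np.array([3.0])

# Particular solution x_hat and null-space basis F from the SVD A = U S V^T.
U, S, Vt = np.linalg.svd(A)
k = int(np.sum(S > 1e-10))               # number of nonzero singular values
F = Vt[k:].T                             # columns of V spanning null(A)
x_hat = np.linalg.lstsq(A, b, rcond=None)[0]   # any solution of Ax = b

# Unconstrained problem over z: minimize ||x_hat + F z||^2.
# For this quadratic the optimum satisfies F^T (x_hat + F z) = 0.
z = np.linalg.solve(F.T @ F, -F.T @ x_hat)
x = x_hat + F @ z                        # the minimum-norm point on the plane
```

Every feasible x is reached by some z, so the reduced problem is genuinely unconstrained; for this example the answer is the projection of the origin onto the constraint plane, x = (1, 1, 1).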

Methods 2 and 3: First Consider the Optimality Condition. Recall the problem to be solved: minimize f(x) subject to Ax = b. A point x* with Ax* = b is a (local) optimum if and only if $\nabla f(x^*)$ is orthogonal to the null space of A; equivalently, $\nabla f(x^*) + A^\top \nu^* = 0$ for some $\nu^*$.

Methods 2 and 3: First Consider the Optimality Condition. Recall the problem to be solved: minimize f(x) subject to Ax = b.

Method 2: Newton's Method. Problem to be solved: minimize f(x) subject to Ax = b. Assume x is feasible, i.e., it satisfies Ax = b; now use the 2nd-order approximation of f: $\hat{f}(x + \Delta x) = f(x) + \nabla f(x)^\top \Delta x + \frac{1}{2} \Delta x^\top \nabla^2 f(x) \Delta x$. The optimality condition for the 2nd-order approximation (subject to $A\Delta x = 0$): $\nabla^2 f(x)\, \Delta x + \nabla f(x) + A^\top \nu = 0$ and $A \Delta x = 0$.

Method 2: Newton's Method. The Newton step $\Delta x_{nt}$ is obtained by solving a linear system of equations: $\begin{bmatrix} \nabla^2 f(x) & A^\top \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x_{nt} \\ \nu \end{bmatrix} = \begin{bmatrix} -\nabla f(x) \\ 0 \end{bmatrix}$. This yields a feasible descent method: since $A \Delta x_{nt} = 0$, the iterates stay feasible while f decreases.
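
The KKT linear system for the Newton step can be assembled directly (a sketch; the equality-constrained quadratic example and the helper name are assumptions):

```python
import numpy as np

def eq_newton_step(H, g, A):
    """Newton step for  min f(x) s.t. Ax = b  at a feasible x:
    solve [H A^T; A 0] [dx; w] = [-g; 0], with H = hess f(x), g = grad f(x)."""
    n, p = H.shape[0], A.shape[0]
    K = np.block([[H, A.T], [A, np.zeros((p, p))]])
    rhs = np.concatenate([-g, np.zeros(p)])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]                  # (Newton step dx, dual variable w)

# minimize 0.5 ||x||^2  s.t.  x1 + x2 = 2, starting at the feasible point (2, 0).
H = np.eye(2)                                # Hessian of 0.5 ||x||^2
A = np.array([[1.0, 1.0]])
x = np.array([2.0, 0.0])
dx, w = eq_newton_step(H, x, A)              # gradient at x is just x here
x_new = x + dx                               # for a quadratic, one step suffices
```

Because the second block row enforces A dx = 0, the update x + t dx remains feasible for any step size t, which is exactly the "feasible descent" property on the slide.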

Method 3: Infeasible Start Newton Method. Problem to be solved: minimize f(x) subject to Ax = b. Use a 1st-order approximation of the optimality conditions at the current x (which need not satisfy Ax = b); equivalently, solve $\begin{bmatrix} \nabla^2 f(x) & A^\top \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \nu \end{bmatrix} = -\begin{bmatrix} \nabla f(x) \\ Ax - b \end{bmatrix}$.

Outline. Unconstrained minimization; equality constrained minimization; inequality and equality constrained minimization.

Equality and Inequality Constrained Minimization. Recall the problem to be solved: minimize $f_0(x)$ subject to $f_i(x) \le 0$, $i = 1, \ldots, m$, and $Ax = b$.

Equality and Inequality Constrained Minimization. Problem to be solved: minimize $f_0(x)$ s.t. $f_i(x) \le 0$, $Ax = b$. Approximation via the logarithmic barrier: first reformulate the inequality constraints with an indicator function, then smooth it. For t > 0, $-(1/t)\log(-u)$ is a smooth approximation of $I_-(u)$; the approximation improves as $t \to \infty$, but is better conditioned for smaller t. Result: no inequality constraints anymore, but a very poorly conditioned objective function.
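
A quick numerical illustration of how the scaled log barrier behaves at a feasible point (toy values, assumed for illustration):

```python
import numpy as np

def barrier(u, t):
    """Smooth approximation -(1/t) * log(-u) of the indicator I_-(u),
    which is 0 for u <= 0 and +infinity for u > 0."""
    return -(1.0 / t) * np.log(-u)

u = -0.5                                     # a strictly feasible value (u < 0)
penalties = [barrier(u, t) for t in (1, 10, 100, 1000)]
# As t grows, the penalty at a feasible point shrinks toward the indicator's
# value 0, while the curvature near u = 0 blows up: better approximation,
# worse conditioning, exactly the trade-off on the slide.
```

The barrier also diverges to +infinity as u approaches 0 from below for every t, which is what keeps iterates strictly feasible.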

Equality and Inequality Constrained Minimization.

Barrier Method. Given: strictly feasible x, t = t^(0) > 0, μ > 1, tolerance ε > 0. Repeat: 1. Centering step: compute x*(t) by solving $\min_x\ t f_0(x) + \phi(x)$ s.t. $Ax = b$ (with $\phi(x) = -\sum_i \log(-f_i(x))$), starting from x. 2. Update: x := x*(t). 3. Stopping criterion: quit if m/t < ε. 4. Increase t: t := μt.
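
A one-dimensional sketch of the barrier method for min x^2 s.t. x >= 1 (the problem, the Newton centering routine with its feasibility backtracking, and all parameter values are illustrative assumptions):

```python
import numpy as np

def centering(t, x, iters=50):
    """Newton's method on the centering objective t*x^2 - log(x - 1), in 1-D."""
    for _ in range(iters):
        g = 2.0 * t * x - 1.0 / (x - 1.0)        # gradient
        h = 2.0 * t + 1.0 / (x - 1.0) ** 2       # Hessian (always positive here)
        step = -g / h
        s = 1.0
        while x + s * step <= 1.0:               # backtrack to stay strictly feasible
            s *= 0.5
        x = x + s * step
    return x

# Barrier method for: minimize x^2  subject to  x >= 1  (optimum x* = 1).
x, t, mu, m = 2.0, 1.0, 10.0, 1                  # m = number of inequality constraints
while m / t >= 1e-8:                             # duality-gap stopping criterion m/t
    x = centering(t, x)                          # 1. centering step, x := x*(t)
    t *= mu                                      # 2. increase t
# x is now within roughly 1/(2t) of the true optimum x* = 1.
```

Each centering step starts from the previous central point, which is why warm-starting makes the inner Newton solves cheap; the central path x*(t) approaches the constrained optimum as t grows.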

Example 1: Inequality Form LP

Example 2: Geometric Program

Example 3: Standard LPs

Initialization. Basic phase I method: initialize by first solving $\min_{x,s}\ s$ s.t. $f_i(x) \le s,\ Ax = b$. The above problem is easy to initialize: pick some x such that Ax = b, and then simply set $s = \max_i f_i(x)$. We can stop early, whenever s < 0.
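
The basic phase I initialization can be sketched as follows (toy constraint data assumed; only the initialization step is shown, not the phase I solve itself):

```python
import numpy as np

# Phase I: find a feasible starting point for  f_i(x) <= 0,  Ax = b
# by solving  min_{x, s} s  s.t.  f_i(x) <= s,  Ax = b.
# Initializing that problem itself is easy:
A = np.array([[1.0, 1.0]])
b = np.array([2.0])
fis = [lambda x: x[0] - 0.5,                 # f_1(x) = x1 - 0.5
       lambda x: -x[1]]                      # f_2(x) = -x2

x = np.linalg.lstsq(A, b, rcond=None)[0]     # any solution of Ax = b
s = max(fi(x) for fi in fis)                 # now every f_i(x) <= s holds

# (x, s) is strictly feasible for the phase I problem once s is nudged up a
# little; if the phase I optimum drives s below 0, we have found a strictly
# feasible x for the original problem and can stop early.
```

Here the constraint f_2 is already satisfied at x, so s is set by the worst (most violated) constraint f_1, exactly as the slide prescribes.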

Initialization. Sum of infeasibilities phase I method: initialize by first solving $\min_{x,s}\ \sum_i s_i$ s.t. $f_i(x) \le s_i,\ s_i \ge 0,\ Ax = b$. Easy to initialize: pick some x such that Ax = b, and then simply set $s_i = \max(0, f_i(x))$. For infeasible problems, this produces a solution that satisfies many more inequalities than the basic phase I method.

Other Methods. We have covered a primal interior point method, one of several optimization approaches. Examples of others: primal-dual interior point methods and primal-dual infeasible interior point methods.

Optimal Control (Open Loop). For convex $g_t$ and $f_i$, we can now solve the finite-horizon problem, which gives an open-loop sequence of controls.

Optimal Control (Closed Loop). Given the current state, for k = 0, 1, 2, ..., T: solve $\min_{x,u} \sum_{t=k}^{T} g_t(x_t, u_t)$ s.t. $x_{t+1} = A_t x_t + B_t u_t,\ t \in \{k, k+1, \ldots, T-1\}$; $f_i(x, u) \le 0,\ i \in \{1, \ldots, m\}$; $x_k = \bar{x}_k$ (the observed current state). Execute $u_k$ and observe the resulting state. This is Model Predictive Control. Initialization with the solution from iteration k-1 can make the solver very fast (and would be done most conveniently with the infeasible start Newton method).
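
A toy receding-horizon loop for a scalar system (this sketch eliminates the dynamics and solves each subproblem by unconstrained least squares instead of the constrained convex solver the slides assume; `solve_horizon` and all numbers are illustrative):

```python
import numpy as np

def solve_horizon(xk, k, T):
    """Open-loop problem from step k for the scalar system x_{t+1} = x_t + u_t:
       min  sum_{t=k}^{T} x_t^2  +  sum_{t=k}^{T-1} u_t^2.
    Eliminating the dynamics (x_t = xk + sum of earlier u's) leaves least squares in u."""
    N = T - k                                   # number of remaining controls
    L = np.tril(np.ones((N, N)))                # maps u to (x_{k+1} - xk, ..., x_T - xk)
    M = np.vstack([L, np.eye(N)])               # stack state and control residuals
    rhs = np.concatenate([-xk * np.ones(N), np.zeros(N)])
    return np.linalg.lstsq(M, rhs, rcond=None)[0]

# MPC loop: at every step re-solve to the horizon, but execute only the first control.
T, x = 20, 5.0
for k in range(T):
    u = solve_horizon(x, k, T)
    x = x + u[0]                                # execute u_k, observe resulting state
# The state has been regulated very close to 0.
```

The structure is the point here: re-solving at every step and discarding all but the first control is what turns the open-loop plan into closed-loop (feedback) control.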

CVX. Disciplined convex programming = convex optimization problems of forms easily programmatically verified to be convex. Convenient high-level expressions; excellent for fast implementation. Designed by Michael Grant and Stephen Boyd, with input from Yinyu Ye. Current webpage: http://cvxr.com/cvx/

CVX Matlab Example for Optimal Control: see the course webpage.