
MATHEMATICS FOR COMPUTER VISION WEEK 8 OPTIMISATION PART 2 Dr Fabio Cuzzolin MSc in Computer Vision Oxford Brookes University Year 2013-14 1

OUTLINE OF WEEK 8 topics: quadratic optimisation, least squares, iterative algorithms for nonlinear optimisation. Least squares methods; linear least squares; Quadratic Programming (QP); Integer Programming (IP) and LP relaxation; iterative methods: Newton-Raphson, Quasi-Newton, conjugate gradient, gradient descent 2

LEAST SQUARES OPTIMISATION 3

LEAST SQUARES 4

LINEAR LEAST SQUARES 5

COMPUTATION AND INTERPRETATION 6
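As an illustrative sketch of how linear least squares can be computed (hypothetical data; the normal equations A^T A x = A^T b assume A has full column rank), in Python with NumPy:

import numpy as np

# Hypothetical overdetermined problem: fit a line y = m*t + q to noisy points
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# Design matrix A: one column for the slope, one for the intercept
A = np.column_stack([t, np.ones_like(t)])

# Normal equations: A^T A x = A^T y (assumes A has full column rank)
x_normal = np.linalg.solve(A.T @ A, A.T @ y)

# More robust alternative built into NumPy (SVD based)
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

print("normal equations:", x_normal)
print("np.linalg.lstsq: ", x_lstsq)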

QUADRATIC PROGRAMMING 7

FORMULATION 8

DUAL PROBLEM AND COMPUTATION 9
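As an illustrative sketch (hypothetical problem data): a small equality-constrained QP, minimise (1/2) x^T Q x + c^T x subject to Ax = b, can be solved directly from its KKT conditions, which reduce to a single linear system:

import numpy as np

# Hypothetical QP: minimise 0.5*x'Qx + c'x  subject to  A x = b
Q = np.array([[4.0, 1.0],
              [1.0, 2.0]])      # positive definite
c = np.array([-1.0, -1.0])
A = np.array([[1.0, 1.0]])      # single equality constraint x1 + x2 = 1
b = np.array([1.0])

# KKT conditions: Q x + A' lam = -c  and  A x = b, i.e. one symmetric linear system
n, m = Q.shape[0], A.shape[0]
KKT = np.block([[Q, A.T],
                [A, np.zeros((m, m))]])
rhs = np.concatenate([-c, b])
sol = np.linalg.solve(KKT, rhs)
x, lam = sol[:n], sol[n:]
print("x* =", x, " lambda* =", lam)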

INTEGER PROGRAMMING 10

INTEGER PROGRAMMING (IP) optimisation problem in which some of the variables are required to be integer numbers; it is NP-hard; canonical form: maximise c^T x subject to Ax <= b, x >= 0, x integer, where all entries of A, b and c are integer; many problems can be formulated as IP: travelling salesman, vertex cover, Boolean satisfiability 11

EXAMPLE example problem: feasible integer points in red, constraints after LP relaxation in blue; clearly, the optimum of the LP relaxation is neither feasible nor optimal for the IP problem 12

LP RELAXATION the idea is to relax the constraint that x is integer, solve the resulting LP problem, and then round; in general, the solution after relaxation is not feasible; however, if A is totally unimodular, every basic feasible solution (vertex of the polytope determined by the linear constraints) is integer! (a square nonsingular matrix is unimodular when det A = ±1; a matrix is totally unimodular when every square nonsingular submatrix is unimodular) in that case we can just apply the simplex algorithm, and we are sure to get the optimal integer solution; if A is not totally unimodular, there are exact algorithms 13
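A minimal Python sketch of LP relaxation with naive rounding (hypothetical problem data; assuming SciPy is available, scipy.optimize.linprog solves the relaxed LP, and the rounded result here turns out infeasible, illustrating the limitation):

import numpy as np
from scipy.optimize import linprog

# Hypothetical IP: maximise x1 + x2  s.t.  2*x1 + x2 <= 4,  x1 + 3*x2 <= 6,  x >= 0, x integer
c = np.array([-1.0, -1.0])          # linprog minimises, so negate the objective
A_ub = np.array([[2.0, 1.0],
                 [1.0, 3.0]])
b_ub = np.array([4.0, 6.0])

# LP relaxation: drop the integrality constraint and solve the LP
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
x_relaxed = res.x
x_rounded = np.round(x_relaxed)     # naive rounding: may be infeasible or suboptimal
print("LP relaxation optimum:", x_relaxed)   # (1.2, 1.6)
print("rounded candidate:   ", x_rounded)    # (1, 2), which violates x1 + 3*x2 <= 6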

ITERATIVE METHODS 14

NONLINEAR PROGRAMMING some of the constraints or the objective function are nonlinear; the issue arises when the problem is non-convex; under differentiability of the functions involved, the Kuhn-Tucker conditions provide necessary conditions for optimality (see Week 7) example: nonlinear feasibility space (blue sector) useful tools: numerical iterative methods 15

ITERATIVE METHODS solve nonlinear programming problems by evaluating the Hessian, gradients and/or function values; for smooth functions, derivative calculations improve the rate of convergence, but increase the computational load; performance criterion: number of function evaluations, of order n+1 for gradients and of order n^2 for Hessians; ultimately, what is best depends on the problem 16
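A minimal sketch of where the "n+1 evaluations per gradient" count comes from, using a hypothetical forward-difference helper (analytic gradients or automatic differentiation avoid this cost):

import numpy as np

def forward_difference_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x with n+1 function evaluations."""
    n = x.size
    f0 = f(x)                        # 1 evaluation
    grad = np.zeros(n)
    for i in range(n):               # n further evaluations, one per coordinate
        e = np.zeros(n)
        e[i] = h
        grad[i] = (f(x + e) - f0) / h
    return grad

# Example on a simple quadratic f(x) = x'x, whose exact gradient is 2x
f = lambda x: x @ x
x0 = np.array([1.0, -2.0, 0.5])
print(forward_difference_gradient(f, x0))    # approximately [2, -4, 1]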

NEWTON'S METHOD 17

GEOMETRIC INTERPRETATION 18
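A minimal Python sketch of Newton's method for minimisation (illustrative, not the slide's own code), applied to the same function used in the gradient descent example below, f(x) = x^4 - 3x^3 + 2; each step uses both the first and the second derivative, x_{k+1} = x_k - f'(x_k)/f''(x_k):

def f_prime(x):
    return 4 * x**3 - 9 * x**2

def f_second(x):
    return 12 * x**2 - 18 * x

x = 6.0                      # same starting point as the gradient descent example
for _ in range(20):
    step = f_prime(x) / f_second(x)
    x = x - step
    if abs(step) < 1e-8:     # stop when the Newton step becomes negligible
        break
print("Local minimum occurs at", x)   # converges to x = 2.25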

QUASI-NEWTON derives from Newton's method: it looks for the stationary points of a function using a second-order Taylor approximation; the Hessian does not need to be computed: an approximation B of it is used, chosen to satisfy a condition called the secant equation, B_{k+1}(x_{k+1} - x_k) = ∇f(x_{k+1}) - ∇f(x_k) (a Taylor expansion of the gradient itself); this equation is underdetermined: we need to add additional constraints, with various options, e.g. symmetry B = B^T and minimal distance from the current approximation, B_{k+1} = argmin ||B - B_k|| over the B satisfying the secant equation 19

QUASI-NEWTON update steps: Newton steps are taken using the current approximation B_k of the Hessian; there are various methods to update B_k, e.g. DFP and BFGS (for BFGS, with s_k = x_{k+1} - x_k and y_k = ∇f(x_{k+1}) - ∇f(x_k), B_{k+1} = B_k + y_k y_k^T / (y_k^T s_k) - B_k s_k s_k^T B_k / (s_k^T B_k s_k)); Matlab Optimization Toolbox implementation: BFGS is one option of fminunc.m 20
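fminunc.m is the Matlab route; as a minimal sketch of the same idea in Python (assuming SciPy is available, and using the Rosenbrock function purely as a hypothetical test problem), scipy.optimize.minimize offers a BFGS option:

import numpy as np
from scipy.optimize import minimize

# Hypothetical test problem: the Rosenbrock function and its gradient
def rosen(x):
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

def rosen_grad(x):
    return np.array([-400.0 * x[0] * (x[1] - x[0]**2) - 2.0 * (1.0 - x[0]),
                     200.0 * (x[1] - x[0]**2)])

x0 = np.array([-1.2, 1.0])
res = minimize(rosen, x0, jac=rosen_grad, method='BFGS')
print(res.x)          # should be close to the minimiser [1, 1]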

GRADIENT DESCENT (STEEPEST DESCENT) first-order optimisation algorithm: it takes steps proportional to the negative of the gradient to find a local minimum (the opposite for local maxima); also known as steepest descent; it starts with a guess x_0 and updates using x_{n+1} = x_n - γ ∇F(x_n); if the step size γ is small enough, F(x_{n+1}) <= F(x_n), and the sequence x_0, x_1, x_2, ... should converge to a local minimum 21

BEHAVIOR OF GRADIENT DESCENT the trajectory tends to zig-zag, with slow convergence near the minimum, since the gradient points away from the actual direction of the sought minimum; it can be used to solve linear systems Ax = b in a least squares sense, by minimising ||Ax - b||^2 22
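A minimal sketch of that use, with hypothetical data: minimising F(x) = (1/2)||Ax - b||^2, whose gradient is A^T (Ax - b), by plain gradient descent:

import numpy as np

# Hypothetical overdetermined system A x = b, solved in a least squares sense
A = np.array([[3.0, 1.0],
              [1.0, 2.0],
              [1.0, 1.0]])
b = np.array([9.0, 8.0, 5.0])

x = np.zeros(2)
gamma = 0.05                              # step size, small enough for this A
for _ in range(5000):
    grad = A.T @ (A @ x - b)              # gradient of 0.5 * ||Ax - b||^2
    x = x - gamma * grad

print("gradient descent:", x)
print("np.linalg.lstsq: ", np.linalg.lstsq(A, b, rcond=None)[0])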

GRADIENT DESCENT PYTHON IMPLEMENTATION piece of code which finds the local minimum of the function f(x) = x^4 - 3x^3 + 2, with derivative f'(x) = 4x^3 - 9x^2

x_old = 0.0
x_new = 6.0             # the algorithm starts at x = 6
eps = 0.01              # step size
precision = 0.00001

def f_prime(x):
    return 4 * x**3 - 9 * x**2

while abs(x_new - x_old) > precision:
    x_old = x_new
    x_new = x_old - eps * f_prime(x_old)

print("Local minimum occurs at", x_new)
23

GRADIENT DESCENT VS NEWTON the figure illustrates a comparison: Newton's method in red, gradient descent in green; Newton uses curvature (second order) information to take a more direct route 24

CONJUGATE GRADIENT used to solve linear systems whose matrix A is positive definite; it is an iterative method, so it can be applied to large systems for which Cholesky decomposition is not feasible; it can also be used in energy minimisation; two vectors are conjugate if their inner product w.r.t. A is zero (they are orthogonal using the norm associated with A); idea: the solution of the system is also the unique minimiser of the quadratic function f(x) = (1/2) x^T A x - b^T x 25

CONJUGATE GRADIENT - ALGORITHM start with an initial guess x_0; the first search direction is the negative of the gradient of the quadratic function, p_0 = b - Ax_0; the residual is r_k = b - Ax_k; gradient descent would move in the direction of r_k; instead we want the successive search directions p_k to be conjugate w.r.t. A (similar to the Gram-Schmidt procedure); update equations: alpha_k = (r_k^T r_k) / (p_k^T A p_k), x_{k+1} = x_k + alpha_k p_k, r_{k+1} = r_k - alpha_k A p_k, beta_k = (r_{k+1}^T r_{k+1}) / (r_k^T r_k), p_{k+1} = r_{k+1} + beta_k p_k 26

EXAMPLE MATLAB CODE can be easily implemented

function [x] = conjgrad(A, b, x)
    % Conjugate gradient for A*x = b, with A symmetric positive definite
    r = b - A*x;
    p = r;
    rsold = r'*r;
    for i = 1:10^6
        Ap = A*p;
        alpha = rsold / (p'*Ap);
        x = x + alpha*p;
        r = r - alpha*Ap;
        rsnew = r'*r;
        if sqrt(rsnew) < 1e-10
            break;
        end
        p = r + (rsnew/rsold)*p;
        rsold = rsnew;
    end
end
27

CONJUGATE GRADIENT VS GRADIENT DESCENT the figure illustrates a comparison: conjugate gradient in red, gradient descent in green; conjugate gradient converges in at most n steps 28

SUMMARY 29

SUMMARY OF WEEK 8 Nonlinear optimisation topics: least squares (linear in particular), Quadratic Programming (brief), Integer Programming, Nonlinear Programming; iterative methods: Newton-Raphson, Quasi-Newton, gradient descent, conjugate gradient 30