Shiqian Ma, MAT-258A: Numerical Optimization. Chapter 3: Gradient Method
Gradient method

Classical gradient method: to minimize a differentiable convex function $f$,
$$\min_{x \in \mathbb{R}^n} f(x).$$
The algorithm: choose $x_0$ and repeat
$$x_{k+1} = x_k - t_k \nabla f(x_k), \quad k = 0, 1, 2, \ldots$$
Note that $p_k := -\nabla f(x_k)$ is a descent direction (we call $p_k$ a descent direction if $p_k^\top \nabla f(x_k) < 0$).

Question: how to choose the step size $t_k$?
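A minimal sketch of the fixed-step variant (choosing $t = 1/L$ when $L$ is known is a common convention; the step-size rules are discussed next):

```python
import numpy as np

# Gradient method with a fixed step size (a minimal sketch).
# f_grad returns the gradient of f; t = 1/L is a common fixed choice.
def gradient_method(f_grad, x0, t, tol=1e-8, max_iter=1000):
    x = x0.astype(float)
    for _ in range(max_iter):
        g = f_grad(x)
        if np.linalg.norm(g) < tol:   # stop when the gradient is small
            break
        x = x - t * g                 # x_{k+1} = x_k - t * grad f(x_k)
    return x
```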
Step size rules

- exact line search: $t_k = \arg\min_t f(x_k - t \nabla f(x_k))$
- fixed: $t_k$ constant
- backtracking line search (most practical)

Backtracking line search: initialize $t_k$ at some $\hat t > 0$ (for example, $\hat t = 1$), and repeat $t_k := \beta t_k$ until
$$f(x - t_k \nabla f(x)) < f(x) - \alpha t_k \|\nabla f(x)\|_2^2.$$
This is called the Armijo condition. Two parameters: $0 < \beta < 1$ and $0 < \alpha \le 0.5$.
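The backtracking loop translates directly; a sketch (parameter defaults are illustrative, and x, g are NumPy arrays):

```python
# Backtracking line search for the gradient method (Armijo condition).
def backtracking(f, g, x, alpha=0.25, beta=0.5, t_hat=1.0):
    """g = grad f(x). Shrink t until f(x - t*g) < f(x) - alpha*t*||g||^2."""
    t = t_hat
    while f(x - t * g) >= f(x) - alpha * t * (g @ g):
        t *= beta
    return t
```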
General line search

At $x_k$, compute a descent direction $p_k$; what step size should we take to move
$$x_{k+1} = x_k + t_k p_k?$$

- exact line search: $\min_t f(x_k + t p_k)$
- Armijo condition: $f(x_k + t p_k) \le f(x_k) + c_1 t \nabla f(x_k)^\top p_k$
- Wolfe conditions:
$$f(x_k + t p_k) \le f(x_k) + c_1 t \nabla f(x_k)^\top p_k, \qquad \nabla f(x_k + t p_k)^\top p_k \ge c_2 \nabla f(x_k)^\top p_k,$$
with $0 < c_1 < c_2 < 1$.
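A small checker for the Wolfe conditions at a candidate step $t$ (a sketch; the default $c_1$, $c_2$ are common choices, not prescribed by the slide):

```python
# Check the (weak) Wolfe conditions for a candidate step size t along direction p.
def wolfe_ok(f, grad, x, p, t, c1=1e-4, c2=0.9):
    g0 = grad(x)
    armijo = f(x + t*p) <= f(x) + c1 * t * (g0 @ p)      # sufficient decrease
    curvature = grad(x + t*p) @ p >= c2 * (g0 @ p)       # curvature condition
    return armijo and curvature
```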
Analysis of the gradient method

$$x_{k+1} = x_k - t_k \nabla f(x_k), \quad k = 0, 1, 2, \ldots$$
with fixed step size or backtracking line search.

Assumptions:
- $f$ is convex and differentiable with $\operatorname{dom} f = \mathbb{R}^n$
- $\nabla f(x)$ is Lipschitz continuous with parameter $L > 0$
- the optimal value $f^\star = \inf_x f(x)$ is finite and attained at $x^\star$
Lipschitz continuity: a function $h$ is called Lipschitz continuous with Lipschitz constant $L$ if
$$\|h(x) - h(y)\| \le L \|x - y\|, \quad \forall x, y \in \operatorname{dom} h.$$
If $\nabla f(x)$ is Lipschitz continuous with parameter $L > 0$, then (quadratic upper bound)
$$f(y) \le f(x) + \nabla f(x)^\top (y - x) + \frac{L}{2} \|x - y\|_2^2, \quad \forall x, y \in \operatorname{dom} f.$$
If $\operatorname{dom} f = \mathbb{R}^n$ and $f$ has a minimizer $x^\star$, then
$$\frac{1}{2L} \|\nabla f(x)\|_2^2 \le f(x) - f(x^\star) \le \frac{L}{2} \|x - x^\star\|_2^2.$$
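The left-hand inequality follows by minimizing the quadratic upper bound over $y$ (the minimizer is $y = x - \nabla f(x)/L$); a one-line derivation:

```latex
f(x^\star) \;\le\; \min_y f(y)
\;\le\; \min_y \Big( f(x) + \nabla f(x)^\top (y - x) + \tfrac{L}{2}\|y - x\|_2^2 \Big)
\;=\; f(x) - \tfrac{1}{2L}\|\nabla f(x)\|_2^2.
```

Rearranging gives $\frac{1}{2L}\|\nabla f(x)\|_2^2 \le f(x) - f(x^\star)$.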
Strongly convex

$f$ is strongly convex with parameter $\mu > 0$ if $f(x) - \frac{\mu}{2}\|x\|_2^2$ is convex.

First-order condition:
$$f(y) \ge f(x) + \nabla f(x)^\top (y - x) + \frac{\mu}{2}\|x - y\|_2^2, \quad \forall x, y \in \operatorname{dom} f.$$
Second-order condition: $\nabla^2 f(x) \succeq \mu I$ for all $x \in \operatorname{dom} f$.

If $\operatorname{dom} f = \mathbb{R}^n$, then $f$ has a minimizer $x^\star$, and
$$\frac{\mu}{2}\|x - x^\star\|_2^2 \le f(x) - f(x^\star) \le \frac{1}{2\mu}\|\nabla f(x)\|_2^2.$$
Analysis of constant step size

Recall the quadratic upper bound:
$$f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + \frac{L}{2}\|y - x\|_2^2.$$
Plug in $y = x - t\nabla f(x)$ to obtain
$$f(x - t\nabla f(x)) \le f(x) - t\Big(1 - \frac{Lt}{2}\Big)\|\nabla f(x)\|_2^2.$$
Let $x^+ = x - t\nabla f(x)$ and assume $0 < t \le 1/L$. Then, using convexity ($f(x) \le f^\star + \langle \nabla f(x), x - x^\star \rangle$) in the second step,
$$\begin{aligned}
f(x^+) &\le f(x) - \frac{t}{2}\|\nabla f(x)\|_2^2 \\
&\le f^\star + \langle \nabla f(x), x - x^\star \rangle - \frac{t}{2}\|\nabla f(x)\|_2^2 \\
&= f^\star + \frac{1}{2t}\left(\|x - x^\star\|_2^2 - \|x - x^\star - t\nabla f(x)\|_2^2\right) \\
&= f^\star + \frac{1}{2t}\left(\|x - x^\star\|_2^2 - \|x^+ - x^\star\|_2^2\right).
\end{aligned}$$
Take $x = x_{i-1}$, $x^+ = x_i$, $t_i = t$, and sum the bounds for $i = 1, \ldots, k$:
$$\sum_{i=1}^k \left(f(x_i) - f^\star\right) \le \frac{1}{2t}\sum_{i=1}^k \left(\|x_{i-1} - x^\star\|_2^2 - \|x_i - x^\star\|_2^2\right) = \frac{1}{2t}\left(\|x_0 - x^\star\|_2^2 - \|x_k - x^\star\|_2^2\right) \le \frac{1}{2t}\|x_0 - x^\star\|_2^2.$$
Since $f(x_i)$ is non-increasing,
$$f(x_k) - f^\star \le \frac{1}{k}\sum_{i=1}^k \left(f(x_i) - f^\star\right) \le \frac{1}{2kt}\|x_0 - x^\star\|_2^2.$$
Conclusion: the number of iterations to reach $f(x_k) - f^\star \le \epsilon$ is $O(1/\epsilon)$.
Analysis for strongly convex functions

A faster convergence rate holds under the additional assumption of strong convexity. Analysis for exact line search: recall from the quadratic upper bound
$$f(x - t\nabla f(x)) \le f(x) - t\Big(1 - \frac{Lt}{2}\Big)\|\nabla f(x)\|_2^2.$$
Let $x^+$ be the result of exact line search, $x^+ = x - t^\star \nabla f(x)$ with $t^\star = \arg\min_t f(x - t\nabla f(x))$; evaluating the bound at $t = 1/L$ gives
$$f(x^+) \le f\Big(x - \frac{1}{L}\nabla f(x)\Big) \le f(x) - \frac{1}{2L}\|\nabla f(x)\|_2^2.$$
Subtract $f^\star$ from both sides:
$$f(x^+) - f^\star \le f(x) - f^\star - \frac{1}{2L}\|\nabla f(x)\|_2^2.$$
Now use strong convexity, $f(x) - f^\star \le \frac{1}{2\mu}\|\nabla f(x)\|_2^2$:
$$f(x^+) - f^\star \le \Big(1 - \frac{\mu}{L}\Big)(f(x) - f^\star).$$
Therefore
$$f(x_k) - f^\star \le \Big(1 - \frac{\mu}{L}\Big)^k (f(x_0) - f^\star).$$
Conclusion: the number of iterations to reach $f(x_k) - f^\star \le \epsilon$ is
$$\frac{\log\left((f(x_0) - f^\star)/\epsilon\right)}{-\log(1 - \mu/L)} \approx \frac{L}{\mu}\log\left(\frac{f(x_0) - f^\star}{\epsilon}\right),$$
roughly proportional to the condition number $L/\mu$ when it is large.
A quadratic example

$$f(x) = \frac{1}{2}\left(x_1^2 + \gamma x_2^2\right), \quad \gamma > 1.$$
With exact line search, starting at $x_0 = (\gamma, 1)$:
$$f(x_k) = \left(\frac{\gamma - 1}{\gamma + 1}\right)^{2k} f(x_0), \qquad \|x_k - x^\star\|_2 = \left(\frac{\gamma - 1}{\gamma + 1}\right)^k \|x_0 - x^\star\|_2.$$
If $\gamma = 10^4$ and $k = 100$, then $\left(\frac{\gamma - 1}{\gamma + 1}\right)^k \approx 0.98$: the gradient method can be very slow, and is very much dependent on scaling.
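A quick numerical check of the closed-form rate (a sketch, using a modest $\gamma$; for a quadratic $f(x) = \frac{1}{2}x^\top Q x$, the exact line search step is $t = g^\top g / g^\top Q g$):

```python
import numpy as np

# Gradient descent with exact line search on f(x) = 0.5*(x1^2 + gamma*x2^2).
gamma = 10.0
Q = np.diag([1.0, gamma])
x = np.array([gamma, 1.0])         # starting point x0 = (gamma, 1)

f = lambda x: 0.5 * x @ Q @ x
f0 = f(x)
K = 100
for k in range(K):
    g = Q @ x                      # gradient of the quadratic
    t = (g @ g) / (g @ Q @ g)      # exact line search step
    x = x - t * g

# Compare the observed decrease with the predicted rate ((gamma-1)/(gamma+1))^(2K).
rate = ((gamma - 1) / (gamma + 1)) ** (2 * K)
print(f(x) / f0, rate)             # the two should agree closely
```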
(Figure omitted: contour plot of the quadratic example.) The more eccentric the contours are, the slower the gradient method.
A non-quadratic example

$$f(x_1, x_2) = e^{x_1 + 3x_2 - 0.1} + e^{x_1 - 3x_2 - 0.1} + e^{-x_1 - 0.1}.$$
Its gradient is
$$\nabla f(x_1, x_2) = \begin{pmatrix} e^{x_1 + 3x_2 - 0.1} + e^{x_1 - 3x_2 - 0.1} - e^{-x_1 - 0.1} \\ 3e^{x_1 + 3x_2 - 0.1} - 3e^{x_1 - 3x_2 - 0.1} \end{pmatrix}.$$
Run the gradient method with backtracking line search ($\alpha = 0.1$, $\beta = 0.7$). (Figure omitted: upper plot shows $f(x_k) - f^\star$, lower plot shows $\|\nabla f(x_k)\|_2$, both versus $k$.)
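A sketch reproducing this experiment (the starting point below is an assumption; the slide does not state it):

```python
import numpy as np

# f and its gradient for the non-quadratic example above.
def f(x):
    return (np.exp(x[0] + 3*x[1] - 0.1) + np.exp(x[0] - 3*x[1] - 0.1)
            + np.exp(-x[0] - 0.1))

def grad(x):
    a = np.exp(x[0] + 3*x[1] - 0.1)
    b = np.exp(x[0] - 3*x[1] - 0.1)
    c = np.exp(-x[0] - 0.1)
    return np.array([a + b - c, 3*a - 3*b])

# Gradient method with backtracking (alpha = 0.1, beta = 0.7), as on the slide.
alpha, beta = 0.1, 0.7
x = np.array([-1.0, 1.0])          # starting point is an assumption
for k in range(100):
    g = grad(x)
    t = 1.0
    while f(x - t*g) >= f(x) - alpha * t * (g @ g):
        t *= beta                  # backtrack until the Armijo condition holds
    x = x - t*g
print(x, f(x), np.linalg.norm(grad(x)))
```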
Convergence rate

- sublinear rate: $r_k \le c/k^p$
- linear rate: $r_k \le c(1 - q)^k$
- quadratic rate: $r_{k+1} \le c\, r_k^2$

Here $r_k$ can be $f(x_k) - f^\star$, $\|x_k - x^\star\|_2$, or $\|\nabla f(x_k)\|_2$; $c$ is some constant.
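As a worked comparison (not on the slide), the number of iterations needed to drive $r_k$ below $\epsilon$ under each rate:

```latex
\begin{aligned}
\text{sublinear: } & c/k^p \le \epsilon \iff k \ge (c/\epsilon)^{1/p}
  && \text{(polynomial in } 1/\epsilon\text{)}, \\
\text{linear: } & c(1-q)^k \le \epsilon \iff k \ge \frac{\log(c/\epsilon)}{-\log(1-q)}
  && \text{(logarithmic in } 1/\epsilon\text{)}, \\
\text{quadratic: } & c\,r_k \le (c\,r_0)^{2^k} \;\Rightarrow\; k = O(\log\log(1/\epsilon))
  && \text{once } c\,r_0 < 1.
\end{aligned}
```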
Newton's method

Assume $f(x)$ is twice continuously differentiable and convex. Given $x_k$, use a quadratic function to approximate $f(x)$ locally via Taylor expansion:
$$f(x_k + p_k) \approx f(x_k) + \nabla f(x_k)^\top p_k + \frac{1}{2} p_k^\top \nabla^2 f(x_k) p_k.$$
Choose $p_k$ so that this quadratic approximation is minimized:
$$\nabla f(x_k) + \nabla^2 f(x_k) p_k = 0.$$
(Pure) Newton method:
$$x_{k+1} = x_k - \nabla^2 f(x_k)^{-1} \nabla f(x_k).$$
Damped Newton method:
$$x_{k+1} = x_k - t_k \nabla^2 f(x_k)^{-1} \nabla f(x_k).$$
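A minimal sketch of the damped Newton method, assuming callables f, grad, and hess are available (the line-search parameters are illustrative; the Newton system is solved rather than inverting the Hessian):

```python
import numpy as np

# Damped Newton method (a minimal sketch): solve H p = -g instead of forming H^{-1}.
def damped_newton(f, grad, hess, x0, alpha=0.25, beta=0.5, tol=1e-8, max_iter=50):
    x = x0.astype(float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        p = np.linalg.solve(hess(x), -g)    # Newton direction
        t = 1.0                             # try the full Newton step first
        while f(x + t*p) > f(x) + alpha * t * (g @ p):
            t *= beta                       # backtrack (Armijo condition)
        x = x + t*p
    return x
```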
Advantages: fast convergence, affine invariance.

Affine invariance means independence of linear changes of coordinates: for example, the Newton iterates for $\tilde f(y) = f(Ty)$ (with $T$ nonsingular) with starting point $y_0 = T^{-1} x_0$ are $y_k = T^{-1} x_k$.

Disadvantages: requires second derivatives and the solution of a linear system, which can be too expensive for large-scale applications.
Quasi-Newton method

It is too expensive to compute $\nabla^2 f(x_k)$, so use some $B_k$ to replace it. Use a quadratic model to approximate $f(x)$ locally at $x_k$:
$$m_k(p) = f(x_k) + \nabla f(x_k)^\top p + \frac{1}{2} p^\top B_k p,$$
where $B_k$ is an $n \times n$ symmetric positive definite matrix. Note that the function value and gradient of this model at $p = 0$ match $f(x_k)$ and $\nabla f(x_k)$, respectively.

By minimizing this quadratic approximation, we obtain
$$p_k = -B_k^{-1} \nabla f(x_k),$$
and then we update the iterate via
$$x_{k+1} = x_k + \alpha_k p_k.$$
How do we compute $B_k$? When we are at $x_{k+1}$, we want to construct
$$m_{k+1}(p) = f(x_{k+1}) + \nabla f(x_{k+1})^\top p + \frac{1}{2} p^\top B_{k+1} p.$$
We want the gradient of $m_{k+1}$ to match the gradient of $f$ at $x_k$ and $x_{k+1}$:
$$\nabla m_{k+1}(0) = \nabla f(x_{k+1}),$$
and
$$\nabla m_{k+1}(-\alpha_k p_k) = \nabla f(x_{k+1}) - \alpha_k B_{k+1} p_k = \nabla f(x_k),$$
so we have
$$B_{k+1} \alpha_k p_k = \nabla f(x_{k+1}) - \nabla f(x_k).$$
To simplify the notation, define
$$s_k = x_{k+1} - x_k = \alpha_k p_k, \qquad y_k = \nabla f(x_{k+1}) - \nabla f(x_k).$$
Then we get
$$B_{k+1} s_k = y_k,$$
which is called the secant equation.
DFP and BFGS

To compute $B_{k+1}$, we solve
$$\min_B \|B - B_k\| \quad \text{s.t. } B = B^\top, \; B s_k = y_k.$$
This gives the following DFP updating formula (originally given by Davidon in 1959, and subsequently studied by Fletcher and Powell):
$$B_{k+1} = (I - \rho_k y_k s_k^\top) B_k (I - \rho_k s_k y_k^\top) + \rho_k y_k y_k^\top, \qquad \rho_k = \frac{1}{y_k^\top s_k}.$$
The other way to compute $B_{k+1}$: denote its inverse by $H_{k+1}$ and solve the following problem:
$$\min_H \|H - H_k\| \quad \text{s.t. } H = H^\top, \; H y_k = s_k.$$
This gives the following BFGS updating formula (proposed by Broyden, Fletcher, Goldfarb and Shanno, independently):
$$H_{k+1} = (I - \rho_k s_k y_k^\top) H_k (I - \rho_k y_k s_k^\top) + \rho_k s_k s_k^\top,$$
or, by the Sherman-Morrison-Woodbury formula,
$$B_{k+1} = B_k - \frac{B_k s_k s_k^\top B_k}{s_k^\top B_k s_k} + \frac{y_k y_k^\top}{y_k^\top s_k}.$$
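In code, the BFGS update of the inverse approximation $H_k$ is a rank-two modification; a minimal sketch:

```python
import numpy as np

# BFGS update of the inverse Hessian approximation H (a minimal sketch).
def bfgs_update(H, s, y):
    """Return H_{k+1} = (I - rho*s*y') H (I - rho*y*s') + rho*s*s'."""
    rho = 1.0 / (y @ s)            # requires the curvature condition y's > 0
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)
```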
The complete BFGS method

Given an initial point $x_0$ and $B_0 \succ 0$, repeat for $k = 0, 1, 2, \ldots$ until a stopping criterion is satisfied:

- compute the quasi-Newton direction $p_k = -B_k^{-1} \nabla f(x_k)$
- determine a step size $t_k$ (via backtracking line search)
- update $x_{k+1} = x_k + t_k p_k$ and compute $\nabla f(x_{k+1})$
- compute $B_{k+1}$
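Putting the pieces together, a sketch of the full loop that maintains the inverse approximation $H_k$ so the direction is a matrix-vector product (it reuses bfgs_update from the previous sketch; the tolerance and Armijo parameters are illustrative assumptions):

```python
import numpy as np

def bfgs(f, grad, x0, alpha=1e-4, beta=0.5, tol=1e-8, max_iter=200):
    x = x0.astype(float)
    H = np.eye(len(x))                     # H_0 = I approximates the inverse Hessian
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g                          # quasi-Newton direction
        t = 1.0
        while f(x + t*p) > f(x) + alpha * t * (g @ p):
            t *= beta                       # backtracking line search (Armijo)
        s = t * p
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        if y @ s > 1e-12:                   # curvature condition; skip update otherwise
            H = bfgs_update(H, s, y)        # BFGS update from the previous sketch
        x, g = x_new, g_new
    return x
```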
Convergence of the BFGS method

Global convergence: if $f$ is strongly convex, then BFGS with backtracking line search converges to the optimum for any $x_0$ and $B_0 \succ 0$.

Local convergence: if $f$ is strongly convex and $\nabla^2 f(x)$ is Lipschitz continuous, then local convergence is superlinear: for sufficiently large $k$,
$$\|x_{k+1} - x^\star\|_2 \le c_k \|x_k - x^\star\|_2,$$
where $c_k \to 0$.