Optimization 2. CS5240 Theoretical Foundations in Multimedia. Leow Wee Kheng

1 Optimization 2
CS5240 Theoretical Foundations in Multimedia
Leow Wee Kheng
Department of Computer Science, School of Computing, National University of Singapore

2 Last Time...
Gradient descent updates the variables by

    x_{k+1} = x_k - \alpha \nabla f(x_k).    (1)

Gradient descent can be slow. To increase the convergence rate, find a good step length \alpha using line search.
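As a quick refresher, here is a minimal gradient-descent sketch in Python. It is not from the slides; NumPy is assumed, and the objective f and its gradient grad_f are hypothetical caller-supplied functions. The fixed step length alpha is exactly the quantity the line-search methods below try to choose well.

```python
import numpy as np

def gradient_descent(f, grad_f, x0, alpha=0.1, max_iter=1000, tol=1e-8):
    """Minimize f by x_{k+1} = x_k - alpha * grad f(x_k) with a fixed step length."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:      # stop when the gradient is (nearly) zero
            break
        x = x - alpha * g
    return x

# Example: minimize f(x) = x1^2 + 10*x2^2, whose minimum is at the origin.
f = lambda x: x[0]**2 + 10 * x[1]**2
grad_f = lambda x: np.array([2 * x[0], 20 * x[1]])
print(gradient_descent(f, grad_f, [3.0, -2.0], alpha=0.05))
```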

3 Line Search
Gradient descent minimizes f(x) by updating x_k along -\nabla f(x_k):

    x_{k+1} = x_k - \alpha \nabla f(x_k).    (2)

In general, other descent directions p_k are possible if \nabla f(x_k)^\top p_k < 0. Then, update x_k according to

    x_{k+1} = x_k + \alpha p_k.    (3)

Ideally, want to find \alpha > 0 that minimizes f(x_k + \alpha p_k). Too expensive.
In practice, find \alpha > 0 that reduces f(x) sufficiently (not by too small an amount).
The simple condition f(x_k + \alpha p_k) < f(x_k) may not be sufficient.

4 Line Search
A popular sufficient line search condition is the Armijo condition:

    f(x_k + \alpha p_k) \le f(x_k) + c_1 \alpha \nabla f(x_k)^\top p_k,    (4)

where c_1 > 0 is usually quite small, say c_1 = 10^{-4}.
The Armijo condition may accept very small \alpha. So, add another condition to ensure sufficient reduction of f(x):

    \nabla f(x_k + \alpha_k p_k)^\top p_k \ge c_2 \nabla f(x_k)^\top p_k,    (5)

with 0 < c_1 < c_2 < 1, typically c_2 = 0.9.
These two conditions together are called the Wolfe conditions.
Known result: gradient descent will find a local minimum if \alpha satisfies the Wolfe conditions.
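A small helper, sketched under the slide's notation and standard default constants, that tests whether a trial step length alpha satisfies conditions (4) and (5); f and grad_f are hypothetical caller-supplied functions.

```python
import numpy as np

def satisfies_wolfe(f, grad_f, x, p, alpha, c1=1e-4, c2=0.9):
    """Return True if alpha satisfies both Wolfe conditions along direction p."""
    g0 = grad_f(x)
    slope0 = g0 @ p                    # directional derivative at x; must be < 0 for descent
    x_new = x + alpha * p
    armijo = f(x_new) <= f(x) + c1 * alpha * slope0        # sufficient decrease, Eq. (4)
    curvature = grad_f(x_new) @ p >= c2 * slope0           # curvature condition, Eq. (5)
    return armijo and curvature
```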

5 Line Search
Another way is to bound \alpha from below, giving the Goldstein conditions:

    f(x_k) + (1 - c) \alpha \nabla f(x_k)^\top p_k \le f(x_k + \alpha p_k) \le f(x_k) + c \alpha \nabla f(x_k)^\top p_k.    (6)

Another simple way is to start with a large \alpha and reduce it iteratively:

Backtracking Line Search
1. Choose large \alpha > 0 and 0 < \rho < 1.
2. Repeat until the Armijo condition is met:
   2.1 \alpha \leftarrow \rho \alpha.

In this case, the Armijo condition alone is enough.
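The backtracking procedure above as a short Python sketch. Assumptions: NumPy vectors, and the caller supplies f, the gradient value g already evaluated at x, and a descent direction p.

```python
def backtracking_line_search(f, x, g, p, alpha0=1.0, rho=0.5, c1=1e-4):
    """Shrink alpha by factor rho until the Armijo condition holds."""
    alpha = alpha0
    slope = g @ p                 # must be negative, since p is a descent direction
    while f(x + alpha * p) > f(x) + c1 * alpha * slope:
        alpha *= rho              # step 2.1: alpha <- rho * alpha
    return alpha
```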

6 Gauss-Newton Method
[Portraits: Carl Friedrich Gauss, Sir Isaac Newton]

7 Gauss-Newton Method
The Gauss-Newton method estimates the Hessian H using 1st-order derivatives.
Express E(a) as the sum of squares of r_i(a),

    E(a) = \sum_{i=1}^n r_i(a)^2,    (7)

where r_i is the error or residual of each data point x_i:

    r_i(a) = v_i - f(x_i; a),    (8)

and

    r = [r_1 \cdots r_n]^\top.    (9)

Then,

    \frac{\partial E}{\partial a_j} = 2 \sum_{i=1}^n r_i \frac{\partial r_i}{\partial a_j},    (10)

    \frac{\partial^2 E}{\partial a_j \partial a_l}
      = 2 \sum_{i=1}^n \left[ \frac{\partial r_i}{\partial a_j} \frac{\partial r_i}{\partial a_l}
        + r_i \frac{\partial^2 r_i}{\partial a_j \partial a_l} \right]
      \approx 2 \sum_{i=1}^n \frac{\partial r_i}{\partial a_j} \frac{\partial r_i}{\partial a_l}.    (11)

8 Gauss-Newton Method
The derivative \partial r_i / \partial a_j is the (i,j)-th element of the Jacobian J_r of r:

    J_r = \begin{bmatrix}
            \partial r_1 / \partial a_1 & \cdots & \partial r_1 / \partial a_m \\
            \vdots & \ddots & \vdots \\
            \partial r_n / \partial a_1 & \cdots & \partial r_n / \partial a_m
          \end{bmatrix}.    (12)

The gradient and Hessian are related to the Jacobian by

    \nabla E = 2 J_r^\top r,    (13)
    H \approx 2 J_r^\top J_r.    (14)

So, the update equation becomes

    a_{k+1} = a_k - (J_r^\top J_r)^{-1} J_r^\top r.    (15)

9 Gauss-Newton Method
We can also use the Jacobian J_f of f(x_i; a):

    J_f = \begin{bmatrix}
            \partial f(x_1) / \partial a_1 & \cdots & \partial f(x_1) / \partial a_m \\
            \vdots & \ddots & \vdots \\
            \partial f(x_n) / \partial a_1 & \cdots & \partial f(x_n) / \partial a_m
          \end{bmatrix}.    (16)

Then, J_f = -J_r, and the update equation becomes

    a_{k+1} = a_k + (J_f^\top J_f)^{-1} J_f^\top r.    (17)
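A Gauss-Newton sketch of update (17), under the assumption that the caller provides the model f(x; a), the data (x_i, v_i), and the Jacobian J_f of the model with respect to the parameters a. Solving the normal equations with numpy.linalg.solve stands in for forming the inverse explicitly; the example fit is illustrative only.

```python
import numpy as np

def gauss_newton(model, jac_f, X, v, a0, max_iter=50, tol=1e-8):
    """Fit parameters a by minimizing the sum of squared residuals r_i = v_i - f(x_i; a)."""
    a = np.asarray(a0, dtype=float)
    for _ in range(max_iter):
        r = v - model(X, a)                  # residual vector, shape (n,)
        J = jac_f(X, a)                      # Jacobian J_f, shape (n, m)
        # Solve (J^T J) da = J^T r instead of inverting J^T J.
        da = np.linalg.solve(J.T @ J, J.T @ r)
        a = a + da
        if np.linalg.norm(da) < tol:
            break
    return a

# Example: fit an exponential decay v = a0 * exp(-a1 * x) to noiseless data.
model = lambda X, a: a[0] * np.exp(-a[1] * X)
jac_f = lambda X, a: np.column_stack([np.exp(-a[1] * X),
                                      -a[0] * X * np.exp(-a[1] * X)])
X = np.linspace(0, 4, 50)
v = 2.5 * np.exp(-1.3 * X)
print(gauss_newton(model, jac_f, X, v, a0=[1.0, 1.0]))   # close to (2.5, 1.3)
```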

10 Gauss-Newton Method
Compare to linear fitting:

    linear fitting      a = (D^\top D)^{-1} D^\top v
    Gauss-Newton        \Delta a = (J_f^\top J_f)^{-1} J_f^\top r

(J_f^\top J_f)^{-1} J_f^\top is analogous to the pseudo-inverse.

Compare to gradient descent:

    gradient descent    \Delta a = -\alpha \nabla E
    Gauss-Newton        \Delta a = -\frac{1}{2} (J_r^\top J_r)^{-1} \nabla E

Gauss-Newton updates a along the negative gradient at a variable rate.
The Gauss-Newton method may converge slowly or diverge if the initial guess a(0) is far from the minimum, or the matrix J^\top J is ill-conditioned.

11 Gauss-Newton Method: Condition Number
The condition number of a (real) matrix D is given by the ratio of its singular values:

    \kappa(D) = \frac{\sigma_{\max}(D)}{\sigma_{\min}(D)}.    (18)

If D is normal, i.e., D^\top D = D D^\top, then its condition number is given by the ratio of its largest and smallest eigenvalue magnitudes:

    \kappa(D) = \frac{|\lambda|_{\max}(D)}{|\lambda|_{\min}(D)}.    (19)

A large condition number means the elements of D are less balanced.
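A quick numerical check of Eq. (18), assuming NumPy; numpy.linalg.cond computes the same ratio of singular values, and the matrix below is just a made-up unbalanced example.

```python
import numpy as np

D = np.array([[1.0, 0.0],
              [0.0, 1e-6]])               # very unbalanced diagonal entries

s = np.linalg.svd(D, compute_uv=False)    # singular values, largest first
print(s[0] / s[-1])                       # kappa(D) = sigma_max / sigma_min = 1e6
print(np.linalg.cond(D))                  # the same value via the library routine
```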

12 Gauss-Newton Method: Condition Number
For the linear equation Da = v, the condition number indicates the change of the solution a vs. the change of v.

Small condition number:
  Small change in v gives small change in a: stable.
  Small error in v gives small error in a.
  D is well-conditioned.

Large condition number:
  Small change in v gives large change in a: unstable.
  Small error in v gives large error in a.
  D is ill-conditioned.

13 Gauss-Newton Method: Practical Issues
The update equation, with J = J_f,

    a_{k+1} = a_k + (J^\top J)^{-1} J^\top r,    (20)

requires computation of a matrix inverse, which can be expensive.
Can use singular value decomposition (SVD) to help. The SVD of J is

    J = U \Sigma V^\top.    (21)

Substituting it into the update equation yields (Homework)

    a_{k+1} = a_k + V \Sigma^{-1} U^\top r.    (22)
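One Gauss-Newton step written with the SVD as in Eq. (22); a sketch assuming NumPy, with J and r already computed and J having full column rank.

```python
import numpy as np

def gauss_newton_step_svd(J, r):
    """Compute da = V Sigma^{-1} U^T r using the thin SVD of J (full column rank assumed)."""
    U, s, Vt = np.linalg.svd(J, full_matrices=False)   # J = U Sigma V^T
    return Vt.T @ ((U.T @ r) / s)                      # V Sigma^{-1} U^T r
```

Dividing by s elementwise applies Sigma^{-1}; very small singular values signal ill-conditioning and can be truncated in practice.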

14 Gauss-Newton Method: Practical Issues
Convergence criteria: same as gradient descent,

    \frac{|E_{k+1} - E_k|}{E_k} < \epsilon,
    \frac{|a_{j,k+1} - a_{j,k}|}{|a_{j,k}|} < \epsilon, for each component j,

for some small \epsilon.
If divergence occurs, can introduce a step length \alpha:

    a_{k+1} = a_k + \alpha (J^\top J)^{-1} J^\top r,    (23)

where 0 < \alpha < 1.
Is there an alternative way to control convergence? Yes: the Levenberg-Marquardt method.

15 Levenberg-Marquardt Method
The Gauss-Newton method updates a according to

    (J^\top J) \Delta a = J^\top r.    (24)

Levenberg modifies the update equation to

    (J^\top J + \lambda I) \Delta a = J^\top r.    (25)

\lambda is adjusted at each iteration.
If the reduction of error E is large, can use small \lambda: more similar to Gauss-Newton.
If the reduction of error E is small, can use large \lambda: more similar to gradient descent.

16 Levenberg-Marquardt Method
If \lambda is very large, (J^\top J + \lambda I) \approx \lambda I, i.e., just doing gradient descent.
Marquardt scales each component of the gradient:

    (J^\top J + \lambda \, \mathrm{diag}(J^\top J)) \Delta a = J^\top r,    (26)

where diag(J^\top J) is a diagonal matrix of the diagonal elements of J^\top J.
The (i,i)-th component of diag(J^\top J) is

    \sum_{l=1}^n \left( \frac{\partial f(x_l)}{\partial a_i} \right)^2.    (27)

If the (i,i)-th component of the gradient is small, a_i is updated more: faster convergence along the i-th direction. Then, \lambda doesn't have to be too large.
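A compact Levenberg-Marquardt sketch using Marquardt's scaling (26). The adaptive lambda schedule (multiply or divide by 10) is a common heuristic, not prescribed by the slides; model and jac_f follow the same hypothetical conventions as the Gauss-Newton sketch above.

```python
import numpy as np

def levenberg_marquardt(model, jac_f, X, v, a0, lam=1e-3, max_iter=100, tol=1e-10):
    """Minimize E(a) = ||v - f(X; a)||^2 with adaptively damped Gauss-Newton steps."""
    a = np.asarray(a0, dtype=float)
    r = v - model(X, a)
    E = r @ r
    for _ in range(max_iter):
        J = jac_f(X, a)
        A = J.T @ J
        da = np.linalg.solve(A + lam * np.diag(np.diag(A)), J.T @ r)   # Eq. (26)
        r_new = v - model(X, a + da)
        E_new = r_new @ r_new
        if E_new < E:        # error reduced: accept the step, behave more like Gauss-Newton
            a, r, E = a + da, r_new, E_new
            lam *= 0.1
        else:                # error increased: reject the step, behave more like gradient descent
            lam *= 10.0
        if np.linalg.norm(da) < tol:
            break
    return a
```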

17 Quasi-Newton
Quasi-Newton also minimizes f(x) by iteratively updating x:

    x_{k+1} = x_k + \alpha p_k,    (28)

where the descent direction p_k is

    p_k = -H_k^{-1} \nabla f(x_k).    (29)

There are many methods of estimating H^{-1}. The most effective is the method by Broyden, Fletcher, Goldfarb and Shanno (BFGS).

18 Quasi-Newton: BFGS Method
Let \Delta x_k = x_{k+1} - x_k and y_k = \nabla f(x_{k+1}) - \nabla f(x_k). Then, estimate H_{k+1} or H_{k+1}^{-1} by

    H_{k+1} = H_k - \frac{H_k \Delta x_k \Delta x_k^\top H_k}{\Delta x_k^\top H_k \Delta x_k}
              + \frac{y_k y_k^\top}{y_k^\top \Delta x_k},    (30)

    H_{k+1}^{-1} = \left( I - \frac{\Delta x_k y_k^\top}{y_k^\top \Delta x_k} \right) H_k^{-1}
                   \left( I - \frac{y_k \Delta x_k^\top}{y_k^\top \Delta x_k} \right)
                   + \frac{\Delta x_k \Delta x_k^\top}{y_k^\top \Delta x_k}.    (31)

(For the derivation, please refer to [1].)

19 Quasi-Newton: BFGS Algorithm
1. Set \alpha and initialize x_0 and H_0^{-1}.
2. For k = 0, 1, ... until convergence:
   2.1 Compute search direction p_k = -H_k^{-1} \nabla f(x_k).
   2.2 Perform line search to find \alpha that satisfies the Wolfe conditions.
   2.3 Compute \Delta x_k = \alpha p_k, and set x_{k+1} = x_k + \Delta x_k.
   2.4 Compute y_k = \nabla f(x_{k+1}) - \nabla f(x_k).
   2.5 Compute H_{k+1}^{-1} by Eq. 31.

No standard way to initialize H. Can try H = I.
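A BFGS sketch following the algorithm above. For brevity it uses an Armijo-only backtracking line search instead of a full Wolfe search, so a safeguard skips the update of Eq. (31) whenever y_k^T Delta x_k is not positive; f and grad_f are hypothetical caller-supplied functions.

```python
import numpy as np

def bfgs(f, grad_f, x0, max_iter=200, tol=1e-8):
    """Quasi-Newton minimization with the inverse-Hessian update of Eq. (31)."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    Hinv = np.eye(n)                      # no standard initialization; try H = I
    g = grad_f(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        p = -Hinv @ g                     # step 2.1: search direction
        # Step 2.2: backtracking (Armijo-only) line search stands in for a Wolfe search.
        alpha, slope = 1.0, g @ p
        while f(x + alpha * p) > f(x) + 1e-4 * alpha * slope:
            alpha *= 0.5
        dx = alpha * p                    # step 2.3
        x_new = x + dx
        g_new = grad_f(x_new)
        y = g_new - g                     # step 2.4
        ys = y @ dx
        if ys > 1e-12:                    # step 2.5: Eq. (31), skipped if y^T dx is not positive
            I = np.eye(n)
            Hinv = (I - np.outer(dx, y) / ys) @ Hinv @ (I - np.outer(y, dx) / ys) \
                   + np.outer(dx, dx) / ys
        x, g = x_new, g_new
    return x
```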

20 Approximation of Gradient
All previous methods require the gradient \nabla f (or \nabla E). In many complex problems, it is very difficult to derive \nabla f.
Several methods to overcome this problem:
  Use symbolic differentiation to derive a math expression of \nabla f. Provided in Matlab, Maple, etc.
  Approximate the gradient numerically.
  Optimize without the gradient.

21 Approximation of Gradient: Forward Difference
From the Taylor series expansion,

    f(x + \Delta x) = f(x) + \nabla f(x)^\top \Delta x + \cdots    (32)

Denote the j-th component of \Delta x as \epsilon e_j, where e_j is the unit vector of the j-th dimension. Then, the j-th component of \nabla f(x) is

    \frac{\partial f(x)}{\partial x_j} \approx \frac{f(x + \epsilon e_j) - f(x)}{\epsilon}.    (33)

Note that x + \epsilon e_j updates only the j-th component:

    x + \epsilon e_j = [x_1 \cdots x_{j-1} \; x_j \; x_{j+1} \cdots x_m]^\top
                       + [0 \cdots 0 \; \epsilon \; 0 \cdots 0]^\top.    (34)

Then,

    \nabla f(x) \approx \left[ \frac{\partial f(x)}{\partial x_1} \cdots \frac{\partial f(x)}{\partial x_m} \right]^\top.    (35)

22 Approximation of Gradient: Central Difference
The forward difference method has a forward bias: x + \epsilon e_j. Error from the forward bias can accumulate quickly.
To be less biased, use the central difference:

    f(x + \epsilon e_j) \approx f(x) + \epsilon \frac{\partial f(x)}{\partial x_j},
    f(x - \epsilon e_j) \approx f(x) - \epsilon \frac{\partial f(x)}{\partial x_j}.

So,

    \frac{\partial f(x)}{\partial x_j} \approx \frac{f(x + \epsilon e_j) - f(x - \epsilon e_j)}{2\epsilon}.    (36)
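Forward- and central-difference gradient approximations, Eqs. (33) and (36), as a small sketch assuming NumPy. The default epsilon values are common choices, not prescriptions from the slides.

```python
import numpy as np

def grad_forward(f, x, eps=1e-7):
    """Approximate grad f(x) with forward differences, Eq. (33)."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    fx = f(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = eps                      # perturb only the j-th component
        g[j] = (f(x + e) - fx) / eps
    return g

def grad_central(f, x, eps=1e-5):
    """Approximate grad f(x) with central differences, Eq. (36)."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = eps
        g[j] = (f(x + e) - f(x - e)) / (2 * eps)
    return g
```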

23 Optimization without Gradient
A simple idea is to descend along each coordinate direction iteratively.
Let e_1, ..., e_m denote the unit vectors along coordinates 1, ..., m.

Coordinate Descent
1. Choose initial x_0.
2. Repeat until convergence:
   2.1 For each j = 1, ..., m:
       2.1.1 Find x_{k+1} along e_j that minimizes f(x_k).

24 Optimization without Gradient
Alternative for Step 2.1.1:
  Find x_{k+1} along e_j using line search to reduce f(x_k) sufficiently.
Strength: Simplicity.
Weakness: Can be slow. Can iterate around the minimum, never approaching it.

Powell's Method
Overcomes the problem of coordinate descent.
Basic idea: choose directions that do not interfere with or undo previous effort.
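A coordinate-descent sketch of the algorithm on the previous slide (not Powell's method), assuming SciPy's scalar minimizer handles the one-dimensional minimization along each e_j.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def coordinate_descent(f, x0, max_sweeps=100, tol=1e-8):
    """Minimize f by 1-D minimization along each coordinate direction in turn."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_sweeps):
        x_old = x.copy()
        for j in range(x.size):
            # Minimize f along e_j while all other components stay fixed.
            def f_1d(t, j=j):
                y = x.copy()
                y[j] = t
                return f(y)
            x[j] = minimize_scalar(f_1d).x
        if np.linalg.norm(x - x_old) < tol:    # a full sweep changed x very little
            break
    return x
```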

25 Alternating Optimization
Let S \subseteq R^m denote the set of x that satisfy the constraints. Then, the constrained optimization problem can be written as:

    \min_{x \in S} f(x).    (37)

Alternating optimization (AO) is similar to coordinate descent except that the variables x_j, j = 1, ..., m, are split into groups.
Partition x \in S into p non-overlapping parts y_l such that

    x = (y_1, ..., y_l, ..., y_p).    (38)

Example: x = (x_1, x_2, x_4, x_3, x_7, x_5, x_6, x_8, x_9), grouped as y_1 = (x_1, x_2, x_4), y_2 = (x_3, x_7), y_3 = (x_5, x_6, x_8, x_9).
Denote S_l \subseteq S as the set that contains y_l.

26 Alternating Optimization
Alternating Optimization
1. Choose initial x^{(0)} = (y_1^{(0)}, ..., y_p^{(0)}).
2. For k = 0, 1, 2, ..., until convergence:
   2.1 For l = 1, ..., p:
       Find y_l^{(k+1)} such that

           y_l^{(k+1)} = \arg\min_{y_l \in S_l} f(y_1^{(k+1)}, ..., y_{l-1}^{(k+1)}, y_l, y_{l+1}^{(k)}, ..., y_p^{(k)}).

       That is, find y_l^{(k+1)} while fixing the other variables.
       (Alternative) Find y_l^{(k+1)} that decreases f(x) sufficiently:

           f(y_1^{(k+1)}, ..., y_{l-1}^{(k+1)}, y_l^{(k)}, y_{l+1}^{(k)}, ..., y_p^{(k)})
             - f(y_1^{(k+1)}, ..., y_{l-1}^{(k+1)}, y_l^{(k+1)}, y_{l+1}^{(k)}, ..., y_p^{(k)}) > c > 0.

27 Alternating Optimization
Example:

    initially       (y_1^{(0)}, y_2^{(0)}, y_3^{(0)}, ..., y_l^{(0)}, ..., y_p^{(0)})
    k = 0, l = 1    (y_1^{(1)}, y_2^{(0)}, y_3^{(0)}, ..., y_l^{(0)}, ..., y_p^{(0)})
    k = 0, l = 2    (y_1^{(1)}, y_2^{(1)}, y_3^{(0)}, ..., y_l^{(0)}, ..., y_p^{(0)})
    ...
    k = 0, l = p    (y_1^{(1)}, y_2^{(1)}, y_3^{(1)}, ..., y_l^{(1)}, ..., y_p^{(1)})
    k = 1, l = 1    (y_1^{(2)}, y_2^{(1)}, y_3^{(1)}, ..., y_l^{(1)}, ..., y_p^{(1)})

Obviously, we want to partition the variables so that
  each sub-problem of finding y_l is relatively easy to solve;
  solving y_l does not undo previous partial solutions.

28 Alternating Optimization
Example: k-means clustering
k-means clustering tries to group data points x into k clusters.
[Figure: data points grouped into three clusters with prototypes c_1, c_2, c_3.]
Let c_j denote the prototype of cluster C_j. Then, k-means clustering finds C_j and c_j, j = 1, ..., k, that minimize the within-cluster difference:

    D = \sum_{j=1}^k \sum_{x \in C_j} \|x - c_j\|^2.    (39)

29 Alternating Optimization
Suppose the C_j are already known. Then, the c_j that minimize D are given by

    \frac{\partial D}{\partial c_j} = -2 \sum_{x \in C_j} (x - c_j) = 0.    (40)

That is, c_j is the centroid of C_j:

    c_j = \frac{1}{|C_j|} \sum_{x \in C_j} x.    (41)

How to find C_j? Use alternating optimization.

30 Alternating Optimization
k-means clustering
1. Initialize c_j, j = 1, ..., k.
2. Repeat until convergence:
   2.1 For each x, assign it to the nearest cluster C_j (fix c_j, update C_j):

           \|x - c_j\| \le \|x - c_l\|, for l \ne j.

   2.2 Update cluster centroids (fix C_j, update c_j):

           c_j = \frac{1}{|C_j|} \sum_{x \in C_j} x.

Step 2.2 is (locally) optimal. Step 2.1 does not undo step 2.2. Why?
So, k-means clustering can converge to a local minimum.
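A k-means sketch that alternates the two steps above, assuming NumPy and an n-by-d data matrix X; taking k randomly chosen data points as initial centroids is one common choice, not the only one.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Alternate cluster assignment (fix c_j) and centroid update (fix C_j)."""
    rng = np.random.default_rng(seed)
    c = X[rng.choice(len(X), size=k, replace=False)]     # step 1: initialize centroids
    for _ in range(max_iter):
        # Step 2.1: assign each point to its nearest centroid.
        d2 = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)   # squared distances, (n, k)
        labels = d2.argmin(axis=1)
        # Step 2.2: move each centroid to the mean of its cluster (keep it if the cluster is empty).
        c_new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else c[j]
                          for j in range(k)])
        if np.allclose(c_new, c):                         # neither step can improve D further
            break
        c = c_new
    return c, labels
```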

31 Convergence Rates
Suppose the sequence {x_k} converges to x^*. Then, the sequence converges to x^* linearly if there exists \mu \in (0, 1) such that

    \lim_{k \to \infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|} = \mu.    (42)

The sequence converges sublinearly if \mu = 1.
The sequence converges superlinearly if \mu = 0.
If the sequence converges sublinearly and

    \lim_{k \to \infty} \frac{\|x_{k+2} - x_{k+1}\|}{\|x_{k+1} - x_k\|} = 1,    (43)

then the sequence converges logarithmically.

32 Convergence Rates
We can distinguish between different rates of superlinear convergence.
The sequence converges with order q, for q > 1, if there exists \mu > 0 such that

    \lim_{k \to \infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|^q} = \mu > 0.    (44)

If q = 1, the sequence converges q-linearly.
If q = 2, the sequence converges quadratically or q-quadratically.
If q = 3, the sequence converges cubically or q-cubically.
Etc.
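A small numerical illustration, not from the slides: given the error sequence e_k = ||x_k - x^*|| of an iterative method, the order q in Eq. (44) can be estimated from consecutive errors via q ≈ log(e_{k+1}/e_k) / log(e_k/e_{k-1}).

```python
import numpy as np

def estimate_order(errors):
    """Estimate the convergence order q from a sequence of errors e_k = ||x_k - x*||."""
    e = np.asarray(errors, dtype=float)
    return np.log(e[2:] / e[1:-1]) / np.log(e[1:-1] / e[:-2])

# Linear convergence with mu = 0.5: e_{k+1} = 0.5 e_k  ->  estimated q close to 1.
print(estimate_order([1.0, 0.5, 0.25, 0.125, 0.0625]))
# Quadratic convergence: e_{k+1} = e_k^2               ->  estimated q close to 2.
print(estimate_order([1e-1, 1e-2, 1e-4, 1e-8, 1e-16]))
```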

33 Convergence Rates: Known Convergence Rates
Gradient descent converges linearly, often slowly.
Gauss-Newton converges quadratically, if it converges.
Levenberg-Marquardt converges quadratically.
Quasi-Newton converges superlinearly.
Alternating optimization converges q-linearly if f is convex around a local minimizer x^*.

34 Summary

    Method                     For                        Use
    gradient descent           optimization               \nabla f
    Gauss-Newton               nonlinear least-squares    \nabla E, estimate of H
    Levenberg-Marquardt        nonlinear least-squares    \nabla E, estimate of H
    quasi-Newton               optimization               \nabla f, estimate of H
    Newton                     optimization               \nabla f, H
    coordinate descent         optimization               f
    alternating optimization   optimization               f

35 Summary
SciPy supports many optimization methods, including
  linear least-squares,
  the Levenberg-Marquardt method,
  the quasi-Newton BFGS method,
  line search that satisfies the Wolfe conditions,
  etc.
For more details and other methods, refer to [1].
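A hedged usage sketch of the SciPy routines mentioned above: scipy.optimize.minimize with method='BFGS' and scipy.optimize.least_squares with method='lm' (a Levenberg-Marquardt implementation) are the relevant entry points; option names and defaults should be checked against the documentation of the installed SciPy version. The test problems here are illustrative only.

```python
import numpy as np
from scipy.optimize import minimize, least_squares

# Quasi-Newton BFGS on the Rosenbrock function (gradient approximated numerically by SciPy).
rosen = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
res = minimize(rosen, x0=[-1.2, 1.0], method='BFGS')
print(res.x)            # close to the minimizer (1, 1)

# Levenberg-Marquardt for a nonlinear least-squares fit of v = a0 * exp(-a1 * x).
X = np.linspace(0, 4, 50)
v = 2.5 * np.exp(-1.3 * X)
residuals = lambda a: v - a[0] * np.exp(-a[1] * X)
fit = least_squares(residuals, x0=[1.0, 1.0], method='lm')
print(fit.x)            # close to (2.5, 1.3)
```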

36 Probing Questions
Can you apply an algorithm specialized for nonlinear least squares, e.g., Gauss-Newton, to generic optimization?
Gauss-Newton may converge slowly or diverge if the matrix J^\top J is ill-conditioned. How to make the matrix well-conditioned?
How can your alternating optimization program detect interference of a previous update by the current update?
If your alternating optimization program detects frequent interference between updates, what can you do about it?
So far, we have looked at minimizing a scalar function f(x). How to minimize a vector function f(x), or multiple scalar functions f_l(x), simultaneously?
How to ensure that your optimization program will always converge regardless of the test condition?

37 Homework
1. Describe the essence of Gauss-Newton, Levenberg-Marquardt, quasi-Newton, coordinate descent and alternating optimization, each in one sentence.
2. With the SVD of J given by J = U \Sigma V^\top, show that (J^\top J)^{-1} J^\top = V \Sigma^{-1} U^\top.
   Hint: For invertible matrices A and B, (AB)^{-1} = B^{-1} A^{-1}.
3. Q3(c) of AY2015/16 Final Evaluation.
4. Q5 of AY2015/16 Final Evaluation.

38 References
1. J. Nocedal and S. J. Wright, Numerical Optimization, Springer-Verlag.
2. W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, 3rd Ed., Cambridge University Press.
3. J. C. Bezdek and R. J. Hathaway, "Some notes on alternating optimization," in Proc. Int. Conf. on Fuzzy Systems.
