Truncated Newton Method


- approximate Newton methods
- truncated Newton methods
- truncated Newton interior-point methods

EE364b, Stanford University

Newton's method

- minimize convex $f : \mathbf{R}^n \to \mathbf{R}$
- Newton step $\Delta x_{\mathrm{nt}}$ found from (SPD) Newton system

  $\nabla^2 f(x) \Delta x_{\mathrm{nt}} = -\nabla f(x)$

  using Cholesky factorization
- backtracking line search on function value $f(x)$ or norm of gradient $\|\nabla f(x)\|$
- stopping criterion based on Newton decrement $\lambda^2/2 = -\nabla f(x)^T \Delta x_{\mathrm{nt}}/2$ or norm of gradient $\|\nabla f(x)\|$
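As a minimal sketch of the Newton step described above, the following NumPy snippet solves the Newton system through a Cholesky factor and evaluates the squared Newton decrement, using the convention $\lambda^2 = -\nabla f(x)^T \Delta x_{\mathrm{nt}}$. The function name `newton_step` and the small quadratic test function are illustrative assumptions, not part of the slides.

```python
import numpy as np

def newton_step(grad, hess):
    """Solve hess @ dx = -grad via Cholesky; return (dx, lambda_sq)."""
    L = np.linalg.cholesky(hess)     # hess = L @ L.T; raises LinAlgError if not PD
    y = np.linalg.solve(L, -grad)    # solve L y = -grad (generic solver, not a triangular one)
    dx = np.linalg.solve(L.T, y)     # solve L^T dx = y
    lambda_sq = -grad @ dx           # squared Newton decrement
    return dx, lambda_sq

# Hypothetical test function f(x) = 0.5 x^T P x + q^T x with SPD P.
P = np.array([[4.0, 1.0], [1.0, 3.0]])
q = np.array([1.0, -2.0])
f = lambda x: 0.5 * x @ P @ x + q @ x

x = np.zeros(2)
dx, lambda_sq = newton_step(P @ x + q, P)
# For a quadratic, one full Newton step reaches the minimizer, and
# lambda_sq / 2 equals the decrease f(x) - f(x + dx).
```

For a general convex $f$ the step would be combined with a backtracking line search; the quadratic case is used here only because the result is easy to check by hand.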

Approximate or inexact Newton methods

- use as search direction an approximate solution $\Delta x$ of Newton system
- idea: no need to compute $\Delta x_{\mathrm{nt}}$ exactly; only need a good enough search direction
- number of iterations may increase, but if effort per iteration is smaller than for Newton, we win
- examples:
  - solve $\hat H \Delta x = -\nabla f(x)$, where $\hat H$ is diagonal or band of $\nabla^2 f(x)$
  - factor $\nabla^2 f(x)$ every $k$ iterations and use most recent factorization

Truncated Newton methods

- approximately solve Newton system using CG or PCG, terminating (sometimes way) early
- also called Newton-iterative methods; related to limited memory Newton (or BFGS)
- total effort is measured by cumulative sum of CG steps done
- for good performance, need to tune CG stopping criterion, to use just enough steps to get a good enough search direction
- less reliable than Newton's method, but (with good tuning, good preconditioner, fast $z \mapsto \nabla^2 f(x) z$ method, and some luck) can handle very large problems

Truncated Newton method

- backtracking line search on $\|\nabla f(x)\|$
- typical CG termination rule: stop after $N_{\max}$ steps or

  $\eta = \dfrac{\|\nabla^2 f(x) \Delta x + \nabla f(x)\|}{\|\nabla f(x)\|} \le \epsilon_{\mathrm{pcg}}$

- with simple rules, $N_{\max}$, $\epsilon_{\mathrm{pcg}}$ are constant
- more sophisticated rules adapt $N_{\max}$ or $\epsilon_{\mathrm{pcg}}$ as algorithm proceeds (based on, e.g., value of $f(x)$, or progress in reducing $\|\nabla f(x)\|$)
- $\eta = \min(0.1, \|\nabla f(x)\|^{1/2})$ guarantees (with large $N_{\max}$) superlinear convergence
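The termination rule above can be sketched as a plain conjugate gradient loop that stops after `n_max` steps or once the relative residual $\|\nabla^2 f(x)\Delta x + \nabla f(x)\| / \|\nabla f(x)\|$ drops to `eps`. The function name `truncated_cg` and its arguments are hypothetical; `hess_vec` is assumed to be a callable computing $z \mapsto \nabla^2 f(x) z$ for a positive definite Hessian.

```python
import numpy as np

def truncated_cg(hess_vec, grad, n_max, eps, dx0=None):
    """Approximately solve H dx = -grad with CG.

    Stops after n_max steps or when ||H dx + grad|| <= eps * ||grad||.
    Returns (dx, steps_used).
    """
    dx = np.zeros_like(grad) if dx0 is None else dx0.copy()
    r = -grad - hess_vec(dx)          # residual of H dx = -grad
    p = r.copy()                      # first search direction
    rs = r @ r
    g_norm = np.linalg.norm(grad)
    steps = 0
    while steps < n_max and np.sqrt(rs) > eps * g_norm:
        Hp = hess_vec(p)
        alpha = rs / (p @ Hp)         # exact minimizer along p
        dx = dx + alpha * p
        r = r - alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p     # next H-conjugate direction
        rs = rs_new
        steps += 1
    return dx, steps
```

With `dx0=None` and `n_max=1` the returned step is a positive multiple of the negative gradient. The residual is tracked by recurrence, so in floating point it can drift slightly from the true residual over many steps.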

CG initialization

- we use CG to approximately solve $\nabla^2 f(x) \Delta x + \nabla f(x) = 0$
- if we initialize CG with $\Delta x = 0$
  - after one CG step, $\Delta x$ points in direction of negative gradient (so, $N_{\max} = 1$ results in gradient method)
  - all CG iterates are descent directions for $f$
- another choice: initialize with $\Delta x = \Delta x_{\mathrm{prev}}$, the previous search step
  - initial CG iterates need not be descent directions
  - but can give advantage when $N_{\max}$ is small

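- simple scheme: if $\Delta x_{\mathrm{prev}}$ is a descent direction ($\Delta x_{\mathrm{prev}}^T \nabla f(x) < 0$) start CG from

  $\Delta x = \dfrac{-\Delta x_{\mathrm{prev}}^T \nabla f(x)}{\Delta x_{\mathrm{prev}}^T \nabla^2 f(x) \Delta x_{\mathrm{prev}}} \, \Delta x_{\mathrm{prev}}$

  otherwise start CG from $\Delta x = 0$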
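A minimal sketch of this starting-point rule follows. The scaling factor is the minimizer of the quadratic model $\nabla f(x)^T \Delta x + \tfrac12 \Delta x^T \nabla^2 f(x) \Delta x$ along the previous step. The name `cg_start` and the convention of passing `None` when no previous step exists are assumptions made for illustration.

```python
import numpy as np

def cg_start(dx_prev, grad, hess_vec):
    """Pick the CG starting point from the previous search step."""
    if dx_prev is None:
        return np.zeros_like(grad)
    slope = dx_prev @ grad
    if slope >= 0:                              # not a descent direction
        return np.zeros_like(grad)
    curvature = dx_prev @ hess_vec(dx_prev)     # positive for a PD Hessian
    return (-slope / curvature) * dx_prev       # model minimizer along dx_prev

# Hypothetical data for a quick check.
H = np.diag([1.0, 2.0, 3.0])
g = np.array([1.0, -1.0, 0.5])
x0 = cg_start(np.array([-1.0, 1.0, 0.0]), g, lambda z: H @ z)
```

At the returned point the model gradient $\nabla^2 f(x)\Delta x + \nabla f(x)$ is orthogonal to $\Delta x_{\mathrm{prev}}$, which is what minimizing along that line means.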

Example: $\ell_2$-regularized logistic regression

  minimize $f(w) = (1/m) \sum_{i=1}^m \log\left(1 + \exp(-b_i x_i^T w)\right) + \sum_{i=1}^n \lambda_i w_i^2$

- variable is $w \in \mathbf{R}^n$
- problem data are $x_i \in \mathbf{R}^n$, $b_i \in \{-1, 1\}$, $i = 1, \ldots, m$, and regularization parameter $\lambda \in \mathbf{R}^n_+$
- $n$ is number of features; $m$ is number of samples/observations

Hessian and gradient

  $\nabla^2 f(w) = A^T D A + 2\Lambda, \qquad \nabla f(w) = A^T g + 2\Lambda w$

where

  $A = [b_1 x_1 \; \cdots \; b_m x_m]^T, \qquad D = \mathbf{diag}(h), \qquad \Lambda = \mathbf{diag}(\lambda)$

  $g_i = -(1/m) / (1 + \exp(Aw)_i)$

  $h_i = (1/m) \exp(Aw)_i / (1 + \exp(Aw)_i)^2$

- we never form $\nabla^2 f(w)$; we carry out multiplication $z \mapsto \nabla^2 f(w) z$ as

  $\nabla^2 f(w) z = (A^T D A + 2\Lambda) z = A^T (D (A z)) + 2 \Lambda z$
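The formulas above translate almost directly into NumPy. The sketch below evaluates the objective, the gradient $A^T g + 2\Lambda w$ with $g_i = -(1/m)/(1+\exp(Aw)_i)$, and the Hessian-vector product without forming the Hessian. The function names are hypothetical, `lam` is assumed to hold the vector $\lambda$, and `A` is a dense array here only for brevity (a sparse matrix type would be the natural choice at the scale discussed). The exponentials are not guarded against overflow for large $|(Aw)_i|$.

```python
import numpy as np

def logreg_obj(A, lam, w):
    """f(w) = (1/m) sum log(1 + exp(-(A w)_i)) + sum lam_i w_i^2."""
    return np.mean(np.log1p(np.exp(-(A @ w)))) + np.sum(lam * w**2)

def logreg_grad(A, lam, w):
    """Gradient A^T g + 2 Lambda w, with g_i = -(1/m) / (1 + exp(A w)_i)."""
    m = A.shape[0]
    g = -(1.0 / m) / (1.0 + np.exp(A @ w))
    return A.T @ g + 2.0 * lam * w

def logreg_hess_vec(A, lam, w, z):
    """Hessian-vector product A^T (D (A z)) + 2 Lambda z; Hessian never formed."""
    m = A.shape[0]
    e = np.exp(A @ w)
    h = (1.0 / m) * e / (1.0 + e) ** 2     # diagonal of D
    return A.T @ (h * (A @ z)) + 2.0 * lam * z
```

Each Hessian-vector product costs two multiplications with $A$ (one with $A$, one with $A^T$) plus elementwise work, which is what makes CG attractive when $A$ is sparse.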

Problem instance

- $n$ features, $m$ samples (10000 each with $b_i = \pm 1$)
- $x_i$ have random sparsity pattern, with around 10 nonzero entries
- nonzero entries in $x_i$ drawn from $\mathcal{N}(b_i, 1)$
- $\lambda_i = 10^{-8}$
- around (count missing) nonzeros in $\nabla^2 f$, and 30M nonzeros in Cholesky factor

Methods

- Newton (using Cholesky factorization of $\nabla^2 f(w)$)
- truncated Newton with $\epsilon_{\mathrm{cg}} = 10^{-4}$, $N_{\max} = 10$
- truncated Newton with $\epsilon_{\mathrm{cg}} = 10^{-4}$, $N_{\max} = 50$
- truncated Newton with $\epsilon_{\mathrm{cg}} = 10^{-4}$, $N_{\max} = 250$

Convergence versus iterations

[Figure: $\|\nabla f\|$ versus iteration $k$ for cg 10, cg 50, cg 250, and Newton]

Convergence versus cumulative CG steps

[Figure: $\|\nabla f\|$ versus cumulative CG iterations for cg 10, cg 50, and cg 250]

- convergence of exact Newton, and truncated Newton methods with $N_{\max} = 50$ and 250 essentially the same, in terms of iterations
- in terms of elapsed time (and memory!), truncated Newton methods far better than Newton
- truncated Newton with $N_{\max} = 10$ seems to jam near $\|\nabla f(w)\| \approx 10^{-6}$
- times (on AMD270 2GHz, 12GB, Linux) in sec to reach $\|\nabla f(w)\| \le 10^{-5}$ and $\|\nabla f(w)\| \le 10^{-8}$, for Newton, cg 10, cg 50, and cg 250 (of the table entries, only cg 10: 4 is legible)

Truncated PCG Newton method

- approximate search direction found via diagonally preconditioned PCG

[Figure: $\|\nabla f\|$ versus cumulative CG iterations for cg 10, cg 50, cg 250, pcg 10, pcg 50, and pcg 250]
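A minimal sketch of diagonally preconditioned CG for the Newton system is given below, with the same stopping rule as the unpreconditioned method (step limit or relative residual tolerance). The name `pcg_diag` is hypothetical, and `hess_diag` is assumed to be the diagonal of the Hessian, supplied by the caller; for the logistic regression example it could be computed from $A$, $h$, and $\lambda$ without forming the full matrix.

```python
import numpy as np

def pcg_diag(hess_vec, hess_diag, grad, n_max, eps):
    """Solve H dx = -grad with CG preconditioned by M = diag(hess_diag).

    Stops after n_max steps or when ||H dx + grad|| <= eps * ||grad||.
    Returns (dx, steps_used).
    """
    dx = np.zeros_like(grad)
    r = -grad.copy()                 # residual at dx = 0
    z = r / hess_diag                # apply M^{-1}
    p = z.copy()
    rz = r @ z
    g_norm = np.linalg.norm(grad)
    steps = 0
    while steps < n_max and np.linalg.norm(r) > eps * g_norm:
        Hp = hess_vec(p)
        alpha = rz / (p @ Hp)
        dx += alpha * p
        r -= alpha * Hp
        z = r / hess_diag
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
        steps += 1
    return dx, steps
```

Diagonal preconditioning helps most when the Hessian is badly scaled along coordinate axes; it costs only one elementwise division per step.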

- diagonal preconditioning allows $N_{\max} = 10$ to achieve high accuracy; speeds up other truncated Newton methods
- times in sec to reach $\|\nabla f(w)\| \le 10^{-5}$ and $\|\nabla f(w)\| \le 10^{-8}$, for Newton, cg 10, cg 50, cg 250, pcg 10, pcg 50, and pcg 250 (of the table entries, only cg 10: 4 is legible)
- speedups of 1600:3, 2600:5 are not bad (and we really didn't do much tuning...)

Extensions

- can extend to (infeasible start) Newton's method with equality constraints
- since we don't use exact Newton step, equality constraints not guaranteed to hold after finite number of steps (but $r_p \to 0$)
- can use for barrier, primal-dual methods

Truncated Newton interior-point methods

- use truncated Newton method to compute search direction in interior-point method
- tuning PCG parameters for optimal performance on a given problem class is tricky, since linear systems in interior-point methods often become ill-conditioned as algorithm proceeds
- but can work well (with luck, good preconditioner)

Network rate control

rate control problem

  minimize $-U(f) = -\sum_{j=1}^n \log f_j$
  subject to $Rf \preceq c$

with variable $f$

- $f \in \mathbf{R}^n_{++}$ is vector of flow rates
- $U(f) = \sum_{j=1}^n \log f_j$ is flow utility
- $R \in \mathbf{R}^{m \times n}$ is route matrix ($R_{ij} \in \{0, 1\}$)
- $c \in \mathbf{R}^m$ is vector of link capacities

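Dual rate control problem

dual problem

  maximize $g(\lambda) = n - c^T \lambda + \sum_{j=1}^n \log(r_j^T \lambda)$
  subject to $\lambda \succeq 0$

with variable $\lambda \in \mathbf{R}^m$, where $r_j \in \mathbf{R}^m$ is the $j$th column of $R$ (so the sum has one term per flow)

duality gap

  $\eta = -U(f) - g(\lambda) = -\sum_{j=1}^n \log f_j - n + c^T \lambda - \sum_{j=1}^n \log(r_j^T \lambda)$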
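As a small worked check of the duality gap, the sketch below evaluates $\eta = -U(f) - g(\lambda)$ with NumPy. It assumes $r_j$ denotes the $j$th column of $R$, so that `R.T @ lam` stacks the quantities $r_j^T \lambda$, one per flow. The function name and the tiny single-link instance are illustrative assumptions.

```python
import numpy as np

def rate_control_gap(R, c, f, lam):
    """Duality gap eta = -U(f) - g(lam) for the rate control problem."""
    n = f.size
    primal = -np.sum(np.log(f))                              # -U(f)
    dual = n - c @ lam + np.sum(np.log(R.T @ lam))           # g(lam)
    return primal - dual

# Hypothetical instance: two flows sharing one link of capacity 1.
R = np.array([[1.0, 1.0]])
c = np.array([1.0])

gap_opt = rate_control_gap(R, c, np.array([0.5, 0.5]), np.array([2.0]))
gap_sub = rate_control_gap(R, c, np.array([0.3, 0.4]), np.array([1.5]))
```

For this instance the optimal rates are $f = (1/2, 1/2)$ with $\lambda = 2$, so `gap_opt` is zero up to rounding, while the feasible but suboptimal pair gives a strictly positive gap.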

Primal-dual search direction (BV §11.7)

primal-dual search direction $\Delta f$, $\Delta \lambda$ given by

  $(D_1 + R^T D_2 R) \Delta f = g_1 - (1/t) R^T g_2, \qquad \Delta \lambda = D_2 R \Delta f - \lambda + (1/t) g_2$

where $s = c - Rf$,

  $D_1 = \mathbf{diag}(1/f_1^2, \ldots, 1/f_n^2), \qquad D_2 = \mathbf{diag}(\lambda_1/s_1, \ldots, \lambda_m/s_m)$

  $g_1 = (1/f_1, \ldots, 1/f_n), \qquad g_2 = (1/s_1, \ldots, 1/s_m)$
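The search direction formulas can be sketched as follows. A dense solve stands in for the PCG solve that the truncated method would use, which is only sensible for tiny instances; the function name `pd_direction` is hypothetical. The residuals assumed for the check are $r_{\mathrm{dual}} = -g_1 + R^T \lambda$ and $r_{\mathrm{cent}} = \mathbf{diag}(\lambda) s - (1/t)\mathbf{1}$.

```python
import numpy as np

def pd_direction(R, c, f, lam, t):
    """Primal-dual search direction (df, dlam) for the rate control problem."""
    s = c - R @ f                         # slacks, must be positive
    D1 = 1.0 / f**2                       # diagonal of D_1
    D2 = lam / s                          # diagonal of D_2
    g1 = 1.0 / f
    g2 = 1.0 / s
    H = np.diag(D1) + R.T @ (D2[:, None] * R)      # D_1 + R^T D_2 R
    # Dense solve used here in place of PCG, for illustration only.
    df = np.linalg.solve(H, g1 - (1.0 / t) * (R.T @ g2))
    dlam = D2 * (R @ df) - lam + (1.0 / t) * g2
    return df, dlam

# Hypothetical instance: three flows, two links.
R = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
c = np.array([1.0, 1.0])
f = np.array([0.3, 0.3, 0.3])
lam = np.array([1.0, 2.0])
t = 10.0
df, dlam = pd_direction(R, c, f, lam, t)
```

Substituting the result back shows that it satisfies the linearized residual equations $D_1 \Delta f + R^T \Delta\lambda = -r_{\mathrm{dual}}$ and $-\mathbf{diag}(\lambda) R \Delta f + \mathbf{diag}(s) \Delta\lambda = -r_{\mathrm{cent}}$, which is a convenient sanity check when replacing the dense solve with an iterative one.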

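Truncated Newton primal-dual algorithm

primal-dual residual:

  $r = (r_{\mathrm{dual}}, r_{\mathrm{cent}}) = \left( -g_1 + R^T \lambda, \; \mathbf{diag}(\lambda) s - (1/t) \mathbf{1} \right)$

given $f$ with $Rf \prec c$; $\lambda \succ 0$
while $\eta / g(\lambda) > \epsilon$
  $t := \mu m / \eta$
  compute $\Delta f$ using PCG as approximate solution of
    $(D_1 + R^T D_2 R) \Delta f = g_1 - (1/t) R^T g_2$
  $\Delta \lambda := D_2 R \Delta f - \lambda + (1/t) g_2$
  carry out line search on $\|r\|_2$, and update: $f := f + \gamma \Delta f$, $\lambda := \lambda + \gamma \Delta \lambda$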

- problem instance
  - $m$ links, $n$ flows
  - average of 12 links per flow, 6 flows per link
  - capacities random, uniform on $[0.1, 1]$
- algorithm parameters
  - truncated Newton with $\epsilon_{\mathrm{cg}} = \min(0.1, \eta / g(\lambda))$, $N_{\max} = 200$ ($N_{\max}$ never reached)
  - diagonal preconditioner
  - warm start
  - $\mu = 2$
  - $\epsilon$ chosen to solve to guaranteed 0.1% suboptimality

Primal and dual objective evolution

[Figure: $-U(f)$ and $g(\lambda)$ (scale $\times 10^5$) versus cumulative PCG iterations]

Relative duality gap evolution

[Figure: relative duality gap versus cumulative PCG iterations]

Primal and dual objective evolution ($n = 10^6$)

[Figure: $-U(f)$ and $g(\lambda)$ (scale $\times 10^6$) versus cumulative PCG iterations]

Relative duality gap evolution ($n = 10^6$)

[Figure: relative duality gap versus cumulative PCG iterations]

Conjugate Gradient Method

Conjugate Gradient Method Conjugate Gradient Method direct and indirect methods positive definite linear systems Krylov sequence spectral analysis of Krylov sequence preconditioning Prof. S. Boyd, EE364b, Stanford University Three

More information

Newton s Method. Ryan Tibshirani Convex Optimization /36-725

Newton s Method. Ryan Tibshirani Convex Optimization /36-725 Newton s Method Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, Properties and examples: f (y) = max x

More information

Proximal Newton Method. Ryan Tibshirani Convex Optimization /36-725

Proximal Newton Method. Ryan Tibshirani Convex Optimization /36-725 Proximal Newton Method Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: primal-dual interior-point method Given the problem min x subject to f(x) h i (x) 0, i = 1,... m Ax = b where f, h

More information

An Interior-Point Method for Large Scale Network Utility Maximization

An Interior-Point Method for Large Scale Network Utility Maximization An Interior-Point Method for Large Scale Network Utility Maximization Argyrios Zymnis, Nikolaos Trichakis, Stephen Boyd, and Dan O Neill August 6, 2007 Abstract We describe a specialized truncated-newton

More information

Analytic Center Cutting-Plane Method

Analytic Center Cutting-Plane Method Analytic Center Cutting-Plane Method S. Boyd, L. Vandenberghe, and J. Skaf April 14, 2011 Contents 1 Analytic center cutting-plane method 2 2 Computing the analytic center 3 3 Pruning constraints 5 4 Lower

More information

Homework 4. Convex Optimization /36-725

Homework 4. Convex Optimization /36-725 Homework 4 Convex Optimization 10-725/36-725 Due Friday November 4 at 5:30pm submitted to Christoph Dann in Gates 8013 (Remember to a submit separate writeup for each problem, with your name at the top)

More information

Newton s Method. Javier Peña Convex Optimization /36-725

Newton s Method. Javier Peña Convex Optimization /36-725 Newton s Method Javier Peña Convex Optimization 10-725/36-725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, f ( (y) = max y T x f(x) ) x Properties and

More information

Primal-Dual Interior-Point Methods. Ryan Tibshirani Convex Optimization /36-725

Primal-Dual Interior-Point Methods. Ryan Tibshirani Convex Optimization /36-725 Primal-Dual Interior-Point Methods Ryan Tibshirani Convex Optimization 10-725/36-725 Given the problem Last time: barrier method min x subject to f(x) h i (x) 0, i = 1,... m Ax = b where f, h i, i = 1,...

More information

CSCI 1951-G Optimization Methods in Finance Part 09: Interior Point Methods

CSCI 1951-G Optimization Methods in Finance Part 09: Interior Point Methods CSCI 1951-G Optimization Methods in Finance Part 09: Interior Point Methods March 23, 2018 1 / 35 This material is covered in S. Boyd, L. Vandenberge s book Convex Optimization https://web.stanford.edu/~boyd/cvxbook/.

More information

Nonlinear Optimization for Optimal Control

Nonlinear Optimization for Optimal Control Nonlinear Optimization for Optimal Control Pieter Abbeel UC Berkeley EECS Many slides and figures adapted from Stephen Boyd [optional] Boyd and Vandenberghe, Convex Optimization, Chapters 9 11 [optional]

More information

IPAM Summer School Optimization methods for machine learning. Jorge Nocedal

IPAM Summer School Optimization methods for machine learning. Jorge Nocedal IPAM Summer School 2012 Tutorial on Optimization methods for machine learning Jorge Nocedal Northwestern University Overview 1. We discuss some characteristics of optimization problems arising in deep

More information

Proximal Newton Method. Zico Kolter (notes by Ryan Tibshirani) Convex Optimization

Proximal Newton Method. Zico Kolter (notes by Ryan Tibshirani) Convex Optimization Proximal Newton Method Zico Kolter (notes by Ryan Tibshirani) Convex Optimization 10-725 Consider the problem Last time: quasi-newton methods min x f(x) with f convex, twice differentiable, dom(f) = R

More information

ORIE 6326: Convex Optimization. Quasi-Newton Methods

ORIE 6326: Convex Optimization. Quasi-Newton Methods ORIE 6326: Convex Optimization Quasi-Newton Methods Professor Udell Operations Research and Information Engineering Cornell April 10, 2017 Slides on steepest descent and analysis of Newton s method adapted

More information

A Study of Numerical Algorithms for Regularized Poisson ML Image Reconstruction

A Study of Numerical Algorithms for Regularized Poisson ML Image Reconstruction A Study of Numerical Algorithms for Regularized Poisson ML Image Reconstruction Yao Xie Project Report for EE 391 Stanford University, Summer 2006-07 September 1, 2007 Abstract In this report we solved

More information

Numerical optimization

Numerical optimization Numerical optimization Lecture 4 Alexander & Michael Bronstein tosca.cs.technion.ac.il/book Numerical geometry of non-rigid shapes Stanford University, Winter 2009 2 Longest Slowest Shortest Minimal Maximal

More information

Interior Point Algorithms for Constrained Convex Optimization

Interior Point Algorithms for Constrained Convex Optimization Interior Point Algorithms for Constrained Convex Optimization Chee Wei Tan CS 8292 : Advanced Topics in Convex Optimization and its Applications Fall 2010 Outline Inequality constrained minimization problems

More information

Preconditioning via Diagonal Scaling

Preconditioning via Diagonal Scaling Preconditioning via Diagonal Scaling Reza Takapoui Hamid Javadi June 4, 2014 1 Introduction Interior point methods solve small to medium sized problems to high accuracy in a reasonable amount of time.

More information

Primal-Dual Interior-Point Methods

Primal-Dual Interior-Point Methods Primal-Dual Interior-Point Methods Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 Outline Today: Primal-dual interior-point method Special case: linear programming

More information

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44 Convex Optimization Newton s method ENSAE: Optimisation 1/44 Unconstrained minimization minimize f(x) f convex, twice continuously differentiable (hence dom f open) we assume optimal value p = inf x f(x)

More information

An Iterative Descent Method

An Iterative Descent Method Conjugate Gradient: An Iterative Descent Method The Plan Review Iterative Descent Conjugate Gradient Review : Iterative Descent Iterative Descent is an unconstrained optimization process x (k+1) = x (k)

More information

Primal-Dual Interior-Point Methods. Ryan Tibshirani Convex Optimization

Primal-Dual Interior-Point Methods. Ryan Tibshirani Convex Optimization Primal-Dual Interior-Point Methods Ryan Tibshirani Convex Optimization 10-725 Given the problem Last time: barrier method min x subject to f(x) h i (x) 0, i = 1,... m Ax = b where f, h i, i = 1,... m are

More information

Equality constrained minimization

Equality constrained minimization Chapter 10 Equality constrained minimization 10.1 Equality constrained minimization problems In this chapter we describe methods for solving a convex optimization problem with equality constraints, minimize

More information

12. Interior-point methods

12. Interior-point methods 12. Interior-point methods Convex Optimization Boyd & Vandenberghe inequality constrained minimization logarithmic barrier function and central path barrier method feasibility and phase I methods complexity

More information

5 Quasi-Newton Methods

5 Quasi-Newton Methods Unconstrained Convex Optimization 26 5 Quasi-Newton Methods If the Hessian is unavailable... Notation: H = Hessian matrix. B is the approximation of H. C is the approximation of H 1. Problem: Solve min

More information

LINEAR AND NONLINEAR PROGRAMMING

LINEAR AND NONLINEAR PROGRAMMING LINEAR AND NONLINEAR PROGRAMMING Stephen G. Nash and Ariela Sofer George Mason University The McGraw-Hill Companies, Inc. New York St. Louis San Francisco Auckland Bogota Caracas Lisbon London Madrid Mexico

More information

Numerical optimization. Numerical optimization. Longest Shortest where Maximal Minimal. Fastest. Largest. Optimization problems

Numerical optimization. Numerical optimization. Longest Shortest where Maximal Minimal. Fastest. Largest. Optimization problems 1 Numerical optimization Alexander & Michael Bronstein, 2006-2009 Michael Bronstein, 2010 tosca.cs.technion.ac.il/book Numerical optimization 048921 Advanced topics in vision Processing and Analysis of

More information

Applications of Linear Programming

Applications of Linear Programming Applications of Linear Programming lecturer: András London University of Szeged Institute of Informatics Department of Computational Optimization Lecture 9 Non-linear programming In case of LP, the goal

More information

Lecture 9 Sequential unconstrained minimization

Lecture 9 Sequential unconstrained minimization S. Boyd EE364 Lecture 9 Sequential unconstrained minimization brief history of SUMT & IP methods logarithmic barrier function central path UMT & SUMT complexity analysis feasibility phase generalized inequalities

More information

11. Equality constrained minimization

11. Equality constrained minimization Convex Optimization Boyd & Vandenberghe 11. Equality constrained minimization equality constrained minimization eliminating equality constraints Newton s method with equality constraints infeasible start

More information

Numerical Optimization Professor Horst Cerjak, Horst Bischof, Thomas Pock Mat Vis-Gra SS09

Numerical Optimization Professor Horst Cerjak, Horst Bischof, Thomas Pock Mat Vis-Gra SS09 Numerical Optimization 1 Working Horse in Computer Vision Variational Methods Shape Analysis Machine Learning Markov Random Fields Geometry Common denominator: optimization problems 2 Overview of Methods

More information

12. Interior-point methods

12. Interior-point methods 12. Interior-point methods Convex Optimization Boyd & Vandenberghe inequality constrained minimization logarithmic barrier function and central path barrier method feasibility and phase I methods complexity

More information

On the interior of the simplex, we have the Hessian of d(x), Hd(x) is diagonal with ith. µd(w) + w T c. minimize. subject to w T 1 = 1,

On the interior of the simplex, we have the Hessian of d(x), Hd(x) is diagonal with ith. µd(w) + w T c. minimize. subject to w T 1 = 1, Math 30 Winter 05 Solution to Homework 3. Recognizing the convexity of g(x) := x log x, from Jensen s inequality we get d(x) n x + + x n n log x + + x n n where the equality is attained only at x = (/n,...,

More information

A Distributed Newton Method for Network Utility Maximization, II: Convergence

A Distributed Newton Method for Network Utility Maximization, II: Convergence A Distributed Newton Method for Network Utility Maximization, II: Convergence Ermin Wei, Asuman Ozdaglar, and Ali Jadbabaie October 31, 2012 Abstract The existing distributed algorithms for Network Utility

More information

Convex Optimization Lecture 13

Convex Optimization Lecture 13 Convex Optimization Lecture 13 Today: Interior-Point (continued) Central Path method for SDP Feasibility and Phase I Methods From Central Path to Primal/Dual Central'Path'Log'Barrier'Method Init: Feasible&#

More information

Convex Optimization. Problem set 2. Due Monday April 26th

Convex Optimization. Problem set 2. Due Monday April 26th Convex Optimization Problem set 2 Due Monday April 26th 1 Gradient Decent without Line-search In this problem we will consider gradient descent with predetermined step sizes. That is, instead of determining

More information

Lecture 14: October 17

Lecture 14: October 17 1-725/36-725: Convex Optimization Fall 218 Lecture 14: October 17 Lecturer: Lecturer: Ryan Tibshirani Scribes: Pengsheng Guo, Xian Zhou Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer:

More information

Coordinate Descent and Ascent Methods

Coordinate Descent and Ascent Methods Coordinate Descent and Ascent Methods Julie Nutini Machine Learning Reading Group November 3 rd, 2015 1 / 22 Projected-Gradient Methods Motivation Rewrite non-smooth problem as smooth constrained problem:

More information

Lecture 15 Newton Method and Self-Concordance. October 23, 2008

Lecture 15 Newton Method and Self-Concordance. October 23, 2008 Newton Method and Self-Concordance October 23, 2008 Outline Lecture 15 Self-concordance Notion Self-concordant Functions Operations Preserving Self-concordance Properties of Self-concordant Functions Implications

More information

Convex Optimization and l 1 -minimization

Convex Optimization and l 1 -minimization Convex Optimization and l 1 -minimization Sangwoon Yun Computational Sciences Korea Institute for Advanced Study December 11, 2009 2009 NIMS Thematic Winter School Outline I. Convex Optimization II. l

More information

10. Unconstrained minimization

10. Unconstrained minimization Convex Optimization Boyd & Vandenberghe 10. Unconstrained minimization terminology and assumptions gradient descent method steepest descent method Newton s method self-concordant functions implementation

More information

Convex Optimization Algorithms for Machine Learning in 10 Slides

Convex Optimization Algorithms for Machine Learning in 10 Slides Convex Optimization Algorithms for Machine Learning in 10 Slides Presenter: Jul. 15. 2015 Outline 1 Quadratic Problem Linear System 2 Smooth Problem Newton-CG 3 Composite Problem Proximal-Newton-CD 4 Non-smooth,

More information

Lecture 17: October 27

Lecture 17: October 27 0-725/36-725: Convex Optimiation Fall 205 Lecturer: Ryan Tibshirani Lecture 7: October 27 Scribes: Brandon Amos, Gines Hidalgo Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These

More information

A Study on Trust Region Update Rules in Newton Methods for Large-scale Linear Classification

A Study on Trust Region Update Rules in Newton Methods for Large-scale Linear Classification JMLR: Workshop and Conference Proceedings 1 16 A Study on Trust Region Update Rules in Newton Methods for Large-scale Linear Classification Chih-Yang Hsia r04922021@ntu.edu.tw Dept. of Computer Science,

More information

Line Search Methods for Unconstrained Optimisation

Line Search Methods for Unconstrained Optimisation Line Search Methods for Unconstrained Optimisation Lecture 8, Numerical Linear Algebra and Optimisation Oxford University Computing Laboratory, MT 2007 Dr Raphael Hauser (hauser@comlab.ox.ac.uk) The Generic

More information

Change point method: an exact line search method for SVMs

Change point method: an exact line search method for SVMs Erasmus University Rotterdam Bachelor Thesis Econometrics & Operations Research Change point method: an exact line search method for SVMs Author: Yegor Troyan Student number: 386332 Supervisor: Dr. P.J.F.

More information

1 Newton s Method. Suppose we want to solve: x R. At x = x, f (x) can be approximated by:

1 Newton s Method. Suppose we want to solve: x R. At x = x, f (x) can be approximated by: Newton s Method Suppose we want to solve: (P:) min f (x) At x = x, f (x) can be approximated by: n x R. f (x) h(x) := f ( x)+ f ( x) T (x x)+ (x x) t H ( x)(x x), 2 which is the quadratic Taylor expansion

More information

Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method

Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method Robert M. Freund March, 2004 2004 Massachusetts Institute of Technology. The Problem The logarithmic barrier approach

More information

A DECOMPOSITION PROCEDURE BASED ON APPROXIMATE NEWTON DIRECTIONS

A DECOMPOSITION PROCEDURE BASED ON APPROXIMATE NEWTON DIRECTIONS Working Paper 01 09 Departamento de Estadística y Econometría Statistics and Econometrics Series 06 Universidad Carlos III de Madrid January 2001 Calle Madrid, 126 28903 Getafe (Spain) Fax (34) 91 624

More information

Sequential Convex Programming

Sequential Convex Programming Sequential Convex Programming sequential convex programming alternating convex optimization convex-concave procedure Prof. S. Boyd, EE364b, Stanford University Methods for nonconvex optimization problems

More information

Lecture 25: November 27

Lecture 25: November 27 10-725: Optimization Fall 2012 Lecture 25: November 27 Lecturer: Ryan Tibshirani Scribes: Matt Wytock, Supreeth Achar Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have

More information

Iterative Methods for Smooth Objective Functions

Iterative Methods for Smooth Objective Functions Optimization Iterative Methods for Smooth Objective Functions Quadratic Objective Functions Stationary Iterative Methods (first/second order) Steepest Descent Method Landweber/Projected Landweber Methods

More information

Chemical Equilibrium: A Convex Optimization Problem

Chemical Equilibrium: A Convex Optimization Problem Chemical Equilibrium: A Convex Optimization Problem Linyi Gao June 4, 2014 1 Introduction The equilibrium composition of a mixture of reacting molecules is essential to many physical and chemical systems,

More information

Homework 5. Convex Optimization /36-725

Homework 5. Convex Optimization /36-725 Homework 5 Convex Optimization 10-725/36-725 Due Tuesday November 22 at 5:30pm submitted to Christoph Dann in Gates 8013 (Remember to a submit separate writeup for each problem, with your name at the top)

More information

OPER 627: Nonlinear Optimization Lecture 14: Mid-term Review

OPER 627: Nonlinear Optimization Lecture 14: Mid-term Review OPER 627: Nonlinear Optimization Lecture 14: Mid-term Review Department of Statistical Sciences and Operations Research Virginia Commonwealth University Oct 16, 2013 (Lecture 14) Nonlinear Optimization

More information

Accelerated Block-Coordinate Relaxation for Regularized Optimization

Accelerated Block-Coordinate Relaxation for Regularized Optimization Accelerated Block-Coordinate Relaxation for Regularized Optimization Stephen J. Wright Computer Sciences University of Wisconsin, Madison October 09, 2012 Problem descriptions Consider where f is smooth

More information

Incomplete Cholesky preconditioners that exploit the low-rank property

Incomplete Cholesky preconditioners that exploit the low-rank property anapov@ulb.ac.be ; http://homepages.ulb.ac.be/ anapov/ 1 / 35 Incomplete Cholesky preconditioners that exploit the low-rank property (theory and practice) Artem Napov Service de Métrologie Nucléaire, Université

More information

Improving an interior-point algorithm for multicommodity flows by quadratic regularizations

Improving an interior-point algorithm for multicommodity flows by quadratic regularizations Improving an interior-point algorithm for multicommodity flows by quadratic regularizations Jordi Castro Jordi Cuesta Dept. of Stat. and Operations Research Dept. of Chemical Engineering Universitat Politècnica

More information

PENNON A Generalized Augmented Lagrangian Method for Convex NLP and SDP p.1/39

PENNON A Generalized Augmented Lagrangian Method for Convex NLP and SDP p.1/39 PENNON A Generalized Augmented Lagrangian Method for Convex NLP and SDP Michal Kočvara Institute of Information Theory and Automation Academy of Sciences of the Czech Republic and Czech Technical University

More information

Nonlinear Programming

Nonlinear Programming Nonlinear Programming Kees Roos e-mail: C.Roos@ewi.tudelft.nl URL: http://www.isa.ewi.tudelft.nl/ roos LNMB Course De Uithof, Utrecht February 6 - May 8, A.D. 2006 Optimization Group 1 Outline for week

More information

DELFT UNIVERSITY OF TECHNOLOGY

DELFT UNIVERSITY OF TECHNOLOGY DELFT UNIVERSITY OF TECHNOLOGY REPORT -09 Computational and Sensitivity Aspects of Eigenvalue-Based Methods for the Large-Scale Trust-Region Subproblem Marielba Rojas, Bjørn H. Fotland, and Trond Steihaug

More information

SMO vs PDCO for SVM: Sequential Minimal Optimization vs Primal-Dual interior method for Convex Objectives for Support Vector Machines

SMO vs PDCO for SVM: Sequential Minimal Optimization vs Primal-Dual interior method for Convex Objectives for Support Vector Machines vs for SVM: Sequential Minimal Optimization vs Primal-Dual interior method for Convex Objectives for Support Vector Machines Ding Ma Michael Saunders Working paper, January 5 Introduction In machine learning,

More information

EE364a Homework 8 solutions

EE364a Homework 8 solutions EE364a, Winter 2007-08 Prof. S. Boyd EE364a Homework 8 solutions 9.8 Steepest descent method in l -norm. Explain how to find a steepest descent direction in the l -norm, and give a simple interpretation.

More information

Motivation. Lecture 2 Topics from Optimization and Duality. network utility maximization (NUM) problem:

Motivation. Lecture 2 Topics from Optimization and Duality. network utility maximization (NUM) problem: CDS270 Maryam Fazel Lecture 2 Topics from Optimization and Duality Motivation network utility maximization (NUM) problem: consider a network with S sources (users), each sending one flow at rate x s, through

More information

Unconstrained minimization: assumptions

Unconstrained minimization: assumptions Unconstrained minimization I terminology and assumptions I gradient descent method I steepest descent method I Newton s method I self-concordant functions I implementation IOE 611: Nonlinear Programming,

More information

LINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception

LINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception LINEAR MODELS FOR CLASSIFICATION Classification: Problem Statement 2 In regression, we are modeling the relationship between a continuous input variable x and a continuous target variable t. In classification,

More information

Nonlinear Optimization Methods for Machine Learning

Nonlinear Optimization Methods for Machine Learning Nonlinear Optimization Methods for Machine Learning Jorge Nocedal Northwestern University University of California, Davis, Sept 2018 1 Introduction We don t really know, do we? a) Deep neural networks

More information

The classifier. Theorem. where the min is over all possible classifiers. To calculate the Bayes classifier/bayes risk, we need to know

The classifier. Theorem. where the min is over all possible classifiers. To calculate the Bayes classifier/bayes risk, we need to know The Bayes classifier Theorem The classifier satisfies where the min is over all possible classifiers. To calculate the Bayes classifier/bayes risk, we need to know Alternatively, since the maximum it is

More information

The classifier. Linear discriminant analysis (LDA) Example. Challenges for LDA

The classifier. Linear discriminant analysis (LDA) Example. Challenges for LDA The Bayes classifier Linear discriminant analysis (LDA) Theorem The classifier satisfies In linear discriminant analysis (LDA), we make the (strong) assumption that where the min is over all possible classifiers.
