The Conjugate Gradient Algorithm
- Rosalyn Malone
Outline

- Optimization over a Subspace
- Conjugate Direction Methods
- Conjugate Gradient Algorithm
- Non-Quadratic Conjugate Gradient Algorithm
Optimization over a Subspace

Consider the problem

$\min f(x)$ subject to $x \in x_0 + S$,

where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and $S$ is the subspace $S := \mathrm{Span}\{v_1, \dots, v_k\}$. If $V \in \mathbb{R}^{n \times k}$ has columns $v_1, \dots, v_k$, then this problem is equivalent to

$\min f(x_0 + Vz)$ subject to $z \in \mathbb{R}^k$.
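For a concrete quadratic instance, the reduction to the variable $z$ can be checked numerically. A minimal sketch; the matrix Q, vector b, starting point, and the columns of V below are illustrative assumptions, not from the lecture:

```python
import numpy as np

# Quadratic objective f(x) = 0.5 x^T Q x - b^T x (illustrative data).
Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

x0 = np.array([1.0, 0.0, 1.0])
V = np.array([[1.0, 0.0],          # columns v1, v2 span the subspace S
              [0.0, 1.0],
              [0.0, 0.0]])

# Reduced problem: minimize fhat(z) = f(x0 + V z) over z in R^2.
# For a quadratic, grad fhat(z) = V^T (Q(x0 + V z) - b) = 0 is linear in z.
A = V.T @ Q @ V
rhs = V.T @ (b - Q @ x0)
z_bar = np.linalg.solve(A, rhs)
x_bar = x0 + V @ z_bar

# Subspace optimality: the gradient at x_bar is orthogonal to the columns of V.
grad = Q @ x_bar - b
print(V.T @ grad)   # ~ [0, 0]
```

Note that the full gradient at $\bar{x}$ need not vanish; only its components along the subspace do, which is exactly the optimality condition derived next.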
Subspace Optimality Condition

$\min f(x_0 + Vz)$ subject to $z \in \mathbb{R}^k$.

Set $\hat{f}(z) = f(x_0 + Vz)$. If $\bar{z}$ solves this problem, then

$V^T \nabla f(x_0 + V\bar{z}) = \nabla \hat{f}(\bar{z}) = 0.$

Setting $\bar{x} = x_0 + V\bar{z}$, we note that $\bar{x}$ solves the original problem if and only if $\bar{z}$ solves the problem above. Then $V^T \nabla f(\bar{x}) = 0$, or equivalently, $v_i^T \nabla f(\bar{x}) = 0$ for $i = 1, 2, \dots, k$, so $\nabla f(\bar{x}) \perp \mathrm{Span}(v_1, \dots, v_k)$.
Subspace Optimality Theorem

Consider the problem

$P_S$: $\min f(x)$ subject to $x \in x_0 + S$,

where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and $S$ is the subspace $S := \mathrm{Span}\{v_1, \dots, v_k\}$. If $\bar{x}$ solves $P_S$, then $\nabla f(\bar{x}) \perp S$. If it is further assumed that $f$ is convex, then $\bar{x}$ solves $P_S$ if and only if $\nabla f(\bar{x}) \perp S$.
Conjugate Direction Algorithm

Consider the problem

$P$: minimize $f(x)$ subject to $x \in \mathbb{R}^n$,

where $f : \mathbb{R}^n \to \mathbb{R}$ is $C^2$ and is given by

$f(x) := \tfrac{1}{2} x^T Q x - b^T x$

with $Q$ symmetric positive definite.
Definition [Conjugacy] Let $Q \in \mathbb{R}^{n \times n}$ be symmetric and positive definite. We say that the vectors $x, y \in \mathbb{R}^n \setminus \{0\}$ are Q-conjugate (or Q-orthogonal) if $x^T Q y = 0$.

Proposition [Conjugacy implies Linear Independence] If $Q \in \mathbb{R}^{n \times n}$ is positive definite and the nonzero vectors $d_0, d_1, \dots, d_k$ are (pairwise) Q-conjugate, then these vectors are linearly independent.
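A set of Q-conjugate directions can be built from any basis by Gram-Schmidt with respect to the inner product $\langle x, y \rangle_Q = x^T Q y$. This construction and the checks below are an illustrative sketch (the matrix Q is assumed data, not from the lecture):

```python
import numpy as np

Q = np.array([[3.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 5.0]])   # symmetric positive definite (assumed data)

# Gram-Schmidt on the standard basis, using the Q-inner product x^T Q y.
basis = np.eye(3)
ds = []
for v in basis:
    d = v.copy()
    for dj in ds:
        d -= (v @ Q @ dj) / (dj @ Q @ dj) * dj   # remove Q-components along dj
    ds.append(d)
D = np.array(ds)

# Pairwise Q-conjugacy: D Q D^T is diagonal ...
G = D @ Q @ D.T
off = G - np.diag(np.diag(G))
print(np.max(np.abs(off)))          # ~ 0

# ... and, per the proposition, conjugacy implies linear independence.
print(np.linalg.matrix_rank(D))     # 3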
[Conjugate Direction Algorithm] Let $\{d_i\}_{i=0}^{n-1}$ be a set of nonzero Q-conjugate vectors. For any $x_0 \in \mathbb{R}^n$ the sequence $\{x_k\}$ generated according to

$x_{k+1} := x_k + \alpha_k d_k$, $k \ge 0$,

with

$\alpha_k := \arg\min \{ f(x_k + \alpha d_k) : \alpha \in \mathbb{R} \}$

converges to the unique solution $\bar{x}$ of $P$ after $n$ steps, that is, $x_n = \bar{x}$. We have already shown that $\alpha_k = -\nabla f(x_k)^T d_k / d_k^T Q d_k$.
Expanding Subspace Theorem

Let $\{d_i\}_{i=0}^{n-1}$ be a sequence of nonzero Q-conjugate vectors in $\mathbb{R}^n$. Then for any $x_0 \in \mathbb{R}^n$ the sequence $\{x_k\}$ generated according to

$x_{k+1} = x_k + \alpha_k d_k$, where $\alpha_k = -\dfrac{g_k^T d_k}{d_k^T Q d_k}$,

has the property that $f(x) = \tfrac{1}{2} x^T Q x - b^T x$ attains its minimum value on the affine set $x_0 + \mathrm{Span}\{d_0, \dots, d_k\}$ at the point $x_{k+1}$.
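The conjugate direction iteration can be checked on a small quadratic. Here the directions are taken to be the eigenvectors of $Q$, which are Q-conjugate because eigenvectors of a symmetric matrix are orthogonal; the problem data are illustrative assumptions:

```python
import numpy as np

Q = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite
b = np.array([1.0, 2.0])

# Eigenvectors of a symmetric Q are orthogonal, hence Q-conjugate.
_, eigvecs = np.linalg.eigh(Q)
directions = eigvecs.T

x = np.array([10.0, -7.0])               # arbitrary starting point
for d in directions:
    g = Q @ x - b                        # g_k = grad f(x_k)
    alpha = -(g @ d) / (d @ Q @ d)       # exact line search for the quadratic
    x = x + alpha * d

x_star = np.linalg.solve(Q, b)           # the unique minimizer of f
print(np.allclose(x, x_star))            # True after n = 2 steps
```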
Expanding Subspace Theorem: Proof

Let $\bar{x}$ solve $\min \{ f(x) : x \in x_0 + \mathrm{Span}\{d_0, \dots, d_k\} \}$. The Subspace Optimality Theorem tells us that $\nabla f(\bar{x})^T d_i = 0$, $i = 0, \dots, k$. Since $\bar{x} \in x_0 + \mathrm{Span}\{d_0, \dots, d_k\}$, there exist $\beta_i \in \mathbb{R}$ such that

$\bar{x} = x_0 + \beta_0 d_0 + \beta_1 d_1 + \cdots + \beta_k d_k.$

Therefore,

$0 = \nabla f(\bar{x})^T d_i = \big(Q(x_0 + \beta_0 d_0 + \beta_1 d_1 + \cdots + \beta_k d_k) - b\big)^T d_i = (Q x_0 - b)^T d_i + \beta_0 d_0^T Q d_i + \beta_1 d_1^T Q d_i + \cdots + \beta_k d_k^T Q d_i = \nabla f(x_0)^T d_i + \beta_i d_i^T Q d_i,$

where the last equality uses the Q-conjugacy of the $d_j$.
Hence, since $0 = \nabla f(x_0)^T d_i + \beta_i d_i^T Q d_i$ for $i = 0, 1, \dots, k$, we obtain

$\beta_i = -\nabla f(x_0)^T d_i / d_i^T Q d_i, \quad i = 0, \dots, k.$

Similarly, the iteration

$x_{i+1} = x_i + \alpha_i d_i$, $\alpha_i = -\dfrac{\nabla f(x_i)^T d_i}{d_i^T Q d_i}$,

gives $x_{k+1} = x_0 + \alpha_0 d_0 + \alpha_1 d_1 + \cdots + \alpha_k d_k$. So we need to show

$\beta_i = -\dfrac{\nabla f(x_0)^T d_i}{d_i^T Q d_i} = -\dfrac{\nabla f(x_i)^T d_i}{d_i^T Q d_i} = \alpha_i, \quad i = 0, \dots, k.$
Finally,

$\nabla f(x_i)^T d_i = \big(Q(x_0 + \alpha_0 d_0 + \cdots + \alpha_{i-1} d_{i-1}) - b\big)^T d_i = (Q x_0 - b)^T d_i + \alpha_0 d_0^T Q d_i + \cdots + \alpha_{i-1} d_{i-1}^T Q d_i = \nabla f(x_0)^T d_i,$

again by the Q-conjugacy of the $d_j$, which completes the proof.
The Conjugate Gradient Algorithm

Initialization: $x_0 \in \mathbb{R}^n$, $d_0 = -g_0 = -\nabla f(x_0) = b - Q x_0$.

For $k = 0, 1, 2, \dots$:

$\alpha_k := -g_k^T d_k / d_k^T Q d_k$
$x_{k+1} := x_k + \alpha_k d_k$
$g_{k+1} := Q x_{k+1} - b$ (STOP if $g_{k+1} = 0$)
$\beta_k := g_{k+1}^T Q d_k / d_k^T Q d_k$
$d_{k+1} := -g_{k+1} + \beta_k d_k$
$k := k + 1$.
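The loop above translates directly into code. A minimal sketch; the random 5x5 SPD test system is an illustrative assumption:

```python
import numpy as np

def conjugate_gradient(Q, b, x0):
    """CG for f(x) = 0.5 x^T Q x - b^T x with Q symmetric positive definite."""
    x = x0.astype(float)
    g = Q @ x - b                        # g_0
    d = -g                               # d_0 = -g_0 = b - Q x_0
    for _ in range(len(b)):
        if np.allclose(g, 0):            # STOP if g_k = 0
            break
        Qd = Q @ d
        alpha = -(g @ d) / (d @ Qd)      # alpha_k = -g_k^T d_k / d_k^T Q d_k
        x = x + alpha * d
        g_new = Q @ x - b                # g_{k+1} = Q x_{k+1} - b
        beta = (g_new @ Qd) / (d @ Qd)   # beta_k = g_{k+1}^T Q d_k / d_k^T Q d_k
        d = -g_new + beta * d            # d_{k+1} = -g_{k+1} + beta_k d_k
        g = g_new
    return x

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
Q = M @ M.T + 5 * np.eye(5)              # symmetric positive definite
b = rng.standard_normal(5)
x = conjugate_gradient(Q, b, np.zeros(5))
print(np.allclose(Q @ x, b))             # True: solved in at most n steps
```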
The Conjugate Gradient Theorem

The C-G algorithm is a conjugate direction method. If it does not terminate at $x_k$ (i.e., $g_k \ne 0$), then

1. $\mathrm{Span}[g_0, g_1, \dots, g_k] = \mathrm{Span}[g_0, Q g_0, \dots, Q^k g_0]$
2. $\mathrm{Span}[d_0, d_1, \dots, d_k] = \mathrm{Span}[g_0, Q g_0, \dots, Q^k g_0]$
3. $d_k^T Q d_i = 0$ for $i \le k - 1$
4. $\alpha_k = g_k^T g_k / d_k^T Q d_k$
5. $\beta_k = g_{k+1}^T g_{k+1} / g_k^T g_k$.
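Properties (3)-(5) can be verified numerically by running the CG loop and recording the iterates. The small random test problem below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
Q = M @ M.T + 4 * np.eye(4)      # symmetric positive definite (assumed data)
b = rng.standard_normal(4)

x = np.zeros(4)
g = Q @ x - b
d = -g
gs, ds, alphas, betas = [g], [d], [], []
for _ in range(4):
    Qd = Q @ d
    alpha = -(g @ d) / (d @ Qd)
    x = x + alpha * d
    g_new = Q @ x - b
    beta = (g_new @ Qd) / (d @ Qd)
    d = -g_new + beta * d
    alphas.append(alpha); betas.append(beta)
    gs.append(g_new); ds.append(d)
    g = g_new

# (3) the search directions are pairwise Q-conjugate
print(max(abs(ds[k] @ Q @ ds[i]) for k in range(4) for i in range(k)))
# (4) alpha_k = g_k^T g_k / d_k^T Q d_k
print(all(np.isclose(alphas[k], gs[k] @ gs[k] / (ds[k] @ Q @ ds[k])) for k in range(4)))
# (5) beta_k = g_{k+1}^T g_{k+1} / g_k^T g_k
print(all(np.isclose(betas[k], gs[k + 1] @ gs[k + 1] / (gs[k] @ gs[k])) for k in range(4)))
```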
The Conjugate Gradient Theorem: Proof

We prove (1)-(3) by induction. All three hold for $k = 0$. Suppose they hold up to $k$; we show they hold for $k + 1$.

(1) $\mathrm{Span}[g_0, g_1, \dots, g_k] = \mathrm{Span}[g_0, Q g_0, \dots, Q^k g_0]$

Since $g_{k+1} = g_k + \alpha_k Q d_k$, we have $g_{k+1} \in \mathrm{Span}[g_0, \dots, Q^{k+1} g_0]$ by the induction hypothesis. Also $g_{k+1} \notin \mathrm{Span}[d_0, \dots, d_k]$; otherwise $g_{k+1} = 0$ (by the Subspace Optimality Theorem), since the method is a conjugate direction method up to step $k$ (induction hypothesis). So $g_{k+1} \notin \mathrm{Span}[g_0, \dots, Q^k g_0]$, and $\mathrm{Span}[g_0, g_1, \dots, g_{k+1}] = \mathrm{Span}[g_0, \dots, Q^{k+1} g_0]$, proving (1).
(2) $\mathrm{Span}[d_0, d_1, \dots, d_k] = \mathrm{Span}[g_0, Q g_0, \dots, Q^k g_0]$

To prove (2), write $d_{k+1} = -g_{k+1} + \beta_k d_k$, so that (2) follows from (1) and the induction hypothesis on (2).
(3) $d_k^T Q d_i = 0$ for $i \le k - 1$

To see (3), observe that $d_{k+1}^T Q d_i = -g_{k+1}^T Q d_i + \beta_k d_k^T Q d_i$. For $i = k$ the right-hand side is zero by the definition of $\beta_k$. For $i < k$ both terms vanish. The term $g_{k+1}^T Q d_i = 0$ by the Expanding Subspace Theorem, since $Q d_i \in \mathrm{Span}[d_0, \dots, d_k]$ by (1) and (2). The term $d_k^T Q d_i$ vanishes by the induction hypothesis on (3).
(4) $\alpha_k = g_k^T g_k / d_k^T Q d_k$

Since $d_k = -g_k + \beta_{k-1} d_{k-1}$, we have $-g_k^T d_k = g_k^T g_k - \beta_{k-1} g_k^T d_{k-1}$, where $g_k^T d_{k-1} = 0$ by the Expanding Subspace Theorem. So $\alpha_k = -g_k^T d_k / d_k^T Q d_k = g_k^T g_k / d_k^T Q d_k$.
(5) $\beta_k = g_{k+1}^T g_{k+1} / g_k^T g_k$

First, $g_{k+1}^T g_k = 0$ by the Expanding Subspace Theorem, because $g_k \in \mathrm{Span}[d_0, \dots, d_k]$. Hence

$g_{k+1}^T Q d_k = \dfrac{1}{\alpha_k} g_{k+1}^T Q (x_{k+1} - x_k) = \dfrac{1}{\alpha_k} g_{k+1}^T [g_{k+1} - g_k] = \dfrac{1}{\alpha_k} g_{k+1}^T g_{k+1}.$

Therefore,

$\beta_k = \dfrac{g_{k+1}^T Q d_k}{d_k^T Q d_k} = \dfrac{1}{\alpha_k} \dfrac{g_{k+1}^T g_{k+1}}{d_k^T Q d_k} = \dfrac{g_{k+1}^T g_{k+1}}{g_k^T g_k},$

where the last step uses (4).
Comments on the CG Algorithm

The C-G method described above is a descent method, since the values $f(x_0), f(x_1), \dots, f(x_n)$ form a decreasing sequence. Moreover, note that $\nabla f(x_k)^T d_k = -g_k^T g_k < 0$ and $\alpha_k > 0$. Thus, the C-G method behaves very much like the descent methods discussed previously.
Due to the occurrence of round-off error, the C-G algorithm is best implemented as an iterative method. That is, at the end of $n$ steps, $f$ may not attain its global minimum at $x_n$, and the intervening directions $d_k$ may not be Q-conjugate. It is also possible for the CG algorithm to terminate early. Consequently, at each step one should check the value $\|\nabla f(x_{k+1})\|$ and the size of the step $\|x_{k+1} - x_k\|$. If either is sufficiently small, then accept $x_{k+1}$ as the point at which $f$ attains its global minimum value; otherwise, continue to iterate regardless of the iteration count (up to a maximum acceptable number of iterations). Since CG is a descent method, continued progress is assured.
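In practice, then, the loop is governed by tolerances rather than the iteration count $n$. A minimal sketch of this iterative usage, using the simplified formulas from properties (4) and (5) of the theorem; the tolerances and the random 50x50 test system are illustrative assumptions:

```python
import numpy as np

def cg_iterative(Q, b, x0, grad_tol=1e-10, step_tol=1e-12, max_iter=1000):
    """CG run as an iterative method: stop on a small gradient or a small
    step, not after exactly n iterations."""
    x = x0.astype(float)
    g = Q @ x - b
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) <= grad_tol:         # small gradient: accept x
            break
        Qd = Q @ d
        alpha = (g @ g) / (d @ Qd)                # alpha_k = g_k^T g_k / d_k^T Q d_k
        step = alpha * d
        x = x + step
        g_new = g + alpha * Qd                    # g_{k+1} = g_k + alpha_k Q d_k
        if np.linalg.norm(step) <= step_tol:      # small step: accept x
            break
        beta = (g_new @ g_new) / (g @ g)          # beta_k = ||g_{k+1}||^2 / ||g_k||^2
        d = -g_new + beta * d
        g = g_new
    return x

rng = np.random.default_rng(2)
M = rng.standard_normal((50, 50))
Q = M @ M.T + 50 * np.eye(50)                     # symmetric positive definite
b = rng.standard_normal(50)
x = cg_iterative(Q, b, np.zeros(50))
print(np.linalg.norm(Q @ x - b) < 1e-8)           # True
```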
The Non-Quadratic CG Algorithm

Initialization: $x_0 \in \mathbb{R}^n$, $g_0 = \nabla f(x_0)$, $d_0 = -g_0$, $0 < c < \beta < 1$.

Having $x_k$, obtain $x_{k+1}$ as follows. Check the restart criteria. If a restart condition is satisfied, then reset $x_0 = x_n$, $g_0 = \nabla f(x_0)$, $d_0 = -g_0$; otherwise, set

$\alpha_k \in \left\{ \lambda \;:\; \lambda > 0,\ \nabla f(x_k + \lambda d_k)^T d_k \ge \beta\, \nabla f(x_k)^T d_k,\ \text{and}\ f(x_k + \lambda d_k) - f(x_k) \le c \lambda\, \nabla f(x_k)^T d_k \right\}$

(a step satisfying the weak Wolfe conditions), and

$x_{k+1} := x_k + \alpha_k d_k$
$g_{k+1} := \nabla f(x_{k+1})$
$\beta_k := g_{k+1}^T g_{k+1} / g_k^T g_k$ (Fletcher-Reeves), or $\beta_k := \max\left\{0,\ g_{k+1}^T (g_{k+1} - g_k) / g_k^T g_k\right\}$ (Polak-Ribière)
$d_{k+1} := -g_{k+1} + \beta_k d_k$
$k := k + 1$.
The Non-Quadratic CG Algorithm: Restart Conditions

1. $k = n$
2. $|g_{k+1}^T g_k| \ge 0.2\, g_k^T g_k$
3. $-2\, g_k^T g_k \le g_k^T d_k \le -0.2\, g_k^T g_k$
The Non-Quadratic CG Algorithm

The Polak-Ribière update for $\beta_k$ has demonstrated experimental superiority. One way to see why this might be true is to observe that

$g_{k+1}^T (g_{k+1} - g_k) \approx \alpha_k\, g_{k+1}^T \nabla^2 f(x_k) d_k,$

thereby yielding a better second-order approximation. Indeed, the formula for $\beta_k$ in the quadratic case is precisely

$\beta_k = \dfrac{\alpha_k\, g_{k+1}^T \nabla^2 f(x_k) d_k}{g_k^T g_k}.$
More informationMATH 4211/6211 Optimization Basics of Optimization Problems
MATH 4211/6211 Optimization Basics of Optimization Problems Xiaojing Ye Department of Mathematics & Statistics Georgia State University Xiaojing Ye, Math & Stat, Georgia State University 0 A standard minimization
More informationMethods that avoid calculating the Hessian. Nonlinear Optimization; Steepest Descent, Quasi-Newton. Steepest Descent
Nonlinear Optimization Steepest Descent and Niclas Börlin Department of Computing Science Umeå University niclas.borlin@cs.umu.se A disadvantage with the Newton method is that the Hessian has to be derived
More informationToday. Introduction to optimization Definition and motivation 1-dimensional methods. Multi-dimensional methods. General strategies, value-only methods
Optimization Last time Root inding: deinition, motivation Algorithms: Bisection, alse position, secant, Newton-Raphson Convergence & tradeos Eample applications o Newton s method Root inding in > 1 dimension
More informationSteepest Descent. Juan C. Meza 1. Lawrence Berkeley National Laboratory Berkeley, California 94720
Steepest Descent Juan C. Meza Lawrence Berkeley National Laboratory Berkeley, California 94720 Abstract The steepest descent method has a rich history and is one of the simplest and best known methods
More information6.4 Krylov Subspaces and Conjugate Gradients
6.4 Krylov Subspaces and Conjugate Gradients Our original equation is Ax = b. The preconditioned equation is P Ax = P b. When we write P, we never intend that an inverse will be explicitly computed. P
More informationGlobal Convergence Properties of the HS Conjugate Gradient Method
Applied Mathematical Sciences, Vol. 7, 2013, no. 142, 7077-7091 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.311638 Global Convergence Properties of the HS Conjugate Gradient Method
More informationChapter 3. More about Vector Spaces Linear Independence, Basis and Dimension. Contents. 1 Linear Combinations, Span
Chapter 3 More about Vector Spaces Linear Independence, Basis and Dimension Vincent Astier, School of Mathematical Sciences, University College Dublin 3. Contents Linear Combinations, Span Linear Independence,
More informationOn Lagrange multipliers of trust-region subproblems
On Lagrange multipliers of trust-region subproblems Ladislav Lukšan, Ctirad Matonoha, Jan Vlček Institute of Computer Science AS CR, Prague Programy a algoritmy numerické matematiky 14 1.- 6. června 2008
More informationM 2 F g(i,j) = )))))) Mx(i)Mx(j) If G(x) is constant, F(x) is a quadratic function and can be expressed as. F(x) = (1/2)x T Gx + c T x + " (1)
Gradient Techniques for Unconstrained Optimization Gradient techniques can be used to minimize a function F(x) with respect to the n by 1 vector x, when the gradient is available or easily estimated. These
More informationCourse Notes: Week 1
Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues
More informationGLOBAL CONVERGENCE OF CONJUGATE GRADIENT METHODS WITHOUT LINE SEARCH
GLOBAL CONVERGENCE OF CONJUGATE GRADIENT METHODS WITHOUT LINE SEARCH Jie Sun 1 Department of Decision Sciences National University of Singapore, Republic of Singapore Jiapu Zhang 2 Department of Mathematics
More informationLecture 9: Krylov Subspace Methods. 2 Derivation of the Conjugate Gradient Algorithm
CS 622 Data-Sparse Matrix Computations September 19, 217 Lecture 9: Krylov Subspace Methods Lecturer: Anil Damle Scribes: David Eriksson, Marc Aurele Gilles, Ariah Klages-Mundt, Sophia Novitzky 1 Introduction
More informationLinear Solvers. Andrew Hazel
Linear Solvers Andrew Hazel Introduction Thus far we have talked about the formulation and discretisation of physical problems...... and stopped when we got to a discrete linear system of equations. Introduction
More informationOn nonlinear optimization since M.J.D. Powell
On nonlinear optimization since 1959 1 M.J.D. Powell Abstract: This view of the development of algorithms for nonlinear optimization is based on the research that has been of particular interest to the
More informationA First Course in Optimization: Answers to Selected Exercises
A First Course in Optimization: Answers to Selected Exercises Charles L. Byrne Department of Mathematical Sciences University of Massachusetts Lowell Lowell, MA 01854 December 2, 2009 (The most recent
More informationLecture 1: Basic Concepts
ENGG 5781: Matrix Analysis and Computations Lecture 1: Basic Concepts 2018-19 First Term Instructor: Wing-Kin Ma This note is not a supplementary material for the main slides. I will write notes such as
More informationPart 3: Trust-region methods for unconstrained optimization. Nick Gould (RAL)
Part 3: Trust-region methods for unconstrained optimization Nick Gould (RAL) minimize x IR n f(x) MSc course on nonlinear optimization UNCONSTRAINED MINIMIZATION minimize x IR n f(x) where the objective
More informationLecture 35 Minimization and maximization of functions. Powell s method in multidimensions Conjugate gradient method. Annealing methods.
Lecture 35 Minimization and maximization of functions Powell s method in multidimensions Conjugate gradient method. Annealing methods. We know how to minimize functions in one dimension. If we start at
More informationNonlinear Optimization: What s important?
Nonlinear Optimization: What s important? Julian Hall 10th May 2012 Convexity: convex problems A local minimizer is a global minimizer A solution of f (x) = 0 (stationary point) is a minimizer A global
More informationConvex Analysis and Optimization Chapter 2 Solutions
Convex Analysis and Optimization Chapter 2 Solutions Dimitri P. Bertsekas with Angelia Nedić and Asuman E. Ozdaglar Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts http://www.athenasc.com
More information