Chapter 10 Conjugate Direction Methods
- Lydia Thompson
- 5 years ago
Transcription
1 Chapter 10 Conjugate Direction Methods An Introduction to Optimization Spring, Wei-Ta Chu 2012/4/13
2 Introduction Conjugate direction methods can be viewed as being intermediate between the method of steepest descent and Newton's method. They solve quadratics of $n$ variables in $n$ steps. The usual implementation, the conjugate gradient algorithm, requires no Hessian matrix evaluations. No matrix inversion and no storage of an $n \times n$ matrix are required. Conjugate direction methods typically perform better than the method of steepest descent, but not as well as Newton's method.
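3 Introduction For a quadratic function of $n$ variables, $f(x) = \frac{1}{2} x^T Q x - x^T b$, $x \in \mathbb{R}^n$, $Q = Q^T > 0$, the best direction of search is in the $Q$-conjugate direction. Basically, two directions $d^{(1)}$ and $d^{(2)}$ in $\mathbb{R}^n$ are said to be $Q$-conjugate if $d^{(1)T} Q d^{(2)} = 0$. Definition 10.1: Let $Q$ be a real symmetric $n \times n$ matrix. The directions $d^{(0)}, d^{(1)}, d^{(2)}, \ldots, d^{(m)}$ are $Q$-conjugate if for all $i \neq j$, we have $d^{(i)T} Q d^{(j)} = 0$.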
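4 Introduction Lemma 10.1: Let $Q$ be a symmetric positive definite $n \times n$ matrix. If the directions $d^{(0)}, d^{(1)}, \ldots, d^{(k)} \in \mathbb{R}^n$, $k \le n-1$, are nonzero and $Q$-conjugate, then they are linearly independent. Proof: Let $\alpha_0, \ldots, \alpha_k$ be scalars such that $\alpha_0 d^{(0)} + \alpha_1 d^{(1)} + \cdots + \alpha_k d^{(k)} = 0$. Premultiplying this equality by $d^{(j)T} Q$, $0 \le j \le k$, yields $\alpha_j d^{(j)T} Q d^{(j)} = 0$, because all other terms $d^{(j)T} Q d^{(i)} = 0$, $i \neq j$, by $Q$-conjugacy. But $Q = Q^T > 0$ and $d^{(j)} \neq 0$; hence $\alpha_j = 0$, $j = 0, 1, \ldots, k$. Therefore, $d^{(0)}, d^{(1)}, \ldots, d^{(k)}$, $k \le n-1$, are linearly independent.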
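5 Example Let $Q = \begin{bmatrix} 3 & 0 & 1 \\ 0 & 4 & 2 \\ 1 & 2 & 3 \end{bmatrix}$. Note that $Q = Q^T > 0$. The matrix $Q$ is positive definite because all its leading principal minors are positive: $\Delta_1 = 3 > 0$, $\Delta_2 = \det \begin{bmatrix} 3 & 0 \\ 0 & 4 \end{bmatrix} = 12 > 0$, $\Delta_3 = \det Q = 20 > 0$. Our goal is to construct a set of $Q$-conjugate vectors $d^{(0)}, d^{(1)}, d^{(2)}$. Let $d^{(0)} = [1, 0, 0]^T$, $d^{(1)} = [d^{(1)}_1, d^{(1)}_2, d^{(1)}_3]^T$, $d^{(2)} = [d^{(2)}_1, d^{(2)}_2, d^{(2)}_3]^T$. We require that $d^{(0)T} Q d^{(1)} = 0$. We have $d^{(0)T} Q d^{(1)} = 3 d^{(1)}_1 + d^{(1)}_3$.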
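6 Example Let $d^{(1)}_1 = 1$, $d^{(1)}_2 = 0$, $d^{(1)}_3 = -3$. Then $d^{(1)} = [1, 0, -3]^T$, and thus $d^{(0)T} Q d^{(1)} = 0$. To find the third vector $d^{(2)}$, which would be $Q$-conjugate with $d^{(0)}$ and $d^{(1)}$, we require that $d^{(0)T} Q d^{(2)} = 0$ and $d^{(1)T} Q d^{(2)} = 0$. We have $d^{(0)T} Q d^{(2)} = 3 d^{(2)}_1 + d^{(2)}_3 = 0$ and $d^{(1)T} Q d^{(2)} = -6 d^{(2)}_2 - 8 d^{(2)}_3 = 0$. If we take $d^{(2)} = [1, 4, -3]^T$, then the resulting set of vectors is mutually conjugate.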
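The conjugacy conditions in this example can be checked numerically. The sketch below assumes the example's matrix is $Q = [[3, 0, 1], [0, 4, 2], [1, 2, 3]]$ with directions $[1, 0, 0]^T$, $[1, 0, -3]^T$, $[1, 4, -3]^T$; the helper name `is_q_conjugate` is hypothetical and not part of the slides.

```python
import numpy as np

def is_q_conjugate(Q, directions, tol=1e-12):
    """Return True if d_i^T Q d_j is zero (up to tol) for every pair i != j."""
    D = np.column_stack(directions)
    G = D.T @ Q @ D                      # G[i, j] = d_i^T Q d_j
    off_diag = G - np.diag(np.diag(G))   # keep only the i != j entries
    return bool(np.all(np.abs(off_diag) <= tol))

# Assumed data of the example (not a general-purpose choice).
Q = np.array([[3.0, 0.0, 1.0],
              [0.0, 4.0, 2.0],
              [1.0, 2.0, 3.0]])
d0 = np.array([1.0, 0.0, 0.0])
d1 = np.array([1.0, 0.0, -3.0])
d2 = np.array([1.0, 4.0, -3.0])

conjugate = is_q_conjugate(Q, [d0, d1, d2])
# Leading principal minors, used to confirm positive definiteness.
minors = [np.linalg.det(Q[:k, :k]) for k in (1, 2, 3)]
print(conjugate, minors)
```

Checking the whole matrix $D^T Q D$ at once is a design convenience: for a $Q$-conjugate set it must be diagonal.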
7 The Conjugate Direction Algorithm This method of finding $Q$-conjugate vectors is inefficient. A systematic procedure for finding $Q$-conjugate vectors can be devised using the idea underlying the Gram-Schmidt process of transforming a given basis of $\mathbb{R}^n$ into an orthonormal basis of $\mathbb{R}^n$.
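One way to sketch the Gram-Schmidt idea mentioned above is to orthogonalize a basis with respect to the inner product $\langle u, v \rangle = u^T Q v$ instead of the usual dot product. The function name `conjugate_gram_schmidt` and the test matrix are assumptions made for illustration; the slides do not spell out this procedure.

```python
import numpy as np

def conjugate_gram_schmidt(Q, basis):
    """Turn linearly independent vectors into Q-conjugate directions.

    Each new direction is the basis vector minus its components along
    the earlier directions, measured in the Q-inner product u^T Q v.
    """
    directions = []
    for p in basis:
        p = np.asarray(p, dtype=float)
        d = p.copy()
        for dj in directions:
            d -= (p @ Q @ dj) / (dj @ Q @ dj) * dj
        directions.append(d)
    return directions

# Assumed symmetric positive definite test matrix.
Q = np.array([[3.0, 0.0, 1.0],
              [0.0, 4.0, 2.0],
              [1.0, 2.0, 3.0]])
dirs = conjugate_gram_schmidt(Q, np.eye(3))
D = np.column_stack(dirs)
gram = D.T @ Q @ D   # should be diagonal if the directions are Q-conjugate
print(gram)
```

Unlike the orthonormal version, the directions are not normalized here, since the conjugate direction algorithm does not require unit length.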
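8 The Conjugate Direction Algorithm Consider minimizing the quadratic function of $n$ variables $f(x) = \frac{1}{2} x^T Q x - x^T b$, where $Q = Q^T > 0$, $x \in \mathbb{R}^n$. Note that because $Q > 0$, the function $f$ has a global minimizer that can be found by solving $Qx = b$. Basic Conjugate Direction Algorithm. Given a starting point $x^{(0)}$ and $Q$-conjugate directions $d^{(0)}, d^{(1)}, \ldots, d^{(n-1)}$; for $k \ge 0$,
$g^{(k)} = \nabla f(x^{(k)}) = Q x^{(k)} - b$,
$\alpha_k = -\dfrac{g^{(k)T} d^{(k)}}{d^{(k)T} Q d^{(k)}}$,
$x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}$.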
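9 The Conjugate Direction Algorithm Theorem 10.1: For any starting point $x^{(0)}$, the basic conjugate direction algorithm converges to the unique $x^*$ (that solves $Qx = b$) in $n$ steps; that is, $x^{(n)} = x^*$. Proof: Consider $x^* - x^{(0)} \in \mathbb{R}^n$. Because the $d^{(i)}$ are linearly independent, there exist constants $\beta_i$, $i = 0, \ldots, n-1$, such that $x^* - x^{(0)} = \beta_0 d^{(0)} + \cdots + \beta_{n-1} d^{(n-1)}$. Now premultiply both sides of this equation by $d^{(k)T} Q$, $0 \le k < n$, to obtain $d^{(k)T} Q (x^* - x^{(0)}) = \beta_k d^{(k)T} Q d^{(k)}$, where the terms $d^{(k)T} Q d^{(i)} = 0$, $k \neq i$, by the $Q$-conjugate property. Hence, $\beta_k = \dfrac{d^{(k)T} Q (x^* - x^{(0)})}{d^{(k)T} Q d^{(k)}}$.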
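10 The Conjugate Direction Algorithm Now, we can write $x^{(k)} = x^{(0)} + \alpha_0 d^{(0)} + \cdots + \alpha_{k-1} d^{(k-1)}$. Therefore, $x^{(k)} - x^{(0)} = \alpha_0 d^{(0)} + \cdots + \alpha_{k-1} d^{(k-1)}$. So writing $x^* - x^{(0)} = (x^* - x^{(k)}) + (x^{(k)} - x^{(0)})$ and premultiplying the above by $d^{(k)T} Q$, we obtain $d^{(k)T} Q (x^* - x^{(0)}) = d^{(k)T} Q (x^* - x^{(k)}) = -d^{(k)T} g^{(k)}$, because $g^{(k)} = Q x^{(k)} - b$ and $Q x^* = b$. Thus, $\beta_k = -\dfrac{d^{(k)T} g^{(k)}}{d^{(k)T} Q d^{(k)}} = \alpha_k$ and $x^* = x^{(n)}$, which completes the proof.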
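11 Example Find the minimizer of $f(x_1, x_2) = \frac{1}{2} x^T \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix} x - x^T \begin{bmatrix} -1 \\ 1 \end{bmatrix}$, $x \in \mathbb{R}^2$, using the conjugate direction method with the initial point $x^{(0)} = [0, 0]^T$, and $Q$-conjugate directions $d^{(0)} = [1, 0]^T$ and $d^{(1)} = [-\frac{3}{8}, \frac{3}{4}]^T$. We have $g^{(0)} = -b = [1, -1]^T$ and hence $\alpha_0 = -\dfrac{g^{(0)T} d^{(0)}}{d^{(0)T} Q d^{(0)}} = -\dfrac{1}{4}$. Thus, $x^{(1)} = x^{(0)} + \alpha_0 d^{(0)} = [-\frac{1}{4}, 0]^T$.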
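12 Example To find $x^{(2)}$, we compute $g^{(1)} = Q x^{(1)} - b = [0, -\frac{3}{2}]^T$ and $\alpha_1 = -\dfrac{g^{(1)T} d^{(1)}}{d^{(1)T} Q d^{(1)}} = \dfrac{9/8}{9/16} = 2$. Therefore, $x^{(2)} = x^{(1)} + \alpha_1 d^{(1)} = [-1, \frac{3}{2}]^T$. Because $f$ is a quadratic function in two variables, $x^{(2)} = x^*$.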
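The two steps of this example can be reproduced with a short sketch of the basic conjugate direction algorithm. The data are assumed to be $Q = [[4, 2], [2, 2]]$, $b = [-1, 1]^T$, $x^{(0)} = 0$, $d^{(0)} = [1, 0]^T$, $d^{(1)} = [-3/8, 3/4]^T$; the function name `conjugate_direction` is illustrative only.

```python
import numpy as np

def conjugate_direction(Q, b, x0, directions):
    """Basic conjugate direction algorithm for f(x) = 0.5 x^T Q x - x^T b.

    `directions` must already be Q-conjugate; one exact step is taken
    along each of them in turn.
    """
    x = np.array(x0, dtype=float)
    for d in directions:
        g = Q @ x - b                    # gradient at the current point
        alpha = -(g @ d) / (d @ Q @ d)   # exact minimizing step length
        x = x + alpha * d
    return x

# Assumed data of the two-variable example.
Q = np.array([[4.0, 2.0],
              [2.0, 2.0]])
b = np.array([-1.0, 1.0])
directions = [np.array([1.0, 0.0]), np.array([-3.0 / 8.0, 3.0 / 4.0])]

x_star = conjugate_direction(Q, b, [0.0, 0.0], directions)
print(x_star)   # compare with the solution of Q x = b
```

Comparing `Q @ x_star` with `b` is a quick way to confirm that the point reached after $n = 2$ steps solves $Qx = b$.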
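13 The Conjugate Direction Algorithm For a quadratic function of $n$ variables, the conjugate direction method reaches the solution after $n$ steps. Suppose that we start at $x^{(0)}$ and search in the direction $d^{(0)}$ to obtain $x^{(1)} = x^{(0)} - \dfrac{g^{(0)T} d^{(0)}}{d^{(0)T} Q d^{(0)}} d^{(0)}$. We claim that $g^{(1)T} d^{(0)} = 0$.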
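14 The Conjugate Direction Algorithm The equation $g^{(1)T} d^{(0)} = 0$ implies that $\alpha_0$ has the property that $\alpha_0 = \arg\min_{\alpha} \phi_0(\alpha)$, where $\phi_0(\alpha) = f(x^{(0)} + \alpha d^{(0)})$. To see this, apply the chain rule to get $\dfrac{d\phi_0}{d\alpha}(\alpha) = \nabla f(x^{(0)} + \alpha d^{(0)})^T d^{(0)}$. Evaluating the above at $\alpha = \alpha_0$, we get $\dfrac{d\phi_0}{d\alpha}(\alpha_0) = g^{(1)T} d^{(0)} = 0$. Because $\phi_0$ is a quadratic function of $\alpha$, and the coefficient of the $\alpha^2$ term in $\phi_0$ is $\frac{1}{2} d^{(0)T} Q d^{(0)} > 0$, the above implies that $\alpha_0 = \arg\min_{\alpha \in \mathbb{R}} \phi_0(\alpha)$.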
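15 The Conjugate Direction Algorithm Using a similar argument, we can show that for all $k$, $\alpha_k = \arg\min_{\alpha} f(x^{(k)} + \alpha d^{(k)})$, and hence $g^{(k+1)T} d^{(k)} = 0$. Lemma 10.2: In the conjugate direction algorithm, $g^{(k+1)T} d^{(i)} = 0$ for all $k$, $0 \le k \le n-1$, and $0 \le i \le k$. Proof: Note that $Q(x^{(k+1)} - x^{(k)}) = Q x^{(k+1)} - b - (Q x^{(k)} - b) = g^{(k+1)} - g^{(k)}$, because $g^{(k)} = Q x^{(k)} - b$. Thus, $g^{(k+1)} = g^{(k)} + \alpha_k Q d^{(k)}$.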
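16 The Conjugate Direction Algorithm We prove the lemma by induction. The result is true for $k = 0$ because $g^{(1)T} d^{(0)} = 0$. We now show that if the result is true for $k-1$ (i.e., $g^{(k)T} d^{(i)} = 0$, $i \le k-1$), then it is true for $k$, i.e., $g^{(k+1)T} d^{(i)} = 0$, $i \le k$. Fix $k > 0$ and $0 \le i < k$. By the induction hypothesis, $g^{(k)T} d^{(i)} = 0$. Because $g^{(k+1)} = g^{(k)} + \alpha_k Q d^{(k)}$ and $d^{(k)T} Q d^{(i)} = 0$ by the $Q$-conjugacy, we have $g^{(k+1)T} d^{(i)} = g^{(k)T} d^{(i)} + \alpha_k d^{(k)T} Q d^{(i)} = 0$. It remains to be shown that $g^{(k+1)T} d^{(k)} = 0$.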
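17 The Conjugate Direction Algorithm Indeed, $g^{(k+1)T} d^{(k)} = (Q x^{(k+1)} - b)^T d^{(k)} = \left( x^{(k)} - \dfrac{g^{(k)T} d^{(k)}}{d^{(k)T} Q d^{(k)}} d^{(k)} \right)^T Q d^{(k)} - b^T d^{(k)} = (Q x^{(k)} - b)^T d^{(k)} - g^{(k)T} d^{(k)} = 0$, because $Q x^{(k)} - b = g^{(k)}$. Therefore, by induction, for all $0 \le k \le n-1$ and $0 \le i \le k$, $g^{(k+1)T} d^{(i)} = 0$.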
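Lemma 10.2 can also be observed numerically: after each step, the new gradient should be orthogonal to every direction used so far. The sketch below is a sanity check under assumed data (a 3-by-3 positive definite $Q$, a right-hand side $b$, a set of $Q$-conjugate directions, and an arbitrary starting point), not a proof.

```python
import numpy as np

# Assumed test problem: Q symmetric positive definite, directions Q-conjugate.
Q = np.array([[3.0, 0.0, 1.0],
              [0.0, 4.0, 2.0],
              [1.0, 2.0, 3.0]])
b = np.array([3.0, 0.0, 1.0])
directions = [np.array([1.0, 0.0, 0.0]),
              np.array([1.0, 0.0, -3.0]),
              np.array([1.0, 4.0, -3.0])]

x = np.array([2.0, -1.0, 0.5])   # arbitrary starting point
max_inner = 0.0                  # largest |g^(k+1)T d^(i)| seen, i <= k
for k, d in enumerate(directions):
    g = Q @ x - b
    alpha = -(g @ d) / (d @ Q @ d)
    x = x + alpha * d
    g_next = Q @ x - b
    for i in range(k + 1):
        max_inner = max(max_inner, abs(g_next @ directions[i]))

print(max_inner)   # expected to be at rounding-error level
```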
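18 The Conjugate Direction Algorithm By Lemma 10.2 we see that $g^{(k+1)}$ is orthogonal to any vector from the subspace spanned by $d^{(0)}, d^{(1)}, \ldots, d^{(k)}$. We now show that not only does $f(x^{(k+1)})$ satisfy $f(x^{(k+1)}) = \min_{\alpha} f(x^{(k)} + \alpha d^{(k)})$, but also $f(x^{(k+1)}) = \min_{a_0, \ldots, a_k} f\left( x^{(0)} + \sum_{i=0}^{k} a_i d^{(i)} \right)$. In other words, if we write $\mathcal{V}_k = x^{(0)} + \mathrm{span}[d^{(0)}, d^{(1)}, \ldots, d^{(k)}]$, then we can express $f(x^{(k+1)}) = \min_{x \in \mathcal{V}_k} f(x)$. As $k$ increases, the subspace $\mathrm{span}[d^{(0)}, d^{(1)}, \ldots, d^{(k)}]$ expands, and will eventually fill the whole of $\mathbb{R}^n$ (provided that $d^{(0)}, d^{(1)}, \ldots$ are linearly independent).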
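19 The Conjugate Direction Algorithm Therefore, for some sufficiently large $k$, $x^*$ will lie in $\mathcal{V}_k$. For this reason, the above result is sometimes called the expanding subspace theorem. To prove the expanding subspace theorem, define the matrix $D^{(k)} = [d^{(0)}, \ldots, d^{(k)}]$. Note that $x^{(0)} + \mathcal{R}(D^{(k)}) = \mathcal{V}_k$, where $\mathcal{R}(D^{(k)})$ denotes the range of $D^{(k)}$. Also $x^{(k+1)} = x^{(0)} + \sum_{i=0}^{k} \alpha_i d^{(i)} = x^{(0)} + D^{(k)} \alpha$, where $\alpha = [\alpha_0, \ldots, \alpha_k]^T$. Hence, $x^{(k+1)} \in x^{(0)} + \mathcal{R}(D^{(k)}) = \mathcal{V}_k$.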
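20 The Conjugate Direction Algorithm Now, consider any vector $x \in \mathcal{V}_k$. There exists a vector $a$ such that $x = x^{(0)} + D^{(k)} a$. Let $\phi_k(a) = f(x^{(0)} + D^{(k)} a)$. Note that $\phi_k$ is a quadratic function and has a unique minimizer that satisfies the FONC (first-order necessary condition). By the chain rule, $D\phi_k(a) = \nabla f(x^{(0)} + D^{(k)} a)^T D^{(k)}$. Therefore, $D\phi_k(\alpha) = \nabla f(x^{(0)} + D^{(k)} \alpha)^T D^{(k)} = \nabla f(x^{(k+1)})^T D^{(k)} = g^{(k+1)T} D^{(k)}$. By Lemma 10.2, $g^{(k+1)T} D^{(k)} = 0^T$. Therefore, $\alpha$ satisfies the FONC for the quadratic function $\phi_k$, and hence $\alpha$ is the minimizer of $\phi_k$; that is, $f(x^{(k+1)}) = \min_{a} f(x^{(0)} + D^{(k)} a) = \min_{x \in \mathcal{V}_k} f(x)$.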
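The expanding subspace property can be illustrated by comparing each iterate with the minimizer of $f$ over $x^{(0)} + \mathcal{R}(D^{(k)})$ computed directly. Setting the gradient of $\phi_k(a) = f(x^{(0)} + D^{(k)} a)$ to zero gives the small linear system $(D^{(k)T} Q D^{(k)}) a = -D^{(k)T} g^{(0)}$. The test problem and starting point below are assumptions chosen for illustration.

```python
import numpy as np

# Assumed test problem with Q-conjugate directions.
Q = np.array([[3.0, 0.0, 1.0],
              [0.0, 4.0, 2.0],
              [1.0, 2.0, 3.0]])
b = np.array([3.0, 0.0, 1.0])
directions = [np.array([1.0, 0.0, 0.0]),
              np.array([1.0, 0.0, -3.0]),
              np.array([1.0, 4.0, -3.0])]

x0 = np.array([2.0, -1.0, 0.5])
g0 = Q @ x0 - b
x = x0.copy()
max_gap = 0.0
for k, d in enumerate(directions):
    g = Q @ x - b
    x = x - (g @ d) / (d @ Q @ d) * d            # conjugate direction step
    D = np.column_stack(directions[:k + 1])      # D^(k) = [d^(0), ..., d^(k)]
    a = np.linalg.solve(D.T @ Q @ D, -D.T @ g0)  # FONC of phi_k
    x_sub = x0 + D @ a                           # minimizer over V_k
    max_gap = max(max_gap, np.linalg.norm(x - x_sub))

print(max_gap)   # expected to be at rounding-error level
```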
21 The Conjugate Gradient Algorithm The conjugate direction algorithm is very effective. However, to use the algorithm, we need to specify the $Q$-conjugate directions. Fortunately, there is a way to generate $Q$-conjugate directions as we perform iterations. The conjugate gradient algorithm does not use prespecified conjugate directions, but instead computes the directions as the algorithm proceeds. At each stage of the algorithm, the direction is calculated as a linear combination of the previous direction and the current gradient, in such a way that all the directions are mutually $Q$-conjugate.
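22 The Conjugate Gradient Algorithm For a quadratic function of $n$ variables, we can locate the function minimizer by performing $n$ searches along mutually conjugate directions. We consider the quadratic function $f(x) = \frac{1}{2} x^T Q x - x^T b$, $x \in \mathbb{R}^n$, where $Q = Q^T > 0$. Our first search direction from an initial point $x^{(0)}$ is in the direction of steepest descent; that is, $d^{(0)} = -g^{(0)}$. Thus, $x^{(1)} = x^{(0)} + \alpha_0 d^{(0)}$, where $\alpha_0 = \arg\min_{\alpha \ge 0} f(x^{(0)} + \alpha d^{(0)}) = -\dfrac{g^{(0)T} d^{(0)}}{d^{(0)T} Q d^{(0)}}$.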
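23 The Conjugate Gradient Algorithm We search in a direction $d^{(1)}$ that is $Q$-conjugate to $d^{(0)}$. We choose $d^{(1)}$ as a linear combination of $g^{(1)}$ and $d^{(0)}$. In general, at the $(k+1)$th step, we choose $d^{(k+1)}$ as a linear combination of $g^{(k+1)}$ and $d^{(k)}$. Specifically, we choose $d^{(k+1)} = -g^{(k+1)} + \beta_k d^{(k)}$, $k = 0, 1, 2, \ldots$ The coefficients $\beta_k$ are chosen in such a way that $d^{(k+1)}$ is $Q$-conjugate to $d^{(0)}, d^{(1)}, \ldots, d^{(k)}$. This is accomplished by choosing $\beta_k$ to be $\beta_k = \dfrac{g^{(k+1)T} Q d^{(k)}}{d^{(k)T} Q d^{(k)}}$.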
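24 The Conjugate Gradient Algorithm The algorithm
1. Set $k := 0$; select the initial point $x^{(0)}$.
2. $g^{(0)} = \nabla f(x^{(0)})$. If $g^{(0)} = 0$, stop; else, set $d^{(0)} = -g^{(0)}$.
3. $\alpha_k = -\dfrac{g^{(k)T} d^{(k)}}{d^{(k)T} Q d^{(k)}}$.
4. $x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}$.
5. $g^{(k+1)} = \nabla f(x^{(k+1)})$. If $g^{(k+1)} = 0$, stop.
6. $\beta_k = \dfrac{g^{(k+1)T} Q d^{(k)}}{d^{(k)T} Q d^{(k)}}$.
7. $d^{(k+1)} = -g^{(k+1)} + \beta_k d^{(k)}$.
8. Set $k := k + 1$; go to step 3.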
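The steps above can be sketched for the quadratic $f(x) = \frac{1}{2} x^T Q x - x^T b$ as follows. The function name `conjugate_gradient`, the tolerance used in place of the exact test $g = 0$, and the small two-variable test problem are assumptions made for illustration.

```python
import numpy as np

def conjugate_gradient(Q, b, x0, tol=1e-10):
    """Conjugate gradient for f(x) = 0.5 x^T Q x - x^T b with Q = Q^T > 0.

    Returns the final point and the number of steps taken. The exact
    stopping test g = 0 is replaced by ||g|| <= tol.
    """
    x = np.array(x0, dtype=float)
    g = Q @ x - b
    if np.linalg.norm(g) <= tol:
        return x, 0
    d = -g                                # first direction: steepest descent
    n = len(b)
    for k in range(n):
        Qd = Q @ d
        alpha = -(g @ d) / (d @ Qd)       # exact step length
        x = x + alpha * d
        g = Q @ x - b
        if np.linalg.norm(g) <= tol:
            return x, k + 1
        beta = (g @ Qd) / (d @ Qd)        # makes the next d Q-conjugate to d
        d = -g + beta * d
    return x, n

# Assumed two-variable test problem.
Q = np.array([[4.0, 2.0],
              [2.0, 2.0]])
b = np.array([-1.0, 1.0])
x_cg, steps = conjugate_gradient(Q, b, [0.0, 0.0])
print(x_cg, steps)
```

Note that only the matrix-vector product `Q @ d` is needed per step, which is why no matrix inversion or factorization appears.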
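25 Example Proposition 10.1: In the conjugate gradient algorithm, the directions $d^{(0)}, d^{(1)}, \ldots, d^{(n-1)}$ are $Q$-conjugate. Consider the quadratic function $f(x_1, x_2, x_3) = \frac{3}{2} x_1^2 + 2 x_2^2 + \frac{3}{2} x_3^2 + x_1 x_3 + 2 x_2 x_3 - 3 x_1 - x_3$. We find the minimizer using the conjugate gradient algorithm, using the starting point $x^{(0)} = [0, 0, 0]^T$. We can represent $f$ as $f(x) = \frac{1}{2} x^T Q x - x^T b$, where $Q = \begin{bmatrix} 3 & 0 & 1 \\ 0 & 4 & 2 \\ 1 & 2 & 3 \end{bmatrix}$, $b = \begin{bmatrix} 3 \\ 0 \\ 1 \end{bmatrix}$.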
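26 Example We have $g(x) = \nabla f(x) = Qx - b = [3x_1 + x_3 - 3,\ 4x_2 + 2x_3,\ x_1 + 2x_2 + 3x_3 - 1]^T$. Hence, $g^{(0)} = [-3, 0, -1]^T$, $d^{(0)} = -g^{(0)}$, $\alpha_0 = -\dfrac{g^{(0)T} d^{(0)}}{d^{(0)T} Q d^{(0)}} = \dfrac{10}{36} = 0.2778$, and $x^{(1)} = x^{(0)} + \alpha_0 d^{(0)} = [0.8333, 0, 0.2778]^T$. The next stage yields $g^{(1)} = [-0.2222, 0.5556, 0.6667]^T$, $\beta_0 = \dfrac{g^{(1)T} Q d^{(0)}}{d^{(0)T} Q d^{(0)}} = 0.08025$, $d^{(1)} = -g^{(1)} + \beta_0 d^{(0)} = [0.4630, -0.5556, -0.5864]^T$, $\alpha_1 = 0.2187$, and $x^{(2)} = [0.9346, -0.1215, 0.1495]^T$.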
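27 Example For the third iteration, $g^{(2)} = [-0.04673, -0.1869, 0.1402]^T$, $\beta_1 = \dfrac{g^{(2)T} Q d^{(1)}}{d^{(1)T} Q d^{(1)}} = 0.07075$, and $d^{(2)} = -g^{(2)} + \beta_1 d^{(1)} = [0.07948, 0.1476, -0.1817]^T$. Hence, $\alpha_2 = 0.8231$ and $x^{(3)} = x^{(2)} + \alpha_2 d^{(2)} = [1.000, 0.000, 0.000]^T$. Because $g^{(3)} = 0$, we have $x^* = x^{(3)}$.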
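28 Proof of Proposition 10.1 We use induction. We first show that $d^{(0)T} Q d^{(1)} = 0$. Substituting for $\beta_0 = \dfrac{g^{(1)T} Q d^{(0)}}{d^{(0)T} Q d^{(0)}}$ in $d^{(0)T} Q d^{(1)} = d^{(0)T} Q (-g^{(1)} + \beta_0 d^{(0)})$, we see that $d^{(0)T} Q d^{(1)} = 0$. Assume that $d^{(0)}, d^{(1)}, \ldots, d^{(k)}$, $k < n-1$, are $Q$-conjugate directions. From Lemma 10.2 we have $g^{(k+1)T} d^{(j)} = 0$, $j = 0, 1, \ldots, k$. Thus, $g^{(k+1)}$ is orthogonal to each of the directions $d^{(0)}, d^{(1)}, \ldots, d^{(k)}$. We now show that $g^{(k+1)T} g^{(j)} = 0$, $j = 0, 1, \ldots, k$.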
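29 Proof of Proposition 10.1 Fix $j \in \{0, \ldots, k\}$. We have $d^{(j)} = -g^{(j)} + \beta_{j-1} d^{(j-1)}$ (for $j = 0$, simply $d^{(0)} = -g^{(0)}$). Substituting this into $g^{(k+1)T} d^{(j)} = 0$ yields $0 = -g^{(k+1)T} g^{(j)} + \beta_{j-1} g^{(k+1)T} d^{(j-1)}$. Because $g^{(k+1)T} d^{(j-1)} = 0$, it follows that $g^{(k+1)T} g^{(j)} = 0$. We are now ready to show that $d^{(k+1)T} Q d^{(j)} = 0$, $j = 0, \ldots, k$. We have $d^{(k+1)T} Q d^{(j)} = (-g^{(k+1)} + \beta_k d^{(k)})^T Q d^{(j)}$. If $j < k$, then $d^{(k)T} Q d^{(j)} = 0$, by virtue of the induction hypothesis. Hence, we have $d^{(k+1)T} Q d^{(j)} = -g^{(k+1)T} Q d^{(j)}$.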
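30 Proof of Proposition 10.1 But $g^{(j+1)} = g^{(j)} + \alpha_j Q d^{(j)}$. Because $g^{(k+1)T} g^{(i)} = 0$, $i = 0, \ldots, k$, we have $d^{(k+1)T} Q d^{(j)} = -g^{(k+1)T} \dfrac{g^{(j+1)} - g^{(j)}}{\alpha_j} = 0$. Thus, $d^{(k+1)T} Q d^{(j)} = 0$, $j = 0, \ldots, k-1$. It remains to show $d^{(k+1)T} Q d^{(k)} = 0$. We have $d^{(k+1)T} Q d^{(k)} = (-g^{(k+1)} + \beta_k d^{(k)})^T Q d^{(k)}$. Using the expression for $\beta_k$, we get $d^{(k+1)T} Q d^{(k)} = 0$, which completes the proof.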
31 The Conjugate Gradient Algorithm for Nonquadratic Problems The algorithm can be extended to general nonlinear functions by interpreting $f(x) = \frac{1}{2} x^T Q x - x^T b$ as a second-order Taylor series approximation of the objective function. For a quadratic, the matrix $Q$, the Hessian of the quadratic, is constant. However, for a general nonlinear function the Hessian is a matrix that has to be reevaluated at each iteration of the algorithm. Observe that $Q$ appears only in the computation of the scalars $\alpha_k$ and $\beta_k$. Because $\alpha_k = \arg\min_{\alpha \ge 0} f(x^{(k)} + \alpha d^{(k)})$, the closed-form formula for $\alpha_k$ can be replaced by a numerical line search procedure, so we need only concern ourselves with the formula for $\beta_k$.
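32 The Conjugate Gradient Algorithm for Nonquadratic Problems Hestenes-Stiefel Formula. The idea is to replace the term $Q d^{(k)}$ by the term $(g^{(k+1)} - g^{(k)}) / \alpha_k$. The two terms are equal in the quadratic case. To see this, start from $x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}$. Premultiplying both sides by $Q$, subtracting $b$ from both sides, and recognizing that $g^{(k)} = Q x^{(k)} - b$, we get $g^{(k+1)} = g^{(k)} + \alpha_k Q d^{(k)}$, which we can rewrite as $Q d^{(k)} = (g^{(k+1)} - g^{(k)}) / \alpha_k$. Substituting this into the formula for $\beta_k$ gives the Hestenes-Stiefel formula $\beta_k = \dfrac{g^{(k+1)T} [g^{(k+1)} - g^{(k)}]}{d^{(k)T} [g^{(k+1)} - g^{(k)}]}$.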
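33 The Conjugate Gradient Algorithm for Nonquadratic Problems Polak-Ribière Formula. Starting from the Hestenes-Stiefel formula, we multiply out the denominator to get $\beta_k = \dfrac{g^{(k+1)T} [g^{(k+1)} - g^{(k)}]}{d^{(k)T} g^{(k+1)} - d^{(k)T} g^{(k)}}$. By Lemma 10.2, $d^{(k)T} g^{(k+1)} = 0$. Also, since $d^{(k)} = -g^{(k)} + \beta_{k-1} d^{(k-1)}$, premultiplying this by $g^{(k)T}$ we get $g^{(k)T} d^{(k)} = -g^{(k)T} g^{(k)} + \beta_{k-1} g^{(k)T} d^{(k-1)} = -g^{(k)T} g^{(k)}$, where once again we used Lemma 10.2. Hence, we get the Polak-Ribière formula $\beta_k = \dfrac{g^{(k+1)T} [g^{(k+1)} - g^{(k)}]}{g^{(k)T} g^{(k)}}$.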
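34 The Conjugate Gradient Algorithm for Nonquadratic Problems Fletcher-Reeves Formula. Starting with the Polak-Ribière formula, we multiply out the numerator to get $\beta_k = \dfrac{g^{(k+1)T} g^{(k+1)} - g^{(k+1)T} g^{(k)}}{g^{(k)T} g^{(k)}}$. We now use $g^{(k+1)T} g^{(k)} = 0$, which we get by using the equation $g^{(k+1)T} d^{(k)} = -g^{(k+1)T} g^{(k)} + \beta_{k-1} g^{(k+1)T} d^{(k-1)}$ and applying Lemma 10.2. This leads to the Fletcher-Reeves formula $\beta_k = \dfrac{g^{(k+1)T} g^{(k+1)}}{g^{(k)T} g^{(k)}}$.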
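The three Hessian-free formulas can be compared against the original $Q$-based formula on a quadratic with an exact line search, where they are expected to coincide. The function names and the test problem below are assumptions made for this sketch.

```python
import numpy as np

def beta_hestenes_stiefel(g_new, g_old, d):
    y = g_new - g_old
    return (g_new @ y) / (d @ y)

def beta_polak_ribiere(g_new, g_old):
    return (g_new @ (g_new - g_old)) / (g_old @ g_old)

def beta_fletcher_reeves(g_new, g_old):
    return (g_new @ g_new) / (g_old @ g_old)

# Assumed quadratic test problem f(x) = 0.5 x^T Q x - x^T b.
Q = np.array([[3.0, 0.0, 1.0],
              [0.0, 4.0, 2.0],
              [1.0, 2.0, 3.0]])
b = np.array([3.0, 0.0, 1.0])

x = np.zeros(3)
g_old = Q @ x - b
d = -g_old                                 # first conjugate gradient direction
alpha = -(g_old @ d) / (d @ Q @ d)         # exact line search on the quadratic
g_new = Q @ (x + alpha * d) - b

beta_q = (g_new @ Q @ d) / (d @ Q @ d)     # original formula using Q
betas = np.array([beta_hestenes_stiefel(g_new, g_old, d),
                  beta_polak_ribiere(g_new, g_old),
                  beta_fletcher_reeves(g_new, g_old)])
print(beta_q, betas)
```

With an inexact line search or a nonquadratic objective the three values generally differ, which is why the choice of formula matters in practice.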
35 The Conjugate Gradient Algorithm for Nonquadratic Problems These formulas do not involve the Hessian matrix; all we need are the objective function and gradient values at each iteration. For the quadratic case the three expressions for $\beta_k$ are exactly equal. A very important issue in minimization problems of nonquadratic functions is the line search. If the line search is known to be inaccurate, the Hestenes-Stiefel formula for $\beta_k$ is recommended.
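A minimal sketch of how these pieces might fit together for a nonquadratic objective is shown below. Several choices here are assumptions rather than part of the slides: the Polak-Ribière formula clipped at zero, a simple Armijo backtracking line search, a restart with the steepest descent direction whenever the current direction is not a descent direction, and the test objective (a quadratic plus a quartic term).

```python
import numpy as np

def nonlinear_cg(f, grad, x0, tol=1e-6, max_iter=2000):
    """Nonlinear conjugate gradient sketch (clipped Polak-Ribiere beta)."""
    x = np.array(x0, dtype=float)
    g = grad(x)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        if g @ d >= 0.0:
            d = -g                       # restart: d is not a descent direction
        alpha, fx, slope = 1.0, f(x), g @ d
        # Armijo backtracking: halve alpha until sufficient decrease holds.
        while f(x + alpha * d) > fx + 1e-4 * alpha * slope and alpha > 1e-12:
            alpha *= 0.5
        x = x + alpha * d
        g_new = grad(x)
        beta = max(0.0, (g_new @ (g_new - g)) / (g @ g))
        d = -g_new + beta * d
        g = g_new
    return x

# Assumed nonquadratic test objective: quadratic plus a quartic term.
Q = np.array([[3.0, 0.0, 1.0],
              [0.0, 4.0, 2.0],
              [1.0, 2.0, 3.0]])
b = np.array([3.0, 0.0, 1.0])
f = lambda x: 0.5 * x @ Q @ x - b @ x + 0.25 * np.sum(x ** 4)
grad = lambda x: Q @ x - b + x ** 3

x_min = nonlinear_cg(f, grad, np.zeros(3))
print(x_min, np.linalg.norm(grad(x_min)))
```

Only function and gradient evaluations are used, in line with the remark that no Hessian is required.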
36 Homework 2 Exercises 8.3, 8.15 Exercises 9.1 Exercises 10.9 Hand over your homework at the class of Apr
HILBERT SPACES AND THE RADON-NIKODYM THEOREM STEVEN P. LALLEY 1. DEFINITIONS Definition 1. A real inner product space is a real vector space V together with a symmetric, bilinear, positive-definite mapping,
More informationSection 6.2, 6.3 Orthogonal Sets, Orthogonal Projections
Section 6. 6. Orthogonal Sets Orthogonal Projections Main Ideas in these sections: Orthogonal set = A set of mutually orthogonal vectors. OG LI. Orthogonal Projection of y onto u or onto an OG set {u u
More information1 Newton s Method. Suppose we want to solve: x R. At x = x, f (x) can be approximated by:
Newton s Method Suppose we want to solve: (P:) min f (x) At x = x, f (x) can be approximated by: n x R. f (x) h(x) := f ( x)+ f ( x) T (x x)+ (x x) t H ( x)(x x), 2 which is the quadratic Taylor expansion
More informationwhich are not all zero. The proof in the case where some vector other than combination of the other vectors in S is similar.
It follows that S is linearly dependent since the equation is satisfied by which are not all zero. The proof in the case where some vector other than combination of the other vectors in S is similar. is
More information1. General Vector Spaces
1.1. Vector space axioms. 1. General Vector Spaces Definition 1.1. Let V be a nonempty set of objects on which the operations of addition and scalar multiplication are defined. By addition we mean a rule
More informationMathematical optimization
Optimization Mathematical optimization Determine the best solutions to certain mathematically defined problems that are under constrained determine optimality criteria determine the convergence of the
More informationCLASS NOTES Computational Methods for Engineering Applications I Spring 2015
CLASS NOTES Computational Methods for Engineering Applications I Spring 2015 Petros Koumoutsakos Gerardo Tauriello (Last update: July 27, 2015) IMPORTANT DISCLAIMERS 1. REFERENCES: Much of the material
More informationA globally and R-linearly convergent hybrid HS and PRP method and its inexact version with applications
A globally and R-linearly convergent hybrid HS and PRP method and its inexact version with applications Weijun Zhou 28 October 20 Abstract A hybrid HS and PRP type conjugate gradient method for smooth
More informationLecture 1: Basic Concepts
ENGG 5781: Matrix Analysis and Computations Lecture 1: Basic Concepts 2018-19 First Term Instructor: Wing-Kin Ma This note is not a supplementary material for the main slides. I will write notes such as
More informationVectors. Vectors and the scalar multiplication and vector addition operations:
Vectors Vectors and the scalar multiplication and vector addition operations: x 1 x 1 y 1 2x 1 + 3y 1 x x n 1 = 2 x R n, 2 2 y + 3 2 2x = 2 + 3y 2............ x n x n y n 2x n + 3y n I ll use the two terms
More informationFunctional Analysis HW #5
Functional Analysis HW #5 Sangchul Lee October 29, 2015 Contents 1 Solutions........................................ 1 1 Solutions Exercise 3.4. Show that C([0, 1]) is not a Hilbert space, that is, there
More informationOptimization: Nonlinear Optimization without Constraints. Nonlinear Optimization without Constraints 1 / 23
Optimization: Nonlinear Optimization without Constraints Nonlinear Optimization without Constraints 1 / 23 Nonlinear optimization without constraints Unconstrained minimization min x f(x) where f(x) is
More information