Lecture Notes: Geometric Considerations in Unconstrained Optimization


James T. Allison

February 15, 2006

The primary objectives of this lecture on unconstrained optimization are to:

Establish connections between optimality conditions and problem geometry

Provide several motivations for the gradient method and Newton's method

Illustrate these concepts with numerical examples

The derivation of optimality conditions using abstract means is important, but the intuition gained through a geometric understanding of optimality conditions can also be very useful. This additional insight can contribute to more effective implementation of optimization theory. A brief derivation of first and second order conditions is provided, followed by a discussion of function approximation models and a geometric explanation of optimality conditions. Finally, the impact of problem condition and scaling on optimization algorithms is discussed.

1 Optimality Conditions

For a point x_0 to be a minimum, perturbations about this point (\Delta x = x - x_0) must result only in objective function increases:

\Delta f = f(x) - f(x_0) \geq 0   (1)

Finite term Taylor series expansions of a function are accurate near the point of expansion. Combining a first order expansion with equation 1, we can derive a necessary condition for optimality^1. Since \Delta f = \nabla f(x_0)^T \Delta x + o(\|\Delta x\|), and \Delta f \geq 0 must hold for every small perturbation \Delta x, the linear term \nabla f(x_0)^T \Delta x must be nonnegative for all \Delta x, which is possible only if the gradient vanishes:

\nabla f(x^*) = 0   (2)

A point that meets this condition is a stationary point (x^*), but it is unknown whether this point is a minimum, a maximum, or a saddle point. Evaluation of this first order necessary condition involves the solution of a system of nonlinear equations (equation 2). A second order expansion about a known stationary point provides curvature information via a quadratic approximation of the function, and enables the determination of whether the stationary point is in fact a minimum.
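These conditions can be checked numerically. The sketch below uses an assumed example function f(x) = (x_1 - 1)^2 + 2(x_2 + 3)^2, whose minimum x^* = (1, -3) is known by inspection: the gradient vanishes there (equation 2), and random perturbations never decrease f (equation 1).

```python
# Numerical check of the optimality conditions for an assumed example
# function f(x) = (x1 - 1)^2 + 2*(x2 + 3)^2, with minimum x* = (1, -3).
import random

def f(x):
    return (x[0] - 1)**2 + 2*(x[1] + 3)**2

def grad(x):
    # analytical gradient of f
    return [2*(x[0] - 1), 4*(x[1] + 3)]

x_star = [1.0, -3.0]

# equation 2: the gradient vanishes at the minimum
stationary = max(abs(g) for g in grad(x_star)) < 1e-12

# equation 1: random perturbations about x* never decrease f
random.seed(0)
no_decrease = all(
    f([x_star[0] + random.uniform(-0.1, 0.1),
       x_star[1] + random.uniform(-0.1, 0.1)]) >= f(x_star)
    for _ in range(100)
)
```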
If we apply equation 1 to a second order expansion about a stationary point, noting that the linear term in this case is zero, we arrive at the following condition:

\Delta x^T H \Delta x > 0 \quad \forall \Delta x \neq 0   (3)

H is the Hessian (also written as \nabla^2 f(x)). The satisfaction of this condition together with the stationarity condition of equation 2 comprises a second order sufficiency condition, i.e., if this condition is met the point in question is known to be a minimum. Evaluating this condition for all possible perturbations would be very difficult. However, it is known from linear algebra theory that equation 3 is satisfied if and only if the objective function Hessian matrix is positive definite. A positive definite matrix is often denoted with the expression H \succ 0. A matrix is positive definite if and only if all of its eigenvalues are positive, and eigenvalues are easily evaluated numerically. The relationship between positive definiteness, positive eigenvalues, and function geometry will be clarified in these lecture notes.

^1 Note that in this document vectors are considered to be column vectors, and gradients are also considered to be column vectors. The transpose of a vector x is denoted x^T.

Copyright © 2006 by James T. Allison

2 Function Models

The Taylor series expansions used in deriving the above optimality conditions can be viewed as function approximation models. Both linear and quadratic models were used, and a geometric understanding of these models can add insight to optimality conditions and optimization algorithms.

Linear Function Models

A linear function model characterizes the slope of a function in the neighborhood of a point. In R a linear model is a line tangent to a function, and in R^n space^2 it is a hyperplane tangent to the function. If the tangent plane is not horizontal, then directions of descent exist, as does an improved objective function value. Therefore, an optimal point must have a horizontal tangent plane. The gradient of the objective function is zero when the tangent plane, defined by a linear Taylor series expansion, is horizontal. This verifies equation 2. This geometric description also motivates the gradient method for unconstrained optimization.

Gradient Method Algorithm:

1. Build a linear model for the function at the current point, and if descent directions exist, move in the direction of steepest descent (-\nabla f) until the objective function stops improving.

2. Update the linear model and repeat until \nabla f = 0.
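The two steps above can be sketched in code. This is a minimal sketch for a convex quadratic f(x) = x^T A x, where the matrix A and starting point are assumed examples; for a quadratic, the exact line-search step along the steepest-descent direction is \alpha = g^T g / (2 g^T A g), with g = \nabla f(x) = 2Ax.

```python
# Minimal sketch of the gradient method with exact line search on the
# convex quadratic f(x) = x^T A x; the matrix A and starting point are
# assumed examples.  Stops when the gradient norm falls below tol.
import math

A = [[7.0, 1.2], [1.2, 1.0]]

def grad(x):
    # gradient of x^T A x is 2*A*x
    return [2*(A[0][0]*x[0] + A[0][1]*x[1]),
            2*(A[1][0]*x[0] + A[1][1]*x[1])]

def gradient_method(x, tol=1e-10, max_iter=1000):
    for k in range(max_iter):
        g = grad(x)
        gg = g[0]*g[0] + g[1]*g[1]
        if math.sqrt(gg) < tol:
            return x, k
        # exact line-search step for a quadratic: alpha = g^T g / (2 g^T A g)
        Ag = [A[0][0]*g[0] + A[0][1]*g[1],
              A[1][0]*g[0] + A[1][1]*g[1]]
        alpha = gg / (2*(g[0]*Ag[0] + g[1]*Ag[1]))
        x = [x[0] - alpha*g[0], x[1] - alpha*g[1]]
    return x, max_iter

x_min, iters = gradient_method([10.0, 5.0])
```

Each iteration moves along the steepest-descent direction until the one-dimensional slice stops improving, then rebuilds the linear model, exactly as in the two-step algorithm above.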
The iterative formula for the gradient method, where k is the iteration number and \alpha is the step size, is:

x^{k+1} = x^k - \alpha \nabla f(x^k)   (4)

The gradient method algorithm converts a multidimensional minimization problem to a sequence of one-dimensional line searches. During each of these line searches we are looking at a slice of the objective function surface. This is illustrated in the following example.

Example 1: Consider the quadratic function:

f(x) = 7x_1^2 + 2.4x_1x_2 + x_2^2   (5)

The contours of the function level sets are shown in the first plot of Figure 1. We can see by inspection of the objective function that the minimum is at x^* = [0 \; 0]^T. If we start at x^0 = [10 \; 5]^T and perform the line search \min_\alpha f(x^0 - \alpha \nabla f(x^0)), the objective function in the search direction appears as shown in the second plot of Figure 1. The search direction is illustrated in the first plot, and the second plot is a slice of the objective function surface in this search direction.

Figure 1: Contour and line search plots for the quadratic function of Example 1

Quadratic Function Models

A quadratic model can capture curvature information of a function in the neighborhood of a point. In R a quadratic model is a parabola, and in R^n it is a paraboloid. Constructing a quadratic model of a function facilitates the approximation of the function's stationary point, since the quadratic model has its own stationary point. Linear models (hyperplanes) do not have stationary points. The closer to quadratic a function's shape is, the better this approximation will be, and it will of course be exact for quadratic objective functions. Iterative approximation of a function's stationary point forms the basis for Newton's method for unconstrained optimization. Sequential quadratic modeling is the first of three motivations for Newton's method for optimization that will be discussed in these lecture notes.

^2 R is the set of all real numbers, and R^n is the set of all real-valued vectors of length n.

Newton's Method:

1. Build a quadratic model for the function at the current point, and use the stationary point of this model as the approximation for the objective function stationary point.

2. Check for convergence, and iterate if not converged.

The iterative formula for Newton's method, where H^{-1} is the inverse of the objective function's Hessian, is:

x^{k+1} = x^k - H^{-1} \nabla f(x^k)   (6)

Newton's method exhibits very fast (quadratic) local convergence compared to the slower linear convergence of the gradient method. Newton's method, however, can be unstable. It may converge to a maximum instead of a minimum, since it does not have a descent property. Newton's method seeks to find a stationary point, but has no ability to distinguish between a maximum and a minimum. In contrast, the gradient method will always decrease the objective function at each iteration because it always moves in a descent direction. The gradient method will find a stationary point that is either a minimum, or a saddle point that is an improvement over the starting point. In other words, the gradient method is effective at moving in a descent direction, even far from a stationary point, while Newton's method is effective at converging quickly to a stationary point when one is near. Quasi-Newton methods combine the good global convergence of the gradient method with the rapid local convergence of Newton's method. Such methods begin with gradient method iterations, and dynamically transform into Newton's method iterations.
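The iteration of equation 6 can be sketched for an assumed quadratic f(x) = b^T x + x^T A x, whose gradient is b + 2Ax and whose Hessian 2A is constant. Because the quadratic model is exact here, a single Newton step lands on the stationary point.

```python
# One Newton step (equation 6) on an assumed quadratic
# f(x) = b^T x + x^T A x, whose gradient is b + 2*A*x and Hessian is 2*A.

A = [[7.0, 1.2], [1.2, 1.0]]
b = [2.0, -4.0]

def grad(x):
    return [b[0] + 2*(A[0][0]*x[0] + A[0][1]*x[1]),
            b[1] + 2*(A[1][0]*x[0] + A[1][1]*x[1])]

# constant Hessian H = 2A, inverted in closed form for the 2x2 case
H = [[2*A[0][0], 2*A[0][1]], [2*A[1][0], 2*A[1][1]]]
det = H[0][0]*H[1][1] - H[0][1]*H[1][0]
Hinv = [[ H[1][1]/det, -H[0][1]/det],
        [-H[1][0]/det,  H[0][0]/det]]

x = [10.0, 5.0]
g = grad(x)
x_new = [x[0] - (Hinv[0][0]*g[0] + Hinv[0][1]*g[1]),
         x[1] - (Hinv[1][0]*g[0] + Hinv[1][1]*g[1])]
g_new = grad(x_new)   # essentially zero: the model is exact for a quadratic
```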
You may be familiar with another form of Newton's method that is used for finding the roots of a function. The one-dimensional root finding formula is:

x^{k+1} = x^k - \frac{f(x^k)}{f'(x^k)}
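A sketch of this one-dimensional iteration, applied to the illustrative function f(x) = x^2 - 2 (so the positive root is \sqrt{2}):

```python
# Sketch of the one-dimensional root-finding iteration
# x_{k+1} = x_k - f(x_k)/f'(x_k), applied to f(x) = x^2 - 2.

def newton_root(f, fprime, x, tol=1e-12, max_iter=50):
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        x = x - fx / fprime(x)
    return x

root = newton_root(lambda x: x*x - 2.0, lambda x: 2.0*x, 1.0)
```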

An extension of Newton's method to multiple dimensions takes the form:

x^{k+1} = x^k - J^{-1} f(x^k)   (7)

This multidimensional version seeks to solve the system of equations f(x) = 0. Note that f(x) is a vector-valued function. The matrix J is the Jacobian of the function f(x), which is a matrix where each row is the transpose of the gradient of the corresponding component of the vector function f(x). This next concept establishes the connection between Newton's method for root finding (i.e., for solving systems of nonlinear equations) and Newton's method for unconstrained optimization. Recall that if we are seeking to find a stationary point of an objective function f(x), we need to solve the system of equations \nabla f(x) = 0. If we use Newton's method for solving nonlinear systems of equations to solve \nabla f(x) = 0, we replace f(x) in equation 7 with the vector-valued function \nabla f(x), and replace J^{-1} with the inverse of the Jacobian of \nabla f(x). Observe that the Jacobian of \nabla f(x) is in fact the Hessian of f(x), so its inverse is H^{-1}. Hence, by applying Newton's method for solving systems of equations to the problem of finding a stationary point, we have derived Newton's method for unconstrained optimization as defined in equation 6. This is the second of three motivations for Newton's method discussed in this document.

3 Quadratic Forms and Geometry

Quadratic models can take either of two general shapes: paraboloid (convex or concave) or hyperboloid (a saddle). Example 1 exhibited a function with a convex parabolic shape, and a hyperboloid will be illustrated shortly. First a brief review of quadratic forms will be given. A function has a quadratic form if it is a linear combination of x_i x_j terms. It can be written in matrix form: f(x) = x^T A x, where A is a symmetric matrix that defines the quadratic function. The conversion of the function in equation 8 will be illustrated.
f(x) = 2x_1^2 + 2x_1x_2 + x_2^2 + 2x_2x_3 + x_3^2   (8)

First, the coefficient of each squared term (i.e., i = j) is placed on the diagonal at location (i, i). Then, since each cross (or interaction) term is split across two off-diagonal entries, its coefficient is divided by two and placed in each entry. Finally, any quadratic terms that do not appear in the original function are assigned a value of zero in the matrix. The function in equation 8, rewritten in matrix form, is:

f(x) = [x_1 \; x_2 \; x_3] \begin{bmatrix} 2 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x^T A x   (9)

The correctness of this representation can be verified by performing the vector and matrix multiplications in equation 9, and observing that the result simplifies to equation 8. Certain properties of the quadratic form indicate what type of shape the function has. Three general possible shapes exist.
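The conversion rule can be verified numerically. The polynomial and matrix below are a hypothetical three-variable example in the spirit of equation 8: squared-term coefficients on the diagonal, cross-term coefficients halved off the diagonal, and a zero for the absent x_1x_3 term.

```python
# Verify the conversion rule on a hypothetical three-variable quadratic form:
# squared-term coefficients on the diagonal, cross-term coefficients halved
# on the off-diagonal, zero for the absent x1*x3 term.

A = [[2.0, 1.0, 0.0],
     [1.0, 1.0, 1.0],
     [0.0, 1.0, 1.0]]

def quad_form(A, x):
    n = len(x)
    return sum(x[i]*A[i][j]*x[j] for i in range(n) for j in range(n))

def poly(x):
    x1, x2, x3 = x
    return 2*x1*x1 + 2*x1*x2 + x2*x2 + 2*x2*x3 + x3*x3

x = [1.0, -2.0, 3.0]
match = abs(quad_form(A, x) - poly(x)) < 1e-12
```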

Figure 2: Illustration of convex, concave, and hyperbolic quadratic functions

If x^T A x > 0 for all x \neq 0, A is positive definite (convex quadratic function).

If x^T A x < 0 for all x \neq 0, A is negative definite (concave quadratic function).

If x^T A x is positive for some x and negative for others, A is indefinite (hyperbolic quadratic function).

Figure 2 illustrates each of these three cases using both surface and contour plots of convex, concave, and hyperbolic quadratic functions. The quadratic functions corresponding to the left, center, and right plots, respectively, are given below.

f_1(x) = x^T A_1 x, \quad f_2(x) = x^T A_2 x, \quad f_3(x) = x^T A_3 x

where:

A_1 = \begin{bmatrix} 7 & 1.2 \\ 1.2 & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} -5 & -2.6 \\ -2.6 & -3 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 4.7 & 3.5 \\ 3.5 & -4.7 \end{bmatrix}

It can also be demonstrated that if a matrix has all positive eigenvalues, i.e., \lambda_i > 0 \; \forall i, the matrix is positive definite. Similarly, if \lambda_i < 0 \; \forall i, the corresponding matrix is negative definite, and if the eigenvalues take both positive and negative values, then the matrix is indefinite. Eigenvalues provide a nice way of evaluating properties of a quadratic function, but what exactly is the connection between eigenvalues and function geometry? We will demonstrate an intuitive interpretation of eigenvalues, and illustrate this with a numerical example.
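This eigenvalue test is easy to carry out for a symmetric 2x2 matrix, where the eigenvalues have a closed form. The sketch below classifies two assumed example matrices, the first chosen to resemble A_1 above:

```python
# Classify a symmetric 2x2 matrix [[a, b], [b, c]] by its eigenvalues,
# which have a closed form for the 2x2 case.
import math

def eigenvalues_2x2(a, b, c):
    mean = (a + c) / 2
    r = math.sqrt(((a - c) / 2)**2 + b*b)
    return mean - r, mean + r

def classify(a, b, c):
    lo, hi = eigenvalues_2x2(a, b, c)
    if lo > 0:
        return "positive definite"   # convex quadratic
    if hi < 0:
        return "negative definite"   # concave quadratic
    return "indefinite"              # hyperbolic (saddle)

kind1 = classify(7.0, 1.2, 1.0)    # resembles A1 above
kind2 = classify(4.7, 3.5, -4.7)   # an assumed indefinite example
```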

Eigenvalues, Eigenvectors, and Geometry

An eigenvalue \lambda and corresponding eigenvector v of a matrix A satisfy the relation:

A v = \lambda v   (10)

Eigenvectors v are vectors that result in a scalar multiple of themselves when they are pre-multiplied by the associated matrix. We can gain geometric intuition for what eigenvalues and eigenvectors are by shifting and rotating the coordinate system that we use to view a quadratic function. This is a lengthy process, but the end result will provide significant geometric insight. We start with a general quadratic function (including constant and linear terms), and translate the coordinate axes to be centered at the function's stationary point by defining new coordinates: z = x - x^*. Note that this function's gradient is b + 2Ax, and the stationary point is x^* = -\frac{1}{2} A^{-1} b. The vectors x and b have length n, and the matrix A has dimension n \times n.

f(x) = f_0 + x^T b + x^T A x

f(z) = f_0 + (z + x^*)^T b + (z + x^*)^T A (z + x^*)

f(z) = (f_0 + x^{*T} b + x^{*T} A x^*) + z^T A z + z^T (b + 2 A x^*)

f(z) = f^* + z^T A z

For convenience f^* is defined as the function value at x^*, and the last term of the third equation was dropped because of stationarity (i.e., b + 2Ax^* = 0). The coordinate system can be rotated by transforming (multiplying) the coordinate variables with a matrix. Consider the matrix V = [v_1 \; v_2 \; ... \; v_n], with columns that are the normalized eigenvectors of A. Using V to rotate the coordinates will cause the coordinates to be aligned with the eigenvectors of A. This rotation is effected through the multiplication p = V^T z, where p are the new coordinates. Since the normalized eigenvectors form an orthonormal basis, the matrix V is orthogonal, and the following identities hold:

V^T = V^{-1}, \quad V^T V = V V^T = I

I is the identity matrix, an n \times n matrix with ones on the diagonal. We can use these identities and the definition of p to write z in terms of the rotated coordinates p.
z = Iz = V V^T z = V p

It will also be helpful to know z^T in terms of p^T:

z^T = (V p)^T = p^T V^T

Substituting these expressions for z and z^T into the last equation for f(z), we arrive at a new form of the original quadratic function in terms of the translated and rotated coordinates p:

f(p) = f^* + p^T V^T A V p

This functional expression can be further simplified by defining the matrix \Lambda = V^T A V, which turns out to be a diagonal matrix whose entries are the eigenvalues of A. The function can be rewritten as:

f(p) = f^* + p^T \Lambda p   (11)

Since all off-diagonal terms are zero, the function can be written using a simple summation (equation 12). This final result will enable geometric interpretation of eigenvalues.

f(p) = f^* + \sum_{i=1}^{n} \lambda_i p_i^2   (12)

This form provides an excellent geometric interpretation for eigenvalues and eigenvectors. If we move along an eigenvector direction (i.e., vary p_i), the function will decrease if \lambda_i < 0, and increase if \lambda_i > 0. This interpretation is congruent with the geometry associated with positive definite, negative definite, and indefinite matrices. If an eigenvalue is large, then the rate of change in the associated direction will be large. The eigenvalues and eigenvectors for the functions in Figure 2 are shown below, and the eigenvector directions are plotted in Figure 3. Eigenvectors point in the direction of the axes of the level set contour ellipses. Note that the eigenvectors associated with the larger eigenvalues point in the direction of the minor axes of the level set ellipses, since the function is steepest in the direction of the minor axes.
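Equation 12 can be checked numerically for a pure quadratic (b = 0, so x^* = 0 and f^* = 0): rotating a point into the eigenvector basis and summing \lambda_i p_i^2 reproduces x^T A x. The matrix A below is an assumed example.

```python
# Check equation 12 for a pure quadratic (b = 0, so x* = 0 and f* = 0):
# rotating into the eigenvector basis turns x^T A x into
# lambda_1 * p_1^2 + lambda_2 * p_2^2.  A is an assumed example.
import math

A = [[7.0, 1.2], [1.2, 1.0]]

# closed-form eigenvalues of the symmetric 2x2 matrix
mean = (A[0][0] + A[1][1]) / 2
r = math.sqrt(((A[0][0] - A[1][1]) / 2)**2 + A[0][1]**2)
lam = [mean - r, mean + r]

def eigvec(l):
    # for [[a, b], [b, c]], an eigenvector for eigenvalue l is [b, l - a]
    vx, vy = A[0][1], l - A[0][0]
    n = math.hypot(vx, vy)
    return [vx / n, vy / n]

v1, v2 = eigvec(lam[0]), eigvec(lam[1])

x = [3.0, -1.0]
p = [v1[0]*x[0] + v1[1]*x[1],    # p = V^T x: coordinates in the
     v2[0]*x[0] + v2[1]*x[1]]    # rotated (eigenvector) system

f_direct = (x[0]*(A[0][0]*x[0] + A[0][1]*x[1])
            + x[1]*(A[1][0]*x[0] + A[1][1]*x[1]))
f_rotated = lam[0]*p[0]**2 + lam[1]*p[1]**2
```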

Figure 3: Eigenvector directions of quadratic functions from Figure 2

Function 1: \lambda_1 = 0.769, v_1 = [0.189, -0.982]^T; \lambda_2 = 7.23, v_2 = [0.982, 0.189]^T

Function 2: \lambda_1 = -1.21, v_1 = [0.566, -0.824]^T; \lambda_2 = -6.79, v_2 = [0.824, 0.566]^T

Function 3: \lambda_1 = 5.86, v_1 = [0.949, 0.314]^T; \lambda_2 = -5.86, v_2 = [0.314, -0.949]^T

4 Problem Condition and Scaling

An objective function is more difficult to minimize when it is highly elliptical. A quantitative measure of this is the condition number C of a function (equation 13), defined as the ratio between the maximum and minimum eigenvalues of the function's Hessian. A perfectly conditioned function has a condition number of 1, while ill-conditioned problems have very large condition numbers. Recall that large eigenvalues correspond to very steep function responses. Thus, an ill-conditioned function changes rapidly in some directions and very little in others. Also note that these directions of disparate sensitivity are not necessarily aligned with the coordinate axes.

C = \frac{\lambda_{max}}{\lambda_{min}}   (13)

The gradient method has particular difficulty with poorly conditioned problems. The influence of steep directions can drown out the influence of relatively flat directions. For example, if the algorithm is evaluating a point in a long, narrow valley, the gradient method could be numerically fooled into thinking that the gradient is zero, even if the point is far from the minimum. The directional derivative in the steep direction may in fact be zero if the point is at the low point of the valley, but since the derivative in the nearly flat direction is so small, machine precision limitations may incorrectly identify a zero gradient. When the gradient method is stuck in such a valley, not much progress can be made with each step because of the relatively small gradient. Whether algorithm convergence is based on having a zero gradient or a sufficiently small step size, the gradient method may terminate before finding the solution because of poor scaling. In addition, it can be shown that each search direction of the gradient method (with exact line search) is orthogonal to the previous search direction. This results in a zig-zag route to the solution that requires many iterations.

What can be done to address this issue with using the gradient method for ill-conditioned problems? A common approach is to scale the variables such that the objective function is approximately equally sensitive to all variables. A simple scaling approach is to multiply each variable by a scalar such that the nominal value (or starting point value) is equal to one. These scalar multipliers can be used to form a scaling vector s, such that the scaled variables are calculated elementwise: y_i = s_i x_i. Here y is the vector of scaled variables, and s is the scaling vector. Similarly, in constrained optimization it is important to scale the objective and constraint function values such that they have the same magnitude. Scaling each variable individually works well when the eigenvectors are nearly aligned with the coordinate axes. Recall that this might not be the case, i.e., a function may have a steep direction that points somewhere in between the coordinate axes. Such a function requires more sophisticated scaling to achieve reasonable conditioning. The interaction between variables must be considered, and a scaling matrix may be used to accomplish this, since the off-diagonal terms of a matrix can account for variable interaction. A useful class of scaling matrices are symmetric and positive definite.
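The condition number of equation 13 is easy to compute for a symmetric 2x2 Hessian using the closed-form eigenvalues; the two Hessians below are assumed examples of a well-conditioned function and of a long, narrow valley.

```python
# Condition number C = lambda_max / lambda_min (equation 13) for a
# symmetric 2x2 Hessian [[a, b], [b, c]]; the Hessians are assumed examples.
import math

def condition_number(a, b, c):
    mean = (a + c) / 2
    r = math.sqrt(((a - c) / 2)**2 + b*b)
    return (mean + r) / (mean - r)   # meaningful when positive definite

c_well = condition_number(2.0, 0.0, 2.0)     # circular level sets
c_ill  = condition_number(100.0, 0.0, 1.0)   # long, narrow valley
```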
For convenience we define S^{-1} as a scaling matrix, and write:

x = S y   (14)

If we define the objective function in the new variable space as h(y) = f(Sy), then the new gradient method iteration becomes:

y^{k+1} = y^k - \alpha \nabla h(y^k)   (15)

Although we could proceed using this formula, obtain a solution in terms of y, and convert the solution back to the original variable space using equation 14, it will be instructive to recast equation 15 in terms of x. If we premultiply this equation by S, define S^2 = D_k, and use the chain rule to obtain the relation \nabla h(y) = S \nabla f(x), we find after algebraic manipulation that:

x^{k+1} = x^k - \alpha D_k \nabla f(x^k)   (16)

This result is in fact a scaled gradient method iteration. The gradient scaling matrix for iteration k, D_k, will ensure descent if it is symmetric and positive definite. It turns out that the best scaling results are obtained if we set D_k to the inverse of the function's Hessian evaluated at x^k, i.e., D_k = H(x^k)^{-1}. Observe that when this is the case, and if we set the step size \alpha = 1, the scaled steepest descent algorithm becomes Newton's method for optimization, as defined in equation 6. This is the third and final motivation for using Newton's method for unconstrained optimization that will be discussed in these lecture notes. If the Hessian of an objective function is positive definite at a point, then Newton's method will produce descent for that iteration, since the scaled gradient method is guaranteed descent when D_k \succ 0. The Hessian, however, may not be positive definite. Geometrically, when Newton's method is operating in a region where the objective function is convex (i.e., H(x^k) \succ 0) that includes a minimum, it will iteratively descend to that minimum. Conversely, if the region is concave (i.e., H(x^k) \prec 0) with an associated maximum, Newton's method will ascend to the maximum. Ideal scaling will remove any function ellipticity. This will transform the elliptical level sets of a function's contour plot into circular level sets.
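The reduction of equation 16 to Newton's method can be checked directly: with D_k set to the inverse Hessian and \alpha = 1, one scaled-gradient step minimizes an assumed quadratic f(x) = x^T A x exactly.

```python
# One scaled gradient step (equation 16) with D_k = H^{-1} and alpha = 1
# on an assumed quadratic f(x) = x^T A x: the step reduces to Newton's
# method and lands on the minimum x* = 0.

A = [[7.0, 1.2], [1.2, 1.0]]   # Hessian H = 2A

def grad(x):
    return [2*(A[0][0]*x[0] + A[0][1]*x[1]),
            2*(A[1][0]*x[0] + A[1][1]*x[1])]

H = [[2*A[0][0], 2*A[0][1]], [2*A[1][0], 2*A[1][1]]]
det = H[0][0]*H[1][1] - H[0][1]*H[1][0]
D = [[ H[1][1]/det, -H[0][1]/det],    # D = H^{-1}: the ideal
     [-H[1][0]/det,  H[0][0]/det]]    # symmetric positive definite scaling

alpha = 1.0
x = [10.0, 5.0]
g = grad(x)
x_new = [x[0] - alpha*(D[0][0]*g[0] + D[0][1]*g[1]),
         x[1] - alpha*(D[1][0]*g[0] + D[1][1]*g[1])]
```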
Scaling a quadratic function with the inverse of its Hessian will result in a perfectly conditioned function with circular level sets. Applying the gradient method to such a function will locate the minimum in one step, since -\nabla f(x^k) will point directly at the minimum. Since using the inverse of the Hessian to scale a function for the gradient method is the same as using Newton's method, this scenario is equivalent to applying Newton's method to the minimization of a quadratic function. Recall that Newton's method will find the minimum of a quadratic function in one step, since the quadratic approximation model is exact. Whether we view this situation as Newton's method applied to a quadratic function, or as use of the gradient method with ideal scaling, the result is the same: the solution is identified in one step.

5 Summary

A connection was established between optimality conditions and a geometric understanding of functions. Three approaches were used to motivate the use of Newton's method:

1. Sequential second-order function approximations

2. Newton's method for root finding, applied to solve \nabla f(x) = 0

3. Use of the objective function's Hessian to provide ideal scaling


More information

x k+1 = x k + α k p k (13.1)

x k+1 = x k + α k p k (13.1) 13 Gradient Descent Methods Lab Objective: Iterative optimization methods choose a search direction and a step size at each iteration One simple choice for the search direction is the negative gradient,

More information

Constrained Optimization

Constrained Optimization 1 / 22 Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University March 30, 2015 2 / 22 1. Equality constraints only 1.1 Reduced gradient 1.2 Lagrange

More information

Deep Learning. Authors: I. Goodfellow, Y. Bengio, A. Courville. Chapter 4: Numerical Computation. Lecture slides edited by C. Yim. C.

Deep Learning. Authors: I. Goodfellow, Y. Bengio, A. Courville. Chapter 4: Numerical Computation. Lecture slides edited by C. Yim. C. Chapter 4: Numerical Computation Deep Learning Authors: I. Goodfellow, Y. Bengio, A. Courville Lecture slides edited by 1 Chapter 4: Numerical Computation 4.1 Overflow and Underflow 4.2 Poor Conditioning

More information

Performance Surfaces and Optimum Points

Performance Surfaces and Optimum Points CSC 302 1.5 Neural Networks Performance Surfaces and Optimum Points 1 Entrance Performance learning is another important class of learning law. Network parameters are adjusted to optimize the performance

More information

Matrix Derivatives and Descent Optimization Methods

Matrix Derivatives and Descent Optimization Methods Matrix Derivatives and Descent Optimization Methods 1 Qiang Ning Department of Electrical and Computer Engineering Beckman Institute for Advanced Science and Techonology University of Illinois at Urbana-Champaign

More information

A A x i x j i j (i, j) (j, i) Let. Compute the value of for and

A A x i x j i j (i, j) (j, i) Let. Compute the value of for and 7.2 - Quadratic Forms quadratic form on is a function defined on whose value at a vector in can be computed by an expression of the form, where is an symmetric matrix. The matrix R n Q R n x R n Q(x) =

More information

COMP 558 lecture 18 Nov. 15, 2010

COMP 558 lecture 18 Nov. 15, 2010 Least squares We have seen several least squares problems thus far, and we will see more in the upcoming lectures. For this reason it is good to have a more general picture of these problems and how to

More information

Least Squares Optimization

Least Squares Optimization Least Squares Optimization The following is a brief review of least squares optimization and constrained optimization techniques. I assume the reader is familiar with basic linear algebra, including the

More information

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science EAD 115 Numerical Solution of Engineering and Scientific Problems David M. Rocke Department of Applied Science Taylor s Theorem Can often approximate a function by a polynomial The error in the approximation

More information

Math 411 Preliminaries

Math 411 Preliminaries Math 411 Preliminaries Provide a list of preliminary vocabulary and concepts Preliminary Basic Netwon s method, Taylor series expansion (for single and multiple variables), Eigenvalue, Eigenvector, Vector

More information

Notes on Some Methods for Solving Linear Systems

Notes on Some Methods for Solving Linear Systems Notes on Some Methods for Solving Linear Systems Dianne P. O Leary, 1983 and 1999 and 2007 September 25, 2007 When the matrix A is symmetric and positive definite, we have a whole new class of algorithms

More information

1 Overview. 2 A Characterization of Convex Functions. 2.1 First-order Taylor approximation. AM 221: Advanced Optimization Spring 2016

1 Overview. 2 A Characterization of Convex Functions. 2.1 First-order Taylor approximation. AM 221: Advanced Optimization Spring 2016 AM 221: Advanced Optimization Spring 2016 Prof. Yaron Singer Lecture 8 February 22nd 1 Overview In the previous lecture we saw characterizations of optimality in linear optimization, and we reviewed the

More information

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science EAD 115 Numerical Solution of Engineering and Scientific Problems David M. Rocke Department of Applied Science Multidimensional Unconstrained Optimization Suppose we have a function f() of more than one

More information

Unconstrained Multivariate Optimization

Unconstrained Multivariate Optimization Unconstrained Multivariate Optimization Multivariate optimization means optimization of a scalar function of a several variables: and has the general form: y = () min ( ) where () is a nonlinear scalar-valued

More information

Arc Search Algorithms

Arc Search Algorithms Arc Search Algorithms Nick Henderson and Walter Murray Stanford University Institute for Computational and Mathematical Engineering November 10, 2011 Unconstrained Optimization minimize x D F (x) where

More information

Penalty and Barrier Methods. So we again build on our unconstrained algorithms, but in a different way.

Penalty and Barrier Methods. So we again build on our unconstrained algorithms, but in a different way. AMSC 607 / CMSC 878o Advanced Numerical Optimization Fall 2008 UNIT 3: Constrained Optimization PART 3: Penalty and Barrier Methods Dianne P. O Leary c 2008 Reference: N&S Chapter 16 Penalty and Barrier

More information

Math 118, Fall 2014 Final Exam

Math 118, Fall 2014 Final Exam Math 8, Fall 4 Final Exam True or false Please circle your choice; no explanation is necessary True There is a linear transformation T such that T e ) = e and T e ) = e Solution Since T is linear, if T

More information

Convex Functions and Optimization

Convex Functions and Optimization Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized

More information

7.2 Steepest Descent and Preconditioning

7.2 Steepest Descent and Preconditioning 7.2 Steepest Descent and Preconditioning Descent methods are a broad class of iterative methods for finding solutions of the linear system Ax = b for symmetric positive definite matrix A R n n. Consider

More information

CHAPTER 2: QUADRATIC PROGRAMMING

CHAPTER 2: QUADRATIC PROGRAMMING CHAPTER 2: QUADRATIC PROGRAMMING Overview Quadratic programming (QP) problems are characterized by objective functions that are quadratic in the design variables, and linear constraints. In this sense,

More information

Gradient Descent. Sargur Srihari

Gradient Descent. Sargur Srihari Gradient Descent Sargur srihari@cedar.buffalo.edu 1 Topics Simple Gradient Descent/Ascent Difficulties with Simple Gradient Descent Line Search Brent s Method Conjugate Gradient Descent Weight vectors

More information

ECE580 Partial Solution to Problem Set 3

ECE580 Partial Solution to Problem Set 3 ECE580 Fall 2015 Solution to Problem Set 3 October 23, 2015 1 ECE580 Partial Solution to Problem Set 3 These problems are from the textbook by Chong and Zak, 4th edition, which is the textbook for the

More information

1 Computing with constraints

1 Computing with constraints Notes for 2017-04-26 1 Computing with constraints Recall that our basic problem is minimize φ(x) s.t. x Ω where the feasible set Ω is defined by equality and inequality conditions Ω = {x R n : c i (x)

More information

Math (P)refresher Lecture 8: Unconstrained Optimization

Math (P)refresher Lecture 8: Unconstrained Optimization Math (P)refresher Lecture 8: Unconstrained Optimization September 2006 Today s Topics : Quadratic Forms Definiteness of Quadratic Forms Maxima and Minima in R n First Order Conditions Second Order Conditions

More information

g(x,y) = c. For instance (see Figure 1 on the right), consider the optimization problem maximize subject to

g(x,y) = c. For instance (see Figure 1 on the right), consider the optimization problem maximize subject to 1 of 11 11/29/2010 10:39 AM From Wikipedia, the free encyclopedia In mathematical optimization, the method of Lagrange multipliers (named after Joseph Louis Lagrange) provides a strategy for finding the

More information

Course Notes: Week 4

Course Notes: Week 4 Course Notes: Week 4 Math 270C: Applied Numerical Linear Algebra 1 Lecture 9: Steepest Descent (4/18/11) The connection with Lanczos iteration and the CG was not originally known. CG was originally derived

More information

Econ Slides from Lecture 8

Econ Slides from Lecture 8 Econ 205 Sobel Econ 205 - Slides from Lecture 8 Joel Sobel September 1, 2010 Computational Facts 1. det AB = det BA = det A det B 2. If D is a diagonal matrix, then det D is equal to the product of its

More information

Numerical optimization

Numerical optimization Numerical optimization Lecture 4 Alexander & Michael Bronstein tosca.cs.technion.ac.il/book Numerical geometry of non-rigid shapes Stanford University, Winter 2009 2 Longest Slowest Shortest Minimal Maximal

More information

Meaning of the Hessian of a function in a critical point

Meaning of the Hessian of a function in a critical point Meaning of the Hessian of a function in a critical point Mircea Petrache February 1, 2012 We consider a function f : R n R and assume for it to be differentiable with continuity at least two times (that

More information

CE 191: Civil and Environmental Engineering Systems Analysis. LEC 05 : Optimality Conditions

CE 191: Civil and Environmental Engineering Systems Analysis. LEC 05 : Optimality Conditions CE 191: Civil and Environmental Engineering Systems Analysis LEC : Optimality Conditions Professor Scott Moura Civil & Environmental Engineering University of California, Berkeley Fall 214 Prof. Moura

More information

Unconstrained minimization of smooth functions

Unconstrained minimization of smooth functions Unconstrained minimization of smooth functions We want to solve min x R N f(x), where f is convex. In this section, we will assume that f is differentiable (so its gradient exists at every point), and

More information

Numerical Optimization

Numerical Optimization Numerical Optimization Unit 2: Multivariable optimization problems Che-Rung Lee Scribe: February 28, 2011 (UNIT 2) Numerical Optimization February 28, 2011 1 / 17 Partial derivative of a two variable function

More information

Lecture V. Numerical Optimization

Lecture V. Numerical Optimization Lecture V Numerical Optimization Gianluca Violante New York University Quantitative Macroeconomics G. Violante, Numerical Optimization p. 1 /19 Isomorphism I We describe minimization problems: to maximize

More information

Chapter 7. Extremal Problems. 7.1 Extrema and Local Extrema

Chapter 7. Extremal Problems. 7.1 Extrema and Local Extrema Chapter 7 Extremal Problems No matter in theoretical context or in applications many problems can be formulated as problems of finding the maximum or minimum of a function. Whenever this is the case, advanced

More information

Basic Math for

Basic Math for Basic Math for 16-720 August 23, 2002 1 Linear Algebra 1.1 Vectors and Matrices First, a reminder of a few basic notations, definitions, and terminology: Unless indicated otherwise, vectors are always

More information

Least Squares Optimization

Least Squares Optimization Least Squares Optimization The following is a brief review of least squares optimization and constrained optimization techniques, which are widely used to analyze and visualize data. Least squares (LS)

More information

Tangent spaces, normals and extrema

Tangent spaces, normals and extrema Chapter 3 Tangent spaces, normals and extrema If S is a surface in 3-space, with a point a S where S looks smooth, i.e., without any fold or cusp or self-crossing, we can intuitively define the tangent

More information

MATH 5720: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 2018

MATH 5720: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 2018 MATH 57: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 18 1 Global and Local Optima Let a function f : S R be defined on a set S R n Definition 1 (minimizers and maximizers) (i) x S

More information

Nonlinear equations and optimization

Nonlinear equations and optimization Notes for 2017-03-29 Nonlinear equations and optimization For the next month or so, we will be discussing methods for solving nonlinear systems of equations and multivariate optimization problems. We will

More information

Conditional Gradient (Frank-Wolfe) Method

Conditional Gradient (Frank-Wolfe) Method Conditional Gradient (Frank-Wolfe) Method Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 1 Outline Today: Conditional gradient method Convergence analysis Properties

More information

Optimization and Root Finding. Kurt Hornik

Optimization and Root Finding. Kurt Hornik Optimization and Root Finding Kurt Hornik Basics Root finding and unconstrained smooth optimization are closely related: Solving ƒ () = 0 can be accomplished via minimizing ƒ () 2 Slide 2 Basics Root finding

More information

1 Numerical optimization

1 Numerical optimization Contents Numerical optimization 5. Optimization of single-variable functions.............................. 5.. Golden Section Search..................................... 6.. Fibonacci Search........................................

More information

Appendix A Taylor Approximations and Definite Matrices

Appendix A Taylor Approximations and Definite Matrices Appendix A Taylor Approximations and Definite Matrices Taylor approximations provide an easy way to approximate a function as a polynomial, using the derivatives of the function. We know, from elementary

More information

Introduction to unconstrained optimization - direct search methods

Introduction to unconstrained optimization - direct search methods Introduction to unconstrained optimization - direct search methods Jussi Hakanen Post-doctoral researcher jussi.hakanen@jyu.fi Structure of optimization methods Typically Constraint handling converts the

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information

8. Diagonalization.

8. Diagonalization. 8. Diagonalization 8.1. Matrix Representations of Linear Transformations Matrix of A Linear Operator with Respect to A Basis We know that every linear transformation T: R n R m has an associated standard

More information

10.34 Numerical Methods Applied to Chemical Engineering Fall Quiz #1 Review

10.34 Numerical Methods Applied to Chemical Engineering Fall Quiz #1 Review 10.34 Numerical Methods Applied to Chemical Engineering Fall 2015 Quiz #1 Review Study guide based on notes developed by J.A. Paulson, modified by K. Severson Linear Algebra We ve covered three major topics

More information

arxiv: v1 [math.na] 5 May 2011

arxiv: v1 [math.na] 5 May 2011 ITERATIVE METHODS FOR COMPUTING EIGENVALUES AND EIGENVECTORS MAYSUM PANJU arxiv:1105.1185v1 [math.na] 5 May 2011 Abstract. We examine some numerical iterative methods for computing the eigenvalues and

More information

Course Notes: Week 1

Course Notes: Week 1 Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues

More information

Algorithms for Constrained Optimization

Algorithms for Constrained Optimization 1 / 42 Algorithms for Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University April 19, 2015 2 / 42 Outline 1. Convergence 2. Sequential quadratic

More information

Machine Learning Brett Bernstein. Recitation 1: Gradients and Directional Derivatives

Machine Learning Brett Bernstein. Recitation 1: Gradients and Directional Derivatives Machine Learning Brett Bernstein Recitation 1: Gradients and Directional Derivatives Intro Question 1 We are given the data set (x 1, y 1 ),, (x n, y n ) where x i R d and y i R We want to fit a linear

More information

Week 4: Differentiation for Functions of Several Variables

Week 4: Differentiation for Functions of Several Variables Week 4: Differentiation for Functions of Several Variables Introduction A functions of several variables f : U R n R is a rule that assigns a real number to each point in U, a subset of R n, For the next

More information

ECE580 Fall 2015 Solution to Midterm Exam 1 October 23, Please leave fractions as fractions, but simplify them, etc.

ECE580 Fall 2015 Solution to Midterm Exam 1 October 23, Please leave fractions as fractions, but simplify them, etc. ECE580 Fall 2015 Solution to Midterm Exam 1 October 23, 2015 1 Name: Solution Score: /100 This exam is closed-book. You must show ALL of your work for full credit. Please read the questions carefully.

More information

Review of Classical Optimization

Review of Classical Optimization Part II Review of Classical Optimization Multidisciplinary Design Optimization of Aircrafts 51 2 Deterministic Methods 2.1 One-Dimensional Unconstrained Minimization 2.1.1 Motivation Most practical optimization

More information

Econ 204 Supplement to Section 3.6 Diagonalization and Quadratic Forms. 1 Diagonalization and Change of Basis

Econ 204 Supplement to Section 3.6 Diagonalization and Quadratic Forms. 1 Diagonalization and Change of Basis Econ 204 Supplement to Section 3.6 Diagonalization and Quadratic Forms De La Fuente notes that, if an n n matrix has n distinct eigenvalues, it can be diagonalized. In this supplement, we will provide

More information

SECTION: CONTINUOUS OPTIMISATION LECTURE 4: QUASI-NEWTON METHODS

SECTION: CONTINUOUS OPTIMISATION LECTURE 4: QUASI-NEWTON METHODS SECTION: CONTINUOUS OPTIMISATION LECTURE 4: QUASI-NEWTON METHODS HONOUR SCHOOL OF MATHEMATICS, OXFORD UNIVERSITY HILARY TERM 2005, DR RAPHAEL HAUSER 1. The Quasi-Newton Idea. In this lecture we will discuss

More information

September Math Course: First Order Derivative

September Math Course: First Order Derivative September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which

More information

Mobile Robotics 1. A Compact Course on Linear Algebra. Giorgio Grisetti

Mobile Robotics 1. A Compact Course on Linear Algebra. Giorgio Grisetti Mobile Robotics 1 A Compact Course on Linear Algebra Giorgio Grisetti SA-1 Vectors Arrays of numbers They represent a point in a n dimensional space 2 Vectors: Scalar Product Scalar-Vector Product Changes

More information

Numerical optimization. Numerical optimization. Longest Shortest where Maximal Minimal. Fastest. Largest. Optimization problems

Numerical optimization. Numerical optimization. Longest Shortest where Maximal Minimal. Fastest. Largest. Optimization problems 1 Numerical optimization Alexander & Michael Bronstein, 2006-2009 Michael Bronstein, 2010 tosca.cs.technion.ac.il/book Numerical optimization 048921 Advanced topics in vision Processing and Analysis of

More information

6.252 NONLINEAR PROGRAMMING LECTURE 10 ALTERNATIVES TO GRADIENT PROJECTION LECTURE OUTLINE. Three Alternatives/Remedies for Gradient Projection

6.252 NONLINEAR PROGRAMMING LECTURE 10 ALTERNATIVES TO GRADIENT PROJECTION LECTURE OUTLINE. Three Alternatives/Remedies for Gradient Projection 6.252 NONLINEAR PROGRAMMING LECTURE 10 ALTERNATIVES TO GRADIENT PROJECTION LECTURE OUTLINE Three Alternatives/Remedies for Gradient Projection Two-Metric Projection Methods Manifold Suboptimization Methods

More information

REVIEW OF DIFFERENTIAL CALCULUS

REVIEW OF DIFFERENTIAL CALCULUS REVIEW OF DIFFERENTIAL CALCULUS DONU ARAPURA 1. Limits and continuity To simplify the statements, we will often stick to two variables, but everything holds with any number of variables. Let f(x, y) be

More information

Recitation 1. Gradients and Directional Derivatives. Brett Bernstein. CDS at NYU. January 21, 2018

Recitation 1. Gradients and Directional Derivatives. Brett Bernstein. CDS at NYU. January 21, 2018 Gradients and Directional Derivatives Brett Bernstein CDS at NYU January 21, 2018 Brett Bernstein (CDS at NYU) Recitation 1 January 21, 2018 1 / 23 Initial Question Intro Question Question We are given

More information

Optimization Methods

Optimization Methods Optimization Methods Categorization of Optimization Problems Continuous Optimization Discrete Optimization Combinatorial Optimization Variational Optimization Common Optimization Concepts in Computer Vision

More information

Constrained optimization: direct methods (cont.)

Constrained optimization: direct methods (cont.) Constrained optimization: direct methods (cont.) Jussi Hakanen Post-doctoral researcher jussi.hakanen@jyu.fi Direct methods Also known as methods of feasible directions Idea in a point x h, generate a

More information

Computational Finance

Computational Finance Department of Mathematics at University of California, San Diego Computational Finance Optimization Techniques [Lecture 2] Michael Holst January 9, 2017 Contents 1 Optimization Techniques 3 1.1 Examples

More information

Transpose & Dot Product

Transpose & Dot Product Transpose & Dot Product Def: The transpose of an m n matrix A is the n m matrix A T whose columns are the rows of A. So: The columns of A T are the rows of A. The rows of A T are the columns of A. Example:

More information

Quadratic Programming

Quadratic Programming Quadratic Programming Outline Linearly constrained minimization Linear equality constraints Linear inequality constraints Quadratic objective function 2 SideBar: Matrix Spaces Four fundamental subspaces

More information

Part 4: Active-set methods for linearly constrained optimization. Nick Gould (RAL)

Part 4: Active-set methods for linearly constrained optimization. Nick Gould (RAL) Part 4: Active-set methods for linearly constrained optimization Nick Gould RAL fx subject to Ax b Part C course on continuoue optimization LINEARLY CONSTRAINED MINIMIZATION fx subject to Ax { } b where

More information