# Exploring the energy landscape

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Exploring the energy landscape ChE210D Today's lecture: what are general features of the potential energy surface and how can we locate and characterize minima on it Derivatives of the potential energy function Forces For any atomic configuration, we can compute the force acting on each atom from the negative energy derivative with respect to the particle coordinates: In shorthand notation, Alternatively,,,, That is, the force stems from the gradient of the potential energy. We can simplify this expression if our potential energy function is built from pairwise interactions, We have that: + + Consider computing the force on atom. There are 1 terms in the pairwise summation that include the coordinates of atom ". Thus the energy derivative with respect to is:, # M. S. Shell /14 last modified 4/12/2012

2 # Using the definition of, 2 2 Hence,, Doing the same for the and coordinates, where # # For example, if our pair potential is the so-called soft-sphere interaction, where + is a positive integer, then Simplifying, %& ' ( )*, +%' * )*)-. # + )* /%& ' ( 0 # Notice that the force acting on atom " due to atom 1 is just the negative of the force acting on atom 1 due to atom ", since 2 M. S. Shell /14 last modified 4/12/2012

3 In simulation, we typically maintain an (N,3) array of forces. A typical way to implement pairwise force and potential energy calculation in computer code is the following: set the potential energy variable and elements of the force array equal to zero loop over atom i 1 to N 1 loop over atom j i+1 to N calculate the pair energy of atoms i and j add the energy to the potential energy variable calculate the force on atom i due to j add the force to the array elements for atom i subtract the force from the array elements for atom j Second derivatives and the Hessian The curvature of the potential energy function surrounding a given atomic configuration is given by second derivatives. There are 3 3 different second derivatives, attained by cross-permuting different particle coordinates. We typically write the second derivatives as a matrix called the Hessian: 8 > < The Hessian is important in distinguishing potential energy minima from saddle points, as discussed in greater detail below. It can be a memory-intensive matrix to compute in molecular simulation, as the number of elements is 9. The energy landscape Basic definition and properties The potential energy function defines all of the thermodynamic and kinetic properties of a classical atomic system. We can think of this function as being plotted in a high-dimensional space where there are 3 axes, one for each coordinate of every atom. An additional axis measures the potential energy, and we call the projection of the potential energy function in 3 +1 dimensional space as the potential energy landscape (PEL). Though we cannot visualize the PEL directly, the following schematic shows some features of it: M. S. Shell /14 last modified 4/12/2012

4 saddle point basin local minima particle coordinates, global minimum Stationary points correspond to the condition where the slope of the PEL is This is shorthand for the following simultaneous conditions: "1,2,, Notice that these equations imply that the net force acting on each particle is zero: 0 Therefore, the stationary points correspond to mechanically stable configurations of the atoms in the system. There are two basic kinds of stationary points: For minima, the curvature surrounding the stationary point is positive. That is, any movement away from the point increases the potential energy. For saddles, one can move in one or more directions around a stationary point and experience a decrease in potential energy. Scaling of the number of minima and saddle points PELs contain numerous stationary points. It can be shown [Stillinger, Phys Rev E (1999); Shell et al, Phys Rev Lett (2004)] that both the number of minima and saddle points grow exponentially with the number of atoms: number of minima ~exp ' M. S. Shell /14 last modified 4/12/2012

5 number of saddle points ~ exp T where ' and T are system-specific constants but independent of the number of atoms. The implication of these scaling laws is that it is extraordinarily difficult to search exhaustively for all of the minima for even very small systems (~100 atoms), since their number is so great. By extension, it is extremely challenging to locate the global potential energy minimum, that is, the set of particle coordinates that gives the lowest potential energy achievable. The following figure shows a disconnectivity graph for a system of just 13 Lennard-Jones atoms in infinite volume, taken from [Doye, Miller, and Wales, J. Chem. Phys. 111, 8417 (1999)]. The horizontal axis marks the potential energy and the lines show the connectivity of different minima. To move from one minimum to another, one must travel the path formed when the lines extending from two minima connect to a common energetic point. This small system has 1467 distinct potential energy minima. These results were generated using an advanced and computationally-intensive minima-finding algorithm. Relevance of finding minima Why should we care to find PEL minima? M. S. Shell /14 last modified 4/12/2012

6 Mechanically stable molecular structures can be identified. When examining conformational changes within a molecule, minima give the distinct conformational states possible. For changes among many molecules, the minima provide the basic structures associated with physical events in the system (e.g., binding). Both can often be compared with experiment. Differences in the energies of different minimized structures provide a first-order perspective on the relative populations of these (neglecting entropies). Again, this can be compared to experiment and often is an important force field diagnostic. Initial structures in molecular simulations often need to be relaxed away from highenergy states. Minimization can take an initial structure and remove atomic overlaps or distorted bond and angle degrees of freedom. Locating local energy minima Problem formulation If we are anywhere within the basin of a minimum, we can define a steepest descent protocol that takes us to the minimum: U limu ) ( ) Here, U is a fictitious time-like variable. The solution to this first order differential equation in the limit that U is the set of coordinates at a minimum min. Here, we essentially follow the forces down the potential energy surface until we reach its bottom the velocities vectors of each atom follow the negative forces. In a two-dimensional contour plot of energies (as a function of coordinates), this would look like: M. S. Shell /14 last modified 4/12/2012

7 The steepest descent protocol is equivalent to an instantaneous quench of a system to absolute zero: we continuously remove energy from the system until we cannot remove any more. Line searches It is very inefficient to implement the above equation because one must take U to infinity, and because the closer we get to the minimum, the slower we are to converge due to the fact that the forces grow increasingly small. A number of much more efficient methods are available. These typically use a so-called line search as a part of the overall approach. A line search simply says to find a minimum along one particular direction in coordinate space, X. In two dimensions, X The procedure for a line search is broken into two parts. 1. Start with an initial set of coordinates Y and a search direction X, chosen to be in the downhill direction of the energy surface. 2. Let - Y. Then, find two more coordinates Y +ZX and [ Y +2ZX by taking small steps Z along the search direction. Z should be chosen to be small relative to the scales of energy changes relative to particle coordinates. 3. Is ( [ )>( )? If so, the three coordinates bracket a minimum and we should move to the next part of the algorithm. 4. If not, we need to keep searching along this direction. Let -, [, and [ +ZX. Go back to step 3. The second part of the algorithm consists of finding the minimum once we ve bracketed it. One possibility is to bisect pairs of coordinates: 1. Find a new coordinate ^. If - > [, pick ^ - ( - + ). Otherwise, pick ^ - ( + [ ). M. S. Shell /14 last modified 4/12/2012

8 2. Of the four coordinates ( -,, [,^) pick new values for ( -,, [ ) from these that most closely bracket the minimum. 3. Go back to step 1 until the successive values of the energies found at each new ^ changes by less than some fractional tolerance, typically 1 in ( ) progress along X progress along X An alternative to step 1 here, which converges faster, is to use the three points ( -,, [ ) to fit a parabola and find the coordinates at the minimum from this approximation: - - -[ [ - Δ - ( ) ( - ) Δ -[ ( [ ) ( - ) ^ a - - ba -[ Δ - - Δ -[ b -[ Δ - - Δ -[ Steepest descent with line search The steepest descent approach can be made more efficient with a line search: 1. From the current set of coordinates, compute the forces. Choose the search direction to be equal to X, the negative gradient of the potential energy surface. 2. Perform a line search to find the minimum along X. 3. Compute the new set of forces at the configuration found in point 2 and a new search direction from them. 4. Repeat step 2 until the minimum energy found with each iteration no longer changes by some fractional tolerance, typically 1 in M. S. Shell /14 last modified 4/12/2012

9 Conjugate-gradient method For long, narrow valleys in the PEL, the steepest descent + line search method results in a large number of zigzag-type motions as one approaches the minimum. This is because the approach continuously overcorrects itself when computing the new search direction with each iteration (new line search). Instead, a more efficient approach is to use the conjugate gradient (CG) method. The CG approach differs only in the choice of a new search direction with each iteration in step 3. Like the steepest descent + line search approach, we use the gradient of the energy surface. In addition, however, we use the previous search direction as well. The CG method chooses search directions that are conjugate (perpendicular) to previous gradients found. A new search direction at the "th iteration is found using with X +c X )- c )- )- Here, the dot product signifies the 3N-dimensional vector multiplication:,- +,- +,- +, + +, For +-dimensional quadratic functions, the CG approach converges exactly to the minimum in + iterations (e.g., + line search operations). An alternative equation for c is typically used in molecular modeling, as it has better convergence properties: c ( )- ) )- )- M. S. Shell /14 last modified 4/12/2012

10 Newton and Quasi-Newton methods Location of minima can be enhanced by using second derivative information. Consider the PEL in the vicinity of a minimum. For the sake of simplicity, we will consider a one-dimensional function (). We can Taylor expand () and its derivative to second order: ()( Y )+( Y ) ( Y) () ( Y) + ( Y) ( Y ) 2 +( Y ) ( Y ) Neglecting higher terms, we can solve for the point at which the derivative is zero: Y a ( )- Y) ( Y ) ba b For quadratic functions, this relation will find the minimum after only a single application. For other functions, this equation defines an iterative procedure whereby the minimum can be located through successive applications of it (each time the function being approximated as quadratic): e- a ( )- ) ( ) ba b This is the Newton-Raphson iteration scheme for finding stationary points. The above procedure will converge to both minima and saddle points. To converge to minima, one must first move to a point on the PEL where the energy landscape has positive curvature in every direction. For multidimensional functions ( ) the update equation above involves an inverse of the Hessian matrix of second derivatives: e- + )- 5 To implement this method in simulation requires computation of the 3 3 Hessian matrix and its inverse with each iteration step. For even small systems, this task can become computeand memory-intensive. To remedy these issues, several quasi-newton methods are available. The distinctive feature of these approaches is that they generate approximations to the inverse Hessian; upon each iteration, these approximations grow increasingly accurate. These approximations are generated from the force array. The most popular of these in use with molecular modeling methods is M. S. Shell /14 last modified 4/12/2012

11 the Limited Broyden-Fletcher-Goldfarb-Shanno (LBFGS) approach. Implementations of this method can be found at online simulation code databases. In general, Newton and quasi-newton methods are more accurate and efficient than the conjugate gradient approach; however, both are widely used in the literature. Locating global energy minima Methods to find global potential energy minima are actively researched and have important applications in crystal and macromolecular structure prediction. The global minimum problem is much more challenging as it requires a detailed search of the entire PEL and it is difficult if not impossible to know a priori when the global minimum has been found (unless the entire surface has been explored exhaustively). Typically global-minimizers combine local minimization with various other methods to selectively explore low-energy regions of the PEL and with bookkeeping routines to remember which parts have been searched and what low-energy configurations have been found. This search is usually partially stochastic in nature, often leveraging the molecular dynamics and Monte Carlo techniques we will discuss in later lectures. Normal mode analysis For minima Consider the PEL in the vicinity of a minimum. Let s write the coordinates of all the atoms about this minimum be If we consider very small perturbations about these coordinates, Y Y +Δ we can Taylor expand the potential energy to second order. That is, we can perform a harmonic approximation to the PEL about a minimum: ( )( Y )+ Δ - 2 ( Y ) + Δ -Δ - ( Y ) + + Δ ( Y ) Notice that all of the first-order terms in the Taylor expansion are zero by the fact that ( Y ) is a minimum. A shorter way of writing this is in matrix form using the Hessian: M. S. Shell /14 last modified 4/12/2012

12 f > 8 Δ - Δ Δ Δ - 27Δ 7 6Δ < which can be written in more compact form: 8 > Δ - 7 Δ > Δ - 7Δ Δ < < Y gf 5g where Y ( Y ) and gδ. Keep in mind that the Hessian is evaluated at Y. We can use a trick of mathematics to simplify the analysis, by diagonalizing the Hessian matrix 5. Notice that 5 is a symmetric matrix, since the order in which the second derivatives are taken doesn t matter. That property means that we can write where i is a diagonal matrix of eigenvalues: 5h f ih k k ij 0 l 0 0 k and h is a matrix whose rows are eigenvectors of 5. Then, Y gf h f ihg Y (hg)f i(hg) Y mf im In the last line, we defined a new set of 3 coordinates, instead of (g), that consists of linear combinations of the atomic positions: mhg h h Y M. S. Shell /14 last modified 4/12/2012

13 Physically, the new coordinates m correspond to projecting the atomic positions on a new set of axes given by the eigenvectors of the Hessian. Using the above Taylor expansion, the fact that i is a diagonal matrix means that the energy is a simple a sum of 3 independent harmonic oscillators along the new directions m: [ Y k U The physical interpretation of this result is the following. Starting from the minimum and at low temperatures, the system oscillates about Y. The eigenvectors of 5 give the normal modes along which the system fluctuates, that is, the directions and collective motions of all of the atoms that move in harmonic fashion for each oscillator. There are 3 of these modes corresponding to the 3 eigenvectors. The eigenvalues give the corresponding force constants and can be related to the frequencies of fluctuations. If all atomic masses are the same, o, then n- p qk o In the case that the atomic masses differ, a slightly more complicated equation is used [Case, Curr. Op. Struct. Biol. 4, 285 (1994)]. Regarding the eigenvalues, All will be positive or zero, since the energy must increase as one moves away from the minimum. Thus, the frequencies are all real. If the system also has rotational symmetry (i.e., atoms can be collectively rotated about the x, y, and z axes), then an additional three eigenvalues will be zero. The spectrum of different normal mode frequencies informs one of motions on different timescales: If the system has translational symmetry (i.e., atoms can collectively be moved in the x, y, and z directions without a change in energy), then three of the eigenvalues will be zero. High-frequency modes signify very fast degrees of freedoms. The corresponding eigenmodes typically correspond to bond stretching motions, if present. Highfrequency modes tend to be localized, i.e., involving only a few atoms. M. S. Shell /14 last modified 4/12/2012

### , b = 0. (2) 1 2 The eigenvectors of A corresponding to the eigenvalues λ 1 = 1, λ 2 = 3 are

Quadratic forms We consider the quadratic function f : R 2 R defined by f(x) = 2 xt Ax b T x with x = (x, x 2 ) T, () where A R 2 2 is symmetric and b R 2. We will see that, depending on the eigenvalues

### Example questions for Molecular modelling (Level 4) Dr. Adrian Mulholland

Example questions for Molecular modelling (Level 4) Dr. Adrian Mulholland 1) Question. Two methods which are widely used for the optimization of molecular geometies are the Steepest descents and Newton-Raphson

### 22 Path Optimisation Methods

22 Path Optimisation Methods 204 22 Path Optimisation Methods Many interesting chemical and physical processes involve transitions from one state to another. Typical examples are migration paths for defects

### Today. Introduction to optimization Definition and motivation 1-dimensional methods. Multi-dimensional methods. General strategies, value-only methods

Optimization Last time Root inding: deinition, motivation Algorithms: Bisection, alse position, secant, Newton-Raphson Convergence & tradeos Eample applications o Newton s method Root inding in > 1 dimension

### Project: Vibration of Diatomic Molecules

Project: Vibration of Diatomic Molecules Objective: Find the vibration energies and the corresponding vibrational wavefunctions of diatomic molecules H 2 and I 2 using the Morse potential. equired Numerical

### Mathematical optimization

Optimization Mathematical optimization Determine the best solutions to certain mathematically defined problems that are under constrained determine optimality criteria determine the convergence of the

### Computer simulation can be thought as a virtual experiment. There are

13 Estimating errors 105 13 Estimating errors Computer simulation can be thought as a virtual experiment. There are systematic errors statistical errors. As systematical errors can be very complicated

### Molecular Simulation II. Classical Mechanical Treatment

Molecular Simulation II Quantum Chemistry Classical Mechanics E = Ψ H Ψ ΨΨ U = E bond +E angle +E torsion +E non-bond Jeffry D. Madura Department of Chemistry & Biochemistry Center for Computational Sciences

### Unconstrained Multivariate Optimization

Unconstrained Multivariate Optimization Multivariate optimization means optimization of a scalar function of a several variables: and has the general form: y = () min ( ) where () is a nonlinear scalar-valued

### Nonlinear Optimization

Nonlinear Optimization (Com S 477/577 Notes) Yan-Bin Jia Nov 7, 2017 1 Introduction Given a single function f that depends on one or more independent variable, we want to find the values of those variables

### Quasi-Newton Methods

Newton s Method Pros and Cons Quasi-Newton Methods MA 348 Kurt Bryan Newton s method has some very nice properties: It s extremely fast, at least once it gets near the minimum, and with the simple modifications

### Comparison of methods for finding saddle points without knowledge of the final states

JOURNAL OF CHEMICAL PHYSICS VOLUME 121, NUMBER 20 22 NOVEMBER 2004 Comparison of methods for finding saddle points without knowledge of the final states R. A. Olsen and G. J. Kroes Leiden Institute of

### Lecture 7: Minimization or maximization of functions (Recipes Chapter 10)

Lecture 7: Minimization or maximization of functions (Recipes Chapter 10) Actively studied subject for several reasons: Commonly encountered problem: e.g. Hamilton s and Lagrange s principles, economics

### 11 More Regression; Newton s Method; ROC Curves

More Regression; Newton s Method; ROC Curves 59 11 More Regression; Newton s Method; ROC Curves LEAST-SQUARES POLYNOMIAL REGRESSION Replace each X i with feature vector e.g. (X i ) = [X 2 i1 X i1 X i2

### Statistics 580 Optimization Methods

Statistics 580 Optimization Methods Introduction Let fx be a given real-valued function on R p. The general optimization problem is to find an x ɛ R p at which fx attain a maximum or a minimum. It is of

### Tangent spaces, normals and extrema

Chapter 3 Tangent spaces, normals and extrema If S is a surface in 3-space, with a point a S where S looks smooth, i.e., without any fold or cusp or self-crossing, we can intuitively define the tangent

Quadratic Programming Outline Linearly constrained minimization Linear equality constraints Linear inequality constraints Quadratic objective function 2 SideBar: Matrix Spaces Four fundamental subspaces

### A Study of the Thermal Properties of a One. Dimensional Lennard-Jones System

A Study of the Thermal Properties of a One Dimensional Lennard-Jones System Abstract In this study, the behavior of a one dimensional (1D) Lennard-Jones (LJ) system is simulated. As part of this research,

Chapter 8 Gradient Methods An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Introduction Recall that a level set of a function is the set of points satisfying for some constant. Thus, a point

### Convex Optimization CMU-10725

Convex Optimization CMU-10725 Quasi Newton Methods Barnabás Póczos & Ryan Tibshirani Quasi Newton Methods 2 Outline Modified Newton Method Rank one correction of the inverse Rank two correction of the

### Queens College, CUNY, Department of Computer Science Numerical Methods CSCI 361 / 761 Spring 2018 Instructor: Dr. Sateesh Mane.

Queens College, CUNY, Department of Computer Science Numerical Methods CSCI 361 / 761 Spring 2018 Instructor: Dr. Sateesh Mane c Sateesh R. Mane 2018 3 Lecture 3 3.1 General remarks March 4, 2018 This

### Designing Information Devices and Systems II Fall 2015 Note 22

EE 16B Designing Information Devices and Systems II Fall 2015 Note 22 Notes taken by John Noonan (11/12) Graphing of the State Solutions Open loop x(k + 1) = Ax(k) + Bu(k) y(k) = Cx(k) Closed loop x(k

### Numerical Optimization Techniques

Numerical Optimization Techniques Léon Bottou NEC Labs America COS 424 3/2/2010 Today s Agenda Goals Representation Capacity Control Operational Considerations Computational Considerations Classification,

### ( ) 2 75( ) 3

Chemistry 380.37 Dr. Jean M. Standard Homework Problem Set 3 Solutions 1. The part of a particular MM3-like force field that describes stretching energy for an O-H single bond is given by the following

### Department of Chemical Engineering University of California, Santa Barbara Spring Exercise 2. Due: Thursday, 4/19/09

Department of Chemical Engineering ChE 210D University of California, Santa Barbara Spring 2012 Exercise 2 Due: Thursday, 4/19/09 Objective: To learn how to compile Fortran libraries for Python, and to

### Chapter 4. Unconstrained optimization

Chapter 4. Unconstrained optimization Version: 28-10-2012 Material: (for details see) Chapter 11 in [FKS] (pp.251-276) A reference e.g. L.11.2 refers to the corresponding Lemma in the book [FKS] PDF-file

### Analysis of the Lennard-Jones-38 stochastic network

Analysis of the Lennard-Jones-38 stochastic network Maria Cameron Joint work with E. Vanden-Eijnden Lennard-Jones clusters Pair potential: V(r) = 4e(r -12 - r -6 ) 1. Wales, D. J., Energy landscapes: calculating

### 8 Numerical methods for unconstrained problems

8 Numerical methods for unconstrained problems Optimization is one of the important fields in numerical computation, beside solving differential equations and linear systems. We can see that these fields

### A simple technique to estimate partition functions and equilibrium constants from Monte Carlo simulations

A simple technique to estimate partition functions and equilibrium constants from Monte Carlo simulations Michal Vieth Department of Chemistry, The Scripps Research Institute, 10666 N. Torrey Pines Road,

### Numerical Methods I Solving Nonlinear Equations

Numerical Methods I Solving Nonlinear Equations Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 October 16th, 2014 A. Donev (Courant Institute)

### Minimization of Static! Cost Functions!

Minimization of Static Cost Functions Robert Stengel Optimal Control and Estimation, MAE 546, Princeton University, 2017 J = Static cost function with constant control parameter vector, u Conditions for

### Chapter 3 Numerical Methods

Chapter 3 Numerical Methods Part 2 3.2 Systems of Equations 3.3 Nonlinear and Constrained Optimization 1 Outline 3.2 Systems of Equations 3.3 Nonlinear and Constrained Optimization Summary 2 Outline 3.2

### Lecture 10. Neural networks and optimization. Machine Learning and Data Mining November Nando de Freitas UBC. Nonlinear Supervised Learning

Lecture 0 Neural networks and optimization Machine Learning and Data Mining November 2009 UBC Gradient Searching for a good solution can be interpreted as looking for a minimum of some error (loss) function

### Chemistry 324, Spring 2002 Geometry Optimization

Geometry optimization Page 1 of 9 Review Chemistry 324, Spring 2002 Geometry Optimization By now, you have learned several useful things about molecular modeling. First, you have learned from your work

### Optimization. Totally not complete this is...don't use it yet...

Optimization Totally not complete this is...don't use it yet... Bisection? Doing a root method is akin to doing a optimization method, but bi-section would not be an effective method - can detect sign

### 7.2 Steepest Descent and Preconditioning

7.2 Steepest Descent and Preconditioning Descent methods are a broad class of iterative methods for finding solutions of the linear system Ax = b for symmetric positive definite matrix A R n n. Consider

### Advanced sampling. fluids of strongly orientation-dependent interactions (e.g., dipoles, hydrogen bonds)

Advanced sampling ChE210D Today's lecture: methods for facilitating equilibration and sampling in complex, frustrated, or slow-evolving systems Difficult-to-simulate systems Practically speaking, one is

### arxiv:cond-mat/ v1 [cond-mat.soft] 23 Mar 2007

The structure of the free energy surface of coarse-grained off-lattice protein models arxiv:cond-mat/0703606v1 [cond-mat.soft] 23 Mar 2007 Ethem Aktürk and Handan Arkin Hacettepe University, Department

### Optimization and Calculus

Optimization and Calculus To begin, there is a close relationship between finding the roots to a function and optimizing a function. In the former case, we solve for x. In the latter, we solve: g(x) =

Non-convex optimization Issam Laradji Strongly Convex Objective function f(x) x Strongly Convex Objective function Assumptions Gradient Lipschitz continuous f(x) Strongly convex x Strongly Convex Objective

### Newton s Method. Ryan Tibshirani Convex Optimization /36-725

Newton s Method Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, Properties and examples: f (y) = max x

### Normal Mode Analysis. Jun Wan and Willy Wriggers School of Health Information Sciences & Institute of Molecular Medicine University of Texas Houston

Normal Mode Analysis Jun Wan and Willy Wriggers School of Health Information Sciences & Institute of Molecular Medicine University of Teas Houston Contents Introduction --- protein dynamics (lowfrequency

Conjugate Gradient (CG) Method by K. Ozawa 1 Introduction In the series of this lecture, I will introduce the conjugate gradient method, which solves efficiently large scale sparse linear simultaneous

### Practical Optimization: Basic Multidimensional Gradient Methods

Practical Optimization: Basic Multidimensional Gradient Methods László Kozma Lkozma@cis.hut.fi Helsinki University of Technology S-88.4221 Postgraduate Seminar on Signal Processing 22. 10. 2008 Contents

### Lecture 10: October 27, 2016

Mathematical Toolkit Autumn 206 Lecturer: Madhur Tulsiani Lecture 0: October 27, 206 The conjugate gradient method In the last lecture we saw the steepest descent or gradient descent method for finding

### Kinematics of fluid motion

Chapter 4 Kinematics of fluid motion 4.1 Elementary flow patterns Recall the discussion of flow patterns in Chapter 1. The equations for particle paths in a three-dimensional, steady fluid flow are dx

### Constrained optimization. Unconstrained optimization. One-dimensional. Multi-dimensional. Newton with equality constraints. Active-set method.

Optimization Unconstrained optimization One-dimensional Multi-dimensional Newton s method Basic Newton Gauss- Newton Quasi- Newton Descent methods Gradient descent Conjugate gradient Constrained optimization

### The Kernel Trick, Gram Matrices, and Feature Extraction. CS6787 Lecture 4 Fall 2017

The Kernel Trick, Gram Matrices, and Feature Extraction CS6787 Lecture 4 Fall 2017 Momentum for Principle Component Analysis CS6787 Lecture 3.1 Fall 2017 Principle Component Analysis Setting: find the

### Roots and Coefficients of a Quadratic Equation Summary

Roots and Coefficients of a Quadratic Equation Summary For a quadratic equation with roots α and β: Sum of roots = α + β = and Product of roots = αβ = Symmetrical functions of α and β include: x = and

### 5 Topological defects and textures in ordered media

5 Topological defects and textures in ordered media In this chapter we consider how to classify topological defects and textures in ordered media. We give here only a very short account of the method following

### Line Search Methods for Unconstrained Optimisation

Line Search Methods for Unconstrained Optimisation Lecture 8, Numerical Linear Algebra and Optimisation Oxford University Computing Laboratory, MT 2007 Dr Raphael Hauser (hauser@comlab.ox.ac.uk) The Generic

### 1.1 Radical Expressions: Rationalizing Denominators

1.1 Radical Expressions: Rationalizing Denominators Recall: 1. A rational number is one that can be expressed in the form a, where b 0. b 2. An equivalent fraction is determined by multiplying or dividing

### Empirical Risk Minimization and Optimization

Statistical Machine Learning Notes 3 Empirical Risk Minimization and Optimization Instructor: Justin Domke Contents 1 Empirical Risk Minimization 1 2 Ingredients of Empirical Risk Minimization 4 3 Convex

### Comparative study of Optimization methods for Unconstrained Multivariable Nonlinear Programming Problems

International Journal of Scientific and Research Publications, Volume 3, Issue 10, October 013 1 ISSN 50-3153 Comparative study of Optimization methods for Unconstrained Multivariable Nonlinear Programming

### ( ), λ. ( ) =u z. ( )+ iv z

AMS 212B Perturbation Metho Lecture 18 Copyright by Hongyun Wang, UCSC Method of steepest descent Consider the integral I = f exp( λ g )d, g C =u + iv, λ We consider the situation where ƒ() and g() = u()

### Review for Exam 2 Ben Wang and Mark Styczynski

Review for Exam Ben Wang and Mark Styczynski This is a rough approximation of what we went over in the review session. This is actually more detailed in portions than what we went over. Also, please note

### Maximum Likelihood Estimation

Maximum Likelihood Estimation Prof. C. F. Jeff Wu ISyE 8813 Section 1 Motivation What is parameter estimation? A modeler proposes a model M(θ) for explaining some observed phenomenon θ are the parameters

### Practical Numerical Methods in Physics and Astronomy. Lecture 5 Optimisation and Search Techniques

Practical Numerical Methods in Physics and Astronomy Lecture 5 Optimisation and Search Techniques Pat Scott Department of Physics, McGill University January 30, 2013 Slides available from http://www.physics.mcgill.ca/

### Non-Convex Optimization. CS6787 Lecture 7 Fall 2017

Non-Convex Optimization CS6787 Lecture 7 Fall 2017 First some words about grading I sent out a bunch of grades on the course management system Everyone should have all their grades in Not including paper

### Root Finding Methods

Root Finding Methods Let's take a simple problem ax 2 +bx+c=0 Can calculate roots using the quadratic equation or just by factoring Easy and does not need any numerical methods How about solving for the

### 221A Lecture Notes Convergence of Perturbation Theory

A Lecture Notes Convergence of Perturbation Theory Asymptotic Series An asymptotic series in a parameter ɛ of a function is given in a power series f(ɛ) = f n ɛ n () n=0 where the series actually does

### 1 Overview. 2 A Characterization of Convex Functions. 2.1 First-order Taylor approximation. AM 221: Advanced Optimization Spring 2016

AM 221: Advanced Optimization Spring 2016 Prof. Yaron Singer Lecture 8 February 22nd 1 Overview In the previous lecture we saw characterizations of optimality in linear optimization, and we reviewed the

### 21 Symmetric and skew-symmetric matrices

21 Symmetric and skew-symmetric matrices 21.1 Decomposition of a square matrix into symmetric and skewsymmetric matrices Let C n n be a square matrix. We can write C = (1/2)(C + C t ) + (1/2)(C C t ) =

### IE 5531: Engineering Optimization I

IE 5531: Engineering Optimization I Lecture 15: Nonlinear optimization Prof. John Gunnar Carlsson November 1, 2010 Prof. John Gunnar Carlsson IE 5531: Engineering Optimization I November 1, 2010 1 / 24

### Lecture 15: Exploding and Vanishing Gradients

Lecture 15: Exploding and Vanishing Gradients Roger Grosse 1 Introduction Last lecture, we introduced RNNs and saw how to derive the gradients using backprop through time. In principle, this lets us train

### Report due date. Please note: report has to be handed in by Monday, May 16 noon.

Lecture 23 18.86 Report due date Please note: report has to be handed in by Monday, May 16 noon. Course evaluation: Please submit your feedback (you should have gotten an email) From the weak form to finite

ECE 830 Spring 2013 Statistical Signal Processing instructors: K. Jamieson and R. Nowak Lecture: Adaptive Filtering Adaptive filters are commonly used for online filtering of signals. The goal is to estimate

### Optimization Tutorial 1. Basic Gradient Descent

E0 270 Machine Learning Jan 16, 2015 Optimization Tutorial 1 Basic Gradient Descent Lecture by Harikrishna Narasimhan Note: This tutorial shall assume background in elementary calculus and linear algebra.

### Overview of gradient descent optimization algorithms. HYUNG IL KOO Based on

Overview of gradient descent optimization algorithms HYUNG IL KOO Based on http://sebastianruder.com/optimizing-gradient-descent/ Problem Statement Machine Learning Optimization Problem Training samples:

### A = 3 B = A 1 1 matrix is the same as a number or scalar, 3 = [3].

Appendix : A Very Brief Linear ALgebra Review Introduction Linear Algebra, also known as matrix theory, is an important element of all branches of mathematics Very often in this course we study the shapes

The Conjugate Gradient Method Lecture 5, Continuous Optimisation Oxford University Computing Laboratory, HT 2006 Notes by Dr Raphael Hauser (hauser@comlab.ox.ac.uk) The notion of complexity (per iteration)

### In these chapter 2A notes write vectors in boldface to reduce the ambiguity of the notation.

1 2 Linear Systems In these chapter 2A notes write vectors in boldface to reduce the ambiguity of the notation 21 Matrix ODEs Let and is a scalar A linear function satisfies Linear superposition ) Linear

### Taylor series. Chapter Introduction From geometric series to Taylor polynomials

Chapter 2 Taylor series 2. Introduction The topic of this chapter is find approximations of functions in terms of power series, also called Taylor series. Such series can be described informally as infinite

### Chapter 7 Iterative Techniques in Matrix Algebra

Chapter 7 Iterative Techniques in Matrix Algebra Per-Olof Persson persson@berkeley.edu Department of Mathematics University of California, Berkeley Math 128B Numerical Analysis Vector Norms Definition

### The Potential Energy Surface (PES) Preamble to the Basic Force Field Chem 4021/8021 Video II.i

The Potential Energy Surface (PES) Preamble to the Basic Force Field Chem 4021/8021 Video II.i The Potential Energy Surface Captures the idea that each structure that is, geometry has associated with it

Gradient Based Optimization Methods Antony Jameson, Department of Aeronautics and Astronautics Stanford University, Stanford, CA 94305-4035 1 Introduction Consider the minimization of a function J(x) where

### 2. Quasi-Newton methods

L. Vandenberghe EE236C (Spring 2016) 2. Quasi-Newton methods variable metric methods quasi-newton methods BFGS update limited-memory quasi-newton methods 2-1 Newton method for unconstrained minimization

### An Inverse Mass Expansion for Entanglement Entropy. Free Massive Scalar Field Theory

in Free Massive Scalar Field Theory NCSR Demokritos National Technical University of Athens based on arxiv:1711.02618 [hep-th] in collaboration with Dimitris Katsinis March 28 2018 Entanglement and Entanglement

### Math 411 Preliminaries

Math 411 Preliminaries Provide a list of preliminary vocabulary and concepts Preliminary Basic Netwon s method, Taylor series expansion (for single and multiple variables), Eigenvalue, Eigenvector, Vector

### A Lattice Study of the Glueball Spectrum

Commun. Theor. Phys. (Beijing, China) 35 (2001) pp. 288 292 c International Academic Publishers Vol. 35, No. 3, March 15, 2001 A Lattice Study of the Glueball Spectrum LIU Chuan Department of Physics,

### Electronic structure theory: Fundamentals to frontiers. 1. Hartree-Fock theory

Electronic structure theory: Fundamentals to frontiers. 1. Hartree-Fock theory MARTIN HEAD-GORDON, Department of Chemistry, University of California, and Chemical Sciences Division, Lawrence Berkeley National

### High Performance Nonlinear Solvers

What is a nonlinear system? High Performance Nonlinear Solvers Michael McCourt Division Argonne National Laboratory IIT Meshfree Seminar September 19, 2011 Every nonlinear system of equations can be described

### ORIE 6326: Convex Optimization. Quasi-Newton Methods

ORIE 6326: Convex Optimization Quasi-Newton Methods Professor Udell Operations Research and Information Engineering Cornell April 10, 2017 Slides on steepest descent and analysis of Newton s method adapted

### Marcus Theory for Electron Transfer a short introduction

Marcus Theory for Electron Transfer a short introduction Minoia Andrea MPIP - Journal Club -Mainz - January 29, 2008 1 Contents 1 Intro 1 2 History and Concepts 2 2.1 Frank-Condon principle applied to

### Optimization 2. CS5240 Theoretical Foundations in Multimedia. Leow Wee Kheng

Optimization 2 CS5240 Theoretical Foundations in Multimedia Leow Wee Kheng Department of Computer Science School of Computing National University of Singapore Leow Wee Kheng (NUS) Optimization 2 1 / 38

### Molecular body vs. molecular skeleton Ideal surface 1, 29

Subject Index Acid catalysis 244, 245, 247, 249, 252 Bonds-pairs 177, 178, 184, 185, 190, 191, Activation 207, 208 202-204 Activation barrier 160, 233, 242, 249 Bottleneck 291, 299, 307, 323-325, 336,

### Nonlinear equations and optimization

Notes for 2017-03-29 Nonlinear equations and optimization For the next month or so, we will be discussing methods for solving nonlinear systems of equations and multivariate optimization problems. We will

The Conjugate Gradient Method Jason E. Hicken Aerospace Design Lab Department of Aeronautics & Astronautics Stanford University 14 July 2011 Lecture Objectives describe when CG can be used to solve Ax

### 7. Numerical Solutions of the TISE

7. Numerical Solutions of the TISE Copyright c 2015 2016, Daniel V. Schroeder Besides the infinite square well, there aren t many potentials V (x) for which you can find the energies and eigenfunctions

### Ch4: Method of Steepest Descent

Ch4: Method of Steepest Descent The method of steepest descent is recursive in the sense that starting from some initial (arbitrary) value for the tap-weight vector, it improves with the increased number

### Numerical Methods. Root Finding

Numerical Methods Solving Non Linear 1-Dimensional Equations Root Finding Given a real valued function f of one variable (say ), the idea is to find an such that: f() 0 1 Root Finding Eamples Find real

### Visual SLAM Tutorial: Bundle Adjustment

Visual SLAM Tutorial: Bundle Adjustment Frank Dellaert June 27, 2014 1 Minimizing Re-projection Error in Two Views In a two-view setting, we are interested in finding the most likely camera poses T1 w

### Summary of free theory: one particle state: vacuum state is annihilated by all a s: then, one particle state has normalization:

The LSZ reduction formula based on S-5 In order to describe scattering experiments we need to construct appropriate initial and final states and calculate scattering amplitude. Summary of free theory:

### Linear Algebra and Eigenproblems

Appendix A A Linear Algebra and Eigenproblems A working knowledge of linear algebra is key to understanding many of the issues raised in this work. In particular, many of the discussions of the details

### 1. Method 1: bisection. The bisection methods starts from two points a 0 and b 0 such that

Chapter 4 Nonlinear equations 4.1 Root finding Consider the problem of solving any nonlinear relation g(x) = h(x) in the real variable x. We rephrase this problem as one of finding the zero (root) of a

### Numerical methods part 2

Numerical methods part 2 Alain Hébert alain.hebert@polymtl.ca Institut de génie nucléaire École Polytechnique de Montréal ENE6103: Week 6 Numerical methods part 2 1/33 Content (week 6) 1 Solution of an

### Lecture 2: Linear regression

Lecture 2: Linear regression Roger Grosse 1 Introduction Let s ump right in and look at our first machine learning algorithm, linear regression. In regression, we are interested in predicting a scalar-valued

### Math 263 Assignment #4 Solutions. 0 = f 1 (x,y,z) = 2x 1 0 = f 2 (x,y,z) = z 2 0 = f 3 (x,y,z) = y 1

Math 263 Assignment #4 Solutions 1. Find and classify the critical points of each of the following functions: (a) f(x,y,z) = x 2 + yz x 2y z + 7 (c) f(x,y) = e x2 y 2 (1 e x2 ) (b) f(x,y) = (x + y) 3 (x