Practical Newton's Method


Newton's Method

- Pure Newton's method converges rapidly once it is close to $x^*$, but it may not converge from a remote starting point (a code sketch follows this list).
- The search direction $p_k^N = -\nabla^2 f(x_k)^{-1}\, \nabla f(x_k)$ is guaranteed to be a descent direction only if the Hessian $\nabla^2 f(x_k)$ is positive definite. Otherwise it may be an ascent direction, or may be excessively long.

Two strategies:
- Newton-CG: solve the linear system $\nabla^2 f(x_k)\, p = -\nabla f(x_k)$ using CG; terminate if negative curvature is encountered.
- Modified Newton: modify the Hessian before or during the solution.
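A minimal Python sketch of the pure iteration (not from the lecture; `grad` and `hess` are hypothetical user-supplied callables) makes both caveats visible: the unit step has no globalization, and the linear solve assumes an invertible, ideally positive definite, Hessian.

```python
import numpy as np

def pure_newton(grad, hess, x0, tol=1e-8, max_iter=50):
    """Pure Newton iteration: x_{k+1} = x_k - H(x_k)^{-1} g(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:   # stationary point reached
            break
        # Newton step; a descent direction only when hess(x) is PD.
        p = np.linalg.solve(hess(x), -g)
        x = x + p                     # unit step: no line search
    return x
```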

Inexact Newton Steps

Use an iterative method to solve the Newton system $\nabla^2 f(x_k)\, p_k = -\nabla f(x_k)$, terminating at some approximate solution. The residual

$$r_k = \nabla^2 f(x_k)\, p_k + \nabla f(x_k)$$

is scale dependent, so we work with the relative residual and terminate the inner iterations when

$$\|r_k\| \le \eta_k\, \|\nabla f(x_k)\|, \qquad 0 < \eta_k < 1,$$

where $\{\eta_k\}$ is the forcing sequence. How should $\eta_k$ be chosen? (A code sketch of this test follows below.)

Theorem 6.1. Suppose that $\nabla f$ is continuously differentiable in a neighborhood of a minimizer $x^*$, and assume that $\nabla^2 f(x^*)$ is positive definite. Consider the iteration $x_{k+1} = x_k + p_k$, where $p_k$ satisfies $\|r_k\| \le \eta_k\, \|\nabla f(x_k)\|$ with $\eta_k \le \eta$ for some constant $\eta \in [0, 1)$. Then, if the starting point $x_0$ is sufficiently near $x^*$, the sequence $\{x_k\}$ converges to $x^*$ linearly. That is, for all $k$ sufficiently large, we have

$$\|x_{k+1} - x^*\| \le c\, \|x_k - x^*\|$$

for some constant $0 < c < 1$.
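As a sketch, the inner stopping test might be coded as follows (identifiers are ours; the lecture specifies only the inequality):

```python
import numpy as np

def inner_converged(H, p, g, eta_k):
    """Terminate the inner solver when ||r_k|| <= eta_k * ||grad f(x_k)||,
    with residual r_k = H p + g for the Newton system H p = -g."""
    r = H @ p + g
    return np.linalg.norm(r) <= eta_k * np.linalg.norm(g)
```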

Recall from Lecture-6 the local convergence theorem for the exact Newton step. Suppose that $f$ is twice differentiable and that the Hessian $\nabla^2 f(x)$ is Lipschitz continuous in a neighborhood of a solution $x^*$. Consider the iteration $x_{k+1} = x_k + p_k$, where $p_k$ is the Newton step $p_k^N = -\nabla^2 f(x_k)^{-1}\, \nabla f(x_k)$. Then:
1. If the starting point $x_0$ is sufficiently close to $x^*$, the sequence of iterates converges to $x^*$.
2. The rate of convergence is quadratic.
3. The sequence of gradient norms $\{\|\nabla f(x_k)\|\}$ converges quadratically to zero.

Proof sketch of Theorem 6.1. The inexact step satisfies

$$\nabla^2 f(x_k)\, p_k = -\nabla f(x_k) + r_k, \qquad \|r_k\| \le \eta_k\, \|\nabla f(x_k)\|.$$

If the Hessian is positive definite near $x^*$, it is invertible with bounded inverse, so

$$\|p_k\| \le \|\nabla^2 f(x_k)^{-1}\| \left( \|\nabla f(x_k)\| + \|r_k\| \right) = O(\|\nabla f(x_k)\|).$$

By a Taylor series expansion of the gradient about $x_k$,

$$\nabla f(x_{k+1}) = \nabla f(x_k) + \nabla^2 f(x_k)\, p_k + O(\|p_k\|^2) = r_k + O(\|p_k\|^2),$$

and therefore

$$\|\nabla f(x_{k+1})\| \le \eta_k\, \|\nabla f(x_k)\| + O(\|\nabla f(x_k)\|^2).$$

With $\eta_k \le \eta < 1$, this becomes

$$\|\nabla f(x_{k+1})\| \le \eta\, \|\nabla f(x_k)\| + O(\|\nabla f(x_k)\|^2).$$

If $x_0$ is chosen close to $x^*$, we can expect $\|\nabla f(x_k)\|$ to decrease by a factor of approximately $\eta$ at every iteration:

$$\limsup_{k \to \infty} \frac{\|\nabla f(x_{k+1})\|}{\|\nabla f(x_k)\|} \le \eta < 1.$$

If $r_k = o(\|\nabla f(x_k)\|)$, then

$$\limsup_{k \to \infty} \frac{\|\nabla f(x_{k+1})\|}{\|\nabla f(x_k)\|} = 0 \qquad \text{(superlinear)}.$$

If $r_k = O(\|\nabla f(x_k)\|^2)$, then

$$\|\nabla f(x_{k+1})\| \le O(\|\nabla f(x_k)\|^2) + O(\|\nabla f(x_k)\|^2) \le c\, \|\nabla f(x_k)\|^2 \qquad \text{(quadratic)}.$$

Theorem 6.2. Suppose that the conditions of Theorem 6.1 hold and assume that the iterates $\{x_k\}$ generated by the inexact Newton method converge to $x^*$. Then the rate of convergence is superlinear if $\eta_k \to 0$, and quadratic if $\eta_k = O(\|\nabla f(x_k)\|)$.

For the quadratic rate, take for example $\eta_k = \min\left(0.5,\ \|\nabla f(x_k)\|\right)$. Then

$$\|r_k\| \le \eta_k\, \|\nabla f(x_k)\| = O(\|\nabla f(x_k)\|^2),$$

so

$$\|\nabla f(x_{k+1})\| \le \|r_k\| + O(\|\nabla f(x_k)\|^2) = O(\|\nabla f(x_k)\|^2), \qquad \limsup_{k \to \infty} \frac{\|\nabla f(x_{k+1})\|}{\|\nabla f(x_k)\|^2} \le c.$$
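The two rates correspond to two simple forcing-term choices; a hedged sketch (the square-root variant matches the $\varepsilon_k$ used in Algorithm 6.1 below, the linear variant matches Theorem 6.2's quadratic case):

```python
import numpy as np

def eta_superlinear(g_norm):
    # eta_k -> 0 as the gradient vanishes  =>  superlinear local rate
    return min(0.5, np.sqrt(g_norm))

def eta_quadratic(g_norm):
    # eta_k = O(||grad f(x_k)||)  =>  quadratic local rate
    return min(0.5, g_norm)
```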

Line-Search Newton-CG Method

- The starting point for the CG iteration is $p = 0$.
- Negative curvature test: if a CG search direction $d_j$ satisfies $d_j^T\, \nabla^2 f(x_k)\, d_j \le 0$, then: if $j = 0$, return the steepest-descent direction $-\nabla f(x_k)$; if $j > 0$, stop the CG iteration and return the most recent inner solution. The Newton step $p_k$ is defined as the final CG iterate.

Algorithm 6.1 (Line Search Newton-CG).
Given initial point $x_0$;
for $k = 0, 1, 2, \ldots$
    Set $\varepsilon_k = \min\left(0.5,\ \sqrt{\|\nabla f(x_k)\|}\right) \|\nabla f(x_k)\|$;
    Compute a search direction $p_k$ by applying the CG method to $\nabla^2 f(x_k)\, p = -\nabla f(x_k)$, starting from $p = 0$; terminate when $\|r_k\| \le \varepsilon_k$, or when negative curvature is encountered;
    Set $x_{k+1} = x_k + \alpha_k p_k$, where $\alpha_k$ satisfies the Wolfe or backtracking conditions;
end (for)
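A runnable sketch of Algorithm 6.1 under stated assumptions: plain Armijo backtracking stands in for the Wolfe conditions, and all identifiers are ours rather than the lecture's.

```python
import numpy as np

def newton_cg_direction(H, g):
    """Inner CG loop for H p = -g with the negative-curvature test."""
    p = np.zeros_like(g)                    # CG starts from p = 0
    r = g.copy()                            # residual r = H p + g
    d = -r
    eps = min(0.5, np.sqrt(np.linalg.norm(g))) * np.linalg.norm(g)
    for j in range(2 * g.size):
        Hd = H @ d
        if d @ Hd <= 0:                     # negative curvature encountered
            return -g if j == 0 else p      # steepest descent on first pass
        alpha = (r @ r) / (d @ Hd)
        p = p + alpha * d
        r_new = r + alpha * Hd
        if np.linalg.norm(r_new) <= eps:    # inexact-Newton stopping test
            return p
        d = -r_new + ((r_new @ r_new) / (r @ r)) * d
        r = r_new
    return p

def line_search_newton_cg(f, grad, hess, x0, tol=1e-8, max_iter=200):
    """Outer loop: Newton-CG direction plus Armijo backtracking."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        p = newton_cg_direction(hess(x), g)
        alpha = 1.0
        while f(x + alpha * p) > f(x) + 1e-4 * alpha * (g @ p):
            alpha *= 0.5                    # backtrack until Armijo holds
        x = x + alpha * p
    return x
```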

Problems

If the Hessian is nearly singular, the Newton-CG direction can be excessively long, requiring many function evaluations in the line search, and the reduction in the objective may be very small. Possible remedies: normalize the Newton direction, or introduce a threshold into the negative curvature test (reject $d_j$ when $d_j^T A\, d_j \le \varepsilon$ rather than $\le 0$).

Algorithm 6.2 (Line Search Newton with Modification).
Given initial point $x_0$;
for $k = 0, 1, 2, \ldots$
    Factorize the matrix $B_k = \nabla^2 f(x_k) + E_k$, where $E_k = 0$ if $\nabla^2 f(x_k)$ is sufficiently positive definite; otherwise, $E_k$ is chosen to ensure that $\nabla^2 f(x_k) + E_k$ is sufficiently positive definite;
    Solve $B_k\, p_k = -\nabla f(x_k)$;
    Set $x_{k+1} = x_k + \alpha_k p_k$, where $\alpha_k$ satisfies the Wolfe or backtracking conditions;
end (for)
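The lecture leaves $E_k$ abstract. One common concrete choice, adding a multiple of the identity and retrying the Cholesky factorization until it succeeds, might look like this sketch (parameter values are illustrative):

```python
import numpy as np

def modified_newton_step(H, g, beta=1e-3, max_tries=60):
    """Solve (H + tau*I) p = -g, with tau increased until H + tau*I
    admits a Cholesky factorization, i.e. E_k = tau*I."""
    diag_min = np.min(np.diag(H))
    tau = 0.0 if diag_min > 0 else beta - diag_min
    for _ in range(max_tries):
        try:
            L = np.linalg.cholesky(H + tau * np.eye(H.shape[0]))
        except np.linalg.LinAlgError:       # not (sufficiently) PD yet
            tau = max(2.0 * tau, beta)      # grow the shift and retry
            continue
        y = np.linalg.solve(L, -g)          # forward substitution
        return np.linalg.solve(L.T, y)      # back substitution
    raise RuntimeError("could not make the Hessian sufficiently PD")
```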

Bounded Modified Factorization Property

The matrices in the sequence $\{B_k\}$ should have bounded condition number whenever the sequence of Hessians $\{\nabla^2 f(x_k)\}$ is bounded; that is,

$$\operatorname{cond}(B_k) = \|B_k\|\, \|B_k^{-1}\| \le C \quad \text{for some } C > 0 \text{ and all } k.$$

Hessian Modification

Choose the modification $E_k$ such that the matrix $\nabla^2 f(x_k) + E_k$ is sufficiently positive definite. The modification should:
- keep $B_k$ well-conditioned,
- be small, so that second-order information is preserved,
- be computable at moderate cost.
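A trivial check of the property on a given modified matrix (the bound `C` below is an arbitrary illustrative value, not from the lecture):

```python
import numpy as np

def has_bounded_condition(B, C=1e8):
    """cond(B) = ||B|| * ||B^{-1}||, computed here in the 2-norm."""
    return np.linalg.cond(B) <= C
```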

Eigenvalue Modification

Spectral decomposition: $A = Q \Lambda Q^T$ with $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, i.e. $A = \sum_{i=1}^{n} \lambda_i\, q_i q_i^T$.

Example: at some iterate,

$$\nabla^2 f(x_k) = \operatorname{diag}(10,\ 3,\ -1), \qquad \nabla f(x_k) = (1,\ -3,\ 2)^T, \qquad Q = I.$$

The pure Newton step is $p^N = -\nabla^2 f(x_k)^{-1}\, \nabla f(x_k) = (-0.1,\ 1,\ 2)^T$, and

$$(p^N)^T\, \nabla f(x_k) = 0.9 > 0,$$

so it is not a descent direction.

Replace all negative eigenvalues by a small positive number, e.g. $\delta = 10^{-8}$:

$$B_k = \sum_{i=1}^{2} \lambda_i\, q_i q_i^T + \delta\, q_3 q_3^T = \operatorname{diag}(10,\ 3,\ 10^{-8}),$$

which gives the step $p = -B_k^{-1}\, \nabla f(x_k) = (-0.1,\ 1,\ -2 \times 10^{8})^T$. For small $\delta$ this step is nearly parallel to $q_3$ and very long. Although $f$ decreases along the direction $p$, its extreme length violates the spirit of Newton's method, which relies on the quadratic approximation of the objective function.
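The numbers in the example are easy to reproduce; a sketch:

```python
import numpy as np

H = np.diag([10.0, 3.0, -1.0])             # indefinite Hessian
g = np.array([1.0, -3.0, 2.0])             # gradient

p_newton = np.linalg.solve(H, -g)          # (-0.1, 1, 2)
print(p_newton @ g)                        # 0.9 > 0: ascent direction

lam, Q = np.linalg.eigh(H)                 # spectral decomposition
delta = 1e-8
B = Q @ np.diag(np.where(lam > 0, lam, delta)) @ Q.T   # diag(10, 3, 1e-8)
print(np.linalg.solve(B, -g))              # (-0.1, 1, -2e8): huge step
```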

Alternatives (reproduced in code after this list):
- Flip the signs of the negative eigenvalues; in our case, set $B_k = \operatorname{diag}(10,\ 3,\ 1)$.
- Set the last term to zero, $B_k = \operatorname{diag}(10,\ 3,\ 0)$, so that the search direction has no component along the negative curvature directions.
- Adapt the choice of $\delta$ to ensure that the length of the step is not excessive.
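Continuing the snippet above (reusing `lam`, `Q`, `g`), the first two alternatives on the same example:

```python
# Flip the signs of the negative eigenvalues: B = diag(10, 3, 1).
B_flip = Q @ np.diag(np.abs(lam)) @ Q.T
print(np.linalg.solve(B_flip, -g))         # (-0.1, 1, -2)

# Zero out the negative-curvature term and use the pseudoinverse,
# so the step has no component along q_3.
B_zero = Q @ np.diag(np.where(lam > 0, lam, 0.0)) @ Q.T
print(-np.linalg.pinv(B_zero) @ g)         # (-0.1, 1, 0)
```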