IV. Performance Optimization


A. Steepest descent algorithm: definition; how to set up bounds on the learning rate; minimization along a line (varying learning rate); momentum learning; examples.

B. Newton's method: definition; Gauss-Newton method; Levenberg-Marquardt method.

C. Conjugate gradient method: definition; conjugate direction theorem; method implementation; example.

References: [Hagan], [Moon]

Performance Optimization

Goal: how do we find the optimum (minimum) points located on the performance (error) surface F(x)?

A NN progressively trains (learns) as it is presented feature vectors, and learning is iterative. The optimization schemes are likewise iterative:

    x_{k+1} = x_k + α_k p_k

where x collects the network parameters (weights w and biases b), α_k is the learning rate, and p_k is the search direction.

Schemes investigated:
A. Steepest descent (initial scheme; with minimization along a line)
B. Newton's method (Gauss-Newton, Levenberg-Marquardt)
C. Conjugate gradient
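
To make the generic scheme concrete, here is a minimal Python sketch of the iteration x_{k+1} = x_k + α_k p_k with a pluggable search direction; the quadratic test function, the fixed α, and names such as `direction_fn` are illustrative assumptions, not from the slides:

```python
import numpy as np

def iterate(x0, direction_fn, alpha=0.05, n_steps=100):
    """Generic optimization scheme: x_{k+1} = x_k + alpha * p_k."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        p = direction_fn(x)   # search direction p_k
        x = x + alpha * p     # fixed learning rate alpha
    return x

# Illustrative quadratic F(x) = 0.5 x^T A x + d^T x
A = np.array([[2.0, 1.0], [1.0, 2.0]])
d = np.array([-1.0, 1.0])
grad = lambda x: A @ x + d            # gradient of the quadratic

# Steepest descent plugs in p_k = -grad F(x_k) (developed next)
print(iterate([2.0, 2.0], lambda x: -grad(x)))   # approaches -A^{-1} d = [1, -1]
```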

A. Steepest Descent

Goal: find p_k so that F(x_{k+1}) < F(x_k), where x_{k+1} = x_k + α_k p_k and p_k is the search direction.

Use a Taylor series expansion to find p_k (stop at the first-order approximation):

    F(x_{k+1}) = F(x_k + Δx_k) ≈ F(x_k) + ∇F(x_k)^T Δx_k,   Δx_k = α_k p_k

For the descent condition F(x_{k+1}) < F(x_k) we need

    α_k ∇F(x_k)^T p_k < 0.

Pick p_k so that this holds for any α_k > 0; the steepest descent choice is

    p_k = -∇F(x_k).

For the quadratic F(x) = (1/2) x^T A x + d^T x + c:

    ∇F(x) = A x + d
    ∇²F(x) = A

Example: F(x) = x_1^2 + 5 x_2^2. Find ∇F(x), ∇²F(x), and the iterative expression for x(k).
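
A quick numerical check of the decoupled iteration, as a sketch under the assumption that the example function is F(x) = x_1² + 5x_2² as reconstructed above: each component contracts by (1 − αλ_i) per step, so x_i(k) = (1 − αλ_i)^k x_i(0).

```python
import numpy as np

# F(x) = x1^2 + 5 x2^2: grad F = [2 x1, 10 x2], Hessian eigenvalues (2, 10)
alpha = 0.05
lam = np.array([2.0, 10.0])
x0 = np.array([1.0, 1.0])

x = x0.copy()
for k in range(1, 6):
    x = x - alpha * lam * x                              # steepest descent step
    assert np.allclose(x, (1 - alpha * lam) ** k * x0)   # closed form x(k)
print(x)   # geometric decay toward the minimum at the origin
```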

A.2 What is the effect of α on the behavior of the iterative scheme?

As α increases, the iteration passes from overdamped behavior, to underdamped behavior, to unstable behavior.

A.3 How to set up bounds on the learning rate α?

For the quadratic F(x) = (1/2) x^T A x + d^T x + c, ∇F(x) = A x + d, so the steepest descent iteration becomes

    x_{k+1} = x_k - α (A x_k + d) = (I - αA) x_k - α d.

Overdamped/Underdamped Behavior

Define c_k = x_k - x_opt, where x_opt = -A^{-1} d is the minimum. Subtracting x_opt from both sides of

    x_{k+1} = (I - αA) x_k - α d

gives

    c_{k+1} = (I - αA) c_k.

Diagonalize the Hessian, A = Q Λ Q^T (Q orthogonal, Λ = diag(λ_1, ..., λ_n)), and change coordinates to v_k = Q^T c_k:

    v_{k+1} = (I - αΛ) v_k,

i.e., the iteration decouples into scalar modes v_i(k+1) = (1 - αλ_i) v_i(k).

Back in the original coordinates,

    x_k = c_k + x_opt = Q v_k + x_opt = Σ_i q_i v_i(k) + x_opt,   v_i(k) = (1 - αλ_i)^k v_i(0).

Each mode changes sign at every step if (1 - αλ_i) < 0. To ensure overdamped behavior, select α < 1/λ_i for all i, so that (1 - αλ_i)^k does not flip sign depending on whether k is even or odd.

Example: F(x) = x_1^2 + 2 x_2^2 + 2 x_1 x_2 + x_1. Find the upper bound on α.

The Hessian A = [2 2; 2 4] has eigenvalues 0.76 and 5.24, so stability requires α < 2/λ_max ≈ 0.38: the iteration converges for α = 0.37 and diverges for α = 0.39.
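
A short sketch checking the bound numerically; the starting point x_0 = [0.8, -0.25] is an assumption borrowed from the examples in [Hagan]:

```python
import numpy as np

A = np.array([[2.0, 2.0], [2.0, 4.0]])   # Hessian of the example F
d = np.array([1.0, 0.0])                 # linear term: grad F = A x + d

lam = np.linalg.eigvalsh(A)
print(lam, 2.0 / lam.max())              # eigenvalues 0.76, 5.24 -> bound 0.382

for alpha in (0.37, 0.39):               # just below / just above the bound
    x = np.array([0.8, -0.25])
    for _ in range(100):
        x = x - alpha * (A @ x + d)
    print(alpha, x)                      # 0.37 -> near [-1, 0.5]; 0.39 diverges
```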

A.4 Minimization along a line

Alternative for estimating α: minimize F(x_k + α p_k) at each iteration with respect to α, where p_k = -∇F(x_k).

For an arbitrary function this is difficult, so look at the quadratic case first. Using the chain rule,

    d/dα F(x_k + α p_k) = ∇F(x_k + α p_k)^T p_k = ∇F(x_k)^T p_k + α p_k^T A p_k

(the second equality uses ∇F(x) = A x + d). Setting the derivative to zero gives

    α_k = - ∇F(x_k)^T p_k / (p_k^T A p_k),

and the update is x_{k+1} = x_k + α_k p_k with p_k = -∇F(x_k).
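
A minimal sketch of steepest descent with the exact line-search α_k for the quadratic case; the matrix A, vector d, and starting point are the assumed example values used above:

```python
import numpy as np

A = np.array([[2.0, 2.0], [2.0, 4.0]])
d = np.array([1.0, 0.0])

x = np.array([0.8, -0.25])
for k in range(10):
    g = A @ x + d                      # gradient g_k
    p = -g                             # steepest descent direction p_k
    alpha = -(g @ p) / (p @ A @ p)     # exact minimizer of F(x_k + alpha p_k)
    x = x + alpha * p
print(x)                               # -> x* = -A^{-1} d = [-1, 0.5]
```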

[Figure: contour plot of F with the steepest descent trajectory, x_2 vs. x_1.]

Recall: α_k is computed so that F(x_k + α p_k) is minimum along the gradient line. Since F(x_k + α p_k) is minimized at x_{k+1}, we have d/dα F(x_k + α p_k) = ∇F(x_{k+1})^T p_k = 0: the gradient at x_{k+1} is orthogonal to the gradient at x_k.

Example: for the quadratic F(x) above, with the given initial condition x_0, do two iterations using minimization along a line, computing ∇F(x_k) at each step.

Example: pattern recognition (2 classes)

Steepest descent training of a multilayer NN for a two-class pattern-recognition problem, run with a small step size and with a larger step size.

[Figure: pattern-recognition problem for a neural network; desired decision output (solid line) and NN output (dashed line).]

A.5 Momentum learning

The speed of convergence of steepest descent may improve if oscillations in the iteration scheme are reduced. The oscillations may be viewed as high-frequency noise which can be smoothed out by a low-pass filter.

Basic steepest descent iteration:

    x_{k+1} = x_k + Δx_k,   Δx_k = -α ∇F(x_k)

Modify as follows, with momentum constant γ ∈ [0, 1):

    Δx_k = γ Δx_{k-1} - (1 - γ) α ∇F(x_k)

γ = 0:  Δx_k = -α ∇F(x_k)  (basic steepest descent)
γ = 1:  Δx_k = Δx_{k-1}  (no slope update)

Impact of momentum:
- when successive derivatives have the same sign, momentum accelerates progress in that direction;
- when successive derivatives have different signs, momentum provides a drag, which tends to minimize oscillations and stabilize the behavior.
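
A minimal sketch of the momentum update Δx_k = γΔx_{k−1} − (1−γ)α∇F(x_k); the quadratic, γ = 0.8, and α = 0.2 are illustrative assumptions:

```python
import numpy as np

A = np.array([[2.0, 2.0], [2.0, 4.0]])
d = np.array([1.0, 0.0])
grad = lambda x: A @ x + d

alpha, gamma = 0.2, 0.8                # step size and momentum constant
x = np.array([0.8, -0.25])
dx = np.zeros_like(x)                  # previous update Delta x_{k-1}
for _ in range(300):
    dx = gamma * dx - (1 - gamma) * alpha * grad(x)   # low-pass filtered step
    x = x + dx
print(x)                               # smoother trajectory toward [-1, 0.5]
```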

Why does momentum learning work?

[Figure: trajectory with momentum on the contour plot; error vs. iteration number.]

Effects of momentum learning and step size on the NN pattern-recognition example (α: step size, µ: momentum constant).

B. Newton's Method

Recall that the steepest descent scheme is based on the first-order expansion

    F(x_{k+1}) = F(x_k + Δx_k) ≈ F(x_k) + ∇F(x_k)^T Δx_k.

Newton's method extends the expansion to second order:

    F(x_{k+1}) ≈ F(x_k) + ∇F(x_k)^T Δx_k + (1/2) Δx_k^T ∇²F(x_k) Δx_k.

Restricting to quadratic functions F(x) = (1/2) x^T A x + d^T x + c:

    ∇F(x) = A x + d,   ∇²F(x) = A,

and the minimum satisfies ∇F(x*) = 0, i.e., x* = -A^{-1} d.

Find Δx_k so that F(x_k + Δx_k) is minimum: differentiating the second-order expansion with respect to Δx_k,

    d/dΔx_k [ F(x_k) + ∇F(x_k)^T Δx_k + (1/2) Δx_k^T ∇²F(x_k) Δx_k ] = ∇F(x_k) + ∇²F(x_k) Δx_k = 0

    Δx_k = -[∇²F(x_k)]^{-1} ∇F(x_k)

so the iteration becomes

    x_{k+1} = x_k + Δx_k = x_k - [∇²F(x_k)]^{-1} ∇F(x_k).

For a quadratic function F(x) = (1/2) x^T A x + d^T x + c:

    x_{k+1} = x_k - A^{-1}(A x_k + d) = -A^{-1} d = x*,

i.e., Newton's method converges in one step.
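
A short check that a single Newton step lands on the minimum of a quadratic (same assumed example values as above):

```python
import numpy as np

A = np.array([[2.0, 2.0], [2.0, 4.0]])   # Hessian (constant for a quadratic)
d = np.array([1.0, 0.0])

x0 = np.array([0.8, -0.25])
g0 = A @ x0 + d                          # gradient at x_0
x1 = x0 - np.linalg.solve(A, g0)         # Newton step: x_1 = x_0 - A^{-1} g_0
print(x1)                                # exactly x* = [-1, 0.5] in one step
```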

B.1 What happens when F(x) is not quadratic and we use Newton's method in the iterative scheme?

[Figures: true function vs. its local quadratic approximation, and one iteration of Newton's scheme from two different initial points; depending on the starting point, the step may or may not move toward the true minimum.]

Newton's method summary

Newton's method is based on a local approximation of F(x) by a quadratic function:
- if F(x) is quadratic, it converges in 1 step;
- if F(x) is not quadratic, it may converge to a local minimum or a saddle point, or it may oscillate.

Newton's method is expensive:
- need to compute the Hessian ∇²F(x_k) at each iteration;
- need to solve a linear system in ∇²F(x_k) at each iteration.

B.2 Gauss-Newton Method

The Newton iteration x_{k+1} = x_k - A_k^{-1} g_k (with A_k = ∇²F(x_k), g_k = ∇F(x_k)) is expensive to compute, so we approximate it when F is a sum of squares:

    F(x) = Σ_{i=1}^N v_i²(x) = v(x)^T v(x).

Component j of the gradient is

    [∇F(x)]_j = ∂F/∂x_j = 2 Σ_{i=1}^N v_i(x) ∂v_i(x)/∂x_j,

so in matrix form

    ∇F(x) = 2 J(x)^T v(x),

where J(x) is the Jacobian of v(x).

The Jacobian is

    J(x) = [ ∂v_i(x)/∂x_j ],   i = 1, ..., N,  j = 1, ..., n.

For the Hessian, element (j, l) is

    [∇²F(x)]_{j,l} = 2 Σ_{i=1}^N { ∂v_i(x)/∂x_j · ∂v_i(x)/∂x_l + v_i(x) ∂²v_i(x)/(∂x_j ∂x_l) },

so

    ∇²F(x) = 2 J(x)^T J(x) + 2 S(x),

where S(x) involves the 2nd-order derivative terms, which can be neglected:

    ∇²F(x) ≈ 2 J(x)^T J(x).

Substituting into the Newton iteration gives the Gauss-Newton update

    x_{k+1} = x_k - [J(x_k)^T J(x_k)]^{-1} J(x_k)^T v(x_k).

B.3 Levenberg-Marquardt Scheme

Gauss-Newton:

    x_{k+1} = x_k - [J(x_k)^T J(x_k)]^{-1} J(x_k)^T v(x_k)

For additional robustness in the numerical implementation (J^T J may be singular or ill-conditioned), add a damping term µ_k I:

    x_{k+1} = x_k - [J(x_k)^T J(x_k) + µ_k I]^{-1} J(x_k)^T v(x_k).
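
A minimal sketch of the damped update for a generic residual vector v(x) with Jacobian J(x); the toy exponential-fit residuals, the names `v_fn` and `J_fn`, and the fixed µ are illustrative assumptions (practical LM implementations adapt µ at each iteration):

```python
import numpy as np

def lm_step(x, v_fn, J_fn, mu):
    """One Levenberg-Marquardt update for F(x) = v(x)^T v(x)."""
    v, J = v_fn(x), J_fn(x)
    H = J.T @ J + mu * np.eye(x.size)      # damped Gauss-Newton Hessian approx.
    return x - np.linalg.solve(H, J.T @ v)

# Toy residuals: fit y = exp(a t) at a few sample points t
t = np.linspace(0.0, 1.0, 5)
y = np.exp(0.7 * t)
v_fn = lambda x: np.exp(x[0] * t) - y              # residual vector v(x)
J_fn = lambda x: (t * np.exp(x[0] * t))[:, None]   # Jacobian dv_i/da, (5 x 1)

x = np.array([0.0])
for _ in range(20):
    x = lm_step(x, v_fn, J_fn, mu=1e-2)
print(x)   # -> about [0.7]; setting mu = 0 recovers plain Gauss-Newton
```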

C. Conjugate gradient method (CG method)

Using 2nd-order information is often too expensive, so go back to a 1st-order approximation. For small problems, CG is less efficient than the Newton scheme; for large problems, CG is a leading contender.

1) Assume F(x) = (1/2) x^T A x + d^T x + c and that we want to compute the minimum of F(x).

2) Definition: mutually conjugate vectors with respect to a matrix A (A-orthogonal). A set of vectors {p_k} is mutually conjugate with respect to a P.D. Hessian matrix A iff

    p_k^T A p_j = 0,   k ≠ j.

Consequence: the eigenvectors of A are A-conjugate.
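
A two-line numerical check of this consequence, reusing the earlier example Hessian as an illustrative choice:

```python
import numpy as np

A = np.array([[2.0, 2.0], [2.0, 4.0]])   # P.D. Hessian from the earlier example
lam, Q = np.linalg.eigh(A)               # columns of Q are eigenvectors of A
print(Q[:, 0] @ A @ Q[:, 1])             # ~0: the eigenvectors are A-conjugate
```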

3) If the vectors {p_k}, k = 1, ..., K, are non-zero and A-conjugate for a P.D. matrix A, then the set of vectors {p_k} is linearly independent.

Consequences:
1) One can minimize a quadratic by searching along the eigenvectors, as they are the main axes of the ellipsoidal contours (however, the eigenvectors require the Hessian, which is expensive).
2) With a set of exact line searches along a set of conjugate vectors, the minimum can be reached in at most n steps, so the problem comes down to computing conjugate directions.

C.1 Conjugate direction theorem

Assume F(x) = (1/2) x^T A x + d^T x + c, x ∈ R^n. Let {p_0, ..., p_{n-1}} be a set of A-conjugate vectors. For any initial condition x_0, the iteration

    x_{k+1} = x_k + α_k p_k,   α_k = - g_k^T p_k / (p_k^T A p_k),   g_k = ∇F(x_k) = A x_k + d,

converges to the unique minimum x* of F(x) in n steps.

Proof:

Since the {p_k} are A-conjugate, they are linearly independent, so we can expand

    x* - x_0 = Σ_{j=0}^{n-1} α'_j p_j.

Multiplying by p_k^T A and using conjugacy (p_k^T A p_j = 0 for j ≠ k):

    p_k^T A (x* - x_0) = α'_k p_k^T A p_k   ⇒   α'_k = p_k^T A (x* - x_0) / (p_k^T A p_k),   k = 0, ..., n-1.

Also, x_k - x_0 = Σ_{j=0}^{k-1} α_j p_j, so p_k^T A (x_k - x_0) = 0, and therefore

    p_k^T A (x* - x_0) = p_k^T A (x* - x_k) = p_k^T ( (A x* + d) - (A x_k + d) ) = p_k^T ( ∇F(x*) - ∇F(x_k) ) = - p_k^T g_k

(using ∇F(x*) = 0). Hence

    α'_k = - p_k^T g_k / (p_k^T A p_k) = α_k,

so the iteration reproduces the expansion of x* - x_0 term by term, and x_n = x*.

C.2 CG method implementation

CG requires knowledge of the conjugate direction vectors p_k; usually the p_k are computed as the method progresses (not beforehand). Recall

    x_{k+1} = x_k + α_k p_k,

with α_k chosen to minimize F(x) in the direction p_k (exact line search), which implies g_{k+1}^T p_k = 0. We need to select the p_k so that

    p_k^T A p_j = 0,   j ≠ k.

Recall also that

    g_{k+1} - g_k = (A x_{k+1} + d) - (A x_k + d) = A (x_{k+1} - x_k) = α_k A p_k,

so the conjugacy condition can be expressed through gradient differences, without forming A explicitly.

Iteration k = 0:

    p_0 = -g_0  (steepest descent direction)
    x_1 = x_0 + α_0 p_0,   α_0 = - g_0^T p_0 / (p_0^T A p_0),   g_0 = ∇F(x_0) = A x_0 + d.

For k ≥ 1, pick p_k conjugate to p_{k-1}: set

    p_k = -g_k + β_k p_{k-1}

with β_k chosen so that p_k^T A p_{k-1} = 0, which (using g_k - g_{k-1} = α_{k-1} A p_{k-1} and g_k^T p_{k-1} = 0) gives

    β_k = g_k^T g_k / (g_{k-1}^T g_{k-1}).

Overall iteration scheme:

    x_{k+1} = x_k + α_k p_k
    α_k = - g_k^T p_k / (p_k^T A p_k)
    g_k = A x_k + d
    p_k = -g_k + β_k p_{k-1},   β_k = g_k^T g_k / (g_{k-1}^T g_{k-1})
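
A minimal sketch of the full CG scheme on the assumed example quadratic; with n = 2 it terminates at the minimum in two steps (the β_k formula above is the Fletcher-Reeves form):

```python
import numpy as np

A = np.array([[2.0, 2.0], [2.0, 4.0]])
d = np.array([1.0, 0.0])

x = np.array([0.8, -0.25])
g = A @ x + d
p = -g                                   # p_0: steepest descent direction
for k in range(2):                       # n = 2 conjugate steps suffice
    alpha = -(g @ p) / (p @ A @ p)       # exact line search
    x = x + alpha * p
    g_new = A @ x + d
    beta = (g_new @ g_new) / (g @ g)     # Fletcher-Reeves beta
    p = -g_new + beta * p                # next conjugate direction
    g = g_new
print(x)                                 # x* = [-1, 0.5] after n = 2 steps
```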

Example: for the quadratic F(x) above, starting from x_0 = [0.8, -0.25]^T, implement the conjugate gradient scheme.

[Figure: contour plots comparing the conjugate gradient trajectory with the steepest descent trajectory, x_2 vs. x_1.]
