14 Lagrange Multipliers
The Method of Lagrange Multipliers is a powerful technique for constrained optimization. While it has applications far beyond machine learning (it was originally developed to solve physics equations), it is used for several key derivations in machine learning.

The problem set-up is as follows: we wish to find extrema (i.e., maxima or minima) of a differentiable objective function

$$E(\mathbf{x}) = E(x_1, x_2, \ldots, x_D) \tag{1}$$

If we have no constraints on the problem, then the extrema must necessarily satisfy the following system of equations:

$$\nabla E = 0 \tag{2}$$

which is equivalent to writing $\frac{dE}{dx_i} = 0$ for all $i$. This equation says that there is no way to infinitesimally perturb $\mathbf{x}$ to get a different value for $E$; the objective function is locally flat.

Now, however, our goal will be to find extrema subject to a single constraint:

$$g(\mathbf{x}) = 0 \tag{3}$$

In other words, we want to find the extrema among the set of points $\mathbf{x}$ that satisfy $g(\mathbf{x}) = 0$.

It is sometimes possible to reparameterize the problem in order to eliminate the constraints (i.e., so that the new parameterization includes all possible solutions to $g(\mathbf{x}) = 0$); however, this can be awkward in some cases, and impossible in others.

Given the constraint $g(\mathbf{x}) = 0$, we are no longer looking for a point where no perturbation in any direction changes $E$. Instead, we need to find a point at which perturbations that satisfy the constraints do not change $E$. This can be expressed by the following condition:

$$\nabla E + \lambda \nabla g = 0 \tag{4}$$

for some arbitrary scalar value $\lambda$. First note that, for points on the contour $g(\mathbf{x}) = 0$, the gradient $\nabla g$ is always perpendicular to the contour (this is a great exercise if you don't remember the proof). Hence the expression $\nabla E = -\lambda \nabla g$ says that the gradient of $E$ must be parallel to the gradient of $g$, and therefore normal to the contour, at a possible solution point. In other words, any perturbation to $\mathbf{x}$ that changes $E$ also makes the constraint become violated. Perturbations that do not change $g$, and hence still lie on the contour $g(\mathbf{x}) = 0$, do not change $E$ either.
Hence, our goal is to find a point $\mathbf{x}$ that satisfies this condition and also $g(\mathbf{x}) = 0$.

In the Method of Lagrange Multipliers, we define a new objective function, called the Lagrangian:

$$L(\mathbf{x}, \lambda) = E(\mathbf{x}) + \lambda g(\mathbf{x}) \tag{5}$$

Now we will instead find the extrema of $L$ with respect to both $\mathbf{x}$ and $\lambda$. The key fact is that extrema of the unconstrained objective $L$ are the extrema of the original constrained problem. So we have eliminated the nasty constraints by changing the objective function and also introducing new unknowns.

Copyright © 2015 Aaron Hertzmann, David J. Fleet and Marcus Brubaker
Figure 1: The set of solutions to $g(\mathbf{x}) = 0$ visualized as a curve. The gradient $\nabla g$ is always normal to the curve. At an extremal point, $\nabla E$ is parallel to $\nabla g$. (Figure from Pattern Recognition and Machine Learning by Chris Bishop.)

To see why, let's look at the extrema of $L$. The extrema of $L$ occur when

$$\frac{dL}{d\lambda} = g(\mathbf{x}) = 0 \tag{6}$$

$$\frac{dL}{d\mathbf{x}} = \nabla E + \lambda \nabla g = 0 \tag{7}$$

which are exactly the conditions given above. In other words, the first equation ensures that $g(\mathbf{x})$ is zero, as desired, and the second equation is our constraint that the gradients of $E$ and $g$ must be parallel. Using the Lagrangian is a convenient way of combining these two constraints into one unconstrained optimization.

14.1 Examples

Minimizing on a circle. We begin with a simple geometric example. We have the following constrained optimization problem:

$$\arg\min_{x,y} \; x + y \tag{8}$$

$$\text{subject to } x^2 + y^2 = 1 \tag{9}$$
Figure 2: Illustration of the maximization on a circle problem. (Image from Wikipedia.)

In other words, we want to find the point on a circle that minimizes $x + y$; the problem is visualized in Figure 2. Here, $E(x,y) = x + y$ and $g(x,y) = x^2 + y^2 - 1$. The Lagrangian for this problem is:

$$L(x, y, \lambda) = x + y + \lambda (x^2 + y^2 - 1) \tag{10}$$

Setting the gradient to zero gives this system of equations:

$$\frac{dL}{dx} = 1 + 2\lambda x = 0 \tag{11}$$

$$\frac{dL}{dy} = 1 + 2\lambda y = 0 \tag{12}$$

$$\frac{dL}{d\lambda} = x^2 + y^2 - 1 = 0 \tag{13}$$

From the first two lines, we can see that $x = y = -\frac{1}{2\lambda}$. Substituting $x = y$ into the constraint and solving gives two solutions, $x = y = \pm\frac{1}{\sqrt{2}}$. Substituting these two solutions into the objective, we see that the minimum is at $x = y = -\frac{1}{\sqrt{2}}$.

Estimating a multinomial distribution. In a multinomial distribution, we have an event $e$ with $K$ possible discrete, disjoint outcomes, where

$$P(e = k) = p_k \tag{14}$$

For example, coin-flipping is a binomial distribution where $K = 2$ and $e = 1$ might indicate that the coin lands heads.
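Returning briefly to the circle example solved above, the following is a minimal sketch of a numerical sanity check, not part of the original notes. It evaluates the gradient of the Lagrangian in Equations 11 to 13 at both candidate points and compares the result against a brute-force search over the circle. The helper name `lagrangian_gradient` and the grid resolution are illustrative assumptions.

```python
import math

def lagrangian_gradient(x, y, lam):
    """Gradient of L(x, y, lam) = x + y + lam * (x**2 + y**2 - 1)."""
    return (1 + 2 * lam * x, 1 + 2 * lam * y, x**2 + y**2 - 1)

# Candidate stationary points from the derivation: x = y = +/- 1/sqrt(2).
# From 1 + 2*lam*x = 0, the matching multiplier is lam = -1 / (2*x).
candidates = []
for x in (1 / math.sqrt(2), -1 / math.sqrt(2)):
    lam = -1 / (2 * x)
    candidates.append((x, x, lam))

# Independent check: brute-force search over points (cos t, sin t) on the circle.
# The grid size is an arbitrary choice for this sketch.
thetas = [2 * math.pi * i / 100000 for i in range(100000)]
best_theta = min(thetas, key=lambda t: math.cos(t) + math.sin(t))
brute_min = (math.cos(best_theta), math.sin(best_theta))

for x, y, lam in candidates:
    print("point", (x, y), "lambda", lam,
          "gradient", lagrangian_gradient(x, y, lam), "E =", x + y)
print("brute-force minimizer:", brute_min)
```

Both candidates make the gradient vanish (up to rounding), which is why the objective still has to be evaluated at each one to tell the minimum from the maximum.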
Suppose we observe $N$ events; the likelihood of the data is:

$$\prod_{i=1}^{N} P(e_i \mid \mathbf{p}) = \prod_{k} p_k^{N_k} \tag{15}$$

where $N_k$ is the number of times that $e = k$, i.e., the number of occurrences of the $k$-th event. To estimate this distribution, we can minimize the negative log-likelihood:

$$\arg\min_{\mathbf{p}} \; -\sum_k N_k \ln p_k \tag{16}$$

$$\text{subject to } \sum_k p_k = 1, \quad p_k \ge 0 \text{ for all } k \tag{17}$$

The constraints are required in order to ensure that the $p$'s form a valid probability distribution. One way to optimize this problem is to reparameterize: set $p_K = 1 - \sum_{k=1}^{K-1} p_k$, substitute in, and then optimize the unconstrained problem in closed form. While this method does work in this case, it breaks the natural symmetry of the problem, resulting in some messy calculations. Moreover, this method often cannot be generalized to other problems.

The Lagrangian for this problem is:

$$L(\mathbf{p}, \lambda) = -\sum_k N_k \ln p_k + \lambda \left( \sum_k p_k - 1 \right) \tag{18}$$

Here we omit the constraint that $p_k \ge 0$ and hope that this constraint will be satisfied by the solution (it will). Setting the gradient to zero gives:

$$\frac{dL}{dp_k} = -\frac{N_k}{p_k} + \lambda = 0 \text{ for all } k \tag{19}$$

$$\frac{dL}{d\lambda} = \sum_k p_k - 1 = 0 \tag{20}$$

Multiplying $dL/dp_k = 0$ by $p_k$ and summing over $k$ gives:

$$0 = -\sum_{k=1}^{K} N_k + \lambda \sum_k p_k = -N + \lambda \tag{21}$$

since $\sum_k N_k = N$ and $\sum_k p_k = 1$. Hence, the optimal $\lambda = N$. Substituting this into $dL/dp_k$ and solving gives:

$$p_k = \frac{N_k}{N} \tag{22}$$

which is the familiar maximum-likelihood estimator for a multinomial distribution.
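The result in Equation 22 can be checked numerically. The sketch below is an illustration rather than part of the original notes: the counts are made-up, and the names `neg_log_likelihood`, `p_hat`, and `worse_everywhere` are assumptions. It verifies that $p_k = N_k / N$ with $\lambda = N$ satisfies Equation 19, and that randomly drawn distributions never achieve a lower negative log-likelihood.

```python
import math
import random

def neg_log_likelihood(counts, p):
    """Negative log-likelihood -sum_k N_k ln p_k from Equation 16."""
    return -sum(n * math.log(pk) for n, pk in zip(counts, p))

counts = [3, 5, 2]                      # hypothetical counts N_k
N = sum(counts)
p_hat = [n / N for n in counts]         # closed-form estimate p_k = N_k / N
lam = N                                 # optimal multiplier from Equation 21

# Stationarity condition of Equation 19: -N_k / p_k + lambda should be zero.
stationarity = [-n / pk + lam for n, pk in zip(counts, p_hat)]

# Compare against random valid distributions (non-negative, summing to one).
rng = random.Random(0)
best = neg_log_likelihood(counts, p_hat)
worse_everywhere = True
for _ in range(1000):
    raw = [rng.random() + 1e-9 for _ in counts]
    total = sum(raw)
    q = [r / total for r in raw]
    if neg_log_likelihood(counts, q) < best - 1e-12:
        worse_everywhere = False

print("p_hat:", p_hat)
print("stationarity residuals:", stationarity)
print("no random distribution did better:", worse_everywhere)
```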
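Maximum variance PCA. In the original formulation of PCA, the goal is to find a low-dimensional projection of $N$ data points $\mathbf{y}_i$

$$x_i = \mathbf{w}^T (\mathbf{y}_i - \mathbf{b}) \tag{23}$$

such that the variance of the $x_i$'s is maximized, subject to the constraint that $\mathbf{w}^T \mathbf{w} = 1$. The Lagrangian is:

$$L(\mathbf{w}, \mathbf{b}, \lambda) = \frac{1}{N} \sum_i \left( x_i - \frac{1}{N} \sum_j x_j \right)^2 + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{24}$$

$$= \frac{1}{N} \sum_i \left( \mathbf{w}^T (\mathbf{y}_i - \mathbf{b}) - \frac{1}{N} \sum_j \mathbf{w}^T (\mathbf{y}_j - \mathbf{b}) \right)^2 + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{25}$$

$$= \frac{1}{N} \sum_i \left( \mathbf{w}^T \left( (\mathbf{y}_i - \mathbf{b}) - \frac{1}{N} \sum_j (\mathbf{y}_j - \mathbf{b}) \right) \right)^2 + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{26}$$

$$= \frac{1}{N} \sum_i \left( \mathbf{w}^T (\mathbf{y}_i - \bar{\mathbf{y}}) \right)^2 + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{27}$$

$$= \frac{1}{N} \sum_i \mathbf{w}^T (\mathbf{y}_i - \bar{\mathbf{y}}) (\mathbf{y}_i - \bar{\mathbf{y}})^T \mathbf{w} + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{28}$$

$$= \mathbf{w}^T \left( \frac{1}{N} \sum_i (\mathbf{y}_i - \bar{\mathbf{y}}) (\mathbf{y}_i - \bar{\mathbf{y}})^T \right) \mathbf{w} + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{29}$$

where $\bar{\mathbf{y}} = \sum_i \mathbf{y}_i / N$. Solving $dL/d\mathbf{w} = 0$ gives:

$$\left( \frac{1}{N} \sum_i (\mathbf{y}_i - \bar{\mathbf{y}}) (\mathbf{y}_i - \bar{\mathbf{y}})^T \right) \mathbf{w} = -\lambda \mathbf{w} \tag{30}$$

This is just the eigenvector equation: in other words, $\mathbf{w}$ must be an eigenvector of the sample covariance of the $\mathbf{y}$'s, and $-\lambda$ must be the corresponding eigenvalue. In order to determine which one, we can substitute this equality into the Lagrangian to get:

$$L = -\mathbf{w}^T \lambda \mathbf{w} + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{31}$$

$$= -\lambda \tag{32}$$

since $\mathbf{w}^T \mathbf{w} = 1$. Since our goal is to maximize the variance, we choose the eigenvector $\mathbf{w}$ which has the largest eigenvalue $-\lambda$.

We have not yet selected $\mathbf{b}$, but it is clear that the value of the objective function does not depend on $\mathbf{b}$, so we might as well set it to be the mean of the data, $\mathbf{b} = \sum_i \mathbf{y}_i / N$, which results in the $x_i$'s having zero mean: $\sum_i x_i / N = 0$.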
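As a rough illustration of the maximum-variance result, the following sketch uses NumPy on synthetic data (the data, seed, and variable names are assumptions made for this example, not part of the notes). It builds the sample covariance with the $1/N$ convention used above, takes the eigenvector with the largest eigenvalue, and checks that the variance of the projections equals that eigenvalue and is not exceeded by random unit directions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: N = 500 points in 3-D with different spread per axis.
Y = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.3])

y_bar = Y.mean(axis=0)
centered = Y - y_bar
C = centered.T @ centered / len(Y)        # sample covariance, 1/N convention

eigvals, eigvecs = np.linalg.eigh(C)      # eigh returns ascending eigenvalues
w = eigvecs[:, -1]                        # unit eigenvector, largest eigenvalue

x = centered @ w                          # projections x_i with b set to the mean
variance = x.var()                        # np.var divides by N by default

# Compare with the projected variance along random unit directions.
dirs = rng.normal(size=(1000, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
random_vars = (centered @ dirs.T).var(axis=0)

print("largest eigenvalue:", eigvals[-1])
print("variance along w:  ", variance)
print("best random direction:", random_vars.max())
```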
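14.2 Least-Squares PCA in one-dimension

We now derive PCA for the case of a one-dimensional projection, in terms of minimizing squared error. Specifically, we are given a collection of data vectors $\mathbf{y}_{1:N}$, and wish to find a bias $\mathbf{b}$, a single unit vector $\mathbf{w}$, and one-dimensional coordinates $x_{1:N}$, to minimize:

$$\arg\min_{\mathbf{w}, x_{1:N}, \mathbf{b}} \sum_i \| \mathbf{y}_i - (\mathbf{w} x_i + \mathbf{b}) \|^2 \tag{33}$$

$$\text{subject to } \mathbf{w}^T \mathbf{w} = 1 \tag{34}$$

The vector $\mathbf{w}$ is called the first principal component. The Lagrangian is:

$$L(\mathbf{w}, x_{1:N}, \mathbf{b}, \lambda) = \sum_i \| \mathbf{y}_i - (\mathbf{w} x_i + \mathbf{b}) \|^2 + \lambda (\| \mathbf{w} \|^2 - 1) \tag{35}$$

There are several sets of unknowns, and we derive their optimal values each in turn.

Projections ($x_i$). We first derive the projections:

$$\frac{dL}{dx_i} = -2 \mathbf{w}^T (\mathbf{y}_i - (\mathbf{w} x_i + \mathbf{b})) = 0 \tag{36}$$

Using $\mathbf{w}^T \mathbf{w} = 1$ and solving for $x_i$ gives:

$$x_i = \mathbf{w}^T (\mathbf{y}_i - \mathbf{b}) \tag{37}$$

Bias ($\mathbf{b}$). We begin by differentiating:

$$\frac{dL}{d\mathbf{b}} = -2 \sum_i (\mathbf{y}_i - (\mathbf{w} x_i + \mathbf{b})) \tag{38}$$

Substituting in Equation 37 gives

$$\frac{dL}{d\mathbf{b}} = -2 \sum_i (\mathbf{y}_i - (\mathbf{w} \mathbf{w}^T (\mathbf{y}_i - \mathbf{b}) + \mathbf{b})) \tag{39}$$

$$= -2 \sum_i \mathbf{y}_i + 2 \mathbf{w} \mathbf{w}^T \sum_i \mathbf{y}_i - 2N \mathbf{w} \mathbf{w}^T \mathbf{b} + 2N \mathbf{b} \tag{40}$$

$$= -2 (\mathbf{I} - \mathbf{w} \mathbf{w}^T) \sum_i \mathbf{y}_i + 2N (\mathbf{I} - \mathbf{w} \mathbf{w}^T) \mathbf{b} = 0 \tag{41}$$

The matrix $\mathbf{I} - \mathbf{w} \mathbf{w}^T$ is a projection that annihilates $\mathbf{w}$, so it is not invertible and Equation 41 determines $\mathbf{b}$ only up to a multiple of $\mathbf{w}$. The equation is satisfied by the data mean, which we take as the solution:

$$\mathbf{b} = \frac{1}{N} \sum_i \mathbf{y}_i \tag{42}$$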
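The projection and bias results (Equations 37 and 42) can be probed numerically. This is a minimal sketch on synthetic data, with an arbitrary fixed unit vector standing in for $\mathbf{w}$; the function name `squared_error` and all data values are assumptions for illustration. It shows that the data mean gives a lower squared error than a bias shifted off the $\mathbf{w}$ direction, and that shifting the bias along $\mathbf{w}$ leaves the error unchanged, because the optimal $x_i$ absorb that shift.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: 200 points in 3-D with a non-zero mean.
Y = rng.normal(size=(200, 3)) + np.array([2.0, -1.0, 0.5])
w = np.array([1.0, 2.0, 2.0]) / 3.0      # an arbitrary unit vector (norm is 1)

def squared_error(b):
    """Objective of Equation 33, using the optimal x_i = w^T (y_i - b)."""
    x = (Y - b) @ w                      # one coordinate per data point
    recon = np.outer(x, w) + b           # reconstructions w x_i + b
    return np.sum((Y - recon) ** 2)

b_mean = Y.mean(axis=0)
err_mean = squared_error(b_mean)
err_shifted_along_w = squared_error(b_mean + 0.7 * w)
err_shifted_off_w = squared_error(b_mean + np.array([0.3, 0.0, 0.0]))

print("error with b = mean:        ", err_mean)
print("error with b shifted along w:", err_shifted_along_w)
print("error with b shifted off w:  ", err_shifted_off_w)
```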
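Basis vector ($\mathbf{w}$). To make things simpler, we will define $\tilde{\mathbf{y}}_i = (\mathbf{y}_i - \mathbf{b})$ as the mean-subtracted data points, and the reconstructions are then $x_i = \mathbf{w}^T \tilde{\mathbf{y}}_i$, and the objective function is:

$$L = \sum_i \| \tilde{\mathbf{y}}_i - \mathbf{w} x_i \|^2 + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{43}$$

$$= \sum_i \| \tilde{\mathbf{y}}_i - \mathbf{w} \mathbf{w}^T \tilde{\mathbf{y}}_i \|^2 + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{44}$$

$$= \sum_i (\tilde{\mathbf{y}}_i - \mathbf{w} \mathbf{w}^T \tilde{\mathbf{y}}_i)^T (\tilde{\mathbf{y}}_i - \mathbf{w} \mathbf{w}^T \tilde{\mathbf{y}}_i) + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{45}$$

$$= \sum_i (\tilde{\mathbf{y}}_i^T \tilde{\mathbf{y}}_i - 2 \tilde{\mathbf{y}}_i^T \mathbf{w} \mathbf{w}^T \tilde{\mathbf{y}}_i + \tilde{\mathbf{y}}_i^T \mathbf{w} \mathbf{w}^T \mathbf{w} \mathbf{w}^T \tilde{\mathbf{y}}_i) + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{46}$$

$$= \sum_i \left( \tilde{\mathbf{y}}_i^T \tilde{\mathbf{y}}_i - (\tilde{\mathbf{y}}_i^T \mathbf{w})^2 \right) + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{47}$$

where we have used $\mathbf{w}^T \mathbf{w} = 1$. We then differentiate and simplify:

$$\frac{dL}{d\mathbf{w}} = -2 \sum_i \tilde{\mathbf{y}}_i \tilde{\mathbf{y}}_i^T \mathbf{w} + 2 \lambda \mathbf{w} = 0 \tag{48}$$

We can rearrange this to get:

$$\left( \sum_i \tilde{\mathbf{y}}_i \tilde{\mathbf{y}}_i^T \right) \mathbf{w} = \lambda \mathbf{w} \tag{49}$$

This is exactly the eigenvector equation, meaning that extrema for $L$ occur when $\mathbf{w}$ is an eigenvector of the matrix $\sum_i \tilde{\mathbf{y}}_i \tilde{\mathbf{y}}_i^T$, and $\lambda$ is the corresponding eigenvalue. Multiplying both sides by $1/N$, we see this matrix has the same eigenvectors as the data covariance:

$$\left( \frac{1}{N} \sum_i (\mathbf{y}_i - \mathbf{b}) (\mathbf{y}_i - \mathbf{b})^T \right) \mathbf{w} = \frac{\lambda}{N} \mathbf{w} \tag{50}$$

Now we must determine which eigenvector to use. We rewrite Equation 47 as:

$$L = \sum_i \left( \tilde{\mathbf{y}}_i^T \tilde{\mathbf{y}}_i - \mathbf{w}^T \tilde{\mathbf{y}}_i \tilde{\mathbf{y}}_i^T \mathbf{w} \right) + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{51}$$

$$= \sum_i \tilde{\mathbf{y}}_i^T \tilde{\mathbf{y}}_i - \mathbf{w}^T \left( \sum_i \tilde{\mathbf{y}}_i \tilde{\mathbf{y}}_i^T \right) \mathbf{w} + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{52}$$

and substitute in Equation 49:

$$L = \sum_i \tilde{\mathbf{y}}_i^T \tilde{\mathbf{y}}_i - \lambda \mathbf{w}^T \mathbf{w} + \lambda (\mathbf{w}^T \mathbf{w} - 1) \tag{54}$$

$$= \sum_i \tilde{\mathbf{y}}_i^T \tilde{\mathbf{y}}_i - \lambda \tag{55}$$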
again using $\mathbf{w}^T \mathbf{w} = 1$. We must pick the eigenvalue $\lambda$ that gives the smallest value of $L$. Hence, we pick the largest eigenvalue, and set $\mathbf{w}$ to be the corresponding eigenvector.

14.3 Multiple constraints

When we wish to optimize with respect to multiple constraints $\{ g_k(\mathbf{x}) \}$, i.e.,

$$\arg\min_{\mathbf{x}} E(\mathbf{x}) \tag{56}$$

$$\text{subject to } g_k(\mathbf{x}) = 0 \text{ for } k = 1 \ldots K \tag{57}$$

extrema occur when:

$$\nabla E + \sum_k \lambda_k \nabla g_k = 0 \tag{58}$$

where we have introduced $K$ Lagrange multipliers $\lambda_k$. The constraints can be combined into a single Lagrangian:

$$L(\mathbf{x}, \lambda_{1:K}) = E(\mathbf{x}) + \sum_k \lambda_k g_k(\mathbf{x}) \tag{59}$$

14.4 Inequality constraints

The method can be extended to inequality constraints of the form $g(\mathbf{x}) \ge 0$. For a solution to be valid and maximal, there are two possible cases:

- The optimal solution is inside the constraint region, and, hence, $\nabla E = 0$ and $g(\mathbf{x}) > 0$. In this region, the constraint is "inactive", meaning that $\lambda$ can be set to zero.
- The optimal solution lies on the boundary $g(\mathbf{x}) = 0$. In this case, the gradient $\nabla E$ must point in the opposite direction of the gradient of $g$; otherwise, following the gradient of $E$ would cause $g$ to become positive while also modifying $E$. Hence, we must have $\nabla E = -\lambda \nabla g$ for $\lambda \ge 0$.

Note that, in both cases, we have $\lambda g(\mathbf{x}) = 0$. Hence, we can enforce that one of these cases is found with the following optimization problem:

$$\max_{\mathbf{x}, \lambda} \; E(\mathbf{x}) + \lambda g(\mathbf{x}) \tag{60}$$

$$\text{such that } g(\mathbf{x}) \ge 0 \tag{61}$$

$$\lambda \ge 0 \tag{62}$$

$$\lambda g(\mathbf{x}) = 0 \tag{63}$$

These are called the Karush-Kuhn-Tucker (KKT) conditions, which generalize the Method of Lagrange Multipliers. When minimizing, we want $\nabla E$ to point in the same direction as $\nabla g$ when on the boundary, and so we minimize $E - \lambda g$ instead of $E + \lambda g$.
Figure 3: Illustration of the condition for inequality constraints: the solution may lie on the boundary of the constraint region, or in the interior. (Figure from Pattern Recognition and Machine Learning by Chris Bishop.)
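The two cases of the inequality-constrained problem can be made concrete with a one-dimensional toy problem. The sketch below is only an illustration under stated assumptions: the objective $E(x) = -(x - c)^2$, the constraint $g(x) = 1 - x \ge 0$, and the function name `solve_kkt` are all invented for this example. It first tries the inactive case ($\lambda = 0$) and falls back to the boundary case, solving $dE/dx + \lambda \, dg/dx = 0$ for $\lambda$.

```python
def solve_kkt(c, bound=1.0):
    """Maximize E(x) = -(x - c)**2 subject to g(x) = bound - x >= 0.

    Returns (x, lam) satisfying the KKT conditions for this toy problem.
    """
    # Case 1: constraint inactive. lam = 0 and dE/dx = 0 give x = c.
    x, lam = c, 0.0
    if bound - x >= 0:
        return x, lam
    # Case 2: constraint active, so the solution sits on the boundary g(x) = 0.
    x = bound
    dE = -2.0 * (x - c)      # dE/dx evaluated at the boundary
    dg = -1.0                # dg/dx for g(x) = bound - x
    lam = -dE / dg           # from dE/dx + lam * dg/dx = 0
    return x, lam

for c in (0.5, 2.0):
    x, lam = solve_kkt(c)
    g = 1.0 - x
    print(f"c={c}: x={x}, lambda={lam}, g(x)={g}, lambda*g(x)={lam * g}")
```

With $c = 0.5$ the unconstrained maximum is feasible and the multiplier is zero; with $c = 2$ the solution is pushed to the boundary $x = 1$ with a positive multiplier. In both cases $\lambda g(x) = 0$.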
More informationDifferentiating Gaussian Processes
Dfferentatng Gaussan Processes Andrew McHutchon Aprl 17, 013 1 Frst Order Dervatve of the Posteror Mean The posteror mean of a GP s gven by, f = x, X KX, X 1 y x, X α 1 Only the x, X term depends on the
More informationSome Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS)
Some Comments on Acceleratng Convergence of Iteratve Sequences Usng Drect Inverson of the Iteratve Subspace (DIIS) C. Davd Sherrll School of Chemstry and Bochemstry Georga Insttute of Technology May 1998
More informationSIO 224. m(r) =(ρ(r),k s (r),µ(r))
SIO 224 1. A bref look at resoluton analyss Here s some background for the Masters and Gubbns resoluton paper. Global Earth models are usually found teratvely by assumng a startng model and fndng small
More informationTransfer Functions. Convenient representation of a linear, dynamic model. A transfer function (TF) relates one input and one output: ( ) system
Transfer Functons Convenent representaton of a lnear, dynamc model. A transfer functon (TF) relates one nput and one output: x t X s y t system Y s The followng termnology s used: x y nput output forcng
More informationPhysics 106a, Caltech 11 October, Lecture 4: Constraints, Virtual Work, etc. Constraints
Physcs 106a, Caltech 11 October, 2018 Lecture 4: Constrants, Vrtual Work, etc. Many, f not all, dynamcal problems we want to solve are constraned: not all of the possble 3 coordnates for M partcles (or
More informationLINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity
LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have
More informationDECOUPLING THEORY HW2
8.8 DECOUPLIG THEORY HW2 DOGHAO WAG DATE:OCT. 3 207 Problem We shall start by reformulatng the problem. Denote by δ S n the delta functon that s evenly dstrbuted at the n ) dmensonal unt sphere. As a temporal
More informationxp(x µ) = 0 p(x = 0 µ) + 1 p(x = 1 µ) = µ
CSE 455/555 Sprng 2013 Homework 7: Parametrc Technques Jason J. Corso Computer Scence and Engneerng SUY at Buffalo jcorso@buffalo.edu Solutons by Yngbo Zhou Ths assgnment does not need to be submtted and
More informationThe Feynman path integral
The Feynman path ntegral Aprl 3, 205 Hesenberg and Schrödnger pctures The Schrödnger wave functon places the tme dependence of a physcal system n the state, ψ, t, where the state s a vector n Hlbert space
More informationSolutions to Problem Set 6
Solutons to Problem Set 6 Problem 6. (Resdue theory) a) Problem 4.7.7 Boas. n ths problem we wll solve ths ntegral: x sn x x + 4x + 5 dx: To solve ths usng the resdue theorem, we study ths complex ntegral:
More informationBOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS
BOUNDEDNESS OF THE IESZ TANSFOM WITH MATIX A WEIGHTS Introducton Let L = L ( n, be the functon space wth norm (ˆ f L = f(x C dx d < For a d d matrx valued functon W : wth W (x postve sem-defnte for all
More information3.1 ML and Empirical Distribution
67577 Intro. to Machne Learnng Fall semester, 2008/9 Lecture 3: Maxmum Lkelhood/ Maxmum Entropy Dualty Lecturer: Amnon Shashua Scrbe: Amnon Shashua 1 In the prevous lecture we defned the prncple of Maxmum
More informationSpectral Graph Theory and its Applications September 16, Lecture 5
Spectral Graph Theory and ts Applcatons September 16, 2004 Lecturer: Danel A. Spelman Lecture 5 5.1 Introducton In ths lecture, we wll prove the followng theorem: Theorem 5.1.1. Let G be a planar graph
More informationWhy Bayesian? 3. Bayes and Normal Models. State of nature: class. Decision rule. Rev. Thomas Bayes ( ) Bayes Theorem (yes, the famous one)
Why Bayesan? 3. Bayes and Normal Models Alex M. Martnez alex@ece.osu.edu Handouts Handoutsfor forece ECE874 874Sp Sp007 If all our research (n PR was to dsappear and you could only save one theory, whch
More informationMACHINE APPLIED MACHINE LEARNING LEARNING. Gaussian Mixture Regression
11 MACHINE APPLIED MACHINE LEARNING LEARNING MACHINE LEARNING Gaussan Mture Regresson 22 MACHINE APPLIED MACHINE LEARNING LEARNING Bref summary of last week s lecture 33 MACHINE APPLIED MACHINE LEARNING
More informationWeek3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity
Week3, Chapter 4 Moton n Two Dmensons Lecture Quz A partcle confned to moton along the x axs moves wth constant acceleraton from x =.0 m to x = 8.0 m durng a 1-s tme nterval. The velocty of the partcle
More information