Homogenised Virtual Support Vector Machines
Christian J. Walder (1,2) and Brian C. Lovell (2)
1 Max Planck Institute for Biological Cybernetics, Spemannstraße 38, Tübingen, Germany
2 IRIS Research Group, EMI, School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Queensland 4072, Australia
christian.walder@tuebingen.mpg.de, lovell@itee.uq.edu.au

Abstract

In many domains, reliable a priori knowledge exists that may be used to improve classifier performance. For example, in handwritten digit recognition, such a priori knowledge may include classification invariance with respect to image translations and rotations. In this paper, we present a new generalisation of the Support Vector Machine (SVM) that aims to better incorporate this knowledge. The method is an extension of the Virtual SVM, and penalises an approximation of the variance of the decision function across each grouped set of virtual examples, thus utilising the fact that these groups should ideally be assigned similar class membership probabilities. The method is shown to be an efficient approximation of the invariant SVM of Chapelle and Schölkopf, with the advantage that it can be solved by trivial modification to standard SVM optimisation packages, with negligible increase in computational complexity when compared with the Virtual SVM. The efficacy of the method is demonstrated on a simple problem.

1 Introduction

In recent years Vapnik's Support Vector Machine (SVM) classifier [9] has become established among the most effective classifiers on real-world problems. As remarked in a comparison of classifiers applied to handwritten digit recognition written by LeCun et al. in 1995, SVMs are capable of achieving high generalisation performance with none of the a priori knowledge that is necessary in order to achieve similar results with other methods [4]. In the years since this comparison was conducted, SVM researchers have managed to incorporate problem-specific prior knowledge into the SVM algorithm. One very effective approach to date, as far as performance on the MNIST handwritten digit database is concerned, has been the Virtual SVM (V-SVM) method of Schölkopf et al. [5]. Indeed, to the best of our knowledge this method has achieved the lowest recorded test set error on the MNIST set, with trivial data preprocessing [3].

The V-SVM method involves first training a normal SVM in order to extract the support vector set. A set of transformations that are known to have no effect on the likelihood of class membership is then applied to these vectors, producing virtual support vectors. For example, in the case of handwritten digits, new virtual examples can be created by randomly shifting and rotating (in the 2D image sense) the original training digits. A new machine is then trained on the union of the original support vector set and the synthetic virtual support vectors.

In V-SVM, when the machine is retrained on the augmented training set, no distinction is made between the real and virtual data vectors. As such, a potentially useful piece of information about the problem is being discarded, namely that we know the decision (latent) function itself should be invariant to the transformations applied to the support vectors, not just the classifier output that corresponds to the sign of this latent function. Indeed, precisely this information has been incorporated in the Invariant SVM (I-SVM) method of Chapelle and Schölkopf [2], which has previously been implemented using ideas from kernel PCA [7]. The present work lies in between the I-SVM and V-SVM, in that the virtual examples are included as in V-SVM, but the I-SVM-like invariance of the latent function is imposed only on the support vectors.
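The V-SVM pipeline just described (train, extract support vectors, transform, retrain) is straightforward to sketch in code. The following is a minimal illustration only, not the authors' implementation: it uses scikit-learn, takes planar rotation as the label-preserving transformation, and the kernel and parameter settings are placeholder assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def rotate(X, theta):
    """Rotate 2-D points about the origin by theta radians."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return X @ R.T

def virtual_svm(X, y, theta=0.5, C=10.0):
    # 1. Train a normal SVM to extract the support vector set.
    base = SVC(kernel="poly", degree=2, C=C).fit(X, y)
    sv, sv_y = X[base.support_], y[base.support_]
    # 2. Apply a transformation known not to affect class membership,
    #    producing virtual support vectors with the same labels.
    virt = rotate(sv, theta)
    # 3. Retrain on the union of the real and virtual support vectors.
    X_aug, y_aug = np.vstack([sv, virt]), np.concatenate([sv_y, sv_y])
    return SVC(kernel="poly", degree=2, C=C).fit(X_aug, y_aug)
```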
As we shall see, however, the present approach represents a rather efficient approximation to the I-SVM, as it can be implemented by trivial modification to existing SVM optimisation software such as LIBSVM [1], with the same computational cost as the V-SVM.

The remainder of the paper is structured as follows: in Section 2 we briefly review the soft-margin SVM, before introducing the I-SVM in Section 3. In Section 4 the main contribution of the paper begins with the derivation of the necessary equations for the new approach. In Section 5 we demonstrate the efficacy of the new approach on a simple toy problem, showing improved performance as compared to the state-of-the-art, and rather hard to beat, V-SVM.

2 Soft-Margin Support Vector Machines

In normal squared-loss soft-margin SVM classification [9], we have a set of labelled points $x_i \in \mathbb{R}^d$, $i = 1 \dots N$, with associated class labels $y_i \in \{1, -1\}$. The standard SVM formulation solves the following problem:

$$\min_{w, b} \; \langle w, w \rangle + C \sum_{i=1}^{N} \xi_i^2 \quad \text{subject to} \quad y_i \left( \langle w, x_i \rangle + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1 \dots N,$$

where $C$ is a regularisation parameter. In the Lagrangian dual of the above problem, the data vectors appear only by way of their inner products with one another. This allows the problem to be solved in a possibly infinite-dimensional feature space $\mathcal{H}$ by way of the kernel trick, i.e., the replacement of all inner products $\langle x_i, x_j \rangle$ by some Mercer kernel $k(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$, where $\phi : \mathbb{R}^d \to \mathcal{H}$ (see e.g. [8]). By dualising the above problem in this manner, we obtain the final decision function

$$g(x) = \operatorname{sgn}(f(x)), \quad \text{where } f(x) = \sum_{i=1}^{N} \alpha_i^0 y_i k(x, x_i) + b,$$

and the $\alpha_i^0$ are obtained by maximising the dual objective function

$$W(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j k(x_i, x_j),$$

subject to the constraints $\sum_{i=1}^{N} \alpha_i y_i = 0$ and $\alpha_i \ge 0$.
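As an implementation aside (ours, not part of the paper): the dual above is a small quadratic program, and the squared slacks amount to adding $1/C$ to the diagonal of the Gram matrix, a fact the paper recovers in Section 4, where the matrix $B$ reduces to a diagonal with entries $1/C$. A minimal solver sketch using cvxopt:

```python
import numpy as np
from cvxopt import matrix, solvers

def svm_dual_squared_loss(K, y, C=10.0):
    """Maximise sum(a) - 0.5 a'(H + I/C)a  s.t.  y'a = 0, a >= 0,
    where H_ij = y_i y_j K_ij; posed below as a cvxopt minimisation."""
    N = len(y)
    H = np.outer(y, y) * K
    P = matrix(H + np.eye(N) / C)         # quadratic term, with the 1/C ridge
    q = matrix(-np.ones(N))               # linear term: maximise sum(a)
    G, h = matrix(-np.eye(N)), matrix(np.zeros(N))              # a >= 0
    A, b = matrix(y.reshape(1, -1).astype(float)), matrix(0.0)  # y'a = 0
    solvers.options["show_progress"] = False
    return np.ravel(solvers.qp(P, q, G, h, A, b)["x"])
```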
3 Invariant Support Vector Machines

Previously, Chapelle and Schölkopf [6] incorporated domain-specific prior knowledge into the linear SVM framework by minimising the objective function

$$(1 - \gamma) \langle w, w \rangle + \gamma \sum_i \langle w, d x_i \rangle^2$$

subject to the usual SVM constraints (see Section 2). The tangent vectors $d x_i$ are directions in which a priori knowledge tells us that the functional value at $x_i$ should not change; the sense of the optimisation can be seen from the equality

$$f(x_i + d x_i) - f(x_i) = \langle w, d x_i \rangle.$$

It turns out that the above problem is equivalent to the normal SVM after linearly transforming the input space by $x \mapsto C_\gamma^{-1} x$, where the matrix $C_\gamma$ is defined by

$$C_\gamma = \left( (1 - \gamma) I + \gamma \sum_i d x_i \, d x_i^\top \right)^{\frac{1}{2}},$$

which was shown to be equivalent to using the following kernel function in the otherwise unchanged SVM framework:

$$k_\gamma(x_i, x_j) = x_i^\top C_\gamma^{-2} x_j. \quad (1)$$

However, in order to use the kernel trick to perform this algorithm in the feature space $\mathcal{H}$ induced by $k(\cdot, \cdot)$, as in the previous Section, one must replace the term $d x_i \, d x_i^\top$ with $d\phi(x_i) \, d\phi(x_i)^\top$ in the expression for $C_\gamma$. Since $\mathcal{H}$ is typically very high-dimensional, it is impossible to do this directly. A solution to this problem using kernel PCA [7] is given in [2], which takes advantage of the fact that the set $\{\phi(x_i)\}_{i = 1 \dots N}$ spans a subset of $\mathcal{H}$ whose dimension is no greater than $N$. Unfortunately, however, computing the modified kernel function in this manner does introduce computational disadvantages, and the need for a large-scale version of the algorithm was noted by its authors in [2].
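In the linear case, equation (1) can be evaluated directly; the sketch below (our illustration with numpy) builds the matrix under the square root and computes the modified kernel. Since $C_\gamma$ is defined as a matrix square root, $C_\gamma^{-2}$ is simply the inverse of the unrooted matrix, so no square root is actually required.

```python
import numpy as np

def isvm_linear_kernel(X, tangents, gamma):
    """Evaluate k_gamma(x_i, x_j) = x_i' C_gamma^{-2} x_j of equation (1).
    `tangents` is a list of tangent vectors dx_i; assumes gamma < 1."""
    d = X.shape[1]
    S = (1 - gamma) * np.eye(d) + gamma * sum(np.outer(dx, dx) for dx in tangents)
    return X @ np.linalg.inv(S) @ X.T    # C_gamma^{-2} = S^{-1}
```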
4 Homogenised Virtual SVM

In the I-SVM method of the previous Section, the tangent vectors $d x_i$ are usually not available directly, and so a finite-difference approximation is used. In this case the term $\langle w, d x_i \rangle^2$ can be written

$$\left( \langle w, x_i + \Delta x_i \rangle - \langle w, x_i \rangle \right)^2, \quad (2)$$

where the vector $x_i + \Delta x_i$ is equivalent to a virtual vector of the V-SVM approach that has been derived from $x_i$.

In the V-SVM method, the virtual vectors are derived from the real vectors by some group of invariant transformations, that is, transformations that should not affect the likelihood of class membership. For example, in handwritten digit recognition the group of one-pixel image translations has been used to good effect [3]; in this case, for each of the original training patterns an additional 8 virtual vectors can be derived, one for each possible single-pixel translation.

We will presently consider a combination of the I-SVM and the V-SVM, which we dub the Homogenised Virtual SVM (HV-SVM). To begin, we combine the inclusion of the virtual examples, as in the V-SVM method, with the invariance term (2), allowing as well a soft margin with parameter $C$ as in Section 2. Altogether this leads to the objective function

$$(1 - \gamma) \left( \langle w, w \rangle + C \sum_i \xi_i^2 \right) + \frac{\gamma}{2} \sum_{n=1}^{P} \sum_{i,j \in S_n} \left( \langle w, x_i \rangle - \langle w, x_j \rangle \right)^2, \quad (3)$$

where each of the $P$ sets $S_n$ contains the subscripts of those vectors that are invariant transformations of one another. This effectively extracts a finite-difference approximation of an invariant direction for each pair of vectors in the same set $S_n$. The factor $\frac{1}{2}$ is included for equivalence with the I-SVM, since here each invariant direction is effectively included twice. The objective function must be minimised subject to the normal SVM constraints of Section 2. It is well known in the SVM community that these constraints lead to the following optimality conditions:

$$\alpha_i \left[ y_i \left( \langle w, x_i \rangle + b \right) - 1 + \xi_i \right] = 0,$$

which imply that those vectors with $\alpha_i > 0$ (the support vectors) satisfy $\xi_i = 1 - y_i \left( \langle w, x_i \rangle + b \right)$. This means that if $x_i$ and $x_j$ are support vectors belonging to the same set $S_n$, then $y_i = y_j \in \{1, -1\}$ and therefore $\left( \langle w, x_i \rangle - \langle w, x_j \rangle \right)^2 = \left( \xi_i - \xi_j \right)^2$. Assuming that all vectors are support vectors, the objective function (3) can therefore be rewritten as

$$(1 - \gamma) \left( \langle w, w \rangle + C \sum_i \xi_i^2 \right) + \frac{\gamma}{2} \sum_{n=1}^{P} \sum_{i,j \in S_n} \left( \xi_i - \xi_j \right)^2. \quad (4)$$

In reality, however, not all of the vectors will be support vectors, and thus we effectively have an approximation that becomes more ideal as the number of support vectors increases. At this point we should note that the V-SVM method usually retains only the support vectors from a preliminary training iteration before deriving and retraining with the virtual examples, and that by a similar process we can reasonably expect a high proportion of support vectors. Moreover, the approximation is roughly equivalent to minimising the invariance term of only those vectors that lie near the decision boundary (the support vectors), which also seems reasonable, since those are the vectors that are widely believed to contain the most important information.

We now find the Lagrangian dual of this approximation to the I-SVM. The algebra is less of a burden if we assume to begin with that all of the vectors are in the same set $S_1$. Dividing the resultant objective function by $(1 - \gamma)$ gives

$$\langle w, w \rangle + C \sum_i \xi_i^2 + \frac{\gamma}{2 (1 - \gamma)} \sum_{i,j} \left( \xi_i - \xi_j \right)^2$$

(the summations run over $i = 1 \dots N$, $j = 1 \dots N$, as shall be assumed for the remainder of this section). Note that

$$C \sum_i \xi_i^2 + \frac{\gamma}{2 (1 - \gamma)} \sum_{i,j} \left( \xi_i^2 + \xi_j^2 - 2 \xi_i \xi_j \right) = \left( C + \frac{N \gamma}{1 - \gamma} \right) \sum_i \xi_i^2 - \frac{\gamma}{1 - \gamma} \sum_{i,j} \xi_i \xi_j,$$

so that if we let $F = C + \frac{N \gamma}{1 - \gamma}$ and $G = -\frac{\gamma}{1 - \gamma}$, the objective function can be rewritten as

$$\langle w, w \rangle + F \sum_i \xi_i^2 + G \sum_{i,j} \xi_i \xi_j.$$
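The rearrangement above is easy to sanity-check numerically; the throwaway snippet below (ours) verifies the identity for random slack values.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, gamma = 7, 10.0, 0.3
xi = rng.normal(size=N)

# Left-hand side: C * sum_i xi_i^2 + gamma/(2(1-gamma)) * sum_ij (xi_i - xi_j)^2
lhs = C * np.sum(xi**2) + gamma / (2 * (1 - gamma)) * sum(
    (xi[i] - xi[j]) ** 2 for i in range(N) for j in range(N))

# Right-hand side: F * sum_i xi_i^2 + G * sum_ij xi_i xi_j
F = C + N * gamma / (1 - gamma)
G = -gamma / (1 - gamma)
rhs = F * np.sum(xi**2) + G * np.sum(np.outer(xi, xi))

assert np.isclose(lhs, rhs)   # the two forms of the slack penalty agree
```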
The Lagrangian function, with Lagrange multipliers $\alpha_i$, is then

$$L(w, b, \xi, \alpha) = \frac{1}{2} \langle w, w \rangle + \frac{F}{2} \sum_i \xi_i^2 + \frac{G}{2} \sum_{i,j} \xi_i \xi_j - \sum_i \alpha_i \left[ y_i \left( \langle w, x_i \rangle + b \right) + \xi_i - 1 \right],$$

and the stationarity conditions are:

$$\partial L / \partial w = 0 = w - \sum_i \alpha_i y_i x_i, \quad (5)$$
$$\partial L / \partial b = 0 = \sum_i \alpha_i y_i, \quad (6)$$
$$\partial L / \partial \xi_i = 0 = F \xi_i + G \sum_j \xi_j - \alpha_i, \quad (7)$$

so $w$ has the usual support vector expansion $w = \sum_i \alpha_i y_i x_i$. Substituting (5) and (6) into the Lagrangian yields

$$L = -\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle + \sum_i \alpha_i - \sum_i \alpha_i \xi_i + \frac{F}{2} \sum_i \xi_i^2 + \frac{G}{2} \sum_{i,j} \xi_i \xi_j = -\frac{1}{2} \tilde{\alpha}^\top H \tilde{\alpha} + \tilde{e}^\top \tilde{\alpha} - \tilde{\alpha}^\top \tilde{\xi} + \frac{F}{2} \tilde{\xi}^\top \tilde{\xi} + \frac{G}{2} \tilde{\xi}^\top E \tilde{\xi},$$

where $H$ is the usual Gram matrix, $[H]_{i,j} = y_i y_j \langle x_i, x_j \rangle$, $E$ is a matrix of ones, and $I$ the identity matrix. Rewriting (7) in matrix form and left-multiplying by $\tilde{\xi}^\top$ implies that $F \tilde{\xi}^\top \tilde{\xi} + G \tilde{\xi}^\top E \tilde{\xi} - \tilde{\alpha}^\top \tilde{\xi} = 0$. The Lagrangian can therefore be written

$$L(w, b, \xi, \alpha) = -\frac{1}{2} \tilde{\alpha}^\top H \tilde{\alpha} + \tilde{e}^\top \tilde{\alpha} - \frac{1}{2} \tilde{\alpha}^\top \tilde{\xi}.$$

Finally, we substitute the expression for $\tilde{\xi}$ obtained from (7), $\tilde{\xi} = (F I + G E)^{-1} \tilde{\alpha}$, which leads to the final form of our dual objective function:

$$W(\alpha) = -\frac{1}{2} \tilde{\alpha}^\top \left( H + (F I + G E)^{-1} \right) \tilde{\alpha} + \tilde{e}^\top \tilde{\alpha}.$$

At this point one can show that the matrix $(F I + G E)^{-1}$ has entries equal to

$$\frac{F + (N - 1) G}{F (F + G N)} = \frac{C (1 - \gamma) + \gamma}{C \left( C (1 - \gamma) + N \gamma \right)}$$

on the diagonal, and

$$\frac{-G}{F (F + G N)} = \frac{\gamma}{C \left( C (1 - \gamma) + N \gamma \right)}$$

elsewhere. Using these expressions we can now compactly write down the final form of the dual problem for (4). In this case one can readily verify that the sets $S_n$ are treated similarly to the previous case, in which it was assumed that there is one single all-encompassing set $S_1 = \{1 \dots N\}$, resulting in a dual problem that consists of minimising the dual objective function

$$\frac{1}{2} \tilde{\alpha}^\top \left( H + B \right) \tilde{\alpha} - \tilde{e}^\top \tilde{\alpha}$$

subject to the normal SVM constraints (see Section 2). Referring to the $S_n$ as groups, the matrix $B$ is defined on the diagonal by

$$[B]_{i,i} = \begin{cases} \dfrac{C (1 - \gamma) + \gamma}{C \left( C (1 - \gamma) + N_i \gamma \right)} & \text{if } x_i \text{ is in a group} \\[2ex] \dfrac{1}{C} & \text{otherwise,} \end{cases} \quad (8)$$

where $N_i$ is the total number of vectors in $x_i$'s group. Off the diagonal, $B$ is defined by

$$[B]_{i,j} = \begin{cases} \dfrac{\gamma}{C \left( C (1 - \gamma) + N_i \gamma \right)} & \text{if } x_i \text{ and } x_j \text{ are in the same group} \\[2ex] 0 & \text{otherwise.} \end{cases} \quad (9)$$

Note that if $\gamma = 0$, or if there are no non-empty groups, then $B$ simplifies to a diagonal matrix with entries $\frac{1}{C}$, recovering the soft-margin SVM of Section 2. The similarity with the normal SVM allows the problem to be solved by trivial modification to existing SVM optimisation packages such as LIBSVM [1]: one need simply replace the typical diagonal bias terms with the $B$ term above.
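Implementing the HV-SVM therefore reduces to building $B$ from the group structure and handing $H + B$ to a standard dual solver, for example the QP sketch after Section 2 with $H + I/C$ replaced by $H + B$. A minimal construction of $B$ per equations (8) and (9) follows (again our sketch, not the authors' code):

```python
import numpy as np

def hvsvm_B(N_total, groups, C, gamma):
    """Build the HV-SVM bias matrix B of equations (8) and (9).
    `groups` is a list of index arrays (the sets S_n); vectors outside
    every group keep the plain soft-margin diagonal entry 1/C."""
    B = np.eye(N_total) / C
    for S in groups:
        Ni = len(S)
        denom = C * (C * (1 - gamma) + Ni * gamma)
        diag = (C * (1 - gamma) + gamma) / denom
        off = gamma / denom
        for i in S:
            for j in S:
                B[i, j] = diag if i == j else off
    return B
```

At $\gamma = 0$ the group entries collapse to $1/C$ on the diagonal and zero elsewhere, recovering the V-SVM as noted above.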
Hybrid HV/I-SVM: Note that it is also possible to derive a hybrid method that is closer to the I-SVM, in the sense that it ignores only the invariance terms of those entire sets $S_n$ that contain no support vectors. If $V_n$ are the support vectors, and $\bar{V}_n$ the non-support vectors, in group $S_n$, the invariance term of the HV-SVM method for that group is

$$\frac{1}{2} \sum_{i,j \in V_n} \left( \langle w, x_i \rangle - \langle w, x_j \rangle \right)^2 + |\bar{V}_n| \sum_{i \in V_n} \xi_i^2,$$

which, assuming that we are aiming for the same result as the I-SVM, is not what is required unless $\bar{V}_n$ is empty. However, if we had known $\bar{V}_n$ a priori and set $[B]_{i,j} = 0$ whenever $i \in \bar{V}_n$ or $j \in \bar{V}_n$, then the second term in the above invariance term would equal zero. Thus, had we also modified the kernel function as per the I-SVM method, with invariance directions $\{ x_i - x_j : i \in \bar{V}_n \text{ or } j \in \bar{V}_n \}$, then we would have established once again the correct I-SVM invariance term for $S_n$. This would need to have been done for all the $S_n$, but as we do not know the $\bar{V}_n$ a priori, it would be necessary to use, for example, the approach of Algorithm 1, below, for determining which invariance directions need to be accounted for by the I-SVM kernel function, rather than by the more efficient HV-SVM bias terms $[B]_{i,j}$. In the algorithm, the invariance terms associated with non-support vectors are ignored. Note that it should usually be possible to perform the retraining with fewer iterations than the initial pass, by using the previous solution as the starting point for the optimisation.

Algorithm 1: HV-SVM/I-SVM Hybrid
1: $W \leftarrow \emptyset$
2: $[B]_{i,j} \leftarrow 0, \; \forall i, j$
3: for all $\{i, j : i \notin W \wedge j \notin W\}$ do
4:   assign $[B]_{i,j}$ according to equations (8) and (9)
5: end for
6: Define $k_\gamma$ by equation (1), using the tangent vectors associated with $W$
7: Optimise using $k_\gamma$ and $B$, and let $V$ be the resultant support vector set
8: $R \leftarrow \bigcup_{n : S_n \cap V \neq \emptyset} S_n$
9: if $R \setminus V \subseteq W$ then
10:   finish
11: else
12:   $W \leftarrow W \cup (R \setminus V)$
13:   go to line 2
14: end if

Figure 1. An illustration of the effect of the parameter $\gamma$ in separating pluses from circles in 2D. An invariance to rotations is encoded by choosing virtual vectors that are rotations of the training data, each invariant group $S_n$ being indicated by a dotted line. The grey level indicates the decision function value, and the white line the decision boundary. Left: Virtual SVM; Middle: HV-SVM with medium $\gamma$; Right: HV-SVM with $\gamma$ close to one.

Figure 2. Cross-validation performance for the circle toy problem, averaged over 100 trials. The grey scale indicates mean test error percentage. The $C$ scale is logarithmic, and the $\gamma$ scale is logarithmic with the exception of the first value, which is zero. Note that $\gamma = 0$ is equivalent to the Virtual SVM method, and that the invariances are enforced more as $\gamma \to 1$. Very little performance change is found by extending the plot to greater $C$ values.

5 Experiments

Our tests on real-world problems are the subject of ongoing work. In the present section we instead consider the following toy problem: the data are uniformly distributed over the interval $[-1, 1]^2$, and the true decision boundary is a circle centred at the origin, namely $f(x) = \operatorname{sgn}(\| x \| - 1/2)$. The invariance we wish to encode is local invariance under rotations, and so we derive from each training vector a single virtual vector by applying a rotation of 0.5 radians about the origin. A training set of 15 points and a test set of 2000 points were generated independently in 100 individual trials. For the kernel function we chose the second-order polynomial kernel $k(\tilde{x}, \tilde{y}) = \langle \tilde{x}, \tilde{y} \rangle^2$. The results of the experiment are shown in Figure 2, which indicate that, for any given value of the margin softness parameter $C$, the more the invariance is enforced ($\gamma \to 1$), the better the test set performance becomes. Moreover, it can be seen from the example in Figure 1 that, as expected, the invariance term leads to decision boundaries of a more circular nature.
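One trial of this setup takes only a few lines; the sketch below (ours) generates the training data, the rotated virtual vectors, the groups $S_n$ consumed by the $B$ construction above, and the quadratic kernel matrix.

```python
import numpy as np

def circle_trial(n_train=15, theta=0.5, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_train, 2))
    y = np.sign(np.linalg.norm(X, axis=1) - 0.5)    # f(x) = sgn(||x|| - 1/2)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    X_aug = np.vstack([X, X @ R.T])                 # one virtual vector each
    y_aug = np.concatenate([y, y])                  # rotation preserves ||x||
    groups = [np.array([i, i + n_train]) for i in range(n_train)]  # sets S_n
    K = (X_aug @ X_aug.T) ** 2                      # k(x, y) = <x, y>^2
    return X_aug, y_aug, groups, K
```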
6 Conclusion

We have presented a generalisation of the Virtual SVM in which the support vectors that are invariant transformations of one another are constrained to have similar decision function values. We have demonstrated that the method is superior to the state-of-the-art Virtual SVM on a toy problem involving invariances under rotations (see Figure 2). Moreover, the algorithm is easy to implement, as the final optimisation problem is very similar to that of the standard SVM, and it incurs no additional computational penalty in comparison with the Virtual SVM. We plan to perform more real-world tests; since the Virtual SVM represents the state of the art in digit recognition, it seems interesting to apply the method to this problem.

References

[1] C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[2] O. Chapelle and B. Schölkopf. Incorporating invariances in nonlinear support vector machines.
[3] D. DeCoste and B. Schölkopf. Training invariant support vector machines. Machine Learning, 46:161-190, 2002.
[4] Y. LeCun, L. Jackel, L. Bottou, A. Brunot, C. Cortes, J. Denker, H. Drucker, I. Guyon, U. Müller, E. Säckinger, P. Simard, and V. Vapnik. Comparison of learning algorithms for handwritten digit recognition, 1995.
[5] B. Schölkopf, C. Burges, and V. Vapnik. Incorporating invariances in support vector learning machines. In C. von der Malsburg, W. von Seelen, J. C. Vorbrüggen, and B. Sendhoff, editors, Artificial Neural Networks - ICANN'96, volume 1112 of Springer Lecture Notes in Computer Science, pages 47-52, Berlin, 1996.
[6] B. Schölkopf, P. Simard, A. Smola, and V. Vapnik. Prior knowledge in support vector kernels. In Proceedings of the 1997 conference on Advances in Neural Information Processing Systems 10. MIT Press, 1998.
[7] B. Schölkopf, A. Smola, and K.-R. Müller. Nonlinear component analysis as a kernel eigenvalue problem, 1998.
[8] B. Schölkopf and A. J. Smola. Learning with Kernels. MIT Press, 2002.
[9] V. N. Vapnik. The Nature of Statistical Learning Theory. Springer Verlag, 1995.