Training Support Vector Machines with Particle Swarms
U. Paquet, Department of Computer Science, University of Pretoria, South Africa. Email: upaquet@cs.up.ac.za
A.P. Engelbrecht, Department of Computer Science, University of Pretoria, South Africa. Email: engel@driesie.cs.up.ac.za

Abstract: Training a Support Vector Machine requires solving a constrained quadratic programming problem. Linear Particle Swarm Optimization is intuitive and simple to implement, and is presented as an alternative to current numeric SVM training methods. Performance of the new algorithm is demonstrated on the MNIST character recognition dataset.

I. INTRODUCTION

Support Vector Machines (SVMs) are a young and important addition to the machine learning toolbox. Having been formally introduced by Boser et al. [1], SVMs have proved their worth: in the last decade there has been a remarkable growth in both the theory and practice of these learning machines.

Training a SVM requires solving a linearly constrained quadratic optimization problem. This problem often involves a matrix with an extremely large number of entries, which makes off-the-shelf optimization packages unsuitable. Several methods have been used to decompose the problem, of which many require numeric packages for solving the smaller subproblems.

Particle Swarm Optimization (PSO) is an intuitive and easy-to-implement algorithm from the swarm intelligence community, and is introduced as a new way of training a SVM. Using PSO replaces the need for numeric solvers. A Linear PSO (LPSO) is adapted and shown to be ideal for optimizing the SVM problem.

This paper gives an overview of the SVM algorithm, and explains the main methodologies for training SVMs. PSO is discussed as an alternative method for solving a SVM's quadratic programming problem. Experimental results on character recognition illustrate the convergence properties of the algorithms.

II. SUPPORT VECTOR MACHINES

Traditionally, a SVM is a learning machine for two-class classification problems, and learns from a set of l N-dimensional example vectors x_i and their associated classes y_i, i.e.

    {x_1, y_1}, ..., {x_l, y_l} ∈ R^N × {±1}    (1)

The algorithm aims to learn a separation between the two classes by creating a linear decision surface between them. This surface is, however, not created in input space, but rather in a very high-dimensional feature space. The resulting model is nonlinear, and is accomplished by the use of kernel functions. The kernel function k gives a measure of similarity between a pattern x, and a pattern x_i from the training set. The decision boundary that needs to be constructed is of the form

    f(x) = Σ_{i=1}^{l} y_i α_i k(x, x_i) + b    (2)

where the class of x is determined from the sign of f(x). The α_i are Lagrange multipliers from a primal quadratic programming (QP) problem, and there is an α_i for each vector in the training set. The value b is a threshold. Support vectors define the decision surface, and correspond to the subset of nonzero α_i. These vectors can be seen as the most informative training vectors.

Training the SVM consists of finding the values of α. By defining a Hessian matrix Q such that (Q)_ij = y_i y_j k(x_i, x_j), training can be expressed as a dual QP problem of solving

    max_α W(α) = α^T 1 − (1/2) α^T Q α    (3)

subject to one equality constraint

    α^T y = 0    (4)

and a set of box constraints

    α ≥ 0,  C1 − α ≥ 0    (5)

Training a SVM thus involves solving a linearly constrained quadratic optimization problem.
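Equations (2) and (3) translate directly into code. The following is a minimal sketch (not the authors' implementation; the callable kernel k and the array names are assumptions) that evaluates the decision function and the dual objective:

    import numpy as np

    def decision(x, X, y, alpha, b, k):
        # f(x) = sum_i y_i * alpha_i * k(x, x_i) + b, equation (2);
        # the class of x is the sign of f(x).
        return sum(y[i] * alpha[i] * k(x, X[i]) for i in range(len(y))) + b

    def dual_objective(alpha, Q):
        # W(alpha) = alpha^T 1 - 0.5 * alpha^T Q alpha, equation (3),
        # where (Q)_ij = y_i y_j k(x_i, x_j).
        return alpha.sum() - 0.5 * alpha @ Q @ alpha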
III. SVM TRAINING METHODS

The QP problem is equivalent to finding the maximum of a constrained bowl-shaped objective function. Due to the definition of the kernel function, the matrix Q always gives a convex QP problem, where every local solution is also a global solution [2]. Certain optimality conditions, the Karush-Kuhn-Tucker (KKT) conditions [2], determine whether the constrained maximum has been found.

Solving the QP problem for real-world problems can prove to be very difficult: the matrix Q has a dimension equal to the number of training examples. A training set of 60,000 vectors gives rise to a matrix Q with 3.6 billion elements, which does not fit into the memory of a standard computer. For large learning tasks, off-the-shelf optimization packages and techniques for general quadratic programming quickly become intractable in their memory and time requirements.
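One standard way around the memory problem (a sketch under the assumption of a callable kernel k, not a quote from any package) is to recompute entries of Q on demand instead of storing all l² of them:

    import numpy as np

    def q_row(i, X, y, k):
        # Row i of the Hessian, (Q)_ij = y_i y_j k(x_i, x_j), computed on demand.
        # For l = 60,000 the full Q has 3.6 billion entries; a single row has l.
        return np.array([y[i] * y[j] * k(X[i], X[j]) for j in range(len(y))])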
A number of other approaches, which allow for fast convergence and small memory requirements even on large problems, have been invented:

Chunking

The chunking algorithm is based on the fact that the non-support vectors play no role in the SVM decision boundary: if they are removed from the training set of examples, the SVM solution will be exactly the same. Chunking was suggested by V. Vapnik in [14], and breaks the large QP problem down into a number of smaller problems. A QP routine is used to optimize the Lagrangian on an arbitrary subset of data. After this optimization, the set of nonzero α_i (the current support vectors) is retained, and all other data points (α_i = 0) are discarded. At every subsequent step, chunking solves the QP problem that consists of all nonzero α_i, plus some of the α_i that violate the KKT conditions. After optimizing the subproblem, data points with α_i = 0 are again discarded. This procedure is iterated until the KKT conditions are met, and the margin is maximized. The size of the subproblem varies, but tends to grow with time. At the last step, chunking has identified and optimized all the nonzero α_i, which correspond to the set of all the support vectors, and the overall QP problem is thus solved. Although this technique of reducing the Q matrix's dimension from the number of training examples to approximately the number of support vectors makes it suitable for large problems, even the reduced matrix may not fit into memory.

Decomposition

Decomposition methods are similar to chunking, and were introduced by E. Osuna in [8], [9]. The large QP problem is broken down into a series of smaller subproblems, and a numeric QP optimizer solves each of these problems. It was suggested that one vector be added and one removed from the subproblem at each iteration, and that the size of the subproblems should be kept fixed. The motivation behind this method is the observation that, as long as at least one α_i violating the KKT conditions is added to the previous subproblem, each step improves the objective function and maintains all of the constraints. In this fashion the sequence of QP subproblems will asymptotically converge. For faster practical convergence, researchers add and delete multiple examples. While the strategy used in chunking takes advantage of the fact that the expected number of support vectors is small (< 3,000), decomposition allows for training arbitrarily large data sets.

Another decomposition method was introduced by T. Joachims [3]. Joachims' method is based on the gradient of the objective function. The idea is to pick α_i for the QP subproblem such that the α_i form the steepest possible direction of ascent on the objective function, where the number of nonzero elements in the direction is equal to the size of the QP subproblem. As in Osuna's method, the size of the subproblem remains fixed. Solving each subproblem requires a numeric quadratic optimizer.

Sequential Minimal Optimization

The most extreme case of decomposition is Sequential Minimal Optimization (SMO), where the smallest possible optimization problem is solved at each step [11]. Because the α_i must obey the linear equality constraint, two α_i are chosen to be jointly optimized. No numerical QP optimization is necessary, and after an analytic solution, the SVM is updated to reflect the new optimal values.

With the exception of SMO, a numeric QP library is needed for training a SVM. An intuitive alternative approach is to use PSO to optimize each decomposed subproblem. The PSO algorithm is easy to implement, and certain properties of the LPSO make it ideal for the type of problem posed by SVM training.
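The chunking and decomposition schemes above share a common outer loop. A minimal sketch, where the KKT test, working-set policy and subproblem solver are hypothetical placeholders supplied by the caller:

    import numpy as np

    def decomposition_loop(alpha, kkt_satisfied, select_working_set, solve_subproblem):
        # Shared skeleton: repeatedly pick a subset of variables, optimize it
        # with all other alpha held fixed, until the KKT conditions hold.
        while not kkt_satisfied(alpha):
            B = select_working_set(alpha)            # indices of the subproblem
            alpha[B] = solve_subproblem(alpha, B)    # chunking / Osuna / SMO / LPSO step
        return alpha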
IV. PARTICLE SWARM OPTIMIZATION

PSO [4] is similar in spirit to birds migrating in a flock toward some destination, where the intelligence and efficiency lie in the cooperation of an entire flock. PSO differs from traditional optimization methods in that a population of potential solutions is used in the search. Direct fitness information, instead of function derivatives or related knowledge, is used to guide the search. Particles collaborate as a population, or swarm, to reach a collective goal, for example maximizing an n-dimensional objective function f.

Each particle has memory of the best solution that it has found, called its personal best. A particle's traversal of the search space is influenced by its personal best and the best solution found by a neighborhood of particles. There is thus a sharing of information: particles profit from the discoveries and previous experience of other particles during the exploration and search for higher objective function values. A great number of schemes exist in which this information sharing can take place. One of two sociometric principles is usually implemented. The first, called gbest (global best) PSO, conceptually connects all the particles in the population to one another, so that the very best performance of the entire population, the global best, influences each particle. The second, called lbest (local best), creates a neighborhood for each individual comprising itself and some fixed number of its nearest neighbors. Since SVM training requires solving a convex problem, the gbest version is implemented in this paper.

Let i indicate a particle's index in the swarm. In a gbest PSO each of the s particles p_i flies through the n-dimensional search space R^n with a velocity v_i, which is dynamically adjusted according to its own previous best solution z_i and the previous best solution ẑ of the entire swarm. In the original PSO [4], each particle's velocity adjustments are calculated separately for each component in its position vector.
By calculating velocity adjustments as linear combinations of position vectors, equality constraints on the objective function can easily be met.

Equality Constraints and the Linear PSO

The Linear PSO (LPSO) was introduced in [10] to constrain the movement of a swarm to a linear hyperplane in R^n. LPSO differs from the original PSO, since velocity updates are calculated as a linear combination of position and velocity vectors. The particles of a LPSO interact and move according to the following equations:

    v_i^(t+1) = w v_i^(t) + c_1 r_1^(t) [z_i^(t) − p_i^(t)] + c_2 r_2^(t) [ẑ^(t) − p_i^(t)]    (6)
    p_i^(t+1) = p_i^(t) + v_i^(t+1)    (7)

where r_1^(t), r_2^(t) ~ UNIF(0, 1) are random numbers between zero and one. These numbers are scaled by acceleration coefficients c_1 and c_2, where 0 ≤ c_1, c_2 ≤ 2, and w is an inertia weight [12]. It is possible to clamp the velocity vectors by specifying upper and lower bounds on v_i, to avoid too rapid movement of particles in the search space.

When the objective function f needs to be maximized subject to constraints Ap = b, the swarm should be constrained to fly through the hyperplane P. With A being an m × n matrix and b an m-dimensional vector, P = {p | Ap = b} defines the set of feasible solutions to the constrained problem, and each point in P will be a feasible point. It was shown in [10] that if the initial swarm lies in P, LPSO will force the particles to fly only in feasible directions, and the swarm will optimize within the search space P.

Premature convergence is overcome by using a version of van den Bergh's Guaranteed Convergence Particle Swarm Optimizer [13]. In this algorithm, the velocity update for the global best particle is changed to force it to search for a better solution in an area around the position of that particle. Let τ be the index of the global best particle, such that z_τ = ẑ. The velocity update equation for particle τ is changed to

    v_τ^(t+1) = −p_τ^(t) + ẑ^(t) + w v_τ^(t) + ρ^(t) υ^(t)    (8)

where ρ^(t) is some scaling factor, and υ^(t) ~ UNIF(−1, 1)^n is a random n-dimensional vector with the property that Aυ^(t) = 0, i.e. υ^(t) lies in the null space of A.

The LPSO algorithm [10] is summarized below:
1) Set the iteration number t to zero. Initialize the swarm S of s particles such that the position p_i^(0) of each particle meets Ap_i^(0) = b.
2) Evaluate the performance f(p_i^(t)) of each particle.
3) Compare the personal best of each particle to its current performance, and set z_i^(t) to the better performance, i.e.

    z_i^(t) = z_i^(t−1)  if f(p_i^(t)) ≤ f(z_i^(t−1))
    z_i^(t) = p_i^(t)    if f(p_i^(t)) > f(z_i^(t−1))    (9)

4) Set the global best ẑ^(t) to the position of the particle with the best performance within the swarm, i.e.

    ẑ^(t) ∈ {z_1^(t), z_2^(t), ..., z_s^(t)} with f(ẑ^(t)) = max{f(z_1^(t)), f(z_2^(t)), ..., f(z_s^(t))}    (10)

5) Change the velocity vector of each particle according to equation (6).
6) Move each particle to its new position, according to equation (7).
7) Let t := t + 1.
8) Go to step 2, and repeat until convergence.

The LPSO algorithm is sufficient to optimize the SVM objective function subject to the linear equality constraint (4). The box constraints (5) are easily handled by initializing all particles p_i to lie inside the hypercube defined by the constraints, and restricting their movement to this hypercube. When a particle moves outside the boundary of the hypercube, its velocity vector is scaled by some factor such that all components of its position lie either inside the hypercube, or on its boundary. The practical side of using LPSO, as well as the training algorithm, is discussed in the following section.
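A sketch of one gbest LPSO iteration, following equations (6), (7), (9) and (10). This is a sketch, not the authors' Java code; note that r_1 and r_2 are drawn once per particle, so each velocity stays a linear combination of position and velocity vectors, and a swarm started on Ap = b remains on it:

    import numpy as np

    def lpso_step(P, V, Z, zhat, f, w=0.7, c1=1.4, c2=1.4, rng=None):
        # P, V, Z: (s, n) arrays of positions, velocities and personal bests;
        # zhat: (n,) global best; f is the objective being maximized.
        if rng is None:
            rng = np.random.default_rng()
        s = P.shape[0]
        r1 = rng.random((s, 1))                                # r1, r2 ~ UNIF(0, 1)
        r2 = rng.random((s, 1))
        V = w * V + c1 * r1 * (Z - P) + c2 * r2 * (zhat - P)   # equation (6)
        P = P + V                                              # equation (7)
        improved = np.array([f(p) for p in P]) > np.array([f(z) for z in Z])
        Z[improved] = P[improved]                              # equation (9)
        zhat = Z[np.argmax([f(z) for z in Z])]                 # equation (10)
        return P, V, Z, zhat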
V. TRAINING THE SVM

Using LPSO to solve the SVM QP problem requires criteria for optimality, a way to decompose the QP, and a way to extend LPSO to optimize the SVM subproblem.

Since Q is a positive semi-definite matrix (the kernel function used is positive semi-definite), and the constraints are linear, the Karush-Kuhn-Tucker (KKT) conditions are necessary and sufficient for optimality [2]. A solution α of the QP problem, as stated in equations (3)-(5), is an optimal solution if the following relations hold for each α_i:

    α_i = 0      implies  y_i f(x_i) ≥ 1
    0 < α_i < C  implies  y_i f(x_i) = 1    (11)
    α_i = C      implies  y_i f(x_i) ≤ 1

where i is the index of an example vector from the training set.

Decomposing the QP problem involves choosing a subset, or working set, of variables for optimization. The working set, called set B, is created by picking q sub-optimal variables from all l α_i. The working set of variables is optimized while keeping the remaining variables (called set N) constant. The general decomposition algorithm works as follows:
1) While the KKT conditions for optimality are violated:
   a) Select q variables for the working set B. The remaining l − q variables (set N) are fixed at their current values.
   b) Use LPSO to optimize W(α) on B.
   c) Return the optimized α_i from B to the original set of variables.
2) Terminate and return α.

A concern in the above algorithm is to select the optimal working set. The decomposition method presented here is due to [3], and works on the method of feasible directions. The idea is to find the steepest feasible direction d of ascent on the objective function W as defined in equation (3), under the requirement that only q components be nonzero. The α_i corresponding to these q components will be included in the working set. Finding an approximation to d is equivalent to solving:

    maximize  ∇W(α)^T d
    subject to  y^T d = 0
                d_i ≥ 0 if α_i = 0
                d_i ≤ 0 if α_i = C
                d_i ∈ {−1, 0, 1}
                |{d_i : d_i ≠ 0}| = q

For y^T d to be equal to zero, the number of elements with sign matches between d and y must be equal to the number of elements with sign mismatches between d and y. Also, d should be chosen to maximize the direction of ascent ∇W(α)^T d. This is achieved by first sorting the training vectors in increasing order according to y_i ∇W(α)_i. Assuming q to be even, a forward pass selects q/2 examples from the front of the sorted list, and a backward pass selects q/2 examples from the back. A full explanation of this method is given by P. Laskov in [5].
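A sketch of this working-set selection (the bound-related restrictions on d_i from the subproblem above are omitted for brevity; a full implementation following [3], [5] would enforce them):

    import numpy as np

    def select_working_set(grad_W, y, q):
        # Sort by y_i * (grad W(alpha))_i; the q/2 smallest and q/2 largest
        # entries form a steepest feasible direction with q nonzero components.
        order = np.argsort(y * grad_W)
        return np.concatenate([order[:q // 2],      # forward pass: front of the list
                               order[-(q // 2):]])  # backward pass: back of the list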
It is necessary to rewrite the objective function (3) as a function that depends only on the working set. Let α be split into two sets α_B and α_N. If α, y and Q are appropriately rearranged, we have

    α = [α_B; α_N],  y = [y_B; y_N],  Q = [Q_BB  Q_BN; Q_NB  Q_NN]

Since only α_B is going to be optimized, W is rewritten in terms of α_B. If terms that do not contain α_B are dropped, the optimization problem remains essentially the same. Also, since Q is symmetric, with Q_BN = Q_NB^T, the problem is to find

    max_{α_B} W(α_B) = α_B^T 1 − (1/2) α_B^T Q_BB α_B − α_B^T Q_BN α_N    (12)

subject to

    α_B^T y_B + α_N^T y_N = 0,  α_B ≥ 0,  C1 − α_B ≥ 0    (13)

Implementing Particle Swarm Optimization

When the decomposition algorithm starts, a feasible solution that satisfies the linear constraint α^T y = 0, with constraints 0 ≤ α_i ≤ C also met, is needed. The initial solution is constructed in the following way: let c be some real number between 0 and C, and γ some positive integer less than both the number of positive examples (y_i = +1) and negative examples (y_i = −1) in the training set. Randomly pick a total of γ positive examples and γ negative examples, and initialize their corresponding α_i to c. By setting all other α_i to zero, the initial solution will be feasible. The value 2γ gives the total number of initial support vectors, and since these initial support vectors are a randomly chosen guess, it is suggested that the value of γ be kept small.

TABLE I
INFLUENCE OF DIFFERENT WORKING SET SIZES ON THE FIRST 20,000 ELEMENTS OF THE MNIST DATASET

    Working   Working set   Time       SVs
    set size  selections
    4          8,782        02:17:43   1,631
    …          …,213        03:11:40   1,…
    …          …,502        03:51:24   1,…
    …          …,023        06:27:06   1,…
    …          …,667        07:26:23   1,652

In optimizing the q-dimensional subproblem, LPSO requires that all particles be initialized such that α_B^T y_B + α_N^T y_N = 0 is met. This is done as follows:
1) Set each particle in the swarm to the q-dimensional vector α_B.
2) Add a random q-dimensional vector δ satisfying y_B^T δ = 0 to each particle, under the condition that the particle will still lie in the hypercube [0, C]^q.

Initializing the swarm in this way ensures that the initial swarm lies in the set of feasible solutions P = {p | y_B^T p = −α_N^T y_N}, allowing the flight of the swarm to be defined by feasible directions. For faster convergence, the vector υ^(t) used to adjust the global best particle can be chosen as an approximation to the partial derivative ∇W(α_B), subject to y_B^T υ^(t) = 0.
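A sketch of both initialization steps. The constants follow the text; the projection used to build δ is one simple way, under the assumption y_B ∈ {±1}^q, to satisfy y_B^T δ = 0, and δ is shrunk rather than clipped so the equality constraint is preserved:

    import numpy as np

    def initial_alpha(y, gamma, c, rng):
        # gamma positive and gamma negative examples get alpha_i = c, the rest 0,
        # so alpha^T y = gamma*c - gamma*c = 0 and 0 <= alpha_i <= C both hold.
        alpha = np.zeros(len(y))
        alpha[rng.choice(np.flatnonzero(y == +1), gamma, replace=False)] = c
        alpha[rng.choice(np.flatnonzero(y == -1), gamma, replace=False)] = c
        return alpha

    def init_particle(alpha_B, y_B, C, rng, scale=0.1):
        # Perturb alpha_B by a delta with y_B^T delta = 0 (project out y_B),
        # then halve delta until the particle lies inside the hypercube [0, C]^q.
        delta = rng.uniform(-scale, scale, len(alpha_B))
        delta -= y_B * (delta @ y_B) / (y_B @ y_B)
        p = alpha_B + delta
        while np.any(p < 0) or np.any(p > C):
            delta *= 0.5
            p = alpha_B + delta
        return p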
VI. EXPERIMENTAL RESULTS

The SVM training algorithm presented in this paper was tested on the MNIST dataset [7]. The MNIST dataset is a database of optical characters, and consists of a training set of 60,000 handwritten digits. Each digit is a 28 by 28 pixel gray-level image, equivalent to a 784-dimensional input vector. Each pixel corresponds to an integer value in the range of 0 (white) to 255 (black). For training a SVM on the MNIST dataset, the character '8' was used to represent the positive examples, while the remaining digits defined the negative examples.

Training was done with a polynomial kernel of degree five:

    k(x_i, x_j) = (x_i · x_j + 1)^5    (14)

Due to the size of the dot product between two images, raised to the fifth power, the pixel values were scaled to the range [0, 0.1]. This gives Lagrange multipliers α that are easier for the LPSO to handle. (The kernel function of two unscaled black images would be (…)^5, while the kernel function of the scaled versions gives a more practical (…)^5 ≈ 835.)
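The kernel (14) and the pixel scaling in code (a small sketch; scale maps the integer gray levels 0-255 into [0, 0.1] as described):

    import numpy as np

    def scale(image):
        # Map pixel values 0 (white) .. 255 (black) into the range [0, 0.1].
        return image.astype(float) * (0.1 / 255.0)

    def kernel(xi, xj):
        # Polynomial kernel of degree five, equation (14).
        return (xi @ xj + 1.0) ** 5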
For an optimal solution to be found in the following PSO experiments, the KKT conditions needed to be satisfied within an error threshold of 0.02 from the right-hand side of equations (11). Optimization of the working set terminated when the KKT conditions on the working set were met within an error of 0.001, or when the swarm had optimized for a hundred iterations.

The following parameters defined the experimental PSO: by letting γ = 10, a total of 20 initial support vectors were chosen to start the algorithm. The swarm size s used in each experiment was 10, while the inertia weight w was set to 0.7. The acceleration coefficients c_1 and c_2 were both set to 1.4 [13]. Since the objective function is constrained by a set of box constraints, the velocity vectors were not clamped. For each experiment the upper bound C was kept at … .

The PSO training algorithm was written in Java, and does not make use of caching and shrinking methods to optimize its speed. The sparsity of the input data is used to speed up the evaluation of kernel functions. All experiments were performed on a 1.00 GHz AMD Duron processor.

Experimental results show successful and accurate training on the MNIST database. The influence of different working set sizes on the LPSO training algorithm, its scalability, as well as its relation to other SVM training algorithms, were examined.

Influence of working set sizes

Experiments on different working set sizes were done on the first 20,000 elements of the MNIST database. Results are shown in Table I, and indicate that a working set of size q = 4 gives the fastest convergence time and fewest support vectors. A working set of size 2 can be solved analytically, as is the case in SMO. The results in Table I are not necessarily an indication of the speed of the PSO on the working set, as selection of the working set also burdens the speed of the algorithm (the q/2 greatest and least values of y_i ∇W(α)_i need to be selected from a list of thousands).

Scalability of the PSO approach

Scalability of the PSO algorithm was tested by training on the first 10,000, 20,000, etc. examples from the MNIST dataset, as shown in Table II. In each case a working set of size 4 was used. The experimental results indicate that the PSO training algorithm shows quadratic scalability, and scales as l^2.1.

TABLE II
SCALABILITY: TRAINING ON THE MNIST DATASET

    MNIST     PSO working     PSO        PSO     SMO        SMO     SVMlight   SVMlight
    elements  set selections  time       SVs     time       SVs     time       SVs
    10,000     3,898          00:29:49   1,022   00:01:29   1,032   00:02:02   1,034
    20,000     8,782          02:17:43   1,631   00:06:14   1,647   00:10:43   1,641
    30,000    12,428          04:50:11   1,988   00:13:22   2,012   00:23:04   2,001
    40,000    15,725          08:14:26   2,353   00:22:46   2,355   00:41:09   2,367
    50,000    22,727          15:05:09   2,728   01:46:38   2,740   01:31:48   2,726
    60,000    25,914          20:54:15   3,025   04:38:11   3,043   08:01:05   3,026

Comparison to other algorithms

In Table II, the PSO approach is compared to SMO and a decomposition method, SVMlight [3]. WinSVM was developed by C. Longbin [6] from the SVMlight source code, and was used as an implementation of SMO. Unlike these methods, the current PSO algorithm does not make use of caching and shrinking to optimize its speed. Results similar to Table I indicate that SVMlight gives the fastest rate of convergence with a working set size q = 8, which is used in Table II's comparison. Experimental results show SMO scaling as l^2.8, and SVMlight scaling as l^3.0. Both these algorithms are substantially faster than training a SVM with PSO on the MNIST dataset, but the PSO approach shows better scaling abilities (l^2.1). Because the PSO training algorithm starts with a very small set of possible support vectors, with all other α_i set to zero, the PSO method consistently finds slightly fewer support vectors than the other approaches. The main drawback of the current PSO approach is its slow performance times, but following this initial study many optimizations can be implemented on both the decomposition and PSO methods.
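The paper does not state how the scaling exponents (l^2.1 for PSO, l^2.8 for SMO, l^3.0 for SVMlight) were estimated; one simple estimate consistent with the reported l^2.1, sketched below as an assumption, uses the ratio of the smallest and largest runs from Table II's PSO column:

    import numpy as np

    def scaling_exponent(sizes, seconds):
        # k in time ~ l^k, estimated from the smallest and largest runs:
        # k = log(t_max / t_min) / log(l_max / l_min).
        return np.log(seconds[-1] / seconds[0]) / np.log(sizes[-1] / sizes[0])

    sizes = [10_000, 20_000, 30_000, 40_000, 50_000, 60_000]
    pso_seconds = [1789, 8263, 17411, 29666, 54309, 75255]  # 00:29:49 .. 20:54:15
    print(round(scaling_exponent(sizes, pso_seconds), 2))   # 2.09, i.e. roughly l^2.1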
VII. CONCLUSION

It was shown that a PSO can be used to train a SVM. Some properties of LPSO make it particularly useful for solving the SVM constrained QP problem. The PSO algorithm is simple to implement, and does not require any background in numerical methods. Accurate and scalable training results were shown on the MNIST dataset, with the PSO algorithm finding fewer support vectors and showing better scalability than other approaches. Although the algorithm is simple, its speed poses a practical bottleneck, which can be improved upon following this initial study.

ACKNOWLEDGMENT

The financial assistance of the National Research Foundation towards this research is hereby acknowledged. Opinions expressed in this paper and conclusions arrived at are those of the authors and not necessarily to be attributed to the National Research Foundation.
REFERENCES

[1] B.E. Boser, I.M. Guyon, and V.N. Vapnik, A training algorithm for optimal margin classifiers, in D. Haussler, editor, Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, Pittsburgh, PA, 1992. ACM Press.
[2] R. Fletcher, Practical Methods of Optimization. John Wiley and Sons, Inc., 2nd edition, 1987.
[3] T. Joachims, Making large-scale SVM learning practical, in Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C.J.C. Burges, and A.J. Smola, editors. MIT Press, Cambridge, MA, 1999.
[4] J. Kennedy and R.C. Eberhart, Particle swarm optimization, in Proceedings of the IEEE International Conference on Neural Networks, IV, 1995.
[5] P. Laskov, Feasible direction decomposition algorithms for training support vector machines, Machine Learning, Volume 46, N. Cristianini, C. Campbell, and C. Burges, editors, 2002.
[6] C. Longbin, Institute of Automation, Chinese Academy of Sciences (CASIA).
[7] MNIST Optical Character Database at AT&T Research.
[8] E. Osuna, R. Freund, and F. Girosi, Support vector machines: Training and applications, A.I. Memo AIM-1602, MIT A.I. Lab, 1997.
[9] E. Osuna, R. Freund, and F. Girosi, An improved training algorithm for support vector machines, in Neural Networks for Signal Processing VII - Proceedings of the 1997 IEEE Workshop, J. Principe, L. Giles, N. Morgan, and E. Wilson, editors. IEEE, New York, 1997.
[10] U. Paquet and A.P. Engelbrecht, Particle swarms for equality-constrained optimization, submitted to IEEE Transactions on Evolutionary Computation.
[11] J. Platt, Fast training of support vector machines using sequential minimal optimization, in Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C.J.C. Burges, and A.J. Smola, editors. MIT Press, Cambridge, MA, 1999.
[12] Y.H. Shi and R.C. Eberhart, A modified particle swarm optimizer, in IEEE International Conference on Evolutionary Computation, Anchorage, Alaska, 1998.
[13] F. van den Bergh, An analysis of particle swarm optimizers, PhD thesis, Department of Computer Science, University of Pretoria, 2002.
[14] V. Vapnik, Estimation of Dependences Based on Empirical Data [in Russian], Nauka, Moscow, 1979. (English translation: Springer Verlag, New York, 1982.)