Margin Maximizing Loss Functions


Saharon Rosset, Watson Research Center, IBM, Yorktown, NY
Ji Zhu, Department of Statistics, University of Michigan, Ann Arbor, MI
Trevor Hastie, Department of Statistics, Stanford University, Stanford, CA

Abstract

Margin maximizing properties play an important role in the analysis of classification models, such as boosting and support vector machines. Margin maximization is theoretically interesting because it facilitates generalization error analysis, and practically interesting because it presents a clear geometric interpretation of the models being built. We formulate and prove a sufficient condition for the solutions of regularized loss functions to converge to margin maximizing separators, as the regularization vanishes. This condition covers the hinge loss of SVM, the exponential loss of AdaBoost and logistic regression loss. We also generalize it to multi-class classification problems, and present margin maximizing multi-class versions of logistic regression and support vector machines.

1 Introduction

Assume we have a classification learning sample {x_i, y_i}_{i=1}^n with y_i ∈ {-1, +1}. We wish to build a model F(x) for this data by minimizing (exactly or approximately) a loss criterion Σ_i C(y_i, F(x_i)) = Σ_i C(y_i F(x_i)) which is a function of the margins y_i F(x_i) of this model on this data. Most common classification modeling approaches can be cast in this framework: logistic regression, support vector machines, boosting and more. The model F(x) which these methods actually build is a linear combination of dictionary functions coming from a dictionary H which can be large or even infinite:

F(x) = Σ_{h_j ∈ H} β_j h_j(x)

and our prediction at point x based on this model is sign F(x). When H is large, as is the case in most boosting or kernel SVM applications, some regularization is needed to control the complexity of the model F(x) and the resulting over-fitting. Thus, it is common that the quantity actually minimized on the data is a regularized version of the loss function:

(1)    β̂(λ) = arg min_β Σ_i C(y_i β'h(x_i)) + λ‖β‖_p^p

where the second term penalizes the l_p norm of the coefficient vector β (p ≥ 1 for convexity, and in practice usually p ∈ {1, 2}), and λ ≥ 0 is a tuning regularization parameter. The 1- and 2-norm support vector machine training problems with slack can be cast in this form ([6], chapter 12). In [8] we have shown that boosting approximately follows the path of regularized solutions traced by (1) as the regularization parameter λ varies, with the appropriate loss and an l_1 penalty.
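To make (1) concrete, here is a minimal sketch (not from the paper) of evaluating the regularized objective for a generic margin-based loss. It assumes NumPy is available and represents the dictionary by a fixed feature matrix H whose i'th row is h(x_i); all names are illustrative.

import numpy as np

def regularized_objective(beta, H, y, loss, lam, p=1):
    # sum_i C(y_i * beta'h(x_i))  +  lam * ||beta||_p^p,  as in (1)
    margins = y * (H @ beta)
    return np.sum(loss(margins)) + lam * np.sum(np.abs(beta) ** p)

# Example call with the exponential loss C(m) = exp(-m) and an l1 penalty (p = 1)
H = np.array([[ 1.0,  2.0],
              [ 2.0,  1.0],
              [-1.0, -1.5]])
y = np.array([1, 1, -1])
beta = np.array([0.5, 0.5])
print(regularized_objective(beta, H, y, lambda m: np.exp(-m), lam=0.1, p=1))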

The main question that we answer in this paper is: for what loss functions does the normalized solution β̂(λ)/‖β̂(λ)‖_p converge to an optimal separator as λ → 0? The definition of optimal which we will use depends on the l_p norm used for regularization, and we will term it the l_p-margin maximizing separating hyper-plane. More concisely, we will investigate for which loss functions and under which conditions we have:

(2)    lim_{λ→0} β̂(λ)/‖β̂(λ)‖_p = arg max_{‖β‖_p=1} min_i y_i β'h(x_i)

This margin maximizing property is interesting for three distinct reasons. First, it gives us a geometric interpretation of the limiting model as we relax the regularization. It tells us that this loss seeks to optimally separate the data by maximizing a distance between a separating hyper-plane and the closest points. A theorem by Mangasarian [7] allows us to interpret l_p margin maximization as l_q distance maximization, with 1/p + 1/q = 1, and hence make a clear geometric interpretation. Second, from a learning theory perspective large margins are an important quantity: generalization error bounds that depend on the margins have been generated for support vector machines ([10] using l_2 margins) and boosting ([9] using l_1 margins). Thus, showing that a loss function is margin maximizing in this sense is useful and promising information regarding this loss function's potential for generating good prediction models. Third, practical experience shows that exact or approximate margin maximization (such as non-regularized kernel SVM solutions, or infinite boosting) may actually lead to good classification prediction models. This is certainly not always the case, and we return to this hotly debated issue in our discussion.

Our main result is a sufficient condition on the loss function, which guarantees that (2) holds, if the data is separable, i.e. if the maximum on the RHS of (2) is positive. This condition is presented and proven in section 2. It covers the hinge loss of support vector machines, the logistic log-likelihood loss of logistic regression, and the exponential loss, most notably used in boosting. We discuss these and other examples in section 3. Our result generalizes elegantly to multi-class models and loss functions. We present the resulting margin-maximizing versions of SVMs and logistic regression in section 4.

2 Sufficient condition for margin maximization

The following theorem shows that if the loss function vanishes quickly enough, then it will be margin-maximizing as the regularization vanishes. It provides us with a unified margin-maximization theory, covering SVMs, logistic regression and boosting.

Theorem 2.1  Assume the data {x_i, y_i}_{i=1}^n is separable, i.e. ∃β s.t. min_i y_i β'h(x_i) > 0. Let C(y, f) = C(yf) be a monotone non-increasing loss function depending on the margin only. If ∃ T > 0 (possibly T = ∞) such that:

(3)    lim_{t→T} C(t[1-ε]) / C(t) = ∞,  ∀ε > 0

then C is a margin maximizing loss function in the sense that any convergence point of the normalized solutions β̂(λ)/‖β̂(λ)‖_p to the regularized problems (1) as λ → 0 is an l_p margin-maximizing separating hyper-plane. Consequently, if this margin-maximizing hyper-plane is unique, then the solutions converge to it:

(4)    lim_{λ→0} β̂(λ)/‖β̂(λ)‖_p = arg max_{‖β‖_p=1} min_i y_i β'h(x_i)
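As an informal numerical illustration of condition (3) (not from the paper), the snippet below tracks the ratio C(t[1-ε])/C(t) as t grows: for the exponential loss it equals e^{tε} and diverges, while for a loss that vanishes only polynomially, such as C(m) = 1/m (used here purely for contrast), it stays bounded at 1/(1-ε).

import numpy as np

def condition_ratio(loss, t, eps):
    # The quantity whose divergence as t -> T is required by condition (3)
    return loss(t * (1 - eps)) / loss(t)

exp_loss = lambda m: np.exp(-m)     # T = infinity: ratio = e^(t*eps) -> infinity
slow_loss = lambda m: 1.0 / m       # vanishes too slowly: ratio -> 1/(1-eps)

eps = 0.1
for t in [1.0, 10.0, 100.0, 1000.0]:
    print(f"t={t:7.1f}   exp: {condition_ratio(exp_loss, t, eps):.3e}   1/m: {condition_ratio(slow_loss, t, eps):.4f}")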

Proof  We prove the result separately for T = ∞ and T < ∞.

a. T = ∞:

Lemma 2.2  lim_{λ→0} ‖β̂(λ)‖_p = ∞.

Proof  Since T = ∞ then C(m) > 0 ∀m > 0, and lim_{m→∞} C(m) = 0. Therefore, for loss+penalty to vanish as λ → 0, ‖β̂(λ)‖_p must diverge, to allow the margins to diverge.

Lemma 2.3  Assume β_1, β_2 are two separating models, with ‖β_1‖_p = ‖β_2‖_p = 1, and β_1 separates the data better, i.e.: 0 < m_2 = min_i y_i h(x_i)'β_2 < m_1 = min_i y_i h(x_i)'β_1. Then ∃ U = U(m_1, m_2) such that ∀t > U,

Σ_i C(y_i h(x_i)'(tβ_1)) < Σ_i C(y_i h(x_i)'(tβ_2))

In words, if β_1 separates better than β_2 then scaled-up versions of β_1 will incur smaller loss than scaled-up versions of β_2, if the scaling factor is large enough.

Proof  Since condition (3) holds with T = ∞, there exists U such that ∀t > U, C(tm_2)/C(tm_1) > n. Thus from C being non-increasing we immediately get:

∀t > U,  Σ_i C(y_i h(x_i)'(tβ_1)) ≤ n C(tm_1) < C(tm_2) ≤ Σ_i C(y_i h(x_i)'(tβ_2))

Proof of case a.:  Assume β* is a convergence point of β̂(λ)/‖β̂(λ)‖_p as λ → 0, with ‖β*‖_p = 1. Now assume by contradiction that β̃ has ‖β̃‖_p = 1 and bigger minimal l_p margin. Denote the minimal margins for the two models by m* and m̃, respectively, with m* < m̃. By continuity of the minimal margin in β, there exists some open neighborhood of β* on the l_p sphere:

N_{β*} = {β : ‖β‖_p = 1, ‖β - β*‖ < δ}

and an ε > 0, such that: min_i y_i β'h(x_i) < m̃ - ε, ∀β ∈ N_{β*}. Now by lemma 2.3 we get that there exists U = U(m̃, m̃ - ε) such that tβ̃ incurs smaller loss than tβ for any t > U, β ∈ N_{β*}. Therefore β* cannot be a convergence point of β̂(λ)/‖β̂(λ)‖_p.
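A small numerical illustration of Lemma 2.3 (illustrative only; the margin values below are made up): for the exponential loss, the direction with the larger minimal margin eventually incurs the smaller total loss once both are scaled up, even if it loses at small scales.

import numpy as np

# Margins of two normalized separators on the same four points:
# beta_1 has the larger minimal margin (0.5), beta_2 the smaller one (0.3)
# but much larger margins on the remaining points.
margins_1 = np.array([0.5, 0.6, 0.7, 0.8])
margins_2 = np.array([0.3, 2.0, 2.0, 2.0])

for t in [1, 5, 20, 50]:
    loss_1 = np.exp(-t * margins_1).sum()
    loss_2 = np.exp(-t * margins_2).sum()
    print(f"t={t:3d}   loss(t*beta_1)={loss_1:.3e}   loss(t*beta_2)={loss_2:.3e}")

# At t = 1 the worse-separating beta_2 still has the smaller total loss;
# by t = 5 the larger minimal margin of beta_1 takes over, as the lemma guarantees.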

b. T < ∞:

Lemma 2.4  C(T) = 0 and C(T - δ) > 0, ∀δ > 0.

Proof  From condition (3), C(T - Tε)/C(T) = ∞. Both results follow immediately, with δ = Tε.

Lemma 2.5  lim_{λ→0} min_i y_i β̂(λ)'h(x_i) = T.

Proof  Assume by contradiction that there is a sequence λ_1, λ_2, ... → 0 and ε > 0 s.t. ∀j, min_i y_i β̂(λ_j)'h(x_i) ≤ T - ε. Pick any separating normalized model β̃, i.e. ‖β̃‖_p = 1 and m̃ := min_i y_i β̃'h(x_i) > 0. Then for any λ < (m̃/T)^p C(T - ε) we get:

Σ_i C(y_i (T/m̃) β̃'h(x_i)) + λ‖(T/m̃)β̃‖_p^p < C(T - ε)

since the first term (loss) is 0 and the penalty is smaller than C(T - ε) by the condition on λ. But ∃ j_0 s.t. λ_{j_0} < (m̃/T)^p C(T - ε) and so we get a contradiction to optimality of β̂(λ_{j_0}), since we assumed min_i y_i β̂(λ_{j_0})'h(x_i) ≤ T - ε and thus:

Σ_i C(y_i β̂(λ_{j_0})'h(x_i)) ≥ C(T - ε)

We have thus proven that lim inf_{λ→0} min_i y_i β̂(λ)'h(x_i) ≥ T. It remains to prove equality. Assume by contradiction that for some value of λ we have m := min_i y_i β̂(λ)'h(x_i) > T. Then the re-scaled model (T/m)β̂(λ) has the same zero loss as β̂(λ), but a smaller penalty, since ‖(T/m)β̂(λ)‖_p = (T/m)‖β̂(λ)‖_p < ‖β̂(λ)‖_p. So we get a contradiction to optimality of β̂(λ).

Proof of case b.:  Assume β* is a convergence point of β̂(λ)/‖β̂(λ)‖_p as λ → 0, with ‖β*‖_p = 1. Now assume by contradiction that β̃ has ‖β̃‖_p = 1 and bigger minimal margin. Denote the minimal margins for the two models by m* and m̃, respectively, with m* < m̃. Let λ_1, λ_2, ... → 0 be a sequence along which β̂(λ_j)/‖β̂(λ_j)‖_p → β*. By lemma 2.5 and our assumption, ‖β̂(λ_j)‖_p → T/m* > T/m̃. Thus, ∃ j_0 such that ∀j > j_0, ‖β̂(λ_j)‖_p > T/m̃, and consequently:

Σ_i C(y_i β̂(λ_j)'h(x_i)) + λ_j ‖β̂(λ_j)‖_p^p > λ_j (T/m̃)^p = Σ_i C(y_i (T/m̃) β̃'h(x_i)) + λ_j ‖(T/m̃)β̃‖_p^p

So we get a contradiction to optimality of β̂(λ_j).

Thus we conclude for both cases a. and b. that any convergence point of β̂(λ)/‖β̂(λ)‖_p must maximize the l_p margin. Since ‖β̂(λ)/‖β̂(λ)‖_p‖_p = 1, such convergence points obviously exist. If the l_p-margin-maximizing separating hyper-plane is unique, then we can conclude:

lim_{λ→0} β̂(λ)/‖β̂(λ)‖_p = β̂ := arg max_{‖β‖_p=1} min_i y_i β'h(x_i)

Necessity results

A necessity result for margin maximization on any separable data seems to require either additional assumptions on the loss or a relaxation of condition (3). We conjecture that if we also require that the loss is convex and vanishing (i.e. lim_{m→∞} C(m) = 0) then condition (3) is sufficient and necessary. However this is still a subject for future research.

3 Examples

Support vector machines

Support vector machines (linear or kernel) can be described as a regularized problem:

(5)    min_β Σ_i [1 - y_i β'h(x_i)]_+ + λ‖β‖_p^p

where p = 2 for the standard (2-norm) SVM and p = 1 for the 1-norm SVM. This formulation is equivalent to the better known norm minimization SVM formulation in the sense that they have the same set of solutions as the regularization parameter λ varies in (5) or the slack bound varies in the norm minimization formulation.
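A minimal sketch (assuming NumPy; names are illustrative, not the paper's) of the objective in (5), together with a quick check that the ratio in condition (3) blows up at T = 1 for the hinge loss.

import numpy as np

def svm_objective(beta, H, y, lam, p=2):
    # sum_i [1 - y_i * beta'h(x_i)]_+  +  lam * ||beta||_p^p,  as in (5)
    margins = y * (H @ beta)
    return np.sum(np.maximum(0.0, 1.0 - margins)) + lam * np.sum(np.abs(beta) ** p)

# Condition (3) for the hinge loss holds at T = 1: C(t[1-eps])/C(t) diverges as t -> 1
hinge = lambda m: np.maximum(0.0, 1.0 - m)
eps = 0.1
for t in [0.9, 0.99, 0.999]:
    print(f"t={t}:  ratio = {hinge(t * (1 - eps)) / hinge(t):.1f}")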

The loss in (5) is termed hinge loss since it is linear for margins less than 1, then fixed at 0 (see figure 1). The theorem obviously holds for T = 1, and it verifies our knowledge that the non-regularized SVM solution, which is the limit of the regularized solutions, maximizes the appropriate margin (Euclidean for standard SVM, l_1 for 1-norm SVM). Note that our theorem indicates that the squared hinge loss (AKA truncated squared loss) C(y_i, F(x_i)) = ([1 - y_i F(x_i)]_+)^2 is also a margin-maximizing loss.

Logistic regression and boosting

The two loss functions we consider in this context are:

(6)    Exponential:     C_e(m) = exp(-m)
(7)    Log likelihood:  C_l(m) = log(1 + exp(-m))

These two loss functions are of great interest in the context of two-class classification: C_l is used in logistic regression and more recently for boosting [4], while C_e is the implicit loss function used by AdaBoost - the original and most famous boosting algorithm [3]. In [8] we showed that boosting approximately follows the regularized path of solutions using these loss functions and l_1 regularization. We also proved that the two loss functions are very similar for positive margins, and that their regularized solutions converge to margin-maximizing separators. Theorem 2.1 provides a new proof of this result, since the theorem's condition holds with T = ∞ for both loss functions.

Some interesting non-examples

Commonly used classification loss functions which are not margin-maximizing include polynomial loss functions, such as C(m) = 1 - m or C(m) = -m; these do not guarantee convergence of regularized solutions to margin maximizing solutions. Another interesting method in this context is linear discriminant analysis. Although it does not correspond to the loss+penalty formulation we have described, it does find a decision hyper-plane in the predictor space. For both polynomial loss functions and linear discriminant analysis it is easy to find examples which show that they are not necessarily margin maximizing on separable data.

4 A multi-class generalization

Our main result can be elegantly extended to versions of multi-class logistic regression and support vector machines, as follows. Assume the response is now multi-class, with K possible values, i.e. y_i ∈ {c_1, ..., c_K}. Our model consists of a prediction for each class:

F_k(x) = Σ_{h_j ∈ H} β_j^(k) h_j(x)

with the obvious prediction rule at x being arg max_k F_k(x). This gives rise to a (K-1)-dimensional margin for each observation. For y = c_k, define the margin vector as:

(8)    m(c_k, f_1, ..., f_K) = (f_k - f_1, ..., f_k - f_{k-1}, f_k - f_{k+1}, ..., f_k - f_K)

And our loss is a function of this (K-1)-dimensional margin:

C(y, f_1, ..., f_K) = Σ_k I{y = c_k} C(m(c_k, f_1, ..., f_K))
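A minimal sketch of the margin vector in (8), assuming NumPy; the function name and the example scores are illustrative, not the paper's.

import numpy as np

def margin_vector(f, k):
    # m(c_k, f_1, ..., f_K): f_k minus each of the other K-1 scores, as in (8)
    return np.delete(f[k] - f, k)

# Scores f_1, f_2, f_3 for one observation whose true class is c_1 (index 0)
f = np.array([2.0, 0.5, -1.0])
print(margin_vector(f, 0))   # [1.5  3. ] -- all positive, so class 1 beats classes 2 and 3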

Figure 1: Margin maximizing loss functions for 2-class problems (left: hinge, exponential and logistic losses) and the SVM 3-class loss function of section 4.1 (right).

The l_p-regularized problem is now:

(9)    β̂(λ) = arg min_{β^(1), ..., β^(K)} Σ_i C(y_i, h(x_i)'β^(1), ..., h(x_i)'β^(K)) + λ Σ_k ‖β^(k)‖_p^p

where β̂(λ) = (β̂^(1)(λ), ..., β̂^(K)(λ)) ∈ R^{K·|H|}. In this formulation, the concept of margin maximization corresponds to maximizing the minimal of all n(K-1) normalized l_p-margins generated by the data:

(10)    max_{Σ_k ‖β^(k)‖_p^p = 1} min_i min_{c_k ≠ y_i} h(x_i)'(β^(y_i) - β^(k))

Note that this margin maximization problem still has a natural geometric interpretation, as h(x_i)'(β^(y_i) - β^(k)) > 0, ∀c_k ≠ y_i implies that the hyper-plane h(x)'(β^(j) - β^(k)) = 0 successfully separates classes j and k for any two classes.

Here is a generalization of the optimal separation theorem 2.1 to multi-class models:

Theorem 4.1  Assume C(m) is commutative and decreasing in each coordinate, then if ∃ T > 0 (possibly T = ∞) such that:

(11)    lim_{t→T} C(t[1-ε], tu_1, ..., tu_{K-2}) / C(t, tv_1, ..., tv_{K-2}) = ∞,  ∀ε > 0, u_1 ≥ 1, ..., u_{K-2} ≥ 1, v_1 ≥ 1, ..., v_{K-2} ≥ 1

then C is a margin-maximizing loss function for multi-class models, in the sense that any convergence point of the normalized solutions to (9), β̂(λ)/‖β̂(λ)‖_p, attains the optimal separation as defined in (10).

Idea of proof  The proof is essentially identical to the two-class case, now considering the n(K-1) margins on which the loss depends. The condition (11) implies that as the regularization vanishes the model is determined by the minimal margin, and so an optimal model puts the emphasis on maximizing that margin.
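To illustrate (10), here is a rough sketch (assuming NumPy; the data and the function name are made up for illustration) that computes the smallest of the n(K-1) margins for a given set of class coefficient vectors, normalized so that the penalty term of (9) equals one for p = 1.

import numpy as np

def min_multiclass_margin(B, H, y):
    # Smallest of the n*(K-1) margins h(x_i)'(beta^(y_i) - beta^(k)), k != y_i, as in (10)
    F = H @ B                               # n x K score matrix, F[i, k] = h(x_i)'beta^(k)
    worst = np.inf
    for i in range(len(y)):
        diffs = F[i, y[i]] - np.delete(F[i], y[i])
        worst = min(worst, diffs.min())
    return worst

# Hypothetical 3-class example with h(x) = x; columns of B are beta^(1), beta^(2), beta^(3)
H = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
y = np.array([0, 1, 2])                     # class indices of the three observations
B = np.array([[ 1.0, -0.5, -0.5],
              [-0.5,  1.0, -0.5]])
B = B / np.abs(B).sum()                     # normalize so that sum_k ||beta^(k)||_1 = 1
print(min_multiclass_margin(B, H, y))       # 0.375 for this toy configuration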

Corollary 4.2  In the 2-class case, theorem 4.1 reduces to theorem 2.1.

Proof  The loss depends on β^(1) - β^(2), the penalty on ‖β^(1)‖_p^p + ‖β^(2)‖_p^p. An optimal solution to the regularized problem must thus have β^(1) + β^(2) = 0, since by transforming:

β^(1) → β^(1) - (β^(1) + β^(2))/2,    β^(2) → β^(2) - (β^(1) + β^(2))/2

we are not changing the loss, but reducing the penalty, by Jensen's inequality:

‖β^(1) - (β^(1) + β^(2))/2‖_p^p + ‖β^(2) - (β^(1) + β^(2))/2‖_p^p = 2‖(β^(1) - β^(2))/2‖_p^p ≤ ‖β^(1)‖_p^p + ‖β^(2)‖_p^p

So we can conclude that β̂^(1)(λ) = -β̂^(2)(λ) and consequently that the two margin maximization tasks (2), (10) are equivalent.

4.1 Margin maximization in multi-class SVM and logistic regression

Here we apply theorem 4.1 to versions of multi-class logistic regression and SVM. For logistic regression, we use a slightly different formulation than the standard logistic regression models, which use class K as a reference class, i.e. assume that β^(K) = 0. This is required for non-regularized fitting, since without it the solution is not uniquely defined. However, using regularization as in (9) guarantees that the solution will be unique and consequently we can symmetrize the model, which allows us to apply theorem 4.1. So the loss function we use is (assume y = c_k, i.e. the observation belongs to class k):

(12)    C(y, f_1, ..., f_K) = -log( e^{f_k} / Σ_j e^{f_j} ) = log( e^{f_1 - f_k} + ... + e^{f_{k-1} - f_k} + 1 + e^{f_{k+1} - f_k} + ... + e^{f_K - f_k} )

with the linear model: f_j(x_i) = h(x_i)'β^(j). It is not difficult to verify that condition (11) holds for this loss function with T = ∞, using the fact that log(1 + ε) = ε + O(ε²). The sum of exponentials which results from applying this first-order approximation satisfies (11), and as ε → 0, the second order term can be ignored.

For support vector machines, consider a multi-class loss which is a natural generalization of the two-class loss:

(13)    C(m) = Σ_{j=1}^{K-1} [1 - m_j]_+

where m_j is the j'th component of the multi-margin m as in (8). Figure 1 shows this loss for K = 3 classes as a function of the two margins. The loss+penalty formulation using (13) is equivalent to a standard optimization formulation of multi-class SVM (e.g. [11]):

max c
s.t.  h(x_i)'(β^(y_i) - β^(k)) ≥ c(1 - ξ_{ik}),  ∀i ∈ {1, ..., n}, k ∈ {1, ..., K}, c_k ≠ y_i
      ξ_{ik} ≥ 0,  Σ_{i,k} ξ_{ik} ≤ B,  Σ_k ‖β^(k)‖_p^p = 1

As both theorem 4.1 (using T = 1) and the optimization formulation indicate, the regularized solutions to this problem converge to the l_p margin maximizing multi-class solution.
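A brief sketch (assuming NumPy; names are illustrative) of the two multi-class losses above: (12) evaluated stably as a log-sum-exp of score differences, and (13) as a sum of hinge terms over the margin vector (8).

import numpy as np

def multiclass_logistic_loss(f, k):
    # (12): -log( e^{f_k} / sum_j e^{f_j} ) = log sum_j e^{f_j - f_k}
    return np.logaddexp.reduce(f - f[k])

def multiclass_hinge_loss(f, k):
    # (13): sum over the K-1 components of the margin vector (8) of [1 - m_j]_+
    m = np.delete(f[k] - f, k)
    return np.sum(np.maximum(0.0, 1.0 - m))

f = np.array([2.0, 0.5, -1.0])          # scores for one observation, true class index 0
print(multiclass_logistic_loss(f, 0))   # about 0.241
print(multiclass_hinge_loss(f, 0))      # 0.0: every margin component exceeds 1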

5 Discussion

What are the properties we would like to have in a classification loss function? Recently there has been a lot of interest in Bayes-consistency of loss functions and algorithms ([1] and references therein), as the data size increases. It turns out that practically all reasonable loss functions are consistent in that sense, although convergence rates and other measures of degree of consistency may vary. Margin maximization, on the other hand, is a finite sample optimality property of loss functions, which is potentially of decreasing interest as sample size grows, since the training data-set is less likely to be separable. Note, however, that in very high dimensional predictor spaces, such as those typically used by boosting or kernel SVM, separability of any finite-size data-set is a mild assumption, which is violated only in pathological cases.

We have shown that the margin maximizing property is shared by some popular loss functions used in logistic regression, support vector machines and boosting. Knowing that these algorithms converge, as regularization vanishes, to the same model (provided they use the same regularization) is an interesting insight. So, for example, we can conclude that 1-norm support vector machines, exponential boosting and l_1-regularized logistic regression all facilitate the same non-regularized solution, which is an l_1-margin maximizing separating hyper-plane. From Mangasarian's theorem [7] we know that this hyper-plane maximizes the l_∞ distance from the closest points on either side.

The most interesting statistical question which arises is: are these optimal separating models really good for prediction, or should we expect regularized models to always do better in practice? Statistical intuition supports the latter, as do some margin-maximizing experiments by Breiman [2] and Grove and Schuurmans [5]. However it has also been observed that in many cases margin-maximization leads to reasonable prediction models, and does not necessarily result in over-fitting. We have had similar experience with boosting and kernel SVM. Settling this issue is an intriguing research topic, and one that is critical in determining the practical importance of our results, as well as that of margin-based generalization error bounds.

References

[1] Bartlett, P., Jordan, M. & McAuliffe, J. (2003). Convexity, Classification and Risk Bounds. Technical report, Dept. of Statistics, UC Berkeley.
[2] Breiman, L. (1999). Prediction games and arcing algorithms. Neural Computation 11(7): 1493-1517.
[3] Freund, Y. & Schapire, R.E. (1995). A decision theoretic generalization of on-line learning and an application to boosting. Proc. of 2nd European Conf. on Computational Learning Theory.
[4] Friedman, J.H., Hastie, T. & Tibshirani, R. (2000). Additive logistic regression: a statistical view of boosting. Annals of Statistics 28: 337-407.
[5] Grove, A.J. & Schuurmans, D. (1998). Boosting in the limit: Maximizing the margin of learned ensembles. Proc. of 15th National Conf. on AI.
[6] Hastie, T., Tibshirani, R. & Friedman, J. (2001). Elements of Statistical Learning. Springer-Verlag.
[7] Mangasarian, O.L. (1999). Arbitrary-norm separating plane. Operations Research Letters, Vol. 24(1-2): 15-23.
[8] Rosset, S., Zhu, J. & Hastie, T. (2003). Boosting as a regularized path to a maximum margin classifier. Technical report, Dept. of Statistics, Stanford Univ.
[9] Schapire, R.E., Freund, Y., Bartlett, P. & Lee, W.S. (1998). Boosting the margin: a new explanation for the effectiveness of voting methods. Annals of Statistics 26(5): 1651-1686.
[10] Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer.
[11] Weston, J. & Watkins, C. (1998). Multi-class support vector machines. Technical report CSD-TR-98-04, Dept. of CS, Royal Holloway, University of London.
