Stochastic Convex Optimization


Shai Shalev-Shwartz, TTI-Chicago
Ohad Shamir, The Hebrew University
Nathan Srebro, TTI-Chicago
Karthik Sridharan, TTI-Chicago

Abstract

For supervised classification problems, it is well known that learnability is equivalent to uniform convergence of the empirical risks and thus to learnability by empirical minimization. Inspired by recent regret bounds for online convex optimization, we study stochastic convex optimization, and uncover a surprisingly different situation in the more general setting: although the stochastic convex optimization problem is learnable (e.g. using online-to-batch conversions), no uniform convergence holds in the general case, and empirical minimization might fail. Rather than being a difference between online methods and a global minimization approach, we show that the key ingredient is strong convexity and regularization. Our results demonstrate that the celebrated theorem of Alon et al on the equivalence of learnability and uniform convergence does not extend to Vapnik's General Setting of Learning, that in the General Setting considering only empirical minimization is not enough, and that despite Vapnik's result on the equivalence of strict consistency and uniform convergence, uniform convergence is only a sufficient, but not necessary, condition for meaningful non-trivial learnability.

1 Introduction

We consider the stochastic convex minimization problem

argmin_{w ∈ W} F(w)    (1)

where F(w) = E_Z[f(w; Z)] is the expectation, with respect to Z, of a random objective that is convex in w. The optimization is based on an i.i.d. sample z_1, ..., z_n drawn from an unknown distribution. The goal is to choose w based on the sample and full knowledge of f(·;·) and W so as to minimize F(w). Alternatively, we can also think of an unknown distribution over convex functions, where we are given a sample of functions {w ↦ f(w; z_i)} and would like to optimize the expected function. A special case is the familiar prediction setting where z = (x, y) is an instance-label pair, W is a subset of a Hilbert space, and f(w; x, y) = ℓ(⟨w, φ(x)⟩, y) for some convex loss function ℓ and feature mapping φ. The situation in which the stochastic dependence on w is linear, as in the preceding example, is fairly well understood.
When the domain W and the mapping φ are bounded, one can uniformly (over all w ∈ W) bound the deviation between the expected objective F(w) and the empirical average

F̂(w) = Ê[f(w; z)] = (1/n) Σ_{i=1}^n f(w; z_i).    (2)

This uniform convergence of F̂(w) to F(w) justifies choosing the empirical minimizer

ŵ = argmin_{w ∈ W} F̂(w),    (3)

and guarantees that the expected value of F(ŵ) converges to the optimal value F(w*) = inf_{w ∈ W} F(w). Furthermore, a similar guarantee can also be obtained for an approximate minimizer of the empirical objective.

Our goal here is to consider the stochastic convex optimization problem more broadly, without assuming any metric or other structure on the parameter z or mappings of it, or any special structure of the objective function f(·;·). Viewed as optimization based on a sample of functions, we do not impose any constraints on the functions, nor on the relationship between the functions, except that each function w ↦ f(w; z) separately is convex and Lipschitz-continuous.

An online analogue of this setting has recently received considerable attention. Online convex optimization concerns a sequence of convex functions f(·; z_1), ..., f(·; z_n), which can be chosen by an adversary, and a sequence of online predictors w_1, ..., w_n, where w_i can depend only on z_1, ..., z_{i-1}. Online guarantees provide an upper bound on the online regret, (1/n) Σ_{i=1}^n f(w_i; z_i) − min_{w ∈ W} (1/n) Σ_{i=1}^n f(w; z_i). Note the difference versus the stochastic setting, where we seek a single predictor w̄ and would like to bound the population sub-optimality F(w̄) − F(w*). Zinkevich [Z03] showed that requiring f(w; z) to be Lipschitz-continuous w.r.t. w is enough for obtaining an online algorithm with online regret which diminishes as 1/√n. If f(w; z) is not merely convex w.r.t. w, but also strongly convex, the regret diminishes at a faster rate of log(n)/n [HKKA06]. These online results parallel known results in the stochastic setting, when the stochastic dependence on w is linear. However, they apply also in a much broader setting, when the stochastic dependence on w is not linear, e.g. when f(w; z) = ‖w − z‖_p for p ≥ 1. The requirement that the functions w ↦ f(w; z) be Lipschitz-continuous is much more general than a specific requirement on the structure of the functions, and does not at all constrain the relationship between the functions. That is, we can think of z as parameterizing all possible Lipschitz-continuous convex functions w ↦ f(w; z). We note that this is quite different from the work of von Luxburg and Bousquet [vLB04], who studied learning with functions that are Lipschitz with respect to z.

The results for the online setting prompt us to ask whether similar results, requiring only Lipschitz continuity, can also be obtained for stochastic convex optimization. The answer we discover is surprisingly complex. Our first surprising observation is that requiring Lipschitz continuity is not enough for ensuring uniform convergence of F̂(w) to F(w), nor for the empirical minimizer ŵ to converge to an optimal solution. We present convex, bounded, Lipschitz-continuous examples where even as the sample size increases, the expected value of the empirical minimizer ŵ is bounded away from the population optimum: F(ŵ) = 1/2 > 0 = F(w*). In essentially all previously studied settings we are aware of where learning or stochastic optimization is possible, we have at least some form of locally uniform convergence, and an empirical minimization approach is appropriate. In fact, for common models of supervised learning, it is known that uniform convergence is equivalent to stochastic optimization being possible [ABCH97]. This might lead us to think that Lipschitz-continuity is not enough to make stochastic convex optimization possible, even though it is enough to ensure online convex optimization is possible. However, this gap between the online and stochastic settings cannot be, since it is possible to convert the online method of Zinkevich to a batch algorithm, with a matching guarantee on the population sub-optimality F(w̄) − F(w*). This guarantee holds for the specific output w̄ of the algorithm, which is not, in general, the empirical minimizer.
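The online-to-batch conversion just described is simple enough to sketch in code. The following is a minimal illustration of ours (not from the paper) of Zinkevich's projected online gradient descent followed by averaging of the iterates; the step size η_t = B/(L√t) and the toy objective f(w; z) = |w − z| are illustrative choices of ours.

```python
import numpy as np

def ogd_average(sample, grad, project, w0, B, L):
    """Projected online gradient descent [Z03] over the sample, followed by
    averaging of the iterates (a simple online-to-batch conversion)."""
    w = w0.astype(float).copy()
    iterates = []
    for t, z in enumerate(sample, start=1):
        iterates.append(w.copy())
        eta = B / (L * np.sqrt(t))        # standard step size for the Lipschitz case
        w = project(w - eta * grad(w, z))
    return np.mean(iterates, axis=0)      # the single predictor w_bar

# Toy example: f(w; z) = |w - z| on W = [-1, 1] (convex, 1-Lipschitz, B = 1).
rng = np.random.default_rng(0)
sample = rng.normal(0.3, 0.1, size=2000)
grad = lambda w, z: np.sign(w - z)
project = lambda w: np.clip(w, -1.0, 1.0)
w_bar = ogd_average(sample, grad, project, np.zeros(1), B=1.0, L=1.0)
# w_bar approaches the population minimizer of E|w - Z|, i.e. the median 0.3
```

The averaged iterate w̄ enjoys the online-to-batch guarantee just discussed; the empirical minimizer of the same sample need not, which is exactly the gap explored in Section 4.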
It seems, then, that we are in a strange situation where stochastic optimization is possible, but only using a specific (online) algorithm, rather than the more natural empirical minimizer. We show that the magic can be understood not as a gap between online optimization and empirical minimization, but rather in terms of regularization. To do so, we first show that for a strongly convex stochastic optimization problem, even though we might still have no uniform convergence, the empirical minimizer is guaranteed to converge to the population optimum. This result seems to defy Vapnik's celebrated result on the equivalence of uniform convergence and strict consistency of the empirical minimizer [Vap95, Vap98]. We explain why there is no contradiction here: Vapnik's notion of strict consistency is too strict and does not capture all situations in which learning is non-trivial, yet still possible. Convergence of the empirical minimizer to the population optimum for strongly convex objectives justifies stochastic convex optimization of weakly convex Lipschitz-continuous functions using regularized empirical minimization. In fact, we discuss how Zinkevich's algorithm can also be understood in terms of minimizing an implicit regularized problem.

2 Setup and Background

A stochastic convex optimization problem is specified by a convex domain W, which in this paper we always take to be a closed and bounded subset of a Hilbert space H, and a function f : W × Z → R which is convex w.r.t. its first argument. We say that the problem is learnable (or solvable) iff there exists a rule for choosing w̄ based on an i.i.d. sample z_1, ..., z_n, and complete knowledge of W and f(·;·), such that for any δ > 0, any ε > 0, and large enough sample size n, for any distribution over z, with probability at least 1 − δ over a sample of size n, we have F(w̄) ≤ F(w*) + ε. We say that such a rule is uniformly consistent, or that it solves the stochastic optimization problem. We say that the problem is bounded by B iff for all w ∈ W we have ‖w‖ ≤ B. We say that the problem is L-Lipschitz if f(w; z) is L-Lipschitz w.r.t. w. That is, for any z ∈ Z and w_1, w_2 ∈ W we have |f(w_1; z) − f(w_2; z)| ≤ L‖w_1 − w_2‖.
We say that the problem is λ-strongly convex if for any z ∈ Z, w_1, w_2 ∈ W and α ∈ [0, 1] we have

f(αw_1 + (1 − α)w_2; z) ≤ αf(w_1; z) + (1 − α)f(w_2; z) − (λ/2)α(1 − α)‖w_1 − w_2‖².

Note that this strengthens the convexity requirement, which corresponds to setting λ = 0.

2.1 Generalized Linear Stochastic Optimization

We say that a problem is a generalized linear problem if f(w; z) can be written as

f(w; z) = g(⟨w, φ(z)⟩; z) + r(w)    (4)

where g : R × Z → R is convex w.r.t. its first argument, r : W → R is convex, and φ : Z → H. A special case is supervised learning of a linear predictor with a convex loss function, where g(·;·) encodes the loss function. Learnability results for linear predictors can in fact be stated more generally as guarantees on stochastic optimization of generalized linear problems:

Theorem 1. Consider a generalized linear stochastic convex optimization problem of the form (4), such that the domain W is bounded by B, the image of φ is bounded by R, and g(u; z) is L_g-Lipschitz in u. Then for any distribution over z and any δ > 0, with probability at least 1 − δ over a sample of size n:

sup_{w ∈ W} |F(w) − F̂(w)| ≤ O( √( B²(R L_g)² log(1/δ) / n ) )

That is, the empirical values F̂(w) converge uniformly, for all w ∈ W, to their expectations F(w). This ensures that with probability at least 1 − δ, for all w ∈ W:

F(w) − F(w*) ≤ (F̂(w) − F̂(ŵ)) + O( √( B²(R L_g)² log(1/δ) / n ) )    (5)

The empirical suboptimality term on the right-hand side vanishes for the empirical minimizer ŵ, establishing that empirical minimization solves the stochastic optimization problem with a rate of 1/√n. Furthermore, (5) allows us to bound the population suboptimality in terms of the empirical suboptimality and obtain meaningful guarantees even for approximate empirical minimizers.

The non-stochastic term r(w) does not play a role in the above bound, as it can always be canceled out. However, when this term is strongly convex (e.g. when it is a squared-norm regularization term, r(w) = (λ/2)‖w‖²), a faster convergence rate can be guaranteed:

Theorem 2. [SSS08] Consider a generalized linear stochastic convex optimization problem of the form (4), such that r(w) is λ-strongly convex, the image of φ is bounded by R, and g(u; z) is L_g-Lipschitz in u. Then for any distribution over z and any δ > 0, with probability at least 1 − δ over a sample of size n, for all w ∈ W:

F(w) − F(w*) ≤ 2(F̂(w) − F̂(ŵ)) + O( (R L_g)² log(1/δ) / (λn) ).

2.2 Online Convex Optimization

Zinkevich [Z03] established that Lipschitz continuity and convexity of the objective functions with respect to the optimization argument are sufficient for online optimization:

Theorem 3. [Sha07, Corollary 1] Let f : W × Z → R be such that W is bounded by B and f(w; z) is convex and L-Lipschitz with respect to w. Then there exists an online algorithm such that for any sequence z_1, ..., z_n the sequence of online vectors w_1, ..., w_n satisfies:

(1/n) Σ_{i=1}^n f(w_i; z_i) ≤ min_{w ∈ W} (1/n) Σ_{i=1}^n f(w; z_i) + O( √( B²L² / n ) )    (6)

Subsequently, Hazan et al [HKKA06] showed that a faster rate can be obtained when the objective functions are not only convex, but also strongly convex:

Theorem 4. [HKKA06, Theorem 1] Let f : W × Z → R be such that f(w; z) is λ-strongly convex and L-Lipschitz with respect to w. Then there exists an online algorithm such that for any sequence z_1, ..., z_n the sequence of online vectors w_1, ..., w_n satisfies:

(1/n) Σ_{i=1}^n f(w_i; z_i) ≤ min_{w ∈ W} (1/n) Σ_{i=1}^n f(w; z_i) + O( (L²/λ) log(n)/n )

We present here slightly more general theorem statements than those found in the original papers [Z03, HKKA06].
We do not require differentiability, and instead of bounding the gradient and the Hessian we bound the Lipschitz constant and the parameter of strong convexity. The bound in Theorem 3 is also a bit tighter than that originally established by Zinkevich.

Online-to-batch conversions. In this paper, we are not interested in the online setting, but rather in the batch stochastic optimization setting, where we would like to obtain a single predictor w̄ with low expected value over future examples, F(w̄) = E_z[f(w̄; z)]. Using martingale inequalities, it is possible to convert an online algorithm to a batch algorithm with a stochastic guarantee. One simple way to do so is to run the online algorithm on the stochastic sequence of functions f(·, z_1), ..., f(·, z_n) and set the single predictor w̄ to be the average of the online choices w_1, ..., w_n. Assuming the conditions of Theorem 3, it is possible to show (e.g. [CCG04]) that with probability at least 1 − δ we have

F(w̄) ≤ F(w*) + O( √( B²L² log(1/δ) / n ) ).    (7)

It is also possible to derive a similar guarantee assuming the conditions of Theorem 4 [KT08]:

F(w̄) ≤ F(w*) + O( (L²/λ) log(n/δ)/n ).    (8)

The conditions of Theorem 3 generalize those of Theorem 1 when r(w) = 0: if f(w; z) = g(⟨w, φ(z)⟩; z) satisfies the conditions of Theorem 1 then it also satisfies the conditions of Theorem 3 with L = L_g R, and the bound on the population sub-optimality of w̄ given in (7) matches the guarantee on ŵ using Theorem 1. Similarly, the conditions of Theorem 4 roughly generalize those of Theorem 2 with L = R L_g + L_r, where L_r is the Lipschitz constant of r, and the guarantees are similar (except for a log-factor), as long as L_r = O(R L_g). It is important to note, however, that the guarantees (7) and (8) do not subsume Theorems 1 and 2, as the online-to-batch guarantees apply only to a specific choice w̄ which is defined in terms of the behavior of a specific algorithm. They do not provide guarantees on the empirical minimizer, and certainly not a uniform guarantee in terms of the empirical sub-optimality.

3 Warm-Up: Finite Dimensional Case

We begin by noting that in the finite dimensional case, Lipschitz continuity is enough to guarantee uniform convergence, hence also learnability via empirical minimization.

Theorem 5. Let W ⊂ R^d be bounded by B and let f(w; z) be L-Lipschitz w.r.t. w.
Then with probability at least 1 − δ over a sample of size n, for all w ∈ W:

|F(w) − F̂(w)| ≤ O( L B √( (d log(n) + log(1/δ)) / n ) )

Proof. We will show uniform convergence by bounding the l∞-covering number of the class of functions F = {z ↦ f(w; z) : w ∈ W}. To do so, we first note that as a subset of an l₂ ball, we can bound the covering number of W with respect to the Euclidean distance d₂(w_1, w_2) = ‖w_1 − w_2‖ [VG05]: (for d > 3)

N(ε, W, d₂) = O( d (2B/ε)^d )    (9)

We now turn to covering numbers of F with respect to the l∞ distance d∞(f(w_1; ·), f(w_2; ·)) = sup_z |f(w_1; z) − f(w_2; z)|. By Lipschitz continuity, for any w_1, w_2 ∈ W we have sup_z |f(w_1; z) − f(w_2; z)| ≤ L‖w_1 − w_2‖. An ε-covering of W w.r.t. d₂ therefore yields an Lε-covering of F w.r.t. d∞ distances, and so:

N(ε, F, d∞) ≤ N(ε/L, W, d₂) = O( d (2LB/ε)^d )    (10)

Noting that the empirical l₁ covering number is bounded by the d∞ covering number, and using a uniform bound in terms of empirical l₁ covering numbers [Pol84], we get:

Pr( sup_w |F(w) − F̂(w)| ≥ ε ) ≤ 8 N(ε/8, F, d∞) exp( −nε² / (128(LB)²) ) ≤ O( d (LB/ε)^d ) exp( −nε² / (128(LB)²) ).

Equating the right-hand side to δ and bounding ε we get the bound in the Theorem.

We can therefore conclude that empirical minimization is uniformly consistent, with the same rate as in Theorem 5:

F(ŵ) ≤ F(w*) + O( L B √( (d log(n) + log(1/δ)) / n ) )    (11)

with probability at least 1 − δ over a sample of size n. This is the standard approach for establishing learnability. We now turn to ask whether such an approach can also be taken in the infinite dimensional case, i.e. yielding a bound that does not depend on the dimensionality.

4 Learnable, but Not with Empirical Minimizer

The results of Section 2.2 suggest that perhaps Lipschitz continuity is enough for obtaining guarantees on stochastic convex optimization using a more direct approach, even in infinite dimensions. In particular, that perhaps Lipschitz continuity is enough for ensuring uniform convergence, which in turn would imply learnability using empirical minimization, as in the finite dimensional linear case, the finite dimensional Lipschitz case, and essentially all studied scenarios of stochastic optimization that we are aware of. Ensuring uniform convergence would further enable us to use approximate empirical minimizers, and bound the stochastic sub-optimality of any vector w in terms of its empirical sub-optimality, rather than obtaining a guarantee on the stochastic sub-optimality of only one specific procedural choice (obtained from running the online learning algorithm). Unfortunately, this is not the case. Despite the fact that a bounded, Lipschitz-continuous, stochastic convex optimization problem is learnable even in infinite dimensions, as discussed in Section 2.2, we show here that uniform convergence does not hold and that it might not be learnable with empirical minimization.
4.1 Empirical Minimizer Far from Population Optimum

Consider a convex stochastic optimization problem given by:

f^{(12)}(w; (x, α)) = ‖α ∗ (w − x)‖ = √( Σ_i (α[i](w[i] − x[i]))² )    (12)

where for now we will set the domain to the d-dimensional unit ball W = {w ∈ R^d : ‖w‖ ≤ 1} and take z = (x, α) with α ∈ [0, 1]^d and x ∈ W, and where u ∗ v denotes an element-wise product. We will first consider a sequence of problems, where d = 2^n for any sample size n, and establish that we cannot expect a convergence rate which is independent of the dimensionality d. We then formalize this example in infinite dimensions.

One can think of the problem (12) as that of finding the center of an unknown distribution over x ∈ R^d, where we also have stochastic per-coordinate confidence measures α[i]. We will actually focus on the case where some coordinates are missing, i.e. occasionally α[i] = 0. In any case the domain W is bounded by one, and for any z = (x, α) the function w ↦ f^{(12)}(w; z) is convex and 1-Lipschitz. Thus, the conditions of Theorem 3 hold, and the convex stochastic optimization problem is learnable by running Zinkevich's online algorithm and taking an average.

Consider the following distribution over Z = (X, α): X = 0 with probability one, and α is uniform over {0, 1}^d. That is, the α[i] are i.i.d. uniform Bernoulli. For a random sample (x_1, α_1), ..., (x_n, α_n) we have that with probability greater than 1 − e^{−1} > 0.63, there exists a coordinate j ∈ 1...d such that all confidence vectors α_i in the sample are zero on the coordinate j, i.e. α_i[j] = 0 for all i = 1...n. Let e_j ∈ W be the standard basis vector corresponding to this coordinate. Then

F̂^{(12)}(e_j) = (1/n) Σ_i ‖α_i ∗ (e_j − 0)‖ = (1/n) Σ_i α_i[j] = 0

but

F^{(12)}(e_j) = E_{X,α}[ ‖α ∗ (e_j − 0)‖ ] = E_{X,α}[ α[j] ] = 1/2.

We established that for any n, we can construct a convex Lipschitz-continuous objective in high enough dimension such that with probability at least 0.63 over the sample,

sup_w ( F^{(12)}(w) − F̂^{(12)}(w) ) ≥ 1/2.
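The finite-dimensional construction above is easy to check numerically. The following sketch (ours, not the paper's) draws n confidence vectors with d taken somewhat larger than 2^n, so that a never-observed coordinate exists with overwhelming (rather than merely 0.63) probability, and evaluates the empirical and population objectives at e_j:

```python
import numpy as np

# Illustration of f(w; (x, alpha)) = ||alpha * (w - x)|| with X = 0 and
# alpha uniform on {0,1}^d.  With d = 2**n a never-observed coordinate
# exists with probability ~ 1 - 1/e; we take d larger so it is essentially
# certain in this one run.
rng = np.random.default_rng(1)
n = 12
d = 2 ** 16
alphas = rng.integers(0, 2, size=(n, d))           # n i.i.d. confidence vectors

unseen = np.flatnonzero(alphas.sum(axis=0) == 0)   # coordinates never observed
j = unseen[0]

e_j = np.zeros(d)
e_j[j] = 1.0
F_hat = np.mean([np.linalg.norm(a * e_j) for a in alphas])  # empirical value
F_pop = 0.5 * np.linalg.norm(e_j)   # E||alpha * e_j|| = E[alpha[j]] = 1/2
print(F_hat, F_pop)                 # 0.0 vs 0.5: empirically optimal, far from F(0) = 0
```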
Furthermore, since f(·;·) is non-negative, we have that e_j is an empirical minimizer, but its expected value F^{(12)}(e_j) = 1/2 is far from the optimal expected value min_w F^{(12)}(w) = F^{(12)}(0) = 0.

4.2 In Infinite Dimensions: Empirical Minimizer Does Not Converge to Population Optimum

To formalize the example in a sample-size independent way, take W to be the unit ball of an infinite-dimensional Hilbert space with orthonormal basis e_1, e_2, ..., where for v ∈ W we refer to its coordinates v[j] = ⟨v, e_j⟩ w.r.t. this basis. The confidences α are now a mapping of each coordinate to [0, 1], that is, an infinite sequence of reals in [0, 1]. The element-wise product operation α ∗ v is defined with respect to this basis, and the objective function f^{(12)} of equation (12) is well defined in this infinite-dimensional space.

We again take a distribution over Z = (X, α) where X = 0 and α is an i.i.d. sequence of uniform Bernoulli random variables. Now, for any finite sample there is almost surely a coordinate j with α_i[j] = 0 for all i, and so we a.s. have an empirical minimizer e_j with F̂^{(12)}(e_j) = 0 and F^{(12)}(e_j) = 1/2 > 0 = F^{(12)}(0). We see that although the stochastic convex optimization problem (12) is learnable (using Zinkevich's online algorithm), the empirical values F̂^{(12)}(w) do not converge uniformly to their expectations, and empirical minimization is not guaranteed to solve the problem!

4.3 Unique Empirical Minimizer Does Not Converge to Population Optimum

It is also possible to construct a sharper counterexample, in which the unique empirical minimizer ŵ is far from having optimal expected value. To do so, we augment f^{(12)} by a small term which ensures its empirical minimizer is unique, and far from the origin. Consider:

f^{(13)}(w; (x, α)) = f^{(12)}(w; (x, α)) + ε Σ_i 2^{−i} (w[i] − 1)²    (13)

for a small constant 0 < ε < 1/2. The objective is still convex and, for small ε, (1 + O(ε))-Lipschitz. Furthermore, since the additional term is strictly convex, we have that f^{(13)}(w; z) is strictly convex w.r.t. w and so the empirical minimizer is unique.

Consider the same distribution over Z: X = 0 while the α[i] are i.i.d. uniform zero or one. The empirical minimizer is the minimizer of F̂^{(13)}(w) subject to the constraint ‖w‖ ≤ 1. Identifying the solution to this constrained optimization problem is tricky, but fortunately not necessary. It is enough to show that the optimum of the unconstrained optimization problem, w_UC = argmin_w F̂^{(13)}(w) (without constraining w ∈ W), has norm ‖w_UC‖ ≥ 1. Notice that in the unconstrained problem, whenever α_i[j] = 0 for all i = 1...n, only the second term of f^{(13)} depends on w[j], and we have w_UC[j] = 1. Since this happens a.s. for some coordinate j, we can conclude that the solution to the constrained optimization problem lies on the boundary of W, i.e. has ‖ŵ‖ = 1. But for such a solution we have

F^{(13)}(ŵ) ≥ E_α[ √( Σ_i α[i] ŵ²[i] ) ] ≥ E_α[ Σ_i α[i] ŵ²[i] ] = (1/2)‖ŵ‖² = 1/2,

while F(w*) ≤ F(0) = ε. In conclusion, no matter how big the sample size is, the unique empirical minimizer ŵ of the stochastic convex optimization problem (13) is a.s. much worse than the population optimum, F(ŵ) ≥ 1/2 > ε ≥ F(w*), and certainly does not converge to it.
5 Empirical Minimization of a Strongly Convex Objective

We saw that empirical minimization is not adequate for stochastic convex optimization, even if the objective is Lipschitz-continuous. We will now show that if the objective f(w; z) is strongly convex w.r.t. w, the empirical minimizer does converge to the optimum. This is despite the fact that even in the strongly convex case, we still might not have uniform convergence of F̂(w) to F(w).

5.1 Empirical Minimizer Converges to Population Optimum

Theorem 6. Consider a stochastic convex optimization problem such that f(w; z) is λ-strongly convex and L-Lipschitz with respect to w ∈ W. Let z_1, ..., z_n be an i.i.d. sample and let ŵ be the empirical minimizer. Then, with probability at least 1 − δ over the sample we have

F(ŵ) − F(w*) ≤ 4L² / (δλn).    (14)

Proof. To prove the Theorem, we use a stability argument introduced by Bousquet and Elisseeff [BE02]. Denote by

F̂^{(i)}(w) = (1/n) ( f(w; z'_i) + Σ_{j ≠ i} f(w; z_j) )

the empirical average with z_i replaced by an independently and identically drawn z'_i, and consider its minimizer: ŵ^{(i)} = argmin_w F̂^{(i)}(w). We first use strong convexity and Lipschitz continuity to establish that empirical minimization is stable in the following sense: for all z_1, ..., z_n, z'_i, z ∈ Z,

|f(ŵ; z) − f(ŵ^{(i)}; z)| ≤ β    (15)

with β = 4L²/(λn) (this is referred to as CV (Replacement) Stability in [RMP05] and is similar to uniform stability [BE02]). We then show that (15) implies convergence of F(ŵ) to F(w*).

Claim 6.1. Under the conditions of Theorem 6, the stability bound (15) holds with β = 4L²/(λn).

Proof of Claim 6.1: We first calculate:

F̂(ŵ^{(i)}) − F̂(ŵ)
= ( f(ŵ^{(i)}; z_i) − f(ŵ; z_i) )/n + ( f(ŵ; z'_i) − f(ŵ^{(i)}; z'_i) )/n + ( F̂^{(i)}(ŵ^{(i)}) − F̂^{(i)}(ŵ) )    (16)
≤ ( f(ŵ^{(i)}; z_i) − f(ŵ; z_i) )/n + ( f(ŵ; z'_i) − f(ŵ^{(i)}; z'_i) )/n
≤ 2L‖ŵ^{(i)} − ŵ‖ / n    (17)

where the first inequality follows from the fact that ŵ^{(i)} is the minimizer of F̂^{(i)}(w), and for the second inequality we use Lipschitz continuity. But from strong convexity of F̂(w) and the fact that ŵ minimizes F̂(w) we also have that

F̂(ŵ^{(i)}) ≥ F̂(ŵ) + (λ/2)‖ŵ^{(i)} − ŵ‖².    (18)

Combining (18) with (17) we get ‖ŵ^{(i)} − ŵ‖ ≤ 4L/(λn). Finally, from Lipschitz continuity, for any z ∈ Z:

|f(ŵ; z) − f(ŵ^{(i)}; z)| ≤ L‖ŵ − ŵ^{(i)}‖ ≤ 4L²/(λn).    (19)
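As a quick numerical sanity check of Claim 6.1 (our illustration, not part of the paper), consider the one-dimensional objective f(w; z) = (λ/2)(w − z)² on W = [−1, 1], which is λ-strongly convex and, on this bounded domain, L-Lipschitz with L = 2λ. Its empirical minimizer is the (projected) sample mean, so replacing one sample point moves it by at most 2/n, comfortably within the bound 4L/(λn) = 8/n:

```python
import numpy as np

# Check of ||w_hat^(i) - w_hat|| <= 4L/(lam*n) for f(w; z) = (lam/2)*(w - z)^2
# on W = [-1, 1]: lam-strongly convex, and L-Lipschitz with L = 2*lam on this
# bounded domain.  The empirical minimizer is the (projected) sample mean.
rng = np.random.default_rng(2)
lam, n = 1.0, 200
L = 2 * lam
z = rng.uniform(-1, 1, size=n)
w_hat = np.clip(z.mean(), -1, 1)

bound = 4 * L / (lam * n)                  # = 8/n = 0.04 here
for i in range(n):
    z_rep = z.copy()
    z_rep[i] = rng.uniform(-1, 1)          # independent replacement z_i'
    w_hat_i = np.clip(z_rep.mean(), -1, 1)
    # replacing one point moves the mean by |z_i' - z_i|/n <= 2/n <= 8/n
    assert abs(w_hat_i - w_hat) <= bound
```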

Claim 6.2. If the stability bound (15) holds, then for any δ > 0, with probability 1 − δ over the sample,

F(ŵ) − F(w*) ≤ β/δ.    (20)

A similar result that is not specific to ŵ, but yields only a β + 1/√n rate, appears in [RMP05, Theorem 4.4]. The faster rate is important for us here.

Proof of Claim 6.2: Since the sample with z_i and the sample with z'_i are identically distributed, and z_i is independent of ŵ^{(i)}, we have:

E[F(ŵ)] = E[F(ŵ^{(i)})] = E[f(ŵ^{(i)}; z_i)]

where the expectation is over z_1, ..., z_n, z'_i. This holds for all i, and so we can also write:

E[F(ŵ)] = (1/n) Σ_{i=1}^n E[f(ŵ^{(i)}; z_i)].    (21)

We also have:

E[F̂(ŵ)] = E[ (1/n) Σ_{i=1}^n f(ŵ; z_i) ] = (1/n) Σ_{i=1}^n E[f(ŵ; z_i)].    (22)

Combining (21) and (22) and using (15) yields:

E[F(ŵ) − F̂(ŵ)] = (1/n) Σ_{i=1}^n E[ f(ŵ^{(i)}; z_i) − f(ŵ; z_i) ] ≤ β.

We also have that E[F(w*)] = E[F̂(w*)] ≥ E[F̂(ŵ)], where the equality is just equating an expectation to an expectation of an average, and the inequality follows from optimality of ŵ. We can therefore conclude:

E[F(ŵ) − F(w*)] ≤ E[F(ŵ) − F̂(ŵ)] ≤ β.    (23)

Using Markov's inequality yields (20). (This proof is a modification of a derivation extracted from the proof of Theorem 12 in [BE02].)

We suspect that the dependence on δ in the above bound can be improved to log(1/δ), matching the dependence on δ in the online-to-batch guarantee (8) and the guarantees for the generalized linear case. For more details, see Appendix A.

5.2 But Without Uniform Convergence!

We now turn to ask whether the convergence of the empirical minimizer in this case is a result of uniform convergence. Consider augmenting the objective function f^{(12)} of Section 4 with a strongly convex term:

f^{(24)}(w; x, α) = f^{(12)}(w; x, α) + (λ/2)‖w‖².    (24)

The modified objective f^{(24)}(·;·) is λ-strongly convex and (1 + λ)-Lipschitz over the domain W = {w : ‖w‖ ≤ 1}, and thus satisfies the conditions of Theorem 6. Consider the same distribution over Z = (X, α) used in Section 4: X = 0 and α is an i.i.d. sequence of uniform zero/one Bernoulli variables. Recall that almost surely we have a coordinate j that is never observed, i.e. such that α_i[j] = 0 for all i. Consider a vector t·e_j of magnitude 0 < t ≤ 1 in the direction of this coordinate. We have that F̂^{(24)}(t·e_j) = (λ/2)t² but F^{(24)}(t·e_j) = (1/2)t + (λ/2)t². Hence F^{(24)}(t·e_j) − F̂^{(24)}(t·e_j) = t/2.
In particular, we can set t = 1 and establish

sup_w ( F^{(24)}(w) − F̂^{(24)}(w) ) ≥ 1/2

regardless of the sample size n. We see then that the empirical averages F̂^{(24)}(w) do not converge uniformly to their expectations, even as the sample size increases.

5.3 Not Even Local Uniform Convergence

For any ε > 0, consider limiting our attention only to predictors that are close to being population optimal:

W_ε = {w ∈ W : F^{(24)}(w) ≤ F^{(24)}(w*) + ε}.

Setting t = ε we have t·e_j ∈ W_ε (focusing for convenience on λ < 1 and ε ≤ 1), and so:

sup_{w ∈ W_ε} ( F^{(24)}(w) − F̂^{(24)}(w) ) ≥ ε/2    (25)

regardless of the sample size. And so, even in an arbitrarily small neighborhood of the optimum, the empirical values F̂^{(24)}(w) do not converge uniformly to their expected values, even as n → ∞. This is in sharp contrast to essentially all other results on stochastic optimization and learning that we are aware of.

5.4 Bounding Population Sub-Optimality in Terms of Empirical Sub-Optimality

A practical question related to uniform convergence is whether we can obtain a uniform bound on the population sub-optimality in terms of the empirical sub-optimality, as in Theorem 2. We first note that, merely due to the fact that the empirical objective F̂ is strongly convex, any approximate empirical minimizer must be close to ŵ, and due to the fact that the expected objective F is Lipschitz-continuous, any vector close to ŵ cannot have a much worse value than ŵ. We therefore have, under the conditions of Theorem 6, that with probability at least 1 − δ, for all w ∈ W:

F(w) − F(w*) ≤ L √( (2/λ)(F̂(w) − F̂(ŵ)) ) + 4L²/(δλn).    (26)

It is important to emphasize that this is an immediate consequence of (14) and does not involve any further stochastic properties of F̂ or F. Although this uniform inequality does allow us to bound the population sub-optimality in terms of the empirical sub-optimality, the empirical sub-optimality must be quadratic in the desired population sub-optimality. Compare this dependence with the more favorable linear dependence of Theorem 2. Unfortunately, as we show next, this is the best that can be ensured. Consider the objective f^{(24)} and the same distribution over Z = (X, α) discussed above, and recall that t·e_j is a vector of magnitude t along a coordinate j s.t. α_i[j] = 0 for all i.
We have that F̂^{(24)}(t·e_j) − F̂^{(24)}(ŵ) = (λ/2)t² (the empirical minimizer here is ŵ = 0), and so, setting t = √(2ε/λ), we get an ε-empirical-suboptimal vector with population sub-optimality

F^{(24)}(t·e_j) − F^{(24)}(0) = (1/2)t + (λ/2)t² = √( ε/(2λ) ) + ε.

This establishes that the dependence on ε in the first term of (26) is tight, and the situation is qualitatively different than in the generalized linear case.

5.5 Contradiction to Vapnik?

At this point, a reader familiar with Vapnik's work on necessary and sufficient conditions for consistency of empirical minimization (i.e. conditions for F(ŵ) → F(w*)) might be confused. In seeking such necessary and sufficient conditions [Vap98, Chapter 3], Vapnik excludes certain consistent settings where the consistency is so-called trivial. The main example of an excluded setting is one in which there is one hypothesis w_0 that dominates all others, i.e. f(w_0; z) < f(w; z) for all w ∈ W and all z ∈ Z [Vap98, Figure 3.2]. When this is the case, empirical minimization will be consistent regardless of the behavior of F̂(w) for w ≠ w_0. In order to exclude such trivial cases Vapnik defines strict (aka non-trivial) consistency of empirical minimization as (in our notation):

inf_{w : F(w) ≥ c} F̂(w)  →^P  inf_{w : F(w) ≥ c} F(w)    (27)

for all c ∈ R, where the convergence is in probability. This condition indeed ensures that F(ŵ) →^P F(w*). Vapnik's Key Theorem on Learning Theory [Vap98, Theorem 3.1] then states that strict consistency of empirical minimization is equivalent to one-sided uniform convergence. One-sided meaning requiring only sup_w (F(w) − F̂(w)) →^P 0, rather than sup_w |F(w) − F̂(w)| →^P 0. Note that the analysis above shows the lack of such one-sided uniform convergence.

In the example presented above, even though Theorem 6 establishes F(ŵ) →^P F(w*), the consistency isn't strict by the definition above. To see this, for any 0 < c < 1/2, consider the vector t·e_j (where α_i[j] = 0 for all i) with t = 2c. We have F^{(24)}(t·e_j) = c + 2λc² > c but F̂^{(24)}(t·e_j) = (λ/2)t² = 2λc². Focusing on λ = 1 we get:

inf_{w : F(w) ≥ c} F̂(w) ≤ 2c² < c    (28)

almost surely, for any sample size, violating the strict consistency requirement (27), since inf_{w : F(w) ≥ c} F(w) = c. The fact that the right-hand side of (28) is strictly greater than F(w*) = 0 is enough for obtaining (non-strict) consistency of empirical minimization, but this is not enough for satisfying strict consistency. We emphasize that stochastic convex optimization is far from trivial in that there is no dominating hypothesis that will always be selected. Although for convenience of analysis we took X = 0, one should think of situations in which X is stochastic with an unknown distribution.
We see then that there is no mathematical contradiction here to Vapnik's Key Theorem. Rather, we see a demonstration that strict consistency is too strict a requirement, and that interesting, non-trivial, learning problems might admit non-strict consistency, which is not equivalent to one-sided uniform convergence. We see that uniform convergence is a sufficient, but not at all necessary, condition for consistency of empirical minimization in non-trivial settings.

6 Regularization

We now return to the case where f(w; z) is Lipschitz (and convex) w.r.t. w, but not strongly convex. As we saw, empirical minimization may fail in this case, despite the guaranteed success of an online approach. Our goal in this section is to underscore a more direct, non-procedural, optimization criterion for stochastic optimization. To do so, we define a regularized empirical minimization problem

ŵ_λ = argmin_{w ∈ W} ( (λ/2)‖w‖² + (1/n) Σ_{i=1}^n f(w; z_i) ),    (29)

where λ is a parameter that will be determined later. The following theorem establishes that the minimizer of (29) is a good solution to the stochastic convex optimization problem:

Theorem 7. Let f : W × Z → R be such that W is bounded by B and f(w; z) is convex and L-Lipschitz with respect to w. Let z_1, ..., z_n be an i.i.d. sample and let ŵ_λ be the minimizer of (29) with λ = √( 16L² / (δB²n) ). Then, with probability at least 1 − δ, we have

F(ŵ_λ) ≤ F(w*) + 4LB/√(δn) + 32LB/(δn)^{3/2}.

Proof. Let r(w; z) = (λ/2)‖w‖² + f(w; z) and let R(w) = E_z[r(w; z)]. Note that ŵ_λ is the empirical minimizer for the stochastic optimization problem defined by r(w; z). We apply Theorem 6 to r(w; z); to this end, note that since f is L-Lipschitz and ‖w‖ ≤ B for w ∈ W, r is in fact (L + λB)-Lipschitz, as well as λ-strongly convex. Applying Theorem 6 we see that, with probability at least 1 − δ,

(λ/2)‖ŵ_λ‖² + F(ŵ_λ) = R(ŵ_λ) ≤ inf_w R(w) + 4(L + λB)²/(δλn) ≤ (λ/2)‖w*‖² + F(w*) + 4(L + λB)²/(δλn).

Now note that ‖w*‖ ≤ B, and so we get that

F(ŵ_λ) ≤ F(w*) + (λ/2)B² + 4(L + λB)²/(δλn) ≤ F(w*) + (λ/2)B² + 8L²/(δλn) + 8λB²/(δn).

Plugging in the value of λ given in the theorem statement, we see that

F(ŵ_λ) ≤ F(w*) + 4LB √(1/(δn)) + 32LB (1/(δn))^{3/2}.

This gives us the required bound.

From the above theorem and the discussion in Section 4 we see that regularization is essential for convex stochastic optimization. It is interesting to contrast this with the online learning algorithm of Zinkevich [Z03]. Seemingly, the online approach of Zinkevich does not rely on regularization.
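To illustrate the effect behind Theorem 7 (our sketch, not from the paper), we can compare the unregularized and regularized empirical objectives of the Section 4 example at the population optimum w = 0 and at a "bad" empirical minimizer e_j supported on a never-observed coordinate; the regularization weight λ = 0.1 here is an arbitrary illustrative choice rather than the theorem's prescribed value:

```python
import numpy as np

# On the Section 4 example (X = 0), compare the unregularized empirical
# objective and the regularized objective (29) at two candidates: the
# population optimum w = 0 and e_j on a never-observed coordinate.
rng = np.random.default_rng(3)
n = 12
d = 2 ** 16
alphas = rng.integers(0, 2, size=(n, d)).astype(float)
j = np.flatnonzero(alphas.sum(axis=0) == 0)[0]   # never-observed coordinate
e_j = np.zeros(d)
e_j[j] = 1.0
lam = 0.1                                        # illustrative, not the tuned value

def F_hat(w):      # unregularized empirical objective (1/n) sum_i ||alpha_i * w||
    return np.mean([np.linalg.norm(a * w) for a in alphas])

def F_hat_reg(w):  # regularized objective (29)
    return F_hat(w) + 0.5 * lam * np.linalg.norm(w) ** 2

zero = np.zeros(d)
print(F_hat(e_j), F_hat(zero))          # 0.0 0.0: e_j ties with 0 empirically
print(F_hat_reg(e_j), F_hat_reg(zero))  # 0.05 0.0: the regularizer breaks the tie
```

Without the regularizer, e_j and 0 are both empirical minimizers even though F(e_j) = 1/2; the regularizer breaks the tie in favor of the low-norm solution, in line with Theorem 7.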
However, a more careful look reveals a uderlyg regularzato also the ole techque. Ideed, Shalev- Shwartz [Sha07] showed that Zkevch s ole learg

Figure 1: Lipschitz-continuous convex problems (triangle) are all learnable, but not necessarily using empirical minimization. Lipschitz-continuous strongly convex problems (dotted rectangle) are all learnable with empirical minimization, but uniform convergence might not hold. For bounded generalized linear problems (starred rectangle), uniform convergence always holds. Our two separating examples are also indicated.

Figure 2: Relationship between different properties of stochastic optimization problems.

... algorithm can be viewed as approximate coordinate ascent optimization of the dual of the regularized problem (9). Furthermore, it is also possible to obtain the same online regret bound using a Follow-The-Regularized-Leader approach, which at each iteration directly solves the regularized minimization problem (9) on z_1, ..., z_{i-1}. The key, then, seems to be regularization, rather than a procedural online-versus-global-minimization approach.

6.1 Regularization vs Constraints

The role of regularization here is very different than in familiar settings such as l_2 regularization in SVMs and l_1 regularization in LASSO. In those settings, regularization serves to constrain our domain to a low-complexity domain (e.g. low-norm predictors), where we rely on uniform convergence. In fact, almost all learning guarantees for such settings that we are aware of can be expressed in terms of some sort of uniform convergence. And as we mentioned, learnability (under the standard supervised learning model) is in fact equivalent to a uniform convergence property.

In our case, constraining the norm of w does not ensure uniform convergence. Consider the example f^(1) of Section 4. Even over a restricted domain W_r = {w : ‖w‖ ≤ r}, for arbitrarily small r > 0, the empirical averages F̂(w) do not uniformly converge to F(w), and Pr(limsup_n sup_{w ∈ W_r} |F̂(w) − F(w)| > 0) = 1. Furthermore, consider replacing the regularization term λ‖w‖² with a constraint on the norm of w, namely, solving the problem

    ŵ_r = argmin_{‖w‖ ≤ r} F̂(w)    (30)

As we show below, we cannot solve the stochastic optimization problem by setting r in a distribution-independent way (i.e. without knowing the solution). To see this, note that when X = 0 a.s. we must have r → 0 to ensure F(ŵ_r) → min_w F(w). However, if X = e_1 a.s., we must set r ≥ 1. No constraint will work for all distributions over Z = (X, α)! This sharply contrasts with traditional uses of regularization, where learning guarantees are actually typically stated in terms of a constraint on the norm rather than in terms of a parameter such as λ, and adding a regularization term of the form λ‖w‖² is viewed as a proxy for bounding the norm ‖w‖.

7 Summary

Following the work of Zinkevich [Zin03], we expected to be able to generalize well-established results on stochastic optimization of linear functions also to the more general Lipschitz-convex case. We discovered a complex and unexpected situation, in which strong convexity and regularization play a key role, and ultimately did reach an understanding of stochastic convex optimization that does not rely on online techniques (Figure 1).

For stochastic objectives that arise from supervised prediction problems, it is well known that learnability, i.e. solvability of the stochastic optimization problem, is equivalent to uniform convergence, and so whenever the problem is learnable, it is learnable using empirical minimization [ABCH97]. Many might think that this principle, namely that a problem is learnable iff it is learnable using empirical minimization, extends also to the General Setting of Learning [Vap95], which includes the stochastic convex optimization problem studied here. However, we demonstrated stochastic optimization problems in which these equivalences do not hold. There is no contradiction, since stochastic optimization problems that arise from supervised learning have a restricted structure, and in particular the examples we study are not among such problems. In fact, for reasonable loss functions, in order to make f(w; x, y) = l(pred(w, x), y) convex for both positive and negative labels, we must essentially make the prediction function pred(w, x) both convex and concave in w, i.e. linear. And so stochastic (or online) convex optimization problems that correspond to supervised problems are often generalized linear problems.

To summarize, although there is no contradiction to the work of Vapnik [Vap95] or of Alon et al. [ABCH97], we see that learning in the General Setting is more complex than we perhaps appreciate. Empirical minimization might be consistent without local uniform convergence, and more surprisingly, learning might be possible, but not by empirical minimization (Figure 2).

Acknowledgments

We would like to thank Leon Bottou, Tong Zhang, and Vladimir Vapnik for helpful discussions.

References

[ABCH97] N. Alon, S. Ben-David, N. Cesa-Bianchi, and D. Haussler. Scale-sensitive dimensions, uniform convergence, and learnability. J. ACM, 44(4):615–631, 1997.
[BE02] O. Bousquet and A. Elisseeff. Stability and generalization. J. Mach. Learn. Res., 2:499–526, 2002.
[CCG04] N. Cesa-Bianchi, A. Conconi, and C. Gentile. On the generalization ability of on-line learning algorithms. IEEE Transactions on Information Theory, 50(9):2050–2057, September 2004.
[HKKA06] E. Hazan, A. Kalai, S. Kale, and A. Agarwal. Logarithmic regret algorithms for online convex optimization. In Proceedings of the Nineteenth Annual Conference on Computational Learning Theory, 2006.
[HKLW91] David Haussler, Michael Kearns, Nick Littlestone, and Manfred K. Warmuth. Equivalence of models for polynomial learnability. Information and Computation, 95(2):129–161, December 1991.
[KT08] S. M. Kakade and A. Tewari. On the generalization ability of online strongly convex programming algorithms. In NIPS, 2008.
[Pol84] D. Pollard. Convergence of Stochastic Processes. Springer, New York, 1984.
[RMP05] S. Rakhlin, S. Mukherjee, and T. Poggio. Stability results in learning theory. Analysis and Applications, 3(4), 2005.
[Sha07] S. Shalev-Shwartz. Online Learning: Theory, Algorithms, and Applications. PhD thesis, The Hebrew University, 2007.
[SSS08] K. Sridharan, N. Srebro, and S. Shalev-Shwartz. Fast rates for regularized objectives. In Advances in Neural Information Processing Systems, 2008.
[Vap95] V. N. Vapnik. The Nature of Statistical Learning Theory. Springer, 1995.
[Vap98] V. N. Vapnik. Statistical Learning Theory.
Wiley, 1998.
[VG05] J.-L. Verger-Gaugry. Covering a ball with smaller equal balls in R^n. Discrete Comput. Geom., 33(1), 2005.
[vLB04] U. von Luxburg and O. Bousquet. Distance based classification with Lipschitz functions. J. Mach. Learn. Res., 5:669–695, 2004.
[Zin03] M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the Twentieth International Conference on Machine Learning, 2003.

A High Confidence Bounds

The bounds in Theorems 6 and 7 have polynomial rather than logarithmic dependence on the confidence parameter δ. This leads to the question of whether these bounds can be improved to depend just on log(1/δ), matching the dependence on δ in the online-to-batch guarantees (7) and (8). While we suspect this might be the case, the question remains open.

We emphasize that the question here pertains to the bound on the convergence of the empirical minimizer. The online-to-batch guarantees apply only to a specific procedurally-defined predictor, which is not the empirical minimizer. Another simple way to achieve a logarithmic dependence on 1/δ is to use empirical minimization combined with a generic boosting-the-confidence method [HKLW91]. Again, this leads to a high-confidence bound for a different learning rule, one based on the empirical minimizer, but which is not the empirical minimizer itself.

As for results regarding the empirical minimizer itself, we note that it is possible to get high-confidence bounds with only a logarithmic dependence on 1/δ. However, these bounds come at the price of worse dependence on the other parameters of the learning problem. For instance, if F(w) is twice continuously differentiable, with a uniform upper bound μ_max on the eigenvalues of its Hessian, and the conditions of Theorem 6 hold, we get that with probability at least 1 − δ:

    F(ŵ) − F(w*) ≤ O( μ_max L² log(1/δ) / (λ² n) )    (31)

Also, under the conditions of Theorem 6 and without any additional assumption, Bousquet and Elisseeff [BE02] provide arguments for a bound of the form

    F(ŵ) − F(w*) ≤ O( (L²/λ) √(log(1/δ)/n) )    (32)

Unfortunately, neither of these two bounds is sufficient for obtaining a version of Theorem 7 which matches the online-to-batch guarantee (8) or the bound of Theorem 1 for the generalized linear case.
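The generic boosting-the-confidence idea mentioned above can be illustrated with a toy sketch. This is only a schematic rendering, not the construction of [HKLW91]: the one-dimensional squared loss f(w; z) = (w − z)², the split into k ≈ log(1/δ) chunks, and selection on a held-out half are all illustrative choices.

```python
import math
import random

def erm_mean(sample):
    # For the toy squared loss f(w; z) = (w - z)^2, ERM is the sample mean.
    return sum(sample) / len(sample)

def boost_confidence(sample, delta):
    """Boosting-the-confidence sketch: run ERM on k independent chunks,
    then return the candidate with the smallest empirical risk on a
    held-out validation half. Each chunk succeeds with constant
    probability, so the failure probability decays exponentially in k,
    giving a log(1/delta) dependence on the confidence parameter."""
    sample = list(sample)
    random.shuffle(sample)
    half = len(sample) // 2
    train, valid = sample[:half], sample[half:]
    k = max(1, math.ceil(math.log(2.0 / delta, 2)))  # number of chunks
    size = max(1, len(train) // k)
    candidates = [erm_mean(train[i * size:(i + 1) * size]) for i in range(k)]
    emp_risk = lambda w: sum((w - z) ** 2 for z in valid) / len(valid)
    return min(candidates, key=emp_risk)

random.seed(0)
data = [random.gauss(0.5, 1.0) for _ in range(2000)]  # Z ~ N(0.5, 1)
w_hat = boost_confidence(data, delta=1e-3)
print(abs(w_hat - 0.5))  # distance to the true minimizer of E[(w - Z)^2]
```

The returned rule is built from empirical minimizers but, as the text notes, is itself a different learning rule than the empirical minimizer on the full sample.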
Optimizing for the value of λ as a function of the sample size n, we get that the bound on the unregularized objective function in Theorem 7 is replaced by

    F(ŵ) − F(w*) ≤ O( (B⁴ L² μ_max log(1/δ) / n)^{1/3} )

if we use (31), or

    F(ŵ) − F(w*) ≤ O( (B⁴ L⁴ log(1/δ) / n)^{1/4} )

if we use (32). In particular, the dependence on the sample size n is significantly worse than 1/√n.
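The λ trade-off behind these rates can be sanity-checked numerically. The following minimal sketch is not the paper's derivation; B and c are placeholder constants standing in for the norm bound and the L², μ_max, log(1/δ) factors. Minimizing λB² + c/(λ²n) over λ gives a value scaling as n^{-1/3}, while minimizing λB² + (c/λ)/√n gives n^{-1/4}:

```python
import math

def best_bound(n, trade):
    """Minimize the trade-off curve over a dense grid of lambda values."""
    lams = [10 ** (j / 50.0) for j in range(-400, 100)]  # lambda in [1e-8, ~1e2]
    return min(trade(lam, n) for lam in lams)

B, c = 1.0, 1.0  # placeholder constants

# Shape of bound (31): lam * B^2 + c / (lam^2 * n)    -> optimum scales as n^(-1/3)
b31 = lambda lam, n: lam * B ** 2 + c / (lam ** 2 * n)
# Shape of bound (32): lam * B^2 + (c / lam) / sqrt(n) -> optimum scales as n^(-1/4)
b32 = lambda lam, n: lam * B ** 2 + (c / lam) / math.sqrt(n)

n1, n2 = 10 ** 3, 10 ** 6
r31 = best_bound(n1, b31) / best_bound(n2, b31)  # should be close to 1000**(1/3) = 10
r32 = best_bound(n1, b32) / best_bound(n2, b32)  # should be close to 1000**(1/4) ~ 5.6
print(r31, r32)
```

Growing n by a factor of 1000 shrinks the first bound by about 10x and the second by only about 5.6x, both visibly slower than the ~31.6x a 1/√n rate would give.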


More information

Complete Convergence and Some Maximal Inequalities for Weighted Sums of Random Variables

Complete Convergence and Some Maximal Inequalities for Weighted Sums of Random Variables Joural of Sceces, Islamc Republc of Ira 8(4): -6 (007) Uversty of Tehra, ISSN 06-04 http://sceces.ut.ac.r Complete Covergece ad Some Maxmal Iequaltes for Weghted Sums of Radom Varables M. Am,,* H.R. Nl

More information

MULTIDIMENSIONAL HETEROGENEOUS VARIABLE PREDICTION BASED ON EXPERTS STATEMENTS. Gennadiy Lbov, Maxim Gerasimov

MULTIDIMENSIONAL HETEROGENEOUS VARIABLE PREDICTION BASED ON EXPERTS STATEMENTS. Gennadiy Lbov, Maxim Gerasimov Iteratoal Boo Seres "Iformato Scece ad Computg" 97 MULTIIMNSIONAL HTROGNOUS VARIABL PRICTION BAS ON PRTS STATMNTS Geady Lbov Maxm Gerasmov Abstract: I the wors [ ] we proposed a approach of formg a cosesus

More information

Objectives of Multiple Regression

Objectives of Multiple Regression Obectves of Multple Regresso Establsh the lear equato that best predcts values of a depedet varable Y usg more tha oe eplaator varable from a large set of potetal predctors {,,... k }. Fd that subset of

More information

Binary classification: Support Vector Machines

Binary classification: Support Vector Machines CS 57 Itroducto to AI Lecture 6 Bar classfcato: Support Vector Maches Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 57 Itro to AI Supervsed learg Data: D { D, D,.., D} a set of eamples D, (,,,,,

More information

Solving Constrained Flow-Shop Scheduling. Problems with Three Machines

Solving Constrained Flow-Shop Scheduling. Problems with Three Machines It J Cotemp Math Sceces, Vol 5, 2010, o 19, 921-929 Solvg Costraed Flow-Shop Schedulg Problems wth Three Maches P Pada ad P Rajedra Departmet of Mathematcs, School of Advaced Sceces, VIT Uversty, Vellore-632

More information

Chapter 8: Statistical Analysis of Simulated Data

Chapter 8: Statistical Analysis of Simulated Data Marquette Uversty MSCS600 Chapter 8: Statstcal Aalyss of Smulated Data Dael B. Rowe, Ph.D. Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 08 by Marquette Uversty MSCS600 Ageda 8. The Sample

More information

CS 1675 Introduction to Machine Learning Lecture 12 Support vector machines

CS 1675 Introduction to Machine Learning Lecture 12 Support vector machines CS 675 Itroducto to Mache Learg Lecture Support vector maches Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square Mdterm eam October 9, 7 I-class eam Closed book Stud materal: Lecture otes Correspodg chapters

More information

= lim. (x 1 x 2... x n ) 1 n. = log. x i. = M, n

= lim. (x 1 x 2... x n ) 1 n. = log. x i. = M, n .. Soluto of Problem. M s obvously cotuous o ], [ ad ], [. Observe that M x,..., x ) M x,..., x ) )..) We ext show that M s odecreasg o ], [. Of course.) mles that M s odecreasg o ], [ as well. To show

More information

Aitken delta-squared generalized Juncgk-type iterative procedure

Aitken delta-squared generalized Juncgk-type iterative procedure Atke delta-squared geeralzed Jucgk-type teratve procedure M. De la Se Isttute of Research ad Developmet of Processes. Uversty of Basque Coutry Campus of Leoa (Bzkaa) PO Box. 644- Blbao, 488- Blbao. SPAIN

More information

A conic cutting surface method for linear-quadraticsemidefinite

A conic cutting surface method for linear-quadraticsemidefinite A coc cuttg surface method for lear-quadratcsemdefte programmg Mohammad R. Osoorouch Calfora State Uversty Sa Marcos Sa Marcos, CA Jot wor wth Joh E. Mtchell RPI July 3, 2008 Outle: Secod-order coe: defto

More information

5 Short Proofs of Simplified Stirling s Approximation

5 Short Proofs of Simplified Stirling s Approximation 5 Short Proofs of Smplfed Strlg s Approxmato Ofr Gorodetsky, drtymaths.wordpress.com Jue, 20 0 Itroducto Strlg s approxmato s the followg (somewhat surprsg) approxmato of the factoral,, usg elemetary fuctos:

More information