Distributed Accelerated Proximal Coordinate Gradient Methods


Yong Ren, Jun Zhu (corresponding author)
Center for Bio-Inspired Computing Research, State Key Lab for Intell. Tech. & Systems, Dept. of Comp. Sci. & Tech., TNList Lab, Tsinghua University

Abstract

We develop a general accelerated proximal coordinate descent algorithm for distributed settings (DisAPCG) for the optimization problem that minimizes the sum of two convex functions: the first part f is smooth with a gradient oracle, and the other one Ψ is separable with respect to blocks of coordinates and has a simple known structure (e.g., the L1 norm). Our algorithm achieves a new accelerated convergence rate in the case that f is strongly convex by making use of modern parallel structures, and includes the previous non-strongly convex case as a special case. We further present efficient implementations that avoid full-dimensional operations in each step, significantly reducing the computation cost. Experiments on the regularized empirical risk minimization problem demonstrate the effectiveness of our algorithm and match our theoretical findings.

1 Introduction

We consider the following optimization problem with a composite objective function:

    min_{x ∈ R^N} F(x) := f(x) + Ψ(x),    (1)

where f and Ψ are proper and lower semi-continuous convex functions. We further assume that f is differentiable on R^N, and that Ψ has a simple blockwise separable structure Ψ(x) = Σᵢ Ψᵢ(xᵢ), where xᵢ is the i-th block of x with cardinality Nᵢ. This problem is ubiquitous in machine learning, where f(x) denotes the loss and Ψ represents constraints or regularizations. Given a set D of i.i.d. data, a typical loss function f has the form

    f(x) = Σ_{Aⱼ ∈ D} l(x; Aⱼ),    (2)

where l is some smoothed loss function such as the smoothed hinge loss [Lin et al., 2014]. The choice of Ψ depends on the requirements of the problem at hand, such as bound constraints (e.g., Ψᵢ(x) = 0 for x ∈ [0, 1] and ∞ otherwise) or regularizations for a special purpose (e.g., the L1-regularizer for sparsity). f(x) can be strongly convex or not. Strong convexity usually implies a faster (linear) convergence rate and hence is of interest in many cases (e.g., ridge regression).
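To make the composite structure in (1) concrete: the separability of Ψ is what allows cheap proximal steps. For the L1 regularizer the proximal map is coordinate-wise soft-thresholding. A minimal sketch in Python (the function names and the toy quadratic are our own illustration, not from the paper):

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1, applied coordinate-wise."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def prox_gradient_step(x, grad, step, lam):
    """One proximal gradient step for F(x) = f(x) + lam * ||x||_1."""
    return soft_threshold(x - step * grad, step * lam)

# Toy example: f(x) = 0.5 * ||x - b||^2, so grad f(x) = x - b and L = 1.
b = np.array([3.0, -0.2, 0.0, 1.5])
x = np.zeros_like(b)
for _ in range(100):
    x = prox_gradient_step(x, x - b, step=1.0, lam=0.5)
# Fixed point is soft_threshold(b, 0.5): small entries are zeroed out.
print(x.tolist())  # -> [2.5, -0.0, 0.0, 1.0]
```

Entries of b below the threshold step*lam are zeroed, illustrating the sparsity-inducing role of the L1 term in Ψ.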
We aim to develop a general accelerated coordinate descent method to solve problem (1) in distributed settings, and to accelerate the convergence rate by making use of the parallel structure.

1.1 Related Work and Motivation

Coordinate descent (CD) methods are popular optimization algorithms for problems that can be broken down into small pieces, since they can usually exploit special structures underlying the problems. At each iteration t, the basic CD method chooses one block of coordinates x_{i_t} to sufficiently reduce the objective value while keeping the other blocks fixed. There are two common strategies for choosing such a block: the cyclic scheme and the randomized scheme, where the former chooses i_t in a cyclic fashion (i.e., i_{t+1} = i_t mod n + 1), while the latter chooses i_t uniformly or via some distribution. The randomized scheme is more common since it enjoys both theoretical and practical benefits [Wright, 2015]. Due to their superior convergence property, CD algorithms have been widely used in many practical problems, especially in regimes that need relatively high-accuracy solutions [Wright, 2015].

To improve the performance of the basic CD algorithm, many variants have been proposed. Among them, Nesterov's acceleration technique [Nesterov, 1983], which is proven to be an optimal first-order (gradient) method in convergence rate, is an important line. Specifically, Nesterov [2012] developed an accelerated randomized CD method for minimizing unconstrained smooth functions (i.e., with Ψ(x) = 0), which was the first to apply acceleration techniques to CD methods. Later on, Lu and Xiao [2013] gave an improved version with a sharper convergence analysis. For the more general problem (1), Fercoq and Richtárik [2015] proposed the APPROX algorithm for solving functions f(x) without strong convexity with an accelerated sublinear convergence rate, and Lin et al. [2014] proposed a general APCG algorithm with an accelerated convergence rate for f(x) either with strong convexity or without. Another important line is to utilize parallel computing architectures to accelerate CD algorithms.
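The randomized scheme discussed above is easy to state in code. The following toy sketch (ours, with Ψ = 0 and scalar blocks) minimizes a smooth quadratic by uniformly sampling one coordinate per iteration and taking a 1/L_i partial gradient step:

```python
import numpy as np

rng = np.random.default_rng(0)

# f(x) = 0.5 * x^T A x - b^T x with A positive definite.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
L = np.diag(A)            # coordinate-wise Lipschitz constants of the gradient

x = np.zeros(2)
for _ in range(2000):
    i = rng.integers(2)               # randomized scheme: uniform coordinate
    g_i = A[i] @ x - b[i]             # partial gradient of f at x
    x[i] -= g_i / L[i]                # exact minimization along coordinate i
print(x)  # close to the solution of A x = b, i.e. [0.2, 0.4]
```

Each iteration touches a single coordinate, which is what makes the method cheap per step and easy to distribute.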
Here we focus on the synchronous case, where multiple blocks of coordinates are sampled and gradients are then computed in each iteration, running in parallel on multiple processors. After that, a barrier synchronization step is set to update the coordinates; see a series of recent works in parallel and distributed settings [Bradley et al., 2011; Necoara and Clipici, 2013; Richtárik and Takáč, 2013a]. To meet the requirements of the big-data challenge, combining the acceleration technique with advanced parallel computing architectures is a natural way to leverage their respective advantages. For the general problem (1), the APPROX algorithm is parallel in nature but deals with f(x) without strong convexity. For a special case of (1), the empirical risk minimization (ERM) problem, Shalev-Shwartz and Zhang [2014] proposed an inner-outer loop optimization scheme that obtains an accelerated convergence rate under a strong convexity assumption, where the inner loop is easily parallelized using existing techniques [Takáč and Richtárik, 2015]. However, a parallel algorithm for problem (1) in the strongly convex case is still unexplored.

In this paper, we propose a new distributed accelerated coordinate gradient descent method (DisAPCG) to solve the general form (1) for f(x) with or without strong convexity. Our algorithm lies in the line of Nesterov acceleration methods, with carefully modified Nesterov's sequences that adapt to distributed settings. Our algorithm includes APPROX as a special case when the function does not have strong convexity, where a sublinear convergence rate is obtained. In the strongly convex case, we obtain an accelerated linear convergence rate, thanks to the parallel structure. Furthermore, we propose an efficient implementation that avoids full-dimensional vector operations, which reduces the updating cost in each iteration. For practical use, we apply our algorithm to the ERM problem and find a significant improvement compared with several other distributed CD solvers.

1.2 Outline of The Paper

In Section 2, we introduce the notations and assumptions, present the general DisAPCG method, analyze its convergence rate with or without strong convexity, and show the gain from leveraging parallel structures. In Section 3, we present an equivalent version of the general algorithm that avoids full-dimensional manipulation in many cases. In Section 4, we apply our DisAPCG method to the widely studied ERM problem.
Besides, we use experiments on a smoothed version of SVM problems to demonstrate the effectiveness of our algorithm. Finally, we conclude in Section 5.

2 The DisAPCG Method

2.1 Notations, Assumptions and Settings

For an N-dimensional vector x ∈ R^N, let Ω denote a partition of the coordinates into n blocks with N = Σᵢ Nᵢ. Let U = [U₁, ..., Uₙ] be an N × N permutation matrix with Uᵢ ∈ R^{N×Nᵢ}. Then we have x = Σᵢ Uᵢxᵢ and xᵢ = Uᵢᵀx, i ∈ [n]. Without loss of generality, we assume that U is the identity matrix. Our distributed setting is similar to [Richtárik and Takáč, 2013a]. Suppose that the cluster consists of K nodes and the n blocks of coordinates (features) are uniformly distributed over the nodes (i.e., each node keeps and updates n/K blocks of coordinates). For simplicity, we assume that n is divisible by K. We denote by Sₖ ⊆ Ω the collection of blocks of coordinates that are distributed to the k-th node, so that ∪_{k=1}^{K} Sₖ = Ω. Meanwhile, all the data describing the features in Sₖ are stored on the k-th node. Following [Lin et al., 2014], for any x ∈ R^N, the partial gradient of f with respect to xᵢ is defined as ∇ᵢf(x) = Uᵢᵀ∇f(x), i ∈ [n], where [n] := {1, ..., n} denotes the set of integers from 1 to n. We make the following assumptions, which are commonly used in the literature on coordinate descent methods [Fercoq and Richtárik, 2015].

Assumption 1. The gradient of f(x) is block-wise Lipschitz continuous with constants Lᵢ:

    ‖∇ᵢf(x + Uᵢhᵢ) − ∇ᵢf(x)‖ ≤ Lᵢ‖hᵢ‖,  ∀x ∈ R^N, hᵢ ∈ R^{Nᵢ}.

For convenience, we denote by ‖x‖_L = (Σᵢ Lᵢ‖xᵢ‖²)^{1/2} the L2-norm blockwisely weighted by the coefficients Lᵢ.

Assumption 2. There exists μ ≥ 0 such that for all y ∈ R^N and x ∈ dom(Ψ):

    f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ + (μ/2)‖y − x‖²_L.

An immediate consequence of Assumption 1 is

    f(x + Uᵢhᵢ) ≤ f(x) + ⟨∇ᵢf(x), hᵢ⟩ + (Lᵢ/2)‖hᵢ‖²,  ∀hᵢ ∈ R^{Nᵢ},

which bounds the variation of f(x) when a single block changes. In distributed settings, as more than one block varies in each iteration, we need the following lemma to bound the total variation of the function f(x).

Lemma 1 ([Richtárik and Takáč, 2013b]). Assume that f satisfies Assumption 1. For all x, h ∈ R^N, we have

    f(x + h) ≤ f(x) + ⟨∇f(x), h⟩ + (|Supp(h)|/2)‖h‖²_L,

where Supp(h) := {i ∈ [n] : hᵢ ≠ 0} is the set of blocks that are not equal to zero. The above bound is tight in general.
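Lemma 1 can be sanity-checked numerically for a quadratic f, where f(x + h) expands exactly. The following check (our own, with blocks of size 1 so that Lᵢ is the i-th diagonal entry of the Hessian) verifies the bound for a random h supported on 3 blocks:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8                                   # number of (scalar) blocks

# f(x) = 0.5 * x^T Q x with Q positive semidefinite.
M = rng.standard_normal((n, n))
Q = M.T @ M
f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x
L = np.diag(Q)                          # block-wise Lipschitz constants (blocks of size 1)

x = rng.standard_normal(n)
h = rng.standard_normal(n)
h[rng.choice(n, size=n - 3, replace=False)] = 0.0   # keep only 3 nonzero blocks
supp = np.count_nonzero(h)

lhs = f(x + h)
rhs = f(x) + grad(x) @ h + 0.5 * supp * np.sum(L * h**2)
print(lhs <= rhs + 1e-12)  # -> True: the Lemma 1 bound holds
```

For this f the bound reduces to hᵀQh ≤ |Supp(h)| Σᵢ Qᵢᵢhᵢ², which follows from Cauchy-Schwarz since |Qᵢⱼ| ≤ √(Qᵢᵢ Qⱼⱼ) for positive semidefinite Q.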
Suppose that in distributed settings we alter κ blocks in each iteration, and define the weights (L_κ)ᵢ = κLᵢ. Then the bound gives that for x, y ∈ R^N,

    f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ + (μ_κ/2)‖y − x‖²_{L_κ},    (3)

and Assumption 1 implies that

    f(x + h) ≤ f(x) + ⟨∇f(x), h⟩ + (1/2)‖h‖²_{L_κ},    (4)

where μ_κ = μ/κ.

2.2 The DisAPCG Algorithm

The general DisAPCG algorithm is summarized in Alg. 1. We start K processors and each processor k runs Alg. 1 simultaneously. The algorithm maintains a non-increasing step size α_t and an intermediate variable y⁽ᵗ⁾, similarly to the original Nesterov's method. At iteration t, each node samples multiple blocks of coordinates and computes the intermediate variable z⁽ᵗ⁾ using a proximal gradient step, as in step 4. Finally, the coordinates are updated in a synchronized manner in step 5. We make more comments on several key steps as follows.

Step 1: S_k⁽ᵗ⁾ collects the indices of the coordinates to be updated by node k. The total number of updated coordinates in one iteration is κ = τK, where τ is the mini-batch size that every node samples at each iteration. Note that this sampling procedure is not equivalent to uniformly sampling κ blocks of coordinates out of all n, since the blocks are stored locally. Our algorithm is essentially of a mini-batch nature. When K = 1, we can merge several small blocks into a larger one and use [Lin et al., 2014] directly (i.e., uniformly sample one large block at each iteration), and hence the mini-batch analysis is not necessary. However, such merging cannot be done in the distributed settings, since every node needs to sample its own local data, not the data as a whole.

Step 2: The original Nesterov's sequence is computed as n²α_t² = (1 − α_t)γ_t + α_tμ. Our new sequence accounts for the influence introduced by multi-block alteration in one iteration. Similar properties hold as for the original one.

Step 3: The Reduce operator is similar to MPI::AllReduce, where y⁽ᵗ⁾ is first computed by gathering coordinates from all nodes and then broadcast to all nodes. This is a crucial step since y⁽ᵗ⁾ is of size O(N) and hence introduces most of the communication cost. y⁽ᵗ⁾ is used to compute the gradient; however, one can further reduce the communication cost by exploiting the special structure of f(x), as we shall see.

Step 5: The update step involves full-dimensional operations since z⁽ᵗ⁺¹⁾, z⁽ᵗ⁾ and y⁽ᵗ⁾ are dense in general, and a similar problem exists for the computation of y⁽ᵗ⁾. This issue will be dealt with by proposing an efficient and equivalent algorithm in Section 3.

2.3 Convergence Analysis

We analyze the convergence rate of Alg. 1. The main theorem is as follows. We defer the full proof to the appendix.

Theorem 1. Suppose that Assumptions 1 and 2 hold. Let F* be the optimal value of problem (1) and {x⁽ᵗ⁾} be the sequence generated by the DisAPCG method. Then for any t ≥ 0, the following holds:

    E[F(x⁽ᵗ⁾)] − F* ≤ min{ (1 − √(κμ)/n)ᵗ, (2n/(2n + tκ√γ₀))² } ( F(x⁽⁰⁾) − F* + (γ₀κ/2)R₀² ),    (7)

where R₀ := min_{x* ∈ X*} ‖x⁽⁰⁾ − x*‖_L, and X* is the set of optimal solutions of problem (1).
The two terms (1 − √(κμ)/n)ᵗ and (2n/(2n + tκ√γ₀))² correspond to the strongly convex and non-strongly convex cases respectively. For the non-strongly convex case, a slight change of the proof can remove the κ in the last term of (7), and we then recover the convergence rate of [Fercoq and Richtárik, 2015] as a special case. For the strongly convex case, we get an accelerated linear convergence rate, as the following corollary shows.

Algorithm 1 The DisAPCG algorithm.
Input: x⁽⁰⁾ ∈ dom(Ψ) and convexity parameter μ_κ ≥ 0.
Initialize: set z⁽⁰⁾ = x⁽⁰⁾ and choose 0 < γ₀ ∈ [μ_κ, 1]. Start K nodes with their corresponding data and coordinates.
for t = 0, 1, 2, 3, ... do:
  1. Uniformly sample τ blocks of coordinates S_k⁽ᵗ⁾.
  2. Compute α_t ∈ (0, κ/n] using the relation
         (n²/κ²)α_t² = (1 − α_t)γ_t + α_tμ_κ,
     and set γ_{t+1} = (1 − α_t)γ_t + α_tμ_κ, β_t = α_tμ_κ/γ_{t+1}.
  3. Reduce:
         y⁽ᵗ⁾ = (α_tγ_t z⁽ᵗ⁾ + γ_{t+1} x⁽ᵗ⁾) / (α_tγ_t + γ_{t+1})    (5)
  4. For all indices i ∈ S_k⁽ᵗ⁾, compute z_i⁽ᵗ⁺¹⁾ as
         z_i⁽ᵗ⁺¹⁾ = argmin_{x ∈ R^{Nᵢ}} { (nα_tLᵢ/2)‖x − (1 − β_t)z_i⁽ᵗ⁾ − β_t y_i⁽ᵗ⁾‖² + ⟨∇ᵢf(y⁽ᵗ⁾), x⟩ + Ψᵢ(x) },
     and for all indices i ∉ S_k⁽ᵗ⁾, compute z_i⁽ᵗ⁺¹⁾ as
         z_i⁽ᵗ⁺¹⁾ = (1 − β_t)z_i⁽ᵗ⁾ + β_t y_i⁽ᵗ⁾.
  5. Set
         x⁽ᵗ⁺¹⁾ = y⁽ᵗ⁾ + (n/κ)α_t(z⁽ᵗ⁺¹⁾ − z⁽ᵗ⁾) + (κμ_κ/n)(z⁽ᵗ⁾ − y⁽ᵗ⁾)    (6)
end for

Corollary 1. Suppose that the same conditions as in Theorem 1 hold and further assume that f is μ-strongly convex. In order to obtain E[F(x⁽ᵗ⁾)] − F* ≤ ϵ, it suffices to have the iteration number t satisfy

    t ≥ (n/√(κμ)) log((C + Dκ)/ϵ),

where C = F(x⁽⁰⁾) − F* and D = γ₀R₀²/2.

The previous single-machine version APCG provided in [Lin et al., 2014] needs O(n/√μ) iterations to achieve ϵ accuracy, while our DisAPCG further accelerates this rate by a factor √κ, omitting the log term. It remains an open problem whether we can accelerate this rate by κ.

Remark 1. We have assumed that the coordinates (and their corresponding data) are uniformly distributed over the nodes. However, the computation power of each node may differ in practice, in which case stragglers may slow down the barrier synchronization. To avoid this, nodes can store different amounts of coordinates, depending on their computation power, and consequently each node k has its own mini-batch size τ_k. The above theorem still holds as long as each block is sampled with equal probability.
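The step-size relation in step 2 has the fixed point α_t = β_t = √(κμ)/n when γ₀ = μ_κ, which is exactly the concise choice made in Section 3. A small numerical check of this claim (assuming the relation (n²/κ²)α_t² = (1 − α_t)γ_t + α_tμ_κ as stated in Algorithm 1):

```python
import math

def nesterov_sequence(n, kappa, mu_kappa, gamma0, T):
    """Iterate step 2 of DisAPCG:
       (n/kappa)^2 * alpha_t^2 = (1 - alpha_t) * gamma_t + alpha_t * mu_kappa."""
    c = (n / kappa) ** 2
    gamma, out = gamma0, []
    for _ in range(T):
        # Solve c*a^2 + (gamma - mu_kappa)*a - gamma = 0 for the positive root.
        b = gamma - mu_kappa
        a = (-b + math.sqrt(b * b + 4 * c * gamma)) / (2 * c)
        gamma_next = (1 - a) * gamma + a * mu_kappa
        beta = a * mu_kappa / gamma_next
        out.append((a, gamma_next, beta))
        gamma = gamma_next
    return out

# Concise strongly convex case: gamma0 = mu_kappa keeps the sequence constant.
n, kappa, mu = 100, 4, 0.01
mu_kappa = mu / kappa
seq = nesterov_sequence(n, kappa, mu_kappa, gamma0=mu_kappa, T=5)
alpha_const = math.sqrt(kappa * mu) / n     # the claimed fixed point sqrt(kappa*mu)/n
print(all(abs(a - alpha_const) < 1e-12 and abs(b - alpha_const) < 1e-12
          for a, _, b in seq))  # -> True
```

With γ_t = μ_κ the quadratic reduces to (n²/κ²)α_t² = μ_κ, so α_t = (κ/n)√(μ/κ) = √(κμ)/n and β_t = α_tμ_κ/μ_κ = α_t, matching the claim.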
3 Efficient Implementation

For the strongly convex case with μ > 0, we can choose γ₀ = μ/κ to get a concise version. In this case, we have γ_t ≡ μ/κ and consequently α_t = β_t = √(κμ)/n.

As mentioned in Section 2.2, a straightforward implementation of Alg. 1 requires full-dimensional operations on x (i.e., O(N) per iteration), which is comparable to a full-gradient step. Lin et al. [2014] provided an equivalent version of their original algorithm, corresponding to the special single-machine case of our paper, with an efficient update step. Here we show that such a strategy can also be used in the distributed settings with some modifications. The overall algorithm is summarized in Alg. 2 and the equivalence assertion is made in Proposition 1. The proof can be found in the appendix.

Algorithm 2 DisAPCG without full-dimensional vector operations in the case μ_κ > 0.
Input: x⁽⁰⁾ ∈ dom(Ψ) and convexity parameter μ_κ > 0.
Initialize: set u⁽⁰⁾ = 0, v⁽⁰⁾ = x⁽⁰⁾, α = √(κμ)/n, ρ = (1 − α)/(1 + α). Start K nodes with their corresponding data and coordinates.
for t = 0, 1, 2, 3, ... do:
  1. Reduce: ρᵗ⁺¹u⁽ᵗ⁾ + v⁽ᵗ⁾.
  2. Uniformly sample τ blocks of coordinates S_k⁽ᵗ⁾.
  3. For all i ∈ S_k⁽ᵗ⁾, compute h_i⁽ᵗ⁾ as
         h_i⁽ᵗ⁾ = argmin_{h ∈ R^{Nᵢ}} { (nαLᵢ/2)‖h‖² + ⟨∇ᵢf(ρᵗ⁺¹u⁽ᵗ⁾ + v⁽ᵗ⁾), h⟩ + Ψᵢ(ρᵗ⁺¹u_i⁽ᵗ⁾ + v_i⁽ᵗ⁾ + h) }.
  4. Let u⁽ᵗ⁺¹⁾ = u⁽ᵗ⁾, v⁽ᵗ⁺¹⁾ = v⁽ᵗ⁾ and update the blocks in S⁽ᵗ⁾ as:
         u_i⁽ᵗ⁺¹⁾ = u_i⁽ᵗ⁾ − ((1 − (n/κ)α)/(2ρᵗ⁺¹)) h_i⁽ᵗ⁾,  v_i⁽ᵗ⁺¹⁾ = v_i⁽ᵗ⁾ + ((1 + (n/κ)α)/2) h_i⁽ᵗ⁾.
end for
Output: x⁽ᵗ⁾ = ρᵗu⁽ᵗ⁾ + v⁽ᵗ⁾.

Proposition 1. Algorithm 2 and Algorithm 1 are equivalent, with

    x⁽ᵗ⁾ = ρᵗu⁽ᵗ⁾ + v⁽ᵗ⁾,  y⁽ᵗ⁾ = ρᵗ⁺¹u⁽ᵗ⁾ + v⁽ᵗ⁾,  z⁽ᵗ⁾ = v⁽ᵗ⁾ − ρᵗu⁽ᵗ⁾,  for all t ≥ 0.

As we can see, the updating step in Alg. 2 avoids full-dimensional operations. However, the reduce step ρᵗ⁺¹u⁽ᵗ⁾ + v⁽ᵗ⁾ still needs O(N) computation cost in general. We can further exploit the structure of certain problems to avoid it, as we shall see for the ERM problem.

4 Application to the Primal-dual ERM Problem

4.1 Primal and Dual ERM Problem

The ERM problem arises ubiquitously in supervised machine learning applications. Let A₁, ..., Aₙ be vectors in R^d, ϕ₁, ..., ϕₙ be a sequence of convex functions on R, and g be a convex function on R^d. The regularized ERM problem is defined as follows:

    argmin_{w ∈ R^d} { P(w) = (1/n) Σᵢ ϕᵢ(Aᵢᵀw) + λg(w) }.

The dual of the above problem is

    argmax_{x ∈ Rⁿ} { D(x) = (1/n) Σᵢ −ϕᵢ*(−xᵢ) − λg*(Ax/(λn)) },

where A = [A₁, ..., Aₙ], and ϕᵢ*, g* are the conjugate functions of ϕᵢ, g respectively.
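The change of variables in Proposition 1 can be verified directly: with ρ = (1 − α)/(1 + α), the identities x = ρᵗu + v, y = ρᵗ⁺¹u + v, z = v − ρᵗu reproduce the momentum averages of Algorithm 1 without touching u and v, which is why unsampled blocks cost nothing per iteration. A scalar-level check (our own verification sketch, not code from the paper):

```python
# With x = rho^t u + v, z = v - rho^t u, y = rho^(t+1) u + v and
# rho = (1 - alpha)/(1 + alpha), the averaging steps of Algorithm 1
# (concise case, gamma_t = mu_kappa, beta_t = alpha) hold automatically.
alpha = 0.03
rho = (1 - alpha) / (1 + alpha)
u, v = 1.7, -0.4                      # arbitrary per-block scalars

for t in range(5):
    x = rho**t * u + v
    z = v - rho**t * u
    y = rho**(t + 1) * u + v
    # y is the convex combination from step 3 of Algorithm 1:
    assert abs(y - (alpha * z + x) / (1 + alpha)) < 1e-12
    # leaving (u, v) untouched implements z_next = (1-alpha)*z + alpha*y:
    z_next = v - rho**(t + 1) * u
    assert abs(z_next - ((1 - alpha) * z + alpha * y)) < 1e-12
print("change-of-variables identities hold")
```

The same bookkeeping explains step 4 of Algorithm 2: the coefficients satisfy ((1 + (n/κ)α)/2) + ρᵗ⁺¹·((1 − (n/κ)α)/(2ρᵗ⁺¹)) = 1, so the prox displacement h is absorbed entirely into z while x and y are updated consistently.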
Notice that in the above problem, the sample size n plays the role of the dimensionality when we consider the dual problem. We focus on the strongly convex case, where we need the following assumption, which is also standard in the literature on solving the primal and dual ERM problems.

Assumption 3. Each function ϕᵢ is 1/γ-smooth and the function g is strongly convex with parameter 1.

The above assumption implies that ϕᵢ* is γ-strongly convex and g* is continuously differentiable. The structure of D(x) matches problem (1) with the equivalent form F(x) = −D(x), where f(x) = λg*(Ax/(λn)) and Ψ(x) = (1/n)Σᵢ ϕᵢ*(−xᵢ). In order to match the linear convergence rate assumption, we relocate the strong convexity of ϕᵢ* and get the final optimization problem argmin_{x ∈ Rⁿ} F(x), where

    F(x) = λg*(Ax/(λn)) + (γ/(2n))‖x‖²  +  (1/n) Σᵢ ( ϕᵢ*(−xᵢ) − (γ/2)xᵢ² ),

with the first two terms taken as f(x) and the last sum as Ψ(x). We focus on the special case g(w) = ‖w‖²/2. This special case implies that we can efficiently compute the partial coordinate gradient, and it is the regularization term mostly used in ERM problems [Ma et al., 2015]. In this case, we have

    ∇ᵢf(y⁽ᵗ⁾) = (1/(λn²)) Aᵢᵀ(Ay⁽ᵗ⁾) + (γ/n) yᵢ⁽ᵗ⁾.    (8)

Besides, we can determine an upper bound for the Lipschitz constants, Lᵢ ≤ (R² + λγn)/(λn²) with R = maxᵢ ‖Aᵢ‖, i ∈ [n], and a lower bound for the strong convexity parameter of f(x) with respect to ‖·‖_L, μ ≥ λγn/(R² + λγn). Details can be found in the appendix.

Algorithm 3 DisAPCG for regularized ERM with μ > 0.
Input: x⁽⁰⁾ ∈ dom(Ψ) and μ = λγn/(R² + λγn).
Initialize: set α = √(κμ)/n, ρ = (1 − α)/(1 + α), v⁽⁰⁾ = x⁽⁰⁾, u⁽⁰⁾ = 0, p⁽⁰⁾ = 0 and q⁽⁰⁾ = Ax⁽⁰⁾. Start K nodes with their corresponding data and coordinates.
for t = 0, 1, 2, 3, ... do:
  1. Reduce: p⁽ᵗ⁾, q⁽ᵗ⁾.
  2. Uniformly sample τ blocks of coordinates S_k⁽ᵗ⁾.
  3. For all i ∈ S⁽ᵗ⁾, compute:
         h_i⁽ᵗ⁾ = argmin_{h ∈ R^{Nᵢ}} { (nα(‖Aᵢ‖² + λγn)/(2λn²))‖h‖² + ⟨∇ᵢ⁽ᵗ⁾, h⟩ + Ψᵢ(ρᵗ⁺¹u_i⁽ᵗ⁾ + v_i⁽ᵗ⁾ + h) },
     where
         ∇ᵢ⁽ᵗ⁾ = (1/(λn²))(ρᵗ⁺¹Aᵢᵀp⁽ᵗ⁾ + Aᵢᵀq⁽ᵗ⁾) + (γ/n)(ρᵗ⁺¹u_i⁽ᵗ⁾ + v_i⁽ᵗ⁾).
  4. Let u⁽ᵗ⁺¹⁾ = u⁽ᵗ⁾, v⁽ᵗ⁺¹⁾ = v⁽ᵗ⁾ and for all i ∈ S⁽ᵗ⁾:
         u_i⁽ᵗ⁺¹⁾ = u_i⁽ᵗ⁾ − ((1 − (n/κ)α)/(2ρᵗ⁺¹)) h_i⁽ᵗ⁾,  v_i⁽ᵗ⁺¹⁾ = v_i⁽ᵗ⁾ + ((1 + (n/κ)α)/2) h_i⁽ᵗ⁾.
     Update p, q as
         p⁽ᵗ⁺¹⁾ = p⁽ᵗ⁾ − ((1 − (n/κ)α)/(2ρᵗ⁺¹)) Σ_{i∈S⁽ᵗ⁾} Aᵢh_i⁽ᵗ⁾,  q⁽ᵗ⁺¹⁾ = q⁽ᵗ⁾ + ((1 + (n/κ)α)/2) Σ_{i∈S⁽ᵗ⁾} Aᵢh_i⁽ᵗ⁾.
end for
Output: primal and dual solutions

    x⁽ᵗ⁺¹⁾ = ρᵗ⁺¹u⁽ᵗ⁺¹⁾ + v⁽ᵗ⁺¹⁾,  w⁽ᵗ⁺¹⁾ = (1/(λn))(ρᵗ⁺¹p⁽ᵗ⁺¹⁾ + q⁽ᵗ⁺¹⁾).

4.2 Numerical Experiments

We consider minimizing the smoothed hinge loss, in order to satisfy the 1/γ-smoothness condition. Precisely, we have

    ϕᵢ(a) = 0                  if a ≥ 1,
            1 − a − γ/2        if a ≤ 1 − γ,
            (1 − a)²/(2γ)      otherwise.

The conjugate function of ϕᵢ is then as follows:

    ϕᵢ*(b) = b + (γ/2)b²   if b ∈ [−1, 0],
             ∞             otherwise.

Consequently, we have

    Ψᵢ(x) = (1/n)( ϕᵢ*(−x) − (γ/2)x² ) = −x/n   if x ∈ [0, 1],
            ∞                                   otherwise.

In the context of primal-dual optimization, people often care about the duality gap P(w(x)) − D(x), which is an upper bound on the gap D* − D(x). An overall discussion of the relations between these gaps can be found in [Dünner et al., 2016]. Here we directly use the duality gap as the statistical indicator. We implement the algorithms in C++ and OpenMPI and run them on clusters of the Tianhe-2 supercomputer, where on each node we use a single CPU. Experiments are performed on 3 datasets from [Fan and Lin, 2011], whose information is summarized in Table 1.

Table 1: Information of the three binary classification datasets.

    dataset    dimension d    sample size n    sparsity
    epsilon    2,000          400,000          100%
    covtype    54             581,012          22%
    RCV1       47,236         677,399          0.16%

Influence of mini-batch size τ and number of nodes K. We first analyze the influence of the mini-batch size τ and the number of nodes K on the epsilon dataset. We either vary the mini-batch size τ on each node with the number of nodes K fixed, or vice versa. Intuitively, a larger mini-batch size τ means a larger descent in one iteration; however, it needs more computation cost. Similarly, a larger number of nodes K means a larger descent but more communication cost.

Figure 1: Duality gap vs. the number of iterations, as well as duality gap vs. elapsed time, for the epsilon dataset. We vary the mini-batch size τ while keeping the number of nodes K fixed, or vice versa; λ is fixed. The duality gap and elapsed time are shown in log domain while the number of iterations is shown on a linear scale to emphasize the linear convergence rate.
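The smoothed hinge loss and its conjugate defined in Section 4.2 can be implemented in a few lines and sanity-checked via the Fenchel-Young inequality ϕ(a) + ϕ*(b) ≥ ab, which any correct (ϕ, ϕ*) pair must satisfy. A small sketch (ours, not from the paper):

```python
import numpy as np

def smoothed_hinge(a, gamma):
    """Smoothed hinge loss: 0 for a >= 1, linear for a <= 1-gamma, quadratic between."""
    a = np.asarray(a, dtype=float)
    return np.where(a >= 1, 0.0,
           np.where(a <= 1 - gamma, 1 - a - gamma / 2,
                    (1 - a) ** 2 / (2 * gamma)))

def smoothed_hinge_conj(b, gamma):
    """Conjugate: b + gamma/2 * b^2 on [-1, 0], +inf otherwise."""
    b = np.asarray(b, dtype=float)
    return np.where((b >= -1) & (b <= 0), b + gamma / 2 * b**2, np.inf)

gamma = 0.5
a = np.linspace(-2, 2, 201)
b = np.linspace(-1, 0, 101)
# Fenchel-Young: phi(a) + phi*(b) >= a*b for every pair (a, b).
gap = smoothed_hinge(a, gamma)[:, None] + smoothed_hinge_conj(b, gamma)[None, :] \
      - a[:, None] * b[None, :]
print(gap.min() >= -1e-12)  # -> True
```

The bounded domain [−1, 0] of the conjugate is what makes Ψᵢ a box-constrained linear function in the dual, so the proximal step in Algorithm 3 has a closed-form (clipped) solution.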
Therefore, on the one hand, we show the duality gap w.r.t. the number of iterations to verify the linear convergence rate and the benefits of increasing computation resources. On the other hand, we show the duality gap w.r.t. running time to make clear the trade-off between computation and communication. The overall results are summarized in Fig. 1. In terms of the duality gap w.r.t. the number of iterations, our DisAPCG algorithm indeed achieves a linear convergence rate in all settings, which is consistent with our theory, and increasing τ and K does give a larger descent in each iteration. Taking a closer look, we can see a √κ accelerating factor in terms of the iteration number, again matching our theoretical findings. For the duality gap w.r.t. running time, the results suggest that a smaller mini-batch size (e.g., 10²) has better performance and that more nodes mean a faster convergence rate.

Comparison with other solvers. Now we compare our algorithm with other state-of-the-art distributed solvers for the primal and dual regularized ERM problem, including the mini-batch version of SDCA in distributed settings [Yang, 2013] (denoted DisDCA) and CoCoA+ [Ma et al., 2015]. The CoCoA+ solver is an inner-outer loop scheme that involves a local solver, providing a local approximation to the global problem, while the outer loop is responsible for communication and global parameter updating. Here we follow the original paper [Ma et al., 2015] with SDCA as the local solver. In terms of the number of iterations H for the local SDCA solver, we found that using a relatively small number of iterations (e.g., 10² or 10³) achieves the best performance when running time is taken into consideration. Hence we choose H = 10² for CoCoA+. For a fair comparison, we set the mini-batch size to τ = 10² for our DisAPCG method and DisDCA. We vary λ from 10⁻⁶ to 10⁻⁸, which is a relatively hard setting since the strong convexity parameter is small. For all settings, we use K = 16 nodes.

The overall comparison is summarized in Fig. 2 and Fig. 3. In all settings, CoCoA+ and our DisAPCG method outperform DisDCA by a large margin. For the former two, when λ is relatively large, i.e., λ = 10⁻⁶, the CoCoA+ solver reduces the duality gap quickly at the beginning; however, the speed slows down rapidly at a relatively high accuracy level, e.g., 10⁻⁶ on the RCV1 dataset. This phenomenon occurs no matter how the number of iterations for the local SDCA solver is chosen. In contrast, the DisAPCG algorithm keeps the linear convergence rate all the time and hence achieves better performance when high accuracy is needed. For the most ill-conditioned case, i.e., λ = 10⁻⁸, our DisAPCG algorithm achieves the best performance, both in terms of iterations and running time, on both datasets. It is worth mentioning that CoCoA+ is a framework that can use any local solver, not only SDCA.

Figure 2: Duality gap vs. the number of iterations, as well as duality gap vs. elapsed time, for the RCV1 dataset with number of nodes K = 16 and τ = H = 10². λ varies from 10⁻⁶ to 10⁻⁸.

Figure 3: Duality gap vs. the number of iterations, as well as duality gap vs. elapsed time, for the covtype dataset with number of nodes K = 16 and τ = H = 10². λ varies from 10⁻⁶ to 10⁻⁸.
As mentioned in Section 2.2, our algorithm can be regarded as a mini-batch version of the APCG algorithm when K = 1, which supports shared-memory parallel computing on multi-core nodes. A combination of CoCoA+ with our single-machine version of DisAPCG seems to be a good choice to further accelerate the algorithm in practice.

5 Conclusions

We have presented a distributed accelerated coordinate method (DisAPCG) for the composite convex optimization problem. Our method combines Nesterov's method with parallel structures, enjoying an accelerated convergence rate in both the strongly and non-strongly convex cases. Experiments on the ERM problem show better performance compared with several other state-of-the-art solvers, matching our theoretical findings as well.

Acknowledgments

The work was supported by the National Basic Research Program (973 Program) of China (No. 2013CB329403), NSFC Projects, the Tiangong Institute for Intelligent Computing, and the Youth Top-notch Talent Support Program.

References

[Bradley et al., 2011] Joseph Bradley, Aapo Kyrola, Danny Bickson, and Carlos Guestrin. Parallel coordinate descent for L1-regularized loss minimization. ICML, 2011.
[Dünner et al., 2016] Celestine Dünner, Simone Forte, Martin Takáč, and Martin Jaggi. Primal-dual rates and certificates. ICML, 2016.
[Fan and Lin, 2011] Rong-En Fan and Chih-Jen Lin. LIBSVM data: Classification, regression and multi-label. URL: cjlin/libsvmtools/datasets, 2011.
[Fercoq and Richtárik, 2015] Olivier Fercoq and Peter Richtárik. Accelerated, parallel and proximal coordinate descent. SIAM Journal on Optimization, 2015.
[Lin et al., 2014] Qihang Lin, Zhaosong Lu, and Lin Xiao. An accelerated proximal coordinate gradient method and its application to regularized empirical risk minimization. NIPS, 2014.
[Ma et al., 2015] Chenxin Ma, Virginia Smith, Martin Jaggi, Michael I. Jordan, Peter Richtárik, and Martin Takáč. Adding vs. averaging in distributed primal-dual optimization. ICML, 2015.
[Necoara and Clipici, 2013] Ion Necoara and Dragos Clipici. Efficient parallel coordinate descent algorithm for convex optimization problems with separable constraints: application to distributed MPC. Journal of Process Control, 2013.
[Nesterov, 1983] Yurii Nesterov. A method of solving a convex programming problem with convergence rate O(1/k²). Soviet Mathematics Doklady, 1983.
[Nesterov, 2012] Yurii Nesterov. Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM Journal on Optimization, 2012.
[Richtárik and Takáč, 2013a] Peter Richtárik and Martin Takáč. Distributed coordinate descent method for learning with big data. arXiv preprint, 2013.
[Richtárik and Takáč, 2013b] Peter Richtárik and Martin Takáč. Parallel coordinate descent methods for big data optimization. arXiv preprint, 2013.
[Shalev-Shwartz and Zhang, 2014] Shai Shalev-Shwartz and Tong Zhang. Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization. ICML, 2014.
[Takáč and Richtárik, 2015] Martin Takáč and Peter Richtárik. Distributed mini-batch SDCA. arXiv preprint, 2015.
[Wright, 2015] Stephen Wright. Coordinate descent algorithms. Mathematical Programming, 151(1):3–34, 2015.
[Xiao and Lu, 2013] Lin Xiao and Zhaosong Lu. On the complexity analysis of randomized block-coordinate descent methods. arXiv preprint, 2013.
[Yang, 2013] Tianbao Yang. Trading computation for communication: Distributed stochastic dual coordinate ascent. NIPS, 2013.


More information

CHAPTER VI Statistical Analysis of Experimental Data

CHAPTER VI Statistical Analysis of Experimental Data Chapter VI Statstcal Aalyss of Expermetal Data CHAPTER VI Statstcal Aalyss of Expermetal Data Measuremets do ot lead to a uque value. Ths s a result of the multtude of errors (maly radom errors) that ca

More information

Chapter 8. Inferences about More Than Two Population Central Values

Chapter 8. Inferences about More Than Two Population Central Values Chapter 8. Ifereces about More Tha Two Populato Cetral Values Case tudy: Effect of Tmg of the Treatmet of Port-We tas wth Lasers ) To vestgate whether treatmet at a youg age would yeld better results tha

More information

5 Short Proofs of Simplified Stirling s Approximation

5 Short Proofs of Simplified Stirling s Approximation 5 Short Proofs of Smplfed Strlg s Approxmato Ofr Gorodetsky, drtymaths.wordpress.com Jue, 20 0 Itroducto Strlg s approxmato s the followg (somewhat surprsg) approxmato of the factoral,, usg elemetary fuctos:

More information

Q-analogue of a Linear Transformation Preserving Log-concavity

Q-analogue of a Linear Transformation Preserving Log-concavity Iteratoal Joural of Algebra, Vol. 1, 2007, o. 2, 87-94 Q-aalogue of a Lear Trasformato Preservg Log-cocavty Daozhog Luo Departmet of Mathematcs, Huaqao Uversty Quazhou, Fua 362021, P. R. Cha ldzblue@163.com

More information

Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization

Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization Stochastc Prmal-Dual Coordate Method for Regularzed Emprcal Rsk Mmzato Yuche Zhag L Xao September 24 Abstract We cosder a geerc covex optmzato problem assocated wth regularzed emprcal rsk mmzato of lear

More information

Point Estimation: definition of estimators

Point Estimation: definition of estimators Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.

More information

Randomized Dual Coordinate Ascent with Arbitrary Sampling

Randomized Dual Coordinate Ascent with Arbitrary Sampling Radomzed Dual Coordate Ascet wth Arbtrary Samplg Zheg Qu Peter Rchtárk Tog Zhag November 21, 2014 Abstract We study the problem of mmzg the average of a large umber of smooth covex fuctos pealzed wth a

More information

MATH 247/Winter Notes on the adjoint and on normal operators.

MATH 247/Winter Notes on the adjoint and on normal operators. MATH 47/Wter 00 Notes o the adjot ad o ormal operators I these otes, V s a fte dmesoal er product space over, wth gve er * product uv, T, S, T, are lear operators o V U, W are subspaces of V Whe we say

More information

MEASURES OF DISPERSION

MEASURES OF DISPERSION MEASURES OF DISPERSION Measure of Cetral Tedecy: Measures of Cetral Tedecy ad Dsperso ) Mathematcal Average: a) Arthmetc mea (A.M.) b) Geometrc mea (G.M.) c) Harmoc mea (H.M.) ) Averages of Posto: a) Meda

More information

Chapter 5 Properties of a Random Sample

Chapter 5 Properties of a Random Sample Lecture 6 o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for the prevous lecture Cocepts: t-dstrbuto, F-dstrbuto Theorems: Dstrbutos of sample mea ad sample varace, relatoshp betwee sample mea ad sample

More information

L5 Polynomial / Spline Curves

L5 Polynomial / Spline Curves L5 Polyomal / Sple Curves Cotets Coc sectos Polyomal Curves Hermte Curves Bezer Curves B-Sples No-Uform Ratoal B-Sples (NURBS) Mapulato ad Represetato of Curves Types of Curve Equatos Implct: Descrbe a

More information

CHAPTER 4 RADICAL EXPRESSIONS

CHAPTER 4 RADICAL EXPRESSIONS 6 CHAPTER RADICAL EXPRESSIONS. The th Root of a Real Number A real umber a s called the th root of a real umber b f Thus, for example: s a square root of sce. s also a square root of sce ( ). s a cube

More information

18.413: Error Correcting Codes Lab March 2, Lecture 8

18.413: Error Correcting Codes Lab March 2, Lecture 8 18.413: Error Correctg Codes Lab March 2, 2004 Lecturer: Dael A. Spelma Lecture 8 8.1 Vector Spaces A set C {0, 1} s a vector space f for x all C ad y C, x + y C, where we take addto to be compoet wse

More information

Support vector machines

Support vector machines CS 75 Mache Learg Lecture Support vector maches Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 75 Mache Learg Outle Outle: Algorthms for lear decso boudary Support vector maches Mamum marg hyperplae.

More information

Generalized Linear Regression with Regularization

Generalized Linear Regression with Regularization Geeralze Lear Regresso wth Regularzato Zoya Bylsk March 3, 05 BASIC REGRESSION PROBLEM Note: I the followg otes I wll make explct what s a vector a what s a scalar usg vec t or otato, to avo cofuso betwee

More information

Cubic Nonpolynomial Spline Approach to the Solution of a Second Order Two-Point Boundary Value Problem

Cubic Nonpolynomial Spline Approach to the Solution of a Second Order Two-Point Boundary Value Problem Joural of Amerca Scece ;6( Cubc Nopolyomal Sple Approach to the Soluto of a Secod Order Two-Pot Boudary Value Problem W.K. Zahra, F.A. Abd El-Salam, A.A. El-Sabbagh ad Z.A. ZAk * Departmet of Egeerg athematcs

More information

Median as a Weighted Arithmetic Mean of All Sample Observations

Median as a Weighted Arithmetic Mean of All Sample Observations Meda as a Weghted Arthmetc Mea of All Sample Observatos SK Mshra Dept. of Ecoomcs NEHU, Shllog (Ida). Itroducto: Iumerably may textbooks Statstcs explctly meto that oe of the weakesses (or propertes) of

More information

Bayes Estimator for Exponential Distribution with Extension of Jeffery Prior Information

Bayes Estimator for Exponential Distribution with Extension of Jeffery Prior Information Malaysa Joural of Mathematcal Sceces (): 97- (9) Bayes Estmator for Expoetal Dstrbuto wth Exteso of Jeffery Pror Iformato Hadeel Salm Al-Kutub ad Noor Akma Ibrahm Isttute for Mathematcal Research, Uverst

More information

Supervised learning: Linear regression Logistic regression

Supervised learning: Linear regression Logistic regression CS 57 Itroducto to AI Lecture 4 Supervsed learg: Lear regresso Logstc regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 57 Itro to AI Data: D { D D.. D D Supervsed learg d a set of eamples s

More information

ESS Line Fitting

ESS Line Fitting ESS 5 014 17. Le Fttg A very commo problem data aalyss s lookg for relatoshpetwee dfferet parameters ad fttg les or surfaces to data. The smplest example s fttg a straght le ad we wll dscuss that here

More information

å 1 13 Practice Final Examination Solutions - = CS109 Dec 5, 2018

å 1 13 Practice Final Examination Solutions - = CS109 Dec 5, 2018 Chrs Pech Fal Practce CS09 Dec 5, 08 Practce Fal Examato Solutos. Aswer: 4/5 8/7. There are multle ways to obta ths aswer; here are two: The frst commo method s to sum over all ossbltes for the rak of

More information

Multivariate Transformation of Variables and Maximum Likelihood Estimation

Multivariate Transformation of Variables and Maximum Likelihood Estimation Marquette Uversty Multvarate Trasformato of Varables ad Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Assocate Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 03 by Marquette Uversty

More information

An Introduction to. Support Vector Machine

An Introduction to. Support Vector Machine A Itroducto to Support Vector Mache Support Vector Mache (SVM) A classfer derved from statstcal learg theory by Vapk, et al. 99 SVM became famous whe, usg mages as put, t gave accuracy comparable to eural-etwork

More information

MULTIDIMENSIONAL HETEROGENEOUS VARIABLE PREDICTION BASED ON EXPERTS STATEMENTS. Gennadiy Lbov, Maxim Gerasimov

MULTIDIMENSIONAL HETEROGENEOUS VARIABLE PREDICTION BASED ON EXPERTS STATEMENTS. Gennadiy Lbov, Maxim Gerasimov Iteratoal Boo Seres "Iformato Scece ad Computg" 97 MULTIIMNSIONAL HTROGNOUS VARIABL PRICTION BAS ON PRTS STATMNTS Geady Lbov Maxm Gerasmov Abstract: I the wors [ ] we proposed a approach of formg a cosesus

More information

Beam Warming Second-Order Upwind Method

Beam Warming Second-Order Upwind Method Beam Warmg Secod-Order Upwd Method Petr Valeta Jauary 6, 015 Ths documet s a part of the assessmet work for the subject 1DRP Dfferetal Equatos o Computer lectured o FNSPE CTU Prague. Abstract Ths documet

More information

Chapter 9 Jordan Block Matrices

Chapter 9 Jordan Block Matrices Chapter 9 Jorda Block atrces I ths chapter we wll solve the followg problem. Gve a lear operator T fd a bass R of F such that the matrx R (T) s as smple as possble. f course smple s a matter of taste.

More information

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The

More information

Lecture 3 Probability review (cont d)

Lecture 3 Probability review (cont d) STATS 00: Itroducto to Statstcal Iferece Autum 06 Lecture 3 Probablty revew (cot d) 3. Jot dstrbutos If radom varables X,..., X k are depedet, the ther dstrbuto may be specfed by specfyg the dvdual dstrbuto

More information

Comparison of Dual to Ratio-Cum-Product Estimators of Population Mean

Comparison of Dual to Ratio-Cum-Product Estimators of Population Mean Research Joural of Mathematcal ad Statstcal Sceces ISS 30 6047 Vol. 1(), 5-1, ovember (013) Res. J. Mathematcal ad Statstcal Sc. Comparso of Dual to Rato-Cum-Product Estmators of Populato Mea Abstract

More information

A New Family of Transformations for Lifetime Data

A New Family of Transformations for Lifetime Data Proceedgs of the World Cogress o Egeerg 4 Vol I, WCE 4, July - 4, 4, Lodo, U.K. A New Famly of Trasformatos for Lfetme Data Lakhaa Watthaacheewakul Abstract A famly of trasformatos s the oe of several

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ " 1

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ  1 STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Recall Assumpto E(Y x) η 0 + η x (lear codtoal mea fucto) Data (x, y ), (x 2, y 2 ),, (x, y ) Least squares estmator ˆ E (Y x) ˆ " 0 + ˆ " x, where ˆ

More information

Mu Sequences/Series Solutions National Convention 2014

Mu Sequences/Series Solutions National Convention 2014 Mu Sequeces/Seres Solutos Natoal Coveto 04 C 6 E A 6C A 6 B B 7 A D 7 D C 7 A B 8 A B 8 A C 8 E 4 B 9 B 4 E 9 B 4 C 9 E C 0 A A 0 D B 0 C C Usg basc propertes of arthmetc sequeces, we fd a ad bm m We eed

More information

NP!= P. By Liu Ran. Table of Contents. The P versus NP problem is a major unsolved problem in computer

NP!= P. By Liu Ran. Table of Contents. The P versus NP problem is a major unsolved problem in computer NP!= P By Lu Ra Table of Cotets. Itroduce 2. Prelmary theorem 3. Proof 4. Expla 5. Cocluso. Itroduce The P versus NP problem s a major usolved problem computer scece. Iformally, t asks whether a computer

More information

Econometric Methods. Review of Estimation

Econometric Methods. Review of Estimation Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators

More information

ρ < 1 be five real numbers. The

ρ < 1 be five real numbers. The Lecture o BST 63: Statstcal Theory I Ku Zhag, /0/006 Revew for the prevous lecture Deftos: covarace, correlato Examples: How to calculate covarace ad correlato Theorems: propertes of correlato ad covarace

More information

NP!= P. By Liu Ran. Table of Contents. The P vs. NP problem is a major unsolved problem in computer

NP!= P. By Liu Ran. Table of Contents. The P vs. NP problem is a major unsolved problem in computer NP!= P By Lu Ra Table of Cotets. Itroduce 2. Strategy 3. Prelmary theorem 4. Proof 5. Expla 6. Cocluso. Itroduce The P vs. NP problem s a major usolved problem computer scece. Iformally, t asks whether

More information

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions.

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions. Ordary Least Squares egresso. Smple egresso. Algebra ad Assumptos. I ths part of the course we are gog to study a techque for aalysg the lear relatoshp betwee two varables Y ad X. We have pars of observatos

More information

Entropy ISSN by MDPI

Entropy ISSN by MDPI Etropy 2003, 5, 233-238 Etropy ISSN 1099-4300 2003 by MDPI www.mdp.org/etropy O the Measure Etropy of Addtve Cellular Automata Hasa Aı Arts ad Sceces Faculty, Departmet of Mathematcs, Harra Uversty; 63100,

More information

( ) = ( ) ( ) Chapter 13 Asymptotic Theory and Stochastic Regressors. Stochastic regressors model

( ) = ( ) ( ) Chapter 13 Asymptotic Theory and Stochastic Regressors. Stochastic regressors model Chapter 3 Asmptotc Theor ad Stochastc Regressors The ature of eplaator varable s assumed to be o-stochastc or fed repeated samples a regresso aalss Such a assumpto s approprate for those epermets whch

More information

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution: Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Postpoed exam: ECON430 Statstcs Date of exam: Jauary 0, 0 Tme for exam: 09:00 a.m. :00 oo The problem set covers 5 pages Resources allowed: All wrtte ad prted

More information

Pinaki Mitra Dept. of CSE IIT Guwahati

Pinaki Mitra Dept. of CSE IIT Guwahati Pak Mtra Dept. of CSE IIT Guwahat Hero s Problem HIGHWAY FACILITY LOCATION Faclty Hgh Way Farm A Farm B Illustrato of the Proof of Hero s Theorem p q s r r l d(p,r) + d(q,r) = d(p,q) p d(p,r ) + d(q,r

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statstc ad Radom Samples A parameter s a umber that descrbes the populato. It s a fxed umber, but practce we do ot kow ts value. A statstc s a fucto of the sample data,.e., t s a quatty whose

More information

Descriptive Statistics

Descriptive Statistics Page Techcal Math II Descrptve Statstcs Descrptve Statstcs Descrptve statstcs s the body of methods used to represet ad summarze sets of data. A descrpto of how a set of measuremets (for eample, people

More information

Unsupervised Learning and Other Neural Networks

Unsupervised Learning and Other Neural Networks CSE 53 Soft Computg NOT PART OF THE FINAL Usupervsed Learg ad Other Neural Networs Itroducto Mture Destes ad Idetfablty ML Estmates Applcato to Normal Mtures Other Neural Networs Itroducto Prevously, all

More information

to the estimation of total sensitivity indices

to the estimation of total sensitivity indices Applcato of the cotrol o varate ate techque to the estmato of total sestvty dces S KUCHERENKO B DELPUECH Imperal College Lodo (UK) skuchereko@mperalacuk B IOOSS Electrcté de Frace (Frace) S TARANTOLA Jot

More information

MA/CSSE 473 Day 27. Dynamic programming

MA/CSSE 473 Day 27. Dynamic programming MA/CSSE 473 Day 7 Dyamc Programmg Bomal Coeffcets Warshall's algorthm (Optmal BSTs) Studet questos? Dyamc programmg Used for problems wth recursve solutos ad overlappg subproblems Typcally, we save (memoze)

More information

Solving Interval and Fuzzy Multi Objective. Linear Programming Problem. by Necessarily Efficiency Points

Solving Interval and Fuzzy Multi Objective. Linear Programming Problem. by Necessarily Efficiency Points Iteratoal Mathematcal Forum, 3, 2008, o. 3, 99-06 Solvg Iterval ad Fuzzy Mult Obectve ear Programmg Problem by Necessarly Effcecy Pots Hassa Mshmast Neh ad Marzeh Aleghad Mathematcs Departmet, Faculty

More information

1 Mixed Quantum State. 2 Density Matrix. CS Density Matrices, von Neumann Entropy 3/7/07 Spring 2007 Lecture 13. ψ = α x x. ρ = p i ψ i ψ i.

1 Mixed Quantum State. 2 Density Matrix. CS Density Matrices, von Neumann Entropy 3/7/07 Spring 2007 Lecture 13. ψ = α x x. ρ = p i ψ i ψ i. CS 94- Desty Matrces, vo Neuma Etropy 3/7/07 Sprg 007 Lecture 3 I ths lecture, we wll dscuss the bascs of quatum formato theory I partcular, we wll dscuss mxed quatum states, desty matrces, vo Neuma etropy

More information

C-1: Aerodynamics of Airfoils 1 C-2: Aerodynamics of Airfoils 2 C-3: Panel Methods C-4: Thin Airfoil Theory

C-1: Aerodynamics of Airfoils 1 C-2: Aerodynamics of Airfoils 2 C-3: Panel Methods C-4: Thin Airfoil Theory ROAD MAP... AE301 Aerodyamcs I UNIT C: 2-D Arfols C-1: Aerodyamcs of Arfols 1 C-2: Aerodyamcs of Arfols 2 C-3: Pael Methods C-4: Th Arfol Theory AE301 Aerodyamcs I Ut C-3: Lst of Subects Problem Solutos?

More information

UNIT 2 SOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS

UNIT 2 SOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS Numercal Computg -I UNIT SOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS Structure Page Nos..0 Itroducto 6. Objectves 7. Ital Approxmato to a Root 7. Bsecto Method 8.. Error Aalyss 9.4 Regula Fals Method

More information

CS 1675 Introduction to Machine Learning Lecture 12 Support vector machines

CS 1675 Introduction to Machine Learning Lecture 12 Support vector machines CS 675 Itroducto to Mache Learg Lecture Support vector maches Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square Mdterm eam October 9, 7 I-class eam Closed book Stud materal: Lecture otes Correspodg chapters

More information

Simulation Output Analysis

Simulation Output Analysis Smulato Output Aalyss Summary Examples Parameter Estmato Sample Mea ad Varace Pot ad Iterval Estmato ermatg ad o-ermatg Smulato Mea Square Errors Example: Sgle Server Queueg System x(t) S 4 S 4 S 3 S 5

More information

Assignment 5/MATH 247/Winter Due: Friday, February 19 in class (!) (answers will be posted right after class)

Assignment 5/MATH 247/Winter Due: Friday, February 19 in class (!) (answers will be posted right after class) Assgmet 5/MATH 7/Wter 00 Due: Frday, February 9 class (!) (aswers wll be posted rght after class) As usual, there are peces of text, before the questos [], [], themselves. Recall: For the quadratc form

More information

A NEW LOG-NORMAL DISTRIBUTION

A NEW LOG-NORMAL DISTRIBUTION Joural of Statstcs: Advaces Theory ad Applcatos Volume 6, Number, 06, Pages 93-04 Avalable at http://scetfcadvaces.co. DOI: http://dx.do.org/0.864/jsata_700705 A NEW LOG-NORMAL DISTRIBUTION Departmet of

More information

A New Measure of Probabilistic Entropy. and its Properties

A New Measure of Probabilistic Entropy. and its Properties Appled Mathematcal Sceces, Vol. 4, 200, o. 28, 387-394 A New Measure of Probablstc Etropy ad ts Propertes Rajeesh Kumar Departmet of Mathematcs Kurukshetra Uversty Kurukshetra, Ida rajeesh_kuk@redffmal.com

More information

On Modified Interval Symmetric Single-Step Procedure ISS2-5D for the Simultaneous Inclusion of Polynomial Zeros

On Modified Interval Symmetric Single-Step Procedure ISS2-5D for the Simultaneous Inclusion of Polynomial Zeros It. Joural of Math. Aalyss, Vol. 7, 2013, o. 20, 983-988 HIKARI Ltd, www.m-hkar.com O Modfed Iterval Symmetrc Sgle-Step Procedure ISS2-5D for the Smultaeous Icluso of Polyomal Zeros 1 Nora Jamalud, 1 Masor

More information

STRONG CONSISTENCY OF LEAST SQUARES ESTIMATE IN MULTIPLE REGRESSION WHEN THE ERROR VARIANCE IS INFINITE

STRONG CONSISTENCY OF LEAST SQUARES ESTIMATE IN MULTIPLE REGRESSION WHEN THE ERROR VARIANCE IS INFINITE Statstca Sca 9(1999), 289-296 STRONG CONSISTENCY OF LEAST SQUARES ESTIMATE IN MULTIPLE REGRESSION WHEN THE ERROR VARIANCE IS INFINITE J Mgzhog ad Che Xru GuZhou Natoal College ad Graduate School, Chese

More information

Estimation of Stress- Strength Reliability model using finite mixture of exponential distributions

Estimation of Stress- Strength Reliability model using finite mixture of exponential distributions Iteratoal Joural of Computatoal Egeerg Research Vol, 0 Issue, Estmato of Stress- Stregth Relablty model usg fte mxture of expoetal dstrbutos K.Sadhya, T.S.Umamaheswar Departmet of Mathematcs, Lal Bhadur

More information

LINEARLY CONSTRAINED MINIMIZATION BY USING NEWTON S METHOD

LINEARLY CONSTRAINED MINIMIZATION BY USING NEWTON S METHOD Jural Karya Asl Loreka Ahl Matematk Vol 8 o 205 Page 084-088 Jural Karya Asl Loreka Ahl Matematk LIEARLY COSTRAIED MIIMIZATIO BY USIG EWTO S METHOD Yosza B Dasrl, a Ismal B Moh 2 Faculty Electrocs a Computer

More information

Lecture 3. Sampling, sampling distributions, and parameter estimation

Lecture 3. Sampling, sampling distributions, and parameter estimation Lecture 3 Samplg, samplg dstrbutos, ad parameter estmato Samplg Defto Populato s defed as the collecto of all the possble observatos of terest. The collecto of observatos we take from the populato s called

More information

Arithmetic Mean and Geometric Mean

Arithmetic Mean and Geometric Mean Acta Mathematca Ntresa Vol, No, p 43 48 ISSN 453-6083 Arthmetc Mea ad Geometrc Mea Mare Varga a * Peter Mchalča b a Departmet of Mathematcs, Faculty of Natural Sceces, Costate the Phlosopher Uversty Ntra,

More information

STRONG CONSISTENCY FOR SIMPLE LINEAR EV MODEL WITH v/ -MIXING

STRONG CONSISTENCY FOR SIMPLE LINEAR EV MODEL WITH v/ -MIXING Joural of tatstcs: Advaces Theory ad Alcatos Volume 5, Number, 6, Pages 3- Avalable at htt://scetfcadvaces.co. DOI: htt://d.do.org/.864/jsata_7678 TRONG CONITENCY FOR IMPLE LINEAR EV MODEL WITH v/ -MIXING

More information

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x

CS 2750 Machine Learning. Lecture 8. Linear regression. CS 2750 Machine Learning. Linear regression. is a linear combination of input components x CS 75 Mache Learg Lecture 8 Lear regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 75 Mache Learg Lear regresso Fucto f : X Y s a lear combato of put compoets f + + + K d d K k - parameters

More information

Non-uniform Turán-type problems

Non-uniform Turán-type problems Joural of Combatoral Theory, Seres A 111 2005 106 110 wwwelsevercomlocatecta No-uform Turá-type problems DhruvMubay 1, Y Zhao 2 Departmet of Mathematcs, Statstcs, ad Computer Scece, Uversty of Illos at

More information

The Necessarily Efficient Point Method for Interval Molp Problems

The Necessarily Efficient Point Method for Interval Molp Problems ISS 6-69 Eglad K Joural of Iformato ad omputg Scece Vol. o. 9 pp. - The ecessarly Effcet Pot Method for Iterval Molp Problems Hassa Mshmast eh ad Marzeh Alezhad + Mathematcs Departmet versty of Ssta ad

More information

Lecture Note to Rice Chapter 8

Lecture Note to Rice Chapter 8 ECON 430 HG revsed Nov 06 Lecture Note to Rce Chapter 8 Radom matrces Let Y, =,,, m, =,,, be radom varables (r.v. s). The matrx Y Y Y Y Y Y Y Y Y Y = m m m s called a radom matrx ( wth a ot m-dmesoal dstrbuto,

More information

18.657: Mathematics of Machine Learning

18.657: Mathematics of Machine Learning 8.657: Mathematcs of Mache Learg Lecturer: Phlppe Rgollet Lecture 3 Scrbe: James Hrst Sep. 6, 205.5 Learg wth a fte dctoary Recall from the ed of last lecture our setup: We are workg wth a fte dctoary

More information

BERNSTEIN COLLOCATION METHOD FOR SOLVING NONLINEAR DIFFERENTIAL EQUATIONS. Aysegul Akyuz Dascioglu and Nese Isler

BERNSTEIN COLLOCATION METHOD FOR SOLVING NONLINEAR DIFFERENTIAL EQUATIONS. Aysegul Akyuz Dascioglu and Nese Isler Mathematcal ad Computatoal Applcatos, Vol. 8, No. 3, pp. 293-300, 203 BERNSTEIN COLLOCATION METHOD FOR SOLVING NONLINEAR DIFFERENTIAL EQUATIONS Aysegul Ayuz Dascoglu ad Nese Isler Departmet of Mathematcs,

More information

CSE 5526: Introduction to Neural Networks Linear Regression

CSE 5526: Introduction to Neural Networks Linear Regression CSE 556: Itroducto to Neural Netorks Lear Regresso Part II 1 Problem statemet Part II Problem statemet Part II 3 Lear regresso th oe varable Gve a set of N pars of data , appromate d by a lear fucto

More information

Transforms that are commonly used are separable

Transforms that are commonly used are separable Trasforms s Trasforms that are commoly used are separable Eamples: Two-dmesoal DFT DCT DST adamard We ca the use -D trasforms computg the D separable trasforms: Take -D trasform of the rows > rows ( )

More information

COMPROMISE HYPERSPHERE FOR STOCHASTIC DOMINANCE MODEL

COMPROMISE HYPERSPHERE FOR STOCHASTIC DOMINANCE MODEL Sebasta Starz COMPROMISE HYPERSPHERE FOR STOCHASTIC DOMINANCE MODEL Abstract The am of the work s to preset a method of rakg a fte set of dscrete radom varables. The proposed method s based o two approaches:

More information