Constrained Stochastic Gradient Descent for Large-scale Least Squares Problem


Yang Mu, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, MA, USA 02125, yangmu@cs.umb.edu
Tianyi Zhou, University of Technology Sydney, 235 Jones Street, Ultimo, NSW 2007, Australia
Wei Ding, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, MA, USA 02125, ding@cs.umb.edu
Dacheng Tao, University of Technology Sydney, 235 Jones Street, Ultimo, NSW 2007, Australia, dacheng.tao@uts.edu.au

ABSTRACT

The least squares problem is one of the most important regression problems in statistics, machine learning and data mining. In this paper, we present the Constrained Stochastic Gradient Descent (CSGD) algorithm to solve the large-scale least squares problem. CSGD improves Stochastic Gradient Descent (SGD) by imposing a provable constraint that the linear regression line passes through the mean point of all the data points. It results in the best regret bound O(log T) and the fastest convergence speed among all first-order approaches. Empirical studies justify the effectiveness of CSGD by comparing it with SGD and other state-of-the-art approaches. An example is also given to show how to use CSGD to optimize SGD-based least squares problems to achieve better performance.

Categories and Subject Descriptors: G.1.6 [Optimization]: Least squares methods, Stochastic programming; I.2.6 [Learning]: Parameter learning

General Terms: Algorithms, Theory

Keywords: Stochastic optimization, Large-scale least squares, online learning

* Corresponding author.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. KDD'13, August 11-14, 2013, Chicago, Illinois, USA. Copyright 2013 ACM .../13/08 ...$15.00.

1. INTRODUCTION

The stochastic least squares problem aims to find the coefficient w* in R^d that minimizes the objective function at step t,

    w_{t+1} = argmin_w (1/t) Σ_{i=1}^{t} (1/2) (y_i - w^T x_i)^2,    (1)

where (x_i, y_i) ∈ X × Y is an input-output pair randomly drawn from the data set (X, Y) endowed with a distribution D, with x_i ∈ R^d and y_i ∈ R; w_{t+1} ∈ R^d is the parameter that minimizes the empirical least squares loss at step t, and l(w, x, y) = (1/2)(y - w^T x)^2 is the empirical risk.

For large-scale problems, classical optimization methods, such as the interior point method and conjugate gradient descent, have to scan all data points several times in order to evaluate the objective function and find the optimal w*. Recently, Stochastic Gradient Descent (SGD) methods [7, 9, 29, 3, 20, 13] have shown promising efficiency in solving large-scale problems, and some of them have been widely applied to the least squares problem. The Least Mean Squares (LMS) algorithm [25] is the standard first-order SGD, which takes a scalar as the learning rate. The Recursive Least Squares (RLS) approach [25, 15] is an instantiation of the stochastic Newton method, replacing the scalar learning rate with an approximation of the inverse Hessian matrix. The Averaged Stochastic Gradient Descent (ASGD) [21] averages the SGD iterates to estimate w*. ASGD converges more stably than SGD, and its convergence rate even approaches that of a second-order method once the estimator is sufficiently close to w*, which happens only after processing a huge number of data points. Since the least squares loss l(w, X, Y) is usually strongly convex [5], first-order approaches can converge at the rate of O(1/T).
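To make the notation concrete, here is a minimal sketch (ours, not part of the paper) of the empirical risk l(w, x, y) = (1/2)(y - w^T x)^2, its gradient, and the plain SGD/LMS update w_{t+1} = w_t - η_t g_t that the approaches discussed below build on; the function names and the toy data point are illustrative assumptions.

```python
import numpy as np

def least_squares_grad(w, x, y):
    """Gradient of l(w, x, y) = 0.5 * (y - w^T x)^2 with respect to w."""
    return -(y - w @ x) * x

def sgd_step(w, x, y, eta):
    """One plain SGD/LMS step: w_{t+1} = w_t - eta_t * g_t."""
    return w - eta * least_squares_grad(w, x, y)

# Toy usage: one step on a random data point (d = 3, leading 1 as the bias feature).
rng = np.random.default_rng(0)
w = np.zeros(3)
x = np.concatenate(([1.0], rng.uniform(size=2)))  # x^(1) = 1
y = 0.7
w = sgd_step(w, x, y, eta=0.1)
```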
Given the smallest eigenvalue λ_0 of the Hessian matrix [8, 7, 18], many algorithms can achieve fast convergence rates and good regret bounds [7, 2, 10, 3, 11]. However, if the Hessian matrix is unknown in advance, SGD may perform poorly [18]. For high-dimensional large-scale problems, strong convexity is not always guaranteed, because the smallest eigenvalue of the Hessian matrix might be close to 0. Without the assumption of strong convexity, the convergence rate of first-order approaches reduces to O(1/√T) [30], while the computation complexity remains O(d) per iteration. Second-order approaches, which use a Hessian approximation, converge at the rate O(1/T). Although they are appealing due to their fast convergence rate and stability, the expensive O(d^2) time complexity of each iteration limits the use of second-order approaches in practical large-scale problems.

In this paper, we prove that the linear regression line defined by the optimal coefficient w* passes through the mean point (x̄, ȳ) of all the data points drawn from the distribution D. Given this property, we can significantly improve SGD for optimizing the large-scale least squares problem by adding an equality constraint w_{t+1}^T x̄_t - ȳ_t = 0, where (x̄_t, ȳ_t) is the batch mean of the data points collected up to step t during the optimization iterations. The batch mean (x̄_t, ȳ_t) is an unbiased estimate of (x̄, ȳ) (cf. the Law of Large Numbers). We term the proposed approach the constrained SGD (CSGD). CSGD shrinks the optimal solution space of the least squares problem from the entire R^d to a hyper-plane in R^d, and thus significantly improves the convergence rate and the regret bound. In particular:

- Without the strong convexity assumption, CSGD converges at the rate of O(log T / T), which is close to that of a full second-order approach, while retaining a time complexity of O(d) per iteration.
- CSGD achieves the O(log T) regret bound without requiring strong convexity, which is the best regret bound among existing SGD methods.

Note that when the data points are centralized (the mean is 0), the constraint becomes trivial and CSGD reduces to SGD, which is the worst case for CSGD. In practical online learning, however, the collected data points are often not centralized, and thus CSGD is preferred. In this paper, we only discuss the properties of CSGD when the data points are not centralized.

Notations. We denote the input data point x_t = [1, x_t^(2), ..., x_t^(d)]^T and w_t = [w_t^(1), ..., w_t^(d)]^T, where x_t^(1) = 1 is the first element of x_t and w_t^(1) is the bias parameter [6]. ||·||_p is the L_p norm, ||·||_p^2 is the squared L_p norm, |·| is the absolute value for scalars, and l(w_t) is an abbreviation for l(w_t, x_t, y_t). Norms without a subscript denote the L_2 norm.

2. CONSTRAINED STOCHASTIC GRADIENT DESCENT

We present the Constrained Stochastic Gradient Descent (CSGD) algorithm for the large-scale least squares problem by combining SGD with the fact that the linear regression line passes through the mean point of all the data points.

2.1 CSGD algorithm

The standard Stochastic Gradient Descent (SGD) algorithm takes the form

    w_{t+1} = Π_τ(w_t - η_t g_t),    (2)

where η_t is an appropriate learning rate, g_t is the gradient of the loss function l(w_t, x_t, y_t) = (1/2)(y_t - w_t^T x_t)^2, and Π_τ(·) is the Euclidean projection onto the predefined convex set τ,

    Π_τ(w) = argmin_{v ∈ τ} ||v - w||^2.    (3)

In least squares, w is defined over the entire R^d, and Π_τ(·) can be dropped; thus the search space in which SGD looks for the optimal solution is the entire R^d.

According to Theorem 2.1 (cf. Section 2.2), we add the constraint w^T x̄_t - ȳ_t = 0 at step t to SGD, where (x̄_t, ȳ_t) is an unbiased estimate of (x̄, ȳ) after t iterations, and obtain CSGD:

    w_{t+1} = argmin_w (1/t) Σ_{i=1}^{t} (1/2) (y_i - w^T x_i)^2,   s.t.  w^T x̄_t - ȳ_t = 0.    (4)

The constraint in Eq.(4) determines the hyper-plane τ_t = {w | w^T x̄_t = ȳ_t} residing in R^d. By replacing τ with τ_t in Eq.(2), we have

    w_{t+1} = Π_{τ_t}(w_t - η_t g_t).    (5)

The projection function Π_{τ_t}(·) projects a point onto the hyper-plane τ_t. By solving Eq.(3), Π_{τ_t}(·) is uniquely defined at each step t by

    Π_{τ_t}(v) = P_t v + r_t,    (6)

where P_t is the projection matrix at step t and takes the form

    P_t = I - x̄_t x̄_t^T / ||x̄_t||^2,    (7)

where P_t ∈ R^{d×d} is idempotent and projects a vector onto the subspace orthogonal to x̄_t, and r_t = ȳ_t x̄_t / ||x̄_t||^2. By combining Eqs.(5) and (6), the iterative procedure for CSGD is

    w_{t+1} = P_t (w_t - η_t g_t) + r_t.    (8)

We can obtain both the time and space complexities of the above procedure as O(d) after plugging Eq.(7) into Eq.(8) and expanding the update of w_{t+1}:

    w_{t+1} = (I - x̄_t x̄_t^T / ||x̄_t||^2)(w_t - η_t g_t) + r_t
            = w_t - η_t g_t - x̄_t (x̄_t^T (w_t - η_t g_t)) / ||x̄_t||^2 + r_t.    (9)

Algorithm 1 describes how to compute CSGD for the least squares problem. The algorithm has time and space complexities both of O(d).

Algorithm 1: Constrained Stochastic Gradient Descent (CSGD)
  Initialize w_1 = x̄_0 = 0 and ȳ_0 = 0.
  for t = 1, 2, 3, ... do
    Compute the gradient g_t ∈ ∂l(w_t, x_t, y_t).
    Compute (x̄_t, ȳ_t) with x̄_t = ((t-1)/t) x̄_{t-1} + (1/t) x_t and ȳ_t = ((t-1)/t) ȳ_{t-1} + (1/t) y_t.
    Compute w_{t+1} = w_t - η_t g_t - x̄_t (x̄_t^T (w_t - η_t g_t)) / ||x̄_t||^2 + r_t.
  end for
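The O(d) update in Eq.(9) and Algorithm 1 can be sketched as follows (a sketch under our own naming, not the authors' code): the running means x̄_t, ȳ_t are maintained incrementally, a plain SGD step is taken, and the result is projected onto the hyper-plane {w : w^T x̄_t = ȳ_t}. The η_0/√t schedule here is only a placeholder; the two-phase choice is discussed in Section 4.2.

```python
import numpy as np

def csgd(stream, d, eta0=0.1):
    """Constrained SGD (sketch of Algorithm 1). `stream` yields (x, y) pairs with x[0] == 1."""
    w = np.zeros(d)
    x_bar = np.zeros(d)
    y_bar = 0.0
    for t, (x, y) in enumerate(stream, start=1):
        g = -(y - w @ x) * x                 # gradient of 0.5 * (y - w^T x)^2
        x_bar += (x - x_bar) / t             # running mean of inputs up to step t
        y_bar += (y - y_bar) / t             # running mean of targets up to step t
        eta = eta0 / np.sqrt(t)              # placeholder learning rate
        v = w - eta * g                      # plain SGD step
        nx = x_bar @ x_bar
        # Project v onto the hyper-plane {w : w^T x_bar = y_bar}, as in Eq.(9).
        w = v - x_bar * (x_bar @ v) / nx + y_bar * x_bar / nx
    return w
```

After every update, w satisfies w^T x̄_t = ȳ_t exactly, which is the regression-line constraint of Theorem 2.1 applied to the batch mean.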

2.2 Regression line constraint

Algorithm 1 relies on the fact that the optimal solution lies on a hyper-plane determined by the mean point, which leads to a significant improvement in the convergence rate and the regret bound.

Theorem 2.1 (Regression line constraint). The optimal solution w* lies on the hyper-plane w^T x̄ - ȳ = 0, which is defined by the mean point (x̄, ȳ) of the data points drawn from X × Y endowed with the distribution D.

Proof. The loss function is explicitly defined as

    l(w, X, Y) = Σ_{(x,y) ∈ D} (1/2) (y - w^T x)^2,    (10)

where the first element of x is the constant 1 and w^(1) is the corresponding bias parameter. Setting the derivative of the loss function w.r.t. w^(1) to zero, we obtain

    Σ_{x ∈ X} w^T x - Σ_{y ∈ Y} y = 0.

Thus the optimal solution w* satisfies w*^T x̄ - ȳ = 0.

Theorem 2.1 is the core theorem of our method. Bishop [6] applied the derivative w.r.t. the bias w^(1) to study the property of the bias. However, although the theorem itself has a simple form, to the best of our knowledge it has never been stated and applied in any approach for least squares optimization.

The mean point (x̄, ȳ) over a distribution D is usually not given. In a stochastic approach, we can use the batch mean (x̄_t, ȳ_t) to approximate (x̄, ȳ). The approximation has an estimation error; however, it does not lower the performance, because the batch optimum always satisfies this constraint when optimizing the empirical loss. Therefore, we give the constrained estimation error bound for completeness.

Proposition 2.2 (Constrained estimation error bound). By the Law of Large Numbers, assume there is a step m such that, for any t ≥ m, ||w*|| ||x̄_t - x̄|| + |ȳ_t - ȳ| ≤ ε ||x̄_t|| for a tolerably small value ε. Then the estimation error bound ||Π_{τ_t}(w*) - w*|| ≤ ε holds for any step t ≥ m.

Proof. Since ||Π_{τ_t}(w*) - w*|| is the distance between w* and the hyper-plane τ_t, we have

    ||Π_{τ_t}(w*) - w*|| = |ȳ_t - w*^T x̄_t| / ||x̄_t||.

Along with w*^T x̄ - ȳ = 0, we have

    |ȳ_t - w*^T x̄_t| = |ȳ_t - ȳ - (w*^T x̄_t - w*^T x̄)|    (11)
                     ≤ ||w*|| ||x̄_t - x̄|| + |ȳ_t - ȳ|
                     ≤ ε ||x̄_t||,

where the step m exists since (x̄_t, ȳ_t) converges to (x̄, ȳ) according to the Law of Large Numbers. Therefore, ||Π_{τ_t}(w*) - w*|| ≤ ε.

Proposition 2.2 states that, if at step m the batch mean (x̄_m, ȳ_m) is close to (x̄, ȳ), then a sufficiently good solution (within an ε-ball centered at the optimal solution w*) lies on the hyper-plane τ_m. In addition, the estimation error decays, and its value is upper bounded by a weighted combination of ||x̄_t - x̄|| and |ȳ_t - ȳ|.

Notice that CSGD optimizes the empirical solution, which is always located on the hyper-plane τ_t. According to Theorem 2.1 and Proposition 2.2, under the regression constraint assumption, CSGD explicitly minimizes the empirical loss as well as the second-order SGD does. According to Proposition 2.2, ||Π_{τ_t}(w*) - w*|| converges at the same rate as ||x̄_t - x̄|| and |ȳ_t - ȳ|, whose exponential convergence rate [4] is supported by the Law of Large Numbers. Thus, we ignore the difference between Π_{τ_t}(w*) and w* in our theoretical analysis for simplicity.

3. A ONE-STEP DIFFERENCE INEQUALITY

To study the theoretical properties of CSGD, we start from the one-step difference bound, which is crucial for analyzing the regret and convergence behavior.

[Figure 1: An illustrating example after step t. ŵ_{t+1} is the SGD result; w_{t+1} is the projection of ŵ_{t+1} onto the hyper-plane τ_t; w* is the optimal solution; tan θ_t = ||ŵ_{t+1} - w_{t+1}|| / ||w_{t+1} - w*||.]

After iteration step t, CSGD projects the SGD result ŵ_{t+1} onto the hyper-plane τ_t to get a new result w_{t+1} with direction and step size correction. An illustration is given in Figure 1. Note that w* is assumed to lie on τ_t according to Proposition 2.2. In addition, the definition of a subgradient implies, for any g ∈ ∂l(w_t),

    l(w) ≥ l(w_t) + g^T (w - w_t),  i.e.,  g^T (w_t - w) ≥ l(w_t) - l(w).    (12)

With Eq.(12) we have the following theorems for the step difference bound. First, we state the step difference bound proved by Nemirovski for SGD.

Theorem 3.1 (Step difference bound of SGD). For any optimal solution w*, SGD satisfies the following inequality between steps t and t + 1:

    ||ŵ_{t+1} - w*||^2 ≤ ||ŵ_t - w*||^2 + η_t^2 ||g_t||^2 - 2 η_t (l(ŵ_t) - l(w*)),

where η_t is the learning rate at step t.

A detailed proof is given in [19]. Second, we prove the step difference bound of CSGD as follows.

Theorem 3.2 (Step difference bound of CSGD). For any optimal solution w*, the following inequality holds for CSGD between steps t and t + 1:

    ||w_{t+1} - w*||^2 ≤ ||w_t - w*||^2 + η_t^2 ||g_t||^2 - 2 η_t (l(w_t) - l(w*)) - ||ḡ_{t+1}||^2 / ||x̄_t||^4,    (13)

where w_t = Π_{τ_{t-1}}(ŵ_t) and ḡ_{t+1} = ∇l(ŵ_{t+1}, x̄_t, ȳ_t).

Proof. Since ŵ_{t+1} = w_t - η_t g_t, the pair ŵ_{t+1} and w_t follows Theorem 3.1 between the two steps. Therefore, we have

    ||ŵ_{t+1} - w*||^2 ≤ ||w_t - w*||^2 + η_t^2 ||g_t||^2 - 2 η_t (l(w_t) - l(w*)).    (14)

The Euclidean projection w_{t+1} = Π_{τ_t}(ŵ_{t+1}) given in Eq.(3), which is also shown in Figure 1, has the property (using that w* lies on τ_t)

    ||ŵ_{t+1} - w*||^2 = ||w_{t+1} - w*||^2 + ||w_{t+1} - ŵ_{t+1}||^2.    (15)

Then, substituting ||ŵ_{t+1} - w*||^2 given by Eq.(15) into Eq.(14) yields

    ||w_{t+1} - w*||^2 ≤ ||w_t - w*||^2 + η_t^2 ||g_t||^2 - 2 η_t (l(w_t) - l(w*)) - ||w_{t+1} - ŵ_{t+1}||^2.    (16)

By using the projection function defined in Eq.(6), we have

    ||w_{t+1} - ŵ_{t+1}||^2 = ||P_t ŵ_{t+1} + r_t - ŵ_{t+1}||^2
                            = ||-(x̄_t x̄_t^T / ||x̄_t||^2) ŵ_{t+1} + (ȳ_t / ||x̄_t||^2) x̄_t||^2
                            = (ȳ_t - x̄_t^T ŵ_{t+1})^2 / ||x̄_t||^2.    (17)

Since ḡ_{t+1} = ∇l(ŵ_{t+1}, x̄_t, ȳ_t) = -x̄_t (ȳ_t - x̄_t^T ŵ_{t+1}), we have ||w_{t+1} - ŵ_{t+1}||^2 = ||ḡ_{t+1}||^2 / ||x̄_t||^4.

A direct consequence of the step difference bound is the following theorem, which derives the convergence result of CSGD.

Theorem 3.3 (Loss bound). Assume (1) the norm of any gradient from ∂l is bounded by G, (2) the norm of w* is less than or equal to D, and (3) ||ḡ_{t+1}||^2 / ||x̄_t||^4 ≥ η_t^2 Ḡ^2 for a constant Ḡ ≥ 0. Then

    Σ_{t=1}^{T} η_t (l(w_t) - l(w*)) ≤ (D^2 + (G^2 - Ḡ^2) Σ_{t=1}^{T} η_t^2) / 2.

Proof. Rearranging the bound in Theorem 3.2 and summing the loss terms over t from 1 through T gives

    2 Σ_{t=1}^{T} η_t (l(w_t) - l(w*)) ≤ ||w_1 - w*||^2 - ||w_{T+1} - w*||^2 + Σ_{t=1}^{T} (η_t^2 ||g_t||^2 - ||ḡ_{t+1}||^2 / ||x̄_t||^4)    (18)
                                        ≤ D^2 + (G^2 - Ḡ^2) Σ_{t=1}^{T} η_t^2.

The final step uses the fact that ||w_1 - w*||^2 ≤ D^2, where w_1 is initialized to 0, along with ||g_t|| ≤ G for any t and the assumption ||ḡ_{t+1}||^2 / ||x̄_t||^4 ≥ η_t^2 Ḡ^2.

A corollary of this theorem is presented in the following. Although the convergence of CSGD follows immediately from Nemirovski's three-line subgradient descent convergence proof [17], we present our first corollary to underscore the rate of convergence when η is fixed; in general, T is approximately 1/ε^2, or equivalently, the error decreases as 1/√T.

Corollary 3.4 (Fixed step convergence rate). Assume Theorem 3.3 holds and, for a predetermined number of iterations T, set η_t = 1/√T. Then

    min_{t ≤ T} l(w_t) ≤ (1 / (2√T)) (D^2 + G^2 - Ḡ^2) + l(w*).

Proof. Let η_t = η = 1/√T for every step t; the bound in Theorem 3.3 becomes

    Σ_{t=1}^{T} (l(w_t) - l(w*)) ≤ (1 / (2η)) (D^2 + (G^2 - Ḡ^2) η^2 T).

The desired bound is achieved after plugging in the specific value of η and dividing both sides by T.

It is clear that the fixed-step convergence rate of CSGD is upper bounded by that of SGD, whose bound is recovered by taking out the Ḡ^2 term.

4. REGRET ANALYSIS

Regret is the difference between the total loss and the optimal loss; it has been analyzed in most online algorithms for evaluating correctness and convergence.

4.1 Regret

Let G be the upper bound of ||g_t|| for any t from 1, ..., T. We have the following theorem.

Theorem 4.1 (Regret bound for CSGD). The regret of CSGD satisfies

    R_G(T) ≤ (G^2 / (2H)) (1 + log T),

where H is a constant. Therefore, lim sup_{T→∞} R_G(T) / T ≤ 0.

Proof. In Theorem 3.2, Eq.(16) shows that

    2 η_t (l(w_t) - l(w*)) ≤ ||w_t - w*||^2 - ||w_{t+1} - w*||^2 + η_t^2 ||g_t||^2 - ||w_{t+1} - ŵ_{t+1}||^2,    (19)

where ||w_{t+1} - ŵ_{t+1}|| = ||w_{t+1} - w*|| tan θ_t and θ_t ∈ [0, π/2), as shown in Figure 1. Therefore, dividing by η_t and summing Eq.(19) over t from 1 to T, we have

    2 Σ_{t=1}^{T} (l(w_t) - l(w*)) ≤ Σ_{t=2}^{T} ||w_t - w*||^2 (1/η_t - (1 + tan^2 θ_{t-1}) / η_{t-1}) + (1/η_1) ||w_1 - w*||^2 + G^2 Σ_{t=1}^{T} η_t.    (20)

By adding the dummy term (1/η_0)(1 + tan^2 θ_0) ||w_1 - w*||^2 = 0 on the right-hand side, we have

    2 Σ_{t=1}^{T} (l(w_t) - l(w*)) ≤ Σ_{t=1}^{T} ||w_t - w*||^2 (1/η_t - (1 + tan^2 θ_{t-1}) / η_{t-1}) + G^2 Σ_{t=1}^{T} η_t.    (21)

Note that tan θ_t does not decrease with the step t, as shown in Lemma 4.2, and η_t does not increase with the step t. Since tan^2 θ_1 / η_1 > 0, we assume the lower bound tan^2 θ_{t-1} / η_{t-1} ≥ H holds for all t ≥ 1 (see the footnote below). Then Eq.(21) can be rewritten as

    2 Σ_{t=1}^{T} (l(w_t) - l(w*)) ≤ Σ_{t=1}^{T} ||w_t - w*||^2 (1/η_t - 1/η_{t-1} - H) + G^2 Σ_{t=1}^{T} η_t.    (22)

Setting η_t = 1/(Ht) for all t ≥ 1 makes the first sum vanish, and we have

    2 Σ_{t=1}^{T} (l(w_t) - l(w*)) ≤ (G^2 / H)(1 + log T),    (23)

so R_G(T) = Σ_{t=1}^{T} (l(w_t) - l(w*)) ≤ (G^2 / (2H))(1 + log T).

Footnote: This inequality is based on the assumption that H is positive. Although this assumption could be slightly violated when tan θ_t = 0, i.e., if w_t lies on τ_t and (x_t, y_t) = (x̄_t, ȳ_t), this event rarely happens in real cases. Even if it happens a finite number of times, the validity of our analysis is still provable, so we simply rule out this rare event in the theoretical analysis.

When tan θ_t becomes small, the improvement from the constraint is not significant, as shown in Figure 1. Lemma 4.2 shows that tan θ_t does not decrease as t increases if ŵ_{t+1} and w_{t+1} are close to w*. This indicates that the improvement from ŵ_{t+1} to w_{t+1} is stable under the regression constraint, which further establishes the stability of the regret bound.

Lemma 4.2. tan θ_t = ||w_{t+1} - ŵ_{t+1}|| / ||w_{t+1} - w*|| does not decrease with the step t.

Proof. It is known that ŵ_{t+1} and w_{t+1} converge to w*. If w_{t+1} converges faster than ŵ_{t+1}, tan θ_t diverges and the lemma holds trivially. SGD has the convergence rate O(1/√T), the worst convergence rate among the stochastic optimization approaches considered here. Before we prove that CSGD has a faster convergence rate than SGD, we temporarily make the conservative assumption that CSGD and SGD both converge at the rate O(t^{-α}), where α is a positive number. Let ||ŵ_{t+1} - w*||^2 be a t^{-α} and ||w_{t+1} - w*||^2 be b t^{-α}. Along with Eq.(15), we have

    tan^2 θ_t = ||w_{t+1} - ŵ_{t+1}||^2 / ||w_{t+1} - w*||^2 = (||ŵ_{t+1} - w*||^2 - ||w_{t+1} - w*||^2) / ||w_{t+1} - w*||^2 = (a - b) / b = O(1).

In our approach, the O(log T) regret bound achieved by CSGD requires neither strong convexity nor regularization, whereas Hazan et al. achieve the same O(log T) regret bound under the assumption of strong convexity [10], and Bartlett et al. [3] use regularization to obtain the same O(log T) regret bound for strongly convex functions and O(√T) for arbitrary convex functions. Furthermore, the O(log T) regret bound of CSGD is better than the general regret bound O(√T) discussed in [30].

The regret bound suggests a decreasing step size, which yields the convergence rate stated in the following corollary.

Corollary 4.3 (Decreasing step convergence rate). Assume Theorem 4.1 holds and η_t = 1/(Ht) for any step t. Then

    min_{t ≤ T} l(w_t) ≤ (G^2 / (2HT)) (1 + log T) + l(w*).

This corollary is a direct result of Theorem 4.1. It shows that the O(log T / T) convergence rate of CSGD is much better than the O(1/√T) obtained by SGD.

4.2 Learning rate strategy

Theorem 4.1 and Corollary 4.3 suggest an optimal learning rate decreasing at the rate O(1/t), without assuming strong convexity of the objective function. However, a decay proportional to the inverse of the number of iterations is not robust to a wrong setting of the proportionality constant: a wrong constant typically leads to divergence in the first several iterations, or to convergence to a point far away from the optimum. Motivated by this problem, we propose a two-phase learning rate strategy, defined as

    η_t = η_0 / √t        for t < m,
    η_t = η_0 √m / t      for t ≥ m.

The step m is reached when the desired error tolerance ε of Proposition 2.2 is obtained, m = O(1/ε^2). The maximum value of m is the total size of the dataset, since the global mean is obtained after one pass over the data.
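A minimal sketch of the two-phase schedule above (the helper name and the continuity check are ours): the rate decays as η_0/√t until the batch mean is trusted at step m, then switches to the O(1/t) decay suggested by the regret analysis, with the √m factor making the two phases meet at t = m.

```python
import math

def two_phase_rate(t, m, eta0):
    """Two-phase learning rate: eta0 / sqrt(t) for t < m, then eta0 * sqrt(m) / t for t >= m."""
    if t < m:
        return eta0 / math.sqrt(t)
    return eta0 * math.sqrt(m) / t

# The schedule is continuous at the switch point t = m.
m, eta0 = 10000, 0.5
assert abs(two_phase_rate(m, m, eta0) - eta0 / math.sqrt(m)) < 1e-12
```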

5. EXPERIMENTS

5.1 Optimization study

In this section, we perform numerical simulations to systematically analyze the proposed algorithms and to empirically verify our theoretical results. Our optimization study covers 5 algorithms, the proposed CSGD and NCSGD plus SGD, ASGD and 2SGD, for comparison:

1) Stochastic Gradient Descent (SGD) (a.k.a. the Robbins-Monro algorithm [12]): SGD, which is also known as the Least Mean Squares approach for solving stochastic least squares, is chosen as the baseline algorithm.

2) Averaged Stochastic Gradient Descent (ASGD) (a.k.a. Polyak-Ruppert averaging [21]): ASGD performs the SGD update and returns the average point at each iteration. ASGD, as a state-of-the-art approach in first-order stochastic optimization, has achieved good performance [9, 2].

3) Constrained Stochastic Gradient Descent (CSGD): CSGD uses the two-phase learning rate strategy in order to achieve the proposed regret bound.

4) Naive Constrained Stochastic Gradient Descent (NCSGD): NCSGD is a naive version of CSGD that updates with the same learning rate as SGD, to illustrate that the optimization error of NCSGD is upper bounded by that of SGD at each step.

5) Second-Order Stochastic Gradient Descent (2SGD) [7]: 2SGD replaces the learning rate η_t with the inverse of the Hessian matrix, which yields Recursive Least Squares, a second-order stochastic least squares solution. Compared to the first-order approaches, 2SGD is considered the best possible approach using a full Hessian matrix.

The experiments for least squares optimization are conducted in two different settings: the strongly convex and non-strongly convex cases. The difference between strongly convex and non-strongly convex objectives has been extensively studied in convex optimization and machine learning with respect to the selection of the learning rate strategy [2, 24]. A learning rate decaying as the inverse of the number of samples has been theoretically suggested to achieve the optimal rate of convergence in the strongly convex case [10]; in practice, however, such a learning rate may decrease too fast and the iterate can get stuck too far away from the optimal solution [21]. To solve this problem, a slower decay η_t = η_0 t^{-α} with α ∈ (1/2, 1) has been proposed for the learning rate [2]. A (λ/2)||w||^2 regularization term can also be added to obtain a λ-strongly convex and uniformly Lipschitz objective [9, 1]. To ensure convergence, we safely set α = 1/2 [30] as a robust learning rate for all algorithms in our empirical study, which guarantees that all algorithms converge to a close vicinity of the optimal solution. In order to study the real convergence performance of the different approaches, we use prior knowledge of the spectrum of the Hessian matrix to obtain the best η_0. To achieve the regret bound in Theorem 4.1, the two-phase learning rate is utilized for CSGD, and m is set to half of the dataset size, n/2.

We generate n input samples with d dimensions i.i.d. from a uniform distribution between 0 and 1. The optimal coefficient w* is randomly drawn from a standard Gaussian distribution. The n data samples for our experiments are then constructed from the n input samples and the coefficient, with zero-mean Gaussian noise of variance 0.2. Two settings for least squares optimization are designed: 1) a low-dimension case with strong convexity, where n = 10,000 and d = 100; 2) a high-dimension case, where n = 5,000 and d = 5,000, with the smallest eigenvalue of the Hessian matrix close to 0, which yields a non-strongly convex case. In each iteration round, one sample is drawn from the corresponding dataset uniformly at random.

In the strongly convex case, as shown in the top row of Figure 2, CSGD behaves similarly to 2SGD and outperforms the other first-order approaches.
As we know, 2SGD, as a second-order approach, gives the best possible per-iteration progress among the compared approaches. However, 2SGD requires O(d^2) computation and space complexity in each iteration. CSGD performs like 2SGD but only requires O(d) computation and space complexity, and CSGD has performance comparable to 2SGD when optimizing within a given amount of CPU time, as shown in the top right slot of Figure 2.

In the non-strongly convex case, as shown in the bottom row of Figure 2, CSGD performs the best among all first-order approaches. 2SGD becomes impractical in this high-dimensional case due to its high computation complexity. CSGD has a slow start at the beginning because it adopts a larger initial learning rate η_0, which yields better convergence in the second phase; this is also consistent with the comparisons using wrong initial learning rates discussed in [2]. In addition, the non-strong convexity corresponds to an infinite ratio of the eigenvalues of the Hessian matrix, which significantly slows down SGD. NCSGD is not affected by this and still performs consistently better than SGD.

In our empirical study, we have observed: 1) NCSGD consistently performs better than SGD, and the experimental results verify Corollary 4.3; 2) CSGD performs very similarly to the second-order approach, which supports the regret bound discussed in Theorem 4.1; 3) CSGD performs the best among all the other state-of-the-art first-order approaches; 4) CSGD is the most competent algorithm for dealing with high-dimensional data among all the algorithms in our experiments.

[Figure 2: Comparison of the methods on the low-dimension case (top) and the high-dimension case (bottom, d = 5,000). Each row plots l(w) - l(w*) versus the number of samples and versus CPU time (seconds) for SGD, 2SGD, NCSGD, CSGD and ASGD.]

5.2 Classification extension

In the classification task, the least squares loss function plays an important role, and the optimization of least squares is the cornerstone of least squares based algorithms such as the Least Squares Support Vector Machine (LS-SVM) [26], Regularized Least-Squares Classification [22, 6], etc. In these algorithms, SGD is utilized by default during the optimization. It is well acknowledged that the optimization speed for least squares directly affects the performance of least squares based algorithms: a faster optimization procedure corresponds to fewer training iterations and less training time. Therefore, by replacing SGD with CSGD for the least squares optimization in many algorithms, the performance can be greatly improved. In this subsection, we show an example of how to adopt CSGD in the optimization of existing classification approaches.

One direct classification approach using the least squares loss is the ADAptive LINear Element (Adaline) [16], a well-known method in neural networks. Adaline adopts a simple perceptron-like system that accomplishes classification by modifying the coefficients to minimize the least squares error at every iteration. Note that, although a linear classifier may not achieve perfect classification, the direct optimization of least squares is commonly used as a subroutine in more complex algorithms, such as Multiple Adaline (Madaline) [16], which achieves non-linear separability by using multiple layers of Adalines. Since the fundamental optimization procedure of these least squares algorithms is the same, we only show the basic Adaline case to demonstrate that CSGD can improve the performance.

In Adaline, each neuron separates two classes using a coefficient vector w, and the equation of the separating hyper-plane can be derived from the coefficient vector. Specifically, to classify an input sample x_t, let net_t be the net input of this neuron, where net_t = w^T x_t. The output o_t of Adaline is 1 when net_t > 0, and o_t is -1 otherwise. The crucial part of training the Adaline is to obtain the best coefficient vector w, which is updated per iteration by minimizing the squared error. At iteration t, the squared error is (1/2)(y_t - net_t)^2, where y_t is 1 or -1, representing the positive or negative class respectively. Adaline adopts SGD for optimization, whose learning rule is given by

    w_{t+1} = w_t - η g_t,    (24)

where η is a constant learning rate and the gradient g_t = -(y_t - w_t^T x_t) x_t. When replacing SGD with CSGD for Adaline, Eq.(24) is replaced with Eq.(9).

In the multiclass classification case, suppose there are c classes. Adaline needs c neurons to perform the classification, and each neuron still performs binary class discrimination. The CSGD version of Adaline (C-Adaline) is depicted in Algorithm 2, which is straightforward and easy to implement. One thing to point out is that the class label y_t has to be rebuilt in order to fit the c neurons. Therefore, the class label y_t of sample x_t for neuron c_i is defined as y_{(t,c_i)} = 1 when y_t = i and y_{(t,c_i)} = 0 otherwise. The output o_t = k means that the k-th neuron c_k has the highest net value among the c neurons.

Algorithm 2: CSGD version of Adaline (C-Adaline)
  Initialize w_0 = x̄_0 = 0 and ȳ_0 = 0.
  for t = 1, 2, 3, ... do
    for i = 1, 2, 3, ..., c do
      Compute the gradient g_t ∈ ∂l(w_t, x_t, y_{(t,c_i)}).
      Compute (x̄_t, ȳ_{(t,c_i)}) with x̄_t = ((t-1)/t) x̄_{t-1} + (1/t) x_t and ȳ_{(t,c_i)} = ((t-1)/t) ȳ_{(t-1,c_i)} + (1/t) y_{(t,c_i)}.
      Compute w_{t+1} = w_t - η_t g_t - x̄_t (x̄_t^T (w_t - η_t g_t)) / ||x̄_t||^2 + r_t.
    end for
  end for

To evaluate the improvement of C-Adaline, we provide computational experiments of both Adaline and C-Adaline on the MNIST dataset [14], which is widely used as a stochastic optimization classification benchmark for handwritten digit recognition [9, 8].
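A compact sketch of Algorithm 2 follows (ours, with one weight vector per neuron, the one-vs-rest labels y_{(t,c_i)} built as described above, and prediction by the largest net value). The constant learning rate is a simplifying assumption; the experiments below use the NCSGD-style decaying rate.

```python
import numpy as np

def c_adaline_train(stream, d, num_classes, eta=0.0625):
    """C-Adaline sketch: CSGD updates for c one-vs-rest least squares neurons.
    `stream` yields (x, label) pairs; x is assumed to include a leading 1 as bias."""
    W = np.zeros((num_classes, d))     # one coefficient vector per neuron
    x_bar = np.zeros(d)
    y_bar = np.zeros(num_classes)      # per-neuron running mean of rebuilt labels
    for t, (x, label) in enumerate(stream, start=1):
        x_bar += (x - x_bar) / t
        nx = x_bar @ x_bar
        for i in range(num_classes):
            y_i = 1.0 if label == i else 0.0          # rebuilt label y_(t, c_i)
            y_bar[i] += (y_i - y_bar[i]) / t
            g = -(y_i - W[i] @ x) * x                 # least squares gradient
            v = W[i] - eta * g
            W[i] = v - x_bar * (x_bar @ v) / nx + y_bar[i] * x_bar / nx
    return W

def c_adaline_predict(W, x):
    """Output o_t: index of the neuron with the highest net value w_i^T x."""
    return int(np.argmax(W @ x))
```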

[Figure 3: Left: test error rate versus iteration on the MNIST classification. Right: test error rate versus CPU time on the MNIST classification. C-Adaline is the proposed CSGD version of Adaline.]

The MNIST dataset consists of 60,000 training samples and 10,000 test samples with 10 classes. Each digit is represented by a 28x28 gray-scale image, for a total of 784 features. All the data is rescaled to [0, 1] by dividing by the maximum pixel intensity of 255. For the setting of the learning rate, Adaline adopts a constant, while C-Adaline simply takes the updating rule of Naive Constrained Stochastic Gradient Descent (NCSGD). η for Adaline is set to 2^{-17}, and C-Adaline has η_0 = 2^{-4}; both are the optimal values selected from {2^{-20}, 2^{-19}, ..., 2^{-1}}. Since the optimal solution is unique, this experiment examines how fast Adaline and C-Adaline converge to this optimum.

The test set error rate as a function of the number of processed samples is shown in Figure 3 (left). It is clear that both Adaline and C-Adaline converge to the same test error, because they both optimize the least squares error. C-Adaline achieves a 0.14 test error after processing 2^{14} samples (about 1.6 x 10^4), while Adaline achieves a 0.14 test error after processing 2^{20} samples (about 10^6). This indicates that C-Adaline converges 64 times as fast as Adaline. Considering that the size of the training set is 60,000, C-Adaline uses about 1/4 of the total training samples to achieve the nearly optimal test error rate, while Adaline needs to visit each training sample more than 16 times to achieve the same test error rate. Figure 3 (right) shows the test error rate versus CPU time. To achieve the 0.14 test error, C-Adaline takes only 3.38 seconds, far less than Adaline. Note that, in this experiment, the 10 neurons are trained in parallel. It is another achievement to reach the nearly optimal test error of a least squares classifier in about 3 seconds on a 10^6-scale dataset.

To better understand the classification results, in Figure 4 we visualize the data samples in a two-dimensional space by t-SNE [27], a nonlinear mapping commonly used for exploring the inherent structure of high-dimensional data. Since a linear classifier does not perform well on this problem and both algorithms ultimately reach the same classification error, we suppress the samples that are still misclassified in Figure 4 for clarity's sake. When both algorithms have processed 2^{14} training samples, their classification results on the 10,000 test samples are depicted in Figure 4: Adaline misclassifies about 6 times as many samples as C-Adaline.

This experiment compares an SGD-based classifier (Adaline) and the proposed CSGD-improved version (C-Adaline) on the MNIST dataset. In summary, C-Adaline consistently has a better test error rate than the original Adaline during the optimization, and for a given test error rate, C-Adaline takes much less CPU time than Adaline.

6. CONCLUSION

In this paper, we analyze a new constraint-based stochastic gradient descent solution for the large-scale least squares problem. We provide theoretical justifications for the proposed methods, CSGD and NCSGD, which use the averaging hyper-plane as the projected hyper-plane. Specifically, we describe the convergence rates as well as the regret bounds for the proposed method. CSGD performs like a full second-order approach but with simpler computation than 2SGD. The optimal regret O(log T) is achieved by CSGD when adopting the corresponding learning rate strategy. The theoretical superiority is justified by the experimental results. In addition, it is easy to extend SGD-based least squares algorithms to CSGD, and the CSGD version can yield better performance.
An example of upgrading Adaline from SGD to CSGD is used to demonstrate the straightforward but efficient implementation of CSGD.

[Figure 4: Two-dimensional t-SNE visualization of the classification results for C-Adaline (left) and Adaline (right) on the MNIST dataset when 2^{14} samples have been processed.]

References

[1] A. Agarwal, S. Negahban, and M. Wainwright. Stochastic optimization and sparse statistical recovery: Optimal algorithms for high dimensions. Advances in Neural Information Processing Systems, 2012.
[2] F. Bach and E. Moulines. Non-asymptotic analysis of stochastic approximation algorithms for machine learning. Advances in Neural Information Processing Systems, 2011.
[3] P. L. Bartlett, E. Hazan, and A. Rakhlin. Adaptive online gradient descent. Advances in Neural Information Processing Systems, 2007.
[4] L. E. Baum and M. Katz. Exponential convergence rates for the law of large numbers. Transactions of the American Mathematical Society, 1965.
[5] D. P. Bertsekas. Nonlinear programming. Athena Scientific, 1999.
[6] C. M. Bishop. Pattern recognition and machine learning. Springer-Verlag New York, 2006.
[7] L. Bottou and O. Bousquet. The tradeoffs of large scale learning. Advances in Neural Information Processing Systems, 20, 2008.
[8] L. Bottou and Y. LeCun. Large scale online learning. Advances in Neural Information Processing Systems, 2003.
[9] J. C. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121-2159, 2011.
[10] E. Hazan, A. Agarwal, and S. Kale. Logarithmic regret algorithms for online convex optimization. Machine Learning, 69(2-3):169-192, 2007.
[11] C. Hu, J. T. Kwok, and W. Pan. Accelerated gradient methods for stochastic optimization and online learning. Advances in Neural Information Processing Systems, 2009.
[12] H. J. Kushner and G. Yin. Stochastic approximation and recursive algorithms and applications. Springer-Verlag, 2003.
[13] J. Langford, L. Li, and T. Zhang. Sparse online learning via truncated gradient. Journal of Machine Learning Research, 10:777-801, 2009.
[14] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324, 1998.
[15] L. Ljung. Analysis of stochastic gradient algorithms for linear regression problems. IEEE Transactions on Information Theory, 30(2):151-160, 1984.
[16] K. Mehrotra, C. K. Mohan, and S. Ranka. Elements of artificial neural networks. MIT Press, 1997.
[17] A. Nemirovski. Efficient methods in convex programming. Lecture notes.
[18] A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4):1574-1609, 2009.
[19] A. Nemirovski and D. Yudin. Problem complexity and method efficiency in optimization. John Wiley and Sons, 1983.
[20] Y. Nesterov. Introductory lectures on convex optimization: A basic course (Applied Optimization). 2004.
[21] B. T. Polyak and A. B. Juditsky. Acceleration of stochastic approximation by averaging. SIAM Journal on Control and Optimization, 30(4):838-855, 1992.
[22] R. Rifkin, G. Yeo, and T. Poggio. Regularized least-squares classification. Nato Science Series Sub Series III: Computer and Systems Sciences, 190, 2003.
[23] S. Shalev-Shwartz and S. M. Kakade. Mind the duality gap: Logarithmic regret algorithms for online optimization. Advances in Neural Information Processing Systems, 2008.
[24] S. Shalev-Shwartz and N. Srebro. SVM optimization: inverse dependence on training set size. International Conference on Machine Learning, 2008.
[25] J. C. Spall. Introduction to stochastic search and optimization. John Wiley and Sons, 2003.
[26] J. Suykens and J. Vandewalle. Least squares support vector machine classifiers. Neural Processing Letters, 1999.
[27] L. van der Maaten and G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9:2579-2605, 2008.
[28] L. Xiao. Dual averaging methods for regularized stochastic learning and online optimization. Journal of Machine Learning Research, 11, 2010.
[29] T. Zhang. Solving large scale linear prediction problems using stochastic gradient descent algorithms. International Conference on Machine Learning, 2004.
[30] M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. International Conference on Machine Learning, 2003.


More information

Inventory Analysis and Management. Multi-Period Stochastic Models: Optimality of (s, S) Policy for K-Convex Objective Functions

Inventory Analysis and Management. Multi-Period Stochastic Models: Optimality of (s, S) Policy for K-Convex Objective Functions Muli-Period Sochasic Models: Opimali of (s, S) Polic for -Convex Objecive Funcions Consider a seing similar o he N-sage newsvendor problem excep ha now here is a fixed re-ordering cos (> 0) for each (re-)order.

More information

The Asymptotic Behavior of Nonoscillatory Solutions of Some Nonlinear Dynamic Equations on Time Scales

The Asymptotic Behavior of Nonoscillatory Solutions of Some Nonlinear Dynamic Equations on Time Scales Advances in Dynamical Sysems and Applicaions. ISSN 0973-5321 Volume 1 Number 1 (2006, pp. 103 112 c Research India Publicaions hp://www.ripublicaion.com/adsa.hm The Asympoic Behavior of Nonoscillaory Soluions

More information

EXERCISES FOR SECTION 1.5

EXERCISES FOR SECTION 1.5 1.5 Exisence and Uniqueness of Soluions 43 20. 1 v c 21. 1 v c 1 2 4 6 8 10 1 2 2 4 6 8 10 Graph of approximae soluion obained using Euler s mehod wih = 0.1. Graph of approximae soluion obained using Euler

More information

Boosting with Online Binary Learners for the Multiclass Bandit Problem

Boosting with Online Binary Learners for the Multiclass Bandit Problem Shang-Tse Chen School of Compuer Science, Georgia Insiue of Technology, Alana, GA Hsuan-Tien Lin Deparmen of Compuer Science and Informaion Engineering Naional Taiwan Universiy, Taipei, Taiwan Chi-Jen

More information

MATH 5720: Gradient Methods Hung Phan, UMass Lowell October 4, 2018

MATH 5720: Gradient Methods Hung Phan, UMass Lowell October 4, 2018 MATH 5720: Gradien Mehods Hung Phan, UMass Lowell Ocober 4, 208 Descen Direcion Mehods Consider he problem min { f(x) x R n}. The general descen direcions mehod is x k+ = x k + k d k where x k is he curren

More information

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing Applicaion of a Sochasic-Fuzzy Approach o Modeling Opimal Discree Time Dynamical Sysems by Using Large Scale Daa Processing AA WALASZE-BABISZEWSA Deparmen of Compuer Engineering Opole Universiy of Technology

More information

Network Newton Distributed Optimization Methods

Network Newton Distributed Optimization Methods Nework Newon Disribued Opimizaion Mehods Aryan Mokhari, Qing Ling, and Alejandro Ribeiro Absrac We sudy he problem of minimizing a sum of convex objecive funcions where he componens of he objecive are

More information

Unit Root Time Series. Univariate random walk

Unit Root Time Series. Univariate random walk Uni Roo ime Series Univariae random walk Consider he regression y y where ~ iid N 0, he leas squares esimae of is: ˆ yy y y yy Now wha if = If y y hen le y 0 =0 so ha y j j If ~ iid N 0, hen y ~ N 0, he

More information

Learning a Class from Examples. Training set X. Class C 1. Class C of a family car. Output: Input representation: x 1 : price, x 2 : engine power

Learning a Class from Examples. Training set X. Class C 1. Class C of a family car. Output: Input representation: x 1 : price, x 2 : engine power Alpaydin Chaper, Michell Chaper 7 Alpaydin slides are in urquoise. Ehem Alpaydin, copyrigh: The MIT Press, 010. alpaydin@boun.edu.r hp://www.cmpe.boun.edu.r/ ehem/imle All oher slides are based on Michell.

More information

di Bernardo, M. (1995). A purely adaptive controller to synchronize and control chaotic systems.

di Bernardo, M. (1995). A purely adaptive controller to synchronize and control chaotic systems. di ernardo, M. (995). A purely adapive conroller o synchronize and conrol chaoic sysems. hps://doi.org/.6/375-96(96)8-x Early version, also known as pre-prin Link o published version (if available):.6/375-96(96)8-x

More information

The Optimal Stopping Time for Selling an Asset When It Is Uncertain Whether the Price Process Is Increasing or Decreasing When the Horizon Is Infinite

The Optimal Stopping Time for Selling an Asset When It Is Uncertain Whether the Price Process Is Increasing or Decreasing When the Horizon Is Infinite American Journal of Operaions Research, 08, 8, 8-9 hp://wwwscirporg/journal/ajor ISSN Online: 60-8849 ISSN Prin: 60-8830 The Opimal Sopping Time for Selling an Asse When I Is Uncerain Wheher he Price Process

More information

Designing Information Devices and Systems I Spring 2019 Lecture Notes Note 17

Designing Information Devices and Systems I Spring 2019 Lecture Notes Note 17 EES 16A Designing Informaion Devices and Sysems I Spring 019 Lecure Noes Noe 17 17.1 apaciive ouchscreen In he las noe, we saw ha a capacior consiss of wo pieces on conducive maerial separaed by a nonconducive

More information

A variational radial basis function approximation for diffusion processes.

A variational radial basis function approximation for diffusion processes. A variaional radial basis funcion approximaion for diffusion processes. Michail D. Vreas, Dan Cornford and Yuan Shen {vreasm, d.cornford, y.shen}@ason.ac.uk Ason Universiy, Birmingham, UK hp://www.ncrg.ason.ac.uk

More information

Morning Time: 1 hour 30 minutes Additional materials (enclosed):

Morning Time: 1 hour 30 minutes Additional materials (enclosed): ADVANCED GCE 78/0 MATHEMATICS (MEI) Differenial Equaions THURSDAY JANUARY 008 Morning Time: hour 30 minues Addiional maerials (enclosed): None Addiional maerials (required): Answer Bookle (8 pages) Graph

More information

Ordinary Differential Equations

Ordinary Differential Equations Ordinary Differenial Equaions 5. Examples of linear differenial equaions and heir applicaions We consider some examples of sysems of linear differenial equaions wih consan coefficiens y = a y +... + a

More information

Appendix to Creating Work Breaks From Available Idleness

Appendix to Creating Work Breaks From Available Idleness Appendix o Creaing Work Breaks From Available Idleness Xu Sun and Ward Whi Deparmen of Indusrial Engineering and Operaions Research, Columbia Universiy, New York, NY, 127; {xs2235,ww24}@columbia.edu Sepember

More information

ODEs II, Lecture 1: Homogeneous Linear Systems - I. Mike Raugh 1. March 8, 2004

ODEs II, Lecture 1: Homogeneous Linear Systems - I. Mike Raugh 1. March 8, 2004 ODEs II, Lecure : Homogeneous Linear Sysems - I Mike Raugh March 8, 4 Inroducion. In he firs lecure we discussed a sysem of linear ODEs for modeling he excreion of lead from he human body, saw how o ransform

More information

Nature Neuroscience: doi: /nn Supplementary Figure 1. Spike-count autocorrelations in time.

Nature Neuroscience: doi: /nn Supplementary Figure 1. Spike-count autocorrelations in time. Supplemenary Figure 1 Spike-coun auocorrelaions in ime. Normalized auocorrelaion marices are shown for each area in a daase. The marix shows he mean correlaion of he spike coun in each ime bin wih he spike

More information

Experiments on logistic regression

Experiments on logistic regression Experimens on logisic regression Ning Bao March, 8 Absrac In his repor, several experimens have been conduced on a spam daa se wih Logisic Regression based on Gradien Descen approach. Firs, he overfiing

More information

Christos Papadimitriou & Luca Trevisan November 22, 2016

Christos Papadimitriou & Luca Trevisan November 22, 2016 U.C. Bereley CS170: Algorihms Handou LN-11-22 Chrisos Papadimiriou & Luca Trevisan November 22, 2016 Sreaming algorihms In his lecure and he nex one we sudy memory-efficien algorihms ha process a sream

More information

Speaker Adaptation Techniques For Continuous Speech Using Medium and Small Adaptation Data Sets. Constantinos Boulis

Speaker Adaptation Techniques For Continuous Speech Using Medium and Small Adaptation Data Sets. Constantinos Boulis Speaker Adapaion Techniques For Coninuous Speech Using Medium and Small Adapaion Daa Ses Consaninos Boulis Ouline of he Presenaion Inroducion o he speaker adapaion problem Maximum Likelihood Sochasic Transformaions

More information

4.1 Other Interpretations of Ridge Regression

4.1 Other Interpretations of Ridge Regression CHAPTER 4 FURTHER RIDGE THEORY 4. Oher Inerpreaions of Ridge Regression In his secion we will presen hree inerpreaions for he use of ridge regression. The firs one is analogous o Hoerl and Kennard reasoning

More information

The L p -Version of the Generalized Bohl Perron Principle for Vector Equations with Infinite Delay

The L p -Version of the Generalized Bohl Perron Principle for Vector Equations with Infinite Delay Advances in Dynamical Sysems and Applicaions ISSN 973-5321, Volume 6, Number 2, pp. 177 184 (211) hp://campus.ms.edu/adsa The L p -Version of he Generalized Bohl Perron Principle for Vecor Equaions wih

More information

Section 3.5 Nonhomogeneous Equations; Method of Undetermined Coefficients

Section 3.5 Nonhomogeneous Equations; Method of Undetermined Coefficients Secion 3.5 Nonhomogeneous Equaions; Mehod of Undeermined Coefficiens Key Terms/Ideas: Linear Differenial operaor Nonlinear operaor Second order homogeneous DE Second order nonhomogeneous DE Soluion o homogeneous

More information

Inventory Control of Perishable Items in a Two-Echelon Supply Chain

Inventory Control of Perishable Items in a Two-Echelon Supply Chain Journal of Indusrial Engineering, Universiy of ehran, Special Issue,, PP. 69-77 69 Invenory Conrol of Perishable Iems in a wo-echelon Supply Chain Fariborz Jolai *, Elmira Gheisariha and Farnaz Nojavan

More information

DEPARTMENT OF STATISTICS

DEPARTMENT OF STATISTICS A Tes for Mulivariae ARCH Effecs R. Sco Hacker and Abdulnasser Haemi-J 004: DEPARTMENT OF STATISTICS S-0 07 LUND SWEDEN A Tes for Mulivariae ARCH Effecs R. Sco Hacker Jönköping Inernaional Business School

More information

FITTING OF A PARTIALLY REPARAMETERIZED GOMPERTZ MODEL TO BROILER DATA

FITTING OF A PARTIALLY REPARAMETERIZED GOMPERTZ MODEL TO BROILER DATA FITTING OF A PARTIALLY REPARAMETERIZED GOMPERTZ MODEL TO BROILER DATA N. Okendro Singh Associae Professor (Ag. Sa.), College of Agriculure, Cenral Agriculural Universiy, Iroisemba 795 004, Imphal, Manipur

More information

Comparing Means: t-tests for One Sample & Two Related Samples

Comparing Means: t-tests for One Sample & Two Related Samples Comparing Means: -Tess for One Sample & Two Relaed Samples Using he z-tes: Assumpions -Tess for One Sample & Two Relaed Samples The z-es (of a sample mean agains a populaion mean) is based on he assumpion

More information

Particle Swarm Optimization Combining Diversification and Intensification for Nonlinear Integer Programming Problems

Particle Swarm Optimization Combining Diversification and Intensification for Nonlinear Integer Programming Problems Paricle Swarm Opimizaion Combining Diversificaion and Inensificaion for Nonlinear Ineger Programming Problems Takeshi Masui, Masaoshi Sakawa, Kosuke Kao and Koichi Masumoo Hiroshima Universiy 1-4-1, Kagamiyama,

More information