TECHNICAL REPORT NO Generalization and Regularization in Nonlinear Learning Systems 1


DEPARTMENT OF STATISTICS
University of Wisconsin
1210 West Dayton St.
Madison, WI

TECHNICAL REPORT NO.
February 28, 2000

Generalization and Regularization in Nonlinear Learning Systems ¹

by Grace Wahba

¹ Prepared for the Handbook of Brain Theory and Neural Networks, Second Edition, Michael Arbib, Ed., within the space and reference limitations of the Handbook. This TR is an updated version of the entry of the same name in the First Edition, 1995, which was also printed as TR. Supported by NIH Grant EY09946 and NSF Grant DMS.

Grace Wahba
Department of Statistics, University of Wisconsin
1210 W. Dayton St., Madison, WI
wahba@stat.wisc.edu
February 24

1 Introduction

In this article we will describe generalization and regularization from the point of view of multivariate function estimation in a statistical context. Multivariate function estimation is not, in principle, distinguishable from supervised machine learning. However, until fairly recently supervised machine learning and multivariate function estimation had fairly distinct groups of practitioners, and small overlap in language, literature, and in the kinds of practical problems under study.

In any case, we are given a training set, consisting of pairs of input (feature) vectors and associated outputs $\{t(i), y_i\}$, for $n$ training or example subjects, $i = 1, \dots, n$. From this data, it is desired to construct a map which generalizes well, that is, given a new value of $t$, the map will provide a reasonable prediction for the unobserved output associated with this $t$.

Most applications fall into one of two broad categories, which might be called nonparametric regression and classification. In nonparametric regression, $y$ may be (any) real number or a vector of $r$ real numbers. The desired algorithm will produce an estimate $\hat f(t)$ of the expected value of a (new) $y$ to be associated with a (new) attribute vector $t$. In the (two-class) classification problem $y_i$ will be an indicator of whether or not the example (subject) came from class A. In some classification applications, the desired algorithm will, given $t$, return an indicator which predicts whether or not an example with attribute vector $t$ comes from class A ("hard" classification). In other applications the desired algorithm will return $p(t)$, an estimate of the probability that the example with attribute vector $t$ is in class A ("soft" classification). In some applications the feature vector $t$ of dimension $d$ contains zeroes and ones (for example as in a bitmap of handwriting); in others it may contain real numbers representing some physical quantities. Ordered or unordered category indicators are also possible, as in medical demographic studies.
Regularization, loosely speaking, means that while the desired map is constructed to approximately send the observed feature vectors to the observed outputs, constraints are applied to the construction of the map with the goal of reducing the generalization error. In some applications, these constraints embody a priori information concerning the true relationship between input and output; alternatively, various ad hoc constraints have sometimes been shown to work well in practice. Girosi, Jones and Poggio (1995) give a wide-ranging review.

2 Generalization and Regularization in Nonparametric Regression

2.1 Single Input Spline Smoothing

We will use Figure 1 to illustrate the ideas of generalization and regularization in the simplest possible nonparametric regression setup, that is, $d = 1$, $r = 1$, with $t$ any real number in some interval of the real line. The circles (which are identical in each of the three panels of Figure 1) represent $n = 100$ (synthetically generated) input-output pairs $\{t(i), y_i\}$, generated according to the model

$$y_i = f_{TRUE}(t(i)) + \epsilon_i, \quad i = 1, \dots, n, \tag{1}$$

where $f_{TRUE}(t) = 4.26\,(e^{-t} - 4e^{-2t} + 3e^{-3t})$, and the $\epsilon_i$ came from a pseudo-random number generator for Normally distributed random variables with mean 0 and standard deviation $\sigma = 0.2$.

Given this training data $\{t(i), y_i,\ i = 1, \dots, n\}$, the learning problem is to create a map which, if given a new value of $t$, will predict the response $y(t)$. In this case, the data are noisy, so that even if the new $t$ coincides with some predictor variable $t(i)$ in the training set, merely predicting $y$ as the response $y_i$ is not likely to be satisfactory. Also, this does not yet provide any ability to make predictions when $t$ does not exactly match any predictor values in the training set. It is desired to generate a curve which will allow a reasonable prediction of the response for any $t$ within a reasonable vicinity of the set of training predictors $\{t(i)\}$.

The dashed line in each panel of Figure 1 is $f_{TRUE}(t)$; the three solid black lines in the three panels of Figure 1 are three solutions to the variational problem: Find $f$ in the (Hilbert) space $W_2$ of functions with continuous first derivatives and square integrable second derivatives which minimizes

$$\frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - f(t(i))\bigr)^2 + \lambda \int \bigl(f^{(2)}(u)\bigr)^2\,du, \tag{2}$$

for three different values of $\lambda$. The parameter $\lambda$ is known as the regularization or smoothing parameter. As $\lambda \to \infty$, $f_\lambda$ tends to the least squares straight line best fitting the data, and as $\lambda \to 0$ the solution tends to that curve in $W_2$ which minimizes the penalty functional $J(f) = \int (f^{(2)}(u))^2\,du$ subject to interpolating the data (provided the $\{t(i)\}$ are distinct). This latter interpolating curve is known as a cubic interpolating spline, and minimizers of (2) are known as smoothing splines. See Wahba (1990) and references cited there for further information concerning these and other properties of splines noted below, and further references.

In the top panel of Figure 1, $\lambda$ has been chosen too small, and the wiggly solid line is attempting to fit the data too closely.
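The behaviour of (2) as $\lambda$ varies can be tried out numerically. The following is a minimal sketch, not the code used for Figure 1. It assumes the $t(i)$ are equally spaced on [0, 3.5] (the interval is not stated in the text), and it replaces the integral penalty by a sum of squared second differences of the fitted values, which is a discrete analogue of the smoothing spline rather than the spline itself. The names `f_true`, `second_difference_matrix` and `penalized_fit` are illustrative.

```python
import numpy as np

def f_true(t):
    # Test function of Equation (1).
    return 4.26 * (np.exp(-t) - 4.0 * np.exp(-2.0 * t) + 3.0 * np.exp(-3.0 * t))

def second_difference_matrix(n):
    # (n-2) x n matrix D with (D f)_k = f_k - 2 f_{k+1} + f_{k+2}.
    D = np.zeros((n - 2, n))
    for k in range(n - 2):
        D[k, k:k + 3] = [1.0, -2.0, 1.0]
    return D

def penalized_fit(y, lam):
    # Minimize (1/n) * ||y - f||^2 + lam * ||D f||^2 over the fitted values f.
    # Setting the gradient to zero gives (I + n * lam * D'D) f = y.
    n = len(y)
    D = second_difference_matrix(n)
    return np.linalg.solve(np.eye(n) + n * lam * D.T @ D, y)

rng = np.random.default_rng(0)             # seed chosen arbitrarily
n, sigma = 100, 0.2                        # values from the text
t = np.linspace(0.0, 3.5, n)               # assumed design points
y = f_true(t) + sigma * rng.normal(size=n)

fits = {lam: penalized_fit(y, lam) for lam in (1e-6, 1e-1, 1e8)}
for lam, f in fits.items():
    resid = np.mean((y - f) ** 2)          # L(y, f): fit to the training data
    truth = np.mean((f_true(t) - f) ** 2)  # distance to f_TRUE
    print(f"lambda={lam:g}  L(y,f)={resid:.4f}  L(f_TRUE,f)={truth:.4f}")

# For very large lambda the fit approaches the least squares straight line.
slope, intercept = np.polyfit(t, y, 1)
line = slope * t + intercept
```

Because the second-difference penalty vanishes exactly on straight lines, the largest $\lambda$ reproduces the limiting behaviour described for (2), while the smallest nearly interpolates the noisy data.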
It can be seen that using the wiggly curve in the top panel is not likely to give a good prediction of $y$, assuming that future predictor-response data is generated by the same mechanism as the training data. In the middle panel, $\lambda$ has been chosen too large, the curve has been forced to flatten out, and again it can be seen that the heavy line will not give a good prediction of $y$. In the bottom panel, $\lambda$ has been chosen by generalized cross validation (GCV). This is a method which behaves similarly to leaving-out-one in many cases but with computational and theoretical advantages. See Li (1986), Wahba (1990, Chapter 4), Girard (1998). It can be seen that the $\lambda$ obtained this way does a good job of choosing the right amount of smoothing to best recover $f_{TRUE}$ of Equation (1). The $f_{TRUE}$ of Equation (1) would provide the best predictor of the response in an expected mean square error sense if future data were generated according to Equation (1). The curve in the bottom panel has a reasonable ability to generalize, that is, to predict the response given a new value $t$ of the predictor variable, at least if $t$ is not too far from the training predictor set $\{t(i)\}$.

Figure 1: Training data (circles) have been generated by adding noise to $f_{TRUE}(t)$, shown by the dashed curve in each panel. All three panels have the same data. Top: Solid curve is fitted spline with $\lambda$ too small. Middle: Solid curve is fitted spline with $\lambda$ too large. Bottom: Solid curve is fitted spline with $\lambda$ obtained by generalized cross validation.

For each positive $\lambda$, there exists a unique $\kappa = \kappa(\lambda)$ so that the minimizer $f_\lambda$ of (2) is also the solution to the problem: Find $f$ in $W_2$ to minimize

$$L(y, f) = \frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - f(t(i))\bigr)^2 \tag{3}$$

subject to the condition

$$J(f) = \int \bigl(f^{(2)}(u)\bigr)^2\,du \le \kappa. \tag{4}$$

As $\lambda$ becomes large, the associated $\kappa(\lambda)$ becomes small, and conversely. In general, the term regularization refers to solving some problem involving best fitting, subject to some constraint(s) on the solution. These constraints may be of various forms. When they involve a quadratic penalty involving derivatives, like $J(f)$, the method is commonly referred to as Tikhonov regularization. The tighter the constraints (i.e. the smaller $\kappa$, equivalently the larger $\lambda$), the further away the solution $f_\lambda$ will generally be from the training data, that is, $L$ will be larger. As the constraints get weaker and weaker, then ultimately (if there are enough degrees of freedom in the method) the solution will interpolate the data. However, as is clear from Figure 1, a curve which runs through all the data points is not a good solution. A fundamental problem in machine learning with noisy and/or incomplete data is to balance the tightness of the constraints with the goodness of fit to the data, in such a way as to minimize the generalization error, that is, to maximize the ability to predict the unobserved response for new values of $t$. This tradeoff is by now well known as the bias-variance tradeoff, or, equivalently, the goodness of fit versus model complexity tradeoff.

Methods abound in the statistical literature for univariate curve fitting, including Parzen kernel estimates, nearest neighbor estimates, orthogonal series estimates, least squares regression spline estimates, and, recently, wavelet estimates. Each method has one or more regularization parameters that control this tradeoff, be they kernel window widths, numbers of nearest neighbors included, number of terms in the orthogonal series expansion or regression basis, or factors or thresholds for shrinking or truncating wavelet coefficients. See Ramsay and Silverman (1997) and references cited there.

2.2 Multiple Input, Single Hidden Layer Feed-Forward Neural Net

A multiple input, single hidden layer feed-forward neural net (NN) predictor for the learning problem of Section 1 is typically of the form

$$f_{NN}(t) = \sigma_0\Bigl(b + \sum_{j=1}^{N} w_j\,\sigma_h(a_j \cdot t + b_j)\Bigr) \tag{5}$$

where the $a_j$ and $t$ are $d$-vectors. The function $\sigma_h$ is the so-called activation function of the hidden layer and $\sigma_0$ is the activation function for the output. $\sigma_h$ is generally a sigmoidal function, for example, $\sigma_h(\tau) = e^{\tau}/(1 + e^{\tau})$, while $\sigma_0$ may be linear, sigmoidal or a threshold unit.
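To make the form of Equation (5) concrete, here is a minimal sketch of the forward evaluation of such a net. The array shapes, the random parameter values and the function names `sigmoid` and `f_nn` are assumptions made for illustration; no fitting is performed.

```python
import numpy as np

def sigmoid(tau):
    # Logistic activation e^tau / (1 + e^tau), written in an equivalent form.
    return 1.0 / (1.0 + np.exp(-tau))

def f_nn(t, a, b_hidden, w, b_out, sigma_0=lambda z: z):
    # Equation (5): sigma_0(b + sum_j w_j * sigma_h(a_j . t + b_j)).
    # t: (d,) input; a: (N, d) hidden weights; b_hidden: (N,); w: (N,).
    hidden = sigmoid(a @ t + b_hidden)
    return sigma_0(b_out + w @ hidden)

rng = np.random.default_rng(1)
d, N = 3, 5                                # illustrative sizes
a = rng.normal(size=(N, d))
b_hidden = rng.normal(size=N)
w = rng.normal(size=N)
t = rng.normal(size=d)

out_linear = f_nn(t, a, b_hidden, w, b_out=0.5)                    # linear output unit
out_sigmoid = f_nn(t, a, b_hidden, w, b_out=0.5, sigma_0=sigmoid)  # sigmoidal output unit
```

Passing a different `sigma_0` switches between a linear and a sigmoidal output unit without changing the hidden layer.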
Here $N$ is the number of hidden units, and the $w_j$, $a_j$ and $b_j$ are learned from the training data by some appropriate iterative descent algorithm that tries to steer these values towards minimizing some distance measure, typically

$$L(y, f_{NN}) = \frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - f_{NN}(t(i))\bigr)^2.$$

It is clear that if $N$ is sufficiently large, and the descent algorithm is run long enough, it should be possible to drive $L$ as close as one likes to 0. (In practice it is possible to get stuck in local minima.) However, it is also clear intuitively from Figure 1 that driving $L$ all the way to zero is not a desirable thing to do. Regularization in this problem may be done by controlling the size of $N$, by imposing penalties on the $w_j$, by stopping the descent algorithm early, that is, not driving down $L$ as far as it can go, or by various combinations of these strategies. Each will influence how closely $f_{NN}$ will fit the data, how wiggly it will be, and how well it will be able to predict unobserved data that is generated by a mechanism similar to that of the observed data.

2.3 Multiple Input Radial Basis Function and Related Estimates

Radial basis functions are rapidly becoming a popular method for nonparametric regression. We first describe a general form of nonparametric regression which will specialize to radial basis functions and other methods of interest. Let $R(s, t)$ be any symmetric, strictly positive definite function on $E^d \times E^d$. Here strictly positive definite means that for any $K = 1, 2, \dots$ the $K \times K$ matrix with $j,k$th entry $R(s(j), s(k))$ is strictly positive definite whenever the $s(1), \dots, s(K)$ are distinct. (A symmetric $K \times K$ matrix $M$ is said to be positive definite if for any $K$ dimensional column vector $x$, $x'Mx$ is greater than or equal to 0, and is said to be strictly positive definite if $x'Mx$ is strictly greater than 0 for every nonzero $x$.) Positive definiteness will play a key role in the discussion below because (among other reasons) any positive definite matrix can be the covariance matrix of a random vector and any positive definite function $R(s, t)$ can be the covariance function of some stochastic process $X(t)$. That is, there exists $X(\cdot)$ such that $\mathrm{Cov}\,X(s)X(t) = R(s, t)$.
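The definition can be checked numerically for a particular kernel. The sketch below assumes a Gaussian $R(s, t)$ and an arbitrary set of five distinct points in $E^2$; both are illustrative choices, not prescribed at this point in the text. It verifies strict positive definiteness through the eigenvalues and a Cholesky factorization, and then uses the factor to build a random vector whose covariance matrix is $M$.

```python
import numpy as np

def gaussian_kernel(s, t, width=1.0):
    # R(s, t) = exp(-||s - t||^2 / (2 * width^2)); an assumed example of a
    # strictly positive definite function.
    diff = np.asarray(s, dtype=float) - np.asarray(t, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * width ** 2))

# K distinct points in E^2 (arbitrary illustrative values).
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [1.0, 3.0]])
K = len(points)
M = np.array([[gaussian_kernel(points[j], points[k]) for k in range(K)]
              for j in range(K)])

eigenvalues = np.linalg.eigvalsh(M)   # real eigenvalues of the symmetric matrix M
chol = np.linalg.cholesky(M)          # raises LinAlgError unless M is strictly positive definite

# If z has independent standard Normal entries, X = chol @ z has covariance
# chol @ chol.T = M, so M is the covariance matrix of a random vector.
rng = np.random.default_rng(2)
X = chol @ rng.normal(size=K)
```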
Given training data $\{t(i), y_i\}$, it is always possible in principle to obtain a (regularized) input-output map from this data by letting the model $f_{R,\lambda}$ be of the form

$$f_{R,\lambda}(t) = \sum_{j=1}^{N} c_j R(t, s(j)), \tag{6}$$

where the $s(j)$ are $N$ centers which are placed at distinct values of the $\{t(i)\}$ and $c = (c_1, \dots, c_N)$ is chosen to minimize $L(y, f) + \lambda J(f)$. Here

$$L(y, f_{R,\lambda}) = \frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - f_{R,\lambda}(t(i))\bigr)^2 \tag{7}$$

and the regularizing penalty $J(\cdot)$ is of the form

$$J(f_{R,\lambda}) = \sum_{j,k=1}^{N} c_j c_k J_{jk} \tag{8}$$

where the $J_{jk}$ are the entries of a non-negative definite quadratic form. The (strict) positive definiteness of $R$ guarantees that

$$L(y, f_{R,\lambda}) + \lambda J(f_{R,\lambda}) \tag{9}$$

always has a unique minimizer in $c$, for any non-negative $\lambda$. This follows by substituting (6) into (9), and using the fact that the columns of the $n \times N$ matrix with $i,j$ entry $R(t(i), s(j))$ are linearly independent, since they are just $N$ columns of the $n \times n$ positive definite matrix with $i,j$ entry $R(t(i), t(j))$.

Radial basis function estimates are obtained for the special case where $R(s, t)$ is of the special form

$$R(s, t) = r(\|W(s - t)\|), \tag{10}$$

where $W$ is some linear transformation on $E^d$ and the norm is Euclidean distance. That is, $R(s, t)$ depends only on some generalized distance in $E^d$ between $s$ and $t$. The regularization, that is, the effecting of the tradeoff between goodness of fit to the data and smoothness of the solution, is performed by reducing $N$, and/or increasing $\lambda$. The choice of $W$ will also affect the wiggliness of $f_{R,\lambda}$ in the radial basis function case. Alternatively, a model can be obtained by choosing $N$ small and minimizing $L(y, f)$. In that case $N$ and $W$ are the smoothing parameters.

In the special case $N = n$, $s(i) = t(i)$, the $f_{R,\lambda}$ can (for any positive definite $R$) be shown to be Bayes estimates, see Kimeldorf and Wahba (1970), Wahba (1990). Arguments can be given to show that if $n$ is large and $N < n$ is not too small, then they are good approximations to Bayes estimates, see Wahba (1990, Chapter 7). In the special case $J_{i,j} = R(t(i), t(j))$, the Bayes model is easy to describe and we do it here; it is:

$$y_i = X(t(i)) + \epsilon_i, \tag{11}$$

with $X(t)$ a zero mean Gaussian stochastic process with covariance $E\,X(s)X(t) = bR(s, t)$ and the $\epsilon_i$ independent zero mean Gaussian random variables with common variance $\sigma^2$, and independent of $X(t)$.
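The special case $N = n$, $s(i) = t(i)$, $J_{jk} = R(t(j), t(k))$ can be worked through numerically. The sketch below assumes a Gaussian kernel, $d = 1$, and arbitrary values of $n$, $\sigma$ and $b$. It computes the minimizer of (9) in closed form and compares it with the conditional expectation of $X(t)$ given $y$ under model (11), obtained from the standard formula for conditioning jointly Gaussian variables. The two agree when $n\lambda = \sigma^2/b$.

```python
import numpy as np

def gaussian_kernel_matrix(s, t, width=0.5):
    # Matrix with (i, j) entry R(s_i, t_j) = exp(-(s_i - t_j)^2 / (2 * width^2)).
    return np.exp(-(s[:, None] - t[None, :]) ** 2 / (2.0 * width ** 2))

rng = np.random.default_rng(3)
n, sigma, b = 20, 0.2, 1.5                         # illustrative values
t = np.sort(rng.uniform(0.0, 3.0, size=n))         # distinct training inputs (d = 1)
y = np.sin(2.0 * t) + sigma * rng.normal(size=n)   # any responses will do here

R = gaussian_kernel_matrix(t, t)
lam = sigma ** 2 / (n * b)

# Minimizer of (1/n) * ||y - R c||^2 + lam * c' R c: the gradient vanishes
# when R (R c + n * lam * c - y) = 0, i.e. c = (R + n * lam * I)^{-1} y.
c = np.linalg.solve(R + n * lam * np.eye(n), y)

t_new = np.linspace(0.0, 3.0, 7)
r_new = gaussian_kernel_matrix(t_new, t)           # entries R(t_new, t(j))
f_penalized = r_new @ c                            # Equation (6) evaluated at t_new

# Conditional expectation of X(t_new) given y under model (11):
# Cov(X(t_new), y) = b * r_new and Cov(y) = b * R + sigma^2 * I.
f_bayes = b * r_new @ np.linalg.solve(b * R + sigma ** 2 * np.eye(n), y)

# Gradient of (9) with respect to c, which should vanish at the minimizer.
grad = -(2.0 / n) * R @ (y - R @ c) + 2.0 * lam * R @ c
```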
In this case, the minimizer $f_{R,\lambda}$ of $L(y, f) + \lambda J(f)$, evaluated at $t$, is the conditional expectation of $X(t)$, given $y_1, \dots, y_n$, provided that $\lambda$ is chosen as $\sigma^2/(nb)$. In general, pretending that one has a prior and computing the posterior mean or mode will have a regularizing effect. The discussion above extends to symmetric positive definite functions on arbitrary domains for $t$, including those mentioned in Section 1.

Thin plate splines in $d$ variables (of order $m$) consist of radial basis functions plus polynomials of total degree less than $m$ in $d$ variables. ($2m - d > 0$ is required for technical reasons.) Letting $t = (t_1, \dots, t_d)$, the thin plate splines are minimizers (in an appropriate function space) of

$$\frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - f(t(i))\bigr)^2 + \lambda \sum_{\alpha_1 + \cdots + \alpha_d = m} \frac{m!}{\alpha_1! \cdots \alpha_d!} \int \cdots \int \left( \frac{\partial^m f}{\partial t_1^{\alpha_1} \cdots \partial t_d^{\alpha_d}} \right)^2 dt_1 \cdots dt_d. \tag{12}$$

Setting $d = 1$, $m = 2$ gives the cubic spline case discussed earlier. Note that there is no penalty on polynomials of total degree less than $m$; the thin plate splines with a particular choice of $\lambda$ are Bayes estimates with an improper prior (that is, infinite variance) on the polynomials of total degree less than $m$, see Wahba (1990) and references cited there.

Related variations on regularized estimates include additive smoothing splines, which are of the form

$$f(t) = \mu + \sum_{\alpha=1}^{d} f_\alpha(t_\alpha) \tag{13}$$

where $\mu$ and the $f_\alpha$ are the solution to a variational problem of the form: Find $\mu$ and $f_1, \dots, f_d$ in a certain function space to minimize

$$\frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - f(t(i))\bigr)^2 + \sum_{\alpha=1}^{d} \lambda_\alpha J_\alpha(f_\alpha). \tag{14}$$

The $J_\alpha$ may be of the form of $J$ in Equation (4). Here, there is a regularization parameter for each component. See Hastie and Tibshirani (1990), Wahba (1990). These additive models generalize to smoothing spline analysis of variance (SS-ANOVA) models. In the SS-ANOVA models, interaction terms of the form $f_{\alpha\beta}(t_\alpha, t_\beta)$, $f_{\alpha\beta\gamma}(t_\alpha, t_\beta, t_\gamma)$, etc., which satisfy side conditions making them uniquely determined, are added to the representation in Equation (13), and corresponding penalty terms with regularization parameters are added in Equation (14). The $f_\alpha$, etc., may be generalized to themselves being radial basis functions. Behind these models are positive definite functions which are built up via tensor sums and products of positive definite functions, see Gu and Wahba (1993), Wahba (1990), Wahba, Wang, Gu, Klein and Klein (1995). Regression spline ANOVA models can be obtained by setting the $f_\alpha$, $f_{\alpha\beta}$, etc. as linear combinations of a (relatively small) number of basis functions (usually splines). In this case the number of basis functions is probably the most influential regularization parameter. These and similar methods again all have either explicit or implicit regularization parameters which govern the balance between the complexity of the model and the fit to the data, that is, the bias-variance tradeoff.

The usual criterion for the generalization error when the fit involves minimizing the observed residual sum of squares is the expected (comparative) residual sum of squares for new data,

$$E\,L(y_{new}, f_\lambda) - \sigma^2 = L(f_{TRUE}, f_\lambda).$$
Here the $y_{new}$ are $n$ new observations. Leaving out one, leaving out 10%, leaving out a 1/3 representative sample ("tuning set") and GCV ("in sample" tuning) are popular methods for choosing the tuning parameters to minimize this criterion. Codes in Splus (smooth.spline()), SAS (tpspline), netlib (/gcv), Funfits (sreg, tps), R (smooth.Pspline, gss) and elsewhere are available for implementing the univariate spline, thin plate spline and additive and interaction (ANOVA) splines with GCV to choose single or multiple smoothing parameters. Netlib, Funfits and R are freeware. The smooth.Pspline code in R was used to generate Figure 1.

3 Generalization and Regularization in Soft Classification

Soft classification is a natural goal in certain kinds of demographic medical studies. For example, suppose a large training set is available from a demographic study, consisting of $n$ observations $\{t(i), y_i\}$ where $y_i$ is an indicator (1 or 0) of the presence or absence of some disease in subject $i$ at the end of the study, and $t(i)$ is a vector of values of risk factors for this subject at the beginning of the study. With this kind of data, it is frequently of interest to make a soft classification, that is, to estimate the probability $p(t)$ that a new subject with predictor vector $t$ will contract the disease. A doctor, given this model, may advise new patients which risk factor(s) are important for them to control to reduce the probability of their contracting the disease. A regularized (that is, "smooth") estimate for $p(t)$ is desirable. Regularized estimates can be obtained as follows. First, define

$$f(t) = \log[p(t)/(1 - p(t))]. \tag{15}$$

$f$ is known in the statistics literature as the log odds ratio, or logit. Then $p(t)$ is a sigmoidal function of $f(t)$, that is, $p(t) = e^{f(t)}/(1 + e^{f(t)})$. We will get a regularized estimate for $f$. $L(y, f)$ of Equation (3) will be replaced by an expression more suitable for 0-1 data, by using the likelihood for this data. To describe the likelihood, note that if $y$ is a random variable with $\mathrm{Prob}[y = 1] = p$ and $\mathrm{Prob}[y = 0] = (1 - p)$, then the probability density (or likelihood) $P(y, p)$ for $y$ when $p$ is true is just $P(y, p) = p^{y}(1 - p)^{(1 - y)}$; this merely says $P(1, p) = p$ and $P(0, p) = (1 - p)$. Thus, the likelihood for $y_1, \dots, y_n$ (assuming that the $y_i$ are independent) is

$$P(y_1, \dots, y_n;\ p(t(1)), \dots, p(t(n))) = \prod_{i=1}^{n} p(t(i))^{y_i} \bigl(1 - p(t(i))\bigr)^{(1 - y_i)}. \tag{16}$$

Substituting $f$ for $p$ in (16) and taking the negative logarithm gives the negative log likelihood $\mathcal{L}(y, f)$ in terms of $f$:

$$-\log P(y_1, \dots, y_n;\ f(t(1)), \dots, f(t(n))) \equiv \mathcal{L}(y, f) = \sum_{i=1}^{n} \bigl[\log(1 + e^{f(t(i))}) - y_i f(t(i))\bigr]. \tag{17}$$

It is natural for $\mathcal{L}(y, f)$ to replace $L(y, f)$ of (3), (7), (14) when $y_i$ is restricted to 0 or 1, since $L(y, f_{TRUE})$ is (a multiple of) the negative log likelihood for $y$ generated by a model with Gaussian noise like (1).

A neural net implementation of soft classification would consist of finding $f_{NN}(t) = \mathrm{logit}\,p_{NN}(t)$ of the form of Equation (5) to minimize $\mathcal{L}(y, f)$ of (17). If $N$ is large enough, then, in principle, $f_{NN}$ may be driven so that $p_{NN}(t(i))$ is close to 1 if $y_i$ is 1, and is close to 0 if $y_i$ is 0. Again, it is intuitively clear that this is not desirable. As before, a regularized, or "smooth", $f_{NN}$ can be obtained by controlling $N$, penalizing the $w_j$, stopping the iterative fitting early, or some combination of these.

Penalized likelihood estimates of $f$ are obtained by minimizing $\mathcal{L}(y, f) + J_\lambda(f)$ where $J_\lambda(f)$ is a penalty functional corresponding to those in Equations (2), (9), (12) or (14) and its generalizations. A popular definition for the generalization error is the (unobservable) comparative Kullback-Leibler distance of the estimate to the true probability distribution, which can be shown to be given by $E\,\mathcal{L}(y_{new}, f_\lambda) = \mathcal{L}(p_{TRUE}, f_\lambda)$.
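Equations (16) and (17) can be checked against each other numerically. The following sketch uses made-up 0-1 responses and logit values; the function names are illustrative, and `np.logaddexp` is used only as a numerically stable way to evaluate $\log(1 + e^{f})$. The last line illustrates the overfitting described above: pushing $f(t(i))$ toward $\pm\infty$ according to $y_i$ drives the criterion toward 0.

```python
import numpy as np

def neg_log_likelihood(y, f):
    # Equation (17): sum_i [log(1 + e^{f_i}) - y_i * f_i].
    # np.logaddexp(0, f) evaluates log(1 + e^f) without overflow for large |f|.
    return np.sum(np.logaddexp(0.0, f) - y * f)

def likelihood(y, p):
    # Equation (16): prod_i p_i^{y_i} * (1 - p_i)^{1 - y_i}.
    return np.prod(p ** y * (1.0 - p) ** (1.0 - y))

y = np.array([1.0, 0.0, 0.0, 1.0, 1.0])      # illustrative 0-1 responses
f = np.array([2.0, -1.0, 0.5, 0.0, -3.0])    # illustrative logit values f(t(i))
p = np.exp(f) / (1.0 + np.exp(f))            # p = e^f / (1 + e^f)

nll = neg_log_likelihood(y, f)
direct = -np.log(likelihood(y, p))           # negative logarithm of (16)

# Large positive f where y_i = 1 and large negative f where y_i = 0.
overfit = neg_log_likelihood(y, 50.0 * (2.0 * y - 1.0))
```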
An estimate of the $\lambda$ which minimizes this criterion can be obtained by withholding a representative subset $y_{[left\ out]}$ of the training set and choosing $\lambda$ to minimize $\mathcal{L}(y_{[left\ out]}, f_\lambda)$. Leaving-out-one estimates are also possible but generally not feasible in this case. Generalized approximate cross validation (GACV) is a feasible in-sample method of choosing $\lambda$; based on a leaving-out-one argument, it has been shown in simulation studies to provide a good estimate of the minimizer of $\mathcal{L}(p_{TRUE}, f_\lambda)$, see Wahba, Lin, Gao, Xiang, Klein and Klein (1999).

4 Generalization and Regularization in Hard Classification

In the hard classification problem (here we will consider only two classes for simplicity), we are only interested in estimating whether an example with vector $t$ is in class A or not. This is the typical situation in, for example, character recognition, voice recognition, and other situations where it is known that the $t$'s from the two classes being examined are generally well separated. In that case (assuming, for simplicity, that the examples from the two classes are represented in the training set in the same proportion as in the future population of interest, and that costs of misclassification are the same for both classes), the optimum classifier (to minimize the expected cost) would be A if $p(t)$ is greater than one-half, and not A otherwise. Equivalently, the same rule can be implemented by examining the sign of the logit $f(t)$. Here we are identifying A with the 1's, and optimum is with respect to minimizing the expected cost of future misclassification.

Unfortunately, in general it is neither desirable nor feasible to estimate the logit $f$ directly by the methods of Section 3, because in the well separated case $f$ takes on values near $\pm\infty$, and, if $d$ and/or the sample size $n$ is large, solving the penalized likelihood problem of Section 3 is likely to be numerically unstable. Recently, support vector machines (SVM's) have been shown to provide an excellent method for classification in this situation. See Burges (1998). The support vector machine (SVM) is implemented by coding the $y_i$ as $\pm 1$ according as the $i$th example is in A or not. Given a positive definite function $R(s, t)$, we find a function $f$ of the form $f(t) = b + \sum_{i=1}^{n} c_i R(t, t(i))$ by finding $b$ and $c = (c_1, \dots, c_n)$ to minimize

$$\frac{1}{n}\sum_{i=1}^{n} \bigl(1 - y_i f(t(i))\bigr)_+ + \lambda \sum_{i,j} c_i c_j R(t(i), t(j)) \tag{18}$$

where $(\tau)_+ = \tau$ for $\tau > 0$ and 0 otherwise. Letting $f_\lambda$ be the minimizer of (18), the classification algorithm is: for a new attribute vector $t$, assign A if $f_\lambda(t) > 0$ and not A if $f_\lambda(t) < 0$. Lin (1999) has demonstrated the remarkable result that, under general circumstances with appropriately chosen $\lambda$, the SVM estimate $f_\lambda$ tends almost everywhere to either 1 or $-1$ and is an estimate of $\mathrm{sign}\,f_{TRUE} \equiv \mathrm{sign}(p_{TRUE} - \tfrac{1}{2})$, which is exactly what is needed to carry out the optimum classification algorithm.

A popular choice for $R(s, t)$ is $R(s, t) = \exp\bigl\{-\tfrac{1}{2\sigma^2}\|s - t\|^2\bigr\}$ where $\|\cdot\|$ is the Euclidean norm. In this choice of $R(\cdot, \cdot)$ the result may be sensitive to both $\sigma$ and $\lambda$. As before, the $\lambda$ and $\sigma$ may be chosen by leaving out a representative subset of the observations and choosing $\lambda$ and $\sigma$ to minimize some measure of the generalization error. Here the natural choice for generalization error would be the misclassification rate. A version of GACV for SVM's, again based on a leaving-out-one argument, may be used as an in-sample method for choosing $\lambda$ and $\sigma$, see Wahba, Lin and Zhang (1999). The generalization error target for the GACV is $E\,\frac{1}{n}\sum_{i=1}^{n} (1 - y_{i,new} f_\lambda(t(i)))_+$. However, $\tfrac{1}{2} E\,\frac{1}{n}\sum_{i=1}^{n} (1 - y_{i,new}\,\mathrm{sign}[f_\lambda(t(i))])_+$ is the expected misclassification rate, so that to the extent that $f_\lambda$ resembles $\mathrm{sign}\,f_\lambda$, this criterion will be appropriate for the generalization error.

5 Choosing How Much to Regularize

At the time of this writing, it is a matter of lively debate and much research how to choose the various regularization parameters. Leaving out a large fraction of the training sample for this purpose and tuning the regularization parameter(s) to best predict the left-out data (according to whatever criterion of best prediction is adopted) is conceptually simple, defensible, and widely used (this is called out-of-sample tuning). Successively leaving-out-one, successively leaving-out-10%, and the in-sample methods GCV and GACV are all popular. See also Ye (1998), who discusses in-sample tuning methods related to GCV in the Gaussian case which allow comparisons across different regularized estimates.
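As a concrete instance of in-sample tuning, the following sketch applies GCV to a linear smoother. Several things here are assumptions rather than material from the text above. The GCV function is written in its usual form $V(\lambda) = \frac{1}{n}\|(I - A(\lambda))y\|^2 / [\frac{1}{n}\mathrm{tr}(I - A(\lambda))]^2$, as in Wahba (1990), where $A(\lambda)$ maps $y$ to the fitted values. The smoother is the kernel estimate (6) with $N = n$ and $J_{jk} = R(t(j), t(k))$, for which $A(\lambda) = R(R + n\lambda I)^{-1}$. The kernel is Gaussian with a fixed width, and the data follow Equation (1) with inputs equally spaced on [0, 3.5].

```python
import numpy as np

def f_true(t):
    # Test function of Equation (1).
    return 4.26 * (np.exp(-t) - 4.0 * np.exp(-2.0 * t) + 3.0 * np.exp(-3.0 * t))

rng = np.random.default_rng(4)
n, sigma, width = 100, 0.2, 0.5              # width is an assumed kernel parameter
t = np.linspace(0.0, 3.5, n)                 # assumed design points
y = f_true(t) + sigma * rng.normal(size=n)

R = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2.0 * width ** 2))
evals, U = np.linalg.eigh(R)                 # R = U diag(evals) U'
evals = np.clip(evals, 0.0, None)            # remove tiny negative rounding errors
z = U.T @ y

def gcv_and_fit(lam):
    # A(lam) = R (R + n*lam*I)^{-1} has eigenvalues evals / (evals + n*lam)
    # in the basis U.
    shrink = evals / (evals + n * lam)
    fit = U @ (shrink * z)
    rss = np.mean((y - fit) ** 2)            # (1/n) ||(I - A) y||^2
    denom = np.mean(1.0 - shrink) ** 2       # [(1/n) tr(I - A)]^2
    return rss / denom, fit

lams = np.logspace(-8, 2, 41)
scores = np.array([gcv_and_fit(lam)[0] for lam in lams])
lam_gcv = lams[np.argmin(scores)]

# Error relative to f_TRUE, available only because the data are simulated.
true_err = np.array([np.mean((f_true(t) - gcv_and_fit(lam)[1]) ** 2) for lam in lams])
print(f"GCV choice:   lambda={lam_gcv:.2e}  L(f_TRUE,f)={true_err[np.argmin(scores)]:.4f}")
print(f"best on grid: lambda={lams[np.argmin(true_err)]:.2e}  L(f_TRUE,f)={true_err.min():.4f}")
```

The second printed line uses $f_{TRUE}$ and so is not available in practice; it is included only to show how close the GCV choice comes to the best value on the grid for this simulated data set.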
In the Normally distributed observational error case, if the standard deviation of the observational error ($\sigma$ in Equation (1)) is known, then unbiased risk estimates become available. See Li (1986), Wahba (1990) and references cited there. When there is a Bayesian model behind the regularization procedure, then maximum likelihood estimates may be derived, see Wahba (1985), although in order for these and other Bayes estimates to do a good job of minimizing the generalization error in practice, it is usually necessary that the priors on which they are based are realistic.

6 Which method is best?

Feedforward neural nets, radial basis functions, and various forms of splines all provide regularized or regularizable methods for estimating "smooth" functions of several variables, given a training set $\{t(i), y_i\}$. Which approach is best? Unfortunately, there is not, nor is there likely to be, a single answer to that question. The answer most surely depends on the particular nature of the underlying but unknown truth, the nature of any prior information that might be available about this truth, the nature of any noise in the data, the ability of the experimenter to choose the various smoothing or regularization parameters well, the size of the data set, the use to which the answer will be put, and the computational facilities available. From a mathematical point of view, the classes of functions well approximated by neural nets, radial basis functions, and additive and interaction splines (ANOVA splines) are not the same, although all of these methods have the capability of approximating large classes of functions. Of course, if a large enough data set is available, models utilizing all of these approaches may be built, and tuned, and then compared on data that has been set aside for this purpose. In-sample tuning methods for comparison across different regularized estimates in the hard and soft classification contexts are an area of active research.

REFERENCES

Burges, C. (1998), A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery 2.

Girard, D. (1998), Asymptotic comparison of (partial) cross-validation, GCV and randomized GCV in nonparametric regression, Ann. Statist. 26.

Girosi, F., Jones, M. & Poggio, T. (1995), Regularization theory and neural networks architectures, Neural Computation 7.

Gu, C. & Wahba, G. (1993), Semiparametric analysis of variance with tensor product thin plate splines, J. Royal Statistical Soc. Ser. B 55.

Hastie, T. & Tibshirani, R. (1990), Generalized Additive Models, Chapman and Hall.

Kimeldorf, G. & Wahba, G. (1970), A correspondence between Bayesian estimation of stochastic processes and smoothing by splines, Ann. Math. Statist. 41.

Li, K. C. (1986), Asymptotic optimality of C_L and generalized cross validation in ridge regression with application to spline smoothing, Ann. Statist. 14.

Lin, Y. (1999), Support vector machines and the Bayes rule in classification, Technical Report 1014, Department of Statistics, University of Wisconsin, Madison WI.

Ramsay, J. & Silverman, B. (1997), Functional Data Analysis, Springer.

Wahba, G. (1985), A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline smoothing problem, Ann. Statist. 13.

Wahba, G. (1990), Spline Models for Observational Data, SIAM. CBMS-NSF Regional Conference Series in Applied Mathematics, v. 59.

Wahba, G., Lin, X., Gao, F., Xiang, D., Klein, R. & Klein, B. (1999), The bias-variance tradeoff and the randomized GACV, in M. Kearns, S. Solla & D. Cohn, eds, Advances in Information Processing Systems 11, MIT Press. Full oral presentation at NIPS 11.

Wahba, G., Lin, Y. & Zhang, H. (1999), Generalized approximate cross validation for support vector machines, or, another way to look at margin-like quantities, Technical Report 1006, Department of Statistics, University of Wisconsin, Madison WI. To appear, Advances in Large Margin Classifiers, A. Smola, P. Bartlett, B. Schölkopf and D. Schuurmans, eds, MIT Press.

Wahba, G., Wang, Y., Gu, C., Klein, R. & Klein, B. (1995), Smoothing spline ANOVA for exponential families, with application to the Wisconsin Epidemiological Study of Diabetic Retinopathy, Ann. Statist. 23. Neyman Lecture.

Ye, J. (1998), On measuring and correcting the effects of data mining and model selection, J. Amer. Statist. Assoc. 93.


More information

K [f(t)] 2 [ (st) /2 K A GENERALIZED MEIJER TRANSFORMATION. Ku(z) ()x) t -)-I e. K(z) r( + ) () (t 2 I) -1/2 e -zt dt, G. L. N. RAO L.

K [f(t)] 2 [ (st) /2 K A GENERALIZED MEIJER TRANSFORMATION. Ku(z) ()x) t -)-I e. K(z) r( + ) () (t 2 I) -1/2 e -zt dt, G. L. N. RAO L. Iterat. J. Math. & Math. Scl. Vl. 8 N. 2 (1985) 359-365 359 A GENERALIZED MEIJER TRANSFORMATION G. L. N. RAO Departmet f Mathematics Jamshedpur C-perative Cllege f the Rachi Uiversity Jamshedpur, Idia

More information

Markov processes and the Kolmogorov equations

Markov processes and the Kolmogorov equations Chapter 6 Markv prcesses ad the Klmgrv equatis 6. Stchastic Differetial Equatis Csider the stchastic differetial equati: dx(t) =a(t X(t)) dt + (t X(t)) db(t): (SDE) Here a(t x) ad (t x) are give fuctis,

More information

ALE 26. Equilibria for Cell Reactions. What happens to the cell potential as the reaction proceeds over time?

ALE 26. Equilibria for Cell Reactions. What happens to the cell potential as the reaction proceeds over time? Name Chem 163 Secti: Team Number: AL 26. quilibria fr Cell Reactis (Referece: 21.4 Silberberg 5 th editi) What happes t the ptetial as the reacti prceeds ver time? The Mdel: Basis fr the Nerst quati Previusly,

More information

are specified , are linearly independent Otherwise, they are linearly dependent, and one is expressed by a linear combination of the others

are specified , are linearly independent Otherwise, they are linearly dependent, and one is expressed by a linear combination of the others Chater 3. Higher Order Liear ODEs Kreyszig by YHLee;4; 3-3. Hmgeeus Liear ODEs The stadard frm f the th rder liear ODE ( ) ( ) = : hmgeeus if r( ) = y y y y r Hmgeeus Liear ODE: Suersiti Pricile, Geeral

More information

Intermediate Division Solutions

Intermediate Division Solutions Itermediate Divisi Slutis 1. Cmpute the largest 4-digit umber f the frm ABBA which is exactly divisible by 7. Sluti ABBA 1000A + 100B +10B+A 1001A + 110B 1001 is divisible by 7 (1001 7 143), s 1001A is

More information

Review for cumulative test

Review for cumulative test Hrs Math 3 review prblems Jauary, 01 cumulative: Chapters 1- page 1 Review fr cumulative test O Mday, Jauary 7, Hrs Math 3 will have a curse-wide cumulative test cverig Chapters 1-. Yu ca expect the test

More information

Unifying the Derivations for. the Akaike and Corrected Akaike. Information Criteria. from Statistics & Probability Letters,

Unifying the Derivations for. the Akaike and Corrected Akaike. Information Criteria. from Statistics & Probability Letters, Uifyig the Derivatis fr the Akaike ad Crrected Akaike Ifrmati Criteria frm Statistics & Prbability Letters, Vlume 33, 1997, pages 201{208. by Jseph E. Cavaaugh Departmet f Statistics, Uiversity f Missuri,

More information

A Hartree-Fock Calculation of the Water Molecule

A Hartree-Fock Calculation of the Water Molecule Chemistry 460 Fall 2017 Dr. Jea M. Stadard Nvember 29, 2017 A Hartree-Fck Calculati f the Water Mlecule Itrducti A example Hartree-Fck calculati f the water mlecule will be preseted. I this case, the water

More information
