Generalization and Regularization in Nonlinear Learning Systems
DEPARTMENT OF STATISTICS
University of Wisconsin
1210 West Dayton St.
Madison, WI

TECHNICAL REPORT
February 28, 2000

by Grace Wahba

Prepared for the Handbook of Brain Theory and Neural Networks, Second Edition, Michael Arbib, Ed., within the space and reference limitations of the Handbook. This TR is an updated version of the entry of the same name in the First Edition, 1995, which was also printed as a TR. Supported by NIH Grant EY09946 and an NSF DMS grant.
Grace Wahba, Department of Statistics, University of Wisconsin (wahba@stat.wisc.edu), February 24

1 Introduction

In this article we will describe generalization and regularization from the point of view of multivariate function estimation in a statistical context. Multivariate function estimation is not, in principle, distinguishable from supervised machine learning. However, until fairly recently supervised machine learning and multivariate function estimation had fairly distinct groups of practitioners, and small overlap in language, literature, and in the kinds of practical problems under study. In any case, we are given a training set, consisting of pairs of input (feature) vectors and associated outputs $\{t(i), y_i\}$, for $n$ training or example subjects, $i = 1, \ldots, n$. From this data, it is desired to construct a map which generalizes well, that is, given a new value of $t$, the map will provide a reasonable prediction for the unobserved output associated with this $t$.

Most applications fall into one of two broad categories, which might be called nonparametric regression and classification. In nonparametric regression, $y$ may be (any) real number or a vector of $r$ real numbers. The desired algorithm will produce an estimate $\hat{f}(t)$ of the expected value of a (new) $y$ to be associated with a (new) attribute vector $t$. In the (two-class) classification problem $y_i$ will be an indicator of whether or not the example (subject) came from class A. In some classification applications, the desired algorithm will, given $t$, return an indicator which predicts whether or not an example with attribute vector $t$ comes from class A ("hard" classification). In other applications the desired algorithm will return $p(t)$, an estimate of the probability that the example with attribute vector $t$ is in class A ("soft" classification). In some applications the feature vector $t$ of dimension $d$ contains zeroes and ones (for example as in a bitmap of handwriting), in others it may contain real numbers representing some physical quantities; ordered or unordered category indicators are also possible, as in medical demographic studies.
Regularization, loosely speaking, means that while the desired map is constructed to approximately send the observed feature vectors to the observed outputs, constraints are applied to the construction of the map with the goal of reducing the generalization error. In some applications, these constraints embody a priori information concerning the true relationship between input and output; alternatively, various ad hoc constraints have sometimes been shown to work well in practice. Girosi, Jones and Poggio (1995) give a wide-ranging review.

2 Generalization and Regularization in Non-Parametric Regression

2.1 Single Input Spline Smoothing

We will use Figure 1 to illustrate the ideas of generalization and regularization in the simplest possible nonparametric regression setup, that is, $d = 1$, $r = 1$, with $t$ any real number in some interval of the real line. The circles (which are identical in each of the three panels of Figure 1) represent $n = 100$ (synthetically
generated) input-output pairs $\{t(i), y_i\}$, generated according to the model

$$y_i = f_{\mathrm{TRUE}}(t(i)) + \epsilon_i, \quad i = 1, \ldots, n, \tag{1}$$

where $f_{\mathrm{TRUE}}(t) = 4.26\,(e^{-t} - 4e^{-2t} + 3e^{-3t})$, and the $\epsilon_i$ came from a pseudorandom number generator for Normally distributed random variables with mean 0 and standard deviation $\sigma = 0.2$. Given this training data $\{t(i), y_i,\ i = 1, \ldots, n\}$, the learning problem is to create a map which, if given a new value of $t$, will predict the response $y(t)$. In this case, the data are noisy, so that even if the new $t$ coincides with some predictor variable $t(i)$ in the training set, merely predicting $y$ as the response $y_i$ is not likely to be satisfactory. Also, this does not yet provide any ability to make predictions when $t$ does not exactly match any predictor values in the training set. It is desired to generate a curve which will allow a reasonable prediction of the response for any $t$ within a reasonable vicinity of the set of training predictors $\{t(i)\}$.

The dashed line in each panel of Figure 1 is $f_{\mathrm{TRUE}}(t)$; the three solid black lines in the three panels of Figure 1 are three solutions to the variational problem: Find $f$ in the (Hilbert) space $W_2$ of functions with continuous first derivatives and square integrable second derivatives which minimizes

$$\frac{1}{n}\sum_{i=1}^{n} (y_i - f(t(i)))^2 + \lambda \int (f^{(2)}(u))^2\,du \tag{2}$$

for three different values of $\lambda$. The parameter $\lambda$ is known as the regularization or smoothing parameter. As $\lambda \to \infty$, $f_\lambda$ tends to the least squares straight line best fitting the data, and as $\lambda \to 0$ the solution tends to that curve in $W_2$ which minimizes the penalty functional $J(f) = \int (f^{(2)}(u))^2\,du$ subject to interpolating the data (provided the $\{t(i)\}$ are distinct). This latter interpolating curve is known as a cubic interpolating spline, and minimizers of (2) are known as smoothing splines. See Wahba (1990) and references cited there for further information concerning these and other properties of splines noted below, and further references. In the top panel of Figure 1 $\lambda$ has been chosen too small, and the wiggly solid line is attempting to fit the data too closely.
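As a rough numerical illustration of the variational problem (2), the sketch below simulates data from model (1) and minimizes a discretized version of (2), in which the integral of the squared second derivative is replaced by a sum of squared second differences on an equally spaced grid. This is a discrete stand-in for the smoothing spline, not the spline itself. The interval [0, 3.5], the seed, the three values of λ, and all function names are assumptions made for the example.

```python
import numpy as np

def f_true(t):
    # Test function from Equation (1)
    return 4.26 * (np.exp(-t) - 4.0 * np.exp(-2.0 * t) + 3.0 * np.exp(-3.0 * t))

def second_difference_matrix(n, h):
    # (n-2) x n matrix D with (D f)_k = (f_k - 2 f_{k+1} + f_{k+2}) / h^2
    D = np.zeros((n - 2, n))
    for k in range(n - 2):
        D[k, k:k + 3] = [1.0, -2.0, 1.0]
    return D / h**2

def discrete_smoother(y, h, lam):
    # Minimize (1/n) ||y - f||^2 + lam * h * ||D f||^2 over f in R^n.
    # Setting the gradient to zero gives (I + n * lam * h * D'D) f = y.
    n = len(y)
    D = second_difference_matrix(n, h)
    return np.linalg.solve(np.eye(n) + n * lam * h * D.T @ D, y)

def roughness(f, h):
    # Discrete analogue of J(f) = integral of (f'')^2
    D = second_difference_matrix(len(f), h)
    return h * np.sum((D @ f) ** 2)

rng = np.random.default_rng(0)       # hypothetical seed
n, sigma = 100, 0.2
t = np.linspace(0.0, 3.5, n)         # assumed interval
h = t[1] - t[0]
y = f_true(t) + sigma * rng.normal(size=n)

for lam in (1e-9, 1e-4, 1e1):        # small, intermediate, large (illustrative values)
    f_hat = discrete_smoother(y, h, lam)
    fit = np.mean((y - f_hat) ** 2)
    print(f"lambda={lam:g}  L(y,f)={fit:.4f}  J(f)={roughness(f_hat, h):.3f}")
```

As λ grows, the residual term increases and the roughness term decreases, which is the tradeoff that the three panels of Figure 1 display.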
It can be seen that using the wiggly curve in the top panel is not likely to give a good prediction of $y$, assuming that future predictor-response data is generated by the same mechanism as the training data. In the middle panel, $\lambda$ has been chosen too large, the curve has been forced to flatten out, and again it can be seen that the heavy line will not give a good prediction of $y$. In the bottom panel, $\lambda$ has been chosen by generalized cross validation (GCV). This is a method which behaves similarly to leaving-out-one in many cases but with computational and theoretical advantages. See Li (1986), Wahba (1990, Chapter 4), Girard (1998). It can be seen that the $\lambda$ obtained this way does a good job of choosing the right amount of smoothing to best recover $f_{\mathrm{TRUE}}$ of Equation (1). The $f_{\mathrm{TRUE}}$ of Equation (1) would provide the best predictor of the response in an expected mean square error sense if future data were generated according to Equation (1). The curve in the bottom panel has a reasonable ability to generalize, that is, to predict the response given a new value $t$ of the predictor variable, at least if $t$ is not too far from the training predictor set $\{t(i)\}$.

Figure 1: Training data (circles) have been generated by adding noise to $f_{\mathrm{TRUE}}(t)$, shown by the dashed curve in each panel. All three panels have the same data. Top: Solid curve is fitted spline with $\lambda$ too small. Middle: Solid curve is fitted spline with $\lambda$ too large. Bottom: Solid curve is fitted spline with $\lambda$ obtained by generalized cross validation.

For each positive $\lambda$, there exists a unique $\kappa = \kappa(\lambda)$ so that the minimizer $f_\lambda$ of (2) is also the solution to the problem: Find $f$ in $W_2$ to minimize

$$L(y, f) = \frac{1}{n}\sum_{i=1}^{n} (y_i - f(t(i)))^2 \tag{3}$$

subject to the condition

$$J(f) = \int (f^{(2)}(u))^2\,du \le \kappa. \tag{4}$$

As $\lambda$ becomes large, the associated $\kappa(\lambda)$ becomes small, and conversely. In general, the term regularization refers to solving some problem involving best fitting, subject to some constraint(s) on the solution. These constraints may be of various forms. When they involve a quadratic penalty involving derivatives, like $J(f)$,
the method is commonly referred to as Tikhonov regularization. The tighter the constraints (i.e. the smaller $\kappa$, equivalently the larger $\lambda$) the further away the solution $f_\lambda$ will generally be from the training data, that is, $L$ will be larger. As the constraints get weaker and weaker then ultimately (if there are enough degrees of freedom in the method) the solution will interpolate the data. However, as is clear from Figure 1, a curve which runs through all the data points is not a good solution. A fundamental problem in machine learning with noisy and/or incomplete data is to balance the tightness of the constraints with the goodness of fit to the data, in such a way as to minimize the generalization error, that is, to maximize the ability to predict the unobserved response for new values of $t$. This tradeoff is by now well known as the bias-variance tradeoff, or, equivalently, the goodness of fit - model complexity tradeoff.

Methods abound in the statistical literature for univariate curve fitting, including Parzen kernel estimates, nearest neighbor estimates, orthogonal series estimates, least squares regression spline estimates, and, recently, wavelet estimates. Each method has one or more regularization parameters, be they kernel window widths, numbers of nearest neighbors included, number of terms in the orthogonal series expansion or regression basis, or factors or thresholds for shrinking or truncating wavelet coefficients, that control this tradeoff. See Ramsay and Silverman (1997) and references cited there.

2.2 Multiple Input, Single Hidden Layer Feed-Forward Neural Net

A multiple input, single hidden layer feed-forward neural net (NN) predictor for the learning problem of Section 1 is typically of the form

$$f_{NN}(t) = \sigma_0\Big(b + \sum_{j=1}^{N} w_j\, \sigma_h(a_j' t + b_j)\Big) \tag{5}$$

where the $a_j$ and $t$ are $d$-vectors. The function $\sigma_h$ is the so-called activation function of the hidden layer and $\sigma_0$ is the activation function for the output. $\sigma_h$ is generally a sigmoidal function, for example, $\sigma_h(\tau) = e^{\tau}/(1 + e^{\tau})$, while $\sigma_0$ may be linear, sigmoidal or a threshold unit.
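A minimal sketch of evaluating the predictor in Equation (5) for a single input vector may help fix the notation. The array shapes, argument names, and random weights are assumptions for illustration; no training is performed here.

```python
import numpy as np

def sigmoid(tau):
    # sigma_h(tau) = e^tau / (1 + e^tau), written in the equivalent form 1 / (1 + e^-tau)
    return 1.0 / (1.0 + np.exp(-tau))

def f_nn(t, a, b_hidden, w, b_out, sigma_out=lambda z: z):
    """Evaluate Equation (5) for one d-vector t.

    a: (N, d) array whose rows are the a_j; b_hidden: (N,) offsets b_j;
    w: (N,) output weights w_j; b_out: scalar offset b;
    sigma_out: output activation sigma_0 (linear by default).
    """
    hidden = sigmoid(a @ t + b_hidden)       # N hidden-unit outputs
    return sigma_out(b_out + w @ hidden)

rng = np.random.default_rng(1)               # hypothetical weights, not fitted to data
d, N = 3, 5
a = rng.normal(size=(N, d))
b_hidden = rng.normal(size=N)
w = rng.normal(size=N)
t = rng.normal(size=d)

print(f_nn(t, a, b_hidden, w, b_out=0.1))                      # linear output unit
print(f_nn(t, a, b_hidden, w, b_out=0.1, sigma_out=sigmoid))   # sigmoidal output unit
```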
Here $N$ is the number of hidden units, and the $w_j$, $a_j$ and $b_j$ are learned from the training data by some appropriate iterative descent algorithm that tries to steer these values towards minimizing some distance measure, typically $L(y, f_{NN}) = \frac{1}{n}\sum_{i=1}^{n} (y_i - f_{NN}(t(i)))^2$. It is clear that if $N$ is sufficiently large, and the descent algorithm is run long enough, it should be possible to drive $L$ as close as one likes to 0. (In practice it is possible to get stuck in local minima.) However, it is also clear intuitively from Figure 1 that driving $L$ all the way to zero is not a desirable thing to do. Regularization in this problem may be done by controlling the size of $N$, by imposing penalties on the $w_j$, by stopping the descent algorithm early, that is, not driving down $L$ as far as it can go, or by various combinations of these strategies. Each will influence how closely $f_{NN}$ will fit the data, how wiggly it will be, and how well it will be able to predict unobserved data that is generated by a similar mechanism as the observed data.

2.3 Multiple Input Radial Basis Function and Related Estimates

Radial basis functions are rapidly becoming a popular method for nonparametric regression. We first describe a general form of nonparametric regression which will specialize to radial basis functions and other methods of interest. Let $R(s, t)$ be any symmetric, strictly positive definite function on $E^d \times E^d$. Here strictly positive definite means that for any $K = 1, 2, \ldots$ the $K \times K$ matrix with $j,k$th entry $R(s(j), s(k))$ is strictly positive definite whenever the $s(1), \ldots, s(K)$ are distinct. (A symmetric $K \times K$ matrix $M$ is said to be positive definite if for any $K$ dimensional column vector $x$, $x'Mx$ is greater than or equal to 0, and is said to be strictly positive definite if $x'Mx$ is strictly greater than 0 for every nonzero $x$.) Positive definiteness will play a key role in the discussion below because (among other reasons) any positive definite matrix can be the covariance matrix of a random vector and any positive definite function $R(s, t)$ can be the covariance function of some stochastic process, $X(t)$. That is, there exists $X(\cdot)$ such that $\mathrm{Cov}\, X(s)X(t) = R(s, t)$.
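The definition of strict positive definiteness can be checked numerically for a particular choice of $R$. The sketch below assumes a Gaussian kernel, a 3 x 3 grid of points in $E^2$, and a scale of 0.5, none of which are prescribed by the text. It confirms that the resulting matrix has positive eigenvalues and then uses it as the covariance matrix of a random vector.

```python
import numpy as np

def gaussian_kernel(s, t, scale=0.5):
    # R(s, t) = exp(-||s - t||^2 / scale^2), one example of a strictly positive definite function
    return np.exp(-np.sum((s - t) ** 2) / scale**2)

def gram_matrix(points, kernel):
    # K x K matrix with (j, k) entry R(s(j), s(k))
    K = len(points)
    return np.array([[kernel(points[j], points[k]) for k in range(K)] for j in range(K)])

# Nine distinct points s(1), ..., s(9) on a grid in E^2 (an arbitrary choice)
points = 0.5 * np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
G = gram_matrix(points, gaussian_kernel)

eigenvalues = np.linalg.eigvalsh(G)          # eigenvalues of the symmetric matrix G
print("smallest eigenvalue:", eigenvalues.min())

# A Cholesky factor exists only for strictly positive definite matrices; it also
# lets us draw a zero-mean Gaussian vector X with Cov(X) = G.
chol = np.linalg.cholesky(G)
rng = np.random.default_rng(2)
x_sample = chol @ rng.normal(size=len(points))
print(x_sample)
```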
Given training data $\{t(i), y_i\}$, it is always possible in principle to obtain a (regularized) input-output map from this data by letting the model $f_{R,\lambda}$ be of the form

$$f_{R,\lambda}(t) = \sum_{j=1}^{N} c_j R(t, s(j)), \tag{6}$$

where the $s(j)$ are $N$ centers which are placed at distinct values of the $\{t(i)\}$ and $c = (c_1, \ldots, c_N)$ is chosen to minimize $L(y, f) + \lambda J(f)$. Here

$$L(y, f_{R,\lambda}) = \frac{1}{n}\sum_{i=1}^{n} (y_i - f_{R,\lambda}(t(i)))^2 \tag{7}$$

and the regularizing penalty $J(\cdot)$ is of the form

$$J(f_{R,\lambda}) = \sum_{j,k=1}^{N} c_j c_k J_{jk} \tag{8}$$

where $J_{jk}$ are the entries of a non-negative definite quadratic form. The (strict) positive definiteness of $R$ guarantees that

$$L(y, f_{R,\lambda}) + \lambda J(f_{R,\lambda}) \tag{9}$$

always has a unique minimizer in $c$, for any non-negative $\lambda$. This follows by substituting (6) into (9), and using the fact that the columns of the $n \times N$ matrix with $i,j$ entry $R(t(i), s(j))$ are linearly independent since they are just $N$ columns of the positive definite matrix with $i,j$ entry $R(t(i), t(j))$.

Radial basis function estimates are obtained for the special case where $R(s, t)$ is of the special form

$$R(s, t) = r(\|W(s - t)\|), \tag{10}$$

where $W$ is some linear transformation on $E^d$ and the norm is Euclidean distance. That is, $R(s, t)$ depends only on some generalized distance in $E^d$ between $s$ and $t$. The regularization, that is, the effecting of the tradeoff between goodness of fit to the data and smoothness of the solution, is performed by reducing $N$, and/or increasing $\lambda$. The choice of $W$ will also affect the wiggliness of $f_{R,\lambda}$ in the radial basis function case. Alternatively, a model can be obtained by choosing $N$ small and minimizing $L(y, f)$. In that case $N$ and $W$ are the smoothing parameters.

In the special case $N = n$, $s(i) = t(i)$, the $f_{R,\lambda}$ can (for any positive definite $R$) be shown to be Bayes estimates, see Kimeldorf and Wahba (1970), Wahba (1990). Arguments can be given to show that if $n$ is large and $N < n$ is not too small, then they are good approximations to Bayes estimates, see Wahba (1990, Chapter 7). In the special case $J_{ij} = R(t(i), t(j))$, the Bayes model is easy to describe and we do it here; it is:

$$y_i = X(t(i)) + \epsilon_i, \tag{11}$$

with $X(t)$ a zero mean Gaussian stochastic process with covariance $E\,X(s)X(t) = b\,R(s, t)$ and the $\epsilon_i$ independent zero mean Gaussian random variables with common variance $\sigma^2$, and independent of $X(t)$.
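For the special case $N = n$, $s(i) = t(i)$, $J_{jk} = R(t(j), t(k))$, the minimizer of (9) can be written in closed form: with $K$ the matrix of entries $R(t(i), t(j))$, setting the gradient of $(1/n)\|y - Kc\|^2 + \lambda c'Kc$ to zero gives $c = (K + n\lambda I)^{-1} y$. The sketch below computes this and compares the fitted values with the conditional expectation of $X(t(i))$ given $y$ under model (11), which coincides with them when $\lambda = \sigma^2/(nb)$. The Gaussian kernel, its scale, the sine test signal, and all names are assumptions for illustration.

```python
import numpy as np

def gaussian_kernel_matrix(S, T, scale):
    # Matrix with (i, j) entry R(S[i], T[j]) = exp(-||S[i] - T[j]||^2 / scale^2)
    sq = np.sum((S[:, None, :] - T[None, :, :]) ** 2, axis=2)
    return np.exp(-sq / scale**2)

def fit_coefficients(K, y, lam):
    # Minimizer of (1/n)||y - K c||^2 + lam * c'Kc when K is strictly positive definite
    n = len(y)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

rng = np.random.default_rng(3)
n, b, sigma = 30, 2.0, 0.2                        # hypothetical prior scale b and noise level
t = np.linspace(0.0, 3.0, n)[:, None]             # n distinct one-dimensional inputs
y = np.sin(2.0 * t[:, 0]) + sigma * rng.normal(size=n)   # made-up training responses

K = gaussian_kernel_matrix(t, t, scale=0.5)
lam = sigma**2 / (n * b)
c = fit_coefficients(K, y, lam)
fitted = K @ c                                    # f_{R,lambda}(t(i)), i = 1, ..., n

# Conditional expectation of X(t(i)) given y when Cov X = b*R and the noise variance is sigma^2
posterior_mean = b * K @ np.linalg.solve(b * K + sigma**2 * np.eye(n), y)
print(np.max(np.abs(fitted - posterior_mean)))    # agreement up to rounding error

# Prediction at new inputs uses the same coefficients, as in Equation (6)
t_new = np.array([[0.25], [1.7]])
print(gaussian_kernel_matrix(t_new, t, scale=0.5) @ c)
```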
In this case, the minimizer $f_{R,\lambda}$ of $L(y, f) + \lambda J(f)$, evaluated at $t$, is the conditional expectation of $X(t)$, given $y_1, \ldots, y_n$, provided that $\lambda$ is chosen as $\sigma^2/(nb)$. In general, pretending that one has a prior and computing the posterior mean or mode will have a regularizing effect. The discussion above extends to symmetric positive definite functions on arbitrary domains for $t$ including those mentioned in Section 1.

Thin plate splines in $d$ variables (of order $m$) consist of radial basis functions plus polynomials of total degree less than $m$ in $d$ variables. ($2m - d > 0$ is required for technical reasons.) Letting $t = (t_1, \ldots, t_d)$, the thin plate splines are minimizers (in an appropriate function space) of

$$\frac{1}{n}\sum_{i=1}^{n} (y_i - f(t(i)))^2 + \lambda \sum_{\alpha_1 + \cdots + \alpha_d = m} \frac{m!}{\alpha_1! \cdots \alpha_d!} \int \cdots \int \left( \frac{\partial^m f}{\partial t_1^{\alpha_1} \cdots \partial t_d^{\alpha_d}} \right)^2 dt_1 \cdots dt_d. \tag{12}$$
Setting $d = 1$, $m = 2$ gives the cubic spline case discussed earlier. Note that there is no penalty on polynomials of total degree less than $m$; the thin plate splines with a particular choice of $\lambda$ are Bayes estimates with an improper prior (that is, infinite variance) on the polynomials of total degree less than $m$, see Wahba (1990) and references cited there.

Related variations on regularized estimates include additive smoothing splines, which are of the form

$$f(t) = \mu + \sum_{\alpha=1}^{d} f_\alpha(t_\alpha) \tag{13}$$

where $\mu$ and the $f_\alpha$ are the solution to a variational problem of the form: Find $\mu$ and $f_1, \ldots, f_d$ in a certain function space to minimize

$$\frac{1}{n}\sum_{i=1}^{n} (y_i - f(t(i)))^2 + \sum_{\alpha=1}^{d} \lambda_\alpha J_\alpha(f_\alpha). \tag{14}$$

The $J_\alpha$ may be of the form of $J$ in Equation (4). Here, there is a regularization parameter for each component. See Hastie and Tibshirani (1990), Wahba (1990). These additive models generalize to smoothing spline analysis of variance (SS-ANOVA) models. In the SS-ANOVA models interaction terms of the form $f_{\alpha\beta}(t_\alpha, t_\beta)$, $f_{\alpha\beta\gamma}(t_\alpha, t_\beta, t_\gamma)$, etc., which satisfy side conditions making them uniquely determined, are added to the representation in Equation (13), and corresponding penalty terms with regularization parameters are added in Equation (14). The $f_\alpha$, etc., may be generalized to themselves being radial basis functions. Behind these models are positive definite functions which are built up via tensor sums and products of positive definite functions, see Gu and Wahba (1993), Wahba (1990), Wahba, Wang, Gu, Klein and Klein (1995). Regression spline ANOVA models can be obtained by setting the $f_\alpha$, $f_{\alpha\beta}$, etc. as linear combinations of a (relatively small) number of basis functions (usually splines). In this case the number of the basis functions is probably the most influential regularization parameter. These and similar methods again all have either explicit or implicit regularization parameters which govern the balance between the complexity of the model and the fit to the data - the bias-variance tradeoff.

The usual criterion for the generalization error when the fit involves minimizing the observed residual sum of squares is the expected (comparative) residual sum of squares for new data, $E\,L(y_{\mathrm{new}}, f_\lambda) = \sigma^2 + L(f_{\mathrm{TRUE}}, f_\lambda)$.
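Because $f_{\mathrm{TRUE}}$ is unknown in practice, the criterion $L(f_{\mathrm{TRUE}}, f_\lambda)$ cannot be computed from data, whereas the GCV score can. For a linear smoother $f_\lambda = A(\lambda) y$ the GCV score is commonly written $V(\lambda) = \frac{1}{n}\|(I - A(\lambda))y\|^2 / [\frac{1}{n}\mathrm{tr}(I - A(\lambda))]^2$; the text cites GCV without writing this formula out, so it is stated here as background. The sketch below compares the GCV choice with the choice that minimizes the true loss on simulated data from model (1). It uses a discrete second-difference smoother as a stand-in for the smoothing spline, and the grid, seed, and range of λ are assumptions.

```python
import numpy as np

def f_true(t):
    # Test function from Equation (1)
    return 4.26 * (np.exp(-t) - 4.0 * np.exp(-2.0 * t) + 3.0 * np.exp(-3.0 * t))

def influence_matrix(n, h, lam):
    # A(lam) such that f_lam = A(lam) y minimizes (1/n)||y - f||^2 + lam * h * ||D f||^2,
    # where D takes second differences divided by h^2 (a discrete second derivative).
    D = np.zeros((n - 2, n))
    for k in range(n - 2):
        D[k, k:k + 3] = [1.0, -2.0, 1.0]
    D /= h**2
    return np.linalg.inv(np.eye(n) + n * lam * h * D.T @ D)

def gcv_score(y, A):
    # V = (1/n)||(I - A) y||^2 / [(1/n) tr(I - A)]^2
    n = len(y)
    resid = y - A @ y
    return np.mean(resid**2) / (np.trace(np.eye(n) - A) / n) ** 2

rng = np.random.default_rng(0)                # hypothetical seed
n, sigma = 100, 0.2
t = np.linspace(0.0, 3.5, n)                  # assumed interval
h = t[1] - t[0]
y = f_true(t) + sigma * rng.normal(size=n)

lams = 10.0 ** np.arange(-8.0, 2.5, 0.5)      # candidate values of lambda
gcv, loss = [], []
for lam in lams:
    A = influence_matrix(n, h, lam)
    gcv.append(gcv_score(y, A))
    loss.append(np.mean((f_true(t) - A @ y) ** 2))   # L(f_TRUE, f_lambda), unobservable in practice

i_gcv, i_best = int(np.argmin(gcv)), int(np.argmin(loss))
print("lambda chosen by GCV:        ", lams[i_gcv], " true loss:", loss[i_gcv])
print("lambda minimizing true loss: ", lams[i_best], " true loss:", loss[i_best])
```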
Here the $y_{\mathrm{new}}$ are new observations. Leaving out one, leaving out 10%, leaving out a 1/3 representative sample ("tuning set") and GCV ("in sample tuning") are popular methods for choosing the tuning parameters to minimize this criterion. Codes in Splus (smooth.spline()), SAS (tpspline), netlib (/gcv), Funfits (sreg, tps), R (smooth.Pspline, gss) and elsewhere are available for implementing the univariate spline, thin plate spline and additive and interaction (ANOVA) splines with GCV to choose single or multiple smoothing parameters. Netlib, Funfits and R are freeware. The smooth.Pspline code in R was used to generate Figure 1.

3 Generalization and Regularization in Soft Classification

Soft classification is a natural goal in certain kinds of demographic medical studies. For example, suppose a large training set is available from a demographic study, consisting of $n$ observations $\{t(i), y_i\}$ where $y_i$ is an indicator (1 or 0) of the presence or absence of some disease in subject $i$ at the end of the study, and $t(i)$ is a vector of values of risk factors for this subject at the beginning of the study. With this kind of data, it is frequently of interest to make a soft classification, that is, to estimate the probability $p(t)$ that a new subject with predictor vector $t$ will contract the disease. A doctor, given this model, may advise new patients which risk factor(s) are important for them to control to reduce the probability of their contracting the disease. A regularized (that is, "smooth") estimate for $p(t)$ is desirable. Regularized estimates can be obtained as follows. First, define

$$f(t) = \log[p(t)/(1 - p(t))]. \tag{15}$$
$f$ is known in the statistics literature as the log odds ratio, or logit. Then $p(t)$ is a sigmoidal function of $f(t)$, that is, $p(t) = e^{f(t)}/(1 + e^{f(t)})$. We will get a regularized estimate for $f$. $L(y, f)$ of Equation (3) will be replaced by an expression more suitable for 0-1 data, by using the likelihood for this data. To describe the likelihood, note that if $y$ is a random variable with $\mathrm{Prob}[y = 1] = p$ and $\mathrm{Prob}[y = 0] = (1 - p)$, then the probability density (or likelihood) $P(y, p)$ for $y$ when $p$ is true is just $P(y, p) = p^{y}(1 - p)^{(1 - y)}$; this merely says $P(1, p) = p$ and $P(0, p) = (1 - p)$. Thus, the likelihood for $y_1, \ldots, y_n$ (assuming that the $y_i$ are independent) is

$$P(y_1, \ldots, y_n;\ p(t(1)), \ldots, p(t(n))) = \prod_{i=1}^{n} p(t(i))^{y_i} (1 - p(t(i)))^{(1 - y_i)}. \tag{16}$$

Substituting $f$ for $p$ in (16) and taking the negative logarithm gives the negative log likelihood $\mathcal{L}(y, f)$ in terms of $f$:

$$-\log P(y_1, \ldots, y_n;\ f(t(1)), \ldots, f(t(n))) \equiv \mathcal{L}(y, f) = \sum_{i=1}^{n} \left[\log(1 + e^{f(t(i))}) - y_i f(t(i))\right]. \tag{17}$$

It is natural for $\mathcal{L}(y, f)$ to replace $L(y, f)$ of (3), (7), (14) when $y_i$ is restricted to 0 or 1, since $L(y, f_{\mathrm{TRUE}})$ is (a multiple of) the negative log likelihood for $y$ generated by a model with Gaussian noise like (1). A neural net implementation of soft classification would consist of finding $f_{NN}(t) = \mathrm{logit}\, p_{NN}(t)$ of the form of Equation (5) to minimize $\mathcal{L}(y, f)$ of (17). If $N$ is large enough, then, in principle, $f_{NN}$ may be driven so that $p_{NN}(t(i))$ is close to 1 if $y_i$ is 1, and is close to 0 if $y_i$ is 0. Again, it is intuitively clear that this is not desirable. As before, a regularized, or "smooth", $f_{NN}$ can be obtained by controlling $N$, penalizing the $w_j$, stopping the iterative fitting early, or some combination of these. Penalized likelihood estimates of $f$ are obtained by minimizing $\mathcal{L}(y, f) + J_\lambda(f)$ where $J_\lambda(f)$ is a penalty functional corresponding to those in Equations (2), (9), (12) or (14) and its generalizations. A popular definition for the generalization error is the (unobservable) comparative Kullback-Leibler distance of the estimate to the true probability distribution, which can be shown to be given by $E\,\mathcal{L}(y_{\mathrm{new}}, f_\lambda) = \mathcal{L}(p_{\mathrm{TRUE}}, f_\lambda)$.
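The passage from (16) to (17) can be verified numerically. The sketch below, with made-up probabilities and outcomes, evaluates the negative log likelihood once directly from the product form (16) and once from the logit form (17). Function names are illustrative.

```python
import numpy as np

def logit(p):
    # Equation (15): f = log[p / (1 - p)]
    return np.log(p / (1.0 - p))

def inverse_logit(f):
    # p = e^f / (1 + e^f), written in the equivalent form 1 / (1 + e^-f)
    return 1.0 / (1.0 + np.exp(-f))

def neg_log_likelihood(y, f):
    # Equation (17): sum_i [log(1 + e^{f_i}) - y_i f_i];
    # np.logaddexp(0, f) evaluates log(1 + e^f) without overflow for large f
    return np.sum(np.logaddexp(0.0, f) - y * f)

def neg_log_likelihood_from_p(y, p):
    # Negative logarithm of the product in Equation (16)
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

p = np.array([0.1, 0.5, 0.9, 0.7])       # hypothetical values of p(t(i))
y = np.array([0.0, 1.0, 1.0, 0.0])       # hypothetical 0-1 outcomes
f = logit(p)

print(neg_log_likelihood(y, f))
print(neg_log_likelihood_from_p(y, p))   # same value, computed from (16)
```

Since (17) is linear in $y$, taking the expectation over new observations amounts to substituting $p_{\mathrm{TRUE}}(t(i))$ for $y_i$, which is how $\mathcal{L}(p_{\mathrm{TRUE}}, f_\lambda)$ in the comparative Kullback-Leibler criterion can be read.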
An estimate of the $\lambda$ which minimizes this criterion can be obtained by withholding a representative subset $y^{[\text{left out}]}$ of the training set and choosing $\lambda$ to minimize $\mathcal{L}(y^{[\text{left out}]}, f_\lambda)$. Leaving-out-one estimates are also possible but generally not feasible in this case. Generalized approximate cross validation (GACV) is a feasible in-sample method of choosing $\lambda$; based on a leaving-out-one argument, it has been shown in simulation studies to provide a good estimate of the minimizer of $\mathcal{L}(p_{\mathrm{TRUE}}, f_\lambda)$, see Wahba, Lin, Gao, Xiang, Klein and Klein (1999).

4 Generalization and Regularization in Hard Classification

In the hard classification problem (here we will consider only two classes for simplicity), we are only interested in estimating whether an example with vector $t$ is in class A or not. This is the typical situation in, for example, character recognition, voice recognition, and other situations where it is known that the $t$'s from the two classes being examined are generally well separated. In that case (assuming, for simplicity, that the examples from the two classes are represented in the training set equally as in the future population of interest, and that costs of misclassification are the same for both classes), the optimum classifier (to minimize the expected cost) would be A if $p(t)$ is greater than one-half, and not A otherwise. Equivalently, the same rule can be implemented by examining the sign of the logit $f(t)$. Here we are identifying A with the 1's, and optimum is with respect to minimizing the expected cost of future misclassification. Unfortunately, in general it is neither desirable nor feasible to estimate the logit $f$ directly by the methods of Section 3, because in the well separated case $f$ takes on values near $\pm\infty$, and, if $d$ and/or the sample size $n$ is large, solving the penalized likelihood problem of Section 3 is likely to be numerically unstable. Recently, support vector machines (SVM's) have been shown to provide an excellent method for classification in this situation. See Burges (1998). The support vector machine (SVM) is implemented by coding the $y_i$ as $\pm 1$ according as the $i$th example is in A or not. Given a positive definite function $R(s, t)$, we find a function $f$ of the form $f(t) = b + \sum_{i=1}^{n} c_i R(t, t(i))$
by finding $b$ and $c = (c_1, \ldots, c_n)$ to minimize

$$\frac{1}{n}\sum_{i=1}^{n} (1 - y_i f(t(i)))_+ + \lambda \sum_{i,j} c_i c_j R(t(i), t(j)) \tag{18}$$

where $(\tau)_+ = \tau$ for $\tau > 0$ and 0 otherwise. Letting $f_\lambda$ be the minimizer of (18), the classification algorithm is: for a new attribute vector $t$, assign A if $f_\lambda(t) > 0$ and not A if $f_\lambda(t) < 0$. Lin (1999) has demonstrated the remarkable result that, under general circumstances with appropriately chosen $\lambda$, the SVM estimate $f_\lambda$ tends almost everywhere to either 1 or $-1$ and is an estimate of $\mathrm{sign} f_{\mathrm{TRUE}} \equiv \mathrm{sign}(p_{\mathrm{TRUE}} - \tfrac{1}{2})$, which is exactly what is needed to carry out the optimum classification algorithm. A popular choice for $R(s, t)$ is $R(s, t) = \exp\{-\tfrac{1}{\sigma^2}\|s - t\|^2\}$ where $\|\cdot\|$ is the Euclidean norm. In this choice of $R(\cdot, \cdot)$ the result may be sensitive to both $\sigma$ and $\lambda$. As before, the $\lambda$ and $\sigma$ may be chosen by leaving out a representative subset of the observations and choosing $\lambda$ and $\sigma$ to minimize some measure of the generalization error. Here the natural choice for generalization error would be the misclassification rate. A version of GACV for SVM's, again based on a leaving-out-one argument, may be used as an in-sample method for choosing $\lambda$ and $\sigma$, see Wahba, Lin and Zhang (1999). The generalization error target for the GACV is $E\,\frac{1}{n}\sum_{i=1}^{n} (1 - y_{i,\mathrm{new}} f_\lambda(t(i)))_+$. However, $\tfrac{1}{2} E\,\frac{1}{n}\sum_{i=1}^{n} (1 - y_{i,\mathrm{new}}\, \mathrm{sign}[f_\lambda(t(i))])_+$ is the expected misclassification rate, so that to the extent that $f_\lambda$ resembles $\mathrm{sign} f_\lambda$, this criterion will be appropriate for the generalization error.

5 Choosing How Much to Regularize

At the time of this writing, it is a matter of lively debate and much research how to choose the various regularization parameters. Leaving out a large fraction of the training sample for this purpose and tuning the regularization parameter(s) to best predict the left-out data (according to whatever criterion of best prediction is adopted) is conceptually simple, defensible, and widely used (this is called out-of-sample tuning). Successively leaving-out-one, successively leaving-out-10%, and the in-sample methods GCV and GACV are all popular. See also Ye (1998) who discusses in-sample tuning methods related to GCV in the Gaussian case which allow comparisons across different regularized estimates.
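Out-of-sample tuning as described above can be sketched in a few lines for a regression estimate of the form (6). The example sets aside one third of a simulated sample as a tuning set, fits on the remainder for each candidate $\lambda$, and keeps the value with the smallest squared prediction error on the tuning set. The data-generating function, the Gaussian kernel and its scale, the grid of $\lambda$ values, and all names are assumptions for illustration.

```python
import numpy as np

def kernel(S, T, scale=0.5):
    # R(s, t) = exp(-(s - t)^2 / scale^2) for one-dimensional inputs
    return np.exp(-(S[:, None] - T[None, :]) ** 2 / scale**2)

def fit(t_train, y_train, lam):
    # Coefficients of the estimate (6) with centers at the training inputs and J = c'Kc
    n = len(y_train)
    K = kernel(t_train, t_train)
    return np.linalg.solve(K + n * lam * np.eye(n), y_train)

def predict(t_new, t_train, c):
    return kernel(t_new, t_train) @ c

rng = np.random.default_rng(4)                     # hypothetical seed
n = 150
t = rng.uniform(0.0, 3.0, size=n)
y = np.sin(2.0 * t) + 0.3 * rng.normal(size=n)     # made-up data

idx = rng.permutation(n)
tune, train = idx[:50], idx[50:]                   # leave out a 1/3 tuning set

lams = 10.0 ** np.arange(-6, 3)                    # candidate regularization parameters
errors = []
for lam in lams:
    c = fit(t[train], y[train], lam)
    errors.append(np.mean((y[tune] - predict(t[tune], t[train], c)) ** 2))

lam_hat = lams[int(np.argmin(errors))]
print("tuning-set errors:", np.round(errors, 4))
print("chosen lambda:", lam_hat)
```

The same loop applies to other criteria of best prediction, for example the misclassification rate in the hard classification setting, by replacing the squared error on the tuning set.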
In the Normally distributed observational error case, if the standard deviation of the observational error ($\sigma$ in Equation (1)) is known then unbiased risk estimates become available. See Li (1986), Wahba (1990) and references cited there. When there is a Bayesian model behind the regularization procedure, then maximum likelihood estimates may be derived, see Wahba (1985), although in order for these and other Bayes estimates to do a good job of minimizing the generalization error in practice, it is usually necessary that the priors on which they are based are realistic.

6 Which method is best?

Feedforward neural nets, radial basis functions, and various forms of splines all provide regularized or regularizable methods for estimating "smooth" functions of several variables, given a training set $\{t(i), y_i\}$: Which approach is best? Unfortunately, there is not, nor is there likely to be, a single answer to that question. The answer most surely depends on the particular nature of the underlying but unknown truth, the nature of any prior information that might be available about this truth, the nature of any noise in the data, the ability of the experimenter to choose the various smoothing or regularization parameters well, the size of the data set, the use to which the answer will be put, and the computational facilities available. From a mathematical point of view, the classes of functions well approximated by neural nets, radial basis functions, additive and interaction splines (ANOVA splines) are not the same, although all of these methods have the capability of approximating large classes of functions. Of course, if a large enough data set is available, models utilizing all of these approaches may be built, and tuned, and then compared on data that has been set aside for this
purpose. In-sample tuning methods for comparison across different regularized estimates in the hard and soft classification contexts are an area of active research.

REFERENCES

Burges, C. (1998), A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery 2.

Girard, D. (1998), Asymptotic comparison of (partial) cross-validation, GCV and randomized GCV in nonparametric regression, Ann. Statist. 26.

Girosi, F., Jones, M. & Poggio, T. (1995), Regularization theory and neural networks architectures, Neural Computation 7.

Gu, C. & Wahba, G. (1993), Semiparametric analysis of variance with tensor product thin plate splines, J. Royal Statistical Soc. Ser. B 55.

Hastie, T. & Tibshirani, R. (1990), Generalized Additive Models, Chapman and Hall.

Kimeldorf, G. & Wahba, G. (1970), A correspondence between Bayesian estimation of stochastic processes and smoothing by splines, Ann. Math. Statist. 41.

Li, K. C. (1986), Asymptotic optimality of C_L and generalized cross validation in ridge regression with application to spline smoothing, Ann. Statist. 14.

Lin, Y. (1999), Support vector machines and the Bayes rule in classification, Technical Report 1014, Department of Statistics, University of Wisconsin, Madison WI.

Ramsay, J. & Silverman, B. (1997), Functional Data Analysis, Springer.

Wahba, G. (1985), A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline smoothing problem, Ann. Statist. 13.

Wahba, G. (1990), Spline Models for Observational Data, SIAM. CBMS-NSF Regional Conference Series in Applied Mathematics, v. 59.

Wahba, G., Lin, X., Gao, F., Xiang, D., Klein, R. & Klein, B. (1999), The bias-variance tradeoff and the randomized GACV, in M. Kearns, S. Solla & D. Cohn, eds, Advances in Information Processing Systems 11, MIT Press. Full oral presentation at NIPS 11.

Wahba, G., Lin, Y. & Zhang, H. (1999), Generalized approximate cross validation for support vector machines, or, another way to look at margin-like quantities, Technical Report 1006, Department of Statistics, University of Wisconsin, Madison WI. To appear, Advances in Large Margin Classifiers, A. Smola, P. Bartlett, B. Schölkopf and D. Schuurmans, eds, MIT Press.

Wahba, G., Wang, Y., Gu, C., Klein, R. & Klein, B. (1995), Smoothing spline ANOVA for exponential families, with application to the Wisconsin Epidemiological Study of Diabetic Retinopathy, Ann. Statist. 23. Neyman Lecture.

Ye, J. (1998), On measuring and correcting the effects of data mining and model selection, J. Amer. Statist. Assoc. 93.
More informationReview for cumulative test
Hrs Math 3 review prblems Jauary, 01 cumulative: Chapters 1- page 1 Review fr cumulative test O Mday, Jauary 7, Hrs Math 3 will have a curse-wide cumulative test cverig Chapters 1-. Yu ca expect the test
More informationUnifying the Derivations for. the Akaike and Corrected Akaike. Information Criteria. from Statistics & Probability Letters,
Uifyig the Derivatis fr the Akaike ad Crrected Akaike Ifrmati Criteria frm Statistics & Prbability Letters, Vlume 33, 1997, pages 201{208. by Jseph E. Cavaaugh Departmet f Statistics, Uiversity f Missuri,
More informationA Hartree-Fock Calculation of the Water Molecule
Chemistry 460 Fall 2017 Dr. Jea M. Stadard Nvember 29, 2017 A Hartree-Fck Calculati f the Water Mlecule Itrducti A example Hartree-Fck calculati f the water mlecule will be preseted. I this case, the water
More information6.867 Machine learning, lecture 14 (Jaakkola)
6.867 Machie learig, lecture 14 (Jaakkla) 1 Lecture tpics: argi ad geeralizati liear classifiers esebles iture dels Margi ad geeralizati: liear classifiers As we icrease the uber f data pits, ay set f
More information, the random variable. and a sample size over the y-values 0:1:10.
Lecture 3 (4//9) 000 HW PROBLEM 3(5pts) The estimatr i (c) f PROBLEM, p 000, where { } ~ iid bimial(,, is 000 e f the mst ppular statistics It is the estimatr f the ppulati prprti I PROBLEM we used simulatis
More informationAP Statistics Notes Unit Eight: Introduction to Inference
AP Statistics Ntes Uit Eight: Itrducti t Iferece Syllabus Objectives: 4.1 The studet will estimate ppulati parameters ad margis f errrs fr meas. 4.2 The studet will discuss the prperties f pit estimatrs,
More informationPhysical Chemistry Laboratory I CHEM 445 Experiment 2 Partial Molar Volume (Revised, 01/13/03)
Physical Chemistry Labratry I CHEM 445 Experimet Partial Mlar lume (Revised, 0/3/03) lume is, t a gd apprximati, a additive prperty. Certaily this apprximati is used i preparig slutis whse ccetratis are
More information[1 & α(t & T 1. ' ρ 1
NAME 89.304 - IGNEOUS & METAMORPHIC PETROLOGY DENSITY & VISCOSITY OF MAGMAS I. Desity The desity (mass/vlume) f a magma is a imprtat parameter which plays a rle i a umber f aspects f magma behavir ad evluti.
More informationA New Method for Finding an Optimal Solution. of Fully Interval Integer Transportation Problems
Applied Matheatical Scieces, Vl. 4, 200,. 37, 89-830 A New Methd fr Fidig a Optial Sluti f Fully Iterval Iteger Trasprtati Prbles P. Padia ad G. Nataraja Departet f Matheatics, Schl f Advaced Scieces,
More informationx 2 x 3 x b 0, then a, b, c log x 1 log z log x log y 1 logb log a dy 4. dx As tangent is perpendicular to the x axis, slope
The agle betwee the tagets draw t the parabla y = frm the pit (-,) 5 9 6 Here give pit lies the directri, hece the agle betwee the tagets frm that pit right agle Ratig :EASY The umber f values f c such
More information10-701/ Machine Learning Mid-term Exam Solution
0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it
More informationThe Acoustical Physics of a Standing Wave Tube
UIUC Physics 93POM/Physics 406POM The Physics f Music/Physics f Musical Istrumets The Acustical Physics f a Stadig Wave Tube A typical cylidrical-shaped stadig wave tube (SWT) {aa impedace tube} f legth
More informationChristensen, Mads Græsbøll; Vera-Candeas, Pedro; Somasundaram, Samuel D.; Jakobsson, Andreas
Dwladed frm vb.aau.dk : April 12, 2019 Aalbrg Uiversitet Rbust Subspace-based Fudametal Frequecy Estimati Christese, Mads Græsbøll; Vera-Cadeas, Pedr; Smasudaram, Samuel D.; Jakbss, Adreas Published i:
More informationBayesian Estimation for Continuous-Time Sparse Stochastic Processes
Bayesia Estimati fr Ctiuus-Time Sparse Stchastic Prcesses Arash Amii, Ulugbek S Kamilv, Studet, IEEE, Emrah Bsta, Studet, IEEE, Michael User, Fellw, IEEE Abstract We csider ctiuus-time sparse stchastic
More informationMean residual life of coherent systems consisting of multiple types of dependent components
Mea residual life f cheret systems csistig f multiple types f depedet cmpets Serka Eryilmaz, Frak P.A. Cle y ad Tahai Cle-Maturi z February 20, 208 Abstract Mea residual life is a useful dyamic characteristic
More informationAxial Temperature Distribution in W-Tailored Optical Fibers
Axial Temperature Distributi i W-Tailred Optical ibers Mhamed I. Shehata (m.ismail34@yah.cm), Mustafa H. Aly(drmsaly@gmail.cm) OSA Member, ad M. B. Saleh (Basheer@aast.edu) Arab Academy fr Sciece, Techlgy
More information6.867 Machine learning
6.867 Machie learig Mid-term exam October, ( poits) Your ame ad MIT ID: Problem We are iterested here i a particular -dimesioal liear regressio problem. The dataset correspodig to this problem has examples
More informationWEST VIRGINIA UNIVERSITY
WEST VIRGINIA UNIVERSITY PLASMA PHYSICS GROUP INTERNAL REPORT PL - 045 Mea Optical epth ad Optical Escape Factr fr Helium Trasitis i Helic Plasmas R.F. Bivi Nvember 000 Revised March 00 TABLE OF CONTENT.0
More informationStudy of Energy Eigenvalues of Three Dimensional. Quantum Wires with Variable Cross Section
Adv. Studies Ther. Phys. Vl. 3 009. 5 3-0 Study f Eergy Eigevalues f Three Dimesial Quatum Wires with Variale Crss Secti M.. Sltai Erde Msa Departmet f physics Islamic Aad Uiversity Share-ey rach Ira alrevahidi@yah.cm
More informationHypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal
Hypthesis Testing and Cnfidence Intervals (Part 1): Using the Standard Nrmal Lecture 8 Justin Kern April 2, 2017 Inferential Statistics Hypthesis Testing One sample mean / prprtin Tw sample means / prprtins
More informationMatching a Distribution by Matching Quantiles Estimation
Jural f the America Statistical Assciati ISSN: 0162-1459 (Prit) 1537-274X (Olie) Jural hmepage: http://www.tadflie.cm/li/uasa20 Matchig a Distributi by Matchig Quatiles Estimati Niklas Sgurpuls, Qiwei
More informationMATHEMATICS 9740/01 Paper 1 14 Sep hours
Cadidate Name: Class: JC PRELIMINARY EXAM Higher MATHEMATICS 9740/0 Paper 4 Sep 06 3 hurs Additial Materials: Cver page Aswer papers List f Frmulae (MF5) READ THESE INSTRUCTIONS FIRST Write yur full ame
More informationInternal vs. external validity. External validity. Internal validity
Secti 7 Mdel Assessmet Iteral vs. exteral validity Iteral validity refers t whether the aalysis is valid fr the pplati ad sample beig stdied. Exteral validity refers t whether these reslts ca be geeralized
More informationMODIFIED LEAKY DELAYED LMS ALGORITHM FOR IMPERFECT ESTIMATE SYSTEM DELAY
5th Eurpea Sigal Prcessig Cferece (EUSIPCO 7), Pza, Plad, September 3-7, 7, cpyright by EURASIP MOIFIE LEAKY ELAYE LMS ALGORIHM FOR IMPERFEC ESIMAE SYSEM ELAY Jua R. V. López, Orlad J. bias, ad Rui Seara
More informationPreliminary Test Single Stage Shrinkage Estimator for the Scale Parameter of Gamma Distribution
America Jural f Mathematics ad Statistics, (3): 3-3 DOI:.593/j.ajms.3. Prelimiary Test Sigle Stage Shrikage Estimatr fr the Scale Parameter f Gamma Distributi Abbas Najim Salma,*, Aseel Hussei Ali, Mua
More informationOutline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression
REGRESSION 1 Outlie Liear regressio Regularizatio fuctios Polyomial curve fittig Stochastic gradiet descet for regressio MLE for regressio Step-wise forward regressio Regressio methods Statistical techiques
More informationThe Method of Least Squares. To understand least squares fitting of data.
The Method of Least Squares KEY WORDS Curve fittig, least square GOAL To uderstad least squares fittig of data To uderstad the least squares solutio of icosistet systems of liear equatios 1 Motivatio Curve
More information5.80 Small-Molecule Spectroscopy and Dynamics
MIT OpeCurseWare http://cw.mit.edu 5.8 Small-Mlecule Spectrscpy ad Dyamics Fall 8 Fr ifrmati abut citig these materials r ur Terms f Use, visit: http://cw.mit.edu/terms. 5.8 Lecture #33 Fall, 8 Page f
More informationSound Absorption Characteristics of Membrane- Based Sound Absorbers
Purdue e-pubs Publicatis f the Ray W. Schl f Mechaical Egieerig 8-28-2003 Sud Absrpti Characteristics f Membrae- Based Sud Absrbers J Stuart Blt, blt@purdue.edu Jih Sg Fllw this ad additial wrks at: http://dcs.lib.purdue.edu/herrick
More informationk-nearest Neighbor How to choose k Average of k points more reliable when: Large k: noise in attributes +o o noise in class labels
Mtivating Example Memry-Based Learning Instance-Based Learning K-earest eighbr Inductive Assumptin Similar inputs map t similar utputs If nt true => learning is impssible If true => learning reduces t
More informationMixtures of Gaussians and the EM Algorithm
Mixtures of Gaussias ad the EM Algorithm CSE 6363 Machie Learig Vassilis Athitsos Computer Sciece ad Egieerig Departmet Uiversity of Texas at Arligto 1 Gaussias A popular way to estimate probability desity
More informationSuper-efficiency Models, Part II
Super-efficiec Mdels, Part II Emilia Niskae The 4th f Nvember S steemiaalsi Ctets. Etesis t Variable Returs-t-Scale (0.4) S steemiaalsi Radial Super-efficiec Case Prblems with Radial Super-efficiec Case
More informationClaude Elysée Lobry Université de Nice, Faculté des Sciences, parc Valrose, NICE, France.
CHAOS AND CELLULAR AUTOMATA Claude Elysée Lbry Uiversité de Nice, Faculté des Scieces, parc Valrse, 06000 NICE, Frace. Keywrds: Chas, bifurcati, cellularautmata, cmputersimulatis, dyamical system, ifectius
More informationResampling Methods. Cross-validation, Bootstrapping. Marek Petrik 2/21/2017
Resampling Methds Crss-validatin, Btstrapping Marek Petrik 2/21/2017 Sme f the figures in this presentatin are taken frm An Intrductin t Statistical Learning, with applicatins in R (Springer, 2013) with
More informationRegression Quantiles for Time Series Data ZONGWU CAI. Department of Mathematics. Abstract
Regressi Quatiles fr Time Series Data ZONGWU CAI Departmet f Mathematics Uiversity f Nrth Carlia Charltte, NC 28223, USA E-mail: zcai@ucc.edu Abstract I this article we study parametric estimati f regressi
More informationIdentical Particles. We would like to move from the quantum theory of hydrogen to that for the rest of the periodic table
We wuld like t ve fr the quatu thery f hydrge t that fr the rest f the peridic table Oe electr at t ultielectr ats This is cplicated by the iteracti f the electrs with each ther ad by the fact that the
More informationEnergy xxx (2011) 1e10. Contents lists available at ScienceDirect. Energy. journal homepage:
Eergy xxx (2011) 1e10 Ctets lists available at ScieceDirect Eergy jural hmepage: www.elsevier.cm/lcate/eergy Multi-bjective ptimizati f HVAC system with a evlutiary cmputati algrithm Adrew Kusiak *, Fa
More informationActive redundancy allocation in systems. R. Romera; J. Valdés; R. Zequeira*
Wrkig Paper -6 (3) Statistics ad Ecmetrics Series March Departamet de Estadística y Ecmetría Uiversidad Carls III de Madrid Calle Madrid, 6 893 Getafe (Spai) Fax (34) 9 64-98-49 Active redudacy allcati
More informationStatistica Sinica 6(1996), SOME PROBLEMS ON THE ESTIMATION OF UNIMODAL DENSITIES Peter J. Bickel and Jianqing Fan University of California and U
Statistica Siica 6(996), 23-45 SOME PROBLEMS ON THE ESTIMATION OF UNIMODAL DENSITIES Peter J. Bickel ad Jiaqig Fa Uiversity f Califria ad Uiversity f Nrth Carlia Abstract: I this paper, we study, i sme
More informationStudy the bias (due to the nite dimensional approximation) and variance of the estimators
2 Series Methods 2. Geeral Approach A model has parameters (; ) where is ite-dimesioal ad is oparametric. (Sometimes, there is o :) We will focus o regressio. The fuctio is approximated by a series a ite
More informationExamination No. 3 - Tuesday, Nov. 15
NAME (lease rit) SOLUTIONS ECE 35 - DEVICE ELECTRONICS Fall Semester 005 Examiati N 3 - Tuesday, Nv 5 3 4 5 The time fr examiati is hr 5 mi Studets are allwed t use 3 sheets f tes Please shw yur wrk, artial
More informationIntroduction to Machine Learning DIS10
CS 189 Fall 017 Itroductio to Machie Learig DIS10 1 Fu with Lagrage Multipliers (a) Miimize the fuctio such that f (x,y) = x + y x + y = 3. Solutio: The Lagragia is: L(x,y,λ) = x + y + λ(x + y 3) Takig
More informationElectrostatics. . where,.(1.1) Maxwell Eqn. Total Charge. Two point charges r 12 distance apart in space
Maxwell Eq. E ρ Electrstatics e. where,.(.) first term is the permittivity i vacuum 8.854x0 C /Nm secd term is electrical field stregth, frce/charge, v/m r N/C third term is the charge desity, C/m 3 E
More informationCS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5
CS434a/54a: Patter Recogitio Prof. Olga Veksler Lecture 5 Today Itroductio to parameter estimatio Two methods for parameter estimatio Maimum Likelihood Estimatio Bayesia Estimatio Itroducto Bayesia Decisio
More informationCOMP 551 Applied Machine Learning Lecture 5: Generative models for linear classification
COMP 551 Applied Machine Learning Lecture 5: Generative mdels fr linear classificatin Instructr: Herke van Hf (herke.vanhf@mail.mcgill.ca) Slides mstly by: Jelle Pineau Class web page: www.cs.mcgill.ca/~hvanh2/cmp551
More informationPattern Recognition 2014 Support Vector Machines
Pattern Recgnitin 2014 Supprt Vectr Machines Ad Feelders Universiteit Utrecht Ad Feelders ( Universiteit Utrecht ) Pattern Recgnitin 1 / 55 Overview 1 Separable Case 2 Kernel Functins 3 Allwing Errrs (Sft
More informationHIGH-DIMENSIONAL data are common in many scientific
IEEE RANSACIONS ON KNOWLEDGE AND DAA ENGINEERING, VOL. 20, NO. 10, OCOBER 2008 1311 Kerel Ucrrelated ad Regularized Discrimiat Aalysis: A heretical ad Cmputatial Study Shuiwag Ji ad Jiepig Ye, Member,
More informationPortfolio Performance Evaluation in a Modified Mean-Variance-Skewness Framework with Negative Data
Available lie at http://idea.srbiau.ac.ir It. J. Data Evelpmet Aalysis (ISSN 345-458X) Vl., N.3, Year 04 Article ID IJDEA-003,3 pages Research Article Iteratial Jural f Data Evelpmet Aalysis Sciece ad
More informationE o and the equilibrium constant, K
lectrchemical measuremets (Ch -5 t 6). T state the relati betwee ad K. (D x -b, -). Frm galvaic cell vltage measuremet (a) K sp (D xercise -8, -) (b) K sp ad γ (D xercise -9) (c) K a (D xercise -G, -6)
More informationMultilayer perceptrons
Multilayer perceptros If traiig set is ot liearly separable, a etwork of McCulloch-Pitts uits ca give a solutio If o loop exists i etwork, called a feedforward etwork (else, recurret etwork) A two-layer
More informationCOWLES FOUNDATION FOR RESEARCH IN ECONOMICS YALE UNIVERSITY
HAC ESTIMATION BY AUTOMATED REGRESSION By Peter C.B. Phillips July 004 COWLES FOUNDATION DISCUSSION PAPER NO. 470 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS YALE UNIVERSITY Bx 088 New Have, Cecticut 0650-88
More informationReview of Important Concepts
Appedix 1 Review f Imprtat Ccepts I 1 AI.I Liear ad Matrix Algebra Imprtat results frm liear ad matrix algebra thery are reviewed i this secti. I the discussis t fllw it is assumed that the reader already
More informationExpectation-Maximization Algorithm.
Expectatio-Maximizatio Algorithm. Petr Pošík Czech Techical Uiversity i Prague Faculty of Electrical Egieerig Dept. of Cyberetics MLE 2 Likelihood.........................................................................................................
More informationMachine Learning Theory Tübingen University, WS 2016/2017 Lecture 12
Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig
More informationECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015
ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],
More informationRecovery of Third Order Tensors via Convex Optimization
Recvery f Third Order Tesrs via Cvex Optimizati Hlger Rauhut RWTH Aache Uiversity Lehrstuhl C für Mathematik (Aalysis) Ptdriesch 10 5056 Aache Germay Email: rauhut@mathcrwth-aachede Željka Stjaac RWTH
More informationSpatio-temporal Modeling of Environmental Data for Epidemiologic Health Effects Analyses
Spati-tempral Mdelig f Evirmetal Data fr Epidemilgic Health Effects Aalyses Paul D. Samps Uiversity f Washigt Air Quality ad Health: a glbal issue with lcal challeges 8 Aug 2017 -- Mexic City 1 The MESA
More informationEvery gas consists of a large number of small particles called molecules moving with very high velocities in all possible directions.
Kietic thery f gases ( Kietic thery was develped by Berlli, Jle, Clasis, axwell ad Bltzma etc. ad represets dyamic particle r micrscpic mdel fr differet gases sice it thrws light the behir f the particles
More informationDesign and Implementation of Cosine Transforms Employing a CORDIC Processor
C16 1 Desig ad Implemetati f Csie Trasfrms Emplyig a CORDIC Prcessr Sharaf El-Di El-Nahas, Ammar Mttie Al Hsaiy, Magdy M. Saeb Arab Academy fr Sciece ad Techlgy, Schl f Egieerig, Alexadria, EGYPT ABSTRACT
More informationSolutions to Midterm II. of the following equation consistent with the boundary condition stated u. y u x y
Sltis t Midterm II Prblem : (pts) Fid the mst geeral slti ( f the fllwig eqati csistet with the bdary cditi stated y 3 y the lie y () Slti : Sice the system () is liear the slti is give as a sperpsiti
More informationChapter 5. Root Locus Techniques
Chapter 5 Rt Lcu Techique Itrducti Sytem perfrmace ad tability dt determied dby cled-lp l ple Typical cled-lp feedback ctrl ytem G Ope-lp TF KG H Zer -, - Ple 0, -, - K Lcati f ple eaily fud Variati f
More informationFull algebra of generalized functions and non-standard asymptotic analysis
Full algebra f geeralized fuctis ad -stadard asympttic aalysis Tdr D. Tdrv Has Veraeve Abstract We cstruct a algebra f geeralized fuctis edwed with a caical embeddig f the space f Schwartz distributis.
More information, which yields. where z1. and z2
The Gaussian r Nrmal PDF, Page 1 The Gaussian r Nrmal Prbability Density Functin Authr: Jhn M Cimbala, Penn State University Latest revisin: 11 September 13 The Gaussian r Nrmal Prbability Density Functin
More informationControl Systems. Controllability and Observability (Chapter 6)
6.53 trl Systems trllaility ad Oservaility (hapter 6) Geeral Framewrk i State-Spae pprah Give a LTI system: x x u; y x (*) The system might e ustale r des t meet the required perfrmae spe. Hw a we imprve
More informationThe generation of successive approximation methods for Markov decision processes by using stopping times
The geerati f successive apprximati methds fr Markv decisi prcesses by usig stppig times Citati fr published versi (APA): va Nue, J. A. E. E., & Wessels, J. (1976). The geerati f successive apprximati
More informationDistributed Trajectory Generation for Cooperative Multi-Arm Robots via Virtual Force Interactions
862 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS PART B: CYBERNETICS, VOL. 27, NO. 5, OCTOBER 1997 Distributed Trajectry Geerati fr Cperative Multi-Arm Rbts via Virtual Frce Iteractis Tshi Tsuji,
More informationFrequency-Domain Study of Lock Range of Injection-Locked Non- Harmonic Oscillators
0 teratial Cferece mage Visi ad Cmputig CVC 0 PCST vl. 50 0 0 ACST Press Sigapre DO: 0.776/PCST.0.V50.6 Frequecy-Dmai Study f Lck Rage f jecti-lcked N- armic Oscillatrs Yushi Zhu ad Fei Yua Departmet f
More informationGrouping 2: Spectral and Agglomerative Clustering. CS 510 Lecture #16 April 2 nd, 2014
Groupig 2: Spectral ad Agglomerative Clusterig CS 510 Lecture #16 April 2 d, 2014 Groupig (review) Goal: Detect local image features (SIFT) Describe image patches aroud features SIFT, SURF, HoG, LBP, Group
More information6 Integers Modulo n. integer k can be written as k = qn + r, with q,r, 0 r b. So any integer.
6 Itegers Modulo I Example 2.3(e), we have defied the cogruece of two itegers a,b with respect to a modulus. Let us recall that a b (mod ) meas a b. We have proved that cogruece is a equivalece relatio
More informationESWW-2. Israeli semi-underground great plastic scintillation multidirectional muon telescope (ISRAMUTE) for space weather monitoring and forecasting
ESWW-2 Israeli semi-udergrud great plastic scitillati multidirectial mu telescpe (ISRAMUTE) fr space weather mitrig ad frecastig L.I. Drma a,b, L.A. Pustil'ik a, A. Sterlieb a, I.G. Zukerma a (a) Israel
More informationThe Maximum-Likelihood Decoding Performance of Error-Correcting Codes
The Maximum-Lielihood Decodig Performace of Error-Correctig Codes Hery D. Pfister ECE Departmet Texas A&M Uiversity August 27th, 2007 (rev. 0) November 2st, 203 (rev. ) Performace of Codes. Notatio X,
More informationThe generalized marginal rate of substitution
Jural f Mathematical Ecmics 31 1999 553 560 The geeralized margial rate f substituti M Besada, C Vazuez ) Facultade de Ecmicas, UiÕersidade de Vig, Aptd 874, 3600 Vig, Spai Received 31 May 1995; accepted
More informationLecture 2: Supervised vs. unsupervised learning, bias-variance tradeoff
Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeff Reading: Chapter 2 STATS 202: Data mining and analysis September 27, 2017 1 / 20 Supervised vs. unsupervised learning In unsupervised
More informationDirectional Duality Theory
Suther Illiis Uiversity Carbdale OpeSIUC Discussi Papers Departmet f Ecmics 2004 Directial Duality Thery Daiel Primt Suther Illiis Uiversity Carbdale Rlf Fare Oreg State Uiversity Fllw this ad additial
More informationACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Statistics
ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER 1 018/019 DR. ANTHONY BROWN 8. Statistics 8.1. Measures of Cetre: Mea, Media ad Mode. If we have a series of umbers the
More information