Marginal Conceptual Predictive Statistic for Mixed Model Selection


Open Journal of Statistics, 2016, 6
Published Online April 2016 in SciRes

Cheng Wenren, Junfeng Shang, Juming Pan

Process Modeling Analytics Department, Bristol-Myers Squibb, New York, NY, USA
Bowling Green State University, Bowling Green, OH, USA

Received 9 March 2016; accepted 3 April 2016; published 6 April 2016

Copyright 2016 by authors and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY).

How to cite this paper: Wenren, C., Shang, J.F. and Pan, J.M. (2016) Marginal Conceptual Predictive Statistic for Mixed Model Selection. Open Journal of Statistics, 6.

Abstract

We focus on the development of model selection criteria in linear mixed models. In particular, we propose model selection criteria following the Mallows' Conceptual Predictive Statistic ($C_p$) [1] [2] in linear mixed models. When correlation exists between the observations in data, the normal Gauss discrepancy in the univariate case is not appropriate for measuring the distance between the true model and a candidate model. Instead, we define a marginal Gauss discrepancy which takes the correlation into account in the mixed models. The model selection criterion, marginal $C_p$, called $MC_p$, serves as an asymptotically unbiased estimator of the expected marginal Gauss discrepancy. An improvement of $MC_p$, called $IMC_p$, is then derived and proved to be a more accurate estimator of the expected marginal Gauss discrepancy than $MC_p$. The performance of the proposed criteria is investigated in a simulation study. The simulation results show that in small samples, the proposed criteria outperform the Akaike Information Criterion (AIC) [3] [4] and the Bayesian Information Criterion (BIC) [5] in selecting the correct model; in large samples, their performance is competitive. Further, the proposed criteria perform significantly better for highly correlated response data than for weakly correlated data.

Keywords

Mixed Model Selection, Marginal $C_p$, Improved Marginal $C_p$, Marginal Gauss Discrepancy, Linear Mixed Model

1. Introduction

With the development of data science over the past decades, people have become more aware of the complexity of data in real life. Univariate linear regression models with independent identically distributed (i.i.d.) Gaussian errors cannot achieve a good fit for some types of data, especially for data with correlated observations. For instance, in longitudinal data, observations are usually recorded from the same individual over time. It is reasonable to assume that correlation exists among the observations from the same individual, and linear mixed models are therefore appropriate for modeling such data.

Since linear mixed models are extensively used, mixed model selection plays an important role in the statistical literature. The aim of mixed model selection is to choose the most appropriate model from a candidate pool in the mixed model setting. To facilitate this task, a variety of model selection criteria are employed to implement the selection process.

In linear mixed models, a number of criteria have been developed for model selection. The most widely used are the information criteria such as the AIC [3] [4] and the BIC [5]. Sugiura [6] proposed a marginal AIC (mAIC) which incorporates the number of random effects parameters into the penalty term. Shang and Cavanaugh [7] employed the bootstrap method to estimate the penalty term of mAIC, proposing two variants of AIC. For longitudinal data, a special case of linear mixed models, Azari, Li and Tsai [8] proposed a corrected Akaike Information Criterion (AICc). In the justification of AICc, the paper mainly handled the challenge posed by the correlation matrix under certain conditions for the mixed models. Vaida and Blanchard [9] redefined the Akaike information based on the best linear unbiased predictor (BLUP) [10]-[12] for the random effects in the mixed models, and proposed a conditional AIC (cAIC). Dimova et al. [13] derived a series of variants of the Akaike Information Criterion in small samples for linear mixed models.

Another information criterion, BIC, can be considered a Bayesian alternative to AIC. In linear mixed models, BIC is obtained from marginal AIC by replacing the constant 2 in the penalty by $\log N$, where $N$ is the sample size (mBIC) [14]. Jones [15] proposed a measure of the effective sample size to replace the sample size in the penalty term of BIC, leading to a new criterion, $BIC_J$.

We note that the BIC-type information criteria are derived using Bayesian approaches. In contrast, the AIC-type information criteria are justified from the frequentist perspective and are based upon the information discrepancy. However, little research has relied on other discrepancies to propose criteria, including Mallows' $C_p$ [1] [2], in linear mixed models. In fact, because of their dissimilar derivations, each selection criterion has its own advantages, and no single selection criterion can cover all the benefits for model selection. To further develop the selection criteria in the mixed modeling setting, we aim to justify the $C_p$-type criteria relying on the Gauss discrepancy.

Mallows' $C_p$ [1] [2] in linear regression models aims to estimate the Gauss discrepancy between the true model and a candidate model. It serves as an asymptotically unbiased estimator of the expected Gauss discrepancy. Fujikoshi and Satoh [16] identified $C_p$ in multivariate linear regression. Davies et al. [17] presented the estimation optimality of $C_p$ in linear regression models. Cavanaugh et al. [18] provided an alternate version of $C_p$. The Gauss discrepancy is an $L_2$ norm measuring the distance between the true model and a candidate model in linear models. To select the most appropriate model among competing fitted models, the candidate model leading to the smallest value of $C_p$ is chosen. However, since the covariance matrix of linear mixed models poses a challenge for the justification of selection criteria, the $C_p$ statistic in linear mixed models has not been identified.

This paper extends the justification of $C_p$ from linear models to linear mixed models. We first define a marginal Gauss discrepancy reflecting the correlation for measuring the distance between the true model and a candidate model. We utilize the assumption that, under certain conditions, the estimator of the correlation matrix for the candidate model is consistent for the true correlation matrix. The marginal $C_p$, abbreviated as $MC_p$, serves as an asymptotically unbiased estimator of the expected marginal Gauss discrepancy between the true model and a candidate model. An improvement of $MC_p$, abbreviated as $IMC_p$, is also proposed and proved. We then justify $IMC_p$ as an asymptotically unbiased estimator of the expected marginal Gauss discrepancy that is more precise than $MC_p$. We examine the performance of the proposed criteria in a simulation study where we utilize various correlation structures and different sample sizes.

The paper is organized as follows: Section 2 presents the notation and defines the marginal Gauss discrepancy in the setting of linear mixed models. In Section 3, we provide the derivations of the model selection criteria $MC_p$ and $IMC_p$. Section 4 presents a simulation study to demonstrate the effectiveness of the proposed criteria. Section 5 concludes.

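2. Marginal Gauss Discrepancy

In this section, we introduce the true model, also called the generating model, and the candidate model in the setting of linear mixed models, and then define the marginal Gauss discrepancy.

Suppose that the generating model for the data is given by

$$ y = X_o\beta_o + Zb_o + \varepsilon_o, \tag{2.1} $$

where $y$ denotes an $N \times 1$ response vector, $X_o$ is an $N \times p_o$ design matrix of full column rank, and $\beta_o$ is a $p_o \times 1$ unknown vector of fixed effects. $Z$ is an $N \times mr$ known matrix of full column rank and $b_o$ is an $mr \times 1$ unknown vector of random effects, where $m$ is the number of cases (the sample size) and $r$ is the dimension of the random effects for each case. Here, $b_o \sim N(0, \sigma_o^2 G_o)$, $\varepsilon_o \sim N(0, \sigma_o^2 I_N)$, $b_o$ and $\varepsilon_o$ are mutually independent, $G_o$ is a positive definite matrix, and $\sigma_o^2$ is a scalar.

We fit the data with a candidate model of the form

$$ y = X\beta + Zb + \varepsilon, \tag{2.2} $$

where $X$ is an $N \times p$ design matrix of full column rank, $\beta$ is a $p \times 1$ unknown vector, $b \sim N(0, \sigma^2 G)$, $\varepsilon \sim N(0, \sigma^2 I_N)$, and $b$ and $\varepsilon$ are mutually independent. The design matrix of the random effects $Z$ and the dimension of the random effects $b$ are the same as those in the generating model. The matrix $G$ is a positive definite matrix with $q$ unknown parameters in it.

Since the random part of the model (i.e., $Zb$) is not subject to selection, it is easier to use the marginal form [19] of linear mixed models. Let $\zeta_o = Zb_o + \varepsilon_o$; then the generating model can be written as

$$ y = X_o\beta_o + \zeta_o, \qquad \zeta_o \sim N(0, \sigma_o^2\Sigma_o), \tag{2.3} $$

where the scaled variance is $\Sigma_o = ZG_oZ' + I_N$. For the candidate model, let $\zeta = Zb + \varepsilon$; we have

$$ y = X\beta + \zeta, \qquad \zeta \sim N(0, \sigma^2\Sigma), \tag{2.4} $$

where the scaled variance is $\Sigma = ZGZ' + I_N$. Therefore, $\Sigma$ is a nonsingular positive definite matrix.

In models (2.3) and (2.4), the terms $\zeta_o$ and $\zeta$ are the combinations of the random effects and errors in the respective models. Since they are both assumed to have mean zero, the scaled variances $\Sigma_o$ and $\Sigma$ contain all the information about the random effects and errors, including the correlation structures.

We measure the distance between the true model and a candidate model by defining the marginal Gauss discrepancy based on the marginal forms of models (2.3) and (2.4). The true model is assumed to be included in the pool of candidate models.

Let $\theta_o$ and $\theta$ denote the vectors of parameters $(\beta_o, \sigma_o^2, \Sigma_o)$ and $(\beta, \sigma^2, \Sigma)$, respectively. The marginal Gauss discrepancy between the true model and a candidate model is defined as

$$ d_G(\theta, \theta_o) = E_o\{(y - X\beta)'\Sigma^{-1}(y - X\beta)\}, $$

where $E_o$ denotes the expectation with respect to the true model. Note that the marginal Gauss discrepancy weights the $L_2$ norm by the inverse scaled variance $\Sigma^{-1}$. Therefore, the correlation between observations is taken into account when we use the marginal Gauss discrepancy to measure the distance between the true model and a candidate model.

Now let $\hat\theta = (\hat\beta, \hat\sigma^2, \hat\Sigma)$ denote an estimate of $\theta$. For instance, $\hat\theta$ could be the maximum likelihood estimator (MLE) or the restricted maximum likelihood estimator (REML). In this paper, the MLE is utilized. The marginal Gauss discrepancy between the true model and the fitted candidate model is defined as

$$ d_G(\hat\theta, \theta_o) = d_G(\theta, \theta_o)\big|_{\theta = \hat\theta}, $$

which can therefore be expressed as

$$
\begin{aligned}
d_G(\hat\theta, \theta_o)
&= E_o\{(y - X\beta)'\Sigma^{-1}(y - X\beta)\}\big|_{\theta=\hat\theta} \\
&= E_o\{(y - X_o\beta_o + X_o\beta_o - X\beta)'\Sigma^{-1}(y - X_o\beta_o + X_o\beta_o - X\beta)\}\big|_{\theta=\hat\theta} \\
&= \left[E_o\{(y - X_o\beta_o)'\Sigma^{-1}(y - X_o\beta_o)\} + E_o\{(X_o\beta_o - X\beta)'\Sigma^{-1}(X_o\beta_o - X\beta)\}\right]\big|_{\theta=\hat\theta} \\
&= \sigma_o^2\,\mathrm{tr}(\Sigma_o\hat\Sigma^{-1}) + (X_o\beta_o - X\hat\beta)'\hat\Sigma^{-1}(X_o\beta_o - X\hat\beta).
\end{aligned}
\tag{2.5}
$$

We define a transformed marginal Gauss discrepancy between the true generating model and the fitted candidate model as a linear function of the marginal Gauss discrepancy (2.5):

$$ d(\hat\theta, \theta_o) = \frac{d_G(\hat\theta, \theta_o)}{\sigma_o^2} - N. \tag{2.6} $$

Taking the expectation of the transformed marginal Gauss discrepancy (2.6), we obtain the expected transformed marginal Gauss discrepancy as

$$ \Delta_{C_p}(\theta_o) = E_o\{d(\hat\theta, \theta_o)\} = E_o\{\mathrm{tr}(\Sigma_o\hat\Sigma^{-1})\} + \frac{E_o\{(X_o\beta_o - X\hat\beta)'\hat\Sigma^{-1}(X_o\beta_o - X\hat\beta)\}}{\sigma_o^2} - N. \tag{2.7} $$

To serve as a model selection criterion based on the expected transformed marginal Gauss discrepancy in Equation (2.7), an unbiased estimator or an asymptotically unbiased estimator will be proposed. To simplify the procedure, we first simplify the discrepancy in Equation (2.7).

From expression (2.7), the expectation in the numerator can be written as

$$ E_o\{(X_o\beta_o - \hat{H}y)'\hat\Sigma^{-1}(X_o\beta_o - \hat{H}y)\}, \tag{2.8} $$

where $\hat{H} = X(X'\hat\Sigma^{-1}X)^{-1}X'\hat\Sigma^{-1}$ is a projection matrix such that $X\hat\beta = \hat{H}y$. To explore a further expression of (2.8), we need to know the properties of $\hat{H}$.

Theorem 1. For every $\hat\Sigma$, the matrix $\hat{H} = X(X'\hat\Sigma^{-1}X)^{-1}X'\hat\Sigma^{-1}$ satisfies the following properties:
1) $\hat{H}$ is idempotent;
2) $\mathrm{tr}(\hat{H}) = p$ and $\mathrm{tr}(I_N - \hat{H}) = N - p$.

The proof is given in the Appendix.

Corollary 1. Following Theorem 1, we have:
1) $\hat{H}'\hat\Sigma^{-1}\hat{H} = \hat\Sigma^{-1}\hat{H}$;
2) $\hat{H}'\hat\Sigma^{-1}(\hat{H} - I) = 0$;
3) $(\hat{H} - I)'\hat\Sigma^{-1}(\hat{H} - I) = (I - \hat{H})'\hat\Sigma^{-1}(I - \hat{H}) = \hat\Sigma^{-1}(I - \hat{H})$.

The proof of Corollary 1 can be easily completed following Theorem 1.

By Corollary 1, expression (2.8) can be written as

$$
\begin{aligned}
E_o\{(\hat{H}y - X_o\beta_o)'\hat\Sigma^{-1}(\hat{H}y - X_o\beta_o)\}
&= E_o\{(\hat{H}y - \hat{H}X_o\beta_o + \hat{H}X_o\beta_o - X_o\beta_o)'\hat\Sigma^{-1}(\hat{H}y - \hat{H}X_o\beta_o + \hat{H}X_o\beta_o - X_o\beta_o)\} \\
&= E_o\{(y - X_o\beta_o)'\hat\Sigma^{-1}\hat{H}(y - X_o\beta_o)\} + E_o\{(X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})X_o\beta_o\} \\
&= E_o\{\zeta_o'\hat\Sigma^{-1}\hat{H}\zeta_o\} + E_o\{(X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})X_o\beta_o\}.
\end{aligned}
\tag{2.9}
$$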
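The properties in Theorem 1 and Corollary 1 hold for any full-rank design matrix and any positive definite scaled variance, so they can be checked numerically. The following is a minimal sketch, not part of the original derivation: the dimensions, the random design matrix, and the random positive definite matrix standing in for $\hat\Sigma$ are all hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical sizes and inputs, chosen only to illustrate the identities.
rng = np.random.default_rng(0)
N, p = 12, 3

X = rng.normal(size=(N, p))              # full-column-rank design matrix
A = rng.normal(size=(N, N))
Sigma_hat = A @ A.T + N * np.eye(N)      # a positive definite stand-in for Sigma-hat
Sigma_inv = np.linalg.inv(Sigma_hat)
I_N = np.eye(N)

# H = X (X' Sigma^{-1} X)^{-1} X' Sigma^{-1}
H = X @ np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv)

# Theorem 1: idempotency and the two trace results.
idempotent = np.allclose(H @ H, H)
trace_H = np.trace(H)
trace_complement = np.trace(I_N - H)

# Corollary 1, parts 1) to 3).
cor1_part1 = np.allclose(H.T @ Sigma_inv @ H, Sigma_inv @ H)
cor1_part2 = np.allclose(H.T @ Sigma_inv @ (H - I_N), 0.0)
cor1_part3 = np.allclose((I_N - H).T @ Sigma_inv @ (I_N - H),
                         Sigma_inv @ (I_N - H))

print(idempotent, round(trace_H, 8), round(trace_complement, 8))
print(cor1_part1, cor1_part2, cor1_part3)
```

Note that $\hat{H}$ is generally not symmetric; it is an oblique projection onto the column space of $X$, which is why the identities carry the weight $\hat\Sigma^{-1}$.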

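Note that the scaled variance $\Sigma$ is a function of the $q \times 1$ unknown parameter vector of variance components $\gamma$, i.e., $\Sigma = \Sigma(\gamma)$. Azari, Li and Tsai [8] noted that under the assumption that the set of candidate models includes the true model, it is reasonable to assume that the MLE $\hat\gamma$ is a consistent estimator of $\gamma_o$. Therefore, we can approximate $\hat\Sigma$ by $\Sigma_o$, i.e., $\hat\Sigma \approx \Sigma_o$ up to an asymptotically negligible term. In what follows, we will make use of this approximation.

First, since $E_o\{\zeta_o\} = 0$ and $\mathrm{var}\{\zeta_o\} = \sigma_o^2\Sigma_o$, using the approximation $\hat\Sigma \approx \Sigma_o$ and Theorem 1, the first term of (2.9) is

$$ E_o\{\zeta_o'\hat\Sigma^{-1}\hat{H}\zeta_o\} \approx \sigma_o^2\,\mathrm{tr}(\Sigma_o\Sigma_o^{-1}H) = \sigma_o^2\,\mathrm{tr}(H) = \sigma_o^2\,p, \tag{2.10} $$

where $H = X(X'\Sigma_o^{-1}X)^{-1}X'\Sigma_o^{-1}$ is $\hat{H}$ evaluated at $\Sigma_o$.

Second, using the approximation $\hat\Sigma \approx \Sigma_o$ again, the trace term of Equation (2.7) can be simplified as

$$ E_o\{\mathrm{tr}(\Sigma_o\hat\Sigma^{-1})\} \approx N. \tag{2.11} $$

Using expressions (2.9), (2.10), and (2.11), $\Delta_{C_p}(\theta_o)$ in (2.7) can therefore be approximated as

$$ \Delta_{C_p}(\theta_o) \approx p + \frac{E_o\{(X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})X_o\beta_o\}}{\sigma_o^2}. \tag{2.12} $$

Following Mallows' interpretation, $\Delta_{C_p}(\theta_o)$ in (2.12) can be expressed as $V_p + B_p/\sigma_o^2$, where $V_p$ and $B_p$ are respectively the variance and bias contributions given by

$$ V_p = p $$

and

$$ B_p = E_o\{(X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})X_o\beta_o\}. $$

We comment that increasing the number of fixed-effects parameters $p$ will decrease the bias $B_p$ of the fitted model, yet will increase the variance $V_p$ at the same time. The marginal Gauss discrepancy can therefore be considered a bias-variance trade-off. Since a smaller value of the discrepancy indicates a smaller distance between the true model and a candidate model, the size of the Gauss discrepancy reflects how close a fitted model is to the true model.

3. Derivations of Marginal $C_p$ and Improved Marginal $C_p$

In this section, model selection criteria based on $\Delta_{C_p}(\theta_o)$ are developed by finding a statistic whose expectation equals, or asymptotically equals, the expected transformed marginal Gauss discrepancy.

3.1. Marginal $C_p$

We start with the expectation of the sum of squared errors $SS_{Res}$ from a candidate model. In linear mixed models, the sum of squared errors $SS_{Res}$ can be written as

$$ SS_{Res} = (y - X\hat\beta)'\hat\Sigma^{-1}(y - X\hat\beta). $$

By Theorem 1 and Corollary 1, the expectation of the scaled sum of squared errors can be expressed as

$$ E_o\left\{\frac{SS_{Res}}{\sigma_o^2}\right\} = E_o\left\{\frac{(y - X\hat\beta)'\hat\Sigma^{-1}(y - X\hat\beta)}{\sigma_o^2}\right\} = E_o\left\{\frac{(y - \hat{H}y)'\hat\Sigma^{-1}(y - \hat{H}y)}{\sigma_o^2}\right\} = E_o\left\{\frac{y'\hat\Sigma^{-1}(I - \hat{H})y}{\sigma_o^2}\right\}, \tag{3.1} $$

and then we have

$$
\begin{aligned}
E_o\left\{\frac{SS_{Res}}{\sigma_o^2}\right\}
&= \frac{E_o\{(y - X_o\beta_o + X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})(y - X_o\beta_o + X_o\beta_o)\}}{\sigma_o^2} \\
&= \frac{E_o\{(y - X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})(y - X_o\beta_o)\}}{\sigma_o^2} + \frac{E_o\{(X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})X_o\beta_o\}}{\sigma_o^2} \\
&= \frac{E_o\{\zeta_o'\hat\Sigma^{-1}(I - \hat{H})\zeta_o\}}{\sigma_o^2} + \frac{E_o\{(X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})X_o\beta_o\}}{\sigma_o^2}.
\end{aligned}
\tag{3.2}
$$

Similar to the derivation of the first term of (2.9), the numerator of the first term of Equation (3.2) is expressed as

$$ E_o\{\zeta_o'\hat\Sigma^{-1}(I - \hat{H})\zeta_o\} \approx \sigma_o^2\,\mathrm{tr}(\Sigma_o\Sigma_o^{-1}(I - H)) = \sigma_o^2\,\mathrm{tr}(I - H) = \sigma_o^2(N - p). $$

Then, by Equations (3.1) and (3.2), it is straightforward to construct a function

$$ T = \frac{SS_{Res}}{\sigma_o^2} + 2p - N, $$

which is a linear function of $SS_{Res}/\sigma_o^2$. It can be shown that the function $T$ has the expectation

$$
\begin{aligned}
E_o\{T\} &= E_o\left\{\frac{SS_{Res}}{\sigma_o^2} + 2p - N\right\} = \frac{E_o\{SS_{Res}\}}{\sigma_o^2} + 2p - N \\
&\approx N - p + \frac{E_o\{(X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})X_o\beta_o\}}{\sigma_o^2} + 2p - N \\
&= p + \frac{E_o\{(X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})X_o\beta_o\}}{\sigma_o^2} = V_p + \frac{B_p}{\sigma_o^2} \approx \Delta_{C_p}(\theta_o).
\end{aligned}
$$

Note that the function $T$ is not a statistic since the parameter $\sigma_o^2$ is unknown. Here, we would like to use an estimator to replace $\sigma_o^2$ in the function $T$.

Let $X_*$ denote the design matrix for the largest model in the candidate pool, with $\mathrm{rank}(X_*) = p_*$. We assume that $C(X) \subseteq C(X_*)$, where $C(\cdot)$ denotes the column space. Let $SS_{Res*}$ represent the sum of squared errors for the corresponding fitted model, written as

$$ SS_{Res*} = (y - X_*\hat\beta_*)'\hat\Sigma_*^{-1}(y - X_*\hat\beta_*), $$

where $\hat\beta_*$ and $\hat\Sigma_*$ are the MLEs of the parameters $\beta_*$ and $\Sigma_*$ in the largest candidate model, respectively. The estimator $\hat\Sigma_*$ cannot be expressed in closed form and is calculated by an iterative computational algorithm.

For the estimator of $\sigma_o^2$, we use the mean squared error of the largest candidate model,

$$ \hat\sigma_*^2 = \frac{SS_{Res*}}{N - p_*}, \tag{3.3} $$

which is an asymptotically unbiased estimator of $\sigma_o^2$, although it is biased in finite samples. To justify this estimator, using the approximation $\hat\Sigma_* \approx \Sigma_o$, we can represent $\hat\beta_*$ in terms of $\Sigma_o$; the expected value of $SS_{Res*}$ can then be easily calculated as $\sigma_o^2(N - p_*)$, i.e., asymptotically we have $E_o\{\hat\sigma_*^2\} = \sigma_o^2$.

Serving as an asymptotically unbiased estimator of $\sigma_o^2$, the $\hat\sigma_*^2$ in Equation (3.3) for the largest candidate model is preferred for estimating $\sigma_o^2$. $MC_p$ is then obtained as

$$ MC_p = \frac{SS_{Res}}{\hat\sigma_*^2} + 2p - N = \frac{(N - p_*)\,SS_{Res}}{SS_{Res*}} + 2p - N. \tag{3.4} $$

Note that $MC_p$ is biased for $\Delta_{C_p}(\theta_o)$. However, under the assumption that the true model is included in the pool of candidate models, $MC_p$ serves as an asymptotically unbiased estimator of the discrepancy in expression (2.7). The proof is nontrivial, yet simulations (not presented here) show that as the sample size increases, the curves of the average values of $MC_p$ and of the discrepancy $\Delta_{C_p}(\theta_o)$, along with that of $IMC_p$, which will be introduced in the following subsection, merge together, indicating that $MC_p$ and $IMC_p$ are both asymptotically unbiased estimators of the discrepancy $\Delta_{C_p}(\theta_o)$.

3.2. Improved Marginal $C_p$

To improve the performance of the $MC_p$ statistic in linear mixed models, we propose an improved marginal $C_p$, called $IMC_p$, which is expected to be a more accurate or less biased estimator of the expected transformed marginal Gauss discrepancy than $MC_p$. $IMC_p$ is proposed as

$$ IMC_p = \frac{(N - p_* - 2)\,SS_{Res}}{SS_{Res*}} + 2p - N + 2, \tag{3.5} $$

where $SS_{Res}$ and $SS_{Res*}$ are the sums of squared errors from the candidate fitted model and the largest fitted model, respectively.

Note that $IMC_p$ provides an asymptotically unbiased estimator of $\Delta_{C_p}(\theta_o)$, i.e., $E_o\{IMC_p\} \approx \Delta_{C_p}(\theta_o)$, as will be shown in what follows.

To evaluate the expectation of $IMC_p$, we first need to calculate the ratio $SS_{Res}/SS_{Res*}$ of the sums of squared errors between the candidate model and the largest candidate model in the pool. By Corollary 1, we have

$$ \frac{SS_{Res}}{SS_{Res*}} = \frac{(y - X\hat\beta)'\hat\Sigma^{-1}(y - X\hat\beta)}{(y - X_*\hat\beta_*)'\hat\Sigma_*^{-1}(y - X_*\hat\beta_*)} = \frac{(y - \hat{H}y)'\hat\Sigma^{-1}(y - \hat{H}y)}{(y - \hat{H}_*y)'\hat\Sigma_*^{-1}(y - \hat{H}_*y)} = \frac{y'(I - \hat{H})'\hat\Sigma^{-1}(I - \hat{H})y}{y'(I - \hat{H}_*)'\hat\Sigma_*^{-1}(I - \hat{H}_*)y} = \frac{y'\hat\Sigma^{-1}(I - \hat{H})y}{y'\hat\Sigma_*^{-1}(I - \hat{H}_*)y}. $$

By using the approximation $\hat\Sigma \approx \Sigma_o$ for all $\hat\Sigma$, we approximate $\hat{H}$ and $\hat{H}_*$ by $H$ and $H_*$, respectively, where $H = X(X'\Sigma_o^{-1}X)^{-1}X'\Sigma_o^{-1}$ and $H_* = X_*(X_*'\Sigma_o^{-1}X_*)^{-1}X_*'\Sigma_o^{-1}$. Then, the ratio $SS_{Res}/SS_{Res*}$ can be written as

$$ \frac{SS_{Res}}{SS_{Res*}} \approx \frac{y'\Sigma_o^{-1}(I - H)y}{y'\Sigma_o^{-1}(I - H_*)y} = \frac{y'\Sigma_o^{-1}(I - H_* + H_* - H)y}{y'\Sigma_o^{-1}(I - H_*)y} = 1 + \frac{y'\Sigma_o^{-1}(H_* - H)y}{y'\Sigma_o^{-1}(I - H_*)y}. \tag{3.6} $$

To continue the proof, we will use the following theorem and corollary.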

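Theorem 2. If $C(X) \subseteq C(X_*)$, then for any $N \times N$ matrix $K$, we have $C(KX) \subseteq C(KX_*)$.

The proof of Theorem 2 is presented in the Appendix.

Corollary 2. Following Theorem 2, we can obtain the following results, where $H$ and $H_*$ are the projection matrices of the candidate model and the largest candidate model evaluated at $\Sigma_o$:
1) $\Sigma_o^{-1}H_*H = \Sigma_o^{-1}HH_* = \Sigma_o^{-1}H$;
2) $\Sigma_o^{-1}H_*X = \Sigma_o^{-1}X$.

The proof of Corollary 2 is included in the Appendix.

By Theorem 2 and Corollary 2, we have

$$ \Sigma_o^{-1}(H_* - H)\,\Sigma_o\,\Sigma_o^{-1}(I - H_*) = 0, $$

so that the quadratic forms $y'\Sigma_o^{-1}(H_* - H)y$ and $y'\Sigma_o^{-1}(I - H_*)y$ are independent. It follows that the expectation of $SS_{Res}/SS_{Res*}$ in (3.6) can be written as

$$ E_o\left\{\frac{SS_{Res}}{SS_{Res*}}\right\} \approx 1 + E_o\left\{\frac{y'\Sigma_o^{-1}(H_* - H)y}{y'\Sigma_o^{-1}(I - H_*)y}\right\} = 1 + E_o\{y'\Sigma_o^{-1}(H_* - H)y\}\,E_o\left\{\frac{1}{y'\Sigma_o^{-1}(I - H_*)y}\right\}. \tag{3.7} $$

For the term $E_o\{y'\Sigma_o^{-1}(H_* - H)y\}$ in (3.7), since $y \sim N(X_o\beta_o, \sigma_o^2\Sigma_o)$, we have

$$
\begin{aligned}
E_o\{y'\Sigma_o^{-1}(H_* - H)y\}
&= \mathrm{tr}\left(\Sigma_o^{-1}(H_* - H)\,\sigma_o^2\Sigma_o\right) + (X_o\beta_o)'\Sigma_o^{-1}(H_* - H)X_o\beta_o \\
&= \sigma_o^2\,\mathrm{tr}(H_* - H) + (X_o\beta_o)'\Sigma_o^{-1}(H_* - H)X_o\beta_o \\
&= \sigma_o^2(p_* - p) + (X_o\beta_o)'\Sigma_o^{-1}(H_* - H)X_o\beta_o \\
&= \sigma_o^2(p_* - p) + (X_o\beta_o)'\Sigma_o^{-1}(I - H)X_o\beta_o.
\end{aligned}
\tag{3.8}
$$

For the term $E_o\{1/(y'\Sigma_o^{-1}(I - H_*)y)\}$ in (3.7), we can prove that

$$ \frac{y'\Sigma_o^{-1}(I - H_*)y}{\sigma_o^2} \sim \chi^2\left(\mathrm{rank}(I - H_*)\right), \qquad \mathrm{rank}(I - H_*) = N - p_*. $$

To justify this distribution, write

$$ \frac{y'\Sigma_o^{-1}(I - H_*)y}{\sigma_o^2} = y'Ay, \qquad A = \frac{\Sigma_o^{-1}(I - H_*)}{\sigma_o^2}. $$

For the distribution of $y$, we know that $y \sim N(X_o\beta_o, \sigma_o^2\Sigma_o)$, and by Theorem 1, the matrix $I - H_*$ is idempotent. Therefore, $y'Ay \sim \chi^2(\nu, \lambda)$, where $\nu = \mathrm{rank}(I - H_*) = N - p_*$. Since the true model is included in the candidate pool, Corollary 2 gives $\Sigma_o^{-1}H_*X_o = \Sigma_o^{-1}X_o$, so the noncentrality parameter is

$$ \lambda = (X_o\beta_o)'A\,X_o\beta_o = \frac{(X_o\beta_o)'\Sigma_o^{-1}(I - H_*)X_o\beta_o}{\sigma_o^2} = 0. $$

Now, the inverse $\sigma_o^2/\left(y'\Sigma_o^{-1}(I - H_*)y\right)$ follows an inverse chi-square distribution with $\mathrm{rank}(I - H_*)$ degrees of freedom, with the expectation

$$ E_o\left\{\frac{1}{y'\Sigma_o^{-1}(I - H_*)y}\right\} = \frac{1}{\sigma_o^2(N - p_* - 2)}. \tag{3.9} $$

Using the results of (3.8) and (3.9), the expectation $E_o\{SS_{Res}/SS_{Res*}\}$ in (3.7) is

$$
\begin{aligned}
E_o\left\{\frac{SS_{Res}}{SS_{Res*}}\right\}
&\approx 1 + E_o\{y'\Sigma_o^{-1}(H_* - H)y\}\,E_o\left\{\frac{1}{y'\Sigma_o^{-1}(I - H_*)y}\right\} \\
&= 1 + \left[\sigma_o^2(p_* - p) + (X_o\beta_o)'\Sigma_o^{-1}(I - H)X_o\beta_o\right]\frac{1}{\sigma_o^2(N - p_* - 2)} \\
&= 1 + \frac{p_* - p}{N - p_* - 2} + \frac{(X_o\beta_o)'\Sigma_o^{-1}(I - H)X_o\beta_o}{\sigma_o^2(N - p_* - 2)} \\
&= \frac{N - p - 2}{N - p_* - 2} + \frac{(X_o\beta_o)'\Sigma_o^{-1}(I - H)X_o\beta_o}{\sigma_o^2(N - p_* - 2)}.
\end{aligned}
\tag{3.10}
$$

We recall that the criterion $IMC_p$ in (3.5) is defined as

$$ IMC_p = \frac{(N - p_* - 2)\,SS_{Res}}{SS_{Res*}} + 2p - N + 2. $$

By the result of (3.10) and the approximation $\hat\Sigma \approx \Sigma_o$ again, the expectation of $IMC_p$ is

$$
\begin{aligned}
E_o\{IMC_p\}
&= (N - p_* - 2)\,E_o\left\{\frac{SS_{Res}}{SS_{Res*}}\right\} + 2p - N + 2 \\
&\approx (N - p - 2) + \frac{(X_o\beta_o)'\Sigma_o^{-1}(I - H)X_o\beta_o}{\sigma_o^2} + 2p - N + 2 \\
&\approx p + \frac{E_o\{(X_o\beta_o)'\hat\Sigma^{-1}(I - \hat{H})X_o\beta_o\}}{\sigma_o^2} \approx \Delta_{C_p}(\theta_o).
\end{aligned}
$$

Hence, $IMC_p$ is an asymptotically unbiased estimator of the expected transformed marginal Gauss discrepancy $\Delta_{C_p}(\theta_o)$ in Equation (2.7). Compared to the derivation of $MC_p$, the advantage of $IMC_p$ is that it avoids the bias incurred by using $\hat\sigma_*^2$ to estimate $\sigma_o^2$ in deriving the criterion.

We comment that the proposed $MC_p$ and $IMC_p$ are justified based upon the assumption that the true model is contained in the candidate models. Hence, we can calculate the $MC_p$ and $IMC_p$ values for the correctly specified and overfitted candidate models. The proposed criteria can also be computed for underspecified models, except that the values will be quite large and will not behave well.

4. Simulation Study

In this simulation study, we investigate the ability of $MC_p$ in (3.4) and $IMC_p$ in (3.5) to determine the correct set of fixed effects for the simulated data in different models.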
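Both criteria depend on the data only through the two sums of squared errors, the sample size, and the two model dimensions, so their computation can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function and argument names are hypothetical, and the sums of squared errors are assumed to have been obtained already from fitted candidate and largest models (for example, by maximum likelihood).

```python
def mc_p(ss_res, ss_res_star, n_obs, p, p_star):
    """Marginal C_p as in (3.4): (N - p*) * SS_Res / SS_Res* + 2p - N.

    ss_res      -- sum of squared errors of the candidate model
    ss_res_star -- sum of squared errors of the largest candidate model
    n_obs       -- total number of observations N
    p, p_star   -- number of fixed effects in the candidate / largest model
    """
    return (n_obs - p_star) * ss_res / ss_res_star + 2 * p - n_obs


def imc_p(ss_res, ss_res_star, n_obs, p, p_star):
    """Improved marginal C_p as in (3.5):
    (N - p* - 2) * SS_Res / SS_Res* + 2p - N + 2."""
    return (n_obs - p_star - 2) * ss_res / ss_res_star + 2 * p - n_obs + 2


# Sanity check with made-up numbers: for the largest model itself
# (ss_res == ss_res_star, p == p_star), both criteria reduce to p*.
print(mc_p(10.0, 10.0, 25, 5, 5), imc_p(10.0, 10.0, 25, 5, 5))
```

In a selection run, each candidate model would be scored with one of these functions and the model with the smallest value retained, mirroring the usual use of $C_p$.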

4.1. Presentation of Simulations

Consider a setting in which data are generated by a model of the following form:

$$ y_{ij} = x_{ij}'\beta + b_i + \varepsilon_{ij}, \qquad i = 1, \ldots, m, \quad j = 1, \ldots, n, $$

where the random effects $b_1, \ldots, b_m$ are uncorrelated with mean 0 and variance $\tau^2$, and the errors $\varepsilon_{ij}$ are independent of each other with mean 0 and variance $\sigma^2$. It follows that the correlation between any two observations from the same case is $\tau^2/(\tau^2 + \sigma^2)$, whereas observations from different cases are uncorrelated. Let $\varphi$ denote the ratio between the variance of the random effects and the variance of the errors, i.e., $\varphi = \tau^2/\sigma^2$. The correlation between observations from the same case then equals $\varphi/(1 + \varphi)$, which is an increasing function of $\varphi$. Therefore, a higher $\varphi$ implies a higher correlation between the observations in the same case.

For convenience, the generating model can also be expressed as

$$ y = X\beta + Zb + \varepsilon, $$

where $\beta$ is the vector of unknown coefficients of the fixed effects. It is assumed that the random effects $b \sim N(0, \sigma^2 G)$ with $G = \varphi I_m$, and $r = 1$. We set $Z_i = j_{n_i}$, $i = 1, \ldots, m$, an $n_i$-vector of ones, and $n_1 = \cdots = n_m = n = N/m$. We also assume that the error term $\varepsilon \sim N(0, \sigma^2 I_N)$ and is independent of the random effects $b$.

Since the random part of the model (i.e., $Zb$) is not subject to selection, we express the model in its marginal form. Let $\zeta_{ij} = b_i + \varepsilon_{ij}$; we have

$$ y_{ij} = x_{ij}'\beta + \zeta_{ij}, $$

which can also be expressed in the general form

$$ y = X\beta + \zeta, \qquad \zeta \sim N(0, \sigma^2\Sigma), \tag{4.1} $$

where $\zeta = Zb + \varepsilon$ and $\Sigma = \frac{\tau^2}{\sigma^2}ZZ' + I_N$ is a scaled covariance matrix. Equivalently, the term $\zeta_i$ for case $i$ has the following exchangeable correlation structure:

$$ \mathrm{Var}(\zeta_i) = \sigma^2(\varphi + 1)\left[\frac{1}{1 + \varphi}\,I + \frac{\varphi}{1 + \varphi}\,J\right], $$

where $\varphi = \tau^2/\sigma^2$, $I$ is the identity matrix and $J$ is the matrix of 1s.

In this simulation study, we generate the design matrix $X_*$ with rank 5. The first column of $X_*$ is a column of ones, and the other four columns of $X_*$ are generated randomly from uniform distributions but are fixed throughout the simulations. Therefore, the number of fixed effects, including the intercept, in the largest model is $p_* = 5$. The columns of $X$ are selected from these candidate covariate vectors with the intercept always included, so that there are 16 candidate models in the candidate pool.

Here, we illustrate the behavior of the model selection criteria by choosing three generating models:
1) Model 1: $y_{ij} = \beta_1 + \beta_3 x_{ij3} + b_i + \varepsilon_{ij}$, with $\beta_1 = 1$, $\beta_3 = 3$;
2) Model 2: $y_{ij} = \beta_1 + \beta_3 x_{ij3} + \beta_4 x_{ij4} + b_i + \varepsilon_{ij}$, with $\beta_1 = 1$, $\beta_3 = 3$, $\beta_4 = 4$;
3) Model 3: $y_{ij} = \beta_1 + \beta_2 x_{ij2} + \beta_3 x_{ij3} + \beta_4 x_{ij4} + b_i + \varepsilon_{ij}$, with $\beta_1 = 1$, $\beta_2 = 2$, $\beta_3 = 3$, $\beta_4 = 4$.

These three models correspond to the three coefficient vectors $\beta_o = (1, 0, 0, 3, 0)'$, $(1, 0, 0, 3, 4)'$ and $(1, 0, 2, 3, 4)'$ in model (4.1), with the number of fixed effects $p_o$ equal to 2, 3, and 4, respectively. Again, the MLEs are used for estimation in the simulations.

Furthermore, we consider the case where the correlated errors have varying degrees of exchangeable structure. The variance component of the error term, $\sigma^2$, is taken to be 1, and values of $\tau^2$ in increasing order are considered: 3, 6, 9, corresponding to the values of $\varphi$: 3, 6, 9, respectively.

We take the number of clusters $m$ to be 5, 10 and 20, with the number of repetitions in a cluster fixed at $n = 5$. We employ a total of [...]00 realizations for each model.
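The data-generating mechanism just described can be sketched as follows. This is an illustrative sketch, not the authors' simulation code: the function names, the random seed, and the uniform range (0, 1) for the covariates are assumptions, and the intercept value 1 in the coefficient vector follows the reconstruction of model 1 above.

```python
import numpy as np


def simulate_dataset(X, beta, m, n, tau2, sigma2=1.0, rng=None):
    """Draw y = X beta + Z b + eps for m clusters of n observations each."""
    rng = np.random.default_rng() if rng is None else rng
    b = rng.normal(0.0, np.sqrt(tau2), size=m)             # random intercepts
    eps = rng.normal(0.0, np.sqrt(sigma2), size=m * n)     # independent errors
    Z = np.kron(np.eye(m), np.ones((n, 1)))                # block-diagonal columns of ones
    y = X @ beta + Z @ b + eps
    return y, Z


def scaled_covariance(Z, phi):
    """Sigma = phi * Z Z' + I_N, the scaled covariance in model (4.1)."""
    return phi * (Z @ Z.T) + np.eye(Z.shape[0])


# Hypothetical run: m = 5 clusters, n = 5 repetitions, phi = tau^2 / sigma^2 = 3.
rng = np.random.default_rng(2016)
m, n, tau2, sigma2 = 5, 5, 3.0, 1.0
N = m * n

# Largest design matrix: intercept plus four covariates drawn once from
# Uniform(0, 1) (the range is an assumption) and then held fixed.
X_star = np.column_stack([np.ones(N), rng.uniform(size=(N, 4))])
beta_o = np.array([1.0, 0.0, 0.0, 3.0, 0.0])               # model 1 coefficients

y, Z = simulate_dataset(X_star, beta_o, m, n, tau2, sigma2, rng)
Sigma = scaled_covariance(Z, tau2 / sigma2)

# Within-cluster correlation implied by Sigma is phi / (1 + phi).
within_corr = Sigma[0, 1] / Sigma[0, 0]
print(y.shape, within_corr)
```

Repeating the draw of `y` with the same `X_star` for each realization keeps the covariates fixed throughout, as the simulation design requires.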

4.2. Results

4.2.1. Model 1: $\beta_o = (1, 0, 0, 3, 0)'$

Table 1 presents the performance of the two versions of marginal $C_p$ ($MC_p$ and $IMC_p$), mAIC and mBIC under model 1, with the true fixed effects parameter $\beta_o = (1, 0, 0, 3, 0)'$, corresponding to $p_o = 2$. The correct model selection rate for each criterion is listed. We observe that for each $\varphi$, $IMC_p$ outperforms $MC_p$, and both outperform mAIC and mBIC in selecting the correct model for small samples. As the ratio $\varphi$ increases, the proposed criteria perform better in selecting the correct model.

4.2.2. Model 2: $\beta_o = (1, 0, 0, 3, 4)'$

We evaluate the proposed criteria for model 2 in the same manner as for model 1. Table 2 presents the performance of $MC_p$ and $IMC_p$, mAIC and mBIC under model 2, where the true fixed effects parameter is $\beta_o = (1, 0, 0, 3, 4)'$ and $p_o = 3$. The only change in model 2 from model 1 is that we add one more fixed effect variable $X_5$ and set the coefficient of that variable to $\beta_5 = 4$.

In Table 2, the simulation results of model 2 are similar to those of model 1. As the ratio $\varphi$ increases, the proposed criteria $MC_p$ and $IMC_p$ perform better, indicating that they can effectively carry out model selection in mixed models. We can also observe that $IMC_p$ improves on the performance of $MC_p$ for model selection in small samples. As $m$ increases, the performance of $IMC_p$ and $MC_p$ becomes closer. Compared to the correct selection rates in model 1, all model selection criteria behave better in model 2.

4.2.3. Model 3: $\beta_o = (1, 0, 2, 3, 4)'$

As in the first two models, we evaluate the performance of the model selection criteria by their rates of correctly selecting the true model. The results are presented in Table 3. Model 3 is identical to model 2 with the exception that we add one more significant fixed effect variable $X_2$ with the coefficient $\beta_2 = 2$.

The simulation results of model 3 are similar to those of models 1 and 2. Considering the rates of choosing the correct model, we find a dramatic improvement of all criteria on model 3 over models 1 and 2, implying that the proposed $MC_p$ and $IMC_p$ implement model selection effectively when the fixed effects are significant. In moderately large ($m = 20$) sample sizes, $MC_p$ and $IMC_p$ perform comparably to mAIC and mBIC in selecting the correct model.

Table 1. Correct selection rate in model 1, by sample size ($m = 5$, $10$, $20$), criterion ($MC_p$, $IMC_p$, mAIC, mBIC), and correlation parameter ($\varphi = 3$, $6$, $9$). [Table entries not preserved in this transcription.]

Table 2. Correct selection rate in model 2, by sample size ($m = 5$, $10$, $20$), criterion ($MC_p$, $IMC_p$, mAIC, mBIC), and correlation parameter ($\varphi = 3$, $6$, $9$). [Table entries not preserved in this transcription.]

Table 3. Correct selection rate in model 3, by sample size ($m = 5$, $10$, $20$), criterion ($MC_p$, $IMC_p$, mAIC, mBIC), and correlation parameter ($\varphi = 3$, $6$, $9$). [Table entries not preserved in this transcription.]

5. Concluding Remarks

The simulation results illustrate that the proposed criteria $MC_p$ and $IMC_p$ outperform mAIC and mBIC when the observations are highly correlated in small samples. The results also show that as the ratio $\varphi$ between the variance of the random effects and that of the errors increases, $MC_p$ and $IMC_p$ perform better. Since a larger $\varphi$ implies a higher correlation between the observations, we conclude that as the correlation between observations increases, the proposed criteria $MC_p$ and $IMC_p$ perform better. Since a model with a small $\varphi$ (close to 0) is similar to a linear regression model with independent errors, our proposed criteria offer no advantage in such a case.

The simulation results show that the proposed criteria $MC_p$ and $IMC_p$ significantly outperform mAIC and mBIC when the sample size is small. As the sample size increases, the performance of the proposed criteria becomes comparable to that of mAIC and mBIC. Therefore, $MC_p$ and $IMC_p$ are highly recommended in small samples in the setting of linear mixed models.

Our research (not shown in this paper) also shows that both proposed criteria behave best when maximum likelihood estimation (MLE) is employed, compared to when restricted maximum likelihood estimation or least squares estimation is used. The research on $MC_p$ and $IMC_p$ under REML estimation needs to be further developed in the future.

In the simulation study, by comparing models 1, 2 and 3, we see that when the true model includes more significant fixed effect covariates, the proposed criteria perform better in selecting the correct model. This fact indicates that models with more significant variables (larger $\beta$s) are more identifiable by the proposed criteria than models with variables which are not very significant.

Comparing the performance of $MC_p$ and $IMC_p$, we find that when the sample size is small, $IMC_p$ obtains a higher correct selection rate than $MC_p$, which demonstrates that $IMC_p$ improves on the performance of $MC_p$ in selecting the most appropriate model. However, when the sample size becomes larger, the performance of $MC_p$ and $IMC_p$ is nearly identical.

A model selection criterion is consistent if, as the sample size increases, it selects the true model with probability approaching 1. Note that $MC_p$, $IMC_p$, and mAIC are not consistent, whereas mBIC is consistent, as expected, since its penalty term $\log N$ prevents overfitting in large samples. As the simulation study demonstrates, the proposed criteria $MC_p$ and $IMC_p$ show their advantages in small samples, although they are originally justified with large-sample approximations, which is similar to quite a few other model selection criteria. For details on the consistency of model selection criteria in linear mixed models, see Jiang and Rao [20].

References

[1] Mallows, C.L. (1973) Some Comments on $C_p$. Technometrics, 15.
[2] Mallows, C.L. (1995) More Comments on $C_p$. Technometrics, 37.
[3] Akaike, H. (1973) Information Theory and an Extension of the Maximum Likelihood Principle. In: Petrov, B.N. and Csaki, F., Eds., International Symposium on Information Theory, 267-281.
[4] Akaike, H. (1974) A New Look at the Model Selection Identification. IEEE Transactions on Automatic Control, 19.
[5] Schwarz, G. (1978) Estimating the Dimension of a Model. Annals of Statistics, 6.
[6] Sugiura, N. (1978) Further Analysis of the Data by Akaike's Information Criterion and the Finite Corrections. Communications in Statistics, Theory and Methods A, 7.
[7] Shang, J. and Cavanaugh, J.E. (2008) Bootstrap Variants of the Akaike Information Criterion for Mixed Model Selection. Computational Statistics & Data Analysis, 52.
[8] Azari, R., Li, L. and Tsai, C. (2006) Longitudinal Data Model Selection. Applied Times Series Analysis, Academic Press, New York.
[9] Vaida, F. and Blanchard, S. (2005) Conditional Akaike Information for Mixed-Effects Models. Biometrika, 92.
[10] Henderson, C.R. (1950) Estimation of Genetic Parameters. Annals of Mathematical Statistics, 21.
[11] Harville, D.A. (1990) BLUP (Best Linear Unbiased Prediction) and beyond. In: Gianola, D. and Hammond, K., Eds., Advances in Statistical Methods for Genetic Improvement of Livestock, Springer, New York.
[12] Robinson, G.K. (1991) That BLUP Is a Good Thing: The Estimation of Random Effects. Statistical Science, 6.
[13] Dimova, R.B., Mariantihi, M. and Talal, A.H. (2011) Information Methods for Model Selection in Linear Mixed Effects Models with Application to HCV Data. Computational Statistics & Data Analysis, 55.
[14] Müller, S., Scealy, J.L. and Welsh, A.H. (2013) Model Selection in Linear Mixed Models. Statistical Science, 28.

[15] Jones, R.H. (2011) Bayesian Information Criterion for Longitudinal and Clustered Data. Statistics in Medicine, 30.
[16] Fujikoshi, Y. and Satoh, K. (1997) Modified AIC and $C_p$ in Multivariate Linear Regression. Biometrika, 84.
[17] Davies, S.L., Neath, A.A. and Cavanaugh, J.E. (2006) Estimation Optimality of Corrected AIC and Modified $C_p$ in Linear Regression. International Statistical Review, 74.
[18] Cavanaugh, J., Neath, A.A. and Davies, S.L. (2010) An Alternate Version of the Conceptual Predictive Statistic Based on a Symmetrized Discrepancy Measure. Journal of Statistical Planning and Inference, 140.
[19] Jiang, J. (2007) Linear and Generalized Linear Mixed Models and Their Applications. Springer, New York.
[20] Jiang, J. and Rao, J.S. (2003) Consistent Procedures for Mixed Linear Model Selection. Sankhya, 65, 23-42.

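Appendix

Proof of Theorem 1

To prove that $\hat{H}$ is idempotent, we calculate

$$ \hat{H}\hat{H} = X(X'\hat\Sigma^{-1}X)^{-1}X'\hat\Sigma^{-1}X(X'\hat\Sigma^{-1}X)^{-1}X'\hat\Sigma^{-1} = X(X'\hat\Sigma^{-1}X)^{-1}X'\hat\Sigma^{-1} = \hat{H}. $$

Thus, $\hat{H}$ is idempotent. By the properties of the trace, we have

$$ \mathrm{tr}(\hat{H}) = \mathrm{tr}\left(X(X'\hat\Sigma^{-1}X)^{-1}X'\hat\Sigma^{-1}\right) = \mathrm{tr}\left((X'\hat\Sigma^{-1}X)^{-1}X'\hat\Sigma^{-1}X\right) = \mathrm{tr}(I_p) = p. $$

Therefore, we have

$$ \mathrm{tr}(I_N - \hat{H}) = \mathrm{tr}(I_N) - \mathrm{tr}(\hat{H}) = N - p. $$

Thus, Theorem 1 is proved.

Proof of Theorem 2

Let $y \in C(KX)$. We need to show that $y \in C(KX_*)$. Since $y \in C(KX)$, there exists a $p \times 1$ vector $\beta$ such that $y = KX\beta$. By $C(X) \subseteq C(X_*)$, there also exists a $p_* \times 1$ vector $\beta_*$ such that $X\beta = X_*\beta_*$, which gives $y = KX\beta = KX_*\beta_*$. So we have $y \in C(KX_*)$.

Proof of Corollary 2

Since $\Sigma_o$ is positive definite, there exists an $N \times N$ matrix $V$ with $\mathrm{rank}(V) = N$ such that $\Sigma_o = VV'$. It follows that $\Sigma_o^{-1} = (VV')^{-1} = (V')^{-1}V^{-1} = (V^{-1})'V^{-1}$. Letting $K = (V^{-1})'$, we have $\Sigma_o^{-1} = KK'$. Then, we arrive at

$$
\begin{aligned}
\Sigma_o^{-1}H_*H
&= \Sigma_o^{-1}X_*(X_*'\Sigma_o^{-1}X_*)^{-1}X_*'\Sigma_o^{-1}X(X'\Sigma_o^{-1}X)^{-1}X'\Sigma_o^{-1} \\
&= KK'X_*(X_*'KK'X_*)^{-1}X_*'KK'X(X'KK'X)^{-1}X'KK' \\
&= K\left[K'X_*\left((K'X_*)'K'X_*\right)^{-1}(K'X_*)'\right]\left[K'X\left((K'X)'K'X\right)^{-1}(K'X)'\right]K'.
\end{aligned}
$$

Now, let

$$ \tilde{H}_* = K'X_*\left((K'X_*)'K'X_*\right)^{-1}(K'X_*)' $$

and

$$ \tilde{H} = K'X\left((K'X)'K'X\right)^{-1}(K'X)'. $$

Since $C(X) \subseteq C(X_*)$, by Theorem 2, we have $C(K'X) \subseteq C(K'X_*)$, which leads to $\tilde{H}_*\tilde{H} = \tilde{H}$, so that

$$ \Sigma_o^{-1}H_*H = K\tilde{H}_*\tilde{H}K' = K\tilde{H}K' = KK'X(X'KK'X)^{-1}X'KK' = \Sigma_o^{-1}X(X'\Sigma_o^{-1}X)^{-1}X'\Sigma_o^{-1} = \Sigma_o^{-1}H. $$

The first part of Corollary 2 is therefore proved.

Following the proof of the first part of Corollary 2, since $C(K'X) \subseteq C(K'X_*)$, we have $\tilde{H}_*K'X = K'X$. Then, we can conclude that

$$ \Sigma_o^{-1}H_*X = K\left[K'X_*\left((K'X_*)'K'X_*\right)^{-1}(K'X_*)'\right]K'X = K\tilde{H}_*K'X = KK'X = \Sigma_o^{-1}X. $$

Therefore, the proof of the second part of Corollary 2 is completed.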
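As a complement to the proof, the two identities of Corollary 2 and the zero-product condition used for the independence of the quadratic forms can be checked numerically for nested designs. The sketch below uses hypothetical dimensions, a random nested pair of design matrices, and a random positive definite matrix in place of $\Sigma_o$; the helper name `projection` is an assumption introduced only for this illustration.

```python
import numpy as np


def projection(X, Sigma_inv):
    """Return X (X' Sigma^{-1} X)^{-1} X' Sigma^{-1}."""
    return X @ np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv)


# Hypothetical inputs: the candidate design uses the first two columns of the
# largest design, so C(X) is contained in C(X_star).
rng = np.random.default_rng(1)
N = 10
X_star = rng.normal(size=(N, 4))
X = X_star[:, :2]

A = rng.normal(size=(N, N))
Sigma_o = A @ A.T + N * np.eye(N)        # positive definite stand-in for Sigma_o
Sigma_inv = np.linalg.inv(Sigma_o)
I_N = np.eye(N)

H = projection(X, Sigma_inv)
H_star = projection(X_star, Sigma_inv)

# Corollary 2, part 1) and part 2).
part1 = (np.allclose(Sigma_inv @ H_star @ H, Sigma_inv @ H)
         and np.allclose(Sigma_inv @ H @ H_star, Sigma_inv @ H))
part2 = np.allclose(Sigma_inv @ H_star @ X, Sigma_inv @ X)

# Zero product behind the independence of the two quadratic forms.
zero_product = np.allclose(
    Sigma_inv @ (H_star - H) @ Sigma_o @ Sigma_inv @ (I_N - H_star), 0.0)

print(part1, part2, zero_product)
```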


More information

Simple Linear Regression (single variable)

Simple Linear Regression (single variable) Simple Linear Regressin (single variable) Intrductin t Machine Learning Marek Petrik January 31, 2017 Sme f the figures in this presentatin are taken frm An Intrductin t Statistical Learning, with applicatins

More information

Distributions, spatial statistics and a Bayesian perspective

Distributions, spatial statistics and a Bayesian perspective Distributins, spatial statistics and a Bayesian perspective Dug Nychka Natinal Center fr Atmspheric Research Distributins and densities Cnditinal distributins and Bayes Thm Bivariate nrmal Spatial statistics

More information

Resampling Methods. Chapter 5. Chapter 5 1 / 52

Resampling Methods. Chapter 5. Chapter 5 1 / 52 Resampling Methds Chapter 5 Chapter 5 1 / 52 1 51 Validatin set apprach 2 52 Crss validatin 3 53 Btstrap Chapter 5 2 / 52 Abut Resampling An imprtant statistical tl Pretending the data as ppulatin and

More information

initially lcated away frm the data set never win the cmpetitin, resulting in a nnptimal nal cdebk, [2] [3] [4] and [5]. Khnen's Self Organizing Featur

initially lcated away frm the data set never win the cmpetitin, resulting in a nnptimal nal cdebk, [2] [3] [4] and [5]. Khnen's Self Organizing Featur Cdewrd Distributin fr Frequency Sensitive Cmpetitive Learning with One Dimensinal Input Data Aristides S. Galanpuls and Stanley C. Ahalt Department f Electrical Engineering The Ohi State University Abstract

More information

Admissibility Conditions and Asymptotic Behavior of Strongly Regular Graphs

Admissibility Conditions and Asymptotic Behavior of Strongly Regular Graphs Admissibility Cnditins and Asympttic Behavir f Strngly Regular Graphs VASCO MOÇO MANO Department f Mathematics University f Prt Oprt PORTUGAL vascmcman@gmailcm LUÍS ANTÓNIO DE ALMEIDA VIEIRA Department

More information

Enhancing Performance of MLP/RBF Neural Classifiers via an Multivariate Data Distribution Scheme

Enhancing Performance of MLP/RBF Neural Classifiers via an Multivariate Data Distribution Scheme Enhancing Perfrmance f / Neural Classifiers via an Multivariate Data Distributin Scheme Halis Altun, Gökhan Gelen Nigde University, Electrical and Electrnics Engineering Department Nigde, Turkey haltun@nigde.edu.tr

More information

CAUSAL INFERENCE. Technical Track Session I. Phillippe Leite. The World Bank

CAUSAL INFERENCE. Technical Track Session I. Phillippe Leite. The World Bank CAUSAL INFERENCE Technical Track Sessin I Phillippe Leite The Wrld Bank These slides were develped by Christel Vermeersch and mdified by Phillippe Leite fr the purpse f this wrkshp Plicy questins are causal

More information

Performance Bounds for Detect and Avoid Signal Sensing

Performance Bounds for Detect and Avoid Signal Sensing Perfrmance unds fr Detect and Avid Signal Sensing Sam Reisenfeld Real-ime Infrmatin etwrks, University f echnlgy, Sydney, radway, SW 007, Australia samr@uts.edu.au Abstract Detect and Avid (DAA) is a Cgnitive

More information

PSU GISPOPSCI June 2011 Ordinary Least Squares & Spatial Linear Regression in GeoDa

PSU GISPOPSCI June 2011 Ordinary Least Squares & Spatial Linear Regression in GeoDa There are tw parts t this lab. The first is intended t demnstrate hw t request and interpret the spatial diagnstics f a standard OLS regressin mdel using GeDa. The diagnstics prvide infrmatin abut the

More information

4th Indian Institute of Astrophysics - PennState Astrostatistics School July, 2013 Vainu Bappu Observatory, Kavalur. Correlation and Regression

4th Indian Institute of Astrophysics - PennState Astrostatistics School July, 2013 Vainu Bappu Observatory, Kavalur. Correlation and Regression 4th Indian Institute f Astrphysics - PennState Astrstatistics Schl July, 2013 Vainu Bappu Observatry, Kavalur Crrelatin and Regressin Rahul Ry Indian Statistical Institute, Delhi. Crrelatin Cnsider a tw

More information

Pattern Recognition 2014 Support Vector Machines

Pattern Recognition 2014 Support Vector Machines Pattern Recgnitin 2014 Supprt Vectr Machines Ad Feelders Universiteit Utrecht Ad Feelders ( Universiteit Utrecht ) Pattern Recgnitin 1 / 55 Overview 1 Separable Case 2 Kernel Functins 3 Allwing Errrs (Sft

More information

A New Evaluation Measure. J. Joiner and L. Werner. The problems of evaluation and the needed criteria of evaluation

A New Evaluation Measure. J. Joiner and L. Werner. The problems of evaluation and the needed criteria of evaluation III-l III. A New Evaluatin Measure J. Jiner and L. Werner Abstract The prblems f evaluatin and the needed criteria f evaluatin measures in the SMART system f infrmatin retrieval are reviewed and discussed.

More information

Chapter 3: Cluster Analysis

Chapter 3: Cluster Analysis Chapter 3: Cluster Analysis } 3.1 Basic Cncepts f Clustering 3.1.1 Cluster Analysis 3.1. Clustering Categries } 3. Partitining Methds 3..1 The principle 3.. K-Means Methd 3..3 K-Medids Methd 3..4 CLARA

More information

Resampling Methods. Cross-validation, Bootstrapping. Marek Petrik 2/21/2017

Resampling Methods. Cross-validation, Bootstrapping. Marek Petrik 2/21/2017 Resampling Methds Crss-validatin, Btstrapping Marek Petrik 2/21/2017 Sme f the figures in this presentatin are taken frm An Intrductin t Statistical Learning, with applicatins in R (Springer, 2013) with

More information

Lead/Lag Compensator Frequency Domain Properties and Design Methods

Lead/Lag Compensator Frequency Domain Properties and Design Methods Lectures 6 and 7 Lead/Lag Cmpensatr Frequency Dmain Prperties and Design Methds Definitin Cnsider the cmpensatr (ie cntrller Fr, it is called a lag cmpensatr s K Fr s, it is called a lead cmpensatr Ntatin

More information

3.4 Shrinkage Methods Prostate Cancer Data Example (Continued) Ridge Regression

3.4 Shrinkage Methods Prostate Cancer Data Example (Continued) Ridge Regression 3.3.4 Prstate Cancer Data Example (Cntinued) 3.4 Shrinkage Methds 61 Table 3.3 shws the cefficients frm a number f different selectin and shrinkage methds. They are best-subset selectin using an all-subsets

More information

ENSC Discrete Time Systems. Project Outline. Semester

ENSC Discrete Time Systems. Project Outline. Semester ENSC 49 - iscrete Time Systems Prject Outline Semester 006-1. Objectives The gal f the prject is t design a channel fading simulatr. Upn successful cmpletin f the prject, yu will reinfrce yur understanding

More information

Modelling of Clock Behaviour. Don Percival. Applied Physics Laboratory University of Washington Seattle, Washington, USA

Modelling of Clock Behaviour. Don Percival. Applied Physics Laboratory University of Washington Seattle, Washington, USA Mdelling f Clck Behaviur Dn Percival Applied Physics Labratry University f Washingtn Seattle, Washingtn, USA verheads and paper fr talk available at http://faculty.washingtn.edu/dbp/talks.html 1 Overview

More information

Revision: August 19, E Main Suite D Pullman, WA (509) Voice and Fax

Revision: August 19, E Main Suite D Pullman, WA (509) Voice and Fax .7.4: Direct frequency dmain circuit analysis Revisin: August 9, 00 5 E Main Suite D Pullman, WA 9963 (509) 334 6306 ice and Fax Overview n chapter.7., we determined the steadystate respnse f electrical

More information

Computational modeling techniques

Computational modeling techniques Cmputatinal mdeling techniques Lecture 4: Mdel checing fr ODE mdels In Petre Department f IT, Åb Aademi http://www.users.ab.fi/ipetre/cmpmd/ Cntent Stichimetric matrix Calculating the mass cnservatin relatins

More information

Particle Size Distributions from SANS Data Using the Maximum Entropy Method. By J. A. POTTON, G. J. DANIELL AND B. D. RAINFORD

Particle Size Distributions from SANS Data Using the Maximum Entropy Method. By J. A. POTTON, G. J. DANIELL AND B. D. RAINFORD 3 J. Appl. Cryst. (1988). 21,3-8 Particle Size Distributins frm SANS Data Using the Maximum Entrpy Methd By J. A. PTTN, G. J. DANIELL AND B. D. RAINFRD Physics Department, The University, Suthamptn S9

More information

Inference in the Multiple-Regression

Inference in the Multiple-Regression Sectin 5 Mdel Inference in the Multiple-Regressin Kinds f hypthesis tests in a multiple regressin There are several distinct kinds f hypthesis tests we can run in a multiple regressin. Suppse that amng

More information

Math Foundations 20 Work Plan

Math Foundations 20 Work Plan Math Fundatins 20 Wrk Plan Units / Tpics 20.8 Demnstrate understanding f systems f linear inequalities in tw variables. Time Frame December 1-3 weeks 6-10 Majr Learning Indicatrs Identify situatins relevant

More information

Margin Distribution and Learning Algorithms

Margin Distribution and Learning Algorithms ICML 03 Margin Distributin and Learning Algrithms Ashutsh Garg IBM Almaden Research Center, San Jse, CA 9513 USA Dan Rth Department f Cmputer Science, University f Illinis, Urbana, IL 61801 USA ASHUTOSH@US.IBM.COM

More information

Least Squares Optimal Filtering with Multirate Observations

Least Squares Optimal Filtering with Multirate Observations Prc. 36th Asilmar Cnf. n Signals, Systems, and Cmputers, Pacific Grve, CA, Nvember 2002 Least Squares Optimal Filtering with Multirate Observatins Charles W. herrien and Anthny H. Hawes Department f Electrical

More information

a(k) received through m channels of length N and coefficients v(k) is an additive independent white Gaussian noise with

a(k) received through m channels of length N and coefficients v(k) is an additive independent white Gaussian noise with urst Mde Nn-Causal Decisin-Feedback Equalizer based n Sft Decisins Elisabeth de Carvalh and Dirk T.M. Slck Institut EURECOM, 2229 rute des Crêtes,.P. 93, 694 Sphia ntiplis Cedex, FRNCE Tel: +33 493263

More information

MATCHING TECHNIQUES. Technical Track Session VI. Emanuela Galasso. The World Bank

MATCHING TECHNIQUES. Technical Track Session VI. Emanuela Galasso. The World Bank MATCHING TECHNIQUES Technical Track Sessin VI Emanuela Galass The Wrld Bank These slides were develped by Christel Vermeersch and mdified by Emanuela Galass fr the purpse f this wrkshp When can we use

More information

CS 477/677 Analysis of Algorithms Fall 2007 Dr. George Bebis Course Project Due Date: 11/29/2007

CS 477/677 Analysis of Algorithms Fall 2007 Dr. George Bebis Course Project Due Date: 11/29/2007 CS 477/677 Analysis f Algrithms Fall 2007 Dr. Gerge Bebis Curse Prject Due Date: 11/29/2007 Part1: Cmparisn f Srting Algrithms (70% f the prject grade) The bjective f the first part f the assignment is

More information

Localized Model Selection for Regression

Localized Model Selection for Regression Lcalized Mdel Selectin fr Regressin Yuhng Yang Schl f Statistics University f Minnesta Church Street S.E. Minneaplis, MN 5555 May 7, 007 Abstract Research n mdel/prcedure selectin has fcused n selecting

More information

Determining the Accuracy of Modal Parameter Estimation Methods

Determining the Accuracy of Modal Parameter Estimation Methods Determining the Accuracy f Mdal Parameter Estimatin Methds by Michael Lee Ph.D., P.E. & Mar Richardsn Ph.D. Structural Measurement Systems Milpitas, CA Abstract The mst cmmn type f mdal testing system

More information

Lyapunov Stability Stability of Equilibrium Points

Lyapunov Stability Stability of Equilibrium Points Lyapunv Stability Stability f Equilibrium Pints 1. Stability f Equilibrium Pints - Definitins In this sectin we cnsider n-th rder nnlinear time varying cntinuus time (C) systems f the frm x = f ( t, x),

More information

CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS

CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS 1 Influential bservatins are bservatins whse presence in the data can have a distrting effect n the parameter estimates and pssibly the entire analysis,

More information

COMP 551 Applied Machine Learning Lecture 5: Generative models for linear classification

COMP 551 Applied Machine Learning Lecture 5: Generative models for linear classification COMP 551 Applied Machine Learning Lecture 5: Generative mdels fr linear classificatin Instructr: Herke van Hf (herke.vanhf@mail.mcgill.ca) Slides mstly by: Jelle Pineau Class web page: www.cs.mcgill.ca/~hvanh2/cmp551

More information

Pure adaptive search for finite global optimization*

Pure adaptive search for finite global optimization* Mathematical Prgramming 69 (1995) 443-448 Pure adaptive search fr finite glbal ptimizatin* Z.B. Zabinskya.*, G.R. Wd b, M.A. Steel c, W.P. Baritmpa c a Industrial Engineering Prgram, FU-20. University

More information

Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeoff

Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeoff Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeff Reading: Chapter 2 STATS 202: Data mining and analysis September 27, 2017 1 / 20 Supervised vs. unsupervised learning In unsupervised

More information

Comparing Several Means: ANOVA. Group Means and Grand Mean

Comparing Several Means: ANOVA. Group Means and Grand Mean STAT 511 ANOVA and Regressin 1 Cmparing Several Means: ANOVA Slide 1 Blue Lake snap beans were grwn in 12 pen-tp chambers which are subject t 4 treatments 3 each with O 3 and SO 2 present/absent. The ttal

More information

Perfrmance f Sensitizing Rules n Shewhart Cntrl Charts with Autcrrelated Data Key Wrds: Autregressive, Mving Average, Runs Tests, Shewhart Cntrl Chart

Perfrmance f Sensitizing Rules n Shewhart Cntrl Charts with Autcrrelated Data Key Wrds: Autregressive, Mving Average, Runs Tests, Shewhart Cntrl Chart Perfrmance f Sensitizing Rules n Shewhart Cntrl Charts with Autcrrelated Data Sandy D. Balkin Dennis K. J. Lin y Pennsylvania State University, University Park, PA 16802 Sandy Balkin is a graduate student

More information

Sparse estimation for functional semiparametric additive models

Sparse estimation for functional semiparametric additive models Sparse estimatin fr functinal semiparametric additive mdels Peijun Sang, Richard A. Lckhart, Jigu Ca Department f Statistics and Actuarial Science, Simn Fraser University, Burnaby, BC, Canada V5A1S6 Abstract

More information

NUMBERS, MATHEMATICS AND EQUATIONS

NUMBERS, MATHEMATICS AND EQUATIONS AUSTRALIAN CURRICULUM PHYSICS GETTING STARTED WITH PHYSICS NUMBERS, MATHEMATICS AND EQUATIONS An integral part t the understanding f ur physical wrld is the use f mathematical mdels which can be used t

More information

UNIV1"'RSITY OF NORTH CAROLINA Department of Statistics Chapel Hill, N. C. CUMULATIVE SUM CONTROL CHARTS FOR THE FOLDED NORMAL DISTRIBUTION

UNIV1'RSITY OF NORTH CAROLINA Department of Statistics Chapel Hill, N. C. CUMULATIVE SUM CONTROL CHARTS FOR THE FOLDED NORMAL DISTRIBUTION UNIV1"'RSITY OF NORTH CAROLINA Department f Statistics Chapel Hill, N. C. CUMULATIVE SUM CONTROL CHARTS FOR THE FOLDED NORMAL DISTRIBUTION by N. L. Jlmsn December 1962 Grant N. AFOSR -62..148 Methds f

More information

Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeoff

Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeoff Lecture 2: Supervised vs. unsupervised learning, bias-variance tradeff Reading: Chapter 2 STATS 202: Data mining and analysis September 27, 2017 1 / 20 Supervised vs. unsupervised learning In unsupervised

More information

MATCHING TECHNIQUES Technical Track Session VI Céline Ferré The World Bank

MATCHING TECHNIQUES Technical Track Session VI Céline Ferré The World Bank MATCHING TECHNIQUES Technical Track Sessin VI Céline Ferré The Wrld Bank When can we use matching? What if the assignment t the treatment is nt dne randmly r based n an eligibility index, but n the basis

More information

Kinetic Model Completeness

Kinetic Model Completeness 5.68J/10.652J Spring 2003 Lecture Ntes Tuesday April 15, 2003 Kinetic Mdel Cmpleteness We say a chemical kinetic mdel is cmplete fr a particular reactin cnditin when it cntains all the species and reactins

More information

Differentiation Applications 1: Related Rates

Differentiation Applications 1: Related Rates Differentiatin Applicatins 1: Related Rates 151 Differentiatin Applicatins 1: Related Rates Mdel 1: Sliding Ladder 10 ladder y 10 ladder 10 ladder A 10 ft ladder is leaning against a wall when the bttm

More information

AP Statistics Notes Unit Two: The Normal Distributions

AP Statistics Notes Unit Two: The Normal Distributions AP Statistics Ntes Unit Tw: The Nrmal Distributins Syllabus Objectives: 1.5 The student will summarize distributins f data measuring the psitin using quartiles, percentiles, and standardized scres (z-scres).

More information

INSTRUMENTAL VARIABLES

INSTRUMENTAL VARIABLES INSTRUMENTAL VARIABLES Technical Track Sessin IV Sergi Urzua University f Maryland Instrumental Variables and IE Tw main uses f IV in impact evaluatin: 1. Crrect fr difference between assignment f treatment

More information

More Tutorial at

More Tutorial at Answer each questin in the space prvided; use back f page if extra space is needed. Answer questins s the grader can READILY understand yur wrk; nly wrk n the exam sheet will be cnsidered. Write answers,

More information

Midwest Big Data Summer School: Machine Learning I: Introduction. Kris De Brabanter

Midwest Big Data Summer School: Machine Learning I: Introduction. Kris De Brabanter Midwest Big Data Summer Schl: Machine Learning I: Intrductin Kris De Brabanter kbrabant@iastate.edu Iwa State University Department f Statistics Department f Cmputer Science June 24, 2016 1/24 Outline

More information

ON-LINE PROCEDURE FOR TERMINATING AN ACCELERATED DEGRADATION TEST

ON-LINE PROCEDURE FOR TERMINATING AN ACCELERATED DEGRADATION TEST Statistica Sinica 8(1998), 207-220 ON-LINE PROCEDURE FOR TERMINATING AN ACCELERATED DEGRADATION TEST Hng-Fwu Yu and Sheng-Tsaing Tseng Natinal Taiwan University f Science and Technlgy and Natinal Tsing-Hua

More information

Tree Structured Classifier

Tree Structured Classifier Tree Structured Classifier Reference: Classificatin and Regressin Trees by L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stne, Chapman & Hall, 98. A Medical Eample (CART): Predict high risk patients

More information

A mathematical model for complete stress-strain curve prediction of permeable concrete

A mathematical model for complete stress-strain curve prediction of permeable concrete A mathematical mdel fr cmplete stress-strain curve predictin f permeable cncrete M. K. Hussin Y. Zhuge F. Bullen W. P. Lkuge Faculty f Engineering and Surveying, University f Suthern Queensland, Twmba,

More information

1996 Engineering Systems Design and Analysis Conference, Montpellier, France, July 1-4, 1996, Vol. 7, pp

1996 Engineering Systems Design and Analysis Conference, Montpellier, France, July 1-4, 1996, Vol. 7, pp THE POWER AND LIMIT OF NEURAL NETWORKS T. Y. Lin Department f Mathematics and Cmputer Science San Jse State University San Jse, Califrnia 959-003 tylin@cs.ssu.edu and Bereley Initiative in Sft Cmputing*

More information

The general linear model and Statistical Parametric Mapping I: Introduction to the GLM

The general linear model and Statistical Parametric Mapping I: Introduction to the GLM The general linear mdel and Statistical Parametric Mapping I: Intrductin t the GLM Alexa Mrcm and Stefan Kiebel, Rik Hensn, Andrew Hlmes & J-B J Pline Overview Intrductin Essential cncepts Mdelling Design

More information

COMP 551 Applied Machine Learning Lecture 4: Linear classification

COMP 551 Applied Machine Learning Lecture 4: Linear classification COMP 551 Applied Machine Learning Lecture 4: Linear classificatin Instructr: Jelle Pineau (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/cmp551 Unless therwise nted, all material psted

More information

Application of ILIUM to the estimation of the T eff [Fe/H] pair from BP/RP

Application of ILIUM to the estimation of the T eff [Fe/H] pair from BP/RP Applicatin f ILIUM t the estimatin f the T eff [Fe/H] pair frm BP/RP prepared by: apprved by: reference: issue: 1 revisin: 1 date: 2009-02-10 status: Issued Cryn A.L. Bailer-Jnes Max Planck Institute fr

More information

Comparison of two variable parameter Muskingum methods

Comparison of two variable parameter Muskingum methods Extreme Hydrlgical Events: Precipitatin, Flds and Drughts (Prceedings f the Ykhama Sympsium, July 1993). IAHS Publ. n. 213, 1993. 129 Cmparisn f tw variable parameter Muskingum methds M. PERUMAL Department

More information

APPLICATION OF THE BRATSETH SCHEME FOR HIGH LATITUDE INTERMITTENT DATA ASSIMILATION USING THE PSU/NCAR MM5 MESOSCALE MODEL

APPLICATION OF THE BRATSETH SCHEME FOR HIGH LATITUDE INTERMITTENT DATA ASSIMILATION USING THE PSU/NCAR MM5 MESOSCALE MODEL JP2.11 APPLICATION OF THE BRATSETH SCHEME FOR HIGH LATITUDE INTERMITTENT DATA ASSIMILATION USING THE PSU/NCAR MM5 MESOSCALE MODEL Xingang Fan * and Jeffrey S. Tilley University f Alaska Fairbanks, Fairbanks,

More information

Pipetting 101 Developed by BSU CityLab

Pipetting 101 Developed by BSU CityLab Discver the Micrbes Within: The Wlbachia Prject Pipetting 101 Develped by BSU CityLab Clr Cmparisns Pipetting Exercise #1 STUDENT OBJECTIVES Students will be able t: Chse the crrect size micrpipette fr

More information

SIZE BIAS IN LINE TRANSECT SAMPLING: A FIELD TEST. Mark C. Otto Statistics Research Division, Bureau of the Census Washington, D.C , U.S.A.

SIZE BIAS IN LINE TRANSECT SAMPLING: A FIELD TEST. Mark C. Otto Statistics Research Division, Bureau of the Census Washington, D.C , U.S.A. SIZE BIAS IN LINE TRANSECT SAMPLING: A FIELD TEST Mark C. Ott Statistics Research Divisin, Bureau f the Census Washingtn, D.C. 20233, U.S.A. and Kenneth H. Pllck Department f Statistics, Nrth Carlina State

More information

Homology groups of disks with holes

Homology groups of disks with holes Hmlgy grups f disks with hles THEOREM. Let p 1,, p k } be a sequence f distinct pints in the interir unit disk D n where n 2, and suppse that fr all j the sets E j Int D n are clsed, pairwise disjint subdisks.

More information

Sequential Allocation with Minimal Switching

Sequential Allocation with Minimal Switching In Cmputing Science and Statistics 28 (1996), pp. 567 572 Sequential Allcatin with Minimal Switching Quentin F. Stut 1 Janis Hardwick 1 EECS Dept., University f Michigan Statistics Dept., Purdue University

More information

MATHEMATICS SYLLABUS SECONDARY 5th YEAR

MATHEMATICS SYLLABUS SECONDARY 5th YEAR Eurpean Schls Office f the Secretary-General Pedaggical Develpment Unit Ref. : 011-01-D-8-en- Orig. : EN MATHEMATICS SYLLABUS SECONDARY 5th YEAR 6 perid/week curse APPROVED BY THE JOINT TEACHING COMMITTEE

More information

A Regression Solution to the Problem of Criterion Score Comparability

A Regression Solution to the Problem of Criterion Score Comparability A Regressin Slutin t the Prblem f Criterin Scre Cmparability William M. Pugh Naval Health Research Center When the criterin measure in a study is the accumulatin f respnses r behavirs fr an individual

More information

Name: Block: Date: Science 10: The Great Geyser Experiment A controlled experiment

Name: Block: Date: Science 10: The Great Geyser Experiment A controlled experiment Science 10: The Great Geyser Experiment A cntrlled experiment Yu will prduce a GEYSER by drpping Ments int a bttle f diet pp Sme questins t think abut are: What are yu ging t test? What are yu ging t measure?

More information

Lecture 13: Markov Chain Monte Carlo. Gibbs sampling

Lecture 13: Markov Chain Monte Carlo. Gibbs sampling Lecture 13: Markv hain Mnte arl Gibbs sampling Gibbs sampling Markv chains 1 Recall: Apprximate inference using samples Main idea: we generate samples frm ur Bayes net, then cmpute prbabilities using (weighted)

More information

Biplots in Practice MICHAEL GREENACRE. Professor of Statistics at the Pompeu Fabra University. Chapter 13 Offprint

Biplots in Practice MICHAEL GREENACRE. Professor of Statistics at the Pompeu Fabra University. Chapter 13 Offprint Biplts in Practice MICHAEL GREENACRE Prfessr f Statistics at the Pmpeu Fabra University Chapter 13 Offprint CASE STUDY BIOMEDICINE Cmparing Cancer Types Accrding t Gene Epressin Arrays First published:

More information

On Out-of-Sample Statistics for Financial Time-Series

On Out-of-Sample Statistics for Financial Time-Series On Out-f-Sample Statistics fr Financial Time-Series Françis Gingras Yshua Bengi Claude Nadeau CRM-2585 January 1999 Département de physique, Université de Mntréal Labratire d infrmatique des systèmes adaptatifs,

More information

the results to larger systems due to prop'erties of the projection algorithm. First, the number of hidden nodes must

the results to larger systems due to prop'erties of the projection algorithm. First, the number of hidden nodes must M.E. Aggune, M.J. Dambrg, M.A. El-Sharkawi, R.J. Marks II and L.E. Atlas, "Dynamic and static security assessment f pwer systems using artificial neural netwrks", Prceedings f the NSF Wrkshp n Applicatins

More information

Array Variate Random Variables with Multiway Kronecker Delta Covariance Matrix Structure

Array Variate Random Variables with Multiway Kronecker Delta Covariance Matrix Structure Array Variate Randm Variables with Multiway Krnecker Delta Cvariance Matrix Structure Deniz Akdemir Department f Statistics University f Central Flrida Orland, FL 32816 Arjun K. Gupta Department f Mathematics

More information

, which yields. where z1. and z2

, which yields. where z1. and z2 The Gaussian r Nrmal PDF, Page 1 The Gaussian r Nrmal Prbability Density Functin Authr: Jhn M Cimbala, Penn State University Latest revisin: 11 September 13 The Gaussian r Nrmal Prbability Density Functin

More information

Eric Klein and Ning Sa

Eric Klein and Ning Sa Week 12. Statistical Appraches t Netwrks: p1 and p* Wasserman and Faust Chapter 15: Statistical Analysis f Single Relatinal Netwrks There are fur tasks in psitinal analysis: 1) Define Equivalence 2) Measure

More information

CHAPTER 24: INFERENCE IN REGRESSION. Chapter 24: Make inferences about the population from which the sample data came.

CHAPTER 24: INFERENCE IN REGRESSION. Chapter 24: Make inferences about the population from which the sample data came. MATH 1342 Ch. 24 April 25 and 27, 2013 Page 1 f 5 CHAPTER 24: INFERENCE IN REGRESSION Chapters 4 and 5: Relatinships between tw quantitative variables. Be able t Make a graph (scatterplt) Summarize the

More information

Smoothing, penalized least squares and splines

Smoothing, penalized least squares and splines Smthing, penalized least squares and splines Duglas Nychka, www.image.ucar.edu/~nychka Lcally weighted averages Penalized least squares smthers Prperties f smthers Splines and Reprducing Kernels The interplatin

More information

2004 AP CHEMISTRY FREE-RESPONSE QUESTIONS

2004 AP CHEMISTRY FREE-RESPONSE QUESTIONS 2004 AP CHEMISTRY FREE-RESPONSE QUESTIONS 6. An electrchemical cell is cnstructed with an pen switch, as shwn in the diagram abve. A strip f Sn and a strip f an unknwn metal, X, are used as electrdes.

More information

LHS Mathematics Department Honors Pre-Calculus Final Exam 2002 Answers

LHS Mathematics Department Honors Pre-Calculus Final Exam 2002 Answers LHS Mathematics Department Hnrs Pre-alculus Final Eam nswers Part Shrt Prblems The table at the right gives the ppulatin f Massachusetts ver the past several decades Using an epnential mdel, predict the

More information

Computational Statistics

Computational Statistics Cmputatinal Statistics Spring 2008 Peter Bühlmann and Martin Mächler Seminar für Statistik ETH Zürich February 2008 (February 23, 2011) ii Cntents 1 Multiple Linear Regressin 1 1.1 Intrductin....................................
