Inference about the slope in linear regression: an empirical likelihood approach

Annals of the Institute of Statistical Mathematics manuscript No. (will be inserted by the editor)

Inference about the slope in linear regression: an empirical likelihood approach

Ursula U. Müller · Hanxiang Peng · Anton Schick

Received: date / Revised: date

Abstract We present a new, efficient maximum empirical likelihood estimator for the slope in linear regression with independent errors and covariates. The estimator does not require estimation of the influence function, in contrast to other approaches, and is easy to obtain numerically. Our approach can also be used in the model with responses missing at random, for which we recommend a complete case analysis. This suffices thanks to results by Müller and Schick (2017), which demonstrate that efficiency is preserved. We provide confidence intervals and tests for the slope, based on the limiting chi-square distribution of the empirical likelihood, and a uniform expansion for the empirical likelihood ratio. The article concludes with a small simulation study.

Keywords efficiency · estimated constraint functions · infinitely many constraints · maximum empirical likelihood estimator · missing responses · missing at random

U.U. Müller, Texas A&M University, Department of Statistics, College Station, TX, USA, uschi@stat.tamu.edu

H. Peng, Indiana University Purdue University at Indianapolis, Department of Mathematical Sciences, Indianapolis, IN, USA, hpeng@math.iupui.edu

A. Schick, Binghamton University, Department of Mathematical Sciences, Binghamton, NY, USA, anton@math.binghamton.edu

1 Introduction

We consider the homoscedastic regression model in which the response variable Y is linked to a covariate X by the formula

$Y = \beta X + \varepsilon. \qquad (1)$

For reasons of clarity we focus on the case where X is one-dimensional and $\beta$ an unknown real number. We will assume throughout that $\varepsilon$ and X are independent, and that X has a finite positive variance. Our goal is to make inferences about the slope $\beta$, treating the density f of the error $\varepsilon$ and the distribution of the covariate X as nuisance parameters. We shall do so by using an empirical likelihood approach based on independent copies $(X_1, Y_1), \dots, (X_n, Y_n)$ of the base observation (X, Y).

Model (1) is the usual linear regression model with a non-zero intercept, even though it is written without an explicit intercept parameter. Since we do not assume that the error variable is centered, the mean $E[\varepsilon]$ plays the role of the intercept parameter. Working with this model and notation simplifies the explanation of the method and the presentation of the proofs. The generalization to the multivariate case is straightforward; see Remark 1 in Section 2.

The linear regression model is one of the most useful statistical models, and many simple estimators for the slope are available, such as the ordinary least squares estimator (OLSE), which takes the form

$\sum_{j=1}^n (X_j - \bar X)Y_j \Big/ \sum_{j=1}^n (X_j - \bar X)^2 \qquad (2)$

rather than $\sum_{j=1}^n X_jY_j / \sum_{j=1}^n X_j^2$, because we do not assume that the errors are centered. However, these estimators are usually inefficient. The construction of efficient (least dispersed) estimators is in fact quite involved. The reason for this is the assumed independence between covariates and errors, which is a structural assumption that has to be taken into account by the estimator to obtain efficiency. Efficient estimators for $\beta$ in model (1) were first introduced by Bickel (1982), who used sample splitting to estimate the efficient influence function. To establish efficiency we must assume that f has finite Fisher information for location.
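In code, the centered OLSE (2) might be computed as follows (a minimal Python sketch; the paper's own computations were carried out in R):

```python
import numpy as np

def olse(X, Y):
    """OLSE (2): centering the covariates absorbs the unknown intercept E[eps],
    so the errors need not have mean zero."""
    Xc = X - X.mean()
    return float(Xc @ Y / (Xc @ Xc))
```

For example, `olse(np.array([0., 1., 2., 3.]), np.array([5., 7., 9., 11.]))` recovers the slope 2 exactly, even though the "errors" here have mean 5.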
This means that f is absolutely continuous and the integral $J_f = \int \ell_f^2(y) f(y)\,dy$ is finite, where $\ell_f = -f'/f$ denotes the score function for location. It follows from Bickel (1982) that an efficient estimator $\hat\beta$ of $\beta$ is characterized by the stochastic expansion

$\hat\beta = \beta + \frac{1}{n}\sum_{j=1}^n \frac{(X_j - E[X])\,\ell_f(Y_j - \beta X_j)}{J_f\,\mathrm{Var}(X)} + o_P(n^{-1/2}). \qquad (3)$

Further efficient estimators of the slope which require estimating the influence function were proposed by Schick (1987) and Jin (1992). Koul and Susarla

(1983) studied the case when f is also symmetric about zero. See also Schick (1993) and Forrester et al. (2003), who achieved efficiency without sample splitting and instead used a conditioning argument.

Efficient estimation in the corresponding (heteroscedastic) model without the independence assumption (defined by $E(\varepsilon \mid X) = 0$) is much easier: Müller and Van Keilegom (2012), for example, proposed weighted versions of the OLSE to efficiently estimate $\beta$ in the model with fully observed data and in a model with missing responses. See also Schick (2013), who proposed an efficient estimator using maximum empirical likelihood with infinitely many constraints.

Like Müller and Van Keilegom (2012), we are interested in the common case that responses are missing at random (MAR). This means that we observe copies of the triplet $(\delta, X, \delta Y)$, where $\delta$ is an indicator variable with $\delta = 1$ if Y is observed, and where the probability $\pi$ that Y is observed depends only on the covariate,

$P(\delta = 1 \mid X, Y) = P(\delta = 1 \mid X) = \pi(X), \quad\text{with}\quad E[\pi(X)] = E[\delta] > 0;$

we refer to the monographs by Little and Rubin (2002) and Tsiatis (2006) for further reading. Note that the MAR model we have just described covers the full model (in which all data are completely observed) as a special case with $\pi(X) = 1$. To estimate $\beta$ in the MAR model we propose a complete case analysis, i.e., only the $N = \sum_{j=1}^n \delta_j$ observations $(X_{i_1}, Y_{i_1}), \dots, (X_{i_N}, Y_{i_N})$ with observed responses will be considered. Complete case analysis is the simplest approach to dealing with missing data, and is frequently disregarded as naive and wasteful. In our application, however, the contrary is true: Müller and Schick (2017) showed that general functionals of the conditional distribution of Y given X can be estimated efficiently (in the sense of Hájek and Le Cam) by a complete case analysis. Since the slope $\beta$ is covered as a special case, this means that an estimator of $\beta$ that is efficient in the full model is also efficient in the MAR model if we simply omit the incomplete cases.
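For illustration, data from the MAR model and the complete cases our analysis retains can be generated as follows (a sketch; the missingness mechanism $\pi(x) = 1/(1 + d e^x)$ is the one used later in Section 5.2, while the remaining parameter values are our own choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
X = rng.normal(size=n)
eps = rng.normal(size=n) + 2.0              # errors need not be centered
Y = 1.0 * X + eps                           # true slope beta = 1
pi = 1.0 / (1.0 + 0.5 * np.exp(X))          # P(delta = 1 | X): MAR with d = 0.5
delta = rng.binomial(1, pi)                 # missingness indicators
Xc, Yc = X[delta == 1], Y[delta == 1]       # the N complete cases
N = int(delta.sum())
```

Any estimator designed for the full model is then simply applied to `(Xc, Yc)` with (random) sample size N.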
This property is called efficiency transfer. To construct efficient maximum empirical likelihood estimators for $\beta$, it therefore suffices to consider the model with completely observed data.

We write $\hat\beta_c$ for the complete case version of $\hat\beta$ from (3). It follows from the transfer principle for asymptotically linear statistics by Koul et al. (2012) that $\hat\beta_c$ satisfies

$\hat\beta_c = \beta + \frac{1}{N}\sum_{j=1}^n \frac{\delta_j\,(X_j - E[X \mid \delta = 1])\,\ell_f(Y_j - \beta X_j)}{J_f\,\mathrm{Var}(X \mid \delta = 1)} + o_P(n^{-1/2}), \qquad (4)$

and is therefore consistent for $\beta$. That $\hat\beta_c$ is also efficient follows from Müller and Schick (2017, Section 5.1). The efficiency property can alternatively be deduced from arguments in Müller (2009), who gave the efficient influence function for $\beta$ in the MAR model, but with the additional assumption that the errors have mean zero; see Lemma 5.1 in that paper.

In this paper we use an empirical likelihood approach with an increasing number of estimated constraints to derive various inferential procedures about the slope. Our approach is similar to Schick (2013), but our model requires different constraints. We obtain a suitable Wilks theorem (see Theorem 1) to derive confidence sets for $\beta$ and tests about a specific value of $\beta$, and a point estimator of $\beta$ via maximum empirical likelihood, i.e., by maximizing the empirical likelihood. This estimator is shown to be semiparametrically efficient.

Empirical likelihood was introduced by Owen (1988, 2001) for a fixed number of known linear constraints to construct confidence intervals in a nonparametric setting. More recently, his results have been generalized to a fixed number of estimated constraints by Hjort, McKeague and Van Keilegom (2009), who further studied the case of an increasing number of known constraints; see also Chen et al. (2009). Peng and Schick (2013) generalized the approach to the case of an increasing number of estimated constraints. The idea of maximum empirical likelihood goes back to Qin and Lawless (1994), who treated the case with a fixed number of known constraints. Peng and Schick (2017) generalized their result to the case with estimated constraints. Schick (2013) and Peng and Schick (2016) treated examples with an increasing number of estimated constraints and showed efficiency of the maximum empirical likelihood estimators.

Our empirical likelihood is similar to the one considered for the symmetric location model in Peng and Schick (2016). We shall derive results that are analogous to those in that paper. In Section 3 we provide the asymptotic chi-square distribution of the empirical log-likelihood for both the full model and the MAR model. This facilitates the construction of confidence intervals and tests about the slope $\beta$. In Section 4 we propose a new method for estimating $\beta$ efficiently, namely a guided maximum empirical likelihood estimator, as suggested by Peng and Schick (2017) for the general model with estimated constraints.
Efficiency of this estimator is entailed by a uniform expansion for the local empirical likelihood (see Theorem 2), which follows from a local asymptotic normality condition. Section 5 contains a simulation study. The proofs are in Section 6.

2 Empirical likelihood approach

The construction of the empirical likelihood is crucial since we need to incorporate the independence between the covariates and the errors to obtain efficiency. Let us explain it for the full model. The corresponding approach for the missing data model is then straightforward: in that case we will proceed in the same way, now with the analysis based on the N complete cases, and with the random sample size N treated like n.

Our empirical likelihood $R_n(b)$, which we want to maximize with respect to $b \in \mathbb{R}$, is of the form

$R_n(b) = \sup\Big\{\prod_{j=1}^n n\pi_j : \pi \in \mathscr{P}_n,\ \sum_{j=1}^n \pi_j (X_j - \bar X)\, v_n(F_b(Y_j - bX_j)) = 0\Big\}.$

Here $\mathscr{P}_n$ is the probability simplex in dimension n,

$\mathscr{P}_n = \Big\{\pi = (\pi_1, \dots, \pi_n) \in [0,1]^n : \sum_{j=1}^n \pi_j = 1\Big\},$

$\bar X$ is the sample mean of the covariates $X_1, \dots, X_n$, and $F_b$ is the empirical distribution function constructed from the residuals $Y_1 - bX_1, \dots, Y_n - bX_n$,

$F_b(t) = \frac{1}{n}\sum_{j=1}^n \mathbf 1[Y_j - bX_j \le t], \qquad t \in \mathbb{R},$

which serves as a surrogate for the unknown error distribution F. The function $v_n$ maps from [0, 1] into $\mathbb{R}^{r_n}$ and will be described in (6) below. The constraint $\sum_{j=1}^n \pi_j (X_j - \bar X) v_n(F_b(Y_j - bX_j)) = 0$ in the definition of $R_n(b)$ is therefore a vector of $r_n$ one-dimensional constraints, where the integer $r_n$ tends to infinity slowly as the sample size n increases.

These constraints emerge from the independence assumption as follows. Independence of X and $\varepsilon$ is equivalent to $E[c(X)a(\varepsilon)] = 0$ for all square-integrable centered functions c and a under the distributions of X and $\varepsilon$, respectively. This leads to the empirical likelihood in Peng and Schick (2013). We do not work with these constraints. Instead we use constraints in the subspace

$\{(X - E[X])\,a(\varepsilon) : a \in L_{2,0}(F)\} \qquad (5)$

with $L_{2,0}(F) = \{a \in L_2(F) : \int a\,dF = 0\}$, which suffices since it contains the efficient influence function; see (3). By our assumptions, F is continuous and $F(\varepsilon)$ is uniformly distributed on the interval [0, 1], $F(\varepsilon) \sim U$. An orthonormal basis of $L_{2,0}(F)$ is $\varphi_1 \circ F, \varphi_2 \circ F, \dots$, where the $\varphi_k$ denote an orthonormal basis of $L_{2,0}(U)$. This suggests the constraints

$\sum_{j=1}^n \pi_j \{X_j - E(X)\}\,\varphi_k\{F(Y_j - bX_j)\} = 0, \qquad k = 1, \dots, r_n,$

which, however, cannot be used since neither F nor the mean of X are known. So we replace them by empirical estimators. In this article we will work with the trigonometric basis

$\varphi_k(x) = \sqrt 2\,\cos(k\pi x), \qquad 0 \le x \le 1,\ k = 1, 2, \dots,$

and take

$v_n = (\varphi_1, \dots, \varphi_{r_n})^\top. \qquad (6)$
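As an illustration, the $r_n$ estimated constraint vectors $(X_j - \bar X)\,v_n(F_b(Y_j - bX_j))$ can be assembled as follows (a Python sketch with made-up example data; the paper's computations used R):

```python
import numpy as np

def basis_v(u, r):
    """Trigonometric basis (6): phi_k(u) = sqrt(2) cos(k pi u), k = 1, ..., r."""
    k = np.arange(1, r + 1)
    return np.sqrt(2.0) * np.cos(np.pi * np.outer(u, k))       # shape (n, r)

def constraints(X, Y, b, r):
    """Estimated constraint vectors (X_j - Xbar) v(F_b(Y_j - b X_j)),
    with F_b the empirical distribution function of the residuals."""
    res = Y - b * X
    Fb = np.searchsorted(np.sort(res), res, side="right") / len(res)
    return (X - X.mean())[:, None] * basis_v(Fb, r)            # shape (n, r)

X = np.array([0.5, -1.0, 2.0, 0.0, 1.5])
Y = np.array([1.2, -0.3, 2.5, 0.9, 2.1])
g = constraints(X, Y, b=1.0, r=3)       # five constraint vectors in R^3
```

Each row of `g` is one summand of the constraint in $R_n(b)$; the empirical likelihood then reweights these rows by the $\pi_j$.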

This yields our empirical likelihood $R_n(b)$ from above.

Let us briefly discuss the complete case approach that we propose for the MAR model. In the following a subscript c will, as before when we introduced $\hat\beta_c$, indicate that a complete case statistic is used. For example, $F_{b,c}$ is the complete case version of $F_b$,

$F_{b,c}(t) = \frac{1}{N}\sum_{j=1}^n \delta_j \mathbf 1[Y_j - bX_j \le t] = \frac{1}{N}\sum_{j=1}^N \mathbf 1[Y_{i_j} - bX_{i_j} \le t], \qquad t \in \mathbb{R}.$

The complete case empirical likelihood is

$R_{N,c}(b) = \sup\Big\{\prod_{j=1}^N N\pi_j : \pi \in \mathscr{P}_N,\ \sum_{j=1}^N \pi_j (X_{i_j} - \bar X_c)\, v_N(F_{b,c}(Y_{i_j} - bX_{i_j})) = 0\Big\},$

with $\mathscr{P}_N$ and $v_N$ defined as above. Note that we perform a complete case analysis, so the above formula must involve $\bar X_c = N^{-1}\sum_{j=1}^n \delta_j X_j$, which is a consistent estimator of the conditional expectation $E[X \mid \delta = 1]$, as given in (4); see also Section 3 in Müller and Schick (2017) for the general case. Moments of the covariate distribution are replaced by moments of the conditional covariate distribution given $\delta = 1$ when switching from the full model to the complete case analysis.

Remark 1 If the covariate X is a p-dimensional vector we have $Y_j = \beta^\top X_j + \varepsilon_j$, $j = 1, \dots, n$, and construct $F_b$ using the residuals $Y_j - b^\top X_j$. Now we need to interpret (5) with X being p-dimensional. The empirical likelihood $R_n(b)$ is then

$\sup\Big\{\prod_{j=1}^n n\pi_j : \pi \in \mathscr{P}_n,\ \sum_{j=1}^n \pi_j (X_j - \bar X) \otimes v_n(F_b(Y_j - b^\top X_j)) = 0\Big\},$

where $\otimes$ denotes the Kronecker product. Since the Kronecker product of two vectors with dimensions p and q is a vector of dimension pq, there are $p\,r_n$ random constraints in the above empirical likelihood. Working with this likelihood is notationally more cumbersome, but the proofs are essentially the same. The complete case empirical likelihood $R_{N,c}(b)$ changes analogously. It equals

$\sup\Big\{\prod_{j=1}^N N\pi_j : \pi \in \mathscr{P}_N,\ \sum_{j=1}^N \pi_j (X_{i_j} - \bar X_c) \otimes v_N(F_{b,c}(Y_{i_j} - b^\top X_{i_j})) = 0\Big\}.$
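For the p-dimensional case of Remark 1, the $p\,r_n$ constraints per observation are row-wise Kronecker products; a sketch (function name and example data are our own):

```python
import numpy as np

def constraints_multi(X, Y, b, r):
    """g_j = (X_j - Xbar) kron v(F_b(Y_j - b'X_j)); X is (n, p), b is (p,)."""
    n, p = X.shape
    res = Y - X @ b
    Fb = np.searchsorted(np.sort(res), res, side="right") / n
    k = np.arange(1, r + 1)
    V = np.sqrt(2.0) * np.cos(np.pi * np.outer(Fb, k))     # (n, r) basis values
    Xc = X - X.mean(axis=0)                                # (n, p) centered X
    # row-wise Kronecker product -> (n, p*r) constraint matrix
    return (Xc[:, :, None] * V[:, None, :]).reshape(n, p * r)

rng = np.random.default_rng(0)
n, p, r = 50, 2, 3
X = rng.normal(size=(n, p))
Y = X @ np.array([1.0, -0.5]) + rng.exponential(size=n)
G = constraints_multi(X, Y, np.array([1.0, -0.5]), r)      # shape (50, 6)
```

Row j of `G` equals `np.kron` of the centered covariate vector and the basis vector of observation j, as in Remark 1.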

3 A Wilks theorem

Wilks' original theorem states that the classical log-likelihood ratio test statistic is asymptotically chi-square distributed. Our first result is a version of that theorem for the empirical log-likelihood. It is given in Theorem 1 below and proved in the first subsection of Section 6. As in the previous section we write $R_n(b)$ for the empirical likelihood and $R_{N,c}(b)$ for the complete case empirical likelihood. Further let $\chi_\gamma(d)$ denote the $\gamma$-quantile of the chi-square distribution with d degrees of freedom.

Theorem 1 Consider the full model and suppose that X also has a finite fourth moment and that the number of basis functions $r_n$ satisfies $r_n \to \infty$ and $r_n^4 = o(n)$ as $n \to \infty$. Then we have

$P(-2\log R_n(\beta) \le \chi_u(r_n)) \to u, \qquad 0 < u < 1.$

The conclusion of this theorem is equivalent to $(-2\log R_n(\beta) - r_n)/\sqrt{2r_n}$ being asymptotically standard normal. This implies that the complete case version $(-2\log R_{N,c}(\beta) - r_N)/\sqrt{2r_N}$ is also asymptotically standard normal. This is a consequence of the transfer principle for complete case statistics; see Remark 2.4 in the article by Koul et al. (2012). More precisely, these authors showed that if the limiting distribution of a statistic is $\mathcal L(Q)$, then the limiting distribution of its complete case version is $\mathcal L(\bar Q)$, where Q is the joint distribution of (X, Y), belonging to some model, and where $\bar Q$ is the distribution of (X, Y) given $\delta = 1$. One only needs to assume that $\bar Q$ belongs to the same model as Q, i.e., it satisfies the same assumptions. Here we assume that the responses are missing at random, i.e., $\delta$ and Y are conditionally independent given X. Therefore we only need to require that the conditional covariate distribution given $\delta = 1$ and the unconditional covariate distribution belong to the same model. Here the limiting distribution is not affected as it does not depend on Q.
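Numerically, $-2\log R_n(\beta_0)$ can be evaluated through the dual characterization of the empirical likelihood, in which the maximizing weights are $\pi_j = 1/\{n(1 + \lambda^\top g_j)\}$ with $\lambda$ solving $\sum_j g_j/(1 + \lambda^\top g_j) = 0$ (Owen 2001). The following sketch (our own simplification, not the authors' implementation) computes the statistic of Theorem 1 with a damped Newton solver and compares it to the hard-coded quantile $\chi_{0.95}(2) \approx 5.991$:

```python
import numpy as np

def constraint_matrix(X, Y, b, r):
    """g_j(b) = (X_j - Xbar) v(F_b(Y_j - b X_j)) with the cosine basis (6)."""
    res = Y - b * X
    Fb = np.searchsorted(np.sort(res), res, side="right") / len(res)
    k = np.arange(1, r + 1)
    return (X - X.mean())[:, None] * np.sqrt(2.0) * np.cos(np.pi * np.outer(Fb, k))

def neg2_log_EL(g, iters=100):
    """-2 log R for an (n x r) constraint matrix g, via Owen's dual problem."""
    n, r = g.shape
    lam = np.zeros(r)
    for _ in range(iters):
        w = 1.0 + g @ lam
        gw = g / w[:, None]
        grad = gw.sum(axis=0)              # gradient of sum_j log(1 + lam'g_j)
        step = np.linalg.solve(gw.T @ gw + 1e-12 * np.eye(r), grad)
        t = 1.0
        while np.any(1.0 + g @ (lam + t * step) <= 1e-8):
            t /= 2.0                       # damp the step to keep weights positive
        lam = lam + t * step
        if np.linalg.norm(t * step) < 1e-12:
            break
    return 2.0 * np.sum(np.log(1.0 + g @ lam))

# Wilks-type test of H0: beta = 1 at asymptotic level 5% with r = 2 constraints
rng = np.random.default_rng(0)
n, r = 100, 2
X = rng.normal(size=n)
Y = 1.0 * X + rng.exponential(size=n)      # true slope 1, errors not centered
stat = neg2_log_EL(constraint_matrix(X, Y, b=1.0, r=r))
reject = bool(stat >= 5.991)               # 0.95-quantile of chi-square(2)
```

A confidence interval in the sense of Remark 2 below is obtained by collecting all b for which the statistic stays below the quantile.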
Although the result for the MAR model is more general than the result for the full model (which is covered as a special case), we can now, thanks to the transfer principle, formulate it as a corollary, i.e., we only need to take the modified assumptions for the conditional covariate distribution into account, and prove Theorem 1 for the full model.

Corollary 1 Consider the MAR model and suppose that the distribution of X given $\delta = 1$ has a finite fourth moment and a positive variance. Let the number of basis functions $r_N$ satisfy $1/r_N = o_P(1)$ and $r_N^4 = o_P(N)$ as $n \to \infty$. Then we have

$P(-2\log R_{N,c}(\beta) \le \chi_u(r_N)) \to u, \qquad 0 < u < 1.$

Note that the conditions on the number of basis functions $r_n$ and $r_N$ in the full model and the MAR model are equivalent since n and N increase proportionally,

$\frac{N}{n} = \frac{1}{n}\sum_{i=1}^n \delta_i \to E[\delta] \quad\text{almost surely},$

with $E[\delta] > 0$ by assumption. The distribution of X given $\delta = 1$ has density $\pi/E[\delta]$ with respect to the distribution of X. Thus the variance of the former distribution is positive unless X is constant almost surely on the event $\{\pi(X) > 0\}$.

Remark 2 The above result shows that $\{b \in \mathbb{R} : -2\log R_{N,c}(b) < \chi_{1-\alpha}(r_N)\}$ is a $1-\alpha$ confidence region for $\beta$ and that $\mathbf 1[-2\log R_{N,c}(\beta_0) \ge \chi_{1-\alpha}(r_N)]$ is a test of asymptotic size $\alpha$ for testing the null hypothesis $H_0: \beta = \beta_0$. Note that both the confidence region and the test about the slope also apply to the special case of a full model with $N = n$ and $R_n$ in place of $R_{N,c}$. The asymptotic confidence interval for the slope, for example, is $\{b \in \mathbb{R} : -2\log R_n(b) < \chi_{1-\alpha}(r_n)\}$.

4 Efficient estimation

Our next result gives a strengthened version of the uniform local asymptotic normality (ULAN) condition for the local empirical likelihood ratio

$L_n(t) = \log\Big(\frac{R_n(\beta + n^{-1/2}t)}{R_n(\beta)}\Big), \qquad t \in \mathbb{R},$

in the full model. The usual ULAN condition is established for fixed compact intervals for the local parameter t. Here we allow the intervals to grow with the sample size.

Theorem 2 Suppose X has a finite fourth moment, f has finite Fisher information for location, and $r_n$ satisfies $(\log n)/r_n = O(1)$ and $r_n^5 \log n = o(n)$. Then for every sequence $C_n$ satisfying $C_n \ge 1$ and $C_n^2 = O(\log n)$, the uniform expansion

$\sup_{|t| \le C_n} \frac{\big|L_n(t) - t\Gamma_n + J_f\,\mathrm{Var}(X)\,t^2/2\big|}{(1 + |t|)^2} = o_P(1) \qquad (7)$

holds with

$\Gamma_n = \frac{1}{\sqrt n}\sum_{j=1}^n (X_j - E[X])\,\ell_f(Y_j - \beta X_j),$

which is asymptotically normal with mean zero and variance $J_f\,\mathrm{Var}(X)$.

The proof of Theorem 2 is quite elaborate and carried out in Section 6. Expansion (7) is critical to obtain the asymptotic distribution of the maximum empirical likelihood estimator. We shall follow Peng and Schick (2017) and work with a guided maximum empirical likelihood estimator (GMELE). This requires a preliminary $n^{1/2}$-consistent estimator $\tilde\beta$ of $\beta$. One possibility is the OLSE, see (2), which requires the additional assumption that the error has a finite second moment. Another possibility which avoids this assumption is the solution $\tilde\beta$ of the equation

$\frac{1}{n}\sum_{j=1}^n (X_j - \bar X)\,\psi(Y_j - bX_j) = 0,$

where $\psi$ is a bounded function with a positive and bounded first derivative $\psi'$ and a bounded second derivative, as, for example, the arctangent. Then

$\tilde\beta = \beta + \frac{1}{n}\sum_{j=1}^n \frac{(X_j - \mu)\big(\psi(\varepsilon_j) - E[\psi(\varepsilon)]\big)}{\mathrm{Var}(X)\,E[\psi'(\varepsilon)]} + o_P(n^{-1/2}),$

and $n^{1/2}(\tilde\beta - \beta)$ is asymptotically normal with mean zero and variance

$\frac{\mathrm{Var}(\psi(\varepsilon))}{(E[\psi'(\varepsilon)])^2\,\mathrm{Var}(X)}.$

The GMELE associated with a $n^{1/2}$-consistent preliminary estimator $\tilde\beta$ is defined by

$\hat\beta_n = \arg\max_{n^{1/2}|b - \tilde\beta| \le C_n} R_n(b), \qquad (8)$

where $C_n$ is proportional to $(\log n)^{1/2}$. By the results in Peng and Schick (2017) the expansion (7) implies

$n^{1/2}(\hat\beta_n - \beta) = \Gamma_n/(J_f\,\mathrm{Var}(X)) + o_P(1).$

Thus, under the assumptions of Theorem 2, the GMELE $\hat\beta_n$ satisfies (3) and is therefore efficient. The complete case estimator

$\hat\beta_{n,c} = \arg\max_{N^{1/2}|b - \tilde\beta_{N,c}| \le C_N} R_{N,c}(b)$

is then efficient in the MAR model, provided the conditional distribution of X given $\delta = 1$ has a finite fourth moment and a positive variance. Let us summarize our findings in the following theorem.

Theorem 3 Suppose that the error density f has finite Fisher information for location and that $r_n$ satisfies $(\log n)/r_n = O(1)$ and $r_n^5 \log n = o(n)$. (a) Assume that the variance of X is positive and that $E[X^4]$ is finite. Then the GMELE $\hat\beta_n$ satisfies expansion (3) and is therefore efficient in the full model.

(b) Consider the MAR model and assume that the conditional variance $\mathrm{Var}[X \mid \delta = 1]$ is positive and that $E[X^4 \mid \delta = 1]$ is finite. Then the complete case version $\hat\beta_{n,c}$ of the GMELE satisfies expansion (4) and is efficient in the MAR model.

The choice of $r_n$ (and $r_N$) is addressed in Remark 4 in Section 5.

Remark 3 A referee suggested the following. An alternative (but asymptotically equivalent) procedure to compute the maximum empirical likelihood estimator can be based on the set of generalized estimating equations $g_j(b) = (X_j - \bar X)\,v_n(F_b(Y_j - bX_j))$ (with $r_n > 1$) and the following program:

$\hat\beta_n^{EE} = \arg\min_{n^{1/2}|b - \tilde\beta| \le C_n} \Big(\frac1n\sum_{j=1}^n g_j(b)\Big)^\top \Big(\frac1n\sum_{j=1}^n g_j(\beta_n^*)\,g_j(\beta_n^*)^\top\Big)^{-1} \Big(\frac1n\sum_{j=1}^n g_j(b)\Big),$

where $\beta_n^*$ is a preliminary estimator defined as

$\beta_n^* = \arg\min_{n^{1/2}|b - \tilde\beta| \le C_n} \Big(\frac1n\sum_{j=1}^n g_j(b)\Big)^\top \hat W^{-1} \Big(\frac1n\sum_{j=1}^n g_j(b)\Big)$

for any positive semi-definite matrix $\hat W$ (and similarly for the complete case analysis). This estimator is computationally simpler than the maximum empirical likelihood estimator, especially if the dimension of $\beta$ is larger than one. An even simpler estimator which avoids the preliminary step is the estimator $\beta_n^*$ with $\hat W = \hat\tau^2 I_{r_n}$, where $I_{r_n}$ is the $r_n \times r_n$ identity matrix and $\hat\tau^2 = \frac1n\sum_{j=1}^n (X_j - \bar X)^2$. This estimator reduces to

$\hat\beta_S = \arg\min_{n^{1/2}|b - \tilde\beta| \le C_n} \Big\|\frac1n\sum_{j=1}^n g_j(b)\Big\|^2\Big/\hat\tau^2 = \arg\min_{n^{1/2}|b - \tilde\beta| \le C_n} \Big\|\frac1n\sum_{j=1}^n g_j(b)\Big\|^2.$

Using arguments from the proof of Theorem 2, both estimators, $\hat\beta_n^{EE}$ and $\hat\beta_S$, can be shown to be efficient. In simulations the GMELE outperformed the alternative estimators $\hat\beta_n^{EE}$ and $\hat\beta_S$; see Table 1 in Section 5.

5 Simulations

Here we report the results of a small simulation study carried out to investigate the finite sample behavior of the GMELE (8) and the test from Remark 2. The simulations were carried out with the help of the R package. The R function optimize was used to locate the maximizers.
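To make the GMELE (8) and the estimator $\hat\beta_S$ of Remark 3 concrete, the following sketch replaces R's optimize by a simple grid search over the interval $\tilde\beta \pm c\sqrt{\log(n)/n}$, with the OLSE as preliminary estimator $\tilde\beta$; the data-generating choices are our own, and the Newton-based evaluation of $-2\log R_n(b)$ is a simplification rather than the authors' code:

```python
import numpy as np

def constraint_matrix(X, Y, b, r):
    """g_j(b) = (X_j - Xbar) v(F_b(Y_j - b X_j)) with the cosine basis (6)."""
    res = Y - b * X
    Fb = np.searchsorted(np.sort(res), res, side="right") / len(res)
    k = np.arange(1, r + 1)
    return (X - X.mean())[:, None] * np.sqrt(2.0) * np.cos(np.pi * np.outer(Fb, k))

def neg2_log_EL(g, iters=60):
    """-2 log R via Owen's dual: lam solves sum_j g_j/(1 + lam'g_j) = 0."""
    n, r = g.shape
    lam = np.zeros(r)
    for _ in range(iters):
        w = 1.0 + g @ lam
        gw = g / w[:, None]
        step = np.linalg.solve(gw.T @ gw + 1e-12 * np.eye(r), gw.sum(axis=0))
        t = 1.0
        while np.any(1.0 + g @ (lam + t * step) <= 1e-8):
            t /= 2.0                  # damp the step to keep all weights positive
        lam = lam + t * step
    return 2.0 * np.sum(np.log(1.0 + g @ lam))

def slope_estimators(X, Y, r=2, c=1.0):
    """GMELE (M): maximize R_n(b); beta_S (S): minimize ||mean_j g_j(b)||^2."""
    n = len(X)
    Xc = X - X.mean()
    beta_tilde = float(Xc @ Y / (Xc @ Xc))     # OLSE as preliminary estimator
    half = c * np.sqrt(np.log(n) / n)          # half-length of the search interval
    grid = np.linspace(beta_tilde - half, beta_tilde + half, 81)
    el = [neg2_log_EL(constraint_matrix(X, Y, b, r)) for b in grid]
    ee = [np.sum(constraint_matrix(X, Y, b, r).mean(axis=0) ** 2) for b in grid]
    return grid[int(np.argmin(el))], grid[int(np.argmin(ee))]

rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=n)
Y = 1.0 * X + rng.exponential(size=n)          # beta = 1, errors not centered
beta_M, beta_S = slope_estimators(X, Y)
```

Maximizing $R_n(b)$ is the same as minimizing $-2\log R_n(b)$, which is what the grid search above does; a one-dimensional optimizer could replace the grid.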

5.1 Comparing the GMELE with the competing estimators from Remark 3

For this study we used the full model with $\beta = 1$ and sample size $n = 100$. We worked with two error distributions and two covariate distributions. As error distributions we picked the mixture normal distribution $0.25\,N(-10, 1) + 0.5\,N(0, 1) + 0.25\,N(10, 1)$ and the skew normal distribution with location parameter zero, scale parameter 1 and skewness parameter 4. As covariate distributions we chose the standard normal distribution and the uniform distribution on $(-1, 3)$. Table 1 reports simulated mean squared errors of the estimators $\hat\beta_n^{EE}$, $\hat\beta_S$, and the GMELE, based on 2000 repetitions, for the choices $r_n = 1, \dots, 10$. We used the ordinary least squares estimator (OLSE) as preliminary estimator for the GMELE and $\hat\beta_S$, to specify the location of the search interval. As preliminary estimator for $\hat\beta_n^{EE}$ we used $\hat\beta_S$. We chose $2c\sqrt{\log(n)/n}$ as the length of the interval, with $c = 1$ for skew normal errors and $c = 10$ for the mixture normal errors. As can be seen from Table 1, the GMELE clearly outperforms the two competing approaches.

Table 1 Comparing the GMELE $\hat\beta_n$ (M) with $\hat\beta_S$ (S) and $\hat\beta_n^{EE}$ (EE) from Remark 3. Rows: the estimators S, EE and M under each of the four scenarios (mixture normal or skew normal error, combined with normal or uniform covariate); columns: $r_n = 1, \dots, 10$. [The numeric entries did not survive the text extraction.] The table entries are the simulated MSEs for the three estimators in the full model for sample size $n = 100$, based on 2000 repetitions.

5.2 Performance with missing data

Here we report on the performance of the GMELE and the OLSE with missing data. We again used the model $Y = \beta X + \varepsilon$ with $\beta = 1$ and chose

$\pi(x) = P(\delta = 1 \mid X = x) = 1/(1 + d\exp(x))$

with $d = 0$, $d = 0.1$ and $d = 0.5$ to produce different missingness rates. Note that $d = 0$ corresponds to the full model.

Table 2 Simulated MSEs for the OLSE and the GMELE with missing data. Rows: sample sizes $n = 70$ and $n = 140$ with the average missingness rates (MR) produced by $d = 0, 0.1, 0.5$, under each of the four error/covariate scenarios; columns: MR, the OLSE, and the GMELE with $r_N = 1, \dots, 10$. [The numeric entries did not survive the text extraction.] The table entries are simulated mean squared errors for mixture normal errors, and 10 times the simulated mean squared errors for skew normal errors.

We used the same error and covariate distributions as before and worked with the search interval $\tilde\beta_{N,c} \pm c_N\sqrt{\log(N)/N}$ based on the complete case version of the OLSE. We chose $c_N = 1$ for the skew normal errors and $c_N = 10$ for the mixture normal errors. The reported results are based on samples of size $n = 70$ and 140, $r_N = 1, \dots, 10$ basis functions and 2000 repetitions. Table 2 reports simulated mean squared errors of the OLSE and the GMELE for $r_N = 1, \dots, 10$. The mean squared errors are multiplied by 10 for skew normal errors. We also list the average missingness rates (MR). The GMELE performs in most cases much better (smaller MSEs) than the OLSE, except in some of the small samples. The results for the scenario

with uniform covariates are better than the corresponding figures for standard normal covariates. The mean squared errors for the skew normal errors are even better than those for the mixture normal errors.

5.3 Behavior for errors without finite Fisher information

A different scenario is considered in Table 3, namely when the errors are from an exponential distribution. Since the exponential distribution has no finite Fisher information for location it does not fit into our theory, but it still demonstrates superior performance of the GMELE over the OLSE.

Table 3 Simulated MSEs for exponential errors. Rows: sample sizes $n = 70$ and $n = 140$ with the average missingness rates (MR), for normal and uniform covariates; columns: MR, the OLSE, and the GMELE with $r_N = 1, \dots, 5$. [The numeric entries did not survive the text extraction.] The table entries are the MSEs for $r_N = 1, \dots, 5$ constraints when the errors are from an exponential distribution (no finite Fisher information).

Remark 4 The choice of the number of basis functions $r_n$ (and $r_N$) does affect the performance of the GMELE. This suggests using a data-driven choice. One possibility is the approach of Peng and Schick (2005, Section 5.1), who used the bootstrap to select $r_n$ in a related setting, with convincing results. The idea is to compute the bootstrap mean squared errors of the estimator (the GMELE in our case) for different values of $r_n$, say for $r_n = 1, \dots, 10$, and then select the $r_n$ with the minimum bootstrap mean squared error.

5.4 Comparison of two tests

We performed a small study comparing the empirical likelihood test about the slope from Remark 2 and the corresponding bootstrap test, which uses

resampling instead of the $\chi^2$ approximation to obtain critical values. The null hypothesis is $\beta = \beta_0 = 1$ and the nominal level is 0.05. As in Table 1 we consider only the full model and the sample size $n = 100$. Table 4 reports the simulated significance level and power of the two tests, using $r_n = 1, 2, \dots, 5$ basis functions. The covariates X and the errors $\varepsilon$ were generated from the same distributions as before. The bootstrap resample size was taken to be the same as the sample size (i.e. $n = 100$), while we used more repetitions than before: in order to stabilize the results obtained by the bootstrap method we worked with 10,000 repetitions.

Our simulations indicate that the results based on the $\chi^2$ approximation (denoted by $\chi^2$) are much more reliable than the results of the bootstrap approach (denoted by B). For $r_n \ge 3$ the bootstrapped significance levels are far away from the nominal level 5%: they are between 11% and 60%, i.e. the test is far too liberal, which is in contrast to the $\chi^2$ approach. The significance levels for $r_n = 1, 2$ are reasonable for both tests. In terms of power the bootstrap test is better than the $\chi^2$ test in the upper table with normal covariates; for uniform covariates it is the other way round.

Table 4 Simulated significance level and power of the empirical likelihood test about the slope using $\chi^2$ and bootstrap quantiles. Rows: the $\chi^2$ and bootstrap (B) versions of the test under $\beta = 1.0$ (significance level) and $\beta = 1.2$ (power), for normal and uniform covariates combined with mixture normal and skew normal errors; columns: $r_n = 1, \dots, 5$. [The numeric entries did not survive the text extraction.] The table shows simulated significance level and power figures of the empirical likelihood test with null hypothesis $\beta = 1$ at the nominal level $\alpha = 0.05$. We consider the full model; the sample size is $n = 100$. The test uses approximate $\chi^2$ quantiles ($\chi^2$) and bootstrap quantiles (B).

6 Proofs

This section contains the proofs of Theorem 1 (given in the first subsection) and of Theorem 2. The proof of the uniform expansion that is provided in Theorem 2 is split into three parts.
In Subsection 6.2 we give six conditions and show that they are sufficient for the expansion. That the conditions are indeed satisfied is shown separately in Subsections 6.3 and 6.4. Subsection 6.5

contains an auxiliary result. As explained in the introduction, we only need to prove the results for the full model, i.e., the case when $\pi(X)$ equals one.

6.1 Proof of Theorem 1

Let $\mu$ denote the mean and $\tau$ the standard deviation of X. We should point out that $R_n(b)$ does not change if we replace $(X_j - \bar X)$ by $(X_j - \bar X)/\tau = V_j - \bar V$, where

$V_j = \frac{X_j - \mu}{\tau} \quad\text{and}\quad \bar V = \frac1n\sum_{j=1}^n V_j.$

Thus, for the purpose of our proofs, we may assume that $R_n(b)$ is given by

$R_n(b) = \sup\Big\{\prod_{j=1}^n n\pi_j : \pi \in \mathscr P_n,\ \sum_{j=1}^n \pi_j (V_j - \bar V)\, v_n(F_b(Y_j - bX_j)) = 0\Big\}.$

In what follows we shall repeatedly use the bounds $\|v_n(y)\|^2 \le 2r_n$, $\|v_n'(y)\|^2 \le 2\pi^2 r_n^3$, and $\|v_n''(y)\|^2 \le 2\pi^4 r_n^5$ for all real y. Let us set

$Z_j = V_j\, v_n(F(\varepsilon_j)) \quad\text{and}\quad \hat Z_j = (V_j - \bar V)\, v_n(F_\beta(\varepsilon_j)), \qquad j = 1, \dots, n.$

With $Z = Z_1$, we find the identities $E[Z] = 0$ and $E[ZZ^\top] = I_{r_n}$, where $I_{r_n}$ is the $r_n \times r_n$ identity matrix, and the bound $E[\|Z\|^4] \le (2r_n)^2 E[V^4] = O(r_n^2)$. As shown in Peng and Schick (2013), these results yield

$\bar Z_n = \frac{1}{\sqrt n}\sum_{j=1}^n Z_j = O_P(r_n^{1/2}) \qquad (9)$

and

$\sup_{\|u\|=1}\Big|\frac1n\sum_{j=1}^n (u^\top Z_j)^2 - 1\Big| \le \Big\|\frac1n\sum_{j=1}^n Z_jZ_j^\top - I_{r_n}\Big\| = O_P(r_n n^{-1/2}). \qquad (10)$

From Corollary 7.6 in Peng and Schick (2013) and $r_n^4 = o(n)$, the desired result follows if we verify

$\Big\|\frac{1}{\sqrt n}\sum_{j=1}^n (\hat Z_j - Z_j)\Big\| = o_P(1) \quad\text{and}\quad \frac1n\sum_{j=1}^n \|\hat Z_j - Z_j\|^2 = o_P(1/r_n).$

Let

$\Delta_j = v_n(F_\beta(\varepsilon_j)) - v_n(F(\varepsilon_j)), \qquad j = 1, \dots, n.$

In view of the identity $\hat Z_j - Z_j = V_j\Delta_j - \bar V v_n(F_\beta(\varepsilon_j))$, the bound $\|v_n\|^2 \le 2r_n$, and the fact $n^{1/2}\bar V = O_P(1)$, it is easy to see that the desired results follow from the following rates:

$S_1 = \Big\|\frac1n\sum_{j=1}^n V_j\Delta_j\Big\| = O_P(r_n^{3/2} n^{-1}),$

$S_2 = \Big\|\frac1n\sum_{j=1}^n \Delta_j\Big\| = O_P(r_n^{3/2} n^{-1/2}),$

$S_3 = \Big\|\frac1n\sum_{j=1}^n v_n(F(\varepsilon_j))\Big\| = O_P(r_n^{1/2} n^{-1/2}),$

$S_4 = \frac1n\sum_{j=1}^n V_j^2\|\Delta_j\|^2 = O_P(r_n^3 n^{-1}).$

Note that $\Delta_1, \dots, \Delta_n$ are functions of the errors $\varepsilon_1, \dots, \varepsilon_n$ only and satisfy

$M_n = \max_{1\le j\le n}\|\Delta_j\|^2 \le 2\pi^2 r_n^3 \sup_{t\in\mathbb R}|F_\beta(t) - F(t)|^2 = O_P(r_n^3/n).$

Conditioning on the errors thus yields

$E[S_1^2 \mid \varepsilon_1, \dots, \varepsilon_n] = \frac1n E[S_4 \mid \varepsilon_1, \dots, \varepsilon_n] \le \frac{M_n}{n}.$

This establishes the rates for $S_1$ and $S_4$. The other rates follow from $S_2^2 \le M_n$ and $E[S_3^2] = n^{-1}E[\|v_n(F(\varepsilon))\|^2] = r_n/n$.

6.2 Proof of Theorem 2

For t in $\mathbb R$, we let $\hat F_t = F_{\beta + n^{-1/2}t}$ and note that $\hat F_t$ is the empirical distribution function of the random variables $\varepsilon_{jt} = \varepsilon_j - n^{-1/2}tX_j$, $j = 1, \dots, n$. These random variables are independent with common distribution function $F_t$ given by

$F_t(y) = E[\hat F_t(y)] = E[F(y + n^{-1/2}tX)], \qquad y \in \mathbb R.$

To simplify notation we introduce $\hat R_{jt} = \hat F_t(\varepsilon_{jt})$, $R_{jt} = F_t(\varepsilon_{jt})$, $R_j = F(\varepsilon_j)$, and

$\hat Z_{jt} = (V_j - \bar V)\, v_n(\hat R_{jt}), \quad Z_{jt} = V_j\, v_n(R_{jt}), \quad Z_j = V_j\, v_n(R_j).$

Since we are working with the form of the empirical likelihood given in the previous section, we have

$R_n(\beta + n^{-1/2}t) = \sup\Big\{\prod_{j=1}^n n\pi_j : \pi \in \mathscr P_n,\ \sum_{j=1}^n \pi_j \hat Z_{jt} = 0\Big\}, \qquad t \in \mathbb R.$

Fix a sequence $C_n$ such that $C_n \ge 1$ and $C_n = O((\log n)^{1/2})$. The desired result follows if we verify the uniform expansion

$\sup_{|t|\le C_n}\frac{\big|-2\log R_n(\beta + n^{-1/2}t) - \|\bar Z_n\|^2 + 2t\Gamma_n - t^2\tau^2 J_f\big|}{(1 + |t|)^2} = o_P(1) \qquad (11)$

with $\bar Z_n$ as in (9). To verify (11) we introduce

$\nu_n = E[XV\ell_f(\varepsilon)\, v_n(F(\varepsilon))] = \tau E[\ell_f(\varepsilon)\, v_n(F(\varepsilon))].$

We shall establish (11) by verifying the following six conditions:

$\sup_{|t|\le C_n}\sup_{\|u\|=1}\Big|\frac1n\sum_{j=1}^n (u^\top\hat Z_{jt})^2 - 1\Big| = o_P(1/r_n), \qquad (12)$

$\sup_{|t|\le C_n}\Big\|\frac{1}{\sqrt n}\sum_{j=1}^n (\hat Z_{jt} - Z_{jt})\Big\| = o_P(r_n^{-1/2}), \qquad (13)$

$\sup_{|t|\le C_n}\Big\|\frac{1}{\sqrt n}\sum_{j=1}^n \big(Z_{jt} - Z_j - E[Z_{jt} - Z_j]\big)\Big\| = o_P(r_n^{-1/2}), \qquad (14)$

$\sup_{|t|\le C_n}\big\|n^{1/2}E[Z_{1t} - Z_1] + t\nu_n\big\| = o(r_n^{-1/2}), \qquad (15)$

$\lim_{n\to\infty}\|\nu_n\|^2 = \tau^2 J_f, \qquad (16)$

$\nu_n^\top\bar Z_n - \Gamma_n = \frac{1}{\sqrt n}\sum_{j=1}^n \big[\nu_n^\top Z_j - (X_j - \mu)\ell_f(\varepsilon_j)\big] = o_P(1). \qquad (17)$

These six conditions are proved in the next two subsections. We first establish their sufficiency.

Lemma 1 The conditions (12)-(17) imply (11).

To prove this lemma, we use the following result, which is a special case of Lemma 5.2 in Peng and Schick (2013). This version was used in Schick (2013).

Lemma 2 Let $x_1, \dots, x_n$ be m-dimensional vectors. Set

$\bar x = \frac1n\sum_{j=1}^n x_j, \quad x_* = \max_{1\le j\le n}\|x_j\|, \quad \nu_4 = \frac1n\sum_{j=1}^n \|x_j\|^4, \quad S = \frac1n\sum_{j=1}^n x_jx_j^\top,$

and let $\lambda$ denote the smallest and $\Lambda$ the largest eigenvalue of the matrix S. Then the inequality $\lambda > 5x_*\|\bar x\|$ implies

$\big|{-2\log R} - n\,\bar x^\top S^{-1}\bar x\big| \le \frac{n\|\bar x\|^3(\Lambda\nu_4)^{1/2}}{(\lambda - x_*\|\bar x\|)^3} + \frac{4n\Lambda^2\|\bar x\|^4\nu_4}{\lambda^2(\lambda - x_*\|\bar x\|)^4}$

with

$R = \sup\Big\{\prod_{j=1}^n n\pi_j : \pi \in \mathscr P_n,\ \sum_{j=1}^n \pi_j x_j = 0\Big\}.$

Proof of Lemma 1 We introduce

$T_n(t) = \frac{1}{\sqrt n}\sum_{j=1}^n \hat Z_{jt} \quad\text{and}\quad S_n(t) = \frac1n\sum_{j=1}^n \hat Z_{jt}\hat Z_{jt}^\top,$

and let $\lambda_n(t)$ and $\Lambda_n(t)$ denote the smallest and largest eigenvalues of $S_n(t)$,

$\lambda_n(t) = \inf_{\|u\|=1} u^\top S_n(t)u = \inf_{\|u\|=1}\frac1n\sum_{j=1}^n (u^\top\hat Z_{jt})^2 \quad\text{and}\quad \Lambda_n(t) = \sup_{\|u\|=1} u^\top S_n(t)u = \sup_{\|u\|=1}\frac1n\sum_{j=1}^n (u^\top\hat Z_{jt})^2.$

By (12), we have

$\sup_{|t|\le C_n}|\lambda_n(t) - 1| = o_P(1) \quad\text{and}\quad \sup_{|t|\le C_n}|\Lambda_n(t) - 1| = o_P(1).$

The conditions (13)-(15) imply

$\sup_{|t|\le C_n}\|T_n(t) - \bar Z_n + t\nu_n\| = o_P(r_n^{-1/2}). \qquad (18)$

This, (9) and (16) yield

$\sup_{|t|\le C_n}\|T_n(t)\|^2 = O_P(r_n). \qquad (19)$

Next, we find

$\max_{1\le j\le n}\|\hat Z_{jt}\| \le (2r_n)^{1/2}\max_{1\le j\le n}|V_j - \bar V| = o_P(r_n^{1/2}n^{1/4})$

and

$\frac1n\sum_{j=1}^n \|\hat Z_{jt}\|^4 \le (2r_n)^2\,\frac1n\sum_{j=1}^n (V_j - \bar V)^4 = O_P(r_n^2).$

Thus we derive

$\sup_{|t|\le C_n}\big|-2\log R_n(\beta + n^{-1/2}t) - T_n(t)^\top (S_n(t))^{-1} T_n(t)\big| = o_P(1), \qquad (20)$

since by Lemma 2 the left-hand side is of order $O_P(r_n^{5/2}n^{-1/2} + r_n^4 n^{-1})$. For a positive definite matrix A and a compatible vector x, we have

$\big|x^\top A^{-1}x - x^\top x\big| \le x^\top A^{-1}x\ \sup_{\|u\|=1}\big|1 - u^\top Au\big| \le \lambda^{-1}\|x\|^2\,\sup_{\|u\|=1}\big|1 - u^\top Au\big|,$

with $\lambda$ the smallest eigenvalue of A. Using this, (12) and (19) we derive

$\sup_{|t|\le C_n}\big|T_n(t)^\top(S_n(t))^{-1}T_n(t) - T_n(t)^\top T_n(t)\big| = o_P(1). \qquad (21)$

With the help of (9), (16) and (18) we verify

$\sup_{|t|\le C_n}\frac{\big|\|T_n(t)\|^2 - \|\bar Z_n\|^2 + 2t\,\nu_n^\top\bar Z_n - t^2\|\nu_n\|^2\big|}{(1 + |t|)^2} = o_P(1). \qquad (22)$

The results (20)-(22) yield the expansion

$\sup_{|t|\le C_n}\frac{\big|-2\log R_n(\beta + n^{-1/2}t) - \|\bar Z_n\|^2 + 2t\,\nu_n^\top\bar Z_n - t^2\|\nu_n\|^2\big|}{(1 + |t|)^2} = o_P(1).$

From (16) and (17) we derive the expansion

$\sup_{|t|\le C_n}\frac{\big|2t(\nu_n^\top\bar Z_n - \Gamma_n) - t^2(\|\nu_n\|^2 - \tau^2 J_f)\big|}{(1 + |t|)^2} = o_P(1).$

The desired result (11) follows from the last two expansions.

6.3 Proofs of (14)-(17)

We begin by mentioning properties of f and F that are crucial to the proofs. Since f has finite Fisher information for location, we have

$\int |f(y + t) - f(y + s)|\,dy \le B_1|t - s|, \qquad (23)$

$|F(t) - F(s)| \le B_1|t - s|, \qquad (24)$

$|F(t + s) - F(t) - sf(t)| \le B_2|s|^{3/2}, \qquad (25)$

$\int |F(y + s) - F(y) - sf(y)|\,dy \le B_1 s^2 \qquad (26)$

for all real s and t, and some constants $B_1$ and $B_2$; see, e.g., Peng and Schick (2016).

Next, we look at the process

H_n(t) = n^{-1/2} Σ_{j=1}^n (h_t(X_j, Y_j) - E[h_t(X, Y)]), t ∈ R,

where the h_t are measurable functions from R^2 to R^m such that h_0 = 0. We are interested in the cases m = 1 and m = r_n. A version of the following lemma was used in Peng and Schick (2016).

Lemma 3 Suppose that the map t ↦ h_t(x, y) is continuous for all x and y in R and

E[ |h_t(X, Y) - h_s(X, Y)|^2 ] ≤ K_n (t - s)^2, s, t ∈ R,   (27)

for positive constants K_n. Then we have the rate sup_{|t| ≤ C_n} |H_n(t)| = O_P(C_n K_n^{1/2}).

Proof of (14) The desired result follows from Lemma 3 applied with

h_t(X, Y) = V [v_n(F_t(ε - n^{-1/2} t X)) - v_n(F(ε))], t ∈ R,

and K_n = 2π^2 r_n^3 B_1^2 E[V_2^2 (X_1 - X_2)^2] / n. Indeed, we have h_0 = 0 and (27) in view of (24). Note also that r_n C_n^2 K_n → 0.

Proof of (15) Since V and ε are independent and V has mean zero, we obtain the identity

n^{1/2} E[Z_{1t} - Z_1] + t ν_n = n^{1/2} E[V_1 v_n(F_t(ε_{1t}))] + t ν_n = n^{1/2} (Δ_1(t) + Δ_2(t))

with

Δ_1(t) = E[ V ∫ [v_n(F_t(y)) - v_n(F(y))] [f(y + n^{-1/2} t X) - f(y)] dy ]

and

Δ_2(t) = E[ V ∫ v_n(F(y)) [f(y + n^{-1/2} t X) - f(y) - n^{-1/2} t X f'(y)] dy ].

It follows from (23) and (24) that

|Δ_1(t)| ≤ (2π^2 r_n^3)^{1/2} B_1 E[|X|] B_1 E[|V| |X|] t^2 / n.

Integration by parts shows that

Δ_2(t) = -E[ V ∫ v_n'(F(y)) f(y) [F(y + n^{-1/2} t X) - F(y) - n^{-1/2} t X f(y)] dy ].

It follows from (24) that f is bounded by B_1. This and (26) yield the bound

|Δ_2(t)| ≤ (2π^2 r_n^3)^{1/2} B_1^2 E[|V| X^2] t^2 / n.

From these bounds we conclude

sup_{|t| ≤ C_n} | n^{1/2} E[Z_{1t} - Z_1] + t ν_n | = O(r_n^{3/2} (log n) n^{-1/2}) = o(r_n^{-1/2}),

which is the desired (15).

Proof of (16) and (17) Note that ν_n can be written as

ν_n = E[X l_f(ε) V v_n(F(ε))] = τ E[l_f(ε) v_n(F(ε))].

The functions V φ_1(F(ε)), V φ_2(F(ε)), ... form an orthonormal basis of the space V = {V a(ε) : a ∈ L_{2,0}(F)}. Thus ν_n is the vector consisting of the first r_n Fourier coefficients of (X - µ) l_f(ε) = τ V l_f(ε) with respect to this basis. Because (X - µ) l_f(ε) is a member of V, Parseval's theorem yields

|ν_n|^2 → E[((X - µ) l_f(ε))^2] = τ^2 J_f and E[(ν_n^T V v_n(F(ε)) - (X - µ) l_f(ε))^2] → 0.

The former is (16) and the latter implies (17).

6.4 Proofs of (12) and (13)

We begin by deriving properties of R̂_{jt} and R_{jt} which we need in the proofs of (12) and (13). For this we introduce the leave-one-out version R̃_{jt} of R̂_{jt} defined by

R̃_{jt} = (n - 1)^{-1} Σ_{i: i ≠ j} 1[ε_{it} ≤ ε_{jt}] = n (n - 1)^{-1} R̂_{jt} - (n - 1)^{-1} 1[ε_{jt} ≤ ε_{jt}],

which satisfies

|R̂_{jt} - R̃_{jt}| ≤ 2/n.   (28)

We abbreviate R̃_{j0} by R̃_j. In the ensuing arguments we rely on the following properties of these quantities, where B_1 and B_2 are the constants appearing in (24) and (25):

max_{1 ≤ j ≤ n} sup_{|t| ≤ C_n} |R̃_{jt} - R_{jt} - R̃_j + R_j| = O_P(n^{-5/8} (C_n log n)^{1/2}),   (29)
max_{1 ≤ j ≤ n} |R̃_j - R_j| = O_P(n^{-1/2}),   (30)
sup_{|t| ≤ C_n} |R_{jt} - R_j| ≤ B_1 C_n n^{-1/2} (|X_j| + E[|X|]),   (31)
sup_{|t| ≤ C_n} |R_{jt} - R_j + n^{-1/2} t (X_j - µ) f(ε_j)| ≤ B_2 C_n^{3/2} n^{-3/4} 2 (|X_j|^{3/2} + E[|X|^{3/2}]).   (32)
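Stepping back to the proof of (16) and (17) above: the Parseval argument can be illustrated numerically with the trigonometric basis φ_k(u) = √2 cos(kπu), which is orthonormal on [0, 1] and spans the mean-zero functions. The toy function g below is an arbitrary mean-zero stand-in (not the score function of the paper); its squared Fourier coefficients should sum to its squared L_2 norm:

```python
import numpy as np

u = np.linspace(0.0, 1.0, 20001)
h = u[1] - u[0]

def integral(vals):
    # composite trapezoid rule on [0, 1]
    return h * (vals[0] / 2 + vals[1:-1].sum() + vals[-1] / 2)

g = u - 0.5                      # a mean-zero function in L_{2,0}[0, 1]
norm2 = integral(g * g)          # exact value is 1/12

coeffs = []
for k in range(1, 51):
    phi_k = np.sqrt(2.0) * np.cos(k * np.pi * u)   # orthonormal cosine basis
    coeffs.append(integral(g * phi_k))
partial = np.cumsum(np.square(coeffs))

# Bessel: partial sums increase; Parseval: they converge to the squared norm
assert np.all(np.diff(partial) >= -1e-12)
assert abs(partial[-1] - norm2) < 1e-4
```

The same monotone convergence of |ν_n|^2 to the full squared norm τ^2 J_f is exactly what (16) asserts for the basis V φ_k(F(ε)).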

The second statement follows from properties of the empirical distribution function and the last two statements from (24) and (25), respectively. To prove (29) we use Lemma 4 from Subsection 6.5. Let ζ_j(t) = R̃_{jt} - R_{jt} - R̃_j + R_j and m = n - 1. These random variables are identically distributed, and (n - 1) ζ_n(t) equals Ñ(n^{-1/2} t, X_n, ε_n) from the beginning of Subsection 6.5, with the role of Y_i played by ε_i. Lemma 4 gives

P( max_{1 ≤ j ≤ n} sup_{|t| ≤ C_n} |ζ_j(t)| > 4 K C_n^{1/2} (n - 1)^{-5/8} (log(n - 1))^{1/2} )
≤ n P( sup_{|t| ≤ C_n} |ζ_n(t)| > 4 K C_n^{1/2} m^{-5/8} (log m)^{1/2} )
≤ n P(|X| > m^{1/4}) + n E[ 1[|X| ≤ m^{1/4}] p_m(ε, C_n, K) ]
≤ 2 E[X^4 1[|X| > m^{1/4}]] + C n^2 exp(-K log m)

for m > 2 and K > 6 B_1 (1 + E[|X|]) and some constant C. The desired (29) is now immediate. Note that statements (28)-(31) yield the bounds

sup_{|t| ≤ C_n} |R̂_{jt} - R_j| ≤ B_1 C_n n^{-1/2} (|X_j| + E[|X|]) + n^{-1/2} ξ_n, j = 1, ..., n,   (33)

which we need for the next proof. Here ξ_n is a positive random variable which satisfies ξ_n = O_P(1).

Proof of (12) Given (10) and the properties of r_n, it suffices to verify

sup_{|u| = 1, |t| ≤ C_n} | n^{-1} Σ_{j=1}^n (u^T Ẑ_{jt})^2 - n^{-1} Σ_{j=1}^n (u^T Z_j)^2 | = o_P(1/r_n).   (34)

Using the Cauchy-Schwarz inequality we bound the left-hand side of (34) by 2 (D_n Λ_n)^{1/2} + D_n with

Λ_n = sup_{|u| = 1} n^{-1} Σ_{j=1}^n (u^T Z_j)^2 and D_n = n^{-1} Σ_{j=1}^n sup_{|t| ≤ C_n} |Ẑ_{jt} - Z_j|^2.

Given (10), it therefore suffices to prove D_n = o_P(1/r_n^2). This follows from (33), the inequality

D_n ≤ n^{-1} Σ_{j=1}^n ( 2 V̄^2 sup_{|t| ≤ C_n} |v_n(R̂_{jt})|^2 + 2 V_j^2 sup_{|t| ≤ C_n} |v_n(R̂_{jt}) - v_n(R_j)|^2 )
≤ 4 r_n V̄^2 + 4π^2 r_n^3 n^{-1} Σ_{j=1}^n V_j^2 sup_{|t| ≤ C_n} |R̂_{jt} - R_j|^2 = O_P(r_n^3 C_n^2 / n),

and the rate r_n^5 log n = o(n).
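The rank quantities that drive the proofs of (12) and (13) are easy to sanity-check by simulation. The sketch below (a toy check at t = 0 with standard normal errors; variable names are ours) verifies the leave-one-out bound (28) exactly and illustrates the n^{-1/2} rate in (30):

```python
import numpy as np
from math import erf

rng = np.random.default_rng(0)
n = 500
eps = rng.standard_normal(n)

# R-hat_j: empirical distribution function evaluated at eps_j
R_hat = (eps[None, :] <= eps[:, None]).mean(axis=1)

# leave-one-out version: drop the (always true) term 1[eps_j <= eps_j]
R_tilde = ((eps[None, :] <= eps[:, None]).sum(axis=1) - 1) / (n - 1)

# (28): |R-hat_j - R-tilde_j| <= 2/n, here in fact (1 - R-hat_j)/(n-1) <= 1/n
assert np.all(np.abs(R_hat - R_tilde) <= 2.0 / n)

# (30): max_j |R-tilde_j - F(eps_j)| is of order n^{-1/2} (DKW-type behavior)
Fvals = 0.5 * (1.0 + np.array([erf(x / np.sqrt(2.0)) for x in eps]))
dev = np.max(np.abs(R_tilde - Fvals))
assert dev < 5.0 / np.sqrt(n)
```

This only illustrates the orders of magnitude; the proofs above of course establish (28)-(32) uniformly in t.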

Proof of (13) In view of the rate V̄ = O_P(n^{-1/2}) and the identity

Ẑ_{jt} - Z_{jt} = V_j (v_n(R̂_{jt}) - v_n(R_{jt})) - V̄ (v_n(R̂_{jt}) - v_n(R_j)) - V̄ v_n(R_j),

the desired (13) is implied by the following three statements:

| n^{-1/2} Σ_{j=1}^n v_n(R_j) | = O_P(r_n^{1/2}),   (35)
sup_{|t| ≤ C_n} | n^{-1/2} Σ_{j=1}^n (v_n(R̂_{jt}) - v_n(R_j)) | = O_P(C_n r_n^{3/2}),   (36)
sup_{|t| ≤ C_n} | n^{-1/2} Σ_{j=1}^n V_j [v_n(R̂_{jt}) - v_n(R_{jt})] | = o_P(r_n^{-1/2}).   (37)

We obtain (35) from E[v_n(F(ε))] = 0 and E[|v_n(F(ε))|^2] = r_n. Also, (36) follows from (33) and the fact that its left-hand side is bounded by

(2π^2 r_n^3)^{1/2} n^{-1/2} Σ_{j=1}^n sup_{|t| ≤ C_n} |R̂_{jt} - R_j|.

Using (28) we find

sup_{|t| ≤ C_n} | n^{-1/2} Σ_{j=1}^n V_j [v_n(R̂_{jt}) - v_n(R̃_{jt})] | = O_P(r_n^{3/2} n^{-1/2}).

Taylor expansions, the bound |v_n'''|^2 ≤ 2π^6 r_n^7 and equations (28), (31) and (33) show that

n^{-1/2} Σ_{j=1}^n V_j [v_n(R̃_{jt}) - v_n(R_j) - v_n'(R_j)(R̃_{jt} - R_j) - (1/2) v_n''(R_j)(R̃_{jt} - R_j)^2]

and

n^{-1/2} Σ_{j=1}^n V_j [v_n(R_{jt}) - v_n(R_j) - v_n'(R_j)(R_{jt} - R_j) - (1/2) v_n''(R_j)(R_{jt} - R_j)^2]

are of order O_P(r_n^{7/2} C_n^3 n^{-1}). Using the identity

(a + b + c)^2 - a^2 - b^2 + 2db = c^2 + 2(a + d)b + 2(a + b)c

with a = R_{jt} - R_j, b = R̃_j - R_j, c = R̃_{jt} - R_{jt} - R̃_j + R_j and d = n^{-1/2} t (X_j - µ) f(ε_j) = n^{-1/2} t τ V_j f(ε_j), and then the properties (29)-(32), we derive the bounds

sup_{|t| ≤ C_n} | (R̃_{jt} - R_j)^2 - (R_{jt} - R_j)^2 - (R̃_j - R_j)^2 + 2 n^{-1/2} t τ V_j f(ε_j)(R̃_j - R_j) | ≤ ζ_n (1 + |X_j|)^{3/2}, j = 1, ..., n,

with ζ_n = O_P(n^{-9/8} C_n^{3/2} (log n)^{1/2}). It follows from the above that the left-hand side of (37) is bounded by T_1/2 + C_n τ T_2 + T_3 + T_4, where

T_1 = | n^{-1/2} Σ_{j=1}^n V_j v_n''(R_j)(R̃_j - R_j)^2 |,
T_2 = | n^{-1} Σ_{j=1}^n V_j^2 v_n''(R_j) f(ε_j)(R̃_j - R_j) |,
T_3 = sup_{|t| ≤ C_n} | n^{-1/2} Σ_{j=1}^n V_j v_n'(R_j)(R̃_{jt} - R_{jt}) |,
T_4 = O_P(r_n^{3/2} n^{-1/2} + r_n^{7/2} C_n^3 n^{-1} + r_n^{5/2} n^{-5/8} C_n^{3/2} (log n)^{1/2}) = o_P(r_n^{-1/2}).

We calculate

E[ |T_1|^2 | ε_1, ..., ε_n ] = n^{-1} Σ_{j=1}^n |v_n''(R_j)|^2 (R̃_j - R_j)^4 = O_P(r_n^5 n^{-2}).

Thus T_1 = o_P(r_n^{-1/2}). Next, we write T_2 as the vector U-statistic

T_2 = (n(n - 1))^{-1} Σ_{i ≠ j} V_j^2 v_n''(F(ε_j)) f(ε_j)(1[ε_i ≤ ε_j] - F(ε_j))

and obtain

E[ |T_2|^2 ] ≤ E[|k(ε)|^2] / n + 2 E[V_2^4 |v_n''(F(ε_2))|^2 f^2(ε_2)(1[ε_1 ≤ ε_2] - F(ε_2))^2] / (n(n - 1))

with k(x) = E[v_n''(F(ε)) f(ε)(1[x ≤ ε] - F(ε))]. Using the representation f(y) = ∫ 1[y ≤ z] l_f(z) f(z) dz and Fubini's theorem, we calculate

k(x) = ∫ v_n''(F(y)) f(y)(1[x ≤ y] - F(y)) f(y) dy
     = ∫_x^∞ (v_n'(F(z)) - v_n'(F(x))) l_f(z) f(z) dz - ∫ [v_n'(F(z)) F(z) - v_n(F(z))] l_f(z) f(z) dz.

Thus k is bounded by a constant times r_n^{3/2} and we see that E[|T_2|^2] = O(r_n^3 / n + r_n^5 / n^2). This proves C_n T_2 = o_P(r_n^{-1/2}). We bound T_3 by the sum T_31 + T_32 + T_33 where

T_31 = sup_{|t| ≤ C_n} | n^{-1/2} Σ_{j=1}^n W_j v_n'(R_j)(R̃_{jt} - R_{jt}) |,

T_32 = sup_{|t| ≤ C_n} | n^{-1/2} Σ_{j=1}^n V_j 1[|V_j| > n^{1/4}] v_n'(R_j)(R̃_{jt} - R_{jt}) |,
T_33 = sup_{|t| ≤ C_n} | n^{-1/2} Σ_{j=1}^n E[V 1[|V| > n^{1/4}]] v_n'(R_j)(R̃_{jt} - R_{jt}) |,

and W_j = V_j 1[|V_j| ≤ n^{1/4}] - E[V 1[|V| ≤ n^{1/4}]]. Since V has a finite fourth moment, we obtain the rates max_{1 ≤ j ≤ n} |V_j| = o_P(n^{1/4}) and

E[|V| 1[|V| > n^{1/4}]] ≤ n^{-3/4} E[V^4 1[|V| > n^{1/4}]] = o(n^{-3/4}).

Thus we find P(T_32 > 0) ≤ P(max_{1 ≤ j ≤ n} |V_j| > n^{1/4}) → 0 and T_33 = o_P(n^{-3/4} r_n^{3/2}), using (29) and (30). This shows T_32 + T_33 = o_P(r_n^{-1/2}). To deal with T_31 we express it as

T_31 = sup_{|t| ≤ C_n} | n^{-1/2} (n - 1)^{-1} Σ_{i ≠ j} W_j v_n'(F(ε_j))(1[ε_{it} ≤ ε_{jt}] - F_t(ε_{jt})) |.

Let us set

k_t(z) = E[W v_n'(F(z + n^{-1/2} t X))], z ∈ R.

Using (24) we obtain the bound

E[ |k_t(ε_{jt}) - k_s(ε_{js})|^2 ] ≤ 2π^2 r_n^3 B_1^2 E[W^2 (X_j - X)^2] (t - s)^2 / n

and derive with the help of Lemma 3

sup_{|t| ≤ C_n} | n^{-1/2} Σ_{j=1}^n (k_t(ε_{jt}) - E[k_t(ε_{jt})]) | = O_P(r_n^{3/2} C_n n^{-1/2}).

We therefore obtain the rate T_31 = o_P(r_n^{-1/2}), if we verify

sup_{|t| ≤ C_n} |U(t)| = O_P(r_n^{3/2} n^{-1} log n),   (38)

where U(t) is the vector U-statistic equaling

(n(n - 1))^{-1} Σ_{i ≠ j} [ W_j v_n'(F(ε_j))(1[ε_{it} ≤ ε_{jt}] - F_t(ε_{jt})) - k_t(ε_{it}) + E[k_t(ε_{it})] ].

It is easy to verify that U(t) is degenerate. Let t_k = -C_n + 2kC_n/n, k = 0, ..., n. Then we have

sup_{|t| ≤ C_n} |U(t)| ≤ max_{1 ≤ k ≤ n} |U(t_k)| + max_{1 ≤ k ≤ n} sup_{t_{k-1} ≤ t ≤ t_k} |U(t) - U(t_k)|.   (39)

For t ∈ [t_{k-1}, t_k], we find

|U(t) - U(t_k)| ≤ (2π^2 r_n^3)^{1/2} ( 2 n^{1/4} (N_k^+ + N_k^-) ) + 2 B_1 C_n n^{-3/2} S_n   (40)

with

S_n = n^{-1} Σ_{j=1}^n ( |W_j| (|X_j| + E[|X|]) + E[|W|] |X_j| + 2 E[|W| |X|] + E[|W|] E[|X|] ),
N_k^+ = (n(n - 1))^{-1} Σ_{i ≠ j} 1[t_{k-1} D_{ij} < ε_i - ε_j ≤ t_k D_{ij}] 1[D_{ij} ≥ 0],
N_k^- = (n(n - 1))^{-1} Σ_{i ≠ j} 1[t_k D_{ij} < ε_i - ε_j ≤ t_{k-1} D_{ij}] 1[D_{ij} < 0],

and D_{ij} = n^{-1/2}(X_i - X_j). We write U_l(t) for the l-th component of the vector U(t). Then we have

P( max_{1 ≤ k ≤ n} |U(t_k)| > η ) ≤ Σ_{k=1}^n Σ_{l=1}^{r_n} P( |U_l(t_k)| > η r_n^{-1/2} ), η > 0.

Since U_l(t) is a degenerate U-statistic whose kernel is bounded by b_l = 2 n^{1/4} (√2 π l + √2) ≤ 27 n^{1/4} l and has second moment bounded by 2 (π l)^2, we derive from part (c) of Proposition 2.3 of Arcones and Giné (1993) that

P( (n - 1) |U_l(t)| > η ) ≤ c_1 exp( - c_2 η / (2 π l + b_l^{2/3} η^{1/3} n^{-1/3}) )

for universal constants c_1 and c_2. Using the above we obtain

P( max_{1 ≤ k ≤ n} |U(t_k)| > K^3 r_n^{3/2} n^{-1} log n ) ≤ Σ_{k=1}^n Σ_{l=1}^{r_n} P( (n - 1) |U_l(t_k)| > K^3 r_n log n )
≤ n r_n c_1 exp( - c_2 K^3 log n / (2π + 9 K (log n)^{1/3} n^{-1/6}) ), K > 0.

This shows that

max_{1 ≤ k ≤ n} |U(t_k)| = O_P(r_n^{3/2} n^{-1} log n).   (41)

To deal with N_k^+ we introduce the degenerate U-statistic

Ñ_k^+ = (n(n - 1))^{-1} Σ_{i ≠ j} 1[D_{ij} ≥ 0] ξ_k(i, j)

with

ξ_k(i, j) = 1[t_{k-1} D_{ij} < ε_i - ε_j ≤ t_k D_{ij}] - F(ε_j + t_k D_{ij}) + F(ε_j + t_{k-1} D_{ij}) - F(ε_i - t_{k-1} D_{ij}) + F(ε_i - t_k D_{ij}) + F_2(t_k D_{ij}) - F_2(t_{k-1} D_{ij})

and F_2 the distribution function of ε_1 - ε_2. It is easy to see that

max_{1 ≤ k ≤ n} |N_k^+ - Ñ_k^+| ≤ 6 B_1 C_n n^{-3/2} (n(n - 1))^{-1} Σ_{i ≠ j} |X_i - X_j|.

The kernel of the U-statistic Ñ_k^+ is bounded by 8 and has second moment bounded by D_n n^{-3/2} with D_n = 2 B_1 C_n E[|X_1 - X_2|]. Thus, by part (c) of Proposition 2.3 in Arcones and Giné (1993), we obtain that the corresponding degenerate U-statistic Ñ_k^+ satisfies

Σ_{k=1}^n P( (n - 1) |Ñ_k^+| > K^3 (log n)^{3/2} n^{-1/2} ) ≤ n c_1 exp( - c_2 K^3 (log n)^{3/2} / (D_n^{1/2} n^{-1/4} + 4 K (log n)^{1/2}) ).

The above shows that

max_{1 ≤ k ≤ n} N_k^+ = O_P(n^{-3/2} (log n)^{3/2}).   (42)

Similarly one obtains

max_{1 ≤ k ≤ n} N_k^- = O_P(n^{-3/2} (log n)^{3/2}).   (43)

The desired (38) follows from (39)-(43) and S_n = O_P(1). This concludes the proof of (13).

6.5 Auxiliary Results

Let X and Y be independent random variables. Let (X_1, Y_1), ..., (X_m, Y_m) be independent copies of (X, Y). For reals t, x and y, set

N(t, x, y) = Σ_{i=1}^m (1[Y_i - t X_i ≤ y - t x] - 1[Y_i ≤ y]), and
Ñ(t, x, y) = N(t, x, y) - E[N(t, x, y)].

Lemma 4 Suppose X has finite expectation and the distribution function F of Y is Lipschitz: |F(y) - F(x)| ≤ Λ |y - x| for all x, y and some finite constant Λ. Then the inequality

P( sup_{|t| ≤ δ} |Ñ(t, x, y)| > 4η ) ≤ (8M + 4) exp( - η^2 / (2 m Λ δ E[|X - x|] + 2η/3) )

holds for η > 0, δ > 0, real x and y and every integer M ≥ m Λ δ E[|X - x|] / η. In particular, for C ≥ 1 and K ≥ 6 Λ (1 + E[|X|]), we have

p_m(y, C, K) = sup_{|x| ≤ m^{1/4}} P( sup_{|t| ≤ C/m^{1/2}} |Ñ(t, x, y)| > 4 K C^{1/2} m^{3/8} (log m)^{1/2} )
≤ ( 8 m^{3/8} C^{1/2} / (6 (log m)^{1/2}) + 4 ) exp( - K log m ), y ∈ R.

Proof Fix x and y and set ν = E[|X - x|]. Abbreviate N(t, x, y) by N(t) and Ñ(t, x, y) by Ñ(t), set

N_+(t) = Σ_{i=1}^m (1[Y_i - t(X_i - x) ≤ y] - 1[Y_i ≤ y]) 1[X_i - x ≥ 0],
N_-(t) = Σ_{i=1}^m (1[Y_i - t(X_i - x) ≤ y] - 1[Y_i ≤ y]) 1[X_i - x < 0],

and let Ñ_+(t) = N_+(t) - E[N_+(t)] and Ñ_-(t) = N_-(t) - E[N_-(t)]. Since F is Lipschitz, we obtain

|E[N_+(t_1)] - E[N_+(t_2)]| ≤ m Λ |t_1 - t_2| ν.

For s ≤ t ≤ u, we have

N_+(s) - E[N_+(u)] ≤ N_+(t) - E[N_+(t)] ≤ N_+(u) - E[N_+(s)]

and thus

Ñ_+(s) - m Λ (u - s) ν ≤ Ñ_+(t) ≤ Ñ_+(u) + m Λ (u - s) ν.

It is now easy to see that

sup_{|t| ≤ δ} |Ñ_+(t)| ≤ max_{k = -M, ..., M} |Ñ_+(k δ / M)| + m Λ δ ν / M

for every integer M ≥ 1. From this we obtain the bound

P( sup_{|t| ≤ δ} |Ñ_+(t)| ≥ 2η ) ≤ Σ_{k = -M}^M P( |Ñ_+(k δ / M)| > η ) + P( m Λ δ ν / M > η ).

The Bernstein inequality and the fact that the variance of

(1[Y - t(X - x) ≤ y] - 1[Y ≤ y]) 1[X ≥ x]

is bounded by Λ |t| ν yield

P( |Ñ_+(k δ / M)| > η ) ≤ 2 exp( - η^2 / (2 m Λ δ ν + 2η/3) ).

Thus we have, for M ≥ m Λ δ ν / η,

P( sup_{|t| ≤ δ} |Ñ_+(t)| > 2η ) ≤ 2 (2M + 1) exp( - η^2 / (2 m Λ δ ν + 2η/3) ).

Similarly, one verifies for such M,

P( sup_{|t| ≤ δ} |Ñ_-(t)| > 2η ) ≤ 2 (2M + 1) exp( - η^2 / (2 m Λ δ ν + 2η/3) ).

Since Ñ(t) = Ñ_+(t) + Ñ_-(t), we obtain the first result. The second result follows from the first one by taking δ = C m^{-1/2} and η = K C^{1/2} m^{3/8} (log m)^{1/2} and observing the inequality (log m)^{1/2} m^{-3/8} ≤ 1.

Acknowledgements We thank two reviewers and the Associate Editor for their knowledgeable comments and suggestions, which helped us improve the paper.

References

1. Arcones, M.A. and Giné, E. (1993). Limit theorems for U-processes. Ann. Probab., 21.
2. Bickel, P.J. (1982). On adaptive estimation. Ann. Statist., 10.
3. Chen, S.X., Peng, L. and Qin, Y.-L. (2009). Effects of data dimension on empirical likelihood. Biometrika, 96.
4. Forrester, J., Hooper, W., Peng, H. and Schick, A. (2003). On the construction of efficient estimators in semiparametric models. Statist. Decisions, 21.
5. Hjort, N.L., McKeague, I.W. and Van Keilegom, I. (2009). Extending the scope of empirical likelihood. Ann. Statist., 37.
6. Jin, K. (1992). Empirical smoothing parameter selection in adaptive estimation. Ann. Statist., 20.
7. Koul, H.L., Müller, U.U. and Schick, A. (2012). The transfer principle: a tool for complete case analysis. Ann. Statist., 40.
8. Koul, H.L. and Susarla, V. (1983). Adaptive estimation in linear regression. Statist. Decisions, 1.
9. Little, R.J.A. and Rubin, D.B. (2002). Statistical Analysis with Missing Data. Second edition. Wiley Series in Probability and Statistics, Wiley, Hoboken.
10. Müller, U.U. (2009). Estimating linear functionals in nonlinear regression with responses missing at random. Ann. Statist., 37.
11. Müller, U.U. and Schick, A. (2017). Efficiency transfer for regression models with responses missing at random. Bernoulli, 23.
12. Müller, U.U. and Van Keilegom, I. (2012). Efficient parameter estimation in regression with missing responses. Electron. J. Stat., 6.
13. Owen, A.B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75.
14. Owen, A.B. (2001). Empirical Likelihood. Monographs on Statistics and Applied Probability, 92, Chapman & Hall.
15. Peng, H. and Schick, A. (2005). Efficient estimation of linear functionals of a bivariate distribution with equal, but unknown marginals: the least-squares approach. J. Multivariate Anal., 95.
16. Peng, H. and Schick, A. (2013). An empirical likelihood approach to goodness of fit testing. Bernoulli, 19.
17. Peng, H. and Schick, A. (2016). Inference in the symmetric location model: an empirical likelihood approach. Preprint.
18. Peng, H. and Schick, A. (2017). Maximum empirical likelihood estimation and related topics. Preprint.
19. Qin, J. and Lawless, J. (1994). Empirical likelihood and general estimating equations. Ann. Statist., 22.
20. Schick, A. (1987). A note on the construction of asymptotically linear estimators. J. Statist. Plann. Inference, 16.
21. Schick, A. (1993). On efficient estimation in regression models. Ann. Statist., 21. Correction and addendum: 23 (1995).
22. Schick, A. (2013). Weighted least squares estimation with missing responses: an empirical likelihood approach. Electron. J. Statist., 7.
23. Tsiatis, A.A. (2006). Semiparametric Theory and Missing Data. Springer Series in Statistics. Springer, New York.


More information

Law of the sum of Bernoulli random variables

Law of the sum of Bernoulli random variables Law of the sum of Beroulli radom variables Nicolas Chevallier Uiversité de Haute Alsace, 4, rue des frères Lumière 68093 Mulhouse icolas.chevallier@uha.fr December 006 Abstract Let be the set of all possible

More information

A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS

A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS J. Japa Statist. Soc. Vol. 41 No. 1 2011 67 73 A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS Yoichi Nishiyama* We cosider k-sample ad chage poit problems for idepedet data i a

More information

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4. 4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

Regression with quadratic loss

Regression with quadratic loss Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,

More information

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the

More information

Bayesian Methods: Introduction to Multi-parameter Models

Bayesian Methods: Introduction to Multi-parameter Models Bayesia Methods: Itroductio to Multi-parameter Models Parameter: θ = ( θ, θ) Give Likelihood p(y θ) ad prior p(θ ), the posterior p proportioal to p(y θ) x p(θ ) Margial posterior ( θ, θ y) is Iterested

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

32 estimating the cumulative distribution function

32 estimating the cumulative distribution function 32 estimatig the cumulative distributio fuctio 4.6 types of cofidece itervals/bads Let F be a class of distributio fuctios F ad let θ be some quatity of iterest, such as the mea of F or the whole fuctio

More information

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced

More information

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A.

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A. Radom Walks o Discrete ad Cotiuous Circles by Jeffrey S. Rosethal School of Mathematics, Uiversity of Miesota, Mieapolis, MN, U.S.A. 55455 (Appeared i Joural of Applied Probability 30 (1993), 780 789.)

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

Asymptotic Results for the Linear Regression Model

Asymptotic Results for the Linear Regression Model Asymptotic Results for the Liear Regressio Model C. Fli November 29, 2000 1. Asymptotic Results uder Classical Assumptios The followig results apply to the liear regressio model y = Xβ + ε, where X is

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

MOMENT-METHOD ESTIMATION BASED ON CENSORED SAMPLE

MOMENT-METHOD ESTIMATION BASED ON CENSORED SAMPLE Vol. 8 o. Joural of Systems Sciece ad Complexity Apr., 5 MOMET-METHOD ESTIMATIO BASED O CESORED SAMPLE I Zhogxi Departmet of Mathematics, East Chia Uiversity of Sciece ad Techology, Shaghai 37, Chia. Email:

More information

Study the bias (due to the nite dimensional approximation) and variance of the estimators

Study the bias (due to the nite dimensional approximation) and variance of the estimators 2 Series Methods 2. Geeral Approach A model has parameters (; ) where is ite-dimesioal ad is oparametric. (Sometimes, there is o :) We will focus o regressio. The fuctio is approximated by a series a ite

More information

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam. Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the

More information

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution Iteratioal Mathematical Forum, Vol., 3, o. 3, 3-53 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/.9/imf.3.335 Double Stage Shrikage Estimator of Two Parameters Geeralized Expoetial Distributio Alaa M.

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

EFFECTIVE WLLN, SLLN, AND CLT IN STATISTICAL MODELS

EFFECTIVE WLLN, SLLN, AND CLT IN STATISTICAL MODELS EFFECTIVE WLLN, SLLN, AND CLT IN STATISTICAL MODELS Ryszard Zieliński Ist Math Polish Acad Sc POBox 21, 00-956 Warszawa 10, Polad e-mail: rziel@impagovpl ABSTRACT Weak laws of large umbers (W LLN), strog

More information

Chapter 6 Sampling Distributions

Chapter 6 Sampling Distributions Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to

More information

5. Likelihood Ratio Tests

5. Likelihood Ratio Tests 1 of 5 7/29/2009 3:16 PM Virtual Laboratories > 9. Hy pothesis Testig > 1 2 3 4 5 6 7 5. Likelihood Ratio Tests Prelimiaries As usual, our startig poit is a radom experimet with a uderlyig sample space,

More information

Department of Mathematics

Department of Mathematics Departmet of Mathematics Ma 3/103 KC Border Itroductio to Probability ad Statistics Witer 2017 Lecture 19: Estimatio II Relevat textbook passages: Larse Marx [1]: Sectios 5.2 5.7 19.1 The method of momets

More information

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise)

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise) Lecture 22: Review for Exam 2 Basic Model Assumptios (without Gaussia Noise) We model oe cotiuous respose variable Y, as a liear fuctio of p umerical predictors, plus oise: Y = β 0 + β X +... β p X p +

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

Chapter 6 Infinite Series

Chapter 6 Infinite Series Chapter 6 Ifiite Series I the previous chapter we cosidered itegrals which were improper i the sese that the iterval of itegratio was ubouded. I this chapter we are goig to discuss a topic which is somewhat

More information

Lecture 7: Density Estimation: k-nearest Neighbor and Basis Approach

Lecture 7: Density Estimation: k-nearest Neighbor and Basis Approach STAT 425: Itroductio to Noparametric Statistics Witer 28 Lecture 7: Desity Estimatio: k-nearest Neighbor ad Basis Approach Istructor: Ye-Chi Che Referece: Sectio 8.4 of All of Noparametric Statistics.

More information

Linear Regression Models

Linear Regression Models Liear Regressio Models Dr. Joh Mellor-Crummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect

More information

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic

More information

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT Itroductio to Extreme Value Theory Laures de Haa, ISM Japa, 202 Itroductio to Extreme Value Theory Laures de Haa Erasmus Uiversity Rotterdam, NL Uiversity of Lisbo, PT Itroductio to Extreme Value Theory

More information

arxiv: v1 [math.pr] 13 Oct 2011

arxiv: v1 [math.pr] 13 Oct 2011 A tail iequality for quadratic forms of subgaussia radom vectors Daiel Hsu, Sham M. Kakade,, ad Tog Zhag 3 arxiv:0.84v math.pr] 3 Oct 0 Microsoft Research New Eglad Departmet of Statistics, Wharto School,

More information

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),

More information

CS284A: Representations and Algorithms in Molecular Biology

CS284A: Representations and Algorithms in Molecular Biology CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by

More information

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10 DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

Summary and Discussion on Simultaneous Analysis of Lasso and Dantzig Selector

Summary and Discussion on Simultaneous Analysis of Lasso and Dantzig Selector Summary ad Discussio o Simultaeous Aalysis of Lasso ad Datzig Selector STAT732, Sprig 28 Duzhe Wag May 4, 28 Abstract This is a discussio o the work i Bickel, Ritov ad Tsybakov (29). We begi with a short

More information

It should be unbiased, or approximately unbiased. Variance of the variance estimator should be small. That is, the variance estimator is stable.

It should be unbiased, or approximately unbiased. Variance of the variance estimator should be small. That is, the variance estimator is stable. Chapter 10 Variace Estimatio 10.1 Itroductio Variace estimatio is a importat practical problem i survey samplig. Variace estimates are used i two purposes. Oe is the aalytic purpose such as costructig

More information

Regression with an Evaporating Logarithmic Trend

Regression with an Evaporating Logarithmic Trend Regressio with a Evaporatig Logarithmic Tred Peter C. B. Phillips Cowles Foudatio, Yale Uiversity, Uiversity of Aucklad & Uiversity of York ad Yixiao Su Departmet of Ecoomics Yale Uiversity October 5,

More information

2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2

2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2 Chapter 8 Comparig Two Treatmets Iferece about Two Populatio Meas We wat to compare the meas of two populatios to see whether they differ. There are two situatios to cosider, as show i the followig examples:

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

Lecture 3: MLE and Regression

Lecture 3: MLE and Regression STAT/Q SCI 403: Itroductio to Resamplig Methods Sprig 207 Istructor: Ye-Chi Che Lecture 3: MLE ad Regressio 3. Parameters ad Distributios Some distributios are idexed by their uderlyig parameters. Thus,

More information

Maximum likelihood estimation from record-breaking data for the generalized Pareto distribution

Maximum likelihood estimation from record-breaking data for the generalized Pareto distribution METRON - Iteratioal Joural of Statistics 004, vol. LXII,. 3, pp. 377-389 NAGI S. ABD-EL-HAKIM KHALAF S. SULTAN Maximum likelihood estimatio from record-breakig data for the geeralized Pareto distributio

More information

On Random Line Segments in the Unit Square

On Random Line Segments in the Unit Square O Radom Lie Segmets i the Uit Square Thomas A. Courtade Departmet of Electrical Egieerig Uiversity of Califoria Los Ageles, Califoria 90095 Email: tacourta@ee.ucla.edu I. INTRODUCTION Let Q = [0, 1] [0,

More information

Unbiased Estimation. February 7-12, 2008

Unbiased Estimation. February 7-12, 2008 Ubiased Estimatio February 7-2, 2008 We begi with a sample X = (X,..., X ) of radom variables chose accordig to oe of a family of probabilities P θ where θ is elemet from the parameter space Θ. For radom

More information

Kolmogorov-Smirnov type Tests for Local Gaussianity in High-Frequency Data

Kolmogorov-Smirnov type Tests for Local Gaussianity in High-Frequency Data Proceedigs 59th ISI World Statistics Cogress, 5-30 August 013, Hog Kog (Sessio STS046) p.09 Kolmogorov-Smirov type Tests for Local Gaussiaity i High-Frequecy Data George Tauche, Duke Uiversity Viktor Todorov,

More information

1.010 Uncertainty in Engineering Fall 2008

1.010 Uncertainty in Engineering Fall 2008 MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval

More information

Summary. Recap ... Last Lecture. Summary. Theorem

Summary. Recap ... Last Lecture. Summary. Theorem Last Lecture Biostatistics 602 - Statistical Iferece Lecture 23 Hyu Mi Kag April 11th, 2013 What is p-value? What is the advatage of p-value compared to hypothesis testig procedure with size α? How ca

More information