A semiparametric single-index estimator for a class of estimating equation models


arXiv: v2 [math.ST] 26 Apr 2017

Marian Hristache, Weiyu Li, Valentin Patilea

Abstract

We propose a two-step pseudo-maximum likelihood procedure for semiparametric single-index regression models where the conditional variance is a known function of the regression and an additional parameter. The Poisson single-index regression with multiplicative unobserved heterogeneity is an example of such models. Our procedure is based on linear exponential densities with nuisance parameter. The pseudo-likelihood criterion we use contains a nonparametric estimate of the index regression and therefore a rule for choosing the smoothing parameter is needed. We propose an automatic and natural rule based on the joint maximization of the pseudo-likelihood with respect to the index parameter and the smoothing parameter. We derive the asymptotic properties of the semiparametric estimator of the index parameter and the asymptotic behavior of our optimal smoothing parameter. The finite sample performances of our methodology are analyzed using simulated and real data.

Keywords: semiparametric pseudo-maximum likelihood, single-index model, linear exponential densities, bandwidth selection.

Marian Hristache: CREST Ensai, marian.hristache@ensai.fr
Weiyu Li (corresponding author): CREST Ensai, liweiyu84@gmail.com
Valentin Patilea: CREST Ensai, valentin.patilea@ensai.fr. Valentin Patilea gratefully acknowledges support from the research program New Challenges for New Data of Genes, LCL and Fondation de Risque.

1 Introduction

In this paper we consider semiparametric models defined by conditional mean and conditional variance estimating equations. Models defined by estimating equations for the first and second order conditional moments are widely used in applications. See, for instance, Ziegler (2011) for a recent reference. Here we consider a model that extends the framework considered by Cui, Härdle and Zhu (2011).

To provide some insight on the type of models we study, consider the following semiparametric extension of the classical Poisson regression model with unobserved heterogeneity: the observed variables are $(Y, Z)$, where $Y$ denotes the count variable and $Z$ is the vector of $d$ explanatory variables. Let $r(t;\theta) = E(Y \mid Z^\top\theta = t)$. We assume that there exists $\theta_0 \in \mathbb{R}^d$ such that
\[ E(Y \mid Z) = E(Y \mid Z^\top\theta_0) = r(Z^\top\theta_0;\theta_0). \]
The parameter $\theta_0$ and the function $r(\cdot)$ are unknown. Given $Z$ and an unobserved error term $\varepsilon$, the variable $Y$ has a Poisson law of mean $r(Z^\top\theta_0;\theta_0)\,\varepsilon$. If $E(\varepsilon \mid Z) = 1$ and $Var(\varepsilon \mid Z) = \sigma^2$, then
\[ Var(Y \mid Z) = Var\big(E(Y \mid Z,\varepsilon) \mid Z\big) + E\big(Var(Y \mid Z,\varepsilon) \mid Z\big) = r(Z^\top\theta_0;\theta_0)\,\big[1 + \sigma^2 r(Z^\top\theta_0;\theta_0)\big]. \tag{1.1} \]
This model is a semiparametric single-index regression model (e.g., Powell, Stock and Stoker 1989, Ichimura 1993, Härdle, Hall and Ichimura 1993, Sherman 1994b) where a second order conditional moment is specified as a nonlinear function of the conditional mean and an additional unknown parameter. This extends the framework of Cui, Härdle and Zhu (2011), where the conditional variance of the response is proportional to a given function of the conditional mean.

Our first contribution is to propose a new semiparametric estimation procedure for single-index regression which incorporates the additional information on the conditional variance of $Y$. For this we extend the quasi-generalized pseudo maximum likelihood method introduced by Gouriéroux, Monfort and Trognon (1984a, 1984b) to a semiparametric framework. More precisely, we propose to estimate $\theta_0$ and the function $r(\cdot)$ through a two-step pseudo-maximum likelihood (PML) procedure based on linear exponential families with nuisance parameter densities.
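The variance formula (1.1) can be checked numerically. The sketch below is only an illustration under stated assumptions: the heterogeneity term is drawn from a Gamma law with mean 1 and variance sigma2 (the text only fixes these two moments, not the law), and the values of `r` and `sigma2` are arbitrary.

```python
import numpy as np

def simulate_counts(r, sigma2, size, rng):
    """Draw Y | eps ~ Poisson(r * eps) with E[eps] = 1 and Var[eps] = sigma2."""
    # Assumption: Gamma heterogeneity; shape 1/sigma2 and scale sigma2
    # give mean 1 and variance sigma2.
    eps = rng.gamma(shape=1.0 / sigma2, scale=sigma2, size=size)
    return rng.poisson(r * eps)

rng = np.random.default_rng(0)
r, sigma2 = 3.0, 0.5                      # illustrative values only
y = simulate_counts(r, sigma2, size=400_000, rng=rng)

empirical_mean = y.mean()
empirical_var = y.var()
theoretical_var = r * (1.0 + sigma2 * r)  # right-hand side of formula (1.1)
```

With these values the theoretical variance is 7.5, well above the Poisson variance of 3, which is the overdispersion that the second moment condition is meant to capture.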
Such densities are parameterized by the mean $r$ and a nuisance parameter that can be recovered from the variance. Although we use a likelihood type criterion, no conditional distribution assumption on $Y$ given $Z$ is required for deriving the asymptotic results.

As an example of application of our procedure consider the case where $Y$ is a count variable. First, write the Poisson likelihood where the function $r(\cdot)$ is replaced by a kernel estimator and maximize this likelihood with respect to $\theta$ to obtain a semiparametric PML estimator of $\theta_0$. Use this estimate and the variance formula (1.1) to deduce a consistent moment estimator of $\sigma^2$. In a second step, estimate $\theta_0$ through a semiparametric Negative Binomial PML where $r(\cdot)$ is again replaced by a kernel estimator and the variance parameter of the Negative Binomial is set equal to the estimate of $\sigma^2$. Finally, given the second step estimate of $\theta_0$, build a kernel estimator for the regression $r(\cdot)$. For simplicity, we use a Nadaraya-Watson estimator to estimate $r(\cdot)$. Other smoothers like local polynomials could be used at the expense of more intricate technical arguments.

The occurrence of a nonparametric estimator in a pseudo-likelihood criterion requires a rule for the smoothing parameter. While the semiparametric index regression literature contains a large number of contributions on how to estimate an index, there are far fewer results and practical solutions on the choice of the smoothing parameter. Even if the smoothing parameter does not influence the asymptotic variance of a semiparametric estimator of $\theta_0$, in practice the estimate of $\theta_0$ and of the regression function may be sensitive to the choice of the smoothing parameter.

Another contribution of this paper is to propose an automatic and natural choice of the smoothing parameter used to define the semiparametric estimator. For this, we extend the approach introduced by Härdle, Hall and Ichimura (1993) (see also Xia and Li 1999, Xia, Tong and Li 1999 and Delecroix, Hristache and Patilea 2006). The idea is to maximize the pseudo-likelihood simultaneously in $\theta$ and the smoothing parameter, that is the bandwidth of the kernel estimator. The bandwidth is allowed to belong to a large range between $n^{-1/4}$ and $n^{-1/8}$. In some sense, this approach considers the bandwidth an auxiliary parameter for which the pseudo-likelihood may provide an estimate. Using a suitable decomposition of the pseudo-log-likelihood we show that such a joint maximization is asymptotically equivalent to separate maximization of a purely parametric nonlinear term with respect to $\theta$ and minimization of a weighted mean-squared cross-validation function with respect to the bandwidth. The weights of this cross-validation function are given by the second order derivatives of the pseudo-log-likelihood with respect to $r$. We show that the rate of our optimal bandwidth is $n^{-1/5}$, as expected for twice differentiable regression functions.

The paper is organized as follows.
In section 2 we introduce a class of semiparametric PML estimators based on linear exponential densities with nuisance parameter and we provide a natural bandwidth choice. Moreover, we present the general methodology used for the asymptotics. Section 3 contains the asymptotic results. A bound for the variance of our semiparametric PML estimators is also derived. In section 4 we use the semiparametric PML estimators to define a two-step procedure that can be applied in single-index regression models where an additional variance condition like (1.1) is specified. Section 5.1 examines the finite-sample properties of our procedure via Monte Carlo simulations. We compare the performances of a two-step generalized least-squares with those of a Negative Binomial PML in a Poisson single-index regression model with multiplicative unobserved heterogeneity. Even if the two procedures considered lead to asymptotically equivalent estimates, the latter procedure seems preferable in finite samples. An application to real data on the frequency of recreational trips (see Cameron and Trivedi 2013, page 246) is also provided. Section 6 concludes the paper. The technical proofs are postponed to the Appendix.

2 Semiparametric PML with nuisance parameter

Consider that the observations $(Y_1, Z_1), \dots, (Y_n, Z_n)$ are independent copies of the random vector $(Y, Z) \in \mathbb{R} \times \mathbb{R}^d$. Assume that there exists $\theta_0 \in \mathbb{R}^d$, unique up to a scale normalization factor, such that the single-index model (SIM) condition
\[ E(Y \mid Z) = E(Y \mid Z^\top\theta_0) = r(Z^\top\theta_0;\theta_0) \tag{2.1} \]
holds. In this paper, we focus on single-index models where the conditional second order moment of $Y$ given $Z$ is a known function of $E[Y \mid Z]$ and of a nuisance parameter. To be more precise, in the model we consider,
\[ Var(Y \mid Z) = g\big(E(Y \mid Z), \alpha_0\big) = g\big(r(Z^\top\theta_0;\theta_0), \alpha_0\big), \tag{2.2} \]
for some real value $\alpha_0$. The function $g(\cdot,\cdot)$ is known and, for each $r$, the map $\alpha \mapsto g(r,\alpha)$ is one-to-one. Our framework is slightly more general than the one considered by Cui, Härdle and Zhu (2011), where the conditional variance of $Y$ given $Z$ is a given function of the conditional mean of $Y$ given $Z$ multiplied by an unknown constant.

To estimate the parameter of interest $\theta_0$ in a model like (2.1)-(2.2), we propose a semiparametric PML procedure based on linear exponential families with nuisance parameter. The density used to build the pseudo-likelihood is taken with mean and variance equal to $r$ and $g(r,\alpha)$, respectively. In this section we suppose that an estimator of the nuisance parameter is given. In section 4 we show how to build such an estimator using a preliminary estimate of $\theta_0$ and condition (2.2).

2.1 Linear exponential families with nuisance parameter

Gouriéroux, Monfort and Trognon (1984a) introduced a class of densities, with respect to a given measure $\mu$, called linear exponential family with nuisance parameter (LEFN) and defined as
\[ l(y \mid r, \alpha) = \exp\big[B(r,\alpha) + C(r,\alpha)\,y + D(y,\alpha)\big], \]
where $\alpha$ is the nuisance parameter. Since the dominating measure $\mu$ need not be Lebesgue measure, the law defined by $l$ is not necessarily continuous. The functions $B(\cdot,\cdot)$ and $C(\cdot,\cdot)$ are such that the expectation of the corresponding law is $r$ while the variance is $[\partial_r C(r,\alpha)]^{-1}$. ($\partial_r$ denotes the derivative with respect to the argument $r$.) Recall that for any given $\alpha$, the following identity holds:
\[ \partial_r B(r,\alpha) + \partial_r C(r,\alpha)\, r \equiv 0. \]
If $\alpha$ is fixed, a LEFN becomes a linear exponential family (LEF) of densities. Gouriéroux, Monfort and Trognon (1984a, 1984b) used LEFN densities to define a two-step PML procedure in nonlinear regression models where a specification of the conditional variance is given.
Herein, we extend their approach to a semiparametric framework. In the case of the SIM defined by equation (1.1), the conditional variance is given by $g(r,\alpha) = r(1+\alpha r)$ with $r > 0$ and $\alpha > 0$. In this case take
\[ B(r,\alpha) = -\frac{1}{\alpha}\ln(1+\alpha r) \quad\text{and}\quad C(r,\alpha) = \ln\frac{r}{1+\alpha r}, \]
which define a Negative Binomial distribution of mean $r$ and variance $r(1+\alpha r)$. Note that the limit case $\alpha = 0$ corresponds to a Poisson distribution. As another example, consider $g(r,\alpha) = r^2/\alpha$ with $r > 0$ and $\alpha > 0$. Now, take the LEFN density given by
\[ B(r,\alpha) = -\alpha \ln r \quad\text{and}\quad C(r,\alpha) = -\frac{\alpha}{r}, \]
which is the density of a gamma law of mean $r$ and variance $r^2/\alpha$.
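The two examples above can be checked against the LEFN identity and the variance rule. The sketch below is illustrative only: it assumes the sign conventions $B = -\alpha^{-1}\ln(1+\alpha r)$, $C = \ln(r/(1+\alpha r))$ for the Negative Binomial case and $B = -\alpha\ln r$, $C = -\alpha/r$ for the gamma case (the minus signs are inferred from the identity $\partial_r B + \partial_r C\, r \equiv 0$), and it uses finite differences rather than symbolic derivatives. All function names are hypothetical.

```python
import numpy as np

def nb_B(r, alpha):
    return -np.log1p(alpha * r) / alpha

def nb_C(r, alpha):
    return np.log(r / (1.0 + alpha * r))

def gamma_B(r, alpha):
    return -alpha * np.log(r)

def gamma_C(r, alpha):
    return -alpha / r

def d_dr(func, r, alpha, eps=1e-6):
    """Central finite-difference derivative with respect to r."""
    return (func(r + eps, alpha) - func(r - eps, alpha)) / (2.0 * eps)

def lefn_diagnostics(B, C, r, alpha):
    """Return (residual of dB/dr + r * dC/dr, implied variance 1 / (dC/dr))."""
    dB, dC = d_dr(B, r, alpha), d_dr(C, r, alpha)
    return dB + dC * r, 1.0 / dC

r, alpha = 2.5, 0.4  # arbitrary evaluation point
nb_residual, nb_variance = lefn_diagnostics(nb_B, nb_C, r, alpha)
ga_residual, ga_variance = lefn_diagnostics(gamma_B, gamma_C, r, alpha)
```

At this evaluation point the implied variances should agree with $r(1+\alpha r) = 5$ and $r^2/\alpha = 15.625$ up to finite-difference error, and both residuals should be numerically zero.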

2.2 The semiparametric estimator

In order to define our semiparametric PML estimator in the presence of a nuisance parameter let us introduce some notation: given $\{c_n\}$, a sequence of numbers growing slowly to infinity (e.g., $c_n = \ln n$), let $\mathcal{H}_n = \{h : c_n n^{-1/4} \le h \le c_n^{-1} n^{-1/8}\}$ be the range from which the optimal bandwidth will be chosen. Define the set $\Theta_n = \{\theta : \|\theta - \theta_0\| \le d_n\}$, $n \ge 1$, with $\{d_n\}$ some sequence decreasing to zero. Let $\alpha^*$ be some real value of the nuisance parameter. Typically, $\alpha^* = \alpha_0$ if the conditional variance formula (2.2) is correctly specified. Otherwise, $\alpha^*$ is some pseudo-true value of the nuisance parameter. Suppose that a sequence $\{\tilde\alpha_n\}$ such that $\tilde\alpha_n \to \alpha^*$, in probability, is given. Set¹ $\psi(y, r; \alpha) = \ln l(y \mid r, \alpha)$ with $l(y \mid r, \alpha)$ the LEFN density of expectation $r$ and nuisance parameter $\alpha$. Define the semiparametric PML estimator in the presence of a nuisance parameter and the optimal bandwidth as
\[ (\hat\theta, \hat h) = \arg\max_{\theta \in \Theta_n,\ h \in \mathcal{H}_n} \frac{1}{n}\sum_{i=1}^n \psi\big(Y_i, \hat r_h^i(Z_i^\top\theta;\theta); \tilde\alpha_n\big)\, \tau_n(Z_i), \tag{2.3} \]
where
\[ \hat r_h^i(t;\theta) = \frac{(n-1)^{-1}\sum_{j \ne i} Y_j K_h(t - Z_j^\top\theta)}{(n-1)^{-1}\sum_{j \ne i} K_h(t - Z_j^\top\theta)} =: \frac{\hat\gamma_h^i(t;\theta)}{\hat f_h^i(t;\theta)} \]
denotes the leave-one-out version of the Nadaraya-Watson estimator of the regression function
\[ r(t;\theta) = E(Y \mid Z^\top\theta = t) =: \frac{\gamma(t;\theta)}{f(t;\theta)}, \]
with $f(\cdot;\theta)$ the density of $Z^\top\theta$. The function $K(\cdot)$ is a second order kernel function and $K_h(\cdot)$ stands for $K(\cdot/h)/h$, where $h$ is the bandwidth. $\tau_n(\cdot)$ denotes a trimming function. If the sequence $\tilde\alpha_n$ is constant or $\psi$ does not depend on $\alpha$, equation (2.3) defines a semiparametric PML based on a LEF density.

¹ Herein, we focus on $\psi(y, r; \alpha) = \ln l(y \mid r, \alpha)$ where $l(y \mid r, \alpha) = \exp[B(r,\alpha) + C(r,\alpha)\,y + D(y,\alpha)]$ is a LEFN density. However, other functions $\psi(y, r; \alpha)$ having the required properties can be considered (see Appendix A).

A trimming is designed to keep the density estimator $\hat f_h^i$ away from zero in computations and it is usually required for analyzing the asymptotic properties of the nonparametric regression estimator and of the optimal bandwidth. The practical purpose of a trimming recommends a data-driven device like $I_{\{z:\, \hat f_h^i(z^\top\theta;\theta) \ge c\}}(\cdot)$, with some fixed $c > 0$. Herein, $I_A(\cdot)$ denotes the indicator function of the set $A$. However, to ensure consistency with such a trimming, one should require in addition that
\[ \theta_0 = \arg\max_\theta E\big[\psi\big(Y, r(Z^\top\theta;\theta)\big)\, I_{\{z:\, f(z^\top\theta;\theta) \ge c\}}(Z)\big]. \]

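Meanwhile, a trimming like $I_{\{z:\, f(z^\top\theta_0;\theta_0) \ge c\}}(\cdot)$ is easier to handle in theory. Here, we consider
\[ \tau_n(\cdot) = I_{\{z:\, \hat f_{h_n}^i(z^\top\theta_n;\theta_n) \ge c\}}(\cdot) \tag{2.4} \]
with $\theta_n \in \Theta_n$, $n \ge 1$, a sequence with limit $\theta_0$ and $h_n$, $n \ge 1$, a sequence of preliminary bandwidths such that $n^{\varepsilon} h_n \to 0$ and $n^{1/2-\varepsilon} h_n \to \infty$ for some $0 < \varepsilon < 1/2$. The trimming procedure we propose represents an appealing compromise between the theory and the applications. On one hand, it is easy to implement. On the other hand, we show below that, in a certain sense, our trimming is asymptotically equivalent to the fixed trimming $I_{\{z:\, f(z^\top\theta_0;\theta_0) \ge c\}}(\cdot)$ and this fact greatly simplifies the proofs. We prove this equivalence under two types of assumptions: either (i) $Z$ is bounded and $\theta_n - \theta_0 = o(1)$, or (ii) $E[\exp(\lambda \|Z\|)] < \infty$, for some $\lambda > 0$, and $\theta_n - \theta_0 = o(1/\ln n)$.

To be more precise, define
\[ A = \{z : f(z^\top\theta_0;\theta_0) \ge c\} \subset \mathbb{R}^d \quad\text{and}\quad A^\delta = \{z : |f(z^\top\theta_0;\theta_0) - c| \le \delta\}, \quad \delta > 0. \]
By little algebra, for all $\theta_n \in \Theta_n$, $h_n$ and $i$,
\[ \big| I_{\{z:\, \hat f_{h_n}^i(z^\top\theta_n;\theta_n) \ge c\}}(Z_i) - I_A(Z_i) \big| \le I_{A^\delta}(Z_i) + I_{(\delta,\infty)}(G_n), \]
where
\[ G_n = \max_{1 \le i \le n}\ \sup_{\theta \in \Theta_n} \big| \hat f_{h_n}^i(Z_i^\top\theta;\theta) - f(Z_i^\top\theta_0;\theta_0) \big|. \]
Let
\[ \widehat S(\theta, h; \alpha, \mathcal{A}) = \frac{1}{n}\sum_{i=1}^n \psi\big(Y_i, \hat r_h^i(Z_i^\top\theta;\theta); \alpha\big)\, I_{\mathcal{A}}(Z_i) \]
with $\mathcal{A} = A$ or $A^\delta$. Without loss of generality, consider that $\psi(\cdot,\cdot;\cdot) \le 0$. Since $\psi$ is the logarithm of a LEFN density, for any given $y$ and $\alpha$, the map $r \mapsto \psi(y, r; \alpha)$ attains its maximum at $r = y$; thus, up to a translation with a function depending only on $y$ and $\alpha$, we may consider $\psi \le 0$. In this case we have
\[ \Big| \frac{1}{n}\sum_{i=1}^n \psi\big(Y_i, \hat r_h^i(Z_i^\top\theta;\theta); \alpha\big)\, I_{\{z:\, \hat f_{h_n}^i(z^\top\theta_n;\theta_n) \ge c\}}(Z_i) - \widehat S(\theta, h; \alpha, A) \Big| \le \big| \widehat S(\theta, h; \alpha, A^\delta) \big| + I_{(\delta,\infty)}(G_n)\, \frac{1}{n}\sum_{i=1}^n \big| \psi\big(Y_i, \hat r_h^i(Z_i^\top\theta;\theta); \alpha\big) \big|. \tag{2.5} \]
We show that $\widehat S(\theta, h; \alpha, A^\delta) = o_P\big(\widehat S(\theta, h; \alpha, A)\big)$, uniformly over $\Theta_n \times \mathcal{H}_n$ and uniformly in $\alpha$, provided that $\delta \to 0$ and $P(f(Z^\top\theta_0;\theta_0) = c) = 0$. On the other hand, we prove that $P(G_n > \delta) \to 0$, provided that $\delta \to 0$ slowly enough and $h_n \to 0$ faster than $n^{-\varepsilon}$ and slower than $n^{-(1/2-\varepsilon)}$, for some $0 < \varepsilon < 1/2$. See Lemma B.2 in the appendix; in that lemma we distinguish two types of assumptions depending on whether $Z$ is bounded or not. Deduce that $(\hat\theta, \hat h)$ is asymptotically equivalent to the maximizer of $\widehat S(\theta, h; \tilde\alpha_n, A)$ over $\Theta_n \times \mathcal{H}_n$.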
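Therefore, hereafter, we simply write $\widehat S(\theta, h; \alpha)$ instead of $\widehat S(\theta, h; \alpha, A)$ and we consider
\[ (\hat\theta, \hat h) = \arg\max_{\theta \in \Theta_n,\ h \in \mathcal{H}_n} \widehat S(\theta, h; \tilde\alpha_n). \tag{2.6} \]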
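The joint maximization over the index parameter and the bandwidth can be sketched on a small grid. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian kernel, a Poisson pseudo-log-likelihood (a LEF case, so no nuisance parameter enters), a plain grid search in place of a numerical optimizer, and a user-supplied bandwidth grid instead of the theoretical range (with $c_n = \ln n$ that range can be empty for small $n$). The simulated design and all names are hypothetical.

```python
import numpy as np

def gaussian_kernel(u):
    """Second-order kernel K (standard normal density)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def loo_nadaraya_watson(index, y, h):
    """Leave-one-out estimates (r_hat, f_hat) evaluated at each index value."""
    n = index.shape[0]
    weights = gaussian_kernel((index[:, None] - index[None, :]) / h) / h
    np.fill_diagonal(weights, 0.0)                 # remove the term j = i
    f_hat = weights.sum(axis=1) / (n - 1)          # density estimate of the index
    gamma_hat = weights @ y / (n - 1)
    r_hat = gamma_hat / np.maximum(f_hat, 1e-300)  # guard against division by zero
    return r_hat, f_hat

def poisson_psi(y, r):
    """Poisson pseudo-log-likelihood, dropping the term that is free of r."""
    return y * np.log(r) - r

def criterion(theta, h, z, y, keep):
    """Trimmed pseudo-log-likelihood for one pair (theta, h)."""
    r_hat, _ = loo_nadaraya_watson(z @ theta, y, h)
    r_hat = np.maximum(r_hat, 1e-10)               # keep the logarithm finite
    return float(np.mean(poisson_psi(y, r_hat) * keep))

def joint_maximizer(z, y, thetas, bandwidths, keep):
    """Grid search over (theta, h); returns (best theta, best h, best value)."""
    best = (None, None, -np.inf)
    for theta in thetas:
        for h in bandwidths:
            value = criterion(theta, h, z, y, keep)
            if value > best[2]:
                best = (theta, h, value)
    return best

rng = np.random.default_rng(1)
n = 300
z = rng.normal(size=(n, 2))
theta0 = np.array([1.0, 0.5])                      # first component fixed to 1
y = rng.poisson(np.exp(0.5 * (z @ theta0))).astype(float)

# Trimming indicator built from a preliminary direction and bandwidth.
theta_prelim, h_prelim, c = np.array([1.0, 0.0]), 0.3, 0.01
_, f_prelim = loo_nadaraya_watson(z @ theta_prelim, y, h_prelim)
keep = (f_prelim >= c).astype(float)

thetas = [np.array([1.0, t]) for t in np.linspace(0.0, 1.0, 11)]
bandwidths = np.geomspace(0.15, 0.8, 8)
theta_hat, h_hat, value = joint_maximizer(z, y, thetas, bandwidths, keep)
```

The pairwise weight matrix costs $O(n^2)$ memory, which is acceptable for a sketch but would need a smarter evaluation for large samples.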

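2.3 Methodology

The semiparametric pseudo-log-likelihood $\widehat S(\theta, h; \alpha)$ can be split into a purely parametric nonlinear part $S(\theta; \alpha)$, a purely nonparametric one $T(h; \alpha)$ and a reminder term $R(\theta, h; \alpha)$, where
\[ S(\theta; \alpha) = \frac{1}{n}\sum_{i=1}^n \big[\psi\big(Y_i, r(Z_i^\top\theta;\theta); \alpha\big) - \psi\big(Y_i, r(Z_i^\top\theta_0;\theta_0); \alpha\big)\big]\, I_A(Z_i), \tag{2.7} \]
\[ T(h; \alpha) = \frac{1}{n}\sum_{i=1}^n \psi\big(Y_i, \hat r_h^i(Z_i^\top\theta_0;\theta_0); \alpha\big)\, I_A(Z_i), \]
\[ R(\theta, h; \alpha) = \frac{1}{n}\sum_{i=1}^n \big[\psi\big(Y_i, \hat r_h^i(Z_i^\top\theta;\theta); \alpha\big) - \psi\big(Y_i, r(Z_i^\top\theta;\theta); \alpha\big)\big]\, I_A(Z_i) - \frac{1}{n}\sum_{i=1}^n \big[\psi\big(Y_i, \hat r_h^i(Z_i^\top\theta_0;\theta_0); \alpha\big) - \psi\big(Y_i, r(Z_i^\top\theta_0;\theta_0); \alpha\big)\big]\, I_A(Z_i) \]
(see Härdle, Hall and Ichimura 1993 for a slightly different splitting), so that $\widehat S = S + T + R$. Given this decomposition, the simultaneous optimization of $\widehat S(\theta, h; \alpha)$ is asymptotically equivalent to separately maximizing $S(\theta; \alpha)$ with respect to $\theta$ and $T(h; \alpha)$ with respect to $h$, provided that $R(\theta, h; \alpha)$ is sufficiently small.

A key ingredient for proving that $R(\theta, h; \tilde\alpha_n)$ is negligible with respect to $S(\theta; \tilde\alpha_n)$ and $T(h; \tilde\alpha_n)$, uniformly in $(\theta, h) \in \Theta_n \times \mathcal{H}_n$ and for any $\{\tilde\alpha_n\}$, is represented by the orthogonality conditions
\[ E\big[\partial_2 \psi\big(Y, r(Z^\top\theta_0;\theta_0); \alpha\big) \mid Z\big] = 0 \tag{2.8} \]
and
\[ E\big[\partial_\theta \partial_2 \psi\big(Y, r(Z^\top\theta_0;\theta_0); \alpha\big) \mid Z^\top\theta_0\big] = 0, \tag{2.9} \]
that must hold for any $\alpha$, where $\partial_2$ denotes the derivative with respect to the second argument of $\psi(\cdot,\cdot;\cdot)$ and $\partial_\theta$ is the derivative with respect to all occurrences of $\theta$, that is, given $y$, $z$ and $\alpha$,
\[ \partial_\theta \partial_2 \psi\big(y, r(z^\top\theta_0;\theta_0); \alpha\big) = \partial_\theta \big[\partial_2 \psi\big(y, r(z^\top\theta;\theta); \alpha\big)\big]\Big|_{\theta=\theta_0} \]
(see also Sherman 1994b and Delecroix, Hristache and Patilea 2006 for similar conditions). If $\psi(y, r; \alpha) = \ln l(y \mid r, \alpha) = B(r,\alpha) + C(r,\alpha)\,y + D(y,\alpha)$, with $\partial_r B(r,\alpha) + \partial_r C(r,\alpha)\, r \equiv 0$, then $\partial_2 \psi(y, r; \alpha) = \partial_r C(r,\alpha)\,(y - r)$ and thus (2.8) is a consequence of the SIM condition (2.1). To check the second orthogonality condition note that
\[ E\big[\partial^2_{22}\psi\big(Y, r(Z^\top\theta_0;\theta_0); \alpha\big) \mid Z\big] = E\big[\partial^2_{22}\psi\big(Y, r(Z^\top\theta_0;\theta_0); \alpha\big) \mid Z^\top\theta_0\big] \]
and
\[ E\big[\partial_\theta r(Z^\top\theta_0;\theta_0) \mid Z^\top\theta_0\big] = E\big[r'(Z^\top\theta_0;\theta_0)\,\big(Z - E[Z \mid Z^\top\theta_0]\big) \mid Z^\top\theta_0\big], \]
where $r'(\cdot;\theta_0)$ is the derivative of $r(\cdot;\theta_0)$. The last identity is always true under the SIM condition (e.g., Newey 1994). Let us point out that conditions (2.8)-(2.9) hold even if the variance condition (2.2) is misspecified.

Since $R(\theta, h; \tilde\alpha_n)$ is negligible with respect to $S(\theta; \tilde\alpha_n)$ and $T(h; \tilde\alpha_n)$ does not contain the parameter of interest, the asymptotic distribution of $\hat\theta$ will be obtained by standard arguments used for M-estimators in the presence of nuisance parameters applied to the objective function $S(\theta; \tilde\alpha_n)$. We deduce that $\hat\theta$ behaves as follows: (i) if the SIM condition (2.1) holds and $\tilde\alpha_n - \alpha^* = o_P(1)$, for some $\alpha^*$, then $\hat\theta$ is asymptotically normal; (ii) if the SIM condition holds, the conditional variance (2.2) is correctly specified and $\tilde\alpha_n - \alpha_0 = o_P(1)$, then $\hat\theta$ is asymptotically normal and it has the lowest variance among the semiparametric PML estimators based on LEF densities. In any case, the asymptotic distribution of $\sqrt{n}(\hat\theta - \theta_0)$ does not depend on the choice of $\tilde\alpha_n$.

Let us point out that in our framework we only impose $\tilde\alpha_n$ convergent in probability without asking a rate of convergence $O_P(1/\sqrt{n})$, as it is usually supposed for M-estimation in the presence of nuisance parameters. This is because the usual orthogonality condition
\[ E\big[\partial_\alpha \partial_\theta \psi\big(Y, r(Z^\top\theta_0;\theta_0); \alpha\big)\big] = 0 \]
is true for any $\alpha$, provided that $\psi(y, r; \alpha) = \ln l(y \mid r, \alpha)$ with $l(y \mid r, \alpha)$ a LEFN density. Indeed, we have
\[ E\big[\partial_\alpha \partial_\theta \psi\big(Y, r(Z^\top\theta_0;\theta_0); \alpha\big)\big] = E\big[\partial_\alpha \partial_r \psi\big(Y, r(Z^\top\theta_0;\theta_0); \alpha\big)\, \partial_\theta r(Z^\top\theta_0;\theta_0)\big] \]
\[ = E\big[ E\big\{\partial_\alpha \partial_r B\big(r(Z^\top\theta_0;\theta_0); \alpha\big) + \partial_\alpha \partial_r C\big(r(Z^\top\theta_0;\theta_0); \alpha\big)\, Y \mid Z\big\}\, \partial_\theta r(Z^\top\theta_0;\theta_0)\big] = 0 \]
because $E(Y \mid Z) = r(Z^\top\theta_0;\theta_0)$ and $\partial_\alpha \partial_r B(r,\alpha) + \partial_\alpha \partial_r C(r,\alpha)\, r \equiv 0$, for any $\alpha$.

For the bandwidth $\hat h$ we obtain an asymptotic equivalence with a theoretical optimal bandwidth minimizing $-T(h; \alpha^*)$, that is we prove that the ratio of the two bandwidths converges to one, in probability. Remark that $-T(h; \alpha)$ is a kind of $\psi$-CV (cross-validation) function. It can be shown that, up to constant additive terms, $-T(h; \alpha)$ is asymptotically equivalent to a weighted mean-squared CV function.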
When $\psi(y, r; \alpha) = -(y - r)^2$, the function $-T(h; \alpha)$ is the usual CV function that one would use for choosing the bandwidth for the Nadaraya-Watson estimator of $E(Y \mid Z^\top\theta_0)$. By extension of classical results for nonparametric regression, it can be proved that the rate of the theoretical optimal bandwidth minimizing $-T(h; \alpha)$ is $n^{-1/5}$ (see Lemma B.3 in Appendix B; see also Härdle, Hall and Ichimura 1993 for the case $\psi(y, r; \alpha) = -(y - r)^2$). Deduce that $\hat h$ is also of order $n^{-1/5}$.

2.4 Extensions

Given the model conditions (2.1)-(2.2), the idea is to choose a LEFN density with mean $r$ and variance $g(r,\alpha)$ and to construct a semiparametric PML estimator given a preliminary estimate of the nuisance parameter $\alpha_0$. However, it may happen that no such LEFN density exists or that one prefers another type of LEFN densities. Then, the idea is to reparametrize the conditional variance of $Y$ given $Z$. More precisely, we may consider
\[ l(y \mid r, \eta) = \exp\big[B(r,\eta) + C(r,\eta)\,y + D(y,\eta)\big], \]
where $\eta$ stands for the nuisance parameter. Let $\Sigma = \Sigma(r,\eta)$ denote the variance of the law given by this density. Assume that for any given $r$, the map $\eta \mapsto \Sigma(r,\eta)$ is one-to-one. In this case, in order to provide a LEFN density with variance $g(r,\alpha)$ it suffices to consider $l(y \mid r, \eta)$ with $\eta = \Sigma^{-1}(r, g(r,\alpha))$. For instance, if $g(r,\alpha) = r(1 + \alpha r^2)$, one may use a Negative Binomial density of mean $r$ and nuisance parameter $\alpha r$. Another solution is to consider a normal density of mean $r$ where the variance equal to $r(1 + \alpha r^2)$ plays the role of the nuisance parameter. In this case, given an estimate of $r(1 + \alpha r^2)$, our semiparametric PML becomes a semiparametric generalized least-squares (GLS) procedure.

Note that this example of function $g(r,\alpha)$ leads us to the situation where the nuisance parameter is replaced by a nuisance function of $r$ and some additional parameters. At the expense of more complicated writings, our methodology can be extended to take into account the case of a nuisance function. More precisely, consider a more general pseudo-log-likelihood function $\psi\big(y, r; \Psi(r, g(r,\alpha))\big)$ where $\Psi(\cdot,\cdot)$ is a given real-valued function and $\alpha$ is the nuisance parameter. See also Gouriéroux, Monfort and Trognon (1984a). To define $(\hat\theta, \hat h)$, one replaces $\tilde\alpha_n$ by $\Psi\big(\hat r_{h_n}(Z_i^\top\theta_n;\theta_n), g(\hat r_{h_n}(Z_i^\top\theta_n;\theta_n), \tilde\alpha_n)\big)$ in equation (2.3), where $(\theta_n, \tilde\alpha_n) \to (\theta_0, \alpha^*)$, in probability, for some $\alpha^*$, and $\hat r_{h_n}(\cdot;\theta)$ is a Nadaraya-Watson estimator of the regression $r(\cdot;\theta)$. The same type of decomposition of the pseudo-log-likelihood criterion into a purely parametric part (function of $\theta$)
\[ \frac{1}{n}\sum_{i=1}^n \Big[\psi\big(Y_i, r(Z_i^\top\theta;\theta); \Psi\big(r(Z_i^\top\theta_0;\theta_0), g(r(Z_i^\top\theta_0;\theta_0), \alpha^*)\big)\big) - \psi\big(Y_i, r(Z_i^\top\theta_0;\theta_0); \Psi\big(r(Z_i^\top\theta_0;\theta_0), g(r(Z_i^\top\theta_0;\theta_0), \alpha^*)\big)\big)\Big]\, I_A(Z_i), \]
a purely nonparametric part (function of $h$)
\[ T(h; \alpha^*) = \frac{1}{n}\sum_{i=1}^n \psi\big(Y_i, \hat r_h^i(Z_i^\top\theta_0;\theta_0); \Psi\big(r(Z_i^\top\theta_0;\theta_0), g(r(Z_i^\top\theta_0;\theta_0), \alpha^*)\big)\big)\, I_A(Z_i) \]
and a negligible reminder (function of $\theta$ and $h$) can be used. For brevity, the details of this more general case are omitted. However, we sketch a quick argument that applies for the semiparametric GLS.²

² This semiparametric generalized least-squares procedure is a particular case for Picone and Butler (2000). However, they do not provide a bandwidth rule.

Consider the semiparametric GLS criterion
\[ \widehat S(\theta, h; \theta_n, \tilde\alpha_n, h_n) = -\frac{1}{n}\sum_{i=1}^n g\big(\hat r_{h_n}(Z_i^\top\theta_n;\theta_n); \tilde\alpha_n\big)^{-1} \big[Y_i - \hat r_h^i(Z_i^\top\theta;\theta)\big]^2\, I_A(Z_i) \]
with $(\theta_n, \tilde\alpha_n) \to (\theta_0, \alpha^*)$, in probability, and $h_n$, $n \ge 1$, a sequence of bandwidths. Assume that
\[ \max_{1 \le i \le n} \big| g\big(\hat r_{h_n}(Z_i^\top\theta_n;\theta_n); \tilde\alpha_n\big) - g\big(r(Z_i^\top\theta_0;\theta_0); \alpha^*\big) \big|\, I_A(Z_i) = o_P(1) \tag{2.10} \]
and $g\big(r(z^\top\theta_0;\theta_0); \alpha^*\big)\, I_A(z)$ stays away from zero. Then the GLS criterion $\widehat S(\theta, h; \theta_n, \tilde\alpha_n, h_n)$ is asymptotically equivalent to the infeasible GLS criterion
\[ -\frac{1}{n}\sum_{i=1}^n \big[Y_i - \hat r_h^i(Z_i^\top\theta;\theta)\big]^2\, g\big(r(Z_i^\top\theta_0;\theta_0); \alpha^*\big)^{-1}\, I_A(Z_i), \]
that is we can decompose the two criteria in such a way that, up to negligible reminders, they have exactly the same purely parametric and purely nonparametric parts. Finally, we apply the methodology³ described in the previous subsection with $\psi(y, r; \alpha) = -(y - r)^2$ and the trimming $I_A(Z_i)$ multiplied by $g\big(r(Z_i^\top\theta_0;\theta_0); \alpha^*\big)^{-1}$. In order to ensure condition (2.10), it suffices to suppose that the map $(r, \alpha) \mapsto g(r; \alpha)$ satisfies a Lipschitz condition and that $h_n$ is such that
\[ \max_{1 \le i \le n} \big| \hat r_{h_n}(Z_i^\top\theta_0;\theta_0) - r(Z_i^\top\theta_0;\theta_0) \big|\, I_A(Z_i) = o_P(1) \]
and $\max_{1 \le i \le n} \big\| \partial_\theta \hat r_{h_n}(Z_i^\top\theta;\theta) \big\|\, I_A(Z_i)$ is bounded in probability, uniformly with respect to $\theta$ in $o_P(1)$ neighborhoods of $\theta_0$. For instance, a bandwidth of order $n^{-1/5}$ satisfies these conditions (see Andrews 1995; see also Delecroix, Hristache and Patilea 2006).

³ Notice that the trimming function $z \mapsto I_A(z)$ with $A = \{z : f(z^\top\theta_0;\theta_0) \ge c\}$ can be written as a function of $z^\top\theta_0$. In view of our proofs, it becomes obvious that the methodology described in the previous subsection remains valid if $I_A(Z_i)$ is multiplied by a function depending only on $Z_i^\top\theta_0$.

Other possible extensions of the framework we consider are to allow a multi-index regression and/or multivariate dependent variables. For instance, the SIM condition can be replaced by the multi-index condition $E(Y \mid Z) = E(Y \mid Z^\top\theta_0^1, \dots, Z^\top\theta_0^p)$ with $p$ smaller than the dimension of $Z$, while the second order moment condition remains $Var(Y \mid Z) = g(E(Y \mid Z), \alpha_0)$. On the other hand, for multivariate dependent variables one may consider PML estimation based on the multivariate normal or multivariate generalizations of Poisson, Negative Binomial distributions (Johnson, Kotz and Balakrishnan). The decomposition of the pseudo-log-likelihood in $S$, $T$ and $R$ as above can still be used for these cases but the detailed analysis of these extensions will be considered elsewhere.

3 Asymptotic results

In this section we obtain the asymptotic distribution for $\hat\theta$ and the corresponding estimator of the regression function $r(t;\theta) = E[Y \mid Z^\top\theta = t]$ as well as the asymptotic behavior of $\hat h$, with $(\hat\theta, \hat h)$ defined in (2.3). A consistent estimator for the asymptotic variance matrix of $\hat\theta$ is proposed. Moreover, a lower bound for the asymptotic variance matrix of $\hat\theta$ is derived. For the identifiability of the parameter of interest $\theta_0$, hereafter fix its first component, that is $\theta_0 = (1, \tilde\theta_0^\top)^\top$, $\tilde\theta_0 \in \mathbb{R}^{d-1}$. Therefore, we shall implicitly identify a vector $\theta = (1, \tilde\theta^\top)^\top$ with its last $d-1$ components and redefine the symbol $\partial_\theta$ as being the vector of the first order partial derivatives with respect to the last $d-1$ components of $\theta$.

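Let $v(t;\theta) = Var(Y \mid Z^\top\theta = t)$. If the SIM assumption and variance condition (2.2) hold, then $v(Z^\top\theta_0;\theta_0) = g\big(r(Z^\top\theta_0;\theta_0), \alpha_0\big)$. For a given $\theta$, let $r'(\cdot;\theta)$ and $r''(\cdot;\theta)$ denote the first and second order derivatives of the function $r(\cdot;\theta)$. Similarly, $f'(\cdot;\theta)$ is the derivative of $f(\cdot;\theta)$. Define⁴
\[ C_1 = -\frac{K_1^2}{4}\, E\left\{ \frac{1}{2}\, \partial_r C\big(r(Z^\top\theta_0;\theta_0); \alpha^*\big) \left[ r''(Z^\top\theta_0;\theta_0) + \frac{2\, r'(Z^\top\theta_0;\theta_0)\, f'(Z^\top\theta_0;\theta_0)}{f(Z^\top\theta_0;\theta_0)} \right]^2 I_A(Z) \right\}, \tag{3.1} \]
\[ C_2 = -K_2\, E\left\{ \frac{1}{2}\, \partial_r C\big(r(Z^\top\theta_0;\theta_0); \alpha^*\big)\, \frac{1}{f(Z^\top\theta_0;\theta_0)}\, v(Z^\top\theta_0;\theta_0)\, I_A(Z) \right\}, \]
with $K_1 = \int u^2 K(u)\,du$, $K_2 = \int K^2(u)\,du$, and consider
\[ h_n^{opt} = \arg\max_h \big( C_1 h^4 + C_2\, n^{-1} h^{-1} \big) = \big(C_2 / 4C_1\big)^{1/5}\, n^{-1/5}. \]
Define the $(d-1) \times (d-1)$ matrices
\[ I = E\Big\{ \big[\partial_r C\big(r(Z^\top\theta_0;\theta_0); \alpha^*\big)\big]^2\, v(Z^\top\theta_0;\theta_0)\, \partial_\theta r(Z^\top\theta_0;\theta_0)\, \partial_\theta r(Z^\top\theta_0;\theta_0)^\top\, I_A(Z) \Big\}, \]
\[ J = E\Big[ \partial_r C\big(r(Z^\top\theta_0;\theta_0); \alpha^*\big)\, \partial_\theta r(Z^\top\theta_0;\theta_0)\, \partial_\theta r(Z^\top\theta_0;\theta_0)^\top\, I_A(Z) \Big]. \]
Note that $I = J$ if the variance condition (2.2) holds and $\alpha^* = \alpha_0$.

⁴ Note that $\partial^2_{22}\psi(y, r) = \partial^2_{rr} C(r,\alpha)\,(y - r) - \partial_r C(r,\alpha)$. Thus, $\partial_r C$ can be replaced by $-\partial^2_{22}\psi$ in the definition of the constants $C_1$ and $C_2$.

Now, we deduce the asymptotic normality of the semiparametric PML estimator $\hat\theta$ in the presence of a nuisance parameter. Moreover, we obtain the rate of decay to zero of the optimal bandwidth $\hat h$. The proof of the following result is given in the Appendix.

Theorem 3.1 Suppose that the assumptions in Appendix A hold. Define the set $\Theta_n = \{\theta : \|\theta - \theta_0\| \le d_n\}$, $n \ge 1$, with $d_n \ln n \to 0$ and $\tilde\alpha_n$, $n \ge 1$, such that $\tilde\alpha_n - \alpha^* = o_P(1)$. Fix $c > 0$. If $(\hat\theta, \hat h)$ is defined as in (2.3)-(2.4), then $\hat h / h_n^{opt} \to 1$, in probability, and
\[ \sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{D} N\big(0, J^{-1} I J^{-1}\big). \]
If $Z$ is bounded, the same conclusion remains true for any sequence $d_n \to 0$.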
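The closed form for the theoretical optimal bandwidth can be checked against a direct grid maximization of $h \mapsto C_1 h^4 + C_2 n^{-1} h^{-1}$. The sketch below assumes that both constants are negative, which is what makes the arg max well defined; the numerical values of the constants are hypothetical placeholders, since in practice $C_1$ and $C_2$ involve unknown population quantities.

```python
import numpy as np

def optimal_bandwidth(c1, c2, n):
    """Maximizer of h -> c1 * h**4 + c2 / (n * h) over h > 0 when c1, c2 < 0."""
    if c1 >= 0 or c2 >= 0:
        raise ValueError("both constants must be negative")
    return (c2 / (4.0 * c1)) ** 0.2 * n ** (-0.2)

def approximate_objective(h, c1, c2, n):
    """Leading terms of the nonparametric part of the criterion."""
    return c1 * h**4 + c2 / (n * h)

c1, c2, n = -2.0, -0.5, 1000   # hypothetical constants, for illustration only
h_opt = optimal_bandwidth(c1, c2, n)

# Brute-force check on a fine grid around the closed-form value.
grid = np.linspace(0.5 * h_opt, 1.5 * h_opt, 2001)
h_grid = grid[np.argmax(approximate_objective(grid, c1, c2, n))]
```

The first-order condition $4 C_1 h^3 - C_2 n^{-1} h^{-2} = 0$ gives $h^5 = C_2/(4 C_1 n)$, which is the closed form coded above.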

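In applications $J^{-1} I J^{-1}$ is unknown and therefore it has to be consistently estimated. To this end, we propose a usual sandwich estimator of the asymptotic variance $J^{-1} I J^{-1}$ (e.g., Ichimura 1993). Let $\hat f_h(\cdot;\theta)$ denote the kernel estimator for the density of $Z^\top\theta$. Define
\[ \hat I_n = \frac{1}{n}\sum_{i=1}^n \big[\partial_r C\big(\hat r_{\hat h}(Z_i^\top\hat\theta;\hat\theta); \tilde\alpha_n\big)\big]^2 \big[Y_i - \hat r_{\hat h}(Z_i^\top\hat\theta;\hat\theta)\big]^2\, \partial_\theta \hat r_{\hat h}(Z_i^\top\hat\theta;\hat\theta)\, \partial_\theta \hat r_{\hat h}(Z_i^\top\hat\theta;\hat\theta)^\top\, I_{\{z:\, \hat f_{\hat h}(z^\top\hat\theta;\hat\theta) \ge c\}}(Z_i), \]
\[ \hat J_n = \frac{1}{n}\sum_{i=1}^n \partial_r C\big(\hat r_{\hat h}(Z_i^\top\hat\theta;\hat\theta); \tilde\alpha_n\big)\, \partial_\theta \hat r_{\hat h}(Z_i^\top\hat\theta;\hat\theta)\, \partial_\theta \hat r_{\hat h}(Z_i^\top\hat\theta;\hat\theta)^\top\, I_{\{z:\, \hat f_{\hat h}(z^\top\hat\theta;\hat\theta) \ge c\}}(Z_i). \]

Proposition 3.2 Suppose that the conditions of Theorem 3.1 hold. Then, $\hat J_n^{-1} \hat I_n \hat J_n^{-1} \to J^{-1} I J^{-1}$, in probability.

Proof. The arguments are quite standard (e.g., Ichimura 1993, section 7). On one hand, the convergence in probability of $\hat\theta$ and $\tilde\alpha_n$ and, on the other hand, the convergence in probability of $\hat r_{\hat h}(z^\top\theta;\theta)$ and $\partial_\theta \hat r_{\hat h}(z^\top\theta;\theta)$, uniformly over $\theta$ in neighborhoods shrinking to $\theta_0$ and uniformly over $z \in A$ (e.g., Andrews 1995, Delecroix, Hristache and Patilea 2006), imply $\hat I_n \to I$ and $\hat J_n \to J$, in probability.

Theorem 3.1 shows, in particular, that $\hat\theta$ is asymptotically equivalent to the semiparametric PML based on the LEF pseudo-log-likelihood $\psi(y, r; \alpha^*) = \ln l(y \mid r, \alpha^*)$. As in the parametric case, we can deduce a lower bound for the asymptotic variance $J^{-1} I J^{-1}$ with respect to semiparametric PML based on LEF densities. This bound is achieved by $\hat\theta$ if the SIM assumption and the variance condition (2.2) hold and $\alpha^* = \alpha_0$. The proof of the following proposition is identical to the proof of Property 5 of Gouriéroux, Monfort and Trognon (1984a, page 687) and thus it will be skipped.

Proposition 3.3 The set of asymptotic variance matrices of the semiparametric PML estimators based on linear exponential families has a lower bound equal to $\mathcal{K}$, where
\[ \mathcal{K}^{-1} = E\Big\{ \big[v(Z^\top\theta_0;\theta_0)\big]^{-1}\, \partial_\theta r(Z^\top\theta_0;\theta_0)\, \partial_\theta r(Z^\top\theta_0;\theta_0)^\top\, I_A(Z) \Big\}. \]

Concerning the nonparametric part, we have the following result on the asymptotic distribution of the nonparametric estimator of the regression. The proof is omitted (see Härdle and Stoker).

Proposition 3.4 Assume that the conditions of Theorem 3.1 are fulfilled. Then, for any $t$ such that $f(t;\theta_0) > 0$,
\[ \sqrt{n \hat h}\, \Big( \hat r_{\hat h}(t;\hat\theta) - r(t;\theta_0) - \hat h^2 \beta(t) \Big) \xrightarrow{D} N\big(0,\ K_2\, v(t;\theta_0)\, f(t;\theta_0)^{-1}\big), \]
where $\beta(t) = (K_1/2)\,\big[ r''(t;\theta_0) + 2\, r'(t;\theta_0)\, f'(t;\theta_0)\, f(t;\theta_0)^{-1} \big]$.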
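Returning to the sandwich estimator introduced before Proposition 3.2, its matrix algebra can be sketched as follows. This is only an illustration: the per-observation ingredients (the derivative $\partial_r C$ at the fitted regression, the residuals, the gradients of the fitted regression with respect to the free components of $\theta$, and the trimming indicator) are assumed to be computed elsewhere, and the arrays used in the check are synthetic.

```python
import numpy as np

def sandwich_variance(dC, residuals, grads, keep):
    """Return (I_n, J_n, J_n^{-1} I_n J_n^{-1}) from per-observation ingredients.

    dC        : (n,) values of d/dr C at the fitted regression
    residuals : (n,) values of Y_i minus the fitted regression
    grads     : (n, d-1) gradients of the fitted regression w.r.t. theta
    keep      : (n,) 0/1 trimming indicator
    """
    n = grads.shape[0]
    w_i = (dC * residuals) ** 2 * keep
    w_j = dC * keep
    I_n = (grads * w_i[:, None]).T @ grads / n
    J_n = (grads * w_j[:, None]).T @ grads / n
    J_inv = np.linalg.inv(J_n)
    return I_n, J_n, J_inv @ I_n @ J_inv

# Synthetic check: if dC = 1/v and the squared residuals equal v exactly,
# then I_n = J_n and the sandwich collapses to the inverse of J_n.
rng = np.random.default_rng(3)
grads = rng.normal(size=(500, 2))
v = rng.uniform(0.5, 2.0, size=500)
I_n, J_n, V_n = sandwich_variance(1.0 / v, np.sqrt(v), grads, np.ones(500))
```

The synthetic case mirrors the remark that $I = J$ when the variance condition is correctly specified.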

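Note that, for any $z$ such that $f(z^\top\theta_0;\theta_0) > 0$,
\[ \sqrt{n \hat h}\, \Big( \hat r_{\hat h}(z^\top\hat\theta;\hat\theta) - r(z^\top\theta_0;\theta_0) - \hat h^2 \beta(z^\top\theta_0) \Big) \xrightarrow{D} N\big(0,\ K_2\, v(z^\top\theta_0;\theta_0)\, f(z^\top\theta_0;\theta_0)^{-1}\big). \]
Indeed, use the results of Andrews (1995) to deduce that $\partial_\theta \hat r_{\hat h}(z^\top\theta;\theta) - \partial_\theta r(z^\top\theta;\theta) \to 0$, in probability, uniformly over neighborhoods of $\theta_0$ where $f(z^\top\theta;\theta)$ stays away from zero. Therefore, we can write
\[ \hat r_{\hat h}(z^\top\hat\theta;\hat\theta) - r(z^\top\theta_0;\theta_0) = \big[\hat r_{\hat h}(z^\top\hat\theta;\hat\theta) - \hat r_{\hat h}(z^\top\theta_0;\theta_0)\big] + \big[\hat r_{\hat h}(z^\top\theta_0;\theta_0) - r(z^\top\theta_0;\theta_0)\big] \]
\[ = \partial_\theta \hat r_{\hat h}(z^\top\theta_0;\theta_0)^\top (\hat\theta - \theta_0) + o_P\big(\|\hat\theta - \theta_0\|\big) + \big[\hat r_{\hat h}(z^\top\theta_0;\theta_0) - r(z^\top\theta_0;\theta_0)\big] \]
\[ = O_P\big(\|\hat\theta - \theta_0\|\big) + \big[\hat r_{\hat h}(z^\top\theta_0;\theta_0) - r(z^\top\theta_0;\theta_0)\big] \]
and obtain the asymptotic normality of $\hat r_{\hat h}(z^\top\hat\theta;\hat\theta)$ as a consequence of the $\sqrt{n}$-consistency of $\hat\theta$ and the asymptotic behavior of the Nadaraya-Watson estimator.

4 Two-step semiparametric PML

Here, we consider a two-step semiparametric PML procedure that can be applied in semiparametric single-index regression models when a conditional variance condition like
\[ Var(Y \mid Z) = g\big(E(Y \mid Z), \alpha_0\big) = g\big(r(Z^\top\theta_0;\theta_0), \alpha_0\big) \tag{4.2} \]
is specified. Assume that this conditional variance condition is correctly specified. At the end of this section we also discuss the misspecification case.

First, we have to build a sequence $\{\theta_n\}$ with limit $\theta_0$. Moreover, in the case of unbounded covariates, $\theta_n$ should approach $\theta_0$ faster than $1/\ln n$. For this purpose, we maximize with respect to $\theta$ a pseudo-likelihood based on a LEF density $l(y \mid r)$. We use a fixed trimming $I_B(\cdot)$ with $B$ a subset of $\mathbb{R}^d$ such that, for any $\theta$ and any $z \in B$, we have $f(z^\top\theta;\theta) \ge c > 0$. To ensure consistency for such a PML estimator, we have to check that
\[ \theta_0 = \arg\max_\theta E\big[\ln l\big(Y \mid r(Z^\top\theta;\theta)\big)\, I_B(Z)\big], \tag{4.3} \]
and $\theta_0$ is unique with this property. Recall that the SIM condition specifies $\theta_0$ as the unique vector satisfying $E[Y \mid Z] = E[Y \mid Z^\top\theta_0]$. On the other hand, if $\ln l(y \mid r) = B(r) + C(r)\,y + D(y)$, then $B(m) + C(m)\,r \le B(r) + C(r)\,r$ (cf. Property 4, Gouriéroux, Monfort and Trognon 1984a, page 684). Deduce that for any $z$,
\[ \theta_0 = \arg\max_\theta E\big[\ln l\big(Y \mid r(z^\top\theta;\theta)\big) \mid Z = z\big] \]
and $\theta_0$ is the unique maximizer. Hence, condition (4.3) holds for any set $B$. This leads us to the following definition of a preliminary estimator.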

STEP 1 (preliminary step). Consider a sequence of bandwidths $h_n$, $n \ge 1$, such that $n^{\varepsilon} h_n \to 0$ and $n^{1/2-\varepsilon} h_n \to \infty$ for some $0 < \varepsilon < 1/2$. Moreover, let $l(y \mid r)$ be a LEF density. Define
\[ \theta_n = \arg\max_\theta \frac{1}{n}\sum_{i=1}^n \ln l\big(Y_i \mid \hat r_{h_n}(Z_i^\top\theta;\theta)\big)\, I_B(Z_i). \]
Delecroix, Hristache and Patilea (2006) showed that, under the regularity conditions required by Theorem 3.1, we have $\theta_n - \theta_0 = o_P(1/\ln n)$.

Using the preliminary estimate $\theta_n$ and the variance condition (4.2) we can build $\tilde\alpha_n$, $n \ge 1$, such that $\tilde\alpha_n \to \alpha_0$, in probability (see the end of this section). Let $l(y \mid r, \alpha)$ denote a LEFN density with mean $r$ and variance $g(r,\alpha)$. Consider $c_n \to \infty$ (e.g., $c_n = \ln n$), define $\mathcal{H}_n = \{h : c_n n^{-1/4} \le h \le c_n^{-1} n^{-1/8}\}$. Moreover, consider $\Theta_n = \{\theta : \|\theta - \theta_0\| \le d_n\}$, $n \ge 1$, with $\{d_n\}$ as in Theorem 3.1. Fix some small $c > 0$.

STEP 2. Define
\[ (\hat\theta, \hat h) = \arg\max_{\theta \in \Theta_n,\ h \in \mathcal{H}_n} \frac{1}{n}\sum_{i=1}^n \ln l\big(Y_i \mid \hat r_h^i(Z_i^\top\theta;\theta); \tilde\alpha_n\big)\, I_{\{z:\, \hat f_{h_n}^i(z^\top\theta_n;\theta_n) \ge c\}}(Z_i), \]
with $\theta_n$ and $h_n$ from Step 1.

The following result is a direct consequence of Theorem 3.1.

Corollary 4.1 Suppose that the assumptions of Theorem 3.1 hold. If $\hat\theta$ and $\hat h$ are obtained as in Step 2 above, then
\[ \sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{D} N(0, \mathcal{K}), \]
with
\[ \mathcal{K}^{-1} = E\Big\{ \big[v(Z^\top\theta_0;\theta_0)\big]^{-1}\, \partial_\theta r(Z^\top\theta_0;\theta_0)\, \partial_\theta r(Z^\top\theta_0;\theta_0)^\top\, I_A(Z) \Big\}. \]
Moreover,
\[ \frac{\hat h}{(C_2 / 4C_1)^{1/5}\, n^{-1/5}} \to 1, \]
in probability, where $C_1$ and $C_2$ are defined as in (3.1) with $\alpha^* = \alpha_0$.

Remark 1. Let us point out that simultaneous optimization of the semiparametric criterion in Step 2 with respect to $\theta$, $\alpha$ and $h$, or with respect to $\theta$ and $\alpha$ for a given $h$, is not recommended, even if the conditional variance $Var(Y \mid Z)$ is correctly specified. Indeed, if the true conditional distribution of $Y$ given $Z$ is not the one given by the LEFN density $l(y \mid r, \alpha) = \exp \psi(y, r; \alpha)$, joint optimization with respect to $\theta$ and $\alpha$ leads, in general, to an inconsistent estimate of $\alpha_0$. This failure is well known in the parametric case where $r$ is a known function (see comments of Cameron and Trivedi 2013). In view of decomposition (2.7) we deduce that this fact also happens in the semiparametric framework where $r$ has to be estimated. In this case the matrices $I$ and $J$ defined in section 3 are no longer equal and thus the asymptotic variance of the one-step semiparametric estimator of $\theta$ obtained by simultaneous maximization of the criterion with respect to $\theta$, $\alpha$ does not achieve the bound $\mathcal{K}$. However, when the SIM condition holds and the true conditional law of $Y$ is given by the LEFN density $l = \exp \psi$, our two-step estimator $\hat\theta$ and the semiparametric MLE of $\theta_0$ obtained by simultaneous optimization with respect to $\theta$, $\alpha$ are asymptotically equivalent.

Remark 2. Note that if we ignore the efficiency loss due to trimming, $\mathcal{K}$ is equal to the efficiency bound in the semiparametric model defined only by the single-index condition $E(Y \mid Z) = E(Y \mid Z^\top\theta_0)$ when the variance condition (4.2) holds. To see this, apply the bound of Newey and Stoker (1993) with the true variance given by (4.2). Our two-stage estimator achieves this SIM efficiency bound if the variance is well specified. However, this SIM bound is not necessarily the two moment conditions model bound. The latter should take into account the variance condition (see Newey 1993, section 3.2, for a similar discussion in the parametric nonlinear regression framework). In other words our two-stage estimator has some optimality properties but it may not achieve the semiparametric efficiency bound of the two moment conditions model. The same remark applies for the two-stage semiparametric generalized least squares (GLS) procedure of Härdle, Hall and Ichimura (1993) [see also Picone and Butler 2000]. Achieving semiparametric efficiency when the first two moments are specified would be possible, for instance, by estimating higher order conditional moments nonparametrically. However, in this case we face again the problem of the curse of dimensionality that we tried to avoid by assuming the SIM condition.

To complete the definition of the two-step procedure above, we have to indicate how to build a consistent sequence $\{\tilde\alpha_n\}$. Such a sequence can be obtained from the moment condition (4.2) after replacing $r(z^\top\theta_0;\theta_0)$ by a suitable estimator.
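This kind of procedure is commonly used in the semiparametric literature (e.g., Newey and McFadden). For simplicity, let us only consider the Negative Binomial case where, for any $z$, we have

$$E\big[(Y - E(Y \mid Z))^2 \mid Z = z\big] = r(z^\top\theta_0;\theta_0)\,\big[1 + \alpha_0\, r(z^\top\theta_0;\theta_0)\big]. \qquad (4.4)$$

Consider a set $B \subset \mathbb{R}^d$ such that, for any $\theta$ and any $z \in B$, we have $f(z^\top\theta;\theta) \ge c > 0$. We can write

$$E\Big\{ E\big[ (Y - r(Z^\top\theta_0;\theta_0))^2 - r(Z^\top\theta_0;\theta_0) \,\big|\, Z \big]\, I_B(Z) \Big\} = \alpha_0\, E\big\{ r(Z^\top\theta_0;\theta_0)^2\, I_B(Z) \big\}.$$

Footnote 5. One can expect little influence of the choice of the bandwidth used to construct the estimator $\tilde\alpha_n$ of $\alpha_0$. This is indeed confirmed by the simulation experiments we report in the section on Monte Carlo simulations.

Consequently, we may estimate $\alpha_0$ (see footnote 5) by

$$\tilde\alpha_n = \frac{\dfrac{1}{n}\displaystyle\sum_{i=1}^{n} \Big[ \big(Y_i - \hat r_{h_n}(Z_i^\top\tilde\theta_n;\tilde\theta_n)\big)^2 - \hat r_{h_n}(Z_i^\top\tilde\theta_n;\tilde\theta_n) \Big]\, I_B(Z_i)}{\dfrac{1}{n}\displaystyle\sum_{i=1}^{n} \hat r_{h_n}(Z_i^\top\tilde\theta_n;\tilde\theta_n)^2\, I_B(Z_i)} \qquad (4.5)$$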

with $\tilde\theta_n$ and $h_n$ from Step 1 and $\hat r_h$ the Nadaraya-Watson estimator with bandwidth $h$. Since $\tilde\theta_n \to \theta_0$, we deduce that $\tilde\alpha_n \to \alpha_0$ in probability (see also the arguments we used in subsection 2.4).

Now, let us comment on what happens with our two-step procedure if the second order moment condition is misspecified, while the SIM condition still holds. In general, the sequence $\tilde\alpha_n$ one may derive from the conditional variance condition and the preliminary estimate of $\theta_0$ is still convergent to some pseudo-true value $\alpha^*$ of the nuisance parameter (see footnote 6). Then, the behavior of $(\hat\theta_n, \hat h_n)$ yielded by Step 2 is described by Theorem 3.1, that is, $\hat\theta_n$ is still asymptotically normal and $\hat h_n$ is still of order $n^{-1/5}$.

Finally, if the SIM condition does not hold, then $\hat\theta_n$ estimates a kind of first projection-pursuit direction. In this case, our procedure provides an alternative to the minimum average conditional variance estimation (MAVE) procedure of Xia et al. The novelty would be that the first projection direction is defined through a more flexible PML function than the usual least-squares criterion. This case will be analyzed elsewhere.

5 Empirical evidence

In our empirical section we consider the case of a count response variable $Y$. A benchmark model for studying event counts is the Poisson regression model. Different variants of the Poisson regression have been used in applications on the number of patents applied for and received by firms, bank failures, worker absenteeism, airline or car accidents, doctor visits, etc. Cameron and Trivedi (2013) provide an overview of the applications of Poisson regression. In the basic setup, the regression function is log-linear. An additional unobserved multiplicative random error term in the conditional mean function is usually used to account for unobserved heterogeneity. In this section we consider semiparametric single-index extensions of such models.

5.1 Monte Carlo simulations

To evaluate the finite sample performances of our estimator $\hat\theta$ and of the optimal bandwidth $\hat h$, we conduct a simulation experiment with 500 replications.
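Footnote 6. For instance, $\tilde\alpha_n$ defined in (4.5) is convergent in probability to

$$\alpha^* = \frac{E\big[(Y - r(Z^\top\theta_0;\theta_0))^2\, I_B(Z)\big] - E\big[r(Z^\top\theta_0;\theta_0)\, I_B(Z)\big]}{E\big[r(Z^\top\theta_0;\theta_0)^2\, I_B(Z)\big]}.$$

To ensure that the limit of $\tilde\alpha_n$ is positive, one may replace $\tilde\alpha_n$ by $\max(\tilde\alpha_n, \rho)$ for some small but positive $\rho$.

We consider three explanatory variables $Z = (Z_1, Z_2, Z_3)^\top \sim N(0, \Sigma)$ with $\Sigma = [\sigma_{ij}]_{3\times 3}$ and $\sigma_{ij} = 0.5^{|i-j|}$. The regression function is $E(Y \mid Z) = r(Z^\top\theta_0;\theta_0)$ and $\theta_0 = (\theta_0^1, \theta_0^2, \theta_0^3)^\top = (1, 3, 2)^\top$. The conditional distribution of $Y$ given $Z$ and $\varepsilon$ is Poisson of mean $r(Z^\top\theta_0;\theta_0)\,\varepsilon$ with $\varepsilon$ independent of $Z$ and distributed according to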

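Gamma(0.5, 2) or Uniform(0, 2). Thus, the conditional variance of $Y$ given $Z$ is given by the function $g(r, \alpha) = r(1 + \alpha r)$ with $\alpha_0 = 2$ for $\varepsilon \sim$ Gamma(0.5, 2) and $\alpha_0 = 1/3$ for $\varepsilon \sim$ Uniform(0, 2). For this simulation experiment we generate samples of size $n = 200$ and $n = 300$. For the nonparametric part we use a quartic kernel $K(u) = (15/16)(1 - u^2)^2\, I_{[-1,1]}(u)$. To estimate the parameter $\theta_0$ and the regression $r(\cdot;\theta_0)$ we use two semiparametric two-step estimation procedures as defined in section 4:

(i) a procedure with a Poisson PML in the first step and a Negative Binomial PML in the second step; let $\hat\theta_{NB\text{-}SP} = (1, \hat\theta^2_{NB\text{-}SP}, \hat\theta^3_{NB\text{-}SP})^\top$ denote the two-step estimator;

(ii) a procedure with a least-squares method in the first step and a GLS method in the second step; let $\hat\theta_{GLS\text{-}SP} = (1, \hat\theta^2_{GLS\text{-}SP}, \hat\theta^3_{GLS\text{-}SP})^\top$ be the two-step estimator.

Note that $\hat\theta_{NB\text{-}SP}$ and $\hat\theta_{GLS\text{-}SP}$ have the same asymptotic variance. In both two-step procedures considered, we estimate $\alpha_0$ using the estimator defined in (4.5). The bandwidth $h_n$ is equal to $3 n^{-1/5}$. We also consider the parametric two-step GLS method as a benchmark. In this case the link function and the variance parameter are considered given; let $\hat\theta_{GLS\text{-}P} = (1, \hat\theta^2_{GLS\text{-}P}, \hat\theta^3_{GLS\text{-}P})^\top$ denote the corresponding estimator.

Table 1. Poisson regression with unobserved heterogeneity $\varepsilon \sim$ Gamma(0.5, 2). The true conditional variance of $Y$ given $Z$ is $r(Z^\top\theta_0;\theta_0)\,[1 + 2\, r(Z^\top\theta_0;\theta_0)]$. The true vector $\theta_0$ is $(1, 3, 2)$. Let $\hat\theta_{NB\text{-}SP}$ and $\hat\theta_{GLS\text{-}SP}$ denote the two-step estimators obtained from the Negative Binomial pseudo-likelihood and the GLS criterion, respectively. The first step Poisson PML estimator is denoted by $\hat\theta_{POI\text{-}SP}$. The superscripts indicate the components of the vectors.
Columns: $\hat\theta^2_{GLS\text{-}P}$, $\hat\theta^2_{GLS\text{-}SP}$, $\hat\theta^2_{POI\text{-}SP}$, $\hat\theta^2_{NB\text{-}SP}$, $\hat\theta^3_{GLS\text{-}P}$, $\hat\theta^3_{GLS\text{-}SP}$, $\hat\theta^3_{POI\text{-}SP}$, $\hat\theta^3_{NB\text{-}SP}$.
Rows, for each sample size $n$: mean, std, MSE.

Table 2. The same setup as in Table 1 but with $\varepsilon \sim$ Uniform(0, 2) and the true conditional variance of $Y$ given $Z$ equal to $r(Z^\top\theta_0;\theta_0)\,[1 + (1/3)\, r(Z^\top\theta_0;\theta_0)]$.
Columns: $\hat\theta^2_{GLS\text{-}P}$, $\hat\theta^2_{GLS\text{-}SP}$, $\hat\theta^2_{POI\text{-}SP}$, $\hat\theta^2_{NB\text{-}SP}$, $\hat\theta^3_{GLS\text{-}P}$, $\hat\theta^3_{GLS\text{-}SP}$, $\hat\theta^3_{POI\text{-}SP}$, $\hat\theta^3_{NB\text{-}SP}$.
Rows, for each sample size $n$: mean, std, MSE.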
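The simulation design and the moment estimator of the variance parameter can be sketched in a few lines. The block below is only an illustration under stated assumptions: the link `np.abs` is a placeholder (the link function used in the experiment is not reproduced here), the function names are hypothetical, and the leave-one-out form of the Nadaraya-Watson smoother is a choice made for this sketch. It draws $Z \sim N(0,\Sigma)$ with $\sigma_{ij} = 0.5^{|i-j|}$, generates Poisson counts with multiplicative heterogeneity, and evaluates the ratio-of-moments estimator of $\alpha_0$ with the indicator $I_B$ passed as a boolean mask.

```python
import numpy as np

def quartic_kernel(u):
    """Quartic kernel K(u) = (15/16)(1 - u^2)^2 on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 15.0 / 16.0 * (1.0 - u**2) ** 2, 0.0)

def nw_leave_one_out(index, y, h):
    """Leave-one-out Nadaraya-Watson estimate of E(Y | index) at the sample points.

    Returns NaN where no other observation falls within the kernel support.
    """
    w = quartic_kernel((index[:, None] - index[None, :]) / h)
    np.fill_diagonal(w, 0.0)  # drop the i-th observation from its own fit
    denom = w.sum(axis=1)
    r_hat = np.full(len(y), np.nan)
    ok = denom > 0
    r_hat[ok] = (w[ok] @ y) / denom[ok]
    return r_hat

def alpha_moment(y, r, keep):
    """Ratio-of-moments estimator: sum{[(y - r)^2 - r] 1_B} / sum{r^2 1_B}."""
    y, r = y[keep], r[keep]
    return np.sum((y - r) ** 2 - r) / np.sum(r**2)

def simulate(n, rng, link=np.abs, theta0=(1.0, 3.0, 2.0), hetero="gamma"):
    """Poisson counts with mean link(Z'theta0) * eps; `link` is an ASSUMED placeholder."""
    idx = np.arange(3)
    sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
    z = rng.multivariate_normal(np.zeros(3), sigma, size=n)
    r = link(z @ np.asarray(theta0))
    if hetero == "gamma":
        eps = rng.gamma(0.5, 2.0, size=n)      # mean 1, variance 2
    else:
        eps = rng.uniform(0.0, 2.0, size=n)    # mean 1, variance 1/3
    y = rng.poisson(r * eps).astype(float)
    return z, y, r

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 300
    z, y, r = simulate(n, rng, hetero="gamma")
    index = z @ np.array([1.0, 3.0, 2.0])      # true index, used here for illustration
    r_hat = nw_leave_one_out(index, y, h=3.0 * n ** (-1 / 5))
    keep = np.isfinite(r_hat)                  # crude stand-in for the trimming set B
    print("alpha (oracle r):", alpha_moment(y, r, np.ones(n, dtype=bool)))
    print("alpha (smoothed r):", alpha_moment(y, r_hat, keep))
```

With the true regression values plugged in, the estimator should settle near 2 under Gamma(0.5, 2) heterogeneity and near 1/3 under Uniform(0, 2) as the sample grows; at $n = 200$ or $300$ it is noticeably noisy, which is one reason to floor it at a small positive value as suggested in footnote 6.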

The results on the estimates of the components of $\theta_0$ are provided in Table 1 and Table 2. We report the mean, the standard deviation and the estimated mean squared error (MSE) for each component. The two semiparametric estimators that incorporate the information on the conditional variance clearly outperform the semiparametric single-index estimator that ignores that information. Moreover, they behave reasonably well compared to the parametric benchmark.

5.2 A real data example

In order to further illustrate our methodology, we consider a real dataset on recreational trips as presented by Cameron and Trivedi. This data, initially collected by Sellar, Stoll and Chavas (1985), is built from a survey that includes the number of recreational boating trips to Lake Somerville, Texas. We reproduce below the tables that describe the observed frequencies and the explanatory variables. We do not use all the explanatory variables for estimation since the variables C1, C3 and C4 are almost perfectly correlated in the sample. Indeed, Corr(C1, C3) = 0.977, with Corr(C1, C4) and Corr(C3, C4) also close to one. To avoid collinearity problems, we drop C3 and C4. We standardize the variables INC and C1.

Table 3. The recreational trips data set: actual frequency distribution.
Columns: Number of Trips, Frequency.

Table 4. Explanatory variables for the recreational trips counts.
Columns: Variable, Definition, Mean, Std.
TRIPS   Number of recreational boating trips by a sample group
SO      Facility's subjective quality ranking on a numerical scale
SKI     Equals 1 if engaged in water-skiing at the lake
INC     Household income of the head of the group ($10,000/year)
FC3     Equals 1 if user's fee paid at Lake Somerville
C1      Expenditure when visiting Lake Conroe (hundreds of dollars)
C3      Expenditure when visiting Lake Somerville (hundreds of dollars)
C4      Expenditure when visiting Lake Houston (hundreds of dollars)

The model we consider is the two-moment model with $g(r, \alpha) = r(1 + \alpha r)$. First, we assume that the regression function is log-linear, that is, we consider the standard Negative Binomial Parametric model (NB-P).
Next, we no longer assume that the regression function is known and we apply our semiparametric methodology, the semiparametric

Negative Binomial pseudo-likelihood procedure. In the semiparametric procedures the coefficient of the variable SO is set to 1. For the nonparametric part we use the quartic kernel $K(u) = (15/16)(1 - u^2)^2\, I_{[-1,1]}(u)$. The parameter estimates and estimated standard errors are gathered in Table 5; the plot of the estimated link function is provided in Figure 1.

Figure 1: The link function (estimated link $\hat r(t)$ plotted against $t$).

Table 5. Estimation results: parametric (NB-P) versus semiparametric model (NB-SP).
Columns: Parameters, NB-P, NB-SP.
Rows: Intercept, SO, SKI, INC, FC3, C1, $\alpha$, $h$.

Note that the estimate of the coefficient of SO in the parametric model is close to one, while in the semiparametric approach we fixed it to one. Thus the estimated values of the remaining parameters in the parametric and semiparametric cases are almost directly comparable. The results obtained with the semiparametric approach seem more realistic. For instance, the coefficient of the INC covariate is positive with NB-SP and the link function is strictly monotone. This suggests that a higher income more likely induces a larger number of recreational trips. The NB-P model leads to the opposite conclusion. The reported parametric and semiparametric standard errors cannot be directly compared on the same basis since we can only compute the standard error of a ratio of parameters in the semiparametric cases. The large bandwidth could be explained by the large conditional variance of the

response and a link function with a second derivative close to zero. This leads to a large constant $(C_2/(4C_1))^{1/5}$ in the expression of $h^{opt}$, see equation (3.1) above.

In order to evaluate the overall performance of the parametric and semiparametric models and of the estimation methods, we consider various goodness-of-fit measures such as the Pearson statistic, the deviance statistic and the deviance pseudo R-squared statistic. The Pearson statistic is given by

$$P = \sum_{i=1}^{n} \frac{(Y_i - \hat r_i)^2}{\hat\omega_i},$$

where $\hat r_i$ is the estimated conditional mean for individual $i$ and $\hat\omega_i$ is the estimated conditional variance computed according to equation (2.2). The deviance statistic is given by

$$D = 2 \sum_{i=1}^{n} \left[ Y_i \ln\frac{Y_i}{\hat r_i} - \Big(Y_i + \frac{1}{\hat\alpha}\Big) \ln\frac{Y_i + 1/\hat\alpha}{\hat r_i + 1/\hat\alpha} \right],$$

with $\hat\alpha$ the estimated value of the nuisance parameter given in Table 5. Finally, if $\bar Y$ denotes the sample mean of the variable $Y$, the deviance pseudo R-squared statistic is

$$R^2_{DEV} = 1 - \frac{\displaystyle\sum_{i=1}^{n} \left[ Y_i \ln\frac{Y_i}{\hat r_i} - \Big(Y_i + \frac{1}{\hat\alpha}\Big) \ln\frac{Y_i + 1/\hat\alpha}{\hat r_i + 1/\hat\alpha} \right]}{\displaystyle\sum_{i=1}^{n} \left[ Y_i \ln\frac{Y_i}{\bar Y} - \Big(Y_i + \frac{1}{\hat\alpha}\Big) \ln\frac{Y_i + 1/\hat\alpha}{\bar Y + 1/\hat\alpha} \right]}.$$

Another model diagnostic is obtained when comparing fitted probabilities and actual probabilities by means of a chi-square type statistic. The statistic we consider (see footnote 7) is

$$\xi = \sum_{j=1}^{J} \frac{(\bar p_j - \hat p_j)^2}{\bar p_j},$$

where the possible values of $Y$ are aggregated in $J$ non-overlapping cells. The actual frequency for cell $j$ is denoted $\bar p_j$ while $\hat p_j$ is the corresponding probability predicted by the model under study. For both methods GLS-SP and NB-SP we used the probabilities of a negative binomial distribution to compute $\hat p_j$. We consider seven cells corresponding to the values TRIPS = 0, ..., 5 and TRIPS > 5. All the results are summarized in Table 6. The semiparametric model performs better than the parametric model. We also give the estimated probabilities in Table 7. We can see that our estimates are close to the empirical probabilities of TRIPS. The semiparametric approach greatly improves on the standard parametric modeling.

Table 6.
Goodness-of-fit statistics: $P$ Pearson statistic, $D$ deviance statistic, $R^2_{DEV}$ deviance pseudo R-squared statistic and $\xi$ chi-square statistic.

Footnote 7. The chi-square statistic we consider is not necessarily chi-square distributed under the null hypothesis of a well specified model. This is because it does not correctly take into account the estimation error in $\hat p_j$. See Andrews (1988) for the general definition of the chi-square goodness-of-fit test statistic in nondynamic regression models. Here, we only use $\xi$ as a crude diagnostic for the three types of fitted probabilities $\hat p_j$.

Columns: NB-P, NB-SP. Rows: $P$, $D$, $R^2_{DEV}$, $\xi$.

Table 7. Empirical probability and estimated probability.
Columns: TRIPS = 0, ..., 5 and TRIPS > 5. Rows: Empirical probability, NB-P, NB-SP.

6 Conclusion

We consider a semiparametric single-index model (SIM) where an additional second order moment condition is specified. To estimate the parameter of interest $\theta$ we introduce a two-step semiparametric pseudo-maximum likelihood (PML) estimation procedure based on linear exponential families with nuisance parameter densities. This procedure extends the quasi-generalized pseudo-maximum likelihood method proposed by Gouriéroux, Monfort and Trognon (1984a, 1984b). We also provide a natural rule for choosing the bandwidth of the nonparametric smoother appearing in the estimation procedure. The idea is to maximize the pseudo-likelihood of the second step simultaneously in $\theta$ and the smoothing parameter $h$. The rate of the bandwidth is allowed to lie in a range between $n^{-1/4}$ and $n^{-1/8}$. We derive the asymptotic behavior of $\hat\theta$, the two-step semiparametric PML estimator we propose. If the SIM condition holds, then $\hat\theta$ is asymptotically normal. We also provide a consistent estimator of its variance. When the SIM condition holds and the conditional variance is correctly specified, then $\hat\theta$ has the best variance amongst the semiparametric PML estimators. The optimal bandwidth $\hat h$ obtained by joint maximization of the pseudo-likelihood function in the second step is shown to be equivalent to the minimizer of a weighted cross-validation function. From this we deduce that $n^{1/5}\hat h$ converges to a positive constant, in probability. In particular, our optimal bandwidth $\hat h$ has the rate expected when estimating a twice differentiable regression function nonparametrically. We conduct a simulation experiment in which the data were generated using a Poisson single-index regression model with multiplicative unobserved heterogeneity. The simulation confirms the significant advantage of estimators that incorporate the information on the conditional variance.
We also applied our semiparametric approach to a benchmark real count data set and obtained a much better fit than with the standard parametric regression models for count data.
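The goodness-of-fit measures of section 5.2 translate directly into code. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the estimated conditional variance is assumed to be $\hat\omega_i = \hat r_i(1 + \hat\alpha \hat r_i)$, the term $Y_i \ln(Y_i/\hat r_i)$ is taken to be zero when $Y_i = 0$ (the usual convention), and the chi-square type statistic follows the form $\sum_j (\bar p_j - \hat p_j)^2/\bar p_j$.

```python
import numpy as np

def nb_variance(r, alpha):
    """Assumed variance function g(r, alpha) = r (1 + alpha r)."""
    return r * (1.0 + alpha * r)

def pearson_statistic(y, r, alpha):
    """P = sum (y_i - r_i)^2 / omega_i with omega_i = g(r_i, alpha)."""
    return np.sum((y - r) ** 2 / nb_variance(r, alpha))

def _deviance_terms(y, mu, alpha):
    """Per-observation terms y ln(y/mu) - (y + 1/alpha) ln((y + 1/alpha)/(mu + 1/alpha))."""
    k = 1.0 / alpha
    safe_y = np.where(y > 0, y, 1.0)                 # avoids log(0); term is 0 when y == 0
    ylog = np.where(y > 0, y * np.log(safe_y / mu), 0.0)
    return ylog - (y + k) * np.log((y + k) / (mu + k))

def deviance_statistic(y, r, alpha):
    """D = 2 * sum of the per-observation deviance terms."""
    return 2.0 * np.sum(_deviance_terms(y, r, alpha))

def deviance_pseudo_r2(y, r, alpha):
    """R2_DEV = 1 - D(fitted means) / D(sample mean used as fitted value)."""
    ybar = np.full(len(y), np.mean(y))
    return 1.0 - np.sum(_deviance_terms(y, r, alpha)) / np.sum(_deviance_terms(y, ybar, alpha))

def chi_square_statistic(p_actual, p_pred):
    """xi = sum_j (p_actual_j - p_pred_j)^2 / p_actual_j; requires p_actual_j > 0."""
    p_actual = np.asarray(p_actual, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return np.sum((p_actual - p_pred) ** 2 / p_actual)

if __name__ == "__main__":
    y = np.array([0.0, 1.0, 2.0, 5.0, 3.0])
    r = np.array([0.8, 1.2, 2.5, 3.0, 2.5])          # hypothetical fitted means
    alpha = 1.5                                       # hypothetical nuisance parameter
    print(pearson_statistic(y, r, alpha))
    print(deviance_statistic(y, r, alpha))
    print(deviance_pseudo_r2(y, r, alpha))
```

By construction the pseudo R-squared equals 0 when every fitted mean is the sample mean and 1 when the fitted means reproduce strictly positive observations exactly, which gives two quick sanity checks for any implementation.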

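A Appendix: Assumptions

Let $\Theta = \{1\} \times \tilde\Theta$ with $\tilde\Theta$ a compact subset of $\mathbb{R}^{d-1}$ with nonvoid interior. Depending on the context, $\Theta$ is considered a subset of $\mathbb{R}^{d-1}$ or a subset of $\mathbb{R}^{d}$.

Assumption A.1 The observations $(Y_1, Z_1^\top)^\top, \dots, (Y_n, Z_n^\top)^\top$ are independent copies of a random vector $(Y, Z^\top)^\top \in \mathbb{R}^{d+1}$.

Assumption A.2 Let $r(t;\theta) = E(Y \mid Z^\top\theta = t)$. There exists a unique $\theta_0$, interior point of $\Theta$, such that $E(Y \mid Z) = E(Y \mid Z^\top\theta_0) = r(Z^\top\theta_0;\theta_0)$.

Assumption A.3 For every $\theta \in \Theta$, the random variable $Z^\top\theta$ admits a density $f(\cdot;\theta)$ with respect to the Lebesgue measure on $\mathbb{R}$.

Assumption A.4 $E[\exp(\lambda \|Z\|)] < \infty$ for some $\lambda > 0$. Moreover, $E(|Y|^{4+\varepsilon}) < \infty$ for some $\varepsilon > 0$.

Assumption A.5 With probability one, the matrix $(1, Z^\top)^\top (1, Z^\top)$ is positive definite.

Assumption A.6 There exist $c_0 > 0$ and a positive integer $k_0$ such that, for any $\theta \in \Theta$ and $0 < c \le c_0$, the set $\{t : f(t;\theta) = c\}$ has at most $k_0$ elements.

The last two assumptions ensure that $P(f(Z^\top\theta_0;\theta_0) = c) = 0$ for any $0 < c \le c_0$.

CONDITION L A function $g : \Theta \times \mathbb{R} \to \mathbb{R}$ is said to satisfy Condition L if, for any compact set $\Lambda$ on the real line, there exist $B > 0$ and $b \in (0, 1]$ such that

$$|g(\theta, t) - g(\theta', t')| \le B\, \|(\theta^\top, t) - (\theta'^\top, t')\|^{b}, \qquad \theta, \theta' \in \Theta,\ t, t' \in \Lambda.$$

Assumption A.7
(a) The function $(\theta, t) \mapsto f(t;\theta)$, $\theta \in \Theta$, $t \in \mathbb{R}$, satisfies a Lipschitz condition, that is, there exist $a \in (0, 1]$ and $C > 0$ such that $|f(t;\theta) - f(t';\theta')| \le C\, \|(\theta^\top, t) - (\theta'^\top, t')\|^{a}$ for $\theta, \theta' \in \Theta$ and $t, t' \in \mathbb{R}$.
(b) The function $(\theta, t) \mapsto r(t;\theta)$, $\theta \in \Theta$, $t \in \mathbb{R}$, satisfies Condition L.
(c) For any $\theta \in \Theta$, the functions $t \mapsto \gamma(t;\theta)$ and $t \mapsto f(t;\theta)$ are twice differentiable. Let $\gamma''(t;\theta)$ and $f''(t;\theta)$ denote the second order derivatives. The functions $(\theta, t) \mapsto \gamma''(t;\theta)$ and $(\theta, t) \mapsto f''(t;\theta)$, $\theta \in \Theta$, $t \in \mathbb{R}$, satisfy Condition L with $b = 1$.
(d) For any $\theta \in \Theta$ and any component $Z^{(j)}$ of $Z$, the functions $t \mapsto E(Z^{(j)} \mid Z^\top\theta = t)$ and $t \mapsto E(Y Z^{(j)} \mid Z^\top\theta = t)$ are twice differentiable and their second order derivatives satisfy Condition L with $b = 1$.
(e) For any $t \in \mathbb{R}$, the function $\theta \mapsto r(t;\theta)$ is twice continuously differentiable and, for any $\theta \in \Theta$, the functions $t \mapsto \partial_\theta r(t;\theta)$ and $t \mapsto \partial^2_{\theta\theta} r(t;\theta)$ are continuous. Moreover, the function $(\theta, t) \mapsto \partial_\theta r(t;\theta)$ satisfies Condition L with $b = 1$.

Let $v(t;\theta) = Var(Y \mid Z^\top\theta = t)$ be the conditional variance of $Y$ given $Z^\top\theta = t$.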

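Assumption A.8 The function $(\theta, t) \mapsto v(t;\theta)$ satisfies Condition L.

Consider the functions $B, C : \mathcal{R} \times \mathcal{N} \to \mathbb{R}$, with $\mathcal{Y}, \mathcal{R}, \mathcal{N} \subset \mathbb{R}$. Define $\Lambda = \bigcup_{\theta \in \Theta} \{t : f(t;\theta) \ge c\}$, with $c, \delta > 0$, and

$$D(c, \delta) = \{ r : \exists\, (\theta, t) \in \Theta \times \Lambda \text{ such that } |r - r(t;\theta)| \le \delta \}.$$

Assumption A.9 If $c > 0$, there exists $\delta > 0$ such that $D(c, \delta)$ is strictly included in $\mathcal{R}$.

Assumption A.10 The kernel function $K$ is differentiable, symmetric, positive and compactly supported. Moreover, $K$ and the derivative $K'$ are of bounded variation.

Up to a term depending only on $y$ and $\alpha$, the three-argument function $\psi(\cdot,\cdot;\cdot)$ involved in equation (2.3) is defined as $\psi(y, r; \alpha) = B(r, \alpha) + C(r, \alpha)\, y$, where $l(y \mid r, \alpha) = \exp[B(r, \alpha) + C(r, \alpha)\, y + D(y, \alpha)]$ is a LEFN density with mean $r$ and variance $[\partial_r C(r, \alpha)]^{-1}$.

Assumption A.11 The functions $B(r, \alpha)$ and $C(r, \alpha)$ are twice differentiable in the first argument. Moreover, for any $c$ and $\delta > 0$ for which $D(c, \delta)$ is strictly included in $\mathcal{R}$, there exists a constant $M$ such that

$$\sup_{r \in D(c,\delta),\ \alpha \in \mathcal{N}} \big( |\partial^2_{rr} G(r, \alpha)| + |\partial_r G(r, \alpha)| \big) \le M,$$

and, for all $r, r' \in D(c,\delta)$ and $\alpha, \alpha' \in \mathcal{N}$,

$$|\partial^2_{rr} G(r, \alpha) - \partial^2_{rr} G(r', \alpha')| \le M \big( |r - r'| + |\alpha - \alpha'| \big),$$

where $G$ stands for $B$ or $C$. The functions $\partial_r B(r;\alpha)$ and $\partial_r C(r;\alpha)$ are continuously differentiable in $\alpha$.

Assumption A.12 For any $c$ and $\delta > 0$ for which $D(c, \delta)$ is strictly included in $\mathcal{R}$, we have $\partial_r C(r, \alpha) > 0$, $r \in D(c, \delta)$, $\alpha \in \mathcal{N}$.

Assumption A.12 ensures that the $(d-1) \times (d-1)$ matrix

$$J = -E\big[ \partial^2_{\theta\theta} \psi\big(Y, r(Z^\top\theta_0;\theta_0); \alpha\big)\, I_A(Z) \big] = E\big[ \partial_r C\big(r(Z^\top\theta_0;\theta_0); \alpha\big)\, \partial_\theta r(Z^\top\theta_0;\theta_0)\, \partial_\theta r(Z^\top\theta_0;\theta_0)^\top\, I_A(Z) \big]$$

is positive definite.

Let us notice that the asymptotic results remain valid even if the function $\psi(y, r; \alpha)$ is not the logarithm of a LEFN density. It suffices to adapt Assumption A.11, to suppose that there exists $F(\cdot;\cdot)$ such that $|\psi(y, r; \alpha)| \le F(y; \alpha)$, $r \in \mathcal{R}$, to ensure that $J$ is positive definite, and to assume that, for any $\alpha$,

$$E\big[ \partial_2 \psi\big(Y, r(Z^\top\theta_0;\theta_0); \alpha\big) \mid Z \big] = 0 \quad \text{and} \quad E\big[ \partial_\theta \partial_2 \psi\big(Y, r(Z^\top\theta_0;\theta_0); \alpha\big) \mid Z^\top\theta_0 \big] = 0.$$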
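As a worked check of the LEFN representation, consider the Negative Binomial family with variance function $g(r, \alpha) = r(1 + \alpha r)$. The functions $B$ and $C$ below are the standard Negative Binomial parametrization with $\alpha > 0$; they are stated here as an illustration and are an assumption of this sketch rather than a formula quoted from the text above. The computation verifies that the mean is $r$ (first-order condition $\partial_r B + r\, \partial_r C = 0$), that the variance $[\partial_r C]^{-1}$ equals $g(r,\alpha)$, and that $\partial_r C > 0$ for $r > 0$, as required by Assumption A.12.

```latex
\begin{aligned}
C(r,\alpha) &= \ln\frac{\alpha r}{1+\alpha r},
&\qquad B(r,\alpha) &= -\frac{1}{\alpha}\,\ln(1+\alpha r),\\[4pt]
\partial_r C(r,\alpha) &= \frac{1}{r}-\frac{\alpha}{1+\alpha r} = \frac{1}{r(1+\alpha r)} > 0,
&\qquad \partial_r B(r,\alpha) &= -\frac{1}{1+\alpha r},\\[4pt]
\partial_r B(r,\alpha) + r\,\partial_r C(r,\alpha) &= -\frac{1}{1+\alpha r}+\frac{1}{1+\alpha r} = 0,
&\qquad \big[\partial_r C(r,\alpha)\big]^{-1} &= r(1+\alpha r) = g(r,\alpha).
\end{aligned}
```

Differentiating the first-order condition once more gives $\partial^2_{rr} B + r\, \partial^2_{rr} C = -\partial_r C$, which is the identity behind the two expressions for the matrix $J$ above.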


Lecture 6 Simple alternatives and the Neyman-Pearson lemma STATS 00: Itroductio to Statistical Iferece Autum 06 Lecture 6 Simple alteratives ad the Neyma-Pearso lemma Last lecture, we discussed a umber of ways to costruct test statistics for testig a simple ull

More information

Regression with quadratic loss

Regression with quadratic loss Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,

More information

On Random Line Segments in the Unit Square

On Random Line Segments in the Unit Square O Radom Lie Segmets i the Uit Square Thomas A. Courtade Departmet of Electrical Egieerig Uiversity of Califoria Los Ageles, Califoria 90095 Email: tacourta@ee.ucla.edu I. INTRODUCTION Let Q = [0, 1] [0,

More information
