Distribution Free Estimation of Heteroskedastic Binary Response Models Using Probit/Logit Criterion Functions


Shakeeb Khan
Duke University

Revised: February

Abstract

In this paper estimators for distribution free heteroskedastic binary response models are proposed. The estimation procedures are based on relationships between distribution free models with a conditional median restriction and parametric models (such as Probit/Logit) exhibiting (multiplicative) heteroskedasticity. The first proposed estimator is based on the observational equivalence between the two models, and is a semiparametric sieve estimator (see, e.g., Gallant and Nychka (1987), Ai and Chen (2003), Chen, Hong and Tamer (2005)) for the regression coefficients, based on optimizing standard Logit/Probit criterion functions, such as NLLS and MLE. This procedure has the advantage that choice probabilities and regression coefficients are estimated simultaneously. The second proposed procedure is based on the equivalence between existing semiparametric estimators for the conditional median model (Manski (1975, 1985), Horowitz (1992)) and the standard parametric (Probit/Logit) NLLS estimator. This estimator has the advantage of being implementable with standard software packages such as Stata. Distribution theory is developed for both estimators and a Monte Carlo study indicates they both perform well in finite samples.

JEL Classification: C3, C4, C4

Key Words: binary response, heteroskedasticity, Probit/Logit, sieve estimation.

Corresponding author. Department of Economics, Duke University, Durham, NC 27708; e-mail: shakeebk@duke.edu. I am grateful to Co-editor T. Amemiya, an anonymous Associate Editor and anonymous referees, S. Chen, X. Chen, M. Coppejans, B. Honoré, A. Lewbel, W. Newey, W. Ploberger, J. Powell, and seminar participants at Boston College, Brown, McGill, Harvard/MIT, Rice, and Texas A&M for helpful comments. This research was supported in part by the National Science Foundation through grant SES-36.

1 Introduction

The binary response model has received a great deal of attention in both the theoretical and applied econometrics literature, as many economic variables of interest are of a qualitative nature. The model is usually represented by some variation of the following equation:

$$y_i = I[x_i'\beta_0 - \epsilon_i \geq 0] \tag{1.1}$$

where $I[\cdot]$ is the usual indicator function, $y_i$ is the observed response variable, taking the values 0 or 1, and $x_i$ is an observed vector of covariates which affect the behavior of $y_i$. Both the disturbance term $\epsilon_i$ and the vector $\beta_0$ are unobserved, the latter often being the parameter estimated from a random sample of $(y_i, x_i')$, $i = 1, 2, \ldots, n$.

The disturbance term $\epsilon_i$ is restricted in ways that ensure identification of $\beta_0$. Parametric restrictions specify the distribution of $\epsilon_i$ up to a finite number of parameters and assume it is distributed independently of the covariates $x_i$. Under such a restriction, $\beta_0$ can be estimated (up to scale) using maximum likelihood or nonlinear least squares. However, except in special cases, these estimators are inconsistent if the distribution of $\epsilon_i$ is misspecified or conditionally heteroskedastic.

Semiparametric, or distribution free, restrictions have also been imposed in the literature, resulting in a variety of estimation procedures for $\beta_0$. The first was the maximum score estimator proposed in Manski (1975). Identification of $\beta_0$ was based on a conditional median restriction:

$$\mathrm{med}(\epsilon_i \mid x_i) = 0 \tag{1.2}$$

Manski's estimator maximized the following objective function:

$$M_n(\beta) = \frac{1}{n}\sum_{i=1}^{n} \left\{ I[y_i = 1]\, I[x_i'\beta \geq 0] + I[y_i = 0]\, I[x_i'\beta < 0] \right\} \tag{1.3}$$

Manski (1975, 1985) established the estimator's consistency. Kim and Pollard (1990) established its rate of convergence and limiting distribution, which were $n^{1/3}$ and non-Gaussian, respectively. Horowitz (1992) modified the procedure by smoothing the objective function in (1.3). Specifically, his approach was to maximize the following objective function:

$$S_n(\beta) = \frac{1}{n}\sum_{i=1}^{n} \left\{ I[y_i = 1]\, K_h(x_i'\beta) + I[y_i = 0]\,(1 - K_h(x_i'\beta)) \right\} \tag{1.4}$$

where $K_h(\cdot) \equiv K(\cdot / h_n)$ with $K(\cdot)$ denoting a smooth kernel function, and $h_n$ denoting a smoothing parameter, converging to 0 with the sample size.
Under stronger smoothness conditions on the distributions of $\epsilon_i$ and $x_i$, Horowitz showed that the estimator converges at the rate of $n^{2/5}$ with an asymptotically normal distribution. By strengthening the conditions further, he was able to attain a rate of $n^{p/(2p+1)}$, where $p$ is an integer related to the order of smoothness of the distributions of $\epsilon_i$ and $x_i'\beta_0$ in neighborhoods of 0.

These estimators have two disadvantages which this paper attempts to address. For one, both the maximum score and smoothed maximum score estimation procedures only provide an estimator of $\beta_0$. As discussed in Manski (1988), an estimator of $\beta_0$ permits structural analysis, which may be of interest for one of two reasons. For one, the researcher may have a scientific interest in learning about the process yielding binary outcomes. The other motive is prediction [1], where structural analysis enables more precise and tractable prediction, as well as extrapolation. However, choice probabilities and marginal effects are also of interest in most practical applications; see Greene (1997) for an explanation. Unfortunately, the maximum score and smoothed maximum score procedures do not estimate these quantities.

Alternative semiparametric restrictions used in the literature are independence/index restrictions. Estimation procedures under these restrictions include those proposed by Cosslett (1983), Powell et al. (1989), Ichimura (1993), Klein and Spady (1993), and Coppejans (2001). An advantage of most of these procedures is that they enable joint estimation of the regression coefficients and choice probabilities. However, a drawback is that the restrictions they are based on are much stronger than the median restriction mentioned above: they require the error term to be distributed independently of $x_i$, or to depend on $x_i$ only through the index $x_i'\beta_0$. They do not permit the general forms of heteroskedasticity that the conditional median restriction allows for.
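As a rough illustration of the maximum score objective (1.3) and its smoothed version (1.4), the following Python sketch evaluates both on simulated data. The simulated design, the grid search, the bandwidth `h = 0.2`, and the use of the normal c.d.f. as the kernel $K$ are assumptions made here for illustration only; they are not taken from Manski or Horowitz.

```python
import numpy as np
from scipy.stats import norm


def max_score(beta, y, X):
    """Objective (1.3): share of observations whose outcome matches the sign of x'beta."""
    index = X @ beta
    return np.mean((y == 1) * (index >= 0) + (y == 0) * (index < 0))


def smoothed_max_score(beta, y, X, h):
    """Objective (1.4), with the normal c.d.f. standing in for the kernel K (an assumption)."""
    K = norm.cdf(X @ beta / h)
    return np.mean((y == 1) * K + (y == 0) * (1.0 - K))


# Hypothetical design: median-zero errors with multiplicative heteroskedasticity.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
beta0 = np.array([1.0, 1.0])
eps = np.exp(0.5 * X[:, 0]) * rng.standard_t(df=3, size=n)
y = (X @ beta0 - eps >= 0).astype(int)

# Scale normalization: the last coefficient is fixed at 1; search over the first.
grid = np.linspace(-1.0, 3.0, 81)
ms = [max_score(np.array([t, 1.0]), y, X) for t in grid]
sms = [smoothed_max_score(np.array([t, 1.0]), y, X, h=0.2) for t in grid]
theta_ms = grid[int(np.argmax(ms))]
theta_sms = grid[int(np.argmax(sms))]
```

The step-function objective `max_score` is flat between jumps, which is why a grid search is used here rather than a gradient-based optimizer; the smoothed version is differentiable in the coefficients.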
Therefore, the first procedure proposed in this paper aims to address the drawbacks of the existing estimators mentioned. Specifically, the general heteroskedasticity of the conditional median restriction is maintained, yet the joint estimation of the regression coefficients and the choice probabilities is also permitted. The idea behind this approach is based on the observational equivalence between a distribution free model under a conditional median restriction, and a (multiplicative) heteroskedastic parametric (e.g. probit, logit) model. This equivalence result motivates an estimator of the heteroskedastic parametric model, and the estimators proposed permit joint estimation of regression coefficients and choice probabilities. The procedures involve optimizing standard parametric criterion functions, such as MLE and NLLS probit/logit.

[1] By prediction, we mean in a somewhat crude sense. That is, one predicts the value of 0 or 1 based on the sign of the estimated index.

A second drawback of maximum score and smoothed maximum score estimators is implementation. Specifically, their objective functions are non-standard and thus they cannot be computed using standard software packages. This motivates the second estimator, which can compute regression coefficients in the semiparametric binary choice model under median restrictions using the NLLS objective function for a parametric model such as Logit or Probit. Consequently, the regression coefficients can be estimated using standard software packages such as Stata.

The paper is organized as follows. The following section formally establishes an equivalence result which motivates the first estimation procedure. Section 3 proposes the estimation procedure and establishes its asymptotic properties. Section 4 proposes the estimation procedure for the regression coefficients that is very simple to implement in standard software packages. Section 5 explores the finite sample performance of these estimators via a simulation study. Section 6 concludes. Proofs of the asymptotic properties of the proposed estimators are left to the appendix.

2 An Equivalence Result

The equivalence result is based on the following two models for

$$y_i = I[x_i'\beta_0 - \epsilon_i \geq 0] \tag{2.1}$$

Model 1: Conditional Median Restriction

CM1 $x_i \in R^k$ is assumed to have density with respect to Lebesgue measure, which is positive on the set $\mathcal{X} \subseteq R^k$. [2] In what follows, we will let $x_i^{[j,1]}$ denote [3] the $j$-th component of the vector $x_i$, $j = 1, 2, \ldots, k$.

CM2 Letting $F(t, x)$ denote $P(\epsilon_i \leq t \mid x_i = x)$, we assume

[2] This assumption is not required but will be maintained throughout the paper for notational convenience. Technically we require only one regressor to be continuously distributed and have positive density on the real line.

[3] More generally we will denote the $[a, b]$ component of a matrix $M$ by $M^{[a,b]}$ throughout this paper.

CM2.1 $F(\cdot, \cdot)$ is continuous on $R \times \mathcal{X}$.

CM2.2 $F'(t, x) \equiv \partial F(t, x)/\partial t$ exists and is continuous and positive on $R$ for all $x \in \mathcal{X}$.

CM2.3 $F(0, x) = 1/2$ for $x \in \mathcal{X}$.

CM2.4 $\lim_{t \to -\infty} F(t, x) = 0$, $\lim_{t \to +\infty} F(t, x) = 1$.

Model 2: Heteroskedastic Probit/Logit Model

HP1 Assumption CM1.

HP2 $\epsilon_i = \sigma_0(x_i) \cdot u_i$ where $\sigma_0(\cdot)$ is continuous and positive on $\mathcal{X}$ a.s., and $u_i$ is independent of $x_i$ with any known (e.g. logistic, normal) distribution with median 0 and has a density function which is positive and continuous on the real line.

Theorem 2.1 Under Assumptions CM1, CM2, HP1, HP2, Models 1 and 2 are observationally equivalent.

Proof: Note that the assumptions in Model 2 easily imply that the assumptions in Model 1 are satisfied. Now assume the assumptions of Model 1 are satisfied. We will show that there exists a scale function $\sigma_0(\cdot)$ which satisfies Assumption HP2 such that the conditional distribution of the observed dependent variable is the same under the two models. Note it will suffice to show that $P(y_i = 1 \mid x_i = x)$ is the same ($x_i$ a.s.) in both models. Let $P_0(x) = F(x'\beta_0, x)$ denote this probability function for Model 1. Now define

$$\sigma_0(x) = \frac{x'\beta_0}{\Phi^{-1}(P_0(x))}\, I[x'\beta_0 \neq 0]$$

where $\Phi(\cdot)$ denotes the known c.d.f. of $u_i$. Note that $\sigma_0(x) > 0$ for all $x$ such that $x'\beta_0 \neq 0$. This is because $x'\beta_0 > 0 \Rightarrow P_0(x) > 1/2 \Rightarrow \Phi^{-1}(P_0(x)) > 0$, and similarly $x'\beta_0 < 0 \Rightarrow \Phi^{-1}(P_0(x)) < 0$. We immediately see that for the heteroskedastic probit model,

$$P(y_i = 1 \mid x_i = x) = \Phi(x'\beta_0/\sigma_0(x)) = \Phi(\Phi^{-1}(P_0(x))) = P_0(x).$$

Since $x'\beta_0 \neq 0$ with probability 1 under Assumption CM1, this establishes the equivalence of the two models.

Remark 2.1 Here we note the following implications of the established equivalence result:

The above equivalence result is similar to the Lemma on page 737 in Manski (1988), who established a class of dual models. These models had nonlinear regression functions and homoskedastic disturbance terms with known distribution. [4] Here we have a linear regression function and a heteroskedastic [5] normal error term, which makes it relatively simple to extract the structural component of the model from the choice probabilities. This is enabled by two properties of the model: 1) the normal distribution has median zero and positive density everywhere; 2) the scale function is positive everywhere. These constraints can be easily imposed to simultaneously estimate $\beta_0$ and $\sigma_0(\cdot)$, as will be illustrated later in the paper.

[4] In fact there are several other structures that enable the choice probabilities to match up with those attained from Model 1. I am grateful to a referee for pointing this out to me. Note also that since the standard normal distribution is symmetrically distributed around its mean, the equivalence result here also implies equivalence between two distribution-free semiparametric restrictions: conditional median independence and conditional symmetry.

[5] The multiplicative form of the heteroskedasticity has been imposed elsewhere in the literature; see, e.g., Klein and Vella (2009).

Another useful feature of the equivalence result is that it suggests other methods of estimating the model. The first model is generally estimated using the $L_1$ and smoothed $L_1$ norm estimators proposed in Manski (1975) and Horowitz (1992). This is a natural approach in the sense that models with conditional median restrictions are often estimated by minimizing least absolute deviation (LAD) objective functions. In the following section, we propose an estimator based on the observationally equivalent Model 2, and describe its advantages over the aforementioned existing estimators.

Finally, it should be pointed out that the notion of equivalence is defined by equating the choice probability functions. Further notions can be used to distinguish different models. One such example is the order of smoothness of the probability function, which is one way to distinguish between the maximum score model and the smoothed maximum score models, resulting in different rates of convergence for estimating $\beta_0$. In a separate note (Khan()), a more refined equivalence result is established between Model 1 and Model 2 above. Specifically, they are equivalent under stated smoothness conditions in the sense that the optimal rates for estimating $\beta_0$ are the same in the two models.
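The construction in the proof of the equivalence theorem can be checked numerically. The sketch below picks one conditional error distribution with median zero (a logistic distribution whose scale depends on $x$; this choice, the value of `beta0`, and all function names are hypothetical), builds the scale function $\sigma_0(x) = x'\beta_0/\Phi^{-1}(P_0(x))$, and confirms that the heteroskedastic probit probability reproduces $P_0(x)$.

```python
import numpy as np
from scipy.stats import logistic, norm

beta0 = np.array([0.5, 1.0])  # hypothetical coefficient vector


def F(t, x):
    """Illustrative conditional c.d.f. of eps given x: logistic, median 0, x-dependent scale."""
    scale = 1.0 + x[..., 0] ** 2
    return logistic.cdf(t / scale)


def sigma0(x):
    """Scale function from the proof: x'beta0 / Phi^{-1}(P0(x)), for x'beta0 != 0."""
    index = x @ beta0
    return index / norm.ppf(F(index, x))


rng = np.random.default_rng(1)
x = rng.normal(size=(1000, 2))
index = x @ beta0
P0 = F(index, x)                   # choice probability under Model 1
s0 = sigma0(x)                     # implied probit scale function
P_probit = norm.cdf(index / s0)    # choice probability under Model 2
max_gap = np.max(np.abs(P_probit - P0))
```

In this design `s0` is positive at every draw and `max_gap` is at the level of floating-point error, which is what the argument in the proof predicts for any $x$ with $x'\beta_0 \neq 0$.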

3 Estimation Procedure

Results in the previous section suggest that one could estimate a heteroskedastic probit model which is distribution free. We note that the result matching choice probabilities to a distribution free model restricted the sign of the scale function to be positive everywhere on the support of $x_i$. This will have to be incorporated into the estimation procedure for consistent, distribution free estimation of $\beta_0$.

The proposed estimators will consider joint estimation of the parameter $(\beta_0, \sigma_0(\cdot))$. This is analogous to existing estimators (e.g. Cosslett (1983), Klein and Spady (1993), Coppejans (2001)) of $(\beta_0, F(\cdot))$, where the function $F(\cdot)$ denotes the c.d.f. of the error term. As mentioned previously, these estimators assume independence between $x_i$ and the error term, ruling out conditional heteroskedasticity. On the surface it appears that the approach adopted here is allowing for heteroskedasticity at the expense of requiring a parametrically specified error distribution, as well as restricting the heteroskedasticity to be multiplicative. However, this is not the case. The nonparametric component [6] $\sigma_0(\cdot)$ permits both an unknown error distribution and the conditional heteroskedasticity of a conditional median restriction. The normality assumption only serves to impose the conditional median restriction, and any distributional assumption on $u_i$ that has median 0 can be used for distribution free estimation [7].

Before introducing the estimator, we introduce the notation we will adopt to account for the fact that the regression coefficients are only identified up to scale. Following convention we set the last coefficient value to 1 and estimate the $(k-1)$-vector $\theta_0$, where $(\theta_0', 1)' = \beta_0$.

The heteroskedastic probit model can be viewed as a likelihood model with an infinite dimensional parameter space. This class of models has been studied extensively in the econometrics and statistics literature.
Work in this area includes Geman and Hwang (1982), Gallant and Nychka (1987), Wong and Severini (1991), Shen and Wong (1994), Shen (1997), Chen and Shen (1998), Coppejans (2001), Ai and Chen (2003), Chen et al. (2005), Chen and Pouzo (2009, 2012), and Bierens(). Most of these papers focus on the method of sieves, which will be used in the construction of an estimator in this paper.

The estimator introduced here is based on treating the scale function as an infinite dimensional parameter. This motivates constructing an estimator which optimizes a probit/logit criterion function that includes this function. Specifically we define the criterion function as [8]

$$\gamma_n(\theta, l) = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \Phi\left( (\tilde{x}_i'\theta + x_i^{[k,1]}) \exp(l(x_i)) \right) \right)^2 \tag{3.1}$$

for $\alpha \equiv (\theta, l)$ in the (infinite dimensional) parameter space $\mathcal{A}$, whose properties will be detailed shortly. Here $\tilde{x}_i$ denotes the first $k-1$ components of $x_i$.

To be able to implement the NLLS procedure, since the parameter space is infinite dimensional, we propose a linear in parameters sieve estimator. Let $b_j(x_i)$ denote a sequence of known basis functions [9]. Denote $b^{\kappa_n}(x_i) = (b_1(x_i), \ldots, b_{\kappa_n}(x_i))'$ for some integer $\kappa_n$. An approximator of $g(x_i) \equiv \exp(l(x_i))$ in the above objective function is $g_n(x_i) = \exp(b^{\kappa_n}(x_i)'\Pi_n)$, where $\Pi_n$ is a vector of constants, and the exponential function serves to impose the positivity of the scale function needed for identification [10]. Let $\alpha_n \equiv (\theta, g_n) \in \mathcal{A}_n$ where $\mathcal{A}_n$ is the sieve space. We can formally define the estimator as [11]:

$$\hat{\alpha} = \arg\min_{\alpha_n \in \mathcal{A}_n} \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \Phi(x_i'\beta\, g_n(x_i)) \right)^2 \tag{3.2}$$

[6] While the previous theorem illustrated identification of $\beta_0$ and $P_0$, $\sigma_0$ is also identified and easily estimable using the procedure discussed in the following section. It should be emphasized that this parameter by itself is of less interest, as it only provides the functional form of the heteroskedasticity when the errors are indeed normally distributed.

[7] Consequently, the normal c.d.f. used here can be interpreted as a particular kernel function, analogous to kernel functions used in smoothed maximum score estimation.

[8] Here we have used a probit function, with $\Phi(\cdot)$ denoting the normal c.d.f., and have adopted the NLLS objective function, as its boundedness properties facilitate proofs. The MLE objective function could also be used, but as will be argued later on, this results in the same asymptotic variance matrix as NLLS in this context. We also note that this NLLS objective function is similar to the smoothed maximum score estimator when one sets $\exp(l(x_i)) = h_n^{-1}$ where $h_n \to 0$. The properties of this estimator are discussed in Section 4. Finally, note that the infinite dimensional parameter $l(\cdot)$ is the log of the function $g(\cdot)$, the inverse of the scale function.

[9] See, e.g., Chen and Shen (1998) for examples of basis functions. For the problem at hand with a regressor that has unbounded support, certain basis functions (e.g. power series) will not achieve the desired approximation for the asymptotic theory to be valid. Consequently, we restrict ourselves to basis functions suitable for approximating functions of regressors with unbounded support; see, e.g., Chen et al. (2005), who use polynomial splines.

[10] The exponential function is not necessary and only adopted here for convenience. One could simply use the approximator $b^{\kappa_n}(x_i)'\Pi_n$ and impose constraints on $\Pi_n$ to ensure positivity of the scale function. Sieve estimators can easily incorporate such parameter constraints; see, e.g., Shen (1997).

[11] Effectively, we are simply optimizing the objective function with respect to the parameters $\beta$, $\Pi_n$. We note the objective function is smooth in these parameters and standard optimization routines can be used to find local optima. However, the objective function is not concave in the parameters, and a search amongst these local optima needs to be conducted. A similar problem is encountered with the smoothed maximum score estimator, and Horowitz (1992) suggested the use of the generalized simulated annealing algorithm in Bohachevsky et al. (1986). We note it is not difficult to implement a procedure where we impose positivity of the scale function by imposing parameter constraints in the optimization. In fact, since the objective function is smooth in the parameters, CO, an application module written in GAUSS, can be used for the problem at hand.
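A minimal sketch of the sieve NLLS criterion (3.2) in Python follows. It is only an illustration under stated assumptions: the basis in `basis` (a constant, `tanh` transforms of the regressors, and one interaction) is a hypothetical stand-in for the spline bases cited in the footnotes, the multi-start BFGS search is a simple substitute for the simulated annealing or constrained optimization routines mentioned above, and the simulated design is invented for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def basis(X):
    """Hypothetical sieve basis: constant, bounded transforms, one interaction."""
    t = np.tanh(X)
    return np.column_stack([np.ones(len(X)), t, t[:, 0] * t[:, 1]])


def sieve_nlls_objective(params, y, X, B):
    """Sample criterion (3.2): mean squared residual of y on Phi(x'beta * exp(b'Pi))."""
    k = X.shape[1]
    theta, Pi = params[:k - 1], params[k - 1:]
    index = X[:, :-1] @ theta + X[:, -1]   # last coefficient normalized to 1
    g = np.exp(B @ Pi)                     # positive by construction
    return np.mean((y - norm.cdf(index * g)) ** 2)


def fit_sieve_nlls(y, X, n_starts=5, seed=0):
    """Multi-start BFGS, since the objective is smooth but not concave."""
    rng = np.random.default_rng(seed)
    B = basis(X)
    k = X.shape[1]
    n_par = (k - 1) + B.shape[1]
    best = None
    for _ in range(n_starts):
        start = rng.normal(scale=0.5, size=n_par)
        res = minimize(sieve_nlls_objective, start, args=(y, X, B), method="BFGS")
        if best is None or res.fun < best.fun:
            best = res
    beta_hat = np.append(best.x[:k - 1], 1.0)
    Pi_hat = best.x[k - 1:]
    P_hat = norm.cdf((X @ beta_hat) * np.exp(B @ Pi_hat))  # fitted choice probabilities
    return beta_hat, Pi_hat, P_hat, best.fun


# Hypothetical heteroskedastic design with median-zero errors.
rng = np.random.default_rng(42)
n = 1500
X = rng.normal(size=(n, 2))
beta0 = np.array([1.0, 1.0])
sigma = np.exp(0.5 * np.tanh(X[:, 0]))
y = (X @ beta0 - sigma * rng.normal(size=n) >= 0).astype(int)

beta_hat, Pi_hat, P_hat, fval = fit_sieve_nlls(y, X)
```

The function returns the coefficient estimate and the fitted choice probabilities from the same optimization, which is the joint-estimation feature emphasized in the text. In practice the number of basis terms would grow with the sample size, which this fixed-basis sketch does not attempt.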

Remark 3.1 The idea of minimizing a probit or logit criterion function that includes a growing number of basis functions is not new to the econometrics or statistics literature. The first was in the seminal work of McFadden (1974), who introduced the Mother Logit model. Stone (1994) estimated choice probabilities in a binary choice model by replacing the index $x_i'\beta$ with a linear in parameters series, inside a probit or logit likelihood function. While his approach can estimate the probability function by estimating the function $\Phi(g(x_i))$, it cannot estimate the structural parameter $\beta_0$ as the proposed procedure can. [12]

[12] However, this estimator can be used in a first stage to estimate choice probabilities which can then be projected onto $\Phi(x_i'\beta\, g_n(x_i))$ to form an estimator of $\beta_0$. Since the first stage involves a concave objective function if MLE is used, this approach may have computational advantages over the approach suggested here.

We now detail the conditions under which the asymptotic properties of this estimator will be derived. The first property we will establish is consistency. We first introduce some notation which will be used in imposing smoothness and compactness conditions; the notation adopted here is identical to that used in Ai and Chen (2003) and Chen et al. (2005). For any $k$-vector $v = (v_1, v_2, \ldots, v_k)'$, let $|v|$ denote $\sum_{i=1}^{k} v_i$. Let $h(\cdot)$ denote any function on $\mathcal{X}$. We denote the $|v|$-th derivative of the function $h(\cdot)$ as:

$$\nabla^v h(x) = \frac{\partial^{|v|}}{\partial x_1^{v_1} \cdots \partial x_k^{v_k}}\, h(x)$$

Also, for $\gamma > 0$ we let $\Lambda^{\gamma}(\mathcal{X})$ denote the space of functions which have up to $[\gamma]$ (here $[\cdot]$ denotes the integer operator) continuous derivatives, with the highest derivatives Hölder continuous [13] with exponent $(\gamma - [\gamma])$. Let $\|\cdot\|_E$ denote the Euclidean norm. For a real valued function $h(\cdot) \in \Lambda^{\gamma}(\mathcal{X})$ we define its Hölder norm as

$$\|h\|_{\Lambda^{\gamma}} = \sup_{x \in \mathcal{X}} |h(x)| + \max_{|v| = [\gamma]}\ \sup_{x \neq \bar{x}} \frac{|\nabla^v h(x) - \nabla^v h(\bar{x})|}{\left(\sqrt{(x - \bar{x})'(x - \bar{x})}\right)^{\gamma - [\gamma]}}$$

Finally we denote a space of functions that will be used in defining the parameter space:

$$\Lambda_c^{\gamma}(\mathcal{X}, w_1) \equiv \left\{ h \in \Lambda^{\gamma}(\mathcal{X}) : \| h(\cdot)(1 + x'x)^{-w_1/2} \|_{\Lambda^{\gamma}} \leq c < \infty \right\}$$

where $w_1 > 0$ and $c$ is a known constant.

[13] See Ai and Chen (2003) for a formal definition of Hölder continuity and a more detailed discussion on Hölder spaces.

The weighting function of the regressors $(1 + x'x)^{-w/2}$ goes to 0 as $\|x\|_E$ goes to infinity and permits $h(\cdot)$ and its derivatives to be unbounded. [14] With our weighting function we can introduce the weighted sup norm defined as:

$$\|h\|_{\infty,w} \equiv \sup_{x \in \mathcal{X}} \left| h(x)(1 + x'x)^{-w/2} \right|$$

[14] This is the weighting function used in Chen et al. (2005). Examples of other weighting functions, such as $\exp(-x_i'x_i)$, can be found in Gallant and Nychka (1987).

Our assumptions for consistency are:

RC1 (Parameter Space) Recall our notation that $\beta = (\theta', 1)'$. Let $B = \Theta \times \{1\}$. The parameter space $\mathcal{A}$ consists of all pairs $(\beta, l(\cdot))$ such that
(i) $\beta \in B$, a compact subset of $R^k$.
(ii) $l(x) \in \Lambda_c^{p}(\mathcal{X}, w_1)$, where $p > 0$.

RC2 (Regressor Distribution) Recall that $\mathcal{X}$ denotes the support of the regressors. For simplicity, we assume the regressor vector is continuously distributed and denote its joint density function as $f_X(\cdot)$.
(i) The $k$-th regressor, conditional on the other regressors, has a density function with respect to Lebesgue measure that is positive on $R$. The first $k-1$ components of $x_i$, denoted by $\tilde{x}_i$, are assumed to have bounded support.
(ii) The support of the distribution of $x_i$ is not contained in any proper linear subspace of $R^k$.
(iii) $\int (1 + \|x\|_E^2)^{w_2} f_X(x)\,dx < \infty$ where $w_2 > w_1$.

RC3 $E[b^{\kappa_n}(x_i)\, b^{\kappa_n}(x_i)']$ is nonsingular for all $n$.

RC4 The vector $(y_i, x_i')$ is i.i.d. and satisfies
$$P(y_i = 1 \mid x_i) = \Phi(x_i'\beta_0\, g_0(x_i)) \equiv \Phi(x_i'\beta_0 \exp(l_0(x_i)))$$

Remark 3.2 Before establishing consistency, we comment on some of the regularity conditions imposed:

RC1(ii) is a type of compactness condition on the functional space, and is often imposed in the sieve literature. See, e.g., Chen et al. (2005). With our definition of the sieve space, we will have a sieve approximation error which converges to 0 with respect to a weighted sup norm.

Assumption RC2(i) imposes regressor support conditions. The condition on the $k$-th regressor is used for identification. The bounded support condition on $\tilde{x}_i$ is only made to simplify arguments in the proofs and can be relaxed to this subvector having finite fourth moments.

Assumption RC3 is mainly useful to ensure point identification of the sieve coefficients. [15]

[15] It may not be necessary for consistently estimating the regression coefficients and choice probability, as established here. I am grateful to a referee for pointing this out.

The above conditions are sufficient to establish consistency of the estimator of the regression coefficients. The proof is omitted as it follows from virtually identical arguments as in Chen et al. (2003).

Theorem 3.1 Under assumptions RC1-RC4, if $\kappa_n \to \infty$ and $\kappa_n/n \to 0$, we have

$$\hat{\beta} - \beta_0 = o_p(1) \tag{3.3}$$

While the above result is an important first step, as mentioned in the introduction, there are several estimators for the model considered here for which the regression coefficients can be estimated consistently. The motivation for the SNLLS estimator proposed here was also to consistently estimate the choice probability function, to which we now turn attention. We first note that consistency of the proposed estimator of the scale function (with respect to the weighted sup norm) also follows from conditions RC1-RC4 (see, e.g., Proposition A. in Chen et al. (2005)). Hence the choice probability function estimator is also consistent.

More importantly, a faster rate with respect to a different norm can be attained under additional conditions. This norm, the Fisher norm (see Ai and Chen (2003) and Hu and Schennach (2008)), will prove useful on many fronts. For one, a convergence result for this norm will be directly tied to a convergence rate for the probability function, as this function is a functional satisfying a Lipschitz condition. Second, the asymptotic distribution of the regression coefficient estimator, which we will also establish shortly, will be related to features of this norm. We define the Fisher norm on $\mathcal{A}$ as follows, and denote it by $\|\cdot\|_F$. For $\alpha_1 = (\theta_1, l_1)$ and $\alpha_2 = (\theta_2, l_2)$ we define, with $\phi_{0i} \equiv \phi(x_i'\beta_0\, g_0(x_i))$ and $g_{0i} \equiv g_0(x_i)$,

$$\|\alpha_1 - \alpha_2\|_F^2 \equiv E\left[ \phi_{0i}^2\, g_{0i}^2 \left( \tilde{x}_i'(\theta_1 - \theta_2) + (x_i'\beta_0)(l_1 - l_2) \right)' \left( \tilde{x}_i'(\theta_1 - \theta_2) + (x_i'\beta_0)(l_1 - l_2) \right) \right]$$

Our rate result [16] is stated in the following theorem, whose proof is omitted as it can be shown using similar arguments to those used in Ai and Chen (2003) and Chen et al. (2005). Before stating the theorem, we will impose the following locally quadratic condition on the objective function we adopted:

RC5 There exist positive constants $c_1, c_2$, $c_1 < c_2$, such that

$$c_1 E\left[ \left\{ \Phi(x_i'\beta_0 \exp(l_0(x_i))) - \Phi(x_i'\beta \exp(l(x_i))) \right\}^2 \right] \leq \|\alpha - \alpha_0\|_F^2 \leq c_2 E\left[ \left\{ \Phi(x_i'\beta_0 \exp(l_0(x_i))) - \Phi(x_i'\beta \exp(l(x_i))) \right\}^2 \right] \tag{3.4}$$

for all $\alpha \in \mathcal{A}$ such that $\|\beta - \beta_0\|_E = o(1)$, $\|l - l_0\|_{\infty,w} = o(1)$.

Theorem 3.2 Suppose assumptions RC1-RC5 hold, but with the added conditions $p > k/2$ and $w_1 > p$. Then

$$\|\hat{\alpha} - \alpha_0\|_F = O_p\left( \sqrt{\frac{\kappa_n}{n}} + \kappa_n^{-p/k} \right) \tag{3.5}$$

Remark 3.3 Assumption RC5 imposes that the population criterion function can be approximated locally by a quadratic function, effectively assuming that the remainder term in a mean value expansion gets small as the parameter $\alpha$ approaches $\alpha_0$. This condition will also be used when deriving the limiting distribution theory for $\hat{\theta}$.

Remark 3.4 We note the above rate coincides with that attained in Newey (1997) for estimating a regression function using series estimation. From our conditions it will also imply the same rate of convergence for the choice probability function estimator [17]:

$$E\left[ \left( \Phi(x_i'\beta_0 \exp(l_0(x_i))) - \Phi(x_i'\hat{\beta} \exp(\hat{l}(x_i))) \right)^2 \right] \tag{3.6}$$

We next turn attention to the limiting distribution theory of the estimator $\hat{\theta}$. For this we require the additional assumptions:

[16] This particular rate result is with respect to the Fisher norm, which, as we will see shortly, will provide us rates for the choice probability functional as well. In fact, consistency of $\hat{\alpha}$ with respect to the stronger weighted sup norm follows from assumptions RC1-RC4. We then will derive a rate result and distribution theory for the estimator of $\beta_0$.

[17] As discussed in Ai and Chen (2003), attaining $L_2$ rates generally requires stronger conditions than those needed for rates with respect to the Fisher norm. In the current setting, Assumption RC5 is what enables us to get the same rate under both norms.

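AD1 $\beta_0 \in \mathrm{int}(B)$.

AD2 Reparameterizing the function $g_0(x_i) \equiv \sigma_0(x_i)^{-1}$ as $g_0(z_i, \tilde{x}_i)$ with $z_i \equiv x_i'\beta_0$, the matrix

$$Q = E\left[ \phi(0)\, \tilde{x}_i \tilde{x}_i'\, g_0(0, \tilde{x}_i)\, f_{Z|\tilde{X}}(0 \mid \tilde{x}_i) \right]$$

is non-singular, where $\phi(\cdot)$ is the standard normal density function and $f_{Z|\tilde{X}}(\cdot \mid \cdot)$ denotes the conditional density of $z_i \equiv x_i'\beta_0$ given $\tilde{x}_i$.

AD3 $f_{Z|\tilde{X}}(z \mid \tilde{x})$ is continuously differentiable in $z$ in a neighborhood of 0 and for all $\tilde{x}$.

The main theorem establishes a linear representation for the sieve estimator. The linear representation exposes the bias and variance of the estimator as a function of the number of basis functions in the sieve, $\kappa_n$, and can be used to derive the rate at which $\kappa_n \to \infty$ in order for $\hat{\beta}$ to converge to $\beta_0$ at the fastest rate in terms of MSE.

The linear representation below requires the introduction of some new notation. Let $\phi_{0i}$, $g_{0i}$ denote $\phi(x_i'\beta_0\, g_0(x_i))$ and $g_0(x_i)$ respectively. Let $w_i^*$ be a $(k-1)$ vector function of $x_i$, which satisfies

1. $(\theta_0, w_i^*) \in \mathcal{A}$.

2. $E\left[ \phi_{0i}^2\, g_{0i}^2\, (\tilde{x}_i + (x_i'\beta_0) w_i^*)'(\tilde{x}_i + (x_i'\beta_0) w_i^*) \right] \leq E\left[ \phi_{0i}^2\, g_{0i}^2\, (\tilde{x}_i + (x_i'\beta_0) w_i)'(\tilde{x}_i + (x_i'\beta_0) w_i) \right]$ for all $(k-1)$ vector functions $w_i$ such that $(\theta_0, w_i) \in \mathcal{A}$.

Then, let $\tilde{x}_i^*$ denote $-x_i'\beta_0\, w_i^*$, and let $l_n^*(x_i)$ satisfy $(\theta_0, l_n^*(x_i)) \in \mathcal{A}_n$ and minimize $\|\alpha_n - \alpha_0\|_F$, where $\alpha_n = (\beta_0, l_n^*(x_i))$ and $\alpha_0 = (\beta_0, l_0(x_i))$.

Theorem 3.3 Under assumptions RC1-RC5, AD1-AD3, if $\kappa_n^{1/k}/n \to 0$ and $\kappa_n^{1/k} \to \infty$, then

$$\hat{\theta} - \theta_0 = c(p)\, Q^{-1}\, \frac{\kappa_n^{1/k}}{n} \sum_{i=1}^{n} \phi(x_i'\beta^* g^*(x_i))\, g^*(x_i)\, (\tilde{x}_i - \tilde{x}_i^*)\, (y_i - \Phi(x_i'\beta^* g^*(x_i))) + o_p\left( \sqrt{\kappa_n^{1/k}/n} \right) + o_p(\kappa_n^{-p/k}) \tag{3.7}$$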

where $\beta^*$ denotes a sequence of values in between $\hat{\beta}$ and $\beta_0$, $g^*(x_i)$ denotes analogous intermediate values for the scale function, and where $c(p)$ is a constant depending on the assumed order of smoothness $p$, and whose expression can be found in (A.8).

Remark 3.5 From the linear representation in the theorem, we can see that the bias is of order $\kappa_n^{-p/k}$, the rate at which the functions $\Phi(x_i'\beta\, g(x_i))$ can be approximated well (with respect to our weighted norm) by our basis function approximation (see, e.g., Chen et al. (2005)). The variance is of order $\kappa_n^{1/k}/n$, the rate we are dividing the summation by [18]. Equating the squared bias with the variance, $\kappa_n^{-2p/k} = \kappa_n^{1/k}/n$, to derive the optimal rate at which the sequence $\kappa_n$ increases, we get $\kappa_n = O(n^{k/(2p+1)})$. This implies the rate of convergence of the MSE of $\hat{\theta}$ is $O(n^{-2p/(2p+1)})$, which is slower than the parametric (root-$n$) rate. Chen and Khan (2003) show that the parametric rate is not achievable for a similar model, and it is conjectured that the MSE rate attained here is the optimal rate of convergence for the model under Assumptions RC1-RC5, AD1-AD3; see Khan().

[18] Details on how these rates are derived can be found in the derivation of (A.).

An immediate corollary to the above theorem is the limiting distribution theory for the sieve NLLS estimator:

Corollary 3.1 Consider the sequence $\kappa_n = O(n^{(k+\epsilon_3)/(2p+k)})$ where $\epsilon_3 > 0$ is an arbitrarily small constant, and $p > k/2$. It follows that:

$$\sqrt{\frac{n}{\kappa_n^{1/k}}}\, (\hat{\theta} - \theta_0) \Rightarrow N\left( 0, \tfrac{1}{4}\, c(p)\, Q^{-1} \right) \tag{3.8}$$

We conclude this section with some comments on the form of the limiting distribution.

Remark 3.6 Recall the estimator was motivated by the fact that the heteroskedastic probit model probabilities could be equated to the probabilities in a distribution free model by setting

$$g_0(z_i, \tilde{x}_i) = \frac{\Phi^{-1}(P_0(z_i, \tilde{x}_i))}{z_i}$$

where recall $z_i = x_i'\beta_0$ and $P_0(\cdot, \cdot)$ denotes the probability function reparameterized as a function of two arguments. Taking limits as $z_i \to 0$ (keeping $\tilde{x}_i$ fixed), and using $P_0(0, \tilde{x}_i) = 1/2$, we get

$$g_0(0, \tilde{x}_i) = \phi(0)^{-1} P_0'(0, \tilde{x}_i)$$

where here $P_0'(\cdot, \cdot)$ denotes the partial derivative of $P_0(\cdot, \cdot)$ with respect to its first argument. From this we see

$$Q = E\left[ \tilde{x}_i \tilde{x}_i'\, P_0'(0, \tilde{x}_i)\, f_{Z|\tilde{X}}(0 \mid \tilde{x}_i) \right] \tag{3.9}$$

and we note the form of $Q$ is independent of the fact that the normal c.d.f. was used in the objective function.

Remark 3.7 We note the variance covariance matrix is not of a sandwich form. While this feature usually occurs for MLE estimators, it is a feature of the sieve NLLS estimator because all the information for $\beta_0$ is at $x_i'\beta_0 = 0$. In fact the structure of the variance matrix is the same as would be obtained with an infeasible weighted NLLS estimator, with more weight being given to observations where the true index $x_i'\beta_0$ is close to a (vanishing) neighborhood of 0. This causes the usual sandwich form found in NLLS estimators to collapse, since here the outer score term, which has the term $\mathrm{Var}(y_i \mid x_i) = P(y_i = 1 \mid x_i)(1 - P(y_i = 1 \mid x_i))$, is now equal to the constant $1/4$ when $x_i'\beta_0 = 0$. This makes the outer score term proportional to the Hessian term, causing the collapse.

We conclude this section by illustrating a further advantage of the proposed estimation procedure. In addition to estimating the structural parameters $\beta_0$, the sieve approach also permits estimation of other functionals of the probability function. One relevant functional is the (weighted) average marginal effect, which we define here as:

$$W_0 = \int w_W(x)\, \frac{\partial P_0(x)}{\partial x}\, dx \tag{3.10}$$

where recall $P_0(\cdot)$ denotes the choice probability function and $w_W(\cdot)$ denotes a weighting function (assumed here to have compact support) satisfying

$$w_W(x) \geq 0 \quad \text{and} \quad \int w_W(x)\, dx = 1 \tag{3.11}$$

Let $\hat{W}$ denote the estimator obtained by replacing $P_0$ with our proposed sieve estimator of the choice probability in (3.10). The following theorem establishes the limiting distribution theory of this estimator. Its proof is omitted as it follows virtually identical arguments to those used in proving the previous theorems.

16 Theorem 3.4 Uder the coditios imposed i Theorem 3., if κ p/k, the ( Ŵ W) N(, V W ) (3.) where V W = E X [v W (x i )v W (x i ) P (x i )( P (x i ))] (3.3) with v W (x i ) = f X (x i ) w W (x i )/ x i (3.4) Remark 3.8 A atural ad illustrative example of the usefuless of the above theorem is to cosider the covetioal averaged derivative estimator. I this case we would let w W (x) = f X (x), where f X (x) deotes the regressor desity fuctio. The, from equatio (3.), we ca plug i ˆα ito Φ( ) to get a estimate ˆP (x) of the choice probability fuctio, the differetiate it with respect to x to get a estimate of the margial effect, which we ca average across regressor values to get Ŵ = ˆP (x i )/ x i (3.5) i= Remark 3.9 We ote that this limitig distributio correspods to that obtaied i Theorem 3 i Newey(997), who estimated the probability fuctio by a series regressio ad did ot attai a estimator of β. This result agrees with the geeral coclusio i She(997) which is that the two mai coditios affectig the limitig distributio of a smooth fuctioal are the rate of covergece of the sieve estimator ad the smoothess of the fuctioal. Sice the rate of covergece attaied i Theorem 3. aligs with Theorem i Newey(997), oe would the expect the limitig distributios of the same smooth fuctioal to coicide. Remark 3. While the above theorem is for smooth fuctioals, distributio theory for osmooth fuctioals, such as poit wise probability fuctio estimators, should also be attaiable followig argumets used i Che ad Pouzo(9,). The list of formal regularity coditios ad proof of such a theorem is left for future work. 4 Local NLLS Estimators This sectio proposes a procedure which agai relates media based semiparametric estimators for biary choice models to stadard estimatio procedures for parametric biary choice 5

models. Like the previously proposed estimator, the estimator optimizes a parametric NLLS objective function. It differs in the sense that it does not estimate choice probabilities like the previous procedure, but it has the advantage of being implementable in standard software packages such as Stata.

The estimators we propose involve combining the maximum score and smoothed maximum score objective functions in ([…].3) and ([…].4) respectively. First we note that the objective function of the maximum score estimator:

$\sum_{i=1}^{n} |y_i - I[x_i'\beta \geq 0]|$ (4.1)

is identical to the squared loss objective function

$\sum_{i=1}^{n} (y_i - I[x_i'\beta \geq 0])^2$ (4.2)

since both $y_i$ and $I[\cdot]$ are 0-1 variables. Next we smooth this objective function as was done in ([…].4), by replacing the indicator function with a kernel function. For the smoothed maximum score estimator, the kernel function serves to approximate a c.d.f. We do the same here, using the c.d.f. of the standard normal distribution (footnote 9), which as before we denote by $\Phi(\cdot)$, and whose p.d.f. we denote by $\phi(\cdot)$.

To formally define the estimator, we let $h_n$ denote a sequence of positive numbers, decreasing to 0 with the sample size ($h_n$ can be viewed as a bandwidth sequence found in nonparametric kernel estimation). We adopt the usual scale normalization in semiparametric models (e.g. Horowitz(1992)), where we set the coefficient on the $k$th regressor to be 1, and consider estimation of $\theta_0$, where $\beta_0 = (\theta_0', 1)'$. Our NLLS estimator $\hat{\beta} = (\hat{\theta}', 1)'$ is defined as

$\hat{\beta} = \arg\min_{\beta \in \Theta} \sum_{i=1}^{n} \left( y_i - \Phi\left( \frac{x_i'\beta}{h_n} \right) \right)^2$ (4.3)

The main advantage of this procedure is that it involves the standard NLLS objective function. In fact, it is the standard NLLS Probit estimator used to estimate parametric binary choice models. Thus standard software packages, such as Stata, can be used to compute the estimator of $\theta_0$.

Footnote 9: Actually, the c.d.f. of other random variables can be used as well, so for example NLLS Logit can also be used as an estimator. We only use the normal c.d.f. since its values can be easily computed using standard software packages.
Footnote: For example, in Stata, the nl command fits an arbitrary nonlinear function by least squares. The Probit regression function can be constructed using Stata's norm( ) command, which returns cumulative probabilities from the standard normal distribution.
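The local NLLS criterion in (4.3) can also be sketched outside Stata. The following is a minimal Python illustration, not the author's implementation: it assumes two regressors with the coefficient on the second normalized to 1, a scalar $\theta$, and a simple grid search in place of a nonlinear least squares routine. The function names, the simulated data, the true value of 1 and the grid are all hypothetical choices made only for this example.

```python
import math
import random


def normal_cdf(v):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def lnlls_objective(theta, y, x1, x2, h):
    """Sum over i of (y_i - Phi((theta * x1_i + x2_i) / h))^2."""
    return sum(
        (yi - normal_cdf((theta * a + b) / h)) ** 2
        for yi, a, b in zip(y, x1, x2)
    )


def local_nlls(y, x1, x2, h, grid):
    """Grid-search minimizer (a stand-in for an NLLS routine)."""
    return min(grid, key=lambda theta: lnlls_objective(theta, y, x1, x2, h))


# Illustrative data only: hypothetical true theta of 1 and a logistic error.
rng = random.Random(0)
n = 400
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [rng.gauss(1.0, 1.0) for _ in range(n)]
u = []
for _ in range(n):
    p = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # guard against log(0)
    u.append(math.log(p / (1.0 - p)))               # inverse logistic c.d.f.
y = [1.0 if a + b - e >= 0 else 0.0 for a, b, e in zip(x1, x2, u)]

h = n ** (-1.0 / 3.0)                         # bandwidth h_n = n^(-1/3)
grid = [i / 100.0 for i in range(-200, 401)]  # candidate theta values
theta_hat = local_nlls(y, x1, x2, h, grid)
print(theta_hat)
```

With more than one free coefficient the grid search would be replaced by a numerical optimizer; the objective function itself is unchanged.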

Regarding asymptotic properties of this estimator, we impose conditions that are identical to those in Horowitz(1992).

A1 $\theta_0$ is in the interior of a compact set $\Theta$.

A2 The vector $x_i$ has bounded support.

A3 The density function of $x_i'\beta_0$ conditional on $x_i$, denoted by $f_{Z|X}(\cdot)$, is positive and continuously differentiable with bounded derivative.

A4 The conditional probability function of $y_i$, expressed as a function of $x_i$ and $x_i'\beta_0$, is twice continuously differentiable with respect to $x_i'\beta_0$ with bounded derivatives for $x_i'\beta_0$ in a neighborhood of 0, for all $x_i$.

A5 The matrix

$Q_H = E[P_1(0, x_i)\, x_i x_i' f_{Z|X}(0 \mid x_i)]$ (4.4)

is nonsingular, where $P(x_i'\beta_0, x_i)$ denotes the conditional probability of $y_i = 1$ given $x_i$, which we reparameterized as a function of $x_i$ and $x_i'\beta_0$, and $P_1(\cdot,\cdot)$ denotes the partial derivative of $P(\cdot,\cdot)$ with respect to its first argument.

The following theorem characterizes the estimator's rate of convergence and limiting distribution as a function of $h_n$. The proof of the theorem is omitted as it follows from arguments that are similar to those used in Horowitz(1992).

Theorem 4.1 Assume CM1, CM2, A1-A5 hold and $h_n \to 0$. Then:

1. If $n h_n^3 \to \infty$, then $h_n^{-1}(\hat{\theta} - \theta_0) \xrightarrow{p} \kappa$, where $\kappa$ is a $(k-1)$-dimensional vector of constants.

2. At the rate $h_n = O(n^{-1/3})$, $n^{1/3}(\hat{\theta} - \theta_0) \xrightarrow{d} B$, where the random vector $B$ has a non-standard (i.e. non-Gaussian) distribution.

As the above theorem indicates, the local NLLS estimator has asymptotic properties that are similar to those of the maximum score estimator proposed in Manski(1975, 1985). Specifically, its rate of convergence can be as fast as $O(n^{-1/3})$, the same rate as the maximum score estimator, and it has a non-Gaussian limiting distribution. For the NLLS estimator, the non-Gaussianity stems from the result that the Hessian term in its linear representation converges to a random matrix, implying the estimator has an asymptotically mixed normal distribution. See, for example, Section 9.6 in van der Vaart(1998).

However, the slow rate of convergence (relative to the smoothed maximum score estimator in Horowitz(1992)) is due to a bias condition, where the bias of the estimator converges at the rate of $h_n$, which is in contrast to the rate of $h_n^2$ for the smoothed maximum score estimator. Thus the different rates of convergence for the two estimators (NLLS and SMS) are loosely analogous to the differing rates of convergence for one-sided and two-sided kernel estimators in nonparametric density and regression estimation. Fortunately, this bias condition in NLLS is easily correctable. For example, an alternative kernel function to the normal c.d.f. could be used to reduce the order of the bias, or other bias reducing mechanisms, such as jackknifing, could be implemented to achieve the same rate as SMS, as well as an asymptotic normal distribution. The asymptotic properties of such approaches are left for future work.

5 Monte Carlo Results

In this section, we investigate the small-sample performance of the estimators introduced in this paper by way of a small-scale Monte Carlo study. We begin by considering the designs used in Horowitz(1992). These are based on the model:

$y_i = I[x_{1i} + \beta_0 x_{2i} - u_i \geq 0]$

with $\beta_0 = 1$, $x_{1i} \sim N(0, 1)$ and $x_{2i} \sim N(1, 1)$. There are 4 designs corresponding to 4 different distributions of $u_i$. They are:

1. $u_i \sim$ logistic, median 0, variance 1.
2. $u_i \sim$ uniform, median 0, variance 1.
3. $u_i \sim$ Student's t with 3 degrees of freedom, normalized to have variance 1.
4. $u_i = 0.25(1 + 2z_i^2 + z_i^4)v_i$ where $z_i = x_{1i} + x_{2i}$ and $v_i \sim$ logistic with median 0 and variance 1.

The estimators studied are the sieve NLLS (SNLLS), the sieve MLE (SMLE), maximum score (MS), the smoothed maximum score (SMS), the proposed local NLLS estimator (LNLLS) and its jackknifed version (JKNLLS). To implement SMS, the feasible optimal bandwidth sequence introduced in Horowitz(1992) was used. For the sieve estimators a series was used in the expansion of the log scale function with a polynomial of degree […] for $n = 250$ and […] otherwise. For the LNLLS a bandwidth sequence of $n^{-1/3}$ was used. For JKNLLS, the weights used were 4/3 and -1/3, and the bandwidths were $c_1 n^{-1/5}$, $c_2 n^{-1/5}$ with constants 1/4 and […].

Tables I-IV report the mean bias and MSE for each of the estimators for $n = 250, 500, 1000$ with […] replications. The MS and SMS results reported are those found in Horowitz(1992). The sieve estimators were computed using the Nelder-Mead simplex algorithm (footnote 13), with 5 randomly generated starting values (footnote 14).

The sieve estimators generally perform better than SMS across designs, with the exception of Design 3 where results are very similar. The SMLE and SNLLS perform quite similarly, also in accordance with the theory, as the MSE for the SMLE is not smaller than that of the SNLLS. The local NLLS estimators also perform quite well. One surprise in the simulation results is that in terms of RMSE, for some designs, the standard NLLS performs as well as, if not better than, the other estimators despite its slower rate of convergence. The jackknife procedure generally results in a lower bias than the LNLLS, but it appears this sometimes comes at the expense of a larger variance.

As mentioned in the paper, an advantage of the sieve NLLS and the sieve MLE is that they simultaneously estimate choice probabilities as well as regression coefficients. Figures I-IV plot the mean value of the estimated choice probabilities using SNLLS on a grid of 5 regressor values for each of the 4 designs, for sample sizes of $n = 250, 500, 1000$, again using […] replications. Also reported in parentheses are the values of the average mean square error (AMSE), which averages MSE across the points on the grid. As the results indicate, the SNLLS does an adequate job of estimating choice probabilities, and the values of the AMSE go down with the sample size. The estimator performs the worst in the heteroskedastic design, both in terms of the level of the AMSE and the rate at which it decreases with the sample size.
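The four baseline designs described at the start of this section can be sketched as a small data generator. This is an illustration under stated assumptions rather than the code used in the study (which was run in GAUSS): the parameter values ($\beta_0 = 1$, $x_{1i} \sim N(0,1)$, $x_{2i} \sim N(1,1)$, unit error variances, and the 0.25 scale factor in design 4) follow the Horowitz(1992) designs as best they can be read from the source, and the function names and the sign convention on $u_i$ are choices made for this example.

```python
import math
import random


def unit_logistic(rng):
    """Logistic draw with median 0, rescaled to variance 1."""
    p = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # guard against log(0)
    # A standard logistic has variance pi^2 / 3, hence the rescaling.
    return math.log(p / (1.0 - p)) * math.sqrt(3.0) / math.pi


def draw_error(design, x1, x2, rng):
    """Draw u_i for designs 1-4 (parameter values are assumptions)."""
    if design == 1:
        return unit_logistic(rng)
    if design == 2:
        bound = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has variance 1
        return rng.uniform(-bound, bound)
    if design == 3:
        # t(3) = Z / sqrt(chi2_3 / 3) has variance 3, so divide by sqrt(3).
        z = rng.gauss(0.0, 1.0)
        chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))
        return z / math.sqrt(chi2 / 3.0) / math.sqrt(3.0)
    if design == 4:
        z = x1 + x2  # heteroskedasticity depends on the index
        return 0.25 * (1.0 + 2.0 * z ** 2 + z ** 4) * unit_logistic(rng)
    raise ValueError("design must be 1, 2, 3 or 4")


def simulate(design, n, beta0=1.0, seed=0):
    """Return a list of (y, x1, x2, u) tuples for one simulated sample."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(1.0, 1.0)
        u = draw_error(design, x1, x2, rng)
        y = 1 if x1 + beta0 * x2 - u >= 0 else 0
        sample.append((y, x1, x2, u))
    return sample


sample = simulate(design=4, n=250, seed=1)
print(sum(y for y, _, _, _ in sample) / len(sample))  # share of y_i = 1
```

Because all four error distributions are symmetric about 0, the sign attached to $u_i$ inside the indicator does not change the distribution of $y_i$ given the regressors.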
As a final component of our simulation study, we explore how each of the estimators performs in a higher dimensional, more complicated design. Specifically, we allow for more covariates and a form of heteroskedasticity that is not a function of the index $x_i'\beta_0$, but a more general form of the covariates. The following model was simulated:

$y_i = I[x_{1i} + \beta^{(1)} x_{2i} + \beta^{(2)} x_{3i} + \beta^{(3)} x_{4i} - u_i \geq 0]$

where here $\beta^{(1)} = \beta^{(2)} = \beta^{(3)} = $ […], $x_{1i} \sim N(\cdot, \cdot)$, $x_{2i} \sim N(\cdot, \cdot)$, $x_{3i} \sim \chi^2$, $x_{4i} \sim N(\cdot, \cdot)$. The heteroskedastic error term was distributed logistically, with scale function $\exp(x_{[…]i}\, x_{3i})$. Table V reports results for the estimator of $\beta^{([…])}$ for the same 4 estimators for the same sample sizes and number of replications.

For implementation, for the sieve estimator we increased the order of the polynomial by […] for each sample size to account for the fact that there are more regressors. To implement the SMS, we used the fourth-order kernel function described in Horowitz(1992), and at first used the plug-in method described in Horowitz(1992) to select the smoothing parameter. However, this resulted in unstable results for $n = 250$, so we implemented an extra iteration in the plug-in strategy. That is, we implemented the plug-in method to get an initial estimator of the regression coefficients, which we used to estimate the constant in the smoothing parameter. This led to improved results for $n = 250$.

As the results in the table indicate, all estimators perform reasonably well. The MSE of the SMLE is smaller than that of the SNLLS for $n = 250, 500$ but the reverse is true for $n = 1000$, providing further evidence that neither estimator is more efficient. The SMS exhibits large values of MSE for $n = 250$ but stabilizes afterwards. Nonetheless, even for $n = 1000$ it has a larger MSE than either sieve estimator. MS exhibits the largest bias and MSE except for $n = 250$, when its MSE is smaller than that of SMS, though larger than those of the sieve estimators. The NLLS and JKNLLS estimators perform well in this design as well, but not as well as the sieve estimators for large sample sizes. The results for this design are encouraging for the sieve estimators, demonstrating that they do not suffer any more in higher dimensional designs than existing estimators.

Footnote: Precisely, to estimate with a polynomial of order 2 the scale function was approximated with $\exp(\Pi_0 + \Pi_1 x_{1i} + \Pi_2 x_{2i} + \Pi_3 x_{1i}^2 + \Pi_4 x_{2i}^2 + \Pi_5 x_{1i} x_{2i})$. Results for similar orders were experimented with but did not change results much, and are not reported.

Footnote 13: As mentioned previously, since the objective function is smooth in the parameters, more standard, gradient based algorithms may be used. They were not adopted here to avoid potential instability problems associated with near singularity of Hessian matrices, and also because the relatively low dimensionality of the designs permits the Nelder-Mead algorithm to be computationally feasible.

Footnote 14: The simulation was performed in GAUSS.

6 Conclusions

In this paper new estimation procedures for a distribution free heteroskedastic binary response model were proposed.
The sieve estimators enable joint estimation of the regression coefficients and choice probabilities. The regression coefficient estimators were shown to converge at a one-dimensional nonparametric rate, as was found for the (smoothed) maximum score estimator. While the choice probability function estimator converged at a nonparametric rate, a smooth functional was shown to converge at the parametric rate with a limiting Gaussian distribution. The proposed local NLLS estimators estimated only regression coefficients but had the advantage of being very simple to implement with standard software packages. A simulation study indicates these estimators perform adequately well in finite samples.

The work here suggests areas for future research. Limiting distribution theory for the (pointwise) choice probability and marginal effects estimators, as well as smooth functionals thereof, needs to be derived. It would also be useful to explore whether further restrictions on the model, by constraining the behavior of $\sigma_0(x_i)$, would enable improving upon the optimal rates attained here and in Horowitz(1993a,b). Such further restrictions would be relatively easy to impose using the sieve estimation approach adopted here.

References

[1] Ai, C. and X. Chen (2003), "Efficient Estimation of Models with Conditional Moment Restrictions Containing Unknown Functions", Econometrica, 71.
[2] Bierens, H.J. (1987), "Kernel Estimators of Regression Functions", in T.F. Bewley, ed., Advances in Econometrics, Fifth World Congress, Vol. 1, Cambridge: Cambridge University Press.
[3] Bierens, H.J. ([…]), "Consistency and Asymptotic Normality of Sieve Estimators Under Weak and Verifiable Conditions", Penn State working paper.
[4] Bohachevsky, I.O., M.E. Johnson and M.L. Stein (1986), "Generalized Simulated Annealing", Technometrics, 28, 209-217.
[5] Chen, S. and S. Khan (2003), "Rates of Convergence for Estimating Regression Coefficients in Heteroskedastic Discrete Response Models", Journal of Econometrics, 117.
[6] Coppejans, M. (2001), "Estimation of the Binary Response Model using a Mixture of Distributions Estimator (MOD)", Journal of Econometrics, 102.
[7] Chen, X., H. Hong, and E. Tamer (2005), "Nonlinear Models with Measurement Error and Auxiliary Data", Review of Economic Studies, 72.
[8] Chen, X., O.B. Linton, and I. van Keilegom (2003), "Estimation of Semiparametric Models when the Criterion Function is not Smooth", Econometrica, 71.
[9] Chen, X. and D. Pouzo (2009), "Efficient Estimation of Semiparametric Conditional Moment Models with Possibly Nonsmooth Residuals", Journal of Econometrics, 152, 46-60.

[10] Chen, X. and D. Pouzo (2012), "Efficient Estimation of Semiparametric Conditional Moment Models with Possibly Nonsmooth Generalized Residuals", Econometrica, 80.
[11] Chen, X. and X. Shen (1998), "Sieve Extremum Estimates for Weakly Dependent Data", Econometrica.
[12] Cosslett, S.R. (1983), "Distribution-Free Maximum Likelihood Estimator of the Binary Choice Model", Econometrica, 51.
[13] Gallant, A.R. and D.W. Nychka (1987), "Semi-nonparametric Maximum Likelihood Estimation", Econometrica, 55.
[14] Geman, S. and C. Hwang (1983), "Nonparametric Maximum Likelihood by the Method of Sieves", Annals of Statistics, 10.
[15] Greene, W.H. (1997), Econometric Analysis, Upper Saddle River, NJ: Prentice Hall.
[16] Horowitz, J.L. (1992), "A Smoothed Maximum Score Estimator for the Binary Response Model", Econometrica, 60.
[17] Horowitz, J.L. (1993a), "Optimal Rates of Convergence of Parameter Estimators in the Binary Response Model with Weak Distributional Assumptions", Econometric Theory, 9, 1-18.
[18] Horowitz, J.L. (1993b), "Semiparametric and Nonparametric Estimation of Quantal Response Models", in G.S. Maddala, C.R. Rao, H.D. Vinod, eds., Handbook of Statistics - Econometrics, Amsterdam: North Holland.
[19] Hu, Y. and S.M. Schennach (2008), "Instrumental Variable Treatment of Nonclassical Measurement Error Models", Econometrica, 76.
[20] Ichimura, H. (1993), "Semiparametric Least Squares and Weighted SLS Estimation of Single-Index Models", Journal of Econometrics, 58, 71-120.
[21] Khan, S. ([…]), "Optimal Rates for Regression Coefficients Heteroskedastic Binary Response Models", manuscript, available at shakeebk/optimalrates.pdf
[22] Kim, J. and D. Pollard (1990), "Cube Root Asymptotics", Annals of Statistics, 18, 191-219.
[23] Klein, R.W. and R.H. Spady (1993), "An Efficient Semiparametric Estimator for Discrete Choice Models", Econometrica, 61.
[24] Klein, R.W. and F. Vella (2009), "A Semiparametric Model for Binary Response and Continuous Outcomes Under Index Heteroscedasticity", Journal of Applied Econometrics, 24.

[25] Manski, C.F. (1975), "Maximum Score Estimation of the Stochastic Utility Model of Choice", Journal of Econometrics, 3, 205-228.
[26] Manski, C.F. (1985), "Semiparametric Analysis of Discrete Response: Asymptotic Properties of Maximum Score Estimation", Journal of Econometrics, 27.
[27] Manski, C.F. (1988), "Identification of Binary Response Models", Journal of the American Statistical Association, 83.
[28] McFadden, D. (1974), "Conditional Logit Analysis of Qualitative Choice Behavior", in P. Zarembka, ed., Frontiers in Econometrics, New York: Academic Press.
[29] Newey, W.K. (1997), "Convergence Rates and Asymptotic Normality for Series Estimators", Journal of Econometrics, 79.
[30] Powell, J.L., J.H. Stock, and T.M. Stoker (1989), "Semiparametric Estimation of Index Coefficients", Econometrica, 57.
[31] Schumaker, L.L. (1981), Spline Functions: Basic Theory, New York: John Wiley and Sons.
[32] Shen, X. (1997), "On Method of Sieves and Penalization", Annals of Statistics, 25.
[33] Shen, X. and W.H. Wong (1994), "Convergence Rates of Sieve Estimates", Annals of Statistics, 22.
[34] Sherman, R.P. (1994), "U-Processes in the Analysis of a Generalized Semiparametric Regression Estimator", Econometric Theory, 10.
[35] Stone, C.J. (1994), "The Use of Polynomial Splines and Their Tensor Products in Multivariate Function Estimation", Annals of Statistics, 22, 118-171.
[36] van der Vaart, A.W. (1998), Asymptotic Statistics, Cambridge, U.K.: Cambridge University Press.
[37] van der Vaart, A.W. and J.A. Wellner, Weak Convergence and Empirical Processes, New York: Springer.
[38] Wong, W.H. and T.A. Severini (1991), "On Maximum Likelihood Estimation in Infinite Dimensional Parameter Spaces", Annals of Statistics, 19.

A Appendix

A.1 Proof of Theorem 3.3

Before we derive the linear representation for the estimator $\hat{\theta}$, recall we defined the Fisher norm, denoted by $\|\cdot\|_F$, as

$\|\alpha - \alpha_0\|_F^2 \equiv E[\phi_i^2 g_i^2 (x_i'(\theta - \theta_0) - (x_i'\beta_0)(l_i - l_{0i}))^2]$ (A.1)

where $\phi_i, g_i, l_i, l_{0i}$ denote $\phi(x_i'\beta_0 g_0(x_i))$, $g_0(x_i)$, $l(x_i)$, $l_0(x_i)$ respectively. As we will see, deriving the form of the linear representation will rely heavily on convergence of certain terms with respect to this norm. We note that arguments similar to those used in, e.g., Ai and Chen(2003) can be used to conclude that

$\|\hat{\alpha} - \alpha_0\|_F = O_p\left( \sqrt{\kappa_n / n} + \kappa_n^{-p/k} \right)$ (A.2)

To establish the limiting distribution theory of the estimator we note there are many results in the literature for the asymptotic theory for smooth functionals; see, e.g., Shen(1997), Chen and Shen(1998), Ai and Chen(2003) and Chen et al.(2005). However, these results apply only to the root-$n$ case, which is not possible here (footnote 15). Our proof strategy is to follow the arguments used in Ai and Chen(2003) and Chen et al.(2005), but make the necessary modifications to account for the fact that the estimator does not converge at the parametric (root-$n$) rate.

In the rest of this section we will scalarize the problem by deriving the linear representation for $t'(\hat{\theta} - \theta_0)$ where $t$ is a $(k-1)$-dimensional nonzero vector. Following Ai and Chen(2003) we wish to find the $(k-1)$-dimensional vector $w_i$ which minimizes:

$t' E[\phi_i^2 g_i^2 (x_i + z_i w_i)(x_i + z_i w_i)']\, t$ (A.3)

and satisfies $(\theta, w_i) \in A$ for each $\theta \in \Theta$. Clearly, the above expectation can be set to 0 by setting $w_i = I[z_i \neq 0](-x_i / z_i)$, as $z_i$ is continuously distributed around 0. The fact that we can make this expectation as small as possible relates to the impossibility of attaining the root-$n$ rate for $\hat{\theta}$. What will determine the rate of convergence of the estimator is the rate of convergence of the above expectation to 0 when we replace $w_i$ with $w_{ni}$ where $(\theta, w_{ni}) \in A_n$. So we will aim to find

$\inf_{w_{ni}} t' E[\phi_i^2 g_i^2 (x_i + z_i w_{ni})(x_i + z_i w_{ni})']\, t$ (A.4)

as a function of $\kappa_n$.

Footnote 15: See Chen and Khan(2003) for a related impossibility result. A result on upper bounds on achievable rates is available from the author.
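The claim above that the expectation in (A.3) can be set to 0 can be spelled out in one step. The lines below are a short worked check using only symbols already introduced; the squared weights $\phi_i^2 g_i^2$ are an assumption about exponents that are hard to read in the source, and the argument does not depend on them.

```latex
\begin{align*}
% On the event z_i \neq 0, the choice w_i = -x_i / z_i gives
x_i + z_i w_i &= x_i - z_i \frac{x_i}{z_i} = 0 , \\
% and because z_i is continuously distributed, P(z_i = 0) = 0, so
t' E\left[ \phi_i^2 g_i^2 \, (x_i + z_i w_i)(x_i + z_i w_i)' \right] t &= 0 .
\end{align*}
```

The difficulty is that $-x_i / z_i$ is unbounded near $z_i = 0$, which is why the rate is governed by how quickly a sieve element $w_{ni}$ can approximate it.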


More information

Rank tests and regression rank scores tests in measurement error models

Rank tests and regression rank scores tests in measurement error models Rak tests ad regressio rak scores tests i measuremet error models J. Jurečková ad A.K.Md.E. Saleh Charles Uiversity i Prague ad Carleto Uiversity i Ottawa Abstract The rak ad regressio rak score tests

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

Sieve Estimators: Consistency and Rates of Convergence

Sieve Estimators: Consistency and Rates of Convergence EECS 598: Statistical Learig Theory, Witer 2014 Topic 6 Sieve Estimators: Cosistecy ad Rates of Covergece Lecturer: Clayto Scott Scribe: Julia Katz-Samuels, Brado Oselio, Pi-Yu Che Disclaimer: These otes

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

An Introduction to Asymptotic Theory

An Introduction to Asymptotic Theory A Itroductio to Asymptotic Theory Pig Yu School of Ecoomics ad Fiace The Uiversity of Hog Kog Pig Yu (HKU) Asymptotic Theory 1 / 20 Five Weapos i Asymptotic Theory Five Weapos i Asymptotic Theory Pig Yu

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Preponderantly increasing/decreasing data in regression analysis

Preponderantly increasing/decreasing data in regression analysis Croatia Operatioal Research Review 269 CRORR 7(2016), 269 276 Prepoderatly icreasig/decreasig data i regressio aalysis Darija Marković 1, 1 Departmet of Mathematics, J. J. Strossmayer Uiversity of Osijek,

More information

ECON 3150/4150, Spring term Lecture 3

ECON 3150/4150, Spring term Lecture 3 Itroductio Fidig the best fit by regressio Residuals ad R-sq Regressio ad causality Summary ad ext step ECON 3150/4150, Sprig term 2014. Lecture 3 Ragar Nymoe Uiversity of Oslo 21 Jauary 2014 1 / 30 Itroductio

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5 CS434a/54a: Patter Recogitio Prof. Olga Veksler Lecture 5 Today Itroductio to parameter estimatio Two methods for parameter estimatio Maimum Likelihood Estimatio Bayesia Estimatio Itroducto Bayesia Decisio

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Bull. Korean Math. Soc. 36 (1999), No. 3, pp. 451{457 THE STRONG CONSISTENCY OF NONLINEAR REGRESSION QUANTILES ESTIMATORS Seung Hoe Choi and Hae Kyung

Bull. Korean Math. Soc. 36 (1999), No. 3, pp. 451{457 THE STRONG CONSISTENCY OF NONLINEAR REGRESSION QUANTILES ESTIMATORS Seung Hoe Choi and Hae Kyung Bull. Korea Math. Soc. 36 (999), No. 3, pp. 45{457 THE STRONG CONSISTENCY OF NONLINEAR REGRESSION QUANTILES ESTIMATORS Abstract. This paper provides suciet coditios which esure the strog cosistecy of regressio

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4. 4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

Exponential Families and Bayesian Inference

Exponential Families and Bayesian Inference Computer Visio Expoetial Families ad Bayesia Iferece Lecture Expoetial Families A expoetial family of distributios is a d-parameter family f(x; havig the followig form: f(x; = h(xe g(t T (x B(, (. where

More information

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

w (1) ˆx w (1) x (1) /ρ and w (2) ˆx w (2) x (2) /ρ.

w (1) ˆx w (1) x (1) /ρ and w (2) ˆx w (2) x (2) /ρ. 2 5. Weighted umber of late jobs 5.1. Release dates ad due dates: maximimizig the weight of o-time jobs Oce we add release dates, miimizig the umber of late jobs becomes a sigificatly harder problem. For

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

A Note on Box-Cox Quantile Regression Estimation of the Parameters of the Generalized Pareto Distribution

A Note on Box-Cox Quantile Regression Estimation of the Parameters of the Generalized Pareto Distribution A Note o Box-Cox Quatile Regressio Estimatio of the Parameters of the Geeralized Pareto Distributio JM va Zyl Abstract: Makig use of the quatile equatio, Box-Cox regressio ad Laplace distributed disturbaces,

More information

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution Iteratioal Mathematical Forum, Vol., 3, o. 3, 3-53 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/.9/imf.3.335 Double Stage Shrikage Estimator of Two Parameters Geeralized Expoetial Distributio Alaa M.

More information

Sequences. Notation. Convergence of a Sequence

Sequences. Notation. Convergence of a Sequence Sequeces A sequece is essetially just a list. Defiitio (Sequece of Real Numbers). A sequece of real umbers is a fuctio Z (, ) R for some real umber. Do t let the descriptio of the domai cofuse you; it

More information

Kolmogorov-Smirnov type Tests for Local Gaussianity in High-Frequency Data

Kolmogorov-Smirnov type Tests for Local Gaussianity in High-Frequency Data Proceedigs 59th ISI World Statistics Cogress, 5-30 August 013, Hog Kog (Sessio STS046) p.09 Kolmogorov-Smirov type Tests for Local Gaussiaity i High-Frequecy Data George Tauche, Duke Uiversity Viktor Todorov,

More information

A NEW CLASS OF 2-STEP RATIONAL MULTISTEP METHODS

A NEW CLASS OF 2-STEP RATIONAL MULTISTEP METHODS Jural Karya Asli Loreka Ahli Matematik Vol. No. (010) page 6-9. Jural Karya Asli Loreka Ahli Matematik A NEW CLASS OF -STEP RATIONAL MULTISTEP METHODS 1 Nazeeruddi Yaacob Teh Yua Yig Norma Alias 1 Departmet

More information

Lecture 3: MLE and Regression

Lecture 3: MLE and Regression STAT/Q SCI 403: Itroductio to Resamplig Methods Sprig 207 Istructor: Ye-Chi Che Lecture 3: MLE ad Regressio 3. Parameters ad Distributios Some distributios are idexed by their uderlyig parameters. Thus,

More information

1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable

More information

Element sampling: Part 2

Element sampling: Part 2 Chapter 4 Elemet samplig: Part 2 4.1 Itroductio We ow cosider uequal probability samplig desigs which is very popular i practice. I the uequal probability samplig, we ca improve the efficiecy of the resultig

More information

MA Advanced Econometrics: Properties of Least Squares Estimators

MA Advanced Econometrics: Properties of Least Squares Estimators MA Advaced Ecoometrics: Properties of Least Squares Estimators Karl Whela School of Ecoomics, UCD February 5, 20 Karl Whela UCD Least Squares Estimators February 5, 20 / 5 Part I Least Squares: Some Fiite-Sample

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

Asymptotic Results for the Linear Regression Model

Asymptotic Results for the Linear Regression Model Asymptotic Results for the Liear Regressio Model C. Fli November 29, 2000 1. Asymptotic Results uder Classical Assumptios The followig results apply to the liear regressio model y = Xβ + ε, where X is

More information

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15 17. Joit distributios of extreme order statistics Lehma 5.1; Ferguso 15 I Example 10., we derived the asymptotic distributio of the maximum from a radom sample from a uiform distributio. We did this usig

More information

Point Estimation: properties of estimators 1 FINITE-SAMPLE PROPERTIES. finite-sample properties (CB 7.3) large-sample properties (CB 10.

Point Estimation: properties of estimators 1 FINITE-SAMPLE PROPERTIES. finite-sample properties (CB 7.3) large-sample properties (CB 10. Poit Estimatio: properties of estimators fiite-sample properties CB 7.3) large-sample properties CB 10.1) 1 FINITE-SAMPLE PROPERTIES How a estimator performs for fiite umber of observatios. Estimator:

More information

A survey on penalized empirical risk minimization Sara A. van de Geer

A survey on penalized empirical risk minimization Sara A. van de Geer A survey o pealized empirical risk miimizatio Sara A. va de Geer We address the questio how to choose the pealty i empirical risk miimizatio. Roughly speakig, this pealty should be a good boud for the

More information

First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise

First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise First Year Quatitative Comp Exam Sprig, 2012 Istructio: There are three parts. Aswer every questio i every part. Questio I-1 Part I - 203A A radom variable X is distributed with the margial desity: >

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

1 Duality revisited. AM 221: Advanced Optimization Spring 2016

1 Duality revisited. AM 221: Advanced Optimization Spring 2016 AM 22: Advaced Optimizatio Sprig 206 Prof. Yaro Siger Sectio 7 Wedesday, Mar. 9th Duality revisited I this sectio, we will give a slightly differet perspective o duality. optimizatio program: f(x) x R

More information

Introductory statistics

Introductory statistics CM9S: Machie Learig for Bioiformatics Lecture - 03/3/06 Itroductory statistics Lecturer: Sriram Sakararama Scribe: Sriram Sakararama We will provide a overview of statistical iferece focussig o the key

More information

Optimization Methods MIT 2.098/6.255/ Final exam

Optimization Methods MIT 2.098/6.255/ Final exam Optimizatio Methods MIT 2.098/6.255/15.093 Fial exam Date Give: December 19th, 2006 P1. [30 pts] Classify the followig statemets as true or false. All aswers must be well-justified, either through a short

More information

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio

More information

Statistical Properties of OLS estimators

Statistical Properties of OLS estimators 1 Statistical Properties of OLS estimators Liear Model: Y i = β 0 + β 1 X i + u i OLS estimators: β 0 = Y β 1X β 1 = Best Liear Ubiased Estimator (BLUE) Liear Estimator: β 0 ad β 1 are liear fuctio of

More information

Confidence interval for the two-parameter exponentiated Gumbel distribution based on record values

Confidence interval for the two-parameter exponentiated Gumbel distribution based on record values Iteratioal Joural of Applied Operatioal Research Vol. 4 No. 1 pp. 61-68 Witer 2014 Joural homepage: www.ijorlu.ir Cofidece iterval for the two-parameter expoetiated Gumbel distributio based o record values

More information

CSE 527, Additional notes on MLE & EM

CSE 527, Additional notes on MLE & EM CSE 57 Lecture Notes: MLE & EM CSE 57, Additioal otes o MLE & EM Based o earlier otes by C. Grat & M. Narasimha Itroductio Last lecture we bega a examiatio of model based clusterig. This lecture will be

More information

ON POINTWISE BINOMIAL APPROXIMATION

ON POINTWISE BINOMIAL APPROXIMATION Iteratioal Joural of Pure ad Applied Mathematics Volume 71 No. 1 2011, 57-66 ON POINTWISE BINOMIAL APPROXIMATION BY w-functions K. Teerapabolar 1, P. Wogkasem 2 Departmet of Mathematics Faculty of Sciece

More information

It should be unbiased, or approximately unbiased. Variance of the variance estimator should be small. That is, the variance estimator is stable.

It should be unbiased, or approximately unbiased. Variance of the variance estimator should be small. That is, the variance estimator is stable. Chapter 10 Variace Estimatio 10.1 Itroductio Variace estimatio is a importat practical problem i survey samplig. Variace estimates are used i two purposes. Oe is the aalytic purpose such as costructig

More information

LECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if

LECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if LECTURE 14 NOTES 1. Asymptotic power of tests. Defiitio 1.1. A sequece of -level tests {ϕ x)} is cosistet if β θ) := E θ [ ϕ x) ] 1 as, for ay θ Θ 1. Just like cosistecy of a sequece of estimators, Defiitio

More information

10-701/ Machine Learning Mid-term Exam Solution

10-701/ Machine Learning Mid-term Exam Solution 0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it

More information

Information-based Feature Selection

Information-based Feature Selection Iformatio-based Feature Selectio Farza Faria, Abbas Kazeroui, Afshi Babveyh Email: {faria,abbask,afshib}@staford.edu 1 Itroductio Feature selectio is a topic of great iterest i applicatios dealig with

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

Dimension-free PAC-Bayesian bounds for the estimation of the mean of a random vector

Dimension-free PAC-Bayesian bounds for the estimation of the mean of a random vector Dimesio-free PAC-Bayesia bouds for the estimatio of the mea of a radom vector Olivier Catoi CREST CNRS UMR 9194 Uiversité Paris Saclay olivier.catoi@esae.fr Ilaria Giulii Laboratoire de Probabilités et

More information

Precise Rates in Complete Moment Convergence for Negatively Associated Sequences

Precise Rates in Complete Moment Convergence for Negatively Associated Sequences Commuicatios of the Korea Statistical Society 29, Vol. 16, No. 5, 841 849 Precise Rates i Complete Momet Covergece for Negatively Associated Sequeces Dae-Hee Ryu 1,a a Departmet of Computer Sciece, ChugWoo

More information

11 Correlation and Regression

11 Correlation and Regression 11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

ESTIMATING THE ERROR DISTRIBUTION FUNCTION IN NONPARAMETRIC REGRESSION WITH MULTIVARIATE COVARIATES

ESTIMATING THE ERROR DISTRIBUTION FUNCTION IN NONPARAMETRIC REGRESSION WITH MULTIVARIATE COVARIATES ESTIMATING THE ERROR DISTRIBUTION FUNCTION IN NONPARAMETRIC REGRESSION WITH MULTIVARIATE COVARIATES URSULA U. MÜLLER, ANTON SCHICK AND WOLFGANG WEFELMEYER Abstract. We cosider oparametric regressio models

More information

Lecture 27: Optimal Estimators and Functional Delta Method

Lecture 27: Optimal Estimators and Functional Delta Method Stat210B: Theoretical Statistics Lecture Date: April 19, 2007 Lecture 27: Optimal Estimators ad Fuctioal Delta Method Lecturer: Michael I. Jorda Scribe: Guilherme V. Rocha 1 Achievig Optimal Estimators

More information

( θ. sup θ Θ f X (x θ) = L. sup Pr (Λ (X) < c) = α. x : Λ (x) = sup θ H 0. sup θ Θ f X (x θ) = ) < c. NH : θ 1 = θ 2 against AH : θ 1 θ 2

( θ. sup θ Θ f X (x θ) = L. sup Pr (Λ (X) < c) = α. x : Λ (x) = sup θ H 0. sup θ Θ f X (x θ) = ) < c. NH : θ 1 = θ 2 against AH : θ 1 θ 2 82 CHAPTER 4. MAXIMUM IKEIHOOD ESTIMATION Defiitio: et X be a radom sample with joit p.m/d.f. f X x θ. The geeralised likelihood ratio test g.l.r.t. of the NH : θ H 0 agaist the alterative AH : θ H 1,

More information

Elements of Statistical Methods Lots of Data or Large Samples (Ch 8)

Elements of Statistical Methods Lots of Data or Large Samples (Ch 8) Elemets of Statistical Methods Lots of Data or Large Samples (Ch 8) Fritz Scholz Sprig Quarter 2010 February 26, 2010 x ad X We itroduced the sample mea x as the average of the observed sample values x

More information