The Asymptotic Variance of Semi-parametric Estimators with Generated Regressors


Jinyong Hahn, Department of Economics, UCLA
Geert Ridder, Department of Economics, USC

June 30, 2011

Abstract

We study the asymptotic distribution of three-step estimators of a finite-dimensional parameter vector where the second step consists of one or more non-parametric regressions on a regressor that is estimated in the first step. The first-step estimator is either parametric or non-parametric. Using Newey's (1994) path-derivative method we derive the contribution of the first-step estimator to the influence function. In this derivation it is important to account for the dual role that the first-step estimator plays in the second-step non-parametric regression, i.e., that of conditioning variable and that of argument.

JEL Classification: C01, C14. Keywords: Semi-parametric estimation, generated regressors, asymptotic variance.

Financial support for this research was generously provided through NSF SES grants. We thank Guido Imbens and seminar participants at UC Riverside, the Tinbergen Institute, Yale, FGV-Rio de Janeiro, FGV-São Paulo, Harvard/MIT, UCLA, Mannheim, Sciences Po and CEMFI for comments. Geert Ridder thanks the Department of Economics, PUC, Rio de Janeiro for their hospitality. Addresses: Jinyong Hahn, Department of Economics, Bunche Hall, UCLA, Los Angeles, CA 90095, hahn@econ.ucla.edu; Geert Ridder, Department of Economics, Kaprielian Hall, USC, Los Angeles, CA 90089, ridder@usc.edu.

1 Introduction

We study the asymptotic distribution of estimators of a finite-dimensional parameter in a semi-parametric three (or more) step estimation problem. This topic has become important especially due to recent developments in the econometric analysis of treatment effects and in the identification and estimation of non-linear models with endogenous covariates using control variables. We undertake a theoretical investigation of such estimators, and illustrate the usefulness of our result by examining the asymptotic variance of the estimator of the Average Treatment Effect (ATE) proposed by Heckman, Ichimura and Todd (1998) that is based on non-parametric regressions on the estimated propensity score.

The estimators under consideration are all characterized by three steps. In the first step we estimate a regressor. In the second step we estimate a non-parametric regression with the generated regressor as one of the independent variables. In the third step we estimate a finite-dimensional parameter (without loss of generality we consider the scalar case) that satisfies a moment condition that depends on the non-parametric regression estimated in the second step. Pagan (1984), who considered regression estimators involving generated regressors in the parametric context, is an intellectual predecessor. We make heavy use of Newey's (1994) path-derivative characterization of the asymptotic variance of semi-parametric estimators, and extend his result to three-step estimators where the second step is a non-parametric regression on a variable estimated in the first step.

This paper has the following structure. Our main result is in Section 2. In Section 3 we consider estimators that involve partial means, with an application to regression on the estimated propensity score in Section 4.

2 The Influence Function of Semi-parametric Three-Step Estimators

We now present our two main results on semi-parametric three-step estimators. We distinguish between two cases: (i) the first step is parametric, and (ii) the first step is non-parametric. Moreover, we first consider estimators that can be expressed as a sample average of a function of the second-step non-parametric regression only; next, we allow the third-step estimator to depend on other variables besides the second-step non-parametric regression. In both cases the estimators are, in Newey's (1994) terminology, full means, because they average over all arguments of the second-step non-parametric regression. In Section 3 we consider estimators that average over most but not all independent variables in the second-step non-parametric regression, i.e., the estimator is a partial mean. This makes the influence function more complicated, which is why we start with the full mean case.

2.1 Parametric First Step, Non-parametric Second Step

We assume that we observe i.i.d. observations $s_i = (y_i, x_i, z_i)$, $i = 1, \dots, n$. In the first step, we compute an estimator $\hat\theta$ such that
$$\sqrt{n}(\hat\theta - \theta) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi(x_i, z_i) + o_p(1)$$
with $E[\psi(x_i, z_i)] = 0$. The parameter vector $\theta$ indexes a relation between a dependent variable that is a component of

x (and that we later denote by u) and independent variables that are some or all of the other variables in x and those in z. Either the predicted value or the residual of this relationship is an independent variable in the second-step non-parametric regression. The notation $\varphi(x, z; \theta)$ for the generated regressor covers both cases.

In the second step, our goal is to estimate $m(x, v^*) = E[y \mid x, v^*]$ where $v^* = \varphi(x, z; \theta)$. Because we do not observe $\theta$, we use $\hat v_i = \varphi(x_i, z_i; \hat\theta)$ in the non-parametric regression. The non-parametric regression estimator of $y$ on $x, \hat v = \varphi(x, z; \hat\theta)$ is denoted by $\hat m$. (This $\hat m$ is distinct from the non-parametric regression of $y$ on $x, v^* = \varphi(x, z; \theta)$.) Our goal is to characterize the first-order asymptotic properties of
$$\hat\beta = \frac{1}{n} \sum_{i=1}^{n} h\big(\hat m(x_i, \varphi(x_i, z_i; \hat\theta))\big).$$
We can consider $\hat\beta$ as the solution of a sample moment equation that is derived from a population moment equation that depends on $\beta$ and $m(x, \varphi(x, z; \theta))$. Using Newey's (1994) path-derivative approach, it can be shown that we have the approximation
$$\sqrt{n}(\hat\beta - \beta) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( h(m(x_i, \varphi(x_i, z_i; \theta))) - \beta \big) \qquad \text{(i)}$$
$$+ \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( h(\hat m(x_i, \varphi(x_i, z_i; \theta))) - h(m(x_i, \varphi(x_i, z_i; \theta))) \big) \qquad \text{(ii)}$$
$$+ \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( h(m(x_i, \varphi(x_i, z_i; \hat\theta))) - h(m(x_i, \varphi(x_i, z_i; \theta))) \big) + o_p(1). \qquad \text{(iii)}$$
The first term (i) is the main term, the second term (ii) is the adjustment for the estimation of $\hat m$, and the third term (iii) is the adjustment related to the estimation of $\hat\theta$. The decomposition here is based on the fact that Newey's approach can be used term-by-term. Therefore, we may without loss of generality assume that $\theta$ is a scalar. The second component (ii) in the decomposition can be analyzed as in Newey (1994), and we therefore focus on the analysis of the third component
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( h(m(x_i, \varphi(x_i, z_i; \hat\theta))) - h(m(x_i, \varphi(x_i, z_i; \theta))) \big).$$
We define
$$m(x, v; \theta) = E[y \mid x, \varphi(x, z; \theta) = v], \qquad g(s; \theta_1, \theta_2) = h\big(m(x, \varphi(x, z; \theta_1); \theta_2)\big).$$
Note that the two roles that $\theta$ plays are made explicit in $g(s; \theta_1, \theta_2)$, which is obtained by substituting $v = \varphi(x, z; \theta_1)$ in $m(x, v; \theta_2)$. Note also that $m(x, v) = m(x, v; \theta)$, where $m(x, v) = E[y \mid x, \varphi(x, z; \theta) = v]$. The notation $\theta_1, \theta_2$ is just an expositional device, since $\theta_1 = \theta_2 = \theta$.
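To fix ideas, the three steps can be run on a small simulated design. Everything in the sketch below (the data generating process, the OLS first step, the Nadaraya-Watson second step with its bandwidth, and the choice $h(m) = m^2$) is our own illustrative choice, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta0 = 2000, 1.0

# Toy design: generated regressor v* = phi(z; theta) = theta*z,
# second-step regression E[y | x, v*] = x + v*, and h(m) = m**2,
# so the target is beta = E[(x + v*)**2] = 1 + theta0**2 = 2.
z = rng.standard_normal(n)
x = rng.standard_normal(n)
u = theta0 * z + rng.standard_normal(n)              # first-step dependent variable
y = x + theta0 * z + 0.5 * rng.standard_normal(n)

# Step 1: parametric first step (OLS through the origin), root-n consistent
theta_hat = (z @ u) / (z @ z)
v_hat = theta_hat * z                                # generated regressor

# Step 2: Nadaraya-Watson regression of y on (x, v_hat), product Gaussian kernel
def nw(points, data, resp, h):
    d = (points[:, None, :] - data[None, :, :]) / h
    w = np.exp(-0.5 * (d ** 2).sum(axis=2))
    return (w @ resp) / w.sum(axis=1)

X2 = np.column_stack([x, v_hat])
m_hat = nw(X2, X2, y, h=0.2)

# Step 3: sample average of h(m_hat) = m_hat**2
beta_hat = np.mean(m_hat ** 2)
print(beta_hat)
```

The sections that follow characterize the contribution of the first-step estimation error in $\hat\theta$ (and of $\hat m$) to the influence function of such a $\hat\beta$.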

With these definitions, we can now write
$$\frac{1}{n} \sum_{i=1}^{n} h\big(m(x_i, \varphi(x_i, z_i; \hat\theta); \hat\theta)\big) = \frac{1}{n} \sum_{i=1}^{n} g(s_i; \hat\theta_1, \hat\theta_2)$$
where $\hat\theta_1 = \hat\theta_2 = \hat\theta$, but we keep them separate to emphasize the two roles of $\hat\theta$. This forces us to deal with the two roles that $\hat\theta$ plays in the linearization that involves partial derivatives:
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( h(m(x_i, \varphi(x_i, z_i; \hat\theta); \hat\theta)) - h(m(x_i, \varphi(x_i, z_i; \theta); \theta)) \big) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( g(s_i; \hat\theta_1, \hat\theta_2) - g(s_i; \theta, \theta) \big)$$
$$= \left( E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_1} \right] + E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_2} \right] \right) \sqrt{n}(\hat\theta - \theta) + o_p(1).$$
Therefore we must compute $E[\partial g(s; \theta, \theta) / \partial \theta_1]$ and $E[\partial g(s; \theta, \theta) / \partial \theta_2]$. The computation of the first expectation is easy; it is straightforward to show that
$$E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_1} \right] = E\!\left[ \frac{\partial h(m(x, \varphi(x, z; \theta)))}{\partial m} \frac{\partial m(x, \varphi(x, z; \theta))}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right].$$
The headache is to compute the second expectation. By the chain rule
$$E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_2} \right] = E\!\left[ \frac{\partial h(m(x, \varphi(x, z; \theta)))}{\partial m} \frac{\partial m(x, \varphi(x, z; \theta); \theta)}{\partial \theta_2} \right]. \qquad (1)$$
Unfortunately, it is not obvious how to differentiate $m(x, \varphi(x, z; \theta); \theta)$ with respect to the second $\theta$. After all, $m(x, \varphi(x, z; \theta); \theta)$ has the functional form of $E[y \mid x, \varphi(x, z; \theta) = v]$, which depends on $\theta$. The next lemma gives the solution in a generic case.

Lemma 1. Let $t(x, \varphi(x, z; \theta))$ denote an arbitrary mean-square integrable function that is continuously differentiable in its second argument. Then, with $v^* = \varphi(x, z; \theta)$ and $\mu(x, z) = E[y \mid x, z]$,
$$\frac{\partial}{\partial \theta} E\big[ t(x, v^*)\, m(x, v^*; \theta) \big] = -E\!\left[ t(x, v^*) \frac{\partial m(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] + E\!\left[ (\mu(x, z) - m(x, v^*)) \frac{\partial t(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right], \qquad (2)$$
where the derivative on the left is with respect to the $\theta$ in $m(x, v^*; \theta)$ only.

Our analysis is predicated on the assumption that the derivative on the left of (1) exists. Analysis of the case where the derivative does not exist is beyond the scope of this paper. A note that analyzes and gives sufficient conditions for the existence of the derivative is available on request.

Proof. Because $m(x, \varphi(x, z; \theta); \theta)$ is the solution to
$$\min_{p} E\big[ (y - p(x, \varphi(x, z; \theta)))^2 \big],$$
we have that for all $\theta$
$$E\big[ (y - m(x, \varphi(x, z; \theta); \theta))\, t(x, \varphi(x, z; \theta)) \big] = 0.$$
Differentiating with respect to $\theta$ and evaluating the result at the true $\theta$, we find (2) after rearranging. $\square$

This key lemma is used repeatedly in the sequel, beginning with the proof of the following theorem.

Theorem 1 (Contribution of the parametric first-step estimator). The adjustment to the influence function that accounts for the first-stage estimation error is
$$\left( E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_1} \right] + E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_2} \right] \right) \sqrt{n}(\hat\theta - \theta) = E\!\left[ (\mu(x, z) - m(x, v^*)) \frac{\partial^2 h(m(x, v^*))}{\partial m^2} \frac{\partial m(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] \sqrt{n}(\hat\theta - \theta) \qquad (3)$$
with $v^* = \varphi(x, z; \theta)$.

Proof. We compute the right-hand side of (1), which by Lemma 1 with $t(x, v) = \partial h(m(x, v)) / \partial m$ is equal to
$$-E\!\left[ \frac{\partial h(m(x, v^*))}{\partial m} \frac{\partial m(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] + E\!\left[ (\mu(x, z) - m(x, v^*)) \frac{\partial^2 h(m(x, v^*))}{\partial m^2} \frac{\partial m(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right].$$
Adding $E[\partial g(s; \theta, \theta) / \partial \theta_1]$, which is equal to the opposite of the first term on the right-hand side, we find the desired result. $\square$

Remark 1. From Theorem 1 it is easily deduced that the first-stage estimate makes no contribution to the influence function if $h$ is linear in $m$. Although straightforward ex post, this simplification does not seem to have been recognized in the past. The literature focused instead on the simplification that occurs if the index restriction $E[y \mid x, z] = E[y \mid x, \varphi(x, z; \theta)]$ holds. See, e.g., Newey (1994) and Klein, Shen and Vella (2010).

Suppose now that $m$ is multidimensional, i.e., $y$ is a $J$-dimensional random vector. The estimator is now
$$\hat\beta = \frac{1}{n} \sum_{i=1}^{n} h\big(\hat m_1(x_i, \varphi(x_i, z_i; \hat\theta)), \dots, \hat m_J(x_i, \varphi(x_i, z_i; \hat\theta))\big).$$
The product rule of calculus suggests that we can tackle this problem by adding the derivatives. This is formalized in the next theorem.

Theorem 2 (Contribution of the parametric first-step estimators). The adjustment to the influence function that accounts for the first-stage estimation error is
$$\sum_{j=1}^{J} \sum_{k=1}^{J} E\!\left[ \frac{\partial^2 h(m(x, \varphi(x, z; \theta)))}{\partial m_j \partial m_k} (y_j - m_j(x, \varphi(x, z; \theta))) \frac{\partial m_k(x, \varphi(x, z; \theta))}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] \sqrt{n}(\hat\theta - \theta).$$
Proof. See Appendix.

2.2 Non-parametric First Step, Non-parametric Second Step

We now assume that the first step is non-parametric. Again we have a random sample $s_i = (y_i, x_i, z_i)$, $i = 1, \dots, n$. The first-step projection of one of the components of $x$, which we denote by $u$, on some or all of the other components of $x$ and $z$ is denoted by $v^* = \varphi(x, z) = E[u \mid x, z]$. The first step is to estimate this projection by non-parametric regression. In the second step we estimate $m(x, v^*) = E[y \mid x, v^*]$ by non-parametric regression of $y$ on $x, \hat v = \hat\varphi(x, z)$. Our interest is to characterize the first-order asymptotic properties of
$$\hat\beta = \frac{1}{n} \sum_{i=1}^{n} h\big(\hat m(x_i, \hat\varphi(x_i, z_i))\big)$$
where $\hat m(x_i, \hat v_i)$ is the non-parametric regression estimate. We define
$$m(x, v_1; v_2) = E[y \mid x, v_2(x, z) = v_1], \qquad g(s; v_1, v_2) = h\big(m(x, v_1; v_2)\big)$$
where $v_2 = \varphi(x, z)$ plays the role of the conditioning function and $v_1$ that of its value. Note that $v_1$ and $v_2$ play the roles of $\theta_1$ and $\theta_2$. With these definitions, we can now write
$$\frac{1}{n} \sum_{i=1}^{n} h\big(m(x_i, \hat v_{1i}; \hat v_2)\big) = \frac{1}{n} \sum_{i=1}^{n} g(s_i; \hat v_{1i}, \hat v_2)$$
where $\hat v_1 = \hat v_2 = \hat v$. We keep them separate to emphasize their different roles. Our objective is to approximate
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( g(s_i; \hat v_1, \hat v_2) - g(s_i; v_1, v_2) \big).$$
To find the contribution of the sampling variation in $\hat v$ we take $m$ as known, i.e., the sampling variation in the second-step non-parametric regression is accounted for in a separate term that has a well-known form, since it follows directly from Newey (1994). As in Newey (1994) we consider a path $v_\eta$ indexed by $\eta \in \mathbb{R}$ such that $v_0 = v^*$. First, using the calculation in the previous section we obtain that
$$\frac{\partial}{\partial \eta} E\big[ h(m(x, v_\eta; v_\eta)) \big] \Big|_{\eta = 0} = E\big[ D(s; \dot v) \big], \qquad \dot v = \frac{\partial v_\eta}{\partial \eta}\Big|_{\eta = 0},$$

for
$$D(s; v) = \frac{\partial^2 h(m(x, v^*))}{\partial m^2} (y - m(x, v^*)) \frac{\partial m(x, v^*)}{\partial v}\, v,$$
which is linear in $v$. Second, we have that for any $v = v(x, z)$,
$$E[D(s; v)] = E[\delta(x, z)\, v(x, z)]$$
with
$$\delta(x, z) = E\!\left[ \frac{\partial^2 h(m(x, \varphi(x, z)))}{\partial m^2} (y - m(x, \varphi(x, z))) \frac{\partial m(x, \varphi(x, z))}{\partial v} \,\Big|\, x, z \right], \qquad (4)$$
where it is understood that we condition on the variables that are in $\varphi$, so that we average over all $x$ that are not in the generated regressor. By Newey (1994, Proposition 4), these two facts imply that the adjustment to the influence function is equal to
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \delta(x_i, z_i) (u_i - E[u \mid x_i, z_i]) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \delta(x_i, z_i) (u_i - \varphi(x_i, z_i))$$
with $u$ the component of $x$ that is projected on $x, z$. We summarize the result in a theorem:

Theorem 3 (Contribution of the non-parametric first-step estimator). The adjustment to the influence function that accounts for the first-stage estimation error is
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \delta(x_i, z_i) (u_i - \varphi(x_i, z_i))$$
with $\varphi(x, z) = E[u \mid x, z]$ and $\delta$ as in (4).

2.3 Additional Variables in the Third Step

So far, we have assumed that the parameter of interest is $\beta = E[h(m(x, v^*))]$, where $h$ depends only on $m(x, v^*)$. We now consider the extension to $\beta = E[h(w, m(x, v^*))]$, where $w$ is a vector of other variables that may have $x, z$ as subvectors. We consider both the case that $\varphi$ is parametric and the case that this function is non-parametric. Because, as before, the main term and the contribution of the estimation of $E[y \mid x, v^*]$ do not raise new issues, the next two theorems only give the contribution of the first-stage estimator. In these theorems we use the function
$$\tau(x, v) = E\!\left[ \frac{\partial h(w, m(x, v^*))}{\partial m} \,\Big|\, x, \varphi(x, z; \theta) = v \right]$$
with $\varphi(x, z)$ substituted in the non-parametric case.

Theorem 4 (Contribution of the parametric first-step estimator). The adjustment to the influence function that accounts for the first-stage estimation error is
$$E\!\left[ \left( \frac{\partial h(w, m(x, v^*))}{\partial m} - \tau(x, v^*) \right) \frac{\partial m(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} + (\mu(x, z) - m(x, v^*)) \frac{\partial \tau(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] \sqrt{n}(\hat\theta - \theta)$$
with $v^* = \varphi(x, z; \theta)$.

Proof. See Appendix.

Now we consider the case where the first step is non-parametric. The discussion preceding Theorem 3 implies that

Theorem 5 (Contribution of the non-parametric first-step estimator). The adjustment to the influence function that accounts for the first-stage estimation error is
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \delta_2(x_i, z_i) (u_i - \varphi(x_i, z_i))$$
with $\varphi(x, z) = E[u \mid x, z]$ and
$$\delta_2(x, z) = E\!\left[ \left( \frac{\partial h(w, m(x, \varphi(x, z)))}{\partial m} - \tau(x, \varphi(x, z)) \right) \frac{\partial m(x, \varphi(x, z))}{\partial v} + (\mu(x, z) - m(x, \varphi(x, z))) \frac{\partial \tau(x, \varphi(x, z))}{\partial v} \,\Big|\, x, z \right].$$

Remark 2. Theorems 4 and 5 are easily generalized to the case of multidimensional $m$.

Remark 3. Suppose that $\tau(x, \varphi(x, z; \theta)) = 0$ in Theorem 4. The adjustment is then equal to the derivative with respect to $\theta_1$, i.e., the naive derivative that only accounts for $\theta$ as an argument (see equation (15) in the proof of Theorem 4). Therefore, it may be useful to check whether $\tau(x, \varphi(x, z; \theta)) = 0$ in specific models. If this is the case, we need not worry about the effect of first-step estimation on the second-stage non-parametric regression. Such a characterization turns out to be useful for the partially linear regression model $y_i = x_i' \beta + m(\varphi(x_i, z_i; \theta)) + \varepsilon_i$, where $m$ is non-parametric and $E[\varepsilon_i \mid x_i, v_i^*] = 0$.²

Remark 4. The theorems can be applied to general semi-parametric GMM estimators. If we consider the moment condition $E[\rho(w, m(x, v^*); \beta)] = 0$ and we linearize the corresponding sample moment condition, we obtain
$$\sqrt{n}(\hat\beta - \beta) = -\left( E\!\left[ \frac{\partial \rho(w, m(x, v^*); \beta)}{\partial \beta} \right] \right)^{-1} \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \rho\big(w_i, \hat m(x_i, \hat\varphi(x_i, z_i)); \beta\big) + o_p(1).$$
Therefore, the contribution of the first-stage estimate to the asymptotic distribution of $\hat\beta$ can be found by applying Theorem 5 to $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \rho(w_i, \hat m(x_i, \hat\varphi(x_i, z_i)); \beta)$.

² A detailed discussion can be found in a previous version of the paper, which is available on request.
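Before turning to partial means, the adjustment formula in Theorem 1 can be checked symbolically in a closed-form design (our own construction, not from the paper): let $x, z$ be independent standard normal, $y = x + \varepsilon$, $\varphi(x, z; \theta) = x + \theta z$, let the second-step regression condition on $v^*$ alone (the "$x$" part of the conditioning set is empty), and take $h(m) = m^2$. Then $m(v; \theta_2) = E[y \mid x + \theta_2 z = v] = v/(1 + \theta_2^2)$, so $E[g(\theta_1, \theta_2)] = (1 + \theta_1^2)/(1 + \theta_2^2)^2$, while $\mu(x, z) = x$. The script verifies that the sum of the two partial derivatives equals the Theorem 1 kernel $E[(\mu - m)\, h''(m)\, (\partial m/\partial v)\, (\partial \varphi/\partial \theta)]$:

```python
import sympy as sp

t1, t2, t0 = sp.symbols('theta1 theta2 theta0', positive=True)

# E[g(theta1, theta2)] for the toy design: x, z ~ N(0,1) independent,
# y = x + eps, phi = x + theta*z, m(v; theta2) = v/(1 + theta2**2), h(m) = m**2
Eg = (1 + t1**2) / (1 + t2**2)**2

# dual-role derivative: partial wrt theta1 (argument role)
# plus partial wrt theta2 (conditioning role), evaluated at theta1 = theta2 = theta0
total = (sp.diff(Eg, t1) + sp.diff(Eg, t2)).subs([(t1, t0), (t2, t0)])

# Theorem 1 kernel, computed by hand with Gaussian moments:
# mu - m = (theta0**2*x - theta0*z)/(1+theta0**2); h'' = 2;
# dm/dv = 1/(1+theta0**2); dphi/dtheta = z; E[(theta0**2*x - theta0*z)*z] = -theta0
theorem = 2 * (-t0) / (1 + t0**2)**2

assert sp.simplify(total - theorem) == 0
```

Because $h$ is nonlinear and the index restriction fails here ($\mu = x \neq m$), the adjustment is nonzero, consistent with Remark 1.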

3 The Influence Function of Semi-parametric Three-Step Estimators: Partial Means

In Section 2, the estimator $\hat\beta$ averaged over all arguments of the second-step non-parametric regression function; in Newey's (1994) terminology the estimator is a full mean. In this section we consider the case where a discrete independent variable $d$ is fixed, i.e., $\hat\beta$ does not average over this variable. Hence the estimator is a partial mean. Because the variable that is fixed in the partial mean is discrete, the parametric rate of convergence applies. (If that variable were continuous we would have a slower rate.)

Let the discrete variable $d$ take the values $d^{(1)}, \dots, d^{(K)}$. In the second step we estimate
$$m(x, v^*, d) = E[y \mid x, v^*, d].$$
As in Section 2 we use $\hat v_i$ in the non-parametric regression. Define $\hat m_k(x_i, \hat v_i) = \hat m(x_i, \hat v_i, d^{(k)})$, i.e., $\hat m_k$ is the non-parametric regression function if we set $x$ and $\hat v$ to the observed values for $i$, but fix $d$ at the value $d^{(k)}$, which may not be its value for $i$. The $K$-vector $\hat m(x_i, \hat v_i)$ stacks the $\hat m_k(x_i, \hat v_i)$. We also define $m_k(x, v^*) = E[y \mid x, v^*, d^{(k)}]$ and $m(x, v^*)$ as the $K$-vector with components $m_k(x, v^*)$. Our goal is to characterize the first-order asymptotic properties of
$$\hat\beta = \frac{1}{n} \sum_{i=1}^{n} h\big(w_i, \hat m(x_i, \hat v_i)\big)$$
with $\hat m$ the $K$-vector of non-parametric regression functions where the discrete variable $d$ is fixed at its $K$ distinct values. As in Section 2.3, we allow for a vector of additional variables $w_i$ in $h$.

3.1 Partial Means: Parametric First Step, Non-parametric Second Step

We assume that the first step is parametric, and $\hat v_i = \varphi(x_i, z_i; \hat\theta)$. As in Section 2, we can use Newey's (1994) approach, and express the influence function of $\hat\beta$ as a sum of three terms: (i) the main term, (ii) a term that adjusts for the estimation of $\hat m$, and (iii) an adjustment related to the estimation of $\hat\theta$. Define
$$p_k(x, \varphi(x, z; \theta)) = \Pr(d = d^{(k)} \mid x, \varphi(x, z; \theta)), \qquad p_k(x, z) = \Pr(d = d^{(k)} \mid x, z)$$
and
$$\tau_k(x, v) = E\!\left[ \frac{\partial h(w, m(x, v^*))}{\partial m_k} \,\Big|\, x, \varphi(x, z; \theta) = v \right].$$
The second component in the decomposition can be analyzed as in Newey (1994) and is equal to
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \sum_{k=1}^{K} \mathbb{1}(d_i = d^{(k)}) (y_i - m_k(x_i, v_i^*)) \frac{\tau_k(x_i, v_i^*)}{p_k(x_i, v_i^*)} + o_p(1). \qquad (5)$$

As in Section 2 we therefore focus on the analysis of the third component
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( h(w_i, m(x_i, \varphi(x_i, z_i; \hat\theta); \hat\theta)) - h(w_i, m(x_i, \varphi(x_i, z_i; \theta); \theta)) \big) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( g(s_i; \hat\theta_1, \hat\theta_2) - g(s_i; \theta, \theta) \big)$$
with
$$m_k(x, v; \theta) = E[y \mid x, \varphi(x, z; \theta) = v, d = d^{(k)}], \qquad m(x, v; \theta) = (m_1(x, v; \theta), \dots, m_K(x, v; \theta))'$$
$$g(s; \theta_1, \theta_2) = h\big(w, m(x, \varphi(x, z; \theta_1); \theta_2)\big).$$
Lemma 1 can be generalized to the partial means case:

Lemma 2. For partial means we have, for all $k = 1, \dots, K$, with $v^* = \varphi(x, z; \theta)$ and $\mu_k(x, z) = E[y \mid x, z, d = d^{(k)}]$,
$$\frac{\partial}{\partial \theta} E\big[ t(x, v^*)\, m_k(x, v^*; \theta) \big] = -E\!\left[ \frac{p_k(x, z)}{p_k(x, v^*)}\, t(x, v^*) \frac{\partial m_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]$$
$$+ E\!\left[ \frac{p_k(x, z)}{p_k(x, v^*)} (\mu_k(x, z) - m_k(x, v^*)) \frac{\partial t(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] - E\!\left[ \frac{p_k(x, z)}{p_k(x, v^*)^2} (\mu_k(x, z) - m_k(x, v^*))\, t(x, v^*) \frac{\partial p_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] \qquad (6)$$
where the derivative on the left is with respect to the $\theta$ in $m_k(x, v^*; \theta)$ only.

Proof. Because $m_k(x, \varphi(x, z; \theta); \theta)$ is the solution to
$$\min_{p} E\big[ \mathbb{1}(d = d^{(k)}) (y - p(x, \varphi(x, z; \theta)))^2 \big],$$
we have that for all $\theta$
$$E\big[ \mathbb{1}(d = d^{(k)}) (y - m_k(x, \varphi(x, z; \theta); \theta))\, t(x, \varphi(x, z; \theta)) \big] = 0.$$
Differentiating with respect to $\theta$ and evaluating the result at the true $\theta$, we find (6) after rearranging. $\square$

The next theorem generalizes Theorem 4:

Theorem 6 (Contribution of the parametric first-step estimator). The adjustment to the influence function that accounts for the first-stage estimation error is
$$\Bigg( \sum_{k=1}^{K} E\!\left[ \left( \frac{\partial h(w, m(x, v^*))}{\partial m_k} - \frac{p_k(x, z)}{p_k(x, v^*)}\, \tau_k(x, v^*) \right) \frac{\partial m_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]$$
$$+ \sum_{k=1}^{K} E\!\left[ \frac{p_k(x, z)}{p_k(x, v^*)} (\mu_k(x, z) - m_k(x, v^*)) \frac{\partial \tau_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]$$
$$- \sum_{k=1}^{K} E\!\left[ \frac{p_k(x, z)}{p_k(x, v^*)^2} (\mu_k(x, z) - m_k(x, v^*))\, \tau_k(x, v^*) \frac{\partial p_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] \Bigg) \sqrt{n}(\hat\theta - \theta).$$

Proof. See Appendix.

If $h$ only depends on $m$, then $\tau_k(x, v) = \partial h(m(x, v)) / \partial m_k$, so that the contribution of the first stage is
$$\Bigg( \sum_{k=1}^{K} E\!\left[ \left( 1 - \frac{p_k(x, z)}{p_k(x, v^*)} \right) \frac{\partial h(m(x, v^*))}{\partial m_k} \frac{\partial m_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]$$
$$+ \sum_{k=1}^{K} \sum_{k'=1}^{K} E\!\left[ \frac{p_k(x, z)}{p_k(x, v^*)} (\mu_k(x, z) - m_k(x, v^*)) \frac{\partial^2 h(m(x, v^*))}{\partial m_k \partial m_{k'}} \frac{\partial m_{k'}(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]$$
$$- \sum_{k=1}^{K} E\!\left[ \frac{p_k(x, z)}{p_k(x, v^*)^2} (\mu_k(x, z) - m_k(x, v^*)) \frac{\partial h(m(x, v^*))}{\partial m_k} \frac{\partial p_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] \Bigg) \sqrt{n}(\hat\theta - \theta). \qquad (7)$$

3.2 Partial Means: Non-parametric First Step, Non-parametric Second Step

Now assume that the first step consists of a non-parametric regression estimate of $v^* = \varphi(x, z) = E[u \mid x, z]$. We can obtain the adjustment by replicating the arguments in Section 2.3 leading to Theorem 5. Letting
$$\delta_3(x, z) = E\!\Bigg[ \sum_{k=1}^{K} \left( \frac{\partial h(w, m(x, v^*))}{\partial m_k} - \frac{p_k(x, z)}{p_k(x, v^*)}\, \tau_k(x, v^*) \right) \frac{\partial m_k(x, v^*)}{\partial v}$$
$$+ \sum_{k=1}^{K} \frac{p_k(x, z)}{p_k(x, v^*)} (\mu_k(x, z) - m_k(x, v^*)) \frac{\partial \tau_k(x, v^*)}{\partial v} - \sum_{k=1}^{K} \frac{p_k(x, z)}{p_k(x, v^*)^2} (\mu_k(x, z) - m_k(x, v^*))\, \tau_k(x, v^*) \frac{\partial p_k(x, v^*)}{\partial v} \,\Bigg|\, x, z \Bigg], \qquad (8)$$
we obtain an analog of Theorem 3:

Theorem 7 (Contribution of the non-parametric first-step estimator). The adjustment to the influence function that accounts for the first-stage estimation error is
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \delta_3(x_i, z_i) (u_i - \varphi(x_i, z_i))$$
with $\delta_3(x, z)$ given in (8). If $h$ depends on $m$ only, we replace $\tau_k$ in (8) by $\partial h(m(x, v^*)) / \partial m_k$, so that $\partial \tau_k / \partial v = \sum_{k'} (\partial^2 h / \partial m_k \partial m_{k'}) (\partial m_{k'} / \partial v)$, as in (7).

4 Application: Regression on the Estimated Propensity Score

We consider an intervention with potential outcomes $y_0, y_1$ that are the control and treated outcomes, respectively. The treatment indicator is $d$, and $y = d y_1 + (1 - d) y_0$ is the observed outcome. The vector $x$ contains covariates that are not affected by the intervention. As shown by Rosenbaum and Rubin (1983), unconfounded assignment, i.e., the assumption that $y_1, y_0 \perp d \mid x$, implies $y_1, y_0 \perp d \mid \varphi(x)$ with $\varphi(x) = \Pr(d = 1 \mid x)$ the probability of selection or propensity score. This observation has led to a large number of estimators. The asymptotic variance of these estimators can be compared to the semi-parametric efficiency bound for the ATE derived by Hahn (1998).

The most popular estimators are the matching estimators that estimate the ATE given $x$ or given $\varphi(x)$ by averaging outcomes over units with a similar value of $x$ or $\varphi(x)$. Abadie and Imbens (2009a, 2009b) are recent contributions. They show that matching estimators, which have an asymptotic distribution that is notoriously difficult to analyze, are not asymptotically efficient. The second class of estimators do not estimate the ATE given $x$ or $\varphi(x)$ but use the propensity scores as weights. Hahn's (1998) estimator and the estimator of Hirano, Imbens and Ridder (2003) are examples of such estimators. These estimators are asymptotically efficient. The third class of estimators use non-parametric regression to estimate $E[y \mid d = 1, x]$, $E[y \mid d = 0, x]$ or $E[y \mid d = 1, \varphi(x)]$, $E[y \mid d = 0, \varphi(x)]$. Of these estimators, the estimator based on $E[y \mid d = 1, x]$, $E[y \mid d = 0, x]$, the imputation estimator, is known to be asymptotically efficient, which suggests that there is no role for the propensity score. The missing result is that for the estimator that uses the non-parametric regression on the propensity score that is estimated in a preliminary step. This estimator, which was suggested and analyzed by Heckman, Ichimura, and Todd (HIT) (1998), fits into our framework and is analyzed here.³ Our conclusion is that the HIT estimator has the same asymptotic variance as the imputation estimator, so that there is no efficiency gain in using the propensity score. This should settle the issue whether there is a role for the propensity score in achieving semi-parametric efficiency.⁴

4.1 Parametric First Step, Non-parametric Second Step

We have a random sample $s_i = (y_i, x_i, d_i)$, $i = 1, \dots, n$. The propensity score $\Pr(d = 1 \mid x) = \varphi(x; \theta)$ is parametric and its parameters are estimated in the first step, e.g. by Maximum Likelihood or OLS (Linear Probability Model) or any other method such that
$$\sqrt{n}(\hat\theta - \theta) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi(d_i, x_i) + o_p(1)$$
with $E[\psi(d_i, x_i)] = 0$. In the second step, we estimate
$$m(\varphi(x; \theta)) = \big( E[y \mid \varphi(x; \theta), d = 1],\; E[y \mid \varphi(x; \theta), d = 0] \big)'.$$

³ Heckman, Ichimura, and Todd actually consider an estimator of the Average Treatment Effect on the Treated (ATT) that we also analyze.
⁴ That does not mean that there is no role for the propensity score in assessing identification or in improving the small sample performance of ATE estimators.

Because we do not observe $\theta$, we use $\varphi(x_i; \hat\theta)$ in the non-parametric regression. Our interest is to characterize the first-order asymptotic properties of
$$\hat\beta = \frac{1}{n} \sum_{i=1}^{n} \big( \hat m_1(\varphi(x_i; \hat\theta)) - \hat m_2(\varphi(x_i; \hat\theta)) \big).$$
This estimator can be handled by applying Theorem 6 for the special case that $h$ only depends on $m$. The vector $m(\varphi(x; \theta))$ is a 2-vector of partial means, and $d$ is either $d^{(1)} = 1$ or $d^{(2)} = 0$. Further, $\varphi$ and the $p_k$ depend on $x$ only, with $v^* = \varphi(x; \theta)$ and $p_1(\varphi(x; \theta)) = \Pr(d = 1 \mid \varphi(x; \theta)) = \varphi(x; \theta)$, so that $p_k(x) = p_k(v^*)$ for $k = 1, 2$. Also $h(m) = m_1 - m_2$, so that the second derivatives of $h$ are 0. Upon substitution, the first two terms on the right-hand side of (7) are 0. Because $\partial h / \partial m_1 = 1$ and $\partial h / \partial m_2 = -1$, we find that the contribution of the first-stage estimator to the influence function is
$$\left( E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_1} \right] + E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_2} \right] \right) \sqrt{n}(\hat\theta - \theta) = -E\!\left[ \left( \frac{\mu_1(x) - m_1(\varphi(x; \theta))}{\varphi(x; \theta)} + \frac{\mu_2(x) - m_2(\varphi(x; \theta))}{1 - \varphi(x; \theta)} \right) \frac{\partial \varphi(x; \theta)}{\partial \theta} \right] \sqrt{n}(\hat\theta - \theta)$$
with $\mu_1(x) = E[y \mid x, d = 1]$ and $\mu_2(x) = E[y \mid x, d = 0]$. The contribution of $\hat m$ can be derived using Newey (1994), and is given in (10) below.

We also consider the HIT estimator of the Average Treatment Effect on the Treated (ATT),
$$\hat\beta = \frac{1}{n} \sum_{i=1}^{n} \frac{d_i}{p} \big( \hat m_1(\varphi(x_i; \hat\theta)) - \hat m_2(\varphi(x_i; \hat\theta)) \big)$$
with $p = \Pr(d = 1)$. This estimator is a special case of that considered in Theorem 6 with $h(w, m_1, m_2) = d(m_1 - m_2)/p$, so that $h$ does not only depend on $m_1, m_2$, and the simplification in (7) does not apply. Substitution of
$$\frac{\partial h(w, m_1, m_2)}{\partial m_1} = \frac{d}{p}, \qquad \frac{\partial h(w, m_1, m_2)}{\partial m_2} = -\frac{d}{p}$$
and ($\tau_1, \tau_2$ are functions of $v^*$ only)
$$\tau_1(v) = \frac{v}{p}, \qquad \tau_2(v) = -\frac{v}{p}$$
gives that the first term in Theorem 6 is 0, the second term is
$$\frac{1}{p} E\!\left[ \big( (\mu_1(x) - m_1(\varphi(x; \theta))) - (\mu_2(x) - m_2(\varphi(x; \theta))) \big) \frac{\partial \varphi(x; \theta)}{\partial \theta} \right]$$
and the third
$$-\frac{1}{p} E\!\left[ \left( (\mu_1(x) - m_1(\varphi(x; \theta))) + \frac{\varphi(x; \theta)}{1 - \varphi(x; \theta)} (\mu_2(x) - m_2(\varphi(x; \theta))) \right) \frac{\partial \varphi(x; \theta)}{\partial \theta} \right],$$
so that by adding the last two terms we find that the contribution is
$$\left( E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_1} \right] + E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_2} \right] \right) \sqrt{n}(\hat\theta - \theta) = -E\!\left[ \frac{\mu_2(x) - m_2(\varphi(x; \theta))}{p\,(1 - \varphi(x; \theta))} \frac{\partial \varphi(x; \theta)}{\partial \theta} \right] \sqrt{n}(\hat\theta - \theta).$$
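The ATE estimator just analyzed can be sketched numerically: a logit first step, Nadaraya-Watson regressions of $y$ on the estimated score within the treated and control samples, and averaging the difference. The design below (sample size, bandwidth, and a DGP with true ATE equal to 1) is an illustrative assumption of ours, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# toy DGP: scalar covariate, logistic propensity, true ATE = E[(1+x) - x] = 1
x = rng.uniform(-1, 1, n)
p_true = 1 / (1 + np.exp(-x))                   # phi(x; theta) with theta = (0, 1)
d = (rng.uniform(size=n) < p_true).astype(float)
y = d * (1 + x) + (1 - d) * x + 0.1 * rng.standard_normal(n)

# Step 1: parametric first step, logit MLE via Newton iterations
X = np.column_stack([np.ones(n), x])
theta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ theta))
    grad = X.T @ (d - p)
    hess = (X * (p * (1 - p))[:, None]).T @ X
    theta += np.linalg.solve(hess, grad)
p_hat = 1 / (1 + np.exp(-X @ theta))            # estimated propensity score

# Step 2: Nadaraya-Watson regressions of y on p_hat within d=1 and d=0
def nw(grid, s, resp, h=0.05):
    w = np.exp(-0.5 * ((grid[:, None] - s[None, :]) / h) ** 2)
    return (w @ resp) / w.sum(axis=1)

m1 = nw(p_hat, p_hat[d == 1], y[d == 1])
m0 = nw(p_hat, p_hat[d == 0], y[d == 0])

# Step 3: HIT-style ATE estimate, averaging the difference of the two regressions
ate_hat = np.mean(m1 - m0)
print(ate_hat)
```

Sections 4.1 and 4.2 show that, to first order, this estimator is as efficient as the imputation estimator that regresses on $x$ directly.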

4.2 Non-parametric First Step, Non-parametric Second Step

The analysis in the previous section combined with the results in Newey (1994) shows that, in the case that the first stage is non-parametric, the contribution of the first-stage estimation to the influence function of the ATE estimator is
$$-\left( \frac{\mu_1(x) - m_1(\varphi(x))}{\varphi(x)} + \frac{\mu_2(x) - m_2(\varphi(x))}{1 - \varphi(x)} \right) (d - \varphi(x)),$$
which can alternatively be written as
$$-\big( \mu_1(x) - m_1(\varphi(x)) \big) \frac{d}{\varphi(x)} + \big( \mu_1(x) - m_1(\varphi(x)) \big) + \big( \mu_2(x) - m_2(\varphi(x)) \big) \frac{1 - d}{1 - \varphi(x)} - \big( \mu_2(x) - m_2(\varphi(x)) \big). \qquad (9)$$
To obtain the complete influence function of $\hat\beta$ we need the contribution of the estimation error in $\hat m$. This contribution is equal to⁵
$$\big( m_1(\varphi(x)) - m_2(\varphi(x)) - \beta \big) + \frac{d}{\varphi(x)} \big( y - m_1(\varphi(x)) \big) - \frac{1 - d}{1 - \varphi(x)} \big( y - m_2(\varphi(x)) \big). \qquad (10)$$
Adding (9) and (10), we obtain the influence function of the estimator based on regressions on the estimated propensity score:
$$\big( \mu_1(x) - \mu_2(x) - \beta \big) + \frac{d}{\varphi(x)} \big( y - \mu_1(x) \big) - \frac{1 - d}{1 - \varphi(x)} \big( y - \mu_2(x) \big),$$
which is the influence function of the efficient estimator and also that of the imputation estimator
$$\hat\beta_I = \frac{1}{n} \sum_{i=1}^{n} \big( \hat\mu_1(x_i) - \hat\mu_2(x_i) \big)$$
with $\mu_1(x) = E[y \mid x, d = 1]$ and $\mu_2(x) = E[y \mid x, d = 0]$. The imputation estimator involves non-parametric regressions on $x$ and not on the estimated propensity score. However, these two estimators have the same influence function, which shows that regressing on the non-parametrically estimated propensity score does not result in an efficiency gain. The infeasible estimator that depends on non-parametric regressions on the population propensity score is less efficient than the estimator that uses the estimated propensity score.

For the estimator of the ATT the contribution of the first stage is
$$-\frac{\mu_2(x) - m_2(\varphi(x))}{p\,(1 - \varphi(x))} (d - \varphi(x)).$$

⁵ The derivation is straightforward, and is contained in a previous version of the paper, which is available on request.

The main term and the contribution of the estimation of the (infeasible) non-parametric regressions are⁶
$$\frac{d}{p} \big( y - m_1(\varphi(x)) \big) - \frac{(1 - d)\varphi(x)}{p\,(1 - \varphi(x))} \big( y - m_2(\varphi(x)) \big) + \frac{d}{p} \big( m_1(\varphi(x)) - m_2(\varphi(x)) - \beta \big).$$
Adding these expressions we obtain the full influence function
$$\frac{d}{p} \big( y - \mu_1(x) \big) - \frac{(1 - d)\varphi(x)}{p\,(1 - \varphi(x))} \big( y - \mu_2(x) \big) + \frac{d}{p} \big( \mu_1(x) - \mu_2(x) - \beta \big).$$
As in the case of the ATE, the influence function is the same as that for the estimator that involves non-parametric regressions on $x$ and not on the estimated propensity score, so that again there is no first-order asymptotic efficiency gain if we use the estimated propensity score in the non-parametric regressions.

It should be noted that the influence functions derived in this section are different from those found in the literature. Recently, Mammen, Rothe, and Schienle (2010) derived the influence function for the ATE estimator considered in this section. They concluded that it is identical to that of the infeasible estimator that regresses on the population propensity score. This is because they imposed the index assumption $E[y \mid d, x] = E[y \mid d, \varphi(x)]$, which is not made in the standard program evaluation literature, because it restricts the distribution of the potential outcomes. For instance, in a linear selection (on observables) model the index restriction implies that the regression coefficients in the outcome equations are proportional to those in the selection equation. HIT derived the influence function for the ATT estimator; it is also different from ours. In this case, the difference appears to be due to an error in their derivation, which fails to account for the effect of the first-stage estimation on the conditional expectation in the second stage.

5 Summary

We studied the asymptotic distribution of three-step estimators of a finite-dimensional parameter vector where the second step consists of one or more non-parametric regressions on a regressor that is estimated in the first step. The first-step estimator is either parametric or non-parametric. We showed that Newey's (1994) method can be used to determine the contribution of the first-step estimation error to the influence function. In doing so it is essential to recognize that the first-stage estimate has two effects on the sampling distribution of the finite-dimensional parameter vector: first, the first-stage estimate enters the argument at which the conditional expectation is evaluated; second, the first-stage estimate changes the conditional expectation itself. In the literature the second contribution of the first-stage estimate to the influence function is sometimes overlooked. Our contribution is that we show how to derive this contribution, so that we obtain the correct influence function for three- or more step estimators.

⁶ This can be derived using an argument similar to the one that leads to (10).

Appendix

Proof of Theorem 2. We write
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( h(m(x_i, \varphi(x_i, z_i; \hat\theta); \hat\theta)) - h(m(x_i, \varphi(x_i, z_i; \theta); \theta)) \big) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big( g(s_i; \hat\theta_1, \hat\theta_2) - g(s_i; \theta, \theta) \big)$$
$$= \left( E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_1} \right] + E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_2} \right] \right) \sqrt{n}(\hat\theta - \theta) + o_p(1).$$
Therefore we must compute $E[\partial g(s; \theta, \theta) / \partial \theta_1]$ and $E[\partial g(s; \theta, \theta) / \partial \theta_2]$. The computation of the first expectation is easy. Because $m(x, \varphi(x, z; \theta); \theta) = m(x, \varphi(x, z; \theta))$, we have
$$E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_1} \right] = \sum_{j=1}^{J} E\!\left[ \frac{\partial h(m(x, \varphi(x, z; \theta)))}{\partial m_j} \frac{\partial m_j(x, \varphi(x, z; \theta))}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]$$
where $m_j$ denotes the $j$-th component of $m$, etc. We now tackle the second expectation. By the chain rule
$$E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_2} \right] = \sum_{j=1}^{J} E\!\left[ \frac{\partial h(m(x, \varphi(x, z; \theta)))}{\partial m_j} \frac{\partial m_j(x, \varphi(x, z; \theta); \theta)}{\partial \theta_2} \right]. \qquad (11)$$
We compute the right-hand side of (11) using Lemma 1 with $t(x, v) = \partial h(m(x, v)) / \partial m_j$, for which $\partial t / \partial v = \sum_{k} (\partial^2 h / \partial m_j \partial m_k)(\partial m_k / \partial v)$:
$$E\!\left[ \frac{\partial g(s; \theta, \theta)}{\partial \theta_2} \right] = -\sum_{j=1}^{J} E\!\left[ \frac{\partial h(m(x, \varphi(x, z; \theta)))}{\partial m_j} \frac{\partial m_j(x, \varphi(x, z; \theta))}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]$$
$$+ \sum_{j=1}^{J} \sum_{k=1}^{J} E\!\left[ (y_j - m_j(x, \varphi(x, z; \theta))) \frac{\partial^2 h(m(x, \varphi(x, z; \theta)))}{\partial m_j \partial m_k} \frac{\partial m_k(x, \varphi(x, z; \theta))}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right].$$
Adding $E[\partial g(s; \theta, \theta) / \partial \theta_1]$, we find the desired result. $\square$

Proof of Theorem 4. The contribution of $\hat\theta$ is the sum of
$$E\!\left[ \frac{\partial}{\partial \theta_1} h(w, m(x, \varphi(x, z; \theta_1); \theta)) \right] \sqrt{n}(\hat\theta - \theta) \qquad (12)$$
and
$$E\!\left[ \frac{\partial}{\partial \theta_2} h(w, m(x, \varphi(x, z; \theta); \theta_2)) \right] \sqrt{n}(\hat\theta - \theta). \qquad (13)$$

Note that
$$E\!\left[ \frac{\partial}{\partial \theta_1} h(w, m(x, \varphi(x, z; \theta_1); \theta)) \right] = E\!\left[ \frac{\partial h(w, m(x, v^*))}{\partial m} \frac{\partial m(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]. \qquad (14)$$
Because $\partial m(x, \varphi(x, z; \theta); \theta) / \partial \theta_2$ is a function of $(x, \varphi(x, z; \theta))$, we have
$$E\!\left[ \frac{\partial}{\partial \theta_2} h(w, m(x, \varphi(x, z; \theta); \theta_2)) \right] = E\!\left[ \frac{\partial h(w, m(x, v^*))}{\partial m} \frac{\partial m(x, v^*; \theta)}{\partial \theta_2} \right] = E\!\left[ \tau(x, v^*) \frac{\partial m(x, v^*; \theta)}{\partial \theta_2} \right]. \qquad (15)$$
By Lemma 1,
$$E\!\left[ \tau(x, v^*) \frac{\partial m(x, v^*; \theta)}{\partial \theta_2} \right] = -E\!\left[ \tau(x, v^*) \frac{\partial m(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] + E\!\left[ (\mu(x, z) - m(x, v^*)) \frac{\partial \tau(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]. \qquad (16)$$
Combining (14)-(16), we conclude that
$$E\!\left[ \frac{\partial}{\partial \theta_1} h(w, m(x, \varphi(x, z; \theta_1); \theta)) \right] + E\!\left[ \frac{\partial}{\partial \theta_2} h(w, m(x, \varphi(x, z; \theta); \theta_2)) \right]$$
$$= E\!\left[ \left( \frac{\partial h(w, m(x, v^*))}{\partial m} - \tau(x, v^*) \right) \frac{\partial m(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} + (\mu(x, z) - m(x, v^*)) \frac{\partial \tau(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right],$$
which gives us the desired result. $\square$

Proof of Theorem 6. Note that
$$E\!\left[ \frac{\partial}{\partial \theta_1} h(w, m(x, \varphi(x, z; \theta_1); \theta)) \right] = \sum_{k=1}^{K} E\!\left[ \frac{\partial h(w, m(x, v^*))}{\partial m_k} \frac{\partial m_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]. \qquad (17)$$

Because $\partial m_k(x, v^*; \theta) / \partial \theta_2$ is a function of $(x, v^*)$, we have
$$E\!\left[ \frac{\partial}{\partial \theta_2} h(w, m(x, \varphi(x, z; \theta); \theta_2)) \right] = \sum_{k=1}^{K} E\!\left[ \frac{\partial h(w, m(x, v^*))}{\partial m_k} \frac{\partial m_k(x, v^*; \theta)}{\partial \theta_2} \right] = \sum_{k=1}^{K} E\!\left[ \tau_k(x, v^*) \frac{\partial m_k(x, v^*; \theta)}{\partial \theta_2} \right]. \qquad (18)$$
By Lemma 2,
$$\sum_{k=1}^{K} E\!\left[ \tau_k(x, v^*) \frac{\partial m_k(x, v^*; \theta)}{\partial \theta_2} \right] = -\sum_{k=1}^{K} E\!\left[ \frac{p_k(x, z)}{p_k(x, v^*)}\, \tau_k(x, v^*) \frac{\partial m_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]$$
$$+ \sum_{k=1}^{K} E\!\left[ \frac{p_k(x, z)}{p_k(x, v^*)} (\mu_k(x, z) - m_k(x, v^*)) \frac{\partial \tau_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right] - \sum_{k=1}^{K} E\!\left[ \frac{p_k(x, z)}{p_k(x, v^*)^2} (\mu_k(x, z) - m_k(x, v^*))\, \tau_k(x, v^*) \frac{\partial p_k(x, v^*)}{\partial v} \frac{\partial \varphi(x, z; \theta)}{\partial \theta} \right]. \qquad (19)$$
Combining (17)-(19), we obtain the desired conclusion. $\square$

References

[1] Abadie, A. and G. Imbens (2009a): "A Martingale Representation for Matching Estimators," unpublished working paper.
[2] Abadie, A. and G. Imbens (2009b): "Matching on the Estimated Propensity Score," unpublished working paper.
[3] Hahn, J. (1998): "On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects," Econometrica, 66, pp. 315-331.
[4] Heckman, J.J., H. Ichimura, and P. Todd (1998): "Matching as an Econometric Evaluation Estimator," Review of Economic Studies, 65, pp. 261-294.
[5] Hirano, K., G. Imbens, and G. Ridder (2003): "Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score," Econometrica, 71, pp. 1161-1189.
[6] Klein, R., C. Shen, and F. Vella (2010): "Triangular Semiparametric Models Featuring Two Dependent Endogenous Binary Outcomes," unpublished working paper.
[7] Mammen, E., C. Rothe, and M. Schienle (2010): "Nonparametric Regression with Nonparametrically Generated Covariates," unpublished working paper.
[8] Murphy, K.M. and R.H. Topel (1985): "Estimation and Inference in Two-Step Econometric Models," Journal of Business and Economic Statistics, 3, pp. 370-379.
[9] Newey, W.K. (1994): "The Asymptotic Variance of Semiparametric Estimators," Econometrica, 62, pp. 1349-1382.
[10] Pagan, A. (1984): "Econometric Issues in the Analysis of Regressions with Generated Regressors," International Economic Review, 25, pp. 221-247.
[11] Rosenbaum, P. and D. Rubin (1983): "The Central Role of the Propensity Score in Observational Studies for Causal Effects," Biometrika, 70, pp. 41-55.


More information

Solutions to HW Assignment 1

Solutions to HW Assignment 1 Solutios to HW: 1 Course: Theory of Probability II Page: 1 of 6 Uiversity of Texas at Austi Solutios to HW Assigmet 1 Problem 1.1. Let Ω, F, {F } 0, P) be a filtered probability space ad T a stoppig time.

More information