Efficiency Comparisons in Multivariate Multiple Regression with Missing Outcomes
Journal of Multivariate Analysis 61 (1997), article no. MV

Andrea Rotnitzky,* Harvard School of Public Health; Christina A. Holcroft,† University of Massachusetts Lowell; and James M. Robins,‡ Harvard School of Public Health

We consider a follow-up study in which an outcome variable is to be measured at fixed time points and covariate values are measured prior to the start of follow-up. We assume that the conditional mean of the outcome given the covariates is a linear function of the covariates and is indexed by occasion-specific regression parameters. In this paper we study the asymptotic properties of several frequently used estimators of the regression parameters, namely the ordinary least squares (OLS), the generalized least squares (GLS), and the generalized estimating equation (GEE) estimators, when the complete vector of outcomes is not always observed, the missing-data patterns are monotone, and the data are missing completely at random (MCAR) in the sense defined by Rubin [11]. We show that when the covariance of the outcome given the covariates is constant, then, as opposed to the no-missing-data case, (a) the GLS estimator is more efficient than the OLS estimator, (b) the GLS estimator is inefficient, and (c) the semiparametric efficient estimator in a model that imposes linear restrictions only on the conditional mean of the last occasion regression can be less efficient than the efficient estimator in a model that imposes linear restrictions on the conditional means of all the outcomes. We provide formulae and calculations of the asymptotic relative efficiencies of the considered estimators in three important cases: (1) for the estimators of the occasion-specific means, (2) for estimators of occasion-specific mean differences, and (3) for estimators of occasion-specific dose-response model parameters. © 1997 Academic Press

Received March 21, 1996; revised November 11.
AMS subject classification: primary 62J05; secondary 62B99.
Key words and phrases: generalized estimating equations, generalized least squares, missing data, repeated measures, semiparametric efficient.
* Supported in part by the National Institutes of Health under Grant 1-R29-GM A1.
† Supported by the National Institutes of Health under Grants 532 CA and 1-R29-GM A1.
‡ Supported in part by the National Institutes of Health under Grants 2 P30 ES00002, R01-AI32475, R01-ES03405, and K04-ES X.
Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.

1. INTRODUCTION

Many randomized and nonrandomized follow-up studies are designed so that outcomes Y_it, t=1,...,T, corresponding to the ith subject are to be measured at prespecified time points and a vector of covariates X_i is to be measured at baseline. In randomized studies, X_i may record a treatment arm indicator as well as pretreatment variables such as age, sex, and race. Often the conditional mean of the outcome Y_it given X_i is assumed to be linear in X_i, that is, E(Y_it | X_i) = β_t' X_i, and the goal of the study is to make inferences about the unknown regression parameters β_t. For example, if X_i represents dose levels of a drug administered at baseline, investigators are often interested in estimating the parameter β_t indexing an occasion-specific linear dose-response model.

Often a subset of the outcome vector Y_i = (Y_i1, ..., Y_iT)' is missing for some subjects. In this paper we assume that the outcomes are missing completely at random (MCAR) in the sense defined by Rubin [11] and that the nonresponse patterns are monotone, that is, once a subject misses a cycle of the study he or she misses also all subsequent cycles. Monotone patterns of MCAR data arise, for example, in randomized studies with staggered entry and a fixed termination calendar time. Monotone MCAR data also arise if subjects drop out of the study for reasons unrelated to Y_i.

Extensive literature exists on the estimation of the parameters β = (β_1', ..., β_T')' in the absence of missing data. When the covariance of Y_i given X_i, Σ(X_i), is known, the generalized least squares (GLS) estimator β̂_G of β is best linear unbiased [7, p. 301]. Chamberlain [3] showed that the asymptotic variance of β̂_G attains the semiparametric variance bound for regular estimators of β in the semiparametric model defined solely by the linear model restrictions on the marginal means.
When Σ(X_i) is unknown, β̂_G is unfeasible because it depends on the unknown covariance function. Carroll and Ruppert [2] showed that when Σ(X_i) is a smooth function of X_i, the two-stage generalized least squares estimator β̃_G, which uses a nonparametric estimator of Σ(X_i), has the same asymptotic distribution as β̂_G. The generalized estimating equations (GEE) estimator β̂_GEE proposed by Liang and Zeger [5] is a generalized least squares estimator of β that uses an estimate of Σ(X_i) from a, possibly misspecified, parametric model for the covariance function. When the parametric model for the covariance function is correctly specified, the GEE estimator is asymptotically equivalent to β̂_G and β̃_G.
When the true covariance function does not depend on X_i, i.e., Σ(X_i) = Σ for all i, then β̂_G is exactly equal to β̂_OLS = (β̂_{1,OLS}', ..., β̂_{T,OLS}')', where β̂_{t,OLS} is the ordinary least squares (OLS) estimator of the coefficient in the linear regression of the tth outcome Y_it on the covariates X_i [4]. Thus, when the covariance function is constant, the ordinary least squares estimator of β_t is semiparametric efficient in a model that imposes solely linear restrictions on the conditional means of the outcomes Y_it given X_i, t=1,...,T. The estimator β̂_{t,OLS} is also semiparametric efficient in the model defined by the linear restriction on the tth mean only, i.e., E(Y_it | X_i) = β_t' X_i, but without restrictions imposed on the conditional means of the remaining outcomes, i.e., E(Y_ij | X_i), j ≠ t, is unspecified [9]. Thus, with full data, when Σ(X_i) is not a function of X_i, knowledge that the means of the remaining outcomes are linear in X_i does not asymptotically add information about the regression parameter β_t corresponding to the tth outcome. Furthermore, since β̂_{t,OLS} is also the semiparametric efficient estimator of β_t when the outcomes Y_ij, j ≠ t, are not recorded, we conclude that only the outcome Y_it conveys information about β_t when no Y_it are missing and Σ(X_i) is constant.

With monotone MCAR outcomes the estimators β̂_G, β̃_G, β̂_GEE, and β̂_OLS are consistent for estimating β, but they may be less efficient than the semiparametric efficient estimator β̂_EFF in the model defined by the linear restrictions on the conditional means of the vector Y_i given X_i and the MCAR condition [9]. The goal of this paper is to compare and explain the asymptotic relative efficiencies of the estimators β̂_G, β̂_GEE, and β̂_OLS relative to β̂_EFF.

In Section 2 we describe the model and assumptions. In Section 3 we review well-known results about the estimation of β when the complete vector Y_i is observed for all subjects. In Section 4 we review a class of estimators introduced by Robins and Rotnitzky [9] that includes estimators that are asymptotically equivalent to β̂_G, β̂_GEE, β̂_OLS, and β̂_EFF.
In Section 5 we use a representation of the asymptotic variance of the estimators in this class that helps in interpreting the source of the differences among the asymptotic variances of the various considered estimators. Asymptotic relative efficiencies are explicitly calculated for the various estimators in three important special cases, namely, (1) when X_i = 1, (2) when X_i = (1, X_i*)' and X_i* is binary, and (3) when X_i = (1, X_i*)' and X_i* is an arbitrary explanatory variable. Section 6 contains some final remarks.

2. MODEL

With i = 1,...,n indexing subjects, let Y_it be the outcome of the ith subject at the tth follow-up cycle of the study, t = 1,...,T. Let X_i denote a p×1
vector of explanatory variables for the ith subject measured just prior to the start of follow-up. We assume that the first element of the vector X_i is the constant 1. Define R_it = 1 if Y_it is observed and R_it = 0 otherwise. We assume that the missing-data patterns are monotone, that is, R_it = 0 implies R_i(t+1) = 0. We also assume that X_i is completely observed for all subjects and that the vectors (X_i, Y_i, R_i), i = 1,...,n, are independent and identically distributed, where Y_i = (Y_i1, ..., Y_iT)' and R_i = (R_i1, ..., R_iT)'. We further assume that the missing-data process satisfies

    P(R_it = 1 | R_i(t-1) = 1, Y_i, X_i) = P(R_it = 1 | R_i(t-1) = 1, X_i)   (1)

and that

    P(R_it = 1 | R_i(t-1) = 1, X_i) > σ > 0,   (2)

for some σ > 0. Condition (1) is equivalent to the condition that the data are missing completely at random [11]. Condition (2) says that all subjects have a probability of having the full vector Y_i completely observed that is bounded away from zero. We suppose that the conditional mean of Y_it given X_i follows the linear regression model

    E(Y_it | X_i) = β_0t' X_i,   (3)

where β_0t is a p×1 vector of unknown parameters, t = 1,...,T. Throughout we refer to the semiparametric model defined by restriction (3) as the "all-linear-means" model. The goal of this article is to compare the asymptotic relative efficiencies of several commonly used estimators of β_0t when the outcomes Y_it are not always observed, the missingness patterns are monotone, and the data are missing completely at random, i.e., Eq. (1) is true.

3. ESTIMATION WITHOUT MISSING DATA

In this section we briefly review well-known results about the estimation of β_0t when Y_i is observed for all subjects. Let ε_it(β_t) = Y_it − β_t' X_i, ε_i(β) = (ε_i1(β_1), ..., ε_iT(β_T))' with β = (β_1', ..., β_T')', and let d(X_i) be a pT×T fixed matrix of functions of X_i. When Y_i is observed for all subjects, then under mild regularity conditions, the estimating equation

    Σ_{i=1}^n d(X_i) ε_i(β) = 0,   (4)
has a root that is consistent and asymptotically normal for estimating β_0. Several commonly used estimators of β_0 are solutions to Eq. (4) for some specific choice of d(X_i). When Σ(X_i), the covariance of Y_i given X_i, is known, the generalized least squares estimator β̂_G solves (4) with d*_GLS(X_i) = (I ⊗ X_i) Σ(X_i)^{-1}, where I is the T×T identity matrix and ⊗ denotes the Kronecker product. The Kronecker product of an a×b matrix T = [t_jk] and a c×d matrix S is the ac×bd matrix with block elements [t_jk S] (Seber, 1984, p. 7). When Σ(X_i) is unknown and satisfies certain smoothness conditions, Carroll and Ruppert [2] showed that the two-stage generalized least squares estimator β̃_G that solves (4) with d_GLS(X_i) = (I ⊗ X_i) Σ̂(X_i)^{-1}, where Σ̂(X_i) is a preliminary consistent nonparametric estimator of Σ(X_i), has the same asymptotic distribution as β̂_G. The GEE estimator [5], β̂_GEE, solves (4) with d_GEE(X_i) = (I ⊗ X_i) Ĉ(X_i)^{-1}, where Ĉ(X_i) = C(X_i, α̂) and α̂ is a consistent estimator of α_0 in the model

    Σ(X_i) = C(X_i, α_0),   (5)

where α_0 is a q×1 unknown parameter vector and C(X_i, α) is, for each α, a T×T symmetric and positive definite matrix function of X_i. Liang and Zeger [5] showed that the solution to (4) that uses d_GEE(X_i) will be a consistent and asymptotically normal estimator of β_0 even when (5) is misspecified. In fact, it is standard to show that β̂_GEE will have the same asymptotic distribution as β̃_GEE solving Eq. (4) with d*_GEE(X_i) = (I ⊗ X_i) C(X_i, α*)^{-1}, where α* is the probability limit of α̂ (see, for example, [8]). Thus, when (5) is correctly specified, d*_GEE(X_i) = d*_GLS(X_i), and hence β̂_GEE and β̂_G have the same asymptotic distribution. The estimator β̂_OLS = (β̂_{1,OLS}', ..., β̂_{T,OLS}')', in which each β̂_{t,OLS} is the ordinary least squares estimator of β_0t from the regression of Y_it on X_i, is also obtained as a solution to Eq. (4). In fact, β̂_OLS solves (4) with d_OLS(X_i) = I ⊗ X_i. Robins and Rotnitzky [9] showed that the solutions to the estimating equation (4) essentially constitute all regular and asymptotically linear (RAL) estimators of β_0. That is, any RAL estimator of β_0 is asymptotically equivalent to a solution of Eq. (4) for some choice of function d(X_i).
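The full-data claim that, with a constant covariance function, the GLS and OLS solutions of Eq. (4) coincide can be checked numerically. The sketch below is not from the paper; it simulates data under the all-linear-means model with Cov(Y_i | X_i) = Σ constant and solves (4) once with d_OLS = I ⊗ X_i (per-occasion least squares) and once with d*_GLS = (I ⊗ X_i) Σ^{-1}; all names and the specific Σ are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, T = 500, 2, 3  # subjects, covariates (incl. intercept), occasions

# Simulated full data: X_i = (1, X*_i)', Y_i is T x 1 with linear means.
X = np.column_stack([np.ones(n), rng.normal(size=n)])        # n x p
B0 = rng.normal(size=(T, p))                                  # row t = true beta_0t'
Sigma = np.array([[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]])
eps = rng.multivariate_normal(np.zeros(T), Sigma, size=n)     # Cov(Y|X) constant
Y = X @ B0.T + eps                                            # n x T

# OLS: Eq. (4) with d_OLS = I (x) X_i, i.e. per-occasion least squares.
beta_ols = np.linalg.solve(X.T @ X, X.T @ Y)                  # p x T

# GLS: Eq. (4) with d*_GLS = (I (x) X_i) Sigma^{-1}; stacking the
# occasion blocks gives the linear system (Sigma^{-1} (x) X'X) beta = b.
Sinv = np.linalg.inv(Sigma)
A = np.kron(Sinv, X.T @ X)                                    # pT x pT
b = (X.T @ Y @ Sinv).T.reshape(-1)                            # stacked by occasion
beta_gls = np.linalg.solve(A, b).reshape(T, p).T              # p x T

# With Sigma(X_i) constant, GLS and OLS are algebraically identical.
gap = np.max(np.abs(beta_gls - beta_ols))
```

The equality holds here because every occasion-specific regression uses the same design matrix, the classical seemingly-unrelated-regressions degeneracy.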
Two estimators μ̂_1 and μ̂_2 of μ_0 are said to be asymptotically equivalent if √n(μ̂_1 − μ̂_2) converges to 0 in probability. If μ̂_1 and μ̂_2 are asymptotically equivalent, then √n(μ̂_1 − μ_0) and √n(μ̂_2 − μ_0) have the same asymptotic distribution. An estimator β̂ is said to be asymptotically linear if (β̂ − β_0) is asymptotically equivalent to a sample average of i.i.d. mean-zero, finite-variance random variables. For example, the solution β̂ to an estimating equation Σ_{i=1}^n m(Y_i, X_i, β) = 0 is, under smoothness conditions on m(Y_i, X_i, β), asymptotically linear because, using a standard Taylor series expansion,
(β̂ − β_0) can be shown to be asymptotically equivalent to the sample average of −E[∂m(Y_i, X_i, β)/∂β' |_{β=β_0}]^{-1} m(Y_i, X_i, β_0). Regularity is a technical condition that prohibits super-efficient estimators by specifying that the convergence of the estimator to its limiting distribution is locally uniform.

Chamberlain [3] showed that the asymptotic variance of β̂_G achieves the semiparametric variance bound for regular estimators of β_0 in the sense defined by Begun, Hall, Huang, and Wellner [1]. The semiparametric variance bound for β_0 in a semiparametric model is the supremum of the Cramer–Rao variance bounds for β_0 over all regular parametric submodels nested within the semiparametric model, and it is therefore a lower bound for the asymptotic variance of all regular estimators of β_0.

When Σ(X_i) is not a function of X_i, it can easily be shown that the GLS and OLS estimators are algebraically identical (see, for example, [4, p. 307]). Thus, β̂_OLS coincides with the semiparametric efficient estimator β̂_G and it is therefore locally semiparametric efficient in the "all-linear-means" model at the additional restriction that Σ(X_i) is constant. A locally semiparametric efficient estimator of a parameter β_0 in model A at an additional restriction B is an estimator that attains the semiparametric variance bound for β_0 in model A when both A and B are true and remains consistent when A is true but B is false.

Consider now the estimation of β_0T, the coefficient of the regression of the outcome Y_iT on X_i, in a model that does not impose restrictions on the conditional means E(Y_it | X_i) for t < T. Specifically, under the new model, which throughout we call the "last-mean-linear" model, data on X_i and the vector Y_i are observed, i = 1,...,n, but the model imposes only a linear restriction on the last conditional mean, i.e.,

    E(Y_iT | X_i) = β_0T' X_i.   (6)

Robins and Rotnitzky [9] showed that β̂_{T,OLS} is locally semiparametric efficient for β_0T in the "last-mean-linear" model at the additional restriction that Var(Y_iT | X_i) is not a function of X_i.
Thus, since β̂_{T,OLS} is also a locally semiparametric efficient estimator of β_0T in the "all-linear-means" model at the restriction that Σ(X_i) is constant, it follows that when Y_i is observed for all subjects and Σ(X_i) is constant, knowledge that the conditional means E(Y_it | X_i) of the preceding outcomes Ȳ_iT = (Y_i1, ..., Y_i(T-1))' are linear in X_i does not asymptotically add information about β_0T. Furthermore, since β̂_{T,OLS} is also a locally semiparametric efficient estimator of β_0T in the model (6) at the restriction that Var(Y_iT | X_i) is constant when data on Ȳ_iT are not recorded [3], it follows that when Σ(X_i) is constant and the "all-linear-means" model holds, data on Ȳ_iT do not provide information about β_0T.
4. ESTIMATION WITH MONOTONE MCAR DATA

In this section we review results about the estimation of β_0 when Y_i is not fully observed for all subjects and the missing-data process satisfies (1) and (2). Let λ_it ≡ P(R_it = 1 | R_i(t-1) = 1, X_i) and π_it ≡ Π_{l=1}^t λ_il. Suppose first that

    λ_it are known for all i and t.   (7)

Robins and Rotnitzky [9] showed that the estimating equation

    Σ_{i=1}^n U_i(d, φ, β) = 0,   (8)

where

    U_i(d, φ, β) = (R_iT/π_iT) d(X_i) ε_i(β) − Σ_{t=1}^T [(R_it − λ_it R_i(t-1))/π_it] φ_t(Ȳ_it, X_i),

with φ_t(Ȳ_it, X_i), t = 1,...,T, an arbitrary pT×1 function of Ȳ_it ≡ (Y_i1, ..., Y_i(t-1))' and X_i chosen by the investigator (and R_i0 ≡ 1), has, under mild regularity conditions, a solution β̂(d, φ) that is a consistent and asymptotically normal estimator of β_0. The asymptotic variance of β̂(d, φ) is given by

    Γ(d)^{-1} Ω(d, φ) Γ(d)'^{-1},   (9)

where Γ(d) = E[d(X_i)(I ⊗ X_i)'] and Ω(d, φ) = Var[U_i(d, φ, β_0)]. They also showed that the solutions of (8) are essentially all RAL estimators of β_0 in the "all-linear-means" model with the additional restrictions (1), (2), and (7). Furthermore, the solution of (8), β̂(d_eff, φ_eff), that uses

    d_eff(X_i) = (I ⊗ X_i) {Var[(R_iT/π_iT) ε_i − Σ_{t=1}^T ((R_it − λ_it R_i(t-1))/π_it) E(ε_i | Ȳ_it, X_i) | X_i]}^{-1}   (10)

and

    φ_eff,t(Ȳ_it, X_i) = d_eff(X_i) E(ε_i | Ȳ_it, X_i),   (11)

where ε_i ≡ ε_i(β_0), is semiparametric efficient in this model. In addition, they showed that knowledge of the nonresponse probabilities λ_it does not asymptotically provide information about β_0 since the semiparametric efficiency bound for β_0 remains unchanged if the restriction (7) is dropped.
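The simplest member of the class (8) takes φ_t ≡ 0, in which case the solution is the complete-case estimator weighted by the known observation probabilities. The sketch below, not from the paper, simulates monotone MCAR data with X_i = 1 (so β_0 is the vector of occasion means, here (1, 2, 3), an assumed value) and checks that this estimator is consistent; the hazards `lam` are assumed known, as in restriction (7).

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 200_000, 3
lam = np.array([1.0, 0.8, 0.7])     # known hazards lambda_t (MCAR: no Y-dependence)
pi = np.cumprod(lam)                # pi_t = lambda_1 * ... * lambda_t

# Outcomes with means (1, 2, 3) and correlated errors; X_i = 1.
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[1.0, 0.6, 0.4], [0.6, 1.0, 0.5], [0.4, 0.5, 1.0]])
Y = mu + rng.multivariate_normal(np.zeros(T), Sigma, size=n)

# Monotone missingness: R_t = R_{t-1} * Bernoulli(lambda_t), with R_0 = 1,
# so a subject who misses cycle t misses all later cycles as well.
R = np.ones((n, T), dtype=int)
for t in range(T):
    prev = R[:, t - 1] if t > 0 else np.ones(n, dtype=int)
    R[:, t] = prev * (rng.random(n) < lam[t])

# phi = 0 in (8): solve sum_i (R_iT / pi_T) (Y_i - beta) = 0. The common
# weight cancels, so the solution is the mean over subjects with R_T = 1;
# MCAR makes the complete cases a random subsample, hence consistency.
beta_hat = Y[R[:, T - 1] == 1].mean(axis=0)
err = np.max(np.abs(beta_hat - mu))
```

Nonzero choices of φ_t recover information from the partially observed subjects; the efficient choice (10)-(11) is what β̂_EFF uses.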
That is, the semiparametric variance bound for β_0 is the same in the models (a) defined by (1), (2), (3), and (7) and (b) defined by (1), (2), and (3). They further showed that all RAL estimators of β_0 in model (b) are asymptotically equivalent to the solution of (8) for some choice of d(X_i) and φ_t(Ȳ_it, X_i).

Consider now Eq. (4) restricted to the available observations, i.e.,

    Σ_{i=1}^n d_obs(X_i) ε_i^obs(β) = 0,   (12)

where ε_i^obs(β) is the vector of observed residuals for the ith subject and d_obs(X_i) is the corresponding submatrix of d(X_i). Liang and Zeger [5] showed that (12) has a solution that is consistent and asymptotically normal for estimating β_0. Thus, since this solution is a RAL estimator of β_0, it must have the same asymptotic distribution as a solution of Eq. (8) for some specific d(X_i) and φ_t(Ȳ_it, X_i). The estimators β̂_G, β̃_G, β̂_GEE, and β̂_OLS calculated from the available observations all solve Eq. (12) using the corresponding submatrices of their respective functions d*_GLS(X_i), d_GLS(X_i), d_GEE(X_i), and d_OLS(X_i) defined in Section 3. They are therefore asymptotically equivalent to the solution of Eq. (8) for specific functions d(X_i) and φ_t(Ȳ_it, X_i). Define

    d_l^C(X_i) = (I ⊗ X_i) {Var_C[(R_iT/π_iT) ε_i − Σ_{t=1}^T ((R_it − λ_it R_i(t-1))/π_it) E_{l,C}(ε_i | Ȳ_it, X_i) | X_i]}^{-1}

and

    φ_{l,t}^C(Ȳ_it, X_i) = d_l^C(X_i) E_{l,C}(ε_i | Ȳ_it, X_i),

    E_{l,C}(ε_i | Ȳ_it, X_i) = Cov_C(Y_i, Ȳ_it | X_i) Var_C(Ȳ_it | X_i)^{-1} ε̄_it,

where ε̄_it is the (t−1)×1 vector with jth element equal to Y_ij − β_0j' X_i, and C, when used as a subscript of Var and Cov, indicates that the conditional variances and covariances are calculated assuming Cov(Y_i | X_i) = C(X_i), where C(X_i) is a given T×T symmetric positive definite matrix function of X_i. In the Appendix we show

Lemma 1. Let β̂_l(C) be the solution of Eq. (8) that uses d_l^C(X_i) and φ_{l,t}^C(Ȳ_it, X_i) instead of d(X_i) and φ_t(Ȳ_it, X_i). Then, (a) β̂_G and β̃_G are asymptotically equivalent to β̂_l(Σ), where Σ(X_i) = Cov(Y_i | X_i) is the true conditional covariance of Y_i given X_i;
(b) β̂_GEE and β̃_GEE are asymptotically equivalent to β̂_l(C_{α*}), where C_{α*}(X_i) ≡ C(X_i, α*) is the "working covariance" function defined in Eq. (5) evaluated at α*. Here, α* is the probability limit of α̂ estimated from model (5); and (c) β̂_OLS is asymptotically equivalent to β̂_l(I), where I is the T×T identity matrix.

Part (a) of Lemma 1 was shown by Robins and Rotnitzky [9] and is included here for completeness. Robins and Rotnitzky [9] also showed that the asymptotic variance of β̂_l(Σ) is equal to Ω(d_l^Σ, φ_l^Σ)^{-1}. Part (b) of Lemma 1 implies that when model (5) is correctly specified, β̂_GEE has the same asymptotic distribution as β̂_G. Robins and Rotnitzky [9] showed that β̂_G has the smallest asymptotic variance in the class of estimators that are solutions to Eq. (12). They also showed that β̂_G and the semiparametric efficient estimator β̂(d_eff, φ_eff) have the same asymptotic variance if and only if E_{l,Σ}(ε_i | Ȳ_it, X_i) = E(ε_i | Ȳ_it, X_i), i.e., when the conditional expectation of ε_i is linear in Ȳ_it.

In this section we have shown that the estimators β̂_G, β̃_G, β̂_GEE, β̃_GEE, and β̂_OLS calculated from all available observations are asymptotically equivalent to solutions of Eq. (8) for specific choices of functions d(X_i) and φ_t(Ȳ_it, X_i) when the MCAR condition (1) holds and the missing-data patterns are monotone. In Section 3 we noted that, in the absence of missing data, β̂_G and β̃_G were semiparametric efficient. As argued previously, with monotone MCAR data, β̂_G is no longer efficient if the conditional means E(ε_i | Ȳ_it, X_i) are nonlinear functions of Ȳ_it. In Section 3 we further noted that when Σ(X_i) is constant, β̂_G and β̂_OLS are algebraically identical. This is no longer true with monotone MCAR data. In fact, in the next section we show that large efficiency gains can be obtained by using β̂_G instead of β̂_OLS.

Consider now the estimation of β_0T in the "last-mean-linear" model defined by restriction (6) when Y_i is not observed for all subjects and the data are MCAR and monotone. Rotnitzky and Robins [10] showed that all RAL estimators of β_0T in the model defined by (1), (2), and (6) are asymptotically equivalent to a solution β̂_T(d*, φ*) of Σ_{i=1}^n S_i(d*, φ*, β_T) = 0 for some specific p×1 functions d*(X_i) and φ*_t(Ȳ_it, X_i), t = 1,...,T.
The estimating function S_i is defined as

    S_i(d*, φ*, β_T) = (R_iT/π_iT) d*(X_i) ε_iT(β_T) − Σ_{t=1}^T [(R_it − λ_it R_i(t-1))/π_it] φ*_t(Ȳ_it, X_i).
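The augmentation term in S_i is what separates these estimators from plain weighted complete-case analysis. The sketch below, an assumed toy setup rather than anything from the paper, takes T = 2 with Y_1 always observed, X_i = 1, and E(ε_2 | Y_1) = ρY_1 known by construction; it compares, over repeated samples, the φ* = 0 estimator of the last-occasion mean with the solution of S_i = 0 using φ*(Y_1) = ρY_1, the efficient choice in this linear setting.

```python
import numpy as np

rng = np.random.default_rng(2)
reps, n, lam2, rho = 1000, 2000, 0.5, 0.9
tau = np.sqrt(1 - rho**2)          # so Var(Y_2) = 1
mu2 = 3.0                          # assumed true last-occasion mean

est_cc, est_aug = [], []
for _ in range(reps):
    Y1 = rng.normal(size=n)                              # lambda_1 = 1
    Y2 = mu2 + rho * Y1 + tau * rng.normal(size=n)
    R2 = rng.random(n) < lam2                            # MCAR at occasion 2
    Rf = R2.astype(float)
    # phi* = 0: complete-case mean (the weights cancel in the solution).
    est_cc.append(Y2[R2].mean())
    # phi*(Y_1) = rho*Y_1: solve sum_i S_i = 0 exactly for beta.
    num = np.sum(Rf * Y2 / lam2 - (Rf - lam2) / lam2 * (rho * Y1))
    den = np.sum(Rf / lam2)
    est_aug.append(num / den)

# Theory: AVar_cc = Var(Y_2)/lam2 = 2.0; AVar_aug = rho^2 + tau^2/lam2 = 1.19.
var_ratio = np.var(est_aug) / np.var(est_cc)
bias_aug = abs(np.mean(est_aug) - mu2)
```

The partially observed subjects contribute through Y_1 via the augmentation, which is why the variance drops; with ρ = 0 the two estimators coincide in distribution.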
Robins and Rotnitzky [9] also showed that the solution β̂_T(d*_eff, φ*_eff) that uses

    d*_eff(X_i) = X_i {Var[(R_iT/π_iT) ε_iT − Σ_{t=1}^T ((R_it − λ_it R_i(t-1))/π_it) E(ε_iT | Ȳ_it, X_i) | X_i]}^{-1}

and φ*_eff,t(Ȳ_it, X_i) = d*_eff(X_i) E(ε_iT | Ȳ_it, X_i) has asymptotic variance equal to Ω*^{-1} = Var[S_i(d*_eff, φ*_eff, β_0T)]^{-1}, which attains the semiparametric variance bound for estimating β_0T in the model defined by restrictions (1), (2), and (6). Since β̂_T(d_eff, φ_eff) has asymptotic variance that attains the semiparametric bound in the model that additionally assumes the linearity of the conditional means of Y_it given X_i, t < T, then if we let the inverse of the variance bound of β_0T represent the amount of information about β_0T in a given model, we have that

    {AVar[β̂_T(d_eff, φ_eff)]^{-1} − AVar[β̂_T(d*_eff, φ*_eff)]^{-1}} / AVar[β̂_T(d_eff, φ_eff)]^{-1}

represents the fraction of the information about β_0T associated with the knowledge that E(Y_it | X_i) is a linear function of X_i for all t < T, where for any estimator μ̂ of a parameter μ_0, AVar(μ̂) denotes the variance of the asymptotic distribution of √n(μ̂ − μ_0). In Section 5 we examine this fraction for the special case in which X_i = (1, X_i*)' for an arbitrary explanatory variable X_i*.

5. EFFICIENCY COMPARISONS

In this section we compare the asymptotic relative efficiencies (AREs) of the various estimators of β_0t discussed in Section 4 in the model E(Y_it | X_i) = β_{0,0,t} + β_{0,1,t} X_i*, where X_i* is a scalar random variable. We start with the case X_i* ≡ 0, which corresponds to the problem of estimating the mean β_{0,0,t} of Y_it, t = 1,...,T. We then consider the case in which X_i* is a binary variable and finally the case of an arbitrary covariate X_i*. Without loss of generality, we focus on the efficiency comparisons of the estimators of the coefficients β_{0,0,T} and β_{0,1,T} of the model for the conditional mean of the last outcome Y_iT given X_i.
5.1. Estimation of Occasion-Specific Means

Suppose that X_i consists solely of the constant 1. In this case we are interested in estimating β_{0,0,T}, the mean of the outcome measured at the last occasion. To illustrate the asymptotic behavior of the estimators of β_{0,0,T}, we consider first the simple but pedagogical case in which T = 2 and Y_i1 is observed for all subjects. The semiparametric efficient estimator β̂_2(d_eff, φ_eff) of β_{0,0,2} has asymptotic variance equal to the lower rightmost element of Ω_eff^{-1} ≡ Ω(d_eff, φ_eff)^{-1}, which can be easily calculated to be

    Var(ε_2) + [(1−λ_2)/λ_2] E[Var(ε_2 | Y_1)].   (13)

Since β̂_2(d_eff, φ_eff) is semiparametric efficient, I_MIS = AVar[β̂_2(d_eff, φ_eff)]^{-1} represents the information available for estimating β_{0,0,2} when, asymptotically, a fraction 1−λ_2 of the outcomes Y_2 are missing. Since I_FULL = Var(ε_2)^{-1} is the information for estimating β_{0,0,2} when all Y_2's are observed, then with Φ_eff = λ_2^{-1}(1−λ_2) E[Var(ε_2 | Y_1)],

    (I_FULL − I_MIS)/I_FULL = Φ_eff/[Var(ε_2) + Φ_eff]

represents the fraction of information lost due to missing Y_2's. This fraction is equal to 0 when Φ_eff = 0, which occurs when λ_2 = 1, i.e., when Y_2 is observed for all subjects, or when Var(ε_2 | Y_1) = 0, i.e., when Y_1 is a perfect predictor of Y_2.

The asymptotic variance of β̂_l(C) is given by the lower rightmost element of Γ(d_l^C)^{-1} Ω(d_l^C, φ_l^C) Γ(d_l^C)'^{-1}. It is easy to show that this element is equal to

    Var(ε_2) + [(1−λ_2)/λ_2] E[{ε_2 − E_{l,C}(ε_2 | ε_1)}²].   (14)

Formula (14) with C(X_i) = C(X_i, α*) is, in view of Lemma 1, the asymptotic variance of β̂_{GEE,2}, the GEE estimator of β_{0,0,2} that uses the "working covariance" model (5). In particular, taking C(X_i) = I, the asymptotic variance of β̂_{OLS,2} is given by

    Var(ε_2) + [(1−λ_2)/λ_2] Var(ε_2).   (15)

Notice that (15) coincides with Var(ε_2)/λ_2, which is equal to Var(ε_2), the asymptotic variance of the normalized sample mean of Y_2 had no Y_2 been missing, divided by λ_2, the fraction of subjects with Y_2 observed for large n.
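Formulas (13) and (15), and the analogous GLS variance based on the linear-projection residual variance ((18) below), can be evaluated for any concrete joint law. A minimal sketch, under an assumed model with a nonlinear conditional mean, Y_2 = Y_1² + Y_1 + e with Y_1 and e standard normal, for which the three population quantities are exactly 1, 3, and 4:

```python
import numpy as np

rng = np.random.default_rng(3)
n, lam2 = 1_000_000, 0.5
# Assumed joint law with nonlinear E(Y_2 | Y_1): Y_2 = Y_1^2 + Y_1 + e.
Y1 = rng.normal(size=n)
tau2 = 1.0
Y2 = Y1**2 + Y1 + np.sqrt(tau2) * rng.normal(size=n)

var_e2 = Y2.var()                             # Var(eps_2); exactly 2 + 1 + 1 = 4
cond = tau2                                   # E[Var(eps_2 | Y_1)] = tau^2 = 1
rho2 = np.corrcoef(Y1, Y2)[0, 1] ** 2         # squared correlation; exactly 1/4
var_lin = var_e2 * (1 - rho2)                 # linear-projection residual var; ~3

k = (1 - lam2) / lam2
avar_eff = var_e2 + k * cond                  # formula (13): -> 5
avar_gls = var_e2 + k * var_lin               # formula (18): -> 7
avar_ols = var_e2 + k * var_e2                # formula (15): -> 8 = Var(eps_2)/lam2
```

The strict ordering eff < GLS < OLS here reflects both the Y_1-Y_2 correlation (which OLS ignores) and the nonlinearity of E(Y_2 | Y_1) (which the GLS linear projection cannot capture).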
Formula (14) says that the asymptotic variance of β̂_{GEE,2} depends on the probability limit of the estimated working covariance only through E[{ε_2 − E_{l,C}(ε_2 | ε_1)}²]. In particular,

    AVar(β̂_{GEE,2}) − AVar(β̂_{OLS,2}) = [(1−λ_2)/λ_2] {E[{ε_2 − E_{l,C}(ε_2 | ε_1)}²] − E(ε_2²)}.   (16)

It follows from (16) that β̂_{OLS,2} is not necessarily less efficient than β̂_{GEE,2}, since for certain choices of working covariance C the right-hand side of (16) will be positive. For example, if the working covariance model specifies that the covariance of Y_i is constant and equal to

    Cov(Y_i) = (1  −1/2; −1/2  1)   (17)

but the true covariance of Y_i is (1  ρ_0; ρ_0  1) for some ρ_0 ≠ −1/2, then (16) is equal to (1−λ_2) λ_2^{-1} (ρ_0 + 1/4), which is positive if ρ_0 > −(0.5)². This result says that β̂_GEE is not necessarily a more efficient estimator than the ("working-independence") OLS estimator β̂_OLS. Of course, in our example, since X_i = 1, the GEE estimator that uses an unrestricted model for Cov(Y_i | X_i), that is,

    Cov(Y_i | X_i) = (α_01  α_02; α_02  α_03)

for some unknown parameters α_01, α_02, and α_03, is semiparametric efficient and feasible, and it will be preferred to GEE estimators using, possibly correct, constant-valued working covariances such as (17). The point of our example was to show that β̂_GEE can be less efficient than β̂_OLS and that working covariance models should be chosen carefully if efficiency improvements over OLS are desired.

The asymptotic variance of the generalized least squares estimator β̂_{G,2} is, by part (a) of Lemma 1, equal to (14) with C(X_i) = Cov(Y_i | X_i) and thus can be written as

    Var(ε_2) + [(1−λ_2)/λ_2] E[Var_l(ε_2 | Y_1)],   (18)

where Var_l(ε_2 | Y_1) = Var(Y_2) − [Cov(Y_1, Y_2)²/Var(Y_1)] is the residual variance from the population linear regression of Y_2 on Y_1. In the Appendix we show that E[Var_l(ε_2 | Y_1)] is equal to E[Var(ε_2 | Y_1)] if
and only if E(ε_2 | Y_1) is a linear function of Y_1. Thus β̂_G is semiparametric efficient only when E(ε_2 | Y_1) is linear in Y_1. As noted in Section 4, this has been previously observed by Robins and Rotnitzky [9].

The following argument helps in understanding why β̂_{G,2} may fail to be semiparametric efficient. For the ith subject with observed outcome Y_i1, let Ŷ_i2 be the predicted value of Y_i2 from the linear regression of Y_2 on Y_1 based on subjects with observed outcomes at both occasions. That is, letting δ̂_1 and δ̂_2 be the solution of

    Σ_{i=1}^n R_i2 (1, Y_i1)' (Y_i2 − δ_1 − δ_2 Y_i1) = 0,

we define Ŷ_i2 = δ̂_1 + δ̂_2 Y_i1, i = 1,...,n. In the Appendix we show that β̂_{G,2} has the same asymptotic distribution as the solution β̂_{IMP,2} of

    Σ_{i=1}^n [R_i2 (Y_i2 − β_2) + (1−R_i2)(Ŷ_i2 − β_2)] = 0.

The solution β̂_{IMP,2} coincides with the regression imputation estimator of β_{0,0,2} described by Little and Rubin [6, pp. 45-47]. This estimator is calculated by first imputing the missing Y_2's with their predicted values from the linear regression of Y_2 on Y_1 based on the complete data, and then averaging the observed and imputed values of Y_2. The loss of efficiency of β̂_{IMP,2}, and therefore of β̂_{G,2}, arises because the missing Y_2 are imputed from a model that assumes that E(Y_2 | Y_1) is linear in Y_1. Rotnitzky and Robins [10] showed that, when Y_1 is discrete, β̂_IMP can be made semiparametric efficient by replacing Ŷ_i2 by Ê(Y_2 | Y_i1), the nonparametric maximum likelihood estimator of E(Y_2 | Y_i1).

A comparison of formulas (13), (15), and (18) helps in understanding the efficiency differences among β̂_{EFF,2}, β̂_{G,2}, and β̂_{OLS,2}. Since E[Var(ε_2 | Y_1)] ≤ E[Var_l(ε_2 | Y_1)] ≤ E[Var(ε_2)], β̂_{OLS,2} can never be more efficient than β̂_{G,2}, which, in turn, can never be more efficient than β̂_{EFF,2}. In the Appendix we show that E[Var(ε_2)] = E[Var_l(ε_2 | Y_1)] only when Cov(Y_1, Y_2) = 0, and therefore β̂_{G,2} and β̂_{OLS,2} will have the same asymptotic variance only when Y_1 and Y_2 are uncorrelated. The greater efficiency of β̂_{G,2} relative to β̂_{OLS,2} is therefore explained because β̂_{G,2}, as opposed to β̂_{OLS,2}, exploits the correlation between Y_1 and Y_2 for the estimation of β_02 via the linear regression imputation of the missing Y_2's.
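The regression imputation estimator above can be sketched in a few lines. This is an assumed toy simulation, not the authors' calculation: T = 2, a linear E(Y_2 | Y_1) (so imputation is at its best), MCAR missingness with λ_2 = 0.5, and repeated samples to compare the sampling variance of β̂_IMP with that of the complete-case (OLS) mean.

```python
import numpy as np

rng = np.random.default_rng(4)
reps, n, lam2, mu2 = 1000, 2000, 0.5, 1.0

est_imp, est_cc = [], []
for _ in range(reps):
    Y1 = rng.normal(size=n)
    Y2 = mu2 + 0.8 * Y1 + np.sqrt(1 - 0.64) * rng.normal(size=n)  # linear E(Y2|Y1)
    R2 = rng.random(n) < lam2                                      # MCAR
    # Fit Y2 ~ Y1 on complete cases, impute the missing Y2's, average all n.
    d2, d1 = np.polyfit(Y1[R2], Y2[R2], 1)        # slope, intercept
    Y2_imp = np.where(R2, Y2, d1 + d2 * Y1)
    est_imp.append(Y2_imp.mean())
    est_cc.append(Y2[R2].mean())                  # complete-case (OLS) mean

# Theory with these settings: AVar_imp = 1 + 0.36 = 1.36 vs AVar_cc = 2.0.
var_ratio = np.var(est_imp) / np.var(est_cc)
bias_imp = abs(np.mean(est_imp) - mu2)
```

Because E(Y_2 | Y_1) really is linear here, β̂_IMP is also efficient; with a nonlinear conditional mean it would stay consistent under MCAR but lose its efficiency, which is exactly the GLS deficiency described in the text.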
However, as noted earlier, the linear regression imputation of Y_2 will only lead to efficient estimators of β_02 when E(Y_2 | Y_1) is linear in Y_1, and except for this case, β̂_{G,2} will fail to extract all the information available in Y_1 and Y_2 about β_{0,0,2}.
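As a quick numerical check of the working-covariance comparison in Eq. (16): with the working covariance (17), the induced linear predictor is E_{l,C}(ε_2 | ε_1) = −(1/2)ε_1, so the bracketed term in (16) reduces to ρ_0 + 1/4 when the true correlation is ρ_0 with unit variances. The sketch below (not from the paper) evaluates this and cross-checks one value by simulation.

```python
import numpy as np

def gee_minus_ols(rho0, lam2=0.5):
    """Right-hand side of (16) for working correlation -1/2, true corr rho0."""
    # E[(eps_2 + eps_1/2)^2] - E[eps_2^2] = 1/4 + rho0 with unit variances.
    excess = (1.0 + 0.25 + rho0) - 1.0
    return (1 - lam2) / lam2 * excess

# Positive (GEE worse than OLS) exactly when rho0 > -(0.5)^2 = -0.25.
vals = {r: gee_minus_ols(r) for r in (-0.5, -0.3, -0.25, 0.0, 0.5)}

# Monte Carlo cross-check of the excess term at rho0 = 0.5.
rng = np.random.default_rng(5)
eps = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=400_000)
mc_excess = np.mean((eps[:, 1] + 0.5 * eps[:, 0]) ** 2) - np.mean(eps[:, 1] ** 2)
```

So a working covariance that gets even the sign of the correlation wrong can make the GEE estimator strictly worse than working independence, as the text warns.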
Given two estimators μ̃ and μ̂ of a scalar parameter μ, the asymptotic relative efficiency of μ̃ compared to μ̂ is denoted by ARE(μ̃, μ̂) and is defined by ARE(μ̃, μ̂) = AVar(μ̂)/AVar(μ̃). With β̂_EFF ≡ β̂(d_eff, φ_eff), (13) and (18) imply that

    ARE(β̂_{OLS,2}, β̂_{EFF,2}) = 1 − (1−λ_2) {[Var(ε_2) − E[Var(ε_2 | Y_1)]]/Var(ε_2)}

and

    ARE(β̂_{G,2}, β̂_{EFF,2}) = 1 − (1−λ_2) {[Var(ε_2) − E[Var(ε_2 | Y_1)]]/Var(ε_2) − ρ²}/{1 − (1−λ_2)ρ²},

where ρ = Corr(Y_1, Y_2). The efficiency loss of β̂_{G,2} relative to β̂_{EFF,2} is summarized in the term

    (1−λ_2) {[Var(ε_2) − E[Var(ε_2 | Y_1)]]/Var(ε_2) − ρ²}/{1 − (1−λ_2)ρ²}.

The factor [Var(ε_2) − E[Var(ε_2 | Y_1)]] Var(ε_2)^{-1} − ρ² can be interpreted as a measure of the degree of nonlinearity of E(Y_2 | Y_1). This factor is equal to 0 when E(Y_2 | Y_1) is linear in Y_1, and it can be as large as 1. The upper bound 1 is achieved when Y_1 and Y_2 are uncorrelated but Y_1 is a perfect predictor of Y_2, for example if Y_1 is normally distributed with zero mean and Y_2 = Y_1². The factor (1−λ_2)/[1 − (1−λ_2)ρ²] quantifies the efficiency loss as a function of the fraction of missing Y_2.

Example. To illustrate the relative efficiencies of β̂_{OLS,2} and β̂_{G,2} compared to the semiparametric efficient estimator β̂_{EFF,2}, consider Y_1 = Z_1^{7/3} and Y_2 = Z_2 with

    (Z_1, Z_2)' ~ Normal((0, 0)', σ²(1  η; η  1)).   (19)

Since Y_1 is a one-to-one transformation of Z_1, E(ε_2 | Y_1) = E(ε_2 | Z_1), and since, by normality, E(ε_2 | Z_1) is linear in Z_1, then E(ε_2 | Y_1) = a + b Y_1^{3/7} for some constants a and b. Thus, the conditional mean of Y_2 given Y_1 is a nonlinear function of Y_1. In the Appendix we show that

    Corr(Y_1, Y_2) = θ Corr(Z_1, Z_2),   (20)
where θ = E(Z_1^{10/3}) E(Z_1^{14/3})^{-1/2}. Using the average of 10,000 simulated values of Z_1^{10/3} and Z_1^{14/3}, we calculated θ ≈ 0.88. Furthermore, Var(Y_2 | Y_1) is, by Y_1 being a one-to-one transformation of Z_1, equal to Var(Y_2 | Z_1) = σ²(1 − η²), and in view of (20), Var(Y_2 | Y_1) = σ²[1 − (0.88)^{-2} ρ²]. Setting λ_2 = 0.5, the AREs of β̂_{OLS,2} and β̂_{G,2} compared to β̂_{EFF,2} reduce to

    ARE(β̂_{OLS,2}, β̂_{EFF,2}) = 1 − (0.5)(0.88)^{-2} ρ²

and

    ARE(β̂_{G,2}, β̂_{EFF,2}) = 1 − 0.5[(0.88)^{-2} − 1] ρ²/[1 − 0.5ρ²],

where ρ = Corr(Y_1, Y_2). Figure 1 plots ARE(β̂_{OLS,2}, β̂_{EFF,2}) and ARE(β̂_{G,2}, β̂_{EFF,2}) as functions of ρ for λ_2 = 0.5. The plots indicate that the efficiency of both β̂_{OLS,2} and β̂_{G,2} decreases as a function of ρ. The relatively small efficiency loss of β̂_{G,2} is due to the relatively small fraction of missing data, i.e., 1−λ_2 = 0.5, and the fact that E(Y_2 | Y_1) is well approximated by a linear function of Y_1 for values of Y_1 lying in a region of high probability. We have also calculated ARE(β̂_{G,2}, β̂_{EFF,2}) for λ_2 = 0.2 (results not presented) and obtained that the ARE reached a minimum of ...

Consider now the estimation of β_0T for T ≥ 2. The asymptotic variances of β̂_{EFF,T} and β̂_{l,T}(C) are the lower rightmost elements of Ω(d_eff, φ_eff)^{-1} and Γ(d_l^C)^{-1} Ω(d_l^C, φ_l^C) Γ(d_l^C)'^{-1}, respectively. A straightforward calculation gives

    AVar(β̂_{EFF,T}) = Var(ε_T) + Σ_{t=1}^T [(1−λ_t)/π_t] E[Var(ε_T | Ȳ_t)]   (21)

and

    AVar[β̂_{l,T}(C)] = Var(ε_T) + Σ_{t=1}^T [(1−λ_t)/π_t] E[{ε_T − E_{l,C}(ε_T | Ȳ_t)}²].   (22)

Thus, by Lemma 1, the asymptotic variances of β̂_{G,T} and β̂_{OLS,T} are

    AVar(β̂_{G,T}) = Var(ε_T) + Σ_{t=1}^T [(1−λ_t)/π_t] E[Var_l(ε_T | Ȳ_t)]   (23)

and

    AVar(β̂_{OLS,T}) = Var(ε_T) + Σ_{t=1}^T [(1−λ_t)/π_t] E[Var(ε_T)],   (24)
[Fig. 1. AREs for estimating the mean of Y_2 when Y_1 is always observed and P(Y_2 missing) = 0.5.]

where Var_l(ε_T | Ȳ_t) = Var(ε_T) − Cov(Y_T, Ȳ_t) Var(Ȳ_t)^{-1} Cov(Ȳ_t, Y_T) is the residual variance from the population linear regression of Y_T on Ȳ_t. Thus, differences in the asymptotic variances of β̂_{EFF,T}, β̂_{G,T}, and β̂_{OLS,T} are driven by differences among E[Var(ε_T | Ȳ_t)], E[Var_l(ε_T | Ȳ_t)], and E[Var(ε_T)]. Analogously to the case T = 2, E[Var(ε_T | Ȳ_t)] = E[Var_l(ε_T | Ȳ_t)] if and only if E[ε_T | Ȳ_t] is a linear function of Ȳ_t, t = 1,...,T, which is then the necessary and sufficient condition for β̂_{G,T} to be fully efficient. When Ȳ_T and ε_T are independent, then Var(ε_T | Ȳ_t) = Var(ε_T) and β̂_{OLS,T} is efficient. Analogously to the case T = 2, it can be shown that β̂_{G,T} is asymptotically equivalent to a regression imputation estimator of the Tth mean in which a missing Y_T from a subject with data observed up to time t−1 is imputed with its predicted value from the linear regression of Y_T on Ȳ_t based on subjects with complete data. Thus, the efficiency loss of β̂_{G,T} relative to β̂_{EFF,T} is due to the imputation of the
missing Y_T from, possibly misspecified, linear models for E(Y_T | Ȳ_t). Since E[Var_l(ε_T | Ȳ_t)] = Var(ε_T) holds for all t = 1,...,T if and only if Cov(ε_T, Ȳ_T) = 0, it follows that β̂_{OLS,T} and β̂_{G,T} will have the same asymptotic distribution only when ε_T and Ȳ_T are uncorrelated. Also, β̂_{OLS,T} will be fully efficient only when Var(ε_T | Ȳ_T) = Var(ε_T). Finally, as in the case T = 2, it can be shown from formula (22) and Lemma 1 that the asymptotic variance of β̂_{GEE,T} can be larger than the asymptotic variance of β̂_{OLS,T} for some misspecified working covariance models (5).

5.2. Estimation of Occasion-Specific Mean Differences

Suppose that X_i* is a binary indicator variable and consider the model E(Y_it | X_i) = β_{0,0,t} + β_{0,1,t} X_i*. In a randomized placebo-controlled follow-up trial for comparing treatment A versus placebo, for example, X_i* = 0 if subject i is assigned to the placebo arm and X_i* = 1 if subject i is assigned to the treatment A arm. Thus, β_{0,0,t} = E(Y_t | X* = 0) is the occasion-specific mean in the placebo arm and β_{0,1,t} = E(Y_t | X* = 1) − E(Y_t | X* = 0) is the occasion-specific difference between the treatment A and placebo means.

Consider now the estimation of β_0 = (β_{0,0,1}, β_{0,1,1}, ..., β_{0,0,T}, β_{0,1,T})'. Let β̂_{0,G} be the generalized least squares estimator of the vector of occasion-specific means in the placebo arm, β_{0,0} = (β_{0,0,1}, ..., β_{0,0,T})', computed from placebo-arm data only. Similarly, let β̂_{0,GEE}, β̂_{0,OLS}, and β̂_{0,EFF} be the GEE, OLS, and semiparametric efficient estimators of β_{0,0} computed from placebo-arm data only. Define analogously the estimators β̂_{1,G}, β̂_{1,GEE}, β̂_{1,OLS}, and β̂_{1,EFF} of the vector of occasion-specific means in the treatment A arm, (E(Y_1 | X* = 1), ..., E(Y_T | X* = 1))', computed from treatment A-arm data only. In the Appendix we show that the estimators β̂_G, β̂_GEE, β̂_OLS, and β̂_EFF of β_0 can be expressed respectively in terms of β̂_{j,G}, β̂_{j,GEE}, β̂_{j,OLS}, and β̂_{j,EFF}, j = 0, 1. Specifically, β̂_{G,0,t}, the generalized least squares estimator of the intercept of the tth regression, t = 1,...,T, based on data from both treatment arms, coincides with the generalized least squares estimator of the tth mean in the placebo arm, i.e.,

    β̂_{G,0,t} = β̂_{0,G,t}.
(25)

The generalized least squares estimator β̂_{G,1,t} of the slope in the tth regression, t = 1, ..., n, based on data from both treatment arms is equal to the difference between the arm-specific generalized least squares estimators of the tth occasion means, i.e.,

β̂_{G,1,t} = μ̂_{1,G,t} - μ̂_{0,G,t}.  (26)
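With complete data and an identity working covariance, the decomposition in (25) and (26) reduces to a familiar fact about least squares on a binary regressor: the fitted intercept is the group-0 sample mean and the fitted slope is the difference of the group sample means. A minimal numerical sketch of that special case (the outcome values below are arbitrary made-up numbers, not data from the paper):

```python
import numpy as np

# Made-up outcomes for one occasion t in the two arms.
y_placebo = np.array([1.0, 2.5, 0.5, 3.0])   # subjects with X* = 0
y_treat   = np.array([2.0, 4.5, 3.5])        # subjects with X* = 1

y = np.concatenate([y_placebo, y_treat])
x = np.concatenate([np.zeros(len(y_placebo)), np.ones(len(y_treat))])

# Pooled least squares fit of y on (1, x).
design = np.column_stack([np.ones_like(x), x])
intercept, slope = np.linalg.lstsq(design, y, rcond=None)[0]

# Intercept = placebo-arm mean; slope = difference of arm means,
# mirroring the structure of (25) and (26).
print(intercept, slope)
```

The exact arm-by-arm decomposition for the GLS, GEE, and efficient estimators with missing data is the content of the Appendix proof; the sketch only illustrates the complete-data OLS case.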
Relationships (25) and (26) hold also for the GEE, OLS, and semiparametric efficient estimators of β_0. Equation (25) implies that the ARE's of the GLS and OLS estimators of the occasion-specific intercepts β_{0,0,t} compared to the semiparametric efficient estimator of β_{0,0,t} are equal to the ratios of the asymptotic variances given in (23) and (24) to the asymptotic variance given in (22). It follows from (26) that AVar(β̂_{G,1,t}) = AVar(μ̂_{1,G,t}) + AVar(μ̂_{0,G,t}), and the same relationship holds for the GEE, OLS, and semiparametric efficient estimators. Furthermore, it follows from (21), (22), (23), and (24) that, for j = 0, 1,

AVar(μ̂_{j,EFF,t}) = P(X* = j)^{-1} { Var(ε_t | X* = j) + Σ_{l=1}^{t} [(1 - λ_{lj})/π_{lj}] E[Var(ε_t | Ȳ_l, X* = j)] },

AVar[μ̂_{j,l,t}(C)] = P(X* = j)^{-1} { Var(ε_t | X* = j) + Σ_{l=1}^{t} [(1 - λ_{lj})/π_{lj}] E[(ε_t - E_{l,C}(ε_t | Ȳ_l, X* = j))²] },

AVar(μ̂_{j,G,t}) = P(X* = j)^{-1} { Var(ε_t | X* = j) + Σ_{l=1}^{t} [(1 - λ_{lj})/π_{lj}] E[Var_l(ε_t | Ȳ_l, X* = j)] },  (27)

and

AVar(μ̂_{j,OLS,t}) = P(X* = j)^{-1} { Var(ε_t | X* = j) + Σ_{l=1}^{t} [(1 - λ_{lj})/π_{lj}] Var(ε_t | X* = j) },

where λ_{lj} = P(R_l = 1 | R_{l-1} = 1, X* = j) and Var_l(ε_t | Ȳ_l, X* = j) = Cov(Y_t, Ȳ_l | X* = j) Var(Ȳ_l | X* = j)^{-1} Cov(Ȳ_l, Y_t | X* = j). Thus, when (a) the nonresponse probabilities λ_{lj} do not depend on the treatment arm, i.e., λ_{lj} = λ_l; (b) the covariance of Y is the same for both treatment arms, i.e., Cov(Y | X*) = Cov(Y); and (c) Var(ε_t | Ȳ_l, X*) is not a function of X*, l = 1, ..., t,
t = 1, ..., n, then the ARE's of the GLS and OLS estimators of the occasion-specific slopes compared to the semiparametric efficient estimator remain the same as the ARE's of the respective estimators of the occasion-specific intercepts discussed earlier. Finally, as in Section 5.1, it can be shown that the GEE estimator of β_{0,1,n} can be less efficient than the OLS estimator for some misspecified working covariance models.

Example. To illustrate the dependence of the ARE's on the difference between the correlation matrices in the two treatment groups, we consider a randomized placebo-controlled study with data measured at baseline and at one follow-up point. We assume that data at baseline are always observed, i.e., λ_{1j} = 1, j = 0, 1, and that the probability that Y_2 is missing is the same in both treatment arms. We assume that Y_1 = Z_1^{7/3} and that, given X*,

(Z_1, ε_2)' | X*  ~  Normal( (0, 0)',  [ 1, η(X*); η(X*), σ² ] ).  (28)

Under (28), E(Y_1 | X*) = 0, so in this example we assume that there are no differences in the treatment means at baseline. Thus, within each treatment arm the data follow the model (19) of Example 1. However, since η(X*) is a function of X*, the covariance between Y_1 and Y_2 changes with treatment arm. A straightforward calculation shows that

AVar(β̂_{EFF,1,2}) = σ² { 1 - (1 - λ_2)(0.88)^{-2} (ρ_0² P_1 + ρ_1² P_0) } / (λ_2 P_0 P_1),  (29)

AVar(β̂_{G,1,2}) = σ² { 1 - (1 - λ_2)(ρ_0² P_1 + ρ_1² P_0) } / (λ_2 P_0 P_1),  (30)

and

AVar(β̂_{OLS,1,2}) = σ² / (λ_2 P_0 P_1),  (31)

where ρ_j = Corr(Y_1, Y_2 | X* = j) and P_j = P(X* = j), j = 0, 1. Figure 2 plots the ARE of the OLS and GLS estimators of β_{0,1,2}, the slope in the regression model for the second occasion, compared to the semiparametric efficient estimator of β_{0,1,2}, against ρ_1 for λ_2 = 0.5, ρ_0 = √0.5, and P_0 = P_1 = 0.5. Both ARE's attain their maximum at ρ_1 = 0, but these maxima are not equal to 1. The OLS estimator is substantially less efficient than the semiparametric efficient estimator when ρ_1 is large. The GLS estimator
Fig. 2. ARE's for estimating the mean difference of the outcomes Y_2 in the two groups. Here, Y_1 is always observed, P(Y_2 missing) = 0.5, Corr(Y_1, Y_2) = √0.5 in the first group, and Corr(Y_1, Y_2) varies in the second group.

performs relatively well over the whole range of ρ_1, as indicated by the theory, since E(ε_2 | Y_1) is well approximated by a linear function of Y_1 over the range of high probability values of Y_1.

5.3. Estimation of Occasion-Specific Slopes

We now consider the efficiency of different estimators of β_0 in the model

E(Y_t | X) = β_{0,0,t} + β_{0,1,t} X*,  (32)

for an arbitrary random variable X*. In what follows it will be convenient to define β_0* = (β_{0,0,1}, β_{0,0,2}, ..., β_{0,0,n}, β_{0,1,1}, ..., β_{0,1,n})'. The vector β_0* is obtained by permuting the elements of β_0 so that the first n elements of β_0*
are the time-ordered intercepts and the last n elements of β_0* are the time-ordered slopes. The semiparametric variance bound for estimating β_0* in model (32) is

Ω*_eff = E{ (I, X*I)' K_eff(X)^{-1} (I, X*I) }^{-1},  (33)

where I is the n × n identity matrix and

K_eff(X) = Var(ε | X*) + Σ_{t=1}^{n} [(1 - λ_t)/π_t] E[Var(ε | Ȳ_t, X*) | X*].

If, for t = 1, ..., n,

Var(ε | X*), E[Var(ε | Ȳ_t, X*) | X*], and λ_t do not depend on X*,  (34)

then K_eff(X) is a constant matrix K_eff and

Ω*_eff = [ K_eff^{-1}, μ_1 K_eff^{-1}; μ_1 K_eff^{-1}, μ_2 K_eff^{-1} ]^{-1},

where μ_1 = E(X*) and μ_2 = E(X*²). The semiparametric variance bound for estimating the vector of occasion-specific slopes β_{0,1} = (β_{0,1,1}, ..., β_{0,1,n})' is the n × n lower rightmost block of Ω*_eff, which, when (34) holds, is, by the formula for the inverse of a partitioned matrix, equal to

Ω_{1,eff} = [ μ_2 K_eff^{-1} - μ_1 K_eff^{-1} K_eff μ_1 K_eff^{-1} ]^{-1} = K_eff / Var(X*).

Thus, when (34) holds, the semiparametric variance bound for estimating the slope at the last occasion is given by the lower rightmost element of Ω_{1,eff}, and it is equal to

AVar(β̂_{EFF,1,n}) = [ Var(ε_n) + Σ_{t=1}^{n} [(1 - λ_t)/π_t] E[Var(ε_n | Ȳ_t, X*)] ] / Var(X*).  (35)

Consider now β̂*_G, the generalized least squares estimator of β_0*. Its asymptotic variance is given by

Ω*_l = E{ (I, X*I)' K_l(X)^{-1} (I, X*I) }^{-1},  (36)
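Under (34) the information matrix in (33) has a Kronecker structure, [1, μ_1; μ_1, μ_2] ⊗ K_eff^{-1}, so the slope block of the bound reduces to K_eff/Var(X*) purely by the partitioned-inverse formula. A quick numerical check of that linear-algebra step (Python/NumPy; the matrix K below and the moments μ_1, μ_2 are arbitrary stand-ins, not quantities from the paper):

```python
import numpy as np

# Arbitrary symmetric positive definite K (stand-in for K_eff), and
# moments mu1 = E(X*), mu2 = E(X*^2) of a non-degenerate X*.
K = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.5, 0.3],
              [0.1, 0.3, 1.0]])
mu1, mu2 = 0.4, 0.4          # e.g. X* ~ Bernoulli(0.4), so E(X*^2) = E(X*)
var_x = mu2 - mu1**2

# Information matrix E{(I, X*I)' K^{-1} (I, X*I)} = [1, mu1; mu1, mu2] (x) K^{-1}.
A = np.array([[1.0, mu1], [mu1, mu2]])
info = np.kron(A, np.linalg.inv(K))

bound = np.linalg.inv(info)          # the 2n x 2n variance bound
slope_block = bound[3:, 3:]          # lower rightmost n x n block

# Partitioned-inverse identity: the slope block equals K / Var(X*).
print(np.allclose(slope_block, K / var_x))   # True
```

The check works because inv(A ⊗ B) = inv(A) ⊗ inv(B), and the lower-right entry of inv([1, μ_1; μ_1, μ_2]) is 1/(μ_2 - μ_1²) = 1/Var(X*).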
In (36), K_l(X) = Var(ε | X*) + Σ_{t=1}^{n} [(1 - λ_t)/π_t] E[Var_l(ε | Ȳ_t, X*) | X*], and Var_l(ε | Ȳ_t, X*) = Cov(ε, Ȳ_t | X*) Var(Ȳ_t | X*)^{-1} Cov(Ȳ_t, ε | X*). When λ_t and Var(Y | X*) do not depend on X*, an argument identical to the one used to derive (35) now gives

AVar(β̂_{G,1,n}) = [ Var(ε_n) + Σ_{t=1}^{n} [(1 - λ_t)/π_t] Var_l(ε_n | Ȳ_t) ] / Var(X*).  (37)

Thus, when (34) holds, β̂_{G,1,n} is semiparametric efficient if and only if E[Var_l(ε_n | Ȳ_t, X)] = E[Var(ε_n | Ȳ_t, X)], or equivalently when E(ε_n | Ȳ_t, X) is linear in Ȳ_t, as noted also by Robins and Rotnitzky [9]. When λ_t is not a function of X*, β̂_{OLS,1,n} is computed from a fraction of the outcomes Y_n that, as the sample size grows, converges to π_n. Thus, the asymptotic variance of the OLS estimator of β_{0,1,n} is equal to Var(ε_n)/[π_n Var(X*)]. A straightforward calculation shows that this variance can be rewritten as

AVar(β̂_{OLS,1,n}) = [ Var(ε_n) + Σ_{t=1}^{n} [(1 - λ_t)/π_t] Var(ε_n) ] / Var(X*).  (38)

Comparing Eqs. (37) and (38) to Eqs. (23) and (24), it follows that when (34) holds the asymptotic variances of the estimators β̂_{OLS,1,n} and β̂_{G,1,n} of the occasion-specific slopes are equal to the asymptotic variances of the corresponding estimators of the occasion-specific means divided by the variance of X*. We conclude that when (34) holds, the asymptotic relative efficiencies of β̂_{OLS,1,n} and β̂_{G,1,n} compared to β̂_{EFF,1,n} are less than or equal to those discussed in Section 5.1 for estimation of the mean of Y_n.

Consider now the estimation of β_{0,1,n} in the ``last-mean-linear'' model (6) with the additional restriction (1), where X = (1, X*)'. The semiparametric variance bound for estimating β_{0,n} in this model is given by Ω_last = Var[S(d*_eff, φ*_eff)(β_0)]^{-1}. It is straightforward to show that

Ω_last^{-1} = E{ [ K_{eff,n}(X)^{-1}, X* K_{eff,n}(X)^{-1}; X* K_{eff,n}(X)^{-1}, X*² K_{eff,n}(X)^{-1} ] },

where K_{eff,n}(X) is the lower rightmost element of the n × n matrix K_eff(X). Thus, when (34) holds,

Ω_last^{-1} = K_{eff,n}^{-1} [ 1, μ_1; μ_1, μ_2 ],
and the semiparametric variance bound for estimating β_{0,1,n} is equal to K_{eff,n}/Var(X*), which, by (35), coincides with AVar(β̂_{EFF,1,n}). This result says that when (34) holds, knowledge that the conditional means of Y_t given X* are linear functions of X* for t = 1, ..., n - 1 does not asymptotically add information about the parameter β_{0,1,n}. It is interesting to note that since (a) β̂_{OLS,1,n} is semiparametric efficient when (34) holds and data on Ȳ_{n-1} are not available, and (b) the asymptotic variance of β̂_{OLS,1,n} is larger than the asymptotic variance of β̂_{EFF,1,n} when, given X, Ȳ_{n-1} and ε_n are statistically dependent, then, as opposed to the full-data case, data on Ȳ_{n-1} provide information about β_{0,1,n} when, given X, Ȳ_{n-1} is a predictor of Y_n. When (34) is not true, the lower rightmost elements of Ω*_eff and Ω_last may not be equal. In such cases, knowledge of the linearity of the conditional means of Y_t given X does provide additional information about β_{0,1,n}. Finally, the asymptotic variance of β̂_{GEE,n} is given by (9) with d_l and φ_l defined in Lemma 1(b). The results of Section 5.1 suggest that β̂_{GEE,n} can be even less efficient than β̂_{OLS,n} for some misspecified working covariance models (5). A detailed study of which estimated covariances Ĉ(X) lead to β̂_{GEE,n} being less efficient than β̂_{OLS,n} is beyond the scope of this paper.

6. FINAL REMARKS

In this paper we have examined the relative efficiencies of various estimators of the parameters β_t indexing the occasion-specific linear models for the conditional means of Y_t given X, t = 1, ..., n, when the outcomes Y_t are MCAR and the missing data patterns are monotone. We have shown that, as opposed to the case in which the full-data vector Y is observed for all subjects, the GLS and OLS estimators can be less efficient than the semiparametric efficient estimator of β_t. We have noted that the efficiency loss of the GLS estimator of β_t is related to the degree of nonlinearity of the conditional means E(Y_t | Ȳ_{t-1}, X) as functions of Ȳ_{t-1}.
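The source of the GLS efficiency loss noted above, nonlinearity of the conditional mean of the residual, can be made concrete in a two-occasion toy example where every quantity is computable exactly. Below, Y_1 is a three-point variable and ε_2 = g(Y_1) + u with g nonlinear but uncorrelated with Y_1, so the linear projection on Y_1 recovers nothing while conditioning on Y_1 does. The distribution and numbers are hypothetical, chosen only for illustration:

```python
import numpy as np

# Discrete Y1 with P(Y1 = y) = 1/3 for y in {-1, 0, 1}.
y1 = np.array([-1.0, 0.0, 1.0])
p = np.full(3, 1.0 / 3.0)

# eps2 = g(Y1) + u, with u independent of Y1, E(u) = 0, Var(u) = tau2.
tau2 = 0.25
g = y1**2 - np.sum(p * y1**2)        # centered quadratic, so E(eps2) = 0

# Conditional variance: Var(eps2 | Y1) = tau2 for every value of Y1.
cond_var = tau2

# Linear-projection residual variance:
#   Var_l(eps2 | Y1) = Var(eps2) - Cov(Y1, eps2)^2 / Var(Y1).
var_eps2 = np.sum(p * g**2) + tau2
cov = np.sum(p * y1 * g)             # = 0 here: g is even, Y1 symmetric
var_y1 = np.sum(p * y1**2)
var_l = var_eps2 - cov**2 / var_y1

# Nonlinear E(eps2 | Y1) makes Var_l strictly exceed E[Var(eps2 | Y1)];
# this gap is exactly what drives the GLS efficiency loss.
print(var_l, cond_var)               # approximately 0.4722 and 0.25
```

Here E[Var(ε_2 | Y_1)] = 0.25 while Var_l(ε_2 | Y_1) = 2/9 + 0.25, so a GLS-type estimator, which only exploits the linear projection, gains nothing from Y_1 even though Y_1 is highly informative about ε_2.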
We also observed that, as opposed to the full-data case, the OLS estimator of β_t is inefficient, since it only uses X and the outcomes Y_t recorded at the tth occasion, and with monotone missing data the outcomes recorded prior to time t carry information about β_t. Finally, the results of Lemma 1 are valid also when model (3) is replaced by E(Y_t | X) = g_t(X, β_0), where g_t(X, β_0) is a, possibly nonlinear, function of X and β_0. When g_t(X, β_0) depends on β_0 only through the occasion-specific parameters β_{0t}, but g_t(X, β_0) is not a linear function of β_{0t}, then β̂_OLS and β̂_G are no longer equal, even when no Y_t's are missing. Thus, with full data and nonlinear conditional mean models, data on Y_j, j ≠ t, provide information about the occasion-specific parameters indexing the conditional mean of Y_t given X.
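A small arithmetical point behind variance formulas such as (24) and (38): with monotone missingness and π_t = λ_1 ⋯ λ_t, the weights satisfy 1 + Σ_{t=1}^{n} (1 - λ_t)/π_t = 1/π_n, because each term telescopes as 1/π_t - 1/π_{t-1}. The occasion-by-occasion OLS variance expression is therefore just Var(ε_n)/π_n rewritten. A numerical check with arbitrary continuation probabilities (the λ values below are made up):

```python
import numpy as np

# Arbitrary occasion-specific continuation probabilities lambda_t,
# with lambda_1 = 1 (baseline always observed, as in the example).
lam = np.array([1.0, 0.8, 0.7, 0.9])
pi = np.cumprod(lam)                 # pi_t = P(R_t = 1)

lhs = 1.0 + np.sum((1.0 - lam) / pi)
rhs = 1.0 / pi[-1]

# Telescoping identity: (1 - lam_t)/pi_t = 1/pi_t - 1/pi_{t-1}.
print(np.isclose(lhs, rhs))          # True
```

So the decomposition in (38) does not change the total, 1/π_n; it only attributes the variance inflation to the successive dropout occasions, which is what makes it directly comparable to (35) and (37).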
APPENDIX

Proof of Lemma 1. Part (a) is exactly Lemma 1 of Robins and Rotnitzky [9]. To prove part (b) we will show that the estimator β̃_GEE defined below satisfies β̃_GEE = β̂_l(C*), and then argue that since β̂_GEE is asymptotically equivalent to β̃_GEE, then β̂_GEE and β̂_l(C*) must have the same asymptotic distribution. The estimator β̃_GEE solves

Σ_{i=1}^{N} (I ⊗ X_i) C*(X_i)^{-1} Δ_i ε_i(β) = 0,  (39)

where Δ_i = diag(R_{ij}) is the n × n diagonal matrix with diagonal elements R_{ij}, j = 1, ..., n. Robins and Rotnitzky [9] showed that when Cov(Y | X) = C*(X),

(I ⊗ X_i) C*(X_i)^{-1} Δ_i ε_i = U_i(d_l^C, φ_l^C; β_0),  (40)

where d_l^C and φ_l^C are defined in Section 4. By definition, U_i(d_l^C, φ_l^C; β_0) is a linear function of ε_i. Thus, U_i(d_l^C, φ_l^C; β_0) = a(X_i, R_i) ε_i for some a(X_i, R_i). Let b(X_i, R_i) ≡ (I ⊗ X_i) C*(X_i)^{-1} Δ_i and h(X_i, R_i) ≡ a(X_i, R_i) - b(X_i, R_i). By (40), h(X_i, R_i) ε_i = 0 when Cov(Y | X) = C*(X). Thus, by the MCAR assumption (1), Cov[h(X_i, R_i) ε_i | X_i, R_i] = h(X_i, R_i) C*(X_i) h(X_i, R_i)' = 0, which, by C*(X) being a positive definite matrix, implies that h(X_i, R_i) = 0 almost everywhere. Hence, a(X_i, R_i) = b(X_i, R_i) a.e. and Eq. (40) is true even when Cov(Y | X) ≠ C*(X), which ends the proof of part (b). Part (c) follows immediately from part (b) by noting that β̂_OLS solves (39) with C*(X) = I.

Proof that E[Var_l(Y_2 | Y_1)] = E[Var(Y_2 | Y_1)] is equivalent to E(Y_2 | Y_1) being linear in Y_1. Suppose first that E(Y_2 | Y_1) is linear in Y_1; then E(Y_2 | Y_1) = E(Y_2) + Cov(Y_1, Y_2) Var(Y_1)^{-1} ε_1 and Var[E(Y_2 | Y_1)] = Cov(Y_1, Y_2)² Var(Y_1)^{-1}. Thus E[Var(Y_2 | Y_1)] = Var(Y_2) - Var[E(Y_2 | Y_1)] implies E[Var(Y_2 | Y_1)] = Var(Y_2) - Cov(Y_1, Y_2)² Var(Y_1)^{-1}, which proves that E[Var(Y_2 | Y_1)] = E[Var_l(Y_2 | Y_1)]. Suppose now that E[Var(Y_2 | Y_1)] = Var(Y_2) - Cov(Y_1, Y_2)² Var(Y_1)^{-1}; then Var[E(Y_2 | Y_1)] = Cov(Y_1, Y_2)² Var(Y_1)^{-1}. Thus, Var[E(Y_2 | Y_1)] = Var[Cov(Y_1, Y_2) Var(Y_1)^{-1} ε_1]. Now, Var[E(Y_2 | Y_1) - Cov(Y_1, Y_2) Var(Y_1)^{-1} ε_1] = Var[E(Y_2 | Y_1)] + Var[Cov(Y_1, Y_2) Var(Y_1)^{-1} ε_1] - 2 Cov[E(Y_2 | Y_1), Cov(Y_1, Y_2) Var(Y_1)^{-1} ε_1].
But Cov[E(Y_2 | Y_1), Cov(Y_1, Y_2) Var(Y_1)^{-1} ε_1] = E[Y_2 ε_1] Cov(Y_1, Y_2) Var(Y_1)^{-1} = Cov(Y_1, Y_2)² Var(Y_1)^{-1}. Thus Var[E(Y_2 | Y_1) - Cov(Y_1, Y_2) Var(Y_1)^{-1} ε_1] = 0, which proves the assertion.

Proof that Var_l(Y_2 | Y_1) = Var(Y_2) is equivalent to Cov(Y_1, Y_2) = 0. By definition, Var_l(Y_2 | Y_1) = Var(Y_2) - Cov(Y_1, Y_2)² Var(Y_1)^{-1}; thus
Var_l(Y_2 | Y_1) = Var(Y_2) if and only if Cov(Y_1, Y_2)² Var(Y_1)^{-1} = 0, which is equivalent to Cov(Y_1, Y_2) = 0.

Proof that μ̂_{G,2} and μ̂_{IMP,2} are asymptotically equivalent. Since β̂_G = β̂_l(Σ), then by the definition of β̂_{l,2}(Σ),

√N (μ̂_{G,2} - μ_0) = N^{-1/2} Σ_{i=1}^{N} { R_{i2} ε_{i2}/λ_2 - [(R_{i2} - λ_2)/λ_2] Cov(Y_1, Y_2) Var(Y_1)^{-1} ε_{i1} }.  (41)

Also, by the definition of μ̂_{IMP,2},

√N (μ̂_{IMP,2} - μ_0) = N^{-1/2} Σ_{i=1}^{N} { R_{i2} ε_{i2} + (1 - R_{i2}) [ Ȳ_{2,obs} + (Côv(Y_1, Y_2)/Vâr(Y_1)) ε̂_{i1} - μ_0 ] },

where Côv(Y_1, Y_2) and Vâr(Y_1) are the sample covariance of Y_1 and Y_2 and the sample variance of Y_1 among subjects with R_2 = 1, ε̂_{i1} = Y_{i1} - Ȳ_{1,obs}, and Ȳ_{j,obs}, j = 1, 2, is the sample average of Y_j from subjects with R_2 = 1. Now,

Σ_{i=1}^{N} [ R_{i2} ε_{i2} + (1 - R_{i2})(Ȳ_{2,obs} - μ_0) ] = (N / Σ_{i=1}^{N} R_{i2}) Σ_{i=1}^{N} R_{i2} ε_{i2}

and

Σ_{i=1}^{N} (1 - R_{i2}) (Côv(Y_1, Y_2)/Vâr(Y_1)) ε̂_{i1} = (Côv(Y_1, Y_2)/Vâr(Y_1)) { Σ_{i=1}^{N} ε_{i1} - (N / Σ_{i=1}^{N} R_{i2}) Σ_{i=1}^{N} R_{i2} ε_{i1} }.

Thus, replacing the sample moments by their probability limits,

√N (μ̂_{IMP,2} - μ_0) = N^{-1/2} Σ_{i=1}^{N} { R_{i2} ε_{i2}/λ_2 + [Cov(Y_1, Y_2)/Var(Y_1)] [ ε_{i1} - R_{i2} ε_{i1}/λ_2 ] } + o_p(1)
 = N^{-1/2} Σ_{i=1}^{N} { R_{i2} ε_{i2}/λ_2 - [Cov(Y_1, Y_2)/Var(Y_1)] [(R_{i2} - λ_2)/λ_2] ε_{i1} } + o_p(1),  (42)
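The appendix argument above establishes the asymptotic equivalence of μ̂_{G,2} and the impute-then-average estimator. A useful stepping stone is purely algebraic and holds exactly in any finite sample: filling each missing Y_2 with Ȳ_{2,obs} + b̂(Y_1 - Ȳ_{1,obs}), where b̂ is the complete-case regression slope, and then averaging gives exactly the classical regression estimator Ȳ_{2,obs} + b̂(Ȳ_1 - Ȳ_{1,obs}). A quick check on simulated data (the generating model, seed, and variable names are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
y1 = rng.normal(size=n)
y2 = 0.5 * y1 + rng.normal(size=n)
r2 = rng.random(n) < 0.6             # MCAR indicator for observing Y2

obs = r2
b_hat = np.cov(y1[obs], y2[obs])[0, 1] / np.var(y1[obs], ddof=1)
y1_obs_bar, y2_obs_bar = y1[obs].mean(), y2[obs].mean()

# Impute each missing Y2 from the complete-case regression, then average.
y2_filled = np.where(r2, y2, y2_obs_bar + b_hat * (y1 - y1_obs_bar))
mu_imp = y2_filled.mean()

# Classical regression estimator of E(Y2).
mu_reg = y2_obs_bar + b_hat * (y1.mean() - y1_obs_bar)

print(np.isclose(mu_imp, mu_reg))    # True
```

The identity follows by splitting the average of the filled-in vector over observed and missing subjects; the asymptotic expansion (42) then applies to this common value.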
The second equality in (42) follows by Slutsky's theorem. Thus, by (41) and (42) and the central limit theorem, μ̂_{IMP,2} and μ̂_{G,2} are asymptotically equivalent.

Proof of Eqs. (25) and (26). Let

M_C(ε_i) = R_{in} ε_i / π_n - Σ_{t=1}^{n} [ (R_{it} - λ_t R_{i(t-1)}) / π_t ] Cov_C(ε_i, ε̄_{it}) Var_C(ε̄_{it})^{-1} ε̄_{it},

where Cov_C and Var_C are calculated under the assumption that Cov(Y | X*) = C(X*). The generalized least squares estimator β̂_G is asymptotically equivalent to the estimator β̂_l(C) that solves

Σ_{i=1}^{N} (I ⊗ X_i) K^{-1}(X_i) M_C[ε_i(β)] = 0,  (43)

where X_i = (1, X*_i)', K(X_i) = Var[M_C(ε_i) | X*_i], and C(X_i) = Var(Y_i | X*_i). When X* is a binary variable, (43) is equivalent to

Σ_{i: X*_i = 0} (I, 0)' K_0^{-1} M_C[ε_i^{(0)}(β)] + Σ_{i: X*_i = 1} (I, I)' K_1^{-1} M_C[ε_i^{(1)}(β)] = 0,  (44)

where K_j^{-1} = K^{-1}(X* = j), ε_i^{(0)}(β) is the n × 1 vector with jth element Y_{ij} - β_{0j}, and ε_i^{(1)}(β) is the n × 1 vector with jth element Y_{ij} - β_{0j} - β_{1j}. The system (44) consists of 2n equations. Rearranging these equations so that the equations occupying odd numbered places in (44) come first, we have

Σ_{i: X*_i = 0} K_0^{-1} M_C[ε_i^{(0)}(β)] + Σ_{i: X*_i = 1} K_1^{-1} M_C[ε_i^{(1)}(β)] = 0,  (45)

Σ_{i: X*_i = 1} K_1^{-1} M_C[ε_i^{(1)}(β)] = 0.  (46)

Thus, β̂_{0j}, j = 1, ..., n, solves

Σ_{i: X*_i = 0} M_C[ε_i^{(0)}(β)] = 0,  (47)

and it is therefore equal to the generalized least squares estimator of β_{0,0} based on subjects with X* = 0. Similarly, β̂_{0j} + β̂_{1j}, j = 1, ..., n, solves

Σ_{i: X*_i = 1} M_C[ε_i^{(1)}(β)] = 0,  (48)
which is the generalized least squares estimator of the mean vector among subjects with X* = 1. Thus it follows that β̂_{1j} = (β̂_{0j} + β̂_{1j}) - β̂_{0j} is the difference between the generalized least squares estimator of the mean vector among subjects with X* = 1 and the GLS estimator of the mean vector among subjects with X* = 0. That relationships (47) and (48) hold also for the GEE, OLS, and semiparametric efficient estimators follows by an analogous argument, considering the appropriate functions M_C[ε_i(β)] in each case.

Proof of Eq. (20). E(Y_1) = 0 since (1) Y_1 = Z_1^{7/3}, (2) the function h(z) = z^{7/3} is odd, and (3) Z_1 has a symmetric distribution with zero mean. Thus, Var(Y_1) = E(Y_1²) = E(Z_1^{14/3}). Also, Cov(Y_1, Y_2) = E(Y_1 ε_2) and E(Y_1 ε_2) = E[Y_1 E(ε_2 | Y_1)]. But E(ε_2 | Y_1) = E(ε_2 | Z_1) because h(z) = z^{7/3} is a one-to-one function. Thus, E[Y_1 E(ε_2 | Y_1)] = E(Y_1 ρ Z_1) = ρ E(Z_1^{10/3}). Finally, Corr(Y_1, Y_2) ≡ Cov(Y_1, Y_2) [Var(Y_1) Var(Y_2)]^{-1/2} = ρ E(Z_1^{10/3}) / √E(Z_1^{14/3}) because Var(Y_2) = 1.

ACKNOWLEDGMENT

This work was conducted as part of Christina Holcroft's doctoral dissertation.

REFERENCES

1. Begun, J. M., Hall, W. J., Huang, W. M., and Wellner, J. A. (1983). Information and asymptotic efficiency in parametric-nonparametric models. Ann. Statist.
2. Carroll, R. J., and Ruppert, D. (1982). Robust estimation in heteroscedastic linear models. Ann. Statist.
3. Chamberlain, G. (1987). Asymptotic efficiency in estimation with conditional moment restrictions. J. Econometrics.
4. Johnson, R. A., and Wichern, D. W. (1988). Applied Multivariate Statistical Analysis, 2nd ed. Prentice-Hall, Englewood Cliffs, NJ.
5. Liang, K.-Y., and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika.
6. Little, R. J. A., and Rubin, D. B. (1987). Statistical Analysis with Missing Data. Wiley, New York.
7. Rao, C. R. (1973). Linear Statistical Inference and Its Applications, 2nd ed. Wiley, New York.
8. Robins, J. M., Mark, S. D., and Newey, W. K. (1992). Estimating exposure effects by modelling the expectation of exposure conditional on confounders. Biometrics.
9. Robins, J. M., and Rotnitzky, A. (1995). Semiparametric efficiency in multivariate regression models with missing data. J. Amer. Statist. Assoc.
10. Rotnitzky, A., and Robins, J. M. (1995). Semiparametric regression estimation in the presence of dependent censoring. Biometrika.
11. Rubin, D. B. (1976). Inference and missing data. Biometrika.
12. Seber, G. A. F. (1984). Multivariate Observations. Wiley, New York.
More informationDr. Shalabh. Indian Institute of Technology Kanpur
Aalyss of Varace ad Desg of Expermets-I MODULE -I LECTURE - SOME RESULTS ON LINEAR ALGEBRA, MATRIX THEORY AND DISTRIBUTIONS Dr. Shalabh Departmet t of Mathematcs t ad Statstcs t t Ida Isttute of Techology
More informationLecture 9: Tolerant Testing
Lecture 9: Tolerat Testg Dael Kae Scrbe: Sakeerth Rao Aprl 4, 07 Abstract I ths lecture we prove a quas lear lower boud o the umber of samples eeded to do tolerat testg for L dstace. Tolerat Testg We have
More informationChapter -2 Simple Random Sampling
Chapter - Smple Radom Samplg Smple radom samplg (SRS) s a method of selecto of a sample comprsg of umber of samplg uts out of the populato havg umber of samplg uts such that every samplg ut has a equal
More informationMultiple Regression. More than 2 variables! Grade on Final. Multiple Regression 11/21/2012. Exam 2 Grades. Exam 2 Re-grades
STAT 101 Dr. Kar Lock Morga 11/20/12 Exam 2 Grades Multple Regresso SECTIONS 9.2, 10.1, 10.2 Multple explaatory varables (10.1) Parttog varablty R 2, ANOVA (9.2) Codtos resdual plot (10.2) Trasformatos
More informationTHE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5
THE ROYAL STATISTICAL SOCIETY 06 EAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 The Socety s provdg these solutos to assst cadtes preparg for the examatos 07. The solutos are teded as learg ads ad should
More informationChapter -2 Simple Random Sampling
Chapter - Smple Radom Samplg Smple radom samplg (SRS) s a method of selecto of a sample comprsg of umber of samplg uts out of the populato havg umber of samplg uts such that every samplg ut has a equal
More informationbest estimate (mean) for X uncertainty or error in the measurement (systematic, random or statistical) best
Error Aalyss Preamble Wheever a measuremet s made, the result followg from that measuremet s always subject to ucertaty The ucertaty ca be reduced by makg several measuremets of the same quatty or by mprovg
More informationLecture Notes 2. The ability to manipulate matrices is critical in economics.
Lecture Notes. Revew of Matrces he ablt to mapulate matrces s crtcal ecoomcs.. Matr a rectagular arra of umbers, parameters, or varables placed rows ad colums. Matrces are assocated wth lear equatos. lemets
More informationMultiple Choice Test. Chapter Adequacy of Models for Regression
Multple Choce Test Chapter 06.0 Adequac of Models for Regresso. For a lear regresso model to be cosdered adequate, the percetage of scaled resduals that eed to be the rage [-,] s greater tha or equal to
More informationObjectives of Multiple Regression
Obectves of Multple Regresso Establsh the lear equato that best predcts values of a depedet varable Y usg more tha oe eplaator varable from a large set of potetal predctors {,,... k }. Fd that subset of
More informationMultivariate Transformation of Variables and Maximum Likelihood Estimation
Marquette Uversty Multvarate Trasformato of Varables ad Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Assocate Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 03 by Marquette Uversty
More informationProbability and. Lecture 13: and Correlation
933 Probablty ad Statstcs for Software ad Kowledge Egeers Lecture 3: Smple Lear Regresso ad Correlato Mocha Soptkamo, Ph.D. Outle The Smple Lear Regresso Model (.) Fttg the Regresso Le (.) The Aalyss of
More informationRegresso What s a Model? 1. Ofte Descrbe Relatoshp betwee Varables 2. Types - Determstc Models (o radomess) - Probablstc Models (wth radomess) EPI 809/Sprg 2008 9 Determstc Models 1. Hypothesze
More informationSimple Linear Regression
Correlato ad Smple Lear Regresso Berl Che Departmet of Computer Scece & Iformato Egeerg Natoal Tawa Normal Uversty Referece:. W. Navd. Statstcs for Egeerg ad Scetsts. Chapter 7 (7.-7.3) & Teachg Materal
More informationComplete Convergence and Some Maximal Inequalities for Weighted Sums of Random Variables
Joural of Sceces, Islamc Republc of Ira 8(4): -6 (007) Uversty of Tehra, ISSN 06-04 http://sceces.ut.ac.r Complete Covergece ad Some Maxmal Iequaltes for Weghted Sums of Radom Varables M. Am,,* H.R. Nl
More informationA New Family of Transformations for Lifetime Data
Proceedgs of the World Cogress o Egeerg 4 Vol I, WCE 4, July - 4, 4, Lodo, U.K. A New Famly of Trasformatos for Lfetme Data Lakhaa Watthaacheewakul Abstract A famly of trasformatos s the oe of several
More informationMATH 247/Winter Notes on the adjoint and on normal operators.
MATH 47/Wter 00 Notes o the adjot ad o ormal operators I these otes, V s a fte dmesoal er product space over, wth gve er * product uv, T, S, T, are lear operators o V U, W are subspaces of V Whe we say
More informationDepartment of Agricultural Economics. PhD Qualifier Examination. August 2011
Departmet of Agrcultural Ecoomcs PhD Qualfer Examato August 0 Istructos: The exam cossts of sx questos You must aswer all questos If you eed a assumpto to complete a questo, state the assumpto clearly
More informationFaculty Research Interest Seminar Department of Biostatistics, GSPH University of Pittsburgh. Gong Tang Feb. 18, 2005
Faculty Research Iterest Semar Departmet of Bostatstcs, GSPH Uversty of Pttsburgh Gog ag Feb. 8, 25 Itroducto Joed the departmet 2. each two courses: Elemets of Stochastc Processes (Bostat 24). Aalyss
More information22 Nonparametric Methods.
22 oparametrc Methods. I parametrc models oe assumes apror that the dstrbutos have a specfc form wth oe or more ukow parameters ad oe tres to fd the best or atleast reasoably effcet procedures that aswer
More informationEcon 388 R. Butler 2016 rev Lecture 5 Multivariate 2 I. Partitioned Regression and Partial Regression Table 1: Projections everywhere
Eco 388 R. Butler 06 rev Lecture 5 Multvarate I. Parttoed Regresso ad Partal Regresso Table : Projectos everywhere P = ( ) ad M = I ( ) ad s a vector of oes assocated wth the costat term Sample Model Regresso
More informationExtreme Value Theory: An Introduction
(correcto d Extreme Value Theory: A Itroducto by Laures de Haa ad Aa Ferrera Wth ths webpage the authors ted to form the readers of errors or mstakes foud the book after publcato. We also gve extesos for
More informationRademacher Complexity. Examples
Algorthmc Foudatos of Learg Lecture 3 Rademacher Complexty. Examples Lecturer: Patrck Rebesch Verso: October 16th 018 3.1 Itroducto I the last lecture we troduced the oto of Rademacher complexty ad showed
More informationMedian as a Weighted Arithmetic Mean of All Sample Observations
Meda as a Weghted Arthmetc Mea of All Sample Observatos SK Mshra Dept. of Ecoomcs NEHU, Shllog (Ida). Itroducto: Iumerably may textbooks Statstcs explctly meto that oe of the weakesses (or propertes) of
More informationBounds on the expected entropy and KL-divergence of sampled multinomial distributions. Brandon C. Roy
Bouds o the expected etropy ad KL-dvergece of sampled multomal dstrbutos Brado C. Roy bcroy@meda.mt.edu Orgal: May 18, 2011 Revsed: Jue 6, 2011 Abstract Iformato theoretc quattes calculated from a sampled
More informationTHE EFFICIENCY OF EMPIRICAL LIKELIHOOD WITH NUISANCE PARAMETERS
Joural of Mathematcs ad Statstcs (: 5-9, 4 ISSN: 549-3644 4 Scece Publcatos do:.3844/jmssp.4.5.9 Publshed Ole ( 4 (http://www.thescpub.com/jmss.toc THE EFFICIENCY OF EMPIRICAL LIKELIHOOD WITH NUISANCE
More informationChapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements
Aoucemets No-Parametrc Desty Estmato Techques HW assged Most of ths lecture was o the blacboard. These sldes cover the same materal as preseted DHS Bometrcs CSE 90-a Lecture 7 CSE90a Fall 06 CSE90a Fall
More informationChapter 9 Jordan Block Matrices
Chapter 9 Jorda Block atrces I ths chapter we wll solve the followg problem. Gve a lear operator T fd a bass R of F such that the matrx R (T) s as smple as possble. f course smple s a matter of taste.
More informationTHE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE
THE ROYAL STATISTICAL SOCIETY 00 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for the
More informationSTA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1
STA 08 Appled Lear Models: Regresso Aalyss Sprg 0 Soluto for Homework #. Let Y the dollar cost per year, X the umber of vsts per year. The the mathematcal relato betwee X ad Y s: Y 300 + X. Ths s a fuctoal
More informationChapter 3 Sampling For Proportions and Percentages
Chapter 3 Samplg For Proportos ad Percetages I may stuatos, the characterstc uder study o whch the observatos are collected are qualtatve ature For example, the resposes of customers may marketg surveys
More informationLECTURE - 4 SIMPLE RANDOM SAMPLING DR. SHALABH DEPARTMENT OF MATHEMATICS AND STATISTICS INDIAN INSTITUTE OF TECHNOLOGY KANPUR
amplg Theory MODULE II LECTURE - 4 IMPLE RADOM AMPLIG DR. HALABH DEPARTMET OF MATHEMATIC AD TATITIC IDIA ITITUTE OF TECHOLOGY KAPUR Estmato of populato mea ad populato varace Oe of the ma objectves after
More informationAnalysis of Variance with Weibull Data
Aalyss of Varace wth Webull Data Lahaa Watthaacheewaul Abstract I statstcal data aalyss by aalyss of varace, the usual basc assumptos are that the model s addtve ad the errors are radomly, depedetly, ad
More information12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model
1. Estmatg Model parameters Assumptos: ox ad y are related accordg to the smple lear regresso model (The lear regresso model s the model that says that x ad y are related a lear fasho, but the observed
More informationSTRONG CONSISTENCY FOR SIMPLE LINEAR EV MODEL WITH v/ -MIXING
Joural of tatstcs: Advaces Theory ad Alcatos Volume 5, Number, 6, Pages 3- Avalable at htt://scetfcadvaces.co. DOI: htt://d.do.org/.864/jsata_7678 TRONG CONITENCY FOR IMPLE LINEAR EV MODEL WITH v/ -MIXING
More informationSTATISTICAL INFERENCE
(STATISTICS) STATISTICAL INFERENCE COMPLEMENTARY COURSE B.Sc. MATHEMATICS III SEMESTER ( Admsso) UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION CALICUT UNIVERSITY P.O., MALAPPURAM, KERALA, INDIA -
More informationThird handout: On the Gini Index
Thrd hadout: O the dex Corrado, a tala statstca, proposed (, 9, 96) to measure absolute equalt va the mea dfferece whch s defed as ( / ) where refers to the total umber of dvduals socet. Assume that. The
More information