A note on multiple imputation for method of moments estimation

Statistics Publications, Statistics

Shu Yang, Harvard University
Jae Kwang Kim, Iowa State University, jkim@iastate.edu

Part of the Design of Experiments and Sample Surveys Commons, Statistical Methodology Commons, and the Statistical Models Commons.

The complete bibliographic information for this item can be found at stat_las_pubs/120. For information on how to cite this item, please visit howtocite.html.

This Article is brought to you for free and open access by the Statistics at Iowa State University Digital Repository. It has been accepted for inclusion in Statistics Publications by an authorized administrator of Iowa State University Digital Repository. For more information, please contact digirep@iastate.edu.

Biometrika (20xx), xx, x, pp. 1-8
© 2012 Biometrika Trust
Printed in Great Britain

arXiv: v1 [stat.ME] 27 Aug 2015

BY S. YANG
Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, U.S.A.
shuyang@hsph.harvard.edu

J. K. KIM
Department of Statistics, Iowa State University, Ames, Iowa 50010, U.S.A.
jkim@iastate.edu

SUMMARY

Multiple imputation is a popular imputation method for general purpose estimation. Rubin (1987) provided an easily applicable formula for the variance estimation of multiple imputation. However, the validity of the multiple imputation inference requires the congeniality condition of Meng (1994), which is not necessarily satisfied for method of moments estimation. This paper presents the asymptotic bias of Rubin's variance estimator when the method of moments estimator is used as a complete-sample estimator in the multiple imputation procedure. A new variance estimator based on over-imputation is proposed to provide asymptotically valid inference for method of moments estimation.

Some key words: Bayesian method; Congeniality; Missing at random; Proper imputation; Survey sampling.

1. INTRODUCTION

Imputation is often used to handle missing data. For inference, if imputed values are treated as if they were observed, variance estimates will generally be underestimates (Ford, 1983). To account for the uncertainty due to imputation, Rubin (1987, 1996) proposed multiple imputation, which creates multiply completed datasets to allow assessment of imputation variability. Multiple imputation is motivated in a Bayesian framework; however, its frequentist validity is controversial. Rubin (1987) claimed that multiple imputation can provide valid frequentist inference in various applications (for example, Clogg et al., 1991). On the other hand, as discussed by Fay (1992), Kott (1995), Fay (1996), Binder & Sun (1996), Wang & Robins (1998), Robins & Wang (2000), Nielsen (2003), and Kim et al. (2006), the multiple imputation variance estimator is not always consistent.
For multiple imputation inference to be valid, imputations must be proper (Rubin, 1987). A sufficient condition is given by Meng (1994), the so-called congeniality condition, imposed on both the imputation model and the form of subsequent complete-sample analyses, which is quite restrictive for general purpose estimation. Rubin's variance estimator is otherwise inconsistent. Kim (2011) pointed out that multiple imputation that is congenial for mean estimation is not necessarily congenial for proportion estimation. Therefore, some common statistical procedures, such as the method of moments estimators, can be incompatible with the multiple imputation framework. In this paper, we characterize the asymptotic bias of Rubin's variance estimator when the method of moments estimator is used in the complete-sample analysis. We also discuss an alternative variance estimator that can provide asymptotically valid inference for method of moments estimation. The new variance estimator is compared with Rubin's variance estimator through two limited simulation studies in Section 5.

2. BASIC SETUP

Suppose that the sample consists of $n$ observations $(x_1, y_1), \ldots, (x_n, y_n)$, which are independent realizations of a random vector $(X, Y)$. For simplicity of presentation, assume that $Y$ is a scalar outcome variable and $X$ is a $p$-dimensional covariate. Suppose that $x_i$ is fully observed and $y_i$ is not fully observed for all units in the sample. Without loss of generality, assume the first $r$ units of $y_i$ are observed and the remaining $n - r$ units of $y_i$ are missing. Let $\delta_i$ be the response indicator of $y_i$, that is, $\delta_i = 1$ if $y_i$ is observed and $\delta_i = 0$ otherwise. Denote $y_{obs} = (y_1, \ldots, y_r)^T$ and $X_n = (x_1, \ldots, x_n)$. We further assume that the missing mechanism is missing at random in the sense of Rubin (1976).

The parameter of interest is $\eta = E\{g(Y)\}$, where $g(\cdot)$ is a known function. For example, if $g(y) = y$, then $\eta = E(Y)$ is the population mean of $y$, and if $g(y) = I(y < 1)$, then $\eta = \mathrm{pr}(y < 1)$ is the population proportion of $y$ less than 1.

Assume that the conditional density $f(y \mid x)$ belongs to a parametric class of models indexed by $\theta$ such that $f(y \mid x) = f(y \mid x; \theta)$ for some $\theta \in \Omega$, and the marginal distribution of $x$ is completely unspecified. To generate imputed values for missing outcomes from $f(y \mid x; \theta)$, we need to estimate the unknown parameter $\theta$, either by likelihood-based methods or by Bayesian methods. The multiple imputation procedure employs a Bayesian approach to deal with the unknown parameter $\theta$, which unfolds in three steps:

Step 1. (Imputation) Create $M$ complete datasets by filling in missing values with imputed values generated from the posterior predictive distribution.
Specifically, to create the $j$th imputed dataset, first generate $\theta^{(j)}$ from the posterior distribution $p(\theta \mid X_n, y_{obs})$, and then generate $y_i^{(j)}$ from the imputation model $f(y \mid x_i; \theta^{(j)})$ for each missing $y_i$.

Step 2. (Analysis) Apply the user's complete-sample estimation procedure to each imputed dataset. Let $\hat\eta^{(j)}$ be the complete-sample estimator of $\eta = E\{g(Y)\}$ applied to the $j$th imputed dataset and $\hat V^{(j)}$ be the complete-sample variance estimator of $\hat\eta^{(j)}$.

Step 3. (Summarize) Use Rubin's combining rule to summarize the results from the multiply imputed datasets. The multiple imputation estimator of $\eta$ is $\hat\eta_{MI} = M^{-1} \sum_{j=1}^{M} \hat\eta^{(j)}$, and Rubin's variance estimator is

$$\hat V_{MI}(\hat\eta_{MI}) = W_M + (1 + M^{-1}) B_M, \tag{1}$$

where $W_M = M^{-1} \sum_{j=1}^{M} \hat V^{(j)}$ and $B_M = (M-1)^{-1} \sum_{j=1}^{M} (\hat\eta^{(j)} - \hat\eta_{MI})^2$.

If the method of moments estimator of $\eta = E\{g(Y)\}$ is used in Step 2, the multiple imputation estimator of $\eta$ becomes

$$\hat\eta_{MI} = M^{-1} \sum_{j=1}^{M} \hat\eta^{(j)} = n^{-1} \left\{ \sum_{i=1}^{r} g(y_i) + \sum_{i=r+1}^{n} M^{-1} \sum_{j=1}^{M} g(y_i^{(j)}) \right\}, \tag{2}$$

where $\hat\eta^{(j)} = n^{-1} \{ \sum_{i=1}^{r} g(y_i) + \sum_{i=r+1}^{n} g(y_i^{(j)}) \}$.

To derive the frequentist property of $\hat\eta_{MI}$, we rely on the Bernstein-von Mises theorem (van der Vaart, 2000; Chapter 10), which states that under regularity conditions and conditional on the observed data, the posterior distribution $p(\theta \mid X_n, y_{obs})$ converges to a normal distribution with mean $\hat\theta$ and variance $I_{obs}^{-1}$, where $\hat\theta$ is the maximum likelihood estimator of $\theta$ from the observed data and $I_{obs}^{-1}$ is the inverse of the observed Fisher information matrix, with $I_{obs} = -\sum_{i=1}^{r} \partial^2 \log f(y_i \mid x_i; \hat\theta) / \partial\theta \, \partial\theta^T$. As a result, assuming that $E\{g(Y) \mid x_i; \theta\}$ is sufficiently smooth in $\theta$, conditional on the observed data we have

$$\operatorname*{plim}_{M \to \infty} M^{-1} \sum_{j=1}^{M} g(y_i^{(j)}) = E[E\{g(Y) \mid x_i; \theta\} \mid X_n, y_{obs}] \cong E\{g(Y) \mid x_i; \hat\theta\},$$

where $A_n \cong B_n$ means $A_n = B_n + o_p(1)$. Therefore, for $M \to \infty$, $\hat\eta_{MI}$ converges to $\hat\eta_{MI,\infty} = n^{-1} \{ \sum_{i=1}^{r} g(y_i) + \sum_{i=r+1}^{n} m(x_i; \hat\theta) \}$, where $m(x; \theta) = E\{g(Y) \mid x; \theta\}$.

The variance estimation of $\hat\eta_{MI,\infty}$ needs to appropriately account for the uncertainty associated with the estimate of $\theta$, which is usually done using linearization methods if the imputation models are known (Robins & Wang, 2000; Kim & Rao, 2009). In the multiple imputation procedure, this is characterized in the variability between the multiply imputed datasets without referring to the imputation models. However, Rubin's variance estimator (1) requires restrictive conditions for valid inference, which we discuss in the next section.

3. MAIN RESULT

Rubin's variance estimator is based on the following decomposition,

$$\mathrm{var}(\hat\eta_{MI}) = \mathrm{var}(\hat\eta_n) + \mathrm{var}(\hat\eta_{MI} - \hat\eta_n) + 2\,\mathrm{cov}(\hat\eta_{MI} - \hat\eta_n, \hat\eta_n), \tag{3}$$

where $\hat\eta_n$ is the complete-sample estimator of $\eta$. Basically, in Rubin's variance estimator (1), $W_M$ estimates the first term of (3) and $(1 + M^{-1}) B_M$ estimates the second term of (3). In particular, Kim et al. (2006) proved that $E\{(1 + M^{-1}) B_M\} \cong \mathrm{var}(\hat\eta_{MI} - \hat\eta_n)$ for a fairly general class of estimators. Thus, if the complete-sample variance estimator satisfies the condition $E(\hat V^{(j)}) \cong \mathrm{var}(\hat\eta_n)$ for $j = 1, \ldots, M$, the bias of Rubin's variance estimator is

$$\mathrm{bias}(\hat V_{MI}) \cong -2\,\mathrm{cov}(\hat\eta_{MI} - \hat\eta_n, \hat\eta_n). \tag{4}$$

Rubin's variance estimator is asymptotically unbiased if $\mathrm{cov}(\hat\eta_{MI} - \hat\eta_n, \hat\eta_n) = 0$, which is called the congeniality condition by Meng (1994).
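As a small illustration of the combining rule (1), the following minimal Python sketch computes $\hat\eta_{MI}$, $W_M$, $B_M$ and Rubin's variance estimator from $M$ completed-data results. The function name `rubin_combine` and the toy numbers are illustrative assumptions, not taken from the paper.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Combine M completed-data results with Rubin's rule, equation (1).

    estimates: the M point estimates eta_hat^(j)
    variances: the M complete-sample variance estimates V_hat^(j)
    (Hypothetical helper for illustration; not code from the paper.)
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need M >= 2 estimates and one variance per estimate")
    eta_mi = mean(estimates)          # multiple imputation point estimator
    w_m = mean(variances)             # within-imputation variance W_M
    b_m = variance(estimates)         # between-imputation variance B_M, divisor M - 1
    v_mi = w_m + (1 + 1 / m) * b_m    # Rubin's variance estimator
    return eta_mi, w_m, b_m, v_mi


# Toy inputs, chosen only to exercise the arithmetic.
eta_mi, w_m, b_m, v_mi = rubin_combine([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
print(eta_mi, w_m, b_m, v_mi)
```

Note that the rule never looks at the covariance term in (3); whether that omission is harmless is exactly the congeniality question discussed here.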
However, the congeniality condition does not hold for some common estimators such as the method of moments estimators. Theorem 1 gives this asymptotic bias of Rubin's variance estimator for $M \to \infty$, with the proof outlined in the online supplementary material.

THEOREM 1. Let $\hat\eta_n = n^{-1} \sum_{i=1}^{n} g(y_i)$ be the method of moments estimator of $\eta = E\{g(Y)\}$ under complete response. Assume that $E(\hat V^{(j)}) \cong \mathrm{var}(\hat\eta_n)$ holds for $j = 1, \ldots, M$. Then for $M \to \infty$, the bias of Rubin's variance estimator is

$$\mathrm{bias}(\hat V_{MI}) \cong 2 n^{-1} (1 - p) \left( E[\mathrm{var}\{g(Y) \mid X\} \mid \delta = 0] - \dot m_{\theta,0}^{T} I_\theta^{-1} \dot m_{\theta,1} \right), \tag{5}$$

where $p = r/n$, $I_\theta = -E\{\partial^2 \log f(Y \mid X; \theta) / \partial\theta \, \partial\theta^T\}$, $m(x; \theta) = E\{g(Y) \mid x; \theta\}$, $\dot m_\theta(x) = \partial m(x; \theta) / \partial\theta$, $\dot m_{\theta,0} = E\{\dot m_\theta(X) \mid \delta = 0\}$, and $\dot m_{\theta,1} = E\{\dot m_\theta(X) \mid \delta = 1\}$.

Remark 1. Under missing completely at random, the bias in (5) simplifies to

$$\mathrm{bias}(\hat V_{MI}) \cong 2 p (1 - p) \{ \mathrm{var}(\hat\eta_{r,MME}) - \mathrm{var}(\hat\eta_{r,MLE}) \}, \tag{6}$$

where $\hat\eta_{r,MME} = r^{-1} \sum_{i=1}^{r} g(y_i)$ and $\hat\eta_{r,MLE} = r^{-1} \sum_{i=1}^{r} E\{g(Y) \mid x_i; \hat\theta\}$, because

$$\mathrm{var}(\hat\eta_{r,MME}) = r^{-1} \mathrm{var}\{g(Y)\} = r^{-1} \mathrm{var}[E\{g(Y) \mid X\}] + r^{-1} E[\mathrm{var}\{g(Y) \mid X\}],$$

and

$$\mathrm{var}(\hat\eta_{r,MLE}) \cong r^{-1} \mathrm{var}[E\{g(Y) \mid X\}] + r^{-1} \dot m_\theta^{T} I_\theta^{-1} \dot m_\theta,$$

where $\dot m_\theta = E\{\dot m_\theta(X)\}$. Result (6) explicitly shows that Rubin's variance estimator is unbiased if and only if the method of moments estimator is as efficient as the maximum likelihood estimator, that is, $\mathrm{var}(\hat\eta_{r,MME}) = \mathrm{var}(\hat\eta_{r,MLE})$. Otherwise, Rubin's variance estimator is positively biased.

Remark 2. Under missing at random, the bias of Rubin's variance estimator can be zero, positive or negative. Consider a linear regression model $Y = X^T \beta + \epsilon$, where $\epsilon \sim N(0, \sigma^2)$. For $g(Y) = Y$, if $X$ contains 1, then the method of moments estimator $n^{-1} \sum_{i=1}^{n} y_i$ is identical to the maximum likelihood estimator $n^{-1} \sum_{i=1}^{n} x_i^T \hat\beta$, with $\hat\beta$ being the maximum likelihood estimator of $\beta$ under complete response. Writing $E_0(\cdot) = E(\cdot \mid \delta = 0)$ and $E_1(\cdot) = E(\cdot \mid \delta = 1)$, Theorem 1 gives the bias of Rubin's variance estimator in (5) as

$$\mathrm{bias}(\hat V_{MI}) \cong 2 n^{-1} (1 - p) \sigma^2 \{ 1 - E_0(X)^T E_1(X X^T)^{-1} E_1(X) \} = 0,$$

by direct calculation, considering that $X$ contains 1. This is consistent with the theory in Wang & Robins (1998) and Nielsen (2003). Now consider a simple linear regression model which contains one covariate $X$ and no intercept; then the method of moments estimator is strictly less efficient than the maximum likelihood estimator (Matloff, 1981). The bias of Rubin's variance estimator is

$$\mathrm{bias}(\hat V_{MI}) \cong 2 n^{-1} (1 - p) \sigma^2 E_1(X^2)^{-1} \{ E_1(X^2) - E_0(X) E_1(X) \}, \tag{7}$$

which can be zero, positive or negative depending on the information of $X$ in the respondent and non-respondent groups. See the first simulation study in Section 5.

4. ALTERNATIVE VARIANCE ESTIMATION

In this section, we consider an alternative variance estimation method that leads to an unbiased variance estimator for multiple imputation regardless of whether the method of moments estimator or the maximum likelihood estimator is used as the complete-sample estimator in the multiple imputation procedure. We first decompose the multiple imputation estimator as

$$\hat\eta_{MI} = \hat\eta_{MI,\infty} + (\hat\eta_{MI} - \hat\eta_{MI,\infty}).$$
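The two terms are uncorrelated, by the law of total covariance and the fact that $\hat\eta_{MI,\infty}$ is the conditional expectation of $\hat\eta_{MI}$ given the observed data. Therefore, we have

$$\mathrm{var}(\hat\eta_{MI}) = \mathrm{var}(\hat\eta_{MI,\infty}) + \mathrm{var}(\hat\eta_{MI} - \hat\eta_{MI,\infty}). \tag{8}$$

Note that $\mathrm{var}(\hat\eta_{MI} - \hat\eta_{MI,\infty})$ can be estimated by $M^{-1} B_M$ (Kim et al., 2006; Lemma 2). We now focus on estimating $\mathrm{var}(\hat\eta_{MI,\infty})$ in (8). For simplicity of presentation, all details of the derivation are given in the supplementary material. We show that the variance of $\hat\eta_{MI,\infty}$ is a sum of two terms,

$$\mathrm{var}(\hat\eta_{MI,\infty}) \cong n^{-1} V_1 + r^{-1} V_2, \tag{9}$$

where $V_1 = \mathrm{var}\{g(Y)\} - (1 - p) E[\mathrm{var}\{g(Y) \mid X\} \mid \delta = 0]$, and $V_2 = \dot m_\theta^{T} I_\theta^{-1} \dot m_\theta - p^2 \dot m_{\theta,1}^{T} I_\theta^{-1} \dot m_{\theta,1}$.

The first term, $n^{-1} V_1$, is the variance of the sample mean of $g(Y_i) - (1 - \delta_i)\{g(Y_i) - m(x_i; \theta)\}$. To estimate this term, consider $W_M = M^{-1} \sum_{j=1}^{M} \hat V^{(j)}$ as in (1), and

$$C_M = \frac{1}{n^2 (M - 1)} \sum_{k=1}^{M} \sum_{i=r+1}^{n} \left\{ g(y_i^{(k)}) - \frac{1}{M} \sum_{l=1}^{M} g(y_i^{(l)}) \right\}^2. \tag{10}$$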

We have $E(W_M) \cong n^{-1} \mathrm{var}\{g(Y)\}$ and $E(C_M) \cong n^{-1} (1 - p) E[\mathrm{var}\{g(Y) \mid X\} \mid \delta = 0]$. Therefore, the first term $n^{-1} V_1$ can be estimated by $\tilde W_M = W_M - C_M$. By the strong law of large numbers, $\mathrm{pr}(\tilde W_M \geq 0) \to 1$ as $n \to \infty$.

The second term, $r^{-1} V_2$, reflects the variability associated with using the estimated value of $\theta$ instead of the true value $\theta$ in the imputed values. To estimate this term, we use over-imputation in the sense that the imputation is carried out not only for the units with missing outcomes, but also for the units with observed outcomes. Over-imputation has been used in model diagnostics for multiple imputation (Honaker et al., 2010; Blackwell et al., 2015). Let $d_i^{(k)} = g(y_i^{(k)}) - M^{-1} \sum_{l=1}^{M} g(y_i^{(l)})$ for $i = 1, \ldots, n$ and $k = 1, \ldots, M$. Define

$$D_{M,n} = (M - 1)^{-1} \sum_{k=1}^{M} \left( n^{-1} \sum_{i=1}^{n} d_i^{(k)} \right)^2 - (M - 1)^{-1} \sum_{k=1}^{M} n^{-2} \sum_{i=1}^{n} (d_i^{(k)})^2,$$

and

$$D_{M,r} = (M - 1)^{-1} \sum_{k=1}^{M} \left( n^{-1} \sum_{i=1}^{r} d_i^{(k)} \right)^2 - (M - 1)^{-1} \sum_{k=1}^{M} n^{-2} \sum_{i=1}^{r} (d_i^{(k)})^2.$$

The key insight is based on the following observations: $E(D_{M,n}) \cong r^{-1} \dot m_\theta^{T} I_\theta^{-1} \dot m_\theta$ and $E(D_{M,r}) \cong r^{-1} p^2 \dot m_{\theta,1}^{T} I_\theta^{-1} \dot m_{\theta,1}$; therefore, the second term of (9) can be estimated by $D_M = D_{M,n} - D_{M,r}$. Combining the estimators of the two terms in (9), we have the new multiple imputation variance estimator, given in the following theorem.

THEOREM 2. Under the assumptions of Theorem 1, the new multiple imputation variance estimator is

$$\hat V_{MI} = \tilde W_M + D_M + M^{-1} B_M, \tag{11}$$

where $\tilde W_M = W_M - C_M$, with $C_M$ defined in (10) and $B_M$ being the usual between-imputation variance in (1). $\hat V_{MI}$ is asymptotically unbiased for estimating the variance of the multiple imputation estimator in (2) as $n \to \infty$.

Remark 3. To account for the uncertainty in the variance estimator with a small to moderate imputation size, a $100(1 - \alpha)\%$ interval estimate for $\eta$ is $\hat\eta_{MI} \pm t_{df, 1-\alpha/2} \sqrt{\hat V_{MI}}$, where $df$ is an approximate number of degrees of freedom based on Satterthwaite's method (1946), given in the supplementary material. From simulation studies, we find that using $df = M - 1$ gives results as satisfactory as those from the formula we provide. As a practical matter, $df = M - 1$ is preferred.

Remark 4.
The proposed variance estimator in (11) is also asymptotically unbiased when $\hat\eta_n$ is the maximum likelihood estimator of $\eta = E\{g(Y)\}$ (see the supplementary material for the proof). Therefore, the proposed variance estimator is applicable regardless of whether the maximum likelihood estimator or the method of moments estimator is used as the complete-sample estimator.

The price we pay for the better performance of our variance estimator is an increase in computational complexity and data storage space: it requires $M + 1$ datasets, with $M$ of them including the over-imputations and the last one containing the original observed data. However, when the concern is valid inference for multiple imputation, as in this paper, our proposed variance estimator based on over-imputation is preferred over Rubin's. In addition, given over-imputations, the subsequent inference does not require knowledge of the imputation models. This is important because data analysts typically do not have access to all the information that the imputers used for imputation. Our study would promote the use of over-imputation at the time of imputation, which not only allows the imputers to assess the adequacy of the imputation models, but also enables the analysts to carry out valid inference without knowledge of the imputation models.
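The pieces of the new estimator (11) can be assembled as in the following minimal Python sketch. It assumes the analyst holds the $M + 1$ datasets described above: one vector of $g(y_i)$ values with `None` for nonrespondents, and $M$ over-imputed vectors of $g(y_i^{(k)})$ for all $n$ units. The function name `new_mi_variance`, the data layout, and the use of the sample variance divided by $n$ as the complete-sample variance estimator are assumptions made for illustration, not the authors' code.

```python
from statistics import mean, variance


def new_mi_variance(g_observed, g_over):
    """Sketch of V_hat = W~_M + D_M + B_M / M from equation (11).

    g_observed: length-n list of g(y_i), with None where y_i is missing.
    g_over:     M lists of length n holding over-imputed g(y_i^(k)) for ALL units.
    (Hypothetical layout; the paper does not prescribe one.)
    """
    n, m = len(g_observed), len(g_over)
    resp = [i for i in range(n) if g_observed[i] is not None]
    miss = [i for i in range(n) if g_observed[i] is None]

    # Completed datasets: observed values for respondents, imputations otherwise.
    eta_k, v_k = [], []
    for k in range(m):
        completed = [g_over[k][i] if g_observed[i] is None else g_observed[i]
                     for i in range(n)]
        eta_k.append(mean(completed))
        v_k.append(variance(completed) / n)   # method of moments variance estimate
    eta_mi = mean(eta_k)
    w_m = mean(v_k)                           # W_M in (1)
    b_m = variance(eta_k)                     # B_M in (1)

    # d_i^(k): deviation of each over-imputed value from its unit-level mean.
    unit_mean = [mean([g_over[k][i] for k in range(m)]) for i in range(n)]
    d = [[g_over[k][i] - unit_mean[i] for i in range(n)] for k in range(m)]

    # C_M in (10): uses only the units with missing outcomes.
    c_m = sum(d[k][i] ** 2 for k in range(m) for i in miss) / (n ** 2 * (m - 1))

    def d_term(units):
        """D_{M,n} (all units) or D_{M,r} (respondents only)."""
        between = sum((sum(d[k][i] for i in units) / n) ** 2 for k in range(m))
        within = sum(d[k][i] ** 2 for k in range(m) for i in units) / n ** 2
        return (between - within) / (m - 1)

    d_m = d_term(range(n)) - d_term(resp)
    return eta_mi, (w_m - c_m) + d_m + b_m / m


# Toy numbers, only to exercise the arithmetic.
g_observed = [1.0, 2.0, None, None]
g_over = [[1.1, 1.9, 3.2, 4.1], [0.8, 2.3, 2.7, 3.6], [1.2, 2.0, 3.0, 4.4]]
print(new_mi_variance(g_observed, g_over))
```

The function needs no knowledge of the imputation model, which mirrors the point made in Remark 4. With very small $n$ or $M$ the individual components are noisy, so the toy output should not be read as a meaningful variance.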

5. SIMULATION STUDY

To test our theory, we conduct two limited simulation studies. In the first simulation, 5,000 Monte Carlo samples of size $n = 2{,}000$ are independently generated from $y_i = \beta x_i + e_i$, where $\beta = 0.1$, $x_i \sim \exp(1)$ and $e_i \sim N(0, \sigma_e^2)$ with $\sigma_e^2 = 0.5$. In the sample, we assume that $X$ is fully observed, but $Y$ is not. Let $\delta_i$ be the response indicator of $y_i$, with $\delta_i \sim \mathrm{Bernoulli}(p_i)$, where $p_i = 1/\{1 + \exp(-\phi_0 - \phi_1 x_i)\}$. We consider two scenarios: (i) $(\phi_0, \phi_1) = (-1.5, 2)$ and (ii) $(\phi_0, \phi_1) = (3, -3)$, with the average response rate about 0.6. The parameters of interest are $\eta_1 = E(Y)$ and $\eta_2 = \mathrm{pr}(Y < 0.15)$.

For multiple imputation, $M = 500$ imputed values are independently generated from the linear regression model using the Bayesian regression imputation procedure discussed in Schenker & Welsh (1998), where $\beta$ and $\sigma_e^2$ are treated as independent with prior density proportional to $\sigma_e^{-2}$. In each imputed dataset, we adopt the following complete-sample point estimators and variance estimators: $\hat\eta_{1,n} = n^{-1} \sum_{i=1}^{n} y_i$, $\hat\eta_{2,n} = n^{-1} \sum_{i=1}^{n} I(y_i < 0.15)$, $\hat V(\hat\eta_{1,n}) = n^{-1} (n - 1)^{-1} \sum_{i=1}^{n} (y_i - \hat\eta_{1,n})^2$, and $\hat V(\hat\eta_{2,n}) = (n - 1)^{-1} \hat\eta_{2,n} (1 - \hat\eta_{2,n})$.

The relative bias of the variance estimator is calculated as $\{E(\hat V_{MI}) - \mathrm{var}(\hat\eta_{MI})\} / \mathrm{var}(\hat\eta_{MI}) \times 100\%$. The $100(1 - \alpha)\%$ confidence intervals are calculated as $(\hat\eta_{MI} - t_{\nu, 1-\alpha/2} \sqrt{\hat V_{MI}}, \ \hat\eta_{MI} + t_{\nu, 1-\alpha/2} \sqrt{\hat V_{MI}})$, where $t_{\nu, 1-\alpha/2}$ is the $100(1 - \alpha/2)\%$ quantile of the $t$ distribution with $\nu$ degrees of freedom. For Rubin's method, $\nu = \nu_1 \nu_2 / (\nu_1 + \nu_2)$ with $\nu_1 = (M - 1) \lambda^{-2}$, $\nu_2 = (\nu_{com} + 1)(\nu_{com} + 3)^{-1} \nu_{com} (1 - \lambda)$, $\nu_{com} = n - 3$, and $\lambda = (1 + M^{-1}) B_M / \{W_M + (1 + M^{-1}) B_M\}$ (Barnard & Rubin, 1999). In our new method, $\nu = M - 1$. The coverage is calculated as the percentage of Monte Carlo samples in which the true parameter value falls within the confidence interval.
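The design of the first simulation can be sketched in Python as follows. The posterior draw below is the standard normal-theory result for a no-intercept regression under a prior proportional to $\sigma_e^{-2}$; it is assumed, not verified, to match the procedure of the cited reference. The function names, the seed, and the reduced imputation size $M = 20$ are illustrative choices, and a single sample is drawn rather than 5,000 Monte Carlo replicates.

```python
import math
import random
from statistics import mean, variance


def simulate_sample(n, beta, sigma2_e, phi0, phi1, rng):
    """Draw (x_i, y_i, delta_i) following the first simulation design."""
    xs, ys, deltas = [], [], []
    for _ in range(n):
        x = rng.expovariate(1.0)                       # X ~ exp(1)
        y = beta * x + rng.gauss(0.0, math.sqrt(sigma2_e))
        p = 1.0 / (1.0 + math.exp(-phi0 - phi1 * x))   # response probability p_i
        xs.append(x)
        ys.append(y)
        deltas.append(1 if rng.random() < p else 0)
    return xs, ys, deltas


def impute_once(xs, ys, deltas, rng):
    """One Bayesian regression imputation for y = beta * x + e (no intercept).

    Assumed posterior under prior proportional to 1 / sigma^2:
      sigma^2 | data       ~ SSE / chi-square(r - 1)
      beta | sigma^2, data ~ N(beta_hat, sigma^2 / sum(x^2))
    """
    obs = [(x, y) for x, y, d in zip(xs, ys, deltas) if d == 1]
    r = len(obs)
    sxx = sum(x * x for x, _ in obs)
    beta_hat = sum(x * y for x, y in obs) / sxx
    sse = sum((y - beta_hat * x) ** 2 for x, y in obs)
    chi2 = rng.gammavariate((r - 1) / 2.0, 2.0)        # chi-square with r - 1 df
    sigma2_star = sse / chi2
    beta_star = rng.gauss(beta_hat, math.sqrt(sigma2_star / sxx))
    sd_star = math.sqrt(sigma2_star)
    # Keep observed outcomes; replace missing ones by posterior predictive draws.
    return [y if d == 1 else beta_star * x + rng.gauss(0.0, sd_star)
            for x, y, d in zip(xs, ys, deltas)]


rng = random.Random(2015)                              # arbitrary seed
xs, ys, deltas = simulate_sample(2000, 0.1, 0.5, -1.5, 2.0, rng)  # scenario (i)
M = 20   # the paper uses M = 500; reduced here to keep the sketch fast
eta_k, v_k = [], []
for _ in range(M):
    completed = impute_once(xs, ys, deltas, rng)
    eta_k.append(mean(completed))                      # eta_hat_{1,n} per dataset
    v_k.append(variance(completed) / len(completed))   # V_hat(eta_hat_{1,n})
eta_mi = mean(eta_k)
v_rubin = mean(v_k) + (1 + 1 / M) * variance(eta_k)    # equation (1)
print(eta_mi, v_rubin)
```

Repeating the whole block over many independent samples and comparing the average of `v_rubin` with the Monte Carlo variance of `eta_mi` is how a relative bias of the kind reported in this section would be approximated.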
From Table 1, for $\eta_1 = E(Y)$, under scenario (i), the relative bias of Rubin's variance estimator is 96.8%, which is consistent with our result in (7) with $E_1(X^2) - E_0(X) E_1(X) > 0$, where $E_1(X^2) = 3.38$, $E_1(X) = 1.45$, and $E_0(X) =$ […]. Under scenario (ii), the relative bias of Rubin's variance estimator is $-19.8\%$, which is consistent with our result in (7) with $E_1(X^2) - E_0(X) E_1(X) < 0$, where $E_1(X^2) = 0.37$, $E_1(X) = 0.47$, and $E_0(X) =$ […]. The empirical coverage for Rubin's method can be above or below the nominal coverage due to variance overestimation or underestimation. On the other hand, the new variance estimator is essentially unbiased for these scenarios.

In the second simulation, 5,000 Monte Carlo samples of size $n = 200$ are independently generated from $Y_i = \beta_0 + \beta_1 X_i + e_i$, where $\beta = (\beta_0, \beta_1) = (3, -1)$, $X_i \sim N(2, 1)$ and $e_i \sim N(0, \sigma_e^2)$ with $\sigma_e^2 = 1$. The parameters of interest are $\eta_1 = E(Y)$ and $\eta_2 = \mathrm{pr}(Y < 1)$. We consider two factors in this simulation. One is the response mechanism: missing completely at random and missing at random. For missing completely at random, $\delta_i \sim \mathrm{Bernoulli}(0.6)$. For missing at random, $\delta_i \sim \mathrm{Bernoulli}(p_i)$, where $p_i = 1/\{1 + \exp(-\phi_0 - \phi_1 x_i)\}$ and $(\phi_0, \phi_1) = (0.28, 0.1)$, with the average response rate about 0.6. The other factor is the size of multiple imputation, with two levels $M = 10$ and $M = 30$.

From Table 2, regarding the relative bias, Rubin's variance estimator is unbiased for $\eta_1 = E(Y)$, with absolute relative bias of less than 1%, and our new variance estimator is comparable with Rubin's variance estimator, with absolute relative bias of less than 1.68%. Rubin's variance estimator is biased upward for $\eta_2 = \mathrm{pr}(Y < 1)$, with absolute relative bias as high as 24%, whereas our new variance estimator reduces the absolute relative bias to less than 1.74%. Regarding confidence interval estimates, for $\eta_1 = E(Y)$, the confidence interval calculated from our new method is slightly wider than that from Rubin's method, because our new method uses a smaller number of degrees of freedom in the $t$ distribution.
However, for $\eta_2 = \mathrm{pr}(Y < 1)$, the confidence interval calculated from our new method is narrower than that from Rubin's method even with a smaller number of degrees of freedom in the $t$ distribution, due to the overestimation in Rubin's method. Rubin's method provides good empirical coverage for $\eta_1 = E(Y)$ in the sense that the empirical coverage is close to the nominal coverage; however, the empirical coverage for $\eta_2 = \mathrm{pr}(Y < 1)$ reaches 95% for 90% confidence intervals, and 98% for 95% confidence intervals, due to variance overestimation. In contrast, our new method provides more accurate confidence interval coverage for both $\eta_1 = E(Y)$ and $\eta_2 = \mathrm{pr}(Y < 1)$ at the 90% and 95% levels.

Table 1. Relative biases of two variance estimators and mean width and coverages of two interval estimates under two scenarios in simulation one
Rows: scenario, parameter ($\eta_1$, $\eta_2$). Columns, each for Rubin and New: relative bias (%), mean width for 90% C.I., mean width for 95% C.I., coverage for 90% C.I., coverage for 95% C.I. [Table entries not recoverable.]
C.I., confidence interval; $\eta_1 = E(Y)$; $\eta_2 = \mathrm{pr}(Y < 0.15)$; Rubin/New, Rubin's/new variance estimator.

Table 2. Relative biases of two variance estimators and mean width and coverages of two interval estimates under two scenarios of missingness in simulation two
Rows: missing completely at random and missing at random, parameter ($\eta_1$, $\eta_2$), $M$. Columns, each for Rubin and New: relative bias (%), mean width for 90% C.I., mean width for 95% C.I., coverage for 90% C.I., coverage for 95% C.I. [Table entries not recoverable.]
C.I., confidence interval; $\eta_1 = E(Y)$; $\eta_2 = \mathrm{pr}(Y < 1)$; Rubin/New, Rubin's/new variance estimator.

6. DISCUSSION

Our method can be extended to a more general class of parameters obtained from estimating equations. Let $\eta$ be defined as a solution to the estimating equation $\sum_{i=1}^{n} U(\eta; x_i, y_i) = 0$. Examples of $\eta$ include the mean of $y$, the proportion of $y$ less than $q$, the $p$th quantile, regression coefficients, and domain means. A similar approach can be used to characterize the bias of Rubin's variance estimator and to develop a bias-corrected variance estimator. Another extension would be developing unbiased variance estimation for the vector case of $\eta$ with $q > 1$ components.
As in the scalar case, we can construct the multivariate analogues of the multiple imputation estimator and the variance estimator; however, finding an adequate reference distribution for the statistic $(\hat\eta_{MI} - \eta)^T \hat V_{MI}^{-1} (\hat\eta_{MI} - \eta) / q$ is more subtle in the vector case than in the scalar case. One potential solution is to make a simplifying assumption that the fraction of missing information is equal for all the components of $\eta$, as discussed in Xie & Meng (2014) and Li et al. (1994).
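To make the estimating-equation formulation in this section concrete, the sketch below solves $\sum_i U(\eta; y_i) = 0$ by bisection for two of the listed examples, the mean and a quantile. The solver name, the choice of bisection, and the specific forms of $U$ are illustrative assumptions; the paper does not specify a numerical method, and covariates are omitted for brevity.

```python
def solve_estimating_equation(u, data, lo, hi, tol=1e-10):
    """Bisection root of h(eta) = sum_i u(eta, y_i).

    Assumes h is monotone in eta and changes sign on [lo, hi].
    (Hypothetical helper for illustration only.)
    """
    def h(eta):
        return sum(u(eta, y) for y in data)

    h_lo, h_hi = h(lo), h(hi)
    if h_lo == 0:
        return lo
    if h_hi == 0:
        return hi
    if (h_lo > 0) == (h_hi > 0):
        raise ValueError("h must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h_mid = h(mid)
        if h_mid == 0:
            return mid
        if (h_mid > 0) == (h_lo > 0):
            lo = mid          # root lies to the right of mid
        else:
            hi = mid          # root lies to the left of mid
    return 0.5 * (lo + hi)


y = [1.0, 2.0, 3.0, 6.0]

# Mean: U(eta; y) = y - eta.
eta_mean = solve_estimating_equation(lambda eta, yi: yi - eta, y, 0.0, 10.0)

# Median (the 0.5 quantile): U(eta; y) = I(y <= eta) - 0.5, a step function,
# so any point where the sum crosses zero is accepted.
eta_median = solve_estimating_equation(
    lambda eta, yi: (1.0 if yi <= eta else 0.0) - 0.5, y, 0.0, 10.0)

print(eta_mean, eta_median)
```

Under multiple imputation, the same solver would be applied to each completed dataset to obtain $\hat\eta^{(j)}$.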

ACKNOWLEDGMENT

We are grateful to Xianchao Xie and Xiaoli Meng for many helpful conversations and to the Biometrika editors and four referees for their valuable comments that helped to improve this paper. The research of the second author was partially supported by a grant from the US National Science Foundation and also by a Cooperative Agreement between the U.S. Department of Agriculture Natural Resources Conservation Service and Iowa State University.

SUPPLEMENTARY MATERIAL

The supplementary material available at Biometrika online includes the proof of Theorem 1, the proof of Theorem 2, verification that the new variance estimator is unbiased when $\hat\eta_n$ is the maximum likelihood estimator of $\eta = E\{g(Y)\}$, and an approximate number of degrees of freedom.

REFERENCES

BARNARD, J. & RUBIN, D. B. (1999). Small-sample degrees of freedom with multiple imputation. Biometrika 86.
BINDER, D. A. & SUN, W. (1996). Frequency valid multiple imputation for surveys with a complex design. Proc. Surv. Res. Meth. Sect. Am. Statist. Ass.
BLACKWELL, M., HONAKER, J. & KING, G. (2015). A unified approach to measurement error and missing data: details and extensions. Soc. Meth. Res., in press.
CLOGG, C. C., RUBIN, D. B., SCHENKER, N., SCHULTZ, B. & WEIDMAN, L. (1991). Multiple imputation of industry and occupation codes in census public-use samples using Bayesian logistic regression. J. Am. Statist. Assoc. 86.
FAY, R. E. (1992). When are inferences from multiple imputation valid? Proc. Surv. Res. Meth. Sect. Am. Statist. Ass.
FAY, R. E. (1993). Valid inference from imputed survey data. Proc. Surv. Res. Meth. Sect. Am. Statist. Ass.
FAY, R. E. (1996). Alternative paradigms for the analysis of imputed survey data. J. Am. Statist. Assoc. 91.
FORD, B. L. (1983). An overview of hot-deck procedures. Incomplete Data in Sample Surveys 2.
HONAKER, J., KING, G. & BLACKWELL, M. M. (2010). Package Amelia.
KIM, J. K. (2011). Parametric fractional imputation for missing data analysis. Biometrika 98.
KIM, J. K., BRICK, J., FULLER, W. A. & KALTON, G. (2006). On the bias of the multiple-imputation variance estimator in survey sampling. J. R. Statist. Soc. B 68.
KIM, J. K. & RAO, J. N. K. (2009). A unified approach to linearization variance estimation from survey data after imputation for item nonresponse. Biometrika 96.
KOTT, P. S. (1995). A paradox of multiple imputation. Proc. Surv. Res. Meth. Sect. Am. Statist. Ass.
LI, K. H., RAGHUNATHAN, T. E. & RUBIN, D. B. (1994). Large-sample significance levels from multiply imputed data using moment-based statistics and an F reference distribution. J. Am. Statist. Assoc. 86.
MATLOFF, N. S. (1981). Use of regression functions for improved estimation of means. Biometrika 68.
MENG, X. (1994). Multiple-imputation inferences with uncongenial sources of input. Statist. Sci. 9.
NIELSEN, S. F. (2003). Proper and improper multiple imputation. Int. Statist. Rev. 71.
ROBINS, J. M. & WANG, N. (2000). Inference for imputation estimators. Biometrika 87.
RUBIN, D. B. (1976). Inference and missing data. Biometrika 63.
RUBIN, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons.
RUBIN, D. B. (1996). Multiple imputation after 18+ years. J. Am. Statist. Assoc. 91.
RUBIN, D. B. & SCHENKER, N. (1986). Multiple imputation for interval estimation from simple random samples with ignorable nonresponse. J. Am. Statist. Assoc. 81.
SATTERTHWAITE, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin 2.
SCHENKER, N. & WELSH, A. H. (1998). Asymptotic results for multiple imputation. Ann. Statist. 16.
VAN DER VAART, A. W. (2000). Asymptotic Statistics. Cambridge University Press.
WANG, N. & ROBINS, J. M. (1998). Large-sample theory for parametric multiple imputation procedures. Biometrika 85.
XIE, X. & MENG, X. L. (2014). Dissecting multiple imputation from a multi-phase inference perspective: what happens when God's, imputer's and analyst's models are uncongenial? Statistica Sinica, to appear.


Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Parametric fractional imputation for missing data analysis

Parametric fractional imputation for missing data analysis Secton on Survey Research Methods JSM 2008 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Wayne Fuller Abstract Under a parametrc model for mssng data, the EM algorthm s a popular tool

More information

A note on multiple imputation for general purpose estimation

A note on multiple imputation for general purpose estimation A note on multiple imputation for general purpose estimation Shu Yang Jae Kwang Kim SSC meeting June 16, 2015 Shu Yang, Jae Kwang Kim Multiple Imputation June 16, 2015 1 / 32 Introduction Basic Setup Assume

More information

Efficient nonresponse weighting adjustment using estimated response probability

Efficient nonresponse weighting adjustment using estimated response probability Effcent nonresponse weghtng adjustment usng estmated response probablty Jae Kwang Km Department of Appled Statstcs, Yonse Unversty, Seoul, 120-749, KOREA Key Words: Regresson estmator, Propensty score,

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Small Area Interval Estimation

Small Area Interval Estimation .. Small Area Interval Estmaton Partha Lahr Jont Program n Survey Methodology Unversty of Maryland, College Park (Based on jont work wth Masayo Yoshmor, Former JPSM Vstng PhD Student and Research Fellow

More information

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics ECOOMICS 35*-A Md-Term Exam -- Fall Term 000 Page of 3 pages QUEE'S UIVERSITY AT KIGSTO Department of Economcs ECOOMICS 35* - Secton A Introductory Econometrcs Fall Term 000 MID-TERM EAM ASWERS MG Abbott

More information

A note on regression estimation with unknown population size

A note on regression estimation with unknown population size Statstcs Publcatons Statstcs 6-016 A note on regresson estmaton wth unknown populaton sze Mchael A. Hdroglou Statstcs Canada Jae Kwang Km Iowa State Unversty jkm@astate.edu Chrstan Olver Nambeu Statstcs

More information

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction ECONOMICS 35* -- NOTE 7 ECON 35* -- NOTE 7 Interval Estmaton n the Classcal Normal Lnear Regresson Model Ths note outlnes the basc elements of nterval estmaton n the Classcal Normal Lnear Regresson Model

More information

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT Malaysan Journal of Mathematcal Scences 8(S): 37-44 (2014) Specal Issue: Internatonal Conference on Mathematcal Scences and Statstcs 2013 (ICMSS2013) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information
