arxiv:math/ v1 [math.st] 23 Jun 2004

Size: px
Start display at page:

Download "arxiv:math/ v1 [math.st] 23 Jun 2004"

Transcription

1 The Annals of Statistics 2004, Vol. 32, No. 2, DOI: / c Institute of Mathematical Statistics, 2004 arxiv:math/ v1 [math.st] 23 Jun 2004 FINITE SAMPLE PROPERTIES OF MULTIPLE IMPUTATION ESTIMATORS By Jae Kwang Kim 1 Yonsei University Finite sample properties of multiple imputation estimators under the linear regression model are studied. The exact bias of the multiple imputation variance estimator is presented. A method of reducing the bias is presented simulation is used to make comparisons. We also show that the suggested method can be used for a general class of linear estimators. 1. Introduction. Multiple imputation, proposed by Rubin (1978), is a procedure for hling missing data that allows the data analyst to use stard techniques of analysis designed for complete data, while providing a method to estimate the uncertainty due to the missing data. Repeated imputations are drawn from the posterior predictive distribution of the missing values under the specified model given a suitable prior distribution. Schenker Welsh [(1988), hereafter SW] studied the asymptotic properties of multiple imputation in the linear-model framework, where the scalar outcome variable Y i is assumed to follow the model (1) Y i = x i β + e i, e i i.i.d. N(0,σ 2 ). The p-dimensional x i s are observed on the complete sample are assumed to be fixed. To describe the imputation procedure, we adopt matrix notation. Without loss of generality, we assume that the first r units are the respondents. Let y r = (Y 1,Y 2,...,Y r ) X r = (x 1,x 2,...,x r ). Also, let y n r = (Y r+1,y r+2,...,y n ) X n r = (x r+1,x r+2,...,x n ). The suggested method of multiple imputation for model (1) is as follows: Received September 2001; revised January Supported by Korea Research Foundation Grant KRF C AMS 2000 subject classifications. Primary 62D05; secondary 62J99. Key words phrases. Missing data, nonresponse, variance estimation. This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Statistics, 2004, Vol. 32, No. 2, This reprint differs from the original in pagination typographic detail. 1

2 2 J. K. KIM [M1] For each repetition of the imputation, k = 1,..., M, draw (2) σ 2 (k) y r i.i.d. (r p)ˆσ 2 r/χ 2 r p, where ˆσ 2 r = (r p) 1 y r [I X r(x r X r) 1 X r ]y r. [M2] Draw (3) β (k) (y r,σ (k) ) i.i.d. N(ˆβ r,(x rx r ) 1 σ 2 (k) ), where ˆβ r = (X rx r ) 1 X ry r. [M3] For each missing unit j = r + 1,...,n draw (4) e j(k) (β (k),σ (k) ) i.i.d. N(0,σ(k) 2 ). Then Yj(k) = x jβ (k) +e j(k) is the imputed value associated with unit j for the kth imputation. [M4] Repeat [M1] [M3] independently M times. The above procedure assumes a constant prior for (β,logσ) an ignorable response mechanism in the sense of Rubin (1976). At each repetition of the imputation, k = 1,...,M, we can calculate the imputed version of the full sample estimators ( n ) 1 [ r ] ˆβ I(k),n = x i x i x i y i + x i Yi(k) i=1 i=1 i=r+1 ( n ) ˆV I(k),n = x i x i 1ˆσ I(k),n 2, i=1 where [ r ] ˆσ I(k),n 2 = (n p) 1 (Y i x iˆβ I(k),n ) 2 + (Yi(k) x iˆβ I(k),n ) 2. i=1 i=r+1 The proposed point estimator for the regression coefficient based on M repeated imputations is (5) ˆβ M M,n = M 1 ˆβ I(k),n. The proposed estimator for the variance of the point estimator (5) is (6) where (7) k=1 ˆV M,n = W M,n + (1 + M 1 )B M,n, W M,n = M 1 M k=1 ˆV I(k),n

3 MULTIPLE IMPUTATION ESTIMATORS 3 (8) M B M,n = (M 1) 1 (ˆβ I(k),n ˆβ M,n )(ˆβ I(k),n ˆβ M,n ). k=1 Rubin (1987) called W M,n the within-imputation variance called B M,n the between-imputation variance. We call ˆV M,n Rubin s variance estimator. SW studied the asymptotic properties of the point estimator (5) its variance estimator (6). Under regularity conditions they showed that (9) (10) lim E(ˆβ n M,n β) = 0 lim n{e(ˆv M,n ) Var(ˆβ n M,n )} = 0, where the reference distribution in (9) (10) is the regression model (1) with an ignorable response mechanism. Note that (9) (10) require that the sample size n go to infinity, for fixed M, M > 1. Finite sample properties are not discussed by SW. The next section gives finite sample properties of the multiple imputation estimators. In Section 3 a simple modified version of the SW method is proposed to minimize the finite sample bias of the multiple imputation variance estimator. In Section 4 extensions are made to a more general class of estimators. In Section 5 results of a simulation study are reported. In Section 6 concluding remarks are made. 2. Finite sample properties. The following lemma provides the covariance structure of the multiply-imputed data set generated by [M1] [M4]. Lemma 2.1. Let Y i be the observed value of the ith unit, i = 1,2,...,r (r > p+2), let Yj(k) be the imputed value associated with the jth unit for the kth repetition of the multiple imputation generated by the steps [M1] [M4]. Then, under model (1) with an ignorable response mechanism, (11) (12) Cov(Y i(k),yj(s) ) Cov(Y i,y j(k) ) = x i (X r X r) 1 x j σ 2 (1 + λ)x i (X rx r ) 1 x i σ 2 + λσ 2, if i = j k = s, = (1 + λ)x i (X rx r ) 1 x j σ 2, if i j k = s, x i (X rx r ) 1 x j σ 2, if k s, where λ = (r p 2) 1 (r p) the expectations in (11) (12) are taken over the joint distribution of model (1) the imputation mechanism with the indices of respondents fixed.

4 4 J. K. KIM can be decomposed into three indepen- For the proof, see Appendix A. Note that the imputed value Y dent components as (13) i(k) Y i(k) = x iˆβ r +x i (β (k) ˆβ r ) + e i(k). The first component x iˆβr has mean x i β variance x i (X r X r) 1 x i σ 2, the second component has mean zero variance λx i (X rx r ) 1 x i σ 2 the third component has mean zero variance λσ 2. The following theorem gives the mean variance of the point estimator ˆβ M,n of the regression coefficient the mean of the multiple imputation variance estimator ˆV M,n. Again, the expectations in the following theorem are taken over the joint distribution of model (1) the imputation mechanism with the indices of respondents fixed. (14) (15) (16) (17) Theorem 2.1. Under the assumptions of Lemma 2.1, E(ˆβ M,n ) = β, Var(ˆβ M,n ) = (X rx r ) 1 σ 2 + M 1 λ[(x rx r ) 1 (X nx n ) 1 ]σ 2, E(W M,n ) = (X n X n) 1 {1 + (n p) 1 (λ 1)(n r)}σ 2 E(B M,n ) = λ[(x rx r ) 1 (X nx n ) 1 ]σ 2, where λ = (r p 2) 1 (r p), W M,n is the within-imputation variability defined in (7) B M,n is the between-imputation variability defined in (8). The bias of the multiple imputation variance estimator is (18) E(ˆV M,n ) Var(ˆβ M,n ) = (X n X n) 1 (n p) 1 (λ 1)(n r)σ 2 + (λ 1)[(X r X r) 1 (X n X n) 1 ]σ 2. For the proof, see Appendix B. As is observed from (15), the point estimator ˆβ M,n for infinite M achieves the same efficiency of ˆβ r, the estimator based on the respondents. In fact, lim ˆβ M,n M ( n = + = ˆβ r, ) 1 { x i x r i=1 i=1 ( n ) 1 { x i x i=1 x i [x iˆβ r + (y i x iˆβ r )] n i=r+1 } [ M x i x iˆβ r + M 1 k=1 (x i (β (k) ˆβ r ) + e i(k) ) ]}

5 MULTIPLE IMPUTATION ESTIMATORS 5 because r i=1 x i (y i x iˆβ r ) = 0 by stard regression theory, lim M M 1 Mk=1 (β (k) ˆβ r ) = 0 by the law of large numbers lim M M 1 Mk=1 e i(k) = 0 by the law of large numbers. By (15) the variance of ˆβ M,n can be written as (19) Var(ˆβ M,n ) = Var(ˆβ r ) + M 1 λ[var(ˆβ r ) Var(ˆβ n )]. The second part in the right-h side of (19) is the increase in variance due to using ˆβ M,n instead of ˆβ r. By (17) that increase can be unbiasedly estimated by M 1 B M,n. Thus, an alternative estimator for the total variance of ˆβ M,n that is unbiased for (15) is (20) ˆ Var(ˆβ r ) + M 1 B M,n, where ˆ Var(ˆβ r ) is the stard variance estimator that treats the respondents as if they are the original sample. In large samples, λ. = 1 the bias of the multiple imputation variance estimator for the imputed regression coefficient is negligible. The bias term (18) is an exact bias for r > p + 2. The total variance of ˆβ M,n can be decomposed into three parts: Var(ˆβ M,n ) = Var(ˆβ n ) + Var(ˆβ r ˆβ n ) + Var(ˆβ M,n ˆβ r ). The first part, the second part the third part can be called the sampling variance, the variance due to missingness the variance due to imputation, respectively. Rubin s multiple imputation uses W M,n to estimate the sampling variance, B M,n to estimate the variance due to missingness M 1 B M,n to estimate the variance due to M repeated imputations. The first term on the right-h side of (18) is the bias of W M,n as an estimator of the sampling variance the second term on the right-h side of (18) is the bias of B M,n as an estimator of the variance due to missingness. The imputation variance is unbiasedly estimated by M 1 B M,n. 3. A modification. We consider ways of reducing the bias of the multiple imputation variance estimator. Recall that multiple imputation is characterized by the method of generating the imputed values by the variance formula. The variance formula directly uses the complete sample variance estimator so that it can be implemented easily using existing software. One estimator of variance is the alternative variance estimator in (20), which involves direct estimation of parameters from the respondents. A similar idea was used by Wang Robins (1998). Then Rubin s variance estimator ˆV M,n in (6) is no longer required. If we want to use Rubin s variance formula, one approach is to modify the imputation method to minimize the bias term in (18). To find a best imputation procedure, instead of fixing a constant prior for log σ, we use a

6 6 J. K. KIM class of prior distributions indexed by hyperparameters to express a class of imputation methods. As a conjugate prior distribution for σ 2, we choose a scaled inverse chi-square with degrees of freedom ν 0 scale parameter σ0 2 as the hyperparameters; that is, the prior distribution of σ 2 is the distribution of ν 0 σ0 2/χ2 ν 0. In the modified imputation method we determine the values of the hyperparameters, ν 0 σ0 2, to remove the bias of the multiple imputation variance estimator. Using the hyperparameters, the posterior distribution for σ 2 is written (21) σ 2 (k) y r i.i.d. [ν 0 σ (r p)ˆσ 2 r]/χ 2 (ν 0 +r p), so that (22) E(σ 2 (k) ) = (ν 0 + r p 2) 1 (ν 0 σ (r p)σ 2 ). Note that SW used ν 0 = 0. Using the arguments of the proofs of Lemma 2.1 Theorem 2.1, the bias of the multiple imputation variance estimator based on the posterior distribution in (21) is (23) Bias(ˆV M,n ) = (X nx n ) 1 (n p) 1 {(λ 0 1)(n r)σ 2 + λ 1 (n r)σ 2 0} + {(X r X r) 1 (X n X n) 1 }{(λ 0 1)σ 2 + λ 1 σ 2 0 }, where λ 0 = (ν 0 +r p 2) 1 (r p) λ 1 = (ν 0 +r p 2) 1 ν 0. The bias is zero when ν 0 = 2 σ0 2 = 0, which is equivalent to generating the posterior values of σ 2 from the inverse chi-square distribution with degrees of freedom ν = r p + 2 instead of ν = r p in (2). Thus, the choice of ν = r p + 2 in (2) makes Rubin s variance estimator unbiased for a finite sample. We can also derive the optimal prior using the recent work of Meng Zaslavsky (2002) on single observation unbiased priors (SOUP). Meng Zaslavsky [2002, Section 6] showed that (24) π(σ 2 ) (σ 2 ) 2 is the unique SOUP among all continuously relatively scale invariant priors for the scale family distribution. Note that the prior density of the scaled inverse chi-square distribution can be written as (25) π(σ 2 ) (σ 2 ) (νo/2+1) exp[ ν 0 σ 2 0 /(2σ2 )]. Thus, the unique SOUP for σ 2 in (24) corresponds to the scaled inverse chi-square distribution with ν 0 = 2 σ0 2 = 0, which makes the posterior mean in (22) unbiased for σ 2.

7 MULTIPLE IMPUTATION ESTIMATORS 7 4. Extensions. In this section we investigate the properties of the modified method when it is applied to estimators other than the regression coefficients. Let the complete sample point estimator be a linear estimator of the form (26) ˆθ n = α i Y i i=1 for some known coefficients α i. Also, let the complete sample estimator for the variance of ˆθ n be a quadratic function of the sample values of the form (27) ˆV n = Ω ij Y i Y j i=1 j=1 for some known coefficients Ω ij. For the ˆV M,n to be asymptotically unbiased for the variance of ˆθ M,n, we need the congeniality assumption as defined in Meng (1994). The congeniality assumption in our context implies (28) Var(ˆθ,n ) = Var(ˆθ n ) + Var(ˆθ,n ˆθ n ). For the case of ˆθ n = ˆβ n discussed in Section 2, congeniality holds because ˆθ,n = ˆβ r Var(ˆβ r ˆβ n ) = (X r X r) 1 σ 2 (X n X n) 1 σ 2 = Var(ˆβ r ) Var(ˆβ n ). We restrict our attention to the case of congenial multiple imputation estimation because otherwise the variance estimator will be biased even asymptotically. The following theorem expresses the bias of Rubin s variance estimator applied to the general class of estimators ˆθ n in (26) ˆV n in (27) under the SW method. Theorem 4.1. Let the assumptions of Lemma 2.1 hold. Assume that the complete sample variance estimator ˆV n in (27) is unbiased for the variance of the complete sample point estimator ˆθ n in (26) under model (1). Let the congeniality assumption (28) hold. Then the multiple imputation point estimator ˆθ M,n is unbiased with variance { r } r Var(ˆθ M,n ) = α 2 i + 2 α i α j h ij + α i α j h ij σ 2 (29) i=1 + M 1 λ i=1 j=r+1 { n α i α j h ij + } α 2 i i=r+1 σ 2.

8 8 J. K. KIM The bias of ˆV M,n as an estimator of Var(ˆθ M,n ) is (30) E(ˆV M,n ) Var(ˆθ M,n ) { r = 2 Ω ij h ij + (1 + λ) Ω ij h ij i=1 j=r+1 + (λ 1) j=r+1 Ω jj } { n + (λ 1) α 2 j + α i α j h ij }σ 2, j=r+1 j=r+1 j=r+1 σ 2 where λ = (r p 2) 1 (r p) h ij = x i (X rx r ) 1 x j. For the proof, see Appendix C. The first term on the right-h side of (30) is the bias of W M,n as an estimator of Var(ˆθ n ) the second term is the bias of (1 + M 1 )B M,n as an estimator of Var(ˆθ M,n ˆθ n ). Note that the bias term in (30) can be written (31) where U = Bias(ˆV M,n ) = 2 { n i=1 j=r+1 (Ω ij + α i α j )h ij + Ω ij h ij σ 2 + (λ 1)Uσ 2, j=r+1 (Ω jj + α 2 j ) } = trace {(Ω n r + α n r α n r )[X n r(x r X r) 1 X n r + I n r]} with Ω n r the lower-right (n r) (n r) partition of Ω = [Ω ij ] α n r = (α r+1,...,α n ). By the nonnegative definiteness of Ω n r, α n r α n r, X n r (X rx r ) 1 X n r I n r, the U term is nonnegative. Thus, if we use the modified method suggested in Section 3 so that we have λ = 1, the variance term (29) decreases the bias will be reduced to (32) Bias(ˆV M,n ) = 2 Ω ij h ij σ 2, i=1 j=r+1 which is always smaller than the original bias in (31) because λ = (r p 2) 1 (r p) > 1. A sufficient condition for the bias term in (32) to be zero is (33) Ω = c[i n X n (X nx n ) 1 X n]

9 MULTIPLE IMPUTATION ESTIMATORS 9 for some constant c > 0. To show this, let δ ij be the (i,j)th element of I n. Then Ω ij h ij = c[δ ij x i (X n X n) 1 x j ][x i (X r X r) 1 x j ] i=1 i=1 [ n ] = cx j (X r X r) 1 x j cx j (X n X n) 1 x i x i (X r X r) 1 x j = 0 i=1 the bias term in (32) equals zero, which alternatively justifies our assertion of zero bias in Section Simulation study. To see the effect of changing ν = r p into ν = r p + 2, we performed a limited simulation. The simulation study can be described as a factorial design with L = 50,000 samples within each cell, where each sample is generated from (34) Y i = 2 + 4x i + e i, where x i = (n + 1) 1 i e i are independently identically distributed from the stard normal distribution. Thus, the population mean of (X,Y ) is (10, 42). The factors are as follows: 1. factor A, method of multiple imputation SW method (ν = r 2), new method (ν = r); 2. factor B, response rate (r/n) 0.8, 0.6, 0.4; 3. factor C, sample size (n) 20, 200. We used a uniform response mechanism M = 5 repeated imputations. Table 1 presents the mean, the variance the percentage relative efficiency of the point estimators under the two imputation schemes. The percentage relative efficiency (PRE) is PRE = [Var L (ˆθ SW )] 1 Var L (ˆθ new ) 100, where the subscript L denotes the distribution generated by the Monte Carlo simulation. Both imputation methods are unbiased for the two parameters the Monte Carlo results are consistent with that property. The new procedure is slightly more efficient than the SW procedure the efficiency is greater for lower response rates. Table 2 presents the relative bias z-statistics for the variance estimators. The relative bias of ˆV as an estimator of the variance of ˆθ is calculated as [Var L (ˆθ)] 1 [E L (ˆV ) Var L (ˆθ)], the z-statistic for testing H 0 :E(ˆV ) = Var(ˆθ) is (35) z-statistic = L 1/2 [E L ˆV ) VarL (ˆθ)] {E L [ˆV E L (ˆV ) + Var L (ˆθ) (ˆθ E L (ˆθ)) 2 ] 2 } 1/2.

10 10 J. K. KIM Table 1 Mean, variance the percentage relative efficiency (PRE) of the multiple imputation point estimators under the two different imputation schemes (50,000 samples) Mean Variance PRE Parameter n r/n SW New SW New (%) Mean Slope A heuristic argument for the justification of the z-statistic is made in Appendix D. Under the SW imputation, the relative bias is larger for smaller samples for smaller response rates. The new imputation produces much smaller relative bias for the variance estimator. For large sample sizes both imputation methods produce negligible relative biases of the variance estimators. Table 2 Relative bias (RB) the z-statistic of Rubin s variance estimators under the two different imputation schemes (50,000 samples) RB z-statistic Parameter n r/n SW New SW New Mean Slope

11 MULTIPLE IMPUTATION ESTIMATORS 11 Table 3 Mean length the coverage of 95% confidence intervals under the two different imputation schemes (50,000 samples) Mean length Coverage (%) Parameter n r/n SW New SW New Mean Slope Table 3 displays the mean lengths the coverages of 95% confidence intervals. The confidence intervals are (ˆθ t ˆV, ˆθ+t ˆV ), where t = t0.025,ν ν is computed using the method of Barnard Rubin (1999). The coverages of the confidence intervals are all close to the nominal level. For small sample sizes the confidence intervals based on the new imputation are slightly narrower than the confidence intervals based on the SW method. Interval estimation shows less dramatic results for small sample size than variance estimation. This is partly because the distributions of point estimates are bell-shaped partly because the degrees of freedom of Barnard Rubin (1999) attenuate the effect of small sample bias of the variance estimator. 6. Discussion. We study the mean the covariance structure of the data set generated by the conventional multiple-imputation method under the regression model. Using the mean the covariance structure of the multiply-imputed data set, we investigate the exact bias of Rubin s variance estimator. The bias of Rubin s variance estimator is negligible for large sample sizes, as discussed by SW, but the bias may be sizable for small sample sizes. We propose a simple modified imputation method that is more efficient than the SW method makes Rubin s variance estimator unbiased. When applied to a general class of linear estimators, the proposed method produces more efficient estimates has smaller bias for variance estimation than that of the SW method. In a simulation study we found that the bias of Rubin s variance estimator under the SW method is remarkably large for small sample sizes. The bias of Rubin s variance estimator under the new method is reasonably small in the simulation.

12 12 J. K. KIM In practice, the small sample bias of the multiple imputation variance estimator is of special concern when the scale parameter σ is generated with small degrees of freedom. One such example is stratified sampling, where the sample selection is performed independently across the strata. In a stratified sample the assumption of equal σ across the strata is not a reasonable assumption. Thus, the scale parameters have to be generated independently within each stratum, using only the respondents in the stratum, which often makes the degrees of freedom very small even for a large data set. The new method will significantly reduce the bias in this case. A commonly used imputation model is the cell mean model, where the study variables are assumed homogeneous within each cell. Under the cell mean model Rubin Schenker (1986) considered various multiple imputation methods. Since the imputation is performed separately within each cell, the scale parameters are generated independently within each cell. Thus, the methods considered by Rubin Schenker (1986) are subject to small sample biases. The biases can be significant when there are a small number of respondents within each cell. Recently, Kim (2002) proposed an alternative imputation method of making the variance estimator unbiased under the cell mean model. APPENDIX A Proof of Lemma 2.1. By (3), (A.1) So E(x j β (k) y r) = x j ˆβ r. Cov(Y i,y j(k) ) = Cov {Y i,e(y j(k) y r)} = Cov(Y i,x j ˆβ r ) = x i (X r X r) 1 x j σ 2. Now, by (2), (A.2) where λ = (r p 2) 1 (r p). So (A.3) E(σ 2 (k) ) = E(σ 2 (k) y r) = E(λˆσ 2 r) = λσ 2, Cov(x iβ (k),x jβ (k) y r) = Cov[E(x iβ (k) y r,σ (k) ),E(x jβ (k) y r,σ (k) ) y r], for k s, (A.4) + E[Cov(x i β (k),x j β (k) y r,σ (k) ) y r] = E[x i (X r X r) 1 x j σ 2 (k) y r] = x i (X r X r) 1 x jˆσ 2 r λ, Cov(x i β (k),x j β (s) y r) = 0.

13 Hence, by (A.1) (A.3), (A.5) MULTIPLE IMPUTATION ESTIMATORS 13 Cov(x iβ (k),x jβ (k) ) = Cov[E(x iβ (k) y r),e(x jβ (k) y r)], for k s, by (A.1) (A.4), (A.6) + E[Cov(x iβ (k),x jβ (k) y r)] = Cov(x iˆβ r,x j ˆβ r ) + E{x i (X r X r) 1 x jˆσ 2 r λ} = (1 + λ)x i (X r X r) 1 x j σ 2 Cov(x iβ (k),x jβ (s) ) = E[Cov(x iβ (k),x jβ (s) y r)] By (4) we have, for i j, (A.7) (A.8) + Cov[E(x i β (k) y r),e(x j β (s) y r)] = Cov(x iˆβ r,x j ˆβ r ) = x i(x X) 1 x j σ 2. Cov(Yi(k),Yj(s) ) = Cov(x i β (k),x j β (s) ) { Cov(Yi(k) Var(x,Yj(k) ) = i β (k) ) + E(σ 2 (k)), if i = j, Cov(x i β (k),x j β (k)), if i j. Therefore, inserting (A.2), (A.5) (A.6) into (A.7) (A.8), result (12) follows. APPENDIX B Proof of Theorem 2.1. Before we calculate the variance of the multiple imputation estimator we provide the following matrix identity. Lemma B.1. Let X n be an n p matrix of the form X n = (X r,x n r), where X r is an r p matrix X n r is an (n r) p matrix. Assume that X nx n X rx r are nonsingular. Then (X r X r) 1 = (X n X n) 1 + (X n X n) 1 X n r X n r(x n X n) 1 (B.1) + (X nx n ) 1 X n rx n r (X rx r ) 1 X n rx n r (X nx n ) 1. Proof. Using the identity [e.g., Searle (1982), page 261] (B.2) (D CA 1 B) 1 = D 1 + D 1 C(A BD 1 C) 1 BD 1 with A = I, B = X n r, C = X n r D = X n X n, we have (X r X r) 1 = (X n X n) 1 + (X n X n) 1 X n r [I X n r(x n X n) 1 X n r (B.3) ] 1 X n r (X n X n) 1.

14 14 J. K. KIM Using (B.2) again with A = X nx n, B = X n r, C = X n r D = I, we have (B.4) [I X n r (X n X n) 1 X n r ] 1 = I + X n r (X r X r) 1 X n r. Inserting (B.4) into (B.3), we have (B.1). Note that ˆβ M,n = M 1 M k=1 ˆβI(k),n the ˆβ I(1),n, ˆβ I(2),n,..., ˆβ I(M),n are identically distributed. Thus, (B.5) Var(ˆβ M,n ) = (1 M 1 )Cov(ˆβ I(1),n, ˆβ I(2),n ) + M 1 Var(ˆβ I(1),n ). Define a i = (X nx n ) 1 x i h ij = x i (X rx r ) 1 x j. By (11) (12) (B.6) (B.7) Cov(ˆβ I(1),n, ˆβ r r I(2),n ) = a i a iσ a i h ij a jσ 2 i=1 i=1 j=r+1 + a i h ij a j σ2 r r Var(ˆβ I(1),n ) = a i a i σ2 + 2 a i h ij a j σ2 i=1 i=1 j=r+1 + (1 + λ) a i h ij a j σ2 + λ a i a i σ2. i=r+1 By the definition of a i h ij, we have (B.8) (B.9) (B.10) r a i a i = (X nx n ) 1 X rx r (X nx n ) 1, i=1 r a i h ij a j = (X n X n) 1 X n r X n r(x n X n) 1 i=1 j=r+1 = a i a j a i h ij a j = (X n X n) 1 X n r X n r(x r X r) 1 X n r X n r(x n X n) 1.

15 MULTIPLE IMPUTATION ESTIMATORS 15 Hence, inserting (B.8) (B.10) into (B.6) (B.7), applying X rx r + X n r X n r = X n X n (B.1), we have (B.11) (B.12) Cov(ˆβ I(1),n, ˆβ I(2),n ) = (X r X r) 1 σ 2 Var(ˆβ I(1),n ) = (X r X r) 1 σ 2 + λ[(x r X r) 1 (X n X n) 1 ]σ 2. Thus, (15) is proved by (B.5). To show (17), because the ˆβ I(1),n, ˆβ I(2),n,..., ˆβ I(M),n are identically distributed, (B.13) E(B M,n ) = Var(ˆβ I(1),n ) Cov(ˆβ I(1),nˆβI(2),n ). Thus, (17) is proved by inserting (B.11) (B.12) into (B.13). To show (16), we define Ỹ (k) = (y r,y(k) ) to be the vector of the augmented data set at the kth repeated imputation, where y(k) = (Y r+1(k),y r+2(k),...,y k = 1,2,...,M. Then (n p)ˆσ 2 I(k),n = Ỹ (k) [I n X n (X nx n ) 1 X n]ỹ(k), under the regression model (1), E{Ỹ (k) [I n X n (X nx n ) 1 X n]ỹ(k)} = trace {[I n X n (X nx n ) 1 X n]v (Ỹ (k) )}. By (11) (12) we have ( V (Ỹ (k) ) = I r X n r (X r X r) 1 X r X r (X r X r) 1 X n r (1 + λ)x n r (X r X r) 1 X n r + λi n r Let P x = X n (X n X n) 1 X n Q = V (Ỹ (k) )σ 2 I n. Then (B.14) E{(n p)ˆσ 2 I(k),n } = E{(I P x)(q x + I n )σ 2 } = trace{i n P x }σ 2 + trace{[i n P x ]Q}σ 2. ) σ 2. By the classical regression theory trace{i n P x } = n p. For the second term note that the left-upper r r elements of Q are all zeros. Define C = (X n X n) 1 X n r X n r D = (X r X r) 1 X n r X n r. Then (B.15) trace(q) = trace {(1 + λ)x n r (X r X r) 1 X n r + (λ 1)I n r} = (1 + λ)trace(d) + (λ 1)(n r) trace(p x Q) = 2trace {X n r (X n X n) 1 X n r } (B.16) + (1 + λ)trace {X n r (X n X n) 1 X n r X n r(x r X r) 1 X n r } + (λ 1)trace {X n r (X nx n ) 1 X n r} = (1 + λ)trace(c + CD). n(k) ),

16 16 J. K. KIM Note that, using X rx r + X n rx n r = X nx n, we have (B.17) trace(d CD) = trace {[I (X nx n ) 1 X n rx n r ](X rx r ) 1 X n rx n r } = trace {(X nx n ) 1 X n rx n r } = trace(c). Thus, by applying (B.17) to (B.15) (B.16), we have (B.18) trace(q P x Q) = (λ 1)(n r). Therefore, inserting (B.18) into (B.14), we have (16). APPENDIX C Proof of Theorem 4.1. The variance formula (29) directly follows by applying the multiple imputation variance formula (B.5) to ˆθ M,n using a i = α i in (B.6) (B.7). To show (30), we decompose the total variance into three parts: (C.1) Var(ˆθ M,n ) = Var(ˆθ n ) + Var(ˆθ M,n ˆθ n ) + 2Cov(ˆθ n, ˆθ M,n ˆθ n ). To compute the bias of W M,n as an estimator of Var(ˆθ n ), we first express it as ˆV n = Y nωy n, where Y n = (y r,y n r) Ω = [Ω ij ]; then the withinimputation variance term can be written W M,n = M 1 M k=1 Ỹ (k) ΩỸ(k). By E(Ỹ(k)) = E(Y n ), E(ˆV (k) ) = E(Ỹ (k) )ΩE(Ỹ(k)) + trace {ΩVar(Ỹ(k))} = E(ˆV n ) + trace {Ω[Var(Ỹ(k)) Var(Y n )]}. By the unbiasedness of ˆV n by the covariance structures in (11) (12) we have (C.2) E(W M,n ) Var(ˆθ n ) = 2 r i=1 j=r+1 + (λ 1)σ 2 n Ω ij h ij σ 2 + (σ 2 + λσ 2 ) j=r+1 Ω jj. Ω ij h ij For the B M,n term it can be shown that, using the same argument as for (B.6) (B.7), (C.3) E(B M,n ) = Var(ˆθ I(1),n ) Cov(ˆθ I(1),nˆθI(2),n ) ( n ) = α i α j h ij + λσ 2. α 2 i j=r+1

17 By the covariance structures in (11) MULTIPLE IMPUTATION ESTIMATORS 17 (C.4) ( Cov(ˆθ M,n, ˆθ r r n ) = α 2 i + α i α j h ij )σ 2. i=1 i=1 j=r+1 Thus, by (29), (C.4) Var(ˆθ n ) = n i=1 α 2 i σ2, Var(ˆθ M,n ˆθ n ) = Var(ˆθ M,n ) + Var(ˆθ n ) 2Cov(ˆθ M,n, ˆθ n ) (C.5) ( = (1 + M 1 n n ) λ) α 2 i + α i α j h ij, by (C.3) (C.5), i=r+1 σ 2 (C.6) E[(1 + M 1 )B M,n ] Var(ˆθ M,n ˆθ n ) [ n = (λ 1) α 2 i + n i=r+1 α i α j h ij ]σ 2. For the covariance term in (C.1) note that (C.7) Cov(ˆθ n, ˆθ M,n ˆθ n ) = Cov(ˆθ n, ˆθ,n ˆθ n ) + Cov(ˆθ n, ˆθ M,n ˆθ,n ) = 0, because the first term on the right-h side of the above equality is zero by the congeniality condition (28) the second term is also zero because Cov(ˆθ n, ˆθ M,n ˆθ,n ) = Cov(ˆθ n, ˆθ M,n ) Cov(ˆθ n, ˆθ,n ) = 0, by the fact that ˆθ I(k), k = 1,2,...,M, are identically distributed. Therefore, (30) follows from (C.2), (C.6) (C.7). APPENDIX D D.1. Justification for z-statistic in (35). Let (ˆθ i, ˆV i ), i = 1,2,...,L, be i.i.d. samples from a bivariate distribution G(ˆθ, ˆV ) with second moments. Then E(ˆV ) is unbiasedly estimated by E L (ˆV ) = L 1 L i=1 ˆVi Var(ˆθ) is unbiasedly estimated by (L 1) 1 L E L [(ˆθ E L (ˆθ)) 2 ]. = L 1 L i=1 (ˆθ E L (ˆθ)) 2, where E L (ˆθ) = L 1 L i=1 ˆθi. Thus, by the central limit theorem, (D.1) Z = E L(ˆV ) E L [(ˆθ E L (ˆθ)) 2 ] [E(ˆV ) Var(ˆθ)] Var{E L (ˆV ) E L [(ˆθ E L (ˆθ)) 2 ]}

18 18 J. K. KIM converges to a N(0,1) distribution as L. As E L (ˆV ) E L [(ˆθ E L (ˆθ)) 2 ] = L 1 L i=1 [ˆV i (ˆθ i E L (ˆθ)) 2 ], the variance term in the denominator of (D.1) is consistently estimated by L 1 E L {[ˆV (ˆθ E L (ˆθ)) 2 E L [ˆV (ˆθ E L (ˆθ)) 2 ]] 2 }. = L 1 E L {[ˆV (ˆθ E L (ˆθ)) 2 E L (ˆV ) + Var L (ˆθ)] 2 }. Thus, using Slutsky s theorem, the z-statistic in (35) converges to a N(0, 1) distribution under H 0 :E(ˆV ) = Var(ˆθ) as L. Acknowledgments. The author thanks Wayne Fuller, Lou Rizzo the referees for their useful comments suggestions. REFERENCES Barnard, J. Rubin, D. B. (1999). Small-sample degrees of freedom with multiple imputation. Biometrika MR Kim, J. K. (2002). A note on approximate Bayesian bootstrap imputation. Biometrika MR Meng, X.-L. (1994). Multiple-imputation inferences with uncongenial sources of input (with discussion). Statist. Sci Meng, X.-L. Zaslavsky, A. M. (2002). Single observation unbiased priors. Ann. Statist MR Rubin, D. B. (1976). Inference missing data (with discussion). Biometrika MR Rubin, D. B. (1978). Multiple imputations in sample surveys: A phenomenological Bayesian approach to nonresponse (with discussion). In Proc. Survey Research Methods Section Amer. Statist. Assoc., Alexria, VA. Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. Wiley, New York. MR Rubin, D. B. Schenker, N. (1986). Multiple imputation for interval estimation from simple rom samples with ignorable nonresponse. J. Amer. Statist. Assoc MR Schenker, N. Welsh, A. H. (1988). Asymptotic results for multiple imputation. Ann. Statist MR Searle, S. R. (1982). Matrix Algebra Useful for Statistics. Wiley, New York. MR Wang, N. Robins, J. M. (1998). Large sample theory for parametric multiple imputation procedures. Biometrika MR Department of Applied Statistics Yonsei University 134 Sinchon-dong, Seodaemun-gu Seoul Korea kimj@yonsei.ac.kr

A note on multiple imputation for general purpose estimation

A note on multiple imputation for general purpose estimation A note on multiple imputation for general purpose estimation Shu Yang Jae Kwang Kim SSC meeting June 16, 2015 Shu Yang, Jae Kwang Kim Multiple Imputation June 16, 2015 1 / 32 Introduction Basic Setup Assume

More information

On the bias of the multiple-imputation variance estimator in survey sampling

On the bias of the multiple-imputation variance estimator in survey sampling J. R. Statist. Soc. B (2006) 68, Part 3, pp. 509 521 On the bias of the multiple-imputation variance estimator in survey sampling Jae Kwang Kim, Yonsei University, Seoul, Korea J. Michael Brick, Westat,

More information

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction

More information

Fractional Imputation in Survey Sampling: A Comparative Review

Fractional Imputation in Survey Sampling: A Comparative Review Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical

More information

Nonresponse weighting adjustment using estimated response probability

Nonresponse weighting adjustment using estimated response probability Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006 Introduction Nonresponse Unit nonresponse Item nonresponse Basic strategy

More information

Parametric fractional imputation for missing data analysis

Parametric fractional imputation for missing data analysis 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????),??,?, pp. 1 15 C???? Biometrika Trust Printed in

More information

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Statistica Sinica 24 (2014), 1001-1015 doi:http://dx.doi.org/10.5705/ss.2013.038 INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Seunghwan Park and Jae Kwang Kim Seoul National Univeristy

More information

Combining data from two independent surveys: model-assisted approach

Combining data from two independent surveys: model-assisted approach Combining data from two independent surveys: model-assisted approach Jae Kwang Kim 1 Iowa State University January 20, 2012 1 Joint work with J.N.K. Rao, Carleton University Reference Kim, J.K. and Rao,

More information

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University Statistica Sinica 27 (2017), 000-000 doi:https://doi.org/10.5705/ss.202016.0155 DISCUSSION: DISSECTING MULTIPLE IMPUTATION FROM A MULTI-PHASE INFERENCE PERSPECTIVE: WHAT HAPPENS WHEN GOD S, IMPUTER S AND

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Miscellanea A note on multiple imputation under complex sampling

Miscellanea A note on multiple imputation under complex sampling Biometrika (2017), 104, 1,pp. 221 228 doi: 10.1093/biomet/asw058 Printed in Great Britain Advance Access publication 3 January 2017 Miscellanea A note on multiple imputation under complex sampling BY J.

More information

Chapter 8: Estimation 1

Chapter 8: Estimation 1 Chapter 8: Estimation 1 Jae-Kwang Kim Iowa State University Fall, 2014 Kim (ISU) Ch. 8: Estimation 1 Fall, 2014 1 / 33 Introduction 1 Introduction 2 Ratio estimation 3 Regression estimator Kim (ISU) Ch.

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Statistical Methods for Handling Missing Data

Statistical Methods for Handling Missing Data Statistical Methods for Handling Missing Data Jae-Kwang Kim Department of Statistics, Iowa State University July 5th, 2014 Outline Textbook : Statistical Methods for handling incomplete data by Kim and

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

Asymptotic Normality under Two-Phase Sampling Designs

Asymptotic Normality under Two-Phase Sampling Designs Asymptotic Normality under Two-Phase Sampling Designs Jiahua Chen and J. N. K. Rao University of Waterloo and University of Carleton Abstract Large sample properties of statistical inferences in the context

More information

Fractional hot deck imputation

Fractional hot deck imputation Biometrika (2004), 91, 3, pp. 559 578 2004 Biometrika Trust Printed in Great Britain Fractional hot deck imputation BY JAE KWANG KM Department of Applied Statistics, Yonsei University, Seoul, 120-749,

More information

Likelihood-based inference with missing data under missing-at-random

Likelihood-based inference with missing data under missing-at-random Likelihood-based inference with missing data under missing-at-random Jae-kwang Kim Joint work with Shu Yang Department of Statistics, Iowa State University May 4, 014 Outline 1. Introduction. Parametric

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Data Integration for Big Data Analysis for finite population inference

Data Integration for Big Data Analysis for finite population inference for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation

More information

Multivariate Regression Analysis

Multivariate Regression Analysis Matrices and vectors The model from the sample is: Y = Xβ +u with n individuals, l response variable, k regressors Y is a n 1 vector or a n l matrix with the notation Y T = (y 1,y 2,...,y n ) 1 x 11 x

More information

Pooling multiple imputations when the sample happens to be the population.

Pooling multiple imputations when the sample happens to be the population. Pooling multiple imputations when the sample happens to be the population. Gerko Vink 1,2, and Stef van Buuren 1,3 arxiv:1409.8542v1 [math.st] 30 Sep 2014 1 Department of Methodology and Statistics, Utrecht

More information

Two-phase sampling approach to fractional hot deck imputation

Two-phase sampling approach to fractional hot deck imputation Two-phase sampling approach to fractional hot deck imputation Jongho Im 1, Jae-Kwang Kim 1 and Wayne A. Fuller 1 Abstract Hot deck imputation is popular for handling item nonresponse in survey sampling.

More information

Accounting for Complex Sample Designs via Mixture Models

Accounting for Complex Sample Designs via Mixture Models Accounting for Complex Sample Designs via Finite Normal Mixture Models 1 1 University of Michigan School of Public Health August 2009 Talk Outline 1 2 Accommodating Sampling Weights in Mixture Models 3

More information

Chapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70

Chapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Chapter 5: Models used in conjunction with sampling J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Nonresponse Unit Nonresponse: weight adjustment Item Nonresponse:

More information

Economics 583: Econometric Theory I A Primer on Asymptotics

Economics 583: Econometric Theory I A Primer on Asymptotics Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:

More information

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)

More information

ASYMPTOTIC NORMALITY UNDER TWO-PHASE SAMPLING DESIGNS

ASYMPTOTIC NORMALITY UNDER TWO-PHASE SAMPLING DESIGNS Statistica Sinica 17(2007), 1047-1064 ASYMPTOTIC NORMALITY UNDER TWO-PHASE SAMPLING DESIGNS Jiahua Chen and J. N. K. Rao University of British Columbia and Carleton University Abstract: Large sample properties

More information

Bootstrap inference for the finite population total under complex sampling designs

Bootstrap inference for the finite population total under complex sampling designs Bootstrap inference for the finite population total under complex sampling designs Zhonglei Wang (Joint work with Dr. Jae Kwang Kim) Center for Survey Statistics and Methodology Iowa State University Jan.

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

Chapter 4: Imputation

Chapter 4: Imputation Chapter 4: Imputation Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Basic Theory for imputation 3 Variance estimation after imputation 4 Replication variance estimation

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

Regression and Statistical Inference

Regression and Statistical Inference Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF

More information

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2

More information

Review of probability and statistics 1 / 31

Review of probability and statistics 1 / 31 Review of probability and statistics 1 / 31 2 / 31 Why? This chapter follows Stock and Watson (all graphs are from Stock and Watson). You may as well refer to the appendix in Wooldridge or any other introduction

More information

Introductory Econometrics

Introductory Econometrics Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna November 23, 2013 Outline Introduction

More information

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8 Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

36. Multisample U-statistics and jointly distributed U-statistics Lehmann 6.1

36. Multisample U-statistics and jointly distributed U-statistics Lehmann 6.1 36. Multisample U-statistics jointly distributed U-statistics Lehmann 6.1 In this topic, we generalize the idea of U-statistics in two different directions. First, we consider single U-statistics for situations

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

STAT5044: Regression and Anova. Inyoung Kim

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 51 Outline 1 Matrix Expression 2 Linear and quadratic forms 3 Properties of quadratic form 4 Properties of estimates 5 Distributional properties 3 / 51 Matrix

More information

Central Limit Theorem ( 5.3)

Central Limit Theorem ( 5.3) Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately

More information

Introduction to Estimation Methods for Time Series models. Lecture 1

Introduction to Estimation Methods for Time Series models. Lecture 1 Introduction to Estimation Methods for Time Series models Lecture 1 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 1 SNS Pisa 1 / 19 Estimation

More information

Inference with Imputed Conditional Means

Inference with Imputed Conditional Means Inference with Imputed Conditional Means Joseph L. Schafer and Nathaniel Schenker June 4, 1997 Abstract In this paper, we develop analytic techniques that can be used to produce appropriate inferences

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

ECON 3150/4150, Spring term Lecture 6

ECON 3150/4150, Spring term Lecture 6 ECON 3150/4150, Spring term 2013. Lecture 6 Review of theoretical statistics for econometric modelling (II) Ragnar Nymoen University of Oslo 31 January 2013 1 / 25 References to Lecture 3 and 6 Lecture

More information

VARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA

VARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA Submitted to the Annals of Applied Statistics VARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA By Jae Kwang Kim, Wayne A. Fuller and William R. Bell Iowa State University

More information

Robustness to Parametric Assumptions in Missing Data Models

Robustness to Parametric Assumptions in Missing Data Models Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice

More information

STAT 512 sp 2018 Summary Sheet

STAT 512 sp 2018 Summary Sheet STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}

More information

Bias Variance Trade-off

Bias Variance Trade-off Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]

More information

The propensity score with continuous treatments

The propensity score with continuous treatments 7 The propensity score with continuous treatments Keisuke Hirano and Guido W. Imbens 1 7.1 Introduction Much of the work on propensity score analysis has focused on the case in which the treatment is binary.

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

3 Multiple Linear Regression

3 Multiple Linear Regression 3 Multiple Linear Regression 3.1 The Model Essentially, all models are wrong, but some are useful. Quote by George E.P. Box. Models are supposed to be exact descriptions of the population, but that is

More information

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated A Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

More information

BOOTSTRAPPING SAMPLE QUANTILES BASED ON COMPLEX SURVEY DATA UNDER HOT DECK IMPUTATION

BOOTSTRAPPING SAMPLE QUANTILES BASED ON COMPLEX SURVEY DATA UNDER HOT DECK IMPUTATION Statistica Sinica 8(998), 07-085 BOOTSTRAPPING SAMPLE QUANTILES BASED ON COMPLEX SURVEY DATA UNDER HOT DECK IMPUTATION Jun Shao and Yinzhong Chen University of Wisconsin-Madison Abstract: The bootstrap

More information

Does low participation in cohort studies induce bias? Additional material

Does low participation in cohort studies induce bias? Additional material Does low participation in cohort studies induce bias? Additional material Content: Page 1: A heuristic proof of the formula for the asymptotic standard error Page 2-3: A description of the simulation study

More information

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

LECTURE 2 LINEAR REGRESSION MODEL AND OLS SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another

More information

1 One-way analysis of variance

1 One-way analysis of variance LIST OF FORMULAS (Version from 21. November 2014) STK2120 1 One-way analysis of variance Assume X ij = µ+α i +ɛ ij ; j = 1, 2,..., J i ; i = 1, 2,..., I ; where ɛ ij -s are independent and N(0, σ 2 ) distributed.

More information

KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS

KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS Bull. Korean Math. Soc. 5 (24), No. 3, pp. 7 76 http://dx.doi.org/34/bkms.24.5.3.7 KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS Yicheng Hong and Sungchul Lee Abstract. The limiting

More information

Bayesian Linear Regression

Bayesian Linear Regression Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective

More information

Ch 3: Multiple Linear Regression

Ch 3: Multiple Linear Regression Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

More information

Recent Advances in the analysis of missing data with non-ignorable missingness

Recent Advances in the analysis of missing data with non-ignorable missingness Recent Advances in the analysis of missing data with non-ignorable missingness Jae-Kwang Kim Department of Statistics, Iowa State University July 4th, 2014 1 Introduction 2 Full likelihood-based ML estimation

More information

A measurement error model approach to small area estimation

A measurement error model approach to small area estimation A measurement error model approach to small area estimation Jae-kwang Kim 1 Spring, 2015 1 Joint work with Seunghwan Park and Seoyoung Kim Ouline Introduction Basic Theory Application to Korean LFS Discussion

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

Preliminary Statistics. Lecture 3: Probability Models and Distributions

Preliminary Statistics. Lecture 3: Probability Models and Distributions Preliminary Statistics Lecture 3: Probability Models and Distributions Rory Macqueen (rm43@soas.ac.uk), September 2015 Outline Revision of Lecture 2 Probability Density Functions Cumulative Distribution

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is

More information

MISSING or INCOMPLETE DATA

MISSING or INCOMPLETE DATA MISSING or INCOMPLETE DATA A (fairly) complete review of basic practice Don McLeish and Cyntha Struthers University of Waterloo Dec 5, 2015 Structure of the Workshop Session 1 Common methods for dealing

More information

Simple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.

Simple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation. Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth

More information

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector

More information

Inference in Regression Analysis

Inference in Regression Analysis Inference in Regression Analysis Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 4, Slide 1 Today: Normal Error Regression Model Y i = β 0 + β 1 X i + ǫ i Y i value

More information

Linear Regression. Junhui Qian. October 27, 2014

Linear Regression. Junhui Qian. October 27, 2014 Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency

More information

Inferences on missing information under multiple imputation and two-stage multiple imputation

Inferences on missing information under multiple imputation and two-stage multiple imputation p. 1/4 Inferences on missing information under multiple imputation and two-stage multiple imputation Ofer Harel Department of Statistics University of Connecticut Prepared for the Missing Data Approaches

More information

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a

More information

REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLES

REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLES Statistica Sinica 8(1998), 1153-1164 REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLES Wayne A. Fuller Iowa State University Abstract: The estimation of the variance of the regression estimator for

More information

6. Fractional Imputation in Survey Sampling

6. Fractional Imputation in Survey Sampling 6. Fractional Imputation in Survey Sampling 1 Introduction Consider a finite population of N units identified by a set of indices U = {1, 2,, N} with N known. Associated with each unit i in the population

More information

MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators

MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators Thilo Klein University of Cambridge Judge Business School Session 4: Linear regression,

More information

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015 Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)

More information

Introduction to Estimation Methods for Time Series models Lecture 2

Introduction to Estimation Methods for Time Series models Lecture 2 Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:

More information

Quick Review on Linear Multiple Regression

Quick Review on Linear Multiple Regression Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,

More information

Statistical Inference with Monotone Incomplete Multivariate Normal Data

Statistical Inference with Monotone Incomplete Multivariate Normal Data Statistical Inference with Monotone Incomplete Multivariate Normal Data p. 1/4 Statistical Inference with Monotone Incomplete Multivariate Normal Data This talk is based on joint work with my wonderful

More information

Basic Distributional Assumptions of the Linear Model: 1. The errors are unbiased: E[ε] = The errors are uncorrelated with common variance:

Basic Distributional Assumptions of the Linear Model: 1. The errors are unbiased: E[ε] = The errors are uncorrelated with common variance: 8. PROPERTIES OF LEAST SQUARES ESTIMATES 1 Basic Distributional Assumptions of the Linear Model: 1. The errors are unbiased: E[ε] = 0. 2. The errors are uncorrelated with common variance: These assumptions

More information

1 Data Arrays and Decompositions

1 Data Arrays and Decompositions 1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is

More information

Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood

Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood Regression Estimation - Least Squares and Maximum Likelihood Dr. Frank Wood Least Squares Max(min)imization Function to minimize w.r.t. β 0, β 1 Q = n (Y i (β 0 + β 1 X i )) 2 i=1 Minimize this by maximizing

More information

ANALYSIS OF VARIANCE AND QUADRATIC FORMS

ANALYSIS OF VARIANCE AND QUADRATIC FORMS 4 ANALYSIS OF VARIANCE AND QUADRATIC FORMS The previous chapter developed the regression results involving linear functions of the dependent variable, β, Ŷ, and e. All were shown to be normally distributed

More information

Homoskedasticity. Var (u X) = σ 2. (23)

Homoskedasticity. Var (u X) = σ 2. (23) Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This

More information

Statistics 203: Introduction to Regression and Analysis of Variance Penalized models

Statistics 203: Introduction to Regression and Analysis of Variance Penalized models Statistics 203: Introduction to Regression and Analysis of Variance Penalized models Jonathan Taylor - p. 1/15 Today s class Bias-Variance tradeoff. Penalized regression. Cross-validation. - p. 2/15 Bias-variance

More information

A weighted simulation-based estimator for incomplete longitudinal data models

A weighted simulation-based estimator for incomplete longitudinal data models To appear in Statistics and Probability Letters, 113 (2016), 16-22. doi 10.1016/j.spl.2016.02.004 A weighted simulation-based estimator for incomplete longitudinal data models Daniel H. Li 1 and Liqun

More information

Combining Non-probability and Probability Survey Samples Through Mass Imputation

Combining Non-probability and Probability Survey Samples Through Mass Imputation Combining Non-probability and Probability Survey Samples Through Mass Imputation Jae-Kwang Kim 1 Iowa State University & KAIST October 27, 2018 1 Joint work with Seho Park, Yilin Chen, and Changbao Wu

More information

Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics

Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics Data from one or a series of random experiments are collected. Planning experiments and collecting data (not discussed here). Analysis:

More information

LIST OF FORMULAS FOR STK1100 AND STK1110

LIST OF FORMULAS FOR STK1100 AND STK1110 LIST OF FORMULAS FOR STK1100 AND STK1110 (Version of 11. November 2015) 1. Probability Let A, B, A 1, A 2,..., B 1, B 2,... be events, that is, subsets of a sample space Ω. a) Axioms: A probability function

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) The Simple Linear Regression Model based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #2 The Simple

More information

Terminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1

Terminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1 Estimation Theory Overview Properties Bias, Variance, and Mean Square Error Cramér-Rao lower bound Maximum likelihood Consistency Confidence intervals Properties of the mean estimator Properties of the

More information

Linear Models and Estimation by Least Squares

Linear Models and Estimation by Least Squares Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:

More information

Bootstrap, Jackknife and other resampling methods

Bootstrap, Jackknife and other resampling methods Bootstrap, Jackknife and other resampling methods Part III: Parametric Bootstrap Rozenn Dahyot Room 128, Department of Statistics Trinity College Dublin, Ireland dahyot@mee.tcd.ie 2005 R. Dahyot (TCD)

More information

Sampling Distributions

Sampling Distributions Merlise Clyde Duke University September 8, 2016 Outline Topics Normal Theory Chi-squared Distributions Student t Distributions Readings: Christensen Apendix C, Chapter 1-2 Prostate Example > library(lasso2);

More information

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester Physics 403 Parameter Estimation, Correlations, and Error Bars Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Best Estimates and Reliability

More information

Large Sample Properties of Estimators in the Classical Linear Regression Model

Large Sample Properties of Estimators in the Classical Linear Regression Model Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in

More information