Parametric fractional imputation for missing data analysis


Biometrika (????), ??, ?, pp. © Biometrika Trust. Printed in Great Britain

Parametric fractional imputation for missing data analysis

BY JAE KWANG KIM
Department of Statistics, Iowa State University, Ames, Iowa 50011, U.S.A.
jkim@iastate.edu

SUMMARY

Imputation is a popular method of handling missing data in practice. Imputation, when carefully done, can facilitate parameter estimation by allowing complete-sample estimators to be applied to the imputed dataset. The basic idea is to generate the imputed values from the conditional distribution of the missing data given the observed data. In this article, parametric fractional imputation is proposed for generating imputed values. Using fractional weights, the observed likelihood can be approximated by the weighted mean of the imputed-data likelihood. Some computational efficiency can be achieved using the ideas of importance sampling and calibration weighting. The proposed imputation method provides efficient parameter estimates for the model parameters specified in the imputation model and also provides reasonable estimates for parameters that are not part of the imputation model. Variance estimation is covered and results from a limited simulation study are presented.

Some key words: EM algorithm; General purpose estimation; Importance sampling; Monte Carlo EM; Multiple imputation.

1. INTRODUCTION

Suppose that $y_1, y_2, \ldots, y_n$ are the $p$-dimensional observations for a probability sample selected from a finite population, where the finite population is a realization from a distribution $F_0(y)$. Suppose that, under complete response, a parameter of interest, denoted by $\eta_g = \int g(y)\, dF_0(y)$, is estimated by
$$\hat\eta_g = \sum_{i=1}^n w_i g(y_i) \qquad (1)$$
for some function $g(y_i)$, where $w_i$ is the sampling weight. Under simple random sampling, the sampling weight is $1/n$ and the sample can be regarded as a random sample from an infinite population with distribution $F_0(y)$.
In (1), different choices of the g-function lead to estimators of different parameters. For example, $g(y_i) = y_i$ can be used to estimate the population mean, while $g(y_i) = I(y_i < c)$ corresponds to the population proportion of $y$ less than $c$, where $I(y < c)$ takes the value one if $y < c$ and zero otherwise. The existence of an unbiased estimator (1) for any g-function is very attractive for general-purpose estimation, where different users want to estimate different parameters from the same sample, as with public-access data. That is, in general-purpose estimation, the parameter $\eta_g$ is unknown at the time of data collection.
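The general-purpose estimator (1) can be sketched in a few lines; the data values below are made up for illustration, with $w_i = 1/n$ as under simple random sampling:

```python
import numpy as np

# Hypothetical complete-response sample under simple random sampling,
# where each sampling weight is w_i = 1/n.  Values are illustrative only.
y = np.array([1.2, 0.5, 2.3, 1.8, 0.9, 3.1])
w = np.full(len(y), 1.0 / len(y))

def eta_hat(w, y, g):
    """General-purpose estimator (1): sum_i w_i g(y_i)."""
    return np.sum(w * g(y))

mean_hat = eta_hat(w, y, lambda y: y)           # g(y) = y: population mean
prop_hat = eta_hat(w, y, lambda y: (y < 2.0))   # g(y) = I(y < c): proportion
```

The same weighted sum serves every user's g-function, which is the point of general-purpose estimation.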

Given missing data, we cannot directly compute the estimator (1). Instead, one can use
$$\hat\eta_{g,R} = \sum_{i=1}^n w_i E\{g(y_i) \mid y_{i,\mathrm{obs}}\}, \qquad (2)$$
where $(y_{i,\mathrm{obs}}, y_{i,\mathrm{mis}})$ denote the observed part and missing part of $y_i$, respectively. To simplify the presentation, we assume the sampling mechanism and the response mechanism are ignorable in the sense of Rubin (1976). To compute the conditional expectation in (2), we need a correct specification of the conditional distribution of $y_{i,\mathrm{mis}}$ given $y_{i,\mathrm{obs}}$. We adopt a parametric approach in the sense that we assume a fully parametric model for $y_i$, so that the conditional distribution of $y_{i,\mathrm{mis}}$ given $y_{i,\mathrm{obs}}$ can be derived from the joint distribution for an arbitrary missing pattern. Let $F_0 \in \{F_\theta;\ \theta \in \Omega\}$ be the parametric model indexed by $\theta$. Then the conditional expectation in (2) depends on $\theta_0$, where $\theta_0$ is the true parameter value corresponding to $F_0$. That is, $E\{g(y_i) \mid y_{i,\mathrm{obs}}\} = E\{g(y_i) \mid y_{i,\mathrm{obs}}, \theta_0\}$.

To compute the conditional expectation in (2), a Monte Carlo approximation based on the imputed data can be used. Thus, one can interpret imputation as a Monte Carlo approximation of the conditional expectation given the observed data. Imputation is very attractive in practice because, once the imputed data are created, the data analyst does not need to know the conditional distribution in (2). Monte Carlo methods of approximating the conditional expectation in (2) can be placed in two classes:

1. Bayesian approach: generate the imputed values from the posterior predictive distribution of $y_{i,\mathrm{mis}}$ given $y_{i,\mathrm{obs}}$:
$$f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}) = \int f(y_{i,\mathrm{mis}} \mid \theta, y_{i,\mathrm{obs}})\, f(\theta \mid y_{i,\mathrm{obs}})\, d\theta. \qquad (3)$$
This is essentially the approach used in multiple imputation as proposed by Rubin (1987).

2. Frequentist approach: generate the imputed values from the conditional distribution $f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}, \hat\theta)$ with an estimated value $\hat\theta$.
In the Bayesian approach to imputation, convergence to a stable posterior predictive distribution (3) is difficult to check (Gelman et al., 1996). Also, the variance estimator used in multiple imputation is not consistent for some estimated parameters; for examples, see Wang & Robins (1998) and Kim et al. (2006). The frequentist approach to imputation has received less attention than the Bayesian approach. One notable exception is Wang & Robins (1998), who studied the asymptotic properties of multiple imputation and of a parametric frequentist imputation procedure. Wang & Robins (1998) considered the estimated parameter $\hat\theta$ to be given and did not discuss parameter estimation. We consider frequentist imputation given a parametric model for the original distribution. Using the idea of importance sampling, we propose a frequentist imputation method that can be implemented as the fractional imputation discussed by Kim & Fuller (2004). The proposed fractional imputation method can be further modified to remove the Monte Carlo error without increasing the size of the Monte Carlo samples. The proposed method uses the calibration technique to obtain the exact maximum likelihood estimator with a modest imputation size. In § 2, the parametric fractional imputation method is described. Asymptotic properties of the fractionally imputed estimator are discussed in § 3. Maximum likelihood estimation using fractional imputation is discussed in § 4. Results from a limited simulation study are presented in § 5. Concluding remarks are made in § 6.

2. FRACTIONAL IMPUTATION

As discussed in § 1, we consider an approximation for the conditional expectation in (2) using fractional imputation. In fractional imputation, $M (> 1)$ imputed values of $y_{i,\mathrm{mis}}$, say $y_{i,\mathrm{mis}}^{(1)}, \ldots, y_{i,\mathrm{mis}}^{(M)}$, are generated and assigned fractional weights, $w_{i1}^*, \ldots, w_{iM}^*$, so that
$$\sum_{j=1}^M w_{ij}^* g(y_{ij}^*) \cong E\{g(y_i) \mid y_{i,\mathrm{obs}}, \hat\theta\}, \qquad (4)$$
where $y_{ij}^* = (y_{i,\mathrm{obs}}, y_{i,\mathrm{mis}}^{(j)})$, holds at least asymptotically for each $i$, where $\hat\theta$ is a consistent estimator of $\theta_0$. A popular choice for $\hat\theta$ is the (pseudo) maximum likelihood estimator, which can be obtained by finding the $\theta$ that maximizes the (pseudo) log-likelihood function. That is,
$$\hat\theta = \arg\max_{\theta \in \Omega} \sum_{i=1}^n w_i \ln f_{\mathrm{obs}(i)}(y_{i,\mathrm{obs}}; \theta), \qquad (5)$$
where $f_{\mathrm{obs}(i)}(y_{i,\mathrm{obs}}; \theta) = \int f(y_i; \theta)\, dy_{i,\mathrm{mis}}$ is the marginal density of $y_{i,\mathrm{obs}}$. The subscript $i$ is used in the marginal density because the missing pattern can differ across $i$. A computationally simple method of finding the maximum likelihood estimator is discussed in § 4.

Condition (4) applied to $g(y_i) = c$ implies that
$$\sum_{j=1}^M w_{ij}^* = 1 \qquad (6)$$
for all $i$. From the fractionally imputed data satisfying (4) and (6), the parameter $\eta_g$ can be estimated by
$$\hat\eta_{FI,g} = \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij}^* g(y_{ij}^*). \qquad (7)$$
Note that the imputed estimator (7) is obtained by applying formula (1) to the $y_{ij}^*$ as observations with weights $w_i w_{ij}^*$. The size of the imputed dataset can be as large as $nM$. For a single parameter $\eta_g = E\{g(y)\}$, any fractional imputation satisfying (4) can provide a consistent estimator of $\eta_g$. For general-purpose estimation, the g-function in $\eta_g$ is unknown at the time of imputation (Fay, 1992). In this case, we cannot expect (4) to hold for every g-function based on a finite number of imputations, with the exception of categorical data. For categorical data, there are a finite number of possible values for $y_{i,\mathrm{mis}}$.
In this case, we take the possible values as the imputed values and compute the conditional probability of $y_{i,\mathrm{mis}}$ by the Bayes formula
$$p(y_{i,\mathrm{mis}}^{(j)} \mid y_{i,\mathrm{obs}}, \hat\theta) = \frac{f(y_{ij}^*; \hat\theta)}{\sum_{k=1}^{M_i} f(y_{ik}^*; \hat\theta)},$$
where $f(y_i; \theta)$ is the joint density of $y_i$ evaluated at $\theta$ and $M_i$ is the number of possible values of $y_{i,\mathrm{mis}}$. Note that the choice $w_{ij}^* = p(y_{i,\mathrm{mis}}^{(j)} \mid y_{i,\mathrm{obs}}, \hat\theta)$ satisfies (4) and (6). Fractional imputation for categorical data using $w_{ij}^* = p(y_{i,\mathrm{mis}}^{(j)} \mid y_{i,\mathrm{obs}}, \hat\theta)$, which is close in spirit to the EM by weighting method of Ibrahim (1990), is discussed in Kim & Rao (2009).

For a continuous random variable $y_i$, condition (4) can be approximately satisfied using importance sampling, where $y_{i,\mathrm{mis}}^{(1)}, \ldots, y_{i,\mathrm{mis}}^{(M)}$ are independently generated from a distribution

with density $h(y_{i,\mathrm{mis}})$ which has the same support as $f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}, \theta)$ for all $\theta \in \Omega$, and the corresponding fractional weights are computed by
$$w_{ij0}^* = w_{ij0}^*(\hat\theta) = C_i\, \frac{f(y_{i,\mathrm{mis}}^{(j)} \mid y_{i,\mathrm{obs}}; \hat\theta)}{h(y_{i,\mathrm{mis}}^{(j)})}, \qquad (8)$$
where $C_i$ is chosen to satisfy (6). If $h(y_{i,\mathrm{mis}}) = f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}, \hat\theta)$ is used, then $w_{ij0}^* = M^{-1}$.

REMARK 1. Under mild conditions, $\bar g_i^* = \sum_{j=1}^M w_{ij0}^* g(y_{ij}^*)$ with $w_{ij0}^*$ in (8) converges to $\bar g_i(\hat\theta) \equiv E\{g(y_i) \mid y_{i,\mathrm{obs}}, \hat\theta\}$ with probability 1 as $M \to \infty$, and has approximate variance $\sigma_i^2/M$, where
$$\sigma_i^2 = E\left[ \{g(y_i) - \bar g_i(\hat\theta)\}^2\, \frac{f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}, \hat\theta)}{h(y_{i,\mathrm{mis}})} \,\Big|\, y_{i,\mathrm{obs}}, \hat\theta \right].$$
The optimal choice of $h(y_{i,\mathrm{mis}})$ that minimizes $\sigma_i^2$ is
$$h^*(y_{i,\mathrm{mis}}) = \frac{f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}, \hat\theta)\, \big|g(y_i) - \bar g_i(\hat\theta)\big|}{E\{|g(y_i) - \bar g_i(\hat\theta)| \mid y_{i,\mathrm{obs}}, \hat\theta\}}. \qquad (9)$$
When the g-function is unknown, $h(y_{i,\mathrm{mis}}) = f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}, \hat\theta)$ is a reasonable choice in terms of statistical efficiency. However, other choices of $h(y_{i,\mathrm{mis}})$ can have better computational efficiency in some situations.

For public-access data, creating the fractional imputation with a large imputation size is not feasible. We propose an approximation with a moderate imputation size, say $M = 10$. To formally describe the procedure, let $y_{i,\mathrm{mis}}^{(1)}, \ldots, y_{i,\mathrm{mis}}^{(M)}$ be independently generated from a distribution with density $h(y_{i,\mathrm{mis}})$. Given the imputed values, it remains to compute fractional weights that satisfy (4) and (6) as closely as possible. The proposed fractional weights are computed in two steps. In the first step, the initial fractional weights are computed by (8). In the second step, the initial fractional weights are adjusted to satisfy (6) and
$$\sum_{i=1}^n w_i \sum_{j=1}^M w_{ij}^*\, s(\hat\theta; y_{ij}^*) = 0, \qquad (10)$$
where $s(\theta; y) = \partial \ln f(y; \theta)/\partial \theta$ is the score function of $\theta$. Adjusting the initial weights to satisfy a constraint is often called calibration. As can be seen in § 3, constraint (10) is a calibration constraint making the resulting imputed estimator $\hat\eta_{FI,g}$ in (7) fully efficient for $\eta_g$ when $\eta_g$ is a linear function of $\theta$.
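The initial importance-sampling weights (8) can be sketched numerically. The following is a minimal illustration, using the Remark 2 identity that only density ratios are needed: a single unit whose target conditional distribution is assumed to be $N(0.5, 1)$ while the proposal $h$ is $N(0, 2^2)$; all numbers are hypothetical:

```python
import numpy as np

# Initial fractional weights (8) by importance sampling for one unit i.
# Assumed for illustration: target f(y_mis | y_obs, theta_hat) = N(0.5, 1),
# proposal h = N(0, 2^2).  Normalising constants cancel because C_i rescales
# the weights to sum to one, enforcing (6).
rng = np.random.default_rng(1)
M = 1000
y_mis = rng.normal(0.0, 2.0, size=M)             # draws from h

num = np.exp(-0.5 * (y_mis - 0.5) ** 2)          # ∝ target density
den = np.exp(-0.5 * (y_mis / 2.0) ** 2) / 2.0    # ∝ proposal density
ratio = num / den
fw0 = ratio / ratio.sum()                        # w*_ij0, summing to one

# self-normalised estimate of E{g(y) | y_obs, theta_hat} for g(y) = y,
# which should be close to the target mean 0.5 for large M
g_bar = np.sum(fw0 * y_mis)
```

With $h$ equal to the target, the ratio is constant and every $w_{ij0}^*$ collapses to $M^{-1}$, as noted below (8).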
To construct the fractional weights satisfying (6) and (10), the regression weighting technique or the empirical likelihood technique can be used. For example, in the regression weighting technique, the fractional weights are constructed by
$$w_{ij}^* = w_{ij0}^* - \Big( \sum_{i=1}^n w_i \bar s_i^* \Big)^{\mathrm T} \Big\{ \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij0}^* ( s_{ij}^* - \bar s_i^* )^{\otimes 2} \Big\}^{-1} w_{ij0}^* ( s_{ij}^* - \bar s_i^* ), \qquad (11)$$
where $w_{ij0}^*$ is the initial fractional weight (8) using importance sampling, $\bar s_i^* = \sum_{j=1}^M w_{ij0}^* s_{ij}^*$, $B^{\otimes 2} = B B^{\mathrm T}$, and $s_{ij}^* = s(\hat\theta; y_{ij}^*)$. Here, $M$ need not be large.
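The regression-weighting adjustment can be sketched for a scalar score. The scores below are random stand-ins for $s(\hat\theta; y_{ij}^*)$; the point is that the adjusted weights satisfy the calibration constraint (10) exactly while still summing to one within each unit:

```python
import numpy as np

# Regression-weighting step for a scalar score: adjust the initial fractional
# weights so that the imputed score equation (10) holds exactly.
# The scores s are illustrative stand-ins, not from a real model.
rng = np.random.default_rng(2)
n, M = 5, 10
w = np.full(n, 1.0 / n)                        # sampling weights
fw0 = np.full((n, M), 1.0 / M)                 # initial weights (8)
s = rng.normal(size=(n, M))                    # s(theta_hat; y*_ij)

s_bar = np.sum(fw0 * s, axis=1, keepdims=True) # s-bar_i per unit
resid = s - s_bar
num = np.sum(w[:, None] * fw0 * s)             # sum_i w_i s-bar_i (scalar case)
den = np.sum(w[:, None] * fw0 * resid ** 2)    # scalar analogue of the inverse term
fw = fw0 - (num / den) * fw0 * resid           # calibrated weights

# now sum_i w_i sum_j w*_ij s*_ij == 0 and each row of fw still sums to one
```

Because the correction is a multiple of the within-unit residual $s_{ij}^* - \bar s_i^*$, the row sums are unchanged, so (6) is preserved automatically.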

If the distribution belongs to the exponential family of the form
$$f(y; \theta) = \exp\{ t(y)^{\mathrm T} \theta + \phi(\theta) + A(y) \}, \qquad (12)$$
then (10) reduces to $\sum_{i=1}^n \sum_{j=1}^M w_i w_{ij}^* \{ t(y_{ij}^*) + \dot\phi(\hat\theta) \} = 0$, where $\dot\phi(\theta) = \partial \phi(\theta)/\partial \theta$. In this case, calibration needs to be applied only to the complete sufficient statistics.

3. ASYMPTOTIC RESULTS

In this section, we discuss some asymptotic properties of the fractionally imputed estimator (7). We consider two types of fractionally imputed estimators: one obtained using the initial fractional weights in (8), and the other obtained using calibration fractional weights such as (11). Note that the imputed estimator $\hat\eta_{FI,g}$ in (7) is a function of $n$ and $M$, where $n$ is the sample size and $M$ is the imputation size. Thus, we use $\hat\eta_{g0,n,M}$ and $\hat\eta_{g1,n,M}$ to denote the fractionally imputed estimator (7) using the initial fractional weights in (8) and using the calibration fractional weights in (11), respectively. The following theorem presents some asymptotic properties of the fractionally imputed estimators. The proof is presented in Appendix A.

THEOREM 1. Let $\hat\eta_{g0,n,M}$ and $\hat\eta_{g1,n,M}$ be fractionally imputed estimators of $\eta_g$ of the form (7) using $M$ fractions, where the fractional weights are constructed by (8) and by (11), respectively, and the imputed values are generated from $h(y_{i,\mathrm{mis}})$.
Under some regularity conditions,
$$(\hat\eta_{g0,n,M} - \eta_g)/\sigma_{g0,n,M} \to_L N(0, 1) \qquad (13)$$
and
$$(\hat\eta_{g1,n,M} - \eta_g)/\sigma_{g1,n,M} \to_L N(0, 1) \qquad (14)$$
for each $M > 1$, where
$$\sigma_{g0,n,M}^2 = \mathrm{var}\Big[ \sum_{i=1}^n w_i \{ \bar g_i^*(\theta_0) + K_1^{\mathrm T} \bar s_i(\theta_0) \} \Big],$$
$$\sigma_{g1,n,M}^2 = \mathrm{var}\Big[ \sum_{i=1}^n w_i \{ \bar g_i^*(\theta_0) + K_1^{\mathrm T} \bar s_i(\theta_0) + B^{\mathrm T}( \bar s_i(\theta_0) - \bar s_i^*(\theta_0) ) \} \Big],$$
$\bar g_i^*(\theta) = \sum_{j=1}^M w_{ij0}^*(\theta) g(y_{ij}^*)$, $\bar s_i(\theta) = E_\theta\{ s(\theta; y_i) \mid y_{i,\mathrm{obs}} \}$, $\bar s_i^*(\theta) = \sum_{j=1}^M w_{ij0}^*(\theta) s(\theta; y_{ij}^*)$,
$$K_1 = [ I_{\mathrm{obs}}(\theta_0) ]^{-1} I_{g,\mathrm{mis}}(\theta_0), \qquad (15)$$
$B = [ I_{\mathrm{mis}}(\theta_0) ]^{-1} I_{g,\mathrm{mis}}(\theta_0)$, $I_{\mathrm{obs}}(\theta) = -E\{ \sum_{i=1}^n w_i\, \partial \bar s_i(\theta)/\partial \theta \}$, $I_{g,\mathrm{mis}}(\theta) = E[ \sum_{i=1}^n w_i \{ s(\theta; y_i) - \bar s_i(\theta) \} g(y_i) ]$, and $I_{\mathrm{mis}}(\theta) = E[ \sum_{i=1}^n w_i \{ s(\theta; y_i) - \bar s_i(\theta) \}^{\otimes 2} ]$.

In Theorem 1, note that
$$\sigma_{g0,n,M}^2 = \sigma_{g1,n,M}^2 + B^{\mathrm T}\, \mathrm{var}\Big\{ \sum_{i=1}^n w_i ( \bar s_i^* - \bar s_i ) \Big\}\, B$$
and the last term represents the reduction in the variance of the fractional imputation estimator of $\eta_g$ due to the calibration in (10). Thus, $\sigma_{g0,n,M}^2 \ge \sigma_{g1,n,M}^2$, with equality for $M = \infty$. Clayton et al. (1998) and Robins & Wang (2000) also proved results similar to (13) for the special case of $M = \infty$.

For variance estimation of the fractional imputation estimator, let $\hat V(\hat\eta_g) = \sum_{i=1}^n \sum_{j=1}^n \Omega_{ij}\, g(y_i) g(y_j)$ be a consistent estimator for the variance of $\hat\eta_g = \sum_{i=1}^n w_i g(y_i)$ under complete response, where the $\Omega_{ij}$ are coefficients. Under simple random sampling, $\Omega_{ij} = -1/\{ n^2 (n-1) \}$ for $i \ne j$ and $\Omega_{ii} = 1/n^2$. For sufficiently large $M$, using the results in Theorem 1, a consistent estimator for the variance of $\hat\eta_{FI,g}$ in (7) is
$$\hat V(\hat\eta_{FI,g}) = \sum_{i=1}^n \sum_{j=1}^n \Omega_{ij}\, \bar e_i \bar e_j, \qquad (16)$$
where $\bar e_i = \bar g_i^*(\hat\theta) + \hat K_1^{\mathrm T} \bar s_i^*(\hat\theta) = \sum_{j=1}^M w_{ij0}^* \hat e_{ij}$, $\hat e_{ij} = g(y_{ij}^*) + \hat K_1^{\mathrm T} s(\hat\theta; y_{ij}^*)$, and
$$\hat K_1 = \Big[ \sum_{i=1}^n w_i\, \bar s_i^*(\hat\theta)\, \bar s_i^*(\hat\theta)^{\mathrm T} \Big]^{-1} \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij}^* \{ s(\hat\theta; y_{ij}^*) - \bar s_i^* \}\, g(y_{ij}^*).$$
For moderate $M$, the expected value of the variance estimator (16) can be written
$$E\{ \hat V(\hat\eta_{FI,g}) \} = E\Big\{ \sum_{i} \sum_{j} \Omega_{ij}\, \bar e_i' \bar e_j' \Big\} + E\Big\{ \sum_{i} \sum_{j} \Omega_{ij}\, \mathrm{cov}_I( \bar e_i, \bar e_j ) \Big\},$$
where $\bar e_i' = E_I(\bar e_i)$ and the subscript $I$ denotes the expectation over the imputation mechanism generating $y_{i,\mathrm{mis}}^{(j)}$ from $h(y_{i,\mathrm{mis}})$. If the imputed values are generated independently, we have
$$\mathrm{cov}_I( \bar e_i, \bar e_j ) = \begin{cases} 0 & \text{if } i \ne j \\ \mathrm{var}_I( \bar e_i ) & \text{if } i = j \end{cases}$$
and, using the argument of Remark 1, $\mathrm{var}_I(\bar e_i)$ can be estimated by $\hat V_{Ii,e} = \sum_{j=1}^M ( w_{ij0}^* )^2 ( \hat e_{ij} - \bar e_i )^2$. Thus, an unbiased estimator for $\sigma_{g0,n,M}^2$ is
$$\hat\sigma_{g0,n,M}^2 = \sum_{i} \sum_{j} \Omega_{ij}\, \bar e_i \bar e_j - \sum_{i} \Omega_{ii}\, \hat V_{Ii,e} + \sum_{i} w_i^2\, \hat V_{Ii,g}, \qquad (17)$$
where $\hat V_{Ii,g} = \sum_{j=1}^M ( w_{ij0}^* )^2 ( g_{ij}^* - \bar g_i^* )^2$. Using an argument similar to (17), it can be shown that
$$\hat\sigma_{g1,n,M}^2 = \sum_{i} \sum_{j} \Omega_{ij}\, \bar e_i \bar e_j - \sum_{i} \Omega_{ii}\, \hat V_{Ii,e} + \sum_{i} w_i^2\, \hat V_{Ii,u} \qquad (18)$$
is an unbiased estimator of $\sigma_{g1,n,M}^2$, where $\hat V_{Ii,u} = \sum_{j=1}^M ( w_{ij0}^* )^2 ( u_{ij}^* - \bar u_i^* )^2$ is an unbiased estimator of $V_I(\bar u_i^*)$, $u_{ij}^* = g(y_{ij}^*) - \hat B^{\mathrm T} s(\hat\theta; y_{ij}^*)$, $\bar u_i^* = \sum_{j=1}^M w_{ij0}^* u_{ij}^*$, and
$$\hat B = \Big\{ \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij0}^* ( s_{ij}^* - \bar s_i^* )^{\otimes 2} \Big\}^{-1} \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij0}^* ( s_{ij}^* - \bar s_i^* )\, g( y_{ij}^* ).$$
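As a small sanity check on the simple-random-sampling coefficients used in the quadratic-form variance estimators above, the choice $\Omega_{ii} = 1/n^2$ and $\Omega_{ij} = -1/\{n^2(n-1)\}$ for $i \ne j$ reproduces the familiar estimator $s^2/n$ for a sample mean; the data are arbitrary stand-ins:

```python
import numpy as np

# Verify that the SRS quadratic form sum_ij Omega_ij e_i e_j equals s^2/n.
rng = np.random.default_rng(5)
n = 30
e = rng.normal(size=n)                      # stand-ins for the e-bar_i terms

Omega = np.full((n, n), -1.0 / (n ** 2 * (n - 1)))
np.fill_diagonal(Omega, 1.0 / n ** 2)
v_quad = e @ Omega @ e                      # sum_ij Omega_ij e_i e_j
v_classic = np.var(e, ddof=1) / n           # s^2 / n
```

Expanding the quadratic form gives $\{n \sum e_i^2 - (\sum e_i)^2\}/\{n^2(n-1)\}$, which is exactly $s^2/n$, so the two quantities agree to floating-point precision.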

Variance estimation with fractionally imputed data can also be performed using the replication method described in Appendix B.

4. MAXIMUM LIKELIHOOD ESTIMATION

In this section, we propose a computationally simple method of obtaining the pseudo maximum likelihood estimator in (5). The pseudo maximum likelihood estimator reduces to the usual maximum likelihood estimator if the sampling design is simple random sampling with $w_i = 1/n$. Instead of (5), the pseudo maximum likelihood estimator of $\theta_0$ can be obtained by
$$\hat\theta = \arg\max_{\theta \in \Omega} \sum_{i=1}^n w_i\, E\{ \ln f(y_i; \theta) \mid y_{i,\mathrm{obs}} \}. \qquad (19)$$
For $w_i = 1/n$, Dempster et al. (1977) proved that the maximum likelihood estimator in (19) maximizes (5) and proposed the EM algorithm, which computes the solution iteratively by defining $\hat\theta^{(t+1)}$ to be the solution to
$$\hat\theta^{(t+1)} = \arg\max_{\theta \in \Omega} \sum_{i=1}^n w_i\, E\{ \ln f(y_i; \theta) \mid y_{i,\mathrm{obs}}, \hat\theta^{(t)} \}, \qquad (20)$$
where $\hat\theta^{(t)}$ is the estimate of $\theta$ obtained at the $t$-th iteration.

To compute the conditional expectation in (20), the Monte Carlo implementation of the EM algorithm of Wei & Tanner (1990) can be used. The Monte Carlo EM method avoids the analytic computation of the conditional expectation (20) by using a Monte Carlo approximation based on the imputed data. In the Monte Carlo EM method, independent draws of $y_{i,\mathrm{mis}}$ are generated from the conditional distribution $f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}, \hat\theta^{(t)})$ for each $t$ to approximate the conditional expectation in (20). The Monte Carlo EM method requires heavy computation because the imputed values are re-generated at each iteration $t$. Also, generating imputed values from $f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}, \hat\theta^{(t)})$ can be computationally challenging, since it often requires an iterative algorithm such as the Metropolis-Hastings algorithm at each EM iteration.
To avoid re-generating values from the conditional distribution at each step, we propose the following algorithm for parametric fractional imputation using importance sampling:

[Step 0] Obtain an initial estimator $\hat\theta^{(0)}$ of $\theta$ and set $h(y_{i,\mathrm{mis}}) = f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}, \hat\theta^{(0)})$.

[Step 1] Generate $M$ imputed values, $y_{i,\mathrm{mis}}^{(1)}, \ldots, y_{i,\mathrm{mis}}^{(M)}$, from $h(y_{i,\mathrm{mis}})$.

[Step 2] With the current estimate of $\theta$, denoted by $\hat\theta^{(t)}$, compute the fractional weights as $w_{ij(t)}^* = w_{ij0}^*(\hat\theta^{(t)})$, where $w_{ij0}^*(\hat\theta)$ is defined in (8).

[Step 3] (Optional) If $w_{ij(t)}^* > C/M$ for some $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, M$, then set $h(y_{i,\mathrm{mis}}) = f(y_{i,\mathrm{mis}} \mid y_{i,\mathrm{obs}}, \hat\theta^{(t)})$ and go to Step 1, increasing $M$ if necessary.

[Step 4] Solve the weighted score equation for $\theta$ to get $\hat\theta^{(t+1)}$, where $\hat\theta^{(t+1)}$ maximizes
$$Q^*( \theta \mid \hat\theta^{(t)} ) = \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij(t)}^*\, \ln f( y_{ij}^*; \theta ) \qquad (21)$$
over $\theta \in \Omega$.

[Step 5] Set $t = t + 1$ and go to Step 2. Stop when $\hat\theta^{(t)}$ meets the convergence criterion.
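The steps above can be sketched for a toy model: $y_i \sim N(\theta, 1)$ with some values missing completely at random, so the target estimate is simply the respondents-only mean. The sample sizes, seed, and true mean are made up for illustration; the key feature is that the imputed draws are generated once (Step 1) and only the fractional weights change across iterations (Step 2):

```python
import numpy as np

# Minimal sketch of the fractional-imputation EM for y ~ N(theta, 1) with
# values missing completely at random.  Illustrative numbers throughout.
rng = np.random.default_rng(3)
y_obs = rng.normal(1.0, 1.0, size=80)       # respondents
n_mis, M = 20, 200                          # nonrespondents, fractions per unit

theta = y_obs.mean()                        # Step 0: respondents-only estimate
h_mean = theta                              # h(.) = N(theta_hat_0, 1)
y_imp = rng.normal(h_mean, 1.0, size=(n_mis, M))   # Step 1: draw once from h

for _ in range(50):                         # Steps 2-5
    # Step 2: w*_ij(t) proportional to f(y*_ij; theta_t) / h(y*_ij)
    log_w = -0.5 * (y_imp - theta) ** 2 + 0.5 * (y_imp - h_mean) ** 2
    fw = np.exp(log_w)
    fw /= fw.sum(axis=1, keepdims=True)     # enforce (6) within each unit
    # Step 4: imputed score equation (22) for the normal mean
    theta_new = (y_obs.sum() + np.sum(fw * y_imp)) / (len(y_obs) + n_mis)
    if abs(theta_new - theta) < 1e-10:      # Step 5: convergence check
        theta = theta_new
        break
    theta = theta_new
```

Under MCAR the missing units carry no extra information about $\theta$, so the converged value stays close to the respondents-only mean, up to Monte Carlo error of order $M^{-1/2}$.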

In Step 0, the initial estimator $\hat\theta^{(0)}$ can be the maximum likelihood estimator obtained using only the respondents. Steps 1 and 2 correspond to the E-step of the EM algorithm. Step 3 can be used to control the variation of the fractional weights and to avoid extremely large fractional weights. The threshold $C/M$ in Step 3 guarantees that no individual fractional weight is bigger than $C$ times the average of the fractional weights. In Step 4, the value of $\theta$ that maximizes $Q^*(\theta \mid \hat\theta^{(t)})$ in (21) can often be obtained by solving
$$\sum_{i=1}^n w_i \sum_{j=1}^M w_{ij(t)}^*\, s( \theta; y_{ij}^* ) = 0, \qquad (22)$$
where $s(\theta; y) = \partial \ln f(y; \theta)/\partial \theta$ is the score function of $\theta$. Thus, the solution can be obtained by applying the complete-sample score equation to the fractionally imputed data. Equation (22) can be called the imputed score equation using fractional imputation. Unlike in the Monte Carlo EM method, the imputed values are not changed across iterations; only the fractional weights are changed.

REMARK 2. In Step 2, the fractional weights can be computed using the joint density with the current parameter estimate $\hat\theta^{(t)}$. Note that $w_{ij0}^*(\theta)$ in (8) can be written
$$w_{ij0}^*(\theta) = \frac{ f( y_{i,\mathrm{mis}}^{(j)} \mid y_{i,\mathrm{obs}}, \theta ) / h( y_{i,\mathrm{mis}}^{(j)} ) }{ \sum_{k=1}^M f( y_{i,\mathrm{mis}}^{(k)} \mid y_{i,\mathrm{obs}}, \theta ) / h( y_{i,\mathrm{mis}}^{(k)} ) } = \frac{ f( y_{ij}^*; \theta ) / h( y_{i,\mathrm{mis}}^{(j)} ) }{ \sum_{k=1}^M f( y_{ik}^*; \theta ) / h( y_{i,\mathrm{mis}}^{(k)} ) }, \qquad (23)$$
which does not require the marginal density in computing the conditional distribution; only the joint density is needed.

Given the $M$ imputed values $y_{i,\mathrm{mis}}^{(1)}, \ldots, y_{i,\mathrm{mis}}^{(M)}$ generated from $h(y_{i,\mathrm{mis}})$, the sequence of estimators $\{ \hat\theta^{(0)}, \hat\theta^{(1)}, \ldots \}$ can be constructed by parametric fractional imputation using importance sampling. The following theorem presents some convergence properties of this sequence.

THEOREM 2. Assume that the $M$ imputed values are generated from $h(y_{i,\mathrm{mis}})$. Let $Q^*(\theta \mid \hat\theta^{(t)})$ be the weighted log-likelihood function (21) based on the fractional imputation.
If
$$Q^*( \hat\theta^{(t+1)} \mid \hat\theta^{(t)} ) \ge Q^*( \hat\theta^{(t)} \mid \hat\theta^{(t)} ), \qquad (24)$$
then
$$l_{\mathrm{obs}}^*( \hat\theta^{(t+1)} ) \ge l_{\mathrm{obs}}^*( \hat\theta^{(t)} ), \qquad (25)$$
where $l_{\mathrm{obs}}^*(\theta) = \sum_{i=1}^n w_i \ln f_{\mathrm{obs}(i)}^*( y_{i,\mathrm{obs}}; \theta )$ and
$$f_{\mathrm{obs}(i)}^*( y_{i,\mathrm{obs}}; \theta ) = \frac{ \sum_{j=1}^M f( y_{ij}^*; \theta ) / h( y_{i,\mathrm{mis}}^{(j)} ) }{ \sum_{j=1}^M 1 / h( y_{i,\mathrm{mis}}^{(j)} ) }.$$

Proof. By (23) and Jensen's inequality,
$$l_{\mathrm{obs}}^*( \hat\theta^{(t+1)} ) - l_{\mathrm{obs}}^*( \hat\theta^{(t)} ) = \sum_{i=1}^n w_i \ln \Big\{ \sum_{j=1}^M w_{ij(t)}^* \frac{ f( y_{ij}^*; \hat\theta^{(t+1)} ) }{ f( y_{ij}^*; \hat\theta^{(t)} ) } \Big\} \ge \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij(t)}^* \ln \frac{ f( y_{ij}^*; \hat\theta^{(t+1)} ) }{ f( y_{ij}^*; \hat\theta^{(t)} ) } = Q^*( \hat\theta^{(t+1)} \mid \hat\theta^{(t)} ) - Q^*( \hat\theta^{(t)} \mid \hat\theta^{(t)} ).$$
Therefore, (24) implies (25).

Note that $l_{\mathrm{obs}}^*(\theta)$ is an imputed version of the observed pseudo log-likelihood based on the $M$ imputed values $y_{i,\mathrm{mis}}^{(1)}, \ldots, y_{i,\mathrm{mis}}^{(M)}$. Thus, by Theorem 2, the sequence $l_{\mathrm{obs}}^*(\hat\theta^{(t)})$ is monotonically increasing and, under the conditions stated in Wu (1983), the convergence of $\hat\theta^{(t)}$ follows for fixed $M$. Theorem 2 does not hold for the sequence obtained from the Monte Carlo EM method with fixed $M$, because the imputed values are re-generated at each E-step of the Monte Carlo EM method, and convergence is very hard to check for the Monte Carlo EM (Booth & Hobert, 1999).

REMARK 3. Sung & Geyer (2007) considered a Monte Carlo maximum likelihood method that directly maximizes $l_{\mathrm{obs}}^*(\theta)$. Computing the value of $\theta$ that maximizes $Q^*(\theta \mid \hat\theta^{(t)})$ is easier than computing the value of $\theta$ that maximizes $l_{\mathrm{obs}}^*(\theta)$.

5. SIMULATION STUDY

In a simulation study, $B = 2{,}000$ Monte Carlo samples of size $n = 200$ were independently generated from an infinite population specified by $x_i \sim N(2, 1)$; $y_i \mid x_i \sim N( \beta_0 + \beta_1 x_i, \sigma_{ee} )$ with $( \beta_0, \beta_1, \sigma_{ee} ) = (1, 0.7, 1)$; $z_i \mid (x_i, y_i) \sim \mathrm{Ber}(p_i)$, where $\log\{ p_i/(1-p_i) \} = \psi_0 + \psi_1 y_i$ with $( \psi_0, \psi_1 ) = (-3, 1)$; and $\delta_i \mid (x_i, y_i, z_i) \sim \mathrm{Ber}(\pi_i)$, where $\log\{ \pi_i/(1-\pi_i) \} = 0.5 x_i + 0.8 z_i$. Variable $y_i$ is observed if $\delta_i = 1$ and is not observed if $\delta_i = 0$. The overall response rate for $y$ is about 76%. Variable $x$ is the covariate for the regression of $y$ on $x$ and is always observed.
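One replicate of the simulation design can be generated directly from the model statements; the seed is arbitrary:

```python
import numpy as np

# One sample from the Section 5 simulation model:
# x ~ N(2,1), y|x ~ N(1 + 0.7x, 1), z|y ~ Ber(p) with logit(p) = -3 + y,
# and delta | (x,z) ~ Ber(pi) with logit(pi) = 0.5x + 0.8z.
rng = np.random.default_rng(4)
n = 200
x = rng.normal(2.0, 1.0, size=n)
y = 1.0 + 0.7 * x + rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(-3.0 + 1.0 * y)))
z = rng.binomial(1, p)
pi = 1.0 / (1.0 + np.exp(-(0.5 * x + 0.8 * z)))
delta = rng.binomial(1, pi)                 # y is observed iff delta == 1

response_rate = delta.mean()                # close to the 76% quoted in the text
```

The realized response rate fluctuates around 0.76 across replicates, consistent with the overall rate stated above.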
Variable $z$ is a surrogate for $y$ in the sense that interest focuses on estimation of the parameter $\beta$ in the regression model $P_\beta(y \mid x)$, but incorporating the surrogate variable $z$ can improve the efficiency of the resulting estimators (Pepe, 1992). Also note that, if the surrogate variable $z$ is ignored, the response mechanism is not ignorable. By incorporating the surrogate variable into the estimation procedure, we can safely ignore the response mechanism.

We are interested in estimating three parameters: $\theta_1 = E(y)$, the marginal mean of $y$; $\theta_2 = \beta_1$, the slope of the regression of $y$ on $x$; and $\theta_3 = \mathrm{pr}(y < 3)$, the proportion of $y$ less than 3. Under complete response, $\theta_1$ and $\theta_2$ are computed by the maximum likelihood method, and the proportion $\theta_3$ is estimated by
$$\hat\theta_{3,n} = \frac{1}{n} \sum_{i=1}^n I( y_i < 3 ), \qquad (26)$$
which is different from the maximum likelihood estimator
$$\int_{-\infty}^{3} \frac{1}{\sqrt{\hat\sigma_{yy}}}\, \phi\Big( \frac{ y - \hat\mu_y }{ \sqrt{\hat\sigma_{yy}} } \Big)\, dy, \qquad (27)$$

where $\phi(\cdot)$ is the density of the standard normal distribution and $( \hat\mu_y, \hat\sigma_{yy} )$ is the maximum likelihood estimator of $( \mu_y, \sigma_{yy} )$.

Under nonresponse, five estimators were computed:
1. the parametric fractional imputation estimator using $w_{ij0}^*$ in (8), with $M = 100$ and $M = 10$;
2. the calibration fractional imputation estimator using the regression weighting method in (11), with $M = 10$;
3. multiple imputation, with $M = 100$ and $M = 10$.
In addition to the five imputed estimators, we also computed the complete-sample estimator for comparison.

In fractional imputation, $M$ imputed values were independently generated from $N( \hat\beta_{0(0)} + \hat\beta_{1(0)} x_i,\ \hat\sigma_{ee(0)} )$, where $( \hat\beta_{0(0)}, \hat\beta_{1(0)}, \hat\sigma_{ee(0)} )$ are the regression parameters computed from the respondents only. To each imputed value we assign the fractional weight
$$w_{ij(t)}^* \propto \frac{ f( y_i^{(j)} \mid x_i;\ \hat\beta_{0(t)}, \hat\beta_{1(t)}, \hat\sigma_{ee(t)} )\ \exp\{ ( \hat\psi_{0(t)} + \hat\psi_{1(t)} y_i^{(j)} ) z_i \} / \{ 1 + \exp( \hat\psi_{0(t)} + \hat\psi_{1(t)} y_i^{(j)} ) \} }{ f( y_i^{(j)} \mid x_i;\ \hat\beta_{0(0)}, \hat\beta_{1(0)}, \hat\sigma_{ee(0)} ) },$$
where $f( y \mid x;\ \beta_0, \beta_1, \sigma_{ee} )$ denotes the $N( \beta_0 + \beta_1 x, \sigma_{ee} )$ density and $( \hat\beta_{0(t)}, \hat\beta_{1(t)}, \hat\sigma_{ee(t)}, \hat\psi_{0(t)}, \hat\psi_{1(t)} )$ is obtained by the maximum likelihood method using the fractionally imputed data with fractional weights $w_{ij(t-1)}^*$. In Step 3 of the fractional imputation algorithm for maximum likelihood estimation in § 4, $C = 5$ was used. In the calibration fractional imputation method, $M = 10$ values were randomly selected from $M_1 = 100$ initial fractionally imputed values by systematic sampling with selection probability proportional to $w_{ij0}^*$ in (8); the regression fractional weights were then computed by (11). In Step 5, the convergence criterion was $\| \hat\theta^{(t+1)} - \hat\theta^{(t)} \| < \epsilon$. The average numbers of iterations for convergence were 45 and 35 for $M = 100$ and $M = 10$, respectively. In multiple imputation, the imputed values were generated from the posterior predictive distribution iteratively using Gibbs sampling with 100 iterations.
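The fractional weight just displayed can be sketched for a single nonrespondent. All parameter values below are hypothetical stand-ins for the fitted values $(\hat\beta_{(0)}, \hat\sigma_{ee(0)})$ and the current iterates $(\hat\beta_{(t)}, \hat\sigma_{ee(t)}, \hat\psi_{(t)})$:

```python
import numpy as np

# Fractional weights for one nonrespondent in the Section 5 simulation:
# draws come from the respondents-only normal model, and each draw is
# reweighted by (current normal density x Bernoulli likelihood of z) over
# the proposal density.  All parameter values are illustrative.
rng = np.random.default_rng(6)
M = 50
x_i, z_i = 1.5, 1                                # one unit's observed data
b0_0, b1_0, s2_0 = 0.9, 0.75, 1.1                # respondents-only fit (proposal)
b0_t, b1_t, s2_t = 1.0, 0.70, 1.0                # current EM iterate
psi0_t, psi1_t = -3.0, 1.0

y_imp = rng.normal(b0_0 + b1_0 * x_i, np.sqrt(s2_0), size=M)

def normal_pdf(y, m, s2):
    return np.exp(-0.5 * (y - m) ** 2 / s2) / np.sqrt(2 * np.pi * s2)

eta = psi0_t + psi1_t * y_imp
pz = np.exp(z_i * eta) / (1.0 + np.exp(eta))     # pr(z_i | y; psi_t)
num = normal_pdf(y_imp, b0_t + b1_t * x_i, s2_t) * pz
den = normal_pdf(y_imp, b0_0 + b1_0 * x_i, s2_0)
fw = (num / den) / np.sum(num / den)             # normalised to satisfy (6)
```

Including the surrogate likelihood $\mathrm{pr}(z_i \mid y; \hat\psi)$ in the numerator is what lets the otherwise nonignorable response mechanism be ignored.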
The Monte Carlo means and the Monte Carlo standardised variances of the complete-sample estimator and the five imputed estimators are presented in Table 1. The standardised variance in Table 1 was computed by dividing the variance of each estimator by that of the complete-sample estimator. The simulation results in Table 1 show that fractional imputation is generally more efficient than multiple imputation, because multiple imputation has additional variability due to generating the parameter values. For estimation of the proportion, it is possible for the multiple imputation estimator with $M = 100$ to be more efficient than the calibration fractional imputation estimator with $M = 10$; the differences in efficiency are less than one percent.

In addition to point estimators, variance estimates were computed from each Monte Carlo sample. For variance estimation of fractional imputation, we used the linearized variance estimator discussed in § 3. For variance estimation of the calibration fractional imputation estimator, we used the one-step jackknife variance estimator discussed in Appendix B. For multiple imputation variance estimation, we used the variance formula of Rubin (1987). Table 2 presents the Monte Carlo relative biases and t-statistics of the variance estimators; the t-statistic tests for zero bias in the variance estimator. The simulation errors for the relative biases of the variance estimators reported in Table 2 are less than 1%. Table 2 shows that, for the fractional imputation estimators, both the linearization and replication methods provide consistent estimates of the variance of the parameter estimates, with t-statistics less than 2 in absolute value. The multiple imputation variance estimators are also unbiased for

the variance of the estimators of $\theta_1$ and $\theta_2$, which were explicitly considered in the imputation model. For variance estimation of the proportion, the multiple imputation variance estimator shows significant biases (15.53% for $M = 10$). The congeniality condition of Meng (1994) does not hold for the proportion based on the estimator (26); for more details, see Appendix C. Thus, the multiple imputation method in this simulation is congenial for $\theta_1$ and $\theta_2$, but it is not congenial for the proportion parameter $\theta_3$.

Table 1. Monte Carlo means and standardised variances of the imputed estimators

Parameter   Method            Mean   Std Var
$\theta_1$  Complete Sample
            FI (M=100)
            FI (M=10)
            CFI (M=10)
            MI (M=100)
            MI (M=10)
$\theta_2$  Complete Sample
            FI (M=100)
            FI (M=10)
            CFI (M=10)
            MI (M=100)
            MI (M=10)
$\theta_3$  Complete Sample
            FI (M=100)
            FI (M=10)
            CFI (M=10)
            MI (M=100)
            MI (M=10)

FI, fractional imputation; CFI, calibration fractional imputation; MI, multiple imputation; Std Var, standardised variance.

6. CONCLUDING REMARKS

Parametric fractional imputation is proposed as a tool for parameter estimation with missing data. Analyses using the fractionally imputed data $y_{ij}^*$ with weights $w_i w_{ij}^*$ can be performed with existing complete-sample software, treating the imputed values as if observed. Generating the imputed values and computing the fractional weights may require careful modeling and parameter estimation, but the users do not need to know these details for their analyses. The data provider, who has the best available information for the imputation model, can use that model to construct the fractionally imputed data with fractional weights, and possibly with replicated fractional weights for variance estimation. No other information is necessary for the users.
If parametric fractional imputation is used to construct the score function, the solution to the imputed score equation is very close to the maximum likelihood estimator of the parameters in the model. Some computational efficiency relative to the Monte Carlo EM method can be achieved with parametric fractional imputation using the ideas of importance sampling and calibration weighting. Parametric fractional imputation yields direct consistent estimates for parameters that are not part of the imputation model. For example, in the simulation study, the parametric fractional imputation computed from a normal model provides direct estimates of the cumulative distribution function. Thus, the proposed imputation method is a useful tool for general-purpose data analysis, where the parameters of interest are unknown at the time of imputation. Variance estimation can be performed using a linearization method or a replication method. The validity of variance estimation for parametric fractional imputation, unlike multiple imputation, does not require the congeniality condition of Meng (1994).

Table 2. Monte Carlo relative biases and t-statistics of the variance estimators

Parameter            Method                            Relative Bias (%)   t-statistic
var($\hat\theta_1$)  Linearize (for FI with M=100)
                     Linearize (for FI with M=10)
                     One-step JK (for CFI with M=10)
                     MI (M=100)
                     MI (M=10)
var($\hat\theta_2$)  Linearize (for FI with M=100)
                     Linearize (for FI with M=10)
                     One-step JK (for CFI with M=10)
                     MI (M=100)
                     MI (M=10)
var($\hat\theta_3$)  Linearize (for FI with M=100)
                     Linearize (for FI with M=10)
                     One-step JK (for CFI with M=10)
                     MI (M=100)
                     MI (M=10)

FI, fractional imputation; CFI, calibration fractional imputation; MI, multiple imputation; JK, jackknife.

ACKNOWLEDGEMENT

The author wishes to thank Professor Wayne Fuller, the referees, and the Editor for their very helpful comments.

APPENDIX A
Proof of Theorem 1

Define a class of estimators
$$\eta_{g0,n,K}(\theta) = \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij0}^*(\theta)\, g( y_{ij}^* ) + K^{\mathrm T} \sum_{i=1}^n w_i\, E\{ s(\theta; y_i) \mid y_{i,\mathrm{obs}}, \theta \}$$
indexed by $K$. Note that, by (5), we have $\sum_{i=1}^n w_i E\{ s(\hat\theta; y_i) \mid y_{i,\mathrm{obs}}, \hat\theta \} = 0$ and $\eta_{g0,n,K}(\hat\theta) = \hat\eta_{g0,n,M}$ for any $K$. According to Randles (1982), we have
$$\sqrt n\, \{ \eta_{g0,n,K}(\hat\theta) - \eta_{g0,n,K}(\theta_0) \} = o_p(1)$$

if
$$E\{ \partial \eta_{g0,n,K}(\theta_0) / \partial \theta \} = 0 \qquad (A.1)$$
is satisfied. Using the derivative of $\sum_{i=1}^n w_i \sum_{j=1}^M w_{ij0}^*(\theta)\, g(y_{ij}^*)$ with respect to $\theta$, it can be shown that the choice $K = K_1$ in (15) satisfies (A.1), and thus (13) follows.

To show (14), consider
$$\eta_{g1,n,K}(\theta) = \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij}^*(\theta)\, g( y_{ij}^* ) + K^{\mathrm T} \sum_{i=1}^n w_i\, E\{ s(\theta; y_i) \mid y_{i,\mathrm{obs}}, \theta \}.$$
Using (11),
$$\sum_{i=1}^n w_i \sum_{j=1}^M w_{ij}^*(\theta)\, g( y_{ij}^* ) = \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij0}^*(\theta)\, g( y_{ij}^* ) - \Big( \sum_{i=1}^n w_i \bar s_i^*(\theta) \Big)^{\mathrm T} \hat B_g(\theta),$$
where
$$\hat B_g(\theta) = \Big\{ \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij0}^*(\theta)\, ( s_{ij}^*(\theta) - \bar s_i^*(\theta) )^{\otimes 2} \Big\}^{-1} \sum_{i=1}^n w_i \sum_{j=1}^M w_{ij0}^*(\theta)\, ( s_{ij}^*(\theta) - \bar s_i^*(\theta) )\, g( y_{ij}^* ).$$
After some algebra, it can be shown that the choice $K = K_1$ in (15) also satisfies $E\{ \partial \eta_{g1,n,K}(\theta_0) / \partial \theta \} = 0$ and, by Randles (1982) again,
$$\sqrt n\, \{ \eta_{g1,n,K}(\hat\theta) - \eta_{g1,n,K}(\theta_0) \} = o_p(1).$$

APPENDIX B
Replication variance estimation

Under complete response, let $w_i^{[k]}$ be the $k$-th replication weight for unit $i$. Assume that the replication variance estimator $\hat V_n = \sum_{k=1}^L c_k ( \hat\eta_g^{[k]} - \hat\eta_g )^2$, where $\hat\eta_g = \sum_{i=1}^n w_i g(y_i)$ and $\hat\eta_g^{[k]} = \sum_{i=1}^n w_i^{[k]} g(y_i)$, is consistent for the variance of $\hat\eta_g$. For replication with the calibration fractional weights computed from (10), we consider the following steps for creating replicated fractional weights:

[Step 1] Compute $\hat\theta^{[k]}$, the $k$-th replicate of $\hat\theta$, using the fractional weights.

[Step 2] Using the $\hat\theta^{[k]}$ computed in Step 1, compute the replicated fractional weights $w_{ij}^{*[k]}$ by the regression weighting technique so that
$$\sum_{i=1}^n w_i^{[k]} \sum_{j=1}^M w_{ij}^{*[k]}\, s( \hat\theta^{[k]}; y_{ij}^* ) = 0. \qquad (B.1)$$

Equation (B.1) is the calibration equation for the replicated fractional weights. For any estimator of the form (7), the replication variance estimator is constructed as
$$\hat{V}\big( \hat\eta_{FI,g} \big) = \sum_{k=1}^{L} c_k \big( \hat\eta_{FI,g}^{[k]} - \hat\eta_{FI,g} \big)^2,$$
where $\hat\eta_{FI,g}^{[k]} = \sum_{i=1}^n \sum_{j=1}^M w_i^{[k]} w_{ij}^{*[k]} g(y_{ij}^*)$ and $w_{ij}^{*[k]}$ is computed from (B.1). In general, Step 1 can be computationally problematic since $\hat\theta^{[k]}$ is often computed from the iterative algorithm (21) for each replication. Thus, we consider an approximation for $\hat\theta^{[k]}$ using Taylor linearization of $0 = S^{[k]}(\hat\theta^{[k]})$ around $\hat\theta$. The one-step approximation can be implemented as
$$\hat\theta^{[k]} = \hat\theta + \big[ \hat{I}_{\mathrm{obs}}^{[k]}(\hat\theta) \big]^{-1} S^{[k]}(\hat\theta),$$
where
$$\hat{I}_{\mathrm{obs}}^{[k]}(\theta) = \sum_i w_i^{[k]} \sum_j w_{ij}^* \Big\{ -\frac{\partial}{\partial\theta^{\mathrm{T}}} s\big(\theta; y_{ij}^*\big) \Big\} - \sum_i w_i^{[k]} \Big[ \sum_j w_{ij}^* \big\{ s\big(\theta; y_{ij}^*\big) \big\}^{\otimes 2} - \{ \bar{s}_i(\theta) \}^{\otimes 2} \Big]$$
and $S^{[k]}(\theta) = \sum_i w_i^{[k]} \bar{s}_i(\theta)$ with $\bar{s}_i(\theta) = \sum_j w_{ij}^* s(\theta; y_{ij}^*)$. The $\hat{I}_{\mathrm{obs}}^{[k]}(\theta)$ is a replicated version of the observed information matrix proposed by Louis (1982).

APPENDIX C. A note on multiple imputation for a proportion

Assume that we have a random sample of size $n$ with observations $(x_i, y_i)$ obtained from a bivariate normal distribution. The parameter of interest is a proportion, for example $\eta = \mathrm{pr}(y < 3)$. An unbiased estimator of $\eta$ is
$$\hat\eta_n = \frac{1}{n} \sum_{i=1}^n I(y_i < 3). \qquad (C.1)$$
Note that $\hat\eta_n$ is unbiased but has bigger variance than the maximum likelihood estimator (27) of $\eta$. For simplicity, assume that the first $r\,(< n)$ elements have both $x_i$ and $y_i$ observed, but the last $n - r$ elements have only $x_i$ observed and $y_i$ missing. In this situation, an efficient imputation method such as
$$y_i^* \sim N\big( \hat\beta_0 + x_i \hat\beta_1, \hat\sigma_e^2 \big) \qquad (C.2)$$
can be used, where $\hat\beta_0$, $\hat\beta_1$ and $\hat\sigma_e^2$ are computed from the respondents. In multiple imputation, the parameter estimates are generated from a posterior distribution given the observations. Under the imputation mechanism (C.2), the imputed estimator of $\mu_2$ of the form $\hat\mu_{2,I} = n^{-1}\{ \sum_{i=1}^r y_i + \sum_{i=r+1}^n y_i^* \}$ satisfies
$$\mathrm{var}\big( \hat\mu_{2,FE} \big) = \mathrm{var}( \bar{y}_n ) + \mathrm{var}\big( \hat\mu_{2,FE} - \bar{y}_n \big), \qquad (C.3)$$
where $\hat\mu_{2,FE} = E_I( \hat\mu_{2,I} )$. Condition (C.3) is the congeniality condition of Meng (1994).

Now, for $\eta = \mathrm{pr}(y < 3)$, the imputed estimator of $\eta$ based on $\hat\eta_n$ in (C.1) is
$$\hat\eta_I = \frac{1}{n} \Big\{ \sum_{i=1}^r I(y_i < 3) + \sum_{i=r+1}^n I(y_i^* < 3) \Big\}. \qquad (C.4)$$
The expected value of $\hat\eta_I$ over the imputation mechanism is
$$E_I(\hat\eta_I) = \frac{1}{n} \Big\{ \sum_{i=1}^r I(y_i < 3) + \sum_{i=r+1}^n \mathrm{pr}\big( y_i < 3 \mid x_i, \hat\theta \big) \Big\} = \hat\eta_{FE} + \frac{1}{n} \sum_{i=1}^r \big\{ I(y_i < 3) - \mathrm{pr}\big( y_i < 3 \mid x_i, \hat\theta \big) \big\},$$
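The comparison in this appendix is easy to reproduce numerically. The sketch below is a hypothetical illustration, not the paper's simulation code: the sample sizes, regression coefficients, and variable names are all assumptions. It generates a bivariate normal sample, fits the regression on the $r$ respondents, and computes the complete-sample estimator $\hat\eta_n$ alongside the model-based estimator $\hat\eta_{FE} = n^{-1}\sum_{i=1}^n \mathrm{pr}(y_i < 3 \mid x_i, \hat\theta)$.

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

rng = np.random.default_rng(1)
n, r, c = 500, 300, 3.0                       # assumed sizes and threshold
x = rng.normal(2.0, 1.0, n)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, n)   # (x_i, y_i) bivariate normal

# Fit the regression imputation model (C.2) on the r respondents only
X = np.column_stack([np.ones(r), x[:r]])
beta, *_ = np.linalg.lstsq(X, y[:r], rcond=None)
sigma = np.sqrt(np.sum((y[:r] - X @ beta) ** 2) / (r - 2))

# Complete-sample estimator eta_hat_n, feasible only under full response
eta_n = np.mean(y < c)

# Model-based estimator eta_hat_FE: average pr(y_i < c | x_i, theta_hat)
mu = beta[0] + beta[1] * x
eta_fe = np.mean([norm_cdf((c - m) / sigma) for m in mu])
```

Over repeated samples, $\hat\eta_{FE}$ tracks $\hat\eta_n$ closely while using the $y$-values of the respondents only; as the appendix notes, $\hat\eta_{FE}$ is the estimator whose variance the multiple imputation variance estimator targets for large $M$.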

where $\hat\eta_{FE} = n^{-1} \sum_{i=1}^n \mathrm{pr}\big( y_i < 3 \mid x_i, \hat\theta \big)$. For the proportion, $\hat\eta_{FE} \neq E_I(\hat\eta_I)$ and so the congeniality condition does not hold. In fact,
$$\mathrm{var}\{ E_I(\hat\eta_I) \} < \mathrm{var}( \hat\eta_n ) + \mathrm{var}\{ E_I(\hat\eta_I) - \hat\eta_n \},$$
and the multiple imputation variance estimator overestimates the variance of $\hat\eta_I$ in (C.4). Instead, we have $\mathrm{var}( \hat\eta_{FE} ) = \mathrm{var}( \hat\eta_n ) + \mathrm{var}( \hat\eta_{FE} - \hat\eta_n )$, and the multiple imputation variance estimator estimates the variance of $\hat\eta_{FE}$ for sufficiently large $M$.

REFERENCES

BOOTH, J. G. & HOBERT, J. P. (1999). Maximizing generalized linear mixed models with an automated Monte Carlo EM algorithm. J. R. Statist. Soc. B 61.
CLAYTON, D., SPIEGELHALTER, D., DUNN, G. & PICKLES, A. (1998). Analysis of longitudinal binary data from multiphase sampling. J. R. Statist. Soc. B 60.
DEMPSTER, A. P., LAIRD, N. M. & RUBIN, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. R. Statist. Soc. B 39.
FAY, R. E. (1992). When are inferences from multiple imputation valid? In Proc. Survey Res. Meth. Sect., Am. Statist. Assoc. Washington, DC: American Statistical Association.
GELMAN, A., MENG, X.-L. & STERN, H. (1996). Posterior predictive assessment of model fitness via realized discrepancies (with discussion). Statist. Sinica 6.
HENMI, M., YOSHIDA, R. & EGUCHI, S. (2007). Importance sampling via the estimated sampler. Biometrika 94.
IBRAHIM, J. G. (1990). Incomplete data in generalized linear models. J. Am. Statist. Assoc. 85.
KIM, J. K., BRICK, M. J., FULLER, W. A. & KALTON, G. (2006). On the bias of the multiple imputation variance estimator in survey sampling. J. R. Statist. Soc. B 68.
KIM, J. K. & FULLER, W. A. (2004). Fractional hot deck imputation. Biometrika 91.
KIM, J. K. & RAO, J. N. K. (2009). Unified approach to linearization variance estimation from survey data after imputation for item nonresponse. Biometrika, in press (available through Biometrika Advance Access, doi:10.1093/biomet/asp041).
LOUIS, T. A. (1982). Finding the observed information matrix when using the EM algorithm. J. R. Statist. Soc. B 44.
MENG, X.-L. (1994). Multiple-imputation inferences with uncongenial sources of input (with discussion). Statist. Sci. 9.
PEPE, M. S. (1992). Inference using surrogate outcome data and a validation sample. Biometrika 79.
RANDLES, R. H. (1982). On the asymptotic normality of statistics with estimated parameters. Ann. Statist. 10.
ROBINS, J. M. & WANG, N. (2000). Inference for imputation estimators. Biometrika 87.
RUBIN, D. B. (1976). Inference and missing data. Biometrika 63.
RUBIN, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.
SUNG, Y. J. & GEYER, C. J. (2007). Monte Carlo likelihood inference for missing data models. Ann. Statist. 35.
WANG, N. & ROBINS, J. M. (1998). Large-sample theory for parametric multiple imputation procedures. Biometrika 85.
WEI, G. C. G. & TANNER, M. A. (1990). A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms. J. Am. Statist. Assoc. 85.
WU, C. F. J. (1983). On the convergence properties of the EM algorithm. Ann. Statist. 11.

[Received January????. Revised July????]


More information

POSTERIOR ANALYSIS OF THE MULTIPLICATIVE HETEROSCEDASTICITY MODEL

POSTERIOR ANALYSIS OF THE MULTIPLICATIVE HETEROSCEDASTICITY MODEL COMMUN. STATIST. THEORY METH., 30(5), 855 874 (2001) POSTERIOR ANALYSIS OF THE MULTIPLICATIVE HETEROSCEDASTICITY MODEL Hisashi Tanizaki and Xingyuan Zhang Faculty of Economics, Kobe University, Kobe 657-8501,

More information

Bayesian Linear Models

Bayesian Linear Models Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics, School of Public

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Robustness to Parametric Assumptions in Missing Data Models

Robustness to Parametric Assumptions in Missing Data Models Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice

More information

Bootstrap inference for the finite population total under complex sampling designs

Bootstrap inference for the finite population total under complex sampling designs Bootstrap inference for the finite population total under complex sampling designs Zhonglei Wang (Joint work with Dr. Jae Kwang Kim) Center for Survey Statistics and Methodology Iowa State University Jan.

More information

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore

More information

The propensity score with continuous treatments

The propensity score with continuous treatments 7 The propensity score with continuous treatments Keisuke Hirano and Guido W. Imbens 1 7.1 Introduction Much of the work on propensity score analysis has focused on the case in which the treatment is binary.

More information

Bayesian inference for multivariate extreme value distributions

Bayesian inference for multivariate extreme value distributions Bayesian inference for multivariate extreme value distributions Sebastian Engelke Clément Dombry, Marco Oesting Toronto, Fields Institute, May 4th, 2016 Main motivation For a parametric model Z F θ of

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

An Akaike Criterion based on Kullback Symmetric Divergence in the Presence of Incomplete-Data

An Akaike Criterion based on Kullback Symmetric Divergence in the Presence of Incomplete-Data An Akaike Criterion based on Kullback Symmetric Divergence Bezza Hafidi a and Abdallah Mkhadri a a University Cadi-Ayyad, Faculty of sciences Semlalia, Department of Mathematics, PB.2390 Marrakech, Moroco

More information

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X.

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X. Optimization Background: Problem: given a function f(x) defined on X, find x such that f(x ) f(x) for all x X. The value x is called a maximizer of f and is written argmax X f. In general, argmax X f may

More information

Bayesian Linear Models

Bayesian Linear Models Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department

More information

F & B Approaches to a simple model

F & B Approaches to a simple model A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys

More information

Short course on Missing Data

Short course on Missing Data Short course on Missing Data Demetris Athienitis University of Florida Department of Statistics Contents An Introduction to Missing Data....................................... 5. Introduction 5.2 Types

More information

REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY

REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY J.D. Opsomer, W.A. Fuller and X. Li Iowa State University, Ames, IA 50011, USA 1. Introduction Replication methods are often used in

More information

Calibration Estimation for Semiparametric Copula Models under Missing Data

Calibration Estimation for Semiparametric Copula Models under Missing Data Calibration Estimation for Semiparametric Copula Models under Missing Data Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Economics and Economic Growth Centre

More information

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

Multiple QTL mapping

Multiple QTL mapping Multiple QTL mapping Karl W Broman Department of Biostatistics Johns Hopkins University www.biostat.jhsph.edu/~kbroman [ Teaching Miscellaneous lectures] 1 Why? Reduce residual variation = increased power

More information