Parametric fractional imputation for missing data analysis
Biometrika (????), ??, ?, pp. ???. © ???? Biometrika Trust. Printed in Great Britain.

Parametric fractional imputation for missing data analysis

BY JAE KWANG KIM
Department of Statistics, Iowa State University, Ames, Iowa 50011, U.S.A.
jkim@iastate.edu

SUMMARY

Imputation is a popular method of handling missing data in practice. Imputation, when carefully done, can be used to facilitate parameter estimation by applying complete-sample estimators to the imputed dataset. The basic idea is to generate the imputed values from the conditional distribution of the missing data given the observed data. In this article, parametric fractional imputation is proposed for generating imputed values. Using fractional weights, the observed likelihood can be approximated by the weighted mean of the imputed-data likelihood. Some computational efficiency can be achieved using the ideas of importance sampling and calibration weighting. The proposed imputation method provides efficient estimates for the model parameters specified in the imputation model and also provides reasonable estimates for parameters that are not part of the imputation model. Variance estimation is covered and results from a limited simulation study are presented.

Some key words: EM algorithm; General purpose estimation; Importance sampling; Monte Carlo EM; Multiple imputation.

1. INTRODUCTION

Suppose that y_1, y_2, ..., y_n are the p-dimensional observations for a probability sample selected from a finite population, where the finite population is a realization from a distribution F_0(y). Suppose that, under complete response, a parameter of interest, denoted by \eta_g = \int g(y) \, dF_0(y), is estimated by

    \hat\eta_g = \sum_{i=1}^n w_i \, g(y_i)    (1)

for some function g(y_i), where w_i is the sampling weight. Under simple random sampling, the sampling weight is 1/n and the sample can be regarded as a random sample from an infinite population with distribution F_0(y).
In (1), different choices of the g-function lead to estimators of different parameters. For example, g(y_i) = y_i can be used to estimate the population mean, while g(y_i) = I(y_i < c) corresponds to the population proportion of y less than c, where I(y < c) takes the value one if y < c and the value zero otherwise. The existence of an unbiased estimator (1) for any g-function is very attractive for general-purpose estimation, where different users want to estimate different parameters from the same sample, as with public-access data. That is, in general-purpose estimation, the parameter \eta_g is unknown at the time of data collection.
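The weighted estimator (1) and the two g-functions mentioned above can be sketched in a few lines. This is a minimal illustration, not code from the paper; the data values are made up, and under simple random sampling w_i = 1/n.

```python
import numpy as np

def eta_hat(y, w, g):
    """Weighted estimator (1): sum_i w_i g(y_i)."""
    return np.sum(w * g(y))

# Toy sample under simple random sampling: w_i = 1/n
y = np.array([1.0, 2.0, 3.0, 4.0])
w = np.full(4, 1 / 4)

mean_est = eta_hat(y, w, lambda y: y)          # g(y) = y: population mean
prop_est = eta_hat(y, w, lambda y: (y < 3.0))  # g(y) = I(y < c): proportion below c = 3
```

Different users can plug different g-functions into the same weighted data, which is the general-purpose-estimation point made above.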
Given missing data, we cannot directly compute the estimator (1). Instead, one can use

    \hat\eta_{gR} = \sum_{i=1}^n w_i E\{ g(y_i) \mid y_{i,obs} \},    (2)

where (y_{i,obs}, y_{i,mis}) denote the observed part and the missing part of y_i, respectively. To simplify the presentation, we assume that the sampling mechanism and the response mechanism are ignorable in the sense of Rubin (1976). To compute the conditional expectation in (2), we need a correct specification of the conditional distribution of y_{i,mis} given y_{i,obs}. We adopt a parametric approach in the sense that we assume a fully parametric model for y_i, so that the conditional distribution of y_{i,mis} given y_{i,obs} can be derived from the joint distribution for an arbitrary missing pattern. Let F_0 \in \{ F_\theta ; \theta \in \Omega \} be the parametric model indexed by \theta. Then the conditional expectation in (2) depends on \theta_0, where \theta_0 is the true parameter value corresponding to F_0. That is, E\{ g(y_i) \mid y_{i,obs} \} = E\{ g(y_i) \mid y_{i,obs}, \theta_0 \}.

To compute the conditional expectation in (2), a Monte Carlo approximation based on imputed data can be used. Thus, one can interpret imputation as a Monte Carlo approximation of the conditional expectation given the observed data. Imputation is very attractive in practice because, once the imputed data are created, the data analyst does not need to know the conditional distribution in (2). Monte Carlo methods of approximating the conditional expectation in (2) can be placed in two classes:

1. Bayesian approach: generate the imputed values from the posterior predictive distribution of y_{i,mis} given y_{i,obs}:

    f(y_{i,mis} \mid y_{i,obs}) = \int f(y_{i,mis} \mid \theta, y_{i,obs}) f(\theta \mid y_{i,obs}) \, d\theta.    (3)

This is essentially the approach used in multiple imputation as proposed by Rubin (1987).

2. Frequentist approach: generate the imputed values from the conditional distribution f(y_{i,mis} \mid y_{i,obs}, \hat\theta) with an estimated value \hat\theta.
In the Bayesian approach to imputation, convergence to a stable posterior predictive distribution (3) is difficult to check (Gelman et al., 1996). Also, the variance estimator used in multiple imputation is not consistent for some estimated parameters; for examples, see Wang & Robins (1998) and Kim et al. (2006). The frequentist approach to imputation has received less attention than the Bayesian approach. One notable exception is Wang & Robins (1998), who studied the asymptotic properties of multiple imputation and of a parametric frequentist imputation procedure. Wang & Robins (1998) considered the estimated parameter \hat\theta to be given, and did not discuss parameter estimation. We consider frequentist imputation given a parametric model for the original distribution. Using the idea of importance sampling, we propose a frequentist imputation method that can be implemented as the fractional imputation discussed by Kim & Fuller (2004). The proposed fractional imputation method can be further modified to remove the Monte Carlo error without increasing the size of the Monte Carlo samples. The proposed method uses a calibration technique to obtain the exact maximum likelihood estimator with a modest imputation size. In § 2, the parametric fractional imputation method is described. Asymptotic properties of the fractionally imputed estimator are discussed in § 3. Maximum likelihood estimation using fractional imputation is discussed in § 4. Results from a limited simulation study are presented in § 5. Concluding remarks are made in § 6.
2. FRACTIONAL IMPUTATION

As discussed in § 1, we consider an approximation for the conditional expectation in (2) using fractional imputation. In fractional imputation, M (> 1) imputed values of y_{i,mis}, say y^{(1)}_{i,mis}, ..., y^{(M)}_{i,mis}, are generated and assigned fractional weights w^*_{i1}, ..., w^*_{iM}, so that

    \sum_{j=1}^M w^*_{ij} g(y^*_{ij}) \cong E\{ g(y_i) \mid y_{i,obs}, \hat\theta \},    (4)

where y^*_{ij} = (y_{i,obs}, y^{(j)}_{i,mis}), holds at least asymptotically for each i, where \hat\theta is a consistent estimator of \theta_0. A popular choice for \hat\theta is the (pseudo) maximum likelihood estimator, which can be obtained by finding \theta that maximizes the (pseudo) log-likelihood function. That is,

    \hat\theta = \arg\max_{\theta \in \Omega} \sum_{i=1}^n w_i \ln f_{obs(i)}(y_{i,obs}; \theta),    (5)

where f_{obs(i)}(y_{i,obs}; \theta) = \int f(y_i; \theta) \, dy_{i,mis} is the marginal density of y_{i,obs}. In the marginal density of y_{i,obs}, the subscript i is used because the missing pattern can be different for each i. A computationally simple method of finding the maximum likelihood estimator is discussed in § 4. Condition (4) applied to g(y_i) = c implies that

    \sum_{j=1}^M w^*_{ij} = 1    (6)

for all i. From the fractionally imputed data satisfying (4) and (6), the parameter \eta_g can be estimated by

    \hat\eta_{FI,g} = \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij} g(y^*_{ij}).    (7)

Note that the imputed estimator (7) is obtained by applying the formula (1) using the y^*_{ij} as the observations with weights w_i w^*_{ij}. The size of the imputed dataset can be as large as nM. For a single parameter \eta_g = E\{ g(y) \}, any fractional imputation satisfying (4) can provide a consistent estimator of \eta_g. For general-purpose estimation, the g-function in \eta_g is unknown at the time of imputation (Fay, 1992). In this case, we cannot expect (4) to hold for any g-function based on a finite number of imputations, with the exception of categorical data. For categorical data, there are a finite number of possible values for y_{i,mis}.
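For categorical data, the fractional weight attached to each candidate value of the missing part is its conditional probability given the observed part. A minimal sketch, assuming a made-up bivariate binary model in which y = (y_1, y_2), y_1 is observed, and y_2 is missing:

```python
import numpy as np

# Hypothetical joint pmf f(y1, y2; theta_hat) for binary (y1, y2) — toy numbers
joint = np.array([[0.3, 0.2],   # f(y1=0, y2=0), f(y1=0, y2=1)
                  [0.1, 0.4]])  # f(y1=1, y2=0), f(y1=1, y2=1)

def fractional_weights_categorical(y1_obs):
    """Conditional probability of each candidate y2 given the observed y1:
    w*_ij = f(y*_ij; theta_hat) / sum_k f(y*_ik; theta_hat)."""
    row = joint[y1_obs]        # joint density at each of the M_i candidate values
    return row / row.sum()     # normalize over candidates, so the weights sum to one

w = fractional_weights_categorical(0)   # y1 = 0 observed, y2 missing
# w = [0.3, 0.2] / 0.5 = [0.6, 0.4]
```

The two candidate values of y_2 are retained as imputed values with weights 0.6 and 0.4, which is the exhaustive-support case where (4) holds exactly for every g-function.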
In this case, we take the possible values as the imputed values and compute the conditional probability of y_{i,mis} by the Bayes formula

    p( y^{(j)}_{i,mis} \mid y_{i,obs}, \hat\theta ) = \frac{ f( y^*_{ij}; \hat\theta ) }{ \sum_{k=1}^{M_i} f( y^*_{ik}; \hat\theta ) },

where f(y_i; \theta) is the joint density of y_i evaluated at \theta and M_i is the number of possible values of y_{i,mis}. Note that the choice w^*_{ij} = p( y^{(j)}_{i,mis} \mid y_{i,obs}, \hat\theta ) satisfies (4) and (6). Fractional imputation for categorical data using these weights, which is close in spirit to the EM by weighting method of Ibrahim (1990), is discussed in Kim & Rao (2009). For a continuous random variable y_i, condition (4) can be approximately satisfied using importance sampling, where y^{(1)}_{i,mis}, ..., y^{(M)}_{i,mis} are independently generated from a distribution
with density h(y_{i,mis}) that has the same support as f(y_{i,mis} \mid y_{i,obs}, \theta) for all \theta \in \Omega, and the corresponding fractional weights are computed by

    w^*_{ij0} = w^*_{ij0}(\hat\theta) = C_i \frac{ f( y^{(j)}_{i,mis} \mid y_{i,obs}; \hat\theta ) }{ h( y^{(j)}_{i,mis} ) },    (8)

where C_i is chosen to satisfy (6). If h(y_{i,mis}) = f(y_{i,mis} \mid y_{i,obs}, \hat\theta) is used, then w^*_{ij0} = M^{-1}.

REMARK 1. Under mild conditions, \bar g_i = \sum_{j=1}^M w^*_{ij0} g(y^*_{ij}) with w^*_{ij0} in (8) converges to \bar g_i(\hat\theta) \equiv E\{ g(y_i) \mid y_{i,obs}, \hat\theta \} with probability 1 as M \to \infty, and has approximate variance \sigma_i^2 / M, where

    \sigma_i^2 = E\Big[ \frac{ f( y_{i,mis} \mid y_{i,obs}, \hat\theta ) }{ h( y_{i,mis} ) } \{ g(y_i) - \bar g_i(\hat\theta) \}^2 \,\Big|\, y_{i,obs}, \hat\theta \Big].

The optimal choice of h(y_{i,mis}) that minimizes \sigma_i^2 is

    h^*( y_{i,mis} ) = \frac{ f( y_{i,mis} \mid y_{i,obs}, \hat\theta ) \, | g(y_i) - \bar g_i(\hat\theta) | }{ E\{ | g(y_i) - \bar g_i(\hat\theta) | \mid y_{i,obs}, \hat\theta \} }.    (9)

When the g-function is unknown, h(y_{i,mis}) = f(y_{i,mis} \mid y_{i,obs}, \hat\theta) is a reasonable choice in terms of statistical efficiency. However, other choices of h(y_{i,mis}) can have better computational efficiency in some situations.

For public access data, creating the fractional imputation with a large imputation size is not feasible. We propose an approximation with a moderate imputation size, say M = 10. To formally describe the procedure, let y^{(1)}_{i,mis}, ..., y^{(M)}_{i,mis} be independently generated from a distribution with density h(y_{i,mis}). Given the imputed values, it remains to compute fractional weights that satisfy (4) and (6) as closely as possible. The proposed fractional weights are computed in two steps. In the first step, the initial fractional weights are computed by (8). In the second step, the initial fractional weights are adjusted to satisfy (6) and

    \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij} s( \hat\theta; y^*_{ij} ) = 0,    (10)

where s(\theta; y) = \partial \ln f(y; \theta) / \partial\theta is the score function of \theta. Adjusting the initial weights to satisfy a constraint is often called calibration. As can be seen in § 3, constraint (10) is a calibration constraint making the resulting imputed estimator \hat\eta_{FI,g} in (7) fully efficient for \eta_g when \eta_g is a linear function of \theta.
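The initial fractional weights (8) are self-normalized importance-sampling weights. A minimal numerical sketch for one unit, assuming toy densities (proposal h = N(0, 2²), target conditional f(y_mis | y_obs, theta_hat) = N(1, 1)), rather than any model from the paper:

```python
import numpy as np

def norm_pdf(x, mu, sigma):
    # normal density with mean mu and standard deviation sigma
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(0)
M = 4000

# Draw M imputed values from the proposal h (assumed N(0, 2^2))
y_mis = rng.normal(0.0, 2.0, size=M)

# Density ratio f / h as in (8); the constant C_i is the normalization below
w0 = norm_pdf(y_mis, 1.0, 1.0) / norm_pdf(y_mis, 0.0, 2.0)
w0 /= w0.sum()          # enforces (6): the fractional weights sum to one

cond_mean = np.sum(w0 * y_mis)  # approximates E(y_mis | y_obs, theta_hat) = 1
```

With h equal to the target conditional density the weights reduce to 1/M, matching the remark after (8).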
To construct fractional weights satisfying (6) and (10), the regression weighting technique or the empirical likelihood technique can be used. For example, in the regression weighting technique, the fractional weights are constructed by

    w^*_{ij} = w^*_{ij0} - \Big( \sum_{i=1}^n w_i \bar s_i \Big)^T \Big\{ \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij0} ( s^*_{ij} - \bar s_i )^{\otimes 2} \Big\}^{-1} w^*_{ij0} ( s^*_{ij} - \bar s_i ),    (11)

where w^*_{ij0} is the initial fractional weight (8) using importance sampling, \bar s_i = \sum_{j=1}^M w^*_{ij0} s^*_{ij}, B^{\otimes 2} = B B^T, and s^*_{ij} = s( \hat\theta; y^*_{ij} ). Here, M need not be large.
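The regression adjustment in (11) can be illustrated for a single unit with a scalar score, where it reduces to a one-dimensional regression of the weights on the centered scores. This is a simplified sketch with made-up numbers, not the full multi-unit, vector-parameter adjustment:

```python
import numpy as np

def calibrate_weights(w0, s):
    """Regression-adjust initial weights w0_j (summing to one) so that
    sum_j w_j s_j = 0 while preserving sum_j w_j = 1 (scalar-score version of (11))."""
    sbar = np.sum(w0 * s)                    # weighted mean score
    var_s = np.sum(w0 * (s - sbar) ** 2)     # weighted score variance
    return w0 - w0 * (s - sbar) * sbar / var_s

w0 = np.full(5, 0.2)                         # initial weights from (8), toy values
s = np.array([-1.0, 0.5, 1.0, 2.0, -0.5])    # score values s(theta_hat; y*_j), toy values
w = calibrate_weights(w0, s)
```

Because the correction term is a multiple of w0_j (s_j − s̄), the adjusted weights automatically retain sum one, and the multiplier is chosen so the weighted score is exactly zero, mirroring constraints (6) and (10).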
If the distribution belongs to the exponential family of the form

    f(y; \theta) = \exp\{ t(y)^T \theta + \phi(\theta) + A(y) \},    (12)

then (10) reduces to \sum_{i=1}^n \sum_{j=1}^M w_i w^*_{ij} \{ t( y^*_{ij} ) + \dot\phi( \hat\theta ) \} = 0, where \dot\phi(\theta) = \partial \phi(\theta) / \partial\theta. In this case, calibration need only be applied to the complete sufficient statistics.

3. ASYMPTOTIC RESULTS

In this section, we discuss some asymptotic properties of the fractionally imputed estimator (7). We consider two types of fractionally imputed estimators: one obtained by using the initial fractional weights in (8), and the other obtained by using calibration fractional weights such as (11). Note that the imputed estimator \hat\eta_{FI,g} in (7) is a function of n and M, where n is the sample size and M is the imputation size for fractional imputation. Thus, we use \hat\eta_{g0,n,M} and \hat\eta_{g1,n,M} to denote the fractionally imputed estimator (7) using the initial fractional weights in (8) and the fractionally imputed estimator (7) using the calibration fractional weights in (11), respectively. The following theorem presents some asymptotic properties of the fractionally imputed estimators. The proof is presented in Appendix A.

THEOREM 1. Let \hat\eta_{g0,n,M} and \hat\eta_{g1,n,M} be fractionally imputed estimators of \eta_g of the form (7) using M fractions, where the fractional weights are constructed by (8) and by (11), respectively, and the imputed values are generated from h(y_{i,mis}).
Under some regularity conditions,

    ( \hat\eta_{g0,n,M} - \eta_g ) / \sigma_{g0,n,M} \to_L N(0, 1)    (13)

and

    ( \hat\eta_{g1,n,M} - \eta_g ) / \sigma_{g1,n,M} \to_L N(0, 1)    (14)

for each M > 1, where

    \sigma^2_{g0,n,M} = var\Big[ \sum_{i=1}^n w_i \{ \bar g_i(\theta_0) + K_1^T \bar s_i(\theta_0) \} \Big],

    \sigma^2_{g1,n,M} = var\Big[ \sum_{i=1}^n w_i \{ \bar g_i(\theta_0) + K_1^T \bar s_i(\theta_0) + B^T ( \bar s_i(\theta_0) - \hat s_i(\theta_0) ) \} \Big],

\bar g_i(\theta) = \sum_{j=1}^M w^*_{ij0}(\theta) g( y^*_{ij} ), \bar s_i(\theta) = E_\theta\{ s(\theta; y_i) \mid y_{i,obs} \}, \hat s_i(\theta) = \sum_{j=1}^M w^*_{ij0}(\theta) s( \theta; y^*_{ij} ),

    K_1 = \{ I_{obs}(\theta_0) \}^{-1} I_{g,mis}(\theta_0),    (15)

    B = \{ I_{mis}(\theta_0) \}^{-1} I_{g,mis}(\theta_0),

I_{obs}(\theta) = -E\{ \sum_{i=1}^n w_i \, \partial \bar s_i(\theta) / \partial\theta \}, I_{g,mis}(\theta) = E[ \sum_{i=1}^n w_i \{ s(\theta; y_i) - \bar s_i(\theta) \} g(y_i) ], and I_{mis}(\theta) = E[ \sum_{i=1}^n w_i \{ s(\theta; y_i) - \bar s_i(\theta) \}^{\otimes 2} ].

In Theorem 1, note that

    \sigma^2_{g0,n,M} = \sigma^2_{g1,n,M} + B^T var\Big\{ \sum_{i=1}^n w_i ( \hat s_i - \bar s_i ) \Big\} B,

and the last term represents the reduction in the variance of the fractional imputation estimator of \eta_g due to the calibration in (10). Thus, \sigma^2_{g0,n,M} \ge \sigma^2_{g1,n,M}, with equality for M = \infty. Clayton
et al. (1998) and Robins & Wang (2000) also proved results similar to (13) for the special case of M = \infty.

For variance estimation of the fractional imputation estimator, let \hat V( \hat\eta_g ) = \sum_{i=1}^n \sum_{j=1}^n \Omega_{ij} g(y_i) g(y_j) be a consistent estimator for the variance of \hat\eta_g = \sum_{i=1}^n w_i g(y_i) under complete response, where the \Omega_{ij} are coefficients. Under simple random sampling, \Omega_{ij} = -1/\{ n^2 (n-1) \} for i \ne j and \Omega_{ii} = 1/n^2. For sufficiently large M, using the results in Theorem 1, a consistent estimator for the variance of \hat\eta_{FI,g} in (7) is

    \hat V( \hat\eta_{FI,g} ) = \sum_{i=1}^n \sum_{j=1}^n \Omega_{ij} \bar e_i \bar e_j,    (16)

where \bar e_i = \bar g_i(\hat\theta) + \hat K_1^T \hat s_i(\hat\theta) = \sum_{j=1}^M w^*_{ij0} \hat e_{ij}, \hat e_{ij} = g( y^*_{ij} ) + \hat K_1^T s( \hat\theta; y^*_{ij} ), and

    \hat K_1 = \Big\{ \sum_{i=1}^n w_i \hat s_i(\hat\theta) \hat s_i(\hat\theta)^T \Big\}^{-1} \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij} \{ s( \hat\theta; y^*_{ij} ) - \hat s_i \} g( y^*_{ij} ).

For moderate size M, the expected value of the variance estimator (16) can be written

    E\{ \hat V( \hat\eta_{FI,g} ) \} = E\Big\{ \sum_i \sum_j \Omega_{ij} \bar e'_i \bar e'_j \Big\} + E\Big\{ \sum_i \sum_j \Omega_{ij} \, cov_I( \bar e_i, \bar e_j ) \Big\},

where \bar e'_i = E_I( \bar e_i ) and the subscript I denotes the expectation over the imputation mechanism generating y^{(j)}_{i,mis} from h(y_{i,mis}). If the imputed values are generated independently, we have cov_I( \bar e_i, \bar e_j ) = 0 for i \ne j and cov_I( \bar e_i, \bar e_i ) = var_I( \bar e_i ), and, using the argument for Remark 1, var_I( \bar e_i ) can be estimated by \hat V_{Ii,e} = \sum_{j=1}^M ( w^*_{ij0} )^2 ( \hat e_{ij} - \bar e_i )^2. Thus, an unbiased estimator for \sigma^2_{g0,n,M} is

    \hat\sigma^2_{g0,n,M} = \sum_i \sum_j \Omega_{ij} \bar e_i \bar e_j - \sum_i \Omega_{ii} \hat V_{Ii,e} + \sum_i w_i^2 \hat V_{Ii,g},    (17)

where \hat V_{Ii,g} = \sum_{j=1}^M ( w^*_{ij0} )^2 ( g^*_{ij} - \bar g_i )^2. Using an argument similar to (17), it can be shown that

    \hat\sigma^2_{g1,n,M} = \sum_i \sum_j \Omega_{ij} \bar e_i \bar e_j - \sum_i \Omega_{ii} \hat V_{Ii,e} + \sum_i w_i^2 \hat V_{Ii,u}    (18)

is an unbiased estimator of \sigma^2_{g1,n,M}, where \hat V_{Ii,u} = \sum_{j=1}^M ( w^*_{ij0} )^2 ( u_{ij} - \bar u_i )^2 is an unbiased estimator of var_I( \bar u_i ), u_{ij} = g( y^*_{ij} ) - \hat B^T s( \hat\theta; y^*_{ij} ), \bar u_i = \sum_{j=1}^M w^*_{ij0} u_{ij}, and

    \hat B = \Big\{ \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij0} ( s^*_{ij} - \hat s_i )^{\otimes 2} \Big\}^{-1} \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij0} ( s^*_{ij} - \hat s_i ) g( y^*_{ij} ).
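The quadratic form \sum_{ij} \Omega_{ij} e_i e_j with the simple-random-sampling coefficients above is just the usual estimator of the variance of a sample mean. A minimal check of that identity, with toy linearized values:

```python
import numpy as np

def srs_quadratic_variance(e):
    """sum_ij Omega_ij e_i e_j with Omega_ii = 1/n^2 and
    Omega_ij = -1/{n^2 (n-1)} for i != j (simple random sampling)."""
    n = len(e)
    sum_sq = np.sum(e ** 2)
    cross = e.sum() ** 2 - sum_sq          # sum over i != j of e_i e_j
    return sum_sq / n ** 2 - cross / (n ** 2 * (n - 1))

e = np.array([1.0, 2.0, 3.0, 4.0])         # toy linearized quantities e_i
v = srs_quadratic_variance(e)
# algebraically identical to sample_variance(e) / n
```

Expanding (1/{n(n-1)}) \sum_i (e_i - \bar e)^2 term by term yields exactly these diagonal and off-diagonal coefficients, which is why the off-diagonal \Omega_{ij} must carry a minus sign.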
Variance estimation with fractionally imputed data can also be performed using the replication method described in Appendix B.

4. MAXIMUM LIKELIHOOD ESTIMATION

In this section, we propose a computationally simple method of obtaining the pseudo maximum likelihood estimator in (5). The pseudo maximum likelihood estimator reduces to the usual maximum likelihood estimator if the sampling design is simple random sampling with w_i = 1/n. Instead of (5), the pseudo maximum likelihood estimator of \theta_0 can be obtained by

    \hat\theta = \arg\max_{\theta \in \Omega} \sum_{i=1}^n w_i E\{ \ln f(y_i; \theta) \mid y_{i,obs}, \hat\theta \}.    (19)

For w_i = 1/n, Dempster et al. (1977) proved that the maximum likelihood estimator in (19) maximizes (5), and proposed the EM algorithm, which computes the solution iteratively by defining \hat\theta^{(t+1)} to be the solution to

    \hat\theta^{(t+1)} = \arg\max_{\theta \in \Omega} \sum_{i=1}^n w_i E\{ \ln f(y_i; \theta) \mid y_{i,obs}, \hat\theta^{(t)} \},    (20)

where \hat\theta^{(t)} is the estimate of \theta obtained at the t-th iteration. To compute the conditional expectation in (20), the Monte Carlo implementation of the EM algorithm of Wei & Tanner (1990) can be used. The Monte Carlo EM method avoids the analytic computation of the conditional expectation in (20) by using a Monte Carlo approximation based on the imputed data. In the Monte Carlo EM method, independent draws of y_{i,mis} are generated from the conditional distribution f( y_{i,mis} \mid y_{i,obs}, \hat\theta^{(t)} ) for each t to approximate the conditional expectation in (20). The Monte Carlo EM method requires heavy computation because the imputed values are re-generated at each iteration t. Also, generating imputed values from f( y_{i,mis} \mid y_{i,obs}, \hat\theta^{(t)} ) can be computationally challenging, since it often requires an iterative algorithm such as the Metropolis–Hastings algorithm at each EM iteration.
To avoid re-generating values from the conditional distribution at each step, we propose the following algorithm for parametric fractional imputation using importance sampling.

[Step 0] Obtain an initial estimator \hat\theta^{(0)} of \theta and set h( y_{i,mis} ) = f( y_{i,mis} \mid y_{i,obs}, \hat\theta^{(0)} ).
[Step 1] Generate M imputed values, y^{(1)}_{i,mis}, ..., y^{(M)}_{i,mis}, from h( y_{i,mis} ).
[Step 2] With the current estimate of \theta, denoted by \hat\theta^{(t)}, compute the fractional weights w^*_{ij(t)} = w^*_{ij0}( \hat\theta^{(t)} ), where w^*_{ij0}( \hat\theta ) is defined in (8).
[Step 3] (Optional) If w^*_{ij(t)} > C/M for some i = 1, 2, ..., n and j = 1, 2, ..., M, then set h( y_{i,mis} ) = f( y_{i,mis} \mid y_{i,obs}, \hat\theta^{(t)} ) and go to Step 1. Increase M if necessary.
[Step 4] Solve the weighted score equation for \theta to get \hat\theta^{(t+1)}, where \hat\theta^{(t+1)} maximizes

    Q^*( \theta \mid \hat\theta^{(t)} ) = \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij(t)} \ln f( y^*_{ij}; \theta )    (21)

over \theta \in \Omega.
[Step 5] Set t = t + 1 and go to Step 2. Stop if \hat\theta^{(t)} meets the convergence criterion.
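The steps above can be sketched end-to-end for a simple normal regression model y | x ~ N(beta_0 + beta_1 x, sigma^2) with y missing completely at random for some units. This is a minimal illustration under assumed toy data, not the paper's simulation; the optional Step 3 rejuvenation is omitted for brevity, and the M-step is weighted least squares, which solves the weighted score equation (22) for this model:

```python
import numpy as np

def norm_pdf(y, mu, sig2):
    return np.exp(-0.5 * (y - mu) ** 2 / sig2) / np.sqrt(2 * np.pi * sig2)

def wls(Xm, ym, wm):
    """Weighted least squares: the M-step for the normal regression model."""
    beta = np.linalg.solve(Xm.T @ (wm[:, None] * Xm), Xm.T @ (wm * ym))
    sig2 = np.sum(wm * (ym - Xm @ beta) ** 2) / np.sum(wm)
    return beta, sig2

rng = np.random.default_rng(1)
n, M = 400, 50
x = rng.normal(2, 1, n)
y = 1 + 0.7 * x + rng.normal(0, 1, n)
r = rng.random(n) < 0.7                       # MCAR response indicator (assumption)
X = np.column_stack([np.ones(n), x])

# Step 0: initial fit from respondents only
beta, sig2 = wls(X[r], y[r], np.ones(r.sum()))

# Step 1: generate M imputed values once, from h = f(y | x; theta_hat^(0))
mis = ~r
h_mu, h_sig2 = X[mis] @ beta, sig2
y_imp = h_mu[:, None] + np.sqrt(h_sig2) * rng.normal(size=(mis.sum(), M))
h_dens = norm_pdf(y_imp, h_mu[:, None], h_sig2)

for t in range(200):
    # Step 2: fractional weights (8) at the current theta; imputed values stay fixed
    f_dens = norm_pdf(y_imp, (X[mis] @ beta)[:, None], sig2)
    w = f_dens / h_dens
    w /= w.sum(axis=1, keepdims=True)         # normalize within each unit, as in (6)
    # Step 4: solve the weighted score equation on the fractionally imputed data
    Xa = np.vstack([X[r], np.repeat(X[mis], M, axis=0)])
    ya = np.concatenate([y[r], y_imp.ravel()])
    wa = np.concatenate([np.ones(r.sum()), w.ravel()])
    beta_new, sig2_new = wls(Xa, ya, wa)
    done = np.max(np.abs(beta_new - beta)) < 1e-8   # Step 5: convergence check
    beta, sig2 = beta_new, sig2_new
    if done:
        break
```

Only the fractional weights change across iterations; the imputed values are drawn once, which is the computational advantage over Monte Carlo EM noted above.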
In Step 0, the initial estimator \hat\theta^{(0)} can be the maximum likelihood estimator obtained by using only the respondents. Steps 1 and 2 correspond to the E-step of the EM algorithm. Step 3 can be used to control the variation of the fractional weights and to avoid extremely large fractional weights. The threshold C/M in Step 3 guarantees that no individual fractional weight is bigger than C times the average of the fractional weights. In Step 4, the value of \theta that maximizes Q^*( \theta \mid \hat\theta^{(t)} ) in (21) can often be obtained by solving

    \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij(t)} s( \theta; y^*_{ij} ) = 0,    (22)

where s(\theta; y) = \partial \ln f(y; \theta) / \partial\theta is the score function of \theta. Thus, the solution can be obtained by applying the complete-sample score equation to the fractionally imputed data. Equation (22) can be called the imputed score equation using fractional imputation. Unlike in the Monte Carlo EM method, the imputed values are not changed at each iteration; only the fractional weights are changed.

REMARK 2. In Step 2, the fractional weights can be computed by using the joint density with the current parameter estimate \hat\theta^{(t)}. Note that w^*_{ij0}(\theta) in (8) can be written

    w^*_{ij0}(\theta) = \frac{ f( y^{(j)}_{i,mis} \mid y_{i,obs}, \theta ) / h( y^{(j)}_{i,mis} ) }{ \sum_{k=1}^M f( y^{(k)}_{i,mis} \mid y_{i,obs}, \theta ) / h( y^{(k)}_{i,mis} ) } = \frac{ f( y^*_{ij}; \theta ) / h( y^{(j)}_{i,mis} ) }{ \sum_{k=1}^M f( y^*_{ik}; \theta ) / h( y^{(k)}_{i,mis} ) },    (23)

which does not require the marginal density in computing the conditional distribution. Only the joint density is needed.

Given the M imputed values y^{(1)}_{i,mis}, ..., y^{(M)}_{i,mis} generated from h( y_{i,mis} ), the sequence of estimators \hat\theta^{(0)}, \hat\theta^{(1)}, ... can be constructed from the parametric fractional imputation using importance sampling. The following theorem presents some convergence properties of the sequence of estimators.

THEOREM 2. Assume that the M imputed values are generated from h( y_{i,mis} ). Let Q^*( \theta \mid \hat\theta^{(t)} ) be the weighted log-likelihood function (21) based on the fractional imputation.
If

    Q^*( \hat\theta^{(t+1)} \mid \hat\theta^{(t)} ) \ge Q^*( \hat\theta^{(t)} \mid \hat\theta^{(t)} ),    (24)

then

    \tilde l_{obs}( \hat\theta^{(t+1)} ) \ge \tilde l_{obs}( \hat\theta^{(t)} ),    (25)

where \tilde l_{obs}(\theta) = \sum_{i=1}^n w_i \ln \tilde f_{obs(i)}( y_{i,obs}; \theta ) and

    \tilde f_{obs(i)}( y_{i,obs}; \theta ) = \frac{ \sum_{j=1}^M f( y^*_{ij}; \theta ) / h( y^{(j)}_{i,mis} ) }{ \sum_{j=1}^M 1 / h( y^{(j)}_{i,mis} ) }.
Proof. By (23) and Jensen's inequality,

    \tilde l_{obs}( \hat\theta^{(t+1)} ) - \tilde l_{obs}( \hat\theta^{(t)} ) = \sum_{i=1}^n w_i \ln \Big\{ \sum_{j=1}^M w^*_{ij(t)} \frac{ f( y^*_{ij}; \hat\theta^{(t+1)} ) }{ f( y^*_{ij}; \hat\theta^{(t)} ) } \Big\}
        \ge \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij(t)} \ln \frac{ f( y^*_{ij}; \hat\theta^{(t+1)} ) }{ f( y^*_{ij}; \hat\theta^{(t)} ) } = Q^*( \hat\theta^{(t+1)} \mid \hat\theta^{(t)} ) - Q^*( \hat\theta^{(t)} \mid \hat\theta^{(t)} ).

Therefore, (24) implies (25).

Note that \tilde l_{obs}(\theta) is an imputed version of the observed pseudo log-likelihood based on the M imputed values y^{(1)}_{i,mis}, ..., y^{(M)}_{i,mis}. Thus, by Theorem 2, the sequence \tilde l_{obs}( \hat\theta^{(t)} ) is monotonically increasing and, under the conditions stated in Wu (1983), the convergence of \hat\theta^{(t)} follows for fixed M. Theorem 2 does not hold for the sequence obtained from the Monte Carlo EM method for fixed M, because the imputed values are re-generated at each E-step of the Monte Carlo EM method, and convergence is very hard to check for the Monte Carlo EM (Booth & Hobert, 1999).

REMARK 3. Sung & Geyer (2007) considered a Monte Carlo maximum likelihood method that directly maximizes \tilde l_{obs}(\theta). Computing the value of \theta that maximizes Q^*( \theta \mid \hat\theta^{(t)} ) is easier than computing the value of \theta that maximizes \tilde l_{obs}(\theta).

5. SIMULATION STUDY

In a simulation study, B = 2,000 Monte Carlo samples of size n = 200 were independently generated from an infinite population specified by x_i \sim N(2, 1); y_i \mid x_i \sim N( \beta_0 + \beta_1 x_i, \sigma_{ee} ) with ( \beta_0, \beta_1, \sigma_{ee} ) = (1, 0.7, 1); z_i \mid (x_i, y_i) \sim Ber( p_i ), where \log\{ p_i / (1 - p_i) \} = \psi_0 + \psi_1 y_i with ( \psi_0, \psi_1 ) = (-3, 1); and \delta_i \mid (x_i, y_i, z_i) \sim Ber( \pi_i ), where \log\{ \pi_i / (1 - \pi_i) \} = 0.5 x_i + 0.8 z_i. Variable y_i is observed if \delta_i = 1 and is not observed if \delta_i = 0. The overall response rate for y is about 76%. Variable x is the covariate for the regression of y on x and is always observed.
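One replicate of the simulation population above can be generated directly from its specification. A minimal sketch of the data-generating step only (seed and variable names are my own; the paper's B = 2,000 replicates and the estimators are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2024)
n = 200

x = rng.normal(2.0, 1.0, n)                       # x_i ~ N(2, 1)
y = 1.0 + 0.7 * x + rng.normal(0.0, 1.0, n)       # y_i | x_i ~ N(1 + 0.7 x_i, 1)
p = 1.0 / (1.0 + np.exp(-(-3.0 + 1.0 * y)))       # logit p_i = -3 + y_i
z = rng.random(n) < p                             # surrogate z_i ~ Ber(p_i)
pi = 1.0 / (1.0 + np.exp(-(0.5 * x + 0.8 * z)))   # logit pi_i = 0.5 x_i + 0.8 z_i
delta = rng.random(n) < pi                        # response indicator delta_i

response_rate = delta.mean()                      # about 0.76 on average over samples
```

Note that the response probability depends on y only through the always-observed surrogate z, which is why incorporating z makes the response mechanism ignorable.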
Variable z is a surrogate for y, in the sense that interest is focused on estimation of the parameter \beta in the regression model P_\beta( y \mid x ), but incorporating the surrogate variable z can improve the efficiency of the resulting estimators (Pepe, 1992). Also note that, ignoring the surrogate variable z, the response mechanism is not ignorable. By incorporating the surrogate variable into the estimation procedure, we can safely ignore the response mechanism.

We are interested in estimating three parameters: \theta_1 = E(y), the marginal mean of y; \theta_2 = \beta_1, the slope in the regression of y on x; and \theta_3 = pr( y < 3 ), the proportion of y less than 3. Under complete response, \theta_1 and \theta_2 are computed by the maximum likelihood method and the proportion \theta_3 is estimated by

    \hat\theta_{3,n} = \frac{1}{n} \sum_{i=1}^n I( y_i < 3 ),    (26)

which is different from the maximum likelihood estimator

    \int_{-\infty}^{3} \frac{1}{\sqrt{\hat\sigma_{yy}}} \, \phi\Big( \frac{ y - \hat\mu_y }{ \sqrt{\hat\sigma_{yy}} } \Big) \, dy,    (27)
where \phi(y) is the density of the standard normal distribution and ( \hat\mu_y, \hat\sigma_{yy} ) is the maximum likelihood estimator of ( \mu_y, \sigma_{yy} ).

Under nonresponse, five estimators were computed:
1. the parametric fractional imputation estimator using w^*_{ij0} in (8), with M = 100 and with M = 10;
2. the calibration fractional imputation estimator using the regression weighting method in (11), with M = 10;
3. multiple imputation, with M = 100 and with M = 10.

In addition to the five imputed estimators, we also computed the complete-sample estimator for comparison. In fractional imputation, M imputed values were independently generated from N( \hat\beta_{0(0)} + \hat\beta_{1(0)} x_i, \hat\sigma_{ee(0)} ), where ( \hat\beta_{0(0)}, \hat\beta_{1(0)}, \hat\sigma_{ee(0)} ) are the regression parameters computed from the respondents only. For each imputed value, we assign the fractional weight

    w^*_{ij(t)} \propto \frac{ \exp\{ -( y^{(j)}_i - \hat\beta_{0(t)} - \hat\beta_{1(t)} x_i )^2 / ( 2 \hat\sigma_{ee(t)} ) \} \exp\{ z_i ( \hat\psi_{0(t)} + \hat\psi_{1(t)} y^{(j)}_i ) \} }{ \exp\{ -( y^{(j)}_i - \hat\beta_{0(0)} - \hat\beta_{1(0)} x_i )^2 / ( 2 \hat\sigma_{ee(0)} ) \} \{ 1 + \exp( \hat\psi_{0(t)} + \hat\psi_{1(t)} y^{(j)}_i ) \} },

where ( \hat\beta_{0(t)}, \hat\beta_{1(t)}, \hat\sigma_{ee(t)}, \hat\psi_{0(t)}, \hat\psi_{1(t)} ) is obtained by the maximum likelihood method using the fractionally imputed data with fractional weights w^*_{ij(t-1)}. In Step 3 of the fractional imputation algorithm for maximum likelihood estimation in § 4, C = 5 was used. In the calibration fractional imputation method, M = 10 values were randomly selected from the M_1 = 100 initial fractionally imputed values by systematic sampling with selection probability proportional to w^*_{ij0} in (8); the regression fractional weights were then computed by (11). In Step 5, the convergence criterion was | \hat\theta^{(t+1)} - \hat\theta^{(t)} | < \epsilon. The average numbers of iterations to convergence were 45 and 35 for M = 100 and M = 10, respectively. In multiple imputation, the imputed values were generated from the posterior predictive distribution iteratively using Gibbs sampling with 100 iterations.
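The per-unit fractional weight used in the simulation combines the current outcome-model density and the surrogate likelihood in the numerator, divided by the proposal density from the initial fit. A minimal numerical sketch for one nonrespondent, with all parameter values and the covariates made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
M = 10
x_i, z_i = 1.8, 1                              # one nonrespondent (hypothetical values)

# Respondent-only initial fit (assumed values)
b0_0, b1_0, s_0 = 0.9, 0.75, 1.1
# Current iteration-t estimates (assumed values)
b0_t, b1_t, s_t, psi0_t, psi1_t = 1.0, 0.7, 1.0, -3.0, 1.0

# Imputed values drawn once from h = N(b0_0 + b1_0 * x_i, s_0)
y_imp = b0_0 + b1_0 * x_i + np.sqrt(s_0) * rng.normal(size=M)

# Numerator: current normal density (up to constants) times Bernoulli surrogate likelihood
num = (np.exp(-0.5 * (y_imp - b0_t - b1_t * x_i) ** 2 / s_t)
       * np.exp(z_i * (psi0_t + psi1_t * y_imp))
       / (1.0 + np.exp(psi0_t + psi1_t * y_imp)))
# Denominator: proposal density (up to constants)
den = np.exp(-0.5 * (y_imp - b0_0 - b1_0 * x_i) ** 2 / s_0)

w = num / den
w /= w.sum()           # normalize so the M fractional weights sum to one
```

Density constants that do not depend on j cancel in the normalization, so only the exponent terms need to be evaluated.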
The Monte Carlo means and the Monte Carlo standardised variances of the complete-sample estimator and the five imputed estimators are presented in Table 1. The standardised variance in Table 1 was computed by dividing the variance of each estimator by that of the complete-sample estimator. The simulation results in Table 1 show that fractional imputation is generally more efficient than multiple imputation, because multiple imputation has additional variability due to generating the parameter values. For estimation of the proportion, it is possible for the multiple imputation estimator with M = 100 to be more efficient than the calibration fractional imputation estimator with M = 10; the differences in efficiency are less than one percent.

In addition to point estimators, variance estimates were computed for each Monte Carlo sample. For variance estimation of fractional imputation, we used the linearized variance estimator discussed in § 3. For variance estimation of the calibration fractional imputation estimator, we used the one-step jackknife variance estimator discussed in Appendix B. For multiple imputation variance estimation, we used the variance formula of Rubin (1987). Table 2 presents the Monte Carlo relative biases and t-statistics for the variance estimators, where the t-statistic tests for zero bias in the variance estimator. The simulation errors for the relative biases of the variance estimators reported in Table 2 are less than 1%. Table 2 shows that, for the fractional imputation estimators, both the linearization and replication methods provide consistent estimates of the variance of the parameter estimates, with t-statistics less than 2 in absolute value. The multiple imputation variance estimators are also unbiased for
Table 1. Monte Carlo means and standardised variances of the imputed estimators. [Rows: \theta_1, \theta_2, \theta_3; methods: Complete Sample, FI (M = 100), FI (M = 10), CFI (M = 10), MI (M = 100), MI (M = 10); columns: Mean, Std Var. Numerical entries not recovered in transcription.] FI, fractional imputation; CFI, calibration fractional imputation; MI, multiple imputation; Std Var, standardised variance.

the variance of the estimators for \theta_1 and \theta_2, which were explicitly considered in the imputation model. For variance estimation of the proportion, the multiple imputation variance estimator shows significant biases (15.53% for M = 10 and % for M = 100). The congeniality condition of Meng (1994) does not hold for the proportion based on the estimator (26); for more details, see Appendix C. Thus, the multiple imputation method in this simulation is congenial for \theta_1 and \theta_2, but it is not congenial for the proportion parameter \theta_3.

6. CONCLUDING REMARKS

Parametric fractional imputation is proposed as a tool for parameter estimation with missing data. Analyses using the fractionally imputed data y^*_{ij} with weights w_i w^*_{ij} can be performed using existing software for the complete sample, treating the imputed values as if observed. Generating the imputed values and computing the fractional weights may require careful modeling and parameter estimation, but the users do not need to know these details for their analyses. The data provider, who has the best available information for the imputation model, can use the imputation model to construct the fractionally imputed data with fractional weights, and possibly with replicated fractional weights for variance estimation. No other information is necessary for the users.
If parametric fractional imputation is used to construct the score function, the solution to the imputed score equation is very close to the maximum likelihood estimator of the parameters in the model. Some computational efficiency relative to the Monte Carlo EM method can be achieved with parametric fractional imputation using the ideas of importance sampling and calibration weighting. Parametric fractional imputation yields direct consistent estimates for parameters that are not part of the imputation model. For example, in the simulation study, the parametric fractional imputation computed from a normal model provides direct estimates of the cumulative distribution function. Thus, the proposed imputation method is a useful tool for general-purpose data analysis, where the parameters of interest are unknown at the time of imputation. Variance estimation can be performed using a linearization method or a replication method. The validity of variance estimation for parametric fractional imputation, unlike that for multiple imputation, does not require the congeniality condition of Meng (1994).

Table 2. Monte Carlo relative biases and t-statistics of the variance estimators. [Rows: var(\hat\theta_1), var(\hat\theta_2), var(\hat\theta_3); methods: Linearization (for FI with M = 100), Linearization (for FI with M = 10), One-step JK (for CFI with M = 10), MI (M = 100), MI (M = 10); columns: Relative Bias (%), t-statistic. Numerical entries not recovered in transcription.] FI, fractional imputation; CFI, calibration fractional imputation; MI, multiple imputation; JK, jackknife.

ACKNOWLEDGEMENT

The author wishes to thank Professor Wayne Fuller, the referees, and the Editor for their very helpful comments.

APPENDIX A. Proof of Theorem 1

Define a class of estimators

    \eta_{g0,n,K}(\theta) = \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij0}(\theta) g( y^*_{ij} ) + K^T \sum_{i=1}^n w_i E\{ s( \theta; y_i ) \mid y_{i,obs}, \theta \},

indexed by K. Note that, by (5), we have \sum_{i=1}^n w_i E\{ s( \hat\theta; y_i ) \mid y_{i,obs}, \hat\theta \} = 0 and \eta_{g0,n,K}( \hat\theta ) = \hat\eta_{g0,n,M} for any K. According to Randles (1982), we have

    \sqrt{n} \{ \eta_{g0,n,K}( \hat\theta ) - \eta_{g0,n,K}( \theta_0 ) \} = o_p(1)
if

    E\{ \partial \eta_{g0,n,K}( \theta_0 ) / \partial\theta \} = 0    (A.1)

is satisfied. Using

    \frac{\partial}{\partial\theta} \Big\{ \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij0}(\theta) g( y^*_{ij} ) \Big\} = \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij0}(\theta) \{ s( \theta; y^*_{ij} ) - \hat s_i(\theta) \} g( y^*_{ij} ),

the choice of K = K_1 in (15) satisfies (A.1), and thus (13) follows.

To show (14), consider

    \eta_{g1,n,K}(\theta) = \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij}(\theta) g( y^*_{ij} ) + K^T \sum_{i=1}^n w_i E\{ s( \theta; y_i ) \mid y_{i,obs}, \theta \}.

Using (11),

    \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij}(\theta) g( y^*_{ij} ) = \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij0}(\theta) g( y^*_{ij} ) - \Big\{ \sum_{i=1}^n w_i \hat s_i(\theta) \Big\}^T \hat B_g(\theta),

where

    \hat B_g(\theta) = \Big\{ \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij0}(\theta) ( s^*_{ij}(\theta) - \hat s_i(\theta) )^{\otimes 2} \Big\}^{-1} \sum_{i=1}^n w_i \sum_{j=1}^M w^*_{ij0}(\theta) ( s^*_{ij}(\theta) - \hat s_i(\theta) ) g( y^*_{ij} ).

After some algebra, it can be shown that the choice of K = K_1 in (15) also satisfies E\{ \partial \eta_{g1,n,K}( \theta_0 ) / \partial\theta \} = 0 and, by Randles (1982) again, \sqrt{n} \{ \eta_{g1,n,K}( \hat\theta ) - \eta_{g1,n,K}( \theta_0 ) \} = o_p(1).

APPENDIX B. Replication variance estimation

Under complete response, let w^{[k]}_i be the k-th replication weight for unit i. Assume that the replication variance estimator \hat V_n = \sum_{k=1}^L c_k ( \hat\eta_g^{[k]} - \hat\eta_g )^2, where \hat\eta_g = \sum_{i=1}^n w_i g( y_i ) and \hat\eta_g^{[k]} = \sum_{i=1}^n w^{[k]}_i g( y_i ), is consistent for the variance of \hat\eta_g. For replication with the calibration fractional weights computed from (10), we consider the following steps for creating replicated fractional weights.

[Step 1] Compute \hat\theta^{[k]}, the k-th replicate of \hat\theta, using the fractional weights.
[Step 2] Using the \hat\theta^{[k]} computed in Step 1, compute the replicated fractional weights w^{*[k]}_{ij} by the regression weighting technique so that

    \sum_{i=1}^n w^{[k]}_i \sum_{j=1}^M w^{*[k]}_{ij} s( \hat\theta^{[k]}; y^*_{ij} ) = 0.    (B.1)
Equation (B.1) is the calibration equation for the replicated fractional weights. For any estimator of the form (7), the replication variance estimator is constructed as

\hat{V}( \hat\eta_{FI,g} ) = \sum_{k=1}^{L} c_k \big( \hat\eta^{[k]}_{FI,g} - \hat\eta_{FI,g} \big)^2,

where \hat\eta^{[k]}_{FI,g} = \sum_{i=1}^{n} \sum_{j=1}^{M} w^{[k]}_i w^{*[k]}_{ij} g(y^*_{ij}) and w^{*[k]}_{ij} is computed from (B.1). In general, Step 1 can be computationally problematic since \hat\theta^{[k]} is often computed from the iterative algorithm (21) for each replication. Thus, we consider an approximation for \hat\theta^{[k]} using a Taylor linearization of 0 = S^{[k]}( \hat\theta^{[k]} ) around \hat\theta. The one-step approximation can be implemented as

\hat\theta^{[k]} = \hat\theta + \big[ \hat{I}^{[k]}_{obs}( \hat\theta ) \big]^{-1} S^{[k]}( \hat\theta ),

where

\hat{I}^{[k]}_{obs}(\theta) = - \sum_{i=1}^{n} w^{[k]}_i \sum_{j=1}^{M} w^*_{ij} \frac{\partial}{\partial \theta} s(\theta; y^*_{ij}) - \sum_{i=1}^{n} w^{[k]}_i \sum_{j=1}^{M} w^*_{ij} \{ s(\theta; y^*_{ij}) \}^{\otimes 2} + \sum_{i=1}^{n} w^{[k]}_i \{ \bar{s}_i(\theta) \}^{\otimes 2}

and S^{[k]}(\theta) = \sum_{i=1}^{n} w^{[k]}_i \sum_{j=1}^{M} w^*_{ij} s(\theta; y^*_{ij}). The \hat{I}^{[k]}_{obs}(\theta) is a replicated version of the observed information matrix proposed by Louis (1982).

C. A note on multiple imputation for a proportion

Assume that we have a random sample of size n with observations (x_i, y_i) obtained from a bivariate normal distribution. The parameter of interest is a proportion, for example, \eta = pr(y < 0.8). An unbiased estimator of \eta is

\hat\eta_n = \frac{1}{n} \sum_{i=1}^{n} I( y_i < 0.8 ). (C.1)

Note that \hat\eta_n is unbiased but has bigger variance than the maximum likelihood estimator (27) of \eta. For simplicity, assume that the first r (< n) elements have both x_i and y_i responding, but the last n - r elements have only x_i observed and y_i missing. In this situation, an efficient imputation method such as

y^*_i \sim N( \hat\beta_0 + x_i \hat\beta_1, \hat\sigma^2_e ) (C.2)

can be used, where \hat\beta_0, \hat\beta_1 and \hat\sigma^2_e can be computed from the respondents. In multiple imputation, the parameter estimates are generated from a posterior distribution given the observations. Under the imputation mechanism (C.2), the imputed estimator of \mu_2, the mean of y, of the form

\hat\mu_{2,I} = \frac{1}{n} \Big\{ \sum_{i=1}^{r} y_i + \sum_{i=r+1}^{n} y^*_i \Big\}

satisfies

var( \hat\mu_{2,FE} ) = var( \bar{y}_n ) + var( \hat\mu_{2,FE} - \bar{y}_n ), (C.3)

where \hat\mu_{2,FE} = E_I( \hat\mu_{2,I} ).
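The variance decomposition above for the imputed estimator of the mean can be checked by simulation. The sketch below is illustrative only: the regression coefficients, sample sizes and number of Monte Carlo replications are arbitrary choices, not values from the paper. It replaces each missing y by its estimated conditional mean, as in E_I of the imputation mechanism (C.2), and compares the two sides of the decomposition:

```python
# Monte Carlo check of var(mu_FE) = var(ybar_n) + var(mu_FE - ybar_n) for the
# imputed estimator of the mean under a normal regression imputation model.
# All constants (n, r, coefficients, B) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, r, B = 200, 120, 5000          # sample size, respondents, MC replications

ybar, mu_fe = np.empty(B), np.empty(B)
for b in range(B):
    x = rng.normal(0.0, 1.0, n)
    y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, n)   # assumed bivariate normal model
    b1, b0 = np.polyfit(x[:r], y[:r], 1)          # fit imputation model on respondents
    # mu_FE: each missing y (last n - r units) replaced by its estimated conditional mean
    y_imp = np.concatenate([y[:r], b0 + b1 * x[r:]])
    ybar[b] = y.mean()        # complete-response estimator ybar_n
    mu_fe[b] = y_imp.mean()   # deterministic (fractionally expected) imputed estimator

lhs = mu_fe.var()
rhs = ybar.var() + (mu_fe - ybar).var()
print(lhs, rhs)   # the two sides agree up to Monte Carlo error
```

Agreement of the two sides reflects that, for the mean, the imputed estimator is congenial; the point of this appendix is that the analogous decomposition fails for the proportion.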
Condition (C.3) is the congeniality condition of Meng (1994). Now, for \eta = pr(y < 3), the imputed estimator of \eta based on \hat\eta_n in (C.1) is

\hat\eta_I = \frac{1}{n} \Big\{ \sum_{i=1}^{r} I( y_i < 3 ) + \sum_{i=r+1}^{n} I( y^*_i < 3 ) \Big\}. (C.4)

The expected value of \hat\eta_I over the imputation mechanism is

E_I( \hat\eta_I ) = \frac{1}{n} \Big\{ \sum_{i=1}^{r} I( y_i < 3 ) + \sum_{i=r+1}^{n} pr( y_i < 3 \mid x_i, \hat\theta ) \Big\}
                 = \hat\eta_{FE} + \frac{1}{n} \sum_{i=1}^{r} \{ I( y_i < 3 ) - pr( y_i < 3 \mid x_i, \hat\theta ) \},
where \hat\eta_{FE} = n^{-1} \sum_{i=1}^{n} pr( y_i < 3 \mid x_i, \hat\theta ). For the proportion, \hat\eta_{FE} \neq E_I( \hat\eta_I ) and so the congeniality condition does not hold. In fact,

var\{ E_I( \hat\eta_I ) \} < var( \hat\eta_n ) + var\{ E_I( \hat\eta_I ) - \hat\eta_n \}

and the multiple imputation variance estimator overestimates the variance of \hat\eta_I in (C.4). Instead, we have

var( \hat\eta_{FE} ) = var( \hat\eta_n ) + var( \hat\eta_{FE} - \hat\eta_n )

and the multiple imputation variance estimator estimates the variance of \hat\eta_{FE} for sufficiently large M.

REFERENCES

BOOTH, J. G. & HOBERT, J. P. (1999). Maximizing generalized linear mixed models with an automated Monte Carlo EM algorithm. J. R. Statist. Soc. B 61.
CLAYTON, D., SPIEGELHALTER, D., DUNN, G. & PICKLES, A. (1998). Analysis of longitudinal binary data from multiphase sampling. J. R. Statist. Soc. B 60.
DEMPSTER, A. P., LAIRD, N. M. & RUBIN, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. R. Statist. Soc. B 39, 1-38.
FAY, R. E. (1992). When are inferences from multiple imputation valid? In Proc. Survey Res. Meth. Sect., Am. Statist. Assoc. Washington, DC: American Statistical Association.
GELMAN, A., MENG, X.-L. & STERN, H. (1996). Posterior predictive assessment of model fitness via realized discrepancies (with discussion). Statist. Sinica 6, 733-807.
HENMI, M., YOSHIDA, R. & EGUCHI, S. (2007). Importance sampling via the estimated sampler. Biometrika 94.
IBRAHIM, J. G. (1990). Incomplete data in generalized linear models. J. Am. Statist. Assoc. 85.
KIM, J. K., BRICK, J. M., FULLER, W. A. & KALTON, G. (2006). On the bias of the multiple imputation variance estimator in survey sampling. J. R. Statist. Soc. B 68, 509-521.
KIM, J. K. & FULLER, W. A. (2004). Fractional hot deck imputation. Biometrika 91, 559-578.
KIM, J. K. & RAO, J. N. K. (2009). Unified approach to linearization variance estimation from survey data after imputation for item nonresponse. Biometrika, in press. doi:10.1093/biomet/asp041.
LOUIS, T. A. (1982). Finding the observed information matrix when using the EM algorithm. J. R. Statist. Soc. B 44.
MENG, X.-L. (1994). Multiple-imputation inferences with uncongenial sources of input (with discussion). Statist. Sci. 9.
PEPE, M. S. (1992). Inference using surrogate outcome data and a validation sample. Biometrika 79, 355-365.
RANDLES, R. H. (1982). On the asymptotic normality of statistics with estimated parameters. Ann. Statist. 10.
ROBINS, J. M. & WANG, N. (2000). Inference for imputation estimators. Biometrika 87.
RUBIN, D. B. (1976). Inference and missing data. Biometrika 63, 581-592.
RUBIN, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.
SUNG, Y. J. & GEYER, C. J. (2007). Monte Carlo likelihood inference for missing data models. Ann. Statist. 35.
WANG, N. & ROBINS, J. M. (1998). Large-sample theory for parametric multiple imputation procedures. Biometrika 85.
WEI, G. C. G. & TANNER, M. A. (1990). A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms. J. Am. Statist. Assoc. 85, 699-704.
WU, C. F. J. (1983). On the convergence properties of the EM algorithm. Ann. Statist. 11, 95-103.

[Received January ????. Revised July ????]
More information