Recent Advances in the analysis of missing data with non-ignorable missingness

Size: px
Start display at page:

Download "Recent Advances in the analysis of missing data with non-ignorable missingness"

Transcription

1 Recent Advances in the analysis of missing data with non-ignorable missingness Jae-Kwang Kim Department of Statistics, Iowa State University July 4th, 2014

2 1 Introduction 2 Full likelihood-based ML estimation 3 GMM method 4 Partial Likelihood approach 5 Exponential tilting method 6 Concluding Remarks Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

3 Observed likelihood (X, Y ): random variable, y is subject to missingness f (y x; θ): model of y on x g(δ x, y; φ): model of δ on (x, y) Observed likelihood L obs (θ, φ) = f (y i x i ; θ) g (δ i x i, y i ; φ) δ i =1 δ i =0 f (y i x i ; θ) g (δ i x i, y i ; φ) dy i Under what conditions the parameters are identifiable (or estimable)? Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

4 Lemma Suppose that we can decompose the covariate vector x into two parts, u and z, such that g(δ y, x) = g(δ y, u) (1) and, for any given u, there exist z u,1 and z u,2 such that f (y u, z = z u,1) f (y u, z = z u,2). (2) Under some other minor conditions, all the parameters in f and g are identifiable. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

5 Remark Condition (1) means δ z y, u. That is, given (y, u), z does not help in explaining δ. Thus, z plays the role of instrumental variable in econometrics: f (y x, z ) = f (y x ), Cov(z, x ) 0. Here, y = δ, x = (y, u), and z = z. We may call z the nonresponse instrument variable. Rigorous theory developed by Wang et al. (2014). Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

6 Introduction Remark MCAR (Missing Completely at random): P(δ y) does not depend on y. MAR (Missing at random): P(δ y) = P(δ y obs ) NMAR (Not Missing at random): P(δ y) P(δ y obs ) Thus, MCAR is a special case of MAR. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

7 Parameter estimation under the existence of nonresponse instrument variable Full likelihood-based ML estimation Generalized method of moment (GMM) approach (Section 6.3 of KS) Conditional likelihood approach (Section 6.2 of KS) Pseudo likelihood approach (Section 6.4 of KS) Exponential tilting method (Section 6.5 of KS) Latent variable approach (Section 6.6 of KS) Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

8 1 Introduction 2 Full likelihood-based ML estimation 3 GMM method 4 Partial Likelihood approach 5 Exponential tilting method 6 Concluding Remarks Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

9 Full likelihood-based ML estimation Wish to find ˆη = (ˆθ, ˆφ), that maximizes the observed likelihood L obs (η) = f (y i x i ; θ) g (δ i x i, y i ; φ) δ i =1 f (y i x i ; θ) g (δ i x i, y i ; φ) dy i δ i =0 Mean score theorem: Under some regularity conditions, finding the MLE of maximizing the observed likelihood is equivalent to finding the solution to S(η) E{S(η) y obs, δ; η} = 0, where y obs is the observed data. The conditional expectation of the score function is called mean score function. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

10 Example 1 [Example 2.5 of KS] 1 Suppose that the study variable y is randomly distributed with Bernoulli distribution with probability of success p i, where p i = p i (β) = exp (x i β) 1 + exp (x i β) for some unknown parameter β and x i is a vector of the covariates in the logistic regression model for y i. We assume that 1 is in the column space of x i. 2 Under complete response, the score function for β is S 1(β) = n (y i p i (β)) x i. i=1 Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

11 Example 1 (Cont d) 3 Let δ i be the response indicator function for y i with distribution Bernoulli(π i ) where π i = exp (x i φ 0 + y i φ 1) 1 + exp (x i φ0 + y iφ 1). We assume that x i is always observed, but y i is missing if δ i = 0. 4 Under missing data, the mean score function for β, the conditional expectation of the original score function given the observed data, is S 1 (β, φ) = δi =1 {y i p i (β)} x i + δ i =0 1 w i (y; β, φ) {y p i (β)} x i, where w i (y; β, φ) is the conditional probability of y i = y given x i and δ i = 0: y=0 w i (y; β, φ) = P β (y i = y x i ) P φ (δ i = 0 y i = y, x i ) 1 z=0 P β (y i = z x i ) P φ (δ i = 0 y i = z, x i ) Thus, S 1 (β, φ) is also a function of φ. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

12 Example 1 (Cont d) 5 If the response mechanism is MAR so that φ 1 = 0, then w i (y; β, φ) = P β (y i = y x i ) 1 z=0 P β (y i = z x i ) = P β (y i = y x i ) and so S 1 (β, φ) = δi =1 {y i p i (β)} x i = S 1 (β). 6 If MAR does not hold, then ( ˆβ, ˆφ) can be obtained by solving S 1 (β, φ) = 0 and S 2(β, φ) = 0 jointly, where S 2 (β, φ) = δi =1 {δ i π (φ; x i, y i )} (x i, y i ) + δ i =0 1 w i (y; β, φ) {δ i π i (φ; x i, y)} (x i, y). y=0 Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

13 EM algorithm Interested in finding ˆη that maximizes L obs (η). The MLE can be obtained by solving S obs (η) = 0, which is equivalent to solving S(η) = 0 by the mean score theorem. EM algorithm provides an alternative method of solving S(η) = 0 by writing S(η) = E {S com(η) y obs, δ; η} and using the following iterative method: { ˆη (t+1) solve E S com(η) y obs, δ; ˆη (t)} = 0. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

14 5. EM algorithm Definition Let η (t) be the current value of the parameter estimate of η. The EM algorithm can be defined as iteratively carrying out the following E-step and M-steps: E-step: Compute ( Q η η (t)) = E {ln f (y, δ; η) y obs, δ, η (t)} M-step: Find η (t+1) that maximizes Q(η η (t) ) w.r.t. η. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

15 EM algorithm Example 1 (Cont d) E-step: where ( S 1 β β (t), φ (t)) = {y i p i (β)} x i + δ i =1 δ i =0 w ij(t) = Pr(Y i = j x i, δ i = 0; β (t), φ (t) ) = 1 w ij(t) {j p i (β)} x i, j=0 Pr(Y i = j x i ; β (t) )Pr(δ i = 0 x i, j; φ (t) ) 1 y=0 Pr(Y i = y x i ; β (t) )Pr(δ i = 0 x i, y; φ (t) ) and ( S 2 φ β (t), φ (t)) = {δ i π (x i, y i ; φ)} ( x ) i, y i δ i =1 + δ i =0 1 w ij(t) {δ i π i (x i, j; φ)} ( x i, j ). j=0 Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

16 EM algorithm Example 1 (Cont d) M-step: The parameter estimates are updated by solving [ S1 (β β (t), φ (t)), S 2 ( φ β (t), φ (t))] = (0, 0) for β and φ. For categorical missing data, the conditional expectation in the E-step can be computed using the weighted mean with weights w ij(t). Ibrahim (1990) called this method EM by weighting. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

17 Monte Carlo EM Motivation: Monte Carlo samples in the EM algorithm can be used as imputed values. Monte Carlo EM 1 In the EM algorithm defined by [E-step] Compute ( Q η η (t)) = E {ln f (y, δ; η) y obs, δ; η (t)} [M-step] Find η (t+1) that maximizes Q ( η η (t)), E-step is computationally cumbersome because it involves integral. 2 Wei and Tanner (1990): In the E-step, first draw y (1) mis,, y (m) mis f (y mis y obs, δ; η (t)) and approximate ( Q η η (t)) 1 = m m ln f j=1 ( y obs, y (j) mis, δ; η ). Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

18 Monte Carlo EM Example 2 [Example 3.15 of KS] Suppose that y i f (y i x i ; θ) Assume that x i is always observed but we observe y i only when δ i = 1 where δ i Bernoulli [π i (φ)] and π i (φ) = exp (φ0 + φ1x i + φ 2y i ) 1 + exp (φ 0 + φ 1x i + φ 2y i ). To implement the MCEM method, in the E-step, we need to generate samples from f (y i x i, δ i = 0; ˆθ, ˆφ) = f (y i x i ; ˆθ){1 π i ( ˆφ)} f (yi x i ; ˆθ){1 π i ( ˆφ)}dy i. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

19 Monte Carlo EM Example 2 (Cont d) We can use the following rejection method to generate samples from f (y i x i, δ i = 0; ˆθ, ˆφ): 1 Generate yi from f (y i x i ; ˆθ). 2 Using yi, compute π i ( ˆφ) = Accept yi with probability 1 πi ( ˆφ). 3 If yi is not accepted, then goto Step 1. exp( ˆφ 0 + ˆφ 1 x i + ˆφ 2 y i ) 1 + exp( ˆφ 0 + ˆφ 1 x i + ˆφ 2 y i ). Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

20 Monte Carlo EM Example 2 (Cont d) Using the m imputed values of y i, denoted by y (1) i,, y (m) i, and the M-step can be implemented by solving and n i=1 j=1 n i=1 m ( ) S θ; x i, y (j) i = 0 j=1 where S (θ; x i, y i ) = log f (y i x i ; θ)/ θ. m { } ( ) δ i π(φ; x i, y (j) i ) 1, x i, y (j) i = 0, Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

21 1 Introduction 2 Full likelihood-based ML estimation 3 GMM method 4 Partial Likelihood approach 5 Exponential tilting method 6 Concluding Remarks Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

22 Basic setup (X, Y ): random variable θ: Defined by solving E{U(θ; X, Y )} = 0. y i is subject to missingness δ i = { 1 if yi responds 0 if y i is missing. Want to find w i such that the solution ˆθ w to is consistent for θ. n δ i w i U(θ; x i, y i ) = 0 i=1 Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

23 Basic Setup Result 1: The choice of w i = makes the resulting estimator ˆθ w consistent. 1 E(δ i x i, y i ) Result 2: If δ i Bernoulli(π i ), then using w i = 1/π i also makes the resulting estimator consistent, but it is less efficient than ˆθ w using w i in (3). (3) Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

24 Parameter estimation : GMM method Because z is a nonresponse instrumental variable, we may assume for some (φ 0, φ 1, φ 2). P(δ = 1 x, y) = π(φ 0 + φ 1u + φ 2y) Kott and Chang (2010): Construct a set of estimating equations such as n { } δ i π(φ 0 + φ 1u i + φ 2y i ) 1 (1, u i, z i ) = 0 i=1 that are unbiased to zero. May have overidentified situation: Use the generalized method of moments (GMM). Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

25 GMM method Example 3 Suppose that we are interested in estimating the parameters in the regression model y i = β 0 + β 1x 1i + β 2x 2i + e i (4) where E(e i x i ) = 0. Assume that y i is subject to missingness and assume that P(δ i = 1 x 1i, x i2, y i ) = exp(φ0 + φ1x 1i + φ 2y i ) 1 + exp(φ 0 + φ 1x 1i + φ 2y i ). Thus, x 2i is the nonresponse instrument variable in this setup. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

26 Example 3 (Cont d) A consistent estimator of φ can be obtained by solving n { } δ Û 2(φ) π(φ; x 1i, y i ) 1 (1, x 1i, x 2i ) = (0, 0, 0). (5) i=1 Roughly speaking, the solution to (5) exists almost surely if E{ Û2(φ)/ φ} is of full rank in the neighborhood of the true value of φ. If x 2 is vector, then (5) is overidentified and the solution to (5) does not exist. In the case, the GMM algorithm can be used. Finding the solution to Û2(φ) = 0 can be obtained by finding the minimizer of Q(φ) = Û2(φ) Û 2(φ) or Q W (φ) = Û2(φ) W Û2(φ) where W = {V (Û2)} 1. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

27 Example 3 (Cont d) Once the solution ˆφ to (5) is obtained, then a consistent estimator of β = (β 0, β 1, β 2) can be obtained by solving for β. Û 1(β, ˆφ) n i=1 δ i ˆπ i {y i β 0 β 1x 1i β 2x 2i } (1, x 1i, x 2i ) = (0, 0, 0) (6) Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

28 Asymptotic Properties The asymptotic variance of the GMM estimator ˆφ that minimizes Û 2(φ) ˆΣ 1 Û 2(φ) is V ( ˆφ) ( ) 1 = Γ Σ 1 Γ where The variance is estimated by Γ = E{ Û2(φ)/ φ} Σ = V (Û2). (ˆΓ ˆΣ 1ˆΓ) 1, where ˆΓ = Û/ φ evaluated at ˆφ and ˆΣ is an estimated variance-covariance matrix of Û2(φ) evaluated at ˆφ. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

29 Asymptotic Properties The asymptotic variance of ˆβ obtained from (6) with ˆφ computed from the GMM can be obtained by V (ˆθ) ( ) 1 = Γ aσ 1 a Γ a where and θ = (β, φ). Γ a = E{ Û(θ)/ θ} Σ a = V (Û) Û = (Û 1, Û 2) Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

30 1 Introduction 2 Full likelihood-based ML estimation 3 GMM method 4 Partial Likelihood approach 5 Exponential tilting method 6 Concluding Remarks Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

31 Partial Likelihood approach A classical way of likelihood-based approach for parameter estimation under non ignorable nonresponse is to maximize L obs (θ, φ) with respect to (θ, φ), where L obs (θ, φ) = δ i =1 f (y i x i ; θ) g (δ i x i, y i ; φ) δ i =0 f (y i x i ; θ) g (δ i x i, y i ; φ) dy i Such approach can be called full likelihood-based approach because it uses full information available in the observed data. However, it is well known that such full likelihood-based approach is quite sensitive to the failure of the assumed model. On the other hand, partial likelihood-based approach (or conditional likelihood approach) uses a subset of the sample. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

32 Conditional Likelihood approach Idea Since f (y x)g(δ x, y) = f 1(y x, δ)g 1(δ x), for some f 1 and g 1, we can write L obs (θ) = f 1 (y i x i, δ i = 1) g 1 (δ i x i ) δ i =1 δ i =0 f 1 (y i x i, δ i = 0) g 1 (δ i x i ) dy i = δ i =1 f 1 (y i x i, δ i = 1) n g 1 (δ i x i ). The conditional likelihood is defined to be the first component: L c(θ) = δ i =1 f 1 (y i x i, δ i = 1) = δ i =1 i=1 f (y i x i ; θ)π(x i, y i ) f (y xi ; θ)π(x i, y)dy, where π(x, y i ) = Pr(δ i = 1 x i, y i ). Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

33 Conditional Likelihood approach Example Assume that the original sample is a random sample from an exponential distribution with mean µ = 1/θ. That is, the probability density function of y is f (y; θ) = θ exp( θy)i (y > 0). Suppose that we observe y i only when y i > K for a known K > 0. Thus, the response indicator function is defined by δ i = 1 if y i > K and δ i = 0 otherwise. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

34 Conditional Likelihood approach Example To compute the maximum likelihood estimator from the observed likelihood, note that S obs (θ) = ( ) 1 θ y i + { } 1 θ E(y i δ i = 0). δ i =1 δ i =0 Since E(Y y > K) = 1 θ K exp( θk) 1 exp( θk), the maximum likelihood estimator of θ can be obtained by the following iteration equation: {ˆθ(t+1) } { 1 = ȳr n r K exp( K ˆθ } (t) ) r 1 exp( K ˆθ, (7) (t) ) where r = n i=1 δ i and ȳ r = r 1 n i=1 δ iy i. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

35 Conditional Likelihood approach Example Since π i = Pr(δ i = 1 y i ) = I (y i > K) and E(π i ) = E{I (y i > K)} = exp( Kθ), the conditional likelihood reduces to θ exp{ θ(y i K)}. δ i =1 The maximum conditional likelihood estimator of θ is ˆθ c = 1 ȳ r K. Since E(y y > K) = µ + K, the maximum conditional likelihood estimator of µ, which is ˆµ c = 1/ˆθ c, is unbiased for µ. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

36 Conditional Likelihood approach Remark Under some regularity conditions, the solution ˆθ c that maximizes L c(θ) satisfies where I c(θ) = E { } θ Sc(θ) x i; θ = Ic 1/2 L (ˆθ c θ) N(0, I ) S c(θ) = ln L c(θ)/ θ, and S i (θ) = ln f (y i x i ; θ) / θ. n i=1 Works only when π(x, y) is a known function. [ E { S i S i π i x i ; θ } {E (S iπ i x i ; θ)} 2 ], E (π i x i ; θ) Does not require nonresponse instrumental variable assumption. Popular for biased sampling problem. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

37 Pseudo Likelihood approach Idea Consider bivariate (x i, y i ) with density f (y x; θ)h(x) where y i are subject to missingness. We are interested in estimating θ. Suppose that Pr(δ = 1 x, y) depends only on y. (i.e. x is nonresponse instrument) Note that f (x y, δ) = f (x y). Thus, we can consider the following conditional likelihood L c(θ) = δ i =1 f (x i y i, δ i = 1) = δ i =1 f (x i y i ). We can consider maximizing the pseudo likelihood L p(θ) = δ i =1 f (y i x i ; θ)ĥ(x i) f (yi x; θ)ĥ(x)dx, where ĥ(x) is a consistent estimator of the marginal density of x. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

38 Pseudo Likelihood approach Idea We may use the empirical density in ĥ(x). That is, ĥ(x) = 1/n if x = x i. In this case, L c(θ) = f (y i x i ; θ) n δ i =1 k=1 f (y i x k ; θ). We can extend the idea to the case of x = (u, z) where z is a nonresponse instrument. In this case, the conditional likelihood becomes p(z i y i, u i ) = i:δ i =1 i:δ i =1 f (y i u i, z i ; θ)p(z i u i ) f (yi u i, z; θ)p(z u i )dz. (8) Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

39 Pseudo Likelihood approach Let ˆp(z u) be an estimated conditional probability density of z given u. Substituting this estimate into the likelihood in (8), we obtain the following pseudo likelihood: i:δ i =1 f (y i u i, z i ; θ)ˆp(z i u i ) f (yi u i, z; θ)ˆp(z u i )dz. (9) The pseudo maximum likelihood estimator (PMLE) of θ, denoted by ˆθ p, can be obtained by solving S p(θ; ˆα) δ i =1 [S(θ; x i, y i ) E{S(θ; u i, z, y i ) y i, u i ; θ, ˆα}] = 0 for θ, where S(θ; x, y) = S(θ; u, z, y) = log f (y x; θ)/ θ and S(θ; ui, z, y i )f (y i u i, z; θ)p(z u i ; ˆα)dz E{S(θ; u i, z, y i ) y i, u i ; θ, ˆα} =. f (yi u i, z; θ)p(z u i ; ˆα)dz Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

40 Pseudo Likelihood approach The Fisher-scoring method for obtaining the PMLE is given by { ˆθ p (t+1) (t) = ˆθ (ˆθ(t) 1 p + I p, ˆα)} Sp(ˆθ (t), ˆα) where I p (θ, ˆα) = δ i =1 [ E{S(θ; u i, z, y i ) 2 y i, u i ; θ, ˆα} E{S(θ; u i, z, y i ) y i, u i ; θ, ˆα} 2]. Variance estimation is very complicated. Jackknife or bootstrap can be used. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

41 1 Introduction 2 Full likelihood-based ML estimation 3 GMM method 4 Partial Likelihood approach 5 Exponential tilting method 6 Concluding Remarks Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

42 Exponential tilting method Motivation Observed likelihood function can be written n [ 1 δi L obs (φ) = {π i (φ)} δ i {1 π i (φ)} f (y x i )dy], i=1 where f (y x) is the true conditional distribution of y given x. To find the MLE of φ, we solve the mean score equation S(φ) = 0, where S(φ) = n [δ i S i (φ) + (1 δ i )E{S i (φ) x i, δ i = 0}], (10) i=1 where S i (φ) = {δ i π i (φ)}( logitπ i (φ)/ φ) is the score function of φ for the density g(δ x, y; φ) = π δ (1 π) 1 δ with π = π(x, y; φ). Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

43 Motivation The conditional expectation in (10) can be evaluated by using f (y x, δ = 0) = P(δ = 0 x, y) f (y x) E{P(δ = 0 x, y) x} Two problems occur: (11) 1 Requires correct specification of f (y x; θ). Known to be sensitive to the choice of f (y x; θ). 2 Computationally heavy: Often uses Monte Carlo computation. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

44 Exponential tilting method Remedy (for Problem One) Idea Instead of specifying a parametric model for f (y x), consider specifying a parametric model for f (y x, δ = 1), denoted by f 1(y x). In this case, Si (φ)f 1(y x i )O(x i, y; φ)dy E{S i (φ) x i, δ i = 0} = f1(y x i )O(x i, y; φ)dy where O(x, y; φ) = 1 π(φ; x, y). π(φ; x, y) Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

45 Remark Based on the following identity f 0 (y i x i ) = f 1 (y i x i ) where f δ (y i x i ) = f (y i x i, δ i = δ) and O (x i, y i ) E {O (x i, Y i ) x i, δ i = 1}, (12) O (x i, y i ) = Pr (δ i = 0 x i, y i ) Pr (δ i = 1 x i, y i ) (13) is the conditional odds of nonresponse. Kim and Yu (2011) considered a Kernel-based nonparametric regression method of estimating f (y x, δ = 1) to obtain E(Y x, δ = 0). Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

46 If the response probability follows from a logistic regression model π(u i, y i ) Pr (δ i = 1 u i, y i ) = exp (φ0 + φ1u i + φ 2y i ) 1 + exp (φ 0 + φ 1u i + φ 2y i ), (14) the expression (12) can be simplified to f 0 (y i x i ) = f 1 (y i x i ) exp (γy i ) E {exp (γy ) x i, δ i = 1}, (15) where γ = φ 2 and f 1 (y x) is the conditional density of y given x and δ = 1. Model (15) states that the density for the nonrespondents is an exponential tilting of the density for the respondents. The parameter γ is the tilting parameter that determines the amount of departure from the ignorability of the response mechanism. If γ = 0, the the response mechanism is ignorable and f 0(y x) = f 1(y x). Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

47 Exponential tilting method Problem Two How to compute E{S i (φ) x i, δ i = 0} = without relying on Monte Carlo computation? Si (φ)o(x i, y; φ)f 1(y x i )dy O(xi, y; φ)f 1(y x i )dy Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

48 Computation for E 1{Q(x i, Y ) x i } = Q(x i, y)f 1(y x i )dy. If x i were null, then we would approximate the integration by the empirical distribution among δ = 1. Use Q(x i, y)f 1(y x i )dy = Q(x i, y) f1(y x i) f 1(y)dy f 1(y) Q(x i, y j ) f1(y j x i ) f δ j =1 1(y j ) where f 1(y) = f 1(y x)f (x δ = 1)dx δ i =1 f 1(y x i ). Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

49 Exponential tilting method In practice, f 1(y x) is unknown and is estimated by ˆf 1(y x) = f 1(y x; ˆγ). Thus, given ˆγ, a fully efficient estimator of φ can be obtained by solving S 2(φ, ˆγ) n { δi S(φ; x i, y i ) + (1 δ i ) S 0(φ x i ; ˆγ, φ) } = 0, (16) i=1 where and S 0(φ x i ; ˆγ, φ) = δ j =1 S(φ; x i, y j )f 1(y j x i ; ˆγ)O(φ; x i, y j )/ˆf 1(y j ) δ j =1 f1(y j x i ; ˆγ)O(φ; x i, y j )/ˆf 1(y i ) ˆf 1(y) = n 1 R n δ i f 1(y x i ; ˆγ). i=1 May use EM algorithm to solve (16) for φ. Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

50 Exponential tilting method Step 1: Use the responding part of (x i, y i ), obtain ˆγ in the model f 1(y x; γ). S 1(γ) δ i =1 S 1(γ; x i, y i ) = 0. (17) Step 2: Given ˆγ from Step 1, obtain ˆφ by solving (16): S 2(φ, ˆγ) = 0. Step 3: Using ˆφ computed from Step 2, the PSA estimator of θ can be obtained by solving n δ i U(θ; x i, y i ) = 0, (18) ˆπ i where ˆπ i = π i ( ˆφ). i=1 Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

51 Exponential tilting method Remark In many cases, x is categorical and f 1(y x) can be fully nonparametric. If x has a continuous part, nonparametric Kernel smoothing can be used. The proposed method seems to be robust against the failure of the assumed model on f 1(y x; γ). Asymptotic normality of PSA estimator can be obtained & Linearization method can be used for variance estimation (Details skipped) By augmenting the estimating function, we can also impose a calibration constraint such as n δ i n x i = x i. ˆπ i i=1 i=1 Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

52 Exponential tilting method Example 4 (Example 6.5 of KS) Assume that both x i = (z i, u i ) and y i are categorical with category {(i, j); i S z S u} and S y, respectively. We are interested in estimating θ k = Pr(Y = k), for k S y. Now, we have nonresponse in y and let δ i be the response indicator function for y i. We assume that the response probability satisfies Pr (δ = 1 x, y) = π (u, y; φ). To estimate φ, we first compute the observed conditional probability of y among the respondents: δ j =1 ˆp 1(y x i ) = I (x j = x i, y j = y) δ j =1 I (x. j = x i ) Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

53 Exponential tilting method Example 4 (Cont d) The EM algorithm can be implemented by (16) with δ j =1 S 0(φ x i ; φ) = S(φ; δ i, u i, y j )ˆp 1(y j x i )O(φ; u i, y j )/ˆp 1(y j ) δ j =1 ˆp1(y, j x i )O(φ; u i, y j )/ˆp 1(y j ) where O(φ; u, y) = {1 π(u, y; φ)}/π(u, y; φ) and ˆp 1(y) = n 1 R n δ i ˆp 1(y x i ). Alternatively, we can use y S y S(φ; δ i, u i, y)ˆp 1(y x i )O(φ; u i, y) S 0(φ x i ; φ) =. (19) y S y ˆp 1(y x i )O(φ; u i, y) i=1 Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

54 Exponential tilting method Example 4 (Cont d) Once ˆπ(u, y) = π(u, y; ˆφ) is computed, we can use ˆθ k,et = n 1 I (y i = k) + δ i =1 δ i =0 where w iy is the fractional weights computed by y S y w iy I (y = k), wiy {ˆπ(u i, y)} 1 1}ˆp 1(y x i ) = y S y {ˆπ(u i, y) 1 1}ˆp 1(y x i ). Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

55 Real Data Example Exit Poll: The Assembly election (2012 Gang-dong district in Seoul) Gender Age Party A Party B Other Refusal Total Male , Female Total 1,809 1, ,473 Truth 62,489 57,909 1, ,022 Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

56 Comparison of the methods (%) Table : Analysis result : Gang-dong district in Seoul Method Party A Party B Other No adjustment Adjustment (Age * Sex) New Method Truth Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

57 Analysis result in Seoul (48 Seats) Table : Analysis result : 48 seats in Seoul Method Party A Party B Other No adjustment Adjustment (Age* Sex) New Method Truth Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

58 1 Introduction 2 Full likelihood-based ML estimation 3 GMM method 4 Partial Likelihood approach 5 Exponential tilting method 6 Concluding Remarks Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

59 Concluding remarks Uses a model for the response probability. Parameter estimation for response model can be implemented using the idea of maximum likelihood method. Instrumental variable needed for identifiability of the response model. Likelihood-based approach vs GMM approach Less tools for model diagnostics or model validation Promising areas of research Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

60 REFERENCES Ibrahim, J. G. (1990), Incomplete data in generalized linear models, Journal of the American Statistical Association 85, Kim, J. K. and C. L. Yu (2011), A semi-parametric estimation of mean functionals with non-ignorable missing data, Journal of the American Statistical Association 106, Kott, P. S. and T. Chang (2010), Using calibration weighting to adjust for nonignorable unit nonresponse, Journal of the American Statistical Association 105, Wang, S., J. Shao and J. K. Kim (2014), Identifiability and estimation in problems with nonignorable nonresponse, Statistica Sinica 24, Wei, G. C. and M. A. Tanner (1990), A Monte Carlo implementation of the EM algorithm and the poor man s data augmentation algorithms, Journal of the American Statistical Association 85, Jae-Kwang Kim (ISU) Nonignorable missing July 4th, / 59

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Statistical Methods for Handling Missing Data

Statistical Methods for Handling Missing Data Statistical Methods for Handling Missing Data Jae-Kwang Kim Department of Statistics, Iowa State University July 5th, 2014 Outline Textbook : Statistical Methods for handling incomplete data by Kim and

More information

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction

More information

Chapter 4: Imputation

Chapter 4: Imputation Chapter 4: Imputation Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Basic Theory for imputation 3 Variance estimation after imputation 4 Replication variance estimation

More information

Fractional Imputation in Survey Sampling: A Comparative Review

Fractional Imputation in Survey Sampling: A Comparative Review Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical

More information

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)

More information

6. Fractional Imputation in Survey Sampling

6. Fractional Imputation in Survey Sampling 6. Fractional Imputation in Survey Sampling 1 Introduction Consider a finite population of N units identified by a set of indices U = {1, 2,, N} with N known. Associated with each unit i in the population

More information

Nonresponse weighting adjustment using estimated response probability

Nonresponse weighting adjustment using estimated response probability Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006 Introduction Nonresponse Unit nonresponse Item nonresponse Basic strategy

More information

A note on multiple imputation for general purpose estimation

A note on multiple imputation for general purpose estimation A note on multiple imputation for general purpose estimation Shu Yang Jae Kwang Kim SSC meeting June 16, 2015 Shu Yang, Jae Kwang Kim Multiple Imputation June 16, 2015 1 / 32 Introduction Basic Setup Assume

More information

Parametric fractional imputation for missing data analysis

Parametric fractional imputation for missing data analysis 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????),??,?, pp. 1 15 C???? Biometrika Trust Printed in

More information

Chapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70

Chapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Chapter 5: Models used in conjunction with sampling J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Nonresponse Unit Nonresponse: weight adjustment Item Nonresponse:

More information

Introduction An approximated EM algorithm Simulation studies Discussion

Introduction An approximated EM algorithm Simulation studies Discussion 1 / 33 An Approximated Expectation-Maximization Algorithm for Analysis of Data with Missing Values Gong Tang Department of Biostatistics, GSPH University of Pittsburgh NISS Workshop on Nonignorable Nonresponse

More information

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University Statistica Sinica 27 (2017), 000-000 doi:https://doi.org/10.5705/ss.202016.0155 DISCUSSION: DISSECTING MULTIPLE IMPUTATION FROM A MULTI-PHASE INFERENCE PERSPECTIVE: WHAT HAPPENS WHEN GOD S, IMPUTER S AND

More information

Likelihood-based inference with missing data under missing-at-random

Likelihood-based inference with missing data under missing-at-random Likelihood-based inference with missing data under missing-at-random Jae-kwang Kim Joint work with Shu Yang Department of Statistics, Iowa State University May 4, 014 Outline 1. Introduction. Parametric

More information

Data Integration for Big Data Analysis for finite population inference

Data Integration for Big Data Analysis for finite population inference for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation

More information

Propensity score adjusted method for missing data

Propensity score adjusted method for missing data Graduate Theses and Dissertations Graduate College 2013 Propensity score adjusted method for missing data Minsun Kim Riddles Iowa State University Follow this and additional works at: http://lib.dr.iastate.edu/etd

More information

AN INSTRUMENTAL VARIABLE APPROACH FOR IDENTIFICATION AND ESTIMATION WITH NONIGNORABLE NONRESPONSE

AN INSTRUMENTAL VARIABLE APPROACH FOR IDENTIFICATION AND ESTIMATION WITH NONIGNORABLE NONRESPONSE Statistica Sinica 24 (2014), 1097-1116 doi:http://dx.doi.org/10.5705/ss.2012.074 AN INSTRUMENTAL VARIABLE APPROACH FOR IDENTIFICATION AND ESTIMATION WITH NONIGNORABLE NONRESPONSE Sheng Wang 1, Jun Shao

More information

Graybill Conference Poster Session Introductions

Graybill Conference Poster Session Introductions Graybill Conference Poster Session Introductions 2013 Graybill Conference in Modern Survey Statistics Colorado State University Fort Collins, CO June 10, 2013 Small Area Estimation with Incomplete Auxiliary

More information

Introduction to Survey Data Integration

Introduction to Survey Data Integration Introduction to Survey Data Integration Jae-Kwang Kim Iowa State University May 20, 2014 Outline 1 Introduction 2 Survey Integration Examples 3 Basic Theory for Survey Integration 4 NASS application 5

More information

Two-phase sampling approach to fractional hot deck imputation

Two-phase sampling approach to fractional hot deck imputation Two-phase sampling approach to fractional hot deck imputation Jongho Im 1, Jae-Kwang Kim 1 and Wayne A. Fuller 1 Abstract Hot deck imputation is popular for handling item nonresponse in survey sampling.

More information

Weighting in survey analysis under informative sampling

Weighting in survey analysis under informative sampling Jae Kwang Kim and Chris J. Skinner Weighting in survey analysis under informative sampling Article (Accepted version) (Refereed) Original citation: Kim, Jae Kwang and Skinner, Chris J. (2013) Weighting

More information

analysis of incomplete data in statistical surveys

analysis of incomplete data in statistical surveys analysis of incomplete data in statistical surveys Ugo Guarnera 1 1 Italian National Institute of Statistics, Italy guarnera@istat.it Jordan Twinning: Imputation - Amman, 6-13 Dec 2014 outline 1 origin

More information

Imputation for Missing Data under PPSWR Sampling

Imputation for Missing Data under PPSWR Sampling July 5, 2010 Beijing Imputation for Missing Data under PPSWR Sampling Guohua Zou Academy of Mathematics and Systems Science Chinese Academy of Sciences 1 23 () Outline () Imputation method under PPSWR

More information

Miscellanea A note on multiple imputation under complex sampling

Miscellanea A note on multiple imputation under complex sampling Biometrika (2017), 104, 1,pp. 221 228 doi: 10.1093/biomet/asw058 Printed in Great Britain Advance Access publication 3 January 2017 Miscellanea A note on multiple imputation under complex sampling BY J.

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Chapter 3: Maximum Likelihood Theory

Chapter 3: Maximum Likelihood Theory Chapter 3: Maximum Likelihood Theory Florian Pelgrin HEC September-December, 2010 Florian Pelgrin (HEC) Maximum Likelihood Theory September-December, 2010 1 / 40 1 Introduction Example 2 Maximum likelihood

More information

A measurement error model approach to small area estimation

A measurement error model approach to small area estimation A measurement error model approach to small area estimation Jae-kwang Kim 1 Spring, 2015 1 Joint work with Seunghwan Park and Seoyoung Kim Ouline Introduction Basic Theory Application to Korean LFS Discussion

More information

Combining Non-probability and Probability Survey Samples Through Mass Imputation

Combining Non-probability and Probability Survey Samples Through Mass Imputation Combining Non-probability and Probability Survey Samples Through Mass Imputation Jae-Kwang Kim 1 Iowa State University & KAIST October 27, 2018 1 Joint work with Seho Park, Yilin Chen, and Changbao Wu

More information

Some methods for handling missing data in surveys

Some methods for handling missing data in surveys Graduate Theses and Dissertations Graduate College 2015 Some methods for handling missing data in surveys Jongho Im Iowa State University Follow this and additional works at: http://lib.dr.iastate.edu/etd

More information

On the bias of the multiple-imputation variance estimator in survey sampling

On the bias of the multiple-imputation variance estimator in survey sampling J. R. Statist. Soc. B (2006) 68, Part 3, pp. 509 521 On the bias of the multiple-imputation variance estimator in survey sampling Jae Kwang Kim, Yonsei University, Seoul, Korea J. Michael Brick, Westat,

More information

Combining data from two independent surveys: model-assisted approach

Combining data from two independent surveys: model-assisted approach Combining data from two independent surveys: model-assisted approach Jae Kwang Kim 1 Iowa State University January 20, 2012 1 Joint work with J.N.K. Rao, Carleton University Reference Kim, J.K. and Rao,

More information

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Institute of Statistics

More information

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

More information

Bootstrap, Jackknife and other resampling methods

Bootstrap, Jackknife and other resampling methods Bootstrap, Jackknife and other resampling methods Part III: Parametric Bootstrap Rozenn Dahyot Room 128, Department of Statistics Trinity College Dublin, Ireland dahyot@mee.tcd.ie 2005 R. Dahyot (TCD)

More information

Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design

Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design 1 / 32 Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design Changbao Wu Department of Statistics and Actuarial Science University of Waterloo (Joint work with Min Chen and Mary

More information

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW SSC Annual Meeting, June 2015 Proceedings of the Survey Methods Section ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW Xichen She and Changbao Wu 1 ABSTRACT Ordinal responses are frequently involved

More information

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Econometrics Workshop UNC

More information

Graduate Econometrics I: Maximum Likelihood II

Graduate Econometrics I: Maximum Likelihood II Graduate Econometrics I: Maximum Likelihood II Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Maximum Likelihood

More information

EM Algorithm II. September 11, 2018

EM Algorithm II. September 11, 2018 EM Algorithm II September 11, 2018 Review EM 1/27 (Y obs, Y mis ) f (y obs, y mis θ), we observe Y obs but not Y mis Complete-data log likelihood: l C (θ Y obs, Y mis ) = log { f (Y obs, Y mis θ) Observed-data

More information

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007)

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007) Double Robustness Bang and Robins (2005) Kang and Schafer (2007) Set-Up Assume throughout that treatment assignment is ignorable given covariates (similar to assumption that data are missing at random

More information

Calibration estimation using exponential tilting in sample surveys

Calibration estimation using exponential tilting in sample surveys Calibration estimation using exponential tilting in sample surveys Jae Kwang Kim February 23, 2010 Abstract We consider the problem of parameter estimation with auxiliary information, where the auxiliary

More information

Fractional imputation method of handling missing data and spatial statistics

Fractional imputation method of handling missing data and spatial statistics Graduate Theses and Dissertations Graduate College 2014 Fractional imputation method of handling missing data and spatial statistics Shu Yang Iowa State University Follow this and additional works at:

More information

Calibration Estimation for Semiparametric Copula Models under Missing Data

Calibration Estimation for Semiparametric Copula Models under Missing Data Calibration Estimation for Semiparametric Copula Models under Missing Data Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Economics and Economic Growth Centre

More information

arxiv: v2 [stat.me] 7 Oct 2017

arxiv: v2 [stat.me] 7 Oct 2017 Weighted empirical likelihood for quantile regression with nonignorable missing covariates arxiv:1703.01866v2 [stat.me] 7 Oct 2017 Xiaohui Yuan, Xiaogang Dong School of Basic Science, Changchun University

More information

Computer Intensive Methods in Mathematical Statistics

Computer Intensive Methods in Mathematical Statistics Computer Intensive Methods in Mathematical Statistics Department of mathematics johawes@kth.se Lecture 16 Advanced topics in computational statistics 18 May 2017 Computer Intensive Methods (1) Plan of

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

Generalized Linear Models. Kurt Hornik

Generalized Linear Models. Kurt Hornik Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general

More information

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Statistica Sinica 24 (2014), 1001-1015 doi:http://dx.doi.org/10.5705/ss.2013.038 INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Seunghwan Park and Jae Kwang Kim Seoul National Univeristy

More information

Robustness to Parametric Assumptions in Missing Data Models

Robustness to Parametric Assumptions in Missing Data Models Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice

More information

Chap 2. Linear Classifiers (FTH, ) Yongdai Kim Seoul National University

Chap 2. Linear Classifiers (FTH, ) Yongdai Kim Seoul National University Chap 2. Linear Classifiers (FTH, 4.1-4.4) Yongdai Kim Seoul National University Linear methods for classification 1. Linear classifiers For simplicity, we only consider two-class classification problems

More information

Econometrics I, Estimation

Econometrics I, Estimation Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes Maximum Likelihood Estimation Econometrics II Department of Economics Universidad Carlos III de Madrid Máster Universitario en Desarrollo y Crecimiento Económico Outline 1 3 4 General Approaches to Parameter

More information

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional

More information

POLI 8501 Introduction to Maximum Likelihood Estimation

POLI 8501 Introduction to Maximum Likelihood Estimation POLI 8501 Introduction to Maximum Likelihood Estimation Maximum Likelihood Intuition Consider a model that looks like this: Y i N(µ, σ 2 ) So: E(Y ) = µ V ar(y ) = σ 2 Suppose you have some data on Y,

More information

Review and continuation from last week Properties of MLEs

Review and continuation from last week Properties of MLEs Review and continuation from last week Properties of MLEs As we have mentioned, MLEs have a nice intuitive property, and as we have seen, they have a certain equivariance property. We will see later that

More information

Problem Selected Scores

Problem Selected Scores Statistics Ph.D. Qualifying Exam: Part II November 20, 2010 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. Problem 1 2 3 4 5 6 7 8 9 10 11 12 Selected

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

MLE and GMM. Li Zhao, SJTU. Spring, Li Zhao MLE and GMM 1 / 22

MLE and GMM. Li Zhao, SJTU. Spring, Li Zhao MLE and GMM 1 / 22 MLE and GMM Li Zhao, SJTU Spring, 2017 Li Zhao MLE and GMM 1 / 22 Outline 1 MLE 2 GMM 3 Binary Choice Models Li Zhao MLE and GMM 2 / 22 Maximum Likelihood Estimation - Introduction For a linear model y

More information

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Michael J. Daniels and Chenguang Wang Jan. 18, 2009 First, we would like to thank Joe and Geert for a carefully

More information

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a

More information

Linear Methods for Prediction

Linear Methods for Prediction Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth

More information

Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation

Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation Dimitris Rizopoulos Department of Biostatistics, Erasmus University Medical Center, the Netherlands d.rizopoulos@erasmusmc.nl

More information

Lecture 5: LDA and Logistic Regression

Lecture 5: LDA and Logistic Regression Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant

More information

Various types of likelihood

Various types of likelihood Various types of likelihood 1. likelihood, marginal likelihood, conditional likelihood, profile likelihood, adjusted profile likelihood 2. semi-parametric likelihood, partial likelihood 3. empirical likelihood,

More information

AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY

AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Econometrics Working Paper EWP0401 ISSN 1485-6441 Department of Economics AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Lauren Bin Dong & David E. A. Giles Department of Economics, University of Victoria

More information

Chapter 8: Estimation 1

Chapter 8: Estimation 1 Chapter 8: Estimation 1 Jae-Kwang Kim Iowa State University Fall, 2014 Kim (ISU) Ch. 8: Estimation 1 Fall, 2014 1 / 33 Introduction 1 Introduction 2 Ratio estimation 3 Regression estimator Kim (ISU) Ch.

More information

Marginal Screening and Post-Selection Inference

Marginal Screening and Post-Selection Inference Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i, A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type

More information

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:. MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss

More information

Basics of Modern Missing Data Analysis

Basics of Modern Missing Data Analysis Basics of Modern Missing Data Analysis Kyle M. Lang Center for Research Methods and Data Analysis University of Kansas March 8, 2013 Topics to be Covered An introduction to the missing data problem Missing

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

Statistical Methods. Missing Data  snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23 1 / 23 Statistical Methods Missing Data http://www.stats.ox.ac.uk/ snijders/sm.htm Tom A.B. Snijders University of Oxford November, 2011 2 / 23 Literature: Joseph L. Schafer and John W. Graham, Missing

More information

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X.

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X. Optimization Background: Problem: given a function f(x) defined on X, find x such that f(x ) f(x) for all x X. The value x is called a maximizer of f and is written argmax X f. In general, argmax X f may

More information

WEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION

WEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION WEIGHTED LIKELIHOOD NEGATIVE BINOMIAL REGRESSION Michael Amiguet 1, Alfio Marazzi 1, Victor Yohai 2 1 - University of Lausanne, Institute for Social and Preventive Medicine, Lausanne, Switzerland 2 - University

More information

ECON 3150/4150, Spring term Lecture 6

ECON 3150/4150, Spring term Lecture 6 ECON 3150/4150, Spring term 2013. Lecture 6 Review of theoretical statistics for econometric modelling (II) Ragnar Nymoen University of Oslo 31 January 2013 1 / 25 References to Lecture 3 and 6 Lecture

More information

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October

More information

Introduction to Estimation Methods for Time Series models Lecture 2

Introduction to Estimation Methods for Time Series models Lecture 2 Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:

More information

Last lecture 1/35. General optimization problems Newton Raphson Fisher scoring Quasi Newton

Last lecture 1/35. General optimization problems Newton Raphson Fisher scoring Quasi Newton EM Algorithm Last lecture 1/35 General optimization problems Newton Raphson Fisher scoring Quasi Newton Nonlinear regression models Gauss-Newton Generalized linear models Iteratively reweighted least squares

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

Combining Non-probability and. Probability Survey Samples Through Mass Imputation

Combining Non-probability and. Probability Survey Samples Through Mass Imputation Combining Non-probability and arxiv:1812.10694v2 [stat.me] 31 Dec 2018 Probability Survey Samples Through Mass Imputation Jae Kwang Kim Seho Park Yilin Chen Changbao Wu January 1, 2019 Abstract. This paper

More information

0.1 The plug-in principle for finding estimators

0.1 The plug-in principle for finding estimators The Bootstrap.1 The plug-in principle for finding estimators Under a parametric model P = {P θ ;θ Θ} (or a non-parametric P = {P F ;F F}), any real-valued characteristic τ of a particular member P θ (or

More information

Generalized Linear Models Introduction

Generalized Linear Models Introduction Generalized Linear Models Introduction Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Linear Models For many problems, standard linear regression approaches don t work. Sometimes,

More information

Lecture 8: Bayesian Estimation of Parameters in State Space Models

Lecture 8: Bayesian Estimation of Parameters in State Space Models in State Space Models March 30, 2016 Contents 1 Bayesian estimation of parameters in state space models 2 Computational methods for parameter estimation 3 Practical parameter estimation in state space

More information

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators Estimation theory Parametric estimation Properties of estimators Minimum variance estimator Cramer-Rao bound Maximum likelihood estimators Confidence intervals Bayesian estimation 1 Random Variables Let

More information

How to Use the Internet for Election Surveys

How to Use the Internet for Election Surveys How to Use the Internet for Election Surveys Simon Jackman and Douglas Rivers Stanford University and Polimetrix, Inc. May 9, 2008 Theory and Practice Practice Theory Works Doesn t work Works Great! Black

More information

Linear Methods for Prediction

Linear Methods for Prediction This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Nonrespondent subsample multiple imputation in two-phase random sampling for nonresponse

Nonrespondent subsample multiple imputation in two-phase random sampling for nonresponse Nonrespondent subsample multiple imputation in two-phase random sampling for nonresponse Nanhua Zhang Division of Biostatistics & Epidemiology Cincinnati Children s Hospital Medical Center (Joint work

More information

Regression: Lecture 2

Regression: Lecture 2 Regression: Lecture 2 Niels Richard Hansen April 26, 2012 Contents 1 Linear regression and least squares estimation 1 1.1 Distributional results................................ 3 2 Non-linear effects and

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

Bootstrap. Director of Center for Astrostatistics. G. Jogesh Babu. Penn State University babu.

Bootstrap. Director of Center for Astrostatistics. G. Jogesh Babu. Penn State University  babu. Bootstrap G. Jogesh Babu Penn State University http://www.stat.psu.edu/ babu Director of Center for Astrostatistics http://astrostatistics.psu.edu Outline 1 Motivation 2 Simple statistical problem 3 Resampling

More information

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Alessandra Mattei Dipartimento di Statistica G. Parenti Università

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a

More information

2018 2019 1 9 sei@mistiu-tokyoacjp http://wwwstattu-tokyoacjp/~sei/lec-jhtml 11 552 3 0 1 2 3 4 5 6 7 13 14 33 4 1 4 4 2 1 1 2 2 1 1 12 13 R?boxplot boxplotstats which does the computation?boxplotstats

More information