Parametric fractional imputation for missing data analysis

Jae Kwang Kim
Survey Working Group Seminar, March 29, 2010

Outline

Introduction
Proposed method: fractional imputation
Approximation
Variance estimation
Multiple imputation
Simulation study
Conclusion

Introduction

$Z$: a vector of random variables with distribution $F(z; \theta)$.
$z_1, \ldots, z_n$ are $n$ independent realizations of $Z$.

Two types of parameters of interest:
1. $\theta$: the parameter in the model $F(z; \theta)$.
2. $\eta_g = E_\theta\{g(z)\}$: the parameter induced from $\theta$.

$\eta_g$ can be estimated in two ways:
1. Maximum likelihood method: $\hat\eta_g = \eta_g(\hat\theta_{MLE})$
2. Simple method: $\hat\eta_g = n^{-1} \sum_{i=1}^{n} g(z_i)$
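The two estimators of $\eta_g$ can be compared on a small example. The sketch below assumes a normal model $Z \sim N(\mu, \sigma^2)$ and $g(z) = I(z < 3)$; the sample, its parameter values, and the variable names are illustrative assumptions, not part of the talk.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
z = rng.normal(loc=2.4, scale=1.2, size=200)  # hypothetical complete sample

# 1. Maximum likelihood method: plug the MLE of (mu, sigma) into eta_g(theta).
mu_hat = z.mean()
sigma_hat = z.std(ddof=0)  # the normal MLE uses the 1/n divisor
eta_mle = NormalDist(mu_hat, sigma_hat).cdf(3.0)

# 2. Simple method: sample mean of g(z_i) = I(z_i < 3).
eta_simple = np.mean(z < 3.0)

print(f"MLE-based: {eta_mle:.3f}, simple: {eta_simple:.3f}")
```

Both target the same quantity; the first relies on the normal model being correct, while the second does not use the model at all.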

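Introduction

Suppose that $z_i$ is not fully observed. Write $z_i = (z_{\mathrm{obs},i}, z_{\mathrm{mis},i})$ for the (observed, missing) parts of $z_i$.

Under the missing-at-random assumption, we can use the following estimators.

$\hat\theta$: solution to
$$\sum_{i=1}^{n} E\{S(\theta; z_i) \mid z_{\mathrm{obs},i}\} = 0 \qquad (*)$$
and
$$\hat\eta_g = n^{-1} \sum_{i=1}^{n} E\{g(z_i) \mid z_{\mathrm{obs},i}\},$$
where $S(\theta; z) = \partial \ln f(z; \theta) / \partial\theta$ is the score function of $\theta$ under complete response.

Equation (*) is called the mean score equation.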

Introduction

Under some regularity conditions, the solution $\hat\theta$ to the mean score equation maximizes the observed likelihood (the likelihood associated with the marginal distribution of $z_{\mathrm{obs},i}$).

Computing the conditional expectation can be a challenging problem:
1. We do not know $\theta$ in $E\{g(z_i) \mid z_{\mathrm{obs},i}\} = E\{g(z_i) \mid z_{\mathrm{obs},i}; \theta\}$.
2. Even if we know $\theta$, computing the conditional expectation is numerically difficult.

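Introduction

Imputation: Monte Carlo approximation of the conditional expectation,
$$E\{S(\theta; z_i) \mid z_{\mathrm{obs},i}; \hat\theta\} \cong \frac{1}{M} \sum_{j=1}^{M} S\left(\theta; z_{\mathrm{obs},i}, z_{\mathrm{mis},i}^{*(j)}\right)$$
$$E\{g(z_i) \mid z_{\mathrm{obs},i}; \hat\theta\} \cong \frac{1}{M} \sum_{j=1}^{M} g\left(z_{\mathrm{obs},i}, z_{\mathrm{mis},i}^{*(j)}\right)$$
where $z_{\mathrm{mis},i}^{*(j)} \sim f(z_{\mathrm{mis},i} \mid z_{\mathrm{obs},i}; \hat\theta)$.

Once imputed data are provided, the estimates are easy to compute and different users get consistent results.
This is very attractive when $\eta_g$ is unknown at the time of imputation (e.g. public-access data in survey sampling).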

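Introduction

To compute $\hat\theta$, the EM algorithm can be used.

$\hat\theta_{(t+1)}$: solution to
$$M^{-1} \sum_{i=1}^{n} \sum_{j=1}^{M} S\left(\theta; z_{ij(t)}^{*}\right) = 0,$$
where $z_{ij(t)}^{*} = (z_{\mathrm{obs},i}, z_{\mathrm{mis},i(t)}^{*(j)})$ with
$$z_{\mathrm{mis},i(t)}^{*(j)} \sim f(z_{\mathrm{mis},i} \mid z_{\mathrm{obs},i}; \hat\theta_{(t)}). \qquad (1)$$

Computationally heavy if (1) requires MCMC for each $t$.
Convergence is hard to achieve (unless $M$ is increased) because the imputed values are regenerated at each iteration.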

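Proposed method: Fractional imputation

Fractional imputation
1. Generate more than one (say $M$) imputed values of $z_{\mathrm{mis},i}$: $z_{\mathrm{mis},i}^{*(1)}, \ldots, z_{\mathrm{mis},i}^{*(M)}$, from some (initial) density $h(z_{\mathrm{mis},i})$.
2. Create the weighted data set
$$\{(w_{ij}^{*}, z_{ij}^{*});\ j = 1, 2, \ldots, M;\ i = 1, 2, \ldots, n\}$$
where $\sum_{j=1}^{M} w_{ij}^{*} = 1$, $z_{ij}^{*} = (z_{\mathrm{obs},i}, z_{\mathrm{mis},i}^{*(j)})$,
$$w_{ij}^{*} \propto f(z_{ij}^{*}; \hat\theta) / h(z_{\mathrm{mis},i}^{*(j)}),$$
$\hat\theta$ is the maximum likelihood estimator of $\theta$, and $f(z; \theta)$ is the joint density of $z$.
3. The weights $w_{ij}^{*}$ are the normalized importance weights and can be called fractional weights.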

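Proposed method: Fractional imputation

Product: a fractionally imputed data set of size $nM$,
$$\{(w_{ij}^{*}, z_{ij}^{*});\ j = 1, 2, \ldots, M;\ i = 1, 2, \ldots, n\}.$$

Property: for sufficiently large $M$,
$$\sum_{j=1}^{M} w_{ij}^{*}\, g(z_{ij}^{*}) \cong \frac{\int \frac{f(z_i; \hat\theta)}{h(z_{\mathrm{mis},i})}\, g(z_i)\, h(z_{\mathrm{mis},i})\, dz_{\mathrm{mis},i}}{\int \frac{f(z_i; \hat\theta)}{h(z_{\mathrm{mis},i})}\, h(z_{\mathrm{mis},i})\, dz_{\mathrm{mis},i}} = E\{g(z_i) \mid z_{\mathrm{obs},i}; \hat\theta\}$$
for any $g$ such that the expectation exists.

If we choose $h(z_{\mathrm{mis},i}) = f(z_{\mathrm{mis},i} \mid z_{\mathrm{obs},i}, \hat\theta)$, where $\hat\theta$ is the MLE, then the fractional weights are all equal to $1/M$ and the method reduces to the usual Monte Carlo imputation for maximum likelihood estimation.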
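The large-$M$ property can be checked numerically for a single unit. The sketch below assumes a unit with $x$ observed and $y$ missing, a normal regression model for $y$ given $x$, and a deliberately wider normal proposal $h$; all numerical values and names are illustrative assumptions. Because the weights are normalized, the factor $f(x)$ of the joint density is constant in $j$ and is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_normal_pdf(y, mean, sd):
    """Log density of N(mean, sd^2) evaluated at y."""
    return -0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((y - mean) / sd) ** 2

# Hypothetical unit: x observed, y missing, y | x ~ N(beta0 + beta1 * x, sigma^2).
x_obs = 2.0
beta0, beta1, sigma = 1.0, 0.7, 1.0
M = 20_000

# Imputed values drawn from an initial density h = N(2, 2^2), not from f(y | x).
y_star = rng.normal(2.0, 2.0, size=M)

# Fractional weights: w_j proportional to f(y_j | x; theta) / h(y_j), normalized to sum to 1.
log_w = log_normal_pdf(y_star, beta0 + beta1 * x_obs, sigma) - log_normal_pdf(y_star, 2.0, 2.0)
w = np.exp(log_w - log_w.max())
w /= w.sum()

cond_mean_fi = np.sum(w * y_star)        # sum_j w_j g(y_j) with g(y) = y
cond_mean_exact = beta0 + beta1 * x_obs  # E(y | x) under the model
print(cond_mean_fi, cond_mean_exact)
```

With a large $M$ the weighted sum should sit close to the exact conditional mean; a proposal that is much narrower than the target would make the weights unstable.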

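Proposed method: Fractional imputation

EM algorithm by fractional imputation
1. Initial imputation: generate $z_{\mathrm{mis},i}^{*(j)} \sim h(z_{\mathrm{mis},i})$.
2. E-step: compute
$$w_{ij(t)}^{*} \propto f(z_{ij}^{*}; \hat\theta_{(t)}) / h(z_{\mathrm{mis},i}^{*(j)}),$$
where $\sum_{j=1}^{M} w_{ij(t)}^{*} = 1$.
3. M-step: update $\hat\theta_{(t+1)}$ as the solution to
$$\sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij(t)}^{*}\, S(\theta; z_{ij}^{*}) = 0.$$
4. Repeat Step 2 and Step 3 until convergence.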
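The four steps can be sketched for a toy normal regression in which $x$ is always observed and $y$ is missing at random. This is a minimal illustration under stated assumptions, not the author's implementation: the data-generating values, the choice of $h$ as the respondent-fitted density, and the helper names `log_norm` and `wls` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def log_norm(y, mean, sd):
    return -0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((y - mean) / sd) ** 2

def wls(x, y, w):
    """Weighted normal-regression MLE; returns (beta0, beta1, sigma)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    resid = y - X @ beta
    return beta[0], beta[1], np.sqrt(np.sum(w * resid**2) / np.sum(w))

# Toy data (assumed values): y missing with probability depending on x only.
n, M = 200, 100
x = rng.normal(2.0, 1.0, n)
y = 1.0 + 0.7 * x + rng.normal(0.0, 1.0, n)
resp = rng.random(n) < 1 / (1 + np.exp(-0.5 * x))

# Step 1: initial imputation from h, here the density fitted to respondents.
b0, b1, s = wls(x[resp], y[resp], np.ones(resp.sum()))
x_mis = x[~resp]
y_star = rng.normal(b0 + b1 * x_mis[:, None], s, size=(x_mis.size, M))
log_h = log_norm(y_star, b0 + b1 * x_mis[:, None], s)  # fixed across iterations

# Stack respondents (weight 1) and imputed rows (fractional weights).
x_all = np.concatenate([x[resp], np.repeat(x_mis, M)])
y_all = np.concatenate([y[resp], y_star.ravel()])

theta = (b0, b1, s)
for t in range(200):
    # Step 2 (E-step): only the weights change; imputed values stay fixed.
    log_w = log_norm(y_star, theta[0] + theta[1] * x_mis[:, None], theta[2]) - log_h
    w = np.exp(log_w - log_w.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    # Step 3 (M-step): solve the weighted score equation (weighted least squares).
    w_all = np.concatenate([np.ones(resp.sum()), w.ravel()])
    new = wls(x_all, y_all, w_all)
    converged = max(abs(a - b) for a, b in zip(new, theta)) < 1e-8
    theta = new
    if converged:  # Step 4: stop once the estimates no longer move
        break

print("FI-EM estimate (beta0, beta1, sigma):", theta)
```

In this toy setting the missingness depends only on $x$, so the result should stay close to the respondent-only fit; the point of the sketch is the mechanics of reweighting fixed imputed values.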

Proposed method: Fractional imputation

If we choose $h(\cdot)$ independent of $\theta$, then the imputed values do not change across iterations. Only the fractional weights change.
1. Computationally efficient (because importance sampling is carried out only once).
2. Convergence is achieved (because the imputed values do not change).

For sufficiently large $t$, $\hat\theta_{(t)} \to \hat\theta^{*}$. Also, for sufficiently large $M$, $\hat\theta^{*} \to \hat\theta_{MLE}$. Thus, a large $M$ is needed for a satisfactory approximation.

Approximation: Calibration Fractional imputation

In large-scale survey sampling, we prefer a smaller $M$.

Two-step method for fractional imputation:
1. Create a set of fractionally imputed data of size $nM_1$ (say $M_1 = 500$).
2. Use an efficient sampling and weighting method to get a final set of fractionally imputed data of size $nM_2$ (say $M_2 = 10$).

Thus, we treat the step-one imputed data as a finite population and the step-two imputed data as a sample. We can use an efficient sampling technique (such as systematic sampling or stratification) to get the final imputed data and a calibration technique for the fractional weighting.

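Approximation: Calibration Fractional imputation

Step-one data set (of size $nM_1$):
$$\{(w_{ij}^{*}, z_{ij}^{*});\ j = 1, 2, \ldots, M_1;\ i = 1, 2, \ldots, n\}$$
and the fractional weights satisfy $\sum_{j=1}^{M_1} w_{ij}^{*} = 1$ and
$$\sum_{i=1}^{n} \sum_{j=1}^{M_1} w_{ij}^{*}\, S(\hat\theta; z_{ij}^{*}) = 0,$$
where $\hat\theta$ is obtained from the EM algorithm after convergence.

The final fractionally imputed data set can be written as
$$\{(\tilde{w}_{ij}, \tilde{z}_{ij});\ j = 1, 2, \ldots, M_2;\ i = 1, 2, \ldots, n\}$$
and the fractional weights satisfy $\sum_{j=1}^{M_2} \tilde{w}_{ij} = 1$ and
$$\sum_{i=1}^{n} \sum_{j=1}^{M_2} \tilde{w}_{ij}\, S(\hat\theta; \tilde{z}_{ij}) = 0.$$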

Approximation: Calibration Fractional imputation

Thus, the fractional weights can be constructed by the calibration techniques used in survey sampling.

If the distribution belongs to the exponential family, then the calibration constraints simplify to
$$\sum_{i=1}^{n} \sum_{j=1}^{M_2} \tilde{w}_{ij}\, T(\tilde{z}_{ij}) = \sum_{i=1}^{n} \sum_{j=1}^{M_1} w_{ij}^{*}\, T(z_{ij}^{*}),$$
where $T(z)$ is the complete sufficient statistic for $\theta$.
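One way to carry out the second step is sketched below for a single unit: select $M_2$ values by systematic sampling with probability proportional to the step-one weights, then adjust equal starting weights by linear (chi-square distance) calibration so that the weighted totals of $T(z) = (1, z, z^2)$ match the step-one totals. This is an assumption-laden illustration: the talk does not specify the calibration method, the constraint is imposed per unit here (which is stronger than the summed constraint on the slide), and the step-one values and weights are simulated placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)
M1, M2 = 500, 10

# Hypothetical step-one output for one unit: imputed values and fractional weights.
z1 = rng.normal(2.4, 1.0, size=M1)
w1 = rng.dirichlet(np.ones(M1))  # placeholder weights, sum to 1

def T(z):
    """Calibration variables; the leading 1 enforces sum of weights = 1."""
    return np.column_stack([np.ones_like(z), z, z**2])

target = T(z1).T @ w1  # step-one weighted totals of T

# Systematic PPS sampling on the values sorted by z.
order = np.argsort(z1)
cum = np.cumsum(w1[order])
points = (rng.random() + np.arange(M2)) / M2
idx = np.minimum(np.searchsorted(cum, points), M1 - 1)
z2 = z1[order][idx]

# Linear calibration: w2 = d * (1 + Z @ lam), with lam solving the constraints.
d = np.full(M2, 1.0 / M2)
Z = T(z2)
lam = np.linalg.solve(Z.T @ (d[:, None] * Z), target - Z.T @ d)
w2 = d * (1.0 + Z @ lam)

print("calibrated weights:", np.round(w2, 4))
```

Linear calibration meets the constraints exactly but can produce negative weights; a ratio-type or bounded calibration method would be an alternative if that matters.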

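Variance estimation for fractional imputation

Write
$$\hat\eta_{g,FI} = \hat\eta_{g,FI}(\hat\theta) \equiv n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}(\hat\theta)\, g(z_{ij}^{*}),$$
where
$$\bar{S}(\hat\theta) \equiv n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}(\hat\theta)\, S(\hat\theta; z_{ij}^{*}) = 0.$$

Taylor linearization:
$$\hat\eta_{g,FI}(\hat\theta) \cong \hat\eta_{g,FI}(\theta_0) + \left\{\frac{\partial}{\partial\theta'} \hat\eta_{g,FI}(\theta_0)\right\} (\hat\theta - \theta_0)$$
$$0 = \bar{S}(\hat\theta) \cong \bar{S}(\theta_0) + \left\{\frac{\partial}{\partial\theta'} \bar{S}(\theta_0)\right\} (\hat\theta - \theta_0)$$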

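Variance estimation for fractional imputation

Combine the two:
$$\hat\eta_{g,FI}(\hat\theta) \cong \hat\eta_{g,FI}(\theta_0) - \left\{\frac{\partial}{\partial\theta'} \hat\eta_{g,FI}(\theta_0)\right\} \left\{\frac{\partial}{\partial\theta'} \bar{S}(\theta_0)\right\}^{-1} \bar{S}(\theta_0)$$
$$= n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}(\theta_0) \left\{ g(z_{ij}^{*}) - K\, S(\theta_0; z_{ij}^{*}) \right\}$$
$$= n^{-1} \sum_{i=1}^{n} \bar{e}_i(\theta_0),$$
where
$$K = \left\{\frac{\partial}{\partial\theta'} \hat\eta_{g,FI}(\theta_0)\right\} \left\{\frac{\partial}{\partial\theta'} \bar{S}(\theta_0)\right\}^{-1}$$
and the $\bar{e}_i(\theta_0)$ are IID random variables.

A plug-in estimator of the linearized variance can be used.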
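One plug-in form is written out below as a sketch. It assumes IID sampling, replaces $\theta_0$ by $\hat\theta$ and $K$ by its value $\hat{K}$ at $\hat\theta$, and uses the usual sample variance of the linearized terms; the symbols $\hat{e}_i$ and $\hat{V}$ are introduced here for illustration and are not taken from the slides.

```latex
\hat{e}_i = \sum_{j=1}^{M} w_{ij}^{*}(\hat\theta)
            \left\{ g(z_{ij}^{*}) - \hat{K}\, S(\hat\theta; z_{ij}^{*}) \right\},
\qquad
\hat{V}(\hat\eta_{g,FI}) = \frac{1}{n(n-1)} \sum_{i=1}^{n}
            \left( \hat{e}_i - \bar{e}_n \right)^2,
\qquad
\bar{e}_n = n^{-1} \sum_{i=1}^{n} \hat{e}_i .
```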

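Multiple imputation

Generate $M$ imputed values (with equal weights).

Features
1. Imputed values are generated from
$$z_{\mathrm{mis},i}^{*(j)} \sim f(z_{\mathrm{mis},i} \mid z_{\mathrm{obs},i}, \theta^{*}),$$
where $\theta^{*}$ is generated from the posterior distribution $\pi(\theta \mid z_{\mathrm{obs}})$.
2. The variance estimation formula is simple:
$$\hat{V}_{MI}(\bar\eta_{g,M}) = \frac{1}{M} \sum_{m=1}^{M} \hat{V}_{I(m)} + \left(1 + \frac{1}{M}\right) \frac{1}{M-1} \sum_{m=1}^{M} \left(\hat\eta_{g(m)} - \bar\eta_{g,M}\right)^2,$$
where $\bar\eta_{g,M} = M^{-1} \sum_{m=1}^{M} \hat\eta_{g(m)}$ is the average of the $M$ imputed estimators and $\hat{V}_{I(m)}$ is the imputed version of the variance estimator of $\hat\eta_g$ under complete response.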
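The multiple-imputation variance formula is short enough to write out directly. The function below is a minimal sketch; the name `rubin_variance` and the toy inputs are illustrative assumptions.

```python
import numpy as np

def rubin_variance(estimates, variances):
    """Combine M imputed point estimates and their complete-data variance estimates.

    Returns (average estimate, within-variance + (1 + 1/M) * between-variance).
    """
    est = np.asarray(estimates, dtype=float)
    var = np.asarray(variances, dtype=float)
    M = est.size
    within = var.mean()          # (1/M) * sum of V_I(m)
    between = est.var(ddof=1)    # (1/(M-1)) * sum of squared deviations
    return est.mean(), within + (1.0 + 1.0 / M) * between

# Toy input: M = 3 imputed estimates, each with complete-data variance 0.5.
point, v_mi = rubin_variance([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(point, v_mi)
```

For the toy input the within component is 0.5 and the between component is 1.0, so the combined variance is 0.5 + (4/3)(1.0).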

Multiple imputation

The computation for Bayesian imputation can be implemented by the data augmentation technique (Tanner and Wong, 1987), which is a special application of the Gibbs sampling method:
1. I-step: generate $z_{\mathrm{mis}} \sim f(z_{\mathrm{mis}} \mid z_{\mathrm{obs}}, \theta^{*})$.
2. P-step: generate $\theta^{*} \sim g(\theta \mid z_{\mathrm{obs}}, z_{\mathrm{mis}})$.

Consistency of the variance estimator is questionable (Kim et al., 2006, JRSSB):
1. If $\hat\eta_g = \eta_g(\hat\theta)$, then the variance estimator with large $M$ is consistent.
2. If $\hat\eta_g \neq \eta_g(\hat\theta)$, then the variance estimator is not consistent.

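Simulation study: simulation setup

$B = 2{,}000$ simulation samples of size $n = 200$ are generated with
$$x_i \sim N(2, 1)$$
$$y_i \mid x_i \sim N(\beta_0 + \beta_1 x_i, \sigma_{ee})$$
$$z_i \mid (x_i, y_i) \sim \mathrm{Bernoulli}(p_i)$$
where $(\beta_0, \beta_1) = (1, 0.7)$, $\sigma_{ee} = 1$,
$$p_i = \frac{\exp(\psi_0 + \psi_1 x_i + \psi_2 y_i)}{1 + \exp(\psi_0 + \psi_1 x_i + \psi_2 y_i)}$$
with $(\psi_0, \psi_1, \psi_2) = (-3, 0.5, 0.7)$.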

Simulation study: simulation setup

$x_i$ is always observed.

$y_i$ is observed if $\delta_{1i} = 1$ and missing if $\delta_{1i} = 0$, where
$$\delta_{1i} \sim \mathrm{Bernoulli}(\pi_{1i})$$
and
$$\pi_{1i} = \frac{\exp(\phi_0 + \phi_1 x_i)}{1 + \exp(\phi_0 + \phi_1 x_i)}$$
with $(\phi_0, \phi_1) = (0, 0.5)$.

$z_i$ is observed if $\delta_{2i} = 1$ and missing if $\delta_{2i} = 0$, where $\delta_{2i} \sim \mathrm{Bernoulli}(0.7)$.

The response mechanism is missing at random.
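One simulated sample under this setup can be generated as follows. The sketch takes $\psi_0 = -3$ (the sign is assumed, since it is not legible in the extracted slide text) and uses a hypothetical function name `generate_sample`; since $\sigma_{ee} = 1$, the variance and the standard deviation coincide.

```python
import numpy as np

def expit(u):
    return 1.0 / (1.0 + np.exp(-u))

def generate_sample(n=200, rng=None):
    """Draw (x, y, z) and the response indicators (delta1, delta2) for one sample."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.normal(2.0, 1.0, n)
    y = rng.normal(1.0 + 0.7 * x, 1.0)                 # (beta0, beta1) = (1, 0.7)
    z = rng.binomial(1, expit(-3.0 + 0.5 * x + 0.7 * y))  # psi0 = -3 is an assumption
    delta1 = rng.binomial(1, expit(0.0 + 0.5 * x))     # y observed when delta1 = 1
    delta2 = rng.binomial(1, 0.7, n)                   # z observed when delta2 = 1
    return x, y, z, delta1, delta2

x, y, z, delta1, delta2 = generate_sample(200, np.random.default_rng(4))
print("response rates:", delta1.mean(), delta2.mean())
```

The response indicator for $y$ depends only on the always-observed $x$, and the one for $z$ is completely random, which is what makes the mechanism missing at random.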

Simulation study: fractional imputation

1. Initial imputation:
a. Fit a parametric model $f_1(y \mid x, \theta_1)$ for the conditional distribution of $y$ given $x$ among respondents of $y$. In this simulation setup, we use the model
$$y_i \mid (x_i, \delta_{1i} = 1) \sim N(\beta_0 + \beta_1 x_i, \sigma_{ee})$$
for some $\theta_1 = (\beta_0, \beta_1, \sigma_{ee})$.
b. Estimate the parameter $\theta_1$ using the samples with $\delta_{1i} = 1$ only.
c. For each unit with $\delta_{1i} = 0$, generate $M$ imputed values of $y_i$, say $y_i^{*(1)}, \ldots, y_i^{*(M)}$, from the estimated density $f_1(y \mid x_i, \hat\theta_1)$.

Simulation study: fractional imputation

1. Initial imputation (cont'd):
d. Similarly, fit a parametric model $f_2(z \mid x, y, \theta_2)$ for the conditional distribution of $z$ given $x$ and $y$ among $z$-respondents.
e. Estimate the parameter $\theta_2$ using the samples with $\delta_{2i} = 1$ only.
f. For each unit with $\delta_{2i} = 0$, generate $M$ imputed values of $z_i$ by
$$z_i^{*(j)} \sim f_2(z \mid x_i, y_i^{*(j)}, \hat\theta_2).$$

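Simulation study: fractional imputation

2. Fractional weighting (E-step): For the current parameter estimate $\hat\theta_{(t)} = (\hat\theta_{1(t)}, \hat\theta_{2(t)})$, where $\hat\theta_{1(t)} = (\hat\beta_{0(t)}, \hat\beta_{1(t)}, \hat\sigma_{ee(t)})$ and $\hat\theta_{2(t)} = (\hat\psi_{0(t)}, \hat\psi_{1(t)}, \hat\psi_{2(t)})$, compute the fractional weights associated with the imputed values. The fractional weights associated with $(y_i^{*(j)}, z_i^{*(j)})$ are
$$w_{ij(t)}^{*} \propto \frac{f_1\left(y_i^{*(j)} \mid x_i, \hat\theta_{1(t)}\right) f_2\left(z_i^{*(j)} \mid x_i, y_i^{*(j)}, \hat\theta_{2(t)}\right)}{f_1\left(y_i^{*(j)} \mid x_i, \hat\theta_{1(0)}\right) f_2\left(z_i^{*(j)} \mid x_i, y_i^{*(j)}, \hat\theta_{2(0)}\right)}, \qquad (2)$$
where $f_1(y \mid x, \theta_1) f_2(z \mid x, y, \theta_2)$ is the joint density of $(y, z)$ given $x$. In (2), it is understood that $y_i^{*(j)} = y_i$ if $\delta_{1i} = 1$ and $z_i^{*(j)} = z_i$ if $\delta_{2i} = 1$.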
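Formula (2) can be evaluated on the log scale and normalized within each unit. The sketch below implements the ratio as written, for arrays of imputed values with one row per unit and one column per imputed value; the function names and the toy inputs are hypothetical, and $\sigma_{ee}$ is treated as a variance. How the denominator should be handled for units where only one of $y$ or $z$ was actually imputed is not spelled out on the slide, so the code makes no special case for it.

```python
import numpy as np

def log_f1(y, x, theta1):
    """Log density of y | x ~ N(b0 + b1 * x, s_ee), with s_ee a variance."""
    b0, b1, s_ee = theta1
    return -0.5 * np.log(2 * np.pi * s_ee) - 0.5 * (y - b0 - b1 * x) ** 2 / s_ee

def log_f2(z, x, y, theta2):
    """Log probability of z | x, y under the logistic model."""
    p0, p1, p2 = theta2
    lin = p0 + p1 * x + p2 * y
    return z * lin - np.logaddexp(0.0, lin)

def fractional_weights(x, y_star, z_star, th1_t, th2_t, th1_0, th2_0):
    """Normalized weights from formula (2); y_star and z_star have shape (n, M)."""
    xc = x[:, None]
    log_w = (log_f1(y_star, xc, th1_t) + log_f2(z_star, xc, y_star, th2_t)
             - log_f1(y_star, xc, th1_0) - log_f2(z_star, xc, y_star, th2_0))
    log_w -= log_w.max(axis=1, keepdims=True)  # guard against overflow
    w = np.exp(log_w)
    return w / w.sum(axis=1, keepdims=True)

# Toy check with hypothetical values: n = 4 units, M = 5 imputed values each.
rng = np.random.default_rng(6)
x = rng.normal(2.0, 1.0, 4)
y_star = rng.normal(2.4, 1.0, (4, 5))
z_star = rng.binomial(1, 0.5, (4, 5)).astype(float)
th1_0, th2_0 = (1.0, 0.7, 1.0), (-3.0, 0.5, 0.7)

w_start = fractional_weights(x, y_star, z_star, th1_0, th2_0, th1_0, th2_0)
w_later = fractional_weights(x, y_star, z_star, (1.1, 0.65, 0.9), (-2.8, 0.4, 0.75), th1_0, th2_0)
```

At $t = 0$ the numerator and denominator coincide, so every weight equals $1/M$; later iterations shift weight toward the imputed values that are more plausible under the updated parameters.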

Simulation study: fractional imputation

3. Update parameter estimates (M-step): Using the current fractional weights, compute the maximum likelihood estimator to update $\hat\theta_{(t+1)}$ by solving the following imputed score equation:
$$\bar{S}_{(t)}(\theta) \equiv \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij(t)}^{*}\, S\left(\theta; x_i, y_i^{*(j)}, z_i^{*(j)}\right) = 0.$$
The solution to the above equation can be obtained using existing software for weighted maximum likelihood.

Simulation study: fractional imputation

Using the final fractional weights after convergence, we can estimate $\eta_g = E\{g(x, y, z)\}$ by
$$\hat\eta_g = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}\, g\left(x_i, y_i^{*(j)}, z_i^{*(j)}\right),$$
where $y_i^{*(j)} = y_i$ if $\delta_{1i} = 1$ and $z_i^{*(j)} = z_i$ if $\delta_{2i} = 1$.

For fractional imputation with $M = 10$, the calibration method is used with an initial imputation of size $M_1 = 100$.

The following parameters were considered in the simulation study:
1. $\eta_1 = E(y)$
2. $\eta_2 = \Pr(y < 3)$
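The point estimator is a weighted double sum, which can be written in a few lines. The sketch below uses a hypothetical helper `fi_estimate` and a made-up toy data set with $n = 3$ units and $M = 2$ imputed values each.

```python
import numpy as np

def fi_estimate(w, g_values):
    """eta_hat = n^{-1} * sum_i sum_j w_ij * g_ij for (n, M) arrays."""
    return np.mean(np.sum(w * g_values, axis=1))

# Toy input (made up): rows are units, columns are imputed values.
y_star = np.array([[2.0, 2.0],   # y observed, so both copies equal y_i
                   [1.0, 4.0],
                   [3.5, 2.5]])
w = np.array([[0.50, 0.50],
              [0.25, 0.75],
              [0.40, 0.60]])

eta1_hat = fi_estimate(w, y_star)                      # estimates E(y)
eta2_hat = fi_estimate(w, (y_star < 3).astype(float))  # estimates Pr(y < 3)
print(eta1_hat, eta2_hat)
```

The same fractionally imputed data set serves both parameters; only the function $g$ changes.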

Simulation study: result

Table 1. Monte Carlo bias and variance of the point estimators.

Parameter   Estimator         Bias   Variance   Std Var
$\eta_1$    Complete sample   0.00   0.00739    100
            FI (M = 100)      0.00   0.00925    125
            MI (M = 100)      0.00   0.01058    143
            FI (M = 10)       0.00   0.00952    129
            MI (M = 10)       0.00   0.01081    146
$\eta_2$    Complete sample   0.00   0.00108    100
            FI (M = 100)      0.00   0.00117    108
            MI (M = 100)      0.00   0.00114    106
            FI (M = 10)       0.00   0.00117    108
            MI (M = 10)       0.00   0.00116    107
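The "Std Var" column appears to be the variance expressed as a percentage of the complete-sample variance; this reading is an inference from the numbers, not a definition given on the slide. The arithmetic below reproduces the column from the reported variances.

```python
# Variances reported in Table 1, keyed by parameter and estimator.
variances = {
    "eta1": {"Complete sample": 0.00739, "FI (M = 100)": 0.00925, "MI (M = 100)": 0.01058,
             "FI (M = 10)": 0.00952, "MI (M = 10)": 0.01081},
    "eta2": {"Complete sample": 0.00108, "FI (M = 100)": 0.00117, "MI (M = 100)": 0.00114,
             "FI (M = 10)": 0.00117, "MI (M = 10)": 0.00116},
}

# Standardized variance: 100 * variance / complete-sample variance, rounded.
std_var = {
    param: {name: round(100 * v / rows["Complete sample"]) for name, v in rows.items()}
    for param, rows in variances.items()
}

for param, rows in std_var.items():
    print(param, rows)
```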

Simulation study: result

Table 2. Monte Carlo relative bias of the variance estimator.

Parameter          Imputation     Relative bias (%)
$V(\hat\eta_1)$    FI (M = 100)   3.7
                   MI (M = 100)   4.3
                   FI (M = 10)    2.7
                   MI (M = 10)    4.2
$V(\hat\eta_2)$    FI (M = 100)   3.9
                   MI (M = 100)   15.6
                   FI (M = 10)    2.3
                   MI (M = 10)    15.5

Simulation study: result

For the estimation of $E(Y)$, fractional imputation is more efficient.
For the estimation of proportions, multiple imputation is slightly more efficient. However, multiple imputation provides biased variance estimates for proportions.

Note that we are not using the MLE for $\eta_2$. Under complete response,
$$\hat\eta_2 = \frac{1}{n} \sum_{i=1}^{n} I(y_i < 3)$$
is less efficient than the maximum likelihood estimator. The MI variance estimator is justified only for MLEs.

Conclusion

Fractional imputation is proposed as a tool for computing the conditional expectation of any function of the original data given the observed data.
Fractional imputation can be used to implement the Monte Carlo EM algorithm in a computationally efficient manner.
Variance estimation based on Taylor linearization is also covered.
Further details can be found in Kim (2010, Biometrika, tentatively accepted).

Future Research

Survey sampling application (with multivariate missing data)
Nonparametric or semi-parametric fractional imputation
Measurement error models
Random effect models (with possible applications in small area estimation)
More theoretical work on inference (Wilks' theorem?)
