Parametric fractional imputation for missing data analysis
Jae Kwang Kim
Survey Working Group Seminar
March 29, 2010
Outline

Introduction
Proposed method
Fractional imputation
Approximation
Variance estimation
Multiple imputation
Simulation study
Conclusion
Introduction

$Z$: a vector of random variables with distribution $F(z; \theta)$.
$z_1, \ldots, z_n$ are $n$ independent realizations of $Z$.

Two types of parameters of interest:
1. $\theta$: the parameter in the model $F(z; \theta)$
2. $\eta_g = E_\theta\{g(z)\}$: the parameter induced by $\theta$

$\eta_g$ can be estimated in two ways:
1. Maximum likelihood method: $\hat{\eta}_g = \eta_g(\hat{\theta}_{MLE})$
2. Simple method: $\hat{\eta}_g = n^{-1} \sum_{i=1}^{n} g(z_i)$
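The two estimators of $\eta_g$ can be compared on a toy case. The sketch below is an illustration only, under assumptions that are not in the slides: a normal model $N(\mu, \sigma^2)$, the choice $g(z) = I(z < 3)$, and the helper names `normal_cdf`, `eta_mle` and `eta_simple`.

```python
import math
import random

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def eta_mle(sample, cutoff):
    """Plug-in estimator eta_g(theta_hat_MLE) for eta_g = Pr(Z < cutoff)."""
    n = len(sample)
    mu_hat = sum(sample) / n
    # The normal MLE of the variance uses divisor n, not n - 1.
    sigma_hat = math.sqrt(sum((z - mu_hat) ** 2 for z in sample) / n)
    return normal_cdf(cutoff, mu_hat, sigma_hat)

def eta_simple(sample, cutoff):
    """Simple estimator n^{-1} sum_i g(z_i) with g(z) = 1{z < cutoff}."""
    return sum(1.0 for z in sample if z < cutoff) / len(sample)

rng = random.Random(0)
sample = [rng.gauss(2.0, 1.0) for _ in range(1000)]  # assumed N(2, 1) data
print(eta_mle(sample, 3.0), eta_simple(sample, 3.0))
```

Both numbers estimate $\Pr(Z < 3)$. The plug-in version relies on the normal model being correct, while the simple mean does not.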
Introduction

Suppose that $z_i$ is not fully observed.
$z_i = (z_{obs,i}, z_{mis,i})$: the (observed, missing) parts of $z_i$.

Under the missing-at-random assumption, we can use the following estimators:

$\hat{\theta}$: solution to
\[
\sum_{i=1}^{n} E\{S(\theta; z_i) \mid z_{obs,i}\} = 0 \qquad (*)
\]
\[
\hat{\eta}_g = n^{-1} \sum_{i=1}^{n} E\{g(z_i) \mid z_{obs,i}\}
\]
where $S(\theta; z_i) = \partial \ln f(z_i; \theta) / \partial \theta$ is the score function of $\theta$ under complete response.

The equation in (*) is called the mean score equation.
Introduction

Under some regularity conditions, the solution $\hat{\theta}$ to the mean score equation maximizes the observed likelihood (= the likelihood associated with the marginal distribution of $z_{obs,i}$).

Computing the conditional expectation can be a challenging problem:
1. We do not know $\theta$ in $E\{g(z_i) \mid z_{obs,i}\} = E\{g(z_i) \mid z_{obs,i}; \theta\}$.
2. Even if we know $\theta$, computing the conditional expectation is numerically difficult.
Introduction

Imputation: Monte Carlo approximation of the conditional expectation
\[
E\{S(\theta; z_i) \mid z_{obs,i}; \hat{\theta}\} \approx \frac{1}{M} \sum_{j=1}^{M} S\left(\theta; z_{obs,i}, z_{mis,i}^{(j)}\right)
\]
\[
E\{g(z_i) \mid z_{obs,i}; \hat{\theta}\} \approx \frac{1}{M} \sum_{j=1}^{M} g\left(z_{obs,i}, z_{mis,i}^{(j)}\right)
\]
where $z_{mis,i}^{(j)} \sim f(z_{mis,i} \mid z_{obs,i}; \hat{\theta})$.

By providing imputed data, the estimates are very easy to compute and different users can get consistent results.
This is very attractive when $\eta_g$ is unknown at the time of imputation (e.g. public-access data in survey sampling).
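A minimal sketch of the Monte Carlo approximation, assuming a toy model in which $x$ is observed, $y$ is missing, and $y \mid x \sim N(\beta_0 + \beta_1 x, \sigma^2)$ with $\hat{\theta}$ already available. The function name `mc_conditional_mean` and the numerical values are illustrative choices, not part of the talk.

```python
import random

def mc_conditional_mean(g, x_obs, theta_hat, M, rng):
    """Approximate E{g(x, y) | x; theta_hat} by averaging g over M draws of the missing y."""
    beta0, beta1, sigma = theta_hat
    draws = [rng.gauss(beta0 + beta1 * x_obs, sigma) for _ in range(M)]
    return sum(g(x_obs, y) for y in draws) / M

rng = random.Random(1)
theta_hat = (1.0, 0.7, 1.0)  # assumed to be estimated beforehand; illustrative values
approx = mc_conditional_mean(lambda x, y: y, x_obs=2.0, theta_hat=theta_hat, M=5000, rng=rng)
print(approx)  # close to 1.0 + 0.7 * 2.0 = 2.4, up to Monte Carlo error
```

The same draws can be reused for any other $g$, which is the practical appeal of releasing imputed data.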
Introduction

To compute $\hat{\theta}$, the EM algorithm can be used.

$\hat{\theta}^{(t+1)}$: solution to
\[
M^{-1} \sum_{i=1}^{n} \sum_{j=1}^{M} S\left(\theta; z_{ij(t)}^{*}\right) = 0
\]
where $z_{ij(t)}^{*} = (z_{obs,i}, z_{mis,i(t)}^{(j)})$ with
\[
z_{mis,i(t)}^{(j)} \sim f(z_{mis,i} \mid z_{obs,i}; \hat{\theta}^{(t)}). \qquad (1)
\]

Computationally heavy if (1) requires MCMC for each $t$.
Convergence is hard to achieve (unless $M$ is increased) since the imputed values are regenerated at each iteration.
Proposed method: Fractional imputation

Fractional imputation
1. Generate more than one (say $M$) imputed values of $z_{mis,i}$: $z_{mis,i}^{(1)}, \ldots, z_{mis,i}^{(M)}$, from some (initial) density $h(z_{mis,i})$.
2. Create the weighted data set
\[
\{(w_{ij}^{*}, z_{ij}^{*}) ;\ j = 1, 2, \ldots, M;\ i = 1, 2, \ldots, n\}
\]
where $\sum_{j=1}^{M} w_{ij}^{*} = 1$, $z_{ij}^{*} = (z_{obs,i}, z_{mis,i}^{(j)})$,
\[
w_{ij}^{*} \propto f(z_{ij}^{*}; \hat{\theta}) / h(z_{mis,i}^{(j)}),
\]
$\hat{\theta}$ is the maximum likelihood estimator of $\theta$, and $f(z; \theta)$ is the joint density of $z$.
3. The weights $w_{ij}^{*}$ are the normalized importance weights and can be called fractional weights.
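Proposed method: Fractional imputation

Product: a fractionally imputed data set of size $nM$
\[
\{(w_{ij}^{*}, z_{ij}^{*}) ;\ j = 1, 2, \ldots, M;\ i = 1, 2, \ldots, n\}
\]

Property: for sufficiently large $M$,
\[
\sum_{j=1}^{M} w_{ij}^{*} g(z_{ij}^{*}) \approx \frac{\int \frac{f(z_i; \hat{\theta})}{h(z_{mis,i})} g(z_i) h(z_{mis,i}) \, dz_{mis,i}}{\int \frac{f(z_i; \hat{\theta})}{h(z_{mis,i})} h(z_{mis,i}) \, dz_{mis,i}} = E\{g(z_i) \mid z_{obs,i}; \hat{\theta}\}
\]
for any $g$ such that the expectation exists.

If we choose $h(z_{mis,i}) = f(z_{mis,i} \mid z_{obs,i}, \hat{\theta})$, where $\hat{\theta}$ is the MLE, then this is equal to the usual Monte Carlo imputation for maximum likelihood estimation.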
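The property above is the usual self-normalized importance sampling identity. The sketch below illustrates it for a single unit under assumptions of my own: the target conditional density is $N(1, 1)$, the initial density $h$ is $N(0, 2^2)$, and $g(y) = y$. The names `normal_pdf` and `fractional_weights` are hypothetical helpers.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def fractional_weights(imputed, target_pdf, proposal_pdf):
    """Normalized importance weights: proportional to target / proposal, summing to 1."""
    raw = [target_pdf(y) / proposal_pdf(y) for y in imputed]
    total = sum(raw)
    return [r / total for r in raw]

rng = random.Random(2)
M = 5000
imputed = [rng.gauss(0.0, 2.0) for _ in range(M)]  # draws from h = N(0, 2^2)
weights = fractional_weights(
    imputed,
    target_pdf=lambda y: normal_pdf(y, 1.0, 1.0),    # assumed target N(1, 1)
    proposal_pdf=lambda y: normal_pdf(y, 0.0, 2.0),
)
cond_mean = sum(w * y for w, y in zip(weights, imputed))
print(cond_mean)  # approximates the target mean 1.0, up to Monte Carlo error
```

Only the ratio target / proposal matters, so the target density needs to be known only up to a constant.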
Proposed method: Fractional imputation

EM algorithm by fractional imputation
1. Initial imputation: generate $z_{mis,i}^{(j)} \sim h(z_{mis,i})$.
2. E-step: compute
\[
w_{ij(t)}^{*} \propto f(z_{ij}^{*}; \hat{\theta}^{(t)}) / h(z_{mis,i}^{(j)})
\]
where $\sum_{j=1}^{M} w_{ij(t)}^{*} = 1$.
3. M-step: update $\hat{\theta}^{(t+1)}$ as the solution to
\[
\sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij(t)}^{*} S(\theta; z_{ij}^{*}) = 0.
\]
4. Repeat Step 2 and Step 3 until convergence.
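The four steps can be sketched on a deliberately small model. Everything specific here is an assumption made for illustration and is not taken from the talk: $y_i \sim N(\mu, 1)$ with only $\mu$ unknown, $z_i \mid y_i \sim \text{Bernoulli}(\text{expit}(y_i))$ always observed, $y_i$ missing completely at random, and $h = N(\hat{\mu}_0, 1.5^2)$ where $\hat{\mu}_0$ is the respondent mean.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def expit(u):
    return 1.0 / (1.0 + math.exp(-u))

rng = random.Random(3)
n, M, true_mu, h_sigma = 400, 50, 1.0, 1.5
y = [rng.gauss(true_mu, 1.0) for _ in range(n)]
z = [1 if rng.random() < expit(v) else 0 for v in y]
observed = [rng.random() < 0.6 for _ in range(n)]  # y_i is seen when True

# Step 1: initial imputation from h, done once and never repeated.
obs_y = [y[i] for i in range(n) if observed[i]]
mu0 = sum(obs_y) / len(obs_y)
imputed = {i: [rng.gauss(mu0, h_sigma) for _ in range(M)]
           for i in range(n) if not observed[i]}

def e_step(mu_t):
    """Fractional weights proportional to f(y, z; mu_t) / h(y), normalized per unit."""
    weights = {}
    for i, draws in imputed.items():
        raw = []
        for v in draws:
            p = expit(v)
            f_z = p if z[i] == 1 else 1.0 - p
            raw.append(normal_pdf(v, mu_t, 1.0) * f_z / normal_pdf(v, mu0, h_sigma))
        total = sum(raw)
        weights[i] = [r / total for r in raw]
    return weights

def m_step(weights):
    """The score for mu is sum(y - mu), so the weighted score equation has a closed form."""
    total = sum(obs_y)
    total += sum(w * v for i in imputed for w, v in zip(weights[i], imputed[i]))
    return total / n

mu_t = mu0
for _ in range(100):  # Step 4: iterate E-step and M-step
    weights = e_step(mu_t)
    mu_next = m_step(weights)
    converged = abs(mu_next - mu_t) < 1e-8
    mu_t = mu_next
    if converged:
        break
print(mu_t)
```

The imputed values are generated once; each iteration only recomputes the weights, which is the point of choosing $h$ free of $\theta$.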
Proposed method: Fractional imputation

If we set $h(\cdot)$ independent of $\theta$, then the imputed values do not change from one iteration to the next. Only the fractional weights change.
1. Computationally efficient (because importance sampling is used only once).
2. Convergence is achieved (because the imputed values are not changed).

For sufficiently large $t$, $\hat{\theta}^{(t)} \to \hat{\theta}^{*}$. Also, for sufficiently large $M$, $\hat{\theta}^{*} \to \hat{\theta}_{MLE}$.
Thus, a large $M$ is needed for a satisfactory approximation.
Approximation: Calibration fractional imputation

In large-scale survey sampling, we prefer a smaller $M$.

Two-step method for fractional imputation:
1. Create a set of fractionally imputed data of size $nM_1$ (say $M_1 = 500$).
2. Use an efficient sampling and weighting method to get a final set of fractionally imputed data of size $nM_2$ (say $M_2 = 10$).

Thus, we treat the step-one imputed data as a finite population and the step-two imputed data as a sample. We can use an efficient sampling technique (such as systematic sampling or stratification) to get the final imputed data, and a calibration technique for fractional weighting.
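Approximation: Calibration fractional imputation

Step-one data set (of size $nM_1$):
\[
\{(w_{ij}^{*}, z_{ij}^{*}) ;\ j = 1, 2, \ldots, M_1;\ i = 1, 2, \ldots, n\}
\]
and the fractional weights satisfy $\sum_{j=1}^{M_1} w_{ij}^{*} = 1$ and
\[
\sum_{i=1}^{n} \sum_{j=1}^{M_1} w_{ij}^{*} S(\hat{\theta}; z_{ij}^{*}) = 0
\]
where $\hat{\theta}$ is obtained from the EM algorithm after convergence.

The final fractionally imputed data set can be written
\[
\{(\tilde{w}_{ij}, \tilde{z}_{ij}) ;\ j = 1, 2, \ldots, M_2;\ i = 1, 2, \ldots, n\}
\]
and the fractional weights satisfy $\sum_{j=1}^{M_2} \tilde{w}_{ij} = 1$ and
\[
\sum_{i=1}^{n} \sum_{j=1}^{M_2} \tilde{w}_{ij} S(\hat{\theta}; \tilde{z}_{ij}) = 0.
\]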
Approximation: Calibration fractional imputation

Thus, the fractional weights can be constructed by the calibration techniques used in survey sampling.

If the distribution belongs to the exponential family, then the calibration constraints can be simplified to
\[
\sum_{i=1}^{n} \sum_{j=1}^{M_2} \tilde{w}_{ij} T(\tilde{z}_{ij}) = \sum_{i=1}^{n} \sum_{j=1}^{M_1} w_{ij}^{*} T(z_{ij}^{*})
\]
where $T(z)$ is the complete sufficient statistic for $\theta$.
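One possible way to carry out the two-step method is sketched below for a scalar $T$. The slides do not say which sampling or calibration method was used, so both choices here are mine: systematic sampling with probability proportional to the step-one weights, followed by a linear (regression-type) calibration $\tilde{w}_{ij} = M_2^{-1}\{1 + \lambda (T_{ij} - \bar{T}_i)\}$. The step-one values and weights are synthetic.

```python
import random

def systematic_pps(weights, m, rng):
    """Systematic sampling of m indices with probability proportional to weights (sum to 1)."""
    start = rng.random() / m
    picks, cum, j = [], 0.0, 0
    for k in range(m):
        point = start + k / m
        # Advance until the cumulative weight interval of index j covers the point.
        while j < len(weights) - 1 and cum + weights[j] < point:
            cum += weights[j]
            j += 1
        picks.append(j)
    return picks

rng = random.Random(4)
n, M1, M2 = 50, 100, 10

# Synthetic step-one data: scalar T values and normalized fractional weights.
step1_T = [[rng.gauss(0.0, 1.0) for _ in range(M1)] for _ in range(n)]
step1_w = []
for _ in range(n):
    raw = [rng.random() + 0.1 for _ in range(M1)]
    total = sum(raw)
    step1_w.append([r / total for r in raw])

# Right-hand side of the calibration constraint (the step-one weighted total of T).
target = sum(w * t for wi, ti in zip(step1_w, step1_T) for w, t in zip(wi, ti))

# Step two: select M2 imputed values per unit; initial weights are 1 / M2.
final_T = [[ti[j] for j in systematic_pps(wi, M2, rng)]
           for wi, ti in zip(step1_w, step1_T)]
w0 = 1.0 / M2

# Linear calibration. Centering T within each unit keeps sum_j w_ij = 1.
unit_means = [w0 * sum(ti) for ti in final_T]
spread = sum(w0 * (t - m) ** 2 for ti, m in zip(final_T, unit_means) for t in ti)
lam = (target - sum(unit_means)) / spread
final_w = [[w0 * (1.0 + lam * (t - m)) for t in ti]
           for ti, m in zip(final_T, unit_means)]

calibrated_total = sum(w * t for wi, ti in zip(final_w, final_T) for w, t in zip(wi, ti))
print(target, calibrated_total)
```

A linear calibration can produce negative weights when the adjustment is large. A vector-valued $T$ would need a linear solve for $\lambda$ instead of a scalar division.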
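Variance estimation for fractional imputation

Write
\[
\hat{\eta}_{g,FI} = \hat{\eta}_{g,FI}(\hat{\theta}) \equiv n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}(\hat{\theta}) \, g(z_{ij}^{*})
\]
where
\[
\bar{S}(\hat{\theta}) \equiv n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}(\hat{\theta}) \, S(\hat{\theta}; z_{ij}^{*}) = 0.
\]

Taylor linearization:
\[
\hat{\eta}_{g,FI}(\hat{\theta}) \cong \hat{\eta}_{g,FI}(\theta_0) + \left\{ \frac{\partial}{\partial \theta'} \hat{\eta}_{g,FI}(\theta_0) \right\} (\hat{\theta} - \theta_0)
\]
\[
0 = \bar{S}(\hat{\theta}) \cong \bar{S}(\theta_0) + \left\{ \frac{\partial}{\partial \theta'} \bar{S}(\theta_0) \right\} (\hat{\theta} - \theta_0)
\]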
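Variance estimation for fractional imputation

Combining the two expansions:
\[
\hat{\eta}_{g,FI} \cong \hat{\eta}_{g,FI}(\theta_0) - \left\{ \frac{\partial}{\partial \theta'} \hat{\eta}_{g,FI}(\theta_0) \right\} \left\{ \frac{\partial}{\partial \theta'} \bar{S}(\theta_0) \right\}^{-1} \bar{S}(\theta_0)
\]
\[
= n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}(\theta_0) \left\{ g(z_{ij}^{*}) - K S(\theta_0; z_{ij}^{*}) \right\} = n^{-1} \sum_{i=1}^{n} \bar{e}_i(\theta_0)
\]
where
\[
K = \left\{ \frac{\partial}{\partial \theta'} \hat{\eta}_{g,FI}(\theta_0) \right\} \left\{ \frac{\partial}{\partial \theta'} \bar{S}(\theta_0) \right\}^{-1}
\]
and the $\bar{e}_i(\theta_0)$ are IID random variables.

A plug-in estimator for the linearized variance can be used.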
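The slide does not write out the plug-in estimator. One natural form, assuming IID sampling and using the standard variance estimator for a sample mean, is sketched below. Here $\hat{K}$ denotes $K$ evaluated at $\hat{\theta}$, and the symbols $\hat{e}_i$ and $\hat{V}$ are notation introduced for this sketch.

```latex
\[
\hat{e}_i = \sum_{j=1}^{M} w_{ij}^{*}(\hat{\theta}) \left\{ g(z_{ij}^{*}) - \hat{K} S(\hat{\theta}; z_{ij}^{*}) \right\},
\qquad
\hat{V}(\hat{\eta}_{g,FI}) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \left( \hat{e}_i - \frac{1}{n} \sum_{k=1}^{n} \hat{e}_k \right)^{2}
\]
```

Because $\bar{S}(\hat{\theta}) = 0$, the average of the $\hat{e}_i$ equals $\hat{\eta}_{g,FI}$, so the centering term is the point estimate itself.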
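Multiple imputation

Generate $M$ imputed values (with equal weights).

Features
1. Imputed values are generated from
\[
z_{mis,i}^{(j)} \sim f(z_{mis,i} \mid z_{obs,i}, \theta^{*})
\]
where $\theta^{*}$ is generated from the posterior distribution $\pi(\theta \mid z_{obs})$.
2. The variance estimation formula is simple:
\[
\hat{V}_{MI}(\bar{\eta}_{g,M}) = \frac{1}{M} \sum_{m=1}^{M} \hat{V}_{I(m)} + \left(1 + \frac{1}{M}\right) \frac{1}{M-1} \sum_{m=1}^{M} \left(\hat{\eta}_{g(m)} - \bar{\eta}_{g,M}\right)^{2}
\]
where $\bar{\eta}_{g,M} = M^{-1} \sum_{m=1}^{M} \hat{\eta}_{g(m)}$ is the average of the $M$ imputed estimators and $\hat{V}_{I(m)}$ is the imputed version of the variance estimator of $\hat{\eta}_g$ under complete response.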
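The variance formula translates directly into code. The function name `rubin_variance` and the input numbers below are illustrative only.

```python
def rubin_variance(estimates, variances):
    """Combine M imputed estimates and their complete-data variance estimates.

    Returns (average estimate, within + (1 + 1/M) * between).
    """
    M = len(estimates)
    eta_bar = sum(estimates) / M
    within = sum(variances) / M
    between = sum((e - eta_bar) ** 2 for e in estimates) / (M - 1)
    return eta_bar, within + (1.0 + 1.0 / M) * between

# Made-up inputs: three imputed estimates, each with variance estimate 0.5.
eta_bar, v_mi = rubin_variance([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(eta_bar, v_mi)  # within = 0.5, between = 1.0, so v_mi = 0.5 + (4/3) * 1.0
```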
Multiple imputation

The computation for Bayesian imputation can be implemented by the data augmentation technique (Tanner and Wong, 1987), which is a special application of the Gibbs sampling method:
1. I-step: generate $z_{mis} \sim f(z_{mis} \mid z_{obs}, \theta^{*})$
2. P-step: generate $\theta^{*} \sim g(\theta \mid z_{obs}, z_{mis})$

Consistency of the variance estimator is questionable (Kim et al., 2006, JRSSB).
1. If the estimator has the form $\hat{\eta}_g = \eta_g(\hat{\theta})$, then the variance estimator with large $M$ is consistent.
2. If $\hat{\eta}_g \neq \eta_g(\hat{\theta})$, then the variance estimator is not consistent.
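The I-step and P-step can be sketched for the simplest possible case. The model is an assumption made for illustration: $z_i \sim N(\mu, 1)$ with a flat prior on $\mu$, so that the P-step draws $\mu$ from $N(\bar{z}, 1/n)$ given the completed data. The function name `data_augmentation` is hypothetical.

```python
import math
import random

def data_augmentation(z_obs, n_mis, n_iter, rng):
    """Alternate I-step and P-step for z ~ N(mu, 1) with a flat prior on mu."""
    n = len(z_obs) + n_mis
    mu = sum(z_obs) / len(z_obs)  # starting value
    draws = []
    for _ in range(n_iter):
        # I-step: impute the missing values given the current parameter draw.
        z_mis = [rng.gauss(mu, 1.0) for _ in range(n_mis)]
        # P-step: draw mu from its posterior given the completed data.
        z_bar = (sum(z_obs) + sum(z_mis)) / n
        mu = rng.gauss(z_bar, 1.0 / math.sqrt(n))
        draws.append(mu)
    return draws

rng = random.Random(5)
z_obs = [rng.gauss(2.0, 1.0) for _ in range(100)]
draws = data_augmentation(z_obs, n_mis=50, n_iter=2000, rng=rng)
kept = draws[100:]  # discard a short burn-in
posterior_mean = sum(kept) / len(kept)
print(posterior_mean, sum(z_obs) / len(z_obs))
```

In this toy case the chain targets the posterior of $\mu$ given the observed data only, so the two printed numbers should be close.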
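Simulation study: simulation setup

$B = 2{,}000$ simulation samples of size $n = 200$ are generated with
\[
x_i \sim N(2, 1)
\]
\[
y_i \mid x_i \sim N(\beta_0 + \beta_1 x_i, \sigma_{ee})
\]
\[
z_i \mid (x_i, y_i) \sim \text{Bernoulli}(p_i)
\]
where $(\beta_0, \beta_1) = (1, 0.7)$, $\sigma_{ee} = 1$,
\[
p_i = \frac{\exp(\psi_0 + \psi_1 x_i + \psi_2 y_i)}{1 + \exp(\psi_0 + \psi_1 x_i + \psi_2 y_i)}
\]
with $(\psi_0, \psi_1, \psi_2) = (-3, 0.5, 0.7)$.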
Simulation study: simulation setup

$x_i$ is always observed.

$y_i$ is observed if $\delta_{1i} = 1$ and missing if $\delta_{1i} = 0$, where
\[
\delta_{1i} \sim \text{Bernoulli}(\pi_{1i})
\]
and
\[
\pi_{1i} = \frac{\exp(\phi_0 + \phi_1 x_i)}{1 + \exp(\phi_0 + \phi_1 x_i)}
\]
with $(\phi_0, \phi_1) = (0, 0.5)$.

$z_i$ is observed if $\delta_{2i} = 1$ and missing if $\delta_{2i} = 0$, where $\delta_{2i} \sim \text{Bernoulli}(0.7)$.

The response mechanism is missing at random.
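A minimal sketch of generating one simulation sample under this setup. Two points are assumptions: $\psi_0$ is taken to be $-3$ (the sign is not legible in the extracted slide text), and $\sigma_{ee}$ is treated as a variance, which makes no difference when it equals 1. The function name `generate_sample` and the dictionary layout are illustrative.

```python
import math
import random

def expit(u):
    return 1.0 / (1.0 + math.exp(-u))

def generate_sample(n, rng, beta=(1.0, 0.7), sigma_ee=1.0,
                    psi=(-3.0, 0.5, 0.7), phi=(0.0, 0.5)):
    """Generate one sample; missing values of y and z are stored as None."""
    rows = []
    for _ in range(n):
        x = rng.gauss(2.0, 1.0)
        y = rng.gauss(beta[0] + beta[1] * x, math.sqrt(sigma_ee))  # sigma_ee assumed a variance
        z = 1 if rng.random() < expit(psi[0] + psi[1] * x + psi[2] * y) else 0
        # Response indicators: delta1 depends on x only, delta2 is constant-rate.
        delta1 = 1 if rng.random() < expit(phi[0] + phi[1] * x) else 0
        delta2 = 1 if rng.random() < 0.7 else 0
        rows.append({
            "x": x,
            "y": y if delta1 == 1 else None,
            "z": z if delta2 == 1 else None,
            "delta1": delta1,
            "delta2": delta2,
        })
    return rows

rows = generate_sample(200, random.Random(6))
print(sum(r["delta1"] for r in rows), sum(r["delta2"] for r in rows))
```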
Simulation study: fractional imputation

1. Initial imputation:
   a. Fit a parametric model $f_1(y \mid x, \theta_1)$ for the conditional distribution of $y$ given $x$ among respondents of $y$. In this simulation setup, we use the model
      \[
      y_i \mid (x_i, \delta_{1i} = 1) \sim N(\beta_0 + \beta_1 x_i, \sigma_{ee})
      \]
      for some $\theta_1 = (\beta_0, \beta_1, \sigma_{ee})$.
   b. Estimate the parameter $\theta_1$ using the samples with $\delta_{1i} = 1$ only.
   c. For each unit with $\delta_{1i} = 0$, generate $M$ imputed values of $y_i$, say $y_i^{(1)}, \ldots, y_i^{(M)}$, from the estimated density $f_1(y \mid x_i, \hat{\theta}_1)$.
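Simulation study: fractional imputation

1. Initial imputation (cont'd):
   d. Similarly, fit a parametric model $f_2(z \mid x, y, \theta_2)$ for the conditional distribution of $z$ given $x$ and $y$ among $z$-respondents.
   e. Estimate the parameter $\theta_2$ using the samples with $\delta_{2i} = 1$ only.
   f. For each unit with $\delta_{2i} = 0$, generate $M$ imputed values of $z_i$ by
      \[
      z_i^{(j)} \sim f_2(z \mid x_i, y_i^{(j)}, \hat{\theta}_2),
      \]
      where $y_i^{(j)} = y_i$ if $y_i$ is observed.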
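Simulation study: fractional imputation

2. Fractional weighting (E-step): For the current parameter estimate $\hat{\theta}_{(t)} = (\hat{\theta}_{1(t)}, \hat{\theta}_{2(t)})$, where $\hat{\theta}_{1(t)} = (\hat{\beta}_{0(t)}, \hat{\beta}_{1(t)}, \hat{\sigma}_{ee(t)})$ and $\hat{\theta}_{2(t)} = (\hat{\psi}_{0(t)}, \hat{\psi}_{1(t)}, \hat{\psi}_{2(t)})$, compute the fractional weights associated with the imputed values. The fractional weights associated with $(y_i^{(j)}, z_i^{(j)})$ are
\[
w_{ij(t)}^{*} \propto \frac{f_1\left(y_i^{(j)} \mid x_i, \hat{\theta}_{1(t)}\right) f_2\left(z_i^{(j)} \mid x_i, y_i^{(j)}, \hat{\theta}_{2(t)}\right)}{f_1\left(y_i^{(j)} \mid x_i, \hat{\theta}_{1(0)}\right) f_2\left(z_i^{(j)} \mid x_i, y_i^{(j)}, \hat{\theta}_{2(0)}\right)}, \qquad (2)
\]
where $f_1(y \mid x, \theta_1) f_2(z \mid x, y, \theta_2)$ is the joint density of $(y, z)$ given $x$. In (2), it is understood that $y_i^{(j)} = y_i$ if $\delta_{1i} = 1$ and $z_i^{(j)} = z_i$ if $\delta_{2i} = 1$.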
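Equation (2) can be sketched for a single unit as follows, assuming the normal and logistic forms of $f_1$ and $f_2$ from the simulation setup and treating $\sigma_{ee}$ as a variance. The function names `f1`, `f2` and `fractional_weights`, and the numerical inputs, are illustrative.

```python
import math

def f1(y, x, theta1):
    """Normal density of y given x; theta1 = (beta0, beta1, sigma_ee), sigma_ee a variance."""
    beta0, beta1, sigma_ee = theta1
    resid = y - (beta0 + beta1 * x)
    return math.exp(-0.5 * resid ** 2 / sigma_ee) / math.sqrt(2.0 * math.pi * sigma_ee)

def f2(z, x, y, theta2):
    """Bernoulli probability of z given (x, y) under a logistic model; theta2 = (psi0, psi1, psi2)."""
    p = 1.0 / (1.0 + math.exp(-(theta2[0] + theta2[1] * x + theta2[2] * y)))
    return p if z == 1 else 1.0 - p

def fractional_weights(x, y_draws, z_draws, theta_t, theta_0):
    """Weights of equation (2) for one unit, normalized to sum to 1."""
    raw = []
    for y, z in zip(y_draws, z_draws):
        num = f1(y, x, theta_t[0]) * f2(z, x, y, theta_t[1])
        den = f1(y, x, theta_0[0]) * f2(z, x, y, theta_0[1])
        raw.append(num / den)
    total = sum(raw)
    return [r / total for r in raw]

theta_0 = ((1.0, 0.7, 1.0), (-3.0, 0.5, 0.7))  # initial estimates (illustrative values)
theta_t = ((1.1, 0.6, 0.9), (-2.5, 0.4, 0.8))  # a later EM iterate (illustrative values)
y_draws, z_draws = [2.0, 3.0, 1.5], [1, 0, 1]

w_start = fractional_weights(2.0, y_draws, z_draws, theta_0, theta_0)
w_later = fractional_weights(2.0, y_draws, z_draws, theta_t, theta_0)
print(w_start, w_later)
```

At $t = 0$ the numerator and denominator coincide, so every weight equals $1/M$. The weights move away from $1/M$ only as the parameter estimate moves.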
Simulation study: fractional imputation

3. Update parameter estimates (M-step): Using the current fractional weights, compute the maximum likelihood estimator to update $\hat{\theta}_{(t+1)}$ by solving the following imputed score equation:
\[
\bar{S}_{(t)}^{*}(\theta) \equiv \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij(t)}^{*} S\left(\theta; x_i, y_i^{(j)}, z_i^{(j)}\right) = 0
\]
The solution to the above equation can be obtained using existing software.
Simulation study: fractional imputation

Using the final fractional weights after convergence, we can estimate $\eta_g = E\{g(x, y, z)\}$ by
\[
\hat{\eta}_g = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*} \, g\left(x_i, y_i^{(j)}, z_i^{(j)}\right)
\]
where $y_i^{(j)} = y_i$ if $\delta_{1i} = 1$ and $z_i^{(j)} = z_i$ if $\delta_{2i} = 1$.

For fractional imputation with $M = 10$, the calibration method is used with an initial imputation of size $M_1 = 100$.

The following parameters were considered in the simulation study:
1. $\eta_1 = E(y)$
2. $\eta_2 = \Pr(y < 3)$
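The estimator is a weighted average over the fractionally imputed data set. A minimal sketch with a hand-made data set of two units; the data layout and the function name `fi_estimate` are assumptions for illustration.

```python
def fi_estimate(units, g):
    """Fractional imputation estimate n^{-1} sum_i sum_j w_ij g(x_i, y_ij, z_ij).

    units: list of (x, [(w, y, z), ...]) with the weights of each unit summing to 1.
    """
    n = len(units)
    return sum(w * g(x, y, z) for x, triples in units for (w, y, z) in triples) / n

units = [
    (1.0, [(1.0, 2.0, 1)]),                   # fully observed unit: a single weight of 1
    (2.0, [(0.25, 2.0, 0), (0.75, 4.0, 0)]),  # y missing: two imputed values
]
eta1_hat = fi_estimate(units, lambda x, y, z: y)                        # estimates E(y)
eta2_hat = fi_estimate(units, lambda x, y, z: 1.0 if y < 3.0 else 0.0)  # estimates Pr(y < 3)
print(eta1_hat, eta2_hat)  # 2.75 and 0.625
```

The same weighted data set serves both $\eta_1$ and $\eta_2$; only $g$ changes.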
Simulation study: result

Table 1. Monte Carlo bias and variance of the point estimators.

Parameter   Estimator         Bias   Variance   Std Var
$\eta_1$    Complete sample   0.00   0.00739    100
            FI (M = 100)      0.00   0.00925    125
            MI (M = 100)      0.00   0.01058    143
            FI (M = 10)       0.00   0.00952    129
            MI (M = 10)       0.00   0.01081    146
$\eta_2$    Complete sample   0.00   0.00108    100
            FI (M = 100)      0.00   0.00117    108
            MI (M = 100)      0.00   0.00114    106
            FI (M = 10)       0.00   0.00117    108
            MI (M = 10)       0.00   0.00116    107

Std Var: variance relative to the complete-sample variance, times 100.
Simulation study: result

Table 2. Monte Carlo relative bias of the variance estimator.

Parameter           Imputation     Relative bias (%)
$V(\hat{\eta}_1)$   FI (M = 100)   3.7
                    MI (M = 100)   4.3
                    FI (M = 10)    2.7
                    MI (M = 10)    4.2
$V(\hat{\eta}_2)$   FI (M = 100)   3.9
                    MI (M = 100)   15.6
                    FI (M = 10)    2.3
                    MI (M = 10)    15.5
Simulation study: result

For the estimation of $E(Y)$, fractional imputation is more efficient.
For the estimation of proportions, multiple imputation is slightly more efficient.
However, multiple imputation provides biased variance estimates for proportions.

Note that we are not using the MLE for $\eta_2$. Under complete response,
\[
\hat{\eta}_2 = \frac{1}{n} \sum_{i=1}^{n} I(y_i < 3)
\]
is less efficient than the maximum likelihood estimator. The MI variance estimator is justified only for MLEs.
Conclusion

Fractional imputation is proposed as a tool for computing the conditional expectation of any function of the original data given the observed data.
Fractional imputation can be used to implement the Monte Carlo EM algorithm in a computationally efficient manner.
Variance estimation based on Taylor linearization is also covered.
Further details can be found in Kim (2010, Biometrika, tentatively accepted).
Future Research

Survey sampling application (with multivariate missing data)
Nonparametric or semi-parametric fractional imputation
Measurement error models
Random effect models (with possible applications in small area estimation)
More theoretical work on inference (Wilks' theorem?)