Parametric fractional imputation for missing data analysis


Section on Survey Research Methods, JSM 2008

Jae Kwang Kim, Wayne Fuller

Abstract

Under a parametric model for missing data, the EM algorithm is a popular tool for finding the maximum likelihood estimates (MLE) of the parameters of the model. Imputation, when carefully done, can be used to facilitate the parameter estimation by applying the complete-sample estimators to the imputed dataset. The basic idea is to generate the imputed values from the conditional distribution of the missing data given the observed data. Multiple imputation is a Bayesian approach to generate the imputed values from the conditional distribution. In this article, parametric fractional imputation is proposed as a parametric approach for generating imputed values. Using fractional weights, the E-step of the EM algorithm can be approximated by the weighted mean of the imputed data likelihood, where the fractional weights are computed from the current value of the parameter estimates. Some computational efficiency can be achieved using the idea of importance sampling in the Monte Carlo approximation of the conditional expectation. The resulting estimator of the specified parameters will be identical to the MLE under missing data if the fractional weights are adjusted using a calibration step. The proposed imputation method provides efficient parameter estimates for the model parameters specified and also provides reasonable estimates for parameters that are not part of the imputation model, for example domain means. Thus, the proposed imputation method is a useful tool for general-purpose data analysis. Variance estimation is covered and results from a limited simulation study are presented.

Key Words: EM algorithm, Importance sampling, Monte Carlo EM, Multiple imputation, Observed likelihood, Observed information.

1. INTRODUCTION

Suppose that $y_1, y_2, \ldots, y_n$ are independent observations of a $p$-dimensional random variable $y$ from a parametric distribution with density $f(y; \theta_0)$ with $\theta_0 \in \Omega$.
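The MLE of $\theta_0$ can be obtained as a solution to the following score equation:

$$S_n(\theta) \equiv \sum_{i=1}^{n} s_i(\theta) = 0, \tag{1}$$

where $s_i(\theta) = \partial \ln f(y_i; \theta)/\partial\theta$ and $S_n(\theta)$ is the score function. Given missing data, let $(y_{i,obs}, y_{i,mis})$ denote the observed part and missing part of $y_i$, respectively. To simplify the presentation, we assume the response mechanism is Missing-At-Random (MAR) in the sense of Rubin (1976). Under MAR, the likelihood function is a marginal likelihood obtained by integrating out over the missing part. Thus, we can write the observed likelihood as

$$L_{obs}(\theta) = \prod_{i=1}^{n} f_{obs(i)}(y_{i,obs}; \theta), \tag{2}$$

where $f_{obs(i)}(y_{i,obs}; \theta) = \int f(y_{i,obs}, y_{i,mis}; \theta)\, dy_{i,mis}$ is the marginal density of $y_{i,obs}$ and the subscript $(i)$ is used in $f_{obs(i)}$ because the missing pattern can differ from observation to observation. To compute the MLE that maximizes the observed likelihood (2), we need to solve the observed score equation for $\theta$, where the observed score equation is

$$S_{obs}(\theta) \equiv \sum_{i=1}^{n} s_{i,obs}(\theta) \equiv \frac{\partial}{\partial\theta} \sum_{i=1}^{n} \ln f_{obs(i)}(y_{i,obs}; \theta) = 0. \tag{3}$$

Instead of solving (3), the MLE of $\theta_0$ can be obtained by solving

$$\bar S(\theta) \equiv E\{S_n(\theta) \mid Y_{obs}\} = \sum_{i=1}^{n} E\{s_i(\theta) \mid y_{i,obs}\} = 0, \tag{4}$$

where $Y_{obs} = (y_{1,obs}, y_{2,obs}, \ldots, y_{n,obs})$, and $\bar S(\theta)$ is called the mean score function. The equivalence of the observed score function and the mean score function was first proved by Fisher (1925).

Author affiliation: Department of Statistics, Iowa State University, Ames, IA 50011, U.S.A.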
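The equivalence between the observed score in (3) and the mean score in (4) can be checked in a few lines. The following is a sketch of the standard argument for a single unit; it assumes that differentiation with respect to $\theta$ and integration over $y_{i,mis}$ can be interchanged, a regularity condition that is not stated explicitly above.

```latex
\begin{align*}
s_{i,obs}(\theta)
  &= \frac{\partial}{\partial\theta}\ln f_{obs(i)}(y_{i,obs};\theta)
   = \frac{1}{f_{obs(i)}(y_{i,obs};\theta)}
     \int \frac{\partial}{\partial\theta} f(y_{i,obs},y_{i,mis};\theta)\,dy_{i,mis} \\
  &= \int s_i(\theta)\,
     \frac{f(y_{i,obs},y_{i,mis};\theta)}{f_{obs(i)}(y_{i,obs};\theta)}\,dy_{i,mis}
   = E\{s_i(\theta)\mid y_{i,obs},\theta\}.
\end{align*}
```

The second line uses $\partial f/\partial\theta = s_i(\theta) f$ and the fact that the ratio of the joint density to the marginal density is the conditional density of $y_{i,mis}$ given $y_{i,obs}$. Summing over $i$ gives $S_{obs}(\theta) = \bar S(\theta)$.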

Strictly speaking, the conditional expectation in (4) is evaluated at $\theta$ and we should write the mean score equation as

$$\bar S(\theta) \equiv \sum_{i=1}^{n} E\{s_i(\theta) \mid y_{i,obs}, \theta\} = 0. \tag{5}$$

The EM algorithm, proposed by Dempster et al. (1977), computes the solution iteratively by defining $\hat\theta_{(t+1)}$ to be the solution to

$$\sum_{i=1}^{n} E\{s_i(\theta) \mid y_{i,obs}, \hat\theta_{(t)}\} = 0, \tag{6}$$

where $\hat\theta_{(t)}$ is the estimate of $\theta$ obtained at the $t$-th iteration. To compute the conditional expectation in (6), the Monte Carlo implementation of the EM (MCEM) algorithm of Wei and Tanner (1990) can be used. The MCEM method avoids the analytic computation of the conditional expectation (6) by using the Monte Carlo approximation based on the imputed data. Thus, one can interpret imputation as a Monte Carlo approximation of the conditional expectation given the observed data.

The Monte Carlo methods of approximating the conditional expectation in (4) can be placed in two classes:

1. Bayesian approach: Generate the imputed values from the posterior predictive distribution of $y_{i,mis}$ given $y_{i,obs}$:

$$f(y_{i,mis} \mid y_{i,obs}) = \int f(y_{i,mis} \mid \theta, y_{i,obs})\, f(\theta \mid y_{i,obs})\, d\theta. \tag{7}$$

This is essentially the approach used in multiple imputation as proposed by Rubin (1987).

2. Frequentist approach: Generate the imputed values from the conditional distribution $f(y_{i,mis} \mid y_{i,obs}, \hat\theta)$ with an estimated value $\hat\theta$.

The Bayesian approach to imputation has been proposed as a general method of handling missing data because of the feasibility of Bayesian computational methods and the simplicity of variance estimation. However, the convergence to a stable posterior predictive distribution (7) is difficult to check and often requires huge computation (Gelman et al., 1996). Also, the variance estimator used in multiple imputation is not always consistent. For examples, see Fay (1992), Wang and Robins (1998), and Kim et al. (2006). In the frequentist approach to imputation, the imputed values are generated from the conditional distribution $f(y_{i,mis} \mid y_{i,obs}, \hat\theta)$ with a particular value $\hat\theta$, often the MLE of $\theta$. However, the frequentist approach for imputation has received less attention than Bayesian imputation.
One notable exception is Wang and Robins (1998), who studied the asymptotic properties of multiple imputation and a parametric frequentist imputation procedure. Wang and Robins (1998) considered the estimated parameter $\hat\theta$ to be given, and did not discuss parameter estimation.

We consider a frequentist imputation given a parametric model for the original distribution. We propose an alternative implementation of the MCEM method using parametric fractional imputation that does not require regeneration of the imputed values at each iteration. Only the fractional weights are re-computed for each iteration and we propose a simple method of computing the fractional weights without increasing the size of Monte Carlo samples. The proposed method uses the calibration technique to obtain the MLE and is computationally very attractive in many cases.

In Section 2, the parametric fractional imputation method is proposed. Variance estimation is discussed in Section 3 and the proposed method is extended for general purpose estimation in Section 4. Calibration fractional imputation is derived in Section 5. Results from a limited simulation study are presented in Section 6.

2. Proposed method

As discussed in Section 1, solving the mean score equation (4) requires an iterative method because the conditional distribution of $y_{i,mis}$ given $y_{i,obs}$, denoted by $f(y_{i,mis} \mid y_{i,obs}, \theta)$, is a function of $\theta$. Thus, since we cannot generate imputed values from the conditional distribution with unknown $\theta$, the iterative procedure generates imputed values from the conditional distribution with the current value of $\theta$ and then updates $\theta$ based on the imputed score equation. To avoid re-generating values from the conditional distribution at each step, we first generate $M$ imputed values from some known distribution $q(y_{i,mis})$ whose support includes that of $f(y_{i,mis} \mid y_{i,obs}, \theta)$. Let the generated values be $y_{i,mis}^{*(1)}, \ldots, y_{i,mis}^{*(M)}$. Because

$$E\{s_i(\theta) \mid y_{i,obs}, \hat\theta_{(t)}\} = \int s_i(\theta)\, \frac{f(y_{i,mis} \mid y_{i,obs}, \hat\theta_{(t)})}{q(y_{i,mis})}\, q(y_{i,mis})\, dy_{i,mis}, \tag{8}$$

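we can approximate the conditional expectation by

$$E\{s_i(\theta) \mid y_{i,obs}, \hat\theta_{(t)}\} \doteq \frac{1}{M} \sum_{j=1}^{M} s_{ij}^{*}(\theta)\, \frac{f(y_{i,mis}^{*(j)} \mid y_{i,obs}, \hat\theta_{(t)})}{q(y_{i,mis}^{*(j)})},$$

where $s_{ij}^{*}(\theta) = s(\theta; y_{i,obs}, y_{i,mis}^{*(j)})$. Thus, we propose the following algorithm for the parametric fractional imputation using importance sampling:

[Step 1] Obtain an initial estimator $\hat\theta_{(0)}$ of $\theta$. Also, generate $M$ imputed values, $y_{i,mis}^{*(1)}, \ldots, y_{i,mis}^{*(M)}$, from some density $q(y_{i,mis})$. Often, $q(y_{i,mis}) = f(y_{i,mis} \mid y_{i,obs}, \hat\theta_{(0)})$.

[Step 2] With the current estimate of $\theta$, denoted by $\hat\theta_{(t)}$, compute the fractional weights as

$$w_{ij(t)}^{*} = C_{i(t)}\, \frac{f(y_{i,mis}^{*(j)} \mid y_{i,obs}; \hat\theta_{(t)})}{q(y_{i,mis}^{*(j)})}, \tag{9}$$

where $C_{i(t)}$ is chosen to satisfy $\sum_{j=1}^{M} w_{ij(t)}^{*} = 1$.

[Step 3] Using the fractional weights obtained from Step 2, solve the weighted score equation

$$\hat\theta_{(t+1)} \leftarrow \text{solution to } \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij(t)}^{*}\, s_{ij}^{*}(\theta) = 0. \tag{10}$$

[Step 4] Go to Step 2. Stop if $\hat\theta_{(t)}$ meets the convergence criterion.

The proposed method is computationally attractive because we use a weighted score equation to compute the parameter estimates. Unlike the MCEM method, the imputed values are not changed for each iteration; only the fractional weights are changed.

Remark 1 In Step 2, fractional weights can be computed by using the joint density with the current parameter estimate $\hat\theta_{(t)}$. Note that

$$\frac{f(y_{i,mis}^{*(j)} \mid y_{i,obs}, \hat\theta_{(t)}) / q(y_{i,mis}^{*(j)})}{\sum_{k=1}^{M} f(y_{i,mis}^{*(k)} \mid y_{i,obs}, \hat\theta_{(t)}) / q(y_{i,mis}^{*(k)})} = \frac{f(y_{i,obs}, y_{i,mis}^{*(j)}; \hat\theta_{(t)}) / q(y_{i,mis}^{*(j)})}{\sum_{k=1}^{M} f(y_{i,obs}, y_{i,mis}^{*(k)}; \hat\theta_{(t)}) / q(y_{i,mis}^{*(k)})}.$$

Thus, the fractional weights (9) can be computed as

$$w_{ij(t)}^{*} = C_{i(t)}\, \frac{f(y_{i,obs}, y_{i,mis}^{*(j)}; \hat\theta_{(t)})}{q(y_{i,mis}^{*(j)})},$$

which does not require the density of the conditional distribution. Only the joint density is needed.

Remark 2 The choice of the initial density $q(y_{i,mis})$ is somewhat arbitrary. If we choose $q(y_{i,mis}) = f(y_{i,mis} \mid y_{i,obs}, \hat\theta_{(0)})$, where $\hat\theta_{(0)}$ is an initial parameter estimate of $\theta$, the fractional weight with current parameter estimate $\hat\theta_{(t)}$ is of the form

$$w_{ij(t)}^{*} = C_{i(t)}\, \frac{f(y_{i,obs}, y_{i,mis}^{*(j)}; \hat\theta_{(t)})}{f(y_{i,obs}, y_{i,mis}^{*(j)}; \hat\theta_{(0)})}, \tag{11}$$

where $C_{i(t)}$ is a normalizing constant. The initial estimate $\hat\theta_{(0)}$ is not necessarily $\sqrt{n}$-consistent.

Given the $M$ imputed values, $y_{i,mis}^{*(1)}, \ldots, y_{i,mis}^{*(M)}$, generated from $q(y_{i,mis})$, the sequence of estimators $\{\hat\theta_{(0)}, \hat\theta_{(1)}, \ldots\}$ can be constructed from the parametric fractional imputation using importance sampling. The following theorem presents some convergence properties of the sequence of the estimators.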
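Before the theorem, Steps 1 to 4 can be made concrete with a minimal sketch. It is not the authors' implementation: it assumes a bivariate normal model in which $y_1$ is always observed and $y_2$ is sometimes missing, takes the complete-case estimate as $\hat\theta_{(0)}$, and runs a fixed number of iterations in place of a convergence criterion. The function names `log_joint`, `moments` and `pfi` are hypothetical. Weights follow the joint-density ratio of Remark 2, and the weighted score equation (10) has a closed form (weighted moments) for the normal model.

```python
import math
import random


def log_joint(y1, y2, theta):
    """Log bivariate normal density; theta = (mu1, mu2, s11, s12, s22)."""
    mu1, mu2, s11, s12, s22 = theta
    det = s11 * s22 - s12 ** 2
    d1, d2 = y1 - mu1, y2 - mu2
    quad = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def moments(rows):
    """Solve the weighted normal score equation for rows of (weight, y1, y2)."""
    n = sum(w for w, _, _ in rows)
    mu1 = sum(w * a for w, a, _ in rows) / n
    mu2 = sum(w * b for w, _, b in rows) / n
    s11 = sum(w * (a - mu1) ** 2 for w, a, _ in rows) / n
    s12 = sum(w * (a - mu1) * (b - mu2) for w, a, b in rows) / n
    s22 = sum(w * (b - mu2) ** 2 for w, _, b in rows) / n
    return (mu1, mu2, s11, s12, s22)


def pfi(data, M=100, n_iter=30, seed=1):
    """data: list of (y1, y2) pairs with y2 = None when missing."""
    rng = random.Random(seed)
    full = [(1.0, a, b) for a, b in data if b is not None]
    miss = [a for a, b in data if b is None]
    # Step 1: initial estimate from complete cases (an assumption of this
    # sketch), then draw y2* once from q = f(y2 | y1, theta0).
    theta0 = moments(full)
    mu1, mu2, s11, s12, s22 = theta0
    slope = s12 / s11
    sd = math.sqrt(s22 - s12 ** 2 / s11)
    draws = [[mu2 + slope * (a - mu1) + sd * rng.gauss(0.0, 1.0)
              for _ in range(M)] for a in miss]
    theta = theta0
    for _ in range(n_iter):  # Step 4 replaced by a fixed iteration count
        rows = list(full)
        for a, stars in zip(miss, draws):
            # Step 2: weights proportional to f(y1, y2*; theta_t) / f(y1, y2*; theta_0).
            logw = [log_joint(a, b, theta) - log_joint(a, b, theta0) for b in stars]
            top = max(logw)
            w = [math.exp(v - top) for v in logw]
            total = sum(w)
            rows.extend((wj / total, a, b) for wj, b in zip(w, stars))
        # Step 3: weighted score equation, closed form for the normal model.
        theta = moments(rows)
    return theta
```

The imputed values in `draws` are generated once and never regenerated; only the weights change between iterations, which is the point of the method. Subtracting the largest log weight before exponentiating is a numerical safeguard added here and does not change the normalized weights.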

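Theorem 1 Assume that the $M$ imputed values are generated from $q(y_{i,mis})$. Let

$$Q^{*}(\theta \mid \hat\theta_{(t)}) = \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij(t)}^{*} \ln f(y_{i,obs}, y_{i,mis}^{*(j)}; \theta), \tag{12}$$

where $w_{ij(t)}^{*} = w_{ij}^{*}(\hat\theta_{(t)})$. If

$$Q^{*}(\hat\theta_{(t+1)} \mid \hat\theta_{(t)}) \ge Q^{*}(\hat\theta_{(t)} \mid \hat\theta_{(t)}), \tag{13}$$

then

$$L_{obs}^{*}(\hat\theta_{(t+1)}) \ge L_{obs}^{*}(\hat\theta_{(t)}), \tag{14}$$

where $L_{obs}^{*}(\theta) = \prod_{i=1}^{n} f_{obs(i)}^{*}(y_{i,obs}; \theta)$ with

$$f_{obs(i)}^{*}(y_{i,obs}; \theta) = \frac{\sum_{j=1}^{M} f(y_{i,obs}, y_{i,mis}^{*(j)}; \theta) / q(y_{i,mis}^{*(j)})}{\sum_{j=1}^{M} 1 / q(y_{i,mis}^{*(j)})}.$$

Proof. By Jensen's inequality,

$$\ln L_{obs}^{*}(\hat\theta_{(t+1)}) - \ln L_{obs}^{*}(\hat\theta_{(t)}) = \sum_{i=1}^{n} \ln \left\{ \sum_{j=1}^{M} w_{ij(t)}^{*}\, \frac{f(y_{i,obs}, y_{i,mis}^{*(j)}; \hat\theta_{(t+1)})}{f(y_{i,obs}, y_{i,mis}^{*(j)}; \hat\theta_{(t)})} \right\}$$
$$\ge \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij(t)}^{*} \ln \left\{ \frac{f(y_{i,obs}, y_{i,mis}^{*(j)}; \hat\theta_{(t+1)})}{f(y_{i,obs}, y_{i,mis}^{*(j)}; \hat\theta_{(t)})} \right\} = Q^{*}(\hat\theta_{(t+1)} \mid \hat\theta_{(t)}) - Q^{*}(\hat\theta_{(t)} \mid \hat\theta_{(t)}).$$

Therefore, (13) implies (14).

Note that $L_{obs}^{*}(\theta)$ is an imputed version of the observed likelihood based on the $M$ imputed values, $y_{i,mis}^{*(1)}, \ldots, y_{i,mis}^{*(M)}$, generated from $q(y_{i,mis})$. Under fairly general conditions, the solution to the imputed score equation (10) satisfies (13). Thus, by Theorem 1, the sequence $L_{obs}^{*}(\hat\theta_{(t)})$ is monotonically increasing. Also, under the fairly general conditions stated in Wu (1983), the convergence of $\hat\theta_{(t)}$ follows for fixed $M$. Theorem 1 does not hold for the sequence obtained from the MCEM method for fixed $M$.

3. Variance estimation

To discuss variance estimation, note that

$$-\frac{\partial}{\partial\theta'} \bar S(\theta) = I_{obs}(\theta), \tag{15}$$

where

$$I_{obs}(\theta) = -E\left\{ \frac{\partial}{\partial\theta'} S_n(\theta) \,\Big|\, Y_{obs}, \theta \right\} + \bar S(\theta)^{\otimes 2} - E\left\{ S_n(\theta)^{\otimes 2} \mid Y_{obs}, \theta \right\} \tag{16}$$

with $S_n(\theta) = \sum_{i=1}^{n} s_i(\theta)$ and $S^{\otimes 2} = SS'$. Louis (1982) first proved (15) to estimate the variance of the MLE obtained by the EM algorithm. Let $\hat\theta^{*}$ be the solution to the approximate mean score equation

$$\bar S^{*}(\theta) \equiv \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}(\theta)\, s_{ij}^{*}(\theta) = 0, \tag{17}$$

where $s_{ij}^{*}(\theta) = s(\theta; y_{i,obs}, y_{i,mis}^{*(j)})$ and

$$w_{ij}^{*}(\theta) = \frac{f(y_{i,obs}, y_{i,mis}^{*(j)}; \theta) / q(y_{i,mis}^{*(j)})}{\sum_{k=1}^{M} f(y_{i,obs}, y_{i,mis}^{*(k)}; \theta) / q(y_{i,mis}^{*(k)})}. \tag{18}$$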

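Note that

$$E\{\bar S^{*}(\theta) \mid Y_{obs}\} = \bar S(\theta), \tag{19}$$

where $\bar S(\theta)$ is defined in (5) and the expectation in (19) is over the imputation mechanism. Here, the superscript $*$ is used in $\hat\theta^{*}$ to emphasize that the solution is obtained from the approximate mean score equation (17), not from the exact mean score equation (5). An EM-type algorithm such as (10) can be used to find a solution $\hat\theta^{*}$ to (17). Using the Taylor linearization,

$$\hat\theta^{*} - \theta_0 \doteq \left[ E\left\{ -\frac{\partial}{\partial\theta'} \bar S(\theta_0) \right\} \right]^{-1} \bar S^{*}(\theta_0).$$

Thus, we can use the sandwich formula to compute the variance of $\hat\theta^{*}$, the Monte Carlo approximation to the solution of (4). Note that, by (19),

$$Var\{\bar S^{*}(\theta_0)\} = Var\{\bar S(\theta_0)\} + Var\{\bar S^{*}(\theta_0) - \bar S(\theta_0)\}. \tag{20}$$

The first term on the right side of (20) can be estimated by $I_{obs}(\hat\theta^{*})$, as suggested by Louis (1982). The observed information (16) can be easily computed from fractional imputation. That is, we use $\hat I_{obs}(\hat\theta^{*})$ as an estimator of $I_{obs}(\theta_0)$, where

$$\hat I_{obs}(\theta) = -\sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}\, \frac{\partial}{\partial\theta'} s(\theta; y_{i,obs}, y_{i,mis}^{*(j)}) + \sum_{i=1}^{n} \bar s_i^{*}(\theta)^{\otimes 2} - \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}\, s_{ij}^{*}(\theta)^{\otimes 2}, \tag{21}$$

$\bar s_i^{*}(\theta) = \sum_{j=1}^{M} w_{ij}^{*} s_{ij}^{*}(\theta)$ and $w_{ij}^{*} = w_{ij}^{*}(\hat\theta^{*})$. Thus, the estimator in (21) is based on the Monte Carlo approximation of the conditional expectation (16) using fractional imputation, where the fractional weight corresponds to the importance weight of importance sampling. Because the Monte Carlo expectation not only approximates the mean score equation (5) but also approximates the observed information (16), the fractional imputation (FI) method provides consistent variance estimation for sufficiently large $M$.

To estimate the second term of (20), we consider the case when $y_{i,mis}^{*(1)}, \ldots, y_{i,mis}^{*(M)}$ are independent samples from $q(y_{i,mis})$. In this case, we can express

$$\bar S^{*}(\theta) = \frac{1}{M} \sum_{j=1}^{M} S_j^{*}(\theta),$$

where $S_j^{*}(\theta) = M \sum_{i=1}^{n} w_{ij}^{*} s_{ij}^{*}(\theta)$, and we have

$$\frac{1}{M} B(\theta) = \frac{1}{M} \frac{1}{M-1} \sum_{j=1}^{M} \{S_j^{*}(\theta) - \bar S^{*}(\theta)\}^{\otimes 2}$$

to be unbiased for the second term of (20). Therefore, the proposed variance estimator is

$$\hat V(\hat\theta^{*}) = [I_{obs}(\hat\theta^{*})]^{-1} + [I_{obs}(\hat\theta^{*})]^{-1}\, \frac{1}{M} B(\hat\theta^{*})\, [I_{obs}(\hat\theta^{*})]^{-1}. \tag{22}$$

Often, the second term in (20) is very small for large $M$ or for an efficient imputation method. In this case, the second term of (22) can be safely omitted in the variance estimation.

4. Extensions

So far, we have considered the case where the parameter of interest is estimated by the maximum likelihood method.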
We consider an extension where the parameter of interest is not necessarily estimated from the maximum likelihood method, but is estimated by solving an estimating equation. Suppose that, under complete response, a parameter of interest, denoted by $\eta$, is estimated as the unique solution to the estimating equation

$$U(\eta) \equiv \sum_{i=1}^{n} u(\eta; y_i) = 0, \tag{23}$$

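for some function $u(\eta; y_i)$ of $\eta$ with continuous partial derivatives. Let $\hat\eta$ be the solution to (23). Under some regularity conditions,

$$\sqrt{n}(\hat\eta - \eta_0) \to N\left[ 0,\; g(\eta_0)^{-1}\, V\{u(\eta_0; y)\}\, \{g(\eta_0)'\}^{-1} \right],$$

where $g(\eta) = E\{\partial u(\eta; y)/\partial\eta'\}$ and $\eta_0$ is a unique solution to $E\{U(\eta)\} = 0$. Under nonresponse, a consistent estimator of $\eta_0$ can be obtained as a solution to the following estimating equation

$$\bar U(\eta \mid \hat\theta) \equiv \sum_{i=1}^{n} E\{u(\eta; y_i) \mid y_{i,obs}, \hat\theta\} = 0, \tag{24}$$

where $\hat\theta$ is the solution to (5). The estimating equation (24) is called the expected estimating equation. The use of an expected estimating equation has been discussed by, among others, Wang and Pepe (2000) and Robins and Wang (2000). Using the fractional imputation approach discussed in Section 2, we can construct a Monte Carlo approximation to the estimating equation

$$\hat\eta^{*} \leftarrow \text{solution to } \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}(\hat\theta^{*})\, u_{ij}^{*}(\eta) = 0, \tag{25}$$

where $u_{ij}^{*}(\eta) = u(\eta; y_{i,obs}, y_{i,mis}^{*(j)})$, $w_{ij}^{*}(\theta)$ is defined in (18), and $\hat\theta^{*}$ is the solution to (17). Note that we do not have to update the solution $\hat\theta^{*}$ iteratively in (25) and only the final estimate $\hat\theta^{*}$ is needed.

The following theorem presents some asymptotic properties of the estimator that is the solution to (24), or the solution to (25).

Theorem 2 Let $\hat\theta^{*}$ be the Monte Carlo approximation of the MLE of $\theta$ that is computed by solving the approximated mean score equation (17). Under some regularity conditions, the solution $\hat\eta^{*}$ to (25) satisfies

$$\sqrt{n}(\hat\eta^{*} - \tilde\eta^{*}) = o_p(1) \tag{26}$$

and

$$E(\tilde\eta^{*}) = \eta_0, \qquad Var(\tilde\eta^{*}) = g(\eta_0)^{-1}\, Var\{\tilde U^{*}(\eta_0, \theta_0)\}\, \{g(\eta_0)'\}^{-1}. \tag{27}$$

Here, $g(\eta) = E\{\sum_{i=1}^{n} \partial u(\eta; y_i)/\partial\eta'\}$ and

$$\tilde U^{*}(\eta, \theta) = \bar U^{*}(\eta, \theta) + K' \bar S^{*}(\theta), \tag{28}$$

where

$$\bar U^{*}(\eta, \theta) = \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}(\theta)\, u_{ij}^{*}(\eta), \qquad \bar S^{*}(\theta) = \sum_{i=1}^{n} \sum_{j=1}^{M} w_{ij}^{*}(\theta)\, s_{ij}^{*}(\theta),$$

and

$$K = [I_{obs}(\theta_0)]^{-1} E[S_{mis}(\theta_0)\, U(\eta_0)']. \tag{29}$$

Here, $I_{obs}(\theta) = -E\{\partial S_{obs}(\theta)/\partial\theta'\}$ and $S_{mis}(\theta) = S_n(\theta) - S_{obs}(\theta)$.

The result in Theorem 2 can be used to derive a variance estimator for $\hat\eta^{*}$ that is a solution to (25). The crucial part is to estimate the variance of the linearized term (28). Note that we can write

$$Var\{\tilde U^{*}(\eta_0, \theta_0)\} = Var\{\tilde U(\eta_0, \theta_0)\} + Var\{\tilde U^{*}(\eta_0, \theta_0) - \tilde U(\eta_0, \theta_0)\}, \tag{30}$$

where $\tilde U(\eta, \theta) = p\lim_{M \to \infty} \tilde U^{*}(\eta, \theta)$.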

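If we write

$$\tilde U(\eta, \theta) = \bar U(\eta, \theta) + K' \bar S(\theta) = \sum_{i=1}^{n} \{\bar u_i(\eta, \theta) + K' \bar s_i(\theta)\} \equiv \sum_{i=1}^{n} \tilde u_i,$$

a plug-in estimator of $Var\{\tilde U(\eta_0, \theta_0)\}$ is

$$\frac{n}{n-1} \sum_{i=1}^{n} (\hat u_i - \bar{\hat u})(\hat u_i - \bar{\hat u})',$$

where $\hat u_i = \bar u_i^{*}(\hat\eta^{*}, \hat\theta^{*}) + \hat K' \bar s_i^{*}(\hat\theta^{*})$ and $\bar{\hat u} = n^{-1} \sum_{i=1}^{n} \hat u_i$. The terms $\bar u_i^{*}(\hat\eta^{*}, \hat\theta^{*})$ and $\bar s_i^{*}(\hat\theta^{*})$ are easily computed from the fractional imputation with fractional weights. To estimate the second term of (30), write

$$\tilde U^{*}(\eta, \theta) = \frac{1}{M} \sum_{j=1}^{M} \tilde U_j^{*}(\eta, \theta),$$

where $\tilde U_j^{*}(\eta, \theta) = M \sum_{i=1}^{n} w_{ij}^{*}(\theta) \{u_{ij}^{*}(\eta) + K' s_{ij}^{*}(\theta)\}$. The second term in (30) can be consistently estimated by

$$\frac{1}{M} \frac{1}{M-1} \sum_{j=1}^{M} \{\tilde U_j^{*}(\hat\eta^{*}, \hat\theta^{*}) - \tilde U^{*}(\hat\eta^{*}, \hat\theta^{*})\}^{\otimes 2}.$$

To estimate the $K$ term in (29), we need to estimate the two terms in (29) separately. The first term, $I_{obs}(\theta)$, can be computed using (21), the estimated observed information based on the Louis formula. Now, to estimate the second term in $K$, we use

$$E\{U(\eta, \theta)\, S_{mis}(\theta)' \mid Y_{obs}, \hat\theta\} = E\{U(\eta, \theta)\, S_n(\theta)' \mid Y_{obs}, \hat\theta\} - \bar U(\eta, \theta)\, \bar S(\theta)'.$$

The first expectation can be estimated by the fractional imputation. That is, we can estimate $E\{U(\eta, \theta) S_n(\theta)' \mid Y_{obs}, \hat\theta\}$ by $\sum_{i} \sum_{j} w_{ij}^{*}\, u_{ij}^{*}(\hat\eta^{*}, \hat\theta^{*})\, s_{ij}^{*}(\hat\theta^{*})'$ with $u_{ij}^{*}(\eta, \theta) = u(\eta, \theta; y_{i,obs}, y_{i,mis}^{*(j)})$ and $s_{ij}^{*}(\theta) = s(\theta; y_{i,obs}, y_{i,mis}^{*(j)})$.

5. Calibration

The proposed estimation method can be viewed as a method of implementing a MCEM algorithm using importance sampling. The MCEM method is subject to sampling error when approximating the conditional expectation by a summation. In general, the size $M$ of the Monte Carlo sample needs to be very large for satisfactory approximation. For moderate size $M$, there are two situations when the approximation is accurate. The first situation is when there are only a finite number of possible values for $y_{i,mis}$. In this case, we take the possible values as the imputed values and compute the conditional probability of $y_{i,mis}$ by the following Bayes formula:

$$p(y_{i,mis}^{*(j)} \mid y_{i,obs}, \hat\theta) = \frac{f(y_{i,obs}, y_{i,mis}^{*(j)}; \hat\theta)}{\sum_{k=1}^{M_i} f(y_{i,obs}, y_{i,mis}^{*(k)}; \hat\theta)},$$

where $M_i$ is the number of possible values of $y_{i,mis}$ and $\hat\theta$ is the MLE of $\theta$. The conditional expectation in (6) can be written

$$E\{s_i(\theta) \mid y_{i,obs}, \hat\theta_{(t)}\} = \sum_{j=1}^{M_i} s_{ij}^{*}(\theta)\, p(y_{i,mis}^{*(j)} \mid y_{i,obs}, \hat\theta_{(t)}). \tag{31}$$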

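Here, the estimated probability $p(y_{i,mis}^{*(j)} \mid y_{i,obs}, \hat\theta_{(t)})$ takes the role of the fractional weight. Ibrahim (1990) proposed using (31) in the E-step of the EM algorithm for discrete data.

In the second situation, the approximation is exact when the distribution belongs to the exponential family of the form

$$f(y; \theta) = \exp\{t(y)'\theta + \phi(\theta) + A(y)\}. \tag{32}$$

Under the model (32), the score equation (1) under complete response is equal to

$$\sum_{i=1}^{n} \left\{ t(y_i) + \frac{\partial \phi(\theta)}{\partial\theta} \right\} = 0$$

and the mean score equation (4) can be written

$$\sum_{i=1}^{n} \left[ E\{t(y_i) \mid y_{i,obs}, \theta\} + \frac{\partial \phi(\theta)}{\partial\theta} \right] = 0.$$

Thus, the integration problem in (6) reduces to the problem of computing $E\{t(y_i) \mid y_{i,obs}, \theta\}$, which is often a known function of $y_{i,obs}$ and $\theta$. In this case, the implementation of the EM algorithm simplifies. Define

$$g(y_{i,obs}, \theta) = E\{t(y_i) \mid y_{i,obs}, \theta\}. \tag{33}$$

Recall that, in the fractional imputation approach, we can express the conditional expectation by a weighted summation

$$E\{t(y_i) \mid y_{i,obs}, \hat\theta_{(t)}\} = \sum_{j=1}^{M} w_{ij(t)}^{*}\, t(y_{i,obs}, y_{i,mis}^{*(j)}), \tag{34}$$

where $y_{i,mis}^{*(j)}$ is the $j$-th imputed value of $y_{i,mis}$ and $w_{ij(t)}^{*}$ is the fractional weight, which is the conditional probability of $y_{i,mis} = y_{i,mis}^{*(j)}$, given $y_{i,obs}$, using the current parameter value $\hat\theta_{(t)}$. Thus, it is proposed that

$$\sum_{j=1}^{M} w_{ij(t)}^{*}\, t(y_{i,obs}, y_{i,mis}^{*(j)}) = g(y_{i,obs}, \hat\theta_{(t)}) \tag{35}$$

be used as a constraint for finding the fractional weights. We can use the regression weighting technique or the empirical likelihood technique to find a solution to (35). Here, $M$ need not be large.

Example 1 Suppose that $y_i = (y_{1i}, y_{2i})$ has a bivariate normal distribution:

$$\begin{pmatrix} y_{1i} \\ y_{2i} \end{pmatrix} \overset{i.i.d.}{\sim} N\left[ \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix} \right].$$

Under the bivariate normal distribution, a set of sufficient statistics for the parameter $\theta = (\mu_1, \mu_2, \sigma_{11}, \sigma_{12}, \sigma_{22})$ is $\left( \sum_i y_{1i}, \sum_i y_{2i}, \sum_i y_{1i}^2, \sum_i y_{1i} y_{2i}, \sum_i y_{2i}^2 \right)$. Therefore, constraint (35) can be satisfied if

$$\sum_{j=1}^{M} w_{ij(t)}^{*} \left( 1,\, y_{1i}^{*(j)},\, y_{1i}^{*(j)2} \right) = \left( 1,\; E(y_{1i} \mid y_{2i}, \hat\theta_{(t)}),\; \{E(y_{1i} \mid y_{2i}, \hat\theta_{(t)})\}^2 + \hat\sigma_{11 \cdot 2 (t)} \right), \quad \text{for } i \in A_{MR},$$

and

$$\sum_{j=1}^{M} w_{ij(t)}^{*} \left( 1,\, y_{2i}^{*(j)},\, y_{2i}^{*(j)2} \right) = \left( 1,\; E(y_{2i} \mid y_{1i}, \hat\theta_{(t)}),\; \{E(y_{2i} \mid y_{1i}, \hat\theta_{(t)})\}^2 + \hat\sigma_{22 \cdot 1 (t)} \right), \quad \text{for } i \in A_{RM},$$

where $A_{MR}$ is the set of units with $y_{1i}$ missing and $y_{2i}$ observed, $A_{RM}$ is the set of units with $y_{1i}$ observed and $y_{2i}$ missing, $\sigma_{11 \cdot 2} = \sigma_{11} - \sigma_{12}^2/\sigma_{22}$, $\sigma_{22 \cdot 1} = \sigma_{22} - \sigma_{12}^2/\sigma_{11}$,

$$E(y_{1i} \mid y_{2i}, \hat\theta) = \hat\mu_1 + \frac{\hat\sigma_{12}}{\hat\sigma_{22}} (y_{2i} - \hat\mu_2), \qquad E(y_{2i} \mid y_{1i}, \hat\theta) = \hat\mu_2 + \frac{\hat\sigma_{12}}{\hat\sigma_{11}} (y_{1i} - \hat\mu_1).$$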
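The constraints of Example 1 can be met by a regression weighting adjustment of initial fractional weights. The sketch below shows one such adjustment for a single unit; the function name `calibrate` and the particular regression form $w_j = w_{0j}\{1 + \lambda'(x_j - \bar x)\}$ with $x_j = (y_j^{*}, y_j^{*2})$ are assumptions made for illustration, since the text names the regression weighting technique without giving a formula.

```python
def calibrate(values, w0, mean, var):
    """Regression-adjust weights w0 (which must sum to 1) so that the
    imputed values reproduce a target conditional mean and variance:
    sum(w) = 1, sum(w*y) = mean, sum(w*y^2) = mean^2 + var."""
    x = [(y, y * y) for y in values]
    # Weighted means of (y, y^2) under the initial weights.
    xbar = [sum(w * xi[k] for w, xi in zip(w0, x)) for k in range(2)]
    target = (mean, mean * mean + var)
    # Weighted covariance matrix A of (y, y^2) under the initial weights.
    a11 = sum(w * (xi[0] - xbar[0]) ** 2 for w, xi in zip(w0, x))
    a12 = sum(w * (xi[0] - xbar[0]) * (xi[1] - xbar[1]) for w, xi in zip(w0, x))
    a22 = sum(w * (xi[1] - xbar[1]) ** 2 for w, xi in zip(w0, x))
    det = a11 * a22 - a12 * a12
    d0, d1 = target[0] - xbar[0], target[1] - xbar[1]
    # lam = A^{-1} (target - xbar), written out for the 2 x 2 case.
    lam0 = (a22 * d0 - a12 * d1) / det
    lam1 = (-a12 * d0 + a11 * d1) / det
    return [w * (1.0 + lam0 * (xi[0] - xbar[0]) + lam1 * (xi[1] - xbar[1]))
            for w, xi in zip(w0, x)]
```

For a unit in $A_{MR}$, `mean` would be $E(y_{1i} \mid y_{2i}, \hat\theta_{(t)})$ and `var` would be $\hat\sigma_{11 \cdot 2(t)}$. The adjusted weights satisfy the three constraints exactly whenever the $2 \times 2$ matrix is nonsingular, but nothing in this form keeps them positive, which is one reason an empirical likelihood solution may be preferred in some cases.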

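In practice, instead of (35), the fractional weights are computed from

$$\sum_{i \in A_c} \sum_{j=1}^{M} w_{ij(t)}^{*}\, t(y_{i,obs}, y_{i,mis}^{*(j)}) = \sum_{i \in A_c} g(y_{i,obs}, \hat\theta_{(t)}), \tag{36}$$

where $A_c$ is the set of sample indices in a cell $c$. Imposing fractional weighting constraints in each cell rather than for each unit reduces the chance of extreme weights.

Variance estimation with fractionally imputed data can be performed using linearization or replication. The plug-in method discussed in Section 3 is essentially the linearization method. Under complete response, let $w_i^{[k]}$ be the $k$-th replication weight for unit $i$. Assume that the replication variance estimator

$$\hat V_n = \sum_{k=1}^{L} c_k \left( \hat\theta_n^{[k]} - \hat\theta_n \right)^2, \tag{37}$$

where $\hat\theta_n = \sum_{i=1}^{n} w_i y_i$ and $\hat\theta_n^{[k]} = \sum_{i=1}^{n} w_i^{[k]} y_i$, is consistent for the variance of $\hat\theta_n$. For replication with the calibration fractional imputation method, we consider the following steps for creating replicated fractional weights. Here, we assume that the calibration fractional weights are computed from (36).

[Step 1] Compute $\hat\theta^{[k]}$, the $k$-th replicate of $\hat\theta$, using fractional weights.

[Step 2] Using the $\hat\theta^{[k]}$ computed from Step 1, compute the replicated fractional weights by using the regression weighting technique, subject to

$$\sum_{i \in A_c} \sum_{j=1}^{M} w_{ij}^{*[k]}\, t(y_{i,obs}, y_{i,mis}^{*(j)}) = \sum_{i \in A_c} g(y_{i,obs}, \hat\theta^{[k]}). \tag{38}$$

Equation (38) is the calibration equation for the replicated fractional weights. In general, Step 1 can be computationally problematic since $\hat\theta^{[k]}$ is computed from the iterative algorithm (10) for each replication. Thus, we consider an approximation for $\hat\theta^{[k]}$ using Taylor linearization. Let

$$\bar S^{[k]}(\theta) = \sum_{i=1}^{n} w_i^{[k]}\, \bar s_i(\theta),$$

where $\bar s_i(\theta) = E\{s_i(\theta) \mid y_{i,obs}, \theta\}$. Using (15) and (21), the approximation formula can be implemented as

$$\hat\theta^{[k]} = \hat\theta + \left[ \hat I_{obs}^{[k]}(\hat\theta) \right]^{-1} \bar S^{[k]}(\hat\theta), \tag{39}$$

where

$$\hat I_{obs}^{[k]}(\theta) = -\sum_{i=1}^{n} w_i^{[k]} \sum_{j=1}^{M} w_{ij}^{*}\, \frac{\partial}{\partial\theta'} s(\theta; y_{i,obs}, y_{i,mis}^{*(j)}) + \sum_{i=1}^{n} w_i^{[k]}\, \bar s_i^{*}(\theta)^{\otimes 2} - \sum_{i=1}^{n} w_i^{[k]} \sum_{j=1}^{M} w_{ij}^{*}\, s_{ij}^{*}(\theta)^{\otimes 2} \tag{40}$$

and

$$\bar S^{[k]}(\theta) = \sum_{i=1}^{n} w_i^{[k]} \sum_{j=1}^{M} w_{ij}^{*}\, s_{ij}^{*}(\theta).$$

6. Simulation Study

In a limited simulation study, we generated $B = 5{,}000$ Monte Carlo samples of size $n = 200$ from a bivariate normal distribution with $\mu_1 = 0$, $\mu_2 = 2$, $\sigma_{11} = 1$, $\sigma_{12} = 1$, and $\sigma_{22} = 2$.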
The probability of both responding is 0.42, the probability of only $y_1$ responding is 0.18, and the probability of only $y_2$ responding is [value missing in the source transcription]. We considered the following seven parameters:

1. Five parameters in the bivariate normal distribution: $\mu_1, \mu_2, \sigma_{11}, \sigma_{12}, \sigma_{22}$.
2. Proportion of $y_1$ less than 0.8.
3. Domain mean, where the probability of being in the domain is 0.4. The probability of being in the domain does not depend on $y_1$ or $y_2$.

For each parameter, we have computed four estimators:

1. The MLE using the EM algorithm.
2. The fractional imputation estimator proposed in Section 2 with $M = 100$ and $M =$ [value missing in the source transcription].
3. The calibration fractional imputation estimator proposed in Section 5 with $M = 10$ using the regression weighting method.
4. Multiple imputation (MI) with $M = 10$ imputations.

In fractional imputation, imputed values are generated by a systematic sampling method described in Appendix B, starting from 1,000 initial values. The basic idea is to generate a larger set of initial imputed values and then use a version of systematic sampling to get the final $M$ imputed values. In the calibration fractional imputation method, the regression fractional weights are computed by (35). In multiple imputation, the imputed values are generated from the posterior predictive distribution iteratively using Gibbs sampling.

For variance estimation, we considered the FI estimator without calibration, the calibration FI estimator, and multiple imputation. For variance estimation of the fractional imputation, we used the plug-in estimator discussed in Section 3 and Section 4. For variance estimation of the calibration FI estimator, we used the one-step jackknife variance estimator discussed in Section 5. For variance estimation of the multiple imputation, we used the variance formula of Rubin (1987).

Table 1 presents the Monte Carlo means and variances of the four estimators. Table 2 presents the Monte Carlo relative biases and t-statistics for the variance estimators. The t-statistic is the statistic for testing zero bias in the variance estimator. For point estimation, the calibration FI estimator and the EM method give the same values for the parameters specified in the model.
The uncalibrated fractional imputation estimator shows fairly good efficiency for many parameters, which suggests that the systematic sampling method used in the fractional imputation is already quite efficient. Multiple imputation shows less efficiency than the FI estimators for all parameters. For estimation of the proportion and the domain mean, it is possible for the FI estimator with $M = 100$ to be more efficient than the calibration FI estimator with $M = 10$ because these parameters are not directly considered in the calibration step. The differences in efficiencies for these two parameters are less than one percent. For variance estimation of the FI estimators, both linearization and replication methods provide consistent estimates for the variance of the parameter estimates. Variance estimation for domain estimation is biased under multiple imputation, as was identified by Kim and Fuller (2004).

REFERENCES

Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977), Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Ser. B, 39.

Fay, R. E. (1992), When are inferences from multiple imputation valid? In Proceedings of the Survey Research Methods Section, Washington, DC: American Statistical Association.

Fisher, R. A. (1925), Theory of statistical estimation, Proceedings of the Cambridge Philosophical Society, 22.

Gelman, A., Meng, X.-L., and Stern, H. (1996), Posterior predictive assessment of model fitness via realized discrepancies (with discussion), Statistica Sinica, 6.

Ibrahim, J. G. (1990), Incomplete data in generalized linear models, Journal of the American Statistical Association, 85.

Kim, J. K., Brick, J. M., Fuller, W. A., and Kalton, G. (2006), On the bias of the multiple imputation variance estimator in survey sampling, Journal of the Royal Statistical Society, Ser. B, 68.

Kim, J. K. and Fuller, W. A. (2004), Fractional hot deck imputation, Biometrika, 91.

Louis, T. A. (1982), Finding the observed information matrix when using the EM algorithm, Journal of the Royal Statistical Society, Ser. B, 44.

Robins, J. M. and Wang, N. (2000), Inference for imputation estimators, Biometrika, 87.

Rubin, D. B. (1976), Inference and missing data, Biometrika, 63.

Rubin, D. B. (1987), Multiple imputation for nonresponse in surveys, New York: Wiley.

Wang, C.-Y. and Pepe, M. S. (2000), Expected estimating equations to accommodate covariate measurement error, Journal of the Royal Statistical Society, Ser. B, 62.

Wang, N. and Robins, J. M. (1998), Large-sample theory for parametric multiple imputation procedures, Biometrika, 85.

Wei, G. C. G. and Tanner, M. A. (1990), A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithm, Journal of the American Statistical Association, 85.

Wu, C. F. J. (1983), On the convergence properties of the EM algorithm, The Annals of Statistics, 11.

Table 1: Monte Carlo means and variances of the imputed estimators, based on 5,000 samples. Columns: Parameter, Method, Mean, Variance. Rows for $\mu_1$, $\mu_2$, $\sigma_{11}$, $\sigma_{12}$ and $\sigma_{22}$: EM, FI (two values of $M$), Calibration FI, MI. Rows for Proportion and Domain Mean: FI (two values of $M$), Calibration FI, MI. [Numeric entries and the values of $M$ are missing in the source transcription.]

Table 2: Monte Carlo relative biases and t-statistics of the variance estimators, based on 5,000 samples. Columns: Parameter, Method, Relative Bias (%), t-statistic. Rows for $Var(\hat\mu_1)$, $Var(\hat\mu_2)$, $Var(\hat\sigma_{11})$, $Var(\hat\sigma_{12})$, $Var(\hat\sigma_{22})$, $Var(\hat p)$ and $Var(\hat\mu_d)$: Linearization for FI (two values of $M$), One-step jackknife for calibration FI, MI. [Numeric entries and the values of $M$ are missing in the source transcription.]


More information

Efficient estimation in missing data and survey sampling problems

Efficient estimation in missing data and survey sampling problems Graduate Theses and Dssertatons Iowa State Unversty Capstones, Theses and Dssertatons 2012 Effcent estmaton n mssng data and survey samplng problems Sxa Chen Iowa State Unversty Follow ths and addtonal

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maxmum Lkelhood Estmaton INFO-2301: Quanttatve Reasonng 2 Mchael Paul and Jordan Boyd-Graber MARCH 7, 2017 INFO-2301: Quanttatve Reasonng 2 Paul and Boyd-Graber Maxmum Lkelhood Estmaton 1 of 9 Why MLE?

More information

EM and Structure Learning

EM and Structure Learning EM and Structure Learnng Le Song Machne Learnng II: Advanced Topcs CSE 8803ML, Sprng 2012 Partally observed graphcal models Mxture Models N(μ 1, Σ 1 ) Z X N N(μ 2, Σ 2 ) 2 Gaussan mxture model Consder

More information

The Geometry of Logit and Probit

The Geometry of Logit and Probit The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Conjugacy and the Exponential Family

Conjugacy and the Exponential Family CS281B/Stat241B: Advanced Topcs n Learnng & Decson Makng Conjugacy and the Exponental Famly Lecturer: Mchael I. Jordan Scrbes: Bran Mlch 1 Conjugacy In the prevous lecture, we saw conjugate prors for the

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements CS 750 Machne Learnng Lecture 5 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square CS 750 Machne Learnng Announcements Homework Due on Wednesday before the class Reports: hand n before

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute

More information

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran

More information

Goodness of fit and Wilks theorem

Goodness of fit and Wilks theorem DRAFT 0.0 Glen Cowan 3 June, 2013 Goodness of ft and Wlks theorem Suppose we model data y wth a lkelhood L(µ) that depends on a set of N parameters µ = (µ 1,...,µ N ). Defne the statstc t µ ln L(µ) L(ˆµ),

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

Expectation Maximization Mixture Models HMMs

Expectation Maximization Mixture Models HMMs -755 Machne Learnng for Sgnal Processng Mture Models HMMs Class 9. 2 Sep 200 Learnng Dstrbutons for Data Problem: Gven a collecton of eamples from some data, estmate ts dstrbuton Basc deas of Mamum Lelhood

More information

REPLICATION VARIANCE ESTIMATION UNDER TWO-PHASE SAMPLING IN THE PRESENCE OF NON-RESPONSE

REPLICATION VARIANCE ESTIMATION UNDER TWO-PHASE SAMPLING IN THE PRESENCE OF NON-RESPONSE STATISTICA, anno LXXIV, n. 3, 2014 REPLICATION VARIANCE ESTIMATION UNDER TWO-PHASE SAMPLING IN THE PRESENCE OF NON-RESPONSE Muqaddas Javed 1 Natonal College of Busness Admnstraton and Economcs, Lahore,

More information

SIO 224. m(r) =(ρ(r),k s (r),µ(r))

SIO 224. m(r) =(ρ(r),k s (r),µ(r)) SIO 224 1. A bref look at resoluton analyss Here s some background for the Masters and Gubbns resoluton paper. Global Earth models are usually found teratvely by assumng a startng model and fndng small

More information

Stat 543 Exam 2 Spring 2016

Stat 543 Exam 2 Spring 2016 Stat 543 Exam 2 Sprng 206 I have nether gven nor receved unauthorzed assstance on ths exam. Name Sgned Date Name Prnted Ths Exam conssts of questons. Do at least 0 of the parts of the man exam. I wll score

More information

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com

More information

Population Design in Nonlinear Mixed Effects Multiple Response Models: extension of PFIM and evaluation by simulation with NONMEM and MONOLIX

Population Design in Nonlinear Mixed Effects Multiple Response Models: extension of PFIM and evaluation by simulation with NONMEM and MONOLIX Populaton Desgn n Nonlnear Mxed Effects Multple Response Models: extenson of PFIM and evaluaton by smulaton wth NONMEM and MONOLIX May 4th 007 Carolne Bazzol, Sylve Retout, France Mentré Inserm U738 Unversty

More information

Limited Dependent Variables

Limited Dependent Variables Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages

More information

Joint Statistical Meetings - Biopharmaceutical Section

Joint Statistical Meetings - Biopharmaceutical Section Iteratve Ch-Square Test for Equvalence of Multple Treatment Groups Te-Hua Ng*, U.S. Food and Drug Admnstraton 1401 Rockvlle Pke, #200S, HFM-217, Rockvlle, MD 20852-1448 Key Words: Equvalence Testng; Actve

More information

Stat 543 Exam 2 Spring 2016

Stat 543 Exam 2 Spring 2016 Stat 543 Exam 2 Sprng 2016 I have nether gven nor receved unauthorzed assstance on ths exam. Name Sgned Date Name Prnted Ths Exam conssts of 11 questons. Do at least 10 of the 11 parts of the man exam.

More information

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva Econ 39 - Statstcal Propertes of the OLS estmator Sanjaya DeSlva September, 008 1 Overvew Recall that the true regresson model s Y = β 0 + β 1 X + u (1) Applyng the OLS method to a sample of data, we estmate

More information

Small Area Interval Estimation

Small Area Interval Estimation .. Small Area Interval Estmaton Partha Lahr Jont Program n Survey Methodology Unversty of Maryland, College Park (Based on jont work wth Masayo Yoshmor, Former JPSM Vstng PhD Student and Research Fellow

More information

ISSN: ISO 9001:2008 Certified International Journal of Engineering and Innovative Technology (IJEIT) Volume 3, Issue 1, July 2013

ISSN: ISO 9001:2008 Certified International Journal of Engineering and Innovative Technology (IJEIT) Volume 3, Issue 1, July 2013 ISSN: 2277-375 Constructon of Trend Free Run Orders for Orthogonal rrays Usng Codes bstract: Sometmes when the expermental runs are carred out n a tme order sequence, the response can depend on the run

More information

Lecture 21: Numerical methods for pricing American type derivatives

Lecture 21: Numerical methods for pricing American type derivatives Lecture 21: Numercal methods for prcng Amercan type dervatves Xaoguang Wang STAT 598W Aprl 10th, 2014 (STAT 598W) Lecture 21 1 / 26 Outlne 1 Fnte Dfference Method Explct Method Penalty Method (STAT 598W)

More information

RELIABILITY ASSESSMENT

RELIABILITY ASSESSMENT CHAPTER Rsk Analyss n Engneerng and Economcs RELIABILITY ASSESSMENT A. J. Clark School of Engneerng Department of Cvl and Envronmental Engneerng 4a CHAPMAN HALL/CRC Rsk Analyss for Engneerng Department

More information

A note on regression estimation with unknown population size

A note on regression estimation with unknown population size Statstcs Publcatons Statstcs 6-016 A note on regresson estmaton wth unknown populaton sze Mchael A. Hdroglou Statstcs Canada Jae Kwang Km Iowa State Unversty jkm@astate.edu Chrstan Olver Nambeu Statstcs

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

Estimation of the Mean of Truncated Exponential Distribution

Estimation of the Mean of Truncated Exponential Distribution Journal of Mathematcs and Statstcs 4 (4): 84-88, 008 ISSN 549-644 008 Scence Publcatons Estmaton of the Mean of Truncated Exponental Dstrbuton Fars Muslm Al-Athar Department of Mathematcs, Faculty of Scence,

More information

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

Hidden Markov Models & The Multivariate Gaussian (10/26/04) CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Web-based Supplementary Materials for Inference for the Effect of Treatment. on Survival Probability in Randomized Trials with Noncompliance and

Web-based Supplementary Materials for Inference for the Effect of Treatment. on Survival Probability in Randomized Trials with Noncompliance and Bometrcs 000, 000 000 DOI: 000 000 0000 Web-based Supplementary Materals for Inference for the Effect of Treatment on Survval Probablty n Randomzed Trals wth Noncomplance and Admnstratve Censorng by Ne,

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

Explaining the Stein Paradox

Explaining the Stein Paradox Explanng the Sten Paradox Kwong Hu Yung 1999/06/10 Abstract Ths report offers several ratonale for the Sten paradox. Sectons 1 and defnes the multvarate normal mean estmaton problem and ntroduces Sten

More information

Parameters Estimation of the Modified Weibull Distribution Based on Type I Censored Samples

Parameters Estimation of the Modified Weibull Distribution Based on Type I Censored Samples Appled Mathematcal Scences, Vol. 5, 011, no. 59, 899-917 Parameters Estmaton of the Modfed Webull Dstrbuton Based on Type I Censored Samples Soufane Gasm École Supereure des Scences et Technques de Tuns

More information

AN OPTIMAL ESTIMATING EQUATION WITH UNSPECIFIED VARIANCES

AN OPTIMAL ESTIMATING EQUATION WITH UNSPECIFIED VARIANCES Sankhyā : The Indan Journal of Statstcs 2002, Volume 64, Seres A, Pt. 1, pp 95-108 AN OPTIMAL ESTIMATING EQUATION WITH UNSPECIFIED VARIANCES By ANUP DEWANJI Indan Statstcal Insttute, Kolkata LUE PING ZHAO

More information

Testing for seasonal unit roots in heterogeneous panels

Testing for seasonal unit roots in heterogeneous panels Testng for seasonal unt roots n heterogeneous panels Jesus Otero * Facultad de Economía Unversdad del Rosaro, Colomba Jeremy Smth Department of Economcs Unversty of arwck Monca Gulett Aston Busness School

More information

Appendix B. The Finite Difference Scheme

Appendix B. The Finite Difference Scheme 140 APPENDIXES Appendx B. The Fnte Dfference Scheme In ths appendx we present numercal technques whch are used to approxmate solutons of system 3.1 3.3. A comprehensve treatment of theoretcal and mplementaton

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

Note 10. Modeling and Simulation of Dynamic Systems

Note 10. Modeling and Simulation of Dynamic Systems Lecture Notes of ME 475: Introducton to Mechatroncs Note 0 Modelng and Smulaton of Dynamc Systems Department of Mechancal Engneerng, Unversty Of Saskatchewan, 57 Campus Drve, Saskatoon, SK S7N 5A9, Canada

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD

Matrix Approximation via Sampling, Subspace Embedding. 1 Solving Linear Systems Using SVD Matrx Approxmaton va Samplng, Subspace Embeddng Lecturer: Anup Rao Scrbe: Rashth Sharma, Peng Zhang 0/01/016 1 Solvng Lnear Systems Usng SVD Two applcatons of SVD have been covered so far. Today we loo

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

Statistical inference for generalized Pareto distribution based on progressive Type-II censored data with random removals

Statistical inference for generalized Pareto distribution based on progressive Type-II censored data with random removals Internatonal Journal of Scentfc World, 2 1) 2014) 1-9 c Scence Publshng Corporaton www.scencepubco.com/ndex.php/ijsw do: 10.14419/jsw.v21.1780 Research Paper Statstcal nference for generalzed Pareto dstrbuton

More information

Introduction to Generalized Linear Models

Introduction to Generalized Linear Models INTRODUCTION TO STATISTICAL MODELLING TRINITY 00 Introducton to Generalzed Lnear Models I. Motvaton In ths lecture we extend the deas of lnear regresson to the more general dea of a generalzed lnear model

More information

MATH 281A: Homework #6

MATH 281A: Homework #6 MATH 28A: Homework #6 Jongha Ryu Due date: November 8, 206 Problem. (Problem 2..2. Soluton. If X,..., X n Bern(p, then T = X s a complete suffcent statstc. Our target s g(p = p, and the nave guess suggested

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise. Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the

More information

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT Malaysan Journal of Mathematcal Scences 8(S): 37-44 (2014) Specal Issue: Internatonal Conference on Mathematcal Scences and Statstcs 2013 (ICMSS2013) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE

USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE STATISTICA, anno LXXV, n. 4, 015 USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE Manoj K. Chaudhary 1 Department of Statstcs, Banaras Hndu Unversty, Varanas,

More information

Difference Equations

Difference Equations Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1

More information

Lecture 3 Stat102, Spring 2007

Lecture 3 Stat102, Spring 2007 Lecture 3 Stat0, Sprng 007 Chapter 3. 3.: Introducton to regresson analyss Lnear regresson as a descrptve technque The least-squares equatons Chapter 3.3 Samplng dstrbuton of b 0, b. Contnued n net lecture

More information

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal

Inner Product. Euclidean Space. Orthonormal Basis. Orthogonal Inner Product Defnton 1 () A Eucldean space s a fnte-dmensonal vector space over the reals R, wth an nner product,. Defnton 2 (Inner Product) An nner product, on a real vector space X s a symmetrc, blnear,

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information

4DVAR, according to the name, is a four-dimensional variational method.

4DVAR, according to the name, is a four-dimensional variational method. 4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The

More information

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction ECONOMICS 35* -- NOTE 7 ECON 35* -- NOTE 7 Interval Estmaton n the Classcal Normal Lnear Regresson Model Ths note outlnes the basc elements of nterval estmaton n the Classcal Normal Lnear Regresson Model

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2017 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2017 Instructor: Victor Aguirregabiria ECOOMETRICS II ECO 40S Unversty of Toronto Department of Economcs Wnter 07 Instructor: Vctor Agurregabra SOLUTIO TO FIAL EXAM Tuesday, Aprl 8, 07 From :00pm-5:00pm 3 hours ISTRUCTIOS: - Ths s a closed-book

More information

ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE

ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE P a g e ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE Darmud O Drscoll ¹, Donald E. Ramrez ² ¹ Head of Department of Mathematcs and Computer Studes

More information

Solutions Homework 4 March 5, 2018

Solutions Homework 4 March 5, 2018 1 Solutons Homework 4 March 5, 018 Soluton to Exercse 5.1.8: Let a IR be a translaton and c > 0 be a re-scalng. ˆb1 (cx + a) cx n + a (cx 1 + a) c x n x 1 cˆb 1 (x), whch shows ˆb 1 s locaton nvarant and

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Primer on High-Order Moment Estimators

Primer on High-Order Moment Estimators Prmer on Hgh-Order Moment Estmators Ton M. Whted July 2007 The Errors-n-Varables Model We wll start wth the classcal EIV for one msmeasured regressor. The general case s n Erckson and Whted Econometrc

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition) Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes

More information

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper Games of Threats Elon Kohlberg Abraham Neyman Workng Paper 18-023 Games of Threats Elon Kohlberg Harvard Busness School Abraham Neyman The Hebrew Unversty of Jerusalem Workng Paper 18-023 Copyrght 2017

More information

A Comparative Study for Estimation Parameters in Panel Data Model

A Comparative Study for Estimation Parameters in Panel Data Model A Comparatve Study for Estmaton Parameters n Panel Data Model Ahmed H. Youssef and Mohamed R. Abonazel hs paper examnes the panel data models when the regresson coeffcents are fxed random and mxed and

More information

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications Durban Watson for Testng the Lack-of-Ft of Polynomal Regresson Models wthout Replcatons Ruba A. Alyaf, Maha A. Omar, Abdullah A. Al-Shha ralyaf@ksu.edu.sa, maomar@ksu.edu.sa, aalshha@ksu.edu.sa Department

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information

An adaptive SMC scheme for ABC. Bayesian Computation (ABC)

An adaptive SMC scheme for ABC. Bayesian Computation (ABC) An adaptve SMC scheme for Approxmate Bayesan Computaton (ABC) (ont work wth Prof. Mke West) Department of Statstcal Scence - Duke Unversty Aprl/2011 Approxmate Bayesan Computaton (ABC) Problems n whch

More information

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of

More information

4.3 Poisson Regression

4.3 Poisson Regression of teratvely reweghted least squares regressons (the IRLS algorthm). We do wthout gvng further detals, but nstead focus on the practcal applcaton. > glm(survval~log(weght)+age, famly="bnomal", data=baby)

More information