Chapter 4: Imputation


Jae-Kwang Kim, Department of Statistics, Iowa State University

Outline
1 Introduction
2 Basic theory for imputation
3 Variance estimation after imputation
4 Replication variance estimation
5 Multiple imputation
6 Fractional imputation

Introduction

Basic setup
- $Y$: a vector of random variables with distribution $F(y; \theta)$.
- $y_1, \ldots, y_n$ are $n$ independent realizations of $Y$.
- We are interested in estimating $\psi$, which is implicitly defined by $E\{U(\psi; Y)\} = 0$.
- Under complete observation, a consistent estimator $\hat{\psi}_n$ of $\psi$ can be obtained by solving the estimating equation $\sum_{i=1}^n U(\psi; y_i) = 0$.
- A special case of an estimating function is the score function; in that case, $\psi = \theta$.
- The sandwich variance estimator is often used to estimate the variance of $\hat{\psi}_n$:
$$\hat{V}(\hat{\psi}_n) = \hat{\tau}_u^{-1}\,\hat{V}(U)\,\hat{\tau}_u^{-1},$$
where $\tau_u = E\{\partial U(\psi; y)/\partial \psi^\top\}$.

1. Introduction

Missing data setup
- Suppose that $y_i$ is not fully observed.
- $y_i = (y_{obs,i}, y_{mis,i})$: (observed, missing) part of $y_i$.
- $\delta_i$: response indicator function for $y_i$.
- Under the existence of missing data, we can use the following estimator: $\hat{\psi}$, the solution to
$$\sum_{i=1}^n E\{U(\psi; y_i) \mid y_{obs,i}, \delta_i\} = 0. \quad (1)$$
- The equation in (1) is often called the expected estimating equation.

1. Introduction

Motivation (for imputation)
Computing the conditional expectation in (1) can be a challenging problem.
1 The conditional expectation depends on unknown parameter values. That is,
$$E\{U(\psi; y_i) \mid y_{obs,i}, \delta_i\} = E\{U(\psi; y_i) \mid y_{obs,i}, \delta_i; \theta, \phi\},$$
where $\theta$ is the parameter in $f(y; \theta)$ and $\phi$ is the parameter in $p(\delta \mid y; \phi)$.
2 Even if we know $\eta = (\theta, \phi)$, computing the conditional expectation can be numerically difficult.

1. Introduction

Imputation
Imputation: a Monte Carlo approximation of the conditional expectation (given the observed data),
$$E\{U(\psi; y_i) \mid y_{obs,i}, \delta_i\} \approx \frac{1}{m}\sum_{j=1}^m U\left(\psi; y_{obs,i}, y_{mis,i}^{(j)}\right).$$
1 Bayesian approach: generate $y_{mis,i}^*$ from
$$f(y_{mis,i} \mid y_{obs}, \delta) = \int f(y_{mis,i} \mid y_{obs}, \delta; \eta)\, p(\eta \mid y_{obs}, \delta)\, d\eta.$$
2 Frequentist approach: generate $y_{mis,i}^*$ from $f(y_{mis,i} \mid y_{obs,i}, \delta; \hat{\eta})$, where $\hat{\eta}$ is a consistent estimator.

Example 4.1

Basic setup
- Let $(x, y)$ be a vector of bivariate random variables.
- Assume that $x_i$ is always observed, $y_i$ is subject to missingness in the sample, and the probability of missingness does not depend on the value of $y_i$.
- In this case, an imputed estimator of $\theta = E(Y)$ based on single imputation can be computed as
$$\hat{\theta}_I = \frac{1}{n}\sum_{i=1}^n \{\delta_i y_i + (1-\delta_i) y_i^*\}, \quad (2)$$
where $y_i^*$ is an imputed value for $y_i$.
- Imputation model: $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma_e^2)$ for some $(\beta_0, \beta_1, \sigma_e^2)$.

Example 4.1 (Cont'd)
- Deterministic imputation: use $y_i^* = \hat{\beta}_0 + \hat{\beta}_1 x_i$, where
$$(\hat{\beta}_0, \hat{\beta}_1) = (\bar{y}_r - \hat{\beta}_1 \bar{x}_r,\; S_{xxr}^{-1} S_{xyr}).$$
Note that $E(\hat{\theta}_I - \theta) = 0$ and
$$V(\hat{\theta}_I) \doteq \frac{1}{n}\sigma_y^2 + \left(\frac{1}{r} - \frac{1}{n}\right)\sigma_e^2 = \frac{\sigma_y^2}{r}\left\{1 - \left(1 - \frac{r}{n}\right)\rho^2\right\}.$$
- Stochastic imputation: use $y_i^* = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{e}_i^*$, where $\hat{e}_i^* \sim (0, \hat{\sigma}_e^2)$. The imputed estimator under stochastic imputation satisfies
$$V(\hat{\theta}_I) \doteq \frac{1}{n}\sigma_y^2 + \left(\frac{1}{r} - \frac{1}{n}\right)\sigma_e^2 + \frac{n-r}{n^2}\sigma_e^2,$$
where the third term represents the additional variance due to stochastic imputation.
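The two variance formulas above are easy to check numerically. Below is a minimal simulation sketch (my own illustration, not part of the original slides) assuming an MCAR response mechanism with response rate 0.6; since $r$ is random here, the formulas are evaluated at the average realized $r$, so agreement holds only up to simulation error.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta0, beta1, sig_x, sig_e = 200, 1.0, 2.0, 1.0, 1.0
sig_y2 = beta1**2 * sig_x**2 + sig_e**2   # marginal variance of y

def one_rep():
    x = rng.normal(0, sig_x, n)
    y = beta0 + beta1 * x + rng.normal(0, sig_e, n)
    delta = rng.random(n) < 0.6           # MCAR response indicator
    b1 = np.cov(x[delta], y[delta])[0, 1] / np.var(x[delta], ddof=1)
    b0 = y[delta].mean() - b1 * x[delta].mean()
    yhat = b0 + b1 * x
    e_star = rng.choice(y[delta] - yhat[delta], size=n)  # donated residuals
    th_det = np.mean(np.where(delta, y, yhat))           # deterministic
    th_sto = np.mean(np.where(delta, y, yhat + e_star))  # stochastic
    return th_det, th_sto, delta.sum()

reps = np.array([one_rep() for _ in range(5000)])
r = reps[:, 2].mean()                     # average realized respondent count
print("deterministic: MC var =", reps[:, 0].var(),
      " formula =", sig_y2 / n + (1 / r - 1 / n) * sig_e**2)
print("stochastic:    MC var =", reps[:, 1].var(),
      " formula =", sig_y2 / n + (1 / r - 1 / n) * sig_e**2
      + (n - r) / n**2 * sig_e**2)
```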

Remark
- Deterministic imputation is unbiased for estimating the mean but may not be unbiased for estimating a proportion. For example, if $\theta = \Pr(Y < c) = E\{I(Y < c)\}$, the imputed estimator $\hat{\theta} = n^{-1}\sum_{i=1}^n\{\delta_i I(y_i < c) + (1-\delta_i) I(y_i^* < c)\}$ is unbiased only if $E\{I(Y^* < c)\} = E\{I(Y < c)\}$, which holds only when the marginal distribution of $y^*$ is the same as the marginal distribution of $y$.
- In general, under deterministic imputation, we have $E(y) = E(y^*)$ but $V(y) > V(y^*)$. For regression imputation, $V(y^*) = \rho^2\sigma_y^2 < \sigma_y^2 = V(y)$.
- Imputation increases the variance of the imputed estimator because the imputed values are positively correlated: they share the same estimated regression coefficients.
- Variance estimation is complicated because of this correlation between the imputed values.

2. Basic Theory for Imputation

Lemma 4.1
Let $\hat{\theta}$ be the solution to $\hat{U}(\theta) = 0$, where $\hat{U}(\theta)$ is a function of the complete observations $y_1, \ldots, y_n$ and the parameter $\theta$. Let $\theta_0$ be the solution to $E\{\hat{U}(\theta)\} = 0$. Then, under some regularity conditions,
$$\hat{\theta} - \theta_0 \cong \left[E\{-\dot{U}(\theta_0)\}\right]^{-1}\hat{U}(\theta_0),$$
where $\dot{U}(\theta) = \partial \hat{U}(\theta)/\partial \theta^\top$ and the notation $A_n \cong B_n$ means that $A_n B_n^{-1} = 1 + R_n$ for some $R_n$ that converges to zero in probability.

Remark (about Lemma 4.1)
- The proof is based on Taylor linearization:
$$0 = \hat{U}(\hat{\theta}) = \hat{U}(\theta_0) + \dot{U}(\theta_0)(\hat{\theta} - \theta_0) \cong \hat{U}(\theta_0) + E\{\dot{U}(\theta_0)\}(\hat{\theta} - \theta_0),$$
where the second (approximate) equality follows from $\dot{U}(\theta_0) = E\{\dot{U}(\theta_0)\} + o_p(1)$ and $\hat{\theta} = \theta_0 + o_p(1)$.
- We need to assume that $E\{\dot{U}(\theta_0)\}$ is nonsingular, and we need conditions under which $\hat{\theta} \stackrel{p}{\to} \theta_0$.
- Lemma 4.1 can be used to establish the asymptotic normality of $\hat{\theta}$ (use Slutsky's theorem).

2. Basic Theory for Imputation

Basic setup (for Case 1: $\psi = \eta$)
- $y = (y_1, \ldots, y_n) \sim f(y; \theta)$
- $\delta = (\delta_1, \ldots, \delta_n) \sim P(\delta \mid y; \phi)$
- $y = (y_{obs}, y_{mis})$: (observed, missing) part of $y$.
- $y_{mis}^{(1)}, \ldots, y_{mis}^{(m)}$: $m$ imputed values of $y_{mis}$ generated from
$$f(y_{mis} \mid y_{obs}, \delta; \hat{\eta}_p) = \frac{f(y; \hat{\theta}_p)P(\delta \mid y; \hat{\phi}_p)}{\int f(y; \hat{\theta}_p)P(\delta \mid y; \hat{\phi}_p)\, d\mu(y_{mis})},$$
where $\hat{\eta}_p = (\hat{\theta}_p, \hat{\phi}_p)$ is a preliminary estimator of $\eta = (\theta, \phi)$.
- Using the $m$ imputed values, the imputed score function is computed as
$$\bar{S}_{imp,m}(\eta \mid \hat{\eta}_p) \equiv \frac{1}{m}\sum_{j=1}^m S_{com}\left(\eta; y_{obs}, y_{mis}^{(j)}, \delta\right),$$
where $S_{com}(\eta; y)$ is the score function of $\eta = (\theta, \phi)$ under complete response.

2. Basic Theory for Imputation

Lemma 4.2 (Asymptotic results for $m = \infty$)
Assume that $\hat{\eta}_p$ converges in probability to $\eta$. Let $\hat{\eta}_{I,m}$ be the solution to
$$\frac{1}{m}\sum_{j=1}^m S_{com}\left(\eta; y_{obs}, y_{mis}^{(j)}, \delta\right) = 0,$$
where $y_{mis}^{(1)}, \ldots, y_{mis}^{(m)}$ are the imputed values generated from $f(y_{mis} \mid y_{obs}, \delta; \hat{\eta}_p)$. Then, under some regularity conditions, for $m \to \infty$,
$$\hat{\eta}_{imp,\infty} \cong \hat{\eta}_{MLE} + J_{mis}(\hat{\eta}_p - \hat{\eta}_{MLE}) \quad (3)$$
and
$$V(\hat{\eta}_{imp,\infty}) \doteq I_{obs}^{-1} + J_{mis}\{V(\hat{\eta}_p) - V(\hat{\eta}_{MLE})\}J_{mis}^\top,$$
where $J_{mis} = I_{com}^{-1}I_{mis}$ is the fraction of missing information.

Remark
- Equation (3) means that
$$\hat{\eta}_{imp,\infty} \cong (I - J_{mis})\hat{\eta}_{MLE} + J_{mis}\hat{\eta}_p. \quad (4)$$
That is, $\hat{\eta}_{imp,\infty}$ is a convex combination of $\hat{\eta}_{MLE}$ and $\hat{\eta}_p$.
- Note that $\hat{\eta}_{imp,\infty}$ is a one-step EM update with initial estimate $\hat{\eta}_p$. Let $\hat{\eta}^{(t)}$ be the $t$-th EM update of $\eta$, computed by solving $\bar{S}(\eta \mid \hat{\eta}^{(t-1)}) = 0$ with $\hat{\eta}^{(0)} = \hat{\eta}_p$. Equation (4) implies that
$$\hat{\eta}^{(t)} \cong (I - J_{mis})\hat{\eta}_{MLE} + J_{mis}\hat{\eta}^{(t-1)}.$$
Thus, we can obtain
$$\hat{\eta}^{(t)} \cong \hat{\eta}_{MLE} + (J_{mis})^t(\hat{\eta}^{(0)} - \hat{\eta}_{MLE}),$$
which justifies $\lim_{t\to\infty}\hat{\eta}^{(t)} = \hat{\eta}_{MLE}$.
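The geometric contraction at rate $J_{mis}$ is easy to see numerically. A minimal sketch (my own illustration, assuming the simplest possible model): for $y_i \sim N(\eta, 1)$ with $r$ of $n$ values observed under MCAR, the EM update is $\hat{\eta}^{(t+1)} = \{r\bar{y}_r + (n-r)\hat{\eta}^{(t)}\}/n$, so $J_{mis} = (n-r)/n$ and the error shrinks by exactly that factor at every iteration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 100, 40                      # sample size, number of respondents
y_obs = rng.normal(5.0, 1.0, r)     # observed values
eta_mle = y_obs.mean()              # MLE uses respondents only (MCAR)
J_mis = (n - r) / n                 # fraction of missing information

eta = 0.0                           # poor starting value eta^(0)
for t in range(1, 11):
    eta = (r * y_obs.mean() + (n - r) * eta) / n   # EM update
    # the error equals J_mis^t times the initial error, exactly:
    print(t, eta - eta_mle, J_mis**t * (0.0 - eta_mle))
```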

Proof of Lemma 4.2 (omitted).

Wang and Robins (1998)

Theorem 4.1 (Asymptotic results for $m < \infty$)
Let $\hat{\eta}_p$ be a preliminary $\sqrt{n}$-consistent estimator of $\eta$ with variance $V_p$. Under some regularity conditions, the solution $\hat{\eta}_{imp,m}$ to
$$\bar{S}_m(\eta \mid \hat{\eta}_p) \equiv \frac{1}{m}\sum_{j=1}^m S_{com}\left(\eta; y_{obs}, y_{mis}^{(j)}, \delta\right) = 0$$
has mean $\eta_0$ and asymptotic variance
$$V(\hat{\eta}_{imp,m}) \doteq I_{obs}^{-1} + J_{mis}\left\{V_p - I_{obs}^{-1}\right\}J_{mis}^\top + m^{-1}I_{com}^{-1}I_{mis}I_{com}^{-1}, \quad (5)$$
where $J_{mis} = I_{com}^{-1}I_{mis}$.

2. Basic Theory for Imputation

Remark
- If we use $\hat{\eta}_p = \hat{\eta}_{MLE}$, then the asymptotic variance in (5) is
$$V(\hat{\eta}_{imp,m}) \doteq I_{obs}^{-1} + m^{-1}I_{com}^{-1}I_{mis}I_{com}^{-1}.$$
- In Bayesian imputation (or multiple imputation), the posterior values of $\eta$ are independently generated from $\eta \sim N(\hat{\eta}_{MLE}, I_{obs}^{-1})$, which implies that we can use $V_p = I_{obs}^{-1} + m^{-1}I_{obs}^{-1}$. Thus, the asymptotic variance in (5) for multiple imputation is
$$V(\hat{\eta}_{imp,m}) \doteq I_{obs}^{-1} + m^{-1}J_{mis}I_{obs}^{-1}J_{mis}^\top + m^{-1}I_{com}^{-1}I_{mis}I_{com}^{-1}.$$
The second term is the additional price we pay for generating the posterior values rather than using $\hat{\eta}_{MLE}$ directly.

Remark (about Theorem 4.1)
The variance in (5) has three components. Writing
$$\hat{\eta}_{imp,m} = \hat{\eta}_{MLE} + (\hat{\eta}_{imp,\infty} - \hat{\eta}_{MLE}) + (\hat{\eta}_{imp,m} - \hat{\eta}_{imp,\infty}),$$
we can establish that the three terms are independent and satisfy
$$V(\hat{\eta}_{MLE}) = I_{obs}^{-1},$$
$$V(\hat{\eta}_{imp,\infty} - \hat{\eta}_{MLE}) = J_{mis}\left\{V_p - I_{obs}^{-1}\right\}J_{mis}^\top,$$
$$V(\hat{\eta}_{imp,m} - \hat{\eta}_{imp,\infty}) = m^{-1}I_{com}^{-1}I_{mis}I_{com}^{-1}.$$

Sketched proof of Theorem 4.1
By Lemma 4.1 applied to $\bar{S}(\eta \mid \hat{\eta}_p) = 0$, we have
$$\hat{\eta}_{imp,\infty} - \eta_0 \cong I_{com}^{-1}\bar{S}(\eta \mid \hat{\eta}_p).$$
Similarly, we can write
$$\hat{\eta}_{imp,m} - \eta_0 \cong I_{com}^{-1}\bar{S}_m(\eta \mid \hat{\eta}_p).$$
Thus,
$$V(\hat{\eta}_{imp,m} - \hat{\eta}_{imp,\infty}) \doteq I_{com}^{-1}V\left\{\bar{S}_m(\eta \mid \hat{\eta}_p) - \bar{S}(\eta \mid \hat{\eta}_p)\right\}I_{com}^{-1}$$
and
$$V\left\{\bar{S}_m(\eta \mid \hat{\eta}_p) - \bar{S}(\eta \mid \hat{\eta}_p)\right\} = \frac{1}{m}V\{S_{com}(\eta) \mid y_{obs}, \delta; \hat{\eta}_p\} = \frac{1}{m}V\{S_{mis}(\eta) \mid y_{obs}, \delta; \hat{\eta}_p\} = \frac{1}{m}E\left\{S_{mis}(\eta)^{\otimes 2} \mid y_{obs}, \delta; \hat{\eta}_p\right\} \doteq \frac{1}{m}E\left\{S_{mis}(\eta)^{\otimes 2} \mid y_{obs}, \delta; \eta\right\}.$$

2. Basic Theory for Imputation

Basic setup (for Case 2: $\psi \neq \eta$)
- The parameter $\psi$ is defined by $E\{U(\psi; y)\} = 0$. Under complete response, a consistent estimator of $\psi$ can be obtained by solving $U(\psi; y) = 0$.
- Assume that part of $y$, denoted $y_{mis}$, is not observed and that $m$ imputed values, say $y_{mis}^{(1)}, \ldots, y_{mis}^{(m)}$, are generated from $f(y_{mis} \mid y_{obs}, \delta; \hat{\eta}_{MLE})$, where $\hat{\eta}_{MLE}$ is the MLE of $\eta_0$.
- The imputed estimating function using the $m$ imputed values is
$$\bar{U}_{imp,m}(\psi \mid \hat{\eta}_{MLE}) \equiv \frac{1}{m}\sum_{j=1}^m U(\psi; y^{(j)}), \quad (6)$$
where $y^{(j)} = (y_{obs}, y_{mis}^{(j)})$.
- Let $\hat{\psi}_{imp,m}$ be the solution to $\bar{U}_{imp,m}(\psi \mid \hat{\eta}_{MLE}) = 0$. We are interested in the asymptotic properties of $\hat{\psi}_{imp,m}$.

Theorem 4.2
Suppose that the parameter of interest $\psi_0$ is estimated by solving $U(\psi) = 0$ under complete response. Then, under some regularity conditions, the solution to
$$E\{U(\psi) \mid y_{obs}, \delta; \hat{\eta}_{MLE}\} = 0 \quad (7)$$
has mean $\psi_0$ and asymptotic variance $\tau^{-1}\Omega\tau^{-1\top}$, where
$$\tau = E\{\partial U(\psi_0)/\partial \psi^\top\}, \qquad \Omega = V\left\{\bar{U}(\psi_0 \mid \eta_0) + \kappa S_{obs}(\eta_0)\right\},$$
and $\kappa = E\{U(\psi_0)S_{mis}^\top(\eta_0)\}I_{obs}^{-1}$.

Sketched proof
Writing $\bar{U}(\psi \mid \eta) \equiv E\{U(\psi) \mid y_{obs}, \delta; \eta\}$, the solution to (7) can be treated as the solution to the joint estimating equation
$$U(\psi, \eta) \equiv \begin{bmatrix} U_1(\psi, \eta) \\ U_2(\eta) \end{bmatrix} = 0,$$
where $U_1(\psi, \eta) = \bar{U}(\psi \mid \eta)$ and $U_2(\eta) = S_{obs}(\eta)$. We can apply a Taylor expansion to get
$$\begin{pmatrix} \hat{\psi} \\ \hat{\eta} \end{pmatrix} \cong \begin{pmatrix} \psi_0 \\ \eta_0 \end{pmatrix} - \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}^{-1}\begin{bmatrix} U_1(\psi_0, \eta_0) \\ U_2(\eta_0) \end{bmatrix},$$
where
$$\begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \begin{bmatrix} E(\partial U_1/\partial \psi^\top) & E(\partial U_1/\partial \eta^\top) \\ E(\partial U_2/\partial \psi^\top) & E(\partial U_2/\partial \eta^\top) \end{bmatrix}.$$

Sketched proof (Cont'd)
Note that
$$B_{11} = E\{\partial U(\psi)/\partial \psi^\top\}, \quad B_{21} = 0, \quad B_{12} = E\{U(\psi)S_{mis}^\top(\eta_0)\}, \quad B_{22} = -I_{obs}.$$
Thus,
$$\hat{\psi} \cong \psi_0 - B_{11}^{-1}\left\{U_1(\psi_0, \eta_0) - B_{12}B_{22}^{-1}U_2(\eta_0)\right\} = \psi_0 - B_{11}^{-1}\left\{\bar{U}(\psi_0 \mid \eta_0) + \kappa S_{obs}(\eta_0)\right\},$$
with $\kappa = E\{U(\psi_0)S_{mis}^\top(\eta_0)\}I_{obs}^{-1}$.

Alternative approach
Use Randles' (1982) theorem: an estimator $\hat{\theta}(\hat{\beta})$ is asymptotically equivalent to $\hat{\theta}(\beta)$ if
$$E\left\{\frac{\partial}{\partial \beta}\hat{\theta}(\beta)\right\} = 0.$$
Thus, writing $\hat{\theta}_k(\beta) = \hat{\theta}(\beta) + k^\top S_{obs}(\beta)$, we have:
1 $\hat{\theta}(\hat{\beta}) = \hat{\theta}_k(\hat{\beta})$ for any $k$, because $S_{obs}(\hat{\beta}) = 0$.
2 If we can find $k = k^*$ such that $\hat{\theta}_{k^*}(\beta)$ satisfies Randles' condition, then we can safely ignore the effect of the sampling variability of $\hat{\beta}$ and treat $\hat{\theta}_{k^*}(\hat{\beta})$ as $\hat{\theta}_{k^*}(\beta)$.

Example 4.2
Under the setup of Example 4.1, we are interested in the asymptotic variance of the regression imputation estimator
$$\hat{\theta}_{Id} = \frac{1}{n}\sum_{i=1}^n \left\{\delta_i y_i + (1-\delta_i)(\hat{\beta}_0 + \hat{\beta}_1 x_i)\right\},$$
where $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1)$ is the solution to
$$S_{obs}(\beta) = \frac{1}{\sigma_e^2}\sum_{i=1}^n \delta_i(y_i - \beta_0 - \beta_1 x_i)(1, x_i)^\top = 0.$$
The imputed estimator is the solution to
$$E\{U(\theta) \mid y_{obs}, \delta; \hat{\beta}\} = 0,$$
where $U(\theta) = \sum_{i=1}^n (y_i - \theta)/\sigma_e^2$.

Example 4.2 (Cont'd)
We have
$$E\{U(\theta) \mid y_{obs}, \delta; \beta\} \equiv \bar{U}(\theta \mid \beta) = \sum_{i=1}^n \left\{\delta_i y_i + (1-\delta_i)(\beta_0 + \beta_1 x_i) - \theta\right\}/\sigma_e^2.$$
Thus, using the linearization formula in Theorem 4.2, we have
$$\bar{U}_l(\theta \mid \beta) = \bar{U}(\theta \mid \beta) + (\kappa_0, \kappa_1)S_{obs}(\beta), \quad (8)$$
where
$$(\kappa_0, \kappa_1) = E\{U(\theta)S_{mis}^\top(\beta)\}I_{obs}^{-1}. \quad (9)$$

Example 4.2 (Cont'd)
In this example, we have
$$(\kappa_0, \kappa_1) = E\{(1-\delta_i)(1, x_i)\}\left[E\{\delta_i(1, x_i)^\top(1, x_i)\}\right]^{-1} = \left(-1 + \frac{n}{r}(1 - g\bar{x}_r),\; \frac{n}{r}g\right),$$
where $g = (\bar{x}_n - \bar{x}_r)\big/\left\{\sum_{i=1}^n \delta_i(x_i - \bar{x}_r)^2/r\right\}$. Thus,
$$\bar{U}_l(\theta \mid \beta)\,\sigma_e^2 = \sum_{i=1}^n \left[\delta_i(y_i - \theta) + (1-\delta_i)(\beta_0 + \beta_1 x_i - \theta) + \delta_i(y_i - \beta_0 - \beta_1 x_i)(\kappa_0 + \kappa_1 x_i)\right].$$

Example 4.2 (Cont'd)
Note that the solution to $\bar{U}_l(\theta \mid \beta) = 0$ leads to
$$\hat{\theta}_{Id,l} = \frac{1}{n}\sum_{i=1}^n \left\{\beta_0 + \beta_1 x_i + \delta_i(1 + \kappa_0 + \kappa_1 x_i)(y_i - \beta_0 - \beta_1 x_i)\right\} \equiv \frac{1}{n}\sum_{i=1}^n d_i,$$
where $1 + \kappa_0 + \kappa_1 x_i = (n/r)\{1 + g(x_i - \bar{x}_r)\}$. Thus, $\hat{\theta}_{Id}$ is asymptotically equivalent to $\hat{\theta}_{Id,l}$, the sample mean of the $d_i$, where $d_i$ is the influence function of unit $i$ on $\hat{\theta}_{Id}$. Under a uniform response mechanism, $1 + \kappa_0 + \kappa_1 x_i \doteq n/r$ and the asymptotic variance of $\hat{\theta}_{Id,l}$ equals
$$\frac{1}{n}\beta_1^2\sigma_x^2 + \frac{1}{r}\sigma_e^2 = \frac{1}{n}\sigma_y^2 + \left(\frac{1}{r} - \frac{1}{n}\right)\sigma_e^2.$$

Remark
Let $X_1, \ldots, X_n$ be an IID sample from $f(x; \theta_0)$, $\theta_0 \in \Theta$, and suppose we are interested in estimating $\gamma_0 = \gamma(\theta_0)$, where $\gamma(\cdot): \Theta \to \mathbb{R}^k$. An estimator $\hat{\gamma} = \hat{\gamma}_n$ is called asymptotically linear if there exists a random vector $\psi(X)$ such that
$$\sqrt{n}(\hat{\gamma}_n - \gamma_0) = \frac{1}{\sqrt{n}}\sum_{i=1}^n \psi(X_i) + o_p(1) \quad (10)$$
with $E_{\theta_0}\{\psi(X)\} = 0$ and $E_{\theta_0}\{\psi(X)\psi(X)^\top\}$ finite and nonsingular. Here, $Z_n = o_p(1)$ means that $Z_n$ converges to zero in probability.

Remark
- The function $\psi(x)$ is referred to as an influence function. The term was used by Hampel (JASA, 1974) and is motivated by the fact that, to first order, $\psi(x)$ is the influence of a single observation on the estimator $\hat{\gamma} = \hat{\gamma}(X_1, \ldots, X_n)$.
- The asymptotic properties of an asymptotically linear estimator $\hat{\gamma}_n$ can be summarized by considering only its influence function. Since $\psi(X)$ has zero mean, the CLT tells us that
$$\frac{1}{\sqrt{n}}\sum_{i=1}^n \psi(X_i) \stackrel{L}{\to} N\left[0, E_{\theta_0}\{\psi(X)\psi(X)^\top\}\right]. \quad (11)$$
Thus, combining (10) with (11) and applying Slutsky's theorem, we have
$$\sqrt{n}(\hat{\gamma}_n - \gamma_0) \stackrel{L}{\to} N\left[0, E_{\theta_0}\{\psi(X)\psi(X)^\top\}\right].$$

3 Variance estimation after imputation

Deterministic imputation
- In Example 4.2, the imputed estimator can be written as $\hat{\theta}_{Id}(\hat{\beta})$. Note that we can write the deterministic imputed estimator as
$$\hat{\theta}_{Id} = n^{-1}\sum_{i=1}^n \hat{y}_i(\hat{\beta}),$$
where $\hat{y}_i(\hat{\beta}) = \hat{\beta}_0 + \hat{\beta}_1 x_i$.
- In general, the asymptotic variance of $\hat{\theta}_{Id} = \hat{\theta}(\hat{\beta})$ is different from the asymptotic variance of $\hat{\theta}(\beta)$.

- As in Example 4.2, if we can find $d_i = d_i(\beta)$ such that
$$\hat{\theta}_{Id}(\hat{\beta}) = n^{-1}\sum_{i=1}^n d_i(\hat{\beta}) \cong n^{-1}\sum_{i=1}^n d_i(\beta),$$
then the asymptotic variance of $\hat{\theta}_{Id}$ is equal to the asymptotic variance of $\bar{d}_n = n^{-1}\sum_{i=1}^n d_i(\beta)$.
- Note that if $(x_i, y_i, \delta_i)$ are IID, then $d_i = d(x_i, y_i, \delta_i)$ are also IID. Thus, the variance of $\bar{d}_n = n^{-1}\sum_{i=1}^n d_i$ is unbiasedly estimated by
$$\hat{V}(\bar{d}_n) = \frac{1}{n}\frac{1}{n-1}\sum_{i=1}^n (d_i - \bar{d}_n)^2. \quad (12)$$
- Unfortunately, we cannot compute $\hat{V}(\bar{d}_n)$ in (12) directly, since $d_i = d_i(\beta)$ is a function of unknown parameters. Thus, we use $\hat{d}_i = d_i(\hat{\beta})$ in (12) to get a consistent variance estimator of the imputed estimator; see the sketch below.
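A minimal sketch of this plug-in procedure for the regression imputation of Example 4.2 (my own code; the function name, the MAR setup, and the data layout are assumptions, not from the slides):

```python
import numpy as np

def linearized_var(x, y, delta):
    """Sketch: plug-in linearization variance (12) for the deterministic
    regression imputation estimator of Example 4.2. y holds np.nan where
    delta is False; MAR is assumed."""
    n, r = len(x), delta.sum()
    xr, yr = x[delta], y[delta]
    b1 = np.cov(xr, yr)[0, 1] / np.var(xr, ddof=1)
    b0 = yr.mean() - b1 * xr.mean()
    yhat = b0 + b1 * x
    g = (x.mean() - xr.mean()) / (np.sum((xr - xr.mean()) ** 2) / r)
    w = (n / r) * (1.0 + g * (x - xr.mean()))   # = 1 + k0 + k1 * x_i
    resid = np.where(delta, y - yhat, 0.0)      # zero for nonrespondents
    d = yhat + w * resid                        # influence values d-hat_i
    theta_hat = np.mean(yhat + resid)           # imputed point estimate
    v_hat = np.sum((d - d.mean()) ** 2) / (n * (n - 1))
    return theta_hat, v_hat

rng = np.random.default_rng(2)
x = rng.normal(size=500)
y0 = 1.0 + 2.0 * x + rng.normal(size=500)
delta = rng.random(500) < 0.7
y = np.where(delta, y0, np.nan)                 # mask the nonrespondents
print(linearized_var(x, y, delta))
```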

Stochastic imputation
- Instead of deterministic imputation, suppose that stochastic imputation is used:
$$\hat{\theta}_I = n^{-1}\sum_{i=1}^n \left\{\delta_i y_i + (1-\delta_i)(\hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{e}_i^*)\right\},$$
where the $\hat{e}_i^*$ are the additional noise terms of the stochastic imputation. Often the $\hat{e}_i^*$ are randomly selected from the empirical distribution of the sample residuals among the respondents.
- The variance of the imputed estimator can be decomposed into two parts:
$$V(\hat{\theta}_I) = V(\hat{\theta}_{Id}) + V(\hat{\theta}_I - \hat{\theta}_{Id}), \quad (13)$$
where the first part is the deterministic part and the second is the additional variance due to stochastic imputation. The first part can be estimated by the linearization method discussed above; the second part is called the imputation variance.

- If we require the imputation mechanism to satisfy
$$\sum_{i=1}^n (1-\delta_i)\hat{e}_i^* = 0,$$
then the imputation variance is equal to zero.
- Often the variance of $\hat{\theta}_I - \hat{\theta}_{Id} = n^{-1}\sum_{i=1}^n (1-\delta_i)\hat{e}_i^*$ can be computed under the known imputation mechanism. For example, if the residuals are selected by simple random sampling without replacement from the respondents, then
$$V(\hat{\theta}_I - \hat{\theta}_{Id}) = E\left\{V(\hat{\theta}_I \mid y_{obs}, \delta)\right\} = n^{-2}\,m(1 - m/r)(r-1)^{-1}\sum_{i=1}^n \delta_i\hat{e}_i^2,$$
where $m = n - r$. (Both claims are verified numerically in the sketch below.)
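A small simulation sketch (my own illustration; the residuals are simulated rather than taken from a fitted model, and they are centered so that $\sum_i \delta_i\hat{e}_i = 0$ holds as it would for regression residuals with an intercept):

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 100, 60
m = n - r
e_hat = rng.normal(size=r)            # respondent residuals (illustrative)
e_hat -= e_hat.mean()                 # regression residuals have mean zero

# imputation part of theta_I - theta_Id: n^{-1} * sum of m SRSWOR draws
draws = np.array([rng.choice(e_hat, size=m, replace=False).sum() / n
                  for _ in range(20000)])
formula = m * (1 - m / r) / (n ** 2 * (r - 1)) * np.sum(e_hat ** 2)
print(draws.var(), formula)           # Monte Carlo variance vs formula

# centering the donated residuals makes the imputation variance exactly 0
e_star = rng.choice(e_hat, size=m, replace=False)
print((e_star - e_star.mean()).sum() / n)   # always 0
```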

Imputed estimator for general parameters
We now discuss the general case in which the parameter of interest $\psi$ is estimated by the solution $\hat{\psi}_n$ to
$$\sum_{i=1}^n U(\psi; y_i) = 0 \quad (14)$$
under complete response of $y_1, \ldots, y_n$. Under the existence of missing data, we can use the imputed estimating equation
$$\bar{U}_m(\psi) \equiv m^{-1}\sum_{j=1}^m \sum_{i=1}^n U(\psi; y_i^{(j)}) = 0, \quad (15)$$
where $y_i^{(j)} = (y_{i,obs}, y_{i,mis}^{(j)})$ and the $y_{i,mis}^{(j)}$ are randomly generated from the conditional distribution $h(y_{i,mis} \mid y_{i,obs}, \delta_i; \hat{\eta}_p)$, where $\hat{\eta}_p$ is estimated by solving
$$\hat{U}_p(\eta) \equiv \sum_{i=1}^n U_p(\eta; y_{i,obs}) = 0. \quad (16)$$

To apply the linearization method, we first compute the conditional expectation of $U(\psi; y_i)$ given $(y_{i,obs}, \delta_i)$, evaluated at $\hat{\eta}_p$. That is, compute
$$\bar{U}(\psi \mid \hat{\eta}_p) = \sum_{i=1}^n \bar{U}_i(\psi \mid \hat{\eta}_p) = \sum_{i=1}^n E\{U(\psi; y_i) \mid y_{i,obs}, \delta_i; \hat{\eta}_p\}. \quad (17)$$
Let $\hat{\psi}_R$ be the solution to $\bar{U}(\psi \mid \hat{\eta}_p) = 0$. Using the linearization technique, we have
$$\bar{U}(\psi \mid \hat{\eta}_p) \cong \bar{U}(\psi \mid \eta_0) + E\left\{\frac{\partial}{\partial \eta^\top}\bar{U}(\psi \mid \eta_0)\right\}(\hat{\eta}_p - \eta_0) \quad (18)$$
and
$$0 = \hat{U}_p(\hat{\eta}_p) \cong \hat{U}_p(\eta_0) + E\left\{\frac{\partial}{\partial \eta^\top}\hat{U}_p(\eta_0)\right\}(\hat{\eta}_p - \eta_0). \quad (19)$$

Thus, combining (18) and (19), we have
$$\bar{U}(\psi \mid \hat{\eta}_p) \cong \bar{U}(\psi \mid \eta_0) + \kappa(\psi)\hat{U}_p(\eta_0), \quad (20)$$
where
$$\kappa(\psi) = -E\left\{\frac{\partial}{\partial \eta^\top}\bar{U}(\psi \mid \eta_0)\right\}\left[E\left\{\frac{\partial}{\partial \eta^\top}\hat{U}_p(\eta_0)\right\}\right]^{-1}.$$
Thus, writing
$$\bar{U}_l(\psi \mid \eta_0) = \sum_{i=1}^n \left\{\bar{U}_i(\psi \mid \eta_0) + \kappa(\psi)U_p(\eta_0; y_{i,obs})\right\} \equiv \sum_{i=1}^n q_i(\psi \mid \eta_0),$$
the variance of $\bar{U}(\psi \mid \hat{\eta}_p)$ is asymptotically equal to the variance of $\bar{U}_l(\psi \mid \eta_0)$.

Thus, the sandwich-type variance estimator for $\hat{\psi}_R$ is
$$\hat{V}(\hat{\psi}_R) = \hat{\tau}_q^{-1}\hat{\Omega}_q\hat{\tau}_q^{-1\top}, \quad (21)$$
where
$$\hat{\tau}_q = n^{-1}\sum_{i=1}^n \dot{q}_i(\hat{\psi}_R \mid \hat{\eta}_p), \qquad \hat{\Omega}_q = n^{-1}(n-1)^{-1}\sum_{i=1}^n (\hat{q}_i - \bar{q}_n)^{\otimes 2},$$
$\dot{q}_i(\psi \mid \eta) = \partial q_i(\psi \mid \eta)/\partial \psi^\top$, $\bar{q}_n = n^{-1}\sum_{i=1}^n \hat{q}_i$, and $\hat{q}_i = q_i(\hat{\psi}_R \mid \hat{\eta}_p)$. Note that
$$\hat{\tau}_q = n^{-1}\sum_{i=1}^n \dot{q}_i(\hat{\psi}_R \mid \hat{\eta}_p) = n^{-1}\sum_{i=1}^n E\left\{\dot{U}(\hat{\psi}_R; y_i) \mid y_{i,obs}, \delta_i; \hat{\eta}_p\right\}$$
because $\hat{\eta}_p$ is the solution to (16).

Example 4.3
- Assume that the original sample is decomposed into $G$ disjoint groups (often called imputation cells) and that the sample observations are IID within the same cell. That is,
$$y_i \mid i \in S_g \stackrel{i.i.d.}{\sim} (\mu_g, \sigma_g^2), \quad (22)$$
where $S_g$ is the set of sample indices in cell $g$. Assume there are $n_g$ sample elements in cell $g$, of which $r_g$ are observed.
- For deterministic imputation, let $\hat{\mu}_g = r_g^{-1}\sum_{i\in S_g}\delta_i y_i$ be the $g$-th cell mean of $y$ among the respondents. The (deterministically) imputed estimator of $\theta = E(Y)$ is, under MAR,
$$\hat{\theta}_{Id} = n^{-1}\sum_{g=1}^G \sum_{i\in S_g}\{\delta_i y_i + (1-\delta_i)\hat{\mu}_g\} = n^{-1}\sum_{g=1}^G n_g\hat{\mu}_g. \quad (23)$$

Example 4.3 (Cont'd)
Using the linearization technique in (20), the imputed estimator can be expressed as
$$\hat{\theta}_{Id} \cong n^{-1}\sum_{g=1}^G \sum_{i\in S_g}\left\{\mu_g + \frac{n_g}{r_g}\delta_i(y_i - \mu_g)\right\}, \quad (24)$$
and the plug-in variance estimator can be expressed as
$$\hat{V}(\hat{\theta}_{Id}) = \frac{1}{n}\frac{1}{n-1}\sum_{i=1}^n (\hat{d}_i - \bar{d}_n)^2, \quad (25)$$
where $\hat{d}_i = \hat{\mu}_g + (n_g/r_g)\delta_i(y_i - \hat{\mu}_g)$ for $i \in S_g$ and $\bar{d}_n = n^{-1}\sum_{i=1}^n \hat{d}_i$.

Example 4.3 (Cont'd)
If a stochastic imputation is used in which each imputed value is randomly selected from the set of respondents in the same cell, then we can write
$$\hat{\theta}_{Is} = n^{-1}\sum_{g=1}^G \sum_{i\in S_g}\{\delta_i y_i + (1-\delta_i)y_i^*\}. \quad (26)$$
Writing
$$\hat{\theta}_{Is} = \hat{\theta}_{Id} + n^{-1}\sum_{g=1}^G \sum_{i\in S_g}(1-\delta_i)(y_i^* - \hat{\mu}_g),$$
the variance of the second term can be estimated by
$$n^{-2}\sum_{g=1}^G \sum_{i\in S_g}(1-\delta_i)(y_i^* - \hat{\mu}_g)^2$$
if the imputed values are generated independently, conditional on the respondents. A sketch of this cell-based procedure follows.
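A minimal sketch of the cell-based procedure (my own code; the function name and data layout are assumptions). It returns the deterministic estimator with its plug-in variance (25), and the within-cell hot-deck estimator with the added imputation-variance term:

```python
import numpy as np

def cell_imputation(y, delta, cell, rng):
    """Sketch of Example 4.3: cell-mean deterministic imputation with the
    plug-in variance (25), plus within-cell hot-deck stochastic imputation.
    y has np.nan where delta is False; each cell must contain respondents."""
    n = len(y)
    d_hat = np.empty(n)
    y_det, y_sto = y.copy(), y.copy()
    v_imp = 0.0
    for g in np.unique(cell):
        in_g = cell == g
        resp = in_g & delta
        miss = in_g & ~delta
        mu_g = y[resp].mean()
        n_g, r_g = in_g.sum(), resp.sum()
        y_det[miss] = mu_g                                   # deterministic
        y_sto[miss] = rng.choice(y[resp], size=miss.sum())   # hot deck
        # influence values d-hat_i for the variance estimator (25)
        d_hat[in_g] = mu_g + (n_g / r_g) * np.where(resp[in_g],
                                                    y[in_g] - mu_g, 0.0)
        # extra imputation variance for the stochastic version
        v_imp += np.sum((y_sto[miss] - mu_g) ** 2) / n**2
    v_det = np.sum((d_hat - d_hat.mean()) ** 2) / (n * (n - 1))
    return y_det.mean(), v_det, y_sto.mean(), v_det + v_imp

rng = np.random.default_rng(4)
n = 1000
cell = rng.integers(0, 5, n)
y0 = rng.normal(cell, 1.0)            # cell means 0..4 (illustrative)
delta = rng.random(n) < 0.7
y = np.where(delta, y0, np.nan)
print(cell_imputation(y, delta, cell, rng))
```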

4 Replication variance estimation

Replication variance estimation (under complete response)
- Let $\hat{\theta}_n$ be the complete-sample estimator of $\theta$. The replication variance estimator of $\hat{\theta}_n$ takes the form
$$\hat{V}_{rep}(\hat{\theta}_n) = \sum_{k=1}^L c_k(\hat{\theta}_n^{(k)} - \hat{\theta}_n)^2, \quad (27)$$
where $L$ is the number of replicates, $c_k$ is the replication factor associated with replicate $k$, and $\hat{\theta}_n^{(k)}$ is the $k$-th replicate of $\hat{\theta}_n$.
- If $\hat{\theta}_n = \sum_{i=1}^n y_i/n$, then we can write $\hat{\theta}_n^{(k)} = \sum_{i=1}^n w_i^{(k)}y_i$ for some replication weights $w_1^{(k)}, w_2^{(k)}, \ldots, w_n^{(k)}$.

Replication variance estimation (under complete response)
For example, in the jackknife method, we have $L = n$, $c_k = (n-1)/n$, and
$$w_i^{(k)} = \begin{cases} (n-1)^{-1} & \text{if } i \neq k \\ 0 & \text{if } i = k. \end{cases}$$
If we apply this jackknife method to $\hat{\theta}_n = \sum_{i=1}^n y_i/n$, the resulting jackknife variance estimator in (27) is algebraically equivalent to $n^{-1}(n-1)^{-1}\sum_{i=1}^n (y_i - \bar{y}_n)^2$. (A numerical check follows.)
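A quick numerical check of this algebraic identity (added here as an illustration, not part of the original slides):

```python
import numpy as np

rng = np.random.default_rng(5)
y = rng.normal(size=30)
n = len(y)

theta = y.mean()
# delete-one jackknife replicates: theta^(k) = mean of y with y_k removed
theta_k = (y.sum() - y) / (n - 1)
v_jk = ((n - 1) / n) * np.sum((theta_k - theta) ** 2)

v_lin = np.sum((y - theta) ** 2) / (n * (n - 1))  # n^{-1}(n-1)^{-1} sum
print(v_jk, v_lin)   # identical up to floating-point error
```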

Replication variance estimation (under complete response)
Under some regularity conditions, for $\hat{\theta}_n = g(\bar{y}_n)$, the replication variance estimator of $\hat{\theta}_n$, defined by
$$\hat{V}_{rep}(\hat{\theta}_n) = \sum_{k=1}^L c_k(\hat{\theta}_n^{(k)} - \hat{\theta}_n)^2 \quad (28)$$
with $\hat{\theta}_n^{(k)} = g(\bar{y}_n^{(k)})$, satisfies
$$\hat{V}_{rep}(\hat{\theta}_n) \cong \{g'(\bar{y}_n)\}^2\hat{V}_{rep}(\bar{y}_n).$$
Thus, the replication variance estimator (28) is asymptotically equivalent to the linearized variance estimator.

Remark
If the parameter of interest, denoted $\psi$, is estimated by $\hat{\psi}$, obtained by solving the estimating equation $\sum_{i=1}^n U(\psi; y_i) = 0$, then a consistent variance estimator can be obtained by the sandwich formula. The complete-sample variance estimator of $\hat{\psi}$ is
$$\hat{V}(\hat{\psi}) = \hat{\tau}_u^{-1}\hat{\Omega}_u\hat{\tau}_u^{-1\top}, \quad (29)$$
where
$$\hat{\tau}_u = n^{-1}\sum_{i=1}^n \dot{U}(\hat{\psi}; y_i), \qquad \hat{\Omega}_u = n^{-1}(n-1)^{-1}\sum_{i=1}^n (\hat{u}_i - \bar{u}_n)^{\otimes 2},$$
$\dot{U}(\psi; y) = \partial U(\psi; y)/\partial \psi^\top$, $\bar{u}_n = n^{-1}\sum_{i=1}^n \hat{u}_i$, and $\hat{u}_i = U(\hat{\psi}; y_i)$.

If we want to use a replication method of the form (27), we can construct the replication variance estimator of $\hat{\psi}$ as
$$\hat{V}_{rep}(\hat{\psi}) = \sum_{k=1}^L c_k(\hat{\psi}^{(k)} - \hat{\psi})^2, \quad (30)$$
where $\hat{\psi}^{(k)}$ is computed by solving
$$\hat{U}^{(k)}(\psi) \equiv \sum_{i=1}^n w_i^{(k)}U(\psi; y_i) = 0. \quad (31)$$

A one-step approximation of $\hat{\psi}^{(k)}$ is
$$\hat{\psi}_1^{(k)} = \hat{\psi} - \left\{\dot{\hat{U}}^{(k)}(\hat{\psi})\right\}^{-1}\hat{U}^{(k)}(\hat{\psi}) \quad (32)$$
or, even more simply,
$$\hat{\psi}_1^{(k)} = \hat{\psi} - \left\{\dot{\hat{U}}(\hat{\psi})\right\}^{-1}\hat{U}^{(k)}(\hat{\psi}). \quad (33)$$
The replication variance estimator using (33) is algebraically equivalent to
$$\left\{\dot{\hat{U}}(\hat{\psi})\right\}^{-1}\left[\sum_{k=1}^L c_k\left\{\hat{U}^{(k)}(\hat{\psi}) - \hat{U}(\hat{\psi})\right\}^{\otimes 2}\right]\left\{\dot{\hat{U}}(\hat{\psi})\right\}^{-1\top},$$
which is very close to the sandwich variance formula in (29).

Back to Example 4.1
For the regression imputation in Example 4.1,
$$\hat{\theta}_{Id} = \frac{1}{n}\sum_{i=1}^n \left\{\delta_i y_i + (1-\delta_i)(\hat{\beta}_0 + \hat{\beta}_1 x_i)\right\}.$$
The replication variance estimator of $\hat{\theta}_{Id}$ is computed as
$$\hat{V}_{rep}(\hat{\theta}_{Id}) = \sum_{k=1}^L c_k(\hat{\theta}_{Id}^{(k)} - \hat{\theta}_{Id})^2, \quad (34)$$
where
$$\hat{\theta}_{Id}^{(k)} = \sum_{i=1}^n w_i^{(k)}\left\{\delta_i y_i + (1-\delta_i)(\hat{\beta}_0^{(k)} + \hat{\beta}_1^{(k)}x_i)\right\}$$
and $(\hat{\beta}_0^{(k)}, \hat{\beta}_1^{(k)})$ is the solution to
$$\sum_{i=1}^n w_i^{(k)}\delta_i(y_i - \beta_0 - \beta_1 x_i)(1, x_i) = (0, 0).$$
The regression coefficients are re-estimated for each replicate, so the replicates reflect the sampling variability of $(\hat{\beta}_0, \hat{\beta}_1)$.

Example 4.4
We now return to the setup of an earlier example with a binary $y$ subject to nonignorable nonresponse, where $\Pr(Y_i = 1 \mid x_i) = p(x_i; \beta)$ and $\Pr(\delta_i = 1 \mid x_i, y_i) = \pi(x_i, y_i; \phi)$. In this case, the deterministically imputed estimator of $\theta = E(Y)$ is constructed as
$$\hat{\theta}_{Id} = n^{-1}\sum_{i=1}^n \{\delta_i y_i + (1-\delta_i)\hat{p}_{0i}\}, \quad (35)$$
where $\hat{p}_{0i}$ is the predictor of $y_i$ given $x_i$ and $\delta_i = 0$. That is,
$$\hat{p}_{0i} = \frac{p(x_i; \hat{\beta})\{1 - \pi(x_i, 1; \hat{\phi})\}}{\{1 - p(x_i; \hat{\beta})\}\{1 - \pi(x_i, 0; \hat{\phi})\} + p(x_i; \hat{\beta})\{1 - \pi(x_i, 1; \hat{\phi})\}},$$
where $\hat{\beta}$ and $\hat{\phi}$ are jointly estimated by the EM algorithm.

Example 4.4 (Cont'd)
E-step:
$$S_1(\beta \mid \beta^{(t)}, \phi^{(t)}) = \sum_{\delta_i=1}\{y_i - p_i(\beta)\}x_i + \sum_{\delta_i=0}\sum_{j=0}^1 w_{ij(t)}\{j - p_i(\beta)\}x_i,$$
where
$$w_{ij(t)} = \Pr(Y_i = j \mid x_i, \delta_i = 0; \beta^{(t)}, \phi^{(t)}) = \frac{\Pr(Y_i = j \mid x_i; \beta^{(t)})\Pr(\delta_i = 0 \mid x_i, j; \phi^{(t)})}{\sum_{y=0}^1 \Pr(Y_i = y \mid x_i; \beta^{(t)})\Pr(\delta_i = 0 \mid x_i, y; \phi^{(t)})}$$
and
$$S_2(\phi \mid \beta^{(t)}, \phi^{(t)}) = \sum_{\delta_i=1}\{\delta_i - \pi(x_i, y_i; \phi)\}(x_i^\top, y_i)^\top + \sum_{\delta_i=0}\sum_{j=0}^1 w_{ij(t)}\{\delta_i - \pi(x_i, j; \phi)\}(x_i^\top, j)^\top.$$

Example 4.4 (Cont'd)
M-step: the parameter estimates are updated by solving
$$\left[S_1(\beta \mid \beta^{(t)}, \phi^{(t)}),\; S_2(\phi \mid \beta^{(t)}, \phi^{(t)})\right] = (0, 0)$$
for $\beta$ and $\phi$.

Example 4.4 (Cont'd)
For replication variance estimation, we can use (34) with
$$\hat{\theta}_{Id}^{(k)} = \sum_{i=1}^n w_i^{(k)}\left\{\delta_i y_i + (1-\delta_i)\hat{p}_{0i}^{(k)}\right\}, \quad (36)$$
where
$$\hat{p}_{0i}^{(k)} = \frac{p(x_i; \hat{\beta}^{(k)})\{1 - \pi(x_i, 1; \hat{\phi}^{(k)})\}}{\{1 - p(x_i; \hat{\beta}^{(k)})\}\{1 - \pi(x_i, 0; \hat{\phi}^{(k)})\} + p(x_i; \hat{\beta}^{(k)})\{1 - \pi(x_i, 1; \hat{\phi}^{(k)})\}}$$
and $(\hat{\beta}^{(k)}, \hat{\phi}^{(k)})$ is obtained by solving the mean score equations with the weights replaced by the replication weights $w_i^{(k)}$.

Example 4.4 (Cont'd)
That is, $(\hat{\beta}^{(k)}, \hat{\phi}^{(k)})$ is the solution to
$$\bar{S}_1^{(k)}(\beta, \phi) \equiv \sum_{\delta_i=1}w_i^{(k)}\{y_i - p(x_i; \beta)\}x_i + \sum_{\delta_i=0}w_i^{(k)}\sum_{y=0}^1 w_{iy}^*(\beta, \phi)\{y - p(x_i; \beta)\}x_i = 0,$$
$$\bar{S}_2^{(k)}(\beta, \phi) \equiv \sum_{\delta_i=1}w_i^{(k)}\{\delta_i - \pi(x_i, y_i; \phi)\}(x_i^\top, y_i)^\top + \sum_{\delta_i=0}w_i^{(k)}\sum_{y=0}^1 w_{iy}^*(\beta, \phi)\{\delta_i - \pi(x_i, y; \phi)\}(x_i^\top, y)^\top = 0,$$
where
$$w_{iy}^*(\beta, \phi) = \frac{\Pr(Y_i = y \mid x_i; \beta)\{1 - \pi(x_i, y; \phi)\}}{\sum_{z=0}^1 \Pr(Y_i = z \mid x_i; \beta)\{1 - \pi(x_i, z; \phi)\}}.$$

6 Fractional Imputation

Monte Carlo EM

Remark
- Monte Carlo EM can be used as a frequentist approach to imputation.
- Convergence is not guaranteed (for fixed $m$).
- The E-step can be computationally heavy (an MCMC method may be needed).

Parametric Fractional Imputation (Kim, 2011)

Parametric fractional imputation:
1 Generate more than one (say $m$) imputed values of $y_{mis,i}$: $y_{mis,i}^{*(1)}, \ldots, y_{mis,i}^{*(m)}$ from some (initial) density $h(y_{mis,i})$.
2 Create a weighted data set
$$\{(w_{ij}^*, y_{ij}^*);\; j = 1, 2, \ldots, m;\; i = 1, 2, \ldots, n\},$$
where $\sum_{j=1}^m w_{ij}^* = 1$, $y_{ij}^* = (y_{obs,i}, y_{mis,i}^{*(j)})$,
$$w_{ij}^* \propto f(y_{ij}^*, \delta_i; \hat{\eta})/h(y_{mis,i}^{*(j)}),$$
$\hat{\eta}$ is the maximum likelihood estimator of $\eta$, and $f(y, \delta; \eta)$ is the joint density of $(y, \delta)$.
3 The weights $w_{ij}^*$ are the normalized importance weights and can be called fractional weights.
If $y_{mis,i}$ is categorical, then simply use all possible values of $y_{mis,i}$ as the imputed values.

Remark
- Importance sampling idea: for sufficiently large $m$,
$$\sum_{j=1}^m w_{ij}^*\,g(y_{ij}^*) \to \frac{\int g(y_i)\,\frac{f(y_i, \delta_i; \hat{\eta})}{h(y_{mis,i})}\,h(y_{mis,i})\,dy_{mis,i}}{\int \frac{f(y_i, \delta_i; \hat{\eta})}{h(y_{mis,i})}\,h(y_{mis,i})\,dy_{mis,i}} = E\{g(y_i) \mid y_{obs,i}, \delta_i; \hat{\eta}\}$$
for any $g$ such that the expectation exists.
- In the importance sampling literature, $h(\cdot)$ is called the proposal distribution and $f(\cdot)$ the target distribution.
- We do not need to compute the conditional distribution $f(y_{mis,i} \mid y_{obs,i}, \delta_i; \eta)$. Only the joint distribution $f(y_{obs,i}, y_{mis,i}, \delta_i; \eta)$ is needed, because
$$\frac{f(y_{obs,i}, y_{mis,i}^{*(j)}, \delta_i; \hat{\eta})/h(y_{i,mis}^{*(j)})}{\sum_{k=1}^m f(y_{obs,i}, y_{mis,i}^{*(k)}, \delta_i; \hat{\eta})/h(y_{i,mis}^{*(k)})} = \frac{f(y_{mis,i}^{*(j)} \mid y_{obs,i}, \delta_i; \hat{\eta})/h(y_{i,mis}^{*(j)})}{\sum_{k=1}^m f(y_{mis,i}^{*(k)} \mid y_{obs,i}, \delta_i; \hat{\eta})/h(y_{i,mis}^{*(k)})}.$$
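The self-normalization idea can be checked in a few lines (my own sketch, unrelated to any specific missing-data model): approximate $E(Y \mid Y > 0)$ for $Y \sim N(0, 1)$ using draws from an $N(0, 2^2)$ proposal, with weights proportional to the (unnormalized) target density over the proposal density.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
m = 100_000
y = rng.normal(0, 2, m)                  # proposal draws, h = N(0, 4)

# unnormalized target: N(0,1) density restricted to y > 0
target = stats.norm.pdf(y) * (y > 0)
w = target / stats.norm.pdf(y, scale=2)  # importance ratios
w /= w.sum()                             # self-normalize

print(np.sum(w * y))                     # approx E(Y | Y > 0)
print(np.sqrt(2 / np.pi))                # exact value, about 0.7979
```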

EM algorithm by fractional imputation
1 Imputation step: generate $y_{i,mis}^{*(j)} \sim h(y_{i,mis})$.
2 Weighting step: compute
$$w_{ij(t)}^* \propto f(y_{ij}^*, \delta_i; \hat{\eta}_{(t)})/h(y_{i,mis}^{*(j)}),$$
where $\sum_{j=1}^m w_{ij(t)}^* = 1$.
3 M-step: update
$$\hat{\eta}_{(t+1)} = \arg\max_\eta \sum_{i=1}^n \sum_{j=1}^m w_{ij(t)}^*\log f(\eta; y_{ij}^*, \delta_i).$$
4 (Optional) Check whether $w_{ij(t)}^*$ is too large for some $j$. If so, set $h(y_{i,mis}) = f(y_{i,mis} \mid y_{i,obs}; \hat{\eta}_{(t)})$ and go to Step 1.
5 Repeat Steps 2-4 until convergence.
A runnable sketch of this loop for a simple regression model is given below.
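Below is a runnable sketch of this EM-by-FI loop (my own illustration, not Kim's code) for a linear regression imputation model with $y$ MAR given $x$. The proposal $h$ is the preliminary fit with doubled residual standard deviation, the imputed values are drawn once, and only the fractional weights change across iterations; under MAR the iterations converge to the respondent-only MLE, which makes the mechanics easy to verify.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, m = 400, 50
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n)
delta = rng.random(n) < stats.norm.cdf(x)        # MAR: depends on x only
y = np.where(delta, y, np.nan)

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X[delta], y[delta], rcond=None)[0]  # preliminary fit
sig2 = np.mean((y[delta] - X[delta] @ beta) ** 2)

# Imputation step (done once): draw from an over-dispersed proposal h
Xm = X[~delta]
mu_h, sd_h = Xm @ beta, 2.0 * np.sqrt(sig2)
ystar = mu_h[:, None] + rng.normal(0.0, sd_h, (Xm.shape[0], m))
h_dens = stats.norm.pdf(ystar, mu_h[:, None], sd_h)

for t in range(50):
    # Weighting step: w*_ij proportional to f(y*_ij | x_i; eta_t) / h(y*_ij)
    f_dens = stats.norm.pdf(ystar, (Xm @ beta)[:, None], np.sqrt(sig2))
    w = f_dens / h_dens
    w /= w.sum(axis=1, keepdims=True)
    # M-step: the weighted score for beta reduces to least squares on the
    # respondents plus the per-unit conditional means of the imputed values
    ybar = (w * ystar).sum(axis=1)
    Xa = np.vstack([X[delta], Xm])
    ya = np.concatenate([y[delta], ybar])
    beta = np.linalg.lstsq(Xa, ya, rcond=None)[0]
    res_r = y[delta] - X[delta] @ beta
    res_m = ystar - (Xm @ beta)[:, None]
    sig2 = (np.sum(res_r ** 2) + np.sum(w * res_m ** 2)) / n

print(beta, sig2)   # approaches the respondent-only MLE under MAR
```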

Remark
- Imputation step + weighting step = E-step.
- The imputed values are not changed across EM iterations; only the fractional weights are changed.
1 Computationally efficient (because importance sampling is used only once).
2 Convergence is achieved (because the imputed values are not changed); see Theorem 4.5.
- For sufficiently large $t$, $\hat{\eta}_{(t)} \to \hat{\eta}^*$. Also, for sufficiently large $m$, $\hat{\eta}^* \to \hat{\eta}_{MLE}$.
- For estimation of $\psi$ in $E\{U(\psi; Y)\} = 0$, simply solve
$$\frac{1}{n}\sum_{i=1}^n \sum_{j=1}^m w_{ij}^*U(\psi; y_{ij}^*) = 0,$$
where $w_{ij}^* = w_{ij}^*(\hat{\eta})$ and $\hat{\eta}$ is obtained from the above EM algorithm.

Theorem 4.5 (Theorem 1 of Kim (2011))
Let
$$Q^*(\eta \mid \hat{\eta}_{(t)}) = \sum_{i=1}^n \sum_{j=1}^m w_{ij(t)}^*\log f(\eta; y_{ij}^*, \delta_i).$$
If
$$Q^*(\hat{\eta}_{(t+1)} \mid \hat{\eta}_{(t)}) \geq Q^*(\hat{\eta}_{(t)} \mid \hat{\eta}_{(t)}), \quad (37)$$
then
$$l_{obs}^*(\hat{\eta}_{(t+1)}) \geq l_{obs}^*(\hat{\eta}_{(t)}), \quad (38)$$
where $l_{obs}^*(\eta) = \sum_{i=1}^n \ln\{\hat{f}_{obs(i)}(y_{i,obs}, \delta_i; \eta)\}$ and
$$\hat{f}_{obs(i)}(y_{i,obs}, \delta_i; \eta) = \frac{1}{m}\sum_{j=1}^m f(y_{ij}^*, \delta_i; \eta)/h_m(y_{i,mis}^{*(j)}).$$

Proof
By Jensen's inequality,
$$l_{obs}^*(\hat{\eta}_{(t+1)}) - l_{obs}^*(\hat{\eta}_{(t)}) = \sum_{i=1}^n \ln\left\{\sum_{j=1}^m w_{ij(t)}^*\frac{f(y_{ij}^*, \delta_i; \hat{\eta}_{(t+1)})}{f(y_{ij}^*, \delta_i; \hat{\eta}_{(t)})}\right\} \geq \sum_{i=1}^n \sum_{j=1}^m w_{ij(t)}^*\ln\left\{\frac{f(y_{ij}^*, \delta_i; \hat{\eta}_{(t+1)})}{f(y_{ij}^*, \delta_i; \hat{\eta}_{(t)})}\right\} = Q^*(\hat{\eta}_{(t+1)} \mid \hat{\eta}_{(t)}) - Q^*(\hat{\eta}_{(t)} \mid \hat{\eta}_{(t)}).$$
Therefore, (37) implies (38).

Example 4.11: Return to Example 3.15
Fractional imputation:
1 Imputation step: generate $y_i^{*(1)}, \ldots, y_i^{*(m)}$ from $f(y_i \mid x_i; \hat{\theta}^{(0)})$.
2 Weighting step: using the $m$ imputed values generated in Step 1, compute the fractional weights
$$w_{ij(t)}^* \propto \frac{f(y_i^{*(j)} \mid x_i; \hat{\theta}^{(t)})\left\{1 - \pi(x_i, y_i^{*(j)}; \hat{\phi}^{(t)})\right\}}{f(y_i^{*(j)} \mid x_i; \hat{\theta}^{(0)})},$$
where
$$\pi(x_i, y_i; \hat{\phi}) = \frac{\exp(\hat{\phi}_0 + \hat{\phi}_1 x_i + \hat{\phi}_2 y_i)}{1 + \exp(\hat{\phi}_0 + \hat{\phi}_1 x_i + \hat{\phi}_2 y_i)}.$$

Example 4.11 (Cont'd)
Using the imputed data and the fractional weights, the M-step can be implemented by solving
$$\sum_{i=1}^n \sum_{j=1}^m w_{ij(t)}^*\,S(\theta; x_i, y_i^{*(j)}) = 0$$
and
$$\sum_{i=1}^n \sum_{j=1}^m w_{ij(t)}^*\left\{\delta_i - \pi(\phi; x_i, y_i^{*(j)})\right\}(1, x_i, y_i^{*(j)}) = 0, \quad (39)$$
where $S(\theta; x_i, y_i) = \partial \log f(y_i \mid x_i; \theta)/\partial \theta$.

Example 4.12: Back to Example 3.18 (GLMM)
- Level-1 model: $y_{ij} \sim f_1(y_{ij} \mid x_{ij}, a_i; \theta_1)$ for some fixed $\theta_1$, with $a_i$ random.
- Level-2 model: $a_i \sim f_2(a_i; \theta_2)$.
- Latent variable: $a_i$.
- We are interested in generating $a_i$ from
$$p(a_i \mid x_i, y_i; \theta_1, \theta_2) \propto \left\{\prod_{j=1}^{n_i}f_1(y_{ij} \mid x_{ij}, a_i; \theta_1)\right\}f_2(a_i; \theta_2).$$

Example 4.12 (Cont'd)
E-step:
1 Imputation step: generate $a_i^{*(1)}, \ldots, a_i^{*(m)}$ from $f_2(a_i; \hat{\theta}_2^{(t)})$.
2 Weighting step: using the $m$ imputed values generated in Step 1, compute the fractional weights
$$w_{ij(t)}^* \propto g_1(y_i \mid x_i, a_i^{*(j)}; \hat{\theta}_1^{(t)}),$$
where $g_1(y_i \mid x_i, a_i; \hat{\theta}_1) = \prod_{j=1}^{n_i}f_1(y_{ij} \mid x_{ij}, a_i; \theta_1)$.
M-step: update the parameters by solving
$$\sum_{i=1}^n \sum_{j=1}^m w_{ij(t)}^*S_1(\theta_1; x_i, y_i, a_i^{*(j)}) = 0, \qquad \sum_{i=1}^n \sum_{j=1}^m w_{ij(t)}^*S_2(\theta_2; a_i^{*(j)}) = 0.$$

Example 4.13: Measurement error model
- We are interested in estimating $\theta$ in $f(y \mid x; \theta)$.
- Instead of observing $x$, we observe $z$, which can be highly correlated with $x$. Thus, $z$ is an instrumental variable for $x$:
$$f(y \mid x, z) = f(y \mid x) \quad \text{and} \quad f(y \mid z = a) \neq f(y \mid z = b) \text{ for } a \neq b.$$
- In addition to the original sample, we have a separate calibration sample in which $(x_i, z_i)$ is observed.

Example 4.13 (Cont'd)
Table: Data structure

                     Z    X    Y
Calibration sample   o    o
Original sample      o         o

- The goal is to generate $x_i^*$ in the original sample from
$$f(x_i \mid z_i, y_i) \propto f(x_i \mid z_i)f(y_i \mid x_i, z_i) = f(x_i \mid z_i)f(y_i \mid x_i).$$
- Obtain a consistent estimator $\hat{f}(x \mid z)$ from the calibration sample.
- E-step:
1 Generate $x_i^{*(1)}, \ldots, x_i^{*(m)}$ from $\hat{f}(x_i \mid z_i)$.
2 Compute the fractional weights associated with $x_i^{*(j)}$:
$$w_{ij}^* \propto f(y_i \mid x_i^{*(j)}; \hat{\theta}).$$
- M-step: solve the weighted score equation for $\theta$.

Remarks for computation
Recall that, writing
$$\bar{S}^*(\eta \mid \eta') = \sum_{i=1}^n \sum_{j=1}^m w_{ij}^*(\eta')S(\eta; y_{ij}^*, \delta_i),$$
where $w_{ij}^*(\eta)$ is the fractional weight associated with $y_{ij}^*$,
$$w_{ij}^*(\eta) = \frac{f(y_{ij}^*, \delta_i; \eta)/h_m(y_{i,mis}^{*(j)})}{\sum_{k=1}^m f(y_{ik}^*, \delta_i; \eta)/h_m(y_{i,mis}^{*(k)})}, \quad (40)$$
and $S(\eta; y, \delta) = \partial \log f(y, \delta; \eta)/\partial \eta$, the EM algorithm for fractional imputation can be expressed as
$$\hat{\eta}^{(t+1)} \leftarrow \text{solve } \bar{S}^*(\eta \mid \hat{\eta}^{(t)}) = 0.$$

Remarks for computation
Instead of the EM algorithm, a Newton-type algorithm can also be used. The Newton-type algorithm for computing the MLE from the fractionally imputed data is
$$\hat{\eta}^{(t+1)} = \hat{\eta}^{(t)} + \left\{I_{obs}^*(\hat{\eta}^{(t)})\right\}^{-1}\bar{S}^*(\hat{\eta}^{(t)} \mid \hat{\eta}^{(t)}),$$
where
$$I_{obs}^*(\eta) = -\sum_{i=1}^n \sum_{j=1}^m w_{ij}^*(\eta)\dot{S}(\eta; y_{ij}^*, \delta_i) - \sum_{i=1}^n \sum_{j=1}^m w_{ij}^*(\eta)\left\{S(\eta; y_{ij}^*, \delta_i) - \bar{S}_i^*(\eta)\right\}^{\otimes 2},$$
$\dot{S}(\eta; y, \delta) = \partial S(\eta; y, \delta)/\partial \eta^\top$, and $\bar{S}_i^*(\eta) = \sum_{j=1}^m w_{ij}^*(\eta)S(\eta; y_{ij}^*, \delta_i)$. This is the Louis formula for the observed information, evaluated via the fractional weights.

Estimation of a general parameter
- The parameter $\Psi$ is defined through $E\{U(\Psi; Y)\} = 0$. The FI estimator of $\Psi$ is computed by solving
$$\sum_{i=1}^n \sum_{j=1}^m w_{ij}^*(\hat{\eta})U(\Psi; y_{ij}^*) = 0. \quad (41)$$
- Note that $\hat{\eta}$ is the solution to
$$\sum_{i=1}^n \sum_{j=1}^m w_{ij}^*(\hat{\eta})S(\hat{\eta}; y_{ij}^*) = 0.$$

Estimation of a general parameter
We can use either the linearization method or a replication method for variance estimation. For the linearization method, using Theorem 4.2, we can use the sandwich formula
$$\hat{V}(\hat{\Psi}) = \hat{\tau}_q^{-1}\hat{\Omega}_q\hat{\tau}_q^{-1\top}, \quad (42)$$
where
$$\hat{\tau}_q = n^{-1}\sum_{i=1}^n \sum_{j=1}^m w_{ij}^*\dot{U}(\hat{\Psi}; y_{ij}^*), \qquad \hat{\Omega}_q = n^{-1}(n-1)^{-1}\sum_{i=1}^n (\hat{q}_i - \bar{q}_n)^{\otimes 2},$$
with $\hat{q}_i = \bar{U}_i^* + \hat{\kappa}\bar{S}_i^*$, where $(\bar{U}_i^*, \bar{S}_i^*) = \sum_{j=1}^m w_{ij}^*(U_{ij}^*, S_{ij}^*)$, $U_{ij}^* = U(\hat{\Psi}; y_{ij}^*)$, $S_{ij}^* = S(\hat{\eta}; y_{ij}^*)$, and
$$\hat{\kappa} = \sum_{i=1}^n \sum_{j=1}^m w_{ij}^*(\hat{\eta})(U_{ij}^* - \bar{U}_i^*)S_{ij}^{*\top}\left\{I_{obs}^*(\hat{\eta})\right\}^{-1}.$$

Estimation of a general parameter
- For the replication method, we first obtain the $k$-th replicate $\hat{\eta}^{(k)}$ of $\hat{\eta}$ by solving
$$\sum_{i=1}^n w_i^{(k)}\sum_{j=1}^m w_{ij}^*(\eta)S(\eta; y_{ij}^*) = 0.$$
- Once $\hat{\eta}^{(k)}$ is obtained, the $k$-th replicate $\hat{\Psi}^{(k)}$ of $\hat{\Psi}$ is obtained by solving
$$\sum_{i=1}^n w_i^{(k)}\sum_{j=1}^m w_{ij}^*(\hat{\eta}^{(k)})U(\Psi; y_{ij}^*) = 0$$
for $\Psi$.
- The replication variance estimator of $\hat{\Psi}$ from (41) is
$$\hat{V}_{rep}(\hat{\Psi}) = \sum_{k=1}^L c_k(\hat{\Psi}^{(k)} - \hat{\Psi})^2.$$

Example 4.15: Nonparametric Fractional Imputation
- Bivariate data $(x_i, y_i)$: $x_i$ is completely observed but $y_i$ is subject to missingness.
- The joint distribution of $(x, y)$ is completely unspecified.
- Assume MAR in the sense that $P(\delta = 1 \mid x, y)$ does not depend on $y$.
- Without loss of generality, assume that $\delta_i = 1$ for $i = 1, \ldots, r$ and $\delta_i = 0$ for $i = r+1, \ldots, n$.
- We are only interested in estimating $\theta = E(Y)$.

Example 4.15 (Cont'd)
Let $K_h(x_i, x_j) = K((x_i - x_j)/h)$ be a kernel function with bandwidth $h$ such that
$$K(x) \geq 0, \quad \int K(x)\,dx = 1, \quad \int xK(x)\,dx = 0, \quad \sigma_K^2 \equiv \int x^2K(x)\,dx > 0.$$
Examples include the following:
- Boxcar kernel: $K(x) = \frac{1}{2}I(x)$
- Gaussian kernel: $K(x) = \frac{1}{\sqrt{2\pi}}\exp(-\frac{1}{2}x^2)$
- Epanechnikov kernel: $K(x) = \frac{3}{4}(1 - x^2)I(x)$
- Tricube kernel: $K(x) = \frac{70}{81}(1 - |x|^3)^3I(x)$
where
$$I(x) = \begin{cases} 1 & \text{if } |x| \leq 1 \\ 0 & \text{if } |x| > 1. \end{cases}$$

Example 4.15 (Cont'd)
- Nonparametric regression estimator of $m(x) = E(Y \mid x)$:
$$\hat{m}(x) = \sum_{i=1}^r l_i(x)y_i, \quad (43)$$
where
$$l_i(x) = \frac{K\left(\frac{x - x_i}{h}\right)}{\sum_j K\left(\frac{x - x_j}{h}\right)}.$$
- The estimator in (43) is often called the Nadaraya-Watson kernel estimator.
- Under some regularity conditions and the optimal choice of $h$ (with $h = O(n^{-1/5})$), it can be shown that
$$E\left[\{\hat{m}(x) - m(x)\}^2\right] = O(n^{-4/5}).$$
Thus, its convergence rate is slower than the parametric one.

Example 4.15 (Cont'd)
However, the imputed estimator of $\theta$ using (43) can achieve $\sqrt{n}$-consistency. That is,
$$\hat{\theta}_{NP} = \frac{1}{n}\left\{\sum_{i=1}^r y_i + \sum_{i=r+1}^n \hat{m}(x_i)\right\} \quad (44)$$
achieves
$$\sqrt{n}(\hat{\theta}_{NP} - \theta) \to N(0, \sigma^2), \quad (45)$$
where $\sigma^2 = E\{v(x)/\pi(x)\} + V\{m(x)\}$, $m(x) = E(y \mid x)$, $v(x) = V(y \mid x)$, and $\pi(x) = E(\delta \mid x)$. Result (45) was first proved by Cheng (1994).
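A compact sketch of (43)-(44) (my own code; the Gaussian kernel, the bandwidth choice, and the logistic response model are assumptions, not from the slides):

```python
import numpy as np

def nw_impute_mean(x, y, delta, h):
    """Sketch of the nonparametric imputation estimator (44):
    theta_NP = n^{-1}{sum_resp y_i + sum_nonresp mhat(x_i)}, where mhat
    is the Nadaraya-Watson fit (43) with a Gaussian kernel and bandwidth h."""
    xr, yr = x[delta], y[delta]
    xm = x[~delta]
    # kernel weights l_j(x_i) for each nonrespondent i over respondents j
    K = np.exp(-0.5 * ((xm[:, None] - xr[None, :]) / h) ** 2)
    l = K / K.sum(axis=1, keepdims=True)
    m_hat = l @ yr                       # NW fits at the nonrespondent x's
    return (yr.sum() + m_hat.sum()) / len(x)

# illustrative use with a nonlinear mean function; true theta = E(sin X) = 0
rng = np.random.default_rng(8)
n = 2000
x = rng.uniform(-2, 2, n)
y = np.sin(x) + rng.normal(0, 0.3, n)
delta = rng.random(n) < 1 / (1 + np.exp(-x))      # MAR response model
print(nw_impute_mean(x, y, delta, h=n ** -0.2))   # h = O(n^{-1/5})
```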

Example 4.15 (Cont'd)
- We can express $\hat{\theta}_{NP}$ in (44) as a nonparametric fractional imputation (NFI) estimator of the form
$$\hat{\theta}_{NFI} = \frac{1}{n}\left\{\sum_{i=1}^r y_i + \sum_{i=r+1}^n \sum_{j=1}^r w_{ij}^*y_i^{*(j)}\right\},$$
where $w_{ij}^* = l_j(x_i)$, with $l_j(\cdot)$ defined after (43), and $y_i^{*(j)} = y_j$. That is, each respondent serves as a donor for each nonrespondent, with fractional weight equal to its kernel weight.
- Variance estimation can be implemented by a resampling method, such as the bootstrap.


More information

Review and continuation from last week Properties of MLEs

Review and continuation from last week Properties of MLEs Review and continuation from last week Properties of MLEs As we have mentioned, MLEs have a nice intuitive property, and as we have seen, they have a certain equivariance property. We will see later that

More information

Bayesian inference for factor scores

Bayesian inference for factor scores Bayesian inference for factor scores Murray Aitkin and Irit Aitkin School of Mathematics and Statistics University of Newcastle UK October, 3 Abstract Bayesian inference for the parameters of the factor

More information

Combining Non-probability and. Probability Survey Samples Through Mass Imputation

Combining Non-probability and. Probability Survey Samples Through Mass Imputation Combining Non-probability and arxiv:1812.10694v2 [stat.me] 31 Dec 2018 Probability Survey Samples Through Mass Imputation Jae Kwang Kim Seho Park Yilin Chen Changbao Wu January 1, 2019 Abstract. This paper

More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang

More information

Economics 583: Econometric Theory I A Primer on Asymptotics

Economics 583: Econometric Theory I A Primer on Asymptotics Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:

More information

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures David Hunter Pennsylvania State University, USA Joint work with: Tom Hettmansperger, Hoben Thomas, Didier Chauveau, Pierre Vandekerkhove,

More information

Data Integration for Big Data Analysis for finite population inference

Data Integration for Big Data Analysis for finite population inference for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Weighting in survey analysis under informative sampling

Weighting in survey analysis under informative sampling Jae Kwang Kim and Chris J. Skinner Weighting in survey analysis under informative sampling Article (Accepted version) (Refereed) Original citation: Kim, Jae Kwang and Skinner, Chris J. (2013) Weighting

More information

Final Exam November 24, Problem-1: Consider random walk with drift plus a linear time trend: ( t

Final Exam November 24, Problem-1: Consider random walk with drift plus a linear time trend: ( t Problem-1: Consider random walk with drift plus a linear time trend: y t = c + y t 1 + δ t + ϵ t, (1) where {ϵ t } is white noise with E[ϵ 2 t ] = σ 2 >, and y is a non-stochastic initial value. (a) Show

More information

Computer Vision Group Prof. Daniel Cremers. 9. Gaussian Processes - Regression

Computer Vision Group Prof. Daniel Cremers. 9. Gaussian Processes - Regression Group Prof. Daniel Cremers 9. Gaussian Processes - Regression Repetition: Regularized Regression Before, we solved for w using the pseudoinverse. But: we can kernelize this problem as well! First step:

More information

δ -method and M-estimation

δ -method and M-estimation Econ 2110, fall 2016, Part IVb Asymptotic Theory: δ -method and M-estimation Maximilian Kasy Department of Economics, Harvard University 1 / 40 Example Suppose we estimate the average effect of class size

More information

Mann-Whitney Test with Adjustments to Pre-treatment Variables for Missing Values and Observational Study

Mann-Whitney Test with Adjustments to Pre-treatment Variables for Missing Values and Observational Study MPRA Munich Personal RePEc Archive Mann-Whitney Test with Adjustments to Pre-treatment Variables for Missing Values and Observational Study Songxi Chen and Jing Qin and Chengyong Tang January 013 Online

More information

Stat 516, Homework 1

Stat 516, Homework 1 Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball

More information

Chapter 1. Linear Regression with One Predictor Variable

Chapter 1. Linear Regression with One Predictor Variable Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical

More information

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can

More information

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

More information

Generalized Linear Models. Kurt Hornik

Generalized Linear Models. Kurt Hornik Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general

More information

Stat 451 Lecture Notes Numerical Integration

Stat 451 Lecture Notes Numerical Integration Stat 451 Lecture Notes 03 12 Numerical Integration Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapter 5 in Givens & Hoeting, and Chapters 4 & 18 of Lange 2 Updated: February 11, 2016 1 / 29

More information

Economics 582 Random Effects Estimation

Economics 582 Random Effects Estimation Economics 582 Random Effects Estimation Eric Zivot May 29, 2013 Random Effects Model Hence, the model can be re-written as = x 0 β + + [x ] = 0 (no endogeneity) [ x ] = = + x 0 β + + [x ] = 0 [ x ] = 0

More information

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Statistica Sinica 24 (2014), 1001-1015 doi:http://dx.doi.org/10.5705/ss.2013.038 INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Seunghwan Park and Jae Kwang Kim Seoul National Univeristy

More information