arxiv: v2 [stat.me] 7 Oct 2017


Weighted empirical likelihood for quantile regression with nonignorable missing covariates

Xiaohui Yuan, Xiaogang Dong
School of Basic Science, Changchun University of Technology, Changchun, China

Corresponding author. Email addresses: yuanxh@ccut.edu.cn (Xiaohui Yuan), dongxiaogang@ccut.edu.cn (Xiaogang Dong)

Preprint, October 10, 2017

Abstract

In this paper, we propose an empirical likelihood-based weighted estimator of the regression parameter in a quantile regression model with nonignorable missing covariates. The proposed estimator is computationally simple and achieves semiparametric efficiency if the probability of missingness on the fully observed variables is correctly specified. The efficiency gain of the proposed estimator over the complete-case-analysis estimator is quantified theoretically and illustrated via simulation and a real data application.

Keywords: Complete-case-analysis estimator, Empirical likelihood, Nonignorable missing covariates, Quantile regression

1. Introduction

Quantile regression, introduced by Koenker and Bassett (1978), is robust against outliers and can describe the entire conditional distribution of the response variable given the covariates. Owing to these advantages, quantile regression has become popular in econometrics, statistics, and biostatistics. The book by Koenker (2005) gives a comprehensive overview and discussion of quantile regression.

Let $Y$ denote the outcome variable, $Z$ a vector of covariates that is always observed, and $X$ a vector of covariates that may be missing for some subjects. The quantile regression model assumes that the $\tau$-th conditional quantile of $Y$ given $X$ and $Z$ is

$$Q_\tau(Y \mid X, Z, \beta^*) = \beta_0^* + X^T\beta_1^* + Z^T\beta_2^* = W^T\beta^*, \tag{1}$$

where $W = (1, X^T, Z^T)^T$ and $\beta^* = (\beta_0^*, \beta_1^{*T}, \beta_2^{*T})^T$ is an interior point of the parameter space $\Theta$, a compact subset of $R^p$. We are interested in inference about $\beta^*$ based on a random sample of incomplete data $(Y_i, X_i^T, Z_i^T, \delta_i)$, $i = 1, \dots, n$, where all the $Z_i$'s and $Y_i$'s are observed, and $\delta_i = 0$ if $X_i$ is missing and $\delta_i = 1$ otherwise.

The most commonly used method for handling missing covariate data is complete-case analysis (CCA), in which only the complete cases are used in a regression-based or likelihood-based analysis. The CCA estimator of $\beta^*$ is given by

$$\hat\beta_C = \arg\min_{\beta \in \Theta} \frac{1}{n}\sum_{i=1}^n \delta_i\, \rho_\tau(Y_i - W_i^T\beta), \tag{2}$$

where $\rho_\tau(u) = u\{\tau - I(u < 0)\}$ is the quantile loss function and $I(\cdot)$ is the indicator function.

In the statistical literature, there are three categories of missing data (Little and Rubin, 2002). The first is missing completely at random (MCAR), where the missingness mechanism is independent of any observable or unobservable quantities. The second is missing at random (MAR), where the missingness mechanism depends only on the observed variables. The third is not missing at random (NMAR), or nonignorable, where the missingness mechanism depends on the values that are missing. When the $X_i$'s are not MCAR, the CCA estimator can be biased.

Consistent and efficient estimators have been proposed for the quantile regression model when the covariate data are MAR. For example, Wei et al. (2012) developed an iterative imputation procedure for estimating the conditional quantile in the presence of missing covariates. Sherwood et al. (2013) proposed an inverse probability weighted (IPW) approach to correct for the bias from longitudinal dropouts. Chen et al. (2015) examined estimation in a quantile regression model and developed three nonparametric methods for observations missing at random under independent and nonidentically distributed errors. Liu and Yuan (2016) proposed a weighted quantile regression model with weights chosen by empirical likelihood. Their approach incorporates the incomplete data efficiently by combining the unbiased estimating equations of the complete data with those of the incomplete data. However, these methods are not easily extended to NMAR missing data mechanisms, because they are biased under NMAR.

NMAR is the most difficult problem in the missing data literature. Following Little and Zhang (2011) and Bartlett et al. (2014), we make the following NMAR assumption:

$$Y \perp \delta \mid (X, Z). \tag{3}$$

Assumption (3) implies that missingness in a covariate may depend on the value of that covariate, but is conditionally independent of the outcome. The CCA estimator is valid but inefficient under assumption (3) because it fails to draw on the observed information contained in the incomplete cases. In the context of the mean regression model, Bartlett et al. (2014) proposed an augmented CCA estimator that improves on the efficiency of the CCA estimator by specifying an additional model for the probability of missingness given the fully observed variables, i.e. $P(\delta = 1 \mid Y, Z)$. The estimating function used in Bartlett et al. (2014) draws on the information available from both complete and incomplete cases and thus improves on the efficiency of the CCA estimator. Note that under assumption (3), $P(\delta = 1 \mid Y, X, Z) = P(\delta = 1 \mid X, Z)$, for which no feasible estimator is available because $X$ is missing for some subjects. Under assumption (3), however, there is no need to estimate $P(\delta = 1 \mid X, Z)$.
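The role of assumption (3) can be illustrated with a small numerical sketch. The example below is not from the paper: it swaps quantile regression for ordinary least squares (the mean regression setting of Bartlett et al., 2014) so that it needs only NumPy, and the data-generating model, the missingness models and all names (`expit`, `ols_slope`, `delta_x`, `delta_y`) are hypothetical choices made for illustration. It compares complete-case slopes when missingness depends on $X$ only, so that (3) holds, and when it depends on $Y$, so that (3) fails.

```python
import numpy as np

def expit(t):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-t))

def ols_slope(x, y):
    """Least-squares slope of y on x, with an intercept."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)  # illustrative model, true slope 2

# Missingness of X depends on X itself (nonignorable), but not on Y given X:
# assumption (3) holds.
delta_x = rng.random(n) < expit(x)
# Missingness depends on the outcome Y: assumption (3) fails.
delta_y = rng.random(n) < expit(y)

slope_nmar = ols_slope(x[delta_x], y[delta_x])
slope_bad = ols_slope(x[delta_y], y[delta_y])
print(f"complete-case slope, missingness depends on X: {slope_nmar:.3f}")
print(f"complete-case slope, missingness depends on Y: {slope_bad:.3f}")
```

In this toy setting the first slope should stay close to 2 while the second is attenuated, which matches the statement that the CCA estimator is valid under (3) but not in general.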
Recently, Xie and Zhang (2017) proposed an empirical likelihood approach for estimating the regression parameters in the mean regression model with missing covariates under assumption (3). They showed that the empirical likelihood estimator can improve estimation efficiency if $P(\delta = 1 \mid Y, Z)$ is correctly specified. In this paper, we put forward an empirical likelihood-based weighted (ELW) estimator for the quantile regression model with nonignorable missing covariates under assumption (3). To fully utilize the information contained in the incomplete data, we incorporate the unbiased estimating equations of the incomplete observations into the empirical likelihood and obtain empirical likelihood-based weights that adjust the CCA estimator defined in (2). The proposed ELW estimator is as computationally simple as the CCA estimator and achieves semiparametric efficiency if $P(\delta = 1 \mid Y, Z)$ is correctly specified.

Empirical likelihood is an effective approach to improving efficiency. For a comprehensive review of the empirical likelihood method, see Qin and Lawless (1994), Owen (2001) and Lopez et al. (2009), among others. For applications of empirical likelihood in missing-data problems, see Wang and Rao (2002), Qin et al. (2009), Liu and Yuan (2012), Liu et al. (2013) and Zhong and Qin (2017), among others.

The rest of this paper is organized as follows. In section 2, we introduce the empirical likelihood-based weighted estimator for the quantile regression model. In section 3, we show that the ELW estimator is asymptotically equivalent to the profile empirical likelihood estimator and thus achieves semiparametric efficiency. Numerical studies are reported in sections 4-5. Proofs of the main theorems are given in the Appendix.

2. The empirical likelihood-based weighted estimation

In this section, we propose the ELW estimator of $\beta^*$ under assumption (3). Under this assumption, we only need to estimate the probability of $X$ being observed given $Y$ and $Z$, i.e. $P(\delta = 1 \mid Y, Z)$. Following Bartlett et al. (2014) and Xie and Zhang (2017), we assume that $P(\delta = 1 \mid Y, Z)$ is described by the probability model

$$P(\delta = 1 \mid Y, Z) = \pi(Y, Z, \gamma^*), \tag{4}$$

where $\gamma^*$ is a $q \times 1$ unknown parameter vector. It is natural to estimate $\gamma^*$ by the binomial likelihood estimator $\hat\gamma$, which maximizes the binomial log-likelihood

$$L_B(\gamma) = \sum_{i=1}^n \left[\delta_i \log\{\pi(Y_i, Z_i, \gamma)\} + (1 - \delta_i)\log\{1 - \pi(Y_i, Z_i, \gamma)\}\right].$$

Let $m(Y_i, Z_i, \beta, \alpha)$ be a working model for $E\{\delta_i \phi(X_i, Z_i, Y_i, \beta) \mid Z_i, Y_i\}$, with $\phi(X_i, Z_i, Y_i, \beta) = W_i\{I(Y_i - W_i^T\beta < 0) - \tau\}$. In the following, we propose the ELW estimator of $\beta^*$. Define

$$U_B(\delta_i, Z_i, Y_i, \gamma) = \frac{\delta_i - \pi(Y_i, Z_i, \gamma)}{\pi(Y_i, Z_i, \gamma)\{1 - \pi(Y_i, Z_i, \gamma)\}}\,\frac{\partial \pi(Y_i, Z_i, \gamma)}{\partial \gamma},$$

$$g_1(\delta_i, X_i, Z_i, Y_i, \theta) = [\delta_i - \pi(Y_i, Z_i, \gamma)]\, m(Z_i, Y_i, \beta, \alpha),$$

$$g(\delta_i, X_i, Z_i, Y_i, \theta) = \begin{pmatrix} g_1(\delta_i, X_i, Z_i, Y_i, \theta) \\ U_B(\delta_i, Z_i, Y_i, \gamma) \end{pmatrix}.$$

Let $p_i$ represent the probability weight allocated to $g(\delta_i, X_i, Z_i, Y_i, \hat\theta)$, where $\hat\theta = (\hat\alpha^T, \hat\beta_C^T, \hat\gamma^T)^T$ and $\hat\alpha$ is a consistent estimator of some $\alpha^*$. If $\pi(y, z, \gamma)$ is correctly specified, one can show that $E\{g(\delta_i, X_i, Z_i, Y_i, \theta^*)\} = 0$, where $\theta^* = (\alpha^{*T}, \beta^{*T}, \gamma^{*T})^T$. We then maximize the empirical likelihood function $\prod_{i=1}^n p_i$ subject to the constraints

$$p_i \ge 0, \qquad \sum_{i=1}^n p_i = 1, \qquad \sum_{i=1}^n p_i\, g(\delta_i, X_i, Z_i, Y_i, \hat\theta) = 0.$$

By the Lagrange multiplier method, we get

$$\hat p_i = \frac{1}{n}\,\frac{1}{1 + \hat\lambda^T g(\delta_i, X_i, Z_i, Y_i, \hat\theta)},$$

where $\hat\lambda$ is the Lagrange multiplier that satisfies

$$\frac{1}{n}\sum_{i=1}^n \frac{g(\delta_i, X_i, Z_i, Y_i, \hat\theta)}{1 + \hat\lambda^T g(\delta_i, X_i, Z_i, Y_i, \hat\theta)} = 0.$$

The ELW estimator of $\beta^*$ is given by

$$\hat\beta_{ELW} = \arg\min_{\beta \in \Theta} \sum_{i=1}^n \hat p_i\, \delta_i\, \rho_\tau(Y_i - W_i^T\beta). \tag{5}$$

Define

$$\lambda(\theta) = \arg\max_{\lambda} \sum_{i=1}^n \log\{1 + \lambda^T g(\delta_i, X_i, Z_i, Y_i, \theta)\}. \tag{6}$$
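The computation of the weights $\hat p_i$ can be sketched in a few lines. The code below is only an illustration under stated assumptions, not the authors' implementation: the function names (`el_weights`, `check_loss`, `elw_objective`) are hypothetical, the rows of `g` stand for already evaluated vectors $g(\delta_i, X_i, Z_i, Y_i, \hat\theta)$, and the toy `g` at the bottom is random noise rather than output of a fitted model. The multiplier is found by a damped Newton iteration on the concave objective in (6).

```python
import numpy as np

def el_weights(g, max_iter=100, tol=1e-12):
    """Empirical likelihood weights for the rows g_i of an (n, r) array.

    Maximizes sum_i log(1 + lam' g_i) over lam by Newton steps with
    step-halving, then returns p_i = 1 / (n * (1 + lam' g_i)) and lam.
    """
    n, r = g.shape
    lam = np.zeros(r)

    def objective(l):
        arg = 1.0 + g @ l
        # Outside the domain (some 1 + l'g_i <= 0) the objective is -inf.
        return -np.inf if np.any(arg <= 0) else np.log(arg).sum()

    for _ in range(max_iter):
        w = 1.0 / (1.0 + g @ lam)
        grad = g.T @ w                      # sum_i g_i / (1 + lam' g_i)
        if np.linalg.norm(grad) < tol * n:
            break
        gw = g * w[:, None]
        hess = -(gw.T @ gw)                 # Hessian, negative definite
        step = np.linalg.solve(hess, grad)  # Newton update is lam - step
        t = 1.0
        # Halve the step until it is feasible and does not decrease the objective.
        while objective(lam - t * step) < objective(lam) and t > 1e-12:
            t /= 2.0
        lam = lam - t * step
    p = 1.0 / (n * (1.0 + g @ lam))
    return p, lam

def check_loss(u, tau):
    """Quantile loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (u < 0))

def elw_objective(beta, y, w, delta, p, tau):
    """Weighted complete-case objective from (5) for a given beta."""
    return np.sum(p * delta * check_loss(y - w @ beta, tau))

# Toy usage: random stand-in for the estimating functions.
rng = np.random.default_rng(1)
g = rng.normal(size=(200, 2)) + 0.1
p, lam = el_weights(g)
print("sum of weights:", p.sum())
print("weighted mean of g:", p @ g)
```

Minimizing `elw_objective` over `beta` is a weighted quantile regression problem, so in practice any quantile regression routine that accepts observation weights could be used for that step.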

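From (5), it is easily seen that $\hat\lambda = \lambda(\hat\theta)$. For fixed $\theta = \hat\theta$, solving (6) is a well-behaved optimization problem: the objective function is globally concave, and the problem can be solved by a simple Newton-Raphson procedure.

Let $F_i(\cdot)$ and $f_i(\cdot)$ denote respectively the conditional distribution and density functions of $Y_i$ given $(X_i, Z_i)$. Denote

$$F = E\{\delta_i f_i(0) W_i W_i^T\},$$
$$S_\phi = E\{\delta_i\, \phi(X_i, Z_i, Y_i, \beta^*)\, \phi^T(X_i, Z_i, Y_i, \beta^*)\},$$
$$D_1 = E\{[\delta_i - \pi(Y_i, Z_i, \gamma^*)]^2\, m(Z_i, Y_i, \beta^*, \alpha^*)\, m^T(Z_i, Y_i, \beta^*, \alpha^*)\},$$
$$D_2 = E\{[\delta_i - \pi(Y_i, Z_i, \gamma^*)]\, m(Z_i, Y_i, \beta^*, \alpha^*)\, U_B^T(\delta_i, Z_i, Y_i, \gamma^*)\},$$
$$D_3 = E\{\delta_i [\delta_i - \pi(Y_i, Z_i, \gamma^*)]\, \phi(X_i, Z_i, Y_i, \beta^*)\, m^T(Z_i, Y_i, \beta^*, \alpha^*)\},$$
$$D_4 = E\{\delta_i\, \phi(X_i, Z_i, Y_i, \beta^*)\, U_B^T(\delta_i, Z_i, Y_i, \gamma^*)\},$$
$$S_B = E\{U_B(\delta_i, Z_i, Y_i, \gamma^*)\, U_B^T(\delta_i, Z_i, Y_i, \gamma^*)\}.$$

The following regularity conditions are used in the asymptotic analysis:

C1 The $\tau$-th conditional quantile of $Y_i$ given $W_i$ is $Q_\tau(Y_i \mid W_i, \beta^*) = W_i^T\beta^*$, and $W_i$ has bounded support.

C2 $Y \perp \delta \mid (X, Z)$.

C3 $F$, $S_\phi$ and $S_B$ are positive definite.

C4 $F_i(\cdot)$ is absolutely continuous and $f_i(\cdot)$ is uniformly bounded away from $0$ and $\infty$ at $0$.

C5 (a) $P(\delta = 1 \mid Y, Z) = \pi(Y, Z, \gamma^*)$. (b) $\inf_{(Y,Z)} \pi(Y, Z, \gamma^*) \ge c_0$ for some $c_0 > 0$. (c) For all $(Y_i, Z_i)$, $\pi(Y_i, Z_i, \gamma)$ admits all third partial derivatives $\partial^3 \pi(Y_i, Z_i, \gamma)/\partial\gamma_k \partial\gamma_l \partial\gamma_m$ for all $\gamma$ in a neighborhood of the true value $\gamma^*$; $|\partial^3 \pi(Y_i, Z_i, \gamma)/\partial\gamma_k \partial\gamma_l \partial\gamma_m|$ and $\|\partial \pi(Y_i, Z_i, \gamma)/\partial\gamma\|^2$ are bounded by an integrable function for all $\gamma$ in this neighborhood.

C6 For all $(Y_i, Z_i)$, $m(Y_i, Z_i, \beta, \alpha)$ admits all second partial derivatives $\partial^2 m(Y_i, Z_i, \beta, \alpha)/\partial\beta_i \partial\beta_j$ and $\partial^2 m(Y_i, Z_i, \beta, \alpha)/\partial\alpha_i \partial\alpha_j$ for all $\beta$ and $\alpha$ in a neighborhood of $(\beta^{*T}, \alpha^{*T})^T$; $\|m(Y_i, Z_i, \beta, \alpha)\|^2$, $\|\partial^2 m(Y_i, Z_i, \beta, \alpha)/\partial\beta_i \partial\beta_j\|$ and $\|\partial^2 m(Y_i, Z_i, \beta, \alpha)/\partial\alpha_i \partial\alpha_j\|$ are bounded by an integrable function for all $\beta$ and $\alpha$ in this neighborhood.

The asymptotic distribution of $\hat\beta_C$ is given by the following theorem.

Theorem 2.1. Under conditions C1-C4, $n^{1/2}(\hat\beta_C - \beta^*) \xrightarrow{d} N(0, \Sigma_C)$ as $n \to \infty$, where $\Sigma_C = F^{-1} S_\phi F^{-1}$.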

The asymptotic distribution of $\hat\beta_{ELW}$ is given by the following theorem.

Theorem 2.2. Under conditions C1-C6, $n^{1/2}(\hat\beta_{ELW} - \beta^*) \xrightarrow{d} N(0, \Sigma_{ELW})$ as $n \to \infty$, where

$$\Sigma_{ELW} = F^{-1}\left(S_\phi - V_1 V_2^{-1} V_1^T\right)F^{-1} = \Sigma_C - F^{-1} V_1 V_2^{-1} V_1^T F^{-1},$$

$V_1 = D_3 - D_4 S_B^{-1} D_2^T$ and $V_2 = D_1 - D_2 S_B^{-1} D_2^T$.

For two matrices $A$ and $B$, we write $A \le B$ if $B - A$ is a nonnegative-definite matrix.

Corollary 2.3. If both $F$ and $V_2$ are positive definite, we have $\Sigma_{ELW} \le \Sigma_C$, and equality holds if and only if $V_1 = 0$.

Corollary 2.3 shows that $\hat\beta_{ELW}$ is at least as efficient as $\hat\beta_C$ for any working regression function $m(Y_i, Z_i, \beta, \alpha)$, whether or not it correctly identifies the optimal regression function $E\{\phi(X_i, Z_i, Y_i, \beta) \mid Z_i, Y_i, \delta_i = 1\}$. Although $\hat\beta_{ELW}$ can be obtained easily, it is difficult to estimate its limiting covariance matrix analytically. We apply the resampling method of Liu and Yuan (2016) for inference about $\beta^*$.

3. Simulation studies

In this section, we investigate the performance of the proposed estimator $\hat\beta_{ELW}$ and several other estimators in Monte Carlo simulations. The simulated data are generated by the procedure of Bartlett et al. (2014), in which the non-missing indicator $\delta$ is distributed with $P(\delta = 1) = 0.5$, and $(X, Z, Y)$ is generated from a trivariate normal distribution conditional on $\delta$:

$$(X, Z, Y)^T \mid \delta \sim N\left((\delta, 0, \eta\delta)^T, \Psi\right),$$

where $\Psi = (\sigma_{ab})$, $a, b = x, z, y$, $\eta = (\sigma_{xy}\sigma_{zz} - \sigma_{xz}\sigma_{zy})\upsilon_1$ and $\upsilon_1 = (\sigma_{xx}\sigma_{zz} - \sigma_{xz}^2)^{-1}$. It is easy to verify that the assumption $\delta \perp Y \mid (X, Z)$ is satisfied in this setup. Conditional on $Z$ and $Y$, $P(\delta = 1 \mid Z, Y)$ follows a logistic regression with

$$P(\delta = 1 \mid Z, Y) = \frac{\exp(\gamma_0 + \gamma_1 Z + \gamma_2 Y)}{1 + \exp(\gamma_0 + \gamma_1 Z + \gamma_2 Y)},$$

where $\gamma_0 = -0.5\,\eta^2 \sigma_{zz} \upsilon_2$, $\gamma_1 = -\eta \sigma_{zy} \upsilon_2$, $\gamma_2 = \eta \sigma_{zz} \upsilon_2$ and $\upsilon_2 = (\sigma_{zz}\sigma_{yy} - \sigma_{zy}^2)^{-1}$. The conditional quantile model of interest is specified as

$$Q_\tau(Y \mid X, Z) = \beta_0 + \beta_1 X + \beta_2 Z,$$

with $\beta_0 = \Phi^{-1}(\tau, \sigma^2)$, $\beta_1 = (\sigma_{xy}\sigma_{zz} - \sigma_{xz}\sigma_{zy})\upsilon_1$, $\beta_2 = (\sigma_{zy}\sigma_{xx} - \sigma_{xz}\sigma_{xy})\upsilon_1$ and $\sigma^2 = \sigma_{yy} - (\sigma_{xy}^2\sigma_{zz} - 2\sigma_{xy}\sigma_{xz}\sigma_{zy} + \sigma_{zy}^2\sigma_{xx})\upsilon_1$. We set $\sigma_{xx} = \sigma_{yy} = \sigma_{zz} = 1$ and $\sigma_{xz} = \sigma_{xy} = \sigma_{zy} = 0.5$, and generate 1000 Monte Carlo data sets of sample sizes $n = 100$ and $300$. Five estimators are considered:

1. $\hat\beta_{ideal}$: the quantile regression estimator based on the full observations. This ideal case is not feasible in practice, but we use it as a benchmark for comparison;
2. $\hat\beta_C$: the CCA estimator defined in equation (2);
3. $\hat\beta_{IPW}^{MAR}$: the IPW estimator assuming MAR, introduced in Sherwood et al. (2013);
4. $\hat\beta_{ELW}^{MAR}$: the ELW estimator assuming MAR, proposed by Liu and Yuan (2016);
5. $\hat\beta_{ELW}$: the ELW estimator defined in equation (5).

The empirical bias and the root-mean-squared errors (RMSEs) of these estimators with sample sizes of 100 and 300 are reported in Table 1. The results can be summarized as follows. The CCA estimator $\hat\beta_C$ and the ELW estimator $\hat\beta_{ELW}$ are unbiased, as expected, while $\hat\beta_{IPW}^{MAR}$ and $\hat\beta_{ELW}^{MAR}$ are clearly biased for $\beta_0$. $\hat\beta_{ELW}$ performs better than $\hat\beta_C$ in terms of RMSE in most cases, which agrees with our theory. The RMSEs of $\hat\beta_C$ and $\hat\beta_{ELW}$ decrease as the sample size $n$ goes up from 100 to 300.

4. Data analysis

In this section, we apply the proposed method to data on alcohol consumption, age, body mass index and systolic blood pressure from the NHANES. We model the population quantile of SBP (systolic blood pressure) as a function of the following four covariates: BMI (body mass index), Alcohol ($\log\{\text{alcohol consumption per day} + 1\}$), Age ($\{\text{age} - 50\}/10$) and Age$^2$ ($\{\text{age} - 50\}^2/100$). There are 7104 observations in the data set. The dependent variable SBP and the covariates BMI and Age are completely observed, while 53.29% of the values of the covariate Alcohol are missing. It is a priori plausible that missingness in Alcohol depends primarily on its own value (i.e. NMAR), and that missingness in Alcohol is independent of SBP conditional on Alcohol, BMI, Age and Age$^2$. Consequently, CCA is expected to give valid inferences, while the MAR assumption likely does not hold.

For $i = 1, \dots, n = 7104$, let $Y_i$ denote the $i$th observation of $Y = $ SBP, $Z_i$ the $i$th observation of $Z = (\text{BMI}, \text{Age}, \text{Age}^2)^T$ and $X_i$ the $i$th observation of $X = $ Alcohol. We consider the following model for the $\tau$th conditional quantile of $Y_i$ given $W_i = (1, X_i, Z_i^T)^T$:

$$Q_\tau(Y_i \mid X_i, Z_i, \beta) = \beta_0 + X_i\beta_1 + Z_i^T\beta_2, \qquad i = 1, \dots, n,$$

where $\beta = (\beta_0, \beta_1, \beta_2^T)^T$ and $\beta_2 = (\beta_{21}, \beta_{22}, \beta_{23})^T$. We consider two estimators, $\hat\beta_C$ and $\hat\beta_{ELW}$. For the ELW method, the probability that Alcohol is observed is modeled by

$$\pi(Y, Z, \gamma) = \{1 + \exp(-\gamma_0 - Y\gamma_1 - Z^T\gamma_2)\}^{-1}.$$

In Figure 1, we plot the estimated regression coefficients $\hat\beta_C$ and $\hat\beta_{ELW}$ for $\beta_1$, $\beta_{21}$, $\beta_{22}$ and $\beta_{23}$ at quantile levels $\tau = 0.1, 0.2, \dots, 0.9$. The CCA and ELW methods produce similar estimated regression coefficients. In Figure 2, we plot the standard errors of $\hat\beta_C$ and $\hat\beta_{ELW}$ for $\beta_1$, $\beta_{21}$, $\beta_{22}$ and $\beta_{23}$ at the same quantile levels. The standard error of $\hat\beta_{ELW}$ is smaller than that of $\hat\beta_C$ in most cases.

5. Conclusions

In this paper, we develop a weighted empirical likelihood approach for estimating the conditional quantile functions in linear models with nonignorable missing covariates. By incorporating the unbiased estimating equations of the incomplete data into the empirical likelihood, the ELW estimator can achieve semiparametric efficiency if the probability of missingness is correctly specified. Extending the proposed method to other regression models will be investigated in future work.

Acknowledgements

Xiaohui Yuan was partly supported by the NSFC. Xiaogang Dong was partly supported by the NSFC.

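6. Appendix

In this section, we give a preliminary lemma that is used in the proofs of the main results in section 2.

Lemma 6.1. Under conditions C1-C5, we have

$$\hat\lambda = \lambda(\hat\theta) = n^{-1} S_g^{-1}\left[U_g(\theta^*) + G_\gamma S_B^{-1} U(\gamma^*)\right] + o_p(n^{-1/2}),$$

where $\lambda(\theta)$ is defined in (6).

The proof of Lemma 6.1

By Lemma A.2 in Liu and Yuan (2016), we have

$$\hat\lambda = \left\{\frac{1}{n}\sum_{i=1}^n g(\delta_i, X_i, Z_i, Y_i, \hat\theta)\, g^T(\delta_i, X_i, Z_i, Y_i, \hat\theta)\right\}^{-1} n^{-1} U_g(\hat\theta) + o_p(n^{-1/2}),$$

where $U_g(\theta) = \sum_{i=1}^n g(\delta_i, X_i, Z_i, Y_i, \theta)$. By a Taylor expansion,

$$n^{-1} U_g(\hat\theta) = n^{-1} U_g(\theta^*) + n^{-1}\frac{\partial U_g(\tilde\theta)}{\partial \alpha^T}(\hat\alpha - \alpha^*) + n^{-1}\frac{\partial U_g(\tilde\theta)}{\partial \beta^T}(\hat\beta_C - \beta^*) + n^{-1}\frac{\partial U_g(\tilde\theta)}{\partial \gamma^T}(\hat\gamma - \gamma^*),$$

where $\tilde\theta$ is a point on the segment connecting $\hat\theta$ and $\theta^*$. By the law of large numbers, we have

$$\frac{1}{n}\sum_{i=1}^n g(\delta_i, X_i, Z_i, Y_i, \hat\theta)\, g^T(\delta_i, X_i, Z_i, Y_i, \hat\theta) \xrightarrow{p} \begin{pmatrix} D_1 & D_2 \\ D_2^T & S_B \end{pmatrix} = S_g,$$

$$n^{-1}\frac{\partial U_g(\tilde\theta)}{\partial \gamma^T} \xrightarrow{p} E\left\{\frac{\partial g(\delta_i, X_i, Z_i, Y_i, \theta^*)}{\partial \gamma^T}\right\} = -\begin{pmatrix} D_2 \\ S_B \end{pmatrix} = G_\gamma, \qquad n^{-1}\frac{\partial U_g(\tilde\theta)}{\partial \alpha^T} \xrightarrow{p} 0, \qquad n^{-1}\frac{\partial U_g(\tilde\theta)}{\partial \beta^T} \xrightarrow{p} 0. \tag{7}$$

By the asymptotic properties of the maximum likelihood estimate, we have

$$\hat\gamma - \gamma^* = n^{-1} S_B^{-1} U(\gamma^*) + o_p(n^{-1/2}), \tag{8}$$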

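where $U(\gamma^*) = \sum_{i=1}^n U_B(\delta_i, Z_i, Y_i, \gamma^*)$. Thus by (7) and (8),

$$\hat\lambda = S_g^{-1}\left\{n^{-1} U_g(\theta^*) + n^{-1}\frac{\partial U_g(\tilde\theta)}{\partial \gamma^T}(\hat\gamma - \gamma^*)\right\} + o_p(n^{-1/2}) = n^{-1} S_g^{-1}\left[U_g(\theta^*) + G_\gamma S_B^{-1} U(\gamma^*)\right] + o_p(n^{-1/2}).$$

The desired result follows.

The proof of Theorem 2.1

The proof is similar to the proof of Theorem 4.1 in Koenker (2005, page 120).

The proof of Theorem 2.2

For $i = 1, \dots, n$, let $A_i(\eta) = \rho_\tau(\varepsilon_i - W_i^T\eta/\sqrt{n}) - \rho_\tau(\varepsilon_i)$, where $\varepsilon_i = Y_i - W_i^T\beta^*$. The function $A(\eta) = \sum_{i=1}^n n\hat p_i \delta_i A_i(\eta)$ is convex and is minimized at $\hat\eta = \sqrt{n}(\hat\beta_{ELW} - \beta^*)$. Following Knight's identity (Knight, 1998),

$$\rho_\tau(u - v) - \rho_\tau(u) = v[I(u < 0) - \tau] + \int_0^v [I(u \le s) - I(u \le 0)]\,ds,$$

we can write $A(\eta) = A_1(\eta) + A_2(\eta)$, where

$$A_1(\eta) = n^{-1/2}\eta^T \sum_{i=1}^n n\hat p_i \delta_i\, \phi(X_i, Z_i, Y_i, \beta^*), \tag{9}$$

$$A_2(\eta) = \sum_{i=1}^n n\hat p_i \delta_i \int_0^{W_i^T\eta/\sqrt{n}} \{I(\varepsilon_i \le s) - I(\varepsilon_i \le 0)\}\,ds. \tag{10}$$

We first give the asymptotic expression of (9). Applying a Taylor expansion, we get

$$\sum_{i=1}^n n\hat p_i \delta_i A_{1i}(\eta) = \eta^T n^{-1/2}\sum_{i=1}^n \delta_i\, \phi(X_i, Z_i, Y_i, \beta^*) - \eta^T n^{-1}\sum_{i=1}^n \delta_i\, \phi(X_i, Z_i, Y_i, \beta^*)\, g^T(\delta_i, X_i, Z_i, Y_i, \theta^*)\, n^{1/2}\hat\lambda + o_p(1).$$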

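By the law of large numbers, we have

$$n^{-1}\sum_{i=1}^n \delta_i\, \phi(X_i, Z_i, Y_i, \beta^*)\, g^T(\delta_i, X_i, Z_i, Y_i, \theta^*) \xrightarrow{p} F_g = (D_3, D_4). \tag{11}$$

By Lemma 6.1,

$$\sum_{i=1}^n n\hat p_i \delta_i A_{1i}(\eta) = \eta^T n^{-1/2}\sum_{i=1}^n \delta_i\, \phi(X_i, Z_i, Y_i, \beta^*) - \eta^T F_g\, n^{1/2}\hat\lambda + o_p(1) = \eta^T n^{-1/2}\left\{U_\phi(\theta^*) - F_g S_g^{-1}\left[U_g(\theta^*) + G_\gamma S_B^{-1} U(\gamma^*)\right]\right\} + o_p(1),$$

where $U_\phi(\theta^*) = \sum_{i=1}^n \delta_i\, \phi(X_i, Z_i, Y_i, \beta^*)$.

Next, we give the asymptotic expression of (10). A Taylor expansion reveals that

$$\sum_{i=1}^n n\hat p_i \delta_i A_{2i}(\eta) = \sum_{i=1}^n \delta_i A_{2i}(\eta) - \sum_{i=1}^n A_{2i}(\eta)\,\delta_i\, g^T(\delta_i, X_i, Z_i, Y_i, \theta^*)\hat\lambda + o_p(1).$$

Moreover, similar to the proof of Theorem 4.1 in Koenker (2005), one can show that

$$\sum_{i=1}^n \delta_i A_{2i}(\eta) = \frac{1}{2}\eta^T F \eta + o_p(1).$$

Thus, we only need to show that $\sum_{i=1}^n A_{2i}(\eta)\,\delta_i\, g^T(\delta_i, X_i, Z_i, Y_i, \theta^*)\hat\lambda$ is asymptotically negligible. By Lemma 6.1 and Lemma D.2 in Kitamura et al. (2004), we have $\hat\lambda = O_p(n^{-1/2})$ and $\max_{1 \le i \le n}\{\|g(\delta_i, X_i, Z_i, Y_i, \theta^*)\|\} = o_p(n^{1/2})$. Then,

$$\left|\sum_{i=1}^n A_{2i}(\eta)\,\delta_i\, g^T(\delta_i, X_i, Z_i, Y_i, \theta^*)\hat\lambda\right| \le \max_{1 \le i \le n}\{\|g(\delta_i, X_i, Z_i, Y_i, \theta^*)\|\}\, \|\hat\lambda\| \sum_{i=1}^n \delta_i A_{2i}(\eta) = o_p(1).$$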

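By the asymptotic expressions of (9) and (10), we conclude that $A(\eta) \xrightarrow{d} A_0(\eta)$, where

$$A_0(\eta) = \eta^T n^{-1/2}\left\{U_\phi(\theta^*) - F_g S_g^{-1}\left[U_g(\theta^*) + G_\gamma S_B^{-1} U(\gamma^*)\right]\right\} + \frac{1}{2}\eta^T F \eta.$$

Then, it follows that $\sqrt{n}(\hat\beta_{ELW} - \beta^*) = \hat\eta \xrightarrow{d} \arg\min_\eta A_0(\eta)$, where

$$\arg\min_\eta A_0(\eta) = -F^{-1} n^{-1/2}\left\{U_\phi(\theta^*) - F_g S_g^{-1}\left[U_g(\theta^*) + G_\gamma S_B^{-1} U(\gamma^*)\right]\right\}.$$

Furthermore, by simple algebra, one can verify that

$$(D_3, D_4)\begin{pmatrix} D_1 & D_2 \\ D_2^T & S_B \end{pmatrix}^{-1} = \left(V_1 V_2^{-1},\; D_4 S_B^{-1} - V_1 V_2^{-1} D_2 S_B^{-1}\right)$$

and

$$U_g(\theta^*) + G_\gamma S_B^{-1} U(\gamma^*) = \begin{pmatrix} \sum_{i=1}^n \left[g_1(\delta_i, X_i, Z_i, Y_i, \theta^*) - D_2 S_B^{-1} U_B(\delta_i, Z_i, Y_i, \gamma^*)\right] \\ 0 \end{pmatrix}.$$

Therefore,

$$F_g S_g^{-1}\left[U_g(\theta^*) + G_\gamma S_B^{-1} U(\gamma^*)\right] = (D_3, D_4)\begin{pmatrix} D_1 & D_2 \\ D_2^T & S_B \end{pmatrix}^{-1}\left[U_g(\theta^*) + G_\gamma S_B^{-1} U(\gamma^*)\right] = V_1 V_2^{-1}\sum_{i=1}^n \left[g_1(\delta_i, X_i, Z_i, Y_i, \theta^*) - D_2 S_B^{-1} U_B(\delta_i, Z_i, Y_i, \gamma^*)\right].$$

Let

$$h_{1i} = \delta_i\, \phi(X_i, Z_i, Y_i, \beta^*), \qquad h_{2i} = g_1(\delta_i, X_i, Z_i, Y_i, \theta^*) - D_2 S_B^{-1} U_B(\delta_i, Z_i, Y_i, \gamma^*).$$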

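One can write $\arg\min_\eta A_0(\eta)$ as $-F^{-1} n^{-1/2}\sum_{i=1}^n \{h_{1i} - V_1 V_2^{-1} h_{2i}\}$. It is easily verified that

$$Var(h_{1i}) = E(h_{1i} h_{1i}^T) = S_\phi,$$

$$E(h_{1i} h_{2i}^T) = E(h_{1i}\, g_1^T(\delta_i, X_i, Z_i, Y_i, \theta^*)) - E(h_{1i}\, U_B^T(\delta_i, Z_i, Y_i, \gamma^*))\, S_B^{-1} D_2^T = D_3 - D_4 S_B^{-1} D_2^T,$$

and

$$Var(h_{2i}) = Var(g_1(\delta_i, X_i, Z_i, Y_i, \theta^*)) + D_2 S_B^{-1}\, Var(U_B(\delta_i, Z_i, Y_i, \gamma^*))\, S_B^{-1} D_2^T - E[g_1(\delta_i, X_i, Z_i, Y_i, \theta^*)\, U_B^T(\delta_i, Z_i, Y_i, \gamma^*)]\, S_B^{-1} D_2^T - D_2 S_B^{-1} E[U_B(\delta_i, Z_i, Y_i, \gamma^*)\, g_1^T(\delta_i, X_i, Z_i, Y_i, \theta^*)] = D_1 - D_2 S_B^{-1} D_2^T.$$

Thus,

$$Var\left(h_{1i} - V_1 V_2^{-1} h_{2i}\right) = Var(h_{1i}) - V_1 V_2^{-1} E(h_{2i} h_{1i}^T) - E(h_{1i} h_{2i}^T) V_2^{-1} V_1^T + V_1 V_2^{-1}\, Var(h_{2i})\, V_2^{-1} V_1^T$$
$$= S_\phi - V_1 V_2^{-1}(D_3 - D_4 S_B^{-1} D_2^T)^T - (D_3 - D_4 S_B^{-1} D_2^T) V_2^{-1} V_1^T + V_1 V_2^{-1}(D_1 - D_2 S_B^{-1} D_2^T) V_2^{-1} V_1^T = S_\phi - V_1 V_2^{-1} V_1^T.$$

The desired result follows by the central limit theorem.

The proof of Theorem ??

According to the proof of Theorem 1 of Lopez et al. (2009), it can be shown that

$$n^{1/2}\begin{pmatrix} \hat\beta_{EL} - \beta^* \\ \hat\gamma_{EL} - \gamma^* \end{pmatrix} \xrightarrow{d} N(0, \Sigma_{EL}),$$

where $\Sigma_{EL} = (S_1^T S_2^{-1} S_1)^{-1}$,

$$S_1 = \begin{pmatrix} F & 0 \\ 0 & -D_2 \\ 0 & -S_B \end{pmatrix} \quad \text{and} \quad S_2 = \begin{pmatrix} S_\phi & D_3 & D_4 \\ D_3^T & D_1 & D_2 \\ D_4^T & D_2^T & S_B \end{pmatrix}.$$

Let $E_{22} = \begin{pmatrix} D_1 & D_2 \\ D_2^T & S_B \end{pmatrix}$; then we write $S_2 = \begin{pmatrix} S_\phi & F_g \\ F_g^T & E_{22} \end{pmatrix}$, where $F_g$ is defined in (11). We know that

$$S_2^{-1} = \begin{pmatrix} E_{11.2}^{-1} & -E_{11.2}^{-1} F_g E_{22}^{-1} \\ -E_{22}^{-1} F_g^T E_{11.2}^{-1} & E_{22}^{-1} + E_{22}^{-1} F_g^T E_{11.2}^{-1} F_g E_{22}^{-1} \end{pmatrix},$$

with $E_{11.2} = S_\phi - F_g E_{22}^{-1} F_g^T = S_\phi - V_1 V_2^{-1} V_1^T - D_4 S_B^{-1} D_4^T$. Note that $S_1^T S_2^{-1} S_1$ can be written as

$$S_1^T S_2^{-1} S_1 = \begin{pmatrix} F E_{11.2}^{-1} F & F E_{11.2}^{-1} F_g E_{22}^{-1}\begin{pmatrix} D_2 \\ S_B \end{pmatrix} \\ (D_2^T, S_B) E_{22}^{-1} F_g^T E_{11.2}^{-1} F & (D_2^T, S_B)\{E_{22}^{-1} + E_{22}^{-1} F_g^T E_{11.2}^{-1} F_g E_{22}^{-1}\}\begin{pmatrix} D_2 \\ S_B \end{pmatrix} \end{pmatrix} = \begin{pmatrix} F E_{11.2}^{-1} F & F E_{11.2}^{-1} D_4 \\ D_4^T E_{11.2}^{-1} F & S_B + D_4^T E_{11.2}^{-1} D_4 \end{pmatrix} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix},$$

where $H_{11} = F E_{11.2}^{-1} F$, $H_{12} = F E_{11.2}^{-1} D_4$, $H_{21} = H_{12}^T$ and $H_{22} = S_B + D_4^T E_{11.2}^{-1} D_4$. Therefore, we have

$$(S_1^T S_2^{-1} S_1)^{-1} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix}^{-1} = \begin{pmatrix} H_{11}^{-1} + H_{11}^{-1} H_{12} H_{22.1}^{-1} H_{21} H_{11}^{-1} & -H_{11}^{-1} H_{12} H_{22.1}^{-1} \\ -H_{22.1}^{-1} H_{21} H_{11}^{-1} & H_{22.1}^{-1} \end{pmatrix},$$

where $H_{22.1} = H_{22} - H_{21} H_{11}^{-1} H_{12} = S_B$. By direct calculation, it follows that

$$H_{11}^{-1} + H_{11}^{-1} H_{12} H_{22.1}^{-1} H_{21} H_{11}^{-1} = F^{-1} E_{11.2} F^{-1} + F^{-1} D_4 S_B^{-1} D_4^T F^{-1}$$
$$= F^{-1} S_\phi F^{-1} - F^{-1} V_1 V_2^{-1} V_1^T F^{-1} - F^{-1} D_4 S_B^{-1} D_4^T F^{-1} + F^{-1} D_4 S_B^{-1} D_4^T F^{-1}$$
$$= F^{-1} S_\phi F^{-1} - F^{-1} V_1 V_2^{-1} V_1^T F^{-1} = \Sigma_{ELW},$$

and $H_{11}^{-1} H_{12} H_{22.1}^{-1} = F^{-1} E_{11.2} F^{-1} F E_{11.2}^{-1} D_4 S_B^{-1} = F^{-1} D_4 S_B^{-1}$. The desired result follows.

References

Bartlett J W, Carpenter J R, Tilling K, et al. (2014). Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics, 15(4).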

Chen X, Wan A T K, Zhou Y (2015). Efficient quantile regression analysis with missing observations. Journal of the American Statistical Association, 110(510).

Kitamura Y, Tripathi G, Ahn H (2004). Empirical likelihood-based inference in conditional moment restriction models. Econometrica, 72(6).

Knight K (1998). Limiting distributions for L1 regression estimators under general conditions. Annals of Statistics, 26.

Koenker R, Bassett G (1978). Regression quantiles. Econometrica, 46.

Koenker R (2005). Quantile Regression. Cambridge University Press.

Little R J A, Rubin D B (2002). Statistical Analysis with Missing Data, 2nd ed. Wiley, Hoboken, NJ.

Little R J, Zhang N (2011). Subsample ignorable likelihood for regression analysis with missing data. J R Stat Soc Ser C, 60.

Liu T, Yuan X (2012). Combining quasi and empirical likelihoods in generalized linear models with missing responses. Journal of Multivariate Analysis, 111.

Liu T, Yuan X, Li Z, Li Y (2013). Empirical and weighted conditional likelihoods for matched case-control studies with missing covariates. Journal of Multivariate Analysis, 119.

Liu T, Yuan X (2016). Weighted quantile regression with missing covariates using empirical likelihood. Statistics: A Journal of Theoretical and Applied Statistics, 50(1).

Lopez E M M, Keilegom I V, Veraverbeke N (2009). Empirical likelihood for non-smooth criterion functions. Scand J Stat, 36.

Owen A B (2001). Empirical Likelihood. Chapman and Hall, New York.

Qin J, Lawless J (1994). Empirical likelihood and general estimating equations. Ann Stat, 22(1).

Qin J, Zhang B, Leung D H (2009). Empirical likelihood in missing data problems. Journal of the American Statistical Association, 104.

Sherwood B, Wang L, Zhou X H (2013). Weighted quantile regression for analyzing health care cost data with missing covariates. Statistics in Medicine, 32(28).

Wang Q, Rao J N K (2002). Empirical likelihood-based inference under imputation for missing response data. Annals of Statistics, 30.

Wei Y, Ma Y, Carroll R J (2012). Multiple imputation in quantile regression. Biometrika, 99(2).

Xie Y, Zhang H (2017). Empirical likelihood in nonignorable covariate-missing data problems. Int J Biostat, 13(1).

Zhong G, Qin J (2017). Empirical likelihood method for non-ignorable missing data problems. Lifetime Data Anal, 23(1).

Table 1: Empirical bias and RMSE (in parentheses) based on 1000 simulations with n = 100, 300.

τ    n     Estimator                  β0          β1          β2
     100   $\hat\beta_{ideal}$        (0.1185)    (0.1095)    (0.1272)
           $\hat\beta_C$              (0.2403)    (0.1851)    (0.1853)
           $\hat\beta_{IPW}^{MAR}$    (0.3065)    (0.2128)    (0.2077)
           $\hat\beta_{ELW}^{MAR}$    (0.3021)    (0.2144)    (0.1945)
           $\hat\beta_{ELW}$          (0.2446)    (0.1875)    (0.1752)
     300   $\hat\beta_{ideal}$        (0.0685)    (0.0617)    (0.0676)
           $\hat\beta_C$              (0.1332)    (0.1031)    (0.1016)
           $\hat\beta_{IPW}^{MAR}$    (0.2188)    (0.1167)    (0.1150)
           $\hat\beta_{ELW}^{MAR}$    (0.2079)    (0.1116)    (0.0968)
           $\hat\beta_{ELW}$          (0.1252)    (0.1002)    (0.0873)

     100   $\hat\beta_{ideal}$        (0.1128)    (0.0997)    (0.1187)
           $\hat\beta_C$              (0.2347)    (0.1781)    (0.1765)
           $\hat\beta_{IPW}^{MAR}$    (0.2761)    (0.1851)    (0.1850)
           $\hat\beta_{ELW}^{MAR}$    (0.2794)    (0.1814)    (0.1649)
           $\hat\beta_{ELW}$          (0.2326)    (0.1685)    (0.1617)
     300   $\hat\beta_{ideal}$        (0.0648)    (0.0608)    (0.0679)
           $\hat\beta_C$              (0.1274)    (0.0979)    (0.0973)
           $\hat\beta_{IPW}^{MAR}$    (0.2040)    (0.1036)    (0.1003)
           $\hat\beta_{ELW}^{MAR}$    (0.2007)    (0.1002)    (0.0891)
           $\hat\beta_{ELW}$          (0.1238)    (0.0954)    (0.0889)

     100   $\hat\beta_{ideal}$        (0.1216)    (0.1147)    (0.1212)
           $\hat\beta_C$              (0.2471)    (0.1864)    (0.1791)
           $\hat\beta_{IPW}^{MAR}$    (0.2839)    (0.1845)    (0.1790)
           $\hat\beta_{ELW}^{MAR}$    (0.2901)    (0.1868)    (0.1694)
           $\hat\beta_{ELW}$          (0.2498)    (0.1870)    (0.1795)
     300   $\hat\beta_{ideal}$        (0.0706)    (0.0630)    (0.0708)
           $\hat\beta_C$              (0.1371)    (0.1008)    (0.1028)
           $\hat\beta_{IPW}^{MAR}$    (0.2007)    (0.1046)    (0.1033)
           $\hat\beta_{ELW}^{MAR}$    (0.1983)    (0.1006)    (0.0935)
           $\hat\beta_{ELW}$          (0.1325)    (0.0969)    (0.0964)
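The simulation design behind Table 1 can be sketched as follows. This is an illustrative reading of the design described in the simulation section, not the authors' code: the variable and function names (`s_xx`, `beta0`, `simulate`, `coef_cc`) are hypothetical, the residual variance uses the standard conditional-variance formula for a trivariate normal, and the final complete-case least-squares fit is only a sanity check that relies on the conditional mean and median coinciding under normal errors. It is not one of the five quantile regression estimators compared in the table.

```python
import numpy as np
from statistics import NormalDist

# Covariance entries used in the simulation study.
s_xx = s_yy = s_zz = 1.0
s_xz = s_xy = s_zy = 0.5

v1 = 1.0 / (s_xx * s_zz - s_xz**2)
v2 = 1.0 / (s_zz * s_yy - s_zy**2)
eta = (s_xy * s_zz - s_xz * s_zy) * v1

# True quantile regression coefficients.
beta1 = (s_xy * s_zz - s_xz * s_zy) * v1
beta2 = (s_zy * s_xx - s_xz * s_xy) * v1
sigma2 = s_yy - (s_xy**2 * s_zz - 2 * s_xy * s_xz * s_zy + s_zy**2 * s_xx) * v1

def beta0(tau):
    """tau-th quantile of N(0, sigma2), i.e. the intercept at level tau."""
    return NormalDist(0.0, sigma2**0.5).inv_cdf(tau)

# Coefficients of the logistic model for P(delta = 1 | Z, Y).
gamma0 = -0.5 * eta**2 * s_zz * v2
gamma1 = -eta * s_zy * v2
gamma2 = eta * s_zz * v2

def simulate(n, rng):
    """Draw delta ~ Bernoulli(0.5), then (X, Z, Y) | delta ~ N((delta, 0, eta*delta), Psi)."""
    delta = rng.random(n) < 0.5
    psi = np.array([[s_xx, s_xz, s_xy],
                    [s_xz, s_zz, s_zy],
                    [s_xy, s_zy, s_yy]])
    noise = rng.multivariate_normal(np.zeros(3), psi, size=n)
    mean = np.column_stack([delta, np.zeros(n), eta * delta]).astype(float)
    x, z, y = (mean + noise).T
    return x, z, y, delta

rng = np.random.default_rng(2017)
x, z, y, delta = simulate(100_000, rng)

# Sanity check on the complete cases (least squares, not quantile regression).
design = np.column_stack([np.ones(delta.sum()), x[delta], z[delta]])
coef_cc, *_ = np.linalg.lstsq(design, y[delta], rcond=None)
print("eta, beta1, beta2, sigma2:", eta, beta1, beta2, sigma2)
print("gamma:", gamma0, gamma1, gamma2)
print("complete-case least-squares fit:", coef_cc)
```

With the covariance values above, $\eta = \beta_1 = \beta_2 = 1/3$ and $\sigma^2 = 2/3$, so the complete-case fit should be close to $(0, 1/3, 1/3)$.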

Figure 1: The estimated regression coefficients $\hat\beta_C$ (-.) and $\hat\beta_{ELW}$ ( ) at various quantile levels.

Figure 2: The standard errors of $\hat\beta_C$ (-.) and $\hat\beta_{ELW}$ ( ) at various quantile levels.


More information

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Michael J. Daniels and Chenguang Wang Jan. 18, 2009 First, we would like to thank Joe and Geert for a carefully

More information

Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources

Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources Yi-Hau Chen Institute of Statistical Science, Academia Sinica Joint with Nilanjan

More information

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW SSC Annual Meeting, June 2015 Proceedings of the Survey Methods Section ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW Xichen She and Changbao Wu 1 ABSTRACT Ordinal responses are frequently involved

More information

An Encompassing Test for Non-Nested Quantile Regression Models

An Encompassing Test for Non-Nested Quantile Regression Models An Encompassing Test for Non-Nested Quantile Regression Models Chung-Ming Kuan Department of Finance National Taiwan University Hsin-Yi Lin Department of Economics National Chengchi University Abstract

More information

Introduction An approximated EM algorithm Simulation studies Discussion

Introduction An approximated EM algorithm Simulation studies Discussion 1 / 33 An Approximated Expectation-Maximization Algorithm for Analysis of Data with Missing Values Gong Tang Department of Biostatistics, GSPH University of Pittsburgh NISS Workshop on Nonignorable Nonresponse

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

A Resampling Method on Pivotal Estimating Functions

A Resampling Method on Pivotal Estimating Functions A Resampling Method on Pivotal Estimating Functions Kun Nie Biostat 277,Winter 2004 March 17, 2004 Outline Introduction A General Resampling Method Examples - Quantile Regression -Rank Regression -Simulation

More information

Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level

Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level A Monte Carlo Simulation to Test the Tenability of the SuperMatrix Approach Kyle M Lang Quantitative Psychology

More information

EM Algorithm II. September 11, 2018

EM Algorithm II. September 11, 2018 EM Algorithm II September 11, 2018 Review EM 1/27 (Y obs, Y mis ) f (y obs, y mis θ), we observe Y obs but not Y mis Complete-data log likelihood: l C (θ Y obs, Y mis ) = log { f (Y obs, Y mis θ) Observed-data

More information

Fractional Imputation in Survey Sampling: A Comparative Review

Fractional Imputation in Survey Sampling: A Comparative Review Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical

More information

Adjusted Empirical Likelihood for Long-memory Time Series Models

Adjusted Empirical Likelihood for Long-memory Time Series Models Adjusted Empirical Likelihood for Long-memory Time Series Models arxiv:1604.06170v1 [stat.me] 21 Apr 2016 Ramadha D. Piyadi Gamage, Wei Ning and Arjun K. Gupta Department of Mathematics and Statistics

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Combining multiple observational data sources to estimate causal eects

Combining multiple observational data sources to estimate causal eects Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,

More information

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION VICTOR CHERNOZHUKOV CHRISTIAN HANSEN MICHAEL JANSSON Abstract. We consider asymptotic and finite-sample confidence bounds in instrumental

More information

Comparison of multiple imputation methods for systematically and sporadically missing multilevel data

Comparison of multiple imputation methods for systematically and sporadically missing multilevel data Comparison of multiple imputation methods for systematically and sporadically missing multilevel data V. Audigier, I. White, S. Jolani, T. Debray, M. Quartagno, J. Carpenter, S. van Buuren, M. Resche-Rigon

More information

Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model

Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model 1. Introduction Varying-coefficient partially linear model (Zhang, Lee, and Song, 2002; Xia, Zhang, and Tong, 2004;

More information

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Alessandra Mattei Dipartimento di Statistica G. Parenti Università

More information

Conditional Empirical Likelihood Approach to Statistical Analysis with Missing Data

Conditional Empirical Likelihood Approach to Statistical Analysis with Missing Data Conditional Empirical Likelihood Approach to Statistical Analysis with Missing Data by Peisong Han A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

More information

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

More information

Issues on quantile autoregression

Issues on quantile autoregression Issues on quantile autoregression Jianqing Fan and Yingying Fan We congratulate Koenker and Xiao on their interesting and important contribution to the quantile autoregression (QAR). The paper provides

More information

Efficient Estimation of Population Quantiles in General Semiparametric Regression Models

Efficient Estimation of Population Quantiles in General Semiparametric Regression Models Efficient Estimation of Population Quantiles in General Semiparametric Regression Models Arnab Maity 1 Department of Statistics, Texas A&M University, College Station TX 77843-3143, U.S.A. amaity@stat.tamu.edu

More information

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,

More information

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle

More information

T E C H N I C A L R E P O R T KERNEL WEIGHTED INFLUENCE MEASURES. HENS, N., AERTS, M., MOLENBERGHS, G., THIJS, H. and G. VERBEKE

T E C H N I C A L R E P O R T KERNEL WEIGHTED INFLUENCE MEASURES. HENS, N., AERTS, M., MOLENBERGHS, G., THIJS, H. and G. VERBEKE T E C H N I C A L R E P O R T 0465 KERNEL WEIGHTED INFLUENCE MEASURES HENS, N., AERTS, M., MOLENBERGHS, G., THIJS, H. and G. VERBEKE * I A P S T A T I S T I C S N E T W O R K INTERUNIVERSITY ATTRACTION

More information

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department

More information

AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

More information

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 Z-theorems: Notation and Context Suppose that Θ R k, and that Ψ n : Θ R k, random maps Ψ : Θ R k, deterministic

More information

An Empirical Characteristic Function Approach to Selecting a Transformation to Normality

An Empirical Characteristic Function Approach to Selecting a Transformation to Normality Communications for Statistical Applications and Methods 014, Vol. 1, No. 3, 13 4 DOI: http://dx.doi.org/10.5351/csam.014.1.3.13 ISSN 87-7843 An Empirical Characteristic Function Approach to Selecting a

More information

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

Statistical Methods. Missing Data  snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23 1 / 23 Statistical Methods Missing Data http://www.stats.ox.ac.uk/ snijders/sm.htm Tom A.B. Snijders University of Oxford November, 2011 2 / 23 Literature: Joseph L. Schafer and John W. Graham, Missing

More information

Working Paper No Maximum score type estimators

Working Paper No Maximum score type estimators Warsaw School of Economics Institute of Econometrics Department of Applied Econometrics Department of Applied Econometrics Working Papers Warsaw School of Economics Al. iepodleglosci 64 02-554 Warszawa,

More information

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract Journal of Data Science,17(1). P. 145-160,2019 DOI:10.6339/JDS.201901_17(1).0007 WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION Wei Xiong *, Maozai Tian 2 1 School of Statistics, University of

More information

Toutenburg, Fieger: Using diagnostic measures to detect non-mcar processes in linear regression models with missing covariates

Toutenburg, Fieger: Using diagnostic measures to detect non-mcar processes in linear regression models with missing covariates Toutenburg, Fieger: Using diagnostic measures to detect non-mcar processes in linear regression models with missing covariates Sonderforschungsbereich 386, Paper 24 (2) Online unter: http://epub.ub.uni-muenchen.de/

More information

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)

More information

Robustness to Parametric Assumptions in Missing Data Models

Robustness to Parametric Assumptions in Missing Data Models Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice

More information

Estimation and Testing for Common Cycles

Estimation and Testing for Common Cycles Estimation and esting for Common Cycles Anders Warne February 27, 2008 Abstract: his note discusses estimation and testing for the presence of common cycles in cointegrated vector autoregressions A simple

More information

Research Article Optimal Portfolio Estimation for Dependent Financial Returns with Generalized Empirical Likelihood

Research Article Optimal Portfolio Estimation for Dependent Financial Returns with Generalized Empirical Likelihood Advances in Decision Sciences Volume 2012, Article ID 973173, 8 pages doi:10.1155/2012/973173 Research Article Optimal Portfolio Estimation for Dependent Financial Returns with Generalized Empirical Likelihood

More information

Analysis of Matched Case Control Data in Presence of Nonignorable Missing Exposure

Analysis of Matched Case Control Data in Presence of Nonignorable Missing Exposure Biometrics DOI: 101111/j1541-0420200700828x Analysis of Matched Case Control Data in Presence of Nonignorable Missing Exposure Samiran Sinha 1, and Tapabrata Maiti 2, 1 Department of Statistics, Texas

More information

Miscellanea A note on multiple imputation under complex sampling

Miscellanea A note on multiple imputation under complex sampling Biometrika (2017), 104, 1,pp. 221 228 doi: 10.1093/biomet/asw058 Printed in Great Britain Advance Access publication 3 January 2017 Miscellanea A note on multiple imputation under complex sampling BY J.

More information

Rejoinder. 1 Phase I and Phase II Profile Monitoring. Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2

Rejoinder. 1 Phase I and Phase II Profile Monitoring. Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2 Rejoinder Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2 1 School of Statistics, University of Minnesota 2 LPMC and Department of Statistics, Nankai University, China We thank the editor Professor David

More information

Empirical Likelihood Methods for Sample Survey Data: An Overview

Empirical Likelihood Methods for Sample Survey Data: An Overview AUSTRIAN JOURNAL OF STATISTICS Volume 35 (2006), Number 2&3, 191 196 Empirical Likelihood Methods for Sample Survey Data: An Overview J. N. K. Rao Carleton University, Ottawa, Canada Abstract: The use

More information

ATINER's Conference Paper Series STA

ATINER's Conference Paper Series STA ATINER CONFERENCE PAPER SERIES No: LNG2014-1176 Athens Institute for Education and Research ATINER ATINER's Conference Paper Series STA2014-1255 Parametric versus Semi-parametric Mixed Models for Panel

More information

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori Kaiji Motegi Zheng Zhang March 28, 2018 Abstract This paper investigates the estimation of semiparametric

More information

Eric Shou Stat 598B / CSE 598D METHODS FOR MICRODATA PROTECTION

Eric Shou Stat 598B / CSE 598D METHODS FOR MICRODATA PROTECTION Eric Shou Stat 598B / CSE 598D METHODS FOR MICRODATA PROTECTION INTRODUCTION Statistical disclosure control part of preparations for disseminating microdata. Data perturbation techniques: Methods assuring

More information

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction

More information

Imputation Algorithm Using Copulas

Imputation Algorithm Using Copulas Metodološki zvezki, Vol. 3, No. 1, 2006, 109-120 Imputation Algorithm Using Copulas Ene Käärik 1 Abstract In this paper the author demonstrates how the copulas approach can be used to find algorithms for

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Topics and Papers for Spring 14 RIT

Topics and Papers for Spring 14 RIT Eric Slud Feb. 3, 204 Topics and Papers for Spring 4 RIT The general topic of the RIT is inference for parameters of interest, such as population means or nonlinearregression coefficients, in the presence

More information

Improving linear quantile regression for

Improving linear quantile regression for Improving linear quantile regression for replicated data arxiv:1901.0369v1 [stat.ap] 16 Jan 2019 Kaushik Jana 1 and Debasis Sengupta 2 1 Imperial College London, UK 2 Indian Statistical Institute, Kolkata,

More information

Analyzing Pilot Studies with Missing Observations

Analyzing Pilot Studies with Missing Observations Analyzing Pilot Studies with Missing Observations Monnie McGee mmcgee@smu.edu. Department of Statistical Science Southern Methodist University, Dallas, Texas Co-authored with N. Bergasa (SUNY Downstate

More information

Basics of Modern Missing Data Analysis

Basics of Modern Missing Data Analysis Basics of Modern Missing Data Analysis Kyle M. Lang Center for Research Methods and Data Analysis University of Kansas March 8, 2013 Topics to be Covered An introduction to the missing data problem Missing

More information

Tobit and Interval Censored Regression Model

Tobit and Interval Censored Regression Model Global Journal of Pure and Applied Mathematics. ISSN 0973-768 Volume 2, Number (206), pp. 98-994 Research India Publications http://www.ripublication.com Tobit and Interval Censored Regression Model Raidani

More information

Consistent Parameter Estimation for Conditional Moment Restrictions

Consistent Parameter Estimation for Conditional Moment Restrictions Consistent Parameter Estimation for Conditional Moment Restrictions Shih-Hsun Hsu Department of Economics National Taiwan University Chung-Ming Kuan Institute of Economics Academia Sinica Preliminary;

More information

F-tests for Incomplete Data in Multiple Regression Setup

F-tests for Incomplete Data in Multiple Regression Setup F-tests for Incomplete Data in Multiple Regression Setup ASHOK CHAURASIA Advisor: Dr. Ofer Harel University of Connecticut / 1 of 19 OUTLINE INTRODUCTION F-tests in Multiple Linear Regression Incomplete

More information

arxiv: v2 [stat.me] 17 Jan 2017

arxiv: v2 [stat.me] 17 Jan 2017 Semiparametric Estimation with Data Missing Not at Random Using an Instrumental Variable arxiv:1607.03197v2 [stat.me] 17 Jan 2017 BaoLuo Sun 1, Lan Liu 1, Wang Miao 1,4, Kathleen Wirth 2,3, James Robins

More information

arxiv: v1 [stat.me] 22 Mar 2010

arxiv: v1 [stat.me] 22 Mar 2010 A longest run test for heteroscedasticity in univariate regression model arxiv:1003.4156v1 [stat.me] 22 Mar 2010 Aubin Jean-Baptiste a, Leoni-Aubin Samuela b a Université de Technologie de Compiègne, Rue

More information

6. Fractional Imputation in Survey Sampling

6. Fractional Imputation in Survey Sampling 6. Fractional Imputation in Survey Sampling 1 Introduction Consider a finite population of N units identified by a set of indices U = {1, 2,, N} with N known. Associated with each unit i in the population

More information

Chapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70

Chapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Chapter 5: Models used in conjunction with sampling J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Nonresponse Unit Nonresponse: weight adjustment Item Nonresponse:

More information

5 Methods Based on Inverse Probability Weighting Under MAR

5 Methods Based on Inverse Probability Weighting Under MAR 5 Methods Based on Inverse Probability Weighting Under MAR The likelihood-based and multiple imputation methods we considered for inference under MAR in Chapters 3 and 4 are based, either directly or indirectly,

More information

Columbia University. Columbia University Biostatistics Technical Report Series. Handling Missing Data by Deleting Completely Observed Records

Columbia University. Columbia University Biostatistics Technical Report Series. Handling Missing Data by Deleting Completely Observed Records Columbia University Columbia University Biostatistics Technical Report Series Year 2006 Paper 13 Handling Missing Data by Deleting Completely Observed Records Cuiling Wang Myunghee Cho Paik Columbia University,

More information

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data Yujun Wu, Marc G. Genton, 1 and Leonard A. Stefanski 2 Department of Biostatistics, School of Public Health, University of Medicine

More information

Discussion of Identifiability and Estimation of Causal Effects in Randomized. Trials with Noncompliance and Completely Non-ignorable Missing Data

Discussion of Identifiability and Estimation of Causal Effects in Randomized. Trials with Noncompliance and Completely Non-ignorable Missing Data Biometrics 000, 000 000 DOI: 000 000 0000 Discussion of Identifiability and Estimation of Causal Effects in Randomized Trials with Noncompliance and Completely Non-ignorable Missing Data Dylan S. Small

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai Weighting Methods Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Weighting Methods Stat186/Gov2002 Fall 2018 1 / 13 Motivation Matching methods for improving

More information

Subsample ignorable likelihood for regression analysis with missing data

Subsample ignorable likelihood for regression analysis with missing data Appl. Statist. (2011) 60, Part 4, pp. 591 605 Subsample ignorable likelihood for regression analysis with missing data Roderick J. Little and Nanhua Zhang University of Michigan, Ann Arbor, USA [Received

More information

Estimation for two-phase designs: semiparametric models and Z theorems

Estimation for two-phase designs: semiparametric models and Z theorems Estimation for two-phase designs:semiparametric models and Z theorems p. 1/27 Estimation for two-phase designs: semiparametric models and Z theorems Jon A. Wellner University of Washington Estimation for

More information

EMPIRICAL LIKELIHOOD ANALYSIS FOR THE HETEROSCEDASTIC ACCELERATED FAILURE TIME MODEL

EMPIRICAL LIKELIHOOD ANALYSIS FOR THE HETEROSCEDASTIC ACCELERATED FAILURE TIME MODEL Statistica Sinica 22 (2012), 295-316 doi:http://dx.doi.org/10.5705/ss.2010.190 EMPIRICAL LIKELIHOOD ANALYSIS FOR THE HETEROSCEDASTIC ACCELERATED FAILURE TIME MODEL Mai Zhou 1, Mi-Ok Kim 2, and Arne C.

More information

7 Sensitivity Analysis

7 Sensitivity Analysis 7 Sensitivity Analysis A recurrent theme underlying methodology for analysis in the presence of missing data is the need to make assumptions that cannot be verified based on the observed data. If the assumption

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F

More information

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Ann Inst Stat Math (2009) 61:773 787 DOI 10.1007/s10463-008-0172-6 Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Taisuke Otsu Received: 1 June 2007 / Revised:

More information

What s New in Econometrics. Lecture 15

What s New in Econometrics. Lecture 15 What s New in Econometrics Lecture 15 Generalized Method of Moments and Empirical Likelihood Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Generalized Method of Moments Estimation

More information

Reconstruction of individual patient data for meta analysis via Bayesian approach

Reconstruction of individual patient data for meta analysis via Bayesian approach Reconstruction of individual patient data for meta analysis via Bayesian approach Yusuke Yamaguchi, Wataru Sakamoto and Shingo Shirahata Graduate School of Engineering Science, Osaka University Masashi

More information

On the equivalence of confidence interval estimation based on frequentist model averaging and least-squares of the full model in linear regression

On the equivalence of confidence interval estimation based on frequentist model averaging and least-squares of the full model in linear regression Working Paper 2016:1 Department of Statistics On the equivalence of confidence interval estimation based on frequentist model averaging and least-squares of the full model in linear regression Sebastian

More information

MIVQUE and Maximum Likelihood Estimation for Multivariate Linear Models with Incomplete Observations

MIVQUE and Maximum Likelihood Estimation for Multivariate Linear Models with Incomplete Observations Sankhyā : The Indian Journal of Statistics 2006, Volume 68, Part 3, pp. 409-435 c 2006, Indian Statistical Institute MIVQUE and Maximum Likelihood Estimation for Multivariate Linear Models with Incomplete

More information

IV Quantile Regression for Group-level Treatments, with an Application to the Distributional Effects of Trade

IV Quantile Regression for Group-level Treatments, with an Application to the Distributional Effects of Trade IV Quantile Regression for Group-level Treatments, with an Application to the Distributional Effects of Trade Denis Chetverikov Brad Larsen Christopher Palmer UCLA, Stanford and NBER, UC Berkeley September

More information

Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity

Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity Jörg Schwiebert Abstract In this paper, we derive a semiparametric estimation procedure for the sample selection model

More information