Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests

Size: px
Start display at page:

Download "Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests"

Transcription

1 Biometrika (2014),,, pp C 2014 Biometrika Trust Printed in Great Britain Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests BY M. ZHOU Department of Statistics, University of Kentucky, Lexington, Kentucky, , U.S.A. mai@ms.uky.edu 5 AND M.-O. KIM Division of Biostatistics and Epidemiology, Cincinnati Children s Hospital Medical Center, Cincinnati, Ohio , U.S.A. miok.kim@cchmc.org SUMMARY 10 Empirical likelihood is a general non-parametric inference methodology using likelihood principle in a way that is analogous to that used with parametric likelihoods. In a wide range of applications the methodology was shown to provide likelihood ratio statistics that have limiting chi-square distributions and observe a nonparametric version of Wilks theorem. Amongst recent extensions to the analysis of censored data, longitudinal data and semi-parametric regres- 15 sion model, however, the property only remained true in some but not in others. This motivates our discussion of relative optimality of extended empirical likelihood methods. We compare extended empirical likelihood methods and evaluate their relative optimality by comparing the confidence regions provided by inverting the likelihood ratio tests. We show that those extension methods which yield the likelihood ratio statistic observing the Wilks theorem provides the 20 smallest confidence region. Specific examples are provided for the case of censored data analysis and estimating equations involving nuisance parameters. Some key words: Empirical Likelihood, Likelihood Ratio, Wilks Confidence region. 1. INTRODUCTION Empirical likelihood is a general non-parametric inference method using likelihood principle 25 in a way that is analogous to that used with parametric likelihoods. Since its introduction by Owen (1988, 1990), the methodology has been extended to the analysis of censored data (Li, Li & Zhou (2005)), longitudinal data (e.g., Wang, Qian & Carroll (2010)) and semi-parametric regression model (e.g., Shi & Lau (2000)). In a wide range of applications as outlined in Owen (2001) empirical likelihood ratio statistics were shown to have limiting chi-square distributions 30 and observe a nonparametric version of Wilks theorem. This property remains true with some extended methods but not with others, however. This motivates discussion of relative optimality of extended methods. In this paper we consider several extension of empirical likelihood methods that have the likelihood function maximized at the same location, thus yielding equivalent maximum empirical 35 likelihood estimators, but provide different likelihood ratio statistics. We discuss the relative optimality of the methods by comparing the confidence regions provided by inverting the likelihood

2 M. ZHOU AND M.-O. KIM ratio tests. We show that the method that yields the likelihood ratio statistic observing the Wilks theorem provides the smallest confidence region. The remainder of this paper proceeds as follows. Section 2 formally defines a set-up where multiple empirical likelihood methods can be defined that yield equivalent maximum empirical likelihood estimators. The section also provides asymptotic results for the comparison of the corresponding confidence regions. Section 3 describes specific examples for the cases of censored data analysis and estimating equations involving nuisance parameters. Section 4 provides empirical results. Section 5 concludes. All the proofs are deferred to the Appendix. 2. COMPARISON OF EMPIRICAL LIKELIHOOD CONFIDENCE REGIONS 50 Given a random sample Z n = {Z i } n with a common cumulative distribution function F 0 and Z 1 R k, we consider an (extended) empirical likelihood defined for some discrete cumulative distribution function F (denoted by EL(Z n, F )) that gives rise to empirical likelihood ratio test for a finite parameter θ = T (F ) R p with the test statistic 2 log ELR n (θ) = 2[max F F log EL(Z n, F ) max F log EL(Z n, F )], 55 where F is a set of discrete cumulative distribution functions that satisfy the constraint T (F ) = θ. The true value of the parameter is given as θ 0 = T (F 0 ), and the maximum empirical likelihood estimator is given as ˆθ n = T (F n ) by the maximizer F n = argmax F log EL(Z n, F ). We invert the likelihood ratio test and obtain a confidence region C n = { θ 2 log ELR n (θ) c}, where the critical value c is chosen according to the (asymptotic) distribution of 2 log ELR n (θ) under null hypothesis for some desired confidence level. The asymptotic distribution is given by the following results: under some regularity conditions n 1/2 (ˆθ n θ 0 ) U (1) 2 log ELR n (θ 0 ) = n(ˆθ n θ 0 ) V 1 (ˆθ n θ 0 ) + o p (1), where U is a p variate random variable with U N p (0, W ), and W and V are p p (semi- )positive definite matrices. Without loss of generality we consider two methods that provide the same maximum empirical likelihood estimators, ˆθ 1n = ˆθ 2n, but different likelihood ratio statistics ELR 1n (θ) and ELR 2n (θ). We suppose that the asymptotic results in (1) hold with V = V 1 and V = V 2 with W V1 1 = I and W V2 1 I for ELR 1n (θ) and ELR 2n (θ) respectively. By the results of Hjort, McKeague & Keilegom (2009) and Qin & Lawless (1994), the empirical likelihood with V = V 1 admits a chi-square with p degree of freedom as the limiting distribution of 2 log ELR 1n (θ), whereas the empirical likelihood with V = V 2 admits a weighted sum of p independent chi-squares of 1 degree of freedom as the limiting distribution. We consider respectively defined confidence regions with the two methods at the same confidence level as follows: C 1n = { θ 2 log ELR 1n (θ) c 1 }, C 2n = { θ 2 log ELR 2n (θ) c 2 }, 75 where the critical values c 1 and c 2 are appropriately chosen by the limiting distributions. C 1n and C 2n are centered at the same value in the sense that 2 log ELR 1n (ˆθ n ) = 2 log ELR 2n (ˆθ n ) = 0 where ˆθ n denotes the common maximum empirical likelihood estimator. The following theorem shows the asymptotic relative optimality of C 1n (θ).

3 Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests 3 THEOREM 1. Assume regularity conditions under which the results in (1) holds for ELR 1n (θ) and ELR 2n (θ) respectively. Then, for confidence levels 1 αn 1/2 (0, 1), lim V ol(c 1n) < lim V ol(c 2n), n n where V ol(c) for any measurable set C in R p is defined as V ol(c) = R p I [t C] µ(dt) with µ the Lebesgue measure in R p. Theorem 1 implies that the empirical likelihood method that provides the likelihood ratio statistic which is only incompletely self-studentizing is not asymptotically optimal. Remark 1. In several papers (e.g., Wang & Jing (2001), Wang & Li (2002)), it is suggested that 80 one may adjust the empirical likelihood ratio of type 2 log ELR 2n by multiplying a factor, that will turn the asymptotic null distribution into a regular chi square, and may even improve the performance of the resulting confidence regions. It is clear that if the multiplying factor is (asymptotically) a constant and does not depend on θ, then the resulting confidence regions are the same as before, in the sense that the shape, size and orientation of the resulting confidence region is the 85 same as before. To achieve optimality, the multiplying factor must depend on θ and be the value log ELR 1n (θ)/ log ELR 2n (θ) = (ˆθ θ 0 ) V1 1 (ˆθ θ 0 )/(ˆθ θ 0 ) V2 1 (ˆθ θ 0 ) + o p (1). Obviously, computing this factor is equivalent to computing the statistic 2 log ELR 1n, if not more difficult. 3. EXTENDED EMPIRICAL LIKELIHOODS 90 In this section we present examples of extended methods that provide likelihood ratio statistics that do not observe the non-parametric version of the Wilks theorem. In each of these examples the construction of the likelihood does not match the structure of data. The ill-constructed likelihoods provide the likelihood ratio statistics that are only incompletely self-studentizing Analysis of Estimating Equations Involving Nuisance Parameters 95 We suppose that the unknown parameters θ R p is related to the common cumulative distribution function F 0 via a p variate vector of estimating functions involving finite dimensional nuisance parameters ψ R q, m(, θ, ψ), so E{m(Z 1, θ 0, ψ 0 )} = 0 for the true values θ 0 and ψ 0. We also suppose a q variate vector of estimating functions h(, θ, ψ) exists that defines the true parameter values θ 0 and ψ 0 uniquely as a solution to the following estimating equations jointly: 100 E{m(Z 1, θ 0, ψ 0 )} = 0, E{h(Z 1, θ 0, ψ 0 )} = 0. We consider empirical likelihood jointly for θ and ψ as follows: { n EL n (θ, ψ) = max w i m(z i, θ, ψ) = 0, w i h(z i, θ, ψ) = 0, w i } w i = 1, 0 w i, where w i denotes the point mass assigned to the i th observation Z i. Under some regular- 105 ity conditions, this likelihood is maximized by w i = n 1, i = 1,..., n, and the maximum empirical likelihood estimator (ˆθ n, ˆψ n ) is given by the solution to the following equations: n 1 n m(z i, θ, ψ) = 0, n 1 n h(z i, θ, ψ) = 0. Two methods have been proposed for the inference of θ alone. Qin and Lawless (1994) used the profile likelihood EL 1n (θ) = max ψ EL n (θ, ψ). The other or so-called plug-in method 110

4 M. ZHOU AND M.-O. KIM (Hjort, McKeague & Keilegom (2009)) used the following likelihood: { n } EL 2n (θ) = max w i m(z i, θ, ψ) = 0, w i = 1, 0 w i, w i where ψ is a n consistent estimator, so that E{m(Z i, θ 0, ψ)} = 0. Without externally given ψ available, a common choice is the maximum empirical likelihood estimator ˆψ n. With this particular choice, the plug-in method has ˆθ n as the maximum empirical likelihood estimator and max θ EL 2n (θ) = n n 1, same as the profile likelihood method. The likelihood ratio statistics are given respectively: ELR 1n (θ) = max ψ EL n(θ, ψ)/ n n 1, ELR 2n (θ) = EL 2n (θ)/ n n 1. (2) The following theorem shows that results in (1) hold for the profile and the plug-in method. THEOREM 2. Assume conditions under which the below results hold: ( ) ˆθn n 1/2 θ 0 N p (0, Σ), ˆψ n ψ 0 ( ) in distribution, where Σ = {S 12 S11 1 m/ θ m/ ψ S 21} 1 with S 12 = S21 = E, S h/ θ h/ ψ 11 = ( mm E mh ) hm hh, and the expectations taken at θ = θ 0 and ψ = ψ 0. Also assume that for any ψ with ψ ψ 0 = O p (n 1/2 ), n 1 n m(z i, θ 0, ψ)m (Z i, θ 0, ψ) E(mm ). Suppose Σ (jk), j, k = 1, 2, denotes the blocks of Σ that correspond to θ and ψ. Then, [ 1 U N p (0, Σ (11) ), W = V 1 = Σ (11), V 2 = E( m/ θ) {E(mm )} E( m/ θ)] 1. Specifically, 2 log ELR 2n (θ 0 ) p j=1 c jξ j in distribution, where ξ j are independent chisquared distributed random variables with degree of freedom 1 and c 1,..., c p are eigenvalues of V 2 W 1. The weighted χ 2 asymptotic results follow from the ill-construction of the likelihood: the likelihood of EL 2n (θ) takes the same form as that of Owen s likelihood for the case of independent data, whereas m(z i, θ, ˆψ n ), i = 1,..., n, are not independent anymore as they all involve ˆψ n Censored Data Analysis We suppose that {Z i } n = {(T i, δ i )} n where (T i, δ i ) denote right censored observations from lifetime variables {Y i } with a common cumulative distribution function F 0 subject to random censoring as follows: T i = min(y i, C i ) and δ i = I[Y i C i ] for i = 1,..., n. The censoring variable C i are assumed independent of the Y i with a cumulative distribution function G 0. We suppose the true value θ 0 of an unknown parameter θ R p is related to F 0 via estimating equations such that E{g(Y i )} = θ 0, where g = (g 1,..., g p ) are p functions. We examine three empirical likelihood methods that have been proposed for the inference of θ.

5 Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests 5 Censored Likelihood Method: The first method uses the empirical likelihood defined for the censored data. Kaplan & Meier (1958), Owen (2001) and others have defined the empirical likelihood of the censored data {Z i } n for some cumulative distribution function F as n EL(Z n, F ) = { F (T i )} δ i { j:t j >T i F (T j )} 1 δi where F (t) = F (t+) F (t ) is the jump of F at t or a probability mass assigned to the point 135 t. As in the case of uncensored case, the definition assumes a discrete F ( ) with possible jumps only at the observed times. With w i = F (T i ), i = 1,..., n, the likelihood above can be written in term of the jumps and the log likelihood is log EL 1n (F ) = δ i log w i + (1 δ i ) log w j I [Tj >T i ], (3) where I [ ] denotes an indicator function. We note that the likelihood is maximized at the well known Kaplan-Meier estimator (Kaplan & Meier (1958)) with w i = ˆF KM (T i ), i = 1,..., n. Using the empirical likelihood of the censored data in (3) Zhou (2011) proposed the empirical likelihood for θ as EL 1n (θ) = max w i n w δ i i w j I [Tj >T i ] j=1 1 δ i j=1 w i g(t i ) = θ, w i = 1, 0 w i. The maximum empirical likelihood estimator ˆθ n is given by the equation, g(t i ) ˆF KM (T i ) = ˆθ n. (4) The likelihood ratio statistic is given as 2 log ELR 1n (θ) = 2 log{el 1n (θ)/el 1n (ˆθ n )} ac- 140 cordingly. Synthetic data and uncensored empirical likelihood method: Recall the parameter θ is defined as Eg(Y i ) = θ. The second method was motivated by the following lemma, which can be proved easily. LEMMA 1. For any function g( ) such that Eg(Y i ) is well defined, we consider X i = 145 {g(t i )δ i }/{1 G 0 (T i )}, where G 0 ( ) is the cumulative distribution function of censoring variable C i. Then, we have Eg(Y i ) = E(X i ). If G 0 were known, {X i } n or the synthetic data would be completely observed and the empirical likelihood of uncensored data could be considered as follows: { n } EL 2n (θ) = max w i w i X i = θ, w i = 1, 0 w i. w i The first and second methods utilize different likelihoods. EL 1n (θ) is based on the likelihood of censored data {(T i, δ i )} n with w i denoting the point mass of a discrete cumulative function of Y i. EL 2n (θ) is based on the likelihood of {X i } n with w i denoting the point mass of a cu- 150 mulative distribution function of X i, assuming the knowledge of the censoring time distribution G 0.

6 6 M. ZHOU AND M.-O. KIM Usually G 0 (t) or 1 G 0 (t) is not known, and thus X i are not available. An alternative method based on the synthetic data uses an estimator Ĝ(t), often the Kaplan-Meier estimator ĜKM, instead and defines the likelihood as { n } EL 2n(θ) = max w i w i X w i = θ, w i = 1, 0 w i, i where Xi = {g(t i)δ i }/{1 ĜKM(T i )}. The replacement, however, creates a problem that Xi are not independent since all of them involve the same estimator ĜKM(T i ). Since ĜKM(t) is uniformly consistent for G 0 (t), the mean of the (no longer independent identical) sequence {Xi } is still asymptotically same as Eg(Y ), but the variance of the sequence {Xi } is different from the variance of X i. To be precise, under some regularity conditions, we can show that as n, n( X θ 0 ) N(0, σ1 2) and n( X θ 0 ) N(0, σ2 2 ) in distribution with σ1 2 > σ2 2, if g( ) is a scalar function (p = 1). This result similarly holds in a multivariate case (p > 1). We refer to Srinivasan & Zhou (1994) for the calculation and the expression of the asymptotic variances or variance-covariance matrix. Like the plug-in method, the second synthetic data method uses the empirical likelihood method developed for independent identical sequences, but applies it to the sequence {Xi } that is not independent and identical. Interestingly, if we use the Kaplan-Meier estimator Ĝ KM (t) in the above, the empirical likelihood [ EL 2n (θ) is ] maximized when θ takes the value ˆθ n defined in (4), as ˆF KM (T i ) = δ i / n{1 ĜKM(T i )}. Therefore, even though the likelihoods EL 1n (θ) and EL 2n (θ) looks different, they maximize at the same value, and thus the confidence regions obtained by inverting the tests, are centered at the same point ˆθ n. Accordingly the likelihood ratio statistic is given as 2 log ELR 2n(θ) = 2 log{el 2n(θ)/EL 2n(ˆθ n )}. This method is also a plug-in method in the sense that we plug ĜKM(t) for G 0 (t). It differs from the plug-in method described in section 3 1 in that the problem has been switched from original data to synthetic data before plug-in and the parameter that got plugged is of infinite dimensional. Weighted empirical likelihood: This third method was proposed by Ren (2001, 2008). Given fixed weights v i 0, a weighted empirical likelihood is defined as follow: n EL 3n (w 1,..., w n ) = (w i ) v i, w i = 1. It is not hard to show that EL 3n is maximized when w i = v i / j v j. With the particular choice of v i = ˆF KM (T i ) the maximizer of EL 3n is wi = ˆF KM (T i ), the jump of Kaplan-Meier. Thus, with this choice of weights, the confidence regions obtained by inverting the likelihood ratio test defined below are also centered at the same value ˆθ n defined in (4). The weighted empirical likelihood under the null hypothesis can be computed by solving the constrained maximization problem: given v i = ˆF KM (T i ), { n } EL 3n (θ) = max (w i ) v i w i δ i g(t i ) = θ, w i = 1, 0 w i. w i The likelihood ratio statistic is given as 2 log ELR 3n (θ) = 2 log{el 3n (θ)/el 3n (ˆθ n )}. In this likelihood, observations i and j are supposed to be independent, from which the product form i

7 Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests 7 of the empirical likelihood rises. With v i = ˆF KM (T i ), however, observations are implicitly connected via the weights. The following theorem shows that results in (1) hold for all three methods. THEOREM 3. Assume mild regularity conditions under which the Kaplan-Meier integral is consistent and asymptotically normal. Then, 180 n(ˆθn θ 0 ) U, (5) in distribution, where U N(0, Σ). Furthermore, 2 log ELR 1n (θ 0 ) = n(ˆθ n θ 0 ) V 1 1n (ˆθ n θ 0 ) + o p (1), (6) 2 log ELR2n(θ 0 ) = n(ˆθ n θ 0 ) V2n 1 n θ 0 ) + o p (1), (7) 2 log ELR 3n (θ 0 ) = n(ˆθ n θ 0 ) V3n 1 n θ 0 ) + o p (1), (8) where lim n V 1n = Σ, lim n V 2n Σ and lim n V 3n Σ. 185 It follows form Theorem 3 and Hjort, McKeague & Keilegom (2009) that 2 log ELR 1n (θ 0 ) χ 2 p in distribution whereas the Wilk s theorem does not hold for 2 log ELR2n (θ 0) and 2 log ELR 3n (θ 0 ). Theorems 1 and 3 prove that the confidence region provided by inverting the test 2 log ELR 1n (θ 0 ) by the censored empirical likelihood method is asymptotically optimal EMPIRICAL STUDIES We illustrate the relative optimality of confidence regions using two simulation studies regarding the cases of estimating equations and censored data analyses. Confidence regions were obtained by inverting the likelihood ratio tests described above. In each case the extended likelihood methods that yield the likelihood ratio statistics that do not observe the Wilk s theorem 195 are computationally challenging as the asymptotic distributions are empirically estimated. The associated confidence regions are also sub-optimal Simulation Studies 1 We used the following regression model with heteroscedastic errors and applied the profile likelihood and the plug-in method of section 3 1 the inference of two slope parameters: y i = α + β 1 x 1i + β 2 x 2i + e i x 2i, where x 1i N(0, 1), x 2i χ 2 1 and e i N(0, 1). Here the parameters of interest is θ = (β 1, β 2 ) and the intercept is the nuisance parameter (ψ = α). Consider Z i = (Z 1i, Z 2i ) with Z 1i = y i 200 and Z 2i = (x 1i, x 2i ). The estimating functions m(z i, θ, ψ) and h(z i, θ, ψ) are given respectively as m(z i, θ, ψ) = (Z 1i θ Z 2i )Z 2i and h(z i, θ, ψ) = (Z 1i θ Z 2i ). Figure 1 shows empirically estimated the null distributions of the resulting two likelihood ratio statistics i.e., the profile likelihood ratio statistic 2 log ELR 1n (θ 0 ) and the plug-in likelihood ratio statistic 2 log ELR 2n (θ 0 ) in (2), based on 5,000 Monte Carlo samples. The Q-Q plots show that the 205 null distribution of the plug-in likelihood ratio statistic remains deviating from a chi-squared distribution with degree of freedom 2 even in a large sample case. This implies that the plug-in likelihood ratio statistic remains only incompletely studentizing. Figure 2 shows an example of confidence regions. The confidence regions provided by the plug-in empirical likelihood method are larger relative to those by the profile likelihood and 210 shaped differently.

8 8 M. ZHOU AND M.-O. KIM Profile EL: n=100 Plug in EL: n=100 QQ Plot: n=100 Density Density Plug in EL LLR LLR Profile EL Profile EL: n=2000 Plug in EL: n=2000 QQ Plot: n=2000 Density Density Plug in EL LLR LLR Profile EL Fig. 1. Empirically estimated null distributions of the profile and the plug-in likelihood ratio statistic for the case of estimating equations with nuisance parameters n=100 n= True Value MELE Profile EL Plug in EL Fig. 2. Confidence regions drawn for a sample of the case of estimating equations with nuisance parameters 4 2. Simulation Studies 2 We used the following censored data setting: lifetime Y i and censoring times C i are independent identical with Y i exp(1) and C i exp(0.6).

9 Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests % confidence regions EL region, solid line weighted EL region, dash line plug in EL region, dotted line Fig. 3. Confidence regions drawn for a sample data of n = 1, 000 subject to 37% censoring From the above Y i and C i we obtain T i and δ i as described in section 3 2. The two expectations 215 we are going to test is EY i = a 1 ; EI[Y i > 0.5] = a 2 for various values of θ = (a 1, a 2 ). The true value of a 1 = 1 and a 2 = e 0.5. As implied by Theorem 3, the null distribution of the censored empirical likelihood ratio statistic is known, whereas the null distributions of the plug-in and the weighted empirical likelihood ratio statistics 2 log ELR 2n (θ 0) and 2 log ELR 3n (θ 0 ) need to be estimated. We empirically 220 estimated the null distributions based on 10,000 Monte Carlo samples and used the simulated null distribution to set the confidence level. Figure 3 shows three confidence regions for θ = (a 1, a 2 ) drawn for a sample data with the sample size n = 1, 000 subject to 37% censoring. We see that the orientation and shape of the confidence regions by the plug-in and the weighted empirical likelihood methods (dotted/dashed lined confidence region) are very different from the 225 confidence region by the censored empirical likelihood method (solid lined confidence region). Also, the confidence region by the censored empirical likelihood method is the one with the smallest area. 5. CONCLUSION In this paper we consider extensions of empirical likelihood method with some providing 230 likelihood ratio statistics that observe a nonparametric version of Wilks theorem similarly as the Owen s empirical likelihood ratio statistic and others not. Limiting to the extensions that provide the same maximum empirical likelihood estimators, we evaluate relative optimality of the methods by comparing the confidence regions obtained by inverting the likelihood ratio tests.

10 M. ZHOU AND M.-O. KIM We show that the extension method which yield the likelihood ratio statistic observing the Wilks theorem provides the smallest confidence region. ACKNOWLEDGEMENT First author is supported by US NSF Grant DMS Second author is supported by US NSF Grant DMS APPENDIX Proof of Theorem 1. It follows from (1) that C 1n and C 2n are asymptotically equivalent to C 1n = { θ n(ˆθ θ) V 1 1 (ˆθ θ) c 1 }, C 2n = { θ n(ˆθ θ) V 1 2 (ˆθ θ) c 2 }, where n 1/2 (ˆθ θ) N p (u, W ) for θ = θ 0 + n 1/2 u with a p variate constant vector u. By Lemma 2 below, we have lim n V ol(c1n ) < lim n V ol(c2n ) unless V 2 = W LEMMA 2. (i) Consider two types of confidence regions based on an estimator ˆθ that has distribution N(θ 0, I): C 1 = {θ : (ˆθ θ) (ˆθ θ) < c}, C 2 = {θ : (ˆθ θ) B(ˆθ θ) < c }, where B I is a symmetric non-negative definite matrix. When both confidence region have same coverage probability 1 α (0, 1), C 1 is superior to C 2 in the sense that V ol(c 1 ) < V ol(c 2 ). (ii) If ˆθ is normally distributed with N(θ 0, Σ), then the optimal confidence region for the parameter θ 0 is {θ : (ˆθ θ)σ 1 (ˆθ θ) < c}. Remark 2. Obviously if B = σ 2 I, then the two confidence regions are identical. The constant σ 2 will be canceling out when adjust the cut off level c. So the Lemma should be understood as to use any symmetric non-negative definite matrix other than σ 2 I type. Remark 3. If confidence regions are allowed not centered at ˆθ then there may be better regions available (Stein phenomena). We only consider a limited class of confidence regions that are centered at the same ˆθ since they are derived from extended empirical likelihood ratio tests. Proof of Lemma 2. First we notice (ii) is an easy consequence of (i), so we only prove (i). Note that the coverage probability of C 1 is θ 0 x 2 <c f(x)dx, where f(x) denotes the density of ˆθ. Define a distance by d B (θ 0, ˆθ) = (θ 0 ˆθ) B(θ 0 ˆθ). Then similarly the coverage probability of C 2 is d B (θ 0 x)<c f(x)dx. By substitute y = ˆθ θ 0 then y N(0, I), and the two integrals become f(y)dy y 2 <c and f(y)dy. d B (y)<c Denote the regions of the two integrations above as R 1 and R 2. Also, let R 1 R 2 = P. Then the requirement of equal coverage probability imply f(y)dy = f(y)dy and f(y)dy = f(y)dy. (9) R 1 R 2 R 1 \P R 2 \P Now, the density f(y) on R 1 \ P has values at least as large as f(y ) = K for a y with y 2 = c, since the multi-variate normal N(0, I) density is a decreasing function of y 2.

11 Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests 11 On the other hand, the density f(y) on R 2 \ P has value at most K = f(y ). Thus 265 f(y)dy > Kdy = K dy = K V ol(r 1 \ P ) R 1 \P R 1 \P and by (9) the left hand side is also equal to f(y)dy < R 2 \P R 2 \P R 1 \P Kdy = K V ol(r 2 \ P ). By the above two inequalities, we have V ol(r 2 \ P ) > V ol(r 1 \ P ) and thus we also have V ol(r 2 ) > V ol(r 1 ). It is obvious that the inequality is strict unless R 2 coincide with R 1. Proof of Theorem 2. Define Q n = ( n 1 n m(z i, θ 0, ψ 0 ), n 1 n h(z i, θ 0, ψ 0 ) ). From the proof of Theorem 1 of Qin and Lawless (1994) we have 270 n 1/2 (ˆθ n θ 0 ) = ( Σ (11) Σ (12) ) S21 S 1 11 (n1/2 Q n ) + o p (1), (10) n 1/2 ( ˆψ n ψ 0 ) = ( Σ (21) Σ (22) ) S21 S 1 11 (n1/2 Q n ) + o p (1). Also from the proof of Corollary 5 of Qin and Lawless (1994), we have [ 2 log ELR 1n (θ 0 ) = (n 1/2 Q n ) S11 1 { S 12 ΣS 21 S 12(2) S21(2) S11 1 S } ] 1 12(2) S21(2) S11 1 (n1/2 Q n ), where S 21(2) = S12(2) = E { ( m/ ψ), ( h/ ψ) }. After some matrix manipulation we have 275 [ 2 log ELR 1n (θ 0 ) = (n 1/2 Q n ) S11 1 S 12 Σ ( )] { S 21(2) S11 1 S } 1 S 21 S11 1 (n1/2 Q n ) 12(2) ( ) = (n 1/2 Q n ) S11 1 S Σ(11) 12 (Σ Σ (11) ) 1 [Σ (11), Σ (12) ]S 21 S11 1 (n1/2 Q n ) (21) = n(ˆθ n θ 0 ) V 1 1 (ˆθ n θ 0 ), where V 1 = Σ (11). On the other hand, { ( 2 log ELR 2n (θ 0 ) = 2 log 1 + λ m(z i, θ 0, ˆψ ) ( n ) log 1 + ˆλ m(z i, ˆθ n, ˆψ n )) }, 280 where ˆλ and λ are defined by solutions to the equations 0 = n 1 m(z i, ˆθ n, ˆψ n ) 1 + ˆλ m(z i, ˆθ n, ˆψ n ), 0 = n 1 m(z i, θ 0, ˆψ n ) 1 + λ m(z i, θ 0, ˆψ n ). By the standard arguments of empirical likelihood methods, and (10), we have { 2 log ELR 2n (θ 0 ) = λ E(mm )} { λ ˆλ E(mm )} ˆλ + op (1) ( =n(ˆθ n θ 0 ) E m ) { ( 1 E(mm )} E m ) (ˆθ n θ 0 ) + o p (1). 285 θ θ Proof of Theorem 3. When p = 1 the result (5) is given by Akritas (2001). Zhou (2011) show that for p > 1 the equation (5) also holds with the jkth element of Σ defined by df 0 (x) σ jk = [g j (x) ḡ j (x)][g k (x) ḡ k (x)] 1 G 0 (x ), (11)

12 M. ZHOU AND M.-O. KIM where ḡ j (t) denoting the advanced time transformation of g j in Efron and Johnstone (1990). Zhou (2011) also showed that the equation (6) holds with the jkth element of V 1n given by ˆσ jk = [g j (T i ) ḡ j (T i )][g k (T i ) ḡ k (T i )] ˆF KM (T i ) 1 ĜKM(T i ). Since V 1n is just Σ with the unknown F 0 and G 0 replaced by their Kaplan-Meier estimators and the well known fact that the Kaplan-Meier estimators are uniformly consistent, V 1n Σ in probability as n. We refer to Zhou (2011) for details. It follows from Owen (2001, p ) (with Xi in place of X i there) that where 2 log ELR 2n(θ 0 ) = U 2nV 1 2n U 2n + o p (1) U 2n = n[ X θ 0 ] = n(ˆθ n θ 0 ), V 2n = n 1 (Xi θ 0 )(Xi θ 0 ). Since ĜKM(t) is uniformly consistent for G 0 (t), V 2n = n 1 (X i θ 0 )(X i θ 0 ) + o p (1) = V ar(x 1 ) + o p (1). By the results in Zhou (1992) and Srinivasan and Zhou (1994), var(x 1 ) Σ however. This suffices to show (7). As for the equation (8), when p = 1, Ren (2001) showed that the limiting distribution of this test statistic with θ = θ 0 is a scaled chi-square distribution with degree of freedom 1. To obtain a multivariate version (with p > 1), we modify the proof of Ren along the steps of Owen (2001, p ) to incorporates weights, and derive the following: where U n = n[ 2 log ELR 3n (θ) = U n V 1 3n U n + o p (1) g(t i ) ˆF KM (T i ) θ 0 ] = n(ˆθ n θ 0 ) and V 3n is a matrix with the jkth element given by v jk = (g j (T i ) θ 0 )(g k (T i ) θ 0 ) ˆF KM (T i ). An application of the law of large number for the Kaplan-Meier integral gives v jk = (g j (t) θ 0 )(g k (t) θ 0 )d ˆF KM (t) (g j (t) θ 0 )(g k (t) θ 0 )df 0 (t) in probability as n. Since Σ depends on the censoring distribution G 0, while lim V 3n does not, lim n V 3n Σ. REFERENCES AKRITAS, M. (2000). The central limit theorem under censoring. Bernoulli 6, EFRON, B. & JOHNSTONE, I. M. (1990). Fisher s information in terms of the hazard rate. Ann. Statist. 18,

13 Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests 13 HJORT, N. L., MCKEAGUE, I. W. & KEILEGOM, I. V. (2009). Extending the scope of empirical likelihood. Ann. 310 Statist. 37, KAPLAN, E. L. & MEIER, P. (1958). Nonparametric Estimation From Incomplete Observations. J. Amer. Statist. Assoc. 53, LI, G., LI, R. & ZHOU, M. (2005). Empirical likelihood in survival analysis. In Contemporary Multivariate Analysis and Design of Experiments Edited by J. Fan and G. Li. pp The World Scientific Publisher. 315 OWEN, A. (1988). Empirical Likelihood Ratio Confidence Intervals for a Single Functional. Biometrika 75, OWEN, A. (1990). Empirical Likelihood Ratio Confidence Regions. Ann. Statist. 18, OWEN, A. (2001). Empirical Likelihood. New York: CRC Press. QIN, J. & LAWLESS, J. (1994). Empirical Likelihood and General Estimating Equations. Ann. Statist. 22, REN, J. (2001). Weighted empirical likelihood ratio confidence intervals for the mean with censored data. Ann. Inst. 320 Statist. Math. 53, REN, J. (2008). Weighted empirical likelihood in some two-sample semiparametric models with various types of censored data. Ann. Statist. 36, SHI, J. & LAU, T.-S. (2000). Empirical Likelihood for Partially Linear Models. J. Multivariate Anal. 72, SRINIVASAN, C. & ZHOU, M. (1994). Linear regression with censoring. J. Multivariate Anal. 49, TURNBULL, B. W. (1976). The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Statist. Soc. B 38, WANG, S., QIAN, L. & CARROLL, R. (2010). Generalized empirical likelihood methods for analyzing longitudinal data. Biometrika 97, WANG, Q. H. & JING, B. Y (2001). Empirical likelihood for a class of functionals of survival distribution with 330 censored data Ann. Inst. Statist. Math. 53, WANG, Q. H. & LI, G. (2002). Empirical likelihood semiparametric regression analysis under random censorship J. Multivariate Anal. 83, ZHOU, M. (1992). Asymptotic normality of the synthetic data regression estimator for censored survival data. Ann. Statist. 20, ZHOU, M. (2011). A Wilks theorem for the censored empirical likelihood of several means. Preprint mai/research/note3.1.pdf [Received June Revised ]

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note

More information

Empirical likelihood for linear models in the presence of nuisance parameters

Empirical likelihood for linear models in the presence of nuisance parameters Empirical likelihood for linear models in the presence of nuisance parameters Mi-Ok Kim, Mai Zhou To cite this version: Mi-Ok Kim, Mai Zhou. Empirical likelihood for linear models in the presence of nuisance

More information

Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm

Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm Mai Zhou 1 University of Kentucky, Lexington, KY 40506 USA Summary. Empirical likelihood ratio method (Thomas and Grunkmier

More information

EMPIRICAL LIKELIHOOD ANALYSIS FOR THE HETEROSCEDASTIC ACCELERATED FAILURE TIME MODEL

EMPIRICAL LIKELIHOOD ANALYSIS FOR THE HETEROSCEDASTIC ACCELERATED FAILURE TIME MODEL Statistica Sinica 22 (2012), 295-316 doi:http://dx.doi.org/10.5705/ss.2010.190 EMPIRICAL LIKELIHOOD ANALYSIS FOR THE HETEROSCEDASTIC ACCELERATED FAILURE TIME MODEL Mai Zhou 1, Mi-Ok Kim 2, and Arne C.

More information

and Comparison with NPMLE

and Comparison with NPMLE NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA and Comparison with NPMLE Mai Zhou Department of Statistics, University of Kentucky, Lexington, KY 40506 USA http://ms.uky.edu/

More information

Empirical Likelihood in Survival Analysis

Empirical Likelihood in Survival Analysis Empirical Likelihood in Survival Analysis Gang Li 1, Runze Li 2, and Mai Zhou 3 1 Department of Biostatistics, University of California, Los Angeles, CA 90095 vli@ucla.edu 2 Department of Statistics, The

More information

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky EMPIRICAL ENVELOPE MLE AND LR TESTS Mai Zhou University of Kentucky Summary We study in this paper some nonparametric inference problems where the nonparametric maximum likelihood estimator (NPMLE) are

More information

AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints and Its Application to Empirical Likelihood

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints and Its Application to Empirical Likelihood Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints and Its Application to Empirical Likelihood Mai Zhou Yifan Yang Received:

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

A note on profile likelihood for exponential tilt mixture models

A note on profile likelihood for exponential tilt mixture models Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential

More information

Nonparametric Bayes Estimator of Survival Function for Right-Censoring and Left-Truncation Data

Nonparametric Bayes Estimator of Survival Function for Right-Censoring and Left-Truncation Data Nonparametric Bayes Estimator of Survival Function for Right-Censoring and Left-Truncation Data Mai Zhou and Julia Luan Department of Statistics University of Kentucky Lexington, KY 40506-0027, U.S.A.

More information

Goodness-of-fit tests for the cure rate in a mixture cure model

Goodness-of-fit tests for the cure rate in a mixture cure model Biometrika (217), 13, 1, pp. 1 7 Printed in Great Britain Advance Access publication on 31 July 216 Goodness-of-fit tests for the cure rate in a mixture cure model BY U.U. MÜLLER Department of Statistics,

More information

1 Glivenko-Cantelli type theorems

1 Glivenko-Cantelli type theorems STA79 Lecture Spring Semester Glivenko-Cantelli type theorems Given i.i.d. observations X,..., X n with unknown distribution function F (t, consider the empirical (sample CDF ˆF n (t = I [Xi t]. n Then

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH Jian-Jian Ren 1 and Mai Zhou 2 University of Central Florida and University of Kentucky Abstract: For the regression parameter

More information

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

Comparing Distribution Functions via Empirical Likelihood

Comparing Distribution Functions via Empirical Likelihood Georgia State University ScholarWorks @ Georgia State University Mathematics and Statistics Faculty Publications Department of Mathematics and Statistics 25 Comparing Distribution Functions via Empirical

More information

Likelihood ratio confidence bands in nonparametric regression with censored data

Likelihood ratio confidence bands in nonparametric regression with censored data Likelihood ratio confidence bands in nonparametric regression with censored data Gang Li University of California at Los Angeles Department of Biostatistics Ingrid Van Keilegom Eindhoven University of

More information

Investigation of goodness-of-fit test statistic distributions by random censored samples

Investigation of goodness-of-fit test statistic distributions by random censored samples d samples Investigation of goodness-of-fit test statistic distributions by random censored samples Novosibirsk State Technical University November 22, 2010 d samples Outline 1 Nonparametric goodness-of-fit

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Full likelihood inferences in the Cox model: an empirical likelihood approach

Full likelihood inferences in the Cox model: an empirical likelihood approach Ann Inst Stat Math 2011) 63:1005 1018 DOI 10.1007/s10463-010-0272-y Full likelihood inferences in the Cox model: an empirical likelihood approach Jian-Jian Ren Mai Zhou Received: 22 September 2008 / Revised:

More information

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Statistica Sinica 19 (2009), 71-81 SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Song Xi Chen 1,2 and Chiu Min Wong 3 1 Iowa State University, 2 Peking University and

More information

TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip

TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississippi University, MS38677 K-sample location test, Koziol-Green

More information

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data Song Xi CHEN Guanghua School of Management and Center for Statistical Science, Peking University Department

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

Estimation for two-phase designs: semiparametric models and Z theorems

Estimation for two-phase designs: semiparametric models and Z theorems Estimation for two-phase designs:semiparametric models and Z theorems p. 1/27 Estimation for two-phase designs: semiparametric models and Z theorems Jon A. Wellner University of Washington Estimation for

More information

University of California San Diego and Stanford University and

University of California San Diego and Stanford University and First International Workshop on Functional and Operatorial Statistics. Toulouse, June 19-21, 2008 K-sample Subsampling Dimitris N. olitis andjoseph.romano University of California San Diego and Stanford

More information

Tests of independence for censored bivariate failure time data

Tests of independence for censored bivariate failure time data Tests of independence for censored bivariate failure time data Abstract Bivariate failure time data is widely used in survival analysis, for example, in twins study. This article presents a class of χ

More information

simple if it completely specifies the density of x

simple if it completely specifies the density of x 3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely

More information

Estimation of the Bivariate and Marginal Distributions with Censored Data

Estimation of the Bivariate and Marginal Distributions with Censored Data Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department

More information

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain 0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters Communications for Statistical Applications and Methods 2017, Vol. 24, No. 5, 519 531 https://doi.org/10.5351/csam.2017.24.5.519 Print ISSN 2287-7843 / Online ISSN 2383-4757 Goodness-of-fit tests for randomly

More information

The comparative studies on reliability for Rayleigh models

The comparative studies on reliability for Rayleigh models Journal of the Korean Data & Information Science Society 018, 9, 533 545 http://dx.doi.org/10.7465/jkdi.018.9..533 한국데이터정보과학회지 The comparative studies on reliability for Rayleigh models Ji Eun Oh 1 Joong

More information

AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY

AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Econometrics Working Paper EWP0401 ISSN 1485-6441 Department of Economics AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Lauren Bin Dong & David E. A. Giles Department of Economics, University of Victoria

More information

ICSA Applied Statistics Symposium 1. Balanced adjusted empirical likelihood

ICSA Applied Statistics Symposium 1. Balanced adjusted empirical likelihood ICSA Applied Statistics Symposium 1 Balanced adjusted empirical likelihood Art B. Owen Stanford University Sarah Emerson Oregon State University ICSA Applied Statistics Symposium 2 Empirical likelihood

More information

COMPUTATION OF THE EMPIRICAL LIKELIHOOD RATIO FROM CENSORED DATA. Kun Chen and Mai Zhou 1 Bayer Pharmaceuticals and University of Kentucky

COMPUTATION OF THE EMPIRICAL LIKELIHOOD RATIO FROM CENSORED DATA. Kun Chen and Mai Zhou 1 Bayer Pharmaceuticals and University of Kentucky COMPUTATION OF THE EMPIRICAL LIKELIHOOD RATIO FROM CENSORED DATA Kun Chen and Mai Zhou 1 Bayer Pharmaceuticals and University of Kentucky Summary The empirical likelihood ratio method is a general nonparametric

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

COMPUTE CENSORED EMPIRICAL LIKELIHOOD RATIO BY SEQUENTIAL QUADRATIC PROGRAMMING Kun Chen and Mai Zhou University of Kentucky

COMPUTE CENSORED EMPIRICAL LIKELIHOOD RATIO BY SEQUENTIAL QUADRATIC PROGRAMMING Kun Chen and Mai Zhou University of Kentucky COMPUTE CENSORED EMPIRICAL LIKELIHOOD RATIO BY SEQUENTIAL QUADRATIC PROGRAMMING Kun Chen and Mai Zhou University of Kentucky Summary Empirical likelihood ratio method (Thomas and Grunkmier 975, Owen 988,

More information

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

Extending the two-sample empirical likelihood method

Extending the two-sample empirical likelihood method Extending the two-sample empirical likelihood method Jānis Valeinis University of Latvia Edmunds Cers University of Latvia August 28, 2011 Abstract In this paper we establish the empirical likelihood method

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth

More information

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III)

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Florian Pelgrin HEC September-December 2010 Florian Pelgrin (HEC) Constrained estimators September-December

More information

Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications

Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications Fumiya Akashi Research Associate Department of Applied Mathematics Waseda University

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random

More information

Nonparametric rank based estimation of bivariate densities given censored data conditional on marginal probabilities

Nonparametric rank based estimation of bivariate densities given censored data conditional on marginal probabilities Hutson Journal of Statistical Distributions and Applications (26 3:9 DOI.86/s4488-6-47-y RESEARCH Open Access Nonparametric rank based estimation of bivariate densities given censored data conditional

More information

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

GOODNESS-OF-FIT TEST FOR RANDOMLY CENSORED DATA BASED ON MAXIMUM CORRELATION. Ewa Strzalkowska-Kominiak and Aurea Grané (1)

GOODNESS-OF-FIT TEST FOR RANDOMLY CENSORED DATA BASED ON MAXIMUM CORRELATION. Ewa Strzalkowska-Kominiak and Aurea Grané (1) Working Paper 4-2 Statistics and Econometrics Series (4) July 24 Departamento de Estadística Universidad Carlos III de Madrid Calle Madrid, 26 2893 Getafe (Spain) Fax (34) 9 624-98-49 GOODNESS-OF-FIT TEST

More information

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Modification and Improvement of Empirical Likelihood for Missing Response Problem UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu

More information

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Statistica Sinica 20 (2010), 441-453 GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Antai Wang Georgetown University Medical Center Abstract: In this paper, we propose two tests for parametric models

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

Jackknife Empirical Likelihood Test for Equality of Two High Dimensional Means

Jackknife Empirical Likelihood Test for Equality of Two High Dimensional Means Jackknife Empirical Likelihood est for Equality of wo High Dimensional Means Ruodu Wang, Liang Peng and Yongcheng Qi 2 Abstract It has been a long history to test the equality of two multivariate means.

More information

TMA 4275 Lifetime Analysis June 2004 Solution

TMA 4275 Lifetime Analysis June 2004 Solution TMA 4275 Lifetime Analysis June 2004 Solution Problem 1 a) Observation of the outcome is censored, if the time of the outcome is not known exactly and only the last time when it was observed being intact,

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Overview of today s class Kaplan-Meier Curve

More information

Semiparametric Regression

Semiparametric Regression Semiparametric Regression Patrick Breheny October 22 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Introduction Over the past few weeks, we ve introduced a variety of regression models under

More information

Package emplik2. August 18, 2018

Package emplik2. August 18, 2018 Package emplik2 August 18, 2018 Version 1.21 Date 2018-08-17 Title Empirical Likelihood Ratio Test for Two Samples with Censored Data Author William H. Barton under the supervision

More information

Kernel density estimation with doubly truncated data

Kernel density estimation with doubly truncated data Kernel density estimation with doubly truncated data Jacobo de Uña-Álvarez jacobo@uvigo.es http://webs.uvigo.es/jacobo (joint with Carla Moreira) Department of Statistics and OR University of Vigo Spain

More information

Nonparametric confidence intervals. for receiver operating characteristic curves

Nonparametric confidence intervals. for receiver operating characteristic curves Nonparametric confidence intervals for receiver operating characteristic curves Peter G. Hall 1, Rob J. Hyndman 2, and Yanan Fan 3 5 December 2003 Abstract: We study methods for constructing confidence

More information

Empirical Likelihood Inference for Two-Sample Problems

Empirical Likelihood Inference for Two-Sample Problems Empirical Likelihood Inference for Two-Sample Problems by Ying Yan A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Statistics

More information

NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA

NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA Statistica Sinica 14(2004, 533-546 NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA Mai Zhou University of Kentucky Abstract: The non-parametric Bayes estimator with

More information

Chapter 3: Maximum Likelihood Theory

Chapter 3: Maximum Likelihood Theory Chapter 3: Maximum Likelihood Theory Florian Pelgrin HEC September-December, 2010 Florian Pelgrin (HEC) Maximum Likelihood Theory September-December, 2010 1 / 40 1 Introduction Example 2 Maximum likelihood

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

The formal relationship between analytic and bootstrap approaches to parametric inference

The formal relationship between analytic and bootstrap approaches to parametric inference The formal relationship between analytic and bootstrap approaches to parametric inference T.J. DiCiccio Cornell University, Ithaca, NY 14853, U.S.A. T.A. Kuffner Washington University in St. Louis, St.

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 1, Issue 1 2005 Article 3 Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee Jon A. Wellner

More information

Product-limit estimators of the survival function with left or right censored data

Product-limit estimators of the survival function with left or right censored data Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut

More information

EXTENDING THE SCOPE OF EMPIRICAL LIKELIHOOD

EXTENDING THE SCOPE OF EMPIRICAL LIKELIHOOD The Annals of Statistics 2009, Vol. 37, No. 3, 1079 1111 DOI: 10.1214/07-AOS555 Institute of Mathematical Statistics, 2009 EXTENDING THE SCOPE OF EMPIRICAL LIKELIHOOD BY NILS LID HJORT, IAN W. MCKEAGUE

More information

Invariant HPD credible sets and MAP estimators

Invariant HPD credible sets and MAP estimators Bayesian Analysis (007), Number 4, pp. 681 69 Invariant HPD credible sets and MAP estimators Pierre Druilhet and Jean-Michel Marin Abstract. MAP estimators and HPD credible sets are often criticized in

More information

5601 Notes: The Sandwich Estimator

5601 Notes: The Sandwich Estimator 560 Notes: The Sandwich Estimator Charles J. Geyer December 6, 2003 Contents Maximum Likelihood Estimation 2. Likelihood for One Observation................... 2.2 Likelihood for Many IID Observations...............

More information

Math 181B Homework 1 Solution

Math 181B Homework 1 Solution Math 181B Homework 1 Solution 1. Write down the likelihood: L(λ = n λ X i e λ X i! (a One-sided test: H 0 : λ = 1 vs H 1 : λ = 0.1 The likelihood ratio: where LR = L(1 L(0.1 = 1 X i e n 1 = λ n X i e nλ

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

Marginal Screening and Post-Selection Inference

Marginal Screening and Post-Selection Inference Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2

More information

Adjusted Empirical Likelihood for Long-memory Time Series Models

Adjusted Empirical Likelihood for Long-memory Time Series Models Adjusted Empirical Likelihood for Long-memory Time Series Models arxiv:1604.06170v1 [stat.me] 21 Apr 2016 Ramadha D. Piyadi Gamage, Wei Ning and Arjun K. Gupta Department of Mathematics and Statistics

More information

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One. Outline. Charlotte Wickham  1. Basic ideas about estimation Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

More information

Modified Simes Critical Values Under Positive Dependence

Modified Simes Critical Values Under Positive Dependence Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia

More information

On the Comparison of Fisher Information of the Weibull and GE Distributions

On the Comparison of Fisher Information of the Weibull and GE Distributions On the Comparison of Fisher Information of the Weibull and GE Distributions Rameshwar D. Gupta Debasis Kundu Abstract In this paper we consider the Fisher information matrices of the generalized exponential

More information

STAT 512 sp 2018 Summary Sheet

STAT 512 sp 2018 Summary Sheet STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}

More information

ASSESSING A VECTOR PARAMETER

ASSESSING A VECTOR PARAMETER SUMMARY ASSESSING A VECTOR PARAMETER By D.A.S. Fraser and N. Reid Department of Statistics, University of Toronto St. George Street, Toronto, Canada M5S 3G3 dfraser@utstat.toronto.edu Some key words. Ancillary;

More information

The Accelerated Failure Time Model Under Biased. Sampling

The Accelerated Failure Time Model Under Biased. Sampling The Accelerated Failure Time Model Under Biased Sampling Micha Mandel and Ya akov Ritov Department of Statistics, The Hebrew University of Jerusalem, Israel July 13, 2009 Abstract Chen (2009, Biometrics)

More information

Harvard University. Harvard University Biostatistics Working Paper Series

Harvard University. Harvard University Biostatistics Working Paper Series Harvard University Harvard University Biostatistics Working Paper Series Year 2008 Paper 94 The Highest Confidence Density Region and Its Usage for Inferences about the Survival Function with Censored

More information

A Review on Empirical Likelihood Methods for Regression

A Review on Empirical Likelihood Methods for Regression Noname manuscript No. (will be inserted by the editor) A Review on Empirical Likelihood Methods for Regression Song Xi Chen Ingrid Van Keilegom Received: date / Accepted: date Abstract We provide a review

More information

Introduction to Estimation Methods for Time Series models Lecture 2

Introduction to Estimation Methods for Time Series models Lecture 2 Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:

More information

On consistency of Kendall s tau under censoring

On consistency of Kendall s tau under censoring Biometria (28), 95, 4,pp. 997 11 C 28 Biometria Trust Printed in Great Britain doi: 1.193/biomet/asn37 Advance Access publication 17 September 28 On consistency of Kendall s tau under censoring BY DAVID

More information

Likelihood-based inference with missing data under missing-at-random

Likelihood-based inference with missing data under missing-at-random Likelihood-based inference with missing data under missing-at-random Jae-kwang Kim Joint work with Shu Yang Department of Statistics, Iowa State University May 4, 014 Outline 1. Introduction. Parametric

More information

Asymptotic Theory. L. Magee revised January 21, 2013

Asymptotic Theory. L. Magee revised January 21, 2013 Asymptotic Theory L. Magee revised January 21, 2013 1 Convergence 1.1 Definitions Let a n to refer to a random variable that is a function of n random variables. Convergence in Probability The scalar a

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

More information

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 Z-theorems: Notation and Context Suppose that Θ R k, and that Ψ n : Θ R k, random maps Ψ : Θ R k, deterministic

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

Multivariate Capability Analysis Using Statgraphics. Presented by Dr. Neil W. Polhemus

Multivariate Capability Analysis Using Statgraphics. Presented by Dr. Neil W. Polhemus Multivariate Capability Analysis Using Statgraphics Presented by Dr. Neil W. Polhemus Multivariate Capability Analysis Used to demonstrate conformance of a process to requirements or specifications that

More information

Chapter 4: Factor Analysis

Chapter 4: Factor Analysis Chapter 4: Factor Analysis In many studies, we may not be able to measure directly the variables of interest. We can merely collect data on other variables which may be related to the variables of interest.

More information