Generalized pseudo empirical likelihood inferences for complex surveys


The Canadian Journal of Statistics, Vol. 43, No. 1, 2015, Pages 1-17
La revue canadienne de statistique

Zhiqiang TAN¹* and Changbao WU²*

¹ Department of Statistics, Rutgers University, Piscataway, NJ 08854, U.S.A.
² Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1

Key words and phrases: Auxiliary information; calibration techniques; confidence intervals; Kullback-Leibler distance; survey design.

MSC 2010: Primary 62D05; secondary 62G09

Abstract: We consider generalized pseudo empirical likelihood inferences for complex surveys. The method is based on a weighted version of the Kullback-Leibler (KL) distance for calibration estimation (Deville & Särndal, 1992) and includes the pseudo empirical likelihood estimator (Chen & Sitter, 1999; Wu & Rao, 2006) and the calibrated likelihood estimator (Tan, 2013) as special cases. We show that a suitably formulated empirical likelihood ratio-type statistic follows asymptotically a scaled chi-square distribution, which extends the main result in Wu & Rao (2006) and makes the likelihood ratio-type confidence intervals available for calibration estimation using arbitrary choices of the weighting factor in the weighted KL distance. We further show that the scaling factor for the scaled chi-square distribution can be circumvented either through a particular choice of the weighting factor for the KL distance or by using a bootstrap method. The proposed bootstrap procedure is justified for single-stage sampling designs with negligible sampling fractions. Finite sample performances of confidence intervals constructed using our proposed methods are investigated and compared with existing ones through two simulation studies.
The Canadian Journal of Statistics 43: 1-17; Statistical Society of Canada

Résumé: The authors consider the use of generalized pseudo empirical likelihood for inference in complex surveys. Their method is based on calibration estimation under a weighted version of the Kullback-Leibler (KL) divergence (Deville & Särndal, 1992). The estimator based on empirical likelihood (Chen & Sitter, 1999; Wu & Rao, 2006) and the one based on calibrated likelihood (Tan, 2013) are special cases. The authors show that, when suitably expressed, a statistic based on a likelihood ratio follows a chi-square distribution up to a multiplicative factor, which generalizes the main result of Wu & Rao (2006) and makes it possible to compute likelihood ratio confidence intervals for calibration estimators with an arbitrary choice of weights in the weighted KL divergence. The authors also show that the multiplicative factor in the asymptotic distribution can be avoided through a suitable choice of weights or through a bootstrap procedure, which is justified for single-stage designs with a negligible sampling fraction. The authors assess the performance of the confidence intervals produced by their method by comparing them with those obtained from existing methods in two simulation studies. La revue canadienne de statistique 43: 1-17; Société statistique du Canada

* Author to whom correspondence may be addressed. cbwu@uwaterloo.ca or ztan@stat.rutgers.edu

2015 Statistical Society of Canada / Société statistique du Canada

1. INTRODUCTION

Calibration is a popular inference tool for analysis of complex surveys. It originates from the idea of benchmarking, in which known population totals of certain auxiliary variables are used to form benchmark constraints. The method has gained significant popularity since the work of Deville &

Särndal (1992). Calibration estimators are closely related to the generalized regression estimators. Some of the important developments for regression estimators, such as variance estimation techniques, can also be used for calibration estimators. Fuller (2002) provides an excellent review of regression estimation and Särndal (2007) contains a thorough review of calibration techniques.

The conventional route for inferences with calibration methods is to first compute the point estimator with an estimated variance, and then use the standard Z-statistic based on normal approximations to construct confidence intervals or conduct statistical tests. Confidence intervals under this approach are forced to be symmetric around the point estimator and are not necessarily confined within the parameter space.

There have been significant developments on empirical likelihood (EL) methods in non-survey statistics (Owen, 1988). Progress has also been made on using the empirical likelihood method for complex surveys. See, for instance, Chen & Sitter (1999), Wu & Rao (2006), and Chen & Kim (2014), among others. The most attractive feature of the empirical likelihood approach is the data-driven, range-respecting confidence intervals based on the empirical likelihood ratio statistic. This property is not enjoyed by the conventional calibration method with Wald-type confidence intervals.

Suppose that $U = \{1, 2, \ldots, N\}$ is the set of $N$ units for the finite population, with $(y_i, x_i)$ being the values of the study variable $y$ and the vector of auxiliary variables $x$ attached to unit $i$. Let $t_y = \sum_{i=1}^{N} y_i$ be the parameter of interest, and $t_x = \sum_{i=1}^{N} x_i$ be the known population totals. Let $\mu_y = N^{-1} t_y$ and $\mu_x = N^{-1} t_x$ be the corresponding population means. Let $S$ be a set of $n$ sampled units and $\{(y_i, x_i) : i \in S\}$ be the survey data. Let $\pi_i = P(i \in S)$ be the first-order inclusion probabilities and $d_i = \pi_i^{-1}$ be the basic design weights.
The calibration estimator of $t_y$ is computed as $\hat t_{CAL} = \sum_{i \in S} \hat w_i y_i$, where the $\hat w_i$ are the calibrated weights obtained by minimizing a distance measure $G(d, w) = \sum_{i \in S} G_i(d_i, w_i)$ between the $w_i$ and the $d_i$ subject to the set of calibration equations (also called benchmark constraints)

$$\sum_{i \in S} w_i x_i = t_x. \qquad (1)$$

The most commonly used distance measure is the chi-squared distance specified by $G_i(d_i, w_i) = (w_i - d_i)^2 / (q_i d_i)$, where the $q_i$ are pre-specified constants. It is well known that, under the chi-squared distance, the resulting calibration estimator $\hat t_{CAL}$ is algebraically identical to the generalized regression estimator (Särndal, Swensson, & Wretman). Under the conventional calibration approach (Deville & Särndal, 1992), confidence intervals on $t_y$ rely on asymptotic normality of $\hat t_{CAL}$ and are constructed through the standardized Z-statistic $(\hat t_{CAL} - t_y) / \{v(\hat t_{CAL})\}^{1/2}$, where $v(\hat t_{CAL})$ is a consistent estimator of the variance of $\hat t_{CAL}$.

The choice of $q_i$ in the distance measure does not affect the consistency of the calibration estimator, but it has an impact on the variance of the estimator. The choice $q_i = \pi_i^{-1} - 1$ can lead to more efficient calibration estimators (Tan, 2010). A closely related research topic is the design-optimal regression estimator; see, for instance, Fuller & Isaki (1981), Montanari (1987), Rao (1994), Berger, Tirari, & Tillé (2003), and Chen & Kim (2014), among others. Fuller (2009) considered model-optimal design-consistent estimators.
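The algebraic identity between the chi-squared calibration estimator and the generalized regression estimator can be checked numerically. The sketch below is only an illustration under simplifying assumptions: a single scalar auxiliary variable, made-up toy data, and hypothetical function names (`chi2_calibration_weights`, `greg_estimator`) that do not come from the paper. For scalar $x$, the Lagrange multiplier has the closed form $\lambda = (t_x - \sum_{i \in S} d_i x_i) / \sum_{i \in S} d_i q_i x_i^2$ and the calibrated weights are $w_i = d_i (1 + q_i x_i \lambda)$.

```python
# Hypothetical toy sample (not from the paper): design weights d, constants q,
# one auxiliary variable x, study variable y, and an assumed known total t_x.
d = [10.0, 12.5, 8.0, 20.0, 15.0]
q = [1.0, 1.0, 1.0, 1.0, 1.0]
x = [2.0, 3.5, 1.0, 4.0, 2.5]
y = [5.1, 8.2, 2.9, 9.7, 6.0]
t_x = 180.0


def chi2_calibration_weights(d, q, x, t_x):
    """Minimise sum (w_i - d_i)^2 / (q_i d_i) subject to sum w_i x_i = t_x."""
    t_x_ht = sum(di * xi for di, xi in zip(d, x))  # Horvitz-Thompson total of x
    denom = sum(di * qi * xi * xi for di, qi, xi in zip(d, q, x))
    lam = (t_x - t_x_ht) / denom  # closed-form Lagrange multiplier (scalar x)
    return [di * (1.0 + qi * xi * lam) for di, qi, xi in zip(d, q, x)]


def greg_estimator(d, q, x, y, t_x):
    """Generalized regression estimator t_y_ht + B_hat * (t_x - t_x_ht)."""
    t_y_ht = sum(di * yi for di, yi in zip(d, y))
    t_x_ht = sum(di * xi for di, xi in zip(d, x))
    b_hat = (sum(di * qi * xi * yi for di, qi, xi, yi in zip(d, q, x, y))
             / sum(di * qi * xi * xi for di, qi, xi in zip(d, q, x)))
    return t_y_ht + b_hat * (t_x - t_x_ht)


w = chi2_calibration_weights(d, q, x, t_x)
t_cal = sum(wi * yi for wi, yi in zip(w, y))
t_greg = greg_estimator(d, q, x, y, t_x)
print(t_cal, t_greg)  # the two estimates agree up to rounding error
```

Note that the chi-squared weights are not constrained to be positive, which is one practical difference from the likelihood-type distances discussed later.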

For complex survey data, Chen & Sitter (1999) proposed to use a pseudo empirical (log) likelihood

$$l(p) = \sum_{i \in S} d_i \log(p_i), \qquad (2)$$

where $p = (p_1, \ldots, p_n)^T$ is the discrete probability measure over the $n$ sampled units. The maximum pseudo-EL estimator of the population mean $\mu_y$ is computed as $\hat\mu_{PEL} = \sum_{i \in S} \hat p_i y_i$, where the $\hat p_i$ maximize the pseudo-EL function $l(p)$ subject to $\sum_{i \in S} p_i = 1$ and the set of constraints

$$\sum_{i \in S} p_i x_i = \mu_x. \qquad (3)$$

Chen & Sitter (1999) showed that the estimator $\hat\mu_{PEL}$ is asymptotically equivalent to the calibration estimator $N^{-1} \hat t_{CAL}$ with the choice of $q_i = 1$. Wu & Rao (2006) showed that the pseudo-EL ratio statistic on $\mu_y$, adjusted by a scaling factor involving the design effect, has an asymptotic $\chi^2$ distribution with one degree of freedom. Consequently, confidence intervals on $\mu_y$ based on the pseudo-EL ratio statistic can be constructed.

There are two major gaps between the conventional calibration method and the pseudo empirical likelihood method. First, the pseudo-EL method is designed for the population mean $\mu_y$ and relies on the constraints (3), which use the known population means $\mu_x$. The method cannot be directly applied to scenarios where the population size $N$ is unknown and only the population totals $t_x$ are available. Second, the pseudo-EL approach cannot accommodate a general choice of the weight factor $q_i$, which is an important tool for achieving design-optimal estimation as mentioned earlier.

Recently, Tan (2010, 2013) developed a calibrated likelihood method by exploiting a connection between survey calibration and missing data problems. The method can be understood in two steps. Let $R_i$ be the sample inclusion indicator, i.e., $R_i = 1$ if $i \in S$ and $R_i = 0$ otherwise.
The first step is to treat $\{(x_i, y_i, R_i) : i \in U\}$ as an iid sample from a joint distribution of $(x, y, R)$ and derive a bona fide empirical likelihood estimator (Qin & Lawless, 1994) for $E(y) = E\{\pi^{-1}(x) R y\}$, subject to the moment constraints $0 = E\{\pi^{-1}(x) R x - x\}$, based on the observed data $\{(x_i, R_i y_i, R_i) : i \in U\}$, where $\pi(x) = P(R = 1 \mid x, y)$ is assumed to be free of $y$. This empirical likelihood estimator of $E(y)$ takes the usual form $N^{-1} \sum_{i \in S} \hat w_i y_i$ for some weights $\hat w_i$, but the calibration equations are generally violated, i.e., $\sum_{i \in S} \hat w_i x_i \neq t_x$. For the second step, Tan (2010) proposed a modification such that the calibration equations are satisfied without affecting the first-order asymptotic variance. As observed in Tan (2013), the calibrated likelihood estimator turns out to be algebraically equivalent to $\hat t_{CAL}$ with the distance measure $G(d, w)$ specified as a weighted Kullback-Leibler distance and the weight factor $q_i$ set to $\pi_i^{-1} - 1$.

The calibrated likelihood estimator (Tan, 2010) and, similarly, the calibrated regression estimator (Tan, 2006) are shown in Tan (2013) to be asymptotically optimal under rejective or high-entropy sampling designs when $\pi_i$ is included as a calibration variable in $x_i$. These two estimators are simpler than the usual optimal regression estimator involving second-order inclusion probabilities (Fuller & Isaki, 1981; Montanari, 1987; Rao, 1994). See also Chen & Kim (2014) for related results under negligible sampling fractions.

In this paper, we consider generalized pseudo empirical likelihood inferences for complex surveys. The method is based on a weighted version of the Kullback-Leibler (KL) distance for calibration estimation (Deville & Särndal, 1992) and includes the pseudo empirical likelihood estimator (Chen & Sitter, 1999; Wu & Rao, 2006) and the calibrated likelihood estimator (Tan, 2013) as special cases. We show that a suitably formulated empirical likelihood ratio-type statistic follows asymptotically a scaled chi-square distribution, which extends the main result in Wu & Rao

(2006) and makes the likelihood ratio-type confidence intervals available for calibration estimation using arbitrary choices of the weighting factor in the weighted KL distance. We further show that the scaling factor for the scaled chi-square distribution can be circumvented either through a particular choice of the weighting factor for the KL distance or by using a bootstrap method. The proposed bootstrap procedure is justified for single-stage sampling designs with negligible sampling fractions.

The rest of the paper is organized as follows. Main results on the generalized pseudo empirical likelihood method are presented in Section 2. The proposed bootstrap procedure is described in Section 3. In Section 4, we report results from two simulation studies, one based on a synthetic finite population and the other using a Statistics Canada survey data set. Some concluding remarks and discussions are given in Section 5. Proofs of the major results and justification of the bootstrap method are given in the Appendix.

2. GENERALIZED PSEUDO EMPIRICAL LIKELIHOOD METHOD

2.1. Weighted Kullback-Leibler Distance Based Calibration Estimators

The Kullback-Leibler distance is a measure of divergence between two distributions. It was first described by Kullback & Leibler (1951) as a loss function in the context of information theory, and then further discussed by Kullback (1959). For two discrete probability measures $f = (f_1, \ldots, f_n)^T$ and $g = (g_1, \ldots, g_n)^T$, there are two types of Kullback-Leibler distance: $KL(f, g) = \sum_{i=1}^{n} f_i \log(f_i / g_i)$ and $KL(g, f) = \sum_{i=1}^{n} g_i \log(g_i / f_i)$. When discussing confidence intervals for iid data, taking $f$ to be the empirical measure with $f_i = 1/n$ and $g$ to be another probability measure for the data, DiCiccio & Romano (1990) called $KL(f, g)$ the forward Kullback-Leibler distance and $KL(g, f)$ the backward Kullback-Leibler distance.
Unfortunately, neither $KL(f, g)$ nor $KL(g, f)$ can be used directly as a distance measure for calibration estimation, since $G_i(f_i, g_i) = f_i \log(f_i / g_i)$ does not guarantee that $G_i(f_i, g_i) \ge 0$ for all $i$. A simple modification is to use $G_i(f_i, g_i) = f_i \log(f_i / g_i) - f_i + g_i$. In this case $G_i(f_i, g_i) = -f_i \{\log(g_i / f_i) - g_i / f_i + 1\} \ge 0$ for all $i$, since $\log(x) \le x - 1$ for any $x > 0$.

For the two sets of weights $d = (d_1, \ldots, d_n)^T$ and $w = (w_1, \ldots, w_n)^T$, we consider the modified forward Kullback-Leibler distance between $w$ and $d$, weighted by $(q_1, \ldots, q_n)$:

$$EL(d, w) = -\sum_{i \in S} q_i^{-1} \left\{ d_i \log\left(\frac{w_i}{d_i}\right) - w_i + d_i \right\}.$$

This is also called the minimum entropy distance by Deville & Särndal (1992). The notation $EL(d, w)$ indicates its connection to empirical likelihood. For independent but not identically distributed data where $d_i = n^{-1}$ and $(w_1, \ldots, w_n)$ are replaced by the probability measure $(p_1, \ldots, p_n)$ over the sample, $EL(d, w)$ was discussed in Wu (2004) as the weighted empirical log-likelihood function, with the q-weights specified through the variance function.

If we let $q_i = 1$ and impose the constraint $\sum_{i \in S} w_i = N$, then $EL(d, w) = -\sum_{i \in S} d_i \log(p_i) + C$, where $p_i = w_i / N$ and $C$ is a constant not involving $p_i$. In this case, minimizing $EL(d, w)$ subject to a set of constraints on $w_i$ is equivalent to maximizing the pseudo-EL function $l(p) = \sum_{i \in S} d_i \log(p_i)$ subject to the same set of constraints on $p_i$. In other words, the pseudo-EL approach of Chen & Sitter (1999) and Wu & Rao (2006) is a special case of inferences based on the modified forward Kullback-Leibler distance $EL(d, w)$.

We use the term generalized pseudo empirical likelihood (GPEL) to denote calibration estimation under the distance $EL(d, w)$. The GPEL estimator of $t_y$ is given by $\hat t_{EL} = \sum_{i \in S} \hat w_i y_i$, where the weights $\hat w_i$ minimize $EL(d, w)$ subject to (1). If $N$ is known, then the constraint $\sum_{i \in S} w_i = N$ should be included. This amounts to including 1 as the first component of $x_i$ and $N$ as the first component of $t_x$ in the calibration equations (1).
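The reduction to the pseudo empirical likelihood when $q_i = 1$ can be written out in a few lines. The derivation below is a sketch that takes $EL(d, w)$ as the modified forward Kullback-Leibler distance with $q_i = 1$, substitutes $w_i = N p_i$, uses the constraint $\sum_{i \in S} w_i = N$, and collects every term free of $p$ into the constant $C$; the explicit form of $C$ is worked out here and is not quoted from the paper.

```latex
\begin{align*}
EL(d, w)
  &= -\sum_{i \in S} \Bigl\{ d_i \log\Bigl(\frac{w_i}{d_i}\Bigr) - w_i + d_i \Bigr\} \\
  &= -\sum_{i \in S} d_i \log(N p_i) + \sum_{i \in S} d_i \log d_i
     + \sum_{i \in S} w_i - \sum_{i \in S} d_i \\
  &= -\sum_{i \in S} d_i \log(p_i) + C,
  \qquad
  C = \sum_{i \in S} d_i \log\Bigl(\frac{d_i}{N}\Bigr) + N - \sum_{i \in S} d_i .
\end{align*}
```

Since $C$ does not depend on $p$, minimizing $EL(d, w)$ over $w$ and maximizing $l(p) = \sum_{i \in S} d_i \log(p_i)$ over $p$ pick out the same solution under the same constraints.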
It can be shown by the standard Lagrange multiplier

method that

$$\hat w_i = \frac{d_i}{1 + q_i x_i^T \hat\lambda}, \qquad (4)$$

where $\hat\lambda$ is a solution to

$$\sum_{i \in S} \frac{d_i x_i}{1 + q_i x_i^T \hat\lambda} - t_x = 0. \qquad (5)$$

It should be noted that the modified backward Kullback-Leibler distance between $w$ and $d$, weighted by the pre-specified q-weights $(q_1, \ldots, q_n)$, is given by:

$$ET(d, w) = \sum_{i \in S} q_i^{-1} \left\{ w_i \log\left(\frac{w_i}{d_i}\right) - w_i + d_i \right\}.$$

The notation ET comes from the term exponential tilting, since minimizing $ET(d, w)$ with respect to $w$ subject to constraints (1) results in calibration weights given by $w_i = d_i g_i$, where $g_i = \exp(\lambda^T x_i q_i)$ and $\lambda$ is determined by constraints (1). The distance measure $ET(d, w)$ was first mentioned by Deville & Särndal (1992). Folsom (1991) provides an early example of exponential weight adjustment. Kim (2010) contains further discussions on calibration estimation using exponential tilting.

Now consider the choice $q_i = \pi_i^{-1} - 1$ (Tan, 2010). The distance $EL(d, w)$ is equal to $-\sum_{i \in S} (1 - \pi_i)^{-1} \{\log(w_i) - \pi_i w_i\}$ up to an additive constant. The resulting calibration weights are given by $\hat w_i = \{\pi_i + (1 - \pi_i) x_i^T \hat\lambda\}^{-1}$, where $\hat\lambda$ is the solution to $\sum_{i \in S} x_i / \{\pi_i + (1 - \pi_i) x_i^T \hat\lambda\} - t_x = 0$. The resulting calibration estimator of $\mu_y$ is given by

$$\hat\mu_{EL} = \frac{1}{N} \sum_{i \in S} \frac{y_i}{\pi_i + (1 - \pi_i) x_i^T \hat\lambda}, \qquad (6)$$

which is exactly the same as the calibrated likelihood estimator of Tan (2010). The use of $\pi_i^{-1} - 1$ as a weight also appeared previously in Brewer (1999) on cosmetic calibration and Berger, Tirari, & Tillé (2003) on optimal regression estimation. See further discussions after Corollary 1.

An interesting interpretation of the choice $q_i = \pi_i^{-1} - 1$ is as follows. Let $I_i = 1$ if $i \in S$ and $I_i = 0$ if $i \notin S$; then $E_p(I_i / \pi_i) = 1$ and $V_p(I_i / \pi_i) = \pi_i^{-1} - 1$. Throughout, $E_p(\cdot)$ and $V_p(\cdot)$ refer to expectation and variance under the probability sampling design. In other words, the choice $q_i = \pi_i^{-1} - 1$ reflects the variation of selecting the $i$th unit into the sample under the survey design.
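Equations (4) and (5) can be illustrated with a small numerical sketch. The code below assumes a single positive scalar auxiliary variable, uses made-up inclusion probabilities and totals, and solves (5) by a plain Newton iteration with step halving to keep every $1 + q_i x_i \lambda$ positive; the function name `gpel_weights` is hypothetical, and this is not the authors' implementation. With $q_i = \pi_i^{-1} - 1$, the returned weights should coincide with the closed form $\{\pi_i + (1 - \pi_i) x_i \hat\lambda\}^{-1}$.

```python
def gpel_weights(d, q, x, t_x, tol=1e-12, max_iter=100):
    """Solve sum_i d_i x_i / (1 + q_i x_i lam) = t_x for a scalar lam and
    return (weights, lam), with weights w_i = d_i / (1 + q_i x_i lam)."""
    lam = 0.0
    for _ in range(max_iter):
        u = [1.0 + qi * xi * lam for qi, xi in zip(q, x)]
        g = sum(di * xi / ui for di, xi, ui in zip(d, x, u)) - t_x
        if abs(g) <= tol * abs(t_x):
            break
        # Derivative of the left-hand side of (5) with respect to lam.
        dg = -sum(di * qi * xi * xi / ui ** 2 for di, qi, xi, ui in zip(d, q, x, u))
        step = -g / dg
        # Halve the step until every 1 + q_i x_i lam stays positive.
        while any(1.0 + qi * xi * (lam + step) <= 0.0 for qi, xi in zip(q, x)):
            step /= 2.0
        lam += step
    w = [di / (1.0 + qi * xi * lam) for di, qi, xi in zip(d, q, x)]
    return w, lam


# Hypothetical toy sample: inclusion probabilities and a positive auxiliary variable.
pi = [0.10, 0.25, 0.40, 0.05, 0.20, 0.50]
x = [1.2, 2.0, 3.1, 0.8, 1.5, 4.0]
d = [1.0 / p for p in pi]        # basic design weights
q = [1.0 / p - 1.0 for p in pi]  # q_i = 1/pi_i - 1
t_x = 1.05 * sum(di * xi for di, xi in zip(d, x))  # assumed known total

w, lam = gpel_weights(d, q, x, t_x)
closed_form = [1.0 / (p + (1.0 - p) * xi * lam) for p, xi in zip(pi, x)]
print(lam, w)
```

Unlike the chi-squared distance, the weights here stay positive by construction, because the iteration never leaves the region where $1 + q_i x_i \lambda > 0$.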
Another benefit of setting $q_i = \pi_i^{-1} - 1$ can be seen from the property that $q_i \approx 0$ if $\pi_i \approx 1$. If the inclusion probability of a unit is close to 1, then this unit is substantially downweighted (or completely removed if $\pi_i = 1$) in the calibration process. This seems to be sensible from a design perspective, because the uncertainty associated with unit $i$ is very small if $\pi_i \approx 1$, and in this case we should force $w_i \approx d_i \approx 1$. In particular, this property may lead to substantial variance reduction when the linear relationship of $y_i$ given $x_i$ is violated mostly in the region where $\pi_i \approx 1$, as seen in Tan (2013).

2.2. Generalized Pseudo Empirical Likelihood Ratio Confidence Intervals

The point estimator $\hat t_{EL}$ falls in the general class of calibration estimators (Deville & Särndal, 1992), with the distance measure $G(d, w)$ specified as the modified Kullback-Leibler distance $EL(d, w)$. We now establish an important new result for constructing confidence intervals based on a GPEL ratio statistic similar to the pseudo empirical likelihood ratio statistic in Wu & Rao

(2006). We assume that the finite population and the survey design satisfy the same regularity conditions C1-C5 described in Wu & Rao (2006). In addition, we assume that

C6. The q-weights satisfy $N^{-1} \sum_{i=1}^{N} q_i^2 = O(1)$.

Under conditions C1-C6, we have that $N^{-1} \sum_{i=1}^{N} q_i x_i x_i^T = O(1)$ and $N^{-1} \sum_{i=1}^{N} q_i x_i y_i = O(1)$. Let $\hat w = (\hat w_1, \ldots, \hat w_n)$, where the $\hat w_i$ are computed by (4) and (5). By standard asymptotic theory of calibration estimation (Deville & Särndal, 1992), we have $\hat\lambda = O_p(n^{-1/2})$ and

$$\hat\lambda = \Bigl( \sum_{i \in S} d_i q_i x_i x_i^T \Bigr)^{-1} (\hat t_x - t_x) + o_p(n^{-1/2}), \qquad (7)$$

where $\hat t_x = \sum_{i \in S} d_i x_i$. This leads to the following asymptotic expansion: $\hat t_{EL} = \sum_{i \in S} \hat w_i y_i = \hat t_{GREG} + o_p(N n^{-1/2})$, where

$$\hat t_{GREG} = \hat t_y + \hat B^T (t_x - \hat t_x) \qquad (8)$$

with $\hat t_y = \sum_{i \in S} d_i y_i$ and

$$\hat B = \Bigl( \sum_{i \in S} d_i q_i x_i x_i^T \Bigr)^{-1} \sum_{i \in S} d_i q_i x_i y_i. \qquad (9)$$

The estimator (8) is known as the generalized regression estimator for a general choice of $q_i$ (Särndal, Swensson, & Wretman).

Let $\tilde w(\theta) = (\tilde w_1(\theta), \ldots, \tilde w_n(\theta))$, where the weights $\tilde w_i(\theta)$ minimize $EL(d, w)$ subject to (1) and

$$\sum_{i \in S} w_i y_i = \theta \qquad (10)$$

for a given $\theta$. The GPEL ratio statistic for $\theta$ is defined as $r(\theta) = EL(d, \hat w) - EL(d, \tilde w(\theta))$. We have the following result on the asymptotic distribution of $r(\theta)$.

Theorem 1. Under regularity conditions C1-C6, the adjusted GPEL ratio statistic $-2 r(\theta) / C$ converges in distribution to a $\chi^2$ random variable with one degree of freedom when $\theta = t_y$. The scaling constant $C$ is given by

$$C = V_p(\hat\eta) \Big/ \Bigl( \sum_{i=1}^{N} q_i e_i^2 \Bigr), \qquad (11)$$

where $\hat\eta = \sum_{i \in S} d_i e_i$, $e_i = y_i - B^T x_i$, and $B = \bigl( \sum_{i=1}^{N} q_i x_i x_i^T \bigr)^{-1} \bigl( \sum_{i=1}^{N} q_i x_i y_i \bigr)$, which can be consistently estimated by $\hat B$ defined in (9).

In practice, the scaling factor $C$ needs to be estimated by a consistent estimator $\hat C$, which involves variance estimation for $\hat\eta$. This can be handled similarly as in Wu & Rao (2006). A bootstrap procedure described in Section 3 can also be used to circumvent the estimation of $C$ for

single-stage sampling designs with negligible sampling fractions. For rejective or high-entropy sampling designs, estimation of $C$ is also not required, as shown in Corollaries 1 and 2 below.

An interesting special case of Theorem 1 is obtained for the calibrated likelihood estimator (6) of Tan (2013) under rejective sampling, using the weight factor $q_i = \pi_i^{-1} - 1$ and including $\pi_i$ as a calibration variable in $x_i$. As defined in Hájek (1964), rejective sampling is Poisson sampling conditional on a fixed sample size. For example, simple random sampling without replacement corresponds to rejective sampling with constant inclusion probabilities. In this case the scaling constant $C$ is asymptotically equal to 1. We assume that $\liminf_{N \to \infty} N^{-1} \sum_{i=1}^{N} (\pi_i^{-1} - 1) e_i^2 > 0$.

Corollary 1. Let $q_i = \pi_i^{-1} - 1$ and assume that $\pi_i$ is included as a component of $x_i$. Under rejective sampling and the regularity conditions stated in Theorem 1 of Tan (2013), we have $\lim_{N \to \infty} C = 1$ and $-2 r(\theta)$ converges in distribution to a $\chi^2$ random variable with one degree of freedom when $\theta = t_y$.

A heuristic explanation for the simplification of $C$ is as follows; see also Tan (2013). Under Poisson sampling, $V_p(\hat\eta) = \sum_{i=1}^{N} (\pi_i^{-1} - 1) e_i^2$ and hence $C$ is exactly equal to 1. But rejective sampling of size $n$ is defined as Poisson sampling conditional on a fixed sample size $n$, that is, $\sum_{i \in S} d_i \pi_i = n$. Hence under rejective sampling, $V_p(\hat\eta)$ is asymptotically equal to the "residual" variance $\sum_{i=1}^{N} (\pi_i^{-1} - 1)(e_i - b \pi_i)^2$, with $b = \{\sum_{i=1}^{N} \pi_i (1 - \pi_i)\}^{-1} \{\sum_{i=1}^{N} (1 - \pi_i) e_i\}$, which then reduces to 0 by the definition of $e_i$ and the fact that $\pi_i$ is included as a component of $x_i$. Such an argument is also implicit in Berger, Tirari, & Tillé (2003) on optimal regression estimation. In fact, under single-stage rejective sampling, the estimator of Berger et al.
reduces to the same estimator, up to some minor difference, as the calibrated regression estimator of Tan (2013), taking the form of $\hat t_{GREG}$ in (8) with $q_i = \pi_i^{-1} - 1$.

Incidentally, when both using $q_i = \pi_i^{-1} - 1$ and including $\pi_i$ in $x_i$, the calibrated regression estimator can also be expressed in the cosmetic form of linear prediction estimators (Särndal & Wright, 1984), i.e., $\hat t_{GREG} = \sum_{i \in S} y_i + \sum_{i \notin S} \hat B^T x_i$, as shown in Tan (2013). Fuller (2009) and Park & Kim (2014) also contain discussions on the topic. These choices of $q_i$ and $x_i$ satisfy a general construction of cosmetic calibration estimators in Brewer (1999). In Brewer's notation, $Z_s$ is taken here to be diagonal with diagonal elements $\pi_i$. The condition $Z_s 1_n = X_s \alpha$ holds because the $\pi_i$'s constitute a column of $X_s$. But, in general, Brewer's (1999) proposal does not imply setting $q_i = \pi_i^{-1} - 1$.

Similarly as in Tan (2013), Corollary 1 can be generalized from rejective sampling to other high-entropy sampling methods such that the Kullback-Leibler divergence from rejective sampling tends to 0. In particular, the Rao-Sampford sampling method (Rao, 1965; Sampford, 1967) is an example of high-entropy sampling provided that $\sum_{i=1}^{N} \pi_i (1 - \pi_i) \to \infty$ (Berger, 1998), which is already implied by the regularity conditions in Tan (2013, Theorem 1).

Corollary 2. The same result as in Corollary 1 holds if the rejective sampling procedure is replaced by the Rao-Sampford sampling procedure.

2.3. Computational Procedures for GPEL

The basic computational problem is to minimize $EL(d, w)$ with respect to $w = (w_1, \ldots, w_n)$ subject to constraint (1). The resulting $\hat w_i$ is given by (4) with the Lagrange multiplier $\hat\lambda$ being the solution to (5). The key to our computational algorithms is that the required constrained

minimization with respect to $w$ is a dual problem of maximizing

$$K(\lambda) = \sum_{i \in S} q_i^{-1} d_i \log(1 + q_i x_i^T \lambda) - t_x^T \lambda$$

with respect to $\lambda$ within the set $\Lambda = \{\lambda : 1 + q_i x_i^T \lambda > 0, i \in S\}$, since (5) is equivalent to

$$\Delta_1(\lambda) = \frac{\partial}{\partial \lambda} K(\lambda) = \sum_{i \in S} \frac{d_i x_i}{1 + q_i x_i^T \lambda} - t_x = 0. \qquad (12)$$

Note that $K(\lambda)$ is a concave function of $\lambda$, since the matrix

$$\Delta_2(\lambda) = \frac{\partial^2}{\partial \lambda \, \partial \lambda^T} K(\lambda) = -\sum_{i \in S} \frac{d_i q_i x_i x_i^T}{(1 + q_i x_i^T \lambda)^2} \qquad (13)$$

is negative definite. This duality property was also observed in Tan (2010) for the calibrated likelihood estimator, where, up to an additive constant, $K(\lambda) = \sum_{i \in S} \log\{\pi_i + (1 - \pi_i) x_i^T \lambda\} / (1 - \pi_i) - t_x^T \lambda$. The solution to (5) can be found using the modified Newton-Raphson procedures of Chen, Sitter, & Wu (2002) with $\Delta_1(\lambda)$ and $\Delta_2(\lambda)$ defined in (12) and (13).

3. A BOOTSTRAP PROCEDURE

Results from Theorem 1 can be used to construct $1 - \alpha$ level confidence intervals for the population total $\theta = t_y$ in the form of $\{\theta \mid -2 r(\theta) / C < \chi^2_1(\alpha)\}$, where $\chi^2_1(\alpha)$ is the upper $100\alpha$th percentile of the $\chi^2_1$ distribution. Under an arbitrary unequal probability sampling design, the scaling constant $C$ needs to be estimated, which involves variance estimation for $\hat\eta$. For single-stage unequal probability sampling designs with negligible sampling fractions, the scaling constant can be circumvented through a bootstrap calibration method. The bootstrap procedure also provides a useful alternative to the chi-square approximation for rejective or high-entropy sampling under Corollaries 1 and 2, where $C$ can be replaced by $\hat C = 1$.

The bootstrap procedure introduced here is similar to the with-replacement bootstrap procedure described in Wu & Rao (2010) for the pseudo empirical likelihood method. The bootstrap calibrated $1 - \alpha$ level confidence intervals on $\theta = t_y$ using the unscaled GPEL ratio statistic are constructed as $\{\theta \mid -2 r(\theta) < b_\alpha\}$, where $b_\alpha$ is the upper $100\alpha$th percentile of the sampling distribution of $-2 r(\theta)$. The bootstrap procedure provides a Monte Carlo approximation to $b_\alpha$.
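The dual formulation of Section 2.3 and the ratio statistic $r(\theta)$ that enters the intervals above can be sketched in code. The example below is a minimal illustration under stated assumptions, not the authors' software: it maximizes $K(\lambda)$ by Newton steps with simple step halving (a stand-in for the modified Newton-Raphson procedure of Chen, Sitter, & Wu, 2002), uses synthetic data, and all function names (`solve_linear`, `lin`, `dual_K`, `gpel_weights`, `el_distance`, `gpel_ratio`) are hypothetical. The scaling constant $C$ is not estimated here, so only the unscaled quantity $-2 r(\theta)$ is reported.

```python
import math
import random


def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [list(A[i]) + [b[i]] for i in range(m)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * m
    for r in range(m - 1, -1, -1):
        z[r] = (M[r][m] - sum(M[r][k] * z[k] for k in range(r + 1, m))) / M[r][r]
    return z


def lin(xi, lam):
    """Inner product x_i' lam."""
    return sum(a * l for a, l in zip(xi, lam))


def dual_K(lam, d, q, X, t):
    """K(lam) = sum_i (d_i / q_i) log(1 + q_i x_i' lam) - t' lam; -inf outside the domain."""
    total = -lin(t, lam)
    for di, qi, xi in zip(d, q, X):
        u = 1.0 + qi * lin(xi, lam)
        if u <= 0.0:
            return -math.inf
        total += di / qi * math.log(u)
    return total


def gpel_weights(d, q, X, t, tol=1e-10, max_iter=200):
    """Maximise K(lam) by Newton steps with step halving; return w_i as in (4)."""
    m = len(t)
    lam = [0.0] * m
    for _ in range(max_iter):
        u = [1.0 + qi * lin(xi, lam) for qi, xi in zip(q, X)]
        grad = [sum(di * xi[j] / ui for di, xi, ui in zip(d, X, u)) - t[j] for j in range(m)]
        if all(abs(g) <= tol * (1.0 + abs(tj)) for g, tj in zip(grad, t)):
            break
        neg_hess = [[sum(di * qi * xi[j] * xi[k] / ui ** 2
                         for di, qi, xi, ui in zip(d, q, X, u))
                     for k in range(m)] for j in range(m)]
        step = solve_linear(neg_hess, grad)  # Newton ascent direction
        k_old = dual_K(lam, d, q, X, t)
        slack = 1e-12 * (1.0 + abs(k_old))   # tolerate rounding noise near the optimum
        s = 1.0
        while s > 1e-8 and dual_K([l + s * z for l, z in zip(lam, step)], d, q, X, t) < k_old - slack:
            s /= 2.0
        lam = [l + s * z for l, z in zip(lam, step)]
    return [di / (1.0 + qi * lin(xi, lam)) for di, qi, xi in zip(d, q, X)]


def el_distance(d, q, w):
    """EL(d, w) = -sum_i q_i^{-1} {d_i log(w_i / d_i) - w_i + d_i}."""
    return -sum((di * math.log(wi / di) - wi + di) / qi for di, qi, wi in zip(d, q, w))


def gpel_ratio(theta, d, q, X, y, t_x):
    """r(theta) = EL(d, w_hat) - EL(d, w_tilde(theta))."""
    w_hat = gpel_weights(d, q, X, t_x)
    X_aug = [list(xi) + [yi] for xi, yi in zip(X, y)]  # add the constraint sum w_i y_i = theta
    w_tilde = gpel_weights(d, q, X_aug, list(t_x) + [theta])
    return el_distance(d, q, w_hat) - el_distance(d, q, w_tilde)


# Synthetic illustration (all values are assumptions, not data from the paper).
random.seed(1)
n = 40
pi = [random.uniform(0.02, 0.2) for _ in range(n)]
x = [random.uniform(1.0, 5.0) for _ in range(n)]
y = [2.0 + 1.5 * xi + random.gauss(0.0, 1.0) for xi in x]
d = [1.0 / p for p in pi]
q = [1.0 / p - 1.0 for p in pi]
X = [[1.0, xi] for xi in x]  # the leading 1 encodes a known population size N
t_x = [1.02 * sum(d), 1.03 * sum(di * xi for di, xi in zip(d, x))]

w_hat = gpel_weights(d, q, X, t_x)
t_el = sum(wi * yi for wi, yi in zip(w_hat, y))
r_at_estimate = gpel_ratio(t_el, d, q, X, y, t_x)
r_shifted = gpel_ratio(1.02 * t_el, d, q, X, y, t_x)
print(t_el, -2.0 * r_at_estimate, -2.0 * r_shifted)
```

Because $\hat w$ minimizes $EL(d, w)$ over a larger feasible set than $\tilde w(\theta)$, $r(\theta) \le 0$ with equality at $\theta = \hat t_{EL}$; a confidence interval is then the set of $\theta$ values for which $-2 r(\theta)$, after scaling or bootstrap calibration, stays below the chosen cutoff.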
The most crucial part of the bootstrap method is to treat the survey weights $d_i$ and the q-weights $q_i$ as part of the sample data. Let $\{(d_i, q_i, x_i, y_i), i \in S\}$ be the original survey data set. Let $t_x$ be the known population totals for the x-variables and let $\hat t_{EL} = \sum_{i \in S} \hat w_i y_i$ be the calibration estimator of $t_y$ using the distance measure $EL(d, w)$. Our proposed bootstrap method consists of the following four steps:

[1] Select a bootstrap sample $S^*$ of size $n$ from the original sample $S$ using simple random sampling with replacement; denote the bootstrap sample data by $\{(d_i^*, q_i^*, x_i^*, y_i^*), i \in S^*\}$.

[2] Let the bootstrap version of $EL(d, w)$ be defined as

$$EL^*(d^*, w) = -\sum_{i \in S^*} (q_i^*)^{-1} \left\{ d_i^* \log\left(\frac{w_i}{d_i^*}\right) - w_i + d_i^* \right\}.$$

[3] Calculate the GPEL ratio statistic $r^*(\theta) = EL^*(d^*, \hat w^*) - EL^*(d^*, \tilde w^*(\theta))$ at $\theta = \hat t_{EL}$, where $\hat w^* = (\hat w_1^*, \ldots, \hat w_n^*)^T$ minimize $EL^*(d^*, w)$ subject to $\sum_{i \in S^*} w_i x_i^* = t_x$, and $\tilde w^*(\theta) = (\tilde w_1^*(\theta), \ldots, \tilde w_n^*(\theta))^T$ minimize $EL^*(d^*, w)$ subject to $\sum_{i \in S^*} w_i x_i^* = t_x$ and $\sum_{i \in S^*} w_i y_i^* = \hat t_{EL}$.

[4] Repeat Steps [1], [2], and [3] a large number of times, $B$, independently, to obtain the sequence $-2 r_1^*(\theta), \ldots, -2 r_B^*(\theta)$, all at $\theta = \hat t_{EL}$. Let $b_\alpha^*$ be the upper $100\alpha$th sample percentile from this sequence.

The proposed bootstrap method can be formally justified for single-stage unequal probability sampling designs with replacement; see the Appendix for details. The procedure also provides good approximations for single-stage unequal probability sampling designs without replacement if the sampling fraction is small. Treating survey designs with negligible sampling fractions as if the units are selected with replacement is a common practice in survey sampling for the purpose of variance estimation or other second-order analysis. The bootstrap calibrated confidence interval on $t_y$, constructed as $\{\theta \mid -2 r(\theta) < b_\alpha^*\}$, has approximately correct asymptotic coverage probability at the $1 - \alpha$ level.

4. SIMULATION STUDIES

We now report results from two simulation studies on the performances of the GPEL based estimators and GPEL ratio confidence intervals on a population total, with comparisons to the generalized regression estimators and the usual normal theory confidence intervals.

Study I. The finite population of size $N = 2{,}000$ used for the simulation was generated from the model $y_i = \beta x_i + 2 z_i + 0.5 \{x_i I(x_i < 2)\} z_i^{1/2} + \sigma \varepsilon_i$, where $x_i \sim \text{lognormal}(0, 1)$, $z_i \sim \chi^2_2$, $\varepsilon_i \sim N(0, 1)$, $I(\cdot)$ is the indicator function, and $\beta_0$ was chosen such that $y_i \ge 0$ for $i = 1, \ldots, N$. Two values of $\sigma$ were used such that the correlation coefficients, $\rho$, between the response variable $y_i$ and the linear predictor $\beta x_i + 2 z_i + 0.5 \{x_i I(x_i < 2)\} z_i^{1/2}$ are 0.80 and 0.50, respectively.
The finite population, once generated from the above model, was held fixed. Under this setting, the finite population correlation coefficients between $y$ and $x$ are respectively 0.46 and 0.30 for $\rho = 0.80$ and $\rho = 0.50$; the correlation coefficients between $y$ and $z$ are respectively 0.66 and 0.43 for the two corresponding values of $\rho$.

Single-stage unequal probability samples of size $n = 80$ were taken from the finite population, with inclusion probabilities $\pi_i$ proportional to $z_i + c$. Two values of $c$ were considered such that $\pi_{mm} = \pi_{max} / \pi_{min}$ equals 200 and 20, respectively, where $\pi_{max} = \max\{\pi_i, i = 1, \ldots, N\}$ and $\pi_{min} = \min\{\pi_i, i = 1, \ldots, N\}$. The Rao-Sampford unequal probability sampling method (Rao, 1965; Sampford, 1967) was used in selecting the samples. Note that the sampling fraction is 80/2,000 = 4%, which is small, and the Rao-Sampford method has high entropy (Berger, 1998). It should also be noted that the second-order inclusion probabilities $\pi_{ij}$ can be computed exactly for the Rao-Sampford sampling method.

We considered two choices of q-weights: $q_i = 1$ and $q_i = \pi_i^{-1} - 1$. This gives a total of eight different scenarios with respect to the choices of $\rho$, $\pi_{mm}$ and $q_i$. For each scenario, five point estimators of the population total $t_y$ were computed: (1) the basic Horvitz-Thompson estimator (HT); (2) the generalized regression estimator calibrated over $x_i$ (GREG-1); (3) the generalized regression estimator calibrated over $(x_i, \pi_i, 1)$ (GREG-2); (4) the GPEL based estimator calibrated over $x_i$ (GPEL-1); and (5) the GPEL based estimator calibrated over $(x_i, \pi_i, 1)$ (GPEL-2).

Table 1: Relative root mean square error ($\times 10^{3}$) of point estimators (Study I). Columns: $q_i$, $\rho$, $\pi_{mm}$, HT, GREG-1, GREG-2, GPEL-1, GPEL-2; rows for $q_i = 1$ and $q_i = \pi_i^{-1} - 1$.

Performances of a point estimator $\hat t_y$ of the population total $t_y$ are evaluated in terms of simulated relative bias (RB) and relative root mean square error (RRMSE), defined as $RB = K^{-1} \sum_{k=1}^{K} \{\hat t_y^{(k)} - t_y\} / t_y$ and $RRMSE = (MSE)^{1/2} / t_y$, where $\hat t_y^{(k)}$ is the estimator computed from the $k$th simulated sample, $MSE = K^{-1} \sum_{k=1}^{K} \{\hat t_y^{(k)} - t_y\}^2$, and $K$ is the total number of simulation runs. All five estimators demonstrated negligible biases ($|RB| < 3\%$) for all cases. Details are not included here to save space.

The simulated values of RRMSE are summarized in Table 1. The results for the Horvitz-Thompson estimator are reported from two independent simulations for the two choices of q-weights. It can be seen from the table that (i) the design with less variable weights ($\pi_{mm} = \pi_{max} / \pi_{min} = 20$) provides better results than the design with more variable weights ($\pi_{mm} = 200$); (ii) including the design variable (i.e., the inclusion probabilities $\pi_i$) and the constant 1 in the calibration equations gives significantly more accurate estimation; (iii) the GPEL based calibration estimators are at least as efficient as the generalized regression estimators; and (iv) the two choices of q-weights lead to similar results. A possible explanation for (iv) is that the mean of $y_i$ given $(x_i, z_i)$ under the simulation model is only moderately nonlinear, depending mainly on $x_i$ instead of $z_i$ or $\pi_i$. Using $q_i = \pi_i^{-1} - 1$ may lead to more noticeable gains of efficiency when the linear relationship is more seriously misspecified and the nonlinearity occurs in the region where $\pi_i$ is close to 1, as discussed in Section 2.1.
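The two accuracy measures defined above are simple to compute once the simulated estimates are stored. The following sketch uses a hypothetical helper name, `relative_bias_and_rrmse`, and a made-up list of five estimates purely for illustration; it is not taken from the paper.

```python
import math


def relative_bias_and_rrmse(estimates, t_y):
    """Return (RB, RRMSE) for a list of simulated estimates of the total t_y."""
    k = len(estimates)
    rb = sum(est - t_y for est in estimates) / (k * t_y)
    mse = sum((est - t_y) ** 2 for est in estimates) / k
    return rb, math.sqrt(mse) / t_y


# Made-up estimates from K = 5 hypothetical simulation runs, true total 100.
estimates = [98.0, 103.0, 101.0, 96.0, 102.0]
rb, rrmse = relative_bias_and_rrmse(estimates, 100.0)
print(rb, rrmse)
```

In a table such as Table 1, the RRMSE values would additionally be multiplied by $10^{3}$ before reporting.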
We considered five methods for constructing confidence intervals on the population total $t_y$: (1) the normal theory interval based on the Horvitz-Thompson estimator, denoted as HT(NT); (2) the normal theory interval based on the generalized regression estimator, denoted as GREG(NT); (3) the profile GPEL ratio interval based on the scaled $\chi^2$ distribution described in Theorem 1, denoted as GPEL($\hat C$), where $\hat C$ is the estimated scaling constant; (4) the profile GPEL ratio interval with $C = 1$ when $q_i = \pi_i^{-1} - 1$, denoted as GPEL($C = 1$); and (5) the bootstrap calibrated GPEL ratio interval described in Section 3, denoted as GPEL(Boot), where $B = 1{,}000$ bootstrap samples were used for each simulation run. Let $(\hat\theta_1^{(k)}, \hat\theta_2^{(k)})$ be a confidence interval on $\theta = t_y$ obtained from the $k$th simulated sample using a particular method. Performances of the interval are measured by the (relative) average length (AL), lower (L) and upper (U) tail error rates, and coverage probability (CP), computed,

respectively as

$$\mathrm{AL} = \frac{1}{K}\sum_{k=1}^{K} \frac{\hat\theta_2^{(k)} - \hat\theta_1^{(k)}}{t_y}, \qquad \mathrm{L} = \Big\{\frac{1}{K}\sum_{k=1}^{K} I\big(t_y \le \hat\theta_1^{(k)}\big)\Big\} \times 100,$$

$$\mathrm{U} = \Big\{\frac{1}{K}\sum_{k=1}^{K} I\big(t_y \ge \hat\theta_2^{(k)}\big)\Big\} \times 100, \qquad \mathrm{CP} = \Big\{\frac{1}{K}\sum_{k=1}^{K} I\big(\hat\theta_1^{(k)} < t_y < \hat\theta_2^{(k)}\big)\Big\} \times 100.$$

Note that L + CP + U = 100. Table 2 reports results on confidence intervals where only $x_i$ is used in calibration for GREG and GPEL. Table 3 summarizes results with $(x_i, \pi_i, 1)$ used for calibration.

Table 2: Coverage probabilities and average length of 95% CI (Study I; calibrated over $x_i$). [Table entries not recoverable; columns: $q_i$, $\rho$, $\pi_{mm}$, then AL, U, CP, L for each of HT(NT), GREG(NT), GPEL($\hat C$), GPEL($C = 1$), GPEL(Boot).]

Table 3: Coverage probabilities and average length of 95% CI (Study I; calibrated over $(x_i, \pi_i, 1)$). [Table entries not recoverable; same layout as Table 2.]
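The four interval measures AL, L, U and CP can be computed from the simulated intervals along the following lines. This is a sketch under stated assumptions: the function name, the toy intervals, and the convention that an interval endpoint equal to $t_y$ counts as a tail error are choices made for this example, picked so that L + CP + U = 100 holds exactly.

```python
def interval_metrics(intervals, true_total):
    """Summarize K simulated confidence intervals (lower, upper) for t_y.

    Returns the relative average length AL and the percentages L, U, CP.
    Assumes lower < upper for every interval.
    """
    K = len(intervals)
    avg_len = sum((hi - lo) / true_total for lo, hi in intervals) / K
    # Lower tail error: the true total falls at or below the lower limit.
    lower = 100.0 * sum(1 for lo, hi in intervals if true_total <= lo) / K
    # Upper tail error: the true total falls at or above the upper limit.
    upper = 100.0 * sum(1 for lo, hi in intervals if true_total >= hi) / K
    # Coverage: the true total lies strictly inside the interval.
    cover = 100.0 * sum(1 for lo, hi in intervals if lo < true_total < hi) / K
    return {"AL": avg_len, "L": lower, "U": upper, "CP": cover}


# Hypothetical intervals from K = 4 simulated samples, with t_y = 100.
toy_intervals = [(90.0, 110.0), (101.0, 120.0), (80.0, 99.0), (95.0, 105.0)]
metrics = interval_metrics(toy_intervals, 100.0)
print(metrics)
```

Because the three indicator events partition the possible positions of $t_y$ relative to each interval, the percentages always sum to 100, which gives a quick sanity check on any implementation.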

Table 4: Relative root mean square error ($\times 10^{3}$) of point estimators (Study II). [Table entries not recoverable; columns: $q_i$, $\pi_{mm}$, HT, GREG-1, GREG-2, GPEL-1, GPEL-2.]

Major observations from Tables 2 and 3 can be summarized as follows: (i) the GREG(NT) and GPEL($\hat C$) intervals are associated with both greater average lengths and lower coverage probabilities in the two tables under the design with more variable weights ($\pi_{mm} = 200$) than under the design with less variable weights ($\pi_{mm} = 20$). This demonstrates the challenges of dealing with highly variable sampling weights. (ii) While the average lengths of the GREG(NT) and GPEL($\hat C$) intervals are similar, the coverage probabilities of GPEL($\hat C$) are consistently higher and closer to 95% than those of GREG(NT) in the two tables. (iii) Including $(\pi_i, 1)$ in the calibration (i.e., comparing Table 3 with Table 2) significantly reduces average lengths for the GREG(NT) and GPEL($\hat C$) intervals under the two designs of $\pi_{mm}$, echoing the previous results on RRMSE. But the coverage probabilities for both methods decrease noticeably under the design with $\pi_{mm} = 200$, although not so under the design with $\pi_{mm} = 20$. (iv) The GPEL($C = 1$) intervals with $q_i = \pi_i^{-1} - 1$ seem to perform well even if $\pi_i$ is not included in calibration (Table 2). The method becomes almost identical to GPEL($\hat C$) when $(\pi_i, 1)$ is included in calibration (Table 3). (v) The bootstrap intervals GPEL(Boot) perform remarkably well, in terms of coverage probabilities, for all the cases, especially under the design with $\pi_{mm} = 200$. The average lengths are slightly inflated as compared to GREG(NT) or GPEL($\hat C$).

Study II. In this simulation study we used a real survey data set from the 2000 Statistics Canada Family Expenditure Survey for the province of Ontario. The data set contains $N = 2248$ observations, with measurements on $x_i$: number of people in the household; $z_i$: annual income; $y_i$: total expenditure. Chen, Sitter, & Wu (2002) contains a detailed description of the data set.
We treated the data as the finite population, and conducted the same types of simulation as in Study I. Once again, the Rao-Sampford sampling method was used and the sample size was set at $n = 80$. Both $\pi_{mm} = 200$ and $\pi_{mm} = 20$ are considered. Results are summarized in Tables 4 and 5. The first column in Table 5 indicates the calibration variables (CV) used in the related method ($x_i$ only versus $(x_i, \pi_i, 1)$). Most of the observations from Study I remain true, except that the GREG(NT) and the GPEL($\hat C$) intervals have much better performances. Low coverage probabilities do not seem to be an issue with the current study.

5. CONCLUSION

Calibration estimation using auxiliary information has been extensively studied in the survey literature. Choices among alternative approaches depend on (i) the flexibility in obtaining efficient point estimators; (ii) the efficiency and reliability of computational procedures; and (iii) the capacity for drawing inferences beyond point estimation such as constructing confidence intervals or conducting hypothesis tests. The generalized pseudo empirical likelihood approach has shown advantages in all three aforementioned aspects. In practice, we recommend including $\pi_i$ as a calibration variable, and using the choice $q_i = \pi_i^{-1} - 1$ especially when $y_i$ and $x_i$ are suspected

to have a strong nonlinear relationship. Confidence intervals can be obtained by using the adjusted GPEL ratio statistic or, when sampling fractions are negligible, by the bootstrap procedure. While inferences on population totals are the main focus in the current paper, extensions to parameters defined through general estimating equations, including regression and logistic regression coefficients, are the natural topic for further development. Moreover, extensions to multistage sampling designs and extensions to analyzing imputed survey data are currently under investigation.

Table 5: Coverage probabilities and average length of 95% CI (Study II). [Table entries not recoverable; columns: CV, $q_i$, $\pi_{mm}$, then AL, U, CP, L for each of HT(NT), GREG(NT), GPEL($\hat C$), GPEL($C = 1$), GPEL(Boot); row groups: $x_i$ and $(x_i, \pi_i, 1)$.]

APPENDIX

Proof of Theorem 1. Note that $\hat w_i$ are computed by (4) and (5) and $\hat w_i/d_i = 1/(1 + q_i x_i^{T}\hat\lambda)$. For $u_i = o(1)$, we have $\log(1 + u_i) = u_i - u_i^{2}/2 + O(u_i^{3})$ and $(1 + u_i)^{-1} = 1 - u_i + u_i^{2} + O(u_i^{3})$. Under conditions C1-C6, we have $\hat\lambda = O_p(n^{-1/2})$ and $\max_{i \in S} q_i \|x_i\| = o_p(n^{1/2})$. It follows that $u_i = q_i x_i^{T}\hat\lambda = o_p(1)$ uniformly over all $i \in S$. This together with (7) leads to

$$\begin{aligned}
EL(\hat w, d) &= -\sum_{i \in S} q_i^{-1} d_i \Big\{ \log\big(1 + q_i x_i^{T}\hat\lambda\big) + \big(1 + q_i x_i^{T}\hat\lambda\big)^{-1} - 1 \Big\} \\
&= -\frac{1}{2}\, \hat\lambda^{T} \Big( \sum_{i \in S} d_i q_i x_i x_i^{T} \Big) \hat\lambda + o_p\Big(\frac{N}{n}\Big) \\
&= -\frac{1}{2}\, (\hat t_x - t_x)^{T} \Big( \sum_{i \in S} d_i q_i x_i x_i^{T} \Big)^{-1} (\hat t_x - t_x) + o_p\Big(\frac{N}{n}\Big) \\
&= -\frac{1}{2}\, (\hat t_x - t_x)^{T} \Big( \sum_{i=1}^{N} q_i x_i x_i^{T} \Big)^{-1} (\hat t_x - t_x) + o_p\Big(\frac{N}{n}\Big).
\end{aligned}$$

To derive an asymptotic expansion for $EL(\tilde w(\theta), d)$ with the additional constraint (10), we follow the same technique used in the proof of Theorem 2 of Wu & Rao (2006). For $\theta = t_y$, it can be shown that

$$EL(\tilde w(\theta), d) = -\frac{1}{2}\, (\hat t_z - t_z)^{T} \Big( \sum_{i=1}^{N} q_i z_i z_i^{T} \Big)^{-1} (\hat t_z - t_z) + o_p\Big(\frac{N}{n}\Big),$$

where $z_i = (x_i^{T}, y_i)^{T}$. The last expression remains valid if $z_i$ is defined by the linear transformation $z_i = (x_i^{T}, e_i)^{T}$, where $e_i = y_i - B^{T} x_i$. Then $\sum_{i=1}^{N} q_i z_i z_i^{T}$ is block-diagonal, and

$$r(\theta) = EL(\hat w, d) - EL(\tilde w(\theta), d) = \frac{1}{2}\, \frac{(\hat\eta - \eta)^{2}}{\sum_{i=1}^{N} q_i e_i^{2}} + o_p\Big(\frac{N}{n}\Big),$$

where $\eta = \sum_{i=1}^{N} e_i = t_y - B^{T} t_x$, $\hat\eta = \sum_{i \in S} d_i e_i$ and

$$B = \Big( \sum_{i=1}^{N} q_i x_i x_i^{T} \Big)^{-1} \sum_{i=1}^{N} q_i x_i y_i.$$

Therefore, $2r(\theta)/C$ converges in distribution to a $\chi^2$ random variable with one degree of freedom when $\theta = t_y$. Note that $V_p(\hat\eta) = O(N^{2}/n)$ and $C = O(N/n)$.

Proof of Corollary 1. The result follows directly from the fact that $V_p\big(\sum_{i \in S} e_i/\pi_i\big) = \sum_{i=1}^{N} (\pi_i^{-1} - 1) e_i^{2} + o(n)$ by Tan (2013, Lemma 1) under rejective sampling.

Justification of the Bootstrap Method. The first key result is from Theorem 1, which states that $2r(\theta)/C \to \chi^2_1$ in distribution when $\theta = t_y$, where the scaling constant is given by $C = V_p(\hat\eta)/\big(\sum_{i=1}^{N} q_i e_i^{2}\big)$ with $\hat\eta = \sum_{i \in S} d_i e_i$ and $e_i = y_i - B^{T} x_i$. Note that $V_p(\cdot)$ denotes the design-based variance. The second key result is the parallel development on the bootstrap version of the GPEL ratio statistic, following the exact steps used in the proof of Theorem 1, which shows that $2r^{*}(\theta)/C^{*} \to \chi^2_1$ in distribution when $\theta = \hat t_{EL}$. The scaling constant is given by $C^{*} = V^{*}(\hat\eta^{*} \mid S)/\big(\sum_{i \in S} q_i d_i \hat e_i^{2}\big)$, where $\hat\eta^{*} = \sum_{i \in S^{*}} d_i^{*} e_i^{*}$, $e_i^{*} = y_i^{*} - \hat B^{T} x_i^{*}$, $\hat e_i = y_i - \hat B^{T} x_i$, and $V^{*}(\cdot \mid S)$ denotes the variance under the bootstrap sampling procedure, conditional on the original survey sample. Let $z_i = \pi_i/n$ and $z_i^{*} = \pi_i^{*}/n$. It follows that $d_i^{*} = 1/\pi_i^{*} = (1/z_i^{*})/n$ and $\hat\eta^{*} = n^{-1} \sum_{i \in S^{*}} r_i^{*}$, where $r_i^{*} = e_i^{*}/z_i^{*}$. Similarly, we also have $d_i = (1/z_i)/n$ and $\hat\eta = n^{-1} \sum_{i \in S} r_i$, where $r_i = e_i/z_i$. Under the proposed with-replacement bootstrap procedure, we have $V^{*}(\hat\eta^{*} \mid S) = S_r^{2}/n$, where $S_r^{2} = n^{-1} \sum_{i \in S} (r_i - \hat\eta)^{2}$. If the original survey sample is selected by a single-stage unequal probability sampling design with replacement, then $\hat\eta = n^{-1} \sum_{i \in S} e_i/z_i$ is the standard Hansen-Hurwitz estimator. The design-based variance $V_p(\hat\eta)$ can be unbiasedly estimated by $n^{-1}\big\{ (n-1)^{-1} \sum_{i \in S} (r_i - \hat\eta)^{2} \big\}$. In this case, we have $C^{*}/C \to 1$ as $n$ gets large, and the bootstrap version $2r^{*}(\theta)$ and the original version $2r(\theta)$ follow asymptotically the same scaled $\chi^2$ distribution. The bootstrap percentile $b_\alpha^{*}$ is a consistent estimator of the true percentile $b_\alpha$.

ACKNOWLEDGEMENTS

The research of Z. Tan was supported by a grant from the Natural Science Foundation of United States. The research of C. Wu was supported by a grant from the Natural Sciences and Engineering Research Council of Canada.

BIBLIOGRAPHY

Berger, Y. G. (1998). Rate of convergence to normal distribution for the Horvitz-Thompson estimator. Journal of Statistical Planning and Inference, 67.
Berger, Y. G., Tirari, E. H. M., & Tillé, Y. (2003). Towards optimal regression estimation in sample surveys. Australian & New Zealand Journal of Statistics, 45.
Brewer, K. R. W. (1999). Cosmetic calibration with unequal probability sampling. Survey Methodology, 25.
Chen, J. & Sitter, R. R. (1999). A pseudo empirical likelihood approach to the effective use of auxiliary information in complex surveys. Statistica Sinica, 12.
Chen, J., Sitter, R. R., & Wu, C. (2002). Using empirical likelihood methods to obtain range restricted weights in regression estimators for surveys. Biometrika, 89.
Chen, S. & Kim, J. K. (2014). Population empirical likelihood for nonparametric inference in survey sampling. Statistica Sinica, 24.
Deville, J. C. & Särndal, C. E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87.
DiCiccio, T. J. & Romano, J. P. (1990). Nonparametric confidence limits by resampling methods and least favorable families. International Statistical Review, 58.
Folsom, R. E. (1991). Exponential and logistic weight adjustment for sampling and nonresponse error reduction. Proceedings of the Section on Social Statistics, American Statistical Association.
Fuller, W. A. (2002). Regression estimation for survey samples. Survey Methodology, 28.
Fuller, W. A. (2009). Sampling Statistics, John Wiley & Sons, Inc., Hoboken, New Jersey.
Fuller, W. A. & Isaki, C. T. (1981). Survey design under superpopulation models. In Current Topics in Survey Sampling, Krewski, D., Rao, J. N. K., & Platek, R., editors. Academic Press, New York.
Hajek, J. (1964). Asymptotic theory of rejective sampling with varying probabilities from a finite population. Annals of Mathematical Statistics, 35.
Kim, J. K. (2010). Calibration estimation using exponential tilting in sample surveys. Survey Methodology, 36.
Kullback, S. (1959). Information Theory and Statistics, Wiley, New York.
Kullback, S. & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22.
Montanari, G. E. (1987). Post-sampling efficient QR-prediction in large-scale surveys. International Statistical Review, 55.
Owen, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75.
Owen, A. B. (2001). Empirical Likelihood, Chapman & Hall/CRC, New York.
Park, S. & Kim, J. K. (2014). Instrumental-variable calibration estimation in survey sampling. Statistica Sinica, 24.
Qin, J. & Lawless, J. (1994). Empirical likelihood and general estimating equations. Annals of Statistics, 22.
Rao, J. N. K. (1994). Estimating totals and distribution functions using auxiliary information at the estimation stage. Journal of Official Statistics, 10.
Rao, J. N. K. (1965). On two simple schemes of unequal probability sampling without replacement. Journal of the Indian Statistical Association, 3.
Sampford, M. R. (1967). On sampling without replacement with unequal probabilities of selection. Biometrika, 54.
Särndal, C. E. (2007). The calibration approach in survey theory and practice. Survey Methodology, 33.
Särndal, C. E., Swensson, B., & Wretman, J. H. (1992). Model-Assisted Survey Sampling, Springer-Verlag, New York.
Särndal, C. E. & Wright, R. L. (1984). Cosmetic form of estimators in survey sampling. Scandinavian Journal of Statistics, 11.
Tan, Z. (2006). A distributional approach for causal inference using propensity scores. Journal of the American Statistical Association, 101.
Tan, Z. (2010). Bounded, efficient and doubly robust estimation with inverse weighting. Biometrika, 97.
Tan, Z. (2013). Simple design-efficient calibration estimators for rejective and high-entropy sampling. Biometrika, 100.
Wu, C. (2004). Weighted empirical likelihood inference. Statistics & Probability Letters, 66.
Wu, C. & Rao, J. N. K. (2006). Pseudo-empirical likelihood ratio confidence intervals for complex surveys. The Canadian Journal of Statistics, 34.
Wu, C. & Rao, J. N. K. (2010). Bootstrap procedures for the pseudo empirical likelihood method in sample surveys. Statistics and Probability Letters, 80.

Received 9 October 2013
Accepted 12 October 2014
