Optimal Calibration Estimators Under Two-Phase Sampling
|
|
- Francis Brown
- 5 years ago
- Views:
Transcription
1 Journal of Of cial Statistics, Vol. 19, No. 2, 2003, pp. 119±131 Optimal Calibration Estimators Under Two-Phase Sampling Changbao Wu 1 and Ying Luan 2 Optimal calibration estimators require in general complete auxiliary information. When such information is not available, the estimation procedure can be combined with two-phase sampling where a large, less costly rst-phase sample measured over the auxiliary variables is used to get estimates for certain population quantities related to the covariates. In this article we propose optimal calibration estimators for the population mean, the distribution function, the population variance and other second-order nite population quantities under two-phase sampling. The proposed optimal calibration estimators for a second-order nite population quantity such as the population variance can ideally be used to obtain more ef cient variance estimators for a rst-order nite population quantity such as the total or the distribution function. The estimation strategy can be applied to various measurement error and nonresponse problems. The design-based nite sample performances of proposed estimators are investigated through a simulation study using real survey data from the 1996 Statistics Canada Family Expenditure FAME) Survey. Key words: Auxiliary information; measurement error; model-assisted approach; nonresponse; variance estimation. 1. Introduction Many highly ef cient estimation techniques in survey sampling require strong information about auxiliary variables, x. For example, the calibration estimator of Deville and SaÈrndal 1992) requires, the population totals. When such information is not available, a twophase sampling scheme can be used where a large, less costly rst-phase sample measured over the x variable is used to obtain a good estimate of. A smaller and more costly second-phase sample can then be taken and the study variable y is observed. The major advantage of using two-phase sampling is the gain in high precision without substantial increase in cost. SaÈrndal et al. 1992, Chapter 9) provide an excellent account of two-phase sampling. Recently, Wu and Sitter 2001) proposed a model-calibration approach to using auxiliary information from surveys. The proposed model-calibration estimator is not only highly ef cient but also optimal among a class of calibration estimators Wu 1 Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada. cbwu@uwaterloo.ca 2 Department of Statistics, University of California, Riverside, CA 92521, USA. Acknowledgments: This research was supported by a grant fromthe Natural Sciences and Engineering Research Council of Canada. The authors thank Professor Randy R. Sitter for helpful comments. Thanks are also due to the Associate Editor and three anonymous referees for constructive suggestions and comments that greatly improved the presentation. q Statistics Sweden
2 120 Journal of Of cial Statistics 2002). To compute the model-calibration estimator, however, one generally requires the values of the x variables to be known for the entire nite population. In practice this complete auxiliary information is often unavailable. Two-phase sampling provides an ideal solution for the use of optimal calibration estimators under such situations. Suppose U ˆf1; 2;...; Ng is the set of labels for the nite population and s is the set of labels for the sampled units. Let y i and x i be the values of the response variable y and the vector of covariates x associated with the i th unit. Let p i be the rst-order inclusion probabilities. Assuming the population totals ˆ P N i ˆ 1 x i are known, the conventional calibration estimator Deville and SaÈrndal 1992) for the nite population total Y ˆ P N i ˆ 1 y i is constructed as Y à C ˆ P w i y i, where the weights w i minimize a distance measure F s between the w i 's and the basic design weights d i ˆ 1=p i subject to the so-called benchmark constraints w i x i ˆ 1 The x i used in 1) are also referred to as calibration variables. There are two basic components in the construction of a calibration estimator: a distance measure F s and a set of constraints such as 1). The chi-squared distance measure F s ˆ P w i d i 2 = d i q i is commonly used, where the weighting factors q i are unrelated to d i. Other distance measures can also be considered but it has been shown by Deville and SaÈrndal 1992) that the resulting calibration estimator is asymptotically equivalent to the one using a chi-squared distance measure with a certain choice of q i. The benchmark constraints 1) which calibrate directly over the individual x variables are indeed ad hoc. There is no compelling reason to exclude the use of other types of constraints. Optimal calibration estimators have been studied by Wu 2002) under a model-assisted framework. The following semi-parametric superpopulation model was used to motivate the optimality considerations, E y y i x i ˆm x i ; v ; V y y i x i ˆ v x i Š 2 j 2 2 where m ; and v are known functions, v and j 2 are unknown model parameters, and E y, V y denote the expectation and the variance under the model, y. It was also assumed that y 1 ; y 2 ;...; y N are conditionally independent given the x i 's. Wu 2002) considered the class of calibration estimators for the population total Y obtained by using a) any chi-squared distance measure with the weighting factors satisfying q i > q for some constant q > 0andN 1 P N i ˆ 1 qi 2 ˆ O 1 ; b) any constraint through a dimension-reduction variable u i ˆ u x i ; v, i.e., w i u x i ; v ˆN u x i ; v 3 i ˆ 1 where the functional form of u ; can be arbitrary. Some of the major results can be summarized as follows: i) For any consistent estimator of v such that v à ˆ v o p 1, replacing v by v à in the constraint 3) will not change the resulting calibration estimator asymptotically. ii) Let v à P ˆ d i q i x i x T i 1 P d i q i x i y i. If we use u i ˆ xi T Ãv as calibration
3 Wu and Luan: Optimal Calibration Estimators Under Two-Phase Sampling 121 variable in 3), where T denotes transpose, the resulting calibration estimator is identical to the conventional calibration estimator à Y C using 1). Hence, the class of calibration estimators considered by Wu 2002) is very general and includes the conventional calibration estimator as a special member. iii) The model-calibration estimator à ÅY MC of Wu and Sitter 2001) obtained by using u i ˆ E y y i x i ˆm x i ; v in 3) is optimal under the criterion of minimum model expectation of the asymptotic design-based variance, E y fav p à Y g. Here AV p refers to the asymptotic variance under the sampling design, p, and à Y is any calibration estimator from the class. iv) For the estimation of the population distribution function N 1 F Y t ˆN I y i # t i ˆ 1 where I denotes the indicator function, the optimal calibration variable is given by g x i ; t ˆE y I y i # t jx i ŠˆP y i # t x i, which is dependent on the particular value of t. v) To estimate a second-order nite population quantity such as the population variance or more generally Q ˆ P N P Nj i ˆ 1 ˆ i 1 f y i ; y j, the optimal calibration variable is u ij ˆ E y f y i ; y j x i ; x j Š. For proofs, the required regularity conditions and more discussion we refer to Wu 2002) which is available online at It is clear that optimal calibration estimation requires in general the complete auxiliary information x 1 ; x 2 ;...; x N. In this article we focus on the use of optimal calibration estimators under two-phase sampling assuming that the complete auxiliary information is not available but a large rst-phase sample on the x variables can be easily collected with affordable cost. Estimation of the population mean or total using a regression-type estimator under twophase sampling has been discussed in the literature. Less attention, however, has been given to the estimation of the distribution function, the variance or other second-order nite population quantities under the context of two-phase sampling. In Section 2, we present optimal calibration estimators for various rst- and second-order nite population quantities under two-phase sampling under a uni ed framework. In Section 3, the optimal calibration estimators for second-order nite population quantities are naturally used to construct more ef cient variance estimators for the regression estimator for the population mean. Variance estimation for the distribution function is addressed in Section 4. Some empirical results regarding the nite sample design-based performances of the proposed estimators are reported in Section 5 using real survey data of the 1996 Statistics Canada Family Expenditure FAME) Survey for the province of Ontario. Applications to measurement error and nonresponse problems are brie y discussed in Section 6. Some concluding remarks are given in Section Optimal Calibration Estimators Under Two-Phase Sampling For simplicity of presentation, we consider cases where a relatively large rst-phase sample s 0 of xed size n 0 is selected using simple random sampling without replacement,
4 122 Journal of Of cial Statistics and x i is measured for all 0. A second-phase sample s of size n is selected using a general sampling design p s 0 with rst- and second-order inclusion probabilities p i s 0 and p ij s 0, and the response variable y i is observed for. The estimators presented below, however, can be extended to cases where the rst-phase sampling design is also arbitrary. Let d i s 0 ˆ 1=p i s 0 and d i ˆ N=n 0 d i s Estimating the population total Under Model 2), the model-calibration estimator for the population total Y is de ned as ÃY MC ˆ P w i y i, where the calibrated weights w i minimize F s ˆ w i d i 2 = d i q i subject to w i m x i ; v ˆ N=n à 0 m x i ; v à 0 Note that the right-hand side of 4) is an estimate for P N i ˆ 1 m x i ; v à based on the rstphase sample s 0. It is straightforward to show that " ÃY MC ˆ N n 0 d i s 0 y i m x i ; v à # ) d i s 0 m x i ; v à ÃB Y 0 where B à Y ˆ P d i s 0 q i Ãm i y i = P d i s 0 q i Ãm i 2 and Ãm i ˆ m x i ; v. à It can be shown that, under some mild regularity conditions, YMC à is asymptotically equivalent to " YMC ˆ N n 0 d i j s 0 y i m x i ; v N # ) d i j s 0 m x i ; v N B Y 0 where B Y ˆ P N i ˆ 1 q i m i y i = P N i ˆ 1 q i mi 2, m i ˆ m x i ; v N,andv N are the nite population parameters estimated by v. à See Wu and Sitter 2001) for a detailed discussion on the estimation of model parameters. The required regularity conditions were detailed in an unpublished master's essay at the University of Waterloo Luan 2001). It follows that AV p Y à MC ˆV p YMC ˆN 2 1 n 0 1 N S 2 Y N n 0 " # 2 E 1 V 2 d i j s 0 y i B Y m i where E 1 and V 2 denote the expectation and the variance under the rst-phase and the second-phase sampling design, respectively, and S 2 Y is the nite population variance for the y variable. Since the value of S 2 Y is unaffected by different choices of the calibration variable u i and E y E 1 V 2 ŠˆE 1 fe y V 2 g Š, following the same argument as in Theorem 1 of Wu 2002), we can show that à YMC minimizes E y AV p à Y Š among the class of calibration estimators à Y similarly de ned as in Section 1 with 3) modi ed for two-phase sampling in the form of 4) Estimating the distribution function The model-calibrated pseudo-empirical maximum likelihood estimator ME), which is asymptotically equivalent to the optimal calibration estimator Wu and Sitter 2001), is 4
5 Wu and Luan: Optimal Calibration Estimators Under Two-Phase Sampling 123 particularly appealing for the estimation of the distribution function. Under two-phase sampling, the ME estimator for F Y t is de ned as F à ME t ˆP Ãp i I y i # t where the weights Ãp i maximize the pseudo-empirical log-likelihood function Ãl p ˆN n 0 d i j s 0 log p i 5 subject to p i ˆ 1 0 < p i < 1 and p i g x i ; t ˆ 1 n 0 g x i ; t 6 0 where g x i ; t ˆE y I y i # t jx i ŠˆP y i # tjx i. Under a regression working model y i ˆ m x i ; v v x i «i ; i ˆ 1; 2;...; N 7 where the «i 's are independent and identically distributed iid) as N 0; j 2,wehave g x i ; t ˆG fy m x i ; v g=fv x i jgš The G is the cumulative distribution function of N 0; 1. In applications the unknown model parameters v and j 2 will have to be replaced by sample-based estimates. Note that if a prespeci ed t ˆ t 0 is used in 6) while the resulting weights Ãp i are used in à F ME t for an arbitrary t, à FME t will itself be a genuine distribution function. The optimality of ÃF ME t at t ˆ t 0 can be established along the lines of Theorem 2 in Wu 2002). When the regression model 7) is not desirable, a slightly less ef cient but more robust logistic regression model can be used to obtain g x i ; t. See Chen and Wu 2002) or Wu 2002) for details. A simple algorithm for computing the pseudo-empirical likelihood weights Ãp i can be found in Chen et al. 2002) Estimating second-order nite population quantities The population variance, the variance of a linear estimator or more generally a second-order nite population quantity of the form Q ˆ P N P Nj i ˆ 1 ˆ i 1 f y i ; y j can also be ef ciently estimated through optimal calibration. Under two-phase sampling, the optimal model-calibration estimator for Q is given by QMC à ˆ P P j > i w ij f y i ; y j, where the weights w ij minimize the modi ed distance measure F s ˆ P P j > i w ij d ij 2 = d ij q ij subject to 1 w ij E y f y i ; y j x i ; x j ŠˆN N n 0 n 0 E 1 y f y i ; y j x i ; x j Š 8 j > i Here d ij ˆ d ij s 0 N N 1 Š= n 0 n 0 1 Š and d ij s 0 ˆ 1=p ij s 0. The weighting factors q ij in the distance measure are prespeci ed and a common choice would be q ij ˆ 1. This optimal model-calibration estimator QMC à can be explicitly expressed as a regressiontype estimator, as detailed below for the population variance SY 2. Consider the nite population variance SY 2 ˆ N 1 1 P N i ˆ 1 y i Y 2 which can be alternatively expressed as SY 2 ˆ N N 1 Š 1 P N P Nj i ˆ 1 ˆ i 1 y i y j 2. Under a linear regression working model y i ˆ x T i v v x i «i ; i ˆ 1; 2;...; N 9 0 j > i
6 124 Journal of Of cial Statistics where the «i 's are iid with mean zero and variance j 2,wehave E y y i y j 2 x i ; x j Šˆv T x i x j x i x j T v j 2 v 2 x i v 2 x j Š Let u ij ˆ Ãv T x i x j x i x j T v à Ãj 2 v 2 x i v 2 x j Š and v ij ˆ y i y j 2.Itcanbe shown that the resulting optimal model-calibration estimator for SY 2 is given by " ÃS MC 2 1 ˆ n 0 n 0 d 1 ij s 0 v ij u ij ) # d ij s 0 u ij ÃB S j > i 0 j > i j > i where B à S ˆ P P j > i d ij s 0 q ij u ij v ij = P P j > i d ij s 0 q ij uij 2. For a general second-order nite population quantity such as Q, one needs to use v ij ˆ f y i ; y j, u ij ˆ E y f y i ; y j x i ; x j Š, and replace 1= n 0 n 0 1 Š by N N 1 = n 0 n 0 1 Š in the above formulation. If simple random sampling without replacement SRSWOR) is also used at the second phase, S à 2 MC reduces to "!# v 2 x i 1 v 2 x n i ÃB S 0 ÃS 2 MC ˆ s 2 Y à v T s 0 2 s 2 à v Ãj 2 1 n 0 where s 2 Y is the usual sample variance based on y i,, ands 0 2 and s 2 are the sample variance-covariance matrices for the x variable based on s 0 and s, respectively. If further the regression model 9) has a homogeneous variance structure, i.e., v x i ; 1, we have ÃS 2 MC ˆ s 2 Y à v T s 0 2 s 2 à v à B S More Ef cient Variance Estimators for the Regression Estimator Under the commonly used regression model 9), it can be shown that B à Y ˆ 1 and the optimal model-calibration estimator for the population mean ÅY ˆ N 1 P N i ˆ 1 y i is identical to the conventional regression estimator under two-phase sampling Wu and Sitter 2001): ÃÅY MC ˆ 1 n 0 " d i j s 0 y i x i! T # d i s 0 x i v à 0 where v à are the estimated regression coef cients. If SRSWOR is used at the second phase, the approximate design-based variance of à YMC Å is given by V p à YMC ŠLj 1 n 0 1 SY 2 1 N n 1 n 0 SE 2 where SY 2 is de ned as before, SE 2 is the nite population variance de ned over the residual variable e i ˆ y i xi T v N,andv N is the nite population regression coef cients. The conventional variance estimator for à YMC Å is obtained if one replaces SY 2 by sy 2 and SE 2 by se, 2 the sample variances based on the second-phase sample s, and using Ãe i ˆ y i xi T Ãv. See for example Cochran 1977, p. 343). We denote this basic variance estimator by v 0 à ÅY MC. An improved variance estimator which makes more complete use of the large rstphase sample measured over the covariates x can be easily developed as follows. The SY 2 can be more ef ciently estimated by S à MC 2 given by 10). As for SE 2, the optimal model-calibration estimator can be similarly constructed by noting that E y e i ˆ0and
7 Wu and Luan: Optimal Calibration Estimators Under Two-Phase Sampling 125 E y e i e j 2 x i ; x j Š Lj j 2 v 2 x i v 2 x j Š. The resulting estimator for SE 2 is given by " # ÃS E 2 ˆ se 2 1 n 0 v 2 x i 1 v 2 x n i ÃB E 0 where B à E is similarly de ned as B à S but using v ij ˆ Ãe i Ãe j 2 and u ij ˆ v 2 x i v 2 x j.it is interesting to notice that, if v x i ; 1, S à 2 E ˆ se, 2 the large rst-phase sample on x contains no extra information to improve the estimation of SE. 2 See Wu 2002) for more discussion on this issue. Our proposed variance estimator is as follows: v 1 à YMC Å ˆ 1 n 0 1 sy 2 1 N n 1 n 0 se 2 1 n 0 1 Ãv T 0 s 2 N s 2 v à B à S " # 1 n 0 v 2 x i 1 v 2 1 x n i n 0 1 ÃB N S 1 n 1 ÃB n 0 E Ãj 2 0 If v x i ; 1, the very last term in v 1 à YMC Å vanishes, and in this case v 1 à YMC Å reduces to v 1 Åy lr proposed by Sitter 1997) if we also replace B à S by 1. Note that, under the model 9), E y B à S Lj 1, E y B à E Lj 1, an alternative variance estimator is obtained if we replace both B à S and B à E by 1. This is given by v 2 à YMC Å ˆ 1 n 0 1 sy 2 1 N n 1 n 0 se 2 1 n 0 1 Ãv T 0 s 2 N s 2 v à 1 n 1 " # 1 N n 0 v 2 x i 1 v 2 x n i Ãj 2 0 Both v 1 and v 2 are design-consistent. The design-based nite sample performances of v 1 and v 2 are further investigated in Section 5 through a simulation study. Also included in the simulation study is the delete-1 jackknife variance estimator, v J. The detailed formulation of v J was given by Sitter 1997). 4. Variance Estimation for the Distribution Function 4.1. Analytical variance estimators Analytical variance estimators for F à ME t which make more ef cient use of the rst-phase sample can be developed in the same spirit to v 1 or v 2. Following the arguments of Chen and Wu 2002), it can be shown that ÃF ME t ˆ 1 Ãn 0 d i s 0 I y i # t " # 1 n 0 g x i ; t 1 Ãn 0 d i s 0 g x i ; t B N o p 0 1 p n where Ãn 0 ˆ P d i s 0, B N ˆ P N i ˆ 1 g x i ; t g N ŠI y i # t = P N i ˆ 1 g x i ; t g N Š 2,and g N ˆ N 1 P N i ˆ 1 g x i ; t. Under two-phase sampling with SRSWOR at both phases,! 11
8 126 Journal of Of cial Statistics we have V p F à ME t Š Lj 1 n 0 1 SI 2 1 N n 1 n 0 SD 2 where SI 2 and SD 2 are the nite population variances de ned over the variable I i ˆ I y i # t and D i ˆ I y i # t g x i ; t B N, respectively. The usual substitution estimator for V p F à ME t Š is denoted by v 0 F à ME t Š, withsi 2 replaced by si 2 and SD 2 replaced by sd, 2 the sample variances based on s. A more ef cient variance estimator is readily available if we estimate both SI 2 and SD 2 using the general method described in Section 2.3. The resulting estimators S à I 2 and S à D 2 can both be expressed as " # ÃS 2 ˆ s n 0 n 0 u 1 ij u n n 1 ij ÃB F 0 j > i P j > i u ij v ij = P j > i P j > i u 2 ij and s 2 is the corresponding sample var- where B à F ˆ P iance based on s. ForSI 2, v ij ˆ I i I j 2, u ij ˆ E y I i I j 2 x i ; x j Šˆg i g j 2g i g j, and g i ˆ g x i ; t ; for SD, 2 v ij ˆ D i D j 2, u ij ˆ E y D i D j 2 x i ; x j Š Lj g i 1 g i g j 1 g j. As usual, any unknown model parameters appearing in D i or u ij will be replaced by appropriate sample-based estimates. Let v 1 F à ME t Š be the estimator of V p F à ME t Š obtained by using S à I 2 and S à D 2 in place of SI 2 and SD, 2 respectively. Note that SI 2 ˆ N N 1 1 F Y t 1 F Y t Š, an alternative estimator for SI 2 could simply be N N 1 1 FME à t 1 F à ME t Š. We denote the resulting estimator of V p F à ME t Š as v 2 F à ME t Š, where SD 2 is estimated by S à D,asinv 2 1 F à ME t Š. Similar to the mean case, both v 1 and v 2 are design-consistent estimators of V p F à ME t Š Jackknife variance estimator The pseudo-empirical maximum likelihood estimator F à ME t is a nonlinear estimator and its exact design-based variance does not have a closed form. The conventional delete-1 jackknife variance estimator often provides an attractive alternative in such cases. Under two-phase sampling, the delete-1 estimator F à ME jš needs to be computed differently for j [ s and for j [ s 0 s. Let s j denote the set s with the j th unit deleted. Let FME à jš ˆP j Ãp i I y i # t if j [ s, where the weights Ãp i i Þ j maximize l à p ˆP j d i s 0 log p i subject to P j p i ˆ 1 0 < p i < 1 and P j p i g x i ; t ˆ n P 0 j g x i ; t ; let FME à jš ˆP Ãp i I y i # t if j [ s 0 s, where the weights Ãp i maximize l à p ˆP d i s 0 log p i subject to P p i ˆ 1 0 < p i < 1 and P p i g x i ; t ˆ n P 0 j g x i ; t. The usual jackknife variance estimator is v J F à ME t Š ˆ n 0 1 n 0 ff à ME jš F à ME t g 2 12 j [ s 0 This variance estimator is consistent if SRSWOR is used at both phases and the rst-phase sampling fraction n 0 =N is negligible, since F à ME t is asymptotically equivalent to a regression type estimator given by 11) Sitter 1997). If the rst-phase sampling fraction cannot be ignored, an ad hoc adjustment can be made by multiplying v J by 1 n 0 =N. This has been done in the simulation study.
9 Wu and Luan: Optimal Calibration Estimators Under Two-Phase Sampling 127 The major advantage of using v J is the operational convenience. For parameters in a general form of W ˆ WfF Y t g, the jackknife variance estimator for à W ˆ Wf à F ME t g can be readily formulated by replacing à F ME jš and à F ME t in 12)by à W jš ˆWf à F ME jšg and ÃW ˆ Wf à F ME t g. This is most useful when the method is applied to in nite populations where parameters of interest are often in the form of v ˆ v F. 5. Some Empirical Results In this section we report some simulation results regarding the design-based nite sample performances of the proposed estimators using real survey data from the 1996 Statistics Canada Family Expenditure FAME)Survey for the province of Ontario. The data set contains 2,396 observations over a variety of variables. In the simulation, the variable y, total expenditure, was used as the response, and x 1 number of people in the household) and x 2 total income after taxes)were included as auxiliary variables. It was found that a simple linear regression model y ˆ b 0 b 1 x 1 b 2 x 2 «gives a reasonable t to the data, with the nonhomogeneous variance structure V y «ˆx 2 j 2 tting slightly better than the constant variance model. Treating the data set as a xed nite population, we rst selected a two-phase sample with SRSWOR at both phases, and then computed various estimators presented in Sections 2, 3, and 4. For the distribution function F Y t we included results on ve different values of t corresponding to the 10th, 25th, 50th, 75th, and 90th population quantiles. The process was repeated B ˆ 5,000 times for point estimators à ŠYMC, à S 2 MC and à F ME t,and B ˆ 1,000 times for variance estimators v 0, v 1, v 2,andv J. The performance of an estimator à T for a certain population quantity T was evaluated using the relative percentage bias RB%)and the relative ef ciency RE)de ned as RB% ˆ 1 B B b ˆ 1 ÃT b T T 100 and RE ˆ MSE à T 0 MSE à T where à T b was computed from the b th simulated sample, MSE à T ˆB 1 P B b ˆ 1 à T b T 2, and à T 0 denotes the baseline estimator for comparison. For the estimation of ÅY, S 2 Y and F Y t, à T0 is given by Åy ˆ n 1 P y i, s 2 Y ˆ n 1 1 P y i y 2 and ÃF Y t ˆn 1 P I y i # t, respectively. Table 1 reports the simulated RE's for the optimal-calibration or the model-calibrated pseudo-empirical maximum likelihood estimators for the population mean Y, the population variance S 2 Y and the nite population distribution function F Y t at t ˆ t a, a ˆ 0:10, 0.25, 0.50, 0.75, and The rst-phase sample size was chosen as n 0 ˆ 400. The absolute values of the RB's are all less than 2% and are not reported here to save space. Table 1. Relative ef ciency of estimators for ÅY, S 2 Y and F Y t n 0 n ÃÅY MC à S 2 MC à FME t at t ˆ t a
10 128 Journal of Of cial Statistics Table 2. Relative bias in %) of variance estimators for à ÅY MC and ÃF ME t n 0 n v ÃÅY MC à FME t at t ˆ t a v v v v J v v v v J v v v v J The proposed estimators perform uniformly better and are much superior to the conventional estimator à T 0 in all cases. With xed rst-phase sample size, the gain in ef ciency seems larger when the second-phase sample size n is smaller, although such a trend is not monotonic in terms of sample size. As pointed out by one of the referees, this is due to the fact that V p à T! V p à T 0 as n! n 0 for xed n 0. As for the distribution function F Y t, the improvement is more dramatic when t is not in the tail regions. For variance estimators, the true values i.e., T in the de nitions of RB and RE) of V p à ŠYMC and V p à F ME t were estimated from another 5,000 independent simulation runs. The simulated RB's of v 0, v 1, v 2 and v J are reported in Table 2. All RB's are less than 11%, except for those cases relating to the estimation of the variance of ÃF ME 0:10 and à F ME 0:90 when n ˆ 40. The RB's in these cases all take negative values and are smaller than 20% for v 0, v 1 and v 2. The jackknife estimator v J also signi cantly overestimates the variance in these cases. It seems that none of these estimators, including the naive v 0, provides acceptable estimates when n is small and t is in the tail region. The simulated RE's for the variance estimators of à ŠYMC and à F ME t are presented in Table 3. Note that we excluded the RE 's for the case of n ˆ 40 at t ˆ t 0:10 and t 0:90 where Table 3. Relative ef ciency of variance estimators for à ÅY MC and ÃF ME t n 0 n v ÃÅY MC à FME t at t ˆ t a v Ð Ð v Ð Ð v J 0.85 Ð Ð 80 v v v J v v v J
11 Wu and Luan: Optimal Calibration Estimators Under Two-Phase Sampling 129 the bias is not negligible, since comparison in terms of RE under such cases is probably misleading. The variance estimator for baseline comparison is v 0. Table 3 can be summarized as follows: i) v 1 and v 2 perform similar to each other and are uniformly better than v 0 ; ii) the improvement from using v 1 and v 2 over v 0 is only marginal for estimating V p à ÅY MC ; iii) v 1 and v 2 seem more ef cient for estimating V p à F ME t for t at large quantiles; and iv) the jackknife variance estimator, which does not reuse the auxiliary information available at the rst phase, is less stable and less ef cient in most cases. One explanation for ii) is probably the fact that the linear model y ˆ b 0 b 1 x 1 b 2 x 2 «with the nonhomogeneous variance structure V y «ˆx 2 j 2 only provides a slightly better t than the constant variance model. Under such circumstances there is little room to improve the estimation of S 2 E by using auxiliary information through a nonhomogeneous variance structure. The marginal gain comes from the estimation of S 2 Y using à S 2 MC. When the variance structure is clearly nonhomogeneous, higher ef ciency can be expected from using v 1 and v 2. See the theoretical and empirical results reported in Wu 2002) for the case of known complete auxiliary information. 6. Applications to Measurement Error and Nonresponse The proposed optimal calibration estimators under two-phase sampling are applicable to estimation problems under measurement errors or nonresponse. Let y i be the true value of a characteristic of interest for the i th unit in the nite population. Suppose it is dif cult or very expensive to obtain exact measurement but an inaccurate value i.e., measurement with error), say z i,ofy for the i th unit, can be obtained quite easily. In such situations the two-phase sampling technique is often employed. A large rst-phase sample s 0 is taken and the cheap inaccurate measurements z i are collected for all 0. A smaller secondphase sample s Ì s 0 is drawn and the exact measurements y i are collected using more extensive effort. The goal is to estimate various nite population quantities de ned over y i. The relationship between the y i 's and the z i 's can be described by the so-called regression calibration model Carroll et al. 1995), y i ˆ a bz i «i ; i ˆ 1; 2;...; N 13 where the «i 's are assumed independent and identically distributed random variates with E m «i ˆ0 and V m «i ˆj 2, and the subscript m refers to the measurement error model 13). If we treat z i as an auxiliary variable, the methodology developed in Sections 2, 3, and 4 can be applied directly here for estimation under the assumed measurement error model. Item nonresponse is often handled through imputation. If the response variable y is highly correlated with certain covariate s) x and missing values do not occur for the x variables, a direct analysis based on the observed sample data using our proposed estimation strategy may be more desirable. The original sample with complete information on x can be viewed as a rst-phase sample, and the subsample with observed responses on y is treated as a second-phase sample. The loss of information due to the nonresponse can be retreated from auxiliary information through the optimal estimation strategy. The
12 130 Journal of Of cial Statistics inference is conditionally valid for the given sample. An advantage of this approach is that the resulting optimal calibration estimators are design-consistent even if the superpopulation model 2) is misspeci ed. This is in contrast to estimation based on imputed data where the validity of the results depends on the correctness of the model used for imputation. 7. Concluding Remarks The calibration method has gained much popularity since its emergence in the early 90s, and calibration estimators are routinely computed by many survey organizations. The optimal calibration approach provides a uni ed framework for the ef cient estimation of various nite population quantities when complete auxiliary information is available. It has been demonstrated in this article that the method also provides a exible and ef cient way of using auxiliary information at the estimation stage under a two-phase sampling scheme. The proposed optimal calibration estimators for a second-order nite population quantity such as the population variance can perfectly be used to obtain more ef cient variance estimators for a rst-order nite population quantity such as the total or the distribution function. The gain in ef ciency can be appreciable even for a rst-phase sample with moderate size. Applications to measurement error problems or nonresponse are straightforward. The method has the potential to be applied to estimation problems under in nite populations where a two-phase sample is available. 8. References Carroll,R.J.,Ruppert,D.,and Stefanski,L.A. 1995). Measurement Errors in Non-Linear Models. Chapman & Hall. Chen,J. and Wu,C. 2002). Estimation of Distribution Function and Quantiles Using the Model-calibrated Pseudo Empirical Likelihood Method. Statistica Sinica,12, 1223±1239. Chen,J.,Sitter,R.R.,and Wu,C. 2002). Using Empirical Likelihood Method to Obtain Range Restricted Weights in Regression Estimators for Surveys. Biometrika,89, 230±237. Cochran,W.G. 1977). Sampling Techniques 3rd ed.). Wiley,New York. Deville,J.C. and SaÈrndal,C.E. 1992). Calibration Estimators in Survey Sampling. Journal of the American Statistical Association,87,376±382. Luan,Y. 2001). Model-Calibration Approach under Two-Phase Sampling. Unpublished master's essay,department of Statistics and Actuarial Science,University of Waterloo, Canada. Sitter,R.R. 1997). Variance Estimation for the Regression Estimator in Two-Phase Sampling. Journal of the American Statistical Association,92,780±787. SaÈrndal,C.E.,Swensson,B.,and Wretman,J. 1992). Model Assisted Survey Sampling. Springer,New York. Wu,C. 2002). Optimal Calibration Estimators in Survey Sampling. Working paper Department of Statistics and Actuarial Science,University of Waterloo, Canada.
13 Wu and Luan: Optimal Calibration Estimators Under Two-Phase Sampling 131 Wu, C. and Sitter, R.R. 2001). A Model-calibration Approach to Using Complete Auxiliary Information from Survey Data. Journal of the American Statistical Association, 96, 185±193. Received March 2002 Revised November 2002
Bias Correction in the Balanced-half-sample Method if the Number of Sampled Units in Some Strata Is Odd
Journal of Of cial Statistics, Vol. 14, No. 2, 1998, pp. 181±188 Bias Correction in the Balanced-half-sample Method if the Number of Sampled Units in Some Strata Is Odd Ger T. Slootbee 1 The balanced-half-sample
More informationEmpirical Likelihood Methods for Sample Survey Data: An Overview
AUSTRIAN JOURNAL OF STATISTICS Volume 35 (2006), Number 2&3, 191 196 Empirical Likelihood Methods for Sample Survey Data: An Overview J. N. K. Rao Carleton University, Ottawa, Canada Abstract: The use
More informationANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW
SSC Annual Meeting, June 2015 Proceedings of the Survey Methods Section ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW Xichen She and Changbao Wu 1 ABSTRACT Ordinal responses are frequently involved
More informationDesign and Analysis of Experiments Embedded in Sample Surveys
Journal of Of cial Statistics, Vol. 14, No. 3, 1998, pp. 277±295 Design and Analysis of Experiments Embedded in Sample Surveys Jan van den Brakel and Robbert H. Renssen 1 This article focuses on the design
More informationREPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLES
Statistica Sinica 8(1998), 1153-1164 REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLES Wayne A. Fuller Iowa State University Abstract: The estimation of the variance of the regression estimator for
More informationFractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling
Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction
More informationSurvey Estimation for Highly Skewed Populations in the Presence of Zeroes
Journal of Of cial Statistics, Vol. 16, No. 3, 2000, pp. 229±241 Survey Estimation for Highly Skewed Populations in the Presence of Zeroes Forough Karlberg 1 Estimation of the population total of a highly
More information11. Bootstrap Methods
11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods
More informationESTIMATION OF DISTRIBUTION FUNCTION AND QUANTILES USING THE MODEL-CALIBRATED PSEUDO EMPIRICAL LIKELIHOOD METHOD
Statistica Sinica 12(2002), 1223-1239 ESTIMATION OF DISTRIBUTION FUNCTION AND QUANTILES USING THE MOD-CALIBRATED PSEUDO EMPIRICAL LIKIHOOD METHOD Jiahua Chen and Changbao Wu University of Waterloo Abstract:
More informationCalibration as a Standard Method for Treatment of Nonresponse
Journal of Of cial Statistics, Vol. 15, No. 2, 1999, pp. 305±327 Calibration as a Standard Method for Treatment of Nonresponse Sixten LundstroÈm 1 and Carl-Erik SaÈrndal 2 This article suggests a simple
More informationEmpirical Likelihood Inference for Two-Sample Problems
Empirical Likelihood Inference for Two-Sample Problems by Ying Yan A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Statistics
More informationREPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY
REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY J.D. Opsomer, W.A. Fuller and X. Li Iowa State University, Ames, IA 50011, USA 1. Introduction Replication methods are often used in
More informationEmpirical Likelihood Methods
Handbook of Statistics, Volume 29 Sample Surveys: Theory, Methods and Inference Empirical Likelihood Methods J.N.K. Rao and Changbao Wu (February 14, 2008, Final Version) 1 Likelihood-based Approaches
More informationChapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70
Chapter 5: Models used in conjunction with sampling J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Nonresponse Unit Nonresponse: weight adjustment Item Nonresponse:
More informationEstimation of a Local-Aggregate Network Model with. Sampled Networks
Estimation of a Local-Aggregate Network Model with Sampled Networks Xiaodong Liu y Department of Economics, University of Colorado, Boulder, CO 80309, USA August, 2012 Abstract This paper considers the
More informationStrati cation in Multivariate Modeling
Strati cation in Multivariate Modeling Tihomir Asparouhov Muthen & Muthen Mplus Web Notes: No. 9 Version 2, December 16, 2004 1 The author is thankful to Bengt Muthen for his guidance, to Linda Muthen
More informationOn Efficiency of Midzuno-Sen Strategy under Two-phase Sampling
International Journal of Statistics and Analysis. ISSN 2248-9959 Volume 7, Number 1 (2017), pp. 19-26 Research India Publications http://www.ripublication.com On Efficiency of Midzuno-Sen Strategy under
More informationFULL LIKELIHOOD INFERENCES IN THE COX MODEL
October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach
More informationAsymptotic Normality under Two-Phase Sampling Designs
Asymptotic Normality under Two-Phase Sampling Designs Jiahua Chen and J. N. K. Rao University of Waterloo and University of Carleton Abstract Large sample properties of statistical inferences in the context
More informationFinansiell Statistik, GN, 15 hp, VT2008 Lecture 15: Multiple Linear Regression & Correlation
Finansiell Statistik, GN, 5 hp, VT28 Lecture 5: Multiple Linear Regression & Correlation Gebrenegus Ghilagaber, PhD, ssociate Professor May 5, 28 Introduction In the simple linear regression Y i = + X
More informationGeneralized Pseudo Empirical Likelihood Inferences for Complex Surveys
The Canadian Journal of Statistics Vol.??, No.?,????, Pages???-??? La revue canadienne de statistique Generalized Pseudo Empirical Likelihood Inferences for Complex Surveys Zhiqiang TAN 1 and Changbao
More informationA Procedure for Strati cation by an Extended Ekman Rule
Journal of Of cial Statistics, Vol. 16, No. 1, 2000, pp. 15±29 A Procedure for Strati cation by an Extended Ekman Rule Dan Hedlin 1 This article deals with the Ekman rule, which is a well-known method
More informationDetermining Sample Sizes for Surveys with Data Analyzed by Hierarchical Linear Models
Journal of Of cial Statistics, Vol. 14, No. 3, 1998, pp. 267±275 Determining Sample Sizes for Surveys with Data Analyzed by Hierarchical Linear Models Michael P. ohen 1 Behavioral and social data commonly
More informationASYMPTOTIC NORMALITY UNDER TWO-PHASE SAMPLING DESIGNS
Statistica Sinica 17(2007), 1047-1064 ASYMPTOTIC NORMALITY UNDER TWO-PHASE SAMPLING DESIGNS Jiahua Chen and J. N. K. Rao University of British Columbia and Carleton University Abstract: Large sample properties
More informationMinimax design criterion for fractional factorial designs
Ann Inst Stat Math 205 67:673 685 DOI 0.007/s0463-04-0470-0 Minimax design criterion for fractional factorial designs Yue Yin Julie Zhou Received: 2 November 203 / Revised: 5 March 204 / Published online:
More informationStatistical Practice
Statistical Practice A Note on Bayesian Inference After Multiple Imputation Xiang ZHOU and Jerome P. REITER This article is aimed at practitioners who plan to use Bayesian inference on multiply-imputed
More informationECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria
ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:
More informationA Note on Bayesian Inference After Multiple Imputation
A Note on Bayesian Inference After Multiple Imputation Xiang Zhou and Jerome P. Reiter Abstract This article is aimed at practitioners who plan to use Bayesian inference on multiplyimputed datasets in
More informationMeasuring robustness
Measuring robustness 1 Introduction While in the classical approach to statistics one aims at estimates which have desirable properties at an exactly speci ed model, the aim of robust methods is loosely
More informationConfidence Intervals in Ridge Regression using Jackknife and Bootstrap Methods
Chapter 4 Confidence Intervals in Ridge Regression using Jackknife and Bootstrap Methods 4.1 Introduction It is now explicable that ridge regression estimator (here we take ordinary ridge estimator (ORE)
More informationA Design-based Analysis Procedure for Two-treatment Experiments Embedded in Sample Surveys. An Application in the Dutch Labor Force Survey
Journal of Of cial Statistics, Vol. 18, No. 2, 2002, pp. 217±231 A Design-based Analysis Procedure for Two-treatment Experiments Embedded in Sample Surveys. An Application in the Dutch Labor Force Survey
More informationImputation for Missing Data under PPSWR Sampling
July 5, 2010 Beijing Imputation for Missing Data under PPSWR Sampling Guohua Zou Academy of Mathematics and Systems Science Chinese Academy of Sciences 1 23 () Outline () Imputation method under PPSWR
More informationMeasurement error as missing data: the case of epidemiologic assays. Roderick J. Little
Measurement error as missing data: the case of epidemiologic assays Roderick J. Little Outline Discuss two related calibration topics where classical methods are deficient (A) Limit of quantification methods
More informationInference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation
Inference about Clustering and Parametric Assumptions in Covariance Matrix Estimation Mikko Packalen y Tony Wirjanto z 26 November 2010 Abstract Selecting an estimator for the variance covariance matrix
More informationNonresponse weighting adjustment using estimated response probability
Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006 Introduction Nonresponse Unit nonresponse Item nonresponse Basic strategy
More informationCombining data from two independent surveys: model-assisted approach
Combining data from two independent surveys: model-assisted approach Jae Kwang Kim 1 Iowa State University January 20, 2012 1 Joint work with J.N.K. Rao, Carleton University Reference Kim, J.K. and Rao,
More informationTWO-WAY CONTINGENCY TABLES UNDER CONDITIONAL HOT DECK IMPUTATION
Statistica Sinica 13(2003), 613-623 TWO-WAY CONTINGENCY TABLES UNDER CONDITIONAL HOT DECK IMPUTATION Hansheng Wang and Jun Shao Peking University and University of Wisconsin Abstract: We consider the estimation
More informationMultinomial Discrete Choice Models
hapter 2 Multinomial Discrete hoice Models 2.1 Introduction We present some discrete choice models that are applied to estimate parameters of demand for products that are purchased in discrete quantities.
More informationA Course on Advanced Econometrics
A Course on Advanced Econometrics Yongmiao Hong The Ernest S. Liu Professor of Economics & International Studies Cornell University Course Introduction: Modern economies are full of uncertainties and risk.
More informationMean estimation with calibration techniques in presence of missing data
Computational Statistics & Data Analysis 50 2006 3263 3277 www.elsevier.com/locate/csda Mean estimation with calibration techniues in presence of missing data M. Rueda a,, S. Martínez b, H. Martínez c,
More informationJackknife Euclidean Likelihood-Based Inference for Spearman s Rho
Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho M. de Carvalho and F. J. Marques Abstract We discuss jackknife Euclidean likelihood-based inference methods, with a special focus on the
More informationModi ed Ratio Estimators in Strati ed Random Sampling
Original Modi ed Ratio Estimators in Strati ed Random Sampling Prayad Sangngam 1*, Sasiprapa Hiriote 2 Received: 18 February 2013 Accepted: 15 June 2013 Abstract This paper considers two modi ed ratio
More informationAn Overview of the Pros and Cons of Linearization versus Replication in Establishment Surveys
An Overview of the Pros and Cons of Linearization versus Replication in Establishment Surveys Richard Valliant University of Michigan and Joint Program in Survey Methodology University of Maryland 1 Introduction
More informationECON 594: Lecture #6
ECON 594: Lecture #6 Thomas Lemieux Vancouver School of Economics, UBC May 2018 1 Limited dependent variables: introduction Up to now, we have been implicitly assuming that the dependent variable, y, was
More informationA measurement error model approach to small area estimation
A measurement error model approach to small area estimation Jae-kwang Kim 1 Spring, 2015 1 Joint work with Seunghwan Park and Seoyoung Kim Ouline Introduction Basic Theory Application to Korean LFS Discussion
More informationDesign and Estimation for Split Questionnaire Surveys
University of Wollongong Research Online Centre for Statistical & Survey Methodology Working Paper Series Faculty of Engineering and Information Sciences 2008 Design and Estimation for Split Questionnaire
More informationBayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units
Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional
More informationEstimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels.
Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels. Pedro Albarran y Raquel Carrasco z Jesus M. Carro x June 2014 Preliminary and Incomplete Abstract This paper presents and evaluates
More informationBinary choice 3.3 Maximum likelihood estimation
Binary choice 3.3 Maximum likelihood estimation Michel Bierlaire Output of the estimation We explain here the various outputs from the maximum likelihood estimation procedure. Solution of the maximum likelihood
More informationWeighting in survey analysis under informative sampling
Jae Kwang Kim and Chris J. Skinner Weighting in survey analysis under informative sampling Article (Accepted version) (Refereed) Original citation: Kim, Jae Kwang and Skinner, Chris J. (2013) Weighting
More informationMultiple-Objective Optimal Designs for the Hierarchical Linear Model
Journal of Of cial Statistics, Vol. 18, No. 2, 2002, pp. 291±303 Multiple-Objective Optimal Designs for the Hierarchical Linear Model Mirjam Moerbeek 1 and Weng Kee Wong 2 Optimal designs are usually constructed
More informationASSESSING A VECTOR PARAMETER
SUMMARY ASSESSING A VECTOR PARAMETER By D.A.S. Fraser and N. Reid Department of Statistics, University of Toronto St. George Street, Toronto, Canada M5S 3G3 dfraser@utstat.toronto.edu Some key words. Ancillary;
More informationEstimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk
Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:
More informationSimple Estimators for Semiparametric Multinomial Choice Models
Simple Estimators for Semiparametric Multinomial Choice Models James L. Powell and Paul A. Ruud University of California, Berkeley March 2008 Preliminary and Incomplete Comments Welcome Abstract This paper
More informationThe Effective Use of Complete Auxiliary Information From Survey Data
The Effective Use of Complete Auxiliary Information From Survey Data by Changbao Wu B.S., Anhui Laodong University, China, 1982 M.S. Diploma, East China Normal University, 1986 a thesis submitted in partial
More information(c) i) In ation (INFL) is regressed on the unemployment rate (UNR):
BRUNEL UNIVERSITY Master of Science Degree examination Test Exam Paper 005-006 EC500: Modelling Financial Decisions and Markets EC5030: Introduction to Quantitative methods Model Answers. COMPULSORY (a)
More information13 Endogeneity and Nonparametric IV
13 Endogeneity and Nonparametric IV 13.1 Nonparametric Endogeneity A nonparametric IV equation is Y i = g (X i ) + e i (1) E (e i j i ) = 0 In this model, some elements of X i are potentially endogenous,
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationChapter 1. GMM: Basic Concepts
Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating
More informationData Integration for Big Data Analysis for finite population inference
for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation
More informationMultivariate Regression: Part I
Topic 1 Multivariate Regression: Part I ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà Outline of this topic Statement of the objective: we want to explain the behavior of one variable as a
More informationECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria
ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor guirregabiria SOLUTION TO FINL EXM Monday, pril 14, 2014. From 9:00am-12:00pm (3 hours) INSTRUCTIONS:
More informationModification and Improvement of Empirical Likelihood for Missing Response Problem
UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu
More informationNonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments
Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z
More informationChapter 8: Estimation 1
Chapter 8: Estimation 1 Jae-Kwang Kim Iowa State University Fall, 2014 Kim (ISU) Ch. 8: Estimation 1 Fall, 2014 1 / 33 Introduction 1 Introduction 2 Ratio estimation 3 Regression estimator Kim (ISU) Ch.
More informationNUMERICAL DISTRIBUTION FUNCTIONS OF LIKELIHOOD RATIO TESTS FOR COINTEGRATION
JOURNAL OF APPLIED ECONOMETRICS J. Appl. Econ. 14: 563±577 (1999) NUMERICAL DISTRIBUTION FUNCTIONS OF LIKELIHOOD RATIO TESTS FOR COINTEGRATION JAMES G. MACKINNON, a * ALFRED A. HAUG b AND LEO MICHELIS
More informationCalibration estimation using exponential tilting in sample surveys
Calibration estimation using exponential tilting in sample surveys Jae Kwang Kim February 23, 2010 Abstract We consider the problem of parameter estimation with auxiliary information, where the auxiliary
More informationEstimating the Number of Common Factors in Serially Dependent Approximate Factor Models
Estimating the Number of Common Factors in Serially Dependent Approximate Factor Models Ryan Greenaway-McGrevy y Bureau of Economic Analysis Chirok Han Korea University February 7, 202 Donggyu Sul University
More informationModel Assisted Survey Sampling
Carl-Erik Sarndal Jan Wretman Bengt Swensson Model Assisted Survey Sampling Springer Preface v PARTI Principles of Estimation for Finite Populations and Important Sampling Designs CHAPTER 1 Survey Sampling
More information6. Fractional Imputation in Survey Sampling
6. Fractional Imputation in Survey Sampling 1 Introduction Consider a finite population of N units identified by a set of indices U = {1, 2,, N} with N known. Associated with each unit i in the population
More informationSongklanakarin Journal of Science and Technology SJST R3 LAWSON
Songklanakarin Journal of Science and Technology SJST-0-00.R LAWSON Ratio Estimators of Population Means Using Quartile Function of Auxiliary Variable in Double Sampling Journal: Songklanakarin Journal
More informationApproximate Inference for the Multinomial Logit Model
Approximate Inference for the Multinomial Logit Model M.Rekkas Abstract Higher order asymptotic theory is used to derive p-values that achieve superior accuracy compared to the p-values obtained from traditional
More informationTesting for a Trend with Persistent Errors
Testing for a Trend with Persistent Errors Graham Elliott UCSD August 2017 Abstract We develop new tests for the coe cient on a time trend in a regression of a variable on a constant and time trend where
More informationEmpirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design
1 / 32 Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design Changbao Wu Department of Statistics and Actuarial Science University of Waterloo (Joint work with Min Chen and Mary
More informationJong-Min Kim* and Jon E. Anderson. Statistics Discipline Division of Science and Mathematics University of Minnesota at Morris
Jackknife Variance Estimation of the Regression and Calibration Estimator for Two 2-Phase Samples Jong-Min Kim* and Jon E. Anderson jongmink@morris.umn.edu Statistics Discipline Division of Science and
More informationLeast Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions
Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error
More informationNon-parametric Tests for the Comparison of Point Processes Based on Incomplete Data
Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA Vol 28: 725±732, 2001 Non-parametric Tests for the Comparison of Point Processes Based
More informationCon dentiality, Uniqueness, and Disclosure Limitation for Categorical Data 1
Journal of Of cial Statistics, Vol. 14, No. 4, 1998, pp. 385±397 Con dentiality, Uniqueness, and Disclosure Limitation for Categorical Data 1 Stephen E. Fienberg 2 and Udi E. Makov 3 When an agency releases
More informationCombining multiple observational data sources to estimate causal eects
Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,
More informationSome Statistical Problems in Merging Data Files 1
Journal of Of cial Statistics, Vol. 17, No. 3, 001, pp. 43±433 Some Statistical Problems in Merging Data Files 1 Joseph B. Kadane Suppose that two les are given with some overlapping variables some variables
More informationEmpirical likelihood inference for a common mean in the presence of heteroscedasticity
The Canadian Journal of Statistics 45 Vol. 34, No. 1, 2006, Pages 45 59 La revue canadienne de statistique Empirical likelihood inference for a common mean in the presence of heteroscedasticity Min TSAO
More informationAGEC 661 Note Eleven Ximing Wu. Exponential regression model: m (x, θ) = exp (xθ) for y 0
AGEC 661 ote Eleven Ximing Wu M-estimator So far we ve focused on linear models, where the estimators have a closed form solution. If the population model is nonlinear, the estimators often do not have
More informationA Nonparametric Estimator of Species Overlap
A Nonparametric Estimator of Species Overlap Jack C. Yue 1, Murray K. Clayton 2, and Feng-Chang Lin 1 1 Department of Statistics, National Chengchi University, Taipei, Taiwan 11623, R.O.C. and 2 Department
More informationA Note on Auxiliary Particle Filters
A Note on Auxiliary Particle Filters Adam M. Johansen a,, Arnaud Doucet b a Department of Mathematics, University of Bristol, UK b Departments of Statistics & Computer Science, University of British Columbia,
More informationParametric fractional imputation for missing data analysis
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????),??,?, pp. 1 15 C???? Biometrika Trust Printed in
More informationNotes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770
Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Jonathan B. Hill Dept. of Economics University of North Carolina - Chapel Hill November
More informationParametric Inference on Strong Dependence
Parametric Inference on Strong Dependence Peter M. Robinson London School of Economics Based on joint work with Javier Hualde: Javier Hualde and Peter M. Robinson: Gaussian Pseudo-Maximum Likelihood Estimation
More informationMarkov-Switching Models with Endogenous Explanatory Variables. Chang-Jin Kim 1
Markov-Switching Models with Endogenous Explanatory Variables by Chang-Jin Kim 1 Dept. of Economics, Korea University and Dept. of Economics, University of Washington First draft: August, 2002 This version:
More informationCombining Non-probability and Probability Survey Samples Through Mass Imputation
Combining Non-probability and Probability Survey Samples Through Mass Imputation Jae-Kwang Kim 1 Iowa State University & KAIST October 27, 2018 1 Joint work with Seho Park, Yilin Chen, and Changbao Wu
More informationAveraging Estimators for Regressions with a Possible Structural Break
Averaging Estimators for Regressions with a Possible Structural Break Bruce E. Hansen University of Wisconsin y www.ssc.wisc.edu/~bhansen September 2007 Preliminary Abstract This paper investigates selection
More informationEFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING
Statistica Sinica 13(2003), 641-653 EFFICIENT REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLING J. K. Kim and R. R. Sitter Hankuk University of Foreign Studies and Simon Fraser University Abstract:
More informationINSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING
Statistica Sinica 24 (2014), 1001-1015 doi:http://dx.doi.org/10.5705/ss.2013.038 INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Seunghwan Park and Jae Kwang Kim Seoul National Univeristy
More informationSIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011
SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 COWLES FOUNDATION DISCUSSION PAPER NO. 1815 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS
More informationAnalysis of variance and linear contrasts in experimental design with generalized secant hyperbolic distribution
Journal of Computational and Applied Mathematics 216 (2008) 545 553 www.elsevier.com/locate/cam Analysis of variance and linear contrasts in experimental design with generalized secant hyperbolic distribution
More informationVariance Estimation for Calibration to Estimated Control Totals
Variance Estimation for Calibration to Estimated Control Totals Siyu Qing Coauthor with Michael D. Larsen Associate Professor of Statistics Tuesday, 11/05/2013 2 Outline A. Background B. Calibration Technique
More informationSimplified marginal effects in discrete choice models
Economics Letters 81 (2003) 321 326 www.elsevier.com/locate/econbase Simplified marginal effects in discrete choice models Soren Anderson a, Richard G. Newell b, * a University of Michigan, Ann Arbor,
More informationTransformation and Smoothing in Sample Survey Data
Scandinavian Journal of Statistics, Vol. 37: 496 513, 2010 doi: 10.1111/j.1467-9469.2010.00691.x Published by Blackwell Publishing Ltd. Transformation and Smoothing in Sample Survey Data YANYUAN MA Department
More information5.3 LINEARIZATION METHOD. Linearization Method for a Nonlinear Estimator
Linearization Method 141 properties that cover the most common types of complex sampling designs nonlinear estimators Approximative variance estimators can be used for variance estimation of a nonlinear
More informationTime Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley
Time Series Models and Inference James L. Powell Department of Economics University of California, Berkeley Overview In contrast to the classical linear regression model, in which the components of the
More informationJong-Min Kim* and Jon E. Anderson. Statistics Discipline Division of Science and Mathematics University of Minnesota at Morris
Jackknife Variance Estimation for Two Samples after Imputation under Two-Phase Sampling Jong-Min Kim* and Jon E. Anderson jongmink@mrs.umn.edu Statistics Discipline Division of Science and Mathematics
More information