Comments on Design-Based Prediction Using Auxiliary Information under Random Permutation Models (by Wenjun Li) (5/21/03) Ed Stanek
Here are comments on the Draft Manuscript. They are all suggestions that you can consider.

General Comment.

Detailed Comments.

Page 1, lines 3-4. Change wording from:
permutation probability underlying SRS, and the joint permutation of response and auxiliary variables is modeled using seemingly unrelated regression. We use Royall's linear least square
To:
permutation probability underlying SRS. The joint permutation of response and auxiliary variables is modeled using seemingly unrelated regression. We use Royall's linear least square

Page 1, lines 6. I think the model we use will be called a super-population model by others. Change:
variables. The predictors have similar functional form to those derived using design-based, model-assisted and calibration approaches, but depend on neither superpopulation nor regression model assumptions.
To:
variables. The predictors have similar functional form to those derived using design-based, model-assisted and calibration approaches, but arise directly from the sample design and do not require additional regression model assumptions.

Page 2, 1st paragraph. I'd suggest re-working this. From:
In sample survey research, auxiliary information such as gender, age, income and chronic disease-bearing history are completely or partially known. Such information can be used to assist sampling design and improve estimation. Methods of improving estimation with auxiliary information have been discussed in numerous occasions in both design-based (Cassel,
Särndal and Wretman 1977; Cochran 1977; Särndal and Wright 1984; Deville and Särndal 1992; Särndal, Swensson and Wretman 1992) and prediction-based (Bolfarine and Zacks 1992; Valliant, Dorfman and Royall 2000). The two approaches are distinct by their choice of probabilistic model for inference.
To:
Estimation can be improved by accounting for auxiliary information such as gender, age, income and chronic disease-bearing history that may be completely or partially known in a population. Methods of improving estimation with auxiliary information have been discussed in numerous occasions in both design-based (Cassel, Särndal and Wretman 1977; Cochran 1977; Särndal and Wright 1984; Deville and Särndal 1992; Särndal, Swensson and Wretman 1992) and prediction-based (Bolfarine and Zacks 1992; Valliant, Dorfman and Royall 2000). The two approaches are distinguished by different probability models and assumptions.

Page 2, 2nd paragraph. I think this needs re-working. Suggest change from:
The design-based approach uses probability sampling for both sample selection and inference from sample data. The probability distribution associated with the randomized sample selection provides the basis for probabilistic inferences about the target population. Common design-based approach of using known auxiliary information include ratio estimator and regression estimators (Brewer 1963; Royall 1970; Cochran 1977). The bias, variance and mean squared error (MSE) are defined in terms of the expectation over all possible samples under the sampling design, and thus the inference of design-based approach is often referred as unconditional. This approach leads to valid repeated sampling inferences regardless of the
population structure, and is free from model misspecification (Horvitz and Thompson 1952; Godambe 1955; Cassel, Särndal and Wretman 1977).
To:
The design-based approach uses the sampling design probabilities and additional model assumptions to develop estimators. Linear regression models may be assumed between the response and auxiliary variable (Brewer 1963; Royall 1970; Cochran 1977), or non-linear functions may be defined as ratios of response and auxiliary variables. Estimators minimize the expected mean squared error (MSE) which is defined in terms of the expectation over all possible samples under the sampling design. This approach leads to valid repeated sampling inferences regardless of the population structure, and is free from the model misspecification (Horvitz and Thompson 1952; Godambe 1955; Cassel, Särndal and Wretman 1977). [I don't understand this statement. We have to specify a regression model. Why is it free of model misspecification?]

Page 2. Line 3 continuing to Page 3. Change and insert the following:
From:
In model-assisted approach, plausible population models are often used to choose efficient estimators with good design-based properties, but the sample selection is based on randomization and statistical properties of estimators are computed with respect to the probability sampling distribution (Särndal, Swensson and Wretman 1992; Rao 1999).
To:
In model-assisted approach, efficient estimators with good design-based properties are developed for plausible population models, with efficiency determined from the probability sampling distribution (Särndal, Swensson and Wretman 1992; Rao 1999). The model is specified as an additive response error model with functional (usually linear) relationship between response and auxiliary variables on an elementary unit. This approach provides a formal framework for using auxiliary information at the estimation stage. [Is this your understanding? If so, why do you call them population models?]

Page 3, line 2. You refer to "estimation stage". I think the manuscript should only talk about estimation, not design.
This approach provides a formal framework for using auxiliary information at the estimation stage. The popular generalized regression estimator (GREG) is an example of this type (Cassel, Särndal and Wretman 1976; Särndal, Swensson and Wretman 1992).

Page 3, line. Change wording:
From:
Prediction- or model-based approach assumes that the target population follows a specified (superpopulation) model, the source of randomness is only attributable to the model, thus the model distribution yields inferences conditinal on the particular sample (Rao 1997; Brewer 1999; Valliant, Dorfman and Royall 2000). A population is typically considered as a realization of such superpopulation. The bias, variance and mean squared error of the predictor are defined in terms of the expectation over all possible realizations of a stochastic model that connects the variable of interest to a set of auxiliary variables (Brewer 1995). The best linear unbiased predictor (BLUP) is a typical model-based estimator (Ghosh and Rao 1994; Rao 1997). Model-dependent methods can perform poorly as evaluated by the expected value of the
model based MSE over the design in large samples if the model is misspecified (Hansen, Madow and Tepping 1983; Rao 1997).
To:
Prediction- or model-based approach assumes that the target population is a realization of a superpopulation defined by a superpopulation model (Rao 1997; Brewer 1999; Valliant, Dorfman and Royall 2000). Inference is conditional on the sample, and focuses on predicting functions of the unobserved random variables in the potentially realized population. The bias, variance and mean squared error of the predictor are defined in terms of the expectation over the superpopulation model (Brewer 1995). The best linear unbiased predictor (BLUP) is a typical model-based estimator (Ghosh and Rao 1994; Rao 1997). Model-dependent methods can perform poorly when evaluated by the expected value (with respect to the design) of the model based MSE if the superpopulation model is misspecified (Hansen, Madow and Tepping 1983; Rao 1997).

Page 3 to 4. Last paragraph. I don't think it is a prediction-based approach that is appealing, but a model based approach. The idea of prediction is somewhat unique to sampling, and most people consider estimation, not prediction, more important. The key is that they don't see estimation as prediction of the random variables not realized. The model based methods in the absence of a finite population are easier since they are not encumbered by the detailed structure of the population, such as actual distributions of ancillary variables. I distinguish model based from prediction based methods, since model based does not require a population at all, just a model. Prediction based (for me) implies the idea of a target to be predicted. For many, prediction also does not require a population to be conceptualized.
Prediction-based approach appeals to many statisticians who practice primarily in fields other than survey statistics because the prediction-based estimators generally have regression-like presentation, and inference is similar to those regression methods used in mainstream statistics. In addition, prediction-based approach appears to provide an easier platform for
adapting a rich collection of estimation techniques developed in regression model researches, such as generalized linear mixed models (citation, (Rao 2003)). This feature is especially pronounced in application of generalized linear mixed models in small area estimations (Rao 2003), which seems impossible in design-based framework otherwise.

Page 4. Re-wording second paragraph. Change:
An intrigue question to survey statistician is whether and how the rich collection of estimation techniques in prediction-based method can be adapted in a design-based framework and whether the design-based estimators can be communicated with a presentation that are parallel to prediction-based predictors. To answer this, this paper illustrates a method that applies common estimation techniques used in prediction-based approach under a simple design-based framework. More specifically, we develop a design-based prediction method of using auxiliary information in estimation under simple random sampling without replacement (SRS). This method makes use of the random permutation probability underlying SRS, and requires no additional assumptions. It incorporates known auxiliary information through simple transformation of the auxiliary variables. As an example, we show how this method can be applied to derive best linear unbiased predictor (BLUP) of the population total of a response variable.
To:
An intriguing question is whether prediction based methods can be used in a design-based framework. Some work in this area has been done by Brewer ( ). Random permutation superpopulation models, as developed by Rao and Bellhouse (1978), provide a link when estimating the population mean in simple random sampling, or two stage sampling settings.
However, additional model assumptions are needed to account for ancillary variables. We present a design-based approach that uses prediction theory methods to estimate the mean (or total) with auxiliary variables under simple random sampling. The method frames the problem similar to seemingly unrelated regression problem with an underlying random permutation model. No additional model assumptions are required. Known auxiliary information is incorporated through simple transformation of the auxiliary variables. We illustrate the methods in an example.

Page 4. line -3. Change From:
Let a finite population P consists of $N$ labeled subjects, $s = 1, 2, \ldots, N$, where $N$ is
To:
Let a finite population P consists of $N$ labeled subjects, $s = 1, 2, \ldots, N$, where $N$ is

Page 5. Line 4. Change From:
where $y^{(k)} = (y_1^{(k)} \; \cdots \; y_s^{(k)} \; \cdots \; y_N^{(k)})'$ is an $N \times 1$ column vector for the $k$-th response; $y_s = (y_s^{(0)} \; y_s^{(1)} \; \cdots \; y_s^{(p)})'$ is a $(p+1) \times 1$ column vector for subject $s$. Further, the population values are alternatively represented as a $N(p+1) \times 1$ column vector $z$, such that $z = \mathrm{vec}(y)$.
To:
where $y^{(k)} = (y_1^{(k)} \; \cdots \; y_s^{(k)} \; \cdots \; y_N^{(k)})'$ is an $N \times 1$ column vector for the $k$-th response. We summarize the population values as $z = \mathrm{vec}(y)$, an $N(p+1) \times 1$ column vector.

Page 5, line 5. Change from:
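Population parameters for the mean, total and variance for the $k$-th variate are given by $\mu^{(k)} = \frac{1}{N} 1_N' y^{(k)}$, $T^{(k)} = 1_N' y^{(k)}$, $\sigma_k^2 = \frac{1}{N-1} y^{(k)\prime} P y^{(k)}$, where $P = I_N - \frac{1}{N} J_N$, $I_N$ is an $N \times N$ dimensional identity matrix and $J_N$ is an $N \times N$ matrix of ones. The variance-covariance matrix of the $p+1$ variates are summarized as
$\Sigma_{(p+1) \times (p+1)} = \begin{pmatrix} \sigma_0^2 & \sigma_{01} & \cdots & \sigma_{0p} \\ \sigma_{10} & \sigma_1^2 & \cdots & \sigma_{1p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p0} & \sigma_{p1} & \cdots & \sigma_p^2 \end{pmatrix}$.
To:
Population parameters for the mean and total of the $k$-th variate are given by $\mu^{(k)} = \frac{1}{N} 1_N' y^{(k)}$ and $T^{(k)} = 1_N' y^{(k)}$. The covariance of the $k$-th and $k^*$-th variates is defined in terms of $\sigma_{kk^*} = \frac{1}{N-1} y^{(k)\prime} P y^{(k^*)}$, where $P = I_N - \frac{1}{N} J_N$, $I_N$ is an $N \times N$ dimensional identity matrix and $J_N$ is an $N \times N$ matrix of ones. The variance-covariance matrix of the $p+1$ variates is given by $\Sigma_{(p+1) \times (p+1)}$, where
$\Sigma_{(p+1) \times (p+1)} = \begin{pmatrix} \sigma_0^2 & \sigma_{01} & \cdots & \sigma_{0p} \\ \sigma_{10} & \sigma_1^2 & \cdots & \sigma_{1p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p0} & \sigma_{p1} & \cdots & \sigma_p^2 \end{pmatrix}$.

Page 5, line -4. Add. I think it should appear in the text the first time in addition to the abstract. Change:
Suppose a sample of size $n$ is selected via SRS from a finite population of known size $N$. We assume that the parameter of interest is $\mu^{(0)}$, and that $\mu^{(k)}$, $k = 1, \ldots, p$, is known. We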
represent sampling with a random permutation model. To do so, we define a set of indicator random variables $U_{is}$, $i = 1, 2, \ldots, N$, that have a value of 1 if the subject in the $i$-th position in a permutation is subject $s$, and 0 otherwise. Let the matrix $U = (U_1 \; U_2 \; \cdots \; U_N)'$ represent a matrix of indicator random variables, where $U_i = (U_{i1} \; U_{i2} \; \cdots \; U_{iN})'$. When all permutation are equally likely (consequence of SRS), $E(U) = \frac{1}{N} J_N$ and $\mathrm{cov}(\mathrm{vec}(U)) = \frac{1}{N-1} P \otimes P$. The matrix of random variables representing a joint permutation of $p+1$ variables is given by $Y_{N \times (p+1)} = U y_{N \times (p+1)}$, where $Y = (Y^{(0)} \; Y^{(1)} \; \cdots \; Y^{(k)} \; \cdots \; Y^{(p)})$, $Y^{(k)}$ is the random variable corresponding to variable $y^{(k)}$, $k = 0, 1, \ldots, p$. For simplicity, we denote $\mathrm{vec}(Y) = (I_{p+1} \otimes U)\,\mathrm{vec}(y)$. It is shown that $E(\mathrm{vec}(Y)) = (I_{p+1} \otimes 1_N)\,\mu$ and $\mathrm{cov}(\mathrm{vec}(Y)) = \Sigma \otimes P$, where $\Sigma$ is the population variance-covariance matrix of the $p+1$ variables as defined above.
To:
Suppose a sample of size $n$ is selected via simple random sampling without replacement (SRS) from a finite population of known size $N$. We represent the sample as the first $n$ units in a random permutation of population units. SRS occurs when each permutation is equally likely. The random variables corresponding to the bivariate values of the permuted units constitute the random permutation model. We formalize these definitions by defining a set of indicator random variables $U_{is}$, $i = 1, 2, \ldots, N$, that have a value of 1 if the subject in the $i$-th position in a
permutation is subject $s$, and 0 otherwise. Let the matrix $U = (U_1 \; U_2 \; \cdots \; U_N)'$ represent a matrix of indicator random variables, where $U_i = (U_{i1} \; U_{i2} \; \cdots \; U_{iN})'$. Then the matrix of random variables representing a joint permutation of $p+1$ variables is given by $Y_{N \times (p+1)} = U y_{N \times (p+1)}$, where $Y = (Y^{(0)} \; Y^{(1)} \; \cdots \; Y^{(k)} \; \cdots \; Y^{(p)})$, where $Y^{(k)}$ is the random variable corresponding to variable $y^{(k)}$, $k = 0, 1, \ldots, p$. For simplicity, we partition $Y$ into the response vector $Y^{(0)}$ and the auxiliary vectors $Y^{(1)}, Y^{(2)}, \ldots, Y^{(p)}$.
Expressions for the mean and variance can be developed using the properties of the indicator random variables. Since all permutations are equally likely, $E(U) = \frac{1}{N} J_N$ and $\mathrm{cov}(\mathrm{vec}(U)) = \frac{1}{N-1} P \otimes P$. Using these results, and using the expansion $\mathrm{vec}(Y) = (I_{p+1} \otimes U)\,\mathrm{vec}(y)$, we can show that $E(\mathrm{vec}(Y)) = (I_{p+1} \otimes 1_N)\,\mu$ and $\mathrm{cov}(\mathrm{vec}(Y)) = \Sigma \otimes P$.
Through proper rearrangement, the random variables can be partitioned into a sample and remaining portion. The sample portion corresponds to the random variables in the first $n$ positions (rows) of $Y$.
[WENJUN, you need to write this out formally. The partition matrix should be introduced. You may do this in an appendix if it is too complex. I also think you need to provide some details, or at least explain how you will get the various variance terms. You may want to introduce some simpler notation. For example, let $Y_I = (Y_I^{(0)} \; Y_I^{(g)})$ and define $Y_{II}$ similarly so that $\mathrm{var}\begin{pmatrix} Y_I \\ Y_{II} \end{pmatrix} = \begin{pmatrix} V_I & V_{I,II} \\ V_{II,I} & V_{II} \end{pmatrix}$.]
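The moment results quoted in the passage above can be checked numerically for a tiny population by enumerating every permutation. The following is a minimal sketch, not part of the manuscript: the data values, the function names `population_parameters` and `permutation_moments`, and the choice $N = 4$, $p = 1$ are all illustrative assumptions, and $1_N$ is taken to be an $N \times 1$ vector of ones with `vec` stacking columns.

```python
import itertools
import numpy as np


def population_parameters(y):
    """Mean vector, total vector, Sigma and P for an N x (p+1) population matrix y."""
    N = y.shape[0]
    ones = np.ones(N)
    P = np.eye(N) - np.ones((N, N)) / N      # P = I_N - J_N / N
    mu = ones @ y / N                        # mu^(k) = 1' y^(k) / N
    T = ones @ y                             # T^(k) = 1' y^(k)
    Sigma = y.T @ P @ y / (N - 1)            # sigma_kk* = y^(k)' P y^(k*) / (N - 1)
    return mu, T, Sigma, P


def permutation_moments(y):
    """Exact moments of vec(U) and vec(Y) over all N! equally likely permutations."""
    N = y.shape[0]
    vec_U, vec_Y = [], []
    for perm in itertools.permutations(range(N)):
        U = np.zeros((N, N))
        U[np.arange(N), list(perm)] = 1.0    # U[i, s] = 1 if position i holds subject s
        vec_U.append(U.flatten(order="F"))   # column-stacking vec
        vec_Y.append((U @ y).flatten(order="F"))
    vec_U = np.array(vec_U)
    vec_Y = np.array(vec_Y)
    E_U = vec_U.mean(axis=0).reshape((N, N), order="F")
    # bias=True divides by the number of permutations, giving exact covariances
    cov_U = np.cov(vec_U, rowvar=False, bias=True)
    E_Y = vec_Y.mean(axis=0)
    cov_Y = np.cov(vec_Y, rowvar=False, bias=True)
    return E_U, cov_U, E_Y, cov_Y


# Hypothetical population: column 0 is the response, column 1 one auxiliary variable.
y = np.array([[3.0, 1.0], [5.0, 2.0], [4.0, 4.0], [8.0, 3.0]])
N, q = y.shape
mu, T, Sigma, P = population_parameters(y)
E_U, cov_U, E_Y, cov_Y = permutation_moments(y)

checks = {
    "E(U) = J/N": np.allclose(E_U, np.ones((N, N)) / N),
    "cov(vec U) = (P kron P)/(N-1)": np.allclose(cov_U, np.kron(P, P) / (N - 1)),
    "E(vec Y) = (I kron 1) mu": np.allclose(E_Y, np.kron(np.eye(q), np.ones((N, 1))) @ mu),
    "cov(vec Y) = Sigma kron P": np.allclose(cov_Y, np.kron(Sigma, P)),
}
print(checks)
```

Full enumeration is only feasible for very small $N$; it is used here because it gives the exact moments, so the comparison does not depend on simulation error.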
Page 7. Line. Add: Wenjun, do you really want to assume that $\mu^{(k)}$ is known here? I also think that you should spend some more time on motivating this. The parameter of interest is non-stochastic. You need to express the parameter as a linear combination of random variables. This is what allows you to view the problem as a prediction problem, conditional on the observed sample. This is not automatic in other literature, so you should spell it out.
We assume that the parameter of interest is $\mu^{(0)}$, and that $\mu^{(k)}$, $k = 1, \ldots, p$, is known.

Page 7, line 6 etc. I'm not sure you have the correct organization for the ideas. I think it may be easier to understand if you present the random permutation model followed by the seemingly unrelated regression model. With the seemingly unrelated regression model, then present the parameters of interest (and discuss them). Finally, talk about the partitioning and the sampling. This would then lead into the estimation section. In the estimation section, don't refer to Royall's result. Instead, develop your result using the steps that Royall followed, and state that your development is parallel to that of Royall. I have had referees state that Royall's theorem doesn't apply to our setting (although I don't see why not). Avoid statements like "it can be shown". Instead, provide some guidance as to how you showed it. The reviewers are not stupid, but some of the stuff you did is complex. You may make them feel stupid if you don't provide guidance. You don't want them just to believe you, you want them to have enough information to check your results. It is OK to refer to your thesis for more details.
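As one possible way to spell out the point above about writing the non-stochastic parameter as a linear combination of random variables, the identity below may help. It is only a sketch: the symbols $Y_I^{(0)}$ (the first $n$ rows of $Y^{(0)}$, the sample) and $Y_{II}^{(0)}$ (the remaining $N - n$ rows) are assumed notation, not taken from the manuscript.

```latex
% Assumed notation: Y_I^{(0)} is the sample part of Y^{(0)} (first n rows),
% Y_{II}^{(0)} is the remainder (last N - n rows); 1_m is an m x 1 vector of ones.
\begin{align*}
\mu^{(0)} &= \frac{1}{N}\,\mathbf{1}_N' y^{(0)}
           = \frac{1}{N}\,\mathbf{1}_N' U y^{(0)}
           = \frac{1}{N}\,\mathbf{1}_N' Y^{(0)}
           && \text{since } \mathbf{1}_N' U = \mathbf{1}_N' \text{ for any permutation matrix } U \\
          &= \frac{1}{N}\left(\mathbf{1}_n' Y_I^{(0)}
             + \mathbf{1}_{N-n}' Y_{II}^{(0)}\right).
\end{align*}
```

Once the sample is realized, the first term is observed, so estimating $\mu^{(0)}$ reduces to predicting the linear combination $\mathbf{1}_{N-n}' Y_{II}^{(0)}$ of the unobserved random variables.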
I would like to see more discussion of equation 9, and a re-expression of this equation as predicting the un-observed random variables. In its present form, this is obscure. I think it is important. I haven't yet read after page 9, but there are a fair number of things for you to work on prior to that. I hope these comments are helpful. I'm gone for 1 week, but will be back after that.

Ed
More informationChapter 8: Estimation 1
Chapter 8: Estimation 1 Jae-Kwang Kim Iowa State University Fall, 2014 Kim (ISU) Ch. 8: Estimation 1 Fall, 2014 1 / 33 Introduction 1 Introduction 2 Ratio estimation 3 Regression estimator Kim (ISU) Ch.
More informationof being selected and varying such probability across strata under optimal allocation leads to increased accuracy.
5 Sampling with Unequal Probabilities Simple random sampling and systematic sampling are schemes where every unit in the population has the same chance of being selected We will now consider unequal probability
More informationMultiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C =
Economics 130 Lecture 6 Midterm Review Next Steps for the Class Multiple Regression Review & Issues Model Specification Issues Launching the Projects!!!!! Midterm results: AVG = 26.5 (88%) A = 27+ B =
More informationEmpirical and Constrained Empirical Bayes Variance Estimation Under A One Unit Per Stratum Sample Design
Empirical and Constrained Empirical Bayes Variance Estimation Under A One Unit Per Stratum Sample Design Sepideh Mosaferi Abstract A single primary sampling unit (PSU) per stratum design is a popular design
More informationThe R package sampling, a software tool for training in official statistics and survey sampling
The R package sampling, a software tool for training in official statistics and survey sampling Yves Tillé 1 and Alina Matei 2 1 Institute of Statistics, University of Neuchâtel, Switzerland yves.tille@unine.ch
More informationAdaptive two-stage sequential double sampling
Adaptive two-stage sequential double sampling Bardia Panahbehagh Afshin Parvardeh Babak Mohammadi March 4, 208 arxiv:803.04484v [math.st] 2 Mar 208 Abstract In many surveys inexpensive auxiliary variables
More informationagilis D1. Define Estimation Procedures European Commission Eurostat/B1, Eurostat/F1 Contract No
Informatics European Commission Eurostat/B1, Eurostat/F1 Contract No. 611.211.5-212.426 Development of methods and scenarios for an integrated system of D1. Define Estimation Procedures October 213 (Contract