Random permutation models with auxiliary variables. Design-based random permutation models with auxiliary information. Wenjun Li


Running head: Random permutation models with auxiliary variables

Design-based random permutation models with auxiliary information

Wenjun Li
Division of Preventive and Behavioral Medicine, University of Massachusetts Medical School, Shaw Building SH2-230, 55 Lake Avenue North, Worcester, MA 01655, USA
Telephone: (508) Fax: (508)

Edward J. Stanek
Department of Public Health, University of Massachusetts, 715 N. Pleasant Street, Amherst, MA 01003, USA
Telephone: (413) Fax: (413)

Julio da Motta Singer
Departamento de Estatística, Universidade de São Paulo, Caixa Postal 66281, São Paulo, SP, Brazil
Phone: Fax:

LiW_06.doc 9/27/2006 3:03 PM

Abstract

We extend the random permutation model proposed by Stanek, Singer and Lencina (2004) to obtain best linear unbiased estimators of a finite population mean under simple random sampling without replacement in situations where auxiliary information is available. The procedure provides a systematic design-based justification for well-known results involving common estimators and may serve as the basis for extending such a minimum-assumption theory to more complicated sample designs.

Keywords: auxiliary variable; design-based inference; prediction; finite sampling; random permutation model; simultaneous permutation

1. Introduction

Improvements in the precision of estimates of population parameters based on random samples can be made by accounting for auxiliary information (e.g., age, gender, etc.). Many estimators with such features have been proposed, but they either require assumptions beyond those pertaining to the sample design or lack an integrated theory. For example, model-based approaches (Ghosh and Rao 1994, Rao 1997) that generate best linear unbiased predictors (BLUP) ignore the sample design but require a postulated model. Additional superpopulation model assumptions are required for model-assisted approaches that lead to generalized regression (GREG) estimators (Särndal, Swensson and Wretman 1992). Calibration estimators, on the other hand, optimize benchmark weights by adjusting them to known population quantities on some set of auxiliary variables, but lack an integrated theory. Other design-based approaches consider the finite population as a sample realization from an infinite population, and thus make additional assumptions beyond the sampling design (Fuller 2002).

We develop a design-based estimator of a linear function of the response that accounts for auxiliary information and requires no assumptions beyond those defining simple random sampling. The development extends the use of the random permutation model (Stanek and Singer 2004, Stanek, Singer and Lencina 2004) to account for auxiliary variables. Under this minimal assumption setup, the results establish that the commonly used estimator (Cochran 1977) is BLUE. In addition, the development highlights novel ideas that emerge from the random permutation model framework. The first is the expression of population parameters as sums of random variables. Another is the classification of the underlying random variables into those that will be realized and those that need to be predicted.

This paper is organized as follows. We first present definitions and notation, and introduce the random permutation model. We next include multiple auxiliary variables, and use the model to derive the best linear unbiased estimator (BLUE) of the population mean. We conclude with an example and discussion.

2. A Design-Based Model for Simple Random Sampling

We represent sampling formally by a set of indicator random variables whose partial realization specifies a selected sample. These random variables permute the units in the population, and hence we refer to the underlying stochastic model as a random permutation model. Elements of the population, including the response of interest and auxiliary variables, are non-stochastic, but not necessarily observed. The population is represented as a vector of random variables. We use the stochastic model for these random variables to develop an optimal estimator of a parameter defined in the population, assuming the population mean is known for the auxiliary variables. Unlike previous work (Fuller 2002), our definition does not require the population to be a random sample from some infinite population.

Let the population consist of $N$ subjects, indexed by noninformative labels $s = 1, 2, \ldots, N$. A non-stochastic, potentially observable vector $z_s = ((z_{ks}))$, $k = 0, 1, \ldots, p$, is associated with subject $s$, where $z_{0s} = y_s$ denotes the outcome of interest and $z_{ks} = x_{ks} - \mu_k$, $k = 1, \ldots, p$, denote auxiliary variables (centered at zero), with $x_s = ((x_{ks}))$ and $\mu_x = ((\mu_k)) = N^{-1}\sum_{s=1}^{N} x_s$. The mean of the auxiliary variables is assumed known in the population. We represent the population vector of means by $\mu_z = (\mu_y \;\; 0_p')'$, where $\mu_y = N^{-1}\sum_{s=1}^{N} y_s$, and the population variance by

$$\Sigma = \begin{pmatrix} \sigma_y^2 & \sigma_{yX}' \\ \sigma_{yX} & \Sigma_X \end{pmatrix}, \qquad (N-1)\,\Sigma = \sum_{s=1}^{N} (z_s - \mu_z)(z_s - \mu_z)',$$

where $\sigma_{yX} = (\sigma_{y x_1} \; \sigma_{y x_2} \; \cdots \; \sigma_{y x_p})'$ and $\Sigma_X = ((\sigma_{x_k x_{k^*}}))$.

We define the random permutation model as the set of all possible equally likely permutations of subjects in the population. Following Stanek, Singer and Lencina (2004), we explicitly define a set of indicator random variables $U_{is}$, $i = 1, 2, \ldots, N$, $s = 1, 2, \ldots, N$, that have a value of one if subject $s$ is in position $i$ in a permutation, and zero otherwise. Using this notation, we define

$$Z_i = \sum_{s=1}^{N} U_{is}\, z_s = z' U_i, \qquad U_i = (U_{i1} \; U_{i2} \; \cdots \; U_{iN})', \qquad z = (z_1 \; z_2 \; \cdots \; z_N)',$$

and $Z = Uz$, where $U = (U_1 \; U_2 \; \cdots \; U_N)'$. We refer to $U$ as a random permutation matrix, and require each realization to be equally likely, subject to the constraints that one unit is allocated to each position, $U 1_N = 1_N$, and all units are assigned to a position, $U' 1_N = 1_N$, where $1_N$ is an $N \times 1$ vector with all elements equal to 1. Taking the expectation over all possible permutations,

$$E(Z) = 1_N \mu_z', \qquad \operatorname{cov}(\operatorname{vec}(Z)) = \Sigma \otimes P_{N,N},$$

where $P_{a,b} = I_a - b^{-1} J_a$, $I_a$ is an $a \times a$ identity matrix, and $J_a = 1_a 1_a'$. The random variables in $Z$ represent a full permutation of the subjects in the population, with the first column, $Y$, representing the response, and the remaining columns, $X_k$, $k = 1, \ldots, p$, representing auxiliary variables. Note that subjects are not identifiable in this representation.

Without loss of generality, we assume that the sample corresponds to the random variables in rows $i = 1, \ldots, n$, i.e., $Z_S$, with the remainder in the rows corresponding to $i = n+1, \ldots, N$, i.e., $Z_R$, where $Z = (Z_S' \;\; Z_R')'$. This notation explicitly represents the process of simple random sampling by a stochastic model.

We simplify estimation by defining a column expansion of the random variables for the sample, $Z_I = \operatorname{vec}(Z_S)$, and the remainder, $Z_{II} = \operatorname{vec}(Z_R)$. The expected values of the sample and remaining random variables are $E(Z_I) = G_I \mu_y$ and $E(Z_{II}) = G_{II} \mu_y$, where $G_I = (1_n' \;\; 0_{np}')'$ and $G_{II} = (1_{N-n}' \;\; 0_{(N-n)p}')'$. The covariance structure is

$$\operatorname{var}\begin{pmatrix} Z_I \\ Z_{II} \end{pmatrix} = \begin{pmatrix} V_I & V_{I,II} \\ V_{I,II}' & V_{II} \end{pmatrix},$$

where $V_I = \Sigma \otimes P_{n,N}$, $V_{II} = \Sigma \otimes P_{N-n,N}$, and $V_{I,II} = -N^{-1}\, \Sigma \otimes J_{n \times (N-n)}$, with $J_{n \times (N-n)} = 1_n 1_{N-n}'$. Consequently, the partitioned model that reflects simple random sampling can be represented as

$$\begin{pmatrix} Z_I \\ Z_{II} \end{pmatrix} = \begin{pmatrix} G_I \\ G_{II} \end{pmatrix} \mu_y + E. \qquad (1)$$

3. BLUE of a linear function

Our interest lies in linear functions of the permuted response variate, namely $\theta = c'Y = \sum_{i=1}^{N} c_i Y_i$, or equivalently,

$$\theta = C_I' Z_I + C_{II}' Z_{II}, \qquad (2)$$

where $C_I = (c_I' \;\; 0_{np}')'$, $C_{II} = (c_{II}' \;\; 0_{(N-n)p}')'$ and $c = (c_I' \;\; c_{II}')'$. For example, when $c_i = 1/N$ for all $i = 1, \ldots, N$, $\theta = \mu_y$; when $c_i = 1$ for all $i = 1, \ldots, N$, $\theta$ is the population total; or when $c_i = 1/n^*$ for $i = 1, \ldots, n^* < n$ (and zero otherwise), $\theta$ may correspond to the mean response for an interviewer of the first $n^*$ sample subjects.

After sampling, only $C_{II}' Z_{II}$ will be unknown; thus, estimating $\theta$ is equivalent to predicting $C_{II}' Z_{II}$. Following Royall's prediction approach (Royall 1976), we develop the best linear unbiased predictor (BLUP) of $C_{II}' Z_{II}$ which, when added to $C_I' Z_I$, generates the BLUE of $\theta$. We require the predictor to be a linear function of the sample, i.e., $w' Z_I$, to be an unbiased predictor of $C_{II}' Z_{II}$, i.e., $E(w' Z_I) = E(C_{II}' Z_{II})$, and to have minimum expected mean squared error (MSE). As a result, the estimator of $\theta$ can be expressed as $P = C_I' Z_I + w' Z_I$. The unbiasedness constraint implies that $w' G_I - C_{II}' G_{II} = 0$. The variance of $P$ as a predictor of $\theta$, i.e., the variance of the prediction error $P - \theta$, which is what $\operatorname{var}(P)$ denotes in what follows, is given by

$$\operatorname{var}(P) = w' V_I w - 2 w' V_{I,II} C_{II} + C_{II}' V_{II} C_{II}.$$

We then apply Royall's prediction theorem (Royall 1976) to find the value of $w$ that minimizes

$$\Phi(w) = w' V_I w - 2 w' V_{I,II} C_{II} + 2\lambda \{ w' G_I - C_{II}' G_{II} \},$$

where $\lambda$ is a Lagrangian multiplier. The unique solution is

$$\hat{w} = V_I^{-1} \left\{ V_{I,II} + G_I \left( G_I' V_I^{-1} G_I \right)^{-1} \left( G_{II}' - G_I' V_I^{-1} V_{I,II} \right) \right\} C_{II} = \frac{\bar{c}_{II}}{f} \begin{pmatrix} 1 - f \\ -\beta \end{pmatrix} \otimes 1_n, \qquad (3)$$

where $\bar{c}_{II} = (N-n)^{-1} \sum_{i=n+1}^{N} c_i$, $f = n/N$ and $\beta = \Sigma_X^{-1} \sigma_{yX} = (\beta_1 \; \beta_2 \; \cdots \; \beta_p)'$. Consequently,

$$\hat{P} = c_I' Y_I + (N-n)\, \bar{c}_{II} \left( \bar{Y} - (1-f)^{-1} \sum_{k=1}^{p} \beta_k (\bar{X}_k - \mu_k) \right), \qquad (4)$$

where $\bar{Y}$ and $\bar{X}_k$, $k = 1, \ldots, p$, are sample means. The variance is given by

$$\operatorname{var}(\hat{P}) = \left( c_{II}' c_{II} - (N-n)\, \bar{c}_{II}^{\,2} \right) \sigma_y^2 + N^2\, \bar{c}_{II}^{\,2}\, (1-f) \left( 1 - \rho_{yX}^2 \right) \frac{\sigma_y^2}{n}, \qquad (5)$$

where $\rho_{yX}^2 = \sigma_{yX}' \Sigma_X^{-1} \sigma_{yX} / \sigma_y^2$ is the squared multiple correlation coefficient of $Y$ on $X$. In practical applications, $\beta$, $\sigma_y^2$, and $\rho_{yX}^2$ are not known, and must be replaced by sample estimators.

4. Example

As an example, suppose we are interested in estimating the mean response given by $\mu_y = N^{-1} \sum_{i=1}^{N} Y_i$ based on a simple random sample, accounting for auxiliary information. Since $c_i = 1/N$ for $i = 1, \ldots, N$, the BLUE is

$$\hat{P} = f \bar{Y} + (1-f) \left( \bar{Y} - (1-f)^{-1} \sum_{k=1}^{p} \beta_k (\bar{X}_k - \mu_k) \right), \qquad (6)$$

or equivalently, $\hat{P} = \bar{Y} - \sum_{k=1}^{p} \beta_k (\bar{X}_k - \mu_k)$, with variance

$$\operatorname{var}(\hat{P}) = \left( 1 - \rho_{yX}^2 \right) (1-f)\, \frac{\sigma_y^2}{n}. \qquad (7)$$

As a practical application, suppose there is interest in estimating the smoking rate $\mu_y = \pi_y$ in a population based on a simple random sample with both smoking status (1 = smoker, 0 = non-smoker) and gender (1 = male, 0 = female) recorded on the sample subjects. We assume that the proportion of males in the population, $\mu_x = \pi_x$, is known, and represent the sample estimate of the proportion smoking as $\bar{Y} = \hat{\pi}_y$, the proportion of males in the sample as $\bar{X} = \hat{\pi}_x$, and the proportion of male smokers in the sample as $\hat{\pi}_{xy}$. With this notation,

$$(n-1)\, \hat{\sigma}_y^2 = n\, \hat{\pi}_y (1 - \hat{\pi}_y), \qquad (n-1)\, \hat{\sigma}_x^2 = n\, \hat{\pi}_x (1 - \hat{\pi}_x), \qquad (n-1)\, \hat{\sigma}_{xy} = n\, (\hat{\pi}_{xy} - \hat{\pi}_x \hat{\pi}_y).$$

Using these estimators, we estimate $\beta$ by

$$b = (\hat{\pi}_{xy} - \hat{\pi}_x \hat{\pi}_y) \left( \hat{\pi}_x (1 - \hat{\pi}_x) \right)^{-1} = \hat{\pi}_{y[\text{male}]} - \hat{\pi}_{y[\text{female}]},$$

which is the estimated difference in male and female prevalences based on the sample. Substituting these expressions into the estimator in (6) and (7) results in

$$\hat{P} = f \hat{\pi}_y + (1-f) \left\{ \hat{\pi}_y - (1-f)^{-1}\, b\, (\hat{\pi}_x - \pi_x) \right\} = \hat{\pi}_y - \left( \hat{\pi}_{y[\text{male}]} - \hat{\pi}_{y[\text{female}]} \right) (\hat{\pi}_x - \pi_x),$$

which is the well-known post-stratified estimator, with estimated variance $\widehat{\operatorname{var}}(\hat{P}) = (1 - \hat{\rho}^2)(1-f)\, \hat{\sigma}_y^2 / n$, where

$$\hat{\rho}^2 = \frac{(\hat{\pi}_{xy} - \hat{\pi}_x \hat{\pi}_y)^2}{\hat{\pi}_x (1 - \hat{\pi}_x)\, \hat{\pi}_y (1 - \hat{\pi}_y)}.$$

5. Discussion

We have shown that the estimator (4) is the best linear unbiased estimator (BLUE) of a linear combination of the response under simple random sampling without replacement. The results establish that the commonly used estimator developed under alternative frameworks is BLUE. The estimator has the same form as those commonly seen in multiple linear regression models that do not account for the finite population (Graybill 1976), but includes a finite population correction factor in the variance. Results (6) and (7) are also identical to those for difference estimators with optimal coefficients (Montanari 1987); to the GREG estimator (Särndal, Swensson and Wretman 1992); and to the multiple regression estimator developed under a superpopulation model (Fuller 2002).
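The design-based claims behind (6) and (7) can be checked by brute force on a toy population, since under simple random sampling every sample of size n is equally likely. The following Python sketch is illustrative only: the eight-subject population, the function names, and the choice to treat β as known (computed from the full population with the N − 1 divisor) are assumptions made for this example and do not come from the paper. It enumerates all samples, compares the exact design mean and variance of the estimator with µ_y and formula (7), and confirms that the version with the sample coefficient b equals the post-stratified estimator whenever both sexes appear in the sample.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population: (smoker y, male x), both coded 0/1.
population = [(1, 1), (1, 1), (0, 1), (0, 1), (1, 0), (0, 0), (0, 0), (0, 0)]
N, n = len(population), 4
f = Fraction(n, N)  # sampling fraction


def mean(values):
    """Exact arithmetic mean as a Fraction."""
    values = list(values)
    return Fraction(sum(values)) / len(values)


mu_y = mean(y for y, _ in population)
mu_x = mean(x for _, x in population)

# Population variances and covariance with the N - 1 divisor.
var_y = sum((y - mu_y) ** 2 for y, _ in population) / (N - 1)
var_x = sum((x - mu_x) ** 2 for _, x in population) / (N - 1)
cov_xy = sum((y - mu_y) * (x - mu_x) for y, x in population) / (N - 1)
beta = cov_xy / var_x                    # single auxiliary variable, p = 1
rho2 = cov_xy ** 2 / (var_x * var_y)     # squared correlation of y and x


def blue_known_beta(sample):
    """Estimator (6) in its simplified form, with beta treated as known."""
    y_bar = mean(y for y, _ in sample)
    x_bar = mean(x for _, x in sample)
    return y_bar - beta * (x_bar - mu_x)


def regression_with_b(sample):
    """Same form, with beta replaced by the sample coefficient b."""
    y_bar = mean(y for y, _ in sample)
    x_bar = mean(x for _, x in sample)
    xy_bar = mean(y * x for y, x in sample)
    b = (xy_bar - x_bar * y_bar) / (x_bar * (1 - x_bar))
    return y_bar - b * (x_bar - mu_x)


def poststratified(sample):
    """Sex-specific sample rates weighted by the known population shares."""
    males = [y for y, x in sample if x == 1]
    females = [y for y, x in sample if x == 0]
    return mu_x * mean(males) + (1 - mu_x) * mean(females)


# Every subset of n positions is an equally likely simple random sample.
samples = list(combinations(population, n))
estimates = [blue_known_beta(s) for s in samples]
design_mean = mean(estimates)
design_var = mean((e - mu_y) ** 2 for e in estimates)
formula_var = (1 - rho2) * (1 - f) * var_y / n  # right-hand side of (7)

# The b-based estimator is defined only when both sexes are sampled.
mixed = [s for s in samples if 0 < sum(x for _, x in s) < n]
forms_agree = all(regression_with_b(s) == poststratified(s) for s in mixed)

print("design mean:", design_mean, "| population mean:", mu_y)
print("design variance:", design_var, "| formula (7):", formula_var)
print("regression form equals post-stratified form:", forms_agree)
```

Exact rational arithmetic is used so that the comparisons are identities rather than approximations. The unbiasedness and variance checks apply only to the known-β estimator; once b is estimated from the sample, as in the smoking example, the sketch verifies only the algebraic equivalence with post-stratification, not exact unbiasedness.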

The survey sampling literature has struggled to reconcile design-based and model-based theories of estimation and prediction. Model-based methods recently popularized by Valliant, Dorfman and Royall (2000) stem from the prediction approach developed by Royall (1973, 1976). The underlying theoretical structure is important, since it allows such methods to be extended relatively easily to different applications with increasing complexity. The limitation of such theory is that it does not account for the sample design.

A similar unifying theory has not been developed for design-based methods. Cochran's (1977) original approach was to postulate a linear regression model, and then determine the regression coefficients by minimizing the variance. Other approaches, such as the GREG or the calibration approaches (Särndal, Swensson and Wretman 1992), have combined model-based and design-based ideas, or begun with ad hoc functional forms of estimators and optimized them in special settings. These approaches have been successful in addressing many practical problems in a design-based framework (Särndal, Swensson and Wretman 1992, Brewer 2002). However, they have not provided a consistent conceptual and theoretical basis that can be readily extended to more complex applications.

We believe that representing the sample design via a random permutation model, and then predicting functions of unobserved subjects in a systematic way, provides an appealing, straightforward foundation for finite population inference. There are steps in this process that break with tradition, such as expressing a parameter as a sum of random variables. Focusing attention on predicting unobserved quantities is certainly intuitively satisfying, but unusual in the context of estimation. The development also blurs the distinction between the traditional use of the term predictor (for random variables) and estimator (for parameters).

We have illustrated how the design-based random permutation model theory can be extended to include auxiliary variables in a straightforward manner. These results extend the scope of previous results (Stanek and Singer 2004, Stanek, Singer and Lencina 2004) to a broader class of problems. The previous developments of the theory have identified subtleties in interpreting random effects in simple random sampling (Stanek, Singer and Lencina 2004) and developed predictors of realized random effects in balanced two-stage sampling problems with response error (Stanek and Singer 2004). Current research is extending these results to clustered population settings where clusters are of different size and there is unequal probability sampling, and to settings where there is missing data. In each case, a similar approach is considered, with estimators (or predictors) developed via a clear optimization theory.

In practice, covariances in the expressions for the estimators need to be estimated. Some simulation study results on the impact of such estimation are given by Li (2003). The resulting estimator coincides with those developed by GREG or calibration approaches, and strengthens the appeal of the random permutation model. Still, much more work is needed to extend the methods to more complex settings, including two-stage designs with cluster and unit covariates, longitudinal studies, and settings where units are randomized to treatments. We consider the basic results developed here to provide a foundation for additional work in these directions.

6. Acknowledgements

This research was partially supported by an NIH grant (NIH-PHS-R01-HD36848). The authors wish to thank Drs. John Buonaccorsi and Carol Bigelow for their constructive comments. The content of this article is a part of the first author's dissertation conducted at the Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, Massachusetts.

7. References

Brewer, K. R. W. (2002), Combined Survey Sampling Inference: Weighing Basu's Elephants, London; New York: Arnold; distributed in the United States of America by Oxford University Press.

Cochran, W. G. (1977), Sampling Techniques (Third ed.), New York: John Wiley and Sons.

Fuller, W. A. (2002), "Regression Estimation for Survey Samples," Survey Methodology, 28.

Ghosh, M., and Rao, J. N. K. (1994), "Small Area Estimation: An Appraisal," Statistical Science, 9.

Graybill, F. A. (1976), Theory and Application of the Linear Model (Vol. 1), Belmont, CA: Wadsworth Publishing Company, Inc.

Li, W. (2003), "Use of Random Permutation Model in Rate Estimation and Standardization," Ph.D. Dissertation, University of Massachusetts, Department of Biostatistics and Epidemiology.

Montanari, G. E. (1987), "Post-Sampling Efficient Qr-Prediction in Large-Sample Surveys," International Statistical Review, 55.

Rao, J. N. K. (1997), "Developments in Sample Survey Theory: An Appraisal," Canadian Journal of Statistics, 25, 1-21.

Royall, R. M. (1973), "The Prediction Approach to Finite Population Sampling Theory: Application to the Hospital Discharge Survey," Technical, National Center for Health Statistics, Office of Statistical Methods.

Royall, R. M. (1976), "The Linear Least-Squares Prediction Approach to Two-Stage Sampling," Journal of the American Statistical Association, 71.

Särndal, C. E., Swensson, B., and Wretman, J. (1992), Model Assisted Survey Sampling, New York: Springer-Verlag.

Stanek, E. J., and Singer, J. M. (2004), "Predicting Random Effects from Finite Population Clustered Samples with Response Error," Journal of the American Statistical Association, 99.

Stanek, E. J., Singer, J. M., and Lencina, V. B. (2004), "A Unified Approach to Estimation and Prediction under Simple Random Sampling," Journal of Statistical Planning and Inference, 121.

Valliant, R., Dorfman, A. H., and Royall, R. M. (2000), Finite Population Sampling and Inference, a Prediction Approach, New York: John Wiley & Sons.


More information

Improved ratio-type estimators using maximum and minimum values under simple random sampling scheme

Improved ratio-type estimators using maximum and minimum values under simple random sampling scheme Improved ratio-type estimators using maximum and minimum values under simple random sampling scheme Mursala Khan Saif Ullah Abdullah. Al-Hossain and Neelam Bashir Abstract This paper presents a class of

More information

Cross-sectional variance estimation for the French Labour Force Survey

Cross-sectional variance estimation for the French Labour Force Survey Survey Research Methods (007 Vol., o., pp. 75-83 ISS 864-336 http://www.surveymethods.org c European Survey Research Association Cross-sectional variance estimation for the French Labour Force Survey Pascal

More information

Does low participation in cohort studies induce bias? Additional material

Does low participation in cohort studies induce bias? Additional material Does low participation in cohort studies induce bias? Additional material Content: Page 1: A heuristic proof of the formula for the asymptotic standard error Page 2-3: A description of the simulation study

More information

Computation of Csiszár s Mutual Information of Order α

Computation of Csiszár s Mutual Information of Order α Computation of Csiszár s Mutual Information of Order Damianos Karakos, Sanjeev Khudanpur and Care E. Priebe Department of Electrical and Computer Engineering and Center for Language and Speech Processing

More information

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census

More information

Known unknowns : using multiple imputation to fill in the blanks for missing data

Known unknowns : using multiple imputation to fill in the blanks for missing data Known unknowns : using multiple imputation to fill in the blanks for missing data James Stanley Department of Public Health University of Otago, Wellington james.stanley@otago.ac.nz Acknowledgments Cancer

More information

No is the Easiest Answer: Using Calibration to Assess Nonignorable Nonresponse in the 2002 Census of Agriculture

No is the Easiest Answer: Using Calibration to Assess Nonignorable Nonresponse in the 2002 Census of Agriculture No is the Easiest Answer: Using Calibration to Assess Nonignorable Nonresponse in the 2002 Census of Agriculture Phillip S. Kott National Agricultural Statistics Service Key words: Weighting class, Calibration,

More information

R function for residual analysis in linear mixed models: lmmresid

R function for residual analysis in linear mixed models: lmmresid R function for residual analysis in linear mixed models: lmmresid Juvêncio S. Nobre 1, and Julio M. Singer 2, 1 Departamento de Estatística e Matemática Aplicada, Universidade Federal do Ceará, Fortaleza,

More information

EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS

EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS Statistica Sinica 24 2014, 395-414 doi:ttp://dx.doi.org/10.5705/ss.2012.064 EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS Jun Sao 1,2 and Seng Wang 3 1 East Cina Normal University,

More information

BIOSTATS 540 Fall 2016 Exam 1 (Unit 1 Summarizing Data) Page 1 of 7

BIOSTATS 540 Fall 2016 Exam 1 (Unit 1 Summarizing Data) Page 1 of 7 BIOSTATS 540 Fall 2016 Exam 1 (Unit 1 Summarizing Data) Page 1 of 7 BIOSTATS 540 - Introductory Biostatistics Fall 2016 Examination 1 (Unit 1 Summarizing Data) Due: Monday September 26, 2016 Last Date

More information

Comparison of Estimators in Case of Low Correlation in Adaptive Cluster Sampling. Muhammad Shahzad Chaudhry 1 and Muhammad Hanif 2

Comparison of Estimators in Case of Low Correlation in Adaptive Cluster Sampling. Muhammad Shahzad Chaudhry 1 and Muhammad Hanif 2 ISSN 684-8403 Journal of Statistics Volume 3, 06. pp. 4-57 Comparison of Estimators in Case of Lo Correlation in Muhammad Shahad Chaudhr and Muhammad Hanif Abstract In this paper, to Regression-Cum-Eponential

More information

Ignoring the matching variables in cohort studies - when is it valid, and why?

Ignoring the matching variables in cohort studies - when is it valid, and why? Ignoring the matching variables in cohort studies - when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposure-outcome association

More information

From the help desk: It s all about the sampling

From the help desk: It s all about the sampling The Stata Journal (2002) 2, Number 2, pp. 90 20 From the help desk: It s all about the sampling Allen McDowell Stata Corporation amcdowell@stata.com Jeff Pitblado Stata Corporation jsp@stata.com Abstract.

More information

Combining data from two independent surveys: model-assisted approach

Combining data from two independent surveys: model-assisted approach Combining data from two independent surveys: model-assisted approach Jae Kwang Kim 1 Iowa State University January 20, 2012 1 Joint work with J.N.K. Rao, Carleton University Reference Kim, J.K. and Rao,

More information

arxiv: v1 [math.st] 28 Feb 2017

arxiv: v1 [math.st] 28 Feb 2017 Bridging Finite and Super Population Causal Inference arxiv:1702.08615v1 [math.st] 28 Feb 2017 Peng Ding, Xinran Li, and Luke W. Miratrix Abstract There are two general views in causal analysis of experimental

More information

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y

More information

MATH 680 Fall November 27, Homework 3

MATH 680 Fall November 27, Homework 3 MATH 680 Fall 208 November 27, 208 Homework 3 This homework is due on December 9 at :59pm. Provide both pdf, R files. Make an individual R file with proper comments for each sub-problem. Subgradients and

More information

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Thursday, August 30, 2018

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Thursday, August 30, 2018 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Thursday, August 30, 2018 Work all problems. 60 points are needed to pass at the Masters Level and 75

More information

Sociedad de Estadística e Investigación Operativa

Sociedad de Estadística e Investigación Operativa Sociedad de Estadística e Investigación Operativa Test Volume 14, Number 2. December 2005 Estimation of Regression Coefficients Subject to Exact Linear Restrictions when Some Observations are Missing and

More information

Multiple Comparison Testing for Experimental Chemotherapy Based on Multivariate Covariance Analysis

Multiple Comparison Testing for Experimental Chemotherapy Based on Multivariate Covariance Analysis Journal of Statistical and Econometric Methods, vol., no., 0, -0 ISSN: 9-0 (print), 9-99 (online) Scienpress Ltd, 0 Multiple Comparison Testing for Experimental Chemotherap Based on Multivariate Covariance

More information

Supplement-Sample Integration for Prediction of Remainder for Enhanced GREG

Supplement-Sample Integration for Prediction of Remainder for Enhanced GREG Supplement-Sample Integration for Prediction of Remainder for Enhanced GREG Abstract Avinash C. Singh Division of Survey and Data Sciences American Institutes for Research, Rockville, MD 20852 asingh@air.org

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2 1 Bootstrapped Bias and CIs Given a multiple regression model with mean and

More information

Pooling multiple imputations when the sample happens to be the population.

Pooling multiple imputations when the sample happens to be the population. Pooling multiple imputations when the sample happens to be the population. Gerko Vink 1,2, and Stef van Buuren 1,3 arxiv:1409.8542v1 [math.st] 30 Sep 2014 1 Department of Methodology and Statistics, Utrecht

More information

Obnoxious lateness humor

Obnoxious lateness humor Obnoxious lateness humor 1 Using Bayesian Model Averaging For Addressing Model Uncertainty in Environmental Risk Assessment Louise Ryan and Melissa Whitney Department of Biostatistics Harvard School of

More information

arxiv: v1 [math.st] 22 Dec 2018

arxiv: v1 [math.st] 22 Dec 2018 Optimal Designs for Prediction in Two Treatment Groups Rom Coefficient Regression Models Maryna Prus Otto-von-Guericke University Magdeburg, Institute for Mathematical Stochastics, PF 4, D-396 Magdeburg,

More information

Incorporating published univariable associations in diagnostic and prognostic modeling

Incorporating published univariable associations in diagnostic and prognostic modeling Incorporating published univariable associations in diagnostic and prognostic modeling Thomas Debray Julius Center for Health Sciences and Primary Care University Medical Center Utrecht The Netherlands

More information

A flexible two-step randomised response model for estimating the proportions of individuals with sensitive attributes

A flexible two-step randomised response model for estimating the proportions of individuals with sensitive attributes A flexible two-step randomised response model for estimating the proportions of individuals with sensitive attributes Anne-Françoise Donneau, Murielle Mauer Francisco Sartor and Adelin Albert Department

More information

Topic 3 Populations and Samples

Topic 3 Populations and Samples BioEpi540W Populations and Samples Page 1 of 33 Topic 3 Populations and Samples Topics 1. A Feeling for Populations v Samples 2 2. Target Populations, Sampled Populations, Sampling Frames 5 3. On Making

More information

ESTP course on Small Area Estimation

ESTP course on Small Area Estimation ESTP course on Small Area Estimation Statistics Finland, Helsinki, 29 September 2 October 2014 Topic 1: Introduction to small area estimation Risto Lehtonen, University of Helsinki Lecture topics: Monday

More information

Dependence and scatter-plots. MVE-495: Lecture 4 Correlation and Regression

Dependence and scatter-plots. MVE-495: Lecture 4 Correlation and Regression Dependence and scatter-plots MVE-495: Lecture 4 Correlation and Regression It is common for two or more quantitative variables to be measured on the same individuals. Then it is useful to consider what

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Comparison of Two Ratio Estimators Using Auxiliary Information

Comparison of Two Ratio Estimators Using Auxiliary Information IOR Journal of Mathematics (IOR-JM) e-in: 78-578, p-in: 39-765. Volume, Issue 4 Ver. I (Jul. - Aug.06), PP 9-34 www.iosrjournals.org omparison of Two Ratio Estimators Using Auxiliar Information Bawa, Ibrahim,

More information

Chapter 3: Element sampling design: Part 1

Chapter 3: Element sampling design: Part 1 Chapter 3: Element sampling design: Part 1 Jae-Kwang Kim Fall, 2014 Simple random sampling 1 Simple random sampling 2 SRS with replacement 3 Systematic sampling Kim Ch. 3: Element sampling design: Part

More information

Prediction of New Observations

Prediction of New Observations Statistic Seminar: 6 th talk ETHZ FS2010 Prediction of New Observations Martina Albers 12. April 2010 Papers: Welham (2004), Yiang (2007) 1 Content Introduction Prediction of Mixed Effects Prediction of

More information

An analytic proof of the theorems of Pappus and Desargues

An analytic proof of the theorems of Pappus and Desargues Note di Matematica 22, n. 1, 2003, 99 106. An analtic proof of the theorems of Pappus and Desargues Erwin Kleinfeld and Tuong Ton-That Department of Mathematics, The Universit of Iowa, Iowa Cit, IA 52242,

More information

Research Design - - Topic 15a Introduction to Multivariate Analyses 2009 R.C. Gardner, Ph.D.

Research Design - - Topic 15a Introduction to Multivariate Analyses 2009 R.C. Gardner, Ph.D. Research Design - - Topic 15a Introduction to Multivariate Analses 009 R.C. Gardner, Ph.D. Major Characteristics of Multivariate Procedures Overview of Multivariate Techniques Bivariate Regression and

More information

Contextual Effects in Modeling for Small Domains

Contextual Effects in Modeling for Small Domains University of Wollongong Research Online Applied Statistics Education and Research Collaboration (ASEARC) - Conference Papers Faculty of Engineering and Information Sciences 2011 Contextual Effects in

More information

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Yan Wang 1, Michael Ong 2, Honghu Liu 1,2,3 1 Department of Biostatistics, UCLA School

More information

Unbiased estimation of exposure odds ratios in complete records logistic regression

Unbiased estimation of exposure odds ratios in complete records logistic regression Unbiased estimation of exposure odds ratios in complete records logistic regression Jonathan Bartlett London School of Hygiene and Tropical Medicine www.missingdata.org.uk Centre for Statistical Methodology

More information

Generalized Pseudo Empirical Likelihood Inferences for Complex Surveys

Generalized Pseudo Empirical Likelihood Inferences for Complex Surveys The Canadian Journal of Statistics Vol.??, No.?,????, Pages???-??? La revue canadienne de statistique Generalized Pseudo Empirical Likelihood Inferences for Complex Surveys Zhiqiang TAN 1 and Changbao

More information

Estimation of Mean Population in Small Area with Spatial Best Linear Unbiased Prediction Method

Estimation of Mean Population in Small Area with Spatial Best Linear Unbiased Prediction Method Journal of Physics: Conference Series PAPER OPEN ACCESS Estimation of Mean Population in Small Area with Spatial Best Linear Unbiased Prediction Method To cite this article: Syahril Ramadhan et al 2017

More information

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

More information

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional

More information

Three-Level Modeling for Factorial Experiments With Experimentally Induced Clustering

Three-Level Modeling for Factorial Experiments With Experimentally Induced Clustering Three-Level Modeling for Factorial Experiments With Experimentally Induced Clustering John J. Dziak The Pennsylvania State University Inbal Nahum-Shani The University of Michigan Copyright 016, Penn State.

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and can be printed and given to the

More information

Job Training Partnership Act (JTPA)

Job Training Partnership Act (JTPA) Causal inference Part I.b: randomized experiments, matching and regression (this lecture starts with other slides on randomized experiments) Frank Venmans Example of a randomized experiment: Job Training

More information

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke BIOL 51A - Biostatistics 1 1 Lecture 1: Intro to Biostatistics Smoking: hazardous? FEV (l) 1 2 3 4 5 No Yes Smoke BIOL 51A - Biostatistics 1 2 Box Plot a.k.a box-and-whisker diagram or candlestick chart

More information

SAMPLING BIOS 662. Michael G. Hudgens, Ph.D. mhudgens :55. BIOS Sampling

SAMPLING BIOS 662. Michael G. Hudgens, Ph.D.   mhudgens :55. BIOS Sampling SAMPLIG BIOS 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-11-14 15:55 BIOS 662 1 Sampling Outline Preliminaries Simple random sampling Population mean Population

More information

Development of methodology for the estimate of variance of annual net changes for LFS-based indicators

Development of methodology for the estimate of variance of annual net changes for LFS-based indicators Development of methodology for the estimate of variance of annual net changes for LFS-based indicators Deliverable 1 - Short document with derivation of the methodology (FINAL) Contract number: Subject:

More information

Inference about the Slope and Intercept

Inference about the Slope and Intercept Inference about the Slope and Intercept Recall, we have established that the least square estimates and 0 are linear combinations of the Y i s. Further, we have showed that the are unbiased and have the

More information

Specification testing in panel data models estimated by fixed effects with instrumental variables

Specification testing in panel data models estimated by fixed effects with instrumental variables Specification testing in panel data models estimated by fixed effects wh instrumental variables Carrie Falls Department of Economics Michigan State Universy Abstract I show that a handful of the regressions

More information

ESTIMATION OF CONFIDENCE INTERVALS FOR QUANTILES IN A FINITE POPULATION

ESTIMATION OF CONFIDENCE INTERVALS FOR QUANTILES IN A FINITE POPULATION Mathematical Modelling and Analysis Volume 13 Number 2, 2008, pages 195 202 c 2008 Technika ISSN 1392-6292 print, ISSN 1648-3510 online ESTIMATION OF CONFIDENCE INTERVALS FOR QUANTILES IN A FINITE POPULATION

More information

Much ado about nothing: the mixed models controversy revisited

Much ado about nothing: the mixed models controversy revisited Much ado about nothing: the mixed models controversy revisited Viviana eatriz Lencina epartamento de Investigación, FM Universidad Nacional de Tucumán, Argentina Julio da Motta Singer epartamento de Estatística,

More information

Casual Mediation Analysis

Casual Mediation Analysis Casual Mediation Analysis Tyler J. VanderWeele, Ph.D. Upcoming Seminar: April 21-22, 2017, Philadelphia, Pennsylvania OXFORD UNIVERSITY PRESS Explanation in Causal Inference Methods for Mediation and Interaction

More information

6. Vector Random Variables

6. Vector Random Variables 6. Vector Random Variables In the previous chapter we presented methods for dealing with two random variables. In this chapter we etend these methods to the case of n random variables in the following

More information

Scatter Plot Quadrants. Setting. Data pairs of two attributes X & Y, measured at N sampling units:

Scatter Plot Quadrants. Setting. Data pairs of two attributes X & Y, measured at N sampling units: Geog 20C: Phaedon C Kriakidis Setting Data pairs of two attributes X & Y, measured at sampling units: ṇ and ṇ there are pairs of attribute values {( n, n ),,,} Scatter plot: graph of - versus -values in

More information

Chapter 2: Describing Contingency Tables - I

Chapter 2: Describing Contingency Tables - I : Describing Contingency Tables - I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu]

More information