Nonresponse weighting adjustment using estimated response probability
1 Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006
2 Introduction
Nonresponse
- Unit nonresponse
- Item nonresponse
Basic strategy for nonresponse
- Unit nonresponse: call-back, nonresponse weighting adjustment
- Item nonresponse: imputation
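3 Basic Setup
Stratum            Pop. size  Mean         Sample size
Respondents        $N_R$      $\bar{Y}_R$  $n_R$
Nonrespondents     $N_M$      $\bar{Y}_M$  $n_M$
Entire population  $N$        $\bar{Y}$    $n$
An SRS is drawn from the entire population, but $y$ is observed only for the respondents.
Use $\bar{y}_R$ (the respondent mean) to estimate the population mean.
$$\mathrm{Bias}(\bar{y}_R) \doteq \left(1 - \frac{N_R}{N}\right)\left(\bar{Y}_R - \bar{Y}_M\right), \qquad \mathrm{Var}(\bar{y}_R) \doteq \frac{S_R^2}{n_R}$$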
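A small numeric sketch of the bias expression on this slide, reading the garbled display as $(1 - N_R/N)(\bar{Y}_R - \bar{Y}_M)$. The function name and all numbers below are hypothetical and chosen only for illustration.

```python
def respondent_mean_bias(N_R, N_M, Ybar_R, Ybar_M):
    """Approximate bias of the respondent mean: (1 - N_R/N) * (Ybar_R - Ybar_M)."""
    N = N_R + N_M
    return (1 - N_R / N) * (Ybar_R - Ybar_M)


# Hypothetical population: 70% respondents whose mean is 4 units higher.
N_R, N_M = 7000, 3000
Ybar_R, Ybar_M = 52.0, 48.0

bias = respondent_mean_bias(N_R, N_M, Ybar_R, Ybar_M)

# Cross-check against the direct definition Ybar_R - Ybar,
# where Ybar is the overall population mean.
Ybar = (N_R * Ybar_R + N_M * Ybar_M) / (N_R + N_M)
direct = Ybar_R - Ybar
print(bias, direct)
```

The two printed numbers agree up to rounding, which is the identity $\bar{Y}_R - \bar{Y} = (N_M/N)(\bar{Y}_R - \bar{Y}_M)$.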
4 Two problems:
- Bias: $\bar{y}_R$ is biased when $\bar{Y}_R \neq \bar{Y}_M$.
- Large variance, because $n_R < n$.
5 Nonresponse weighting adjustment (NWA) method
Under no missing data:
$$\hat{Y}_{HT} = \sum_{i \in A} \pi_i^{-1} y_i$$
- $\pi_i = \Pr(i \in A)$: first-order inclusion probability of unit $i$
- $A$: index set of the intended sample
Response indicator function:
$$R_i = \begin{cases} 1 & \text{if unit } i \text{ responds,} \\ 0 & \text{if unit } i \text{ does not respond.} \end{cases}$$
6 Idea: use a two-phase sampling approach
Population ($U$) --(Phase 1)--> Sample ($A$) --(Phase 2)--> Respondents ($A_R$)
Estimation: let $\phi_{i|A} = \Pr(R_i = 1 \mid A)$. If $\phi_{i|A}$ were known, then
$$\hat{Y}_{\phi} = \sum_{i \in A} \frac{R_i y_i}{\pi_i \phi_{i|A}}$$
would be conditionally unbiased. In practice, we use an estimator $\hat{\phi}_{i|A}$ of $\phi_{i|A}$. The NWA estimator is
$$\hat{Y}_{NWA} = \sum_{i \in A} \frac{R_i y_i}{\pi_i \hat{\phi}_{i|A}}$$
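The estimator on this slide is a weighted sum over respondents, which can be sketched in a few lines. The function name `nwa_total` and the toy data are hypothetical. The same function serves for known or estimated response probabilities, depending on what is passed as `phi`.

```python
import numpy as np


def nwa_total(y, pi, phi, R):
    """Sum over the sample of R_i * y_i / (pi_i * phi_i).

    y may hold NaN for nonrespondents; only rows with R_i == 1 are used.
    """
    y, pi, phi, R = map(np.asarray, (y, pi, phi, R))
    resp = R == 1
    return float(np.sum(y[resp] / (pi[resp] * phi[resp])))


# Toy sample of four units, two of which respond (illustrative values only).
y = [10.0, float("nan"), 30.0, float("nan")]
pi = [0.1, 0.1, 0.2, 0.2]    # first-order inclusion probabilities
phi = [0.5, 0.5, 0.8, 0.8]   # response probabilities (known or estimated)
R = [1, 0, 1, 0]             # response indicators

total = nwa_total(y, pi, phi, R)
print(total)
```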
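7 Example: logistic regression model for $\phi_{i|A}$
Model:
$$\phi(x_i, \alpha) = \frac{\exp(x_i'\alpha)}{1 + \exp(x_i'\alpha)}$$
Estimation of $\alpha_0$ by the maximum likelihood method: solve
$$S(\alpha) \equiv \sum_{i \in A} \left[ R_i - \phi(x_i, \alpha) \right] x_i = 0$$
for $\alpha$. An iterative method can be used to solve the nonlinear equation:
$$\alpha^{(t+1)} = \alpha^{(t)} - \left( \partial S / \partial \alpha \right)^{-1} S\left( \alpha^{(t)} \right)$$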
8 More generally, the score equation can be weighted:
$$S_k(\alpha) \equiv \sum_{i \in A} k_i \left[ R_i - \phi(x_i, \alpha) \right] x_i = 0$$
for some weight $k_i$.
- Should we use weights or not?
- Optimal $k_i$?
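A minimal sketch of the iterative (Newton-type) solution of the weighted score equation under the logistic model, assuming the weights $k_i$ are fixed numbers that do not depend on $\alpha$; with `k=None` it reduces to the unweighted maximum likelihood score. The function name, tolerances and toy data are hypothetical.

```python
import numpy as np


def fit_logistic_score(X, R, k=None, tol=1e-10, max_iter=50):
    """Solve sum_i k_i * (R_i - phi(x_i, alpha)) * x_i = 0 by Newton's method."""
    X = np.asarray(X, dtype=float)
    R = np.asarray(R, dtype=float)
    k = np.ones(len(R)) if k is None else np.asarray(k, dtype=float)
    alpha = np.zeros(X.shape[1])
    for _ in range(max_iter):
        phi = 1.0 / (1.0 + np.exp(-X @ alpha))
        S = X.T @ (k * (R - phi))                        # score S_k(alpha)
        dS = -(X * (k * phi * (1 - phi))[:, None]).T @ X  # dS_k / d alpha
        step = np.linalg.solve(dS, S)
        alpha = alpha - step                              # Newton update
        if np.max(np.abs(step)) < tol:
            break
    return alpha


# Toy data: intercept plus one covariate, with overlapping responses
# so that a finite solution exists (illustrative values only).
x = np.array([-2, -1, 0, 1, 2, -2, -1, 0, 1, 2], dtype=float)
R = np.array([0, 0, 1, 1, 1, 0, 1, 0, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])

alpha_hat = fit_logistic_score(X, R)
phi_hat = 1.0 / (1.0 + np.exp(-X @ alpha_hat))
score_at_solution = X.T @ (R - phi_hat)
print(alpha_hat, score_at_solution)
```

The printed score should be numerically zero at the solution. Weights that depend on $\alpha$ change the derivative and need a separate treatment.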
9 Asymptotic properties of the NWA estimator
Assumptions about the population and the sample:
[A.1] Sequence of finite populations with bounded fourth moments.
[A.2] No extreme weights dominate the others.
[A.3] $\sqrt{n}$-consistency holds for the mean-type estimators.
10 Assumptions about the response probability:
[B.1] $\phi_{i|A}$ does not depend on the values of others (i.e. $\phi_{i|A} = \phi_i$).
[B.2] The responses are independent:
$$\mathrm{Cov}(R_i, R_j) = \begin{cases} \phi_i (1 - \phi_i) & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}$$
[B.3] The response probability is parametrically modelled: $\phi_i = \phi(x_i; \alpha_0)$ for some known smooth function $\phi(x_i; \cdot)$ of the parameter $\alpha$, evaluated at $\alpha = \alpha_0$.
[B.4] $\phi_i$ is uniformly bounded.
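11 Estimation of $\phi_i$
Estimation of $\alpha_0$: use the weighted score equation
$$\frac{\partial}{\partial \alpha} \sum_{i \in A} k_i \left[ R_i \ln(\phi_i) + (1 - R_i) \ln(1 - \phi_i) \right] = 0,$$
where $k_i$ is the weight of unit $i$ in the score equation for $\alpha$.
Alternative representation:
$$S_k(\alpha) \equiv \sum_{i \in A} k_i \left( R_i - \phi(x_i; \alpha) \right) h_i(\alpha) = 0,$$
where $h_i(\alpha) = \partial \{ \mathrm{logit}(\phi_i) \} / \partial \alpha$.
Use $\hat{\phi}_i = \phi(x_i; \hat{\alpha})$, where $\hat{\alpha}$ is the solution to $S_k(\alpha) = 0$.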
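The step from the weighted log-likelihood form to the alternative representation is a one-line chain-rule calculation. A sketch for a single unit, treating $k_i$ as fixed:

```latex
\frac{\partial}{\partial \alpha}\left\{ R_i \ln \phi_i + (1 - R_i)\ln(1 - \phi_i) \right\}
  = \left( \frac{R_i}{\phi_i} - \frac{1 - R_i}{1 - \phi_i} \right) \frac{\partial \phi_i}{\partial \alpha}
  = \frac{R_i - \phi_i}{\phi_i (1 - \phi_i)} \, \frac{\partial \phi_i}{\partial \alpha}
  = (R_i - \phi_i)\, h_i(\alpha),
\qquad \text{since} \quad
h_i(\alpha) = \frac{\partial\, \mathrm{logit}(\phi_i)}{\partial \alpha}
            = \frac{1}{\phi_i (1 - \phi_i)} \, \frac{\partial \phi_i}{\partial \alpha}.
```

Multiplying by $k_i$ and summing over $i \in A$ gives $S_k(\alpha)$.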
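12 Basic idea (for deriving the asymptotic properties)
Write the NWA estimator as a function of $\hat{\alpha}$:
$$\hat{Y}_{NWA}(\hat{\alpha}) \equiv \sum_{i \in A} \pi_i^{-1} \phi_i(\hat{\alpha})^{-1} R_i y_i$$
Taking a Taylor expansion of $\hat{Y}_{NWA}(\hat{\alpha})$ around $\alpha = \alpha_0$:
$$\hat{Y}_{NWA}(\hat{\alpha}) \doteq \hat{Y}_{NWA}(\alpha_0) + \left[ \frac{\partial \hat{Y}_{NWA}}{\partial \alpha'}(\alpha_0) \right] (\hat{\alpha} - \alpha_0)$$
The second term on the right-hand side does not contribute to the expectation, but does contribute to the variance.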
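13 Basic idea - continued
Taking a Taylor expansion of $S_k(\hat{\alpha}) = 0$:
$$S_k(\hat{\alpha}) \doteq S_k(\alpha_0) + \left[ \frac{\partial S_k}{\partial \alpha'}(\alpha_0) \right] (\hat{\alpha} - \alpha_0)$$
Combine the two expansions:
$$\hat{Y}_{NWA}(\hat{\alpha}) \doteq \hat{Y}_{NWA}(\alpha_0) - \left[ \frac{\partial \hat{Y}_{NWA}}{\partial \alpha'}(\alpha_0) \right] \left[ \frac{\partial S_k}{\partial \alpha'}(\alpha_0) \right]^{-1} S_k(\alpha_0) \doteq \hat{Y}_{NWA}(\alpha_0) - \gamma_N' S_k(\alpha_0),$$
where
$$\hat{Y}_{NWA}(\alpha_0) = \sum_{i \in A} \pi_i^{-1} \phi_i^{-1} R_i y_i \quad \text{and} \quad S_k(\alpha_0) = \sum_{i \in A} k_i (R_i - \phi_i) h_{i0},$$
with $h_{i0} = h_i(\alpha_0)$.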
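14 Main result
Linearization:
$$\hat{Y}_{NWA} \doteq \sum_{i \in A} \pi_i^{-1} \left[ \pi_i \phi_i k_i h_{i0}'\gamma_N + \frac{R_i}{\phi_i} \left( y_i - \pi_i \phi_i k_i h_{i0}'\gamma_N \right) \right]$$
Conditional expectation:
$$E\left( \hat{Y}_{NWA} \mid A \right) \doteq \sum_{i \in A} \pi_i^{-1} y_i$$
Conditional variance:
$$V\left( \hat{Y}_{NWA} \mid A \right) \doteq \sum_{i \in A} \pi_i^{-2} \, \frac{1 - \phi_i}{\phi_i} \left( y_i - \pi_i \phi_i k_i h_{i0}'\gamma_N \right)^2$$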
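Both conditional moments follow from the linearization using only $E(R_i \mid A) = \phi_i$ and the independence assumption [B.2]. A sketch for a single linearized term, written here as $\eta_i$ (the symbol the slides use later for variance estimation) with the shorthand $c_i = \pi_i \phi_i k_i h_{i0}'\gamma_N$, which is introduced only for this display:

```latex
\eta_i = c_i + \frac{R_i}{\phi_i}\,(y_i - c_i), \qquad c_i = \pi_i \phi_i k_i h_{i0}'\gamma_N

E(\eta_i \mid A) = c_i + \frac{\phi_i}{\phi_i}\,(y_i - c_i) = y_i

V(\eta_i \mid A) = \frac{V(R_i \mid A)}{\phi_i^{2}}\,(y_i - c_i)^2
                 = \frac{\phi_i (1 - \phi_i)}{\phi_i^{2}}\,(y_i - c_i)^2
                 = \frac{1 - \phi_i}{\phi_i}\,(y_i - c_i)^2
```

Summing $\pi_i^{-1} E(\eta_i \mid A)$ and, by independence of the $R_i$, $\pi_i^{-2} V(\eta_i \mid A)$ over $i \in A$ gives the two displayed results.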
15 Main result - continued
- The NWA estimator is asymptotically unbiased regardless of the choice of $k_i$.
- The variance of the NWA estimator is minimized for
$$k_i = \frac{y_i}{\pi_i \phi_i h_{i0}'\gamma_N}$$
- If we do not have any prior information about the distribution of $y_i$, then $k_i = \pi_i^{-1} \phi_i^{-1}$ seems to be a reasonable choice for optimal NWA estimation.
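16 Main result - continued
Estimate $\alpha_0$ using $k_i = \pi_i^{-1} \phi_i^{-1}$:
$$S_k(\alpha) = \sum_{i \in A} k_i \left( R_i - \phi(x_i; \alpha) \right) h_i(\alpha) = \sum_{i \in A} \pi_i^{-1} \phi_i^{-1} \left( R_i - \phi(x_i; \alpha) \right) h_i(\alpha)$$
$S_k(\hat{\alpha}) = 0$ is equivalent to
$$\sum_{i \in A} \pi_i^{-1} R_i \hat{\phi}_i^{-1} h_i(\hat{\alpha}) = \sum_{i \in A} \pi_i^{-1} h_i(\hat{\alpha}).$$
Thus, optimal score equation = calibration equation.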
17 Back to the example: logistic regression model for $\phi_i$
Under the logistic regression model, $\mathrm{logit}(\phi_i) = x_i'\alpha_0$ and
$$S_k(\alpha) \equiv \sum_{i \in A} k_i \left[ R_i - \phi(x_i, \alpha) \right] x_i = 0$$
Optimal score equation:
$$\sum_{i \in A} \pi_i^{-1} R_i \hat{\phi}_i^{-1} x_i = \sum_{i \in A} \pi_i^{-1} x_i.$$
Optimal NWA estimator applied to $x$ = complete sample estimator applied to $x$.
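A minimal sketch of solving the optimal score (calibration) equation under the logistic model. Because $1/\phi = 1 + \exp(-x'\alpha)$, the equation can be written as $\sum_{i \in A} \pi_i^{-1} \{ R_i (1 + e^{-x_i'\alpha}) - 1 \} x_i = 0$ and solved by Newton's method. The function name, the simulated data and the response-model coefficients are all hypothetical.

```python
import numpy as np


def fit_calibrated_logistic(X, R, pi, tol=1e-10, max_iter=100):
    """Solve sum_i (1/pi_i) * (R_i / phi_i(alpha) - 1) * x_i = 0 for alpha."""
    X = np.asarray(X, dtype=float)
    R = np.asarray(R, dtype=float)
    d = 1.0 / np.asarray(pi, dtype=float)       # design weights 1/pi_i
    alpha = np.zeros(X.shape[1])
    for _ in range(max_iter):
        e = np.exp(-X @ alpha)                   # equals 1/phi_i - 1
        U = X.T @ (d * (R * (1.0 + e) - 1.0))    # calibration residual
        J = -(X * (d * R * e)[:, None]).T @ X    # Jacobian dU / d alpha
        step = np.linalg.solve(J, U)
        alpha = alpha - step                     # Newton update
        if np.max(np.abs(step)) < tol:
            break
    return alpha


# Simulated sample (illustrative only): unequal pi_i and logistic response.
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
pi = rng.uniform(0.05, 0.5, size=n)
phi_true = 1.0 / (1.0 + np.exp(-(1.0 + 0.5 * x)))
R = (rng.random(n) < phi_true).astype(float)

alpha_hat = fit_calibrated_logistic(X, R, pi)
phi_hat = 1.0 / (1.0 + np.exp(-X @ alpha_hat))

lhs = X.T @ (R / (pi * phi_hat))   # NWA estimator applied to x
rhs = X.T @ (1.0 / pi)             # complete-sample estimator applied to x
print(lhs, rhs)
```

At the solution the two printed vectors coincide, which is the calibration property stated on the slide.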
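18 Simulation study
- An artificial bivariate population of size $N = 10{,}000$:
$$\begin{pmatrix} y_i \\ x_i \end{pmatrix} \overset{\text{i.i.d.}}{\sim} N\left[ \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right], \quad i = 1, 2, \ldots, N$$
- Unequal probability sampling (by stratified sampling).
- Generate missing data using a logistic regression model of $R_i$ on $x_i$.
- About 30% missing data.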
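A rough sketch of how such a population and response pattern could be generated. It reads the garbled display as a bivariate normal with means $(2, 2)$, unit variances and correlation $\rho$, which is an assumption. The response-model coefficients are not given on the slide; the values below are hypothetical and chosen only so that roughly 30% of units are missing. The stratified sampling step is omitted because the slide does not describe the design.

```python
import numpy as np

rng = np.random.default_rng(2006)   # arbitrary seed
N, rho = 10_000, 0.3                # rho = 0.3 is one of the values in the study

# Assumed reading of the slide: means (2, 2), unit variances, correlation rho.
mean = np.array([2.0, 2.0])
cov = np.array([[1.0, rho], [rho, 1.0]])
yx = rng.multivariate_normal(mean, cov, size=N)
y, x = yx[:, 0], yx[:, 1]

# Hypothetical logistic response model of R on x: logit(phi) = x - 1,
# which gives an average response probability of about 0.7.
phi = 1.0 / (1.0 + np.exp(-(x - 1.0)))
R = (rng.random(N) < phi).astype(int)

print(f"population response rate: {R.mean():.3f}")
```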
19 Simulation study - continued
Four estimators:
1. Two-phase estimator: NWA estimator using the true $\phi_i$
2. Unweighted NWA estimator: NWA estimator using $k_i = 1$
3. Weighted NWA estimator: NWA estimator using $k_i = 1/\pi_i$
4. Optimal NWA estimator: NWA estimator using $k_i = 1/(\pi_i \phi_i)$
10,000 Monte Carlo samples of size $n = 100$ and $n = 400$ are generated repeatedly from the fixed population.
20 Table: Monte Carlo standardized variances of the NWA estimators, based on 10,000 samples.
Columns: $n$ | Estimator | Standardized variance at $\rho = 0.0$, $\rho = 0.3$, $\rho = 0.6$
Rows (repeated for each sample size $n$): Two-phase, Unweighted NWA, Weighted NWA, Optimal NWA
(The numeric entries did not survive in this transcription.)
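21 Variance estimation
Linearization:
$$\hat{Y}_{NWA} \doteq \sum_{i \in A} \pi_i^{-1} \eta_i,$$
where
$$\eta_i = \pi_i \phi_i k_i h_{i0}'\gamma_N + \frac{R_i}{\phi_i} \left( y_i - \pi_i \phi_i k_i h_{i0}'\gamma_N \right)$$
Extended definition of $R_i$: for $i = 1, 2, \ldots, N$,
$$R_i = \begin{cases} 1 & \text{if unit } i \text{ responds if sampled,} \\ 0 & \text{if unit } i \text{ does not respond if sampled.} \end{cases}$$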
22 Variance estimation - continued
Classical two-phase approach:
Population ($U$) --(Phase 1)--> Sample ($A$) --(Phase 2)--> Respondents ($A_R$)
Reverse approach:
Population ($U$) --> Responding population ($U_R$) --> Respondents ($A_R$)
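23 Variance decomposition under the reverse approach:
$$V\left( \hat{Y}_{NWA} \right) \doteq V_1 + V_2,$$
where
$$V_1 = E\left\{ V\left( \sum_{i \in A} \pi_i^{-1} \eta_i \,\Big|\, R_1, R_2, \ldots, R_N \right) \right\}, \qquad V_2 = V\left\{ E\left( \sum_{i \in A} \pi_i^{-1} \eta_i \,\Big|\, R_1, R_2, \ldots, R_N \right) \right\}$$
Variance component estimation:
$$\hat{V}_1 = \sum_{i \in A} \sum_{j \in A} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij} \, \pi_i \pi_j} \, \hat{\eta}_i \hat{\eta}_j$$
$$\hat{V}_2 = \sum_{i \in A_R} \frac{1 - \hat{\phi}_i}{\pi_i \hat{\phi}_i^{2}} \left( y_i - \pi_i \hat{\phi}_i k_i \hat{h}_i'\hat{\gamma}_N \right)^2$$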
24 Extension: nonresponse cell method
- A special case of the NWA method. Commonly used.
- Partition the sample into $G$ cells: $A = A_1 \cup A_2 \cup \cdots \cup A_G$.
- Assume that the response rates $\phi_i$ are constant within a cell.
- For $i \in A_g$, use
$$\hat{\phi}_i = \frac{\sum_{i \in A_g} \pi_i^{-1} R_i}{\sum_{i \in A_g} \pi_i^{-1}}$$
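A short sketch of the cell method: compute the weighted response rate in each cell and plug it into the NWA estimator. The function name and the toy data are hypothetical, and every cell is assumed to contain at least one respondent (otherwise the estimated rate is zero and the weight is undefined).

```python
import numpy as np


def cell_response_rates(cell, pi, R):
    """Weighted response rate per cell, returned as one value per sample unit."""
    cell = np.asarray(cell)
    d = 1.0 / np.asarray(pi, dtype=float)   # design weights 1/pi_i
    R = np.asarray(R, dtype=float)
    phi_hat = np.empty(len(d))
    for g in np.unique(cell):
        in_g = cell == g
        phi_hat[in_g] = np.sum(d[in_g] * R[in_g]) / np.sum(d[in_g])
    return phi_hat


# Toy sample with two cells (illustrative values only).
cell = np.array([0, 0, 0, 1, 1])
pi = np.array([0.5, 0.5, 0.25, 0.2, 0.2])
R = np.array([1, 0, 1, 1, 0])
y = np.array([1.0, np.nan, 3.0, 4.0, np.nan])   # y is missing for nonrespondents

phi_hat = cell_response_rates(cell, pi, R)

# NWA estimator: sum over respondents of y_i / (pi_i * phi_hat_i).
resp = R == 1
y_nwa = float(np.sum(y[resp] / (pi[resp] * phi_hat[resp])))
print(phi_hat, y_nwa)
```

With a constant rate per cell, this total equals the sum over cells of the estimated cell size times the weighted respondent mean in that cell.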
25 Extension - continued
NWA cell: two cell formation criteria
- Quasi-randomization approach: equal response probability assumption
- Model-based approach: homogeneous study-item-value assumption
Cross-classification of the two dimensions of cells is not feasible: collapse the cells in an ad hoc manner.
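26 Extension - continued
Previously we used
$$\hat{Y}_{NWA} = \sum_{g=1}^{G} \left( \sum_{i \in A_g} \pi_i^{-1} \right) \frac{\sum_{i \in A_g} \pi_i^{-1} R_i y_i}{\sum_{i \in A_g} \pi_i^{-1} R_i}$$
Here, the cells are formed to have equal response probability.
Directly use $\hat{\phi}_i$ in the cell-weighting estimator:
$$\hat{Y}_{NWA2} = \sum_{g=1}^{G} \left( \sum_{i \in A_g} \pi_i^{-1} \right) \frac{\sum_{i \in A_g} \pi_i^{-1} \hat{\phi}_i^{-1} R_i y_i}{\sum_{i \in A_g} \pi_i^{-1} \hat{\phi}_i^{-1} R_i}$$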
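27 Extension - continued
Taylor expansion:
$$\hat{Y}_{NWA2} = \hat{Y}_{HT} + \sum_{g=1}^{G} \sum_{i \in A_g} \pi_i^{-1} \left( \frac{R_i}{\phi_i} - 1 \right) (y_i - \bar{y}_g)$$
Variance:
$$\mathrm{Var}\left( \hat{Y}_{NWA2} \right) = \mathrm{Var}\left( \hat{Y}_{HT} \right) + E\left\{ \sum_{g=1}^{G} \sum_{i \in A_g} \pi_i^{-2} \left( \frac{1}{\phi_i} - 1 \right) (y_i - \bar{y}_g)^2 \right\}$$
The variance will be smaller if the cells are formed with homogeneous $y$'s.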
28 Conclusion
- Even if you know the true response probability, it is better to use the estimated response probability for NWA estimation.
- The maximum likelihood method may be optimal for estimating $\alpha_0$, but it is not optimal for NWA estimation.
- The standard practice of calibration is indeed an optimal procedure.
- Variance estimation is possible using the reverse approach.
29 In the cell-weighting NWA method, we can
- use the estimated response probability to control the bias;
- use the weighting cell to control the variance.
30 Thank You!
More informationFractional hot deck imputation
Biometrika (2004), 91, 3, pp. 559 578 2004 Biometrika Trust Printed in Great Britain Fractional hot deck imputation BY JAE KWANG KM Department of Applied Statistics, Yonsei University, Seoul, 120-749,
More informationFinite Population Sampling and Inference
Finite Population Sampling and Inference A Prediction Approach RICHARD VALLIANT ALAN H. DORFMAN RICHARD M. ROYALL A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane
More informationModeling the scale parameter ϕ A note on modeling correlation of binary responses Using marginal odds ratios to model association for binary responses
Outline Marginal model Examples of marginal model GEE1 Augmented GEE GEE1.5 GEE2 Modeling the scale parameter ϕ A note on modeling correlation of binary responses Using marginal odds ratios to model association
More informationECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria
ECOOMETRICS II (ECO 24S) University of Toronto. Department of Economics. Winter 26 Instructor: Victor Aguirregabiria FIAL EAM. Thursday, April 4, 26. From 9:am-2:pm (3 hours) ISTRUCTIOS: - This is a closed-book
More informationCovariance function estimation in Gaussian process regression
Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian
More informationEconomics 583: Econometric Theory I A Primer on Asymptotics
Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7
MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is
More informationConstrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources
Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources Yi-Hau Chen Institute of Statistical Science, Academia Sinica Joint with Nilanjan
More informationClassification. Chapter Introduction. 6.2 The Bayes classifier
Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode
More information