Introduction to Survey Data Integration
|
|
- Dwight Ray
- 6 years ago
- Views:
Transcription
1 Introduction to Survey Data Integration Jae-Kwang Kim Iowa State University May 20, 2014
2 Outline 1 Introduction 2 Survey Integration Examples 3 Basic Theory for Survey Integration 4 NASS application 5 Discussion Kim (ISU) Matching May 20, / 29
3 1. Introduction Question: What do you mean by integration in surveys? integrating survey and administrative data through unit record linkage improving coherence across data collections by using standard classifications and questions rationalizing content between surveys, including census greater standardization of methods, tools, IT systems and processes combining separate sample surveys into one survey vehicle. Reference: Bycroft, C. (2010). Integrated household surveys: a survey vehicle approach. Wellington: Statistics New Zealand. Kim (ISU) Matching May 20, / 29
4 1. Introduction Our focus is the last part of the several meaning of survey integration. combining separate sample surveys into one survey vehicle Most sample surveys are run as stand-alone surveys. However, survey integration is a new area of research to achieve the three conflicting goals: 1 Minimize the cost associated with surveys 2 Maximize the information (or efficiency of the survey estimation) 3 Minimize the respondent burden Kim (ISU) Matching May 20, / 29
5 1. Introduction Two approaches for survey integration Macro approach: Obtain improve estimates by combining all available information General Least Squares (GLS) estimation, GLS weighting Micro approach: Create a single data (single survey vehicle) that contains all available information Synthetic data imputation Record Linkage Statistical Matching, or data fusion Kim (ISU) Matching May 20, / 29
6 2. Survey Integration Examples How to combine several source of information? Example 1: US Census of housing and population Short Form: 100 % sample (obtain basic demographic information) Long form: about 16% sample (obtain other social and economic information as well as demographic information) Classical two-phase sampling problem: Calibration weighting for demographic variable to match known population counts from short form (Deming & Stephan, 1940). Kim (ISU) Matching May 20, / 29
7 2. Survey Integration Examples Example 2: Consumer Expenditure Survey (Zieschang, 1990) Diary survey: Observe X, Y Interview survey (quarterly): Observe X Two surveys are obtained independently from the same target population. (uses the same sampling frame.) Two estimates of X, ˆX 1 and ˆX 2, can be different because of the sampling errors. How to incorporate the information from the quarterly interview survey to diary survey estimate? Kim (ISU) Matching May 20, / 29
8 2. Survey Integration Examples GLS Weighting Two parameters with three estimates: 1 Survey one: Observe ˆX 1 2 Survey two: Observe ˆX 2 and Ŷ2 GLS model ˆX 1 ˆX 2 Ŷ 2 = ( X Y ) + e 1 e 2 e 3 (1) where V e 1 e 2 e 3 = V xx V xx2 V xy2 0 V xy2 V yy2 Kim (ISU) Matching May 20, / 29
9 2. Survey Integration Examples GLS Weighting (Continued) GLS model: We may write Y = Zθ + e where e (0, V ). GLS estimator of θ: ˆθ GLS = ( Z V 1 Z ) 1 Z V 1 Y Note that the GLS estimator is a linear combination of the three observed information: In fact, we have ˆθ GLS = α 1 ˆX1 + α 2 ˆX2 + α 3 Ŷ 2 and ˆX GLS = V xx2 V xx1 ˆX 1 + ˆX 2 V xx1 + V xx2 V xx1 + V xx2 ( ) Vxy2 Ŷ GLS = Ŷ 2 + ˆX GLS ˆX 2. V xx2 Kim (ISU) Matching May 20, / 29
10 2. Survey Integration Examples Example 3: SAIPE (Fay & Herriot, 1979), combines small area level information Current Population Survey (obtain Ŷi, estimated poverty rate for area i) Census long form data (x 1i : poverty rate for the census year for area i) Food stamp participation rate (x 2i ) Pseudo poverty rate for children from tax return information & tax nonfiler rate (x 3i ) Fay-Herriot model Ŷ i = Y i + e i = (x iβ + u i ) + e i x i = (1, x 1i, x 2i, x 3i ) : predictors, e i : sampling error, u i : area i random effect (=model error). Kim (ISU) Matching May 20, / 29
11 3. Basic Theory General Setup Survey A: Measures Ŷ i,a, subject to sampling error. Survey B: Measures ˆX i,b, subject to sampling error. E A (Ŷi,a) E B ( ˆX i,b ) due to the structural difference between the surveys Structural difference (or systematic difference) due to different mode of survey due to time difference due to frame difference Example: Objective Yield Survey vs Agricultural Yield Survey and December Agricultural Surveys Kim (ISU) Matching May 20, / 29
12 3. Basic Theory Two error models (for area i) Sampling error model Ŷ i,a = Y i + a i ˆX i,b = X i + b i where (a i, b i ) represents the sampling error such that ( ) [( ) ( )] ai 0 Vyy,i V, yx,i b i 0 V xy,i V xx,i Structural error modeling X i = β 0 N i + β 1 Y i + e i, e i (0, N i σ 2 e ) where N i is the (known) population size of area i. Kim (ISU) Matching May 20, / 29
13 3. Basic Theory Structural error model describes the relationship between the two survey measurement up to sampling error. Y : target measurement item (variable of primary interest) X : inaccurate measurement of Y with possible systematic bias. If both X and Y measure the same item (with different survey modes), structural error model is essentially a measurement error model. (β 0 = 0, β 1 = 1 means no measurement bias.) Kim (ISU) Matching May 20, / 29
14 3. Basic Theory If the parameters in the structural error model are known, Ŷ i,b β1 1 ( ˆX i,b β 0 N i ) is also an unbiased estimator of Y i, computed from survey B. Estimator Ŷi,b using consistent ( ˆβ 0, ˆβ 1 ) is often called synthetic estimator. How to combine the two estimators? 1 Best Unbiased Prediction approach 2 GLS (or GMM) approach Kim (ISU) Matching May 20, / 29
15 3. Basic Theory Best Unbiased Prediction approach: 1 Specify a model for sampling error: ˆf ( ˆX i, Ŷi X i, Y i ) 2 Specify a parametric model for structural error: g (Y i X i ; θ) h(x i ; α) 3 Use Bayes rule to combine the two models and obtain the posterior distribution f (Y i ˆX i, Ŷ i ; θ, α) ˆf ( ˆX i, Ŷ i X i, Y i )g(y i X i ; θ)h(x i ; α)dx i. 4 Compute conditional expectation ( E Y i ˆX ) i, Ŷi; θ, α = Y i f (Y i ˆX i, Ŷi; θ, α)dy i Kim (ISU) Matching May 20, / 29
16 3. Basic Theory Best Unbiased Prediction approach: Best prediction under correct model specification. Requires strong model assumptions. Computation can be heavy (MCMC) Depends on unknown superpopulation parameter 1 Empirical Bayes 2 Hierarchical Bayes Kim (ISU) Matching May 20, / 29
17 3. Basic Theory Alternative approach: GLS (Generalized least squares) approach Does not require distributional assumptions. Only require moment assumptions (second moment assumptions) Combine information through GLS, not through Bayes Recall : GLS method y = Zβ + e, e (0, V ) ˆβ GLS = ( Z V 1 Z ) 1 Z V 1 y Kim (ISU) Matching May 20, / 29
18 3. Basic Theory GLS approach to combine two error models: y = Zθ + e, e (0, V ) ( Ŷi,a β 1 1 ( ˆX i,b N i β 0 ) ) = ( 1 1 ) ( u1i Y i + u 2i where u 1i = a i and u 2i = β1 1 (b i + e i ). Thus, ( ) [( ) ( u1i 0 Vyy,i β, 1 1 V xy,i u 2i 0 β1 1 V xy,i β1 2 (V xx,i + N i σe) 2 ) )]. Kim (ISU) Matching May 20, / 29
19 3. Basic Theory GLS estimator : Best linear unbiased estimator of Y i based on the linear combination of Ŷ i,a and Ŷ i,b = β 1 1 ( ˆX i,b N i β 0 ). Under the current setup, where Ŷ i = ω i Ŷ i,a + (1 ω i )Ŷ i,b N i σe 2 + β1 2 ω i = V xx,i β 1 V xy,i N i σe 2 + β1 2V xx,i + V yy,i 2β 1 V xy,i The GLS estimator is sometimes called composite estimator. In practice, we need to use ˆβ 0, ˆβ 1, and ˆσ 2 e. Kim (ISU) Matching May 20, / 29
20 3. Basic Theory Parameter estimation for the structural model 1 Case 1: Matching measurement X and measurement Y is possible (e.g.: Survey A sample is a subset of survey B sample.) 2 Case 2: Matching is not possible. In Case 1, we can easily obtain a consistent estimator of the model parameters from the set where units have both X and Y observed. (Unit level modeling) In Case 2, we may use area level model to match ˆX i (from survey B) and Ŷi (from survey A) for each area. Kim (ISU) Matching May 20, / 29
21 3. Basic Theory To estimate (β 0, β 1 ), we express ˆX i,b = N i β 0 + Y i β 1 + e i + b i Ŷ i,a = Y i + a i Direct regression of ˆX i,b on (N i, Ŷ i,a ) does not work because Ŷ i,a has sampling error. The area-level model takes the form of measurement error model (Fuller, 1987). Parameter estimation can be performed using the measurement error model estimation methods. (Details skipped.) Kim (ISU) Matching May 20, / 29
22 4. NASS application Improving Crop Acreage Estimation National Agricultural Statistical Service (NASS) under US Department of Agriculture provides acreage estimates for crops. Several sources of information for crop acreage June Area Survey (JAS) data Satellite imagery classification data. Cropland Data Layer (CDL) Farm Service Agency (FSA) data Kim (ISU) Matching May 20, / 29
23 4. NASS application Improving Crop Acreage Estimation Several sources of information for crop acreage (for area i: analysis district) Ŷ i : estimates from JAS (subject to sampling error) ˆX 1i : estimates from CDL ˆX 2i : FSA estimates Even though satellite image data is not subject to sampling error, the CDL estimates are subject to sampling error due to classification from JAS data. FSA estimates can be modified to reduce the coverage error by modeling propensity scores for program participation. Thus, FSA estimates are subject to some type of sampling error where the sampling mechanism here refers to the program participation. Kim (ISU) Matching May 20, / 29
24 Figure : The 2009 cropland data layer products. The legend identifies aggregated agricultural and non-agricultural land cover categories by decreasing acreage. Kim (ISU) Matching May 20, / 29
25 4. NASS application Improving Crop Acreage Estimation We can construct separate structural error models X 1i = β 10 N i + β 11 Y i + e 1i X 2i = β 20 N i + β 21 Y i + e 2i where ( e1i e 2i ) [( 0 0 ) ( Ni σ, e N i σe2 2 )] Kim (ISU) Matching May 20, / 29
26 4. NASS application Improving Crop Acreage Estimation GLS method can be applied to separate structural error models to get Ŷ i Ŷ i,1 Ŷ i,2 = Y i + where Ŷ i,1 = ˆβ 11 1 ( ˆX 1i ˆβ 10 N i ) and Ŷ i,2 = ˆβ 21 1 ( ˆX 2i ˆβ 20 N i ) are two different synthetic estimators of Y i. u 1i u 2i u 3i We may use a simpler covariance matrix to achieve computational simplicity., Kim (ISU) Matching May 20, / 29
27 4. NASS application Some Results Figure : Comparison of the estimates in Indiana State Kim (ISU) Matching May 20, / 29
28 4. NASS application Some Results Table : CV for Corn Acreage (%) JAS GLS JAS GLS CA IN NE PN Kim (ISU) Matching May 20, / 29
29 5. Discussion Two error models: sampling error model and structural error model. GLS method provides a useful tool for combining two models. Does not rely on parametric distributional assumptions. Requires correct specification of the variance-covariance matrix for optimal estimation. Simpler form of covariance matrix can be used for computational simplicity over statistical efficiency. Kim (ISU) Matching May 20, / 29
30 REFERENCES Deming, W. E. & Stephan, F. F. (1940). On a least squares adjustment of a sampled frequency table when the expected marginal totals are known. Annals of Mathematical Statistics 11, Fay, R. & Herriot, R. (1979). Estimates of income for small places: an application of James-Stein procedures to census data. J. Am. Statist. Assoc. 74, Fuller, W. (1987). Measurement Error Models. Hoboken, New Jersey: John Wiley & Sons, Inc. Zieschang, K. (1990). Sample weighting method and estimation of totals in the consumer expenditure survey. J. Am. Statist. Assoc. 85, Kim (ISU) Matching May 20, / 29
A measurement error model approach to small area estimation
A measurement error model approach to small area estimation Jae-kwang Kim 1 Spring, 2015 1 Joint work with Seunghwan Park and Seoyoung Kim Ouline Introduction Basic Theory Application to Korean LFS Discussion
More informationFractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling
Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction
More informationCombining Non-probability and Probability Survey Samples Through Mass Imputation
Combining Non-probability and Probability Survey Samples Through Mass Imputation Jae-Kwang Kim 1 Iowa State University & KAIST October 27, 2018 1 Joint work with Seho Park, Yilin Chen, and Changbao Wu
More informationSmall Area Modeling of County Estimates for Corn and Soybean Yields in the US
Small Area Modeling of County Estimates for Corn and Soybean Yields in the US Matt Williams National Agricultural Statistics Service United States Department of Agriculture Matt.Williams@nass.usda.gov
More informationChapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70
Chapter 5: Models used in conjunction with sampling J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Nonresponse Unit Nonresponse: weight adjustment Item Nonresponse:
More informationChapter 8: Estimation 1
Chapter 8: Estimation 1 Jae-Kwang Kim Iowa State University Fall, 2014 Kim (ISU) Ch. 8: Estimation 1 Fall, 2014 1 / 33 Introduction 1 Introduction 2 Ratio estimation 3 Regression estimator Kim (ISU) Ch.
More informationCombining data from two independent surveys: model-assisted approach
Combining data from two independent surveys: model-assisted approach Jae Kwang Kim 1 Iowa State University January 20, 2012 1 Joint work with J.N.K. Rao, Carleton University Reference Kim, J.K. and Rao,
More informationFitting a Bayesian Fay-Herriot Model
Fitting a Bayesian Fay-Herriot Model Nathan B. Cruze United States Department of Agriculture National Agricultural Statistics Service (NASS) Research and Development Division Washington, DC October 25,
More informationFractional Imputation in Survey Sampling: A Comparative Review
Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical
More informationRecent Advances in the analysis of missing data with non-ignorable missingness
Recent Advances in the analysis of missing data with non-ignorable missingness Jae-Kwang Kim Department of Statistics, Iowa State University July 4th, 2014 1 Introduction 2 Full likelihood-based ML estimation
More informationEric V. Slud, Census Bureau & Univ. of Maryland Mathematics Department, University of Maryland, College Park MD 20742
COMPARISON OF AGGREGATE VERSUS UNIT-LEVEL MODELS FOR SMALL-AREA ESTIMATION Eric V. Slud, Census Bureau & Univ. of Maryland Mathematics Department, University of Maryland, College Park MD 20742 Key words:
More informationINSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING
Statistica Sinica 24 (2014), 1001-1015 doi:http://dx.doi.org/10.5705/ss.2013.038 INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Seunghwan Park and Jae Kwang Kim Seoul National Univeristy
More informationVARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA
Submitted to the Annals of Applied Statistics VARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA By Jae Kwang Kim, Wayne A. Fuller and William R. Bell Iowa State University
More informationSmall Area Confidence Bounds on Small Cell Proportions in Survey Populations
Small Area Confidence Bounds on Small Cell Proportions in Survey Populations Aaron Gilary, Jerry Maples, U.S. Census Bureau U.S. Census Bureau Eric V. Slud, U.S. Census Bureau Univ. Maryland College Park
More informationAn Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data
An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)
More informationA Hierarchical Clustering Algorithm for Multivariate Stratification in Stratified Sampling
A Hierarchical Clustering Algorithm for Multivariate Stratification in Stratified Sampling Stephanie Zimmer Jae-kwang Kim Sarah Nusser Abstract Stratification is used in sampling to create homogeneous
More informationSmall Area Estimates of Poverty Incidence in the State of Uttar Pradesh in India
Small Area Estimates of Poverty Incidence in the State of Uttar Pradesh in India Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi Email: hchandra@iasri.res.in Acknowledgments
More informationAdvanced Methods for Agricultural and Agroenvironmental. Emily Berg, Zhengyuan Zhu, Sarah Nusser, and Wayne Fuller
Advanced Methods for Agricultural and Agroenvironmental Monitoring Emily Berg, Zhengyuan Zhu, Sarah Nusser, and Wayne Fuller Outline 1. Introduction to the National Resources Inventory 2. Hierarchical
More information6. Fractional Imputation in Survey Sampling
6. Fractional Imputation in Survey Sampling 1 Introduction Consider a finite population of N units identified by a set of indices U = {1, 2,, N} with N known. Associated with each unit i in the population
More informationRobust Hierarchical Bayes Small Area Estimation for Nested Error Regression Model
Robust Hierarchical Bayes Small Area Estimation for Nested Error Regression Model Adrijo Chakraborty, Gauri Sankar Datta,3 and Abhyuday Mandal NORC at the University of Chicago, Bethesda, MD 084, USA Department
More informationStatistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach
Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score
More informationNonresponse weighting adjustment using estimated response probability
Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006 Introduction Nonresponse Unit nonresponse Item nonresponse Basic strategy
More informationParametric fractional imputation for missing data analysis
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????),??,?, pp. 1 15 C???? Biometrika Trust Printed in
More informationSmall Area Estimation with Uncertain Random Effects
Small Area Estimation with Uncertain Random Effects Gauri Sankar Datta 1 2 3 and Abhyuday Mandal 4 Abstract Random effects models play an important role in model-based small area estimation. Random effects
More informationChapter 4: Imputation
Chapter 4: Imputation Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Basic Theory for imputation 3 Variance estimation after imputation 4 Replication variance estimation
More informationSelection of small area estimation method for Poverty Mapping: A Conceptual Framework
Selection of small area estimation method for Poverty Mapping: A Conceptual Framework Sumonkanti Das National Institute for Applied Statistics Research Australia University of Wollongong The First Asian
More informationImputation for Missing Data under PPSWR Sampling
July 5, 2010 Beijing Imputation for Missing Data under PPSWR Sampling Guohua Zou Academy of Mathematics and Systems Science Chinese Academy of Sciences 1 23 () Outline () Imputation method under PPSWR
More informationData Integration for Big Data Analysis for finite population inference
for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation
More informationSmall domain estimation using probability and non-probability survey data
Small domain estimation using probability and non-probability survey data Adrijo Chakraborty and N Ganesh March 2018, FCSM Outline Introduction to the Problem Approaches for Combining Probability and Non-Probability
More informationA note on multiple imputation for general purpose estimation
A note on multiple imputation for general purpose estimation Shu Yang Jae Kwang Kim SSC meeting June 16, 2015 Shu Yang, Jae Kwang Kim Multiple Imputation June 16, 2015 1 / 32 Introduction Basic Setup Assume
More informationThe Simple Regression Model. Part II. The Simple Regression Model
Part II The Simple Regression Model As of Sep 22, 2015 Definition 1 The Simple Regression Model Definition Estimation of the model, OLS OLS Statistics Algebraic properties Goodness-of-Fit, the R-square
More informationAdvanced Econometrics
Based on the textbook by Verbeek: A Guide to Modern Econometrics Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna May 16, 2013 Outline Univariate
More informationSmall area solutions for the analysis of pollution data TIES Daniela Cocchi, Enrico Fabrizi, Carlo Trivisano. Università di Bologna
Small area solutions for the analysis of pollution data Daniela Cocchi, Enrico Fabrizi, Carlo Trivisano Università di Bologna TIES 00 Genova, June nd 00 Summary The small area problem in the environmental
More informationSmall Domains Estimation and Poverty Indicators. Carleton University, Ottawa, Canada
Small Domains Estimation and Poverty Indicators J. N. K. Rao Carleton University, Ottawa, Canada Invited paper for presentation at the International Seminar Population Estimates and Projections: Methodologies,
More informationGraybill Conference Poster Session Introductions
Graybill Conference Poster Session Introductions 2013 Graybill Conference in Modern Survey Statistics Colorado State University Fort Collins, CO June 10, 2013 Small Area Estimation with Incomplete Auxiliary
More informationOutline of GLMs. Definitions
Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density
More informationMultivariate area level models for small area estimation. a
Multivariate area level models for small area estimation. a a In collaboration with Roberto Benavent Domingo Morales González d.morales@umh.es Universidad Miguel Hernández de Elche Multivariate area level
More informationEstimation of Mean Population in Small Area with Spatial Best Linear Unbiased Prediction Method
Journal of Physics: Conference Series PAPER OPEN ACCESS Estimation of Mean Population in Small Area with Spatial Best Linear Unbiased Prediction Method To cite this article: Syahril Ramadhan et al 2017
More informationNonlinear Temperature Effects of Dust Bowl Region and Short-Run Adaptation
Nonlinear Temperature Effects of Dust Bowl Region and Short-Run Adaptation A. John Woodill University of Hawaii at Manoa Seminar in Energy and Environmental Policy March 28, 2016 A. John Woodill Nonlinear
More informationMonte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics
Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics Amang S. Sukasih, Mathematica Policy Research, Inc. Donsig Jang, Mathematica Policy Research, Inc. Amang S. Sukasih,
More informationBootstrap inference for the finite population total under complex sampling designs
Bootstrap inference for the finite population total under complex sampling designs Zhonglei Wang (Joint work with Dr. Jae Kwang Kim) Center for Survey Statistics and Methodology Iowa State University Jan.
More informationNew Developments in Nonresponse Adjustment Methods
New Developments in Nonresponse Adjustment Methods Fannie Cobben January 23, 2009 1 Introduction In this paper, we describe two relatively new techniques to adjust for (unit) nonresponse bias: The sample
More informationDetermining Changes in Welfare Distributions at the Micro-level: Updating Poverty Maps By Chris Elbers, Jean O. Lanjouw, and Peter Lanjouw 1
Determining Changes in Welfare Distributions at the Micro-level: Updating Poverty Maps By Chris Elbers, Jean O. Lanjouw, and Peter Lanjouw 1 Income and wealth distributions have a prominent position in
More informationMaking sense of Econometrics: Basics
Making sense of Econometrics: Basics Lecture 2: Simple Regression Egypt Scholars Economic Society Happy Eid Eid present! enter classroom at http://b.socrative.com/login/student/ room name c28efb78 Outline
More informationLikelihood-based inference with missing data under missing-at-random
Likelihood-based inference with missing data under missing-at-random Jae-kwang Kim Joint work with Shu Yang Department of Statistics, Iowa State University May 4, 014 Outline 1. Introduction. Parametric
More informationA Small Area Procedure for Estimating Population Counts
Graduate heses and Dissertations Iowa State University Capstones, heses and Dissertations 2010 A Small Area Procedure for Estimating Population Counts Emily J. erg Iowa State University Follow this and
More informationSTAT5044: Regression and Anova. Inyoung Kim
STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:
More informationREPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLES
Statistica Sinica 8(1998), 1153-1164 REPLICATION VARIANCE ESTIMATION FOR TWO-PHASE SAMPLES Wayne A. Fuller Iowa State University Abstract: The estimation of the variance of the regression estimator for
More informationStatistical perspectives on spatial social science
Statistical perspectives on spatial social science Discussion Sarah Nusser (nusser@iastate.edu) Center for Survey Statistics and Methodology Department of Statistics Iowa State University Morris Hansen
More informationMax. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes
Maximum Likelihood Estimation Econometrics II Department of Economics Universidad Carlos III de Madrid Máster Universitario en Desarrollo y Crecimiento Económico Outline 1 3 4 General Approaches to Parameter
More informationChapter 4. Replication Variance Estimation. J. Kim, W. Fuller (ISU) Chapter 4 7/31/11 1 / 28
Chapter 4 Replication Variance Estimation J. Kim, W. Fuller (ISU) Chapter 4 7/31/11 1 / 28 Jackknife Variance Estimation Create a new sample by deleting one observation n 1 n n ( x (k) x) 2 = x (k) = n
More informationPlausible Values for Latent Variables Using Mplus
Plausible Values for Latent Variables Using Mplus Tihomir Asparouhov and Bengt Muthén August 21, 2010 1 1 Introduction Plausible values are imputed values for latent variables. All latent variables can
More informationStatistical Methods for Handling Missing Data
Statistical Methods for Handling Missing Data Jae-Kwang Kim Department of Statistics, Iowa State University July 5th, 2014 Outline Textbook : Statistical Methods for handling incomplete data by Kim and
More informationg-priors for Linear Regression
Stat60: Bayesian Modeling and Inference Lecture Date: March 15, 010 g-priors for Linear Regression Lecturer: Michael I. Jordan Scribe: Andrew H. Chan 1 Linear regression and g-priors In the last lecture,
More informationWeb Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.
Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Ruppert A. EMPIRICAL ESTIMATE OF THE KERNEL MIXTURE Here we
More informationEstimation of Complex Small Area Parameters with Application to Poverty Indicators
1 Estimation of Complex Small Area Parameters with Application to Poverty Indicators J.N.K. Rao School of Mathematics and Statistics, Carleton University (Joint work with Isabel Molina from Universidad
More informationNon-Parametric Bootstrap Mean. Squared Error Estimation For M- Quantile Estimators Of Small Area. Averages, Quantiles And Poverty
Working Paper M11/02 Methodology Non-Parametric Bootstrap Mean Squared Error Estimation For M- Quantile Estimators Of Small Area Averages, Quantiles And Poverty Indicators Stefano Marchetti, Nikos Tzavidis,
More informationA BAYESIAN APPROACH TO JOINT SMALL AREA ESTIMATION
A BAYESIAN APPROACH TO JOINT SMALL AREA ESTIMATION A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Yanping Qu IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
More informationStatistics and Econometrics I
Statistics and Econometrics I Point Estimation Shiu-Sheng Chen Department of Economics National Taiwan University September 13, 2016 Shiu-Sheng Chen (NTU Econ) Statistics and Econometrics I September 13,
More informationBusiness Statistics: A First Course
Business Statistics: A First Course 5 th Edition Chapter 7 Sampling and Sampling Distributions Basic Business Statistics, 11e 2009 Prentice-Hall, Inc. Chap 7-1 Learning Objectives In this chapter, you
More information21. Best Linear Unbiased Prediction (BLUP) of Random Effects in the Normal Linear Mixed Effects Model
21. Best Linear Unbiased Prediction (BLUP) of Random Effects in the Normal Linear Mixed Effects Model Copyright c 2018 (Iowa State University) 21. Statistics 510 1 / 26 C. R. Henderson Born April 1, 1911,
More informationContextual Effects in Modeling for Small Domains
University of Wollongong Research Online Applied Statistics Education and Research Collaboration (ASEARC) - Conference Papers Faculty of Engineering and Information Sciences 2011 Contextual Effects in
More information36-720: The Rasch Model
36-720: The Rasch Model Brian Junker October 15, 2007 Multivariate Binary Response Data Rasch Model Rasch Marginal Likelihood as a GLMM Rasch Marginal Likelihood as a Log-Linear Model Example For more
More informationCombining multiple observational data sources to estimate causal eects
Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,
More informationChapter 3: Maximum Likelihood Theory
Chapter 3: Maximum Likelihood Theory Florian Pelgrin HEC September-December, 2010 Florian Pelgrin (HEC) Maximum Likelihood Theory September-December, 2010 1 / 40 1 Introduction Example 2 Maximum likelihood
More informationMultiple Regression Analysis. Part III. Multiple Regression Analysis
Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant
More informationLFS quarterly small area estimation of youth unemployment at provincial level
LFS quarterly small area estimation of youth unemployment at provincial level Stima trimestrale della disoccupazione giovanile a livello provinciale su dati RFL Michele D Aló, Stefano Falorsi, Silvia Loriga
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationThe Use of Survey Weights in Regression Modelling
The Use of Survey Weights in Regression Modelling Chris Skinner London School of Economics and Political Science (with Jae-Kwang Kim, Iowa State University) Colorado State University, June 2013 1 Weighting
More informationEmpirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design
1 / 32 Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design Changbao Wu Department of Statistics and Actuarial Science University of Waterloo (Joint work with Min Chen and Mary
More informationMixed Effects Prediction under Benchmarking and Applications to Small Area Estimation
CIRJE-F-832 Mixed Effects Prediction under Benchmarking and Applications to Small Area Estimation Tatsuya Kubokawa University of Tokyo January 2012 CIRJE Discussion Papers can be downloaded without charge
More informationCalibration estimation using exponential tilting in sample surveys
Calibration estimation using exponential tilting in sample surveys Jae Kwang Kim February 23, 2010 Abstract We consider the problem of parameter estimation with auxiliary information, where the auxiliary
More informationNon-parametric bootstrap mean squared error estimation for M-quantile estimates of small area means, quantiles and poverty indicators
Non-parametric bootstrap mean squared error estimation for M-quantile estimates of small area means, quantiles and poverty indicators Stefano Marchetti 1 Nikos Tzavidis 2 Monica Pratesi 3 1,3 Department
More informationGEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY
GEOGRAPHY 204: STATISTICAL PROBLEM SOLVING IN GEOGRAPHY CHAPTER 2: GEOGRAPHIC DATA Primary data - acquired directly from the original source In situ or in the field Costly time and money! Campus bike racks
More informationSection on Survey Research Methods JSM 2010 STATISTICAL GRAPHICS OF PEARSON RESIDUALS IN SURVEY LOGISTIC REGRESSION DIAGNOSIS
STATISTICAL GRAPHICS OF PEARSON RESIDUALS IN SURVEY LOGISTIC REGRESSION DIAGNOSIS Stanley Weng, National Agricultural Statistics Service, U.S. Department of Agriculture 3251 Old Lee Hwy, Fairfax, VA 22030,
More informationEconometrics A. Simple linear model (2) Keio University, Faculty of Economics. Simon Clinet (Keio University) Econometrics A October 16, / 11
Econometrics A Keio University, Faculty of Economics Simple linear model (2) Simon Clinet (Keio University) Econometrics A October 16, 2018 1 / 11 Estimation of the noise variance σ 2 In practice σ 2 too
More informationApplied Econometrics (MSc.) Lecture 3 Instrumental Variables
Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.
More informationDealing With Endogeneity
Dealing With Endogeneity Junhui Qian December 22, 2014 Outline Introduction Instrumental Variable Instrumental Variable Estimation Two-Stage Least Square Estimation Panel Data Endogeneity in Econometrics
More informationSTAT 3A03 Applied Regression With SAS Fall 2017
STAT 3A03 Applied Regression With SAS Fall 2017 Assignment 2 Solution Set Q. 1 I will add subscripts relating to the question part to the parameters and their estimates as well as the errors and residuals.
More informationMiscellanea A note on multiple imputation under complex sampling
Biometrika (2017), 104, 1,pp. 221 228 doi: 10.1093/biomet/asw058 Printed in Great Britain Advance Access publication 3 January 2017 Miscellanea A note on multiple imputation under complex sampling BY J.
More informationAN INSTRUMENTAL VARIABLE APPROACH FOR IDENTIFICATION AND ESTIMATION WITH NONIGNORABLE NONRESPONSE
Statistica Sinica 24 (2014), 1097-1116 doi:http://dx.doi.org/10.5705/ss.2012.074 AN INSTRUMENTAL VARIABLE APPROACH FOR IDENTIFICATION AND ESTIMATION WITH NONIGNORABLE NONRESPONSE Sheng Wang 1, Jun Shao
More informationIntermediate Econometrics
Intermediate Econometrics Markus Haas LMU München Summer term 2011 15. Mai 2011 The Simple Linear Regression Model Considering variables x and y in a specific population (e.g., years of education and wage
More informationIntroduction to Econometrics
Introduction to Econometrics Lecture 3 : Regression: CEF and Simple OLS Zhaopeng Qu Business School,Nanjing University Oct 9th, 2017 Zhaopeng Qu (Nanjing University) Introduction to Econometrics Oct 9th,
More informationCategorical Predictor Variables
Categorical Predictor Variables We often wish to use categorical (or qualitative) variables as covariates in a regression model. For binary variables (taking on only 2 values, e.g. sex), it is relatively
More informationEconomics 620, Lecture 2: Regression Mechanics (Simple Regression)
1 Economics 620, Lecture 2: Regression Mechanics (Simple Regression) Observed variables: y i ; x i i = 1; :::; n Hypothesized (model): Ey i = + x i or y i = + x i + (y i Ey i ) ; renaming we get: y i =
More informationModel-based Estimation of Poverty Indicators for Small Areas: Overview. J. N. K. Rao Carleton University, Ottawa, Canada
Model-based Estimation of Poverty Indicators for Small Areas: Overview J. N. K. Rao Carleton University, Ottawa, Canada Isabel Molina Universidad Carlos III de Madrid, Spain Paper presented at The First
More informationLecture 5: LDA and Logistic Regression
Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant
More informationTwo-phase sampling approach to fractional hot deck imputation
Two-phase sampling approach to fractional hot deck imputation Jongho Im 1, Jae-Kwang Kim 1 and Wayne A. Fuller 1 Abstract Hot deck imputation is popular for handling item nonresponse in survey sampling.
More informationEstimation of Production Functions using Average Data
Estimation of Production Functions using Average Data Matthew J. Salois Food and Resource Economics Department, University of Florida PO Box 110240, Gainesville, FL 32611-0240 msalois@ufl.edu Grigorios
More informationSurvey nonresponse and the distribution of income
Survey nonresponse and the distribution of income Emanuela Galasso* Development Research Group, World Bank Module 1. Sampling for Surveys 1: Why are we concerned about non response? 2: Implications for
More informationTypical information required from the data collection can be grouped into four categories, enumerated as below.
Chapter 6 Data Collection 6.1 Overview The four-stage modeling, an important tool for forecasting future demand and performance of a transportation system, was developed for evaluating large-scale infrastructure
More informationKey Algebraic Results in Linear Regression
Key Algebraic Results in Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 30 Key Algebraic Results in
More informationPlanned Missingness Designs and the American Community Survey (ACS)
Planned Missingness Designs and the American Community Survey (ACS) Steven G. Heeringa Institute for Social Research University of Michigan Presentation to the National Academies of Sciences Workshop on
More informationSession 2.1: Terminology, Concepts and Definitions
Second Regional Training Course on Sampling Methods for Producing Core Data Items for Agricultural and Rural Statistics Module 2: Review of Basics of Sampling Methods Session 2.1: Terminology, Concepts
More informationCausal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions
Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census
More informationEmpirical approaches in public economics
Empirical approaches in public economics ECON4624 Empirical Public Economics Fall 2016 Gaute Torsvik Outline for today The canonical problem Basic concepts of causal inference Randomized experiments Non-experimental
More informationA Comparative Study of Imputation Methods for Estimation of Missing Values of Per Capita Expenditure in Central Java
IOP Conference Series: Earth and Environmental Science PAPER OPEN ACCESS A Comparative Study of Imputation Methods for Estimation of Missing Values of Per Capita Expenditure in Central Java To cite this
More informationOn the bias of the multiple-imputation variance estimator in survey sampling
J. R. Statist. Soc. B (2006) 68, Part 3, pp. 509 521 On the bias of the multiple-imputation variance estimator in survey sampling Jae Kwang Kim, Yonsei University, Seoul, Korea J. Michael Brick, Westat,
More informationOn Modifications to Linking Variance Estimators in the Fay-Herriot Model that Induce Robustness
Statistics and Applications {ISSN 2452-7395 (online)} Volume 16 No. 1, 2018 (New Series), pp 289-303 On Modifications to Linking Variance Estimators in the Fay-Herriot Model that Induce Robustness Snigdhansu
More informationREPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY
REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY J.D. Opsomer, W.A. Fuller and X. Li Iowa State University, Ames, IA 50011, USA 1. Introduction Replication methods are often used in
More information