A measurement error model approach to small area estimation

1 A measurement error model approach to small area estimation
Jae-kwang Kim
Spring,
Joint work with Seunghwan Park and Seoyoung Kim

2 Outline
- Introduction
- Basic Theory
- Application to Korean LFS
- Discussion
Jae-kwang Kim Survey Sampling Spring, / 26

3 Introduction
- Small area estimation: the goal is to provide reliable estimates for areas with insufficient sample sizes.
- The sample is not planned to give accurate direct estimators for the domains; some domains have few or no sample observations.
- Idea: a model can be used to borrow strength from other sources of information.

4 Introduction
- Motivation: combine several sources of information to obtain improved small area estimates.
- Question: how can the direct estimators be improved using auxiliary variables from other independent surveys, census data, or administrative data?
- Our study uses:
  - an area-level model approach,
  - several sources of auxiliary information,
  - a measurement error model,
  - a generalized least squares (GLS) method.

5 Introduction: General Setup
- Study variable: $X_i$.
- Survey A: directly compute $\hat{X}_i$, subject to sampling error.
- Survey B: compute $\hat{Y}_{i1}$, subject to sampling error.
- Census: measures $\hat{Y}_{i2}$.
- $E_A(\hat{X}_i) \neq E_B(\hat{Y}_{i1})$ because of structural differences between the surveys.
- Structural (or systematic) differences arise from:
  - different survey modes,
  - time differences,
  - frame differences.
- Goal: best prediction of $X_i$ by incorporating various types of auxiliary information.

6 Basic Steps
- Model specification: measurement error model approach
- Best prediction: BLUP
- Parameter estimation: GLS method
- MSPE estimation

7 Model: measurement error model approach
Two error models (for area $i$):
- Sampling error model:
\[
\hat{X}_{i,a} = X_i + a_i, \qquad \hat{Y}_{i,b} = Y_i + b_i,
\]
where $(a_i, b_i)$ represents the sampling error such that
\[
\begin{pmatrix} a_i \\ b_i \end{pmatrix} \sim \left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} V(a_i) & \mathrm{Cov}(a_i, b_i) \\ \mathrm{Cov}(a_i, b_i) & V(b_i) \end{pmatrix} \right].
\]
- Structural error model:
\[
Y_i = \beta_0 + \beta_1 X_i + e_i, \qquad e_i \sim (0, \sigma_{ei}^2).
\]

8 Model: measurement error model approach
- The structural error model describes the relationship between the two survey measurements up to sampling error.
  - $X$: target measurement item (variable of primary interest).
  - $Y$: inaccurate measurement of $X$ with possible systematic bias.
- If both $X$ and $Y$ measure the same item (with different survey modes), the structural error model is essentially a measurement error model ($\beta_0 = 0$, $\beta_1 = 1$ means no measurement bias).
- Why consider $Y_i = \beta_0 + \beta_1 X_i + e_i$ instead of $X_i = \beta_0 + \beta_1 Y_i + e_i$?
  1. We want to explain $Y$ in terms of $X$.
  2. It can handle several $Y$ variables more easily.

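9 Prediction
Recall the GLS method: for $y = Z\theta + e$ with $e \sim (0, V)$,
\[
\hat{\theta}_{GLS} = (Z' V^{-1} Z)^{-1} Z' V^{-1} y.
\]
GLS approach to combining the two error models:
\[
\begin{pmatrix} \hat{X}_{i,a} \\ \beta_1^{-1}(\hat{Y}_{i,b} - \beta_0) \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} X_i + \begin{pmatrix} u_{1i} \\ u_{2i} \end{pmatrix},
\]
where $u_{1i} = a_i$ and $u_{2i} = \beta_1^{-1}(b_i + e_i)$. Thus,
\[
\begin{pmatrix} u_{1i} \\ u_{2i} \end{pmatrix} \sim \left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} V(a_i) & \beta_1^{-1}\mathrm{Cov}(a_i, b_i) \\ \beta_1^{-1}\mathrm{Cov}(a_i, b_i) & \beta_1^{-2}\{V(b_i) + \sigma_{ei}^2\} \end{pmatrix} \right].
\]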
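As a minimal sketch of the stacked GLS computation on this slide, the function below forms the two "observations" of $X_i$, builds the 2x2 covariance matrix of $(u_{1i}, u_{2i})$, and evaluates $(Z'V^{-1}Z)^{-1}Z'V^{-1}y$ with $Z = (1, 1)'$. The function name `gls_combine` and its argument names are illustrative assumptions, not part of the slides, and the parameters are treated as known.

```python
def gls_combine(x_a, y_b, beta0, beta1, var_a, var_b, cov_ab, sigma2_e):
    """GLS estimate of X_i from the two-row stacked model (hypothetical helper)."""
    # The two observations of X_i: the direct estimate from survey A and the
    # back-transformed estimate from survey B.
    y1 = x_a
    y2 = (y_b - beta0) / beta1

    # Covariance matrix of (u_1i, u_2i) as given on the slide.
    v11 = var_a
    v12 = cov_ab / beta1
    v22 = (var_b + sigma2_e) / beta1 ** 2

    # Inverse of the symmetric 2x2 covariance matrix.
    det = v11 * v22 - v12 ** 2
    i11, i12, i22 = v22 / det, -v12 / det, v11 / det

    # With Z = (1, 1)', Z'V^{-1}Z is the sum of all entries of V^{-1} and
    # Z'V^{-1}y weights each observation by a column sum of V^{-1}.
    ztvz = i11 + 2.0 * i12 + i22
    ztvy = (i11 + i12) * y1 + (i12 + i22) * y2
    return ztvy / ztvz
```

For example, with no measurement bias (`beta0 = 0`, `beta1 = 1`), uncorrelated errors, and equal variances, the result is the simple average of the two estimates.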

10 Prediction
GLS method: the best linear unbiased estimator of $X_i$ based on a linear combination of $\hat{X}_{i,a}$ and $\hat{X}_{i,b} = \beta_1^{-1}(\hat{Y}_{i,b} - \beta_0)$. Under the current setup,
\[
\hat{X}_i = \alpha_i \hat{X}_{i,a} + (1 - \alpha_i) \hat{X}_{i,b},
\]
where
\[
\alpha_i = \frac{\sigma_{ei}^2 + V(b_i) - \beta_1 \mathrm{Cov}(a_i, b_i)}{\sigma_{ei}^2 + \beta_1^2 V(a_i) + V(b_i) - 2\beta_1 \mathrm{Cov}(a_i, b_i)}.
\]
The GLS estimator is sometimes called a composite estimator. In practice, we need to use $\hat{\beta}_0$, $\hat{\beta}_1$, and $\hat{\sigma}_{ei}^2$.
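The closed-form weight $\alpha_i$ can be sketched directly. The helper names `composite_weight` and `composite_estimate` are assumptions for illustration, and the parameters are treated as known, whereas in practice their estimates would be plugged in.

```python
def composite_weight(beta1, var_a, var_b, cov_ab, sigma2_e):
    """Weight alpha_i placed on the direct estimator X_hat_{i,a}."""
    num = sigma2_e + var_b - beta1 * cov_ab
    den = sigma2_e + beta1 ** 2 * var_a + var_b - 2.0 * beta1 * cov_ab
    return num / den


def composite_estimate(x_a, y_b, beta0, beta1, var_a, var_b, cov_ab, sigma2_e):
    """Composite (GLS) estimator alpha * X_hat_{i,a} + (1 - alpha) * X_hat_{i,b}."""
    alpha = composite_weight(beta1, var_a, var_b, cov_ab, sigma2_e)
    # Back-transform the survey-B estimate to the scale of X_i.
    x_b = (y_b - beta0) / beta1
    return alpha * x_a + (1.0 - alpha) * x_b
```

When survey B is much noisier than survey A (large `var_b` or `sigma2_e`), the weight moves toward 1 and the composite estimator stays close to the direct estimator.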

11 Parameter estimation
The area-level model takes the form of a measurement error model (Fuller, 1987):
\[
\hat{Y}_i = \beta_0 + \beta_1 X_i + e_i + b_i, \qquad \hat{X}_i = X_i + a_i.
\]
We will consider the generalized least squares (GLS) method for parameter estimation.
GLS estimation of $(\beta_0, \beta_1)$: minimize
\[
Q(\beta_0, \beta_1) = \sum_i \frac{(\hat{Y}_i - \beta_0 - \beta_1 \hat{X}_i)^2}{V(\hat{Y}_i - \beta_0 - \beta_1 \hat{X}_i)} \tag{1}
\]
with respect to $(\beta_0, \beta_1)$.

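12 Parameter estimation (Cont'd)
Since
\[
V(\hat{Y}_i - \beta_0 - \beta_1 \hat{X}_i) = \sigma_{ei}^2 + (-\beta_1, 1) \Sigma_i (-\beta_1, 1)', \tag{2}
\]
where $\sigma_{ei}^2 = V(e_i)$ and $\Sigma_i = V\{(a_i, b_i)'\}$, we can write
\[
Q(\beta_0, \beta_1) = \sum_i w_i(\beta_1) (\hat{Y}_i - \beta_0 - \beta_1 \hat{X}_i)^2, \tag{3}
\]
where $w_i(\beta_1) = \{\sigma_{ei}^2 + (-\beta_1, 1) \Sigma_i (-\beta_1, 1)'\}^{-1}$. Here, $\Sigma_i$ is assumed to be known. Note that
\[
\frac{\partial Q}{\partial \beta_0} = 0 \iff \sum_i w_i(\beta_1) (\hat{Y}_i - \beta_0 - \beta_1 \hat{X}_i) = 0,
\]
and so
\[
\hat{\beta}_0 = \bar{y}_w - \hat{\beta}_1 \bar{x}_w, \tag{4}
\]
where $(\bar{x}_w, \bar{y}_w) = \{\sum_i w_i(\hat{\beta}_1)\}^{-1} \sum_i w_i(\hat{\beta}_1) (\hat{X}_i, \hat{Y}_i)$.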

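13 Plugging (4) into (3), we have only to minimize
\[
Q_1(\beta_1) = \sum_i w_i(\beta_1) \{\hat{Y}_i - \bar{y}_w - \beta_1 (\hat{X}_i - \bar{x}_w)\}^2. \tag{5}
\]
Thus, we need to find the solution to $\partial Q_1 / \partial \beta_1 = 0$, where
\[
\frac{\partial Q_1}{\partial \beta_1} = \sum_i \left\{ \frac{\partial}{\partial \beta_1} w_i(\beta_1) \right\} \{\hat{Y}_i - \bar{y}_w - \beta_1 (\hat{X}_i - \bar{x}_w)\}^2 - 2 \sum_i w_i(\beta_1) (\hat{X}_i - \bar{x}_w) \{\hat{Y}_i - \bar{y}_w - \beta_1 (\hat{X}_i - \bar{x}_w)\}.
\]
Using
\[
\frac{\partial}{\partial \beta_1} w_i(\beta_1) = -2 \{w_i(\beta_1)\}^2 \{\beta_1 V(a_i) - C(a_i, b_i)\}
\]
and
\[
\{\hat{Y}_i - \bar{y}_w - \beta_1 (\hat{X}_i - \bar{x}_w)\}^2 \approx \sigma_{ei}^2 + (-\beta_1, 1) \Sigma_i (-\beta_1, 1)' = 1 / w_i(\beta_1),
\]
the solution to $\partial Q_1 / \partial \beta_1 = 0$ satisfies
\[
\hat{\beta}_1 = \frac{\sum_i w_i(\hat{\beta}_1) \{(\hat{X}_i - \bar{x}_w)(\hat{Y}_i - \bar{y}_w) - C(a_i, b_i)\}}{\sum_i w_i(\hat{\beta}_1) \{(\hat{X}_i - \bar{x}_w)^2 - V(a_i)\}}. \tag{6}
\]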
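Because $\hat{\beta}_1$ appears on both sides of (6) through the weights, one natural way to solve it is fixed-point iteration. The following is a sketch under that assumption (the slides do not specify a solver). The names `beta_update` and `solve_beta` and the starting value `beta1_start=1.0` are illustrative choices, and $\sigma_e^2$ is held fixed.

```python
def beta_update(x_hat, y_hat, var_a, var_b, cov_ab, sigma2_e, beta1):
    """One pass of equations (4) and (6) with weights evaluated at the current beta1."""
    # w_i(beta1) = 1 / {sigma2_e + (-beta1, 1) Sigma_i (-beta1, 1)'}
    w = [1.0 / (sigma2_e + beta1 ** 2 * va + vb - 2.0 * beta1 * c)
         for va, vb, c in zip(var_a, var_b, cov_ab)]
    sw = sum(w)
    x_w = sum(wi * x for wi, x in zip(w, x_hat)) / sw
    y_w = sum(wi * y for wi, y in zip(w, y_hat)) / sw
    # Numerator and denominator of (6), corrected for sampling (co)variance.
    num = sum(wi * ((x - x_w) * (y - y_w) - c)
              for wi, x, y, c in zip(w, x_hat, y_hat, cov_ab))
    den = sum(wi * ((x - x_w) ** 2 - va)
              for wi, x, va in zip(w, x_hat, var_a))
    beta1_new = num / den  # den can be non-positive if V(a_i) dominates the spread of x_hat
    beta0_new = y_w - beta1_new * x_w  # equation (4)
    return beta0_new, beta1_new


def solve_beta(x_hat, y_hat, var_a, var_b, cov_ab, sigma2_e,
               beta1_start=1.0, tol=1e-10, max_iter=100):
    """Iterate beta_update until beta1 stabilises (fixed-point iteration)."""
    beta1 = beta1_start
    beta0 = 0.0
    for _ in range(max_iter):
        beta0, beta1_new = beta_update(x_hat, y_hat, var_a, var_b,
                                       cov_ab, sigma2_e, beta1)
        if abs(beta1_new - beta1) < tol:
            return beta0, beta1_new
        beta1 = beta1_new
    return beta0, beta1
```

With no sampling error (all variances and covariances zero) the weights are constant and the solution reduces to the ordinary least squares line, which gives a quick sanity check.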

14 Parameter estimation: Estimation of $\sigma_{ei}^2$
- Assume $\sigma_{ei}^2 = \sigma_e^2$.
- We can also consider an alternative assumption such as $\sigma_{ei}^2 = X_i \sigma_e^2$, but in this case a parametric model assumption is needed.
- In practice, one can consider a transformation $T(\cdot)$ such that the structural error model becomes
\[
T(Y_i) = \beta_0 + \beta_1 T(X_i) + e_i, \qquad e_i \sim (0, \sigma_e^2).
\]
- Method-of-moments estimator of $\sigma_e^2$: solve
\[
\sum_i \frac{(\hat{Y}_i - \hat{\beta}_0 - \hat{\beta}_1 \hat{X}_i)^2}{\sigma_e^2 + (-\hat{\beta}_1, 1) \Sigma_i (-\hat{\beta}_1, 1)'} = H - 2, \tag{7}
\]
where $H$ is the total number of small areas.
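The left side of (7) is decreasing in $\sigma_e^2$, so the equation can be solved by a simple bracketing search. The sketch below uses bisection and truncates the estimate at zero when no positive root exists. Both choices are assumptions made here for illustration, as is the name `solve_sigma2_e`. The argument `c` holds the known quantities $(-\hat{\beta}_1, 1)\Sigma_i(-\hat{\beta}_1, 1)'$.

```python
def solve_sigma2_e(resid, c, tol=1e-10, max_iter=200):
    """Solve sum_i resid_i^2 / (s + c_i) = H - 2 for s >= 0 by bisection."""
    h = len(resid)
    if h <= 2:
        raise ValueError("need more than two areas")
    target = h - 2

    def g(s):
        # Left side of (7) minus the right side; decreasing in s.
        return sum(r * r / (s + ci) for r, ci in zip(resid, c)) - target

    # Assumed convention: if even s = 0 gives a left side below H - 2,
    # report zero (g(0) is only evaluated when every c_i is positive).
    if min(c) > 0 and g(0.0) <= 0:
        return 0.0

    # Expand the upper bracket until the sign changes, then bisect.
    lo, hi = 0.0, 1.0
    while g(hi) > 0:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

With no sampling error (`c` all zero) the solution is the familiar residual variance, the residual sum of squares divided by `H - 2`.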

15 Parameter estimation (Cont'd)
Iterative algorithm for parameter estimation:
1. Compute the initial estimator of $(\beta_0, \beta_1)$ by setting $\hat{\sigma}_e^2 = 0$.
2. Using the current value of $(\hat{\beta}_0, \hat{\beta}_1)$, compute $\hat{\sigma}_e^2$ using (7).
3. Using the current value of $\hat{\sigma}_e^2$, compute the updated estimator of $(\beta_0, \beta_1)$ using (4) and (6).
4. Repeat steps 2 and 3 until convergence.
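The four steps can be sketched as a single loop. This is an illustrative implementation under stated assumptions, not the authors' code. It starts the weights from `beta1 = 1`, applies one pass of (4) and (6) per iteration instead of solving (6) exactly, solves (7) by bisection with truncation at zero, and requires positive sampling variance in every area so that the weights are finite when $\hat{\sigma}_e^2 = 0$. The name `fit_measurement_error_model` is hypothetical.

```python
def fit_measurement_error_model(x_hat, y_hat, var_a, var_b, cov_ab,
                                tol=1e-8, max_iter=100):
    """Alternate between (7) and (4)+(6); returns (beta0, beta1, sigma2_e)."""
    h = len(x_hat)

    def extra_var(b1):
        # (-b1, 1) Sigma_i (-b1, 1)' for each area; assumed positive.
        return [b1 * b1 * va - 2.0 * b1 * c + vb
                for va, vb, c in zip(var_a, var_b, cov_ab)]

    def update_beta(b1, s2):
        # One pass of (4) and (6) with weights evaluated at (b1, s2).
        w = [1.0 / (s2 + c) for c in extra_var(b1)]
        sw = sum(w)
        xw = sum(wi * x for wi, x in zip(w, x_hat)) / sw
        yw = sum(wi * y for wi, y in zip(w, y_hat)) / sw
        num = sum(wi * ((x - xw) * (y - yw) - c)
                  for wi, x, y, c in zip(w, x_hat, y_hat, cov_ab))
        den = sum(wi * ((x - xw) ** 2 - va)
                  for wi, x, va in zip(w, x_hat, var_a))
        b1_new = num / den
        return yw - b1_new * xw, b1_new

    def solve_sigma2(b0, b1):
        # Method-of-moments equation (7), solved by bisection.
        c = extra_var(b1)
        r2 = [(y - b0 - b1 * x) ** 2 for x, y in zip(x_hat, y_hat)]

        def g(s):
            return sum(r / (s + ci) for r, ci in zip(r2, c)) - (h - 2)

        if g(0.0) <= 0:
            return 0.0  # assumed truncation at zero
        lo, hi = 0.0, 1.0
        while g(hi) > 0:
            hi *= 2.0
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if g(mid) > 0:
                lo = mid
            else:
                hi = mid
            if hi - lo < 1e-12:
                break
        return 0.5 * (lo + hi)

    # Step 1: initial (beta0, beta1) with sigma2_e = 0.
    s2 = 0.0
    b0, b1 = update_beta(1.0, s2)
    for _ in range(max_iter):
        s2_new = solve_sigma2(b0, b1)             # step 2
        b0_new, b1_new = update_beta(b1, s2_new)  # step 3
        change = max(abs(b0_new - b0), abs(b1_new - b1), abs(s2_new - s2))
        b0, b1, s2 = b0_new, b1_new, s2_new
        if change < tol:                          # step 4
            break
    return b0, b1, s2
```

A convenient check: when only survey B has sampling error and its variance is the same in every area, the weights are constant, so the fit reduces to ordinary least squares and $\hat{\sigma}_e^2$ equals the residual variance minus that sampling variance.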

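16 MSE estimation
Recall the measurement error model structure, writing $\hat{X}_{i,a}$ and $\hat{Y}_{i,b}$ for the direct estimators:
\[
\hat{Y}_{i,b} = \beta_0 + \beta_1 X_i + e_i + b_i, \qquad \hat{X}_{i,a} = X_i + a_i.
\]
GLS estimator of $X_i$:
\[
\hat{X}_i = \{(\beta_1, 1) V_i^{-1} (\beta_1, 1)'\}^{-1} (\beta_1, 1) V_i^{-1} (\hat{Y}_{i,b} - \beta_0, \hat{X}_{i,a})'
= \alpha_i \hat{X}_{i,a} + (1 - \alpha_i) \{\beta_1^{-1} (\hat{Y}_{i,b} - \beta_0)\}
= \alpha_i \hat{X}_{i,a} + (1 - \alpha_i) \hat{X}_{i,b},
\]
where $V_i$ is the variance-covariance matrix of $(b_i + e_i, a_i)'$ and
\[
\alpha_i = \frac{\sigma_{ei}^2 + V(b_i) - \beta_1 \mathrm{Cov}(a_i, b_i)}{\sigma_{ei}^2 + \beta_1^2 V(a_i) + V(b_i) - 2\beta_1 \mathrm{Cov}(a_i, b_i)}.
\]
MSE of $\hat{X}_i$:
\[
E\{(\hat{X}_i - X_i)^2\} = E\left[ \{\alpha_i (\hat{X}_{i,a} - X_i) + (1 - \alpha_i)(\hat{X}_{i,b} - X_i)\}^2 \right]
\]
\[
= \alpha_i^2 V(\hat{X}_{i,a}) + (1 - \alpha_i)^2 V(\hat{X}_{i,b}) + 2\alpha_i (1 - \alpha_i) \mathrm{Cov}(\hat{X}_{i,a}, \hat{X}_{i,b})
\]
\[
= \alpha_i V(\hat{X}_{i,a}) + (1 - \alpha_i) \mathrm{Cov}(\hat{X}_{i,a}, \hat{X}_{i,b}) := M_{1i}.
\]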
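The last equality holds because $\alpha_i$ is the optimal weight. The sketch below computes both the full quadratic form and the simplified $M_{1i}$ so the two can be compared numerically. It takes $V(\hat{X}_{i,a}) = V(a_i)$, $V(\hat{X}_{i,b}) = \beta_1^{-2}\{V(b_i) + \sigma_{ei}^2\}$, and $\mathrm{Cov}(\hat{X}_{i,a}, \hat{X}_{i,b}) = \beta_1^{-1}\mathrm{Cov}(a_i, b_i)$, following the error covariance matrix of the stacked model. The function name `m1_mse` is an illustrative assumption.

```python
def m1_mse(beta1, var_a, var_b, cov_ab, sigma2_e):
    """Return (simplified M_1i, full quadratic form) for known parameters."""
    # Optimal composite weight alpha_i.
    alpha = ((sigma2_e + var_b - beta1 * cov_ab)
             / (sigma2_e + beta1 ** 2 * var_a + var_b - 2.0 * beta1 * cov_ab))

    # Error variances and covariance of the two estimators of X_i.
    v_xa = var_a
    v_xb = (var_b + sigma2_e) / beta1 ** 2
    c_ab = cov_ab / beta1

    simplified = alpha * v_xa + (1.0 - alpha) * c_ab
    full = (alpha ** 2 * v_xa + (1.0 - alpha) ** 2 * v_xb
            + 2.0 * alpha * (1.0 - alpha) * c_ab)
    return simplified, full
```

The two returned values agree up to rounding for any valid inputs, and both are no larger than `var_a`, the variance of the direct estimator.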

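17 MSE estimation
The actual prediction for $X_i$ is computed by $\hat{X}_{ei} = \hat{X}_i(\hat{\theta})$, where $\theta = (\beta_0, \beta_1, \sigma_e^2)$. Its MSE decomposes as
\[
\mathrm{MSE}(\hat{X}_{ei}) = \mathrm{MSE}(\hat{X}_i) + E\{(\hat{X}_{ei} - \hat{X}_i)^2\} = M_{1i} + M_{2i}.
\]
Consider a jackknife approach:
\[
\hat{M}_{2i} = \frac{H - 1}{H} \sum_{k=1}^{H} (\hat{X}_{ei}^{(-k)} - \hat{X}_{ei})^2,
\]
\[
\hat{M}_{1i} = \hat{\alpha}_i^{(JK)} \hat{V}(a_i) + (1 - \hat{\alpha}_i^{(JK)}) \widehat{\mathrm{Cov}}(a_i, b_i),
\]
where
\[
\hat{\alpha}_i^{(JK)} = \hat{\alpha}_i - \frac{H - 1}{H} \sum_{k=1}^{H} (\hat{\alpha}_i^{(-k)} - \hat{\alpha}_i)
\]
and the superscript $(-k)$ denotes the quantity computed with the $k$-th area deleted.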
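The two jackknife formulas share the same structure, so they can be sketched as small helpers. The replicate values are assumed to have been obtained beforehand by re-estimating the parameters with one area deleted at a time. The function names are illustrative and not taken from the slides.

```python
def jackknife_bias_corrected(full_value, replicate_values):
    """Bias-corrected value: full - (H-1)/H * sum_k (replicate_k - full)."""
    h = len(replicate_values)
    return full_value - (h - 1) / h * sum(r - full_value for r in replicate_values)


def jackknife_m2(full_value, replicate_values):
    """Jackknife estimate of M_2i: (H-1)/H * sum_k (replicate_k - full)^2."""
    h = len(replicate_values)
    return (h - 1) / h * sum((r - full_value) ** 2 for r in replicate_values)
```

For area $i$, `jackknife_bias_corrected` would be called with the full-sample weight and its delete-one replicates, and `jackknife_m2` with the full-sample prediction and its delete-one replicates.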

18 Korean LFS Application
- Labor Force Survey: a very important economic survey measuring unemployment rates.
- Several sources of information on unemployment in Korea:
  1. Korean Labor Force Survey (KLF) data: 7K sample households (monthly)
  2. Local Area Labor Force Survey (LALF) data: 200K sample households (quarterly)
  3. Census long form data (10% of the population)
- The KLF sample is nested within the LALF sample.

19 Korea LFS Application
- The unemployment rate for a small area is the parameter of interest.
- Several sources of information on unemployment for analysis district area $i$:
  - $\hat{X}_i$: estimates from KLF data
  - $\hat{Y}_{1i}$: estimates from LALF data
  - $\hat{Y}_{2i}$: estimates from census data
- KLF: sampling error, measurement error.
- LALF: sampling error, measurement error.
- Census data: sampling error, measurement error (no updated information).

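20 Korea LFS Application
We can also consider census data. Then (3) changes to
\[
\begin{pmatrix} \hat{\bar{X}}_i \\ \hat{\bar{Y}}_{1i} - \beta_0 \\ \hat{\bar{Y}}_{2i} - \gamma_0 \end{pmatrix} = \begin{pmatrix} 1 \\ \beta_1 \\ \gamma_1 \end{pmatrix} \bar{X}_i + \begin{pmatrix} a_i \\ b_i + \bar{e}_{1i} \\ \bar{e}_{2i} \end{pmatrix}.
\]
The whole process is similar to the case of combining two surveys.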

21 Figure: Plot of Unemployment Rate for KLF and LALF Survey for Urban Area

22 Figure: Plot of Residuals against estimated values for Urban Area

23 Korea LFS Application
Data analysis result. Consider four estimates:
- KLF: only KLF
- LALF: only LALF
- GLS 1: combine KLF and LALF
- GLS 2: combine KLF, LALF, and census data

Table: MSE of the four estimates (rows: KLF, LALF, GLS 1, GLS 2; columns: 1st Q, Median, Mean, 3rd Q).

24 Discussion
- Model specification was very difficult.
- We build models separately for urban and rural areas, which are assigned based on the proportion of households engaged in agricultural business.
- In the KLF survey, 25% of all areas have a 0 unemployment rate because the sample size of an individual area is quite small.
- The areas with a 0 unemployment rate are excluded when the parameters are estimated.
- We have considered a structural model with a 0 intercept:
\[
\bar{Y}_{1i} = \beta_1 \bar{X}_i + e_i.
\]
- A mixture model or a zero-inflated regression model can be considered.

25 Summary
- Motivated by real data: small area estimation for the Korean Labor Force Survey.
- GLS prediction approach under the area-level model.
- Measurement error model for parameter estimation.
- Instead of the GLS approach, a maximum likelihood approach is also possible under parametric model assumptions.

26 Reference
Kim, J.K., Park, S. and Kim, S. (2015). Small area estimation combining information from several sources, Survey Methodology, In press.


More information

Chapter 4: Imputation

Chapter 4: Imputation Chapter 4: Imputation Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Basic Theory for imputation 3 Variance estimation after imputation 4 Replication variance estimation

More information

Statistics 910, #5 1. Regression Methods

Statistics 910, #5 1. Regression Methods Statistics 910, #5 1 Overview Regression Methods 1. Idea: effects of dependence 2. Examples of estimation (in R) 3. Review of regression 4. Comparisons and relative efficiencies Idea Decomposition Well-known

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Simple Linear Regression for the MPG Data

Simple Linear Regression for the MPG Data Simple Linear Regression for the MPG Data 2000 2500 3000 3500 15 20 25 30 35 40 45 Wgt MPG What do we do with the data? y i = MPG of i th car x i = Weight of i th car i =1,...,n n = Sample Size Exploratory

More information

Two-Variable Regression Model: The Problem of Estimation

Two-Variable Regression Model: The Problem of Estimation Two-Variable Regression Model: The Problem of Estimation Introducing the Ordinary Least Squares Estimator Jamie Monogan University of Georgia Intermediate Political Methodology Jamie Monogan (UGA) Two-Variable

More information

Regression diagnostics

Regression diagnostics Regression diagnostics Kerby Shedden Department of Statistics, University of Michigan November 5, 018 1 / 6 Motivation When working with a linear model with design matrix X, the conventional linear model

More information

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)

More information

The propensity score with continuous treatments

The propensity score with continuous treatments 7 The propensity score with continuous treatments Keisuke Hirano and Guido W. Imbens 1 7.1 Introduction Much of the work on propensity score analysis has focused on the case in which the treatment is binary.

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

Estadística II Chapter 4: Simple linear regression

Estadística II Chapter 4: Simple linear regression Estadística II Chapter 4: Simple linear regression Chapter 4. Simple linear regression Contents Objectives of the analysis. Model specification. Least Square Estimators (LSE): construction and properties

More information

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Put your solution to each problem on a separate sheet of paper. Problem 1. (5166) Assume that two random samples {x i } and {y i } are independently

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models, two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent variable,

More information

In the bivariate regression model, the original parameterization is. Y i = β 1 + β 2 X2 + β 2 X2. + β 2 (X 2i X 2 ) + ε i (2)

In the bivariate regression model, the original parameterization is. Y i = β 1 + β 2 X2 + β 2 X2. + β 2 (X 2i X 2 ) + ε i (2) RNy, econ460 autumn 04 Lecture note Orthogonalization and re-parameterization 5..3 and 7.. in HN Orthogonalization of variables, for example X i and X means that variables that are correlated are made

More information

6. Fractional Imputation in Survey Sampling

6. Fractional Imputation in Survey Sampling 6. Fractional Imputation in Survey Sampling 1 Introduction Consider a finite population of N units identified by a set of indices U = {1, 2,, N} with N known. Associated with each unit i in the population

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

11.1 Gujarati(2003): Chapter 12

11.1 Gujarati(2003): Chapter 12 11.1 Gujarati(2003): Chapter 12 Time Series Data 11.2 Time series process of economic variables e.g., GDP, M1, interest rate, echange rate, imports, eports, inflation rate, etc. Realization An observed

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression

More information

Graduate Econometrics Lecture 4: Heteroskedasticity

Graduate Econometrics Lecture 4: Heteroskedasticity Graduate Econometrics Lecture 4: Heteroskedasticity Department of Economics University of Gothenburg November 30, 2014 1/43 and Autocorrelation Consequences for OLS Estimator Begin from the linear model

More information

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III)

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Florian Pelgrin HEC September-December 2010 Florian Pelgrin (HEC) Constrained estimators September-December

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

BANA 7046 Data Mining I Lecture 2. Linear Regression, Model Assessment, and Cross-validation 1

BANA 7046 Data Mining I Lecture 2. Linear Regression, Model Assessment, and Cross-validation 1 BANA 7046 Data Mining I Lecture 2. Linear Regression, Model Assessment, and Cross-validation 1 Shaobo Li University of Cincinnati 1 Partially based on Hastie, et al. (2009) ESL, and James, et al. (2013)

More information

L7: Multicollinearity

L7: Multicollinearity L7: Multicollinearity Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Introduction ï Example Whats wrong with it? Assume we have this data Y

More information

F9 F10: Autocorrelation

F9 F10: Autocorrelation F9 F10: Autocorrelation Feng Li Department of Statistics, Stockholm University Introduction In the classic regression model we assume cov(u i, u j x i, x k ) = E(u i, u j ) = 0 What if we break the assumption?

More information

Categorical Predictor Variables

Categorical Predictor Variables Categorical Predictor Variables We often wish to use categorical (or qualitative) variables as covariates in a regression model. For binary variables (taking on only 2 values, e.g. sex), it is relatively

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

Graybill Conference Poster Session Introductions

Graybill Conference Poster Session Introductions Graybill Conference Poster Session Introductions 2013 Graybill Conference in Modern Survey Statistics Colorado State University Fort Collins, CO June 10, 2013 Small Area Estimation with Incomplete Auxiliary

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 2 Jakub Mućk Econometrics of Panel Data Meeting # 2 1 / 26 Outline 1 Fixed effects model The Least Squares Dummy Variable Estimator The Fixed Effect (Within

More information

Chapter 3: Element sampling design: Part 1

Chapter 3: Element sampling design: Part 1 Chapter 3: Element sampling design: Part 1 Jae-Kwang Kim Fall, 2014 Simple random sampling 1 Simple random sampling 2 SRS with replacement 3 Systematic sampling Kim Ch. 3: Element sampling design: Part

More information