Economic and Social Review, Vol. 9, No. 4

Substituting Means for Missing Observations in Regression

D. CONNIFFE
An Foras Taluntais

I INTRODUCTION

Econometric textbooks usually discuss the procedures to be adopted when missing values occur in data intended for regression analysis. There is a large statistical literature on this topic. A variety of methods, differing both in their computational complexity and in the assumptions required, have been suggested, compared or simulated by, among others, Buck (1960), Afifi and Elashoff (1966), Haitovsky (1968), and Hocking and Smith (1968). More recently, there have been the papers of Dagenais (1973) and Beale and Little (1975). Yet textbooks sometimes give a not dishonourable mention to the simplest ad hoc procedure of all: that of replacing missing values by sample means. Apart from the obvious computational ease of the procedure, advantages over ordinary least squares (on the complete observations only) are claimed for it in some circumstances. To quote Kmenta (1971): 'One redeeming feature of the estimators... is the fact that when the correlation between X and Y is low, the mean square error of these estimators is less than that of ordinary least squares.' This note examines why the simple method of substituting means seems to give good results in certain circumstances. The objective is not to promote this simple approach instead of more sophisticated methods (no one has ever claimed that it could be generally superior), but to explain when and why its mean square error properties are superior to those of ordinary least squares, or even to those of other methods.

II MISSING VALUES OF THE DEPENDENT VARIABLE

Consider the case of a simple regression when there are $r_1$ observations on $x$ and $r_2$ ($< r_1$) complete pairs of both $x$ and $y$. The estimator of the regression coefficient obtained by replacing missing $y$ values by the sample mean and performing the usual calculations is

$$\hat\beta = \frac{S'xy}{S'x^2} = \frac{Sxy}{S'x^2} = \frac{Sx^2}{S'x^2}\,\hat\beta^*, \qquad (1)$$

where $S'$ implies summation over all $r_1$ pairs, including those in which the sample mean has been substituted, $S$ implies summation over the $r_2$ complete pairs, and $\hat\beta^*$ is the usual least squares estimator from the $r_2$ complete observations. Afifi and Elashoff (1967) gave comparisons of the mean square error of $\hat\beta$ relative to that of $\hat\beta^*$, having assumed that $y_i = \alpha + \beta x_i + e_i$, where $E(e_i) = 0$, $V(e_i) = \sigma^2(1-\rho^2)$ and $x_i$ is $N(\mu_x, \sigma_x^2)$. Using their results, the mean square error of $\hat\beta$ is

$$\frac{(r_2-1)\,\sigma^2(1-\rho^2)}{\sigma_x^2\,\{(r_2-1)+(r_1-r_2)\}^2} + \frac{(r_1-r_2)^2\,\beta^2}{\{(r_2-1)+(r_1-r_2)\}^2}, \qquad (2)$$

while that of $\hat\beta^*$ is

$$\frac{\sigma^2(1-\rho^2)}{\sigma_x^2\,(r_2-3)}. \qquad (3)$$

The first term of (2) will clearly be much smaller than (3) provided $r_1$ is considerably greater than $r_2$. The second term of (2) depends on the true coefficient $\beta$, and if this is small (2) can be less than (3). If $\beta$ is not small enough, (2) will exceed (3), so the use of $\hat\beta$ instead of $\hat\beta^*$ implies some degree of prior knowledge. But, if $\beta$ is small, has the reduction in MSE resulted from information contained in the extra incomplete observations? This is a valid question, because Hocking and Smith (1972) have shown that if $y$ and $x$ are bivariate Normal (or multivariate Normal if $x$ is a vector of explanatory variables) and $y$ values are missing from some observations, then the maximum likelihood estimate of $\beta$ is just $\hat\beta^*$. (This does not apply to the intercept term.) In formula (1), $Sx^2$ is always less than $S'x^2$, so $\hat\beta$ is obtained from $\hat\beta^*$ by 'shrinking' it. Shrinkage estimators form a class of biased estimators obtained by multiplying least squares estimators by quantities less than unity. Their properties have been discussed in the statistical literature, for example by Mayer and Wilke (1973).
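The shrinkage relation in (1), and the mean square error comparison between (2) and (3), are easy to check numerically. The following sketch simulates the situation of this section; it is illustrative only, the parameter values and variable names are arbitrary, and it assumes Python with NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

def slope_estimators(x, y, r2):
    """Slope estimators when y is observed only for the first r2 of the r1
    observations on x: beta_star uses the complete pairs only, beta_tilde
    substitutes the mean of the observed y's for the missing ones."""
    xc, yc = x[:r2], y[:r2]                               # complete pairs
    Sx2 = np.sum((xc - xc.mean()) ** 2)                   # Sx^2 over complete cases
    beta_star = np.sum((xc - xc.mean()) * (yc - yc.mean())) / Sx2
    y_filled = np.concatenate([yc, np.full(x.size - r2, yc.mean())])
    Spx2 = np.sum((x - x.mean()) ** 2)                    # S'x^2 over all r1 cases
    beta_tilde = np.sum((x - x.mean()) * (y_filled - y_filled.mean())) / Spx2
    # formula (1): mean substitution simply shrinks beta_star by Sx^2 / S'x^2
    assert np.isclose(beta_tilde, beta_star * Sx2 / Spx2)
    return beta_star, beta_tilde

# Monte Carlo mean square errors for a small true slope
r1, r2, alpha, beta, sigma = 100, 20, 1.0, 0.05, 1.0
estimates = []
for _ in range(5000):
    x = rng.normal(size=r1)
    y = alpha + beta * x + sigma * rng.normal(size=r1)
    estimates.append(slope_estimators(x, y, r2))
mse_star, mse_tilde = ((np.array(estimates) - beta) ** 2).mean(axis=0)
print(f"MSE of complete-case OLS:  {mse_star:.5f}")
print(f"MSE of mean substitution:  {mse_tilde:.5f}")
```

With the small slope used here the mean-substitution estimator shows the smaller mean square error; making the true slope large enough reverses the ranking, in line with the comparison of (2) and (3).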

The optimal shrinkage factor, in a minimum mean square error sense, is

$$\frac{\beta^2}{\beta^2 + \sigma^2(1-\rho^2)/Sx^2}, \qquad (4)$$

so, provided $\beta^2/\sigma^2$ is bounded, an improvement over least squares is possible. The shrinkage estimator defined by (4) is also superior to (1), although it uses only the complete observations. Since formula (4) involves unknown parameters that must be estimated from the data, or from probably vague prior knowledge, the optimality may not be attainable in practice. But how plausible is $Sx^2/S'x^2$ as a shrinkage factor? It does not involve $\sigma^2$, $\rho$ or $\beta$, and its magnitude depends on the number of incomplete observations: the degree of shrinkage increases with the number of extra incomplete observations. But from (4) the optimal degree of shrinkage increases as $\beta^2$ decreases relative to $\sigma^2(1-\rho^2)$. Unless it could be postulated that missing values will increase in frequency in situations where the coefficient is small, $Sx^2/S'x^2$ is inappropriate as a shrinkage factor. So, even in cases where the choice of a 'good' shrinkage factor is far from clear, it can be stated, somewhat negatively, that the method of substituting the mean for missing $y$ values is unlikely to help.

III MISSING VALUES OF AN EXPLANATORY VARIABLE

The simplest case, with the dependent variable measured on $r_1$ observations and one explanatory variable with missing values for $r_1 - r_2$ of these, is a trivial one, because substituting the sample mean just gives the ordinary estimator. Suppose then that there are two explanatory variables: $x_1$, measured on all $r_1$ observations, and $x_2$, which is missing for $r_1 - r_2$ of them. It is obvious that

$$S'x_2^2 = Sx_2^2, \qquad S'x_1x_2 = Sx_1x_2, \qquad S'x_2y = Sx_2y,$$

and then

$$\hat\beta_1 = \frac{S'x_1y\,Sx_2^2 - Sx_2y\,Sx_1x_2}{S'x_1^2\,Sx_2^2 - (Sx_1x_2)^2} \qquad \text{and} \qquad \hat\beta_2 = \frac{Sx_2y\,S'x_1^2 - S'x_1y\,Sx_1x_2}{S'x_1^2\,Sx_2^2 - (Sx_1x_2)^2}. \qquad (5)$$
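As a numerical check on (5), the sketch below (illustrative only; the data-generating values are arbitrary) substitutes the complete-case mean of $x_2$, runs the standard least squares calculation on the 'completed' data, and confirms that the coefficients coincide with the expressions in (5).

```python
import numpy as np

rng = np.random.default_rng(1)

# simulated data with x2 missing for the last r1 - r2 observations
r1, r2 = 60, 40
x1 = rng.normal(size=r1)
x2 = 0.5 * x1 + rng.normal(size=r1)          # x1 and x2 positively related
y = 1.0 + 2.0 * x1 + 0.8 * x2 + rng.normal(size=r1)

x2_filled = x2.copy()
x2_filled[r2:] = x2[:r2].mean()              # substitute the complete-case mean

# least squares on the 'completed' data
X = np.column_stack([np.ones(r1), x1, x2_filled])
slopes_completed = np.linalg.lstsq(X, y, rcond=None)[0][1:]

# the same coefficients from the corrected sums appearing in (5)
def csum(u, v):
    return np.sum((u - u.mean()) * (v - v.mean()))

Sp_x1x1, Sp_x1y = csum(x1, x1), csum(x1, y)     # S' sums (all r1 observations)
S_x2x2 = csum(x2[:r2], x2[:r2])                 # S sums (complete observations)
S_x1x2 = csum(x1[:r2], x2[:r2])
S_x2y = csum(x2[:r2], y[:r2])
den = Sp_x1x1 * S_x2x2 - S_x1x2 ** 2
beta1_hat = (Sp_x1y * S_x2x2 - S_x2y * S_x1x2) / den
beta2_hat = (S_x2y * Sp_x1x1 - Sp_x1y * S_x1x2) / den
assert np.allclose(slopes_completed, [beta1_hat, beta2_hat])
print(beta1_hat, beta2_hat)                     # compare with the true values 2.0 and 0.8
```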

Taking expectations, where the model $E(y) = \alpha + \beta_1 x_1 + \beta_2 x_2$ is assumed, gives

$$E(\hat\beta_1) = \beta_1 + \beta_2\,\frac{Sx_2^2\,(S^*x_1x_2 - Sx_1x_2)}{S'x_1^2\,Sx_2^2 - (Sx_1x_2)^2} \qquad \text{and} \qquad E(\hat\beta_2) = \beta_2\,\frac{S'x_1^2\,Sx_2^2 - Sx_1x_2\,S^*x_1x_2}{S'x_1^2\,Sx_2^2 - (Sx_1x_2)^2}, \qquad (6)$$

where $S^*x_1x_2$ differs from $S'x_1x_2$ in that it contains the true (but unknown) values of $x_2$ instead of the substituted sample means. Thus the expectations (6) are not actually estimable unless further assumptions are made about the x's. However, it is clear that the estimators (5) are biased, the extent of the bias depending on the magnitude of $\beta_2$: $\hat\beta_2$ is biased downwards and $\hat\beta_1$ biased upwards (assuming $x_1$ and $x_2$ are positively related; if they are negatively related, $\hat\beta_1$ is biased downwards). The variances of $\hat\beta_1$ and $\hat\beta_2$, conditional on the x's, are

$$\frac{\sigma^2\,Sx_2^2}{S'x_1^2\,Sx_2^2 - (Sx_1x_2)^2} \qquad \text{and} \qquad \frac{\sigma^2\,S'x_1^2}{S'x_1^2\,Sx_2^2 - (Sx_1x_2)^2} \qquad (7)$$

respectively, and these are smaller than the variances of the ordinary least squares estimators, which are

$$\frac{\sigma^2\,Sx_2^2}{Sx_1^2\,Sx_2^2 - (Sx_1x_2)^2} \qquad \text{and} \qquad \frac{\sigma^2\,Sx_1^2}{Sx_1^2\,Sx_2^2 - (Sx_1x_2)^2}. \qquad (8)$$

The difference will not be appreciable in the case of $\hat\beta_2$ if $Sx_1x_2$ is small. The differences could be very large, however, if the denominators in (8) were close to zero. Excluding this case for the present, and remembering that mean square error is the sum of variance and squared bias, it is evident that the estimators (5) will only have better mean square errors than the least squares estimators if $\beta_2$ is small. As in Section II, it appears that prior knowledge is needed to justify the estimators. There are other estimators for this situation, of course. Another approach is to replace the missing values by

$$\hat x_{2i} = \bar x_2 + (x_{1i} - \bar x_1)\,Sx_1x_2/Sx_1^2, \qquad (9)$$

where $x_{1i}$ is the value of $x_1$ corresponding to the missing value, and $\bar x_1$ and $\bar x_2$ are the sample means over the $r_2$ complete observations.
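Formula (9) is just the complete-case regression prediction of $x_2$ from $x_1$. A minimal sketch of this imputation step (illustrative only; the mask and names are hypothetical) is:

```python
import numpy as np

def impute_x2(x1, x2, observed):
    """Formula (9): replace each missing x2 by
    x2_bar + (x1 - x1_bar) * Sx1x2 / Sx1^2, with all quantities
    computed from the complete observations (the True entries of `observed`)."""
    x1c, x2c = x1[observed], x2[observed]
    slope = (np.sum((x1c - x1c.mean()) * (x2c - x2c.mean()))
             / np.sum((x1c - x1c.mean()) ** 2))
    return np.where(observed, x2, x2c.mean() + slope * (x1 - x1c.mean()))

# example: the last 15 values of x2 are treated as missing
rng = np.random.default_rng(2)
x1 = rng.normal(size=50)
x2 = 0.7 * x1 + rng.normal(size=50)
observed = np.arange(50) < 35
x2_completed = impute_x2(x1, x2, observed)
```

The 'completed' data produced in this way are then analysed by ordinary least squares, which is the estimator considered next.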

Now the standard analysis of the 'completed' data gives the usual least squares estimator for $\beta_2$, $\tilde\beta_2$ say, and

$$\tilde\beta_1 = \frac{S'x_1y\,S''x_2^2 - S''x_2y\,S''x_1x_2}{S'x_1^2\,S''x_2^2 - (S''x_1x_2)^2}$$

for $\beta_1$, where $S''$ denotes summation over the data 'completed' by (9). Conditionally on the x's, this has expectation

$$\beta_1 + \beta_2\,\frac{S^*x_1x_2\,S''x_2^2 - S^*x_2''x_2\,S''x_1x_2}{S'x_1^2\,S''x_2^2 - (S''x_1x_2)^2}, \qquad (10)$$

where $S^*x_1x_2$ and $S^*x_2''x_2$ are the cross-product sums of $x_1$, and of the 'completed' $x_2''$, with the true (but unknown) values of $x_2$. If $x_2$ is itself assumed to have a linear regression on $x_1$, the further expectation over $x_2$ is $\beta_1$. So, with this frequently made assumption, (10) is unbiased, unlike the estimator $\hat\beta_1$ given by (5). The mean square error of (10) is easily shown to be smaller than that of $\hat\beta_1$, except for small $\beta_2$ and for the circumstances to be discussed next. The estimator (10) can be improved on by various modifications, for example by starting with (9) and conducting a weighted instead of an unweighted analysis. The approach of Dagenais (1973) and Methods 4 and 5 of Beale and Little (1975) are of this type. But there is a situation where the mean square error properties of the simple method of substituting means are clearly superior to ordinary least squares on the complete observations, and even superior to the more sophisticated methods mentioned. If

$$Sx_1^2\,Sx_2^2 - (Sx_1x_2)^2 \qquad (11)$$

is close to zero, that is, if there is a severe multicollinearity problem, the variances given by (8) will be very large. Those given by (7) will be much smaller, and this reduction will more than offset the squared bias, provided $\beta_2$ is bounded. The simple method is also superior to (10), because the variance of that estimator may be shown to contain (11) as a denominator. It is easy to see why any approach based on predicting missing values from the values of another explanatory variable will fail in an extreme multicollinearity situation. Formula (9) sets up an exact linear relationship between the explanatory variables for the incomplete observations, and so makes the original multicollinearity problem even more severe. This superiority, in a mean square error sense, of the method of substituting means is another example of the fairly general finding that, in regression situations, biased estimators are most effective when extreme multicollinearity is present in the data. In summary then, although the method of substituting means for missing explanatory variables is not generally sound, it could outperform other approaches, at least in terms of mean square error, in studies where the data set is close to multicollinear.
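To see the multicollinearity point numerically, the sketch below (illustrative only; the degree of collinearity and the coefficient values are arbitrary choices) generates a nearly collinear pair $x_1$, $x_2$, deletes some $x_2$ values, and compares the Monte Carlo mean square errors of complete-case least squares and of mean substitution.

```python
import numpy as np

rng = np.random.default_rng(3)
r1, r2 = 80, 50
beta1, beta2 = 2.0, 0.3                       # beta2 kept modest, as the argument requires

def slopes(x1, x2, y):
    """OLS slope coefficients from regressing y on (1, x1, x2)."""
    X = np.column_stack([np.ones(len(y)), x1, x2])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

complete_case, mean_subst = [], []
for _ in range(2000):
    x1 = rng.normal(size=r1)
    x2 = x1 + 0.05 * rng.normal(size=r1)      # severe multicollinearity
    y = 1.0 + beta1 * x1 + beta2 * x2 + rng.normal(size=r1)
    complete_case.append(slopes(x1[:r2], x2[:r2], y[:r2]))   # drop incomplete rows
    x2_filled = x2.copy()
    x2_filled[r2:] = x2[:r2].mean()                          # substitute the mean
    mean_subst.append(slopes(x1, x2_filled, y))

truth = np.array([beta1, beta2])
print("MSE, complete-case OLS:", ((np.array(complete_case) - truth) ** 2).mean(axis=0))
print("MSE, mean substitution:", ((np.array(mean_subst) - truth) ** 2).mean(axis=0))
```

With $x_2$ nearly an exact linear function of $x_1$, the variance explosion in the complete-case estimates dominates the squared bias introduced by the substitution, so the simple method comes out ahead, as argued above.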

REFERENCES

AFIFI, A. A. and R. M. ELASHOFF, 1966. "Missing Observations in Multivariate Statistics I: Review of the Literature", Journal of the American Statistical Association, Vol. 61.
AFIFI, A. A. and R. M. ELASHOFF, 1967. "Missing Observations in Multivariate Statistics II: Point Estimation in Simple Linear Regression", Journal of the American Statistical Association, Vol. 62.
BEALE, E. M. L. and R. J. A. LITTLE, 1975. "Missing Observations in Multivariate Analysis", Journal of the Royal Statistical Society, Series B, Vol. 37.
BUCK, S. F., 1960. "A Method of Estimation of Missing Values in Multivariate Data Suitable for Use with an Electronic Computer", Journal of the Royal Statistical Society, Series B, Vol. 22.
DAGENAIS, M. G., 1973. "The Use of Incomplete Observations in Multiple Regression Analysis", Journal of Econometrics, Vol. 1.
HAITOVSKY, Y., 1968. "Missing Data in Regression Analysis", Journal of the Royal Statistical Society, Series B, Vol. 30.
HOCKING, R. R. and W. B. SMITH, 1968. "Estimation of Parameters in the Multivariate Normal Distribution with Missing Observations", Journal of the American Statistical Association, Vol. 63.
HOCKING, R. R. and W. B. SMITH, 1972. "Optimum Incomplete Multinormal Samples", Technometrics, Vol. 14.
KMENTA, J., 1971. Elements of Econometrics, Chapter 9, New York: Macmillan.
MAYER, L. S. and T. A. WILKE, 1973. "On Biased Estimation in Linear Models", Technometrics, Vol. 15.
