Economic and Social Review, Vol. 9, No. 4

Substituting Means for Missing Observations in Regression

D. CONNIFFE
An Foras Taluntais

I INTRODUCTION

Econometric textbooks usually discuss the procedures to be adopted when missing values occur in data intended for regression analysis. There is a large statistical literature on this topic. A variety of methods, differing both in their computational complexity and in the assumptions required, have been suggested, compared or simulated by, among others, Buck (1960), Afifi and Elashoff (1966), Haitovsky (1968), and Hocking and Smith (1968). More recently, there have been the papers of Dagenais (1973), and Beale and Little (1975). Yet textbooks sometimes give a not dishonourable mention to the simplest ad-hoc procedure of all: that of replacing missing values by sample means. Apart from the obvious computational ease of the procedure, advantages over ordinary least squares (on the complete observations only) are claimed for it in some circumstances. To quote Kmenta (1971): 'One redeeming feature of the estimators... is the fact that when the correlation between X and Y is low, the mean square error of these estimators is less than that of ordinary least squares'.

This note examines why the simple method of substituting means seems to give good results in certain circumstances. The objective is not to promote this simple approach instead of more sophisticated methods (no one has ever claimed that it could be generally superior) but to explain when and why its mean square error properties are superior to ordinary least squares or even to other methods.
II MISSING VALUES OF THE DEPENDENT VARIABLE

Consider the case of a simple regression when there are $r_1$ observations on $x$ and $r_2$ ($< r_1$) complete pairs of both $x$ and $y$. The estimator of the regression coefficient obtained by replacing missing $y$ values by the sample mean and performing the usual calculations is

$$\hat{\beta} = \frac{S'_{xy}}{S'_{x^2}} = \frac{S_{xy}}{S'_{x^2}} = \frac{S_{x^2}}{S'_{x^2}}\,\beta^* \tag{1}$$

where $S'$ implies summation over all $r_1$ pairs, including those in which the sample mean has been substituted, and $\beta^*$ is the usual estimator from the $r_2$ complete observations.

Afifi and Elashoff (1967) gave comparisons of the mean square error of $\hat{\beta}$ relative to that of $\beta^*$, having assumed that $y_i = \alpha + \beta x_i + e_i$, where $E(e_i) = 0$, $V(e_i) = \sigma^2(1-\rho^2)$ and $x_i$ is $N(\mu_x, \sigma_x^2)$. Using their results, the mean square error of $\hat{\beta}$ is:

$$\frac{(r_2-1)\,\sigma^2(1-\rho^2)}{(r_1-3)(r_1-1)\,\sigma_x^2} + \beta^2\,\frac{(r_1-r_2)\{2 + (r_1-r_2)\}}{(r_1-1)(r_1+1)} \tag{2}$$

while that of $\beta^*$ is

$$\frac{\sigma^2(1-\rho^2)}{(r_2-3)\,\sigma_x^2} \tag{3}$$

The first term of (2) will clearly be much smaller than (3) provided $r_1$ is considerably greater than $r_2$. The second term of (2) depends on the true coefficient $\beta$ and if this is small (2) can be less than (3). If $\beta$ is not small enough (2) will exceed (3), so the use of $\hat{\beta}$ instead of $\beta^*$ implies some degree of prior knowledge.

But, if $\beta$ is small, has the reduction in MSE resulted from information contained in the extra incomplete observations? This is a valid question because Hocking and Smith (1972) have shown that if $y$ and $x$ are bivariate Normal (or multivariate Normal if $x$ is a vector of explanatory variables) and $y$ values are missing from some observations, then the maximum likelihood estimate of $\beta$ is just $\beta^*$. (This does not apply to the intercept term.) In formula (1), $S_{x^2}$ is always less than $S'_{x^2}$, so $\hat{\beta}$ is obtained from $\beta^*$ by 'shrinking' it. Shrinkage estimators form a class of biased estimators obtained by multiplying least squares estimators by quantities less than unity.
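The shrinkage identity in (1), and the way the comparison between the two estimators depends on the size of the true coefficient, can be checked numerically. The sketch below is an illustration only and is not part of the original analysis: it assumes NumPy, and every function name, sample size and parameter value in it is a hypothetical choice made for the example.

```python
import numpy as np


def corrected_ss(u, v):
    """Sum of products of deviations of u and v from their own means."""
    return np.sum((u - u.mean()) * (v - v.mean()))


def slope_complete_cases(x, y, observed):
    """Least squares slope from the complete pairs only (beta* in the text)."""
    xc, yc = x[observed], y[observed]
    return corrected_ss(xc, yc) / corrected_ss(xc, xc)


def slope_mean_substituted(x, y, observed):
    """Least squares slope after replacing each missing y by the mean of the observed y."""
    y_filled = np.where(observed, y, y[observed].mean())
    return corrected_ss(x, y_filled) / corrected_ss(x, x)


def shrinkage_factor(x, observed):
    """Ratio S / S': sum of squares of x over complete pairs divided by that over all pairs."""
    xc = x[observed]
    return corrected_ss(xc, xc) / corrected_ss(x, x)


def simulate_mse(beta, r1=40, r2=20, reps=2000, seed=0):
    """Monte Carlo mean square errors of (mean-substituted, complete-case) slopes.

    Assumed set-up for illustration: x is standard Normal, the error has unit
    variance, and y is treated as missing for the last r1 - r2 observations.
    """
    rng = np.random.default_rng(seed)
    observed = np.arange(r1) < r2
    err_sub, err_cc = [], []
    for _ in range(reps):
        x = rng.normal(size=r1)
        y = 1.0 + beta * x + rng.normal(size=r1)
        err_sub.append((slope_mean_substituted(x, y, observed) - beta) ** 2)
        err_cc.append((slope_complete_cases(x, y, observed) - beta) ** 2)
    return float(np.mean(err_sub)), float(np.mean(err_cc))


if __name__ == "__main__":
    for beta in (0.0, 5.0):
        mse_sub, mse_cc = simulate_mse(beta)
        print(f"beta={beta}: mean substitution {mse_sub:.4f}, complete cases {mse_cc:.4f}")
```

When the true coefficient is zero, the mean-substituted slope is the complete-case slope multiplied by a factor no greater than one, so it cannot be further from the truth in any sample. With a large coefficient the squared bias from the shrinkage dominates and the ranking reverses.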
The properties of such shrinkage estimators have been discussed in the statistical literature; for example, by Mayer and Wilke (1973). The optimal (in a minimum mean square error sense) shrinkage factor is

$$\frac{\beta^2 S_{x^2}}{\beta^2 S_{x^2} + \sigma^2(1-\rho^2)} \tag{4}$$

so, provided $\beta^2/\sigma^2$ is bounded, an improvement over least squares is possible. The shrinkage estimator defined by (4) is also superior to (1), although it uses only the complete observations. Since formula (4) involves unknown parameters that must be estimated from the data or from (probably vague) prior knowledge, the optimality may not be attainable in practice.

But how plausible is $S_{x^2}/S'_{x^2}$ as a shrinkage factor? It does not involve $\sigma^2$, $\rho$ or $\beta$, and its magnitude depends on the number of incomplete observations. The degree of shrinkage increases with the number of extra incomplete observations. But from (4) the optimal shrinkage increases as $\beta^2$ decreases relative to $\sigma^2(1-\rho^2)$. Unless it could be postulated that missing values will increase in frequency in situations where the coefficient is small, $S_{x^2}/S'_{x^2}$ is inappropriate as a shrinkage factor. So, even in cases where the choice of a 'good' shrinkage factor is far from clear, it can be stated, somewhat negatively, that the method of substituting the mean for missing $y$ values is unlikely to help.

III MISSING VALUES OF AN EXPLANATORY VARIABLE

The simplest case, the dependent variable measured on $r_1$ observations and one explanatory variable with missing values for $r_1 - r_2$ of these, is a trivial one because substituting the sample mean just gives the ordinary estimator. Suppose then that there are two explanatory variables, $x_1$ measured on all $r_1$ observations and $x_2$, which is missing for $r_1 - r_2$ of them. It is obvious that:

$$S'_{x_2^2} = S_{x_2^2}, \qquad S'_{x_1x_2} = S_{x_1x_2} \qquad \text{and} \qquad S'_{x_2y} = S_{x_2y}$$

and then

$$\begin{aligned}
\hat{\beta}_1 &= \frac{S'_{x_1y}\,S_{x_2^2} - S_{x_2y}\,S_{x_1x_2}}{S'_{x_1^2}\,S_{x_2^2} - (S_{x_1x_2})^2} \\
\hat{\beta}_2 &= \frac{S'_{x_1^2}\,S_{x_2y} - S'_{x_1y}\,S_{x_1x_2}}{S'_{x_1^2}\,S_{x_2^2} - (S_{x_1x_2})^2}
\end{aligned} \tag{5}$$
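The statement that mean substitution leaves the sums involving $x_2$ unchanged, and hence the closed form (5), can be checked against a direct least squares fit on the filled-in data. The following is a minimal sketch under stated assumptions: it uses NumPy, the helper names are hypothetical, and the data are simulated purely for illustration.

```python
import numpy as np


def corrected_sp(u, v):
    """Sum of products of deviations of u and v from their own means."""
    return np.sum((u - u.mean()) * (v - v.mean()))


def closed_form_estimates(x1, x2, y, observed):
    """Slopes from the closed form: primed sums use all r1 rows, unprimed the complete rows."""
    c = observed
    s11_all = corrected_sp(x1, x1)        # S' for x1 squared, over all observations
    s1y_all = corrected_sp(x1, y)         # S' for x1 and y, over all observations
    s22 = corrected_sp(x2[c], x2[c])      # complete observations only
    s12 = corrected_sp(x1[c], x2[c])
    s2y = corrected_sp(x2[c], y[c])
    denom = s11_all * s22 - s12 ** 2
    beta1 = (s1y_all * s22 - s2y * s12) / denom
    beta2 = (s11_all * s2y - s1y_all * s12) / denom
    return beta1, beta2


def least_squares_on_filled_data(x1, x2, y, observed):
    """Ordinary least squares after replacing missing x2 by the mean of the observed x2."""
    x2_filled = np.where(observed, x2, x2[observed].mean())
    design = np.column_stack([np.ones_like(x1), x1, x2_filled])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1], coef[2]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r1, r2 = 30, 18                        # hypothetical sample sizes
    x1 = rng.normal(size=r1)
    x2 = 0.6 * x1 + rng.normal(size=r1)
    y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=r1)
    observed = np.arange(r1) < r2          # x2 treated as missing for the last rows
    print(closed_form_estimates(x1, x2, y, observed))
    print(least_squares_on_filled_data(x1, x2, y, observed))
```

The two routes agree up to rounding because the substituted rows contribute zero deviations in $x_2$, so they drop out of every sum of squares or products that involves $x_2$.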
Taking expectations, where the model $E(y) = \alpha + \beta_1 x_1 + \beta_2 x_2$ is assumed, gives

$$\begin{aligned}
E(\hat{\beta}_1) &= \beta_1 + \beta_2\,\frac{S_{x_2^2}\,(S^*_{x_1x_2} - S_{x_1x_2})}{S'_{x_1^2}\,S_{x_2^2} - (S_{x_1x_2})^2} \\
E(\hat{\beta}_2) &= \beta_2\,\frac{S'_{x_1^2}\,S_{x_2^2} - S_{x_1x_2}\,S^*_{x_1x_2}}{S'_{x_1^2}\,S_{x_2^2} - (S_{x_1x_2})^2}
\end{aligned} \tag{6}$$

$S^*_{x_1x_2}$ differs from $S'_{x_1x_2}$ in that it contains the true (but unknown) values instead of the sample means. Thus the expectations (6) are not actually estimable unless further assumptions are made about the $x$'s. However, it is clear that the estimators (5) are biased, the extent of the bias depending on the magnitude of $\beta_2$. ($\hat{\beta}_2$ is biased downwards and $\hat{\beta}_1$ biased upwards, assuming $x_1$ and $x_2$ positively related; if negatively related $\hat{\beta}_1$ is biased downwards.) The variances of $\hat{\beta}_1$ and $\hat{\beta}_2$, conditional on the $x$'s, are

$$\frac{\sigma^2 S_{x_2^2}}{S'_{x_1^2}\,S_{x_2^2} - (S_{x_1x_2})^2} \qquad \text{and} \qquad \frac{\sigma^2 S'_{x_1^2}}{S'_{x_1^2}\,S_{x_2^2} - (S_{x_1x_2})^2} \tag{7}$$

respectively, and these are smaller than the variances of the ordinary least squares estimators, which are

$$\frac{\sigma^2 S_{x_2^2}}{S_{x_1^2}\,S_{x_2^2} - (S_{x_1x_2})^2} \qquad \text{and} \qquad \frac{\sigma^2 S_{x_1^2}}{S_{x_1^2}\,S_{x_2^2} - (S_{x_1x_2})^2} \tag{8}$$

The difference will not be appreciable in the case of $\hat{\beta}_2$ if $S_{x_1x_2}$ is small. The differences could be very large, however, if the denominators in (8) were close to zero. Excluding this case for the present and remembering that mean square error is the sum of variance and squared bias, it is evident that the estimators (5) will only have better mean square errors than the least squares estimators if $\beta_2$ is small. As in Section II, it appears that prior knowledge is needed to justify the estimators.

There are other estimators for this situation, of course. Another approach is to replace the missing values by

$$\hat{x}_{2i} = \bar{x}_2 + (x_{1i} - \bar{x}_1)\,S_{x_1x_2}/S_{x_1^2} \tag{9}$$

where $x_{1i}$ is the value of $x_1$ corresponding to the missing value and $\bar{x}_1$ is the sample mean over the $r_2$ complete observations. Now the standard analysis of the 'completed' data gives the usual least squares estimator, $\beta_2^*$ say, for $\beta_2$ and

$$\frac{S'_{x_1y}}{S'_{x_1^2}} - \beta_2^*\,\frac{S_{x_1x_2}}{S_{x_1^2}} \tag{10}$$

for $\beta_1$. Conditionally on the $x$'s, this has expectation

$$\beta_1 + \left(\frac{S^*_{x_1x_2}}{S'_{x_1^2}} - \frac{S_{x_1x_2}}{S_{x_1^2}}\right)\beta_2$$

and if $x_2$ is itself assumed to have a linear regression on $x_1$, the further expectation over $x_2$ is $\beta_1$. So with this frequently made assumption, (10) is unbiased, unlike the estimator $\hat{\beta}_1$ given by (5). The mean square error of (10) is easily shown to be smaller than that of $\hat{\beta}_1$, except for small $\beta_2$ and for the circumstances to be discussed next. The estimator (10) can be improved on by various modifications; for example, starting with (9) and conducting a weighted instead of unweighted analysis. The approach of Dagenais (1973) and Methods 4 and 5 of Beale and Little (1975) are of this type.

But there is a situation where the mean square error properties of the simple method of substituting means are clearly superior to ordinary least squares on the complete observations and even superior to the more sophisticated methods mentioned. If

$$S_{x_1^2}\,S_{x_2^2} - (S_{x_1x_2})^2 \tag{11}$$

is close to zero, that is, if there is a severe multicollinearity problem, the variances given by (8) will be very large. Those given by (7) will be much smaller and this reduction will more than offset the squared bias, provided $\beta_2$ is bounded. The simple method is also superior to (10) because the variance of that estimator may be shown to contain (11) as a denominator. It is easy to see why any approach based on predicting missing values from values of another explanatory variable will fail in an extreme multicollinearity situation. Formula (9) sets up an exact linear relationship between the explanatory variables for the incomplete observations and so makes the original multicollinearity problem even more severe. This superiority, in a mean square error sense, of the method of substituting means is another example of the fairly general finding that, in regression situations, biased estimators are most effective when extreme multicollinearity is present in the data.
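The comparison under near-collinearity can be explored with a small simulation. The sketch below is an illustration under assumptions that are not in the original text: it uses NumPy, generates $x_2$ as $x_1$ plus a small amount of noise, sets both true coefficients to one, and compares the mean square error of the estimate of $\beta_1$ for complete-case least squares, mean substitution, and the regression fill-in of formula (9). All function names and parameter values are hypothetical.

```python
import numpy as np


def fit_slopes(x1, x2, y):
    """Least squares slopes (coefficient on x1, coefficient on x2) with an intercept."""
    design = np.column_stack([np.ones_like(x1), x1, x2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]


def fill_with_mean(x2, observed):
    """Replace missing x2 by the mean of the observed x2."""
    return np.where(observed, x2, x2[observed].mean())


def fill_by_regression(x1, x2, observed):
    """Replace missing x2 by its prediction from x1, fitted on the complete rows."""
    x1c, x2c = x1[observed], x2[observed]
    slope = np.sum((x1c - x1c.mean()) * (x2c - x2c.mean())) / np.sum((x1c - x1c.mean()) ** 2)
    predicted = x2c.mean() + (x1 - x1c.mean()) * slope
    return np.where(observed, x2, predicted)


def mse_of_beta1(noise_sd, beta=(1.0, 1.0), r1=40, r2=20, reps=500, seed=0):
    """Monte Carlo mean square error of the x1 coefficient for three methods.

    A small noise_sd makes x1 and x2 nearly collinear; x2 is treated as
    missing for the last r1 - r2 rows. Error variance is fixed at one.
    """
    rng = np.random.default_rng(seed)
    observed = np.arange(r1) < r2
    sq_err = {"complete_cases": 0.0, "mean_substitution": 0.0, "regression_imputation": 0.0}
    for _ in range(reps):
        x1 = rng.normal(size=r1)
        x2 = x1 + rng.normal(scale=noise_sd, size=r1)
        y = beta[0] * x1 + beta[1] * x2 + rng.normal(size=r1)
        estimates = {
            "complete_cases": fit_slopes(x1[observed], x2[observed], y[observed]),
            "mean_substitution": fit_slopes(x1, fill_with_mean(x2, observed), y),
            "regression_imputation": fit_slopes(x1, fill_by_regression(x1, x2, observed), y),
        }
        for name, slopes in estimates.items():
            sq_err[name] += (slopes[0] - beta[0]) ** 2 / reps
    return sq_err


if __name__ == "__main__":
    for sd in (0.05, 1.0):
        print(sd, mse_of_beta1(sd))
```

Varying `noise_sd` moves the design between the near-collinear case and a well-conditioned one, which makes it possible to see how the ranking of the three methods depends on the quantity in (11). The regression fill-in reproduces the complete-case estimate of $\beta_2$ exactly, since the filled-in rows lie on the fitted line of $x_2$ on $x_1$ and so add nothing to the residual variation in $x_2$.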
In summary then, although the method of substituting means for missing explanatory variables is not generally sound, it could outperform other approaches, at least in terms of mean square error, in studies where the data-set is close to multicollinear.
REFERENCES

AFIFI, A. A. and R. M. ELASHOFF, 1966. "Missing Observations in Multivariate Statistics I. Review of the Literature." Journal of American Statistical Association, 61.
AFIFI, A. A. and R. M. ELASHOFF, 1967. "Missing Observations in Multivariate Statistics II. Point Estimation in Simple Linear Regression." Journal of American Statistical Association, 62.
BEALE, E. M. L. and R. J. A. LITTLE, 1975. "Missing Observations in Multivariate Analysis." Journal Royal Statistical Society B, 37.
BUCK, S. F., 1960. "A Method of Estimation of Missing Values in Multivariate Data Suitable for Use with an Electronic Computer." Journal Royal Statistical Society B, 22.
DAGENAIS, M. G., 1973. "The Use of Incomplete Observations in Multiple Regression Analysis." Journal of Econometrics, 1.
HAITOVSKY, Y., 1968. "Missing Data in Regression Analysis." Journal Royal Statistical Society B, 30.
HOCKING, R. R. and W. B. SMITH, 1968. "Estimation of Parameters in the Multivariate Normal Distribution with Missing Observations." Journal of American Statistical Association, 63.
HOCKING, R. R. and W. B. SMITH, 1972. "Optimum Incomplete Multinomial Samples." Technometrics, 14.
KMENTA, J., 1971. Elements of Econometrics, Chapter 9, New York: Macmillan.
MAYER, L. S. and T. A. WILKE, 1973. "On Biased Estimation in Linear Models." Technometrics, 15.
More informationLecture 3: Multiple Regression
Lecture 3: Multiple Regression R.G. Pierse 1 The General Linear Model Suppose that we have k explanatory variables Y i = β 1 + β X i + β 3 X 3i + + β k X ki + u i, i = 1,, n (1.1) or Y i = β j X ji + u
More informationThe following formulas related to this topic are provided on the formula sheet:
Student Notes Prep Session Topic: Exploring Content The AP Statistics topic outline contains a long list of items in the category titled Exploring Data. Section D topics will be reviewed in this session.
More informationLinear Models and Estimation by Least Squares
Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:
More informationFORECASTING PROCEDURES FOR SUCCESS
FORECASTING PROCEDURES FOR SUCCESS Suzy V. Landram, University of Northern Colorado Frank G. Landram, West Texas A&M University Chris Furner, West Texas A&M University ABSTRACT This study brings an awareness
More informationRidge Regression Revisited
Ridge Regression Revisited Paul M.C. de Boer Christian M. Hafner Econometric Institute Report EI 2005-29 In general ridge (GR) regression p ridge parameters have to be determined, whereas simple ridge
More informationChapter 1. Linear Regression with One Predictor Variable
Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical
More informationAn overview of applied econometrics
An overview of applied econometrics Jo Thori Lind September 4, 2011 1 Introduction This note is intended as a brief overview of what is necessary to read and understand journal articles with empirical
More informationFinal Exam - Solutions
Ecn 102 - Analysis of Economic Data University of California - Davis March 17, 2010 Instructor: John Parman Final Exam - Solutions You have until 12:30pm to complete this exam. Please remember to put your
More information2 Naïve Methods. 2.1 Complete or available case analysis
2 Naïve Methods Before discussing methods for taking account of missingness when the missingness pattern can be assumed to be MAR in the next three chapters, we review some simple methods for handling
More informationNonresponse weighting adjustment using estimated response probability
Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006 Introduction Nonresponse Unit nonresponse Item nonresponse Basic strategy
More information13.7 ANOTHER TEST FOR TREND: KENDALL S TAU
13.7 ANOTHER TEST FOR TREND: KENDALL S TAU In 1969 the U.S. government instituted a draft lottery for choosing young men to be drafted into the military. Numbers from 1 to 366 were randomly assigned to
More informationCorrelation and Regression
Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class
More informationRegression Models - Introduction
Regression Models - Introduction In regression models, two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent variable,
More informationChapter 10: Multiple Regression Analysis Introduction
Chapter 10: Multiple Regression Analysis Introduction Chapter 10 Outline Simple versus Multiple Regression Analysis Goal of Multiple Regression Analysis A One-Tailed Test: Downward Sloping Demand Theory
More informationImproved Ridge Estimator in Linear Regression with Multicollinearity, Heteroscedastic Errors and Outliers
Journal of Modern Applied Statistical Methods Volume 15 Issue 2 Article 23 11-1-2016 Improved Ridge Estimator in Linear Regression with Multicollinearity, Heteroscedastic Errors and Outliers Ashok Vithoba
More informationSimple Linear Regression Analysis
LINEAR REGRESSION ANALYSIS MODULE II Lecture - 6 Simple Linear Regression Analysis Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Prediction of values of study
More informationNew York University Department of Economics. Applied Statistics and Econometrics G Spring 2013
New York University Department of Economics Applied Statistics and Econometrics G31.1102 Spring 2013 Text: Econometric Analysis, 7 h Edition, by William Greene (Prentice Hall) Optional: A Guide to Modern
More informationLongitudinal and Panel Data: Analysis and Applications for the Social Sciences. Table of Contents
Longitudinal and Panel Data Preface / i Longitudinal and Panel Data: Analysis and Applications for the Social Sciences Table of Contents August, 2003 Table of Contents Preface i vi 1. Introduction 1.1
More information2 Prediction and Analysis of Variance
2 Prediction and Analysis of Variance Reading: Chapters and 2 of Kennedy A Guide to Econometrics Achen, Christopher H. Interpreting and Using Regression (London: Sage, 982). Chapter 4 of Andy Field, Discovering
More informationEconometrics. 7) Endogeneity
30C00200 Econometrics 7) Endogeneity Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Common types of endogeneity Simultaneity Omitted variables Measurement errors
More information