III. Econometric Methodology Regression Analysis

Econ107 Applied Econometrics
Topic 1: An Overview of Regression Analysis (Studenmund, Chapter 1)

I. The Nature and Scope of Econometrics

Lots of definitions of econometrics.

Nobel Prize Committee.
Paul Samuelson, et al.: "Econometrics may be defined as quantitative analysis of actual economic phenomena."
Goldberger: "... application of economic theory, mathematics and statistical inference to the analysis of economic phenomena."
(Joke) E.E. Leamer: "There are two things you don't want to see in the making: sausage and econometric research."

II. Major Uses of Econometrics

1. Describing economic reality
2. Testing hypotheses about economic theory
3. Forecasting future economic activity

III. Econometric Methodology: Regression Analysis

An important methodology in econometrics is regression analysis, which typically follows these steps. Use a famous example to illustrate.

1. State the hypotheses. Keynes in the General Theory said a $1 increase in income will lead to less than a $1 increase in overall consumption.

We want to test this hypothesis that the MPC < 1.

2. Specify the mathematical model of the theory. Keynes didn't specify the exact nature of the relationship, but we might suggest a simple linear relationship.

\[ C = \beta_0 + \beta_1 DI, \qquad 0 < \beta_1 < 1 \]

where C = aggregate consumption and DI = aggregate disposable income.

3. Specify the econometric model. This purely mathematical model is uninteresting to the econometrician. It assumes an exact or deterministic relationship between C and DI. We re-write the equation with a disturbance or error term.

\[ C = \beta_0 + \beta_1 DI + \varepsilon \]

This is now an econometric model, or more precisely a linear regression model.

4. Obtain the data. The only way to estimate the parameters of interest in this model is to obtain the necessary data. The data source could involve time series, cross-sectional or panel data.

Time series data are collected over time for the same country or other single aggregate economic unit (e.g., aggregate C and DI could be obtained for Singapore from 1950-2000). In this case, we'd normally re-write the equation with a t subscript on the variables and disturbance term to denote time.

\[ C_t = \beta_0 + \beta_1 DI_t + \varepsilon_t \]

Cross-sectional data are collected for a sample of individuals, households, firms or other disaggregate economic entities at a point in time (e.g., C and DI could be obtained for a sample of 1,000 Singapore families during 2000). In this case, we'd normally re-write the equation with an i subscript on the variables and disturbance term to denote the individual.

\[ C_i = \beta_0 + \beta_1 DI_i + \varepsilon_i \]

Finally, panel data contain elements of both time series and cross-sectional data (e.g., C and DI could be obtained for all countries in the OECD during the period 1950-2000). Note that we have variation across countries at any single point in time, as well as variation across time. In this case, we'd normally re-write the equation with both an i and a t subscript on the variables and disturbance term to denote country and time.

\[ C_{it} = \beta_0 + \beta_1 DI_{it} + \varepsilon_{it} \]

Time series or cross-sectional data could be plotted as a scatter diagram of C against DI.

5. Estimate the parameters in the econometric model. Now it is time to estimate the coefficients in the model. The basic idea is to come up with a line that best fits the data points. Imagine that this regression analysis yields the following consumption function.

\[ \hat{C} = 336.9 + 0.820\,DI \]

These are the estimates of the 2 coefficients. The hat on C indicates that this is an estimated consumption function or regression model.

6. Test the hypothesis. Recall that we wanted to test Keynes' hypothesis that the MPC was between zero and 1. The estimate of 0.820 looks reasonable, but we are unsure whether there is any statistical evidence that it is below 1.

7. Forecast or predict economic behaviour. One of the other uses of this model is forecasting or predicting future economic behaviour. To predict C, however, we need to know future values of DI. Suppose you know that DI is going to be $65,000 (millions).

\[ \hat{C} = 336.9 + 0.820(65{,}000) = 53{,}636.9 \]

This also allows you to predict savings of $11,363.1 (millions). This is just the difference between DI and C.

8. Use the model for policy purposes. The model can also be used for control purposes. Suppose that C of 53.6 billion is insufficient to maintain full employment: not enough spending by households. Government could consider increasing DI through tax cuts to achieve a higher target. Suppose 62 billion is needed.

\[ 62{,}000 = 336.9 + 0.820\,DI \quad\Rightarrow\quad DI = 75{,}198.9 \]

Thus, we need to cut taxes by just over $10 billion from forecasted levels.

IV. Types of Econometrics and Names of Variables in Regression

Split into theoretical and applied fields. We end up straddling these 2 approaches.

Theoretical econometrics concerns the development of basic estimation approaches, properties of estimators, etc. More closely related to mathematical statistics (e.g., proofs, axioms, ...).

Applied econometrics is built on this theoretical foundation. It applies estimation techniques to various areas of economic enquiry. Examples: Where to open a new restaurant? How much advertising? Should we fix the target interest rate? How many hours studying on Econ107?

Academics, private and government sectors have increasingly used econometrics.
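The arithmetic in steps 7 and 8 can be checked with a short Python sketch. The function names `predict_consumption` and `required_income` are illustrative and not part of the notes. The coefficients are the estimates quoted in step 5, and the sketch follows the notes in treating a tax cut as a one-for-one increase in DI.

```python
# Estimated consumption function from the notes: C_hat = 336.9 + 0.820 * DI
# (all figures in millions of dollars).
INTERCEPT = 336.9
MPC = 0.820  # estimated marginal propensity to consume


def predict_consumption(di):
    """Predicted consumption for a given level of disposable income."""
    return INTERCEPT + MPC * di


def required_income(target_c):
    """Disposable income needed to reach a target level of consumption."""
    return (target_c - INTERCEPT) / MPC


forecast_di = 65_000.0
c_hat = predict_consumption(forecast_di)  # step 7: forecast of C
savings = forecast_di - c_hat             # savings = DI - C
target_di = required_income(62_000.0)     # step 8: DI needed for C = 62,000
tax_cut = target_di - forecast_di         # extra DI relative to the forecast

print(f"Predicted C: {c_hat:,.1f}")
print(f"Savings:     {savings:,.1f}")
print(f"Required DI: {target_di:,.1f}")
print(f"Tax cut:     {tax_cut:,.1f}")
```

The four printed figures should agree with the notes: predicted consumption of 53,636.9, savings of 11,363.1, required DI of 75,198.9, and a tax cut of about 10,198.9 (just over $10 billion).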

Regression analysis is the study of the relationship between a Dependent Variable and one or more Independent or Explanatory Variables. In the linear regression model (or true regression line or population regression function)

\[ Y_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_K X_{Ki} + \varepsilon_i \]

\(Y_i\) is called the dependent or left-hand-side variable or regressand and is random;
\(X_{ki}\) (\(k = 1, \ldots, K\)) is called an independent or explanatory or right-hand-side variable or regressor; it can be fixed or random;
\(\varepsilon_i\) is called the error or disturbance term and is random;
the \(\beta\)s are called regression coefficients; they are unknown and fixed;
\(\beta_0\) is the intercept coefficient;
\(\beta_k\) (\(k = 1, \ldots, K\)) are the slope coefficients.

The meaning of \(\beta_k\) is the impact of a one unit increase in \(X_k\) on Y, holding constant the other independent variables.

The estimated regression line (or sample regression function) is written as

\[ \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \cdots + \hat{\beta}_K X_{Ki} \]

\(\hat{Y}_i\) is called the estimated or fitted value of \(Y_i\); \(\hat{\beta}_k\) (\(k = 0, 1, \ldots, K\)) is called an estimated regression coefficient. Define \(e_i = Y_i - \hat{Y}_i\) and call \(e_i\) the residual.

When K = 1, the regression model is a Simple Linear Regression (SLR) model. When K > 1, the regression model is a Multiple Linear Regression (MLR) model.

V. Statistical vs. Deterministic Relationships

Regression analysis is concerned with a Statistical, not a Functional or Deterministic, dependence among variables. In statistical relationships, the variables are Random or Stochastic.

VI. Regression vs. Causation

Although regression analysis deals with the relationship of one variable to other variables, it doesn't necessarily imply causation. A causal relationship must come from outside of statistics. Economic theory is supposed to provide the compelling evidence of causation.

VII. The True (or Population) Regression Function (PRF)

Suppose we have a small community of 12 families. We're interested in studying the relationship between their weekly disposable income (X) and expenditure on food (Y). We want to predict the population mean of food expenditures, given some level of family income. The 12 families can be grouped into four income groups. Each family within a group has the same disposable income. This is the entire population, not a sample.

Disposable Income (X)   Individual Food Expenditures (Y)   Average Food Expenditures
250                     78.00, 88.50, 96.00                87.50
300                     77.50, 89.00, 96.50, 109.00        93.00
350                     90.50, 106.50                      98.50
400                     99.00, 103.00, 110.00              104.00

Plot these data points on a diagram. This is often known as a Scatter Diagram. The solid dots are the actual observations. Now the Conditional Mean or Conditional Expectation is

\[ E(Y \mid X = X_i) \]

The circles are the conditional means. Clearly, food expenditures on average increase with disposable income. This can be seen even more clearly by connecting these conditional means with a straight line. This is the True (or Population) Regression Line. Note that it could also be a True (or Population) Regression Curve.
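The conditional means in the table can be reproduced directly from the individual observations. The following is a minimal Python sketch; the variable names `population` and `conditional_means` are illustrative.

```python
# Population from the notes: 12 families grouped by weekly disposable
# income (X), with each family's weekly food expenditure (Y).
population = {
    250: [78.00, 88.50, 96.00],
    300: [77.50, 89.00, 96.50, 109.00],
    350: [90.50, 106.50],
    400: [99.00, 103.00, 110.00],
}

# Conditional mean E(Y | X = x): average Y over the families with income x.
conditional_means = {x: sum(ys) / len(ys) for x, ys in population.items()}

for x, mean_y in conditional_means.items():
    print(f"E(Y | X = {x}) = {mean_y:.2f}")
```

Each printed value should match the "Average Food Expenditures" column, and the means rise with income as the notes observe.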

Geometrically, a population regression line or curve is simply the locus of the conditional means or expectations of the dependent variable for fixed values of the explanatory variable(s). In general, we could write the Population Regression Function (PRF) as:

\[ E(Y \mid X_i) = f(X_i) \]

where this is some function of the explanatory variable. We might anticipate that food consumption will be linearly related to disposable income. This is an initial assumption of our estimation. We could narrow this functional form to:

\[ E(Y \mid X_i) = \beta_0 + \beta_1 X_i \]

This is known as the linear PRF (or PR Line).
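For the 12-family population of Section VII, the four conditional means happen to lie exactly on a straight line, so the linear PRF holds without approximation there. A minimal sketch that checks this and recovers the implied coefficients follows. The values it computes for the intercept and slope come from the table and are not quoted anywhere in the notes.

```python
# Conditional means of food expenditure by income level, taken from the
# "Average Food Expenditures" column of the table in Section VII.
conditional_means = {250: 87.50, 300: 93.00, 350: 98.50, 400: 104.00}

xs = sorted(conditional_means)

# Slope between each pair of neighbouring income levels.
slopes = [
    (conditional_means[x1] - conditional_means[x0]) / (x1 - x0)
    for x0, x1 in zip(xs, xs[1:])
]

# If all slopes agree, the means are collinear and one line fits them all.
beta1 = slopes[0]
beta0 = conditional_means[xs[0]] - beta1 * xs[0]

print("Slopes between neighbouring groups:", slopes)
print(f"Implied linear PRF: E(Y | X) = {beta0:.2f} + {beta1:.2f} X")
```

Because all three slopes are equal, a single intercept and slope describe every conditional mean. With real data the means would rarely be exactly collinear, which is why linearity is stated as an assumption.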

VIII. Linearity in Regression Analysis

What do we mean when we say that our regression model is linear? One possibility is that the model is nonlinear in terms of the variables.

\[ E(Y \mid X_i) = \beta_0 + \beta_1 X_i^2 \]

The second possibility is that the PRF is nonlinear in terms of the coefficients.

\[ E(Y \mid X_i) = \beta_0 + \sqrt{\beta_1}\, X_i \]

Regression functions that are nonlinear in the coefficients will not be considered in this paper, but those that are nonlinear only in the variables, like the first one above, will be. From now on, linear regression models should be read as linear (in terms of the parameters).

IX. Adding the Disturbance Term to Our PRF

The PRF tells us the 'average' food expenditures for a given level of household income. But we know that any 'particular' household is unlikely to be on this function. For this reason we rewrite the PRF as

\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]

where \(\varepsilon_i\) is a random variable with mean 0. Lots of reasons why \(\varepsilon_i\) might exist. Minor influences on Y are omitted. The underlying theoretical equation might have a different functional form than the one chosen for the regression. Some purely random variations are always there. Measurement error on Y or X.
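As a concrete illustration using the 12-family population from Section VII, the sketch below computes each family's disturbance as the gap between its food expenditure and the conditional mean of its income group. The helper name `disturbances` is illustrative and not from the notes.

```python
# Population from Section VII: income level -> food expenditures of the
# families at that income level.
population = {
    250: [78.00, 88.50, 96.00],
    300: [77.50, 89.00, 96.50, 109.00],
    350: [90.50, 106.50],
    400: [99.00, 103.00, 110.00],
}


def disturbances(pop):
    """Return {x: [Y_i - E(Y | X = x)]} for every family in each group."""
    result = {}
    for x, ys in pop.items():
        cond_mean = sum(ys) / len(ys)  # E(Y | X = x)
        result[x] = [y - cond_mean for y in ys]
    return result


eps = disturbances(population)
for x, values in eps.items():
    mean_eps = sum(values) / len(values)
    print(f"X = {x}: disturbances = {values}, mean = {mean_eps:.2f}")
```

Individual families sit above or below the PRF, but within each income group the disturbances average out to zero, which is what "mean 0" means here.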

X. The Sample (Estimated) Regression Function

Thus far, we've dealt with the entire population and the PRF. We have avoided any consideration of sampling. In most cases, we will never observe the entire population. We have to infer from a sample or samples what the PRF might look like. Note that we're unlikely to know just how close we get to the truth. Each sample we draw can be used to produce a Sample (Estimated) Regression Function (SRF), that is, the estimated regression function:

\[ \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i \]

Of course, we can replace the fitted value (\(\hat{Y}_i\)) on the left-hand side with the actual value of the dependent variable (\(Y_i\)). The LHS is then no longer an estimate; it is the actual value. The RHS now includes the residual term \(e_i\).

\[ Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + e_i \]

This means that the actual dependent variable can be decomposed into its fitted value and the residual.

\[ Y_i = \hat{Y}_i + e_i \]

This residual, like the disturbance, can be either positive or negative. We can either overestimate:

\[ Y_i - \hat{Y}_i = e_i < 0 \quad \text{if } Y_i < \hat{Y}_i \]

or underestimate the true value of \(Y_i\):

\[ Y_i - \hat{Y}_i = e_i > 0 \quad \text{if } Y_i > \hat{Y}_i \]

XI. Questions for discussion: Q1.10

XII. Run the height regression (Section 1.4) using the data file provided. Do further exploration according to Q1.4 and Q1.5.
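To make the decomposition of \(Y_i\) into fitted value and residual in Section X concrete, here is a minimal Python sketch. It rests on two assumptions that are not in the notes: the sample is a hypothetical draw of one family from each income group of the Section VII population, and the SRF is fitted with the standard least-squares formulas, which these notes have not yet introduced. The function name `fit_line` is illustrative.

```python
# Hypothetical sample: one family drawn from each income group of the
# 12-family population in Section VII (the choice of families is arbitrary).
sample = [(250, 88.50), (300, 77.50), (350, 106.50), (400, 103.00)]


def fit_line(data):
    """Least-squares intercept and slope for a list of (x, y) pairs."""
    n = len(data)
    x_bar = sum(x for x, _ in data) / n
    y_bar = sum(y for _, y in data) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in data)
    s_xx = sum((x - x_bar) ** 2 for x, _ in data)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0_hat, b1_hat = fit_line(sample)
fitted = [b0_hat + b1_hat * x for x, _ in sample]
residuals = [y - y_hat for (_, y), y_hat in zip(sample, fitted)]

print(f"SRF: Y_hat = {b0_hat:.2f} + {b1_hat:.3f} X")
for (x, y), y_hat, e in zip(sample, fitted, residuals):
    kind = "overestimate" if e < 0 else "underestimate"
    print(f"X = {x}: Y = {y:.2f}, Y_hat = {y_hat:.2f}, e = {e:+.2f} ({kind})")
```

With this particular sample the SRF has intercept 46.75 and slope 0.145, and the residuals take both signs. A different draw from the same population would give a different SRF, which is the point of distinguishing the SRF from the PRF.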