Introduction to regression. Paul Schrimpf. UBC Economics 326. January 23, 2018


Introduction UBC Economics 326 January 23, 2018

Review of last week
- Expectations and conditional expectations: linearity, iterated expectations
- Asymptotics: using the large-sample distribution to approximate the finite-sample distribution of estimators
- LLN: sample moments converge in probability to population moments,
  $\underbrace{\frac{1}{n}\sum_{i=1}^n g(x_i)}_{\text{sample moment}} \xrightarrow{p} \underbrace{E[g(x)]}_{\text{population moment}}$
- CLT: centered and scaled sample moments converge in distribution to a normal distribution,
  $\underbrace{\sqrt{n}}_{\text{scaling}}\left(\underbrace{\frac{1}{n}\sum_{i=1}^n g(x_i) - E[g(x)]}_{\text{centering}}\right) \xrightarrow{d} N\left(0, \operatorname{Var}(g(x))\right)$
- Using the CLT to calculate p-values (a small simulation sketch follows below)
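A minimal R simulation sketch (not part of the original slides) illustrating the CLT approximation for a sample mean; the distribution and numbers are made up for illustration.

## Sample mean of x ~ Exponential(1), so E[x] = 1 and Var(x) = 1
set.seed(326)
n <- 500
z <- replicate(2000, {
  x <- rexp(n, rate = 1)
  sqrt(n) * (mean(x) - 1) / 1      # centered and scaled sample moment
})
mean(abs(z) > 1.96)                # close to 0.05 if the N(0,1) approximation is good
hist(z, breaks = 40, freq = FALSE)
curve(dnorm(x), add = TRUE)        # overlay the standard normal density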

Part I: Definition and interpretation of regression

Outline:
1. Motivation
2. Conditional expectation function
3. Population regression and its interpretation
4. Sample regression
5. Regression in R

Main texts:
- Angrist and Pischke (2014), chapter 2
- Wooldridge (2013), chapter 2
- Stock and Watson (2009), chapters 4-5
More advanced:
- Angrist and Pischke (2009), chapter 3 up to and including section 3.1.2 (pages 27-40)
- Bierens (2012)
- Abbring (2001), chapter 3
- Baltagi (2002), chapter 3
- Linton (2017), chapters 16-20 and 22
More introductory:
- Diez, Barr, and Cetinkaya-Rundel (2012), chapter 7

Section 1: Motivation

General problem
Often interested in the relationship between two (or more) variables, e.g.
- Wages and education
- Minimum wage and unemployment
- Price, quantity, and product characteristics
Usually have:
1. A variable to be explained (the dependent variable), Y
2. Explanatory variable(s), also called independent variables or covariates, X

Dependent (Y)      Independent (X)
Wage               Education
Unemployment       Minimum wage
Quantity           Price and product characteristics

For now we are agnostic about causality, but E[Y|X] usually is not causal.

Example: growth and GDP — [Figure: scatter of average growth rate against real GDP per capita in 1960 (rgdp60), with country labels.]

Years of schooling in 1960 and growth — [Figure: scatter of average growth rate against years of schooling in 1960 (yearsschool), with country labels.]

Section 2: Conditional expectation function

Conditional expectation function
One way to describe the relation between two variables is a function, Y = h(X). Most relationships in data are not deterministic, so look at the average relationship,
$Y = \underbrace{E[Y|X]}_{h(X)} + \underbrace{(Y - E[Y|X])}_{\varepsilon} = E[Y|X] + \varepsilon$
Note that E[ε] = 0 (by the definition of ε and iterated expectations).
E[Y|X] can be any function; in particular, it need not be linear.

Conditional expectation function
An unrestricted E[Y|X] is hard to work with:
- Hard to estimate
- Hard to communicate if X is a vector (cannot draw graphs)
Instead use linear regression:
- Easier to estimate and communicate
- Tight connection to E[Y|X]

Section 3: Population regression

Population regression
The bivariate population regression of Y on X is
$(\beta_0, \beta_1) = \arg\min_{b_0, b_1} E[(Y - b_0 - b_1 X)^2]$
i.e. β0 and β1 are the intercept and slope that minimize the expected squared error of Y − (β0 + β1 X).
Calculating β0 and β1, the first order conditions are
$[b_0]: \quad 0 = \frac{\partial}{\partial b_0} E[(Y - b_0 - b_1 X)^2] = E\left[\frac{\partial}{\partial b_0}(Y - b_0 - b_1 X)^2\right] = E[-2(Y - \beta_0 - \beta_1 X)] \qquad (1)$

and
$[b_1]: \quad 0 = \frac{\partial}{\partial b_1} E[(Y - b_0 - b_1 X)^2] = E\left[\frac{\partial}{\partial b_1}(Y - b_0 - b_1 X)^2\right] = E[-2(Y - \beta_0 - \beta_1 X)X] \qquad (2)$
(1) rearranged gives β0 = E[Y] − β1 E[X]. Substituting into (2),
$0 = E[X(-Y + E[Y] - \beta_1 E[X] + \beta_1 X)] = E[X(-Y + E[Y])] + \beta_1 E[X(X - E[X])] = -Cov(X, Y) + \beta_1 Var(X)$
so
$\beta_1 = \frac{Cov(X, Y)}{Var(X)}, \qquad \beta_0 = E[Y] - \beta_1 E[X]$

Population regression approximates E[Y|X]
Lemma. The population regression is the minimal mean squared error linear approximation to the conditional expectation function, i.e.
$\underbrace{\arg\min_{b_0, b_1} E\left[(Y - (b_0 + b_1 X))^2\right]}_{\text{population regression}} = \underbrace{\arg\min_{b_0, b_1} E_X\left[(E[Y|X] - (b_0 + b_1 X))^2\right]}_{\text{MSE of linear approximation to } E[Y|X]}$
Corollary. If E[Y|X] = c + mX, then the population regression of Y on X equals E[Y|X], i.e. β0 = c and β1 = m.

Proof. Let $b_0'$, $b_1'$ be the minimizers of the MSE of the linear approximation to E[Y|X]. The same steps as in the population regression formula give
$0 = E[-2(E[Y|X] - b_0' - b_1' X)]$ and $0 = E[-2(E[Y|X] - b_0' - b_1' X)X]$
Rearranging and combining,
$b_0' = E[E[Y|X]] - b_1' E[X] = E[Y] - b_1' E[X]$
and
$0 = E[X(-E[Y|X] + E[Y] - b_1' E[X] + b_1' X)] = E[X(-E[Y|X] + E[Y])] + b_1' E[X(X - E[X])] = -Cov(X, Y) + b_1' Var(X)$
so $b_1' = \frac{Cov(X, Y)}{Var(X)}$.

Regression interpretation
- Regression = best linear approximation to E[Y|X]
- β0 ≈ E[Y|X = 0]
- β1 ≈ dE[Y|X]/dX ≈ change in average Y per unit change in X
- Not necessarily a causal relationship (usually not)
- Can always be viewed as a description of the data

Regression with binary X
Suppose X is binary (i.e. can only be 0 or 1). We know β0 + β1 X is the best linear approximation to E[Y|X]. X only takes two values, so we can draw a line connecting E[Y|X = 0] and E[Y|X = 1]; hence β0 + β1 X = E[Y|X] exactly, with
β0 = E[Y|X = 0]
β0 + β1 = E[Y|X = 1]
(a numerical check follows below).
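A minimal R sketch (not part of the original slides) checking this with simulated data; all numbers below are made up for illustration.

## Regression with a binary regressor reproduces the two group means
set.seed(326)
n <- 1000
x <- rbinom(n, size = 1, prob = 0.4)   # binary X
y <- 2 + 3 * x + rnorm(n)              # so E[Y|X=0] = 2 and E[Y|X=1] = 5
fit <- lm(y ~ x)
coef(fit)[1]                           # matches mean(y[x == 0])
sum(coef(fit))                         # matches mean(y[x == 1])
c(mean(y[x == 0]), mean(y[x == 1]))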

Section 4: Sample regression

Sample regression
Have a sample of observations $\{(y_i, x_i)\}_{i=1}^n$. The sample regression (or, when unambiguous, just regression) of Y on X is
$(\hat\beta_0, \hat\beta_1) = \arg\min_{b_0, b_1} \frac{1}{n}\sum_{i=1}^n (y_i - b_0 - b_1 x_i)^2$
i.e. β̂0 and β̂1 are the intercept and slope that minimize the sum of squared errors, $\sum_i (y_i - (\hat\beta_0 + \hat\beta_1 x_i))^2$.
Same as the population regression, but with sample averages in place of expectations. The same calculation as for the population regression shows
$\hat\beta_1 = \frac{\widehat{Cov}(X, Y)}{\widehat{Var}(X)} = \frac{\frac{1}{n}\sum_{i=1}^n (x_i - \bar x)(y_i - \bar y)}{\frac{1}{n}\sum_{i=1}^n (x_i - \bar x)^2} \qquad \text{and} \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x$

Sample regression
The sample regression is an estimator for the population regression. Given an estimator we should ask: Unbiased? Consistent? Asymptotically normal? We will address these questions in the next week or two.

Section 5: Regression in R

Regression in R

require(datasets)   ## some datasets included with R
statedf <- data.frame(state.x77)
summary(statedf)    ## summary statistics of data

## Sample regression function
regress <- function(y, x) {
  beta <- vector(length = 2)
  beta[2] <- cov(x, y) / var(x)
  beta[1] <- mean(y) - beta[2] * mean(x)
  return(beta)
}

## Regress life expectancy on income
beta <- regress(statedf[, "Life.Exp"], statedf$Income)
beta

## Built-in regression
lm(Life.Exp ~ Income, data = statedf)
## more detailed output
summary(lm(Life.Exp ~ Income, data = statedf))

https://bitbucket.org/paulschrimpf/econ326/src/master/notes/03/regress.r?at=master

Part II: Properties of regression

Section 6: Fitted values, residuals, and goodness of fit

Fitted values and residuals
Fitted values: $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$
Residuals: $\hat\varepsilon_i = y_i - \hat\beta_0 - \hat\beta_1 x_i = y_i - \hat y_i$
so that $y_i = \hat y_i + \hat\varepsilon_i$.
Sample mean of the residuals = 0: from the first order condition for β̂0,
$0 = \frac{1}{n}\sum_{i=1}^n (y_i - \hat\beta_0 - \hat\beta_1 x_i) = \frac{1}{n}\sum_{i=1}^n \hat\varepsilon_i$
Sample covariance of x and ε̂ = 0:

From the first order condition for β̂1,
$0 = \frac{1}{n}\sum_{i=1}^n (y_i - \hat\beta_0 - \hat\beta_1 x_i)x_i = \frac{1}{n}\sum_{i=1}^n \hat\varepsilon_i x_i$

Sample mean of $\hat y_i$ equals $\bar y = \hat\beta_0 + \hat\beta_1 \bar x$:
$\frac{1}{n}\sum_{i=1}^n y_i = \frac{1}{n}\sum_{i=1}^n (\hat y_i + \hat\varepsilon_i) = \frac{1}{n}\sum_{i=1}^n \hat y_i = \frac{1}{n}\sum_{i=1}^n (\hat\beta_0 + \hat\beta_1 x_i) = \hat\beta_0 + \hat\beta_1 \bar x$

Sample covariance of y and ε̂ equals the sample variance of ε̂:
$\frac{1}{n}\sum_{i=1}^n y_i(\hat\varepsilon_i - \bar{\hat\varepsilon}) = \frac{1}{n}\sum_{i=1}^n y_i \hat\varepsilon_i = \frac{1}{n}\sum_{i=1}^n (\hat\beta_0 + \hat\beta_1 x_i + \hat\varepsilon_i)\hat\varepsilon_i = \hat\beta_0 \frac{1}{n}\sum_{i=1}^n \hat\varepsilon_i + \hat\beta_1 \frac{1}{n}\sum_{i=1}^n x_i \hat\varepsilon_i + \frac{1}{n}\sum_{i=1}^n \hat\varepsilon_i^2 = \frac{1}{n}\sum_{i=1}^n \hat\varepsilon_i^2$

Goodness of fit
Decompose $y_i = \hat y_i + \hat\varepsilon_i$. Total sum of squares = explained sum of squares + sum of squared residuals:
$\underbrace{\frac{1}{n}\sum_{i=1}^n (y_i - \bar y)^2}_{SST} = \underbrace{\frac{1}{n}\sum_{i=1}^n (\hat y_i - \bar y)^2}_{SSE} + \underbrace{\frac{1}{n}\sum_{i=1}^n \hat\varepsilon_i^2}_{SSR}$
R-squared: the fraction of the sample variation in y that is explained by x,
$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST} = \widehat{Corr}(y, \hat y)^2$
- 0 ≤ R² ≤ 1
- If all the data lie on a line, then R² = 1
- The magnitude of R² has no direct bearing on the economic importance of a regression
(a numerical check follows below)
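A short R sketch (not part of the original slides) verifying the decomposition and the R² formulas numerically, using the state data from the regression code above.

## Check SST = SSE + SSR and the equivalent expressions for R^2
statedf <- data.frame(state.x77)
fit  <- lm(Life.Exp ~ Income, data = statedf)
y    <- statedf$Life.Exp
yhat <- fitted(fit)
ehat <- resid(fit)
SST <- mean((y - mean(y))^2)
SSE <- mean((yhat - mean(y))^2)
SSR <- mean(ehat^2)
c(SST = SST, SSE_plus_SSR = SSE + SSR)        # the two should agree
c(R2 = SSE / SST, one_minus = 1 - SSR / SST,
  corr_sq = cor(y, yhat)^2, from_lm = summary(fit)$r.squared)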

Section 7: Statistical properties of the regression estimates

Unbiasedness: E[β̂] = ?
Assume:
- SLR.1 (linear model): $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
- SLR.2 (independence): $\{(x_i, y_i)\}_{i=1}^n$ is an independent random sample
- SLR.3 (rank condition): Var(X) > 0
- SLR.4 (exogeneity): E[ε|X] = 0
Then E[β̂1] = β1 and E[β̂0] = β0.

Variance: Var(β̂) = ?
Assume SLR.1-SLR.4 and
- SLR.5 (homoskedasticity): Var(ε|X) = σ²
Then
$Var(\hat\beta_1 | \{x_i\}_{i=1}^n) = \frac{\sigma^2}{\sum_{i=1}^n (x_i - \bar x)^2} \qquad \text{and} \qquad Var(\hat\beta_0 | \{x_i\}_{i=1}^n) = \frac{\sigma^2 \frac{1}{n}\sum_{i=1}^n x_i^2}{\sum_{i=1}^n (x_i - \bar x)^2}$
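A Monte Carlo sketch (not part of the original slides) checking the formula for Var(β̂1 | x); the sample size and parameter values are made up.

## Compare the simulated variance of betahat_1 with sigma^2 / sum((x - xbar)^2)
set.seed(326)
n <- 100; beta0 <- 1; beta1 <- 2; sigma <- 3
x <- runif(n, 0, 10)                     # hold x fixed across replications
b1hat <- replicate(5000, {
  y <- beta0 + beta1 * x + rnorm(n, sd = sigma)
  cov(x, y) / var(x)
})
c(simulated = var(b1hat), formula = sigma^2 / sum((x - mean(x))^2))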

Regression with normal errors
Assume SLR.1-SLR.5 and
- SLR.6 (normality): $\varepsilon_i | x_i \sim N(0, \sigma^2)$
Then $Y|X \sim N(\beta_0 + \beta_1 X, \sigma^2)$ and
$\hat\beta_1 | \{x_i\}_{i=1}^n \sim N\left(\beta_1, \frac{\sigma^2}{\sum_{i=1}^n (x_i - \bar x)^2}\right)$
Even without assuming normality, the central limit theorem implies β̂ is asymptotically normal (details in a later lecture).

Summary: the simple linear regression model
- SLR.1 (linear model): $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
- SLR.2 (independence): $\{(x_i, y_i)\}_{i=1}^n$ is an independent random sample
- SLR.3 (rank condition): Var(X) > 0
- SLR.4 (exogeneity): E[ε|X] = 0
- SLR.5 (homoskedasticity): Var(ε|X) = σ²
- SLR.6 (normality): $\varepsilon_i | x_i \sim N(0, \sigma^2)$
β̂ is unbiased under SLR.1-SLR.4.
If also SLR.5, then $Var(\hat\beta_1 | \{x_i\}_{i=1}^n) = \frac{\sigma^2}{\sum_{i=1}^n (x_i - \bar x)^2}$.
If also SLR.6, then $\hat\beta_1 | \{x_i\}_{i=1}^n \sim N\left(\beta_1, \frac{\sigma^2}{\sum_{i=1}^n (x_i - \bar x)^2}\right)$.

Discussion of SLR.1
Having a linear model makes it easier to state the other assumptions, but we could instead start by letting $\beta_1 = \frac{Cov(X, Y)}{Var(X)}$ and $\beta_0 = E[Y] - \beta_1 E[X]$ be the population regression coefficients and defining $\varepsilon_i = y_i - \beta_0 - \beta_1 x_i$.

Discussion of SLR.2
Independent observations is a good assumption for data from a simple random sample. Common situations where it fails in economics:
- Time series, e.g. $\{(x_t, y_t)\}_{t=1}^n$ could be unemployment and GDP of Canada for many different years
- Clustering, e.g. the data could be students' test scores and hours studying, where the sample consists of randomly chosen courses or schools; students in the same course would not be independent, but across different courses they might be
We still have E[β̂1] = β1 with non-independent observations as long as $E[\varepsilon_i | x_1, \ldots, x_n] = 0$, but the variance of β̂1 changes with non-independent observations. Simulation code (and see the sketch below).
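A sketch (not the lecture's linked simulation code) of how clustered, non-independent errors change the variance of β̂1 while leaving it unbiased; all numbers are made up for illustration.

## Clustered errors: betahat_1 stays unbiased but the iid variance formula is off
set.seed(326)
n_clusters <- 50; per_cluster <- 10; beta1 <- 1
cluster <- rep(1:n_clusters, each = per_cluster)
xc <- rnorm(n_clusters)                               # cluster-level component of x
x  <- xc[cluster] + rnorm(n_clusters * per_cluster)   # x is correlated within clusters
b1hat <- replicate(2000, {
  uc  <- rnorm(n_clusters)                            # shared within-cluster shock
  eps <- 0.8 * uc[cluster] + 0.6 * rnorm(length(x))   # Var(eps) = 1
  y   <- beta1 * x + eps
  cov(x, y) / var(x)
})
c(mean = mean(b1hat),                                 # still approximately beta1 = 1
  simulated_var = var(b1hat),
  iid_formula   = 1 / sum((x - mean(x))^2))           # sigma^2 = 1 here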

Discussion of SLR.3
If Var(X) = 0, then computing β̂1 involves dividing by 0. If there is no variation in X, then we cannot see how Y is related to X.

Discussion of SLR.4
To think about mean independence of ε from x, we should have a model motivating the regression. If the model we want is just a population regression, then automatically E[εX] = 0, and E[ε|X] = 0 if the conditional expectation function is linear; if the conditional expectation is nonlinear, the regression may still be a useful approximation.
[Figure: scatter of y against x with the linear fit; see the linked Code.]

Discussion of SLR.4 (continued)
If the model we want is anything else, then maybe E[εX] ≠ 0 (and E[ε|X] ≠ 0). For example, a demand curve
$p_i = \beta_0 + \beta_1 q_i + \varepsilon_i$
where $\varepsilon_i$ is everything that affects price other than quantity. Since $q_i$ is determined in equilibrium, $E[\varepsilon_i q_i] \neq 0$, so E[β̂1] ≠ β1 and β̂1 does not tell us what we want (a simulation sketch follows below).
[Figure: scatter of equilibrium price against quantity with the fitted line; see the linked Code.]
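A sketch (not the lecture's linked Code) of simultaneity bias from regressing equilibrium price on quantity; the supply and demand parameters are made up.

## Demand: q = 10 - p + eps_d, so written as p on q the demand slope is -1
## Supply: q =  2 + p + eps_s
set.seed(326)
n <- 10000
eps_d <- rnorm(n); eps_s <- rnorm(n)
p <- (10 - 2 + eps_d - eps_s) / 2   # equilibrium price
q <- 2 + p + eps_s                  # equilibrium quantity
coef(lm(p ~ q))                     # slope is near 0, not -1: q is correlated with eps_d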

Discussion of SLR.5
Homoskedasticity: the variance of ε does not depend on X.
[Figure: two panels, Homoskedastic and Heteroskedastic, scatter plots of y against x; see the linked Code.]
Heteroskedasticity is when Var(ε|X) varies with X. If there is heteroskedasticity, the variance of β̂1 is different, but we can fix it (next slide).

The fix goes by several names: robust standard errors, heteroskedasticity-consistent (HC) standard errors, or Eicker-Huber-White standard errors.
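A sketch (not part of the original slides) of computing such standard errors in R, assuming the sandwich and lmtest packages are installed, and reusing the state data from earlier.

library(sandwich)
library(lmtest)
statedf <- data.frame(state.x77)
fit <- lm(Life.Exp ~ Income, data = statedf)
coeftest(fit)                                      # usual (homoskedastic) standard errors
coeftest(fit, vcov. = vcovHC(fit, type = "HC1"))   # Eicker-Huber-White robust standard errors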

Discussion of SLR.6
If $\varepsilon_i | x_i$ is normal, then β̂1 is normal. What if $\varepsilon_i$ is not normally distributed? We will see that β̂1 is still asymptotically normal. Simulation (next two figures):

[Figure: histograms of β̂ across simulations for sample sizes N = 3, 5, 10, 50, 100.]

[Figure: histograms of $(\hat\beta - \beta)/\sqrt{Var(\hat\beta)}$ across simulations for sample sizes N = 3, 5, 10, 50, 100.]

Section 8: Examples

Example: smoking and cancer
Data on the per capita number of cigarettes sold and death rates per thousand from cancer for U.S. states in 1960: http://lib.stat.cmu.edu/dasl/datafiles/cigcancerdat.html
Death rates from: lung cancer, kidney cancer, bladder cancer, and leukemia. (Code)

Smoking and lung cancer — [Figure: scatter of the lung cancer death rate against cigarette sales (cig) by state, with fitted line y = 6.5 + 0.53x.]

Smoking and kidney cancer — [Figure: scatter of the kidney cancer death rate against cig by state, with fitted line y = 1.7 + 0.045x.]

Smoking and bladder cancer — [Figure: scatter of the bladder cancer death rate against cig by state, with fitted line y = 1.1 + 0.12x.]

Smoking and leukemia — [Figure: scatter of the leukemia death rate against cig by state, with fitted line y = 7 − 0.0078x.]

Example: convergence in growth
Data on the average growth rate from 1960-1995 for 65 countries, along with GDP in 1960, average years of schooling in 1960, and other variables. From http://wps.aw.com/aw_stock_ie_2/50/13016/3332253.cw/index.html, originally used in Beck, Levine, and Loayza (2000).
Question: has there been convergence, i.e. did countries that were poorer in 1960 grow faster and catch up? (Code)

GDP in 1960 and growth — [Figure: scatter of average growth against rgdp60 with country labels and fitted line y = 1.8 + 4.7e-05x.]

Years of schooling in 1960 and growth — [Figure: scatter of average growth against yearsschool with country labels and fitted line y = 0.96 + 0.25x.]

Things look different for 1995-2014: see the linked Code to download updated growth data through 2014 from the World Bank and recreate these results.

Section 9: Hypothesis tests and confidence intervals

Testing with normal errors
- Regression estimates depend on samples, which are random, so the estimates are random
- Some regressions will randomly look interesting due to chance
- Logic of hypothesis testing: figure out the probability of getting an interesting estimate due solely to chance
- Null hypothesis, H0: the regression is uninteresting, usually β1 = 0

Testing with normal errors (continued)
With SLR.1-SLR.6 and under $H_0: \beta_1 = \beta_1^*$, we know
$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sigma_\varepsilon^2}{\sum_{i=1}^n (x_i - \bar x)^2}\right)$
or equivalently
$t \equiv \frac{\hat\beta_1 - \beta_1^*}{\sigma_\varepsilon / \sqrt{\sum_{i=1}^n (x_i - \bar x)^2}} \sim N(0, 1)$
- P-value: the probability of getting an estimate as or more interesting than the one we have
- "As or more interesting" = as far or further away from $\beta_1^*$
- If we are only interested when β̂1 is on one side of $\beta_1^*$, then we have a one-sided alternative, e.g. $H_a: \beta_1 > \beta_1^*$
- If we are equally interested in either direction, then $H_a: \beta_1 \neq \beta_1^*$

[Figures: standard normal density dnorm(z) against z, illustrating one- and two-sided p-value regions.]

Testing with normal errors (continued)
One-sided p-value: $p = \Phi(-t) = 1 - \Phi(t)$
Two-sided p-value: $p = 2\Phi(-|t|) = 2(1 - \Phi(|t|))$
Interpretation: the probability of getting an estimate as strange as the one we have if the null hypothesis is true. It is not about the probability of β1 being any particular value: β1 is not a random variable, it is some unknown number; the data are what is random. In particular, the p-value is not the probability that H0 is false given the data.
Hypothesis testing: we must make a decision (usually reject or fail to reject H0).
- Choose a significance level α (usually 0.05 or 0.10)
- Construct a procedure such that if H0 is true, we will incorrectly reject with probability α
- Reject the null if the p-value is less than α
(an R sketch follows below)
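A sketch (not part of the original slides) computing the t-statistic and two-sided p-value for H0: β1 = 0 by hand, using the state data from earlier.

statedf <- data.frame(state.x77)
fit <- lm(Life.Exp ~ Income, data = statedf)
b1    <- unname(coef(fit)["Income"])
se1   <- sqrt(vcov(fit)["Income", "Income"])
tstat <- (b1 - 0) / se1
c(t        = tstat,
  p_normal = 2 * pnorm(-abs(tstat)),                      # normal approximation
  p_t      = 2 * pt(-abs(tstat), df = df.residual(fit)))  # t(n-2), as used by summary(fit)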

Table: Smoking and cancer (Models 1-4: bladder, lung, kidney, and leukemia death rates, matching the fitted lines in the figures above)

              Model 1   Model 2   Model 3   Model 4
(Intercept)     1.09      6.47      1.66      7.03
               (0.48)    (2.14)    (0.32)    (0.45)
cig             0.12      0.53      0.05     -0.01
               (0.02)    (0.08)    (0.01)    (0.02)
R^2             0.50      0.49      0.24      0.00
Adj. R^2        0.48      0.47      0.22     -0.02
Num. obs.       44        44        44        44
RMSE            0.69      3.07      0.46      0.64

*** p < 0.001; ** p < 0.01; * p < 0.05

Table: Growth and GDP and education in 1960

              Model 1   Model 2
(Intercept)     1.80      0.96
               (0.38)    (0.42)
rgdp60          0.00
               (0.00)
yearsschool               0.25
                         (0.09)
R^2             0.00      0.11
Adj. R^2       -0.01      0.10
Num. obs.       65        65
RMSE            1.91      1.80

*** p < 0.001; ** p < 0.01; * p < 0.05

Caution: multiple testing
We just looked at 6 regressions. If H0: β1 = 0 is true in all of them, the probability that we correctly fail to reject all 6 null hypotheses with a 5% test is 0.95^6 ≈ 0.74 (assuming the 6 tests are independent). So about a quarter of the time, if we look at 6 regressions we will randomly find at least one significant relationship; if we look at 14 regressions, the probability that we incorrectly reject at least one null is more than 0.5 (a quick check follows below).
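A quick R check of the arithmetic above, plus one common correction; the p-values in the last line are hypothetical, just to show the function.

1 - 0.95^6    # about 0.26: chance of at least one false rejection among 6 independent 5% tests
1 - 0.95^14   # about 0.51 with 14 independent tests
## Bonferroni adjustment (one simple multiple-testing correction, not covered on the slides)
p.adjust(c(0.03, 0.20, 0.04, 0.50, 0.01, 0.70), method = "bonferroni")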

Caution: economic significance ≠ statistical significance

Estimating σ²ε
Recall that
$Var(\hat\beta_1 | x_1, \ldots, x_n) = \frac{\sigma_\varepsilon^2}{\sum_{i=1}^n (x_i - \bar x)^2} = \frac{\sigma_\varepsilon^2}{n \widehat{Var}(x)}$
but σ²ε is unknown. We estimate σ²ε using the residuals,
$\hat\sigma_\varepsilon^2 = \frac{1}{n-2} \sum_{i=1}^n \underbrace{\hat\varepsilon_i^2}_{=(y_i - \hat\beta_0 - \hat\beta_1 x_i)^2}$
- If SLR.1-SLR.5 hold, then $E[\hat\sigma_\varepsilon^2] = \sigma_\varepsilon^2$
- Using 1/(n−2) instead of 1/n makes $\hat\sigma_\varepsilon^2$ unbiased: ε̂i depends on 2 estimated parameters, β̂0 and β̂1, so there are only n − 2 degrees of freedom
Estimate $Var(\hat\beta_1 | x_1, \ldots, x_n)$ by
$\widehat{Var}(\hat\beta_1 | x_1, \ldots, x_n) = \frac{\hat\sigma_\varepsilon^2}{\sum_{i=1}^n (x_i - \bar x)^2} = \frac{\frac{1}{n-2}\sum_{i=1}^n \hat\varepsilon_i^2}{\sum_{i=1}^n (x_i - \bar x)^2}$
(an R sketch follows below)
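A sketch (not part of the original slides) computing σ̂ε and se(β̂1) by hand and comparing with lm()'s output, using the state data from earlier.

statedf <- data.frame(state.x77)
fit  <- lm(Life.Exp ~ Income, data = statedf)
x    <- statedf$Income
ehat <- resid(fit)
n    <- length(ehat)
sigma2_hat <- sum(ehat^2) / (n - 2)
se_b1      <- sqrt(sigma2_hat / sum((x - mean(x))^2))
c(se_by_hand = se_b1, from_lm = summary(fit)$coefficients["Income", "Std. Error"])
c(sigma_hat  = sqrt(sigma2_hat), from_lm = summary(fit)$sigma)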

The standard error of β̂1 is $\sqrt{\widehat{Var}(\hat\beta_1 | x_1, \ldots, x_n)}$. If SLR.1-SLR.6 hold, the t-statistic with the estimated $\widehat{Var}(\hat\beta_1 | x_1, \ldots, x_n)$ has a t(n − 2) distribution instead of N(0, 1):
$t = \frac{\hat\beta_1 - \beta_1}{\sqrt{\widehat{Var}(\hat\beta_1 | x_1, \ldots, x_n)}} \sim t(n-2)$

Confidence intervals
- β̂1 is random. Var(β̂1), p-values, and hypothesis tests are ways of expressing how random β̂1 is; confidence intervals are another.
- A 1 − α confidence interval, $CI_{1-\alpha} = [LB_{1-\alpha}, UB_{1-\alpha}]$, is an interval estimator for β1 such that $P(\beta_1 \in CI_{1-\alpha}) = 1 - \alpha$ ($CI_{1-\alpha}$ is random; β1 is not).
- Recall: if SLR.1-SLR.6, then $\hat\beta_1 \sim N(\beta_1, Var(\hat\beta_1))$.

This implies
$P\left(\hat\beta_1 < \beta_1 + \sqrt{Var(\hat\beta_1)}\,\Phi^{-1}(\alpha/2)\right) = P\left(\hat\beta_1 - \sqrt{Var(\hat\beta_1)}\,\Phi^{-1}(\alpha/2) < \beta_1\right) = \alpha/2$
and
$P\left(\hat\beta_1 > \beta_1 + \sqrt{Var(\hat\beta_1)}\,\Phi^{-1}(1-\alpha/2)\right) = P\left(\hat\beta_1 - \sqrt{Var(\hat\beta_1)}\,\Phi^{-1}(1-\alpha/2) > \beta_1\right) = \alpha/2$

so
$P\left(\hat\beta_1 + \sqrt{Var(\hat\beta_1)}\,\Phi^{-1}(\alpha/2) < \beta_1 < \hat\beta_1 + \sqrt{Var(\hat\beta_1)}\,\Phi^{-1}(1-\alpha/2)\right)$
$= 1 - P\left(\hat\beta_1 + \sqrt{Var(\hat\beta_1)}\,\Phi^{-1}(\alpha/2) \ge \beta_1\right) - P\left(\hat\beta_1 + \sqrt{Var(\hat\beta_1)}\,\Phi^{-1}(1-\alpha/2) \le \beta_1\right) = 1 - \alpha$
For α = 0.05, Φ⁻¹(0.025) ≈ −1.96 and Φ⁻¹(0.975) ≈ 1.96; for α = 0.1, Φ⁻¹(0.05) ≈ −1.64.

[Figure: density of β̂ centered at β, with β̂, β̂ + se(β̂)Φ⁻¹(α/2), β̂ + se(β̂)Φ⁻¹(1 − α/2), β + se(β̂)Φ⁻¹(α/2), and β + se(β̂)Φ⁻¹(1 − α/2) marked.]

A 1 − α confidence interval is
$\hat\beta_1 \pm \sqrt{\widehat{Var}(\hat\beta_1)}\,\Phi^{-1}(\alpha/2)$
With the estimated $\hat\sigma_\varepsilon^2$, use the t distribution instead of the normal,
$\hat\beta_1 \pm \sqrt{\widehat{Var}(\hat\beta_1)}\,F^{-1}_{t,n-2}(\alpha/2)$
where $F^{-1}_{t,n-2}$ is the inverse CDF of the t(n − 2) distribution. Values of $F^{-1}_{t,n-2}(1 - \alpha/2)$:

 α/2      n−2 = 5    10      20      50      100     ∞ (normal)
 0.025    2.57       2.23    2.09    2.01    1.98    1.96
 0.05     2.02       1.81    1.72    1.68    1.66    1.64

(an R sketch follows below)
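A sketch (not part of the original slides) of confidence intervals in R, using the state data from earlier, and reproducing the t quantiles in the table.

statedf <- data.frame(state.x77)
fit <- lm(Life.Exp ~ Income, data = statedf)
confint(fit, level = 0.95)                 # built-in t-based confidence intervals
## By hand, matching the formula above
b1  <- unname(coef(fit)["Income"])
se1 <- sqrt(vcov(fit)["Income", "Income"])
b1 + c(-1, 1) * se1 * qt(0.975, df = df.residual(fit))
qt(0.975, df = c(5, 10, 20, 50, 100))      # first row of the table (alpha/2 = 0.025)
qt(0.950, df = c(5, 10, 20, 50, 100))      # second row (alpha/2 = 0.05)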

Example: GDP in 1960 and growth — [Figure: scatter of average growth against rgdp60 with country labels and fitted line y = 1.8 + 4.7e-05x.]

Example: years of schooling in 1960 and growth — [Figure: scatter of average growth against yearsschool with country labels and fitted line y = 0.96 + 0.25x.]

Section 10: The Gauss-Markov theorem

Gauss-Markov theorem
The sample regression estimator,
$(\hat\beta_0, \hat\beta_1) = \arg\min_{b_0, b_1} \sum_{i=1}^n (y_i - b_0 - b_1 x_i)^2, \qquad \hat\beta_1 = \frac{\sum_{i=1}^n (x_i - \bar x) y_i}{\sum_{i=1}^n (x_i - \bar x)^2}$
also called Ordinary Least Squares (OLS), is not the only unbiased estimator of the coefficients in $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.
Gauss-Markov theorem: if SLR.1-SLR.5 hold, then OLS is the Best Linear Unbiased Estimator.
- Linear means linear in y: $\hat\beta_1 = \sum_{i=1}^n c_i y_i$ with $c_i = \frac{x_i - \bar x}{\sum_{j=1}^n (x_j - \bar x)^2}$
- Unbiased means E[β̂1] = β1
- Best means that among all linear unbiased estimators, OLS has the smallest variance

Proof: setup
Let β̃1 be a linear unbiased estimator of β1:
- Linear: $\tilde\beta_1 = \sum_{i=1}^n c_i y_i$
- Unbiased: $E[\tilde\beta_1 | x_1, \ldots, x_n] = \beta_1$ (for all possible β0, β1)
We will show that $Var(\tilde\beta_1 | x_1, \ldots, x_n) \ge Var(\hat\beta_1 | x_1, \ldots, x_n)$.

Proof: outline
1. Show that $\sum_{i=1}^n c_i = 0$ and $\sum_{i=1}^n c_i x_i = 1$
2. Show $Cov(\tilde\beta_1, \hat\beta_1 | x_1, \ldots, x_n) = Var(\hat\beta_1 | x_1, \ldots, x_n)$
3. Show $Var(\tilde\beta_1 | x_1, \ldots, x_n) \ge Var(\hat\beta_1 | x_1, \ldots, x_n)$
4. Show $Var(\tilde\beta_1 | x_1, \ldots, x_n) = Var(\hat\beta_1 | x_1, \ldots, x_n)$ only if $\tilde\beta_1 = \hat\beta_1$
We will go over the proof in class. See Marmer's slides or Wooldridge (2013), Appendix 3A, for details.

References
Abbring, Jaap. 2001. An Introduction to Econometrics: Lecture Notes. URL http://jabbring.home.xs4all.nl/courses/b44old/lect210.pdf.
Angrist, J.D. and J.S. Pischke. 2009. Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press.
Angrist, Joshua D. and Jörn-Steffen Pischke. 2014. Mastering 'Metrics: The Path from Cause to Effect. Princeton University Press.
Baltagi, B.H. 2002. Econometrics. Springer, New York. URL http://gw2jh3xr2c.search.serialssolutions.com/?sid=sersol&ss_jc=tc0001086635&title=econometrics.

Beck, T., R. Levine, and N. Loayza. 2000. Finance and the Sources of Growth. Journal of Financial Economics 58(1): 261-300. URL http://www.sciencedirect.com/science/article/pii/s0304405x00000726.
Bierens, Herman J. 2012. The Two-Variable Linear Regression Model. URL http://personal.psu.edu/hxb11/linreg2.pdf.
Diez, David M., Christopher D. Barr, and Mine Cetinkaya-Rundel. 2012. OpenIntro Statistics. OpenIntro. URL http://www.openintro.org/stat/textbook.php.
Linton, Oliver B. 2017. Probability, Statistics and Econometrics. Academic Press. URL http://gw2jh3xr2c.search.serialssolutions.com/?sid=sersol&ss_jc=tc0001868500&title=Probability%2C%20statistics%20and%20econometrics.

Stock, J.H. and M.W. Watson. 2009. Introduction to Econometrics, 2/E. Addison-Wesley.
Wooldridge, J.M. 2013. Introductory Econometrics: A Modern Approach. South-Western.