Internal vs. external validity. External validity. This section is based on Stock and Watson's Chapter 9.


Section 7 Model Assessment

This section is based on Stock and Watson's Chapter 9.

Internal vs. external validity

Internal validity refers to whether the analysis is valid for the population and sample being studied. External validity refers to whether these results can be generalized to other populations: is the population from which the sample is drawn representative of a larger population about which inference is sought?

Economic vs. statistical significance: Even if |t| > 2, the effect may be too small to be economically important. Beta coefficients are used to give the number of standard deviations that y changes when x increases by one standard deviation. Marginal effects in standard deviations can be more useful than marginal effects in units.

External validity

External validity is related to Assumption #0. But in this case, the question is not whether all sample observations follow the same model but rather whether the sample observations follow the same model as the more general population. Or, alternatively, are they drawn from a sub-population that has characteristics that would make the coefficients (or specification) different?

All populations have sub-populations that vary in their characteristics. If our sampling process is based on a particular sub-population, we must worry about the generalizability of our results, which is external validity: we can perform an internally valid analysis of an idiosyncratic sub-population that would not generalize to others.

Example: Noel's work measuring the value of tree canopy or walkability in Portland. Do results generalize to other cities, or do Portlanders value these characteristics more (or less) than people in other cities?

There are no direct statistical tests for external validity (unless you have data drawn from a broader population, in which case you probably should have used it to begin with). It is usually a matter of judgment.

One way that some people try to assess external validity is to split the sample in half, estimate over one sample, then assess the predictions for the other sample. If predictions are good, then both halves of the sample may follow the same model.

This is useless if both halves of the sample are drawn from a subpopulation that is idiosyncratic, though.

Meta-analysis: Doing a study where each data point is an econometric result. Direct estimation of mapping from assumption space to conclusions.

Internal validity

Given the population from which the sample is drawn, are the assumptions underlying the estimators valid?

Omitted variables

They are always there. Omitted variables bias the coefficient estimators for any included variables that are correlated with them. In a strict sense, nearly every econometric regression is biased because of this. What variables are most obviously omitted? What variables in the equation would be correlated with them? How does this omission bias the included coefficients?

Proxy variables are observable variables that are correlated with unobserved variables that should be included. Proxy variables are legitimate if we are not particularly interested in the effect of the variable for which they proxy. Can't interpret the coefficient on the proxy directly as the coefficient on the omitted variable. OK if the difference between the true variable and the proxy is uncorrelated with included variables.

Panel data can help if unobserved variables vary across units but not over time, or over time but not across units.

Misspecification of functional form

Can use RESET test to explore whether quadratics are useful. If you know what alternative functional forms might be more appropriate, you can test them.

Measurement error (errors-in-variables bias)

Measurement error in dependent variable

Suppose that the true dependent variable is $y$ but that we instead observe $\tilde{y}_i = y_i + \eta_i$, where $\eta_i$ is a random measurement error. The estimated model, then, is

$$\tilde{y}_i = \beta_1 + \beta_2 x_i + e_i + \eta_i.$$

As long as the measurement error in $y$ ($\eta$) is uncorrelated with $x$, there is no bias in the estimator of $\beta_2$. The SER will be an estimate of the standard deviation of the composite error term $e + \eta$, but otherwise OLS is fine.

Measurement error in regressor

Suppose that the dependent variable is measured accurately but that we measure $x$ with error: $\tilde{x}_i = x_i + \eta_i$. The estimated model is

$$y_i = \beta_1 + \beta_2 \tilde{x}_i + e_i - \beta_2 \eta_i.$$

Because $\eta$ is part of $\tilde{x}$ and therefore correlated with it, the composite error term is now correlated with the actual regressor, meaning that $b_2$ is biased and inconsistent. If $e$ and $\eta$ are independent and normal, then

$$\operatorname{plim} b_2 = \beta_2 \frac{\sigma_x^2}{\sigma_x^2 + \sigma_\eta^2}.$$

The estimator is biased toward zero. If most of the variation in $\tilde{x}$ comes from $x$, then the bias will be small. As the variance of the measurement error grows in relation to the variation in the true variable, the magnitude of the bias increases. As a worst-case limit, if the true $x$ doesn't vary across our sample of observations and all of the variation in our measure $\tilde{x}$ is random noise, then the expected value of our coefficient is zero.

Best solution is getting a better measure. Alternatives are instrumental variables or direct measurement of the degree of measurement error. For example, if an alternative, precise measure is available for some arguably random sub-sample of observations, then we can calculate the variance of the true variable and the variance of the measurement error and correct the estimate.

Sample selection bias

Few samples are truly random draws from the full population. Instead, they are draws (random or not) from some sub-population:

- Many homeless are uncounted in the census
- No wage data on those who do not work
- Polls miss people with no listed phone number
- Cross-country regressions are often limited to the countries for which good data are available (which is not a random sample of countries)

If sample selection is related to x, then we have issues of external validity (do estimates apply to the missed sub-population?) but not internal validity. Results may be valid for the sub-population for which they are estimated.
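The attenuation result for a mismeasured regressor, discussed under measurement error above, can be checked numerically. The sketch below is only an illustration under assumed values: the sample size, the true coefficients, and the standard deviations are made up, and the names `x_true`, `x_obs`, and `ols_slope` are hypothetical. It regresses y on the true regressor and on a noisy version of it, then compares the noisy-regressor slope with the probability limit $\beta_2 \sigma_x^2 / (\sigma_x^2 + \sigma_\eta^2)$.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(312)  # fixed seed so the run is reproducible
n = 50_000                # assumed sample size

# Assumed (illustrative) population values, not taken from the notes.
beta1, beta2 = 1.0, 2.0
sd_x, sd_eta, sd_e = 1.0, 0.5, 1.0

x_true = [rng.gauss(0.0, sd_x) for _ in range(n)]
# Observed regressor = true regressor + independent measurement error.
x_obs = [xi + rng.gauss(0.0, sd_eta) for xi in x_true]
# y depends on the TRUE regressor, which is measured without error in y.
y = [beta1 + beta2 * xi + rng.gauss(0.0, sd_e) for xi in x_true]

b2_true_x = ols_slope(x_true, y)   # should be close to beta2
b2_noisy_x = ols_slope(x_obs, y)   # should be pulled toward zero

# Probability limit implied by the attenuation formula.
plim_b2 = beta2 * sd_x**2 / (sd_x**2 + sd_eta**2)

print("slope using true x: ", b2_true_x)
print("slope using noisy x:", b2_noisy_x)
print("attenuation formula:", plim_b2)
```

With these assumed values the formula gives 2 × 1/1.25 = 1.6, so the slope on the noisy regressor should settle near 1.6 rather than 2. Raising `sd_eta` relative to `sd_x` pushes the estimate further toward zero.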

If sample selection is related to y (or, specifically, to e), then we are not drawing randomly from the population distribution of the error term (as we assume) and our results will be biased. There are methods of coping with sample-selection bias. Example: imputing values for missing wage data to allow inclusion of the full sample.

Simultaneity bias (reverse or bidirectional causality)

If changes in y (presumably due to changes in e) cause x to change, then x and e will be correlated and OLS estimates will be biased and inconsistent. For example, for many years macroeconomists estimated Keynesian consumption functions by OLS:

$$C_t = \beta_0 + \beta_1 GDP_t + u_t.$$

(There are time-series problems with this regression that we will study later.) For now, note that if aggregate demand affects output, then GDP in each year is C + I + G + NX, so a positive shock to consumption (a positive $u$) increases GDP. Because the regressor is correlated with the error term, OLS estimates of $\beta_1$ were biased and inconsistent. (But they looked good and had ridiculously high $R^2$ values, so they persisted for many years despite the protests of econometricians.) The usual correction is to use an instrumental-variables (two-stage least squares) estimator.

Heteroskedasticity

Heteroskedasticity (as we will discuss soon) causes OLS to be inefficient (relative to WLS), but it is still unbiased and consistent. The classical standard errors will be biased under heteroskedasticity, but we can use White's robust covariance matrix estimator, which we've talked about earlier. Using robust errors is the most common correction for heteroskedasticity.

Autocorrelation

If error terms of different observations are correlated, then OLS is also inefficient (relative to a corrected GLS estimator), but is unbiased and consistent.

Autocorrelation can be spatial: unmeasured neighborhood characteristics (omitted variables) that cause houses that are close together to be more or less valuable.

Autocorrelation is ubiquitous in time-series data: this period's error term is nearly always related to last period's. (Unmeasured omitted variables are themselves correlated over time.)

Again, standard errors are biased, but White's heteroskedasticity-consistent standard errors don't help here.
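A small Monte Carlo sketch can illustrate the two claims about autocorrelation: OLS stays unbiased, but the classical standard errors are misleading. Everything here is an assumption made for illustration: both the regressor and the error follow independent AR(1) processes with coefficient 0.8, and the helper names `ar1` and `ols_slope_and_se` are hypothetical. The sketch compares the average classical standard error with the actual spread of the slope estimates across replications.

```python
import random
import statistics


def ar1(n, rho, rng):
    """Generate n draws from a stationary AR(1) process with N(0, 1) innovations."""
    # Start from the stationary distribution so no burn-in is needed.
    value = rng.gauss(0.0, 1.0) / (1.0 - rho**2) ** 0.5
    series = []
    for _ in range(n):
        value = rho * value + rng.gauss(0.0, 1.0)
        series.append(value)
    return series


def ols_slope_and_se(x, y):
    """Simple-regression slope and its classical (i.i.d.-error) standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b2 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b1 = my - b2 * mx
    ssr = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    se = (ssr / (n - 2) / sxx) ** 0.5
    return b2, se


rng = random.Random(2024)          # fixed seed for reproducibility
beta1, beta2, rho = 1.0, 0.5, 0.8  # assumed values, for illustration only
n, reps = 100, 500

slopes, classical_ses = [], []
for _ in range(reps):
    x = ar1(n, rho, rng)           # autocorrelated regressor
    e = ar1(n, rho, rng)           # autocorrelated error, independent of x
    y = [beta1 + beta2 * xi + ei for xi, ei in zip(x, e)]
    b2, se = ols_slope_and_se(x, y)
    slopes.append(b2)
    classical_ses.append(se)

mean_slope = statistics.mean(slopes)               # near beta2: no bias
sd_of_slopes = statistics.stdev(slopes)            # actual sampling variability
mean_classical_se = statistics.mean(classical_ses) # what classical OLS reports

print("mean slope estimate:      ", mean_slope)
print("std. dev. of slopes:      ", sd_of_slopes)
print("average classical std err:", mean_classical_se)
```

In this assumed setup the average slope should sit close to the true value while the reported classical standard error understates the true sampling variability, so conventional t-tests would reject too often.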

There are estimated standard errors that are robust to autocorrelation. (Use hac option in Stata.) Alternatively, one can try to model the autocorrelation and transform the model into one that has no autocorrelation (GLS). Examples include AR(1) models in time series and modeling spatially correlated errors in cross-section models.

Validity in forecasting/prediction

Regression models may be valid for forecasting even if their coefficients are not unbiased or consistent. Suppose that we know that x is measured with error. We can still use a regression of y on x to predict the outcome of a particular measured x even though the estimated coefficient is a biased estimator for the effect of x. That is because we have correctly estimated the relationship between the noisy x and y. We would not get reliable estimates if our prediction question relied on the true x rather than the noisy x.

We often build models with noisy data or proxy variables to get predictions of another variable.

The biggest question in forecasting is external validity: does the model that applies to the sample you used for estimation also apply to the observation for which you want a forecast?
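The point that a biased coefficient can still forecast well may be easier to see with a simulation. The sketch below is a rough illustration under assumed values (the coefficients, variances, sample sizes, and the names `draw_sample`, `ols_fit`, and `mse` are all hypothetical). It fits y on a noisy measure of x, then predicts y for fresh observations of the same noisy measure, and compares the forecast error with what we would get by plugging the noisy measure into the true structural coefficients.

```python
import random


def ols_fit(x, y):
    """Intercept and slope from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b2 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b1 = my - b2 * mx
    return b1, b2


def mse(predicted, actual):
    """Mean squared prediction error."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


rng = random.Random(75)  # fixed seed for reproducibility

# Assumed (illustrative) population values.
beta1, beta2 = 1.0, 2.0
sd_x, sd_eta, sd_e = 1.0, 1.0, 1.0


def draw_sample(n):
    """Return (noisy x, y) pairs; the true x is never observed."""
    x_true = [rng.gauss(0.0, sd_x) for _ in range(n)]
    x_noisy = [xi + rng.gauss(0.0, sd_eta) for xi in x_true]
    y = [beta1 + beta2 * xi + rng.gauss(0.0, sd_e) for xi in x_true]
    return x_noisy, y


x_train, y_train = draw_sample(20_000)  # estimation sample
x_new, y_new = draw_sample(20_000)      # observations to forecast

b1, b2 = ols_fit(x_train, y_train)      # b2 is attenuated relative to beta2

# Forecasts from the fitted (biased) coefficients applied to the noisy x.
mse_fitted = mse([b1 + b2 * xi for xi in x_new], y_new)
# Forecasts from the true structural coefficients applied to the noisy x.
mse_true_coefs = mse([beta1 + beta2 * xi for xi in x_new], y_new)

print("fitted slope on noisy x:      ", b2)
print("MSE with fitted coefficients: ", mse_fitted)
print("MSE with true coefficients:   ", mse_true_coefs)
```

With equal variances for the true x and the measurement error, the fitted slope should be roughly half the true slope, yet its forecasts from the noisy measure should have the smaller mean squared error. This only holds because the new observations are drawn from the same process as the estimation sample, which is the external-validity question raised above.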