Multiple Regression Analysis: The Problem of Inference

Similar documents
Two-Variable Regression Model: The Problem of Estimation

Linear Regression & Correlation

Statistical Inference for Means

Heteroscedasticity. Jamie Monogan. Intermediate Political Methodology. University of Georgia. Jamie Monogan (UGA) Heteroscedasticity POLS / 11

Autocorrelation. Jamie Monogan. Intermediate Political Methodology. University of Georgia. Jamie Monogan (UGA) Autocorrelation POLS / 20

Modeling the Mean: Response Profiles v. Parametric Curves

Measuring the fit of the model - SSR

Lecture 3: Multiple Regression

Granger Causality Testing

Inference in Regression Analysis

LECTURE 5 HYPOTHESIS TESTING

Matematické Metody v Ekonometrii 5.

The regression model with one fixed regressor cont d

Simple Linear Regression

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

Chapter 10: Inferences based on two samples

Simple and Multiple Linear Regression

Vector Autoregression

Multiple Regression Analysis

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

Density Temp vs Ratio. temp

Ch 2: Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered)

Chapter 12 - Lecture 2 Inferences about regression coefficient

Lecture 14 Simple Linear Regression

Lectures 5 & 6: Hypothesis Testing

Statistical Inference. Part IV. Statistical Inference

Difference in two or more average scores in different groups

Introduction to Multivariate Relationships

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Stats Review Chapter 14. Mary Stangler Center for Academic Success Revised 8/16

Confidence Intervals, Testing and ANOVA Summary

Scatter plot of data from the study. Linear Regression

Inference for the Regression Coefficient

Linear models and their mathematical foundations: Simple linear regression

Basic Business Statistics, 10/e

Tables Table A Table B Table C Table D Table E 675

Scatter plot of data from the study. Linear Regression

What is a Hypothesis?

Lecture 2 Multiple Regression and Tests

Motivation for multiple regression

Linear Models and Estimation by Least Squares

ECO375 Tutorial 4 Introduction to Statistical Inference

Univariate, Nonstationary Processes

Answers to Problem Set #4

Bias Variance Trade-off

Simple Linear Regression

Modeling the Covariance

Categorical Predictor Variables

Linear Regression Model. Badr Missaoui

Probability and Statistics Notes

2. Outliers and inference for regression

A discussion on multiple regression models

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

Mathematical statistics

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Regression Analysis: Basic Concepts

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

ECON 5350 Class Notes Functional Form and Structural Change

In the previous chapter, we learned how to use the method of least-squares

11 Hypothesis Testing

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Basic Business Statistics 6 th Edition

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.

Applied Econometrics (QEM)

MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7

Regression With a Categorical Independent Variable

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u

Multiple Regression Analysis

Inference for Regression

Single and multiple linear regression analysis

Final Exam - Solutions

Simple Linear Regression: A Model for the Mean. Chap 7

Correlation and Regression

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias

A Significance Test for the Lasso

Chapter Eight: Assessment of Relationships 1/42

Simple Linear Regression: One Qualitative IV

Review of Econometrics

Inferences for Regression

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

Lecture 5: Clustering, Linear Regression

Mathematics for Economics MA course

Lecture 5: Clustering, Linear Regression

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2.

Econometrics Summary Algebraic and Statistical Preliminaries

1 Independent Practice: Hypothesis tests for one parameter:

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION

Chapter 14 Simple Linear Regression (A)

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013

CAS MA575 Linear Models

Linear Regression In God we trust, all others bring data. William Edwards Deming

In Class Review Exercises Vartanian: SW 540

Lecture 5: Clustering, Linear Regression

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as

Statistics and econometrics

Binary Logistic Regression

Multiple Linear Regression

Transcription:

Multiple Regression Analysis: The Problem of Inference Jamie Monogan University of Georgia Intermediate Political Methodology Jamie Monogan (UGA) Multiple Regression Analysis: Inference POLS 7014 1 / 10

Objectives By the end of this meeting, participants should be able to: Construct confidence intervals for individual partial regression coefficients. Test hypotheses about individual partial regression coefficients with t-ratios. Test hypotheses about a whole model, a set of coefficients, or a restriction on a model with F -ratios. Forecast values of the outcome given values of the predictors. Jamie Monogan (UGA) Multiple Regression Analysis: Inference POLS 7014 2 / 10

Inference on Individual Partial Regression Coefficients Just as with the two-variable model, we assume that the disturbances have a normal distribution. This allows us to conduct inferences on individual partial coefficients using the t-distribution just as before. Choose a Type I error rate α and then build a confidence interval or conduct a hypothesis test. Confidence interval for β j : ˆβ j ± t α/2 se( ˆβ j ). Hypothesis test for β j : t = ˆβ j β j se( ˆβ j ). For hypothesis testing, you could calculate a p-value or compare to the critical value, t α for a one-tailed test and t α/2 for a two-tailed test. In all of this, t has n k degrees of freedom where k is the number of parameters. Jamie Monogan (UGA) Multiple Regression Analysis: Inference POLS 7014 3 / 10

Inference on Multiple Partial Regression Coefficients Suppose you want to test a hypothesis about more than one coefficient at once. Estimate a restricted model as if the null hypothesis is true and an unrestricted model that does not impose that assumption. For example, consider the unrestricted model: Y i = β 1 + β 2 X 2i + β 3 X 3i + β 4 X 4i + u i. You state the null hypothesis: H 0 : β 2 = β 3 = 0. If the null hypothesis is true, then the restricted model is: Y i = β 1 + β 4 X 4i + u i. The unrestricted model will explain more variance in Y i. Does it have a significantly better fit? Jamie Monogan (UGA) Multiple Regression Analysis: Inference POLS 7014 4 / 10

General F Testing Provided the restricted model is nested within the unrestricted model, we can use an F -ratio to test whether the null hypothesis is true. Record the coefficient of determination for the unrestricted model (RUR 2 ) and for the restricted model (R2 R ) and use them to calculate the test statistic: F = (R2 UR R2 R )/m (1 RUR 2 )/(n k) In this equation n is the sample size, k is the number of parameters in the unrestricted model, and m is the number of parameters omitted in restricted model. (In the example from the previous slide: k = 4 & m = 2.) This test statistic has an F distribution with m & n k degrees of freedom. Jamie Monogan (UGA) Multiple Regression Analysis: Inference POLS 7014 5 / 10

An Alternate Version of F An equivalent way to state F is: F = (RSS R RSS UR )/m (RSS UR )/(n k) If ever your restriction requires you to rescale the dependent variable, you must use this form of the test. The distribution is exactly the same. This specification also illustrates why the test statistic has an F -distribution: It is a ratio of χ 2 distributed statistics. Jamie Monogan (UGA) Multiple Regression Analysis: Inference POLS 7014 6 / 10

Uses of the F Test The unrestricted model is your theoretically-specified linear model. Unrestricted model: Y i = β 1 + β 2 X 2i + β 3 X 3i + β 4 X 4i + + β k X ki + u i. Overall significance of multiple regression: H 0 : β 2 = β 3 = = β k = 0. Restricted model: Y i = β 1 + u i. Significance of a block of predictors: E.g., H 0 : β 2 = β 3 = 0. Restricted model: Y i = β 1 + β 4 X 4i + + β k X ki + u i. Testing a linear restriction: E.g., H 0 : β 2 + β 3 = 1. Restricted model: Y i = β 1 + X 2i + β 3 (X 3i X 2i ) + β 4 X 4i + + β k X ki + u i. On the last point, we would estimate the model using restricted least squares (RLS). Jamie Monogan (UGA) Multiple Regression Analysis: Inference POLS 7014 7 / 10

Post hoc t Tests Besides restricted least squares, we also can test linear restrictions and (within that) equality of coefficients using a post hoc t-test. This is very handy for inference on interaction terms. (Next week.) State your null hypothesis such that all parameters are on one side and a number is on the other side. For example: H 0 : β 2 = β 3 is equivalent to H 0 : β 2 β 3 = 0. H 0 : β 2 = 1 β 3 is equivalent to H 0 : β 2 + β 3 = 1. Let s say in general your hypothesis is: H 0 : aβ 2 + bβ 3 = c, where a, b, & c are constants (potentially 1 for subtraction). Then the standard error of your t test is: se = a 2 Var( ˆβ 2 ) + b 2 Var( ˆβ 3 ) + 2abCov( ˆβ 2, ˆβ 3 ). Your test statistic is: t = a ˆβ 2 +b ˆβ 3 c se. Jamie Monogan (UGA) Multiple Regression Analysis: Inference POLS 7014 8 / 10

Prediction Forecasting with a multiple regression model is just as easy as for the two-variable model. Ŷ 0 = ˆβ 1 + ˆβ 2 X 20 + + ˆβ k X k0. We can still compute error variances for prediction of the mean or of individual values. This is tricky in scalar notation, though. Refer to Appendix C for notes on matrix-based computation of standard errors. Jamie Monogan (UGA) Multiple Regression Analysis: Inference POLS 7014 9 / 10

For Next Time Read Gujarati & Porter chap 9 (Dummy Variable Regression Models). Study for the midterm. Bring your questions next time. Load Agresti & Finlay s Crime Data, http: //www.ats.ucla.edu/stat/stata/webbooks/reg/crime.dta Estimate a model in which crimes per 100,000 residents is a function of percent metropolitan, percent poverty, and percent single parent. Present your results in a well-formatted table. Interpret the intercept and partial slope coefficients. Present the 95% percent confidence intervals for each parameter and report the results of the hypothesis test H 0 : β = 0, H 1 : β 0 at the 95% confidence level for each parameter. Also, test whether the whole model explains a significant portion of variance in crime. Would a percentage point increase in metropolitan population and a percentage point increase in poverty together raise the crime rate by 10 per 100,000 residents on average? State the null and alternative hypothesis. Report the results of the hypothesis test. Jamie Monogan (UGA) Multiple Regression Analysis: Inference POLS 7014 10 / 10