Model Specification and Data Problems. Part VIII

1 Part VIII Model Specification and Data Problems As of Oct 24, 2017

2 1 Model Specification and Data Problems: RESET test; Non-nested alternatives; Outliers

3 A functional form misspecification generally means that the model does not account for some important nonlinearities. Recall that omitting an important variable is also a model misspecification. Generally, functional form misspecification causes bias in the remaining parameter estimators.

4 Example 1 Suppose that the correct specification of the wage equation is

$$\log(\mathit{wage}) = \beta_0 + \beta_1\,\mathit{educ} + \beta_2\,\mathit{exper} + \beta_3\,\mathit{exper}^2 + u. \tag{1}$$

Then the return for an extra year of experience is

$$\frac{\partial \log(\mathit{wage})}{\partial\,\mathit{exper}} = \beta_2 + 2\beta_3\,\mathit{exper}. \tag{2}$$

If the second-order term is dropped from (1), the resulting estimate of $\beta_2$ is biased and its use can be misleading.
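Equation (2) can be illustrated with a minimal R sketch. The data below are simulated and all coefficient values are illustrative assumptions, not estimates from the lecture.

```r
# Simulated wage data; every number here is a hypothetical choice for illustration.
set.seed(1)
n     <- 500
educ  <- round(runif(n, 8, 18))
exper <- round(runif(n, 0, 40))
lwage <- 0.5 + 0.08 * educ + 0.04 * exper - 0.0007 * exper^2 + rnorm(n, sd = 0.3)

fit_quad <- lm(lwage ~ educ + exper + I(exper^2))  # specification (1)
fit_lin  <- lm(lwage ~ educ + exper)               # second-order term dropped

b <- coef(fit_quad)

# Equation (2): estimated return to one more year at experience level x
return_exper <- function(x) unname(b["exper"] + 2 * b["I(exper^2)"] * x)

return_exper(c(1, 10, 30))  # the return changes with the level of experience
coef(fit_lin)["exper"]      # the misspecified model forces one constant slope
```

The misspecified fit summarizes a declining return by a single number, which is why reading its experience coefficient as "the" return can mislead.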


6 Ramsey (1969)² proposed a general test for functional form misspecification, the Regression Specification Error Test (RESET), which has proven to be useful. Estimate

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + u, \tag{3}$$

obtain the fitted values $\hat{y}$, and estimate the augmented model

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \delta_1 \hat{y}^2 + \delta_2 \hat{y}^3 + e. \tag{4}$$

Test the null hypothesis

$$H_0: \delta_1 = \delta_2 = 0 \tag{5}$$

with the F-test with numerator $df_1 = 2$ and denominator $df_2 = n - k - 3$.

² Ramsey, J.B. (1969). Tests for specification errors in classical linear least-squares analysis, Journal of the Royal Statistical Society, Series B, 31, 350-371.
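The RESET steps (3)-(5) can be sketched in base R as follows. The data are simulated under an assumed nonlinear relation, and the variable names are hypothetical.

```r
# Simulated data: the true relation is nonlinear in x2 (an assumption for illustration).
set.seed(42)
n  <- 200
x1 <- runif(n, 1, 10)
x2 <- runif(n, 1, 10)
y  <- 1 + 0.5 * x1 + 0.1 * x2^2 + rnorm(n)

base <- lm(y ~ x1 + x2)                          # model (3), k = 2
yhat <- fitted(base)                             # fitted values from (3)
aug  <- lm(y ~ x1 + x2 + I(yhat^2) + I(yhat^3))  # augmented model (4)

reset  <- anova(base, aug)    # F-test of H0: delta1 = delta2 = 0, equation (5)
F_stat <- reset$F[2]
p_val  <- reset$`Pr(>F)`[2]
df1    <- reset$Df[2]         # numerator df, 2
df2    <- reset$Res.Df[2]     # denominator df, n - k - 3 = 195
```

Comparing the restricted and augmented fits with `anova()` gives the same F-statistic as testing the two added terms jointly, so no extra package is needed for this sketch.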

7 Example 2 Consider the house price data (Exercise 3.1) and estimate

$$\mathit{price} = \beta_0 + \beta_1\,\mathit{lotsize} + \beta_2\,\mathit{sqrft} + \beta_3\,\mathit{bdrms} + u. \tag{6}$$

Estimation results (the numerical entries of the output table are missing from this transcript):

Dependent Variable: PRICE
Method: Least Squares
Sample: 1 88
Included observations: 88
==========================================================
Variable   Coefficient   Std. Error   t-statistic   Prob.
==========================================================
C
LOTSIZE
SQRFT
BDRMS
==========================================================
Reported summary statistics: R-squared, Adjusted R-squared, S.E. of regression, Sum squared resid, Log likelihood, Durbin-Watson stat, Mean dependent var, S.D. dependent var, Akaike info criterion, Schwarz criterion, F-statistic, Prob(F-statistic).

8 Next, estimate (6) augmented with $\widehat{\mathit{price}}^2$ and $\widehat{\mathit{price}}^3$ as in (4). The F-statistic for the null hypothesis (5) becomes F = 4.67 with 2 and 82 degrees of freedom. The p-value is 0.012, so we reject the null hypothesis at the 5% level. Thus, there is some evidence of non-linearity.

9 Next, estimate

$$\log(\mathit{price}) = \beta_0 + \beta_1 \log(\mathit{lotsize}) + \beta_2 \log(\mathit{sqrft}) + \beta_3\,\mathit{bdrms} + u. \tag{7}$$

Estimation results (the numerical entries of the output table are missing from this transcript):

Dependent Variable: LOG(PRICE)
Method: Least Squares
Date: 10/19/06 Time: 00:01
Sample: 1 88
Included observations: 88
============================================================
Variable       Coefficient   Std. Error   t-statistic   Prob.
============================================================
C
LOG(LOTSIZE)
LOG(SQRFT)
BDRMS
============================================================
Reported summary statistics: R-squared, Adjusted R-squared, S.E. of regression, Sum squared resid, Log likelihood, Durbin-Watson stat, Mean dependent var, S.D. dependent var, Akaike info criterion, Schwarz criterion, F-statistic, Prob(F-statistic).

10 Applying the RESET test to (7), the F-statistic for the null hypothesis (5) is now F = 2.56 with p-value 0.084, so the hypothesis is not rejected at the 5% level. Thus overall, on the basis of the RESET test, the log-log model (7) is preferred to the level model (6).


12 Non-nested alternatives arise, for example, if the model choices are

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u \tag{8}$$

and

$$y = \beta_0 + \beta_1 \log(x_1) + \beta_2 \log(x_2) + u. \tag{9}$$

Because the models are non-nested (neither is a special case of the other), the usual F-test does not apply to a direct comparison. A common approach is to estimate a combined model

$$y = \gamma_0 + \gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 \log(x_1) + \gamma_4 \log(x_2) + u. \tag{10}$$

Here $H_0: \gamma_3 = \gamma_4 = 0$ is a hypothesis for (8) and $H_0: \gamma_1 = \gamma_2 = 0$ is a hypothesis for (9). Within (10) the usual F-test applies again.
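The two F-tests in the combined model (10) can be sketched in base R. The data are simulated so that the logarithmic model (9) is the true one, which is an assumption made only for this illustration.

```r
# Simulated data in which model (9) holds (hypothetical coefficient values).
set.seed(7)
n  <- 300
x1 <- runif(n, 1, 20)
x2 <- runif(n, 1, 20)
y  <- 2 + 1.5 * log(x1) + 0.8 * log(x2) + rnorm(n, sd = 0.5)

m_lin  <- lm(y ~ x1 + x2)                      # model (8)
m_log  <- lm(y ~ log(x1) + log(x2))            # model (9)
m_comb <- lm(y ~ x1 + x2 + log(x1) + log(x2))  # combined model (10)

# H0: gamma3 = gamma4 = 0, i.e. the level model (8) is adequate
p_lin <- anova(m_lin, m_comb)$`Pr(>F)`[2]
# H0: gamma1 = gamma2 = 0, i.e. the log model (9) is adequate
p_log <- anova(m_log, m_comb)$`Pr(>F)`[2]

c(p_lin = p_lin, p_log = p_log)
```

A small `p_lin` together with a large `p_log` would point toward (9), but as with any pair of tests, both or neither hypothesis can be rejected in a given sample.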

13 Davidson and MacKinnon (1981)³ procedure: for example, to test (8), first estimate

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \theta_1 \hat{y} + v, \tag{11}$$

where $\hat{y}$ denotes the fitted values of (9). A significant t-value of the $\theta_1$-estimate is a rejection of (8). Similarly, if $\hat{y}$ denotes the fitted values of (8), the test of (9) is the t-statistic of the $\theta_1$-estimate from

$$y = \beta_0 + \beta_1 \log(x_1) + \beta_2 \log(x_2) + \theta_1 \hat{y} + v. \tag{12}$$

³ Davidson, R. and J.G. MacKinnon (1981). Several tests for model specification in the presence of alternative hypotheses, Econometrica 49.
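A minimal base-R sketch of the Davidson and MacKinnon regressions (11) and (12) follows. It uses simulated data in which the logarithmic model is assumed to be true; the object names are hypothetical.

```r
# Simulated data in which the log model (9) holds (illustrative assumption).
set.seed(7)
n  <- 300
x1 <- runif(n, 1, 20)
x2 <- runif(n, 1, 20)
y  <- 2 + 1.5 * log(x1) + 0.8 * log(x2) + rnorm(n, sd = 0.5)

m_lin <- lm(y ~ x1 + x2)            # model (8)
m_log <- lm(y ~ log(x1) + log(x2))  # model (9)

yhat_log <- fitted(m_log)           # fitted values of (9), used to test (8)
yhat_lin <- fitted(m_lin)           # fitted values of (8), used to test (9)

j_lin <- lm(y ~ x1 + x2 + yhat_log)            # equation (11)
j_log <- lm(y ~ log(x1) + log(x2) + yhat_lin)  # equation (12)

# t-statistics of the theta_1 estimates
t_lin <- summary(j_lin)$coefficients["yhat_log", "t value"]
t_log <- summary(j_log)$coefficients["yhat_lin", "t value"]

c(t_lin = t_lin, t_log = t_log)
```

A large absolute `t_lin` rejects the level model (8); `t_log` plays the same role for the log model (9).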

14 Remark 8.1: A clear winner need not emerge. Both models may be rejected or neither may be rejected. In the latter case the adjusted R-square can be used to select the better fitting one. If both models are rejected, more work is needed.⁴

⁴ For more complicated cases, see Wooldridge, J.M. (1994). A simple specification test for the predictive ability of transformation models, Review of Economics and Statistics 76.


16 As discussed earlier, an important source of bias in OLS is omitted variables that are correlated with the included explanatory variables. Often the reason for omission is that these variables are unobservable. A way to mitigate the problem is to collect data on proxy variables. Consider the following regression

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2^* + u, \tag{13}$$

where $x_2^*$ is an unobservable variable (e.g. human ability).

17 Suppose that the primary interest is to estimate $\beta_1$, so that the unobserved variable $x_2^*$ is a control variable. However, as we know, the simple regression

$$y = \beta_0 + \beta_1 x_1 + v$$

results in a biased and inconsistent OLS estimator of $\beta_1$ such that

$$\operatorname{plim} \hat{\beta}_1 = \beta_1 + \gamma_1 \beta_2,$$

where $\gamma_1$ is the slope coefficient of the regression $x_2^* = \gamma_0 + \gamma_1 x_1 + \text{error}$.

Suppose that we have a good proxy $x_2$ for $x_2^*$ such that $E[x_2^* \mid x_2, x_1] = E[x_2^* \mid x_2]$, i.e., given the proxy $x_2$, $x_1$ does not help in predicting the unobserved variable $x_2^*$, and such that $E[u \mid x_2] = 0$ for the error term in regression (13). These imply that in the regression

$$x_2^* = \delta_0 + \delta_1 x_2 + \theta x_1 + e$$

we have $\theta = 0$, so that only the proxy $x_2$ is related to the unobserved variable $x_2^*$, and that the proxy $x_2$ is not correlated with the error term of the true regression in equation (13).

18 With this kind of a good proxy, instead of (13) the model to be estimated becomes

$$y = \alpha_0 + \beta_1 x_1 + \alpha_2 x_2 + w. \tag{14}$$

Now OLS is an unbiased and consistent estimator of $\beta_1$, the parameter we are primarily interested in. (The OLS estimators of $\alpha_0$ and $\alpha_2$ are also unbiased and consistent for these parameters, but $\alpha_0 = \beta_0 + \beta_2 \delta_0$ and $\alpha_2 = \delta_1 \beta_2$ differ from $\beta_0$ and $\beta_2$.)
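The step from (13) to (14) can be written out as a short derivation. It is a sketch that assumes the notation $x_2^*$ for the unobserved variable and $x_2$ for its proxy, and uses the proxy regression with $\theta = 0$.

```latex
\begin{aligned}
x_2^* &= \delta_0 + \delta_1 x_2 + e
  && \text{(proxy relation with } \theta = 0\text{)} \\
y &= \beta_0 + \beta_1 x_1 + \beta_2\,(\delta_0 + \delta_1 x_2 + e) + u
  && \text{(substitute into (13))} \\
  &= \underbrace{(\beta_0 + \beta_2 \delta_0)}_{\alpha_0}
     + \beta_1 x_1
     + \underbrace{\beta_2 \delta_1}_{\alpha_2}\, x_2
     + \underbrace{(u + \beta_2 e)}_{w}
  && \text{(model (14))}
\end{aligned}
```

The coefficient on $x_1$ is unchanged by the substitution, which is why the proxy regression still targets $\beta_1$, while the intercept and the proxy coefficient are mixtures of the original parameters.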

19 Example 3 Consider the return to education in wages (monthly) for men (wage2 data set).

lm(formula = log(wage) ~ educ + exper + tenure + married + south + urban + black, data = wdf)

Coefficients (only the surviving Pr(>|t|) fragments and significance codes are shown; estimates, standard errors and t values are missing from this transcript):
(Intercept)   < 2e-16   ***
educ          < 2e-16   ***
exper         …e-05     ***
tenure        …e-06     ***
married       …e-07     ***
south                   ***
urban         …e-11     ***
black         …e-07     ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: on 927 degrees of freedom
F-statistic: on 7 and 927 DF, p-value: < 2.2e-16

20 The estimated return to education is 6.5%. However, if the omitted ability is positively correlated with educ, the estimate is too high. Adding IQ as a proxy for ability into the equation reduces the estimate to 5.4%, which is consistent with the omitted variable bias assumption.

lm(formula = log(wage) ~ educ + exper + tenure + married + south + urban + black + iq, data = wdf)

Coefficients (only the surviving Pr(>|t|) fragments and significance codes are shown; estimates, standard errors and t values are missing from this transcript):
(Intercept)   < 2e-16   ***
educ          …e-14     ***
exper         …e-06     ***
tenure        …e-06     ***
married       …e-07     ***
south                   **
urban         …e-11     ***
black                   ***
iq                      ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: on 926 degrees of freedom
F-statistic: on 8 and 926 DF, p-value: < 2.2e-16
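The mechanism behind this drop in the education coefficient can be reproduced on simulated data. The sketch below does not use the wage2 data; the data-generating process and all numbers are assumptions chosen so that IQ satisfies the proxy conditions.

```r
# Simulated data: ability is unobserved, iq is an ideal proxy (hypothetical setup).
set.seed(123)
n       <- 5000
iq      <- rnorm(n, mean = 100, sd = 15)
educ    <- 13 + 0.08 * (iq - 100) + rnorm(n, sd = 2)
# Ability depends on iq only; given iq, educ does not help predict it
ability <- 0.05 * (iq - 100) + rnorm(n, sd = 0.5)
lwage   <- 5.5 + 0.05 * educ + 0.10 * ability + rnorm(n, sd = 0.4)

b_short <- unname(coef(lm(lwage ~ educ))["educ"])       # ability omitted
b_proxy <- unname(coef(lm(lwage ~ educ + iq))["educ"])  # iq added as proxy

c(omitted = b_short, with_proxy = b_proxy, true = 0.05)
```

Because ability and education are positively correlated here, the short regression overstates the return to education, and adding the proxy moves the estimate back toward the assumed true value of 0.05.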

21 Test whether the interaction of ability and education affects wages.

lm(formula = log(wage) ~ educ + exper + tenure + married + south + urban + black + iq + iq:educ, data = wdf) # iq:educ introduces interaction iq*educ

Coefficients (only the surviving Pr(>|t|) fragments and significance codes are shown; estimates, standard errors and t values are missing from this transcript):
(Intercept)   < 2e-16   ***
educ
exper         …e-05     ***
tenure        …e-06     ***
married       …e-07     ***
south                   **
urban         …e-11     ***
black                   ***
iq
educ:iq
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: on 925 degrees of freedom
F-statistic: on 9 and 925 DF, p-value: < 2.2e-16

22 The added term iq × educ is not only insignificant, it also renders educ and iq insignificant! This is due to the high correlation of the interaction term with its components:

> with(wdf, cor(cbind(educ, iq, educ*iq)))

The implied collinearity can be materially reduced by defining the interaction term in terms of demeaned variables:

> with(wdf, cor(cbind(educ, iq, (educ - mean(educ))*(iq - mean(iq)))))

(The entries of both correlation matrices are missing from this transcript.)
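The effect of demeaning on these correlations can be checked with simulated data. The sketch below only mimics the sample size implied by the regression output (935 observations); the distributions of educ and iq are assumptions.

```r
# Simulated educ and iq (hypothetical distributions, not the wage2 data).
set.seed(2017)
n    <- 935
iq   <- rnorm(n, mean = 100, sd = 15)
educ <- round(13.5 + 0.07 * (iq - 100) + rnorm(n, sd = 1.9))

raw      <- educ * iq                                # plain interaction
centered <- (educ - mean(educ)) * (iq - mean(iq))    # demeaned interaction

cor_raw <- cor(cbind(educ, iq, raw))
cor_ctr <- cor(cbind(educ, iq, centered))

round(cor_raw, 2)  # interaction strongly correlated with educ and iq
round(cor_ctr, 2)  # correlations with the components largely removed
```

The product of two positive variables moves almost one-for-one with each factor, whereas the product of the demeaned variables is close to uncorrelated with them when their joint distribution is roughly symmetric.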

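23 An interaction term built from the demeaned components also leads to a meaningful interpretation of the implied model. We can write

$$\begin{aligned}
\log(\mathit{wage}) &= \beta_0 + \beta_1\,\mathit{educ} + \beta_2\,\mathit{iq} + \beta_{12}\,(\widetilde{\mathit{educ}} \times \widetilde{\mathit{iq}}) + \text{other factors} \\
&= \tilde{\beta}_0 + \beta_1\,\widetilde{\mathit{educ}} + \beta_2\,\widetilde{\mathit{iq}} + \beta_{12}\,(\widetilde{\mathit{educ}} \times \widetilde{\mathit{iq}}) + \text{other factors},
\end{aligned}$$

where $\widetilde{\mathit{educ}} = \mathit{educ} - \overline{\mathit{educ}}$ and $\widetilde{\mathit{iq}} = \mathit{iq} - \overline{\mathit{iq}}$ are demeaned educ and iq, and $\tilde{\beta}_0 = \beta_0 + \beta_1\,\overline{\mathit{educ}} + \beta_2\,\overline{\mathit{iq}}$. We can further write

$$\log(\mathit{wage}) = \tilde{\beta}_0 + (\beta_1 + \beta_{12}\,\widetilde{\mathit{iq}})\,\widetilde{\mathit{educ}} + \beta_2\,\widetilde{\mathit{iq}} + \text{other factors}.$$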

24 The slope coefficient $\beta_1 + \beta_{12}\,\widetilde{\mathit{iq}}$ of educ implies that the return to education depends on the level of ability (measured by IQ). At the mean IQ, $\widetilde{\mathit{iq}} = 0$, so that $\beta_1$ indicates the return to education for a person with average ability, and $\beta_{12}$ indicates the rate, per IQ point, at which the return to education changes when ability (measured in terms of IQ) deviates from the average. Assuming $\beta_{12} > 0$, above-average ability implies a higher return to education and below-average ability a lower return.
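The interpretation above rests on the demeaned interaction being a reparameterization of the plain one. A minimal sketch on simulated data (all values are assumptions) shows that the interaction coefficient is unchanged, while the educ coefficient becomes the return at mean IQ.

```r
# Simulated data with a small positive interaction (hypothetical values).
set.seed(11)
n     <- 935
iq    <- rnorm(n, mean = 100, sd = 15)
educ  <- 13.5 + 0.07 * (iq - 100) + rnorm(n, sd = 1.9)
lwage <- 5.2 + 0.05 * educ + 0.004 * iq +
  0.0005 * (educ - 13.5) * (iq - 100) + rnorm(n, sd = 0.37)

fit_raw <- lm(lwage ~ educ + iq + educ:iq)  # plain interaction
fit_ctr <- lm(lwage ~ educ + iq +
                I((educ - mean(educ)) * (iq - mean(iq))))  # demeaned interaction

b_raw <- unname(coef(fit_raw))  # order: intercept, educ, iq, interaction
b_ctr <- unname(coef(fit_ctr))

b12_raw <- b_raw[4]
b12_ctr <- b_ctr[4]                        # same interaction coefficient
ret_at_mean_iq <- b_raw[2] + b12_raw * mean(iq)
educ_ctr <- b_ctr[2]                       # equals the return at mean IQ
```

Both fits span the same column space, so fitted values and the interaction estimate coincide. Only the meaning of the educ and iq coefficients changes, which is why their t-statistics can differ so sharply between the two outputs.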

25 Estimating the model, however, indicates that the estimate $\hat{\beta}_{12}$, with p-value 0.37, is not at all statistically significant, which implies that there is no evidence that variability in IQ as such affects the return to education.

lm(formula = log(wage) ~ educ + exper + tenure + married + south + urban + black + iq + I((iq - mean(iq)) * (educ - mean(educ))), data = wdf)

Coefficients (only the surviving Pr(>|t|) fragments are shown; estimates, standard errors and t values are missing from this transcript):
(Intercept)                                < 2e-16
educ                                       …e-13
exper                                      …e-05
tenure                                     …e-06
married                                    …e-07
south
urban                                      …e-11
black
iq
I((iq - mean(iq)) * (educ - mean(educ)))

Residual standard error: on 925 degrees of freedom
F-statistic: on 9 and 925 DF, p-value: < 2.2e-16


27 Outliers Particularly in small data sets, OLS estimates may be influenced by one or several observations (see figure). Generally such observations are called outliers or influential observations. Loosely, an observation is an outlier if dropping it changes the estimation results materially. In the detection of outliers, a usual practice is to investigate standardized (or studentized) residuals. If an outlier is an obvious mistake in recording the data, it can be corrected. Another usual practice is to eliminate such observations. Data transformations, like taking logarithms, often narrow the range of the data and hence may alleviate outlier problems, too.
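Screening with studentized residuals can be sketched in base R. The data are simulated, one observation is deliberately contaminated, and the cutoff of 3 is a rule-of-thumb assumption rather than a value from the lecture.

```r
# Small simulated sample with one contaminated observation (hypothetical).
set.seed(99)
n <- 30
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n)
y[n] <- y[n] + 15          # mimic a recording error in the last observation

fit  <- lm(y ~ x)
stud <- rstudent(fit)      # externally studentized residuals
keep <- abs(stud) <= 3     # assumed cutoff for flagging outliers

which(!keep)               # indices of flagged observations

fit_drop <- lm(y ~ x, subset = keep)  # re-estimate without flagged observations
rbind(all = coef(fit), dropped = coef(fit_drop))
```

Comparing the two coefficient rows shows how much a single observation moves the estimates, which is the loose definition of an outlier used above.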

Statistical Inference. Part IV. Statistical Inference

Statistical Inference. Part IV. Statistical Inference Part IV Statistical Inference As of Oct 5, 2017 Sampling Distributions of the OLS Estimator 1 Statistical Inference Sampling Distributions of the OLS Estimator Testing Against One-Sided Alternatives Two-Sided

More information

More on Specification and Data Issues

More on Specification and Data Issues More on Specification and Data Issues Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Specification and Data Issues 1 / 35 Functional Form Misspecification Functional

More information

The Simple Regression Model. Part II. The Simple Regression Model

The Simple Regression Model. Part II. The Simple Regression Model Part II The Simple Regression Model As of Sep 22, 2015 Definition 1 The Simple Regression Model Definition Estimation of the model, OLS OLS Statistics Algebraic properties Goodness-of-Fit, the R-square

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

Heteroskedasticity. Part VII. Heteroskedasticity

Heteroskedasticity. Part VII. Heteroskedasticity Part VII Heteroskedasticity As of Oct 15, 2015 1 Heteroskedasticity Consequences Heteroskedasticity-robust inference Testing for Heteroskedasticity Weighted Least Squares (WLS) Feasible generalized Least

More information

The general linear regression with k explanatory variables is just an extension of the simple regression as follows

The general linear regression with k explanatory variables is just an extension of the simple regression as follows 3. Multiple Regression Analysis The general linear regression with k explanatory variables is just an extension of the simple regression as follows (1) y i = β 0 + β 1 x i1 + + β k x ik + u i. Because

More information

Regression with Qualitative Information. Part VI. Regression with Qualitative Information

Regression with Qualitative Information. Part VI. Regression with Qualitative Information Part VI Regression with Qualitative Information As of Oct 17, 2017 1 Regression with Qualitative Information Single Dummy Independent Variable Multiple Categories Ordinal Information Interaction Involving

More information

Outline. 2. Logarithmic Functional Form and Units of Measurement. Functional Form. I. Functional Form: log II. Units of Measurement

Outline. 2. Logarithmic Functional Form and Units of Measurement. Functional Form. I. Functional Form: log II. Units of Measurement Outline 2. Logarithmic Functional Form and Units of Measurement I. Functional Form: log II. Units of Measurement Read Wooldridge (2013), Chapter 2.4, 6.1 and 6.2 2 Functional Form I. Functional Form: log

More information

Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE. Sampling Distributions of OLS Estimators

Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE. Sampling Distributions of OLS Estimators 1 2 Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE Hüseyin Taştan 1 1 Yıldız Technical University Department of Economics These presentation notes are based on Introductory

More information

5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1)

5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1) 5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1) Assumption #A1: Our regression model does not lack of any further relevant exogenous variables beyond x 1i, x 2i,..., x Ki and

More information

Answers to Problem Set #4

Answers to Problem Set #4 Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

Multiple Regression: Inference

Multiple Regression: Inference Multiple Regression: Inference The t-test: is ˆ j big and precise enough? We test the null hypothesis: H 0 : β j =0; i.e. test that x j has no effect on y once the other explanatory variables are controlled

More information

Multiple Regression Analysis: Heteroskedasticity

Multiple Regression Analysis: Heteroskedasticity Multiple Regression Analysis: Heteroskedasticity y = β 0 + β 1 x 1 + β x +... β k x k + u Read chapter 8. EE45 -Chaiyuth Punyasavatsut 1 topics 8.1 Heteroskedasticity and OLS 8. Robust estimation 8.3 Testing

More information

Problem C7.10. points = exper.072 exper guard forward (1.18) (.33) (.024) (1.00) (1.00)

Problem C7.10. points = exper.072 exper guard forward (1.18) (.33) (.024) (1.00) (1.00) BOSTON COLLEGE Department of Economics EC 228 02 Econometric Methods Fall 2009, Prof. Baum, Ms. Phillips (TA), Ms. Pumphrey (grader) Problem Set 5 Due Tuesday 10 November 2009 Total Points Possible: 160

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0 Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct

More information

4. Nonlinear regression functions

4. Nonlinear regression functions 4. Nonlinear regression functions Up to now: Population regression function was assumed to be linear The slope(s) of the population regression function is (are) constant The effect on Y of a unit-change

More information

Lecture 8. Using the CLR Model. Relation between patent applications and R&D spending. Variables

Lecture 8. Using the CLR Model. Relation between patent applications and R&D spending. Variables Lecture 8. Using the CLR Model Relation between patent applications and R&D spending Variables PATENTS = No. of patents (in 000) filed RDEP = Expenditure on research&development (in billions of 99 $) The

More information

6. Assessing studies based on multiple regression

6. Assessing studies based on multiple regression 6. Assessing studies based on multiple regression Questions of this section: What makes a study using multiple regression (un)reliable? When does multiple regression provide a useful estimate of the causal

More information

Economics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama

Economics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama Economics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama Course Packet The purpose of this packet is to show you one particular dataset and how it is used in

More information

Review of Econometrics

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

More information

Variance Decomposition and Goodness of Fit

Variance Decomposition and Goodness of Fit Variance Decomposition and Goodness of Fit 1. Example: Monthly Earnings and Years of Education In this tutorial, we will focus on an example that explores the relationship between total monthly earnings

More information

7. Prediction. Outline: Read Section 6.4. Mean Prediction

7. Prediction. Outline: Read Section 6.4. Mean Prediction Outline: Read Section 6.4 II. Individual Prediction IV. Choose between y Model and log(y) Model 7. Prediction Read Wooldridge (2013), Chapter 6.4 2 Mean Prediction Predictions are useful But they are subject

More information

13. Time Series Analysis: Asymptotics Weakly Dependent and Random Walk Process. Strict Exogeneity

13. Time Series Analysis: Asymptotics Weakly Dependent and Random Walk Process. Strict Exogeneity Outline: Further Issues in Using OLS with Time Series Data 13. Time Series Analysis: Asymptotics Weakly Dependent and Random Walk Process I. Stationary and Weakly Dependent Time Series III. Highly Persistent

More information

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 PDF file location: http://www.murraylax.org/rtutorials/regression_anovatable.pdf

More information

Inference in Regression Analysis

Inference in Regression Analysis ECNS 561 Inference Inference in Regression Analysis Up to this point 1.) OLS is unbiased 2.) OLS is BLUE (best linear unbiased estimator i.e., the variance is smallest among linear unbiased estimators)

More information

Solutions to Problem Set 5 (Due November 22) Maximum number of points for Problem set 5 is: 220. Problem 7.3

Solutions to Problem Set 5 (Due November 22) Maximum number of points for Problem set 5 is: 220. Problem 7.3 Solutions to Problem Set 5 (Due November 22) EC 228 02, Fall 2010 Prof. Baum, Ms Hristakeva Maximum number of points for Problem set 5 is: 220 Problem 7.3 (i) (5 points) The t statistic on hsize 2 is over

More information

Heteroscedasticity 1

Heteroscedasticity 1 Heteroscedasticity 1 Pierre Nguimkeu BUEC 333 Summer 2011 1 Based on P. Lavergne, Lectures notes Outline Pure Versus Impure Heteroscedasticity Consequences and Detection Remedies Pure Heteroscedasticity

More information

Lecture 8. Using the CLR Model

Lecture 8. Using the CLR Model Lecture 8. Using the CLR Model Example of regression analysis. Relation between patent applications and R&D spending Variables PATENTS = No. of patents (in 1000) filed RDEXP = Expenditure on research&development

More information

Eastern Mediterranean University Department of Economics ECON 503: ECONOMETRICS I. M. Balcilar. Midterm Exam Fall 2007, 11 December 2007.

Eastern Mediterranean University Department of Economics ECON 503: ECONOMETRICS I. M. Balcilar. Midterm Exam Fall 2007, 11 December 2007. Eastern Mediterranean University Department of Economics ECON 503: ECONOMETRICS I M. Balcilar Midterm Exam Fall 2007, 11 December 2007 Duration: 120 minutes Questions Q1. In order to estimate the demand

More information

Brief Suggested Solutions

Brief Suggested Solutions DEPARTMENT OF ECONOMICS UNIVERSITY OF VICTORIA ECONOMICS 366: ECONOMETRICS II SPRING TERM 5: ASSIGNMENT TWO Brief Suggested Solutions Question One: Consider the classical T-observation, K-regressor linear

More information

ECMT 676 Assignment #1 March 18, and x. are unknown? - Run the following regression: directly? What if μ1

ECMT 676 Assignment #1 March 18, and x. are unknown? - Run the following regression: directly? What if μ1 ECMT 676 Assignment #1 March 18, 008 4.8 (Average Partial Effect) (, ) = β + β + β + β + β E y x x x x x x x 1 0 1 1 3 1 4 a. Average Partial Effect(APE) of x1 and x - APE of x 1 : - APE of x : (, ) E

More information

Regression #8: Loose Ends

Regression #8: Loose Ends Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch

More information

10. Time series regression and forecasting

10. Time series regression and forecasting 10. Time series regression and forecasting Key feature of this section: Analysis of data on a single entity observed at multiple points in time (time series data) Typical research questions: What is the

More information

The linear model. Our models so far are linear. Change in Y due to change in X? See plots for: o age vs. ahe o carats vs.

The linear model. Our models so far are linear. Change in Y due to change in X? See plots for: o age vs. ahe o carats vs. 8 Nonlinear effects Lots of effects in economics are nonlinear Examples Deal with these in two (sort of three) ways: o Polynomials o Logarithms o Interaction terms (sort of) 1 The linear model Our models

More information

3. Linear Regression With a Single Regressor

3. Linear Regression With a Single Regressor 3. Linear Regression With a Single Regressor Econometrics: (I) Application of statistical methods in empirical research Testing economic theory with real-world data (data analysis) 56 Econometrics: (II)

More information

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity R.G. Pierse 1 Omitted Variables Suppose that the true model is Y i β 1 + β X i + β 3 X 3i + u i, i 1,, n (1.1) where β 3 0 but that the

More information

7. Integrated Processes

7. Integrated Processes 7. Integrated Processes Up to now: Analysis of stationary processes (stationary ARMA(p, q) processes) Problem: Many economic time series exhibit non-stationary patterns over time 226 Example: We consider

More information

Tests of Linear Restrictions

Tests of Linear Restrictions Tests of Linear Restrictions 1. Linear Restricted in Regression Models In this tutorial, we consider tests on general linear restrictions on regression coefficients. In other tutorials, we examine some

More information

ECON2228 Notes 8. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 35

ECON2228 Notes 8. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 35 ECON2228 Notes 8 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 6 2014 2015 1 / 35 Functional form misspecification Chapter 9: More on specification and data problems

More information

Density Temp vs Ratio. temp

Density Temp vs Ratio. temp Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,

More information

7. Integrated Processes

7. Integrated Processes 7. Integrated Processes Up to now: Analysis of stationary processes (stationary ARMA(p, q) processes) Problem: Many economic time series exhibit non-stationary patterns over time 226 Example: We consider

More information

1 Quantitative Techniques in Practice

1 Quantitative Techniques in Practice 1 Quantitative Techniques in Practice 1.1 Lecture 2: Stationarity, spurious regression, etc. 1.1.1 Overview In the rst part we shall look at some issues in time series economics. In the second part we

More information

Intermediate Econometrics

Intermediate Econometrics Intermediate Econometrics Heteroskedasticity Text: Wooldridge, 8 July 17, 2011 Heteroskedasticity Assumption of homoskedasticity, Var(u i x i1,..., x ik ) = E(u 2 i x i1,..., x ik ) = σ 2. That is, the

More information

Exercise Sheet 6: Solutions

Exercise Sheet 6: Solutions Exercise Sheet 6: Solutions R.G. Pierse 1. (a) Regression yields: Dependent Variable: LC Date: 10/29/02 Time: 18:37 Sample(adjusted): 1950 1985 Included observations: 36 after adjusting endpoints C 0.244716

More information

ECON Introductory Econometrics. Lecture 16: Instrumental variables

ECON Introductory Econometrics. Lecture 16: Instrumental variables ECON4150 - Introductory Econometrics Lecture 16: Instrumental variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 12 Lecture outline 2 OLS assumptions and when they are violated Instrumental

More information

Financial Time Series Analysis: Part II

Financial Time Series Analysis: Part II Department of Mathematics and Statistics, University of Vaasa, Finland Spring 2017 1 Unit root Deterministic trend Stochastic trend Testing for unit root ADF-test (Augmented Dickey-Fuller test) Testing

More information

ST430 Exam 1 with Answers

ST430 Exam 1 with Answers ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.

More information

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C =

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C = Economics 130 Lecture 6 Midterm Review Next Steps for the Class Multiple Regression Review & Issues Model Specification Issues Launching the Projects!!!!! Midterm results: AVG = 26.5 (88%) A = 27+ B =

More information

Inferences on Linear Combinations of Coefficients

Inferences on Linear Combinations of Coefficients Inferences on Linear Combinations of Coefficients Note on required packages: The following code required the package multcomp to test hypotheses on linear combinations of regression coefficients. If you

More information

Econ 510 B. Brown Spring 2014 Final Exam Answers
