Exercises for Applied Econometrics A

QEM
F. Gardes, C. Starzec, M.A. Diaye

I. Exercise: The panel of household expenditures in Poland, for the years 1997 to 2000, gives the following statistics for the whole population and for rich and poor households:

foodm97 (resp. 98, 99, 00) = household food expenditure in 1997 (resp. 1998, 1999, 2000)
depm97 = household total expenditure in 1997

a. Whole population:

. sum foodm97 depm97 foodm98 depm98 foodm99 depm99 foodm00 depm00

    Variable |   Obs       Mean  Std. Dev.        Min        Max
-------------+--------------------------------------------------
     foodm97 |  3052   548.2228   250.0745      38.05    2644.08
      depm97 |  3052   1340.535   994.8333     292.24   20363.52
     foodm98 |  3052   524.4633   242.7529   35.76956   3119.484
      depm98 |  3052   1662.085   941.3402      364.4   14285.11
     foodm99 |  3052    487.575   216.9362   66.65106   1832.366
      depm99 |  3052   1740.509   1040.891     311.82   18606.06
     foodm00 |  3051   491.9385   224.7626    39.0117   1989.525
      depm00 |  3051   1930.883   1207.007     411.15   24977.98

b. Poor households:

. sum foodm97 depm97 foodm98 depm98 foodm99 depm99 foodm00 depm00 if depnuc97<389

    Variable |   Obs       Mean  Std. Dev.        Min        Max
-------------+--------------------------------------------------
     foodm97 |   914   500.2924    203.411      83.94     1576.4
      depm97 |   914   935.7614   307.5771     292.24    2206.23
     foodm98 |   914   514.0506   230.4244   95.22786   3119.484
      depm98 |   914    1417.85    527.155     527.79    5566.32
     foodm99 |   914   479.9694   197.1058    79.5568   1416.003
      depm99 |   914   1456.783   583.2609     311.82    4695.32
     foodm00 |   913   488.8045    213.125   53.80725   1989.525
      depm00 |   913   1611.158    666.043     458.99    6581.92

c. Rich households:

. sum foodm97 depm97 foodm98 depm98 foodm99 depm99 foodm00 depm00 if depnuc97>610

    Variable |   Obs       Mean  Std. Dev.        Min        Max
-------------+--------------------------------------------------
     foodm97 |   916   595.9659   298.6445      38.05    2644.08
      depm97 |   916   1930.112   1558.774     611.39   20363.52
     foodm98 |   916   542.1295   262.1956   78.76182   2121.875
      depm98 |   916   1942.058   1131.565     510.52   14090.46
     foodm99 |   916   505.4126   241.5322   66.65106   1577.161
      depm99 |   916   2062.098   1223.265      490.1   16585.97
     foodm00 |   916   501.7999   245.1337   64.44155   1939.467
      depm00 |   916   2261.475   1453.021      621.1   19493.11

1. Discuss these statistics (you may, for instance, examine the budget shares of food).
2. Estimate the linear regression of average food expenditure on average total expenditure:
   a. Between the three periods for the whole population;
   b. Between the three sub-populations in 1997.
   Compare the marginal propensity to consume food, or the income elasticity of food, obtained in a and b (time-series vs. cross-section estimates).

II. Exercise: Suppose we have a survey S of 3000 households, which is aggregated into 10 income groups crossed with three types of family (single person, couple without children, couple with children). Do you expect the coefficient of determination $R^2$ of a linear regression to be the same for the two datasets (individual and grouped)? Explain the potential difference. Answer the same question for the estimates of the coefficients.

Application:

Individual data:

. regress foodm97 depm97

      Source |       SS       df       MS              Number of obs =    3052
-------------+------------------------------           F(  1,  3050) =  925.39
       Model |  44414540.8     1  44414540.8           Prob > F      =  0.0000
    Residual |   146386671  3050    47995.63           R-squared     =  0.2328
-------------+------------------------------           Adj R-squared =  0.2325
       Total |   190801212  3051  62537.2705           Root MSE      =  219.08

------------------------------------------------------------------------------
     foodm97 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      depm97 |   .1212806   .0039868    30.42   0.000     .1134634    .1290977
       _cons |    385.642    6.65505    57.95   0.000     372.5932    398.6908
------------------------------------------------------------------------------

Grouped data: 14 cells by income group and family type

. regress foodm97 depm97

      Source |       SS       df       MS              Number of obs =      14
-------------+------------------------------           F(  1,    12) =   40.53
       Model |  384369.928     1  384369.928           Prob > F      =  0.0000
    Residual |  113815.984    12  9484.66531           R-squared     =  0.7715
-------------+------------------------------           Adj R-squared =  0.7525
       Total |  498185.912    13  38321.9932           Root MSE      =  97.389

------------------------------------------------------------------------------
     foodm97 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      depm97 |   .2977259   .0467684     6.37   0.000     .1958262    .3996256
       _cons |   104.5538   52.44963     1.99   0.069    -9.724154    218.8317
------------------------------------------------------------------------------

III. Exercise: A hamburger shop has opened in Beijing. The owner changes the price each week for 12 weeks in order to assess the demand relationship between the quantity sold $x$ and the price $p$:

Week   Quantity sold   Price
   1        892         1.23
   2       1012         1.15
   3       1060         1.10
   4        987         1.20
   5        680         1.35
   6        739         1.25
   7        809         1.28
   8       1275         0.99
   9        946         1.22
  10        874         1.25
  11        720         1.30
  12       1096         1.05

1. Calculate the means of $\ln x$, $\ln p$, $\ln x \ln p$, $(\ln x)^2$ and $(\ln p)^2$.
2. Calculate the OLS estimate of the linear equation $\ln x = \alpha + \beta \ln p + u$ and interpret the result.
3. Does the owner have an incentive to increase or to lower the price in order to increase sales revenue ($p \cdot x$)?

IV. Exercise: Suppose that in the previous exercise you knew that the intercept $\alpha$ was equal to 0. How would you proceed to estimate $\beta$, and what is its estimated value?

V. Exercise: Suppose the previous table represents quantities and prices for 4 consumers and three periods: t=1 for weeks 1, 4, 7, 10; t=2 for weeks 2, 5, 8, 11; t=3 for weeks 3, 6, 9, 12.

1. Compute the Between and Within transforms of lnx and lnp.
2. Estimate the Between and Within price elasticities.

Period t   Individual i   Quantity sold   Price
    1           1              892         1.23
    2           1             1012         1.15
    3           1             1060         1.10
    1           2              987         1.20
    2           2              680         1.35
    3           2              739         1.25
    1           3              809         1.28
    2           3             1275         0.99
    3           3              946         1.22
    1           4              874         1.25
    2           4              720         1.30
    3           4             1096         1.05

VI. Exercise: What is the interpretation of the parameters $\alpha_i$ and $\beta_i$ in the Almost Ideal Demand System

$$w_{i,t}^h = \alpha_i + \gamma_i p_t^h + \beta_i \left[ x_t^h - a(p_t^h, \theta) \right] + u_{i,t}^h \qquad (1)$$

with $w_{i,t}^h$ the budget share of commodity $i$ for household $h$ in period $t$, $x^h$ the income of household $h$ and $p$ the price vector?

VII. Exercise:

1. Consider a consumption function for housing expenditures $C_h$ (rents + charges), depending on household income per capita $y$, family size $S$, the relative price of housing $P_h$, the relative price of transport $P_t$, and location ($L=1$ for households living in Paris, 0 elsewhere):

$$C_h = a_0 + a_1 y + a_2 S + a_3 P_h + a_4 P_t + a_5 L + \varepsilon$$

   From what structural model can this type of linear demand equation be derived?
2. Which of these explanatory variables may be endogenous? Explain.
3. What problems does this endogeneity pose for estimation?
4. The Instrumental Variables method is used to treat this endogeneity problem for an explanatory variable $X_k$. Explain the method.
5. The estimates for family size are: (i) for the non-instrumented $S$; (ii) for the instrumented $S$. Discuss the difference between these estimates.

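Corrections

Correction of Exercise II

1.

. regress foodm97 depm97

      Source |       SS       df       MS              Number of obs =    3052
-------------+------------------------------           F(  1,  3050) =  925.39
       Model |  44414540.8     1  44414540.8           Prob > F      =  0.0000
    Residual |   146386671  3050    47995.63           R-squared     =  0.2328
-------------+------------------------------           Adj R-squared =  0.2325
       Total |   190801212  3051  62537.2705           Root MSE      =  219.08

------------------------------------------------------------------------------
     foodm97 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      depm97 |   .1212806   .0039868    30.42   0.000     .1134634    .1290977
       _cons |    385.642    6.65505    57.95   0.000     372.5932    398.6908
------------------------------------------------------------------------------

Budget share of food: 548/1340 = 0.41
Income elasticity of food: 0.121/0.41 = 0.3

2.

gen cell=0
replace cell=11 if depnuc<200 & nuc==1
replace cell=12 if depnuc<200 & nuc==1.7

. regress foodm97 depm97

      Source |       SS       df       MS              Number of obs =      14
-------------+------------------------------           F(  1,    12) =   40.53
       Model |  384369.928     1  384369.928           Prob > F      =  0.0000
    Residual |  113815.984    12  9484.66531           R-squared     =  0.7715
-------------+------------------------------           Adj R-squared =  0.7525
       Total |  498185.912    13  38321.9932           Root MSE      =  97.389

------------------------------------------------------------------------------
     foodm97 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      depm97 |   .2977259   .0467684     6.37   0.000     .1958262    .3996256
       _cons |   104.5538   52.44963     1.99   0.069    -9.724154    218.8317
------------------------------------------------------------------------------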
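The budget-share and elasticity arithmetic in step 1 can be checked with a few lines of Python. This is only a sketch: the numbers are copied from the 1997 summary table and the individual-data regression, the variable names are illustrative, and it assumes the elasticity is evaluated at the sample means of a linear Engel curve.

```python
# Values copied from the whole-population summary table and the
# individual-data regression output (variable names are illustrative).
mean_food_1997 = 548.2228         # mean of foodm97
mean_total_1997 = 1340.535        # mean of depm97
marginal_propensity = 0.1212806   # coefficient on depm97

# Budget share of food evaluated at the sample means.
budget_share = mean_food_1997 / mean_total_1997

# For a linear Engel curve, the income elasticity at the mean equals the
# marginal propensity divided by the budget share.
income_elasticity = marginal_propensity / budget_share

print(f"budget share of food: {budget_share:.2f}")
print(f"income elasticity:    {income_elasticity:.2f}")
```

The same two lines can be repeated with the means of the poor and rich sub-populations to compare budget shares across groups and years.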

Correction of Exercise II

Grouping the data averages out information that is available at the individual level. As a consequence, the heterogeneity left unexplained by the model diminishes in the aggregated data, which increases the coefficient of determination $R^2$ (0.77 on the 14 cells against 0.23 on the individual data). The coefficient estimates should be the same in expectation in the two estimations, on individual data or on grouped data, except when the grouping procedure introduces some endogeneity into the data set.
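The aggregation argument can be illustrated with a small simulation. The sketch below uses a hypothetical data-generating process (the uniform income distribution, the coefficients 385 and 0.12, and the noise level are assumptions loosely scaled on the 1997 regression, not the Polish data) and groups households into 10 equal-sized income cells. It covers only the case where grouping is exogenous, so it shows the rise in $R^2$ with a stable slope, not the gap between 0.12 and 0.30 seen in the application.

```python
import random

def ols(x, y):
    """Return (intercept, slope, r_squared) of a simple OLS regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy ** 2 / (sxx * syy)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n, n_groups = 3000, 10

# Hypothetical households: total expenditure (sorted) and food expenditure.
total = sorted(rng.uniform(300, 3000) for _ in range(n))
food = [385 + 0.12 * t + rng.gauss(0, 220) for t in total]

# Aggregate into 10 equal-sized income groups and keep the cell means.
size = n // n_groups
cell_total = [sum(total[g * size:(g + 1) * size]) / size for g in range(n_groups)]
cell_food = [sum(food[g * size:(g + 1) * size]) / size for g in range(n_groups)]

_, slope_ind, r2_ind = ols(total, food)
_, slope_grp, r2_grp = ols(cell_total, cell_food)

print(f"individual data: slope={slope_ind:.3f}, R2={r2_ind:.3f}")
print(f"grouped data:    slope={slope_grp:.3f}, R2={r2_grp:.3f}")
```

Averaging within cells removes most of the individual noise, so the grouped $R^2$ is much higher while the slope stays close to the individual-data estimate.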

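Correction of Exercise III

generate lnplnx=log(var1)*log(var2)
generate lnx2=log(var1)^2
generate lnp2=log(var2)^2
generate lnx=log(var1)
generate lnp=log(var2)
sum lnx lnp lnplnx lnx2 lnp2 var1 var2
regress lnx lnp
nl (lnx={alpha}+{beta}*lnp)

. sum lnx lnp lnplnx lnx2 lnp2 var1 var2

    Variable |   Obs       Mean  Std. Dev.        Min        Max
-------------+--------------------------------------------------
         lnx |    12   6.812644   .1882742   6.522093   7.150702
         lnp |    12   .1764679   .0916859  -.0100503   .3001046
      lnplnx |    12   1.187361   .6025784  -.0718669    1.95731
        lnx2 |    12   46.44462   2.566888   42.53769   51.13253
        lnp2 |    12   .0388467   .0276141    .000101   .0900628
        var1 |    12   924.1667   174.6695        680       1275
        var2 |    12     1.1975   .1062694        .99       1.35

. regress lnx lnp

      Source |       SS       df       MS              Number of obs =      12
-------------+------------------------------           F(  1,    10) =   73.97
       Model |  .343481726     1  .343481726           Prob > F      =  0.0000
    Residual |  .046437139    10  .004643714           R-squared     =  0.8809
-------------+------------------------------           Adj R-squared =  0.8690
       Total |  .389918865    11   .03544717           Root MSE      =  .06814

------------------------------------------------------------------------------
         lnx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lnp |  -1.927315   .2240958    -8.60   0.000    -2.426632   -1.427999
       _cons |   7.152754   .0441683   161.94   0.000      7.05434    7.251167
------------------------------------------------------------------------------

. nl (lnx={alpha}+{beta}*lnp)
(obs = 12)

Iteration 0:  residual SS =  .0464371
Iteration 1:  residual SS =  .0464371

      Source |       SS       df       MS
-------------+------------------------------         Number of obs =        12
       Model |  .343481726     1  .343481726         R-squared     =    0.8809
    Residual |  .046437139    10  .004643714         Adj R-squared =    0.8690
-------------+------------------------------         Root MSE      =  .0681448
       Total |  .389918865    11   .03544717         Res. dev.     = -32.60022

------------------------------------------------------------------------------
         lnx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      /alpha |   7.152754   .0441683   161.94   0.000      7.05434    7.251167
       /beta |  -1.927315   .2240958    -8.60   0.000    -2.426632   -1.427999
------------------------------------------------------------------------------
  Parameter alpha taken as constant term in model & ANOVA table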
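The log-log OLS estimate above can be reproduced by hand from the 12 weekly observations. The following is a minimal sketch in plain Python using the closed-form simple-regression formulas; the variable names are illustrative, and the last lines spell out the standard revenue argument (demand is elastic when the price elasticity is below -1), which is an interpretation rather than part of the Stata output.

```python
import math

# Weekly data from Exercise III.
quantity = [892, 1012, 1060, 987, 680, 739, 809, 1275, 946, 874, 720, 1096]
price = [1.23, 1.15, 1.10, 1.20, 1.35, 1.25, 1.28, 0.99, 1.22, 1.25, 1.30, 1.05]

lnx = [math.log(q) for q in quantity]
lnp = [math.log(p) for p in price]
n = len(lnx)

mean_lnx = sum(lnx) / n
mean_lnp = sum(lnp) / n

# Closed-form OLS for ln x = alpha + beta * ln p + u.
sxy = sum((p - mean_lnp) * (x - mean_lnx) for p, x in zip(lnp, lnx))
sxx = sum((p - mean_lnp) ** 2 for p in lnp)
beta = sxy / sxx                      # price elasticity of demand
alpha = mean_lnx - beta * mean_lnp    # intercept

# Revenue p*x rises when the price is cut if demand is elastic (beta < -1).
demand_is_elastic = beta < -1

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
print("lowering the price raises revenue:", demand_is_elastic)
```

With an estimated elasticity of about -1.9, a 1% price cut raises the quantity sold by roughly 1.9%, so revenue increases when the price is lowered.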

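Correction of Exercise IV

regress lnx lnp
nl (lnx={beta}*lnp)

. nl (lnx={beta}*lnp)
(obs = 12)

Iteration 0:  residual SS =  121.8305
Iteration 1:  residual SS =  121.8305

      Source |       SS       df       MS
-------------+------------------------------         Number of obs =        12
       Model |  435.504882     1  435.504882         R-squared     =    0.7814
    Residual |  121.830515    11  11.0755014         Adj R-squared =    0.7615
-------------+------------------------------         Root MSE      =  3.327988
       Total |  557.335397    12  46.4446164         Res. dev.     =  61.86722

------------------------------------------------------------------------------
         lnx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       /beta |   30.56531    4.87432     6.27   0.000     19.83701    41.29362
------------------------------------------------------------------------------

. regress lnx lnp, noconstant

      Source |       SS       df       MS              Number of obs =      12
-------------+------------------------------           F(  1,    11) =   39.32
       Model |  435.504882     1  435.504882           Prob > F      =  0.0001
    Residual |  121.830515    11  11.0755014           R-squared     =  0.7814
-------------+------------------------------           Adj R-squared =  0.7615
       Total |  557.335397    12  46.4446164           Root MSE      =   3.328

------------------------------------------------------------------------------
         lnx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lnp |   30.56531    4.87432     6.27   0.000     19.83701    41.29362
------------------------------------------------------------------------------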
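When the intercept is constrained to zero, the OLS estimator reduces to the ratio of the uncentered cross-product to the uncentered sum of squares, $\hat\beta = \sum \ln p \ln x / \sum (\ln p)^2$. The sketch below (plain Python, illustrative names) applies this formula to the weekly data; it is equivalent to dividing the mean of lnplnx by the mean of lnp2 in the summary table.

```python
import math

# Weekly data from Exercise III.
quantity = [892, 1012, 1060, 987, 680, 739, 809, 1275, 946, 874, 720, 1096]
price = [1.23, 1.15, 1.10, 1.20, 1.35, 1.25, 1.28, 0.99, 1.22, 1.25, 1.30, 1.05]

lnx = [math.log(q) for q in quantity]
lnp = [math.log(p) for p in price]

# Regression through the origin: ln x = beta * ln p + u.
# No centering, because there is no intercept to absorb the means.
sum_lnp_lnx = sum(p * x for p, x in zip(lnp, lnx))
sum_lnp_sq = sum(p * p for p in lnp)
beta_no_constant = sum_lnp_lnx / sum_lnp_sq

print(f"beta (no constant) = {beta_no_constant:.4f}")
```

The large positive value shows how strongly the constraint distorts the slope here: the mean of $\ln x$ is far from zero, so the line forced through the origin no longer measures the price response.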

Correction of Exercise V

Suppose the previous table represents quantities and prices for 4 consumers and three periods: t=1 for weeks 1, 4, 7, 10; t=2 for weeks 2, 5, 8, 11; t=3 for weeks 3, 6, 9, 12.

1. Compute the Between and Within transforms of lnx and lnp.
2. Estimate the Between and Within price elasticities.

Between (B) and Within (W) transforms of x and p:

  t   i      x     Bx     Wx      p      Bp       Wp
  1   1    892    988    -96   1.23    1.16    +0.07
  2   1   1012    988    +24   1.15    1.16    -0.01
  3   1   1060    988    +72   1.10    1.16    -0.06
  1   2    987    802   +185   1.20   1.267   -0.067
  2   2    680    802   -122   1.35   1.267   +0.083
  3   2    739    802    -63   1.25   1.267   -0.017
  1   3    809   1010   -201   1.28   1.163   +0.117
  2   3   1275   1010   +265   0.99   1.163   -0.173
  3   3    946   1010    -64   1.22   1.163   +0.057
  1   4    874    897    -23   1.25    1.20    +0.05
  2   4    720    897   -177   1.30    1.20    +0.10
  3   4   1096    897   +199   1.05    1.20    -0.15

Elasticity: (i) (ii).

. xtreg var1 var2, be

Between regression (regression on group means)  Number of obs      =        12
Group variable: id                              Number of groups   =         3

R-sq:  within  = 0.9477                         Obs per group: min =         4
       between = 0.9964                                        avg =       4.0
       overall = 0.9212                                        max =         4

                                                F(1,1)             =    277.67
sd(u_i + avg(e_i.))=  2.95981                   Prob > F           =    0.0382

------------------------------------------------------------------------------
        var1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        var2 |  -820.5881   49.24474   -16.66   0.038    -1446.302   -194.8744
       _cons |   1906.821   58.99533    32.32   0.020     1157.214    2656.428
------------------------------------------------------------------------------
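The Between and Within transforms in the table can be reproduced with a short script. This is a sketch under stated assumptions: observations are grouped by individual i (4 groups of 3 periods), the helper names `between_within` and `slope` are hypothetical, and the Between and Within elasticities are computed on logs. Because the Stata output in this correction reports 3 groups of 4 observations, its estimates need not match the ones obtained here.

```python
import math

# (individual i, period t, quantity x, price p), copied from the Exercise V table.
rows = [
    (1, 1, 892, 1.23), (1, 2, 1012, 1.15), (1, 3, 1060, 1.10),
    (2, 1, 987, 1.20), (2, 2, 680, 1.35), (2, 3, 739, 1.25),
    (3, 1, 809, 1.28), (3, 2, 1275, 0.99), (3, 3, 946, 1.22),
    (4, 1, 874, 1.25), (4, 2, 720, 1.30), (4, 3, 1096, 1.05),
]

def between_within(ids, values):
    """Return (between, within): individual means and deviations from them."""
    groups = {}
    for i, v in zip(ids, values):
        groups.setdefault(i, []).append(v)
    means = {i: sum(v) / len(v) for i, v in groups.items()}
    between = [means[i] for i in ids]
    within = [v - means[i] for i, v in zip(ids, values)]
    return between, within

def slope(x, y):
    """OLS slope of y on x (regression with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

ids = [r[0] for r in rows]

# Transforms in levels, as in the table.
bx, wx = between_within(ids, [r[2] for r in rows])
bp, wp = between_within(ids, [r[3] for r in rows])

# Transforms in logs, used for the elasticities.
b_lnx, w_lnx = between_within(ids, [math.log(r[2]) for r in rows])
b_lnp, w_lnp = between_within(ids, [math.log(r[3]) for r in rows])
between_elasticity = slope(b_lnp, b_lnx)
within_elasticity = slope(w_lnp, w_lnx)

print(f"between elasticity = {between_elasticity:.3f}")
print(f"within elasticity  = {within_elasticity:.3f}")
```

The Between estimate uses only differences across individual means, while the Within estimate uses only each consumer's deviations over time, so the two answer different questions about the price response.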

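. sum Bx Bp

    Variable |   Obs       Mean  Std. Dev.        Min        Max
-------------+--------------------------------------------------
          Bx |    12     924.25   85.97793        802       1010
          Bp |    12   1.197583   .0972798        .99        1.3

Remark: Estimation in log: El = -0.86

Between regression (regression on group means)  Number of obs      =        12
Group variable: id                              Number of groups   =         3

R-sq:  within  = 0.9222                         Obs per group: min =         4
       between = 0.7223                                        avg =       4.0
       overall = 0.8809                                        max =         4

                                                F(1,1)             =      2.60
sd(u_i + avg(e_i.))=  .0278534                  Prob > F           =    0.3534

------------------------------------------------------------------------------
         lnx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lnp |  -.8636887   .5355588    -1.61   0.353    -7.668609    5.941231
       _cons |   6.965058   .0958673    72.65   0.009     5.746948    8.183167
------------------------------------------------------------------------------

. xtreg lnx lnp, fe

Fixed-effects (within) regression               Number of obs      =        12
Group variable: id                              Number of groups   =         3

R-sq:  within  = 0.9222                         Obs per group: min =         4
       between = 0.7223                                        avg =       4.0
       overall = 0.8809                                        max =         4

                                                F(1,8)             =     94.81
corr(u_i, Xb)  = -0.3126                        Prob > F           =    0.0000

------------------------------------------------------------------------------
         lnx |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lnp |  -2.068255   .2124134    -9.74   0.000    -2.558081   -1.578429
       _cons |   7.177625   .0413771   173.47   0.000     7.082209    7.273041
-------------+----------------------------------------------------------------
     sigma_u |  .04847924
     sigma_e |  .06069603
         rho |  .38948315   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(2, 8) =     2.30            Prob > F = 0.1622

. xtset id
       panel variable:  id (balanced)

. xtrc lnx lnp

Random-coefficients regression                  Number of obs      =        12
Group variable: id                              Number of groups   =         3

                                                Obs per group: min =         4
                                                               avg =       4.0
                                                               max =         4

                                                Wald chi2(1)       =     37.44
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         lnx |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lnp |  -2.349175   .3839322    -6.12   0.000    -3.101669   -1.596682
       _cons |   7.244809   .1051874    68.88   0.000     7.038645    7.450972
------------------------------------------------------------------------------
Test of parameter constancy:    chi2(4) =    19.90    Prob > chi2 = 0.0005

Correction of Exercise VI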

Parameters $\alpha_i$ and $\beta_i$ correspond to the intercept and the income effect in an Almost Ideal Demand System. The income elasticity can be recovered by the formula $e_i = 1 + \beta_i / w_i$, with $w_i$ the budget share of expenditure $i$. In this specification, $[x_t^h - a(p_t^h, \theta)]$ is the logarithm of real income (income divided by a price index).

Correction of Exercise VII

1. This linear consumption function can be derived from the maximisation of the Stone-Geary direct utility function (the so-called Linear Expenditure System of Stone, 1954).
2. Household income can be endogenous, since it is earned in the same period in which housing expenditures are made: a common factor (for instance the weather) may determine both income and this expenditure. The other variables can be assumed to correspond to choices made earlier, so that they are not correlated with the residual term of the housing expenditure function.
3. This endogeneity biases all coefficients, especially the income coefficient.
4. The IV method proceeds in two steps: choose instrumental variables, then check that they are correlated with income and independent of the residual.
5. The estimates for family size are quite different, which shows that this variable may also be endogenous.
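The bias described in point 3 and the IV remedy in point 4 can be illustrated on synthetic data. The sketch below is not the housing model of the exercise: the data-generating process, the instrument `z`, the unobserved common factor `u` and the true coefficient of 2 are all assumptions chosen for illustration. It runs two-stage least squares by hand: a first-stage regression of the endogenous regressor on the instrument, then a second-stage regression of the outcome on the fitted values.

```python
import random

def fit(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 5000

# Hypothetical data: z is the instrument, u an unobserved common factor
# that moves both the regressor x (e.g. income) and the outcome y.
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + 0.6 * ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + ui + rng.gauss(0, 0.5) for xi, ui in zip(x, u)]

# Naive OLS: biased upward because x is correlated with u.
_, b_ols = fit(x, y)

# Stage 1: project the endogenous regressor on the instrument.
a1, b1 = fit(z, x)
x_hat = [a1 + b1 * zi for zi in z]

# Stage 2: regress the outcome on the fitted values from stage 1.
_, b_iv = fit(x_hat, y)

print(f"true slope = 2.000, OLS = {b_ols:.3f}, 2SLS = {b_iv:.3f}")
```

Running the second stage by hand gives the right point estimate but not the right standard errors; a dedicated IV routine corrects them by using the residuals computed with the original regressor.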