Use of Dummy (Indicator) Variables in Applied Econometrics


Chapter 5 Use of Dummy (Indicator) Variables in Applied Econometrics

Section 5.1 Introduction

Use of Dummy (Indicator) Variables
Model specifications in applied econometrics often require qualitative variables as explanatory factors. Qualitative variables arise in both time-series and cross-sectional data. This chapter emphasizes two things: the mechanics of transforming qualitative variables into dummy (indicator) variables, and the interpretation of the estimated coefficients associated with those variables.

Dummy (Indicator) Variables
Dummy (indicator) variables represent qualitative variables. Key features:
- Intercept shifters
- Slope shifters
- Singularity problem (dummy variable trap)
- Qualitative choice models

Zero-One Variables or Dummy Variables
Zero-one variables can serve as explanatory variables or as dependent variables. Qualitative variables can represent the following:
- temporal effects (seasons)
- wartime and peacetime years
- political regimes
- government programs
- geographical regions
- characteristics of households or individuals, such as gender, marital status, race, occupation, or employment status
- structural shifts

Dummy (Indicator) Variables and Attributes
A dummy (indicator) variable represents the occurrence or nonoccurrence of a particular attribute of a qualitative variable. If the attribute occurs, the variable takes the value 1; if it does not occur, the variable takes the value 0. Like a light switch, the attribute is either on (1) or off (0).

Qualitative Variables
A qualitative variable can consist of two or more categories, but these categories must be mutually exclusive and exhaustive. Ease and convenience of analysis should guide the construction of the 0-1 variable. However, the interpretation of the results depends on how the indicator variable is constructed.

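Example: The Investment Tax Credit (ITC)
Two categories: the ITC is either in force or not.
$ITC_t = 1$ if the ITC is in force in year $t$; $ITC_t = 0$ otherwise.
[Table with columns YEAR and ITC; the values are not recoverable in this transcription.]

Example: Qualitative Variable Region with Multiple Categories
[Data table with columns STATE, PCEXP, PCAID, PCINC, REGION, and INDICATOR; the numeric values are not recoverable in this transcription. States listed: ME, NH, VT, MA, RI, CT, NY, NJ, PA, OH, IND, IL, MICH, WISC, MINN, IOWA.]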

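Investigate the Relationship of PCEXP to PCAID and PCINC
One possibility: four regions are evident, Northeast (NE), Midwest (MW), South (S), and West (W). Run the regression separately for each region $r$:
$$PCEXP_i = a_{0r} + a_{1r}\,PCAID_i + a_{2r}\,PCINC_i + \varepsilon_i, \quad r = 1, 2, 3, 4.$$
Four regions, four regression runs. This yields one set of coefficients per region:
Set 1: $a_{01}, a_{11}, a_{21}$
Set 2: $a_{02}, a_{12}, a_{22}$
Set 3: $a_{03}, a_{13}, a_{23}$
Set 4: $a_{04}, a_{14}, a_{24}$
Do not use this specification:
$$PCEXP_i = c_0 + c_1\,PCAID_i + c_2\,PCINC_i + c_3\,REGION_i + \varepsilon_i, \quad i = 1, 2, \ldots, 50.$$
The REGION codes are arbitrary labels, so entering REGION as a single numeric regressor imposes an ordering and an equal spacing among regions that have no economic meaning.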

Section 5.2 Intercept Shifters

0-1 Variables in Regression Analysis
Zero-one variables can act as intercept shifters, slope shifters, or both. Key purpose: achieve a greater degree of generalization by running one model with qualitative variables.
Intercept shifters. For example,
$$PCEXP_i = \beta_0 + \beta_1 PCAID_i + \beta_2 PCINC_i + \beta_3 NE_i + \beta_4 MW_i + \beta_5 WE_i + \varepsilon_i$$
The number of observations in each category does not have to be equal. However, there must be at least one observation in each category.

0-1 Variables in Regression Analysis (continued)
The number of ones in each dummy variable equals the number of replications (observations) in that category.
$NE_i = 1$ if the $i$th observation corresponds to the Northeast region; $NE_i = 0$ otherwise.
$MW_i = 1$ if the $i$th observation corresponds to the Midwest region; $MW_i = 0$ otherwise.
$WE_i = 1$ if the $i$th observation corresponds to the West region; $WE_i = 0$ otherwise.
The omitted category is the South:
$South_i = 1$ if the $i$th observation corresponds to the South region; $South_i = 0$ otherwise.

0-1 Variables in Regression Analysis (continued)
Another important difference between using one equation with dummy variables and running four separate regressions is the treatment of the residuals. Four separate regressions have four separate sets of residuals, so a shock in the regression for region 1 need not affect the other regressions. One regression with regional dummy variables has a single error term, so a shock in the residuals might affect all regions.

Singularity Problem
In the previous model specification, four regions were evident but only three dummy variables appear. Why?
Dummy variable trap (singularity problem): the sum of all the dummy variables equals the intercept column of ones, which is a case of perfect collinearity.
Two alternatives handle this problem:
1. Eliminate the intercept from the regression model and use all of the dummy variables.
2. Arbitrarily eliminate one of the categories of the qualitative variable and keep the intercept (very common).
Under alternative 2, the intercept of the regression equation is the intercept pertaining to the omitted category (the base intercept).

General Rule
If there are r categories of a qualitative variable, use r-1 indicator variables to avoid the dummy variable trap.
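Alternative 1 from the singularity slide (drop the intercept, keep every dummy) is not illustrated in the slides. A minimal sketch in SAS follows, assuming the data set statedata1970 already contains the four regional dummies ne, mw, so, and we that appear elsewhere in this chapter. The NOINT option on the MODEL statement suppresses the intercept.

```sas
* Alternative 1: drop the intercept and keep all four regional dummies;
* Assumes statedata1970 already contains ne, mw, so, and we;
proc reg data=statedata1970;
   model pcexp = pcaid pcinc ne mw so we / noint;
   * Each dummy coefficient is now the intercept of that region;
   * Example: is the Northeast intercept equal to the South intercept;
   test ne = so;
run;
```

Under this parameterization the coefficient on each dummy is a regional intercept rather than a difference from a base region. With NOINT, PROC REG reports an R-square that is not corrected for the mean, so it should not be compared with the R-square from the model that keeps the intercept.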

Interpretation
The coefficient of any 0-1 variable indicates the difference between the intercept for that category and the base intercept. For example,
$$\widehat{PCEXP}_i = \hat\beta_0 + \hat\beta_1 PCAID_i + \hat\beta_2 PCINC_i + \hat\beta_3 NE_i + \hat\beta_4 MW_i + \hat\beta_5 WE_i$$
If the model contains more than one set of 0-1 variables (more than one qualitative variable), one 0-1 variable must be deleted from each set.

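Graphical Illustration of Intercept Shifters
[Figure: PCEXP (vertical axis) plotted against PCINC (horizontal axis), with separate lines for NE (intercept $\beta_0 + \beta_3$), SOUTH (intercept $\beta_0$), and WE (intercept $\beta_0 + \beta_5$).]
The coefficients $\beta_3$ and $\beta_5$ measure how far PCEXP in the Northeast and the West lies above (or below) PCEXP in the base region (South).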

[SAS PROC REG output, dependent variable pcexp. Number of Observations Read: 50; Number of Observations Used: 50. The Analysis of Variance table (Pr > F for the model: <.0001) and the fit statistics (Root MSE, R-Square, Dependent Mean, Adj R-Sq, Coeff Var) are not recoverable in this transcription.]

[Parameter Estimates table for Intercept, pcaid, pcinc, ne, mw, and we. Only the p-values for Intercept, pcaid, and pcinc survive (each <.0001); the estimates, standard errors, and t values are not recoverable.]
Interpretation of these estimated coefficients?

Tests of Hypotheses Associated with Dummy Variables
1. Test the statistical significance of each estimated coefficient associated with the included dummy variables:
$H_0: \beta_i = 0$ (t test).
The estimated coefficient describes the difference between the included qualitative category and the base category.

The estimated intercepts:
NE: $\hat\beta_0 + \hat\beta_3$
MW: $\hat\beta_0 + \hat\beta_4$
WE: $\hat\beta_0 + \hat\beta_5$
S: $\hat\beta_0$
Note that
$\hat\beta_3$ = difference in intercept between the Northeast and the South
$\hat\beta_4$ = difference in intercept between the Midwest and the South
$\hat\beta_5$ = difference in intercept between the West and the South
Reason for the test: to measure whether the level of PCEXP differs statistically between each included region and the base region.

2. Test whether the estimated coefficient associated with one included dummy variable differs statistically from the coefficients associated with the other included dummy variables:
$H_0: \beta_i = \beta_j$ (F test).
In this example, this process entails three separate tests:
$H_0: \beta_3 = \beta_4$. Test whether the level of PCEXP in the Northeast and the level of PCEXP in the Midwest are the same.
$H_0: \beta_3 = \beta_5$. Test whether the level of PCEXP in the Northeast and the level of PCEXP in the West are the same.
$H_0: \beta_4 = \beta_5$. Test whether the level of PCEXP in the Midwest and the level of PCEXP in the West are the same.

3. Test whether or not the qualitative variable plays a statistically significant role in affecting the dependent variable. In the example, test the joint hypothesis
$H_0: \beta_3 = \beta_4 = \beta_5 = 0$ (F test).
If you fail to reject $H_0$, then region does not play a statistically significant role in affecting the level of per capita state expenditures.

Tests of Hypotheses
When indicator variables serve as intercept shifters, carry out all three tests of hypotheses. The test of $H_0: \beta_i = 0$ is automatic (it is part of the standard regression output), but the other null hypotheses, which involve joint tests of coefficients associated with dummy variables, are not automatic. Each of the three hypothesis tests conveys important information.
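The F tests in items 2 and 3 compare a restricted model with the unrestricted one. As a sketch of the statistic behind them, in generic notation ($SSE_R$, $SSE_U$, $q$, $n$, and $k$ are symbols introduced here for illustration, not taken from the slides):

```latex
F = \frac{(SSE_R - SSE_U)/q}{SSE_U/(n-k)} \;\sim\; F_{q,\;n-k} \quad \text{under } H_0
```

Here $SSE_R$ and $SSE_U$ are the error sums of squares with and without the restrictions imposed, $q$ is the number of restrictions, $n$ is the number of observations, and $k$ is the number of estimated coefficients in the unrestricted model. In the intercept-shifter example, $n = 50$ and $k = 6$, so the joint test of $\beta_3 = \beta_4 = \beta_5 = 0$ has $(3, 44)$ degrees of freedom, and each pairwise test such as $\beta_3 = \beta_4$ has $(1, 44)$. The exact F distribution relies on the classical assumptions, including normally distributed errors.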

Using Intercept and Slope Shifter Variables
* run separate regressions for each region;
data ne; set statedata1970; if region=1; run;
proc reg data=ne; model pcexp=pcaid pcinc / dw; run;
data mw; set statedata1970; if region=2; run;
proc reg data=mw; model pcexp=pcaid pcinc / dw; run;
data so; set statedata1970; if region=3; run;
proc reg data=so; model pcexp=pcaid pcinc / dw; run;
data we; set statedata1970; if region=4; run;
proc reg data=we; model pcexp=pcaid pcinc / dw; run;

* use of both intercept shifters and slope shifters;
* South is arbitrarily selected as the base or reference category;
proc reg data=statedata1970;
model pcexp=pcaid pcinc pcaidne pcaidmw pcaidwe pcincne pcincmw pcincwe ne mw we / dw;
test ne=0, mw=0, we=0;
test pcaidne=0, pcaidmw=0, pcaidwe=0;
test pcincne=0, pcincmw=0, pcincwe=0;
test pcaidne=0, pcaidmw=0, pcaidwe=0, pcincne=0, pcincmw=0, pcincwe=0;
test ne=0, mw=0, we=0, pcaidne=0, pcaidmw=0, pcaidwe=0, pcincne=0, pcincmw=0, pcincwe=0;
run;

The MEANS Procedure Output
[PROC MEANS output for the full sample: N, Mean, Std Dev, Minimum, and Maximum of pcexp, pcaid, pcinc, ne, mw, so, and we. The values are not recoverable in this transcription.]

[PROC MEANS output by region, four panels (one per value of region): N, Mean, Std Dev, Minimum, and Maximum of pcexp, pcaid, and pcinc. The region codes and values are not recoverable in this transcription.]

Illustration of the Dummy Variable Trap
[PROC REG output, Model: MODEL1, dependent variable pcexp. Number of Observations Read: 50; Number of Observations Used: 50. The Analysis of Variance table (Pr > F for the model: <.0001) and the fit statistics are not recoverable in this transcription.]

NOTE: Model is not full rank. Least-squares solutions for the parameters are not unique. Some statistics will be misleading. A reported DF of 0 or B means that the estimate is biased.
NOTE: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.
we = Intercept - ne - mw - so
[Parameter Estimates table. Rows that survive: Intercept (DF flagged B, Pr > |t| <.0001), pcaid (<.0001), pcinc (<.0001), ne (flagged B), and mw (flagged B). The numeric estimates and any remaining rows are not recoverable.]

Illustrating the Dummy Variable Trap
* assumes the dummy variables are stored in statedata1970, as in the earlier PROC REG step;
proc reg data=statedata1970;
* Illustration of dummy variable trap;
model pcexp=pcaid pcinc ne mw so we / dwprob;
* South is arbitrarily selected as the base or reference intercept;
model pcexp=pcaid pcinc ne mw we / dwprob;
test ne=mw;
test ne=we;
test mw=we;
test ne=0, mw=0, we=0;
* West is arbitrarily chosen as the base or reference intercept;
model pcexp=pcaid pcinc ne mw so / dwprob;
test ne=mw;
test ne=so;
test mw=so;
test ne=0, mw=0, so=0;
run;

The REG Procedure Output (Model: MODEL2, dependent variable pcexp)
[For each test below, the F-test table (numerator and denominator DF, mean square, F value, Pr > F) is not recoverable in this transcription.]
Test 1. $H_0$: coefficient of NE = coefficient of MW. Interpretation?
Test 2. $H_0$: coefficient of NE = coefficient of WE. Interpretation?
Test 3. $H_0$: coefficient of MW = coefficient of WE. Interpretation?
Test 4. $H_0$: coefficients of NE, MW, and WE are jointly equal to zero. Interpretation?

Run the same model, but choose the West as the reference region. What do you observe?
[PROC REG output, Model: MODEL3, dependent variable pcexp. Number of Observations Read: 50; Number of Observations Used: 50. The Analysis of Variance table (Pr > F for the model: <.0001) and the fit statistics are not recoverable in this transcription.]
[Parameter Estimates table for Intercept, pcaid, pcinc, ne, mw, and so. Only the p-values for Intercept, pcaid, and pcinc survive (each <.0001).]

The REG Procedure Output (Model: MODEL3, dependent variable pcexp)
[For each test below, the F-test table is not recoverable in this transcription.]
Test 5. $H_0$: coefficient of NE = coefficient of MW.
Test 6. $H_0$: coefficient of NE = coefficient of the South.
Test 7. $H_0$: coefficient of MW = coefficient of the South.
Test 8. $H_0$: coefficients of NE, MW, and South are jointly equal to zero.
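One way to anticipate what changes when the reference region switches from the South to the West is to compare the two parameterizations directly. The sketch below uses $\gamma$ for the coefficients of the West-base model; that symbol is introduced here for illustration and does not appear in the slides.

```latex
\begin{aligned}
\text{South as base:}\quad PCEXP_i &= \beta_0 + \beta_1 PCAID_i + \beta_2 PCINC_i + \beta_3 NE_i + \beta_4 MW_i + \beta_5 WE_i + \varepsilon_i \\
\text{West as base:}\quad PCEXP_i &= \gamma_0 + \gamma_1 PCAID_i + \gamma_2 PCINC_i + \gamma_3 NE_i + \gamma_4 MW_i + \gamma_5 SO_i + \varepsilon_i \\
\text{Substituting } WE_i &= 1 - NE_i - MW_i - SO_i \text{ into the first equation gives} \\
\gamma_0 &= \beta_0 + \beta_5, \qquad \gamma_1 = \beta_1, \qquad \gamma_2 = \beta_2, \\
\gamma_3 &= \beta_3 - \beta_5, \qquad \gamma_4 = \beta_4 - \beta_5, \qquad \gamma_5 = -\beta_5
\end{aligned}
```

The two models describe the same fit relative to different bases, so the slopes on pcaid and pcinc, the residuals, and R-square should be identical. Only the intercept and the dummy coefficients are re-expressed. The test of ne=mw is also unchanged, because $\gamma_3 - \gamma_4 = \beta_3 - \beta_4$.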

Section 5.3 Slope Shifters

Slope Shifters
Slope shifters allow for differences in the slopes of one or more of the continuous predetermined variables. To build them, generate interaction variables: the product of a dummy variable and a continuous predetermined variable.
For example, does the marginal propensity to consume (MPC) vary by race? By geographic region? By season? Before 1980 versus after 1980?

Slope Shifters (continued)
It is possible to estimate separate regression models in lieu of a single regression model with slope shifters.
Key point: generate new variables by forming cross products between each of the 0-1 variables representing categories of the attribute and each continuous predetermined variable whose coefficient is allowed to vary among categories.
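The cross products described above can be built in a short DATA step. This is a minimal sketch, assuming statedata1970 already holds pcaid, pcinc, and the dummies ne, mw, so, and we. The variable names follow the MODEL statements in this chapter, while the output data set name statedata_int is hypothetical.

```sas
* Build slope-shifter (interaction) variables for PCINC and PCAID;
* statedata_int is a hypothetical output data set name;
data statedata_int;
   set statedata1970;
   * PCINC interactions, one per region;
   pcincne = pcinc * ne;
   pcincmw = pcinc * mw;
   pcincso = pcinc * so;
   pcincwe = pcinc * we;
   * PCAID interactions for the non-base regions when South is the base;
   pcaidne = pcaid * ne;
   pcaidmw = pcaid * mw;
   pcaidwe = pcaid * we;
run;
```

Each interaction equals the continuous variable for observations in that region and zero elsewhere. In any one model, only the interactions for the non-base regions enter, for the same reason that one dummy is omitted from an intercept-shifter model.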

Model Specification
The model specification is now given by the following. For example:
$$PCEXP_i = \beta_0 + \beta_1 PCAID_i + \beta_2 PCINC_i + \beta_3 (NE_i \cdot PCINC_i) + \beta_4 (MW_i \cdot PCINC_i) + \beta_5 (WE_i \cdot PCINC_i) + \varepsilon_i$$
Here $NE_i \cdot PCINC_i$, $MW_i \cdot PCINC_i$, and $WE_i \cdot PCINC_i$ are interaction terms. Again, it is necessary to arbitrarily omit one category of the attribute to avoid the dummy variable trap.

Illustration of the Use of Slope Shifters
Suppose that you want to ascertain whether the effect of PCINC on PCEXP is the same across regions. That is, is the marginal effect of PCINC on PCEXP the same for the Northeast, the Midwest, the South, and the West?

Marginal Effects by Region
With this specification, the marginal effects of PCINC on PCEXP by region are as follows:
Region      Marginal Effect of PCINC on PCEXP
Northeast   $\hat\beta_2 + \hat\beta_3$
Midwest     $\hat\beta_2 + \hat\beta_4$
South       $\hat\beta_2$
West        $\hat\beta_2 + \hat\beta_5$
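The regional marginal effects follow from differentiating the slope-shifter specification with respect to PCINC. A short worked step, using the same coefficient numbering as the model specification:

```latex
\frac{\partial PCEXP_i}{\partial PCINC_i} = \beta_2 + \beta_3 NE_i + \beta_4 MW_i + \beta_5 WE_i
```

Setting $NE_i = 1$ (and the other dummies to 0) gives $\beta_2 + \beta_3$ for the Northeast. Setting all three dummies to 0 gives $\beta_2$ for the South, the base region. Replacing the coefficients with their estimates yields the regional marginal effects.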

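Graphical Illustration of Slope Shifters
[Figure: PCEXP (vertical axis) plotted against PCINC (horizontal axis). Lines start from the common intercept $\beta_0$ with slopes $\beta_2 + \beta_3$ (NE), $\beta_2$ (SOUTH), and $\beta_2 + \beta_5$ (WE).]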

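Key Hypotheses
The key hypotheses to consider are shown below. The subscripts were lost in this transcription and are restored here from the slope-shifter model specification.
1. t-tests: $H_0: \beta_3 = 0$; $H_0: \beta_4 = 0$; $H_0: \beta_5 = 0$
2. F-test: $H_0: \beta_3 = \beta_4 = \beta_5 = 0$
3. F-tests: $H_0: \beta_3 = \beta_4$; $H_0: \beta_3 = \beta_5$; $H_0: \beta_4 = \beta_5$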

Illustration of Slope Shifter Variables
* assumes the interaction variables are stored in statedata1970, as in the earlier PROC REG step;
proc reg data=statedata1970;
* Illustration of slope shifter variables;
* South is arbitrarily selected as the base or reference category;
model pcexp=pcaid pcinc pcincne pcincmw pcincwe / dwprob;
test pcincne=pcincmw;
test pcincne=pcincwe;
test pcincmw=pcincwe;
test pcincne=0, pcincmw=0, pcincwe=0;
* West is arbitrarily chosen as the base or reference category;
model pcexp=pcaid pcinc pcincne pcincmw pcincso / dwprob;
test pcincne=pcincmw;
test pcincne=pcincso;
test pcincmw=pcincso;
test pcincne=0, pcincmw=0, pcincso=0;
run;

[PROC REG output, Model: MODEL1, dependent variable pcexp. Number of Observations Read: 50; Number of Observations Used: 50. The Analysis of Variance table (Pr > F for the model: <.0001) and the fit statistics are not recoverable in this transcription.]
[Parameter Estimates table for Intercept, pcaid, pcinc, pcincne, pcincmw, and pcincwe. Only the p-values for Intercept, pcaid, and pcinc survive (each <.0001).]
Interpretation of estimated coefficients?

The REG Procedure Output (Model: MODEL1, dependent variable pcexp)
[For each test below, the F-test table is not recoverable in this transcription.]
Test 1. $H_0$: coefficient of PCINCNE = coefficient of PCINCMW.
Test 2. $H_0$: coefficient of PCINCNE = coefficient of PCINCWE.
Test 3. $H_0$: coefficient of PCINCMW = coefficient of PCINCWE.
Test 4. $H_0$: coefficients of PCINCNE, PCINCMW, and PCINCWE are jointly equal to zero.

[PROC REG output, Model: MODEL2, dependent variable pcexp. Number of Observations Read: 50; Number of Observations Used: 50. The Analysis of Variance table (Pr > F for the model: <.0001) and the fit statistics are not recoverable in this transcription.]
[Parameter Estimates table for Intercept, pcaid, pcinc, pcincne, pcincmw, and pcincso. Only the p-values for Intercept, pcaid, and pcinc survive (each <.0001).]

The REG Procedure Output (Model: MODEL2, dependent variable pcexp)
[For each test below, the F-test table is not recoverable in this transcription.]
Test 5. $H_0$: coefficient of PCINCNE = coefficient of PCINCMW.
Test 6. $H_0$: coefficient of PCINCNE = coefficient of PCINCSO.
Test 7. $H_0$: coefficient of PCINCMW = coefficient of PCINCSO.
Test 8. $H_0$: coefficients of PCINCNE, PCINCMW, and PCINCSO are jointly equal to zero.

[PROC MEANS output by region, four panels: N, Mean, Std Dev, Minimum, and Maximum of pcexp, pcaid, and pcinc. The region codes and values are not recoverable in this transcription.]

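Marginal Effects/Elasticities
[Table: marginal effect of PCAID on PCEXP and marginal effect of PCINC on PCEXP for the Northeast, Midwest, South, and West. The numeric values are not recoverable in this transcription.]

[Table: percentage change in PCEXP in response to a 1% change in PCAID, and in response to a 1% change in PCINC, for the Northeast, Midwest, South, and West. Each PCAID entry begins with the same factor, 2.5177, in all four regions. The remaining factors, all PCINC entries, and the resulting elasticities are not recoverable in this transcription.]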
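The surviving pattern of the elasticity entries, a marginal effect multiplied by one quantity and divided by another, is consistent with the usual point-elasticity formula evaluated at regional sample means. That reading is an assumption, since the factors themselves are missing from this transcription. Under it, for region $r$:

```latex
\eta_{r} = \left.\frac{\partial PCEXP}{\partial X}\right|_{r} \times \frac{\bar{X}_r}{\overline{PCEXP}_r},
\qquad X \in \{PCAID,\ PCINC\}
```

Here $\bar{X}_r$ and $\overline{PCEXP}_r$ denote the sample means within region $r$, which is what the PROC MEANS output by region would supply. For PCAID the marginal effect is the common coefficient 2.5177, so regional elasticities differ only through the regional means. For PCINC the marginal effect itself varies by region through the slope shifters.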

Section 5.4 Intercept Shifters and Slope Shifters

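Use of Both Intercept Shifter and Slope Shifter Variables
$$PCEXP_i = \beta_0 + \beta_1 PCAID_i + \beta_2 PCINC_i + \beta_3 NE_i + \beta_4 MW_i + \beta_5 WE_i + \beta_6 NE_i\,PCAID_i + \beta_7 MW_i\,PCAID_i + \beta_8 WE_i\,PCAID_i + \beta_9 NE_i\,PCINC_i + \beta_{10} MW_i\,PCINC_i + \beta_{11} WE_i\,PCINC_i + \varepsilon_i$$
The implied regression for each region is:
NE: $PCEXP_i = (\beta_0 + \beta_3) + (\beta_1 + \beta_6) PCAID_i + (\beta_2 + \beta_9) PCINC_i + \varepsilon_i$
MW: $PCEXP_i = (\beta_0 + \beta_4) + (\beta_1 + \beta_7) PCAID_i + (\beta_2 + \beta_{10}) PCINC_i + \varepsilon_i$
South: $PCEXP_i = \beta_0 + \beta_1 PCAID_i + \beta_2 PCINC_i + \varepsilon_i$
West: $PCEXP_i = (\beta_0 + \beta_5) + (\beta_1 + \beta_8) PCAID_i + (\beta_2 + \beta_{11}) PCINC_i + \varepsilon_i$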

[PROC MEANS output by region, four panels: N, Mean, Std Dev, Minimum, and Maximum of pcexp, pcaid, and pcinc. The region codes and values are not recoverable in this transcription.]

Separate Regressions by Region
[PROC REG output for four separate regressions of pcexp on pcaid and pcinc. The ANOVA tables, fit statistics, and parameter estimates are not recoverable in this transcription. What survives:]
Region 1 (NE): Number of Observations Read 11, Used 11.
Region 2 (MW): Number of Observations Read 12, Used 12.
Region 3 (South): Number of Observations Read 14, Used 14.
Region 4 (West): Number of Observations Read 13, Used 13. Pr > F for the model: <.0001; Pr > |t| for pcaid and pcinc: <.0001.

Regression with Intercept and Slope Shifters with Region (50 observations), Reference Region South
[PROC REG output, Model: MODEL1, dependent variable pcexp. Number of Observations Read: 50; Number of Observations Used: 50. Pr > F for the model: <.0001. The remaining ANOVA entries and fit statistics are not recoverable.]
[Parameter Estimates table for Intercept, pcaid, pcinc, pcaidne, pcaidmw, pcaidwe, pcincne, pcincmw, pcincwe, ne, mw, and we. The values are not recoverable in this transcription.]

Run Separate Regressions for Each Region
[Fitted equations of PCEXP on PCAID and PCINC for NE, MW, South, and West. The estimated coefficients are not recoverable in this transcription.]

The REG Procedure Output (Model: MODEL1, dependent variable pcexp)
[For each test below, the F-test table is not recoverable in this transcription.]
Test 1. $H_0$: coefficients of NE, MW, and WE are jointly equal to zero.
Test 2. $H_0$: coefficients of PCAIDNE, PCAIDMW, and PCAIDWE are jointly equal to zero.
Test 3. $H_0$: coefficients of PCINCNE, PCINCMW, and PCINCWE are jointly equal to zero.
Test 4. $H_0$: coefficients of PCAIDNE, PCAIDMW, PCAIDWE, PCINCNE, PCINCMW, and PCINCWE are jointly equal to zero.
Test 5. $H_0$: coefficients of NE, MW, WE, PCAIDNE, PCAIDMW, PCAIDWE, PCINCNE, PCINCMW, and PCINCWE are jointly equal to zero.

Testing Hypotheses with Dummy Variables
* test whether the ne regression is the same as the so regression;
test ne=0, pcaidne=0, pcincne=0;
* test whether the mw regression is the same as the so regression;
test mw=0, pcaidmw=0, pcincmw=0;
* test whether the we regression is the same as the so regression;
test we=0, pcaidwe=0, pcincwe=0;
* test whether the ne regression is the same as the mw regression;
test ne=mw, pcaidne=pcaidmw, pcincne=pcincmw;
* test whether the ne regression is the same as the we regression;
test ne=we, pcaidne=pcaidwe, pcincne=pcincwe;
* test whether the mw regression is the same as the we regression;
test mw=we, pcaidmw=pcaidwe, pcincmw=pcincwe;

The REG Procedure Output (Model: MODEL1, dependent variable pcexp)
[For each test below, the F-test table is not recoverable in this transcription.]
Test 6. $H_0$: NE regression the same as the SOUTH regression.
Test 7. $H_0$: MW regression the same as the SOUTH regression.
Test 8. $H_0$: WEST regression the same as the SOUTH regression.
Test 9. $H_0$: NE regression the same as the MW regression.
Test 10. $H_0$: NE regression the same as the WEST regression.
Test 11. $H_0$: MW regression the same as the WEST regression.

Hypotheses for Comparing Regional Regressions (each uses an F test)
Test NE regression the same as South regression: $H_0: \beta_3 = \beta_6 = \beta_9 = 0$
Test MW regression the same as South regression: $H_0: \beta_4 = \beta_7 = \beta_{10} = 0$
Test WE regression the same as South regression: $H_0: \beta_5 = \beta_8 = \beta_{11} = 0$
Test NE regression the same as MW regression: $H_0: \beta_3 = \beta_4,\ \beta_6 = \beta_7,\ \beta_9 = \beta_{10}$
Test NE regression the same as WE regression: $H_0: \beta_3 = \beta_5,\ \beta_6 = \beta_8,\ \beta_9 = \beta_{11}$
Test MW regression the same as WE regression: $H_0: \beta_4 = \beta_5,\ \beta_7 = \beta_8,\ \beta_{10} = \beta_{11}$

Section 5.5 Final Thoughts about the Use of Dummy (Indicator) Variables

Caveats
1. The use of 0-1 variables, particularly when slopes are allowed to vary, requires a large number of degrees of freedom.
2. Difficulties might arise in analysis and interpretation, particularly with the use of slope shifter variables.
3. Ensure an adequate number of replications in each category.
4. It might be advantageous not to omit an extreme category for comparison purposes.
5. The category to omit might be the one in which the analyst is most interested.
6. Generate dummy variables using IF/THEN statements to save time and cut down on data entry errors.
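Caveat 6 can be sketched as a DATA step. This is a minimal example, assuming the region coding used in the separate-regression code earlier in the chapter (1 = NE, 2 = MW, 3 = South, 4 = West). The output data set name statedata_dum is hypothetical.

```sas
* Create regional 0-1 variables from the numeric region code;
* Coding follows the separate-regression step: 1=NE, 2=MW, 3=South, 4=West;
* statedata_dum is a hypothetical output data set name;
data statedata_dum;
   set statedata1970;
   if region = 1 then ne = 1; else ne = 0;
   if region = 2 then mw = 1; else mw = 0;
   if region = 3 then so = 1; else so = 0;
   if region = 4 then we = 1; else we = 0;
run;

* Check: the sum of each dummy is the number of states in that region;
proc means data=statedata_dum n sum;
   var ne mw so we;
run;
```

The PROC MEANS step is a quick check on the construction: the four sums should add up to the number of observations, because the categories are mutually exclusive and exhaustive. A more compact form such as `ne = (region = 1);` produces the same 0-1 values, since SAS evaluates a comparison to 1 or 0.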

Model Formulation
Suppose the model formulation is written as:
$$\ln Y_t = \beta_0 + \beta_1 DUM_t + \beta_2 \ln X_t + \varepsilon_t \quad (1)$$
where $Y_t$ refers to the value of the dependent variable in time $t$; $X_t$ refers to the value of the explanatory variable in time $t$; and $DUM_t$ refers to a dummy variable in time $t$. This dummy variable is an intercept shifter of the econometric relationship between $Y$ and $X$. It takes on the value of 1 or 0. Suppose $DUM_t = 1$ if $YR \geq 1988$; 0 otherwise. Relative to the years prior to 1988, the percentage change in $Y_t$ can be expressed as $(e^{\beta_1} - 1) \times 100\%$.

To understand this result, consider that for years prior to 1988,
$$\ln Y_t = \beta_0 + \beta_2 \ln X_t \quad (2)$$
or
$$Y_t = e^{\beta_0 + \beta_2 \ln X_t} = e^{\beta_0} (X_t)^{\beta_2} \quad (3)$$
For years 1988 and on,
$$\ln Y_t = (\beta_0 + \beta_1) + \beta_2 \ln X_t \quad (4)$$
Therefore, $\beta_1$ represents how much higher (if $\beta_1 > 0$) or lower (if $\beta_1 < 0$) the natural logarithm of $Y_t$ is relative to the years prior to 1988. By the same token, for years 1988 and on,
$$Y_t = e^{(\beta_0 + \beta_1)} (X_t)^{\beta_2} \quad (5)$$

Now the percentage change in $Y$ for years 1988 on, relative to the years prior to 1988, is given by
$$\frac{Y_t(\text{1988 on}) - Y_t(\text{prior to 1988})}{Y_t(\text{prior to 1988})} \times 100\% \quad (6)$$
By substitution, equation (6) can be written using equations (3) and (5):
$$\frac{e^{(\beta_0 + \beta_1)} (X_t)^{\beta_2} - e^{\beta_0} (X_t)^{\beta_2}}{e^{\beta_0} (X_t)^{\beta_2}} \times 100\% \quad (7)$$
Dividing the numerator and the denominator by $e^{\beta_0} (X_t)^{\beta_2}$ simplifies equation (7) to
$$(e^{\beta_1} - 1) \times 100\% \quad (8)$$
Hence, equation (8) represents the percentage change in $Y_t$ relative to the base period (the years prior to 1988).

The moral of this story: in a double-log (linear in logarithms) specification where some of the exogenous variables are dummy variables, be careful with the interpretation of the coefficients associated with the dummy variables. The correct interpretation is the percentage change in the dependent variable relative to the base period, given by
$$(e^{\beta_1} - 1) \times 100\%$$
where $\beta_1$ represents the coefficient associated with the relevant dummy variable.

Test of Seasonality in Per Capita Orange Juice Consumption
[PROC AUTOREG output, ordinary least squares estimates, dependent variable lallojgalpc. DFE: 150. The other fit statistics (SSE, MSE, Root MSE, SBC, AIC, Regress R-Square, Total R-Square, Durbin-Watson) are not recoverable in this transcription.]
[Parameter estimates table for Intercept, lrallojprice, lrallgfjprice, lrpcdpi, and eleven monthly dummy variables whose names begin with m (the numeric suffixes were lost). The estimates, standard errors, and t values are not recoverable.]
[Test 1: F test with Pr > F <.0001; the remaining entries are not recoverable.]

U.S. Per Capita Orange Juice Consumption
Reference (base) month: December
Each entry is computed as (exp(coefficient) - 1) x 100%. Only the January coefficient survives in this transcription.
Month       Percentage change in per capita orange juice consumption relative to December
January     (exp(0.0432) - 1) x 100% = 4.41%
February    -9.48%
March       -1.42%
April       -8.70%
May         -8.63%
June        [not recoverable]
July        [not recoverable]
August      -9.49%
September   -9.36%
October     -5.06%
November    -6.46%
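The January entry can be reproduced with a short DATA step. This is a minimal sketch: the coefficient 0.0432 is the January value from the slide, while the variable names b_jan, pct, and naive are hypothetical.

```sas
* Percentage effect of a dummy variable when the dependent variable is in logs;
data _null_;
   * January coefficient taken from the orange juice example;
   b_jan = 0.0432;
   * Exact percentage change relative to the base month (December);
   pct = (exp(b_jan) - 1) * 100;
   * Coefficient times 100, shown only for comparison;
   naive = b_jan * 100;
   put pct= 6.2 naive= 6.2;
run;
```

The value written to the log for pct is about 4.41, matching the January entry, while reading the coefficient directly as a percentage gives 4.32. The gap between the two grows with the absolute size of the coefficient, which is why the exponential form matters for the larger monthly effects.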

Section 5.6 Additional Readings

Additional Readings
See references: Kennedy (1981), Kennedy (1986), Suits (1984), Van Garderen and Shah (2002).


More information

Autocorrelation or Serial Correlation

Autocorrelation or Serial Correlation Chapter 6 Autocorrelation or Serial Correlation Section 6.1 Introduction 2 Evaluating Econometric Work How does an analyst know when the econometric work is completed? 3 4 Evaluating Econometric Work Econometric

More information

Lecture 6 Multiple Linear Regression, cont.

Lecture 6 Multiple Linear Regression, cont. Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression

More information

Review of Statistics

Review of Statistics Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and

More information

Analysis of Variance. Source DF Squares Square F Value Pr > F. Model <.0001 Error Corrected Total

Analysis of Variance. Source DF Squares Square F Value Pr > F. Model <.0001 Error Corrected Total Math 221: Linear Regression and Prediction Intervals S. K. Hyde Chapter 23 (Moore, 5th Ed.) (Neter, Kutner, Nachsheim, and Wasserman) The Toluca Company manufactures refrigeration equipment as well as

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Chapter 8 Quantitative and Qualitative Predictors

Chapter 8 Quantitative and Qualitative Predictors STAT 525 FALL 2017 Chapter 8 Quantitative and Qualitative Predictors Professor Dabao Zhang Polynomial Regression Multiple regression using X 2 i, X3 i, etc as additional predictors Generates quadratic,

More information

STAT 3A03 Applied Regression With SAS Fall 2017

STAT 3A03 Applied Regression With SAS Fall 2017 STAT 3A03 Applied Regression With SAS Fall 2017 Assignment 2 Solution Set Q. 1 I will add subscripts relating to the question part to the parameters and their estimates as well as the errors and residuals.

More information

Making sense of Econometrics: Basics

Making sense of Econometrics: Basics Making sense of Econometrics: Basics Lecture 4: Qualitative influences and Heteroskedasticity Egypt Scholars Economic Society November 1, 2014 Assignment & feedback enter classroom at http://b.socrative.com/login/student/

More information

Parametric Test. Multiple Linear Regression Spatial Application I: State Homicide Rates Equations taken from Zar, 1984.

Parametric Test. Multiple Linear Regression Spatial Application I: State Homicide Rates Equations taken from Zar, 1984. Multiple Linear Regression Spatial Application I: State Homicide Rates Equations taken from Zar, 984. y ˆ = a + b x + b 2 x 2K + b n x n where n is the number of variables Example: In an earlier bivariate

More information

Ch 7: Dummy (binary, indicator) variables

Ch 7: Dummy (binary, indicator) variables Ch 7: Dummy (binary, indicator) variables :Examples Dummy variable are used to indicate the presence or absence of a characteristic. For example, define female i 1 if obs i is female 0 otherwise or male

More information

y response variable x 1, x 2,, x k -- a set of explanatory variables

y response variable x 1, x 2,, x k -- a set of explanatory variables 11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate

More information

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum T-test: means of Spock's judge versus all other judges 1 The TTEST Procedure Variable: pcwomen judge1 N Mean Std Dev Std Err Minimum Maximum OTHER 37 29.4919 7.4308 1.2216 16.5000 48.9000 SPOCKS 9 14.6222

More information

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 PDF file location: http://www.murraylax.org/rtutorials/regression_anovatable.pdf

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

a. The least squares estimators of intercept and slope are (from JMP output): b 0 = 6.25 b 1 =

a. The least squares estimators of intercept and slope are (from JMP output): b 0 = 6.25 b 1 = Stat 28 Fall 2004 Key to Homework Exercise.10 a. There is evidence of a linear trend: winning times appear to decrease with year. A straight-line model for predicting winning times based on year is: Winning

More information

unadjusted model for baseline cholesterol 22:31 Monday, April 19,

unadjusted model for baseline cholesterol 22:31 Monday, April 19, unadjusted model for baseline cholesterol 22:31 Monday, April 19, 2004 1 Class Level Information Class Levels Values TRETGRP 3 3 4 5 SEX 2 0 1 Number of observations 916 unadjusted model for baseline cholesterol

More information

Lecture 8. Using the CLR Model. Relation between patent applications and R&D spending. Variables

Lecture 8. Using the CLR Model. Relation between patent applications and R&D spending. Variables Lecture 8. Using the CLR Model Relation between patent applications and R&D spending Variables PATENTS = No. of patents (in 000) filed RDEP = Expenditure on research&development (in billions of 99 $) The

More information

Stat 500 Midterm 2 12 November 2009 page 0 of 11

Stat 500 Midterm 2 12 November 2009 page 0 of 11 Stat 500 Midterm 2 12 November 2009 page 0 of 11 Please put your name on the back of your answer book. Do NOT put it on the front. Thanks. Do not start until I tell you to. The exam is closed book, closed

More information

Multiple Linear Regression CIVL 7012/8012

Multiple Linear Regression CIVL 7012/8012 Multiple Linear Regression CIVL 7012/8012 2 Multiple Regression Analysis (MLR) Allows us to explicitly control for many factors those simultaneously affect the dependent variable This is important for

More information

Econometrics Midterm Examination Answers

Econometrics Midterm Examination Answers Econometrics Midterm Examination Answers March 4, 204. Question (35 points) Answer the following short questions. (i) De ne what is an unbiased estimator. Show that X is an unbiased estimator for E(X i

More information

Question 1 [17 points]: (ch 11)

Question 1 [17 points]: (ch 11) Question 1 [17 points]: (ch 11) A study analyzed the probability that Major League Baseball (MLB) players "survive" for another season, or, in other words, play one more season. They studied a model of

More information

Regression #8: Loose Ends

Regression #8: Loose Ends Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

FinQuiz Notes

FinQuiz Notes Reading 10 Multiple Regression and Issues in Regression Analysis 2. MULTIPLE LINEAR REGRESSION Multiple linear regression is a method used to model the linear relationship between a dependent variable

More information

Chapter 7. Testing Linear Restrictions on Regression Coefficients

Chapter 7. Testing Linear Restrictions on Regression Coefficients Chapter 7 Testing Linear Restrictions on Regression Coefficients 1.F-tests versus t-tests In the previous chapter we discussed several applications of the t-distribution to testing hypotheses in the linear

More information

2.1. Consider the following production function, known in the literature as the transcendental production function (TPF).

2.1. Consider the following production function, known in the literature as the transcendental production function (TPF). CHAPTER Functional Forms of Regression Models.1. Consider the following production function, known in the literature as the transcendental production function (TPF). Q i B 1 L B i K i B 3 e B L B K 4 i

More information

Lab 10 - Binary Variables

Lab 10 - Binary Variables Lab 10 - Binary Variables Spring 2017 Contents 1 Introduction 1 2 SLR on a Dummy 2 3 MLR with binary independent variables 3 3.1 MLR with a Dummy: different intercepts, same slope................. 4 3.2

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Math 3330: Solution to midterm Exam

Math 3330: Solution to midterm Exam Math 3330: Solution to midterm Exam Question 1: (14 marks) Suppose the regression model is y i = β 0 + β 1 x i + ε i, i = 1,, n, where ε i are iid Normal distribution N(0, σ 2 ). a. (2 marks) Compute the

More information

In Class Review Exercises Vartanian: SW 540

In Class Review Exercises Vartanian: SW 540 In Class Review Exercises Vartanian: SW 540 1. Given the following output from an OLS model looking at income, what is the slope and intercept for those who are black and those who are not black? b SE

More information

Circle the single best answer for each multiple choice question. Your choice should be made clearly.

Circle the single best answer for each multiple choice question. Your choice should be made clearly. TEST #1 STA 4853 March 6, 2017 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. There are 32 multiple choice

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your

More information

ST430 Exam 2 Solutions

ST430 Exam 2 Solutions ST430 Exam 2 Solutions Date: November 9, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textbook are permitted but you may use a calculator. Giving

More information

ACE 564 Spring Lecture 8. Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information. by Professor Scott H.

ACE 564 Spring Lecture 8. Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information. by Professor Scott H. ACE 564 Spring 2006 Lecture 8 Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information by Professor Scott H. Irwin Readings: Griffiths, Hill and Judge. "Collinear Economic Variables,

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

Chapter 13. Multiple Regression and Model Building

Chapter 13. Multiple Regression and Model Building Chapter 13 Multiple Regression and Model Building Multiple Regression Models The General Multiple Regression Model y x x x 0 1 1 2 2... k k y is the dependent variable x, x,..., x 1 2 k the model are the

More information

STAT 212 Business Statistics II 1

STAT 212 Business Statistics II 1 STAT 1 Business Statistics II 1 KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA STAT 1: BUSINESS STATISTICS II Semester 091 Final Exam Thursday Feb

More information

The multiple regression model; Indicator variables as regressors

The multiple regression model; Indicator variables as regressors The multiple regression model; Indicator variables as regressors Ragnar Nymoen University of Oslo 28 February 2013 1 / 21 This lecture (#12): Based on the econometric model specification from Lecture 9

More information

Lecture 13 Extra Sums of Squares

Lecture 13 Extra Sums of Squares Lecture 13 Extra Sums of Squares STAT 512 Spring 2011 Background Reading KNNL: 7.1-7.4 13-1 Topic Overview Extra Sums of Squares (Defined) Using and Interpreting R 2 and Partial-R 2 Getting ESS and Partial-R

More information

Discrete distribution. Fitting probability models to frequency data. Hypotheses for! 2 test. ! 2 Goodness-of-fit test

Discrete distribution. Fitting probability models to frequency data. Hypotheses for! 2 test. ! 2 Goodness-of-fit test Discrete distribution Fitting probability models to frequency data A probability distribution describing a discrete numerical random variable For example,! Number of heads from 10 flips of a coin! Number

More information

Final Exam. 1. Definitions: Briefly Define each of the following terms as they relate to the material covered in class.

Final Exam. 1. Definitions: Briefly Define each of the following terms as they relate to the material covered in class. Name Answer Key Economics 170 Spring 2003 Honor pledge: I have neither given nor received aid on this exam including the preparation of my one page formula list and the preparation of the Stata assignment

More information

OSU Economics 444: Elementary Econometrics. Ch.10 Heteroskedasticity

OSU Economics 444: Elementary Econometrics. Ch.10 Heteroskedasticity OSU Economics 444: Elementary Econometrics Ch.0 Heteroskedasticity (Pure) heteroskedasticity is caused by the error term of a correctly speciþed equation: Var(² i )=σ 2 i, i =, 2,,n, i.e., the variance

More information

Paper: ST-161. Techniques for Evidence-Based Decision Making Using SAS Ian Stockwell, The Hilltop UMBC, Baltimore, MD

Paper: ST-161. Techniques for Evidence-Based Decision Making Using SAS Ian Stockwell, The Hilltop UMBC, Baltimore, MD Paper: ST-161 Techniques for Evidence-Based Decision Making Using SAS Ian Stockwell, The Hilltop Institute @ UMBC, Baltimore, MD ABSTRACT SAS has many tools that can be used for data analysis. From Freqs

More information

1 The basics of panel data

1 The basics of panel data Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Related materials: Steven Buck Notes to accompany fixed effects material 4-16-14 ˆ Wooldridge 5e, Ch. 1.3: The Structure of Economic Data ˆ Wooldridge

More information

Simple linear regression

Simple linear regression Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single

More information

Variance Decomposition and Goodness of Fit

Variance Decomposition and Goodness of Fit Variance Decomposition and Goodness of Fit 1. Example: Monthly Earnings and Years of Education In this tutorial, we will focus on an example that explores the relationship between total monthly earnings

More information

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b.

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. B203: Quantitative Methods Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. Part I: Compulsory Questions. Answer all questions. Each question carries

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 6 Multiple regression model Siv-Elisabeth Skjelbred University of Oslo February 5th Last updated: February 3, 2016 1 / 49 Outline Multiple linear regression model and

More information

Chapter 14 Simple Linear Regression (A)

Chapter 14 Simple Linear Regression (A) Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables

More information

Lecture 7: OLS with qualitative information

Lecture 7: OLS with qualitative information Lecture 7: OLS with qualitative information Dummy variables Dummy variable: an indicator that says whether a particular observation is in a category or not Like a light switch: on or off Most useful values:

More information

Practice exam questions

Practice exam questions Practice exam questions Nathaniel Higgins nhiggins@jhu.edu, nhiggins@ers.usda.gov 1. The following question is based on the model y = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 3 + u. Discuss the following two hypotheses.

More information

Inference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58

Inference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Inference ME104: Linear Regression Analysis Kenneth Benoit August 15, 2012 August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Stata output resvisited. reg votes1st spend_total incumb minister

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity R.G. Pierse 1 Omitted Variables Suppose that the true model is Y i β 1 + β X i + β 3 X 3i + u i, i 1,, n (1.1) where β 3 0 but that the

More information

Lecture 11 Multiple Linear Regression

Lecture 11 Multiple Linear Regression Lecture 11 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 11-1 Topic Overview Review: Multiple Linear Regression (MLR) Computer Science Case Study 11-2 Multiple Regression

More information

Multiple Regression and Model Building Lecture 20 1 May 2006 R. Ryznar

Multiple Regression and Model Building Lecture 20 1 May 2006 R. Ryznar Multiple Regression and Model Building 11.220 Lecture 20 1 May 2006 R. Ryznar Building Models: Making Sure the Assumptions Hold 1. There is a linear relationship between the explanatory (independent) variable(s)

More information

df=degrees of freedom = n - 1

df=degrees of freedom = n - 1 One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:

More information

ECON 4230 Intermediate Econometric Theory Exam

ECON 4230 Intermediate Econometric Theory Exam ECON 4230 Intermediate Econometric Theory Exam Multiple Choice (20 pts). Circle the best answer. 1. The Classical assumption of mean zero errors is satisfied if the regression model a) is linear in the

More information

STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002

STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002 Time allowed: 3 HOURS. STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002 This is an open book exam: all course notes and the text are allowed, and you are expected to use your own calculator.

More information

Possibly useful formulas for this exam: b1 = Corr(X,Y) SDY / SDX. confidence interval: Estimate ± (Critical Value) (Standard Error of Estimate)

Possibly useful formulas for this exam: b1 = Corr(X,Y) SDY / SDX. confidence interval: Estimate ± (Critical Value) (Standard Error of Estimate) Statistics 5100 Exam 2 (Practice) Directions: Be sure to answer every question, and do not spend too much time on any part of any question. Be concise with all your responses. Partial SAS output and statistical

More information

Topic 20: Single Factor Analysis of Variance

Topic 20: Single Factor Analysis of Variance Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory

More information

Chapter 12: Multiple Regression

Chapter 12: Multiple Regression Chapter 12: Multiple Regression 12.1 a. A scatterplot of the data is given here: Plot of Drug Potency versus Dose Level Potency 0 5 10 15 20 25 30 0 5 10 15 20 25 30 35 Dose Level b. ŷ = 8.667 + 0.575x

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables. Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate

More information

Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page!

Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Econometrics - Exam May 11, 2011 1 Exam Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Problem 1: (15 points) A researcher has data for the year 2000 from

More information

The OLS Estimation of a basic gravity model. Dr. Selim Raihan Executive Director, SANEM Professor, Department of Economics, University of Dhaka

The OLS Estimation of a basic gravity model. Dr. Selim Raihan Executive Director, SANEM Professor, Department of Economics, University of Dhaka The OLS Estimation of a basic gravity model Dr. Selim Raihan Executive Director, SANEM Professor, Department of Economics, University of Dhaka Contents I. Regression Analysis II. Ordinary Least Square

More information

Empirical Application of Panel Data Regression

Empirical Application of Panel Data Regression Empirical Application of Panel Data Regression 1. We use Fatality data, and we are interested in whether rising beer tax rate can help lower traffic death. So the dependent variable is traffic death, while

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 7 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 68 Outline of Lecture 7 1 Empirical example: Italian labor force

More information

Exercices for Applied Econometrics A

Exercices for Applied Econometrics A QEM F. Gardes-C. Starzec-M.A. Diaye Exercices for Applied Econometrics A I. Exercice: The panel of households expenditures in Poland, for years 1997 to 2000, gives the following statistics for the whole

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007

STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007 STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007 LAST NAME: SOLUTIONS FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 302 STA 1001 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator.

More information

Chapter 1 Linear Regression with One Predictor

Chapter 1 Linear Regression with One Predictor STAT 525 FALL 2018 Chapter 1 Linear Regression with One Predictor Professor Min Zhang Goals of Regression Analysis Serve three purposes Describes an association between X and Y In some applications, the

More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

3 Time Series Regression

3 Time Series Regression 3 Time Series Regression 3.1 Modelling Trend Using Regression Random Walk 2 0 2 4 6 8 Random Walk 0 2 4 6 8 0 10 20 30 40 50 60 (a) Time 0 10 20 30 40 50 60 (b) Time Random Walk 8 6 4 2 0 Random Walk 0

More information

Multiple Regression: Inference

Multiple Regression: Inference Multiple Regression: Inference The t-test: is ˆ j big and precise enough? We test the null hypothesis: H 0 : β j =0; i.e. test that x j has no effect on y once the other explanatory variables are controlled

More information

THE MULTIVARIATE LINEAR REGRESSION MODEL

THE MULTIVARIATE LINEAR REGRESSION MODEL THE MULTIVARIATE LINEAR REGRESSION MODEL Why multiple regression analysis? Model with more than 1 independent variable: y 0 1x1 2x2 u It allows : -Controlling for other factors, and get a ceteris paribus

More information

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c Inference About the Slope ffl As with all estimates, ^fi1 subject to sampling var ffl Because Y jx _ Normal, the estimate ^fi1 _ Normal A linear combination of indep Normals is Normal Simple Linear Regression

More information

Multiple Regression Part I STAT315, 19-20/3/2014

Multiple Regression Part I STAT315, 19-20/3/2014 Multiple Regression Part I STAT315, 19-20/3/2014 Regression problem Predictors/independent variables/features Or: Error which can never be eliminated. Our task is to estimate the regression function f.

More information

We like to capture and represent the relationship between a set of possible causes and their response, by using a statistical predictive model.

We like to capture and represent the relationship between a set of possible causes and their response, by using a statistical predictive model. Statistical Methods in Business Lecture 5. Linear Regression We like to capture and represent the relationship between a set of possible causes and their response, by using a statistical predictive model.

More information

Measurement Error. Often a data set will contain imperfect measures of the data we would ideally like.

Measurement Error. Often a data set will contain imperfect measures of the data we would ideally like. Measurement Error Often a data set will contain imperfect measures of the data we would ideally like. Aggregate Data: (GDP, Consumption, Investment are only best guesses of theoretical counterparts and

More information

Multiple linear regression S6

Multiple linear regression S6 Basic medical statistics for clinical and experimental research Multiple linear regression S6 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/42 Introduction Two main motivations for doing multiple

More information

Chapter 4. Regression Models. Learning Objectives

Chapter 4. Regression Models. Learning Objectives Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing

More information

The regression model with one fixed regressor cont d

The regression model with one fixed regressor cont d The regression model with one fixed regressor cont d 3150/4150 Lecture 4 Ragnar Nymoen 27 January 2012 The model with transformed variables Regression with transformed variables I References HGL Ch 2.8

More information

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han Econometrics Honor s Exam Review Session Spring 2012 Eunice Han Topics 1. OLS The Assumptions Omitted Variable Bias Conditional Mean Independence Hypothesis Testing and Confidence Intervals Homoskedasticity

More information

School of Mathematical Sciences. Question 1

School of Mathematical Sciences. Question 1 School of Mathematical Sciences MTH5120 Statistical Modelling I Practical 8 and Assignment 7 Solutions Question 1 Figure 1: The residual plots do not contradict the model assumptions of normality, constant

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

STAT 350 Final (new Material) Review Problems Key Spring 2016

STAT 350 Final (new Material) Review Problems Key Spring 2016 1. The editor of a statistics textbook would like to plan for the next edition. A key variable is the number of pages that will be in the final version. Text files are prepared by the authors using LaTeX,

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information