Regression with Qualitative Information. Part VI. Regression with Qualitative Information


Part VI: Regression with Qualitative Information (as of Oct 17, 2017)

1 Regression with Qualitative Information
    Single Dummy Independent Variable
    Multiple Categories
        Ordinal Information
    Interaction Involving Dummy Variables
        Interaction among Dummy Variables
        Different Slopes
        Chow Test
    Binary Dependent Variable
        The Linear Probability Model
        The Logit and Probit Model

Qualitative information: a family owns a car or not, a person smokes or not, a firm is in bankruptcy or not, the industry of a firm, the gender of a person, etc.

Some of these examples can appear either as an explanatory (independent) variable or as the dependent variable.

Technically, such variables are coded with binary values (1 = "yes", 0 = "no"). In econometrics these variables are generally called dummy variables.

Remark 6.1: The usual practice is to name a dummy variable after one of its categories. For example, instead of gender one can define the variable female, which equals 1 if the person is female and 0 if male.

Exploring Further (W 5ed 7.1, p. 216): Suppose that, in a study comparing election outcomes between Democratic and Republican candidates, the researcher wishes to indicate the party of each candidate. Is a name such as party a wise choice for a binary variable in this case? What would be a better name?


Single Dummy Independent Variable

Dummy variables can be incorporated into a regression model like any other variable. Consider the simple regression

$y = \beta_0 + \delta_0 D + \beta_1 x + u$, (1)

where $D = 1$ if the individual has the property and $D = 0$ otherwise, and $E[u \mid D, x] = 0$. Parameter $\delta_0$ indicates the difference with respect to the reference group ($D = 0$), whose intercept is $\beta_0$. Then

$\delta_0 = E[y \mid D = 1, x] - E[y \mid D = 0, x]$. (2)

The value of $x$ is the same in both expectations, so the difference is due only to the property $D$.
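As a minimal sketch of equation (2), using purely hypothetical parameter values (none of them come from the slides), the gap between the two conditional means equals $\delta_0$ at every value of $x$:

```python
# Hypothetical parameters for model (1); illustrative values only.
BETA0, DELTA0, BETA1 = 1.0, 0.5, 0.2

def cond_mean(d, x):
    """E[y | D = d, x] implied by model (1) when E[u | D, x] = 0."""
    return BETA0 + DELTA0 * d + BETA1 * x

# The difference between the groups does not depend on x.
for x in (0.0, 2.0, 10.0):
    gap = cond_mean(1, x) - cond_mean(0, x)
    print(f"x = {x:4.1f}: E[y|D=1,x] - E[y|D=0,x] = {gap:.3f}")
```

The dummy shifts only the intercept here; the slope on $x$ is common to both groups.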

Figure 6.1: $E[y \mid D, x] = \beta_0 + \delta_0 D + \beta_1 x$ with $\delta_0 > 0$. Two parallel lines against $x$: $E[y \mid D = 1, x] = \beta_0 + \delta_0 + \beta_1 x$ (intercept $\beta_0 + \delta_0$) and $E[y \mid D = 0, x] = \beta_0 + \beta_1 x$ (intercept $\beta_0$).

The category with $D = 0$ is the reference category, and $\delta_0$ indicates the shift in the intercept relative to the reference group.

For interpretation it may also be helpful to associate the categories directly with the regression coefficients. Consider the wage example, where

$\log(w) = \beta_0 + \beta_1 \text{educ} + \beta_2 \text{exper} + \beta_3 \text{tenure} + u$. (3)

Suppose we are interested in the difference in wage levels between men and women. Then we can model $\beta_0$ as a function of gender:

$\beta_0 = \beta_m + \delta_f \text{female}$, (4)

where subscripts $m$ and $f$ refer to male and female, respectively. Model (3) can then be written as

$\log(w) = \beta_m + \delta_f \text{female} + \beta_1 \text{educ} + \beta_2 \text{exper} + \beta_3 \text{tenure} + u$. (5)

In the model the female dummy is zero for men, and all other factors remain the same. Thus, by (2), the expected difference in log wages between women and men is $\delta_f$. Because the left-hand side is in logs, $100\,\delta_f$ gives the difference approximately in percent.

The log approximation may be inaccurate if the percentage difference is large. A more accurate figure is obtained by using the fact that $\log(w_f) - \log(w_m) = \log(w_f / w_m)$, where $w_f$ and $w_m$ refer to a woman's and a man's wage, respectively. Thus

$100 \cdot \dfrac{w_f - w_m}{w_m}\,\% = 100\,(\exp(\delta_f) - 1)\,\%$. (6)
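The gap between the log approximation and formula (6) can be checked numerically. The sketch below uses illustrative coefficient values chosen for this example, not regression output; the function names are hypothetical.

```python
import math

def approx_pct(delta):
    """Log approximation: 100 * delta percent."""
    return 100.0 * delta

def exact_pct(delta):
    """Formula (6): 100 * (exp(delta) - 1) percent."""
    return 100.0 * (math.exp(delta) - 1.0)

# Illustrative coefficients; the approximation degrades as |delta| grows.
for delta in (0.05, 0.213, -0.110, 0.70):
    print(f"delta = {delta:+.3f}: approx {approx_pct(delta):+7.2f}%, "
          f"exact {exact_pct(delta):+7.2f}%")
```

For small coefficients the two figures nearly coincide, while for a coefficient around 0.7 the log approximation understates the difference by roughly 30 percentage points.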

Example 1: Augment the wage example with squared exper and squared tenure to allow for a diminishing incremental effect of experience and tenure, and capture the possible wage difference with the female dummy. The model is then

$\log(w) = \beta_m + \delta_f \text{female} + \beta_1 \text{educ} + \beta_2 \text{exper} + \beta_3 \text{tenure} + \beta_4 (\text{exper})^2 + \beta_5 (\text{tenure})^2 + u$. (7)

Example 1 continues

Dependent Variable: LOG(WAGE)
Included observations: 526
=======================================================
Variable     Coefficient   Std. Error   t-statistic   Prob.
C
FEMALE
EDUC
EXPER
TENURE
EXPER^2
TENURE^2
=======================================================
R-squared              Mean dependent var
Adjusted R-squared     S.D. dependent var
S.E. of regression     Akaike info criterion
Sum squared resid      Schwarz criterion
F-statistic            Prob(F-statistic)
=======================================================

Example 1 continues

Using (6) with the estimate $\hat\delta_f$,

$100 \cdot \dfrac{\hat w_f - \hat w_m}{\hat w_m} = 100\,[\exp(\hat\delta_f) - 1] \approx -25.7\%$, (8)

which suggests that, given the other factors, women's wages ($w_f$) are on average 25.7 percent lower than men's wages ($w_m$).

Notably, squared exper and squared tenure have statistically significant negative coefficient estimates, which supports the idea of a diminishing marginal increase due to these factors.


Multiple Categories

Additional dummy variables can be included in the regression model as well. In the wage example, if married (married = 1 if married, and 0 otherwise) is included, we have the following possibilities:

female   married   characterization
1        0         single woman
1        1         married woman
0        1         married man
0        0         single man

The intercept parameter then refines to

$\beta_0 = \beta_{sm} + \delta_f \text{female} + \delta_{ma} \text{married}$. (9)

Coefficient $\delta_{ma}$ is the wage marriage premium.

Example 2: Including the married dummy in the wage model yields

Dependent Variable: LOG(WAGE)
Method: Least Squares
Included observations: 526
======================================================
Variable     Coefficient   Std. Error   t-statistic   Prob.
C
FEMALE
MARRIED
EDUC
EXPER
TENURE
EXPER^2
TENURE^2
======================================================
R-squared              Mean dependent var
Adjusted R-squared     S.D. dependent var
S.E. of regression     Akaike info crit
Sum squared resid      Schwarz criterion
Log likelihood         F-statistic
Durbin-Watson stat     Prob(F-statistic)
======================================================

Example 2 continues

The estimated marriage premium is about 5.3%, but it is not statistically significant. A major limitation of this model is that it assumes the marriage premium is the same for men and women. We relax this next.

By generating the dummies singfem, marrfem, and marrmale we can investigate the marriage premiums for men and women separately. The intercept term becomes

$\beta_0 = \beta_{sm} + \delta_{mm} \text{marrmale} + \delta_{mf} \text{marrfem} + \delta_{sf} \text{singfem}$. (10)

The needed dummy variables can be generated as cross-products from the female and married dummies. For example, the singfem dummy is

$\text{singfem} = (1 - \text{married}) \cdot \text{female}$.

So, how do you generate marrmale (married man)?
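The cross-product construction can be sketched in a few lines. The four-person sample below is hypothetical (one person per group), and singmale is added only to show the reference group; it would not enter the regression.

```python
# Hypothetical sample: one person from each of the four groups.
female  = [1, 1, 0, 0]
married = [0, 1, 1, 0]

# Group dummies as cross-products of the two basic dummies.
singfem  = [(1 - m) * f for f, m in zip(female, married)]
marrfem  = [m * f for f, m in zip(female, married)]
marrmale = [m * (1 - f) for f, m in zip(female, married)]
singmale = [(1 - m) * (1 - f) for f, m in zip(female, married)]  # reference group

for row in zip(female, married, singfem, marrfem, marrmale, singmale):
    print(row)
```

Each person belongs to exactly one group, so the four dummies sum to one in every row; this is why one of them has to be left out when the model contains an intercept.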

Example 3: Estimating the model with the intercept modeled as in (10) gives

Dependent Variable: LOG(WAGE)
Method: Least Squares
Included observations: 526
======================================================
Variable     Coefficient   Std. Error   t-statistic   Prob.
C
MARRMALE
MARRFEM
SINGFEM
EDUC
EXPER
TENURE
EXPER^2
TENURE^2
======================================================
R-squared              Mean dependent var
Adjusted R-squared     S.D. dependent var
S.E. of regression     Akaike info crit
Sum squared resid      Schwarz criterion
Log likelihood         F-statistic
Durbin-Watson stat     Prob(F-statistic)
======================================================

Example 3 continues

The reference group is single men.

Estimated wage difference between single women and single men: -11.0%, just borderline statistically significant in the two-sided test.

Marriage premium for men: 21.3% (more accurately, using (6), 23.7%).

Married women are estimated to earn 19.8% less than single men.

Marriage premium for women: $\hat\delta_{mf} - \hat\delta_{sf} = -0.198 - (-0.110) = -0.088$, or -8.8%.

One way to test the statistical significance of this difference is to redefine the dummies so that single women become the base group.

Example 3 continues

Another option is to use the coefficient-restriction tests available in econometric software. The EViews Wald test for

$H_0: \delta_{mf} = \delta_{sf}$ (11)

produces an F statistic (df 1 and 517) that is not statistically significant, so there is no empirical evidence of a wage difference between married and single women.

In the same manner, testing $H_0: \delta_{mm} = \delta_{mf}$ produces an F statistic that is highly statistically significant. Thus, there is strong empirical evidence of a marriage premium for men.

Remark 6.2: If there are $q$ categories, then $q - 1$ dummy variables are needed. The category that does not have a dummy variable becomes the base category or benchmark.

Remark 6.3: Dummy variable trap. If the model includes the intercept term, defining $q$ dummies for $q$ categories leads to an exact linear dependence, because $1 = D_1 + \cdots + D_q$. Note also that $D^2 = D$, which again leads to an exact linear dependence if a squared dummy is added to the model. All cases in which dummy variables produce an exact linear dependence are called the dummy variable trap.
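The trap can be made concrete by checking that the cross-product matrix X'X is singular when all q dummies are included alongside the intercept. The sketch below uses a hypothetical three-category sample and small helper functions written for this illustration (they are not part of any library).

```python
def gram(X):
    """Cross-product matrix X'X for a list-of-rows design matrix."""
    k = len(X[0])
    return [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]

def det(M):
    """Determinant by Laplace expansion (exact for integer entries)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

# Hypothetical categorical variable with q = 3 categories.
cats = ["a", "b", "c", "a", "b", "c", "a"]
levels = ["a", "b", "c"]
dummies = [[int(c == lev) for lev in levels] for c in cats]

X_trap = [[1] + d for d in dummies]       # intercept + q dummies
X_ok = [[1] + d[:-1] for d in dummies]    # intercept + (q - 1) dummies, "c" is the base

print("det(X'X) with q dummies:    ", det(gram(X_trap)))
print("det(X'X) with q - 1 dummies:", det(gram(X_ok)))
```

With all three dummies the determinant is exactly zero, so the normal equations have no unique solution; dropping one dummy restores a nonsingular X'X.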

Exploring Further (W 5ed 7.2, p. 227): In the baseball salary data MLB1, players are given one of six positions: frstbase, scndbase, thrdbase, shrtstop, outfield, or catcher. To allow for salary differentials across positions, with outfielders as the base group, which dummy variables would you include as independent variables?


Ordinal Information

If the categories carry ordinal information (e.g. 1 = good, 2 = better, 3 = best), these variables are sometimes used as such in regressions. However, interpretation may be a problem, because a one-unit change implies a constant partial effect. That is, the difference between better and good is assumed to be as big as the difference between best and better.

The usual alternative is to use dummy variables. In the above example two dummies are needed: $D_1 = 1$ for better and 0 otherwise, and $D_2 = 1$ for best and 0 otherwise. As a consequence, the reference group is good.

Exploring Further (adapted from W 5ed 7.3, p. 228): Consider the effect of credit rating (CR) on the municipal bond rate (MBR). Credit ratings are ranked from zero to four, with zero the worst and four the best. Consider the model

$MBR = \beta_0 + \delta_1 CR_1 + \delta_2 CR_2 + \delta_3 CR_3 + \delta_4 CR_4 + \text{other factors}$,

where $CR_1 = 1$ if $CR = 1$ and $CR_1 = 0$ otherwise, $CR_2 = 1$ if $CR = 2$ and $CR_2 = 0$ otherwise, and so on. What can you expect about the signs of the coefficients $\delta_i$? How would you test that credit rating has no effect on MBR?

The constant partial effect can be tested by testing the restricted model

$y = \beta_0 + \delta (D_1 + 2 D_2) + \beta_1 x + u$ (12)

against the unrestricted alternative

$y = \beta_0 + \delta_1 D_1 + \delta_2 D_2 + \beta_1 x + u$. (13)

Exploring Further: How would you test, using EViews, whether the credit rating affects the municipal bond rate linearly?
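The regressor $D_1 + 2D_2$ in the restricted model (12) is simply the ordinal score shifted to start at zero, which is why (12) imposes equal steps between adjacent categories. A short sketch with a hypothetical sample of ratings (coded 1 = good, 2 = better, 3 = best):

```python
# Hypothetical ordinal ratings.
ratings = ["good", "better", "best", "good", "best", "better"]
score = {"good": 1, "better": 2, "best": 3}

# Dummy coding with "good" as the reference group.
d1 = [int(r == "better") for r in ratings]
d2 = [int(r == "best") for r in ratings]

# Regressor of the restricted model (12) versus the shifted ordinal score.
restricted = [a + 2 * b for a, b in zip(d1, d2)]
shifted_score = [score[r] - 1 for r in ratings]

print(restricted)
print(shifted_score)
```

The two lists coincide, so the restricted model is the regression on the raw ordinal variable up to a shift in the intercept, whereas (13) lets the two steps differ.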

Example 4: Effects of law school ranking on starting salaries. Dummy variables: top10, r11_25, r26_40, r41_60, and r61_100. The reference group is the schools ranked below 100.

Example 4 continues

Below are estimation results with some additional covariates (Wooldridge, Example 7.8).

Dependent Variable: LOG(SALARY)
Included observations: 136 after adjustments
======================================================
Variable      Coefficient   Std. Error   t-statistic   Prob.
C
TOP10
R11_25
R26_40
R41_60
R61_100
LSAT
GPA
LOG(LIBVOL)
LOG(COST)
======================================================
R-squared              Mean dependent var
Adjusted R-squared     S.D. dependent var
S.E. of regression     Akaike info crit
Sum squared resid      Schwarz criterion
F-statistic            Prob(F-statistic)
======================================================

Example 4 continues

The estimation results indicate that the ranking has a big influence on the starting salary. The estimated median salary at a law school ranked between 61 and 100 is about 13% higher than at those ranked below 100. The coefficient estimate for the top 10 is 0.6996; using (6) we get $100\,[\exp(0.6996) - 1] \approx 101.3\%$, that is, median starting salaries in top 10 schools tend to be double those of schools ranked below 100.

Example 5: Although not fully relevant, let us, for illustration purposes only, test the constant partial effect hypothesis, i.e.,

$H_0: \delta_{top10} = 5\delta_{61\_100},\ \delta_{11\_25} = 4\delta_{61\_100},\ \delta_{26\_40} = 3\delta_{61\_100},\ \delta_{41\_60} = 2\delta_{61\_100}$. (14)

The Wald test for coefficient restrictions in EViews gives an F statistic with $df_1 = 4$ and $df_2 = 126$ that is not statistically significant. This indicates that there is not much empirical evidence against the constant partial effect for the starting salary increment. According to the estimated constant partial coefficient, the starting median salary increases by approximately 14% at each ranking class.

Example 5 continues

Figure 6.2: Starting median salary, LOG(SALARY), against law school ranking class, RANK.


Interaction Involving Dummy Variables

Interaction among Dummy Variables

In Example 3 additional dummy variables were generated by multiplying other dummy variables. In effect, these products define interaction terms among dummy variables.

Example 6: The same model as in Example 3 can be specified as

$\log(\text{wage}) = \beta_0 + \delta_f \text{female} + \delta_{mar} \text{married} + \delta_{m \times f}\, \text{female} \cdot \text{married} + \cdots$ (15)

Example 6 continues

lm(formula = log(wage) ~ female + married + I(female * married) + educ + educ + exper + tenure + I(exper^2) + I(tenure^2), data = wdfr)

Coefficients:
                      Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                                                    **
female                                                         *
married                                                        ***
I(female * married)                                            ***
educ                                                 < 2e-16   ***
exper                                                          ***
tenure                                                         ***
I(exper^2)                                                     ***
I(tenure^2)                                                    *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 517 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 8 and 517 DF, p-value: < 2.2e-16

Example 6 continues

This specification allows us to estimate the wage differentials for all four groups (single men, single women, married men, married women), but care is needed in deriving them. It is best to compute each differential as the difference between the intercepts of the two groups being compared.

For example, consider the marriage premium for women (compared to single women). The intercept for married women is $\beta_0 + \delta_f + \delta_{mar} + \delta_{m \times f}$ and the intercept for single women is $\beta_0 + \delta_f$, so the difference is $\delta_{mar} + \delta_{m \times f}$, with the same estimate as in Example 3.

The marriage premium for men (compared to single men) is in this specification directly $\delta_{mar}$ $(= \beta_0 + \delta_{mar} - \beta_0)$.
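The bookkeeping between the two parameterizations can be sketched as follows. The three inputs are the rounded Example 3 differentials quoted in the text (relative to single men); the interaction coefficient is implied by them through algebra and is not read from the regression output, and the variable and function names are hypothetical.

```python
# Rounded Example 3 differentials relative to single men, as quoted in the text.
d_marrmale, d_marrfem, d_singfem = 0.213, -0.198, -0.110

# Implied coefficients of the interaction form (15).
d_f = d_singfem                  # female
d_mar = d_marrmale               # married
d_mxf = d_marrfem - d_f - d_mar  # female * married (implied, not estimated here)

def intercept_shift(female, married):
    """Shift of the intercept relative to single men under specification (15)."""
    return d_f * female + d_mar * married + d_mxf * female * married

premium_women = intercept_shift(1, 1) - intercept_shift(1, 0)
premium_men = intercept_shift(0, 1) - intercept_shift(0, 0)
print(f"marriage premium, women: {premium_women:+.3f}")
print(f"marriage premium, men:   {premium_men:+.3f}")
```

The premium for women equals $\delta_{mar} + \delta_{m \times f}$ and reproduces the difference between the married-women and single-women dummies of Example 3, while the premium for men is $\delta_{mar}$ alone.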


Different Slopes

Consider

$y = \beta_0 + \beta_1 x_1 + u$. (16)

If the slope also depends on the group, then in addition to

$\beta_0 = \beta_{00} + \delta_0 D$ (17)

we have for the slope coefficient similarly

$\beta_1 = \beta_{11} + \delta_1 D$. (18)

The regression equation is then

$y = \beta_{00} + \delta_0 D + \beta_{11} x_1 + \delta_1 x_2 + u$, (19)

where $x_2 = D \cdot x_1$.

Example 7: Wage example. Test whether the return to education differs between women and men. This can be tested by defining

$\beta_{educ} = \beta_{meduc} + \delta_{feduc}\, \text{female}$. (20)

The null hypothesis is $H_0: \delta_{feduc} = 0$.

Example 7 continues

Dependent Variable: LOG(WAGE)
Method: Least Squares
Included observations: 526
======================================================
Variable      Coefficient   Std. Error   t-statistic   Prob.
C
MARRMALE
MARRFEM
SINGFEM
FEMALE*EDUC
EDUC
EXPER
TENURE
EXPER^2
TENURE^2
======================================================
R-squared              Mean dependent var
Adjusted R-squared     S.D. dependent var
S.E. of regression     Akaike info crit
Sum squared resid      Schwarz criterion
Log likelihood         F-statistic
Durbin-Watson stat     Prob(F-statistic)
======================================================

Example 7 continues

The estimate $\hat\delta_{feduc}$ is not statistically significant. Thus there is no empirical evidence that the return to education differs between men and women.


Chow Test

Suppose there are two populations (e.g. men and women) and we want to test whether the same regression function applies to both groups. This can be handled by introducing a dummy variable $D$, with $D = 1$ for group 1 and $D = 0$ for group 2.

If the regression in group $g$ ($g = 1, 2$) is

$y_{g,i} = \beta_{g,0} + \beta_{g,1} x_{g,i,1} + \cdots + \beta_{g,k} x_{g,i,k} + u_{g,i}$, (21)

$i = 1, \ldots, n_g$, where $n_g$ is the number of observations from group $g$, then using the group dummy we can write

$\beta_{g,j} = \beta_j + \delta_j D$, (22)

$j = 0, 1, \ldots, k$. An important assumption is that $\mathrm{var}[u_{g,i}] = \sigma_u^2$ in both groups.

The null hypothesis is

$H_0: \delta_0 = \delta_1 = \cdots = \delta_k = 0$. (23)

The null hypothesis (23) can be tested with the F-test given in (4.20). In the first step the unrestricted model is estimated over the pooled sample with coefficients of the form in equation (22) (thus $2(k + 1)$ coefficients). Next the restricted model, with all $\delta$-coefficients set to zero, is estimated, again over the pooled sample. Using the SSRs from the restricted and unrestricted models, test statistic (4.20) becomes

$F = \dfrac{(SSR_r - SSR_{ur})/(k + 1)}{SSR_{ur}/[n - 2(k + 1)]}$, (24)

which has the F-distribution with $k + 1$ and $n - 2(k + 1)$ degrees of freedom under the null hypothesis.

Exactly the same result is obtained if one estimates the regression equation separately for each group and sums up the SSRs. That is,

$SSR_{ur} = SSR_1 + SSR_2$, (25)

where $SSR_g$ is from the regression estimated from group $g$, $g = 1, 2$. Thus, statistic (24) can be written alternatively as

$F = \dfrac{[SSR_r - (SSR_1 + SSR_2)]/(k + 1)}{(SSR_1 + SSR_2)/[n - 2(k + 1)]}$, (26)

which is known as the Chow statistic (or Chow test).
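The equivalence in (25) and the statistic (26) can be illustrated with a small self-contained sketch. The data are made up for this example, and ols_ssr is a hypothetical helper that solves the normal equations by Gauss-Jordan elimination; it is adequate for tiny well-conditioned problems, not a substitute for a statistics library.

```python
def ols_ssr(X, y):
    """Sum of squared residuals from an OLS regression of y on the columns of X."""
    m = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(m)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(m):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][m] / A[i][i] for i in range(m)]
    return sum((yi - sum(b * xi for b, xi in zip(beta, r))) ** 2
               for r, yi in zip(X, y))

# Made-up data: two groups, one regressor (k = 1).
x1, y1 = [1, 2, 3, 4, 5], [2.1, 2.9, 4.2, 4.8, 6.1]
x2, y2 = [1, 2, 3, 4, 5], [1.0, 1.4, 2.1, 2.4, 3.1]
k, n = 1, len(y1) + len(y2)

ssr_1 = ols_ssr([[1, x] for x in x1], y1)             # group 1 alone
ssr_2 = ols_ssr([[1, x] for x in x2], y2)             # group 2 alone
ssr_r = ols_ssr([[1, x] for x in x1 + x2], y1 + y2)   # pooled, restricted
# Pooled unrestricted model with columns 1, D, x, D*x.
X_ur = [[1, 1, x, x] for x in x1] + [[1, 0, x, 0] for x in x2]
ssr_ur = ols_ssr(X_ur, y1 + y2)

chow_f = ((ssr_r - (ssr_1 + ssr_2)) / (k + 1)) / ((ssr_1 + ssr_2) / (n - 2 * (k + 1)))
print(f"SSR_ur = {ssr_ur:.6f}, SSR_1 + SSR_2 = {ssr_1 + ssr_2:.6f}, F = {chow_f:.2f}")
```

The resulting statistic would be compared with the critical value of the F-distribution with $k + 1$ and $n - 2(k + 1)$ degrees of freedom.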


Binary Dependent Variable

The Linear Probability Model

Up until now, in the regression

$y = x'\beta + u$, (27)

where $x'\beta = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$, $y$ has had a quantitative meaning (e.g. wage). What if $y$ indicates a qualitative event (e.g., a firm has gone bankrupt), such that $y = 1$ indicates the occurrence of the event ("success") and $y = 0$ non-occurrence ("fail"), and we want to explain it by some explanatory variables?

Consider the meaning of the regression $y = x'\beta + u$ when $y$ is a binary variable. Because $E[u \mid x] = 0$,

$E[y \mid x] = x'\beta$. (28)

Because $y$ is a random variable that can take only the values 0 or 1, we can define probabilities for $y$ as $P(y = 1 \mid x)$ and $P(y = 0 \mid x) = 1 - P(y = 1 \mid x)$, such that

$E[y \mid x] = 0 \cdot P(y = 0 \mid x) + 1 \cdot P(y = 1 \mid x) = P(y = 1 \mid x)$.

Thus $E[y \mid x] = P(y = 1 \mid x)$ is the success probability, and the regression in equation (28) models

$P(y = 1 \mid x) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$, (29)

the probability of success. This is called the linear probability model (LPM). The slope coefficients indicate the marginal effect of the corresponding x-variable on the success probability, i.e., the change in the probability as $x_j$ changes:

$\Delta P(y = 1 \mid x) = \beta_j \Delta x_j$. (30)

In the OLS-estimated model

$\hat y = \hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_k x_k$, (31)

$\hat y$ is the estimated or predicted probability of success. To specify the binary variable correctly, it may be useful to name the variable after the success category (e.g., in a bankruptcy study, bankrupt = 1 for bankrupt firms and bankrupt = 0 for non-bankrupt firms; thus "success" is just a generic term).

Example 8: Married women's participation in the labor force (year 1975). (See the R-snippet for the R commands.)

Call: lm(formula = inlf ~ huswage + educ + exper + I(exper^2) + age + kidslt6 + kidsge6, data = mdfr)

Coefficients:
              Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)                                            ***
huswage                                                *
educ                                                   ***
exper                                                  ***
I(exper^2)                                             **
age                                                    ***
kidslt6                                                ***
kidsge6
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: on 745 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 7 and 745 DF, p-value: < 2.2e-16

Example 8 continues

All coefficients but that of kidsge6 are statistically significant, with signs as might be expected. The coefficients indicate the marginal effects of the variables on the probability that inlf = 1. Thus, for example, an additional year of educ changes the probability by the estimated coefficient of educ (other variables held fixed).

Figure: two panels. Left: marginal effect of experience on married women's labor force participation (Probability against Experience in years). Right: marginal effect of education on married women's labor force participation (Probability against Education in years).

Some issues associated with the LPM:

The dependent left-hand side is restricted to $(0, 1)$, while the right-hand side ranges over $(-\infty, \infty)$, which may result in probability predictions less than zero or larger than one.

Heteroskedasticity of $u$: denoting $p(x) = P(y = 1 \mid x) = x'\beta$,

$\mathrm{var}[u \mid x] = (1 - p(x))\,p(x)$, (32)

which is not constant but depends on $x$, and hence violates Assumption 2.
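Both issues can be seen in a minimal sketch with a single regressor. The coefficients below are hypothetical and chosen only so that the fitted line leaves the unit interval within the range shown.

```python
# Hypothetical LPM coefficients: P(y = 1 | x) = BETA0 + BETA1 * x.
BETA0, BETA1 = 0.2, 0.1

def lpm_prob(x):
    """Success probability implied by the linear probability model (29)."""
    return BETA0 + BETA1 * x

def lpm_error_variance(x):
    """Conditional error variance (32): (1 - p(x)) * p(x)."""
    p = lpm_prob(x)
    return (1.0 - p) * p

for x in (-4, 0, 3, 6, 10):
    p = lpm_prob(x)
    flag = "" if 0.0 <= p <= 1.0 else "  <- outside [0, 1]"
    print(f"x = {x:3d}: p = {p:5.2f}, var[u|x] = {lpm_error_variance(x):6.3f}{flag}")
```

The variance changes with x, and wherever the "probability" falls outside the unit interval the formula even returns a negative number, which is one reason weighted least squares based on (32) can fail for the LPM.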


The Logit and Probit Model

The first of the above problems is technically easy to solve by mapping the linear function on the right-hand side of equation (29) through a non-linear function into the range $(0, 1)$. Such a function is generally called a link function. That is, instead of equation (29) we write

$P(y = 1 \mid x) = G(x'\beta)$. (33)

Although any function $G: \mathbb{R} \to [0, 1]$ applies in principle, the so-called logit and probit transformations are the most popular in practice (the former is based on the logistic distribution and the latter on the normal distribution).

Economists often favor the probit transformation, in which $G$ is the distribution function of the standard normal distribution, i.e.,

$G(z) = \Phi(z) = \displaystyle\int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} v^2}\, dv$. (34)

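In the logit transformation

$G(z) = \dfrac{e^z}{1 + e^z} = \dfrac{1}{1 + e^{-z}} = \displaystyle\int_{-\infty}^{z} \frac{e^{v}}{(1 + e^{v})^2}\, dv$. (35)

Both are S-shaped.

Figure: two panels plotting $G(z)$ against $z$, the probit transformation (left) and the logit transformation (right).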
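Both transformations are easy to evaluate with the Python standard library. The sketch below uses the identity $\Phi(z) = \tfrac{1}{2}[1 + \mathrm{erf}(z/\sqrt{2})]$ for the probit case; the function names are chosen for this illustration.

```python
import math

def probit_cdf(z):
    """Standard normal distribution function, equation (34)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit_cdf(z):
    """Logistic distribution function, equation (35)."""
    return 1.0 / (1.0 + math.exp(-z))

# Both functions rise from near 0 to near 1 and equal 0.5 at z = 0.
for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"z = {z:+.1f}: probit G(z) = {probit_cdf(z):.4f}, logit G(z) = {logit_cdf(z):.4f}")
```

The logistic curve has heavier tails than the normal one, which is why logit and probit coefficient estimates differ in scale even when the fitted probabilities are close.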

The price, however, is that the marginal effects are no longer as straightforward to interpret as in the LPM. Still, a negative coefficient indicates a decreasing effect on the probability and a positive coefficient an increasing effect.
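The sign statement follows from differentiating (33). The derivation below is a standard calculus step added here for illustration (it is not taken from the slides) and assumes a continuous regressor $x_j$ and a differentiable $G$ with density $g = G'$:

```latex
\frac{\partial P(y = 1 \mid x)}{\partial x_j}
  = g(x'\beta)\,\beta_j ,
\qquad
g(z) = \frac{dG(z)}{dz} =
\begin{cases}
  \phi(z) = \dfrac{1}{\sqrt{2\pi}}\, e^{-z^2/2} & \text{(probit)} \\[2ex]
  \dfrac{e^{z}}{(1 + e^{z})^{2}} = G(z)\,[1 - G(z)] & \text{(logit)}
\end{cases}
```

Because $g(z) > 0$ for every $z$, the marginal effect always has the sign of $\beta_j$, but its size depends on the point $x$ at which it is evaluated, unlike the constant effect $\beta_j$ in the LPM.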

Example 9: Probit and logit estimation of the married women's labor force participation probability.

Probit: (family = binomial(link = "probit") in glm)

glm(formula = inlf ~ huswage + educ + exper + I(exper^2) + age + kidslt6 + kidsge6, family = binomial(link = "probit"), data = wkng)

Coefficients:
              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)
huswage                                                *
educ                                                   ***
exper                                                  ***
I(exper^2)                                             **
age                                                    ***
kidslt6                                                ***
kidsge6
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Null deviance: on 752 degrees of freedom
Residual deviance: on 745 degrees of freedom
AIC:

Example 9 continues

Logit: (family = binomial(link = "logit") in glm)

glm(formula = inlf ~ huswage + educ + exper + I(exper^2) + age + kidslt6 + kidsge6, family = binomial(link = "logit"), data = wkng)

Coefficients:
              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)
huswage                                                *
educ                                                   ***
exper                                                  ***
I(exper^2)                                             **
age                                                    ***
kidslt6                                                ***
kidsge6
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Null deviance: on 752 degrees of freedom
Residual deviance: on 745 degrees of freedom
AIC:

Qualitatively the results are similar to those of the LPM.

(R exercise: create graphs of the marginal effects similar to those of the linear case.)


More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0 Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct

More information

CHAPTER 4. > 0, where β

CHAPTER 4. > 0, where β CHAPTER 4 SOLUTIONS TO PROBLEMS 4. (i) and (iii) generally cause the t statistics not to have a t distribution under H. Homoskedasticity is one of the CLM assumptions. An important omitted variable violates

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

ECON Interactions and Dummies

ECON Interactions and Dummies ECON 351 - Interactions and Dummies Maggie Jones 1 / 25 Readings Chapter 6: Section on Models with Interaction Terms Chapter 7: Full Chapter 2 / 25 Interaction Terms with Continuous Variables In some regressions

More information

Applied Economics. Regression with a Binary Dependent Variable. Department of Economics Universidad Carlos III de Madrid

Applied Economics. Regression with a Binary Dependent Variable. Department of Economics Universidad Carlos III de Madrid Applied Economics Regression with a Binary Dependent Variable Department of Economics Universidad Carlos III de Madrid See Stock and Watson (chapter 11) 1 / 28 Binary Dependent Variables: What is Different?

More information

Lecture 8. Using the CLR Model. Relation between patent applications and R&D spending. Variables

Lecture 8. Using the CLR Model. Relation between patent applications and R&D spending. Variables Lecture 8. Using the CLR Model Relation between patent applications and R&D spending Variables PATENTS = No. of patents (in 000) filed RDEP = Expenditure on research&development (in billions of 99 $) The

More information

Course Econometrics I

Course Econometrics I Course Econometrics I 4. Heteroskedasticity Martin Halla Johannes Kepler University of Linz Department of Economics Last update: May 6, 2014 Martin Halla CS Econometrics I 4 1/31 Our agenda for today Consequences

More information

Economics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama

Economics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama Economics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama Course Packet The purpose of this packet is to show you one particular dataset and how it is used in

More information

Practice Questions for the Final Exam. Theoretical Part

Practice Questions for the Final Exam. Theoretical Part Brooklyn College Econometrics 7020X Spring 2016 Instructor: G. Koimisis Name: Date: Practice Questions for the Final Exam Theoretical Part 1. Define dummy variable and give two examples. 2. Analyze the

More information

Linear Regression With Special Variables

Linear Regression With Special Variables Linear Regression With Special Variables Junhui Qian December 21, 2014 Outline Standardized Scores Quadratic Terms Interaction Terms Binary Explanatory Variables Binary Choice Models Standardized Scores:

More information

Econometrics Problem Set 10

Econometrics Problem Set 10 Econometrics Problem Set 0 WISE, Xiamen University Spring 207 Conceptual Questions Dependent variable: P ass Probit Logit LPM Probit Logit LPM Probit () (2) (3) (4) (5) (6) (7) Experience 0.03 0.040 0.006

More information

Universidad Carlos III de Madrid Econometría Nonlinear Regression Functions Problem Set 8

Universidad Carlos III de Madrid Econometría Nonlinear Regression Functions Problem Set 8 Universidad Carlos III de Madrid Econometría Nonlinear Regression Functions Problem Set 8 1. The sales of a company amount to 196 millions of dollars in 2009 and increased up to 198 millions in 2010. (a)

More information

Would you have survived the sinking of the Titanic? Felix Pretis (Oxford) Econometrics Oxford University, / 38

Would you have survived the sinking of the Titanic? Felix Pretis (Oxford) Econometrics Oxford University, / 38 Would you have survived the sinking of the Titanic? Felix Pretis (Oxford) Econometrics Oxford University, 2016 1 / 38 Introduction Econometrics: Computer Modelling Felix Pretis Programme for Economic Modelling

More information

Heteroscedasticity 1

Heteroscedasticity 1 Heteroscedasticity 1 Pierre Nguimkeu BUEC 333 Summer 2011 1 Based on P. Lavergne, Lectures notes Outline Pure Versus Impure Heteroscedasticity Consequences and Detection Remedies Pure Heteroscedasticity

More information

Solutions to Problem Set 5 (Due November 22) Maximum number of points for Problem set 5 is: 220. Problem 7.3

Solutions to Problem Set 5 (Due November 22) Maximum number of points for Problem set 5 is: 220. Problem 7.3 Solutions to Problem Set 5 (Due November 22) EC 228 02, Fall 2010 Prof. Baum, Ms Hristakeva Maximum number of points for Problem set 5 is: 220 Problem 7.3 (i) (5 points) The t statistic on hsize 2 is over

More information

Regression #8: Loose Ends

Regression #8: Loose Ends Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population

More information

Exercise Sheet 6: Solutions

Exercise Sheet 6: Solutions Exercise Sheet 6: Solutions R.G. Pierse 1. (a) Regression yields: Dependent Variable: LC Date: 10/29/02 Time: 18:37 Sample(adjusted): 1950 1985 Included observations: 36 after adjusting endpoints C 0.244716

More information

Answers to Problem Set #4

Answers to Problem Set #4 Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2

More information

x = 1 n (x = 1 (x n 1 ι(ι ι) 1 ι x) (x ι(ι ι) 1 ι x) = 1

x = 1 n (x = 1 (x n 1 ι(ι ι) 1 ι x) (x ι(ι ι) 1 ι x) = 1 Estimation and Inference in Econometrics Exercises, January 24, 2003 Solutions 1. a) cov(wy ) = E [(WY E[WY ])(WY E[WY ]) ] = E [W(Y E[Y ])(Y E[Y ]) W ] = W [(Y E[Y ])(Y E[Y ]) ] W = WΣW b) Let Σ be a

More information

ECON 5350 Class Notes Functional Form and Structural Change

ECON 5350 Class Notes Functional Form and Structural Change ECON 5350 Class Notes Functional Form and Structural Change 1 Introduction Although OLS is considered a linear estimator, it does not mean that the relationship between Y and X needs to be linear. In this

More information

Applied Econometrics. Applied Econometrics Second edition. Dimitrios Asteriou and Stephen G. Hall

Applied Econometrics. Applied Econometrics Second edition. Dimitrios Asteriou and Stephen G. Hall Applied Econometrics Second edition Dimitrios Asteriou and Stephen G. Hall MULTICOLLINEARITY 1. Perfect Multicollinearity 2. Consequences of Perfect Multicollinearity 3. Imperfect Multicollinearity 4.

More information

5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1)

5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1) 5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1) Assumption #A1: Our regression model does not lack of any further relevant exogenous variables beyond x 1i, x 2i,..., x Ki and

More information

Multiple Regression: Inference

Multiple Regression: Inference Multiple Regression: Inference The t-test: is ˆ j big and precise enough? We test the null hypothesis: H 0 : β j =0; i.e. test that x j has no effect on y once the other explanatory variables are controlled

More information

Lecture 8. Using the CLR Model

Lecture 8. Using the CLR Model Lecture 8. Using the CLR Model Example of regression analysis. Relation between patent applications and R&D spending Variables PATENTS = No. of patents (in 1000) filed RDEXP = Expenditure on research&development

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Binary Dependent Variables

Binary Dependent Variables Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome

More information

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3 STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae

More information

Sample Problems. Note: If you find the following statements true, you should briefly prove them. If you find them false, you should correct them.

Sample Problems. Note: If you find the following statements true, you should briefly prove them. If you find them false, you should correct them. Sample Problems 1. True or False Note: If you find the following statements true, you should briefly prove them. If you find them false, you should correct them. (a) The sample average of estimated residuals

More information

x i = 1 yi 2 = 55 with N = 30. Use the above sample information to answer all the following questions. Show explicitly all formulas and calculations.

x i = 1 yi 2 = 55 with N = 30. Use the above sample information to answer all the following questions. Show explicitly all formulas and calculations. Exercises for the course of Econometrics Introduction 1. () A researcher is using data for a sample of 30 observations to investigate the relationship between some dependent variable y i and independent

More information

University of California at Berkeley Fall Introductory Applied Econometrics Final examination. Scores add up to 125 points

University of California at Berkeley Fall Introductory Applied Econometrics Final examination. Scores add up to 125 points EEP 118 / IAS 118 Elisabeth Sadoulet and Kelly Jones University of California at Berkeley Fall 2008 Introductory Applied Econometrics Final examination Scores add up to 125 points Your name: SID: 1 1.

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Lecture Module 6. Agenda. Professor Spearot. 1 P-values. 2 Comparing Parameters. 3 Predictions. 4 F-Tests

Lecture Module 6. Agenda. Professor Spearot. 1 P-values. 2 Comparing Parameters. 3 Predictions. 4 F-Tests Lecture Module 6 Professor Spearot Agenda 1 P-values 2 Comparing Parameters 3 Predictions 4 F-Tests Confidence intervals Housing prices and cancer risk Housing prices and cancer risk (Davis, 2004) Housing

More information

Univariate linear models

Univariate linear models Univariate linear models The specification process of an univariate ARIMA model is based on the theoretical properties of the different processes and it is also important the observation and interpretation

More information

Multiple Regression. Peerapat Wongchaiwat, Ph.D.

Multiple Regression. Peerapat Wongchaiwat, Ph.D. Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com The Multiple Regression Model Examine the linear relationship between 1 dependent (Y) & 2 or more independent variables (X i ) Multiple Regression Model

More information

13. Time Series Analysis: Asymptotics Weakly Dependent and Random Walk Process. Strict Exogeneity

13. Time Series Analysis: Asymptotics Weakly Dependent and Random Walk Process. Strict Exogeneity Outline: Further Issues in Using OLS with Time Series Data 13. Time Series Analysis: Asymptotics Weakly Dependent and Random Walk Process I. Stationary and Weakly Dependent Time Series III. Highly Persistent

More information

Practice exam questions

Practice exam questions Practice exam questions Nathaniel Higgins nhiggins@jhu.edu, nhiggins@ers.usda.gov 1. The following question is based on the model y = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 3 + u. Discuss the following two hypotheses.

More information

Outline. 11. Time Series Analysis. Basic Regression. Differences between Time Series and Cross Section

Outline. 11. Time Series Analysis. Basic Regression. Differences between Time Series and Cross Section Outline I. The Nature of Time Series Data 11. Time Series Analysis II. Examples of Time Series Models IV. Functional Form, Dummy Variables, and Index Basic Regression Numbers Read Wooldridge (2013), Chapter

More information

1 Quantitative Techniques in Practice

1 Quantitative Techniques in Practice 1 Quantitative Techniques in Practice 1.1 Lecture 2: Stationarity, spurious regression, etc. 1.1.1 Overview In the rst part we shall look at some issues in time series economics. In the second part we

More information

ECO375 Tutorial 4 Wooldridge: Chapter 6 and 7

ECO375 Tutorial 4 Wooldridge: Chapter 6 and 7 ECO375 Tutorial 4 Wooldridge: Chapter 6 and 7 Matt Tudball University of Toronto St. George October 6, 2017 Matt Tudball (University of Toronto) ECO375H1 October 6, 2017 1 / 36 ECO375 Tutorial 4 Welcome

More information

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Michele Aquaro University of Warwick This version: July 21, 2016 1 / 31 Reading material Textbook: Introductory

More information

Density Temp vs Ratio. temp

Density Temp vs Ratio. temp Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,

More information

Single-level Models for Binary Responses

Single-level Models for Binary Responses Single-level Models for Binary Responses Distribution of Binary Data y i response for individual i (i = 1,..., n), coded 0 or 1 Denote by r the number in the sample with y = 1 Mean and variance E(y) =

More information

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include: One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement

More information

Introducing Generalized Linear Models: Logistic Regression

Introducing Generalized Linear Models: Logistic Regression Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and

More information

Problem Set 2: Box-Jenkins methodology

Problem Set 2: Box-Jenkins methodology Problem Set : Box-Jenkins methodology 1) For an AR1) process we have: γ0) = σ ε 1 φ σ ε γ0) = 1 φ Hence, For a MA1) process, p lim R = φ γ0) = 1 + θ )σ ε σ ε 1 = γ0) 1 + θ Therefore, p lim R = 1 1 1 +

More information

Matched Pair Data. Stat 557 Heike Hofmann

Matched Pair Data. Stat 557 Heike Hofmann Matched Pair Data Stat 557 Heike Hofmann Outline Marginal Homogeneity - review Binary Response with covariates Ordinal response Symmetric Models Subject-specific vs Marginal Model conditional logistic

More information

Exercise sheet 6 Models with endogenous explanatory variables

Exercise sheet 6 Models with endogenous explanatory variables Exercise sheet 6 Models with endogenous explanatory variables Note: Some of the exercises include estimations and references to the data files. Use these to compare them to the results you obtained with

More information

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011) Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October

More information

Practical Econometrics. for. Finance and Economics. (Econometrics 2)

Practical Econometrics. for. Finance and Economics. (Econometrics 2) Practical Econometrics for Finance and Economics (Econometrics 2) Seppo Pynnönen and Bernd Pape Department of Mathematics and Statistics, University of Vaasa 1. Introduction 1.1 Econometrics Econometrics

More information

3. Linear Regression With a Single Regressor

3. Linear Regression With a Single Regressor 3. Linear Regression With a Single Regressor Econometrics: (I) Application of statistical methods in empirical research Testing economic theory with real-world data (data analysis) 56 Econometrics: (II)

More information

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3 4 5 6 Full marks

More information

Hint: The following equation converts Celsius to Fahrenheit: F = C where C = degrees Celsius F = degrees Fahrenheit

Hint: The following equation converts Celsius to Fahrenheit: F = C where C = degrees Celsius F = degrees Fahrenheit Amherst College Department of Economics Economics 360 Fall 2014 Exam 1: Solutions 1. (10 points) The following table in reports the summary statistics for high and low temperatures in Key West, FL from

More information

Testing and Model Selection

Testing and Model Selection Testing and Model Selection This is another digression on general statistics: see PE App C.8.4. The EViews output for least squares, probit and logit includes some statistics relevant to testing hypotheses

More information

Chapter 9. Dummy (Binary) Variables. 9.1 Introduction The multiple regression model (9.1.1) Assumption MR1 is

Chapter 9. Dummy (Binary) Variables. 9.1 Introduction The multiple regression model (9.1.1) Assumption MR1 is Chapter 9 Dummy (Binary) Variables 9.1 Introduction The multiple regression model y = β+β x +β x + +β x + e (9.1.1) t 1 2 t2 3 t3 K tk t Assumption MR1 is 1. yt =β 1+β 2xt2 + L+β KxtK + et, t = 1, K, T

More information

Project Report for STAT571 Statistical Methods Instructor: Dr. Ramon V. Leon. Wage Data Analysis. Yuanlei Zhang

Project Report for STAT571 Statistical Methods Instructor: Dr. Ramon V. Leon. Wage Data Analysis. Yuanlei Zhang Project Report for STAT7 Statistical Methods Instructor: Dr. Ramon V. Leon Wage Data Analysis Yuanlei Zhang 77--7 November, Part : Introduction Data Set The data set contains a random sample of observations

More information

Statistics 203 Introduction to Regression Models and ANOVA Practice Exam

Statistics 203 Introduction to Regression Models and ANOVA Practice Exam Statistics 203 Introduction to Regression Models and ANOVA Practice Exam Prof. J. Taylor You may use your 4 single-sided pages of notes This exam is 7 pages long. There are 4 questions, first 3 worth 10

More information

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression

More information

Inference in Regression Analysis

Inference in Regression Analysis ECNS 561 Inference Inference in Regression Analysis Up to this point 1.) OLS is unbiased 2.) OLS is BLUE (best linear unbiased estimator i.e., the variance is smallest among linear unbiased estimators)

More information

Econometrics II Tutorial Problems No. 1

Econometrics II Tutorial Problems No. 1 Econometrics II Tutorial Problems No. 1 Lennart Hoogerheide & Agnieszka Borowska 15.02.2017 1 Summary Binary Response Model: A model for a binary (or dummy, i.e. with two possible outcomes 0 and 1) dependent

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

Lecture 7: OLS with qualitative information

Lecture 7: OLS with qualitative information Lecture 7: OLS with qualitative information Dummy variables Dummy variable: an indicator that says whether a particular observation is in a category or not Like a light switch: on or off Most useful values:

More information

22s:152 Applied Linear Regression

22s:152 Applied Linear Regression 22s:152 Applied Linear Regression Chapter 7: Dummy Variable Regression So far, we ve only considered quantitative variables in our models. We can integrate categorical predictors by constructing artificial

More information

Solutions: Monday, October 22

Solutions: Monday, October 22 Amherst College Department of Economics Economics 360 Fall 2012 1. Focus on the following agricultural data: Solutions: Monday, October 22 Agricultural Production Data: Cross section agricultural data

More information

Introduction to Econometrics Chapter 4

Introduction to Econometrics Chapter 4 Introduction to Econometrics Chapter 4 Ezequiel Uriel Jiménez University of Valencia Valencia, September 2013 4 ypothesis testing in the multiple regression 4.1 ypothesis testing: an overview 4.2 Testing

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

Binary Logistic Regression

Binary Logistic Regression The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

ECNS 561 Topics in Multiple Regression Analysis

ECNS 561 Topics in Multiple Regression Analysis ECNS 561 Topics in Multiple Regression Analysis Scaling Data For the simple regression case, we already discussed the effects of changing the units of measurement Nothing different here Coefficients, SEs,

More information

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 PDF file location: http://www.murraylax.org/rtutorials/regression_anovatable.pdf

More information

BMI 541/699 Lecture 22

BMI 541/699 Lecture 22 BMI 541/699 Lecture 22 Where we are: 1. Introduction and Experimental Design 2. Exploratory Data Analysis 3. Probability 4. T-based methods for continous variables 5. Power and sample size for t-based

More information

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han Econometrics Honor s Exam Review Session Spring 2012 Eunice Han Topics 1. OLS The Assumptions Omitted Variable Bias Conditional Mean Independence Hypothesis Testing and Confidence Intervals Homoskedasticity

More information

ST430 Exam 1 with Answers

ST430 Exam 1 with Answers ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.

More information

Brief Sketch of Solutions: Tutorial 3. 3) unit root tests

Brief Sketch of Solutions: Tutorial 3. 3) unit root tests Brief Sketch of Solutions: Tutorial 3 3) unit root tests.5.4.4.3.3.2.2.1.1.. -.1 -.1 -.2 -.2 -.3 -.3 -.4 -.4 21 22 23 24 25 26 -.5 21 22 23 24 25 26.8.2.4. -.4 - -.8 - - -.12 21 22 23 24 25 26 -.2 21 22

More information

Econometrics - Slides

Econometrics - Slides 1 Econometrics - Slides 2011/2012 João Nicolau 2 1 Introduction 1.1 What is Econometrics? Econometrics is a discipline that aims to give empirical content to economic relations. It has been defined generally

More information