MGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu


Last Name (Print): Solution    First Name (Print):    Student Number:

MGEC11H3Y L01 Introduction to Regression Analysis
Term Test
Friday July 5, PM
Instructor: Victor Yu

Aids allowed: Calculator and one aid sheet (two 8.5"x11" pages) written or typed on both sides
Time allowed: Two (2) hours

This exam consists of 20 questions in 10 pages including this cover page. It is the student's responsibility to hand in all 9 pages of this exam. Any missing page will get a zero mark. Show your work in Part II. No marks will be given if you do not show your work. This exam is worth % of your course grade.

Do not write on the space below, for markers only.

| Page | Question | Max | Mark |
|---|---|---|---|
| 2-5 | 1-17 | 51 | |
| 6 | 18 | 16 | |
| 7-8 | 19 | 16 | |
| 9-10 | 20 | 17 | |
| Total | | 100 | |

The University of Toronto's Code of Behaviour on Academic Matters applies to all University of Toronto Scarborough students. The Code prohibits all forms of academic dishonesty including, but not limited to, cheating, plagiarism, and the use of unauthorized aids. Students violating the Code may be subject to penalties up to and including suspension or expulsion from the University.

Management, 1265 Military Trail, Toronto, ON, M1C 1A4, Canada www.utsc.utoronto.ca/mgmt

Part I. Multiple Choice. 3 marks in each question. No part marks. Circle only one answer. If there is more than one correct answer, circle the best one.

1. The model y x x u is x
(a) a simple regression model
(b) a linear multiple regression model
(c) a non-linear multiple regression model
(d) all of the above can be correct
(e) none of the above is correct

2. Let $y = \beta_0 + \beta_1 x + u$ be a regression model with one regressor, and let $\rho$ be the correlation between x and y. Which one of the following statements is false?
(a) If the F value in the ANOVA table is less than, the model is not significant.
(b) Testing $H_0: \beta_1 = 0$ is equivalent to testing $H_0: \rho = 0$.
(c) The OLS estimator $b_1$ always has the same sign as the sample correlation coefficient r.
(d) To test the significance of the model, we must assume that the error u has a normal distribution with mean 0 and variance $\sigma^2$.
(e) The estimated regression equation of y on x is always the same as that of x on y.

3. Let $y = \beta_0 + \beta_1 x + u$ be a regression model with one regressor. To obtain the sample regression coefficient $b_1$ using the method of least squares, which one of the following statements is false?
E b (a) n (b) y y i n (c) y i y i i n (d) y i y (e) i n i i y i y i is a minimum is a minimum

4. Given a set of data $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$. If the correlation coefficient is computed to be r., what is the correlation coefficient computed from the set of data x, y, x, y,..., x n, y n?
(a). (b). 7 (c). (d).7 (e) cannot be calculated from given condition

5. For the pairs of measurements $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, the OLS regression line of y on x is y. 4x ; and the OLS regression line of x on y is x y. What is the correlation coefficient between x and y?
(a).4 (b). (c). (d). (e).4

6. The regression $R^2$ is a measure of
(a) whether or not X causes Y.
(b) the goodness of fit of your regression line.
(c) whether or not ESS > TSS.
(d) the square of the regression coefficient.
(e) none of the above

7. The reason why estimators have a sampling distribution is that
(a) economics is not a precise science.
(b) individuals respond differently to incentives.
(c) in real life you typically get to sample many times.
(d) the values of the explanatory variable and the error term differ across samples.
(e) the values of the explanatory variable and the error term are the same across samples.

8. Imagine you regressed earnings of individuals on a constant, a binary variable ("Male") which takes on the value 1 for males and is 0 otherwise, and another binary variable ("Female") which takes on the value 1 for females and is 0 otherwise. Because females typically earn less than males, you would expect
(a) the coefficient for Male to have a positive sign, and for Female a negative sign.
(b) both coefficients to be the same distance from the constant, one above and the other below.
(c) none of the OLS estimators to exist because there is perfect multicollinearity.
(d) this to yield a difference in means statistic.
(e) this to yield better results.

9. The intercept in the multiple regression model
(a) should be excluded if one explanatory variable has negative values.
(b) determines the height of the regression line.
(c) should be excluded because the population regression function does not go through the origin.
(d) is statistically significant if it is larger than 1.96.
(e) is always statistically significant.

10. In multiple regression, the $R^2$ increases whenever a regressor is
(a) added unless the coefficient on the added regressor is exactly zero.
(b) added even when the coefficient on the added regressor is exactly zero.
(c) added unless there is heteroskedasticity.
(d) greater than 1.96 in absolute value.
(e) greater than 1.64 in absolute value.

11. In the multiple regression model, the adjusted $R^2$
(a) cannot be negative.
(b) will never be greater than the regression $R^2$.
(c) equals the square of the correlation coefficient r.
(d) cannot decrease when an additional explanatory variable is added.
(e) is none of the above.

12. Let $R^2_{unrestricted}$ and $R^2_{restricted}$ be 0.4366 and 0.4149 respectively in multiple regression. The difference between the unrestricted and the restricted model is that you have imposed two restrictions. There are 420 observations. The F-statistic in this case is closest to
(a) 4.61 (b) 8.01 (c) 10.34 (d) 7.71 (e).4

13. Consider the following regression output where the dependent variable is test scores and the two explanatory variables are the student-teacher ratio (STR) and the percent of English learners (PctEL):
$\widehat{TestScore} = 698.9 - 1.10\,STR - 0.650\,PctEL$.
You are told that the t-statistic on the student-teacher ratio coefficient is 2.56. The standard error therefore is approximately
(a) 0.25 (b) 1.96 (c) 0.650 (d) 0.43 (e) 1.64

Questions 14-17: Suppose in a sample of 5 men that their monthly income (in thousands of dollars), years of schooling and ages are as follows:

Men | y (income in $1,000) | x1 (Years of Schooling) | x2 (Age)
6 8 4 7 4 8 6 9 4 4

Assume a linear model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$. Computer outputs show the following results:

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.919018 |
| R Square | 0.844595 |
| Adj R Square | 0.689189 |
| Standard Error | 2.397916 |
| Observations | 5 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2 | 62.5 | 31.25 | 5.4347826 | 0.1554054 |
| Residual | 2 | 11.5 | 5.75 | | |
| Total | 4 | 74 | | | |

Coefficients, Standard Error
Intercept ..6
x1 .8.89
x2 -..4

14. At % significance level, the model is
(a) Significant (b) Not Significant (c) not able to determine the significance

15. For someone with years of schooling and years of age, the expected monthly income is closest to:
(a) $,8 (b) $,88 (c) $6, (d) $6,7 (e) $6,9

16. At % significance, which one of the following statements is true?
(a) Both x1 and x2 are significant variables to predict y.
(b) Only x1 is significant, x2 is not significant.
(c) Only x2 is significant, x1 is not significant.
(d) Both x1 and x2 are not significant variables to predict y.
(e) None of the above is true.

17. A 95% confidence interval for is closest to
$b \pm t_{\alpha/2}\,SE(b)$ = . 4..4 That is, .46, .76.. 96
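The regression statistics in the output for Questions 14-17 can be cross-checked from the ANOVA sums of squares alone. The following is a minimal Python sketch, assuming the regression and residual sums of squares are 62.5 and 11.5 with 2 degrees of freedom each (assumed readings of the damaged output; all variable names are illustrative and not part of the exam).

```python
import math

# Assumed readings of the ANOVA table: sums of squares and degrees of freedom.
ss_reg, ss_res = 62.5, 11.5
df_reg, df_res = 2, 2

ss_total = ss_reg + ss_res          # total sum of squares
df_total = df_reg + df_res          # n - 1
ms_reg = ss_reg / df_reg            # mean square, regression
ms_res = ss_res / df_res            # mean square, residual

r_squared = ss_reg / ss_total
multiple_r = math.sqrt(r_squared)
adj_r_squared = 1 - ms_res / (ss_total / df_total)
std_error = math.sqrt(ms_res)       # standard error of the regression
f_stat = ms_reg / ms_res

# With 2 numerator df, the F upper-tail probability has the closed form
# (1 + 2 f / df_res) ** (-df_res / 2); this shortcut only holds for df_reg == 2.
p_value = (1 + 2 * f_stat / df_res) ** (-df_res / 2)

print(f"R^2 = {r_squared:.6f}, adjusted R^2 = {adj_r_squared:.6f}")
print(f"F = {f_stat:.4f}, Significance F = {p_value:.4f}")
```

Under these assumptions the p-value is above the usual 1%, 5% and 10% levels, which is what decides Question 14.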

Part II. Show your work in each question.

18. (16 marks) You have obtained a sample of 1744 individuals from the Current Population Survey (CPS) and are interested in the relationship between weekly earnings and age. The regression yielded the following result:

$\widehat{Earn} = 239.16 + 5.20\,Age$, $R^2 = 0.05$, SER = 287.21
(20.24) (0.57)

where Earn and Age are measured in dollars and years respectively.

(a) Interpret the regression coefficient 5.20. ( marks)
Solution: On average, a person who is one year older earns $5.20 more per week.

(b) Interpret the measures of fit $R^2$. ( marks)
Solution: The regression $R^2$ indicates that five percent of the variation in earnings is explained by the model. The typical error is $287.21.

(c) Is the relationship between Age and Earn statistically significant? (Use 5% significance level). ( marks)
Solution: $H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$.
Method 1: At 5% significance level, do not reject $H_0$ if $-1.96 \le t \le 1.96$; reject $H_0$ if $t < -1.96$ or $t > 1.96$. The test statistic is $t = b_1 / SE(b_1) = 5.20 / 0.57 = 9.1228$, which falls in the rejection region. Reject $H_0$ and conclude that the model is significant.
Method 2: At 5% significance level, do not reject $H_0$ if $F \le 3.84$; reject $H_0$ if $F > 3.84$. The test statistic is $F = \dfrac{(n-2)R^2}{1-R^2} = \dfrac{1742(0.05)}{1-0.05} = 91.684$, which falls in the rejection region. Reject $H_0$ and conclude that the model is significant.

(d) Construct a 95% confidence interval for the slope. ( marks)
Solution: $b_1 \pm t_{\alpha/2}\,SE(b_1) = 5.20 \pm 1.96(0.57) = 5.20 \pm 1.1172$, or (4.0828, 6.3172).
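The arithmetic in Question 18 parts (c) and (d) can be reproduced in a few lines. This is a minimal Python sketch, assuming n = 1744, a slope of 5.20 with standard error 0.57, and $R^2$ = 0.05 (assumed readings of the damaged output; the variable names are illustrative).

```python
# Assumed readings of the regression output in Question 18.
n = 1744          # sample size
b1 = 5.20         # estimated slope on Age
se_b1 = 0.57      # standard error of the slope
r_squared = 0.05  # regression R^2
z_crit = 1.96     # two-sided 5% critical value (large sample)

# Method 1: t statistic for H0: beta_1 = 0.
t_stat = b1 / se_b1

# Method 2: F statistic computed from R^2 in a one-regressor model.
f_stat = (n - 2) * r_squared / (1 - r_squared)

# 95% confidence interval for the slope.
margin = z_crit * se_b1
ci_low, ci_high = b1 - margin, b1 + margin

print(f"t = {t_stat:.4f}, reject H0: {abs(t_stat) > z_crit}")
print(f"F = {f_stat:.3f}, reject H0: {f_stat > 3.84}")
print(f"95% CI for the slope: ({ci_low:.4f}, {ci_high:.4f})")
```

Note that $t^2$ and F are not identical here because F is computed from the rounded $R^2$; both lead to the same conclusion.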

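19. (16 marks) An analyst studies the effects of age ($x_1$), body size ($x_2$), and smoking history (Z) on systolic blood pressure (y) for a sample of 32 people. The multiple regression model is:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 Z + \beta_4 x_1 Z + \beta_5 x_2 Z + u$

The fitted regression equations for smokers and non-smokers are, respectively:

Smokers: $\hat{y} = 48.075 + 1.466 x_1 + 6.744 x_2$
Non-smokers: $\hat{y} = 48.613 + 1.029 x_1 + 10.451 x_2$

(a) Obtain the estimates of $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$, and $\beta_5$ in the model above. Write your answers down below. Show your work below. (6 marks)

Solution: For smokers (Z = 1), the model is
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 + \beta_4 x_1 + \beta_5 x_2 + u$, which is
$y = (\beta_0 + \beta_3) + (\beta_1 + \beta_4) x_1 + (\beta_2 + \beta_5) x_2 + u$.
Therefore $b_0 + b_3 = 48.075$, $b_1 + b_4 = 1.466$, $b_2 + b_5 = 6.744$.
For non-smokers (Z = 0), the model is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$, hence
$b_0 = 48.613$, $b_1 = 1.029$, $b_2 = 10.451$,
and $b_3 = -0.538$, $b_4 = 0.437$, $b_5 = -3.707$.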
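The subtraction step in part (a) can be sketched in Python. The fitted coefficients used here (48.075, 1.466, 6.744 for smokers; 48.613, 1.029, 10.451 for non-smokers) are assumed readings of the damaged equations, and the coding Z = 1 for smokers is inferred from the solution rather than stated in the question.

```python
# Assumed fitted equations: (intercept, slope on x1, slope on x2).
smokers = (48.075, 1.466, 6.744)        # Z = 1
non_smokers = (48.613, 1.029, 10.451)   # Z = 0

# With Z = 0 the interaction terms vanish, so the non-smoker
# equation gives b0, b1, b2 directly.
b0, b1, b2 = non_smokers

# With Z = 1 each smoker coefficient is a sum (b0 + b3, b1 + b4, b2 + b5),
# so the remaining estimates are the smoker minus non-smoker differences.
b3, b4, b5 = (round(s - ns, 3) for s, ns in zip(smokers, non_smokers))

print(f"b0 = {b0}, b1 = {b1}, b2 = {b2}")
print(f"b3 = {b3}, b4 = {b4}, b5 = {b5}")
```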

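(b) The Sum of Squares in the ANOVA table for the model with parameters $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$, and $\beta_5$ are given below.

| | SS | df | MS | F |
|---|---|---|---|---|
| Regression | 4916 | | | |
| Residual | | | | |
| Total | 6426 | | | |

Test the overall significance of the model using a significance level 0.05. (You must write down the null and alternative hypotheses, the test statistic and the conclusion). (4 marks)

Solution: $H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$, $H_1$: At least one $\beta_i \neq 0$, i = 1, 2, 3, 4, 5.
[Alternatively, $H_0$: Model is NOT significant, $H_1$: Model is significant.]
The ANOVA table is

| | SS | df | MS | F |
|---|---|---|---|---|
| Regression | 4916 | 5 | 983.2 | 16.93 |
| Residual | 1510 | 26 | 58.0769 | |
| Total | 6426 | 31 | | |

At 5% significance level, reject $H_0$ if $F > 2.59$, do not reject $H_0$ if $F \le 2.59$, where F has df = (5, 26). The F statistic from the ANOVA table is 16.93; reject $H_0$ and conclude that the model is significant.

(c) The Sum of Squares in the ANOVA table for the model with parameters $\beta_0, \beta_1, \beta_2$, and $\beta_3$ are given below.

| | SS | df | MS | F |
|---|---|---|---|---|
| Regression | 4890 | | | |
| Residual | | | | |
| Total | | | | |

Test the hypothesis $H_0: \beta_4 = \beta_5 = 0$ versus $H_1$: At least one of $\beta_4$ or $\beta_5$ is non-zero. Use the significance level .. (6 marks)

Solution: The partial F test is

$F = \dfrac{(SSR_{res} - SSR_{unres})/q}{SSR_{unres}/(n-k-1)} = \dfrac{26/2}{58.0769} = 0.22384$,

where SSR denotes the sum of squared residuals of the restricted and unrestricted models. The conclusion is not to reject $H_0$.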
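Both F statistics in parts (b) and (c) follow from three sums of squares. This is a minimal Python sketch, assuming a total sum of squares of 6426 and regression sums of squares of 4916 (full model) and 4890 (reduced model) with n = 32 (assumed readings of the damaged tables; the variable names are illustrative).

```python
# Assumed readings of the ANOVA tables in Question 19.
n = 32                  # sample size
k = 5                   # regressors in the full model
q = 2                   # restrictions tested in part (c): beta_4 = beta_5 = 0
ss_total = 6426.0
ss_reg_full = 4916.0
ss_reg_reduced = 4890.0

# Residual sums of squares for the full and reduced models.
ss_res_full = ss_total - ss_reg_full
ss_res_reduced = ss_total - ss_reg_reduced

df_res_full = n - k - 1
ms_res_full = ss_res_full / df_res_full

# Part (b): overall F test of the full model.
f_overall = (ss_reg_full / k) / ms_res_full

# Part (c): partial F test for dropping the two interaction terms.
f_partial = ((ss_res_reduced - ss_res_full) / q) / ms_res_full

print(f"overall F = {f_overall:.2f} on ({k}, {df_res_full}) df")
print(f"partial F = {f_partial:.5f} on ({q}, {df_res_full}) df")
```

The partial F is well below 1, so it cannot exceed any conventional critical value, which is consistent with the conclusion in part (c).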

20. (17 marks) The cost of attending your college has once again gone up. Although you have been told that education is investment in human capital, which carries a return of roughly 10% a year, you (and your parents) are not pleased. One of the administrators at your university/college does not make the situation better by telling you that you pay more because the reputation of your institution is better than that of others. To investigate this hypothesis, you collect data randomly for 100 national universities and liberal arts colleges from the 2000-2001 U.S. News and World Report annual rankings. Next you perform the following regression:

$\widehat{Cost} = 7{,}311.17 + 3{,}985.20\,Reputation - 0.20\,Size + 8{,}406.79\,Dpriv - 416.38\,Dlibart - 2{,}376.51\,Dreligion$
(2,058.63) (664.58) (0.13) (2,154.85) (1,121.92) (1,007.86)
$R^2 = 0.72$, SER = 3,773.35

where Cost is Tuition, Fees, Room and Board in dollars, Reputation is the index used in U.S. News and World Report (based on a survey of university presidents and chief academic officers), which ranges from 1 ("marginal") to 5 ("distinguished"), Size is the number of undergraduate students, and Dpriv, Dlibart, and Dreligion are binary variables indicating whether the institution is private, a liberal arts college, and has a religious affiliation. The numbers in parentheses are standard errors.

(a) Indicate whether or not each coefficient is significantly different from zero. (4 marks)

Solution: $H_0: \beta_i = 0$, $H_1: \beta_i \neq 0$, i = 1, 2, 3, 4, 5.
For $\alpha = 0.05$, reject $H_0$ if $t < -1.984$ or $t > 1.984$. Do not reject $H_0$ if $-1.984 \le t \le 1.984$, where we have used the t distribution with 100 degrees of freedom. The actual degrees of freedom for this t-test is $n - k - 1 = 94$, but the corresponding t-value is not available from our t-table. [To marker: some students may use 1.96 from the Z-table since the sample size is large. Please consider it correct.]
The t-test for each coefficient is $t = b_i / SE(b_i)$.

| Variable | Coefficient | SE | t | Conclusion |
|---|---|---|---|---|
| Reputation | $b_1$ = 3985.20 | $SE(b_1)$ = 664.58 | 5.9966 | significant |
| Size | $b_2$ = -0.20 | $SE(b_2)$ = 0.13 | -1.5385 | not significant |
| Dpriv | $b_3$ = 8406.79 | $SE(b_3)$ = 2154.85 | 3.9013 | significant |
| Dlibart | $b_4$ = -416.38 | $SE(b_4)$ = 1121.92 | -0.3711 | not significant |
| Dreligion | $b_5$ = -2376.51 | $SE(b_5)$ = 1007.86 | -2.3580 | significant |
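The t-ratio table in part (a) can be reproduced mechanically. This is a minimal Python sketch, assuming the coefficient and standard-error values shown in the dictionary (assumed readings of the damaged regression output) and the critical value 1.984 used in the solution; the names are illustrative.

```python
# (coefficient, standard error) pairs; assumed readings of the output.
estimates = {
    "Reputation": (3985.20, 664.58),
    "Size": (-0.20, 0.13),
    "Dpriv": (8406.79, 2154.85),
    "Dlibart": (-416.38, 1121.92),
    "Dreligion": (-2376.51, 1007.86),
}

T_CRITICAL = 1.984  # two-sided 5% critical value, t distribution with 100 df

# t ratio for H0: beta_i = 0 is the coefficient divided by its standard error.
t_ratios = {name: b / se for name, (b, se) in estimates.items()}

# A coefficient is significant when |t| exceeds the critical value.
significant = {name: abs(t) > T_CRITICAL for name, t in t_ratios.items()}

for name in estimates:
    verdict = "significant" if significant[name] else "not significant"
    print(f"{name:10s} t = {t_ratios[name]:8.4f}  {verdict}")
```

Using 1.96 instead of 1.984 leaves every conclusion unchanged, since no t ratio falls between the two values.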

(b) What is the p-value for the null hypothesis that the coefficient on Size is equal to zero? Based on this, should you eliminate the variable from the regression? Why or why not? ( marks)

Solution: From the Z-table, p-value = $P(|Z| > 1.54) = 2(0.0618) = 0.1236$.
Alternative solution: From the t-table with 100 degrees of freedom, $0.05 < \text{p-value}/2 < 0.10$, hence $0.10 < \text{p-value} < 0.20$.

(c) You want to test simultaneously the hypotheses that $\beta_{Size} = 0$ and $\beta_{Dlibart} = 0$. Your regression package returns the F-statistic of .. At 5% significance level, can you reject the null hypothesis? ( marks)

Solution: The degrees of freedom for this partial F test is (2, 94). From the F-table, the critical value is 3.09. Reject $H_0$ if $F > 3.09$, do not reject $H_0$ if $F \le 3.09$. Since the regression package returns the F-statistic of ., we do not reject the null hypothesis.

(d) Eliminating the Size and Dlibart variables from your regression, the estimated regression becomes

$\widehat{Cost} = 5{,}450.35 + 3{,}538.84\,Reputation + 10{,}935.70\,Dpriv - 2{,}783.31\,Dreligion$
(1,772.35) (590.49) (875.51) (1,180.57)
$R^2 = 0.72$, SER = 3,792.68

Test the overall significance of this model. Why do you think that the effect of attending a private institution has increased now? ( marks)

Solution: $H_0$: Model is not significant, $H_1$: Model is significant.
At 5% significance level, reject $H_0$ if $F > 2.70$ and do not reject $H_0$ if $F \le 2.70$, where F has df = (3, 96). We have used df = (3, 100) here, which is the closest.
The test statistic is $F = \dfrac{R^2/k}{(1-R^2)/(n-k-1)} = \dfrac{0.72/3}{(1-0.72)/96} = \dfrac{0.24}{0.00292} = 82.1918$.
Reject $H_0$ and conclude that the model is significant.
Private institutions are smaller, on average, and some of these are liberal arts colleges. Both of these variables had negative coefficients.
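The p-value in part (b) and the overall F statistic in part (d) can be checked without statistical tables. This is a minimal Python sketch using only the standard library, assuming a rounded test statistic of 1.54 for Size and $R^2$ = 0.72 with n = 100 and k = 3 for the reduced model (assumed readings of the damaged output; the helper name is hypothetical).

```python
import math


def two_sided_normal_p(z: float) -> float:
    """Two-sided p-value P(|Z| > |z|) for a standard normal Z."""
    # P(|Z| > z) equals erfc(z / sqrt(2)) for z >= 0.
    return math.erfc(abs(z) / math.sqrt(2))


# Part (b): large-sample p-value for the Size coefficient.
p_size = two_sided_normal_p(1.54)

# Part (d): overall F statistic computed from R^2.
r_squared = 0.72   # assumed reading of the reduced-model R^2
n, k = 100, 3      # sample size and number of regressors
f_overall = (r_squared / k) / ((1 - r_squared) / (n - k - 1))

print(f"p-value for Size = {p_size:.4f}")
print(f"overall F = {f_overall:.4f} on ({k}, {n - k - 1}) df")
```

The unrounded F is slightly larger than the 82.1918 in the solution, which divides by the rounded denominator 0.00292; the conclusion is the same either way.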