Name: ____________________

Final Exam
Bus 320, Spring 2000
Russell

Do not turn over this page until you are told to do so. You will have 3 hours to complete this exam. The exam has a total of 100 points and is divided into three parts. The true and false questions are worth 10 points, the multiple choice questions are worth 30 points, and the long answer questions are worth 60 points. You can use two sides of an 8.5x11 sheet of notes during the exam. Please write clearly and provide answers in the space provided. If you need additional space, use the back of the exam pages and clearly organize your work. A table for the cumulative distribution function (cdf) of the t-distribution is attached on the last page.

Students in my class are required to adhere to the standards of conduct in the GSB Honor Code and the GSB Standards of Scholarship. The GSB Honor Code also requires students to sign the following GSB Honor pledge: "I pledge my honor that I have not violated the Honor Code during this examination. I also understand that discussing the content of this exam with anyone prior to completion by all students would be a violation of the honor code."

Please sign here to acknowledge: ____________________
I. True or False

Clearly indicate the best answer by circling T or F, indicating that the statement is true or false respectively. If neither T nor F is clearly indicated, the problem will be marked as incorrect. Each problem is worth 1 point.

1. T F  Shaq, a player for the Los Angeles Lakers basketball team, successfully shoots 47% of his free throw shots. Last week he made 9 free throws in 9 attempts. Given that he is a 47% free throw shooter and that he attempts 9 free throws, the probability of this happening in that game was about 1 in 893.

2. T F  If X is uniformly distributed such that $f(x) = 2$ if $a \le x \le 2$ and $f(x) = 0$ otherwise, then $a = 1.5$.

3. T F  If $X_i$ are iid with a mean of $\mu$ and a variance of $\sigma_x^2$ and are independent, then $\sum_{i=1}^{45} X_i$ is approximately normally distributed with a mean of $45\mu$ and a variance of $45\sigma_x^2$.

4. T F  For random samples, the sample average is an unbiased and consistent estimator for the population mean.

5. T F  If $X_1, \ldots, X_n$ are independent and identically distributed (iid) Bernoulli random variables, then if $Y = \sum_{i=1}^{n} X_i$ then $E(Y) = np$.

6. T F  If returns for an asset are normally distributed with mean .08 and variance .064, then the probability of a return less than .08 is .25.

7. T F  In the multiple regression of Y on X1, X2, X3, the R-squared is the squared correlation between Y and the fitted values $\hat{Y}$.

8. T F  In a multiple regression the residuals are uncorrelated with the fitted values $\hat{Y}$.

9. T F  If X and Y are independent then the correlation between X and Y is zero.

10. T F  The regression parameter estimates are the parameter values that maximize the R-squared.
II. Multiple choice

Clearly circle the answer that is best. Each problem is worth 3 points for a total of 30. No partial credit will be given in this section. If no answer is clearly circled, the problem will be marked as incorrect.

For questions 1 through 4, consider the output below from a regression in Minitab of the sales price of 150 homes on the square footage and the number of bedrooms. Some of the details of the output have been deleted. In this section, all answers have been rounded to the nearest hundredth.

Regression Analysis

The regression equation is
Price = 49.1 + 4.81 Square_Footage - 0.18 Number_Bedrooms

Predictor       Coef     StDev         T         P
Constant      49.111     6.749      7.28     0.000
Square_F      4.8121    0.5556      8.66     0.000
Number_B      -0.178     1.491               0.905

Analysis of Variance

Source            DF        SS        MS         F         P
Regression         2   18559.6    9279.8    102.06     0.000
Residual Error   147   13365.6      90.9
Total            149   31925.2

1. The R-squared of the regression is
   a. .12
   b. .58
   c. .42
   d. .72
   e. Not enough information is given.

2. The test statistic for the null hypothesis that the coefficient on the number of bedrooms (Number_B) is equal to zero is
   a. -.12
   b. .905
   c. -.178
   d. -2.0
   e. Not enough information is given.

3. The estimate of the variance of the error term ε is
   a. 6.75
   b. 102.06
   c. 1.1
   d. 90.9
   e. Not enough information is given.
4. Let $\beta_1$ and $\beta_2$ be the slope coefficients in the regression. You would ______ the null hypothesis that $\beta_1 = \beta_2 = 0$ at the ______ level.
   a. Reject, 5% level
   b. Fail to reject, 10% level
   c. Reject, 1% level
   d. Both a and c.
   e. None of the above.

5. Let X have the following density function:

   $f(x) = .25$ for $0 \le x \le 1$
   $f(x) = .50$ for $1 < x < 2.5$
   $f(x) = 0$ elsewhere

   Find the probability that X < 1.5 (i.e., F(1.5)).
   a. .375
   b. .75
   c. .25
   d. .50
   e. None of the above.

6. A developer is considering purchasing 2 properties. The return on his investment is uncertain, but he knows that profits are independent across the two properties and each profit is normally distributed. The expected profit from the first investment is 10,000 with a variance of $2000^2$. The second has an expected profit of 15,000 with a variance of $3500^2$. What is the distribution of total profits on the two projects?
   a. $N(25{,}000,\ 4031^2)$
   b. $N(25{,}000,\ 5500^2)$
   c. We would need to know the covariance between the profits.
   d. No longer a Normal distribution.
   e. None of the above.

7. If you randomly guessed the answers to each of the questions in this multiple choice section (including this one), the mean and variance of the total number of correct answers is:
   a. mean=2, var=.16
   b. mean=.2, var=.16
   c. mean=2, var=1.6
   d. mean=5, var=2.5
   e. None of the above.

8. Smaller confidence intervals result for
   a. Larger samples.
   b. Smaller variances of the population we are sampling from.
   c. Larger confidence levels.
   d. a and b
   e. All of the above.
9. In 1997 the probability that a new car shopper visits the Honda showroom is 23%. Toyota knows that the probability that a Toyota is sold given that the consumer visited a Honda dealer is 18.5%. The probability that a Toyota is sold given that the consumer did not visit a Honda dealer is 21.1%. What is the probability that the consumer visited the Honda dealer given that the consumer purchased a Toyota?
   a. .2618
   b. .2300
   c. .2075
   d. Not enough information provided to answer.
   e. None of the above.

10. Which of the following is the most correct statement about the Central Limit Theorem (CLT)?
   a. The CLT states that for large, random samples, the sample mean $\bar{X}$ is always equal to $\mu$.
   b. The CLT states that for random samples the sample mean $\bar{X}$ is always equal to $\mu$.
   c. The CLT states that for large random samples the distribution of the population mean is approximately normally distributed.
   d. The CLT states that the sum of N iid random variables tends to a Normal distribution as N gets large.
   e. Both (a) and (d) are correct.
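The conditional probability asked for in question 9 is an application of Bayes' rule. The following is a minimal Python sketch of that arithmetic, not an official solution; the function name `posterior` and its parameter names are illustrative assumptions, and the three inputs are the percentages stated in the question.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule: P(A | B) from P(A), P(B | A), and P(B | not A)."""
    joint_true = prior * likelihood_if_true            # P(A and B)
    joint_false = (1.0 - prior) * likelihood_if_false  # P(not A and B)
    return joint_true / (joint_true + joint_false)


# A = "visited the Honda dealer", B = "purchased a Toyota"
p_honda_given_toyota = posterior(
    prior=0.23,                 # P(visited Honda)
    likelihood_if_true=0.185,   # P(Toyota sold | visited Honda)
    likelihood_if_false=0.211,  # P(Toyota sold | did not visit Honda)
)
print(round(p_honda_given_toyota, 4))
```

The denominator is the total probability of a Toyota purchase, obtained by summing over the two cases (visited Honda, did not visit Honda).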
III. Long answer questions

Try to do work in the space provided under each question and be sure to show all work in order to facilitate partial credit. Also be sure to place the final answer in the space provided, as indicated by the underline when present.

1. The following is summary information about returns of IBM and Exxon.

Descriptive Statistics

Variable      N      Mean    Median    TrMean     StDev   SE Mean
IBM        1011   0.00091   0.00000   0.00078   0.01573   0.00049
Exxon      1011   0.00182   0.00133   0.00150   0.02248   0.00071

Variable    Minimum   Maximum        Q1        Q3
IBM        -0.07385   0.05927  -0.00972   0.01098
Exxon      -0.14953   0.13164  -0.01096   0.01300

Covariances

               IBM       Exxon
IBM     0.00024729
Exxon   0.00006809  0.00050517

Consider the following portfolio:

   $P1 = c_1 r_f + c_2 \text{IBM} + c_3 \text{Exxon}$

where $r_f$ is the risk free rate of return with mean 0.0002 and variance zero.

a. Find the mean and variance of the portfolio with weights $c_1 = .2$, $c_2 = .3$, and $c_3 = .5$.

b. Find the mean and variance of the portfolio with weights $c_1 = -.2$, $c_2 = .5$, and $c_3 = .7$. Notice that $c_1$ is negative here. This corresponds to borrowing money at the risk free rate and investing that borrowed money in the stocks. Notice also that the weights still sum to one.
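The portfolio in question 1 is a linear combination of random variables, so its mean and variance follow from the usual rules for weighted sums. The sketch below is one way to set up that calculation in Python, not an official solution; the function and constant names are illustrative assumptions, and the numbers are read from the Minitab output above. Because the risk free rate has variance zero, it contributes to the mean only.

```python
# Summary statistics taken from the Minitab output in the question.
RF_MEAN = 0.0002
IBM_MEAN, EXXON_MEAN = 0.00091, 0.00182
IBM_VAR, EXXON_VAR = 0.00024729, 0.00050517
IBM_EXXON_COV = 0.00006809


def portfolio_moments(c1, c2, c3):
    """Mean and variance of P = c1*r_f + c2*IBM + c3*Exxon."""
    mean = c1 * RF_MEAN + c2 * IBM_MEAN + c3 * EXXON_MEAN
    # Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y);
    # the risk free term drops out because its variance is zero.
    variance = (c2 ** 2) * IBM_VAR + (c3 ** 2) * EXXON_VAR + 2 * c2 * c3 * IBM_EXXON_COV
    return mean, variance


for weights in [(0.2, 0.3, 0.5), (-0.2, 0.5, 0.7)]:
    mean, variance = portfolio_moments(*weights)
    print(weights, mean, variance)
```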
2. (18 points) A Gallup poll taken in May of this year asked individuals "If you had to choose, which of the following issues is likely to be more important in determining your vote for president this year -- where the candidates stand on abortion or where the candidates stand on gun control?" (http://www.gallup.com/poll/indicators/indguns.asp)

            Gun control   Abortion   No opinion
Male            .12          .27         .11
Female          .15          .25         .10

a. What is the probability that an individual is a male and finds gun control to be the more important issue?

b. What is the probability that an individual thinks gun control is the more important issue?

c. What is the conditional distribution of the more important topic given that the individual is male (find Pr(topic | male) for each topic)?

d. Is the sex of the respondent independent of their views on the more important topic? Be sure to formally state why or why not using a definition of independence.

e. Interpret the results of part d in plain English. A sentence or two should be sufficient.
3. A factory that discharges waste water into the sewage system is required to monitor the arsenic levels in its waste water and report the results to the Environmental Protection Agency (EPA) at regular intervals. Twenty-five beakers of waste water from the discharge are obtained at randomly chosen times. The measurement of arsenic is in nanograms per liter for each beaker of water obtained. The summary statistics from Minitab are presented below.

Descriptive Statistics
Variable: Arsenic

[Figure: Minitab graphical summary for Arsenic, with plots of the 95% confidence intervals for Mu and Median]

Anderson-Darling Normality Test
A-Squared:        1.966
P-Value:          0.000

Mean            32.5560
StDev           31.2459
Variance        976.308
Skewness        2.33222
Kurtosis        6.15526
N                    25

Minimum           3.200
1st Quartile     15.050
Median           24.000
3rd Quartile     37.550
Maximum         152.700

95% Confidence Interval for Mu:       19.658   45.454
95% Confidence Interval for Sigma:    24.398   43.468
95% Confidence Interval for Median:   20.740   35.725

a. The company claims that the mean output of arsenic is 20 nanograms per liter. Stating any necessary assumptions, test the null hypothesis at the 5% level against the two-sided alternative.

   Assumptions: ____________________
   $H_0$: ____________________
   $H_a$: ____________________
   Conclusion: ____________________

b. What is the p-value associated with the above test?
c. Use the p-value from the Anderson-Darling test statistic (given near the top of the output on the right hand side) to test the null hypothesis that the data are normally distributed. Do the test at the 5% level.

   $H_0$: ____________________
   $H_a$: ____________________
   Conclusion: ____________________

d. What implications, if any, does the normality test result have on your test in part a?
4. Let $X_i$ denote the number of heads on two tosses of a coin. So, $X_i$ can take the values 0, 1, or 2.

a. What is the probability of each outcome? Note that you can find the probability of no heads and the probability that both outcomes are heads. The probability of just one head must be 1 minus the probability of the other two outcomes.

b. Let $Y = X_1 + X_2 + X_3$ where the $X_i$ are iid. What is the mean of Y?

c. What is the variance of Y?

d. What are the possible values that the random variable Y can take?

e. What is the distribution of Y (be as explicit as possible)?
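The quantities in question 4 can be checked by brute-force enumeration. The sketch below is a hedged illustration rather than an official solution: it assumes a fair coin (the question says only "a coin"), and the helper name `heads_distribution` is hypothetical. Since Y is the sum of three iid two-toss head counts, it is the number of heads in six independent tosses.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses, p_heads=Fraction(1, 2)):
    """Exact pmf of the number of heads in n_tosses independent tosses.

    p_heads defaults to 1/2, which is an assumption (a fair coin).
    """
    pmf = Counter()
    for outcome in product((0, 1), repeat=n_tosses):  # 1 = heads, 0 = tails
        heads = sum(outcome)
        pmf[heads] += p_heads ** heads * (1 - p_heads) ** (n_tosses - heads)
    return dict(pmf)


def mean_and_variance(pmf):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    mean = sum(k * p for k, p in pmf.items())
    variance = sum((k - mean) ** 2 * p for k, p in pmf.items())
    return mean, variance


x_pmf = heads_distribution(2)  # one X_i: heads in two tosses
y_pmf = heads_distribution(6)  # Y = X_1 + X_2 + X_3: heads in six tosses
print(x_pmf)
print(sorted(y_pmf.items()))
print(mean_and_variance(y_pmf))
```

Exact fractions are used instead of floats so the probabilities can be compared directly with hand calculations.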
5. Consider the regression output of sales price of a home on the number of bedrooms.

The regression equation is
Price = 102 + 10.2 Number_Bedrooms

Predictor       Coef     StDev         T         P
Constant     102.188     3.463     29.51     0.000
Number_B      10.158     1.095      9.28     0.000

S = 11.68   R-Sq = 36.8%   R-Sq(adj) = 36.3%

a. Test the null hypothesis that the number of bedrooms (Number_B) has no impact on the sales price of the home at the 5% level.

Now consider the multiple regression output of sales price on number of bedrooms and square footage.

The regression equation is
Price = 49.1 + 4.81 Square_Footage - 0.18 Number_Bedrooms

Predictor       Coef     StDev         T         P
Constant      49.111     6.749      7.28     0.000
Square_F      4.8121    0.5556      8.66     0.000
Number_B      -0.178     1.491     -0.12     0.905

S = 9.535   R-Sq = 58.1%   R-Sq(adj) = 57.6%

b. Test the null hypothesis that the number of bedrooms (Number_B) has no impact on the sales price of the home at the 5% level using the multiple regression output.

c. What is the interpretation of the coefficient on Number_B in the multiple regression? (Use a couple of explicit sentences.)
d. What is the intuition behind the reduction in the test statistic for the coefficient on the number of bedrooms when the multiple regression is run instead of the simple linear regression? (Again, a couple of explicit sentences should be sufficient.)

Next, I regressed the sales price on the square footage.

The regression equation is
Price = 49.5 + 4.76 Square_Footage

Predictor       Coef     StDev         T         P
Constant      49.502     5.881      8.42     0.000
Square_F      4.7592    0.3320     14.33     0.000

S = 9.504   R-Sq = 58.1%   R-Sq(adj) = 57.8%

The following is the output from Minitab predicting the sales value of the home given that the square footage is 18 (1800 square feet).

Predicted Values

    Fit   StDev Fit           95.0% CI                95.0% PI
135.167       0.790   ( 133.606, 136.727)    ( 116.322, 154.012)

e. What is the expected sales price given the house has 1800 square feet?

f. Explain in English what the 95% CI is.

g. Explain in English what the 95% PI is.

h. Use the plug-in method to calculate an approximate value for the 95% PI.
6. Consider the following output from regressing IBM returns on S&P500 returns. Recall that the slope coefficient in this regression model is referred to as IBM's "beta".

The regression equation is
ibm = 0.000810 + 1.10 s&p

Predictor        Coef       StDev         T         P
Constant    0.0008097   0.0006005      1.35     0.178
s&p           1.09772     0.05486     20.01     0.000

S = 0.01903   R-Sq = 28.4%   R-Sq(adj) = 28.3%

a. Test the null hypothesis that IBM's beta is 1.0 at the 5% level.

b. Build a 95% CI for IBM's beta using the above regression output.

7. In May of 2000, 511 people out of 1000 surveyed nationwide said that they thought that it was all right for state capitols to fly the Confederate flag.

a. What is the estimate $\hat{p}$ of the true population proportion that thinks it is "all right" for state capitols to fly the flag?

b. Build a 95% CI for the true population parameter.

c. Test the null hypothesis that the true value of p is .5 at the 5% level.
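The three parts of question 7 rest on the large-sample normal approximation for a sample proportion. The following Python sketch shows one common way to carry out the arithmetic, not an official solution. The function name `proportion_summary` is hypothetical, and the critical value 1.96 is an assumption (the standard normal 97.5th percentile); the confidence interval uses the estimated standard error, while the test statistic uses the standard error under the null hypothesis.

```python
import math


def proportion_summary(successes, n, p_null=0.5, z_crit=1.96):
    """Point estimate, approximate CI, and z statistic for a proportion.

    z_crit=1.96 is an assumed critical value for a 95% two-sided interval.
    """
    p_hat = successes / n
    # Standard error evaluated at the estimate, used for the interval.
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
    ci = (p_hat - z_crit * se_hat, p_hat + z_crit * se_hat)
    # Standard error evaluated at the null value, used for the test.
    se_null = math.sqrt(p_null * (1 - p_null) / n)
    z_stat = (p_hat - p_null) / se_null
    return p_hat, ci, z_stat


p_hat, ci, z_stat = proportion_summary(511, 1000)
print(p_hat, ci, z_stat)
print("reject H0 at 5%" if abs(z_stat) > 1.96 else "fail to reject H0 at 5%")
```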
8. Consider the following regression of Y on X.

The regression equation is
y = 2.80 + 0.252 x

Predictor       Coef     StDev         T         P
Constant      2.8035    0.1011     27.72     0.000
x             0.2523    0.1134      2.23     0.028

S = 1.015   R-Sq = 4.8%   R-Sq(adj) = 3.8%

The following plot is of the residual versus X.

[Figure: Residuals Versus x (response is y); vertical axis: Residual, horizontal axis: x]

a. What assumption of the regression model appears to be violated here?

The next plot is a simple scatter plot of Y versus X.

[Figure: scatter plot of y versus x]

b. After viewing the scatter plot, why should we not expect the regression model of Y on X estimated above to be a good model?